A small tool for easily cleaning text.
npm install text-cleaner --save
const TextCleaner = require("text-cleaner");
TextCleaner("Some <b> TEXT to Clean</b>")
.stripHtml()
.condense()
.toLowerCase()
.valueOf();
// some text to clean
const cleanString = TextCleaner("string");
Returns an object, with the following methods:
Returns the current working value of the string being cleaned
TextCleaner("STRING").valueOf();
// "STRING"
TextCleaner("STRING").toString();
// "STRING"
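The chaining shown above can be pictured with a small stand-alone wrapper. This is only a sketch of the general pattern, not text-cleaner's actual source: `makeCleaner` is a hypothetical name, and it implements just a few of the methods so the mechanics stay visible.

```javascript
// Hypothetical stand-in for illustration; not the library's implementation.
function makeCleaner(input) {
  // The working value lives in a closure and is updated by each method.
  let value = String(input);

  const api = {
    // Each cleaning method mutates the working value and returns the same
    // object, which is what makes calls chainable.
    trim() { value = value.trim(); return api; },
    toLowerCase() { value = value.toLowerCase(); return api; },
    // valueOf() and toString() end the chain by returning a plain string.
    valueOf() { return value; },
    toString() { return value; },
    // length is exposed as a getter so it always reflects the current value.
    get length() { return value.length; },
  };
  return api;
}

const result = makeCleaner("  STRING ").trim().toLowerCase();
console.log(result.valueOf()); // "string"
console.log(`${result}`);      // "string" (template literals call toString())
console.log(result.length);    // 6
```

Returning the same object from every method keeps the chain cheap, at the cost of the wrapper being mutable. Call valueOf() when a snapshot of the string is needed.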
TextCleaner("string").length;
// 6
TextCleaner("string").remove("tr").valueOf();
// "sing"
TextCleaner("string").replace("tr", "l").valueOf();
// "sling"
TextCleaner(" string ").trim().valueOf();
// "string"
TextCleaner("STRING").toLowerCase().valueOf();
// "string"
TextCleaner("string").toUpperCase().valueOf();
// "STRING"
TextCleaner("a long string").truncate(6).valueOf();
// "a long"
Condenses each run of white space (spaces, tabs, newlines) into a single space.
TextCleaner("s \t t \nr i n g").condense().valueOf();
// "s t r i n g"
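For readers curious how this kind of condensing can be done, here is a minimal sketch using a single regular expression. `condenseWhitespace` is a hypothetical helper written for this example, and the library's own implementation may differ.

```javascript
// Hypothetical helper; assumes "white space" means anything matched by \s.
function condenseWhitespace(text) {
  // \s+ matches one or more consecutive white space characters
  // (spaces, tabs, newlines), and each run becomes one space.
  return text.replace(/\s+/g, " ");
}

console.log(condenseWhitespace("s \t t \nr i n g")); // "s t r i n g"
```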
TextCleaner("Email me at: [email protected]").stripEmails().valueOf();
// "Email me at: "
TextCleaner("<b>string</b>").stripHtml().valueOf();
// "string"
Removes all non-alphabetic characters, including numbers. Letters, white space, and any characters listed in the exclude option are kept.
Options (object):
- replaceWith: (default: "") String that replaces each matched character. Passing a space prevents words from merging when characters are removed.
- exclude: (default: "") String of characters to keep. It is added to the regular expression, so ranges work; e.g. "0-9" keeps numbers.
TextCleaner("~string1!").removeChars({ exclude: "!" }).valueOf();
// "string!"
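The note that exclude is "added to a regular expression" is easier to see in code. The sketch below assumes the option is spliced into a negated character class. That is a guess at the mechanism, not the library's confirmed implementation, and `removeNonAlpha` is a hypothetical name.

```javascript
// Hypothetical re-creation of the behaviour described above.
function removeNonAlpha(text, { replaceWith = "", exclude = "" } = {}) {
  // Assumption: exclude is inserted verbatim into the character class,
  // which is why a range such as "0-9" works. Characters that are special
  // inside a class (such as "]" or "\") would need escaping by the caller.
  const pattern = new RegExp(`[^a-zA-Z\\s${exclude}]`, "g");
  return text.replace(pattern, replaceWith);
}

console.log(removeNonAlpha("~string1!", { exclude: "!" }));   // "string!"
console.log(removeNonAlpha("~string1!", { exclude: "0-9" })); // "string1"
console.log(removeNonAlpha("a-b", { replaceWith: " " }));     // "a b"
```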
Removes apostrophes from the text but leaves other single quotes in place.
TextCleaner("a quote: 'he didn't'").removeApostrophes().valueOf();
// "a quote: 'he didnt'"
Allows words containing apostrophes to be handled separately from removeChars(), such as when replacing characters with a space using removeChars({ replaceWith: ' ' }), so the word is preserved.
/* undesired behaviour */
TextCleaner("don't(text)").removeChars({ replaceWith: " " }).trim().valueOf();
// "don t text"
/* desired behaviour */
TextCleaner("don't(text)")
.removeApostrophes()
.removeChars({ replaceWith: " " })
.trim()
.valueOf();
// "dont text"
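One way to tell an apostrophe from a quotation mark is to check whether it sits between two letters. The sketch below uses that heuristic. It is an assumption made for illustration and may not match the rule the library applies, and `stripApostrophes` is a hypothetical name.

```javascript
// Hypothetical helper: treats ' as an apostrophe only between two letters.
function stripApostrophes(text) {
  // Capture the letter before the quote and look ahead for a letter after
  // it, so quotes at the start or end of a phrase are left alone.
  return text.replace(/([A-Za-z])'(?=[A-Za-z])/g, "$1");
}

console.log(stripApostrophes("a quote: 'he didn't'")); // "a quote: 'he didnt'"

// Removing apostrophes first keeps the word intact when the remaining
// non-letter characters are replaced by spaces.
const nonAlpha = /[^a-zA-Z\s]/g;
console.log("don't(text)".replace(nonAlpha, " ").trim());                   // "don t text"
console.log(stripApostrophes("don't(text)").replace(nonAlpha, " ").trim()); // "dont text"
```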
Removes common stop words from the text for textual/sentiment analysis. Uses stopword.
TextCleaner("the test string with some words").removeStopWords().valueOf();
// "test string words"
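Stop word removal boils down to filtering tokens against a word list. The sketch below shows the idea with a tiny hand-written list. `STOP_WORDS` and `dropStopWords` are hypothetical, and the real list supplied by stopword is far larger.

```javascript
// Hypothetical, deliberately tiny stop word list for illustration only.
const STOP_WORDS = new Set(["the", "with", "some", "a", "an", "of"]);

function dropStopWords(text) {
  return text
    .split(/\s+/)                                            // tokenize on white space
    .filter((w) => w && !STOP_WORDS.has(w.toLowerCase()))    // drop empties and stop words
    .join(" ");                                              // rebuild the string
}

console.log(dropStopWords("the test string with some words")); // "test string words"
```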