Clean up messy Assamese text in one click. Normalize Unicode, fix spacing, convert digits, replace periods with the danda (।), strip invisible characters and map shared Bengali letters to their proper Assamese form — entirely in your browser.
The formatter applies your selected rules in a fixed order so the output is always predictable. Unicode normalization and invisible-character removal run first, then whitespace and punctuation rules, then digit conversion, smart quotes and finally the optional Bengali → Assamese letter mapping. Every rule is a pure JavaScript string transformation (Unicode normalization uses the built-in normalizer, the rest are regex replacements), so the entire pipeline finishes in milliseconds even on long documents.
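The fixed ordering described above can be sketched as an ordered list of rules that is walked from top to bottom. This is a minimal illustration, not the tool's actual source: the rule names, the option keys and the `format` function are all hypothetical, and only the first few rules are shown.

```javascript
// Hypothetical rule names and option keys; the real tool's identifiers are not published.
// The array order IS the pipeline order.
const RULES = [
  ["nfc", (s) => s.normalize("NFC")],
  // ZWSP, ZWNJ, ZWJ and BOM
  ["stripInvisible", (s) => s.replace(/[\u200B-\u200D\uFEFF]/g, "")],
  // Leading and trailing spaces / tabs on every line
  ["trimLines", (s) => s.replace(/^[ \t]+|[ \t]+$/gm, "")],
  ["collapseSpaces", (s) => s.replace(/[ \t]{2,}/g, " ")],
  // Three or more empty lines means four or more consecutive newlines
  ["limitBlankLines", (s) => s.replace(/\n{4,}/g, "\n\n")],
];

function format(text, options) {
  // Iterate over RULES rather than over the options object, so the result
  // never depends on the order in which the caller listed the options.
  return RULES.reduce(
    (out, [name, apply]) => (options[name] ? apply(out) : out),
    text
  );
}

const allOn = {
  nfc: true,
  stripInvisible: true,
  trimLines: true,
  collapseSpaces: true,
  limitBlankLines: true,
};
console.log(JSON.stringify(format("  ক\u200B  খ  \n\n\n\n\nগ", allOn)));
```

Because each rule is a pure function from string to string, a disabled rule simply passes its input through unchanged.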
| Rule | What it does | Default |
|---|---|---|
| Normalize Unicode (NFC) | Rewrites the text in canonical composed form so identical-looking strings compare equal. | On |
| Strip invisible characters | Removes ZWJ (U+200D), ZWNJ (U+200C), zero-width space (U+200B), BOM (U+FEFF) and other zero-width formatting marks that often sneak in from Word, PDFs and websites. | On |
| Trim each line | Strips leading and trailing spaces / tabs from every line individually. | On |
| Collapse repeated spaces | Two or more consecutive spaces or tabs become a single space. | On |
| Limit blank lines | Three or more consecutive empty lines collapse to one blank line, preserving paragraph breaks. | On |
| Fix space before punctuation | Removes the unwanted space that often appears before । ॥ , . ? ! ; : | On |
| Add space after punctuation | Inserts a single space after । ॥ , . ? ! if the next character is a letter (so sentences breathe). | On |
| Period → Danda | Replaces the Latin full-stop "." with the Assamese danda "।" — only when surrounded by Assamese letters or whitespace, never inside numbers, URLs or English words. | Off |
| English digits → Assamese | Converts 0 1 2 3 4 5 6 7 8 9 to ০ ১ ২ ৩ ৪ ৫ ৬ ৭ ৮ ৯. | Off |
| Assamese digits → English | The reverse of the above. Useful when preparing data for systems that expect Arabic numerals. | Off |
| Smart quotes | Replaces straight " and ' with curly typographic quotes (“ ” ‘ ’). | On |
| Bengali → Assamese letters | Maps the two letters where the scripts differ: র → ৰ and ব → ৱ. Apply only if the source is Bengali script; the rule is opt-in to avoid corrupting genuine Assamese ব. | Off |
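Two of the rules in the table, Period → Danda and the digit conversions, can be sketched as follows. The function names are hypothetical and the guard conditions are a deliberately conservative assumption: a period is replaced only when it directly follows a Bengali-Assamese letter or sign (not a digit) and is followed by whitespace, the end of the text, or another such letter. The tool's real conditions may differ.

```javascript
// Letters and signs of the Bengali-Assamese block, excluding the digits U+09E6..U+09EF.
const LETTER = "[\\u0980-\\u09E5\\u09F0\\u09F1]";
const PERIOD_AFTER_LETTER = new RegExp(`(${LETTER})\\.(?=\\s|$|${LETTER})`, "g");

// Hypothetical helper: "." becomes the danda (U+0964) only in Assamese context.
function periodToDanda(text) {
  return text.replace(PERIOD_AFTER_LETTER, "$1\u0964");
}

// Assamese digits occupy U+09E6 (zero) to U+09EF (nine), in order,
// so conversion is a fixed offset in both directions.
function toAssameseDigits(text) {
  return text.replace(/[0-9]/g, (d) => String.fromCharCode(0x09e6 + Number(d)));
}

function toEnglishDigits(text) {
  return text.replace(/[\u09E6-\u09EF]/g, (d) => String(d.charCodeAt(0) - 0x09e6));
}

console.log(periodToDanda("অসম. ভাৰত."));
console.log(periodToDanda("see example.com or 3.14"));
console.log(toAssameseDigits("2024"), toEnglishDigits("২০২৪"));
```

Requiring a letter before the period is what keeps decimals, URLs and English words untouched, at the cost of skipping a sentence that ends in a number.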
Polish drafts copied out of Word, Google Docs or PDFs — strip stray formatting, fix spacing and tidy punctuation before publishing.
Enforce a consistent house style — one space after danda, smart quotes and proper Assamese letter forms across submissions.
Normalize scraped or user-generated Assamese text so tokenizers, search and TTS engines see one canonical form.
Convert digits between English and Assamese for assignments, and clean up text pulled from messy online sources.
It applies a chain of cleanup rules to your Assamese text: it normalizes Unicode, trims and collapses extra spaces, removes invisible zero-width characters, converts English digits to Assamese numerals, replaces Latin periods with the Assamese danda (।), upgrades straight quotes to typographic quotes, and can map shared Bengali letterforms to their Assamese equivalents.
No, your text is never uploaded. Every transformation runs in your browser using JavaScript. Nothing is sent to a server, so the formatter is safe for confidential or unpublished material.
The same Assamese syllable can be stored in two different ways internally — as a single code point or as several combined code points. NFC (Canonical Composition) rewrites text into the standard composed form, which fixes copy-paste issues, makes search and counting reliable, and prevents broken rendering on some devices.
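A small example makes the two storage forms concrete. The vowel sign ো (U+09CB) can also be stored as ে (U+09C7) followed by া (U+09BE); the two strings look the same on screen but only compare equal after normalization. The helper name `sameText` is hypothetical; `String.prototype.normalize` is the standard JavaScript method.

```javascript
const decomposed = "\u0995\u09C7\u09BE"; // ক + ে + া (three code points)
const composed = "\u0995\u09CB";         // ক + ো (two code points)

// Hypothetical helper: compare two strings by their NFC form.
function sameText(a, b) {
  return a.normalize("NFC") === b.normalize("NFC");
}

console.log(decomposed === composed);           // false: different code points
console.log(sameText(decomposed, composed));    // true: identical after NFC
console.log(decomposed.normalize("NFC").length); // 2
```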
The Bengali → Assamese mapping is correct for the two letters that differ between the scripts: ৰ (replacing র) and ৱ (replacing certain ব used as wā). য় is shared by both scripts and is left unchanged. Because ব is also a real Assamese letter, the tool only replaces ব → ৱ when you opt in. Keep this option off if your text already uses correct Assamese spelling.
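As a rough sketch, the mapping amounts to a two-entry lookup applied to every matching letter. The names `LETTER_MAP` and `mapBengaliLetters` are hypothetical, and the blanket replacement shown here is an assumption about how the rule behaves once enabled.

```javascript
// Hypothetical lookup table for the two letters that differ.
const LETTER_MAP = {
  "\u09B0": "\u09F0", // র → ৰ
  "\u09AC": "\u09F1", // ব → ৱ
};

function mapBengaliLetters(text) {
  // Every র and ব is replaced; all other characters, including য়, pass through.
  return text.replace(/[\u09B0\u09AC]/g, (ch) => LETTER_MAP[ch]);
}

console.log(mapBengaliLetters("\u09B0 \u09AC \u09AF\u09BC")); // ৰ ৱ য়
```

Since the replacement cannot tell a Bengali ব from a genuine Assamese ব, running it on text that is already correct Assamese would change letters that should stay as they are, which is why the rule is off by default.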
Yes. Your original text stays in the input box on the left at all times, and the formatted result appears on the right. You can toggle any rule on or off and the output updates instantly. Use the Reset button to clear everything.