Free Browser Tool

Assamese Text Formatter

Clean up messy Assamese text in one click. Normalize Unicode, fix spacing, convert digits, replace periods with the danda (।), strip invisible characters and map shared Bengali letters to their proper Assamese form — entirely in your browser.

100% Free · 12 Cleanup Rules · Runs Offline · Unicode NFC

How the Assamese Text Formatter Works

The formatter applies your selected rules in a fixed order so the output is always predictable. Unicode normalization and invisible-character removal run first, then whitespace and punctuation rules, then digit conversion, smart quotes and finally the optional Bengali → Assamese letter mapping. Every rule is a pure JavaScript regex transformation, so the entire pipeline finishes in milliseconds even on long documents.
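
The fixed ordering can be pictured as an ordered list of rules that is walked from top to bottom, skipping any rule that is switched off. The sketch below is a minimal illustration under that assumption. The rule ids, the `RULES` array and the `formatText` function are hypothetical names, not the tool's actual source, and only four of the 12 rules are shown.

```javascript
// Hypothetical sketch: rule ids and regexes are illustrative, not the tool's real code.
const RULES = [
  // 1. Unicode normalization runs first so later rules see one canonical form.
  { id: "nfc", apply: (s) => s.normalize("NFC") },
  // 2. Remove zero-width space, ZWNJ, ZWJ (U+200B..U+200D) and BOM (U+FEFF).
  { id: "stripInvisible", apply: (s) => s.replace(/[\u200B-\u200D\uFEFF]/g, "") },
  // 3. Whitespace rules: trim each line, then collapse runs of spaces or tabs.
  { id: "trimLines", apply: (s) => s.replace(/^[ \t]+|[ \t]+$/gm, "") },
  { id: "collapseSpaces", apply: (s) => s.replace(/[ \t]{2,}/g, " ") },
];

// Applies the enabled rules in array order, regardless of the order inside `enabled`.
function formatText(text, enabled) {
  return RULES.reduce(
    (out, rule) => (enabled.has(rule.id) ? rule.apply(out) : out),
    text
  );
}

const cleaned = formatText(
  "  ক\u200B  খ  ",
  new Set(["collapseSpaces", "stripInvisible", "trimLines"])
);
console.log(JSON.stringify(cleaned)); // "ক খ"
```

Keeping the order in the rule list rather than in the set of ticked boxes is what makes the result independent of the order in which rules were selected.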

  1. Paste your Assamese text (or text mixed with English) into the Input box.
  2. Tick the cleanup rules you want to apply — sensible defaults are pre-selected.
  3. The Formatted Output updates instantly with stats showing how many characters were saved.
  4. Use Copy, Download .txt, or Replace Input to chain another round of formatting on the cleaned text.
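
The statistics in step 3 could be derived along the following lines. This is a hedged sketch: `formatStats` is a hypothetical helper, and counting Unicode code points (rather than UTF-16 code units or grapheme clusters) is an assumption about how the tool defines a "character".

```javascript
// Hypothetical helper: counts Unicode code points, not UTF-16 code units.
function formatStats(input, output, activeRuleCount) {
  const inputChars = Array.from(input).length;
  const outputChars = Array.from(output).length;
  return {
    inputChars,
    outputChars,
    charsSaved: inputChars - outputChars,
    rulesActive: activeRuleCount,
  };
}

console.log(formatStats("ক  খ ", "ক খ", 2));
// { inputChars: 5, outputChars: 3, charsSaved: 2, rulesActive: 2 }
```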

All 12 Cleanup Rules Explained

Rule | What it does | Default
Normalize Unicode (NFC) | Rewrites the text in canonical composed form so identical-looking strings compare equal. | On
Strip invisible characters | Removes ZWJ (U+200D), ZWNJ (U+200C), zero-width space (U+200B), BOM (U+FEFF) and other zero-width formatting marks that often sneak in from Word, PDFs and websites. | On
Trim each line | Strips leading and trailing spaces and tabs from every line individually. | On
Collapse repeated spaces | Two or more consecutive spaces or tabs become a single space. | On
Limit blank lines | Three or more consecutive empty lines collapse to one blank line, preserving paragraph breaks. | On
Fix space before punctuation | Removes the unwanted space that often appears before । ॥ , . ? ! ; : | On
Add space after punctuation | Inserts a single space after । ॥ , . ? ! if the next character is a letter (so sentences breathe). | On
Period → Danda | Replaces the Latin full stop "." with the Assamese danda "।", but only when it is surrounded by Assamese letters or whitespace, never inside numbers, URLs or English words. | Off
English digits → Assamese | Converts 0 1 2 3 4 5 6 7 8 9 to ০ ১ ২ ৩ ৪ ৫ ৬ ৭ ৮ ৯. | Off
Assamese digits → English | The reverse of the above: converts ০ ১ ২ ৩ ৪ ৫ ৬ ৭ ৮ ৯ back to 0 1 2 3 4 5 6 7 8 9. Useful when preparing data for systems that expect Arabic numerals. | Off
Smart quotes | Replaces straight " and ' with curly typographic quotes (“ ” ‘ ’). | On
Bengali → Assamese letters | Maps the two letters where the scripts differ: র → ৰ and ব → ৱ. Apply only if the source is in Bengali script; the rule is opt-in to avoid corrupting genuine Assamese ব. | Off
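
To make two of the rules above concrete, here is a hedged sketch of the digit conversions and the Period → Danda rule. The helper names are hypothetical and the regular expressions are one conservative reading of the descriptions in the table, not the tool's actual patterns. In particular, this version converts a period only when an Assamese letter or vowel sign comes directly before it.

```javascript
// Hypothetical helpers; the regexes are one possible reading of the rules above.

// Assamese digits ০..৯ occupy U+09E6..U+09EF, in the same order as 0..9.
const toAssameseDigits = (s) =>
  s.replace(/[0-9]/g, (d) => String.fromCharCode(0x09e6 + Number(d)));
const toEnglishDigits = (s) =>
  s.replace(/[\u09E6-\u09EF]/g, (d) => String(d.charCodeAt(0) - 0x09e6));

// The character class covers Assamese letters and signs (including ৰ U+09F0 and
// ৱ U+09F1) but leaves out the digits U+09E6..U+09EF, so "৩.৫" is not touched.
// A period becomes a danda (U+0964) only when such a character precedes it and
// whitespace, end of text, or another Assamese letter follows it.
const PERIOD = /(?<=[\u0980-\u09E5\u09F0\u09F1])\.(?=\s|$|[\u0980-\u09E5\u09F0\u09F1])/g;
const periodToDanda = (s) => s.replace(PERIOD, "\u0964");

console.log(toAssameseDigits("2024")); // ২০২৪
console.log(toEnglishDigits("১২৩")); // 123
console.log(periodToDanda("মই যাম. v1.2 at example.com")); // মই যাম। v1.2 at example.com
```

Requiring an Assamese character before the period is what keeps version numbers, decimals and domain names intact in mixed-language text.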

Who Benefits From an Assamese Text Formatter

Writers & Bloggers

Polish drafts copied out of Word, Google Docs or PDFs — strip stray formatting, fix spacing and tidy punctuation before publishing.

Editors & Publishers

Enforce a consistent house style — one space after danda, smart quotes and proper Assamese letter forms across submissions.

Data & NLP Engineers

Normalize scraped or user-generated Assamese text so tokenizers, search and TTS engines see one canonical form.

Students & Teachers

Convert digits between English and Assamese for assignments, and clean up text pulled from messy online sources.

Frequently Asked Questions

What does the Assamese text formatter do?

It applies a chain of cleanup rules to your Assamese text: it normalizes Unicode, trims and collapses extra spaces, removes invisible zero-width characters, converts English digits to Assamese numerals, replaces Latin periods with the Assamese danda (।), upgrades straight quotes to typographic quotes, and can map shared Bengali letterforms to their Assamese equivalents.

Is my text uploaded anywhere?

No. Every transformation runs in your browser using JavaScript. Nothing is sent to a server, so the formatter is safe for confidential or unpublished material.

What is Unicode normalization (NFC) and why does it matter?

The same Assamese syllable can be stored in two different ways internally — as a single code point or as several combined code points. NFC (Canonical Composition) rewrites text into the standard composed form, which fixes copy-paste issues, makes search and counting reliable, and prevents broken rendering on some devices.
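
A small illustration of the point above, using the standard `String.prototype.normalize` method. The variable names are only for this example. The syllable কো can be stored with a two-part vowel sign or with a single precomposed one, and NFC maps both to the same sequence.

```javascript
// Two encodings of the syllable কো: split vowel sign vs. precomposed vowel sign.
const decomposed = "\u0995\u09C7\u09BE"; // ক + ে + া
const precomposed = "\u0995\u09CB"; // ক + ো

console.log(decomposed === precomposed); // false
console.log(decomposed.normalize("NFC") === precomposed.normalize("NFC")); // true
console.log(decomposed.normalize("NFC").length); // 2

// NFC is not always the shortest form: Unicode excludes a few letters from
// composition, so য় (U+09DF) is stored as য + ় (U+09AF U+09BC) after NFC.
const yya = "\u09DF".normalize("NFC");
console.log(yya === "\u09AF\u09BC"); // true
```

Whichever form the input arrives in, comparing or searching after normalization gives consistent results.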

Will the Bengali to Assamese conversion always be correct?

It is correct for the two letters that differ between the scripts: ৰ (replacing র) and ৱ (replacing certain ব used as ৱ). The letter য় is already shared and is left unchanged. Because ব is also a real Assamese letter, the tool only replaces ব → ৱ when you opt in. Keep this option off if your text already uses correct Assamese spelling.
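
The risk with ব can be seen in a short sketch. `bengaliToAssamese` and its `mapBa` option are hypothetical names chosen for this example; the tool itself exposes the mapping as a single opt-in rule, and splitting out the ব replacement here is only a way to show why it needs care.

```javascript
// Hypothetical sketch of the letter mapping; not the tool's actual code.
function bengaliToAssamese(text, { mapBa = false } = {}) {
  // র (U+09B0) → ৰ (U+09F0)
  let out = text.replace(/\u09B0/g, "\u09F0");
  // ব (U+09AC) → ৱ (U+09F1): a blanket replacement, so it also rewrites
  // every ব that was already correct Assamese.
  if (mapBa) out = out.replace(/\u09AC/g, "\u09F1");
  return out;
}

console.log(bengaliToAssamese("রাম")); // ৰাম
console.log(bengaliToAssamese("বাবে")); // বাবে (unchanged)
console.log(bengaliToAssamese("বাবে", { mapBa: true })); // ৱাৱে
```

The last call shows a genuine Assamese word being altered, which is why the rule is best applied only to text known to be in Bengali script.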

Can I undo the formatting?

Yes. Your original text stays in the input box on the left at all times, and the formatted result appears on the right. You can toggle any rule on or off and the output updates instantly. Use the Reset button to clear everything.
