Character Counter
Count characters, words, lines, paragraphs and sentences in real time. Unicode-grapheme aware, byte estimates for SMS, reading time, SEO/Twitter limits.
About Character Counter Tool
Counting characters seems trivial — but "how many characters does this string have?" has four legitimate answers depending on the layer you ask about: bytes (UTF-8 octets, what cloud storage charges for), code units (UTF-16 chunks, what JavaScript's str.length returns and what SQL Server NVARCHAR limits), codepoints (Unicode characters, what str iteration yields in Python 3 and modern JS), or grapheme clusters (what humans perceive as one character). The four can disagree dramatically — the family emoji 👨👩👧👦 is 1 grapheme but 7 codepoints, 11 UTF-16 code units, and 25 UTF-8 bytes. This counter reports the user-perceived grapheme count so the number matches what you see on screen, and separately exposes word, line, paragraph and sentence counts using Unicode-aware boundary detection (UAX #29). Because platform limits are layer-specific — Twitter/X counts CJK as 2, SMS GSM-7 packs 160 ASCII into 140 bytes while UCS-2 drops to 70 the moment one emoji appears, SEO titles are pixel-budgeted (~580 px) not character-budgeted, and LLM tokenizers average ~4 chars per token for English but 1-2 for Vietnamese — use this counter for general drafting and validate against each platform's official counter before publishing. Counting runs locally with a 300 ms debounce; nothing is uploaded. See also our Case Converter and the Lorem Ipsum Generator.
Why do character counts differ between this tool, Microsoft Word, and Twitter?
Different platforms count characters using different rules. This tool counts grapheme clusters, so each visible glyph is one unit. Microsoft Word's "Characters" statistic reports two numbers, with and without spaces, and may exclude footnotes by default. Twitter/X is the most complex: it counts every URL as 23 characters regardless of actual length (link wrapping), counts most emojis as 2 characters, treats Han/Hangul/Hiragana ranges as 2 characters each, and applies a weighted formula in its publishing API. To stay safely under platform limits, always run the final text through each platform's official counter. This tool is for general drafting and counts what is visually there, which can be lower than a platform's weighted count.
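To make the weighted rule concrete, here is a minimal sketch of a Twitter/X-style count. The ranges and the URL length are assumptions modelled on the open-source twitter-text v3 configuration, the URL pattern is deliberately crude, and the function name is hypothetical. The real library also normalizes text and counts a whole emoji sequence as 2, so treat this only as an approximation.

```python
import re

# Assumed values, modelled on the twitter-text v3 configuration.
URL_LENGTH = 23
LIGHT_RANGES = [(0, 4351), (8192, 8205), (8208, 8223), (8242, 8247)]
URL_PATTERN = re.compile(r"https?://\S+")  # crude matcher, illustration only


def weighted_length(text: str) -> int:
    """Approximate weighted length: URLs cost 23, light code points 1, others 2."""
    urls = URL_PATTERN.findall(text)
    remainder = URL_PATTERN.sub("", text)
    total = URL_LENGTH * len(urls)
    for ch in remainder:
        cp = ord(ch)
        is_light = any(lo <= cp <= hi for lo, hi in LIGHT_RANGES)
        total += 1 if is_light else 2
    return total


print(weighted_length("hello"))  # 5
print(weighted_length("日本語"))  # 6
print(weighted_length("see https://example.com/a/very/long/path"))  # 27
```

Because this sketch weighs each code point separately, multi-codepoint emoji sequences come out higher than on the platform itself.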
How are emojis, accented letters, and combining characters counted?
Naive character counting (string length) can give surprising results because the underlying Unicode model is more complex than "one character = one count." A simple emoji like 😀 is one codepoint and one perceived character, which is straightforward. But a family emoji 👨‍👩‍👧‍👦 is technically four emoji codepoints joined by three zero-width joiners: seven codepoints, one visible glyph. Accented letters can be one precomposed codepoint (é, NFC) or a base letter plus a combining mark (e + ́, NFD). This tool counts perceived characters (grapheme clusters) when possible, so 👨‍👩‍👧‍👦 reads as 1. JavaScript's str.length returns the UTF-16 code unit count (11 for that family emoji), not the codepoint count (7), so different tools may disagree by design.
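The NFC/NFD difference is easy to see with Python's standard unicodedata module. This is a small sketch, and nfc_length is a hypothetical helper name: normalizing to NFC before counting codepoints makes precomposed and decomposed spellings of é agree.

```python
import unicodedata

composed = "\u00e9"     # é as one precomposed code point (NFC form)
decomposed = "e\u0301"  # e followed by U+0301 COMBINING ACUTE ACCENT (NFD form)


def nfc_length(text: str) -> int:
    """Count code points after NFC normalization."""
    return len(unicodedata.normalize("NFC", text))


print(len(composed), len(decomposed))                # 1 2
print(composed == decomposed)                        # False
print(nfc_length(composed), nfc_length(decomposed))  # 1 1
```

Normalization only helps where a precomposed form exists. Emoji ZWJ sequences stay multi-codepoint, so they still need grapheme-cluster segmentation.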
What's the optimal character count for SEO title tags and meta descriptions in 2026?
Google's SERP renders titles in around 580 pixels and descriptions in around 920 pixels of width, not a fixed character count: wide letters (W, M) take more space than narrow ones (i, l). As a practical proxy, aim for titles of 50-60 characters (mobile truncates earlier, at about 50) and descriptions of 120-160 characters (mobile shows ~120, desktop ~160). Google does not penalize longer text; it just truncates with an ellipsis, which can hurt CTR. Front-load the most important words. For other platforms: Open Graph titles 60-90, descriptions ~200; Twitter cards 70/200; LinkedIn shares 150 for titles, 250 for descriptions. For your most important pages, check how the snippet renders in a SERP preview tool or in live search results; Google's Rich Results Test validates structured data rather than title or description truncation.
What does WCAG 2.2 say about ideal character count per line for accessibility?
WCAG 2.2 Success Criterion 1.4.8 (Visual Presentation, Level AAA) recommends a maximum line length of 80 characters (40 for Chinese, Japanese, and Korean). Typography research converges on 50-75 characters per line as optimal for reading speed and comprehension: shorter lines (under 40) force too many eye jumps, and longer lines (over 90) cause readers to lose their place when returning to start a new line. For body text on the web, set CSS max-width to roughly 65ch (the ch unit equals the width of the 0 character). This tool counts total characters in the entire text, not per line. To check per-line counts, split the text on newlines and measure each substring. The tool's long-line warning is a quick, accessibility-first heuristic, not a full WCAG audit.
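The per-line check described above (split on newlines, measure each piece) might look like this. It is a sketch with a hypothetical function name, and it measures lines in codepoints with len, so lines containing emoji sequences or combining marks can report slightly more than the visible character count.

```python
def long_lines(text: str, limit: int = 80) -> list[tuple[int, int]]:
    """Return (line_number, length) for every line longer than `limit`."""
    return [
        (number, len(line))
        for number, line in enumerate(text.splitlines(), start=1)
        if len(line) > limit
    ]


sample = "short line\n" + "x" * 81
print(long_lines(sample))  # [(2, 81)]
```

Pass limit=40 for Chinese, Japanese, or Korean text to match the lower limit in the success criterion.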

How do SMS message segments work and why does one emoji split my text into multiple messages?
SMS uses two encodings. GSM-7 (the default) packs 160 characters into one 140-byte SMS using 7-bit characters. It covers ASCII plus basic accents (é, à, ñ are fine; others trigger the fallback). UCS-2 (Unicode) is used the moment any character outside GSM-7 appears, including emojis, curly quotes, em dashes, or many Vietnamese diacritics, and reduces capacity to 70 characters per segment. Multi-segment SMS carries 153 (GSM-7) or 67 (UCS-2) characters per segment because the concatenation header eats the rest. So a 100-character message with one emoji becomes UCS-2, exceeds the 70-character single-segment limit, and is split into two segments (2 × 67 = 134 characters of capacity, billed as 2 messages). Note that an emoji above U+FFFF takes 2 of those UCS-2 slots. Twilio and other gateways bill per segment, not per character. Strip curly quotes and emojis to keep texts in single GSM-7 segments and save money on bulk SMS.
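The segment arithmetic can be sketched as follows. The character tables follow the GSM 03.38 default alphabet and its extension table as best understood here, and sms_segments is a hypothetical helper. Real gateways may differ in details, for example by transliterating characters or by refusing to split a surrogate pair across segments.

```python
import math

# GSM 7-bit default alphabet (the ESC code itself is omitted).
GSM7_BASIC = (
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞÆæßÉ !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§¿abcdefghijklmnopqrstuvwxyzäöñüà"
)
# Extension table: each of these costs two septets (escape + character).
GSM7_EXTENDED = "^{}\\[~]|€\f"


def sms_segments(text: str) -> tuple[str, int]:
    """Return (encoding, segment_count) for a message body."""
    if all(ch in GSM7_BASIC or ch in GSM7_EXTENDED for ch in text):
        units = sum(2 if ch in GSM7_EXTENDED else 1 for ch in text)
        single, multi, encoding = 160, 153, "GSM-7"
    else:
        # UCS-2 capacity is measured in 16-bit units; emoji above U+FFFF take two.
        units = len(text.encode("utf-16-be")) // 2
        single, multi, encoding = 70, 67, "UCS-2"
    segments = 1 if units <= single else math.ceil(units / multi)
    return encoding, segments


print(sms_segments("a" * 160))        # ('GSM-7', 1)
print(sms_segments("a" * 161))        # ('GSM-7', 2)
print(sms_segments("a" * 99 + "😀"))  # ('UCS-2', 2)
```

The last call is the 100-character message with one emoji: the emoji forces UCS-2, and 101 code units no longer fit in a single 70-unit segment.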
What's the difference between bytes, codepoints, code units, and grapheme clusters?
These four layers are the source of most character-counting confusion. Bytes: the raw octets in the encoded file (UTF-8 uses 1-4 bytes per codepoint). Code units: the 16-bit chunks in UTF-16 (JavaScript and Java strings, Windows API); emojis above U+FFFF use 2 code units. Codepoints: actual Unicode characters (U+1F600 for 😀); iterating over a string in modern languages yields codepoints. Grapheme clusters: what humans perceive as one character. The family emoji 👨‍👩‍👧‍👦 is 1 grapheme but 7 codepoints, 11 UTF-16 code units, and 25 UTF-8 bytes. This tool reports the user-perceived grapheme count. When working with APIs that bill by bytes (cloud storage) or limit by code units (SQL Server NVARCHAR), pick the right layer for your use case.
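Three of the four layers can be measured with nothing but the Python standard library, as in this sketch (layer_counts is a hypothetical name). Grapheme clusters are left out on purpose: the standard library has no UAX #29 segmenter, so that layer needs a third-party package such as the regex module, whose \X pattern matches one extended grapheme cluster.

```python
def layer_counts(text: str) -> dict[str, int]:
    """Count UTF-8 bytes, UTF-16 code units, and Unicode code points."""
    return {
        "utf8_bytes": len(text.encode("utf-8")),
        "utf16_code_units": len(text.encode("utf-16-le")) // 2,
        "code_points": len(text),
    }


# Man, woman, girl, boy joined by three zero-width joiners (U+200D).
family = "\U0001F468\u200D\U0001F469\u200D\U0001F467\u200D\U0001F466"
print(layer_counts(family))
# {'utf8_bytes': 25, 'utf16_code_units': 11, 'code_points': 7}
```

The arithmetic checks out by hand: four emoji at 4 bytes and 2 code units each, plus three joiners at 3 bytes and 1 code unit each.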
How can I estimate reading time from character or word count for blog posts?
Average adult silent reading speed in English is 200-300 words per minute (WPM); reading aloud is slower at 150-160 WPM. Technical content can slow readers to 50-100 WPM. To estimate reading time, divide the word count by 238 (a commonly cited average for silent reading of English non-fiction; Medium's own estimate uses a somewhat higher figure of about 265 WPM) and round up. For other languages: Spanish 220, French 195, Portuguese 215, Vietnamese 180. Languages written without spaces between words, such as Chinese and Japanese, are often measured in characters per minute instead (Chinese ~300 cpm). Character-based estimates are useful when word boundaries are unclear: divide total characters (with spaces) by 1,500 to get minutes for English. This counter shows words and characters, so you can compute the reading time yourself and display "5 min read" badges on long-form content.
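The word-based estimate can be sketched in a few lines. The function names are hypothetical, the 238 WPM default is just the figure quoted above, and words are split on whitespace, which is only a rough approximation for text without spaces.

```python
import math


def reading_time_minutes(text: str, wpm: int = 238) -> int:
    """Estimate reading time in whole minutes, rounded up, minimum 1."""
    words = len(text.split())
    return max(1, math.ceil(words / wpm))


def reading_badge(text: str, wpm: int = 238) -> str:
    """Format the estimate as a short badge label."""
    return f"{reading_time_minutes(text, wpm)} min read"


article = "word " * 1000        # stand-in for a 1,000-word post
print(reading_badge(article))   # 5 min read
```

Swap in a different wpm value for other languages or for technical content.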
How do LLM token counts relate to character counts for prompt cost estimation?
Large language models (GPT, Claude, Llama, Gemini) charge by tokens, not characters. As a rough rule for English text, 1 token ≈ 4 characters ≈ 0.75 words, so a 1,000-character paragraph is roughly 250 tokens. But this ratio varies dramatically: code averages fewer characters per token (~3) because it is dense with symbols and punctuation, and non-English text also yields fewer characters per token because BPE tokenizers were trained primarily on English. Vietnamese and Thai can drop to 1.5-2 characters per token due to multi-byte diacritic encoding. Japanese and Chinese are even less efficient, sometimes 1 token per character. To budget API costs accurately, use the model's official tokenizer (tiktoken for OpenAI models, Anthropic's token-counting API for Claude). This character counter gives a fast first estimate: divide characters by 4 for English-heavy prompts.
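As a back-of-the-envelope helper, the divide-by-4 rule looks like this. It is a heuristic sketch with a hypothetical function name, not a tokenizer: the chars_per_token values are rough assumptions taken from the ratios above, and real counts depend on the specific model.

```python
import math


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate: character count divided by an assumed ratio."""
    return math.ceil(len(text) / chars_per_token)


prompt = "a" * 1000                   # stand-in for a 1,000-character prompt
print(estimate_tokens(prompt))        # 250
print(estimate_tokens(prompt, 3.0))   # 334, using the rough ratio for code
```

Use a lower chars_per_token for code or non-English text, and confirm with the official tokenizer before committing to a budget.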
Example Results
| Input Text | Characters (incl. spaces and line breaks) | Words | Spaces | Lines | Paragraphs | Sentences |
|---|---|---|---|---|---|---|
| Hello World! | 12 | 2 | 1 | 1 | 1 | 1 |
| This is a test.\nAnother line here. | 34 | 7 | 5 | 2 | 2 | 2 |
| Character counter tool\nis very useful\nfor writers. | 50 | 8 | 5 | 3 | 3 | 3 |
