Readability Scores Explained: What They Mean and Why They Matter
Readability scores have been used since the 1940s to measure how easy text is to read. They appear in Microsoft Word, Grammarly, and dedicated tools like enhio.com. But most people see the number without understanding what it measures, what it misses, and how to actually use it. Here is the full picture.
The Two Main Formulas
Flesch Reading Ease
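Developed by Rudolf Flesch in 1948, this is the most widely used readability formula. It produces a score that normally falls between 0 and 100, where higher means easier to read. The formula itself is unbounded, so unusually simple or unusually dense text can land outside that range:
Score = 206.835 - 1.015 * (total words / total sentences)
                - 84.6 * (total syllables / total words)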
The formula uses two inputs: average sentence length and average syllables per word. Shorter sentences and simpler words produce higher scores.
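As a rough sketch, the formula translates directly into a small function. The function and parameter names here are illustrative, and the three totals are assumed to have been counted elsewhere:

```javascript
// Flesch Reading Ease from three raw counts.
// Names are illustrative; they do not come from any particular library.
function fleschReadingEase(totalWords, totalSentences, totalSyllables) {
  // The score is undefined for empty text, so signal that with NaN.
  if (totalWords === 0 || totalSentences === 0) return NaN;
  const avgSentenceLength = totalWords / totalSentences;
  const avgSyllablesPerWord = totalSyllables / totalWords;
  return 206.835 - 1.015 * avgSentenceLength - 84.6 * avgSyllablesPerWord;
}

// 100 words in 5 sentences with 150 syllables: roughly 59.6
const easeExample = fleschReadingEase(100, 5, 150);
```

Splitting the same 100 words into 10 sentences instead of 5 raises the score by about 10 points, which is the "shorter sentences score higher" effect in numbers.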
Score interpretation: 90-100 is very easy (5th grade), 60-70 is standard (8th-9th grade), 30-50 is difficult (college level), and 0-30 is very difficult (college graduate level). Most popular web content scores between 55 and 70. Legal documents often score in the 20-30 range.
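A tool usually turns the number into a label with a simple lookup. The sketch below is one possible version; the "easy", "fairly easy", and "fairly difficult" bands come from Flesch's original table rather than from the ranges listed above, and treating each lower bound as inclusive is an assumption:

```javascript
// Map a Flesch Reading Ease score to a descriptive band.
// Lower bounds are treated as inclusive (an assumption; the published
// table lists ranges such as 30-50 without saying which side owns 30).
function describeFleschScore(score) {
  const bands = [
    [90, 'very easy'],
    [80, 'easy'],
    [70, 'fairly easy'],
    [60, 'standard'],
    [50, 'fairly difficult'],
    [30, 'difficult'],
  ];
  for (const [lowerBound, label] of bands) {
    if (score >= lowerBound) return label;
  }
  // Anything under 30, including negative scores.
  return 'very difficult';
}
```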
Flesch-Kincaid Grade Level
This is a reformulation of the Flesch formula that outputs a US grade level instead of a 0-100 score:
Grade = 0.39 * (total words / total sentences)
        + 11.8 * (total syllables / total words)
        - 15.59
A grade level of 8.2 means an average US 8th grader should be able to understand the text. The formula was developed for the US Navy in 1975 and was later adopted by the US military for assessing the readability of technical manuals. Many newspapers aim for roughly grade 6-8.
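The grade-level version uses the same two ratios with different weights. A minimal sketch, again with illustrative names and counts assumed to be supplied by the caller:

```javascript
// Flesch-Kincaid Grade Level from three raw counts.
// Names are illustrative; the totals are assumed to be counted elsewhere.
function fleschKincaidGrade(totalWords, totalSentences, totalSyllables) {
  // Undefined for empty text.
  if (totalWords === 0 || totalSentences === 0) return NaN;
  const avgSentenceLength = totalWords / totalSentences;
  const avgSyllablesPerWord = totalSyllables / totalWords;
  return 0.39 * avgSentenceLength + 11.8 * avgSyllablesPerWord - 15.59;
}

// 100 words in 5 sentences with 150 syllables: roughly grade 9.9
const gradeExample = fleschKincaidGrade(100, 5, 150);
```

Because of the constant -15.59, very short and simple text can produce a grade below zero. That is expected behavior of the formula, not a bug in the code.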
What These Formulas Actually Measure
Both formulas are proxies. They do not measure comprehension directly. They measure two surface features — sentence length and word length — that are correlated with difficulty. The underlying assumption is reasonable: longer sentences require more working memory to parse, and longer words tend to be rarer and more abstract.
But a correlation with difficulty is not a measurement of it. Consider these two sentences:
- "The cat sat on the mat." (Flesch-Kincaid grade: about -1.5, below first grade)
- "Ontological commitments necessitate epistemological frameworks." (Flesch-Kincaid grade: about 36, far off the school scale)
The formulas rank these correctly, although the extreme values show how unstable the grade is on a single sentence. But consider:
- "The eigenvalues of the covariance matrix determine the principal components." (Grade: about 15)
- "The big happy dog ran very fast around the really long beautiful garden." (Grade: about 8)
The first sentence is harder for most people, and the formula does rank it higher, but only because its words are longer. The real obstacle is the domain knowledge it requires, which neither formula can see. The formulas cannot measure conceptual difficulty, only linguistic surface features.
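One way to see how little the formulas look at: the grade depends only on word, sentence, and syllable totals, so scrambling a sentence into nonsense leaves it unchanged. A minimal sketch, with the syllable counts entered by hand as an assumption:

```javascript
// Each entry pairs a word with a hand-counted syllable total (an assumption).
const sentenceWords = [
  ['The', 1], ['eigenvalues', 4], ['of', 1], ['the', 1], ['covariance', 4],
  ['matrix', 2], ['determine', 3], ['the', 1], ['principal', 3], ['components', 3],
];

// Flesch-Kincaid grade for a single sentence (sentence count is 1).
function gradeOfSingleSentence(words) {
  const totalSyllables = words.reduce((sum, [, syllables]) => sum + syllables, 0);
  return 0.39 * words.length + 11.8 * (totalSyllables / words.length) - 15.59;
}

const originalGrade = gradeOfSingleSentence(sentenceWords);
// Same words in reverse order: unreadable, but the totals are identical.
const scrambledGrade = gradeOfSingleSentence([...sentenceWords].reverse());

console.log(originalGrade === scrambledGrade); // prints true
```

A human reader would call the reversed sentence incomprehensible. The formula cannot tell the difference.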
Syllable Counting: The Technical Challenge
Both formulas depend on accurate syllable counting, which is surprisingly hard in English. The standard algorithmic approach counts vowel groups (sequences of consecutive vowels):
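function countSyllables(word) {
  // Keep letters only so trailing punctuation does not hide a final "e"
  word = word.toLowerCase().replace(/[^a-z]/g, '');
  if (word.length <= 2) return 1;
  let count = 0;
  let prevVowel = false;
  for (const ch of word) {
    const isVowel = 'aeiouy'.includes(ch);
    // Count the start of each run of consecutive vowels
    if (isVowel && !prevVowel) count++;
    prevVowel = isVowel;
  }
  // Silent e adjustment ("make"), skipped for consonant + "le" endings
  // ("table"), where the final "e" marks a real syllable
  const silentE = word.endsWith('e') && !/[^aeiouy]le$/.test(word);
  if (silentE && count > 1) count--;
  return Math.max(1, count);
}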
This heuristic works for about 90% of English words but fails on exceptions. Adjacent vowels that belong to separate syllables are undercounted: "area" has 3 syllables but only 2 vowel groups, and "poem" has 2 syllables but 1 vowel group. Silent vowels are overcounted: "walked" has 1 syllable but 2 vowel groups. Vowel clusters inside a single syllable are where the grouping helps: "beautiful" (3 syllables) and "queue" (1 syllable despite 4 vowel letters) both come out right. Perfect syllable counting requires a dictionary lookup, which most tools skip in favor of the faster heuristic.
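A middle ground between the pure heuristic and a full dictionary is a small table of known exceptions that is consulted first. The sketch below is hypothetical: the three entries are words the vowel-group approach miscounts, and a real tool would load a far larger list from a pronunciation dictionary.

```javascript
// Hypothetical override table; a real tool would load a much larger list.
const SYLLABLE_OVERRIDES = new Map([
  ['area', 3],   // heuristic sees 2 vowel groups
  ['poem', 2],   // heuristic sees 1 vowel group
  ['walked', 1], // heuristic sees 2 vowel groups
]);

// Plain vowel-group heuristic, used only as the fallback.
function estimateSyllables(word) {
  let count = 0;
  let prevVowel = false;
  for (const ch of word) {
    const isVowel = 'aeiouy'.includes(ch);
    if (isVowel && !prevVowel) count++;
    prevVowel = isVowel;
  }
  // Silent e adjustment
  if (word.endsWith('e') && count > 1) count--;
  return Math.max(1, count);
}

// Look the word up first; fall back to the heuristic when it is unknown.
function countSyllablesWithOverrides(word) {
  const key = word.toLowerCase().replace(/[^a-z]/g, '');
  return SYLLABLE_OVERRIDES.has(key)
    ? SYLLABLE_OVERRIDES.get(key)
    : estimateSyllables(key);
}
```

The lookup adds one hash access per word, so the cost is small. The hard part is building and maintaining the exception list, not the code.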
Beyond the Score: What to Actually Do
Readability scores are diagnostic tools, not targets. Optimizing directly for a Flesch score produces choppy, lifeless writing. Instead, use the score as a signal:
- Check after writing, not during. Write naturally, then run the analysis. Checking while writing interrupts flow.
- Look at the components, not just the total. A score of 50 could mean long sentences with simple words or short sentences with complex words. The fix differs. The enhio.com tool breaks this down with separate sentence and word metrics.
- Compare to your audience's expectations. Grade 14 is perfectly fine for an academic paper. Grade 14 for a product landing page will lose most visitors.
- Track trends, not absolutes. If your blog posts have been scoring 65 and suddenly one scores 45, that particular post probably needs simplification. The absolute number matters less than the deviation.
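Two of these habits are easy to automate. The sketch below reports the sentence and word components of a Flesch score separately, and flags a post whose score drifts away from the running average. The function names and the 10-point threshold are arbitrary illustrations, not taken from any tool:

```javascript
// Break a Flesch Reading Ease score into its two penalties.
// The three totals are assumed to be counted elsewhere.
function readabilityComponents(totalWords, totalSentences, totalSyllables) {
  const wordsPerSentence = totalWords / totalSentences;
  const syllablesPerWord = totalSyllables / totalWords;
  const sentencePenalty = 1.015 * wordsPerSentence; // cost of long sentences
  const wordPenalty = 84.6 * syllablesPerWord;      // cost of long words
  return {
    wordsPerSentence,
    syllablesPerWord,
    sentencePenalty,
    wordPenalty,
    score: 206.835 - sentencePenalty - wordPenalty,
  };
}

// Flag a score that departs from the average of earlier scores by more
// than `threshold` points. The default of 10 is an arbitrary choice.
function deviatesFromBaseline(previousScores, latestScore, threshold = 10) {
  if (previousScores.length === 0) return false;
  const mean = previousScores.reduce((a, b) => a + b, 0) / previousScores.length;
  return Math.abs(latestScore - mean) > threshold;
}

// Two texts that both score near 49 for opposite reasons:
const longSentences = readabilityComponents(300, 10, 450); // 30 words/sentence, 1.5 syllables/word
const longWords = readabilityComponents(300, 30, 525);     // 10 words/sentence, 1.75 syllables/word
```

Comparing `sentencePenalty` and `wordPenalty` for the two results shows which fix applies: split the sentences in the first text, simplify the vocabulary in the second.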
Other Readability Metrics
While Flesch formulas are the most common, several others exist:
- Gunning Fog Index — Focuses on "complex words" (3+ syllables). Formula: 0.4 * (avg sentence length + percentage of complex words). Generally produces higher numbers than Flesch-Kincaid.
- Coleman-Liau Index — Uses character count instead of syllables, making it easier to compute. Formula uses average letters per 100 words and average sentences per 100 words.
- SMOG Grade — Simplified Measure of Gobbledygook. Counts polysyllabic words in 30 sentences. Considered more accurate for healthcare materials.
- Automated Readability Index (ARI) — Uses characters per word and words per sentence. No syllable counting needed.
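For comparison, the four formulas above can be sketched in their commonly published forms. The function names are illustrative, every count is assumed to be supplied by the caller, and "complex" or "polysyllabic" means three or more syllables:

```javascript
// Gunning Fog Index
function gunningFog(words, sentences, complexWords) {
  return 0.4 * (words / sentences + 100 * (complexWords / words));
}

// Coleman-Liau Index: letters and sentences per 100 words, no syllables.
function colemanLiau(letters, words, sentences) {
  const lettersPer100Words = (letters / words) * 100;
  const sentencesPer100Words = (sentences / words) * 100;
  return 0.0588 * lettersPer100Words - 0.296 * sentencesPer100Words - 15.8;
}

// SMOG Grade: polysyllable count normalized to a 30-sentence sample.
function smogGrade(polysyllables, sentences) {
  return 1.043 * Math.sqrt(polysyllables * (30 / sentences)) + 3.1291;
}

// Automated Readability Index: characters per word and words per sentence.
function automatedReadabilityIndex(characters, words, sentences) {
  return 4.71 * (characters / words) + 0.5 * (words / sentences) - 21.43;
}
```

Coleman-Liau and ARI need only character, word, and sentence counts, which is why they avoid the syllable-counting problem entirely.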
For most purposes, Flesch-Kincaid and Flesch Reading Ease are sufficient. They are the most researched, the most widely implemented, and the easiest to interpret.
The Future: AI-Based Readability
Modern language models (see the full landscape at ml0x.com) can assess readability in ways that traditional formulas cannot: they understand context, domain vocabulary, and logical structure. However, they are expensive to run, non-deterministic, and hard to benchmark against a consistent scale. For now, the 75-year-old Flesch formula remains the practical standard, supplemented by human judgment.