Readability Scores Explained: What They Mean and Why They Matter

By Michael Lip · Published April 2025 · 7 min read

Readability scores have been used since the 1940s to measure how easy text is to read. They appear in Microsoft Word, Grammarly, and dedicated tools like enhio.com. But most people see the number without understanding what it measures, what it misses, and how to actually use it. Here is the full picture.

The Two Main Formulas

Flesch Reading Ease

Developed by Rudolf Flesch in 1948, this is the most widely used readability formula. It produces a score that normally falls between 0 and 100 (extreme texts can land outside that range), where higher means easier to read:

Score = 206.835 - 1.015 * (total words / total sentences)
                 - 84.6 * (total syllables / total words)

The formula uses two inputs: average sentence length and average syllables per word. Shorter sentences and simpler words produce higher scores.
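As a minimal sketch, the formula translates directly into code once the three counts are known. The function name `fleschReadingEase` and the sample counts are illustrative assumptions, not taken from any particular tool.

```javascript
// Flesch Reading Ease from raw counts (all counts assumed to be positive).
function fleschReadingEase(totalWords, totalSentences, totalSyllables) {
  const wordsPerSentence = totalWords / totalSentences;
  const syllablesPerWord = totalSyllables / totalWords;
  return 206.835 - 1.015 * wordsPerSentence - 84.6 * syllablesPerWord;
}

// Hypothetical passage: 100 words, 5 sentences, 150 syllables.
const score = fleschReadingEase(100, 5, 150);
console.log(score.toFixed(1)); // 206.835 - 20.3 - 126.9, prints 59.6
```

The result is not clamped, so very short sentences made of one-syllable words can score above 100.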

Score interpretation: 90-100 is very easy (5th grade), 60-70 is standard (8th-9th grade), 30-50 is difficult (college level), and 0-30 is very difficult (graduate level). Most popular web content scores between 55 and 70. Legal documents typically score 20 to 30.

Flesch-Kincaid Grade Level

This is a reformulation of the Flesch formula that outputs a US grade level instead of a 0-100 score:

Grade = 0.39 * (total words / total sentences)
      + 11.8 * (total syllables / total words)
      - 15.59

A grade level of 8.2 means an average US 8th grader should be able to understand the text. The formula was developed in the 1970s for the US Navy, and the US military has used it to set readability requirements for technical manuals. Most major newspapers aim for grade 6-8.
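The grade-level formula can be sketched the same way. The function name `fleschKincaidGrade` and the sample counts below are assumptions chosen for illustration.

```javascript
// Flesch-Kincaid Grade Level from raw counts (all counts assumed to be positive).
function fleschKincaidGrade(totalWords, totalSentences, totalSyllables) {
  return 0.39 * (totalWords / totalSentences)
       + 11.8 * (totalSyllables / totalWords)
       - 15.59;
}

// Hypothetical passage: 100 words, 5 sentences, 150 syllables.
const grade = fleschKincaidGrade(100, 5, 150);
console.log(grade.toFixed(1)); // 7.8 + 17.7 - 15.59, prints 9.9
```

Nothing in the formula bounds the result, so a single short sentence of one-syllable words can come out below grade 0.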

What These Formulas Actually Measure

Both formulas are proxies. They do not measure comprehension directly. They measure two surface features — sentence length and word length — that are correlated with difficulty. The underlying assumption is reasonable: longer sentences require more working memory to parse, and longer words tend to be rarer and more abstract.

But correlation is not causation. Consider these two sentences:

"The cat sat on the mat." (Flesch-Kincaid grade: about -1.5, below first grade)
"Ontological commitments necessitate epistemological frameworks." (Flesch-Kincaid grade: ~36, far off the school scale)

The formulas correctly rank these. But consider:

"The eigenvalues of the covariance matrix determine the principal components." (Grade: ~15)
"The big happy dog ran very fast around the really long beautiful garden." (Grade: ~8)

The formulas rank this pair correctly too, but for the wrong reason. The first sentence is harder for most people because it requires domain knowledge, not because its words are long. Express the same idea in shorter jargon and the grade drops while the difficulty stays. The formulas cannot measure conceptual difficulty, only linguistic surface features.
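One way to see how little the formula looks at: reverse the word order of a sentence and the grade does not move, because the word, sentence, and syllable counts are unchanged. The sketch below assumes a rough vowel-group syllable counter (the next section covers syllable counting in detail); `roughSyllables` and `gradeOfSentence` are hypothetical names.

```javascript
// Rough syllable counter: one syllable per run of vowels, minus a trailing silent e.
function roughSyllables(word) {
  const lower = word.toLowerCase();
  const groups = lower.match(/[aeiouy]+/g) || [];
  let count = groups.length;
  if (lower.endsWith('e') && count > 1) count--;
  return Math.max(1, count);
}

// Flesch-Kincaid grade for a single sentence.
function gradeOfSentence(sentence) {
  const words = sentence.match(/[A-Za-z]+/g) || [];
  if (words.length === 0) return NaN;
  const syllables = words.reduce((sum, w) => sum + roughSyllables(w), 0);
  // One sentence, so words-per-sentence is just the word count.
  return 0.39 * words.length + 11.8 * (syllables / words.length) - 15.59;
}

const original = 'The eigenvalues of the covariance matrix determine the principal components.';
// Same words in reverse order: unreadable, but identical counts.
const reversed = original.slice(0, -1).split(' ').reverse().join(' ') + '.';

console.log(gradeOfSentence(original) === gradeOfSentence(reversed)); // true
```

A reader would find the reversed sentence meaningless, yet the formula treats both versions as equally readable.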

Syllable Counting: The Technical Challenge

Both formulas depend on accurate syllable counting, which is surprisingly hard in English. The standard algorithmic approach counts vowel groups (sequences of consecutive vowels):

function countSyllables(word) {
  word = word.toLowerCase();
  // Very short words ("a", "to", "by") are treated as one syllable.
  if (word.length <= 2) return 1;
  let count = 0;
  let prevVowel = false;
  for (const ch of word) {
    const isVowel = 'aeiouy'.includes(ch);
    // Count only the first vowel of each consecutive run.
    if (isVowel && !prevVowel) count++;
    prevVowel = isVowel;
  }
  // Silent e adjustment: a trailing "e" usually adds no syllable.
  if (word.endsWith('e') && count > 1) count--;
  // Every word has at least one syllable.
  return Math.max(1, count);
}

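This heuristic works for about 90% of English words but fails on exceptions. It handles "beautiful" (3 syllables, 3 vowel groups) and "queue" (1 syllable, a single run of 4 vowels) correctly, because each run of vowels counts once. It undercounts words where adjacent vowels belong to different syllables: "area" has 3 syllables (ar-e-a) but only 2 vowel groups. It also trips on a final "e" that is not silent: "table" has 2 syllables, but the silent-e adjustment reduces the count to 1. Perfect syllable counting requires a dictionary lookup, which most tools skip in favor of the faster heuristic.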
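A common compromise is to consult a small exceptions table before falling back to the heuristic. The sketch below assumes a hand-maintained `SYLLABLE_OVERRIDES` map (hypothetical, with only a few entries shown); a real tool might load a full pronunciation dictionary instead.

```javascript
// Hypothetical overrides for words the vowel-group heuristic gets wrong.
const SYLLABLE_OVERRIDES = new Map([
  ['area', 3],
  ['table', 2],
  ['create', 2],
  ['poem', 2],
]);

// Vowel-group heuristic with a silent-e adjustment (expects a lowercase word).
function heuristicSyllables(word) {
  if (word.length <= 2) return 1;
  const groups = word.match(/[aeiouy]+/g) || [];
  let count = groups.length;
  if (word.endsWith('e') && count > 1) count--;
  return Math.max(1, count);
}

// Check the overrides first, then fall back to the heuristic.
function countSyllablesWithOverrides(word) {
  const key = word.toLowerCase();
  return SYLLABLE_OVERRIDES.has(key)
    ? SYLLABLE_OVERRIDES.get(key)
    : heuristicSyllables(key);
}

console.log(heuristicSyllables('area'));            // 2 (undercounts)
console.log(countSyllablesWithOverrides('area'));   // 3 (from the table)
console.log(countSyllablesWithOverrides('garden')); // 2 (heuristic fallback)
```

The table only needs to hold words the heuristic misses, so it can stay small while fixing the most visible errors.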

Beyond the Score: What to Actually Do

Readability scores are diagnostic tools, not targets. Optimizing directly for a Flesch score produces choppy, lifeless writing. Instead, use the score as a signal:

Check after writing, not during. Write naturally, then run the analysis. Checking while writing interrupts flow.
Look at the components, not just the total. A score of 50 could mean long sentences with simple words or short sentences with complex words. The fix differs. The enhio.com tool breaks this down with separate sentence and word metrics.
Compare to your audience's expectations. Grade 14 is perfectly fine for an academic paper. Grade 14 for a product landing page will lose most visitors.
Track trends, not absolutes. If your blog posts have been scoring 65 and suddenly one scores 45, that particular post probably needs simplification. The absolute number matters less than the deviation.
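The "components" and "trends" advice above can be sketched as follows. The sentence and word splitting is deliberately naive, the syllable counter is the rough vowel-group heuristic, and the names `analyze` and `deviationFromBaseline` are assumptions for illustration rather than any tool's API.

```javascript
// Rough syllable counter: one syllable per run of vowels, minus a trailing silent e.
function roughSyllables(word) {
  const lower = word.toLowerCase();
  const groups = lower.match(/[aeiouy]+/g) || [];
  let count = groups.length;
  if (lower.endsWith('e') && count > 1) count--;
  return Math.max(1, count);
}

// Report the two formula inputs separately, alongside the combined score.
function analyze(text) {
  const sentences = text.split(/[.!?]+/).filter(s => s.trim().length > 0);
  const words = text.match(/[A-Za-z']+/g) || [];
  if (sentences.length === 0 || words.length === 0) return null;
  const syllables = words.reduce((sum, w) => sum + roughSyllables(w), 0);
  const wordsPerSentence = words.length / sentences.length;
  const syllablesPerWord = syllables / words.length;
  return {
    wordsPerSentence,
    syllablesPerWord,
    readingEase: 206.835 - 1.015 * wordsPerSentence - 84.6 * syllablesPerWord,
  };
}

// How far a new score sits from the average of earlier posts.
function deviationFromBaseline(previousScores, newScore) {
  const mean = previousScores.reduce((a, b) => a + b, 0) / previousScores.length;
  return newScore - mean;
}

const report = analyze('The cat sat on the mat. It was a warm day.');
console.log(report.wordsPerSentence, report.syllablesPerWord); // 5.5 1
console.log(deviationFromBaseline([65, 66, 64], 45));          // -20
```

A high `wordsPerSentence` points toward splitting sentences, while a high `syllablesPerWord` points toward simpler vocabulary.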

The Future: AI-Based Readability

Modern language models (see the full landscape at ml0x.com) can assess readability in ways that traditional formulas cannot: they understand context, domain vocabulary, and logical structure. However, they are expensive to run, non-deterministic, and hard to benchmark against a consistent scale. For now, the 75-year-old Flesch formula remains the practical standard, supplemented by human judgment.