The Science Behind Readability

“An 8th grader should be able to understand this.” My boss frowned at the verbose slides. I was working in the Midwest for a diesel engine manufacturer and had put together some slides for a go-to-market strategy. My boss was a short man, but he had an imposing personality and lots of experience to back it up. He was a former US Navy nuclear submarine officer and had worked many years in strategy consulting. His advice made sense. I am sure that if a written instruction or verbal order on a nuclear submarine were too complex to understand, the effects could be disastrous. Simple writing means clear communication. Great. However, I had no idea how to actually measure readability in order to improve my writing. Was there a metric that told you if your writing was at a grade 8 level?

Flesch and Flesch-Kincaid Readability Formulas

It turns out that the US Navy has been worrying about personnel understanding narrative technical material for some time (see US Navy Readability Formulas). There are several metrics used to score the readability of a piece of writing. However, two of the most common are the Flesch Reading Ease and the Flesch-Kincaid (F-K) Readability formulas. Rudolf Flesch was born in Austria and fled before the Nazi invasion. He migrated to the US and earned his PhD in English from Columbia University. In 1948 he published the readability metric in his article ‘A New Readability Yardstick’ in the Journal of Applied Psychology.

Simply put, the readability calculations work out the average words per sentence and the average syllables per word. They then take these ratios and create a standardized scale. As expected, the fewer words per sentence and the fewer syllables per word, the simpler the communication. The Flesch Reading Ease is on a scale of 0 to 100, where a higher score means simpler English. The Flesch-Kincaid score gives the approximate minimum US grade level required to understand the text. The formulas used are given below.

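Flesch Reading Ease

Reading Ease = 206.835 - (1.015 × words / sentences) - (84.6 × syllables / words)

Flesch-Kincaid Readability Grade

Grade Level = (0.39 × words / sentences) + (11.8 × syllables / words) - 15.59

Here words, sentences and syllables are the total counts for the text being scored.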

Flesch and Flesch-Kincaid Readability Range

Is there a perfect score to aim for in your writing? I don’t know. However, the range of reading ease is shown in the table below. Note that since the Flesch Reading Ease is a different metric from the Flesch-Kincaid score, the two will give different approximations of the US grade level. Generally the Flesch-Kincaid score will give a lower US grade.

Flesch Reading Ease Score | Style Description | Flesch Estimated US Reading Grade
90-100 | Very Simple English | 5th Grade or less
80-90 | Simple English | 6th Grade
70-80 | Moderately Easy English | 7th Grade
60-70 | Plain English | 8th - 9th Grade
50-60 | Moderately Difficult English | 10th - 12th Grade
30-50 | Difficult English | Undergraduate University
0-30 | Very Difficult English | Graduate School

To show some real-world examples of these readability metrics, I took a variety of texts and ran them through the formulas above. There are a few ways to analyse your text to get these metrics, which I’ll show you in the next section.

I downloaded some well-known pieces of communication: Churchill’s speech “We shall fight on the beaches…”, the Hamlet soliloquy “To be or not to be…”, Kafka’s Metamorphosis, Obama’s inauguration speech, Trump’s inauguration speech, the latest Trump tweets and a transcript of some of Louis CK’s stand-up comedy. I plotted the results in the chart below, with the Flesch Reading Ease on the X-axis and the F-K grade on the Y-axis.

F-K and Flesch Reading Ease Examples

Interestingly, most of the writing scored between 8th and 9th grade on the F-K scale, with a Flesch Reading Ease of 60 to 70. There was not much of a gap between the inauguration speeches of Trump and Obama. Churchill’s wartime speech was a little more complex at roughly 11th grade, while Trump’s latest tweets, a communication medium that promotes short sentences, are at a 6th grade level. What was surprising was that a direct transcript from the stand-up comedian Louis CK scored much lower, with an F-K score at 3rd grade level. This is due to the short sentences that Louis CK uses, which is true for a lot of conversational English.

How do I calculate readability metrics without writing my own code?

1. MS Word

MS Word can display both Flesch readability scores after you have checked the spelling and grammar.

Make sure you have ‘readability statistics’ checked. Do this by going into File > Options > Proofing, then check the readability statistics option.

Next, launch spelling and grammar by going to the Review tab and selecting Spelling & Grammar.

After you have checked all spelling and grammar, Word shows a window with the statistics of the writing. The last section gives you both the Flesch Reading Ease and the Flesch-Kincaid Grade Level.

2. Use an online tool

There are online tools where you can paste text directly into your browser, such as Readability Online.

How do I calculate readability metrics using Python?

Welcome to the geek section of the post. Here I will show you how to write a simple Python (v3.2) script that takes text files and performs the analysis required to score writing using the F-K or Flesch Reading Ease calculations. The hardest part of this analysis is the syllable count. I have used a simplification: every vowel in a word counts as a syllable. The exception to this rule is when the word ends in ‘es’, ‘ed’ or ‘e’. There are better ways to count syllables using Python, but they involve the Natural Language Toolkit (NLTK), see NLTK.org. You can download the NLTK module and then install the Carnegie Mellon University (CMU) Pronouncing Dictionary. A good blog that shows you how to do this is NLTK and Readability.
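As a rough illustration of the vowel-counting shortcut described above, here is a minimal sketch. The function name count_syllables is made up for this example, and two details are my assumptions rather than part of the original script: a trailing ‘es’, ‘ed’ or ‘e’ removes exactly one vowel from the count, and every non-empty word is given at least one syllable.

```python
VOWELS = "aeiou"


def count_syllables(word):
    """Rough syllable estimate: one syllable per vowel letter."""
    # Lower-case the word and drop punctuation stuck to either end.
    word = word.lower().strip(".,:;!?\"'()")
    if not word:
        return 0
    count = sum(1 for ch in word if ch in VOWELS)
    # Exception from the text: words ending in 'es', 'ed' or 'e'.
    # Assumption: the vowel in that ending is simply not counted.
    if word.endswith(("es", "ed", "e")):
        count -= 1
    # Assumption: a real word always has at least one syllable.
    return max(count, 1)


for sample in ["cat", "jumped", "banana", "the"]:
    print(sample, count_syllables(sample))
```

This shortcut is only an approximation. It over-counts words with vowel pairs such as “reading” and under-counts words such as “boxes”, which is why a pronouncing dictionary gives better results.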

The first part of the Python code, #01, uses os.getcwd() to get the current working directory and list its files. This is used to test whether the text file exists when the user types the file name.
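A check like this could be sketched as follows. The helper names list_current_directory and file_exists are hypothetical, and pairing os.getcwd() with os.listdir() is my assumption about how the listing is built.

```python
import os


def list_current_directory():
    """Return the names of the entries in the current working directory."""
    return os.listdir(os.getcwd())


def file_exists(file_name):
    """Return True if file_name appears in the current working directory."""
    return file_name in list_current_directory()


# Example: check a file name before trying to open it.
name = "sample.txt"  # hypothetical file name
if file_exists(name):
    print(name, "found")
else:
    print(name, "not found in", os.getcwd())
```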

The next part of the code, #02, has a while loop that continues to execute until the user types ‘q’ to quit. The code then counts the number of sentences by counting punctuation marks such as “.”, “:”, “;”, “!” or “?”. The number of words is counted using len( text.split() ). Lastly, the code adds up the number of syllables by counting vowels.
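The sentence and word counts described above can be sketched in a few lines. This covers only the counting step, not the input loop, and the function names are made up for this example.

```python
SENTENCE_ENDINGS = ".:;!?"


def count_sentences(text):
    """Count sentences as the number of ending punctuation marks."""
    return sum(1 for ch in text if ch in SENTENCE_ENDINGS)


def count_words(text):
    """Count words by splitting the text on whitespace."""
    return len(text.split())


sample = "Keep it short. Can an 8th grader read this? Yes!"
print(count_sentences(sample))  # 3
print(count_words(sample))      # 10
```

Counting punctuation is a simplification: an ellipsis counts as three sentences, and text with no ending punctuation at all gives a count of zero, which needs to be handled before dividing by the number of sentences.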

Once we have these three counts, we can work out the F-K Grade (G) and the Flesch Reading Ease (R) using the formulas below.

G = (0.39 * words / sentences) + (11.8 * syllables / words) - 15.59
R = 206.835 - (1.015 * words / sentences) - (84.6 * syllables / words)
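Here is a minimal sketch of the two formulas as Python functions, with a worked example. The function names and the sample counts (120 words, 8 sentences, 168 syllables) are invented for illustration, and the zero check is my addition.

```python
def flesch_kincaid_grade(words, sentences, syllables):
    """F-K Grade (G) from the total word, sentence and syllable counts."""
    if words <= 0 or sentences <= 0:
        raise ValueError("need at least one word and one sentence")
    return (0.39 * words / sentences) + (11.8 * syllables / words) - 15.59


def flesch_reading_ease(words, sentences, syllables):
    """Flesch Reading Ease (R) from the same three counts."""
    if words <= 0 or sentences <= 0:
        raise ValueError("need at least one word and one sentence")
    return 206.835 - (1.015 * words / sentences) - (84.6 * syllables / words)


# Worked example: 15 words per sentence and 1.4 syllables per word.
# G = 5.85 + 16.52 - 15.59 = 6.78
# R = 206.835 - 15.225 - 118.44 = 73.17
print(round(flesch_kincaid_grade(120, 8, 168), 1))  # about 6.8
print(round(flesch_reading_ease(120, 8, 168), 1))   # about 73.2
```

In this example the Reading Ease of about 73 falls in the 7th grade band, while the F-K Grade comes out a little lower at just under 7.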

The rest of the script has several if statements based on the range of the R value. That’s all there is to it! You can now test any text file for readability.
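The range checks might look something like the sketch below, which uses the bands from the Flesch Reading Ease table earlier in the post. The function name describe_reading_ease is hypothetical. Because the table bands share their end points (80-90 and 90-100, for example), I have assumed that a score on a boundary belongs to the easier band, and that scores outside 0 to 100 fall into the nearest band.

```python
def describe_reading_ease(score):
    """Map a Flesch Reading Ease score to (style description, US grade)."""
    # Assumption: each band includes its lower bound.
    if score >= 90:
        return ("Very Simple English", "5th Grade or less")
    elif score >= 80:
        return ("Simple English", "6th Grade")
    elif score >= 70:
        return ("Moderately Easy English", "7th Grade")
    elif score >= 60:
        return ("Plain English", "8th - 9th Grade")
    elif score >= 50:
        return ("Moderately Difficult English", "10th - 12th Grade")
    elif score >= 30:
        return ("Difficult English", "Undergraduate University")
    else:
        return ("Very Difficult English", "Graduate School")


style, grade = describe_reading_ease(65)
print(style, "-", grade)
```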

 

 

 
