├── .gitignore
├── LICENSE.md
├── README.md
├── Rakefile
├── lib
│   └── liquid_reading_time.rb
├── liquid_reading_time.gemspec
└── test
    └── test_liquid_reading_time.rb


/.gitignore:
--------------------------------------------------------------------------------
.DS_Store
.*.s[a-w][a-z]

/*.gem

--------------------------------------------------------------------------------
/LICENSE.md:
--------------------------------------------------------------------------------
# The ISC License

Copyright © 2013, 2015, 2017 Benjamin D. Esham

Permission to use, copy, modify, and/or distribute this software for any purpose with or without fee is hereby granted, provided that the above copyright notice and this permission notice appear in all copies.

**The software is provided “as is” and the author disclaims all warranties with regard to this software including all implied warranties of merchantability and fitness. In no event shall the author be liable for any special, direct, indirect, or consequential damages or any damages whatsoever resulting from loss of use, data or profits, whether in an action of contract, negligence or other tortious action, arising out of or in connection with the use or performance of this software.**

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# reading\_time

A [Liquid](http://www.liquidmarkup.org/) filter that intelligently counts the number of words in a piece of HTML and estimates how long the text will take to read.

## Installation

The easiest way to install this plugin is with RubyGems: `gem install liquid_reading_time`.

If you’re using Jekyll, see the Jekyll [documentation on installing plugins](http://jekyllrb.com/docs/plugins/#installing-a-plugin) for more-detailed installation instructions. This plugin requires Nokogiri, so if you install the plugin manually you’ll need to make sure that Nokogiri is installed too.

## Usage

Two functions are provided:

* reading\_time

    This function gives an estimate of the amount of time it will take to read the input text. The return value is an integer number of minutes. The input should be HTML (i.e. the text should have already been run through your Markdown or Textile filter). For example, you could use it in a `_layout` file like this:

        {% capture time %}{{ content | reading_time }}{% endcapture %}
        This article will take {{ time }} {% if time == '1' %}minute{% else %}minutes{% endif %} to read.

    Even better, using the [pluralize](https://github.com/bdesham/pluralize) filter,

        This article will take {{ content | reading_time | pluralize: "minute" }} to read.

* count\_words

    This function returns the number of words in the input. Like `reading_time`, this function takes HTML as its input.

## Details

These functions try to be smart about counting words. Specifically, words are not counted if they are contained within any of the following HTML elements: area, audio, canvas, code, embed, footer, form, img, map, math, nav, object, pre, script, svg, table, track, and video. My intention here is to prevent words from contributing toward the count if they don’t seem to be part of the running text. Contrast this with the simple but inaccurate approach of e.g. Jekyll’s built-in `number_of_words`.

The plugin assumes a reading speed of 270 words per minute. Wikipedia [cites](https://en.wikipedia.org/w/index.php?title=Words_per_minute&oldid=569027766#Reading_and_comprehension) 250–300 words per minute as a typical range, and I found that I could read articles on my website at about 270 words per minute.

## Author

This plugin was created by [Benjamin Esham](https://esham.io).

This project is [hosted on GitHub](https://github.com/bdesham/reading_time). Please feel free to submit pull requests.

## Version history

The version numbers of this project conform to [Semantic Versioning 2.0](http://semver.org/).

* 1.1.3 (2017-07-19)
    - Updated the dependencies to reflect that the plugin works with Liquid 4 (without any code changes needed).
* 1.1.2 (2015-03-07)
    - Apostrophes and curly single quotes shouldn’t break words into two.
    - This plugin works with Liquid 3.x in addition to 2.x; updated the dependencies to reflect that.
    - Added unit tests.
* 1.1.1 (2015-03-04)
    - Don’t put the `ReadingTime` module in the `Jekyll` module.
    - Packaged the plugin as a Gem.
* 1.1.0 (2013-08-30)
    - Switched from REXML to Nokogiri for HTML parsing.
    - Input can now be HTML or XHTML.
      Previously, only valid XML was accepted (so things like non-closed `img` tags would make `reading_time` crash).
    - Character entities like `蘗` are no longer included in the word count.
* 1.0.0 (2013-08-24): Changed reading speed from 220 to 270 words per minute.
* 0.9.0 (2013-08-19): Initial release.

## License

Copyright © 2013, 2015, 2017 Benjamin D. Esham. This program is released under the ISC license, which you can find in the file LICENSE.md.

--------------------------------------------------------------------------------
/Rakefile:
--------------------------------------------------------------------------------
require 'rake/testtask'

Rake::TestTask.new do |t|
  t.libs << 'test'
end

--------------------------------------------------------------------------------
/lib/liquid_reading_time.rb:
--------------------------------------------------------------------------------
# reading_time
#
# A Liquid filter to estimate how long a passage of text will take to read
#
# https://github.com/bdesham/reading_time
#
# Copyright (c) 2013, 2015 Benjamin D. Esham. This program is released under the
# ISC license, which you can find in the file LICENSE.md.

require 'nokogiri'

module ReadingTime

  def count_words(html)
    words(html).length
  end

  # Estimated reading time in whole minutes, rounded up, at 270 words per minute.
  def reading_time(html)
    (count_words(html) / 270.0).ceil
  end

  private

  # Recursively collect the text of every text node, skipping ignored elements.
  def text_nodes(root)
    ignored_tags = %w[ area audio canvas code embed footer form img
                       map math nav object pre script svg table track video ]

    texts = []
    root.children.each { |node|
      if node.text?
        texts << node.text
      elsif not ignored_tags.include? node.name
        texts.concat text_nodes node
      end
    }
    texts
  end

  # A word is a run of letters, combining marks, and apostrophes or single quotes.
  def words(html)
    fragment = Nokogiri::HTML.fragment html
    text_nodes(fragment).map { |text| text.scan(/[\p{L}\p{M}'‘’]+/) }.flatten
  end

end

Liquid::Template.register_filter(ReadingTime)

--------------------------------------------------------------------------------
/liquid_reading_time.gemspec:
--------------------------------------------------------------------------------
Gem::Specification.new do |s|
  s.name = 'liquid_reading_time'
  s.version = '1.1.3'
  s.date = '2017-07-19'

  s.author = 'Benjamin Esham'
  s.email = 'benjamin@esham.io'
  s.homepage = 'https://github.com/bdesham/reading_time'
  s.license = 'ISC'

  s.summary = 'A Liquid filter to count words and estimate reading times.'
  s.description = 'A Liquid filter that intelligently counts the number of words in a piece of HTML and estimates how long the text will take to read.'
  s.files = ['lib/liquid_reading_time.rb']

  s.add_runtime_dependency('liquid', ['>= 2.6', '< 5.0'])
  s.add_runtime_dependency('nokogiri', '~> 1.6')
end

--------------------------------------------------------------------------------
/test/test_liquid_reading_time.rb:
--------------------------------------------------------------------------------
require 'minitest/autorun'
require 'liquid'
require 'liquid_reading_time'

include ReadingTime

class WordCountingTest < Minitest::Test
  def test_one_word
    assert_equal 1, ReadingTime.count_words('I')
  end

  def test_whitespace
    assert_equal 1, ReadingTime.count_words(' Hello')
    assert_equal 1, ReadingTime.count_words('Supercalifragilisticexpialidocious ')
  end

  def test_multiple_words
    assert_equal 4, ReadingTime.count_words('This is a test')
    assert_equal 6, ReadingTime.count_words('Four score and seven years ago')
  end

  def test_punctuation
    assert_equal 2, ReadingTime.count_words('Hello, world!')
    assert_equal 2, ReadingTime.count_words('Hello, world !')
    # This module nominally only supports English, but ¿ and é should be handled correctly anyway
    assert_equal 3, ReadingTime.count_words('¿Por qué, Maria?')
  end

  def test_quotes_and_apostrophes
    assert_equal 6, ReadingTime.count_words('"Well, that\'s all right," he said.')
    assert_equal 6, ReadingTime.count_words('“Well, that’s all right,” he said.')
    assert_equal 6, ReadingTime.count_words('\'Twas brillig, and the slithy toves')
    assert_equal 6, ReadingTime.count_words('’Twas brillig, and the slithy toves')
    assert_equal 4, ReadingTime.count_words('The contrived sentences\' apostrophes')
    assert_equal 4, ReadingTime.count_words('The contrived sentences’ apostrophes')
  end

  def test_html_punctuation
    assert_equal 3, ReadingTime.count_words('Well that’s annoying')
    assert_equal 5, ReadingTime.count_words('These really are separate words')
    assert_equal 3, ReadingTime.count_words('Never do this!')
  end

  def test_html_tags
    assert_equal 4, ReadingTime.count_words('Ends with a tag')
    assert_equal 4, ReadingTime.count_words('Starts with a tag')
    assert_equal 4, ReadingTime.count_words('This statement is false')
    assert_equal 1, ReadingTime.count_words('DCTW!')
    assert_equal 1, ReadingTime.count_words('Cool')
    assert_equal 45, ReadingTime.count_words('
—Ladies and Gentlemen,

—A new generation is growing up in our midst, a generation actuated by new ideas and new
principles. It is serious and enthusiastic for these new ideas and its enthusiasm, even when
it is misdirected, is, I believe, in the main sincere.
')
  end
end

class ReadingTimeTest < Minitest::Test
  def test_shorter_than_one_minute
    assert_equal 0, ReadingTime.reading_time('')
    assert_equal 1, ReadingTime.reading_time('a')
    assert_equal 1, ReadingTime.reading_time('This is a test')
    assert_equal 1, ReadingTime.reading_time('
—Ladies and Gentlemen,

—A new generation is growing up in our midst, a generation actuated by new ideas and new
principles. It is serious and enthusiastic for these new ideas and its enthusiasm, even when
it is misdirected, is, I believe, in the main sincere.
')
    # Each 'Foo ' ends in a space so that the repetitions count as separate words.
    assert_equal 1, ReadingTime.reading_time('Foo ' * 269)
  end

  def test_at_least_one_minute
    assert_equal 1, ReadingTime.reading_time('Foo ' * 270)
    assert_equal 2, ReadingTime.reading_time('Foo ' * 271)
    assert_equal 2, ReadingTime.reading_time('Foo ' * 539)
    assert_equal 2, ReadingTime.reading_time('Foo ' * 540)
    assert_equal 3, ReadingTime.reading_time('Foo ' * 541)
  end
end

class LiquidIntegrationTest < Minitest::Test
  def test_count_words
    template = Liquid::Template.parse('{{ str | count_words }}')
    assert_equal '0', template.render('str' => '')
    assert_equal '1', template.render('str' => 'Hello')
    assert_equal '3', template.render('str' => '“He’s dead, Jim!”')
  end

  def test_reading_time
    template = Liquid::Template.parse('{{ str | reading_time }}')
    assert_equal '0', template.render('str' => '')
    assert_equal '1', template.render('str' => 'Hello')
    assert_equal '1', template.render('str' => 'Foo ' * 270)
    assert_equal '2', template.render('str' => 'Foo ' * 271)
  end
end

--------------------------------------------------------------------------------
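As a rough illustration of the rounding rule in `lib/liquid_reading_time.rb`, the following is a minimal sketch that works on plain text only, so it needs neither Nokogiri nor Liquid. The names `WORDS_PER_MINUTE`, `count_plain_words`, and `estimate_minutes` are hypothetical and are not part of the plugin. The sketch assumes the input contains no HTML, whereas the real filter first drops text inside ignored elements.

```ruby
# Hypothetical names for illustration; the plugin itself parses HTML with Nokogiri first.
WORDS_PER_MINUTE = 270.0

# Count runs of letters, combining marks, straight apostrophes, and curly
# single quotes (U+2018, U+2019), mirroring the character class used by the plugin.
def count_plain_words(text)
  text.scan(/[\p{L}\p{M}'\u2018\u2019]+/).length
end

# Round up to whole minutes, so any non-empty text takes at least one minute.
def estimate_minutes(text)
  (count_plain_words(text) / WORDS_PER_MINUTE).ceil
end

puts estimate_minutes('')            # 0
puts estimate_minutes('Foo ' * 270)  # 1
puts estimate_minutes('Foo ' * 271)  # 2
```

Because the division is by a float and the result is passed through `ceil`, exactly 270 words still counts as one minute, and the 271st word tips the estimate to two.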