kgrams v0.1.2 on CRAN

Natural Language Processing R

kgrams: Classical k-gram Language Models in R.

Valerio Gherardi


Version v0.1.2 of my R package kgrams was just accepted by CRAN. This package provides tools for training and evaluating k-gram language models in R, supporting several probability smoothing techniques, perplexity computations, random text generation and more.

Short demo

# Get k-gram frequency counts from Shakespeare's "Much Ado About Nothing"
freqs <- kgram_freqs(kgrams::much_ado, N = 4)

# Build modified Kneser-Ney 4-gram model, with discount parameters D1, D2, D3.
mkn <- language_model(freqs, smoother = "mkn", D1 = 0.25, D2 = 0.5, D3 = 0.75)

# Sample sentences from the language model at different temperatures
sample_sentences(model = mkn, n = 3, max_length = 10, t = 1)
[1] "i have studied eight or nine truly by your office [...] (truncated output)"
[2] "ere you go : <EOS>"                                                        
[3] "don pedro welcome signior : <EOS>"                                         
sample_sentences(model = mkn, n = 3, max_length = 10, t = 0.1)
[1] "i will not be sworn but love may transform me [...] (truncated output)" 
[2] "i will not fail . <EOS>"                                                
[3] "i will go to benedick and counsel him to fight [...] (truncated output)"
sample_sentences(model = mkn, n = 3, max_length = 10, t = 10)
[1] "july cham's incite start ancientry effect torture tore pains endings [...] (truncated output)"   
[2] "lastly gallants happiness publish margaret what by spots commodity wake [...] (truncated output)"
[3] "born all's 'fool' nest praise hurt messina build afar dancing [...] (truncated output)"          


Overall Software Improvements

API Changes

New features

Bug Fixes



If you see mistakes or want to suggest changes, please create an issue on the source repository.


Text and figures are licensed under Creative Commons Attribution CC BY-SA 4.0. Source code is available at, unless otherwise noted. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".


For attribution, please cite this work as

Gherardi (2021, Nov. 13). vgherard: kgrams v0.1.2 on CRAN. Retrieved from

BibTeX citation

  author = {Gherardi, Valerio},
  title = {vgherard: kgrams v0.1.2 on CRAN},
  url = {},
  year = {2021}