NEWS.md
* `tknz_sent()` and `preprocess()` now have separate implementations on Windows and UNIX OSs (the previous C++ implementation behaved unpredictably on Windows, see #30). This fix also introduced minor changes in the `tknz_sent()` output in some corner cases (e.g. `tknz_sent("")` now returns `character(0)`, whereas it used to return `""`).
* `perplexity()` gets a new argument `exp`, which allows returning the cross-entropy per word rather than the perplexity (its exponential).
* `perplexity.character()` gets a new argument `detailed`, which allows returning, alongside the total perplexity of the input document, the cross-entropies and word lengths of the individual sentences. Closes #28.
* `?kgram_freqs`.
* R requirements 3.5 -> 4.0.
* `SystemRequirements: C++11` (see this tidyverse blog post)
* `verbose` arguments now default to `FALSE`.
* `probability()`, `perplexity()` and `sample_sentences()` are restricted to accept only `language_model` class objects as their `model` argument.
* `.preprocess` and `.tknz_sent` arguments to be ignored in `process_sentences()`.
* `max_lines` and `batch_size` arguments in `kgram_freqs.connection()`.
* `dictionary.dictionary()` with batch processing and non-trivial size constraints on vocabulary size.
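
The relation between the two quantities that `perplexity()` can return (cross-entropy per word, or its exponential) can be illustrated with a minimal base R sketch. It does not call the package: the vector `word_probs` is hypothetical, standing in for the probabilities a language model might assign to the words of one sentence, and the natural logarithm is an assumption.

```r
# Hypothetical per-word probabilities for a four-word sentence
# (illustrative values only, not output of any package function).
word_probs <- c(0.25, 0.5, 0.125, 0.5)

# Cross-entropy per word: average negative log-probability
# (natural logarithm assumed).
cross_entropy <- -mean(log(word_probs))

# Perplexity is the exponential of the cross-entropy per word.
ppl <- exp(cross_entropy)

# Equivalent form: inverse geometric mean of the word probabilities.
ppl_geom <- prod(word_probs)^(-1 / length(word_probs))
stopifnot(isTRUE(all.equal(ppl, ppl_geom)))
```

Working with the cross-entropy directly can be convenient when averaging or comparing scores across documents, since it is additive over words while perplexity is not.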