Beanbag #109 | Project Amaze!

#109 λℓI

Chapter 19 Facts

Number of paragraphs: 247
Number of sentences: 687
Number of tokens: 5,871
Number of unique tokens: 1,406

Number of speakers: 2
Grace : 317 tokens
Rocky : 241 tokens
Direct speech: 9.50% of tokens
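
The direct-speech share appears to be the two speakers' token counts divided by the chapter's token total. This is an assumption inferred from the numbers above, not something the page states. A quick check under that assumption:

```python
# Token counts copied from the figures above.
speaker_tokens = {"Grace": 317, "Rocky": 241}
total_tokens = 5871

# Assumption: direct speech = sum of all speakers' tokens.
share = 100 * sum(speaker_tokens.values()) / total_tokens
print(f"{share:.2f}%")  # prints 9.50%
```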

Space: 2 sections; 100.00% of tokens

Words unusually frequent for Space sections:
chain, winch, link, sampler, fall.
Words unusually infrequent or lacking for Space sections:
Taumoeba, okay, they, find, sleep.

For the sentence count, segmentation was performed using spaCy. Tokenization is based only on whitespace, em-dash, en-dash, and ellipsis delimiters, so punctuation stays attached to its word. Unique tokens are counted case-insensitively.
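
A minimal sketch of the tokenization described above, assuming a regular-expression split. The names `tokenize` and `count_tokens` are hypothetical, and treating a run of three or more ASCII dots as an ellipsis is an assumption; the script actually used for these counts may differ.

```python
import re

# Delimiters: whitespace runs, em-dash (U+2014), en-dash (U+2013),
# the ellipsis character (U+2026), or three or more ASCII dots (assumption).
_DELIMS = re.compile(r"\s+|\u2014|\u2013|\u2026|\.{3,}")

def tokenize(text):
    """Split text on the delimiters and drop empty pieces."""
    return [piece for piece in _DELIMS.split(text) if piece]

def count_tokens(text):
    """Return (token count, case-insensitive unique token count)."""
    tokens = tokenize(text)
    unique = {token.lower() for token in tokens}
    return len(tokens), len(unique)

sample = "Hello world\u2014okay\u2026 Okay"
print(tokenize(sample))      # ['Hello', 'world', 'okay', 'Okay']
print(count_tokens(sample))  # (4, 3)
```

Because no other punctuation is a delimiter, "okay" and "okay," would count as different tokens under this scheme.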

Speaker identification was done manually.

Unusually frequent or infrequent words are based on log-likelihood of lemmas (lemmatization by spaCy).
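
The page does not say which log-likelihood variant or reference corpus is used. As an illustration only, here is a sketch of the common two-corpus log-likelihood (G2) keyness score for a single lemma, signed so that positive means unusually frequent in the target text. The function name and argument names are hypothetical.

```python
import math

def log_likelihood(a, b, c, d):
    """Signed G2 keyness score for one lemma.

    a: count of the lemma in the target text
    b: count of the lemma in the reference text
    c: total tokens in the target text
    d: total tokens in the reference text
    """
    total = c + d
    # Expected counts if the lemma were equally frequent in both texts.
    expected_target = c * (a + b) / total
    expected_reference = d * (a + b) / total
    g2 = 0.0
    if a > 0:  # a zero count contributes nothing to the sum
        g2 += a * math.log(a / expected_target)
    if b > 0:
        g2 += b * math.log(b / expected_reference)
    g2 *= 2
    # Positive: relatively more frequent in the target; negative: less frequent.
    return g2 if a / c >= b / d else -g2

print(log_likelihood(10, 10, 1000, 1000))     # 0.0
print(log_likelihood(30, 10, 1000, 1000) > 0) # True
print(log_likelihood(0, 20, 1000, 1000) < 0)  # True
```

Ranking lemmas by this score and taking the top and bottom few would give lists like the frequent and infrequent words above, though the exact cutoffs used here are not stated.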
