#45 IIλ
Chapter 08 Facts
Number of paragraphs: 209
Number of sentences: 631
Number of tokens: 5,847
Number of unique tokens: 1,434
Number of speakers: 3
Dr. Lokken : 576 tokens
Stratt : 437 tokens
Grace : 341 tokens
Direct speech: 23.16% of tokens
Space: 3 sections; 67.86% of tokens
Earth: 1 section; 32.14% of tokens
Words unusually frequent for Earth sections:
Lokken, centrifuge, ship, Stratt, gravity.
Words unusually infrequent or lacking for Earth sections:
he, light, there, their, people.
Words unusually frequent for Space sections:
whisker, doohickey, sphere, cylinder, model.
Words unusually infrequent or lacking for Space sections:
he, Rocky, his, question, much.
For the sentence count, segmentation was performed using spaCy. Tokenization is based only on whitespace, em-dash, en-dash, and ellipsis delimiters. Unique tokens are counted case-insensitively.
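The tokenization rule described above could be sketched roughly as follows. This is only an illustration, not the script used to produce the counts: the names `tokenize` and `count_tokens` are hypothetical, and treating three ASCII dots as an ellipsis is an assumption. Punctuation other than the listed delimiters stays attached to its token.

```python
import re

# Delimiters: whitespace, em-dash (U+2014), en-dash (U+2013), and ellipsis
# (U+2026). Treating "..." as an ellipsis too is an assumption.
DELIMITERS = re.compile(r"\s+|\u2014|\u2013|\u2026|\.{3}")


def tokenize(text):
    """Split text on the delimiters and drop empty strings."""
    return [tok for tok in DELIMITERS.split(text) if tok]


def count_tokens(text):
    """Return (token count, case-insensitive unique token count)."""
    tokens = tokenize(text)
    unique = {tok.lower() for tok in tokens}
    return len(tokens), len(unique)
```

For a made-up string such as "The ship\u2014the Ship\u2026 spins", this yields five tokens, three of them unique once case is ignored.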
Speaker identification was done manually.
Unusually frequent or infrequent words are based on log-likelihood of lemmas (lemmatization by spaCy).
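As a rough illustration of a log-likelihood comparison for one lemma, the sketch below uses the common two-corpus G2 form. Whether this exact formulation was used here is an assumption, and the function name `log_likelihood` and the signed return value are illustrative choices, not part of the original method.

```python
import math


def log_likelihood(a, b, c, d):
    """Signed G2 score for one lemma.

    a, b: occurrences of the lemma in the target and reference sections.
    c, d: total token counts of the target and reference sections.
    Positive means unusually frequent in the target, negative unusually
    infrequent (the sign convention is an assumption of this sketch).
    """
    expected_target = c * (a + b) / (c + d)
    expected_reference = d * (a + b) / (c + d)
    g2 = 0.0
    if a > 0:
        g2 += a * math.log(a / expected_target)
    if b > 0:
        g2 += b * math.log(b / expected_reference)
    g2 *= 2
    return g2 if a / c >= b / d else -g2
```

A lemma with the same relative frequency in both groups of sections scores 0; the further the observed counts drift from the expected ones, the larger the absolute score.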