[MODULE] - Lexical diversity #391

Open
LeonardPuettmannKern opened this issue Nov 19, 2023 · 0 comments
Labels
cognition enhancement New feature or request

Comments

@LeonardPuettmannKern
Contributor

Please describe the module you would like to add to bricks
Lexical diversity is very easy to compute and a useful indicator of the quality of a text. It could also be used in Cognition.

Do you already have an implementation?

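def lexical_diversity(text):
    # Split into words first; len() on a raw string would count characters.
    words = text.split()
    word_count = len(words)
    vocab_size = len(set(words))
    if vocab_size == 0:
        return 0.0  # empty input: nothing to score
    return word_count / vocab_size  # average number of uses per distinct word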

Additional context
Found here: https://btw.informatik.uni-rostock.de/download/workshopband/C2-5.pdf
The actual implementation in the paper is not correct. The correct implementation and many more useful snippets can be found in the book "Natural Language Processing with Python".
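
Since the direction of the ratio matters when interpreting the score, here is a minimal sketch that computes both variants side by side. The names `tokenize`, `type_token_ratio`, and `mean_word_frequency` are hypothetical and not part of bricks or NLTK, and the regex tokenizer is an assumption made to keep the example self-contained; a real module would probably use a proper tokenizer.

```python
import re


def tokenize(text):
    # Hypothetical helper: lowercase the text and keep runs of letters/apostrophes.
    return re.findall(r"[a-z']+", text.lower())


def type_token_ratio(text):
    # Distinct words divided by total words; values closer to 1 mean less repetition.
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


def mean_word_frequency(text):
    # Total words divided by distinct words; higher values mean more repetition.
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return len(tokens) / len(set(tokens))
```

For "the cat sat on the mat" there are six tokens and five distinct words, so the two functions return 5/6 and 6/5 respectively. Both ratios depend on text length, so scores are only roughly comparable between texts of similar size.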
