Commit

bump version, fix readme and add value error
Perevalov committed Nov 5, 2024
1 parent cc8f4b6 commit 7993d6d
Showing 5 changed files with 24 additions and 6 deletions.
20 changes: 16 additions & 4 deletions README.md
@@ -48,7 +48,7 @@ Example:
 from linguaf import descriptive_statistics as ds


-ds.words_per_sentence(documents)
+ds.avg_words_per_sentence(documents)
 # Output: 15
 ```

@@ -64,7 +64,7 @@ from linguaf import syntactical_complexity as sc


 sc.mean_dependency_distance(documents)
-# Output: 2.307306255835668
+# Output: 2.375
 ```

 ### Lexical Diversity
@@ -83,7 +83,7 @@ from linguaf import lexical_diversity as ld


 ld.log_type_token_ratio(documents)
-# Output: 94.03574963462502
+# Output: 0.9403574963462502
 ```

 ### Readability
@@ -125,7 +125,19 @@ pip install .

 ## Language Support

-At the moment, library supports English and Russian languages for all the methods.
+At the moment, the library supports the following languages:
+* English 🇬🇧 (`en`): full support
+* Russian 🇷🇺 (`ru`): full support
+* German 🇩🇪 (`de`)
+* French 🇫🇷 (`fr`)
+* Spanish 🇪🇸 (`es`)
+* Chinese 🇨🇳 (`zh`)
+* Lithuanian 🇱🇹 (`lt`)
+* Belarusian 🇧🇾 (`be`)
+* Ukrainian 🇺🇦 (`uk`)
+* Armenian 🇦🇲 (`hy`)
+
+**Important:** not every method is implemented for every language. If you call a method that does not support the input language, you'll get a `ValueError`.

 ## Citation

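Since the README now says that unsupported language/method combinations raise a `ValueError`, callers who process mixed-language corpora may want to guard each call. The sketch below is one possible way to do that, not part of the commit. `safe_metric` and `toy_metric` are hypothetical names; `toy_metric` only stands in for a real linguaf function such as `flesch_reading_ease`, which per the diff accepts `documents` and `lang`.

```python
from typing import Callable, Optional


def safe_metric(metric: Callable[..., float], documents: list, lang: str) -> Optional[float]:
    """Call a metric and return None if it rejects the input with a ValueError."""
    try:
        return metric(documents, lang=lang)
    except ValueError:
        # Assumption: the only ValueError we expect is "language not supported".
        return None


def toy_metric(documents: list, lang: str = 'en') -> float:
    """Hypothetical stand-in for a metric that only supports 'en' and 'ru'."""
    if lang not in ('en', 'ru'):
        raise ValueError("Metric is currently not supported for the language " + lang + "!")
    return float(len(documents))


documents = ["Pack my box with five dozen liquor jugs.", "Hello world."]
print(safe_metric(toy_metric, documents, 'en'))  # 2.0
print(safe_metric(toy_metric, documents, 'hy'))  # None
```

Note that a blanket `except ValueError` would also swallow other validation errors a metric might raise, so logging the exception message is probably worthwhile in real use.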
2 changes: 1 addition & 1 deletion linguaf/__init__.py
@@ -2,7 +2,7 @@

 SUPPORTED_LANGS = ['en', 'ru', 'de', 'fr', 'es', 'zh',  # stopwords from nltk
                    'lt', 'be', 'uk', 'hy']  # stopwords from other sources
-__version__ = '0.1.1'
+__version__ = '0.1.2'


 def __load_json(filepath):
4 changes: 4 additions & 0 deletions linguaf/readability.py
@@ -26,6 +26,8 @@ def flesch_reading_ease(documents: list, lang: str = 'en', remove_stopwords: boo
         return 206.835 - 1.015*asl - 84.6*(syl_total/len(words))
     elif lang == 'ru':
         return 206.835 - 1.3*asl - 60.1*(syl_total/len(words))  # coefficients for russian
+    else:
+        raise ValueError("Syllable counting is currently not supported for the language " + lang + "!")


 def flesch_kincaid_grade(documents: list, lang: str = 'en', remove_stopwords: bool = False) -> float:
@@ -51,6 +53,8 @@ def flesch_kincaid_grade(documents: list, lang: str = 'en', remove_stopwords: bo
         return 0.39*asl + 11.8*(syl_total/len(words)) - 15.59
     elif lang == 'ru':
         return 0.5*asl + 8.4*(syl_total/len(words)) - 15.59  # coefficients for russian
+    else:
+        raise ValueError("Syllable counting is currently not supported for the language " + lang + "!")


 def automated_readability_index(documents: list, lang: str = 'en', remove_stopwords: bool = False) -> float:
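To make the new `else` branch easier to reason about, here is a minimal standalone sketch of the Flesch Reading Ease dispatch using the coefficients visible in the hunk above. It is not the library's code: the function name and the count-based parameters are hypothetical, and it assumes `asl` means the average number of words per sentence, which the diff does not show.

```python
def flesch_reading_ease_from_counts(n_sentences: int, n_words: int,
                                    n_syllables: int, lang: str = 'en') -> float:
    """Flesch Reading Ease from raw counts (hypothetical helper, not linguaf API)."""
    asl = n_words / n_sentences   # assumed: average sentence length in words
    asw = n_syllables / n_words   # average syllables per word
    if lang == 'en':
        return 206.835 - 1.015 * asl - 84.6 * asw
    elif lang == 'ru':
        return 206.835 - 1.3 * asl - 60.1 * asw  # coefficients for Russian, as in the diff
    else:
        # Mirrors the behaviour added by this commit: fail loudly instead of returning None.
        raise ValueError("Syllable counting is currently not supported for the language " + lang + "!")


# 2 sentences, 20 words, 30 syllables -> asl = 10, asw = 1.5
print(round(flesch_reading_ease_from_counts(2, 20, 30, 'en'), 3))  # 69.785
print(round(flesch_reading_ease_from_counts(2, 20, 30, 'ru'), 3))  # 103.685
```

Before this commit, falling through both branches would have returned `None` implicitly, which is easy to mistake for a valid score further downstream.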
2 changes: 2 additions & 0 deletions linguaf/syntactical_complexity.py
@@ -41,5 +41,7 @@ def mean_dependency_distance(documents: list, lang: str = 'en') -> float:
             doc = nlp(text)
             for token in doc:
                 dd += abs(token.head.i - token.i)
+    else:
+        raise ValueError("Dependency parsing is currently not supported for the language " + lang + "!")

     return dd/(len(words) - len(sentences))
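The arithmetic in this hunk can be checked by hand without a parser. The sketch below is an illustration under assumptions, not the library's implementation: the function name and the input format (one list of head indices per sentence, with the root pointing to itself, as spaCy's `token.head` does) are hypothetical, and it assumes `words` and `sentences` are the token and sentence counts.

```python
def mean_dependency_distance_from_heads(sentences: list) -> float:
    """Mean dependency distance from per-sentence head indices (hypothetical helper).

    Each inner list holds, for token i, the index of its head in the same
    sentence. The root is its own head, so it contributes a distance of 0.
    """
    dd = 0
    n_words = 0
    for heads in sentences:
        n_words += len(heads)
        for i, head in enumerate(heads):
            dd += abs(head - i)
    # One root per sentence carries no dependency, hence words minus sentences.
    return dd / (n_words - len(sentences))


parsed = [
    [1, 1, 1],     # "She reads books": She -> reads, reads = root, books -> reads
    [1, 1, 3, 1],  # "I saw the cat": I -> saw, saw = root, the -> cat, cat -> saw
]
print(mean_dependency_distance_from_heads(parsed))  # 1.2
```

With single-word sentences only, the denominator becomes zero, so a guard for that case may be worth adding in real code.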
2 changes: 1 addition & 1 deletion setup.py
@@ -21,7 +21,7 @@ def read_requirements():

 setuptools.setup(
     name="linguaf",
-    version="0.1.0",
+    version="0.1.2",
     author="Aleksandr Perevalov",
     author_email="[email protected]",
     description="Python package for calculating famous measures in computational linguistics",
