John Snow Labs Spark-NLP 3.3.1: New EntityRuler annotator, better integration with TokenClassification annotators, new state-of-the-art XLM-RoBERTa models in African Languages, and bug fixes! #6317
maziyarpanahi announced in Announcement
Overview
We are pleased to release Spark NLP 🚀 3.3.1! This release comes with a new EntityRuler annotator, better compatibility between TokenClassification annotators and other annotators in a Spark NLP pipeline, new state-of-the-art XLM-RoBERTa models for African languages, and bug fixes!
As always, we would like to thank our community for their feedback, questions, and feature requests.
New Features
EntityRuler
The new EntityRuler annotators receive either a JSON or CSV ontology file that maps entities to patterns. You can implement a purely rule-based entity recognition system with EntityRuler. It can be saved as a Model and reused in other pipelines to annotate your documents against your knowledge base. See the EntityRuler documentation for details.
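As a rough illustration of the ontology file described above, the following sketch writes a small JSON file that maps entity labels to patterns and wires it into a pipeline. The entries and the helper names `write_patterns` and `build_pipeline` are hypothetical. The field names (`id`, `label`, `patterns`) and the `EntityRulerApproach` setters reflect my reading of the EntityRuler documentation and should be checked against it before use.

```python
import json
from pathlib import Path

# Hypothetical ontology: each entry maps an entity label to one or more patterns.
# The field names are an assumption taken from the EntityRuler documentation.
PATTERNS = [
    {"id": "locations-words", "label": "LOCATION", "patterns": ["Winterfell"]},
    {"id": "person-words", "label": "PERSON", "patterns": ["Eddard", "Arya"]},
]


def write_patterns(path):
    """Write the ontology as a JSON file and return its path."""
    path = Path(path)
    path.write_text(json.dumps(PATTERNS, indent=2), encoding="utf-8")
    return path


def build_pipeline(patterns_path):
    """Assemble a rule-based entity pipeline (needs pyspark and spark-nlp installed)."""
    # Imported lazily so the pattern-file helper above works without Spark.
    from pyspark.ml import Pipeline
    from sparknlp.base import DocumentAssembler
    from sparknlp.annotator import Tokenizer, EntityRulerApproach
    from sparknlp.common import ReadAs

    document = DocumentAssembler().setInputCol("text").setOutputCol("document")
    tokenizer = Tokenizer().setInputCols(["document"]).setOutputCol("token")
    ruler = (
        EntityRulerApproach()
        .setInputCols(["document", "token"])
        .setOutputCol("entities")
        # Assumed signature: path, read mode, and a format option.
        .setPatternsResource(str(patterns_path), ReadAs.TEXT, {"format": "JSON"})
    )
    return Pipeline(stages=[document, tokenizer, ruler])
```

Fitting the returned pipeline on a DataFrame with a `text` column should yield a model that can be saved and reloaded like any other Spark ML `PipelineModel`, which matches the reuse described above.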
Bug Fixes
Backward compatibility
Models and Pipelines
New state-of-the-art XLM-RoBERTa models in Luganda, Naija, Yoruba, Hausa, Kinyarwanda, Wolof, Igbo, Amharic, Swahili, and Luo.
New Transformer Models
Build   Lang
3.3.1   yo
3.3.1   wo
3.3.1   pcm
3.3.1   sw
3.3.1   lg
3.3.1   rw
3.3.1   ha
3.3.1   ig
3.3.1   am
3.3.1   yo
3.3.1   wo
3.3.1   sw
3.3.1   pcm
3.3.1   lou
The complete list of all 4000+ models & pipelines in 200+ languages is available on Models Hub.
New Notebooks
Documentation
Installation
Python
# PyPI
pip install spark-nlp==3.3.1
Spark Packages
spark-nlp on Apache Spark 3.0.x and 3.1.x (Scala 2.12 only):
GPU
spark-nlp on Apache Spark 2.4.x (Scala 2.11 only):
GPU
spark-nlp on Apache Spark 2.3.x (Scala 2.11 only):
GPU
Maven
spark-nlp on Apache Spark 3.0.x and 3.1.x:
spark-nlp-gpu:
spark-nlp on Apache Spark 2.4.x:
spark-nlp-gpu:
spark-nlp on Apache Spark 2.3.x:
spark-nlp-gpu:
FAT JARs
CPU on Apache Spark 3.x: https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/jars/spark-nlp-assembly-3.3.1.jar
GPU on Apache Spark 3.x: https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/jars/spark-nlp-gpu-assembly-3.3.1.jar
CPU on Apache Spark 2.4.x: https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/jars/spark-nlp-spark24-assembly-3.3.1.jar
GPU on Apache Spark 2.4.x: https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/jars/spark-nlp-gpu-spark24-assembly-3.3.1.jar
CPU on Apache Spark 2.3.x: https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/jars/spark-nlp-spark23-assembly-3.3.1.jar
GPU on Apache Spark 2.3.x: https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/jars/spark-nlp-gpu-spark23-assembly-3.3.1.jar
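The six FAT JAR URLs above follow one naming pattern, so picking the right one for a given cluster can be sketched as a small helper. The function name `fat_jar_url` and its parameters are hypothetical; only the URL pattern itself comes from the list above.

```python
BASE_URL = "https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/jars"
VERSION = "3.3.1"

# Suffix that appears in the JAR names listed above for each Spark line.
SPARK_SUFFIX = {"3": "", "2.4": "-spark24", "2.3": "-spark23"}


def fat_jar_url(spark_version, gpu=False):
    """Return the FAT JAR URL matching a Spark version string such as '3.1.2'."""
    if spark_version.startswith("3."):
        key = "3"
    else:
        # Keep only major.minor, e.g. '2.4.8' -> '2.4'.
        key = ".".join(spark_version.split(".")[:2])
    if key not in SPARK_SUFFIX:
        raise ValueError(f"No FAT JAR listed for Spark {spark_version}")
    name = "spark-nlp" + ("-gpu" if gpu else "") + SPARK_SUFFIX[key]
    return f"{BASE_URL}/{name}-assembly-{VERSION}.jar"


if __name__ == "__main__":
    print(fat_jar_url("3.1.2"))
    print(fat_jar_url("2.4.8", gpu=True))
```

The downloaded JAR can then be handed to `spark-submit` or `spark-shell` through the `--jars` option.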
This discussion was created from the release John Snow Labs Spark-NLP 3.3.1: New EntityRuler annotator, better integration with TokenClassification annotators, new state-of-the-art XLM-RoBERTa models in African Languages, and bug fixes!.