pattern.es

The pattern.es module contains a fast part-of-speech tagger for Spanish (identifies nouns, adjectives, verbs, etc. in a sentence) and tools for Spanish verb conjugation and noun singularization & pluralization.

Documentation

The functions in this module take the same parameters and return the same values as their counterparts in pattern.en. Refer to the documentation there for more details.

Noun singularization & pluralization

For Spanish nouns there is singularize() and pluralize(). The implementation is slightly less robust than the English version (accuracy 94% for singularization and 78% for pluralization).

>>> from pattern.es import singularize, pluralize
>>>  
>>> print(singularize('gatos'))
>>> print(pluralize('gato'))

gato
gatos
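
To give a feel for what rule-based pluralization involves, here is a minimal sketch using three textbook suffix rules. It is an illustration only and not the code used by pattern.es: the function name naive_pluralize() is hypothetical, and the sketch ignores accent shifts (canción → canciones) and invariant nouns (lunes), which is the kind of case that keeps accuracy below 100%.

```python
# Hypothetical rule-based sketch; NOT the implementation used by pattern.es.
# Final vowels that simply take -s (stressed í and ú traditionally take -es).
VOWELS = "aeiouáéó"

def naive_pluralize(noun):
    """Pluralize a non-empty Spanish noun with three textbook suffix rules."""
    if noun.endswith("z"):
        # Final -z becomes -ces: luz -> luces
        return noun[:-1] + "ces"
    if noun[-1] in VOWELS:
        # Final vowel takes -s: gato -> gatos
        return noun + "s"
    # Any other ending takes -es: ciudad -> ciudades
    return noun + "es"

for word in ("gato", "ciudad", "luz"):
    print(word, "->", naive_pluralize(word))
```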

Verb conjugation

For Spanish verbs there are conjugate(), lemma(), lexeme() and tenses(). The lexicon for verb conjugation contains about 600 common Spanish verbs, compiled by Fred Jehle. For unknown verbs, it falls back to a rule-based approach with an accuracy of about 84%.

Spanish verbs have more tenses than English verbs. In particular, the plural differs for each person, and there are additional forms for the FUTURE and CONDITIONAL tenses, the IMPERATIVE and SUBJUNCTIVE moods and the PERFECTIVE aspect:

>>> from pattern.es import conjugate
>>> from pattern.es import INFINITIVE, PRESENT, PAST, SG, SUBJUNCTIVE, PERFECTIVE
>>>  
>>> print(conjugate('soy', INFINITIVE))
>>> print(conjugate('soy', PRESENT, 1, SG, mood=SUBJUNCTIVE))
>>> print(conjugate('soy', PAST, 3, SG))
>>> print(conjugate('soy', PAST, 3, SG, aspect=PERFECTIVE))

ser
sea
era 
fue

For PAST tense + PERFECTIVE aspect we can also use PRETERITE. For PAST tense + IMPERFECTIVE aspect we can also use IMPERFECT:

>>> from pattern.es import conjugate
>>> from pattern.es import IMPERFECT, PRETERITE
>>>  
>>> print(conjugate('soy', IMPERFECT, 3, SG))
>>> print(conjugate('soy', PRETERITE, 3, SG))

era
fue

The conjugate() function takes the following optional parameters:

Tense	Person	Number	Mood	Aspect	Alias	Example
INFINITIVE	None	None	None	None	"inf"	ser
PRESENT	1	SG	INDICATIVE	IMPERFECTIVE	"1sg"	yo __soy__
PRESENT	2	SG	INDICATIVE	IMPERFECTIVE	"2sg"	tú __eres__
PRESENT	3	SG	INDICATIVE	IMPERFECTIVE	"3sg"	él __es__
PRESENT	1	PL	INDICATIVE	IMPERFECTIVE	"1pl"	nosotros __somos__
PRESENT	2	PL	INDICATIVE	IMPERFECTIVE	"2pl"	vosotros __sois__
PRESENT	3	PL	INDICATIVE	IMPERFECTIVE	"3pl"	ellos __son__
PRESENT	None	None	INDICATIVE	PROGRESSIVE	"part"	siendo

PRESENT	2	SG	IMPERATIVE	IMPERFECTIVE	"2sg!"	sé
PRESENT	2	PL	IMPERATIVE	IMPERFECTIVE	"2pl!"	sed

PRESENT	1	SG	SUBJUNCTIVE	IMPERFECTIVE	"1sg?"	yo __sea__
PRESENT	2	SG	SUBJUNCTIVE	IMPERFECTIVE	"2sg?"	tú __seas__
PRESENT	3	SG	SUBJUNCTIVE	IMPERFECTIVE	"3sg?"	él __sea__
PRESENT	1	PL	SUBJUNCTIVE	IMPERFECTIVE	"1pl?"	nosotros __seamos__
PRESENT	2	PL	SUBJUNCTIVE	IMPERFECTIVE	"2pl?"	vosotros __seáis__
PRESENT	3	PL	SUBJUNCTIVE	IMPERFECTIVE	"3pl?"	ellos __sean__

PAST	1	SG	INDICATIVE	IMPERFECTIVE	"1sgp"	yo __era__
PAST	2	SG	INDICATIVE	IMPERFECTIVE	"2sgp"	tú __eras__
PAST	3	SG	INDICATIVE	IMPERFECTIVE	"3sgp"	él __era__
PAST	1	PL	INDICATIVE	IMPERFECTIVE	"1ppl"	nosotros __éramos__
PAST	2	PL	INDICATIVE	IMPERFECTIVE	"2ppl"	vosotros __erais__
PAST	3	PL	INDICATIVE	IMPERFECTIVE	"3ppl"	ellos __eran__
PAST	None	None	INDICATIVE	PROGRESSIVE	"ppart"	sido

PAST	1	SG	INDICATIVE	PERFECTIVE	"1sgp+"	yo __fui__
PAST	2	SG	INDICATIVE	PERFECTIVE	"2sgp+"	tú __fuiste__
PAST	3	SG	INDICATIVE	PERFECTIVE	"3sgp+"	él __fue__
PAST	1	PL	INDICATIVE	PERFECTIVE	"1ppl+"	nosotros __fuimos__
PAST	2	PL	INDICATIVE	PERFECTIVE	"2ppl+"	vosotros __fuisteis__
PAST	3	PL	INDICATIVE	PERFECTIVE	"3ppl+"	ellos __fueron__

PAST	1	SG	SUBJUNCTIVE	IMPERFECTIVE	"1sgp?"	yo __fuera__
PAST	2	SG	SUBJUNCTIVE	IMPERFECTIVE	"2sgp?"	tú __fueras__
PAST	3	SG	SUBJUNCTIVE	IMPERFECTIVE	"3sgp?"	él __fuera__
PAST	1	PL	SUBJUNCTIVE	IMPERFECTIVE	"1ppl?"	nosotros __fuéramos__
PAST	2	PL	SUBJUNCTIVE	IMPERFECTIVE	"2ppl?"	vosotros __fuerais__
PAST	3	PL	SUBJUNCTIVE	IMPERFECTIVE	"3ppl?"	ellos __fueran__

FUTURE	1	SG	INDICATIVE	IMPERFECTIVE	"1sgf"	yo __seré__
FUTURE	2	SG	INDICATIVE	IMPERFECTIVE	"2sgf"	tú __serás__
FUTURE	3	SG	INDICATIVE	IMPERFECTIVE	"3sgf"	él __será__
FUTURE	1	PL	INDICATIVE	IMPERFECTIVE	"1plf"	nosotros __seremos__
FUTURE	2	PL	INDICATIVE	IMPERFECTIVE	"2plf"	vosotros __seréis__
FUTURE	3	PL	INDICATIVE	IMPERFECTIVE	"3plf"	ellos __serán__

CONDITIONAL	1	SG	INDICATIVE	IMPERFECTIVE	"1sg->"	yo __sería__
CONDITIONAL	2	SG	INDICATIVE	IMPERFECTIVE	"2sg->"	tú __serías__
CONDITIONAL	3	SG	INDICATIVE	IMPERFECTIVE	"3sg->"	él __sería__
CONDITIONAL	1	PL	INDICATIVE	IMPERFECTIVE	"1pl->"	nosotros __seríamos__
CONDITIONAL	2	PL	INDICATIVE	IMPERFECTIVE	"2pl->"	vosotros __seríais__
CONDITIONAL	3	PL	INDICATIVE	IMPERFECTIVE	"3pl->"	ellos __serían__

Instead of optional parameters, a single short alias, or PARTICIPLE or PAST+PARTICIPLE can also be given. With no parameters, the infinitive form of the verb is returned.
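
One way to picture the relation between an alias and the parameters it replaces is a plain lookup table. The sketch below copies a handful of rows from the table above. The string values assigned to the constants and the helper expand_alias() are assumptions made for illustration; pattern.es exports constants with these names, but their actual values and the way the library resolves aliases internally are not described here.

```python
# Stand-in string values for this sketch only (assumed, not taken from pattern.es).
INFINITIVE, PRESENT, PAST = "infinitive", "present", "past"
FUTURE, CONDITIONAL = "future", "conditional"
SG, PL = "singular", "plural"
INDICATIVE, IMPERATIVE, SUBJUNCTIVE = "indicative", "imperative", "subjunctive"
IMPERFECTIVE, PERFECTIVE = "imperfective", "perfective"

# alias -> (tense, person, number, mood, aspect), copied from rows of the table.
ALIASES = {
    "inf":   (INFINITIVE, None, None, None, None),
    "1sg":   (PRESENT, 1, SG, INDICATIVE, IMPERFECTIVE),
    "2sg!":  (PRESENT, 2, SG, IMPERATIVE, IMPERFECTIVE),
    "1sg?":  (PRESENT, 1, SG, SUBJUNCTIVE, IMPERFECTIVE),
    "3sgp":  (PAST, 3, SG, INDICATIVE, IMPERFECTIVE),
    "1ppl":  (PAST, 1, PL, INDICATIVE, IMPERFECTIVE),
    "3sgp+": (PAST, 3, SG, INDICATIVE, PERFECTIVE),
    "3ppl?": (PAST, 3, PL, SUBJUNCTIVE, IMPERFECTIVE),
    "2plf":  (FUTURE, 2, PL, INDICATIVE, IMPERFECTIVE),
    "1sg->": (CONDITIONAL, 1, SG, INDICATIVE, IMPERFECTIVE),
}

def expand_alias(alias):
    """Return the parameter combination an alias stands for, keyed by table column."""
    tense, person, number, mood, aspect = ALIASES[alias]
    return {"tense": tense, "person": person, "number": number,
            "mood": mood, "aspect": aspect}

print(expand_alias("3sgp+"))
```

Read this way, the alias "3sgp+" names the same combination as PAST, 3, SG with aspect=PERFECTIVE, which is the row with the example form fue.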

Reference: Jehle, F. (2012). Spanish Verb Forms. Retrieved from: http://users.ipfw.edu/jehle/verblist.htm.

Attributive & predicative adjectives

Spanish adjectives inflect with an -o, -a, -os, -as, or -es suffix (e.g., curioso → los gatos curiosos) depending on gender and number. You can get the base form with the predicative() function, or vice versa with attributive(). For predicative, a statistical approach is used with an accuracy of 93%. For attributive, you need to supply gender (MALE, FEMALE, NEUTRAL and/or PLURAL).

>>> from pattern.es import attributive, predicative
>>> from pattern.es import FEMALE, PLURAL 
>>>  
>>> print(predicative('curiosos'))
>>> print(attributive('curioso', gender=FEMALE))
>>> print(attributive('curioso', gender=FEMALE+PLURAL))

curioso
curiosa 
curiosas
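
The way gender and PLURAL combine can be sketched for the simplest case, a regular adjective whose base form ends in -o. This is an illustration under stated assumptions, not the library's code: the one-letter values given to MALE, FEMALE and PLURAL are stand-ins chosen so that FEMALE+PLURAL concatenates, and inflect_regular() is a hypothetical helper.

```python
# Stand-in values for this sketch only (assumed, not taken from pattern.es).
MALE, FEMALE, PLURAL = "m", "f", "p"

def inflect_regular(adjective, gender=MALE):
    """Inflect a regular adjective whose base form ends in -o."""
    if not adjective.endswith("o"):
        raise ValueError("this sketch only handles adjectives ending in -o")
    stem = adjective[:-1]
    # Pick the gender vowel first, then add -s for the plural.
    suffix = "a" if FEMALE in gender else "o"
    if PLURAL in gender:
        suffix += "s"
    return stem + suffix

print(inflect_regular("curioso", gender=FEMALE))
print(inflect_regular("curioso", gender=FEMALE + PLURAL))
```

Adjectives with other endings (for example those that take -es in the plural) fall outside this sketch.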

Parser

For parsing there are parse(), parsetree() and split(). The parse() function annotates words in the given string with their part-of-speech tags (e.g., NN for nouns and VB for verbs). The parsetree() function takes a string and returns a tree of nested objects (Text → Sentence → Chunk → Word). The split() function takes the output of parse() and returns a Text. See the pattern.en documentation for how to manipulate Text objects.

>>> from pattern.es import parse, split
>>>  
>>> s = parse('El gato negro se sienta en la estera.')
>>> for sentence in split(s):
>>>     print(sentence)

Sentence('El/DT/B-NP/O gato/NN/I-NP/O negro/JJ/I-NP/O'
         'se/PRP/B-NP/O sienta/VB/B-VP/O'
         'en/IN/B-PP/B-PNP la/DT/B-NP/I-PNP estera/NN/I-NP/I-PNP ././O/O')
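
Each token in the output above carries four slash-separated fields. If you only have the tagged string and want to inspect it without building a Text, a small sketch like the following splits it into tuples. The Token type, its field names (word, tag, chunk, pnp) and the read_tokens() helper are my own labels for illustration, not part of pattern.es.

```python
from collections import namedtuple

# Hypothetical container; the field names are labels chosen for this sketch.
Token = namedtuple("Token", "word tag chunk pnp")

def read_tokens(tagged):
    """Split a slash-tagged sentence into Token tuples."""
    tokens = []
    for item in tagged.split():
        # Split from the right so a "/" inside the word stays in the first field.
        word, tag, chunk, pnp = item.rsplit("/", 3)
        tokens.append(Token(word, tag, chunk, pnp))
    return tokens

tagged = ("El/DT/B-NP/O gato/NN/I-NP/O negro/JJ/I-NP/O "
          "se/PRP/B-NP/O sienta/VB/B-VP/O "
          "en/IN/B-PP/B-PNP la/DT/B-NP/I-PNP estera/NN/I-NP/I-PNP ././O/O")

# Print every word tagged as a noun.
for token in read_tokens(tagged):
    if token.tag.startswith("NN"):
        print(token.word)
```

For the example sentence this prints the two nouns, gato and estera.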

The parser is trained on the Spanish portion of Wikicorpus using 1.5M words from the tagged sections 10,000–15,000. The accuracy is around 92%. The original Parole tagset is mapped to the Penn Treebank tagset. If you need to work with the original tags, you can call parse() with the optional parameter tagset="parole".

Reference: Reese, S., Boleda, G., Cuadros, M., Padró, L., Rigau, G. (2010). Wikicorpus: A Word-Sense Disambiguated Multilingual Wikipedia Corpus. Proceedings of LREC'10.

Sentiment analysis

There's no sentiment() function for Spanish yet.
