Code for the paper "A Simple and Effective Neural Model for Joint Word Segmentation and POS Tagging".
If Intel MKL is available on your server, set the MKL path in CMakeLists.txt first:
change "set(MKL_ROOT /opt/intel/mkl)" to "set(MKL_ROOT your_mkl_path)".
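The edit to CMakeLists.txt can also be scripted. The following is a minimal sketch, assuming the file contains a single line of the form "set(MKL_ROOT ...)" as quoted above; `set_mkl_root` is a hypothetical helper and is not part of this repository.

```shell
# Hypothetical helper (not shipped with the repository): rewrite the
# MKL_ROOT line of a CMakeLists.txt so it points at a local MKL install.
set_mkl_root() {
    cmake_file="$1"   # path to CMakeLists.txt
    mkl_path="$2"     # local MKL directory (must not contain '|' or '&')
    tmp_file="${cmake_file}.tmp"
    # Replace whatever path currently follows MKL_ROOT inside set(...).
    # Writing to a temp file avoids the non-portable "sed -i".
    sed "s|set(MKL_ROOT [^)]*)|set(MKL_ROOT ${mkl_path})|" "$cmake_file" > "$tmp_file" &&
        mv "$tmp_file" "$cmake_file"
}

# Usage (the path is a placeholder):
# set_mkl_root CMakeLists.txt /path/to/your/mkl
```

Editing the file by hand works just as well; the helper is only convenient when the build is automated.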
mkdir build
cd build
cmake ..    (or, if MKL is available: cmake .. -DMKL=True)
cd ..
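The build steps above can be wrapped in a small script. This is a sketch under stated assumptions: the helper names and the default MKL directory are illustrative, and the explicit compile step (`cmake --build .`) is an assumption, since the steps listed above do not name one.

```shell
# Hypothetical helpers (not part of the repository).

# Print -DMKL=True only when the given MKL directory exists,
# so the same script works on machines with and without MKL.
cmake_mkl_flag() {
    if [ -d "$1" ]; then
        echo "-DMKL=True"
    fi
}

# Configure and compile in ./build, then return to the project root.
build_tagger() {
    mkl_dir="${1:-/opt/intel/mkl}"   # default taken from CMakeLists.txt
    mkdir -p build || return 1
    # The subshell keeps the caller's working directory unchanged
    # even if one of the steps fails.
    (
        cd build &&
            # Unquoted on purpose: an empty flag expands to nothing.
            cmake .. $(cmake_mkl_flag "$mkl_dir") &&
            # Assumed compile step; with the Makefile generator this runs make.
            cmake --build .
    )
}

# Usage, from the project root:
# build_tagger /path/to/your/mkl
```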
./bin/NNJSTagger -l -train data/ctb50/train.corpus -dev data/ctb50/dev.corpus -test data/ctb50/test.corpus -option data/option.debug
The configuration file is ./data/option.debug. Example settings:
seg = true
dropProb = 0.25
adaAlpha = 0.001
charEmbFile = data/char.vec
bicharEmbFile = data/mini.bichar.vec
batchSize = 16
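To check a setting before launching a run, the option file can be queried from the shell. The sketch below assumes the file holds one "key = value" pair per line, as in the excerpt above; `get_option` is a hypothetical helper, and the tagger's own parser may accept a different or richer syntax.

```shell
# Hypothetical helper (not part of the repository): print the value
# stored under a key in a "key = value" option file.
get_option() {
    option_file="$1"
    key="$2"
    awk -v key="$key" '
        {
            pos = index($0, "=")            # split on the first "="
            if (pos == 0) next              # skip lines without "="
            name = substr($0, 1, pos - 1)
            value = substr($0, pos + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", name)    # trim whitespace
            gsub(/^[ \t]+|[ \t]+$/, "", value)
            if (name == key) { print value; exit }
        }
    ' "$option_file"
}

# Usage:
# get_option data/option.debug batchSize
```

A key that is absent produces no output, so callers can test for an empty result.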
Model | CTB5 (SEG POS) | CTB6 (SEG POS) | CTB7 (SEG POS) | PKU (SEG POS) | NCC (SEG POS) |
---|---|---|---|---|---|
Our Model (No External Embeddings) | 97.69 94.16 | 95.37 90.83 | 95.32 90.25 | 95.22 92.62 | 93.97 89.47 |
Our Model (Basic Embeddings) | 97.93 94.44 | 95.78 91.79 | 95.77 91.12 | 95.82 93.42 | 94.52 89.82 |
Our Model (Word-context Embeddings) | 98.50 94.95 | 96.36 92.51 | 96.25 91.87 | 96.35 94.14 | 95.30 90.42 |
Speed test environment: Intel Core i7-6800K CPU, MKL enabled; GCC 5.4.0.
CTB6 | Sentences | Time | Speed |
---|---|---|---|
Train | 23k | 465.41s | 50.3 sents/s |
Devel | 2.1k | 17.74s | 117.1 sents/s |
Test | 2.8k | 23.67s | 118.1 sents/s |
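The Speed column is the number of sentences divided by the elapsed time. A minimal sketch of that calculation follows; `sents_per_sec` is a hypothetical helper. Because the sentence counts in the table are rounded (for example 23k), recomputing from them gives values slightly different from the reported speeds.

```shell
# Hypothetical helper: sentences per second, rounded to one decimal place.
sents_per_sec() {
    # $1 = number of sentences, $2 = elapsed time in seconds
    awk -v n="$1" -v t="$2" 'BEGIN {
        if (t <= 0) exit 1          # refuse to divide by zero
        printf "%.1f\n", n / t
    }'
}

# Usage with the rounded training-set count from the table:
# sents_per_sec 23000 465.41
```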
@Article{zhang2018jointposseg,
author = {Zhang, Meishan and Yu, Nan and Fu, Guohong},
title = {A Simple and Effective Neural Model for Joint Word Segmentation and POS Tagging},
journal = {IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP)},
year = {2018},
volume = {26},
number = {9},
pages = {1528--1538},
publisher = {IEEE Press},
}
- If you have any questions, you can open an issue or email [email protected], [email protected], or bamtercelboo@{gmail.com, 163.com}.
- If you have any good suggestions, you can open a pull request or email me.
Meishan Zhang, Yu Nan, Zonglin Liu