data_analysis.md
| item | size |
| --- | --- |
| train_set | 18770 |
| dev_set | 2000 |
| test_set | - |
| train_cls_0 | 0.5 |
| train_cls_1 | 0.5 |
| dev_cls_0 | 0.5 |
| dev_cls_1 | 0.5 |
| word_max_len | 66 |
| word_min_len | 5 |
| word_avg_len | 9.42 |
| word_median_len | 9 |
| word_len <= 17 | 0.9668 |
| char_max_len | 155 |
| char_min_len | 10 |
| char_avg_len | 24.09 |
| char_median_len | 22 |
| char_len <= 48 | 0.972 |
| word_vocab | 你们 部门,, |
| char_vocab | 3656 |
| char_unigram | 3656 |
| char_bigram | 29142 |
| char_trigram | 124820 |
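
The table does not say how these figures were produced. As a rough sketch, statistics of this kind could be computed along the following lines. The function names (`length_stats`, `char_ngram_vocab_size`, `class_ratio`) and the toy data are hypothetical and not taken from the repository. The sketch also assumes that the n-gram rows count distinct character n-grams, which is at least consistent with `char_vocab` and `char_unigram` both being 3656. Word-level rows would additionally need a word segmenter, which is left out here.

```python
from collections import Counter
from statistics import mean, median


def length_stats(lengths, threshold):
    """Summarize sequence lengths: max, min, mean, median, and the
    fraction of sequences whose length is at most `threshold`."""
    covered = sum(n <= threshold for n in lengths) / len(lengths)
    return {
        "max_len": max(lengths),
        "min_len": min(lengths),
        "avg_len": round(mean(lengths), 2),
        "median_len": median(lengths),
        f"len <= {threshold}": round(covered, 4),
    }


def char_ngram_vocab_size(texts, n):
    """Count distinct character n-grams over all texts (assumed definition)."""
    grams = set()
    for text in texts:
        grams.update(text[i:i + n] for i in range(len(text) - n + 1))
    return len(grams)


def class_ratio(labels):
    """Return the fraction of examples belonging to each class label."""
    counts = Counter(labels)
    total = len(labels)
    return {label: count / total for label, count in sorted(counts.items())}


# Toy data for illustration only; not the dataset described in the table.
toy_texts = ["abcab", "abc", "bcd"]
toy_labels = [0, 1, 1]

char_lengths = [len(text) for text in toy_texts]
print(length_stats(char_lengths, threshold=3))
print({n: char_ngram_vocab_size(toy_texts, n) for n in (1, 2, 3)})
print(class_ratio(toy_labels))
```

A threshold such as 17 words or 48 characters, each covering roughly 97% of the examples, is the sort of value often used to pick a maximum sequence length for padding or truncation, though the source does not state that this is its purpose here.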