https://github.com/nju-websoft/iESBM
Last update: 2020-05-20
License: ODC Attribution License (ODC-By)
iESBM: an interpretative Entity Summarization Benchmark on Multiple Datasets.
- iESBM benchmark: the data of iESBM, including datasets, features and FER results; see iESBM.zip;
- Evaluator (Python): code for generating evaluation results (FSR results); see code/;
- Runs: output files generated by selected entity summarizers, together with their FSR results; see runs.zip.
- ESBM-D: 125 DBpedia entities from ESBM v1.2
- ESBM-L: 50 LinkedMDB entities from ESBM v1.2
- FED: 50 DBpedia entities from the FACES system

See in_ds_raw/.
Suppose you want to evaluate your algorithm named 'youralgo', and its summaries generated for entities from the three datasets are in the directory 'data/algosumm/youralgo/'. Run the following command:

```
python code/iesbm_eval.py -mode FSR -algo_name youralgo
```

Evaluation results will be written to the directory 'data/out/out_youralgo/'. See the next section for details.
The evaluator can be used to evaluate any general-purpose entity summarizer with the following steps. Requirements:
- Python 3.x (tested on Python 3.6)
- Numpy
- Scipy
Install the evaluator by first downloading the project and then installing the required packages with the following commands:

```
git clone git@github.com:nju-websoft/iESBM.git iESBM
cd iESBM
pip install -e .
```
To evaluate your algorithm, please generate summaries for entities from the three datasets and organize the directory of summaries as follows (see youralgo as example):
```
├── ${algo_name}
│   ├── ${ds_name}
│   │   ├── ${eid}
│   │   │   ├── ${eid}_top5.nt
│   │   │   ├── ${eid}_top10.nt
```
where
- ${algo_name} is the name of your entity summarization algorithm, e.g. 'relin', 'diversum', 'youralgo';
- ${ds_name} is the alias of the dataset: 'dbpedia' for ESBM-D, 'lmdb' for ESBM-L, 'dsfaces' for FED;
- ${eid} is an integer that uniquely identifies each entity; see the elist.txt file in each dataset.
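The layout above can be produced programmatically. Below is a minimal sketch that writes one summary file into the expected directory structure; the example algorithm name and triple content are placeholders for illustration.

```python
import os

def write_summary(algo_name, ds_name, eid, topk, triples):
    """Write one summary (a list of N-Triples lines) to
    data/algosumm/${algo_name}/${ds_name}/${eid}/${eid}_${topk}.nt"""
    out_dir = os.path.join("data", "algosumm", algo_name, ds_name, str(eid))
    os.makedirs(out_dir, exist_ok=True)
    path = os.path.join(out_dir, f"{eid}_{topk}.nt")
    with open(path, "w", encoding="utf-8") as f:
        f.write("\n".join(triples) + "\n")
    return path

# Example: a made-up top-5 summary for entity 1 of ESBM-D (alias 'dbpedia').
path = write_summary("youralgo", "dbpedia", 1, "top5", [
    "<http://example.org/s> <http://example.org/p> <http://example.org/o> .",
])
```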
Please put the folder ${algo_name}/ under the directory 'data/algosumm/', and run iesbm_eval.py with the following command:

```
python code/iesbm_eval.py -algo_name ${algo_name} [-feature_name ${feature_name} -ds_name ${ds_name} -topk ${topk} -mode ${mode}]
```
where the parameter -algo_name is required when you want to get FSR results of an algorithm, and the optional parameters are:

- -feature_name: accepts 'LFoP', 'GFoP', 'GFoV', 'IoPV', 'DoP' and 'DoV';
- -ds_name: accepts 'dbpedia', 'lmdb' and 'dsfaces' (the dataset aliases above);
- -topk: accepts two values: 'top5' for k=5 summaries and 'top10' for k=10 summaries;
- -mode: accepts three values: 'FER' to output only FER results, 'FSR' to output only FSR results, and 'all' to output both.
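The accepted values can be summarized as a small lookup table. The helper below is illustrative only and not part of iesbm_eval.py:

```python
# Accepted values of the optional parameters (illustrative helper,
# not part of the iESBM evaluator itself).
VALID_PARAMS = {
    "feature_name": {"LFoP", "GFoP", "GFoV", "IoPV", "DoP", "DoV"},
    "ds_name": {"dbpedia", "lmdb", "dsfaces"},
    "topk": {"top5", "top10"},
    "mode": {"FER", "FSR", "all"},
}

def is_valid(param, value):
    """Check whether a value is accepted for a given parameter."""
    return value in VALID_PARAMS.get(param, set())
```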
For each setting (dataset, topk, feature), the evaluator will:
(0) Generate parsed files:
During the preprocessing of summary files, triples in summaries are converted to triple ids, and these ids are written to files in the directory out_${algo_name}/algo_parsed/.
(1) Generate an output file:
The evaluator will output the evaluation results for summaries to the file out_${algo_name}/algo_metrics/FSR_${feature_name}_${ds_name}_${topk}.txt.
Each line in the file includes the following items (separated by tabs; see FSR_GFoP_dbpedia_top5.txt as an example):
${eid}, ${FSR_of_e}
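A file of this shape is easy to post-process. The sketch below parses such lines and computes the mean FSR; the input values are invented for illustration.

```python
import statistics

def parse_fsr(lines):
    """Parse lines of the form '${eid}<TAB>${FSR_of_e}' into a dict."""
    scores = {}
    for line in lines:
        line = line.strip()
        if line:
            eid, fsr = line.split("\t")
            scores[int(eid)] = float(fsr)
    return scores

# Toy lines standing in for FSR_GFoP_dbpedia_top5.txt (values invented).
scores = parse_fsr(["1\t0.913", "2\t0.869", "3\t0.902"])
mean_fsr = statistics.mean(scores.values())
```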
(2) Print statistical results:
Statistical information of the evaluation results will be printed to the console, including the following items:
${feature_name}, ${ds_name}, ${topk}, ${mean_FSR}, ${std_FSR}, ${significance_with_FER}
where ${significance_with_FER} is composed of two values: the t-statistic and the p-value of the t-test.
Meanwhile, these results will be written to the file out_${algo_name}/result_statics_FSR.txt; see out_youralgo/result_statics_FSR.txt as an example.
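Such a t-test can be reproduced with Scipy (already a requirement of the evaluator). The sketch below compares an algorithm's per-entity FSR scores against the corresponding FER scores; the scores are invented, and the choice of a two-sample (rather than paired) test is an assumption here, not confirmed from the evaluator's code.

```python
from scipy import stats

# Invented per-entity scores: an algorithm's FSR vs. the FER of the
# ground-truth summaries on the same entities.
fsr_scores = [0.28, 0.35, 0.31, 0.25, 0.33]
fer_scores = [0.56, 0.61, 0.58, 0.52, 0.60]

# ${significance_with_FER}: t-statistic and p-value of the t-test
# (a two-sample test is assumed for this sketch).
t_stat, p_value = stats.ttest_ind(fsr_scores, fer_scores)
```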
You can add customized features to the evaluator according to the following process:
First, compute a feature score for each triple in dataset ${ds_name}, and write this information to a file named '${feature_name}_${ds_name}.txt' (where ${feature_name} is the name of your new feature, e.g. 'GFoP'). In this file, each line contains the following items (separated by tabs; see GFoP_dbpedia.txt as an example):

${tid}, ${tscore}

Put this file into the directory in/in_ds_feature/.
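As a concrete example, the sketch below computes a toy triple-level feature, the relative frequency of each triple's property, and formats it as the required tab-separated lines. The feature and the triples are made up for illustration; this is not one of the built-in iESBM features.

```python
from collections import Counter

# Toy triples for illustration: tid -> (subject, property, object).
triples = {
    1: ("e1", "p1", "v1"),
    2: ("e1", "p2", "v2"),
    3: ("e2", "p1", "v3"),
}

# Hypothetical feature: relative frequency of each triple's property.
prop_freq = Counter(p for (_, p, _) in triples.values())
n = len(triples)

# One '${tid}<TAB>${tscore}' line per triple; write these lines to
# in/in_ds_feature/${feature_name}_${ds_name}.txt.
lines = [f"{tid}\t{prop_freq[p] / n:.6f}" for tid, (_, p, _) in triples.items()]
```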
Then open f_imp.py and add a new elif statement to the function get_feature_by_name():

```
elif fname == '${feature_name}':
    return Feature(ds_name, fname, FType.F_Triple, fpath)
```
Run iesbm_gen.py to generate FER files for this new feature:

```
python code/iesbm_gen.py ${feature_name}
```
Each line of the FER file contains the following items (separated by tabs; see FER_GFoP_dbpedia_top5.txt as an example):
${eid}, ${FER_of_e}, ${average_score_of_golds}, ${score_of_desc}
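From the column names, one plausible reading is that the FER of an entity is the ratio of the average feature score of its ground-truth summaries to the feature score of its full description; this relation is an assumption inferred from the columns, not confirmed from the evaluator's code. Under that assumption, a line could be assembled as:

```python
# Invented example values for one entity.
eid = 1
avg_gold_score = 0.45   # ${average_score_of_golds}
desc_score = 0.8        # ${score_of_desc}

# Assumed relation: FER_of_e = average_score_of_golds / score_of_desc.
fer = avg_gold_score / desc_score
line = f"{eid}\t{fer:.4f}\t{avg_gold_score}\t{desc_score}"
```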
Finally, this new feature can be used by setting the parameter '-feature_name ${feature_name}' when running iesbm_eval.py.
Alternatively, for a feature computed at the entity level rather than per triple, first implement a new subclass of f_base.Feature and name this class 'F_${feature_name}' (see the classes F_DoP and F_DoV in f_imp.py as examples). In this class, define the method that computes the feature score for an entity in self._get_score_by_sscore().
Then open f_imp.py and add a new elif statement to the function get_feature_by_name(), returning an object of the newly defined class:

```
elif fname == '${feature_name}':
    return F_${feature_name}(ds_name, fpath=fpath)
```
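A minimal sketch of this subclassing pattern is shown below. The real base class lives in f_base.py and its exact constructor and hook signatures are not reproduced here, so both the base class and the aggregation logic are stand-ins.

```python
class Feature:
    """Stand-in for f_base.Feature; the real constructor may differ."""
    def __init__(self, ds_name, fname, fpath=None):
        self.ds_name, self.fname, self.fpath = ds_name, fname, fpath

    def _get_score_by_sscore(self, sscores):
        raise NotImplementedError

class F_MyFeature(Feature):
    """Plays the role of F_${feature_name}."""
    def __init__(self, ds_name, fpath=None):
        super().__init__(ds_name, "MyFeature", fpath)

    def _get_score_by_sscore(self, sscores):
        # Hypothetical entity-level aggregation: mean of per-triple scores.
        return sum(sscores) / len(sscores) if sscores else 0.0

feat = F_MyFeature("dbpedia")
```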
Each line of the FER file contains the following items (separated by tabs; see FER_DoP_dbpedia_top5.txt as an example):
${eid}, ${FER_of_e}, ${average_score_of_golds}, ${score_of_desc}
Run iesbm_gen.py to generate FER files for this new feature:

```
python code/iesbm_gen.py ${feature_name}
```
Finally, this new feature can be used by setting the parameter '-feature_name ${feature_name}' when running iesbm_eval.py.
The effectiveness of existing features (FER) and the evaluation results of several selected entity summarizers (FSR) are presented in the following tables.
You are encouraged to submit the results of your entity summarizer by contacting us; we will add them to the following tables. Your submission should contain:

- Summary files: the summaries generated by your entity summarizer;
- Evaluation results: the evaluation results output by our evaluator;
- Notes: a brief description of your entity summarizer (e.g., its name, citation information, parameter settings).
FER results computed from ground-truth summaries are presented in Table 1 and Table 2. Detailed FER results are available; see in_ds_fer/.
Table 1. FER on each dataset for k=5.
| | LFoP | GFoP | GFoV | IoPV | DoP | DoV |
|---|---|---|---|---|---|---|
ESBM-D | 0.561(±0.165)↓ | 0.913(±0.052)↓ | 0.759(±0.125)↓ | 1.275(±0.175)↑ | 2.478(±0.747)↑ | 1.016(±0.054)↑ |
ESBM-L | 0.581(±0.125)↓ | 0.998(±0.029) | 1.349(±0.188)↑ | 0.864(±0.057)↓ | 3.093(±2.394)↑ | 1.061(±0.068)↑ |
FED | 0.821(±0.205)↓ | 1.012(±0.066) | 1.148(±0.153)↑ | 0.958(±0.044)↓ | 1.699(±0.480)↑ | 1.016(±0.046) |
Table 2. FER on each dataset for k=10.
| | LFoP | GFoP | GFoV | IoPV | DoP | DoV |
|---|---|---|---|---|---|---|
ESBM-D | 0.569(±0.170)↓ | 0.902(±0.048)↓ | 0.753(±0.113)↓ | 1.267(±0.158)↑ | 2.080(±0.555)↑ | 1.002(±0.038) |
ESBM-L | 0.757(±0.131)↓ | 0.983(±0.025)↓ | 1.203(±0.152)↑ | 0.917(±0.054)↓ | 2.092(±1.298)↑ | 1.048(±0.068)↑ |
FED | 0.862(±0.154)↓ | 0.993(±0.041) | 1.065(±0.098)↑ | 0.981(±0.028)↓ | 1.601(±0.423)↑ | 1.018(±0.029)↑ |
FSR results for several selected entity summarizers are presented in the following tables. Their output files are also available (runs.zip).
Table 3. FSR of selected entity summarizers on ESBM-D for k=5.
| | LFoP | GFoP | GFoV | IoPV | DoP | DoV |
|---|---|---|---|---|---|---|
RELIN | 0.280(±0.228) | 0.869(±0.075) | 0.277(±0.098) | 1.801(±0.297) | 2.351(±0.791)• | 0.749(±0.253) |
DIVERSUM | 0.649(±0.192) | 0.910(±0.048)• | 0.854(±0.167) | 1.175(±0.198) | 2.753(±0.925) | 1.037(±0.086) |
FACES-E | 0.623(±0.281)• | 0.914(±0.079)• | 0.911(±0.208) | 1.142(±0.241) | 2.494(±0.881)• | 0.972(±0.118) |
CD | 0.334(±0.181) | 0.863(±0.062) | 0.414(±0.136) | 1.620(±0.250) | 2.742(±0.886) | 1.061(±0.051) |
BAFREC | 0.585(±0.169)• | 0.954(±0.056) | 0.908(±0.177) | 1.117(±0.198) | 2.586(±0.755) | 0.980(±0.107) |
KAFCA | 0.361(±0.248) | 0.850(±0.083) | 0.646(±0.244) | 1.377(±0.294) | 2.505(±0.829)• | 0.993(±0.116)• |
MPSUM | 0.434(±0.201) | 0.876(±0.072) | 0.730(±0.250)• | 1.304(±0.291)• | 2.742(±0.886) | 0.891(±0.187) |
ESA | 0.266(±0.212) | 0.848(±0.084) | 0.529(±0.179) | 1.535(±0.282) | 2.303(±0.827) | 0.930(±0.160) |
DeepLENS | 0.302(±0.219) | 0.854(±0.076) | 0.656(±0.190) | 1.407(±0.267) | 2.415(±0.801)• | 0.957(±0.115) |
Table 4. FSR of selected entity summarizers on ESBM-L for k=5.
| | LFoP | GFoP | GFoV | IoPV | DoP | DoV |
|---|---|---|---|---|---|---|
RELIN | 0.688(±0.432)• | 0.991(±0.047)• | 0.600(±0.137) | 1.154(±0.050) | 2.775(±2.012)• | 0.967(±0.189) |
DIVERSUM | 0.870(±0.381) | 0.993(±0.038)• | 1.006(±0.247) | 0.991(±0.077) | 3.869(±3.357) | 1.091(±0.075) |
FACES-E | 0.536(±0.163)• | 0.962(±0.079) | 1.341(±0.296)• | 0.872(±0.092)• | 3.848(±3.352) | 1.059(±0.103)• |
CD | 0.470(±0.199) | 0.996(±0.073)• | 1.009(±0.212) | 0.959(±0.079) | 3.869(±3.357) | 1.102(±0.071) |
BAFREC | 0.562(±0.201)• | 1.020(±0.053) | 1.598(±0.491) | 0.781(±0.144) | 3.485(±3.228) | 1.007(±0.088) |
KAFCA | 0.234(±0.200) | 0.954(±0.056) | 1.309(±0.395)• | 0.884(±0.108)• | 3.869(±3.357) | 1.104(±0.102) |
MPSUM | 0.568(±0.201)• | 0.979(±0.046) | 1.249(±0.428)• | 0.908(±0.131)• | 3.869(±3.357) | 1.083(±0.104)• |
ESA | 0.514(±0.235)• | 1.029(±0.037) | 1.241(±0.352)• | 0.892(±0.116)• | 3.125(±2.613)• | 1.013(±0.154)• |
DeepLENS | 0.361(±0.163) | 1.004(±0.037)• | 1.412(±0.409)• | 0.840(±0.129)• | 3.496(±2.343) | 1.000(±0.081) |
Table 5. FSR of selected entity summarizers on FED for k=5.
| | LFoP | GFoP | GFoV | IoPV | DoP | DoV |
|---|---|---|---|---|---|---|
RELIN | 0.911(±0.481)• | 1.028(±0.157)• | 0.652(±0.329) | 1.123(±0.097) | 1.473(±0.579) | 0.761(±0.209) |
DIVERSUM | 1.339(±0.220) | 0.962(±0.056) | 1.043(±0.191) | 0.989(±0.069)• | 1.783(±0.517) | 0.981(±0.097)• |
FACES | 0.860(±0.314)• | 0.936(±0.081) | 1.489(±0.245) | 0.886(±0.084) | 1.714(±0.514)• | 1.019(±0.124)• |
FACES-E | 0.860(±0.314)• | 0.936(±0.081) | 1.489(±0.245) | 0.886(±0.084) | 1.714(±0.514)• | 1.019(±0.124)• |
CD | 0.799(±0.206)• | 1.042(±0.075) | 0.699(±0.226) | 1.118(±0.076) | 1.783(±0.517) | 1.060(±0.066) |
LinkSUM | 0.976(±0.353) | 0.987(±0.080)• | 1.656(±0.250) | 0.797(±0.089) | 1.460(±0.474) | 1.062(±0.074) |
BAFREC | 0.928(±0.273) | 0.949(±0.078) | 1.658(±0.304) | 0.811(±0.089) | 1.768(±0.516) | 0.975(±0.119)• |
KAFCA | 0.636(±0.248) | 0.999(±0.116)• | 0.864(±0.363) | 1.024(±0.092) | 1.699(±0.518)• | 0.909(±0.134) |
MPSUM | 0.878(±0.245)• | 0.918(±0.067) | 1.344(±0.289) | 0.949(±0.095)• | 1.783(±0.517) | 0.821(±0.225) |
ESA | 0.842(±0.323)• | 1.090(±0.113) | 0.813(±0.232) | 1.039(±0.075) | 1.378(±0.408) | 0.875(±0.136) |
DeepLENS | 0.823(±0.476)• | 1.056(±0.095) | 1.166(±0.375)• | 0.926(±0.124)• | 1.481(±0.486) | 0.863(±0.131) |
Table 6. FSR of selected entity summarizers on ESBM-D for k=10.
| | LFoP | GFoP | GFoV | IoPV | DoP | DoV |
|---|---|---|---|---|---|---|
RELIN | 0.392(±0.217) | 0.879(±0.063) | 0.374(±0.112) | 1.655(±0.228) | 2.015(±0.627)• | 0.872(±0.155) |
DIVERSUM | 0.413(±0.135) | 0.861(±0.048) | 0.745(±0.164)• | 1.299(±0.230)• | 2.753(±0.925) | 1.013(±0.056)• |
FACES-E | 0.516(±0.182) | 0.897(±0.053)• | 0.770(±0.146)• | 1.270(±0.210)• | 2.453(±0.842) | 0.985(±0.062) |
CD | 0.393(±0.155) | 0.871(±0.045) | 0.555(±0.145) | 1.467(±0.219) | 2.538(±0.741) | 1.026(±0.045) |
BAFREC | 0.629(±0.191) | 0.945(±0.049) | 0.850(±0.148) | 1.171(±0.181) | 1.926(±0.543) | 0.968(±0.071) |
KAFCA | 0.443(±0.223) | 0.883(±0.069) | 0.661(±0.195) | 1.359(±0.257) | 2.199(±0.721) | 0.972(±0.065) |
MPSUM | 0.405(±0.162) | 0.880(±0.060) | 0.686(±0.158) | 1.349(±0.210) | 2.612(±0.765) | 0.971(±0.066) |
ESA | 0.309(±0.222) | 0.839(±0.061) | 0.606(±0.149) | 1.442(±0.222) | 2.088(±0.610)• | 0.965(±0.082) |
DeepLENS | 0.334(±0.209) | 0.827(±0.066) | 0.674(±0.150) | 1.367(±0.207) | 2.070(±0.593)• | 0.994(±0.058)• |
Table 7. FSR of selected entity summarizers on ESBM-L for k=10.
| | LFoP | GFoP | GFoV | IoPV | DoP | DoV |
|---|---|---|---|---|---|---|
RELIN | 0.865(±0.260) | 1.000(±0.040)• | 0.634(±0.113) | 1.143(±0.045) | 1.962(±1.646)• | 0.949(±0.111) |
DIVERSUM | 0.570(±0.266) | 0.965(±0.040) | 1.221(±0.296)• | 0.922(±0.082)• | 3.869(±3.357) | 1.083(±0.071) |
FACES-E | 0.470(±0.184) | 0.953(±0.051) | 1.398(±0.212) | 0.867(±0.055) | 3.856(±3.353) | 1.057(±0.083)• |
CD | 0.560(±0.171) | 0.992(±0.043)• | 1.302(±0.317)• | 0.885(±0.100)• | 2.904(±2.170) | 1.077(±0.040) |
BAFREC | 0.714(±0.146) | 1.005(±0.057) | 1.360(±0.253) | 0.861(±0.074) | 2.235(±1.506)• | 0.937(±0.075) |
KAFCA | 0.407(±0.142) | 0.969(±0.043) | 1.336(±0.337)• | 0.874(±0.099)• | 3.119(±2.576) | 1.069(±0.065) |
MPSUM | 0.564(±0.192) | 0.977(±0.031)• | 1.261(±0.280)• | 0.909(±0.082)• | 3.428(±2.898) | 1.079(±0.065) |
ESA | 0.662(±0.232) | 0.993(±0.037)• | 1.187(±0.207)• | 0.919(±0.071)• | 2.257(±1.559)• | 1.020(±0.082)• |
DeepLENS | 0.643(±0.188) | 0.974(±0.036)• | 1.210(±0.267)• | 0.910(±0.087)• | 2.284(±0.968)• | 1.061(±0.075)• |
Table 8. FSR of selected entity summarizers on FED for k=10.
| | LFoP | GFoP | GFoV | IoPV | DoP | DoV |
|---|---|---|---|---|---|---|
RELIN | 0.883(±0.345)• | 1.042(±0.089) | 0.545(±0.152) | 1.151(±0.054) | 1.495(±0.505)• | 0.889(±0.082) |
DIVERSUM | 1.021(±0.207) | 0.943(±0.050) | 1.115(±0.157)• | 0.978(±0.050)• | 1.783(±0.517) | 1.011(±0.054)• |
FACES | 0.905(±0.235)• | 0.928(±0.060) | 1.315(±0.219) | 0.933(±0.063) | 1.584(±0.464)• | 1.012(±0.055)• |
FACES-E | 0.905(±0.235)• | 0.928(±0.060) | 1.315(±0.219) | 0.933(±0.063) | 1.584(±0.464)• | 1.012(±0.055)• |
CD | 0.735(±0.175) | 1.022(±0.060) | 0.840(±0.199) | 1.050(±0.063) | 1.783(±0.517) | 1.055(±0.047) |
LinkSUM | 1.028(±0.224) | 0.964(±0.061) | 1.366(±0.186) | 0.893(±0.054) | 1.301(±0.337) | 1.052(±0.049) |
BAFREC | 0.870(±0.181)• | 0.926(±0.046) | 1.433(±0.234) | 0.890(±0.064) | 1.634(±0.463)• | 0.998(±0.057)• |
KAFCA | 0.680(±0.223) | 0.984(±0.081)• | 0.972(±0.246)• | 0.996(±0.070)• | 1.624(±0.510)• | 0.975(±0.078) |
MPSUM | 0.804(±0.174) | 0.909(±0.051) | 1.256(±0.168) | 0.954(±0.052) | 1.783(±0.517) | 0.958(±0.090) |
ESA | 0.832(±0.290)• | 1.047(±0.080) | 0.896(±0.183) | 1.020(±0.060) | 1.292(±0.365) | 0.926(±0.080) |
DeepLENS | 0.863(±0.377)• | 0.999(±0.092)• | 1.116(±0.252)• | 0.955(±0.077)• | 1.334(±0.491) | 0.908(±0.108) |
[1] Gong Cheng, Thanh Tran, Yuzhong Qu: RELIN: Relatedness and Informativeness-Based Centrality for Entity Summarization. International Semantic Web Conference (1) 2011: 114-129.
[2] Marcin Sydow, Mariusz Pikula, Ralf Schenkel: The notion of diversity in graphical entity summarisation on semantic knowledge graphs. J. Intell. Inf. Syst. 41(2): 109-149 (2013).
[3] Kalpa Gunaratna, Krishnaprasad Thirunarayan, Amit P. Sheth: FACES: Diversity-Aware Entity Summarization Using Incremental Hierarchical Conceptual Clustering. AAAI 2015: 116-122.
[4] Kalpa Gunaratna, Krishnaprasad Thirunarayan, Amit P. Sheth, Gong Cheng: Gleaning Types for Literals in RDF Triples with Application to Entity Summarization. ESWC 2016: 85-100.
[5] Danyun Xu, Liang Zheng, Yuzhong Qu: CD at ENSEC 2016: Generating Characteristic and Diverse Entity Summaries. SumPre@ESWC 2016.
[6] Andreas Thalhammer, Nelia Lasierra, Achim Rettinger: LinkSUM: Using Link Analysis to Summarize Entity Data. ICWE 2016: 244-261.
[7] Hermann Kroll, Denis Nagel and Wolf-Tilo Balke: BAFREC: Balancing Frequency and Rarity for Entity Characterization in Linked Open Data. EYRE 2018.
[8] Eun-Kyung Kim and Key-Sun Choi: Entity Summarization Based on Formal Concept Analysis. EYRE 2018.
[9] Dongjun Wei, Shiyuan Gao, Yaxin Liu, Zhibing Liu and Longtao Huang: MPSUM: Entity Summarization with Predicate-based Matching. EYRE 2018.
[10] Dongjun Wei, Yaxin Liu, Fuqing Zhu, Liangjun Zhang, Wei Zhou, Jizhong Han and Songlin Hu: ESA: Entity Summarization with Attention. EYRE 2019.
[11] Qingxia Liu, Gong Cheng and Yuzhong Qu: DeepLENS: Deep Learning for Entity Summarization. arXiv preprint 2020. arXiv:2003.03736.
If you have any questions or suggestions, please feel free to contact Qingxia Liu and Gong Cheng.