NACER (Neural Architecture for Conversational Entity Retrieval) is an advanced feature-based neural architecture designed for the task of KG candidate entity ranking in Conversational Entity Retrieval from DBpedia. This repository contains the source code for NACER and the baseline models utilized in our research, including BM25F, KV-MemNN, GENRE, and LLaMa. Our findings are detailed in the paper: "Benchmark and Neural Architecture for Conversational Entity Retrieval from a Knowledge Graph," presented at the ACM Web Conference 2024 (WWW ’24), authored by Mona Zamiri, Yao Qiang, Fedor Nikolaev, Dongxiao Zhu, Alexander Kotov.
This section provides information on the dataset preparation process, including entity linking and candidate entity collection methods.
- Entity Linking using TAGME: For reference, extract_entities.py outlines an obsolete method for entity linking.
- Mentioned Entities: Obtain a set of mentioned entities using Lukovnikov's method. Refer to lukovnikov/README.md for details.
- Candidate Entities: Collect a set of candidate entities Y with collect_neighbors.py.
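
As a rough illustration of the candidate collection step, the sketch below gathers the one-hop KG neighbors of a set of mentioned entities. It is not taken from collect_neighbors.py: the function names, the (subject, predicate, object) triple layout, the choice to treat edges as undirected, and the choice to keep the mentioned entities in the candidate set are all assumptions made for this example.

```python
from collections import defaultdict


def build_adjacency(triples):
    """Index (subject, predicate, object) triples as an undirected adjacency map."""
    adjacency = defaultdict(set)
    for subj, _pred, obj in triples:
        adjacency[subj].add(obj)
        adjacency[obj].add(subj)
    return adjacency


def collect_candidates(mentioned, adjacency):
    """Return the mentioned entities together with their one-hop neighbors."""
    candidates = set(mentioned)  # assumption: mentioned entities stay in the set
    for entity in mentioned:
        candidates |= adjacency.get(entity, set())
    return candidates


# Toy triples, made up for illustration only.
toy_triples = [
    ("dbr:Detroit", "dbo:country", "dbr:United_States"),
    ("dbr:Wayne_State_University", "dbo:city", "dbr:Detroit"),
    ("dbr:Chicago", "dbo:country", "dbr:United_States"),
]
toy_candidates = collect_candidates({"dbr:Detroit"}, build_adjacency(toy_triples))
```

Here `toy_candidates` contains `dbr:Detroit` and its two direct neighbors, but not `dbr:Chicago`, which is two hops away.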
The NACER model is implemented across several files, detailing the process from collecting KG triples to the model's training and testing procedures.
- collect_triples.py - For collecting KG triples Ti.
- calculate_feature_inputs.py - For calculating inputs necessary for semantic similarity features.
- calculate_overlap_features.py - For calculating values for lexical similarity features.
- model*_mult*.py - Contains the model, along with training and testing procedures.
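
To give a sense of what a lexical similarity feature can look like, here is a minimal sketch of a token-overlap ratio between a conversational utterance and a text field of a candidate entity. The exact features computed by calculate_overlap_features.py are not described in this README, so `tokenize`, `overlap_ratio`, and the tokenization rule are hypothetical choices for illustration.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def overlap_ratio(query, field):
    """Fraction of distinct query tokens that also occur in the field text."""
    query_tokens = set(tokenize(query))
    if not query_tokens:
        return 0.0  # avoid division by zero on empty queries
    field_tokens = set(tokenize(field))
    return len(query_tokens & field_tokens) / len(query_tokens)


# One of the three query tokens ("detroit") appears in the field.
example_ratio = overlap_ratio("Who founded Detroit?", "Detroit, Michigan")
```

A ratio like this could be computed once per entity field (for example, label or abstract) and passed to the ranker as one feature per field.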
The baseline models are organized in the "Baselines" folder, which includes separate folders for BM25F, KV-MemNN, GENRE, and LLaMa.
- BM25F: Implementation details are provided in a separate README file within its folder.
- KV-MemNN: Models are implemented in model_kvmem*.py and collect_kvmem_triples.py.
- GENRE: Implementation details are provided in a separate README file within its folder.
- LLaMa: Implementation details are provided in a separate README file within its folder.
Statistical significance of the results is calculated using calculate_stat_significance.py.
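
This README does not state which test calculate_stat_significance.py applies. As one common option for comparing two rankers on per-query scores, the sketch below implements a two-sided paired randomization (sign-flip) test using only the standard library. The function name, its parameters, and the toy scores are assumptions for this example, not the script's actual interface.

```python
import random


def paired_randomization_test(scores_a, scores_b, trials=10000, seed=0):
    """Estimate a two-sided p-value for the difference in summed per-query scores."""
    if len(scores_a) != len(scores_b):
        raise ValueError("both systems must be scored on the same queries")
    rng = random.Random(seed)
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    observed = abs(sum(diffs))
    hits = 0
    for _ in range(trials):
        # Randomly swap the two systems' scores on each query (sign flip).
        permuted = sum(d if rng.random() < 0.5 else -d for d in diffs)
        if abs(permuted) >= observed:
            hits += 1
    # Add-one smoothing keeps the estimate strictly above zero.
    return (hits + 1) / (trials + 1)


# Toy per-query scores, made up for illustration only.
p_same = paired_randomization_test([0.5, 0.7, 0.2], [0.5, 0.7, 0.2])
p_diff = paired_randomization_test([0.9] * 8, [0.5] * 8)
```

Identical score lists give a p-value of 1.0, while a consistent gap across all eight toy queries gives a small p-value.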
This work is supported by the National Institutes of Health under award #1R21NR020388-01A1 and by the National Science Foundation under award #2211897.
If you find this work useful, please cite our paper:
"Benchmark and Neural Architecture for Conversational Entity Retrieval from a Knowledge Graph," presented at the ACM Web Conference 2024 (WWW ’24), authored by Mona Zamiri, Yao Qiang, Fedor Nikolaev, Dongxiao Zhu, Alexander Kotov.