
raopr/FoodKG

 
 


FoodKG: A Tool to Enrich Knowledge Graphs Using Machine Learning Techniques

Run FoodKG with one command
FoodKG is available as a Docker image. To run the tool, install Docker on your machine, then run the following command:
docker run -p 5000:5000 gharibim/foodkg
FoodKG will start on localhost at port 5000: 127.0.0.1:5000
You can find a sample input file in the Sample_Input folder, and a sample context is http://example.com
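Once the container is running, it can help to confirm that something is listening on the published port before opening the browser. The snippet below is a minimal sketch that only checks TCP reachability of 127.0.0.1:5000 (the address given above); the helper name `is_port_open` is hypothetical and not part of FoodKG.

```python
import socket


def is_port_open(host: str = "127.0.0.1", port: int = 5000, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection raises OSError (e.g. connection refused, timeout) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Defaults match the port mapping used in the docker run command above.
    print("FoodKG reachable:", is_port_open())
```

A result of False usually means the container has not finished starting or the port mapping differs from `-p 5000:5000`.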

To reproduce the results and build from scratch, follow these steps:

Required libraries:

  1. TensorFlow
  2. Flask
  3. NLTK
  4. Werkzeug
  5. Beautiful Soup
  6. Requests
    Download the AGROVEC embedding model from Google Drive, unzip it, and place it in FoodKG/Prediction/AGROVEC/.
    After that, download Apache Jena and place it in the Apache Jena directory.
    Finally, run python3 FoodKG.py, the main script, which starts the Flask server on localhost.
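Before running FoodKG.py, a quick check that the required libraries are importable can save a failed start. The sketch below assumes the usual import names for the packages listed above (in particular `bs4` for Beautiful Soup 4); the function name `missing_libraries` is hypothetical and not part of the repository.

```python
import importlib.util

# Display name -> top-level import name (assumed; bs4 is the Beautiful Soup 4 package).
REQUIRED = {
    "TensorFlow": "tensorflow",
    "Flask": "flask",
    "NLTK": "nltk",
    "Werkzeug": "werkzeug",
    "Beautiful Soup": "bs4",
    "Requests": "requests",
}


def missing_libraries(required=REQUIRED):
    """Return the display names of libraries whose module cannot be found."""
    return [
        name
        for name, module in required.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_libraries()
    if missing:
        print("Missing libraries:", ", ".join(missing))
    else:
        print("All required libraries are importable.")
```

The check only confirms that each module can be located; it does not verify versions.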

AGROVOC & AGROVEC
FoodKG runs with our embedding vectors, AGROVEC, by default. The vectors can be found in Prediction/AGROVEC/.
If you would like to use GloVe or any other embedding instead of AGROVEC, add the new vector file to the same directory and change the name in prepare_Models.py. Get GloVe from here
By default, 1,000,000 words are loaded; you can change this number in prepare_Models.py.
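To illustrate what a word limit means in practice, here is a minimal sketch of loading only the first N entries of an embedding file. It assumes the GloVe plain-text layout (one word per line followed by space-separated floats); whether AGROVEC uses the same layout, and how prepare_Models.py actually loads it, is not stated here, so `load_vectors` and `max_words` are hypothetical names.

```python
from itertools import islice


def load_vectors(path, max_words=1_000_000):
    """Read at most `max_words` lines of a GloVe-style text file into a dict."""
    vectors = {}
    with open(path, encoding="utf-8") as handle:
        # islice stops after max_words lines, so the rest of the file is never read.
        for line in islice(handle, max_words):
            parts = line.rstrip().split(" ")
            if len(parts) < 2:
                continue  # skip blank or malformed lines
            vectors[parts[0]] = [float(value) for value in parts[1:]]
    return vectors
```

Embedding files of this kind are often ordered by word frequency, in which case a smaller limit trades coverage of rare terms for lower memory use and faster startup.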

Relations Prediction
FoodKG uses the Specialization Tensor Model (STM) to predict the relations between newly added triples. We re-trained the STM model on the AGROVOC triples dataset, and FoodKG uses our pre-trained model, Prediction/relations_prediction/args.output, by default.

If you want to re-train the STM model yourself, we provide the SPARQL queries you will need to extract the instances from a dataset in SPARQL_Queries. In our case, we used the AGROVOC triples dataset, which can be found here. After extracting the instances using SPARQL, check the STM GitHub page to prepare the training data for STM.
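The queries shipped in SPARQL_Queries are not reproduced here. As a rough illustration of the extraction step, the sketch below builds a generic SPARQL query that returns subject/object pairs for one predicate. It assumes the relation of interest is stored as plain triples and uses the standard SKOS `broader` predicate as an example; the helper `build_pair_query` is hypothetical and is not the repository's query.

```python
# Standard SKOS predicate IRI, used here only as an example relation.
SKOS_BROADER = "http://www.w3.org/2004/02/skos/core#broader"


def build_pair_query(predicate_iri, limit=None):
    """Build a SPARQL SELECT query returning all pairs linked by `predicate_iri`."""
    query = (
        "SELECT ?subject ?object WHERE {\n"
        f"  ?subject <{predicate_iri}> ?object .\n"
        "}"
    )
    if limit is not None:
        # Optional cap on the number of rows, useful for a quick inspection.
        query += f"\nLIMIT {int(limit)}"
    return query


if __name__ == "__main__":
    print(build_pair_query(SKOS_BROADER, limit=10))
```

The resulting string can be run against the dataset with any SPARQL engine, such as the one included in Apache Jena.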

Evaluation
To reproduce the results, download the models and the evaluation dataset from Google Drive.

References:
GEMSEC: Graph Embedding with Self Clustering
Specialization Tensor Model (STM)
Stanford Parser
TensorFlow
AGROVOC
GloVe: Global Vectors for Word Representation
Apache Jena
