TimeNet is a specialized Question Answering (QA) agent designed to handle temporal reasoning in natural language queries. It addresses challenges in identifying, normalizing, and reasoning over time-related information — especially for questions that involve recurring events, ambiguous time expressions, and temporal relationships between events. Built with a ReAct-style Agent architecture and powered by a temporal knowledge graph, TimeNet demonstrates improved performance over standard RAG (Retrieval-Augmented Generation) systems.
- ⚙️ Temporal Knowledge Graph: Constructed using Memgraph, storing over 2,000 entities and 750 time-related nodes with rich temporal relations.
- 🗂️ Graph Construction Pipeline: Automated pipeline for crawling, preprocessing, and enriching event data.
- 🤖 ReAct Agent Architecture: Modular reasoning agent that supports multi-step inference and temporal normalization.
- 📊 Benchmark Evaluation: 210+ temporal QA examples spanning 6 question types, evaluated using LLM-as-Judge, F1-Score, and Time IoU.
- 🔁 Time-aware Embedding Training: Fine-tuning `intfloat/multilingual-e5-large` on existing triplets to sharpen temporal reasoning. (In progress)
```
TimeNet/
├── agent_workflow/        # Core agent implementation
│   ├── state.py           # Agent state management
│   ├── tool.py            # Agent tools and utilities
│   └── workflow.py        # Main workflow logic
├── benchmark/             # Evaluation and benchmarking
├── config/                # Configuration files
├── data_processing/       # Data preprocessing utilities
├── embedding_training/    # Training embeddings for temporal understanding
├── experiment/            # Experimental results and analysis
│   └── results/           # Evaluation results (F1 scores, LLM evaluations)
├── prompt/                # Prompt engineering and graph extraction
└── utils/                 # General utilities
```
```shell
conda create -n timenet python=3.11
conda activate timenet
pip install -r requirements.txt
```

Set the following environment variables (e.g., in a `.env` file):

```
NEO4J_URL =
NEO4J_USERNAME =
NEO4J_PASSWORD =
GOOGLE_API_KEY =
OPENAI_API_KEY =
GROQ_API_KEY =
AZURE_GPT_KEY =
AZURE_GPT_URL =
TAVILY_API_KEY =
MONGO_URI =
MONGO_DB_NAME = TimeNet
QDRANT_URL =
QDRANT_API_KEY =
QDRANT_DB_NAME = kg_triplets
WANDB_API_KEY =
```

```shell
docker-compose up
langgraph dev
```

TimeNet builds a temporal knowledge graph to store structured, time-anchored information, mainly on Viettel-related events and Vietnamese public holidays.
- 📥 Data Collection (`data_processing/data_crawling`)
  - Crawl structured event data from Wikipedia and Viettel news.
  - Use `ScrapeGraphAI` for web scraping + keyword-based search (e.g., "What was Viettel’s biggest milestone in 2021?").
  - Ensure timestamps (day/month/year) are clearly extracted.
- 🧹 Preprocessing & Normalization (`data_processing/data_crawling`)
  - Clean and normalize raw data into structured form:
    - Event name
    - Start/end time (Gregorian + Lunar)
    - Description and location
  - Store in MongoDB.
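The normalized form above can be sketched as a plain dictionary before insertion into MongoDB. The field names below are illustrative assumptions, not the project's actual schema:

```python
from datetime import date

# Illustrative sketch of a normalized event record; field names are
# assumptions, not the project's actual MongoDB schema.
def normalize_event(name, start, end, description, location,
                    lunar_start=None, lunar_end=None):
    """Build a structured event document ready to insert into MongoDB."""
    if end < start:
        raise ValueError("event cannot end before it starts")
    return {
        "event_name": name,
        "start_date": start.isoformat(),  # Gregorian
        "end_date": end.isoformat(),
        "lunar_start": lunar_start,       # lunar-calendar day/month, as a string
        "lunar_end": lunar_end,
        "description": description,
        "location": location,
    }

doc = normalize_event(
    "Tết Nguyên Đán 2024", date(2024, 2, 10), date(2024, 2, 10),
    "Vietnamese Lunar New Year", "Vietnam",
    lunar_start="01/01", lunar_end="01/01",
)
```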
- 🔍 Entity & Relation Extraction (`data_processing/data_transform`)
  - Use GPT-4o + few-shot CoT to extract:
    - Time expressions
    - Relationships (e.g., `OCCURRED_AT`, `PRECEDES`)
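The extraction output might be post-processed along these lines. The `(subject | predicate | object)` line format is an assumption for illustration, not the actual prompt contract in `prompt/`:

```python
import re

# Parse LLM extraction output into triplets, one per line, assuming the
# hypothetical "(subject | predicate | object)" format.
def parse_triplets(llm_output: str):
    triplets = []
    for line in llm_output.splitlines():
        m = re.match(r"\((.+?)\s*\|\s*(.+?)\s*\|\s*(.+?)\)", line.strip())
        if m:
            triplets.append(tuple(m.groups()))
    return triplets

sample = """(Viettel founding | OCCURRED_AT | 1989-06-01)
(Viettel founding | PRECEDES | Viettel Mobile launch)"""
triplets = parse_triplets(sample)
```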
- 🧱 Graph Updating (`data_processing/data_transform`)
  - Use Cypher queries to check for duplicates and merge or insert nodes.
  - Automatically update and scale the graph over time.
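The duplicate-check-and-merge step reduces to a parameterized Cypher `MERGE` keyed on a unique property. The label and property names below are assumptions, not the project's actual graph schema:

```python
# Hypothetical sketch: build a Cypher MERGE that inserts an event node if it
# is new, or fills in a missing description if it already exists.
def build_merge_query(label: str = "Event") -> str:
    return (
        f"MERGE (e:{label} {{name: $name}})\n"
        "ON CREATE SET e.start_date = $start_date, e.description = $description\n"
        "ON MATCH SET e.description = coalesce(e.description, $description)"
    )

query = build_merge_query()
```

A Bolt-compatible driver session (Memgraph speaks the Bolt protocol) would then execute it with the event's parameters.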
- 📐 Embedding & Indexing (`data_processing/data_transform`)
  - Generate embeddings via `text-embedding-003-small` for:
    - Triplets (⟨subject, predicate, object⟩)
    - Node names
  - Store embeddings in Qdrant for fast vector search.
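Before indexing, each triplet has to be flattened into text for the embedding model; a minimal sketch, assuming a simple verbalization template (the project's real template may differ):

```python
# Flatten a ⟨subject, predicate, object⟩ triplet into a single string for the
# embedding model; the verbalization template is an assumption.
def triplet_to_text(subject: str, predicate: str, obj: str) -> str:
    relation = predicate.replace("_", " ").lower()  # "OCCURRED_AT" -> "occurred at"
    return f"{subject} {relation} {obj}"

text = triplet_to_text("Viettel founding", "OCCURRED_AT", "1989-06-01")
```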
🤖 ReAct Agent Flow (Workflow)
TimeNet follows a ReAct-style agent design combining reasoning and tool-use.
- 🔍 Analysis Node
  - Analyzes the user query
  - Decides the next action (tool use, graph search, answer generation)
  - Reformulates sub-queries for optimization
- 🧠 Subgraph Retriever
  - Keyword Extraction: finds temporal + event-related terms
  - Vector Search: retrieves relevant subgraphs from Qdrant
  - Triplet Selection: selects the ~15 most relevant triplets using cosine similarity
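The triplet-selection step can be sketched in pure Python. Toy 3-d vectors stand in for the real embeddings, which would come back from Qdrant:

```python
import math

# Rank candidate triplets by cosine similarity to the query embedding and
# keep the top k (the retriever keeps ~15; k=2 here for the toy example).
def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def top_k_triplets(query_vec, candidates, k=15):
    """candidates: list of (triplet, vector) pairs."""
    ranked = sorted(candidates, key=lambda c: cosine(query_vec, c[1]), reverse=True)
    return [triplet for triplet, _ in ranked[:k]]

query = [1.0, 0.0, 0.0]
cands = [("t1", [0.9, 0.1, 0.0]), ("t2", [0.0, 1.0, 0.0]), ("t3", [0.7, 0.7, 0.0])]
best = top_k_triplets(query, cands, k=2)
```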
- 🛠️ Toolset
  - Web Search Tool: uses Tavily for missing or up-to-date info
  - Time Normalization Tools: convert various time formats (e.g., "next Friday", "last month") into standard Gregorian dates
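A time-normalization tool reduces to mapping a relative expression plus a reference date to a Gregorian range; a minimal sketch handling just two of the expression types mentioned above:

```python
from datetime import date, timedelta

# Map a relative time expression to a concrete (start, end) Gregorian range,
# given a reference "today". Only two expressions are handled for illustration.
def normalize_relative(expr: str, today: date):
    if expr == "last month":
        first_this = today.replace(day=1)
        last_prev = first_this - timedelta(days=1)   # final day of last month
        return (last_prev.replace(day=1), last_prev)
    if expr == "next Friday":
        days_ahead = (4 - today.weekday()) % 7 or 7  # Friday = weekday 4
        target = today + timedelta(days=days_ahead)
        return (target, target)
    raise ValueError(f"unhandled expression: {expr}")

start, end = normalize_relative("last month", date(2024, 3, 15))
next_fri, _ = normalize_relative("next Friday", date(2024, 3, 15))
```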
- ✅ Answer Node
  - Synthesizes reasoning results
  - Formats the final answer (timeline, events, durations)
📊 Temporal Embedding Training (In Progress)
TimeNet is currently developing specialized temporal embeddings to better capture time-related semantic relationships. The embedding training process is ongoing work and aims to improve the system's ability to understand and reason about temporal expressions.
- Base Model Selection
  - Starting with `intfloat/multilingual-e5-large` as our foundation model
  - Selected for its strong multilingual capabilities and performance on semantic similarity tasks
- Training Data Preparation
  - Using temporally-rich triplets from the knowledge graph (`<subject, predicate, object>`)
  - Processing pipeline: Triplets CSV → Data Loading → Negative Sample Generation → Training Example Creation
  - Implementing negative sampling strategies:
    - Wrong object for given subject-predicate pairs
    - Wrong predicate for given subject-object pairs
  - Creating query-answer pairs with temporal descriptions
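The two negative-sampling strategies can be sketched as a single corruption function (the 50/50 split between strategies is an assumption):

```python
import random

# Corrupt a gold triplet either by swapping in a wrong object (same
# subject-predicate) or a wrong predicate (same subject-object).
def corrupt(triplet, all_objects, all_predicates, rng):
    s, p, o = triplet
    if rng.random() < 0.5:
        wrong_o = rng.choice([x for x in all_objects if x != o])
        return (s, p, wrong_o)    # strategy 1: wrong object
    wrong_p = rng.choice([x for x in all_predicates if x != p])
    return (s, wrong_p, o)        # strategy 2: wrong predicate

rng = random.Random(0)  # seeded for reproducibility
gold = ("Viettel founding", "OCCURRED_AT", "1989-06-01")
neg = corrupt(gold, ["1989-06-01", "2004-10-15"], ["OCCURRED_AT", "PRECEDES"], rng)
```

The negative pair `(query, neg)` would then be labeled 0.0 for the contrastive loss described below.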
- Contrastive Learning Approach
  - Using cosine similarity loss function
  - Training with positive examples (labeled 1.0) and negative examples (labeled 0.0)
  - Employing `SentenceTransformer` with mean pooling for sequence representation
- Training Configuration
  - Hyperparameters:
    - Learning rate: 2e-5
    - Batch size: 16
    - Training epochs: 40
    - Warmup steps: 500
  - Training-validation split: 80%-20%
  - Using Weights & Biases for experiment tracking
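The warmup hyperparameter implies a schedule along these lines: linear warmup to the peak rate of 2e-5 over 500 steps. Whether the actual trainer decays the rate afterwards is not specified, so this sketch simply holds it constant:

```python
# Linear warmup to the peak learning rate; held constant afterwards in this
# sketch (the real trainer may apply a decay instead).
def warmup_lr(step: int, peak: float = 2e-5, warmup_steps: int = 500) -> float:
    if step < warmup_steps:
        return peak * step / warmup_steps
    return peak

lr_mid = warmup_lr(250)    # halfway through warmup
lr_peak = warmup_lr(500)   # warmup complete
```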
- Caching & Optimization
  - Implementing data preprocessing caching for faster iteration
  - Pre-computing training data and storing in pickle format
Note: This component is still under active development. Performance metrics and integration results will be reported in future updates.
- Data Volume:
  - 800+ annotated events from 2018–2025.
  - Stored in MongoDB and embedded via `text-embedding-003-small`.
- Question Categories: `Explicit`, `Implicit`, `Ordinal`, `Temporal Answer`, `Duration`, `Non-temporal`.
- Evaluation Set:
  - 180 curated questions from events + 30 general temporal questions.
| Metric | Description |
|---|---|
| LLM-as-Judge | GPT-4o scores (0–5) comparing predicted vs. reference answers |
| F1-Score | Event match quality (name similarity threshold 0.8 via embeddings) |
| Time IoU | Overlap between predicted and true event time ranges |
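The Time IoU metric can be computed directly from date ranges; a sketch assuming inclusive, day-granularity endpoints:

```python
from datetime import date

# Time IoU: intersection over union of the predicted and reference event
# date ranges, measured in days with inclusive endpoints (an assumption).
def time_iou(pred, gold):
    (ps, pe), (gs, ge) = pred, gold
    inter = (min(pe, ge) - max(ps, gs)).days + 1
    union = (max(pe, ge) - min(ps, gs)).days + 1
    return max(inter, 0) / union

iou = time_iou(
    (date(2024, 1, 1), date(2024, 1, 10)),   # predicted range
    (date(2024, 1, 6), date(2024, 1, 15)),   # reference range
)
```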
| Metric | Baseline (RAG) | TimeNet |
|---|---|---|
| LLM-as-Judge | 3.16 | 3.38 |
| F1-Score | 0.59 | 0.81 |
| Time IoU | 0.49 | 0.58 |
- ❌ Difficulty with broad time queries (e.g., "List all events in 2024")
- ⚠️ Noisy data can introduce inconsistencies in the graph
- 📉 Generic embeddings may miss subtle temporal cues
- 🔁 Train a time-aware embedding model for better temporal matching.
- 🧹 Add modules for data validation, de-duplication, and graph update optimization.

