Implementation of an interactive chatbot for summarizing legal and policy documents. Includes data preprocessing (cleaning, tokenization, hierarchical chunking), extractive TF-IDF baselines, and fine-tuned abstractive models (DistilBART, LED). Integrates a retrieval layer for document relevance and reports ROUGE, BLEU, and cosine similarity metrics.
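As a rough illustration of what an extractive TF-IDF baseline can look like, the sketch below scores each sentence by the average TF-IDF weight of its terms and keeps the top-scoring sentences in their original order. It is not taken from the repository: the function names (`tokenize`, `split_sentences`, `tfidf_summary`), the regex-based sentence splitter, and the smoothed IDF formula are all assumptions made for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a string and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def split_sentences(text):
    """Naive splitter (assumption): break after '.', '!' or '?' plus whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tfidf_summary(text, n_sentences=2):
    """Return the n_sentences highest-scoring sentences, in document order.

    Each sentence is treated as one "document" for the IDF statistics.
    """
    sentences = split_sentences(text)
    tokenized = [tokenize(s) for s in sentences]
    n = len(sentences)

    # Document frequency: number of sentences containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))

    scores = []
    for tokens in tokenized:
        if not tokens:
            scores.append(0.0)
            continue
        tf = Counter(tokens)
        # Smoothed IDF (assumption): log((1 + n) / (1 + df)) + 1.
        total = sum(
            count * (math.log((1 + n) / (1 + df[term])) + 1)
            for term, count in tf.items()
        )
        # Average over sentence length so long sentences are not favored.
        scores.append(total / len(tokens))

    top = sorted(range(n), key=lambda i: scores[i], reverse=True)[:n_sentences]
    return [sentences[i] for i in sorted(top)]
```

Calling `tfidf_summary(document_text, n_sentences=3)` returns a list of up to three sentences. A production baseline would likely use a proper sentence tokenizer and stop-word handling, which this sketch leaves out.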
Exploration of retrieval methods on the HotpotQA corpus, combining dense retrieval and feature-based reranking. Achieved a mean nDCG@10 of 0.9416 using LambdaRank with features such as cross-encoder score, LLM score, BM25 score, and token-based statistics, surpassing dense retriever + cross-encoder baselines.
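For readers unfamiliar with the metric quoted above, here is a minimal sketch of how a mean nDCG@10 could be computed from per-query relevance labels. It is an illustration, not the repository's evaluation code. It assumes linear gain (some implementations use `2**rel - 1` instead) and builds the ideal ranking from the labels of the returned list only, so it implicitly assumes every relevant document appears somewhere in that list. The function names are hypothetical.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the first k relevance labels.

    The item at 0-based position `rank` is discounted by log2(rank + 2).
    """
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances, k=10):
    """nDCG@k for one query; `relevances` is in the order the system ranked them."""
    ideal_dcg = dcg_at_k(sorted(relevances, reverse=True), k)
    if ideal_dcg == 0:
        return 0.0  # no relevant documents: define the score as 0
    return dcg_at_k(relevances, k) / ideal_dcg


def mean_ndcg_at_k(ranked_lists, k=10):
    """Average nDCG@k over a non-empty collection of queries."""
    return sum(ndcg_at_k(r, k) for r in ranked_lists) / len(ranked_lists)
```

A reranker such as LambdaRank would be evaluated by sorting each query's candidates by predicted score, reading off their relevance labels in that order, and passing the resulting lists to `mean_ndcg_at_k`.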
This repository contains implementations and workflows for key NLP tasks such as text classification, generative AI, sentiment analysis, and entity recognition. It includes preprocessing scripts, annotated datasets, and fine-tuning methods for libraries such as Hugging Face and spaCy. Intended for building and deploying scalable NLP solutions.