v-mdev/news-scraping

News Orchestration

This project manages and automates news data workflows. It collects news article titles and classifies them into categories using embeddings and vector search.
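To make the classification step concrete, here is a minimal, self-contained sketch of the general idea (embed each title, then pick the nearest category vector). It is not this repository's code: the bag-of-words `embed` function is a toy stand-in for a real embedding model, the brute-force loop stands in for a vector database, and `CATEGORY_SEEDS`, `embed`, `cosine`, and `classify` are all hypothetical names chosen for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical seed text per category; a real setup would embed labelled examples.
CATEGORY_SEEDS = {
    "sports": "football match team league player goal championship",
    "finance": "stock market bank shares inflation economy rates",
    "technology": "software ai chip smartphone startup app data",
}


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# The "vector index": one embedded vector per category.
INDEX = {label: embed(text) for label, text in CATEGORY_SEEDS.items()}


def classify(title: str) -> str:
    """Return the category whose vector is most similar to the title's vector."""
    query = embed(title)
    label, score = max(
        ((name, cosine(query, vector)) for name, vector in INDEX.items()),
        key=lambda pair: pair[1],
    )
    # No shared vocabulary with any category: report that instead of guessing.
    return label if score > 0 else "unknown"
```

Swapping `embed` for a model-backed embedding and `INDEX` for a vector store would keep the same shape: embed the title, search for the nearest neighbour, return its label.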

Note: This project is not finished and is currently under active development.

Project Overview

This repository provides all the code and configuration to:

  • Ingest news data from multiple sources.
  • Process and clean collected news articles.
  • Orchestrate workflows using Prefect.
  • Run and monitor data pipelines locally.
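As a rough illustration of how the ingest, clean, and orchestrate steps above could fit together, the sketch below wires two tasks into a Prefect flow using the `@task` and `@flow` decorators. The function names (`fetch_titles`, `clean_titles`, `clean`, `news_scraping`) and the sample titles are assumptions made for this example, not the repository's actual modules. The `ImportError` fallback only exists so the sketch also runs where Prefect is not installed.

```python
try:
    from prefect import flow, task
except ImportError:
    # Prefect not installed: pass-through decorators keep the sketch runnable.
    def task(fn):
        return fn

    def flow(fn):
        return fn


def clean_titles(raw_titles: list[str]) -> list[str]:
    """Collapse whitespace, drop empty titles, and remove case-insensitive duplicates."""
    seen = set()
    cleaned = []
    for title in raw_titles:
        normalized = " ".join(title.split())
        key = normalized.lower()
        if normalized and key not in seen:
            seen.add(key)
            cleaned.append(normalized)
    return cleaned


@task
def fetch_titles() -> list[str]:
    """Hypothetical ingestion step; a real task would scrape or call a news source."""
    return ["  Markets rally   on rate cut ", "markets rally on rate cut", ""]


@task
def clean(raw_titles: list[str]) -> list[str]:
    """Processing step, wrapping the plain helper so it can be tested on its own."""
    return clean_titles(raw_titles)


@flow
def news_scraping() -> list[str]:
    """Orchestrates ingestion followed by cleaning."""
    return clean(fetch_titles())
```

Calling `news_scraping()` from a Python session runs the flow locally. Keeping the cleaning logic in a plain function (`clean_titles`) means it can be unit-tested without a Prefect runtime.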

Features

  • Automated news data ingestion and processing.
  • Integration with Prefect for workflow management.
  • Local and programmatic execution of flows.

Installation

Clone the repository and install the package locally:

git clone https://github.com/v-mdev/news-scraping.git
cd news-scraping
pip install .

Usage

Run flows using the Prefect CLI:

prefect deployment run 'pipeline/news_scraping'

You can also customize or create new flows by editing the Python modules in the repository.

License

This project is distributed under the MIT License.

About

News title classifier using LLMs
