Dialog / Info-Tech 语镜

Dialog · Dialog Safety Infra — Real-time harmful-speech detection and communication improvement for family dialogue

Primary documentation is in English. A brief Chinese summary is in README_zh.md.

License: MIT · Website · Python 3.10+ · React 18

Website · Features · Quick Start · Architecture · Contributing


Introduction

Dialog (语镜) targets family conversation. It combines real-time speech recognition, harmful-speech detection, and smart feedback so that parents can notice and improve how they talk to children. It supports ESP32 hardware, the web, and a browser extension, and uses a three-stage pipeline (absolute keywords → semantic vector recall → LLM screening) to keep recall high while controlling false positives.

| Capability | Description |
| --- | --- |
| Real-time | WebSocket audio → Tencent Cloud ASR → harmful detection → vibration/caption alerts |
| Offline | Upload recordings → speaker diarization → replay summary and AI role-play |
| Multi-client | Web dashboard, live captioning, browser extension, wearable bridge (e.g. smart glasses) |

Product Preview

| Dashboard | Website |
| --- | --- |
| Dashboard preview | → Live website |

Run the frontend locally to see the full UI: Dashboard (/dashboard), Sessions, Live Listen (/live), Devices, Review Feed, and so on. For a quick UI-only preview, open ui-preview.html in the repo root (no server required). See docs/UI_PREVIEW.md.


Features

Core

  • Real-time harmful-speech detection: absolute keywords → semantic vector recall → LLM screening (see the design doc)
  • Tencent Cloud ASR: real-time and file-based recognition, speaker diarization (up to 9 speakers)
  • Smart feedback: severity 1–5, configurable vibration, live captions, alerts
  • Review & role-play: session summaries, highlight clips, AI alternative phrasing and scenario practice
  • Multi-client: ESP32 (PCM/BLE), browser extension, PCM SDK, web upload
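
The three-stage detection described above can be pictured as a cascade in which cheap checks run first and the LLM is consulted only for candidates that survive them. The following is a minimal sketch under stated assumptions, not the project's implementation: the keyword set, the exemplar sentences, the threshold, the toy bag-of-words `embed`, the `Verdict` type, and the fixed severity of 5 for keyword hits are all hypothetical, and the LLM stage is passed in as a plain callable.

```python
import math
from dataclasses import dataclass
from typing import Callable

# Hypothetical data for illustration; the project's real lists and threshold are not shown here.
ABSOLUTE_KEYWORDS = {"stupid", "useless"}
RECALL_THRESHOLD = 0.5


@dataclass
class Verdict:
    harmful: bool
    stage: str      # stage that decided: "keyword", "vector", or "llm"
    severity: int   # 0 = not harmful, 1-5 = harmful


def embed(text: str) -> dict[str, float]:
    """Toy bag-of-words vector; a stand-in for a real sentence encoder."""
    vec: dict[str, float] = {}
    for word in text.lower().split():
        vec[word] = vec.get(word, 0.0) + 1.0
    return vec


def cosine(a: dict[str, float], b: dict[str, float]) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


# Hypothetical exemplars of harmful phrasing used for semantic recall.
HARMFUL_EXEMPLARS = [
    embed("you never do anything right"),
    embed("why can't you be like your brother"),
]


def detect(text: str, llm_judge: Callable[[str], int]) -> Verdict:
    """Run the cascade; llm_judge returns 0 (fine) or a severity from 1 to 5."""
    # Stage 1: an absolute keyword hit is flagged immediately (severity 5 is an assumption).
    if set(text.lower().split()) & ABSOLUTE_KEYWORDS:
        return Verdict(True, "keyword", 5)
    # Stage 2: semantic recall; text that resembles no exemplar is dropped without an LLM call.
    score = max(cosine(embed(text), exemplar) for exemplar in HARMFUL_EXEMPLARS)
    if score < RECALL_THRESHOLD:
        return Verdict(False, "vector", 0)
    # Stage 3: the LLM screens the recalled candidate to control false positives.
    severity = llm_judge(text)
    return Verdict(severity > 0, "llm", severity)
```

Ordering the stages this way means the most expensive call is made only for text that already resembles a known harmful pattern, which is one plausible reading of "high recall with controlled false positives".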

Tech highlights

| Highlight | Description |
| --- | --- |
| Real-time + offline | Live alerts and post-hoc analysis |
| Pluggable detectors | Keyword / vector / LLM, extensible |
| JWT + device binding | User auth and device ownership, admin and normal users |
| Tencent COS | Audio storage and presigned URLs, private read/write |
| Wearable bridge | Caption/alert bridge for smart glasses, etc. |

Architecture

flowchart LR
  subgraph Ingest
    A[ESP32]
    B[Browser extension]
    C[Web / Mobile]
  end

  subgraph Backend
    D[FastAPI]
    E[Ingest WS/HTTP]
    F[Tencent ASR]
    G[Detector pipeline]
    H[Review / Role-play]
  end

  subgraph Output
    I[Live captions / Alerts]
    J[Tencent COS]
    K[Wearable bridge]
  end

  A & B --> E
  C --> D
  E --> F --> G --> H
  G --> I
  D --> J
  I --> K

Flow: Devices / extension / web → Ingest → Tencent ASR → pipeline (keywords + vectors + LLM) → live alerts / storage / review.
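
The flow above can be sketched as a small asyncio pipeline in which an ingest task feeds a queue and a consumer transcribes and screens each chunk. This is only an illustration of the staging: `fake_asr` and `fake_detect` are hypothetical stand-ins (the real system calls Tencent ASR and the detector pipeline), and the chunks here are UTF-8 text rather than PCM audio.

```python
import asyncio
from typing import Optional


async def fake_asr(chunk: bytes) -> str:
    """Stand-in for the ASR call: decodes bytes as text instead of recognising audio."""
    return chunk.decode("utf-8")


def fake_detect(text: str) -> bool:
    """Stand-in for the detector pipeline: a single illustrative keyword check."""
    return "useless" in text.lower()


async def ingest(chunks: list[bytes], audio_q: "asyncio.Queue[Optional[bytes]]") -> None:
    """Push incoming chunks onto the queue, then a None sentinel to signal end of stream."""
    for chunk in chunks:
        await audio_q.put(chunk)
    await audio_q.put(None)


async def transcribe_and_detect(audio_q: "asyncio.Queue[Optional[bytes]]", alerts: list[str]) -> None:
    """Consume chunks until the sentinel, recording an alert for each flagged utterance."""
    while (chunk := await audio_q.get()) is not None:
        text = await fake_asr(chunk)
        if fake_detect(text):
            alerts.append(text)


async def run_flow(chunks: list[bytes]) -> list[str]:
    # A bounded queue applies backpressure if recognition falls behind ingest.
    audio_q: "asyncio.Queue[Optional[bytes]]" = asyncio.Queue(maxsize=8)
    alerts: list[str] = []
    await asyncio.gather(ingest(chunks, audio_q), transcribe_and_detect(audio_q, alerts))
    return alerts
```

Calling `asyncio.run(run_flow([b"good job", b"you are useless"]))` runs both stages concurrently and returns the flagged utterances.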


Tech Stack

| Layer | Stack |
| --- | --- |
| Backend | FastAPI · WebSocket · SQLModel · Tencent ASR/COS · OpenRouter (LLM) · JWT |
| Frontend | React 18 · Vite · React Router · design system (CSS variables, glassmorphism) |
| Detection | Absolute keywords · semantic vectors (OpenAI / sentence-transformers) · LLM screening |
| Ecosystem | Browser extension · PCM client SDK · wearable bridge scripts |

Quick Start

Requirements

  • Python 3.10+ (backend)
  • Node.js 18+ (frontend)
  • Tencent Cloud account (ASR + COS); see the setup guide

1. Clone and install

git clone https://github.com/Info-Tech-org/Dialog.git
cd Dialog

# Backend
cd backend && pip install -r requirements.txt && cd ..

# Frontend
cd frontend && npm install && cd ..

2. Configure (local only)

Secrets stay out of the repo. Use a local .env in backend/:

cd backend
cp .env.example .env
# Edit .env and set TENCENT_SECRET_ID, TENCENT_SECRET_KEY, OPENROUTER_API_KEY, etc.

Or run the one-shot setup script (prompts for keys and writes backend/.env):

python scripts/setup_local_env.py

See Local development below.

3. Run

# Terminal 1: backend
cd backend && python -m uvicorn main:app --host 0.0.0.0 --port 8000 --reload

# Terminal 2: frontend
cd frontend && npm run dev

Open http://localhost:3000 (or the port Vite shows). Create an admin with python backend/create_admin_user.py if needed.

4. UI-only preview

Open ui-preview.html in the repo root, or run the frontend and go to /dashboard. See docs/UI_PREVIEW.md.


Local development (secrets)

All secrets are read from environment variables (e.g. backend/.env). Never commit real keys.

  1. Copy template: cp backend/.env.example backend/.env
  2. Fill in backend/.env with your Tencent ASR/COS and OpenRouter keys.
  3. One-shot script (optional): from repo root run python scripts/setup_local_env.py; it will prompt for each key and write backend/.env.

Required for full local run:

  • TENCENT_SECRET_ID, TENCENT_SECRET_KEY (Tencent Cloud ASR)
  • OPENROUTER_API_KEY (LLM for harmful detection and review)
  • Optional: COS bucket/region, EMBEDDING_* for vector detection, JWT_SECRET_KEY
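
Before starting the backend it can help to confirm that the required keys are actually set. The sketch below is a hypothetical standalone check, not the project's config/settings.py: `parse_env_file` and `missing_keys` are names introduced here, and the parser handles only simple KEY=VALUE lines with optional quotes.

```python
import os
from typing import Mapping

# Keys listed above as required for a full local run.
REQUIRED = ("TENCENT_SECRET_ID", "TENCENT_SECRET_KEY", "OPENROUTER_API_KEY")


def parse_env_file(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and one layer of matching-style quotes.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(env: Mapping[str, str]) -> list[str]:
    """Return the required keys that are absent or empty in the given mapping."""
    return [key for key in REQUIRED if not env.get(key)]
```

For example, `missing_keys(os.environ)` checks the current process environment, while `missing_keys(parse_env_file(open("backend/.env").read()))` checks the file directly.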

Project structure

Dialog/
├── backend/           # FastAPI backend
│   ├── main.py        # App entry
│   ├── config/        # settings.py (env-based)
│   ├── api/           # REST: auth, sessions, upload, review, devices
│   ├── realtime/      # ASR, detector pipeline (keyword / vector / LLM)
│   ├── ingest/        # WebSocket ingest, session management
│   ├── offline/       # Offline ASR, diarization, COS upload
│   └── models/        # SQLModel
├── frontend/          # React + Vite
│   └── src/
│       ├── pages/     # Dashboard, sessions, upload, live, devices, review
│       ├── components/
│       └── api/
├── assert/            # Website and brand assets (e.g. Vercel)
├── browser-extension/
├── packages/          # PCM client SDK, etc.
├── docs/              # Design docs, UI preview
├── tools/             # PCM tests, wearable bridge
└── ui-preview.html    # Static UI preview (no server)

Docs and ecosystem

| Type | Links |
| --- | --- |
| Website | Website · assert/ · OFFICIAL_SITE.md |
| Detection | Harmful detection design · Detector plugins |
| APIs | PCM ingest · WS streaming · BLE binding |
| Extensions | Browser extension · PCM client · Wearable bridge |
| Deploy | Deploy guide · COS setup |
| Governance | CONTRIBUTING.md · LICENSE · SECURITY · CODE_OF_CONDUCT |

Project standards

| File | Purpose |
| --- | --- |
| LICENSE | MIT. Use, modify, redistribute with copyright notice. |
| SECURITY.md | Supported versions, how to report vulnerabilities privately. |
| CODE_OF_CONDUCT.md | Community standards (Contributor Covenant 2.0). |
| CONTRIBUTING.md | Bug reports, feature ideas, PR and code guidelines. |

Contributing and license

Use GitHub Issues for bugs and Pull Requests for changes. Please read CONTRIBUTING.md and CODE_OF_CONDUCT.md.

This project is MIT licensed. Security issues: see SECURITY.md.


Dialog — Safer family dialogue · https://infotech-launch.vercel.app/
