
[Shopping List] Task 4 – Speech Loop (Whisper + GPT-4 Orchestration) #68

@akisma

Epic

Goal

Wire the end-to-end speech interaction so chefs can control shopping lists hands-free, translating voice input into backend actions through AI intent resolution.

Deliverables

  • Expo-side voice capture module that requests microphone permission and streams audio to the backend
  • Backend relay endpoint calling OpenAI Whisper for speech-to-text, with retries on transient errors
  • Conversational intent orchestrator using GPT-4 to map utterances to CRUD/reminder actions with context awareness
  • Guardrails/rate limiting to prevent abuse and handle noisy kitchen environments
  • Automated tests for intent mapping, fallback dialogs, and failure states
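As a starting point for the intent orchestrator and its tests, here is a minimal sketch of validating the model's reply before any backend action runs. It assumes GPT-4 is prompted to answer with a single JSON object; the `Intent` shapes, field names, and `parseIntent` are hypothetical and not defined anywhere in this issue. Anything malformed or unrecognized falls back to a clarifying prompt rather than a guessed action.

```typescript
// Hypothetical intent shapes for illustration; the real contract is still to be defined.
type Intent =
  | { action: "create_list"; listName: string }
  | { action: "add_item"; item: string; quantity: number; unit?: string }
  | { action: "remove_item"; item: string }
  | { action: "send_list"; recipient?: string }
  | { action: "clarify"; prompt: string };

// Fallback dialog used whenever the model output cannot be trusted.
const FALLBACK: Intent = {
  action: "clarify",
  prompt: "Sorry, I didn't catch that. Could you repeat it?",
};

function nonEmptyString(value: unknown): string | null {
  return typeof value === "string" && value.trim() !== "" ? value.trim() : null;
}

// Parse and validate the raw model reply (assumed to be a JSON object).
function parseIntent(raw: string): Intent {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return FALLBACK; // not JSON at all
  }
  if (typeof data !== "object" || data === null) return FALLBACK;
  const d = data as Record<string, unknown>;

  switch (d.action) {
    case "create_list": {
      const listName = nonEmptyString(d.listName);
      return listName ? { action: "create_list", listName } : FALLBACK;
    }
    case "add_item": {
      const item = nonEmptyString(d.item);
      if (!item) return FALLBACK;
      // Assumption: a missing or invalid quantity defaults to 1.
      const quantity =
        typeof d.quantity === "number" && Number.isFinite(d.quantity) && d.quantity > 0
          ? d.quantity
          : 1;
      const unit = nonEmptyString(d.unit);
      return { action: "add_item", item, quantity, ...(unit ? { unit } : {}) };
    }
    case "remove_item": {
      const item = nonEmptyString(d.item);
      return item ? { action: "remove_item", item } : FALLBACK;
    }
    case "send_list": {
      const recipient = nonEmptyString(d.recipient);
      return { action: "send_list", ...(recipient ? { recipient } : {}) };
    }
    default:
      return FALLBACK; // unknown or missing action
  }
}
```

Keeping validation in a pure function like this means the intent-mapping and fallback tests can run without calling OpenAI at all; reminder actions would be added as further union members once their shape is agreed.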

Acceptance Criteria

  • Voice commands (create list, add/remove items, send list) succeed end-to-end with latency targets documented
  • Conversational context retained within the active session (e.g., a follow-up that clarifies “add two cases of tomatoes”)
  • Graceful degradation to manual UI when speech fails or is unavailable
  • Telemetry logged for speech success/failure without storing raw audio beyond processing needs
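The last two criteria could be approached roughly as follows. This is a sketch under stated assumptions: the outcome names, the event fields, `toTelemetryEvent`, `shouldFallBackToManual`, and the two-failure threshold are all hypothetical choices, not requirements from this issue. The telemetry event records only the size of the audio, never the bytes, and the fallback check switches to the manual UI after repeated failures or as soon as speech is reported unavailable.

```typescript
// Hypothetical outcome labels for one speech attempt.
type SpeechOutcome =
  | "success"
  | "transcription_failed"
  | "intent_unresolved"
  | "unavailable";

// What the relay knows about an attempt while processing it.
interface SpeechAttempt {
  sessionId: string;
  outcome: SpeechOutcome;
  startedAt: number; // epoch milliseconds
  finishedAt: number; // epoch milliseconds
  audio: Uint8Array; // raw audio, held only for the duration of processing
}

// What gets logged: no audio payload, only derived metadata.
interface SpeechTelemetryEvent {
  sessionId: string;
  outcome: SpeechOutcome;
  latencyMs: number;
  audioBytes: number;
  timestamp: string; // ISO 8601
}

function toTelemetryEvent(attempt: SpeechAttempt): SpeechTelemetryEvent {
  return {
    sessionId: attempt.sessionId,
    outcome: attempt.outcome,
    latencyMs: Math.max(0, attempt.finishedAt - attempt.startedAt),
    audioBytes: attempt.audio.byteLength, // size only, the bytes are not copied
    timestamp: new Date(attempt.finishedAt).toISOString(),
  };
}

// Decide whether the client should switch to the manual UI.
// Assumption: two consecutive failures is the threshold; tune as needed.
function shouldFallBackToManual(recent: SpeechOutcome[], maxFailures = 2): boolean {
  let failures = 0;
  // Walk backwards over the most recent attempts until the last success.
  for (let i = recent.length - 1; i >= 0; i--) {
    if (recent[i] === "success") break;
    if (recent[i] === "unavailable") return true; // no point retrying
    failures++;
  }
  return failures >= maxFailures;
}
```

The `latencyMs` field also gives a place to measure against whatever latency targets end up documented for the first criterion.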

Notes

  • Coordinate with Task 5 for spoken confirmation contract
  • Security review required for handling OpenAI credentials and audio payloads
  • Consider offline/poor network guidance in UX copy

Metadata

Assignees

No one assigned

Labels

enhancement (New feature or request)
