Feature/audio input by DenisovAV · Pull Request #182 · DenisovAV/flutter_gemma

DenisovAV · 2026-02-04T18:05:30Z

No description provided.

Audio support:
- Add supportAudio parameter through full chain (Dart -> Native)
- Add setAudioModelOptions() in Android native for MediaPipe
- Add audio recording UI in chat_input_field.dart
- Add audio playback in chat_message.dart
- Disable audio for .task models (no TF_LITE_AUDIO_ENCODER)

Download improvements:
- Add foreground parameter for Android large model downloads
- SmartDownloader auto-detects allowPause based on server response
- Remove automatic retries, keep manual retry only

Desktop fixes:
- Add maxNumImages parameter to grpc_client.initialize()
- Fix vision parameter passing chain

Tests:
- Add pigeon_support_audio_test.dart
- Add desktop_vision_params_test.dart
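The commit does not say how SmartDownloader inspects the server response. As a rough illustration only, the sketch below assumes the decision is based on standard HTTP range support (an `Accept-Ranges: bytes` header, or a `206 Partial Content` status on a ranged probe). The function name `allows_pause` is hypothetical and is not part of the plugin.

```python
from typing import Mapping


def allows_pause(status_code: int, headers: Mapping[str, str]) -> bool:
    """Guess whether a download can be paused and resumed later.

    Assumption (not taken from the PR): the server signals range support
    either with "Accept-Ranges: bytes" or by answering a ranged request
    with 206 Partial Content.
    """
    # HTTP header names are case-insensitive, so normalise before lookup.
    normalised = {name.lower(): value.strip().lower() for name, value in headers.items()}
    if status_code == 206:
        return True
    return normalised.get("accept-ranges") == "bytes"
```

With this rule a plain `200` response without an `Accept-Ranges` header is treated as non-resumable, so pausing would be disabled for it.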

Audio Input:
- Add audio recording and conversion in chat_input_field
- Support audio bytes in gRPC client and server
- Add chatWithAudio method to desktop inference model
- Update proto with audio message support

Desktop Fixes:
- Switch to Azul Zulu JRE 24 (fixes Jinja template errors)
- Add SHA256 checksums for JRE verification
- Fix vision enable logic to match Android (maxNumImages > 0)
- Document vision limitation on macOS (SDK bug #684)
- Fix MediaPipe supportsAudio flag (audio is LiteRT-LM only)

Tests:
- Add desktop gRPC integration tests
- Add LiteRtLmSession unit tests
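The actual checksum verification lives in the platform setup scripts and is not shown on this page. The sketch below only illustrates the general technique of checking a downloaded archive against a pinned SHA256 value. The helper names `sha256_of` and `verify_archive` are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large JRE archives are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path: Path, expected_hex: str) -> None:
    """Raise ValueError if the file's digest differs from the pinned checksum."""
    actual = sha256_of(path)
    if actual != expected_hex.strip().lower():
        raise ValueError(
            f"SHA256 mismatch for {path.name}: expected {expected_hex}, got {actual}"
        )
```

A setup step would call `verify_archive` right after the download and before unpacking, so a corrupted or tampered archive is rejected early.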

MediaPipe Engine:
- Add audio capability validation in createSession()
- Add consistent error handling in generateResponse()

Desktop:
- Add buffer cleanup in session close() to prevent memory leaks
- Add thread safety documentation for session class
- Add shutdown RPC before killing server process
- Fail fast on chatWithImage when vision not enabled

Server:
- Document WAV audio format expectation

Example:
- Fix audio error message (MediaPipe limitation, not iOS)

Documentation:
- Add Platform Limitations table with vision/audio support
- Document iOS Simulator, macOS vision issues
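The fail-fast behaviour listed above can be pictured with a small sketch. It is not the plugin's code: `SessionOptions`, `UnsupportedModalityError`, and `check_request` are hypothetical names, and the only rule taken from this PR is that vision counts as enabled when maxNumImages > 0.

```python
from dataclasses import dataclass


class UnsupportedModalityError(RuntimeError):
    """Raised when a request uses a modality the session was not created with."""


@dataclass(frozen=True)
class SessionOptions:
    # Hypothetical stand-in for the session settings; not the plugin's class.
    max_num_images: int = 0
    support_audio: bool = False

    @property
    def vision_enabled(self) -> bool:
        # Rule stated in the PR: vision is enabled when maxNumImages > 0.
        return self.max_num_images > 0


def check_request(options: SessionOptions, has_image: bool, has_audio: bool) -> None:
    """Reject a request up front instead of failing deep inside inference."""
    if has_image and not options.vision_enabled:
        raise UnsupportedModalityError("image input requires max_num_images > 0")
    if has_audio and not options.support_audio:
        raise UnsupportedModalityError("audio input requires support_audio=True")
```

Validating before any bytes are sent to the engine keeps the error message specific to the missing capability.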

- Replace Temurin JRE 21 with Azul Zulu JRE 24 on all desktop platforms (Temurin causes Jinja template errors with LiteRT-LM native library)
- Update JAR version from 0.1.0 to 0.12.3
- Update all checksums for new JRE and JAR
- Update DESKTOP_SUPPORT.md with Vision/Audio feature columns

Copilot

Pull request overview

This PR adds audio input support for Gemma 3n E2B/E4B models, enabling voice-to-text and multimodal interactions. It includes significant infrastructure upgrades (JRE switch from Temurin 21 to Azul Zulu 24) and comprehensive platform implementations across Android, Desktop, and Web.

Changes:

- Audio input API with supportAudio parameter and addAudio() method for Android, Desktop (macOS/Windows/Linux), and Web platforms
- JRE upgrade from Adoptium Temurin 21 to Azul Zulu 24 to fix Jinja template errors with LiteRT-LM native library
- LiteRT-LM SDK update from 0.9.0-alpha01 to 0.9.0-alpha02 with Contents API support for multimodal messages
- Enhanced download service with Android foreground service support for large files (>500MB)
- Desktop bug fixes for text chat and callback-based streaming API

Reviewed changes

Copilot reviewed 77 out of 91 changed files in this pull request and generated no comments.

File	Description
pigeon.dart	Added supportAudio and enableAudioModality parameters, addAudio method to PlatformService interface
lib/pigeon.g.dart	Generated Dart code for new audio API methods
ios/Classes/PigeonInterface.g.swift	Generated Swift code with audio support (iOS returns error - not supported)
android/src/main/kotlin/.../PigeonInterface.g.kt	Generated Kotlin code for Android audio implementation
lib/core/message.dart	Added audioBytes field and audio-related factory methods to Message class
lib/core/chat.dart	Added supportAudio field to InferenceChat
android/src/main/kotlin/.../engines/	Audio support in MediaPipe and LiteRT-LM engines with proper error handling
litertlm-server/src/main/kotlin/	Desktop gRPC server audio implementation with WAV format support
lib/desktop/grpc_client.dart	Added chatWithAudio method and audio parameters to initialization
lib/web/flutter_gemma_web.dart	Web platform audio support with AudioPromptPart
windows/scripts/setup_desktop.ps1	JRE upgrade to Azul Zulu 24, version 0.12.3
macos/scripts/setup_desktop.sh	JRE upgrade to Azul Zulu 24, version 0.12.3
linux/scripts/setup_desktop.sh	JRE upgrade to Azul Zulu 24, version 0.12.3
lib/mobile/smart_downloader.dart	Android foreground service configuration for large downloads
example/lib/utils/audio_converter.dart	Audio format conversion utilities (PCM/WAV, resampling)
example/lib/chat_input_field.dart	Audio recording UI with microphone button and waveform display
test/pigeon_support_audio_test.dart	Integration tests for audio parameter passing through Pigeon
README.md	Documentation updates for audio features and foreground downloads
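The table mentions PCM/WAV conversion in the example app and a WAV expectation on the desktop server, but the converter itself is not shown here. As a minimal sketch of what such a conversion involves, the function below wraps raw 16-bit PCM in a WAV container using Python's standard `wave` module. The 16 kHz mono default is an assumption, not a value confirmed by this PR, and `pcm16_to_wav` is a hypothetical name.

```python
import io
import wave


def pcm16_to_wav(pcm: bytes, sample_rate: int = 16_000, channels: int = 1) -> bytes:
    """Wrap raw little-endian 16-bit PCM samples in a WAV (RIFF) container.

    The default sample rate and channel count are assumptions for illustration.
    """
    bytes_per_frame = 2 * channels  # 2 bytes per 16-bit sample, per channel
    if len(pcm) % bytes_per_frame != 0:
        raise ValueError("PCM length is not a whole number of 16-bit frames")
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as writer:
        writer.setnchannels(channels)
        writer.setsampwidth(2)
        writer.setframerate(sample_rate)
        writer.writeframes(pcm)
    # The wave module does not close a caller-supplied buffer, so it is still readable.
    return buffer.getvalue()
```

The resulting bytes start with a RIFF/WAVE header followed by the unchanged samples, which is the shape a receiver expecting WAV input can parse.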

DenisovAV added 7 commits January 24, 2026 13:09

Merge main: add supportAudio to new EngineConfig/SessionConfig

717b2f5

Fix test mocks: add foreground parameter to downloadWithProgress

df56440

Add desktopUrl for gemma3n_2B_litertlm and gemma3n_4B_litertlm models

b4bed19

DenisovAV requested a review from Copilot February 4, 2026 18:16

Copilot started reviewing on behalf of DenisovAV February 4, 2026 18:19

Copilot AI reviewed Feb 4, 2026

DenisovAV added 3 commits February 4, 2026 19:30

Fix lint warnings in pigeon_support_audio_test.dart

63eec14

Fix background_downloader_service_test: mock plugin channel

85e3e1d

Update prepare_resources.sh: Zulu JRE 24 and JAR 0.12.3

4400f87

DenisovAV merged commit f7430f0 into main Feb 4, 2026
3 checks passed

DenisovAV merged 10 commits into main from feature/audio-input