Skip to content

🛡️ Sentinel: Validate audio file uploads using magic bytes#67

Open
Krosebrook wants to merge 1 commit intomainfrom
security/validate-audio-upload-15389321388176749818
Open

🛡️ Sentinel: Validate audio file uploads using magic bytes#67
Krosebrook wants to merge 1 commit intomainfrom
security/validate-audio-upload-15389321388176749818

Conversation

@Krosebrook
Copy link
Owner

@Krosebrook Krosebrook commented Feb 18, 2026

Implemented server-side validation of audio file uploads by inspecting magic bytes. This ensures that only valid audio files (MP3, WAV, OGG, FLAC, M4A, AAC) are processed, preventing potential security risks associated with malicious file uploads disguised as audio. Added detectAudioFormat utility and integrated it into the /api/audio/generate-with-reference route. Verified with unit tests covering various audio formats and invalid inputs.


PR created automatically by Jules for task 15389321388176749818 started by @Krosebrook


Summary by cubic

Adds server-side magic byte validation for audio uploads to block disguised files. Only MP3, WAV, OGG, FLAC, M4A, and AAC are accepted in /api/audio/generate-with-reference.

  • New Features
    • Added detectAudioFormat to identify audio types via magic bytes.
    • Integrated validation in /api/audio/generate-with-reference with 400 on invalid files.
    • Added unit tests for supported formats and invalid inputs.

Written for commit abec10f. Summary will update on new commits.

Summary by CodeRabbit

  • New Features
    • Added audio file format validation to the audio generation feature. When you upload audio files for generation, the system now validates them before processing to ensure they are in a supported format. Invalid or unsupported audio formats are rejected with a clear error message providing details about the issue.

- Implemented `detectAudioFormat` in `server/utils.ts` to inspect file buffers for valid magic bytes (MP3, WAV, OGG, FLAC, M4A/AAC).
- Updated `/api/audio/generate-with-reference` in `server/routes.ts` to use this validation, rejecting files that do not match expected audio formats regardless of their extension or MIME type.
- Added comprehensive tests for audio detection in `server/utils.test.ts`.

This mitigates the risk of malicious file uploads disguised as audio files.

Co-authored-by: Krosebrook <214532761+Krosebrook@users.noreply.github.com>
Copilot AI review requested due to automatic review settings February 18, 2026 05:47
@google-labs-jules
Copy link
Contributor

👋 Jules, reporting for duty! I'm here to lend a hand with this pull request.

When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down.

I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job!

For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with @jules. You can find this option in the Pull Request section of your global Jules UI settings. You can always switch back!

New to Jules? Learn more at jules.google/docs.


For security, I will only act on instructions from the user who triggered this task.

@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello @Krosebrook, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the security posture of the application by introducing server-side validation for audio file uploads. By inspecting the "magic bytes" at the beginning of uploaded files, the system can now accurately determine if a file is a genuine audio format, thereby mitigating risks associated with disguised or malicious file uploads. This ensures that only expected and safe audio content is processed, improving the overall integrity and stability of the service.

Highlights

  • Server-side Audio Validation: Implemented robust server-side validation for audio file uploads using magic bytes to ensure only legitimate audio formats (MP3, WAV, OGG, FLAC, M4A, AAC) are processed.
  • New detectAudioFormat Utility: Introduced a new utility function, detectAudioFormat, capable of identifying various audio file types based on their unique byte signatures.
  • API Integration: Integrated the new audio validation logic into the /api/audio/generate-with-reference route, enhancing security and preventing malicious file uploads disguised as audio.
  • Comprehensive Unit Tests: Added extensive unit tests for the detectAudioFormat utility, covering valid audio formats and various invalid inputs to ensure reliability and correctness.
Changelog
  • server/routes.ts
    • Imported the new detectAudioFormat utility.
    • Integrated audio file format validation using detectAudioFormat into the /api/audio/generate-with-reference endpoint.
    • Added error handling to return a 400 status if an invalid audio file format is detected.
  • server/utils.test.ts
    • Imported detectAudioFormat for testing purposes.
    • Renamed existing test cases for sanitizeLog to sanitizeTestCases for clarity.
    • Added a new suite of audioTestCases to thoroughly test the detectAudioFormat function with various valid and invalid audio buffers.
    • Updated test execution logic to run both sanitization and audio detection tests.
  • server/utils.ts
    • Added a new detectAudioFormat function that takes a Buffer as input.
    • Implemented logic within detectAudioFormat to identify audio formats (WAV, MP3, AAC, OGG, FLAC, M4A/MP4) by checking specific magic bytes and byte patterns.
    • Included checks for buffer length to ensure safe and accurate detection.
Activity
  • PR created automatically by Jules for task 15389321388176749818 started by @Krosebrook.
  • No other specific review comments or activity found in the provided context.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@coderabbitai
Copy link

coderabbitai bot commented Feb 18, 2026

📝 Walkthrough

Walkthrough

The changes add audio format detection to validate uploaded audio files before processing. A new utility function identifies audio formats via magic bytes, which is integrated into the audio generation route to reject invalid formats with a 400 error, and covered with comprehensive tests for multiple audio format signatures.

Changes

Cohort / File(s) Summary
Audio Format Detection Utility
server/utils.ts
Added exported detectAudioFormat(buffer: Buffer): string | null function that detects audio formats (WAV, MP3, AAC, OGG, FLAC, MP4/M4A) by examining magic bytes and returns corresponding MIME type or null for unrecognized formats.
Route Validation Integration
server/routes.ts
Imported detectAudioFormat and added validation in the /api/audio/generate-with-reference route to check uploaded audio file format before processing; returns 400 with "Invalid audio file format" error if detection fails.
Test Coverage
server/utils.test.ts
Refactored test data structure and added comprehensive test suite for detectAudioFormat function, testing multiple valid audio format byte signatures, expected MIME types, and rejection cases for invalid/text/empty/short buffers.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 A rabbit hops through audio bytes,
Magic headers shining bright!
WAV and MP3 now detected,
Invalid formats are rejected—
Files flow safely, tests all right! 🎵

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly summarizes the main change: adding validation for audio file uploads via magic bytes inspection, which aligns with the core objectives of the PR.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch security/validate-audio-upload-15389321388176749818

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

The pull request introduces a robust server-side validation for audio uploads using magic bytes, which significantly improves the security of the application. The implementation of detectAudioFormat is logically sound and covers the required formats. I have suggested a few improvements regarding efficiency and ensuring consistency between the detected format and the data URL used for external API calls.

Comment on lines +44 to +90
export function detectAudioFormat(buffer: Buffer): string | null {
if (!buffer || buffer.length < 4) return null;

// WAV: RIFF .... WAVE
if (buffer.length >= 12 &&
buffer.toString('hex', 0, 4) === '52494646' &&
buffer.toString('hex', 8, 12) === '57415645') {
return 'audio/wav';
}

// MP3: ID3
if (buffer.toString('hex', 0, 3) === '494433') {
return 'audio/mpeg';
}

// AAC: ADTS Header
// Sync (12 bits) means first 4 bits of byte 1 are 1.
// Layer (bits 1,2 of byte 1) is 00.
// Mask 0xF6 (1111 0110) should result in 0xF0 (1111 0000)
if (buffer[0] === 0xFF && (buffer[1] & 0xF6) === 0xF0) {
return 'audio/aac';
}

// MP3: Frame Sync (11 bits) means first 3 bits of byte 1 are 1.
// Layer (bits 1,2 of byte 1) is NOT 00.
if (buffer[0] === 0xFF && (buffer[1] & 0xE0) === 0xE0 && (buffer[1] & 0x06) !== 0x00) {
return 'audio/mpeg';
}

// OGG: OggS
if (buffer.toString('utf8', 0, 4) === 'OggS') {
return 'audio/ogg';
}

// FLAC: fLaC
if (buffer.toString('utf8', 0, 4) === 'fLaC') {
return 'audio/flac';
}

// M4A/AAC (MP4 container): ftypM4A or ftypmp42 or similar
// Looking for 'ftyp' at offset 4
if (buffer.length >= 8 && buffer.toString('utf8', 4, 8) === 'ftyp') {
return 'audio/mp4'; // Covers m4a, mp4 audio
}

return null;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The current implementation of detectAudioFormat creates multiple strings (hex and utf8) for each validation check. For better performance and reduced memory allocation, especially when processing binary data, it is recommended to perform direct byte comparisons on the buffer.

export function detectAudioFormat(buffer: Buffer): string | null {
  if (!buffer || buffer.length < 4) return null;

  // WAV: RIFF .... WAVE
  if (buffer.length >= 12 &&
      buffer[0] === 0x52 && buffer[1] === 0x49 && buffer[2] === 0x46 && buffer[3] === 0x46 &&
      buffer[8] === 0x57 && buffer[9] === 0x41 && buffer[10] === 0x56 && buffer[11] === 0x45) {
    return 'audio/wav';
  }

  // MP3: ID3
  if (buffer[0] === 0x49 && buffer[1] === 0x44 && buffer[2] === 0x33) {
    return 'audio/mpeg';
  }

  // AAC: ADTS Header
  if (buffer[0] === 0xFF && (buffer[1] & 0xF6) === 0xF0) {
    return 'audio/aac';
  }

  // MP3: Frame Sync
  if (buffer[0] === 0xFF && (buffer[1] & 0xE0) === 0xE0 && (buffer[1] & 0x06) !== 0x00) {
    return 'audio/mpeg';
  }

  // OGG: OggS
  if (buffer[0] === 0x4F && buffer[1] === 0x67 && buffer[2] === 0x67 && buffer[3] === 0x53) {
    return 'audio/ogg';
  }

  // FLAC: fLaC
  if (buffer[0] === 0x66 && buffer[1] === 0x4C && buffer[2] === 0x61 && buffer[3] === 0x43) {
    return 'audio/flac';
  }

  // M4A/AAC (MP4 container): ftyp
  if (buffer.length >= 8 && 
      buffer[4] === 0x66 && buffer[5] === 0x74 && buffer[6] === 0x79 && buffer[7] === 0x70) {
    return 'audio/mp4';
  }

  return null;
}

Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: abec10fc9e

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +85 to +86
if (buffer.length >= 8 && buffer.toString('utf8', 4, 8) === 'ftyp') {
return 'audio/mp4'; // Covers m4a, mp4 audio

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Restrict MP4 magic-byte detection to audio brands

The new detectAudioFormat logic classifies any buffer with ftyp at offset 4 as audio/mp4, but many non-audio ISO-BMFF files (e.g., HEIC/AVIF images and video MP4s) also match this signature; because this function is now the content gate for /api/audio/generate-with-reference, such files can pass the new validation and be forwarded as reference audio. This weakens the intended security check—ftyp should be combined with an allowed major brand (or deeper container inspection) before accepting as audio.

Useful? React with 👍 / 👎.

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds server-side validation of audio file uploads by inspecting magic bytes to prevent malicious files disguised as audio. A new detectAudioFormat utility function validates file signatures for MP3, WAV, OGG, FLAC, M4A, and AAC formats, integrated into the /api/audio/generate-with-reference endpoint with comprehensive unit tests.

Changes:

  • Added detectAudioFormat function in server/utils.ts to validate audio file formats via magic byte inspection
  • Integrated magic byte validation into the audio upload endpoint in server/routes.ts
  • Added comprehensive unit tests in server/utils.test.ts covering various audio formats and edge cases

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 8 comments.

File Description
server/utils.ts Implements detectAudioFormat function with magic byte checks for WAV, MP3, AAC, OGG, FLAC, and M4A/MP4 formats
server/utils.test.ts Adds test cases for audio format detection including valid formats and rejection of invalid inputs
server/routes.ts Integrates magic byte validation into /api/audio/generate-with-reference endpoint before processing uploads

Comment on lines +84 to +88
name: "Detect M4A/MP4 (ftypM4A)",
input: Buffer.from("00000020667479704d344120", "hex"), // ...ftypM4A
expected: "audio/mp4"
},
{
Copy link

Copilot AI Feb 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test case for M4A/MP4 detection doesn't validate the major brand checks needed to distinguish audio-only containers from video containers. According to the codebase pattern (server/utils.test.ts:110-124 from memories), tests should include cases for M4A (4d344120), M4B (4d344220), M4P (4d345020) acceptance and isom rejection. The current test only checks for 'ftyp' presence which is insufficient.

Suggested change
name: "Detect M4A/MP4 (ftypM4A)",
input: Buffer.from("00000020667479704d344120", "hex"), // ...ftypM4A
expected: "audio/mp4"
},
{
name: "Detect M4A (ftypM4A)",
input: Buffer.from("00000020667479704d344120", "hex"), // size(32) + ftyp + major brand "M4A "
expected: "audio/mp4"
},
{
name: "Detect M4B (ftypM4B)",
input: Buffer.from("00000020667479704d344220", "hex"), // size(32) + ftyp + major brand "M4B "
expected: "audio/mp4"
},
{
name: "Detect M4P (ftypM4P)",
input: Buffer.from("00000020667479704d345020", "hex"), // size(32) + ftyp + major brand "M4P "
expected: "audio/mp4"
},
{
name: "Reject MP4 video (ftypisom)",
input: Buffer.from("000000206674797069736f6d", "hex"), // size(32) + ftyp + major brand "isom"
expected: null
},
{

Copilot uses AI. Check for mistakes.
input: Buffer.from("FFF15080", "hex"),
expected: "audio/aac"
},
{
Copy link

Copilot AI Feb 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test suite lacks coverage for MP3 files without ID3 tags (frame sync only). The current test only covers MP3 with ID3 tags (line 65), but the implementation also supports MP3 frame sync detection (server/utils.ts:69-71). Add a test case with MP3 frame sync bytes (e.g., Buffer.from("FFE30000", "hex")) to ensure both detection paths work correctly and don't conflict with AAC detection.

Suggested change
{
{
name: "Detect MP3 (frame sync)",
input: Buffer.from("FFE30000", "hex"),
expected: "audio/mpeg"
},
{

Copilot uses AI. Check for mistakes.
return res.status(400).json({ message: "Reference audio file is required" });
}

// Sentinel: Validate file content using magic bytes
Copy link

Copilot AI Feb 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment uses the "Sentinel:" branded prefix, which violates the established codebase convention. According to the style guide, comments should not use branded prefixes like "Sentinel:", "Tool:", or "AI:". Use a simple descriptive comment instead, such as "Validate file content using magic bytes".

Copilot generated this review using guidance from repository custom instructions.
Comment on lines +44 to +90
export function detectAudioFormat(buffer: Buffer): string | null {
if (!buffer || buffer.length < 4) return null;

// WAV: RIFF .... WAVE
if (buffer.length >= 12 &&
buffer.toString('hex', 0, 4) === '52494646' &&
buffer.toString('hex', 8, 12) === '57415645') {
return 'audio/wav';
}

// MP3: ID3
if (buffer.toString('hex', 0, 3) === '494433') {
return 'audio/mpeg';
}

// AAC: ADTS Header
// Sync (12 bits) means first 4 bits of byte 1 are 1.
// Layer (bits 1,2 of byte 1) is 00.
// Mask 0xF6 (1111 0110) should result in 0xF0 (1111 0000)
if (buffer[0] === 0xFF && (buffer[1] & 0xF6) === 0xF0) {
return 'audio/aac';
}

// MP3: Frame Sync (11 bits) means first 3 bits of byte 1 are 1.
// Layer (bits 1,2 of byte 1) is NOT 00.
if (buffer[0] === 0xFF && (buffer[1] & 0xE0) === 0xE0 && (buffer[1] & 0x06) !== 0x00) {
return 'audio/mpeg';
}

// OGG: OggS
if (buffer.toString('utf8', 0, 4) === 'OggS') {
return 'audio/ogg';
}

// FLAC: fLaC
if (buffer.toString('utf8', 0, 4) === 'fLaC') {
return 'audio/flac';
}

// M4A/AAC (MP4 container): ftypM4A or ftypmp42 or similar
// Looking for 'ftyp' at offset 4
if (buffer.length >= 8 && buffer.toString('utf8', 4, 8) === 'ftyp') {
return 'audio/mp4'; // Covers m4a, mp4 audio
}

return null;
}
Copy link

Copilot AI Feb 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function duplicates existing functionality. There is already a detectAudioFormat function in server/replit_integrations/audio/client.ts (lines 20-47) that serves a similar purpose. The existing function returns an AudioFormat type instead of MIME type strings and includes WebM support. Consider consolidating these functions to avoid code duplication and maintenance overhead. Either reuse the existing function or refactor to have a single source of truth for audio format detection.

Copilot uses AI. Check for mistakes.
Comment on lines +83 to +86
// M4A/AAC (MP4 container): ftypM4A or ftypmp42 or similar
// Looking for 'ftyp' at offset 4
if (buffer.length >= 8 && buffer.toString('utf8', 4, 8) === 'ftyp') {
return 'audio/mp4'; // Covers m4a, mp4 audio
Copy link

Copilot AI Feb 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The M4A/MP4 container detection accepts any file with 'ftyp' at offset 4, which is too broad and could lead to accepting non-audio MP4 files (e.g., video files). According to established patterns in the codebase (server/utils.ts:95-102 from memories), M4A validation should check the major brand at bytes 8-11 and accept only M4A (4d344120), M4B (4d344220), M4P (4d345020) brands while rejecting generic MP4 brands like isom, mp41, mp42. This prevents accepting video files disguised as audio.

Suggested change
// M4A/AAC (MP4 container): ftypM4A or ftypmp42 or similar
// Looking for 'ftyp' at offset 4
if (buffer.length >= 8 && buffer.toString('utf8', 4, 8) === 'ftyp') {
return 'audio/mp4'; // Covers m4a, mp4 audio
// M4A-family (MP4 container): ftyp + specific major brands (M4A/M4B/M4P)
// Looking for 'ftyp' at offset 4 and validating major brand at bytes 8-11
if (buffer.length >= 12 && buffer.toString('utf8', 4, 8) === 'ftyp') {
const majorBrandHex = buffer.toString('hex', 8, 12).toLowerCase();
if (
majorBrandHex === '4d344120' || // "M4A "
majorBrandHex === '4d344220' || // "M4B "
majorBrandHex === '4d345020' // "M4P "
) {
return 'audio/mp4'; // Covers m4a/m4b/m4p audio containers
}

Copilot uses AI. Check for mistakes.
Comment on lines +60 to +68
// Sync (12 bits) means first 4 bits of byte 1 are 1.
// Layer (bits 1,2 of byte 1) is 00.
// Mask 0xF6 (1111 0110) should result in 0xF0 (1111 0000)
if (buffer[0] === 0xFF && (buffer[1] & 0xF6) === 0xF0) {
return 'audio/aac';
}

// MP3: Frame Sync (11 bits) means first 3 bits of byte 1 are 1.
// Layer (bits 1,2 of byte 1) is NOT 00.
Copy link

Copilot AI Feb 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The AAC ADTS detection logic is too permissive and could create false positives with MP3 files. According to established patterns (server/utils.ts:96-98 from memories), AAC ADTS files should use explicit checks for 0xFF 0xF1 (MPEG-4) and 0xFF 0xF9 (MPEG-2) rather than bit masking to avoid ambiguity with MP3 sync frames. The current mask (buffer[1] & 0xF6) === 0xF0 would match bytes like 0xF1, 0xF5, 0xF9, but also 0xF0, 0xF4, 0xF8 which might not be valid AAC. Using explicit byte value checks reduces the risk of misidentification.

Suggested change
// Sync (12 bits) means first 4 bits of byte 1 are 1.
// Layer (bits 1,2 of byte 1) is 00.
// Mask 0xF6 (1111 0110) should result in 0xF0 (1111 0000)
if (buffer[0] === 0xFF && (buffer[1] & 0xF6) === 0xF0) {
return 'audio/aac';
}
// MP3: Frame Sync (11 bits) means first 3 bits of byte 1 are 1.
// Layer (bits 1,2 of byte 1) is NOT 00.
// MPEG-4: 0xFF 0xF1
// MPEG-2: 0xFF 0xF9
if (buffer[0] === 0xFF && (buffer[1] === 0xF1 || buffer[1] === 0xF9)) {
return 'audio/aac';
}
// MP3: Frame Sync (11 bits) means first 3 bits of byte 1 are 1.
// Layer (bits 1,2 of byte 1) is NOT 00.
// Layer (bits 1,2 of byte 1) is NOT 00.

Copilot uses AI. Check for mistakes.
Comment on lines +1154 to +1155
console.error(`Invalid file format detected. Mime: ${file.mimetype}`);
return res.status(400).json({ message: "Invalid audio file format. Please upload a valid audio file." });
Copy link

Copilot AI Feb 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to the security logging memory (server/routes.ts:1151-1158), file upload validation failures should be logged with sanitizeLog() including userId, fileSize, mimeType, and originalName for security monitoring and incident response. The current implementation only logs the MIME type without using sanitizeLog() and doesn't include other security-relevant metadata like userId or fileSize. This makes it harder to track and investigate potential security incidents.

Copilot uses AI. Check for mistakes.
Comment on lines +59 to +71
// AAC: ADTS Header
// Sync (12 bits) means first 4 bits of byte 1 are 1.
// Layer (bits 1,2 of byte 1) is 00.
// Mask 0xF6 (1111 0110) should result in 0xF0 (1111 0000)
if (buffer[0] === 0xFF && (buffer[1] & 0xF6) === 0xF0) {
return 'audio/aac';
}

// MP3: Frame Sync (11 bits) means first 3 bits of byte 1 are 1.
// Layer (bits 1,2 of byte 1) is NOT 00.
if (buffer[0] === 0xFF && (buffer[1] & 0xE0) === 0xE0 && (buffer[1] & 0x06) !== 0x00) {
return 'audio/mpeg';
}
Copy link

Copilot AI Feb 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The AAC ADTS detection (lines 63-65) is placed before the MP3 frame sync detection (lines 69-71). This ordering could cause MP3 files without ID3 tags to be misidentified as AAC, since MP3 frame sync bytes (0xFF 0xE0-0xFF) could match the AAC check depending on the specific byte values. The AAC check should be made more restrictive (as noted in another comment) or moved after the MP3 frame sync check, or both checks should be mutually exclusive to prevent misidentification.

Copilot uses AI. Check for mistakes.
Copy link

@cubic-dev-ai cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 issues found across 3 files

Prompt for AI agents (all issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="server/routes.ts">

<violation number="1" location="server/routes.ts:1152">
P2: The magic‑byte detection is only used as a boolean check, but the untrusted `file.mimetype` is still used to build the data URL. A spoofed MIME can pass validation and be forwarded with the wrong type. Use the detected format (and/or reject mismatches) so downstream processing uses the verified MIME.</violation>
</file>

<file name="server/utils.ts">

<violation number="1" location="server/utils.ts:85">
P2: The `ftyp` check is too permissive for audio validation. `ftyp` is present in any MP4 container (including video), so this accepts video MP4s as `audio/mp4`, undermining the audio-only validation. Consider validating the MP4 brand/compatible brands or inspecting tracks to ensure the container is audio-only.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

}

// Sentinel: Validate file content using magic bytes
const detectedFormat = detectAudioFormat(file.buffer);
Copy link

@cubic-dev-ai cubic-dev-ai bot Feb 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2: The magic‑byte detection is only used as a boolean check, but the untrusted file.mimetype is still used to build the data URL. A spoofed MIME can pass validation and be forwarded with the wrong type. Use the detected format (and/or reject mismatches) so downstream processing uses the verified MIME.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At server/routes.ts, line 1152:

<comment>The magic‑byte detection is only used as a boolean check, but the untrusted `file.mimetype` is still used to build the data URL. A spoofed MIME can pass validation and be forwarded with the wrong type. Use the detected format (and/or reject mismatches) so downstream processing uses the verified MIME.</comment>

<file context>
@@ -1147,6 +1148,13 @@ Also suggest a fitting title for the song.`;
       }
 
+      // Sentinel: Validate file content using magic bytes
+      const detectedFormat = detectAudioFormat(file.buffer);
+      if (!detectedFormat) {
+        console.error(`Invalid file format detected. Mime: ${file.mimetype}`);
</file context>
Fix with Cubic


// M4A/AAC (MP4 container): ftypM4A or ftypmp42 or similar
// Looking for 'ftyp' at offset 4
if (buffer.length >= 8 && buffer.toString('utf8', 4, 8) === 'ftyp') {
Copy link

@cubic-dev-ai cubic-dev-ai bot Feb 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2: The ftyp check is too permissive for audio validation. ftyp is present in any MP4 container (including video), so this accepts video MP4s as audio/mp4, undermining the audio-only validation. Consider validating the MP4 brand/compatible brands or inspecting tracks to ensure the container is audio-only.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At server/utils.ts, line 85:

<comment>The `ftyp` check is too permissive for audio validation. `ftyp` is present in any MP4 container (including video), so this accepts video MP4s as `audio/mp4`, undermining the audio-only validation. Consider validating the MP4 brand/compatible brands or inspecting tracks to ensure the container is audio-only.</comment>

<file context>
@@ -40,3 +40,51 @@ export function sanitizeLog(data: any): any {
+
+  // M4A/AAC (MP4 container): ftypM4A or ftypmp42 or similar
+  // Looking for 'ftyp' at offset 4
+  if (buffer.length >= 8 && buffer.toString('utf8', 4, 8) === 'ftyp') {
+    return 'audio/mp4'; // Covers m4a, mp4 audio
+  }
</file context>
Fix with Cubic

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
server/routes.ts (1)

1151-1164: ⚠️ Potential issue | 🟠 Major

detectedFormat is computed but then discarded — the data URL is still built from client-supplied file.mimetype.

The magic-bytes check only acts as a boolean gate (if (!detectedFormat)). When validation passes, file.mimetype — which is client-controlled and may have already been spoofed past the multer MIME filter — is used verbatim in the data URL at line 1164. This defeats the primary purpose of the detection: ensuring the downstream service receives a data URL whose MIME type actually matches the file content.

Replace file.mimetype with detectedFormat when constructing the data URL:

🛡️ Proposed fix
       const detectedFormat = detectAudioFormat(file.buffer);
       if (!detectedFormat) {
-        console.error(`Invalid file format detected. Mime: ${file.mimetype}`);
+        console.warn("Invalid audio file format detected", { mimetype: file.mimetype });
         return res.status(400).json({ message: "Invalid audio file format. Please upload a valid audio file." });
       }

       const { prompt, duration } = req.body;
       ...

       const base64 = file.buffer.toString("base64");
-      const dataUrl = `data:${file.mimetype};base64,${base64}`;
+      const dataUrl = `data:${detectedFormat};base64,${base64}`;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/routes.ts` around lines 1151 - 1164, detectedFormat from
detectAudioFormat is validated but then ignored; replace use of the
client-supplied file.mimetype when building the data URL so the MIME reflects
the detected format. Update the construction of dataUrl to use detectedFormat
(the value returned by detectAudioFormat) instead of file.mimetype so downstream
consumers receive an accurate MIME type; ensure any subsequent logic that reads
dataUrl still works with the new MIME string and keep base64 encoding of
file.buffer unchanged.
🧹 Nitpick comments (4)
server/utils.ts (2)

44-44: Add JSDoc to this public exported function.

detectAudioFormat is part of the module's public surface but has no JSDoc, which violates the "Add JSDoc comments for public APIs" guideline.

📝 Suggested addition
+/**
+ * Detects the audio format of a file buffer by inspecting its magic bytes.
+ * `@param` buffer - Raw file bytes to inspect (minimum 4 bytes required).
+ * `@returns` MIME type string (e.g. `'audio/wav'`) if a known format is detected, or `null`.
+ */
 export function detectAudioFormat(buffer: Buffer): string | null {

As per coding guidelines, "Add JSDoc comments for public APIs".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/utils.ts` at line 44, Add a JSDoc block for the exported function
detectAudioFormat describing its purpose, parameters and return value: document
that it accepts a Node.js Buffer (parameter name buffer) containing audio data
and that it returns a string with the detected audio format or null if unknown;
include `@param` {Buffer} buffer and `@returns` {string|null} annotations and a
short one-line description (optionally an example or notes about supported
formats) so the public API is properly documented.

48-87: Inconsistent byte-inspection style across the function.

Three different approaches are used to read the same underlying bytes:

  • buffer.toString('hex', …) for WAV and MP3-ID3 (lines 49–50, 55)
  • buffer[n] direct byte access for AAC and MP3 frame sync (lines 63–64, 69)
  • buffer.toString('utf8', …) for OGG, FLAC, and M4A (lines 74, 79, 85)

Direct byte comparisons (buffer[n]) are both the fastest and the most readable for this use-case. Standardising on them would also make it easier to compare with the existing client.ts implementation.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/utils.ts` around lines 48 - 87, The detection code mixes
buffer.toString() and buffer[n] checks; standardize to direct byte comparisons
on the same buffer variable: replace the WAV check (currently using
buffer.toString('hex', 0, 4) and buffer.toString('hex', 8, 12)) with explicit
byte checks for 'R','I','F','F' (0x52,0x49,0x46,0x46) and 'W','A','V','E'
(0x57,0x41,0x56,0x45); replace the ID3 MP3 check (buffer.toString('hex', 0, 3)
=== '494433') with checks for 'I','D','3' bytes; replace OggS, fLaC and the
'ftyp' offset-4 MP4/M4A check (currently using buffer.toString('utf8', ...))
with byte comparisons for 'O','g','g','S', 'f','L','a','C', and 'f','t','y','p'
at offset 4 respectively; keep the existing AAC and MP3 frame-sync logic that
already uses buffer[n]; ensure bounds checks (buffer.length) remain before
accessing indexes.
server/utils.test.ts (2)

62-108: Missing test case for the MP3 frame-sync detection path (lines 69–71 of utils.ts).

The existing MP3 test covers only the ID3 header path (buffer[0-2] = 0x49 0x44 0x33). The second MP3 branch — raw MPEG frame sync (0xFF 0xFB etc.) — has no coverage. A typical MPEG1 Layer III frame starts with 0xFF 0xFB or 0xFF 0xFA; these should be verified to return 'audio/mpeg' and should also confirm the AAC gate doesn't mis-classify them.

➕ Suggested additional test cases
+  {
+    name: "Detect MP3 (frame sync, MPEG1 Layer III)",
+    input: Buffer.from("FFFB9400", "hex"), // 0xFF 0xFB: MPEG1, Layer III, no CRC
+    expected: "audio/mpeg"
+  },
+  {
+    name: "Detect MP3 (frame sync, MPEG2 Layer III)",
+    input: Buffer.from("FFF39400", "hex"), // 0xFF 0xF3: MPEG2, Layer III, no CRC
+    expected: "audio/mpeg"
+  },

As per coding guidelines, "Test utility functions and data transformations".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/utils.test.ts` around lines 62 - 108, Add test cases to cover the MP3
frame-sync branch in utils.ts (lines 69–71) by extending the existing
audioTestCases: include at least two inputs whose first bytes are 0xFF 0xFB and
0xFF 0xFA (typical MPEG1 Layer III frame headers) and assert the detection
returns "audio/mpeg"; place these cases alongside the other entries in
audioTestCases and ensure they also verify the AAC detection path does not
mis-classify these buffers (i.e., expected remains "audio/mpeg").

73-82: OGG/FLAC test buffers use utf8 encoding — the trailing bytes are ASCII digit characters, not binary zeros.

Buffer.from("OggS00020000000000000000", "utf8") produces bytes where the '0' and '2' characters are 0x30 and 0x32 (ASCII), not 0x00/0x02. The tests still pass since only the first 4 bytes are checked, but the intent looks like real file header data. Use hex encoding for clarity.

♻️ Clearer test data construction
-    input: Buffer.from("OggS00020000000000000000", "utf8"),
+    input: Buffer.from("4f67675300020000000000000000000000000000", "hex"), // OggS + real OGG capture page header bytes

-    input: Buffer.from("fLaC00000022", "utf8"),
+    input: Buffer.from("664c614300000022", "hex"), // fLaC + STREAMINFO block header
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/utils.test.ts` around lines 73 - 82, The test buffers for "Detect OGG"
and "Detect FLAC" are constructed with utf8, so characters like '0' and '2'
become ASCII bytes instead of the intended binary zeros and 0x02; update the
Buffer.from calls in server/utils.test.ts for the "Detect OGG" and "Detect FLAC"
cases to use hex-encoded literals representing the real file headers (e.g., OggS
-> hex "4f676753000200000000000000" and fLaC -> hex "664c614300000022") and call
Buffer.from(..., "hex") so the tests use actual binary header bytes.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@server/routes.ts`:
- Line 1154: Replace the console.error call that logs user-supplied
file.mimetype with a warning-level log and sanitize the value first: change the
call from console.error(...) to a warning (e.g., console.warn(...) or your
structured logger's warn method) and pass sanitizeLog(file.mimetype) instead of
file.mimetype so the output uses sanitized user data; keep the descriptive
message (e.g., "Invalid file format detected. Mime:") and concatenate or
template it with the sanitized value.

In `@server/utils.ts`:
- Around line 44-89: There are two divergent detectAudioFormat implementations
(this one and the one in the audio client module); consolidate by extracting a
single canonical utility, update both callers to import it, and ensure the
canonical function returns MIME types and covers all formats needed (WAV, MP3,
AAC/ADTS, MP4/M4A, OGG, FLAC, WebM if required); specifically, move logic from
this detectAudioFormat into a shared utility (e.g., shared audio detection
helper), reconcile minimum buffer checks (use safe length checks like >=12 where
needed), preserve/merge checks from the other implementation (frame sync, ADTS,
'ftyp' at offset 4, 'OggS', 'fLaC'), export the unified detectAudioFormat and
replace both local implementations with imports.
- Around line 83-87: The current detection logic that checks
buffer.toString('utf8', 4, 8) === 'ftyp' is too broad and will match video
MP4/MOV; update the check inside the MIME-detection routine (the branch that
currently returns 'audio/mp4') to also verify the brand bytes at offset 8–12 and
only treat as audio when the brand is an audio-specific brand (e.g., 'M4A ',
'M4B ', 'm4a ', 'mp42' as appropriate); specifically, update the condition
around the buffer.toString(...) ftyp check and add a secondary
buffer.toString('utf8', 8, 12) in the same function to match allowed audio
brands before returning 'audio/mp4'.

---

Outside diff comments:
In `@server/routes.ts`:
- Around line 1151-1164: detectedFormat from detectAudioFormat is validated but
then ignored; replace use of the client-supplied file.mimetype when building the
data URL so the MIME reflects the detected format. Update the construction of
dataUrl to use detectedFormat (the value returned by detectAudioFormat) instead
of file.mimetype so downstream consumers receive an accurate MIME type; ensure
any subsequent logic that reads dataUrl still works with the new MIME string and
keep base64 encoding of file.buffer unchanged.

---

Nitpick comments:
In `@server/utils.test.ts`:
- Around line 62-108: Add test cases to cover the MP3 frame-sync branch in
utils.ts (lines 69–71) by extending the existing audioTestCases: include at
least two inputs whose first bytes are 0xFF 0xFB and 0xFF 0xFA (typical MPEG1
Layer III frame headers) and assert the detection returns "audio/mpeg"; place
these cases alongside the other entries in audioTestCases and ensure they also
verify the AAC detection path does not mis-classify these buffers (i.e.,
expected remains "audio/mpeg").
- Around line 73-82: The test buffers for "Detect OGG" and "Detect FLAC" are
constructed with utf8, so characters like '0' and '2' become ASCII bytes instead
of the intended binary zeros and 0x02; update the Buffer.from calls in
server/utils.test.ts for the "Detect OGG" and "Detect FLAC" cases to use
hex-encoded literals representing the real file headers (e.g., OggS -> hex
"4f676753000200000000000000" and fLaC -> hex "664c614300000022") and call
Buffer.from(..., "hex") so the tests use actual binary header bytes.

In `@server/utils.ts`:
- Line 44: Add a JSDoc block for the exported function detectAudioFormat
describing its purpose, parameters and return value: document that it accepts a
Node.js Buffer (parameter name buffer) containing audio data and that it returns
a string with the detected audio format or null if unknown; include `@param`
{Buffer} buffer and `@returns` {string|null} annotations and a short one-line
description (optionally an example or notes about supported formats) so the
public API is properly documented.
- Around line 48-87: The detection code mixes buffer.toString() and buffer[n]
checks; standardize to direct byte comparisons on the same buffer variable:
replace the WAV check (currently using buffer.toString('hex', 0, 4) and
buffer.toString('hex', 8, 12)) with explicit byte checks for 'R','I','F','F'
(0x52,0x49,0x46,0x46) and 'W','A','V','E' (0x57,0x41,0x56,0x45); replace the ID3
MP3 check (buffer.toString('hex', 0, 3) === '494433') with checks for
'I','D','3' bytes; replace OggS, fLaC and the 'ftyp' offset-4 MP4/M4A check
(currently using buffer.toString('utf8', ...)) with byte comparisons for
'O','g','g','S', 'f','L','a','C', and 'f','t','y','p' at offset 4 respectively;
keep the existing AAC and MP3 frame-sync logic that already uses buffer[n];
ensure bounds checks (buffer.length) remain before accessing indexes.

// Sentinel: Validate file content using magic bytes
const detectedFormat = detectAudioFormat(file.buffer);
if (!detectedFormat) {
console.error(`Invalid file format detected. Mime: ${file.mimetype}`);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

console.error inappropriate for a validation failure; log user-controlled data through sanitizeLog.

Two issues on this line:

  1. A 400 validation rejection is not an unexpected error — console.error is the wrong level. Use console.warn (or a structured info-level log).
  2. file.mimetype is client-supplied and logged directly. Per guidelines, user-controlled data should pass through sanitizeLog() before logging.
📝 Proposed fix
-        console.error(`Invalid file format detected. Mime: ${file.mimetype}`);
+        console.warn("Rejected upload: unrecognised audio format", sanitizeLog({ mimetype: file.mimetype }));

As per coding guidelines, "Use sanitizeLog() before logging sensitive data".

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
console.error(`Invalid file format detected. Mime: ${file.mimetype}`);
console.warn("Rejected upload: unrecognised audio format", sanitizeLog({ mimetype: file.mimetype }));
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/routes.ts` at line 1154, Replace the console.error call that logs
user-supplied file.mimetype with a warning-level log and sanitize the value
first: change the call from console.error(...) to a warning (e.g.,
console.warn(...) or your structured logger's warn method) and pass
sanitizeLog(file.mimetype) instead of file.mimetype so the output uses sanitized
user data; keep the descriptive message (e.g., "Invalid file format detected.
Mime:") and concatenate or template it with the sanitized value.

Comment on lines +44 to +89
export function detectAudioFormat(buffer: Buffer): string | null {
if (!buffer || buffer.length < 4) return null;

// WAV: RIFF .... WAVE
if (buffer.length >= 12 &&
buffer.toString('hex', 0, 4) === '52494646' &&
buffer.toString('hex', 8, 12) === '57415645') {
return 'audio/wav';
}

// MP3: ID3
if (buffer.toString('hex', 0, 3) === '494433') {
return 'audio/mpeg';
}

// AAC: ADTS Header
// Sync (12 bits) means first 4 bits of byte 1 are 1.
// Layer (bits 1,2 of byte 1) is 00.
// Mask 0xF6 (1111 0110) should result in 0xF0 (1111 0000)
if (buffer[0] === 0xFF && (buffer[1] & 0xF6) === 0xF0) {
return 'audio/aac';
}

// MP3: Frame Sync (11 bits) means first 3 bits of byte 1 are 1.
// Layer (bits 1,2 of byte 1) is NOT 00.
if (buffer[0] === 0xFF && (buffer[1] & 0xE0) === 0xE0 && (buffer[1] & 0x06) !== 0x00) {
return 'audio/mpeg';
}

// OGG: OggS
if (buffer.toString('utf8', 0, 4) === 'OggS') {
return 'audio/ogg';
}

// FLAC: fLaC
if (buffer.toString('utf8', 0, 4) === 'fLaC') {
return 'audio/flac';
}

// M4A/AAC (MP4 container): ftypM4A or ftypmp42 or similar
// Looking for 'ftyp' at offset 4
if (buffer.length >= 8 && buffer.toString('utf8', 4, 8) === 'ftyp') {
return 'audio/mp4'; // Covers m4a, mp4 audio
}

return null;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major

Duplicate implementation — consolidate with existing detectAudioFormat in server/replit_integrations/audio/client.ts.

A function with the same name and purpose already exists in server/replit_integrations/audio/client.ts (lines 19–46). The two implementations diverge in return type (string | null here vs. short format names there), minimum buffer length (4 vs. 12 bytes), and format coverage (FLAC/AAC here; WebM there). Having two parallel implementations will inevitably drift.

Consider extending the existing function (or extracting a shared utility in @shared/) to return the MIME type and cover the missing formats, then importing from one canonical location.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/utils.ts` around lines 44 - 89, There are two divergent
detectAudioFormat implementations (this one and the one in the audio client
module); consolidate by extracting a single canonical utility, update both
callers to import it, and ensure the canonical function returns MIME types and
covers all formats needed (WAV, MP3, AAC/ADTS, MP4/M4A, OGG, FLAC, WebM if
required); specifically, move logic from this detectAudioFormat into a shared
utility (e.g., shared audio detection helper), reconcile minimum buffer checks
(use safe length checks like >=12 where needed), preserve/merge checks from the
other implementation (frame sync, ADTS, 'ftyp' at offset 4, 'OggS', 'fLaC'),
export the unified detectAudioFormat and replace both local implementations with
imports.

Comment on lines +83 to +87
// M4A/AAC (MP4 container): ftypM4A or ftypmp42 or similar
// Looking for 'ftyp' at offset 4
if (buffer.length >= 8 && buffer.toString('utf8', 4, 8) === 'ftyp') {
return 'audio/mp4'; // Covers m4a, mp4 audio
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

ftyp box check at offset 4 matches any ISO BMFF container, including video MP4/MOV.

The check buffer.toString('utf8', 4, 8) === 'ftyp' will match a video .mp4 file equally, because the ftyp box is present in all ISO Base Media File Format containers regardless of whether they carry audio or video. To restrict to audio-only M4A, validate the brand bytes at offset 8–12 (e.g. M4A , M4B , mp42).

🛡️ Tighter brand check
-  // M4A/AAC (MP4 container): ftypM4A or ftypmp42 or similar
-  // Looking for 'ftyp' at offset 4
-  if (buffer.length >= 8 && buffer.toString('utf8', 4, 8) === 'ftyp') {
-    return 'audio/mp4'; // Covers m4a, mp4 audio
-  }
+  // M4A/AAC (MP4 container): check ftyp box and audio-only brands
+  const AUDIO_MP4_BRANDS = ['M4A ', 'M4B ', 'mp42', 'isom', 'MSNV'];
+  if (buffer.length >= 12 &&
+      buffer.toString('utf8', 4, 8) === 'ftyp' &&
+      AUDIO_MP4_BRANDS.includes(buffer.toString('utf8', 8, 12))) {
+    return 'audio/mp4';
+  }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/utils.ts` around lines 83 - 87, The current detection logic that
checks buffer.toString('utf8', 4, 8) === 'ftyp' is too broad and will match
video MP4/MOV; update the check inside the MIME-detection routine (the branch
that currently returns 'audio/mp4') to also verify the brand bytes at offset
8–12 and only treat as audio when the brand is an audio-specific brand (e.g.,
'M4A ', 'M4B ', 'm4a ', 'mp42' as appropriate); specifically, update the
condition around the buffer.toString(...) ftyp check and add a secondary
buffer.toString('utf8', 8, 12) in the same function to match allowed audio
brands before returning 'audio/mp4'.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants