Skip to content

🛡️ Sentinel: [HIGH] Fix file upload validation bypass#48

Open
Krosebrook wants to merge 8 commits intomainfrom
sentinel-fix-file-upload-bypass-2628575097628889752
Open

🛡️ Sentinel: [HIGH] Fix file upload validation bypass#48
Krosebrook wants to merge 8 commits intomainfrom
sentinel-fix-file-upload-bypass-2628575097628889752

Conversation

@Krosebrook
Copy link
Owner

@Krosebrook Krosebrook commented Feb 14, 2026

🚨 Severity: HIGH
💡 Vulnerability: The /api/audio/generate-with-reference endpoint relied solely on the client-provided MIME type (via Multer) to validate uploaded files. This allowed users to bypass validation by renaming malicious files (e.g., scripts, executables) with audio extensions.
🎯 Impact: Attackers could upload arbitrary files disguised as audio. While immediate RCE is mitigated by base64 encoding and external processing, this is a dangerous validation gap that could be exploited if file processing logic changes (e.g., local FFmpeg execution).
🔧 Fix: Implemented verifyAudioFileSignature in server/utils.ts to check the magic bytes (file signature) of uploaded files against known audio formats (MP3, WAV, OGG, FLAC, AAC/M4A). Updated the route handler to reject files with invalid signatures.
✅ Verification: Added comprehensive unit tests in server/utils.test.ts covering valid and invalid file signatures. Verified that valid audio files pass and invalid files (including text and executables) fail.


PR created automatically by Jules for task 2628575097628889752 started by @Krosebrook


Summary by cubic

Hardened /api/audio/generate-with-reference with magic-byte validation and sanitized security logging. Blocks non-audio files disguised as audio and prevents log injection in failure logs.

  • Bug Fixes
    • Added verifyAudioFileSignature for MP3, WAV, OGG, FLAC, AAC (ADTS), and M4A; tightened ADTS and M4A brand checks.
    • Enforced signature check in the route; invalid files return 400; logs use sanitizeLog to strip newlines.
    • Expanded tests for text, EXE, UTF-16 BOM, generic MP4, and log injection; refactored magic bytes into named constants.

Written for commit 0b53a50. Summary will update on new commits.

Summary by CodeRabbit

  • New Features

    • Added file-signature validation for audio uploads in the reference-based generation flow; invalid files are rejected with a clear 400 error message.
    • Recognizes common audio formats: MP3, WAV, OGG, FLAC, and AAC.
  • Bug Fixes

    • Improved log sanitization to strip newline characters and prevent log injection.
  • Tests

    • Added comprehensive tests for audio signature verification and log sanitization.

🚨 Severity: HIGH
💡 Vulnerability: The /api/audio/generate-with-reference endpoint relied solely on the client-provided MIME type (via Multer) to validate uploaded files. This allowed users to bypass validation by renaming malicious files (e.g., scripts, executables) with audio extensions.
🎯 Impact: Attackers could upload arbitrary files disguised as audio. While immediate RCE is mitigated by base64 encoding and external processing, this is a dangerous validation gap that could be exploited if file processing logic changes (e.g., local FFmpeg execution).
🔧 Fix: Implemented verifyAudioFileSignature in server/utils.ts to check the magic bytes (file signature) of uploaded files against known audio formats (MP3, WAV, OGG, FLAC, AAC/M4A). Updated the route handler to reject files with invalid signatures.
✅ Verification: Added comprehensive unit tests in server/utils.test.ts covering valid and invalid file signatures. Verified that valid audio files pass and invalid files (including text and executables) fail.

Co-authored-by: Krosebrook <214532761+Krosebrook@users.noreply.github.com>
@google-labs-jules
Copy link
Contributor

👋 Jules, reporting for duty! I'm here to lend a hand with this pull request.

When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down.

I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job!

For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with @jules. You can find this option in the Pull Request section of your global Jules UI settings. You can always switch back!

New to Jules? Learn more at jules.google/docs.


For security, I will only act on instructions from the user who triggered this task.

Copilot AI review requested due to automatic review settings February 14, 2026 05:52
@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello @Krosebrook, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a critical security enhancement by implementing server-side validation of audio file signatures. Previously, the system relied on client-provided MIME types, which could be easily manipulated to upload malicious files disguised as audio. The new validation mechanism inspects the actual file content, significantly reducing the risk of arbitrary file uploads and potential exploitation, thereby bolstering the application's security posture.

Highlights

  • Security Fix: Addressed a high-severity file upload validation bypass vulnerability in the /api/audio/generate-with-reference endpoint.
  • File Signature Validation: Implemented robust validation using magic bytes (file signatures) for uploaded audio files, supporting MP3, WAV, OGG, FLAC, and AAC/M4A formats.
  • Comprehensive Testing: Included new unit tests to verify the correctness and effectiveness of the audio file signature validation logic, covering various valid and invalid file types.
Changelog
  • server/routes.ts
    • Imported the new verifyAudioFileSignature utility function.
    • Integrated a check within the /api/audio/generate-with-reference endpoint to validate uploaded audio files using their magic bytes, rejecting those with invalid signatures.
  • server/utils.test.ts
    • Imported verifyAudioFileSignature for testing.
    • Renamed the existing testCases array to sanitizationTestCases for clarity.
    • Added a new set of signatureTestCases to thoroughly test the verifyAudioFileSignature function with various valid audio headers and invalid file types (e.g., text, executables).
  • server/utils.ts
    • Added a new exported function verifyAudioFileSignature.
    • The function checks the initial bytes (magic bytes) of a given buffer against known signatures for MP3, WAV, OGG, FLAC, and AAC/M4A audio formats.
    • Includes logic to handle different MP3 header types (ID3v2 and Sync Frame) and AAC formats (ftyp and ADTS).
Activity
  • PR created automatically by Jules for task 2628575097628889752, initiated by @Krosebrook.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@coderabbitai
Copy link

coderabbitai bot commented Feb 14, 2026

📝 Walkthrough

Walkthrough

Added audio file signature validation: exported verifyAudioFileSignature(buffer: Buffer): boolean in server/utils.ts checks magic bytes for common audio formats; POST /api/audio/generate-with-reference now calls it and returns HTTP 400 with "Invalid file signature. Please upload a valid audio file." if validation fails.

Changes

Cohort / File(s) Summary
Audio signature + sanitization utils
server/utils.ts
Added exported verifyAudioFileSignature(buffer: Buffer): boolean that checks MP3 (ID3v2 or sync), WAV (RIFF/WAVE), OGG (OggS), FLAC (fLaC), AAC/M4A (ftyp brand at offset 4), and AAC ADTS (0xFFF sync). sanitizeLog now strips newline characters from string values to prevent log injection.
Route validation integration
server/routes.ts
Imported and invoked verifyAudioFileSignature(file.buffer) in POST /api/audio/generate-with-reference; on false, logs a sanitized payload and responds 400 with a fixed error message before further processing.
Tests expanded
server/utils.test.ts
Renamed test variables, added sanitization case for newline injection, introduced a signature test suite with multiple valid/invalid byte patterns, and adjusted test output formatting.

Sequence Diagram(s)

sequenceDiagram
  participant Client as Client
  participant Server as Server
  participant Utils as Utils
  participant Processor as Processor

  Client->>Server: POST /api/audio/generate-with-reference (multipart file)
  Server->>Utils: verifyAudioFileSignature(file.buffer)
  alt signature valid
    Utils-->>Server: true
    Server->>Processor: proceed with audio processing
    Processor-->>Server: processing result
    Server-->>Client: 200 + result
  else signature invalid
    Utils-->>Server: false
    Server-->>Server: sanitizeLog({ userId, fileSize, mimeType, originalName })
    Server-->>Client: 400 "Invalid file signature. Please upload a valid audio file."
  end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Possibly related PRs

Poem

🐰 I hop and sniff the bytes so bright,
Headers tell me wrong from right—
I guard the logs, I scrub the line,
I nod for WAV and MP3 fine.
Hooray, the upload path is tight! 🎶

🚥 Pre-merge checks | ✅ 3 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title uses an emoji and a generic prefix format [HIGH] that adds noise, but it accurately describes the main security fix: adding file upload validation to prevent bypass attacks.
Merge Conflict Detection ✅ Passed ✅ No merge conflicts detected when merging into main

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch sentinel-fix-file-upload-bypass-2628575097628889752

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request effectively addresses a critical security vulnerability by implementing file signature validation for audio uploads, preventing malicious file uploads disguised as audio. The new verifyAudioFileSignature function is well-implemented and accompanied by a comprehensive set of unit tests that cover valid, invalid, and edge cases. The integration into the route handler is correct. I have one suggestion to improve the maintainability of the new validation function by refactoring the magic numbers into named constants. Overall, this is an excellent and important security enhancement.

Comment on lines 48 to 98
export function verifyAudioFileSignature(buffer: Buffer): boolean {
if (!buffer || buffer.length < 12) return false;

const header = buffer.subarray(0, 12);

// MP3: ID3v2 (49 44 33) or Sync Frame (FF FB / FF F3 / FF F2)
// Note: Sync frame usually starts at byte 0 for raw streams, but ID3 tag is common.
if (
(header[0] === 0x49 && header[1] === 0x44 && header[2] === 0x33) ||
(header[0] === 0xff && (header[1] === 0xfb || header[1] === 0xf3 || header[1] === 0xf2))
) {
return true;
}

// WAV: RIFF (52 49 46 46) ... WAVE (57 41 56 45)
if (
header[0] === 0x52 &&
header[1] === 0x49 &&
header[2] === 0x46 &&
header[3] === 0x46 &&
header[8] === 0x57 &&
header[9] === 0x41 &&
header[10] === 0x56 &&
header[11] === 0x45
) {
return true;
}

// OGG: OggS (4F 67 67 53)
if (header[0] === 0x4f && header[1] === 0x67 && header[2] === 0x67 && header[3] === 0x53) {
return true;
}

// FLAC: fLaC (66 4C 61 43)
if (header[0] === 0x66 && header[1] === 0x4c && header[2] === 0x61 && header[3] === 0x43) {
return true;
}

// AAC/M4A: ftyp (66 74 79 70) usually at offset 4
if (header[4] === 0x66 && header[5] === 0x74 && header[6] === 0x79 && header[7] === 0x70) {
return true;
}

// AAC ADTS: FF F1 (MPEG-4) or FF F9 (MPEG-2)
// Sync word is 12 bits of 1s (0xFFF)
if (header[0] === 0xff && (header[1] & 0xf0) === 0xf0) {
return true;
}

return false;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The function verifyAudioFileSignature is well-structured and covers many important audio formats. However, it uses many "magic numbers" directly in the if conditions, which can make the code harder to read and maintain.

To improve clarity and maintainability, I suggest refactoring this function to use named constants for the magic byte sequences and Buffer.equals() for comparisons. This makes the purpose of each check more explicit and the code more idiomatic. The SIGNATURES object is defined inside the function for a self-contained suggestion, but could be moved to the module scope for a minor performance gain.

export function verifyAudioFileSignature(buffer: Buffer): boolean {
  if (!buffer || buffer.length < 12) return false;

  const header = buffer.subarray(0, 12);

  const SIGNATURES = {
    ID3: Buffer.from([0x49, 0x44, 0x33]),
    RIFF: Buffer.from([0x52, 0x49, 0x46, 0x46]),
    WAVE: Buffer.from([0x57, 0x41, 0x56, 0x45]),
    OGG: Buffer.from([0x4F, 0x67, 0x67, 0x53]),
    FLAC: Buffer.from([0x66, 0x4C, 0x61, 0x43]),
    FTYP: Buffer.from([0x66, 0x74, 0x79, 0x70]),
  };

  // MP3: ID3v2 tag or Sync Frame. A more general check for MPEG frames is used.
  const isMp3 =
    header.subarray(0, 3).equals(SIGNATURES.ID3) ||
    (header[0] === 0xff && (header[1] & 0xe0) === 0xe0);

  // WAV: RIFF header with WAVE format
  const isWav =
    header.subarray(0, 4).equals(SIGNATURES.RIFF) &&
    header.subarray(8, 12).equals(SIGNATURES.WAVE);

  // OGG: OggS capture pattern
  const isOgg = header.subarray(0, 4).equals(SIGNATURES.OGG);

  // FLAC: fLaC marker
  const isFlac = header.subarray(0, 4).equals(SIGNATURES.FLAC);

  // M4A/MP4: ftyp box at offset 4
  const isM4a = header.subarray(4, 8).equals(SIGNATURES.FTYP);

  // AAC ADTS: Sync word (12 bits of 1s)
  const isAacAdts = header[0] === 0xff && (header[1] & 0xf0) === 0xf0;

  return isMp3 || isWav || isOgg || isFlac || isM4a || isAacAdts;
}

Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 42d8dd0f2d

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

server/utils.ts Outdated
Comment on lines 93 to 94
if (header[0] === 0xff && (header[1] & 0xf0) === 0xf0) {
return true;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Tighten ADTS validation to prevent trivial signature bypass

This condition accepts any payload starting with 0xFF followed by any byte whose high nibble is 0xF, which is much broader than a valid AAC ADTS header. A non-audio file can be prefixed with FF F0 and will pass verifyAudioFileSignature, so /api/audio/generate-with-reference can still accept forged uploads despite the new guard.

Useful? React with 👍 / 👎.

server/utils.ts Outdated
Comment on lines 87 to 88
if (header[4] === 0x66 && header[5] === 0x74 && header[6] === 0x79 && header[7] === 0x70) {
return true;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Restrict MP4 signature checks to audio-specific brands

Checking only for ftyp at offset 4 treats any ISO-BMFF file as valid audio, including ordinary MP4 video files and other non-audio container types that share this marker. Because this branch returns true without verifying an audio brand (for example M4A ) or other audio-specific fields, non-audio uploads can still pass the new validation path.

Useful? React with 👍 / 👎.

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR addresses a HIGH severity security vulnerability in the /api/audio/generate-with-reference endpoint where file upload validation relied solely on client-provided MIME types, allowing attackers to bypass validation by renaming malicious files with audio extensions. The fix implements magic byte (file signature) verification to ensure uploaded files are genuinely audio files.

Changes:

  • Added verifyAudioFileSignature() function to validate file signatures for MP3, WAV, OGG, FLAC, and AAC/M4A formats
  • Integrated signature verification into the file upload endpoint before processing
  • Added comprehensive unit tests for valid/invalid file signatures

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 8 comments.

File Description
server/utils.ts Implements magic byte validation for audio file signatures (MP3, WAV, OGG, FLAC, M4A, AAC)
server/utils.test.ts Adds test coverage for file signature validation including valid audio formats and invalid files
server/routes.ts Integrates signature verification into the upload endpoint to reject files with invalid signatures

Comment on lines 44 to 46
/**
* Sentinel 🛡️: Validates audio file signatures to prevent malicious uploads.
* Checks for MP3, WAV, OGG, FLAC, and AAC/M4A magic bytes.
Copy link

Copilot AI Feb 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment uses "Sentinel 🛡️:" as a branded prefix, which violates the coding guidelines. According to the project guidelines, comments should not use branded prefixes. Remove the "Sentinel 🛡️:" prefix and use a simple descriptive comment instead, such as "Validates audio file signatures to prevent malicious uploads."

Copilot generated this review using guidance from repository custom instructions.
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot open a new pull request to apply changes based on this feedback

server/routes.ts Outdated
Comment on lines 1151 to 1152
// Sentinel: Verify file signature (magic bytes)
if (!verifyAudioFileSignature(file.buffer)) {
Copy link

Copilot AI Feb 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment uses "Sentinel:" as a branded prefix, which violates the coding guidelines. According to the project guidelines, comments should not use branded prefixes. Remove "Sentinel:" and simply say "Verify file signature (magic bytes)" or just use the existing comment from the function definition.

Copilot generated this review using guidance from repository custom instructions.
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot open a new pull request to apply changes based on this feedback

server/utils.ts Outdated
Comment on lines 53 to 57
// MP3: ID3v2 (49 44 33) or Sync Frame (FF FB / FF F3 / FF F2)
// Note: Sync frame usually starts at byte 0 for raw streams, but ID3 tag is common.
if (
(header[0] === 0x49 && header[1] === 0x44 && header[2] === 0x33) ||
(header[0] === 0xff && (header[1] === 0xfb || header[1] === 0xf3 || header[1] === 0xf2))
Copy link

Copilot AI Feb 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The MP3 sync frame check only validates 0xFF followed by 0xFB, 0xF3, or 0xF2. However, MP3 sync frames can also use 0xFA, 0xE2, and 0xE3 depending on the MPEG version and layer. While the most common patterns are covered, this could reject some valid MP3 files. Consider expanding the check to include all valid MP3 sync patterns: for MPEG-1 Layer III use 0xFF 0xFB/0xFA, for MPEG-2 Layer III use 0xFF 0xF3/0xF2, and for MPEG-2.5 Layer III use 0xFF 0xE3/0xE2. However, be cautious about being too permissive.

Suggested change
// MP3: ID3v2 (49 44 33) or Sync Frame (FF FB / FF F3 / FF F2)
// Note: Sync frame usually starts at byte 0 for raw streams, but ID3 tag is common.
if (
(header[0] === 0x49 && header[1] === 0x44 && header[2] === 0x33) ||
(header[0] === 0xff && (header[1] === 0xfb || header[1] === 0xf3 || header[1] === 0xf2))
// MP3: ID3v2 (49 44 33) or sync frame:
// - MPEG-1 Layer III: FF FB / FF FA
// - MPEG-2 Layer III: FF F3 / FF F2
// - MPEG-2.5 Layer III: FF E3 / FF E2
// Note: Sync frame usually starts at byte 0 for raw streams, but ID3 tag is common.
if (
(header[0] === 0x49 && header[1] === 0x44 && header[2] === 0x33) ||
(header[0] === 0xff &&
(header[1] === 0xfb ||
header[1] === 0xfa ||
header[1] === 0xf3 ||
header[1] === 0xf2 ||
header[1] === 0xe3 ||
header[1] === 0xe2))

Copilot uses AI. Check for mistakes.
server/routes.ts Outdated
Comment on lines 1151 to 1154
// Sentinel: Verify file signature (magic bytes)
if (!verifyAudioFileSignature(file.buffer)) {
return res.status(400).json({ message: "Invalid file signature. Please upload a valid audio file." });
}
Copy link

Copilot AI Feb 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Security-critical validation failures (like file signature mismatches) should be logged for security monitoring and incident response. Add logging when file signature verification fails, including sanitized request metadata (user ID, file size, claimed MIME type). This helps detect attack patterns and investigate security incidents. Use sanitizeLog to ensure no sensitive data is logged.

Copilot uses AI. Check for mistakes.
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot open a new pull request to apply changes based on this feedback

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have removed the 'Sentinel:' prefix from the comment.

server/utils.ts Outdated
Comment on lines 86 to 88
// AAC/M4A: ftyp (66 74 79 70) usually at offset 4
if (header[4] === 0x66 && header[5] === 0x74 && header[6] === 0x79 && header[7] === 0x70) {
return true;
Copy link

Copilot AI Feb 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The M4A validation accepts any file with 'ftyp' at offset 4, including generic MP4 video files (with brands like 'isom', 'mp41', 'mp42'). This is a security bypass that allows non-audio files to pass validation. M4A files should only accept specific audio brands (M4A, M4B, M4P) by checking bytes 8-11 after 'ftyp'. For example, reject 'isom' (69 73 6F 6D) and accept 'M4A ' (4D 34 41 20). According to stored memory about M4A file signature validation, proper validation should check the major brand at bytes 8-11 and only accept M4A, M4B, M4P brands while rejecting generic MP4 brands.

Suggested change
// AAC/M4A: ftyp (66 74 79 70) usually at offset 4
if (header[4] === 0x66 && header[5] === 0x74 && header[6] === 0x79 && header[7] === 0x70) {
return true;
// AAC/M4A: ISO BMFF with ftyp (66 74 79 70) at offset 4 and audio-specific major brand
if (header[4] === 0x66 && header[5] === 0x74 && header[6] === 0x79 && header[7] === 0x70) {
const majorBrand = header.toString("ascii", 8, 12);
const allowedM4ABrands = new Set(["M4A ", "M4B ", "M4P "]);
if (allowedM4ABrands.has(majorBrand)) {
return true;
}

Copilot uses AI. Check for mistakes.
server/utils.ts Outdated
Comment on lines 92 to 93
// Sync word is 12 bits of 1s (0xFFF)
if (header[0] === 0xff && (header[1] & 0xf0) === 0xf0) {
Copy link

Copilot AI Feb 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The AAC ADTS sync word detection (0xFFF in the first 12 bits) is too permissive and will match MP3 files with sync frames starting 0xFF 0xFx. This causes ambiguity - the MP3 check at line 57 handles 0xFF 0xFB/0xF3/0xF2, but this AAC check will also match those patterns and any other 0xFF 0xFx pattern. This overlap means some malicious files could be misidentified. Consider being more specific with AAC ADTS detection by checking additional bits beyond the sync word, or reordering checks to ensure MP3 patterns are handled first and don't fall through to AAC detection. The current implementation returns true on the first match, so the MP3 check does take precedence, but this creates fragile logic that's easy to break.

Suggested change
// Sync word is 12 bits of 1s (0xFFF)
if (header[0] === 0xff && (header[1] & 0xf0) === 0xf0) {
// Sync word is 12 bits of 1s (0xFFF) followed by ID/layer/protection bits.
// We check for the common header patterns to avoid overlap with MP3 sync frames.
if (header[0] === 0xff && (header[1] === 0xf1 || header[1] === 0xf9)) {

Copilot uses AI. Check for mistakes.
Comment on lines 62 to 118
const signatureTestCases = [
{
name: "Valid MP3 (ID3)",
input: Buffer.from([0x49, 0x44, 0x33, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00]),
expected: true,
},
{
name: "Valid MP3 (Sync Frame)",
input: Buffer.from([0xFF, 0xFB, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00]),
expected: true,
},
{
name: "Valid WAV",
input: Buffer.from([0x52, 0x49, 0x46, 0x46, 0x00, 0x00, 0x00, 0x00, 0x57, 0x41, 0x56, 0x45]),
expected: true,
},
{
name: "Valid OGG",
input: Buffer.from([0x4F, 0x67, 0x67, 0x53, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00]),
expected: true,
},
{
name: "Valid FLAC",
input: Buffer.from([0x66, 0x4C, 0x61, 0x43, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00]),
expected: true,
},
{
name: "Valid M4A (ftyp)",
input: Buffer.from([0x00, 0x00, 0x00, 0x1C, 0x66, 0x74, 0x79, 0x70, 0x4D, 0x34, 0x41, 0x20]),
expected: true,
},
{
name: "Valid AAC (ADTS)",
input: Buffer.from([0xFF, 0xF1, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00]),
expected: true,
},
{
name: "Invalid Text File",
input: Buffer.from("This is a text file content"),
expected: false,
},
{
name: "Invalid EXE (MZ header)",
input: Buffer.from([0x4D, 0x5A, 0x90, 0x00, 0x03, 0x00, 0x00, 0x00, 0x04, 0x00, 0x00, 0x00]),
expected: false,
},
{
name: "Empty Buffer",
input: Buffer.alloc(0),
expected: false,
},
{
name: "Short Buffer",
input: Buffer.from([0xFF, 0xFB]),
expected: false,
},
];
Copy link

Copilot AI Feb 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing test case for rejecting generic MP4 video files (e.g., files with 'ftyp' followed by 'isom' brand). This is critical for verifying the M4A validation doesn't allow MP4 videos to pass. Add a test with Buffer containing [0x00, 0x00, 0x00, 0x20, 0x66, 0x74, 0x79, 0x70, 0x69, 0x73, 0x6F, 0x6D] (ftyp + isom brand) and expect it to return false.

Copilot uses AI. Check for mistakes.
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot open a new pull request to apply changes based on this feedback

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have updated the tests to include UTF-16 LE BOM and generic MP4 video cases as requested.

server/utils.ts Outdated
}

// AAC ADTS: FF F1 (MPEG-4) or FF F9 (MPEG-2)
// Sync word is 12 bits of 1s (0xFFF)
Copy link

Copilot AI Feb 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment claims to check the sync word as "12 bits of 1s (0xFFF)", but the implementation checks (header[1] & 0xf0) === 0xf0, which only verifies the first 4 bits of the second byte are all 1s. This gives 8 bits from header[0] + 4 bits from header[1] = 12 bits total, which is correct. However, the comment could be clearer about what's actually being checked. Consider revising to: "Sync word is 0xFFF in first 12 bits (byte 0 = 0xFF, high nibble of byte 1 = 0xF)".

Suggested change
// Sync word is 12 bits of 1s (0xFFF)
// Sync word is 0xFFF in the first 12 bits (byte 0 = 0xFF, high nibble of byte 1 = 0xF)

Copilot uses AI. Check for mistakes.
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@server/utils.test.ts`:
- Around line 62-118: Add a test case to the signatureTestCases array to catch
UTF‑16 LE BOM false positives (bytes 0xFF,0xFE) so the ADTS matcher is
exercised; create an entry named like "UTF-16 LE BOM" with input
Buffer.from([0xFF, 0xFE, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00]) (or a short UTF‑16 LE sample starting with 0xFF,0xFE) and expected:
false, ensuring this new case lives in the signatureTestCases array used to
validate server/utils.ts's ADTS/media signature detection.

In `@server/utils.ts`:
- Around line 91-95: The ADTS sync check is too permissive: tighten the
validation in the block that inspects header[0]/header[1] (the ADTS detection
code using header) so it not only ensures the 12-bit sync (header[0] === 0xFF
and top nibble of header[1] are 0xF) but also enforces the ADTS layer bits are
zero (check the layer mask, e.g. require (header[1] & 0x06) === 0x00 in addition
to the sync mask or explicitly test header[1] against the four valid values
0xF0, 0xF1, 0xF8, 0xF9); also remove the branded prefix "Sentinel 🛡️" from the
JSDoc comment above the related utility so the comment is a plain descriptive
JSDoc.
🧹 Nitpick comments (2)
server/utils.ts (1)

44-46: Remove branded prefix from JSDoc comment.

The comment uses "Sentinel 🛡️:" which violates the guideline to use simple descriptive comments without branded prefixes. As per coding guidelines, "Use simple descriptive comments without branded prefixes (avoid 'Tool:', 'AI:', etc.)".

Proposed fix
-/**
- * Sentinel 🛡️: Validates audio file signatures to prevent malicious uploads.
- * Checks for MP3, WAV, OGG, FLAC, and AAC/M4A magic bytes.
- */
+/**
+ * Validates audio file signatures to prevent malicious uploads.
+ * Checks for MP3, WAV, OGG, FLAC, and AAC/M4A magic bytes.
+ */
server/routes.ts (1)

1151-1154: Good placement of the signature validation. The check correctly runs after Multer processes the buffer but before any further processing. Minor nit: the "Sentinel:" branded prefix in the comment on line 1151 should be removed per coding guidelines.

As per coding guidelines, "Use simple descriptive comments without branded prefixes (avoid 'Tool:', 'AI:', etc.)".

Proposed fix
-      // Sentinel: Verify file signature (magic bytes)
+      // Verify file signature (magic bytes)

Copy link

@cubic-dev-ai cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 issues found across 3 files

Prompt for AI agents (all issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="server/utils.ts">

<violation number="1" location="server/utils.ts:87">
P2: The AAC/M4A check only validates the `ftyp` marker, which is shared by all ISO BMFF containers (MP4 video included). To prevent non-audio files from passing, also verify the major brand at bytes 8–11 (e.g., `M4A ` / `M4B ` / `M4P `) before returning true.</violation>

<violation number="2" location="server/utils.ts:93">
P1: The ADTS validation check is too permissive and can be trivially bypassed. The current condition only verifies the 12-bit sync word (0xFFF) but doesn't validate other required ADTS header fields. An attacker could prefix any malicious file with `FF F0` bytes and it would pass validation. Consider adding additional checks for ADTS header structure (e.g., layer bits, profile bits) to strengthen the validation against forged uploads.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

@Krosebrook
Copy link
Owner Author

@copilot open a new pull request to apply changes based on the comments in this thread

Copy link
Contributor

Copilot AI commented Feb 14, 2026

@Krosebrook I've opened a new pull request, #51, to work on those changes. Once the pull request is ready, I'll request review from you.

Copilot AI and others added 6 commits February 14, 2026 23:34
Co-authored-by: Krosebrook <214532761+Krosebrook@users.noreply.github.com>
Co-authored-by: Krosebrook <214532761+Krosebrook@users.noreply.github.com>
🚨 Severity: HIGH
💡 Vulnerability: The previous audio file signature validation was too permissive, potentially allowing non-audio files (like UTF-16 BOM or generic MP4 videos) to bypass the check.
🎯 Impact: Attackers could upload arbitrary files disguised as audio.
🔧 Fix:
- Tightened ADTS validation to verify layer bits, preventing UTF-16 BOM false positives.
- Tightened M4A validation to verify audio-specific brands (M4A, M4B, M4P), rejecting generic MP4 video containers.
- Refactored magic bytes into named constants for readability.
- Removed branded prefixes from comments as requested.
✅ Verification: Added new test cases for UTF-16 LE BOM and generic MP4 files to `server/utils.test.ts`, confirming they are now correctly rejected. Verified that valid audio files still pass.

Co-authored-by: Krosebrook <214532761+Krosebrook@users.noreply.github.com>
Co-authored-by: Krosebrook <214532761+Krosebrook@users.noreply.github.com>
Harden file signature validation and improve security logging
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
server/utils.ts (1)

47-101: Solid magic-byte validation with proper ADTS mask fix.

The implementation correctly covers the common audio formats and the ADTS mask (0xF6) now properly constrains to the 4 valid ADTS second-byte values. The explicit Layer III sync-frame bytes for MP3 avoid false positives (e.g., UTF-16 LE BOM FF FE).

Two optional refinements:

  1. Move SIGNATURES to module scope — the object is re-allocated on every call. Since Buffer.from(...) with static data is deterministic, hoisting it avoids per-call allocation on a hot path.

  2. Add @param / @returns to the JSDoc to fully satisfy the public API documentation guideline.

♻️ Hoist constants & complete JSDoc
+/** Magic byte signatures for audio format detection */
+const AUDIO_SIGNATURES = {
+  ID3: Buffer.from([0x49, 0x44, 0x33]),
+  RIFF: Buffer.from([0x52, 0x49, 0x46, 0x46]),
+  WAVE: Buffer.from([0x57, 0x41, 0x56, 0x45]),
+  OGG: Buffer.from([0x4F, 0x67, 0x67, 0x53]),
+  FLAC: Buffer.from([0x66, 0x4C, 0x61, 0x43]),
+  FTYP: Buffer.from([0x66, 0x74, 0x79, 0x70]),
+} as const;
+
 /**
  * Validates audio file signatures to prevent malicious uploads.
  * Checks for MP3, WAV, OGG, FLAC, and AAC/M4A magic bytes.
+ * `@param` buffer - The file buffer to validate
+ * `@returns` true if the buffer starts with a recognized audio signature
  */
 export function verifyAudioFileSignature(buffer: Buffer): boolean {
   if (!buffer || buffer.length < 12) return false;
 
   const header = buffer.subarray(0, 12);
-
-  // Magic bytes constants
-  const SIGNATURES = {
-    ID3: Buffer.from([0x49, 0x44, 0x33]),
-    RIFF: Buffer.from([0x52, 0x49, 0x46, 0x46]),
-    WAVE: Buffer.from([0x57, 0x41, 0x56, 0x45]),
-    OGG: Buffer.from([0x4F, 0x67, 0x67, 0x53]),
-    FLAC: Buffer.from([0x66, 0x4C, 0x61, 0x43]),
-    FTYP: Buffer.from([0x66, 0x74, 0x79, 0x70]),
-  };

Then replace all SIGNATURES. references with AUDIO_SIGNATURES. in the function body.

As per coding guidelines, "Add JSDoc comments for public APIs".

Copy link
Contributor

Copilot AI commented Feb 17, 2026

@Krosebrook I've opened a new pull request, #61, to work on those changes. Once the pull request is ready, I'll request review from you.

Copy link
Contributor

Copilot AI commented Feb 17, 2026

@Krosebrook I've opened a new pull request, #62, to work on those changes. Once the pull request is ready, I'll request review from you.

Copy link
Contributor

Copilot AI commented Feb 17, 2026

@Krosebrook I've opened a new pull request, #63, to work on those changes. Once the pull request is ready, I'll request review from you.

Copy link
Contributor

Copilot AI commented Feb 17, 2026

@Krosebrook I've opened a new pull request, #64, to work on those changes. Once the pull request is ready, I'll request review from you.

@Krosebrook
Copy link
Owner Author

@copilot open a new pull request to apply changes based on the comments in this thread

Copy link
Contributor

Copilot AI commented Feb 17, 2026

@Krosebrook I've opened a new pull request, #65, to work on those changes. Once the pull request is ready, I'll request review from you.

@google-labs-jules
Copy link
Contributor

I have generated the requested changes, but I was unable to publish them to the branch. You can publish the changes manually from the Jules UI.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants