@dependabot dependabot bot commented on behalf of github Feb 5, 2026

Bumps kreuzberg from 4.2.9 to 4.2.10.

Release notes

Sourced from kreuzberg's releases.

v4.2.10

Fixes

Java Bindings

  • Fix ClassCastException when deserializing nested generic collections (#355)
    • Added @JsonDeserialize(contentAs = ...) annotations to PageStructure, FormattedBlock, Footnote, Attributes, PageHierarchy, PageContent, DjotContent
    • Added comprehensive JSON deserialization regression tests (17 new tests)

Python Bindings

  • Fix Windows CLI binary missing from wheel (#349)
    • CI workflow was copying with wrong filename (kreuzberg.exe instead of kreuzberg-cli.exe)

MIME Type Detection

  • Fix DOCX/XLSX/PPTX detected as ZIP via detect_mime_type_from_bytes (#350)
    • The function now inspects ZIP contents for Office format markers

Java Bindings

  • Fix format-specific metadata missing in getMetadataMap()
    • ResultParser.buildMetadata() now properly propagates flattened format metadata to Metadata.additional

Full Changelog: kreuzberg-dev/kreuzberg@v4.2.9...v4.2.10

Changelog

Sourced from kreuzberg's changelog.

[4.2.10] - 2026-02-05

Fixed

MIME Type Detection

  • DOCX/XLSX/PPTX files detected as ZIP via detect_mime_type_from_bytes: Fixed Office Open XML files (DOCX, XLSX, PPTX) being incorrectly detected as application/zip when using bytes-based MIME detection. The function now inspects ZIP contents for Office format markers (word/document.xml, xl/workbook.xml, ppt/presentation.xml) to correctly identify these formats. (#350)
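
The marker-based check described above can be sketched roughly as follows. This is an illustration of the idea using only the Python standard library, not kreuzberg's actual implementation: the helper name `sniff_zip_mime` and the `OFFICE_MARKERS` table are hypothetical, and only the three marker paths come from the changelog entry. The MIME strings are the standard Office Open XML types.

```python
import io
import zipfile
from typing import Optional

# Marker entries named in the changelog, mapped to the standard OOXML MIME types.
# The table itself is an assumption made for this sketch.
OFFICE_MARKERS = {
    "word/document.xml": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
    "xl/workbook.xml": "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
    "ppt/presentation.xml": "application/vnd.openxmlformats-officedocument.presentationml.presentation",
}


def sniff_zip_mime(data: bytes) -> Optional[str]:
    """Hypothetical helper: classify ZIP-based bytes by looking for Office markers."""
    if not data.startswith(b"PK"):
        return None  # no ZIP signature; other formats are out of scope here
    try:
        with zipfile.ZipFile(io.BytesIO(data)) as archive:
            names = set(archive.namelist())
    except zipfile.BadZipFile:
        return None  # starts like a ZIP but cannot be read as one
    for marker, mime in OFFICE_MARKERS.items():
        if marker in names:
            return mime
    # A readable ZIP without any Office marker stays a plain ZIP.
    return "application/zip"
```

Checking the archive's member names rather than only the leading magic bytes is what separates a DOCX from a generic ZIP, since both begin with the same `PK` signature.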

Java Bindings

  • Format-specific metadata missing in getMetadataMap(): Fixed sheet_count, sheet_names, and other format-specific metadata fields not being accessible via ExtractionResult.getMetadataMap(). The ResultParser.buildMetadata() method now properly propagates flattened format metadata (e.g., Excel, PPTX) to the Metadata.additional map.
  • ClassCastException when deserializing nested generic collections: Fixed LinkedHashMap cannot be cast to PageStructure and similar errors when deserializing JSON with nested List<T> fields. Added @JsonDeserialize(contentAs = ...) annotations to all model classes with generic list fields (PageStructure, FormattedBlock, Footnote, Attributes, PageHierarchy, PageContent, DjotContent) to preserve type information during Jackson deserialization. (#355)

Python Bindings

  • Windows CLI binary still missing from wheel: Fixed CI workflow copying CLI binary with wrong filename (kreuzberg.exe instead of kreuzberg-cli.exe), causing the binary to be excluded from Windows wheels despite the v4.2.9 build.py fix. The CI now copies with the correct name to match pyproject.toml include paths. (#349)
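
Because a wheel is a ZIP archive, one way to confirm that a built Windows wheel actually ships the CLI binary is to list its members. The sketch below is a hypothetical check, not part of kreuzberg's CI: the function name `wheel_contains_file` is made up for illustration, and it matches on the file's base name because the exact path inside the wheel is not stated in the changelog.

```python
import zipfile
from pathlib import PurePosixPath


def wheel_contains_file(wheel, filename: str) -> bool:
    """Hypothetical check: True if any archive member's base name equals `filename`.

    `wheel` may be a filesystem path or a binary file-like object.
    """
    with zipfile.ZipFile(wheel) as archive:
        # ZIP member names always use forward slashes, hence PurePosixPath.
        return any(PurePosixPath(name).name == filename for name in archive.namelist())
```

Calling `wheel_contains_file(path_to_wheel, "kreuzberg-cli.exe")` on a Windows wheel would then show whether the binary was packaged, where `path_to_wheel` stands for whatever wheel file the build produced.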

Commits
  • c9ca4f7 fix(benchmark): add per-file timing to batch mode and measure disk sizes
  • 3e22952 chore: updated benchmark docs
  • 5b2ba69 chore(release): v4.2.10
  • d77a24c fix(build): sync Ruby native extension version with workspace
  • 486f800 fix(ci): update C# e2e test document paths
  • bed64ba fix(ci): use find -print -quit to avoid SIGPIPE in CI CLI
  • 5003897 fix(ci): update remaining test document paths and fix ground truth
  • 2c8b21b fix(tests): update test document paths after directory refactoring
  • 6afdc2b feat(benchmark-harness): add Rust fixture generation and size measurement
  • 629ed63 chore(ruby): remove unnecessary rubocop disable directives
  • Additional commits viewable in compare view

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

  • @dependabot rebase will rebase this PR
  • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
  • @dependabot show <dependency name> ignore conditions will show all of the ignore conditions of the specified dependency
  • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

Bumps [kreuzberg](https://github.com/kreuzberg-dev/kreuzberg) from 4.2.9 to 4.2.10.
- [Release notes](https://github.com/kreuzberg-dev/kreuzberg/releases)
- [Changelog](https://github.com/kreuzberg-dev/kreuzberg/blob/main/CHANGELOG.md)
- [Commits](kreuzberg-dev/kreuzberg@v4.2.9...v4.2.10)

---
updated-dependencies:
- dependency-name: kreuzberg
  dependency-version: 4.2.10
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
@dependabot (bot) added the labels dependencies (Pull requests that update a dependency file) and python (Pull requests that update python code) on Feb 5, 2026