[Feature] TwoNN dimension estimator #37

Merged
Xmaster6y merged 13 commits into main from twonn on Feb 13, 2026

@Xmaster6y Xmaster6y commented Feb 13, 2026

What does this PR do?

Key insights about the PR.

Linked Issues

Checklist

  • I have read the CONTRIBUTING guide.
  • I have added tests for my changes if needed.
  • I have updated the documentation if needed.

Summary by cubic

Adds TwoNN intrinsic dimension estimation as a TensorDict module with batched support and optional x/y diagnostics. Fixes return_xy batch-shape handling (multiple batch dimensions and size-1 axes) and seeds the tests so they run deterministically; addresses part of #34.

  • New Features

    • TwoNnDimensionEstimator: reads (N, D) or (..., N, D); writes one scalar per dataset to out_key.
    • Optional return_xy: outputs log(mu) (x) and -log(1−F) (y); NaN-padded to keep batch shape.
    • eps to ignore near-duplicate points; checks for a minimum number of points and for degenerate inputs.
    • Exported via tdhook.latent and latent.dimension_estimation; adds 2017 Sci Rep reference.
    • Tests for keys/diagnostics, known manifolds, batch-shape preservation, errors, seeded determinism, repr.
  • Bug Fixes

    • Corrected multi-batch return_xy padding and reshaping to match leading batch dimensions.
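
For readers unfamiliar with the method, the fit summarized above can be sketched in a few lines. This is a minimal pure-Python illustration, not the TensorDict module this PR adds: the function name `twonn_dimension`, its signature, the empirical CDF F = i/m, and dropping the last point (where 1 − F = 0) are assumptions made for the example. Only `eps`, `return_xy`, and the x = log(mu), y = -log(1−F) diagnostics come from the summary; mu is taken to be the ratio of second to first nearest-neighbor distance, as in the TwoNN method.

```python
import math


def twonn_dimension(points, eps=1e-12, return_xy=False):
    """Estimate intrinsic dimension with a TwoNN-style line fit through the origin.

    `points` is a sequence of equal-length coordinate tuples, i.e. shape (N, D).
    Hypothetical helper for illustration; not the PR's implementation.
    """
    if len(points) < 3:
        raise ValueError("TwoNN needs at least 3 points")

    mus = []
    for i, p in enumerate(points):
        # Sorted distances from point i to the others, skipping near-duplicates.
        dists = sorted(
            dist
            for j, q in enumerate(points)
            if j != i and (dist := math.dist(p, q)) > eps
        )
        if len(dists) >= 2:
            mus.append(dists[1] / dists[0])  # mu = r2 / r1 >= 1

    if len(mus) < 2:
        raise ValueError("degenerate input: too few distinct points")

    mus.sort()
    m = len(mus)
    # Empirical CDF F(mu_i) = i / m; the last point is dropped because 1 - F = 0.
    x = [math.log(mu) for mu in mus[:-1]]
    y = [-math.log(1.0 - (i + 1) / m) for i in range(m - 1)]

    sxx = sum(v * v for v in x)
    if sxx == 0.0:
        raise ValueError("degenerate input: all distance ratios equal 1")

    # Least-squares slope of y = dim * x constrained to pass through the origin.
    dim = sum(a * b for a, b in zip(x, y)) / sxx
    return (dim, x, y) if return_xy else dim
```

Called on points sampled from a flat 2-D sheet embedded in 3-D, the slope should land near 2; the pairwise loop is quadratic in N, so this form is only suitable for small inputs.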

Written for commit 46e1ae3. Summary will update on new commits.

codecov bot commented Feb 13, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 96.61%. Comparing base (20e1ece) to head (46e1ae3).
⚠️ Report is 1 commit behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main      #37      +/-   ##
==========================================
+ Coverage   96.50%   96.61%   +0.11%     
==========================================
  Files          29       31       +2     
  Lines        1916     1980      +64     
==========================================
+ Hits         1849     1913      +64     
  Misses         67       67              

@cubic-dev-ai cubic-dev-ai bot left a comment

1 issue found across 5 files

Confidence score: 4/5

  • Minor but concrete edge-case risk: src/tdhook/latent/dimension_estimation/twonn.py can hit an IndexError when ndim < 2 because it routes into the batched path without validation.
  • Overall risk is low since the issue is a specific input validation gap with limited scope and no other findings.
  • Pay close attention to src/tdhook/latent/dimension_estimation/twonn.py - validate dimensionality before calling _write_batched_result to avoid batched path errors.
Prompt for AI agents (all issues)

Check if these issues are valid — if so, understand the root cause of each and fix them.


<file name="src/tdhook/latent/dimension_estimation/twonn.py">

<violation number="1" location="src/tdhook/latent/dimension_estimation/twonn.py:34">
P2: Validate input dimensionality before calling _write_batched_result; ndim < 2 currently falls into the batched path and will throw IndexError.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
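
The guard requested in this review could look like the sketch below. It assumes the input shape is available as a tuple; `split_batch_shape` is a hypothetical helper written for illustration, not a function from `twonn.py`, and `_write_batched_result` itself is not reproduced here.

```python
def split_batch_shape(shape):
    """Split a (..., N, D) shape into (batch_shape, n_points, n_features).

    Raises a descriptive ValueError for ndim < 2 instead of letting a later
    indexing step fail with IndexError.
    """
    if len(shape) < 2:
        raise ValueError(
            f"expected input of shape (N, D) or (..., N, D), got ndim={len(shape)}"
        )
    # Everything before the last two axes is treated as batch dimensions.
    *batch_shape, n_points, n_features = shape
    return tuple(batch_shape), n_points, n_features
```

Running the check before choosing between the unbatched and batched paths means a 1-D or scalar input is rejected up front with a message that names the expected shapes.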

@cubic-dev-ai cubic-dev-ai bot left a comment

1 issue found across 2 files (changes from recent commits).

Prompt for AI agents (all issues)

Check if these issues are valid — if so, understand the root cause of each and fix them.


<file name="src/tdhook/latent/dimension_estimation/twonn.py">

<violation number="1" location="src/tdhook/latent/dimension_estimation/twonn.py:49">
P2: The new singleton handling drops batch dimensions for batched inputs with size-1 batch axes (e.g., data shape (1, N, D)), so out_key_x/out_key_y shapes no longer match the batch_shape used for out_key. This breaks shape consistency for callers that rely on batch preservation.</violation>
</file>
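
One way to keep the diagnostic outputs aligned with the batch shape, including size-1 axes, is sketched below with nested lists standing in for tensors. `pad_and_reshape` and its arguments are hypothetical names for illustration and do not reflect how `twonn.py` is written; the sketch also assumes every batch axis has size at least 1.

```python
import math


def pad_and_reshape(rows, batch_shape, fill=math.nan):
    """NaN-pad ragged per-dataset rows and nest them back into batch_shape.

    `rows` holds one list of values per dataset, in row-major batch order.
    Returns nested lists of shape batch_shape + (width,).
    """
    if len(rows) != math.prod(batch_shape):
        raise ValueError("number of rows must equal the product of batch_shape")

    # Pad every row to the longest one so the result is rectangular.
    width = max((len(r) for r in rows), default=0)
    nested = [list(r) + [fill] * (width - len(r)) for r in rows]

    if not batch_shape:
        return nested[0]  # unbatched (N, D) input: a single row, no batch axis

    # Regroup from the innermost batch axis outwards. Size-1 axes are grouped
    # like any other axis rather than being squeezed away.
    for size in reversed(batch_shape[1:]):
        nested = [nested[i:i + size] for i in range(0, len(nested), size)]
    return nested
```

With `batch_shape=(1, 2)` the result keeps its leading axis of length 1, so the x/y diagnostics have the same leading dimensions as the per-dataset scalar output.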


@Xmaster6y Xmaster6y changed the title [Feature] TwoNN implementation [Feature] TwoNN dimension estimator Feb 13, 2026
@Xmaster6y Xmaster6y merged commit 975080d into main Feb 13, 2026
7 checks passed
@Xmaster6y Xmaster6y deleted the twonn branch February 16, 2026 15:28