Conversation

@cachafla cachafla commented Dec 5, 2025

Pull Request Description

What and why?

Eliminate Network Calls on Import with Lazy Tiktoken Loading

This PR refactors the validmind.ai.test_descriptions module so that importing it no longer requires network access, by loading tiktoken lazily and tolerating failure. Previously, tiktoken.encoding_for_model("gpt-4o") would trigger a network request to download encoding data, causing delays and failures in environments without network access.

The solution takes a hybrid approach: it attempts to import tiktoken once at module load inside a try/except block, caching the result in module-level flags (_TIKTOKEN_AVAILABLE and _TIKTOKEN_ENCODING). The _truncate_summary function then checks these cached flags with negligible runtime overhead:

Before: Direct import causes network call

import tiktoken

def _truncate_summary(summary, test_id, max_tokens=100_000):
    encoding = tiktoken.encoding_for_model("gpt-4o")  # Called every time
    summary_tokens = encoding.encode(summary)
    ...

After: Cached import with character-based fallback

_TIKTOKEN_AVAILABLE = False
_TIKTOKEN_ENCODING = None

try:
    import tiktoken
    _TIKTOKEN_ENCODING = tiktoken.encoding_for_model("gpt-4o")
    _TIKTOKEN_AVAILABLE = True
except Exception:  # ImportError, or a failed encoding download
    pass  # Fall back to character-based estimation

def _truncate_summary(summary, test_id, max_tokens=100_000):
    if _TIKTOKEN_AVAILABLE:
        summary_tokens = _TIKTOKEN_ENCODING.encode(summary)  # Use cached encoding
        ...
    else:
        estimated_tokens = len(summary) // 4  # Simple fallback
        ...

When tiktoken is available, the implementation uses accurate token counting. When unavailable (no network, import failure), it gracefully falls back to character-based estimation (~4 characters per token). This ensures the library works reliably in all environments while maintaining accuracy when possible. Comprehensive unit tests verify both code paths execute correctly with proper assertions on mocked function calls.

How to test

What needs special review?

Dependencies, breaking changes, and deployment notes

Release notes

Checklist

  • What and why
  • Screenshots or videos (Frontend)
  • How to test
  • What needs special review
  • Dependencies, breaking changes, and deployment notes
  • Labels applied
  • PR linked to Shortcut
  • Unit tests added (Backend)
  • Tested locally
  • Documentation updated (if required)
  • Environment variable additions/changes documented (if required)

@cachafla cachafla requested a review from nibalizer December 5, 2025 00:12
@cachafla cachafla added the internal Not to be externalized in the release notes label Dec 5, 2025
github-actions bot commented Dec 5, 2025

PR Summary

This PR introduces significant enhancements to the token estimation and summary truncation logic within the project. The changes include:

  1. Implementation of a character-based token estimation function (_estimate_tokens_simple) and a corresponding text truncation function (_truncate_text_simple) that are used as a fallback when the tiktoken library is unavailable.

  2. Modification of the _truncate_summary function to dynamically choose between using tiktoken for accurate token counting and falling back to the character-based methods. This ensures that summary truncation works reliably in different runtime environments.

  3. Addition of comprehensive unit tests in tests/test_test_descriptions.py that validate both the tiktoken and fallback code paths. These tests cover scenarios such as:

    • Token estimation for texts of varying lengths.
    • Proper truncation behavior for both short and excessively long texts.
    • Correct selection of the code path based on the availability of the tiktoken module using patching techniques.
  4. Minor version updates in configuration files to reflect the new release version.

Overall, these changes enhance the robustness of the module by ensuring that summary truncation is both accurate (using tiktoken when possible) and resilient (with a reliable fallback).
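The path-selection testing described in point 3 can be illustrated with a self-contained sketch; all names here are illustrative stand-ins, while the real tests in tests/test_test_descriptions.py patch the actual module flags:

```python
from types import SimpleNamespace
from unittest.mock import patch

# Stand-in for the real module's cached flag; the actual tests patch
# validmind.ai.test_descriptions._TIKTOKEN_AVAILABLE instead.
mod = SimpleNamespace(_TIKTOKEN_AVAILABLE=True)

def truncate_summary(summary, max_tokens=100):
    if mod._TIKTOKEN_AVAILABLE:
        # The real code would encode with the cached tiktoken encoding here.
        return summary  # placeholder for the accurate path
    return summary[: max_tokens * 4]  # character-based fallback

# Force the fallback path regardless of whether tiktoken is installed.
with patch.object(mod, "_TIKTOKEN_AVAILABLE", False):
    out = truncate_summary("x" * 10_000)

print(len(out))  # 400
```

Patching the module-level flag (rather than mocking the import machinery) keeps the tests deterministic and lets both branches run in the same environment.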

Test Suggestions

  • Test with multi-byte or Unicode characters to ensure the character-based estimation remains consistent.
  • Add edge case tests where the summary length is just around the max_tokens threshold to verify boundaries.
  • Include tests that simulate failures in tiktoken functions (e.g., encoding/decoding errors) to further validate fallback behavior.
  • Run performance benchmarks for long text inputs to ensure the fallback method scales well.
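The boundary-condition suggestion above could be sketched as follows, using the ~4 characters-per-token heuristic; the helper names are stand-ins, not the module's real API:

```python
# Hypothetical boundary test around the max_tokens threshold.
def estimate_tokens(text: str) -> int:
    return len(text) // 4  # ~4 characters per token

def truncate(text: str, max_tokens: int) -> str:
    if estimate_tokens(text) <= max_tokens:
        return text
    return text[: max_tokens * 4]

# Just below, exactly at, and just above the 100-token threshold.
assert truncate("a" * 399, 100) == "a" * 399   # under: untouched
assert truncate("a" * 400, 100) == "a" * 400   # at: untouched
assert len(truncate("a" * 405, 100)) == 400    # over: clipped to 400 chars
```

Note that integer division makes the threshold slightly forgiving: a 401-character string still estimates to 100 tokens, so only inputs past the next multiple of four get clipped.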

@cachafla cachafla merged commit f0edca4 into main Dec 5, 2025
17 checks passed
@cachafla cachafla deleted the cacahfla/tiktoken-fallback branch December 5, 2025 16:36