Copilot AI commented Dec 5, 2025

The Series class had no automatic cache invalidation when underlying data mutated, and nested operations created O(N) closure chains that retained all intermediate Series objects in memory.

Changes

BarData wrapper for automatic cache invalidation

  • Wraps Bar[] with version tracking that increments on mutations (push(), pop(), set(), updateLast(), setAll())
  • Series now checks version on toArray() and recomputes only when data changed
  • Enables real-time streaming scenarios where bars are updated incrementally
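The wrapper described above can be sketched roughly as follows. This is a minimal illustration and not the merged implementation: the `Bar` fields are a hypothetical shape, and the decisions to skip the version bump when `pop()` finds nothing to remove and to let `updateLast()` append on an empty wrapper are assumptions.

```typescript
// Hypothetical minimal Bar shape; the real type lives in the project.
interface Bar {
  time: number;
  open: number;
  high: number;
  low: number;
  close: number;
}

// Sketch of a version-tracking wrapper: every mutation bumps `version`.
class BarData {
  private _bars: Bar[];
  private _version = 0;

  constructor(bars: Bar[] = []) {
    this._bars = bars;
  }

  get version(): number {
    return this._version;
  }

  get length(): number {
    return this._bars.length;
  }

  get bars(): readonly Bar[] {
    return this._bars;
  }

  push(bar: Bar): void {
    this._bars.push(bar);
    this._version++;
  }

  pop(): Bar | undefined {
    const removed = this._bars.pop();
    // Assumption: popping an empty wrapper changes nothing, so no bump.
    if (removed !== undefined) this._version++;
    return removed;
  }

  set(index: number, bar: Bar): void {
    this._bars[index] = bar;
    this._version++;
  }

  updateLast(bar: Bar): void {
    // Assumption: on an empty wrapper this appends instead of failing.
    if (this._bars.length === 0) {
      this.push(bar);
      return;
    }
    this._bars[this._bars.length - 1] = bar;
    this._version++;
  }

  setAll(bars: Bar[]): void {
    this._bars = bars;
    this._version++;
  }
}
```

A consumer only needs to remember the last `version` it saw; comparing two numbers is what keeps the cache-hit path O(1).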

materialize() method to break closure chains

  • Eagerly computes values and returns fresh Series without closure references
  • Example: a.add(b).mul(c).materialize().div(d) frees references to a, b, c
  • Critical for complex indicators with deep operation chains
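To make the effect of materialize() concrete, here is a simplified sketch. `LazySeries` is a hypothetical stand-in that works on plain number arrays rather than bars, so it illustrates the idea only and is not the project's Series class.

```typescript
type Extractor = (i: number) => number;

// Hypothetical, simplified stand-in for a closure-based Series.
class LazySeries {
  private readonly size: number;
  private readonly extractor: Extractor;

  constructor(size: number, extractor: Extractor) {
    this.size = size;
    this.extractor = extractor;
  }

  static fromArray(values: number[]): LazySeries {
    // This closure captures only the number[]; no parent series is retained.
    return new LazySeries(values.length, (i) => values[i] ?? NaN);
  }

  add(other: LazySeries): LazySeries {
    // Captures `this` and `other`, so both parents stay reachable.
    return new LazySeries(this.size, (i) => this.extractor(i) + other.extractor(i));
  }

  mul(other: LazySeries): LazySeries {
    return new LazySeries(this.size, (i) => this.extractor(i) * other.extractor(i));
  }

  toArray(): number[] {
    return Array.from({ length: this.size }, (_, i) => this.extractor(i));
  }

  materialize(): LazySeries {
    // Evaluate once, then rebuild from plain values to cut the chain.
    return LazySeries.fromArray(this.toArray());
  }
}

const seriesA = LazySeries.fromArray([1, 2]);
const seriesB = LazySeries.fromArray([3, 4]);
const seriesC = LazySeries.fromArray([2, 2]);

// The result no longer references seriesA, seriesB, seriesC, or the
// intermediate add/mul series; it holds a single number[].
const flattened = seriesA.add(seriesB).mul(seriesC).materialize();
```

The trade-off is that the materialized series owns a full array of values, so it exchanges retained closures for one eagerly computed buffer.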

Backward compatibility

  • Existing Bar[] arrays automatically wrapped in BarData internally
  • All factory methods (fromBars(), constant(), fromArray()) accept both types
  • No API changes required for existing code
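One way to support both input types is to normalize at the boundary. The sketch below is an assumption about how such wrapping might look; `toBarData` is a hypothetical helper name and the `BarData` shown here is reduced to the bare minimum.

```typescript
interface Bar {
  close: number;
}

// Reduced stand-in for the wrapper, just enough for the example.
class BarData {
  bars: Bar[];
  version = 0;

  constructor(bars: Bar[]) {
    this.bars = bars;
  }
}

// Hypothetical helper: accept either input type, always return a BarData.
function toBarData(source: Bar[] | BarData): BarData {
  return source instanceof BarData ? source : new BarData(source);
}

const rawBars: Bar[] = [{ close: 10 }, { close: 11 }];
const wrapped = toBarData(rawBars); // legacy call sites keep passing Bar[]
const passthrough = toBarData(wrapped); // an existing BarData is reused as-is
```

Note that a raw array wrapped this way is still mutable from outside, so direct mutations of the original `Bar[]` would bypass version tracking.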

Example

// Before: manual invalidation, stale cache risk
const close = Series.fromBars(bars, 'close');
const values1 = close.toArray();  // caches
bars.push(newBar);                // Series unaware
close._invalidate();              // must remember to call
const values2 = close.toArray();  // recomputes

// After: automatic invalidation
const barData = new BarData(bars);
const close = Series.fromBars(barData, 'close');
const values1 = close.toArray();  // caches
barData.push(newBar);             // version++
const values2 = close.toArray();  // auto-detects, recomputes

// Memory management
const chained = a.add(b).mul(c).div(d).sub(e);  // closures keep a, b, c, d, e and every intermediate Series reachable
const trimmed = a.add(b).mul(c).materialize().div(d).sub(e);  // drops the chain's references to a, b, c

Performance impact: O(1) version check overhead on cache hits, otherwise identical.

Original prompt

Problem Statement

The Series class in packages/oakscriptjs/src/runtime/series.ts has two critical issues that need to be addressed for production use cases, especially real-time/streaming applications.

Issue 1: Series Cache Invalidation

Current Behavior

The Series class caches computed values in toArray() but has no automatic mechanism to detect when the underlying data changes:

toArray(): number[] {
  if (this.cached !== null) {
    return this.cached;  // Returns stale data if underlying bars changed
  }
  this.cached = this.data.map((bar, i) => this.extractor(bar, i, this.data));
  return this.cached;
}

_invalidate(): void {
  this.cached = null;  // Must be called manually
}

Problems

  1. Manual invalidation only: _invalidate() must be called explicitly—there's no automatic detection when this.data changes
  2. No derived Series tracking: When you compose Series (e.g., close.add(open)), the derived Series has its own cache but doesn't know when parent caches should be invalidated
  3. Mutable data reference: this.data is stored by reference—if the original Bar[] array is mutated externally, cached values become stale

Required Solution

Implement a versioned data source pattern:

  1. Create a BarData class (or similar) that wraps Bar[] and tracks a version number that increments on mutations
  2. Modify Series to store a reference to the data source and track the version when cache was computed
  3. On toArray(), check if the current data version matches the cached version before returning cached results
  4. Optionally add an invalidateAll() mechanism for derived Series coordination
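Steps 2 and 3 could look roughly like this on the Series side. `VersionedBars` and `CachedSeries` are hypothetical names used for illustration, and the `recomputations` counter exists only to make the behavior observable.

```typescript
interface Bar {
  close: number;
}

// Hypothetical versioned source, reduced to a single mutation.
class VersionedBars {
  private items: Bar[] = [];
  version = 0;

  get bars(): readonly Bar[] {
    return this.items;
  }

  push(bar: Bar): void {
    this.items.push(bar);
    this.version++;
  }
}

class CachedSeries {
  private cached: number[] | null = null;
  private cachedVersion = -1;
  private readonly source: VersionedBars;
  private readonly extractor: (bar: Bar) => number;
  recomputations = 0; // instrumentation for the example only

  constructor(source: VersionedBars, extractor: (bar: Bar) => number) {
    this.source = source;
    this.extractor = extractor;
  }

  toArray(): number[] {
    // The cache is valid only if it was built at the current version.
    if (this.cached !== null && this.cachedVersion === this.source.version) {
      return this.cached;
    }
    const values = this.source.bars.map((bar) => this.extractor(bar));
    this.cached = values;
    this.cachedVersion = this.source.version;
    this.recomputations++;
    return values;
  }

  add(other: CachedSeries): CachedSeries {
    // A derived series shares the source, so the same check covers it.
    return new CachedSeries(this.source, (bar) => this.extractor(bar) + other.extractor(bar));
  }
}
```

Because derived series read the version from the shared source rather than from their parents, they pick up changes without an explicit invalidation call.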

Issue 2: Closure Chain Memory Leak

Current Behavior

Every Series operation creates a new closure that captures the parent Series:

add(other: Series | number): Series {
  return new Series(this.data, (bar, i, data) => {
    const a = this.extractor(bar, i, data);  // Captures 'this'
    const b = typeof other === 'number' ? other : other.extractor(bar, i, data);  // Captures 'other'
    return a + b;
  });
}

Problem

For a chain like a.add(b).mul(c).div(d).sub(e):

  • Each step creates a closure that captures its parent Series, which in turn retains every earlier Series in the chain
  • Results in O(N) Series objects kept alive for a chain of N operations
  • Each Series holds: data (Bar[]), extractor (function), cached (number[] | null)
  • Deep chains in complex indicators can retain significant memory

Required Solution

Implement one or more of these approaches:

  1. Add a materialize() method that eagerly computes values and creates a fresh Series, breaking the closure chain:

    materialize(): Series {
      const values = this.toArray();
      return Series.fromArray(this.data, values);
    }
  2. Expression tree approach (more complex): Replace nested closures with an expression AST that can be evaluated in a single pass without holding references to intermediate Series

  3. Automatic materialization heuristic: Track chain depth and auto-materialize when exceeding a threshold
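For comparison with materialize(), the expression tree idea in approach 2 might be sketched as below. The `Expr` shape and the helper names are assumptions made for illustration; the sketch works on plain number columns rather than bars.

```typescript
type BinaryOp = "add" | "sub" | "mul" | "div";

// Plain data describing the computation; no closures over Series objects.
type Expr =
  | { kind: "column"; values: readonly number[] }
  | { kind: "constant"; value: number }
  | { kind: "binary"; op: BinaryOp; left: Expr; right: Expr };

const OPS: Record<BinaryOp, (l: number, r: number) => number> = {
  add: (l, r) => l + r,
  sub: (l, r) => l - r,
  mul: (l, r) => l * r,
  div: (l, r) => l / r,
};

const column = (values: readonly number[]): Expr => ({ kind: "column", values });
const constant = (value: number): Expr => ({ kind: "constant", value });
const binary = (op: BinaryOp, left: Expr, right: Expr): Expr => ({ kind: "binary", op, left, right });

// Evaluate the tree for one row.
function evalAt(expr: Expr, i: number): number {
  if (expr.kind === "column") return expr.values[i] ?? NaN;
  if (expr.kind === "constant") return expr.value;
  return OPS[expr.op](evalAt(expr.left, i), evalAt(expr.right, i));
}

// Single pass over the rows; no intermediate arrays are allocated.
function evaluate(expr: Expr, length: number): number[] {
  return Array.from({ length }, (_, i) => evalAt(expr, i));
}

// (a + b) * 2
const expr = binary("mul", binary("add", column([1, 2, 3]), column([10, 20, 30])), constant(2));
const evaluated = evaluate(expr, 3);
```

The tree still references its leaf columns, but intermediate nodes are small plain objects rather than Series instances with their own caches.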

Files to Modify

  • packages/oakscriptjs/src/runtime/series.ts - Main Series class implementation
  • packages/oakscriptjs/src/types.ts - May need new types for BarData/DataSource
  • packages/oakscriptjs/src/index.ts - Export new classes if added
  • packages/oakscriptjs/tests/ - Add tests for new functionality

Acceptance Criteria

  1. Series cache is automatically invalidated when underlying data changes (for streaming/real-time use)
  2. A materialize() method (or similar) exists to break closure chains and free memory
  3. Existing API remains backward compatible
  4. Unit tests cover cache invalidation scenarios
  5. Unit tests cover memory management with materialize()
  6. Documentation updated to explain new features

This pull request was created as a result of the prompt above, which came from Copilot chat.

- Add BarData class with version tracking for automatic cache invalidation
- Update Series to use BarData and track cache version
- Add materialize() method to break closure chains and free memory
- Maintain backward compatibility with Bar[] arrays
- Add comprehensive test suite with 33 tests covering all new features

Co-authored-by: deepentropy <8287111+deepentropy@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Fix cache invalidation in Series class for real-time updates" to "Add versioned cache invalidation and materialize() to Series for production streaming use cases" Dec 5, 2025
Copilot AI requested a review from deepentropy December 5, 2025 08:13
@deepentropy deepentropy marked this pull request as ready for review December 5, 2025 08:14
Copilot AI review requested due to automatic review settings December 5, 2025 08:14
@deepentropy deepentropy merged commit c960795 into main Dec 5, 2025
@deepentropy deepentropy deleted the copilot/fix-series-cache-invalidation branch December 5, 2025 08:14

Copilot AI left a comment

Pull request overview

This PR adds production-ready cache invalidation and memory management to the Series class for streaming use cases. It introduces a BarData wrapper that tracks data mutations via version numbers, enabling automatic cache invalidation, and adds a materialize() method to break closure chains that cause memory leaks in complex indicator calculations.

Key Changes

  • BarData wrapper: Automatically increments version on mutations (push(), pop(), set(), updateLast(), setAll()), enabling Series to detect stale caches
  • Automatic cache invalidation: Series checks BarData version before returning cached values, recomputing only when data has changed
  • materialize() method: Breaks closure chains by eagerly computing values and creating a fresh Series without parent references, critical for complex multi-operation chains

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 6 comments.

File - Description

  • packages/oakscriptjs/src/runtime/series.ts - Adds BarData class with version tracking, updates Series constructor to accept Bar[] or BarData, implements version-based cache invalidation in toArray(), adds materialize() method
  • packages/oakscriptjs/src/index.ts - Exports BarData class alongside Series for public API
  • packages/oakscriptjs/tests/runtime/series.test.ts - Comprehensive test coverage for BarData mutations, cache invalidation scenarios, materialize() functionality, backward compatibility, and integration tests

Comment on lines +514 to +516
// Create a copy of values to ensure no shared references
const valuesCopy = [...values];
return Series.fromArray(this.dataSource, valuesCopy);

Copilot AI Dec 5, 2025

The array spread operation [...values] creates an unnecessary copy. Since toArray() already returns a fresh computation when cache is invalidated, and the cache is not shared between Series instances, the copy is redundant. This adds O(N) time and space overhead on every materialize() call.

Consider removing the spread and directly using the values:

materialize(): Series {
  const values = this.toArray();
  return Series.fromArray(this.dataSource, values);
}

The values array is already safe to use directly because:

  1. toArray() returns a fresh array when recomputing
  2. Series.fromArray() creates a new Series that captures the array by reference
  3. The materialized Series won't modify the array
Suggested change
- // Create a copy of values to ensure no shared references
- const valuesCopy = [...values];
- return Series.fromArray(this.dataSource, valuesCopy);
+ return Series.fromArray(this.dataSource, values);

Comment on lines +56 to +57
get bars(): Bar[] {
return this._bars;

Copilot AI Dec 5, 2025

The bars getter exposes the internal array by reference, allowing external code to bypass version tracking. While this is documented in the JSDoc (lines 52-54), it creates a significant footgun that could lead to subtle cache invalidation bugs.

Consider strengthening the API design:

  1. Return a readonly array type: get bars(): readonly Bar[] to make mutations a type error
  2. Or provide an internal _getBarsUnsafe() method for Series to use, while making bars return a copy

The current documentation warns users but doesn't prevent the misuse at compile-time.

Suggested change
- get bars(): Bar[] {
- return this._bars;
+ get bars(): readonly Bar[] {
+ // Return a shallow copy to prevent external mutation
+ return this._bars.slice();
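
A small sketch of what the suggested getter does in practice. `GuardedBars` is a hypothetical reduced wrapper, not the project's class.

```typescript
interface Bar {
  close: number;
}

// Hypothetical reduced wrapper using the suggested getter.
class GuardedBars {
  private _bars: Bar[] = [];
  private _version = 0;

  get version(): number {
    return this._version;
  }

  get length(): number {
    return this._bars.length;
  }

  get bars(): readonly Bar[] {
    // Shallow copy: callers never receive the internal array.
    return this._bars.slice();
  }

  push(bar: Bar): void {
    this._bars.push(bar);
    this._version++;
  }
}

const guarded = new GuardedBars();
guarded.push({ close: 10 });

const snapshot = guarded.bars;
// snapshot.push({ close: 11 });  // rejected by the compiler: readonly Bar[] has no push
(snapshot as Bar[]).push({ close: 11 }); // a cast defeats the type, but only the copy changes
```

The copy costs O(N) per access, so a hot path such as toArray() would probably want internal access to the array without the copy, which is what option 2 above points at.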

Comment on lines +514 to +516
// Create a copy of values to ensure no shared references
const valuesCopy = [...values];
return Series.fromArray(this.dataSource, valuesCopy);

Copilot AI Dec 5, 2025

[nitpick] The variable name valuesCopy is misleading because it suggests the copy is necessary, but based on the implementation, the copy appears redundant. If the copy is actually needed for safety (e.g., to prevent the materialized Series from being affected by cache clearing in the source Series), this should be documented.

Consider either:

  1. Removing the copy if it's truly unnecessary (see related comment on lines 514-516)
  2. Or renaming to isolatedValues with a comment explaining why isolation is needed
Suggested change
- // Create a copy of values to ensure no shared references
- const valuesCopy = [...values];
- return Series.fromArray(this.dataSource, valuesCopy);
+ // Create a copy to isolate the materialized Series from future cache invalidation or mutation in the source Series.
+ const isolatedValues = [...values];
+ return Series.fromArray(this.dataSource, isolatedValues);

Comment on lines 479 to 490
  toArray(): number[] {
-   if (this.cached !== null) {
+   // Check if cache is valid by comparing versions
+   if (this.cached !== null && this.cachedVersion === this.dataSource.version) {
      return this.cached;
    }

-   this.cached = this.data.map((bar, i) => this.extractor(bar, i, this.data));
+   // Recompute and update cache
+   const bars = this.dataSource.bars;
+   this.cached = bars.map((bar, i) => this.extractor(bar, i, bars));
+   this.cachedVersion = this.dataSource.version;
    return this.cached;
  }

Copilot AI Dec 5, 2025

[nitpick] The toArray() method creates a new array on every cache miss, but there's no mechanism to limit memory growth if the BarData keeps growing (e.g., in a long-running streaming application). Each Series holds a full cached copy of all values.

For production streaming scenarios with unbounded bar data, consider adding:

  1. A maximum cache size or TTL mechanism
  2. Documentation about memory implications for long-running streams
  3. A clearCache() method users can call periodically

Example scenario: A streaming app running for days/weeks could accumulate millions of bars, with each derived Series holding a full array copy in cache.
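
As a rough illustration of options 1 and 3, a cache holder could expose clearCache() and decline to retain very large results. `CachedValues` and `maxCachedLength` are hypothetical names, and the size threshold is only one possible policy.

```typescript
// Hypothetical cache holder with manual clearing and a size cap.
class CachedValues {
  private cached: number[] | null = null;
  private readonly compute: () => number[];
  private readonly maxCachedLength: number;
  computations = 0; // instrumentation for the example only

  constructor(compute: () => number[], maxCachedLength = Infinity) {
    this.compute = compute;
    this.maxCachedLength = maxCachedLength;
  }

  toArray(): number[] {
    if (this.cached !== null) return this.cached;
    const values = this.compute();
    this.computations++;
    // Results above the cap are returned but not retained.
    if (values.length <= this.maxCachedLength) this.cached = values;
    return values;
  }

  clearCache(): void {
    // Lets long-running callers release the cached array.
    this.cached = null;
  }
}

const smallResult = new CachedValues(() => [1, 2, 3]);
const cappedResult = new CachedValues(() => [1, 2, 3], 2);
```

Skipping the cache for oversized results trades memory for repeated computation, so the threshold would need tuning against real workloads.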

Comment on lines +129 to +134
* Get a bar at a specific index
* @param index - Bar index
* @returns Bar at that index, or undefined
*/
at(index: number): Bar | undefined {
return this._bars[index];

Copilot AI Dec 5, 2025

[nitpick] The at() method doesn't validate the index parameter, and bracket indexing treats negative indices differently from JavaScript's native Array.prototype.at().

For example, with the current implementation:

barData.at(-1) // Returns undefined (bracket indexing does not count from the end)
barData.at(-100) // Returns undefined

This may surprise users who expect either Array.at()-style negative indexing or explicit bounds checking. Consider:

  1. Supporting negative indices by delegating to Array.at(), and documenting the convention
  2. Or explicitly rejecting negative indices if that's not the intended behavior

The current implementation is silent about this edge case.

Suggested change
- * Get a bar at a specific index
- * @param index - Bar index
- * @returns Bar at that index, or undefined
- */
- at(index: number): Bar | undefined {
- return this._bars[index];
+ * Get a bar at a specific index (supports negative indices, like Array.at)
+ * @param index - Bar index (negative values count from the end)
+ * @returns Bar at that index, or undefined
+ */
+ at(index: number): Bar | undefined {
+ return this._bars.at(index);
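
The difference between the two lookups can be shown with a hand-rolled helper. `barAt` is a hypothetical function that mimics the integer-index behavior of Array.prototype.at() without depending on the ES2022 library typings.

```typescript
// Hypothetical helper: negative indices count from the end,
// out-of-range indices yield undefined.
function barAt<T>(items: readonly T[], index: number): T | undefined {
  const resolved = index < 0 ? items.length + index : index;
  return items[resolved];
}

const samples: number[] = [10, 20, 30];

const viaBracket = samples[-1]; // bracket indexing: no element at key "-1"
const viaHelper = barAt(samples, -1); // counts from the end: last element
const outOfRange = barAt(samples, -100); // still out of range: undefined
```

Whichever convention is chosen, stating it in the JSDoc keeps callers from guessing.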

];

const close = Series.fromBars(bars, 'close');
const open = Series.fromBars(bars, 'open');

Copilot AI Dec 5, 2025

Unused variable open.

Suggested change
- const open = Series.fromBars(bars, 'open');
