TEST#1

Closed
batcheu wants to merge 13 commits into master from draft/trix_pipeline

Conversation

@batcheu (Owner) commented Dec 1, 2025

No description provided.

@batcheu (Owner, Author) commented Dec 1, 2025

@greptile

greptile-apps bot commented Dec 1, 2025

Greptile Overview

Greptile Summary

This PR implements bulk pipeline execution for ONERT's Trix backend, enabling multi-model orchestration with buffer sharing optimization.

Key Changes:

  • Added BulkPipelineManager to orchestrate multiple models in a pipeline with configurable buffer sharing between owner and shared models
  • Implemented BulkPipelineModel for individual model lifecycle management with async buffer filling capability
  • Created BulkPipelineBuffer for DMA buffer allocation and management using low-level ioctl operations
  • Extended TrixLoader to parse semicolon-delimited model paths (e.g., model1.tvn;model2.tvn)
  • Modified KernelGenerator to dispatch to BulkPipelineLayer for multi-model execution vs single-model BulkLayer
  • Changed Bulk::Param::binary_path from string to vector<string> to support multiple models
  • Added comprehensive unit tests with mock syscall framework for hardware-independent testing

Critical Issues Found:

  • BulkPipelineModel::initialize() never sets _initialized flag to true, causing initialization state to remain false
  • BulkPipelineBuffer::allocate() incorrectly handles ioctl return value, assigning it to dmabuf instead of checking return code and using hwmem.dbuf_fd
  • Race condition in BulkPipelineModel::startAsyncBufferFill() where _buffer_ready is set outside mutex protection before async task starts

Confidence Score: 1/5

  • This PR contains critical bugs that will cause runtime failures and potential resource leaks
  • The three logic errors identified are severe: the missing _initialized flag will break model reuse checks, the ioctl mishandling could cause file descriptor leaks and buffer allocation failures, and the race condition creates undefined behavior in multi-threaded execution. These must be fixed before merge.
  • Critical fixes needed in runtime/onert/backend/trix/ops/BulkPipelineModel.cc and runtime/onert/backend/trix/ops/BulkPipelineBuffer.cc

Important Files Changed

File Analysis

| Filename | Score | Overview |
|---|---|---|
| runtime/onert/backend/trix/ops/BulkPipelineModel.cc | 1/5 | Implements model lifecycle management with critical bugs: missing _initialized flag assignment and race condition in async buffer fill |
| runtime/onert/backend/trix/ops/BulkPipelineBuffer.cc | 1/5 | DMA buffer management with critical ioctl return value handling error that could cause file descriptor leaks |
| runtime/onert/backend/trix/ops/BulkPipelineManager.cc | 3/5 | Orchestrates multi-model pipeline with buffer sharing logic, minor typo in comment |
| runtime/onert/backend/trix/ops/BulkPipelineLayer.cc | 4/5 | Integration layer connecting pipeline manager to ONERT execution framework |
| runtime/onert/backend/trix/KernelGenerator.cc | 4/5 | Extended to support both single model and pipeline execution based on binary_path size |
| runtime/onert/loader/trix/TrixLoader.cc | 4/5 | Refactored to parse semicolon-delimited model paths and load I/O metadata from head/tail models |

Sequence Diagram

sequenceDiagram
    participant User
    participant TrixLoader
    participant KernelGenerator
    participant BulkPipelineLayer
    participant BulkPipelineManager
    participant BulkPipelineModel
    participant BulkPipelineBuffer
    participant NPU

    User->>TrixLoader: loadFromFile
    TrixLoader->>TrixLoader: Parse semicolon-delimited paths
    TrixLoader->>TrixLoader: Load I/O metadata from head and tail models
    TrixLoader->>TrixLoader: Create Bulk operation with vector of paths

    User->>KernelGenerator: visit Bulk operation
    alt Single Model
        KernelGenerator->>KernelGenerator: Create BulkLayer
    else Multiple Models
        KernelGenerator->>BulkPipelineLayer: Create BulkPipelineLayer
    end

    BulkPipelineLayer->>BulkPipelineManager: configure and initialize
    BulkPipelineManager->>BulkPipelineManager: createModels
    loop For each model path
        BulkPipelineManager->>BulkPipelineModel: new BulkPipelineModel
        BulkPipelineModel->>BulkPipelineModel: loadMetadata
    end
    
    BulkPipelineManager->>BulkPipelineManager: Check buffer sharing eligibility
    Note over BulkPipelineManager: All models must have same<br/>program_size and weight_size
    
    BulkPipelineManager->>BulkPipelineManager: linkModels
    Note over BulkPipelineManager: Set ownership OWNER vs SHARED<br/>Link models in chain
    
    BulkPipelineManager->>BulkPipelineManager: prepareModels
    loop For each OWNER model
        BulkPipelineModel->>NPU: openDevice
        BulkPipelineModel->>BulkPipelineBuffer: allocate program buffer
        BulkPipelineModel->>BulkPipelineBuffer: allocate weight buffer
        BulkPipelineBuffer->>NPU: ioctl TRIX_IOCTL_HWMEM_ALLOC
        BulkPipelineBuffer->>BulkPipelineBuffer: mmap DMA buffer
        BulkPipelineModel->>BulkPipelineBuffer: fillBuffers
        BulkPipelineModel->>NPU: registerNPUmodel_ext
    end
    
    loop For each SHARED model
        BulkPipelineModel->>BulkPipelineModel: shareBuffersFrom owner
        Note over BulkPipelineModel: Reuse buffers device and model_id
    end

    User->>BulkPipelineLayer: run
    BulkPipelineLayer->>BulkPipelineManager: execute with inputs and outputs
    
    loop For each model in pipeline
        BulkPipelineManager->>BulkPipelineModel: waitForBufferReady
        Note over BulkPipelineModel: Wait on condition variable<br/>until buffer is ready
        BulkPipelineModel->>NPU: runNPU_model
        NPU-->>BulkPipelineModel: execution complete
        
        alt Not last model
            BulkPipelineManager->>BulkPipelineManager: Chain outputs to next inputs
        end
        
        alt Buffer sharing enabled
            BulkPipelineModel->>BulkPipelineModel: getNextModel
            BulkPipelineModel->>BulkPipelineModel: startAsyncBufferFill
            Note over BulkPipelineModel: Async refill buffers for<br/>next shared neighbor
        end
    end
    
    BulkPipelineManager-->>User: Pipeline execution complete

greptile-apps bot left a comment

19 files reviewed, 6 comments

Comment on lines 41 to 54
```cpp
bool BulkPipelineModel::initialize()
{
  if (_initialized.load())
  {
    return true;
  }

  if (!loadMetadata())
  {
    return false;
  }

  return true;
}
```

logic: _initialized flag never set to true in initialize() method.

Suggested change
```cpp
bool BulkPipelineModel::initialize()
{
  if (_initialized.load())
  {
    return true;
  }

  if (!loadMetadata())
  {
    return false;
  }

  _initialized = true;
  return true;
}
```
Path: runtime/onert/backend/trix/ops/BulkPipelineModel.cc, lines 41-54

Comment on lines +204 to +218
```cpp
void BulkPipelineModel::startAsyncBufferFill()
{
  _buffer_ready = false;
  _async_fill_future = std::async(std::launch::async, [this] {
    try
    {
      fillBuffers();
      markBufferReady();
    }
    catch (const std::exception &e)
    {
      std::cerr << "Failed to fill buffers asynchronously: " << e.what() << std::endl;
    }
  });
}
```

logic: Race condition: _buffer_ready set to false before async task starts without synchronization

The _buffer_ready flag is set to false outside the mutex protection on line 206, then accessed inside the async lambda. If the async task starts immediately, it could read/write _buffer_ready concurrently with the main thread setting it, causing a data race.

Suggested change
```cpp
void BulkPipelineModel::startAsyncBufferFill()
{
  {
    std::lock_guard<std::mutex> lock(_buffer_mutex);
    _buffer_ready = false;
  }
  _async_fill_future = std::async(std::launch::async, [this] {
    try
    {
      fillBuffers();
      markBufferReady();
    }
    catch (const std::exception &e)
    {
      std::cerr << "Failed to fill buffers asynchronously: " << e.what() << std::endl;
    }
  });
}
```
Path: runtime/onert/backend/trix/ops/BulkPipelineModel.cc, lines 204-218


syntax: Typo: "devbice" should be "device"

Suggested change
```cpp
  // Open the device
```
Path: runtime/onert/backend/trix/ops/BulkPipelineBuffer.cc, line 71


syntax: Typo: "initilized" should be "initialized"

Path: runtime/onert/backend/trix/ops/BulkPipelineManager.cc, line 45


syntax: Typo: "Shareing" should be "Sharing"

Path: runtime/onert/backend/trix/ops/BulkPipelineModel.cc, line 175

Comment on lines +85 to +91
```cpp
_buffer->dmabuf = ioctl(_dev_fd, TRIX_IOCTL_HWMEM_ALLOC, &hwmem);
if (_buffer->dmabuf < 0)
{
  close(_dev_fd);
  _dev_fd = -1;
  throw std::runtime_error("Failed to allocate DMA buffer, size: " + std::to_string(hwmem.size));
}
```

logic: Return value from ioctl() not checked correctly

The code assigns the return value of ioctl() to _buffer->dmabuf and checks if it's negative. However, ioctl() returns -1 on error (not the fd). The actual dmabuf fd should come from the hwmem.dbuf_fd field after a successful call.

Suggested change
```cpp
int ret = ioctl(_dev_fd, TRIX_IOCTL_HWMEM_ALLOC, &hwmem);
if (ret < 0)
{
  close(_dev_fd);
  _dev_fd = -1;
  throw std::runtime_error("Failed to allocate DMA buffer, size: " + std::to_string(hwmem.size));
}
_buffer->dmabuf = hwmem.dbuf_fd;
```
Path: runtime/onert/backend/trix/ops/BulkPipelineBuffer.cc, lines 85-91

batcheu and others added 13 commits December 3, 2025 13:08
This commit introduces the BulkPipelineBuffer class to manage the
buffers in the TRIX backend.

ONE-DCO-1.0-Signed-off-by: Jonghwa Lee <jonghwa3.lee@samsung.com>
…sung#16334)

This commit fixes the typing by allowing passing None as a value for
the input_shapes parameter in the benchmark_inference() function. In
case of input_shapes being None, the function uses the shape retrieved
from the initialized session.

ONE-DCO-1.0-Signed-off-by: Arkadiusz Bokowy <a.bokowy@samsung.com>
…sung#16333)

It adds a new MockSyscallsManager class to provide a configurable hook system
for mocking system calls in tests.

ONE-DCO-1.0-Signed-off-by: Jonghwa Lee <jonghwa3.lee@samsung.com>
This commit adds Mean_U8_000 to tolerance-based evaluation with 1 absolute tolerance to handle precision issues in uint8 operations.

ONE-DCO-1.0-Signed-off-by: Hyeongseok Oh <hseok82.oh@samsung.com>
…6338)

This commit adds debug print statements to log input data when model execution results differ between interpreter and luci outputs.
This change helps diagnose test failures by providing complete context including input data, interpreter output, and luci output.

ONE-DCO-1.0-Signed-off-by: Hyeongseok Oh <hseok82.oh@samsung.com>
…#16332)

This implements a new BulkPipelineModel class to handle NPU model loading.

ONE-DCO-1.0-Signed-off-by: Jonghwa Lee <jonghwa3.lee@samsung.com>
…g#16339)

Add a new BulkPipelineManager class to coordinate execution of multiple
models in sequence with proper resource management.

ONE-DCO-1.0-Signed-off-by: Jonghwa Lee <jonghwa3.lee@samsung.com>

Signed-off-by: Jonghwa Lee <jonghwa3.lee@samsung.com>
Samsung#16342)

This commit updates header guard names and nested namespace declarations
in bulk pipeline headers. This improves code consistency and readability.

ONE-DCO-1.0-Signed-off-by: Jonghwa Lee <jonghwa3.lee@samsung.com>
This commit adds comprehensive comments documenting current restrictions for MX dtypes (MXFP4, MXINT8) in the circle schema.

ONE-DCO-1.0-Signed-off-by: Hyeongseok Oh <hseok82.oh@samsung.com>
…g#16343)

It adds a new BulkPipelineLayer class to handle bulk pipeline operations
in the trix backend.

ONE-DCO-1.0-Signed-off-by: Jonghwa Lee <jonghwa3.lee@samsung.com>
This replaces the previous NYI exception with actual pipeline
execution functionality for the trix backend.

ONE-DCO-1.0-Signed-off-by: Jonghwa Lee <jonghwa3.lee@samsung.com>
This commit adds buffer sharing mechanism to reduce memory usage
in bulk pipeline execution.
Link models for async buffer preparation and optimize execution
performance when models have identical program and weight sizes.

ONE-DCO-1.0-Signed-off-by: Jonghwa Lee <jonghwa3.lee@samsung.com>
This commit adds verification step to ensure loaded models have
matching input/output counts with the pipeline configuration.

ONE-DCO-1.0-Signed-off-by: Jonghwa Lee <jonghwa3.lee@samsung.com>
@batcheu batcheu force-pushed the draft/trix_pipeline branch from 77b0f24 to 784f6ea Compare January 15, 2026 03:43
@batcheu batcheu closed this Jan 19, 2026