[onert] Implement buffer sharing optimization for BulkPipeline #16349
Conversation
This commit adds a buffer sharing mechanism to reduce memory usage in bulk pipeline execution. It links models for async buffer preparation and optimizes execution performance when models have identical program and weight sizes.

ONE-DCO-1.0-Signed-off-by: Jonghwa Lee <jonghwa3.lee@samsung.com>
```cpp
BulkPipelineModel(const std::string &model_path, int device_id);

enum class BufferOwnership
{
  OWNER,
```
Note for reviewers
BufferOwnership shows whether the buffer is shared (= BufferOwnership::SHARED) or maintained directly (= BufferOwnership::OWNER).
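For reference, the two states described in the note amount to something like the following sketch (the actual header may define more; comments are illustrative):

```cpp
// Sketch of the two ownership states described in the note above.
enum class BufferOwnership
{
  OWNER, // the model allocates and maintains its own program/weight buffers
  SHARED // the model reuses the buffers of the OWNER model it is linked to
};
```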
```cpp
void BulkPipelineManager::linkModels()
```
Note for reviewers
The linked models will share program and weight buffers.
For example, if 6 models are executed serially, the linkage between them can look like this (see the sketch after this list):
- 6 models: [model_0, model_1, model_2, model_3, model_4, model_5]
- First linkage: model_0 -> model_2 -> model_4
- Second linkage: model_1 -> model_3 -> model_5

In this case, model_0 and model_1 will have BufferOwnership::OWNER buffers.
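A minimal sketch of this round-robin linking, assuming a hypothetical `Model` node with a `next` pointer (names are illustrative, not the PR's actual types):

```cpp
#include <cstddef>
#include <vector>

struct Model
{
  Model *next = nullptr; // next model in the same linkage (shares this chain's buffers)
  bool is_owner = false; // true for the head of a linkage
};

// Hypothetical sketch of the linking scheme described above: model_i is
// appended to chain (i % n_owners), so for n_owners == 2 the chains are
// [0, 2, 4] and [1, 3, 5], and the head of each chain is an OWNER.
void linkModels(std::vector<Model> &models, std::size_t n_owners)
{
  if (n_owners == 0)
    return;

  std::vector<Model *> tails(n_owners, nullptr);
  for (std::size_t i = 0; i < models.size(); ++i)
  {
    auto &chain_tail = tails[i % n_owners];
    if (chain_tail == nullptr)
      models[i].is_owner = true; // first model of the chain owns the buffers
    else
      chain_tail->next = &models[i]; // later models share the owner's buffers
    chain_tail = &models[i];
  }
}
```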
```cpp
if (model_idx++ < _config.n_owner_models)
{
  // First n_owner_models models are OWNERs
  continue;
```
If _use_buffer_sharing becomes true, it seems that the first model and the other models are all handled the same way.
But if n_owner_models is 2, is it intended to have 2 owners?
I think only the first model should be an owner and the rest should be shared.
(If I misunderstood, please correct me. 😅)
Yes, the first 2 models are OWNERs.
As I wrote below (https://github.com/Samsung/ONE/pull/16349/files#r2710531814), model_0 and model_1 will have OWNER buffers, and the other models in the same linkage will share them.
Thanks for reviewing ;)
```cpp
{
  openDevice();
  allocateBuffers();
  fillBuffers();
```
In fillBuffers(), _fp can be leaked if an exception is thrown.
Could you please add logic to check and close _fp before throwing an exception to avoid a file handle leak?
_fp will be closed by the release() method, which is called in the exception handler and in the destructor.

ONE/runtime/onert/backend/trix/ops/BulkPipelineModel.cc, lines 86 to 90 in cdd7cb6
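The cleanup pattern being described is roughly this (a sketch, assuming _fp is a FILE* and release() is the single cleanup point, as in the linked lines; names and structure are illustrative):

```cpp
#include <cstdio>
#include <stdexcept>

// Sketch of the pattern described above: release() closes _fp, and both the
// exception handler and the destructor call it, so a throw from fillBuffers()
// does not leak the file handle.
class ModelSketch
{
public:
  ~ModelSketch() { release(); }

  void prepare(const char *path)
  {
    try
    {
      _fp = std::fopen(path, "rb");
      if (_fp == nullptr)
        throw std::runtime_error("failed to open model file");
      fillBuffers(); // may throw mid-read
    }
    catch (...)
    {
      release(); // close _fp before propagating
      throw;
    }
  }

private:
  void fillBuffers() { /* reads program/weight data from _fp; may throw */ }

  void release()
  {
    if (_fp != nullptr)
    {
      std::fclose(_fp);
      _fp = nullptr;
    }
  }

  std::FILE *_fp = nullptr;
};
```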
Thanks for the information. I missed release().
But there is no release() call in startAsyncBufferFill(). Could you check it?
IMHO, a failure in startAsyncBufferFill() should be handled on the caller side.
In that case, release() can be called explicitly by BulkPipelineManager, which owns the BulkPipelineModel.
I'll also update the BulkPipelineManager error handling code in another PR.
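The caller-side handling being proposed could look roughly like this (a hypothetical sketch; the actual BulkPipelineManager change is deferred to another PR):

```cpp
#include <exception>
#include <memory>
#include <vector>

// Hypothetical stand-ins for the real types; names mirror the discussion.
struct ModelStub
{
  void startAsyncBufferFill() { /* may throw */ }
  void release() { /* closes _fp and frees buffers */ }
};

// The manager owns the models, so on a failed async fill it can call
// release() explicitly before propagating the error.
void prepareModels(std::vector<std::unique_ptr<ModelStub>> &models)
{
  for (auto &model : models)
  {
    try
    {
      model->startAsyncBufferFill();
    }
    catch (const std::exception &)
    {
      model->release(); // explicit caller-side cleanup
      throw;            // let the pipeline abort or retry
    }
  }
}
```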
```cpp
try
{
  fillBuffers();
  markBufferReady();
}
catch (const std::exception &e)
{
  std::cerr << "Failed to fill buffers asynchronously: " << e.what() << std::endl;
}
```
_buffer_ready can remain permanently false if fillBuffers() throws before markBufferReady() is called. In other words, waitForBufferReady() may block forever.
Yes, that's possibly right. If you don't mind, I'll upload a separate PR to fix it.
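One possible shape of that fix (a hedged sketch; the actual follow-up PR may differ) is to record failure and wake waiters either way, so waitForBufferReady() cannot hang:

```cpp
#include <condition_variable>
#include <iostream>
#include <mutex>
#include <stdexcept>

// Sketch only: record failure and notify waiters even when fillBuffers()
// throws, so waitForBufferReady() returns instead of blocking forever.
// Names are illustrative, not the PR's actual members.
namespace sketch
{
std::mutex mtx;
std::condition_variable cv;
bool buffer_ready = false;
bool buffer_failed = false;

void fillBuffers() { /* may throw on I/O or allocation failure */ }

void asyncFill()
{
  try
  {
    fillBuffers();
    std::lock_guard<std::mutex> lock(mtx);
    buffer_ready = true;
  }
  catch (const std::exception &e)
  {
    std::cerr << "Failed to fill buffers asynchronously: " << e.what() << std::endl;
    std::lock_guard<std::mutex> lock(mtx);
    buffer_failed = true; // wake waiters with a failure state instead of hanging
  }
  cv.notify_all();
}

bool waitForBufferReady()
{
  std::unique_lock<std::mutex> lock(mtx);
  cv.wait(lock, [] { return buffer_ready || buffer_failed; });
  return buffer_ready; // false means the async fill failed
}
} // namespace sketch
```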
Thanks. I hope it gets fixed in a separate PR.
@ragmani |