Skip to content

Conversation

@mlim19
Copy link
Contributor

@mlim19 mlim19 commented Oct 10, 2025

…o streaming

This fixes one of the issues described at #990.

The change is to reduce the memory footprint while handling perf script output by avoiding the whole data to be loaded in memory. Instead, it tries to read data as needed as streaming.

This change has been tested by the submitter as well as Prashant. We got the positive feedback from Prashant on this change.

Description

Related Issue

Motivation and Context

How Has This Been Tested?

Screenshots

Checklist:

  • I have read the CONTRIBUTING document.
  • I have updated the relevant documentation.
  • I have added tests for new logic.

@mlim19
Copy link
Contributor Author

mlim19 commented Oct 10, 2025

@prashantbytesyntax, I created this PR for early feedback. My local tests look promising. Please take a look and give a try if you can

perf_script_cmd, stdout=PIPE, stderr=PIPE, text=True, encoding="utf8", errors="replace"
)

# Stream output line by line
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

awesome !

…o streaming

Signed-off-by: Min Yeol Lim <min.yeol.lim@intel.com>
Signed-off-by: Min Lim <min.yeol.lim@intel.com>
@mlim19 mlim19 force-pushed the mlim19_change_perf_script_handling branch from c6bc533 to f4512a8 Compare October 23, 2025 17:50
@mlim19 mlim19 changed the title [ DRAFT ] Change the way to handle perf script output from in-memory handling t… Change the way to handle perf script output from in-memory handling t… Nov 6, 2025
@mlim19 mlim19 requested a review from dkorlovs November 6, 2025 18:35
Copy link
Contributor

@prashantbytesyntax prashantbytesyntax left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks again for the streaming optimization. This had a great impact on gProfiler resource reduction in environments especially with large # of processes ( 600-1k+). #990 (comment)

@dkorlovs dkorlovs requested a review from Copilot November 7, 2025 17:18
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR refactors perf script output handling from in-memory processing to a streaming approach, addressing memory footprint issues described in issue #990. The change aims to reduce memory consumption when processing large perf script outputs by reading and processing data line-by-line instead of loading the entire output into memory at once.

Key Changes:

  • Modified wait_and_script() to return an iterator that streams output line-by-line
  • Refactored parse_perf_script() into parse_perf_script_from_iterator() to process streaming data
  • Inlined the merge_global_perfs() function into the PerfProfiler class as _merge_fp_and_dwarf_results()

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

File Description
gprofiler/utils/perf_process.py Changed wait_and_script() to stream output via iterator instead of returning complete string; updated resource cleanup logic
gprofiler/utils/perf.py Replaced parse_perf_script() with streaming version parse_perf_script_from_iterator(); fixed typo in comment
gprofiler/profilers/perf.py Updated to use streaming parser; moved merge logic from global function to instance method _merge_fp_and_dwarf_results()

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +246 to +247
perf_script_proc.terminate()
perf_script_proc.wait(timeout=5)
Copy link

Copilot AI Nov 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The cleanup logic always attempts to terminate the perf_script_proc, even when it has already completed successfully. This can cause unnecessary termination of a process that has already exited. The termination should only occur if the process is still running (e.g., check perf_script_proc.poll() is None before calling terminate).

Suggested change
perf_script_proc.terminate()
perf_script_proc.wait(timeout=5)
if perf_script_proc.poll() is None:
perf_script_proc.terminate()
perf_script_proc.wait(timeout=5)

Copilot uses AI. Check for mistakes.
@mlim19 mlim19 merged commit 37e6acc into master Nov 8, 2025
57 of 59 checks passed
@mlim19 mlim19 deleted the mlim19_change_perf_script_handling branch November 8, 2025 22:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants