-
Notifications
You must be signed in to change notification settings - Fork 31
Description
Bug Report: Memory Safety Issues in accl.cpp
Summary
Found several memory safety and exception handling issues in driver/xrt/src/accl.cpp that could cause memory leaks, undefined behavior, and unhelpful error messages.
Issue 1: Exception Thrown as Pointer (Memory Leak)
Location
driver/xrt/src/accl.cpp, Line 1118 (in configure_arithmetic())
Problem
Exception is thrown using new, creating a pointer to an exception object instead of throwing by value:
throw new std::runtime_error("Communicators unconfigured, please call "
"configure_communicator() first");Impact
- Memory Leak: The allocated exception object is never freed
- Incorrect Catch Behavior: Standard
catch (const std::runtime_error& e)blocks won't catch this because it's a pointer, not an object - C++ Best Practice Violation: Exceptions should always be thrown by value and caught by reference
Fix
Remove the new keyword:
throw std::runtime_error("Communicators unconfigured, please call "
"configure_communicator() first");Issue 2: Uninitialized Pointer and Missing Bounds Validation
Location
driver/xrt/src/accl.cpp, Lines 1174-1196 (in setup_rendezvous_spare_buffers())
Problem 1: Uninitialized Pointer
The buffer pointer is declared but not initialized:
Buffer<int8_t> *buf;
if (sim_mode) {
buf = new SimBuffer(...);
} else if (cclo->get_device_type() == CCLO::xrt_device) {
buf = new XRTBuffer<int8_t>(...);
} else if (cclo->get_device_type() == CCLO::coyote_device) {
buf = new CoyoteBuffer<int8_t>(...);
}
buf->sync_to_device(); // Potential null/garbage pointer dereference!If none of the conditions match (e.g., a new device type is added but not handled here), buf contains garbage/uninitialized memory, leading to undefined behavior when sync_to_device() is called.
Problem 2: Missing Bounds Validation
The code accesses utility_spares.at(0), .at(1), and .at(2) without validating the vector size:
cclo->write(CCLO_ADDR::SPARE1_OFFSET, utility_spares.at(0)->address() & 0xffffffff);
cclo->write(CCLO_ADDR::SPARE2_OFFSET, utility_spares.at(1)->address() & 0xffffffff);
cclo->write(CCLO_ADDR::SPARE3_OFFSET, utility_spares.at(2)->address() & 0xffffffff);While the loop above should create 3 buffers, if any allocation fails or the logic changes, this will throw an unhelpful std::out_of_range exception.
Problem 3: Magic Number
The loop uses hardcoded 3 without explanation:
for(int i=0; i<3; i++){Impact
- Undefined Behavior: Dereferencing uninitialized pointer if device type doesn't match
- Poor Error Messages:
std::out_of_rangegives no context about what failed - Maintenance Risk: Magic number makes code harder to understand and modify
Fix
void ACCL::setup_rendezvous_spare_buffers(addr_t rndzv_spare_buf_size, const std::vector<int> &devicemem){
constexpr int REQUIRED_SPARE_BUFFERS = 3;
max_rndzv_msg_size = rndzv_spare_buf_size;
for(int i=0; i<REQUIRED_SPARE_BUFFERS; i++){
Buffer<int8_t> *buf = nullptr; // Initialize to nullptr
if (sim_mode) {
buf = new SimBuffer(new int8_t[max_rndzv_msg_size](), max_rndzv_msg_size, dataType::int8,
static_cast<SimDevice *>(cclo)->get_context());
} else if(cclo->get_device_type() == CCLO::xrt_device ){
buf = new XRTBuffer<int8_t>(max_rndzv_msg_size, dataType::int8,
*(static_cast<XRTDevice *>(cclo)->get_device()), devicemem[i % devicemem.size()]);
} else if(cclo->get_device_type() == CCLO::coyote_device){
buf = new CoyoteBuffer<int8_t>(max_rndzv_msg_size, dataType::int8, static_cast<CoyoteDevice *>(cclo));
}
// Validate buffer was created
if (buf == nullptr) {
throw std::runtime_error("Failed to create spare buffer: unsupported device type");
}
buf->sync_to_device();
utility_spares.emplace_back(buf);
}
// Validate we have enough buffers before accessing
if (utility_spares.size() < REQUIRED_SPARE_BUFFERS) {
throw std::runtime_error("Insufficient utility spare buffers allocated: expected " +
std::to_string(REQUIRED_SPARE_BUFFERS) + ", got " +
std::to_string(utility_spares.size()));
}
// Safe to use [] now - bounds validated above
cclo->write(CCLO_ADDR::SPARE1_OFFSET, utility_spares[0]->address() & 0xffffffff);
cclo->write(CCLO_ADDR::SPARE1_OFFSET+4, (utility_spares[0]->address() >> 32) & 0xffffffff);
cclo->write(CCLO_ADDR::SPARE2_OFFSET, utility_spares[1]->address() & 0xffffffff);
cclo->write(CCLO_ADDR::SPARE2_OFFSET+4, (utility_spares[1]->address() >> 32) & 0xffffffff);
cclo->write(CCLO_ADDR::SPARE3_OFFSET, utility_spares[2]->address() & 0xffffffff);
cclo->write(CCLO_ADDR::SPARE3_OFFSET+4, (utility_spares[2]->address() >> 32) & 0xffffffff);
}Environment
- ACCL Version: Latest main branch (cloned 2026-01-27)
- File:
driver/xrt/src/accl.cpp - Compiler: Any C++17 compliant compiler
Severity
- Issue 1: Low (memory leak, incorrect exception handling)
- Issue 2: Medium (potential undefined behavior, poor error messages)
Checklist
- Identified root cause
- Provided minimal reproduction context
- Suggested fix with code
Found during static code analysis of the ACCL codebase.