On my current setup, I measured the following performance numbers for CPU vs GPU acquisition at an 8.192 MHz sampling rate:
| Acquisition (type) | Ipp (float) | Ipp (double) | ArrayFire (float) | ArrayFire (double) |
|---|---|---|---|---|
| Time (ms) | 134 | 303 | 1269 | 1394 |
The ArrayFire timings are heavily affected by calculating the maximum value and its position, as well as the statistical parameters here. This approach transfers small amounts of data (4 numbers) over PCI-Express, which is a performance killer.
In my sandbox, I've managed to reduce the acquisition time to ~500 ms by making those functions return af::array (keeping the data on the GPU). It would require the following code modification, which I don't think is necessary for now.
However, let's keep this documented in case I'd like to return to this in the future.
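For future reference, here is a rough, library-free sketch of the idea in plain C++. It is not the actual modification and does not use ArrayFire: `DeviceBuffer`, `MockDevice`, `Stats`, `statsPerScalar` and `statsPacked` are all hypothetical names, and the choice of mean and variance as the "statistical parameters" is an assumption. The mock only counts device-to-host copies, to show how returning device-side handles (the role af::array would play) lets the four numbers cross the bus in one copy instead of four. Packing them into a single buffer is my own assumption about how the results would eventually be read back.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical stand-in for GPU memory; this is NOT an ArrayFire type.
struct DeviceBuffer {
    std::vector<double> data;
};

// The four numbers read back per acquisition (mean/variance are assumed).
struct Stats {
    double maxValue, maxIndex, mean, variance;
};

// Hypothetical mock device that counts device-to-host copies.
struct MockDevice {
    std::size_t transfers = 0;

    // Device-to-host copy: the step that would cross PCI-Express.
    std::vector<double> toHost(const DeviceBuffer& b) {
        ++transfers;
        return b.data;
    }

    // Wrap a single value in a one-element "device" buffer.
    static DeviceBuffer scalar(double v) {
        return DeviceBuffer{std::vector<double>{v}};
    }

    // The reductions below leave their results "on the device".
    DeviceBuffer maxValue(const DeviceBuffer& in) const {
        assert(!in.data.empty());
        return scalar(*std::max_element(in.data.begin(), in.data.end()));
    }
    DeviceBuffer maxIndex(const DeviceBuffer& in) const {
        assert(!in.data.empty());
        auto it = std::max_element(in.data.begin(), in.data.end());
        return scalar(static_cast<double>(it - in.data.begin()));
    }
    DeviceBuffer mean(const DeviceBuffer& in) const {
        assert(!in.data.empty());
        double sum = std::accumulate(in.data.begin(), in.data.end(), 0.0);
        return scalar(sum / static_cast<double>(in.data.size()));
    }
    DeviceBuffer variance(const DeviceBuffer& in) const {
        double m = mean(in).data[0];
        double acc = 0.0;
        for (double v : in.data) acc += (v - m) * (v - m);
        return scalar(acc / static_cast<double>(in.data.size()));
    }
    // Concatenate buffers without leaving the "device".
    DeviceBuffer join(const std::vector<DeviceBuffer>& parts) const {
        DeviceBuffer out;
        for (const DeviceBuffer& p : parts)
            out.data.insert(out.data.end(), p.data.begin(), p.data.end());
        return out;
    }
};

// Each reduction is copied to the host on its own: four copies.
Stats statsPerScalar(MockDevice& dev, const DeviceBuffer& in) {
    Stats s{};
    s.maxValue = dev.toHost(dev.maxValue(in))[0];
    s.maxIndex = dev.toHost(dev.maxIndex(in))[0];
    s.mean     = dev.toHost(dev.mean(in))[0];
    s.variance = dev.toHost(dev.variance(in))[0];
    return s;
}

// Results stay on the device, are packed together, and copied once.
Stats statsPacked(MockDevice& dev, const DeviceBuffer& in) {
    DeviceBuffer packed = dev.join({dev.maxValue(in), dev.maxIndex(in),
                                    dev.mean(in), dev.variance(in)});
    std::vector<double> h = dev.toHost(packed);
    return {h[0], h[1], h[2], h[3]};
}
```

Both functions produce the same four numbers. The only difference is how many times `toHost` is called, which is the cost the mock is meant to make visible. It says nothing about real timings.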