Reusing a thread variable across iterations with parallelFor #45
Unanswered
andrewmarx asked this question in Q&A
If I understand correctly, you really need access to the state of individual threads. This is not directly possible with parallelFor(), but you can give each thread its own vector and look it up from the loop index:
// create a dedicated vector of zeros for every thread
int nThreads = 4;
int sz = 100;
std::vector<VectorXd> xs(nThreads);
for (auto &x : xs)
    x = VectorXd::Zero(sz);
auto fun = [&] (unsigned int i) {
    // for convenience: short handle for zero vector dedicated to current thread
    // (one batch per thread, so the batch index picks the vector)
    VectorXd& x = xs[(i * nThreads) / sz];
    x(i) = 1;
    // do something with x ...
    x(i) = 0; // reset so x is all zeros again for the next iteration
};
parallelFor(0, sz, fun, nThreads, nThreads); // number of threads = number of batches
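For anyone who wants to try the idea outside of R, here is a minimal standard-library sketch of the same pattern. It is not RcppThread code: it assumes plain std::thread and std::vector<double> in place of parallelFor() and VectorXd, and forEachWithScratch is a hypothetical helper name. The batches are made explicit, so each worker is handed its batch number directly instead of recovering it from the loop index, which also works when the length is not a multiple of the thread count.

```cpp
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical helper (not part of RcppThread): calls body(i, x) for every
// i in [0, n), split into nThreads contiguous batches. Each batch owns one
// scratch vector x of length n that is allocated once and holds a single 1.0
// at position i while body runs. Returns the per-batch sums of body's results.
template <class Body>
std::vector<double> forEachWithScratch(std::size_t n, std::size_t nThreads, Body body) {
    // one zero-filled scratch vector per batch, allocated up front
    std::vector<std::vector<double>> scratch(nThreads, std::vector<double>(n, 0.0));
    std::vector<double> partial(nThreads, 0.0);
    std::vector<std::thread> workers;
    for (std::size_t b = 0; b < nThreads; ++b) {
        // contiguous batch [begin, end); sizes differ by at most one
        std::size_t begin = b * n / nThreads;
        std::size_t end = (b + 1) * n / nThreads;
        workers.emplace_back([&, b, begin, end] {
            std::vector<double>& x = scratch[b]; // only this worker touches it
            for (std::size_t i = begin; i < end; ++i) {
                x[i] = 1.0;               // move the single non-zero entry
                partial[b] += body(i, x); // do something with x
                x[i] = 0.0;               // restore zeros for the next iteration
            }
        });
    }
    for (auto& w : workers) w.join();
    return partial;
}
```

The memory cost is nThreads vectors of length n in total, rather than one fresh vector per iteration.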
For various reasons, I'm experimenting with swapping out RcppParallel for RcppThread in my package. Key to the performance of my package is being able to reuse the same copy of a variable across iterations of the for loop, with each thread getting its own copy. The reason this is important for me is that the variable is an RcppEigen vector with a length equal to the total number of iterations, and the minimum number of iterations my package needs to support is a million. That's almost 4TB of memory allocations for what is, ironically, essentially a vector of zeros with only a single non-zero value that has its position updated based on the iteration. Ideally, I'd like to make 100 million iterations viable... 😬 So in RcppParallel, I could initialize a copy of this variable in the worker, then in the for loop simply update which element had the non-zero value, avoiding an excessive amount of memory allocations just to change 8 bytes.
My question is, is it possible to do something like this with parallelFor()? Essentially, have each thread get a copy of this vector that they reuse across iterations? If not, it looks like I could conceivably do this by getting creative with the thread pools, but parallelFor() is so much nicer to use.
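One way to get "each thread keeps its own copy" without knowing which thread runs which iteration is a thread_local buffer. The sketch below is only an illustration with the standard library: threadScratch and runLoop are hypothetical names, the hand-rolled worker loop stands in for a parallel loop, and std::vector<double> stands in for the Eigen vector. Whether this carries over to parallelFor() depends on its worker threads staying alive for the whole loop, which is an assumption here and not something checked against RcppThread.

```cpp
#include <atomic>
#include <cstddef>
#include <thread>
#include <vector>

// Returns the calling thread's scratch vector. It is allocated the first time
// a thread asks for it (or when the requested length changes) and reused on
// every later call from that thread.
std::vector<double>& threadScratch(std::size_t n) {
    thread_local std::vector<double> x;
    if (x.size() != n) x.assign(n, 0.0);
    return x;
}

// Hypothetical stand-in for a parallel loop: nThreads workers pull indices
// from a shared counter, so any worker may end up running any iteration.
// Returns how many iterations saw a vector whose entries sum to exactly 1.
std::size_t runLoop(std::size_t n, std::size_t nThreads) {
    std::atomic<std::size_t> next{0};
    std::atomic<std::size_t> ok{0};
    std::vector<std::thread> workers;
    for (std::size_t t = 0; t < nThreads; ++t) {
        workers.emplace_back([&, n] {
            for (std::size_t i = next++; i < n; i = next++) {
                std::vector<double>& x = threadScratch(n); // no new allocation after the first call
                x[i] = 1.0;                                // move the single non-zero entry
                double sum = 0.0;
                for (double v : x) sum += v;               // placeholder for the real work
                if (sum == 1.0) ++ok;
                x[i] = 0.0;                                // restore zeros before the next iteration
            }
        });
    }
    for (auto& w : workers) w.join();
    return ok.load();
}
```

With this layout the number of allocations is bounded by the number of threads, not the number of iterations. The trade-off is that each buffer stays allocated until its thread exits.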