can we look into concurrency, parallelism and/or multithreading to accelerate inference calls?
we could look at bringing in tokio as our async runtime and moving towards more async rust
a multithreaded inference client running predictions at breakneck speeds would be really cool
i'm also keeping an eye on the rayon crate for data parallelism, esp. since it's been getting a lot of attention in rustacean circles online
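
rough sketch of what the multithreaded idea could look like using only the standard library (`std::thread::scope`, so no tokio or rayon yet). `predict` and `predict_parallel` are made-up names for illustration, and `predict` is just a placeholder for whatever our real inference call ends up being:

```rust
use std::thread;

// Hypothetical stand-in for a real inference call (assumption, not our actual client).
fn predict(input: f32) -> f32 {
    input * 2.0
}

// Split the inputs into roughly equal chunks and run each chunk on its own
// scoped thread. Results come back in the same order as the inputs.
fn predict_parallel(inputs: &[f32], workers: usize) -> Vec<f32> {
    let workers = workers.max(1);
    // Ceiling division; clamp to 1 because `chunks(0)` would panic.
    let chunk_size = ((inputs.len() + workers - 1) / workers).max(1);

    thread::scope(|s| {
        // Spawn every worker first so they run concurrently...
        let handles: Vec<_> = inputs
            .chunks(chunk_size)
            .map(|chunk| {
                s.spawn(move || chunk.iter().map(|&x| predict(x)).collect::<Vec<f32>>())
            })
            .collect();

        // ...then join them in spawn order to preserve input order.
        handles
            .into_iter()
            .flat_map(|h| h.join().expect("worker thread panicked"))
            .collect()
    })
}

fn main() {
    let inputs: Vec<f32> = (0..8).map(|i| i as f32).collect();
    let outputs = predict_parallel(&inputs, 4);
    println!("{:?}", outputs);
}
```

this only pays off if the inference call is cpu-bound and safe to run from several threads at once; if the calls are mostly waiting on network or disk, the async route is probably the better fit.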