efficientLLM

To run the notebooks in your local environment, first install the dependencies:

    !pip install -r requirements.txt

Then download the model:

    python3 model_downdloader.py

LLM optimizations to decrease latency:

1. KV model
2. Batching
3. Continuous Batching
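
The difference between plain batching and continuous batching can be illustrated with a small scheduling sketch. It is not taken from the notebooks: the function names, the workload, and the batch size are hypothetical. It also assumes that every decode step costs the same regardless of how many slots are occupied, and it ignores prompt processing. It only counts how many decode steps each strategy needs to finish the same set of requests.

```python
from collections import deque


def static_batching_steps(lengths, batch_size):
    """Decode steps when each batch runs until its longest request finishes."""
    steps = 0
    for start in range(0, len(lengths), batch_size):
        batch = lengths[start:start + batch_size]
        steps += max(batch)  # short requests idle until the longest one is done
    return steps


def continuous_batching_steps(lengths, batch_size):
    """Decode steps when a freed slot is refilled from the queue right away."""
    waiting = deque(lengths)
    running = []  # tokens still to generate for each active request
    steps = 0
    while waiting or running:
        # Admit waiting requests into any free slots before the next step.
        while waiting and len(running) < batch_size:
            running.append(waiting.popleft())
        # One decode step produces one token for every active request.
        running = [remaining - 1 for remaining in running]
        # Finished requests leave the batch and free their slots.
        running = [remaining for remaining in running if remaining > 0]
        steps += 1
    return steps


# Hypothetical workload: number of tokens each request needs to generate.
lengths = [2, 8, 3, 8, 2, 2, 3, 3]
print(static_batching_steps(lengths, batch_size=4))      # 11
print(continuous_batching_steps(lengths, batch_size=4))  # 8
```

With this workload the static schedule waits for the two 8-token requests before starting the second batch, while the continuous schedule starts the short requests as soon as slots free up, so it finishes in fewer steps.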