G-B-KEVIN-ARJUN

Pinned

size-precision-slm-bench (Public)

"Is it better to run a tiny model (2B-4B) at high precision (FP16/INT8), or a large model (8B+) at low precision (INT4)?" This benchmark framework allows developers to scientifically choose the best…

Python 1
Image-Captioning-with-Visual-Attention (Public)

The model tackles the task of automatically generating captions for images by focusing on different regions of the image as it generates each word in the caption. A ResNet model is employed to enco…

Python
runtime-inference (Public)

"Faster AI: Accelerating Qwen 2.5 from 7 t/s to 82 t/s on a single RTX 4060 using Llama.cpp and ONNX." A comparative analysis of LLM inference runtimes (PyTorch, ONNX, Llama.cpp) on consumer hardwar…

Python
auto-dataset-annotator (Public)

A zero-shot annotation engine that outperforms manual YOLO labeling by generating training data from scratch using only text prompts. Leverages Grounding DINO and SAM to create high-precision datas…

Python
microservices (Public)

scaleDown AI: Enterprise Model Quantization Platform. Slash inference costs by 70%. Deploy LLMs anywhere. A microservices-based orchestration engine for isolating and automating incompatible AI opti…

Python