
Hi there, I'm Tianyu Xiong 👋

🎓 About Me

I am a Master's student at UESTC, spending my time at the intersection of Computer Architecture and System Software. I enjoy digging into the "black box" between hardware and software to see how things actually work under the hood.

  • 🛠️ What I'm working on:
    • Deeply involved in Modern C++ (11/14/17) and Linux System Programming.
    • Building ZedInfer, a lightweight inference engine, to explore CUDA and AVX-512 (SIMD); a small sketch of that kind of SIMD code follows this list.
    • Messing around with GEM5 and NEMU for hybrid simulation and performance analysis.
  • 📝 Research & Projects:
    • Currently improving program analysis efficiency through the SimPoint+ project.
    • Holding a patent on execution-time prediction, basically trying to make simulations less of a "guessing game".
  • 🎯 Interests: Passionate about HPC, Kernel Optimization, and building highly efficient AI Infrastructure to squeeze out every last bit of hardware performance.
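
Here is the promised sketch: a minimal, illustrative AVX-512 loop (a made-up helper, not actual ZedInfer code) that computes y += a * x with a masked tail. It assumes an AVX-512F-capable CPU and a compiler flag like -mavx512f.

```cpp
#include <immintrin.h>
#include <cstddef>

// Illustrative only: y[i] += a * x[i] for i in [0, n), using AVX-512 FMA.
// Processes 16 floats per iteration and uses a mask for the leftover tail.
void axpy_avx512(float a, const float* x, float* y, std::size_t n) {
    const __m512 va = _mm512_set1_ps(a);        // broadcast a to all 16 lanes
    std::size_t i = 0;
    for (; i + 16 <= n; i += 16) {
        __m512 vx = _mm512_loadu_ps(x + i);     // unaligned load of 16 floats
        __m512 vy = _mm512_loadu_ps(y + i);
        vy = _mm512_fmadd_ps(va, vx, vy);       // vy = a * vx + vy (fused multiply-add)
        _mm512_storeu_ps(y + i, vy);
    }
    if (i < n) {                                // handle the final n - i (< 16) elements
        __mmask16 m = static_cast<__mmask16>((1u << (n - i)) - 1u);
        __m512 vx = _mm512_maskz_loadu_ps(m, x + i);
        __m512 vy = _mm512_maskz_loadu_ps(m, y + i);
        vy = _mm512_fmadd_ps(va, vx, vy);
        _mm512_mask_storeu_ps(y + i, m, vy);
    }
}
```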

🌐 Connect with Me


📊 GitHub Stats

Pinned

  1. zedinfer (C++)

    A lightweight, high-performance LLM inference engine in C++ for edge AI PCs, featuring graph-runtime separation and heterogeneous acceleration.

  2. llaisys (C++)

    A learning-oriented LLM inference system built from scratch, focusing on operator-level correctness validation and performance evaluation on CPU; it serves as an experimental platform for ZedInfer.

  3. cuda-kernels (CUDA)

    A collection of high-performance CUDA kernels and experiments for learning and optimizing GPU compute primitives.

  4. matmul-cpu (C++)

    High-performance CPU GEMM kernels (C = A·Bᵀ + C) optimized for LLM inference, featuring AVX2/AVX-512 SIMD and multi-threading, benchmarked against OpenBLAS. A plain reference version of this operation is sketched after this list.

  5. nju-pa-2023fall (C)

    Programming assignments for NJU ICS (Introduction to Computer Systems), featuring NEMU, a high-performance full-system emulator, and OS-level experiments.

  6. xv6-labs-2020fall (C)

    MIT 6.S081 (Operating System Engineering) labs based on xv6, implementing core OS mechanisms including processes, virtual memory, file systems, and concurrency.
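
For reference, the C = A·Bᵀ + C operation that matmul-cpu targets looks like this in plain, unoptimized C++. The function name, row-major layout, and signature below are assumptions for illustration rather than the repo's actual interface; this is the kind of baseline a SIMD, multi-threaded kernel gets validated and benchmarked against.

```cpp
#include <cstddef>

// Reference (scalar) GEMM: C = A * B^T + C, with A (M x K), B (N x K),
// and C (M x N), all row-major. Illustrative only, not repo code.
void gemm_abt_ref(const float* A, const float* B, float* C,
                  std::size_t M, std::size_t N, std::size_t K) {
    for (std::size_t i = 0; i < M; ++i) {
        for (std::size_t j = 0; j < N; ++j) {
            float acc = 0.0f;
            // Row i of A dotted with row j of B (i.e. column j of B^T);
            // both rows are contiguous, which is what makes the B^T form
            // friendly to SIMD vectorization.
            for (std::size_t k = 0; k < K; ++k) {
                acc += A[i * K + k] * B[j * K + k];
            }
            C[i * N + j] += acc;   // accumulate into C, matching C = A*B^T + C
        }
    }
}
```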