LLM inference systems / AI Agent infrastructure / open-source backend patches
CS grad student working on practical LLM infrastructure: reproducible bug fixes, upstream-friendly patches, inference serving internals, and training workflows that can survive outside a notebook.
My public work focuses on three connected areas:
- LLM inference systems: reading and patching serving engines — scheduler/runtime behavior, memory pressure, and CUDA-facing execution paths.
- AI Agent infrastructure: bot frameworks, tool calling, plugin backends, message routing, and reliability issues in agentic systems.
- Open-source backend patches: small reproductions, conservative fixes, tests, and PRs that are easy for maintainers to review.
| Repository | Why it is pinned |
|---|---|
| AstrBot | Fork and PR work for upstream open-source agent infrastructure. |
| vLLM | Fork and learning notes for LLM inference systems and serving internals. |
| tcmalloc memory pool | C++ memory pool and systems performance study. |
| Nemotron-Model | LLM SFT / LoRA / Kaggle training engineering. |
Auto-updated daily from upstream repositories.
- AstrBotDevs/AstrBot#7751 - fix: prevent path traversal in file uploads (merged 2026-04-24)
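As a sketch of the class of fix that PR belongs to (the helper below is hypothetical and illustrative, not the actual AstrBot patch): path traversal in file uploads is usually prevented by resolving the requested name under the upload root and rejecting anything that escapes it.

```python
import os

def safe_join(upload_root: str, filename: str) -> str:
    """Resolve filename under upload_root, rejecting path traversal.

    Hypothetical helper illustrating the general technique; names and
    behavior are not taken from the AstrBot codebase.
    """
    root = os.path.realpath(upload_root)
    candidate = os.path.realpath(os.path.join(root, filename))
    # A request like "../../etc/passwd" resolves outside root once
    # symlinks and ".." segments are normalized away.
    if os.path.commonpath([root, candidate]) != root:
        raise ValueError(f"path traversal rejected: {filename!r}")
    return candidate
```

The key detail is comparing *resolved* paths rather than string prefixes, so `..` segments and symlinks cannot smuggle a write outside the upload directory.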
- Reproduce the bug before changing code.
- Prefer small patches that match the existing project style.
- Add tests or focused verification whenever the behavior can regress.
- Keep PRs easy to review: clear scope, clear reason, clear result.
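A minimal sketch of how those principles look in practice (the function and bug here are invented for illustration, not from any repository above): write the regression test against the buggy behavior first, so it fails before the patch and passes after.

```python
# Hypothetical example: a one-line route-normalization bug and the
# regression test that pins it down before the fix is written.

def normalize_route(path: str) -> str:
    """Collapse duplicate slashes and drop the trailing one.

    This is the fixed behavior; the imagined bug left trailing slashes
    in place, so "/v1/chat/" and "/v1/chat" routed differently.
    """
    parts = [p for p in path.split("/") if p]
    return "/" + "/".join(parts)

def test_trailing_slash_routes_identically():
    # Reproduction first: this assertion encodes the original bug report
    # and would have failed before the patch.
    assert normalize_route("/v1/chat/") == normalize_route("/v1/chat")
```

Shipping the test alongside the one-line fix keeps the PR small while still guarding against the behavior regressing later.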
- Reading vLLM from the user API down into engine, scheduler, memory, and CUDA operator boundaries.
- Looking for real bugs in LLM inference and agent infrastructure projects where a small backend patch can help upstream.
- Turning training experiments into repeatable SFT / LoRA workflows instead of one-off notebooks.
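For context, a minimal numpy sketch of the LoRA idea those workflows build on (shapes and hyperparameters here are illustrative, not from any specific training run): the frozen weight `W` is adapted through a low-rank product `B @ A`, so only `r * (d_in + d_out)` parameters train instead of `d_in * d_out`.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, r = 64, 64, 8                  # illustrative sizes, rank r << d
alpha = 16                                  # LoRA scaling hyperparameter

W = rng.standard_normal((d_out, d_in))      # frozen pretrained weight
A = rng.standard_normal((r, d_in)) * 0.01   # trainable down-projection
B = np.zeros((d_out, r))                    # trainable up-projection, zero-init

def lora_forward(x: np.ndarray) -> np.ndarray:
    # Effective weight is W + (alpha / r) * B @ A; with B zero-initialized,
    # the adapter starts as an exact no-op on the pretrained model.
    return x @ (W + (alpha / r) * (B @ A)).T
```

With `B` initialized to zeros, `lora_forward` matches the frozen model exactly at step 0, and only `A` and `B` (1,024 parameters here versus 4,096 in `W`) receive gradient updates.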