I'm an Inference Engine Technology Expert at Alibaba (Taotian Group), working as a core architect and developer of MNN, a blazing-fast deep learning inference engine with 13,000+ GitHub stars. I'm passionate about making AI, especially LLMs, run efficiently on edge devices.
- 🔭 Currently working on On-Device LLM Deployment at Alibaba MNN Team
- 🌱 Exploring LLM inference optimization, quantization, and speculative decoding
- 🏠 Founder of MNN-LLM - LLM deployment on mobile devices
- 📖 Check out my Tech Blog for AI deployment insights
| Project | Description |
|---|---|
| llm-export ⭐ 344 | Export LLM models to ONNX format for cross-platform deployment |
| tokenizer.cpp ⭐ 24 | Lightweight C++ library for LLM tokenization (HuggingFace compatible) |
| mnn-asr ⭐ 25 | MNN-based Automatic Speech Recognition demo |
| mnn-tts ⭐ 19 | MNN-based Text-to-Speech demo |
| jinja.cpp ⭐ 18 | Single-header C++11 Jinja2 engine for LLM chat templates |
| llm-lab ⭐ 9 | LLM experiments and research notes |
⭐ Star counts are from the repositories under the wangzhaode GitHub account.