Wasm is becoming the runtime for LLMs

Today’s LLM apps, including inference apps and agents, are mostly written in Python. But this is about to change. Python is too slow, too bloated, and too complicated to install and manage. That’s why popular LLM frameworks such as llama2.c, whisper.cpp, and llama.cpp all strive for zero Python dependencies. They are written in compiled languages (C/C++/Rust) and can be compiled to Wasm. With WASI NN, you can create complex LLM apps in Rust and run them in Wasm sandboxes. The resulting toolchain for developing and running LLM apps is lighter weight, higher performance, and more portable. In this talk, Michael will demonstrate how to run the llama2 series of models in Wasm and how to develop LLM agents in Rust and run them in Wasm. In-production use cases, such as LLM-based code review and assistants built on your own knowledge base, will be discussed and demoed.
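
To give a flavor of the WASI NN workflow mentioned above, here is a minimal Rust sketch, assuming the wasmedge-wasi-nn crate and a GGUF model preloaded by the WasmEdge host under the alias "default"; the crate name, model alias, prompt, and buffer size are illustrative assumptions, not the exact code shown in the talk.

    // Minimal sketch: prompting a llama2-family model through WASI NN from
    // Rust compiled to wasm32-wasi. Assumes the host preloaded a model,
    // e.g. with --nn-preload default:GGML:AUTO:llama-2-7b-chat.gguf.
    use wasmedge_wasi_nn::{ExecutionTarget, GraphBuilder, GraphEncoding, TensorType};

    fn main() {
        // Load the preloaded GGML/GGUF model by its alias ("default" is an assumption).
        let graph = GraphBuilder::new(GraphEncoding::Ggml, ExecutionTarget::AUTO)
            .build_from_cache("default")
            .expect("failed to load the preloaded model");
        let mut ctx = graph
            .init_execution_context()
            .expect("failed to create an execution context");

        // Pass the prompt to the model as a UTF-8 byte tensor at input index 0.
        let prompt = "Explain why Wasm is a good runtime for LLM apps.";
        ctx.set_input(0, TensorType::U8, &[1], prompt.as_bytes())
            .expect("failed to set the input tensor");

        // Inference runs in the host's WASI NN backend (llama.cpp in WasmEdge),
        // while this guest code stays inside the Wasm sandbox.
        ctx.compute().expect("inference failed");

        // Copy the generated text out of output index 0.
        let mut output = vec![0u8; 4096];
        let n = ctx.get_output(0, &mut output).expect("failed to read the output");
        println!("{}", String::from_utf8_lossy(&output[..n]));
    }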

Michael Yuan

Speaker Profile

Michael Yuan

Second State/WasmEdge

Founder/Maintainer

juntao

Dr. Michael Yuan is a maintainer of the WasmEdge project and a co-founder of Second State. He is the author of five books on software engineering published by Addison-Wesley, Prentice-Hall, and O’Reilly. Michael is a long-time open-source developer and contributor. He has previously spoken at many industry conferences, including Open Source Summit, RustLab Conference, and KubeCon.