plait Documentation
Welcome to plait, a PyTorch-inspired framework for building, executing, and optimizing LLM inference pipelines.
Use the navigation on the left to explore guides, tutorials, the API reference, and design documentation.
Highlights
- PyTorch-like authoring with Module, Parameter, and forward()
- Automatic DAG capture with async execution and backpressure
- Resource-aware scheduling and LLM endpoint management
- Optional feedback-driven optimization
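To give a feel for the authoring style named in the first highlight, here is a minimal, self-contained sketch of the Module / Parameter / forward() pattern. It does not import plait and is not plait's API: the `Module` and `Parameter` classes below are simplified stand-ins, and `Prompt`, `Pipeline`, and `parameters()` are hypothetical names chosen for illustration. See Getting Started and the API Reference for the real classes and signatures.

```python
from dataclasses import dataclass


@dataclass
class Parameter:
    """Stand-in for a tunable value, here a prompt template (hypothetical)."""
    value: str


class Module:
    """Stand-in base class: calling a module runs its forward() (hypothetical)."""

    def forward(self, text: str) -> str:
        raise NotImplementedError

    def __call__(self, text: str) -> str:
        return self.forward(text)

    def parameters(self):
        # Yield Parameter attributes of this module and of any child modules.
        for attr in vars(self).values():
            if isinstance(attr, Parameter):
                yield attr
            elif isinstance(attr, Module):
                yield from attr.parameters()


class Prompt(Module):
    def __init__(self, template: str):
        self.template = Parameter(template)

    def forward(self, text: str) -> str:
        # A real pipeline would send this to an LLM endpoint; here we only format.
        return self.template.value.format(text=text)


class Pipeline(Module):
    def __init__(self):
        # Child modules are plain attributes, as in PyTorch.
        self.summarize = Prompt("Summarize: {text}")
        self.critique = Prompt("Critique: {text}")

    def forward(self, text: str) -> str:
        # Composition is ordinary Python: one module's output feeds the next.
        return self.critique(self.summarize(text))


pipeline = Pipeline()
result = pipeline("hello")
```

The point of the pattern is that a pipeline is written as ordinary Python composition, while the framework can walk the module tree to find tunable parameters. How plait captures the DAG, schedules calls, and applies feedback is not shown here.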
Where to go next
- Getting Started: install plait and run your first pipeline
- Tutorials: end-to-end walkthroughs and recipes
- API Reference: auto-generated from Google-style docstrings
- Design: architecture and internals for advanced usage