plait Documentation

Welcome to plait, a PyTorch-inspired framework for building, executing, and optimizing LLM inference pipelines.

Use the navigation on the left to explore guides, tutorials, the API reference, and design documentation.

Highlights

  • PyTorch-like authoring with Module, Parameter, and forward()
  • Automatic DAG capture with async execution and backpressure
  • Resource-aware scheduling and LLM endpoint management
  • Optional feedback-driven optimization
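
To make the second highlight concrete, here is a minimal sketch of the general idea behind running a DAG of async steps with bounded concurrency, a simple form of backpressure. It uses only Python's standard asyncio module and is not plait's API: run_dag, nodes, deps, and max_in_flight are hypothetical names chosen for illustration, and the real framework may work quite differently.

```python
import asyncio


# Hypothetical illustration only: none of these names come from plait.
async def run_dag(nodes, deps, max_in_flight=2):
    """Run async callables in dependency order with bounded concurrency.

    nodes: dict mapping a node name to an async function that takes a
           list of upstream results.
    deps:  dict mapping a node name to the names of its upstream nodes.
    Assumes the graph is acyclic; a cycle would wait forever.
    """
    limiter = asyncio.Semaphore(max_in_flight)  # caps concurrently running nodes
    tasks = {}

    async def run(name):
        # Wait for every upstream node before taking a concurrency slot.
        inputs = [await tasks[d] for d in deps.get(name, [])]
        async with limiter:
            return await nodes[name](inputs)

    # Create every task before any of them starts running, so each
    # lookup in `tasks` succeeds regardless of declaration order.
    for name in nodes:
        tasks[name] = asyncio.create_task(run(name))

    results = await asyncio.gather(*tasks.values())
    return dict(zip(tasks, results))


async def fetch(_inputs):
    await asyncio.sleep(0)  # stand-in for a call to an LLM endpoint
    return "draft"


async def upper(inputs):
    return inputs[0].upper()


async def exclaim(inputs):
    return inputs[0] + "!"


async def join(inputs):
    return " / ".join(inputs)


nodes = {"fetch": fetch, "upper": upper, "exclaim": exclaim, "join": join}
deps = {"upper": ["fetch"], "exclaim": ["fetch"], "join": ["upper", "exclaim"]}

results = asyncio.run(run_dag(nodes, deps))
print(results["join"])  # DRAFT / draft!
```

In this sketch, upper and exclaim can run concurrently once fetch finishes, while the semaphore keeps at most max_in_flight nodes executing at a time. See Getting Started and the API Reference for how plait itself expresses pipelines.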

Where to go next

  • Getting Started: install plait and run your first pipeline
  • Tutorials: end-to-end walkthroughs and recipes
  • API Reference: auto-generated from Google-style docstrings
  • Design: architecture and internals for advanced usage