Welcome to lhotse’s documentation!
Contents:
- Getting started
- Representing a corpus
- Cuts
- Feature extraction
- Executing tasks in parallel
- PyTorch Datasets
  - A quick re-cap of PyTorch’s data API
  - About Lhotse’s Datasets and Samplers
  - Restoring sampler’s state: continuing the training
  - Resumable Stateful Dataloading (Indexed)
  - Batch I/O: pre-computed vs. on-the-fly features
  - Handling random seeds
  - Customizing sampling constraints
  - Dataset’s list
  - Sampler’s list
  - Input strategies’ list
  - Augmentation - transforms on cuts
  - Augmentation - transforms on signals
  - Collation utilities for building custom Datasets
  - Dataloading seeding utilities
- Indexed Manifests and IteratorNodes
  - What an indexed manifest is
  - Creating indexes
  - Reading indexed data
  - How iterator composition works
  - Three important capabilities
  - Property summary for built-in iterators
  - Checkpointing: graph tokens
  - Why this enables O(1) restore
  - Worker-process restore
  - Implementing a new IteratorNode
  - Checkpointable does not imply O(1)
  - Minimal stateless transform node
  - Stateful node with RNG or cursor state
  - When a node should not support exact restore
  - Runtime metadata rules
  - Testing new IteratorNodes
- Kaldi Interoperability
- Command-line interface
- API Reference