Command-line interface

lhotse obtain

Command group for downloading and extracting data.

lhotse obtain [OPTIONS] COMMAND [ARGS]...

heroico

Heroico download.

lhotse obtain heroico [OPTIONS] TARGET_DIR

Arguments

TARGET_DIR

Required argument

librimix

Mini LibriMix download.

lhotse obtain librimix [OPTIONS] TARGET_DIR

Arguments

TARGET_DIR

Required argument

mini-librispeech

Mini LibriSpeech download.

lhotse obtain mini-librispeech [OPTIONS] TARGET_DIR

Arguments

TARGET_DIR

Required argument

tedlium

TED-LIUM v3 download (approx. 11GB).

lhotse obtain tedlium [OPTIONS] TARGET_DIR

Arguments

TARGET_DIR

Required argument

lhotse prepare

Command group with data preparation recipes.

lhotse prepare [OPTIONS] COMMAND [ARGS]...

broadcast-news

English Broadcast News 1997 data preparation. It will output three manifests: for recordings, topic sections, and speech segments. It supports the following LDC distributions:

* 1997 English Broadcast News Train (HUB4)
  * Speech: LDC98S71
  * Transcripts: LDC98T28

This data is not available for free - your institution needs to have an LDC subscription.

lhotse prepare broadcast-news [OPTIONS] AUDIO_DIR TRANSCRIPT_DIR OUTPUT_DIR

Arguments

AUDIO_DIR

Required argument

TRANSCRIPT_DIR

Required argument

OUTPUT_DIR

Required argument

heroico

Heroico Answers ASR data preparation.

lhotse prepare heroico [OPTIONS] SPEECH_DIR TRANSCRIPT_DIR OUTPUT_DIR

Arguments

SPEECH_DIR

Required argument

TRANSCRIPT_DIR

Required argument

OUTPUT_DIR

Required argument

librimix

LibriMix source separation data preparation.

lhotse prepare librimix [OPTIONS] LIBRIMIX_CSV OUTPUT_DIR

Options

--sampling-rate <sampling_rate>

Sampling rate to set in the RecordingSet manifest.

--min-segment-seconds <min_segment_seconds>

Remove segments shorter than MIN_SEGMENT_SECONDS.

--with-precomputed-mixtures, --no-precomputed-mixtures

Optionally create a RecordingSet manifest including the precomputed LibriMix mixtures.

Arguments

LIBRIMIX_CSV

Required argument

OUTPUT_DIR

Required argument

mini-librispeech

Mini LibriSpeech ASR data preparation.

lhotse prepare mini-librispeech [OPTIONS] CORPUS_DIR OUTPUT_DIR

Arguments

CORPUS_DIR

Required argument

OUTPUT_DIR

Required argument

switchboard

The Switchboard corpus preparation.

This is conversational telephone speech collected as 2-channel, 8kHz-sampled
data. We are using just the Switchboard-1 Phase 1 training data.
The catalog number LDC97S62 (Switchboard-1 Release 2) corresponds, we believe,
to what we have. We also use the Mississippi State transcriptions, which
we download separately from

This data is not available for free - your institution needs to have an LDC subscription.

lhotse prepare switchboard [OPTIONS] AUDIO_DIR OUTPUT_DIR

Options

--transcript-dir <transcript_dir>
--sentiment-dir <sentiment_dir>

Optional path to LDC2020T14 package with sentiment annotations for SWBD.

--omit-silence, --retain-silence

Whether the [silence] segments should be kept.

Arguments

AUDIO_DIR

Required argument

OUTPUT_DIR

Required argument

tedlium

TED-LIUM v3 recording and supervision manifest preparation.

lhotse prepare tedlium [OPTIONS] TEDLIUM_DIR OUTPUT_DIR

Arguments

TEDLIUM_DIR

Required argument

OUTPUT_DIR

Required argument

lhotse cut

Group of commands used to create CutSets.

lhotse cut [OPTIONS] COMMAND [ARGS]...

append

Create a new CutSet by appending the cuts in CUT_MANIFESTS. CUT_MANIFESTS are iterated position-wise (the cuts at the i-th position in each manifest are appended to each other). The cuts are appended in the order in which they appear in the input argument list. If CUT_MANIFESTS have different lengths, the script stops once the shortest CutSet is depleted.

lhotse cut append [OPTIONS] [CUT_MANIFESTS]... OUTPUT_CUT_MANIFEST

Arguments

CUT_MANIFESTS

Optional argument(s)

OUTPUT_CUT_MANIFEST

Required argument
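
The position-wise iteration described for append can be illustrated with a small plain-Python sketch. It is not lhotse code: cuts are modelled as plain strings, and the function name append_positionwise is hypothetical. It only shows the "stop at the shortest manifest" behaviour.

```python
def append_positionwise(*manifests):
    """Group the i-th cuts of every manifest, in argument order.

    Each manifest is modelled as a list of cut names (an assumption made
    for illustration; real manifests hold Cut objects).
    """
    # zip() stops as soon as the shortest input is exhausted, which mirrors
    # "the script stops once the shortest CutSet is depleted".
    return [tuple(group) for group in zip(*manifests)]


first = ["a1", "a2", "a3"]
second = ["b1", "b2"]
# "a3" has no partner in the shorter manifest, so it is dropped.
print(append_positionwise(first, second))
```

Each tuple in the result stands for one appended cut, with its parts in the order the manifests were given.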

mix-by-recording-id

Create a CutSet stored in OUTPUT_CUT_MANIFEST by matching the Cuts from CUT_MANIFESTS by their recording IDs and mixing them together.

lhotse cut mix-by-recording-id [OPTIONS] [CUT_MANIFESTS]...
                               OUTPUT_CUT_MANIFEST

Arguments

CUT_MANIFESTS

Optional argument(s)

OUTPUT_CUT_MANIFEST

Required argument
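
Matching cuts by recording ID amounts to a group-by across the input manifests. The sketch below shows only that grouping step in plain Python. The dictionary layout and the name group_by_recording_id are assumptions made for illustration, and how lhotse treats a recording ID that appears in only one manifest is not stated here, so the sketch simply keeps it as a one-element group.

```python
from collections import defaultdict


def group_by_recording_id(*manifests):
    """Collect the IDs of cuts that share a recording ID across manifests."""
    groups = defaultdict(list)
    for manifest in manifests:
        for cut in manifest:
            # Cuts are modelled as dicts with "id" and "recording_id" keys.
            groups[cut["recording_id"]].append(cut["id"])
    return dict(groups)


left = [{"id": "l1", "recording_id": "rec-a"}, {"id": "l2", "recording_id": "rec-b"}]
right = [{"id": "r1", "recording_id": "rec-b"}, {"id": "r2", "recording_id": "rec-a"}]
# Each group lists the cuts that would be mixed together.
print(group_by_recording_id(left, right))
```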

mix-sequential

Create a CutSet stored in OUTPUT_CUT_MANIFEST by iterating jointly over CUT_MANIFESTS and mixing the Cuts at the same positions. For example, the first output cut is created from the first cut in each input manifest. The mix is performed by summing the features from all Cuts. If the CUT_MANIFESTS contain different numbers of Cuts, mixing ends when the shortest manifest is depleted.

lhotse cut mix-sequential [OPTIONS] [CUT_MANIFESTS]... OUTPUT_CUT_MANIFEST

Arguments

CUT_MANIFESTS

Optional argument(s)

OUTPUT_CUT_MANIFEST

Required argument

pad

Create a new CutSet by padding the cuts in CUT_MANIFEST. The cuts will be right-padded, i.e. the padding is placed after the signal ends.

lhotse cut pad [OPTIONS] CUT_MANIFEST OUTPUT_CUT_MANIFEST

Options

-d, --duration <duration>

Desired duration of cuts after padding. Cuts longer than this won’t be affected. By default, pad to the longest cut duration found in CUT_MANIFEST.

Arguments

CUT_MANIFEST

Required argument

OUTPUT_CUT_MANIFEST

Required argument
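
The effect of --duration on cut lengths can be sketched in a few lines of plain Python. This models only the resulting durations, not the padding itself, and the function name padded_durations is hypothetical.

```python
def padded_durations(durations, target=None):
    """Return cut durations (seconds) after right-padding to `target`."""
    if target is None:
        # Default: pad to the longest cut duration found in the manifest.
        target = max(durations)
    # Cuts already longer than the target are left unchanged.
    return [max(duration, target) for duration in durations]


print(padded_durations([2.0, 5.0, 3.5]))              # pad to the longest cut
print(padded_durations([2.0, 5.0, 3.5], target=4.0))  # the 5.0 s cut is unaffected
```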

random-mixed

Create a CutSet stored in OUTPUT_CUT_MANIFEST that contains supervision regions from SUPERVISION_MANIFEST and features supplied by FEATURE_MANIFEST. It first creates a trivial CutSet, splits it into two equal, randomized parts and mixes their features. The parameters of the mix are controlled via SNR_RANGE and OFFSET_RANGE.

lhotse cut random-mixed [OPTIONS] SUPERVISION_MANIFEST FEATURE_MANIFEST
                        OUTPUT_CUT_MANIFEST

Options

-s, --snr-range <snr_range>

Range of SNR values (in dB) that will be uniformly sampled in order to mix the signals.

-o, --offset-range <offset_range>

Range of relative offset values (0 - 1), which will offset the “right” signal by this many times the duration of the “left” signal. It is uniformly sampled for each mix operation.

Arguments

SUPERVISION_MANIFEST

Required argument

FEATURE_MANIFEST

Required argument

OUTPUT_CUT_MANIFEST

Required argument
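
How SNR_RANGE and OFFSET_RANGE translate into per-mix parameters can be sketched as follows. This is an illustration in plain Python under the assumption that both ranges are (low, high) pairs. The name sample_mix_params is hypothetical and the sketch does not perform any mixing.

```python
import random


def sample_mix_params(left_duration, snr_range, offset_range, rng=random):
    """Draw one (snr_db, offset_seconds) pair for a single mix operation."""
    # SNR is sampled uniformly from the given range, in dB.
    snr_db = rng.uniform(*snr_range)
    # The relative offset (0 - 1) is sampled uniformly, then scaled by the
    # duration of the "left" signal to get the shift of the "right" signal.
    relative_offset = rng.uniform(*offset_range)
    return snr_db, relative_offset * left_duration


rng = random.Random(0)  # fixed seed so repeated runs draw the same values
snr_db, offset_seconds = sample_mix_params(10.0, (5.0, 20.0), (0.25, 0.75), rng=rng)
print(f"SNR: {snr_db:.2f} dB, right signal starts {offset_seconds:.2f} s into the left one")
```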

simple

Create a CutSet stored in OUTPUT_CUT_MANIFEST. Depending on the provided options, it may contain any combination of recording, feature and supervision manifests. Either RECORDING_MANIFEST or FEATURE_MANIFEST has to be provided. When SUPERVISION_MANIFEST is provided, the cuts' time span will correspond to that of the supervision segments. Otherwise, the time span corresponds to the one found in the features if available, and to the recordings if not.

lhotse cut simple [OPTIONS] OUTPUT_CUT_MANIFEST

Options

-r, --recording-manifest <recording_manifest>

Optional recording manifest - will be used to attach the recordings to the cuts.

-f, --feature-manifest <feature_manifest>

Optional feature manifest - will be used to attach the features to the cuts.

-s, --supervision_manifest <supervision_manifest>

Optional supervision manifest - will be used to attach the supervisions to the cuts.

Arguments

OUTPUT_CUT_MANIFEST

Required argument
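
The precedence rule for the cut time span described above can be written as a small decision function. This is a plain-Python sketch of the rule as documented, not lhotse code, and the name cut_time_span_source is hypothetical.

```python
def cut_time_span_source(recordings=False, features=False, supervisions=False):
    """Name the manifest that determines the time span of the created cuts."""
    if not (recordings or features):
        # Either a recording manifest or a feature manifest has to be provided.
        raise ValueError("provide a recording manifest or a feature manifest")
    if supervisions:
        return "supervisions"
    # Without supervisions, features take precedence over recordings.
    return "features" if features else "recordings"


print(cut_time_span_source(recordings=True, features=True))
print(cut_time_span_source(recordings=True, supervisions=True))
```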

truncate

Truncate the cuts in the CUT_MANIFEST and write them to OUTPUT_CUT_MANIFEST. Cuts shorter than MAX_DURATION will not be modified.

lhotse cut truncate [OPTIONS] CUT_MANIFEST OUTPUT_CUT_MANIFEST

Options

--preserve-id

Whether the cuts should preserve their IDs (by default, they will get new, random IDs).

-d, --max-duration <max_duration>

The maximum duration in seconds of a cut in the resulting manifest. [required]

-o, --offset-type <offset_type>

Where should the truncated cut start: “start” - at the start of the original cut, “end” - MAX_DURATION before the end of the original cut, “random” - randomly choose somewhere between “start” and “end” options.

Options

start|end|random

--keep-overflowing-supervisions, --discard-overflowing-supervisions

When a cut is truncated in the middle of a supervision segment, should the supervision be kept.

Arguments

CUT_MANIFEST

Required argument

OUTPUT_CUT_MANIFEST

Required argument
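
The three --offset-type choices can be compared with a short plain-Python sketch that computes where a truncated cut would start and end relative to the original cut. The function name truncated_span is hypothetical, and the sketch ignores supervisions and cut IDs.

```python
import random


def truncated_span(cut_duration, max_duration, offset_type="start", rng=random):
    """Return (start, end) in seconds of the truncated cut within the original."""
    if cut_duration <= max_duration:
        # Cuts that already fit within MAX_DURATION are not modified.
        return 0.0, cut_duration
    latest_start = cut_duration - max_duration
    if offset_type == "start":
        start = 0.0
    elif offset_type == "end":
        start = latest_start  # MAX_DURATION before the end of the original cut
    elif offset_type == "random":
        start = rng.uniform(0.0, latest_start)  # anywhere between the two
    else:
        raise ValueError(f"unknown offset type: {offset_type!r}")
    return start, start + max_duration


print(truncated_span(10.0, 4.0, "start"))
print(truncated_span(10.0, 4.0, "end"))
print(truncated_span(10.0, 4.0, "random", rng=random.Random(0)))
```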

windowed

Create a CutSet stored in OUTPUT_CUT_MANIFEST from feature regions in FEATURE_MANIFEST. The feature matrices are traversed in windows with CUT_SHIFT increments, creating cuts of constant CUT_DURATION.

lhotse cut windowed [OPTIONS] FEATURE_MANIFEST OUTPUT_CUT_MANIFEST

Options

-d, --cut-duration <cut_duration>

How long should the cuts be in seconds.

-s, --cut-shift <cut_shift>

How much to shift the cutting window in seconds (by default the shift is equal to CUT_DURATION).

--keep-shorter-windows, --discard-shorter-windows

When true, the last window will be used to create a Cut even if its duration is shorter than CUT_DURATION.

Arguments

FEATURE_MANIFEST

Required argument

OUTPUT_CUT_MANIFEST

Required argument
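
The interaction between CUT_DURATION, CUT_SHIFT and the shorter-window flag is easiest to see on window boundaries alone. The sketch below is plain Python, not lhotse code. The name window_spans is hypothetical, and it assumes that at most one trailing window shorter than CUT_DURATION is kept.

```python
def window_spans(total_duration, cut_duration, cut_shift=None, keep_shorter=False):
    """Return (start, end) pairs in seconds for windows over one feature region."""
    if cut_shift is None:
        cut_shift = cut_duration  # default: back-to-back, non-overlapping windows
    if cut_shift <= 0:
        raise ValueError("cut_shift must be positive")
    spans = []
    start = 0.0
    while start < total_duration:
        end = start + cut_duration
        if end > total_duration:
            # The last window is shorter than cut_duration: keep it only on request.
            if keep_shorter:
                spans.append((start, total_duration))
            break
        spans.append((start, end))
        start += cut_shift
    return spans


print(window_spans(10.0, 4.0))                     # trailing 2 s are discarded
print(window_spans(10.0, 4.0, keep_shorter=True))  # trailing 2 s kept as a short cut
print(window_spans(10.0, 4.0, cut_shift=2.0))      # overlapping windows
```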

lhotse manifest

Generic commands working with all or most manifest types.

lhotse manifest [OPTIONS] COMMAND [ARGS]...

combine

Load MANIFESTS, combine them into a single one, and write it to OUTPUT_MANIFEST.

lhotse manifest combine [OPTIONS] [MANIFESTS]... OUTPUT_MANIFEST

Arguments

MANIFESTS

Optional argument(s)

OUTPUT_MANIFEST

Required argument

filter

Filter a MANIFEST according to the rule specified in PREDICATE, and save the result to OUTPUT_MANIFEST. It is intended to work generically with most manifest types - it supports RecordingSet, SupervisionSet and CutSet.

The PREDICATE specifies which attribute is used for item selection. Some examples:
lhotse manifest filter 'duration>4.5' supervision.json output.json
lhotse manifest filter 'num_frames<600' cuts.json output.json
lhotse manifest filter 'start=0' cuts.json output.json
lhotse manifest filter 'channel!=0' audio.json output.json

It currently only supports comparison of numerical manifest item attributes, such as: start, duration, end, channel, num_frames, num_features, etc.

lhotse manifest filter [OPTIONS] PREDICATE MANIFEST OUTPUT_MANIFEST

Arguments

PREDICATE

Required argument

MANIFEST

Required argument

OUTPUT_MANIFEST

Required argument
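
The PREDICATE strings shown above follow an "attribute, operator, number" pattern. The sketch below shows one way such a string could be parsed and applied to dictionary-like items in plain Python. It is not lhotse's parser: it supports only the four operators that appear in the examples, and the names parse_predicate and filter_items are hypothetical.

```python
import operator
import re

# Only the operators that appear in the documented examples.
OPERATORS = {"<": operator.lt, ">": operator.gt, "=": operator.eq, "!=": operator.ne}
PREDICATE = re.compile(r"^\s*(\w+)\s*(!=|<|>|=)\s*(-?\d+(?:\.\d+)?)\s*$")


def parse_predicate(text):
    """Turn a string such as 'duration>4.5' into a function of one item."""
    match = PREDICATE.match(text)
    if match is None:
        raise ValueError(f"cannot parse predicate: {text!r}")
    key, symbol, value = match.groups()
    compare, threshold = OPERATORS[symbol], float(value)
    return lambda item: compare(item[key], threshold)


def filter_items(predicate, items):
    """Keep the items (modelled as dicts) for which the predicate holds."""
    keep = parse_predicate(predicate)
    return [item for item in items if keep(item)]


segments = [{"id": "s1", "duration": 3.0}, {"id": "s2", "duration": 5.0}]
print(filter_items("duration>4.5", segments))  # only "s2" is longer than 4.5 s
```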

split

Load MANIFEST, split it into NUM_SPLITS equal parts and save as separate manifests in OUTPUT_DIR.

lhotse manifest split [OPTIONS] NUM_SPLITS MANIFEST OUTPUT_DIR

Options

--randomize

Optionally randomize the sequence before splitting.

Arguments

NUM_SPLITS

Required argument

MANIFEST

Required argument

OUTPUT_DIR

Required argument
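
Splitting a manifest into NUM_SPLITS parts, with optional shuffling first, can be sketched as below. This is plain Python rather than lhotse code. The name split_sequence and the seed parameter are hypothetical, and giving the leftover items to the first parts when the length is not divisible by NUM_SPLITS is an assumption of this sketch, not documented behaviour.

```python
import random


def split_sequence(items, num_splits, randomize=False, seed=None):
    """Split items into num_splits contiguous parts of (nearly) equal size."""
    items = list(items)
    if randomize:
        # Shuffle a copy of the sequence before splitting.
        random.Random(seed).shuffle(items)
    base, extra = divmod(len(items), num_splits)
    parts, start = [], 0
    for index in range(num_splits):
        # Assumption: the first `extra` parts receive one additional item.
        size = base + (1 if index < extra else 0)
        parts.append(items[start:start + size])
        start += size
    return parts


print(split_sequence(range(6), 3))
print(split_sequence(range(7), 3))
```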

lhotse feat

Feature extraction related commands.

lhotse feat [OPTIONS] COMMAND [ARGS]...

extract

Extract features for recordings in a given RECORDING_MANIFEST. The features are stored in OUTPUT_DIR, with one file per recording (or segment).

lhotse feat extract [OPTIONS] RECORDING_MANIFEST OUTPUT_DIR

Options

-a, --augmentation <augmentation>

Optional time-domain data augmentation effect chain to apply.

Options

pitch|reverb|pitch_reverb_tdrop

-f, --feature-manifest <feature_manifest>

Optional manifest specifying feature extractor configuration.

--storage-type <storage_type>

Select a storage backend for the feature matrices.

Options

lilcom_files|lilcom_hdf5|numpy_files|numpy_hdf5

-t, --lilcom-tick-power <lilcom_tick_power>

Determines the compression accuracy; the input will be compressed to integer multiples of 2^tick_power.

-r, --root-dir <root_dir>

Root directory - all paths in the manifest will use this as prefix.

-j, --num-jobs <num_jobs>

Number of parallel processes.

Arguments

RECORDING_MANIFEST

Required argument

OUTPUT_DIR

Required argument
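
The --lilcom-tick-power description can be made concrete with a tiny numeric sketch of what "integer multiples of 2^tick_power" means. This is only an illustration of the resulting value grid in plain Python. It is not the lilcom codec, rounding to the nearest multiple is an assumption, and the name quantize is hypothetical.

```python
def quantize(values, tick_power):
    """Snap each value to the nearest integer multiple of 2**tick_power."""
    tick = 2.0 ** tick_power
    # Assumption: values are rounded to the nearest tick.
    return [round(value / tick) * tick for value in values]


# With tick_power=-2 the grid step is 0.25, so the error is at most 0.125.
print(quantize([0.3, 1.26, -0.9], tick_power=-2))
# A more negative tick power gives a finer grid and a smaller error.
print(quantize([0.3, 1.26, -0.9], tick_power=-5))
```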

write-default-config

Save a default feature extraction config to OUTPUT_CONFIG.

lhotse feat write-default-config [OPTIONS] OUTPUT_CONFIG

Options

-f, --feature-type <feature_type>

Which feature extractor type to use.

Options

fbank|mfcc|spectrogram

Arguments

OUTPUT_CONFIG

Required argument

lhotse convert-kaldi

Convert a Kaldi data dir DATA_DIR into a directory MANIFEST_DIR of lhotse manifests. Ignores feats.scp. The SAMPLING_RATE has to be explicitly specified as it is not available to read from DATA_DIR.

lhotse convert-kaldi [OPTIONS] DATA_DIR SAMPLING_RATE MANIFEST_DIR

Arguments

DATA_DIR

Required argument

SAMPLING_RATE

Required argument

MANIFEST_DIR

Required argument