API Reference

This page contains a comprehensive list of all classes and functions within lhotse.

Recording manifests

Data structures used for describing audio recordings in a dataset.

Supervision manifests

Data structures used for describing supervisions in a dataset.

class lhotse.supervision.AlignmentItem(symbol: str, start: float, duration: float, score: Optional[float] = None)[source]

This class contains an alignment item, for example a word, along with its start time (w.r.t. the start of the recording) and duration. It can potentially be used to store other kinds of alignment items, such as subwords, pdf-ids, etc.
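
For instance, a word-level item and its derived end time (a minimal illustration; values are made up):

>>> from lhotse.supervision import AlignmentItem
>>> item = AlignmentItem(symbol='hello', start=1.25, duration=0.5)
>>> item.end
1.75
>>> item.with_offset(10.0).start
11.25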

symbol: str

Alias for field number 0

start: float

Alias for field number 1

duration: float

Alias for field number 2

score: Optional[float]

Alias for field number 3

static deserialize(data)[source]
Return type

AlignmentItem

serialize()[source]
Return type

list

property end: float
Return type

float

with_offset(offset)[source]

Return an identical AlignmentItem, but with the offset added to the start field.

Return type

AlignmentItem

perturb_speed(factor, sampling_rate)[source]

Return an AlignmentItem that has time boundaries matching the recording/cut perturbed with the same factor. See SupervisionSegment.perturb_speed() for details.

Return type

AlignmentItem

trim(end, start=0)[source]

See SupervisionSegment.trim().

Return type

AlignmentItem

transform(transform_fn)[source]

Perform specified transformation on the alignment content.

Return type

AlignmentItem

count(value, /)

Return number of occurrences of value.

index(value, start=0, stop=9223372036854775807, /)

Return first index of value.

Raises ValueError if the value is not present.

class lhotse.supervision.SupervisionSegment(id, recording_id, start, duration, channel=0, text=None, language=None, speaker=None, gender=None, custom=None, alignment=None)[source]

SupervisionSegment represents a time interval (segment) annotated with some supervision labels and/or metadata, such as the transcription, the speaker identity, the language, etc.

Each supervision has a unique id and always refers to a specific recording (via recording_id) and one or more channels (by default, 0). Note that multiple channels of the recording may share the same supervision, in which case the channel field will be a list of integers.

It’s also characterized by the start time (relative to the beginning of a Recording or a Cut) and a duration, both expressed in seconds.

The remaining fields are all optional, and their availability depends on specific corpora. Since it is difficult to predict all possible types of metadata, the custom field (a dict) can be used to insert types of supervisions that are not supported out of the box.

SupervisionSegment may contain multiple types of alignments. The alignment field is a dict, indexed by the alignment’s type (e.g., word or phone), and contains a list of AlignmentItem objects – simple structures that contain a given symbol and its time interval. Alignments can be read from CTM files or created programmatically.

Examples

A simple segment with no supervision information:

>>> from lhotse import SupervisionSegment
>>> sup0 = SupervisionSegment(
...     id='rec00001-sup00000', recording_id='rec00001',
...     start=0.5, duration=5.0, channel=0
... )

Typical supervision containing transcript, speaker ID, gender, and language:

>>> sup1 = SupervisionSegment(
...     id='rec00001-sup00001', recording_id='rec00001',
...     start=5.5, duration=3.0, channel=0,
...     text='transcript of the second segment',
...     speaker='Norman Dyhrentfurth', language='English', gender='M'
... )

Two supervisions denoting overlapping speech on two separate channels in a microphone array/multiple headsets (pay attention to start, duration, and channel):

>>> sup2 = SupervisionSegment(
...     id='rec00001-sup00002', recording_id='rec00001',
...     start=15.0, duration=5.0, channel=0,
...     text="i have incredibly good news for you",
...     speaker='Norman Dyhrentfurth', language='English', gender='M'
... )
>>> sup3 = SupervisionSegment(
...     id='rec00001-sup00003', recording_id='rec00001',
...     start=18.0, duration=3.0, channel=1,
...     text="say what",
...     speaker='Hervey Arman', language='English', gender='M'
... )

A supervision with a phone alignment:

>>> from lhotse.supervision import AlignmentItem
>>> sup4 = SupervisionSegment(
...     id='rec00001-sup00004', recording_id='rec00001',
...     start=33.0, duration=1.0, channel=0,
...     text="ice",
...     speaker='Maryla Zechariah', language='English', gender='F',
...     alignment={
...         'phone': [
...             AlignmentItem(symbol='AY0', start=33.0, duration=0.6),
...             AlignmentItem(symbol='S', start=33.6, duration=0.4)
...         ]
...     }
... )

A supervision shared across multiple channels of a recording (e.g. a microphone array):

>>> sup5 = SupervisionSegment(
...     id='rec00001-sup00005', recording_id='rec00001',
...     start=33.0, duration=1.0, channel=[0, 1],
...     text="ice",
...     speaker='Maryla Zechariah',
... )

Converting SupervisionSegment to a dict:

>>> sup0.to_dict()
{'id': 'rec00001-sup00000', 'recording_id': 'rec00001', 'start': 0.5, 'duration': 5.0, 'channel': 0}
id: str
recording_id: str
start: float
duration: float
channel: Union[int, List[int]] = 0
text: Optional[str] = None
language: Optional[str] = None
speaker: Optional[str] = None
gender: Optional[str] = None
custom: Optional[Dict[str, Any]] = None
alignment: Optional[Dict[str, List[lhotse.supervision.AlignmentItem]]] = None
property end: float
Return type

float

with_alignment(kind, alignment)[source]
Return type

SupervisionSegment

with_offset(offset)[source]

Return an identical SupervisionSegment, but with the offset added to the start field.

Return type

SupervisionSegment

perturb_speed(factor, sampling_rate, affix_id=True)[source]

Return a SupervisionSegment that has time boundaries matching the recording/cut perturbed with the same factor.

Parameters
  • factor (float) – The speed will be adjusted this many times (e.g. factor=1.1 means 1.1x faster).

  • sampling_rate (int) – The sampling rate is necessary to accurately perturb the start and duration (going through the sample counts).

  • affix_id (bool) – When true, we will modify the id and recording_id fields by affixing them with “_sp{factor}”.

Return type

SupervisionSegment

Returns

a modified copy of the current SupervisionSegment.
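
For example (a sketch; the new start and duration are recomputed through sample counts, so treat the values as approximate):

>>> sup = SupervisionSegment(id='utt1', recording_id='rec1', start=2.0, duration=4.0)
>>> sup_sp = sup.perturb_speed(factor=1.1, sampling_rate=16000)
>>> sup_sp.id  # id affixed with the factor
'utt1_sp1.1'
>>> # start shrinks to ~1.82s and duration to ~3.64s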

perturb_tempo(factor, sampling_rate, affix_id=True)[source]

Return a SupervisionSegment that has time boundaries matching the recording/cut perturbed with the same factor.

Parameters
  • factor (float) – The tempo will be adjusted this many times (e.g. factor=1.1 means 1.1x faster).

  • sampling_rate (int) – The sampling rate is necessary to accurately perturb the start and duration (going through the sample counts).

  • affix_id (bool) – When true, we will modify the id and recording_id fields by affixing them with “_tp{factor}”.

Return type

SupervisionSegment

Returns

a modified copy of the current SupervisionSegment.

perturb_volume(factor, affix_id=True)[source]

Return a SupervisionSegment with modified ids.

Parameters
  • factor (float) – The volume will be adjusted this many times (e.g. factor=1.1 means 1.1x louder).

  • affix_id (bool) – When true, we will modify the id and recording_id fields by affixing them with “_vp{factor}”.

Return type

SupervisionSegment

Returns

a modified copy of the current SupervisionSegment.

reverb_rir(affix_id=True, channel=None)[source]

Return a SupervisionSegment with modified ids.

Parameters

affix_id (bool) – When true, we will modify the id and recording_id fields by affixing them with “_rvb”.

Return type

SupervisionSegment

Returns

a modified copy of the current SupervisionSegment.

trim(end, start=0)[source]

Return an identical SupervisionSegment, but ensure that self.start is not negative (in which case it’s set to 0) and self.end does not exceed the end parameter. If a start is optionally provided, the supervision is trimmed from the left (note that start should be relative to the cut times).

This method is useful for ensuring that the supervision does not exceed a cut’s bounds, in which case pass cut.duration as the end argument, since supervision times are relative to the cut.

Return type

SupervisionSegment
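
A small illustration of the clamping behavior, with a supervision that exceeds a 5-second cut on both sides:

>>> sup = SupervisionSegment(id='utt2', recording_id='rec1', start=-0.5, duration=6.0)
>>> trimmed = sup.trim(end=5.0)
>>> (trimmed.start, trimmed.end)
(0.0, 5.0)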

map(transform_fn)[source]

Return a copy of the current segment, transformed with transform_fn.

Parameters

transform_fn (Callable[[SupervisionSegment], SupervisionSegment]) – a function that takes a segment as input, transforms it and returns a new segment.

Return type

SupervisionSegment

Returns

a modified SupervisionSegment.

transform_text(transform_fn)[source]

Return a copy of the current segment with transformed text field. Useful for text normalization, phonetic transcription, etc.

Parameters

transform_fn (Callable[[str], str]) – a function that accepts a string and returns a string.

Return type

SupervisionSegment

Returns

a SupervisionSegment with adjusted text.
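
For example, uppercasing the transcript of sup1 defined earlier:

>>> sup1.transform_text(str.upper).text
'TRANSCRIPT OF THE SECOND SEGMENT'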

transform_alignment(transform_fn, type='word')[source]

Return a copy of the current segment with transformed alignment field. Useful for text normalization, phonetic transcription, etc.

Parameters
  • type (Optional[str]) – alignment type to transform (key for alignment dict).

  • transform_fn (Callable[[str], str]) – a function that accepts a string and returns a string.

Return type

SupervisionSegment

Returns

a SupervisionSegment with adjusted alignments.

to_dict()[source]
Return type

dict

static from_dict(data)[source]
Return type

SupervisionSegment

__init__(id, recording_id, start, duration, channel=0, text=None, language=None, speaker=None, gender=None, custom=None, alignment=None)
class lhotse.supervision.SupervisionSet(segments=None)[source]

SupervisionSet represents a collection of segments containing some supervision information (see SupervisionSegment), that are indexed by segment IDs.

It acts as a Python dict, extended with an efficient find operation that indexes and caches the supervision segments in an interval tree. This allows you to quickly find supervision segments that correspond to a specific time interval.

When coming from Kaldi, think of SupervisionSet as a segments file on steroids, that may also contain text, utt2spk, utt2gender, utt2dur, etc.

Examples

Building a SupervisionSet:

>>> from lhotse import SupervisionSet, SupervisionSegment
>>> sups = SupervisionSet.from_segments([SupervisionSegment(...), ...])

Writing/reading a SupervisionSet:

>>> sups.to_file('supervisions.jsonl.gz')
>>> sups2 = SupervisionSet.from_file('supervisions.jsonl.gz')

Using SupervisionSet like a dict:

>>> 'rec00001-sup00000' in sups
True
>>> sups['rec00001-sup00000']
SupervisionSegment(id='rec00001-sup00000', recording_id='rec00001', start=0.5, ...)
>>> for segment in sups:
...     pass

Searching by recording_id and time interval:

>>> matched_segments = sups.find(recording_id='rec00001', start_after=17.0, end_before=25.0)

Manipulation:

>>> longer_than_5s = sups.filter(lambda s: s.duration > 5)
>>> first_100 = sups.subset(first=100)
>>> split_into_4 = sups.split(num_splits=4)
>>> shuffled = sups.shuffle()
__init__(segments=None)[source]
property data: Union[Dict[str, lhotse.supervision.SupervisionSegment], Iterable[lhotse.supervision.SupervisionSegment]]

Alias property for self.segments

Return type

Union[Dict[str, SupervisionSegment], Iterable[SupervisionSegment]]

property ids: Iterable[str]
Return type

Iterable[str]

static from_segments(segments)[source]
Return type

SupervisionSet

static from_items(segments)

Function to be implemented by every sub-class of this mixin. It’s expected to create a sub-class instance out of an iterable of items that are held by the sub-class (e.g., CutSet.from_items(iterable_of_cuts)).

Return type

SupervisionSet

static from_dicts(data)[source]
Return type

SupervisionSet

static from_rttm(path)[source]

Read an RTTM file located at path (or an iterator of paths) and create a SupervisionSet manifest from them. Can be used to create supervisions from custom RTTM files (see, for example, lhotse.dataset.DiarizationDataset).

>>> from lhotse import SupervisionSet
>>> sup1 = SupervisionSet.from_rttm('/path/to/rttm_file')
>>> sup2 = SupervisionSet.from_rttm(Path('/path/to/rttm_dir').rglob('ref_*'))

The following description is taken from the dscore toolkit (https://github.com/nryant/dscore#rttm):

Rich Transcription Time Marked (RTTM) files are space-delimited text files containing one turn per line, each line containing ten fields:

  • Type – segment type; should always be SPEAKER

  • File ID – file name; basename of the recording minus extension (e.g., rec1_a)

  • Channel ID – channel (1-indexed) that turn is on; should always be 1

  • Turn Onset – onset of turn in seconds from beginning of recording

  • Turn Duration – duration of turn in seconds

  • Orthography Field – should always be <NA>

  • Speaker Type – should always be <NA>

  • Speaker Name – name of speaker of turn; should be unique within scope of each file

  • Confidence Score – system confidence (probability) that information is correct; should always be <NA>

  • Signal Lookahead Time – should always be <NA>

For instance:

SPEAKER CMU_20020319-1400_d01_NONE 1 130.430000 2.350 <NA> <NA> juliet <NA> <NA>
SPEAKER CMU_20020319-1400_d01_NONE 1 157.610000 3.060 <NA> <NA> tbc <NA> <NA>
SPEAKER CMU_20020319-1400_d01_NONE 1 130.490000 0.450 <NA> <NA> chek <NA> <NA>

Parameters

path (Union[Path, str, Iterable[Union[Path, str]]]) – Path to RTTM file or an iterator of paths to RTTM files.

Return type

SupervisionSet

Returns

a new SupervisionSet instance containing segments from the RTTM file.

with_alignment_from_ctm(ctm_file, type='word', match_channel=False, verbose=False)[source]

Add alignments from CTM file to the supervision set.

Parameters
  • ctm_file (Union[Path, str]) – Path to CTM file.

  • type (str) – Alignment type (optional, default = word).

  • match_channel (bool) – if True, also match channel between CTM and SupervisionSegment

  • verbose (bool) – if True, show progress bar

Return type

SupervisionSet

Returns

A new SupervisionSet with AlignmentItem objects added to the segments.
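
A usage sketch (the CTM path is hypothetical):

>>> sups_ali = sups.with_alignment_from_ctm('word_alignments.ctm', type='word')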

write_alignment_to_ctm(ctm_file, type='word')[source]

Write alignments to CTM file.

Parameters
  • ctm_file (Union[Path, str]) – Path to output CTM file (will be created if it does not exist)

  • type (str) – Alignment type to write (default = word)

Return type

None

to_dicts()[source]
Return type

Iterable[dict]

split(num_splits, shuffle=False, drop_last=False)[source]

Split the SupervisionSet into num_splits pieces of equal size.

Parameters
  • num_splits (int) – Requested number of splits.

  • shuffle (bool) – Optionally shuffle the recordings order first.

  • drop_last (bool) – determines how to handle splitting when len(seq) is not divisible by num_splits. When False (default), the splits might have unequal lengths. When True, it may discard the last element in some splits to ensure they are equally long.

Return type

List[SupervisionSet]

Returns

A list of SupervisionSet pieces.

split_lazy(output_dir, chunk_size, prefix='')[source]

Splits a manifest (either lazily or eagerly opened) into chunks, each with chunk_size items (except for the last one, typically).

In order to be memory efficient, this implementation saves each chunk to disk in a .jsonl.gz format as the input manifest is sampled.

Note

For lowest memory usage, use load_manifest_lazy to open the input manifest for this method.

Parameters
  • output_dir (Union[Path, str]) – directory where the split manifests are saved. Each manifest is saved at: {output_dir}/{prefix}.{split_idx}.jsonl.gz

  • chunk_size (int) – the number of items in each chunk.

  • prefix (str) – the prefix of each manifest.

Return type

List[SupervisionSet]

Returns

a list of lazily opened chunk manifests.
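
For example (paths are illustrative; output files follow the pattern described above):

>>> sups = SupervisionSet.from_jsonl_lazy('supervisions.jsonl.gz')
>>> chunks = sups.split_lazy('sup_chunks/', chunk_size=10000, prefix='supervisions')
>>> # writes sup_chunks/supervisions.0.jsonl.gz, sup_chunks/supervisions.1.jsonl.gz, ...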

subset(first=None, last=None)[source]

Return a new SupervisionSet according to the selected subset criterion. Only a single argument to subset is supported at this time.

Parameters
  • first (Optional[int]) – int, the number of first supervisions to keep.

  • last (Optional[int]) – int, the number of last supervisions to keep.

Return type

SupervisionSet

Returns

a new SupervisionSet with the subset results.

transform_text(transform_fn)[source]

Return a copy of the current SupervisionSet with the segments having a transformed text field. Useful for text normalization, phonetic transcription, etc.

Parameters

transform_fn (Callable[[str], str]) – a function that accepts a string and returns a string.

Return type

SupervisionSet

Returns

a SupervisionSet with adjusted text.

transform_alignment(transform_fn, type='word')[source]

Return a copy of the current SupervisionSet with the segments having a transformed alignment field. Useful for text normalization, phonetic transcription, etc.

Parameters
  • transform_fn (Callable[[str], str]) – a function that accepts a string and returns a string.

  • type (str) – alignment type to transform (key for alignment dict).

Return type

SupervisionSet

Returns

a SupervisionSet with adjusted alignments.

find(recording_id, channel=None, start_after=0, end_before=None, adjust_offset=False, tolerance=0.001)[source]

Return an iterable of segments that match the provided recording_id.

Parameters
  • recording_id (str) – Desired recording ID.

  • channel (Optional[int]) – When specified, return supervisions in that channel - otherwise, in all channels.

  • start_after (float) – When specified, return segments that start after the given value.

  • end_before (Optional[float]) – When specified, return segments that end before the given value.

  • adjust_offset (bool) – When true, return segments as if the recordings had started at start_after. This is useful for creating Cuts. From a user perspective, when dealing with a Cut, it is no longer helpful to know when the supervisions starts in a recording - instead, it’s useful to know when the supervision starts relative to the start of the Cut. In the anticipated use-case, start_after and end_before would be the beginning and end of a cut; this option converts the times to be relative to the start of the cut.

  • tolerance (float) – Additional margin to account for floating point rounding errors when comparing segment boundaries.

Return type

Iterable[SupervisionSegment]

Returns

An iterator over supervision segments satisfying all criteria.
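
For instance, to fetch supervisions for a prospective cut spanning 17.0-25.0s, with times re-expressed relative to the cut start (sup3 from the examples above started at 18.0s in the recording, so it would now report start == 1.0):

>>> cut_sups = sups.find(
...     recording_id='rec00001', start_after=17.0, end_before=25.0, adjust_offset=True
... )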

filter(predicate)

Return a new manifest containing only the items that satisfy predicate. If the manifest is lazy, the filtering will also be applied lazily.

Parameters

predicate (Callable[[~T], bool]) – a function that takes a manifest item as an argument and returns bool.

Returns

a filtered manifest.

classmethod from_file(path)
Return type

Any

classmethod from_json(path)
Return type

Any

classmethod from_jsonl(path)
Return type

Any

classmethod from_jsonl_lazy(path)

Read a JSONL manifest in a lazy manner, which opens the file but does not read it immediately. It is only suitable for sequential reads and iteration.

Warning

Opening the manifest in this way might cause some methods that rely on random access to fail.

Return type

Any

classmethod from_yaml(path)
Return type

Any

classmethod infinite_mux(*manifests, weights=None, seed=0, max_open_streams=None)

Merges multiple manifest iterables into a new iterable by lazily multiplexing them during iteration time. Unlike mux(), this method allows limiting the maximum number of open sub-iterators at any given time.

To enable this, it performs 2-stage sampling. First, it samples with replacement the set of iterators I to construct a subset I_sub of size max_open_streams. Then, for each iteration step, it samples an iterator i from I_sub, fetches the next item from it, and yields it. Once i becomes exhausted, it is replaced with a new iterator j sampled from I_sub.

Caution

Do not use this method with inputs that are infinitely iterable as they will silently break the multiplexing property by only using a subset of the input iterables.

Caution

This method is not recommended for a small number of iterations, as it may be much less accurate than mux() depending on the number of open streams, the iterable sizes, and the random seed.

Parameters
  • manifests – iterables to be multiplexed. They can be either lazy or eager, but the resulting manifest will always be lazy.

  • weights (Optional[List[Union[int, float]]]) – an optional weight for each iterable, affects the probability of it being sampled. The weights are uniform by default. If lengths are known, it makes sense to pass them here for uniform distribution of items in the expectation.

  • seed (Union[int, Literal[‘trng’]]) – the random seed, ensures deterministic order across multiple iterations.

  • max_open_streams (Optional[int]) – the number of iterables that can be open simultaneously at any given time.

property is_lazy: bool

Indicates whether this manifest was opened in lazy (read-on-the-fly) mode or not.

Return type

bool

map(transform_fn)

Apply transform_fn to each item in this manifest and return a new manifest. If the manifest is opened lazily, the transform is also applied lazily.

Parameters

transform_fn (Callable[[~T], ~T]) – A callable (function) that accepts a single item instance and returns a new (or the same) instance of the same type. E.g. with CutSet, the callable accepts a Cut and also returns a Cut.

Returns

a new CutSet with transformed cuts.

classmethod mux(*manifests, stop_early=False, weights=None, seed=0, max_open_streams=None)

Merges multiple manifest iterables into a new iterable by lazily multiplexing them during iteration time. If one of the iterables is exhausted before the others, we will keep iterating until all iterables are exhausted. This behavior can be changed with the stop_early parameter.

Parameters
  • manifests – iterables to be multiplexed. They can be either lazy or eager, but the resulting manifest will always be lazy.

  • stop_early (bool) – should we stop the iteration as soon as we exhaust one of the manifests.

  • weights (Optional[List[Union[int, float]]]) – an optional weight for each iterable, affects the probability of it being sampled. The weights are uniform by default. If lengths are known, it makes sense to pass them here for uniform distribution of items in the expectation.

  • seed (Union[int, Literal[‘trng’]]) – the random seed, ensures deterministic order across multiple iterations.
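
A usage sketch, assuming two supervision sets prepared earlier:

>>> mixed = SupervisionSet.mux(sups_corpus_a, sups_corpus_b, weights=[3, 1], seed=0)
>>> # lazily yields items, drawing from sups_corpus_a ~3x more often than sups_corpus_b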

classmethod open_writer(path, overwrite=True)

Open a sequential writer that allows storing the manifests one by one, without keeping the whole manifest set in memory. Supports writing to JSONL format (.jsonl), with optional gzip compression (.jsonl.gz).

Note

when path is None, we will return an InMemoryWriter instead, which has the same API but stores the manifests in memory. It is convenient when you want to make saving to disk optional.

Example:

>>> from lhotse import RecordingSet
... recordings = [...]
... with RecordingSet.open_writer('recordings.jsonl.gz') as writer:
...     for recording in recordings:
...         writer.write(recording)

This writer can be useful for resuming a previously interrupted write – it will open the existing file and scan it for item IDs, so that it can skip writing them later. It can also be queried for existing IDs so that the user code may skip preparing the corresponding manifests.

Example:

>>> from lhotse import RecordingSet, Recording
... with RecordingSet.open_writer('recordings.jsonl.gz', overwrite=False) as writer:
...     for path in Path('.').rglob('*.wav'):
...         recording_id = path.stem
...         if writer.contains(recording_id):
...             # Item already written previously - skip processing.
...             continue
...         # Item doesn't exist yet - run extra work to prepare the manifest
...         # and store it.
...         recording = Recording.from_file(path, recording_id=recording_id)
...         writer.write(recording)
Return type

Union[SequentialJsonlWriter, InMemoryWriter]

repeat(times=None, preserve_id=False)

Return a new, lazily evaluated manifest that iterates over the original elements as many times as specified by times.

Parameters
  • times (Optional[int]) – how many times to repeat (infinite by default).

  • preserve_id (bool) – when True, we won’t update the element ID with repeat number.

Returns

a repeated manifest.

shuffle(rng=None, buffer_size=10000)

Shuffles the elements and returns a shuffled variant of self. If the manifest is opened lazily, performs shuffling on-the-fly with a fixed buffer size.

Parameters

rng (Optional[Random]) – an optional instance of random.Random for precise control of randomness.

Returns

a shuffled copy of self, or a manifest that is shuffled lazily.

to_eager()

Evaluates all lazy operations on this manifest, if any, and returns a copy that keeps all items in memory. If the manifest was “eager” already, this is a no-op and won’t copy anything.

to_file(path)
Return type

None

to_json(path)
Return type

None

to_jsonl(path)
Return type

None

to_yaml(path)
Return type

None

Lhotse Shar – sequential storage

Lhotse Shar readers

Lhotse Shar writers

Feature extraction and manifests

Data structures and tools used for feature extraction and description.

Features API - extractor and manifests

class lhotse.features.base.FeatureExtractor(config=None)[source]

The base class for all feature extractors in Lhotse. It is initialized with a config object, specific to a particular feature extraction method. The config is expected to be a dataclass so that it can be easily serialized.

All derived feature extractors must implement at least the following:

  • a name class attribute (how these features are called, e.g. ‘mfcc’)

  • a config_type class attribute that points to the configuration dataclass type

  • the extract method,

  • the frame_shift property.

Feature extractors that support feature-domain mixing should additionally specify two static methods:

  • compute_energy, and

  • mix.

By itself, the FeatureExtractor offers the following high-level methods that are not intended for overriding:

  • extract_from_samples_and_store

  • extract_from_recording_and_store

These methods run a larger feature extraction pipeline that involves data augmentation and disk storage.
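
A minimal sketch of a derived extractor (the class and config names and the fixed 40-bin output are illustrative assumptions; only the required members listed above are implemented):

from dataclasses import dataclass
import numpy as np
from lhotse.features.base import FeatureExtractor, register_extractor

@dataclass
class ZeroFeatureConfig:
    frame_shift: float = 0.01  # seconds

@register_extractor
class ZeroFeatureExtractor(FeatureExtractor):
    name = 'zero-features'           # how these features are called
    config_type = ZeroFeatureConfig  # the configuration dataclass type

    @property
    def frame_shift(self) -> float:
        return self.config.frame_shift

    def feature_dim(self, sampling_rate: int) -> int:
        # Illustrative fixed feature dimension.
        return 40

    def extract(self, samples: np.ndarray, sampling_rate: int) -> np.ndarray:
        # Illustrative only: emit an all-zeros (num_frames, feature_dim) matrix
        # for mono samples of shape (num_samples,).
        num_frames = int(samples.shape[-1] / (self.frame_shift * sampling_rate))
        return np.zeros((num_frames, self.feature_dim(sampling_rate)), dtype=np.float32)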

name = None
config_type = None
__init__(config=None)[source]
abstract extract(samples, sampling_rate)[source]

Defines how to extract features using a numpy ndarray of audio samples and the sampling rate.

Return type

ndarray

Returns

a numpy ndarray representing the feature matrix.

abstract property frame_shift: float
Return type

float

abstract feature_dim(sampling_rate)[source]
Return type

int

property device: Union[str, torch.device]
Return type

Union[str, device]

static mix(features_a, features_b, energy_scaling_factor_b)[source]

Perform feature-domain mix of two signals, a and b, and return the mixed signal.

Parameters
  • features_a (ndarray) – Left-hand side (reference) signal.

  • features_b (ndarray) – Right-hand side (mixed-in) signal.

  • energy_scaling_factor_b (float) – A scaling factor for features_b energy. It is used to achieve a specific SNR. E.g. to mix with an SNR of 10dB when both features_a and features_b energies are 100, the features_b signal energy needs to be scaled by 0.1. Since different features (e.g. spectrogram, fbank, MFCC) require different combination of transformations (e.g. exp, log, sqrt, pow) to allow mixing of two signals, the exact place where to apply energy_scaling_factor_b to the signal is determined by the implementer.

Return type

ndarray

Returns

A mixed feature matrix.
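
The scaling factor for a desired SNR can be derived from the two energies. Reproducing the 10 dB example from the parameter description above:

>>> energy_a, energy_b, snr = 100.0, 100.0, 10.0
>>> energy_scaling_factor_b = energy_a / (energy_b * 10 ** (snr / 10))
>>> energy_scaling_factor_b
0.1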

static compute_energy(features)[source]

Compute the total energy of a feature matrix. How the energy is computed depends on a particular type of features. It is expected that when implemented, compute_energy will never return zero.

Parameters

features (ndarray) – A feature matrix.

Return type

float

Returns

A positive float value of the signal energy.

extract_batch(samples, sampling_rate, lengths=None)[source]

Performs batch extraction. It is not guaranteed to be faster than FeatureExtractor.extract() – it depends on whether the implementation of a particular feature extractor supports accelerated batch computation. If lengths is provided, it is assumed that the input is a batch of padded sequences, so we will not perform any further collation.

Note

Unless overridden by child classes, it defaults to sequentially calling FeatureExtractor.extract() on the inputs.

Note

This method should support variable length inputs.

Return type

Union[ndarray, Tensor, List[ndarray], List[Tensor]]

extract_from_samples_and_store(samples, storage, sampling_rate, offset=0, channel=None, augment_fn=None)[source]

Extract the features from an array of audio samples in a full pipeline:

  • optional audio augmentation;

  • extract the features;

  • save them to disk in a specified directory;

  • return a Features object with a description of the extracted features.

Note, unlike in extract_from_recording_and_store, the returned Features object might not be suitable to store in a FeatureSet, as it does not reference any particular Recording. Instead, this method is useful when extracting features from cuts - especially MixedCut instances, which may be created from multiple recordings and channels.

Parameters
  • samples (ndarray) – a numpy ndarray with the audio samples.

  • sampling_rate (int) – integer sampling rate of samples.

  • storage (FeaturesWriter) – a FeaturesWriter object that will handle storing the feature matrices.

  • offset (float) – an offset in seconds for where to start reading the recording - when used for Cut feature extraction, must be equal to Cut.start.

  • channel (Union[List[int], int, None]) – an optional channel number(s) to insert into Features manifest.

  • augment_fn (Optional[Callable[[ndarray, int], ndarray]]) – an optional WavAugmenter instance to modify the waveform before feature extraction.

Return type

Features

Returns

a Features manifest item for the extracted feature matrix (it is not written to disk).
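
A usage sketch (Fbank and LilcomChunkyWriter stand in for any extractor/writer pair; the random array below is a placeholder for one second of mono 16 kHz audio):

>>> import numpy as np
>>> from lhotse import Fbank, LilcomChunkyWriter
>>> extractor = Fbank()
>>> samples = np.random.randn(1, 16000).astype(np.float32)
>>> with LilcomChunkyWriter('feats') as storage:
...     feats = extractor.extract_from_samples_and_store(
...         samples=samples, storage=storage, sampling_rate=16000
...     )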

extract_from_recording_and_store(recording, storage, offset=0, duration=None, channels=None, augment_fn=None)[source]

Extract the features from a Recording in a full pipeline:

  • load audio from disk;

  • optionally, perform audio augmentation;

  • extract the features;

  • save them to disk in a specified directory;

  • return a Features object with a description of the extracted features and the source data used.

Parameters
  • recording (Recording) – a Recording that specifies what’s the input audio.

  • storage (FeaturesWriter) – a FeaturesWriter object that will handle storing the feature matrices.

  • offset (float) – an optional offset in seconds for where to start reading the recording.

  • duration (Optional[float]) – an optional duration specifying how much audio to load from the recording.

  • channels (Union[int, List[int], None]) – an optional int or list of ints, specifying the channels; by default, all channels will be used.

  • augment_fn (Optional[Callable[[ndarray, int], ndarray]]) – an optional WavAugmenter instance to modify the waveform before feature extraction.

Return type

Features

Returns

a Features manifest item for the extracted feature matrix.

classmethod from_dict(data)[source]
Return type

FeatureExtractor

to_dict()[source]
Return type

Dict[str, Any]

classmethod from_yaml(path)[source]
Return type

FeatureExtractor

to_yaml(path)[source]
lhotse.features.base.get_extractor_type(name)[source]

Return the feature extractor type corresponding to the given name.

Parameters

name (str) – specifies which feature extractor should be used.

Return type

Type

Returns

A feature extractor type.

lhotse.features.base.create_default_feature_extractor(name)[source]

Create a feature extractor object with a default configuration.

Parameters

name (str) – specifies which feature extractor should be used.

Return type

Optional[FeatureExtractor]

Returns

A new feature extractor instance.

lhotse.features.base.register_extractor(cls)[source]

This decorator is used to register feature extractor classes in Lhotse so they can be easily created just by knowing their name.

An example of usage:

@register_extractor
class MyFeatureExtractor:
    ...

Parameters

cls – A type (class) that is being registered.

Returns

Registered type.

class lhotse.features.base.TorchaudioFeatureExtractor(config=None)[source]

Common abstract base class for all torchaudio based feature extractors.

extract(samples, sampling_rate)[source]

Defines how to extract features using a numpy ndarray of audio samples and the sampling rate.

Return type

ndarray

Returns

a numpy ndarray representing the feature matrix.

property frame_shift: float
Return type

float

__init__(config=None)
static compute_energy(features)

Compute the total energy of a feature matrix. How the energy is computed depends on a particular type of features. It is expected that when implemented, compute_energy will never return zero.

Parameters

features (ndarray) – A feature matrix.

Return type

float

Returns

A positive float value of the signal energy.

config_type = None
property device: Union[str, torch.device]
Return type

Union[str, device]

extract_batch(samples, sampling_rate, lengths=None)

Performs batch extraction. It is not guaranteed to be faster than FeatureExtractor.extract() – it depends on whether the implementation of a particular feature extractor supports accelerated batch computation. If lengths is provided, it is assumed that the input is a batch of padded sequences, so we will not perform any further collation.

Note

Unless overridden by child classes, it defaults to sequentially calling FeatureExtractor.extract() on the inputs.

Note

This method should support variable length inputs.

Return type

Union[ndarray, Tensor, List[ndarray], List[Tensor]]

extract_from_recording_and_store(recording, storage, offset=0, duration=None, channels=None, augment_fn=None)

Extract the features from a Recording in a full pipeline:

  • load audio from disk;

  • optionally, perform audio augmentation;

  • extract the features;

  • save them to disk in a specified directory;

  • return a Features object with a description of the extracted features and the source data used.

Parameters
  • recording (Recording) – a Recording that specifies what’s the input audio.

  • storage (FeaturesWriter) – a FeaturesWriter object that will handle storing the feature matrices.

  • offset (float) – an optional offset in seconds for where to start reading the recording.

  • duration (Optional[float]) – an optional duration specifying how much audio to load from the recording.

  • channels (Union[int, List[int], None]) – an optional int or list of ints, specifying the channels; by default, all channels will be used.

  • augment_fn (Optional[Callable[[ndarray, int], ndarray]]) – an optional WavAugmenter instance to modify the waveform before feature extraction.

Return type

Features

Returns

a Features manifest item for the extracted feature matrix.

extract_from_samples_and_store(samples, storage, sampling_rate, offset=0, channel=None, augment_fn=None)

Extract the features from an array of audio samples in a full pipeline:

  • optional audio augmentation;

  • extract the features;

  • save them to disk in a specified directory;

  • return a Features object with a description of the extracted features.

Note, unlike in extract_from_recording_and_store, the returned Features object might not be suitable to store in a FeatureSet, as it does not reference any particular Recording. Instead, this method is useful when extracting features from cuts - especially MixedCut instances, which may be created from multiple recordings and channels.

Parameters
  • samples (ndarray) – a numpy ndarray with the audio samples.

  • sampling_rate (int) – integer sampling rate of samples.

  • storage (FeaturesWriter) – a FeaturesWriter object that will handle storing the feature matrices.

  • offset (float) – an offset in seconds for where to start reading the recording - when used for Cut feature extraction, must be equal to Cut.start.

  • channel (Union[List[int], int, None]) – an optional channel number(s) to insert into Features manifest.

  • augment_fn (Optional[Callable[[ndarray, int], ndarray]]) – an optional WavAugmenter instance to modify the waveform before feature extraction.

Return type

Features

Returns

a Features manifest item for the extracted feature matrix (it is not written to disk).

abstract feature_dim(sampling_rate)
Return type

int

classmethod from_dict(data)
Return type

FeatureExtractor

classmethod from_yaml(path)
Return type

FeatureExtractor

static mix(features_a, features_b, energy_scaling_factor_b)

Perform feature-domain mix of two signals, a and b, and return the mixed signal.

Parameters
  • features_a (ndarray) – Left-hand side (reference) signal.

  • features_b (ndarray) – Right-hand side (mixed-in) signal.

  • energy_scaling_factor_b (float) – A scaling factor for features_b energy. It is used to achieve a specific SNR. E.g. to mix with an SNR of 10dB when both features_a and features_b energies are 100, the features_b signal energy needs to be scaled by 0.1. Since different features (e.g. spectrogram, fbank, MFCC) require different combination of transformations (e.g. exp, log, sqrt, pow) to allow mixing of two signals, the exact place where to apply energy_scaling_factor_b to the signal is determined by the implementer.

Return type

ndarray

Returns

A mixed feature matrix.

name = None
to_dict()
Return type

Dict[str, Any]

to_yaml(path)
class lhotse.features.base.Features(type, num_frames, num_features, frame_shift, sampling_rate, start, duration, storage_type, storage_path, storage_key, recording_id=None, channels=None)[source]

Represents features extracted for some particular time range in a given recording and channel. It contains metadata about how it’s stored: storage_type describes “how to read it” – for now, it supports numpy arrays serialized with np.save, as well as arrays compressed with lilcom; storage_path is the path to the file on the local filesystem.

type: str
num_frames: int
num_features: int
frame_shift: float
sampling_rate: int
start: float
duration: float
storage_type: str
storage_path: str
storage_key: Union[str, bytes]
recording_id: Optional[str] = None
channels: Optional[Union[List[int], int]] = None
property end: float
Return type

float

load(start=None, duration=None, channel_id=0)[source]
Return type

ndarray

move_to_memory(start=0, duration=None, lilcom=False)[source]
Return type

Features

with_path_prefix(path)[source]
Return type

Features

to_dict()[source]
Return type

dict

copy_feats(writer)[source]

Read the referenced feature array and save it using writer. Returns a copy of the manifest with updated fields related to the feature storage.

Return type

Features

static from_dict(data)[source]
Return type

Features

__init__(type, num_frames, num_features, frame_shift, sampling_rate, start, duration, storage_type, storage_path, storage_key, recording_id=None, channels=None)
class lhotse.features.base.FeatureSet(features=None)[source]

Represents a feature manifest and allows reading features for given recordings within particular channels and time ranges. It also keeps information about the feature extractor parameters used to obtain this set. When a given recording/time-range/channel is unavailable, raises a KeyError.

__init__(features=None)[source]
property data: Union[Dict[str, lhotse.features.base.Features], Iterable[lhotse.features.base.Features]]

Alias property for self.features

Return type

Union[Dict[str, Features], Iterable[Features]]

static from_features(features)[source]
Return type

FeatureSet

static from_items(features)

Function to be implemented by every sub-class of this mixin. It’s expected to create a sub-class instance out of an iterable of items that are held by the sub-class (e.g., CutSet.from_items(iterable_of_cuts)).

Return type

FeatureSet

static from_dicts(data)[source]
Return type

FeatureSet

to_dicts()[source]
Return type

Iterable[dict]

with_path_prefix(path)[source]
Return type

FeatureSet

split(num_splits, shuffle=False, drop_last=False)[source]

Split the FeatureSet into num_splits pieces of equal size.

Parameters
  • num_splits (int) – Requested number of splits.

  • shuffle (bool) – Optionally shuffle the recordings order first.

  • drop_last (bool) – determines how to handle splitting when len(seq) is not divisible by num_splits. When False (default), the splits might have unequal lengths. When True, it may discard the last element in some splits to ensure they are equally long.

Return type

List[FeatureSet]

Returns

A list of FeatureSet pieces.

split_lazy(output_dir, chunk_size, prefix='')[source]

Splits a manifest (either lazily or eagerly opened) into chunks, each with chunk_size items (except for the last one, typically).

In order to be memory efficient, this implementation saves each chunk to disk in a .jsonl.gz format as the input manifest is sampled.

Note

For lowest memory usage, use load_manifest_lazy to open the input manifest for this method.

Parameters
  • output_dir (Union[Path, str]) – directory where the split manifests are saved. Each manifest is saved at: {output_dir}/{prefix}.{split_idx}.jsonl.gz

  • chunk_size (int) – the number of items in each chunk.

  • prefix (str) – the prefix of each manifest.

Return type

List[FeatureSet]

Returns

a list of lazily opened chunk manifests.

shuffle(*args, **kwargs)[source]

Shuffles the elements and returns a shuffled variant of self. If the manifest is opened lazily, performs shuffling on-the-fly with a fixed buffer size.

Parameters

rng – an optional instance of random.Random for precise control of randomness.

Returns

a shuffled copy of self, or a manifest that is shuffled lazily.

subset(first=None, last=None)[source]

Return a new FeatureSet according to the selected subset criterion. Only a single argument to subset is supported at this time.

Parameters
  • first (Optional[int]) – int, the number of first features to keep.

  • last (Optional[int]) – int, the number of last features to keep.

Return type

FeatureSet

Returns

a new FeatureSet with the subset results.

find(recording_id, channel_id=0, start=0.0, duration=None, leeway=0.05)[source]

Find and return a Features object that best satisfies the search criteria. Raise a KeyError when no such object is available.

Parameters
  • recording_id (str) – str, requested recording ID.

  • channel_id (Union[int, List[int]]) – int, requested channel.

  • start (float) – float, requested start time in seconds for the feature chunk.

  • duration (Optional[float]) – optional float, requested duration in seconds for the feature chunk. By default, return everything from the start.

  • leeway (float) – float, controls how strictly we have to match the requested start and duration criteria. It is necessary to keep a small positive value here (default 0.05s), as there might be differences between the duration of recording/supervision segment, and the duration of features. The latter one is constrained to be a multiple of frame_shift, while the former can be arbitrary.

Return type

Features

Returns

a Features object satisfying the search criteria.

load(recording_id, channel_id=0, start=0.0, duration=None)[source]

Find a Features object that best satisfies the search criteria and load the features as a numpy ndarray. Raise a KeyError when no such object is available.

Return type

ndarray

copy_feats(writer)[source]

For each manifest in this FeatureSet, read the referenced feature array and save it using writer. Returns a copy of the manifest with updated fields related to the feature storage.

Return type

FeatureSet

compute_global_stats(storage_path=None)[source]

Compute the global means and standard deviations for each feature bin in the manifest. It follows the implementation in scikit-learn: https://github.com/scikit-learn/scikit-learn/blob/0fb307bf39bbdacd6ed713c00724f8f871d60370/sklearn/utils/extmath.py#L715 which follows the paper: “Algorithms for computing the sample variance: analysis and recommendations”, by Chan, Golub, and LeVeque.

Parameters

storage_path (Union[Path, str, None]) – an optional path to a file where the stats will be stored with pickle.

Returns

a dict of {'norm_means': np.ndarray, 'norm_stds': np.ndarray} with the shape of the arrays equal to the number of feature bins in this manifest.

Return type

Dict[str, ndarray]
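
For example (a sketch; feats is assumed to be an eagerly loaded FeatureSet):

>>> stats = feats.compute_global_stats(storage_path='global_stats.pkl')
>>> sorted(stats)  # both arrays have shape (num_feature_bins,)
['norm_means', 'norm_stds']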

filter(predicate)

Return a new manifest containing only the items that satisfy predicate. If the manifest is lazy, the filtering will also be applied lazily.

Parameters

predicate (Callable[[~T], bool]) – a function that takes a manifest item as an argument and returns bool.

Returns

a filtered manifest.

classmethod from_file(path)
Return type

Any

classmethod from_json(path)
Return type

Any

classmethod from_jsonl(path)
Return type

Any

classmethod from_jsonl_lazy(path)

Read a JSONL manifest in a lazy manner, which opens the file but does not read it immediately. It is only suitable for sequential reads and iteration.

Warning

Opening the manifest in this way might cause some methods that rely on random access to fail.

Return type

Any

classmethod from_yaml(path)
Return type

Any

classmethod infinite_mux(*manifests, weights=None, seed=0, max_open_streams=None)

Merges multiple manifest iterables into a new iterable by lazily multiplexing them during iteration time. Unlike mux(), this method allows limiting the maximum number of open sub-iterators at any given time.

To enable this, it performs 2-stage sampling. First, it samples with replacement the set of iterators I to construct a subset I_sub of size max_open_streams. Then, for each iteration step, it samples an iterator i from I_sub, fetches the next item from it, and yields it. Once i becomes exhausted, it is replaced with a new iterator j sampled from I_sub.

Caution

Do not use this method with inputs that are infinitely iterable as they will silently break the multiplexing property by only using a subset of the input iterables.

Caution

This method is not recommended for a small number of iterations, as it may be much less accurate than mux() depending on the number of open streams, the iterable sizes, and the random seed.

Parameters
  • manifests – iterables to be multiplexed. They can be either lazy or eager, but the resulting manifest will always be lazy.

  • weights (Optional[List[Union[int, float]]]) – an optional weight for each iterable, affects the probability of it being sampled. The weights are uniform by default. If lengths are known, it makes sense to pass them here for uniform distribution of items in the expectation.

  • seed (Union[int, Literal[‘trng’]]) – the random seed, ensures deterministic order across multiple iterations.

  • max_open_streams (Optional[int]) – the number of iterables that can be open simultaneously at any given time.

property is_lazy: bool

Indicates whether this manifest was opened in lazy (read-on-the-fly) mode or not.

Return type

bool

map(transform_fn)

Apply transform_fn to each item in this manifest and return a new manifest. If the manifest is opened lazily, the transform is also applied lazily.

Parameters

transform_fn (Callable[[~T], ~T]) – A callable (function) that accepts a single item instance and returns a new (or the same) instance of the same type. E.g. with CutSet, the callable accepts a Cut and also returns a Cut.

Returns

a new CutSet with transformed cuts.

classmethod mux(*manifests, stop_early=False, weights=None, seed=0, max_open_streams=None)

Merges multiple manifest iterables into a new iterable by lazily multiplexing them during iteration time. If one of the iterables is exhausted before the others, we will keep iterating until all iterables are exhausted. This behavior can be changed with the stop_early parameter.

Parameters
  • manifests – iterables to be multiplexed. They can be either lazy or eager, but the resulting manifest will always be lazy.

  • stop_early (bool) – should we stop the iteration as soon as we exhaust one of the manifests.

  • weights (Optional[List[Union[int, float]]]) – an optional weight for each iterable, affects the probability of it being sampled. The weights are uniform by default. If lengths are known, it makes sense to pass them here for uniform distribution of items in the expectation.

  • seed (Union[int, Literal[‘trng’]]) – the random seed, ensures deterministic order across multiple iterations.

classmethod open_writer(path, overwrite=True)

Open a sequential writer that allows storing the manifests one by one, without keeping the whole manifest set in memory. Supports writing to JSONL format (.jsonl), with optional gzip compression (.jsonl.gz).

Note

when path is None, we will return an InMemoryWriter instead, which has the same API but stores the manifests in memory. It is convenient when you want to make saving to disk optional.

Example:

>>> from lhotse import RecordingSet
... recordings = [...]
... with RecordingSet.open_writer('recordings.jsonl.gz') as writer:
...     for recording in recordings:
...         writer.write(recording)

This writer can be useful for resuming a previously interrupted write – it will open the existing file and scan it for item IDs, so that it can skip writing them later. It can also be queried for existing IDs so that the user code may skip preparing the corresponding manifests.

Example:

>>> from lhotse import RecordingSet, Recording
... with RecordingSet.open_writer('recordings.jsonl.gz', overwrite=False) as writer:
...     for path in Path('.').rglob('*.wav'):
...         recording_id = path.stem
...         if writer.contains(recording_id):
...             # Item already written previously - skip processing.
...             continue
...         # Item doesn't exist yet - run extra work to prepare the manifest
...         # and store it.
...         recording = Recording.from_file(path, recording_id=recording_id)
...         writer.write(recording)
Return type

Union[SequentialJsonlWriter, InMemoryWriter]

repeat(times=None, preserve_id=False)

Return a new, lazily evaluated manifest that iterates over the original elements as many times as specified by times.

Parameters
  • times (Optional[int]) – how many times to repeat (infinite by default).

  • preserve_id (bool) – when True, we won’t update the element ID with repeat number.

Returns

a repeated manifest.

to_eager()

Evaluates all lazy operations on this manifest, if any, and returns a copy that keeps all items in memory. If the manifest was “eager” already, this is a no-op and won’t copy anything.

to_file(path)
Return type

None

to_json(path)
Return type

None

to_jsonl(path)
Return type

None

to_yaml(path)
Return type

None

class lhotse.features.base.FeatureSetBuilder(feature_extractor, storage, augment_fn=None)[source]

An extended constructor for the FeatureSet. Think of it as a class wrapper for a feature extraction script. It consumes an iterable of Recordings, extracts the features specified by the FeatureExtractor config, and stores them on disk.

Eventually, we plan to extend it with the capability to extract only the features in specified regions of recordings and to perform some time-domain data augmentation.

__init__(feature_extractor, storage, augment_fn=None)[source]
process_and_store_recordings(recordings, output_manifest=None, num_jobs=1)[source]
Return type

FeatureSet
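
A usage sketch (the extractor, storage path, manifest paths, and job count are illustrative):

>>> from lhotse import Fbank, LilcomChunkyWriter, RecordingSet
>>> recordings = RecordingSet.from_file('recordings.jsonl.gz')
>>> with LilcomChunkyWriter('feats') as storage:
...     builder = FeatureSetBuilder(feature_extractor=Fbank(), storage=storage)
...     feature_set = builder.process_and_store_recordings(
...         recordings=recordings, output_manifest='features.jsonl.gz', num_jobs=4
...     )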

lhotse.features.base.store_feature_array(feats, storage)[source]

Store feats array on disk, using lilcom compression by default.

Parameters
  • feats (ndarray) – a numpy ndarray containing features.

  • storage (FeaturesWriter) – a FeaturesWriter object to use for array storage.

Return type

str

Returns

a path to the file containing the stored array.

lhotse.features.base.compute_global_stats(feature_manifests, storage_path=None)[source]

Compute the global means and standard deviations for each feature bin in the manifest. It performs only a single pass over the data and iteratively updates the estimate of the means and variances.

We follow the implementation in scikit-learn: https://github.com/scikit-learn/scikit-learn/blob/0fb307bf39bbdacd6ed713c00724f8f871d60370/sklearn/utils/extmath.py#L715 which follows the paper: “Algorithms for computing the sample variance: analysis and recommendations”, by Chan, Golub, and LeVeque.

Parameters
  • feature_manifests (Iterable[Features]) – an iterable of Features objects.

  • storage_path (Union[Path, str, None]) – an optional path to a file where the stats will be stored with pickle.

Returns

a dict of {'norm_means': np.ndarray, 'norm_stds': np.ndarray} with the shape of the arrays equal to the number of feature bins in this manifest.

Return type

Dict[str, ndarray]

class lhotse.features.base.StatsAccumulator(feature_dim)[source]
__init__(feature_dim)[source]
update(arr)[source]
Return type

None

property norm_means: numpy.ndarray
Return type

ndarray

property norm_stds: numpy.ndarray
Return type

ndarray

get()[source]
Return type

Dict[str, ndarray]
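
A usage sketch, updating the accumulator with feature matrices one at a time (random matrices stand in for real features):

>>> import numpy as np
>>> acc = StatsAccumulator(feature_dim=80)
>>> for mat in (np.random.rand(100, 80) for _ in range(3)):
...     acc.update(mat)
>>> stats = acc.get()  # {'norm_means': ..., 'norm_stds': ...}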

Lhotse’s feature extractors

class lhotse.features.kaldi.extractors.Fbank(config=None)[source]
name = 'kaldi-fbank'
config_type

alias of lhotse.features.kaldi.extractors.FbankConfig

__init__(config=None)[source]
property device: Union[str, torch.device]
Return type

Union[str, device]

property frame_shift: float
Return type

float

to(device)[source]
feature_dim(sampling_rate)[source]
Return type

int

extract(samples, sampling_rate)[source]

Defines how to extract features using a numpy ndarray of audio samples and the sampling rate.

Return type

Union[ndarray, Tensor]

Returns

a numpy ndarray representing the feature matrix.

extract_batch(samples, sampling_rate, lengths=None)[source]

Performs batch extraction. It is not guaranteed to be faster than FeatureExtractor.extract() – it depends on whether the implementation of a particular feature extractor supports accelerated batch computation. If lengths is provided, it is assumed that the input is a batch of padded sequences, so we will not perform any further collation.

Note

Unless overridden by child classes, it defaults to sequentially calling FeatureExtractor.extract() on the inputs.

Note

This method should support variable length inputs.

Return type

Union[ndarray, Tensor, List[ndarray], List[Tensor]]

static mix(features_a, features_b, energy_scaling_factor_b)[source]

Perform feature-domain mix of two signals, a and b, and return the mixed signal.

Parameters
  • features_a (ndarray) – Left-hand side (reference) signal.

  • features_b (ndarray) – Right-hand side (mixed-in) signal.

  • energy_scaling_factor_b (float) – A scaling factor for features_b energy. It is used to achieve a specific SNR. E.g. to mix with an SNR of 10dB when both features_a and features_b energies are 100, the features_b signal energy needs to be scaled by 0.1. Since different features (e.g. spectrogram, fbank, MFCC) require different combination of transformations (e.g. exp, log, sqrt, pow) to allow mixing of two signals, the exact place where to apply energy_scaling_factor_b to the signal is determined by the implementer.

Return type

ndarray

Returns

A mixed feature matrix.

static compute_energy(features)[source]

Compute the total energy of a feature matrix. How the energy is computed depends on a particular type of features. It is expected that when implemented, compute_energy will never return zero.

Parameters

features (ndarray) – A feature matrix.

Return type

float

Returns

A positive float value of the signal energy.

class lhotse.features.kaldi.extractors.Mfcc(config=None)[source]
name = 'kaldi-mfcc'
config_type

alias of lhotse.features.kaldi.extractors.MfccConfig

__init__(config=None)[source]
property device: Union[str, torch.device]
Return type

Union[str, device]

property frame_shift: float
Return type

float

feature_dim(sampling_rate)[source]
Return type

int

extract(samples, sampling_rate)[source]

Defines how to extract features using a numpy ndarray of audio samples and the sampling rate.

Return type

Union[ndarray, Tensor]

Returns

a numpy ndarray representing the feature matrix.

extract_batch(samples, sampling_rate, lengths=None)[source]

Performs batch extraction. It is not guaranteed to be faster than FeatureExtractor.extract() – it depends on whether the implementation of a particular feature extractor supports accelerated batch computation. If lengths is provided, it is assumed that the input is a batch of padded sequences, so we will not perform any further collation.

Note

Unless overridden by child classes, it defaults to sequentially calling FeatureExtractor.extract() on the inputs.

Note

This method should support variable length inputs.

Return type

Union[ndarray, Tensor, List[ndarray], List[Tensor]]
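
Usage mirrors Fbank; a minimal sketch (samples as in the Fbank example above):

>>> from lhotse import Mfcc
>>> mfcc = Mfcc()
>>> feats = mfcc.extract(samples, sampling_rate=16000)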

Kaldi feature extractors as network layers

This whole module was authored and contributed by Jesus Villalba, with minor changes by Piotr Żelasko to make it more consistent with Lhotse.

It contains a PyTorch implementation of feature extractors that is very close to Kaldi's. Notably, it differs in that the pre-emphasis and DC offset removal are applied in the time domain rather than the frequency domain. This should not significantly affect any results, as confirmed by Jesus.

This implementation works well with autograd and batching, and can be used as neural network layers.

Update January 2022: these modules now expose a new API function called online_inference that may be used to compute features on streaming audio. The implementation is stateless and passes the waveform remainders back to the user, to be fed to the modules once new data becomes available. The implementation is compatible with JIT scripting via TorchScript.

class lhotse.features.kaldi.layers.Wav2Win(sampling_rate=16000, frame_length=0.025, frame_shift=0.01, pad_length=None, remove_dc_offset=True, preemph_coeff=0.97, window_type='povey', dither=0.0, snip_edges=False, energy_floor=1e-10, raw_energy=True, return_log_energy=False)[source]

Apply standard Kaldi preprocessing (dithering, removing DC offset, pre-emphasis, etc.) on the input waveforms and partition them into overlapping frames (of audio samples). Note: no feature extraction happens here; the output is still a time-domain signal.

Example:

>>> x = torch.randn(1, 16000, dtype=torch.float32)
>>> x.shape
torch.Size([1, 16000])
>>> t = Wav2Win()
>>> t(x).shape
torch.Size([1, 100, 400])

The input is a tensor of shape (batch_size, num_samples). The output is a tensor of shape (batch_size, num_frames, window_length). When return_log_energy == True, a tuple is returned whose second element is a log-energy tensor of shape (batch_size, num_frames).

__init__(sampling_rate=16000, frame_length=0.025, frame_shift=0.01, pad_length=None, remove_dc_offset=True, preemph_coeff=0.97, window_type='povey', dither=0.0, snip_edges=False, energy_floor=1e-10, raw_energy=True, return_log_energy=False)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

Return type

Tuple[Tensor, Optional[Tensor]]

online_inference(x, context=None)[source]

The same as the forward() method, except it accepts an extra argument with the remainder waveform from the previous call of online_inference(), and returns a tuple of ((frames, log_energy), remainder).

Return type

Tuple[Tuple[Tensor, Optional[Tensor]], Tensor]
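
A streaming sketch, assuming (per the January 2022 note above) that the remainder returned by one call is passed as context to the next; log_e is None unless return_log_energy=True:

>>> import torch
>>> t = Wav2Win()
>>> (frames, log_e), remainder = t.online_inference(torch.randn(1, 1600))
>>> (frames, log_e), remainder = t.online_inference(torch.randn(1, 1600), context=remainder)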

T_destination

alias of TypeVar('T_destination', bound=Dict[str, Any])

add_module(name, module)

Adds a child module to the current module.

The module can be accessed as an attribute using the given name.

Args:
name (str): name of the child module. The child module can be

accessed from this module using the given name

module (Module): child module to be added to the module.

Return type

None

apply(fn)

Applies fn recursively to every submodule (as returned by .children()) as well as self. Typical use includes initializing the parameters of a model (see also nn-init-doc).

Args:

fn (Module -> None): function to be applied to each submodule

Returns:

Module: self

Example:

>>> @torch.no_grad()
>>> def init_weights(m):
>>>     print(m)
>>>     if type(m) == nn.Linear:
>>>         m.weight.fill_(1.0)
>>>         print(m.weight)
>>> net = nn.Sequential(nn.Linear(2, 2), nn.Linear(2, 2))
>>> net.apply(init_weights)
Linear(in_features=2, out_features=2, bias=True)
Parameter containing:
tensor([[1., 1.],
        [1., 1.]], requires_grad=True)
Linear(in_features=2, out_features=2, bias=True)
Parameter containing:
tensor([[1., 1.],
        [1., 1.]], requires_grad=True)
Sequential(
  (0): Linear(in_features=2, out_features=2, bias=True)
  (1): Linear(in_features=2, out_features=2, bias=True)
)
Return type

~T

bfloat16()

Casts all floating point parameters and buffers to bfloat16 datatype.

Note

This method modifies the module in-place.

Returns:

Module: self

Return type

~T

buffers(recurse=True)

Returns an iterator over module buffers.

Args:
recurse (bool): if True, then yields buffers of this module

and all submodules. Otherwise, yields only buffers that are direct members of this module.

Yields:

torch.Tensor: module buffer

Example:

>>> # xdoctest: +SKIP("undefined vars")
>>> for buf in model.buffers():
>>>     print(type(buf), buf.size())
<class 'torch.Tensor'> (20L,)
<class 'torch.Tensor'> (20L, 1L, 5L, 5L)
Return type

Iterator[Tensor]

call_super_init: bool = False
children()

Returns an iterator over immediate children modules.

Yields:

Module: a child module

Return type

Iterator[Module]

compile(*args, **kwargs)

Compile this Module’s forward using torch.compile().

This Module’s __call__ method is compiled and all arguments are passed as-is to torch.compile().

See torch.compile() for details on the arguments for this function.

cpu()

Moves all model parameters and buffers to the CPU.

Note

This method modifies the module in-place.

Returns:

Module: self

Return type

~T

cuda(device=None)

Moves all model parameters and buffers to the GPU.

This also makes associated parameters and buffers different objects. So it should be called before constructing optimizer if the module will live on GPU while being optimized.

Note

This method modifies the module in-place.

Args:
device (int, optional): if specified, all parameters will be

copied to that device

Returns:

Module: self

Return type

~T

double()

Casts all floating point parameters and buffers to double datatype.

Note

This method modifies the module in-place.

Returns:

Module: self

Return type

~T

dump_patches: bool = False
eval()

Sets the module in evaluation mode.

This has any effect only on certain modules. See documentations of particular modules for details of their behaviors in training/evaluation mode, if they are affected, e.g. Dropout, BatchNorm, etc.

This is equivalent to self.train(False).

See locally-disable-grad-doc for a comparison between .eval() and several similar mechanisms that may be confused with it.

Returns:

Module: self

Return type

~T

extra_repr()

Sets the extra representation of the module.

To print customized extra information, you should re-implement this method in your own modules. Both single-line and multi-line strings are acceptable.

Return type

str

float()

Casts all floating point parameters and buffers to float datatype.

Note

This method modifies the module in-place.

Returns:

Module: self

Return type

~T

get_buffer(target)

Returns the buffer given by target if it exists, otherwise throws an error.

See the docstring for get_submodule for a more detailed explanation of this method’s functionality as well as how to correctly specify target.

Args:
target: The fully-qualified string name of the buffer

to look for. (See get_submodule for how to specify a fully-qualified string.)

Returns:

torch.Tensor: The buffer referenced by target

Raises:
AttributeError: If the target string references an invalid

path or resolves to something that is not a buffer

Return type

Tensor

get_extra_state()

Returns any extra state to include in the module’s state_dict. Implement this and a corresponding set_extra_state() for your module if you need to store extra state. This function is called when building the module’s state_dict().

Note that extra state should be picklable to ensure working serialization of the state_dict. We only provide backwards compatibility guarantees for serializing Tensors; other objects may break backwards compatibility if their serialized pickled form changes.

Returns:

object: Any extra state to store in the module’s state_dict

Return type

Any

get_parameter(target)

Returns the parameter given by target if it exists, otherwise throws an error.

See the docstring for get_submodule for a more detailed explanation of this method’s functionality as well as how to correctly specify target.

Args:
target: The fully-qualified string name of the Parameter

to look for. (See get_submodule for how to specify a fully-qualified string.)

Returns:

torch.nn.Parameter: The Parameter referenced by target

Raises:
AttributeError: If the target string references an invalid

path or resolves to something that is not an nn.Parameter

Return type

Parameter

get_submodule(target)

Returns the submodule given by target if it exists, otherwise throws an error.

For example, let’s say you have an nn.Module A that looks like this:

A(
    (net_b): Module(
        (net_c): Module(
            (conv): Conv2d(16, 33, kernel_size=(3, 3), stride=(2, 2))
        )
        (linear): Linear(in_features=100, out_features=200, bias=True)
    )
)

(The diagram shows an nn.Module A. A has a nested submodule net_b, which itself has two submodules net_c and linear. net_c then has a submodule conv.)

To check whether or not we have the linear submodule, we would call get_submodule("net_b.linear"). To check whether we have the conv submodule, we would call get_submodule("net_b.net_c.conv").

The runtime of get_submodule is bounded by the degree of module nesting in target. A query against named_modules achieves the same result, but it is O(N) in the number of transitive modules. So, for a simple check to see if some submodule exists, get_submodule should always be used.

Args:
target: The fully-qualified string name of the submodule

to look for. (See above example for how to specify a fully-qualified string.)

Returns:

torch.nn.Module: The submodule referenced by target

Raises:
AttributeError: If the target string references an invalid

path or resolves to something that is not an nn.Module

Return type

Module

half()

Casts all floating point parameters and buffers to half datatype.

Note

This method modifies the module in-place.

Returns:

Module: self

Return type

~T

ipu(device=None)

Moves all model parameters and buffers to the IPU.

This also makes associated parameters and buffers different objects. So it should be called before constructing optimizer if the module will live on IPU while being optimized.

Note

This method modifies the module in-place.

Arguments:
device (int, optional): if specified, all parameters will be

copied to that device

Returns:

Module: self

Return type

~T

load_state_dict(state_dict, strict=True, assign=False)

Copies parameters and buffers from state_dict into this module and its descendants. If strict is True, then the keys of state_dict must exactly match the keys returned by this module’s state_dict() function.

Warning

If assign is True the optimizer must be created after the call to load_state_dict.

Args:
state_dict (dict): a dict containing parameters and

persistent buffers.

strict (bool, optional): whether to strictly enforce that the keys

in state_dict match the keys returned by this module’s state_dict() function. Default: True

assign (bool, optional): whether to assign items in the state

dictionary to their corresponding keys in the module instead of copying them inplace into the module’s current parameters and buffers. When False, the properties of the tensors in the current module are preserved while when True, the properties of the Tensors in the state dict are preserved. Default: False

Returns:
NamedTuple with missing_keys and unexpected_keys fields:
  • missing_keys is a list of str containing the missing keys

  • unexpected_keys is a list of str containing the unexpected keys

Note:

If a parameter or buffer is registered as None and its corresponding key exists in state_dict, load_state_dict() will raise a RuntimeError.
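
For example, round-tripping the state of a small module (a minimal sketch using torch.nn directly):

>>> import torch.nn as nn
>>> src, dst = nn.Linear(2, 2), nn.Linear(2, 2)
>>> result = dst.load_state_dict(src.state_dict())
>>> result.missing_keys, result.unexpected_keys
([], [])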

modules()

Returns an iterator over all modules in the network.

Yields:

Module: a module in the network

Note:

Duplicate modules are returned only once. In the following example, l will be returned only once.

Example:

>>> l = nn.Linear(2, 2)
>>> net = nn.Sequential(l, l)
>>> for idx, m in enumerate(net.modules()):
...     print(idx, '->', m)

0 -> Sequential(
  (0): Linear(in_features=2, out_features=2, bias=True)
  (1): Linear(in_features=2, out_features=2, bias=True)
)
1 -> Linear(in_features=2, out_features=2, bias=True)
Return type

Iterator[Module]

named_buffers(prefix='', recurse=True, remove_duplicate=True)

Returns an iterator over module buffers, yielding both the name of the buffer as well as the buffer itself.

Args:

prefix (str): prefix to prepend to all buffer names.

recurse (bool, optional): if True, then yields buffers of this module and all submodules. Otherwise, yields only buffers that are direct members of this module. Defaults to True.

remove_duplicate (bool, optional): whether to remove the duplicated buffers in the result. Defaults to True.

Yields:

(str, torch.Tensor): Tuple containing the name and buffer

Example:

>>> # xdoctest: +SKIP("undefined vars")
>>> for name, buf in self.named_buffers():
>>>     if name in ['running_var']:
>>>         print(buf.size())
Return type

Iterator[Tuple[str, Tensor]]

named_children()

Returns an iterator over immediate children modules, yielding both the name of the module as well as the module itself.

Yields:

(str, Module): Tuple containing a name and child module

Example:

>>> # xdoctest: +SKIP("undefined vars")
>>> for name, module in model.named_children():
>>>     if name in ['conv4', 'conv5']:
>>>         print(module)
Return type

Iterator[Tuple[str, Module]]

named_modules(memo=None, prefix='', remove_duplicate=True)

Returns an iterator over all modules in the network, yielding both the name of the module as well as the module itself.

Args:

memo: a memo to store the set of modules already added to the result.

prefix: a prefix that will be added to the name of the module.

remove_duplicate: whether to remove the duplicated module instances in the result or not.

Yields:

(str, Module): Tuple of name and module

Note:

Duplicate modules are returned only once. In the following example, l will be returned only once.

Example:

>>> l = nn.Linear(2, 2)
>>> net = nn.Sequential(l, l)
>>> for idx, m in enumerate(net.named_modules()):
...     print(idx, '->', m)

0 -> ('', Sequential(
  (0): Linear(in_features=2, out_features=2, bias=True)
  (1): Linear(in_features=2, out_features=2, bias=True)
))
1 -> ('0', Linear(in_features=2, out_features=2, bias=True))
named_parameters(prefix='', recurse=True, remove_duplicate=True)

Returns an iterator over module parameters, yielding both the name of the parameter as well as the parameter itself.

Args:

prefix (str): prefix to prepend to all parameter names.

recurse (bool): if True, then yields parameters of this module and all submodules. Otherwise, yields only parameters that are direct members of this module.

remove_duplicate (bool, optional): whether to remove the duplicated parameters in the result. Defaults to True.

Yields:

(str, Parameter): Tuple containing the name and parameter

Example:

>>> # xdoctest: +SKIP("undefined vars")
>>> for name, param in self.named_parameters():
>>>     if name in ['bias']:
>>>         print(param.size())
Return type

Iterator[Tuple[str, Parameter]]

parameters(recurse=True)

Returns an iterator over module parameters.

This is typically passed to an optimizer.

Args:
recurse (bool): if True, then yields parameters of this module

and all submodules. Otherwise, yields only parameters that are direct members of this module.

Yields:

Parameter: module parameter

Example:

>>> # xdoctest: +SKIP("undefined vars")
>>> for param in model.parameters():
>>>     print(type(param), param.size())
<class 'torch.Tensor'> (20L,)
<class 'torch.Tensor'> (20L, 1L, 5L, 5L)
Return type

Iterator[Parameter]

register_backward_hook(hook)

Registers a backward hook on the module.

This function is deprecated in favor of register_full_backward_hook() and the behavior of this function will change in future versions.

Returns:
torch.utils.hooks.RemovableHandle:

a handle that can be used to remove the added hook by calling handle.remove()

Return type

RemovableHandle

register_buffer(name, tensor, persistent=True)

Adds a buffer to the module.

This is typically used to register a buffer that should not be considered a model parameter. For example, BatchNorm’s running_mean is not a parameter, but is part of the module’s state. Buffers, by default, are persistent and will be saved alongside parameters. This behavior can be changed by setting persistent to False. The only difference between a persistent buffer and a non-persistent buffer is that the latter will not be a part of this module’s state_dict.

Buffers can be accessed as attributes using given names.

Args:
name (str): name of the buffer. The buffer can be accessed

from this module using the given name

tensor (Tensor or None): buffer to be registered. If None, then operations

that run on buffers, such as cuda, are ignored. If None, the buffer is not included in the module’s state_dict.

persistent (bool): whether the buffer is part of this module’s

state_dict.

Example:

>>> # xdoctest: +SKIP("undefined vars")
>>> self.register_buffer('running_mean', torch.zeros(num_features))
Return type

None

register_forward_hook(hook, *, prepend=False, with_kwargs=False, always_call=False)

Registers a forward hook on the module.

The hook will be called every time after forward() has computed an output.

If with_kwargs is False or not specified, the input contains only the positional arguments given to the module. Keyword arguments won’t be passed to the hooks and only to the forward. The hook can modify the output. It can modify the input inplace but it will not have effect on forward since this is called after forward() is called. The hook should have the following signature:

hook(module, args, output) -> None or modified output

If with_kwargs is True, the forward hook will be passed the kwargs given to the forward function and be expected to return the output possibly modified. The hook should have the following signature:

hook(module, args, kwargs, output) -> None or modified output
Args:

hook (Callable): The user-defined hook to be registered.

prepend (bool): If True, the provided hook will be fired before all existing forward hooks on this torch.nn.modules.Module. Otherwise, the provided hook will be fired after all existing forward hooks on this torch.nn.modules.Module. Note that global forward hooks registered with register_module_forward_hook() will fire before all hooks registered by this method. Default: False

with_kwargs (bool): If True, the hook will be passed the kwargs given to the forward function. Default: False

always_call (bool): If True, the hook will be run regardless of whether an exception is raised while calling the Module. Default: False

Returns:
torch.utils.hooks.RemovableHandle:

a handle that can be used to remove the added hook by calling handle.remove()

Return type

RemovableHandle
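
A minimal sketch of a forward hook that logs output shapes (shape_hook is a name chosen for this example):

>>> import torch
>>> import torch.nn as nn
>>> def shape_hook(module, args, output):
...     print(type(module).__name__, tuple(output.shape))
>>> layer = nn.Linear(2, 2)
>>> handle = layer.register_forward_hook(shape_hook)
>>> _ = layer(torch.randn(3, 2))
Linear (3, 2)
>>> handle.remove()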

register_forward_pre_hook(hook, *, prepend=False, with_kwargs=False)

Registers a forward pre-hook on the module.

The hook will be called every time before forward() is invoked.

If with_kwargs is false or not specified, the input contains only the positional arguments given to the module. Keyword arguments won’t be passed to the hooks and only to the forward. The hook can modify the input. User can either return a tuple or a single modified value in the hook. We will wrap the value into a tuple if a single value is returned (unless that value is already a tuple). The hook should have the following signature:

hook(module, args) -> None or modified input

If with_kwargs is true, the forward pre-hook will be passed the kwargs given to the forward function. And if the hook modifies the input, both the args and kwargs should be returned. The hook should have the following signature:

hook(module, args, kwargs) -> None or a tuple of modified input and kwargs
Args:

hook (Callable): The user-defined hook to be registered.

prepend (bool): If true, the provided hook will be fired before all existing forward_pre hooks on this torch.nn.modules.Module. Otherwise, the provided hook will be fired after all existing forward_pre hooks on this torch.nn.modules.Module. Note that global forward_pre hooks registered with register_module_forward_pre_hook() will fire before all hooks registered by this method. Default: False

with_kwargs (bool): If true, the hook will be passed the kwargs given to the forward function. Default: False

Returns:
torch.utils.hooks.RemovableHandle:

a handle that can be used to remove the added hook by calling handle.remove()

Return type

RemovableHandle

register_full_backward_hook(hook, prepend=False)

Registers a backward hook on the module.

The hook will be called every time the gradients with respect to a module are computed, i.e. the hook will execute if and only if the gradients with respect to module outputs are computed. The hook should have the following signature:

hook(module, grad_input, grad_output) -> tuple(Tensor) or None

The grad_input and grad_output are tuples that contain the gradients with respect to the inputs and outputs respectively. The hook should not modify its arguments, but it can optionally return a new gradient with respect to the input that will be used in place of grad_input in subsequent computations. grad_input will only correspond to the inputs given as positional arguments and all kwarg arguments are ignored. Entries in grad_input and grad_output will be None for all non-Tensor arguments.

For technical reasons, when this hook is applied to a Module, its forward function will receive a view of each Tensor passed to the Module. Similarly the caller will receive a view of each Tensor returned by the Module’s forward function.

Warning

Modifying inputs or outputs inplace is not allowed when using backward hooks and will raise an error.

Args:

hook (Callable): The user-defined hook to be registered.

prepend (bool): If true, the provided hook will be fired before all existing backward hooks on this torch.nn.modules.Module. Otherwise, the provided hook will be fired after all existing backward hooks on this torch.nn.modules.Module. Note that global backward hooks registered with register_module_full_backward_hook() will fire before all hooks registered by this method.

Returns:
torch.utils.hooks.RemovableHandle:

a handle that can be used to remove the added hook by calling handle.remove()

Return type

RemovableHandle

register_full_backward_pre_hook(hook, prepend=False)

Registers a backward pre-hook on the module.

The hook will be called every time the gradients for the module are computed. The hook should have the following signature:

hook(module, grad_output) -> tuple[Tensor] or None

The grad_output is a tuple. The hook should not modify its arguments, but it can optionally return a new gradient with respect to the output that will be used in place of grad_output in subsequent computations. Entries in grad_output will be None for all non-Tensor arguments.

For technical reasons, when this hook is applied to a Module, its forward function will receive a view of each Tensor passed to the Module. Similarly the caller will receive a view of each Tensor returned by the Module’s forward function.

Warning

Modifying inputs inplace is not allowed when using backward hooks and will raise an error.

Args:

hook (Callable): The user-defined hook to be registered.

prepend (bool): If true, the provided hook will be fired before all existing backward_pre hooks on this torch.nn.modules.Module. Otherwise, the provided hook will be fired after all existing backward_pre hooks on this torch.nn.modules.Module. Note that global backward_pre hooks registered with register_module_full_backward_pre_hook() will fire before all hooks registered by this method.

Returns:
torch.utils.hooks.RemovableHandle:

a handle that can be used to remove the added hook by calling handle.remove()

Return type

RemovableHandle

register_load_state_dict_post_hook(hook)

Registers a post hook to be run after module’s load_state_dict is called.

It should have the following signature:

hook(module, incompatible_keys) -> None

The module argument is the current module that this hook is registered on, and the incompatible_keys argument is a NamedTuple consisting of attributes missing_keys and unexpected_keys. missing_keys is a list of str containing the missing keys and unexpected_keys is a list of str containing the unexpected keys.

The given incompatible_keys can be modified inplace if needed.

Note that the checks performed when calling load_state_dict() with strict=True are affected by modifications the hook makes to missing_keys or unexpected_keys, as expected. Additions to either set of keys will result in an error being thrown when strict=True, and clearing out both missing and unexpected keys will avoid an error.

Returns:
torch.utils.hooks.RemovableHandle:

a handle that can be used to remove the added hook by calling handle.remove()

register_module(name, module)

Alias for add_module().

Return type

None

register_parameter(name, param)

Adds a parameter to the module.

The parameter can be accessed as an attribute using given name.

Args:
name (str): name of the parameter. The parameter can be accessed

from this module using the given name

param (Parameter or None): parameter to be added to the module. If

None, then operations that run on parameters, such as cuda, are ignored. If None, the parameter is not included in the module’s state_dict.

Return type

None

register_state_dict_pre_hook(hook)

These hooks will be called with arguments: self, prefix, and keep_vars before calling state_dict on self. The registered hooks can be used to perform pre-processing before the state_dict call is made.

requires_grad_(requires_grad=True)

Change if autograd should record operations on parameters in this module.

This method sets the parameters’ requires_grad attributes in-place.

This method is helpful for freezing part of the module for finetuning or training parts of a model individually (e.g., GAN training).

See locally-disable-grad-doc for a comparison between .requires_grad_() and several similar mechanisms that may be confused with it.

Args:
requires_grad (bool): whether autograd should record operations on

parameters in this module. Default: True.

Returns:

Module: self

Return type

~T
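
For instance, freezing the first layer of a small model (a minimal sketch):

>>> import torch.nn as nn
>>> net = nn.Sequential(nn.Linear(2, 2), nn.Linear(2, 2))
>>> _ = net[0].requires_grad_(False)
>>> [p.requires_grad for p in net.parameters()]
[False, False, True, True]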

set_extra_state(state)

This function is called from load_state_dict() to handle any extra state found within the state_dict. Implement this function and a corresponding get_extra_state() for your module if you need to store extra state within its state_dict.

Args:

state (dict): Extra state from the state_dict

share_memory()

See torch.Tensor.share_memory_()

Return type

~T

state_dict(*args, destination=None, prefix='', keep_vars=False)

Returns a dictionary containing references to the whole state of the module.

Both parameters and persistent buffers (e.g. running averages) are included. Keys are corresponding parameter and buffer names. Parameters and buffers set to None are not included.

Note

The returned object is a shallow copy. It contains references to the module’s parameters and buffers.

Warning

Currently state_dict() also accepts positional arguments for destination, prefix and keep_vars in order. However, this is being deprecated and keyword arguments will be enforced in future releases.

Warning

Please avoid the use of argument destination as it is not designed for end-users.

Args:
destination (dict, optional): If provided, the state of module will

be updated into the dict and the same object is returned. Otherwise, an OrderedDict will be created and returned. Default: None.

prefix (str, optional): a prefix added to parameter and buffer

names to compose the keys in state_dict. Default: ''.

keep_vars (bool, optional): by default the Tensor s

returned in the state dict are detached from autograd. If it’s set to True, detaching will not be performed. Default: False.

Returns:
dict:

a dictionary containing a whole state of the module

Example:

>>> # xdoctest: +SKIP("undefined vars")
>>> module.state_dict().keys()
['bias', 'weight']
to(*args, **kwargs)

Moves and/or casts the parameters and buffers.

This can be called as

to(device=None, dtype=None, non_blocking=False)
to(dtype, non_blocking=False)
to(tensor, non_blocking=False)
to(memory_format=torch.channels_last)

Its signature is similar to torch.Tensor.to(), but only accepts floating point or complex dtypes. In addition, this method will only cast the floating point or complex parameters and buffers to dtype (if given). The integral parameters and buffers will be moved to device, if that is given, but with dtypes unchanged. When non_blocking is set, it tries to convert/move asynchronously with respect to the host if possible, e.g., moving CPU Tensors with pinned memory to CUDA devices.

See below for examples.

Note

This method modifies the module in-place.

Args:
device (torch.device): the desired device of the parameters

and buffers in this module

dtype (torch.dtype): the desired floating point or complex dtype of

the parameters and buffers in this module

tensor (torch.Tensor): Tensor whose dtype and device are the desired

dtype and device for all parameters and buffers in this module

memory_format (torch.memory_format): the desired memory

format for 4D parameters and buffers in this module (keyword only argument)

Returns:

Module: self

Examples:

>>> # xdoctest: +IGNORE_WANT("non-deterministic")
>>> linear = nn.Linear(2, 2)
>>> linear.weight
Parameter containing:
tensor([[ 0.1913, -0.3420],
        [-0.5113, -0.2325]])
>>> linear.to(torch.double)
Linear(in_features=2, out_features=2, bias=True)
>>> linear.weight
Parameter containing:
tensor([[ 0.1913, -0.3420],
        [-0.5113, -0.2325]], dtype=torch.float64)
>>> # xdoctest: +REQUIRES(env:TORCH_DOCTEST_CUDA1)
>>> gpu1 = torch.device("cuda:1")
>>> linear.to(gpu1, dtype=torch.half, non_blocking=True)
Linear(in_features=2, out_features=2, bias=True)
>>> linear.weight
Parameter containing:
tensor([[ 0.1914, -0.3420],
        [-0.5112, -0.2324]], dtype=torch.float16, device='cuda:1')
>>> cpu = torch.device("cpu")
>>> linear.to(cpu)
Linear(in_features=2, out_features=2, bias=True)
>>> linear.weight
Parameter containing:
tensor([[ 0.1914, -0.3420],
        [-0.5112, -0.2324]], dtype=torch.float16)

>>> linear = nn.Linear(2, 2, bias=None).to(torch.cdouble)
>>> linear.weight
Parameter containing:
tensor([[ 0.3741+0.j,  0.2382+0.j],
        [ 0.5593+0.j, -0.4443+0.j]], dtype=torch.complex128)
>>> linear(torch.ones(3, 2, dtype=torch.cdouble))
tensor([[0.6122+0.j, 0.1150+0.j],
        [0.6122+0.j, 0.1150+0.j],
        [0.6122+0.j, 0.1150+0.j]], dtype=torch.complex128)
to_empty(*, device, recurse=True)

Moves the parameters and buffers to the specified device without copying storage.

Args:
device (torch.device): The desired device of the parameters

and buffers in this module.

recurse (bool): Whether parameters and buffers of submodules should

be recursively moved to the specified device.

Returns:

Module: self

Return type

~T

train(mode=True)

Sets the module in training mode.

This has any effect only on certain modules. See documentations of particular modules for details of their behaviors in training/evaluation mode, if they are affected, e.g. Dropout, BatchNorm, etc.

Args:
mode (bool): whether to set training mode (True) or evaluation

mode (False). Default: True.

Returns:

Module: self

Return type

~T

type(dst_type)

Casts all parameters and buffers to dst_type.

Note

This method modifies the module in-place.

Args:

dst_type (type or string): the desired type

Returns:

Module: self

Return type

~T

xpu(device=None)

Moves all model parameters and buffers to the XPU.

This also makes associated parameters and buffers different objects. So it should be called before constructing optimizer if the module will live on XPU while being optimized.

Note

This method modifies the module in-place.

Arguments:
device (int, optional): if specified, all parameters will be

copied to that device

Returns:

Module: self

Return type

~T

zero_grad(set_to_none=True)

Resets gradients of all model parameters. See similar function under torch.optim.Optimizer for more context.

Args:
set_to_none (bool): instead of setting to zero, set the grads to None.

See torch.optim.Optimizer.zero_grad() for details.

Return type

None
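
A minimal sketch; with the default set_to_none=True, the gradients are reset to None rather than zeroed in place:

>>> import torch
>>> import torch.nn as nn
>>> net = nn.Linear(2, 2)
>>> net(torch.randn(3, 2)).sum().backward()
>>> net.zero_grad()
>>> net.weight.grad is None
True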

training: bool
class lhotse.features.kaldi.layers.Wav2FFT(sampling_rate=16000, frame_length=0.025, frame_shift=0.01, round_to_power_of_two=True, remove_dc_offset=True, preemph_coeff=0.97, window_type='povey', dither=0.0, snip_edges=False, energy_floor=1e-10, raw_energy=True, use_energy=True)[source]

Apply standard Kaldi preprocessing (dithering, removing DC offset, pre-emphasis, etc.) on the input waveforms and compute their Short-Time Fourier Transform (STFT). The output is a complex-valued tensor.

Example:

>>> x = torch.randn(1, 16000, dtype=torch.float32)
>>> x.shape
torch.Size([1, 16000])
>>> t = Wav2FFT()
>>> t(x).shape
torch.Size([1, 100, 257])

The input is a tensor of shape (batch_size, num_samples). The output is a tensor of shape (batch_size, num_frames, num_fft_bins) with dtype torch.complex64.

__init__(sampling_rate=16000, frame_length=0.025, frame_shift=0.01, round_to_power_of_two=True, remove_dc_offset=True, preemph_coeff=0.97, window_type='povey', dither=0.0, snip_edges=False, energy_floor=1e-10, raw_energy=True, use_energy=True)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

property sampling_rate: int
Return type

int

property frame_length: float
Return type

float

property frame_shift: float
Return type

float

property remove_dc_offset: bool
Return type

bool

property preemph_coeff: float
Return type

float

property window_type: str
Return type

str

property dither: float
Return type

float

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

Return type

Tensor

online_inference(x, context=None)[source]
Return type

Tuple[Tensor, Tensor]
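
A streaming sketch analogous to Wav2Win.online_inference(), assuming the returned remainder is fed back as context on the next call:

>>> import torch
>>> t = Wav2FFT()
>>> spec, remainder = t.online_inference(torch.randn(1, 1600))
>>> spec, remainder = t.online_inference(torch.randn(1, 1600), context=remainder)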

T_destination

alias of TypeVar('T_destination', bound=Dict[str, Any])

add_module(name, module)

Adds a child module to the current module.

The module can be accessed as an attribute using the given name.

Args:
name (str): name of the child module. The child module can be

accessed from this module using the given name

module (Module): child module to be added to the module.

Return type

None

apply(fn)

Applies fn recursively to every submodule (as returned by .children()) as well as self. Typical use includes initializing the parameters of a model (see also nn-init-doc).

Args:

fn (Module -> None): function to be applied to each submodule

Returns:

Module: self

Example:

>>> @torch.no_grad()
>>> def init_weights(m):
>>>     print(m)
>>>     if type(m) == nn.Linear:
>>>         m.weight.fill_(1.0)
>>>         print(m.weight)
>>> net = nn.Sequential(nn.Linear(2, 2), nn.Linear(2, 2))
>>> net.apply(init_weights)
Linear(in_features=2, out_features=2, bias=True)
Parameter containing:
tensor([[1., 1.],
        [1., 1.]], requires_grad=True)
Linear(in_features=2, out_features=2, bias=True)
Parameter containing:
tensor([[1., 1.],
        [1., 1.]], requires_grad=True)
Sequential(
  (0): Linear(in_features=2, out_features=2, bias=True)
  (1): Linear(in_features=2, out_features=2, bias=True)
)
Return type

~T

bfloat16()

Casts all floating point parameters and buffers to bfloat16 datatype.

Note

This method modifies the module in-place.

Returns:

Module: self

Return type

~T

buffers(recurse=True)

Returns an iterator over module buffers.

Args:
recurse (bool): if True, then yields buffers of this module

and all submodules. Otherwise, yields only buffers that are direct members of this module.

Yields:

torch.Tensor: module buffer

Example:

>>> # xdoctest: +SKIP("undefined vars")
>>> for buf in model.buffers():
>>>     print(type(buf), buf.size())
<class 'torch.Tensor'> (20L,)
<class 'torch.Tensor'> (20L, 1L, 5L, 5L)
Return type

Iterator[Tensor]

call_super_init: bool = False
children()

Returns an iterator over immediate children modules.

Yields:

Module: a child module

Return type

Iterator[Module]

compile(*args, **kwargs)

Compile this Module’s forward using torch.compile().

This Module’s __call__ method is compiled and all arguments are passed as-is to torch.compile().

See torch.compile() for details on the arguments for this function.

cpu()

Moves all model parameters and buffers to the CPU.

Note

This method modifies the module in-place.

Returns:

Module: self

Return type

~T

cuda(device=None)

Moves all model parameters and buffers to the GPU.

This also makes associated parameters and buffers different objects. So it should be called before constructing optimizer if the module will live on GPU while being optimized.

Note

This method modifies the module in-place.

Args:
device (int, optional): if specified, all parameters will be

copied to that device

Returns:

Module: self

Return type

~T

double()

Casts all floating point parameters and buffers to double datatype.

Note

This method modifies the module in-place.

Returns:

Module: self

Return type

~T

dump_patches: bool = False
eval()

Sets the module in evaluation mode.

This has any effect only on certain modules. See documentations of particular modules for details of their behaviors in training/evaluation mode, if they are affected, e.g. Dropout, BatchNorm, etc.

This is equivalent to self.train(False).

See locally-disable-grad-doc for a comparison between .eval() and several similar mechanisms that may be confused with it.

Returns:

Module: self

Return type

~T

extra_repr()

Sets the extra representation of the module.

To print customized extra information, you should re-implement this method in your own modules. Both single-line and multi-line strings are acceptable.

Return type

str

float()

Casts all floating point parameters and buffers to float datatype.

Note

This method modifies the module in-place.

Returns:

Module: self

Return type

~T

get_buffer(target)

Returns the buffer given by target if it exists, otherwise throws an error.

See the docstring for get_submodule for a more detailed explanation of this method’s functionality as well as how to correctly specify target.

Args:
target: The fully-qualified string name of the buffer

to look for. (See get_submodule for how to specify a fully-qualified string.)

Returns:

torch.Tensor: The buffer referenced by target

Raises:
AttributeError: If the target string references an invalid

path or resolves to something that is not a buffer

Return type

Tensor

get_extra_state()

Returns any extra state to include in the module’s state_dict. Implement this and a corresponding set_extra_state() for your module if you need to store extra state. This function is called when building the module’s state_dict().

Note that extra state should be picklable to ensure working serialization of the state_dict. We only provide backwards compatibility guarantees for serializing Tensors; other objects may break backwards compatibility if their serialized pickled form changes.

Returns:

object: Any extra state to store in the module’s state_dict

Return type

Any

get_parameter(target)

Returns the parameter given by target if it exists, otherwise throws an error.

See the docstring for get_submodule for a more detailed explanation of this method’s functionality as well as how to correctly specify target.

Args:
target: The fully-qualified string name of the Parameter

to look for. (See get_submodule for how to specify a fully-qualified string.)

Returns:

torch.nn.Parameter: The Parameter referenced by target

Raises:
AttributeError: If the target string references an invalid

path or resolves to something that is not an nn.Parameter

Return type

Parameter

get_submodule(target)

Returns the submodule given by target if it exists, otherwise throws an error.

For example, let’s say you have an nn.Module A that looks like this:

A(
    (net_b): Module(
        (net_c): Module(
            (conv): Conv2d(16, 33, kernel_size=(3, 3), stride=(2, 2))
        )
        (linear): Linear(in_features=100, out_features=200, bias=True)
    )
)

(The diagram shows an nn.Module A. A has a nested submodule net_b, which itself has two submodules net_c and linear. net_c then has a submodule conv.)

To check whether or not we have the linear submodule, we would call get_submodule("net_b.linear"). To check whether we have the conv submodule, we would call get_submodule("net_b.net_c.conv").

The runtime of get_submodule is bounded by the degree of module nesting in target. A query against named_modules achieves the same result, but it is O(N) in the number of transitive modules. So, for a simple check to see if some submodule exists, get_submodule should always be used.

Args:
target: The fully-qualified string name of the submodule

to look for. (See above example for how to specify a fully-qualified string.)

Returns:

torch.nn.Module: The submodule referenced by target

Raises:
AttributeError: If the target string references an invalid

path or resolves to something that is not an nn.Module

Return type

Module

half()

Casts all floating point parameters and buffers to half datatype.

Note

This method modifies the module in-place.

Returns:

Module: self

Return type

~T

ipu(device=None)

Moves all model parameters and buffers to the IPU.

This also makes associated parameters and buffers different objects. So it should be called before constructing optimizer if the module will live on IPU while being optimized.

Note

This method modifies the module in-place.

Arguments:
device (int, optional): if specified, all parameters will be

copied to that device

Returns:

Module: self

Return type

~T

load_state_dict(state_dict, strict=True, assign=False)

Copies parameters and buffers from state_dict into this module and its descendants. If strict is True, then the keys of state_dict must exactly match the keys returned by this module’s state_dict() function.

Warning

If assign is True the optimizer must be created after the call to load_state_dict.

Args:
state_dict (dict): a dict containing parameters and

persistent buffers.

strict (bool, optional): whether to strictly enforce that the keys

in state_dict match the keys returned by this module’s state_dict() function. Default: True

assign (bool, optional): whether to assign items in the state

dictionary to their corresponding keys in the module instead of copying them inplace into the module’s current parameters and buffers. When False, the properties of the tensors in the current module are preserved while when True, the properties of the Tensors in the state dict are preserved. Default: False

Returns:
NamedTuple with missing_keys and unexpected_keys fields:
  • missing_keys is a list of str containing the missing keys

  • unexpected_keys is a list of str containing the unexpected keys

Note:

If a parameter or buffer is registered as None and its corresponding key exists in state_dict, load_state_dict() will raise a RuntimeError.

modules()

Returns an iterator over all modules in the network.

Yields:

Module: a module in the network

Note:

Duplicate modules are returned only once. In the following example, l will be returned only once.

Example:

>>> l = nn.Linear(2, 2)
>>> net = nn.Sequential(l, l)
>>> for idx, m in enumerate(net.modules()):
...     print(idx, '->', m)

0 -> Sequential(
  (0): Linear(in_features=2, out_features=2, bias=True)
  (1): Linear(in_features=2, out_features=2, bias=True)
)
1 -> Linear(in_features=2, out_features=2, bias=True)
Return type

Iterator[Module]

named_buffers(prefix='', recurse=True, remove_duplicate=True)

Returns an iterator over module buffers, yielding both the name of the buffer as well as the buffer itself.

Args:

prefix (str): prefix to prepend to all buffer names.

recurse (bool, optional): if True, then yields buffers of this module and all submodules. Otherwise, yields only buffers that are direct members of this module. Defaults to True.

remove_duplicate (bool, optional): whether to remove the duplicated buffers in the result. Defaults to True.

Yields:

(str, torch.Tensor): Tuple containing the name and buffer

Example:

>>> # xdoctest: +SKIP("undefined vars")
>>> for name, buf in self.named_buffers():
>>>     if name in ['running_var']:
>>>         print(buf.size())
Return type

Iterator[Tuple[str, Tensor]]

named_children()

Returns an iterator over immediate children modules, yielding both the name of the module as well as the module itself.

Yields:

(str, Module): Tuple containing a name and child module

Example:

>>> # xdoctest: +SKIP("undefined vars")
>>> for name, module in model.named_children():
>>>     if name in ['conv4', 'conv5']:
>>>         print(module)
Return type

Iterator[Tuple[str, Module]]

named_modules(memo=None, prefix='', remove_duplicate=True)

Returns an iterator over all modules in the network, yielding both the name of the module as well as the module itself.

Args:

memo: a memo to store the set of modules already added to the result.

prefix: a prefix that will be added to the name of the module.

remove_duplicate: whether to remove the duplicated module instances in the result or not.

Yields:

(str, Module): Tuple of name and module

Note:

Duplicate modules are returned only once. In the following example, l will be returned only once.

Example:

>>> l = nn.Linear(2, 2)
>>> net = nn.Sequential(l, l)
>>> for idx, m in enumerate(net.named_modules()):
...     print(idx, '->', m)

0 -> ('', Sequential(
  (0): Linear(in_features=2, out_features=2, bias=True)
  (1): Linear(in_features=2, out_features=2, bias=True)
))
1 -> ('0', Linear(in_features=2, out_features=2, bias=True))
named_parameters(prefix='', recurse=True, remove_duplicate=True)

Returns an iterator over module parameters, yielding both the name of the parameter as well as the parameter itself.

Args:

prefix (str): prefix to prepend to all parameter names.

recurse (bool): if True, then yields parameters of this module and all submodules. Otherwise, yields only parameters that are direct members of this module.

remove_duplicate (bool, optional): whether to remove the duplicated parameters in the result. Defaults to True.

Yields:

(str, Parameter): Tuple containing the name and parameter

Example:

>>> # xdoctest: +SKIP("undefined vars")
>>> for name, param in self.named_parameters():
>>>     if name in ['bias']:
>>>         print(param.size())
Return type

Iterator[Tuple[str, Parameter]]

parameters(recurse=True)

Returns an iterator over module parameters.

This is typically passed to an optimizer.

Args:
recurse (bool): if True, then yields parameters of this module

and all submodules. Otherwise, yields only parameters that are direct members of this module.

Yields:

Parameter: module parameter

Example:

>>> # xdoctest: +SKIP("undefined vars")
>>> for param in model.parameters():
>>>     print(type(param), param.size())
<class 'torch.Tensor'> (20L,)
<class 'torch.Tensor'> (20L, 1L, 5L, 5L)
Return type

Iterator[Parameter]

register_backward_hook(hook)

Registers a backward hook on the module.

This function is deprecated in favor of register_full_backward_hook() and the behavior of this function will change in future versions.

Returns:
torch.utils.hooks.RemovableHandle:

a handle that can be used to remove the added hook by calling handle.remove()

Return type

RemovableHandle

register_buffer(name, tensor, persistent=True)

Adds a buffer to the module.

This is typically used to register a buffer that should not be considered a model parameter. For example, BatchNorm’s running_mean is not a parameter, but is part of the module’s state. Buffers, by default, are persistent and will be saved alongside parameters. This behavior can be changed by setting persistent to False. The only difference between a persistent buffer and a non-persistent buffer is that the latter will not be a part of this module’s state_dict.

Buffers can be accessed as attributes using given names.

Args:
name (str): name of the buffer. The buffer can be accessed

from this module using the given name

tensor (Tensor or None): buffer to be registered. If None, then operations

that run on buffers, such as cuda, are ignored. If None, the buffer is not included in the module’s state_dict.

persistent (bool): whether the buffer is part of this module’s

state_dict.

Example:

>>> # xdoctest: +SKIP("undefined vars")
>>> self.register_buffer('running_mean', torch.zeros(num_features))
Return type

None

register_forward_hook(hook, *, prepend=False, with_kwargs=False, always_call=False)

Registers a forward hook on the module.

The hook will be called every time after forward() has computed an output.

If with_kwargs is False or not specified, the input contains only the positional arguments given to the module. Keyword arguments won’t be passed to the hooks and only to the forward. The hook can modify the output. It can modify the input inplace but it will not have effect on forward since this is called after forward() is called. The hook should have the following signature:

hook(module, args, output) -> None or modified output

If with_kwargs is True, the forward hook will be passed the kwargs given to the forward function and be expected to return the output possibly modified. The hook should have the following signature:

hook(module, args, kwargs, output) -> None or modified output
Args:

hook (Callable): The user-defined hook to be registered.

prepend (bool): If True, the provided hook will be fired before all existing forward hooks on this torch.nn.modules.Module. Otherwise, the provided hook will be fired after all existing forward hooks on this torch.nn.modules.Module. Note that global forward hooks registered with register_module_forward_hook() will fire before all hooks registered by this method. Default: False

with_kwargs (bool): If True, the hook will be passed the kwargs given to the forward function. Default: False

always_call (bool): If True, the hook will be run regardless of whether an exception is raised while calling the Module. Default: False

Returns:
torch.utils.hooks.RemovableHandle:

a handle that can be used to remove the added hook by calling handle.remove()

Return type

RemovableHandle

register_forward_pre_hook(hook, *, prepend=False, with_kwargs=False)

Registers a forward pre-hook on the module.

The hook will be called every time before forward() is invoked.

If with_kwargs is false or not specified, the input contains only the positional arguments given to the module. Keyword arguments won’t be passed to the hooks and only to the forward. The hook can modify the input. User can either return a tuple or a single modified value in the hook. We will wrap the value into a tuple if a single value is returned (unless that value is already a tuple). The hook should have the following signature:

hook(module, args) -> None or modified input

If with_kwargs is true, the forward pre-hook will be passed the kwargs given to the forward function. And if the hook modifies the input, both the args and kwargs should be returned. The hook should have the following signature:

hook(module, args, kwargs) -> None or a tuple of modified input and kwargs
Args:

hook (Callable): The user-defined hook to be registered.

prepend (bool): If true, the provided hook will be fired before all existing forward_pre hooks on this torch.nn.modules.Module. Otherwise, the provided hook will be fired after all existing forward_pre hooks on this torch.nn.modules.Module. Note that global forward_pre hooks registered with register_module_forward_pre_hook() will fire before all hooks registered by this method. Default: False

with_kwargs (bool): If true, the hook will be passed the kwargs given to the forward function. Default: False

Returns:
torch.utils.hooks.RemovableHandle:

a handle that can be used to remove the added hook by calling handle.remove()

Return type

RemovableHandle

register_full_backward_hook(hook, prepend=False)

Registers a backward hook on the module.

The hook will be called every time the gradients with respect to a module are computed, i.e. the hook will execute if and only if the gradients with respect to module outputs are computed. The hook should have the following signature:

hook(module, grad_input, grad_output) -> tuple(Tensor) or None

The grad_input and grad_output are tuples that contain the gradients with respect to the inputs and outputs respectively. The hook should not modify its arguments, but it can optionally return a new gradient with respect to the input that will be used in place of grad_input in subsequent computations. grad_input will only correspond to the inputs given as positional arguments and all kwarg arguments are ignored. Entries in grad_input and grad_output will be None for all non-Tensor arguments.

For technical reasons, when this hook is applied to a Module, its forward function will receive a view of each Tensor passed to the Module. Similarly the caller will receive a view of each Tensor returned by the Module’s forward function.

Warning

Modifying inputs or outputs inplace is not allowed when using backward hooks and will raise an error.

Args:

hook (Callable): The user-defined hook to be registered.

prepend (bool): If true, the provided hook will be fired before all existing backward hooks on this torch.nn.modules.Module. Otherwise, the provided hook will be fired after all existing backward hooks on this torch.nn.modules.Module. Note that global backward hooks registered with register_module_full_backward_hook() will fire before all hooks registered by this method.

Returns:
torch.utils.hooks.RemovableHandle:

a handle that can be used to remove the added hook by calling handle.remove()

Return type

RemovableHandle

register_full_backward_pre_hook(hook, prepend=False)

Registers a backward pre-hook on the module.

The hook will be called every time the gradients for the module are computed. The hook should have the following signature:

hook(module, grad_output) -> tuple[Tensor] or None

The grad_output is a tuple. The hook should not modify its arguments, but it can optionally return a new gradient with respect to the output that will be used in place of grad_output in subsequent computations. Entries in grad_output will be None for all non-Tensor arguments.

For technical reasons, when this hook is applied to a Module, its forward function will receive a view of each Tensor passed to the Module. Similarly the caller will receive a view of each Tensor returned by the Module’s forward function.

Warning

Modifying inputs inplace is not allowed when using backward hooks and will raise an error.

Args:

hook (Callable): The user-defined hook to be registered.

prepend (bool): If true, the provided hook will be fired before all existing backward_pre hooks on this torch.nn.modules.Module. Otherwise, the provided hook will be fired after all existing backward_pre hooks on this torch.nn.modules.Module. Note that global backward_pre hooks registered with register_module_full_backward_pre_hook() will fire before all hooks registered by this method.

Returns:
torch.utils.hooks.RemovableHandle:

a handle that can be used to remove the added hook by calling handle.remove()

Return type

RemovableHandle

register_load_state_dict_post_hook(hook)

Registers a post hook to be run after module’s load_state_dict is called.

It should have the following signature:

hook(module, incompatible_keys) -> None

The module argument is the current module that this hook is registered on, and the incompatible_keys argument is a NamedTuple consisting of attributes missing_keys and unexpected_keys. missing_keys is a list of str containing the missing keys and unexpected_keys is a list of str containing the unexpected keys.

The given incompatible_keys can be modified inplace if needed.

Note that the checks performed when calling load_state_dict() with strict=True are affected by modifications the hook makes to missing_keys or unexpected_keys, as expected. Additions to either set of keys will result in an error being thrown when strict=True, and clearing out both missing and unexpected keys will avoid an error.

Returns:
torch.utils.hooks.RemovableHandle:

a handle that can be used to remove the added hook by calling handle.remove()
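
For example, a post hook can strip keys you expect to be missing so that strict=True loading still succeeds (a hedged sketch; the 'aux.' prefix is made up):

>>> from torch import nn
>>> def drop_expected_missing(module, incompatible_keys):
...     incompatible_keys.missing_keys[:] = [
...         k for k in incompatible_keys.missing_keys if not k.startswith('aux.')
...     ]
>>> net = nn.Linear(2, 2)
>>> handle = net.register_load_state_dict_post_hook(drop_expected_missing)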

register_module(name, module)

Alias for add_module().

Return type

None

register_parameter(name, param)

Adds a parameter to the module.

The parameter can be accessed as an attribute using given name.

Args:

name (str): name of the parameter. The parameter can be accessed from this module using the given name.

param (Parameter or None): parameter to be added to the module. If None, then operations that run on parameters, such as cuda, are ignored, and the parameter is not included in the module’s state_dict.

Return type

None
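
A short sketch of registering a parameter inside a custom module (Scale is an illustrative name; this is equivalent to assigning an nn.Parameter attribute directly):

>>> import torch
>>> from torch import nn
>>> class Scale(nn.Module):
...     def __init__(self):
...         super().__init__()
...         self.register_parameter('weight', nn.Parameter(torch.ones(1)))
...     def forward(self, x):
...         return self.weight * x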

register_state_dict_pre_hook(hook)

These hooks will be called with arguments: self, prefix, and keep_vars before calling state_dict on self. The registered hooks can be used to perform pre-processing before the state_dict call is made.
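
A minimal sketch (the hook body is illustrative):

>>> from torch import nn
>>> def announce(module, prefix, keep_vars):
...     print(f"state_dict requested (prefix={prefix!r})")
>>> net = nn.Linear(2, 2)
>>> handle = net.register_state_dict_pre_hook(announce)
>>> _ = net.state_dict()
state_dict requested (prefix='')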

requires_grad_(requires_grad=True)

Change if autograd should record operations on parameters in this module.

This method sets the parameters’ requires_grad attributes in-place.

This method is helpful for freezing part of the module for finetuning or training parts of a model individually (e.g., GAN training).

See locally-disable-grad-doc for a comparison between .requires_grad_() and several similar mechanisms that may be confused with it.

Args:

requires_grad (bool): whether autograd should record operations on parameters in this module. Default: True.

Returns:

Module: self

Return type

~T
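
For example, to freeze a submodule before finetuning:

>>> from torch import nn
>>> frozen = nn.Linear(4, 2).requires_grad_(False)
>>> any(p.requires_grad for p in frozen.parameters())
False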

set_extra_state(state)

This function is called from load_state_dict() to handle any extra state found within the state_dict. Implement this function and a corresponding get_extra_state() for your module if you need to store extra state within its state_dict.

Args:

state (dict): Extra state from the state_dict
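
A paired get_extra_state()/set_extra_state() sketch (the stored dict is illustrative):

>>> from torch import nn
>>> class Versioned(nn.Module):
...     def get_extra_state(self):
...         return {'version': 1}
...     def set_extra_state(self, state):
...         self.version = state['version']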

share_memory()

See torch.Tensor.share_memory_()

Return type

~T

state_dict(*args, destination=None, prefix='', keep_vars=False)

Returns a dictionary containing references to the whole state of the module.

Both parameters and persistent buffers (e.g. running averages) are included. Keys are corresponding parameter and buffer names. Parameters and buffers set to None are not included.

Note

The returned object is a shallow copy. It contains references to the module’s parameters and buffers.

Warning

Currently state_dict() also accepts positional arguments for destination, prefix and keep_vars in order. However, this is being deprecated and keyword arguments will be enforced in future releases.

Warning

Please avoid the use of argument destination as it is not designed for end-users.

Args:

destination (dict, optional): If provided, the state of module will be updated into the dict and the same object is returned. Otherwise, an OrderedDict will be created and returned. Default: None.

prefix (str, optional): a prefix added to parameter and buffer names to compose the keys in state_dict. Default: ''.

keep_vars (bool, optional): by default the Tensors returned in the state dict are detached from autograd. If it’s set to True, detaching will not be performed. Default: False.

Returns:
dict:

a dictionary containing a whole state of the module

Example:

>>> # xdoctest: +SKIP("undefined vars")
>>> module.state_dict().keys()
['bias', 'weight']
to(*args, **kwargs)

Moves and/or casts the parameters and buffers.

This can be called as

to(device=None, dtype=None, non_blocking=False)
to(dtype, non_blocking=False)
to(tensor, non_blocking=False)
to(memory_format=torch.channels_last)

Its signature is similar to torch.Tensor.to(), but only accepts floating point or complex dtypes. In addition, this method will only cast the floating point or complex parameters and buffers to dtype (if given). The integral parameters and buffers will be moved to device, if that is given, but with dtypes unchanged. When non_blocking is set, it tries to convert/move asynchronously with respect to the host if possible, e.g., moving CPU Tensors with pinned memory to CUDA devices.

See below for examples.

Note

This method modifies the module in-place.

Args:

device (torch.device): the desired device of the parameters and buffers in this module

dtype (torch.dtype): the desired floating point or complex dtype of the parameters and buffers in this module

tensor (torch.Tensor): Tensor whose dtype and device are the desired dtype and device for all parameters and buffers in this module

memory_format (torch.memory_format): the desired memory format for 4D parameters and buffers in this module (keyword only argument)

Returns:

Module: self

Examples:

>>> # xdoctest: +IGNORE_WANT("non-deterministic")
>>> linear = nn.Linear(2, 2)
>>> linear.weight
Parameter containing:
tensor([[ 0.1913, -0.3420],
        [-0.5113, -0.2325]])
>>> linear.to(torch.double)
Linear(in_features=2, out_features=2, bias=True)
>>> linear.weight
Parameter containing:
tensor([[ 0.1913, -0.3420],
        [-0.5113, -0.2325]], dtype=torch.float64)
>>> # xdoctest: +REQUIRES(env:TORCH_DOCTEST_CUDA1)
>>> gpu1 = torch.device("cuda:1")
>>> linear.to(gpu1, dtype=torch.half, non_blocking=True)
Linear(in_features=2, out_features=2, bias=True)
>>> linear.weight
Parameter containing:
tensor([[ 0.1914, -0.3420],
        [-0.5112, -0.2324]], dtype=torch.float16, device='cuda:1')
>>> cpu = torch.device("cpu")
>>> linear.to(cpu)
Linear(in_features=2, out_features=2, bias=True)
>>> linear.weight
Parameter containing:
tensor([[ 0.1914, -0.3420],
        [-0.5112, -0.2324]], dtype=torch.float16)

>>> linear = nn.Linear(2, 2, bias=None).to(torch.cdouble)
>>> linear.weight
Parameter containing:
tensor([[ 0.3741+0.j,  0.2382+0.j],
        [ 0.5593+0.j, -0.4443+0.j]], dtype=torch.complex128)
>>> linear(torch.ones(3, 2, dtype=torch.cdouble))
tensor([[0.6122+0.j, 0.1150+0.j],
        [0.6122+0.j, 0.1150+0.j],
        [0.6122+0.j, 0.1150+0.j]], dtype=torch.complex128)
to_empty(*, device, recurse=True)

Moves the parameters and buffers to the specified device without copying storage.

Args:

device (torch.device): The desired device of the parameters and buffers in this module.

recurse (bool): Whether parameters and buffers of submodules should be recursively moved to the specified device.

Returns:

Module: self

Return type

~T
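
This is typically paired with meta-device initialization, where parameters are created without storage and materialized later (a hedged sketch assuming PyTorch 2.x device context managers):

>>> import torch
>>> from torch import nn
>>> with torch.device('meta'):
...     big = nn.Linear(2, 2)
>>> big = big.to_empty(device='cpu')  # allocates uninitialized CPU storage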

train(mode=True)

Sets the module in training mode.

This has an effect only on certain modules. See the documentation of particular modules for details of their behavior in training/evaluation mode, if they are affected, e.g. Dropout, BatchNorm, etc.

Args:

mode (bool): whether to set training mode (True) or evaluation mode (False). Default: True.

Returns:

Module: self

Return type

~T

type(dst_type)

Casts all parameters and buffers to dst_type.

Note

This method modifies the module in-place.

Args:

dst_type (type or string): the desired type

Returns:

Module: self

Return type

~T

xpu(device=None)

Moves all model parameters and buffers to the XPU.

This also makes associated parameters and buffers different objects. So it should be called before constructing optimizer if the module will live on XPU while being optimized.

Note

This method modifies the module in-place.

Arguments:

device (int, optional): if specified, all parameters will be copied to that device

Returns:

Module: self

Return type

~T

zero_grad(set_to_none=True)

Resets gradients of all model parameters. See similar function under torch.optim.Optimizer for more context.

Args:

set_to_none (bool): instead of setting to zero, set the grads to None. See torch.optim.Optimizer.zero_grad() for details.

Return type

None

training: bool
class lhotse.features.kaldi.layers.Wav2Spec(sampling_rate=16000, frame_length=0.025, frame_shift=0.01, round_to_power_of_two=True, remove_dc_offset=True, preemph_coeff=0.97, window_type='povey', dither=0.0, snip_edges=False, energy_floor=1e-10, raw_energy=True, use_energy=True, use_fft_mag=False)[source]

Apply standard Kaldi preprocessing (dithering, removing DC offset, pre-emphasis, etc.) on the input waveforms and compute their Short-Time Fourier Transform (STFT). The STFT is transformed either to a magnitude spectrum (use_fft_mag=True) or a power spectrum (use_fft_mag=False).

Example:

>>> x = torch.randn(1, 16000, dtype=torch.float32)
>>> x.shape
torch.Size([1, 16000])
>>> t = Wav2Spec()
>>> t(x).shape
torch.Size([1, 100, 257])

The input is a tensor of shape (batch_size, num_samples). The output is a tensor of shape (batch_size, num_frames, num_fft_bins).

__init__(sampling_rate=16000, frame_length=0.025, frame_shift=0.01, round_to_power_of_two=True, remove_dc_offset=True, preemph_coeff=0.97, window_type='povey', dither=0.0, snip_edges=False, energy_floor=1e-10, raw_energy=True, use_energy=True, use_fft_mag=False)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

T_destination

alias of TypeVar(‘T_destination’, bound=Dict[str, Any])

add_module(name, module)

Adds a child module to the current module.

The module can be accessed as an attribute using the given name.

Args:

name (str): name of the child module. The child module can be accessed from this module using the given name.

module (Module): child module to be added to the module.

Return type

None
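
A short sketch (Net is an illustrative name); this has the same effect as assigning the submodule to an attribute:

>>> from torch import nn
>>> class Net(nn.Module):
...     def __init__(self):
...         super().__init__()
...         self.add_module('proj', nn.Linear(2, 2))  # same as self.proj = nn.Linear(2, 2)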

apply(fn)

Applies fn recursively to every submodule (as returned by .children()) as well as self. Typical use includes initializing the parameters of a model (see also nn-init-doc).

Args:

fn (Module -> None): function to be applied to each submodule

Returns:

Module: self

Example:

>>> @torch.no_grad()
>>> def init_weights(m):
>>>     print(m)
>>>     if type(m) == nn.Linear:
>>>         m.weight.fill_(1.0)
>>>         print(m.weight)
>>> net = nn.Sequential(nn.Linear(2, 2), nn.Linear(2, 2))
>>> net.apply(init_weights)
Linear(in_features=2, out_features=2, bias=True)
Parameter containing:
tensor([[1., 1.],
        [1., 1.]], requires_grad=True)
Linear(in_features=2, out_features=2, bias=True)
Parameter containing:
tensor([[1., 1.],
        [1., 1.]], requires_grad=True)
Sequential(
  (0): Linear(in_features=2, out_features=2, bias=True)
  (1): Linear(in_features=2, out_features=2, bias=True)
)
Return type

~T

bfloat16()

Casts all floating point parameters and buffers to bfloat16 datatype.

Note

This method modifies the module in-place.

Returns:

Module: self

Return type

~T

buffers(recurse=True)

Returns an iterator over module buffers.

Args:

recurse (bool): if True, then yields buffers of this module and all submodules. Otherwise, yields only buffers that are direct members of this module.

Yields:

torch.Tensor: module buffer

Example:

>>> # xdoctest: +SKIP("undefined vars")
>>> for buf in model.buffers():
>>>     print(type(buf), buf.size())
<class 'torch.Tensor'> (20L,)
<class 'torch.Tensor'> (20L, 1L, 5L, 5L)
Return type

Iterator[Tensor]

call_super_init: bool = False
children()

Returns an iterator over immediate children modules.

Yields:

Module: a child module

Return type

Iterator[Module]

compile(*args, **kwargs)

Compile this Module’s forward using torch.compile().

This Module’s __call__ method is compiled and all arguments are passed as-is to torch.compile().

See torch.compile() for details on the arguments for this function.
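
For example (a hedged sketch; assumes a PyTorch version with torch.compile support):

>>> t = Wav2Spec()
>>> t.compile(mode='reduce-overhead')  # later t(x) calls run the compiled forward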

cpu()

Moves all model parameters and buffers to the CPU.

Note

This method modifies the module in-place.

Returns:

Module: self

Return type

~T

cuda(device=None)

Moves all model parameters and buffers to the GPU.

This also makes associated parameters and buffers different objects. So it should be called before constructing optimizer if the module will live on GPU while being optimized.

Note

This method modifies the module in-place.

Args:

device (int, optional): if specified, all parameters will be copied to that device

Returns:

Module: self

Return type

~T

property dither: float
Return type

float

double()

Casts all floating point parameters and buffers to double datatype.

Note

This method modifies the module in-place.

Returns:

Module: self

Return type

~T

dump_patches: bool = False
eval()

Sets the module in evaluation mode.

This has an effect only on certain modules. See the documentation of particular modules for details of their behavior in training/evaluation mode, if they are affected, e.g. Dropout, BatchNorm, etc.

This is equivalent to self.train(False).

See locally-disable-grad-doc for a comparison between .eval() and several similar mechanisms that may be confused with it.

Returns:

Module: self

Return type

~T

extra_repr()

Sets the extra representation of the module.

To print customized extra information, you should re-implement this method in your own modules. Both single-line and multi-line strings are acceptable.

Return type

str

float()

Casts all floating point parameters and buffers to float datatype.

Note

This method modifies the module in-place.

Returns:

Module: self

Return type

~T

forward(x)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

Return type

Tensor
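
In practice, call the module instance rather than forward() directly, so that registered hooks run:

>>> import torch
>>> t = Wav2Spec()
>>> feats = t(torch.randn(1, 16000))          # preferred: runs hooks
>>> feats = t.forward(torch.randn(1, 16000))  # works, but silently skips hooks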

property frame_length: float
Return type

float

property frame_shift: float
Return type

float

get_buffer(target)

Returns the buffer given by target if it exists, otherwise throws an error.

See the docstring for get_submodule for a more detailed explanation of this method’s functionality as well as how to correctly specify target.

Args:

target: The fully-qualified string name of the buffer to look for. (See get_submodule for how to specify a fully-qualified string.)

Returns:

torch.Tensor: The buffer referenced by target

Raises:

AttributeError: If the target string references an invalid path or resolves to something that is not a buffer

Return type

Tensor

get_extra_state()

Returns any extra state to include in the module’s state_dict. Implement this and a corresponding set_extra_state() for your module if you need to store extra state. This function is called when building the module’s state_dict().

Note that extra state should be picklable to ensure working serialization of the state_dict. We only provide backwards compatibility guarantees for serializing Tensors; other objects may break backwards compatibility if their serialized pickled form changes.

Returns:

object: Any extra state to store in the module’s state_dict

Return type

Any

get_parameter(target)

Returns the parameter given by target if it exists, otherwise throws an error.

See the docstring for get_submodule for a more detailed explanation of this method’s functionality as well as how to correctly specify target.

Args:

target: The fully-qualified string name of the Parameter to look for. (See get_submodule for how to specify a fully-qualified string.)

Returns:

torch.nn.Parameter: The Parameter referenced by target

Raises:

AttributeError: If the target string references an invalid path or resolves to something that is not an nn.Parameter

Return type

Parameter

get_submodule(target)

Returns the submodule given by target if it exists, otherwise throws an error.

For example, let’s say you have an nn.Module A that looks like this:

A(
    (net_b): Module(
        (net_c): Module(
            (conv): Conv2d(16, 33, kernel_size=(3, 3), stride=(2, 2))
        )
        (linear): Linear(in_features=100, out_features=200, bias=True)
    )
)

(The diagram shows an nn.Module A. A has a nested submodule net_b, which itself has two submodules net_c and linear. net_c then has a submodule conv.)

To check whether or not we have the linear submodule, we would call get_submodule("net_b.linear"). To check whether we have the conv submodule, we would call get_submodule("net_b.net_c.conv").

The runtime of get_submodule is bounded by the degree of module nesting in target. A query against named_modules achieves the same result, but it is O(N) in the number of transitive modules. So, for a simple check to see if some submodule exists, get_submodule should always be used.

Args:

target: The fully-qualified string name of the submodule to look for. (See above example for how to specify a fully-qualified string.)

Returns:

torch.nn.Module: The submodule referenced by target

Raises:

AttributeError: If the target string references an invalid path or resolves to something that is not an nn.Module

Return type

Module

half()

Casts all floating point parameters and buffers to half datatype.

Note

This method modifies the module in-place.

Returns:

Module: self

Return type

~T

ipu(device=None)

Moves all model parameters and buffers to the IPU.

This also makes associated parameters and buffers different objects. So it should be called before constructing optimizer if the module will live on IPU while being optimized.

Note

This method modifies the module in-place.

Arguments:

device (int, optional): if specified, all parameters will be copied to that device

Returns:

Module: self

Return type

~T

load_state_dict(state_dict, strict=True, assign=False)

Copies parameters and buffers from state_dict into this module and its descendants. If strict is True, then the keys of state_dict must exactly match the keys returned by this module’s state_dict() function.

Warning

If assign is True the optimizer must be created after the call to load_state_dict.

Args:

state_dict (dict): a dict containing parameters and persistent buffers.

strict (bool, optional): whether to strictly enforce that the keys in state_dict match the keys returned by this module’s state_dict() function. Default: True

assign (bool, optional): whether to assign items in the state dictionary to their corresponding keys in the module instead of copying them inplace into the module’s current parameters and buffers. When False, the properties of the tensors in the current module are preserved, while when True, the properties of the Tensors in the state dict are preserved. Default: False

Returns:
NamedTuple with missing_keys and unexpected_keys fields:
  • missing_keys is a list of str containing the missing keys

  • unexpected_keys is a list of str containing the unexpected keys

Note:

If a parameter or buffer is registered as None and its corresponding key exists in state_dict, load_state_dict() will raise a RuntimeError.
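
A round-trip sketch:

>>> t = Wav2Spec()
>>> result = t.load_state_dict(t.state_dict(), strict=True)
>>> result.missing_keys, result.unexpected_keys
([], [])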

modules()

Returns an iterator over all modules in the network.

Yields:

Module: a module in the network

Note:

Duplicate modules are returned only once. In the following example, l will be returned only once.

Example:

>>> l = nn.Linear(2, 2)
>>> net = nn.Sequential(l, l)
>>> for idx, m in enumerate(net.modules()):
...     print(idx, '->', m)

0 -> Sequential(
  (0): Linear(in_features=2, out_features=2, bias=True)
  (1): Linear(in_features=2, out_features=2, bias=True)
)
1 -> Linear(in_features=2, out_features=2, bias=True)
Return type

Iterator[Module]

named_buffers(prefix='', recurse=True, remove_duplicate=True)

Returns an iterator over module buffers, yielding both the name of the buffer as well as the buffer itself.

Args:

prefix (str): prefix to prepend to all buffer names.

recurse (bool, optional): if True, then yields buffers of this module and all submodules. Otherwise, yields only buffers that are direct members of this module. Defaults to True.

remove_duplicate (bool, optional): whether to remove the duplicated buffers in the result. Defaults to True.

Yields:

(str, torch.Tensor): Tuple containing the name and buffer

Example:

>>> # xdoctest: +SKIP("undefined vars")
>>> for name, buf in self.named_buffers():
>>>     if name in ['running_var']:
>>>         print(buf.size())
Return type

Iterator[Tuple[str, Tensor]]

named_children()

Returns an iterator over immediate children modules, yielding both the name of the module as well as the module itself.

Yields:

(str, Module): Tuple containing a name and child module

Example:

>>> # xdoctest: +SKIP("undefined vars")
>>> for name, module in model.named_children():
>>>     if name in ['conv4', 'conv5']:
>>>         print(module)
Return type

Iterator[Tuple[str, Module]]

named_modules(memo=None, prefix='', remove_duplicate=True)

Returns an iterator over all modules in the network, yielding both the name of the module as well as the module itself.

Args:

memo: a memo to store the set of modules already added to the result

prefix: a prefix that will be added to the name of the module

remove_duplicate: whether or not to remove the duplicated module instances in the result

Yields:

(str, Module): Tuple of name and module

Note:

Duplicate modules are returned only once. In the following example, l will be returned only once.

Example:

>>> l = nn.Linear(2, 2)
>>> net = nn.Sequential(l, l)
>>> for idx, m in enumerate(net.named_modules()):
...     print(idx, '->', m)

0 -> ('', Sequential(
  (0): Linear(in_features=2, out_features=2, bias=True)
  (1): Linear(in_features=2, out_features=2, bias=True)
))
1 -> ('0', Linear(in_features=2, out_features=2, bias=True))
named_parameters(prefix='', recurse=True, remove_duplicate=True)

Returns an iterator over module parameters, yielding both the name of the parameter as well as the parameter itself.

Args:

prefix (str): prefix to prepend to all parameter names.

recurse (bool): if True, then yields parameters of this module and all submodules. Otherwise, yields only parameters that are direct members of this module.

remove_duplicate (bool, optional): whether to remove the duplicated parameters in the result. Defaults to True.

Yields:

(str, Parameter): Tuple containing the name and parameter

Example:

>>> # xdoctest: +SKIP("undefined vars")
>>> for name, param in self.named_parameters():
>>>     if name in ['bias']:
>>>         print(param.size())
Return type

Iterator[Tuple[str, Parameter]]

online_inference(x, context=None)
Return type

Tuple[Tensor, Tensor]
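
No docstring is provided upstream; judging from the signature, this appears to be a streaming counterpart of forward() that also returns carried-over context for the next chunk. A hedged sketch (the unpacked names are guesses, not documented API):

>>> import torch
>>> t = Wav2Spec()
>>> feats, context = t.online_inference(torch.randn(1, 1600), context=None)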

parameters(recurse=True)

Returns an iterator over module parameters.

This is typically passed to an optimizer.

Args:

recurse (bool): if True, then yields parameters of this module and all submodules. Otherwise, yields only parameters that are direct members of this module.

Yields:

Parameter: module parameter

Example:

>>> # xdoctest: +SKIP("undefined vars")
>>> for param in model.parameters():
>>>     print(type(param), param.size())
<class 'torch.Tensor'> (20L,)
<class 'torch.Tensor'> (20L, 1L, 5L, 5L)
Return type

Iterator[Parameter]

property preemph_coeff: float
Return type

float

register_backward_hook(hook)

Registers a backward hook on the module.

This function is deprecated in favor of register_full_backward_hook() and the behavior of this function will change in future versions.

Returns:
torch.utils.hooks.RemovableHandle:

a handle that can be used to remove the added hook by calling handle.remove()

Return type

RemovableHandle

register_buffer(name, tensor, persistent=True)

Adds a buffer to the module.

This is typically used to register a buffer that should not be considered a model parameter. For example, BatchNorm’s running_mean is not a parameter, but is part of the module’s state. Buffers, by default, are persistent and will be saved alongside parameters. This behavior can be changed by setting persistent to False. The only difference between a persistent buffer and a non-persistent buffer is that the latter will not be a part of this module’s state_dict.

Buffers can be accessed as attributes using given names.

Args:

name (str): name of the buffer. The buffer can be accessed from this module using the given name.

tensor (Tensor or None): buffer to be registered. If None, then operations that run on buffers, such as cuda, are ignored, and the buffer is not included in the module’s state_dict.

persistent (bool): whether the buffer is part of this module’s state_dict.

Example:

>>> # xdoctest: +SKIP("undefined vars")
>>> self.register_buffer('running_mean', torch.zeros(num_features))
Return type

None

register_forward_hook(hook, *, prepend=False, with_kwargs=False, always_call=False)

Registers a forward hook on the module.

The hook will be called every time after forward() has computed an output.

If with_kwargs is False or not specified, the input contains only the positional arguments given to the module. Keyword arguments won’t be passed to the hooks and only to the forward. The hook can modify the output. It can modify the input inplace but it will not have effect on forward since this is called after forward() is called. The hook should have the following signature:

hook(module, args, output) -> None or modified output

If with_kwargs is True, the forward hook will be passed the kwargs given to the forward function and be expected to return the output possibly modified. The hook should have the following signature:

hook(module, args, kwargs, output) -> None or modified output
Args:

hook (Callable): The user defined hook to be registered.

prepend (bool): If True, the provided hook will be fired before all existing forward hooks on this torch.nn.modules.Module. Otherwise, the provided hook will be fired after all existing forward hooks on this torch.nn.modules.Module. Note that global forward hooks registered with register_module_forward_hook() will fire before all hooks registered by this method. Default: False

with_kwargs (bool): If True, the hook will be passed the kwargs given to the forward function. Default: False

always_call (bool): If True the hook will be run regardless of whether an exception is raised while calling the Module. Default: False

Returns:
torch.utils.hooks.RemovableHandle:

a handle that can be used to remove the added hook by calling handle.remove()

Return type

RemovableHandle
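
A minimal sketch that inspects the output shape on every call (names are illustrative):

>>> import torch
>>> def show_shape(module, args, output):
...     print(output.shape)
>>> t = Wav2Spec()
>>> handle = t.register_forward_hook(show_shape)
>>> _ = t(torch.randn(1, 16000))
torch.Size([1, 100, 257])
>>> handle.remove()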

register_forward_pre_hook(hook, *, prepend=False, with_kwargs=False)

Registers a forward pre-hook on the module.

The hook will be called every time before forward() is invoked.

If with_kwargs is false or not specified, the input contains only the positional arguments given to the module. Keyword arguments won’t be passed to the hooks and only to the forward. The hook can modify the input. User can either return a tuple or a single modified value in the hook. We will wrap the value into a tuple if a single value is returned (unless that value is already a tuple). The hook should have the following signature:

hook(module, args) -> None or modified input

If with_kwargs is true, the forward pre-hook will be passed the kwargs given to the forward function. And if the hook modifies the input, both the args and kwargs should be returned. The hook should have the following signature:

hook(module, args, kwargs) -> None or a tuple of modified input and kwargs
Args:

hook (Callable): The user defined hook to be registered.

prepend (bool): If true, the provided hook will be fired before all existing forward_pre hooks on this torch.nn.modules.Module. Otherwise, the provided hook will be fired after all existing forward_pre hooks on this torch.nn.modules.Module. Note that global forward_pre hooks registered with register_module_forward_pre_hook() will fire before all hooks registered by this method. Default: False

with_kwargs (bool): If true, the hook will be passed the kwargs given to the forward function. Default: False

Returns:
torch.utils.hooks.RemovableHandle:

a handle that can be used to remove the added hook by calling handle.remove()

Return type

RemovableHandle
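
A minimal pre-hook sketch that peak-normalizes the waveform before feature extraction (illustrative, not part of the API):

>>> import torch
>>> def normalize(module, args):
...     (x,) = args
...     return (x / x.abs().max(),)
>>> t = Wav2Spec()
>>> handle = t.register_forward_pre_hook(normalize)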

register_full_backward_hook(hook, prepend=False)

Registers a backward hook on the module.

The hook will be called every time the gradients with respect to a module are computed, i.e. the hook will execute if and only if the gradients with respect to module outputs are computed. The hook should have the following signature:

hook(module, grad_input, grad_output) -> tuple(Tensor) or None

The grad_input and grad_output are tuples that contain the gradients with respect to the inputs and outputs respectively. The hook should not modify its arguments, but it can optionally return a new gradient with respect to the input that will be used in place of grad_input in subsequent computations. grad_input will only correspond to the inputs given as positional arguments and all kwarg arguments are ignored. Entries in grad_input and grad_output will be None for all non-Tensor arguments.

For technical reasons, when this hook is applied to a Module, its forward function will receive a view of each Tensor passed to the Module. Similarly the caller will receive a view of each Tensor returned by the Module’s forward function.

Warning

Modifying inputs or outputs inplace is not allowed when using backward hooks and will raise an error.

Args:

hook (Callable): The user-defined hook to be registered.

prepend (bool): If true, the provided hook will be fired before all existing backward hooks on this torch.nn.modules.Module. Otherwise, the provided hook will be fired after all existing backward hooks on this torch.nn.modules.Module. Note that global backward hooks registered with register_module_full_backward_hook() will fire before all hooks registered by this method.

Returns:
torch.utils.hooks.RemovableHandle:

a handle that can be used to remove the added hook by calling handle.remove()

Return type

RemovableHandle

register_full_backward_pre_hook(hook, prepend=False)

Registers a backward pre-hook on the module.

The hook will be called every time the gradients for the module are computed. The hook should have the following signature:

hook(module, grad_output) -> tuple[Tensor] or None

The grad_output is a tuple. The hook should not modify its arguments, but it can optionally return a new gradient with respect to the output that will be used in place of grad_output in subsequent computations. Entries in grad_output will be None for all non-Tensor arguments.

For technical reasons, when this hook is applied to a Module, its forward function will receive a view of each Tensor passed to the Module. Similarly the caller will receive a view of each Tensor returned by the Module’s forward function.

Warning

Modifying inputs inplace is not allowed when using backward hooks and will raise an error.

Args:

hook (Callable): The user-defined hook to be registered.

prepend (bool): If true, the provided hook will be fired before all existing backward_pre hooks on this torch.nn.modules.Module. Otherwise, the provided hook will be fired after all existing backward_pre hooks on this torch.nn.modules.Module. Note that global backward_pre hooks registered with register_module_full_backward_pre_hook() will fire before all hooks registered by this method.

Returns:
torch.utils.hooks.RemovableHandle:

a handle that can be used to remove the added hook by calling handle.remove()

Return type

RemovableHandle

register_load_state_dict_post_hook(hook)

Registers a post hook to be run after module’s load_state_dict is called.

It should have the following signature:

hook(module, incompatible_keys) -> None

The module argument is the current module that this hook is registered on, and the incompatible_keys argument is a NamedTuple consisting of attributes missing_keys and unexpected_keys. missing_keys is a list of str containing the missing keys and unexpected_keys is a list of str containing the unexpected keys.

The given incompatible_keys can be modified inplace if needed.

Note that the checks performed when calling load_state_dict() with strict=True are affected by modifications the hook makes to missing_keys or unexpected_keys, as expected. Additions to either set of keys will result in an error being thrown when strict=True, and clearing out both missing and unexpected keys will avoid an error.

Returns:
torch.utils.hooks.RemovableHandle:

a handle that can be used to remove the added hook by calling handle.remove()

register_module(name, module)

Alias for add_module().

Return type

None

register_parameter(name, param)

Adds a parameter to the module.

The parameter can be accessed as an attribute using given name.

Args:

name (str): name of the parameter. The parameter can be accessed from this module using the given name.

param (Parameter or None): parameter to be added to the module. If None, then operations that run on parameters, such as cuda, are ignored, and the parameter is not included in the module’s state_dict.

Return type

None

register_state_dict_pre_hook(hook)

These hooks will be called with arguments: self, prefix, and keep_vars before calling state_dict on self. The registered hooks can be used to perform pre-processing before the state_dict call is made.

property remove_dc_offset: bool
Return type

bool

requires_grad_(requires_grad=True)

Change if autograd should record operations on parameters in this module.

This method sets the parameters’ requires_grad attributes in-place.

This method is helpful for freezing part of the module for finetuning or training parts of a model individually (e.g., GAN training).

See locally-disable-grad-doc for a comparison between .requires_grad_() and several similar mechanisms that may be confused with it.

Args:

requires_grad (bool): whether autograd should record operations on parameters in this module. Default: True.

Returns:

Module: self

Return type

~T

property sampling_rate: int
Return type

int

set_extra_state(state)

This function is called from load_state_dict() to handle any extra state found within the state_dict. Implement this function and a corresponding get_extra_state() for your module if you need to store extra state within its state_dict.

Args:

state (dict): Extra state from the state_dict

share_memory()

See torch.Tensor.share_memory_()

Return type

~T

state_dict(*args, destination=None, prefix='', keep_vars=False)

Returns a dictionary containing references to the whole state of the module.

Both parameters and persistent buffers (e.g. running averages) are included. Keys are corresponding parameter and buffer names. Parameters and buffers set to None are not included.

Note

The returned object is a shallow copy. It contains references to the module’s parameters and buffers.

Warning

Currently state_dict() also accepts positional arguments for destination, prefix and keep_vars in order. However, this is being deprecated and keyword arguments will be enforced in future releases.

Warning

Please avoid the use of argument destination as it is not designed for end-users.

Args:

destination (dict, optional): If provided, the state of module will be updated into the dict and the same object is returned. Otherwise, an OrderedDict will be created and returned. Default: None.

prefix (str, optional): a prefix added to parameter and buffer names to compose the keys in state_dict. Default: ''.

keep_vars (bool, optional): by default the Tensors returned in the state dict are detached from autograd. If it’s set to True, detaching will not be performed. Default: False.

Returns:
dict:

a dictionary containing a whole state of the module

Example:

>>> # xdoctest: +SKIP("undefined vars")
>>> module.state_dict().keys()
['bias', 'weight']
to(*args, **kwargs)

Moves and/or casts the parameters and buffers.

This can be called as

to(device=None, dtype=None, non_blocking=False)
to(dtype, non_blocking=False)
to(tensor, non_blocking=False)
to(memory_format=torch.channels_last)

Its signature is similar to torch.Tensor.to(), but only accepts floating point or complex dtypes. In addition, this method will only cast the floating point or complex parameters and buffers to dtype (if given). The integral parameters and buffers will be moved to device, if that is given, but with dtypes unchanged. When non_blocking is set, it tries to convert/move asynchronously with respect to the host if possible, e.g., moving CPU Tensors with pinned memory to CUDA devices.

See below for examples.

Note

This method modifies the module in-place.

Args:

device (torch.device): the desired device of the parameters and buffers in this module

dtype (torch.dtype): the desired floating point or complex dtype of the parameters and buffers in this module

tensor (torch.Tensor): Tensor whose dtype and device are the desired dtype and device for all parameters and buffers in this module

memory_format (torch.memory_format): the desired memory format for 4D parameters and buffers in this module (keyword only argument)

Returns:

Module: self

Examples:

>>> # xdoctest: +IGNORE_WANT("non-deterministic")
>>> linear = nn.Linear(2, 2)
>>> linear.weight
Parameter containing:
tensor([[ 0.1913, -0.3420],
        [-0.5113, -0.2325]])
>>> linear.to(torch.double)
Linear(in_features=2, out_features=2, bias=True)
>>> linear.weight
Parameter containing:
tensor([[ 0.1913, -0.3420],
        [-0.5113, -0.2325]], dtype=torch.float64)
>>> # xdoctest: +REQUIRES(env:TORCH_DOCTEST_CUDA1)
>>> gpu1 = torch.device("cuda:1")
>>> linear.to(gpu1, dtype=torch.half, non_blocking=True)
Linear(in_features=2, out_features=2, bias=True)
>>> linear.weight
Parameter containing:
tensor([[ 0.1914, -0.3420],
        [-0.5112, -0.2324]], dtype=torch.float16, device='cuda:1')
>>> cpu = torch.device("cpu")
>>> linear.to(cpu)
Linear(in_features=2, out_features=2, bias=True)
>>> linear.weight
Parameter containing:
tensor([[ 0.1914, -0.3420],
        [-0.5112, -0.2324]], dtype=torch.float16)

>>> linear = nn.Linear(2, 2, bias=None).to(torch.cdouble)
>>> linear.weight
Parameter containing:
tensor([[ 0.3741+0.j,  0.2382+0.j],
        [ 0.5593+0.j, -0.4443+0.j]], dtype=torch.complex128)
>>> linear(torch.ones(3, 2, dtype=torch.cdouble))
tensor([[0.6122+0.j, 0.1150+0.j],
        [0.6122+0.j, 0.1150+0.j],
        [0.6122+0.j, 0.1150+0.j]], dtype=torch.complex128)
to_empty(*, device, recurse=True)

Moves the parameters and buffers to the specified device without copying storage.

Args:

device (torch.device): The desired device of the parameters and buffers in this module.

recurse (bool): Whether parameters and buffers of submodules should be recursively moved to the specified device.

Returns:

Module: self

Return type

~T

train(mode=True)

Sets the module in training mode.

This has an effect only on certain modules. See the documentation of particular modules for details of their behavior in training/evaluation mode, if they are affected, e.g. Dropout, BatchNorm, etc.

Args:

mode (bool): whether to set training mode (True) or evaluation mode (False). Default: True.

Returns:

Module: self

Return type

~T

type(dst_type)

Casts all parameters and buffers to dst_type.

Note

This method modifies the module in-place.

Args:

dst_type (type or string): the desired type

Returns:

Module: self

Return type

~T

property window_type: str
Return type

str

xpu(device=None)

Moves all model parameters and buffers to the XPU.

This also makes associated parameters and buffers different objects. So it should be called before constructing optimizer if the module will live on XPU while being optimized.

Note

This method modifies the module in-place.

Arguments:

device (int, optional): if specified, all parameters will be copied to that device

Returns:

Module: self

Return type

~T

zero_grad(set_to_none=True)

Resets gradients of all model parameters. See similar function under torch.optim.Optimizer for more context.

Args:

set_to_none (bool): instead of setting to zero, set the grads to None. See torch.optim.Optimizer.zero_grad() for details.

Return type

None

training: bool
class lhotse.features.kaldi.layers.Wav2LogSpec(sampling_rate=16000, frame_length=0.025, frame_shift=0.01, round_to_power_of_two=True, remove_dc_offset=True, preemph_coeff=0.97, window_type='povey', dither=0.0, snip_edges=False, energy_floor=1e-10, raw_energy=True, use_energy=True, use_fft_mag=False)[source]

Apply standard Kaldi preprocessing (dithering, removing DC offset, pre-emphasis, etc.) on the input waveforms and compute their Short-Time Fourier Transform (STFT). The STFT is transformed either to a log-magnitude spectrum (use_fft_mag=True) or a log-power spectrum (use_fft_mag=False).

Example:

>>> x = torch.randn(1, 16000, dtype=torch.float32)
>>> x.shape
torch.Size([1, 16000])
>>> t = Wav2LogSpec()
>>> t(x).shape
torch.Size([1, 100, 257])

The input is a tensor of shape (batch_size, num_samples). The output is a tensor of shape (batch_size, num_frames, num_fft_bins).

__init__(sampling_rate=16000, frame_length=0.025, frame_shift=0.01, round_to_power_of_two=True, remove_dc_offset=True, preemph_coeff=0.97, window_type='povey', dither=0.0, snip_edges=False, energy_floor=1e-10, raw_energy=True, use_energy=True, use_fft_mag=False)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

T_destination

alias of TypeVar(‘T_destination’, bound=Dict[str, Any])

add_module(name, module)

Adds a child module to the current module.

The module can be accessed as an attribute using the given name.

Args:

name (str): name of the child module. The child module can be accessed from this module using the given name.

module (Module): child module to be added to the module.

Return type

None

apply(fn)

Applies fn recursively to every submodule (as returned by .children()) as well as self. Typical use includes initializing the parameters of a model (see also nn-init-doc).

Args:

fn (Module -> None): function to be applied to each submodule

Returns:

Module: self

Example:

>>> @torch.no_grad()
>>> def init_weights(m):
>>>     print(m)
>>>     if type(m) == nn.Linear:
>>>         m.weight.fill_(1.0)
>>>         print(m.weight)
>>> net = nn.Sequential(nn.Linear(2, 2), nn.Linear(2, 2))
>>> net.apply(init_weights)
Linear(in_features=2, out_features=2, bias=True)
Parameter containing:
tensor([[1., 1.],
        [1., 1.]], requires_grad=True)
Linear(in_features=2, out_features=2, bias=True)
Parameter containing:
tensor([[1., 1.],
        [1., 1.]], requires_grad=True)
Sequential(
  (0): Linear(in_features=2, out_features=2, bias=True)
  (1): Linear(in_features=2, out_features=2, bias=True)
)
Return type

~T

bfloat16()

Casts all floating point parameters and buffers to bfloat16 datatype.

Note

This method modifies the module in-place.

Returns:

Module: self

Return type

~T

buffers(recurse=True)

Returns an iterator over module buffers.

Args:

recurse (bool): if True, then yields buffers of this module and all submodules. Otherwise, yields only buffers that are direct members of this module.

Yields:

torch.Tensor: module buffer

Example:

>>> # xdoctest: +SKIP("undefined vars")
>>> for buf in model.buffers():
>>>     print(type(buf), buf.size())
<class 'torch.Tensor'> (20L,)
<class 'torch.Tensor'> (20L, 1L, 5L, 5L)
Return type

Iterator[Tensor]

call_super_init: bool = False
children()

Returns an iterator over immediate children modules.

Yields:

Module: a child module

Return type

Iterator[Module]

compile(*args, **kwargs)

Compile this Module’s forward using torch.compile().

This Module’s __call__ method is compiled and all arguments are passed as-is to torch.compile().

See torch.compile() for details on the arguments for this function.

cpu()

Moves all model parameters and buffers to the CPU.

Note

This method modifies the module in-place.

Returns:

Module: self

Return type

~T

cuda(device=None)

Moves all model parameters and buffers to the GPU.

This also makes associated parameters and buffers different objects. So it should be called before constructing optimizer if the module will live on GPU while being optimized.

Note

This method modifies the module in-place.

Args:

device (int, optional): if specified, all parameters will be copied to that device

Returns:

Module: self

Return type

~T

property dither: float
Return type

float

double()

Casts all floating point parameters and buffers to double datatype.

Note

This method modifies the module in-place.

Returns:

Module: self

Return type

~T

dump_patches: bool = False
eval()

Sets the module in evaluation mode.

This has an effect only on certain modules. See the documentation of particular modules for details of their behavior in training/evaluation mode, if they are affected, e.g. Dropout, BatchNorm, etc.

This is equivalent to self.train(False).

See locally-disable-grad-doc for a comparison between .eval() and several similar mechanisms that may be confused with it.

Returns:

Module: self

Return type

~T

extra_repr()

Sets the extra representation of the module.

To print customized extra information, you should re-implement this method in your own modules. Both single-line and multi-line strings are acceptable.

Return type

str

float()

Casts all floating point parameters and buffers to float datatype.

Note

This method modifies the module in-place.

Returns:

Module: self

Return type

~T

forward(x)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

Return type

Tensor

property frame_length: float
Return type

float

property frame_shift: float
Return type

float

get_buffer(target)

Returns the buffer given by target if it exists, otherwise throws an error.

See the docstring for get_submodule for a more detailed explanation of this method’s functionality as well as how to correctly specify target.

Args:

target: The fully-qualified string name of the buffer to look for. (See get_submodule for how to specify a fully-qualified string.)

Returns:

torch.Tensor: The buffer referenced by target

Raises:

AttributeError: If the target string references an invalid path or resolves to something that is not a buffer

Return type

Tensor

get_extra_state()

Returns any extra state to include in the module’s state_dict. Implement this and a corresponding set_extra_state() for your module if you need to store extra state. This function is called when building the module’s state_dict().

Note that extra state should be picklable to ensure working serialization of the state_dict. We only provide backwards compatibility guarantees for serializing Tensors; other objects may break backwards compatibility if their serialized pickled form changes.

Returns:

object: Any extra state to store in the module’s state_dict

Return type

Any

get_parameter(target)

Returns the parameter given by target if it exists, otherwise throws an error.

See the docstring for get_submodule for a more detailed explanation of this method’s functionality as well as how to correctly specify target.

Args:

target: The fully-qualified string name of the Parameter to look for. (See get_submodule for how to specify a fully-qualified string.)

Returns:

torch.nn.Parameter: The Parameter referenced by target

Raises:

AttributeError: If the target string references an invalid path or resolves to something that is not an nn.Parameter

Return type

Parameter

get_submodule(target)

Returns the submodule given by target if it exists, otherwise throws an error.

For example, let’s say you have an nn.Module A that looks like this:

A(
    (net_b): Module(
        (net_c): Module(
            (conv): Conv2d(16, 33, kernel_size=(3, 3), stride=(2, 2))
        )
        (linear): Linear(in_features=100, out_features=200, bias=True)
    )
)

(The diagram shows an nn.Module A. A has a nested submodule net_b, which itself has two submodules net_c and linear. net_c then has a submodule conv.)

To check whether or not we have the linear submodule, we would call get_submodule("net_b.linear"). To check whether we have the conv submodule, we would call get_submodule("net_b.net_c.conv").

The runtime of get_submodule is bounded by the degree of module nesting in target. A query against named_modules achieves the same result, but it is O(N) in the number of transitive modules. So, for a simple check to see if some submodule exists, get_submodule should always be used.

Args:

target: The fully-qualified string name of the submodule to look for. (See above example for how to specify a fully-qualified string.)

Returns:

torch.nn.Module: The submodule referenced by target

Raises:

AttributeError: If the target string references an invalid path or resolves to something that is not an nn.Module

Return type

Module

half()

Casts all floating point parameters and buffers to half datatype.

Note

This method modifies the module in-place.

Returns:

Module: self

Return type

~T

ipu(device=None)

Moves all model parameters and buffers to the IPU.

This also makes associated parameters and buffers different objects. So it should be called before constructing optimizer if the module will live on IPU while being optimized.

Note

This method modifies the module in-place.

Arguments:

device (int, optional): if specified, all parameters will be copied to that device

Returns:

Module: self

Return type

~T

load_state_dict(state_dict, strict=True, assign=False)

Copies parameters and buffers from state_dict into this module and its descendants. If strict is True, then the keys of state_dict must exactly match the keys returned by this module’s state_dict() function.

Warning

If assign is True the optimizer must be created after the call to load_state_dict.

Args:

state_dict (dict): a dict containing parameters and persistent buffers.

strict (bool, optional): whether to strictly enforce that the keys in state_dict match the keys returned by this module’s state_dict() function. Default: True

assign (bool, optional): whether to assign items in the state dictionary to their corresponding keys in the module instead of copying them inplace into the module’s current parameters and buffers. When False, the properties of the tensors in the current module are preserved, while when True, the properties of the Tensors in the state dict are preserved. Default: False

Returns:
NamedTuple with missing_keys and unexpected_keys fields:
  • missing_keys is a list of str containing the missing keys

  • unexpected_keys is a list of str containing the unexpected keys

Note:

If a parameter or buffer is registered as None and its corresponding key exists in state_dict, load_state_dict() will raise a RuntimeError.

modules()

Returns an iterator over all modules in the network.

Yields:

Module: a module in the network

Note:

Duplicate modules are returned only once. In the following example, l will be returned only once.

Example:

>>> l = nn.Linear(2, 2)
>>> net = nn.Sequential(l, l)
>>> for idx, m in enumerate(net.modules()):
...     print(idx, '->', m)

0 -> Sequential(
  (0): Linear(in_features=2, out_features=2, bias=True)
  (1): Linear(in_features=2, out_features=2, bias=True)
)
1 -> Linear(in_features=2, out_features=2, bias=True)
Return type

Iterator[Module]

named_buffers(prefix='', recurse=True, remove_duplicate=True)

Returns an iterator over module buffers, yielding both the name of the buffer as well as the buffer itself.

Args:

prefix (str): prefix to prepend to all buffer names.

recurse (bool, optional): if True, then yields buffers of this module and all submodules. Otherwise, yields only buffers that are direct members of this module. Defaults to True.

remove_duplicate (bool, optional): whether to remove the duplicated buffers in the result. Defaults to True.

Yields:

(str, torch.Tensor): Tuple containing the name and buffer

Example:

>>> # xdoctest: +SKIP("undefined vars")
>>> for name, buf in self.named_buffers():
>>>     if name in ['running_var']:
>>>         print(buf.size())
Return type

Iterator[Tuple[str, Tensor]]

named_children()

Returns an iterator over immediate children modules, yielding both the name of the module as well as the module itself.

Yields:

(str, Module): Tuple containing a name and child module

Example:

>>> # xdoctest: +SKIP("undefined vars")
>>> for name, module in model.named_children():
>>>     if name in ['conv4', 'conv5']:
>>>         print(module)
Return type

Iterator[Tuple[str, Module]]

named_modules(memo=None, prefix='', remove_duplicate=True)

Returns an iterator over all modules in the network, yielding both the name of the module as well as the module itself.

Args:

memo: a memo to store the set of modules already added to the result

prefix: a prefix that will be added to the name of the module

remove_duplicate: whether or not to remove the duplicated module instances in the result

Yields:

(str, Module): Tuple of name and module

Note:

Duplicate modules are returned only once. In the following example, l will be returned only once.

Example:

>>> l = nn.Linear(2, 2)
>>> net = nn.Sequential(l, l)
>>> for idx, m in enumerate(net.named_modules()):
...     print(idx, '->', m)

0 -> ('', Sequential(
  (0): Linear(in_features=2, out_features=2, bias=True)
  (1): Linear(in_features=2, out_features=2, bias=True)
))
1 -> ('0', Linear(in_features=2, out_features=2, bias=True))
named_parameters(prefix='', recurse=True, remove_duplicate=True)

Returns an iterator over module parameters, yielding both the name of the parameter as well as the parameter itself.

Args:

prefix (str): prefix to prepend to all parameter names.

recurse (bool): if True, then yields parameters of this module and all submodules. Otherwise, yields only parameters that are direct members of this module.

remove_duplicate (bool, optional): whether to remove the duplicated parameters in the result. Defaults to True.

Yields:

(str, Parameter): Tuple containing the name and parameter

Example:

>>> # xdoctest: +SKIP("undefined vars")
>>> for name, param in self.named_parameters():
>>>     if name in ['bias']:
>>>         print(param.size())
Return type

Iterator[Tuple[str, Parameter]]

online_inference(x, context=None)
Return type

Tuple[Tensor, Tensor]

parameters(recurse=True)

Returns an iterator over module parameters.

This is typically passed to an optimizer.

Args:

recurse (bool): if True, then yields parameters of this module and all submodules. Otherwise, yields only parameters that are direct members of this module.

Yields:

Parameter: module parameter

Example:

>>> # xdoctest: +SKIP("undefined vars")
>>> for param in model.parameters():
>>>     print(type(param), param.size())
<class 'torch.Tensor'> (20L,)
<class 'torch.Tensor'> (20L, 1L, 5L, 5L)
Return type

Iterator[Parameter]

property preemph_coeff: float
Return type

float

register_backward_hook(hook)

Registers a backward hook on the module.

This function is deprecated in favor of register_full_backward_hook() and the behavior of this function will change in future versions.

Returns:
torch.utils.hooks.RemovableHandle:

a handle that can be used to remove the added hook by calling handle.remove()

Return type

RemovableHandle

register_buffer(name, tensor, persistent=True)

Adds a buffer to the module.

This is typically used to register a buffer that should not be considered a model parameter. For example, BatchNorm’s running_mean is not a parameter, but is part of the module’s state. Buffers, by default, are persistent and will be saved alongside parameters. This behavior can be changed by setting persistent to False. The only difference between a persistent buffer and a non-persistent buffer is that the latter will not be a part of this module’s state_dict.

Buffers can be accessed as attributes using given names.

Args:

name (str): name of the buffer. The buffer can be accessed from this module using the given name.

tensor (Tensor or None): buffer to be registered. If None, then operations that run on buffers, such as cuda, are ignored, and the buffer is not included in the module’s state_dict.

persistent (bool): whether the buffer is part of this module’s state_dict.

Example:

>>> # xdoctest: +SKIP("undefined vars")
>>> self.register_buffer('running_mean', torch.zeros(num_features))
Return type

None

register_forward_hook(hook, *, prepend=False, with_kwargs=False, always_call=False)

Registers a forward hook on the module.

The hook will be called every time after forward() has computed an output.

If with_kwargs is False or not specified, the input contains only the positional arguments given to the module. Keyword arguments are passed only to forward, not to the hooks. The hook can modify the output. It can also modify the input in-place, but this has no effect on forward, since the hook runs after forward() has been called. The hook should have the following signature:

hook(module, args, output) -> None or modified output

If with_kwargs is True, the forward hook will be passed the kwargs given to the forward function and be expected to return the output possibly modified. The hook should have the following signature:

hook(module, args, kwargs, output) -> None or modified output
Args:

hook (Callable): The user-defined hook to be registered.

prepend (bool): If True, the provided hook will be fired before all existing forward hooks on this torch.nn.modules.Module. Otherwise, the provided hook will be fired after all existing forward hooks on this torch.nn.modules.Module. Note that global forward hooks registered with register_module_forward_hook() will fire before all hooks registered by this method. Default: False

with_kwargs (bool): If True, the hook will be passed the kwargs given to the forward function. Default: False

always_call (bool): If True, the hook will be run regardless of whether an exception is raised while calling the Module. Default: False

Returns:
torch.utils.hooks.RemovableHandle:

a handle that can be used to remove the added hook by calling handle.remove()

Return type

RemovableHandle
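
Example (an illustrative sketch, not from the library docs): a hook that logs input/output shapes and is detached afterwards.

>>> import torch
>>> from torch import nn
>>> net = nn.Linear(4, 2)
>>> def log_shapes(module, args, output):
...     # runs after forward(); returning None keeps the original output
...     print(type(module).__name__, tuple(args[0].shape), '->', tuple(output.shape))
>>> handle = net.register_forward_hook(log_shapes)
>>> _ = net(torch.randn(3, 4))
Linear (3, 4) -> (3, 2)
>>> handle.remove()  # detach the hook when done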

register_forward_pre_hook(hook, *, prepend=False, with_kwargs=False)

Registers a forward pre-hook on the module.

The hook will be called every time before forward() is invoked.

If with_kwargs is false or not specified, the input contains only the positional arguments given to the module. Keyword arguments won’t be passed to the hooks and only to the forward. The hook can modify the input. User can either return a tuple or a single modified value in the hook. We will wrap the value into a tuple if a single value is returned (unless that value is already a tuple). The hook should have the following signature:

hook(module, args) -> None or modified input

If with_kwargs is true, the forward pre-hook will be passed the kwargs given to the forward function. And if the hook modifies the input, both the args and kwargs should be returned. The hook should have the following signature:

hook(module, args, kwargs) -> None or a tuple of modified input and kwargs
Args:

hook (Callable): The user-defined hook to be registered.

prepend (bool): If true, the provided hook will be fired before all existing forward_pre hooks on this torch.nn.modules.Module. Otherwise, the provided hook will be fired after all existing forward_pre hooks on this torch.nn.modules.Module. Note that global forward_pre hooks registered with register_module_forward_pre_hook() will fire before all hooks registered by this method. Default: False

with_kwargs (bool): If true, the hook will be passed the kwargs given to the forward function. Default: False

Returns:
torch.utils.hooks.RemovableHandle:

a handle that can be used to remove the added hook by calling handle.remove()

Return type

RemovableHandle
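
Example (an illustrative sketch): a pre-hook that rescales the input before forward() sees it. A single returned value is wrapped back into an args tuple automatically.

>>> import torch
>>> from torch import nn
>>> net = nn.Linear(4, 2)
>>> def halve_input(module, args):
...     return args[0] * 0.5
>>> handle = net.register_forward_pre_hook(halve_input)
>>> y = net(torch.randn(3, 4))  # forward() receives the halved input
>>> handle.remove()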

register_full_backward_hook(hook, prepend=False)

Registers a backward hook on the module.

The hook will be called every time the gradients with respect to a module are computed, i.e. the hook will execute if and only if the gradients with respect to module outputs are computed. The hook should have the following signature:

hook(module, grad_input, grad_output) -> tuple(Tensor) or None

The grad_input and grad_output are tuples that contain the gradients with respect to the inputs and outputs respectively. The hook should not modify its arguments, but it can optionally return a new gradient with respect to the input that will be used in place of grad_input in subsequent computations. grad_input will only correspond to the inputs given as positional arguments and all kwarg arguments are ignored. Entries in grad_input and grad_output will be None for all non-Tensor arguments.

For technical reasons, when this hook is applied to a Module, its forward function will receive a view of each Tensor passed to the Module. Similarly the caller will receive a view of each Tensor returned by the Module’s forward function.

Warning

Modifying inputs or outputs inplace is not allowed when using backward hooks and will raise an error.

Args:

hook (Callable): The user-defined hook to be registered.

prepend (bool): If true, the provided hook will be fired before all existing backward hooks on this torch.nn.modules.Module. Otherwise, the provided hook will be fired after all existing backward hooks on this torch.nn.modules.Module. Note that global backward hooks registered with register_module_full_backward_hook() will fire before all hooks registered by this method.

Returns:
torch.utils.hooks.RemovableHandle:

a handle that can be used to remove the added hook by calling handle.remove()

Return type

RemovableHandle
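
Example (an illustrative sketch): a hook that reports the shapes of the output gradients each time backward() reaches the module.

>>> import torch
>>> from torch import nn
>>> net = nn.Linear(4, 2)
>>> def report_grads(module, grad_input, grad_output):
...     # do not modify the arguments; returning None keeps grad_input as-is
...     print([None if g is None else tuple(g.shape) for g in grad_output])
>>> handle = net.register_full_backward_hook(report_grads)
>>> net(torch.randn(3, 4)).sum().backward()
[(3, 2)]
>>> handle.remove()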

register_full_backward_pre_hook(hook, prepend=False)

Registers a backward pre-hook on the module.

The hook will be called every time the gradients for the module are computed. The hook should have the following signature:

hook(module, grad_output) -> tuple[Tensor] or None

The grad_output is a tuple. The hook should not modify its arguments, but it can optionally return a new gradient with respect to the output that will be used in place of grad_output in subsequent computations. Entries in grad_output will be None for all non-Tensor arguments.

For technical reasons, when this hook is applied to a Module, its forward function will receive a view of each Tensor passed to the Module. Similarly the caller will receive a view of each Tensor returned by the Module’s forward function.

Warning

Modifying inputs inplace is not allowed when using backward hooks and will raise an error.

Args:

hook (Callable): The user-defined hook to be registered.

prepend (bool): If true, the provided hook will be fired before all existing backward_pre hooks on this torch.nn.modules.Module. Otherwise, the provided hook will be fired after all existing backward_pre hooks on this torch.nn.modules.Module. Note that global backward_pre hooks registered with register_module_full_backward_pre_hook() will fire before all hooks registered by this method.

Returns:
torch.utils.hooks.RemovableHandle:

a handle that can be used to remove the added hook by calling handle.remove()

Return type

RemovableHandle

register_load_state_dict_post_hook(hook)

Registers a post hook to be run after module’s load_state_dict is called.

It should have the following signature:

hook(module, incompatible_keys) -> None

The module argument is the current module that this hook is registered on, and the incompatible_keys argument is a NamedTuple consisting of attributes missing_keys and unexpected_keys. missing_keys is a list of str containing the missing keys and unexpected_keys is a list of str containing the unexpected keys.

The given incompatible_keys can be modified inplace if needed.

Note that the checks performed when calling load_state_dict() with strict=True are affected by modifications the hook makes to missing_keys or unexpected_keys, as expected. Additions to either set of keys will result in an error being thrown when strict=True, and clearing out both missing and unexpected keys will avoid an error.

Returns:
torch.utils.hooks.RemovableHandle:

a handle that can be used to remove the added hook by calling handle.remove()
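
Example (an illustrative sketch): a post-hook that clears an expected missing key so that a strict load still succeeds.

>>> import torch
>>> from torch import nn
>>> net = nn.Linear(4, 2)
>>> def ignore_bias(module, incompatible_keys):
...     # drop 'bias' from missing_keys; strict=True then no longer errors on it
...     incompatible_keys.missing_keys[:] = [
...         k for k in incompatible_keys.missing_keys if not k.endswith('bias')
...     ]
>>> _ = net.register_load_state_dict_post_hook(ignore_bias)
>>> net.load_state_dict({'weight': torch.zeros(2, 4)}, strict=True)
<All keys matched successfully>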

register_module(name, module)

Alias for add_module().

Return type

None

register_parameter(name, param)

Adds a parameter to the module.

The parameter can be accessed as an attribute using given name.

Args:

name (str): name of the parameter. The parameter can be accessed from this module using the given name.

param (Parameter or None): parameter to be added to the module. If None, then operations that run on parameters, such as cuda, are ignored, and the parameter is not included in the module’s state_dict.

Return type

None
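
Example (an illustrative sketch): registering a new learnable parameter, which then appears in parameters() and in the state_dict under the given name.

>>> import torch
>>> from torch import nn
>>> net = nn.Linear(4, 2)
>>> net.register_parameter('scale', nn.Parameter(torch.ones(2)))
>>> net.scale.shape
torch.Size([2])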

register_state_dict_pre_hook(hook)

These hooks will be called with arguments: self, prefix, and keep_vars before calling state_dict on self. The registered hooks can be used to perform pre-processing before the state_dict call is made.

property remove_dc_offset: bool
Return type

bool

requires_grad_(requires_grad=True)

Change if autograd should record operations on parameters in this module.

This method sets the parameters’ requires_grad attributes in-place.

This method is helpful for freezing part of the module for finetuning or training parts of a model individually (e.g., GAN training).

See locally-disable-grad-doc for a comparison between .requires_grad_() and several similar mechanisms that may be confused with it.

Args:

requires_grad (bool): whether autograd should record operations on parameters in this module. Default: True.

Returns:

Module: self

Return type

~T
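
Example (an illustrative sketch): the typical freezing pattern for fine-tuning, where a backbone stops recording gradients while a new head remains trainable.

>>> from torch import nn
>>> backbone = nn.Linear(8, 8).requires_grad_(False)  # frozen
>>> head = nn.Linear(8, 2)  # stays trainable
>>> any(p.requires_grad for p in backbone.parameters())
False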

property sampling_rate: int
Return type

int

set_extra_state(state)

This function is called from load_state_dict() to handle any extra state found within the state_dict. Implement this function and a corresponding get_extra_state() for your module if you need to store extra state within its state_dict.

Args:

state (dict): Extra state from the state_dict

share_memory()

See torch.Tensor.share_memory_()

Return type

~T

state_dict(*args, destination=None, prefix='', keep_vars=False)

Returns a dictionary containing references to the whole state of the module.

Both parameters and persistent buffers (e.g. running averages) are included. Keys are corresponding parameter and buffer names. Parameters and buffers set to None are not included.

Note

The returned object is a shallow copy. It contains references to the module’s parameters and buffers.

Warning

Currently state_dict() also accepts positional arguments for destination, prefix and keep_vars in order. However, this is being deprecated and keyword arguments will be enforced in future releases.

Warning

Please avoid the use of argument destination as it is not designed for end-users.

Args:

destination (dict, optional): If provided, the state of module will be updated into the dict and the same object is returned. Otherwise, an OrderedDict will be created and returned. Default: None.

prefix (str, optional): a prefix added to parameter and buffer names to compose the keys in state_dict. Default: ''.

keep_vars (bool, optional): by default the Tensors returned in the state dict are detached from autograd. If it’s set to True, detaching will not be performed. Default: False.

Returns:
dict:

a dictionary containing a whole state of the module

Example:

>>> # xdoctest: +SKIP("undefined vars")
>>> module.state_dict().keys()
['bias', 'weight']
to(*args, **kwargs)

Moves and/or casts the parameters and buffers.

This can be called as

to(device=None, dtype=None, non_blocking=False)
to(dtype, non_blocking=False)
to(tensor, non_blocking=False)
to(memory_format=torch.channels_last)

Its signature is similar to torch.Tensor.to(), but only accepts floating point or complex dtypes. In addition, this method will only cast the floating point or complex parameters and buffers to dtype (if given). The integral parameters and buffers will be moved to device, if that is given, but with dtypes unchanged. When non_blocking is set, it tries to convert/move asynchronously with respect to the host if possible, e.g., moving CPU Tensors with pinned memory to CUDA devices.

See below for examples.

Note

This method modifies the module in-place.

Args:

device (torch.device): the desired device of the parameters and buffers in this module

dtype (torch.dtype): the desired floating point or complex dtype of the parameters and buffers in this module

tensor (torch.Tensor): Tensor whose dtype and device are the desired dtype and device for all parameters and buffers in this module

memory_format (torch.memory_format): the desired memory format for 4D parameters and buffers in this module (keyword-only argument)

Returns:

Module: self

Examples:

>>> # xdoctest: +IGNORE_WANT("non-deterministic")
>>> linear = nn.Linear(2, 2)
>>> linear.weight
Parameter containing:
tensor([[ 0.1913, -0.3420],
        [-0.5113, -0.2325]])
>>> linear.to(torch.double)
Linear(in_features=2, out_features=2, bias=True)
>>> linear.weight
Parameter containing:
tensor([[ 0.1913, -0.3420],
        [-0.5113, -0.2325]], dtype=torch.float64)
>>> # xdoctest: +REQUIRES(env:TORCH_DOCTEST_CUDA1)
>>> gpu1 = torch.device("cuda:1")
>>> linear.to(gpu1, dtype=torch.half, non_blocking=True)
Linear(in_features=2, out_features=2, bias=True)
>>> linear.weight
Parameter containing:
tensor([[ 0.1914, -0.3420],
        [-0.5112, -0.2324]], dtype=torch.float16, device='cuda:1')
>>> cpu = torch.device("cpu")
>>> linear.to(cpu)
Linear(in_features=2, out_features=2, bias=True)
>>> linear.weight
Parameter containing:
tensor([[ 0.1914, -0.3420],
        [-0.5112, -0.2324]], dtype=torch.float16)

>>> linear = nn.Linear(2, 2, bias=None).to(torch.cdouble)
>>> linear.weight
Parameter containing:
tensor([[ 0.3741+0.j,  0.2382+0.j],
        [ 0.5593+0.j, -0.4443+0.j]], dtype=torch.complex128)
>>> linear(torch.ones(3, 2, dtype=torch.cdouble))
tensor([[0.6122+0.j, 0.1150+0.j],
        [0.6122+0.j, 0.1150+0.j],
        [0.6122+0.j, 0.1150+0.j]], dtype=torch.complex128)
to_empty(*, device, recurse=True)

Moves the parameters and buffers to the specified device without copying storage.

Args:

device (torch.device): The desired device of the parameters and buffers in this module.

recurse (bool): Whether parameters and buffers of submodules should be recursively moved to the specified device.

Returns:

Module: self

Return type

~T
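
Example (an illustrative sketch): materializing a module that was constructed on the meta device (i.e., without storage) onto a real device. The allocated tensors are deliberately left uninitialized.

>>> from torch import nn
>>> m = nn.Linear(4, 2, device='meta')
>>> m = m.to_empty(device='cpu')
>>> m.weight.device
device(type='cpu')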

train(mode=True)

Sets the module in training mode.

This has an effect only on certain modules. See the documentation of particular modules for details of their behavior in training/evaluation mode, if they are affected, e.g. Dropout, BatchNorm, etc.

Args:

mode (bool): whether to set training mode (True) or evaluation mode (False). Default: True.

Returns:

Module: self

Return type

~T

type(dst_type)

Casts all parameters and buffers to dst_type.

Note

This method modifies the module in-place.

Args:

dst_type (type or string): the desired type

Returns:

Module: self

Return type

~T

property window_type: str
Return type

str

xpu(device=None)

Moves all model parameters and buffers to the XPU.

This also makes associated parameters and buffers different objects. So it should be called before constructing optimizer if the module will live on XPU while being optimized.

Note

This method modifies the module in-place.

Arguments:

device (int, optional): if specified, all parameters will be copied to that device

Returns:

Module: self

Return type

~T

zero_grad(set_to_none=True)

Resets gradients of all model parameters. See similar function under torch.optim.Optimizer for more context.

Args:

set_to_none (bool): instead of setting to zero, set the grads to None. See torch.optim.Optimizer.zero_grad() for details.

Return type

None

training: bool
class lhotse.features.kaldi.layers.Wav2LogFilterBank(sampling_rate=16000, frame_length=0.025, frame_shift=0.01, round_to_power_of_two=True, remove_dc_offset=True, preemph_coeff=0.97, window_type='povey', dither=0.0, snip_edges=False, energy_floor=1e-10, raw_energy=True, use_energy=False, use_fft_mag=False, low_freq=20.0, high_freq=-400.0, num_filters=80, norm_filters=False, torchaudio_compatible_mel_scale=True)[source]

Apply standard Kaldi preprocessing (dithering, removing DC offset, pre-emphasis, etc.) on the input waveforms and compute their log-Mel filter bank energies (also known as “fbank”).

Example:

>>> x = torch.randn(1, 16000, dtype=torch.float32)
>>> x.shape
torch.Size([1, 16000])
>>> t = Wav2LogFilterBank()
>>> t(x).shape
torch.Size([1, 100, 80])

The input is a tensor of shape (batch_size, num_samples). The output is a tensor of shape (batch_size, num_frames, num_filters).

__init__(sampling_rate=16000, frame_length=0.025, frame_shift=0.01, round_to_power_of_two=True, remove_dc_offset=True, preemph_coeff=0.97, window_type='povey', dither=0.0, snip_edges=False, energy_floor=1e-10, raw_energy=True, use_energy=False, use_fft_mag=False, low_freq=20.0, high_freq=-400.0, num_filters=80, norm_filters=False, torchaudio_compatible_mel_scale=True)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

T_destination

alias of TypeVar(‘T_destination’, bound=Dict[str, Any])

add_module(name, module)

Adds a child module to the current module.

The module can be accessed as an attribute using the given name.

Args:

name (str): name of the child module. The child module can be accessed from this module using the given name.

module (Module): child module to be added to the module.

Return type

None

apply(fn)

Applies fn recursively to every submodule (as returned by .children()) as well as self. Typical use includes initializing the parameters of a model (see also nn-init-doc).

Args:

fn (Module -> None): function to be applied to each submodule

Returns:

Module: self

Example:

>>> @torch.no_grad()
>>> def init_weights(m):
>>>     print(m)
>>>     if type(m) == nn.Linear:
>>>         m.weight.fill_(1.0)
>>>         print(m.weight)
>>> net = nn.Sequential(nn.Linear(2, 2), nn.Linear(2, 2))
>>> net.apply(init_weights)
Linear(in_features=2, out_features=2, bias=True)
Parameter containing:
tensor([[1., 1.],
        [1., 1.]], requires_grad=True)
Linear(in_features=2, out_features=2, bias=True)
Parameter containing:
tensor([[1., 1.],
        [1., 1.]], requires_grad=True)
Sequential(
  (0): Linear(in_features=2, out_features=2, bias=True)
  (1): Linear(in_features=2, out_features=2, bias=True)
)
Return type

~T

bfloat16()

Casts all floating point parameters and buffers to bfloat16 datatype.

Note

This method modifies the module in-place.

Returns:

Module: self

Return type

~T

buffers(recurse=True)

Returns an iterator over module buffers.

Args:

recurse (bool): if True, then yields buffers of this module and all submodules. Otherwise, yields only buffers that are direct members of this module.

Yields:

torch.Tensor: module buffer

Example:

>>> # xdoctest: +SKIP("undefined vars")
>>> for buf in model.buffers():
>>>     print(type(buf), buf.size())
<class 'torch.Tensor'> (20L,)
<class 'torch.Tensor'> (20L, 1L, 5L, 5L)
Return type

Iterator[Tensor]

call_super_init: bool = False
children()

Returns an iterator over immediate children modules.

Yields:

Module: a child module

Return type

Iterator[Module]

compile(*args, **kwargs)

Compile this Module’s forward using torch.compile().

This Module’s __call__ method is compiled and all arguments are passed as-is to torch.compile().

See torch.compile() for details on the arguments for this function.
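
Example (an illustrative sketch, assuming a PyTorch version recent enough to provide Module.compile()): compiling the module in-place so that subsequent calls run through the compiled forward.

>>> import torch
>>> from torch import nn
>>> net = nn.Linear(4, 2)
>>> net.compile()
>>> y = net(torch.randn(3, 4))  # first call triggers compilation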

cpu()

Moves all model parameters and buffers to the CPU.

Note

This method modifies the module in-place.

Returns:

Module: self

Return type

~T

cuda(device=None)

Moves all model parameters and buffers to the GPU.

This also makes associated parameters and buffers different objects. So it should be called before constructing optimizer if the module will live on GPU while being optimized.

Note

This method modifies the module in-place.

Args:

device (int, optional): if specified, all parameters will be copied to that device

Returns:

Module: self

Return type

~T

property dither: float
Return type

float

double()

Casts all floating point parameters and buffers to double datatype.

Note

This method modifies the module in-place.

Returns:

Module: self

Return type

~T

dump_patches: bool = False
eval()

Sets the module in evaluation mode.

This has an effect only on certain modules. See the documentation of particular modules for details of their behavior in training/evaluation mode, if they are affected, e.g. Dropout, BatchNorm, etc.

This is equivalent to self.train(False).

See locally-disable-grad-doc for a comparison between .eval() and several similar mechanisms that may be confused with it.

Returns:

Module: self

Return type

~T

extra_repr()

Set the extra representation of the module.

To print customized extra information, you should re-implement this method in your own modules. Both single-line and multi-line strings are acceptable.

Return type

str

float()

Casts all floating point parameters and buffers to float datatype.

Note

This method modifies the module in-place.

Returns:

Module: self

Return type

~T

forward(x)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

Return type

Tensor

property frame_length: float
Return type

float

property frame_shift: float
Return type

float

get_buffer(target)

Returns the buffer given by target if it exists, otherwise throws an error.

See the docstring for get_submodule for a more detailed explanation of this method’s functionality as well as how to correctly specify target.

Args:

target: The fully-qualified string name of the buffer to look for. (See get_submodule for how to specify a fully-qualified string.)

Returns:

torch.Tensor: The buffer referenced by target

Raises:
AttributeError: If the target string references an invalid

path or resolves to something that is not a buffer

Return type

Tensor

get_extra_state()

Returns any extra state to include in the module’s state_dict. Implement this and a corresponding set_extra_state() for your module if you need to store extra state. This function is called when building the module’s state_dict().

Note that extra state should be picklable to ensure working serialization of the state_dict. We only provide backwards compatibility guarantees for serializing Tensors; other objects may break backwards compatibility if their serialized pickled form changes.

Returns:

object: Any extra state to store in the module’s state_dict

Return type

Any

get_parameter(target)

Returns the parameter given by target if it exists, otherwise throws an error.

See the docstring for get_submodule for a more detailed explanation of this method’s functionality as well as how to correctly specify target.

Args:

target: The fully-qualified string name of the Parameter to look for. (See get_submodule for how to specify a fully-qualified string.)

Returns:

torch.nn.Parameter: The Parameter referenced by target

Raises:
AttributeError: If the target string references an invalid

path or resolves to something that is not an nn.Parameter

Return type

Parameter

get_submodule(target)

Returns the submodule given by target if it exists, otherwise throws an error.

For example, let’s say you have an nn.Module A that looks like this:

A(
    (net_b): Module(
        (net_c): Module(
            (conv): Conv2d(16, 33, kernel_size=(3, 3), stride=(2, 2))
        )
        (linear): Linear(in_features=100, out_features=200, bias=True)
    )
)

(The diagram shows an nn.Module A. A has a nested submodule net_b, which itself has two submodules net_c and linear. net_c then has a submodule conv.)

To check whether or not we have the linear submodule, we would call get_submodule("net_b.linear"). To check whether we have the conv submodule, we would call get_submodule("net_b.net_c.conv").

The runtime of get_submodule is bounded by the degree of module nesting in target. A query against named_modules achieves the same result, but it is O(N) in the number of transitive modules. So, for a simple check to see if some submodule exists, get_submodule should always be used.

Args:

target: The fully-qualified string name of the submodule to look for. (See above example for how to specify a fully-qualified string.)

Returns:

torch.nn.Module: The submodule referenced by target

Raises:
AttributeError: If the target string references an invalid

path or resolves to something that is not an nn.Module

Return type

Module
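
Example (an illustrative sketch): fully-qualified names follow attribute access, so the first child of an nn.Sequential is reachable as '0'.

>>> from torch import nn
>>> net = nn.Sequential(nn.Linear(2, 2), nn.ReLU())
>>> net.get_submodule('0')
Linear(in_features=2, out_features=2, bias=True)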

half()

Casts all floating point parameters and buffers to half datatype.

Note

This method modifies the module in-place.

Returns:

Module: self

Return type

~T

ipu(device=None)

Moves all model parameters and buffers to the IPU.

This also makes associated parameters and buffers different objects. So it should be called before constructing optimizer if the module will live on IPU while being optimized.

Note

This method modifies the module in-place.

Arguments:

device (int, optional): if specified, all parameters will be copied to that device

Returns:

Module: self

Return type

~T

load_state_dict(state_dict, strict=True, assign=False)

Copies parameters and buffers from state_dict into this module and its descendants. If strict is True, then the keys of state_dict must exactly match the keys returned by this module’s state_dict() function.

Warning

If assign is True the optimizer must be created after the call to load_state_dict.

Args:

state_dict (dict): a dict containing parameters and persistent buffers.

strict (bool, optional): whether to strictly enforce that the keys in state_dict match the keys returned by this module’s state_dict() function. Default: True

assign (bool, optional): whether to assign items in the state dictionary to their corresponding keys in the module instead of copying them in-place into the module’s current parameters and buffers. When False, the properties of the tensors in the current module are preserved, while when True, the properties of the Tensors in the state dict are preserved. Default: False

Returns:
NamedTuple with missing_keys and unexpected_keys fields:
  • missing_keys is a list of str containing the missing keys

  • unexpected_keys is a list of str containing the unexpected keys

Note:

If a parameter or buffer is registered as None and its corresponding key exists in state_dict, load_state_dict() will raise a RuntimeError.
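
Example (an illustrative sketch): copying weights between two identically-shaped modules. With the default strict=True, any key mismatch raises, and the returned NamedTuple reports what differed.

>>> from torch import nn
>>> src, dst = nn.Linear(4, 2), nn.Linear(4, 2)
>>> missing, unexpected = dst.load_state_dict(src.state_dict())
>>> (missing, unexpected)
([], [])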

modules()

Returns an iterator over all modules in the network.

Yields:

Module: a module in the network

Note:

Duplicate modules are returned only once. In the following example, l will be returned only once.

Example:

>>> l = nn.Linear(2, 2)
>>> net = nn.Sequential(l, l)
>>> for idx, m in enumerate(net.modules()):
...     print(idx, '->', m)

0 -> Sequential(
  (0): Linear(in_features=2, out_features=2, bias=True)
  (1): Linear(in_features=2, out_features=2, bias=True)
)
1 -> Linear(in_features=2, out_features=2, bias=True)
Return type

Iterator[Module]

named_buffers(prefix='', recurse=True, remove_duplicate=True)

Returns an iterator over module buffers, yielding both the name of the buffer as well as the buffer itself.

Args:

prefix (str): prefix to prepend to all buffer names.

recurse (bool, optional): if True, then yields buffers of this module and all submodules. Otherwise, yields only buffers that are direct members of this module. Defaults to True.

remove_duplicate (bool, optional): whether to remove the duplicated buffers in the result. Defaults to True.

Yields:

(str, torch.Tensor): Tuple containing the name and buffer

Example:

>>> # xdoctest: +SKIP("undefined vars")
>>> for name, buf in self.named_buffers():
>>>     if name in ['running_var']:
>>>         print(buf.size())
Return type

Iterator[Tuple[str, Tensor]]

named_children()

Returns an iterator over immediate children modules, yielding both the name of the module as well as the module itself.

Yields:

(str, Module): Tuple containing a name and child module

Example:

>>> # xdoctest: +SKIP("undefined vars")
>>> for name, module in model.named_children():
>>>     if name in ['conv4', 'conv5']:
>>>         print(module)
Return type

Iterator[Tuple[str, Module]]

named_modules(memo=None, prefix='', remove_duplicate=True)

Returns an iterator over all modules in the network, yielding both the name of the module as well as the module itself.

Args:

memo: a memo to store the set of modules already added to the result

prefix: a prefix that will be added to the name of the module

remove_duplicate: whether or not to remove the duplicated module instances in the result

Yields:

(str, Module): Tuple of name and module

Note:

Duplicate modules are returned only once. In the following example, l will be returned only once.

Example:

>>> l = nn.Linear(2, 2)
>>> net = nn.Sequential(l, l)
>>> for idx, m in enumerate(net.named_modules()):
...     print(idx, '->', m)

0 -> ('', Sequential(
  (0): Linear(in_features=2, out_features=2, bias=True)
  (1): Linear(in_features=2, out_features=2, bias=True)
))
1 -> ('0', Linear(in_features=2, out_features=2, bias=True))
named_parameters(prefix='', recurse=True, remove_duplicate=True)

Returns an iterator over module parameters, yielding both the name of the parameter as well as the parameter itself.

Args:

prefix (str): prefix to prepend to all parameter names.

recurse (bool): if True, then yields parameters of this module and all submodules. Otherwise, yields only parameters that are direct members of this module.

remove_duplicate (bool, optional): whether to remove the duplicated parameters in the result. Defaults to True.

Yields:

(str, Parameter): Tuple containing the name and parameter

Example:

>>> # xdoctest: +SKIP("undefined vars")
>>> for name, param in self.named_parameters():
>>>     if name in ['bias']:
>>>         print(param.size())
Return type

Iterator[Tuple[str, Parameter]]

online_inference(x, context=None)
Return type

Tuple[Tensor, Tensor]

parameters(recurse=True)

Returns an iterator over module parameters.

This is typically passed to an optimizer.

Args:

recurse (bool): if True, then yields parameters of this module and all submodules. Otherwise, yields only parameters that are direct members of this module.

Yields:

Parameter: module parameter

Example:

>>> # xdoctest: +SKIP("undefined vars")
>>> for param in model.parameters():
>>>     print(type(param), param.size())
<class 'torch.Tensor'> (20L,)
<class 'torch.Tensor'> (20L, 1L, 5L, 5L)
Return type

Iterator[Parameter]

property preemph_coeff: float
Return type

float

register_backward_hook(hook)

Registers a backward hook on the module.

This function is deprecated in favor of register_full_backward_hook() and the behavior of this function will change in future versions.

Returns:
torch.utils.hooks.RemovableHandle:

a handle that can be used to remove the added hook by calling handle.remove()

Return type

RemovableHandle

register_buffer(name, tensor, persistent=True)

Adds a buffer to the module.

This is typically used to register a buffer that should not be considered a model parameter. For example, BatchNorm’s running_mean is not a parameter, but is part of the module’s state. Buffers, by default, are persistent and will be saved alongside parameters. This behavior can be changed by setting persistent to False. The only difference between a persistent buffer and a non-persistent buffer is that the latter will not be a part of this module’s state_dict.

Buffers can be accessed as attributes using given names.

Args:

name (str): name of the buffer. The buffer can be accessed from this module using the given name.

tensor (Tensor or None): buffer to be registered. If None, then operations that run on buffers, such as cuda, are ignored, and the buffer is not included in the module’s state_dict.

persistent (bool): whether the buffer is part of this module’s state_dict.

Example:

>>> # xdoctest: +SKIP("undefined vars")
>>> self.register_buffer('running_mean', torch.zeros(num_features))
Return type

None

register_forward_hook(hook, *, prepend=False, with_kwargs=False, always_call=False)

Registers a forward hook on the module.

The hook will be called every time after forward() has computed an output.

If with_kwargs is False or not specified, the input contains only the positional arguments given to the module. Keyword arguments are passed only to forward, not to the hooks. The hook can modify the output. It can also modify the input in-place, but this has no effect on forward, since the hook runs after forward() has been called. The hook should have the following signature:

hook(module, args, output) -> None or modified output

If with_kwargs is True, the forward hook will be passed the kwargs given to the forward function and be expected to return the output possibly modified. The hook should have the following signature:

hook(module, args, kwargs, output) -> None or modified output
Args:

hook (Callable): The user-defined hook to be registered.

prepend (bool): If True, the provided hook will be fired before all existing forward hooks on this torch.nn.modules.Module. Otherwise, the provided hook will be fired after all existing forward hooks on this torch.nn.modules.Module. Note that global forward hooks registered with register_module_forward_hook() will fire before all hooks registered by this method. Default: False

with_kwargs (bool): If True, the hook will be passed the kwargs given to the forward function. Default: False

always_call (bool): If True, the hook will be run regardless of whether an exception is raised while calling the Module. Default: False

Returns:
torch.utils.hooks.RemovableHandle:

a handle that can be used to remove the added hook by calling handle.remove()

Return type

RemovableHandle

register_forward_pre_hook(hook, *, prepend=False, with_kwargs=False)

Registers a forward pre-hook on the module.

The hook will be called every time before forward() is invoked.

If with_kwargs is false or not specified, the input contains only the positional arguments given to the module. Keyword arguments won’t be passed to the hooks and only to the forward. The hook can modify the input. User can either return a tuple or a single modified value in the hook. We will wrap the value into a tuple if a single value is returned (unless that value is already a tuple). The hook should have the following signature:

hook(module, args) -> None or modified input

If with_kwargs is true, the forward pre-hook will be passed the kwargs given to the forward function. And if the hook modifies the input, both the args and kwargs should be returned. The hook should have the following signature:

hook(module, args, kwargs) -> None or a tuple of modified input and kwargs
Args:

hook (Callable): The user-defined hook to be registered.

prepend (bool): If true, the provided hook will be fired before all existing forward_pre hooks on this torch.nn.modules.Module. Otherwise, the provided hook will be fired after all existing forward_pre hooks on this torch.nn.modules.Module. Note that global forward_pre hooks registered with register_module_forward_pre_hook() will fire before all hooks registered by this method. Default: False

with_kwargs (bool): If true, the hook will be passed the kwargs given to the forward function. Default: False

Returns:
torch.utils.hooks.RemovableHandle:

a handle that can be used to remove the added hook by calling handle.remove()

Return type

RemovableHandle

register_full_backward_hook(hook, prepend=False)

Registers a backward hook on the module.

The hook will be called every time the gradients with respect to a module are computed, i.e. the hook will execute if and only if the gradients with respect to module outputs are computed. The hook should have the following signature:

hook(module, grad_input, grad_output) -> tuple(Tensor) or None

The grad_input and grad_output are tuples that contain the gradients with respect to the inputs and outputs respectively. The hook should not modify its arguments, but it can optionally return a new gradient with respect to the input that will be used in place of grad_input in subsequent computations. grad_input will only correspond to the inputs given as positional arguments and all kwarg arguments are ignored. Entries in grad_input and grad_output will be None for all non-Tensor arguments.

For technical reasons, when this hook is applied to a Module, its forward function will receive a view of each Tensor passed to the Module. Similarly the caller will receive a view of each Tensor returned by the Module’s forward function.

Warning

Modifying inputs or outputs inplace is not allowed when using backward hooks and will raise an error.

Args:

hook (Callable): The user-defined hook to be registered.

prepend (bool): If true, the provided hook will be fired before all existing backward hooks on this torch.nn.modules.Module. Otherwise, the provided hook will be fired after all existing backward hooks on this torch.nn.modules.Module. Note that global backward hooks registered with register_module_full_backward_hook() will fire before all hooks registered by this method.

Returns:
torch.utils.hooks.RemovableHandle:

a handle that can be used to remove the added hook by calling handle.remove()

Return type

RemovableHandle

register_full_backward_pre_hook(hook, prepend=False)

Registers a backward pre-hook on the module.

The hook will be called every time the gradients for the module are computed. The hook should have the following signature:

hook(module, grad_output) -> tuple[Tensor] or None

The grad_output is a tuple. The hook should not modify its arguments, but it can optionally return a new gradient with respect to the output that will be used in place of grad_output in subsequent computations. Entries in grad_output will be None for all non-Tensor arguments.

For technical reasons, when this hook is applied to a Module, its forward function will receive a view of each Tensor passed to the Module. Similarly the caller will receive a view of each Tensor returned by the Module’s forward function.

Warning

Modifying inputs inplace is not allowed when using backward hooks and will raise an error.

Args:

hook (Callable): The user-defined hook to be registered.

prepend (bool): If true, the provided hook will be fired before all existing backward_pre hooks on this torch.nn.modules.Module. Otherwise, the provided hook will be fired after all existing backward_pre hooks on this torch.nn.modules.Module. Note that global backward_pre hooks registered with register_module_full_backward_pre_hook() will fire before all hooks registered by this method.

Returns:
torch.utils.hooks.RemovableHandle:

a handle that can be used to remove the added hook by calling handle.remove()

Return type

RemovableHandle

register_load_state_dict_post_hook(hook)

Registers a post hook to be run after module’s load_state_dict is called.

It should have the following signature:

hook(module, incompatible_keys) -> None

The module argument is the current module that this hook is registered on, and the incompatible_keys argument is a NamedTuple consisting of attributes missing_keys and unexpected_keys. missing_keys is a list of str containing the missing keys and unexpected_keys is a list of str containing the unexpected keys.

The given incompatible_keys can be modified inplace if needed.

Note that the checks performed when calling load_state_dict() with strict=True are affected by modifications the hook makes to missing_keys or unexpected_keys, as expected. Additions to either set of keys will result in an error being thrown when strict=True, and clearing out both missing and unexpected keys will avoid an error.

Returns:
torch.utils.hooks.RemovableHandle:

a handle that can be used to remove the added hook by calling handle.remove()

register_module(name, module)

Alias for add_module().

Return type

None

register_parameter(name, param)

Adds a parameter to the module.

The parameter can be accessed as an attribute using given name.

Args:

name (str): name of the parameter. The parameter can be accessed from this module using the given name.

param (Parameter or None): parameter to be added to the module. If None, then operations that run on parameters, such as cuda, are ignored, and the parameter is not included in the module’s state_dict.

Return type

None

register_state_dict_pre_hook(hook)

These hooks will be called with arguments: self, prefix, and keep_vars before calling state_dict on self. The registered hooks can be used to perform pre-processing before the state_dict call is made.

property remove_dc_offset: bool
Return type

bool

requires_grad_(requires_grad=True)

Change if autograd should record operations on parameters in this module.

This method sets the parameters’ requires_grad attributes in-place.

This method is helpful for freezing part of the module for finetuning or training parts of a model individually (e.g., GAN training).

See locally-disable-grad-doc for a comparison between .requires_grad_() and several similar mechanisms that may be confused with it.

Args:

requires_grad (bool): whether autograd should record operations on parameters in this module. Default: True.

Returns:

Module: self

Return type

~T

property sampling_rate: int
Return type

int

set_extra_state(state)

This function is called from load_state_dict() to handle any extra state found within the state_dict. Implement this function and a corresponding get_extra_state() for your module if you need to store extra state within its state_dict.

Args:

state (dict): Extra state from the state_dict

share_memory()

See torch.Tensor.share_memory_()

Return type

~T

state_dict(*args, destination=None, prefix='', keep_vars=False)

Returns a dictionary containing references to the whole state of the module.

Both parameters and persistent buffers (e.g. running averages) are included. Keys are corresponding parameter and buffer names. Parameters and buffers set to None are not included.

Note

The returned object is a shallow copy. It contains references to the module’s parameters and buffers.

Warning

Currently state_dict() also accepts positional arguments for destination, prefix and keep_vars in order. However, this is being deprecated and keyword arguments will be enforced in future releases.

Warning

Please avoid the use of argument destination as it is not designed for end-users.

Args:

destination (dict, optional): If provided, the state of module will be updated into the dict and the same object is returned. Otherwise, an OrderedDict will be created and returned. Default: None.

prefix (str, optional): a prefix added to parameter and buffer names to compose the keys in state_dict. Default: ''.

keep_vars (bool, optional): by default the Tensors returned in the state dict are detached from autograd. If it’s set to True, detaching will not be performed. Default: False.

Returns:
dict:

a dictionary containing a whole state of the module

Example:

>>> # xdoctest: +SKIP("undefined vars")
>>> module.state_dict().keys()
['bias', 'weight']
to(*args, **kwargs)

Moves and/or casts the parameters and buffers.

This can be called as

to(device=None, dtype=None, non_blocking=False)
to(dtype, non_blocking=False)
to(tensor, non_blocking=False)
to(memory_format=torch.channels_last)

Its signature is similar to torch.Tensor.to(), but only accepts floating point or complex dtypes. In addition, this method will only cast the floating point or complex parameters and buffers to dtype (if given). The integral parameters and buffers will be moved to device, if that is given, but with dtypes unchanged. When non_blocking is set, it tries to convert/move asynchronously with respect to the host if possible, e.g., moving CPU Tensors with pinned memory to CUDA devices.

See below for examples.

Note

This method modifies the module in-place.

Args:

device (torch.device): the desired device of the parameters and buffers in this module

dtype (torch.dtype): the desired floating point or complex dtype of the parameters and buffers in this module

tensor (torch.Tensor): Tensor whose dtype and device are the desired dtype and device for all parameters and buffers in this module

memory_format (torch.memory_format): the desired memory format for 4D parameters and buffers in this module (keyword-only argument)

Returns:

Module: self

Examples:

>>> # xdoctest: +IGNORE_WANT("non-deterministic")
>>> linear = nn.Linear(2, 2)
>>> linear.weight
Parameter containing:
tensor([[ 0.1913, -0.3420],
        [-0.5113, -0.2325]])
>>> linear.to(torch.double)
Linear(in_features=2, out_features=2, bias=True)
>>> linear.weight
Parameter containing:
tensor([[ 0.1913, -0.3420],
        [-0.5113, -0.2325]], dtype=torch.float64)
>>> # xdoctest: +REQUIRES(env:TORCH_DOCTEST_CUDA1)
>>> gpu1 = torch.device("cuda:1")
>>> linear.to(gpu1, dtype=torch.half, non_blocking=True)
Linear(in_features=2, out_features=2, bias=True)
>>> linear.weight
Parameter containing:
tensor([[ 0.1914, -0.3420],
        [-0.5112, -0.2324]], dtype=torch.float16, device='cuda:1')
>>> cpu = torch.device("cpu")
>>> linear.to(cpu)
Linear(in_features=2, out_features=2, bias=True)
>>> linear.weight
Parameter containing:
tensor([[ 0.1914, -0.3420],
        [-0.5112, -0.2324]], dtype=torch.float16)

>>> linear = nn.Linear(2, 2, bias=None).to(torch.cdouble)
>>> linear.weight
Parameter containing:
tensor([[ 0.3741+0.j,  0.2382+0.j],
        [ 0.5593+0.j, -0.4443+0.j]], dtype=torch.complex128)
>>> linear(torch.ones(3, 2, dtype=torch.cdouble))
tensor([[0.6122+0.j, 0.1150+0.j],
        [0.6122+0.j, 0.1150+0.j],
        [0.6122+0.j, 0.1150+0.j]], dtype=torch.complex128)
to_empty(*, device, recurse=True)

Moves the parameters and buffers to the specified device without copying storage.

Args:

device (torch.device): The desired device of the parameters and buffers in this module.

recurse (bool): Whether parameters and buffers of submodules should be recursively moved to the specified device.

Returns:

Module: self

Return type

~T

train(mode=True)

Sets the module in training mode.

This has an effect only on certain modules. See the documentation of particular modules for details of their behavior in training/evaluation mode, if they are affected, e.g. Dropout, BatchNorm, etc.

Args:

mode (bool): whether to set training mode (True) or evaluation mode (False). Default: True.

Returns:

Module: self

Return type

~T

type(dst_type)

Casts all parameters and buffers to dst_type.

Note

This method modifies the module in-place.

Args:

dst_type (type or string): the desired type

Returns:

Module: self

Return type

~T

property window_type: str
Return type

str

xpu(device=None)

Moves all model parameters and buffers to the XPU.

This also makes associated parameters and buffers different objects. So it should be called before constructing optimizer if the module will live on XPU while being optimized.

Note

This method modifies the module in-place.

Arguments:

device (int, optional): if specified, all parameters will be copied to that device

Returns:

Module: self

Return type

~T

zero_grad(set_to_none=True)

Resets gradients of all model parameters. See similar function under torch.optim.Optimizer for more context.

Args:

set_to_none (bool): instead of setting to zero, set the grads to None. See torch.optim.Optimizer.zero_grad() for details.

Return type

None

training: bool
class lhotse.features.kaldi.layers.Wav2MFCC(sampling_rate=16000, frame_length=0.025, frame_shift=0.01, round_to_power_of_two=True, remove_dc_offset=True, preemph_coeff=0.97, window_type='povey', dither=0.0, snip_edges=False, energy_floor=1e-10, raw_energy=True, use_energy=False, use_fft_mag=False, low_freq=20.0, high_freq=-400.0, num_filters=23, norm_filters=False, num_ceps=13, cepstral_lifter=22, torchaudio_compatible_mel_scale=True)[source]

Apply standard Kaldi preprocessing (dithering, removing DC offset, pre-emphasis, etc.) on the input waveforms and compute their Mel-Frequency Cepstral Coefficients (MFCC).

Example:

>>> x = torch.randn(1, 16000, dtype=torch.float32)
>>> x.shape
torch.Size([1, 16000])
>>> t = Wav2MFCC()
>>> t(x).shape
torch.Size([1, 100, 13])

The input is a tensor of shape (batch_size, num_samples). The output is a tensor of shape (batch_size, num_frames, num_ceps).

__init__(sampling_rate=16000, frame_length=0.025, frame_shift=0.01, round_to_power_of_two=True, remove_dc_offset=True, preemph_coeff=0.97, window_type='povey', dither=0.0, snip_edges=False, energy_floor=1e-10, raw_energy=True, use_energy=False, use_fft_mag=False, low_freq=20.0, high_freq=-400.0, num_filters=23, norm_filters=False, num_ceps=13, cepstral_lifter=22, torchaudio_compatible_mel_scale=True)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

static make_lifter(N, Q)[source]

Makes the liftering function.

Args:

N: Number of cepstral coefficients.

Q: Liftering parameter.

Returns:

Liftering vector.
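
The implementation is not shown here, but the standard Kaldi/HTK cepstral lifter this helper is expected to produce is lifter[n] = 1 + (Q / 2) * sin(pi * n / Q) for n = 0..N-1. A sketch under that assumption (not copied from the lhotse source):

>>> import torch
>>> def make_lifter(N, Q):
...     n = torch.arange(N, dtype=torch.float32)
...     return 1.0 + 0.5 * Q * torch.sin(torch.pi * n / Q)
>>> lifter = make_lifter(13, 22)  # matches the num_ceps=13, cepstral_lifter=22 defaults
>>> lifter.shape
torch.Size([13])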

static make_dct_matrix(num_ceps, num_filters)[source]
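
This helper is undocumented; MFCC extraction conventionally multiplies the log-mel filter bank output by an orthonormal DCT-II matrix that maps num_filters bins to num_ceps coefficients. A sketch under that assumption (the exact shape and orientation used by lhotse is a guess; check the source before relying on it):

>>> import math, torch
>>> def make_dct_matrix(num_ceps, num_filters):
...     n = torch.arange(num_filters, dtype=torch.float32)
...     k = torch.arange(num_ceps, dtype=torch.float32).unsqueeze(1)
...     dct = math.sqrt(2.0 / num_filters) * torch.cos(
...         math.pi / num_filters * (n + 0.5) * k)  # (num_ceps, num_filters)
...     dct[0] *= 1.0 / math.sqrt(2.0)  # orthonormalize the first row
...     return dct.t()  # (num_filters, num_ceps), so that feats @ dct gives cepstra
>>> make_dct_matrix(13, 23).shape
torch.Size([23, 13])
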
T_destination

alias of TypeVar(‘T_destination’, bound=Dict[str, Any])

add_module(name, module)

Adds a child module to the current module.

The module can be accessed as an attribute using the given name.

Args:

name (str): name of the child module. The child module can be accessed from this module using the given name.

module (Module): child module to be added to the module.

Return type

None

apply(fn)

Applies fn recursively to every submodule (as returned by .children()) as well as self. Typical use includes initializing the parameters of a model (see also nn-init-doc).

Args:

fn (Module -> None): function to be applied to each submodule

Returns:

Module: self

Example:

>>> @torch.no_grad()
>>> def init_weights(m):
>>>     print(m)
>>>     if type(m) == nn.Linear:
>>>         m.weight.fill_(1.0)
>>>         print(m.weight)
>>> net = nn.Sequential(nn.Linear(2, 2), nn.Linear(2, 2))
>>> net.apply(init_weights)
Linear(in_features=2, out_features=2, bias=True)
Parameter containing:
tensor([[1., 1.],
        [1., 1.]], requires_grad=True)
Linear(in_features=2, out_features=2, bias=True)
Parameter containing:
tensor([[1., 1.],
        [1., 1.]], requires_grad=True)
Sequential(
  (0): Linear(in_features=2, out_features=2, bias=True)
  (1): Linear(in_features=2, out_features=2, bias=True)
)
Return type

~T

bfloat16()

Casts all floating point parameters and buffers to bfloat16 datatype.

Note

This method modifies the module in-place.

Returns:

Module: self

Return type

~T

buffers(recurse=True)

Returns an iterator over module buffers.

Args:

recurse (bool): if True, then yields buffers of this module and all submodules. Otherwise, yields only buffers that are direct members of this module.

Yields:

torch.Tensor: module buffer

Example:

>>> # xdoctest: +SKIP("undefined vars")
>>> for buf in model.buffers():
>>>     print(type(buf), buf.size())
<class 'torch.Tensor'> (20L,)
<class 'torch.Tensor'> (20L, 1L, 5L, 5L)
Return type

Iterator[Tensor]

call_super_init: bool = False
children()

Returns an iterator over immediate children modules.

Yields:

Module: a child module

Return type

Iterator[Module]

compile(*args, **kwargs)

Compile this Module’s forward using torch.compile().

This Module’s __call__ method is compiled and all arguments are passed as-is to torch.compile().

See torch.compile() for details on the arguments for this function.

cpu()

Moves all model parameters and buffers to the CPU.

Note

This method modifies the module in-place.

Returns:

Module: self

Return type

~T

cuda(device=None)

Moves all model parameters and buffers to the GPU.

This also makes associated parameters and buffers different objects. So it should be called before constructing optimizer if the module will live on GPU while being optimized.

Note

This method modifies the module in-place.

Args:

device (int, optional): if specified, all parameters will be copied to that device

Returns:

Module: self

Return type

~T

property dither: float
Return type

float

double()

Casts all floating point parameters and buffers to double datatype.

Note

This method modifies the module in-place.

Returns:

Module: self

Return type

~T

dump_patches: bool = False
eval()

Sets the module in evaluation mode.

This has an effect only on certain modules. See the documentation of particular modules for details of their behavior in training/evaluation mode, if they are affected, e.g. Dropout, BatchNorm, etc.

This is equivalent to self.train(False).

See locally-disable-grad-doc for a comparison between .eval() and several similar mechanisms that may be confused with it.

Returns:

Module: self

Return type

~T

extra_repr()

Set the extra representation of the module.

To print customized extra information, you should re-implement this method in your own modules. Both single-line and multi-line strings are acceptable.

Return type

str

float()

Casts all floating point parameters and buffers to float datatype.

Note

This method modifies the module in-place.

Returns:

Module: self

Return type

~T

forward(x)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

Return type

Tensor

property frame_length: float
Return type

float

property frame_shift: float
Return type

float

get_buffer(target)

Returns the buffer given by target if it exists, otherwise throws an error.

See the docstring for get_submodule for a more detailed explanation of this method’s functionality as well as how to correctly specify target.

Args:

target: The fully-qualified string name of the buffer to look for. (See get_submodule for how to specify a fully-qualified string.)

Returns:

torch.Tensor: The buffer referenced by target

Raises:
AttributeError: If the target string references an invalid

path or resolves to something that is not a buffer

Return type

Tensor

get_extra_state()

Returns any extra state to include in the module’s state_dict. Implement this and a corresponding set_extra_state() for your module if you need to store extra state. This function is called when building the module’s state_dict().

Note that extra state should be picklable to ensure working serialization of the state_dict. We only provide backwards compatibility guarantees for serializing Tensors; other objects may break backwards compatibility if their serialized pickled form changes.

Returns:

object: Any extra state to store in the module’s state_dict

Return type

Any

get_parameter(target)

Returns the parameter given by target if it exists, otherwise throws an error.

See the docstring for get_submodule for a more detailed explanation of this method’s functionality as well as how to correctly specify target.

Args:

target: The fully-qualified string name of the Parameter to look for. (See get_submodule for how to specify a fully-qualified string.)

Returns:

torch.nn.Parameter: The Parameter referenced by target

Raises:
AttributeError: If the target string references an invalid

path or resolves to something that is not an nn.Parameter

Return type

Parameter

get_submodule(target)

Returns the submodule given by target if it exists, otherwise throws an error.

For example, let’s say you have an nn.Module A that looks like this:

A(
    (net_b): Module(
        (net_c): Module(
            (conv): Conv2d(16, 33, kernel_size=(3, 3), stride=(2, 2))
        )
        (linear): Linear(in_features=100, out_features=200, bias=True)
    )
)

(The diagram shows an nn.Module A. A has a nested submodule net_b, which itself has two submodules net_c and linear. net_c then has a submodule conv.)

To check whether or not we have the linear submodule, we would call get_submodule("net_b.linear"). To check whether we have the conv submodule, we would call get_submodule("net_b.net_c.conv").

The runtime of get_submodule is bounded by the degree of module nesting in target. A query against named_modules achieves the same result, but it is O(N) in the number of transitive modules. So, for a simple check to see if some submodule exists, get_submodule should always be used.

Args:

target: The fully-qualified string name of the submodule to look for. (See above example for how to specify a fully-qualified string.)

Returns:

torch.nn.Module: The submodule referenced by target

Raises:
AttributeError: If the target string references an invalid

path or resolves to something that is not an nn.Module

Return type

Module

half()

Casts all floating point parameters and buffers to half datatype.

Note

This method modifies the module in-place.

Returns:

Module: self

Return type

~T

ipu(device=None)

Moves all model parameters and buffers to the IPU.

This also makes associated parameters and buffers different objects. So it should be called before constructing optimizer if the module will live on IPU while being optimized.

Note

This method modifies the module in-place.

Arguments:
device (int, optional): if specified, all parameters will be copied to that device

Returns:

Module: self

Return type

~T

load_state_dict(state_dict, strict=True, assign=False)

Copies parameters and buffers from state_dict into this module and its descendants. If strict is True, then the keys of state_dict must exactly match the keys returned by this module’s state_dict() function.

Warning

If assign is True the optimizer must be created after the call to load_state_dict.

Args:
state_dict (dict): a dict containing parameters and persistent buffers.
strict (bool, optional): whether to strictly enforce that the keys in state_dict match the keys returned by this module’s state_dict() function. Default: True
assign (bool, optional): whether to assign items in the state dictionary to their corresponding keys in the module instead of copying them inplace into the module’s current parameters and buffers. When False, the properties of the tensors in the current module are preserved; when True, the properties of the Tensors in the state dict are preserved. Default: False

Returns:
NamedTuple with missing_keys and unexpected_keys fields:
  • missing_keys is a list of str containing the missing keys
  • unexpected_keys is a list of str containing the unexpected keys

Note:

If a parameter or buffer is registered as None and its corresponding key exists in state_dict, load_state_dict() will raise a RuntimeError.
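
A sketch of inspecting the returned NamedTuple with strict=False (the module shapes are illustrative); the key names differ ('weight' vs. '0.weight'), so nothing matches:

>>> import torch.nn as nn
>>> src = nn.Linear(2, 2)
>>> dst = nn.Sequential(nn.Linear(2, 2))
>>> result = dst.load_state_dict(src.state_dict(), strict=False)
>>> sorted(result.missing_keys)
['0.bias', '0.weight']
>>> sorted(result.unexpected_keys)
['bias', 'weight']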

modules()

Returns an iterator over all modules in the network.

Yields:

Module: a module in the network

Note:

Duplicate modules are returned only once. In the following example, l will be returned only once.

Example:

>>> l = nn.Linear(2, 2)
>>> net = nn.Sequential(l, l)
>>> for idx, m in enumerate(net.modules()):
...     print(idx, '->', m)

0 -> Sequential(
  (0): Linear(in_features=2, out_features=2, bias=True)
  (1): Linear(in_features=2, out_features=2, bias=True)
)
1 -> Linear(in_features=2, out_features=2, bias=True)
Return type

Iterator[Module]

named_buffers(prefix='', recurse=True, remove_duplicate=True)

Returns an iterator over module buffers, yielding both the name of the buffer as well as the buffer itself.

Args:
prefix (str): prefix to prepend to all buffer names.
recurse (bool, optional): if True, then yields buffers of this module and all submodules. Otherwise, yields only buffers that are direct members of this module. Defaults to True.
remove_duplicate (bool, optional): whether to remove the duplicated buffers in the result. Defaults to True.

Yields:

(str, torch.Tensor): Tuple containing the name and buffer

Example:

>>> # xdoctest: +SKIP("undefined vars")
>>> for name, buf in self.named_buffers():
>>>     if name in ['running_var']:
>>>         print(buf.size())
Return type

Iterator[Tuple[str, Tensor]]

named_children()

Returns an iterator over immediate children modules, yielding both the name of the module as well as the module itself.

Yields:

(str, Module): Tuple containing a name and child module

Example:

>>> # xdoctest: +SKIP("undefined vars")
>>> for name, module in model.named_children():
>>>     if name in ['conv4', 'conv5']:
>>>         print(module)
Return type

Iterator[Tuple[str, Module]]

named_modules(memo=None, prefix='', remove_duplicate=True)

Returns an iterator over all modules in the network, yielding both the name of the module as well as the module itself.

Args:
memo: a memo to store the set of modules already added to the result
prefix: a prefix that will be added to the name of the module
remove_duplicate: whether or not to remove the duplicated module instances in the result

Yields:

(str, Module): Tuple of name and module

Note:

Duplicate modules are returned only once. In the following example, l will be returned only once.

Example:

>>> l = nn.Linear(2, 2)
>>> net = nn.Sequential(l, l)
>>> for idx, m in enumerate(net.named_modules()):
...     print(idx, '->', m)

0 -> ('', Sequential(
  (0): Linear(in_features=2, out_features=2, bias=True)
  (1): Linear(in_features=2, out_features=2, bias=True)
))
1 -> ('0', Linear(in_features=2, out_features=2, bias=True))
named_parameters(prefix='', recurse=True, remove_duplicate=True)

Returns an iterator over module parameters, yielding both the name of the parameter as well as the parameter itself.

Args:
prefix (str): prefix to prepend to all parameter names.
recurse (bool): if True, then yields parameters of this module and all submodules. Otherwise, yields only parameters that are direct members of this module.
remove_duplicate (bool, optional): whether to remove the duplicated parameters in the result. Defaults to True.

Yields:

(str, Parameter): Tuple containing the name and parameter

Example:

>>> # xdoctest: +SKIP("undefined vars")
>>> for name, param in self.named_parameters():
>>>     if name in ['bias']:
>>>         print(param.size())
Return type

Iterator[Tuple[str, Parameter]]

online_inference(x, context=None)
Return type

Tuple[Tensor, Tensor]

parameters(recurse=True)

Returns an iterator over module parameters.

This is typically passed to an optimizer.

Args:
recurse (bool): if True, then yields parameters of this module and all submodules. Otherwise, yields only parameters that are direct members of this module.

Yields:

Parameter: module parameter

Example:

>>> # xdoctest: +SKIP("undefined vars")
>>> for param in model.parameters():
>>>     print(type(param), param.size())
<class 'torch.nn.parameter.Parameter'> torch.Size([20])
<class 'torch.nn.parameter.Parameter'> torch.Size([20, 1, 5, 5])
Return type

Iterator[Parameter]

property preemph_coeff: float
Return type

float

register_backward_hook(hook)

Registers a backward hook on the module.

This function is deprecated in favor of register_full_backward_hook() and the behavior of this function will change in future versions.

Returns:
torch.utils.hooks.RemovableHandle: a handle that can be used to remove the added hook by calling handle.remove()

Return type

RemovableHandle

register_buffer(name, tensor, persistent=True)

Adds a buffer to the module.

This is typically used to register a buffer that should not be considered a model parameter. For example, BatchNorm’s running_mean is not a parameter, but is part of the module’s state. Buffers are persistent by default and will be saved alongside parameters. This behavior can be changed by setting persistent to False. The only difference between a persistent buffer and a non-persistent buffer is that the latter will not be a part of this module’s state_dict.

Buffers can be accessed as attributes using the given names.

Args:
name (str): name of the buffer. The buffer can be accessed from this module using the given name
tensor (Tensor or None): buffer to be registered. If None, then operations that run on buffers, such as cuda, are ignored. If None, the buffer is not included in the module’s state_dict.
persistent (bool): whether the buffer is part of this module’s state_dict.

Example:

>>> # xdoctest: +SKIP("undefined vars")
>>> self.register_buffer('running_mean', torch.zeros(num_features))
Return type

None

register_forward_hook(hook, *, prepend=False, with_kwargs=False, always_call=False)

Registers a forward hook on the module.

The hook will be called every time after forward() has computed an output.

If with_kwargs is False or not specified, the input contains only the positional arguments given to the module. Keyword arguments won’t be passed to the hooks, only to the forward. The hook can modify the output. It can also modify the input in place, but this has no effect on forward, since the hook runs after forward() has been called. The hook should have the following signature:

hook(module, args, output) -> None or modified output

If with_kwargs is True, the forward hook will be passed the kwargs given to the forward function and be expected to return the output possibly modified. The hook should have the following signature:

hook(module, args, kwargs, output) -> None or modified output

Args:
hook (Callable): The user defined hook to be registered.
prepend (bool): If True, the provided hook will be fired before all existing forward hooks on this torch.nn.modules.Module. Otherwise, the provided hook will be fired after all existing forward hooks on this torch.nn.modules.Module. Note that global forward hooks registered with register_module_forward_hook() will fire before all hooks registered by this method. Default: False
with_kwargs (bool): If True, the hook will be passed the kwargs given to the forward function. Default: False
always_call (bool): If True, the hook will be run regardless of whether an exception is raised while calling the Module. Default: False

Returns:
torch.utils.hooks.RemovableHandle: a handle that can be used to remove the added hook by calling handle.remove()

Return type

RemovableHandle
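
A sketch of a read-only forward hook that captures a module's output (the names are illustrative):

>>> import torch
>>> import torch.nn as nn
>>> feats = {}
>>> def save_output(module, args, output):
...     feats['out'] = output.detach()  # capture; returning None leaves the output unchanged
>>> net = nn.Linear(3, 2)
>>> handle = net.register_forward_hook(save_output)
>>> _ = net(torch.randn(1, 3))
>>> feats['out'].shape
torch.Size([1, 2])
>>> handle.remove()  # detach the hook when done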

register_forward_pre_hook(hook, *, prepend=False, with_kwargs=False)

Registers a forward pre-hook on the module.

The hook will be called every time before forward() is invoked.

If with_kwargs is false or not specified, the input contains only the positional arguments given to the module. Keyword arguments won’t be passed to the hooks, only to the forward. The hook can modify the input. The user can return either a tuple or a single modified value from the hook. If a single value is returned, it will be wrapped into a tuple (unless that value is already a tuple). The hook should have the following signature:

hook(module, args) -> None or modified input

If with_kwargs is true, the forward pre-hook will be passed the kwargs given to the forward function. If the hook modifies the input, both args and kwargs should be returned. The hook should have the following signature:

hook(module, args, kwargs) -> None or a tuple of modified input and kwargs

Args:
hook (Callable): The user defined hook to be registered.
prepend (bool): If true, the provided hook will be fired before all existing forward_pre hooks on this torch.nn.modules.Module. Otherwise, the provided hook will be fired after all existing forward_pre hooks on this torch.nn.modules.Module. Note that global forward_pre hooks registered with register_module_forward_pre_hook() will fire before all hooks registered by this method. Default: False
with_kwargs (bool): If true, the hook will be passed the kwargs given to the forward function. Default: False

Returns:
torch.utils.hooks.RemovableHandle: a handle that can be used to remove the added hook by calling handle.remove()

Return type

RemovableHandle
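
A sketch of a pre-hook that rewrites the positional input before forward() runs (illustrative):

>>> import torch
>>> import torch.nn as nn
>>> def double_input(module, args):
...     (x,) = args
...     return (2 * x,)  # the returned tuple replaces the original args
>>> net = nn.Linear(3, 3)
>>> x = torch.ones(1, 3)
>>> h = net.register_forward_pre_hook(double_input)
>>> with torch.no_grad():
...     y_hooked = net(x)
>>> h.remove()
>>> with torch.no_grad():
...     y_plain = net(2 * x)
>>> torch.allclose(y_hooked, y_plain)
True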

register_full_backward_hook(hook, prepend=False)

Registers a backward hook on the module.

The hook will be called every time the gradients with respect to a module are computed, i.e. the hook will execute if and only if the gradients with respect to module outputs are computed. The hook should have the following signature:

hook(module, grad_input, grad_output) -> tuple(Tensor) or None

The grad_input and grad_output are tuples that contain the gradients with respect to the inputs and outputs respectively. The hook should not modify its arguments, but it can optionally return a new gradient with respect to the input that will be used in place of grad_input in subsequent computations. grad_input will only correspond to the inputs given as positional arguments and all kwarg arguments are ignored. Entries in grad_input and grad_output will be None for all non-Tensor arguments.

For technical reasons, when this hook is applied to a Module, its forward function will receive a view of each Tensor passed to the Module. Similarly the caller will receive a view of each Tensor returned by the Module’s forward function.

Warning

Modifying inputs or outputs inplace is not allowed when using backward hooks and will raise an error.

Args:
hook (Callable): The user-defined hook to be registered.
prepend (bool): If true, the provided hook will be fired before all existing backward hooks on this torch.nn.modules.Module. Otherwise, the provided hook will be fired after all existing backward hooks on this torch.nn.modules.Module. Note that global backward hooks registered with register_module_full_backward_hook() will fire before all hooks registered by this method.

Returns:
torch.utils.hooks.RemovableHandle: a handle that can be used to remove the added hook by calling handle.remove()

Return type

RemovableHandle
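
A sketch of a backward hook used purely for inspection of grad_output (illustrative):

>>> import torch
>>> import torch.nn as nn
>>> grads = {}
>>> def log_grads(module, grad_input, grad_output):
...     grads['out'] = grad_output[0]  # inspect only; returning None keeps grad_input as-is
>>> net = nn.Linear(3, 1)
>>> h = net.register_full_backward_hook(log_grads)
>>> net(torch.randn(2, 3)).sum().backward()
>>> grads['out'].shape
torch.Size([2, 1])
>>> h.remove()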

register_full_backward_pre_hook(hook, prepend=False)

Registers a backward pre-hook on the module.

The hook will be called every time the gradients for the module are computed. The hook should have the following signature:

hook(module, grad_output) -> tuple[Tensor] or None

The grad_output is a tuple. The hook should not modify its arguments, but it can optionally return a new gradient with respect to the output that will be used in place of grad_output in subsequent computations. Entries in grad_output will be None for all non-Tensor arguments.

For technical reasons, when this hook is applied to a Module, its forward function will receive a view of each Tensor passed to the Module. Similarly the caller will receive a view of each Tensor returned by the Module’s forward function.

Warning

Modifying inputs inplace is not allowed when using backward hooks and will raise an error.

Args:
hook (Callable): The user-defined hook to be registered.
prepend (bool): If true, the provided hook will be fired before all existing backward_pre hooks on this torch.nn.modules.Module. Otherwise, the provided hook will be fired after all existing backward_pre hooks on this torch.nn.modules.Module. Note that global backward_pre hooks registered with register_module_full_backward_pre_hook() will fire before all hooks registered by this method.

Returns:
torch.utils.hooks.RemovableHandle: a handle that can be used to remove the added hook by calling handle.remove()

Return type

RemovableHandle
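
A sketch of a backward pre-hook that rescales grad_output before it propagates (illustrative):

>>> import torch
>>> import torch.nn as nn
>>> def halve_grad(module, grad_output):
...     return (0.5 * grad_output[0],)  # replaces grad_output in subsequent computations
>>> net = nn.Linear(3, 1)
>>> h = net.register_full_backward_pre_hook(halve_grad)
>>> net(torch.ones(1, 3)).sum().backward()
>>> net.weight.grad  # gradients downstream of the hook are halved
tensor([[0.5000, 0.5000, 0.5000]])
>>> h.remove()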

register_load_state_dict_post_hook(hook)

Registers a post hook to be run after module’s load_state_dict is called.

It should have the following signature:

hook(module, incompatible_keys) -> None

The module argument is the current module that this hook is registered on, and the incompatible_keys argument is a NamedTuple consisting of attributes missing_keys and unexpected_keys. missing_keys is a list of str containing the missing keys and unexpected_keys is a list of str containing the unexpected keys.

The given incompatible_keys can be modified inplace if needed.

Note that the checks performed when calling load_state_dict() with strict=True are affected by modifications the hook makes to missing_keys or unexpected_keys, as expected. Additions to either set of keys will result in an error being thrown when strict=True, and clearing out both missing and unexpected keys will avoid an error.

Returns:
torch.utils.hooks.RemovableHandle: a handle that can be used to remove the added hook by calling handle.remove()
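
A sketch of a post hook that clears missing_keys in place, suppressing those entries from the load result (illustrative):

>>> import torch.nn as nn
>>> def ignore_missing(module, incompatible_keys):
...     incompatible_keys.missing_keys.clear()  # modified in place; affects strict checks too
>>> net = nn.Sequential(nn.Linear(2, 2))
>>> _ = net.register_load_state_dict_post_hook(ignore_missing)
>>> result = net.load_state_dict({}, strict=False)
>>> result.missing_keys  # would be ['0.weight', '0.bias'] without the hook
[]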

register_module(name, module)

Alias for add_module().

Return type

None

register_parameter(name, param)

Adds a parameter to the module.

The parameter can be accessed as an attribute using the given name.

Args:
name (str): name of the parameter. The parameter can be accessed from this module using the given name
param (Parameter or None): parameter to be added to the module. If None, then operations that run on parameters, such as cuda, are ignored. If None, the parameter is not included in the module’s state_dict.

Return type

None
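
A minimal sketch (with a hypothetical Scaler module) showing that a registered parameter becomes both an attribute and a state_dict entry:

>>> import torch
>>> import torch.nn as nn
>>> class Scaler(nn.Module):
...     def __init__(self):
...         super().__init__()
...         self.register_parameter('scale', nn.Parameter(torch.ones(1)))
...     def forward(self, x):
...         return self.scale * x
>>> m = Scaler()
>>> list(m.state_dict())
['scale']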

register_state_dict_pre_hook(hook)

These hooks will be called with arguments: self, prefix, and keep_vars before calling state_dict on self. The registered hooks can be used to perform pre-processing before the state_dict call is made.
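
A sketch of a pre-hook that merely logs each state_dict() call (illustrative):

>>> import torch.nn as nn
>>> def log_call(module, prefix, keep_vars):
...     print(f'state_dict requested, prefix={prefix!r}')
>>> net = nn.Linear(2, 2)
>>> _ = net.register_state_dict_pre_hook(log_call)
>>> sd = net.state_dict()
state_dict requested, prefix=''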

property remove_dc_offset: bool