pycochleagram package

pycochleagram.cochleagram module

pycochleagram.cochleagram.apply_envelope_downsample(subband_envelopes, mode, audio_sr=None, env_sr=None, invert=False, strict=True)[source]

Apply a downsampling operation to cochleagram subband envelopes.

The mode argument can be a predefined downsampling type from {‘poly’, ‘resample’, ‘decimate’}, a callable (to perform custom downsampling), or None to return the unmodified cochleagram. If mode is a predefined type, audio_sr and env_sr are required.

Parameters:
  • subband_envelopes (array) – Cochleagram subbands to downsample.
  • mode ({'poly', 'resample', 'decimate', callable, None}) – Determines the downsampling operation to apply to the cochleagram. ‘decimate’ will resample using scipy.signal.decimate with audio_sr/env_sr as the downsampling factor. ‘resample’ will downsample using scipy.signal.resample with np.ceil(subband_envelopes.shape[1]*(audio_sr/env_sr)) as the number of samples. ‘poly’ will resample using scipy.signal.resample_poly with env_sr as the upsampling factor and audio_sr as the downsampling factor. If mode is a python callable (e.g., function), it will be applied to subband_envelopes. If this is None, no downsampling is performed and the unmodified cochleagram is returned.
  • audio_sr (int, optional) – If using a predefined sampling mode, this represents the sampling rate of the original signal.
  • env_sr (int, optional) – If using a predefined sampling mode, this represents the sampling rate of the downsampled subband envelopes.
  • invert (bool, optional) – If using a predefined sampling mode, this will invert (i.e., upsample) the subband envelopes using the values provided in audio_sr and env_sr.
  • strict (bool, optional) – If using a predefined sampling mode, this ensures the downsampling will result in an integer number of samples. This should mean that upsample(downsample(x)) will have the same number of samples as x.
Returns:

downsampled_subband_envelopes: The subband_envelopes after being downsampled with mode.

Return type:

array
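
As an illustration of the callable option, the sketch below covers only the None and callable cases described above; the predefined modes are left out. The names block_mean_downsample and apply_mode are hypothetical and are not part of pycochleagram, and block averaging is just one possible custom downsampling choice. It assumes time runs along axis 1, as the use of subband_envelopes.shape[1] above suggests.

```python
import numpy as np


def block_mean_downsample(factor):
    """Hypothetical helper: build a callable that averages `factor` consecutive samples."""
    def _downsample(subband_envelopes):
        n_filters, n_samples = subband_envelopes.shape
        # Drop trailing samples that do not fill a complete block.
        n_keep = (n_samples // factor) * factor
        trimmed = subband_envelopes[:, :n_keep]
        return trimmed.reshape(n_filters, n_keep // factor, factor).mean(axis=2)
    return _downsample


def apply_mode(subband_envelopes, mode):
    """Sketch of the documented dispatch for the None and callable cases only."""
    if mode is None:
        return subband_envelopes
    if callable(mode):
        return mode(subband_envelopes)
    raise ValueError('predefined modes are not covered by this sketch')


envs = np.arange(24.0).reshape(2, 12)  # 2 subbands, 12 time samples
downsampled = apply_mode(envs, block_mean_downsample(4))
unchanged = apply_mode(envs, None)
```

A callable built this way could be passed wherever a custom downsampling function is accepted, since it takes the subband envelopes as its only argument.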

pycochleagram.cochleagram.apply_envelope_nonlinearity(subband_envelopes, nonlinearity, invert=False)[source]

Apply a nonlinearity to the cochleagram.

The nonlinearity argument can be a predefined type, a callable (to apply a custom nonlinearity), or None to return the unmodified cochleagram.

Parameters:
  • subband_envelopes (array) – Cochleagram to apply the nonlinearity to.
  • nonlinearity ({'db', 'power'}, callable, None) – Determines the nonlinearity operation to apply to the cochleagram. If this is a valid string, one of the predefined nonlinearities will be used. It can be: ‘power’ to perform np.power(subband_envelopes, 3.0 / 10.0) or ‘db’ to perform 20 * np.log10(subband_envelopes / np.max(subband_envelopes)), with values clamped to be greater than -60. If nonlinearity is a python callable (e.g., function), it will be applied to subband_envelopes. If this is None, no nonlinearity is applied and the unmodified cochleagram is returned.
  • invert (bool) – For predefined nonlinearities ‘db’ and ‘power’, if False (default), the nonlinearity will be applied. If True, the nonlinearity will be inverted.
Returns:

nonlinear_subband_envelopes: The subband_envelopes with the specified nonlinearity applied.

Return type:

array

Raises:
  • ValueError – Error if the provided nonlinearity isn’t a recognized option.

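
The two predefined nonlinearities can be sketched directly from the expressions given above. This is a minimal illustration, not the library's code: the function names are hypothetical, and the inverse of the power nonlinearity (raising to 10/3) is an assumption about what invert=True does.

```python
import numpy as np


def power_nonlinearity(envs, invert=False):
    """Forward: x ** (3/10). Inverse (assumed): x ** (10/3)."""
    exponent = 10.0 / 3.0 if invert else 3.0 / 10.0
    return np.power(envs, exponent)


def db_nonlinearity(envs, floor_db=-60.0):
    """20 * log10(x / max(x)), with values clamped from below at floor_db."""
    scaled = envs / np.max(envs)
    with np.errstate(divide='ignore'):  # zeros would map to -inf before clamping
        db = 20.0 * np.log10(scaled)
    return np.maximum(db, floor_db)


envs = np.array([[1.0, 0.1, 1e-6]])
compressed = power_nonlinearity(envs)
restored = power_nonlinearity(compressed, invert=True)
db = db_nonlinearity(envs)
```

Here the smallest value sits 120 dB below the maximum, so it is clamped to the -60 floor.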
pycochleagram.cochleagram.cochleagram(signal, sr, n, low_lim, hi_lim, sample_factor, padding_size=None, downsample=None, nonlinearity=None, fft_mode='auto', ret_mode='envs', strict=True, **kwargs)[source]

Generate the subband envelopes (i.e., the cochleagram) of the provided signal.

This first creates an ERB filterbank with the provided input arguments for the provided signal. This filterbank is then used to perform the subband decomposition to create the subband envelopes. The resulting envelopes can be optionally downsampled and then modified with a nonlinearity.

Parameters:
  • signal (array) – The sound signal (waveform) in the time domain. Should be flattened, i.e., the shape is (n_samples,).
  • sr (int) – Sampling rate associated with the signal waveform.
  • n (int) – Number of filters (subbands) to be generated with standard sampling (i.e., using a sampling factor of 1). Note, the actual number of filters in the generated filterbank depends on the sampling factor, and will also include lowpass and highpass filters that allow for perfect reconstruction of the input signal (the exact number of lowpass and highpass filters is determined by the sampling factor).
  • low_lim (int) – Lower limit of frequency range. Filters will not be defined below this limit.
  • hi_lim (int) – Upper limit of frequency range. Filters will not be defined above this limit.
  • sample_factor (int) – Positive integer that determines how densely ERB function will be sampled to create bandpass filters. 1 represents standard sampling; adjacent bandpass filters will overlap by 50%. 2 represents 2x overcomplete sampling; adjacent bandpass filters will overlap by 75%. 4 represents 4x overcomplete sampling; adjacent bandpass filters will overlap by 87.5%.
  • padding_size (int, optional) – If None (default), the signal will not be padded before filtering. Otherwise, the filters will be created assuming the waveform signal will be padded to length padding_size+signal_length.
  • downsample (None, int, callable, optional) – The downsample argument can be an integer representing the upsampling factor in polyphase resampling (with sr as the downsampling factor), a callable (to perform custom downsampling), or None to return the unmodified cochleagram; see apply_envelope_downsample for more information. If ret_mode is ‘envs’, this will be applied to the cochleagram before the nonlinearity, otherwise no downsampling will be performed. Providing a callable for custom downsampling is suggested.
  • nonlinearity ({None, 'db', 'power', callable}, optional) – The nonlinearity argument can be a predefined type, a callable (to apply a custom nonlinearity), or None to return the unmodified cochleagram; see apply_envelope_nonlinearity for more information. If ret_mode is ‘envs’, this will be applied to the cochleagram after downsampling, otherwise no nonlinearity will be applied. Providing a callable for applying a custom nonlinearity is suggested.
  • fft_mode ({'auto', 'fftw', 'np'}, optional) – Determines what implementation to use for FFT-like operations. ‘auto’ will attempt to use pyfftw, but will fall back to numpy, if necessary.
  • ret_mode ({'envs', 'subband', 'analytic', 'all'}) – Determines what will be returned. ‘envs’ (default) returns the subband envelopes; ‘subband’ returns just the subbands, ‘analytic’ returns the analytic signal provided by the Hilbert transform, ‘all’ returns all local variables created in this function.
  • strict (bool, optional) – If True (default), will include the extra highpass and lowpass filters required to make the filterbank invertible. If False, this will only perform calculations on the bandpass filters; note this decreases the number of frequency channels in the output by 2 * sample_factor.
  • strict – If True (default), will throw an error if this function is used in a way that is unsupported by the MATLAB implementation.
Returns:

out: The output, depending on the value of ret_mode. If the ret_mode is ‘envs’ and a downsampling and/or nonlinearity operation was requested, the output will reflect these operations.

Return type:

array
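
The subband decomposition step can be pictured with a small numpy sketch. This is not the library's implementation: it assumes a single full-length frequency-domain filter, builds the analytic signal by zeroing negative frequencies (the usual FFT-based Hilbert approach), and takes the magnitude as the envelope. The name subband_envelope is hypothetical.

```python
import numpy as np


def subband_envelope(signal, filt):
    """Filter in the frequency domain, then return the magnitude of the analytic subband."""
    n = signal.shape[0]
    spectrum = np.fft.fft(signal) * filt
    # Weights that keep DC (and Nyquist), double positive frequencies, zero negative ones.
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    analytic = np.fft.ifft(spectrum * weights)
    return np.abs(analytic)


sr = 200
t = np.arange(sr) / sr
tone = np.cos(2 * np.pi * 5 * t)      # 5 Hz tone, exactly 5 cycles
all_pass = np.ones(tone.shape[0])     # stand-in for one filter of a filterbank
envelope = subband_envelope(tone, all_pass)
```

With an all-pass filter and a steady tone the envelope is flat, which is a quick sanity check before substituting a real bandpass filter.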

pycochleagram.cochleagram.human_cochleagram(signal, sr, n=None, low_lim=50, hi_lim=20000, sample_factor=2, padding_size=None, downsample=None, nonlinearity=None, fft_mode='auto', ret_mode='envs', strict=True, **kwargs)[source]

Convenience function to generate the subband envelopes (i.e., the cochleagram) of the provided signal using sensible default parameters for a human cochleagram.

This first creates an ERB filterbank with the provided input arguments for the provided signal. This filterbank is then used to perform the subband decomposition to create the subband envelopes. The resulting envelopes can be optionally downsampled and then modified with a nonlinearity.

Parameters:
  • signal (array) – The sound signal (waveform) in the time domain. Should be flattened, i.e., the shape is (n_samples,).
  • sr (int) – Sampling rate associated with the signal waveform.
  • n (int) – Number of filters (subbands) to be generated with standard sampling (i.e., using a sampling factor of 1). Note, the actual number of filters in the generated filterbank depends on the sampling factor, and will also include lowpass and highpass filters that allow for perfect reconstruction of the input signal (the exact number of lowpass and highpass filters is determined by the sampling factor).
  • low_lim (int) – Lower limit of frequency range. Filters will not be defined below this limit.
  • hi_lim (int) – Upper limit of frequency range. Filters will not be defined above this limit.
  • sample_factor (int) – Positive integer that determines how densely ERB function will be sampled to create bandpass filters. 1 represents standard sampling; adjacent bandpass filters will overlap by 50%. 2 represents 2x overcomplete sampling; adjacent bandpass filters will overlap by 75%. 4 represents 4x overcomplete sampling; adjacent bandpass filters will overlap by 87.5%.
  • padding_size (int, optional) – If None (default), the signal will not be padded before filtering. Otherwise, the filters will be created assuming the waveform signal will be padded to length padding_size+signal_length.
  • downsample (None, int, callable, optional) – The downsample argument can be an integer representing the upsampling factor in polyphase resampling (with sr as the downsampling factor), a callable (to perform custom downsampling), or None to return the unmodified cochleagram; see apply_envelope_downsample for more information. If ret_mode is ‘envs’, this will be applied to the cochleagram before the nonlinearity, otherwise no downsampling will be performed. Providing a callable for custom downsampling is suggested.
  • nonlinearity ({None, 'db', 'power', callable}, optional) – The nonlinearity argument can be a predefined type, a callable (to apply a custom nonlinearity), or None to return the unmodified cochleagram; see apply_envelope_nonlinearity for more information. If ret_mode is ‘envs’, this will be applied to the cochleagram after downsampling, otherwise no nonlinearity will be applied. Providing a callable for applying a custom nonlinearity is suggested.
  • fft_mode ({'auto', 'fftw', 'np'}, optional) – Determines what implementation to use for FFT-like operations. ‘auto’ will attempt to use pyfftw, but will fall back to numpy, if necessary.
  • ret_mode ({'envs', 'subband', 'analytic', 'all'}) – Determines what will be returned. ‘envs’ (default) returns the subband envelopes; ‘subband’ returns just the subbands, ‘analytic’ returns the analytic signal provided by the Hilbert transform, ‘all’ returns all local variables created in this function.
  • strict (bool, optional) – If True (default), will throw an error if this function is used in a way that is unsupported by the MATLAB implementation.
Returns:

out: The output, depending on the value of ret_mode. If the ret_mode is ‘envs’ and a downsampling and/or nonlinearity operation was requested, the output will reflect these operations.

Return type:

array

pycochleagram.cochleagram.invert_cochleagram(cochleagram, sr, n, low_lim, hi_lim, sample_factor, padding_size=None, target_rms=100, downsample=None, nonlinearity=None, n_iter=50, strict=True)[source]

Generate a waveform from a cochleagram using the provided arguments to construct a filterbank.

Parameters:
  • cochleagram (array) – The subband envelopes (i.e., cochleagram) to invert.
  • sr (int) – Sampling rate associated with the cochleagram.
  • n (int) – Number of filters (subbands) to be generated with standard sampling (i.e., using a sampling factor of 1). Note, the actual number of filters in the generated filterbank depends on the sampling factor, and will also include lowpass and highpass filters that allow for perfect reconstruction of the input signal (the exact number of lowpass and highpass filters is determined by the sampling factor).
  • low_lim (int) – Lower limit of frequency range. Filters will not be defined below this limit.
  • hi_lim (int) – Upper limit of frequency range. Filters will not be defined above this limit.
  • sample_factor (int) – Positive integer that determines how densely ERB function will be sampled to create bandpass filters. 1 represents standard sampling; adjacent bandpass filters will overlap by 50%. 2 represents 2x overcomplete sampling; adjacent bandpass filters will overlap by 75%. 4 represents 4x overcomplete sampling; adjacent bandpass filters will overlap by 87.5%.
  • padding_size (int, optional) – If None (default), the signal will not be padded before filtering. Otherwise, the filters will be created assuming the waveform signal will be padded to length padding_size+signal_length.
  • target_rms (scalar) – Target root-mean-squared value of the output, related to SNR, TODO: this needs to be checked
  • downsample (None, int, callable, optional) – If downsampling was performed on cochleagram, this is the operation to invert that downsampling (i.e., upsample); this determines the length of the output signal. The downsample argument can be an integer representing the downsampling factor in polyphase resampling (with sr as the upsampling factor), a callable (to perform custom downsampling), or None to return the unmodified cochleagram; see apply_envelope_downsample for more information. Providing a callable for custom upsampling is suggested.
  • nonlinearity ({None, 'db', 'power', callable}, optional) – If a nonlinearity was applied to cochleagram, this is the operation to invert that nonlinearity. The nonlinearity argument can be a predefined type, a callable (to apply a custom nonlinearity), or None to return the unmodified cochleagram; see apply_envelope_nonlinearity for more information. If this is a predefined type, the nonlinearity will be inverted according to apply_envelope_nonlinearity.
  • fft_mode ({'auto', 'fftw', 'np'}, optional) – Determines what implementation to use for FFT-like operations. ‘auto’ will attempt to use pyfftw, but will fall back to numpy, if necessary.
  • n_iter (int, optional) – Number of iterations to perform for the inversion.
  • strict (bool, optional) – If True (default), will throw an error if this function is used in a way that is unsupported by the MATLAB implementation.
Returns:

inv_signal: The waveform signal created by inverting the cochleagram.

inv_coch: The inverted cochleagram.

Return type:

array

pycochleagram.cochleagram.invert_cochleagram_with_filterbank(cochleagram, filters, sr, target_rms=100, downsample=None, nonlinearity=None, n_iter=20)[source]

Generate a waveform from a cochleagram using a provided filterbank.

Parameters:
  • cochleagram (array) – The subband envelopes (i.e., cochleagram) to invert.
  • filters (array) – The filterbank, in frequency space, used to generate the cochleagram. This should be the full filter-set output of erbFilter.make_erb_cos_filters_nx, or similar.
  • sr (int) – Sampling rate associated with the cochleagram.
  • target_rms (scalar) – Target root-mean-squared value of the output, related to SNR, TODO: this needs to be checked
  • downsample (None, int, callable, optional) – If downsampling was performed on cochleagram, this is the operation to invert that downsampling (i.e., upsample); this determines the length of the output signal. The downsample argument can be an integer representing the downsampling factor in polyphase resampling (with sr as the upsampling factor), a callable (to perform custom downsampling), or None to return the unmodified cochleagram; see apply_envelope_downsample for more information. Providing a callable for custom upsampling is suggested.
  • nonlinearity ({None, 'db', 'power', callable}, optional) – If a nonlinearity was applied to cochleagram, this is the operation to invert that nonlinearity. The nonlinearity argument can be a predefined type, a callable (to apply a custom nonlinearity), or None to return the unmodified cochleagram; see apply_envelope_nonlinearity for more information. If this is a predefined type, the nonlinearity will be inverted according to apply_envelope_nonlinearity.
  • fft_mode ({'auto', 'fftw', 'np'}, optional) – Determines what implementation to use for FFT-like operations. ‘auto’ will attempt to use pyfftw, but will fall back to numpy, if necessary.
  • n_iter (int, optional) – Number of iterations to perform for the inversion.
Returns:

inv_signal: The waveform signal created by inverting the cochleagram.

Return type:

array

pycochleagram.demo module

pycochleagram.demo.demo_human_cochleagram(signal=None, sr=None, n=None)[source]

Demo to generate the human cochleagrams, displaying various nonlinearity and downsampling options. If a signal is not provided, a tone synthesized with 40 harmonics and an f0=100 will be used.

Parameters:
  • signal (array, optional) – Signal containing waveform data.
  • sr (int, optional) – Sampling rate of the input signal.
  • n (int, optional) – Number of filters to use in the filterbank.
Returns:

None

pycochleagram.demo.demo_human_cochleagram_helper(signal, sr, n, sample_factor=2, downsample=None, nonlinearity=None)[source]

Demo the cochleagram generation.

Parameters:
  • signal (array) – If a time-domain signal is provided, its cochleagram will be generated with some sensible parameters. If this is None, a synthesized tone (harmonic stack of the first 40 harmonics) will be used.
  • sr (int) – If signal is not None, this is the sampling rate associated with the signal.
  • n (int) – Number of filters to use.
  • sample_factor (int) – Determines the density (or “overcompleteness”) of the filterbank. Original MATLAB code supported 1, 2, 4.
  • downsample ({None, int, callable}, optional) – Determines downsampling method to apply. If None, no downsampling will be applied. If this is an int, it will be interpreted as the upsampling factor in polyphase resampling (with sr as the downsampling factor). A custom downsampling function can be provided as a callable. The callable will be called on the subband envelopes.
  • nonlinearity ({None, 'db', 'power', callable}, optional) – Determines nonlinearity method to apply. None applies no nonlinearity. ‘db’ will convert output to decibels (truncated at -60). ‘power’ will apply 3/10 power compression.
Returns:

cochleagram: The cochleagram of the input signal, created with largely default parameters.

Return type:

array

pycochleagram.demo.demo_invert_cochleagram(signal=None, sr=None, n=None, playback=False)[source]

Demo that will generate a cochleagram from a signal, then invert this cochleagram to produce a waveform signal.

Parameters:
  • signal (array, optional) – Signal containing waveform data.
  • sr (int, optional) – Sampling rate of the input signal.
  • n (int, optional) – Number of filters to use in the filterbank.
  • playback (bool, optional) – Determines if audio signals will be played (using pyaudio). If False, only plots will be created. If True, the original signal and inverted cochleagram signal will be played. NOTE: Be careful with the volume when using playback, things can get very loud.
Returns:

None

pycochleagram.demo.demo_playback(signal, sr, ignore_warning=False)[source]

Demo audio playback with pyaudio.

Parameters:
  • signal (array, optional) – Signal containing waveform data.
  • sr (int, optional) – Sampling rate of the input signal.
  • ignore_warning (bool, optional) – Determines if audio signals will be played (using pyaudio). NOTE: Be careful with the volume when using playback, things can get very loud.
Returns:

None

pycochleagram.demo.main(ignore_playback_warning=False, mode='rand_sound')[source]

Run all demo functions.

Parameters:
  • ignore_playback_warning (bool, optional) – To use audio playback, you must acknowledge that things can get very loud by setting ignore_playback_warning to True.
  • mode ({'rand_sound', other}) – Set the mode for the demo. If this is ‘rand_sound’, a sound from the demo_stim/ directory will be chosen at random and used for the demos. If this is anything else, a harmonic stack of 40 harmonics and an f0=100Hz will be generated and used.
Returns:

None

pycochleagram.demo.make_harmonic_stack(f0=100, n_harm=40, dur=0.25001, sr=20000, low_lim=50, hi_lim=20000, n=None)[source]

Synthesize a tone created with a stack of harmonics.

Parameters:
  • f0 (int, optional) – Fundamental frequency.
  • n_harm (int, optional) – Number of harmonics to include.
  • dur (float, optional) – Duration, in milliseconds. Note, the default value was chosen to create a signal length that is compatible with the predefined downsampling method.
  • sr (int, optional) – Sampling rate.
  • low_lim (int, optional) – Lower limit for filterbank.
  • hi_lim (int, optional) – Upper limit for filterbank.
  • n (None, optional) – Number of filters in filterbank.
Returns:

signal (array): Synthesized tone.

signal_params (dict): A dictionary containing all of the parameters used to synthesize the tone.

Return type:

tuple
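
A harmonic stack of this kind can be sketched as a sum of sinusoids at integer multiples of f0. The version below is an illustration only: harmonic_stack_sketch is a hypothetical name, equal amplitudes and sine phase are assumptions, and the duration is treated as seconds (dur_s), which is how the default of roughly 0.25 is interpreted here.

```python
import numpy as np


def harmonic_stack_sketch(f0=100, n_harm=40, dur_s=0.25, sr=20000):
    """Sum n_harm equal-amplitude sines at f0, 2*f0, ..., n_harm*f0 (assumed form)."""
    n_samples = int(round(dur_s * sr))
    t = np.arange(n_samples) / sr
    harmonics = f0 * np.arange(1, n_harm + 1)
    # One row per harmonic, summed over rows to give the tone.
    return np.sin(2 * np.pi * harmonics[:, None] * t[None, :]).sum(axis=0)


tone = harmonic_stack_sketch()
spectrum = np.abs(np.fft.rfft(tone))  # bin spacing is sr / n_samples = 4 Hz
```

With these defaults every harmonic completes a whole number of cycles, so the spectrum has energy only at bins 25, 50, ..., 1000 (100 Hz to 4000 Hz).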

pycochleagram.erbfilter module

pycochleagram.erbfilter.erb2freq(n_erb)[source]

Converts human ERBs to Hz, using the formula of Glasberg and Moore.

Parameters:
  • n_erb (array_like) – Human-defined ERB to convert to frequency.
Returns:

freq_hz: Frequency representation of input.

Return type:

ndarray

pycochleagram.erbfilter.freq2erb(freq_hz)[source]

Converts Hz to human-defined ERBs, using the formula of Glasberg and Moore.

Parameters:
  • freq_hz (array_like) – Frequency to use for ERB.
Returns:

n_erb: Human-defined ERB representation of input.

Return type:

ndarray

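
For orientation, the pair of conversions can be sketched with the commonly cited Glasberg and Moore (1990) ERB-number formula, 21.4 * log10(0.00437 * f + 1). The exact constants used inside pycochleagram are not shown in this reference and may differ slightly, so the function names below carry a _sketch suffix and are not the library's functions.

```python
import numpy as np


def freq2erb_sketch(freq_hz):
    """Hz to ERB number, using 21.4 * log10(0.00437 * f + 1) (assumed constants)."""
    freq_hz = np.asarray(freq_hz, dtype=float)
    return 21.4 * np.log10(0.00437 * freq_hz + 1.0)


def erb2freq_sketch(n_erb):
    """ERB number to Hz, the algebraic inverse of freq2erb_sketch."""
    n_erb = np.asarray(n_erb, dtype=float)
    return (np.power(10.0, n_erb / 21.4) - 1.0) / 0.00437


freqs = np.array([50.0, 1000.0, 20000.0])
erbs = freq2erb_sketch(freqs)
round_trip = erb2freq_sketch(erbs)
```

The mapping is monotonic and roughly logarithmic at high frequencies, which is why filters equally spaced on this scale become wider in Hz as frequency increases.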
pycochleagram.erbfilter.freq2lin(freq_hz)[source]

Compatibility hack to allow for linearly spaced cosine filters with make_erb_cos_filters_nx; intended to generalize the functionality of make_lin_cos_filters.

pycochleagram.erbfilter.lin2freq(n_lin)[source]

Compatibility hack to allow for linearly spaced cosine filters with make_erb_cos_filters_nx; intended to generalize the functionality of make_lin_cos_filters.

pycochleagram.erbfilter.make_cosine_filter(freqs, l, h, convert_to_erb=True)[source]

Generate a half-cosine filter. Represents one subband of the cochleagram.

A half-cosine filter is created using the values of freqs that are within the interval [l, h]. The half-cosine filter is centered at the center of this interval, i.e., (l + h) / 2. Values outside the valid interval [l, h] are discarded. So, if freqs = [1, 2, 3, … 10], l = 4.5, h = 8, the cosine filter will only be defined on the domain [5, 6, 7] and the returned output will only contain 3 elements.

Parameters:
  • freqs (array_like) – Array containing the domain of the filter, in ERB space; see convert_to_erb parameter below. A single half-cosine filter will be defined only on the valid section of these values; specifically, the values between cutoffs l and h. A half-cosine filter centered at (l + h) / 2 is created on the interval [l, h].
  • l (float) – The lower cutoff of the half-cosine filter in ERB space; see convert_to_erb parameter below.
  • h (float) – The upper cutoff of the half-cosine filter in ERB space; see convert_to_erb parameter below.
  • convert_to_erb (bool, default=True) – If this is True, the values in input arguments freqs, l, and h will be transformed from Hz to ERB space before creating the half-cosine filter. If this is False, the input arguments are assumed to be in ERB space.
Returns:

half_cos_filter – A half-cosine filter defined using elements of freqs within [l, h].

Return type:

ndarray
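
The worked example above (freqs = 1..10, l = 4.5, h = 8) can be reproduced with a short sketch. It is an illustration under stated assumptions rather than the library's code: half_cosine_sketch is a hypothetical name, no Hz-to-ERB conversion is applied, the cutoffs are treated as exclusive (which is what yields the domain [5, 6, 7]), and the filter is taken to be a cosine centered at the midpoint of l and h that reaches zero at both cutoffs.

```python
import numpy as np


def half_cosine_sketch(values, l, h):
    """Return the values strictly between l and h and a half-cosine defined on them."""
    values = np.asarray(values, dtype=float)
    inside = values[(values > l) & (values < h)]
    center = (l + h) / 2.0
    width = h - l
    # The cosine argument spans (-pi/2, pi/2) across the interval, so the response is positive.
    return inside, np.cos((inside - center) / width * np.pi)


domain, filt = half_cosine_sketch(np.arange(1, 11), l=4.5, h=8)
```

The returned filter has three elements, and its largest value falls on the sample closest to the midpoint of the interval.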

pycochleagram.erbfilter.make_erb_cos_filters(signal_length, sr, n, low_lim, hi_lim, full_filter=False, strict=False)[source]

Fairly literal port of Josh McDermott’s MATLAB make_erb_cos_filters. Useful for debugging, but isn’t very generalizable. Use make_erb_cos_filters_1x or make_erb_cos_filters_nx with sample_factor=1 instead.

Returns n+2 filters, possibly as column vectors of filts.

The filters have cosine-shaped frequency responses, with center frequencies equally spaced on an ERB scale from low_lim to hi_lim.

Adjacent filters overlap by 50%.

The squared frequency responses of the filters sum to 1, so that they can be applied once to generate subbands and then again to collapse the subbands to generate a sound signal, without changing the frequency content of the signal.

Intended for use with GENERATE_SUBBANDS and COLLAPSE_SUBBANDS.

Parameters:
  • signal_length (int) – Length of input signal. Filters are to be applied multiplicatively in the frequency domain and thus have a length that scales with the signal length (signal_length).
  • sr (int) – is the sampling rate
  • n (int) – number of filters to create
  • low_lim (int) – low cutoff of lowest band
  • hi_lim (int) – high cutoff of highest band
Returns:

filts (array): There are n+2 filters because filts also contains lowpass and highpass filters to cover the ends of the spectrum.

hz_cutoffs (array): A vector of the cutoff frequencies of each filter. Because of the overlap arrangement, the upper cutoff of one filter is the center frequency of its neighbor.

freqs (array): A vector of frequencies the same length as filts, that can be used to plot the frequency response of the filters.

Return type:

tuple
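
The claim that the squared responses sum to 1 follows from the 50% overlap: where two neighboring half-cosines overlap, one behaves as a cosine and the other as a sine of the same angle. The sketch below checks this numerically on a plain linear axis. It is not the library's filterbank: half_cosine_bank is a hypothetical helper, no ERB spacing is used, and the lowpass and highpass end filters are omitted, so the check is restricted to the interior region covered by two bandpass filters.

```python
import numpy as np


def half_cosine_bank(x, cutoffs):
    """Build half-cosine filters where filter i spans cutoffs[i] to cutoffs[i + 2]."""
    filts = []
    for lo, hi in zip(cutoffs[:-2], cutoffs[2:]):
        center, width = (lo + hi) / 2.0, hi - lo
        resp = np.zeros_like(x)
        inside = (x > lo) & (x < hi)
        resp[inside] = np.cos((x[inside] - center) / width * np.pi)
        filts.append(resp)
    return np.array(filts)


x = np.linspace(0.0, 10.0, 1001)
cutoffs = np.linspace(0.0, 10.0, 11)   # 11 equally spaced cutoffs give 9 bandpass filters
bank = half_cosine_bank(x, cutoffs)
power = np.sum(bank ** 2, axis=0)
# Only between the second and second-to-last cutoffs do two bandpass filters overlap.
interior = (x >= cutoffs[1]) & (x <= cutoffs[-2])
```

Outside the interior region the sum falls below 1, which is the gap the extra lowpass and highpass filters are described as covering.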

pycochleagram.erbfilter.make_erb_cos_filters_1x(signal_length, sr, n, low_lim, hi_lim, padding_size=None, full_filter=False, strict=False)[source]

Create ERB cosine filterbank, sampled from ERB at 1x overcomplete.

Returns n+2 filters, possibly as column vectors.

The filters have cosine-shaped frequency responses, with center frequencies equally spaced on an ERB scale from low_lim to hi_lim.

Adjacent filters overlap by 50%.

The squared frequency responses of the filters sum to 1, so that they can be applied once to generate subbands and then again to collapse the subbands to generate a sound signal, without changing the frequency content of the signal.

Intended for use with GENERATE_SUBBANDS and COLLAPSE_SUBBANDS.

Parameters:
  • signal_length (int) – Length of input signal. Filters are to be applied multiplicatively in the frequency domain and thus have a length that scales with the signal length (signal_length).
  • sr (int) – is the sampling rate
  • n (int) – number of filters to create
  • low_lim (int) – low cutoff of lowest band
  • hi_lim (int) – high cutoff of highest band
  • padding_size (int, optional) – If None (default), the signal will not be padded before filtering. Otherwise, the filters will be created assuming the waveform signal will be padded to length padding_size*signal_length.
  • full_filter (bool, optional) – If True, the complete filter that is ready to apply to the signal is returned. If False (default), only the first half of the filter is returned (likely positive terms of FFT).
  • strict (bool, optional) – If True, will throw an error if provided hi_lim is greater than the Nyquist rate. The default is False.
Returns:

filts (array): There are n+2 filters because filts also contains lowpass and highpass filters to cover the ends of the spectrum.

hz_cutoffs (array): A vector of the cutoff frequencies of each filter. Because of the overlap arrangement, the upper cutoff of one filter is the center frequency of its neighbor.

freqs (array): A vector of frequencies the same length as filts, that can be used to plot the frequency response of the filters.

Return type:

tuple

pycochleagram.erbfilter.make_erb_cos_filters_2x(signal_length, sr, n, low_lim, hi_lim, padding_size=None, full_filter=False, strict=False)[source]

Create ERB cosine filterbank, sampled from ERB at 2x overcomplete.

Returns 2*n+5 filters as column vectors. The filters have cosine-shaped frequency responses, with center frequencies equally spaced on an ERB scale from low_lim to hi_lim.

This function returns a filterbank that is 2x overcomplete compared to make_erb_cos_filters_1x (to get filterbanks that can be compared with each other, use the same value of n in both cases). Adjacent filters overlap by 75%.

The squared frequency responses of the filters sum to 1, so that they can be applied once to generate subbands and then again to collapse the subbands to generate a sound signal, without changing the frequency content of the signal.

Intended for use with GENERATE_SUBBANDS and COLLAPSE_SUBBANDS.

Parameters:
  • signal_length (int) – Length of input signal. Filters are to be applied multiplicatively in the frequency domain and thus have a length that scales with the signal length (signal_length).
  • sr (int) – the sampling rate
  • n (int) – number of filters to create
  • low_lim (int) – low cutoff of lowest band
  • hi_lim (int) – high cutoff of highest band
  • padding_size (int, optional) – If None (default), the signal will not be padded before filtering. Otherwise, the filters will be created assuming the waveform signal will be padded to length padding_size*signal_length.
  • full_filter (bool, optional) – If True, the complete filter that is ready to apply to the signal is returned. If False (default), only the first half of the filter is returned (likely positive terms of FFT).
  • strict (bool, optional) – If True, will throw an error if provided hi_lim is greater than the Nyquist rate.
Returns:

filts (array): There are 2*n+5 filters because filts also contains lowpass and highpass filters to cover the ends of the spectrum and sampling is 2x overcomplete.

hz_cutoffs (array): A vector of the cutoff frequencies of each filter. Because of the overlap arrangement, the upper cutoff of one filter is the center frequency of its neighbor.

freqs (array): A vector of frequencies the same length as filts, that can be used to plot the frequency response of the filters.

Return type:

tuple

pycochleagram.erbfilter.make_erb_cos_filters_4x(signal_length, sr, n, low_lim, hi_lim, padding_size=None, full_filter=False, strict=False)[source]

Create ERB cosine filterbank, sampled from ERB at 4x overcomplete.

Returns 4*n+11 filters as column vectors. The filters have cosine-shaped frequency responses, with center frequencies equally spaced on an ERB scale from low_lim to hi_lim.

This function returns a filterbank that is 4x overcomplete compared to MAKE_ERB_COS_FILTS (to get filterbanks that can be compared with each other, use the same value of n in both cases). Adjacent filters overlap by 87.5%.

The squared frequency responses of the filters sum to 1, so that they can be applied once to generate subbands and then again to collapse the subbands to generate a sound signal, without changing the frequency content of the signal.

Intended for use with GENERATE_SUBBANDS and COLLAPSE_SUBBANDS.

Parameters:
  • signal_length (int) – Length of input signal. Filters are to be applied multiplicatively in the frequency domain and thus have a length that scales with the signal length (signal_length).
  • sr (int) – the sampling rate
  • n (int) – number of filters to create
  • low_lim (int) – low cutoff of lowest band
  • hi_lim (int) – high cutoff of highest band
  • padding_size (int, optional) – If None (default), the signal will not be padded before filtering. Otherwise, the filters will be created assuming the waveform signal will be padded to length padding_size*signal_length.
  • full_filter (bool, optional) – If True, the complete filter that is ready to apply to the signal is returned. If False (default), only the first half of the filter is returned (likely positive terms of FFT).
  • strict (bool, optional) – If True, will throw an error if provided hi_lim is greater than the Nyquist rate.
Returns:

filts (array): There are 4*n+11 filters because filts also contains lowpass and highpass filters to cover the ends of the spectrum and sampling is 4x overcomplete.

hz_cutoffs (array): A vector of the cutoff frequencies of each filter. Because of the overlap arrangement, the upper cutoff of one filter is the center frequency of its neighbor.

freqs (array): A vector of frequencies the same length as filts, that can be used to plot the frequency response of the filters.

Return type:

tuple

pycochleagram.erbfilter.make_erb_cos_filters_nx(signal_length, sr, n, low_lim, hi_lim, sample_factor, padding_size=None, full_filter=True, strict=True, **kwargs)[source]

Create ERB cosine filters, oversampled by the factor given in sample_factor.

Parameters:
  • signal_length (int) – Length of signal to be filtered with the generated filterbank. The signal length determines the length of the filters.
  • sr (int) – Sampling rate associated with the signal waveform.
  • n (int) – Number of filters (subbands) to be generated with standard sampling (i.e., using a sampling factor of 1). Note that the actual number of filters in the generated filterbank depends on the sampling factor, and will also include lowpass and highpass filters that allow for perfect reconstruction of the input signal (the exact number of lowpass and highpass filters is determined by the sampling factor). The number of filters in the generated filterbank is given below:

    sample factor   n_out            =  bandpass    + highpass + lowpass
    1               n+2              =  n           + 1        + 1
    2               2*n+1+4          =  2*n+1       + 2        + 2
    4               4*n+3+8          =  4*n+3       + 4        + 4
    s               s*(n+1)-1+2*s    =  s*(n+1)-1   + s        + s
  • low_lim (int) – Lower limit of frequency range. Filters will not be defined below this limit.
  • hi_lim (int) – Upper limit of frequency range. Filters will not be defined above this limit.
  • sample_factor (int) – Positive integer that determines how densely ERB function will be sampled to create bandpass filters. 1 represents standard sampling; adjacent bandpass filters will overlap by 50%. 2 represents 2x overcomplete sampling; adjacent bandpass filters will overlap by 75%. 4 represents 4x overcomplete sampling; adjacent bandpass filters will overlap by 87.5%.
  • padding_size (int, optional) – If None (default), the signal will not be padded before filtering. Otherwise, the filters will be created assuming the waveform signal will be padded to length padding_size*signal_length.
  • full_filter (bool, default=True) – If True (default), the complete filter that is ready to apply to the signal is returned. If False, only the first half of the filter is returned (likely positive terms of FFT).
  • strict (bool, default=True) – If True (default), will throw an error if sample_factor is not a power of two. This facilitates comparison across sample_factors. Also, if True, will throw an error if provided hi_lim is greater than the Nyquist rate.
Returns:

A tuple containing the output:

  • filts (array) – The filterbank, consisting of filters that have cosine-shaped frequency responses, with center frequencies equally spaced on an ERB scale from low_lim to hi_lim.
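  • center_freqs (array) – The center frequencies of the filters.
  • freqs (array) – A vector of frequencies that can be used to plot the frequency response of the filters.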

Return type:

tuple

Raises:

ValueError – Various value errors for bad choices of sample_factor; see description for strict parameter.
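
The filter-count table and the overlap figures listed for sample_factor can be checked with a few lines of arithmetic. The sketch below is only an illustration: `filterbank_size` and `adjacent_overlap` are hypothetical helper names, not part of pycochleagram, and the overlap formula is an extrapolation from the three values quoted above (50%, 75%, 87.5%).

```python
def filterbank_size(n, sample_factor):
    """Total number of filters, following the last row of the table above."""
    s = sample_factor
    bandpass = s * (n + 1) - 1
    lowpass = s
    highpass = s
    return bandpass + lowpass + highpass


def adjacent_overlap(sample_factor):
    """Fractional overlap of adjacent bandpass filters.

    Assumption: 1 - 1/(2*s) reproduces the documented 0.5, 0.75 and 0.875
    for s = 1, 2, 4; other values of s are not documented in the source.
    """
    return 1 - 1 / (2 * sample_factor)


sizes = {s: filterbank_size(10, s) for s in (1, 2, 4)}
```

With n=10 this gives 12, 25 and 51 filters, matching n+2, 2*n+5 and the 4*n+11 count quoted for make_erb_cos_filters_4x.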

pycochleagram.erbfilter.make_full_filter_set(filts, signal_length=None)[source]

Create the full set of filters by extending the filterbank to negative FFT frequencies.

Parameters:
  • filts (array_like) – Array containing the cochlear filterbank in frequency space, i.e., the output of make_erb_cos_filters_nx. Each row of filts is a single filter, with columns indexing frequency.
  • signal_length (int, optional) – Length of the signal to be filtered with this filterbank. This should be equal to filter length * 2 - 1, i.e., 2*filts.shape[1] - 1, and if signal_length is None, this value will be computed with the above formula. This parameter might be deprecated later.
Returns:

full_filter_set – Array containing the complete filterbank in frequency space. This output can be directly applied to the frequency representation of a signal.

Return type:

ndarray
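
One way to picture the extension to negative frequencies is to mirror the positive-frequency half of each filter. The following is a rough sketch under stated assumptions, not the library's implementation: `extend_to_negative_freqs` is a hypothetical name, each row is taken to be one filter (as described for filts above), and the even-length branch is an addition that the formula 2*filts.shape[1] - 1 does not cover.

```python
import numpy as np


def extend_to_negative_freqs(half_filts, signal_length=None):
    """Mirror half filters (rows = filters) into FFT standard order."""
    n_freq = half_filts.shape[1]
    if signal_length is None:
        # Same default as the documented formula: an odd-length signal.
        signal_length = 2 * n_freq - 1
    if signal_length % 2 == 0:
        # Even length: skip the DC and Nyquist bins when mirroring.
        negative = half_filts[:, -2:0:-1]
    else:
        # Odd length: skip only the DC bin.
        negative = half_filts[:, :0:-1]
    return np.concatenate([half_filts, negative], axis=1)


rng = np.random.default_rng(0)
signal = rng.standard_normal(9)
half = rng.random((3, 5))            # 3 toy filters, 5 non-negative bins
full = extend_to_negative_freqs(half)

# Filtering with the full set and the FFT should match filtering with the
# half set and the real-input FFT.
via_full = np.fft.ifft(np.fft.fft(signal) * full, axis=1)
via_half = np.fft.irfft(np.fft.rfft(signal) * half, n=signal.size, axis=1)
```

Because the mirrored filters are real and symmetric in frequency, the filtered output stays real up to rounding error.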

pycochleagram.erbfilter.make_lin_cos_filters(signal_length, sr, n, low_lim, hi_lim, full_filter=False, strict=False)[source]

Fairly literal port of Josh McDermott’s MATLAB make_lin_cos_filters. Useful for debugging, but isn’t very generalizable. Use make_erb_cos_filters_1x or make_erb_cos_filters_nx with sample_factor=1 instead.

Returns n+2 filters as column vectors of FILTS. The filters have cosine-shaped frequency responses, with center frequencies equally spaced on an ERB scale from low_lim to hi_lim.

Adjacent filters overlap by 50%.

The squared frequency responses of the filters sum to 1, so that they can be applied once to generate subbands and then again to collapse the subbands to generate a sound signal, without changing the frequency content of the signal.

Intended for use with GENERATE_SUBBANDS and COLLAPSE_SUBBANDS.

Parameters:
  • signal_length (int) – Length of input signal. Filters are to be applied multiplicatively in the frequency domain and thus have a length that scales with the signal length (signal_length).
  • sr (int) – the sampling rate
  • n (int) – number of filters to create
  • low_lim (int) – low cutoff of lowest band
  • hi_lim (int) – high cutoff of highest band
Returns:

filts (array): There are n+2 filters because filts also contains lowpass and highpass filters to cover the ends of the spectrum.

hz_cutoffs (array): A vector of the cutoff frequencies of each filter. Because of the overlap arrangement, the upper cutoff of one filter is the center frequency of its neighbor.

freqs (array): A vector of frequencies, the same length as filts, that can be used to plot the frequency response of the filters.

Return type:

tuple
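
The sum-to-one property described above follows from the 50% overlap: where two half-cosine filters overlap, one behaves like a cosine and the other like a sine of the same argument. The sketch below rebuilds such a bank with NumPy only. It is an illustration, not the library code: `freq2erb` and `half_cosine_bank` are names chosen here, the ERB formula is an assumed Glasberg and Moore style conversion, and filters are stored as rows for convenience.

```python
import numpy as np


def freq2erb(freq_hz):
    # Hz -> ERB-number scale; this particular formula is an assumption.
    return 9.265 * np.log(1 + np.asarray(freq_hz, dtype=float) / (24.7 * 9.265))


def half_cosine_bank(freqs, low_lim, hi_lim, n):
    """Return n bandpass filters plus one lowpass and one highpass (rows)."""
    erb = freq2erb(freqs)
    # n + 2 equally spaced cutoffs on the ERB scale.
    cutoffs = np.linspace(freq2erb(low_lim), freq2erb(hi_lim), n + 2)
    filts = np.zeros((n + 2, erb.size))
    for k in range(n):
        lo, hi = cutoffs[k], cutoffs[k + 2]
        band = (erb > lo) & (erb < hi)
        center = (lo + hi) / 2
        # Half cosine: argument runs from -pi/2 to pi/2 across the band.
        filts[k + 1, band] = np.cos((erb[band] - center) / (hi - lo) * np.pi)
    # Lowpass and highpass fill in whatever power the end filters leave out.
    below = erb < cutoffs[1]
    filts[0, below] = np.sqrt(1 - filts[1, below] ** 2)
    above = erb > cutoffs[n]
    filts[n + 1, above] = np.sqrt(1 - filts[n, above] ** 2)
    return filts, cutoffs


freqs = np.linspace(0, 8000, 513)
filts, cutoffs = half_cosine_bank(freqs, low_lim=50, hi_lim=8000, n=10)
power_sum = (filts ** 2).sum(axis=0)   # should be 1 at every frequency
```

Each interior cutoff is the center of the previous filter, which is the overlap arrangement the hz_cutoffs description refers to.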

pycochleagram.erbfilter.make_ref_cos_filters_nx(signal_length, sr, n, low_lim, hi_lim, sample_factor, padding_size=None, full_filter=True, strict=True, ref_spacing_mode='erb', **kwargs)[source]

Create ERB cosine filters, oversampled by the factor given in sample_factor.

Parameters:
  • signal_length (int) – Length of signal to be filtered with the generated filterbank. The signal length determines the length of the filters.
  • sr (int) – Sampling rate associated with the signal waveform.
  • n (int) – Number of filters (subbands) to be generated with standard sampling (i.e., using a sampling factor of 1). Note that the actual number of filters in the generated filterbank depends on the sampling factor, and will also include lowpass and highpass filters that allow for perfect reconstruction of the input signal (the exact number of lowpass and highpass filters is determined by the sampling factor). The number of filters in the generated filterbank is given below:

    sample factor   n_out            =  bandpass    + highpass + lowpass
    1               n+2              =  n           + 1        + 1
    2               2*n+1+4          =  2*n+1       + 2        + 2
    4               4*n+3+8          =  4*n+3       + 4        + 4
    s               s*(n+1)-1+2*s    =  s*(n+1)-1   + s        + s
  • low_lim (int) – Lower limit of frequency range. Filters will not be defined below this limit.
  • hi_lim (int) – Upper limit of frequency range. Filters will not be defined above this limit.
  • sample_factor (int) – Positive integer that determines how densely ERB function will be sampled to create bandpass filters. 1 represents standard sampling; adjacent bandpass filters will overlap by 50%. 2 represents 2x overcomplete sampling; adjacent bandpass filters will overlap by 75%. 4 represents 4x overcomplete sampling; adjacent bandpass filters will overlap by 87.5%.
  • padding_size (int, optional) – If None (default), the signal will not be padded before filtering. Otherwise, the filters will be created assuming the waveform signal will be padded to length padding_size*signal_length.
  • full_filter (bool, default=True) – If True (default), the complete filter that is ready to apply to the signal is returned. If False, only the first half of the filter is returned (likely positive terms of FFT).
  • strict (bool, default=True) – If True (default), will throw an error if sample_factor is not a power of two. This facilitates comparison across sample_factors. Also, if True, will throw an error if provided hi_lim is greater than the Nyquist rate.
Returns:

A tuple containing the output:

  • filts (array) – The filterbank, consisting of filters that have cosine-shaped frequency responses, with center frequencies equally spaced on an ERB scale from low_lim to hi_lim.
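  • center_freqs (array) – The center frequencies of the filters.
  • freqs (array) – A vector of frequencies that can be used to plot the frequency response of the filters.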

Return type:

tuple

Raises:

ValueError – Various value errors for bad choices of sample_factor; see description for strict parameter.

pycochleagram.misc module

pycochleagram.subband module

pycochleagram.subband.collapse_subbands(subbands, filters, fft_mode='auto')[source]

Collapse the subbands into a waveform by (re)applying the filterbank.

Parameters:
  • subbands (array) – The subband decomposition (i.e., cochleagram) to collapse.
  • filters (array) – The filterbank, in frequency space, used to generate the cochleagram. This should be the full filter-set output of erbfilter.make_erb_cos_filters_nx, or similar, that was used to create subbands.
  • fft_mode ({'auto', 'fftw', 'np'}, optional) – Determines which implementation to use for FFT-like operations. ‘auto’ will attempt to use pyfftw, but will fall back to numpy if necessary.
Returns:

signal: The signal resulting from collapsing the subbands.

Return type:

array
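
To see why reapplying the filterbank recovers the waveform, consider a toy bank whose squared responses sum to 1. The sketch below is a simplified stand-in rather than the library's code: the two function names and the cosine/sine toy filters are hypothetical, and padding and fft_mode handling are left out.

```python
import numpy as np


def generate_subbands_sketch(signal, filters):
    """Multiply the signal spectrum by each filter (rows) and invert."""
    spectrum = np.fft.fft(signal)
    return np.real(np.fft.ifft(filters * spectrum, axis=1))


def collapse_subbands_sketch(subbands, filters):
    """Filter each subband again with its own filter, then sum the bands."""
    spectra = np.fft.fft(subbands, axis=1)
    return np.real(np.fft.ifft(filters * spectra, axis=1)).sum(axis=0)


n_samples = 256
# Toy two-filter bank, symmetric in frequency, with cos^2 + sin^2 = 1.
theta = np.pi * np.abs(np.fft.fftfreq(n_samples))
toy_filters = np.stack([np.cos(theta), np.sin(theta)])

rng = np.random.default_rng(0)
signal = rng.standard_normal(n_samples)
subbands = generate_subbands_sketch(signal, toy_filters)
reconstructed = collapse_subbands_sketch(subbands, toy_filters)
```

Each frequency bin is scaled by the filter twice, so the summed result is the original spectrum times the sum of squared responses, which is 1 here.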

pycochleagram.subband.generate_analytic_subbands(signal, filters, padding_size=None, fft_mode='auto')[source]

Generate the analytic subbands (i.e., Hilbert transform) of the signal by applying the provided filters.

The input filters are applied to the signal to perform subband decomposition. The signal can be optionally zero-padded before the decomposition. For full cochleagram generation, see generate_subband_envelopes.

Parameters:
  • signal (array) – The sound signal (waveform) in the time domain.
  • filters (array) – The filterbank, in frequency space, used to generate the cochleagram. This should be the full filter-set output of erbfilter.make_erb_cos_filters_nx, or similar.
  • padding_size (int, optional) – Factor that determines if the signal will be zero-padded before generating the subbands. If this is None, or less than 1, no zero-padding will be used. Otherwise, zeros are added to the end of the input signal until it is of length padding_size * length(signal). This padded region will be removed after performing the subband decomposition.
  • fft_mode ({'auto', 'fftw', 'np'}, optional) – Determines which implementation to use for FFT-like operations. ‘auto’ will attempt to use pyfftw, but will fall back to numpy if necessary. TODO: fix zero-padding
Returns:

analytic_subbands: The analytic subbands (i.e., Hilbert transform) resulting from the subband decomposition. This should have the same shape as filters.

Return type:

array

pycochleagram.subband.generate_subband_envelopes(signal, filters, padding_size=None, debug_ret_all=False)[source]

Generate the subband envelopes (i.e., the cochleagram) of the signal by applying the provided filters.

The input filters are applied to the signal to perform subband decomposition. The signal can be optionally zero-padded before the decomposition.

Parameters:
  • signal (array) – The sound signal (waveform) in the time domain.
  • filters (array) – The filterbank, in frequency space, used to generate the cochleagram. This should be the full filter-set output of erbfilter.make_erb_cos_filters_nx, or similar.
  • padding_size (int, optional) – Factor that determines if the signal will be zero-padded before generating the subbands. If this is None, or less than 1, no zero-padding will be used. Otherwise, zeros are added to the end of the input signal until it is of length padding_size * length(signal). This padded region will be removed after performing the subband decomposition.
  • fft_mode ({'auto', 'fftw', 'np'}, optional) – Determines which implementation to use for FFT-like operations. ‘auto’ will attempt to use pyfftw, but will fall back to numpy if necessary.
Returns:

subband_envelopes: The subband envelopes (i.e., cochleagram) resulting from the subband decomposition. This should have the same shape as filters.

Return type:

array

pycochleagram.subband.generate_subband_envelopes_fast(signal, filters, padding_size=None, fft_mode='auto', debug_ret_all=False)[source]

Generate the subband envelopes (i.e., the cochleagram) of the signal by applying the provided filters.

This method returns only the envelopes of the subband decomposition. The signal can be optionally zero-padded before the decomposition. The resulting envelopes can be optionally downsampled and then modified with a nonlinearity.

This function expedites the calculation of the subband envelopes by:
  1. using the rfft rather than the standard fft to compute the DFT for real-valued signals
  2. hand-computing the Hilbert transform, to avoid unnecessary calls to fft/ifft.

See utils.rfft, utils.irfft, and utils.fhilbert for more details on the methods used for speed-up.

Parameters:
  • signal (array) – The sound signal (waveform) in the time domain. Should be flattened, i.e., the shape is (n_samples,).
  • filters (array) – The filterbank, in frequency space, used to generate the cochleagram. This should be the full filter-set output of erbfilter.make_erb_cos_filters_nx, or similar.
  • padding_size (int, optional) – Factor that determines if the signal will be zero-padded before generating the subbands. If this is None, or less than 1, no zero-padding will be used. Otherwise, zeros are added to the end of the input signal until it is of length padding_size * length(signal). This padded region will be removed after performing the subband decomposition.
  • fft_mode ({'auto', 'fftw', 'np'}, optional) – Determines which implementation to use for FFT-like operations. ‘auto’ will attempt to use pyfftw, but will fall back to numpy if necessary.
Returns:

subband_envelopes: The subband envelopes (i.e., cochleagram) resulting from the subband decomposition. This should have the same shape as filters.

Return type:

array

pycochleagram.subband.generate_subbands(signal, filters, padding_size=None, fft_mode='auto', debug_ret_all=False)[source]

Generate the subband decomposition of the signal by applying the provided filters.

The input filters are applied to the signal to perform subband decomposition. The signal can be optionally zero-padded before the decomposition.

Parameters:
  • signal (array) – The sound signal (waveform) in the time domain.
  • filters (array) – The filterbank, in frequency space, used to generate the cochleagram. This should be the full filter-set output of erbfilter.make_erb_cos_filters_nx, or similar.
  • padding_size (int, optional) – Factor that determines if the signal will be zero-padded before generating the subbands. If this is None, or less than 1, no zero-padding will be used. Otherwise, zeros are added to the end of the input signal until it is of length padding_size * length(signal). This padded region will be removed after performing the subband decomposition.
  • fft_mode ({'auto', 'fftw', 'np'}, optional) – Determines which implementation to use for FFT-like operations. ‘auto’ will attempt to use pyfftw, but will fall back to numpy if necessary.
Returns:

subbands: The subbands resulting from the subband decomposition. This should have the same shape as filters.

Return type:

array

pycochleagram.subband.pad_signal(signal, padding_size, axis=0)[source]

Pad the signal by appending zeros to the end. The padded signal has length padding_size * length(signal).

Parameters:
  • signal (array) – The signal to be zero-padded.
  • padding_size (int) – Factor that determines the size of the padded signal. The padded signal has length padding_size * length(signal).
  • axis (int) – Specifies the axis to pad; defaults to 0.
Returns:

pad_signal (array): The zero-padded signal.

padding_size (int): The length of the zero-padding added to the array.

Return type:

tuple
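
The padding behaviour described above can be approximated with numpy.pad. This is a minimal sketch, assuming padding_size is an integer of at least 1; `pad_signal_sketch` is a hypothetical name and the real function may validate its inputs differently.

```python
import numpy as np


def pad_signal_sketch(signal, padding_size, axis=0):
    """Append zeros along `axis` so the length becomes padding_size * length."""
    signal = np.asarray(signal)
    pad_len = signal.shape[axis] * (padding_size - 1)
    # Pad only at the end of the chosen axis; leave other axes untouched.
    pad_width = [(0, 0)] * signal.ndim
    pad_width[axis] = (0, pad_len)
    padded = np.pad(signal, pad_width, mode="constant")
    return padded, pad_len


padded, pad_len = pad_signal_sketch(np.ones(4), padding_size=2)
```

Returning the pad length alongside the padded array makes it easy to trim the padded region again after filtering.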

pycochleagram.subband.reshape_signal_batch(signal)[source]

Convert the signal into a standard batch shape for use with cochleagram.py functions. The first dimension is the batch dimension.

Parameters:
  • signal (array) – The sound signal (waveform) in the time domain. Should be either a flattened array with shape (n_samples,), a row vector with shape (1, n_samples), a column vector with shape (n_samples, 1), or a 2D matrix of the form [batch, waveform].
Returns:

out_signal: If the input signal has a valid shape, returns a 2D version of the signal with the first dimension as the batch dimension.

Return type:

array

Raises:

ValueError – Raises an error if the input signal has an invalid shape.

pycochleagram.subband.reshape_signal_canonical(signal)[source]

Convert the signal into a canonical shape for use with cochleagram.py functions.

This first verifies that the signal contains only one data channel, which can be in a row, a column, or a flat array. Then it flattens the signal array.

Parameters:
  • signal (array) – The sound signal (waveform) in the time domain. Should be either a flattened array with shape (n_samples,), a row vector with shape (1, n_samples), or a column vector with shape (n_samples, 1).
Returns:

out_signal: If the input signal has a valid shape, returns a flattened version of the signal.

Return type:

array

Raises:

ValueError – Raises an error if the input signal has an invalid shape.
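
The shape check and flattening described above might look like the following. It is a sketch of the documented behaviour only: `reshape_canonical_sketch` is a hypothetical name and the error message is invented for illustration.

```python
import numpy as np


def reshape_canonical_sketch(signal):
    """Accept (n,), (1, n) or (n, 1) input and return a flat (n,) array."""
    signal = np.asarray(signal)
    if signal.ndim == 1:
        return signal
    if signal.ndim == 2 and 1 in signal.shape:
        # Exactly one data channel, stored as a row or a column.
        return signal.reshape(-1)
    raise ValueError("expected a single-channel signal, got shape %s" % (signal.shape,))


flat_from_row = reshape_canonical_sketch(np.zeros((1, 5)))
flat_from_col = reshape_canonical_sketch(np.zeros((5, 1)))
```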

pycochleagram.utils module

pycochleagram.utils.check_if_display_exists()[source]

Check if a display is present on the machine. This can be used to conditionally import matplotlib, as importing it with an interactive backend on a machine without a display causes a core dump.

Returns:

Indicates if there is a display present on the machine.

Return type:

bool

pycochleagram.utils.cochshow(cochleagram, interact=True, cmap='magma')[source]

Helper function to facilitate displaying cochleagrams.

Parameters:
  • cochleagram (array) – Cochleagram to display with matplotlib.
  • interact (bool, optional) – Determines if interactive plot should be shown. If True (default), plot will be shown. If this is False, the figure will be created but not displayed.
  • cmap (str, optional) – A matplotlib cmap name to use for this plot.
Returns:

image: The object returned by the matplotlib plotting call used to draw the cochleagram.

Return type:

AxesImage

pycochleagram.utils.combine_signal_and_noise(signal, noise, snr)[source]

Combine the signal and noise at the provided snr.

Parameters:
  • signal (array-like) – Signal waveform data.
  • noise (array-like) – Noise waveform data.
  • snr (number) – SNR level in dB.
Returns:

signal_and_noise: Combined signal and noise waveform.

Return type:

array
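
A common way to mix at a target SNR is to rescale the noise so that the RMS ratio matches the requested level in dB. The sketch below takes that approach as an assumption; the source does not say whether the library scales the signal or the noise, and `combine_at_snr` and `rms_sketch` are hypothetical names. The scaled noise is returned only so the result can be checked.

```python
import numpy as np


def rms_sketch(a):
    """Root mean square of a 1-D array."""
    return np.sqrt(np.mean(np.square(a)))


def combine_at_snr(signal, noise, snr_db):
    """Scale `noise` so that 20*log10(rms(signal)/rms(noise)) == snr_db."""
    signal = np.asarray(signal, dtype=float)
    noise = np.asarray(noise, dtype=float)
    target_noise_rms = rms_sketch(signal) / 10 ** (snr_db / 20)
    scaled_noise = noise * (target_noise_rms / rms_sketch(noise))
    return signal + scaled_noise, scaled_noise


rng = np.random.default_rng(0)
sig = np.sin(2 * np.pi * 5 * np.arange(1000) / 1000)
mixed, scaled = combine_at_snr(sig, rng.standard_normal(1000), snr_db=6)
achieved_snr = 20 * np.log10(rms_sketch(sig) / rms_sketch(scaled))
```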

pycochleagram.utils.compute_cochleagram_shape(signal_len, sr, n, sample_factor, env_sr=None)[source]

Returns the shape of the cochleagram that will be created by using the provided parameters.

Parameters:
  • signal_len (int) – Length of signal waveform.
  • sr (int) – Waveform sampling rate.
  • n (int) – Number of filters requested in the filter bank.
  • sample_factor (int) – Degree of overcompleteness of the filter bank.
  • env_sr (int, optional) – Envelope sampling rate, if None (default), will equal the waveform sampling rate sr.
Returns:

Shape of the array containing the cochleagram.

Return type:

tuple

pycochleagram.utils.fft(a, n=None, axis=-1, norm=None, mode='auto', params=None)[source]

Provides support for various implementations of the FFT, using numpy’s fftpack or pyfftw’s fftw. This uses a numpy.fft-like interface.

Parameters:
  • a (array) – Time-domain signal.
  • mode (str) – Determines which FFT implementation will be used. Options are ‘fftw’, ‘np’, and ‘auto’. Using ‘auto’, will attempt to use a pyfftw implementation with some sensible parameters (if the module is available), and will use numpy’s fftpack implementation otherwise.
  • n (int, optional) – Length of the transformed axis of the output. If n is smaller than the length of the input, the input is cropped. If it is larger, the input is padded with zeros. If n is not given, the length of the input along the axis specified by axis is used.
  • axis (int, optional) – Axis over which to compute the FFT. If not given, the last axis is used.
  • norm ({None, 'ortho'}, optional) – Support for numpy interface.
  • params (dict, None, optional) – Dictionary of additional input arguments to provide to the appropriate fft function (usually fftw). Note, named arguments (e.g., n, axis, and norm) will override identically named arguments in params. If mode is ‘auto’ and params dict is None, sensible values will be chosen. If params is not None, it will not be altered.
Returns:

fft_a: Signal in the frequency domain in FFT standard order. See numpy.fft.fft() for a description of the output.

Return type:

array
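
The ‘auto’ mode amounts to an optional import with a fallback. The following sketch shows one way to write that dispatch; it is not the library's code. `fft_sketch` is a hypothetical name, the params argument is omitted, and the use of pyfftw.interfaces.numpy_fft as the FFTW backend is an assumption.

```python
import numpy as np

try:
    # Optional dependency; the numpy-compatible interface of pyfftw.
    import pyfftw.interfaces.numpy_fft as fftw_fft
except ImportError:
    fftw_fft = None


def fft_sketch(a, n=None, axis=-1, norm=None, mode="auto"):
    """Dispatch to pyfftw when available, otherwise to numpy."""
    if mode == "auto":
        mode = "fftw" if fftw_fft is not None else "np"
    if mode == "fftw":
        if fftw_fft is None:
            raise ImportError("mode='fftw' requested but pyfftw is not installed")
        return fftw_fft.fft(a, n=n, axis=axis, norm=norm)
    if mode == "np":
        return np.fft.fft(a, n=n, axis=axis, norm=norm)
    raise ValueError("mode must be 'auto', 'fftw' or 'np'")


x = np.arange(8.0)
spectrum = fft_sketch(x)
```

Both backends follow the numpy.fft calling convention, so the rest of the code does not need to know which one was used.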

pycochleagram.utils.fhilbert(a, axis=None, mode='auto', ifft_params=None)[source]

Compute the Hilbert transform of the provided frequency-space signal.

This function assumes the input array is already in frequency space, i.e., it is the output of a numpy-like FFT implementation. This avoids unnecessary repeated computation of the FFT/IFFT.

Parameters:
  • a (array) – Signal, in frequency space, e.g., a = fft(signal).
  • mode (str) – Determines which FFT implementation will be used. Options are ‘fftw’, ‘np’, and ‘auto’. Using ‘auto’, will attempt to use a pyfftw implementation with some sensible parameters (if the module is available), and will use numpy’s fftpack implementation otherwise.
  • ifft_params (dict, None, optional) – Dictionary of input arguments to provide to the call computing ifft. If mode is ‘auto’ and params dict is None, sensible values will be chosen. If ifft_params is not None, it will not be altered.
Returns:

hilbert_a: Hilbert transform of input array a, in the time domain.

Return type:

array
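
Starting from a spectrum instead of a waveform saves one forward FFT: the negative-frequency bins are zeroed, the positive ones doubled, and a single inverse FFT yields the analytic signal (the same quantity scipy.signal.hilbert returns). The sketch below illustrates that idea with NumPy only; `analytic_from_fft` is a hypothetical name and the library's function may differ in details such as axis handling.

```python
import numpy as np


def analytic_from_fft(spectrum):
    """Analytic signal from an FFT-ordered spectrum along the last axis."""
    n = spectrum.shape[-1]
    step = np.zeros(n)
    step[0] = 1                      # keep DC as is
    if n % 2 == 0:
        step[1:n // 2] = 2           # double positive frequencies
        step[n // 2] = 1             # keep the Nyquist bin as is
    else:
        step[1:(n + 1) // 2] = 2
    return np.fft.ifft(spectrum * step, axis=-1)


n = 64
t = np.arange(n)
tone = np.cos(2 * np.pi * 4 * t / n)
analytic = analytic_from_fft(np.fft.fft(tone))
envelope = np.abs(analytic)          # flat envelope for a pure tone
```

The magnitude of the analytic signal is the envelope, which is what the subband envelope functions above are after.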

pycochleagram.utils.filtshow(freqs, filts, hz_cutoffs=None, full_filter=True, use_log_x=False, interact=True)[source]

pycochleagram.utils.get_channels(snd_array)[source]

Returns the number of channels in the sound array.

Parameters:
  • snd_array (array) – Array (of sound data).
Returns:

n_channels: The number of channels in the input array.

Return type:

int

pycochleagram.utils.hilbert(a, axis=None, mode='auto', fft_params=None)[source]

Compute the Hilbert transform of a time-domain signal.

Provides access to FFTW-based implementation of the Hilbert transform.

Parameters:
  • a (array) – Time-domain signal.
  • mode (str) – Determines which FFT implementation will be used. Options are ‘fftw’, ‘np’, and ‘auto’. Using ‘auto’, will attempt to use a pyfftw implementation with some sensible parameters (if the module is available), and will use numpy’s fftpack implementation otherwise.
  • fft_params (dict, None, optional) – Dictionary of input arguments to provide to the call computing fft and ifft. If mode is ‘auto’ and params dict is None, sensible values will be chosen. If fft_params is not None, it will not be altered.
Returns:

hilbert_a: Hilbert transform of input array a, in the time domain.

Return type:

array

pycochleagram.utils.ifft(a, n=None, axis=-1, norm=None, mode='auto', params=None)[source]

Provides support for various implementations of the IFFT, using numpy’s fftpack or pyfftw’s fftw. This uses a numpy.fft-like interface.

Parameters:
  • a (array) – Frequency-domain signal.
  • mode (str) – Determines which IFFT implementation will be used. Options are ‘fftw’, ‘np’, and ‘auto’. Using ‘auto’, will attempt to use a pyfftw implementation with some sensible parameters (if the module is available), and will use numpy’s fftpack implementation otherwise.
  • n (int, optional) – Length of the transformed axis of the output. If n is smaller than the length of the input, the input is cropped. If it is larger, the input is padded with zeros. If n is not given, the length of the input along the axis specified by axis is used.
  • axis (int, optional) – Axis over which to compute the IFFT. If not given, the last axis is used.
  • norm ({None, 'ortho'}, optional) – Support for numpy interface.
  • params (dict, None, optional) – Dictionary of additional input arguments to provide to the appropriate fft function (usually fftw). Note, named arguments (e.g., n, axis, and norm) will override identically named arguments in params. If mode is ‘auto’ and params dict is None, sensible values will be chosen. If params is not None, it will not be altered.
Returns:

ifft_a: Signal in the time domain. See numpy.fft.ifft() for a description of the output.

Return type:

array

pycochleagram.utils.irfft(a, n=None, axis=-1, mode='auto', params=None)[source]

Provides support for various implementations of the IRFFT, using numpy’s fftpack or pyfftw’s fftw. This uses a numpy.fft-like interface.

Parameters:
  • a (array) – Frequency-domain signal.
  • mode (str) – Determines which FFT implementation will be used. Options are ‘fftw’, ‘np’, and ‘auto’. Using ‘auto’, will attempt to use a pyfftw implementation with some sensible parameters (if the module is available), and will use numpy’s fftpack implementation otherwise.
  • n (int, optional) – Length of the transformed axis of the output. If n is smaller than the length of the input, the input is cropped. If it is larger, the input is padded with zeros. If n is not given, the length of the input along the axis specified by axis is used.
  • axis (int, optional) – Axis over which to compute the FFT. If not given, the last axis is used.
  • params (dict, None, optional) – Dictionary of additional input arguments to provide to the appropriate fft function (usually fftw). Note, named arguments (e.g., n and axis) will override identically named arguments in params. If mode is ‘auto’ and params dict is None, sensible values will be chosen. If params is not None, it will not be altered.
Returns:

irfft_a: Signal in the time domain. See numpy.fft.irfft() for a description of the output.

Return type:

array

pycochleagram.utils.matlab_arange(start, stop, num)[source]

Mimics MATLAB’s sequence generation.

Returns num + 1 evenly spaced samples, calculated over the interval [start, stop].

Parameters:
  • start (scalar) – The starting value of the sequence.
  • stop (scalar) – The end value of the sequence.
  • num (int) – Number of samples to generate.
Returns:

samples: There are num + 1 equally spaced samples in the closed interval.

Return type:

ndarray
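
Going by the description above (num + 1 evenly spaced samples over the closed interval), the behaviour can be reproduced with numpy.linspace. This is a sketch of the documented contract, not necessarily how the library implements it, and `matlab_arange_sketch` is a hypothetical name.

```python
import numpy as np


def matlab_arange_sketch(start, stop, num):
    """Return num + 1 evenly spaced samples in [start, stop]."""
    return np.linspace(start, stop, num + 1)


samples = matlab_arange_sketch(0, 1, 4)
```

The extra sample is what distinguishes this from a plain numpy.linspace(start, stop, num) call, which would return num samples.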

pycochleagram.utils.play_array(snd_array, sr=44100, rescale='normalize', pyaudio_params={}, ignore_warning=False)[source]

Play the provided sound array using pyaudio.

Parameters:
  • snd_array (array) – The array containing the sound data.
  • sr (number) – Sampling rate for playback; defaults to 44,100 Hz. Will be overridden if pyaudio_params is provided.
  • rescale ({'standardize', 'normalize', None}) – Determines type of rescaling to perform. ‘standardize’ will divide by the max value allowed by the numerical precision of the input. ‘normalize’ will rescale to the interval [-1, 1]. None will not perform rescaling (NOTE: be careful with this as this can be very loud if played back!).
  • pyaudio_params (dict) – A dictionary containing any input arguments to pass to the pyaudio.PyAudio.open method.
  • ignore_warning (bool, optional) – Determines if audio playback will occur. The playback volume can be very loud, so to use this method, ignore_warning must be True. If this is False, an error will be thrown warning the user about this issue.
Returns:

sound_str: The string representation (used by pyaudio) of the sound array.

Return type:

str

Raises:

ValueError – If ignore_warning is False, an error is thrown to warn the user about the possible loud sounds associated with playback

pycochleagram.utils.rescale_sound(snd_array, rescale)[source]

Rescale the sound with the provided rescaling method (if supported).

Parameters:
  • snd_array (array) – The array containing the sound data.
  • rescale ({'standardize', 'normalize', None}) – Determines type of rescaling to perform. ‘standardize’ will divide by the max value allowed by the numerical precision of the input. ‘normalize’ will rescale to the interval [-1, 1]. None will not perform rescaling (NOTE: be careful with this as this can be very loud if played back!).
Returns:

rescaled_snd: The sound array after rescaling.

Return type:

array
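
The two rescaling modes can be sketched as follows. This is an interpretation of the parameter description, not the library's code: `rescale_sound_sketch` is a hypothetical name, ‘standardize’ is limited to integer PCM data here, and ‘normalize’ is assumed to mean division by the peak absolute value.

```python
import numpy as np


def rescale_sound_sketch(snd_array, rescale):
    snd = np.asarray(snd_array)
    if rescale is None:
        return snd
    if rescale == "standardize":
        # Divide by the largest value the integer dtype can hold.
        if not np.issubdtype(snd.dtype, np.integer):
            raise ValueError("this sketch standardizes integer data only")
        return snd / float(np.iinfo(snd.dtype).max)
    if rescale == "normalize":
        # Convert first so abs() of the most negative integer cannot overflow.
        as_float = snd.astype(float)
        peak = np.max(np.abs(as_float))
        return as_float / peak if peak > 0 else as_float
    raise ValueError("rescale must be 'standardize', 'normalize' or None")


pcm = np.array([0, 16384, -32768], dtype=np.int16)
standardized = rescale_sound_sketch(pcm, "standardize")
normalized = rescale_sound_sketch(pcm, "normalize")
```

Note that ‘standardize’ depends only on the dtype, so quiet recordings stay quiet, whereas ‘normalize’ always brings the loudest sample to full scale.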

pycochleagram.utils.rfft(a, n=None, axis=-1, mode='auto', params=None)[source]

Provides support for various implementations of the RFFT, using numpy’s fftpack or pyfftw’s fftw. This uses a numpy.fft-like interface.

Parameters:
  • a (array) – Time-domain signal.
  • mode (str) – Determines which FFT implementation will be used. Options are ‘fftw’, ‘np’, and ‘auto’. Using ‘auto’, will attempt to use a pyfftw implementation with some sensible parameters (if the module is available), and will use numpy’s fftpack implementation otherwise.
  • n (int, optional) – Length of the transformed axis of the output. If n is smaller than the length of the input, the input is cropped. If it is larger, the input is padded with zeros. If n is not given, the length of the input along the axis specified by axis is used.
  • axis (int, optional) – Axis over which to compute the FFT. If not given, the last axis is used.
  • params (dict, None, optional) – Dictionary of additional input arguments to provide to the appropriate fft function (usually fftw). Note, named arguments (e.g., n and axis) will override identically named arguments in params. If mode is ‘auto’ and params dict is None, sensible values will be chosen. If params is not None, it will not be altered.
Returns:

rfft_a: Signal in the frequency domain in standard order. See numpy.fft.rfft() for a description of the output.

Return type:

array
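
For real input, the RFFT keeps only the non-negative frequency bins, since the rest are their complex conjugates. The short NumPy sketch below illustrates that relationship; it uses numpy.fft directly rather than the wrapper documented here.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(9)                   # real signal, odd length

full = np.fft.fft(x)                         # 9 complex bins
half = np.fft.rfft(x)                        # 9 // 2 + 1 = 5 bins
x_back = np.fft.irfft(half, n=x.size)        # n is needed for odd lengths
```

Passing n to the inverse matters because the half spectrum alone cannot tell an odd-length signal from an even-length one.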

pycochleagram.utils.rms(a, strict=True)[source]

Compute root mean squared of array. WARNING: THIS BREAKS WITH AXIS, only works on vector input.

Parameters:
  • a (array) – Input array.
Returns:

rms_a: Root mean squared of array.

Return type:

array

pycochleagram.utils.wav_to_array(fn, rescale='standardize')[source]

Reads wav file data into a numpy array.

Parameters:
  • fn (str) – The file path to .wav file.
  • rescale ({'standardize', 'normalize', None}) – Determines type of rescaling to perform. ‘standardize’ will divide by the max value allowed by the numerical precision of the input. ‘normalize’ will rescale to the interval [-1, 1]. None will not perform rescaling (NOTE: be careful with this as this can be very loud if played back!).
Returns:

snd (array): The sound in the .wav file as a numpy array.

samp_freq (int): Sampling frequency of the input sound.

Return type:

tuple