hub / github.com/TheAlgorithms/Python / mfcc

Function mfcc

machine_learning/mfcc.py:69–149

Calculate Mel Frequency Cepstral Coefficients (MFCCs) from an audio signal.

(
    audio: np.ndarray,
    sample_rate: int,
    ftt_size: int = 1024,
    hop_length: int = 20,
    mel_filter_num: int = 10,
    dct_filter_num: int = 40,
)
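
Note that hop_length is given in milliseconds while ftt_size is given in samples, so the number of output frames depends on the sample rate. The sketch below estimates the output shape from the signature defaults. It assumes the framing helper pads the signal by half an FFT window on each side, which is not shown in this listing but is consistent with the (40, 101) shape reported in the function's doctest. The helper name expected_mfcc_shape is hypothetical.

```python
from __future__ import annotations


def expected_mfcc_shape(
    num_samples: int,
    sample_rate: int,
    ftt_size: int = 1024,
    hop_length: int = 20,
    dct_filter_num: int = 40,
) -> tuple[int, int]:
    """Estimate the (coefficients, frames) shape of the MFCC matrix."""
    # hop_length is in milliseconds; convert it to a hop size in samples.
    hop_size = round(sample_rate * hop_length / 1000)
    # Assumption: the signal is padded by ftt_size // 2 samples on each side.
    padded_length = num_samples + 2 * (ftt_size // 2)
    # One frame per hop that still fits a full FFT window.
    frame_count = (padded_length - ftt_size) // hop_size + 1
    return dct_filter_num, frame_count


# Two seconds of audio at 44.1 kHz, as in the doctest.
shape = expected_mfcc_shape(num_samples=int(44100 * 2.0), sample_rate=44100)
print(shape)  # (40, 101)
```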

Source from the content-addressed store, hash-verified

def mfcc(
    audio: np.ndarray,
    sample_rate: int,
    ftt_size: int = 1024,
    hop_length: int = 20,
    mel_filter_num: int = 10,
    dct_filter_num: int = 40,
) -> np.ndarray:
    """
    Calculate Mel Frequency Cepstral Coefficients (MFCCs) from an audio signal.

    Args:
        audio: The input audio signal.
        sample_rate: The sample rate of the audio signal (in Hz).
        ftt_size: The size of the FFT window (default is 1024).
        hop_length: The hop length for frame creation (default is 20ms).
        mel_filter_num: The number of Mel filters (default is 10).
        dct_filter_num: The number of DCT filters (default is 40).

    Returns:
        A matrix of MFCCs for the input audio.

    Raises:
        ValueError: If the input audio is empty.

    Example:
    >>> sample_rate = 44100  # Sample rate of 44.1 kHz
    >>> duration = 2.0  # Duration of 2 seconds
    >>> t = np.linspace(0, duration, int(sample_rate * duration), endpoint=False)
    >>> audio = 0.5 * np.sin(2 * np.pi * 440.0 * t)  # Generate a 440 Hz sine wave
    >>> mfccs = mfcc(audio, sample_rate)
    >>> mfccs.shape
    (40, 101)
    """
    logging.info(f"Sample rate: {sample_rate}Hz")
    logging.info(f"Audio duration: {len(audio) / sample_rate}s")
    logging.info(f"Audio min: {np.min(audio)}")
    logging.info(f"Audio max: {np.max(audio)}")

    # normalize audio
    audio_normalized = normalize(audio)

    logging.info(f"Normalized audio min: {np.min(audio_normalized)}")
    logging.info(f"Normalized audio max: {np.max(audio_normalized)}")

    # split audio into frames
    audio_framed = audio_frames(
        audio_normalized, sample_rate, ftt_size=ftt_size, hop_length=hop_length
    )

    logging.info(f"Framed audio shape: {audio_framed.shape}")
    logging.info(f"First frame: {audio_framed[0]}")

    # convert to frequency domain
    # For simplicity we will choose the Hann window.
    window = get_window("hann", ftt_size, fftbins=True)
    audio_windowed = audio_framed * window
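
The framing and windowing steps in the listing can be sketched with NumPy alone. This is a minimal illustration, not the repository's audio_frames helper: the names frame_signal and periodic_hann are hypothetical, and the reflect padding by half a frame is an assumption. The window formula is the periodic Hann window, which is what SciPy's get_window("hann", n, fftbins=True) produces.

```python
import numpy as np


def frame_signal(audio: np.ndarray, frame_size: int, hop_size: int) -> np.ndarray:
    """Slice a 1-D signal into rows of frame_size samples, hop_size apart."""
    # Assumption: reflect-pad by half a frame so frames are centred on each hop.
    padded = np.pad(audio, frame_size // 2, mode="reflect")
    frame_count = (len(padded) - frame_size) // hop_size + 1
    frames = np.zeros((frame_count, frame_size))
    for n in range(frame_count):
        start = n * hop_size
        frames[n] = padded[start : start + frame_size]
    return frames


def periodic_hann(size: int) -> np.ndarray:
    """Periodic Hann window: 0.5 - 0.5 * cos(2 * pi * n / size)."""
    n = np.arange(size)
    return 0.5 - 0.5 * np.cos(2.0 * np.pi * n / size)


sample_rate = 8000
t = np.arange(sample_rate) / sample_rate  # one second of samples
audio = np.sin(2 * np.pi * 440.0 * t)  # 440 Hz sine wave

frames = frame_signal(audio, frame_size=256, hop_size=160)  # 20 ms hop at 8 kHz
windowed = frames * periodic_hann(256)  # window broadcasts across every row
print(frames.shape)  # (51, 256)
```

Multiplying the 2-D frame matrix by the 1-D window relies on NumPy broadcasting, so each frame is tapered toward zero at its edges before the FFT.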

Callers 1

example · Function · 0.85

Calls 7

audio_frames · Function · 0.85
calculate_fft · Function · 0.85
calculate_signal_power · Function · 0.85
mel_spaced_filterbank · Function · 0.85
transpose · Method · 0.80
normalize · Function · 0.70
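
The callee names (calculate_fft, calculate_signal_power, mel_spaced_filterbank) together with the dct_filter_num parameter suggest the usual MFCC stages after windowing: FFT, power spectrum, Mel filterbank, log, and DCT. The sketch below is a generic textbook version of those stages, not the repository's implementation. Every function name in it is hypothetical, and the Mel formula, triangular filters, and unnormalized DCT-II are common choices assumed here for illustration.

```python
import numpy as np


def hz_to_mel(freq_hz):
    """Convert Hz to Mel using the common 2595 * log10(1 + f / 700) formula."""
    return 2595.0 * np.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(sample_rate: int, fft_size: int, filter_num: int) -> np.ndarray:
    """Triangular filters evenly spaced on the Mel scale, one per row."""
    mel_points = np.linspace(0.0, hz_to_mel(sample_rate / 2), filter_num + 2)
    bins = np.floor((fft_size + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    filters = np.zeros((filter_num, fft_size // 2 + 1))
    for m in range(1, filter_num + 1):
        left, centre, right = bins[m - 1], bins[m], bins[m + 1]
        for k in range(left, centre):  # rising edge
            filters[m - 1, k] = (k - left) / (centre - left)
        for k in range(centre, right):  # falling edge
            filters[m - 1, k] = (right - k) / (right - centre)
    return filters


def dct_basis(dct_num: int, filter_num: int) -> np.ndarray:
    """Unnormalized DCT-II basis with shape (dct_num, filter_num)."""
    n = np.arange(filter_num)
    k = np.arange(dct_num)[:, None]
    return np.cos(np.pi * k * (2 * n + 1) / (2 * filter_num))


def mfcc_from_windowed_frames(
    frames: np.ndarray, sample_rate: int, filter_num: int = 10, dct_num: int = 40
) -> np.ndarray:
    """Map windowed frames (frame_count, fft_size) to (dct_num, frame_count)."""
    fft_size = frames.shape[1]
    spectrum = np.fft.rfft(frames, axis=1)  # one-sided spectrum per frame
    power = np.abs(spectrum) ** 2  # power spectrum
    mel_energy = mel_filterbank(sample_rate, fft_size, filter_num) @ power.T
    log_energy = 10.0 * np.log10(mel_energy + 1e-10)  # small offset avoids log(0)
    return dct_basis(dct_num, filter_num) @ log_energy


# Stand-in for windowed frames: 101 frames of 1024 samples of noise.
rng = np.random.default_rng(0)
demo_frames = rng.standard_normal((101, 1024))
coefficients = mfcc_from_windowed_frames(demo_frames, sample_rate=44100)
print(coefficients.shape)  # (40, 101)
```

The transpose before the filterbank product puts frequency bins on the rows, which is why the result has coefficients on the first axis and frames on the second.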

Tested by

no test coverage detected