Michael Winterspacher
January 18, 2026
If you have ever thought about training your swimming cadence, an acoustic metronome device probably came to mind. The problem with such a device is that you have to buy one, and they are not cheap.
Personally, I’ve been using a waterproof audio player for a few years now. So I thought: why not use this device as an acoustic metronome? The only problem: “Where do I get the MP3 files?”
I didn’t find any that matched my expectations, so I had to generate them myself. The result is something like this:
This file is ten minutes long and starts with an announcement of the beats per minute (BPM), which helps you to identify the file on your MP3 player.
Of course, I generated a lot more of those files, with different sounds and tempos. If you’re only interested in them, simply scroll down to the gigantic table. At the very bottom there is also a link to download a ZIP with all files.
If you want more customization (a different length or other beat sounds), take a look at the next part, where I show how you can generate the files yourself.
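We basically need two libraries. The first is gTTS, a Python library for Google Translate’s text-to-speech API, which we use for the announcements. The second is Pydub, which we use to create the audio files. On Python 3.13 and later, Pydub additionally needs the audioop-lts package, because the audioop module was removed from the standard library. I also install pandas to show the results as tables.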
%pip install -q gTTS pydub audioop-lts pandas
Note: you may need to restart the kernel to use updated packages.
After installing the requirements, we can start writing a function that generates our audio file.
from pydub import AudioSegment
import math

def generate_beats(
    bpm: int,
    duration_seconds: int,
    audiofiles_and_volume_change_in_db: list[tuple[str, float]],
    audios_per_beat: int = 1,
) -> AudioSegment:
    # Load every sound and apply its volume change in dB.
    loaded_audiofiles = [AudioSegment.from_file(f) + v for f, v in audiofiles_and_volume_change_in_db]
    # Length in milliseconds of the slot that each sound occupies.
    max_audio_length = 60_000 / (bpm * audios_per_beat)
    # Build one cycle: trim or pad each sound to exactly one slot.
    audios = AudioSegment.empty()
    for sound in loaded_audiofiles:
        audios += sound[:max_audio_length] + AudioSegment.silent(duration=max(0, max_audio_length - len(sound)))
    # Number of times the cycle is repeated to fill the requested duration.
    total_beats = math.floor((duration_seconds / 60) * bpm * (audios_per_beat / len(loaded_audiofiles)))
    beats = audios * total_beats
    return beats
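This should result in a clap every second with a beep in between:
The function takes multiple audio files as input, each paired with a value that controls its volume gain in dB. Every sound is trimmed or padded to one slot of 60,000 / (bpm × audios_per_beat) milliseconds, and the resulting cycle is repeated until the requested duration is filled.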
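To make the timing arithmetic concrete, here is a small sketch that needs no audio libraries at all. It only mirrors the slot and repetition formulas from `generate_beats`; the helper name `beat_schedule` and the sound labels are made up for illustration.

```python
import math

def beat_schedule(bpm, sound_names, audios_per_beat=1, duration_seconds=60):
    """Return (slot_ms, repetitions, onsets) using the same formulas as generate_beats."""
    # Length of one sound slot in milliseconds.
    slot_ms = 60_000 / (bpm * audios_per_beat)
    # How often the full cycle of sounds fits into the duration.
    repetitions = math.floor(
        (duration_seconds / 60) * bpm * (audios_per_beat / len(sound_names))
    )
    # Start time in milliseconds of every sound, in playback order.
    onsets = []
    for cycle in range(repetitions):
        for position, name in enumerate(sound_names):
            start_ms = (cycle * len(sound_names) + position) * slot_ms
            onsets.append((round(start_ms), name))
    return slot_ms, repetitions, onsets

# 60 BPM with two sounds per beat: a clap on the beat, a beep half-way.
slot_ms, repetitions, onsets = beat_schedule(60, ["clap", "beep"], audios_per_beat=2)
print(slot_ms)      # 500.0
print(repetitions)  # 60
print(onsets[:4])   # [(0, 'clap'), (500, 'beep'), (1000, 'clap'), (1500, 'beep')]
```

With two sounds and `audios_per_beat=2`, one cycle covers exactly one beat, so the clap lands on every full second and the beep on every half second.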
Then we need the announcement:
from gtts import gTTS
from pydub import AudioSegment
import os
import uuid

def generate_announcement(
    text: str,
    locale: str = 'en',
) -> AudioSegment:
    # Save the speech to a uniquely named temporary MP3, load it, then delete it.
    announcement_file = f"{uuid.uuid4().hex}.mp3"
    tts = gTTS(text=text, lang=locale)
    tts.save(announcement_file)
    announcement = AudioSegment.from_mp3(announcement_file)
    os.remove(announcement_file)
    return announcement
And we get:
Perfect. Before we combine everything, here is a small helper function that exports the audio file as a compressed MP3 file:
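A minimal sketch of such a helper could look like the following. The function name and argument order match how it is called further down; the mono downmix, the 64k default bitrate, and the folder creation are my assumptions for illustration, not necessarily the settings used for the published files. It expects `audio` to be a Pydub `AudioSegment`.

```python
import os

def export_audiofile_as_optimized_mp3(out_file, audio, bitrate="64k"):
    """Export a pydub AudioSegment as a mono MP3 (sketch with assumed settings)."""
    # Create the target folder (for example "output/") if it does not exist yet.
    os.makedirs(os.path.dirname(out_file) or ".", exist_ok=True)
    # A click track does not need stereo, so mix down to one channel.
    mono = audio.set_channels(1)
    # AudioSegment.export delegates the MP3 encoding to ffmpeg.
    return mono.export(out_file, format="mp3", bitrate=bitrate)
```

A lower bitrate gives smaller files, which is handy on a small MP3 player; raise it if the beat sounds start to sound muddy.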
And finally, we combine everything and generate a sound file:
In the last step, we define a few different configurations and generate all sound files for BPM 45 to 105 in increments of 5:
import pandas as pd

# range() excludes the stop value, so this yields 45, 50, ..., 105.
bpms = range(45, 110, 5)
configurations = [
    {
        "name": "beep",
        "audiofiles_and_volume_change_in_db": [("input/beep.mp3", 10)],
        "audios_per_beat": 1,
    },
    {
        "name": "tick",
        "audiofiles_and_volume_change_in_db": [("input/tick1.mp3", 10), ("input/tick2.mp3", 10)],
        "audios_per_beat": 1,
    },
    {
        "name": "clap",
        "audiofiles_and_volume_change_in_db": [("input/clap1.mp3", 20), ("input/clap2.mp3", 0), ("input/clap3.mp3", 0), ("input/clap4.mp3", 0)],
        "audios_per_beat": 4,
    },
    {
        "name": "clock",
        "audiofiles_and_volume_change_in_db": [("input/clock1.mp3", 20), ("input/clock2.mp3", 0), ("input/clock3.mp3", 0), ("input/clock4.mp3", 0)],
        "audios_per_beat": 4,
    }
]
result_data = []
for config in configurations:
    for bpm in bpms:
        # Double quotes outside, so the nested single quotes also work before Python 3.12.
        out_file = f"output/{config['name']}-{bpm}bpm.mp3"
        # Spoken BPM, half a second of silence, then ten minutes of beats.
        export_audiofile_as_optimized_mp3(
            out_file,
            generate_announcement(f'{bpm}') + AudioSegment.silent(duration=500) + generate_beats(
                bpm,
                10 * 60,
                config['audiofiles_and_volume_change_in_db'],
                audios_per_beat = config['audios_per_beat']
            )
        )
        result_data.append({
            'Name': config['name'],
            'BPM': bpm,
            'Interval (s)': round(60 / bpm, 3),
            'File': out_file,
        })
df = pd.DataFrame(result_data)

| | Name | BPM | Interval (s) | Download |
|---|---|---|---|---|
| 0 | beep | 45 | 1.333 | Download |
| 1 | beep | 50 | 1.200 | Download |
| 2 | beep | 55 | 1.091 | Download |
| 3 | beep | 60 | 1.000 | Download |
| 4 | beep | 65 | 0.923 | Download |
| 5 | beep | 70 | 0.857 | Download |
| 6 | beep | 75 | 0.800 | Download |
| 7 | beep | 80 | 0.750 | Download |
| 8 | beep | 85 | 0.706 | Download |
| 9 | beep | 90 | 0.667 | Download |
| 10 | beep | 95 | 0.632 | Download |
| 11 | beep | 100 | 0.600 | Download |
| 12 | beep | 105 | 0.571 | Download |
| 13 | tick | 45 | 1.333 | Download |
| 14 | tick | 50 | 1.200 | Download |
| 15 | tick | 55 | 1.091 | Download |
| 16 | tick | 60 | 1.000 | Download |
| 17 | tick | 65 | 0.923 | Download |
| 18 | tick | 70 | 0.857 | Download |
| 19 | tick | 75 | 0.800 | Download |
| 20 | tick | 80 | 0.750 | Download |
| 21 | tick | 85 | 0.706 | Download |
| 22 | tick | 90 | 0.667 | Download |
| 23 | tick | 95 | 0.632 | Download |
| 24 | tick | 100 | 0.600 | Download |
| 25 | tick | 105 | 0.571 | Download |
| 26 | clap | 45 | 1.333 | Download |
| 27 | clap | 50 | 1.200 | Download |
| 28 | clap | 55 | 1.091 | Download |
| 29 | clap | 60 | 1.000 | Download |
| 30 | clap | 65 | 0.923 | Download |
| 31 | clap | 70 | 0.857 | Download |
| 32 | clap | 75 | 0.800 | Download |
| 33 | clap | 80 | 0.750 | Download |
| 34 | clap | 85 | 0.706 | Download |
| 35 | clap | 90 | 0.667 | Download |
| 36 | clap | 95 | 0.632 | Download |
| 37 | clap | 100 | 0.600 | Download |
| 38 | clap | 105 | 0.571 | Download |
| 39 | clock | 45 | 1.333 | Download |
| 40 | clock | 50 | 1.200 | Download |
| 41 | clock | 55 | 1.091 | Download |
| 42 | clock | 60 | 1.000 | Download |
| 43 | clock | 65 | 0.923 | Download |
| 44 | clock | 70 | 0.857 | Download |
| 45 | clock | 75 | 0.800 | Download |
| 46 | clock | 80 | 0.750 | Download |
| 47 | clock | 85 | 0.706 | Download |
| 48 | clock | 90 | 0.667 | Download |
| 49 | clock | 95 | 0.632 | Download |
| 50 | clock | 100 | 0.600 | Download |
| 51 | clock | 105 | 0.571 | Download |
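The rows above follow directly from the four configuration names and the BPM range. As a quick sanity check that needs no audio libraries, the following sketch reproduces the file names and intervals; the helper name `planned_files` is made up for illustration.

```python
def planned_files(names, bpms):
    """List (file name, interval in seconds) for every name/BPM combination."""
    rows = []
    for name in names:
        for bpm in bpms:
            rows.append((f"output/{name}-{bpm}bpm.mp3", round(60 / bpm, 3)))
    return rows

# range(45, 110, 5) stops before 110, so the fastest tempo is 105 BPM.
rows = planned_files(["beep", "tick", "clap", "clock"], range(45, 110, 5))
print(len(rows))  # 52
print(rows[0])    # ('output/beep-45bpm.mp3', 1.333)
print(rows[-1])   # ('output/clock-105bpm.mp3', 0.571)
```

If you also want a 110 BPM file, use `range(45, 115, 5)` instead.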
Download all files as ZIP
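If you generate your own set of files, you can bundle them in the same way. This is only a sketch using Python's standard `zipfile` module; the helper name `zip_mp3_files` and the archive name in the usage comment are hypothetical.

```python
import zipfile
from pathlib import Path

def zip_mp3_files(source_dir, zip_path):
    """Bundle every MP3 in source_dir into one ZIP and return the stored names."""
    mp3_files = sorted(Path(source_dir).glob("*.mp3"))
    # MP3 data is already compressed, so store the files without recompressing.
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_STORED) as archive:
        for mp3_file in mp3_files:
            # arcname drops the folder so the files sit at the top of the archive.
            archive.write(mp3_file, arcname=mp3_file.name)
    return [mp3_file.name for mp3_file in mp3_files]

# Example usage (hypothetical archive name):
# zip_mp3_files("output", "metronome-files.zip")
```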
If you have a great idea to improve this (for example, a cool metronome sound), contact me at hello@winte.xyz. Did you save money? Then maybe buy me a coffee ;).