Use your MP3 player to train your swimming cadence

If you have ever thought about training your swimming cadence, an acoustic metronome device probably came to mind. The problem with such a device is that you have to buy one, and they aren’t cheap.

Personally, I’ve been using a waterproof audio player for a few years now. So I thought: why not use this device as an acoustic metronome? The main challenge was finding suitable MP3 files that matched my training needs.

Unfortunately, I didn’t find any that met my expectations, so I had to generate them myself. The result sounds like this:

This file is ten minutes long and starts with an announcement of the beats per minute (BPM), which helps you to identify the file on your MP3 player.

I generated many more of those files, with different beat sounds and tempos. If you’re only interested in them, simply scroll down to the extensive table. There is also a link at the very bottom to download a ZIP with all files.

However, if you’re looking for more customization (e.g., different durations or custom sounds), the following section explains how to generate them yourself.

Generating the Audiofiles using Python

To generate the files, I use Python 3.13 and essentially two libraries. The first is gTTS (Google Text-to-Speech), a library that interfaces with Google Translate’s text-to-speech API, which we use for the announcements. The second is Pydub, which we use to create the audio files.

The last two packages in the install command below are not strictly necessary. audioop-lts is only required if you are using Python 3.13 or later, since the built-in audioop module was deprecated in version 3.11 and removed in 3.13. pandas is used to render the tables below.

%pip install -q gTTS pydub audioop-lts pandas;

Note: you may need to restart the kernel to use updated packages.

After installing the requirements, we can start writing the function that generates our audio file:

from pydub import AudioSegment
import math

def generate_beats(
    bpm: int,
    duration_seconds: int,
    audiofiles_and_volume_change_in_db: list[tuple[str, float]],  # db = decibel
    audios_per_beat: int = 1,
) -> AudioSegment:
    # we load all audio files and boost/reduce their volume
    loaded_audiofiles = [AudioSegment.from_file(f) + v for f, v in audiofiles_and_volume_change_in_db]

    # to avoid overlaps later, we have to calculate the maximum length for a single beat
    max_audio_length = 60_000 / (bpm * audios_per_beat)

    # we concatenate all the loaded audio files. Additionally, we clip the sound files using the max_audio_length
    # and add a silent filling so that the length matches the beats per minute.
    audios = AudioSegment.empty()
    for sound in loaded_audiofiles:
        audios += sound[:max_audio_length] + AudioSegment.silent(duration=max(0, max_audio_length - len(sound)))

    # we calculate how often the concatenated pattern has to be repeated
    # (one repetition plays every loaded audio file once)
    total_beats = math.floor((duration_seconds / 60) * bpm * (audios_per_beat / len(loaded_audiofiles)))

    # we repeat the concatenated audios to get the full length
    beats = audios * total_beats

    return beats

The generate_beats function now takes…

the beats per minute,
the desired length in seconds,
a list of sound files and their volume change in decibels (so we can make certain sounds louder or quieter),
and the number of audio files to play per beat, which lets you add subdivisions between the main beats.
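
To make the timing arithmetic concrete, here is a small sketch that reproduces only the calculations from generate_beats, without loading any audio. The helper name beat_timing is my own and purely illustrative. It uses integer floor division, which for positive whole numbers is mathematically the same as the math.floor expression in the function.

```python
def beat_timing(
    bpm: int,
    duration_seconds: int,
    n_audiofiles: int,
    audios_per_beat: int = 1,
) -> tuple[float, int, float]:
    """Return (slot length in ms, pattern repetitions, total length in seconds)."""
    # every audio file gets one slot of this length (clipped or padded with silence)
    slot_ms = 60_000 / (bpm * audios_per_beat)

    # one repetition of the pattern plays each audio file once
    repetitions = (duration_seconds * bpm * audios_per_beat) // (60 * n_audiofiles)

    total_seconds = repetitions * n_audiofiles * slot_ms / 1000
    return slot_ms, repetitions, total_seconds


# two sounds, two sounds per beat, 60 BPM, 10 seconds
print(beat_timing(60, 10, n_audiofiles=2, audios_per_beat=2))  # (500.0, 10, 10.0)
```

So at 60 BPM with two sounds per beat, each sound gets a 500 ms slot, and the two-sound pattern is repeated ten times to fill ten seconds.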

We can test this by providing two sound files and setting audios_per_beat to two. This should result in a clap every second, with a beep sound in between:

generate_beats(
    60, # bpm
    10, # duration in seconds
    [("input/clap1.mp3", 10), ("input/beep.mp3", 0)],
    audios_per_beat = 2
    ).export("first.mp3")

<_io.BufferedRandom name='first.mp3'>
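
If you don’t have any sound files at hand, a simple beep can be synthesized with the standard library. The following is only a sketch under my own assumptions: the function name write_beep_wav, the 880 Hz tone, and the 100 ms length are illustrative choices and not the sounds used for the files in this post.

```python
import math
import struct
import wave


def write_beep_wav(
    path: str,
    frequency_hz: float = 880.0,
    duration_ms: int = 100,
    sample_rate: int = 16_000,
    amplitude: float = 0.5,
) -> int:
    """Write a mono 16-bit sine beep to `path` and return the number of frames."""
    n_frames = sample_rate * duration_ms // 1000
    frames = bytearray()
    for i in range(n_frames):
        # linear fade-out so the tone does not end with an audible click
        fade = 1.0 - i / n_frames
        sample = amplitude * fade * math.sin(2 * math.pi * frequency_hz * i / sample_rate)
        frames += struct.pack("<h", int(sample * 32767))

    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)   # mono
        wav.setsampwidth(2)   # 16 bit
        wav.setframerate(sample_rate)
        wav.writeframes(bytes(frames))
    return n_frames
```

After calling write_beep_wav("beep.wav"), the file should be usable as an input for generate_beats, for example as [("beep.wav", 0)], since AudioSegment.from_file also reads WAV files.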

Next, we need to implement the function that generates the announcement:

from gtts import gTTS
from pydub import AudioSegment
import os
import uuid

def generate_announcement(
    text: str,
    locale: str = 'en',
) -> AudioSegment:
    tts = gTTS(text=text, lang=locale)
    # we create a temporary file (without using Python's tempfile
    # because it often causes problems on Windows machines)
    announcement_file = f"{uuid.uuid4().hex}.mp3"
    try:
        tts.save(announcement_file)
        announcement = AudioSegment.from_mp3(announcement_file)
    finally:
        # and clean up the temporary file (it may not exist if saving failed)
        if os.path.exists(announcement_file):
            os.remove(announcement_file)

    return announcement

generate_announcement("Hello World").export("announcement.mp3")

<_io.BufferedRandom name='announcement.mp3'>
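
As a possible alternative to the UUID-named file in the working directory, a temporary directory can hold the file instead. This is a sketch I have not run together with gTTS. It assumes that the usual Windows trouble comes from reopening a file that is still held open, which a temporary directory avoids. The name temporary_audio_path is hypothetical.

```python
import os
import tempfile
from contextlib import contextmanager
from typing import Iterator


@contextmanager
def temporary_audio_path(suffix: str = ".mp3") -> Iterator[str]:
    """Yield a file path inside a temporary directory that is deleted afterwards."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        # the file itself is not created here; the caller writes to this path
        yield os.path.join(tmp_dir, f"announcement{suffix}")
```

Inside generate_announcement this could be used as `with temporary_audio_path() as path:` followed by tts.save(path) and AudioSegment.from_mp3(path). The directory and its content are removed when the block ends, even if an exception is raised.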

And we get:

Before we combine everything, we write a simple helper function to export the audio file as a compressed MP3. We don’t want large, repetitive sound files ^^:

from pydub import AudioSegment

def export_audiofile_as_optimized_mp3(
        out_file: str,
        audio: AudioSegment
        ):
    audio = audio.set_frame_rate(16000)
    audio = audio.set_channels(1) # mono channel which reduces the file size
    audio.export(out_file, format="mp3", bitrate="24k", parameters=["-acodec", "libmp3lame", "-q:a", "9"])
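
To get a feeling for the resulting file sizes, here is a rough estimate based on the bitrate alone. It assumes a constant bitrate and ignores MP3 headers and tags. Since the export also passes -q:a 9, which asks the encoder for a variable-bitrate quality level, the real files may well come out smaller or larger. The helper name is my own.

```python
def estimate_mp3_size_bytes(bitrate_kbps: int, duration_seconds: int) -> int:
    """Estimate the size of a constant-bitrate MP3 in bytes (headers and tags ignored)."""
    # kilobits per second -> bits per second -> total bits -> bytes
    return bitrate_kbps * 1000 * duration_seconds // 8


# a ten-minute file at 24 kbit/s
print(estimate_mp3_size_bytes(24, 10 * 60))  # 1800000
```

That is roughly 1.8 MB per ten-minute file under these assumptions.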

Now, we can combine everything and generate a ‘final’ audio file:

bpm = 60
duration_in_seconds = 10
export_audiofile_as_optimized_mp3(
    "result.mp3",
    generate_announcement(f"{bpm}") + AudioSegment.silent(duration=500) + generate_beats(
        bpm,
        duration_in_seconds,
        [("input/beep.mp3", 0), ("input/clap1.mp3", 10)],
        audios_per_beat = 2
    )
)

Lastly, we define a few different configurations to generate multiple sound files for BPM values from 45 to 105 in increments of 5 (the stop value 110 of the range is not included):

import pandas as pd

bpms = range(45, 110, 5)

configurations = [
    {
        "name": "beep",
        "audiofiles_and_volume_change_in_db": [("input/beep.mp3", 10)],
        "audios_per_beat": 1,
    },
    {
        "name": "tick",
        "audiofiles_and_volume_change_in_db": [("input/tick1.mp3", 10), ("input/tick2.mp3", 10)],
        "audios_per_beat": 1,
    },
    {
        "name": "clap",
        "audiofiles_and_volume_change_in_db": [("input/clap1.mp3", 20), ("input/clap2.mp3", 0), ("input/clap3.mp3", 0), ("input/clap4.mp3", 0)],
        "audios_per_beat": 4,
    },
]

result_data = []

for config in configurations:
    for bpm in bpms:
        out_file = f"output/{config['name']}-{bpm}bpm.mp3"

        export_audiofile_as_optimized_mp3(
            out_file, 
            generate_announcement(f'{bpm}') + AudioSegment.silent(duration=500) + generate_beats(
                bpm,
                10 * 60,
                config['audiofiles_and_volume_change_in_db'],
                audios_per_beat = config['audios_per_beat']
            )   
        )

        result_data.append({
            'Name': config['name'],
            'BPM': bpm,
            'Interval (s)': round(60 / bpm, 3),
            'File': out_file,
        })

df = pd.DataFrame(result_data)
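
A short sketch of the two small calculations used in the loop: which BPM values a range produces, and how the interval column is derived. The helpers bpm_steps and interval_seconds are illustrative names of my own. bpm_steps shows one way to include the upper bound, in case you want a 110 BPM file as well.

```python
def bpm_steps(first: int, last: int, step: int) -> list[int]:
    """Return BPM values from `first` up to and including `last` (if it lies on the grid)."""
    return list(range(first, last + 1, step))


def interval_seconds(bpm: int) -> float:
    """Time between two beats in seconds, rounded like the table column."""
    return round(60 / bpm, 3)


print(list(range(45, 110, 5))[-1])  # 105, because the stop value is excluded
print(bpm_steps(45, 110, 5)[-1])    # 110
print(interval_seconds(45))         # 1.333
```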

Results

from IPython.display import HTML
df['Audio'] = '<audio controls><source src="' + df['File'] + '" type="audio/mpeg"></audio>'
df['Download'] = '<a href="' + df['File'] + '" download>Download</a>'
HTML(df.drop(columns=['File']).to_html(render_links=True, escape=False))

	Name	BPM	Interval (s)	Download
0	beep	45	1.333	Download
1	beep	50	1.200	Download
2	beep	55	1.091	Download
3	beep	60	1.000	Download
4	beep	65	0.923	Download
5	beep	70	0.857	Download
6	beep	75	0.800	Download
7	beep	80	0.750	Download
8	beep	85	0.706	Download
9	beep	90	0.667	Download
10	beep	95	0.632	Download
11	beep	100	0.600	Download
12	beep	105	0.571	Download
13	tick	45	1.333	Download
14	tick	50	1.200	Download
15	tick	55	1.091	Download
16	tick	60	1.000	Download
17	tick	65	0.923	Download
18	tick	70	0.857	Download
19	tick	75	0.800	Download
20	tick	80	0.750	Download
21	tick	85	0.706	Download
22	tick	90	0.667	Download
23	tick	95	0.632	Download
24	tick	100	0.600	Download
25	tick	105	0.571	Download
26	clap	45	1.333	Download
27	clap	50	1.200	Download
28	clap	55	1.091	Download
29	clap	60	1.000	Download
30	clap	65	0.923	Download
31	clap	70	0.857	Download
32	clap	75	0.800	Download
33	clap	80	0.750	Download
34	clap	85	0.706	Download
35	clap	90	0.667	Download
36	clap	95	0.632	Download
37	clap	100	0.600	Download
38	clap	105	0.571	Download

import zipfile, os

with zipfile.ZipFile('all.zip', 'w', zipfile.ZIP_DEFLATED) as zipf:
    for f in df['File']:
        zipf.write(f, arcname=os.path.basename(f))
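
Because arcname drops the directory part, two files with the same base name would end up under the same name in the archive. The following sketch wraps the zipping in a hypothetical helper, zip_flat, that checks for such collisions first and then reopens the archive to verify it. It is an optional extra, not part of the code that produced the download.

```python
import os
import zipfile


def zip_flat(paths: list[str], archive_path: str) -> list[str]:
    """Zip `paths` without their directories and return the archived names."""
    names = [os.path.basename(p) for p in paths]
    if len(set(names)) != len(names):
        raise ValueError("two input files share the same base name")

    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as zipf:
        for path, name in zip(paths, names):
            zipf.write(path, arcname=name)

    with zipfile.ZipFile(archive_path) as zipf:
        # testzip() returns the name of the first corrupt member, or None if all are fine
        if zipf.testzip() is not None:
            raise RuntimeError("archive failed its integrity check")
        return zipf.namelist()
```

With the DataFrame from above, this could be called as zip_flat(list(df['File']), 'all.zip').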

You can download all the files here –> Download ZIP. Simply download the file, unzip it, and transfer the tracks you want to your MP3 player. Then you’re all set to hit the pool! :)

btw: If this helped you save some money, feel free to support my work by buying me a coffee.
