ML Two
Lecture 02
🤗 AI for audio applications
+
🎤 Sound classification with CreateML
Welcome 👩‍🎤🧑‍🎤👨‍🎤
First of all, don't forget to confirm your attendance on Seats App!
Last lecture: train an image classifier with CreateML, using a well-prepared fruit image dataset 🍎🍏
Today: train a sound classifier with CreateML, using a not-so-well-prepared environment sound dataset 🔊
😎 Recall from last lecture: what were the important bits of training an image classifier in CreateML?
- to tell CreateML what the training data and their corresponding labels are,
- we need to put images of the same class
- into one folder named after the class label!
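To make the folder-per-class idea concrete, here is a minimal sketch that builds a tiny throwaway dataset in that layout and counts the files in each class folder. The class names ("apple", "banana"), the file names and the helper `count_files_per_class` are made up for illustration; they are not last week's actual dataset.

```python
import os
import tempfile


def count_files_per_class(dataset_dir):
    """Return {class_label: number_of_files} for a folder-per-class dataset."""
    counts = {}
    for label in sorted(os.listdir(dataset_dir)):
        class_dir = os.path.join(dataset_dir, label)
        if os.path.isdir(class_dir):  # each sub-folder is one class
            files = [f for f in os.listdir(class_dir)
                     if os.path.isfile(os.path.join(class_dir, f))]
            counts[label] = len(files)
    return counts


# Build a tiny fake dataset: one folder per class, named after the label.
demo_dir = tempfile.mkdtemp()
fake_dataset = {"apple": ["a1.jpg", "a2.jpg"], "banana": ["b1.jpg"]}  # hypothetical names
for label, names in fake_dataset.items():
    os.makedirs(os.path.join(demo_dir, label))
    for name in names:
        open(os.path.join(demo_dir, label, name), "wb").close()  # empty placeholder file

class_counts = count_files_per_class(demo_dir)
print(class_counts)
```

The folder name is the only place the label lives, which is why the structure matters so much.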
Today's dataset: environment sounds
1. download
2. unzip
3. check: what is the file structure of this dataset?
All files are jumbled together in one big folder 🥲
🥲 Not the nice structure that CreateML recognises (in fact, most raw datasets do not have that nice structure).
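We can also check the file structure with a few lines of Python instead of clicking through Finder. The sketch below counts the sub-folders and files directly inside a folder; it runs on a small throwaway folder with made-up file names, and `summarise_folder` is a hypothetical helper. To inspect the real dataset, point it at the unzipped audio folder instead.

```python
import os
import tempfile
from collections import Counter


def summarise_folder(root):
    """Return (number of sub-folders, number of files, files per extension)."""
    entries = os.listdir(root)
    subfolders = [e for e in entries if os.path.isdir(os.path.join(root, e))]
    files = [e for e in entries if os.path.isfile(os.path.join(root, e))]
    extensions = Counter(os.path.splitext(f)[1].lower() for f in files)
    return len(subfolders), len(files), dict(extensions)


# Demo on a throwaway folder that mimics "everything in one big folder".
with tempfile.TemporaryDirectory() as demo_root:
    for name in ["clip_a.wav", "clip_b.wav", "clip_c.wav"]:  # hypothetical names
        open(os.path.join(demo_root, name), "wb").close()
    demo_summary = summarise_folder(demo_root)

print(demo_summary)
```

Zero sub-folders and many files of the same type is the sign of a flat dataset that still needs organising.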
- We will prepare the dataset ourselves! 😎
- Recall that data preparation is usually the first part of the ML development pipeline! 🤗
1. DATA PREP
-- data collection (p)
-- data pre-processing (p)
2. TRAINING
-- fine tuning (p,c)
-- from scratch (p,t)
3. DEPLOYMENT (c)
p: Python 🐍
c: CoreML and CreateML 🍎🤖
🤗 Interactive classroom Q&A time:
Where can we find the label/class information for this dataset?
Let's preview meta/esc50.csv
- btw what is csv...? Let's google!
Here, each row in "meta/esc50.csv" maps a filename to its category
We need to use the information in the csv to organize the dataset into the NICE structure for CreateML!
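A csv ("comma-separated values") file is just a text table: one row per line, with columns separated by commas. Here is a minimal sketch of previewing such a file with pandas. It uses a small in-memory sample so it runs anywhere; the rows are made up, and the column names "filename" and "category" are assumed from the description above. For the real file, pass the path "meta/esc50.csv" to `pd.read_csv` instead.

```python
import io

import pandas as pd

# A tiny stand-in for the real csv (hypothetical rows, assumed column names).
sample_csv = io.StringIO(
    "filename,category\n"
    "clip_a.wav,dog\n"
    "clip_b.wav,rain\n"
    "clip_c.wav,dog\n"
)

df = pd.read_csv(sample_csv)           # load the table into a DataFrame
print(df.head())                       # preview the first rows
print(df["category"].value_counts())   # how many clips per category?

# The mapping we need: filename -> category
file_to_category = dict(zip(df["filename"], df["category"]))
print(file_to_category["clip_b.wav"])
```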
Let's write a cool Python script that does the organization! 🧑‍🎤
Prepare our Python development environment (the software we use to write and run Python code):
1. Download Anaconda
2. Install Anaconda and open it
3. Install Spyder from Anaconda and open it
Familiarise ourselves with the Spyder interface; it is quite nice!
Two tricks in Spyder:
- type #%% to start a new cell
- press Shift + Enter to run the current cell, just like in a Colab notebook!
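As a small illustration of the two tricks, here is a script split into two cells. The variable names and numbers are made up; the point is only that each `#%%` line starts a new cell that can be run on its own.

```python
#%% Cell 1: define some data
clip_lengths_s = [5.0, 5.0, 5.0]  # hypothetical clip lengths in seconds

#%% Cell 2: use the data from cell 1
total = sum(clip_lengths_s)
print(total)
```

Run outside Spyder, the `#%%` lines are ordinary comments, so the same file still works as a normal Python script.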
Python time! - the code is prepared here

- 🌶️🌶️🌶️ we can start from scratch,
- 🌶️🌶️ or from the DataPrep-todo.py,
- 🌶️ or directly run the DataPrep-complete.py that will do the job
Keywords:
create new directory(folder) using python code,
csv,
dataframe,
iterate rows in a dataframe,
copy-paste files
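Putting the keywords together, here is one possible sketch of the organising step. It is not the prepared DataPrep-complete.py, just an illustration under stated assumptions: the csv has columns named "filename" and "category", and the helper `organise_by_label` is hypothetical. It runs on a tiny fake dataset so it can be tried without the real files.

```python
import os
import shutil
import tempfile

import pandas as pd


def organise_by_label(csv_path, audio_dir, output_dir,
                      filename_col="filename", label_col="category"):
    """Copy each file listed in the csv into output_dir/<label>/."""
    df = pd.read_csv(csv_path)
    copied = 0
    for _, row in df.iterrows():                       # iterate rows in the dataframe
        class_dir = os.path.join(output_dir, str(row[label_col]))
        os.makedirs(class_dir, exist_ok=True)          # create the class folder if needed
        src = os.path.join(audio_dir, row[filename_col])
        if not os.path.isfile(src):                    # skip rows whose file is missing
            continue
        shutil.copy(src, os.path.join(class_dir, row[filename_col]))
        copied += 1
    return copied


# Demo: fake flat dataset plus a matching csv (all names are made up).
demo_root = tempfile.mkdtemp()
audio_dir = os.path.join(demo_root, "audio")
os.makedirs(audio_dir)
rows = [("clip_a.wav", "dog"), ("clip_b.wav", "rain"), ("clip_c.wav", "dog")]
for name, _ in rows:
    open(os.path.join(audio_dir, name), "wb").close()  # empty placeholder file
csv_path = os.path.join(demo_root, "meta.csv")
pd.DataFrame(rows, columns=["filename", "category"]).to_csv(csv_path, index=False)

output_dir = os.path.join(demo_root, "organised")
n_copied = organise_by_label(csv_path, audio_dir, output_dir)
print(n_copied, sorted(os.listdir(output_dir)))
```

Copying (rather than moving) keeps the original download intact, so the script can be re-run safely if something goes wrong.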
Step 1 Data preparation done!!! 😎
Next: summon CreateML 😈
Step 2 Train the (sound classification) model: import the prepared dataset into CreateML and train
Let's bring the model to our App!
Step 3 Deploy the (sound classification) model: export the trained model from CreateML, and import it into the template iOS app "SCDemo-Improved"
Play with it!
🌶🌶 Our second AI project done! 🌶🌶
🌶🌶 A little summary of today's practice 🌶🌶
We wrote some Python code for data pre-processing 🔨:
-- From the "os" library,
--- use the "makedirs" function to create new folders 🛸
-- From the "pandas" library,
--- use the "DataFrame" object for handling csv file data, and basically any tabular data!
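As a quick refresher on these two tools, here is a minimal sketch. The folder names and table contents are made up; only `os.makedirs` and `pandas.DataFrame` themselves come from today's practice.

```python
import os
import tempfile

import pandas as pd

# os.makedirs creates all missing folders along the path in one call.
base_dir = tempfile.mkdtemp()
nested = os.path.join(base_dir, "organised", "dog")
os.makedirs(nested)                 # creates "organised" and "dog"
os.makedirs(nested, exist_ok=True)  # exist_ok=True: no error if it already exists
folder_exists = os.path.isdir(nested)

# A DataFrame is a table with named columns; here it is built from a dict.
table = pd.DataFrame({
    "filename": ["clip_a.wav", "clip_b.wav"],
    "category": ["dog", "rain"],
})
n_rows, n_cols = table.shape
dog_rows = table[table["category"] == "dog"]  # keep only the rows labelled "dog"
print(folder_exists, n_rows, n_cols, list(dog_rows["filename"]))
```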
😎🛸 Fun AI time: AI for audio applications
Example applications - 🔊 Classification tasks
Example: sound source identification, music genre classification, etc.
- BirdNET
- Speaker recognition
Example applications - Music related (there are many; the list below is far from exhaustive)
- 🤌 Music source separation
-- commercial products: lalal.ai, vocali, etc.
-- research/open source projects: demucs, spleeter, etc.
- 🤌 Music generation / audio synthesis
-- WaveNet (based on a CNN model)
-- Riffusion (based on a diffusion model, to be introduced later in ML Two!)
-- Musika (based on a GAN, to be introduced next week!)
-- Great explanation and demo from the author's PhD defense, demo starts at 42:20
- 🤌 RAVE
-- A real-time audio synthesis (generation) model.
-- Great impact on the creative community; artists including hexorcismos, portrait xo, dadabots, etc. have experimented with it.
-- Techniques include self-supervised learning (VAE), convolutional neural networks (conv layers and residual blocks), GAN, etc.
We'll see you next week, same time, same place! 🫡