ML Two
Lecture 02
AI for audio applications
+
Sound classification with CreateML
Welcome 👩‍🎤🧑‍🎤👨‍🎤
First of all, don't forget to confirm your attendance on the Seats App!
Last lecture: we trained an image classifier with CreateML, using a well-prepared fruit image dataset
Today: we train a sound classifier with CreateML, using a not-so-well-prepared environmental sound dataset
Recall from last lecture: what were the important bits of training an image classifier in CreateML?
- to tell CreateML what the training data and their corresponding labels are,
- we need to put images of the same class
- into one folder named after the class label!
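To make the "one folder per class" idea concrete, here is a minimal Python sketch that builds a tiny throwaway dataset and reads the class labels back from the folder names. The labels and file names (apple, banana, a1.jpg, ...) and the helper `list_classes` are made up for illustration; this is not how CreateML itself is implemented, only a way to inspect the layout it expects.

```python
import os
import tempfile


def list_classes(dataset_dir):
    """Return {class_label: number_of_files} for a folder-per-class dataset."""
    counts = {}
    for entry in sorted(os.listdir(dataset_dir)):
        class_dir = os.path.join(dataset_dir, entry)
        if os.path.isdir(class_dir):  # each sub-folder name is one class label
            counts[entry] = len(os.listdir(class_dir))
    return counts


# Build a tiny throwaway dataset (hypothetical labels and file names).
demo_root = tempfile.mkdtemp()
demo_files = {"apple": ["a1.jpg", "a2.jpg"], "banana": ["b1.jpg"]}
for label, names in demo_files.items():
    os.makedirs(os.path.join(demo_root, label), exist_ok=True)
    for name in names:
        open(os.path.join(demo_root, label, name), "w").close()  # empty placeholder file

class_counts = list_classes(demo_root)
print(class_counts)
```

Running the same helper on a real dataset folder is a quick way to check the structure before importing it into CreateML.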
Today's dataset:
environment sounds
1. download
2. unzip
3. check: what is the file structure of this dataset?
All files are scrambled in one big folder 🥲
🥲 Not the nice structure that CreateML recognises (in fact, most raw datasets do not have that nice structure).
- We will prepare the dataset ourselves!
- Recall that data preparation is usually the first part of the ML development pipeline!
1. DATA PREP
-- data collection (p)
-- data pre-processing (p)
2. TRAINING
-- fine-tuning (p,c)
-- from scratch (p,t)
3. DEPLOYMENT (c)
p: Python
c: CoreML and CreateML
Interactive classroom Q&A time:
Where can I find the label/class information for this dataset?
Let's preview meta/esc50.csv
- btw what is csv...? Let's google!
Here, each row in "meta/esc50.csv" maps a filename to its category.
We need to use the information in the CSV to organize the dataset into the NICE structure for CreateML!
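As a rough sketch of what "each row maps a filename to its category" looks like in code, the snippet below loads a few CSV rows with pandas and builds a filename-to-category lookup. The column names `filename` and `category` and the sample rows are assumptions for illustration; check the real header of meta/esc50.csv before relying on them. To keep the example self-contained, it reads from an in-memory string instead of the real file.

```python
import io

import pandas as pd

# A few illustrative rows standing in for meta/esc50.csv (assumed column names).
sample_csv = io.StringIO(
    "filename,fold,target,category\n"
    "1-100032-A-0.wav,1,0,dog\n"
    "1-100038-A-14.wav,1,14,chirping_birds\n"
    "1-100210-A-36.wav,1,36,vacuum_cleaner\n"
)

# With the real dataset this would be pd.read_csv("meta/esc50.csv").
meta = pd.read_csv(sample_csv)
print(meta.head())  # preview the first rows of the table

# Build a simple lookup: filename -> category (class label).
filename_to_category = dict(zip(meta["filename"], meta["category"]))
print(filename_to_category["1-100032-A-0.wav"])
```

The `meta` variable is a pandas DataFrame, which is just a table we can inspect, filter and loop over.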
Let's write a cool Python script that does the organization! 🧑‍🎤
Prepare our Python development environment (the software we use to write and run Python code):
1. Download Anaconda
2. Install Anaconda and open it
3. Install Spyder from Anaconda and open it
Let's familiarise ourselves with the Spyder interface; it is quite nice!
Two tricks in Spyder:
- enter #%% to create a new cell
- press shift + enter to run one cell, just like in a Colab notebook!
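A tiny sketch of what a script with two cells looks like; the variable names are made up for illustration. Each #%% line starts a new cell, and variables from a cell that has already been run stay available to later cells. The file also still runs top to bottom as an ordinary Python script.

```python
#%% Cell 1: set up some data
numbers = [1, 2, 3]

#%% Cell 2: use the data created in the previous cell
total = sum(numbers)
print(total)
```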
- 🌶️🌶️🌶️ we can start from scratch,
- 🌶️🌶️ or from DataPrep-todo.py,
- 🌶️ or directly run DataPrep-complete.py, which will do the job
Keywords:
create a new directory (folder) using Python code,
csv,
dataframe,
iterate over the rows in a dataframe,
copy-paste files
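The keywords above can be put together roughly as follows. This is a minimal sketch, not the contents of DataPrep-complete.py: the function name `organise_by_category`, the column names `filename` and `category`, and the demo file names are assumptions for illustration. The demo builds a tiny fake dataset in a temporary folder so the script runs anywhere; with the real dataset you would point it at meta/esc50.csv and the folder that holds the audio files.

```python
import os
import shutil
import tempfile

import pandas as pd


def organise_by_category(csv_path, audio_dir, output_dir):
    """Copy every file listed in the CSV into output_dir/<category>/."""
    meta = pd.read_csv(csv_path)              # csv -> dataframe
    for _, row in meta.iterrows():            # iterate over the rows
        class_dir = os.path.join(output_dir, row["category"])
        os.makedirs(class_dir, exist_ok=True)  # create the class folder if needed
        src = os.path.join(audio_dir, row["filename"])
        shutil.copy(src, os.path.join(class_dir, row["filename"]))  # copy-paste the file
    return meta["category"].nunique()         # number of classes found


# --- Demo on a tiny fake dataset (hypothetical file names and labels) ---
root = tempfile.mkdtemp()
audio_dir = os.path.join(root, "audio")
os.makedirs(audio_dir)
rows = [("a.wav", "dog"), ("b.wav", "rain"), ("c.wav", "dog")]
for filename, _ in rows:
    open(os.path.join(audio_dir, filename), "w").close()  # empty placeholder file

csv_path = os.path.join(root, "meta.csv")
pd.DataFrame(rows, columns=["filename", "category"]).to_csv(csv_path, index=False)

output_dir = os.path.join(root, "organised")
n_classes = organise_by_category(csv_path, audio_dir, output_dir)
print(n_classes, sorted(os.listdir(output_dir)))
```

Copying (rather than moving) the files keeps the original download untouched, so the script can be re-run safely if something goes wrong.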
Step 1 Data preparation done!!!
Next: summon CreateML
Step 2 Train the (sound classification) model: import the prepared dataset into CreateML and train
Let's bring the model to our App!
Step 3 Deploy the (sound classification) model: export the trained model from CreateML, and import it into the template iOS App "SCDemo-Improved"
Play with it!
Our second AI project done!
A little summary of today's practice
We wrote some Python code for data pre-processing:
-- From the "os" library,
--- use the "makedirs" function to create new folders
-- From the "pandas" library,
--- use the "DataFrame" object for handling CSV file data, and basically any tabular data!
Fun AI time: AI for audio applications
Example applications - Classification tasks
Example: sound source identification, music genre classification, etc.
- BirdNET
- Speaker recognition
Example applications - Music related (there are so many; the list below is far from exhaustive)
- Music source separation
-- commercial products: lalal.ai, vocali, etc.
-- research/open source projects: demucs, spleeter, etc.
- Music generation / audio synthesis
-- WaveNet (based on a CNN model)
-- Riffusion (based on a diffusion model, to be introduced later in ML Two!)
-- Musika (based on a GAN, to be introduced next week!)
-- Great explanation and demo from the author's PhD defense, demo starts at 42:20
- RAVE
-- A real-time audio synthesis (generation) model.
-- Great impact on the creative community; artists who have experimented with it include hexorcismos, portrait xo, dadabots, etc.
-- Techniques include self-supervised learning (VAE), convolutional neural networks (conv layers and residual blocks), GANs, etc.
We'll see you next week, same time, same place! 🫡