ML Two
Lecture 02
🤗 AI for audio applications
+
🎤 Sound classification with CreateML

Welcome 👩‍🎤🧑‍🎤👨‍🎤

First of all, don't forget to confirm your attendance on Seats App!

Last lecture: train an image classifier with CreateML, using a well-prepared fruit image dataset🍎🍏

Today: train a sound classifier with CreateML, using a not-so-well-prepared environment sound dataset🔊

😎Recall from last lecture, what were the important bits from training an image classifier in CreateML?
- to tell CreateML what the training data and their corresponding labels are,
- we need to put images of the same class
- into one folder named after the class label!
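A quick way to check that a dataset follows this folder-per-class layout is to count the files in each class folder. This is only a minimal sketch: the function name `summarise_class_folders` is made up for illustration and is not part of CreateML.

```python
import os

def summarise_class_folders(dataset_dir):
    """Return {class_label: number_of_files} for a folder-per-class dataset.

    Hypothetical helper for illustration: every sub-folder of dataset_dir
    is treated as one class, and its name is treated as the class label.
    """
    summary = {}
    for entry in sorted(os.listdir(dataset_dir)):
        class_dir = os.path.join(dataset_dir, entry)
        if os.path.isdir(class_dir):  # skip loose files at the top level
            files = [f for f in os.listdir(class_dir)
                     if os.path.isfile(os.path.join(class_dir, f))]
            summary[entry] = len(files)
    return summary
```

If the result is an empty dictionary, the dataset has no class folders yet and needs to be organised first.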

Today's dataset: environment sounds
1. download
2. unzip
3. check: what is the file structure of this dataset?

All files are scrambled in one big folder 🥲

🥲Not the nice structure that CreateML recognises (actually most raw datasets do not have that nice structure.)
- We will prepare the dataset ourselves!😎
- Recall that data preparation is usually the first part of the ML development pipeline!🤗

1. DATA PREP
- - data collection (p)
- - data pre-processing (p)
2. TRAINING
- - fine tuning (p,c)
- - from scratch (p,t)
3. DEPLOYMENT (c)
p: python 🐍
c: CoreML and CreateML 🍎🤖

🤗Interactive classroom Q&A time:
where can I find the labels/classes information for this dataset?

Let's preview meta/esc50.csv
- btw what is csv...? Let's google!

Here, each row in "meta/esc50.csv" maps a filename to its category
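The csv can also be previewed in Python with pandas. The sketch below uses a tiny made-up stand-in for the real file so it runs anywhere; the column names "filename" and "category" and the three rows are assumptions for illustration, so check the header of the real csv first.

```python
import io
import pandas as pd

# Made-up stand-in for meta/esc50.csv (assumed columns: filename, category).
sample_csv = io.StringIO(
    "filename,category\n"
    "clip_001.wav,dog\n"
    "clip_002.wav,rain\n"
    "clip_003.wav,dog\n"
)

df = pd.read_csv(sample_csv)           # load the csv into a DataFrame
print(df.head())                       # show the first few rows
print(df["category"].value_counts())   # how many clips per category

# The distinct categories are the class labels for the classifier.
labels = sorted(df["category"].unique())
print(labels)
```

For the real dataset, pass the path "meta/esc50.csv" to `pd.read_csv` instead of `sample_csv`.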

We need to use the information in the csv to organise the dataset into the NICE structure for CreateML!

Let's write a cool Python script that does the organisation! 🧑‍🎤

Prepare our Python development environment (the software we use to write and run Python code):
1. Download Anaconda
2. Install Anaconda and open it
3. Install Spyder from Anaconda and open it

Familiarise ourselves with the Spyder interface; it is quite nice!

Two tricks in Spyder:
- enter #%% to create a new cell
- press Shift + Enter to run one cell, just like a Colab notebook!
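As a small illustration of the two tricks, here is what a script split into two cells might look like. The variable names and values are made up; the only Spyder-specific part is the `#%%` marker at the start of each cell.

```python
#%% Cell 1: define some data
sounds = ["dog", "rain", "dog"]

#%% Cell 2: use the data defined in cell 1
unique_sounds = sorted(set(sounds))
print(unique_sounds)
```

Cell 2 relies on `sounds`, so cell 1 has to be run first. Outside Spyder, `#%%` is an ordinary comment and the file still runs as a normal Python script.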

Python time! - the code is prepared here

- 🌶️🌶️🌶️ we can start from scratch,
- 🌶️🌶️ or from the DataPrep-todo.py,
- 🌶️ or directly run the DataPrep-complete.py that will do the job

Keywords:
create a new directory (folder) using Python code,
csv,
dataframe,
iterate rows in a dataframe,
copy-paste files
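Putting these keywords together, a data preparation script could look roughly like the sketch below. It is not the course's DataPrep-complete.py: the function name `organise_by_category`, its parameters, and the default column names "filename" and "category" are assumptions for illustration.

```python
import os
import shutil
import pandas as pd

def organise_by_category(csv_path, audio_dir, output_dir,
                         filename_col="filename", category_col="category"):
    """Copy each audio file into output_dir/<category>/ as listed in the csv.

    Hypothetical helper: the column names are assumptions, so adjust them
    to match the header of the real csv. Returns the number of files copied.
    """
    df = pd.read_csv(csv_path)                 # csv -> DataFrame
    copied = 0
    for _, row in df.iterrows():               # iterate rows in the DataFrame
        class_dir = os.path.join(output_dir, str(row[category_col]))
        os.makedirs(class_dir, exist_ok=True)  # create the class folder if missing
        src = os.path.join(audio_dir, str(row[filename_col]))
        if os.path.isfile(src):                # skip rows whose file is missing
            shutil.copy(src, class_dir)        # copy (not move) the file
            copied += 1
    return copied
```

A call might look like `organise_by_category("meta/esc50.csv", "audio", "ESC50-sorted")`, where "audio" and "ESC50-sorted" are assumed folder names. Copying rather than moving keeps the original download intact, so the script can be re-run safely.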

Step 1 Data preparation done!!! 😎

Next: summon CreateML 😈

Step 2 Train the (sound classification) model: import the prepared dataset into CreateML and train

Let's bring the model to our App!

Step 3 Deploy the (sound classification) model: export the trained model from CreateML, and import it into the template iOS app "SCDemo-Improved"
Play with it!

🌶🌶Our second AI project done!🌶🌶

🌶🌶A little summary for today's practice🌶🌶
We wrote some Python code for data pre-processing 🔨:
-- From the "os" library,
--- use the "makedirs" function to create new folders 🛸
-- From the "pandas" library,
--- use the "DataFrame" object for handling csv file data, and basically any tabular data!🍱

😎🛸Fun AI time: AI for audio applications

Example applications - 🔊🍱 Classification tasks
Example: sound source identification, music genre classification, etc.
- BirdNET
- Speaker recognition

Example applications - Music related (there are so many, and the ones listed below are far from exhaustive)
- 🤌 Music source separation
-- commercial products: lalal.ai, vocali, etc.
-- research/open source projects: demucs, spleeter, etc.
- 🤌 Music generation / audio synthesis
-- WaveNet (based on a CNN model)
-- Riffusion (based on a diffusion model, to be introduced later in ML Two!)
-- Musika (based on a GAN, to be introduced next week!)
-- Great explanation and demo from the author's PhD defense, demo starts at 42:20

Example applications - Music related (there are so many, and the ones listed below are far from exhaustive)
- 🤌 RAVE
-- A real-time audio synthesis (generation) model.
-- Great impact on the creative community; artists who have experimented with it include hexorcismos, portrait xo, dadabots, etc.
-- Techniques include self-supervised learning (VAE), convolutional neural networks (conv layers and residual blocks), GAN, etc.

We'll see you next week same time same place! 🫡