Convolutional Networks Zoo

I’ve always been fascinated by how a machine can detect known objects in an image… Yes, it’s weird… I know…

But if we imagine the range of applications for this kind of algorithm, we may be surprised… From autonomous vehicles to automatic defect detection, crowd counting, etc., they are everywhere…

So I decided to give it a shot and build several models with different techniques:

Classification -> 1 image = 1 label
Segmentation -> 1 pixel = 1 label
Localisation / Detection -> object detection and localisation (bounding boxes)

For this purpose, I used Keras and re-implemented the models with it.
It’s been challenging, especially for YOLO (localisation), which needs a custom loss function because it predicts bounding boxes instead of a single label…

You can find them on my GitHub.

Classification is the simplest task and uses only simple convolutional layers followed by some Dense (fully connected) layers. The number of layers depends on your problem’s complexity…
You can also output more than one label per image (imagine a self-driving car that decides to turn left and to speed up from the same picture… like I did before).
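To make the "more than one label per image" idea concrete, here is a minimal sketch in plain Python of how independent sigmoid outputs can be turned into several labels at once. The label names, the raw scores and the 0.5 threshold are all made up for the illustration; they are assumptions, not values from my models.

```python
import math

def sigmoid(x):
    """Squash a raw score into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))

def decode_multilabel(logits, labels, threshold=0.5):
    """Keep every label whose sigmoid probability reaches the threshold."""
    return [name for name, z in zip(labels, logits) if sigmoid(z) >= threshold]

# Hypothetical label set and raw network outputs for a single picture.
LABELS = ["turn_left", "turn_right", "speed_up", "brake"]
logits = [2.1, -3.0, 0.8, -1.5]

decoded = decode_multilabel(logits, LABELS)
print(decoded)
```

Each output is judged on its own, so several labels can be active for the same image, unlike a softmax where the classes compete with each other.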

Segmentation is as simple as the previous task, but this time we classify the pixels themselves (or small groups of pixels)… So the network should output W x H x C elements, with W the image’s width, H the image’s height and C the number of classes into which you want to classify. I used a U-Net Keras implementation for this purpose.
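Whatever the architecture, the last step can be sketched with NumPy: take one score per class for every pixel and keep the best class. The tiny image size, the 3 classes and the random scores below are assumptions standing in for a real network output. Note that arrays are usually stored rows first, so the shape is written H x W x C here.

```python
import numpy as np

H, W, C = 4, 6, 3  # hypothetical tiny image: 4 rows, 6 columns, 3 classes

# Stand-in for the network output: one score per class for every pixel.
rng = np.random.default_rng(0)
scores = rng.random((H, W, C))

# 1 pixel = 1 label: keep the index of the highest-scoring class.
label_map = scores.argmax(axis=-1)      # shape (H, W)

# Going the other way, a label map becomes a one-hot training target.
one_hot = np.eye(C)[label_map]          # shape (H, W, C)

print(label_map.shape, one_hot.shape)
```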

Last but not least, and maybe the most challenging and interesting in my opinion… YOLO!

You Only Look Once is an algorithm which splits the image into 7 x 7 cells and predicts, for each cell, the probability of an object having its center in that cell, the width and height of the bounding box, along with a class label and confidence…
So the output of the model should be S x S x (5 + C) if we detect only one object per cell, with S the grid size (7 here) and C the number of classes (the “5” represents the 4 coordinates (x, y, w, h) of a bounding box plus the “confidence of the model in its prediction”).
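As a rough sketch of what such a target looks like, the snippet below writes one ground-truth box into an S x S x (5 + C) grid with NumPy. The channel order (x, y, w, h, confidence, then the classes), the normalised box format and the helper name `encode_box` are assumptions made for the illustration, not necessarily what a given implementation uses.

```python
import numpy as np

def encode_box(target, box, class_id, S):
    """Write one ground-truth box into an S x S x (5 + C) target grid.

    box = (cx, cy, w, h), all normalised to [0, 1] relative to the image.
    Assumed channel layout: [x, y, w, h, confidence, one-hot classes...].
    Returns the (row, col) of the cell that owns the box center.
    """
    cx, cy, w, h = box
    col = min(int(cx * S), S - 1)          # cell containing the center
    row = min(int(cy * S), S - 1)
    target[row, col, 0] = cx * S - col     # x offset inside the cell
    target[row, col, 1] = cy * S - row     # y offset inside the cell
    target[row, col, 2] = w                # width relative to the image
    target[row, col, 3] = h                # height relative to the image
    target[row, col, 4] = 1.0              # an object is centered here
    target[row, col, 5 + class_id] = 1.0   # one-hot class label
    return row, col

S, C = 3, 2  # hypothetical 3 x 3 grid and 2 classes
target = np.zeros((S, S, 5 + C), dtype=np.float32)
row, col = encode_box(target, (0.5, 0.6, 0.4, 0.7), class_id=1, S=S)

print(target.shape, (row, col))
```

Every other cell stays at zero, which tells the loss that no object has its center there.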

Example on a 3 x 3 grid:

All you have to do is build your data in the right shape and you’re good to go!

This dog is so famous now!

Then you need to write the loss by hand because Keras can’t do it for you…
Basically, it computes the squared difference between the predicted x, y, w, h values, classes and confidence and the ground truth. It also uses “Intersection over Union” (IoU) to consider only one box for a given object…
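The two ingredients can be sketched in plain Python. This is only an illustration of the idea under my own assumptions (boxes given as center x, center y, width, height; helper names `iou` and `squared_error` invented here), not the actual Keras loss, which works on whole tensors and weights its terms.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (cx, cy, w, h)."""
    ax1, ay1 = box_a[0] - box_a[2] / 2, box_a[1] - box_a[3] / 2
    ax2, ay2 = box_a[0] + box_a[2] / 2, box_a[1] + box_a[3] / 2
    bx1, by1 = box_b[0] - box_b[2] / 2, box_b[1] - box_b[3] / 2
    bx2, by2 = box_b[0] + box_b[2] / 2, box_b[1] + box_b[3] / 2
    # Width and height of the overlap, clamped to zero when boxes are apart.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = box_a[2] * box_a[3] + box_b[2] * box_b[3] - inter
    return inter / union if union > 0 else 0.0

def squared_error(pred, truth):
    """Sum of squared differences between two equally long sequences."""
    return sum((p - t) ** 2 for p, t in zip(pred, truth))

# Hypothetical ground truth and three candidate predictions.
truth = (2, 2, 2, 2)
preds = [(5, 5, 2, 2), (3, 2, 2, 2), (2, 2, 2, 2)]

# Keep only the candidate that overlaps the ground truth the most.
best = max(range(len(preds)), key=lambda i: iou(preds[i], truth))
print(best, iou(preds[best], truth), squared_error(preds[best], truth))
```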

To be honest, I took it from here (and made some modifications to make it fit my data):
https://blog.emmanuelcaradec.com/humble-yolo-implementation-in-keras/
This is an awesome starting point!

Neural Network: Music Composer

Machines can’t do ART…

But they are very good at learning patterns and reproducing them, so let’s try to build an “AI” (or whatever you want to call it…) that can produce (original) parts of music!

And like always, everything starts with data…

While computers are not that good at dealing with audio, the MIDI format fits the job perfectly (for non-musicians, MIDI files are like text files where you write the notes/lengths/velocities/etc. as successive numbers). So we take a bunch of piano solo MIDI files, put them all together, and then split them into chunks of N notes…
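To give an idea of the "successive numbers" part, here is a small sketch in plain Python that turns a list of note events into integer ids and one-hot vectors. It assumes the MIDI files have already been parsed into (pitch, length, velocity) tuples; that tuple format, the sample events and the helper names are hypothetical.

```python
def build_vocab(events):
    """Map each distinct event to an integer id, in order of first appearance."""
    vocab = {}
    for event in events:
        if event not in vocab:
            vocab[event] = len(vocab)
    return vocab

def one_hot(index, size):
    """Return a list of zeros with a single 1 at the given index."""
    vector = [0] * size
    vector[index] = 1
    return vector

# Hypothetical parsed events: (MIDI pitch, length in beats, velocity).
events = [(60, 0.5, 80), (64, 0.5, 80), (60, 0.5, 80), (67, 1.0, 90)]

vocab = build_vocab(events)
ids = [vocab[event] for event in events]
encoded = [one_hot(i, len(vocab)) for i in ids]

print(ids)
print(encoded[3])
```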

We will give those successive chunks of notes to an Artificial Neural Network that will have to predict the next single note, knowing the N past notes.
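The "N past notes, 1 next note" setup boils down to a sliding window. Here is a minimal sketch in plain Python; the helper name `make_windows`, the note values and N = 3 are only examples.

```python
def make_windows(notes, n):
    """Slide a window of n notes over the sequence.

    Returns (inputs, targets): each input is n consecutive notes and the
    matching target is the single note that follows them.
    """
    inputs, targets = [], []
    for i in range(len(notes) - n):
        inputs.append(notes[i:i + n])
        targets.append(notes[i + n])
    return inputs, targets

# Hypothetical sequence of MIDI pitch numbers (start of a C major scale).
notes = [60, 62, 64, 65, 67, 69]
X, y = make_windows(notes, n=3)

for chunk, next_note in zip(X, y):
    print(chunk, "->", next_note)
```

A sequence of L notes gives L - N training pairs, so even a few files produce a lot of examples.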

In order to do so, we process the MIDI files so the Neural Net can read them, and, with the help of several LSTM (Long Short-Term Memory) layers, after approx. 10 epochs of learning (100 times the whole dataset), we achieved the result you can hear here:

These 3 piano parts have been entirely improvised by the Neural Network; no post-processing has been done, except that we used a piano patch to make it sound better…

Not that bad for a machine!

But one of these songs turned out to be a perfect copy of the original (a beautiful example of “overfitting”). Can you guess which one?

As usual, you can find all the code on my GitHub!

ImagesClassifier_GUI

Machine Learning Image Classification in a Graphical User Interface

Take pictures
Label them
Define a Model Design
Train your model
Test your model

Without a single line of code!
