What if an AI were trained on Christmas tunes? How would it sound?
We have the answer.
We have trained a neural network to generate tunes with a Christmas feeling for you to enjoy
To get into the right Christmas spirit, we at Made by AI have trained a neural network to generate tunes that remind us of Christmas. You can listen to a few sampled tunes or download your own tune below.
Download your own tune as an MP3
We wrote a short technical blog post for those who are more interested in what is going on behind the scenes of AI Christmas tunes.
We used deep learning to generate tunes. Using neural networks to generate music was both an interesting task and an opportunity to learn more about another type of data.
When we decided to generate tunes, we had a number of constraints to stick to:
The total time we had for the hack
The amount of publicly available data
The performance track record of the deep learning model we were going to pick
The expected training time of such a model
How easy it would be to sample (generate tunes) from the trained model
We already knew that working with raw audio input would be difficult under all of these constraints, so we decided to train on musical notes instead. We limited ourselves to recurrent architectures: a plain RNN (Recurrent Neural Network) or an LSTM (Long Short-Term Memory). There are many well-written articles and examples covering these types of neural network. To get good results on longer sequences we chose the LSTM model.
LSTMs are a special kind of recurrent neural network with a memory mechanism that lets them learn long-term dependencies.
Regular recurrent neural networks have difficulty learning the relevant information from the context when there is a big gap between where that information appears and where it is needed, something that LSTMs handle much better. Consider the text “My dog has a black tail. It also...”. You know that “it” refers to the dog, while a regular RNN can lose track of that connection as the gap grows.
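To make the idea of a memory concrete, here is a minimal sketch of a single LSTM step on scalar values, following the standard LSTM gate equations. The weights are hand-picked for illustration and are not taken from the trained model; the names lstm_step, PARAMS and run are made up for this example.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h, c, params):
    """One step of a scalar LSTM cell using the standard gate equations.

    params maps each gate name to a (weight_x, weight_h, bias) tuple.
    """
    def pre_activation(gate):
        w_x, w_h, b = params[gate]
        return w_x * x + w_h * h + b

    f = sigmoid(pre_activation("forget"))       # how much old cell state to keep
    i = sigmoid(pre_activation("input"))        # how much new information to write
    g = math.tanh(pre_activation("candidate"))  # the new information itself
    o = sigmoid(pre_activation("output"))       # how much cell state to expose
    c_next = f * c + i * g
    h_next = o * math.tanh(c_next)
    return h_next, c_next


# Hand-picked illustrative weights (an assumption, not learned values):
# the cell writes when x is close to 1 and otherwise keeps what it has stored.
PARAMS = {
    "forget": (0.0, 0.0, 6.0),     # forget gate stays close to 1
    "input": (12.0, 0.0, -6.0),    # input gate opens only for large x
    "candidate": (4.0, 0.0, 0.0),
    "output": (0.0, 0.0, 6.0),
}


def run(inputs, params=PARAMS):
    """Feed a sequence of scalar inputs through the cell, starting from zero state."""
    h, c = 0.0, 0.0
    for x in inputs:
        h, c = lstm_step(x, h, c, params)
    return h, c
```

With these weights, run([1.0] + [0.0] * 20) leaves the cell state above 0.9 even though the only non-zero input was seen 20 steps earlier, while an all-zero input sequence leaves it at 0. That persistence of the cell state is the "memory" referred to above.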
A more detailed explanation of how LSTMs work can be found here
Our dataset consisted of about a hundred Christmas tunes collected in MIDI format. A MIDI file does not contain recorded audio; it stores the notes together with the length and loudness of each note. Because of this, the MIDI format is well suited to machine learning tasks. To convert between MIDI files and note sequences we used Music21, an open-source library that can read and write playable MIDI files.
The first step was to analyze the dataset by going through the existing tunes and storing the notes, chords and the sequences that were used in each tune.
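The preparation step can be sketched roughly as follows. This is a minimal illustration rather than the exact code used for the hack: it assumes each tune has already been reduced to a list of string tokens (a note name such as "E4", or a chord written as its notes joined by dots), and the function names and the window length are hypothetical.

```python
def build_vocabulary(tunes):
    """Map every distinct note/chord token to an integer id."""
    tokens = sorted({token for tune in tunes for token in tune})
    return {token: index for index, token in enumerate(tokens)}


def make_training_pairs(tunes, vocabulary, window=4):
    """Slide a fixed-size window over each tune.

    Each input is `window` consecutive token ids; the target is the id that follows.
    """
    inputs, targets = [], []
    for tune in tunes:
        ids = [vocabulary[token] for token in tune]
        for start in range(len(ids) - window):
            inputs.append(ids[start:start + window])
            targets.append(ids[start + window])
    return inputs, targets


# Toy stand-in data (not taken from the real dataset)
toy_tunes = [
    ["E4", "E4", "E4", "E4", "E4", "E4", "E4", "G4", "C4", "D4", "E4"],
    ["C4.E4.G4", "G4", "A4", "G4", "E4", "C4.E4.G4"],
]
vocabulary = build_vocabulary(toy_tunes)
inputs, targets = make_training_pairs(toy_tunes, vocabulary, window=4)
```

Each (input, target) pair asks the network to predict the next note or chord from the ones that came before it, which is the form of data a sequence model such as an LSTM is trained on.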
We used the prebuilt LSTM layers in Keras. Training consists of fitting an LSTM network that “learns” these sequences of notes and chords, with the goal of being able to generate its own sequences once trained.
For training we spun up a GPU spot instance with an NVIDIA Tesla V100-SXM2 on AWS. Training took approximately 3 hours on this instance. This was by all means more compute than strictly needed, but we knew we had little time and might need to tune the model's hyperparameters, which we ended up doing by trying out different batch sizes.
The weights from the trained model can be used to sample new Christmas tunes of any custom duration. Generating tunes is not as compute intensive as training, but it still takes time (about 40 seconds per minute of tune).
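Sampling from a trained sequence model typically works by repeatedly predicting the next token and feeding it back in. The sketch below shows that loop with a stand-in predictor. Here predict_probabilities is a hypothetical callable that would wrap the trained network, and the seeding and sampling details are assumptions rather than a description of the deployed code.

```python
import random


def generate_tokens(predict_probabilities, seed_ids, length, rng=None):
    """Generate `length` new token ids by sampling one id at a time.

    predict_probabilities(window) must return one probability per vocabulary id.
    """
    if rng is None:
        rng = random.Random()
    window = list(seed_ids)
    generated = []
    for _ in range(length):
        probabilities = predict_probabilities(window)
        next_id = rng.choices(range(len(probabilities)), weights=probabilities, k=1)[0]
        generated.append(next_id)
        window = window[1:] + [next_id]  # slide the window forward by one token
    return generated


def toy_predictor(window):
    """Stand-in for a trained model over a 3-token vocabulary.

    It strongly favours the id that follows the last one (modulo 3).
    """
    favourite = (window[-1] + 1) % 3
    return [0.9 if i == favourite else 0.05 for i in range(3)]


tune_ids = generate_tokens(toy_predictor, seed_ids=[0, 1, 2], length=16, rng=random.Random(0))
```

The requested duration determines how many tokens to generate, which is why generation time grows with the length of the tune.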
An API was written to allow communication between the website and the deployed model. The Christmas tunes can be customized in duration (min: 10s, max: 2min) and instrument (Glockenspiel, Bells or Clarinet). This is another advantage of using MIDI files: any instrument can be used to play the notes. Finally, the MIDI file is converted to an MP3 file for compatibility.
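These limits translate into a small validation step on the API side. The following is a sketch under assumptions: the function name, the parameter names and the error handling are hypothetical, and only the duration limits and instrument names come from the description above.

```python
MIN_DURATION_S = 10
MAX_DURATION_S = 120  # 2 minutes
INSTRUMENTS = ("glockenspiel", "bells", "clarinet")


def validate_request(duration_s, instrument):
    """Return a cleaned (duration, instrument) pair or raise ValueError."""
    if not MIN_DURATION_S <= duration_s <= MAX_DURATION_S:
        raise ValueError(
            f"duration must be between {MIN_DURATION_S} and {MAX_DURATION_S} seconds"
        )
    name = instrument.strip().lower()
    if name not in INSTRUMENTS:
        raise ValueError(f"instrument must be one of: {', '.join(INSTRUMENTS)}")
    return duration_s, name
```

For example, validate_request(30, "Bells") returns (30, "bells"), while a 5-second request or an unsupported instrument raises a ValueError before any generation work is started.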
Since the generation of this file can take several minutes (depending on server load) we ask the users for an email address; this way, the generation of a Christmas tune is queued on the server and an email with a download link is sent out when finished.
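The queue-and-notify flow can be sketched with Python's standard queue module. This is an illustration only: generate_tune and send_email are hypothetical placeholders for the model call and the mail service, the job fields are made up, and the URL is an example address.

```python
import queue


def process_jobs(jobs, generate_tune, send_email):
    """Work through queued requests in order and notify each user when done.

    Returns the number of jobs processed.
    """
    processed = 0
    while True:
        try:
            job = jobs.get_nowait()
        except queue.Empty:
            return processed
        download_url = generate_tune(job["duration_s"], job["instrument"])
        send_email(job["email"], download_url)
        jobs.task_done()
        processed += 1


# Hypothetical placeholder for the real model call
def fake_generate_tune(duration_s, instrument):
    return f"https://example.com/tunes/{instrument}-{duration_s}s.mp3"


sent = []  # records (recipient, url) pairs instead of sending real email
jobs = queue.Queue()
jobs.put({"email": "a@example.com", "duration_s": 30, "instrument": "bells"})
jobs.put({"email": "b@example.com", "duration_s": 60, "instrument": "clarinet"})
processed = process_jobs(jobs, fake_generate_tune, lambda to, url: sent.append((to, url)))
```

In a real deployment the worker would run in the background, separate from the web request, so that the website can respond immediately after queuing the job.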
Overall, we are satisfied with the results from this year's Christmas hack at Made by AI, especially considering the time limit we gave ourselves. It would have been fun to generate lyrics for the tunes too, or to try training on raw audio input.
In hindsight, there are a number of open-source tools available for music generation tasks that can be used. With that said, we encourage others to try generating Christmas music with other input data and other models.
More about LSTMs can be found here: http://colah.github.io/posts/2015-08-Understanding-LSTMs/
If you made it all the way here, you should check out our AI-designed necklaces.