https://github.com/zharry29/drums-with-llm
We present ongoing work and preliminary findings on whether deep models can transfer knowledge from language to music, by finetuning large language models pre-trained on a massive text corpus on only hundreds of MIDI files of drum performances.
Models that are not pre-trained (Transformer) show no such ability beyond naive repetition.
Evaluating generated music is a challenging task, and evaluating drum grooves is more so, given little precedent in the literature. Hence, we propose a tailored structural evaluation method.
First, the drum set is one of the most common and important instruments in many genres of music such as jazz, funk, blues, gospel, latin, pop, rock, metal, etc.
Second, the symbolic representation of the drum set is simpler than most pitched
instruments, as each note does not have a pitch but corresponds to a hit on one drum.
Third, a drum set performance typically enjoys more degrees of freedom with regard to the audience’s aesthetics than many other instruments.
Finetune a state-of-the-art LLM, GPT3, on the Groove dataset.
Google’s Groove MIDI Dataset is the largest and the most high-quality to date, containing 1,150 MIDI files and over 22,000 measures of drumming by 10 professional drummers.
We re-purpose the dataset to study drum composition.
To help LLMs identify the boundary between measures, we add a newline of “SEP” between every 16 lines (a measure) and a newline of “END” after the final line.
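A rough sketch of this formatting step, assuming each time step has already been rendered as one line of text. The per-line encoding (`"x"` below) is a placeholder and the function name `add_boundaries` is hypothetical; neither is taken from the paper or its repository.

```python
def add_boundaries(step_lines, steps_per_measure=16):
    """Insert a "SEP" line between measures and an "END" line at the end.

    step_lines: list of strings, one per time step (encoding is assumed).
    """
    out = []
    for i, line in enumerate(step_lines):
        # A new measure starts every `steps_per_measure` lines (not at i == 0).
        if i > 0 and i % steps_per_measure == 0:
            out.append("SEP")
        out.append(line)
    out.append("END")
    return "\n".join(out)


# Two measures of placeholder time steps.
example = add_boundaries(["x"] * 32)
```

With two measures as input, the result has one "SEP" line between them and a single "END" line at the bottom.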
6 drum-set components (2^6 combinations) x 64 time steps
The model is given the first 2 measures and must complete the remaining 14 measures of the groove.
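The prompt/completion split could be sketched as follows, assuming a groove is stored as text with a "SEP" line between measures and an "END" line at the bottom. The helper name `split_prompt_completion` and the choice to end the prompt with a "SEP" line are illustrative assumptions, not the authors' code.

```python
def split_prompt_completion(groove_text, n_prompt_measures=2):
    """Split a SEP/END-delimited groove into (prompt, completion) strings."""
    body = groove_text.strip()
    # Drop the trailing END marker before splitting into measures.
    if body.endswith("END"):
        body = body[: -len("END")].rstrip("\n")
    measures = body.split("\nSEP\n")
    # Assumption: the prompt ends with a SEP line so the model starts a new measure.
    prompt = "\nSEP\n".join(measures[:n_prompt_measures]) + "\nSEP\n"
    completion = "\nSEP\n".join(measures[n_prompt_measures:]) + "\nEND"
    return prompt, completion


# Build a 16-measure placeholder groove (16 time steps per measure).
groove = "\nSEP\n".join("\n".join([f"m{k}"] * 16) for k in range(16)) + "\nEND"
prompt, completion = split_prompt_completion(groove)
```

Concatenating `prompt` and `completion` reproduces the original text, which makes it easy to check that no time step was lost in the split.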
Perplexity
Perplexity is a metric used to measure how well a language model predicts a sequence of words or a given text. It quantifies the level of uncertainty or confusion of the model when faced with predicting the next word. A lower perplexity value indicates that the language model is more accurate and confident in its predictions, while a higher perplexity value suggests more uncertainty and less accurate predictions.
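As a minimal illustration, perplexity can be computed as the exponential of the average negative log-likelihood per token. The sketch below assumes the per-token log-probabilities (natural log) are already available from the model; the function name `perplexity` is just a label for this example.

```python
import math


def perplexity(token_logprobs):
    """Return exp of the average negative log-likelihood (natural log)."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    avg_nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(avg_nll)


# A model that assigns probability 0.25 to every token is as uncertain
# as a uniform choice among 4 options, so its perplexity is 4.
uniform4 = perplexity([math.log(0.25)] * 8)
```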
from https://towardsdatascience.com/the-relationship-between-perplexity-and-entropy-in-nlp-f81888775ccc
Structural similarity
from Museformer: Transformer with Fine- and Coarse-Grained Attention for Music Generation https://arxiv.org/abs/2210.10349
Pattern and Fill analysis
To classify each measure as either a pattern or a fill, we take a sliding window of size 3 centered at some measure m_i and calculate the edit distances between this measure and its two neighbors.
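The window computation might look like the sketch below. Several things here are assumptions and not stated in these notes: measures are represented as lists of time-step strings, the edit distance is plain Levenshtein over time steps, and the decision rule in `classify` (with its `threshold`) is a made-up placeholder, since the actual rule is not given above.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (e.g. lists of time steps)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def neighbor_distances(measures, i):
    """Distances from measure i to its left and right neighbors (None at edges)."""
    left = edit_distance(measures[i], measures[i - 1]) if i > 0 else None
    right = (edit_distance(measures[i], measures[i + 1])
             if i + 1 < len(measures) else None)
    return left, right


def classify(measures, i, threshold=4):
    """ASSUMED rule: a measure is a fill if it is far from every neighbor it has."""
    dists = [d for d in neighbor_distances(measures, i) if d is not None]
    return "fill" if dists and min(dists) >= threshold else "pattern"


# Toy groove: a repeated 16-step pattern with one all-snare measure.
pattern = ["k", "h", "s", "h"] * 4
fill = ["s"] * 16
measures = [pattern, pattern, fill, pattern]
label = classify(measures, 2)
```

Measures at either end of a groove have only one neighbor, so any real rule would need to treat them separately; this sketch simply uses whichever neighbor exists.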
Inter-centroid distance: between pattern and fill
Intra-centroid distance: between measures in a group of pattern or fill
Concretely, all drum grooves produced via different means are shuffled and randomly presented to one of the authors who has had years of training in drumming.
• Is the groove repetitive, meaning there is little or no variation among measures?
• Is the groove consistent, meaning there is some variation among measures but a steady rhythmic idea (specifically, the back-beat placement) can be followed?
• Is the groove chaotic, meaning there is either too much variation, or a lack of a clear rhythmic idea?
• Does the groove contain any reasonable drum fill?
For the smaller GPT3 Ada model, the observation holds to a larger extent, with more inconsistent grooves and fewer fills.