Optical Music Recognition - From Dataset Creation to Inference

As a short disclaimer, I have a personal interest in this project to utilize my time in Prague at CVUT during my Erasmus semester to achieve the following things: Following these higher order goals, I want to create a library and application in the field of optical music recognition (OMR).
More specifically, This blog focuses on the necessary steps to achieve a minimum pipeline to phrase optical music recognition as machine learning task in four acts.

Act 1

Writing a Music Renderer From Scratch

Act 2

Filetype Agnostic Music Rendering?

Act 3

Creating Simple Datasets for OMR

Act 4

Three Machine Learning Approaches

TLDR

Following Act 1 to Act 4, I show the implementation and usage of the notensatz library to render music. Due to the design of a filetype agnostic rendering, it is possible to create various formats that come in handy when creating datasets for machine learning tasks. In combination with the typestate builder pattern and a RandomConfig struct, creating such datasets is incredibly easy, user friendly and fast. Three different machine learning techniques to solve a simple toy example have been implemented, compared and evaluated. They include a pixel-wise semantic segmentation, a simple regression as well as a bounding-box regression.

Alongside other use-cases for this library, it may serve as an educational toolbox to teach and explore machine learning topics. I heavily promote the field of optimal music recognition for educational purposes as it is