Act 1
Writing a Music Renderer from Scratch
The centerpiece is a RenderModel struct which is constructed from a Score, handling all the complicated logic: where to place note heads, stems and beams, which padding should be applied, or whether the measures fit in one row. Once the RenderModel has been created, there should be functions to create all kinds of data. This includes SVG documents, bitmaps, segmentation masks and bounding boxes. By centering everything around the RenderModel struct, we only have to calculate the placement logic once, and can then create all kinds of outputs.
The public interface will look as follows:
impl RenderModel {
/// Handles all the complicated logic to build a RenderModel
pub fn from_score(score: &Score, score_max_width: f64) -> RenderModel;
// RenderModel to SvgDocument
pub fn svg(&self) -> SvgDocument;
// RenderModel to Bitmap (= image::RgbaImage)
pub fn bitmap(&self) -> Bitmap;
// RenderModel to SegmentationMask
pub fn segmentation_mask(&self, element_classes: ElementClasses) -> SegmentationMask;
// RenderModel to BoundingBoxes (= Vec<BoundingBox>)
pub fn bounding_boxes(&self, element_classes: ElementClasses) -> BoundingBoxes;
}
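To give a feel for the workflow, here is a minimal usage sketch. Note that load_score and ElementClasses::default() are hypothetical placeholders; only the RenderModel methods above are fixed:

// Sketch only: `load_score` and `ElementClasses::default()` are
// assumed helpers, not part of the interface shown above.
let score: Score = load_score("example.musicxml");
// Build the model once; all placement logic happens here.
let model = RenderModel::from_score(&score, 1200.0);
// Derive every output format from the same model.
let svg = model.svg();
let bitmap = model.bitmap();
let mask = model.segmentation_mask(ElementClasses::default());
let boxes = model.bounding_boxes(ElementClasses::default());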
Now, for the structure of the RenderModel itself, we have to focus on the fundamental building blocks needed to render an SVG. Starting with the symbols for music notation, we narrowed them down to three categories: lines, glyphs and text. These form the ElementData enum, which is part of a RenderElement struct. Alongside the ElementData, this struct also contains an ElementClass, which holds the information about the class used for the machine learning task. In addition to that, the end-user should be able to specify which classes to use during the dataset creation covered in Act 3.
#[derive(Debug, Clone)]
enum ElementData {
Line(Line),
Glyph(rusttype::PositionedGlyph),
Text(Text),
}
#[derive(Debug, Clone)]
pub struct RenderElement {
data: ElementData,
cls: ElementClass,
}
#[derive(Debug, Clone, Copy)]
pub enum ElementClass {
Background,
AccidentalNatural,
AccidentalSharp,
AccidentalFlat,
Barline,
Beats,
BeatsType,
Beam,
ClefG,
ClefF,
NoteWhole,
NoteHalf,
NoteFilled,
NoteDot,
NoteStem,
NoteStemArm,
NoteRest,
Staff,
StaffHelper,
}
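To illustrate how the two halves of a RenderElement come into play: the renderer only dispatches on the ElementData, while the ElementClass is consulted when producing the machine learning outputs. A rough sketch of the SVG side, where the *_to_svg helpers are hypothetical placeholders:

// Sketch only: line_to_svg, glyph_to_svg and text_to_svg are assumed
// helpers that serialize each shape into an SVG fragment.
fn element_to_svg(element: &RenderElement) -> String {
    match &element.data {
        ElementData::Line(line) => line_to_svg(line),
        ElementData::Glyph(glyph) => glyph_to_svg(glyph),
        ElementData::Text(text) => text_to_svg(text),
    }
}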
Now that we can represent a single element, we have to connect them. A key part of building an SVG was creating groups, connecting them hierarchically and translating each element. This is not inherent to SVGs alone, so we define a RenderTree struct which holds data in the form of a Vec<RenderElement>, children in the form of other RenderTrees, as well as an optional translation Translate that is applied to both the data and the children. Finally, the RenderModel struct simply represents the root of the tree and serves as the public API entry point.
#[derive(Debug, Clone)]
struct RenderTree {
transform: Option<Translate>,
data: Vec<RenderElement>,
children: Vec<RenderTree>,
}
#[derive(Debug, Clone)]
struct Translate {
x: f64,
y: f64,
}
#[derive(Debug, Clone)]
pub struct RenderModel {
tree: RenderTree,
}
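To make the translation semantics concrete, flattening the tree boils down to a depth-first walk that accumulates offsets along the way. A minimal sketch, assuming that absolute positions paired with elements are all the output writers need:

// Sketch: recursively flatten the tree into (x, y, element) triples.
// A node's Translate shifts its own data and is inherited by its
// children, matching the semantics described above.
fn flatten(tree: &RenderTree, x: f64, y: f64, out: &mut Vec<(f64, f64, RenderElement)>) {
    // Apply this node's optional translation to the running offset.
    let (x, y) = match &tree.transform {
        Some(t) => (x + t.x, y + t.y),
        None => (x, y),
    };
    for element in &tree.data {
        out.push((x, y, element.clone()));
    }
    for child in &tree.children {
        flatten(child, x, y, out);
    }
}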
Sparing the implementation details, we are now able to generate SVGs, bitmaps, segmentation masks or bounding boxes given a RenderModel instance.
With that, we are set to start the real part: creating some cool datasets to use in a machine learning pipeline!