Act 2
Filetype Agnostic Music Rendering?
impl Note {
    pub fn note() -> PitchedNoteBuilder<NoPitch, NoNoteType>;
    pub fn rest() -> NoteRestBuilder<NoNoteType>;
    pub fn grace() -> NoteGraceBuilder<NoPitch, NoNoteType>;
}
pub struct PitchedNoteBuilder<P, N> {
    pitch: P,
    note_type: N,
    dot: Option<Dot>,
    stem: Option<Stem>,
    accidental: Option<Accidental>,
}
impl<P, N> PitchedNoteBuilder<P, N> {
    /// Assigns a pitch, REQUIRED.
    pub fn pitch(self, pitch: Pitch) -> PitchedNoteBuilder<HasPitch, N>;
    /// Assigns a note_type, REQUIRED.
    pub fn note_type(self, note_type: NoteType) -> PitchedNoteBuilder<P, HasNoteType>;
    /// Specifies a dot will be used.
    pub fn dot(mut self) -> Self;
    /// Assigns a stem.
    pub fn stem(mut self, stem: Stem) -> Self;
    /// Assigns an accidental.
    pub fn accidental(mut self, accidental: Accidental) -> Self;
    // ...
}
impl PitchedNoteBuilder<HasPitch, HasNoteType> {
    /// Finalizes the builder and returns a `Note` of type PitchedNote.
    ///
    /// `NoteType` has a higher precedence over stems.
    /// If a stem is assigned to a `NoteType::Whole` note, the stem is ignored.
    /// For all other `NoteType`s, a stem is mandatory; `Stem::Up` is selected by default.
    pub fn build(self) -> Note;
}
Thanks to the generic state parameters, a Note can only be finalized (by calling build()) once a Pitch as well as a NoteType have been provided.
Because invalid notes are rejected at compile time, this builder pattern is called a Typestate Builder Pattern.
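To make the mechanism concrete, here is a minimal, self-contained sketch of a typestate builder. It is not the actual implementation: `NoteBuilder`, the simplified `Pitch` and `Note` structs, and the idea of storing the value inside the marker types (`HasPitch(Pitch)`, `HasNoteType(NoteType)`) are assumptions made for illustration. Only the marker names follow the signatures above.

```rust
// Marker types encoding what the builder already knows.
// ASSUMPTION: the "Has*" markers carry the value; the real crate may differ.
pub struct NoPitch;
pub struct HasPitch(Pitch);
pub struct NoNoteType;
pub struct HasNoteType(NoteType);

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum NoteType { Whole, Half, Quarter }

// Simplified stand-in for the real `Pitch`.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Pitch { pub step: char, pub octave: u8 }

// Simplified stand-in for the real `Note`.
#[derive(Debug, PartialEq)]
pub struct Note { pub pitch: Pitch, pub note_type: NoteType, pub dotted: bool }

pub struct NoteBuilder<P, N> {
    pitch: P,
    note_type: N,
    dotted: bool,
}

impl NoteBuilder<NoPitch, NoNoteType> {
    pub fn new() -> Self {
        NoteBuilder { pitch: NoPitch, note_type: NoNoteType, dotted: false }
    }
}

impl<P, N> NoteBuilder<P, N> {
    // Setting a required property changes the builder's type.
    pub fn pitch(self, pitch: Pitch) -> NoteBuilder<HasPitch, N> {
        NoteBuilder { pitch: HasPitch(pitch), note_type: self.note_type, dotted: self.dotted }
    }
    pub fn note_type(self, note_type: NoteType) -> NoteBuilder<P, HasNoteType> {
        NoteBuilder { pitch: self.pitch, note_type: HasNoteType(note_type), dotted: self.dotted }
    }
    // Optional properties keep the type unchanged.
    pub fn dot(mut self) -> Self {
        self.dotted = true;
        self
    }
}

// `build` exists only once both required properties are present.
impl NoteBuilder<HasPitch, HasNoteType> {
    pub fn build(self) -> Note {
        Note { pitch: self.pitch.0, note_type: self.note_type.0, dotted: self.dotted }
    }
}
```

In this sketch, `NoteBuilder::new().pitch(p).build()` does not compile, because no `build` method is defined for `NoteBuilder<HasPitch, NoNoteType>`. The missing note type becomes a compiler error rather than a runtime check.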
With this, creating a Note is as easy and pretty as the following:
let note = Note::note()
    .pitch(Pitch::new(Step::A, 4))
    .note_type(NoteType::Quarter)
    .stem(Stem::Up)
    .build();
Let's extend this pattern to generate random notes!
The idea is to be able to call random() at any given time.
This will work similarly to finalizing a note with build(), but it chooses missing properties randomly.
To do so, we implement the Distribution trait of the rand crate for each enum and struct we defined.
Thanks to Rust's strong typing, this task is straightforward, as the boilerplate needed for Step shows:
use rand::distributions::{Distribution, Standard};

impl Distribution<Step> for Standard {
    fn sample<R: rand::Rng + ?Sized>(&self, rng: &mut R) -> Step {
        // draw one of the seven steps with equal probability
        match rng.gen_range(0..=6) {
            0 => Step::C,
            1 => Step::D,
            2 => Step::E,
            3 => Step::F,
            4 => Step::G,
            5 => Step::A,
            6 => Step::B,
            _ => unreachable!(),
        }
    }
}
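The "fill in whatever is missing" behaviour of random() can be sketched as follows. This is only an illustration under several assumptions: the builder here uses plain `Option` fields instead of typestates to stay short, `RandomConfig` is reduced to three lists of allowed values, and a tiny xorshift generator (`XorShift`, a made-up helper) stands in for `rand` so that the sketch has no dependencies.

```rust
/// Tiny xorshift64 generator; a dependency-free stand-in for `rand::Rng`.
pub struct XorShift(u64);

impl XorShift {
    pub fn new(seed: u64) -> Self {
        // xorshift must not start from an all-zero state
        XorShift(if seed == 0 { 0x9E37_79B9_7F4A_7C15 } else { seed })
    }
    fn next(&mut self) -> u64 {
        self.0 ^= self.0 << 13;
        self.0 ^= self.0 >> 7;
        self.0 ^= self.0 << 17;
        self.0
    }
    /// Picks one element of a NON-EMPTY slice (slightly biased, fine for a sketch).
    pub fn choose<T: Copy>(&mut self, items: &[T]) -> T {
        items[(self.next() % items.len() as u64) as usize]
    }
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Step { C, D, E, F, G, A, B }
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum NoteType { Whole, Half, Quarter }
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Stem { Up, Down }

const STEPS: [Step; 7] = [Step::C, Step::D, Step::E, Step::F, Step::G, Step::A, Step::B];

// Simplified stand-in for the real `Note`.
#[derive(Debug, PartialEq)]
pub struct Note { pub step: Step, pub octave: u8, pub note_type: NoteType, pub stem: Stem }

/// ASSUMPTION: a reduced `RandomConfig` holding the allowed values per property.
pub struct RandomConfig {
    pub octaves: Vec<u8>,
    pub note_types: Vec<NoteType>,
    pub stems: Vec<Stem>,
}

#[derive(Default)]
pub struct NoteBuilder {
    step: Option<Step>,
    octave: Option<u8>,
    note_type: Option<NoteType>,
    stem: Option<Stem>,
}

impl NoteBuilder {
    pub fn step(mut self, step: Step) -> Self { self.step = Some(step); self }
    pub fn stem(mut self, stem: Stem) -> Self { self.stem = Some(stem); self }

    /// Like `build()`, but samples every property that was not set explicitly.
    pub fn random(self, rng: &mut XorShift, config: &RandomConfig) -> Note {
        Note {
            step: self.step.unwrap_or_else(|| rng.choose(&STEPS)),
            octave: self.octave.unwrap_or_else(|| rng.choose(&config.octaves)),
            note_type: self.note_type.unwrap_or_else(|| rng.choose(&config.note_types)),
            stem: self.stem.unwrap_or_else(|| rng.choose(&config.stems)),
        }
    }
}
```

Properties set before the call, such as the stem in `NoteBuilder::default().stem(Stem::Down).random(&mut rng, &config)`, are kept as they are. Everything else is drawn from the values the config allows.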
In combination with a RandomConfig specifying which properties should be turned on, the usage will look as follows:
// initialize a RandomConfig
let random_config = RandomConfig::builder()
    .octaves(vec![4, 5]) // octave limited to either 4 or 5
    .note_types(vec![NoteType::Quarter]) // only Quarter notes
    .stems(vec![Stem::Up, Stem::Down]) // either Up or Down
    .accidentals_maybe_of(vec![Accidental::Flat, Accidental::Sharp]) // sometimes, a Flat or Sharp appears
    .build();
let mut rng = rand::thread_rng();
// create 500 samples and store them to disk
for _ in 0..500 {
    // each Score has two notes
    let score = Score::builder()
        .add_note(Note::note().random(&mut rng, &random_config))
        .add_note(Note::note().random(&mut rng, &random_config))
        .build();
    // render Score
    let model = RenderModel::from_score(score, MAX_WIDTH);
    model.bitmap();
    model.segmentation_mask(&element_classes);
    model.bounding_boxes(&element_classes);
    // store to disk...
}
As visible above, creating a segmentation mask as well as bounding boxes needs element_classes as arguments.
These element classes were shown in Act 2, and summarize all the existing distinctions of musical symbols and classes.
However, most of the time the user doesn't care about most classes and e.g. just wants to predict the note heads or note stems.
For this purpose, there is an ElementClasses struct which can be configured as follows:
// specify which classes should be used
let element_classes = ElementClasses::builder()
    .set_class(ElementClass::Background, 0)
    .set_class(ElementClass::NoteWhole, 1)
    .set_class(ElementClass::NoteHalf, 1)
    .set_class(ElementClass::NoteFilled, 1)
    .set_class(ElementClass::Staff, 2)
    .build()
    .unwrap();
Apart from the Background class, which is fixed to class 0, each note head (no matter the type) belongs to class 1 and the staff lines belong to class 2.
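One way to picture such a many-to-one mapping is a small lookup table from element to class id. The sketch below is an assumption-laden illustration, not the crate's code: the `class_of` method, the `Stem` variant name, and the validation in `build()` (rejecting a Background id other than 0) are hypothetical. The validation is merely one guess at why `build()` returns something that needs `unwrap()`.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum ElementClass {
    Background,
    NoteWhole,
    NoteHalf,
    NoteFilled,
    Staff,
    Stem, // hypothetical variant name, used here as an unselected class
}

#[derive(Debug)]
pub struct ElementClasses {
    ids: HashMap<ElementClass, u8>,
}

impl ElementClasses {
    pub fn builder() -> ElementClassesBuilder {
        ElementClassesBuilder { ids: HashMap::new() }
    }

    /// Class id of an element, or `None` if the user did not select it.
    /// (Hypothetical accessor, not taken from the original API.)
    pub fn class_of(&self, element: ElementClass) -> Option<u8> {
        self.ids.get(&element).copied()
    }
}

pub struct ElementClassesBuilder {
    ids: HashMap<ElementClass, u8>,
}

impl ElementClassesBuilder {
    /// Several elements may share one id; a later call overrides an earlier one.
    pub fn set_class(mut self, element: ElementClass, id: u8) -> Self {
        self.ids.insert(element, id);
        self
    }

    /// ASSUMPTION: the only check is that Background, if set, maps to class 0.
    pub fn build(self) -> Result<ElementClasses, String> {
        match self.ids.get(&ElementClass::Background) {
            Some(&id) if id != 0 => Err(format!("Background must be class 0, got {id}")),
            _ => Ok(ElementClasses { ids: self.ids }),
        }
    }
}
```

With the configuration shown earlier, `class_of` would return `Some(1)` for all three note head variants and `Some(2)` for `Staff`. A class that was never selected yields `None`, so a renderer could treat it as background.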
With all of this in place, we can use notensatz to generate a dataset of 500 images, where each sample currently contains exactly two notes.