mir_eval.transcription_velocity
Transcription evaluation, as defined in mir_eval.transcription, does not take into account the velocities of reference and estimated notes. This submodule implements a variant of mir_eval.transcription.precision_recall_f1_overlap() which additionally considers note velocity when determining whether a note is correctly transcribed. This is done by defining a new function mir_eval.transcription_velocity.match_notes() which first calls mir_eval.transcription.match_notes() to get a note matching based on onset, offset, and pitch. Then, we follow the evaluation procedure described in [1] to test whether an estimated note should be considered correct:
- Reference velocities are re-scaled to the range [0, 1].
- A linear regression is performed to estimate global scale and offset parameters which minimize the L2 distance between the velocities of matched estimated and (rescaled) reference notes.
- The scale and offset parameters are used to rescale estimated velocities.
- An estimated/reference note pair which has been matched according to onset, offset, and pitch is then only considered correct if the rescaled velocities are within a predefined threshold of each other (0.1 by default).
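As a rough illustration of the four steps above, the following sketch filters an existing note matching by velocity using NumPy. It is not the library's implementation: the function name filter_matches_by_velocity is hypothetical, and both the min-max normalization of reference velocities and the strict "less than" comparison against the threshold are assumptions about details the description leaves open.

```python
import numpy as np


def filter_matches_by_velocity(matching, ref_velocities, est_velocities,
                               velocity_tolerance=0.1):
    """Keep only the (ref_index, est_index) pairs whose velocities agree.

    Hypothetical helper for illustration; `matching` is assumed to come from
    a pitch/onset/offset matching step.
    """
    if len(matching) == 0:
        return []
    ref_velocities = np.asarray(ref_velocities, dtype=float)
    est_velocities = np.asarray(est_velocities, dtype=float)

    # Step 1: rescale reference velocities to [0, 1]
    # (assumption: min-max scaling, with the range clamped to at least 1
    # so that constant velocities do not cause a division by zero).
    v_min, v_max = ref_velocities.min(), ref_velocities.max()
    ref_norm = (ref_velocities - v_min) / max(1.0, v_max - v_min)

    pairs = np.asarray(matching, dtype=int)
    ref_matched = ref_norm[pairs[:, 0]]
    est_matched = est_velocities[pairs[:, 1]]

    # Step 2: least-squares fit of ref ~ slope * est + intercept.
    design = np.column_stack([est_matched, np.ones_like(est_matched)])
    (slope, intercept), *_ = np.linalg.lstsq(design, ref_matched, rcond=None)

    # Step 3: rescale the estimated velocities with the fitted parameters.
    est_rescaled = slope * est_matched + intercept

    # Step 4: keep pairs whose velocity error is below the threshold.
    keep = np.abs(est_rescaled - ref_matched) < velocity_tolerance
    return [(int(i), int(j)) for i, j in pairs[keep]]


matching = [(0, 0), (1, 1), (2, 2)]
# The estimated velocities are an exact linear function of the reference
# velocities here, so the fit is perfect and every pair is kept.
print(filter_matches_by_velocity(matching, [20, 60, 100], [10, 30, 50]))
```

Because the scale and offset are fitted globally, an estimate that is uniformly too quiet or too loud is not penalized; only notes that deviate from the overall trend are rejected.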
mir_eval.transcription_velocity.match_notes() is used to define a new variant mir_eval.transcription_velocity.precision_recall_f1_overlap() which considers velocity.
Conventions
This submodule follows the conventions of mir_eval.transcription and additionally requires velocities to be provided as MIDI velocities in the range [0, 127].
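To make these conventions concrete, here is a small sketch that converts a list of notes into the three arrays used throughout this submodule. The helper name notes_to_arrays and the (onset, offset, MIDI pitch, MIDI velocity) tuple layout are assumptions made for illustration; the MIDI-to-Hertz conversion uses standard equal temperament with A4 = 440 Hz.

```python
import numpy as np


def notes_to_arrays(notes):
    """Convert (onset_s, offset_s, midi_pitch, midi_velocity) tuples to arrays.

    Hypothetical helper: returns intervals of shape (n, 2) in seconds,
    pitches of shape (n,) in Hertz, and velocities of shape (n,).
    """
    notes = np.asarray(notes, dtype=float).reshape(-1, 4)
    intervals = notes[:, :2]
    # MIDI note number -> frequency in Hz (note 69 is A4 = 440 Hz).
    pitches_hz = 440.0 * 2.0 ** ((notes[:, 2] - 69.0) / 12.0)
    velocities = notes[:, 3]
    if velocities.size and (velocities.min() < 0 or velocities.max() > 127):
        raise ValueError("MIDI velocities must lie in the range [0, 127]")
    return intervals, pitches_hz, velocities


intervals, pitches, velocities = notes_to_arrays([
    (0.0, 0.5, 69, 64),   # A4
    (0.5, 1.0, 81, 100),  # A5
])
print(intervals.shape, pitches, velocities)
```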
Metrics
mir_eval.transcription_velocity.precision_recall_f1_overlap(): The precision, recall, F-measure, and Average Overlap Ratio of the note transcription, where an estimated note is considered correct if its pitch, onset, velocity and (optionally) offset are sufficiently close to a reference note.
References
- mir_eval.transcription_velocity.validate(ref_intervals, ref_pitches, ref_velocities, est_intervals, est_pitches, est_velocities)
Checks that the input annotations have valid time intervals, pitches, and velocities, and raises helpful errors if not.
- Parameters:
- ref_intervals : np.ndarray, shape=(n,2)
Array of reference notes time intervals (onset and offset times)
- ref_pitches : np.ndarray, shape=(n,)
Array of reference pitch values in Hertz
- ref_velocities : np.ndarray, shape=(n,)
Array of MIDI velocities (i.e. between 0 and 127) of reference notes
- est_intervals : np.ndarray, shape=(m,2)
Array of estimated notes time intervals (onset and offset times)
- est_pitches : np.ndarray, shape=(m,)
Array of estimated pitch values in Hertz
- est_velocities : np.ndarray, shape=(m,)
Array of MIDI velocities (i.e. between 0 and 127) of estimated notes
- mir_eval.transcription_velocity.match_notes(ref_intervals, ref_pitches, ref_velocities, est_intervals, est_pitches, est_velocities, onset_tolerance=0.05, pitch_tolerance=50.0, offset_ratio=0.2, offset_min_tolerance=0.05, strict=False, velocity_tolerance=0.1)
Match notes, taking note velocity into consideration.
This function first calls mir_eval.transcription.match_notes() to match notes according to the supplied intervals and pitches and the onset, offset, and pitch tolerances. Reference velocities are normalized to the range [0, 1], and the velocities of the matched notes are then used to estimate a slope and intercept which rescale the estimated velocities so that they are as close as possible (in the L2 sense) to their matched reference velocities. An estimated note is then only considered correct if its rescaled velocity is within velocity_tolerance of that of its matched (according to pitch and timing) reference note.
- Parameters:
- ref_intervals : np.ndarray, shape=(n,2)
Array of reference notes time intervals (onset and offset times)
- ref_pitches : np.ndarray, shape=(n,)
Array of reference pitch values in Hertz
- ref_velocities : np.ndarray, shape=(n,)
Array of MIDI velocities (i.e. between 0 and 127) of reference notes
- est_intervals : np.ndarray, shape=(m,2)
Array of estimated notes time intervals (onset and offset times)
- est_pitches : np.ndarray, shape=(m,)
Array of estimated pitch values in Hertz
- est_velocities : np.ndarray, shape=(m,)
Array of MIDI velocities (i.e. between 0 and 127) of estimated notes
- onset_tolerance : float > 0
The tolerance for an estimated note’s onset deviating from the reference note’s onset, in seconds. Default is 0.05 (50 ms).
- pitch_tolerance : float > 0
The tolerance for an estimated note’s pitch deviating from the reference note’s pitch, in cents. Default is 50.0 (50 cents).
- offset_ratio : float > 0 or None
The ratio of the reference note’s duration used to define the offset_tolerance. Default is 0.2 (20%), meaning the offset_tolerance will equal ref_duration * 0.2, or offset_min_tolerance (0.05 by default, i.e. 50 ms), whichever is greater. If offset_ratio is set to None, offsets are ignored in the matching.
- offset_min_tolerance : float > 0
The minimum tolerance for offset matching. See the offset_ratio description for an explanation of how the offset tolerance is determined. Note: this parameter only influences the results if offset_ratio is not None.
- strict : bool
If strict=False (the default), threshold checks for onset, offset, and pitch matching are performed using <= (less than or equal). If strict=True, the threshold checks are performed using < (less than).
- velocity_tolerance : float > 0
Estimated notes are considered correct if, after rescaling and normalization to [0, 1], their velocities are within velocity_tolerance of those of a matched reference note.
- Returns:
- matching : list of tuples
A list of matched reference and estimated notes. matching[i] == (i, j) where reference note i matches estimated note j.
- mir_eval.transcription_velocity.precision_recall_f1_overlap(ref_intervals, ref_pitches, ref_velocities, est_intervals, est_pitches, est_velocities, onset_tolerance=0.05, pitch_tolerance=50.0, offset_ratio=0.2, offset_min_tolerance=0.05, strict=False, velocity_tolerance=0.1, beta=1.0)
Compute the Precision, Recall and F-measure of correctly vs. incorrectly transcribed notes, and the Average Overlap Ratio for correctly transcribed notes (see mir_eval.transcription.average_overlap_ratio()). “Correctness” is determined based on note onset, velocity, pitch and (optionally) offset. An estimated note is considered correct if
- Its onset is within onset_tolerance (default +/- 50 ms) of a reference note
- Its pitch (F0) is within +/- pitch_tolerance (default one quarter tone, 50 cents) of the corresponding reference note
- Its velocity, after normalizing reference velocities to the range [0, 1] and globally rescaling estimated velocities to minimize the L2 distance to matched reference notes, is within velocity_tolerance (default 0.1) of the corresponding reference note
- If offset_ratio is None, note offsets are ignored in the comparison. Otherwise, on top of the above requirements, a correct returned note is required to have an offset value within offset_ratio (default 20%) of the reference note’s duration around the reference note’s offset, or within offset_min_tolerance (default 50 ms), whichever is larger.
- Parameters:
- ref_intervals : np.ndarray, shape=(n,2)
Array of reference notes time intervals (onset and offset times)
- ref_pitches : np.ndarray, shape=(n,)
Array of reference pitch values in Hertz
- ref_velocities : np.ndarray, shape=(n,)
Array of MIDI velocities (i.e. between 0 and 127) of reference notes
- est_intervals : np.ndarray, shape=(m,2)
Array of estimated notes time intervals (onset and offset times)
- est_pitches : np.ndarray, shape=(m,)
Array of estimated pitch values in Hertz
- est_velocities : np.ndarray, shape=(m,)
Array of MIDI velocities (i.e. between 0 and 127) of estimated notes
- onset_tolerance : float > 0
The tolerance for an estimated note’s onset deviating from the reference note’s onset, in seconds. Default is 0.05 (50 ms).
- pitch_tolerance : float > 0
The tolerance for an estimated note’s pitch deviating from the reference note’s pitch, in cents. Default is 50.0 (50 cents).
- offset_ratio : float > 0 or None
The ratio of the reference note’s duration used to define the offset_tolerance. Default is 0.2 (20%), meaning the offset_tolerance will equal ref_duration * 0.2, or offset_min_tolerance (0.05 by default, i.e. 50 ms), whichever is greater. If offset_ratio is set to None, offsets are ignored in the evaluation.
- offset_min_tolerance : float > 0
The minimum tolerance for offset matching. See the offset_ratio description for an explanation of how the offset tolerance is determined. Note: this parameter only influences the results if offset_ratio is not None.
- strict : bool
If strict=False (the default), threshold checks for onset, offset, and pitch matching are performed using <= (less than or equal). If strict=True, the threshold checks are performed using < (less than).
- velocity_tolerance : float > 0
Estimated notes are considered correct if, after rescaling and normalization to [0, 1], they are within velocity_tolerance of a matched reference note.
- beta : float > 0
Weighting factor for F-measure (default value = 1.0).
- Returns:
- precision : float
The computed precision score
- recall : float
The computed recall score
- f_measure : float
The computed F-measure score
- avg_overlap_ratio : float
The computed Average Overlap Ratio score
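For orientation, the way a note matching turns into the first three scores can be sketched as below. This is an illustration under stated assumptions rather than the library's code: the helper name scores_from_matching is hypothetical, precision and recall are assumed to be the number of matched pairs divided by the number of estimated and reference notes respectively, and the F-measure is the standard beta-weighted harmonic mean.

```python
def scores_from_matching(matching, n_ref, n_est, beta=1.0):
    """Return (precision, recall, f_measure) for a list of matched pairs.

    Hypothetical helper; `matching` is a list of (ref_index, est_index)
    tuples, `n_ref` and `n_est` are the numbers of reference and estimated
    notes.
    """
    # With no notes on either side, nothing can be scored (assumption:
    # all scores are reported as 0 in that case).
    if n_ref == 0 or n_est == 0:
        return 0.0, 0.0, 0.0
    precision = len(matching) / n_est
    recall = len(matching) / n_ref
    if precision == 0 and recall == 0:
        return precision, recall, 0.0
    # Standard F-beta: beta > 1 weights recall more, beta < 1 precision.
    f_measure = ((1 + beta ** 2) * precision * recall
                 / (beta ** 2 * precision + recall))
    return precision, recall, f_measure


# Two of four reference notes were found, and both estimates are correct.
print(scores_from_matching([(0, 0), (1, 2)], n_ref=4, n_est=2))
```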
- mir_eval.transcription_velocity.evaluate(ref_intervals, ref_pitches, ref_velocities, est_intervals, est_pitches, est_velocities, **kwargs)
Compute all metrics for the given reference and estimated annotations.
- Parameters:
- ref_intervals : np.ndarray, shape=(n,2)
Array of reference notes time intervals (onset and offset times)
- ref_pitches : np.ndarray, shape=(n,)
Array of reference pitch values in Hertz
- ref_velocities : np.ndarray, shape=(n,)
Array of MIDI velocities (i.e. between 0 and 127) of reference notes
- est_intervals : np.ndarray, shape=(m,2)
Array of estimated notes time intervals (onset and offset times)
- est_pitches : np.ndarray, shape=(m,)
Array of estimated pitch values in Hertz
- est_velocities : np.ndarray, shape=(m,)
Array of MIDI velocities (i.e. between 0 and 127) of estimated notes
- **kwargs
Additional keyword arguments which will be passed to the appropriate metric or preprocessing functions.
- Returns:
- scores : dict
Dictionary of scores, where the key is the metric name (str) and the value is the (float) score achieved.