Acknowledgement
This work was supported by Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Korea Government (MSIT) (NRF-2022R1F1A1074566).
References
- K. O'Hanlon and M. D. Plumbley, "Polyphonic piano transcription using non-negative matrix factorisation with group sparsity," Proc. IEEE ICASSP, 3112-3116 (2014).
- C. Raphael, "Automatic transcription of piano music," Proc. 3rd ISMIR (2002).
- V. Emiya, R. Badeau, and B. David, "Multipitch estimation of piano sounds using a new probabilistic spectral smoothness principle," IEEE Trans. on Audio, Speech, and Lang. Process. 18, 1643-1654 (2010). https://doi.org/10.1109/TASL.2009.2038819
- L. Su and Y.-H. Yang, "Combining spectral and temporal representations for multipitch estimation of polyphonic music," IEEE/ACM Trans. on Audio, Speech, and Lang. Process. 23, 1600-1612 (2015). https://doi.org/10.1109/TASLP.2015.2442411
- S. Sigtia, E. Benetos, and S. Dixon, "An end-to-end neural network for polyphonic piano music transcription," IEEE/ACM Trans. on Audio, Speech, and Lang. Process. 24, 927-939 (2016). https://doi.org/10.1109/TASLP.2016.2533858
- R. Kelz, M. Dorfer, F. Korzeniowski, S. Böck, A. Arzt, and G. Widmer, "On the potential of simple framewise approaches to piano transcription," Proc. 17th ISMIR, 475-481 (2016).
- C. Hawthorne, E. Elsen, J. Song, A. Roberts, I. Simon, C. Raffel, J. Engel, S. Oore, and D. Eck, "Onsets and frames: Dual-objective piano transcription," Proc. 19th ISMIR, 50-57 (2018).
- C. Hawthorne, A. Stasyuk, A. Roberts, I. Simon, C.-Z. A. Huang, S. Dieleman, E. Elsen, J. Engel, and D. Eck, "Enabling factorized piano music modeling and generation with the MAESTRO dataset," Proc. 7th ICLR, 1-12 (2019).
- Q. Kong, K. Choi, and Y. Wang, "Large-scale MIDI-based composer classification," arXiv preprint arXiv:2010.14805 (2020).
- H. Zhang, J. Tang, S. R. Rafee, S. Dixon, G. Fazekas, and G. A. Wiggins, "ATEPP: A dataset of automatically transcribed expressive piano performance," Proc. 23rd ISMIR, 446-453 (2022).
- Q. Kong, B. Li, X. Song, Y. Wan, and Y. Wang, "High resolution piano transcription with pedals by regressing onset and offset times," IEEE/ACM Trans. on Audio, Speech, and Lang. Process. 29, 3707-3717 (2021). https://doi.org/10.1109/TASLP.2021.3121991
- T. Kwon, D. Jeong, and J. Nam, "Polyphonic piano transcription using autoregressive multi-state note model," Proc. 21st ISMIR, 454-461 (2020).
- D. Jeong, "Real-time automatic piano music transcription system," Proc. Late Breaking/Demo of the 21st ISMIR, 4-6 (2020).
- A. A. Sawchuk, E. Chew, R. Zimmermann, C. Papadopoulos, and C. Kyriakakis, "From remote media immersion to distributed immersive performance," Proc. ACM SIGMM workshop on Experiential Telepresence, 110-120 (2003).
- J. W. Kim and J. P. Bello, "Adversarial learning for improved onsets and frames music transcription," Proc. 20th ISMIR, 670-677 (2019).
- M. Akbari and H. Cheng, "Real-time piano music transcription based on computer vision," IEEE Trans. Multimedia, 17, 2113-2121 (2015). https://doi.org/10.1109/TMM.2015.2473702
- A. Dessein, A. Cont, and G. Lemaitre, "Real-time polyphonic music transcription with non-negative matrix factorization and beta-divergence," Proc. 11th ISMIR, 489-494 (2010).
- C. Raffel, B. McFee, E. J. Humphrey, J. Salamon, O. Nieto, D. Liang, and D. P. Ellis, "mir_eval: A transparent implementation of common MIR metrics," Proc. 15th ISMIR, 367-372 (2014).