I. INTRODUCTION
The well-being, history, and culture of Mongolians appear in every genre of folklore, but they can be seen especially clearly in the poetry of long songs [1]. The lyrics of a long song were originally composed by an individual, then developed and spread by word of mouth until they became part of folklore. Mongolian literature and culture have very ancient roots; long song poems are of ancient origin and are examples of elegant verse composed by Mongolian sages. Today, however, long songs are rarely sung in all their tones, so their full meaning is seldom heard or understood. In this study, we not only discuss the poetic meaning of the long song from a literary point of view, but also classify and evaluate long song data using machine learning methods in three categories: ayzam, suman, and besreg. Interdisciplinary research of this kind is active in many fields around the world, and Mongolian linguistics has also adopted it with notable results. To our knowledge, our work is the first of its kind in the field of Mongolian folklore and literature, and long song poetry is an important place to start.
Our aim is to study and promote artificial intelligence, which is widely used in multidisciplinary research around the world, in combination with Mongolia's unique folklore heritage.
II. RELATED WORKS
The first person to study Mongolian long songs on a scientific basis was the Russian scholar A. Pozdneev, who in 1880 included Buryat, Khalkh, and Ould songs in his Mongol long song. The Russian scholar A. D. Rudneev also studied the melodies of Mongolian long songs, and B. Ya. Vladimirtsov carefully studied and recorded Oirat songs [2].
In addition to oral sources, written sources have become a major research tool in this field. Mongolian scholars such as P. Khorloo, H. Sampildendev, Sh. Dorj, J. Badraa, and S. Tsoodol have studied long songs, and some senior scholars of the SCC of the Mongolian Academy of Sciences have collected written scriptures and books that were widely distributed in Mongolia. The famous long song singer J. Dorjdagva is not only a great singer but also a researcher with an important place in the history of long song studies. J. Badraa published a book about him, “The Great Singer's Speech”, which is valued by scholars and researchers in this field. Dr. A. Alimaa, Head of the Institute of Linguistics and Oral Studies of the Mongolian Academy of Sciences, has studied, discovered, and put into circulation more than 3,000 long songs sung in Mongolia [3]. Mongolian poetry has been studied by Western scholars of Mongolian studies since the middle of the 19th century, and they have observed and emphasized its uniqueness [4-5], [6-7]. The long song was also inscribed on UNESCO's Representative List of the Intangible Cultural Heritage of Humanity in 2008 [8].
Computer science researchers, for their part, have worked on classifying types of sound. It is common to process signals from audio data and classify them into genres such as rock, pop, rap, and classical [9]. Classification based on lyrics alone is less common than audio-based classification, but there are works that use natural language processing and machine learning. In recent years the use of deep learning has increased, and deep learning methods such as RNNs have been applied to lyric data, for example in the work of Alexandros Tsaptsinos [10]. The work of Anna Boonyanit et al. [11] classifies hip-hop, rock, and pop with a recognition rate of about 60%. However, no research has been conducted in Mongolia on automatically classifying the types of long songs and the meaning of their poems using machine learning. Therefore, in this study, we propose Mongolian long song type classification using machine learning methods.
III. LONG SONG TYPE CLASSIFICATION
3.1. Mongolian Poetry, Poetic Tradition, Regularity, Interpretation of the Meaning of the Verse
Mongolian folk songs are one of the major genres of Mongolian folklore. Oral literature is a body of art that originated from the life of the people and spread by word of mouth as an expression of Mongolian customs, history, culture, and wisdom. The main types of folklore include fairy tales, epics, legends, riddles, proverbs, blessings, praises, the three worlds, and old sayings. Many of these genres are poetic. This work studies song poetry, in particular long song poetry, including its verse patterns, word interpretations, and the meaning of the lyrics, and examines whether the lyrics can be classified by machine learning methods.
The poetry of long songs is mostly composed of written words. The noble composition of the Mongolian script and the choice of rare words from the Mongolian vocabulary show that the Mongolian long song is not only a genre of oral literature but also written poetry. It seems that most long song poems were written by highly educated people. This is especially evident in the long songs of state-related reverence. It is also reflected in the fact that the verses of long songs are read to gain broad knowledge and teachings about the phenomena of the universe, nature, customs, respect, and love.
Long songs are divided into three types according to their melody and size: ayzam, suman, and besreg. These types are also reflected in the meaning and content of the poem, whereas many works of Mongolian poetry are categorized thematically by verse content alone. Ayzam (large-scale song) has a wide range of melodies and a large number of retro folds, and is larger than the other two categories. Suman (medium-sized song) has a wide range of melodies, is fast, has a lot of ornaments, and is widely sung in Mongolia. In addition to popular topics such as farming, it covers a wide range of topics that can be explored to understand the history of Mongolia. Besreg (short or small song) has a wider melody than short folk songs and is not itself a short song; it has short percussion and ornamentation, and the meaning and content of its words are dominated by syllables and teachings. There is a tradition of using this type of song as a learning tool for beginners.
3.2. Ability to Classify Long Song Types by Lyrics
Studying Mongolian long songs in an international context, combining them with computer science research methods, and expanding this into interdisciplinary research is both innovative and important. The most important task in this area is to collect a large amount of data. A large amount of data has already been collected from the written sources of long songs, and in this study we chose to experiment with Central Khalkh songs. The Khalkh long song is widespread in the heart of Mongolia, so most of the songs commonly sung today fall into this category. Every song has verses (badag/turleg in Mongolian), every verse consists of lines, and each line has several words. In this work, we studied the possibility of classifying songs into the three types ayzam, suman, and besreg by machine learning, based on data from the long song verses. The following figure shows the general scheme of the work.
Fig. 1. Long song type classification general scheme.
Data preparation and features: This time, we collected and experimented with 80 lyrics in total: 14 of the ayzam type, 45 of the suman type, and 21 of the besreg type.
Features: From the long song data, we prepared 11 features: song name (string), number of verses (numeric), number of lines (numeric), number of words (numeric), generalvalue (string), doublevalue (string), elapsed time of verse (numeric), elapsed time of 5 words (numeric), the longest elapsed time of 1 word (numeric), full text (string), and category name (string: suman, ayzam, or besreg). Fig. 2 shows the average values of the verse, line, and word counts for the 3 types.
Fig. 2. The average values of verse/line/word numbers for the 3 classes.
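The three structural counts (verses, lines, words) can be derived directly from a lyric text. The following is a minimal sketch, assuming that verses are separated by blank lines and words by whitespace; the function name and the placeholder text are hypothetical and are not taken from the dataset used in this study.

```python
import re

def lyric_structure_counts(lyrics):
    """Return (verses, lines, words) for one lyric text.

    Assumption: verses are separated by one or more blank lines.
    """
    verses = [v for v in re.split(r"\n\s*\n", lyrics.strip()) if v.strip()]
    lines = [ln for v in verses for ln in v.splitlines() if ln.strip()]
    words = [w for ln in lines for w in ln.split()]
    return len(verses), len(lines), len(words)

# Placeholder text (not a real long song): 2 verses, 4 lines, 10 words.
sample = "la la la\nla la\n\nla la la la\nla"
counts = lyric_structure_counts(sample)
print(counts)
```

The audio-related features (elapsed times) cannot be obtained this way and would have to be measured from recordings.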
Fig. 3 shows the average duration of a verse, of 5 words, and of the longest single word for the 3 types.
Fig. 3. The average values of audio features for the 3 classes.
This makes clear why the genre is called the long song: the longest duration of a single word is 27 seconds in the suman type.
Because the effectiveness of a machine learning method depends on the distribution and characteristics of the data, we tested the candidate methods using the Weka program. The best method for our data was the Multilayer Perceptron algorithm.
3.3. MLP Neural Network
A Multilayer Perceptron (MLP) has an input layer, an output layer, and one or more hidden layers, each containing many neurons. Whereas the neuron in the classic Perceptron uses an activation function that imposes a hard threshold (a step function), neurons in an MLP can use other activation functions, such as sigmoid or ReLU, which allow gradient-based training. The MLP is a feedforward algorithm: as in the Perceptron, inputs are combined with the weights in a weighted sum and passed through the activation function, but here the result is propagated to the next layer. Each layer feeds the next one with the result of its computation, its internal representation of the data, all the way through the hidden layers to the output layer. If the algorithm only computed the weighted sums in each neuron and propagated the results to the output layer, it would not learn the weights that minimize the cost function; a single forward pass involves no learning. Backpropagation is the learning mechanism that allows the MLP to adjust the weights iteratively to minimize the cost function. In each iteration, after the weighted sums are forwarded through all layers, the gradient of the cost function (for example, the Mean Squared Error) with respect to the weights is computed over the input-output pairs. The error is then propagated backwards from the output layer through the hidden layers, and the weights of every layer are updated in the direction opposite to their gradient. This process is repeated until training converges, for example when the change between iterations falls below a specified convergence threshold or a maximum number of iterations is reached.
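The forward pass and backpropagation described above can be illustrated with a very small network. This is a sketch under stated assumptions, not the Weka implementation used in the experiments: one hidden layer, sigmoid activations, a squared-error cost, and full-batch gradient descent. The class name `TinyMLP`, the learning rate, and the toy data are all hypothetical.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

class TinyMLP:
    """One hidden layer, sigmoid activations, squared-error cost."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)
        # One row of weights per neuron; the last entry of each row is the bias.
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                   for _ in range(n_hidden)]
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
                   for _ in range(n_out)]

    def forward(self, x):
        # Weighted sum plus activation, layer by layer.
        h = [sigmoid(sum(w * v for w, v in zip(row, x + [1.0]))) for row in self.w1]
        y = [sigmoid(sum(w * v for w, v in zip(row, h + [1.0]))) for row in self.w2]
        return h, y

    def loss(self, X, T):
        total = 0.0
        for x, t in zip(X, T):
            _, y = self.forward(x)
            total += sum((yk - tk) ** 2 for yk, tk in zip(y, t))
        return total / len(X)

    def train_step(self, X, T, lr=0.5):
        g1 = [[0.0] * len(row) for row in self.w1]
        g2 = [[0.0] * len(row) for row in self.w2]
        for x, t in zip(X, T):
            h, y = self.forward(x)
            # Error signal at the output layer (derivative of cost and sigmoid).
            d_out = [2 * (yk - tk) * yk * (1 - yk) for yk, tk in zip(y, t)]
            # Error signal propagated back to the hidden layer.
            d_hid = [hj * (1 - hj) * sum(d_out[k] * self.w2[k][j]
                                         for k in range(len(d_out)))
                     for j, hj in enumerate(h)]
            for k, d in enumerate(d_out):
                for j, v in enumerate(h + [1.0]):
                    g2[k][j] += d * v
            for j, d in enumerate(d_hid):
                for i, v in enumerate(x + [1.0]):
                    g1[j][i] += d * v
        # Update the weights of every layer against the averaged gradient.
        n = len(X)
        for weights, grads in ((self.w1, g1), (self.w2, g2)):
            for row, grow in zip(weights, grads):
                for i in range(len(row)):
                    row[i] -= lr * grow[i] / n

# Toy data (not long song data): the class is decided by the first input.
X = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
T = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
net = TinyMLP(n_in=2, n_hidden=3, n_out=2)
loss_before = net.loss(X, T)
for _ in range(500):
    net.train_step(X, T)
loss_after = net.loss(X, T)
print(loss_before, loss_after)
```

For the long song data, the input layer would take the numeric features and the output layer would have one neuron per type; the fixed iteration count here stands in for a proper convergence check.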
Fig. 4. Multilayer Perceptron example of Long song numeric values.
The results are described in detail in the experimental results section. The following figure shows an example of how the song data is prepared in an *.arff file to be read by the Weka program [12].
Fig. 5. Data format for machine learning algorithms input (Weka tool).
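As an illustration of the ARFF layout, the sketch below writes a header (`@relation`, `@attribute`) and a `@data` section from Python. The attribute names and the data rows are hypothetical placeholders chosen to mirror the numeric features listed earlier; they are not the names or values used in the actual file.

```python
def to_arff(relation, attributes, rows):
    """Build ARFF text. `attributes` is a list of (name, type) pairs, where
    type is an ARFF type such as "numeric" or a list of nominal values."""
    lines = ["@relation " + relation, ""]
    for name, kind in attributes:
        if isinstance(kind, (list, tuple)):
            kind = "{" + ",".join(kind) + "}"
        lines.append("@attribute {} {}".format(name, kind))
    lines += ["", "@data"]
    for row in rows:
        lines.append(",".join(str(value) for value in row))
    return "\n".join(lines) + "\n"

# Hypothetical attribute names; the values are made-up placeholders.
attributes = [
    ("verses", "numeric"),
    ("lines", "numeric"),
    ("words", "numeric"),
    ("verse_seconds", "numeric"),
    ("five_word_seconds", "numeric"),
    ("longest_word_seconds", "numeric"),
    ("category", ["ayzam", "suman", "besreg"]),
]
rows = [
    (4, 16, 60, 90, 30, 20, "ayzam"),
    (3, 12, 48, 70, 25, 15, "besreg"),
]
arff_text = to_arff("longsong", attributes, rows)
print(arff_text)
```

The class attribute is declared as nominal so that Weka treats the task as classification.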
IV. EXPERIMENTAL RESULTS
We classified the collected data using machine learning methods. This time, we collected and experimented with 80 lyrics in total: 14 of the ayzam type, 45 of the suman type, and 21 of the besreg type. We tested three feature settings, namely text features only, numeric values only, and text and numeric values combined, using 10-fold cross-validation (Table 1).
Table 1. The number of experimental data samples.
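To make the evaluation protocol concrete, the following sketch splits the 80 songs into 10 folds while keeping the class proportions roughly equal in every fold. It assumes stratified folds; the function name and the seed are hypothetical, and the exact fold assignment made by Weka will differ.

```python
import random
from collections import defaultdict

def stratified_folds(labels, k=10, seed=0):
    """Assign sample indices to k folds, spreading each class evenly."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    folds = [[] for _ in range(k)]
    slot = 0
    for label in sorted(by_class):
        members = by_class[label]
        rng.shuffle(members)
        # Deal the members of each class round-robin across the folds.
        for index in members:
            folds[slot % k].append(index)
            slot += 1
    return folds

# Class sizes reported in this section: 14 ayzam, 45 suman, 21 besreg.
labels = ["ayzam"] * 14 + ["suman"] * 45 + ["besreg"] * 21
folds = stratified_folds(labels)
for test_fold in folds:
    train = [i for fold in folds if fold is not test_fold for i in fold]
    # A classifier would be trained on `train` and evaluated on `test_fold`.
```

Each song is used for testing exactly once, and the reported score is the aggregate over the 10 test folds.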
Fig. 6 shows an example of the balanced, numeric-valued data in Weka.
Fig. 6. Data inputs (balanced numeric values) in Weka.
Fig. 7 shows an example of the best result of classification by the MLP method in Weka.
Fig. 7. An example of classification results in Weka.
The imbalance among the three categories may affect the classification results: most of the songs are of the suman type.
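This section does not state how the balanced data set was produced. One common option, shown here purely as an assumption, is random undersampling: every class is reduced to the size of the smallest class. The function name and seed are hypothetical.

```python
import random
from collections import Counter, defaultdict

def undersample(samples, labels, seed=0):
    """Randomly keep the same number of samples from every class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)
    smallest = min(len(members) for members in by_class.values())
    kept_samples, kept_labels = [], []
    for label in sorted(by_class):
        for sample in rng.sample(by_class[label], smallest):
            kept_samples.append(sample)
            kept_labels.append(label)
    return kept_samples, kept_labels

# Song indices stand in for the real feature vectors.
labels = ["ayzam"] * 14 + ["suman"] * 45 + ["besreg"] * 21
songs = list(range(len(labels)))
balanced_songs, balanced_labels = undersample(songs, labels)
print(Counter(balanced_labels))
```

Undersampling discards data, which matters for a set of only 80 songs; oversampling the smaller classes is an alternative with its own trade-offs.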
Table 2 compares the song type classification results on balanced and unbalanced data.
Table 2. Experimental results comparison.
In our case, the best result was obtained by the function-based methods (the group to which the MLP belongs in Weka), with 78% accuracy on balanced data using only numeric values. Six features have numeric values: three describe the lyric structure and three describe audio information. The worst results came from the methods that accept text input, with 20-56% accuracy on balanced and unbalanced text data. This is because we did not use any natural language processing methods: we converted the Cyrillic text to Latin text character by character, so the resulting representation carries no semantic information. The two related works we compare against (Table 3) use language models such as BERT and LSTM, so their text-based results are more meaningful and better.
Table 3. Experimental results comparison with related works. Only text values of Long song.
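The character-by-character conversion of Cyrillic text to Latin text mentioned above can be sketched as follows. The mapping table is an illustrative assumption based on common romanization practice, not the table used in this study, and the function name is hypothetical.

```python
# Illustrative mapping only; the mapping used in the study is not specified.
CYR_TO_LAT = {
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "е": "ye", "ё": "yo",
    "ж": "j", "з": "z", "и": "i", "й": "i", "к": "k", "л": "l", "м": "m",
    "н": "n", "о": "o", "ө": "ö", "п": "p", "р": "r", "с": "s", "т": "t",
    "у": "u", "ү": "ü", "ф": "f", "х": "kh", "ц": "ts", "ч": "ch",
    "ш": "sh", "щ": "shch", "ъ": "", "ы": "y", "ь": "", "э": "e",
    "ю": "yu", "я": "ya",
}

def transliterate(text):
    """Replace each Cyrillic letter independently; keep everything else."""
    out = []
    for ch in text:
        latin = CYR_TO_LAT.get(ch.lower())
        if latin is None:
            out.append(ch)  # spaces, digits, punctuation, Latin letters
        elif ch.isupper():
            out.append(latin.capitalize())
        else:
            out.append(latin)
    return "".join(out)

print(transliterate("Уртын дуу"))
```

Because every letter is mapped in isolation, the output preserves spelling but adds no linguistic information, which is consistent with the weak text-only results reported above.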
On the other hand, the differences between song genres are much larger than the differences among the three types within a single genre. Our goal is to classify the three types within the long song genre only. Table 4 shows the results for each of the three types and the weighted average scores.
Table 4. Classification results for 3 types.
According to the definitions of the three types, suman and ayzam songs are similar to each other, and ayzam and besreg songs are also similar. Besreg is the shortest type, and therefore besreg songs have the highest values in Table 4.
Fig. 8 shows the confusion matrix of the classification results for the three types. As mentioned above, suman and ayzam songs are misclassified more often than besreg songs.
Fig. 8. A confusion matrix of the models with balanced data.
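Per-type scores and their weighted average can be computed from a confusion matrix as sketched below. The matrix values are a made-up toy example and are not the results of this study; the function name is hypothetical, and the weighting by class size is an assumption about how the weighted average is formed.

```python
def per_class_metrics(matrix, labels):
    """matrix[i][j] = number of songs of true class i predicted as class j.

    Returns ({label: (precision, recall, f1)}, support-weighted F1).
    """
    total = sum(sum(row) for row in matrix)
    report = {}
    weighted_f1 = 0.0
    for i, label in enumerate(labels):
        true_positive = matrix[i][i]
        support = sum(matrix[i])                  # songs that truly are class i
        predicted = sum(row[i] for row in matrix) # songs predicted as class i
        precision = true_positive / predicted if predicted else 0.0
        recall = true_positive / support if support else 0.0
        if precision + recall:
            f1 = 2 * precision * recall / (precision + recall)
        else:
            f1 = 0.0
        report[label] = (precision, recall, f1)
        weighted_f1 += f1 * support / total
    return report, weighted_f1

# Toy confusion matrix for illustration only.
labels = ["ayzam", "suman", "besreg"]
matrix = [
    [3, 1, 0],
    [1, 2, 1],
    [0, 0, 4],
]
report, weighted_f1 = per_class_metrics(matrix, labels)
for label in labels:
    print(label, report[label])
print("weighted F1:", weighted_f1)
```

Off-diagonal cells show which types are confused with each other, which is how the pattern between suman and ayzam can be read from the matrix.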
From another viewpoint, we may have mislabeled some of the songs, because there is no single definitive answer as to which songs are of the suman type, which are of the ayzam type, and so on.
V. CONCLUSION
This study tested features of long song lyrics, such as the verses, the poetry and its structure, and the symbolic meaning of long songs, and examined the possibility of combining traditional long song research with modern technology. The main result is a demonstration that long song data prepared and labeled by long song researchers can be classified by machine learning. This time we tested 80 songs; in the future, it will be possible to experiment with the lyrics of more than 300 popular songs.
ACKNOWLEDGEMENT
This work was supported by the Youth Research grant funded by the National University of Mongolia (NUM) (No. P2020-3945) in 2020-2021.
References
- Long song definition, https://en.wikipedia.org/wiki/Long_song
- B. Y. Vladimirtsov and J. R. Krueger, "The oirat-mongolian heroic epic," Mongolian Studies, vol. 8, pp. 5-58, 1983.
- A. Alimaa, "Features, distribution and release characteristics of long songs," Ulaanbaatar, 2013.
- G. Galbayar, "The method of connecting the head of Mongolian poetry and its regularity," Ulaanbaatar, 2014.
- Mend-Ooyo, http://www.mend-ooyo.mn/, 2014.
- K. Sampildendev and K. N. Yatsovskoi, Mongolian folk long song, Ulaanbaatar, 1984.
- S. Yoon, "Remains and renewals: The process of preserving Urtyn duu in contemporary Mongolia," Mongolian Studies, vol. 35, pp. 119-131, 2013.
- Unesco report, https://ich.unesco.org/en/RL/urtiin-duutraditional-folk-long-song-00115.
- G. Tzanetakis and P. Cook, "Musical genre classification of audio signals," IEEE Transactions on Speech and Audio Processing, vol. 10, no. 5, pp. 293-302, Jul. 2002. https://doi.org/10.1109/TSA.2002.800560
- A. Tsaptsinos, Lyric-Based Music Genre Classification using a Hierarchical Attention Network, Jul. 2017. https://arxiv.org/abs/1707.04678
- A. Boonyanit, A. Dahl, and M. Leszczynski, Music Genre Classification using Song Lyrics, Stanford CS224N Custom Project, https://web.stanford.edu/class/cs224n/reports/final_reports/report003.pdf
- E. Frank, M. A. Hall, and I. H. Witten, The WEKA Workbench. Online Appendix for Data Mining: Practical Machine Learning Tools and Techniques, Morgan Kaufmann, Fourth Edition, 2016. https://doc1.bibliothek.li/acb/FLMF040119.pdf
- S. Howard, C. N. Silla Jr., and C. G. Johnson, "Automatic lyrics-based music genre classification in a multilingual setting," in Thirteenth Brazilian Symposium on Computer Music, 2011. https://kar.kent.ac.uk/33266/
- H. Akalp, E. F. Cigdem, S. Yilmaz, N. Bolucu, and B. Can, "Language representation models for music genre classification using lyrics," International Symposium on Electrical, Electronics and Information Engineering, pp. 408-414, Feb. 2021.