• Title/Summary/Keyword: Stereo music

An Embedding /Extracting Method of Audio Watermark Information for High Quality Stereo Music (고품질 스테레오 음악을 위한 오디오 워터마크 정보 삽입/추출 기술)

  • Bae, Kyungyul
    • Journal of Intelligence and Information Systems / v.24 no.2 / pp.21-35 / 2018
  • Since the introduction of MP3 players, CD recordings have gradually been vanishing, and the music consumption environment is shifting to mobile devices. The introduction of smart devices has increased the utilization of music through the playback, mass-storage, and search functions integrated into smartphones and tablets. When MP3 players were first supplied, the bitrate of compressed music content was generally 128 Kbps, but as demand for high-quality music grew, 384 Kbps content appeared; recently, music content in the FLAC (Free Lossless Audio Codec) format, which uses lossless compression, is becoming popular. The download services of many music sites in Korea are divided into unlimited downloads with technical protection and limited downloads without technical protection. Digital Rights Management (DRM) technology is used as the technical protection measure for unlimited downloads, but it works only on authenticated devices with DRM installed: even music purchased by the user cannot be played on other devices. Conversely, for music that is limited in quantity but not technically protected, there is no way to take action against anyone who redistributes it, and for high-quality music such as FLAC the loss is greater. In this paper, the author proposes an audio watermarking technology for copyright protection of high-quality stereo music. Two kinds of information, "Copyright" and "Copy_free", are generated using a turbo code. Each watermark consists of 9 bytes (72 bits); applying the turbo code for error correction increases the amount of information to be embedded to 222 bits. The 222-bit watermark was then expanded to 1024 bits to be robust against additional errors, and this was finally used as the watermark inserted into the stereo music.
The turbo code can recover the raw data even when part of the code is damaged by an attack on the watermarked content, as long as less than 15% is damaged. Expanding the watermark to 1024 bits increases the probability of recovering the 222 bits from partially damaged content, making the watermark itself more resistant to attack. The proposed algorithm uses quantization in the DCT domain so that the watermark can be detected efficiently and the SNR is improved when the stereo music is converted to mono. As a result, the SNR exceeded 40 dB on average, an improvement of more than 10 dB over conventional quantization methods; this is a very significant result, since 10 dB corresponds to roughly a tenfold improvement in the signal-to-noise power ratio. In addition, the watermark can be completely extracted from music samples shorter than one second, even after MP3 compression at a bitrate of 128 Kbps. By contrast, the conventional quantization method largely fails to extract the watermark even from 10-second samples, so the proposed method requires only about one tenth of the sample length. Since the watermark embedded into the music is 72 bits long, it provides sufficient capacity for the information a music service needs: $2^{72}$ is about $4.7{\times}10^{21}$, enough to identify every piece of music distributed worldwide, so it can serve as an identifier for copyright protection of high-quality music services. The proposed algorithm can be used not only for high-quality audio but also in developing watermarking algorithms for other multimedia such as UHD (Ultra High Definition) TV and high-resolution images. Moreover, as digital devices develop, users are demanding high-quality music, and artificial intelligence assistants are arriving along with high-quality music and streaming services.
The results of this study can be used to protect the rights of copyright holders in these industries.
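The quantization step of the embedding can be illustrated with a minimal sketch, assuming a generic quantization index modulation (QIM) style quantizer: each bit is carried by forcing one coefficient (e.g. a DCT coefficient) onto an even or odd multiple of a step size. The step size `STEP` and the even/odd convention are assumptions, and the paper's turbo-coding and 1024-bit expansion stages are not reproduced here.

```python
# Hedged sketch of QIM-style quantization embedding on a single coefficient.
STEP = 0.5  # quantization step (assumed value, not from the paper)

def embed_bit(coeff: float, bit: int) -> float:
    """Move a coefficient to the nearest lattice point whose parity equals bit."""
    q = round(coeff / STEP)
    if q % 2 != bit % 2:
        # shift one step toward the original value to keep distortion small
        q += 1 if coeff / STEP >= q else -1
    return q * STEP

def extract_bit(coeff: float) -> int:
    """Recover the bit from the parity of the nearest lattice point."""
    return round(coeff / STEP) % 2
```

On clean coefficients extraction is exact; the purpose of the turbo-code and bit-expansion stages in the paper is to survive the coefficient-level errors that compression and attacks introduce.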

Performance of music section detection in broadcast drama contents using independent component analysis and deep neural networks (ICA와 DNN을 이용한 방송 드라마 콘텐츠에서 음악구간 검출 성능)

  • Heo, Woon-Haeng;Jang, Byeong-Yong;Jo, Hyeon-Ho;Kim, Jung-Hyun;Kwon, Oh-Wook
    • Phonetics and Speech Sciences / v.10 no.3 / pp.19-29 / 2018
  • We propose to use independent component analysis (ICA) and deep neural network (DNN) to detect music sections in broadcast drama contents. Drama contents mainly comprise silence, noise, speech, music, and mixed (speech+music) sections. The silence section is detected by signal activity detection. To detect the music section, we train noise, speech, music, and mixed models with DNN. In computer experiments, we used the MUSAN corpus for training the acoustic model, and conducted an experiment using 3 hours' worth of Korean drama contents. As the mixed section includes music signals, it was regarded as a music section. The segmentation error rate (SER) of music section detection was observed to be 19.0%. In addition, when stereo mixed signals were separated into music signals using ICA, the SER was reduced to 11.8%.
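The segmentation error rate (SER) reported above can be sketched as a simple frame-level disagreement count; the frame labels and lengths below are illustrative, not taken from the paper's evaluation setup.

```python
# Minimal sketch of a segmentation error rate (SER): the fraction of
# frames whose hypothesized label disagrees with the reference label.

def segmentation_error_rate(reference, hypothesis):
    """SER = (# frames labeled differently) / (total # frames)."""
    assert len(reference) == len(hypothesis)
    errors = sum(r != h for r, h in zip(reference, hypothesis))
    return errors / len(reference)

ref = ["music", "music", "speech", "noise", "music"]
hyp = ["music", "speech", "speech", "music", "music"]
print(segmentation_error_rate(ref, hyp))  # 2 of 5 frames differ -> 0.4
```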

A Mono-To-Stereo Upmixing Algorithm Based on the Harmonic-Percussive Separation (타악기 음원 분리에 기반한 모노-스테레오 업믹싱 기법)

  • Choi, Keunwoo
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2013.06a / pp.60-63 / 2013
  • In this research, a mono-to-stereo upmixing algorithm based on music source separation is proposed. For the upmixing, harmonic-percussive separation is implemented for jazz music. In the first proposed approach, the separated sources are re-panned so as to equalize the loudness at the listener's left and right sides. In the other approach, the harmonic sources are spread by a decorrelator while the percussive sources are panned to the center. In the experiments, the re-panning algorithm showed improved performance in terms of localization and timbral quality.
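The abstract does not specify how the harmonic-percussive separation is performed; one common approach (median filtering of a magnitude spectrogram, after Fitzgerald) is sketched below as an illustrative assumption. `spec[f][t]` is the magnitude at frequency bin `f`, frame `t`; sustained tones form horizontal ridges, percussive hits form vertical spikes.

```python
# Hedged sketch of median-filtering harmonic/percussive masking (an assumed
# method, not necessarily the paper's): compare the median across time
# (harmonic evidence) with the median across frequency (percussive evidence).
from statistics import median

def hp_masks(spec, k=3):
    """Binary harmonic/percussive masks for a magnitude 'spectrogram'
    given as a list of rows, one per frequency bin."""
    n_f, n_t = len(spec), len(spec[0])
    harm = [[False] * n_t for _ in range(n_f)]
    perc = [[False] * n_t for _ in range(n_f)]
    for f in range(n_f):
        for t in range(n_t):
            # median across time: large for sustained (harmonic) content
            row = [spec[f][min(n_t - 1, max(0, t + d))] for d in range(-k, k + 1)]
            # median across frequency: large for broadband (percussive) content
            col = [spec[min(n_f - 1, max(0, f + d))][t] for d in range(-k, k + 1)]
            h, p = median(row), median(col)
            harm[f][t] = h >= p
            perc[f][t] = p > h
    return harm, perc
```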

A Study on Vocal Separation from Mixtured Music

  • Kim, Hyun-Tae;Park, Jang-Sik
    • Journal of information and communication convergence engineering / v.9 no.2 / pp.161-165 / 2011
  • Recently, with growing interest in original-sound karaoke machines, manufacturers of MIDI-type karaoke equipment have been seeking a cheaper method than re-recording the original accompaniment. A technique for separating the singing voice from the music accompaniment is very useful in such equipment. We propose a system that separates the singing voice from the music accompaniment in stereo recordings. Our system consists of three stages. The first stage is a spectral change detector. The second stage classifies the input into vocal and non-vocal portions using a GMM classifier. The last stage is a selective frequency separation stage. Spectrogram results from computer-based extraction simulations show that the separation succeeds, and listening tests on the extracted accompaniment (MR) confirm that the vocals are separated and removed successfully.

Vocal Separation in Music Using SVM and Selective Frequency Subtraction (SVM과 선택적 주파수 차감법을 이용한 음악에서의 보컬 분리)

  • Kim, Hyun-Tae
    • The Journal of the Korea institute of electronic communication sciences / v.10 no.1 / pp.1-6 / 2015
  • Recently, with growing interest in original-sound karaoke machines, manufacturers of MIDI-type karaoke equipment have been seeking a cheaper method than re-recording the original accompaniment. Specifically, the aim is to produce an original-sound accompaniment by removing only the singer's voice from the album recording. In this paper, a system that separates vocal components from the music accompaniment in stereo recordings is proposed. The proposed system consists of two stages. The first stage is vocal detection, which classifies the input into vocal and non-vocal portions using an SVM with MFCC features. In the second stage, selective frequency subtraction is performed at each frequency bin in the vocal portions. Listening tests with the vocal-removed music from the proposed system showed a relatively high satisfaction level.

Stereo Sound Image Expansion Using Phase Difference and Sound Pressure Level Difference in Television (위상차와 음압 레벨차를 이용한 텔레비전에서의 스테레오 음상 확대)

  • 박해광;오제화
    • Proceedings of the IEEK Conference / 1998.10a / pp.1243-1246 / 1998
  • Three-dimensional (3-D) sound is a technique for generating or recreating sounds so that they are perceived as emanating from locations in three-dimensional space. Three-dimensional sound has the potential to increase the feeling of realism in music or movie soundtracks. Three-dimensional sound effects depend on psychoacoustic spectral and phase cues being present in the reproduced signal. In this paper we propose an effective algorithm for sound image expansion in television systems using stereo image enhancement techniques. Compared to other three-dimensional sound techniques, the proposed algorithm uses only two speakers to enhance the sound image expansion while maintaining the original sound characteristics.
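One simple way to widen a stereo image by manipulating the level difference between channels is a mid/side decomposition with a side gain. This is an illustrative sketch only; the paper's actual processing also uses phase-difference cues, which are not reproduced here, and the `width` parameter is an assumption.

```python
# Hedged sketch of mid/side stereo widening: scale the side (L-R)
# component to exaggerate the inter-channel level difference.

def widen(left, right, width=1.5):
    """Per-sample mid/side widening of two equal-length channel lists."""
    out_l, out_r = [], []
    for l, r in zip(left, right):
        mid = 0.5 * (l + r)   # common (center) component
        side = 0.5 * (l - r)  # difference component carrying the image width
        out_l.append(mid + width * side)
        out_r.append(mid - width * side)
    return out_l, out_r
```

With `width = 1` the signal passes through unchanged; values above 1 push panned sources further toward the sides.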

A Study on the Development for 3D Audio Generation Machine

  • Kim Sung-Eun;Kim Myong-Hee;Park Man-Gon
    • Journal of Korea Multimedia Society / v.8 no.6 / pp.807-813 / 2005
  • The production and authoring of digital multimedia contents are among the most important fields in multimedia technology. Nowadays web-based technology and related multimedia software technology are growing in the IT industry, and these technologies are evolving rapidly in our daily life. Digital audio and video processing technology is being used to improve our quality of life, and through three-dimensional (3D) digital sound technology, as well as 3D digital video technology, we are increasingly interested in refined and artistic experiences in the music and entertainment areas. The service field of digital audio contents is expanding rapidly through the Internet, and Internet users want audio content services of better quality. Recently, Internet users are no longer satisfied with 2-channel stereo sound quality and are seeking high-quality sound with 5.1 channels, such as the 3D audio of movie films. However, proper hardware equipment may be needed to provide 3D sound services that satisfy this demand. In this paper, we extend a previously developed simple 3D audio generator and propose a web-based music bank, by developing software for a 3D audio generation player that creates a 3D sound environment with only two speakers, minimizing hardware requirements. We believe that this study will contribute greatly to high-quality digital 3D sound services for music and entertainment enthusiasts.

Gender Display in Music Videos: Gender Image and Sexuality by Genre and Gender (국내 뮤직비디오에 나타난 성역할 고정관념: 노래 장르와 성별 차이를 중심으로)

  • Joe, Susan
    • The Journal of the Korea Contents Association / v.14 no.7 / pp.58-69 / 2014
  • The purpose of this study was to examine gender-role stereotypes in music videos by comparing gender image and sexuality (body exposure and sexual expression) across genres and genders. A content analysis of 300 songs and 517 characters released between 2004 and 2013 was conducted. The results include the following. While women were portrayed with a classic image, men were portrayed with a naive image. R&B and ballad videos demonstrated more classic images of women; ballad, R&B, and rock videos demonstrated more naive images of men than other genres. Sexuality was more prominent in the dance and hip-hop genres. Compared to male characters, female characters were more sexually objectified, more likely to expose their bodies, and more likely to demonstrate sexually alluring behavior.

Vocal Separation Using Selective Frequency Subtraction Considering with Energies and Phases (에너지와 위상을 고려한 선택적 주파수 차감법을 이용한 보컬 분리)

  • Kim, Hyuntae;Park, Jangsik
    • Journal of Broadcast Engineering / v.20 no.3 / pp.408-413 / 2015
  • Recently, with growing interest in original-sound karaoke machines, manufacturers of MIDI-type karaoke equipment have been seeking a cheaper method than re-recording the original accompaniment. Specifically, the aim is to produce an original-sound accompaniment by removing only the singer's voice from the album recording. In this paper, a system that separates vocal components from the music accompaniment in stereo recordings is proposed. The proposed system consists of two stages. The first stage is vocal detection, which classifies the input into vocal and non-vocal portions using an SVM with MFCC features. In the second stage, selective frequency subtraction is performed at each frequency bin in the vocal portions; the decision considers not only the energy of each frequency bin but also its phase in each channel signal. Listening tests with the vocal-removed music from the proposed system showed a relatively high satisfaction level.
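The per-bin energy-and-phase decision can be sketched as follows. This is a hedged illustration, not the paper's exact rule: it assumes vocals are center-panned, so bins where the two channels are similar in both magnitude and phase are suppressed, and the thresholds `energy_tol` and `phase_tol` are invented for the example.

```python
# Hedged sketch of "selective frequency subtraction" in a vocal frame:
# zero out bins that look center-panned (similar energy AND similar phase
# in both channels), keep the rest.
import cmath

def suppress_center_bins(L, R, energy_tol=0.25, phase_tol=0.35):
    """L, R: complex spectra of one frame per channel. Returns filtered (L, R)."""
    out_l, out_r = [], []
    for l, r in zip(L, R):
        e_l, e_r = abs(l), abs(r)
        similar_energy = abs(e_l - e_r) <= energy_tol * max(e_l, e_r, 1e-12)
        # phase of l * conj(r) is the inter-channel phase difference
        phase_diff = abs(cmath.phase(l * r.conjugate())) if e_l and e_r else cmath.pi
        if similar_energy and phase_diff <= phase_tol:
            out_l.append(0j)  # assumed vocal (center) bin: subtract it
            out_r.append(0j)
        else:
            out_l.append(l)   # assumed accompaniment bin: keep it
            out_r.append(r)
    return out_l, out_r
```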

Hand Gesture Recognition for Understanding Conducting Action (지휘행동 이해를 위한 손동작 인식)

  • Je, Hong-Mo;Kim, Ji-Man;Kim, Dai-Jin
    • Proceedings of the Korean Information Science Society Conference / 2007.10c / pp.263-266 / 2007
  • We introduce a vision-based hand gesture recognition method for understanding musical time and patterns without extra special devices. We suggest a simple and reliable vision-based hand gesture recognition method with two features. First, the motion-direction code is proposed, which is a quantized code for motion directions. Second, the conducting feature point (CFP), the point where the motion changes suddenly, is also proposed. The proposed hand gesture recognition system extracts the human hand region by segmenting the depth information generated by stereo matching of image sequences. It then follows the motion of the center of gravity (COG) of the extracted hand region and generates gesture features such as the CFP and the direction code. Finally, we obtain the current timing pattern of the beat and tempo of the music being played. The experimental results on the test data set show that the musical time pattern and tempo recognition rate is over 86.42% for motion histogram matching, and 79.75% for CFP tracking only.
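A motion-direction code of the kind named above can be sketched as quantizing the direction of a frame-to-frame COG motion vector into a small number of discrete codes. The abstract does not give the code book, so the convention below (8 codes, code 0 pointing right, increasing counter-clockwise in 45-degree sectors) is an assumption.

```python
# Hedged sketch of a motion-direction code: quantize a motion vector's
# angle into n_codes equal sectors (assumed convention: code 0 = +x axis).
import math

def motion_direction_code(dx, dy, n_codes=8):
    """Return the sector index of the motion vector (dx, dy)."""
    angle = math.atan2(dy, dx) % (2 * math.pi)  # angle in [0, 2*pi)
    sector = 2 * math.pi / n_codes
    # offset by half a sector so each code is centered on its direction
    return int((angle + sector / 2) // sector) % n_codes
```

A sequence of such codes, together with the points where the code changes abruptly (the CFP idea), is enough to describe the up/down/side strokes of a conducting pattern.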
