• Title/Summary/Keyword: Extract Emotion

A study on speech disentanglement framework based on adversarial learning for speaker recognition (화자 인식을 위한 적대학습 기반 음성 분리 프레임워크에 대한 연구)

  • Kwon, Yoohwan;Chung, Soo-Whan;Kang, Hong-Goo
    • The Journal of the Acoustical Society of Korea, v.39 no.5, pp.447-453, 2020
  • In this paper, we propose a system that extracts effective speaker representations from a speech signal using a deep learning method. Since a speech signal contains identity-unrelated information such as text content, emotion, and background noise, we train the model so that the extracted features represent only speaker-related information and not speaker-unrelated information. Specifically, we propose an auto-encoder-based disentanglement method that outputs both speaker-related and speaker-unrelated embeddings using effective loss functions. To further improve reconstruction performance in the decoding process, we also introduce a discriminator of the kind commonly used in the Generative Adversarial Network (GAN) framework. Because better decoding capability helps preserve speaker information and improves disentanglement, it also improves speaker verification performance. Experimental results demonstrate the effectiveness of the proposed method by improving the Equal Error Rate (EER) on the VoxCeleb1 benchmark dataset.
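
The abstract above describes an auto-encoder that splits speaker-related and speaker-unrelated embeddings, plus a GAN-style discriminator on the reconstruction. The following is a minimal PyTorch sketch of that idea, not the authors' implementation; all layer sizes, module names, and the loss weight are illustrative assumptions.

```python
# Minimal sketch of an auto-encoder style disentanglement model with a GAN
# discriminator, loosely following the idea described above. All layer sizes,
# loss weights, and module names are illustrative assumptions.
import torch
import torch.nn as nn

class DisentangleAE(nn.Module):
    def __init__(self, feat_dim=80, spk_dim=128, res_dim=128):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(feat_dim, 256), nn.ReLU())
        self.spk_head = nn.Linear(256, spk_dim)      # speaker-related embedding
        self.res_head = nn.Linear(256, res_dim)      # speaker-unrelated embedding
        self.decoder = nn.Sequential(nn.Linear(spk_dim + res_dim, 256),
                                     nn.ReLU(), nn.Linear(256, feat_dim))

    def forward(self, x):
        h = self.encoder(x)
        spk, res = self.spk_head(h), self.res_head(h)
        recon = self.decoder(torch.cat([spk, res], dim=-1))
        return spk, res, recon

# Discriminator that judges whether a reconstructed frame looks realistic,
# as in a standard GAN setup (trained adversarially in a full pipeline).
disc = nn.Sequential(nn.Linear(80, 128), nn.ReLU(), nn.Linear(128, 1))

model = DisentangleAE()
x = torch.randn(16, 80)                      # a batch of acoustic feature frames
spk, res, recon = model(x)
recon_loss = nn.functional.mse_loss(recon, x)
adv_loss = nn.functional.binary_cross_entropy_with_logits(
    disc(recon), torch.ones(16, 1))          # push reconstructions toward "real"
total_loss = recon_loss + 0.1 * adv_loss     # 0.1 is an assumed weight
```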

Design And Implementation of a Speech Recognition Interview Model based-on Opinion Mining Algorithm (오피니언 마이닝 알고리즘 기반 음성인식 인터뷰 모델의 설계 및 구현)

  • Kim, Kyu-Ho;Kim, Hee-Min;Lee, Ki-Young;Lim, Myung-Jae;Kim, Jeong-Lae
    • The Journal of the Institute of Internet, Broadcasting and Communication, v.12 no.1, pp.225-230, 2012
  • Opinion mining applies existing data mining techniques to text uploaded to the web, such as blog posts and product reviews, in order to extract the author's opinion; rather than identifying the subject of a text, it judges the emotion expressed toward that subject. In this paper, we propose a method that applies a published opinion mining algorithm to text obtained through a speech recognition API, i.e., non-voice data, in order to judge emotions. The proposed system links speech recognized with the open Google Voice Recognition API to a sunwihwa algorithm associated with the subject; an improved design of this algorithm determines polarity, and on this basis the speech-recognition interview model is implemented.
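
As a rough illustration of the polarity-judgement step described above (not the paper's algorithm), the toy sketch below scores a transcribed interview answer against a small sentiment lexicon; the word lists and the example sentence are hypothetical placeholders.

```python
# Toy polarity judgement over a speech-recognition transcript.
# The lexicon is a hypothetical stand-in, not the paper's resource.
POSITIVE = {"good", "great", "excellent", "confident"}
NEGATIVE = {"bad", "poor", "nervous", "weak"}

def polarity(transcript: str) -> str:
    tokens = transcript.lower().split()
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(polarity("The candidate gave a confident and excellent answer"))  # positive
```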

Analyzing Correlations between Movie Characters Based on Deep Learning

  • Jin, Kyo Jun;Kim, Jong Wook
    • Journal of the Korea Society of Computer and Information, v.26 no.10, pp.9-17, 2021
  • Humans are social animals that gain information and social interaction through dialogue. In conversation, the mood of a word can change depending on one person's feelings toward another. Relationships between characters in a film are essential for understanding its story and its dialogue, yet methods to extract this information from films have not been investigated, so a model that automatically analyzes relationship aspects in a movie is needed. In this paper, we propose a method to analyze the relationships between characters in a movie by using deep learning techniques to measure the emotion of each character pair. The proposed method first extracts the main characters from the movie script and finds the dialogue between them. Then, to analyze the relationships between the main characters, it performs sentiment analysis on each dialogue, weights the results according to the position of each dialogue within the overall timeline, and aggregates the scores. Experimental results on real data sets demonstrate that the proposed scheme effectively measures the emotional relationships between the main characters.
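
A minimal sketch of the aggregation step described above, under assumptions: sentiment() stands in for the deep-learning sentiment model, and the linear position weight is an assumed scheme rather than the paper's exact weighting.

```python
# Aggregate per-dialogue sentiment into a pairwise relationship score,
# weighted by where each line falls in the script.
from collections import defaultdict

def sentiment(line: str) -> float:
    """Placeholder for the sentiment model: returns a score in [-1, 1]."""
    return 0.0

def relationship_scores(dialogues):
    """dialogues: list of (speaker_a, speaker_b, line, position) with position in [0, 1]."""
    totals, weights = defaultdict(float), defaultdict(float)
    for a, b, line, pos in dialogues:
        pair = tuple(sorted((a, b)))
        w = 0.5 + 0.5 * pos            # later dialogue weighted more (assumed scheme)
        totals[pair] += w * sentiment(line)
        weights[pair] += w
    return {pair: totals[pair] / weights[pair] for pair in totals}
```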

Video Content Editing System for Senior Video Creator based on Video Analysis Techniques (영상분석 기술을 활용한 시니어용 동영상 편집 시스템)

  • Jang, Dalwon;Lee, Jaewon;Lee, JongSeol
    • Journal of Broadcast Engineering, v.27 no.4, pp.499-510, 2022
  • This paper introduces a video editing system for senior creators who are not familiar with video editing. Based on video analysis techniques, it provides various kinds of information and deletes unwanted shots. The system detects shot boundaries with a Recurrent Neural Network (RNN) and determines which video shots to delete. Shots can be deleted according to a shot-level significance score, which is computed by detecting the focused area; unfocused or motion-blurred shots can be removed based on this significance. The system also detects objects and faces and extracts emotion, age, and gender information from face images, which users can draw on when creating video content. Decorating tools are also provided, and within these tools the preferred design, determined from the user's history, is placed at the front of the design element list. With this video editing system, senior creators can make their own video content easily and quickly.
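
The shot-significance idea above can be illustrated with a simple focus measure. The sketch below is not the system's actual detector: it scores shots by mean gradient magnitude and drops those below an assumed threshold.

```python
# Score each shot by how sharp / in-focus its frames are, then keep only
# shots above a threshold; blurred or unfocused shots fall below it.
import numpy as np

def sharpness(frame: np.ndarray) -> float:
    """Mean gradient magnitude of a grayscale frame of shape (H, W)."""
    gy, gx = np.gradient(frame.astype(float))
    return float(np.mean(np.hypot(gx, gy)))

def significant_shots(shots, threshold=5.0):
    """shots: list of lists of grayscale frames; threshold is an assumed value."""
    kept = []
    for shot in shots:
        score = np.mean([sharpness(f) for f in shot])
        if score >= threshold:
            kept.append(shot)
    return kept
```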

Korean Contextual Information Extraction System using BERT and Knowledge Graph (BERT와 지식 그래프를 이용한 한국어 문맥 정보 추출 시스템)

  • Yoo, SoYeop;Jeong, OkRan
    • Journal of Internet Computing and Services, v.21 no.3, pp.123-131, 2020
  • Along with the rapid development of artificial intelligence technology, natural language processing, which deals with human language, is being actively studied. In particular, BERT, a language model recently proposed by Google, has performed well in many areas of natural language processing by providing models pre-trained on large corpora. Although BERT offers a multilingual model, a model pre-trained on a large Korean corpus should be used, because there are limitations when the original pre-trained BERT model is applied directly to Korean. Text also carries not only vocabulary and grammar but contextual meaning, such as the relation between preceding and following content and the surrounding situation. Existing natural language processing research has focused mainly on lexical or grammatical meaning, yet accurately identifying the contextual information embedded in text plays an important role in understanding context. Knowledge graphs, which link words by their relationships, have the advantage that context can be learned easily by a computer. In this paper, we propose a system that extracts Korean contextual information using a BERT model pre-trained on a Korean corpus together with a knowledge graph. We build models that extract the person, relationship, emotion, space, and time information that is important in a text, and we validate the proposed system through experiments.
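
A rough sketch of wiring a Korean pre-trained BERT to a knowledge graph in the spirit of the system above; the checkpoint name, label handling, and graph schema are assumptions (in practice a Korean BERT fine-tuned for entity tagging would replace the placeholder model), not the authors' code.

```python
# Recognized spans from a Korean BERT token-classification pipeline are
# stored as typed nodes in a graph; relations and emotions would be added
# by further classifiers in the full system.
from transformers import pipeline
import networkx as nx

ner = pipeline("token-classification",
               model="klue/bert-base",          # placeholder; use a Korean NER checkpoint
               aggregation_strategy="simple")

def extract_context(text: str) -> nx.DiGraph:
    graph = nx.DiGraph()
    for ent in ner(text):
        graph.add_node(ent["word"], label=ent["entity_group"])
    return graph
```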

Smartphone Addiction Detection Based Emotion Detection Result Using Random Forest (랜덤 포레스트를 이용한 감정인식 결과를 바탕으로 스마트폰 중독군 검출)

  • Lee, Jin-Kyu;Kang, Hyeon-Woo;Kang, Hang-Bong
    • Journal of IKEEE, v.19 no.2, pp.237-243, 2015
  • Recently, eight out of ten people in Korea own a smartphone, and the number of smartphone applications has also increased, so smartphone addiction has become a social issue. In particular, many people with smartphone addiction cannot control themselves and sometimes do not even realize that they are addicted. Many studies, mostly surveys such as the S-measure, have been conducted to diagnose smartphone addiction. In this paper, we propose how to detect smartphone addiction based on ECG and eye gaze. We measure ECG signals with a Shimmer sensor and eye-gaze signals with the Smart Eye device while subjects watch emotional videos. We extract features from the S-transform of the ECG, and from the eye-gaze signals (pupil diameter, gaze distance, eye blinking) we extract 12 features. The classifier is trained using Random Forest and detects smartphone addiction from the ECG and eye-gaze signals. We compared the detection results with S-measure results surveyed before the test; the method showed 87.89% accuracy with ECG and 60.25% accuracy with eye gaze.
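
A minimal scikit-learn sketch of the classification step described above; the synthetic feature matrix and labels are placeholders for the ECG S-transform and eye-gaze features and the S-measure based labels, and the feature counts are assumptions.

```python
# Random Forest classifier on placeholder feature vectors standing in for
# ECG S-transform and eye-gaze features; labels stand in for S-measure groups.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 24))          # e.g. 12 ECG + 12 eye-gaze features (assumed)
y = rng.integers(0, 2, size=100)        # 1 = addiction-risk group

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
print("accuracy:", clf.score(X_te, y_te))
```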

The Intelligent Determination Model of Audience Emotion for Implementing Personalized Exhibition (개인화 전시 서비스 구현을 위한 지능형 관객 감정 판단 모형)

  • Jung, Min-Kyu;Kim, Jae-Kyeong
    • Journal of Intelligence and Information Systems, v.18 no.1, pp.39-57, 2012
  • Recently, with the introduction of high-tech equipment, much attention has been focused on interactive exhibits, which can double the effect of an exhibition through interaction with the audience. An interactive exhibition also makes it possible to measure a variety of audience reactions; among these, this research uses changes in facial features that can be collected in an interactive exhibition space. We develop an artificial neural network-based prediction model that predicts the audience's response by measuring the change in facial features when the audience is stimulated from a non-excited state, and we represent the audience's emotional state with a Valence-Arousal model. The overall framework consists of six steps. The first step collects data for modeling: the data were gathered from people who participated in the 2012 Seoul DMC Culture Open and were used for the experiments. The second step extracts 64 facial features from the collected data and compensates the facial feature values. The third step generates the independent and dependent variables of an artificial neural network model. The fourth step uses statistical techniques to select the independent variables that affect the dependent variable. The fifth step builds an artificial neural network model and performs learning using the training and test sets. The sixth and final step validates the prediction performance of the artificial neural network model on the validation data set. The proposed model was compared with a statistical prediction model to see whether it performed better; although the data set contained much noise, the proposed model showed better results than a multiple regression model. If this prediction model of audience reaction were used in a real exhibition, it could provide countermeasures and services appropriate to the audience's reaction while viewing the exhibits. Specifically, if the audience's arousal toward an exhibit is low, actions to increase arousal could be taken, for instance recommending other preferred content or using light or sound to draw attention to the exhibit. In other words, when planning future exhibitions, it would be possible to design them to satisfy various audience preferences, fostering a personalized environment in which visitors can concentrate on the exhibits. However, the proposed model still shows low prediction accuracy, for the following reasons. First, the data cover diverse visitors to real exhibitions, so it was difficult to control an optimized experimental environment; the collected data therefore contain much noise, which lowers accuracy. In further research, data collection will be conducted in a more controlled experimental environment, and work to increase the prediction accuracy of the model will continue. Second, changes in facial expression alone are probably not enough to extract audience emotions; combining facial expression with other responses, such as sound or audience behavior, would yield better results.
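
To make the fifth step concrete, here is a small sketch (not the study's model) of a neural network mapping facial-feature changes to Valence-Arousal values; the data is synthetic and the network size is an assumption.

```python
# Neural-network regression from 64 facial-feature changes to (valence, arousal).
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 64))           # 64 facial-feature changes per sample
y = rng.uniform(-1, 1, size=(200, 2))    # columns: valence, arousal (placeholder targets)

model = MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000, random_state=0)
model.fit(X, y)
print(model.predict(X[:3]))              # predicted (valence, arousal) pairs
```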

Emoticon by Emotions: The Development of an Emoticon Recommendation System Based on Consumer Emotions (Emoticon by Emotions: 소비자 감성 기반 이모티콘 추천 시스템 개발)

  • Kim, Keon-Woo;Park, Do-Hyung
    • Journal of Intelligence and Information Systems, v.24 no.1, pp.227-252, 2018
  • The evolution of instant communication has mirrored the development of the Internet, and messenger applications are among the most representative manifestations of instant communication technologies. In messenger applications, senders use emoticons to supplement the emotions conveyed in the text of their messages. Because communication via messenger applications is not face-to-face, it is difficult for senders to communicate their emotions to message recipients. Emoticons have long been used as symbols that indicate the moods of speakers; at present, however, emoticon use is evolving into a means of conveying the psychological states of consumers who want to express individual characteristics and personality quirks while communicating their emotions to others. The fact that companies such as KakaoTalk, Line, and Apple have begun conducting emoticon business, and that sales of related content are expected to increase gradually, testifies to the significance of this phenomenon. Nevertheless, despite the development of emoticons themselves and the growth of the emoticon market, no suitable emoticon recommendation system has yet been developed. Even KakaoTalk, a messenger application that commands more than 90% of the domestic market share in South Korea, merely groups emoticons into broad categories such as popularity and recency, so consumers face the inconvenience of constantly scrolling around to locate the emoticons they want. An emoticon recommendation system would improve consumer convenience and satisfaction and increase the sales revenue of companies that sell emoticons. To recommend appropriate emoticons, it is necessary to quantify the emotions that consumers perceive in emoticons. Such quantification enables us to analyze the characteristics and emotions felt by consumers who used similar emoticons, which in turn facilitates emoticon recommendations. One way to quantify emoticon use is metadata-ization, a means of structuring or organizing unstructured and semi-structured data to extract meaning. By structuring unstructured emoticon data through metadata-ization, we can easily classify emoticons based on the emotions consumers want to express. To determine emoticons' precise emotions, we had to consider sub-detail expressions: not only the seven common emotional adjectives but also the metaphorical expressions that appear only in Korean, as identified by previous emotion studies focusing on emoticon characteristics. We therefore collected the sub-detail expressions of emotion based on "Shape", "Color", and "Adumbration". Moreover, to design a highly accurate recommendation system, we considered both emoticon-technical indexes and emoticon-emotional indexes. We identified 14 features for the emoticon-technical indexes and selected 36 emotional adjectives. The 36 emotional adjectives consisted of contrasting adjectives, which we reduced to 18, and we measured the 18 emotional adjectives using 40 emoticon sets randomly selected from the top-ranked emoticons in the KakaoTalk shop. We surveyed 277 consumers in their mid-twenties who had experience purchasing emoticons; we recruited them online and asked each to evaluate five different emoticon sets. After data acquisition, we conducted a factor analysis of the emoticon-emotional factors and extracted four factors that we named "Comic", "Softness", "Modernity", and "Transparency". We analyzed both the relationship between the indexes and consumer attitude and the relationship between the emoticon-technical indexes and the emoticon-emotional factors. Through this process, we confirmed that the emoticon-technical indexes did not directly affect consumer attitudes but had a mediating effect on consumer attitudes through the emoticon-emotional factors. The results of the analysis revealed the mechanism consumers use to evaluate emoticons; they also showed that the emoticon-technical indexes affected the emoticon-emotional factors and that the emoticon-emotional factors affected consumer satisfaction. We therefore designed the emoticon recommendation system using only the four emoticon-emotional factors, with a recommendation method that calculates the Euclidean distance from each factor's emotion score. To check the accuracy of the emoticon recommendation system, we compared the emotional patterns of the emoticons consumers selected with those of the recommended emoticons, and the patterns corresponded in principle. We verified the emoticon recommendation system by testing prediction accuracy; the predictions were 81.02% accurate in the first result, 76.64% accurate in the second, and 81.63% accurate in the third. This study developed a methodology that can be used in various fields, both academically and practically. We expect that the novel emoticon recommendation system we designed will increase emoticon sales for companies that conduct business in this domain and make consumer experiences more convenient. In addition, this study serves as an important first step toward an intelligent emoticon recommendation system. The emotional factors proposed in this study could be collected in an emotional library that could serve as an emotion index for evaluation when new emoticons are released. Moreover, by combining the accumulated emotional library with company sales data, sales information, and consumer data, companies could develop hybrid recommendation systems that would bolster convenience for consumers and serve as intellectual assets that companies could strategically deploy.
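
The recommendation rule described above, nearest neighbours by Euclidean distance in the four-factor space (Comic, Softness, Modernity, Transparency), can be sketched as follows; the catalog entries and factor scores are made-up placeholders, not measured values.

```python
# Recommend the emoticon sets whose four emotional-factor scores lie closest
# (in Euclidean distance) to the user's target emotion vector.
import numpy as np

catalog = {
    "set_a": np.array([0.8, 0.2, 0.5, 0.1]),   # (Comic, Softness, Modernity, Transparency)
    "set_b": np.array([0.1, 0.9, 0.4, 0.6]),
    "set_c": np.array([0.3, 0.3, 0.9, 0.2]),
}

def recommend(target, k=2):
    """Return the k emoticon sets closest to the target factor vector."""
    dists = {name: float(np.linalg.norm(vec - target)) for name, vec in catalog.items()}
    return sorted(dists, key=dists.get)[:k]

print(recommend(np.array([0.7, 0.3, 0.5, 0.2])))
```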

Preliminary Research on the Effect of Cosmetic Containing Ginseng Extract on Quality of Life of Healthy Women Based on Skindex-16 (인삼 추출물 함유 한방화장품이 건강한 성인 여성의 삶의 질에 미치는 영향에 관한 예비 연구; Skindex-16을 중심으로)

  • Cho, Ga Young;Park, Hyo Min;Kwon, Lee Kyung;Cho, Sung A;Kang, Byung Young;Kim, Yoon Bum
    • Journal of the Society of Cosmetic Scientists of Korea, v.41 no.4, pp.333-340, 2015
  • This study was designed to analyze, with blind testing, the effect of skincare with a cosmetic containing ginseng extract on the quality of life (QOL) of healthy women. QOL is a concept that represents how one's disease or health condition can physically, psychologically, and socially influence daily life. The study assessed the effect of a ginseng cosmetic preparation on QOL using the Skindex-16 score, stratified into blinded and non-blinded options. Forty-five healthy women aged 30 to 49 years with no skin disease were recruited and divided into two groups. Group A (n = 22) received an anti-aging cream with ginseng extract in the original packaging, which included the brand name and logo. Group B (n = 23) received the same cream in a plain white jar without any package decoration or logo. Both groups used the cream for 8 weeks. For the skin-related QOL assessment, Skindex-16 was administered at baseline and at the fourth and eighth weeks. All volunteers except two dropouts in Group A completed the dermatology-specific QOL measure, Skindex-16, at baseline, after 4 weeks, and after 8 weeks of treatment with the provided samples. The mean baseline score of the 43 participants was 22.70 ± 4.82. There was a significant difference between the baseline score and the score after 8 weeks in both groups: the scores changed from 23.30 ± 5.14 to 20.20 ± 4.83 in Group A and from 22.17 ± 4.58 to 20.52 ± 3.60 in Group B. The "Symptom" subscale of Skindex-16 improved after 4 weeks and the "Emotion" subscale improved after 8 weeks in Group A, while the "Function" subscale did not show improvement in either group. Neither the Skindex-16 total nor its subscales showed an interaction effect between follow-up time and group. This research suggests that skincare with a ginseng cream may have a positive effect on QOL in healthy women; moreover, the skincare ritual itself may have a greater impact on QOL improvement than the product packaging.
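
As a small illustration of the within-group baseline versus week-8 comparison reported above, the sketch below runs a paired test on hypothetical Skindex-16 totals; the numbers are placeholders, not the study's data.

```python
# Paired comparison of baseline vs. week-8 Skindex-16 totals for one group
# (placeholder values only).
import numpy as np
from scipy.stats import ttest_rel

baseline = np.array([23.1, 24.0, 22.5, 23.8, 22.9])
week8    = np.array([20.0, 21.2, 19.8, 20.9, 20.1])
t, p = ttest_rel(baseline, week8)
print(f"paired t = {t:.2f}, p = {p:.3f}")   # significant if p < 0.05
```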

Analyzing Heart Rate Variability for Automatic Sleep Stage Classification (수면단계 자동분류를 위한 심박동변이도 분석)

  • 김원식;김교헌;박세진;신재우;윤영로
    • Science of Emotion and Sensibility, v.6 no.4, pp.9-14, 2003
  • Sleep stages have been a useful indicator of how comfortable a person's sleep is, but the traditional method of scoring sleep stages with polysomnography, based on the integrated analysis of the electroencephalogram (EEG), electrooculogram (EOG), electrocardiogram (ECG), and electromyogram (EMG), is too restrictive for participants to sleep comfortably. While the sympathetic nervous system is predominant during wakefulness, the parasympathetic nervous system is more active during sleep, and cardiovascular function is controlled by this autonomic nervous system. We therefore interpreted heart rate variability (HRV) across sleep stages to find a simple method of classifying them. Six healthy male college students participated, and 12 nights of sleep were recorded in this research. Sleep stages based on the "Standard scoring system for sleep stage" were automatically classified with a polysomnograph by measuring EEG, EOG, ECG, and EMG (chin and leg) for the six participants during sleep. To extract only the ECG signals from the polysomnograph and to interpret the HRV, a Sleep Data Acquisition/Analysis System was devised in this research. The power spectrum of HRV was divided into three ranges: low frequency (LF), medium frequency (MF), and high frequency (HF). The LF/HF ratio of Stage W (wakefulness) was 325% higher than that of Stage 2 (p < .05), 628% higher than that of Stage 3 (p < .001), and 800% higher than that of Stage 4 (p < .001). Moreover, this ratio in Stage 4 was 427% lower than in Stage REM (rapid eye movement) (p < .05) and 418% lower than in Stage 1 (p < .05). The LF/HF ratio was observed to decrease monotonically as the sleep stage changed from Stage W through Stage REM, Stage 1, Stage 2, and Stage 3 to Stage 4. While the differences in the MF/(LF+HF) ratio among sleep stages were not significant, it was higher in Stage REM and Stage 3 than in the other sleep stages according to descriptive statistics for the sample group.
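
A sketch of the HRV band-power feature discussed above: the RR-interval series is resampled to an even grid, a power spectrum is estimated, and LF and HF band powers are integrated to form the LF/HF ratio. The band edges are common HRV defaults, assumed here rather than taken from the paper, and the MF band is omitted for brevity.

```python
# LF/HF ratio from an RR-interval series (tachogram) via Welch's method.
import numpy as np
from scipy.signal import welch

def lf_hf_ratio(rr_intervals_s, fs=4.0):
    """rr_intervals_s: successive RR intervals in seconds."""
    rr = np.asarray(rr_intervals_s, dtype=float)
    t = np.cumsum(rr)                              # beat times
    grid = np.arange(t[0], t[-1], 1.0 / fs)        # even 4 Hz time grid
    tachogram = np.interp(grid, t, rr)             # evenly resampled RR series
    f, pxx = welch(tachogram - tachogram.mean(), fs=fs,
                   nperseg=min(256, len(tachogram)))

    def band_power(lo, hi):
        band = (f >= lo) & (f < hi)
        return float(np.sum(pxx[band]) * (f[1] - f[0]))  # rectangle-rule integral

    return band_power(0.04, 0.15) / band_power(0.15, 0.40)   # LF / HF

# Example with synthetic RR intervals around 1 s (about 60 bpm).
rng = np.random.default_rng(0)
print(lf_hf_ratio(1.0 + 0.05 * rng.standard_normal(600)))
```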
