• Title/Summary/Keyword: key word analysis (핵심단어 분석)

A Study of Fundamental Frequency for Focused Word Spotting in Spoken Korean (한국어 발화음성에서 중점단어 탐색을 위한 기본주파수에 대한 연구)

  • Kwon, Soon-Il;Park, Ji-Hyung;Park, Neung-Soo
    • The KIPS Transactions: Part B / v.15B no.6 / pp.595-602 / 2008
  • The focused word of each sentence helps in recognizing and understanding spoken Korean. To find a method for spotting focused words in speech signals, we analyzed the average and variance of the fundamental frequency (F0) and the average energy extracted from the focused word and from the other words in a sentence, using speech data from 100 spoken sentences. The results showed that focused words have either a higher relative average F0 or a higher relative variance of F0 than other words. Our findings contribute to characterizing the prosody of spoken Korean and to keyword extraction based on natural language processing.
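The comparison described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: "relative" is taken to mean a word's value divided by the average over all words in the sentence, and the function names, the threshold of 1.2, and the F0 values are hypothetical, not taken from the paper.

```python
from statistics import mean, pvariance

def relative_f0_stats(word_f0):
    """Map each word to (relative mean F0, relative F0 variance).

    word_f0 maps a word to its list of F0 samples in Hz.
    Assumption: "relative" = value / average of that value over all words.
    """
    means = {w: mean(v) for w, v in word_f0.items()}
    variances = {w: pvariance(v) for w, v in word_f0.items()}
    avg_mean = mean(means.values())
    avg_var = mean(variances.values())
    return {
        w: (means[w] / avg_mean, variances[w] / avg_var if avg_var else 0.0)
        for w in word_f0
    }

def focus_candidates(word_f0, threshold=1.2):
    """Words whose relative mean OR relative variance exceeds the threshold."""
    stats = relative_f0_stats(word_f0)
    return [w for w, (m, v) in stats.items() if m > threshold or v > threshold]

# Hypothetical F0 samples for a three-word sentence.
sentence = {
    "today": [120.0, 122.0, 121.0],
    "SEOUL": [180.0, 210.0, 165.0],
    "rains": [118.0, 119.0, 120.0],
}
candidates = focus_candidates(sentence)
```

With these made-up values only "SEOUL" is flagged, because both its mean F0 and its F0 variance sit well above the sentence average.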

Implementation of summarization system for documents by using a word co-occurrence graph (단어의 공기 관계 그래프를 이용한 문서 요약 시스템의 구현)

  • Ryu, Je;Sun, Bok-Keun;Park, Boh-A;Han, Kwang-Rok
    • Proceedings of the Korean Information Science Society Conference / 2000.04b / pp.348-350 / 2000
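  • This paper describes the implementation of a system for summarizing the content of documents. Analyzing a document's content requires two tasks: extracting the document's keywords, and using the extracted keywords to find the document's core content. To extract keywords, we use a morphological analyzer and preprocessor together with a keyword extractor based on a word co-occurrence graph. A key-sentence extractor then uses the extracted keywords to find the key sentences of the document, and a syntactic parser analyzes the extracted sentences so that the content can be summarized.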
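A minimal sketch of keyword extraction from a word co-occurrence graph, followed by key-sentence selection. It assumes morphological analysis has already produced token lists, that two words co-occur when they share a sentence, and that words are ranked by weighted degree; the abstract does not specify these details, and all names and data below are hypothetical.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_graph(sentences):
    """Count, for each unordered word pair, the sentences containing both."""
    edges = Counter()
    for tokens in sentences:
        for a, b in combinations(sorted(set(tokens)), 2):
            edges[(a, b)] += 1
    return edges

def top_keywords(edges, k=2):
    """Rank words by weighted degree; ties are broken alphabetically."""
    degree = Counter()
    for (a, b), weight in edges.items():
        degree[a] += weight
        degree[b] += weight
    ranked = sorted(degree.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked[:k]]

def key_sentence(sentences, keywords):
    """Return the first sentence containing the most distinct keywords."""
    return max(sentences, key=lambda s: len(set(s) & set(keywords)))

# Hypothetical pre-tokenized sentences.
sentences = [
    ["graph", "keyword", "extraction"],
    ["graph", "summary", "keyword"],
    ["summary", "sentence"],
]
keywords = top_keywords(cooccurrence_graph(sentences))
best = key_sentence(sentences, keywords)
```

Here "graph" and "keyword" receive the highest weighted degree, and the first sentence is selected because it contains both.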

Presidential Candidate's Speech based on Network Analysis : Mainly on the Visibility of the Words and the Connectivity between the Words (18대 대통령 선거 후보자의 연설문 네트워크 분석: 단어의 가시성(visibility)과 단어 간 연결성(connectivity)을 중심으로)

  • Hong, Ju-Hyun;Yun, Hae-Jin
    • The Journal of the Korea Contents Association / v.14 no.9 / pp.24-44 / 2014
  • This study explores, from a communication perspective, the political meaning of the speeches and statements of the candidates who ran in the 18th presidential election. The visibility of words and the connectivity between words are analyzed in terms of structure and of vision and policy. Visibility is measured by the frequency with which words are mentioned in a speech or statement. Connectivity between words is analyzed through network analysis and expressed as a graph. For candidate Park, the key words are 'the happiness of the people' and 'appointment'. The key words for candidate Moon are 'regime change' and 'the Korean Peninsula', and the key words for candidate Ahn are 'the people' and 'change'. Methodologically, this study contributes to research on candidates' discourse by using network analysis to examine the connectivity of words systematically. Theoretically, it uses the results of the network analysis to reveal the components of leadership in the speeches and statements. In conclusion, this study highlights an extension of communication studies.

Content Analysis of Food and Nutrition unit in Middle School Textbooks of Home Economics - Focus on the National Curriculums from 1st to 2009 revised (중학교 가정(기술·가정)교과 식생활 영역의 핵심 교육내용 분석 - 제1차 교육과정부터 2009개정 교육과정의 교과서 내용을 중심으로 -)

  • Jang, Yoon-Mi;Kim, Yoo Kyeong
    • Journal of Korean Home Economics Education Association / v.30 no.4 / pp.93-112 / 2018
  • We analysed middle school Home Economics textbooks from the 1st to the 2009 revised curriculum to investigate the contents and the share of the Food and Nutrition section. Key words were generated with a word cloud technique based on text mining, and the share of the Food and Nutrition section was expressed as a ratio of pages. Across the curriculums, the core key words of the Food and Nutrition section were 'raw food', 'food', and 'diet'. In the 1st and 2nd curriculums, the main key words related to food materials, condiments, and nutrients such as 'vitamin' and 'protein'. Words such as 'nutrition', 'eating', and 'requirement' first appeared in the 3rd curriculum, 'portion' in the 6th, and 'diet' and 'adolescence' in the 7th. The mean share of the Food and Nutrition section in Home Economics was 24.3%. While the share was as high as 31.8% in the 7th curriculum, it fell sharply to 15.2% in the 2009 revised curriculum. In addition, the Food and Nutrition section consisted of 10 mid-level units during the 2nd and 3rd curriculums, but was reduced to 2 small units with no mid-level category in the 2009 revised curriculum. Although the contents of the Food and Nutrition section have been developed and adapted to the needs of society across the curriculums, its share in Home Economics has been reduced, especially in the 2009 revised curriculum, which could raise concerns about the health of individuals and communities.

Development of big data based Skin Care Information System SCIS for skin condition diagnosis and management

  • Kim, Hyung-Hoon;Cho, Jeong-Ran
    • Journal of the Korea Society of Computer and Information / v.27 no.3 / pp.137-147 / 2022
  • Diagnosing and managing skin condition is a basic and important function for workers in the beauty and cosmetics industries. Accurate diagnosis and management require an understanding of customers' skin condition and needs. In this paper, we developed SCIS, a big data-based skin care information system that supports skin condition diagnosis and management using social media big data. With the developed system, core information for skin condition diagnosis and management can be analyzed and extracted from text information. SCIS consists of a big data collection stage, a text preprocessing stage, an image preprocessing stage, and a text word analysis stage. SCIS collected the big data needed for skin diagnosis and management, and extracted key words and topics from text information through simple frequency analysis, relative frequency analysis, co-occurrence analysis, and correlation analysis of key words. In addition, the extracted key words and information are analyzed and visualized in various ways, such as scatter plots, NetworkX, t-SNE, and clustering, so that they can be used efficiently in diagnosing and managing skin conditions.

Korea National College of Agriculture and Fisheries in Naver News by Web Crolling : Based on Keyword Analysis and Semantic Network Analysis (웹 크롤링에 의한 네이버 뉴스에서의 한국농수산대학 - 키워드 분석과 의미연결망분석 -)

  • Joo, J.S.;Lee, S.Y.;Kim, S.H.;Park, N.B.
    • Journal of Practical Agriculture & Fisheries Research / v.23 no.2 / pp.71-86 / 2021
  • This study was conducted to find information on the university's image from words related to 'Korea National College of Agriculture and Fisheries (KNCAF)' in Naver News. For this purpose, word frequency analysis, TF-IDF evaluation, and semantic network analysis were performed on articles collected by web crawling. In the word frequency analysis, 'agriculture', 'education', 'support', 'farmer', 'youth', 'university', 'business', 'rural', and 'CEO' were important words. In the TF-IDF evaluation, the key words were 'farmer', 'drone', 'Ministry of Agriculture, Food and Rural Affairs', 'Jeonbuk', 'young farmer', 'agriculture', 'Chonju', 'university', 'device', and 'spreading'. In the semantic network analysis, the bigrams with the highest correlations were, in order, 'youth' - 'farmer', 'digital' - 'agriculture', 'farming' - 'settlement', 'agriculture' - 'rural', and 'digital' - 'turnover'. When the importance of keywords was evaluated with five centrality indices, 'agriculture' ranked first. The keywords ranked second were 'farmers' (Cc, Cb), 'education' (Cd, Cp), and 'future' (Ce). Spearman's rank correlation coefficient between centrality indices showed that degree centrality and PageRank centrality produced the most similar rankings. In terms of word frequency, the important words in the KNCAF articles of Naver News were 'agriculture', 'education', 'support', 'farmer', and 'youth'. However, in the evaluation that takes document frequency into account, words such as 'farmer', 'drone', 'Ministry of Agriculture, Food and Rural Affairs', 'Jeonbuk', and 'young farmers' were found to be key words. For centrality analysis that considers the network connectivity between words, Cd and Cp were suitable indices. The words with strong centrality were 'agriculture', 'education', 'future', 'farmer', 'digital', 'support', and 'utilization'.
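The contrast between raw word frequency and TF-IDF reported here can be illustrated with a small sketch. The abstract does not state which TF-IDF variant was used, so the formula below (raw term count times log(N / document frequency)) is an assumption, and the documents are made up.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {word: score} dict per document.

    Assumed variant: score = term count * log(N / document frequency).
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(doc))  # count each word once per document
    return [
        {w: count * math.log(n_docs / doc_freq[w]) for w, count in Counter(doc).items()}
        for doc in docs
    ]

# Hypothetical tokenized news articles.
docs = [
    ["agriculture", "drone", "drone"],
    ["agriculture", "education"],
    ["agriculture", "farmer"],
]
frequency = Counter(word for doc in docs for word in doc)
scores = tf_idf(docs)
```

In this toy corpus 'agriculture' is the most frequent word overall, yet its TF-IDF score is zero because it appears in every document, while 'drone' scores highest in the first article. This mirrors how the two evaluations in the abstract can surface different key words.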

Analyzing Self-Introduction Letter of Freshmen at Korea National College of Agricultural and Fisheries by Using Semantic Network Analysis : Based on TF-IDF Analysis (언어네트워크분석을 활용한 한국농수산대학 신입생 자기소개서 분석 - TF-IDF 분석을 기초로 -)

  • Joo, J.S.;Lee, S.Y.;Kim, J.S.;Kim, S.H.;Park, N.B.
    • Journal of Practical Agriculture & Fisheries Research / v.23 no.1 / pp.89-104 / 2021
  • Semantic network analysis (SNA) was conducted on the self-introduction letters of freshmen entering Korea National College of Agriculture and Fisheries (KNCAF) in 2020, based on TF-IDF weights, which evaluate the importance of words that play a key role. The top three words by TF-IDF weight were agriculture, mathematics, study (Q. 1); clubs, plants, friends (Q. 2); friends, clubs, opinions (Q. 3); and mushrooms, insects, fathers (Q. 4). In the relationships between words, the words with high betweenness centrality were reason, high school, attending (Q. 1); garbage, high school, school (Q. 2); importance, misunderstanding, completion (Q. 3); and processing, feed, farmhouse (Q. 4). The words with high degree centrality were high school, inquiry, grades (Q. 1); garbage, cleanup, class time (Q. 2); opinion, meetings, volunteer activities (Q. 3); and processing, space, practice (Q. 4). The word pairs that appeared together most frequently, that is, with high correlation, were 'certification - acquisition', 'problem - solution', 'science - life', and 'misunderstanding - concession'. In the cluster analysis, the number of clusters obtained from the height of the cluster dendrogram was 2 (Q. 1), 4 (Q. 2 and Q. 4), and 5 (Q. 3). Cohesion within clusters was high, and heterogeneity between clusters was clear.

Contextual Advertisement System based on Document Clustering (문서 클러스터링을 이용한 문맥 광고 시스템)

  • Lee, Dong-Kwang;Kang, In-Ho;An, Dong-Un
    • The KIPS Transactions: Part B / v.15B no.1 / pp.73-80 / 2008
  • This paper proposes a method for finding advertisement keywords using document clustering, in order to solve problems caused by ambiguous words and by incorrect identification of main keywords. News articles that have similar contents and the same advertisement keywords are clustered to construct the contextual information of the advertisement keywords. In addition to news articles, the web page and summary of a product are also used to construct the contextual information. A given document is classified into one of the news article clusters, and the advertisement keywords relevant to that cluster are then used to identify keywords in the document. The proposed method improved precision by 21%.
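The classify-then-match step can be sketched as below. The abstract does not say how a document is assigned to a cluster, so cosine similarity between bag-of-words vectors is an assumption here, and the cluster names, centroids, and keyword lists are hypothetical.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def ad_keywords_for(doc_tokens, clusters):
    """Assign the document to its most similar cluster, then keep that
    cluster's advertisement keywords that actually occur in the document."""
    doc = Counter(doc_tokens)
    best = max(clusters, key=lambda name: cosine(doc, clusters[name]["centroid"]))
    return best, [k for k in clusters[best]["ad_keywords"] if k in doc]

# Hypothetical clusters: "apple" is ambiguous between fruit and phones.
clusters = {
    "fruit": {
        "centroid": Counter({"apple": 3, "orchard": 2, "juice": 1}),
        "ad_keywords": ["apple", "juice"],
    },
    "phones": {
        "centroid": Counter({"apple": 2, "iphone": 3, "screen": 1}),
        "ad_keywords": ["iphone", "apple"],
    },
}
doc = ["apple", "iphone", "screen", "review"]
cluster, keywords = ad_keywords_for(doc, clusters)
```

The ambiguous word "apple" occurs in both clusters, but the surrounding words place the document in the "phones" cluster, so only that cluster's advertisement keywords are matched.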

The Study on the Software Educational Needs by Applying Text Content Analysis Method: The Case of the A University (텍스트 내용분석 방법을 적용한 소프트웨어 교육 요구조사 분석: A대학을 중심으로)

  • Park, Geum-Ju
    • Journal of the Korea Academia-Industrial cooperation Society / v.20 no.3 / pp.65-70 / 2019
  • The purpose of this study is to understand college students' needs regarding the software curriculum, based on educational satisfaction surveys from software lecture evaluations, and to identify an improvement plan by applying text content analysis. A text content analysis program was used to calculate word frequencies, select key words, and count co-occurrences of key words, and a network analysis program was used to analyze text centrality and the network. In the network of strengths of the software education, 'lecturer' occurred most frequently, followed by 'kindness', 'student', 'explanation', and 'coding'. In the network of weaknesses, 'lecture', 'wish to', 'student', 'lecturer', 'assignment', 'coding', 'difficult', and 'announcement' were mentioned most often and appeared together. The comprehensive network analysis of both strengths and weaknesses compared the key words and revealed differences among them, for example in 'group activity or task', 'assignment', 'difficulty on level of lecture', and 'thinking about lecturer'. These differences point to unclear roles of individual members in group activities, difficult and excessive tasks, awareness of the difficulty and necessity of software education, and shortcomings in instructors' teaching methods and feedback. Therefore, it is necessary to examine not only how groups are formed and assignments (or tasks) are given in software education, but also how group activities and tasks are carried out, and to monitor the contents of lectures, teaching methods, and the ratio of practice and design thinking.

Correlation Analysis of the Arirangs Based on the Informatics Algorithms (정보 알고리즘 기반 아리랑의 계통도 및 상관관계 분석)

  • Kim, Hak Yong
    • The Journal of the Korea Contents Association / v.14 no.4 / pp.407-417 / 2014
  • Arirang is the most famous Korean folk song and was registered with UNESCO (United Nations Educational, Scientific and Cultural Organization) as an intangible cultural heritage in 2012. Most arirangs are composed of text and refrain parts. The genealogy of the arirangs was classified by refrain pattern using a multiple sequence alignment algorithm. There are two different refrain patterns, slow and fast melodies. Of 106 arirangs, 38 contain fast melodies and 68 contain slow melodies. 73 arirangs and their 104 key words were extracted from a bipartite arirang network composed of arirangs, text works, and their relationships. The correlation among the arirangs was analyzed from the selected arirangs and key words using a pairwise comparison matrix. The correlation among the arirangs was also analyzed by stepwise removal of single-degree nodes from the bipartite arirang network. In this study, the genealogy of and correlation among arirangs were analyzed using informatics algorithms and network technology, which can serve as a stepping stone for the popularization and globalization of the arirangs.
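The stepwise removal of single-degree nodes from a bipartite song-word network can be sketched as follows. This is a minimal illustration that assumes the removal is repeated until no single-degree node remains; the abstract does not give the stopping rule, and the song and word labels are hypothetical.

```python
def prune_single_degree(edges):
    """Repeatedly drop edges that touch a node of degree 1.

    edges is a collection of (song, word) pairs from a bipartite network.
    Songs and words are counted separately, so a song and a word may share a label.
    """
    edges = set(edges)
    while True:
        song_degree, word_degree = {}, {}
        for song, word in edges:
            song_degree[song] = song_degree.get(song, 0) + 1
            word_degree[word] = word_degree.get(word, 0) + 1
        kept = {
            (song, word)
            for song, word in edges
            if song_degree[song] > 1 and word_degree[word] > 1
        }
        if kept == edges:  # nothing removed in this step: stable core reached
            return kept
        edges = kept

# Hypothetical network: songs A, B, C linked to the key words they contain.
network = {
    ("A", "love"), ("A", "hill"),
    ("B", "love"), ("B", "hill"),
    ("C", "hill"), ("C", "river"),
}
core = prune_single_degree(network)
```

In the first step 'river' is removed because only song C uses it. That leaves C with a single link, so C is removed in the second step, and the remaining core connects songs A and B through their shared key words.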