• Title/Summary/Keyword: Text data

Correspondence Strategy for Big Data's New Customer Value and Creation of Business (빅 데이터의 새로운 고객 가치와 비즈니스 창출을 위한 대응 전략)

  • Koh, Joon-Cheol;Lee, Hae-Uk;Jeong, Jee-Youn;Kim, Kyung-Sik
    • Journal of the Korea Safety Management & Science / v.14 no.4 / pp.229-238 / 2012
  • Over the last 10 years, internet use has become a daily activity, and humankind has had to face the Data Deluge, a dramatic increase in digital data (Economist 2012). Because of the exponential increase in the amount of digital data, large-scale data has become a major issue, and hence the term 'big data' appeared. There is no official agreement on a quantitative, detailed definition of 'big data', but its meaning is expanding to cover its value and efficacy. Big data includes not only standardized internal personal information such as customer information, but also complex external, atypical, social, and real-time data. Big data technology is a concept that covers a wide range of technologies, including data acquisition, storage/management, analysis, and application. Technologies connected with big data include Big Table, Cassandra, Hadoop, MapReduce, HBase, and NoSQL, and among the sub-techniques, Text Mining, Opinion Mining, Social Network Analysis, and Cluster Analysis are gaining attention. The three features that 'big data' needs to have concern the creation of large amounts of individual elements (high resolution) and a variety of high-frequency data. Big data has three defining features, volume, variety, and velocity, which are called the '3V'. Complexity is increasingly cited as a 4th feature, and the more fully all four features are satisfied, the better the data fits the term 'big data'. In this study, we looked at various reasons why companies need to adopt 'big data', ways of applying it, and advanced domestic and foreign application cases. To respond effectively to the 'big data' revolution, a paradigm shift in data production, distribution, and consumption is needed, and insight for anticipating and preparing future business by considering the unpredictable market of technology, the industry environment, and the flow of social demand is urgently needed.

A Study on Development of Patent Information Retrieval Using Textmining (텍스트 마이닝을 이용한 특허정보검색 개발에 관한 연구)

  • Go, Gwang-Su;Jung, Won-Kyo;Shin, Young-Geun;Park, Sang-Sung;Jang, Dong-Sik
    • Journal of the Korea Academia-Industrial cooperation Society / v.12 no.8 / pp.3677-3688 / 2011
  • A patent information retrieval system can serve a variety of purposes. In general, patent information is retrieved using a limited set of keywords, so identifying earlier technology and priority rights requires repeated effort. This study proposes a method of content-based retrieval using text mining. Using the proposed algorithm, each document is assigned a characteristic value. The characteristic values are used to compare the similarity between query documents and database documents. Text analysis is composed of 3 steps: stop-word removal, keyword analysis, and weighted value calculation. In the test results, general retrieval and the proposed algorithm were compared using accuracy measurements. Because the study ranks the result documents by their similarity to the query document, the searcher can improve efficiency by reviewing the most similar documents first. Also, because the full text of a patent document can be entered as the query, users unfamiliar with searching can use it easily and quickly. By using content-based retrieval instead of keyword-based retrieval to extend the scope of the search, it can also reduce the amount of missing data in the displayed results.
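
The three-step text analysis and similarity ranking described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' algorithm: the abstract does not name its weighting scheme or similarity measure, so TF-IDF weights and cosine similarity are assumptions here, and the stop-word list and function names are hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word list; the paper's actual list is not given.
STOP_WORDS = {"a", "an", "the", "of", "and", "for", "to", "in", "is", "with"}


def keywords(text):
    """Steps 1 and 2: lowercase, tokenize, and drop stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]


def tfidf_vectors(docs):
    """Step 3: weight each keyword by TF-IDF (an assumed weighting scheme)."""
    tokenized = [keywords(d) for d in docs]
    n = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Smoothed IDF keeps a nonzero weight for terms found in every document.
        vectors.append(
            {t: c * (math.log((1 + n) / (1 + df[t])) + 1) for t, c in tf.items()}
        )
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm = math.sqrt(sum(w * w for w in u.values())) * math.sqrt(
        sum(w * w for w in v.values())
    )
    return dot / norm if norm else 0.0


def rank(query, database):
    """Return database indices ordered from most to least similar to the query."""
    vectors = tfidf_vectors(database + [query])
    q = vectors[-1]
    scores = [(cosine(q, v), i) for i, v in enumerate(vectors[:-1])]
    return [i for _, i in sorted(scores, key=lambda s: (-s[0], s[1]))]
```

Calling `rank(full_text_of_query_patent, list_of_database_texts)` returns the database documents in the order a searcher would review them, most similar first, which mirrors the full-text query use case the abstract describes.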

Text Mining-Based Analysis for Research Trends in Vocational Studies (텍스트 마이닝을 활용한 직업학 연구동향 분석)

  • Yook, Dong-In
    • Journal of the Korea Academia-Industrial cooperation Society / v.18 no.3 / pp.586-599 / 2017
  • This study attempts to understand the overall research trends in Vocational Studies using a text mining method, which is a means to analyze big data. The findings of the research show that Vocational Studies in Korea has been directly influenced by global economic crises, as evidenced by its exponential growth after the 1997 foreign exchange crisis that resulted in a bailout from the IMF. In addition, the topics of research have been shifting from such macro subjects as government policies and systems to such micro topics as individual career development. Moreover, the focus of research has been moving from the socially vulnerable, including women and the disabled, to the economically marginalized, including retirees and the unemployed. As for the research targets, college students overwhelmingly outnumbered primary and secondary school students. However, few cases analyzed the clinical outcomes of career counseling or attempted to process job information and study the history of jobs. This research is limited in that it only analyzed journal abstracts. Nonetheless, it is meaningful because it used topic analysis, one of the text mining methods, to give a complete enumeration of all articles available for search, thereby crafting a framework of quantitative analysis methodology for Vocational Studies. It is also significant in that it is the first attempt to analyze themes in every stage of the development of Vocational Studies.

The Image of Ruralism in Korea through a Text Mining for Online News Media analysis (인터넷 뉴스 데이터 텍스트 분석을 통해 본 우리나라 농촌다움에 대한 이미지 연구)

  • Son, Yong-hoon;Kim, Young-jin
    • Journal of Korean Society of Rural Planning / v.25 no.4 / pp.13-26 / 2019
  • The rural areas in South Korea have changed rapidly in the process of national land development. Rural landscapes have become discoloured, and their attractiveness has decreased as cities have expanded. However, the attractiveness or multifunctional values of rural areas have become more important in contemporary society around the world. In line with this social demand, efforts to conserve the rural landscape are of high priority, and the recovery of ruralism in these areas is required. This study tried to understand how the public image of ruralism in South Korea has been influenced by the news media. The study retrieved news articles from a web search portal using six keywords commonly used to refer to ruralism: 'rural landscape', 'rural community', 'rural tourism', 'rural life', 'rural amenity', and 'rural environment'. News data for each of the six keywords were collected for the year-periods 2004-05, 2007-08, 2012-13, and 2016-17. In the text mining analysis, the nouns with high Degree Centrality were identified, and the changes by year-period were examined. Then, LDA topic analysis was performed on the text datasets of the six keywords. As a result, the study found that the news articles focused on only a handful of issues such as 'poor rural living condition', 'regional or village improvement projects', 'rural tourism promotion projects', and 'other government support projects'. On the other hand, nouns related to virtues and values in the rural landscape appeared less often in news articles. These results have become more apparent in recent years. In the topic analysis, 35 topics were identified. 'Village development projects', 'rural tourism', and 'urban-rural exchange projects' appeared repeatedly across several keywords. Among the topics, there are also topics closely related to ruralism such as 'rural landscape conservation', 'eco-friendly rural areas', 'local amenity resources', 'public interest values of agriculture', and 'rural life and communities'. The study presented an image map showing ruralism in South Korea using a network map between all topics and keywords. At the end of the study, implications for Korean rural area policy and research directions were discussed.
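
As a rough illustration of the Degree Centrality step mentioned in this abstract, the sketch below builds a noun co-occurrence network and scores each noun by the share of other nouns it is linked to. It rests on assumptions the abstract does not confirm: that two nouns are linked when they occur in the same article, and that each article has already been reduced to a list of nouns (morphological analysis is outside this sketch). The function name is hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def degree_centrality(documents):
    """Normalized degree centrality for a noun co-occurrence network.

    `documents` is a list of noun lists, one per article. Two nouns are
    linked if they appear together in at least one article (an assumed
    linking rule). Nouns that never co-occur with another noun are not
    added to the network.
    """
    neighbors = defaultdict(set)
    for nouns in documents:
        for a, b in combinations(sorted(set(nouns)), 2):
            neighbors[a].add(b)
            neighbors[b].add(a)
    n = len(neighbors)
    if n < 2:
        return {node: 0.0 for node in neighbors}
    # Degree divided by the maximum possible degree (n - 1).
    return {node: len(adj) / (n - 1) for node, adj in neighbors.items()}
```

Running this separately on the articles of each year-period and comparing the highest-scoring nouns is one way to track the kind of change over time that the study reports.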

Effective Text Question Analysis for Goal-oriented Dialogue (목적 지향 대화를 위한 효율적 질의 의도 분석에 관한 연구)

  • Kim, Hakdong;Go, Myunghyun;Lim, Heonyeong;Lee, Yurim;Jee, Minkyu;Kim, Wonil
    • Journal of Broadcast Engineering / v.24 no.1 / pp.48-57 / 2019
  • The purpose of this study is to understand the intention of the inquirer from a single text-type question in Goal-Oriented Dialogue. A Goal-Oriented Dialogue system is a dialogue system that satisfies the user's specific needs via text or voice. The intention analysis process is the step of analysing the user's intention of inquiry prior to answer generation, and it has a great influence on the performance of the entire Goal-Oriented Dialogue system. The proposed model was applied to a daily chemical products domain, using Korean text data related to that domain. The analysis is divided into speech-act, which is independent of a specific field, and concept-sequence, which depends on a specific field. We propose a classification method that uses a word embedding model and a CNN to analyze speech-act and concept-sequence. The semantic information of each word is abstracted through the word embedding model, and concept-sequence and speech-act classification are performed by the CNN based on the abstracted semantic information of the words.

Trend Analysis of Fraudulent Claims by Long Term Care Institutions for the Elderly using Text Mining and BIGKinds (텍스트 마이닝과 빅카인즈를 활용한 노인장기요양기관 부당청구 동향 분석)

  • Youn, Ki-Hyok
    • Journal of Internet of Things and Convergence / v.8 no.2 / pp.13-24 / 2022
  • To explore the context of fraudulent claims by long-term care institutions for the elderly, which are increasing every year in Korea, and the measures for preventing them, this study conducted a text mining analysis of media report articles. The articles were collected from the news big data analysis system called 'BIG KINDS' for about 15 years, from July 2008, when the Long-Term Care Insurance for the Elderly took effect, to February 28th 2022. For this period, a total of 2,627 articles were collected under keywords such as 'elderly care+fraudulent claims' and 'long-term care+fraudulent claims', and among them, a total of 946 articles were selected after excluding duplicate articles. In the results of the text mining analysis, first, the top 10 keywords mentioned most frequently over the entire period (July 1st 2008-February 28th 2022) were, in order, long-term care institution for the elderly, fraudulent claims, National Health Insurance Service, Long-Term Care Insurance for the Elderly, long-term care benefits (expenses), elderly care facilities, the Ministry of Health & Welfare, the elderly, report, and reward (payment). Second, in the results of the N-gram analysis, the most frequent pairs were, in order, long-term care benefits (expenses) and fraudulent claims, fraudulent claims and long-term care institution for the elderly, falsehood and fraudulent claims, report and reward (payment), and long-term care institution for the elderly and report. Third, the results of the TF-IDF analysis were similar to those of the frequency analysis, while the rankings of report, reward (payment), and increase moved up. Based on these results, this study presented future directions for preventing fraudulent claims by long-term care institutions for the elderly.
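
The frequency and N-gram analyses reported in this abstract can be approximated with a simple counter over adjacent keywords. This is only a sketch under stated assumptions: each article is assumed to be already tokenized into an ordered keyword list, and the function name `ngram_counts` is hypothetical rather than taken from the study or from BIG KINDS.

```python
from collections import Counter


def ngram_counts(documents, n=2):
    """Count n-grams (runs of n adjacent keywords) across all documents.

    `documents` is a list of keyword lists in their original order.
    With n=1 this reduces to a plain keyword frequency count.
    """
    counts = Counter()
    for tokens in documents:
        # zip over n shifted copies of the list yields each adjacent n-tuple.
        counts.update(zip(*(tokens[i:] for i in range(n))))
    return counts
```

Calling `ngram_counts(docs, n=2).most_common(5)` would list the five most frequent adjacent keyword pairs, which is the form in which the abstract reports its N-gram results.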

Effect of Collaborative Problem-Solving for Competency Instruction Strategy Using Science Reading Text on Elementary School Students' Science Reading Ability (과학 읽기 자료를 이용한 협력적 문제해결 중심 과학 수업이 초등학교 학생들의 과학 읽기 능력에 미치는 영향)

  • Park, Jihun;Jun, Jaekyoung;Lee, Sujin;Nam, Jeonghee
    • Journal of Korean Elementary Science Education / v.41 no.4 / pp.642-657 / 2022
  • This study aimed to investigate how elementary school students' science reading ability is influenced by collaborative problem-solving for competency instruction strategy using science reading text. This study recruited two groups of elementary students in fifth grade. The experimental group underwent an instruction strategy using science reading text, while the comparative group experienced a science class using a textbook. Afterward, data from the science reading ability tests, voice recordings of the discussion process involving each group, and class videos were collected and analyzed. The results showed that science classes that used collaborative problem-solving for their competency instruction strategy via science reading text were effective in enhancing elementary school students' science reading ability. Meanwhile, the science reading ability test results indicated that the experimental group had statistically higher total scores than the comparative group in the three subelements, especially "introspection and evaluation" and "integration and interpretation" owing to their significant improvement in high-level cognitive processes. In these classes, the students read the materials that the teacher provided, participated in the discussion based on what they have read, and had the chance to reflect on their reading processes. Overall, students' science reading ability was enhanced through this process.

Analysis of Research Trends in Elder Abuse Using Text Mining : Academic Papers from 2004 to 2021. (텍스트 마이닝 분석을 통한 노인학대 관련 연구 동향 분석 : 2004년~2021년까지 발행된 국내 학술논문을 중심으로)

  • Youn, Ki-Hyok
    • Journal of Internet of Things and Convergence / v.8 no.4 / pp.25-40 / 2022
  • This study aimed to understand the increasing number of elder abuse cases in South Korea, where entry into a super-aged society is imminent, by applying text mining analysis. Korean academic journal papers were obtained from 2004, the year the senior care agency was established, to 2021. We performed natural language processing of the titles, keywords, and abstracts and divided them into three periods to identify latent meanings in the data. The results showed that the first period included 81 papers, the second 64, and the third 104, averaging 13.8 per year; the annual number increased from 2014 until it fell below the annual average in 2020. Word frequency analysis showed that the keywords common to all periods were, in order, 'elder abuse,' 'elders,' 'influences,' 'factors,' 'recognition,' 'family,' 'society,' 'prevention plans,' 'experiences,' 'abused elders,' 'abuse prevention,' 'depression,' etc. TF-IDF indicated that 'influences,' 'recognition,' 'society,' 'prevention plans,' 'abuse prevention,' 'experiences,' 'depression,' etc., were the keywords common to all periods. Network text analysis showed that the keywords commonly represented across all periods were 'elder abuse,' 'elders,' 'influences,' 'factors,' 'characteristics,' 'recognition,' 'family,' 'prevention plans,' 'society,' 'abuse prevention,' and 'experiences'. CONCOR analysis showed that the first period consisted of 5 groups, the second 7, and the third 6. We suggest future directions for elder abuse research based on the results.

Analysis of Research Trends in Relation to the Yellow Sea using Text Mining (텍스트 마이닝을 활용한 황해 관련 연구동향 분석연구)

  • Kyu Won Hwang;Kim Jinkyung;Kang Seung-Koo;Kang Gil Mo
    • Journal of the Korean Society of Marine Environment & Safety / v.29 no.7 / pp.724-739 / 2023
  • Located in the sea area between South Korea, North Korea, and China, the Yellow Sea plays an important role from a geopolitical perspective, and recently, as the use of marine space in the Yellow Sea has expanded, its social and economic value has increased further. In addition, owing to rapid climate change, the need for joint response and cooperation between Korea and China is increasing in various fields, including changes in the marine environment and marine ecosystem and the generation and movement of air pollutants. Accordingly, in this study, core topics were derived from research papers with the Yellow Sea as a keyword, and research trends to date were explored through author network analysis. As a specific research method, research papers related to the Yellow Sea published between 1984 and 2021 were extracted from the Web of Science database and were classified into four periods to derive core topics using topic modeling, a type of text mining. Furthermore, the influences of major research communities, researchers, and research institutes in the relevant fields were identified by analyzing the author network, and their implications were presented. The analysis results indicated that the core topics of research papers on the Yellow Sea had changed over time, and that differences existed in the influence (centrality) of key researchers. Finally, the results of this study are intended to help identify research trends, major researchers, and research institutes related to the Yellow Sea and to contribute to future research cooperation between Korea and China regarding the Yellow Sea.

An Exploratory Study of e-Learning Satisfaction: A Mixed Methods of Text Mining and Interview Approaches (이러닝 만족도 증진을 위한 탐색적 연구: 텍스트 마이닝과 인터뷰 혼합방법론)

  • Sun-Gyu Lee;Soobin Choi;Hee-Woong Kim
    • Information Systems Review / v.21 no.1 / pp.39-59 / 2019
  • E-learning has improved educational outcomes by making it possible to learn anytime and anywhere, moving away from traditional rote-learning education. As the use of e-learning systems increases with the growing popularity of e-learning, it has become important to measure e-learning satisfaction. In this study, we used a mixed research method to identify satisfaction factors of e-learning. A mixed research method performs both qualitative and quantitative research together. For the quantitative research, we collected reviews from Udemy.com by text mining. We then classified lectures into high-rated and low-rated groups and applied a topic modeling technique to derive factors from the reviews. For the qualitative research, this study conducted in-depth 1:1 interviews with e-learning learners. By combining these results, we were able to derive factors of e-learning satisfaction and dissatisfaction. Based on these factors, we suggested ways to improve e-learning satisfaction. In contrast to past research, which was mainly survey-based, this study collects actual data by text mining. The academic significance of this study is that the results of the topic modeling are combined with factors based on the information system success model.