• Title/Summary/Keyword: text mining analysis


Information Security Consultants' Role: Analysis of Job Ads in the US and Korea (정보보호 컨설턴트의 역할: 미국과 한국의 구인광고 분석)

  • Sang-Woo Park;Tae-Sung Kim;Hyo-Jung Jun
    • Information Systems Review / v.22 no.3 / pp.157-172 / 2020
  • Demand for information security consultants is expected to increase due to the emergence of ISMS-P, which incorporates ISMS and PIMS, the implementation of the EU General Data Protection Regulation (GDPR), and various security incidents. In this paper, we collected and analyzed postings from job advertisement sites, which state firms' demand explicitly. We selected representative job advertisement sites in Korea and the United States and collected the details of job advertisements for information security consultants in 2014 and 2019. The collected data were visualized using text mining and analyzed using non-parametric methods to determine whether the role of the information security consultant had changed. The findings show that the requirements for information security consultants have changed very little, meaning that the role did not change much over the five-year gap. The results of the study are expected to be helpful to policy makers concerned with information security consultants, those seeking employment as information security consultants, and those seeking to hire them.

Cost Performance Evaluation Framework through Analysis of Unstructured Construction Supervision Documents using Binomial Logistic Regression (비정형 공사감리문서 정보와 이항 로지스틱 회귀분석을 이용한 건축 현장 비용성과 평가 프레임워크 개발)

  • Kim, Chang-Won;Song, Taegeun;Lee, Kiseok;Yoo, Wi Sung
    • Journal of the Korea Institute of Building Construction / v.24 no.1 / pp.121-131 / 2024
  • This research explores the potential of leveraging unstructured data from construction supervision documents, which contain detailed inspection insights from independent third-party monitors of building construction processes. With the evolution of analytical methodologies, such unstructured data has been recognized as a valuable source of information, offering diverse insights. The study introduces a framework designed to assess cost performance by applying advanced analytical methods to the unstructured data found in final construction supervision reports. Specifically, key phrases were identified using text mining and social network analysis techniques, and these phrases were then analyzed through binomial logistic regression to assess cost performance. The study found that predictions of cost performance based on unstructured data from supervision documents achieved an accuracy rate of approximately 73%. The findings of this research are anticipated to serve as a foundational resource for analyzing various forms of unstructured data generated within the construction sector in future projects.

Trend Analysis of Dance Performance Research Using Keywords and Topic Modeling of LDA Techniques (LDA 토픽 모델링 기법을 활용한 무용공연의 연구 동향 분석)

  • SI YU
    • Journal of Industrial Convergence / v.22 no.3 / pp.13-25 / 2024
  • This study explores research topics related to dance performances published in Korea based on big data and examines how research trends change with the times. Topic modeling analysis derived six major topics: (1) a study on marketing strategies and development plans for dance performances, (2) a study on the re-watching factors of dance performance space and performance satisfaction, (3) a study on the popularity and contribution of dance performances in the stage environment, (4) a study on the current status of dance performances and the convergence of dance group operations, (5) a study on the definition of dance performances using various social media, and (6) a study on the direction and development of technology-applied dance performance contents. Accordingly, research trends and topics related to dance, including dance performances and social changes, and keywords reflecting researchers' changing interests were extracted, and the keywords were compared and analyzed to present academic changes and countermeasures. The need for research applying new technologies was also emphasized as the field diversified and converged.

Analysis of Perceptions and Differences between Groups regarding Generative AI (생성형 AI에 관한 인식 및 집단간 차이 분석)

  • Kyoo-Sung Noh
    • Journal of Digital Convergence / v.22 no.1 / pp.15-21 / 2024
  • The purpose of this study is to analyze the use of generative AI and the perception of differences between user groups. This study explored the perceptions of different user groups regarding generative AI, aiming to derive implications for enhancing AI utilization capabilities for each group. Upon analysis, it was found that there were no significant differences in perceptions across age groups. However, notable differences were observed between professional backgrounds, particularly in the areas of generative AI application and ethical perspectives. Consequently, this study suggests the need for diversified AI solutions tailored to specific fields of expertise. It underscores the importance of customized education and training programs, as well as specialized education focused on ethical considerations. Additionally, this research contributes academically by proposing varied AI usage strategies for different age and professional groups. It also highlights the role of text mining techniques in developing and improving AI utilization skills.

Research Trends in e-commerce Using Topic Modeling: Focusing on SCOPUS Database (토픽 모델링을 활용한 e-commerce 연구 동향: SCOPUS DB 데이터를 중심으로)

  • Tae-Gu Kang
    • Journal of Industrial Convergence / v.22 no.10 / pp.1-9 / 2024
  • E-commerce has emerged as a key economic driver in the digital age, and the growing importance of the e-commerce market has led to rapid expansion in related research areas. This paper analyzes research trends in e-commerce from 1996, when e-commerce emerged and research began, to the present day. To this end, we used R and LDA topic modeling techniques on data from SCOPUS, an international academic database, centered on the core keyword "e-commerce", and conducted a validity test on the number of topics and an analysis of the predictive value of the topic model. The analysis of topics showed that e-commerce, model, study, data, and online were among the important topics. Logistics was also found to be important. In the rapidly changing and complex e-commerce market environment, it is important to respond to the diversification of business models and to establish a stable revenue structure in order to survive. As continuous growth of the e-commerce market is predicted, the results of this study can be used as basic data for entering the e-commerce market and expanding business through countermeasures and strategies.

Korean Word Sense Disambiguation using Dictionary and Corpus (사전과 말뭉치를 이용한 한국어 단어 중의성 해소)

  • Jeong, Hanjo;Park, Byeonghwa
    • Journal of Intelligence and Information Systems / v.21 no.1 / pp.1-13 / 2015
  • As opinion mining in big data applications has been highlighted, a lot of research on unstructured data has been conducted. Social media on the Internet generate unstructured or semi-structured data every second, often written in the natural human languages we use in daily life. Many words in human languages have multiple meanings or senses. As a result, it is very difficult for computers to extract useful information from these datasets. Traditional web search engines are usually based on keyword search, which produces incorrect search results that are far from users' intentions. Even though a lot of progress has been made over recent years in enhancing the performance of search engines to provide users with appropriate results, there is still much room for improvement. Word sense disambiguation can play a very important role in natural language processing and is considered one of the most difficult problems in this area. Major approaches to word sense disambiguation can be classified as knowledge-based, supervised corpus-based, and unsupervised corpus-based approaches. This paper presents a method which automatically generates a corpus for word sense disambiguation by taking advantage of examples in existing dictionaries, avoiding expensive sense tagging processes. It evaluates the effectiveness of the method based on the Naïve Bayes model, a supervised learning algorithm, using the Korean standard unabridged dictionary and the Sejong Corpus. The Korean standard unabridged dictionary has approximately 57,000 sentences. The Sejong Corpus has about 790,000 sentences tagged with both part-of-speech and senses. For the experiment of this study, the Korean standard unabridged dictionary and the Sejong Corpus were evaluated both in combination and as separate entities using cross validation. Only nouns, the target subjects in word sense disambiguation, were selected. 93,522 word senses among 265,655 nouns and 56,914 sentences from related proverbs and examples were additionally combined in the corpus. The Sejong Corpus was easily merged with the Korean standard unabridged dictionary because the Sejong Corpus was tagged based on sense indices defined by the Korean standard unabridged dictionary. Sense vectors were formed after the merged corpus was created. Terms used in creating sense vectors were added to the named entity dictionary of a Korean morphological analyzer. Using the extended named entity dictionary, terms were extracted from the input sentences and term vectors for the sentences were created. Given the extracted term vector and the sense vector model made during the pre-processing stage, the sense-tagged terms were determined by vector space model based word sense disambiguation. In addition, this study shows the effectiveness of the merged corpus built from examples in the Korean standard unabridged dictionary and the Sejong Corpus. The experiment shows that better precision and recall are obtained with the merged corpus. This study suggests that the method can practically enhance the performance of internet search engines and help to capture the meaning of a sentence more accurately in natural language processing tasks pertinent to search engines, opinion mining, and text mining. The Naïve Bayes classifier used in this study is a supervised learning algorithm based on Bayes' theorem. The Naïve Bayes classifier assumes that all senses are independent. Even though this assumption is not realistic and ignores the correlation between attributes, the Naïve Bayes classifier is widely used because of its simplicity, and in practice it is known to be very effective in many applications such as text classification and medical diagnosis. However, further research needs to be carried out to consider all possible combinations and/or partial combinations of all senses in a sentence. Also, the effectiveness of word sense disambiguation may be improved if rhetorical structures or morphological dependencies between words are analyzed through syntactic analysis.
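
The supervised step described in this abstract can be illustrated with a minimal sketch of Naïve Bayes word sense disambiguation. It is not the authors' implementation: the training examples, the English toy word "bank", the sense labels, and the function names `train` and `disambiguate` are all hypothetical, and add-one (Laplace) smoothing is an assumption not stated in the abstract.

```python
import math
from collections import Counter, defaultdict

def train(examples):
    """Count senses and context words from (sense, context_words) pairs."""
    sense_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for sense, words in examples:
        sense_counts[sense] += 1
        word_counts[sense].update(words)
        vocab.update(words)
    return sense_counts, word_counts, vocab

def disambiguate(context, model):
    """Return the sense with the highest Naive Bayes log-score for the context."""
    sense_counts, word_counts, vocab = model
    total = sum(sense_counts.values())
    best_sense, best_score = None, -math.inf
    for sense, n in sense_counts.items():
        # Log prior of the sense.
        score = math.log(n / total)
        # Add-one smoothing (an assumption) so unseen words do not zero the score.
        denom = sum(word_counts[sense].values()) + len(vocab)
        for w in context:
            score += math.log((word_counts[sense][w] + 1) / denom)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense

# Hypothetical sense-tagged examples, standing in for dictionary example sentences.
examples = [
    ("bank/finance", ["deposit", "money", "account"]),
    ("bank/finance", ["loan", "interest", "money"]),
    ("bank/river", ["river", "water", "shore"]),
    ("bank/river", ["fishing", "river", "mud"]),
]
model = train(examples)
print(disambiguate(["money", "loan"], model))
print(disambiguate(["river", "water"], model))
```

Each context word contributes to the score independently, which is the independence assumption the abstract describes as unrealistic but effective in practice.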

A Study on Industry-specific Sustainability Strategy: Analyzing ESG Reports and News Articles (산업별 지속가능경영 전략 고찰: ESG 보고서와 뉴스 기사를 중심으로)

  • WonHee Kim;YoungOk Kwon
    • Journal of Intelligence and Information Systems / v.29 no.3 / pp.287-316 / 2023
  • As the global energy crisis and the COVID-19 pandemic have emerged as social issues, there is a growing demand for companies to move away from profit-centric business models and embrace sustainable management that balances environmental, social, and governance (ESG) factors. ESG activities of companies vary across industries, and industry-specific weights are applied in ESG evaluations. Therefore, it is important to develop strategic management approaches that reflect the characteristics of each industry and the importance of each ESG factor. Additionally, with the strengthened focus on ESG disclosures, specific guidelines are needed to identify and report on the sustainable management activities of domestic companies. To understand corporate sustainability strategies, analyzing ESG reports and news articles by industry can help identify strategic characteristics in specific industries. However, each company has its own unique strategies and report structures, making it difficult to grasp detailed trends or action items. In our study, we analyzed ESG reports (2019-2021) and news articles (2019-2022) of six companies in the 'Finance,' 'Manufacturing,' and 'IT' sectors to examine the sustainability strategies of leading domestic ESG companies. Text mining techniques such as keyword frequency analysis and topic modeling were applied to identify industry-specific and ESG element-specific management strategies and issues. The analysis revealed that in the 'Finance' sector, customer-centric management strategies and efforts to promote an inclusive culture within and outside the company were prominent. Strategies addressing climate change, such as carbon neutrality and expanding green finance, were also emphasized. In the 'Manufacturing' sector, the focus was on creating sustainable communities through occupational health and safety issues, sustainable supply chain management, low-carbon technology development, and eco-friendly investments to achieve carbon neutrality. 
In the 'IT' sector, there was a tendency to focus on technological innovation and digital responsibility to enhance social value through technology. Furthermore, the key issues identified in the ESG factors were as follows: under the 'Environmental' element, issues such as greenhouse gas and carbon emission management, industry-specific eco-friendly activities, and green partnerships were identified. Under the 'Social' element, key issues included social contribution activities through stakeholder engagement, supporting the growth and coexistence of members and partner companies, and enhancing customer value through stable service provision. Under the 'Governance' element, key issues were identified as strengthening board independence through the appointment of outside directors, risk management and communication for sustainable growth, and establishing transparent governance structures. The exploration of the relationship between ESG disclosures in reports and ESG issues in news articles revealed that the sustainability strategies disclosed in reports were aligned with the issues related to ESG disclosed in news articles. However, there was a tendency to strengthen ESG activities for prevention and improvement after negative media coverage that could have a negative impact on corporate image. Additionally, environmental issues were mentioned more frequently in news articles compared to ESG reports, with environmental-related keywords being emphasized in the 'Finance' sector in the reports. Thus, ESG reports and news articles shared some similarities in content due to the sharing of information sources. However, the impact of media coverage influenced the emphasis on specific sustainability strategies, and the extent of mentioning environmental issues varied across documents. Based on our study, the following contributions were derived. 
From a practical perspective, companies need to consider their characteristics and establish sustainability strategies that align with their capabilities and situations. From an academic perspective, unlike previous studies on ESG strategies, we present a subdivided methodology through analysis considering the industry-specific characteristics of companies.

Research Suggestion for Disaster Prediction using Safety Report of Korea Government (안전신문고를 이용한 재난 예측 방법론 제안)

  • Lee, Jun;Shin, Jindong;Cho, Sangmyeong;Lee, Sanghwa
    • Journal of Korean Society of Disaster and Security / v.12 no.4 / pp.15-26 / 2019
  • Anjunshinmungo (the Safety e-Report) has been in operation since 2014, and about 1 million reports had accumulated by June 2019. This study analyzes the contents of more than 1 million Safety e-Report submissions to determine how powerful and meaningful the public's voice and interest are. In particular, we are interested in forecasting ability: we wanted to check whether Safety e-Report submissions were related to possible disasters. To this end, the researchers received the reported data as text and analyzed it using natural language analysis methods. Newspaper articles published during the analysis period were also analyzed, and the correlation between the contents of the reports and the newspaper articles was examined. As a result, accidents occurred within a few months as the number of reports related to response and confirmation increased, and analyzing the contents of safety reports previously filed about social instability can be used to predict future disasters.

A Study on the Research Trends on Domestic Platform Government using Topic Modeling (토픽 모델링을 활용한 한국의 플랫폼정부 연구동향 분석)

  • Suh, Byung-Jo;Shin, Sun-Young
    • Informatization Policy / v.24 no.3 / pp.3-26 / 2017
  • The amount of unstructured data generated online is increasing exponentially, and text data is being analyzed in various fields. In order to identify research trends on the platform government, the title, year, academic society, and abstract of academic papers on platform government were collected from DBPIA (www.dbpia.co.kr), a database of domestic papers. The results of the existing research on the platform government and related fields were analyzed based on each stage of the national informatization promotion. Technology, service, and governance topics were extracted from papers on platform government, and the trends of core topics were analyzed by year. As the era of the intelligent information society begins, this study is significant in providing a basis for defining a new role of government - the platform government that sets the stage for the private sector to lead innovation and plays the role of an 'enabler' and 'facilitator' instead. The purpose of this study is to understand platform government research through objective analysis of its trends. Looking toward future directions, this study will contribute to future research by providing reference materials.

A Literature Review and Classification of Recommender Systems on Academic Journals (추천시스템관련 학술논문 분석 및 분류)

  • Park, Deuk-Hee;Kim, Hyea-Kyeong;Choi, Il-Young;Kim, Jae-Kyeong
    • Journal of Intelligence and Information Systems / v.17 no.1 / pp.139-152 / 2011
  • Recommender systems have become an important research field since the emergence of the first paper on collaborative filtering in the mid-1990s. In general, recommender systems are defined as supporting systems which help users find information, products, or services (such as books, movies, music, digital products, web sites, and TV programs) by aggregating and analyzing suggestions from other users (that is, reviews from various authorities) and user attributes. Although academic research on recommender systems has increased significantly over the last ten years, more research is needed before it is applicable to real-world situations, because the field is still broad and less mature than other research fields. Accordingly, the existing articles on recommender systems need to be reviewed with a view to the next generation of recommender systems. However, given the nature of recommender system research, it is not easy to confine it to specific disciplines. We therefore reviewed all articles on recommender systems from 37 journals published from 2001 to 2010. The 37 journals were selected from the top 125 journals of the MIS Journal Rankings. The literature search was based on the descriptors "Recommender system", "Recommendation system", "Personalization system", "Collaborative filtering" and "Contents filtering". The full text of each article was reviewed to eliminate articles that were not actually related to recommender systems. Many articles were excluded because conference papers, master's and doctoral dissertations, textbooks, unpublished working papers, non-English publications, and news items were unfit for our research. We classified articles by year of publication, journal, recommendation field, and data mining technique. The recommendation fields and data mining techniques of 187 articles were reviewed and classified into eight recommendation fields (book, document, image, movie, music, shopping, TV program, and others) and eight data mining techniques (association rule, clustering, decision tree, k-nearest neighbor, link analysis, neural network, regression, and other heuristic methods). The results presented in this paper have several significant implications. First, based on previous publication rates, interest in recommender system research will grow significantly in the future. Second, 49 articles are related to movie recommendation, whereas image and TV program recommendation are identified in only 6 articles. This result is attributed to the ease of using the MovieLens data set, so data sets for other fields need to be prepared. Third, social network analysis has recently been used in various applications, but studies on recommender systems using social network analysis are scarce. We therefore expect that new recommendation approaches using social network analysis will be developed, and evaluating recommender system research that uses social network analysis will be an interesting area for further research. This study presents trends in recommender system research by examining the published literature, and provides practitioners and researchers with insight and future direction on recommender systems. We hope that this research helps anyone who is interested in recommender systems research to gain insight for future research.