• Title/Summary/Keyword: Concept of the future school

Predictive Clustering-based Collaborative Filtering Technique for Performance-Stability of Recommendation System (추천 시스템의 성능 안정성을 위한 예측적 군집화 기반 협업 필터링 기법)

  • Lee, O-Joun;You, Eun-Soon
    • Journal of Intelligence and Information Systems / v.21 no.1 / pp.119-142 / 2015
  • With the explosive growth in the volume of information, Internet users are experiencing considerable difficulty in obtaining necessary information online. Against this backdrop, ever-greater importance is being placed on recommender systems that provide information catered to user preferences and tastes in an attempt to address information overload. To this end, a number of techniques have been proposed, including content-based filtering (CBF), demographic filtering (DF) and collaborative filtering (CF). Among them, CBF and DF require external information and thus cannot be applied to a variety of domains. CF, on the other hand, is widely used since it is relatively free from domain constraints. The CF technique is broadly classified into memory-based CF, model-based CF and hybrid CF. Model-based CF addresses the drawbacks of CF by using a Bayesian model, clustering model or dependency network model. This filtering technique not only alleviates the sparsity and scalability issues but also boosts predictive performance. However, it involves expensive model-building and results in a tradeoff between performance and scalability. This tradeoff is attributed to reduced coverage, which is a type of sparsity issue. In addition, expensive model-building may lead to performance instability, since changes in the domain environment cannot be immediately incorporated into the model because of the high costs involved. Cumulative changes in the domain environment that have not been reflected eventually undermine system performance. This study combines the Markov model of transition probabilities and the concept of fuzzy clustering with CBCF to propose predictive clustering-based CF (PCCF), which addresses the issues of reduced coverage and unstable performance. The method reduces performance instability by tracking changes in user preferences and bridging the gap between the static model and dynamic users. 
Furthermore, the method eases the issue of reduced coverage by expanding coverage on the basis of transition probabilities and clustering probabilities. The proposed method consists of four processes. First, user preferences are normalized in preference clustering. Second, changes in user preferences are detected from review score entries during preference transition detection. Third, user propensities are normalized using patterns of changes (propensities) in user preferences in propensity clustering. Lastly, the preference prediction model is developed to predict user preferences for items during preference prediction. The proposed method was validated by testing its robustness against performance instability and the scalability-performance tradeoff. The initial test compared and analyzed the performance of individual recommender systems enabled by IBCF, CBCF, ICFEC and PCCF in an environment where data sparsity had been minimized. The following test adjusted the optimal number of clusters in CBCF, ICFEC and PCCF for a comparative analysis of subsequent changes in system performance. The test results revealed that the suggested method produced no significant improvement in performance in comparison with the existing techniques. In addition, it failed to achieve significant improvement in the standard deviation, which indicates the degree of data fluctuation. Nevertheless, it showed marked improvement over the existing techniques in terms of range, which indicates the level of performance fluctuation. The level of performance fluctuation before and after model generation improved by 51.31% in the initial test. In the following test, the level of performance fluctuation driven by changes in the number of clusters improved by 36.05%. This signifies that the proposed method, despite only a slight performance improvement, clearly offers better performance stability than the existing techniques. 
Future work will be directed toward enhancing recommendation performance, which failed to demonstrate significant improvement over the existing techniques. It will consider introducing a high-dimensional parameter-free clustering algorithm or a deep learning-based model to improve recommendation performance.
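
The abstract above does not give the PCCF equations, so the following is only a minimal sketch of the general idea it describes: estimating Markov transition probabilities between preference clusters and pushing a soft (fuzzy) cluster membership one step forward. The function names, the cluster labels and the sample histories are all hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict


def transition_probabilities(histories):
    """Estimate first-order Markov transition probabilities between clusters.

    histories: list of per-user sequences of cluster labels, ordered in time.
    Returns {from_cluster: {to_cluster: probability}}.
    """
    counts = defaultdict(Counter)
    for seq in histories:
        # Count every consecutive (previous, next) pair in the sequence.
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1
    return {
        src: {dst: n / sum(row.values()) for dst, n in row.items()}
        for src, row in counts.items()
    }


def predict_next(membership, transitions):
    """Propagate a soft cluster membership one step through the transitions."""
    nxt = defaultdict(float)
    for src, weight in membership.items():
        for dst, p in transitions.get(src, {}).items():
            nxt[dst] += weight * p
    return dict(nxt)


# Hypothetical data: each list is one user's preference cluster over time.
histories = [["A", "A", "B"], ["A", "B", "B"], ["B", "B", "A"]]
T = transition_probabilities(histories)

# A user who currently belongs half to cluster A and half to cluster B.
forecast = predict_next({"A": 0.5, "B": 0.5}, T)
print(T)
print(forecast)
```

Because the forecast is a probability distribution over clusters rather than a single label, a user can be matched against several clusters at once, which is one plausible reading of how transition and clustering probabilities could expand coverage.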

Stage Costume Design for Performance Hamlet (II) - The Study on Pattern and Manufactured Product - (햄릿 공연을 위한 무대의상 디자인 (II) - 패턴 및 실물제작 -)

  • Kim, Soon-Ku;Hwang, Seong-Won
    • Fashion & Textile Research Journal / v.6 no.1 / pp.41-50 / 2004
  • This research proposes stage costumes for Shakespeare's play Hamlet as performed by Yunheedan Guhri Pae (the Street Theater Troupe). Stage costumes play an important role in conveying the characteristics of each character to the audience and have a strong visual effect. To design the costumes from the objective viewpoint of the audience, a survey on the images of the characters was conducted among people who had actually watched the performance, and costume designs were proposed according to the survey results. Hamlet a: The result was applied to propose a black sweater, black leather pants and a vest. Hamlet b: The result was applied to propose a hooded coat in purple at a middle level of brightness and color spectrum, and a yellow coat. For a free image, loose blue pants and a vest in the same color tone were proposed. Gertrude a: The result was applied to propose a dress based on a tailored suit, in purple (violet) with a reddish tone. Gertrude b: The result was applied to propose a purple gown and a one-piece dress with black lace. Ophelia a: The result was applied to propose a feminine white dress and a cape in a purple color tone. Ophelia b: The result was applied to propose dyed and woven clothes. Through these surveys, the image of each character was derived as adjectives, and color images were proposed using the results derived for brightness, coloration, and color. A single costume cannot make up the stage costumes, and because costume exists as one element of stage production, it is inevitably limited in some respects. However, that limit can become the motive of the costume. There is a limit in that the designer cannot always produce the costumes exactly as designed, but I believe the core of stage costume is to display the characteristics of the characters according to the given concept. 
A limitation of this research is that, because the costumes were designed to fit conditions already given, it was difficult to regard the design and production process as a project carried out through interaction. In the future, if possible, I hope that joint research with those responsible for stage art will take place as practical stage art. Because the costumes were produced for an actual performance, it was possible to make practical costumes, and producing them with the dance steps, lines of movement, and acting in mind reduced trial and error on stage. Through this research, I felt that understanding of and smooth interaction with diverse areas beyond costume design are needed, and I believe this research proposes a new research method, since there has been little previous research on stage costumes for actual performances. This research depended on audience surveys to secure objectivity; I hope it can contribute to defining effective processes and methods for stage costumes through more active research using diverse methods in diverse areas. I regret that costumes could not be produced for all the characters and all the scenes in Hamlet because of many limitations. As a follow-up research task, I plan to design costumes for all the scenes.

Survey of Operation and Status of the Human Research Protection Program (HRPP) in Korea (2019) (임상시험 및 대상자보호프로그램의 운영과 현황에 대한 설문조사 연구(2019))

  • Maeng, Chi Hoon;Lee, Sun Ju;Cho, Sung Ran;Kim, Jin Seok;Rha, Sun Young;Kim, Yong Jin;Chung, Jong Woo;Kim, Seung Min
    • The Journal of KAIRB / v.2 no.2 / pp.37-48 / 2020
  • Purpose: The purpose of this study is to assess the operational status and level of understanding among IRB and HRPP staffs at a hospital or a research institute to the HRPP guideline set by the Ministry of Food and Drug Safety (MFDS) and to provide recommendations. Methods: Online survey was distributed among members of Korean Association of IRB (KAIRB) through each IRB office. The result was separated according to topic and descriptive statistics was used for analysis. Result: Survey notification was sent out to 176 institutions and 65 (37.1%) institutions answered the survey by online. Of 65 institutions that answered the survey; 83.1% was hospital, 12.3% was university, 3.1% was medical college, 1.5% was research institution. 23 institutions (25.4%) established independent HRPP offices and 39 institutions (60.0%) did not. 12 institutions (18.5%) had separate IRB and HRPP heads, 21 (32.3%) institutions separated business reporting procedure and person in charge, 12 institutions separated the responsibility of IRB and HRPP among staff, and 45 institutions (69.2%) had audit & non-compliance managers. When asked about the most important basic task for HRPP, 23% answered self-audit. And according to 43.52%, self-audit was also the most by both institutions that operated HRPP and institutions that did not. When basic task performance status was analyzed, on average, the institutions that operated HRPP was 14% higher than institutions that only operated IRB. 9 (13.8%) institutions were evaluated and obtained HRPP accreditation from MFDS and the most common reason for obtaining the accreditation was to be selected as Institution for the education of persons conducting clinical trial (6 institutions). The most common reason for not obtaining HRPP accreditation was because of insufficient staff and limited capacity of the institution (28%). Institutions with and without a plan to be HRPP accredited by MFDS were 20 (37.7%) each. 
34 institutions (52.3%) answered that the HRPP evaluation method and accreditation by MFDS were appropriate, while 31 institutions (47.7%) answered otherwise. 36 institutions answered that HRPP evaluation and accreditation by MFDS were credible, while 29 institutions (44.5%) answered that they were not credible. Conclusion: 1. MFDS's HRPP accreditation program can facilitate the main objective of HRPP, and it should be encouraged for non-tertiary hospitals by taking small staff size into consideration and issuing accreditation by segregating accreditation. 2. While granting the status of Institution for the education of persons conducting clinical trial is a benefit of MFDS's HRPP accreditation program, it can also hinder access to the program. It should also be considered that the non-contact culture during the COVID-19 pandemic eliminated time and space limitations for education. 3. For clinical research conducted internally by an institution, internal audit is the most effective and sole method of protecting the safety and rights of the test subjects and the integrity of research in Korea. For this reason, regardless of the size of the institution, an internal audit should be enforced. 4. It is necessary for KAIRB and MFDS to improve HRPP awareness by advocating and educating on the concept and necessity of HRPP in clinical research. 5. A new HRPP accreditation system should be set up for all clinical research with human subjects, including Investigational New Drug (IND) applications, in the near future.

A Study on the Consciousness Survey for the Establishment of Safety Village in Disaster (재난안전마을 구축을 위한 의식조사 연구)

  • Koo, Wonhoi;Baek, Minho
    • Journal of the Society of Disaster Information / v.14 no.3 / pp.238-246 / 2018
  • Purpose: The purpose of this study is to examine directions for establishing disaster safety villages in rural areas where damage from similar types of disaster occurs repeatedly, by conducting a consciousness survey of experts and local government disaster safety officials. Method: To carry out this study, the risks of disaster in rural areas were examined, along with the concept and characteristics of the disaster safety village, which is a Myeon (township)-based measure among village-unit measures. In addition, an opinion poll of officials in charge in the local government and a survey of experts in disaster safety and village building were conducted. Based on the findings, directions for establishing a disaster safety village that fits the characteristics of rural areas were examined. Result: The officials in charge in the local government answered that rural areas have a high risk of storm and flood damage such as heavy snow, typhoons, drought, and heavy rain, as well as forest fires, and that it is difficult to draw voluntary participation from farmers in disaster management activities because of their main work. They also replied that active support and participation of rural residents are necessary for future improvement measures. The experts mostly replied that the problem with the disaster safety village project is that it is a temporary project with low sustainability, and they stressed the lack of connections between the central government, local governments and residents as a difficulty. They said that measures to secure the budget and the direction of the project promotion system should be pursued by the central government, local governments and residents together. Conclusion: The results of this study are as follows. First, a disaster safety village should be established in consideration of disaster types and characteristics. 
Second, measures to secure the budget for utilizing the central government fund as well as local government fund and village development fund should be prepared when establishing and operating a disaster safety village in rural areas. Third, measures to utilize a disaster safety village in rural areas for a long period of time such as the re-authorization system should be prepared in order to continuously operate and manage such villages after its establishment. Fourth, detailed measures that allow residents of rural areas to positively participate in the activities for establishing a disaster safety village in rural areas should be prepared.

A Geographical Study of Therapeutic Spaces after the Disaster of the MV Sewol in a Local Community (세월호 참사 이후 지역 커뮤니티에 형성된 치유의 공간에 대한 지리적 고찰)

  • Park, Sookyung
    • Journal of the Korean Geographical Society / v.52 no.1 / pp.25-53 / 2017
  • The ultimate goal of this research is to examine the geographical characteristics of the therapeutic spaces that have appeared in Wa-dong and Gojan-dong, Ansan-si after the disaster of the MV Sewol. Looking more closely, the aims of the therapeutic spaces, each of which serves its own target group (victims), are varied because the disaster of the MV Sewol generated various direct and indirect victims requiring healing. The therapeutic spaces number about 10 organizations and are led predominantly by private agents. Furthermore, the therapeutic spaces are located near, but apart from, Danwon High School, where many students were reported killed or injured in the incident. The therapeutic spaces provide simple and repetitive diversions, such as having a meal, knitting and studying, rather than special programs to restore a broken daily life to its original state. Against this background, the geographical characteristics of the therapeutic spaces related to the disaster of the MV Sewol can be summarized as follows. First, target groups seem to be gradually accepting the therapeutic spaces in terms of the concept of place. Even though most of the therapeutic spaces were first suggested by third parties, target groups are now involved in the management and recollection of their own therapeutic spaces as well as in planning their future direction, and they consider the spaces their exclusive property. Second, the disaster of the MV Sewol has inflicted collective trauma not only on direct victims but also on extensive groups such as parents, brothers and sisters, relatives, friends and neighbors, as noted earlier. Therefore, the therapeutic spaces support comprehensive target groups, and the spaces do not overlap one another. However, to address collective trauma in the local community effectively, the therapeutic spaces are closely networked and have built a regular cooperative system. 
Third, continuous memory is mentioned as an important point in overcoming collective trauma, but phenomena such as fatigue and conflict with neighbors, out-migration and an atmosphere that fades as time passes act as risk factors in Ansan-si. To keep memory continuous, the therapeutic spaces attempt to restore local communities and devise various events, for example cultural performances; furthermore, they are closely connected with external organizations.

Effects of CSV Activities on Purchasing Intention : on the Perspectives of Value Chain (공유가치창출(CSV)활동이 구매의도에 미치는 영향 : 가치사슬 관점)

  • Weon, Jong-Ha;Jung, Dae-Hyu
    • Management & Information Systems Review / v.36 no.4 / pp.1-19 / 2017
  • These days, the concept of creating shared value is drawing keen attention. This interest comes from the expectation that Creating Shared Value (CSV) can offer an answer to some social issues by creating societal and economic value on top of the achievements of existing Corporate Social Responsibility (CSR). However, it is difficult to make a clear distinction between the achievements of CSR and CSV activities. In this regard, a methodology is needed to analyze the accomplishments of CSV empirically and to verify customers' awareness of and attitude toward CSV. A company needs to gain a competitive advantage in the marketplace as well as resolve social issues by innovating its value chain. Using empirical analysis, this research verified the causal relationship between CSV from the value chain perspective and the purchase intention aroused by its economic, societal and cultural values through company image and credibility, and produced the following results. First, societal and cultural values had a positive impact on a company's image, which implies that CSV activities can be the thin end of the wedge through which customers form a good image of the company involved in CSV. Second, societal value has a positive influence on the credibility of a company. In this regard, CSV should be recognized not just as something that generates cost, but as a win-win approach and a path to future development. Third and last, the results show that both company image and credibility influence purchase intention. Considering that CSV generates a positive evaluation of a company, which ultimately leads to continuous profit-making, the company's ultimate goal, it should be approached as a mid- to long-term strategy.

Development of 3D Printed Snack-dish for the Elderly with Dementia (3D 프린팅 기술을 활용한 치매노인 전용 영양(수분)보충 식품섭취용기 개발)

  • Lee, Ji-Yeon;Kim, Cheol-Ho;Kim, Kug-Weon;Lee, Kyong-Ae;Koh, Kwangoh;Kim, Hee-Seon
    • Korean Journal of Community Nutrition / v.26 no.5 / pp.327-336 / 2021
  • Objectives: This study was conducted to create a 3D-printable snack dish model for elderly people with low food or fluid intake and barriers to eating. Methods: Decisions on the 3D model were made using the hybrid-brainstorming method. Experts were assigned to the creation process according to their professional areas, such as clinical nutrition, food hygiene and chemical safety. After serial feedback processes, the grape shape was suggested as the final model. After various concept sketches and clay models were made, 3D-printing technology was applied to produce a prototype. Results: The 3D design modeling process was conducted with the SolidWorks program. After considering the Dietary Reference Intakes for Koreans (KDRIs) and other survey data, the appropriate supplementary water serving volume was set at 285 mL, which meets 30% of the Adequate Intake. To accommodate printing output conditions, the model has six grapes in one bunch with a safety lid. An FDM printer and PLA filaments were used for food hygiene and safety. To stimulate cognitive function and interest in eating, the numbers one to six were engraved on the lid of the final 3D model. Conclusions: The newly developed 3D model was designed to increase the intake of nutrients and water among elderly people with dementia during snack time. Since dementia patients often forget to eat, numbers were engraved on the grapes to stimulate cognitive function related to the swallowing and chewing process. We suggest that future studies investigate the types of foods or fluids suited to the developed 3D model snack dish.

A study on the Relationship between the Degree of Awareness on Low Carbon Green Growth and the Organizational Commitment Focused on the Traditional Retailers (전통시장 상인들의 저탄소 녹색성장에 대한 인식과 조직몰입의 관계에 대한 연구)

  • Yang, Hoe-Chang;Kim, Sung-Il;Park, Young-Ho;Lee, Shang-Nam
    • Journal of Distribution Science / v.9 no.3 / pp.37-46 / 2011
  • Since the Korean retail industry was made accessible to the big conglomerates and foreign retail companies, local traditional markets have faced serious problems. To sustain the local traditional markets' survival, the Korean government established various remedial policies for addressing, and many scholars published articles to suggest how to find solutions to, the problem. Unfortunately, the results have not been satisfactory. The purpose of this study is to find another way to help the Korean traditional retail market, from the view point of the Green Growth Policy, an initiative designed to address environmentally balanced economic growth in Korea. In order to survive and to maintain sustainable growth, it is incumbent upon retailers in the traditional market to understand the concept of the Green Growth Policy. A survey was conducted as a means of testing the degree of awareness of the Green Growth Policy, as well as determining the relationship between the degree of awareness and the degree of organizational commitment by the retailers in the local traditional markets. Interestingly, we were able to detect some of the features (e.g., they were distinguished by the elderly and the young, as well as low level of education and high level of education) in the traditional market retailers' demographic characteristics. We utilized the analysis of variance (ANOVA) statistical method to simultaneously compare the differences in retailers' demographic characteristics; the results were as follows: Overall, the results showed that the awareness of the Green Growth Policy, the degree of trust in the government's policy, levels of self-efficacy, and levels of organizational commitment were higher with the older traditional market retailers than the younger traditional market retailers. 
Specifically, the degree of trust in government policies (F=9.964,p < .05), levels of self-efficacy (F=5.532,p < .05), and levels of organizational commitment (F=5.697,p < .05) were statistically significant. Moreover, in the portion of the study that addressed the difference between education levels, all the variables were averaged in the higher education category of the traditional market retailers. Specifically, awareness levels of the Green Growth Policy (F=8.564,p < .005) and levels of self-efficacy (F=6.754,p < .005) were statistically significant. These results revealed that the traditional market retailers' demographic characteristics should be considered important factors in order to realize their policy. The results of the study showed the following: 1) The degree of awareness of the government's Green Growth Policy was statistically significant as it related to traditional market retailers' organizational commitment. 2) The degree of trust of the government's policy was significantly moderated between the awareness of the government's Green Growth Policy and the traditional market retailers' organizational commitment. This result demonstrates that the traditional market retailers' awareness of the government's Green Growth Policy will show more organizational commitment with higher levels of trust of the government's policy. 3) It also revealed that traditional market retailers' self-efficacy was fully mediated between the awareness of the Green Growth Policy of the government and traditional market retailers' organizational commitment. The results suggest that the government should show an interest in showing traditional market retailers how to enhance their traditional markets. Implications and future research directions are also discussed.
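
The F values reported above come from analysis of variance. As a rough illustration of how such a statistic is formed, here is a minimal pure-Python sketch of a one-way ANOVA F-statistic. The two score lists are made-up numbers for illustration only and are not the survey data; the function name is likewise an assumption.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    groups: list of lists of numeric observations, one list per group.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Made-up 5-point-scale scores (NOT the study's data).
younger = [2, 3, 3, 4]
older = [4, 5, 5, 6]
f_stat, df_b, df_w = one_way_anova_f([younger, older])
print(f_stat, df_b, df_w)
```

For these made-up scores the statistic works out to F = 12 on (1, 6) degrees of freedom. Turning F into a p-value additionally requires the F distribution, which a statistics library would normally supply.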

Surrogate Internet Shopping Malls: The Effects of Consumers' Perceived Risk and Product Evaluations on Country-of-Buying-Origin Image (망상대구점(网上代购店): 소비자감지풍험화산품평개대원산국형상적영향(消费者感知风险和产品评价对原产国形象的影响))

  • Lee, Hyun-Joung;Shin, So-Hyoun;Kim, Sang-Uk
    • Journal of Global Scholars of Marketing Science / v.20 no.2 / pp.208-218 / 2010
  • The Internet has grown fast and is now one of the most important retail channels. Various types of Internet retailers (hereafter, etailers) have been introduced, and one type of Internet shopping mall, the 'surrogate Internet shopping mall', has prospered and is attracting consumers in the domestic market. A surrogate Internet shopping mall is a unique type of etailer that purchases well-known brand goods globally that are not imported into the market, completes delivery on behalf of individual buyers, and collects fees for these specific services. Consumers, who are usually interested in purchasing high-end and unique brands that are not available to them, have difficulty purchasing these items directly from retailers or brands in other countries because of worries about payment failure and the lack of an eligible address, as delivery is usually domestic only. In Korea, both the number of surrogate Internet shopping malls and the magnitude of sales have grown rapidly, to more than 430 active malls and 500 billion Korean won in 2008, since the population of consumers who want this agent shopping service is also expanding. This etail business concept originated from 'surrogate-mediated purchase', and this type of shopping agent has existed in many different forms and in a wide range of contexts for quite a long time. As marketers on many occasions face their individual buyers' representatives instead of the buyers themselves, the impact of surrogate shoppers on consumers' decision making has been enormously important, and many scholars in marketing and psychology have explored the various impacts of agents on consumers' purchase decisions. However, not much rigorous research has yet been conducted in Internet commerce. 
Moreover, since surrogate Internet shopping malls, as one kind of shopping agent, specifically connect overseas brands or retailers to domestic consumers, one specific characteristic of the mall, the image of the surrogate buying country where surrogate purchases are conducted, may play an important role in forming consumers' attitude and purchase intention toward products. Furthermore, it possibly affects various dimensions of perceived risk in consumers' information processing. However, although a tremendous amount of research has explored the effects of diverse dimensions of country of origin, related studies in the Internet context have rarely been carried out. Some studies have demonstrated the positive impact of country of origin on consumers' evaluations as one of the information cues in product manufacture descriptions, yet studies examining the relationship between the country image of surrogate buying origin and product evaluations have rarely been undertaken for this specific mall type. Thus, the authors found this specific retail channel well worth investigating, explored systematic relationships among the focal constructs and elaborated their different paths. The authors have shown that the country image of surrogate buying origin, that is, the country where surrogate malls purchase products and bring them from for buyers, not only has a positive effect on consumers' product evaluations, including attitude and purchase intention, but also has a negative effect on all three dimensions of perceived risk: product-related risk, shipping-related risk, and post-purchase risk. Specifically, among the perceived risks, product-related risk, which arises from high uncertainty about product performance, is most affected ($\beta = -.30$) by negative country image of surrogate buying origin, followed in order by shipping-related risk ($\beta = -.18$) and post-purchase risk ($\beta = -.15$). 
Its direct effects on product attitude ($\beta = .10$) and purchase intention ($\beta = .14$) are also confirmed. Each perceived risk dimension is also shown to have a negative effect on purchase intention through product attitude as a mediator ($\beta = -.57$: product-related risk $\rightarrow$ product attitude; $\beta = -.24$: shipping-related risk $\rightarrow$ product attitude; $\beta = -.44$: post-purchase risk $\rightarrow$ product attitude). The additional analysis shows that the paths of consumers' information processing differ according to their level of product knowledge. While novice consumers with a low level of knowledge consider only perceived risk important, expert consumers with a high level of knowledge take both perceived risk and the image of the country where surrogate services are conducted seriously, building their attitudes and forming decisions about products more delicately and systematically, which is in line with previous studies. This study offers several pieces of academic and practical advice. Specifically, the country image of surrogate buying origin does affect consumers' risk perceptions and behavioral consequences. Therefore, a careful selection of surrogate buying origin is recommended. Furthermore, reducing consumers' risk level is required for this new type of retail business to flourish, whether its consumers are novices or experts. Additionally, since consumers take different paths in elaborating information depending on their knowledge levels, sophisticated marketing approaches to each group of consumers are required. For novice buyers, strong risk mitigation devices are needed to induce them to form better attitudes, and for experts, selecting better and more advanced countries as surrogate buying origins is advised, while an endorsement strategy for the site might work as a reliable information cue for all consumers to lower the barriers to purchasing goods online. 
The authors have also explained that the study suffers from some limitations, including generalizability. In future studies, tests of and comparisons among different types of etailers with relevant constructs are recommended to broaden the findings.
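
As a small worked example of how the reported path coefficients fit together, the sketch below applies the standard product-of-coefficients rule from path analysis to the values quoted in the abstract. The abstract itself does not report indirect or total effects, so these derived numbers are an illustration under the assumption that all coefficients are standardized and come from the same model; the variable names are hypothetical.

```python
# Standardized path coefficients as quoted in the abstract.
image_to_risk = {"product": -0.30, "shipping": -0.18, "post_purchase": -0.15}
risk_to_attitude = {"product": -0.57, "shipping": -0.24, "post_purchase": -0.44}
direct_image_to_attitude = 0.10

# Indirect effect of country image on product attitude via each risk
# dimension: the product of the two path coefficients along that route.
indirect = {
    risk: image_to_risk[risk] * risk_to_attitude[risk] for risk in image_to_risk
}
total_indirect = sum(indirect.values())

# Total effect = direct path + all indirect paths (an assumption of this
# sketch, not a figure reported by the authors).
total_effect = direct_image_to_attitude + total_indirect

for risk, value in indirect.items():
    print(f"country image -> {risk} risk -> attitude: {value:.4f}")
print(f"total indirect: {total_indirect:.4f}, total effect: {total_effect:.4f}")
```

Two negative paths multiply to a positive indirect effect, which matches the abstract's reading that a better country image lowers perceived risk and thereby supports product attitude.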

Efficient Topic Modeling by Mapping Global and Local Topics (전역 토픽의 지역 매핑을 통한 효율적 토픽 모델링 방안)

  • Choi, Hochang;Kim, Namgyu
    • Journal of Intelligence and Information Systems
    • /
    • v.23 no.3
    • /
    • pp.69-94
    • /
    • 2017
  • Recently, growing demand for big data analysis has been driving the vigorous development of related technologies and tools. In addition, the development of IT and the increased penetration of smart devices are producing large amounts of data. As a result, data analysis technology is rapidly becoming popular, and attempts to acquire insights through data analysis are continuously increasing. This means that big data analysis will become more important in various industries for the foreseeable future. Big data analysis has generally been performed by a small number of experts and delivered to those who request the analysis. However, growing interest in big data analysis has stimulated computer programming education and the development of many data analysis programs. Accordingly, the entry barriers to big data analysis are gradually being lowered and data analysis technology is spreading. As a result, big data analysis is expected to be performed by the requesters of the analysis themselves. Along with this, interest in various kinds of unstructured data is continually increasing, and much attention is focused on text data in particular. The emergence of new web-based platforms and techniques has brought about the mass production of text data and active attempts to analyze it, and the results of text analysis have been utilized in various fields. Text mining is a concept that embraces various theories and techniques for text analysis. Among the many text mining techniques used for various research purposes, topic modeling is one of the most widely used and studied. Topic modeling is a technique that extracts the major issues from a large number of documents, identifies the documents that correspond to each issue, and provides the identified documents as a cluster. It is regarded as a very useful technique in that it reflects the semantic elements of documents.
Traditional topic modeling is based on the distribution of key terms across the entire document set. Thus, the entire document set must be analyzed at once to identify the topic of each document. This requirement leads to long analysis times when topic modeling is applied to a large number of documents. In addition, it causes a scalability problem: processing time increases exponentially as the number of analysis objects increases. This problem is particularly noticeable when the documents are distributed across multiple systems or regions. To overcome these problems, a divide-and-conquer approach can be applied to topic modeling. This means dividing a large number of documents into sub-units and deriving topics by repeatedly applying topic modeling to each unit. This method makes topic modeling on a large number of documents possible with limited system resources and can improve processing speed. It can also significantly reduce analysis time and cost, because documents can be analyzed at each location without combining them into a single set. However, despite these advantages, the method has two major problems. First, the relationship between the local topics derived from each unit and the global topics derived from the entire document set is unclear. That is, local topics can be identified for each document, but global topics cannot. Second, a method for measuring the accuracy of such an approach must be established. That is, assuming that the global topics are the ideal answer, the deviation of a local topic from a global topic needs to be measured. Because of these difficulties, this approach has not been studied sufficiently compared with other topic modeling studies. In this paper, we propose a topic modeling approach that solves these two problems.
First, we divide the entire document cluster (global set) into sub-clusters (local sets) and generate a reduced global set (RGS) that consists of representative documents extracted from each local set. We address the first problem by mapping RGS topics to local topics. Along with this, we verify the accuracy of the proposed methodology by checking whether each document is assigned to the same topic in the global and local results. Using 24,000 news articles, we conducted experiments to evaluate the practical applicability of the proposed methodology. Through an additional experiment, we confirmed that the proposed methodology can provide results similar to those of topic modeling on the entire set. We also proposed a reasonable method for comparing the results of the two methods.
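The mapping and verification steps described in this abstract can be illustrated with a minimal sketch. The abstract does not say how RGS topics and local topics are compared, so the cosine similarity over topic-term weights used here is an assumption, as are the function names, the topic identifiers, and the toy data. It is an illustration of the general idea, not the authors' implementation.

```python
import math

def cosine(p, q):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(w * q.get(term, 0.0) for term, w in p.items())
    norm_p = math.sqrt(sum(w * w for w in p.values()))
    norm_q = math.sqrt(sum(w * w for w in q.values()))
    return dot / (norm_p * norm_q) if norm_p and norm_q else 0.0

def map_local_to_rgs(local_topics, rgs_topics):
    """Map each local topic id to its most similar RGS topic id (assumed rule)."""
    return {
        local_id: max(rgs_topics, key=lambda gid: cosine(terms, rgs_topics[gid]))
        for local_id, terms in local_topics.items()
    }

def agreement(doc_local_topic, doc_global_topic, mapping):
    """Share of documents whose mapped local topic equals their global topic."""
    hits = sum(
        1 for doc, local_id in doc_local_topic.items()
        if mapping[local_id] == doc_global_topic[doc]
    )
    return hits / len(doc_local_topic)

# Hypothetical toy data: topic-term weights from an RGS run and one local run.
rgs_topics = {
    "G_sports": {"game": 0.5, "team": 0.4, "score": 0.1},
    "G_economy": {"market": 0.6, "stock": 0.3, "rate": 0.1},
}
local_topics = {
    "L1_a": {"team": 0.5, "game": 0.3, "coach": 0.2},
    "L1_b": {"stock": 0.5, "market": 0.4, "bank": 0.1},
}
# Hypothetical per-document assignments from the local and global runs.
doc_local_topic = {"d1": "L1_a", "d2": "L1_b", "d3": "L1_a", "d4": "L1_b"}
doc_global_topic = {"d1": "G_sports", "d2": "G_economy",
                    "d3": "G_economy", "d4": "G_economy"}

mapping = map_local_to_rgs(local_topics, rgs_topics)
score = agreement(doc_local_topic, doc_global_topic, mapping)
print(mapping)
print(score)
```

In this toy case, three of the four documents receive the same topic from the local run (after mapping) as from the global run. In practice the topic-term weights would come from a topic model fitted separately on each local set and on the RGS.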