• Title/Summary/Keyword: Sub-text


Efficient Topic Modeling by Mapping Global and Local Topics (전역 토픽의 지역 매핑을 통한 효율적 토픽 모델링 방안)

  • Choi, Hochang;Kim, Namgyu
    • Journal of Intelligence and Information Systems
    • /
    • v.23 no.3
    • /
    • pp.69-94
    • /
    • 2017
  • Growing demand for big data analysis has recently driven the vigorous development of related technologies and tools. At the same time, advances in IT and the increasing penetration of smart devices are producing large amounts of data. As a result, data analysis technology is rapidly becoming popular, and attempts to gain insights through data analysis keep increasing. Big data analysis will therefore become more important across industries for the foreseeable future. Big data analysis has generally been performed by a small number of experts and delivered to those who request the analysis. However, growing interest in big data analysis has stimulated computer programming education and the development of many data analysis programs. Accordingly, the entry barriers to big data analysis are gradually lowering and data analysis technology is spreading, so analysis is increasingly expected to be performed by the requesters themselves. Along with this, interest in various kinds of unstructured data is continually increasing, and particular attention is focused on text data. The emergence of new web-based platforms and techniques has brought about the mass production of text data and active attempts to analyze it, and the results of text analysis are used in various fields. Text mining is a concept that embraces various theories and techniques for text analysis. Among the many text mining techniques used for various research purposes, topic modeling is one of the most widely used and studied. Topic modeling is a technique that extracts the major issues from a large number of documents, identifies the documents that correspond to each issue, and provides the identified documents as a cluster. It is regarded as a very useful technique because it reflects the semantic elements of documents.
Traditional topic modeling is based on the distribution of key terms across the entire document set. Thus, the entire set must be analyzed at once to identify the topic of each document. This requirement leads to long analysis times when topic modeling is applied to a large number of documents. It also causes a scalability problem: processing time increases exponentially with the number of analysis objects. This problem is particularly noticeable when the documents are distributed across multiple systems or regions. To overcome these problems, a divide-and-conquer approach can be applied to topic modeling: a large number of documents is divided into sub-units, and topics are derived by repeating topic modeling on each unit. This method makes topic modeling on a large number of documents possible with limited system resources and can improve processing speed. It can also significantly reduce analysis time and cost, because documents can be analyzed at each location without being combined. Despite these advantages, the method has two major problems. First, the relationship between the local topics derived from each unit and the global topics derived from the entire document set is unclear: local topics can be identified for each document, but global topics cannot. Second, a method for measuring the accuracy of such a methodology needs to be established; that is, taking the global topics as the ideal answer, the deviation of the local topics from the global topics needs to be measured. Because of these difficulties, this approach has not been studied sufficiently compared with other topic modeling research. In this paper, we propose a topic modeling approach that solves these two problems.
First, we divide the entire document cluster (global set) into sub-clusters (local sets) and generate a reduced global set (RGS) that consists of representative documents extracted from each local set. We address the first problem by mapping RGS topics to local topics. We then verify the accuracy of the proposed methodology by checking whether each document is assigned to the same topic in the global and local results. Using 24,000 news articles, we conduct experiments to evaluate the practical applicability of the proposed methodology. Through an additional experiment, we confirm that the proposed methodology can provide results similar to those of topic modeling on the entire set. We also propose a reasonable method for comparing the results of the two methods.
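
The abstract does not say how RGS topics and local topics are matched, so the following is only a minimal sketch under an assumption: each topic is a sparse term-weight dictionary, and a local topic is mapped to the RGS topic with the highest cosine similarity. The function names (`cosine`, `map_local_to_global`, `agreement`) and all toy weights are hypothetical and are not taken from the paper. The `agreement` helper illustrates the accuracy check described above, that is, the share of documents that land in the same topic in both results.

```python
import math

def cosine(p, q):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(w * q.get(t, 0.0) for t, w in p.items())
    norm_p = math.sqrt(sum(w * w for w in p.values()))
    norm_q = math.sqrt(sum(w * w for w in q.values()))
    return dot / (norm_p * norm_q) if norm_p and norm_q else 0.0

def map_local_to_global(local_topics, rgs_topics):
    """Map each local topic to its most similar RGS topic (assumed rule)."""
    mapping = {}
    for local_id, local_terms in local_topics.items():
        mapping[local_id] = max(
            rgs_topics, key=lambda g: cosine(local_terms, rgs_topics[g])
        )
    return mapping

def agreement(doc_local_topic, doc_global_topic, mapping):
    """Share of documents whose mapped local topic equals their global topic."""
    same = sum(
        1 for doc, local_id in doc_local_topic.items()
        if mapping[local_id] == doc_global_topic[doc]
    )
    return same / len(doc_local_topic)

# Toy topic-term weights, made up for illustration only.
rgs_topics = {
    "G1": {"election": 0.5, "vote": 0.3, "party": 0.2},
    "G2": {"stock": 0.6, "market": 0.3, "rate": 0.1},
}
local_topics = {
    "L1": {"vote": 0.6, "party": 0.4},
    "L2": {"market": 0.7, "stock": 0.3},
}
mapping = map_local_to_global(local_topics, rgs_topics)

# Toy per-document topic assignments from the local and global runs.
doc_local_topic = {"d1": "L1", "d2": "L2", "d3": "L1"}
doc_global_topic = {"d1": "G1", "d2": "G2", "d3": "G2"}
score = agreement(doc_local_topic, doc_global_topic, mapping)
```

In this toy case two of the three documents agree, so `score` is 2/3. A real comparison would take the topic-term distributions from the actual topic models fitted on each local set and on the RGS.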

Impulse Buying and Searching For Sources Of Information according to the Utilization of Sales Promotion in an Internet Fashion Shopping mall (인터넷 패션 쇼핑몰의 판매촉진 활용에 따른 충동구매와 정보원탐색)

  • Ha, Jong Kyung
    • The Korean Journal of Community Living Science
    • /
    • v.24 no.3
    • /
    • pp.313-325
    • /
    • 2013
  • The purpose of this study is to investigate impulse buying and the search for sources of information among males and females in their 20s according to their utilization of sales promotions in an internet fashion shopping mall. The findings were as follows. First, there was a statistically significant difference in the use of sales promotions by age and gender. Second, two factors, price-oriented and non-price-oriented utilization of sales promotions, were extracted from the analysis of the sub-factors of sales promotion utilization in an internet fashion shopping mall. Third, five factors, namely affective, provocative, situational, reminder, and pure impulse buying, were extracted from the analysis of the sub-factors of impulse buying in an internet fashion shopping mall. Fourth, the utilization of sales promotions in an internet fashion shopping mall had statistically significant effects on the following sub-factors of impulse buying: provocative, situational, reminder, and pure impulse buying. Fifth, the analysis of the correlation between the utilization of sales promotions and the search for sources of information revealed that price-oriented utilization of sales promotions was correlated with DM or text messages, advice from friends or family, advice from a salesperson, information from friends or colleagues, celebrity supporters in TV dramas or movies, and product commercials and information.

Protective effects of kaempferol, quercetin, and its glycosides on amyloid beta-induced neurotoxicity in C6 glial cell (Kaempferol, quercetin 및 그 배당체의 amyloid beta 유도 신경독성에 대한 C6 신경교세포 보호 효과)

  • Kim, Ji Hyun;Kim, Hyun Young;Cho, Eun Ju
    • Journal of Applied Biological Chemistry
    • /
    • v.62 no.4
    • /
    • pp.327-332
    • /
    • 2019
  • Alzheimer's disease (AD) is a common neurodegenerative disease. Oxidative stress induced in neuronal cells by amyloid beta peptide (Aβ) is a major cause of AD. In the present study, the protective effects of several flavonoids, namely kaempferol (K), kaempferol-3-O-glucoside (KG), quercetin (Q), and quercetin-3-β-ᴅ-glucoside (QG), against Aβ25-35 were investigated in C6 glial cells. Treatment of C6 glial cells with Aβ25-35 decreased cell viability, whereas treatment with flavonoids such as Q and QG increased cell viability. In addition, flavonoid treatment reduced reactive oxygen species (ROS) production compared with the Aβ25-35-induced control. ROS production was increased to 133.39% by Aβ25-35 treatment, whereas KG and QG at a concentration of 1 μM decreased ROS production to 107.44% and 113.10%, respectively. To study the mechanisms of the protective effect of these flavonoids against Aβ25-35, the expression of inflammation-related proteins in Aβ25-35-treated C6 glial cells was investigated. The results showed that C6 glial cells under Aβ25-35-induced oxidative stress up-regulated the expression of inflammation-related proteins. However, flavonoid treatment reduced the expression of proteins such as inducible nitric oxide synthase, cyclooxygenase-2, and interleukin-1β. In particular, KG and QG decreased inflammation-related protein expression more effectively than their aglycones, K and Q. Therefore, the present results indicate that K, Q, and their glycosides attenuate Aβ25-35-induced neuronal oxidative stress and inflammation.

Assessing the Impact of Digital Procurement via Mobile Phone on the Agribusiness of Rural Bangladesh: A Decision-analytic Approach

  • Alam, Md. Mahbubul;Wagner, Christian
    • Agribusiness and Information Management
    • /
    • v.5 no.1
    • /
    • pp.31-41
    • /
    • 2013
  • The research assesses the impact of a digital procurement (e-purjee) system for sugarcane growers in Bangladesh. The system itself is simple, transmitting purchase orders to local farmers via SMS text notification. It replaces a traditional paper-based system fraught with low reliability and delivery delays. Applying expected value theory, and using decision tree representations to depict growers' decision-making complexity in an information-asymmetric environment, we compute outcomes for the strategies and sub-strategies of ICT vs. traditional paper-based order management from the sugarcane growers' perspective. The study results show that the digital procurement system outperforms the paper-based system by tangibly reducing growers' economic losses. The digital system also appears to benefit growers non-monetarily, because of reduced uncertainty and a higher level of perceived fairness. Sugarcane growers appear to value the non-monetary benefits even higher than the economic advantages of the e-purjee system.
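
The expected-value comparison over decision trees mentioned in the abstract can be sketched roughly as follows. This is an illustrative rollback procedure only: the node layout, the probabilities, and the payoffs are all made-up assumptions and are not figures from the study.

```python
def expected_value(node):
    """Roll back a decision tree.

    Chance nodes return the probability-weighted average of their
    branches; decision nodes return the value of their best option.
    """
    kind = node["kind"]
    if kind == "payoff":
        return node["value"]
    if kind == "chance":
        return sum(p * expected_value(child) for p, child in node["branches"])
    if kind == "decision":
        return max(expected_value(child) for _, child in node["options"])
    raise ValueError(f"unknown node kind: {kind}")

def delivery(p_on_time):
    """Chance node: the order arrives on time or late (hypothetical payoffs)."""
    return {
        "kind": "chance",
        "branches": [
            (p_on_time, {"kind": "payoff", "value": 100.0}),
            (1.0 - p_on_time, {"kind": "payoff", "value": 60.0}),
        ],
    }

# Hypothetical on-time probabilities for the two order channels.
tree = {
    "kind": "decision",
    "options": [("paper", delivery(0.6)), ("sms", delivery(0.95))],
}

paper_ev = expected_value(tree["options"][0][1])
sms_ev = expected_value(tree["options"][1][1])
best_ev = expected_value(tree)
```

With these invented numbers the SMS branch has the higher expected value, which only illustrates the mechanics of the comparison and says nothing about the magnitudes reported in the paper.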


Modified GMM Training for Inexact Observation and Its Application to Speaker Identification

  • Kim, Jin-Young;Min, So-Hee;Na, Seung-You;Choi, Hong-Sub;Choi, Seung-Ho
    • Speech Sciences
    • /
    • v.14 no.1
    • /
    • pp.163-174
    • /
    • 2007
  • All observations carry uncertainty due to noise or channel characteristics, and this uncertainty should be accounted for when modeling observations. In this paper, we propose a modified objective function for GMM training that takes inexact observations into account. The objective function is modified by introducing the concept of observation confidence as a weighting factor on the probabilities. The proposed criterion is optimized with a standard EM algorithm. To verify the proposed method, we apply it to the speaker recognition domain. Experimental results for text-independent speaker identification with the VidTimit DB show that the modified GMM training reduces the error rate from 14.8% to 11.7%.
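
The abstract does not give the modified objective function, so the sketch below assumes one plausible form: each observation's log-likelihood is multiplied by a confidence value in [0, 1], and EM maximizes the sum of `conf[i] * log p(xs[i])`. Under that assumption the E-step is unchanged and the M-step simply scales each responsibility by the confidence. The function names and the toy data are hypothetical, and the example is one-dimensional for brevity.

```python
import math

def gauss(x, mean, var):
    """Density of a 1-D Gaussian."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def weighted_gmm_em(xs, conf, means, variances, weights, iters=50):
    """EM for a 1-D GMM maximizing sum_i conf[i] * log p(xs[i]) (assumed objective)."""
    k = len(means)
    means, variances, weights = list(means), list(variances), list(weights)
    total_conf = sum(conf)
    for _ in range(iters):
        # E-step: ordinary posterior responsibilities per observation.
        resp = []
        for x in xs:
            p = [weights[j] * gauss(x, means[j], variances[j]) for j in range(k)]
            total = sum(p) or 1e-300  # guard against numerical underflow
            resp.append([pj / total for pj in p])
        # M-step: responsibilities are scaled by the observation confidence.
        for j in range(k):
            nj = sum(c * r[j] for c, r in zip(conf, resp))
            if nj < 1e-12:
                continue  # component received no confident mass; keep it as is
            means[j] = sum(c * r[j] * x for c, r, x in zip(conf, resp, xs)) / nj
            var = sum(
                c * r[j] * (x - means[j]) ** 2 for c, r, x in zip(conf, resp, xs)
            ) / nj
            variances[j] = max(var, 1e-6)  # variance floor
            weights[j] = nj / total_conf
    return means, variances, weights

# Toy data: two clusters plus one unreliable point with zero confidence.
xs = [0.0, 0.2, -0.1, 0.1, 5.0, 5.2, 4.9, 5.1, 2.5]
conf = [1.0] * 8 + [0.0]
means, variances, weights = weighted_gmm_em(
    xs, conf, means=[0.5, 4.5], variances=[1.0, 1.0], weights=[0.5, 0.5]
)
```

Because the point at 2.5 has zero confidence, it has no influence on the fitted means, which settle near the two cluster centers. How the confidence values would be estimated for real speech features is not covered by this sketch.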


Splitting and Merging Algorithm Based on Local Statistics of Sub-Regions in Document Image

  • Thapaliya, Kiran;Park, Il-Cheol;Kwon, Goo-Rak
    • Journal of information and communication convergence engineering
    • /
    • v.9 no.5
    • /
    • pp.487-490
    • /
    • 2011
  • This paper presents a splitting and merging algorithm based on adaptive thresholding. The algorithm first divides the image into blocks and then compares the blocks using a calculated threshold value. Blocks that are similar under a given threshold value are merged, and blocks that differ are split unless they satisfy the threshold value. Once the blocks have been merged, the maximum and minimum block sizes are determined, followed by the average block size. The average intensity and standard deviation of the average block are then calculated, and thresholding is applied to binarize the image. Experimental results show that the proposed method clearly separates the text from the background in document images.

Translation of RDF to VRML (RDF - VRML 변환)

  • Kim, Hye-Yeon;Park, Kin;Cho, Dong-Sub
    • Proceedings of the KIEE Conference
    • /
    • 2000.11d
    • /
    • pp.830-832
    • /
    • 2000
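  • This study investigates a method for visualizing RDF data expressed in XML by using VRML. The Web is currently evolving toward generating documents dynamically and presenting them visually, and in this environment XML is widely used because it makes it easy to generate data in real time. However, because XML is text-based, it is difficult to visualize the data for users, and it has the drawback of allowing too much flexibility in how data are represented. RDF is therefore useful, since it constrains XML so that data are expressed in a standard way. In this paper, we study a method that combines VRML with RDF to present data that change in real time through a visualization tool. To this end, we used a Java Servlet and implemented a system that extracts data from an RDF document, generates VRML code, and delivers that code to the client so that the user can view the data visually.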
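
The extract-and-generate step described in the abstract can be sketched roughly as follows. The original system used a Java Servlet; this stand-in uses Python for illustration and makes up its own layout rule (one VRML `Text` node per RDF property, stacked vertically). The function name `rdf_to_vrml` and the sample document are hypothetical and do not come from the paper.

```python
import xml.etree.ElementTree as ET

RDF_NS = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}"

def rdf_to_vrml(rdf_xml):
    """Turn each property of each rdf:Description into a VRML Text node."""
    root = ET.fromstring(rdf_xml)
    lines = ["#VRML V2.0 utf8"]
    row = 0
    for desc in root.iter(RDF_NS + "Description"):
        for prop in desc:
            name = prop.tag.split("}")[-1]          # drop the XML namespace
            value = (prop.text or "").strip()
            label = f"{name}: {value}".replace('"', "'")
            # Stack labels downward along the y axis, one unit apart.
            lines.append(
                f"Transform {{ translation 0 {-row} 0 children [ "
                f'Shape {{ geometry Text {{ string ["{label}"] }} }} ] }}'
            )
            row += 1
    return "\n".join(lines)

# Hypothetical RDF/XML input.
sample = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:ex="http://example.org/terms#">
  <rdf:Description rdf:about="http://example.org/sensor1">
    <ex:temperature>21.5</ex:temperature>
    <ex:humidity>40</ex:humidity>
  </rdf:Description>
</rdf:RDF>"""

vrml = rdf_to_vrml(sample)
```

In a servlet-style deployment, the returned string would be sent to the client with a VRML content type so that a VRML viewer can render it. That delivery step is left out here.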


Development of Narrow Viewing Angle Mode TFT LCD and Application of Advanced Gray Compensation(GC) Algorithm

  • Kim, S.H.;Lee, K.J.;Jung, Y.C.;Lee, D.G.;Baek, J.S.;Ahn, B.C.;Choi, H.C.
    • 한국정보디스플레이학회:학술대회논문집
    • /
    • 2008.10a
    • /
    • pp.1383-1385
    • /
    • 2008
  • In viewing-angle image control (VIC) technology, one pixel has a quad-pixel structure consisting of R, G, B, and electrically controlled birefringence (ECB) sub-pixels. Two types of test stimuli were used: text and complex images. The limitations of those methods were identified in the experiment. Based on the results, an advanced GC technology is proposed.


Development of Lesson Plans for Food Hygiene and Safety in Food Convergence (식품융합교과의 식품위생·안전 단원 교수-학습지도안 개발)

  • Kwon, Mi-Jung;Park, Jong-Un
    • Journal of Fisheries and Marine Sciences Education
    • /
    • v.25 no.5
    • /
    • pp.1068-1078
    • /
    • 2013
  • We discuss the procedures involved in developing the lesson plans, including the research and analysis approaches that lead to practical lesson plans based on the 4 sub-categorized subjects identified across 7 textbooks on food hygiene and safety education, as follows: Food Hygiene, Personal and Environmental Hygiene, Food Contamination Incidents, Food Poisoning, and Food Safety. The lesson plans represent STEAM-associated education involving partnerships between business-associated teachers and food education teachers, and they focus on cultivating students' problem-solving abilities by encouraging voluntary participation and critical thinking.

Text Categorization using Topic Signature and Co-occurrence Features (Topic Signature와 동시 출현 단어 쌍을 이용한 문서 범주화)

  • Bae, Won-Sik;Han, Yo-Sub;Cha, Jeong-Won
    • Proceedings of the Korean Information Science Society Conference
    • /
    • 2008.06c
    • /
    • pp.262-267
    • /
    • 2008
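  • This paper describes a text categorization system that uses pairs of words co-occurring within a document as the unit of feature extraction. The feature unit is defined as a word pair on the assumption that words that frequently co-occur in documents are strongly related, and that a pair of strongly related words is more likely than a single word to appear only in documents of a particular category, which can help improve classification performance. Features are extracted with the Topic Signature Term Extraction method based on the log-likelihood ratio, which was proposed in the field of document summarization, and documents are classified with a Naive Bayes classifier. The approach showed good results in a performance evaluation on the Reuters-21578 document collection, and we expect it to contribute to future research.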
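
The feature-selection idea described above can be sketched as follows: take unordered pairs of words that co-occur in a document as features, then score each pair against a category with a log-likelihood ratio over a 2x2 contingency table. This is only an illustration under assumptions. It uses Dunning's G² statistic on document counts and keeps only pairs that are relatively more frequent inside the target category, which may differ from the paper's exact procedure. The function names and the toy corpus are hypothetical, and the Naive Bayes classification step is omitted.

```python
import math
from collections import Counter
from itertools import combinations

def word_pairs(tokens):
    """Unordered pairs of distinct words that co-occur in one document."""
    return set(combinations(sorted(set(tokens)), 2))

def llr(k11, k12, k21, k22):
    """Dunning's log-likelihood ratio (G^2) for a 2x2 contingency table."""
    n = k11 + k12 + k21 + k22
    rows = (k11 + k12, k21 + k22)
    cols = (k11 + k21, k12 + k22)
    cells = (
        (k11, rows[0], cols[0]), (k12, rows[0], cols[1]),
        (k21, rows[1], cols[0]), (k22, rows[1], cols[1]),
    )
    g = sum(k * math.log(k * n / (r * c)) for k, r, c in cells if k > 0)
    return 2.0 * g

def score_pairs(docs, labels, target):
    """Score word pairs as signature features of the target category.

    Assumes both the target category and its complement are non-empty.
    """
    in_cat, out_cat = Counter(), Counter()
    n_in = sum(1 for label in labels if label == target)
    n_out = len(labels) - n_in
    for tokens, label in zip(docs, labels):
        (in_cat if label == target else out_cat).update(word_pairs(tokens))
    scores = {}
    for pair in in_cat:
        a, b = in_cat[pair], out_cat[pair]
        if a / n_in > b / n_out:  # keep pairs over-represented in the category
            scores[pair] = llr(a, n_in - a, b, n_out - b)
    return scores

# Toy corpus, made up for illustration.
docs = [
    ["oil", "price", "rise"],
    ["oil", "price", "fall"],
    ["team", "win", "match"],
    ["team", "lose", "match"],
]
labels = ["energy", "energy", "sport", "sport"]
scores = score_pairs(docs, labels, "energy")
top_pair = max(scores, key=scores.get)
```

Here the pair `("oil", "price")` appears in every "energy" document and in no "sport" document, so it receives the highest score. The highest-scoring pairs for each category would then serve as the features passed to the classifier.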
