• Title/Summary/Keyword: Open source system


Visualizing the Results of Opinion Mining from Social Media Contents: Case Study of a Noodle Company (소셜미디어 콘텐츠의 오피니언 마이닝결과 시각화: N라면 사례 분석 연구)

  • Kim, Yoosin;Kwon, Do Young;Jeong, Seung Ryul
    • Journal of Intelligence and Information Systems / v.20 no.4 / pp.89-105 / 2014
  • After the emergence of the Internet, social media built on highly interactive Web 2.0 applications has provided a very user-friendly means for consumers and companies to communicate with each other. Users routinely publish content expressing their opinions and interests in social media such as blogs, forums, chat rooms, and discussion boards, and this content is released in real time on the Internet. For that reason, many researchers and marketers regard social media content as a source of information for business analytics to develop business insights, and many studies have reported results on mining business intelligence from social media content. In particular, opinion mining and sentiment analysis, as techniques to extract, classify, understand, and assess the opinions implicit in text content, are frequently applied to social media content analysis because they emphasize determining sentiment polarity and extracting authors' opinions. A number of frameworks, methods, techniques, and tools have been presented by these researchers. However, we have found some weaknesses in their methods, which are often technically complicated and not sufficiently user-friendly for supporting business decisions and planning. In this study, we attempted to formulate a more comprehensive and practical approach to conducting opinion mining with visual deliverables. First, we described the entire cycle of practical opinion mining using social media content, from the initial data gathering stage to the final presentation session. Our proposed approach to opinion mining consists of four phases: collecting, qualifying, analyzing, and visualizing. In the first phase, analysts have to choose target social media. Each target medium requires a different way for analysts to gain access: open APIs, search tools, DB-to-DB interfaces, purchased content, and so on. The second phase is pre-processing to generate useful materials for meaningful analysis. If we do not remove garbage data, the results of social media analysis will not provide meaningful and useful business insights. To clean social media data, natural language processing techniques should be applied. The next step is the opinion mining phase, where the cleansed social media content set is analyzed. The qualified data set includes not only user-generated content but also content identification information such as creation date, author name, user id, content id, hit counts, review or reply, favorites, etc. Depending on the purpose of the analysis, researchers or data analysts can select a suitable mining tool. Topic extraction and buzz analysis are usually related to market trend analysis, while sentiment analysis is utilized to conduct reputation analysis. There are also various applications, such as stock prediction, product recommendation, sales forecasting, and so on. The last phase is visualization and presentation of analysis results. The major focus and purpose of this phase are to explain the results of the analysis and help users comprehend their meaning. Therefore, to the extent possible, deliverables from this phase should be made simple, clear, and easy to understand, rather than complex and flashy. To illustrate our approach, we conducted a case study on a leading Korean instant noodle company. We targeted the market leader, NS Food, with a 66.5% market share; the firm has kept the No. 1 position in the Korean "Ramen" business for several decades.
We collected a total of 11,869 pieces of content, including blog posts, forum contents, and news articles. After collecting the social media content data, we generated instant-noodle-business-specific language resources for data manipulation and analysis using natural language processing. In addition, we classified contents into more detailed categories such as marketing features, environment, reputation, etc. In this phase, we used freeware packages from the R project such as tm, KoNLP, ggplot2, and plyr. As a result, we presented several useful visualization outputs, including domain-specific lexicons, volume and sentiment graphs, a topic word cloud, heat maps, a valence tree map, and other visualized images, to provide vivid, full-colored examples using open library packages of the R project. Business actors can detect at a glance which areas are weak, strong, positive, negative, quiet, or loud. The heat map shows the movement of sentiment or volume across a category-by-time matrix, using color density over time periods. The valence tree map, one of the most comprehensive and holistic visualization models, should be very helpful for analysts and decision makers to quickly understand the "big picture" business situation, since its hierarchical structure can present buzz volume and sentiment in a single visualized result for a certain period. This case study offers real-world business insights from market sensing and demonstrates to practical-minded business users how they can use these types of results for timely decision making in response to on-going changes in the market. We believe our approach can provide a practical and reliable guide to opinion mining with visualized results that are immediately useful, not just in the food industry but in other industries as well.
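As a rough illustration of the four-phase cycle (collect, qualify, analyze, visualize), the sketch below walks a few toy posts through cleaning, lexicon-based sentiment counting, and a simple volume chart. It is written in Python with a tiny hypothetical lexicon; the study itself used R packages (tm, KoNLP, ggplot2, plyr) and a full domain-specific Korean lexicon, so this is only a minimal stand-in for the idea.

```python
# Minimal sketch of the collect -> qualify -> analyze -> visualize cycle.
# The lexicon and posts are hypothetical toy data, not the study's resources.
import re
from collections import Counter

import matplotlib.pyplot as plt

# Hypothetical domain-specific sentiment lexicon (the paper builds a real one
# for the instant-noodle domain).
POSITIVE = {"delicious", "tasty", "love", "best"}
NEGATIVE = {"salty", "bland", "worst", "disappointed"}

def qualify(raw_posts):
    """Pre-processing step: strip non-word characters and drop empty posts."""
    cleaned = []
    for post in raw_posts:
        text = re.sub(r"[^\w\s]", " ", post).lower().strip()
        if text:
            cleaned.append(text)
    return cleaned

def analyze(posts):
    """Opinion mining step: count positive/negative lexicon hits."""
    counts = Counter()
    for post in posts:
        tokens = post.split()
        counts["positive"] += sum(t in POSITIVE for t in tokens)
        counts["negative"] += sum(t in NEGATIVE for t in tokens)
    return counts

def visualize(counts):
    """Visualization step: a simple volume/sentiment bar chart."""
    plt.bar(list(counts.keys()), list(counts.values()), color=["tab:blue", "tab:red"])
    plt.title("Sentiment volume (toy example)")
    plt.ylabel("term hits")
    plt.savefig("sentiment_volume.png")

if __name__ == "__main__":
    raw = ["This ramen is delicious!!", "Too salty, disappointed...", ""]
    visualize(analyze(qualify(raw)))
```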

Comparison of Deep Learning Frameworks: About Theano, Tensorflow, and Cognitive Toolkit (딥러닝 프레임워크의 비교: 티아노, 텐서플로, CNTK를 중심으로)

  • Chung, Yeojin;Ahn, SungMahn;Yang, Jiheon;Lee, Jaejoon
    • Journal of Intelligence and Information Systems / v.23 no.2 / pp.1-17 / 2017
  • A deep learning framework is software designed to help develop deep learning models. Some of its important functions include automatic differentiation and utilization of GPUs. The list of popular deep learning frameworks includes Caffe (BVLC) and Theano (University of Montreal). Recently, Microsoft's deep learning framework, Microsoft Cognitive Toolkit, was released under an open-source license, following Google's Tensorflow a year earlier. The early deep learning frameworks were developed mainly for research at universities. Beginning with the introduction of Tensorflow, however, companies such as Microsoft and Facebook have joined the competition in framework development. Given the trend, Google and other companies are expected to continue investing in deep learning frameworks to take the initiative in the artificial intelligence business. From this point of view, we think it is a good time to compare deep learning frameworks, so we compare three frameworks that can be used as a Python library: Google's Tensorflow, Microsoft's CNTK, and Theano, which is in a sense a predecessor of the other two. The most common and important function of deep learning frameworks is the ability to perform automatic differentiation. Basically, all the mathematical expressions of deep learning models can be represented as computational graphs, which consist of nodes and edges. Partial derivatives on each edge of a computational graph can then be obtained. With these partial derivatives, the software can compute the derivative of any node with respect to any variable by applying the chain rule of calculus. First of all, the convenience of coding is in the order of CNTK, Tensorflow, and Theano. The criterion is simply based on the length of the code; the learning curve and the ease of coding are not the main concern. According to this criterion, Theano was the most difficult to implement with, and CNTK and Tensorflow were somewhat easier. With Tensorflow, we need to define weight variables and biases explicitly. The reason that CNTK and Tensorflow are easier to implement with is that those frameworks provide more abstraction than Theano. We need to mention, however, that low-level coding is not always bad: it gives us flexibility. With low-level coding such as in Theano, we can implement and test any new deep learning models or search methods that we can think of. Our assessment of the execution speed of each framework is that there is no meaningful difference. According to the experiment, the execution speeds of Theano and Tensorflow are very similar, although the experiment was limited to a CNN model. In the case of CNTK, the experimental environment was not kept the same: the code written in CNTK had to be run in a PC environment without a GPU, where code executes as much as 50 times slower than with a GPU. But we concluded that the difference in execution speed was within the range of variation caused by the different hardware setup. In this study, we compared three deep learning frameworks: Theano, Tensorflow, and CNTK. According to Wikipedia, there are 12 available deep learning frameworks, and 15 different attributes differentiate them. Some of the important attributes include the interface language (Python, C++, Java, etc.) and the availability of libraries for various deep learning models such as CNN, RNN, DBN, etc.
If a user implements a large-scale deep learning model, support for multiple GPUs or multiple servers will also be important. Also, for users who are learning deep learning models, it is important that there are enough examples and references.
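To make the automatic-differentiation point concrete, here is a minimal sketch using TensorFlow's GradientTape API. This is a newer eager-mode interface than the graph-session code the 2017 comparison was based on, but the underlying mechanism, recording a computational graph and propagating chain-rule products along its edges, is the same.

```python
# Small automatic-differentiation example: differentiate a squared-error loss
# with respect to a weight and a bias recorded on a computational graph.
import tensorflow as tf

w = tf.Variable(2.0)       # weight
b = tf.Variable(0.5)       # bias
x = tf.constant(3.0)       # input
y_true = tf.constant(10.0)

with tf.GradientTape() as tape:
    y_pred = w * x + b                # forward pass builds the graph nodes
    loss = (y_pred - y_true) ** 2     # squared-error loss node

# Partial derivatives of the loss with respect to each variable, obtained by
# chaining the edge-wise partial derivatives (the chain rule of calculus).
dw, db = tape.gradient(loss, [w, b])
print(float(dw), float(db))           # -> -21.0, -7.0
```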

A New Understanding on Environmental Problems in China - Dilemma between Economic Development and Environmental Protection - (중국 환경문제에 대한 재인식 -경제발전과 환경보호의 딜레마-)

  • Won, Dong-Wook
    • Journal of Environmental Policy / v.5 no.1 / pp.45-70 / 2006
  • China has achieved great economic growth, above 9% annually, since it changed to more of a market economy system through its reform and open-door policy. At the same time, China has experienced severe ecological deterioration, such as air and water pollution caused by its rapid urbanization and industrialization. China is now confronted with environmental pollution and ecological deterioration at a critical point, at which economic development in China is limited. Moreover, environmental problems in China have become a lit fuse for social unrest, beyond mere pollution problems. The roots and background of environmental problems in China are, firstly, the government's lack of understanding of these problems and incorrect economic policies affected by political and ideological prejudice. Secondly, the plundering of resources under 'the principle of development first', which did not consider environmental sustainability, is another source of environmental deterioration in China. In addition, a huge population and poverty in China have increased the difficulty of solving its environmental problems, and in fact have accelerated them. The Chinese government has established many environmental laws and institutions, increased environmental investment, and is enlarging the participation of NGOs and the general public on a limited scale to solve its environmental problems. However, it has not obtained effective results because of insufficient environmental investment owing to the country's stage of development, structural limits on law enforcement and local protectionism, the limited political independence of NGOs, and the lack of public participation in China. It seems that China remains in the stage of 'economic development first, environmental protection second', contrary to its catch-phrase of 'harmony between economic development and environmental protection'. China is now confronted with dual pressure, both domestic and foreign, because of deepening environmental problems. There are growing public protests and demonstrations in China in response to the spread of damage owing to environmental pollution and ecological deterioration. On the other hand, international society, in particular neighboring countries, regards China as a principal cause of ecological disaster. In the face of this dual pressure, China is presently contemplating a 'recycling economy' that supports sustainable development through the structural reform of energy-intensive industries and through stricter law enforcement. Therefore, it is desirable to promote regional cooperation more progressively and practically in the direction of building China's ability to solve environmental problems.


A Comparison of Peripheral Doses Scattered from a Physical Wedge and an Enhanced Dynamic Wedge (금속쐐기와 기능강화동적쐐기의 조사야 주변부 선량 비교)

  • Park, Jong-Min;Kim, Hee-Jung;Min, Je-Soon;Lee, Je-Hee;Park, Charn-Il;Ye, Sung-Joon
    • Progress in Medical Physics / v.18 no.3 / pp.107-117 / 2007
  • In order to evaluate the radio-protective advantage of an enhanced dynamic wedge (EDW) over a physical wedge (PW), we measured peripheral doses scattered from both types of wedges using a 2D array of ion chambers. In order to confirm the accuracy of the device, we first compared measured profiles of open fields with the profiles calculated by our commissioned treatment planning system. Then, we measured peripheral doses for wedge angles of $15^{\circ}$, $30^{\circ}$, $45^{\circ}$, and $60^{\circ}$ at source-to-surface distances (SSD) of 80 cm and 90 cm. The measured points were located at 0.5 cm depth, from 1 cm to 5 cm outside of the field edge. In addition, the measurements were repeated using thermoluminescence dosimeters (TLD). The peripheral doses of the EDW (1.4% to 11.9%) were lower than those of the PW (2.5% to 12.4%). At 15 MV, the average peripheral doses of both wedges were 2.9% higher than those at 6 MV. At the smaller SSD (80 cm vs. 90 cm), peripheral dose differences were more recognizable. The average peripheral doses in the heel direction were 0.9% lower than those in the toe direction. The results from the TLD measurements confirmed these findings with a similar tendency. Dynamic wedges can reduce unnecessary scattered doses to normal tissues outside of the field edge in many clinical situations. This advantage is more pronounced for steeper wedge angles and shorter SSDs.
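As a toy illustration of the point-by-point comparison described above, the following Python sketch contrasts two hypothetical peripheral-dose profiles outside the field edge; the numbers are illustrative only and are not the measured data.

```python
# Compare hypothetical EDW and PW peripheral-dose profiles point by point.
import numpy as np

distance_cm = np.array([1, 2, 3, 4, 5])          # distance outside the field edge
edw_dose = np.array([11.9, 7.8, 4.9, 2.6, 1.4])  # % of central-axis dose (illustrative)
pw_dose  = np.array([12.4, 8.5, 5.6, 3.4, 2.5])  # % of central-axis dose (illustrative)

diff = pw_dose - edw_dose
print("PW minus EDW peripheral dose (%):", diff)
print("Mean reduction with EDW: %.2f%%" % diff.mean())
```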


Geological Structures and Geochemical Uranium Anormal Zone Around the Shinbo Mine, Korea (신보광산 주변지역의 지질구조와 우라늄 지화학 이상대)

  • Kang, Ji-Hoon;Lee, Deok-Seon
    • Economic and Environmental Geology / v.45 no.1 / pp.31-40 / 2012
  • This paper examined the characteristics of ductile and brittle structural elements with detailed mapping by lithofacies classification, to clarify the relationship between the geological structure and the geochemical high-grade uranium anomalous zone and to provide basic information on the flow of groundwater in the eastern area of the Shinbo mine, Jinan-gun, Jeollabuk-do, Korea. The results indicate that this area is mainly composed of Precambrian quartzite, metapelite, and metapsammite, which show a zonal distribution trending mainly ENE-WSW, and of age-unknown pegmatite and Cretaceous porphyry which intrude them. However, the Cretaceous Jinan Group, which unconformably covers them, contrary to assumption, could not be observed. The main ductile deformation structures of the Precambrian metasedimentary rocks were formed through at least three phases of deformation [ENE-striking regional foliation (D1) -> ENE- or EW-striking crenulation foliation (D2) -> WNW- or EW-trending open, tight, and kink folds (D3)]. The predominant orientation of the S1 regional foliation strikes ENE and dips south, similar to the zonal distribution of the Precambrian metasedimentary rocks. The most predominant orientation of high-angle brittle fractures (dip angle ${\geq}45^{\circ}$) [ENE (frequency: 24.3%) > NS (23.9%) > (N)NW (18.8%) > WNW (16.9%) > NE (16.1%) fracture sets, in descending frequency order], which is closely related to the flow of groundwater, strikes ENE and dips south. It also agrees with the zonal distribution of the metasedimentary rocks and the predominant orientation of the S1 regional foliation. The next most frequent set strikes NS and dips east or west. Considering the controlling factors of the geochemical uranium anomalous zone in the Shinbo mine and its eastern areas from the above structural data, the uranium source rock in these areas might be pegmatite, and the geochemical uranium anomalous zone in the Shinbo mine area could have been formed by secondary enrichment through the flow of the pegmatite aquifer's groundwater into the Shinbo mine area, as suggested by previous research.

Evaluation of Setup Uncertainty on the CTV Dose and Setup Margin Using Monte Carlo Simulation (몬테칼로 전산모사를 이용한 셋업오차가 임상표적체적에 전달되는 선량과 셋업마진에 대하여 미치는 영향 평가)

  • Cho, Il-Sung;Kwark, Jung-Won;Cho, Byung-Chul;Kim, Jong-Hoon;Ahn, Seung-Do;Park, Sung-Ho
    • Progress in Medical Physics / v.23 no.2 / pp.81-90 / 2012
  • The effect of setup uncertainties on the CTV dose and the correlation between setup uncertainties and the setup margin were evaluated by Monte Carlo based numerical simulation. Patient-specific information from an IMRT treatment plan for rectal cancer designed on the VARIAN Eclipse planning system, including the planned dose distribution and tumor volume information, was utilized for the Monte Carlo simulation program. The simulation program was developed for the purpose of this study on a Linux environment using open source packages, GNU C++ and the ROOT data analysis framework. All misalignments of patient setup were assumed to follow the central limit theorem; thus systematic and random errors were generated according to Gaussian statistics with a given standard deviation as the simulation input parameter. After the setup error simulations, the change of dose in the CTV volume was analyzed from the simulation results. In order to verify the conventional margin recipe, the correlation between setup error and setup margin was compared with the margin formula developed for three-dimensional conformal radiation therapy. The simulation was performed a total of 2,000 times for each simulation input of systematic and random errors independently. The standard deviation for generating patient setup errors was varied from 1 mm to 10 mm in 1 mm steps. In the case of systematic error, the minimum dose in the CTV, $D_{min}^{syst{\cdot}}$, decreased from 100.4% to 72.50% and the mean dose $\bar{D}_{syst{\cdot}}$ decreased from 100.45% to 97.88%, while the standard deviation of the dose distribution in the CTV volume increased from 0.02% to 3.33%. The effect of random error was likewise a reduction of the mean and minimum dose in the CTV volume: the minimum dose $D_{min}^{rand{\cdot}}$ was reduced from 100.45% to 94.80% and the mean dose $\bar{D}_{rand{\cdot}}$ decreased from 100.46% to 97.87%. As with systematic error, the standard deviation of the CTV dose ${\Delta}D_{rand}$ increased from 0.01% to 0.63%. After calculating the margin size for each systematic and random error, the "population ratio" was introduced and applied to verify the margin recipe. It was found that the conventional margin formula satisfies the margin objective for IMRT treatment of rectal cancer. The developed Monte Carlo based simulation program is considered useful for studying patient setup error and dose coverage in the CTV volume under variations of margin size and setup error.
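The sketch below gives a much-simplified picture of this kind of simulation: Gaussian systematic and random shifts are applied to an idealized 1D dose profile and the minimum and mean CTV dose are tracked. It is not the authors' program (which operates on the 3D Eclipse dose distribution, and which varied systematic and random errors independently), and the margin line assumes the commonly cited 3D-CRT recipe of 2.5Σ + 0.7σ, which the abstract appears to reference but does not state explicitly.

```python
# Simplified, hypothetical Monte Carlo sketch of setup-error effects on CTV dose.
# All geometry and the dose profile are stand-ins, not the study's data.
import numpy as np

rng = np.random.default_rng(0)

def dose_profile(x, field_half_width=30.0, penumbra=3.0):
    """Idealized 1D dose profile (%): flat inside the field with soft edges."""
    return 100.0 / (1.0 + np.exp((np.abs(x) - field_half_width) / penumbra))

def simulate(sigma_syst, sigma_rand, n_patients=2000, n_fractions=25):
    ctv = np.linspace(-25.0, 25.0, 101)     # CTV spans +/- 25 mm (hypothetical)
    min_doses, mean_doses = [], []
    for _ in range(n_patients):
        syst = rng.normal(0.0, sigma_syst)               # one systematic shift per patient
        rand = rng.normal(0.0, sigma_rand, n_fractions)  # one random shift per fraction
        d = dose_profile(ctv[None, :] + syst + rand[:, None]).mean(axis=0)
        min_doses.append(d.min())
        mean_doses.append(d.mean())
    return np.mean(min_doses), np.mean(mean_doses)

for sd in range(1, 11):                      # 1 mm to 10 mm in 1 mm steps
    d_min, d_mean = simulate(sigma_syst=sd, sigma_rand=sd)
    margin = 2.5 * sd + 0.7 * sd             # assumed conventional 3D-CRT margin recipe
    print(f"SD={sd} mm: CTV D_min={d_min:.1f}%, D_mean={d_mean:.1f}%, margin={margin:.1f} mm")
```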

Application of Terrestrial LiDAR for Reconstructing 3D Images of Fault Trench Sites and Web-based Visualization Platform for Large Point Clouds (지상 라이다를 활용한 트렌치 단층 단면 3차원 영상 생성과 웹 기반 대용량 점군 자료 가시화 플랫폼 활용 사례)

  • Lee, Byung Woo;Kim, Seung-Sep
    • Economic and Environmental Geology / v.54 no.2 / pp.177-186 / 2021
  • For disaster management and the mitigation of earthquakes in the Korean Peninsula, active fault investigation has been conducted for the past 5 years. In particular, the investigation of sediment-covered active faults integrates geomorphological analysis of airborne LiDAR data, surface geological survey, and geophysical exploration, and unearths subsurface active faults by trench survey. However, the fault traces revealed by trench surveys are only available for investigation for a limited time before the site is restored to its previous condition. Thus, the geological data describing the fault trench sites remain only as qualitative data in research articles and reports. To overcome the limitations imposed by the temporal nature of such geological studies, we utilized a terrestrial LiDAR to produce 3D point clouds of the fault trench sites and restored them in a digital space. The terrestrial LiDAR scanning was conducted at two trench sites located near the Yangsan Fault and acquired amplitude and reflectance from the surveyed area, as well as color information obtained by combining photogrammetry with the LiDAR system. The scanned data were merged to form 3D point clouds with an average geometric error of 0.003 m, sufficient accuracy to restore the details of the surveyed trench sites. However, we found that more post-processing of the scanned data would be necessary, because the amplitudes and reflectances of the point clouds varied depending on the scan positions, and the colors of the trench surfaces were captured differently depending on the light exposure available at the time. Such point clouds are quite large in size and can be visualized only through a limited set of software tools, which limits data sharing among researchers. As an alternative, we suggest Potree, an open-source web-based platform, to visualize the point clouds of the trench sites. As a result, we identified that terrestrial LiDAR data can be a practical means of increasing the reproducibility of geological field studies and can be made easily accessible to researchers and students in the Earth Sciences.
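For readers who want to experiment with similar data, the following sketch shows one way to merge and downsample terrestrial LiDAR scans before handing them to a web viewer such as Potree. It uses the Open3D library, which is an assumption of mine rather than the authors' toolchain, and the file names and voxel size are hypothetical.

```python
# Minimal sketch (not the authors' pipeline): merge terrestrial LiDAR scans and
# downsample them so a web viewer such as Potree stays responsive.
import open3d as o3d

def merge_scans(paths, voxel_size=0.005):
    merged = o3d.geometry.PointCloud()
    for path in paths:
        scan = o3d.io.read_point_cloud(path)  # reads .ply/.pcd/.xyz, etc.
        merged += scan                        # concatenate points and colors
    # Downsample to reduce file size while keeping geometric detail
    return merged.voxel_down_sample(voxel_size=voxel_size)

if __name__ == "__main__":
    cloud = merge_scans(["trench_scan_1.ply", "trench_scan_2.ply"])  # hypothetical files
    o3d.io.write_point_cloud("trench_merged.ply", cloud)
    # The merged .ply can then be converted for Potree with an external tool
    # such as PotreeConverter (assumed to be run separately on the command line).
```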

A Study on the Documentation Related to Mugeuk-do: Focusing on Its Comparison and Historical Evidence (무극도 관련 문헌 연구 - 비교 및 고증을 중심으로 -)

  • Park Sang-kyu
    • Journal of the Daesoon Academy of Sciences / v.41 / pp.27-61 / 2022
  • Documentation related to Mugeuk-do (Limitless Dao) is rare in comparison to that of other Korean new religions, given that it has only been open to the public and translated since the 1970s. Due to its rarity, the documentation has been used uncritically, without any comparative study or historical research. It is undeniable that distortions and fallacies are embedded in these documents, and this has resulted in quite a few problems in precisely understanding Mugeuk-do and Daesoon Jinrihoe (The Fellowship of Daesoon Truth), an order that has inherited the legacy of Mugeuk-do. In this regard, this study aims to critically define, through comparison, the characteristics and limitations of the major documents related to Mugeuk-do that were published by the colonial government in the 1920s-1930s and recorded by multiple orders in the 1970s-1980s. This research offers a solution to the problem of the uncritical usage of those materials. The documents produced by the colonial government that can be used as basic texts to study Mugeuk-do are The General Conditions of the Religion Mugeuk-do (無極大道敎槪況) and Unofficial Religions of Korea (朝鮮の類似宗敎). These can be examined through bibliography, comparison, and historical research. Chapters 6, 7, and 8 of The General Conditions of the Religion Mugeuk-do are a possible source on the order that reflects the circumstances of Mugeuk-do until 1925. In the case of Unofficial Religions of Korea, if the descriptive perspective on unofficial religions is excluded, the articles written about the circumstances after 1925 have credibility. Another document that describes multiple orders and can be used as a basic text is chapter 2 of 'Progress of the Order' in Daesoon Jinrihoe's The Canonical Scripture. This is because its record precisely reflects the conditions of the era: it is the freest from distortions caused by changes in the belief system, and it is less biased towards certain sects or denominations. Furthermore, the collection period of its articles is the earliest. Accordingly, Chapters 6, 7, and 8 of The General Conditions of the Religion Mugeuk-do, the articles from Unofficial Religions of Korea after 1925, and chapter 2 of 'Progress of the Order' in The Canonical Scripture are appropriate basic texts for studying Mugeuk-do. In addition, Overview of Bocheonism, History of Jeungsan-gyo, and The True Scripture of the Great Ultimate can be utilized as references after removing distortions and fallacies through comparative study. Henceforth, the relevant documents should be utilized to establish comprehensive data on Mugeuk-do through comparative and historical research.

A New Exploratory Research on Franchisor's Provision of Exclusive Territories (가맹본부의 배타적 영업지역보호에 대한 탐색적 연구)

  • Lim, Young-Kyun;Lee, Su-Dong;Kim, Ju-Young
    • Journal of Distribution Research / v.17 no.1 / pp.37-63 / 2012
  • In the franchise business, exclusive sales territory (EST) protection is a very important issue from an economic, social, and political point of view. It affects the growth and survival of both franchisor and franchisee and often raises social and political conflicts. When the franchisee is not familiar with the related laws and regulations, the franchisor has a high chance of exploiting this. Exclusive sales territory protection by the manufacturer and distributors (wholesalers or retailers) means a sales area restriction by which only certain distributors have the right to sell products or services. A distributor who has been granted an exclusive sales territory can protect its own territory, but may be prohibited from entering other regions. Even though exclusive sales territory is quite a critical problem in the franchise business, there is not much rigorous research about its reasons, results, evaluation, and future direction based on empirical data. This paper tries to address this problem not only in terms of logical and nomological validity, but also through empirical validation. While we pursue an empirical analysis, we take into account the difficulties of real data collection and statistical analysis techniques. We use a set of disclosure document data collected by the Korea Fair Trade Commission, instead of the conventional survey method, which is usually criticized for its measurement error. Existing theories about exclusive sales territory can be summarized into two groups, as shown in the table below. The first one concerns the effectiveness of exclusive sales territory from both the franchisor's and franchisee's point of view. In fact, the output of exclusive sales territory can be positive for franchisors but negative for franchisees. Also, it can be positive in terms of sales but negative in terms of profit. Therefore, variables and viewpoints should be set properly. The other one concerns the motive or reason why exclusive sales territory is protected. The reasons can be classified into four groups: industry characteristics, franchise system characteristics, capability to maintain exclusive sales territory, and strategic decision. Within these four groups of reasons, there are more specific variables and theories, as below. Based on these theories, we develop nine hypotheses, which are briefly shown in the last table below with the results. In order to validate the hypotheses, data are collected from the government (FTC) homepage, which is an open source. The sample consists of 1,896 franchisors and contains about three years of operation data, from 2006 to 2008. Within the sample, 627 have an exclusive sales territory protection policy, and those with such a policy are not evenly distributed over the 19 representative industries. Additional data are also collected from another government agency homepage, such as that of Statistics Korea. Also, we combine data from various secondary sources to create meaningful variables, as shown in the table below. All variables are dichotomized by mean or median split if they are not inherently dichotomous by definition, since each hypothesis is composed of multiple variables and there is no solid statistical technique to incorporate all these conditions to test the hypotheses. This paper uses a simple chi-square test because the hypotheses and theories are built upon quite specific conditions such as industry type, economic condition, company history, and various strategic purposes.
It is almost impossible to find samples that satisfy all of those conditions, and they cannot be manipulated in experimental settings. More advanced statistical techniques work well on clean data without exogenous variables, but not on real, complex data. The chi-square test is applied by grouping samples into four cells using two criteria: whether they use exclusive sales territory protection or not, and whether they satisfy the conditions of each hypothesis. We then test whether the proportion of sample franchisors that satisfy the conditions and protect exclusive sales territory significantly exceeds the proportion that satisfy the conditions and do not protect it. In fact, the chi-square test is equivalent to Poisson regression, which allows more flexible application. As a result, only three hypotheses are accepted. When the attitude toward risk is high, so that the royalty fee is determined according to sales performance, EST protection produces poor results, as expected. When the franchisor protects EST in order to recruit franchisees more easily, EST protection produces better results. Also, when EST protection is intended to improve the efficiency of the franchise system as a whole, it shows better performance. High efficiency is achieved because EST prohibits free riding by franchisees who exploit others' marketing efforts, encourages proper investment, and distributes franchisees evenly across regions. The other hypotheses are not supported by the significance tests. Exclusive sales territory should be protected for proper motives and administered for mutual benefit. Legal restrictions driven by government agencies like the FTC could be misused and cause misunderstandings. Thus, more careful monitoring of real practices and more rigorous studies by both academics and practitioners are needed.
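For illustration, a minimal version of the 2x2 chi-square test described above can be run with SciPy; the cell counts below are hypothetical (only the totals of 627 protecting and 1,269 non-protecting franchisors come from the abstract).

```python
# Minimal 2x2 chi-square test of association between EST protection and a
# hypothesis condition, using hypothetical cell counts.
from scipy.stats import chi2_contingency

#                satisfies condition | does not satisfy
table = [[240, 387],    # protects exclusive sales territory (627 total)
         [380, 889]]    # does not protect (1,269 total)

chi2, p_value, dof, expected = chi2_contingency(table)
print(f"chi2 = {chi2:.2f}, p = {p_value:.4f}, dof = {dof}")
# A small p-value suggests an association between protection and the condition;
# the direction is checked by comparing observed counts with `expected`.
```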
