• Title/Summary/Keyword: Decision making tree

Improving Performance of Recommendation Systems Using Topic Modeling (사용자 관심 이슈 분석을 통한 추천시스템 성능 향상 방안)

  • Choi, Seongi;Hyun, Yoonjin;Kim, Namgyu
    • Journal of Intelligence and Information Systems / v.21 no.3 / pp.101-116 / 2015
  • With the recent spread of smart devices and social media, vast amounts of information in various forms have accumulated. In particular, considerable research effort is being directed towards analyzing unstructured big data to resolve various social problems. Accordingly, the focus of data-driven decision-making is moving from structured to unstructured data analysis. In the field of recommendation systems, a typical area of data-driven decision-making, the need to use unstructured data to improve system performance has also steadily increased. Approaches to improving the performance of recommendation systems fall into two groups: improving algorithms and acquiring useful, high-quality data. Traditionally, most efforts have taken the former approach, while the latter has attracted relatively little attention. In this sense, efforts to utilize unstructured data from various sources are timely and necessary. In particular, because users' interests are directly connected with their needs, identifying those interests through unstructured big data analysis can be a clue for improving the performance of recommendation systems. This study therefore proposes a methodology for improving recommendation systems by measuring users' interests. Specifically, it proposes a method to quantify users' interests by analyzing their internet usage patterns and to predict users' repurchases based on the discovered preferences. There are two important modules in this study. The first module predicts the repurchase probability for each category by analyzing users' purchase history. We include the first module in our research scope in order to compare the accuracy of the traditional purchase-based prediction model with that of the new model presented in the second module. This procedure extracts the purchase history of users. 
The core of our methodology is the second module, which extracts users' interests by analyzing the news articles they have read. It first constructs a correspondence matrix between topics and news articles by performing topic modeling on real-world news articles. It then analyzes users' news access patterns and constructs a correspondence matrix between articles and users. By merging these two results, we obtain a correspondence matrix between users and topics, which describes users' interests in a structured manner. Finally, using this matrix, the second module builds a model for predicting the repurchase probability for each category. In this paper, we also provide experimental results from our performance evaluation. The data used in our experiments are outlined as follows. We acquired web transaction data for 5,000 panel members from a company that specializes in analyzing the rankings of internet sites. We first extracted 15,000 URLs of news articles published from July 2012 to June 2013 from the original data and crawled the main content of those articles. We then selected the 2,615 users who had read at least one of the extracted news articles. Among these 2,615 users, 359 had purchased at least one item from our target shopping mall 'G'. In the experiments, we analyzed the purchase history and news access records of these 359 internet users. The performance evaluation showed that our prediction model, which uses both users' interests and purchase history, outperforms a prediction model that uses only purchase history in terms of misclassification ratio. In detail, our model outperformed the traditional one in the appliance, beauty, computer, culture, digital, fashion, and sports categories when artificial neural network based models were used. 
Similarly, our model outperformed the traditional one in beauty, computer, digital, fashion, food, and furniture categories when decision tree based models were used although the improvement is very small.
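
The merging step described in this abstract (a user-by-article matrix combined with an article-by-topic matrix to give a user-by-topic matrix) can be sketched as a plain matrix product. This is only an illustrative sketch, not the authors' implementation: the function name `user_topic_matrix`, the row normalization, and the toy numbers are all assumptions introduced here.

```python
def user_topic_matrix(user_article, article_topic):
    """Combine user-by-article reads with article-by-topic weights.

    Hypothetical helper: returns one row per user, holding that user's
    share of interest in each topic (rows sum to 1, or all zeros if the
    user read nothing).
    """
    n_articles = len(article_topic)
    n_topics = len(article_topic[0])
    result = []
    for reads in user_article:
        if len(reads) != n_articles:
            raise ValueError("article dimensions do not match")
        # Matrix product: sum topic weights over the articles the user read.
        scores = [
            sum(reads[a] * article_topic[a][t] for a in range(n_articles))
            for t in range(n_topics)
        ]
        total = sum(scores)
        # Normalizing is an assumption made here so that heavy and light
        # readers are comparable; the abstract does not specify it.
        result.append([s / total if total > 0 else 0.0 for s in scores])
    return result


# Toy data (made up): 2 users, 3 articles, 2 topics.
reads = [[1, 0, 1],
         [0, 1, 0]]
topics = [[0.8, 0.2],
          [0.1, 0.9],
          [0.6, 0.4]]
interests = user_topic_matrix(reads, topics)
```

Each row of `interests` could then serve as the feature vector for a per-category repurchase classifier such as the decision tree or neural network models mentioned in the abstract.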

Development of Diameter Distribution Change and Site Index in a Stand of Robinia pseudoacacia, a Major Honey Plant (꿀샘식물 아까시나무의 지위지수 도출 및 직경분포 변화)

  • Kim, Sora;Song, Jungeun;Park, Chunhee;Min, Suhui;Hong, Sunghee;Yun, Junhyuk;Son, Yeongmo
    • Journal of Korean Society of Forest Science / v.111 no.2 / pp.311-318 / 2022
  • We conducted this study to derive the site index, which is a criterion for planting Robinia pseudoacacia, a honey plant, and to investigate how the diameter distribution changes with the derived site index. We applied the Chapman-Richards equation model to estimate the site index of the Robinia pseudoacacia stand. The site index ranged from 16 to 22 at a base age of 30 years. The fitness index of the site index estimation model was low, but we judged that this posed no problem for application because the residuals of the equation were not skewed to one side. We used the Weibull diameter distribution function to determine the diameter distribution of the Robinia pseudoacacia stand by site index. We used the mean diameter and the dominant tree height as independent variables, and our analysis procedure was to estimate and recover the parameters of the Weibull diameter distribution function. With these variables, the fitness index for estimating the dbh distribution by diameter class was about 80.5%. Plotting the diameter distribution by site index for a 30-year-old stand, we found that the higher the site index, the further the diameter distribution curve moved to the right. This suggests that if a plantation were established on a site with a high site index, taking into account the trees suited to the site, the growth of Robinia pseudoacacia would become vigorous, and not only the production of wood but also the production of honey would increase. We therefore anticipate that the site index classification table and curve of this Robinia pseudoacacia stand will become the standard for decision making in the plantation and management of this tree.
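
As a rough illustration of how a site index can be read off a Chapman-Richards height curve, the sketch below uses the common anamorphic (guide-curve) form. The abstract does not give the fitted equation or its coefficients, so the functional form of `site_index`, the function names, and the parameter values `a`, `b`, `c` are all assumptions; only the base age of 30 years comes from the abstract.

```python
import math


def chapman_richards_height(age, a, b, c):
    """Guide-curve dominant height: H = a * (1 - exp(-b * age)) ** c."""
    return a * (1.0 - math.exp(-b * age)) ** c


def site_index(height, age, b, c, base_age=30.0):
    """Project an observed dominant height to the base age.

    Assumes an anamorphic curve family, so the asymptote `a` cancels out.
    """
    ratio = (1.0 - math.exp(-b * base_age)) / (1.0 - math.exp(-b * age))
    return height * ratio ** c


# Placeholder coefficients for illustration only (not from the paper).
a, b, c = 25.0, 0.05, 1.2

# A stand lying exactly on the guide curve at age 20 should project to
# the guide-curve height at the base age of 30 years.
h20 = chapman_richards_height(20.0, a, b, c)
si = site_index(h20, 20.0, b, c)
```

With real coefficients, stands of different ages could be compared on the same scale by converting each observed dominant height to its base-age equivalent.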

Determination of Fire Severity and Deduction of Influence Factors Through Landsat-8 Satellite Image Analysis - A Case Study of Gangneung and Donghae Forest Fires - (Landsat-8 위성영상 분석을 통한 산불피해 심각도 판정 및 영향 인자 도출 - 강릉, 동해 산불을 사례로 -)

  • Soo-Dong Lee;Gyoung-Sik Park;Chung-Hyeon Oh;Bong-Gyo Cho;Byeong-Hyeok Yu
    • Korean Journal of Environment and Ecology / v.38 no.3 / pp.277-292 / 2024
  • To manage the large-scale forest fires concentrated in Gangwon-do and Gyeongsangbuk-do, regions with severe topographical heterogeneity, a decision-making process based on efficient and rapid damage assessment using satellite images is essential. Accordingly, this study targets the large-scale forest fire that ignited in Gangneung and Donghae, Gangwon-do on March 5, 2022, and was extinguished around 19:00 on March 8, in order to estimate the fire severity using dNBR and derive the environmental factors that affect the severity grade. As environmental factors, we quantified the normalized vegetation index representing vegetation or fuel type, the forest index that classifies tree species, the normalized moisture index representing moisture content, and a DEM representing topography, and then analyzed their correlation with fire severity. In terms of fire severity, the largest class was 'Unburned' at 52.4%, followed by low severity at 42.9%, medium-low severity at 4.3%, and medium-high severity at 0.4%. Among the environmental factors, dNDVI and dNDWI showed a negative correlation with fire severity, and slope showed a positive correlation. Regarding vegetation, the differences between the coniferous, broad-leaved, and other groups in dNDVI, dNDWI, and slope, the factors found to affect fire severity, were significant (p-value < 2.2e-16). In particular, the difference between coniferous and broad-leaved forests was clear: coniferous forest, including the dominant species Pinus densiflora as well as P. koraiensis, P. rigida, and P. thunbergii, showed higher fire severity and therefore suffered more damage than broad-leaved forest in the Gangwon-do region.
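
The dNBR index used in this abstract is the difference between the pre-fire and post-fire Normalized Burn Ratio, where NBR = (NIR - SWIR2) / (NIR + SWIR2); for Landsat-8 these correspond to bands 5 and 7. The following is a minimal per-pixel sketch. The severity breakpoints are the commonly cited USGS-style values and are an assumption here, since the abstract does not state which thresholds the authors used; the function names and sample reflectances are likewise illustrative.

```python
def nbr(nir, swir2):
    """Normalized Burn Ratio from NIR and SWIR2 surface reflectance."""
    denom = nir + swir2
    return (nir - swir2) / denom if denom != 0 else 0.0


def dnbr(pre_nir, pre_swir2, post_nir, post_swir2):
    """Differenced NBR: pre-fire NBR minus post-fire NBR."""
    return nbr(pre_nir, pre_swir2) - nbr(post_nir, post_swir2)


def severity_class(value):
    """Map a dNBR value to a severity label.

    Breakpoints are assumed (USGS-style); negative values, which usually
    indicate regrowth, are folded into 'unburned' here for brevity.
    """
    if value < 0.10:
        return "unburned"
    if value < 0.27:
        return "low"
    if value < 0.44:
        return "moderate-low"
    if value < 0.66:
        return "moderate-high"
    return "high"


# Made-up reflectances for a single pixel before and after a fire.
pixel_dnbr = dnbr(pre_nir=0.40, pre_swir2=0.15, post_nir=0.20, post_swir2=0.25)
pixel_class = severity_class(pixel_dnbr)
```

Applying `severity_class` to every pixel and tabulating the labels would give area shares per class, the kind of breakdown reported in the abstract.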

Classification of Urban Green Space Using Airborne LiDAR and RGB Ortho Imagery Based on Deep Learning (항공 LiDAR 및 RGB 정사 영상을 이용한 딥러닝 기반의 도시녹지 분류)

  • SON, Bokyung;LEE, Yeonsu;IM, Jungho
    • Journal of the Korean Association of Geographic Information Studies / v.24 no.3 / pp.83-98 / 2021
  • Urban green space is an important component for enhancing urban ecosystem health, so identifying its spatial structure is required to manage a healthy urban ecosystem. The Ministry of Environment has provided the level 3 land cover map (the map with the highest spatial resolution, 1 m) with a total of 41 classes since 2010. However, in this map, specific urban green features such as street trees are identified merely as grassland or are not classified as vegetated areas at all. Therefore, this study classified detailed urban green information (i.e., tree, shrub, and grass) not included in the existing level 3 land cover map, using two types of high-resolution (<1 m) remote sensing data (i.e., airborne LiDAR and RGB ortho imagery) in Suwon, South Korea. U-Net, a deep learning approach for image segmentation, was adopted to classify detailed urban green space. A total of three classification models (i.e., LRGB10, LRGB5, and RGB5) were proposed, depending on the target number of classes and the types of input data. The average overall accuracies for the test sites were 83.40% (LRGB10), 89.44% (LRGB5), and 74.76% (RGB5). Among the three models, LRGB5, which uses both airborne LiDAR and RGB ortho imagery with 5 target classes (i.e., tree, shrub, grass, building, and others), performed best. The area ratio of total urban green space (based on tree, shrub, and grass information) for the whole of Suwon was 45.61% (LRGB10), 43.47% (LRGB5), and 44.22% (RGB5). All models provided an additional 13.40% of urban tree information on average compared to the existing level 3 land cover map. Moreover, because these classification results provide detailed information on urban green space, they are expected to be used in various urban green studies and decision-making processes.

Prioritization of Species Selection Criteria for Urban Fine Dust Reduction Planting (도시 미세먼지 저감 식재를 위한 수종 선정 기준의 우선순위 도출)

  • Cho, Dong-Gil
    • Korean Journal of Environment and Ecology / v.33 no.4 / pp.472-480 / 2019
  • The selection of plant material for planting to reduce fine dust should comprehensively consider visual characteristics, such as the shape and texture of the leaves and the form of the bark, which affect the adsorption function of the plant. However, previous studies on reducing fine dust through plants have focused on the absorption function rather than the adsorption function of plants, and on indoor foliage plants rather than outdoor plants. In particular, the criteria for selecting fine dust reduction species are not specific, so research on the selection criteria for plant materials for fine dust reduction in urban areas is needed. The purpose of this study is to identify the priorities of eight indicators that affect fine dust reduction by using the fuzzy multi-criteria decision-making model (MCDM) and to establish tree selection criteria for urban planting to reduce fine dust. For this purpose, we conducted a questionnaire survey of those who majored in fine dust-related academic fields and those with experience in researching fine dust. The survey results showed that leaf area and tree species received the highest scores as factors affecting fine dust reduction. They were followed by the surface roughness of leaves, tree height, growth rate, complexity of leaves, edge shape of leaves, and bark features, in that order. When selecting species with coarse leaf surfaces, it is better to select trees with wooly, glossy, and waxy layers on the leaves. When considering leaf shape, it is better to select two-type or three-type leaves and palm-shaped leaves rather than single-type leaves, and serrated leaves rather than smooth-edged leaves, to increase the surface area available for adsorbing fine dust from the air. 
When considering the characteristics of the bark, it is better to select trees that have cork layers, or that show or are likely to show bark loosening or cracks, than those with lenticels or patterned bark. This study is significant in that it presents the priorities of the selection criteria for plant material, based on the visual characteristics that affect the adsorption of fine dust, for planning planting to reduce fine dust in urban areas. The results of this study can be used as basic data for selecting trees in urban planting plans.

Visualizing the Results of Opinion Mining from Social Media Contents: Case Study of a Noodle Company (소셜미디어 콘텐츠의 오피니언 마이닝결과 시각화: N라면 사례 분석 연구)

  • Kim, Yoosin;Kwon, Do Young;Jeong, Seung Ryul
    • Journal of Intelligence and Information Systems / v.20 no.4 / pp.89-105 / 2014
  • Since the emergence of the Internet, social media with highly interactive Web 2.0 applications has provided very user-friendly means for consumers and companies to communicate with each other. Users routinely publish content expressing their opinions and interests on social media such as blogs, forums, chat rooms, and discussion boards, and the content is released on the Internet in real time. For that reason, many researchers and marketers regard social media content as a source of information for business analytics to develop business insights, and many studies have reported results on mining business intelligence from social media content. In particular, opinion mining and sentiment analysis, as techniques to extract, classify, understand, and assess the opinions implicit in text, are frequently applied to social media content analysis because they emphasize determining sentiment polarity and extracting authors' opinions. A number of frameworks, methods, techniques, and tools have been presented by these researchers. However, we have found some weaknesses in their methods, which are often technically complicated and not sufficiently user-friendly to support business decisions and planning. In this study, we attempted to formulate a more comprehensive and practical approach to conducting opinion mining with visual deliverables. First, we described the entire cycle of practical opinion mining using social media content, from the initial data gathering stage to the final presentation session. Our proposed approach to opinion mining consists of four phases: collecting, qualifying, analyzing, and visualizing. In the first phase, analysts have to choose the target social media. Each target medium requires a different way for analysts to gain access: open APIs, search tools, DB2DB interfaces, purchased content, and so on. The second phase is pre-processing to generate useful materials for meaningful analysis. 
If we do not remove garbage data, the results of social media analysis will not provide meaningful and useful business insights. To clean social media data, natural language processing techniques should be applied. The next step is the opinion mining phase, where the cleansed social media content set is analyzed. The qualified data set includes not only user-generated content but also content identification information such as creation date, author name, user id, content id, hit counts, review or reply, favorite, etc. Depending on the purpose of the analysis, researchers or data analysts can select a suitable mining tool. Topic extraction and buzz analysis are usually related to market trend analysis, while sentiment analysis is used to conduct reputation analysis. There are also various applications, such as stock prediction, product recommendation, sales forecasting, and so on. The last phase is the visualization and presentation of analysis results. The major focus and purpose of this phase are to explain the results of the analysis and help users comprehend their meaning. Therefore, to the extent possible, deliverables from this phase should be simple, clear, and easy to understand, rather than complex and flashy. To illustrate our approach, we conducted a case study on a leading Korean instant noodle company. We targeted the leading company, NS Food, with a 66.5% market share; the firm has held the No. 1 position in the Korean "Ramen" business for several decades. We collected a total of 11,869 pieces of content, including blogs, forum posts, and news articles. After collecting the social media content data, we generated language resources specific to the instant noodle business for data manipulation and analysis using natural language processing. In addition, we tried to classify the content into more detailed categories such as marketing features, environment, reputation, etc. 
In these phases, we used freeware packages from the R project such as TM, KoNLP, ggplot2, and plyr. As a result, we presented several useful visualization outputs, such as domain-specific lexicons, volume and sentiment graphs, topic word clouds, heat maps, valence tree maps, and other visualized images, to provide vivid, full-color examples using open library software packages of the R project. Business actors can detect at a glance which areas are weak, strong, positive, negative, quiet, or loud. A heat map can explain the movement of sentiment or volume in a matrix of categories and time, showing color density over time periods. A valence tree map, one of the most comprehensive and holistic visualization models, should help analysts and decision makers quickly understand the "big picture" of the business situation through a hierarchical structure, since a tree map can present buzz volume and sentiment for a given period in a single visualized result. This case study offers real-world business insights from market sensing, demonstrating to practical-minded business users how they can use these types of results for timely decision making in response to ongoing changes in the market. We believe our approach can provide a practical and reliable guide to opinion mining with visualized results that are immediately useful, not just in the food industry but in other industries as well.

Matching Algorithms using the Union and Division (결합과 분배를 이용한 정합 알고리즘)

  • 박종민;조범준
    • Journal of the Korea Institute of Information and Communication Engineering / v.8 no.5 / pp.1102-1107 / 2004
  • A fingerprint recognition system consists of off-line and on-line processing. The former registers the features extracted from the digitized fingerprint, which is obtained from the analog fingerprint through the fingerprint acquisition device. The latter decides whether a user is granted access to the system by matching the features extracted from the input fingerprint against those in the database when the user attempts to use the system. In matching between on-line and off-line processing, the most important question is which features to use as the reference. Until now, the “Delta” and “Core” have been used as this reference, but they have the drawback that they do not exist in every person's fingerprint. To handle users who lack those features, matching methods that build a spanning tree or a triangulation from the relations among the features are still used. However, these methods incur time overhead, and it is not certain that they produce a correct match. Therefore, this paper presents a more accurate matching algorithm that achieves both a better matching rate and a lower mismatching rate than existing matching algorithms by selecting as the reference the line segment connecting two minutiae on the same ridge and furrow structures.

Taper Equations and Stem Volume Table of Eucalyptus pellita and Acacia mangium Plantations in Indonesia (인도네시아 유칼립투스 및 아카시아 조림지의 수간곡선식 및 수간재적표 조제)

  • Son, Yeong Mo;Kim, Hoon;Lee, Ho Young;Kim, Cheol Min;Kim, Cheol Sang;Kim, Jae Weon;Joo, Rin Won;Lee, Kyeong Hak
    • Journal of Korean Society of Forest Science / v.98 no.6 / pp.633-638 / 2009
  • This study was conducted to develop stem taper equations and stem volume tables for Eucalyptus pellita and Acacia mangium plantations in Kalimantan, Indonesia. To derive the most suitable taper equation for the plantations, three models (the Max & Burkhart, Kozak, and Lee models) were applied, and their goodness of fit was statistically analyzed using the fitness index, bias, and standard error of bias. The results showed no significant difference between the three models, but the fitness index was slightly higher for the Kozak model. Therefore, the Kozak model was chosen for generating stem taper equations and stem volume tables for the Eucalyptus pellita and Acacia mangium plantations. The resulting stem volume table was compared to the local volume table used in the Kalimantan region, and no significant difference was found in the stem volume estimates. The results of this study are expected to provide useful information about tree growth in overseas plantations and to support reliable decision-making for their management.

Reconsideration of decision making for third molar extraction (하악 제3대구치 발치의 결정에 관한 재고찰 - 발치 현황과 영향 인자를 중심으로)

  • Park, Won-Se;Kim, Jin-Hak;Kang, Sang-Hoon;Kim, Moon-Key;Kim, Bong-Chul;Choi, Ji-Wook;Lee, Sang-Hwy
    • Journal of the Korean Association of Oral and Maxillofacial Surgeons / v.37 no.5 / pp.343-348 / 2011
  • Introduction: Third molar extraction is one of the most common procedures in oral and maxillofacial surgery. The impacted third molar causes many pathological conditions, such as pericoronitis, caries, periodontitis, resorption of adjacent teeth, and cysts or tumors associated with impacted teeth. Extraction is often considered the treatment of choice for impacted lower third molars. On the other hand, imprudent extraction of deeply impacted third molars can cause permanent complications, such as inferior alveolar nerve damage. Therefore, guidelines for the extraction of lower third molars should be set to prevent embarrassing complications. This study examined the indications and current trends for extracted lower third molars in the dental hospital of a dental college. Materials and Methods: 557 extracted third molars were evaluated at the department of oral and maxillofacial surgery of Yonsei University. The chief complaint, diagnosis, age, and degree of impaction were analyzed to determine the tendency for the extraction of asymptomatic lower third molars. Results: The percentage of asymptomatic third molars was 40.8%. Among fully impacted and fully erupted teeth, the percentage of asymptomatic teeth was more than 50% (52.4% and 54.3%, respectively). Among partially impacted teeth, 73.1% showed symptoms, such as pain, tenderness, and swelling. In terms of age, pericoronitis was evident at a younger age, and dental caries/periodontitis was the main cause of removal in those aged over 50. Twenty-nine cases (1.6%) had teeth associated with pathological changes. Conclusion: The incidence of pathological changes to the lower third molar was relatively low. Surgical extraction is recommended in cases of partially impacted teeth. In Korea, the incidence of asymptomatic third molar extraction was relatively higher than in European countries. 
More careful attention would be desirable to consider the risks and benefits of lower third molar extraction.

Changes in Corporate Governance and Competitiveness in Vietnam: Strategies for the Equitization of Vinacafe (베트남 기업 지배구조의 변화와 경쟁력: 비나카페의 주식회사화 전략)

  • Ji, Hochul;Lee, Sung-Cheol
    • Journal of the Economic Geographical Society of Korea / v.18 no.4 / pp.415-430 / 2015
  • Since the late 1990s, Vinacafe has gone through strategic changes in corporate governance and management due to the growing entry of coffee MNCs, rising global demand for sustainable coffee, aging coffee trees, and the deterioration of coffee production caused by climate change in Vietnam. Vinacafe has attempted to cope with these changes through equitization strategies. The main aim of this paper is therefore to identify strategies for enhancing the competitiveness of the Vietnamese coffee industry by investigating changes in corporate governance and in the processes of coffee production and distribution. The equitization of Vinacafe has enhanced coffee competitiveness in two respects. Firstly, as it has decentralized decision-making away from headquarters, subsidiaries have become able to strengthen their own competitiveness by introducing new technologies, improving coffee quality, and encouraging the adoption of eco-friendly production methods through cooperative relationships with stakeholders involved in coffee production and distribution in Vietnam. Secondly, it has enhanced competitiveness through more diversified and effective coffee management, by increasing the flexibility of contracts with coffee farmers and diversifying coffee sales and supply chains in Vietnam.
