• Title/Summary/Keyword: tree regression (트리 회귀)

Terminology Recognition System based on Machine Learning for Scientific Document Analysis (과학 기술 문헌 분석을 위한 기계학습 기반 범용 전문용어 인식 시스템)

  • Choi, Yun-Soo;Song, Sa-Kwang;Chun, Hong-Woo;Jeong, Chang-Hoo;Choi, Sung-Pil
    • The KIPS Transactions:PartD / v.18D no.5 / pp.329-338 / 2011
  • Terminology recognition, a prerequisite for text mining, information extraction, information retrieval, the semantic web, and question answering, has been studied intensively in only a limited range of domains, especially the biomedical domain. Because the resources used in previous work were domain specific, those approaches were difficult to apply to general domains. We propose a domain-independent terminology recognition system based on machine learning that uses a dictionary, syntactic features, and Web search results. Compared with the related C-value approach, which is widely used and is based on local domain frequencies, the proposed approach achieved an F-score of 80.8, a 6.5% improvement. In a second experiment with various combinations of unithood features, the method combined with NGD (Normalized Google Distance) showed the best performance, an F-score of 81.8. We applied three machine learning methods, logistic regression, C4.5, and SVMs, and obtained the best score with the decision tree method, C4.5.
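
The abstract names NGD as a unithood feature but does not show how it is computed. A minimal sketch of the standard Normalized Google Distance formula follows; the function name `ngd`, the hit counts, and the index size `n` are assumptions for illustration, not values or code from the paper.

```python
import math

def ngd(fx: float, fy: float, fxy: float, n: float) -> float:
    """Normalized Google Distance computed from search hit counts.

    fx, fy: number of pages containing each term on its own
    fxy:    number of pages containing both terms
    n:      assumed total number of indexed pages
    """
    if fxy == 0:
        return math.inf  # the terms never co-occur
    lx, ly, lxy = math.log(fx), math.log(fy), math.log(fxy)
    return (max(lx, ly) - lxy) / (math.log(n) - min(lx, ly))

# Hypothetical counts for the two words of a candidate term.
tight = ngd(fx=1000, fy=1000, fxy=1000, n=1_000_000)  # words always co-occur
loose = ngd(fx=1000, fy=1000, fxy=10, n=1_000_000)    # words rarely co-occur
```

A smaller distance means the words tend to appear together, which is the kind of signal a unithood feature looks for. How the authors turned NGD into a classifier feature is not stated in the abstract.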

Analysis of Meat Quality for Hanwoo Beef using Machine Learning (기계학습을 이용한 한우고기 품질 분석)

  • Lee, Woongsup
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2022.10a / pp.450-452 / 2022
  • Recently, various machine learning algorithms have been actively applied to livestock research, including genetic analysis, and have produced noteworthy results. This study examines the statistical characteristics of meat color, hydrogen ion concentration, water holding capacity (WHC), shear force, and grilling loss, all of which affect the quality of Hanwoo beef, using Hanwoo beef data collected in various environments. It also investigates the prediction of meat quality with two machine learning algorithms, linear regression and a regression tree. The analysis shows that meat color has the most significant effect on WHC, which determines the tenderness of beef, and that hydrogen ion concentration significantly influences shear force and grilling loss. The study confirms that machine learning algorithms are applicable to research on the quality of Hanwoo beef, and its results can also be applied to predicting and improving that quality.
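
To make the contrast between the two predictors concrete, here is a minimal sketch that fits a one-feature least-squares line and a depth-1 regression tree (a stump) to the same data. The numbers are made up and the function names are hypothetical; nothing here reproduces the study's data or models.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def fit_stump(xs, ys):
    """Depth-1 regression tree: choose the threshold with the lowest squared error.

    Returns (threshold, left_mean, right_mean); x <= threshold goes left.
    Assumes at least two distinct x values.
    """
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold fits between equal x values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        thr = (pairs[i - 1][0] + pairs[i][0]) / 2
        if best is None or sse < best[0]:
            best = (sse, thr, lm, rm)
    return best[1:]

# Made-up values: pH as the feature, shear force as the target.
ph = [5.4, 5.5, 5.6, 5.7, 6.0, 6.1, 6.2, 6.3]
shear = [4.0, 4.0, 4.0, 4.0, 6.0, 6.0, 6.0, 6.0]

slope, intercept = fit_line(ph, shear)
threshold, low_mean, high_mean = fit_stump(ph, shear)
```

On step-shaped data like this, the stump recovers the jump exactly while the line can only approximate it. A full regression tree applies the same split search recursively to each side.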

Research on Mining Technology for Explainable Decision Making (설명가능한 의사결정을 위한 마이닝 기술)

  • Kyungyong Chung
    • Journal of the Institute of Convergence Signal Processing / v.24 no.4 / pp.186-191 / 2023
  • Data processing techniques, including the handling of missing and outlier data, prediction, and recommendation models, play a critical role in decision-making. This requires a clear explanation of the validity, reliability, and accuracy of all processes and results. In addition, data problems need to be solved with explainable models based on decision trees, inference, and similar methods, and models need to be made lightweight by considering various types of learning. The multi-layer mining classification method that applies the sixth principle discovers, after data preprocessing, multidimensional relationships between variables and attributes that occur frequently in transactions. This paper explains how to discover significant relationships by mining transactions and how to model the data through regression analysis. It develops scalable models and logistic regression models, and proposes mining techniques that generate class labels through data cleansing, relevance analysis, data transformation, and data augmentation to support explainable decisions.

The Effects of Gamification of e-Learning Platforms on Engagement: Focusing on Moderating Effects of Interaction, Difficulty, and Length (e-러닝 플랫폼의 게임화가 인게이지먼트에 미치는 영향: 상호작용, 스터디 난이도, 스터디 길이의 조절효과를 중심으로)

  • Ohsung Kim;Jungwon Lee
    • Information Systems Review / v.26 no.1 / pp.73-91 / 2024
  • Recently, e-learning platforms have been growing rapidly, innovating the education industry by applying various IT technologies. Because student participation in the online environment is considered a prerequisite for learning, low participation rates are considered one of the most important issues determining the performance of e-learning platforms. Gamification has grown rapidly over the past decades and is highly valued for its applicability in education because it is expected to enhance learning motivation. However, despite researchers' interest, previous studies have reported conflicting results on the effect of gamification on participation rates in e-learning platforms; they have mainly studied structural gamification and have not sufficiently addressed the effects of content gamification. In this context, this study analyzes the effect of content gamification on e-learning platform engagement and explores the boundary conditions that moderate this effect. For the empirical analysis, 5,017 data records registered from February 11, 2022 to May 31, 2022 on the education platform Entry (https://playentry.org) were analyzed using propensity score matching and a Poisson multilevel regression model. The analysis showed that content gamification had a statistically significant effect on engagement, and that its interactions with interaction and content difficulty were statistically significant.
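
The abstract does not say which matching variant was used. As one common possibility, the sketch below shows greedy 1:1 nearest-neighbor matching on precomputed propensity scores, without replacement and with a caliper. The function name, unit ids, scores, and caliper value are all hypothetical.

```python
def match_nearest(treated, control, caliper=0.05):
    """Greedy 1:1 nearest-neighbor matching on propensity scores, without replacement.

    treated, control: dicts mapping unit id -> propensity score (assumed precomputed,
    e.g. by a logistic regression of treatment on covariates).
    Returns a list of (treated_id, control_id) pairs whose scores differ by <= caliper.
    """
    available = dict(control)
    pairs = []
    for t_id, t_score in sorted(treated.items(), key=lambda kv: kv[1]):
        if not available:
            break
        # Closest remaining control unit by absolute score difference.
        c_id = min(available, key=lambda c: abs(available[c] - t_score))
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]  # each control unit is used at most once
    return pairs

# Hypothetical scores: "g" units are gamified content, "n" units are not.
treated = {"g1": 0.30, "g2": 0.62, "g3": 0.90}
control = {"n1": 0.28, "n2": 0.60, "n3": 0.33, "n4": 0.10}
pairs = match_nearest(treated, control)
```

Here `g3` stays unmatched because no control unit falls within the caliper. The outcome model (in the study, a Poisson multilevel regression) would then be fitted on the matched sample only.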

An Experimental Study on the Shear Resistance of Dowel Bars (장부철근의 전단저항에 대한 실험적 연구)

  • 신장호
    • Magazine of the Korea Concrete Institute / v.7 no.6 / pp.216-223 / 1995
  • This research investigates the influence of structural parameters on the dowel action of reinforcing bars in reinforced concrete members. Following the previous research, $^{(3.6)}$ a total of forty-two specimens were tested to scrutinize the dowel action of reinforcing bars. Concrete cover, reinforcing bar size, and bar distance were taken as the main test variables at a constant compressive strength of concrete. The test results show that the structural behavior of all specimens was almost linear up to the failure load. The dowel force increases as the concrete cover increases, whereas reinforcing bar size and bar distance hardly affect the dowel force. The dowel forces obtained in this experimental research are relatively close to those given by the regression analysis and by White's equation.

The Comparison of OC1 and CART for Prosodic Boundary Index Prediction (운율 경계강도 예측을 위한 OC1의 적용 및 CART와의 비교)

  • 임동식;김진영;김선미
    • The Journal of the Acoustical Society of Korea / v.18 no.4 / pp.60-64 / 1999
  • In this paper, we apply CART (Classification And Regression Tree) and OC1 (Oblique Classifier 1), methods that are widely used in continuous speech recognition and synthesis. We predict the prosodic boundary index by applying CART and OC1 to features that combine the right depth of the tree-structured method and To_Right of the link grammar method with a tri_gram model. We assigned four prosodic boundary index levels, from 0 to 3. Experimental results show that OC1 is superior to CART: despite having fewer nodes than CART, OC1 makes better predictions.
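
One reason an oblique tree can need fewer nodes is that CART tests a single feature per node (`feature <= t`), while OC1 tests a linear combination of features. The toy sketch below compares the best single axis-parallel test with one hand-picked oblique test on made-up 2-D points; it does not implement either algorithm's training procedure and uses no prosodic data.

```python
def accuracy(predict, points, labels):
    """Fraction of points whose predicted class equals the label."""
    return sum(predict(p) == y for p, y in zip(points, labels)) / len(points)

# Toy 2-D data (made up): the class is 1 when x + y > 1.
points = [(0.1, 0.2), (0.2, 0.7), (0.7, 0.2), (0.4, 0.4),
          (0.9, 0.8), (0.3, 0.9), (0.9, 0.3), (0.6, 0.6)]
labels = [1 if x + y > 1 else 0 for x, y in points]

def best_axis_parallel(points, labels):
    """Best single test of the form feature <= t (CART-style), by exhaustive search."""
    best = 0.0
    for axis in (0, 1):
        for t in sorted({p[axis] for p in points}):
            for left_class in (0, 1):
                def pred(p, a=axis, t=t, c=left_class):
                    return c if p[a] <= t else 1 - c
                best = max(best, accuracy(pred, points, labels))
    return best

def oblique(p):
    """One oblique test w.x <= b (OC1-style); weights chosen by hand, not learned."""
    return 0 if 1.0 * p[0] + 1.0 * p[1] <= 1.0 else 1

axis_acc = best_axis_parallel(points, labels)
oblique_acc = accuracy(oblique, points, labels)
```

On this data a single oblique node separates the classes, while no single axis-parallel node does. An axis-parallel tree can still fit the boundary, but only by stacking several nodes into a staircase.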

Analysis of Elementary Students' Smartphone Addiction Level by Demographic Features (인구통계학적 특성에 따른 초등학생의 스마트폰 중독 수준 분석)

  • Lee, Soojung
    • The Journal of Korean Association of Computer Education / v.17 no.6 / pp.1-8 / 2014
  • Recently, smartphone use has increased so sharply across all ages that addiction problems have emerged. This study analyzes factors that affect smartphone addiction among elementary students, focusing on demographic variables. First, for each variable, differences between the distributions of addicted groups and between the distributions of the most frequently used smartphone functions are analyzed. Grade and academic achievement yield the biggest differences between the distributions of addicted groups, while gender, grade, and academic achievement yield differences between the distributions of the most frequently used smartphone functions. Differences between the distributions of the most frequently used smartphone functions across addicted user groups are also found to be significant. Furthermore, factors affecting smartphone addiction are analyzed through logistic regression analysis and decision trees, which find grade, academic achievement, dual-income parents, and residential area to be influential, in that order.

A study on integration of semantic topic based Knowledge model (의미적 토픽 기반 지식모델의 통합에 관한 연구)

  • Chun, Seung-Su;Lee, Sang-Jin;Bae, Sang-Tea
    • Proceedings of the Korean Information Science Society Conference / 2012.06b / pp.181-183 / 2012
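  • Recently, methods have been proposed for efficiently generating and analyzing semantics-based knowledge models using natural and formal language processing, artificial intelligence algorithms, and related techniques. Such semantics-based knowledge models are used for efficient decision-making trees and for the systematic analysis of problem-solving paths in specific situations. In particular, in the analysis of various complex systems and social networks, they serve as the basis for static indicator generation and regression analysis, trend analysis through behavioral models, and simulation models that support macro-level forecasting. For the integration of such semantics-based knowledge models, this study presents a method and a formal algorithm for integrating topic models derived through text mining. To this end, we first describe how to convert a keyword map derived through text mining into an equivalent knowledge map and integrate it into a semantic knowledge model. We also propose a method for projecting a meaningful topic map from a keyword map and an algorithm for deriving a semantically equivalent model. The integrated semantics-based knowledge model enables relational semantic analysis, such as structural rules between topics and degree, closeness, and betweenness centrality, and can serve as practical foundational research for the semantic analysis and use of large-scale unstructured documents.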
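
Two of the centrality measures named in the abstract can be sketched on a small topic map as follows. The graph, the topic names, and the function names are hypothetical, and the sketch assumes a connected, undirected, unweighted graph. Betweenness centrality is omitted for brevity.

```python
from collections import deque

def degree_centrality(graph):
    """Share of the other nodes that each node is directly linked to."""
    n = len(graph)
    return {v: len(nbrs) / (n - 1) for v, nbrs in graph.items()}

def closeness_centrality(graph):
    """(n - 1) / sum of shortest-path lengths; assumes a connected undirected graph."""
    n = len(graph)
    result = {}
    for source in graph:
        # Breadth-first search gives shortest hop counts from the source.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        result[source] = (n - 1) / sum(dist.values())
    return result

# Hypothetical topic map: "regression" links to every other topic.
topics = {
    "regression": {"tree", "mining", "simulation"},
    "tree": {"regression"},
    "mining": {"regression"},
    "simulation": {"regression"},
}
degree = degree_centrality(topics)
closeness = closeness_centrality(topics)
```

In this star-shaped map the hub topic scores 1.0 on both measures, while each leaf topic scores 1/3 on degree and 0.6 on closeness.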

Influence of Other Blood Components in Predicting Glucose Concentration using Design of Experiment (실험계획 법에 의한 혈중 글루코즈 측정 시 타 성분의 영향 분석)

  • 김연주;윤길원;전계진
    • Journal of Biomedical Engineering Research / v.22 no.6 / pp.497-502 / 2001
  • The influence of other blood components on the measurement of glucose concentration was analyzed. A blood phantom containing five major components was made. The prediction model was developed from measurements of absorption spectra that include the first overtone glucose band, i.e., 1500 ∼ 1850 nm. The concentrations were predicted using partial least squares regression. Factor analysis based on Design of Experiment was performed to study the influence of the other components on the prediction of glucose concentration. Triglyceride has no influence. Albumin and globulin have minor effects. However, hemoglobin showed a substantial response, and compensation for hemoglobin concentration appears to be required in the glucose measurement model.

Impervious Surface Estimation Area of Seom River Basin using Satellite Imagery and Sub-pixel Classifier (위성영상과 Sub-pixel 분류에 의한 섬강유역의 불투수율 추정)

  • Na, Sang-Il;Park, Jong-Hwa;Shin, Hyoung-Sub;Park, Jin-Ki;Baek, Shin-Chul
    • Proceedings of the Korea Water Resources Association Conference / 2012.05a / pp.744-744 / 2012
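  • An impervious surface is an artificial land-cover condition that does not allow natural infiltration, and it has been used as a measure for estimating the urbanization rate and analyzing the degree of environmental change in a watershed. In particular, from a hydrological perspective, impervious surface strongly affects short-term runoff: as the impervious ratio increases, infiltration decreases, so peak runoff increases and the time of concentration shortens. Recently, as rapid urbanization has amplified the effect of impervious surfaces, the need to estimate the impervious ratio has grown. Until now, impervious surfaces have been estimated from satellite imagery by performing land-cover classification on high-resolution images. That is, the impervious ratio has been computed arithmetically from the classified land cover, or estimated with various methods such as spectral mixture analysis and regression trees. In this study, we apply a Sub-pixel classification technique to satellite imagery to estimate the impervious ratio of the Seom River basin. Whereas conventional classifiers assign a pixel containing a mixture of land covers to the single land cover with the largest share, Sub-pixel classification improves on this by applying fuzzy theory to identify and classify every category that occupies at least 20% of the pixel. To this end, we collected Landsat TM imagery of the Seom River basin and gathered training data with reference to the Ministry of Environment's land-cover map and a geological map. Clouds, which could affect the results, were removed through preprocessing, and the Sub-pixel classification technique was applied to the collected training data to produce a spatial distribution map of the impervious ratio of the Seom River basin.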
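
The difference between a conventional hard label and the 20% sub-pixel rule can be sketched as below. The per-pixel cover fractions are assumed to come from some fuzzy classifier and are made up here; the function names and the averaging used for the impervious ratio are illustrative assumptions, not the study's procedure.

```python
def hard_label(fractions):
    """Conventional classification: keep only the dominant cover class."""
    return max(fractions, key=fractions.get)

def subpixel_labels(fractions, min_share=0.20):
    """Keep every cover class that occupies at least min_share of the pixel."""
    return {c: f for c, f in fractions.items() if f >= min_share}

def impervious_ratio(pixels, impervious_classes, min_share=0.20):
    """Mean impervious fraction over pixels, counting only the retained classes."""
    total = 0.0
    for fractions in pixels:
        kept = subpixel_labels(fractions, min_share)
        total += sum(f for c, f in kept.items() if c in impervious_classes)
    return total / len(pixels)

# Hypothetical cover fractions for three pixels.
pixels = [
    {"urban": 0.5, "forest": 0.4, "water": 0.1},
    {"urban": 0.25, "forest": 0.75},
    {"forest": 1.0},
]
ratio = impervious_ratio(pixels, {"urban"})
```

With hard labels only the first pixel would count as urban, giving an impervious share of one pixel in three. The sub-pixel rule keeps the partial urban cover in the second pixel as well and yields 0.25.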
