• Title/Summary/Keyword: Binary Tree


A Comparative Study of Prediction Models for College Student Dropout Risk Using Machine Learning: Focusing on the case of N university (머신러닝을 활용한 대학생 중도탈락 위험군의 예측모델 비교 연구 : N대학 사례를 중심으로)

  • So-Hyun Kim;Sung-Hyoun Cho
    • Journal of The Korean Society of Integrative Medicine
    • /
    • v.12 no.2
    • /
    • pp.155-166
    • /
    • 2024
  • Purpose : This study aims to identify key factors for predicting dropout risk at the university level and to provide a foundation for policy development aimed at dropout prevention. It explores the optimal machine learning algorithm by comparing the performance of various algorithms on data about college students' dropout risk. Methods : Data on factors influencing dropout risk and propensity were collected from N University. The collected data were applied to several machine learning algorithms, including random forest, decision tree, artificial neural network, logistic regression, support vector machine (SVM), k-nearest neighbor (k-NN) classification, and Naive Bayes. The performance of these models was compared and evaluated, with a focus on predictive validity and on identifying significant dropout factors through the information gain index. Results : The binary logistic regression analysis showed that the year of the program, department, grades, and year of entry had a statistically significant effect on dropout risk. Among the machine learning algorithms, random forest performed best. The relative importance of the predictor variables was highest for department, followed by age, grade, and whether the student's residence matched the school location. Conclusion : Machine learning-based prediction of dropout risk focuses on the early identification of students at risk. The types and causes of dropout crises vary significantly among students. It is important to identify these types and causes so that appropriate actions and support can be taken to remove risk factors and increase protective factors. The relative importance of the factors affecting dropout risk found in this study will help guide educational prescriptions for preventing college student dropout.

Metadata-Based Data Structure Analysis to Optimize Search Speed and Memory Efficiency (검색 속도와 메모리 효율 최적화를 위한 메타데이터 기반 데이터 구조 분석)

  • Kim Se Yeon;Lim Young Hoon
    • The Transactions of the Korea Information Processing Society
    • /
    • v.13 no.7
    • /
    • pp.311-318
    • /
    • 2024
  • As the amount of data increases with the development of artificial intelligence and the Internet, data management is becoming increasingly important, and efficient data retrieval and use of memory space are crucial. In this study, we investigate how to optimize search speed and memory efficiency by analyzing data structures based on metadata. As a research method, we compared and analyzed the performance of the array, association list, dictionary, binary tree, and graph data structures using metadata of photographic images, focusing on time and space complexity. The experiments confirmed that the dictionary data structure performs best in collection speed and the graph data structure performs best in search speed when dealing with large-scale image data. We expect the results of this paper to provide practical guidelines for selecting data structures to optimize search speed and memory efficiency for image data.
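
The kind of comparison this abstract describes can be illustrated with a minimal sketch. It is not the authors' experiment: the record layout (`capture_time`, `filename`), the sample data, and all function names are assumptions made for illustration, and a sorted list with binary search stands in for a balanced binary search tree.

```python
from bisect import bisect_left

# Hypothetical photo metadata records: (capture_time, filename).
photos = [(f"2024-01-{d:02d}T12:00:00", f"IMG_{d:04d}.jpg") for d in range(1, 29)]

def find_in_list(records, key):
    """Linear scan over an array-like list: O(n) comparisons."""
    for capture_time, filename in records:
        if capture_time == key:
            return filename
    return None

# Dictionary (hash table): expected O(1) lookup, at the cost of extra memory.
photo_index = dict(photos)

def find_in_dict(index, key):
    """Hash lookup by capture time."""
    return index.get(key)

# Sorted keys plus binary search: O(log n) lookup, the same bound as a
# balanced binary search tree (used here as a stand-in for one).
sorted_photos = sorted(photos)
sorted_keys = [capture_time for capture_time, _ in sorted_photos]

def find_sorted(keys, records, key):
    """Binary search over the sorted capture times."""
    i = bisect_left(keys, key)
    if i < len(keys) and keys[i] == key:
        return records[i][1]
    return None
```

All three lookups return the same record; they differ only in how many comparisons they need and how much auxiliary memory they hold, which is the trade-off the study measures.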

A Study on the Financial Strength of Households on House Investment Demand (가계 재무건전성이 주택투자수요에 미치는 영향에 관한 연구)

  • Rho, Sang-Youn;Yoon, Bo-Hyun;Choi, Young-Min
    • Journal of Distribution Science
    • /
    • v.12 no.4
    • /
    • pp.31-39
    • /
    • 2014
  • Purpose - This study investigates the following two issues. First, we attempt to find the important determinants of housing investment and to identify their significance rank using survey panel data. Recently, the expansion of global uncertainty in the real estate market has directly and indirectly influenced the Korean housing market; households demonstrate a sensitive reaction to changes in that market. Therefore, this study aims to draw conclusions from understanding how the impact of financial strength of the household is related to house investment. Second, we attempt to verify the effectiveness of diverse indices of financial strength such as DTI, LTV, and PIR as measures to monitor the housing market. In the continuous housing market recession after the global crisis, the government places top priority on residence stability. However, the government still imposes forceful restraints on indices of financial strength. We believe this study verifies the utility of these regulations when used in the housing market. Research design, data, and methodology - The data source for this study is the "National Survey of Tax and Benefit" from 2007 (1st) to 2011 (5th) by the Korea Institute of Public Finance. Based on this survey data, we use panel data of 3,838 households that have been surveyed continuously for 5 years. We sort the base variables according to relevance of house investment criteria using the decision tree model (DTM), which is the standard decision-making model for data-mining techniques. The DTM method is known as a powerful methodology to identify contributory variables for predictive power. In addition, we analyze how important explanatory variables and the financial strength index of households affect housing investment with the binary logistic multi-regressive model. Based on the analyses, we conclude that the financial strength index has a significant role in house investment demand. 
Results - The results of this research are as follows: 1) The determinants of housing investment are age, consumption expenditures, income, total assets, rent deposit, housing price, habits satisfaction, housing scale, number of household members, and debt related to housing. 2) The impact of these determinants has changed somewhat from year to year with economic situations and housing market conditions. The levels of consumption expenditure and income were the main determinants before 2009; after 2009, however, the determinants of housing investment shifted to the indices of household financial strength, i.e., DTI, LTV, and PIR. 3) Above all, since 2009, housing loans have been a more important variable than the level of consumption in housing market decisions. Conclusions - The results of this research show that sound household financing has a stronger effect on housing investment than reduced consumption expenditures. At the same time, the key indices that the government must monitor under economic emergency conditions differ from those requiring monitoring under normal market conditions; therefore, policy indices to encourage and promote the housing market must be divided based on market conditions.

Mining Frequent Trajectory Patterns in RFID Data Streams (RFID 데이터 스트림에서 이동궤적 패턴의 탐사)

  • Seo, Sung-Bo;Lee, Yong-Mi;Lee, Jun-Wook;Nam, Kwang-Woo;Ryu, Keun-Ho;Park, Jin-Soo
    • Journal of Korea Spatial Information System Society
    • /
    • v.11 no.1
    • /
    • pp.127-136
    • /
    • 2009
  • This paper proposes an online algorithm for mining moving trajectory patterns in RFID data streams that considers both characteristics that change over time and the constraint of a single-pass data scan. As RFID, sensor, and mobile network technologies have developed rapidly, many researchers have recently focused on gathering real-time data from the real world and mining useful patterns from it. Previous studies of sequential patterns or moving trajectory patterns based on stream data are extremely time-consuming because of multi-pass database scans and tree traversal, and they also did not consider the time-changing characteristics of stream data. The proposed method preserves the sequential strength of length-2 frequent patterns in a binary relationship table, using a time-evolving graph to reflect exactly how the RFID data stream changes over time. In addition, to avoid repetitive data scans, the proposed algorithm infers candidate length-k moving trajectory patterns in advance at time point t, and then extracts the patterns after screening the candidates in only one pass at time point t+1. In experiments, the proposed method outperforms the Apriori-like method in both time and space complexity, with a reduction ratio of candidate sets of about 7 percent.
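
The single-pass idea behind counting length-2 trajectory patterns can be sketched as follows. This is only an illustration under stated assumptions, not the paper's algorithm: the reading format `(tag_id, location)`, the function name, and the support threshold of 2 are hypothetical, and the time-evolving graph and candidate inference steps are omitted.

```python
from collections import Counter

def count_length2_patterns(stream):
    """Count location-to-location moves in one pass over the stream.

    stream: iterable of (tag_id, location) readings in arrival order
    (an assumed format). Only the last location per tag is kept, so
    each reading is examined once and never rescanned.
    """
    last_location = {}
    pair_counts = Counter()
    for tag_id, location in stream:
        previous = last_location.get(tag_id)
        if previous is not None and previous != location:
            pair_counts[(previous, location)] += 1
        last_location[tag_id] = location
    return pair_counts

# Hypothetical readings from three tagged items.
readings = [
    ("t1", "A"), ("t2", "A"), ("t1", "B"), ("t2", "B"),
    ("t1", "C"), ("t2", "C"), ("t3", "A"), ("t3", "C"),
]
counts = count_length2_patterns(readings)

# Keep the moves seen at least twice (assumed minimum support).
frequent = {pair for pair, n in counts.items() if n >= 2}
```

The table of pair counts plays the role of the binary relationship table in the abstract: longer candidate patterns can be proposed by chaining frequent pairs such as A→B and B→C.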


A Region-based Comparison Algorithm of k sets of Trapezoids (k 사다리꼴 셋의 영역 중심 비교 알고리즘)

  • Jung, Hae-Jae
    • The KIPS Transactions:PartA
    • /
    • v.10A no.6
    • /
    • pp.665-670
    • /
    • 2003
  • In applications such as automatic mask generation for semiconductor production, a drawing consists of many polygons that are partitioned into trapezoids. The addition or deletion of a polygon to or from the drawing is performed through geometric operations such as insertion, deletion, and search of trapezoids. Depending on the partitioning algorithm used, a polygon can be partitioned differently in terms of shape, size, and so on. It is therefore necessary to devise a comparison algorithm for sets of trapezoids in which each set represents the parts of interest of a drawing. Such a comparison algorithm may be used, for example, to verify a software program that handles geometric objects consisting of trapezoids. In this paper, given $k$ sets of trapezoids in which each set forms the regions of interest of a drawing, we present how to compare the $k$ sets to see whether all of them represent the same geometric scene. When each input set has the same number $n$ of trapezoids, the proposed algorithm has $O(2^{k-2} n^2 (\log n + k))$ time complexity. It is also shown that the algorithm has the same time complexity $O(n^2 \log n)$ as the sweeping-based algorithm when the number $k$ ($\ll n$) of input sets is small. Furthermore, the proposed algorithm can be $kn$ times faster than the sweeping-based algorithm when the trapezoids in the $k$ input sets are almost all the same.

Field Performance and Morphological Characterization of Transgenic Codonopsis lanceolata Expressing $\gamma$-TMT Gene.

  • Ghimire, Bimal Kumar;Li, Cheng Hao;Kil, Hyun-Young;Kim, Na-Young;Lim, Jung-Dae;Kim, Jae-Kwang;Kim, Myong-Jo;Chung, Ill-Min;Lee, Sun-Joo;Eom, Seok-Hyun;Cho, Dong-Ha;Yu, Chang-Yeon
    • Korean Journal of Medicinal Crop Science
    • /
    • v.15 no.5
    • /
    • pp.339-345
    • /
    • 2007
  • Field performance and morphological characterization were evaluated for seven transgenic lines of Codonopsis lanceolata expressing the $\gamma$-TMT gene. The shoots were obtained from leaf explants after co-cultivation with Agrobacterium tumefaciens strain LBA 4404 harboring a binary vector, pYBI 121, that carried a $\gamma$-tocopherol methyltransferase gene ($\gamma$-TMT) and a neomycin phosphotransferase II gene (npt II) for kanamycin resistance. The transgenic plants were transferred to a greenhouse for acclimation. Integration of T-DNA into the genome of the $T_0$ and $T_1$ generations of transgenic Codonopsis lanceolata was confirmed by polymerase chain reaction and Southern blot analysis. The progenies of the transgenic plants showed phenotypic differences among the lines and relative to control plants. When grown in the field, the transgenic plants in general exhibited increased fertility and significant improvement in shoot weight, root weight, shoot height, and rachis length relative to the control plants. However, all seven independently derived transgenic lines produced normal flowers with respect to shape, size, color, and seed number at maturity, indicating that the addition of a selectable marker gene to the plant genome does not affect seed germination or the agronomic performance of transgenic Codonopsis lanceolata. $T_1$ progenies of these plants were obtained and evaluated together with control plants in a field experiment. Overall, the agronomic performance of the $T_1$ progenies of transgenic Codonopsis lanceolata was superior to that of the seed-derived non-transgenic plants. In this study, we report on the morphological variation and agronomic performance of transgenic Codonopsis lanceolata developed by Agrobacterium transformation.

Constructing Software Structure Graph through Progressive Execution (점진적 실행을 통한 소프트웨어의 구조 그래프 생성)

  • Lee, Hye-Ryun;Shin, Seung-Hun;Choi, Kyung-Hee;Jung, Gi-Hyun;Park, Seung-Kyu
    • Journal of the Korea Society of Computer and Information
    • /
    • v.18 no.7
    • /
    • pp.111-123
    • /
    • 2013
  • To verify software vulnerability, the approach of conjecturing the software structure and then testing the software based on the conjectured structure has received attention. Using this approach requires an efficient way to conjecture software structure. The popular graph and tree methods such as DFG (Data Flow Graph), CFG (Control Flow Graph), and CFA (Control Flow Automata) have a serious drawback: they cannot express software in a hierarchical fashion. In this paper, we propose a method to overcome this drawback. The proposed method applies various input data to binary code, generates CFGs based on the code output, and constructs an HCFG (Hierarchical Control Flow Graph) to express the generated CFGs in a hierarchical structure. The components required for the HCFG and a progressive algorithm to construct it are also proposed. The proposed method is verified by constructing the software architecture of an open SMTP (Simple Mail Transfer Protocol) server program. The structure generated by the proposed method and the real program structure are compared and analyzed.

The Recognition of Occluded 2-D Objects Using the String Matching and Hash Retrieval Algorithm (스트링 매칭과 해시 검색을 이용한 겹쳐진 이차원 물체의 인식)

  • Kim, Kwan-Dong;Lee, Ji-Yong;Lee, Byeong-Gon;Ahn, Jae-Hyeong
    • The Transactions of the Korea Information Processing Society
    • /
    • v.5 no.7
    • /
    • pp.1923-1932
    • /
    • 1998
  • This paper deals with a 2-D object recognition algorithm. We present an algorithm that reduces the computation time of model retrieval by using a hashing technique instead of the binary tree method. We treat an object boundary as a string of structural units and use an attributed string matching algorithm to compute a similarity measure between two strings. From the privileged strings we select the one with minimal eccentricity and treat it as the reference string. We then construct a hash table using the distance between each privileged string and the reference string as the key value. Once the database of all model strings is built, recognition proceeds by segmenting the scene into a polygonal approximation. The distance between the privileged string extracted from the scene and the reference string is used to retrieve model hypotheses from the table. Computer simulation shows that the proposed method recognizes objects by computing the distance only 2-3 times, whereas the previous method must compute the distance 8-10 times for model retrieval.
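
The retrieval step described above, a hash table keyed on the distance to a reference string, can be sketched roughly as follows. This is an illustrative assumption rather than the paper's implementation: the bucket width, the model names and distances, and the choice to probe neighbouring buckets are all hypothetical, and the attributed string matching that would produce the distances is not shown.

```python
from collections import defaultdict

BUCKET_WIDTH = 0.5  # assumed quantization step for distance keys

def bucket_of(distance):
    """Map a real-valued distance to an integer hash key."""
    return int(distance // BUCKET_WIDTH)

def build_table(model_distances):
    """Build the hash table from {model_name: distance to reference string}."""
    table = defaultdict(list)
    for name, distance in model_distances.items():
        table[bucket_of(distance)].append(name)
    return table

def retrieve(table, scene_distance):
    """Return model hypotheses from the matching bucket and its two neighbours."""
    key = bucket_of(scene_distance)
    hypotheses = []
    for k in (key - 1, key, key + 1):
        hypotheses.extend(table.get(k, []))
    return hypotheses

# Hypothetical models with precomputed distances to the reference string.
models = {"wrench": 1.2, "pliers": 1.4, "hammer": 3.9, "key": 7.1}
table = build_table(models)
candidates = retrieve(table, 1.3)
```

Only the retrieved candidates need a full string comparison against the scene, which is how a hashed lookup can cut the number of distance computations compared with descending a binary tree of models.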


Design and Implementation of ASTERIX Parsing Module Based on Pattern Matching for Air Traffic Control Display System (항공관제용 현시시스템을 위한 패턴매칭 기반의 ASTERIX 파싱 모듈 설계 및 구현)

  • Kim, Kanghee;Kim, Hojoong;Yin, Run Dong;Choi, SangBang
    • Journal of the Institute of Electronics and Information Engineers
    • /
    • v.51 no.3
    • /
    • pp.89-101
    • /
    • 2014
  • Recently, as domestic air traffic has increased dramatically, the need for ATC (air traffic control) systems has grown for safe and efficient ATM (air traffic management). For smooth ATC, it is especially important to guarantee the performance of the display system, which must show the entire air traffic situation in the FIR (Flight Information Region) without additional latency. In this paper, we design an ASTERIX (All purpose STructured Eurocontrol suRveillance Information eXchange) parsing module that promotes stable ATC by minimizing system load, that is, by reducing the overhead incurred when parsing ASTERIX messages. Our pattern-matching-based ASTERIX parsing module creates patterns by analyzing received ASTERIX data and handles subsequently received ASTERIX data with predefined procedures selected through those patterns. Unlike existing parsing modules, which contain unnecessary parsing procedures, this module minimizes display errors by rapidly extracting only the information needed for display. The designed module therefore enables controllers to operate ATC stably. A comparison with an existing general bit-level ASTERIX parsing module shows that the pattern-matching-based module has shorter processing delay, higher throughput, and lower CPU usage.

Organ Specific Expression of the nos-NPT II Gene in Transgenic Hybrid Poplar (형질 전환된 포플러에 대한 nos-NPT II 유전자의 기관별 발현 특성)

  • Chun, Young Woo;Klopfenstein, Ned B.
    • Journal of Korean Society of Forest Science
    • /
    • v.84 no.1
    • /
    • pp.77-86
    • /
    • 1995
  • To effectively modify tree function with genetic engineering, transgenes must be expressed at the proper level in the appropriate tissues at suitable developmental stages. Toward understanding the spatial and temporal expression of transgenes in woody plants, transgene expression was evaluated in three greenhouse-grown, transgenic lines of Populus alba ${\times}$ P. grandidentata hybrid clone 'Hansen'. All transgenic poplar lines possess constructs containing the bacterial nopaline synthase (nos) promoter linked to a neomycin phosphotransferase II (NPT II) selectable marker gene. In addition, each transgenic poplar line contains one of the following gene constructs: 1) a wound-inducible potato proteinase inhibitor II (pin2) promoter linked to a chloramphenicol acetyltransferase (CAT) reporter gene; 2) a nos promoter linked to a PIN2 structural gene; or 3) a Cauliflower Mosaic Virus 35S promoter linked to a PIN2 structural gene. Polymerase chain reaction (PCR) was used to verify the presence of foreign genes in the poplar genome. Enzyme-linked immunosorbent assays (ELISAs) were used to evaluate organ-specific expression of the nos-NPT II construct. NPT II expression was detected in leaves, petioles, stems, and roots of transgenic poplar, indicating that the nos promoter is potentially effective for general constitutive expression of transgenes. NPT II expression varied among transgenic poplar lines and among organs for one transgenic line, Tr15. With Tr15, NPT II levels were highest in older leaves and petioles. These results indicate that screening of several transgenic lines may be required to identify lines with optimal transgene expression.
