• Title/Summary/Keyword: Interval partitioning

A dominant hyperrectangle generation technique of classification using IG partitioning (정보이득 분할을 이용한 분류기법의 지배적 초월평면 생성기법)

  • Lee, Hyeong-Il
    • Journal of the Korea Society of Computer and Information
    • /
    • v.19 no.1
    • /
    • pp.149-156
    • /
    • 2014
  • NGE (Nested Generalized Exemplar) is a distance-based classification method that uses a matching rule; it can improve performance on noisy data while reducing the size of the model. However, hyperrectangles that cross or overlap one another during learning have been noted as a factor that degrades NGE's performance. In this paper, we propose the DHGen (Dominant Hyperrectangle Generation) algorithm, which avoids overlapping and crossing between hyperrectangles by splitting mixed hyperrectangles with interval weights based on mutual information. DHGen improves classification performance and reduces the number of hyperrectangles by processing the training set incrementally. On benchmark data sets from the UCI Machine Learning Repository, the proposed DHGen shows classification performance comparable to k-NN and better than the EACH system, which implements the NGE theory.
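
The splitting step can be pictured with a small information-gain example. The sketch below is a minimal, hypothetical illustration of choosing a cut point on one numeric feature by maximizing information gain over labeled examples inside a mixed hyperrectangle; the function names are invented here, and it does not reproduce the DHGen interval-weighting scheme itself.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy of a multiset of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def best_cut(values, labels):
    """Find the cut point on one numeric feature that maximizes information gain.

    Hypothetical helper illustrating information-gain-based interval splitting;
    not the weighting scheme defined in the paper.
    """
    pairs = sorted(zip(values, labels))
    base = entropy(labels)
    best_gain, best_point = 0.0, None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no valid cut between identical feature values
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        remainder = (len(left) * entropy(left) + len(right) * entropy(right)) / len(pairs)
        if base - remainder > best_gain:
            best_gain, best_point = base - remainder, (pairs[i - 1][0] + pairs[i][0]) / 2
    return best_point, best_gain

# e.g. best_cut([1.0, 2.0, 3.0, 4.0], ["a", "a", "b", "b"]) -> (2.5, 1.0)
```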

Heuristics for Selecting Nodes on Cable TV Network (케이블 TV 망에서 노드 선택을 위한 휴리스틱 연구)

  • Chong, Kyun-Rak
    • Journal of the Korea Society of Computer and Information
    • /
    • v.13 no.4
    • /
    • pp.133-140
    • /
    • 2008
  • The cable TV network has traditionally delivered downstream broadcast signals from distribution centers to subscribers. Since the traditional coaxial cable was upgraded to Hybrid Fiber Coaxial (HFC) cable, the upstream channels have enabled broadband services such as Internet access. These upstream channels are vulnerable to ingress noise. When the noise from child nodes accumulated in an amplifier exceeds a certain level, that node has to be cut off to prevent the noise from propagating. The node selection problem (NSP) is to select nodes so that the noise at each node does not exceed a given threshold and the sum of profits of the selected nodes is maximized. The NSP has been shown to be NP-hard. In this paper, we propose heuristics to find near-optimal solutions for the NSP. Experimental results show that interval partitioning performs better than the greedy approach. Our heuristics can be used by an HFC network management system to provide privileged services to premium subscribers on HFC networks.
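
To make the flavor of such heuristics concrete, the sketch below shows a toy selection routine for a knapsack-like problem with a single noise budget: node profits are bucketed into equal-width intervals, and low-noise nodes are taken from the highest-profit bucket first. Everything here (the single budget, the bucketing rule, the function name) is an assumption for illustration, not the NSP formulation or the heuristic evaluated in the paper.

```python
def select_nodes(nodes, noise_budget, num_intervals=4):
    """Toy interval-partitioning heuristic for a knapsack-like selection.

    nodes: list of (profit, noise) pairs; noise_budget: single shared limit.
    Profits are bucketed into equal-width intervals, buckets are scanned from
    the highest-profit interval down, and within a bucket the lowest-noise
    nodes are taken first.  Hypothetical illustration only.
    """
    if not nodes:
        return []
    lo = min(p for p, _ in nodes)
    hi = max(p for p, _ in nodes)
    width = (hi - lo) / num_intervals or 1.0   # avoid zero width when all profits are equal
    buckets = [[] for _ in range(num_intervals)]
    for profit, noise in nodes:
        idx = min(int((profit - lo) / width), num_intervals - 1)
        buckets[idx].append((profit, noise))

    chosen, used = [], 0.0
    for bucket in reversed(buckets):                       # highest-profit interval first
        for profit, noise in sorted(bucket, key=lambda x: x[1]):
            if used + noise <= noise_budget:
                chosen.append((profit, noise))
                used += noise
    return chosen

# select_nodes([(10, 3), (9, 5), (4, 1), (2, 1)], noise_budget=6) -> [(10, 3), (4, 1), (2, 1)]
```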

A new method for calculating quantiles of grouped data based on the frequency polygon (집단화된 통계자료의 도수다각형에 근거한 새로운 분위수 계산법)

  • Kim, Hyuk Joo
    • Journal of the Korean Data and Information Science Society
    • /
    • v.28 no.2
    • /
    • pp.383-393
    • /
    • 2017
  • When we deal with grouped statistical data, it is desirable to use a calculation method that gives a value as close as possible to the true value of a statistic. In this paper, we suggest a new method to calculate the quantiles of grouped data. The main idea of the suggested method is to obtain the data values by partitioning the pentagons that correspond to the class intervals in the frequency polygon, drawn from the histogram, into parts of equal area. We compared this method with existing methods through simulations using datasets from introductory statistics textbooks. In the simulation study, we generated as many data values as given in each class interval using the inverse transform method, based on the distribution whose shape is given by the frequency polygon. Using the sum of squared differences from the quantiles of the simulated data as a criterion, the suggested method was found to perform better than existing methods for almost all quartiles and deciles.
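
For comparison, the classical grouped-data quantile is obtained by linear interpolation within the class interval of the histogram, and can be written in a few lines. The sketch below shows only this textbook baseline (with an invented function name); the paper's frequency-polygon, equal-area construction is not reproduced here.

```python
def grouped_quantile(bounds, freqs, p):
    """Classical grouped-data quantile by linear interpolation in the histogram.

    bounds: class boundaries [b0, b1, ..., bk]; freqs: class frequencies;
    p: probability level in (0, 1).  Textbook baseline only; the proposed
    frequency-polygon method partitions areas differently.
    """
    n = sum(freqs)
    target = p * n
    cum = 0
    for i, f in enumerate(freqs):
        if cum + f >= target:
            lower, width = bounds[i], bounds[i + 1] - bounds[i]
            return lower + (target - cum) / f * width
        cum += f
    return bounds[-1]

# e.g. median of classes [0,10), [10,20), [20,30) with frequencies 3, 5, 2:
# grouped_quantile([0, 10, 20, 30], [3, 5, 2], 0.5) -> 14.0
```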

Age-related Reference Intervals for Total Collagen-I-N-terminal Propeptide in Healthy Korean Population

  • Yoo, Jun-Il;Park, Ae-Ja;Lim, Yong Kwan;Kweon, Oh Joo;Choi, Jee-Hye;Do, Jae Hyuk;Kim, Sunjoo;Kim, Youngri;Ha, Yong-Chan
    • Journal of Bone Metabolism
    • /
    • v.25 no.4
    • /
    • pp.235-241
    • /
    • 2018
  • Background: Procollagen type I N-terminal propeptide (PINP) is one of the most clinically useful bone formation biomarkers. Therefore, the purpose of this study was to independently evaluate the performance of an automated total PINP assay and to establish age- and gender-specific reference intervals for PINP in a healthy Korean population. Methods: The imprecision, linearity, and detection capability of the Elecsys total PINP assay were determined, and reference intervals were established using 599 serum samples from Korean subjects with normal bone mineral densities based on bone densitometry. Age groups were divided into the 20s, 30s, 40s, 50s, and 60s and over. Results: The Elecsys total PINP assay showed excellent imprecision, linearity, and detection capability. When partitioning the Korean male and female populations by age, there were significant differences in total PINP between age groups. In the male population, PINP levels decreased with increasing age and then remained steady after middle age. In the female population, there was a similar decreasing tendency, with a sharp increase in the 50 to 59 age group. Conclusions: The Elecsys total PINP assay showed precise and reliable performance in our study. We established age-related PINP reference intervals for Korean male and female populations with normal bone mineral densities.
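
The partitioned reference intervals themselves are commonly the nonparametric 2.5th and 97.5th percentiles within each age group. The sketch below illustrates that generic computation under assumed decade breakpoints; it is not the statistical workflow, software, or data of this study.

```python
import numpy as np

def age_partitioned_reference_intervals(ages, values, breaks=(20, 30, 40, 50, 60, 120)):
    """Nonparametric 95% reference intervals per age group.

    Returns the 2.5th and 97.5th percentiles of `values` within each age band
    defined by `breaks`.  Generic sketch with assumed decade breakpoints; not
    the procedure reported in the paper.
    """
    ages = np.asarray(ages)
    values = np.asarray(values)
    intervals = {}
    for lo, hi in zip(breaks[:-1], breaks[1:]):
        mask = (ages >= lo) & (ages < hi)
        if mask.sum() == 0:
            continue  # skip empty age bands
        lower, upper = np.percentile(values[mask], [2.5, 97.5])
        intervals[f"{lo}-{hi - 1}"] = (round(lower, 1), round(upper, 1))
    return intervals
```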

A Query Index for Processing Continuous Queries over RFID Tag Data (RFID 태그 데이타의 연속질의 처리를 위한 질의 색인)

  • Seok, Su-Wook;Park, Jae-Kwan;Hong, Bong-Hee
    • Journal of KIISE:Databases
    • /
    • v.34 no.2
    • /
    • pp.166-178
    • /
    • 2007
  • The ALE specification of EPCglobal, which is leading the development of RFID standards, includes the Event Cycle Specification (ECSpec), which describes how long an event cycle lasts, how RFID tag data are filtered, and which readers are of interest. The ECSpec is a specification for filtering and collecting RFID tag data. It is registered with the middleware for a long period and is evaluated repeatedly to return results satisfying the requirements it contains. It is therefore quite similar to a continuous query and can be transformed into one whose predicate in the WHERE clause is characterized by a long interval. Long intervals deteriorate the insertion and search performance of existing query indices. In this paper, we propose the TLC-index, a new query index structure for long-interval data. The TLC-index is a hybrid structure that combines the cell construct of the CQI-index with the virtual construct of the VCR-index to partition long intervals. The TLC-index reduces storage cost and improves insertion performance by decomposing long intervals into one or more coarse cell constructs, and it improves search performance by decomposing short intervals into one or more virtual constructs small enough to fit those intervals.
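
The decomposition idea can be pictured as covering the interior of a long interval with aligned, fixed-size pieces and the two ends with short leftovers. The sketch below is a rough, hypothetical illustration of that split (the function name and cell-alignment rule are assumptions); the actual TLC-index structure and its cell/virtual-construct sizing are not reproduced.

```python
def decompose_interval(start, end, cell_size):
    """Split an interval into aligned fixed-size cells plus short residues.

    Rough sketch of decomposing a long interval into coarse, grid-aligned
    pieces (cells) and short leftover pieces at either end; not the actual
    TLC-index cell/virtual-construct definitions.
    """
    cells, residues = [], []
    first_cell = ((start + cell_size - 1) // cell_size) * cell_size   # first aligned boundary >= start
    last_cell = (end // cell_size) * cell_size                        # last aligned boundary <= end
    if first_cell >= last_cell:
        residues.append((start, end))          # too short to contain a full cell
        return cells, residues
    if start < first_cell:
        residues.append((start, first_cell))
    for lo in range(first_cell, last_cell, cell_size):
        cells.append((lo, lo + cell_size))
    if last_cell < end:
        residues.append((last_cell, end))
    return cells, residues

# decompose_interval(3, 27, 10) -> ([(10, 20)], [(3, 10), (20, 27)])
```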

Quantitative Analysis of Dry Matter Production and its Partition in Rice II. Partitioning of Dry Matter Affected by Transplanting Date (수도의 건물 생산 및 배분의 수리적연구 II. 이앙기에 따른 부위별 건물배분)

  • Cho, Dong-Sam;Jong, Seung-Keun;Heo, Hoon;Yuk, Chang-Soo
    • KOREAN JOURNAL OF CROP SCIENCE
    • /
    • v.35 no.3
    • /
    • pp.273-281
    • /
    • 1990
  • Two rice varieties, Samkangbyeo and Sangpungbyeo, were transplanted in 1987 into 1/2000a pots on six dates at 10-day intervals beginning on May 11, and into a paddy field at the Chungbuk Provincial Rural Development Administration on four dates at 10-day intervals beginning on May 21. Dry matter distribution to the stem and leaf sheath, leaves, and ear at different growth stages was analyzed to provide basic information necessary for developing a dynamic growth model. Dry matter production was reduced as transplanting was delayed, and the reduction was greater when transplanting occurred later than June 1. Dry matter distribution to the stem and leaf sheath increased up to 60-70 days after transplanting, reaching a maximum ratio of 60-70%, and then decreased to 37-43% in pots and 27-33% in the field by the end of the ripening stage. In contrast, dry matter distribution to the leaf blade decreased from 40-50% at transplanting to 11-17% at harvest. Ear dry matter distribution increased rapidly after heading, and the distribution ratio reached 42-49% in pots and 52-62% in the field. Although regression equations predicting dry matter distribution to different parts of the rice plant were satisfactory within each individual experiment, they were not appropriate when applied to a different experiment.
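
As a hypothetical illustration of the kind of per-experiment regression mentioned above, the sketch below fits a quadratic curve to made-up stem-and-leaf-sheath partition ratios over days after transplanting; the numbers are invented to mimic the reported trend and are not data from the study.

```python
import numpy as np

# Invented days-after-transplanting values and stem + leaf-sheath partition
# ratios (%) shaped like the reported trend (peak of 60-70% around 60-70 DAT,
# declining toward harvest); illustrative only, not data from the paper.
days  = np.array([10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110])
ratio = np.array([45, 52, 58, 63, 66, 68, 67, 60, 52, 45, 40])

# Quadratic regression of partition ratio on days after transplanting,
# one simple form such a per-experiment prediction equation could take.
coeffs = np.polyfit(days, ratio, deg=2)
predict = np.poly1d(coeffs)
print(round(float(predict(65)), 1))   # predicted ratio near the observed peak
```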

Rough Set Analysis for Stock Market Timing (러프집합분석을 이용한 매매시점 결정)

  • Huh, Jin-Nyung;Kim, Kyoung-Jae;Han, In-Goo
    • Journal of Intelligence and Information Systems
    • /
    • v.16 no.3
    • /
    • pp.77-97
    • /
    • 2010
  • Market timing is an investment strategy used to obtain excess return from the financial market. In general, detecting market timing means determining when to buy and sell in order to obtain excess return from trading. In many market timing systems, trading rules have been used as an engine to generate trade signals. On the other hand, some researchers have proposed rough set analysis as a proper tool for market timing because, through its control function, it does not generate a trade signal when the pattern of the market is uncertain. Numeric data for rough set analysis must be discretized because rough sets accept only categorical data. Discretization searches for proper "cuts" in the numeric data that determine intervals; all values lying within an interval are mapped to the same value. In general, four discretization methods are used in rough set analysis: equal frequency scaling, expert knowledge-based discretization, minimum entropy scaling, and naïve and Boolean reasoning-based discretization. Equal frequency scaling fixes the number of intervals, examines the histogram of each variable, and then determines cuts so that approximately the same number of samples falls into each interval. Expert knowledge-based discretization determines cuts according to the knowledge of domain experts, gathered through literature review or interviews. Minimum entropy scaling recursively partitions the value set of each variable so that a local measure of entropy is optimized. Naïve and Boolean reasoning-based discretization first applies naïve scaling to the data and then finds optimized discretization thresholds through Boolean reasoning. Although rough set analysis is promising for market timing, there is little research on how the various discretization methods affect trading performance under rough set analysis. In this study, we compare stock market timing models using rough set analysis with various discretization methods. The research data are the KOSPI 200 from May 1996 to October 1998. The KOSPI 200 is the underlying index of the KOSPI 200 futures, the first derivative instrument in the Korean stock market. It is a market-value-weighted index consisting of 200 stocks selected by criteria on liquidity and their status in the corresponding industries, including manufacturing, construction, communication, electricity and gas, distribution and services, and financing. The total number of samples is 660 trading days. In addition, this study uses popular technical indicators as independent variables. The experimental results show that the most profitable method on the training sample is naïve and Boolean reasoning-based discretization, while expert knowledge-based discretization is the most profitable on the validation sample; moreover, expert knowledge-based discretization produced robust performance on both the training and validation samples. We also compared rough set analysis with a decision tree, using C4.5 for the comparison. The results show that rough set analysis with expert knowledge-based discretization produced more profitable rules than C4.5.
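
As a small illustration of the first of these methods, the sketch below computes equal-frequency (quantile) cut points for one numeric variable and maps values to interval codes; it is a generic example with invented function names, not the rough set software used in the study.

```python
import numpy as np

def equal_frequency_cuts(values, n_intervals):
    """Cut points that put roughly the same number of observations in each interval."""
    interior = np.linspace(0, 1, n_intervals + 1)[1:-1]   # interior quantile levels
    return np.quantile(values, interior)

def discretize(values, cuts):
    """Map each value to the index of the interval it falls into."""
    return np.searchsorted(cuts, values, side="right")

# e.g. discretizing a technical indicator into 4 categories:
# cuts = equal_frequency_cuts(indicator, 4)
# codes = discretize(indicator, cuts)        # values in {0, 1, 2, 3}
```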