• Title/Summary/Keyword: Large-set Classification


Library Management and Services for Software Component Reuse on the Web (Web 소프트웨어 컴포넌트 재사용을 위한 라이브러리 관리와 서비스)

  • Lee, Sung-Koo
    • Journal of KIISE:Software and Applications
    • /
    • v.29 no.1_2
    • /
    • pp.10-19
    • /
    • 2002
  • In searching and locating a collection of components on the Web, users require a Web browser. Since Web libraries tend to grow rapidly, there needs to be an effective way to organize and manage such large libraries. Traditional Web-based library (retrieval) systems provide various classification schemes and retrieval services to store and retrieve components. However, these systems omit valuable services, for example, enabling users to grasp the overall contents of the library at the beginning of retrieval. This paper discusses a Web-based library system that provides efficient management of object-oriented components and a set of services beyond simple component storage and retrieval. These services consist of component comprehension through a reverse engineering process, automated summary extraction, and comprehension-based retrieval. In addition, the performance of the automated cluster-based classification scheme adopted in the system is evaluated and compared with the performance of two other systems that use traditional classification schemes.

Physical activity classification table for Korean youth: using the Youth Compendium of Physical Activities in the United States (한국 소아청소년을 위한 신체활동분류표: 미국의 청소년 신체활동목록 (Youth Compendium of Physical Activities)을 이용하여)

  • Kim, Eun-Kyung;Gwak, Ji-Yeon;Jun, Ha-Yeon
    • Journal of Nutrition and Health
    • /
    • v.55 no.5
    • /
    • pp.533-542
    • /
    • 2022
  • The total energy expenditure (TEE) consists of the basal energy expenditure (BEE), the physical activity energy expenditure (PAEE), and the thermic effect of food. The PAEE accounts for a significant portion of the TEE, can be changed through individual effort, and varies widely between individuals. Even for the same physical activity, energy expenditure differs between adults and children. Therefore, a physical activity classification table for youth is needed to classify the activities recorded in the physical activity diaries used to evaluate children's energy expenditure. It is also needed to calculate the physical activity level required to set the estimated energy requirement in the Dietary Reference Intakes for children and adolescents in Korea. This paper reports a physical activity classification table for Korean youth based on the 2017 Youth Compendium of Physical Activities in the United States. The table includes 110 specific activities classified into 14 major categories by four age groups (6-9, 10-12, 13-15, and 16-18 years old), together with their metabolic equivalent values. Of these, 87 physical activities were selected from the 2017 Youth Compendium reported in the United States. Nine physical activities, such as washing and going to the bathroom, which are daily activities of children and adolescents not included in that compendium, were selected from another list (2008) of physical activities in America. The remaining 15 physical activities were selected from research results that measured the energy expenditure of Korean children and adolescents. Activity categories were divided into 4 areas: daily activity (A), movement (B), school work (C), and exercise and sports (D). This physical activity classification table will help standardize the interpretation and scoring of youth physical activity in related studies and community health surveys.

Utilizing health promotion indices of the 3rd national health plan in the 6th Community Health Plans in South Korea (제6기 지역보건의료계획의 제3차 국민건강증진종합계획 건강증진 지표 활용도)

  • Kim, Hyun-Soo;Lee, Jong-Ha;Jeon, Hyo-In;Lee, Moo-Sik;Hong, Jee-Young
    • Korean Journal of Health Education and Promotion
    • /
    • v.33 no.5
    • /
    • pp.83-91
    • /
    • 2016
  • Objectives: This study aimed to investigate the utilization of health promotion indices of the 3rd National Health Plan 2011-2020 (HP2020) in the 6th Korean Community Health Plan. Methods: Health promotion indices were defined as the set of indicators on smoking, alcohol drinking, physical activity, nutrition, and obesity used in HP2020. These indices were categorized into essential indicators, accessory indicators, and others. Using the chi-square test, we analyzed the utilization of health promotion indices in 186 Community Health Plans by regional classification: four large influence areas (SudoGangwon, Chungcheong, Gyeongsang, and HonamJeju) and four regional types (metropolitan district, city, urban-rural area, and rural area). Results: Among the 186 plans, indicator utilization rates were 97.8% for smoking, 71.0% for alcohol drinking, 91.9% for physical activity, 99.5% for nutrition, and 72.0% for obesity. Utilization rates of alcohol drinking indicators and of essential indicators in alcohol drinking differed significantly by the four large influence areas (p<0.01) and by the four regional types (p<0.01). Essential indicators in physical activity differed significantly by the four large influence areas (p<0.01). Conclusions: The central government must provide technical assistance and educate personnel in community health centers and provincial health departments about the meaning and usefulness of the Health Plan 2020 indicators.

Development of Automatic Rule Extraction Method in Data Mining : An Approach based on Hierarchical Clustering Algorithm and Rough Set Theory (데이터마이닝의 자동 데이터 규칙 추출 방법론 개발 : 계층적 클러스터링 알고리듬과 러프 셋 이론을 중심으로)

  • Oh, Seung-Joon;Park, Chan-Woong
    • Journal of the Korea Society of Computer and Information
    • /
    • v.14 no.6
    • /
    • pp.135-142
    • /
    • 2009
  • Data mining is an emerging area of computational intelligence that offers new theories, techniques, and tools for the analysis of large data sets. The major techniques used in data mining are association rule mining, classification, and clustering. Since these techniques are typically used individually, a rule extraction methodology that integrates them is needed. Rule extraction techniques help humans analyze large data sets and turn the meaningful information they contain into successful decision making. This paper proposes an autonomous method of rule extraction using clustering and rough set theory. Experiments are carried out on data sets from the UCI KDD archive, and the decision rules produced by the proposed method are presented. These rules can be successfully used for making decisions.

Fuzzy Rules Generation Using the LVQ (LVQ를 이용한 퍼지 규칙 생성)

  • Lee, Nam-Il;Jang, Gwang-Gyu;Im, Han-Gyu
    • The Transactions of the Korea Information Processing Society
    • /
    • v.6 no.4
    • /
    • pp.988-998
    • /
    • 1999
  • This paper investigates a method of reducing the number of fuzzy rules with the help of LVQ. A large number of training patterns usually leads to a large set of fuzzy rules, which requires a large amount of computer memory and takes a long time to perform classification. To solve these problems, the number of fuzzy rules must be minimized. To limit the performance degradation that results from reducing the fuzzy rules, the rules are generated after training high-quality initial reference patterns. Through simulation, we confirm that the proposed method is very effective.


Change of Sunspot Groups Observed from 2002 to 2011 at ButterStar Observatory

  • Oh, Sung-Jin;Chang, Heon-Young
    • Journal of Astronomy and Space Sciences
    • /
    • v.29 no.3
    • /
    • pp.245-251
    • /
    • 2012
  • Since the development of surface magnetic features should reflect the evolution of the solar magnetic field in the deep interior of the Sun, it is crucial to study the properties of sunspots and sunspot groups to understand the physical processes working below the solar surface. Here, using the data set of sunspot groups observed at the ButterStar Observatory for 3,364 days from 2002 October 16 to 2011 December 31, we investigate the temporal change of sunspot groups depending on their Zürich classification type. Our main findings are as follows: (1) There are more sunspot groups in the southern hemisphere in solar cycle 23, while more sunspot groups appear in the northern hemisphere in solar cycle 24. We also note that in the declining phase of solar cycle 23 the decreasing tendency is apparently steeper in the solar northern hemisphere than in the solar southern hemisphere. (2) Some sunspot group types make a secondary peak in the distribution between the solar maximum and the solar minimum. More importantly, in this particular data set, sunspot groups that appeared in the solar southern hemisphere make a secondary peak 1 year after a secondary peak occurs in the solar northern hemisphere. (3) The temporal variations of small and large sunspot group numbers are disparate. That is, the number of large sunspot groups declines earlier and faster, and the number of small sunspot groups begins to rise earlier and faster. (4) The total number of observed sunspots is found to behave more like the number of small sunspot groups. Hence, according to our findings, the behavior and evolution of small magnetic flux tubes and large magnetic flux tubes seem to differ over solar cycles. Finally, we conclude by briefly pointing out the implications for space weather forecasting.

Improved Sentence Boundary Detection Method for Web Documents (웹 문서를 위한 개선된 문장경계인식 방법)

  • Lee, Chung-Hee;Jang, Myung-Gil;Seo, Young-Hoon
    • Journal of KIISE:Software and Applications
    • /
    • v.37 no.6
    • /
    • pp.455-463
    • /
    • 2010
  • In this paper, we present an approach to sentence boundary detection for web documents that builds on statistical methods and uses rule-based correction. The proposed system uses a classification model learned offline from a training set of human-labeled web documents. Web documents contain many word-spacing errors and frequently lack the punctuation mark that indicates a sentence boundary. As sentence boundary candidates, the proposed method considers every ending eomi as well as punctuation marks. We optimize engine performance by selecting the best features, the best training data, and the best classification algorithm. For evaluation, we made two test sets: Set1, consisting of articles and blog documents, and Set2, consisting of web community documents. We use F-measure to compare results on a large variety of tasks. Detecting only periods as sentence boundaries, our basis engine achieved 96.5% on Set1 and 56.7% on Set2. We improved our basis engine by adapting the features and the boundary search algorithm. For the final evaluation, we compared our adaptation engine with our basis engine on Set2. As a result, the adaptation engine improved on the basis engine by 39.6%. These results demonstrate the effectiveness of the proposed method for sentence boundary detection.

A Co-training Method based on Classification Using Unlabeled Data (비분류표시 데이타를 이용하는 분류 기반 Co-training 방법)

  • 윤혜성;이상호;박승수;용환승;김주한
    • Journal of KIISE:Software and Applications
    • /
    • v.31 no.8
    • /
    • pp.991-998
    • /
    • 2004
  • In many practical learning problems, including those in the bioinformatics area, there is a small amount of labeled data along with a large pool of unlabeled data. Labeled examples are fairly expensive to obtain because they require human effort. In contrast, unlabeled examples can be gathered inexpensively without an expert. A common method for using unlabeled data in data classification and analysis is co-training. This method uses a small set of labeled examples to learn a classifier in each of two views. Each classifier is then applied to all unlabeled examples, and co-training detects the examples on which each classifier makes its most confident predictions. After some iterations, new classifiers are learned on the training data and the number of labeled examples is increased. In this paper, we propose a new co-training strategy using unlabeled data. We evaluate our method with two classifiers and two experimental data sets: WebKB and BIND XML data. Our experiments show that the proposed co-training technique effectively improves classification accuracy when the number of labeled examples is very small.
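
The generic co-training loop described in this abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' proposed strategy: the nearest-centroid classifier, the function names (`fit_centroids`, `predict`, `co_train`), and the toy data are all assumptions made for the example, and the labeled set is assumed to contain at least two classes.

```python
from statistics import mean

def fit_centroids(xs, ys):
    """Nearest-centroid classifier on one scalar feature: one mean per class."""
    return {c: mean(x for x, y in zip(xs, ys) if y == c) for c in set(ys)}

def predict(centroids, x):
    """Return (label, confidence); confidence is the distance gap between
    the runner-up centroid and the nearest centroid."""
    ranked = sorted(centroids, key=lambda c: abs(x - centroids[c]))
    best, runner_up = ranked[0], ranked[1]
    return best, abs(x - centroids[runner_up]) - abs(x - centroids[best])

def co_train(labeled, unlabeled, rounds=4, per_round=1):
    """labeled: list of ((view1, view2), label); unlabeled: list of (view1, view2)."""
    labeled, pool = list(labeled), list(unlabeled)
    for _ in range(rounds):
        if not pool:
            break
        ys = [y for _, y in labeled]
        # Train one classifier per view on the current labeled set.
        models = [fit_centroids([x[v] for x, _ in labeled], ys) for v in (0, 1)]
        picked = {}
        for v, model in enumerate(models):
            # Each classifier nominates the pool examples it is most confident about.
            scored = sorted(pool, key=lambda x: predict(model, x[v])[1], reverse=True)
            for x in scored[:per_round]:
                picked.setdefault(x, predict(model, x[v])[0])
        # Move the nominated examples, with their predicted labels, into the labeled set.
        for x, y in picked.items():
            labeled.append((x, y))
            pool.remove(x)
    return labeled

# Hypothetical toy data: each example has two scalar views.
seed = [((0.0, 0.1), "a"), ((1.0, 0.9), "b")]
pool = [(0.1, 0.0), (0.9, 1.0), (0.2, 0.3), (0.8, 0.7)]
result = dict(co_train(seed, pool))
```

Each round grows the labeled set by at most `per_round` examples per view, so the two classifiers gradually teach each other from the unlabeled pool.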

Extracting High Quality Thematic Information by Using High-Resolution Satellite Imagery (고해상도 위성영상을 이용한 정밀 주제 정보 추출)

  • Lee, Hyun-Jik;Ru, Ji-Ho;Yu, Young-Geol
    • Journal of Korean Society for Geospatial Information Science
    • /
    • v.18 no.1
    • /
    • pp.73-81
    • /
    • 2010
  • In recent years, there has been diverse research on, and application of, creating geospatial information from high-resolution satellite images. However, thematic maps made with middle- or low-resolution satellite images have low location accuracy and low precision of thematic information. This study proposes a method of making a precise thematic map with high-resolution satellite images by examining the conversion from the conventional method based on middle- or low-resolution satellite images to an automatic method based on high-resolution satellite images of GSD 1m or lower, extracting thematic information of middle or large scale of 1/5,000 or lower, and analyzing its accuracy. Seven classification classes were categorized according to object-oriented classification in order to automatically extract thematic information from high-resolution satellite images. The classification results were compared and analyzed against the old middle-scale land cover map and the 1/1000 digital map.

Extraction of the aquaculture farms information from the Landsat-TM imagery of the Younggwang coastal area

  • Shanmugam, P.;Ahn, Yu-Hwan;Yoo, Hong-Ryong
    • Proceedings of the Korean Association of Geographic Information Studies Conference
    • /
    • 2004.03a
    • /
    • pp.493-498
    • /
    • 2004
  • The objective of the present study is to compare various conventional and recently evolved satellite image-processing techniques and to ascertain the best possible technique for identifying and locating aquaculture farms accurately in and around the Younggwang coastal area. Several conventional techniques applied to extract such information from the Landsat-TM imagery do not seem to yield better information about the aquaculture farms and lead to misclassification. The large errors between the actual and extracted aquaculture farm information are due to spectral confusion and the inadequate spatial resolution of the sensor. This leads to the possible occurrence of mixture pixels, or 'mixels', which are a source of error in the classification techniques. Addressing the spectral confusion and mixture pixel problems requires the development of efficient methods that enable more reliable extraction of aquaculture farm information. Thus, more recently evolved methods, namely step-by-step partial spectral end-member extraction and linear spectral unmixing, are introduced. The former assumes that an end-member, which is often referred to as the 'spectrally pure signature' of a target feature, does not appear in a spectrally pure form but always mixes with other features in certain proportions. The assumption of linear spectral unmixing is that the measured reflectance of a pixel is the linear sum of the reflectances of the mixture components that make up that pixel. The classification accuracy of the step-by-step partial end-member extraction improved significantly compared to that obtained from traditional supervised classifiers. However, this method did not distinguish between aquaculture ponds and non-aquaculture ponds within the aquaculture farming areas. In contrast, the linear spectral unmixing model produced a set of fraction images for aquaculture, water, and soil. Of these, the aquaculture fraction yields good estimates of the proportion of aquaculture farm in each pixel. The acquired proportion was compared with the NDVI values, and the two are positively correlated (R² = 0.91), indicating the reliability of the sub-pixel classification.
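
The linear mixing assumption stated in this abstract can be illustrated with a small sketch. It is a simplification and not the authors' model: it assumes only two end-members whose fractions sum to 1, and the function name `unmix_two` and all reflectance values are hypothetical.

```python
def unmix_two(pixel, endmember_a, endmember_b):
    """Least-squares fraction of endmember_a in a pixel, assuming
    pixel = f * endmember_a + (1 - f) * endmember_b in every band."""
    diff = [a - b for a, b in zip(endmember_a, endmember_b)]
    resid = [p - b for p, b in zip(pixel, endmember_b)]
    frac_a = sum(d * r for d, r in zip(diff, resid)) / sum(d * d for d in diff)
    return frac_a, 1.0 - frac_a

# Hypothetical reflectances in three bands (illustrative values only).
water = [0.10, 0.08, 0.02]
soil = [0.20, 0.30, 0.40]
# Build a mixed pixel that is 70% water and 30% soil, then recover the fractions.
pixel = [0.7 * w + 0.3 * s for w, s in zip(water, soil)]
frac_water, frac_soil = unmix_two(pixel, water, soil)
```

Applying such a solver to every pixel yields one fraction image per end-member, which is the kind of output the abstract describes for aquaculture, water, and soil.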
