• Title/Summary/Keyword: 분류(分類) (classification)

Feature Selection and Classification of Web Pages (웹 페이지에서의 자질 선택과 분류)

  • 송무희;임수연;박성배;강동진;이상조
    • Proceedings of the Korean Information Science Society Conference / 2004.10a / pp.796-798 / 2004
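  • This paper proposes a feature selection method for web pages, and a web document classification method built on it, to improve web document classification performance. Document classification uses the words contained in a document as classification features, but using every word in a document as a feature does not guarantee good performance. It is therefore necessary to automatically extract only the words that matter and so reduce the features of the document data. Accordingly, this paper uses principal component analysis, a statistical analysis technique, to reduce the wide-ranging feature vectors of the population to a small number of principal components, and shows experimentally both the feature reduction and the resulting improvement in document classification performance. Yahoo sports news web pages were used for classification, and the Naive Bayesian classification method was used as the classifier. The experimental results show that the proposed news web page classification method provides satisfactory classification accuracy on the sports news data set.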
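
The pipeline described in this abstract (word-count features, reduction by principal component analysis, then a Naive Bayesian classifier) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the vocabulary, the toy word counts, the use of a single principal component found by power iteration, and the Gaussian form of the Naive Bayes model are all assumptions made for the example.

```python
import math

def top_principal_component(rows, iterations=200):
    """Return (column means, unit-length first principal component) via power iteration."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the word-count features (d x d).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    v = [1.0] + [0.0] * (d - 1)  # fixed start vector keeps the sketch deterministic
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return means, v

def project(row, means, component):
    """Reduce one word-count vector to a single principal-component score."""
    return sum((x - m) * c for x, m, c in zip(row, means, component))

def fit_gaussian_nb(values, labels):
    """Per-class log prior, mean, and variance of the reduced feature."""
    model = {}
    for label in set(labels):
        xs = [v for v, l in zip(values, labels) if l == label]
        mean = sum(xs) / len(xs)
        var = sum((x - mean) ** 2 for x in xs) / len(xs) + 1e-6  # smoothing
        model[label] = (math.log(len(xs) / len(values)), mean, var)
    return model

def predict_nb(model, value):
    """Pick the class with the highest Gaussian log posterior."""
    def log_posterior(label):
        log_prior, mean, var = model[label]
        return log_prior - 0.5 * math.log(2 * math.pi * var) - (value - mean) ** 2 / (2 * var)
    return max(model, key=log_posterior)

# Hypothetical toy data: counts of ["goal", "pitcher", "inning", "striker"] per page.
X = [[3, 0, 0, 2], [4, 0, 1, 3], [2, 1, 0, 2],
     [0, 3, 4, 0], [1, 4, 3, 0], [0, 2, 3, 1]]
y = ["soccer", "soccer", "soccer", "baseball", "baseball", "baseball"]

means, component = top_principal_component(X)
model = fit_gaussian_nb([project(row, means, component) for row in X], y)

def classify_document(counts):
    return predict_nb(model, project(counts, means, component))
```

Calling `classify_document([3, 0, 0, 3])` reduces the four word counts to one score before the classifier sees them, which is the feature-reduction step the abstract argues for.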

A Study on the Theory and Historical Development of Official Document Classification Scheme in Korea - Since Chosun Dynasty to Current Korea Government - (문서분류의 이론과 변천에 관한 연구 - 조선조이후 현행 '정부공문서분류'까지 -)

  • Choe, Jung-Tai;Lee, Ju-Yeon
    • Journal of Korean Society of Archives and Records Management / v.3 no.2 / pp.1-33 / 2003
  • This study examines the theory of document classification systems and the historical development of official document classification schemes from the Chosun dynasty to the Republic of Korea. A new version of the classification scheme, the 'Document Classification Standard', is scheduled for 2004, although many fundamental problems remain in governmental agencies and record centers. The new 'Document Classification Standard' should therefore be discussed and examined.

Classification of Remote Sensing Data using Random Selection of Training Data and Multiple Classifiers (훈련 자료의 임의 선택과 다중 분류자를 이용한 원격탐사 자료의 분류)

  • Park, No-Wook;Yoo, Hee Young;Kim, Yihyun;Hong, Suk-Young
    • Korean Journal of Remote Sensing / v.28 no.5 / pp.489-499 / 2012
  • In this paper, a classifier ensemble framework for remote sensing data classification is presented that combines classification results generated from both different training sets and different classifiers. The core of the framework is to increase the diversity among classification results by using both different training sets and different classifiers, and thereby improve classification accuracy. First, training sets with different sampling densities are generated and used as inputs for supervised classification with classifiers that have different discrimination capabilities. Then the preliminary classification results are combined via a majority voting scheme to generate a final classification result. A case study of land-cover classification using multi-temporal ENVISAT ASAR data sets is carried out to illustrate the potential of the framework. In the case study, nine classification results were combined, generated from three different training sets and three different classifiers: a maximum likelihood classifier, a multi-layer perceptron classifier, and a support vector machine. The results showed that complementary information on the discrimination of the land-cover classes of interest could be extracted within the proposed framework, and the best classification accuracy was obtained. When different combinations were compared, combining classification results from classifiers with little diversity did not improve classification accuracy. Thus, it is recommended to ensure greater diversity among classifiers when designing multiple classifier systems.
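
The majority voting step that combines the preliminary classification results can be sketched as below. This is an illustrative sketch only: the function name, the land-cover labels, and the tie-breaking rule (the first label encountered wins) are assumptions, and the abstract does not say how ties were resolved.

```python
from collections import Counter

def majority_vote(results):
    """Combine several classification results into one by per-pixel majority vote.

    results: list of equal-length label lists, one list per preliminary
    classification (for example, one per training-set/classifier pair).
    """
    combined = []
    for votes in zip(*results):
        # most_common(1) returns the label with the highest count; on a tie
        # the label that appeared first among the votes is kept.
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined

# Hypothetical labels for four pixels from three preliminary classifications.
result_a = ["rice", "water", "forest", "rice"]
result_b = ["rice", "forest", "forest", "water"]
result_c = ["water", "water", "forest", "rice"]

final_map = majority_vote([result_a, result_b, result_c])
```

In the case study the same vote would run over nine label maps (three training sets times three classifiers) instead of three.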

Flora of Sum-eunmulbaengdui Forest Genetic Resource Reserve Area in Jeju-do (숨은물뱅듸 산림유전자원 보호구역의 식물상)

  • Jung, Gi-Soo;Hyun, Hwa-Ja;Jeong, Jun-Ho;Moon, Sung-Pil;Lee, Sun-Ryung;Song, Gwanpil
    • Proceedings of the Plant Resources Society of Korea Conference / 2018.10a / pp.54-54 / 2018
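  • The Sum-eunmulbaengdui Forest Genetic Resource Reserve is a wetland at elevations above 950 m, a wide basin surrounded by oreum (volcanic cones). It contains diverse wetland plants, including 자주땅귀개, a Class II endangered wild plant designated by the Ministry of Environment, together with the surrounding forest, and is designated and managed by the Korea Forest Service as a Forest Genetic Resource Reserve. This survey investigated the flora of the reserve to produce baseline data for conserving plant species diversity. Field surveys were carried out four times between July 24 and August 28, 2018, during which specimens were collected, recorded, and organized. The plants growing naturally in Sum-eunmulbaengdui comprised 17 taxa of pteridophytes (8 families, 11 genera, 17 species), 2 taxa of gymnosperms (2 families, 2 genera, 2 species), and 173 taxa of angiosperms (56 families, 121 genera, 167 species, 5 varieties, 1 forma), for a total of 192 taxa in 66 families, 134 genera, 186 species, 5 varieties, and 1 forma. Among these, one endangered wild plant designated by the Ministry of Environment, 자주땅귀개, was confirmed, along with 6 taxa endemic to Jeju and 2 taxa endemic to Korea. Floristic regional indicator plants totaled 37 taxa: 5 taxa of grade V, 5 of grade IV, 12 of grade III, 5 of grade II, and 10 of grade I. Plants on the Korean Red List comprised 1 Endangered (EN) taxon, 1 Vulnerable (VU) taxon, 1 Near Threatened (NT) taxon, 6 Least Concern (LC) taxa, and 3 Not Evaluated (NE) taxa. In the life-form analysis of the surveyed plants, the most common dormancy form was Ch (47 taxa), followed by G (30 taxa), MM (24 taxa), and HH (23 taxa). The most common radicoid form was R5 (101 taxa), the most common disseminule form was D4 (84 taxa), and the most common growth form was e (89 taxa). Given that only 1 taxon of alien plant was found, Sum-eunmulbaengdui is judged to be still well conserved, to have excellent plant species diversity, and to be of very high botanical value.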

Flora of Mt. Choejeong (Daegu) (최정산(대구)의 관속식물상 연구)

  • Jun, Minji;Lee, Eunmi;Park, Sunmi;Bae, Jongwu;Na, Myeongwu;Hwang, Youjin;Choi, SuMi;Park, SeonJoo
    • Korean Journal of Plant Resources / v.32 no.2 / pp.170-200 / 2019
  • This study investigated the vascular plants of Mt. Choejeong in Gachang-myeon, Daegu. A total of 22 field surveys were conducted from March 2017 to October 2018. The vascular plants recorded comprised 560 taxa: 104 families, 297 genera, 495 species, 4 subspecies, 51 varieties, and 10 forms. They included 15 taxa of endemic plants, 5 taxa of rare plants, 5 taxa of Red List plants, 54 taxa of floristic regional indicator plants, and 36 taxa of naturalized plants. Among the 560 taxa surveyed, edible, medicinal, ornamental, timber, pasturing, industrial and fiber plants included 246 taxa (29.2%), 228 taxa (27.1%), 164 taxa (19.5%), 61 taxa (7.2%), 13 taxa (1.5%), and 8 taxa (0.9%). Because people now visit the area more frequently than in the past, naturalized plants are likely to arrive more often and to threaten the habitat of the plants currently growing there.

Distribution of Vascular Plants at the Ecological Landscape Conservation Area Heoninlleung in Seoul (서울시 생태.경관보전지역 헌인릉의 관속식물 분포)

  • Kim, Kun-Ok;Hong, Sun-Hee;Lee, Yong-Ho;Na, Chae-Sun;Kang, Byeung-Hoa;Son, Yo-Whan
    • Korean Journal of Plant Resources / v.23 no.1 / pp.60-78 / 2010
  • To clarify the distribution and usefulness of vascular plants in Heoninlleung, an Ecological Landscape Conservation Area of Seoul, we surveyed the area from April 2006 to June 2009. A total of 313 taxa were distributed in Heoninlleung: 68 families, 191 genera, 264 species, 41 varieties, and 8 forms. Among them, 37 taxa were highly abundant everywhere (3A), 16 taxa were highly abundant locally (3B), 70 taxa were moderately abundant everywhere (2A), 96 taxa were common locally in certain regions (2B), 9 taxa were rare but observed everywhere with low frequency (1A), and 85 taxa were rare and observed locally (1B). There were 293 taxa of economic plants: 156 taxa of edible source, 223 taxa of medicinal source, 141 taxa of ornamental source, 69 taxa of pastoral source, 12 taxa of industrial source, and 8 taxa of timber source. Twelve Korean endemic plants were collected. Based on the lists of rare plants of the Korea National Arboretum and the Ministry of Environment, 2 rare species were found. The specific plant species of grades I to V by phytogeography comprised 19 taxa. Twenty-four taxa of naturalized plants were also distributed. The Naturalization Index was 7.7% and the Urbanization Index was 8.4% in the investigated area.
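
The reported Naturalization Index can be checked against the counts in the abstract. The sketch below assumes the commonly used definition, naturalized taxa divided by all vascular taxa recorded in the surveyed area, which the abstract itself does not spell out. The function name is hypothetical. The Urbanization Index is not reproduced because its denominator (the number of naturalized taxa known nationally) is not given in the abstract.

```python
def naturalization_index(naturalized_taxa, total_taxa):
    """Naturalized taxa as a percentage of all taxa recorded in the surveyed area."""
    return 100.0 * naturalized_taxa / total_taxa

# Counts taken from the abstract: 24 naturalized taxa out of 313 taxa in total.
ni = naturalization_index(24, 313)
print(f"Naturalization Index: {ni:.1f}%")  # prints "Naturalization Index: 7.7%"
```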

A Study on the Relationship between Class Similarity and the Performance of Hierarchical Classification Method in a Text Document Classification Problem (텍스트 문서 분류에서 범주간 유사도와 계층적 분류 방법의 성과 관계 연구)

  • Jang, Soojung;Min, Daiki
    • The Journal of Society for e-Business Studies / v.25 no.3 / pp.77-93 / 2020
  • The literature has reported that hierarchical classification methods generally outperform flat classification methods on multi-class document classification problems. Unlike prior work that constructed a class hierarchy, this paper evaluates the performance of hierarchical and flat classification methods when the class hierarchy is predefined. We conducted numerical evaluations on two data sets: research papers on climate change adaptation technologies in the water sector and the 20NewsGroup open data set. The evaluation results show that the hierarchical classification method outperforms the flat classification methods under a certain condition, which differs from the literature. The advantage of the hierarchical method over the flat method depends on the class similarities at each level of the class structure. More importantly, the hierarchical classification method works better when the upper-level similarity is less than the lower-level similarity.
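
One way to compare class similarity at the upper and lower levels of a predefined hierarchy is sketched below. The abstract does not state how similarity was measured, so the cosine similarity between class centroids used here is an assumption, as are the toy hierarchy, the term-frequency vectors, and all function names.

```python
import math

def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def mean_pairwise_similarity(centroids):
    """Average cosine similarity over all pairs of class centroids."""
    names = sorted(centroids)
    pairs = [cosine_similarity(centroids[a], centroids[b])
             for i, a in enumerate(names) for b in names[i + 1:]]
    return sum(pairs) / len(pairs) if pairs else 0.0

def level_similarities(hierarchy):
    """Return (upper-level similarity, {parent: lower-level similarity})."""
    parent_centroids, lower = {}, {}
    for parent, children in hierarchy.items():
        all_docs = [doc for docs in children.values() for doc in docs]
        parent_centroids[parent] = centroid(all_docs)
        lower[parent] = mean_pairwise_similarity(
            {child: centroid(docs) for child, docs in children.items()})
    return mean_pairwise_similarity(parent_centroids), lower

# Hypothetical hierarchy: parent class -> child class -> term-frequency vectors.
hierarchy = {
    "sports": {"soccer": [[3, 1, 0, 0], [2, 2, 0, 0]],
               "baseball": [[1, 3, 0, 0], [2, 3, 0, 0]]},
    "science": {"physics": [[0, 0, 3, 1]],
                "biology": [[0, 0, 1, 3]]},
}

upper, lower = level_similarities(hierarchy)
```

In this toy hierarchy the parent classes share no terms, so `upper` is lower than every value in `lower`, which is the situation the abstract identifies as favorable to hierarchical classification.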

A Co-training Method based on Classification Using Unlabeled Data (비분류표시 데이타를 이용하는 분류 기반 Co-training 방법)

  • 윤혜성;이상호;박승수;용환승;김주한
    • Journal of KIISE:Software and Applications / v.31 no.8 / pp.991-998 / 2004
  • In many practical learning problems, including those in bioinformatics, there is a small amount of labeled data along with a large pool of unlabeled data. Labeled examples are fairly expensive to obtain because they require human effort. In contrast, unlabeled examples can be gathered inexpensively without an expert. A common method that uses unlabeled data for data classification and analysis is co-training. This method uses a small set of labeled examples to learn a classifier in each of two views. Each classifier is then applied to all unlabeled examples, and co-training selects the examples on which each classifier makes its most confident predictions. After some iterations, new classifiers are learned on the training data and the number of labeled examples increases. In this paper, we propose a new co-training strategy using unlabeled data, and we evaluate our method with two classifiers and two experimental data sets: WebKB and BIND XML data. Our experiments show that the proposed co-training technique effectively improves classification accuracy when the number of labeled examples is very small.
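
The generic co-training loop summarized in this abstract can be sketched as follows. This illustrates standard co-training, not the new strategy the paper proposes. The nearest-class-mean classifier, the margin-based confidence score, the one-number-per-view examples, and the choice to retrain after every newly labeled example are all assumptions made to keep the example short.

```python
def fit_means(labeled, view):
    """Per-class mean of one view; labeled is a list of ((view0, view1), label)."""
    sums = {}
    for views, label in labeled:
        total, count = sums.get(label, (0.0, 0))
        sums[label] = (total + views[view], count + 1)
    return {label: total / count for label, (total, count) in sums.items()}

def predict_with_confidence(means, value):
    """Nearest-mean label plus a margin-based confidence (two classes assumed)."""
    ranked = sorted(means, key=lambda label: abs(value - means[label]))
    best, runner_up = ranked[0], ranked[1]
    margin = abs(value - means[runner_up]) - abs(value - means[best])
    return best, margin

def co_train(labeled, unlabeled, rounds=5):
    """Grow the labeled set by letting each view's classifier label its surest example."""
    labeled, unlabeled = list(labeled), list(unlabeled)
    for _ in range(rounds):
        for view in (0, 1):
            if not unlabeled:
                return labeled
            means = fit_means(labeled, view)
            scored = [(predict_with_confidence(means, example[view]), example)
                      for example in unlabeled]
            # Take the unlabeled example this view's classifier is most confident about.
            (label, _), chosen = max(scored, key=lambda item: item[0][1])
            labeled.append((chosen, label))
            unlabeled.remove(chosen)
    return labeled

# Hypothetical data: each example is a pair of views, one number per view.
seed = [((0.0, 0.1), "neg"), ((1.0, 0.9), "pos")]
pool = [(0.1, 0.2), (0.9, 0.8), (0.2, 0.0), (0.8, 1.0)]

grown = dict(co_train(seed, pool))
```

Each pass lets one view's classifier add a label that the other view's classifier then trains on, which is how the two views teach each other.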

A Study on the Model for Construction Records Classification System (건설기록물 분류체계 모형에 관한 연구)

  • Park, Yong-Boo;Kim, Tae-Soo
    • Journal of the Korean Society for Information Management / v.28 no.3 / pp.83-101 / 2011
  • The international standards, ISO 15489 and Family Code, recommend using a functional classification method in both public and private organizations. This study made a comparative analysis of the details of classification systems through case studies of the records classification systems of seven comprehensive construction companies in Korea, including three large corporations and four small and medium-sized businesses. The findings suggest a direction and a methodology for developing a construction records classification system. By summarizing the classification standards derived from these case studies, key construction records classification standards are presented.

Automatic Document Classification Using Multiple Classifier Systems (다중 분류기 시스템을 이용한 자동 문서 분류)

  • Kim, In-Cheol
    • The KIPS Transactions:PartB / v.11B no.5 / pp.545-554 / 2004
  • Combining multiple classifiers to obtain better performance than an individual classifier is a widely used technique. Constructing a multiple classifier system (MCS) involves two issues: how to generate a diverse set of base-level classifiers and how to combine their predictions. In this paper, we review the characteristics of existing multiple classifier systems: Bagging, Boosting, and Stacking. For document classification, we propose new MCSs such as Stacked Bagging, Stacked Boosting, Bagged Stacking, and Boosted Stacking. These are hybrid MCSs that combine the advantages of existing MCSs such as Bagging, Boosting, and Stacking. We conducted document classification experiments to evaluate the performance of the proposed schemes on MEDLINE, Usenet news, and Web document collections. The experimental results demonstrate the superiority of our hybrid MCSs over the existing ones.
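
Of the two issues named in this abstract, the second (how to combine base-level predictions) is what Stacking addresses: a meta-level classifier is trained on the outputs of the base-level classifiers. The sketch below shows plain stacking only, not the hybrid schemes the paper proposes. The two rule-based base classifiers, the table-lookup meta-classifier, and the example documents are hypothetical choices made to keep the example self-contained.

```python
from collections import Counter, defaultdict

def train_stacker(base_classifiers, meta_examples):
    """Fit a table-lookup meta-classifier on base-level predictions.

    meta_examples: list of (document, label) pairs held out for the meta level.
    Returns a predict(document) function.
    """
    table = defaultdict(Counter)
    overall = Counter()
    for document, label in meta_examples:
        key = tuple(clf(document) for clf in base_classifiers)
        table[key][label] += 1
        overall[label] += 1
    fallback = overall.most_common(1)[0][0]  # used for unseen prediction patterns

    def predict(document):
        key = tuple(clf(document) for clf in base_classifiers)
        if key in table:
            return table[key].most_common(1)[0][0]
        return fallback

    return predict

# Hypothetical base-level classifiers: crude keyword and length rules.
def by_keyword(document):
    return "medical" if "patient" in document else "news"

def by_length(document):
    return "medical" if len(document.split()) > 5 else "news"

meta_examples = [
    ("patient shows fever", "medical"),
    ("the patient was discharged after two days", "medical"),
    ("election results announced", "news"),
    ("the city council voted on the new budget today", "news"),
]

stacked = train_stacker([by_keyword, by_length], meta_examples)
```

Here the meta level learns from the held-out examples that when the two rules disagree, the keyword rule is the one to trust, so `stacked("patient recovering")` returns `"medical"` even though the length rule votes `"news"`.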