• Title/Summary/Keyword: Classification Variables


Case Study on Public Document Classification System That Utilizes Text-Mining Technique in BigData Environment (빅데이터 환경에서 텍스트마이닝 기법을 활용한 공공문서 분류체계의 적용사례 연구)

  • Shim, Jang-sup;Lee, Kang-wook
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference
    • /
    • 2015.10a
    • /
    • pp.1085-1089
    • /
    • 2015
  • In the past, text-mining techniques were difficult to implement because of the complexity of text and the degrees of freedom of the variables it contains. The algorithms demanded considerable effort to produce meaningful results, and mechanical text analysis took more time than analysis by humans. With advances in hardware and analysis algorithms, however, big data technology has emerged. It has resolved the problems mentioned above, and analysis through text mining is now recognized as valuable. Applying text mining to Korean text nevertheless remains at an early stage because of the linguistic characteristics of the Korean language. If text mining can support analysis as well as data search, the savings in human and material resources required for text analysis will lead to more efficient resource utilization in many areas of public work. In this paper, we therefore compare and evaluate manual classification of public documents against classification in a big data environment that uses word frequency (TF-IDF) from text mining and the cosine similarity between documents.
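
As a rough illustration of the two measures named in this abstract, the sketch below weights terms with TF-IDF and compares documents by cosine similarity. It is not the authors' pipeline: the function names, the whitespace tokenization, the unsmoothed `log(N / df)` weighting, and the three toy documents are all assumptions made for the example.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed weighting: relative term frequency times log(N / df).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-weight vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical, already tokenized documents (not data from the paper).
docs = [
    "budget report for road construction".split(),
    "road construction budget plan".split(),
    "school lunch menu notice".split(),
]
vecs = tf_idf_vectors(docs)
sim_01 = cosine_similarity(vecs[0], vecs[1])  # documents sharing several terms
sim_02 = cosine_similarity(vecs[0], vecs[2])  # documents sharing no terms
```

A new document could then be assigned to the category of its most similar labelled document. Korean text would first need morphological analysis rather than whitespace splitting.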


Studies on the Structure of the Forest Community in Mt. Sokri(II) -Analysis on the Plant Community by the Classification and Ordination Techniques- (속리산 삼림군집구조에 관한 연구(II) Classification 및 Ordination 방법에 의한 식생분석 -)

  • 이경재;박인협;조재창;오충현
    • Korean Journal of Environment and Ecology
    • /
    • v.4 no.1
    • /
    • pp.33-43
    • /
    • 1990
  • A survey of the Popju Temple district was conducted using 70 sample plots of 500 $m^2$ each. Classification by TWINSPAN and DCA ordination were applied to the study area in order to classify the plots into several groups based on woody plants and environmental variables. Both techniques divided the plant community into six groups according to altitude and soil moisture. The successional trends of tree species seem to run from Pinus densiflora, Sorbus alnifolia through Quercus serrata to Carpinus laxiflora and from P. densiflora, Fraxinus sieboldiana through Q. mongolica in the canopy layer, and from Lespedeza cyrtobotrya, Rhus trichocarpa, Zanthoxylum schinifolium through Rhododendron mucronulatum, Corylus sieboldiana, Lindera obtusiloba, Magnolia sieboldii to Euonymus sieboldianus in the understory and shrub layers. In the burnt plot, the forest fire decreased the species diversity of the plant community but increased the importance values of Quercus species.


Multi-Label Classification Approach to Effective Aspect-Mining (효과적인 애스팩트 마이닝을 위한 다중 레이블 분류접근법)

  • Jong Yoon Won;Kun Chang Lee
    • Information Systems Review
    • /
    • v.22 no.3
    • /
    • pp.81-97
    • /
    • 2020
  • Recent trends in sentiment analysis have focused on single-label classification approaches. However, because a review comment by one person usually covers several topics or aspects, it is better to classify sentiment for each aspect separately. This paper has two purposes. First, because one sentence can contain several aspects, aspect mining is performed to classify the emotion for each aspect. Second, we apply a multi-label classification method to analyze two or more dependent variables (output values) at once. To validate the proposed approach, online review comments about musical performances were collected from a domestic online platform, and the multi-label classification approach was applied to the dataset. The results were promising, and the potential of the proposed approach is discussed.

An Analysis on the 500m - Mesh Classification based on the Combinations of Building Needs in Busan (부산시 500m 메시 레벨에서의 건물용도 구성에 따른 유형화 분석)

  • Hwang, Kwang-Il;Choi, Duk-In;Kim, Da-Hye;Yang, Ing-Chan;Yoon, So-Ra
    • Proceedings of the Korean Institute of Navigation and Port Research Conference
    • /
    • 2010.04a
    • /
    • pp.191-192
    • /
    • 2010
  • This study classifies all meshes in Busan Metropolitan City based on building needs. There are 3,289 effective meshes, all of which are entered into a database using 7 simplified building needs. Areas for residential, commercial, and educational needs account for 92.4% of the total area. To reduce the number of variables, principal component analysis is performed before the cluster analysis. As a result, 5 classes are obtained.
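
The step of reducing several variables with principal component analysis before clustering can be sketched as below. This is only an illustration under stated assumptions: the function name, the use of power iteration to find the first component, and the four toy meshes with three building-need shares are hypothetical and do not come from the study.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (direction, scores) of the first principal component.

    Uses power iteration on the sample covariance matrix. The start
    vector is the first coordinate axis, which is an assumption: it
    fails if that axis is orthogonal to the leading eigenvector.
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    v = [1.0] + [0.0] * (d - 1)
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    # Project each centered row onto the component direction.
    scores = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, scores


# Hypothetical area shares per mesh: [residential, commercial, educational].
meshes = [
    [0.8, 0.1, 0.1],
    [0.7, 0.2, 0.1],
    [0.2, 0.7, 0.1],
    [0.1, 0.8, 0.1],
]
direction, scores = first_principal_component(meshes)
```

In this toy data the first component contrasts residential against commercial share, so the one-dimensional scores already separate the two kinds of mesh. A cluster analysis would then run on such scores instead of the raw variables.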


Classification of Middle Aged Women's Breast Shapes Using 3D Body Measurement Data (3차원 인체 측정치들을 이용한 중년 여성의 유방 형태에 따른 유형)

  • Lee, Hyun-Young;Hong, Kyung-Hee
    • Journal of the Korean Society of Clothing and Textiles
    • /
    • v.34 no.3
    • /
    • pp.385-392
    • /
    • 2010
  • The breast types of middle-aged women of 80A (formerly 80B) size were classified using 3D scans of the nude body. Thirty-seven measurements, including the radius of curvature, surface area, volume, surface length, and breast displacements, were used as input variables. We extracted five main factors through factor analysis of the measurements and classified 36 subjects into 3 clusters through cluster analysis. The factor analysis identified the size of the breast, breast sag, the curvature of the inner and outer breast curves, the width of the breast, and the nipple direction as the main factors. In the classification of breast types, Cluster 1 was characterized by narrow breast width and an asymmetrical under-breast curve, whereas Cluster 2 had a wide and sagging shape. Cluster 3 was characterized by large breast volume and a symmetrical under-breast curve. The results are useful for developing high-quality brassieres that reflect the 3D characteristics of the breast types of middle-aged women.

Voice Personality Transformation Using a Multiple Response Classification and Regression Tree (다중 응답 분류회귀트리를 이용한 음성 개성 변환)

  • 이기승
    • The Journal of the Acoustical Society of Korea
    • /
    • v.23 no.3
    • /
    • pp.253-261
    • /
    • 2004
  • In this paper, a new voice personality transformation method is proposed, which modifies speaker-dependent feature variables in the speech signals. The proposed method takes the cepstrum vectors and pitch as the transformation parameters, which represent the vocal tract transfer function and the excitation signals, respectively. To transform these parameters, a multiple response classification and regression tree (MR-CART) is employed. MR-CART is a vector extension of a conventional CART, in which the response is given in vector form. We evaluated the performance of the proposed method by comparing it with a previously proposed codebook mapping method. We also quantitatively analyzed the performance of voice transformation and the complexities according to various observations. In experiments with 4 speakers, the proposed method objectively outperformed the conventional codebook mapping method, and we also observed that the transformed speech sounds closer to the target speech.

Decomposition of category mixture in a pixel and its application for supervised image classification

  • Matsumoto, Masao;Arai, Kohei;Ishimatsu, Takakazu
    • Proceedings of the Institute of Control, Robotics and Systems Conference (제어로봇시스템학회:학술대회논문집)
    • /
    • 1992.10b
    • /
    • pp.514-519
    • /
    • 1992
  • To accurately retrieve the proportion of each category within mixed pixels (Mixels) of remotely sensed imagery, a maximum likelihood estimation method for category proportion is proposed. In this method, the observed multispectral vector is treated as a set of random variables, with the approximation that the supervised data of each category can be characterized by a normal distribution. The results show that this method can retrieve an accurate proportion of each category within Mixels. An index that can estimate the degree of error in each category is also proposed. As one application of the proportion estimation, a method for image classification based on category proportion estimation is proposed. In this method, all pixels in a remotely sensed image are assumed to be Mixels and are classified into the most dominant category. Among the Mixels there are unreliable pixels that should be categorized as unclassified. To discriminate them, two types of criteria, Chi square and AIC, are proposed for a fitness test of the pure pixel hypothesis. Experimental results with a simulated dataset show the usefulness of the proposed classification criterion compared with the conventional maximum likelihood criterion, and the applicability of the fitness tests based on Chi square and AIC.


Statistical Approach to Noisy Band Removal for Enhancement of HIRIS Image Classification

  • Huan, Nguyen Van;Kim, Hak-Il
    • Proceedings of the KSRS Conference
    • /
    • 2008.03a
    • /
    • pp.195-200
    • /
    • 2008
  • The accuracy of classifying pixels in HIRIS images is usually degraded by noisy bands, since noisy bands may deform the typical shape of spectral reflectance. This paper proposes a statistical method for noisy band removal that mainly makes use of the correlation coefficients between bands. Considering each band as a random variable, the correlation coefficient measures the strength and direction of a linear relationship between two random variables. While the correlation between two signal bands is high, the presence of a noisy band produces a low correlation due to ill-correlativeness and undirectedness. The correlation coefficient is applied as a measure for detecting noisy bands under a two-pass screening scheme. The method does not depend on prior knowledge of the sensor or of the cause of the noise. The classification in this experiment uses the unsupervised k-nearest neighbor algorithm together with the well-accepted Euclidean distance measure and the spectral angle mapper measure. This paper also proposes a hierarchical combination of these measures for spectral matching. Finally, a separability assessment based on the between-class and within-class scatter matrices is carried out to evaluate the performance.
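
The two measures mentioned in this abstract can be illustrated with a minimal sketch: the Pearson correlation coefficient between bands and the spectral angle between two spectra. The abstract does not describe the two-pass screening scheme in detail, so the single-pass neighbour rule, the 0.5 threshold, the function names, and the toy band values below are all assumptions and not the authors' method.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def flag_noisy_bands(bands, threshold=0.5):
    """Flag a band when it correlates weakly with every adjacent band.

    Assumed single-pass rule. A clean band at the edge whose only
    neighbour is noisy would be flagged too, which is one reason a
    second pass can be useful.
    """
    flagged = []
    for i, band in enumerate(bands):
        neighbours = [bands[j] for j in (i - 1, i + 1) if 0 <= j < len(bands)]
        if neighbours and all(
            abs(pearson(band, nb)) < threshold for nb in neighbours
        ):
            flagged.append(i)
    return flagged


def spectral_angle(s, r):
    """Angle in radians between two spectra treated as vectors."""
    dot = sum(a * b for a, b in zip(s, r))
    ns = math.sqrt(sum(a * a for a in s))
    nr = math.sqrt(sum(b * b for b in r))
    # Clamp to [-1, 1] to guard against rounding before acos.
    return math.acos(max(-1.0, min(1.0, dot / (ns * nr))))


# Hypothetical values of six pixels in five bands; band 2 is noisy.
bands = [
    [10, 12, 15, 20, 26, 30],
    [11, 13, 17, 21, 27, 33],
    [5, 40, 3, 38, 7, 35],
    [12, 14, 18, 23, 28, 34],
    [13, 16, 19, 24, 30, 36],
]
noisy = flag_noisy_bands(bands)
```

The spectral angle ignores overall brightness: a spectrum and a scaled copy of it have an angle of approximately zero, which is why it complements the Euclidean distance in spectral matching.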


Classification and Characteristic analysis of Mountain Village Landscape Using Cluster Analysis (군집분석을 이용한 산촌경관 유형 구분 및 특성 분석)

  • Ko, Arang;Lim, Jungwoo;Kim, Seong Hak
    • Journal of Korean Society of Rural Planning
    • /
    • v.26 no.1
    • /
    • pp.101-112
    • /
    • 2020
  • Public awareness of mountain village landscapes has recently been increasing. This study therefore aimed to provide standards for the conservation, management, and creation of mountain village landscapes by characterizing and classifying existing ones. Data on 286 mountain villages were collected, and 19 variables, extracted from GIS spatial information and statistical data on mountain villages and selected as appropriate sources according to previous studies, were used to conduct factor and cluster analysis. As a result of the factor analysis, 7 characteristics of the mountain villages' landscapes were defined - 'Location', 'Cultivation', 'Ecology·Nature', 'Tourism', 'Residence', 'Recreation'. The K-means cluster analysis categorized the mountain villages' landscapes into four types - 'Residential', 'Touristic', 'General', 'Environmentally protected'. Field assessment confirmed that the classification was appropriate, and basic guidelines for mountain village landscape management were set. The results of this study are expected to be used in future planning and implementation for mountain village landscapes.
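
The K-means step named in this abstract can be sketched in a few lines. The example below is a plain Lloyd's-algorithm loop with fixed starting centroids. The function name, the two-dimensional toy scores, and the choice of two clusters (the study reports four types) are assumptions made to keep the illustration short.

```python
import math


def kmeans(points, centroids, iterations=100):
    """Assign points to the nearest centroid, then update the centroids.

    Stops when the assignment no longer changes. Starting centroids are
    passed in so that the result is deterministic.
    """
    centroids = [list(c) for c in centroids]
    labels = None
    for _ in range(iterations):
        new_labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        for k in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:  # keep the old centroid if a cluster is empty
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical factor scores of six villages on two factors.
villages = [
    (0.10, 0.20), (0.20, 0.10), (0.15, 0.15),
    (0.90, 0.80), (0.80, 0.90), (0.85, 0.85),
]
labels, centers = kmeans(villages, centroids=[villages[0], villages[3]])
```

In practice the input would be the factor scores from the preceding factor analysis, and the number of clusters would be chosen by comparing several candidate solutions.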

An Analysis of Teacher's Perceptions on School Organizational Culture in Secondary School (중등학교 교사의 학교조직문화에 대한 인식 분석)

  • Won, Hyo-Heon;Choi, Dong-Kyu
    • Journal of Fisheries and Marine Sciences Education
    • /
    • v.25 no.1
    • /
    • pp.246-259
    • /
    • 2013
  • The principal purpose of this study is to analyze school organizational culture in secondary schools in Busan. The study measures background variables such as gender, teaching experience, classification of school, grade of school, and scale of school. The results are as follows. First, regarding differences in the perception of organizational culture by gender, female teachers have a stronger sense of professionalism, community spirit, and consideration than male teachers. Second, regarding differences by teaching experience, teachers with more than 21 years of teaching experience have a more positive perception of decision-making and consideration than those with 11~20 years of teaching experience. Third, regarding differences by classification of school, public schools have a more positive perception of every item, namely professionalism, decision-making, community spirit, and consideration, than private schools. Fourth, regarding differences in terms of classification of schools, secondary schools have a more positive perception of professionalism and community spirit than high schools. Lastly, regarding differences by scale of school, schools with 13~35 classes have a more positive perception of professionalism than others.