• Title/Summary/Keyword: Feature evaluation

A Comparative Experiment on Dimensional Reduction Methods Applicable for Dissimilarity-Based Classifications (비유사도-기반 분류를 위한 차원 축소방법의 비교 실험)

  • Kim, Sang-Woon
    • Journal of the Institute of Electronics and Information Engineers / v.53 no.3 / pp.59-66 / 2016
  • This paper presents an empirical evaluation of dimensionality reduction strategies by which dissimilarity-based classifications (DBC) can be implemented efficiently. In DBC, classification is based not on feature measurements of individual objects (a set of attributes), but on a suitable dissimilarity measure among the objects (pairwise object comparisons). One problem of DBC is the high dimensionality of the dissimilarity space when many objects are treated. To address this issue, two kinds of solutions have been proposed in the literature: prototype selection (PS)-based methods and dimension reduction (DR)-based methods. In this paper, instead of utilizing the PS-based or DR-based methods, a way of performing DBC in eigenspaces (ES) is considered and empirically compared. In ES-based DBC, classification is performed as follows: first, a set of principal eigenvectors is extracted from the training data set using principal component analysis; second, an eigenspace is spanned by a selected subset of the extracted eigenvectors; third, distances among the objects projected into the eigenspace are measured using $l_p$-norms as the dissimilarity, and classification is performed. The experimental results, obtained using the nearest neighbor rule on artificial and real-life benchmark data sets, demonstrate that when the dimensionality of the eigenspace is selected appropriately, ES-based DBC can achieve higher classification accuracy than the PS-based and DR-based methods.
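The three steps described in this abstract (principal eigenvectors, projection into an eigenspace, $l_p$-norm dissimilarity with the nearest neighbor rule) can be sketched roughly as follows. This is a minimal illustration and not the paper's implementation: the function names, the power-iteration eigen-solver, and the toy data are all assumptions made for the example.

```python
import math

def mean_center(rows):
    """Return (centered rows, column means)."""
    n, d = len(rows), len(rows[0])
    mu = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - mu[j] for j in range(d)] for r in rows], mu

def covariance(centered):
    """Sample covariance matrix of mean-centered rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]

def matvec(a, v):
    return [sum(x * y for x, y in zip(row, v)) for row in a]

def top_eigenvectors(cov, k, iters=500):
    """Leading k eigenvectors by power iteration with deflation.
    Assumes the fixed start vector is not orthogonal to the target eigenvector."""
    d = len(cov)
    a = [row[:] for row in cov]
    vectors = []
    for _ in range(k):
        v = [1.0 / (i + 1) for i in range(d)]
        s = math.sqrt(sum(x * x for x in v))
        v = [x / s for x in v]
        for _ in range(iters):
            w = matvec(a, v)
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:  # remaining matrix is numerically zero
                break
            v = [x / norm for x in w]
        lam = sum(x * y for x, y in zip(v, matvec(a, v)))
        vectors.append(v)
        # Deflate: remove the component just found before the next pass.
        a = [[a[i][j] - lam * v[i] * v[j] for j in range(d)] for i in range(d)]
    return vectors

def project(x, mu, vectors):
    """Coordinates of x in the eigenspace spanned by `vectors`."""
    return [sum((xj - mj) * vj for xj, mj, vj in zip(x, mu, v)) for v in vectors]

def lp_distance(a, b, p=2.0):
    """l_p-norm of a - b, used here as the dissimilarity."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)

def fit(train, k):
    centered, mu = mean_center(train)
    vectors = top_eigenvectors(covariance(centered), k)
    return mu, vectors, [project(x, mu, vectors) for x in train]

def classify_1nn(x, model, labels, p=2.0):
    """Nearest neighbor rule on l_p distances in the eigenspace."""
    mu, vectors, projected = model
    z = project(x, mu, vectors)
    best = min(range(len(projected)), key=lambda i: lp_distance(z, projected[i], p))
    return labels[best]

# Hypothetical toy data: two classes separated along the first axis.
train = [[0.0, 0.1, 0.0], [0.2, -0.1, 0.1], [-0.1, 0.0, -0.1],
         [5.0, 0.1, 0.0], [5.2, -0.1, 0.1], [4.9, 0.0, -0.1]]
labels = [0, 0, 0, 1, 1, 1]
model = fit(train, k=2)
```

Here `k` plays the role of the eigenspace dimensionality that the abstract says must be selected appropriately, and `p` selects the norm.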

Evaluation and Intercomparisons of the Estimated TOVS Precipitable Waters for the Tropical Plume (Tropical Plume 에 대한 TOVS 추정 가강수량의 평가와 상호비교)

  • 정효상;신동인
    • Korean Journal of Remote Sensing / v.9 no.2 / pp.51-69 / 1993
  • Precipitable water (PW) is retrieved over the tropical and subtropical Pacific Ocean from TOVS infrared and microwave channel brightness temperatures and OLR observations by means of stepwise linear regression. The retrieved TOVS PW fields generated by PW$_{sfc}$ (71.1% of the variance and 0.62 g cm$^{-2}$ standard error over the surface) and PW$_{700-500}$ (71.7% and 0.17 g cm$^{-2}$ over the 700-500 hPa layer) revealed more evolving synoptic signals over the tropical and subtropical Pacific Ocean. PW$_{sfc}$ does not show the tropical plume (TP) feature clearly, because it represents lower PW for high-level clouds not associated with deep convection. It is difficult to trace the TP on the PW$_{sfc}$ field unless supplementary information is provided. The ECMWF analysis has a general tendency to dry the subtropics and moisten the ITCZ (Intertropical Convergence Zone) and SPCZ (South Pacific Convergence Zone). However, although the ECMWF analysis is fairly successful in capturing mean patterns, it is unsuccessful in following active synoptic signals such as a tropical plume. Similarly, SMMR-PW does not represent well the TP, which consists of high- and middle-level clouds, while PW$_{sfc}$ underestimates the moisture of the TP and does not depict a significant TP signal. In the PW field derived from microwave observations, the TP cannot be recognized well. Furthermore, the signature of PW$_{sfc}$ differed from that of OLR for the TP, which implies the presence of thin high- and middle-layer clouds, but was in closer agreement for deep, active convection areas containing thick middle- and lower-layer clouds, although OLR represented cloudiness in the tropics well. In synoptically active regions, it differed from the OLR analysis, primarily because of actual differences in water vapor and cloud features.

Evaluation of International Quality Control Procedures for Detecting Outliers in Water Temperature Time-series at Ieodo Ocean Research Station (이어도 해양과학기지 수온 시계열 자료의 이상값 검출을 위한 국제 품질검사의 성능 평가)

  • Min, Yongchim;Jun, Hyunjung;Jeong, Jin-Yong;Park, Sung-Hwan;Lee, Jaeik;Jeong, Jeongmin;Min, Inki;Kim, Yong Sun
    • Ocean and Polar Research / v.43 no.4 / pp.229-243 / 2021
  • Quality control (QC) of observed time series has become more critical as the types and amount of observed data have increased along with the development of ocean observing sensors and communication technology. International ocean observing institutions have developed and operated automatic QC procedures for these observed time series. In this study, the performance of the automated QC procedures proposed by U.S. IOOS (Integrated Ocean Observing System), NDBC (National Data Buoy Center), and OOI (Ocean Observatories Initiative) was evaluated with a confusion matrix for observed time series, particularly from the Yellow and East China Seas. We focused on detecting additive outliers (AO) and temporary change outliers (TCO) in ocean temperature observations from the Ieodo Ocean Research Station (I-ORS) in 2013. Our results show that the IOOS variability check tends to classify normal data as AO or TCO. The NDBC variability check tracks outliers well but also tends to classify a lot of normal data as abnormal, particularly in rapidly fluctuating time series. The OOI procedure seems to detect AO and TCO most effectively, and its rate of classifying normal data as abnormal is the lowest among the international checks. However, all three checks need additional scrutiny: they often fail to detect outliers when observations are intermittent or when outliers result from systematic errors, and they tend to classify normal data as outliers when the observed data change abruptly because a sensor is located within a sharp boundary between two water masses, which is a common feature of shallow-water observations. Therefore, this study underlines the necessity of developing a new QC algorithm for time series from shallow seas.
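As a rough illustration of how a confusion matrix scores an outlier check, the sketch below compares the flags from a simple step-change test against hand-labelled truth. The check `flag_by_step`, its threshold, and the temperature values are hypothetical and are not the IOOS, NDBC, or OOI procedures; only the scoring idea follows the abstract.

```python
def flag_by_step(values, max_step):
    """Hypothetical variability check: flag a sample when it differs from
    the previous sample by more than max_step. The first sample is never flagged."""
    flags = [False]
    for prev, cur in zip(values, values[1:]):
        flags.append(abs(cur - prev) > max_step)
    return flags

def confusion_counts(truth, flagged):
    """Count true/false positives and negatives (True means outlier)."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for is_outlier, is_flagged in zip(truth, flagged):
        if is_outlier and is_flagged:
            counts["tp"] += 1
        elif is_flagged:
            counts["fp"] += 1
        elif is_outlier:
            counts["fn"] += 1
        else:
            counts["tn"] += 1
    return counts

def detection_and_false_alarm_rate(c):
    """Fraction of outliers caught, and fraction of normal data flagged."""
    detection = c["tp"] / (c["tp"] + c["fn"])
    false_alarm = c["fp"] / (c["fp"] + c["tn"])
    return detection, false_alarm

# Made-up water temperatures (deg C): one spike at index 2, then a genuine
# drop at index 5, as when a sensor crosses a water-mass boundary.
temps = [20.0, 20.1, 25.0, 20.2, 20.1, 18.0, 18.1, 18.0]
truth = [False, False, True, False, False, False, False, False]

counts = confusion_counts(truth, flag_by_step(temps, max_step=1.0))
detection, false_alarm = detection_and_false_alarm_rate(counts)
```

In this toy series the spike is caught, but the sample after the spike and the genuine drop are also flagged, which mirrors the abstract's point about normal data being classified as outliers near abrupt changes.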

Design and Evaluation of a High-performance Key-value Storage for Industrial IoT Environments (산업용 IoT 환경을 위한 고성능 키-값 저장소의 설계 및 평가)

  • Han, Hyuck
    • The Journal of the Korea Contents Association / v.21 no.7 / pp.127-133 / 2021
  • In industrial IoT environments, sensors generate data about their detection targets and deliver the data to IoT gateways. Managing large amounts of real-time sensor data is therefore an essential feature of IoT gateways, and key-value storage engines are widely used to manage these data. However, the key-value storage engines used in IoT gateways do not take into account the characteristics of sensor data generated in industrial IoT environments, which limits their performance. In this paper, we optimize the key-value storage engine by exploiting the features of sensor data in industrial IoT environments. The proposed optimization technique analyzes the key, which is the input of a key-value storage engine, for further indexing. This reduces excessive write amplification and improves performance. We implement our optimization scheme in LevelDB and evaluate it with the workload of the TPCx-IoT benchmark. The experimental results show that the proposed technique achieves up to 21 times better performance than the existing scheme, which indicates that it can perform high-speed data ingestion in industrial IoT environments.

Development of the conventional crop composition database for new genetically engineered crop safety assessment (새로운 생명공학작물 안전성 평가를 위한 작물 성분 DB 구축)

  • Kim, Eun-Ha;Lee, Seong-Kon;Park, Soo-Yun;Lee, Sang-Gu;Oh, Seon-Woo
    • Journal of Plant Biotechnology / v.45 no.4 / pp.289-298 / 2018
  • The Biosafety Division of the National Academy of Agricultural Science has developed a 'Crop Composition DB' that provides analytical data on commercialized crops. It can be used as a reference in the 'Comparative Evaluation by Compositional Analysis' for the safety assessment of genetically modified (GM) crops. This database provides the composition of crops cultivated in Korea, and thus makes it possible to check the extent of changes in compositional content depending on the cultivation area, variety and year. The database is a compilation of data on the antioxidant, nutrient and secondary metabolite compositions of rice and capsicum grown in two or more cultivation areas for a period of more than two years. Data analysis was conducted under the guidelines of the Association of Official Analytical Chemists or with methods previously reported in papers. The data are provided as average, minimum and maximum values to assess whether the statistical differences between GM crops and comparative non-GM crops fall within the biological differences or tolerances of existing commercial crops. The Crop Composition DB is an open-access source and is easy to access based on the query selected by the user. Moreover, functional ingredients of colored crops, such as potatoes, sweet potatoes and cauliflowers, are provided so that the food information can be used by general consumers. This paper introduces the features and usage of the 'Crop Composition DB', which is a valuable tool for characterizing the composition of conventional crops.
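A small sketch of how average, minimum and maximum reference values might be used in a comparative evaluation is given below. The function names and the protein values are hypothetical and are not taken from the Crop Composition DB; the range test is one simple reading of "falls within the biological differences of existing commercial crops".

```python
from statistics import mean

def summarize(values):
    """Average, minimum and maximum of the reference measurements."""
    return {"mean": mean(values), "min": min(values), "max": max(values)}

def within_reference_range(value, summary):
    """True when a measured value lies inside the conventional-crop range."""
    return summary["min"] <= value <= summary["max"]

# Hypothetical protein content (% dry weight) of conventional rice samples
# from several cultivation areas and years.
conventional = [6.8, 7.4, 7.1, 8.0, 6.5]
reference = summarize(conventional)

gm_in_range = within_reference_range(7.6, reference)
gm_out_of_range = within_reference_range(9.1, reference)
```

A value outside the range would not by itself indicate a safety problem; it would only mark a component that needs closer comparison.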

Evaluation of Classification Performance of Inception V3 Algorithm for Chest X-ray Images of Patients with Cardiomegaly (심장비대증 환자의 흉부 X선 영상에 대한 Inception V3 알고리즘의 분류 성능평가)

  • Jeong, Woo-Yeon;Kim, Jung-Hun;Park, Ji-Eun;Kim, Min-Jeong;Lee, Jong-Min
    • Journal of the Korean Society of Radiology / v.15 no.4 / pp.455-461 / 2021
  • Cardiomegaly is one of the most common diseases seen on chest X-rays, but if it is not detected early, it can cause serious complications. Accordingly, with advances in various fields of science and technology, many recent studies have applied deep learning algorithms to medical image analysis. In this paper, we evaluate whether the Inception V3 deep learning model is useful for classifying cardiomegaly in chest X-ray images. A total of 1026 chest X-ray images from patients diagnosed with a normal heart or with cardiomegaly at Kyungpook National University Hospital were used. In the experiment, the classification accuracy and loss of the Inception V3 deep learning model according to the presence or absence of cardiomegaly were 96.0% and 0.22%, respectively. The results show that the Inception V3 deep learning model performs very well in feature extraction and classification of chest image data. The model is considered useful for the classification of chest diseases, and if similarly good results are obtained in studies using a wider variety of medical image data, it is expected to be of great help to doctors' diagnoses in the future.

The Development of Park Analysis Indicators and Current Status: A Case Study of Daejeon Metropolitan City (공원 분석 지표 개발 및 현황 분석: 대전광역시를 중심으로)

  • Hwang, Jae-Yeon;Gwak, Seung-Yeon;Kim, Sang-Kyu;Park, Min-Ju
    • Land and Housing Review / v.13 no.1 / pp.99-112 / 2022
  • Securing urban parks and enhancing their accessibility is becoming increasingly important because of irrational residential developments and apartment construction. Accordingly, Daejeon Metropolitan City has carried out urban park management projects to improve the quality of parks and create new parks. Daejeon Metropolitan City generates and manages park data by administrative district for management purposes. However, these datasets take different forms in each administrative district. This study integrates the park data in Daejeon, generated by the administrative districts, into a single format and generates geographic information data with the area information of each park for analysis. The analysis results show that urban parks are severely imbalanced across administrative districts, requiring new policy measures. In addition, by normalizing the park analysis results and then ranking them, this study compares them in detail with the actual park information to confirm the soundness of the dataset. The analysis results provide implications for improving the management of urban parks. This study proposes integrated datasets, and their continued management in each administrative district, that include essential data capturing objective information about the parks along with park evaluation indicators based on previous studies.

A pilot study on the application of environmental DNA to the estimation of the biomass of dominant species in the northwestern waters of Jeju Island (제주도 서북 해역에서의 우점종 생물량 추정에 환경 유전자의 적용에 관한 시범 연구)

  • KANG, Myounghee;PARK, Kyeong-Dong;MIN, Eunbi;LEE, Changheon;KANG, Taejong;OH, Taegeon;LIM, Byeonggwon;HWANG, Doojin;KIM, Byung-Yeob
    • Journal of the Korean Society of Fisheries and Ocean Technology / v.58 no.1 / pp.39-48 / 2022
  • Considerable research using environmental DNA (eDNA) has been performed in the fisheries and oceanography fields on the diversity of biological species, the presence or absence of specific species, and the quantitative evaluation of species. To date, no eDNA study has been attempted in the area of fisheries acoustics in Korea. In this study, the biomass of a dominant species in the northwestern waters of Jeju Island was examined using 1) the catch ratio of the species from trawl survey results and 2) the ranking ratio of the species from the eDNA results. The dominant species was Zoarces gillii; its trawl catch ratio was 68.2% and its eDNA ratio was 81.3%. The Zoarces gillii biomass from the two methods was 7199.4 tons (trawl) and 8584.6 tons (eDNA), respectively. The mean and standard deviation of the acoustic backscattering strength values (120 kHz) over the entire survey area were 135.5 and 157.7 m$^2$/nm$^2$, respectively. The strongest echo signal occurred at latitude 34° and longitude 126°15' (northwest of Jeju Island). High echo signals were observed in a specific oceanographic feature (salinity range of 32-33 psu and water temperature range of 19-20℃). This was a pilot study on the quantitative evaluation of aquatic resources by applying the eDNA technique to the acoustic-trawl survey method. Points to be considered for high-quality quantitative estimation when applying eDNA to fisheries acoustics are discussed.
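The two biomass figures in this abstract can be related by simple arithmetic if one assumes that the species biomass equals the species ratio multiplied by a total biomass estimate. The abstract does not state this relation explicitly, so the sketch below is only a consistency check under that assumption; the function name is illustrative.

```python
def implied_total(species_biomass_tons, species_ratio):
    """Total biomass implied by species_biomass = ratio * total (assumed relation)."""
    return species_biomass_tons / species_ratio

# Figures quoted in the abstract for Zoarces gillii.
trawl_total = implied_total(7199.4, 0.682)  # trawl catch ratio 68.2%
edna_total = implied_total(8584.6, 0.813)   # eDNA ratio 81.3%
```

Under this assumption both methods imply nearly the same total, about 10,556 tons from the trawl ratio and about 10,559 tons from the eDNA ratio, so the difference between the two species estimates would come from the ratios alone.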

Contrast Media Side Effects Prediction Study using Artificial Intelligence Technique (인공지능 기법을 이용한 조영제 부작용 예측 연구)

  • Sang-Hyun Kim
    • Journal of the Korean Society of Radiology / v.17 no.3 / pp.423-431 / 2023
  • The purpose of this study is to analyze, using artificial intelligence techniques, the factors that affect the classification of the severity of contrast media side effects based on patients' body information, so that the results can serve as basic data for reducing the severity of such side effects. The data comprised 606 examinees who had reported no contrast medium side effects in their past-history survey, drawn from 1,235 cases of contrast medium side effects among 58,000 CT scans performed at a general hospital in Seoul. Of the 606 records, 70% were used as a training set and the remaining 30% as a test set for validation. Age, BMI (Body Mass Index), GFR (Glomerular Filtration Rate), BUN (Blood Urea Nitrogen), GGT (Gamma-Glutamyl Transferase), AST (Aspartate Aminotransferase), and ALT (Alanine Aminotransferase) were used as independent variables, and contrast media side effect severity was used as the target variable. AUC (Area Under the Curve), CA (Classification Accuracy), F1, Precision, and Recall were computed for the AdaBoost, Tree, Neural Network, SVM, and Random Forest algorithms. AdaBoost and Random Forest showed the highest evaluation indices among the classification algorithms. The most influential factors in the predictions of all models were GFR, BMI, and GGT. The results indicate that differences in the amount of contrast media injected according to renal filtration function and obesity, and the presence or absence of metabolic syndrome, affected the severity of contrast medium side effects.
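The 70/30 split and three of the reported indices (Precision, Recall, F1) can be sketched as below. This is a generic illustration: the shuffling seed, the function names, and the severity labels are assumptions, and the abstract does not say how the split or the per-class scoring was actually done.

```python
import random

def train_test_split(items, train_fraction=0.7, seed=0):
    """Shuffle a copy of items and cut it into training and test parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

def precision_recall_f1(truth, predicted, positive):
    """One-vs-rest precision, recall and F1 for the class `positive`."""
    tp = sum(t == positive and p == positive for t, p in zip(truth, predicted))
    fp = sum(t != positive and p == positive for t, p in zip(truth, predicted))
    fn = sum(t == positive and p != positive for t, p in zip(truth, predicted))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# 606 record indices, as in the abstract; the records themselves are not shown.
train_ids, test_ids = train_test_split(range(606))

# Hypothetical severity labels for five test cases.
truth = ["mild", "mild", "severe", "severe", "mild"]
predicted = ["mild", "severe", "severe", "severe", "mild"]
precision, recall, f1 = precision_recall_f1(truth, predicted, positive="severe")
```

With 606 records this split gives 424 training and 182 test records.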

Derivation of Engineered Barrier System (EBS) Degradation Mechanism and Its Importance in the Early Phase of the Deep Geological Repository for High-Level Radioactive Waste (HLW) through Analysis on the Long-Term Evolution Characteristics in the Finnish Case (핀란드 고준위방폐물 심층처분장 장기진화 특성 분석을 통한 폐쇄 초기단계 공학적방벽 성능저하 메커니즘 및 중요도 도출)

  • Sukhoon Kim;Jeong-Hwan Lee
    • The Journal of Engineering Geology / v.33 no.4 / pp.725-736 / 2023
  • The compliance of deep geological disposal facilities for high-level radioactive waste with safety objectives requires consideration of uncertainties owing to temporal changes in the disposal system. A comprehensive review and analysis of the characteristics of this evolution should be undertaken to identify the effects on multiple barriers and the biosphere. We analyzed the evolution of the buffer, backfill, plug, and closure regions during the early phase of the post-closure period as part of a long-term performance assessment for an operating license application for a deep geological repository in Finland. Degradation mechanisms generally expected in engineered barriers were considered, and long-term evolution features were examined for use in performance assessments. The importance of evolution features was classified into six categories based on the design of the Finnish case. Results are expected to be useful as a technical basis for performance and safety assessment in developing the Korean deep geological disposal system for high-level radioactive waste. However, for a more detailed review and evaluation of each feature, it is necessary to obtain data for the final disposal site and facility-specific design, and to assess its impact in advance.