• Title/Summary/Keyword: Science and technology classification

Machine learning application to seismic site classification prediction model using Horizontal-to-Vertical Spectral Ratio (HVSR) of strong-ground motions

  • Francis G. Phi;Bumsu Cho;Jungeun Kim;Hyungik Cho;Yun Wook Choo;Dookie Kim;Inhi Kim
    • Geomechanics and Engineering / v.37 no.6 / pp.539-554 / 2024
  • This study explores the development of a prediction model for seismic site classification by integrating machine learning techniques with the horizontal-to-vertical spectral ratio (HVSR) method. To improve model accuracy, the research applies outlier detection methods and the synthetic minority over-sampling technique (SMOTE) for data balancing, and evaluates seven machine learning models on seismic data from KiK-net. Notably, the light gradient boosting method (LGBM), gradient boosting, and decision tree models perform better when coupled with SMOTE, while the multiple linear regression (MLR) and support vector machine (SVM) models show reduced efficacy. Outlier detection techniques significantly enhance accuracy, particularly for LGBM, gradient boosting, and voting boosting. The combination of LGBM with the isolation forest and SMOTE achieves the highest accuracy of 0.91, with LGBM and local outlier factor yielding the highest F1-score of 0.79. Consistently outperforming the other models, LGBM proves most efficient for seismic site classification when supported by appropriate preprocessing procedures. These findings show the significance of outlier detection and data balancing for precise seismic soil classification prediction, and highlight the potential of machine learning for improving site classification accuracy.
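The data-balancing step named in the abstract can be illustrated with a minimal sketch of SMOTE-style interpolation. This is not the authors' pipeline: the function name `smote`, its parameters, and the toy feature vectors are assumptions made for illustration, and a real study would more likely rely on an established library implementation.

```python
import math
import random


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority-class neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding itself)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Made-up two-feature vectors standing in for an under-represented site class.
minority = [(1.0, 2.0), (1.2, 1.9), (0.9, 2.2), (1.1, 2.1)]
new_points = smote(minority, n_new=6)
print(len(new_points))
```

Every synthetic point lies on a line segment between two real minority samples, so the over-sampled class stays inside the region the original samples already occupy.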

A study on extraction of optimized API sequence length and combination for efficient malware classification (효율적인 악성코드 분류를 위한 최적의 API 시퀀스 길이 및 조합 도출에 관한 연구)

  • Choi, Ji-Yeon;Kim, HeeSeok;Kim, Kyu-Il;Park, Hark-Soo;Song, Jung-Suk
    • Journal of the Korea Institute of Information Security & Cryptology / v.24 no.5 / pp.897-909 / 2014
  • With the development of the Internet, the number of cyber threats is continuously increasing, and their techniques are also evolving to attack crucial systems. Since attackers can easily produce exploit code, i.e., malware, using dedicated generation tools, the amount of malware is rapidly increasing. However, it is not easy to analyze all malware because there is so much of it. For this reason, many researchers have proposed malware classification methods that aim to distinguish unforeseen malware from well-known malware. Existing malware classification methods use malicious information obtained from static and dynamic malware analysis as the criterion for calculating the similarity between malware samples. Also, most of them use API functions and their sequences, divided into segments of a certain length. Thus, the accuracy of malware classification depends heavily on the length of the divided API sequences. In this paper, we propose a method for extracting the optimized API sequence length and combination that can be used to improve the performance of malware classification.
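To make the dependence on sequence length concrete, the following is a small sketch that cuts two API call traces into windows of length n and compares them with Jaccard similarity. The abstract does not state which similarity measure the paper uses, so Jaccard is an assumption here, as are the function names and the two made-up traces.

```python
def api_ngrams(calls, n):
    """Return the set of length-n windows over a sequence of API call names."""
    return {tuple(calls[i:i + n]) for i in range(len(calls) - n + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Made-up traces for two hypothetical samples.
sample_a = ["CreateFileW", "WriteFile", "CloseHandle", "RegSetValueExW"]
sample_b = ["CreateFileW", "WriteFile", "CloseHandle", "CreateProcessW"]

similarity_by_length = {
    n: jaccard(api_ngrams(sample_a, n), api_ngrams(sample_b, n))
    for n in (1, 2, 3)
}
for n, sim in similarity_by_length.items():
    print(n, round(sim, 3))
```

On these two traces the similarity falls from 0.6 at n = 1 to 0.5 at n = 2 and about 0.333 at n = 3, which shows how the chosen window length alone changes how alike two samples appear.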

Improving Methods for Resources Selection and Classification Practice of Major Korean Directories (국내 주요 검색 포털의 디렉터리 서비스 정보자원 선정 및 분류작업 개선방안)

  • Kim, Sung-Won
    • Journal of Information Management / v.36 no.4 / pp.91-115 / 2005
  • While the amount of information exchanged through the internet has increased dramatically in recent years, certain inefficiencies still exist in the storage, distribution, and retrieval of information. To improve the efficiency of access to information, many search portals provide directory services that offer organized guidance to information based on classification schemes. This study examines the classification activities practiced by the major search portals in Korea and makes some suggestions for improving the quality of directory services.

Classification Index and Grade Levels for Energy Efficiency Classification of Agricultural Heaters in Korea

  • Shin, Chang Seop;Jang, Ji Hoon;Kim, Young Tae;Kim, Kyeong Uk
    • Journal of Biosystems Engineering / v.38 no.4 / pp.264-269 / 2013
  • Purpose: This study was carried out to develop a classification index and grade levels for rating the energy efficiency of agricultural heaters. Methods: The classification index was developed mainly with simplicity of calculation and easy access to relevant data in mind. The grade levels were developed on the basis of a 5-grade classification system in which graded heaters are normally distributed over the grades. The values of each grade level were determined in terms of the classification index values calculated from the published performance data of agricultural heaters tested at the FACT in Korea over the past 12 years. Results: The thermal efficiency of agricultural heaters based on the enthalpy method was proposed as a reasonable classification index. The grade levels were proposed in equation form for three types of agricultural heaters: fossil fuel heaters, wood pellet heaters and wood pellet boilers. A reasonable energy efficiency classification of agricultural heaters could be performed using the proposed classification index and grade levels. Conclusions: It is expected that energy saving programs will be extended to agricultural machines in the near future. The classification index and grade levels for rating the energy efficiency of agricultural heaters were developed and proposed in anticipation of that extension.

Estimated Soft Information based Most Probable Classification Scheme for Sorting Metal Scraps with Laser-induced Breakdown Spectroscopy (레이저유도 플라즈마 분광법을 이용한 폐금속 분류를 위한 추정 연성정보 기반의 최빈 분류 기술)

  • Kim, Eden;Jang, Hyemin;Shin, Sungho;Jeong, Sungho;Hwang, Euiseok
    • Resources Recycling / v.27 no.1 / pp.84-91 / 2018
  • In this study, a novel soft-information-based most probable classification scheme is proposed for sorting recyclable metal alloys with laser-induced breakdown spectroscopy (LIBS). Regression analysis of LIBS-captured spectra to estimate the concentrations of common elements can be efficient for classifying unknown arbitrary metal alloys, even when a particular alloy is not included in training. Therefore, partial least squares regression (PLSR) is employed in the proposed scheme, with the spectra of certified reference materials (CRMs) used for training. With the PLSR model, the concentrations for the test spectrum are estimated independently and compared to those of the CRMs to find the most probable class. Joint soft information can then be obtained by assuming a multivariate normal (MVN) distribution, which makes it possible to account for the probability measure or a priori information and improves classification performance. To evaluate the proposed scheme, MVN soft information is computed from the PLSR of LIBS-captured spectra of 9 metal CRMs and tested on classifying unknown metal alloys. Furthermore, the likelihood is displayed on a radar chart to effectively visualize and search for the most probable class among the candidates. In leave-one-out cross-validation tests, the proposed scheme not only shows improved classification accuracy but also helps adaptive post-processing correct misclassifications.
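The idea of picking the most probable class from estimated concentrations under a multivariate normal assumption can be sketched as follows. This is a simplified illustration, not the paper's method: it assumes a diagonal covariance, skips the PLSR step entirely, and uses hypothetical class names and made-up concentration values.

```python
import math


def log_mvn_diag(x, mean, var):
    """Log-density of a multivariate normal with diagonal covariance."""
    return sum(
        -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def most_probable_class(x, classes, priors=None):
    """Return the class with the highest log-posterior and all scores.

    classes maps a name to (mean vector, variance vector); priors is an
    optional mapping of name to prior probability (uniform if omitted).
    """
    scores = {}
    for name, (mean, var) in classes.items():
        prior = priors[name] if priors else 1.0 / len(classes)
        scores[name] = log_mvn_diag(x, mean, var) + math.log(prior)
    return max(scores, key=scores.get), scores


# Made-up reference concentrations (percent) for two hypothetical alloys.
classes = {
    "alloy_A": ([70.0, 30.0], [4.0, 4.0]),
    "alloy_B": ([90.0, 10.0], [4.0, 4.0]),
}
estimated = [72.0, 27.0]  # stand-in for concentrations estimated by regression
best, scores = most_probable_class(estimated, classes)
print(best)
```

Keeping the per-class scores rather than only the winner is what makes the information "soft": a close runner-up can be flagged for the kind of post-processing the abstract mentions.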

Missing Value Imputation based on Locally Linear Reconstruction for Improving Classification Performance (분류 성능 향상을 위한 지역적 선형 재구축 기반 결측치 대치)

  • Kang, Pilsung
    • Journal of Korean Institute of Industrial Engineers / v.38 no.4 / pp.276-284 / 2012
  • Classification algorithms generally assume that the data are complete. However, missing values are common in real data sets for various reasons. In this paper, we propose using locally linear reconstruction (LLR) for missing value imputation to improve classification performance when missing values exist. We first investigate how much missing values degrade classification performance at various missing ratios. Then, we compare the proposed LLR imputation with three well-known single imputation methods over three different classifiers using eight data sets. The experimental results showed that (1) every imputation method, including the very simple ones, helped to improve classification accuracy; (2) among the imputation methods, the proposed LLR imputation was the most effective over all missing ratios; and (3) when the missing ratio was relatively high, LLR was outstanding, and its classification accuracy was as high as that obtained from the complete data set.
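One way to read imputation by locally linear reconstruction is sketched below: the incomplete row is reconstructed from its k nearest complete rows using only the observed features, and the same weights fill the missing entries. The sum-to-one constraint and the regularisation term are assumptions borrowed from locally linear embedding, not details given in the abstract, and all names and data are hypothetical.

```python
import math


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def llr_impute(row, complete_rows, k=3, reg=1e-3):
    """Fill None entries of row from a weighted mix of its k nearest rows."""
    obs = [i for i, v in enumerate(row) if v is not None]
    mis = [i for i, v in enumerate(row) if v is None]
    x = [row[i] for i in obs]
    # Nearest neighbours are measured on the observed features only.
    nbrs = sorted(
        complete_rows, key=lambda r: math.dist(x, [r[i] for i in obs])
    )[:k]
    diffs = [[xv - r[i] for xv, i in zip(x, obs)] for r in nbrs]
    # Local Gram matrix, regularised so the linear system is solvable.
    gram = [[sum(p * q for p, q in zip(di, dj)) for dj in diffs] for di in diffs]
    trace = sum(gram[i][i] for i in range(len(nbrs)))
    for i in range(len(nbrs)):
        gram[i][i] += reg * trace + 1e-12
    w = solve(gram, [1.0] * len(nbrs))
    total = sum(w)
    w = [wi / total for wi in w]  # reconstruction weights sum to one
    filled = list(row)
    for i in mis:
        filled[i] = sum(wi * r[i] for wi, r in zip(w, nbrs))
    return filled


# Made-up complete rows where the third feature equals the sum of the first two.
complete = [(0.0, 0.0, 0.0), (2.0, 0.0, 2.0), (0.0, 2.0, 2.0),
            (2.0, 2.0, 4.0), (5.0, 5.0, 10.0)]
filled = llr_impute([1.0, 0.5, None], complete)
print(round(filled[2], 2))
```

Because the weights are fitted to reproduce the observed features, the imputed value follows the local linear trend of the neighbours instead of their plain average, so here it lands close to 1.5.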

Assessment of the Severity of Coronavirus Disease: Quantitative Computed Tomography Parameters versus Semiquantitative Visual Score

  • Xi Yin;Xiangde Min;Yan Nan;Zhaoyan Feng;Basen Li;Wei Cai;Xiaoqing Xi;Liang Wang
    • Korean Journal of Radiology / v.21 no.8 / pp.998-1006 / 2020
  • Objective: To compare the accuracies of quantitative computed tomography (CT) parameters and semiquantitative visual score in evaluating the clinical classification of severity of coronavirus disease (COVID-19). Materials and Methods: We retrospectively enrolled 187 patients with COVID-19 treated at Tongji Hospital of Tongji Medical College from February 15, 2020, to February 29, 2020. Demographic data, imaging characteristics, and clinical data were collected, and based on the clinical classification of severity, patients were divided into groups 1 (mild) and 2 (severe/critical). A semiquantitative visual score was used to estimate the lesion extent. A three-dimensional slicer was used to precisely quantify the volume and CT value of the lung and lesions. Correlation coefficients of the quantitative CT parameters, semiquantitative visual score, and clinical classification were calculated using Spearman's correlation. A receiver operating characteristic curve was used to compare the accuracies of the quantitative and semiquantitative methods. Results: There were 59 patients in group 1 and 128 patients in group 2. The mean age and sex distribution of the two groups were not significantly different. The lesions were primarily located in the subpleural area. Compared to group 1, group 2 had larger values for all volume-dependent parameters (p < 0.001). The percentage of lesions had the strongest correlation with disease severity, with a correlation coefficient of 0.495. In comparison, the correlation coefficient of the semiquantitative score was 0.349. For classifying the severity of COVID-19, the area under the curve of the percentage of lesions was the highest (0.807; 95% confidence interval, 0.744-0.861; p < 0.001), and that of the quantitative CT parameters was significantly higher than that of the semiquantitative visual score (p = 0.001). Conclusion: The classification accuracy of quantitative CT parameters was significantly superior to that of semiquantitative visual score in terms of evaluating the severity of COVID-19.
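The two statistics reported above, Spearman's correlation and the area under the ROC curve, can be computed from scratch as in the sketch below. The lesion percentages and group labels are made up for illustration and do not come from the study; the function names are likewise hypothetical.

```python
def average_ranks(values):
    """Rank values from 1, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for t in range(i, j + 1):
            ranks[order[t]] = rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


def auc(scores, labels):
    """Probability that a positive case scores higher than a negative one."""
    pos = [s for s, lab in zip(scores, labels) if lab == 1]
    neg = [s for s, lab in zip(scores, labels) if lab == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up values: lesion percentage per patient and group (0 mild, 1 severe).
lesion_pct = [2.0, 5.0, 8.0, 12.0, 20.0, 35.0]
group = [0, 0, 1, 0, 1, 1]
rho = spearman(lesion_pct, group)
area = auc(lesion_pct, group)
print(round(rho, 3), round(area, 3))
```

Both measures depend only on the ordering of the values, which is why they suit a comparison between a continuous CT parameter and an ordinal visual score.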

Leveraging Deep Learning and Farmland Fertility Algorithm for Automated Rice Pest Detection and Classification Model

  • Hussain. A;Balaji Srikaanth. P
    • KSII Transactions on Internet and Information Systems (TIIS) / v.18 no.4 / pp.959-979 / 2024
  • Rice pest identification is essential in modern agriculture for the health of rice crops. As global rice consumption rises, yields and quality must be maintained. Various methodologies have been employed to identify pests, including sensor-based technologies, deep learning, and remote sensing models. Visual inspection by professionals and farmers remains essential, but integrating technology such as satellites, IoT-based sensors, and drones enhances efficiency and accuracy. A computer vision system processes images to detect pests automatically and provides real-time data for proactive and targeted pest management. With this motive in mind, this research presents a novel farmland fertility algorithm with a deep learning-based automated rice pest detection and classification (FFADL-ARPDC) technique. The FFADL-ARPDC approach classifies rice pests from rice plant images. Before processing, FFADL-ARPDC removes noise and enhances contrast using bilateral filtering (BF). Rice crop images are then processed using the NASNetLarge deep learning architecture to extract image features. The FFA is used for hyperparameter tuning to optimise the performance of NASNetLarge, which helps enhance classification performance. Using an Elman recurrent neural network (ERNN), the model categorises 14 types of pests. The FFADL-ARPDC approach is evaluated on a benchmark dataset available in a public repository. With an accuracy of 97.58, the FFADL-ARPDC model exceeds existing pest detection methods.

A Study on NTIS Standard Code and Classification Service Development (NTIS 표준코드 및 분류지원 서비스 개발에 관한 연구)

  • Kim, yun-jeong;Kim, tae-hyun;Lim, chul-su;Kim, jae-soo
    • Proceedings of the Korea Contents Association Conference / 2007.11a / pp.376-380 / 2007
  • The national R&D information of the ministries, which defines the information shared about national R&D projects, has been derived. Of these items, 21 percent are code items, which can provide important standards for classifying information and producing S&T statistics. Therefore, it is necessary to standardize the code items, which are currently defined and managed differently by each specialized research management organization. To this end, the National Science & Technology Information System (NTIS) intends to provide a clear code standard for the national R&D information of the ministries by defining the NTIS Standard Code. In this study, we also describe a classification service for managing, in a unified way, the NTIS Standard Code and the National Standard Science and Technology Classification Codes, which have been used for the survey and analysis of national R&D projects.
