• Title/Summary/Keyword: machine-learning

Search Result 5,432, Processing Time 0.037 seconds

Study on the Seismic Random Noise Attenuation for the Seismic Attribute Analysis (탄성파 속성 분석을 위한 탄성파 자료 무작위 잡음 제거 연구)

  • Jongpil Won;Jungkyun Shin;Jiho Ha;Hyunggu Jun
    • Economic and Environmental Geology
    • /
    • v.57 no.1
    • /
    • pp.51-71
    • /
    • 2024
  • Seismic exploration is one of the widely used geophysical exploration methods with various applications such as resource development, geotechnical investigation, and subsurface monitoring. It is essential for interpreting the geological characteristics of subsurface by providing accurate images of stratum structures. Typically, geological features are interpreted by visually analyzing seismic sections. However, recently, quantitative analysis of seismic data has been extensively researched to accurately extract and interpret target geological features. Seismic attribute analysis can provide quantitative information for geological interpretation based on seismic data. Therefore, it is widely used in various fields, including the analysis of oil and gas reservoirs, investigation of fault and fracture, and assessment of shallow gas distributions. However, seismic attribute analysis is sensitive to noise within the seismic data, thus additional noise attenuation is required to enhance the accuracy of the seismic attribute analysis. In this study, four kinds of seismic noise attenuation methods are applied and compared to mitigate random noise of poststack seismic data and enhance the attribute analysis results. FX deconvolution, DSMF, Noise2Noise, and DnCNN are applied to the Youngil Bay high-resolution seismic data to remove seismic random noise. Energy, sweetness, and similarity attributes are calculated from noise-removed seismic data. Subsequently, the characteristics of each noise attenuation method, noise removal results, and seismic attribute analysis results are qualitatively and quantitatively analyzed. Based on the advantages and disadvantages of each noise attenuation method and the characteristics of each seismic attribute analysis, we propose a suitable noise attenuation method to improve the result of seismic attribute analysis.

Development and Assessment of LSTM Model for Correcting Underestimation of Water Temperature in Korean Marine Heatwave Prediction System (한반도 고수온 예측 시스템의 수온 과소모의 보정을 위한 LSTM 모델 구축 및 예측성 평가)

  • NA KYOUNG IM;HYUNKEUN JIN;GYUNDO PAK;YOUNG-GYU PARK;KYEONG OK KIM;YONGHAN CHOI;YOUNG HO KIM
    • The Sea:JOURNAL OF THE KOREAN SOCIETY OF OCEANOGRAPHY
    • /
    • v.29 no.2
    • /
    • pp.101-115
    • /
    • 2024
  • The ocean heatwave is emerging as a major issue due to global warming, posing a direct threat to marine ecosystems and humanity through decreased food resources and reduced carbon absorption capacity of the oceans. Consequently, the prediction of ocean heatwaves in the vicinity of the Korean Peninsula is becoming increasingly important for marine environmental monitoring and management. In this study, an LSTM model was developed to improve the underestimated prediction of ocean heatwaves caused by the coarse vertical grid system of the Korean Peninsula Ocean Prediction System. Based on the results of ocean heatwave predictions for the Korean Peninsula conducted in 2023, as well as those generated by the LSTM model, the performance of heatwave predictions in the East Sea, Yellow Sea, and South Sea areas surrounding the Korean Peninsula was evaluated. The LSTM model developed in this study significantly improved the prediction performance of sea surface temperatures during periods of temperature increase in all three regions. However, its effectiveness in improving prediction performance during periods of temperature decrease or before temperature rise initiation was limited. This demonstrates the potential of the LSTM model to address the underestimated prediction of ocean heatwaves caused by the coarse vertical grid system during periods of enhanced stratification. It is anticipated that the utility of data-driven artificial intelligence models will expand in the future to improve the prediction performance of dynamical models or even replace them.

Development of a Prediction Model for Personal Thermal Sensation on Logistic Regression Considering Urban Spatial Factors (도시공간적 요인을 고려한 로지스틱 회귀분석 기반 체감더위 예측 모형 개발)

  • Uk-Je SUNG;Hyeong-Min PARK;Jae-Yeon LIM;Yu-Jin SEO;Jeong-Min SON;Jin-Kyu MIN;Jeong-Hee EUM
    • Journal of the Korean Association of Geographic Information Studies
    • /
    • v.27 no.1
    • /
    • pp.81-98
    • /
    • 2024
  • This study analyzed the impact of urban spatial factors on the thermal environment. The personal thermal sensation was set as the unit of thermal environment to analyze its correlation with environmental factors. To collect data on personal thermal sensation, Living Lab was applied, allowing citizens to record their thermal sensation and measure the temperature. Based on the input points of the collected personal thermal sensation, nearby urban spatial elements were collected to build a dataset for statistical analysis. Logistic regression analysis was conducted to analyze the impact of each factor on personal thermal sensation. The analysis results indicate that the temperature is influenced by the surrounding spatial environment, showing a negative correlation with building height, greenery rate, and road rate, and a positive correlation with sky view factor. Furthermore, the road rate, sky view factor, and greenery rate, in that order, had a strong impact on perceived heat. The results of this study are expected to be utilized as basic data for assessing the thermal environment to prepare local thermal environment measures in response to climate change.

Probability Map of Migratory Bird Habitat for Rational Management of Conservation Areas - Focusing on Busan Eco Delta City (EDC) - (보존지역의 합리적 관리를 위한 철새 서식 확률지도 구축 - 부산 Eco Delta City (EDC)를 중심으로 -)

  • Kim, Geun Han;Kong, Seok Jun;Kim, Hee Nyun;Koo, Kyung Ah
    • Journal of the Korean Society of Environmental Restoration Technology
    • /
    • v.26 no.6
    • /
    • pp.67-84
    • /
    • 2023
  • In some areas of the Republic of Korea, the designation and management of conservation areas do not adequately reflect regional characteristics and often impose behavioral regulations without considering the local context. One prominent example is the Busan EDC area. As a result, conflicts may arise, including large-scale civil complaints, regarding the conservation and utilization of these areas. Therefore, for the efficient designation and management of protected areas, it is necessary to consider various ecosystem factors, changes in land use, and regional characteristics. In this study, we specifically focused on the Busan EDC area and applied machine learning techniques to analyze the habitat of regional species. Additionally, we employed Explainable Artificial Intelligence techniques to interpret the results of our analysis. To analyze the regional characteristics of the waterfront area in the Busan EDC district and the habitat of migratory birds, we used bird observations as dependent variables, distinguishing between presence and absence. The independent variables were constructed using land cover, elevation, slope, bridges, and river depth data. We utilized the XGBoost (eXtreme Gradient Boosting) model, known for its excellent performance in various fields, to predict the habitat probabilities of 11 bird species. Furthermore, we employed the SHapley Additive exPlanations technique, one of the representative methodologies of XAI, to analyze the relative importance and impact of the variables used in the model. The analysis results showed that in the EDC business district, as one moves closer to the river from the waterfront, the likelihood of bird habitat increases based on the overlapping habitat probabilities of the analyzed bird species. By synthesizing the major variables influencing the habitat of each species, key variables such as rivers, rice fields, fields, pastures, inland wetlands, tidal flats, orchards, cultivated lands, cliffs & rocks, elevation, lakes, and deciduous forests were identified as areas that can serve as habitats, shelters, resting places, and feeding grounds for birds. On the other hand, artificial structures such as bridges, railways, and other public facilities were found to have a negative impact on bird habitat. The development of a management plan for conservation areas based on the objective analysis presented in this study is expected to be extensively utilized in the future. It will provide diverse evidential materials for establishing effective conservation area management strategies.

Studying the Comparative Analysis of Highway Traffic Accident Severity Using the Random Forest Method. (Random Forest를 활용한 고속도로 교통사고 심각도 비교분석에 관한 연구)

  • Sun-min Lee;Byoung-Jo Yoon;WutYeeLwin
    • Journal of the Society of Disaster Information
    • /
    • v.20 no.1
    • /
    • pp.156-168
    • /
    • 2024
  • Purpose: The trend of highway traffic accidents shows a repeating pattern of increase and decrease, with the fatality rate being highest on highways among all road types. Therefore, there is a need to establish improvement measures that reflect the situation within the country. Method: We conducted accident severity analysis using Random Forest on data from accidents occurring on 10 specific routes with high accident rates among national highways from 2019 to 2021. Factors influencing accident severity were identified. Result: The analysis, conducted using the SHAP package to determine the top 10 variable importance, revealed that among highway traffic accidents, the variables with a significant impact on accident severity are the age of the perpetrator being between 20 and less than 39 years, the time period being daytime (06:00-18:00), occurrence on weekends (Sat-Sun), seasons being summer and winter, violation of traffic regulations (failure to comply with safe driving), road type being a tunnel, geometric structure having a high number of lanes and a high speed limit. We identified a total of 10 independent variables that showed a positive correlation with highway traffic accident severity. Conclusion: As accidents on highways occur due to the complex interaction of various factors, predicting accidents poses significant challenges. However, utilizing the results obtained from this study, there is a need for in-depth analysis of the factors influencing the severity of highway traffic accidents. Efforts should be made to establish efficient and rational response measures based on the findings of this research.

Domain Knowledge Incorporated Local Rule-based Explanation for ML-based Bankruptcy Prediction Model (머신러닝 기반 부도예측모형에서 로컬영역의 도메인 지식 통합 규칙 기반 설명 방법)

  • Soo Hyun Cho;Kyung-shik Shin
    • Information Systems Review
    • /
    • v.24 no.1
    • /
    • pp.105-123
    • /
    • 2022
  • Thanks to the remarkable success of Artificial Intelligence (A.I.) techniques, a new possibility for its application on the real-world problem has begun. One of the prominent applications is the bankruptcy prediction model as it is often used as a basic knowledge base for credit scoring models in the financial industry. As a result, there has been extensive research on how to improve the prediction accuracy of the model. However, despite its impressive performance, it is difficult to implement machine learning (ML)-based models due to its intrinsic trait of obscurity, especially when the field requires or values an explanation about the result obtained by the model. The financial domain is one of the areas where explanation matters to stakeholders such as domain experts and customers. In this paper, we propose a novel approach to incorporate financial domain knowledge into local rule generation to provide explanations for the bankruptcy prediction model at instance level. The result shows the proposed method successfully selects and classifies the extracted rules based on the feasibility and information they convey to the users.

Safety Verification Techniques of Privacy Policy Using GPT (GPT를 활용한 개인정보 처리방침 안전성 검증 기법)

  • Hye-Yeon Shim;MinSeo Kweun;DaYoung Yoon;JiYoung Seo;Il-Gu Lee
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.34 no.2
    • /
    • pp.207-216
    • /
    • 2024
  • As big data was built due to the 4th Industrial Revolution, personalized services increased rapidly. As a result, the amount of personal information collected from online services has increased, and concerns about users' personal information leakage and privacy infringement have increased. Online service providers provide privacy policies to address concerns about privacy infringement of users, but privacy policies are often misused due to the long and complex problem that it is difficult for users to directly identify risk items. Therefore, there is a need for a method that can automatically check whether the privacy policy is safe. However, the safety verification technique of the conventional blacklist and machine learning-based privacy policy has a problem that is difficult to expand or has low accessibility. In this paper, to solve the problem, we propose a safety verification technique for the privacy policy using the GPT-3.5 API, which is a generative artificial intelligence. Classification work can be performed evenin a new environment, and it shows the possibility that the general public without expertise can easily inspect the privacy policy. In the experiment, how accurately the blacklist-based privacy policy and the GPT-based privacy policy classify safe and unsafe sentences and the time spent on classification was measured. According to the experimental results, the proposed technique showed 10.34% higher accuracy on average than the conventional blacklist-based sentence safety verification technique.

Multifaceted Evaluation Methodology for AI Interview Candidates - Integration of Facial Recognition, Voice Analysis, and Natural Language Processing (AI면접 대상자에 대한 다면적 평가방법론 -얼굴인식, 음성분석, 자연어처리 영역의 융합)

  • Hyunwook Ji;Sangjin Lee;Seongmin Mun;Jaeyeol Lee;Dongeun Lee;kyusang Lim
    • Proceedings of the Korean Society of Computer Information Conference
    • /
    • 2024.01a
    • /
    • pp.55-58
    • /
    • 2024
  • 최근 각 기업의 AI 면접시스템 도입이 증가하고 있으며, AI 면접에 대한 실효성 논란 또한 많은 상황이다. 본 논문에서는 AI 면접 과정에서 지원자를 평가하는 방식을 시각, 음성, 자연어처리 3영역에서 구현함으로써, 면접 지원자를 다방면으로 분석 방법론의 적절성에 대해 평가하고자 한다. 첫째, 시각적 측면에서, 면접 지원자의 감정을 인식하기 위해, 합성곱 신경망(CNN) 기법을 활용해, 지원자 얼굴에서 6가지 감정을 인식했으며, 지원자가 카메라를 응시하고 있는지를 시계열로 도출하였다. 이를 통해 지원자가 면접에 임하는 태도와 특히 얼굴에서 드러나는 감정을 분석하는 데 주력했다. 둘째, 시각적 효과만으로 면접자의 태도를 파악하는 데 한계가 있기 때문에, 지원자 음성을 주파수로 환산해 특성을 추출하고, Bidirectional LSTM을 활용해 훈련해 지원자 음성에 따른 6가지 감정을 추출했다. 셋째, 지원자의 발언 내용과 관련해 맥락적 의미를 파악해 지원자의 상태를 파악하기 위해, 음성을 STT(Speech-to-Text) 기법을 이용하여 텍스트로 변환하고, 사용 단어의 빈도를 분석하여 지원자의 언어 습관을 파악했다. 이와 함께, 지원자의 발언 내용에 대한 감정 분석을 위해 KoBERT 모델을 적용했으며, 지원자의 성격, 태도, 직무에 대한 이해도를 파악하기 위해 객관적인 평가지표를 제작하여 적용했다. 논문의 분석 결과 AI 면접의 다면적 평가시스템의 적절성과 관련해, 시각화 부분에서는 상당 부분 정확도가 객관적으로 입증되었다고 판단된다. 음성에서 감정분석 분야는 면접자가 제한된 시간에 모든 유형의 감정을 드러내지 않고, 또 유사한 톤의 말이 진행되다 보니 특정 감정을 나타내는 주파수가 다소 집중되는 현상이 나타났다. 마지막으로 자연어처리 영역은 면접자의 발언에서 나오는 말투, 특정 단어의 빈도수를 넘어, 전체적인 맥락과 느낌을 이해할 수 있는 자연어처리 분석모델의 필요성이 더욱 커졌음을 판단했다.

  • PDF

A Study on the Drug Classification Using Machine Learning Techniques (머신러닝 기법을 이용한 약물 분류 방법 연구)

  • Anmol Kumar Singh;Ayush Kumar;Adya Singh;Akashika Anshum;Pradeep Kumar Mallick
    • Advanced Industrial SCIence
    • /
    • v.3 no.2
    • /
    • pp.8-16
    • /
    • 2024
  • This paper shows the system of drug classification, the goal of this is to foretell the apt drug for the patients based on their demographic and physiological traits. The dataset consists of various attributes like Age, Sex, BP (Blood Pressure), Cholesterol Level, and Na_to_K (Sodium to Potassium ratio), with the objective to determine the kind of drug being given. The models used in this paper are K-Nearest Neighbors (KNN), Logistic Regression and Random Forest. Further to fine-tune hyper parameters using 5-fold cross-validation, GridSearchCV was used and each model was trained and tested on the dataset. To assess the performance of each model both with and without hyper parameter tuning evaluation metrics like accuracy, confusion matrices, and classification reports were used and the accuracy of the models without GridSearchCV was 0.7, 0.875, 0.975 and with GridSearchCV was 0.75, 1.0, 0.975. According to GridSearchCV Logistic Regression is the most suitable model for drug classification among the three-model used followed by the K-Nearest Neighbors. Also, Na_to_K is an essential feature in predicting the outcome.

Verification Test of High-Stability SMEs Using Technology Appraisal Items (기술력 평가항목을 이용한 고안정성 중소기업 판별력 검증)

  • Jun-won Lee
    • Information Systems Review
    • /
    • v.20 no.4
    • /
    • pp.79-96
    • /
    • 2018
  • This study started by focusing on the internalization of the technology appraisal model into the credit rating model to increase the discriminative power of the credit rating model not only for SMEs but also for all companies, reflecting the items related to the financial stability of the enterprises among the technology appraisal items. Therefore, it is aimed to verify whether the technology appraisal model can be applied to identify high-stability SMEs in advance. We classified companies into industries (manufacturing vs. non-manufacturing) and the age of company (initial vs. non-initial), and defined as a high-stability company that has achieved an average debt ratio less than 1/2 of the group for three years. The C5.0 was applied to verify the discriminant power of the model. As a result of the analysis, there is a difference in importance according to the type of industry and the age of company at the sub-item level, but in the mid-item level the R&D capability was a key variable for discriminating high-stability SMEs. In the early stage of establishment, the funding capacity (diversification of funding methods, capital structure and capital cost which taking into account profitability) is an important variable in financial stability. However, we concluded that technology development infrastructure, which enables continuous performance as the age of company increase, becomes an important variable affecting financial stability. The classification accuracy of the model according to the age of company and industry is 71~91%, and it is confirmed that it is possible to identify high-stability SMEs by using technology appraisal items.