• Title/Summary/Keyword: Algorithm Comparison

Performance Comparison of Automatic Classification Using Word Embeddings of Book Titles (단행본 서명의 단어 임베딩에 따른 자동분류의 성능 비교)

  • Yong-Gu Lee
    • Journal of the Korean Society for Information Management / v.40 no.4 / pp.307-327 / 2023
  • To analyze the impact of word embedding on book titles, this study used word embedding models (Word2vec, GloVe, fastText) to generate embedding vectors from book titles. These vectors then served as features for automatic classification. The classifier was the k-nearest neighbors (kNN) algorithm, and the categories were based on DDC (Dewey Decimal Classification) main class 300 as assigned to the books by libraries. In the experiment, the Skip-gram architectures of Word2vec and fastText yielded better kNN classification performance than TF-IDF features. Across the hyperparameter optimization of the three models, the Skip-gram architecture of fastText performed well overall; in particular, it performed better with hierarchical softmax and larger embedding dimensions. fastText can generate embeddings for substrings or subwords using the n-gram method, which has been shown to increase recall. The Skip-gram architecture of Word2vec generally performed well at low dimensions (size 300) and with small negative sampling sizes (3 or 5).
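As a rough illustration of the pipeline described in this abstract, the sketch below turns titles into vectors and classifies them with kNN. Everything concrete in it is an assumption made for illustration: the three-dimensional `TOY_VECTORS`, the averaging of word vectors into a title vector, cosine similarity as the distance, and the sample titles with labels 330, 340, and 370. The abstract does not state how the study pooled word vectors or which subclasses it used.

```python
from collections import Counter

import numpy as np

# Hypothetical 3-dimensional word vectors, for illustration only.
# A trained Word2vec, GloVe, or fastText model would supply real ones.
TOY_VECTORS = {
    "economics": np.array([0.9, 0.1, 0.0]),
    "market": np.array([0.8, 0.2, 0.1]),
    "law": np.array([0.1, 0.9, 0.0]),
    "court": np.array([0.0, 0.8, 0.2]),
    "education": np.array([0.1, 0.1, 0.9]),
    "school": np.array([0.0, 0.2, 0.8]),
}


def title_vector(title, vectors):
    """Average the vectors of the known words in a title (assumed pooling)."""
    known = [vectors[w] for w in title.lower().split() if w in vectors]
    if not known:
        return np.zeros(len(next(iter(vectors.values()))))
    return np.mean(known, axis=0)


def knn_predict(query_vec, train_vecs, train_labels, k=3):
    """Majority vote among the k training titles most cosine-similar to the query."""
    norms = np.linalg.norm(train_vecs, axis=1) * np.linalg.norm(query_vec)
    sims = (train_vecs @ query_vec) / np.maximum(norms, 1e-12)
    nearest = np.argsort(-sims)[:k]
    votes = Counter(train_labels[i] for i in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical training titles with illustrative class labels.
TRAIN = [
    ("market economics", "330"),
    ("economics", "330"),
    ("court law", "340"),
    ("law", "340"),
    ("school education", "370"),
    ("education", "370"),
]
TRAIN_VECS = np.array([title_vector(t, TOY_VECTORS) for t, _ in TRAIN])
TRAIN_LABELS = [label for _, label in TRAIN]


def classify(title, k=3):
    """Classify a new title against the toy training set."""
    return knn_predict(title_vector(title, TOY_VECTORS), TRAIN_VECS, TRAIN_LABELS, k)
```

Swapping `TOY_VECTORS` for vectors from a trained model leaves the kNN step unchanged, which is what makes the embedding models directly comparable as feature generators.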

Soil Moisture Estimation Using KOMPSAT-3 and KOMPSAT-5 SAR Images and Its Validation: A Case Study of Western Area in Jeju Island (KOMPSAT-3와 KOMPSAT-5 SAR 영상을 이용한 토양수분 산정과 결과 검증: 제주 서부지역 사례 연구)

  • Jihyun Lee;Hayoung Lee;Kwangseob Kim;Kiwon Lee
    • Korean Journal of Remote Sensing / v.39 no.6_1 / pp.1185-1193 / 2023
  • The increasing interest in soil moisture data from satellite imagery for applications in hydrology, meteorology, and agriculture has led to the development of methods to produce variable-resolution soil moisture maps. Research on accurate soil moisture estimation using satellite imagery is essential for remote sensing applications. The purpose of this study is to generate a soil moisture estimation map for a test area using KOMPSAT-3/3A and KOMPSAT-5 SAR imagery and to quantitatively compare the results with soil moisture data from the Soil Moisture Active Passive (SMAP) mission provided by NASA, with a focus on accuracy validation. In addition, the Korean Environmental Geographic Information Service (EGIS) land cover map was used to determine soil moisture, especially in agricultural and forested regions. The test area is the western part of Jeju, South Korea, where input data were available for the soil moisture estimation algorithm based on the Water Cloud Model (WCM). Synthetic Aperture Radar (SAR) imagery from KOMPSAT-5 HV and Sentinel-1 VV was used for soil moisture estimation, while vegetation indices were calculated from the surface reflectance of KOMPSAT-3 imagery. Comparison of the derived soil moisture results with SMAP (L-3) and SMAP (L-4) data by differencing showed mean differences of 4.13±3.60 p% and 14.24±2.10 p%, respectively, indicating a level of agreement. This research suggests the potential for producing highly accurate and precise soil moisture maps using future South Korean satellite imagery and publicly available data sources.
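The "mean difference ± spread" figures reported above can be reproduced in form with a few lines. This is only a sketch: the function name `difference_stats` is hypothetical, and the use of absolute differences, the sample standard deviation, and NaN for missing pixels are assumptions that the abstract does not confirm.

```python
import numpy as np


def difference_stats(estimated, reference):
    """Return (mean, sample std) of the absolute differences between two maps.

    Both inputs are soil moisture values in percent on the same grid;
    cells that are NaN in either input are ignored.
    """
    est = np.asarray(estimated, dtype=float)
    ref = np.asarray(reference, dtype=float)
    valid = ~(np.isnan(est) | np.isnan(ref))
    diff = np.abs(est[valid] - ref[valid])
    return float(diff.mean()), float(diff.std(ddof=1))
```

Calling it with the estimated map and a SMAP product resampled to the same grid would give one "mean ± standard deviation" pair per product.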

Comparative analysis on Darcy-Forchheimer flow of 3-D MHD hybrid nanofluid (MoS2-Fe3O4/H2O) incorporating melting heat and mass transfer over a rotating disk with Dufour and Soret effects

  • A.M. Abd-Alla;Esraa N. Thabet;S.M.M.El-Kabeir;H. A. Hosham;Shimaa E. Waheed
    • Advances in Nano Research / v.16 no.4 / pp.325-340 / 2024
  • Dispersing nanoparticles into a conventional fluid has several novel uses, including dynamic sealing, damping, heat dissipation, and microfluidics. This study therefore numerically assesses the melting heat and mass transfer characteristics of a 3-D MHD hybrid nanofluid flow over a rotating disk in the presence of Dufour and Soret effects. We investigated both iron oxide (Fe3O4) and molybdenum disulfide (MoS2) as nanoparticles suspended in water as the base fluid. The governing partial differential equations are transformed into coupled higher-order nonlinear ordinary differential equations by a local similarity transformation. The resulting system is then solved with a Chebyshev spectral collocation-based algorithm implemented in Mathematica. Graphical and numerical examples show how the different nanofluid and hybrid nanofluid cases are affected by changes in temperature, velocity, and the distribution of nanoparticle concentration. Computational findings are shown for many values of the material parameters. Simulations for different physical parameters show that adding hybrid nanoparticles to the fluid increases heat transfer compared with simple nanofluids. Hybrid nanoparticles, rather than single-type nanoparticles, therefore need to be considered when designing an effective thermal system. Furthermore, porosity lowers the velocities of both simple and hybrid nanofluids. The results also show that the drag force from skin friction causes the nanoparticle fluid to travel more slowly than the hybrid nanoparticle fluid, and that suction factors such as the magnetic and porosity parameters, as well as the nanoparticles, raise the skin friction coefficient. The outcomes for the different flow scenarios are in strong agreement with findings in the published literature. Bar chart depictions are altered by changes in flow rates. Moreover, the results confirm doctors' views on prescribing hybrid nanoparticle and single nanoparticle contents for achalasia patients and for those who suffer from esophageal stricture and tumors. The results can also be applied to the energy generated by the melting disk surface, which has a variety of industrial uses, including the preparation of semiconductor materials, the solidification of magma, the melting of permafrost, and the refreezing of frozen land.
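To give a feel for Chebyshev spectral collocation, the sketch below applies it to a deliberately simple linear test problem, u''(x) = exp(4x) on [-1, 1] with u(-1) = u(1) = 0, whose exact solution is known. It is not the authors' nonlinear similarity system and it is written in Python with NumPy rather than Mathematica; the function names `cheb` and `solve_test_bvp` are hypothetical.

```python
import numpy as np


def cheb(n):
    """Chebyshev differentiation matrix D and collocation points x, for n >= 1."""
    x = np.cos(np.pi * np.arange(n + 1) / n)
    c = np.hstack([2.0, np.ones(n - 1), 2.0]) * (-1.0) ** np.arange(n + 1)
    X = np.tile(x, (n + 1, 1)).T          # X[i, j] = x[i]
    dX = X - X.T                          # dX[i, j] = x[i] - x[j]
    D = np.outer(c, 1.0 / c) / (dX + np.eye(n + 1))
    D = D - np.diag(D.sum(axis=1))        # diagonal from the row-sum identity
    return D, x


def solve_test_bvp(n=16):
    """Solve u'' = exp(4x), u(-1) = u(1) = 0 by collocation at n + 1 points."""
    D, x = cheb(n)
    D2 = (D @ D)[1:-1, 1:-1]              # drop boundary rows and columns
    u = np.zeros(n + 1)                   # boundary values stay at zero
    u[1:-1] = np.linalg.solve(D2, np.exp(4.0 * x[1:-1]))
    return x, u


def exact_solution(x):
    """Closed-form solution of the test problem, used only for checking."""
    return (np.exp(4.0 * x) - x * np.sinh(4.0) - np.cosh(4.0)) / 16.0
```

A nonlinear coupled system like the one in the abstract would typically wrap a solve of this kind in an iteration such as Newton's method, but the differentiation matrix is built the same way.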

How to Combine Diffusion-Weighted and T2-Weighted Imaging for MRI Assessment of Pathologic Complete Response to Neoadjuvant Chemoradiotherapy in Patients with Rectal Cancer?

  • Jong Keon Jang;Chul-min Lee;Seong Ho Park;Jong Hoon Kim;Jihun Kim;Seok-Byung Lim;Chang Sik Yu;Jin Cheon Kim
    • Korean Journal of Radiology / v.22 no.9 / pp.1451-1461 / 2021
  • Objective: Adequate methods of combining T2-weighted imaging (T2WI) and diffusion-weighted imaging (DWI) to assess complete response (CR) to chemoradiotherapy (CRT) for rectal cancer remain unclear. We aimed to determine an algorithm for combining T2WI and DWI to optimally suggest CR on MRI using visual assessment. Materials and Methods: We included 376 patients (male:female, 256:120; mean age ± standard deviation, 59.7 ± 11.1 years) who had undergone long-course CRT for rectal cancer and both pre- and post-CRT high-resolution rectal MRI during 2017-2018. Two experienced radiologists independently evaluated whether a tumor signal was absent, representing CR, on both post-CRT T2WI and DWI, and whether the pre-treatment DWI showed homogeneous hyperintensity throughout the lesion. Algorithms for combining T2WI and DWI were as follows: 'AND,' if both showed CR; 'OR,' if any one showed CR; and 'conditional OR,' if T2WI showed CR or DWI showed CR after the pre-treatment DWI showed homogeneous hyperintensity. Their efficacies for diagnosing pathologic CR (pCR) were determined in comparison with T2WI alone. Results: Sixty-nine patients (18.4%) had pCR. Compared with T2WI alone, AND had a lower sensitivity without statistical significance (59.4% [41/69] vs. 62.3% [43/69], p = 0.500) and a significantly higher specificity (90.2% [277/307] vs. 87.0% [267/307], p = 0.002). Both OR and conditional OR resulted in a large increase in sensitivity (81.2% [56/69], p < 0.001; and 73.9% [51/69], p = 0.008, respectively, vs. 62.3% [43/69]) and a large decrease in specificity (57.0% [175/307], p < 0.001; and 69.1% [212/307], p < 0.001, respectively, vs. 87.0% [267/307]), ultimately creating additional false interpretations of CR more frequently than additional identifications of patients with pCR. Conclusion: The AND combination of T2WI and DWI is an appropriate strategy for suggesting CR using visual assessment of MRI after CRT for rectal cancer.
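The three combination rules can be written as boolean expressions, and the reported percentages follow directly from the counts in the abstract. The sketch below is one reading of the rules as worded above; the function and parameter names are hypothetical.

```python
def combine(t2_cr, dwi_cr, pre_dwi_homogeneous, rule):
    """Return True if the combined reading suggests complete response (CR).

    t2_cr, dwi_cr: CR suggested on post-CRT T2WI / DWI.
    pre_dwi_homogeneous: pre-treatment DWI showed homogeneous hyperintensity.
    """
    if rule == "AND":
        return t2_cr and dwi_cr
    if rule == "OR":
        return t2_cr or dwi_cr
    if rule == "conditional OR":
        # DWI only counts when the pre-treatment DWI was homogeneously hyperintense.
        return t2_cr or (dwi_cr and pre_dwi_homogeneous)
    raise ValueError(f"unknown rule: {rule}")


def sensitivity(true_positives, n_with_pcr):
    """Fraction of pCR patients correctly read as CR."""
    return true_positives / n_with_pcr


def specificity(true_negatives, n_without_pcr):
    """Fraction of non-pCR patients correctly read as not CR."""
    return true_negatives / n_without_pcr
```

With the counts above, `sensitivity(41, 69)` and `specificity(277, 307)` give the AND figures of 59.4% and 90.2%. A case that is CR on DWI only is positive under OR, negative under AND, and positive under conditional OR only when `pre_dwi_homogeneous` is true, which is why conditional OR sits between the other two rules.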

Time-series Change Analysis of Quarry using UAV and Aerial LiDAR (UAV와 LiDAR를 활용한 토석채취지의 시계열 변화 분석)

  • Dong-Hwan Park;Woo-Dam Sim
    • Journal of the Korean Association of Geographic Information Studies / v.27 no.2 / pp.34-44 / 2024
  • Abnormal weather caused by climate change has recently led to a rapid increase in natural disasters such as floods, landslides, and soil outflows. In Korea, more than 63% of the land is vulnerable to slope disasters because of its mountainous terrain. Quarries in particular, where soil and rock are extracted, carry a high risk of landslides both inside and outside the work site. Accordingly, this study built DEMs using UAV and aerial LiDAR to monitor a quarry, conducted a time-series change analysis, and proposed an optimal DEM construction method for monitoring quarries. For DEM construction, UAV- and LiDAR-based point clouds were generated, and the ground was extracted using three algorithms: Aggressive Classification (AC), Conservative Classification (CC), and Standard Classification (SC). The accuracy of the UAV- and LiDAR-based DEMs built with each algorithm was evaluated by comparison with a digital map-based DEM.
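Two operations sit behind this kind of workflow: scoring a DEM against a reference DEM, and differencing DEMs from two dates. The sketch below shows one plausible form of each. The choice of RMSE as the accuracy measure and of cut/fill volumes as the change measure are assumptions, as are the function names; the abstract does not state which metrics were used.

```python
import numpy as np


def dem_rmse(dem, reference):
    """Root mean square elevation error of a DEM against a reference DEM.

    Both grids must share the same shape and cell size; NaN cells are ignored.
    """
    diff = np.asarray(dem, dtype=float) - np.asarray(reference, dtype=float)
    return float(np.sqrt(np.nanmean(diff ** 2)))


def cut_and_fill(dem_before, dem_after, cell_size_m):
    """Return (cut, fill) volumes in cubic metres between two survey dates."""
    dz = np.asarray(dem_after, dtype=float) - np.asarray(dem_before, dtype=float)
    cell_area = cell_size_m ** 2
    cut = float(-np.nansum(dz[dz < 0]) * cell_area)    # material removed
    fill = float(np.nansum(dz[dz > 0]) * cell_area)    # material added
    return cut, fill
```

Running `dem_rmse` once per ground-extraction algorithm (AC, CC, SC) against the same reference would give a like-for-like accuracy ranking.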

Mapping Mammalian Species Richness Using a Machine Learning Algorithm (머신러닝 알고리즘을 이용한 포유류 종 풍부도 매핑 구축 연구)

  • Zhiying Jin;Dongkun Lee;Eunsub Kim;Jiyoung Choi;Yoonho Jeon
    • Journal of Environmental Impact Assessment / v.33 no.2 / pp.53-63 / 2024
  • Biodiversity holds significant importance within the framework of environmental impact assessment, being utilized in site selection for development, understanding the surrounding environment, and assessing the impact on species due to disturbances. The field of environmental impact assessment has seen substantial research exploring new technologies and models to evaluate and predict biodiversity more accurately. While current assessments rely on data from fieldwork and literature surveys to gauge species richness indices, limitations in spatial and temporal coverage underscore the need for high-resolution biodiversity assessments through species richness mapping. In this study, leveraging data from the 4th National Ecosystem Survey and environmental variables, we developed a species distribution model using Random Forest. This model yielded mapping results of 24 mammalian species' distribution, utilizing the species richness index to generate a 100-meter resolution map of species richness. The research findings exhibited a notably high predictive accuracy, with the species distribution model demonstrating an average AUC value of 0.82. In addition, the comparison with National Ecosystem Survey data reveals that the species richness distribution in the high-resolution species richness mapping results conforms to a normal distribution. Hence, it stands as highly reliable foundational data for environmental impact assessment. Such research and analytical outcomes could serve as pivotal new reference materials for future urban development projects, offering insights for biodiversity assessment and habitat preservation endeavors.
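One common way to turn per-species distribution maps into a richness map is to threshold each species' predicted probability and count the species present in each cell. The sketch below assumes that stacking approach and per-species thresholds; the abstract does not say how the study derived richness from its 24 Random Forest outputs, and the function name is hypothetical.

```python
import numpy as np


def species_richness(prob_maps, thresholds):
    """Count species predicted present in each grid cell.

    prob_maps: array of shape (n_species, rows, cols) with occurrence
        probabilities from the per-species models.
    thresholds: one presence threshold per species.
    """
    probs = np.asarray(prob_maps, dtype=float)
    cut = np.asarray(thresholds, dtype=float).reshape(-1, 1, 1)
    presence = probs >= cut               # boolean presence map per species
    return presence.sum(axis=0)           # richness per cell
```

With 24 species on a 100-meter grid, the result is an integer raster with values from 0 to 24.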

Evaluation of Hazardous Zones by Evacuation Scenario under Disasters on Training Ships (실습선 재난 시 피난 시나리오 별 위험구역 평가)

  • SangJin Lim;YoonHo Lee
    • Journal of the Korean Society of Marine Environment & Safety / v.30 no.2 / pp.200-208 / 2024
  • A fire on a training ship with a large number of people on board can lead to severe casualties. Hence, the Seafarers' Act and the International Convention for the Safety of Life at Sea (SOLAS) emphasize the importance of the abandon ship drill. In this study, Segero, the training ship of Mokpo National Maritime University, which carries a large number of people, was selected as the target ship, and the likelihood and severity of fire accidents on each deck were predicted through preliminary hazard analysis (PHA), a qualitative risk assessment. Additionally, assuming a fire in a high-risk area, evacuation time and population density were simulated to predict the risk quantitatively. The total evacuation time was predicted to be longest, at 501 s, in the mealtime scenario, in which the population was concentrated in one area. Depending on the scenario, some decks had relatively high population densities of over 1.4 persons/m², causing the flow of evacuees to stagnate. The results of this study are expected to serve as basic data for developing training scenarios for training ships by quantifying evacuation time and population density under various evacuation scenarios, and the research can be expanded in the future by comparing mathematical models with experimental values.

Comparing the Performance of a Deep Learning Model (TabPFN) for Predicting River Algal Blooms with Varying Data Composition (데이터 구성에 따른 하천 조류 예측 딥러닝 모형 (TabPFN) 성능 비교)

  • Hyunseok Yang;Jungsu Park
    • Journal of Wetlands Research / v.26 no.3 / pp.197-203 / 2024
  • Algal blooms in rivers can negatively affect water source management and water treatment processes, necessitating continuous management. In this study, a multi-class classification model was developed to predict the concentration of chlorophyll-a (chl-a), one of the key indicators of algal blooms, using Tabular Prior Fitted Networks (TabPFN), a novel deep learning algorithm known for its relatively superior performance on small tabular datasets. The model was developed using daily observation data collected at the Buyeo water quality monitoring station from January 1, 2014, to December 31, 2022. The collected data were averaged to construct input data sets with measurement intervals of 1, 3, 6, and 12 days. A performance comparison of the four models built on these data sets showed that performance remains stable even when the measurement interval is longer and the number of observations is smaller. The macro averages for the four models were as follows: precision 0.77, 0.76, 0.83, and 0.84; recall 0.63, 0.65, 0.66, and 0.74; F1-score 0.67, 0.69, 0.71, and 0.78. The weighted averages were: precision 0.76, 0.77, 0.81, and 0.84; recall 0.76, 0.78, 0.81, and 0.85; F1-score 0.74, 0.77, 0.80, and 0.84. This study demonstrates that the chl-a prediction model constructed using TabPFN performs stably even with small-scale input data, verifying the feasibility of applying it in fields where the input data available for model construction is limited.
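The abstract reports both macro and weighted averages, which differ only in how per-class scores are combined: macro averaging treats every class equally, while weighted averaging weights each class by its number of true samples. A minimal sketch computed from a confusion matrix follows; the function names and the row-is-true, column-is-predicted layout are assumptions for illustration.

```python
def per_class_metrics(cm):
    """Return (precision, recall, f1, support) per class.

    cm[i][j] is the number of samples of true class i predicted as class j.
    """
    n = len(cm)
    rows = []
    for k in range(n):
        tp = cm[k][k]
        predicted = sum(cm[i][k] for i in range(n))   # column total
        support = sum(cm[k])                          # row total
        precision = tp / predicted if predicted else 0.0
        recall = tp / support if support else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        rows.append((precision, recall, f1, support))
    return rows


def macro_and_weighted(cm):
    """Return ((P, R, F1) macro-averaged, (P, R, F1) support-weighted)."""
    rows = per_class_metrics(cm)
    total = sum(r[3] for r in rows)
    macro = tuple(sum(r[j] for r in rows) / len(rows) for j in range(3))
    weighted = tuple(sum(r[j] * r[3] for r in rows) / total for j in range(3))
    return macro, weighted
```

Weighted recall always equals overall accuracy, so a gap between macro and weighted recall, as in the figures above, usually points to weaker performance on the less frequent classes.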

Comparison of Lambertian Model on Multi-Channel Algorithm for Estimating Land Surface Temperature Based on Remote Sensing Imagery

  • A Sediyo Adi Nugraha;Muhammad Kamal;Sigit Heru Murti;Wirastuti Widyatmanti
    • Korean Journal of Remote Sensing / v.40 no.4 / pp.397-418 / 2024
  • Land surface temperature (LST) is a crucial parameter in identifying drought. It is essential to determine how LST accuracy can be increased, particularly in mountainous and hilly areas. LST accuracy can be increased through early data processing in the correction phase, specifically topographic correction with the Lambertian model. Empirical evidence has shown that this stage improves object identification, especially in areas that lack direct illumination. This research therefore examines the application of the Lambertian model in estimating LST using the Multi-Channel Method (MCM) across various physiographic regions. The Lambertian model is a method that uses Lambertian reflectance and specifically addresses the radiance values obtained from Sun-Canopy-Sensor (SCS) and Cosine Correction. Applying topographic correction to the LST result notably widens the dispersion of LST values. Nevertheless, physiography is also significant: plains tend to have extreme LST values of ≥ 350 K, whereas in mountainous and hilly terrain LST often falls within 310-325 K. The absence of topographic correction in LST results in varying values: 22 K for the plains area, 12-21 K for hilly and mountainous terrain, and 7-9 K for both plains and mountainous terrains. Furthermore, validation results indicate that the Lambertian model with the SCS and Cosine Correction methods yields better outcomes than processing without the Lambertian model, particularly in hilly and mountainous terrain. Conversely, in plain areas, applying the Lambertian model proves suboptimal. Additionally, the relationship between physiography and LST derived using the Lambertian model shows a high average R² value of 0.99. The lowest errors and root mean square error values, approximately ±2 K and 0.54, respectively, were achieved using the Lambertian model with the SCS method. Based on these findings, this research concludes that the Lambertian model can increase LST values; the corrected values are often higher than the LST values obtained without the Lambertian model.
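For reference, the two Lambertian corrections named above are usually written in terms of the local solar incidence angle. The sketch below uses the commonly cited textbook forms of Cosine Correction and SCS for a single pixel; whether the study applied exactly these forms, and at which processing step, is not stated in the abstract, and the function names are hypothetical.

```python
import math


def cos_incidence(slope_deg, aspect_deg, sun_zenith_deg, sun_azimuth_deg):
    """Cosine of the solar incidence angle on a tilted surface."""
    s, a, z, phi = (math.radians(v) for v in
                    (slope_deg, aspect_deg, sun_zenith_deg, sun_azimuth_deg))
    return math.cos(z) * math.cos(s) + math.sin(z) * math.sin(s) * math.cos(phi - a)


def cosine_correction(radiance, cos_i, sun_zenith_deg):
    """Cosine Correction: scale radiance by cos(zenith) / cos(incidence)."""
    return radiance * math.cos(math.radians(sun_zenith_deg)) / cos_i


def scs_correction(radiance, cos_i, slope_deg, sun_zenith_deg):
    """Sun-Canopy-Sensor: as above, with an extra cos(slope) factor."""
    return (radiance * math.cos(math.radians(slope_deg))
            * math.cos(math.radians(sun_zenith_deg)) / cos_i)
```

On flat terrain the incidence angle equals the solar zenith angle, so both functions return the input radiance unchanged. Pixels with `cos_i` at or below zero are in self-shadow and need separate handling that this sketch omits.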

Comparison of the Performance of Machine Learning Models for TOC Prediction Based on Input Variable Composition (입력변수 구성에 따른 총유기탄소(TOC) 예측 머신러닝 모형의 성능 비교)

  • Sohyun Lee;Jungsu Park
    • Journal of the Korea Organic Resources Recycling Association / v.32 no.3 / pp.19-29 / 2024
  • Total organic carbon (TOC) represents the total amount of organic carbon contained in water and is a key water quality parameter used, along with biochemical oxygen demand (BOD) and chemical oxygen demand (COD), to quantify the amount of organic matter in water. In this study, a model to predict TOC was developed using XGBoost (XGB), a representative ensemble machine learning algorithm. Independent variables for model construction included water temperature, pH, electrical conductivity, dissolved oxygen concentration, BOD, COD, suspended solids, total nitrogen, total phosphorus, and discharge. To quantitatively analyze the impact of various water quality parameters used in model construction, the feature importance of input variables was calculated. Based on the results of feature importance analysis, items with low importance were sequentially excluded to observe changes in model performance. When built by sequentially excluding items with low importance, the performance of the model showed a root mean squared error-observation standard deviation ratio (RSR) range of 0.53 to 0.55. The model that applied all input variables showed the best performance with an RSR value of 0.53. To enhance the model's field applicability, models using relatively easily measurable parameters were also built, and the performance changes were analyzed. The results showed that a model constructed using only the relatively easily measurable parameters of water temperature, electrical conductivity, pH, dissolved oxygen concentration, and suspended solids had an RSR of 0.72. This indicates that stable performance can be achieved using relatively easily measurable field water quality parameters.
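The RSR metric used above is the root mean squared error divided by the standard deviation of the observations, so 0 means a perfect fit and 1 means the model does no better than predicting the observed mean. A minimal sketch follows, using the form in which the sample-size terms cancel; the function name is hypothetical.

```python
import math


def rsr(observed, predicted):
    """RMSE-to-observation-standard-deviation ratio (lower is better)."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    mean_obs = sum(observed) / len(observed)
    squared_error = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    squared_dev = sum((o - mean_obs) ** 2 for o in observed)
    if squared_dev == 0:
        raise ValueError("observations have zero variance")
    return math.sqrt(squared_error) / math.sqrt(squared_dev)
```

On this scale, the all-variable model (RSR 0.53) leaves an error of roughly half the natural spread of the observed TOC values, and the easily measured subset (RSR 0.72) leaves somewhat more.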