• Title/Summary/Keyword: Cross - Validation

A Node2Vec-Based Gene Expression Image Representation Method for Effectively Predicting Cancer Prognosis (암 예후를 효과적으로 예측하기 위한 Node2Vec 기반의 유전자 발현량 이미지 표현기법)

  • Choi, Jonghwan; Park, Sanghyun
    • KIPS Transactions on Software and Data Engineering / v.8 no.10 / pp.397-402 / 2019
  • Accurately predicting cancer prognosis so that patients receive appropriate treatment strategies is one of the critical challenges in bioinformatics. Many studies have proposed machine learning models that predict patient outcomes from gene expression data. Gene expression data is high-dimensional numerical data covering about 17,000 genes, so earlier studies used feature selection or dimensionality reduction to improve the performance of prognostic prediction models. These approaches, however, make it difficult for the predictive models to capture biological interactions between the selected genes, because the feature selection and model training stages are performed independently. In this paper, we propose a novel two-dimensional image formatting approach for gene expression data that achieves feature selection and prognostic prediction effectively. Node2Vec is used to integrate a biological interaction network with gene expression data, and a convolutional neural network learns the integrated two-dimensional gene expression images and predicts cancer prognosis. We evaluated the proposed model through double cross-validation and confirmed that its prognostic prediction accuracy is superior to that of traditional machine learning models based on raw gene expression data. Because the proposed approach can improve prediction models without the information loss caused by feature selection steps, we expect it to contribute to the development of personalized medicine.
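
The double (nested) cross-validation mentioned in the abstract above can be sketched as follows. This is only an illustration under stated assumptions: the toy 1-D k-nearest-neighbour classifier, the synthetic data, and every function name are hypothetical stand-ins, not the Node2Vec/CNN pipeline of the paper. The point it shows is that the inner loop tunes a hyperparameter using outer-training data only, so the outer test folds stay untouched until scoring.

```python
import random
from statistics import mean

def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def knn_predict(train, x, k):
    """Toy 1-D k-nearest-neighbour classifier: majority vote of the k closest points."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > len(nearest) else 0

def accuracy(train, test, k):
    """Fraction of test points whose predicted label matches the true label."""
    return mean(1.0 if knn_predict(train, x, k) == y else 0.0 for x, y in test)

def split(data, folds, held_out):
    """Return (training rows, held-out rows) for the fold at position held_out."""
    test = [data[j] for j in folds[held_out]]
    train = [data[j] for f, fold in enumerate(folds) if f != held_out for j in fold]
    return train, test

def double_cross_validation(data, outer_k=5, inner_k=3, candidates=(1, 3, 5)):
    """Outer loop estimates accuracy; inner loop picks k using outer-training data only."""
    outer_folds = k_fold_indices(len(data), outer_k, seed=0)
    outer_scores = []
    for i in range(outer_k):
        train, test = split(data, outer_folds, i)
        inner_folds = k_fold_indices(len(train), inner_k, seed=1)

        def inner_score(k):
            # Mean validation accuracy of candidate k over the inner folds
            return mean(accuracy(*split(train, inner_folds, a), k) for a in range(inner_k))

        best_k = max(candidates, key=inner_score)
        outer_scores.append(accuracy(train, test, best_k))
    return mean(outer_scores)

# Hypothetical toy data: two 1-D Gaussian classes, 30 samples each
rng = random.Random(42)
data = [(rng.gauss(0.0, 1.0), 0) for _ in range(30)] + [(rng.gauss(3.0, 1.0), 1) for _ in range(30)]
score = double_cross_validation(data)
print(f"nested CV accuracy estimate: {score:.3f}")
```

Because model selection happens inside each outer training split, the averaged outer score is not biased by the tuning step, which is the usual motivation for the nested design.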

A Fuzzy-AHP-based Movie Recommendation System with the Bidirectional Recurrent Neural Network Language Model (양방향 순환 신경망 언어 모델을 이용한 Fuzzy-AHP 기반 영화 추천 시스템)

  • Oh, Jae-Taek; Lee, Sang-Yong
    • Journal of Digital Convergence / v.18 no.12 / pp.525-531 / 2020
  • In today's IT environment, where large volumes of varied information are distributed, recommendation systems that can quickly identify users' needs and help them make decisions are in the spotlight. Current recommendation systems, however, have two problems: user preferences may not be reflected right away as tastes or interests change, and items unrelated to users' preferences may be recommended because of advertising. To solve these problems, this study proposes a Fuzzy-AHP-based movie recommendation system that applies the BRNN (Bidirectional Recurrent Neural Network) language model. Fuzzy-AHP is applied to reflect users' tastes or interests in clear and objective ways. In addition, the BRNN language model is adopted to analyze movie-related data collected in real time and predict the movies users prefer. The system's performance was assessed with grid searches that examined the fitness of the learning model across the entire size of word sets. The results show that the learning model recorded a mean cross-validation index of 97.9% across the entire size of word sets, demonstrating its fitness. The model recorded RMSEs of 0.66 and 0.805 against the movie ratings on Naver and the LSTM language model, respectively, demonstrating the system's superior performance in predicting movie ratings.

A TBM data-based ground prediction using deep neural network (심층 신경망을 이용한 TBM 데이터 기반의 굴착 지반 예측 연구)

  • Kim, Tae-Hwan; Kwak, No-Sang; Kim, Taek Kon; Jung, Sabum; Ko, Tae Young
    • Journal of Korean Tunnelling and Underground Space Association / v.23 no.1 / pp.13-24 / 2021
  • Tunnel boring machines (TBMs) are widely used for tunnel excavation in hard rock and soft ground. In TBM-based tunneling, one of the main challenges is to drive the machine optimally under varying geological conditions, which can save substantial cost by reducing the total operation time. Generally, drilling investigations are conducted to survey the ground before TBM tunneling. However, it is difficult to give operators precise ground information over the whole tunnel path because these investigations sample the ground around the path sparsely and irregularly. To overcome this issue, we propose a geological type classification system that uses TBM operating data recorded at a 5 s sampling interval. We first categorized the geological conditions (limited here to granite) into three geological types (rock, soil, and mixed). We then applied preprocessing, including outlier rejection, normalization, and input feature extraction. We adopted a deep neural network (DNN) with 6 hidden layers to classify the geological types from the TBM operating data, and evaluated the classification system using 10-fold cross-validation. The average classification accuracy was 75.4% (on a total of 388,639 samples). The accuracy still needs improvement, but the results show that geological classification based on TBM operating data could be used in a real environment to complement sparse ground information.

Utilization Evaluation of Numerical forest Soil Map to Predict the Weather in Upland Crops (밭작물 농업기상을 위한 수치형 산림입지토양도 활용성 평가)

  • Kang, Dayoung; Hwang, Yeongeun; Yoon, Sanghoo
    • Korean Journal of Agricultural and Forest Meteorology / v.23 no.1 / pp.34-45 / 2021
  • Weather is an important factor in the agricultural industry because it affects the price, production, and quality of crops. Upland crops are directly exposed to the natural environment because they are mainly grown in mountainous areas, so accurate weather information is needed for them. This study examined the effectiveness of 12 forest soil factors for interpolating the weather in mountainous areas. Daily temperature and precipitation data collected by the Korea Meteorological Administration between January 2009 and December 2018 were used. The Generalized Additive Model (GAM), Kriging, and Random Forest (RF) were considered as interpolation methods. To evaluate interpolation performance, automatic weather stations were used as training data and automated synoptic observing systems were used as test data for cross-validation. The forest soil factors were not significant for interpolating the weather in the mountainous areas. GAM with only geographic factors interpolated well in terms of root mean squared error and mean absolute error. The significance of the factors was tested at the 5% significance level in GAM, and the climate zone code (CLZN_CD) and soil water code B (SIBFLR_LAR) were identified as relatively important factors. The results show that CLZN_CD could help interpolate the daily average and daily minimum temperature for upland crops.
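
The two error measures named in the abstract above, root mean squared error and mean absolute error, can be computed as in this minimal sketch. The observed and predicted values are made-up numbers for illustration only, and the function names are hypothetical; they are not data or code from the study.

```python
import math

def rmse(observed, predicted):
    """Root mean squared error: penalises large deviations more strongly."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed))

def mae(observed, predicted):
    """Mean absolute error: average magnitude of the deviations."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)

# Hypothetical daily temperatures (degrees C) at three held-out test stations
observed = [10.0, 12.0, 14.0]
predicted = [11.0, 12.0, 12.0]

print(f"RMSE = {rmse(observed, predicted):.3f}")  # sqrt(5/3), about 1.291
print(f"MAE  = {mae(observed, predicted):.3f}")   # 1.000
```

RMSE is never smaller than MAE on the same data, and a large gap between the two hints at a few stations with unusually large errors.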

Verification of GEO-KOMPSAT-2A AMI Radiometric Calibration Parameters Using an Evaluation Tool (분석툴을 이용한 천리안2A 기상탑재체 복사 보정 파라미터 검증)

  • Jin, Kyoungwook; Park, Jin-Hyung
    • Korean Journal of Remote Sensing / v.36 no.6_1 / pp.1323-1337 / 2020
  • Evaluating the radiometric calibration of the GEO-KOMPSAT-2A AMI (Advanced Meteorological Imager) is essential not only for functional and performance verification of the payload but also for the quality of the sensor data. The AMI instrument consists of six reflective channels and ten thermal infrared (IR) channels. The key parameters representing the radiometric properties of the sensor are the SNR (Signal-to-Noise Ratio) for the reflective channels and the NEdT (Noise Equivalent delta Temperature) for the IR channels. Other important radiometric calibration parameters are the dynamic range and a gain value related to the responsivity of the detectors. To verify the major radiometric calibration performance of AMI, an offline radiometric evaluation tool was developed separately from the real-time AMI data processing system. Using the evaluation tool, validation activities were carried out during the GEO-KOMPSAT-2A In-Orbit Test period. The results from the evaluation tool were cross-checked with those of HARRIS, the AMI payload vendor. AMI radiometric evaluation activities were conducted in three phases for both sides (Side 1 and Side 2) of the AMI payload. The results showed that the key radiometric properties performed outstandingly with respect to the radiometric requirements of the payload. The effectiveness of the evaluation tool was verified as well.

Translation and Cross-Cultural Adaptation Study on a Korean of Sensory Processing Measure Home Form (가정용 Sensory Processing Measure(SPM)의 국내적용을 위한 번역연구)

  • Lee, Hye-Rim; Yoo, Eun-Jung; Kim, Kyeong-Mi
    • The Journal of Korean Academy of Sensory Integration / v.19 no.3 / pp.22-31 / 2021
  • Purpose: This study aimed to conduct a translation, back-translation, and content validity test of the Sensory Processing Measure (SPM) for Korean children. Methods: The translation and content validation process involved direct and backward translation; a test of equivalence between the two versions (the original SPM and the Korean version, K-SPM) was performed using content-related evidence collected from a group of experts and a group of parents. Data analysis was carried out using Excel. Content validity indices (CVI), mean, and standard deviation were used for the analysis of content validity. Results: In the expert group, the comparison between the original SPM and the K-SPM yielded 3.54 ± .74, the S-CVI/Avg for semanticity was .92, and the S-CVI/Avg for structure was .86. The mean of the understanding test and its S-CVI/Avg were 3.48 ± .63 and .94, respectively. Conclusion: The K-SPM is expected to be useful as an assessment for identifying sensory processing, praxis, and social participation issues in children in Korea. Further studies are suggested to increase the age range and the sample size for a more comprehensive applicability of the K-SPM to Korean children.
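
The S-CVI/Avg figures reported in the abstract above are averages of item-level content validity indices. A minimal sketch of that arithmetic follows. It assumes the common convention of a 4-point relevance scale on which ratings of 3 or 4 count as agreement; the abstract does not state the scale used, and the ratings and function names below are hypothetical.

```python
def item_cvi(ratings, relevant=(3, 4)):
    """I-CVI: share of raters who scored the item as relevant (assumed: 3 or 4 of 4)."""
    return sum(r in relevant for r in ratings) / len(ratings)

def scale_cvi_avg(rating_matrix):
    """S-CVI/Avg: mean of the item-level CVIs across all items."""
    cvis = [item_cvi(item) for item in rating_matrix]
    return sum(cvis) / len(cvis)

# Hypothetical ratings: 3 items, each scored by 5 experts
ratings = [
    [4, 4, 3, 4, 3],  # I-CVI = 5/5 = 1.0
    [4, 3, 2, 4, 4],  # I-CVI = 4/5 = 0.8
    [3, 4, 4, 2, 2],  # I-CVI = 3/5 = 0.6
]
print(round(scale_cvi_avg(ratings), 2))  # 0.8
```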

Sentiment Analysis of Product Reviews to Identify Deceptive Rating Information in Social Media: A SentiDeceptive Approach

  • Marwat, M. Irfan; Khan, Javed Ali; Alshehri, Dr. Mohammad Dahman; Ali, Muhammad Asghar; Hizbullah; Ali, Haider; Assam, Muhammad
    • KSII Transactions on Internet and Information Systems (TIIS) / v.16 no.3 / pp.830-860 / 2022
  • [Introduction] Nowadays, many companies are shifting their businesses online because customers increasingly prefer to purchase products online. [Problem] Users share a vast amount of information about products, making it difficult for end-users to reach decisions. [Motivation] Therefore, we need a mechanism that automatically analyzes end-user opinions, thoughts, or feelings about products on social media platforms, which customers can use to make or change their purchasing decisions. [Proposed Solution] For this purpose, we propose an automated SentiDeceptive approach, which classifies end-user reviews into negative, positive, and neutral sentiments and identifies deceptive crowd-user rating information on social media platforms to help users in decision-making. [Methodology] We first collected 11781 end-user comments from the Amazon store and the Flipkart web application covering distinct products, such as watches, mobiles, shoes, clothes, and perfumes. Next, we developed a coding guideline used as a base for the comment annotation process. We then applied the content analysis approach and the existing VADER library to annotate the end-user comments in the data set with the identified codes, which resulted in a labelled data set used as input to the machine learning classifiers. Finally, we applied the sentiment analysis approach to identify end-user opinions and overcome deceptive rating information on social media platforms: we preprocessed the input data to remove irrelevant data (stop words, special characters, etc.) from the data set, employed two standard resampling approaches to balance the data set, i.e., oversampling and under-sampling, extracted different features (TF-IDF and BOW) from the textual data, and then trained and tested the machine learning algorithms using standard cross-validation approaches (KFold and Shuffle Split). [Results/Outcomes] Furthermore, to support our research study, we developed an automated tool that analyzes each piece of customer feedback and displays the collective sentiment of customers about a specific product in a graph, which helps customers make decisions. In a nutshell, our proposed approach produces good results when identifying customer sentiments from online user feedback, i.e., it obtained an average of 94.01% precision, 93.69% recall, and 93.81% F-measure for classifying positive sentiments.
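
The precision, recall, and F-measure reported for the positive class in the abstract above follow the usual one-vs-rest definitions, sketched below. The labels and predictions are hypothetical toy values, not the study's data, and the function name is an illustrative choice.

```python
def precision_recall_f1(y_true, y_pred, positive):
    """One-vs-rest precision, recall, and F-measure for the class named by positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # correctly flagged
    fp = sum(t != positive and p == positive for t, p in pairs)  # wrongly flagged
    fn = sum(t == positive and p != positive for t, p in pairs)  # missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Hypothetical three-class sentiment labels for six reviews
y_true = ["pos", "pos", "pos", "neg", "neu", "pos"]
y_pred = ["pos", "pos", "neg", "pos", "neu", "pos"]

p, r, f = precision_recall_f1(y_true, y_pred, positive="pos")
print(p, r, f)  # 0.75 0.75 0.75 (tp=3, fp=1, fn=1)
```

In a cross-validated setting such as the one the abstract describes, these values would be computed per fold and then averaged.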

Major environmental factors and traits of invasive alien plants determining their spatial distribution

  • Oh, Minwoo; Heo, Yoonjeong; Lee, Eun Ju; Lee, Hyohyemi
    • Journal of Ecology and Environment / v.45 no.4 / pp.277-286 / 2021
  • Background: As trade increases, the influx of alien species and their spread to new regions have become prevalent and are no longer an unusual problem. Anthropogenic activities and climate change have made it common for alien species to occur outside their native range. As a result, alien species can be found almost anywhere, differing only in intensity. The widespread distribution of alien species adversely affects ecosystems, and a strategic management plan must be established to control them effectively. To this end, hot spots and cold spots were analyzed according to the degree of distribution of invasive alien plants, and the major environmental factors related to hot spots were identified. We analyzed 10,287 distribution points of 126 alien plant species, collected through the national survey of alien species, using the hierarchical model of species communities (HMSC) framework. Results: The explanatory power and the fourfold cross-validation predictive power of the model were 0.91 and 0.75 as AUC values, respectively. The hot spots of invasive plants were found in the Seoul metropolitan area, Daegu metropolitan city, Chungcheongbuk-do Province, the southwest shore, and Jeju island. Generally, hot spots were found where the maximum summer temperature, winter precipitation, and road density were higher, whereas temperature seasonality, annual temperature range, summer precipitation, and distance to rivers and the sea were negatively related to hot spots. According to the model, functional traits accounted for 55% of the variance explained by the environmental factors. Species with higher specific leaf areas were found more often where temperature seasonality was low. Taller species preferred a larger annual temperature range. Heavier seed mass was preferred only when the maximum summer temperature exceeded 29 ℃. Conclusions: Hot spots were places where on average 2.1 times more alien plants were distributed than in non-hot spots (33.5 vs 15.7 species). The hot spots of invasive plants were expected to appear under less stressful climate conditions, such as low fluctuation of temperature and precipitation. Disturbance by anthropogenic factors or water flow also had a positive influence on the hot spots. These results are consistent with previous reports that invasive plants follow ruderal or competitive strategies rather than a stress-tolerant strategy. Functional traits are closely related to the ecological strategies of plants because they shape the response of species to various environmental filters, and our results confirm this. Therefore, to control alien plants effectively, the occurrence of disturbed sites where alien plants can grow in large quantities should be minimized, and river management of waterfronts is required.
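
The AUC values quoted in the abstract above measure how often a model ranks a true presence above a true absence. A minimal sketch of that pairwise definition follows. The labels and scores are hypothetical, and the function is an illustration of the metric only, not part of the HMSC framework.

```python
def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical presence (1) / absence (0) labels and predicted occurrence probabilities
labels = [1, 1, 0, 0, 1]
scores = [0.9, 0.4, 0.35, 0.8, 0.7]

print(round(auc(labels, scores), 3))  # 0.667: 4 of 6 pairs ranked correctly
```

An AUC computed on the fitting data (explanatory power) is typically higher than one computed on held-out folds (predictive power), which matches the 0.91 versus 0.75 pattern reported above.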

A Fuzzy-AHP-based Movie Recommendation System using the GRU Language Model (GRU 언어 모델을 이용한 Fuzzy-AHP 기반 영화 추천 시스템)

  • Oh, Jae-Taek; Lee, Sang-Yong
    • Journal of Digital Convergence / v.19 no.8 / pp.319-325 / 2021
  • With the advancement of wireless technology and the rapid growth of mobile communication infrastructure, systems that apply AI-based platforms are drawing attention from users. In particular, systems that understand users' tastes and interests and recommend preferred items are applied to advanced customized e-commerce services and smart homes. However, it is difficult for these recommendation systems to reflect the varied tastes and interests of users in real time. In this research, we propose a Fuzzy-AHP-based movie recommendation system using the Gated Recurrent Unit (GRU) language model to address this problem. In this system, we apply Fuzzy-AHP to reflect users' tastes or interests in real time. We also apply GRU language model-based models to analyze public interest and the content of films in order to recommend movies similar to the user's preferred factors. To validate the performance of this recommendation system, we measured the suitability of the learning model using the scraped data used in the learning module, and measured the learning performance rate by comparing the learning time per epoch with that of the Long Short-Term Memory (LSTM) language model. The results show that the learning model is suitable, with an average cross-validation index of 94.8%, and that its learning performance rate exceeds that of the LSTM language model.

A Study on Digital Documentation of Precise Monitoring for Microscale Displacements within the Tomb of King Muryeong and the Royal Tombs in Gongju, Korea (공주 무령왕릉과 왕릉원 내부 미세변위 정밀모니터링을 위한 디지털 기록화 연구)

  • Choi, Il Kyu; Yang, Hye Ri; Lee, Chan Hee
    • Journal of Conservation Science / v.37 no.6 / pp.626-637 / 2021
  • The tomb complex of the royal family from the period of the Ungjin Baekje Kingdom (475 to 538 AD) in Gongju, Korea, contains the tomb of King Muryeong and other royal tombs. After the excavation of the tomb of King Muryeong in 1971, these tombs were opened up to the public, without the establishment of systems for their safety, conservation and management. The tombs have consequently experienced rapid environmental changes and suffered various damages. In this study, specific vulnerable parts inside the tombs were selected for deviation analysis using 3D scanning, and 3D image models were constructed on this basis. Progressive displacement was identified in tomb No. 5, and basic data for future investigations was acquired from tomb No. 6 and the tomb of King Muryeong. In the deviation analysis for the southern plastered wall of tomb No. 5, the damage was not found to exceed the ranges of ±18 mm and ±2 mm. However, the lintel stone was found to be sagging by 0.32 mm on average, and the distance between the walls to have increased by 0.36 mm on average. Direct water seepage occurring in tomb No. 5 is considered to be increasing the damage within the tomb, such as the dropping and sagging of the lintel. The 3D image models constructed in this study will play an important role as baseline data for future research, and can be used to discuss a secure conservation scheme for the tombs through cross-validation with precise measurement monitoring.