• Title/Summary/Keyword: convolutional network

Search Result 1,671, Processing Time 0.032 seconds

Predictive modeling algorithms for liver metastasis in colorectal cancer: A systematic review of the current literature

  • Isaac Seow-En;Ye Xin Koh;Yun Zhao;Boon Hwee Ang;Ivan En-Howe Tan;Aik Yong Chok;Emile John Kwong Wei Tan;Marianne Kit Har Au
    • Annals of Hepato-Biliary-Pancreatic Surgery
    • /
    • v.28 no.1
    • /
    • pp.14-24
    • /
    • 2024
  • This study aims to assess the quality and performance of predictive models for colorectal cancer liver metastasis (CRCLM). A systematic review was performed to identify relevant studies from various databases. Studies that described or validated predictive models for CRCLM were included. The methodological quality of the predictive models was assessed. Model performance was evaluated by the reported area under the receiver operating characteristic curve (AUC). Of the 117 articles screened, seven studies comprising 14 predictive models were included. The distribution of included predictive models was as follows: radiomics (n = 3), logistic regression (n = 3), Cox regression (n = 2), nomogram (n = 3), support vector machine (SVM, n = 2), random forest (n = 2), and convolutional neural network (CNN, n = 2). Age, sex, carcinoembryonic antigen, and tumor staging (T and N stage) were the most frequently used clinicopathological predictors for CRCLM. The mean AUCs ranged from 0.697 to 0.870, with 86% of the models demonstrating clear discriminative ability (AUC > 0.70). A hybrid approach combining clinical and radiomic features with SVM provided the best performance, achieving an AUC of 0.870. The overall risk of bias was identified as high in 71% of the included studies. This review highlights the potential of predictive modeling to accurately predict the occurrence of CRCLM. Integrating clinicopathological and radiomic features with machine learning algorithms demonstrates superior predictive capabilities.

Speech Emotion Recognition in People at High Risk of Dementia

  • Dongseon Kim;Bongwon Yi;Yugwon Won
    • Dementia and Neurocognitive Disorders
    • /
    • v.23 no.3
    • /
    • pp.146-160
    • /
    • 2024
  • Background and Purpose: The emotions of people at various stages of dementia need to be effectively utilized for prevention, early intervention, and care planning. With technology available for understanding and addressing the emotional needs of people, this study aims to develop speech emotion recognition (SER) technology to classify emotions for people at high risk of dementia. Methods: Speech samples from people at high risk of dementia were categorized into distinct emotions via human auditory assessment, the outcomes of which were annotated for guided deep-learning method. The architecture incorporated convolutional neural network, long short-term memory, attention layers, and Wav2Vec2, a novel feature extractor to develop automated speech-emotion recognition. Results: Twenty-seven kinds of Emotions were found in the speech of the participants. These emotions were grouped into 6 detailed emotions: happiness, interest, sadness, frustration, anger, and neutrality, and further into 3 basic emotions: positive, negative, and neutral. To improve algorithmic performance, multiple learning approaches were applied using different data sources-voice and text-and varying the number of emotions. Ultimately, a 2-stage algorithm-initial text-based classification followed by voice-based analysis-achieved the highest accuracy, reaching 70%. Conclusions: The diverse emotions identified in this study were attributed to the characteristics of the participants and the method of data collection. The speech of people at high risk of dementia to companion robots also explains the relatively low performance of the SER algorithm. Accordingly, this study suggests the systematic and comprehensive construction of a dataset from people with dementia.

Feasibility of a deep learning-based diagnostic platform to evaluate lower urinary tract disorders in men using simple uroflowmetry

  • Seokhwan Bang;Sokhib Tukhtaev;Kwang Jin Ko;Deok Hyun Han;Minki Baek;Hwang Gyun Jeon;Baek Hwan Cho;Kyu-Sung Lee
    • Investigative and Clinical Urology
    • /
    • v.63 no.3
    • /
    • pp.301-308
    • /
    • 2022
  • Purpose To diagnose lower urinary tract symptoms (LUTS) in a noninvasive manner, we created a prediction model for bladder outlet obstruction (BOO) and detrusor underactivity (DUA) using simple uroflowmetry. In this study, we used deep learning to analyze simple uroflowmetry. Materials and Methods We performed a retrospective review of 4,835 male patients aged ≥40 years who underwent a urodynamic study at a single center. We excluded patients with a disease or a history of surgery that could affect LUTS. A total of 1,792 patients were included in the study. We extracted a simple uroflowmetry graph automatically using the ABBYY Flexicapture® image capture program (ABBYY, Moscow, Russia). We applied a convolutional neural network (CNN), a deep learning method to predict DUA and BOO. A 5-fold cross-validation average value of the area under the receiver operating characteristic (AUROC) curve was chosen as an evaluation metric. When it comes to binary classification, this metric provides a richer measure of classification performance. Additionally, we provided the corresponding average precision-recall (PR) curves. Results Among the 1,792 patients, 482 (26.90%) had BOO, and 893 (49.83%) had DUA. The average AUROC scores of DUA and BOO, which were measured using 5-fold cross-validation, were 73.30% (mean average precision [mAP]=0.70) and 72.23% (mAP=0.45), respectively. Conclusions Our study suggests that it is possible to differentiate DUA from non-DUA and BOO from non-BOO using a simple uroflowmetry graph with a fine-tuned VGG16, which is a well-known CNN model.

Classification of mandibular molar furcation involvement in periapical radiographs by deep learning

  • Katerina Vilkomir;Cody Phen;Fiondra Baldwin;Jared Cole;Nic Herndon;Wenjian Zhang
    • Imaging Science in Dentistry
    • /
    • v.54 no.3
    • /
    • pp.257-263
    • /
    • 2024
  • Purpose: The purpose of this study was to classify mandibular molar furcation involvement (FI) in periapical radiographs using a deep learning algorithm. Materials and Methods: Full mouth series taken at East Carolina University School of Dental Medicine from 2011-2023 were screened. Diagnostic-quality mandibular premolar and molar periapical radiographs with healthy or FI mandibular molars were included. The radiographs were cropped into individual molar images, annotated as "healthy" or "FI," and divided into training, validation, and testing datasets. The images were preprocessed by PyTorch transformations. ResNet-18, a convolutional neural network model, was refined using the PyTorch deep learning framework for the specific imaging classification task. CrossEntropyLoss and the AdamW optimizer were employed for loss function training and optimizing the learning rate, respectively. The images were loaded by PyTorch DataLoader for efficiency. The performance of ResNet-18 algorithm was evaluated with multiple metrics, including training and validation losses, confusion matrix, accuracy, sensitivity, specificity, the receiver operating characteristic (ROC) curve, and the area under the ROC curve. Results: After adequate training, ResNet-18 classified healthy vs. FI molars in the testing set with an accuracy of 96.47%, indicating its suitability for image classification. Conclusion: The deep learning algorithm developed in this study was shown to be promising for classifying mandibular molar FI. It could serve as a valuable supplemental tool for detecting and managing periodontal diseases.

CNN-ViT Hybrid Aesthetic Evaluation Model Based on Quantification of Cognitive Features in Images (이미지의 인지적 특징 정량화를 통한 CNN-ViT 하이브리드 미학 평가 모델)

  • Soo-Eun Kim;Joon-Shik Lim
    • Journal of IKEEE
    • /
    • v.28 no.3
    • /
    • pp.352-359
    • /
    • 2024
  • This paper proposes a CNN-ViT hybrid model that automatically evaluates the aesthetic quality of images by combining local and global features. In this approach, CNN is used to extract local features such as color and object placement, while ViT is employed to analyze the aesthetic value of the image by reflecting global features. Color composition is derived by extracting the primary colors from the input image, creating a color palette, and then passing it through the CNN. The Rule of Thirds is quantified by calculating how closely objects in the image are positioned near the thirds intersection points. These values provide the model with critical information about the color balance and spatial harmony of the image. The model then analyzes the relationship between these factors to predict scores that align closely with human judgment. Experimental results on the AADB image database show that the proposed model achieved a Spearman's Rank Correlation Coefficient (SRCC) of 0.716, indicating more consistent rank predictions, and a Pearson Correlation Coefficient (LCC) of 0.72, which is 2~4% higher than existing models.

Satellite-Based Cabbage and Radish Yield Prediction Using Deep Learning in Kangwon-do (딥러닝을 활용한 위성영상 기반의 강원도 지역의 배추와 무 수확량 예측)

  • Hyebin Park;Yejin Lee;Seonyoung Park
    • Korean Journal of Remote Sensing
    • /
    • v.39 no.5_3
    • /
    • pp.1031-1042
    • /
    • 2023
  • In this study, a deep learning model was developed to predict the yield of cabbage and radish, one of the five major supply and demand management vegetables, using satellite images of Landsat 8. To predict the yield of cabbage and radish in Gangwon-do from 2015 to 2020, satellite images from June to September, the growing period of cabbage and radish, were used. Normalized difference vegetation index, enhanced vegetation index, lead area index, and land surface temperature were employed in this study as input data for the yield model. Crop yields can be effectively predicted using satellite images because satellites collect continuous spatiotemporal data on the global environment. Based on the model developed previous study, a model designed for input data was proposed in this study. Using time series satellite images, convolutional neural network, a deep learning model, was used to predict crop yield. Landsat 8 provides images every 16 days, but it is difficult to acquire images especially in summer due to the influence of weather such as clouds. As a result, yield prediction was conducted by splitting June to July into one part and August to September into two. Yield prediction was performed using a machine learning approach and reference models , and modeling performance was compared. The model's performance and early predictability were assessed using year-by-year cross-validation and early prediction. The findings of this study could be applied as basic studies to predict the yield of field crops in Korea.

Comparison of Convolutional Neural Network (CNN) Models for Lettuce Leaf Width and Length Prediction (상추잎 너비와 길이 예측을 위한 합성곱 신경망 모델 비교)

  • Ji Su Song;Dong Suk Kim;Hyo Sung Kim;Eun Ji Jung;Hyun Jung Hwang;Jaesung Park
    • Journal of Bio-Environment Control
    • /
    • v.32 no.4
    • /
    • pp.434-441
    • /
    • 2023
  • Determining the size or area of a plant's leaves is an important factor in predicting plant growth and improving the productivity of indoor farms. In this study, we developed a convolutional neural network (CNN)-based model to accurately predict the length and width of lettuce leaves using photographs of the leaves. A callback function was applied to overcome data limitations and overfitting problems, and K-fold cross-validation was used to improve the generalization ability of the model. In addition, ImageDataGenerator function was used to increase the diversity of training data through data augmentation. To compare model performance, we evaluated pre-trained models such as VGG16, Resnet152, and NASNetMobile. As a result, NASNetMobile showed the highest performance, especially in width prediction, with an R_squared value of 0.9436, and RMSE of 0.5659. In length prediction, the R_squared value was 0.9537, and RMSE of 0.8713. The optimized model adopted the NASNetMobile architecture, the RMSprop optimization tool, the MSE loss functions, and the ELU activation functions. The training time of the model averaged 73 minutes per Epoch, and it took the model an average of 0.29 seconds to process a single lettuce leaf photo. In this study, we developed a CNN-based model to predict the leaf length and leaf width of plants in indoor farms, which is expected to enable rapid and accurate assessment of plant growth status by simply taking images. It is also expected to contribute to increasing the productivity and resource efficiency of farms by taking appropriate agricultural measures such as adjusting nutrient solution in real time.

A Study on People Counting in Public Metro Service using Hybrid CNN-LSTM Algorithm (Hybrid CNN-LSTM 알고리즘을 활용한 도시철도 내 피플 카운팅 연구)

  • Choi, Ji-Hye;Kim, Min-Seung;Lee, Chan-Ho;Choi, Jung-Hwan;Lee, Jeong-Hee;Sung, Tae-Eung
    • Journal of Intelligence and Information Systems
    • /
    • v.26 no.2
    • /
    • pp.131-145
    • /
    • 2020
  • In line with the trend of industrial innovation, IoT technology utilized in a variety of fields is emerging as a key element in creation of new business models and the provision of user-friendly services through the combination of big data. The accumulated data from devices with the Internet-of-Things (IoT) is being used in many ways to build a convenience-based smart system as it can provide customized intelligent systems through user environment and pattern analysis. Recently, it has been applied to innovation in the public domain and has been using it for smart city and smart transportation, such as solving traffic and crime problems using CCTV. In particular, it is necessary to comprehensively consider the easiness of securing real-time service data and the stability of security when planning underground services or establishing movement amount control information system to enhance citizens' or commuters' convenience in circumstances with the congestion of public transportation such as subways, urban railways, etc. However, previous studies that utilize image data have limitations in reducing the performance of object detection under private issue and abnormal conditions. The IoT device-based sensor data used in this study is free from private issue because it does not require identification for individuals, and can be effectively utilized to build intelligent public services for unspecified people. Especially, sensor data stored by the IoT device need not be identified to an individual, and can be effectively utilized for constructing intelligent public services for many and unspecified people as data free form private issue. We utilize the IoT-based infrared sensor devices for an intelligent pedestrian tracking system in metro service which many people use on a daily basis and temperature data measured by sensors are therein transmitted in real time. The experimental environment for collecting data detected in real time from sensors was established for the equally-spaced midpoints of 4×4 upper parts in the ceiling of subway entrances where the actual movement amount of passengers is high, and it measured the temperature change for objects entering and leaving the detection spots. The measured data have gone through a preprocessing in which the reference values for 16 different areas are set and the difference values between the temperatures in 16 distinct areas and their reference values per unit of time are calculated. This corresponds to the methodology that maximizes movement within the detection area. In addition, the size of the data was increased by 10 times in order to more sensitively reflect the difference in temperature by area. For example, if the temperature data collected from the sensor at a given time were 28.5℃, the data analysis was conducted by changing the value to 285. As above, the data collected from sensors have the characteristics of time series data and image data with 4×4 resolution. Reflecting the characteristics of the measured, preprocessed data, we finally propose a hybrid algorithm that combines CNN in superior performance for image classification and LSTM, especially suitable for analyzing time series data, as referred to CNN-LSTM (Convolutional Neural Network-Long Short Term Memory). In the study, the CNN-LSTM algorithm is used to predict the number of passing persons in one of 4×4 detection areas. We verified the validation of the proposed model by taking performance comparison with other artificial intelligence algorithms such as Multi-Layer Perceptron (MLP), Long Short Term Memory (LSTM) and RNN-LSTM (Recurrent Neural Network-Long Short Term Memory). As a result of the experiment, proposed CNN-LSTM hybrid model compared to MLP, LSTM and RNN-LSTM has the best predictive performance. By utilizing the proposed devices and models, it is expected various metro services will be provided with no illegal issue about the personal information such as real-time monitoring of public transport facilities and emergency situation response services on the basis of congestion. However, the data have been collected by selecting one side of the entrances as the subject of analysis, and the data collected for a short period of time have been applied to the prediction. There exists the limitation that the verification of application in other environments needs to be carried out. In the future, it is expected that more reliability will be provided for the proposed model if experimental data is sufficiently collected in various environments or if learning data is further configured by measuring data in other sensors.

Prediction Model of User Physical Activity using Data Characteristics-based Long Short-term Memory Recurrent Neural Networks

  • Kim, Joo-Chang;Chung, Kyungyong
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.13 no.4
    • /
    • pp.2060-2077
    • /
    • 2019
  • Recently, mobile healthcare services have attracted significant attention because of the emerging development and supply of diverse wearable devices. Smartwatches and health bands are the most common type of mobile-based wearable devices and their market size is increasing considerably. However, simple value comparisons based on accumulated data have revealed certain problems, such as the standardized nature of health management and the lack of personalized health management service models. The convergence of information technology (IT) and biotechnology (BT) has shifted the medical paradigm from continuous health management and disease prevention to the development of a system that can be used to provide ground-based medical services regardless of the user's location. Moreover, the IT-BT convergence has necessitated the development of lifestyle improvement models and services that utilize big data analysis and machine learning to provide mobile healthcare-based personal health management and disease prevention information. Users' health data, which are specific as they change over time, are collected by different means according to the users' lifestyle and surrounding circumstances. In this paper, we propose a prediction model of user physical activity that uses data characteristics-based long short-term memory (DC-LSTM) recurrent neural networks (RNNs). To provide personalized services, the characteristics and surrounding circumstances of data collectable from mobile host devices were considered in the selection of variables for the model. The data characteristics considered were ease of collection, which represents whether or not variables are collectable, and frequency of occurrence, which represents whether or not changes made to input values constitute significant variables in terms of activity. The variables selected for providing personalized services were activity, weather, temperature, mean daily temperature, humidity, UV, fine dust, asthma and lung disease probability index, skin disease probability index, cadence, travel distance, mean heart rate, and sleep hours. The selected variables were classified according to the data characteristics. To predict activity, an LSTM RNN was built that uses the classified variables as input data and learns the dynamic characteristics of time series data. LSTM RNNs resolve the vanishing gradient problem that occurs in existing RNNs. They are classified into three different types according to data characteristics and constructed through connections among the LSTMs. The constructed neural network learns training data and predicts user activity. To evaluate the proposed model, the root mean square error (RMSE) was used in the performance evaluation of the user physical activity prediction method for which an autoregressive integrated moving average (ARIMA) model, a convolutional neural network (CNN), and an RNN were used. The results show that the proposed DC-LSTM RNN method yields an excellent mean RMSE value of 0.616. The proposed method is used for predicting significant activity considering the surrounding circumstances and user status utilizing the existing standardized activity prediction services. It can also be used to predict user physical activity and provide personalized healthcare based on the data collectable from mobile host devices.

Dual CNN Structured Sound Event Detection Algorithm Based on Real Life Acoustic Dataset (실생활 음향 데이터 기반 이중 CNN 구조를 특징으로 하는 음향 이벤트 인식 알고리즘)

  • Suh, Sangwon;Lim, Wootaek;Jeong, Youngho;Lee, Taejin;Kim, Hui Yong
    • Journal of Broadcast Engineering
    • /
    • v.23 no.6
    • /
    • pp.855-865
    • /
    • 2018
  • Sound event detection is one of the research areas to model human auditory cognitive characteristics by recognizing events in an environment with multiple acoustic events and determining the onset and offset time for each event. DCASE, a research group on acoustic scene classification and sound event detection, is proceeding challenges to encourage participation of researchers and to activate sound event detection research. However, the size of the dataset provided by the DCASE Challenge is relatively small compared to ImageNet, which is a representative dataset for visual object recognition, and there are not many open sources for the acoustic dataset. In this study, the sound events that can occur in indoor and outdoor are collected on a larger scale and annotated for dataset construction. Furthermore, to improve the performance of the sound event detection task, we developed a dual CNN structured sound event detection system by adding a supplementary neural network to a convolutional neural network to determine the presence of sound events. Finally, we conducted a comparative experiment with both baseline systems of the DCASE 2016 and 2017.