• Title/Abstract/Keywords: Deep Learning Convergence Study

Search results: 326

SVM on Top of Deep Networks for Covid-19 Detection from Chest X-ray Images

  • Do, Thanh-Nghi; Le, Van-Thanh; Doan, Thi-Huong
    • Journal of information and communication convergence engineering / Vol. 20, No. 3 / pp. 219-225 / 2022
  • In this study, we propose training a support vector machine (SVM) model on top of deep networks for detecting Covid-19 from chest X-ray images. We started by gathering a real chest X-ray image dataset including positive Covid-19 cases, normal cases, and other lung diseases not caused by Covid-19. Instead of training deep networks from scratch, we fine-tuned recent pre-trained deep network models, such as DenseNet121, MobileNet v2, Inception v3, Xception, ResNet50, VGG16, and VGG19, to classify chest X-ray images into one of three classes (Covid-19, normal, and other lung diseases). We then train an SVM model on top of the deep networks to perform a nonlinear combination of their outputs, improving classification over any single deep network. Empirical tests on the real chest X-ray image dataset show that the deep network models, with the exception of ResNet50 at 82.44%, achieve an accuracy of at least 92% on the test set. The proposed SVM on top of the deep networks achieved the highest accuracy, 96.16%.
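The stacking idea can be sketched in a few lines. The NumPy-only illustration below uses synthetic vectors as stand-ins for the concatenated softmax outputs of the fine-tuned CNNs, and a linear SVM trained by Pegasos-style sub-gradient descent as a stand-in for the paper's SVM (the paper combines outputs nonlinearly; linear is used here only to keep the sketch short):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins for the softmax outputs of several fine-tuned CNNs
# (DenseNet121, MobileNet v2, ...): each row concatenates per-model
# 3-class probability vectors for one chest X-ray.
n_models, n_classes, n_samples = 3, 3, 300
y = rng.integers(0, 2, n_samples)             # 1 = Covid-19, 0 = not (binary for brevity)
X = rng.normal(0.0, 0.3, (n_samples, n_models * n_classes))
X[:, 0] += y                                   # make the two classes separable

def train_linear_svm(X, y, lam=0.01, epochs=200):
    """Minimal Pegasos-style sub-gradient descent on the hinge loss."""
    w = np.zeros(X.shape[1])
    b = 0.0
    t = 0
    y_pm = 2 * y - 1                           # labels in {-1, +1}
    for _ in range(epochs):
        for i in rng.permutation(len(y)):
            t += 1
            eta = 1.0 / (lam * t)              # decreasing step size
            margin = y_pm[i] * (X[i] @ w + b)
            w *= (1 - eta * lam)               # regularization shrink
            if margin < 1:                     # hinge-loss sub-gradient step
                w += eta * y_pm[i] * X[i]
                b += eta * y_pm[i]
    return w, b

w, b = train_linear_svm(X, y)
pred = (X @ w + b > 0).astype(int)
acc = (pred == y).mean()
```

In the real pipeline, `X` would hold the per-model class probabilities for each X-ray, and an off-the-shelf kernel SVM would replace `train_linear_svm`.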

Comparison of Deep-Learning Algorithms for the Detection of Railroad Pedestrians

  • Fang, Ziyu; Kim, Pyeoungkee
    • Journal of information and communication convergence engineering / Vol. 18, No. 1 / pp. 28-32 / 2020
  • Railway transportation is the main land-based transportation in most countries, so railway-transportation safety has always been a key issue for many researchers. Pedestrian accidents are the main cause of railway-transportation casualties. In this study, we conduct experiments to determine which of the latest convolutional neural network models and algorithms are appropriate for building pedestrian railroad accident prevention systems. As a drone cruises along a pre-specified path and altitude, the real-time status around the rail is recorded and the image information is transmitted back to the server. The images are then analyzed to determine whether pedestrians are present around the railroad, and if so, a deceleration order is immediately sent to the train driver, reducing the number of pedestrian railroad accidents. This is the first part of an envisioned drone-based intelligent security system, which can effectively address the problem of insufficient manpower for manual patrols.

Deep Learning-based Image Data Processing and Archival System for Object Detection of Endangered Species

  • Choe, Dea-Gyu; Kim, Dong-Keun
    • Journal of information and communication convergence engineering / Vol. 18, No. 4 / pp. 267-277 / 2020
  • It is important to understand the exact habitat distribution of endangered species because their numbers are decreasing. In this study, we build a system with a deep learning module that collects image data of endangered animals, processes the data, and saves it automatically. The system classifies images more efficiently than human effort and addresses two problems faced in previous studies. First, those studies produced spurious answers because probability distributions over the answer candidates were computed even when the actual answer was not in the candidate group. Second, when an image contained two or more entities, only a single entity was considered. We applied an object detection algorithm (YOLO) to resolve these problems. Our system achieves an average precision of 86.79%, a mean recall rate of 93.23%, and a processing speed of 13 frames per second.
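Since the system is scored by average precision and recall, a small example of how detections are evaluated may help. The stand-alone sketch below (not the authors' code) greedily matches predicted boxes to ground-truth boxes at an IoU threshold:

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

def precision_recall(preds, gts, thr=0.5):
    """Greedily match predictions to unmatched ground truths at IoU >= thr."""
    matched = set()
    tp = 0
    for p in preds:
        best_j, best_iou = None, thr
        for j, g in enumerate(gts):
            if j not in matched and iou(p, g) >= best_iou:
                best_j, best_iou = j, iou(p, g)
        if best_j is not None:
            matched.add(best_j)
            tp += 1
    precision = tp / len(preds) if preds else 0.0
    recall = tp / len(gts) if gts else 0.0
    return precision, recall

preds = [(0, 0, 10, 10), (20, 20, 30, 30)]   # detector output
gts   = [(1, 1, 11, 11)]                      # one annotated animal
p, r = precision_recall(preds, gts)           # p = 0.5, r = 1.0
```

Averaging these figures over classes and images gives summary numbers of the kind reported above.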

Similar Image Retrieval Technique based on Semantics through Automatic Labeling Extraction of Personalized Images

  • Seo, Jung-Hee
    • Journal of information and communication convergence engineering / Vol. 22, No. 1 / pp. 56-63 / 2024
  • Despite rapid strides in content-based image retrieval, a notable disparity persists between the visual features of images and the semantic features discerned by humans. Hence, image retrieval that associates the semantic similarities recognized by humans with visual similarities is a difficult task for most image-retrieval systems. Our study endeavors to bridge this gap by refining image semantics to align them more closely with human perception. Deep learning techniques are used to semantically classify images and retrieve those that are semantically similar to personalized images. Moreover, we introduce keyword-based image retrieval, enabling automatic labeling of images in mobile environments. By performing retrieval based on the visual features and keywords of images directly on the mobile device, the proposed approach can improve performance on devices with limited resources and bandwidth.
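The keyword side of such retrieval can be illustrated with a tiny sketch. The file names and label sets below are hypothetical stand-ins for the automatically extracted labels, and Jaccard overlap stands in for whatever ranking the paper actually uses:

```python
def jaccard(a, b):
    """Set overlap between two label lists."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def retrieve(query_labels, catalog, top_k=3):
    """Rank images whose automatically extracted labels best overlap the query."""
    scored = sorted(catalog.items(),
                    key=lambda kv: jaccard(query_labels, kv[1]),
                    reverse=True)
    return [name for name, _ in scored[:top_k]]

# Hypothetical label sets, as if produced by the automatic-labeling step.
catalog = {
    "img_001.jpg": ["beach", "sunset", "sea"],
    "img_002.jpg": ["mountain", "snow"],
    "img_003.jpg": ["beach", "palm"],
}
top = retrieve(["beach", "sea"], catalog, top_k=2)
```

Because only short label lists travel over the network, a ranking like this can run on a device with limited bandwidth, which is the point the abstract makes.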

A Comparative Study on OCR using Super-Resolution for Small Fonts

  • Cho, Wooyeong; Kwon, Juwon; Kwon, Soonchu; Yoo, Jisang
    • International journal of advanced smart convergence / Vol. 8, No. 3 / pp. 95-101 / 2019
  • Recently, there have been many issues related to text recognition using Tesseract. One of these issues is that recognition accuracy is significantly lower for smaller fonts. Tesseract extracts text by creating an outline with direction in the image; by searching the Tesseract database, template matching against characters with similar feature points selects the character with the lowest error. Because of poor text extraction, recognition accuracy is lowered. In this paper, we compared text recognition accuracy after applying various super-resolution methods to small text images and examined how recognition accuracy varies with image size. To recognize small Korean text images, we used super-resolution algorithms based on deep learning models such as SRCNN, ESRCNN, DSRCNN, and DCSCN. The dataset for training and testing consisted of Korean-based scanned images with a 12 pt font, resized to between 0.5 and 0.8 times their original size. The experiment was performed on the 0.5x resized images, and the results showed that DCSCN super-resolution was the most effective, reducing the precision error rate by 7.8% and the recall error rate by 8.4%. The results demonstrate that text recognition accuracy for smaller Korean fonts can be improved by adding super-resolution methods to the OCR preprocessing module.
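The intuition behind super-resolution as OCR preprocessing is that a better upscaler reconstructs small glyphs more faithfully, and even plain interpolation shows the effect. The NumPy-only sketch below (a synthetic smooth image stands in for scanned text; the paper's models are learned, not hand-coded) compares nearest-neighbor against bilinear upscaling:

```python
import numpy as np

def upscale(img, scale, method="nearest"):
    """Upscale a 2-D image by an integer factor with nearest or bilinear sampling."""
    h, w = img.shape
    H, W = h * scale, w * scale
    ys = (np.arange(H) + 0.5) / scale - 0.5   # map output pixels to source coords
    xs = (np.arange(W) + 0.5) / scale - 0.5
    if method == "nearest":
        yi = np.clip(np.round(ys).astype(int), 0, h - 1)
        xi = np.clip(np.round(xs).astype(int), 0, w - 1)
        return img[yi][:, xi]
    # bilinear: blend the four surrounding samples
    y0 = np.clip(np.floor(ys).astype(int), 0, h - 1)
    x0 = np.clip(np.floor(xs).astype(int), 0, w - 1)
    y1 = np.clip(y0 + 1, 0, h - 1)
    x1 = np.clip(x0 + 1, 0, w - 1)
    wy = np.clip(ys - y0, 0, 1)[:, None]
    wx = np.clip(xs - x0, 0, 1)[None, :]
    top = img[y0][:, x0] * (1 - wx) + img[y0][:, x1] * wx
    bot = img[y1][:, x0] * (1 - wx) + img[y1][:, x1] * wx
    return top * (1 - wy) + bot * wy

# Smooth synthetic image standing in for a text scan.
yy, xx = np.mgrid[0:64, 0:64]
hr = np.sin(xx / 6.0) * np.cos(yy / 9.0)
lr = hr[::2, ::2]                             # crude 2x downsample

mse = lambda a, b: float(np.mean((a - b) ** 2))
err_nearest = mse(upscale(lr, 2, "nearest"), hr)
err_bilinear = mse(upscale(lr, 2, "bilinear"), hr)
```

Learned models such as DCSCN push this further by hallucinating plausible high-frequency detail that interpolation cannot recover, which is what the recognition-accuracy comparison above measures.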

Deep-learning-based gestational sac detection in ultrasound images using modified YOLOv7-E6E model

  • Tae-kyeong Kim; Jin Soo Kim; Hyun-chong Cho
    • Journal of Animal Science and Technology / Vol. 65, No. 3 / pp. 627-637 / 2023
  • As population and income levels rise, meat consumption increases steadily every year; however, the number of farms and farmers producing meat has decreased over the same period, reducing meat self-sufficiency. Information and communications technology (ICT) has begun to be applied to reduce the labor and production costs of livestock farms and improve productivity. One such application is rapid pregnancy diagnosis of sows, as the location and size of the gestational sacs are directly related to farm productivity. In this study, we propose a system that determines the number of gestational sacs of sows from ultrasound images. The system is based on the YOLOv7-E6E model, with the activation function changed from the sigmoid-weighted linear unit (SiLU) to a multi-activation function (SiLU + Mish) and the upsampling method changed from nearest-neighbor to bicubic to improve performance. The original model trained on the original data achieved a mean average precision of 86.3%. When the proposed multi-activation function, upsampling change, and AutoAugment were applied individually, performance improved by 0.3%, 0.9%, and 0.9%, respectively. When all three were applied simultaneously, a significant improvement of 3.5 percentage points, to 89.8%, was achieved.
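The two activation functions named above are simple to write down. The sketch below implements both in NumPy; the `multi_activation` blend is a hypothetical stand-in, since the abstract does not specify how SiLU and Mish are combined across the network:

```python
import numpy as np

def silu(x):
    """Sigmoid-weighted linear unit: x * sigmoid(x)."""
    return x / (1.0 + np.exp(-x))

def mish(x):
    """Mish: x * tanh(softplus(x))."""
    return x * np.tanh(np.log1p(np.exp(x)))

def multi_activation(x, alpha=0.5):
    # Hypothetical blend for illustration; the paper's exact combination
    # rule (which layers use which activation) is not given in the abstract.
    return alpha * silu(x) + (1 - alpha) * mish(x)
```

Both functions are smooth and non-monotonic near zero, which is the usual motivation for preferring them over ReLU in detection backbones.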

Traffic Flow Prediction with Spatio-Temporal Information Fusion using Graph Neural Networks

  • Huijuan Ding; Giseop Noh
    • International journal of advanced smart convergence / Vol. 12, No. 4 / pp. 88-97 / 2023
  • Traffic flow prediction is of great significance in urban planning and traffic management. As the complexity of urban traffic increases, existing prediction methods still face challenges, especially in fusing spatiotemporal information and capturing long-term dependencies. This study uses a graph-neural-network fusion model to address the spatiotemporal information fusion problem in traffic flow prediction. We propose a new deep learning model, Spatio-Temporal Information Fusion using Graph Neural Networks (STFGNN), which alternates GCN, TCN, and LSTM modules to carry out spatiotemporal information fusion. The GCN and multi-core TCN capture the spatial and temporal dependencies of traffic flow, respectively, and the LSTM connects multiple fusion modules. In experiments on real traffic flow data, STFGNN outperformed other models.
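One building block named above, the graph convolution, can be written in a few lines. Below is a generic Kipf-Welling-style propagation step on a toy three-sensor road graph; it illustrates what a GCN module computes and is not the authors' STFGNN code:

```python
import numpy as np

def gcn_layer(A, H, W):
    """One graph-convolution step: ReLU(D^-1/2 (A + I) D^-1/2 H W)."""
    A_hat = A + np.eye(A.shape[0])             # add self-loops
    d = A_hat.sum(axis=1)                      # node degrees
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))     # symmetric normalization
    return np.maximum(0.0, D_inv_sqrt @ A_hat @ D_inv_sqrt @ H @ W)

# Toy road network: sensor 0 is connected to sensors 1 and 2.
A = np.array([[0., 1., 1.],
              [1., 0., 0.],
              [1., 0., 0.]])
H = np.array([[1.0], [2.0], [3.0]])            # one traffic-flow feature per node
W = np.array([[1.0]])                          # identity weight for illustration
out = gcn_layer(A, H, W)
```

Each node's new feature is a degree-normalized mix of its neighbors' traffic readings, which is how spatial dependencies enter the fused representation.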

Research on a statistics education program utilizing deep learning predictions in high school mathematics

  • 진혜성; 서보억
    • The Mathematical Education (Journal of the Korean Society of Mathematical Education, Series A) / Vol. 63, No. 2 / pp. 209-231 / 2024
  • With the Fourth Industrial Revolution and the development of artificial intelligence, many changes are taking place in education; in particular, the importance of AI-based education is being emphasized. In line with this trend, this study develops a statistics education program that uses deep learning predictions in high school mathematics and examines how this program, centered on the statistical problem-solving process, affects high school students' statistical literacy and computational thinking. We first developed a statistics education program using deep learning predictions applicable to high school mathematics, then applied it in actual classes and analyzed the results. The analysis showed that by experiencing the contexts in which data are generated and collected, students improved their understanding of context; their understanding of data variability deepened while exploring and analyzing various datasets; and they demonstrated the ability to analyze data critically while verifying its reliability. To analyze the program's effect on students' computational thinking, a paired-sample t-test was conducted, confirming a statistically significant difference in computational thinking before and after the lessons (t = -11.657, p < 0.001).
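The paired-sample t-test used for the pre/post comparison is easy to reproduce. The sketch below uses only the Python standard library, with hypothetical scores (the study's raw data are not given here):

```python
import math
from statistics import mean, stdev

def paired_t(pre, post):
    """Paired-sample t statistic: mean difference over its standard error."""
    diffs = [b - a for a, b in zip(pre, post)]
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / math.sqrt(n))

# Hypothetical pre/post computational-thinking scores, for illustration only.
pre  = [52, 48, 60, 55, 50, 58, 47, 53]
post = [61, 57, 66, 63, 60, 65, 55, 62]
t = paired_t(pre, post)
```

The sign of the statistic depends on the convention (pre minus post gives the negative value reported in the abstract); its magnitude is compared against the t distribution with n - 1 degrees of freedom to obtain the p-value.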

Estimation of gender and age using CNN-based face recognition algorithm

  • Lim, Sooyeon
    • International journal of advanced smart convergence / Vol. 9, No. 2 / pp. 203-211 / 2020
  • This study proposes a deep learning-based method for estimating gender and age that is robust to changes in the external environment. An improved CNN network structure and learning method are described to raise the accuracy of the proposed algorithm, and its performance is evaluated. To improve on a baseline CNN with six hidden layers, a network using GoogLeNet's inception module was constructed. In experiments on 5,328 test images, age estimation accuracy was about 85% and gender estimation accuracy about 98%. With further studies on larger datasets, preprocessing methods, and various network structures and activation functions for classifying more finely subdivided age classes, real-time age recognition beyond face-image feature extraction is expected to become possible.
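The inception idea referenced above, running parallel filters of different sizes over the same input and concatenating the results, can be sketched without a deep learning framework. Mean filters stand in for the learned 1x1/3x3/5x5 convolutions of GoogLeNet's module; this illustrates the structure, not the study's actual network:

```python
import numpy as np

def mean_filter(x, k):
    """Same-padding k x k mean filter, standing in for a learned conv branch."""
    pad = k // 2
    xp = np.pad(x, pad, mode="edge")
    out = np.zeros_like(x)
    H, W = x.shape
    for i in range(H):
        for j in range(W):
            out[i, j] = xp[i:i + k, j:j + k].mean()
    return out

def inception_block(x):
    """Parallel branches with different receptive fields, concatenated
    along the channel axis (the defining trait of an inception module)."""
    branches = [x, mean_filter(x, 3), mean_filter(x, 5)]
    return np.stack(branches, axis=-1)         # (H, W, 3) feature map

face = np.random.default_rng(1).random((8, 8))  # toy grayscale face patch
features = inception_block(face)
```

Letting the network see several receptive-field sizes at once is what makes the module robust to scale variation in face images.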

Automatic Generation of Video Metadata for the Super-personalized Recommendation of Media

  • Yong, Sung Jung; Park, Hyo Gyeong; You, Yeon Hwi; Moon, Il-Young
    • Journal of information and communication convergence engineering / Vol. 20, No. 4 / pp. 288-294 / 2022
  • The media content market has been growing as various types of content are mass-produced owing to the recent proliferation of the Internet and digital media. In addition, platforms that provide personalized services for content consumption are emerging and competing to recommend personalized content. Existing platforms use a method in which users directly input video metadata, so significant time and cost are consumed in processing large amounts of data. In this study, keyframes (based on the YCbCr color model) and audio spectra were extracted from movie trailers for the automatic generation of metadata. The extracted audio spectra and image keyframes were used as training data for genre recognition by deep learning. Deep learning was applied to determine the genre, one element of the video metadata, and suggestions for its utilization were made. A system that automatically generates metadata based on the results of this study will be helpful for research on recommendation systems for media super-personalization.
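The YCbCr-based keyframe step can be illustrated with a minimal sketch: convert each RGB frame to its luma (the Y of YCbCr) and keep frames where the luma changes sharply. The threshold and synthetic "trailer" below are illustrative assumptions, not the paper's actual pipeline:

```python
import numpy as np

def luma(rgb):
    """ITU-R BT.601 luma (the Y of YCbCr) from an (H, W, 3) RGB frame in [0, 1]."""
    return rgb @ np.array([0.299, 0.587, 0.114])

def keyframes(frames, threshold=0.1):
    """Keep frames whose mean luma differs from the last kept frame by more
    than a threshold -- a simple stand-in for shot-change detection."""
    keep = [0]
    for i in range(1, len(frames)):
        if abs(luma(frames[i]).mean() - luma(frames[keep[-1]]).mean()) > threshold:
            keep.append(i)
    return keep

# Synthetic "trailer": two dark frames, then a cut to two bright frames.
dark = np.full((4, 4, 3), 0.1)
bright = np.full((4, 4, 3), 0.9)
frames = [dark, dark, bright, bright]
idx = keyframes(frames)                        # keeps frame 0 and the cut at frame 2
```

The kept keyframes (together with audio spectra) are what the deep learning genre classifier consumes, replacing manually entered metadata.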