• Title/Summary/Keyword: RetinaNet model

Steel Surface Defect Detection using the RetinaNet Detection Model

  • Sharma, Mansi;Lim, Jong-Tae;Chae, Yi-Geun
    • International Journal of Internet, Broadcasting and Communication / v.14 no.2 / pp.136-146 / 2022
  • Surface defects degrade the quality of steel materials. To help limit these defects, we adopt RetinaNet, a one-stage detector, from among the many deep learning detection algorithms. RetinaNet supports several backbones; we evaluated two, ResNet50 and VGG19. To validate our model, we compared it against several traditional models, one-stage models such as YOLO and SSD, and two-stage models such as Faster R-CNN, EDDN, and Xception, with simulations on the individual steel defect classes. We also compared the time cost of one-stage and two-stage models. The comparative analysis shows that the proposed model achieves excellent results on the Northeastern University surface defect detection dataset. In future work, we would like to try different backbones to check the model's efficiency in real-world settings, enlarge the dataset through augmentation, and address the limitations of our approach.

Comparison of Pre-processed Brain Tumor MR Images Using Deep Learning Detection Algorithms

  • Kwon, Hee Jae;Lee, Gi Pyo;Kim, Young Jae;Kim, Kwang Gi
    • Journal of Multimedia Information System / v.8 no.2 / pp.79-84 / 2021
  • Detecting brain tumors of different sizes is a challenging task. This study aimed to identify brain tumors using detection algorithms. Most studies in this area use segmentation; however, we utilized detection owing to its advantages. The data comprised 11,200 MR images obtained from 64 patients. The deep learning model used was RetinaNet with a ResNet152 backbone. The model was trained on three types of pre-processed images: normal, general histogram equalization, and contrast-limited adaptive histogram equalization (CLAHE). The three types were compared to determine which pre-processing technique gives the best performance in the deep learning algorithm. During pre-processing, we converted the MR images from DICOM to JPG format and adjusted the window level and width. CLAHE showed the best performance, with a sensitivity of 81.79%. The RetinaNet model demonstrated satisfactory performance in finding brain tumor lesions. In the future, we plan to develop a new model that improves detection performance using well-processed data. This study lays the groundwork for future detection technologies that can help doctors find lesions more easily in clinical tasks.
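
The pre-processing steps named in the abstract above (window level/width adjustment followed by histogram equalization) can be illustrated with a minimal pure-Python sketch. It is not the authors' pipeline: the function names, the 8-bit output range, and the list-of-rows image layout are assumptions made for illustration, and only global histogram equalization is shown. CLAHE additionally equalizes per tile with a clip limit and is usually taken from a library such as OpenCV (`cv2.createCLAHE`).

```python
def apply_window(pixels, level, width):
    """Map raw intensities to 0..255 using a window centre (level) and width.

    `pixels` is a list of rows; values outside the window are clipped.
    """
    low = level - width / 2.0
    high = level + width / 2.0
    out = []
    for row in pixels:
        new_row = []
        for value in row:
            clipped = min(max(value, low), high)
            new_row.append(int(round((clipped - low) / (high - low) * 255)))
        out.append(new_row)
    return out


def equalize_histogram(image):
    """Global histogram equalization of an 8-bit grayscale image (list of rows)."""
    hist = [0] * 256
    for row in image:
        for value in row:
            hist[value] += 1

    # Cumulative distribution of intensities.
    cdf = []
    running = 0
    for count in hist:
        running += count
        cdf.append(running)

    total = cdf[-1]
    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:
        # Constant image: nothing to stretch.
        return [row[:] for row in image]

    # Lookup table that spreads the occupied intensities over 0..255.
    lut = [max(0, int(round((c - cdf_min) / (total - cdf_min) * 255))) for c in cdf]
    return [[lut[value] for value in row] for row in image]
```

A typical use would window the raw slice first, for example `equalize_histogram(apply_window(raw_slice, level, width))`, where `raw_slice`, `level`, and `width` are placeholders for values the abstract does not specify.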

Design of a deep learning model to determine fire occurrence in distribution switchboard using thermal imaging data (열화상 영상 데이터 기반 배전반 화재 발생 판별을 위한 딥러닝 모델 설계)

  • Dongjoon Park;Minyoung Kim
    • The Journal of the Convergence on Culture Technology / v.9 no.5 / pp.737-745 / 2023
  • This paper presents a study on developing an artificial intelligence model that detects fires in distribution switchboards using thermal images. The objective of the research is to preprocess collected thermal images into data suitable for object detection models and to design a model capable of determining whether a fire has occurred within a distribution panel. The study uses thermal image data from AI-HUB's industrial complex for training. Two CNN-based deep learning object detection algorithms, Faster R-CNN and RetinaNet, are used to build the models. The paper compares and analyzes the two models and proposes the better one for the task.

Face Detection Method based Fusion RetinaNet using RGB-D Image (RGB-D 영상을 이용한 Fusion RetinaNet 기반 얼굴 검출 방법)

  • Nam, Eun-Jeong;Nam, Chung-Hyeon;Jang, Kyung-Sik
    • Journal of the Korea Institute of Information and Communication Engineering / v.26 no.4 / pp.519-525 / 2022
  • Face detection, the task of locating a person's face in an image, is used as a pre-processing step or core process in various image processing applications. Neural network models, which have recently performed well thanks to advances in deep learning, depend on 2D images, so the face may not be detected properly when the image is degraded, for example by poor camera quality or poor focus on the face. In this paper, we propose a face detection method that also uses depth information to reduce this dependence on 2D images. The proposed model was trained after depth information had been generated and preprocessed in advance from a face detection dataset. As a result, the proposed Fusion RetinaNet (FRN) model scored 89.16%, about 1.2 percentage points higher than the RetinaNet model, which scored 87.95%.

A Study on Object Detection using Restructured RetinaNet (재구조화된 RetinaNet을 활용한 객체 탐지에 관한 연구)

  • Kim, Jun Yeong;Jung, Se Hoon;Sim, Chun Bo
    • Journal of Korea Multimedia Society / v.23 no.12 / pp.1531-1539 / 2020
  • Screening carry-on baggage before passengers board an airplane is important because it prevents many risks. In addition to dangerous items, leaks of personal and confidential information through data storage devices are occurring at airports. Airport screening therefore needs a system that finds not only dangerous items but also devices that can leak data. In this paper, we propose a model that detects data storage devices by improving an existing model. A comparative evaluation against existing algorithms confirmed that the proposed model scores 74 on the training data and 46.73 on the test data, outperforming the existing model.

Object detection in financial reporting documents for subsequent recognition

  • Sokerin, Petr;Volkova, Alla;Kushnarev, Kirill
    • International journal of advanced smart convergence / v.10 no.1 / pp.1-11 / 2021
  • Document page segmentation is an important step in building a quality optical character recognition module. The study reviewed existing work on page segmentation and focused on developing a segmentation model that is more useful in an organizational setting and offers broad capabilities for managing model quality. The main problems of document segmentation were highlighted, including complex backgrounds with intersecting objects. The detection classes included not only the classic text, table, and figure but also additional types such as signature, logo, and table without borders (or with partially missing borders). This made it possible to pose the non-trivial task of detecting non-standard document elements. The authors compared existing neural network architectures for object detection based on published research data, and RetinaNet proved the most suitable. To make quality control of the model possible, a method based on neural network modeling with the RetinaNet architecture is proposed. During the study, several models were built, and their quality was assessed on the test sample using the mean Average Precision metric. The best result was shown by a model composed of four neural networks: the first detects tables and tables without borders, the second seals and signatures, the third pictures and logos, and the fourth text. The analysis showed that this four-network approach gave the best results on the test sample for most detection classes, in line with the objectives of the study. The method proposed in the article can be used to recognize other objects. A promising direction for continuing the analysis is the segmentation of tables, where functionally distinct areas of the table act as classes: heading, cell with a name, cell with data, and empty cell.

Comparison and Verification of Deep Learning Models for Automatic Recognition of Pills (알약 자동 인식을 위한 딥러닝 모델간 비교 및 검증)

  • Yi, GyeongYun;Kim, YoungJae;Kim, SeongTae;Kim, HyoEun;Kim, KwangGi
    • Journal of Korea Multimedia Society / v.22 no.3 / pp.349-356 / 2019
  • When a prescription changes in the hospital because of a patient's improvement status, pharmacists manually sort the returned pills that the patient did not take. There are hundreds of kinds of pills to classify. Because the work is manual, mistakes can occur and lead to medical accidents. In this study, we compared YOLO, Faster R-CNN, and RetinaNet for classifying and detecting pills. The data consisted of 10 classes with 100 images per class. To evaluate the performance of each model, we used cross-validation. The YOLO model had a sensitivity of 91.05% and 0.0507 FPs/image. Faster R-CNN had a sensitivity of 99.6% and 0.0089 FPs/image. RetinaNet had a sensitivity of 98.31% and 0.0119 FPs/image. Faster R-CNN therefore performed best among the three models tested and is the most appropriate for classifying pills, with the most accurate detection and classification results and the lowest FPs/image.
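
The two metrics used in the abstract above, sensitivity and false positives per image (FPs/image), can be sketched as follows. The helper names are illustrative assumptions; the `REPORTED` figures are the ones quoted in the abstract, and the raw counts behind them are not given in the source.

```python
def sensitivity(true_positives, false_negatives):
    """Fraction of real objects that the detector found (also called recall)."""
    return true_positives / (true_positives + false_negatives)


def false_positives_per_image(false_positives, num_images):
    """Average number of spurious detections per evaluated image."""
    return false_positives / num_images


# Figures reported in the abstract: (sensitivity in %, FPs/image).
REPORTED = {
    "YOLO": (91.05, 0.0507),
    "Faster R-CNN": (99.6, 0.0089),
    "RetinaNet": (98.31, 0.0119),
}

# Rank the models on each metric separately.
most_sensitive = max(REPORTED, key=lambda name: REPORTED[name][0])
fewest_false_positives = min(REPORTED, key=lambda name: REPORTED[name][1])
```

Both rankings select Faster R-CNN, which matches the conclusion stated in the abstract.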

Quantitative Evaluations of Deep Learning Models for Rapid Building Damage Detection in Disaster Areas (재난지역에서의 신속한 건물 피해 정도 감지를 위한 딥러닝 모델의 정량 평가)

  • Ser, Junho;Yang, Byungyun
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.40 no.5 / pp.381-391 / 2022
  • This paper aims to identify, among prevailing deep learning models, one that helps rapidly detect damaged buildings where disasters occur. The models selected are SSD-512, RetinaNet, and YOLOv3, which have been widely used in object detection in recent years. All three are one-stage detector networks, which are suitable for rapid object detection. They are often used for object detection because of their structural advantages and high speed, but not for detecting damaged buildings in disaster management. In this study, we first trained each algorithm on the xBD dataset, which provides post-disaster imagery with damage classification labels. Next, the three models were quantitatively evaluated using mAP (mean Average Precision) and FPS (Frames Per Second). YOLOv3 recorded an mAP of 34.39% and reached 46 FPS. RetinaNet recorded an mAP of 36.06%, 1.67 percentage points higher than YOLOv3, but its FPS was one-third that of YOLOv3. SSD-512 scored significantly lower than YOLOv3 on both indicators. In a disaster situation, a rapid and precise investigation of damaged buildings is essential for effective disaster response. Accordingly, the results of this study are expected to be useful for rapid response in disaster management.

Fundamental Function Design of Real-Time Unmanned Monitoring System Applying YOLOv5s on NVIDIA TX2™ AI Edge Computing Platform

  • LEE, SI HYUN
    • International journal of advanced smart convergence / v.11 no.2 / pp.22-29 / 2022
  • In this paper, to design a real-time unmanned monitoring system, the YOLOv5s (small) object detection model was applied on the NVIDIA TX2™ AI (Artificial Intelligence) edge computing platform to provide the system's fundamental function of detecting objects in real time. YOLOv5s was chosen for our real-time unmanned monitoring system based on a performance evaluation of object detection algorithms (for example, R-CNN, SSD, RetinaNet, and YOLOv5). In addition, the performance of the four YOLOv5 models (small, medium, large, and xlarge) was compared and evaluated. Based on these results, the YOLOv5s model, which suits the design purpose of this paper, was ported to the NVIDIA TX2™ AI edge computing system, and it was confirmed to operate normally. The resulting real-time unmanned monitoring system can be applied to various fields such as security or monitoring systems. Future research will apply a modified NMS (Non-Maximum Suppression), model reconstruction, and parallel programming techniques using CUDA (Compute Unified Device Architecture) to improve object detection speed and performance.
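
The abstract above names NMS (Non-Maximum Suppression) as a target for future modification without describing the change. For orientation only, the sketch below shows the standard greedy form of NMS in pure Python; it is not the authors' modified version, and the `(x1, y1, x2, y2)` box layout, function names, and default threshold are assumptions.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    intersection = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


def greedy_nms(boxes, scores, iou_threshold=0.5):
    """Return indices of boxes kept by standard greedy NMS.

    Boxes are visited from highest to lowest score; a box is kept only if it
    does not overlap any already kept box by more than `iou_threshold`.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    kept = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in kept):
            kept.append(i)
    return kept
```

The pairwise overlap checks make this quadratic in the number of boxes in the worst case, which is one reason NMS is a common candidate for parallelization on GPU platforms.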

Automatically Diagnosing Skull Fractures Using an Object Detection Method and Deep Learning Algorithm in Plain Radiography Images

  • Tae Seok, Jeong;Gi Taek, Yee;Kwang Gi, Kim;Young Jae, Kim;Sang Gu, Lee;Woo Kyung, Kim
    • Journal of Korean Neurosurgical Society / v.66 no.1 / pp.53-62 / 2023
  • Objective: Deep learning is a machine learning approach based on artificial neural network training, and object detection using deep learning is among the most powerful tools in image analysis. We analyzed and evaluated the diagnostic performance of a deep learning algorithm for identifying skull fractures in plain radiographic images and investigated its clinical applicability. Methods: A total of 2026 plain radiographic images of the skull (fracture, 991; normal, 1035) were obtained from 741 patients. The RetinaNet architecture was used as the deep learning model. Precision, recall, and average precision were measured to evaluate the algorithm's diagnostic performance. Results: With ResNet-152, the average precision at intersection over union (IOU) thresholds of 0.1, 0.3, and 0.5 was 0.7240, 0.6698, and 0.3687, respectively. When the IOU and confidence thresholds were both 0.1, the precision was 0.7292 and the recall was 0.7650. When the IOU threshold was 0.1 and the confidence threshold was 0.6, the true and false rates were 82.9% and 17.1%, respectively. There were significant differences in the true/false and false-positive/false-negative ratios between the anterior-posterior, Towne, and both lateral views (p=0.032 and p=0.003). Objects detected as false positives had vascular grooves and suture lines. Among false negatives, detection performance was poor for diastatic fractures, fractures crossing the suture line, and fractures around the vascular grooves and orbit. Conclusion: The object detection algorithm applied with deep learning is expected to be a valuable tool in diagnosing skull fractures.
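
The results above depend on two thresholds: an IOU threshold that decides whether a detection counts as a hit, and a confidence threshold that decides which detections are considered at all. The following pure-Python sketch shows how precision and recall can be computed under both. It is not the study's evaluation code; the box layout `(x1, y1, x2, y2)`, the function names, and the one-to-one greedy matching rule are assumptions.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    intersection = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


def precision_recall(detections, ground_truths, iou_threshold, confidence_threshold):
    """Precision and recall for one image.

    `detections` is a list of (box, score); `ground_truths` is a list of boxes.
    Each ground-truth box can be matched by at most one detection.
    """
    kept = sorted(
        (d for d in detections if d[1] >= confidence_threshold),
        key=lambda d: d[1],
        reverse=True,
    )
    matched = set()
    true_positives = 0
    for box, _score in kept:
        best_iou, best_idx = 0.0, None
        for idx, gt_box in enumerate(ground_truths):
            if idx in matched:
                continue
            overlap = iou(box, gt_box)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        if best_idx is not None and best_iou >= iou_threshold:
            matched.add(best_idx)
            true_positives += 1
    false_positives = len(kept) - true_positives
    false_negatives = len(ground_truths) - true_positives
    precision = true_positives / len(kept) if kept else 0.0
    recall = true_positives / len(ground_truths) if ground_truths else 0.0
    return precision, recall
```

Lowering the IOU threshold lets loosely localized detections count as hits, which is consistent with the average precision in the abstract rising from 0.3687 at IOU 0.5 to 0.7240 at IOU 0.1.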