• Title/Summary/Keyword: Deep Learning Dataset


Character Detection and Recognition of Steel Materials in Construction Drawings using YOLOv4-based Small Object Detection Techniques (YOLOv4 기반의 소형 물체탐지기법을 이용한 건설도면 내 철강 자재 문자 검출 및 인식기법)

  • Sim, Ji-Woo;Woo, Hee-Jo;Kim, Yoonhwan;Kim, Eung-Tae
    • Journal of Broadcast Engineering / v.27 no.3 / pp.391-401 / 2022
  • As deep learning-based object detection and recognition research has advanced, its range of applications in industry and everyday life has been expanding. However, deep learning-based systems are still far less studied in the construction field. Material quantities in construction are still calculated manually, which takes considerable time and makes accurate accumulation difficult, so transactions based on incorrect volume calculations occur. A fast and accurate automatic drawing recognition system is required to solve this problem. Therefore, we propose an AI-based automatic drawing recognition and accumulation system that detects and recognizes steel materials in construction drawings. To detect steel materials in construction drawings accurately, we propose data augmentation techniques and spatial attention modules that improve small object detection performance based on YOLOv4. The text in each detected steel material area is recognized, and the number of steel materials is totaled based on the predicted characters. Experimental results show that the proposed method increases accuracy and precision by 1.8% and 16%, respectively, compared with the conventional YOLOv4. The proposed method achieved a precision of 0.938, a recall of 1, an AP0.5 of 99.4%, and an AP0.5:0.95 of 67%. Character recognition accuracy reached 99.9% after configuring and training on a suitable dataset containing the fonts used in construction drawings, compared with 75.6% using the existing dataset. The average time required per image was 0.013 seconds for detection, 0.65 seconds for character recognition, and 0.16 seconds for accumulation, resulting in 0.84 seconds.
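
The AP0.5 and AP0.5:0.95 figures in the abstract above depend on an intersection-over-union (IoU) threshold that decides whether a predicted box counts as a correct detection. The following is a minimal sketch of that check, not the authors' evaluation code. It assumes boxes in (x1, y1, x2, y2) corner format; the function names `iou` and `is_true_positive` and all sample boxes are hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2).

    The corner format is an assumption made for this sketch.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap, clamped at zero for disjoint boxes.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred_box, truth_box, threshold=0.5):
    """A detection counts as correct when IoU reaches the threshold.

    threshold=0.5 corresponds to AP0.5; AP0.5:0.95 averages over
    thresholds from 0.5 to 0.95.
    """
    return iou(pred_box, truth_box) >= threshold
```

For small objects such as characters in a drawing, a shift of a few pixels lowers IoU sharply, which is one plausible reason AP0.5:0.95 can sit well below AP0.5.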

Assessment of the Object Detection Ability of Interproximal Caries on Primary Teeth in Periapical Radiographs Using Deep Learning Algorithms (유치의 치근단 방사선 사진에서 딥 러닝 알고리즘을 이용한 모델의 인접면 우식증 객체 탐지 능력의 평가)

  • Hongju Jeon;Seonmi Kim;Namki Choi
    • Journal of the Korean Academy of Pediatric Dentistry / v.50 no.3 / pp.263-276 / 2023
  • The purpose of this study was to evaluate the performance of a model using You Only Look Once (YOLO) for object detection of proximal caries in periapical radiographs of children. A total of 2016 periapical radiographs in primary dentition were selected from the M6 database as a learning material group, of which 1143 were labeled as proximal caries by an experienced dentist using an annotation tool. After converting the annotations into a training dataset, YOLO was trained on the dataset using a single convolutional neural network (CNN) model. Accuracy, recall, specificity, precision, negative predictive value (NPV), F1-score, Precision-Recall curve, and AP (area under curve) were calculated to evaluate the object detection model's performance on the 187 test datasets. The results showed that the CNN-based object detection model performed well in detecting proximal caries, with a diagnostic accuracy of 0.95, a recall of 0.94, a specificity of 0.97, a precision of 0.82, an NPV of 0.96, and an F1-score of 0.81. The AP was 0.83. This model could be a valuable tool for dentists in detecting carious lesions in periapical radiographs.
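
The evaluation metrics listed in the abstract above all derive from the four confusion-matrix counts. The sketch below shows the standard definitions. The function name `detection_metrics` is hypothetical, and the counts in the usage note are made up for illustration and are not taken from the study.

```python
def detection_metrics(tp, fp, tn, fn):
    """Standard binary metrics from confusion-matrix counts.

    tp/fp/tn/fn are true/false positives and negatives.
    """
    def ratio(num, den):
        # Guard against empty denominators (e.g. no positive predictions).
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)  # also called sensitivity
    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "recall": recall,
        "specificity": ratio(tn, tn + fp),
        "precision": precision,
        "npv": ratio(tn, tn + fn),  # negative predictive value
        "f1": ratio(2 * precision * recall, precision + recall),
    }
```

As a usage example with hypothetical counts, `detection_metrics(tp=8, fp=2, tn=85, fn=5)` gives an accuracy of 0.93 and a precision of 0.8. F1 is the harmonic mean of precision and recall, so it always lies between the two.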

Class Classification and Validation of a Musculoskeletal Risk Factor Dataset for Manufacturing Workers (제조업 노동자 근골격계 부담요인 데이터셋 클래스 분류와 유효성 검증)

  • Young-Jin Kang;;;Jeong, Seok Chan
    • The Journal of Bigdata / v.8 no.1 / pp.49-59 / 2023
  • The safety and health standards of the manufacturing industry cover various items, but they can be divided into work-related diseases and musculoskeletal diseases according to the standards for sickness and accident victims. Musculoskeletal diseases occur frequently in manufacturing and can reduce labor productivity and weaken manufacturing competitiveness. In this paper, to detect musculoskeletal risk factors for manufacturing workers, we defined the musculoskeletal load work factor analysis, harmful load working postures, and key point matching, and constructed data for artificial intelligence (AI) learning. To check the effectiveness of the proposed dataset, AI algorithms such as YOLO, Lite-HRNet, and EfficientNet were used for training and verification. In our experiments, the human detection accuracy was 99%, the key point matching accuracy for the detected person was 88% at AP0.5, and the accuracy of working posture evaluation obtained by integrating the inferred matching positions was 72.2% for LEGS, 85.7% for NECK, 81.9% for TRUNK, 79.8% for UPPERARM, and 92.7% for LOWERARM. We also considered the need for deep learning-based research that can prevent musculoskeletal diseases.

Comparative Study of Fish Detection and Classification Performance Using the YOLOv8-Seg Model (YOLOv8-Seg 모델을 이용한 어류 탐지 및 분류 성능 비교연구)

  • Sang-Yeup Jin;Heung-Bae Choi;Myeong-Soo Han;Hyo-tae Lee;Young-Tae Son
    • Journal of the Korean Society of Marine Environment & Safety / v.30 no.2 / pp.147-156 / 2024
  • The sustainable management and enhancement of marine resources are becoming increasingly important issues worldwide. This study was conducted in response to these challenges, focusing on the development and performance comparison of fish detection and classification models as part of a deep learning-based technique for assessing the effectiveness of marine resource enhancement projects initiated by the Korea Fisheries Resources Agency. The aim was to select the optimal model by training various sizes of YOLOv8-Seg models on a fish image dataset and comparing each performance metric. The dataset used for model construction consisted of 36,749 images and label files of 12 different species of fish, with data diversity enhanced through the application of augmentation techniques during training. When training and validating five different YOLOv8-Seg models under identical conditions, the medium-sized YOLOv8m-Seg model showed high learning efficiency and excellent detection and classification performance, with the shortest training time of 13 h and 12 min, an of 0.933, and an inference speed of 9.6 ms. Considering the balance between each performance metric, this was deemed the most efficient model for meeting real-time processing requirements. The use of such real-time fish detection and classification models could enable effective surveys of marine resource enhancement projects, suggesting the need for ongoing performance improvements and further research.

Change Detection for High-resolution Satellite Images Using Transfer Learning and Deep Learning Network (전이학습과 딥러닝 네트워크를 활용한 고해상도 위성영상의 변화탐지)

  • Song, Ah Ram;Choi, Jae Wan;Kim, Yong Il
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.37 no.3 / pp.199-208 / 2019
  • As the number of available satellites increases and technology advances, image information outputs are becoming increasingly diverse and a large amount of data is accumulating. In this study, we propose a change detection method for high-resolution satellite images that uses transfer learning and a deep learning network to overcome the limit caused by insufficient training data via the use of pre-trained information. The deep learning network used in this study comprises convolutional layers to extract the spatial and spectral information and convolutional long short-term memory layers to analyze the time series information. To use the learned information, the two initial convolutional layers of the change detection network are designed to use learned values from 40,000 patches of the ISPRS (International Society for Photogrammetry and Remote Sensing) dataset as initial values. In addition, 2D (2-dimensional) and 3D (3-dimensional) kernels were used to find the optimized structure for the high-resolution satellite images. The experimental results for the KOMPSAT-3A (KOrean Multi-Purpose SATellite-3A) satellite images show that this change detection method can effectively extract changed/unchanged pixels but is less sensitive to changes due to shadow and relief displacements. In addition, the change detection accuracy of two sites was improved by using 3D kernels. This is because a 3D kernel can consider not only the spatial information but also the spectral information. This study indicates that we can effectively detect changes in high-resolution satellite images using the constructed image information and deep learning network. In future work, a pre-trained change detection network will be applied to newly obtained images to extend the scope of the application.

Deep Learning Based Prediction Method of Long-term Photovoltaic Power Generation Using Meteorological and Seasonal Information (기후 및 계절정보를 이용한 딥러닝 기반의 장기간 태양광 발전량 예측 기법)

  • Lee, Donghun;Kim, Kwanho
    • The Journal of Society for e-Business Studies / v.24 no.1 / pp.1-16 / 2019
  • Recently, the importance of photovoltaic (PV) power prediction has been increasing rapidly as a response to meteorological changes driven by increasing greenhouse gases and electricity demand. In particular, predicting PV power generation may help determine a reasonable price of electricity and address problems such as system stability and the balance of electricity production. However, because of dynamic changes in meteorological values such as solar radiation, cloudiness, and temperature, as well as seasonal changes, accurate long-term PV power prediction is significantly challenging. Therefore, in this paper, we propose a deep learning-based PV power prediction model that improves prediction performance by learning from meteorological and seasonal information. We evaluate the proposed model against a seasonal ARIMA (S-ARIMA) model, which is one of the typical time series methods, and an ANN model with one hidden layer. In experiments using a real-world dataset, the proposed model shows the best performance, which indicates that it has a positive impact on PV power forecast performance.

A Study on Improving Facial Recognition Performance to Introduce a New Dog Registration Method (새로운 반려견 등록방식 도입을 위한 안면 인식 성능 개선 연구)

  • Lee, Dongsu;Park, Gooman
    • Journal of Broadcast Engineering / v.27 no.5 / pp.794-807 / 2022
  • Although dog registration is mandatory under the revised Animal Protection Act, the registration rate is low because the current registration method is inconvenient. In this paper, we studied how to improve the performance of dog face recognition technology, which is being reviewed as a new registration method. Using deep learning, we created embedding vectors for dog facial recognition and experimented with a method for identifying individual dogs. We built a dog image dataset for training and experimented with InceptionNet and ResNet-50 as backbone networks. The models were trained with the triplet loss method, and the experiments were divided into face verification and face recognition. The ResNet-50-based model obtained the best face verification performance of 93.46%, and in the face recognition test it obtained the highest performance of 91.44% at rank-5. The experimental methods and results presented in this paper can be used in various fields, such as checking whether a dog is registered and checking individual dogs at dog access facilities.
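
The triplet loss mentioned in the abstract above pulls an anchor embedding toward an embedding of the same individual and pushes it away from an embedding of a different individual. The following is a minimal sketch of the general technique, not the authors' training code. It assumes plain Euclidean distance and a margin of 0.2; both choices, the function names, and the sample vectors are hypothetical.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss for a single (anchor, positive, negative).

    The loss is zero once the negative is farther from the anchor than
    the positive by at least `margin` (an assumed value here).
    """
    d_pos = euclidean(anchor, positive)  # same individual
    d_neg = euclidean(anchor, negative)  # different individual
    return max(0.0, d_pos - d_neg + margin)
```

With the trained embeddings, verification then reduces to thresholding the distance between two face embeddings, and rank-5 recognition to checking whether the correct identity is among the five nearest gallery embeddings.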

Multi-scale face detector using anchor free method

  • Lee, Dong-Ryeol;Kim, Yoon
    • Journal of the Korea Society of Computer and Information / v.25 no.7 / pp.47-55 / 2020
  • In this paper, we propose a one-stage multi-scale face detector based on a fully convolutional network using an anchor-free method. Recently, almost all state-of-the-art face detectors predict face locations with anchor-based methods that rely on pre-defined anchor boxes. However, these face detectors require hyper-parameters and additional computation in training. The key idea of the proposed method is to eliminate the hyper-parameters and additional computation by using an anchor-free method. To do this, we apply two ideas. First, by eliminating the pre-defined set of anchor boxes, we avoid the additional computation and hyper-parameters related to anchor boxes. Second, our detector predicts face locations using multiple feature maps to reduce the foreground/background imbalance issue. The performance of the proposed method is evaluated and analyzed through quantitative evaluation. Experimental results on the FDDB dataset demonstrate the effectiveness of the proposed method.

Prediction of Student's Interest on Sports for Classification using Bi-Directional Long Short Term Memory Model

  • Ahamed, A. Basheer;Surputheen, M. Mohamed
    • International Journal of Computer Science & Network Security / v.22 no.10 / pp.246-256 / 2022
  • Recently, parents and teachers have considered physical education a minor subject for students in elementary and secondary schools. Physical education performance has become increasingly significant as parents and schools pay more attention to physical schooling. The sports mining with distribution analysis model considers different factors, including the games, comments, conversations, and connections made on numerous sports interests. Using different machine learning/deep learning approaches, children's athletic and academic interests can be tracked over the course of their academic lives. A number of studies have focused on predicting the success of students in higher education. Sports interest prediction research at the secondary level is uncommon, but the secondary level is often used as a benchmark to describe students' educational development at higher levels. An Automated Student Interest Prediction on Sports Mining using DL Based Bi-directional Long Short-Term Memory model (BiLSTM) is presented in this article. Pre-processing of data, interest classification, and parameter tweaking are the essential operations of the proposed model. First, data augmentation is used to expand the dataset's size. Second, a BiLSTM model is used to predict and classify user interests. The Adagrad optimizer is employed for hyperparameter optimization. To test the model's performance, a dataset is used and the results are analysed using precision, recall, accuracy, and F-measure. The proposed model achieved 95% accuracy at 400 instances, where the existing techniques achieved 93.20% accuracy for the same. The proposed model achieved 95% accuracy and precision for the 60%-40% data, where the existing models achieved 93% accuracy and precision.
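
The abstract above names the Adagrad optimizer without giving details. The sketch below shows the standard Adagrad update rule, in which each parameter's step is scaled down by the accumulated squares of its past gradients. It does not reproduce the authors' configuration: the function name `adagrad_step`, the learning rate, and the epsilon value are assumptions made for illustration.

```python
import math


def adagrad_step(params, grads, accum, lr=0.1, eps=1e-8):
    """One Adagrad update over parallel lists of parameters and gradients.

    accum holds the running sum of squared gradients per parameter.
    Returns the updated (params, accum) as new lists.
    """
    new_params, new_accum = [], []
    for p, g, a in zip(params, grads, accum):
        a = a + g * g  # accumulate squared gradient
        new_accum.append(a)
        # Parameters with large past gradients take smaller steps.
        new_params.append(p - lr * g / (math.sqrt(a) + eps))
    return new_params, new_accum


# Usage: minimise f(x) = x**2 from x = 1.0 (gradient is 2 * x).
x, acc = [1.0], [0.0]
for _ in range(50):
    x, acc = adagrad_step(x, [2 * x[0]], acc)
```

Because the accumulator only grows, the effective learning rate decays over time, which suits sparse features but can slow late-stage training.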

Dog-Species Classification through CycleGAN and Standard Data Augmentation

  • Chan, Park;Nammee, Moon
    • Journal of Information Processing Systems / v.19 no.1 / pp.67-79 / 2023
  • In the image field, data augmentation refers to increasing the amount of data through an editing method such as rotating or cropping a photo. In this study, a generative adversarial network (GAN) image was created using CycleGAN, and various colors of dogs were reflected through data augmentation. In particular, dog data from the Stanford Dogs Dataset and Oxford-IIIT Pet Dataset were used, and 10 breeds of dog, corresponding to 300 images each, were selected. Subsequently, a GAN image was generated using CycleGAN, and four learning groups were established: 2,000 original photos (group I); 2,000 original photos + 1,000 GAN images (group II); 3,000 original photos (group III); and 3,000 original photos + 1,000 GAN images (group IV). The amount of data in each learning group was augmented using existing data augmentation methods such as rotating, cropping, erasing, and distorting. The augmented photo data were used to train the MobileNet_v3_Large, ResNet-152, InceptionResNet_v2, and NASNet_Large frameworks to evaluate the classification accuracy and loss. The top-3 accuracy for each deep neural network model was as follows: MobileNet_v3_Large of 86.4% (group I), 85.4% (group II), 90.4% (group III), and 89.2% (group IV); ResNet-152 of 82.4% (group I), 83.7% (group II), 84.7% (group III), and 84.9% (group IV); InceptionResNet_v2 of 90.7% (group I), 88.4% (group II), 93.3% (group III), and 93.1% (group IV); and NASNet_Large of 85% (group I), 88.1% (group II), 91.8% (group III), and 92% (group IV). The InceptionResNet_v2 model exhibited the highest image classification accuracy, and the NASNet_Large model exhibited the highest increase in the accuracy owing to data augmentation.
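
The top-3 accuracy reported in the abstract above counts a prediction as correct when the true breed is among the three highest-scoring classes. The following is a minimal sketch of that metric. The function name `top_k_accuracy` and the example scores are hypothetical and are not data from the study.

```python
def top_k_accuracy(scores, labels, k=3):
    """Fraction of samples whose true label is among the k top-scoring classes.

    scores: list of per-class score lists, one per sample.
    labels: list of true class indices, one per sample.
    """
    if not labels:
        return 0.0
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices ordered from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda i: row[i], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)
```

Top-3 accuracy is never lower than top-1 accuracy on the same predictions, which is worth keeping in mind when comparing these figures with top-1 results reported elsewhere.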