• Title/Summary/Keyword: learning object

Performance Evaluation of YOLOv5s for Brain Hemorrhage Detection Using Computed Tomography Images (전산화단층영상 기반 뇌출혈 검출을 위한 YOLOv5s 성능 평가)

  • Kim, Sungmin;Lee, Seungwan
    • Journal of the Korean Society of Radiology
    • /
    • v.16 no.1
    • /
    • pp.25-34
    • /
    • 2022
  • Brain computed tomography (CT) is useful for diagnosing brain lesions such as brain hemorrhage because it is non-invasive, provides 3-dimensional images, and involves a low radiation dose. However, there have been numerous misdiagnoses owing to a shortage of radiologists and heavy workloads. Recently, object detection technologies based on artificial intelligence have been developed to overcome the limitations of traditional diagnosis. In this study, the applicability of a deep learning-based YOLOv5s model was evaluated for brain hemorrhage detection using brain CT images, and the effect of hyperparameters on the trained YOLOv5s model was analyzed. The YOLOv5s model consists of backbone, neck, and output modules. The trained model was able to detect a region of brain hemorrhage and provide information about that region. The YOLOv5s model was trained with various activation functions, optimizer functions, loss functions, and numbers of epochs, and the performance of the trained model was evaluated in terms of brain hemorrhage detection accuracy and training time. The results showed that the trained YOLOv5s model provides a bounding box for a region of brain hemorrhage and the accuracy of the corresponding box. The performance of the YOLOv5s model was improved by using the Mish activation function, the stochastic gradient descent (SGD) optimizer, and the complete intersection over union (CIoU) loss function. Also, the accuracy and training time of the YOLOv5s model increased with the number of epochs. Therefore, the YOLOv5s model is suitable for brain hemorrhage detection using brain CT images, and its performance can be maximized by using appropriate hyperparameters.
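
For reference, the hyperparameters highlighted above (Mish activation, CIoU box loss, SGD optimizer) can be written down directly. The following is a minimal PyTorch sketch under our own assumptions, not the authors' code; the SGD optimizer itself is the standard torch.optim.SGD.

```python
# Minimal sketch (not the authors' code): the Mish activation and a CIoU-style
# box loss as discussed for configuring YOLOv5s. Names and eps are illustrative.
import math
import torch

def mish(x: torch.Tensor) -> torch.Tensor:
    # Mish(x) = x * tanh(softplus(x))
    return x * torch.tanh(torch.nn.functional.softplus(x))

def ciou_loss(box1: torch.Tensor, box2: torch.Tensor, eps: float = 1e-7) -> torch.Tensor:
    # Boxes given as (x1, y1, x2, y2); returns 1 - CIoU as a loss term.
    x1 = torch.max(box1[..., 0], box2[..., 0])
    y1 = torch.max(box1[..., 1], box2[..., 1])
    x2 = torch.min(box1[..., 2], box2[..., 2])
    y2 = torch.min(box1[..., 3], box2[..., 3])
    inter = (x2 - x1).clamp(0) * (y2 - y1).clamp(0)

    w1, h1 = box1[..., 2] - box1[..., 0], box1[..., 3] - box1[..., 1]
    w2, h2 = box2[..., 2] - box2[..., 0], box2[..., 3] - box2[..., 1]
    union = w1 * h1 + w2 * h2 - inter + eps
    iou = inter / union

    # Diagonal of the smallest enclosing box and squared center distance (DIoU term).
    cw = torch.max(box1[..., 2], box2[..., 2]) - torch.min(box1[..., 0], box2[..., 0])
    ch = torch.max(box1[..., 3], box2[..., 3]) - torch.min(box1[..., 1], box2[..., 1])
    c2 = cw ** 2 + ch ** 2 + eps
    rho2 = ((box1[..., 0] + box1[..., 2] - box2[..., 0] - box2[..., 2]) ** 2 +
            (box1[..., 1] + box1[..., 3] - box2[..., 1] - box2[..., 3]) ** 2) / 4

    # Aspect-ratio consistency term.
    v = (4 / math.pi ** 2) * (torch.atan(w2 / (h2 + eps)) - torch.atan(w1 / (h1 + eps))) ** 2
    alpha = v / (1 - iou + v + eps)
    return 1 - (iou - rho2 / c2 - alpha * v)

# Example: ciou_loss(torch.tensor([[0., 0., 10., 10.]]), torch.tensor([[2., 2., 12., 12.]]))
```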

Design of an Intellectual Smart Mirror Application Helping Face Makeup (얼굴 메이크업을 도와주는 지능형 스마트 거울 앱의 설계)

  • Oh, Sun Jin;Lee, Yoon Suk
    • The Journal of the Convergence on Culture Technology
    • /
    • v.8 no.5
    • /
    • pp.497-502
    • /
    • 2022
  • The young generation has recently shown a distinct preference for video over text as a means of distributing and sharing information, and it has become natural to distribute information through YouTube or one-person Internet broadcasting; that is, young people usually obtain their information through these channels. Many of them are also bolder and more aggressive about decorating themselves in unique ways, freely expressing personal characteristics through daring face makeup, hair styling, and fashion coordination regardless of sex. In particular, face makeup has become a major concern not only for females but also for males, and it is a major means of expressing personality. In this study, to meet these demands, we design and implement an intelligent smart mirror application that efficiently retrieves and recommends related videos from YouTube or one-person broadcasts produced by famous professional makeup artists, so that users can create face makeup suited to their face shape, hair color and style, skin tone, and fashion color and style, and expressive of their own characteristics. We also introduce AI techniques that learn the user's search patterns and facial features to provide an optimal recommendation, and finally provide detailed makeup face images so that users can learn makeup skills stage by stage.

Evaluation of Human Demonstration Augmented Deep Reinforcement Learning Policy Optimization Methods Using Object Manipulation with an Anthropomorphic Robot Hand (휴먼형 로봇 손의 사물 조작 수행을 이용한 인간 행동 복제 강화학습 정책 최적화 방법 성능 평가)

  • Park, Na Hyeon;Oh, Ji Heon;Ryu, Ga Hyun;Anazco, Edwin Valarezo;Lopez, Patricio Rivera;Won, Da Seul;Jeong, Jin Gyun;Chang, Yun Jung;Kim, Tae-Seong
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2020.11a
    • /
    • pp.858-861
    • /
    • 2020
  • For a robot to perform diverse and complex object manipulation like a human, grasping with an anthropomorphic robot hand is essential. To train an anthropomorphic robot hand with a high degree of freedom (DoF), reinforcement learning optimization methods combined with human demonstrations have been proposed. In this study, we verify the effectiveness of behavior cloning by comparing the performance of Demonstration Augmented Natural Policy Gradient (DA-NPG), which combines human demonstrations with a reinforcement learning optimization method, against NPG, and we evaluate the optimization methods DA-NPG, DA-Trust Region Policy Optimization (DA-TRPO), and DA-Proximal Policy Optimization (DA-PPO) on object manipulation tasks performed by an anthropomorphic robot hand on six objects. The comparison between DA-NPG and NPG demonstrates that behavior cloning is effective for reinforcement learning of object manipulation with an anthropomorphic robot hand. In addition, DA-NPG showed performance similar to DA-TRPO and was the most stable, succeeding in grasping all objects, whereas DA-TRPO and DA-PPO showed unstable performance, failing to manipulate some objects. The method proposed in this study is expected to be useful for developing object-manipulation intelligence for anthropomorphic robot hands when applied to real humanoid robots in the future.
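
The general idea of demonstration-augmented policy optimization can be illustrated with a toy sketch: a policy-gradient surrogate loss augmented with a behavior-cloning term on human demonstrations. The Gaussian policy, dimensions, and bc_weight below are our own assumptions; the natural-gradient step of NPG (and the TRPO/PPO machinery) is omitted.

```python
# Toy sketch (assumption, not the paper's DA-NPG implementation): augment a
# policy-gradient surrogate with a behavior-cloning term on demonstrations.
import torch
import torch.nn as nn

class GaussianPolicy(nn.Module):
    def __init__(self, obs_dim: int, act_dim: int):
        super().__init__()
        self.mean = nn.Sequential(nn.Linear(obs_dim, 64), nn.Tanh(), nn.Linear(64, act_dim))
        self.log_std = nn.Parameter(torch.zeros(act_dim))

    def dist(self, obs: torch.Tensor) -> torch.distributions.Normal:
        return torch.distributions.Normal(self.mean(obs), self.log_std.exp())

def augmented_loss(policy, obs, act, adv, demo_obs, demo_act, bc_weight=0.1):
    # Vanilla policy-gradient surrogate: -E[log pi(a|s) * advantage]
    pg = -(policy.dist(obs).log_prob(act).sum(-1) * adv).mean()
    # Behavior cloning on demonstrations: -E[log pi(a_demo|s_demo)]
    bc = -policy.dist(demo_obs).log_prob(demo_act).sum(-1).mean()
    return pg + bc_weight * bc

# Usage with random stand-in tensors (obs_dim=24, act_dim=20 as an example hand).
policy = GaussianPolicy(24, 20)
obs, act, adv = torch.randn(32, 24), torch.randn(32, 20), torch.randn(32)
demo_obs, demo_act = torch.randn(16, 24), torch.randn(16, 20)
loss = augmented_loss(policy, obs, act, adv, demo_obs, demo_act)
loss.backward()
```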

Development of Agricultural Products Screening System through X-ray Density Analysis

  • Eunhyeok Baek;Young-Tae Kwak
    • Journal of the Korea Society of Computer and Information
    • /
    • v.28 no.4
    • /
    • pp.105-112
    • /
    • 2023
  • In this paper, we propose a new method for displaying colored defects by measuring relative density from the wide-area and local densities of an X-ray image. The relative density of a pixel represents its relative difference from the surrounding pixels, and we also propose a colorization of X-ray images that marks these pixels as normal or defective. Traditional methods mainly inspect materials such as plastics and metals, which show large differences in transmittance relative to the object. Our proposed method can be used to detect defects such as sprouts or holes in images obtained by an X-ray inspection machine. In the experiments, defects that could not be seen with the naked eye, such as pests or sprouts, were colored in a specific color so that the results could be used in an agricultural product screening system. Products that are uniformly filled with a single ingredient, such as potatoes, carrots, and apples, can be inspected effectively; however, the method does not work well with bumpy products such as peppers and paprika. The advantage of this method is that, unlike machine learning, it does not require large amounts of data. The proposed method could be applied to an X-ray screening system and used not only in agricultural product screening but also in manufacturing processes such as processed food and parts manufacturing, where it can be actively used to select defective products.
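
The relative-density idea can be illustrated with a short sketch; the window sizes, normalization, and threshold below are our own assumptions and may differ from the paper's exact formulation.

```python
# Minimal sketch of the relative-density idea: each pixel is compared with its
# local and wide-area neighborhood means, and large deviations are colored.
import cv2
import numpy as np

def relative_density_map(xray: np.ndarray, local_win=15, wide_win=101) -> np.ndarray:
    img = xray.astype(np.float32)
    local_mean = cv2.blur(img, (local_win, local_win))  # local density
    wide_mean = cv2.blur(img, (wide_win, wide_win))     # wide-area density
    # Relative density: deviation of the pixel from its surroundings,
    # normalized by the wide-area level to reduce thickness effects.
    return (img - local_mean) / (wide_mean + 1e-6)

def colorize_defects(xray: np.ndarray, threshold=0.08) -> np.ndarray:
    rel = relative_density_map(xray)
    out = cv2.cvtColor(xray, cv2.COLOR_GRAY2BGR)
    out[rel > threshold] = (0, 0, 255)   # e.g. holes / low-absorption spots in red
    out[rel < -threshold] = (0, 255, 0)  # e.g. dense inclusions (pests) in green
    return out

# Usage: defects = colorize_defects(cv2.imread("potato.png", cv2.IMREAD_GRAYSCALE))
```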

Automatic Validation of the Geometric Quality of Crowdsourcing Drone Imagery (크라우드소싱 드론 영상의 기하학적 품질 자동 검증)

  • Dongho Lee ;Kyoungah Choi
    • Korean Journal of Remote Sensing
    • /
    • v.39 no.5_1
    • /
    • pp.577-587
    • /
    • 2023
  • The utilization of crowdsourced spatial data has been actively researched; however, issues stemming from the uncertainty of data quality have been raised. In particular, when low-quality data is mixed into drone imagery datasets, it can degrade the quality of spatial information output. In order to address these problems, the study presents a methodology for automatically validating the geometric quality of crowdsourced imagery. Key quality factors such as spatial resolution, resolution variation, matching point reprojection error, and bundle adjustment results are utilized. To classify imagery suitable for spatial information generation, training and validation datasets are constructed, and machine learning is conducted using a radial basis function (RBF)-based support vector machine (SVM) model. The trained SVM model achieved a classification accuracy of 99.1%. To evaluate the effectiveness of the quality validation model, imagery sets before and after applying the model to drone imagery not used in training and validation are compared by generating orthoimages. The results confirm that the application of the quality validation model reduces various distortions that can be included in orthoimages and enhances object identifiability. The proposed quality validation methodology is expected to increase the utility of crowdsourced data in spatial information generation by automatically selecting high-quality data from the multitude of crowdsourced data with varying qualities.
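
A minimal sketch of the described classifier setup, assuming the four quality factors as feature columns and random stand-in data; in the study the features and labels come from the constructed training/validation datasets.

```python
# Minimal sketch (assumed feature layout): RBF-kernel SVM that classifies each
# crowdsourced drone image as suitable (1) or unsuitable (0) for mapping.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Columns: spatial resolution, resolution variation, matching-point
# reprojection error, bundle adjustment result -- illustrative random values.
X = np.random.rand(200, 4)
y = np.random.randint(0, 2, 200)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
model.fit(X_tr, y_tr)
print("validation accuracy:", model.score(X_te, y_te))
```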

GPT-enabled SNS Sentence writing support system Based on Image Object and Meta Information (이미지 객체 및 메타정보 기반 GPT 활용 SNS 문장 작성 보조 시스템)

  • Dong-Hee Lee;Mikyeong Moon;Bong-Jun Choi
    • Journal of the Institute of Convergence Signal Processing
    • /
    • v.24 no.3
    • /
    • pp.160-165
    • /
    • 2023
  • In this study, we propose an SNS sentence-writing assistance system that uses YOLO and GPT to help users write text accompanied by images, as on SNS. We use the YOLO model to extract objects from images inserted during writing, and also extract meta-information such as GPS and creation-time information, and use these as prompt values for GPT. The YOLO model was trained on form image data, and its mAP score is about 0.25 on average. GPT was trained on 1,000 blog posts on the topic of 'restaurant reviews', and the model trained in this study was used to generate sentences from two types of keywords extracted from the images. A closed-ended survey was conducted to evaluate the practicality of the generated sentences and to allow a clear analysis of the results. The questionnaire presented the inserted image together with the keyword-based sentences and consisted of three evaluation items. The results showed that the keywords extracted from the images produced meaningful sentences. Through this study, we found that the accuracy of image-based sentence generation depends on the relationship between the image keywords and the GPT training content.
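
A minimal sketch of the described pipeline under our own assumptions: YOLOv5 loaded via torch.hub for object keywords, EXIF tags for the meta-information, and a hypothetical gpt_generate() placeholder standing in for the fine-tuned GPT model.

```python
# Minimal sketch (assumptions noted above): extract image objects and meta
# information, then compose them into a prompt for the text-generation model.
import torch
from PIL import Image, ExifTags

def extract_keywords(image_path: str) -> list[str]:
    model = torch.hub.load("ultralytics/yolov5", "yolov5s", pretrained=True)
    results = model(image_path)
    # Unique class names of detected objects, e.g. ["pizza", "cup"].
    return sorted(set(results.pandas().xyxy[0]["name"]))

def extract_meta(image_path: str) -> dict:
    exif = Image.open(image_path)._getexif() or {}
    tags = {ExifTags.TAGS.get(k, k): v for k, v in exif.items()}
    return {"time": tags.get("DateTimeOriginal"), "gps": tags.get("GPSInfo")}

def build_prompt(image_path: str) -> str:
    keywords = ", ".join(extract_keywords(image_path))
    meta = extract_meta(image_path)
    return (f"Write an SNS post about a restaurant visit. "
            f"Objects in the photo: {keywords}. "
            f"Taken at: {meta['time']}, location: {meta['gps']}.")

# prompt = build_prompt("review_photo.jpg")
# sentence = gpt_generate(prompt)  # hypothetical call to the fine-tuned GPT model
```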

AI-Based Object Recognition Research for Augmented Reality Character Implementation (증강현실 캐릭터 구현을 위한 AI기반 객체인식 연구)

  • Seok-Hwan Lee;Jung-Keum Lee;Hyun Sim
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.18 no.6
    • /
    • pp.1321-1330
    • /
    • 2023
  • This study addresses the problem of 3D pose estimation for multiple human subjects from a single image, as arises during the development of characters for use in augmented reality. In the existing top-down approach, all objects in the image are first detected and then each is reconstructed independently; the problem is that inconsistent results may occur due to overlap or depth-order mismatch between the reconstructed objects. The goal of this study is to solve these problems and develop a single network that provides a consistent 3D reconstruction of all humans in a scene. Integrating a human body model based on the SMPL parametric system into a top-down framework was an important choice. Through this, two loss terms were introduced: a collision loss based on a distance field and a loss that considers depth order. The first loss prevents overlap between reconstructed people, and the second adjusts the depth ordering of people so that occlusion reasoning and the annotated instance segmentation are rendered consistently. This method allows depth information to be provided to the network without explicit 3D annotation of the images. Experimental results show that the proposed methodology performs better than existing methods on standard 3D pose benchmarks, and that the proposed losses enable more consistent reconstruction from natural images.
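
The two loss terms can be illustrated with a toy, assumption-heavy sketch (this is not the paper's distance-field formulation): a margin-based penalty on pairwise vertex distances between two reconstructed people, and a hinge on the annotated depth order.

```python
# Toy sketch (simplified assumptions, not the paper's losses): (1) a collision
# penalty that pushes two reconstructed people apart when their vertices come
# closer than a margin, and (2) a depth-ordering penalty applied when the
# person annotated as occluded is predicted in front.
import torch

def collision_loss(verts_a: torch.Tensor, verts_b: torch.Tensor, margin: float = 0.02):
    # verts_*: (N, 3) downsampled mesh vertices in a shared camera frame.
    dists = torch.cdist(verts_a, verts_b)             # pairwise distances
    return torch.clamp(margin - dists, min=0).mean()  # penalize interpenetration

def depth_order_loss(depth_front: torch.Tensor, depth_back: torch.Tensor):
    # Scalar depths of the person annotated as in front / behind;
    # penalize predictions that invert the annotated order.
    return torch.clamp(depth_front - depth_back, min=0)

# Example with random stand-ins for two downsampled SMPL meshes.
va, vb = torch.rand(1000, 3), torch.rand(1000, 3) + 0.5
loss = collision_loss(va, vb) + depth_order_loss(torch.tensor(2.0), torch.tensor(3.0))
```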

Detecting high-resolution usage status of individual parcel of land using object detecting deep learning technique (객체 탐지 딥러닝 기법을 활용한 필지별 조사 방안 연구)

  • Jeon, Jeong-Bae
    • Journal of Cadastre & Land InformatiX
    • /
    • v.54 no.1
    • /
    • pp.19-32
    • /
    • 2024
  • This study examined the feasibility of image-based surveys by detecting objects in facilities and agricultural land with the YOLO algorithm applied to drone images and comparing the results with the legal land category. When objects were detected with the YOLO algorithm, buildings were detected for 96.3% of the buildings contained in the existing digital map. In addition, the YOLO model developed in this study detected 136 additional buildings that were not present in the digital map. For plastic greenhouses, a total of 297 objects were detected, but the detection rate was low for some greenhouses used for fruit trees. Agricultural land had the lowest detection rate. This is because agricultural land has a larger area and more irregular shapes than buildings, so inconsistency in the training data lowers the accuracy compared with buildings. Therefore, segmentation-based detection, rather than box-shaped detection, is likely to be more effective for agricultural fields. Comparing the detected objects with the legal land category showed that some buildings exist in agricultural and forest zones where siting a building is difficult. Linking the results with administrative information appears necessary to determine whether these buildings are being used illegally. Therefore, at the current level, it is possible to objectively determine the existence of buildings in areas where siting a building is difficult.
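
A minimal sketch, assuming georeferenced detection boxes and parcel polygons with legal land categories (the values below are illustrative), of how detected buildings might be cross-checked against the land category.

```python
# Minimal sketch (assumed data layout): intersect detected building boxes with
# cadastral parcel polygons and flag buildings on parcels whose legal land
# category does not normally allow buildings.
from shapely.geometry import box, Polygon

parcels = [  # (parcel id, legal land category, polygon) -- illustrative values
    ("110-3", "field", Polygon([(0, 0), (50, 0), (50, 40), (0, 40)])),
    ("110-4", "building site", Polygon([(50, 0), (100, 0), (100, 40), (50, 40)])),
]
detections = [box(10, 5, 22, 15), box(60, 10, 75, 25)]  # detected building boxes

for det in detections:
    for pid, category, parcel in parcels:
        overlap = det.intersection(parcel).area / det.area
        if overlap > 0.5 and category != "building site":
            print(f"building detected on parcel {pid} with land category '{category}'")
```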

Digital Library Interface Research Based on EEG, Eye-Tracking, and Artificial Intelligence Technologies: Focusing on the Utilization of Implicit Relevance Feedback (뇌파, 시선추적 및 인공지능 기술에 기반한 디지털 도서관 인터페이스 연구: 암묵적 적합성 피드백 활용을 중심으로)

  • Hyun-Hee Kim;Yong-Ho Kim
    • Journal of the Korean Society for information Management
    • /
    • v.41 no.1
    • /
    • pp.261-282
    • /
    • 2024
  • This study proposed and evaluated electroencephalography (EEG)-based and eye-tracking-based methods to determine relevance by utilizing users' implicit relevance feedback while navigating content in a digital library. For this, EEG/eye-tracking experiments were conducted on 32 participants using video, image, and text data. To assess the usefulness of the proposed methods, deep learning-based artificial intelligence (AI) techniques were used as a competitive benchmark. The evaluation results showed that EEG component-based methods (av_P600 and f_P3b components) demonstrated high classification accuracy in selecting relevant videos and images (faces/emotions). In contrast, AI-based methods, specifically object recognition and natural language processing, showed high classification accuracy for selecting images (objects) and texts (newspaper articles). Finally, guidelines for implementing a digital library interface based on EEG, eye-tracking, and artificial intelligence technologies have been proposed. Specifically, a system model based on implicit relevance feedback has been presented. Moreover, to enhance classification accuracy, methods suitable for each media type have been suggested, including EEG-based, eye-tracking-based, and AI-based approaches.
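
A minimal sketch (with hypothetical function names) of the guideline above: route relevance classification to the approach that showed the highest accuracy for each media type.

```python
# Minimal sketch (hypothetical classifier names): choose the relevance-feedback
# method per media type, following the study's guideline.
from typing import Callable

def choose_classifier(media_type: str,
                      eeg_based: Callable, ai_based: Callable) -> Callable:
    # Videos and face/emotion images -> EEG component-based (av_P600, f_P3b);
    # object images and newspaper-article texts -> AI-based (object recognition, NLP).
    if media_type in ("video", "image_face"):
        return eeg_based
    if media_type in ("image_object", "text_article"):
        return ai_based
    raise ValueError(f"unknown media type: {media_type}")

# relevance = choose_classifier("video", eeg_based=classify_by_erp, ai_based=classify_by_ai)(item)
```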

Automated Analyses of Ground-Penetrating Radar Images to Determine Spatial Distribution of Buried Cultural Heritage (매장 문화재 공간 분포 결정을 위한 지하투과레이더 영상 분석 자동화 기법 탐색)

  • Kwon, Moonhee;Kim, Seung-Sep
    • Economic and Environmental Geology
    • /
    • v.55 no.5
    • /
    • pp.551-561
    • /
    • 2022
  • Geophysical exploration methods are very useful for generating high-resolution images of underground structures and can be applied to the investigation of buried cultural properties and the determination of their exact locations. In this study, image feature extraction and image segmentation methods were applied to automatically distinguish the structures of buried relics in high-resolution ground-penetrating radar (GPR) images obtained at the center of the Silla Kingdom, Gyeongju, South Korea. The main purpose of the feature extraction analyses is to identify the circular features from building remains and the linear features from ancient roads and fences. Feature extraction is implemented by applying the Canny edge detection and Hough transform algorithms: we applied the Hough transform to the edge image produced by the Canny algorithm in order to determine the locations of the target features. However, the Hough transform requires different parameter settings for each survey sector. For image segmentation, we applied a connected-component labeling algorithm and object-based image analysis using the Orfeo Toolbox (OTB) in QGIS. The connected-component labeled image shows that the signals associated with the target buried relics are effectively connected and labeled; however, multiple labels are often assigned to a single structure in the given GPR data. Object-based image analysis was conducted using Large-Scale Mean-Shift (LSMS) image segmentation. In this analysis, a vector layer containing pixel values for each segmented polygon was estimated first and then used to build a training-validation dataset by assigning the polygons to one class associated with the buried relics and another class for the background field. With a random forest classifier, we find that the polygons in the LSMS segmentation layer can be successfully classified into polygons of the buried relics and polygons of the background. Thus, we propose that the automatic classification methods applied to the GPR images of buried cultural heritage in this study can be useful for obtaining consistent analysis results when planning excavation processes.
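
A minimal sketch of the feature-extraction step using OpenCV, with assumed parameter values; as noted above, the Hough parameters would need retuning for each survey sector.

```python
# Minimal sketch (assumed parameters): Canny edge detection followed by Hough
# transforms to pick out linear features (roads, fences) and circular features
# (building remains) in a GPR depth-slice image.
import cv2
import numpy as np

gpr = cv2.imread("gpr_slice.png", cv2.IMREAD_GRAYSCALE)
edges = cv2.Canny(gpr, threshold1=50, threshold2=150)

# Linear features: probabilistic Hough line transform on the edge image.
lines = cv2.HoughLinesP(edges, rho=1, theta=np.pi / 180, threshold=80,
                        minLineLength=40, maxLineGap=10)

# Circular features: Hough circle transform on the smoothed original image.
circles = cv2.HoughCircles(cv2.medianBlur(gpr, 5), cv2.HOUGH_GRADIENT, dp=1.2,
                           minDist=30, param1=150, param2=40,
                           minRadius=10, maxRadius=60)

print(f"{0 if lines is None else len(lines)} line segments, "
      f"{0 if circles is None else circles.shape[1]} circles detected")
```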