• Title/Summary/Keyword: Camera-based Recognition

A Study on Gesture Interface through User Experience (사용자 경험을 통한 제스처 인터페이스에 관한 연구)

  • Yoon, Ki Tae;Cho, Eel Hea;Lee, Jooyoup
    • Asia-pacific Journal of Multimedia Services Convergent with Art, Humanities, and Sociology
    • /
    • v.7 no.6
    • /
    • pp.839-849
    • /
    • 2017
  • Recently, the role of the kitchen has evolved from a space for mere survival into a space that reflects contemporary life and culture. Along with these changes, the use of IoT technology is spreading, driving the development and diffusion of new smart devices for the kitchen. The user experience of these smart devices is also becoming important. For natural interaction between a user and a computer, better interactions can be expected when they are based on context awareness. This paper examines a Natural User Interface (NUI) that requires no touching of the device, building on the user interface (UI) of smart devices used in the kitchen. In this method, image processing technology recognizes the user's hand gestures through the camera attached to the device and applies the recognized hand shapes to the interface. The gestures used in this study are proposed according to the user's context and situation, and five kinds of gestures are classified and used in the interface.
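
As a rough illustration of the camera-based hand-shape pipeline described above, the sketch below classifies a hand by counting convexity defects with OpenCV. The skin-color range, the defect-depth threshold, and the finger-count-to-gesture mapping are all illustrative assumptions; the paper's five gestures are not specified in the abstract.

```python
import cv2
import numpy as np

# Illustrative mapping from counted fingers to gesture labels;
# the paper's actual five-gesture set is not given in the abstract.
GESTURES = {0: "fist", 1: "point", 2: "two", 5: "open_palm"}

def classify_hand(frame):
    """Classify a hand shape by counting convexity defects (finger gaps)."""
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    # Rough skin-tone range; needs tuning per camera and lighting.
    mask = cv2.inRange(hsv, (0, 30, 60), (20, 150, 255))
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    hand = max(contours, key=cv2.contourArea)
    hull = cv2.convexHull(hand, returnPoints=False)
    defects = cv2.convexityDefects(hand, hull)
    if defects is None:
        return GESTURES[0]
    # Deep defects correspond to the gaps between extended fingers.
    gaps = sum(1 for d in defects[:, 0] if d[3] / 256.0 > 20)
    fingers = gaps + 1 if gaps else 0
    return GESTURES.get(fingers, "unknown")

cap = cv2.VideoCapture(0)
ok, frame = cap.read()
if ok:
    print(classify_hand(frame))
cap.release()
```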

Deep Learning-Based Defects Detection Method of Expiration Date Printed In Product Package (딥러닝 기반의 제품 포장에 인쇄된 유통기한 결함 검출 방법)

  • Lee, Jong-woon;Jeong, Seung Su;Yu, Yun Seop
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference
    • /
    • 2021.05a
    • /
    • pp.463-465
    • /
    • 2021
  • Currently, expiration dates printed on food packages and boxes are inspected by sampling only a few products and checking them with the human eye. Such sampling inspection has the limitation that only a small number of products can be examined, so accurate camera-based inspection is required. This paper proposes a deep learning object recognition model, an artificial intelligence technique, as a method for detecting defects in the expiration date printed on product packaging. Using the Faster R-CNN (region-based convolutional neural network) model, color images, converted grayscale images, and converted binary images of the printed expiration date are trained and then tested, and the detection rates are compared. The detection performance for expiration dates printed on packages with the proposed method matched that of a conventional vision-based inspection system.
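
The abstract includes no code, but a minimal sketch of the described setup with torchvision's Faster R-CNN implementation might look as follows. The class count (background plus normal and defective print) and the input shape are assumptions.

```python
import torch
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

# Hypothetical classes: background + normal print + defective print.
NUM_CLASSES = 3

def build_model(num_classes=NUM_CLASSES):
    """Faster R-CNN with a ResNet-50 FPN backbone, head swapped for our classes."""
    model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
    in_features = model.roi_heads.box_predictor.cls_score.in_features
    model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)
    return model

model = build_model()
model.eval()
# A grayscale or binarized image can be fed by replicating it to 3 channels,
# which is one way to reproduce the paper's color/gray/binary comparison.
image = torch.rand(3, 480, 640)
with torch.no_grad():
    pred = model([image])[0]
print(pred["boxes"].shape, pred["labels"], pred["scores"])
```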

Study of Deep Learning Based Specific Person Following Mobility Control for Logistics Transportation (물류 이송을 위한 딥러닝 기반 특정 사람 추종 모빌리티 제어 연구)

  • Yeong Jun Yu;SeongHoon Kang;JuHwan Kim;SeongIn No;GiHyeon Lee;Seung Yong Lee;Chul-hee Lee
    • Journal of Drive and Control
    • /
    • v.20 no.4
    • /
    • pp.1-8
    • /
    • 2023
  • In recent years, robots have been utilized in various industries to reduce workload and enhance work efficiency. Following mobility offers users convenience by autonomously tracking specific locations and targets without the need for additional equipment such as forklifts or carts. In this paper, deep learning techniques were employed to recognize individuals and assign each of them a unique identifier, enabling the recognition of a specific person even among multiple individuals. To achieve this, the distance and angle between the robot and the targeted individual are transmitted to the respective controllers. Furthermore, this study explored control methodologies for mobility that tracks a specific person, utilizing Simultaneous Localization and Mapping (SLAM) and Proportional-Integral-Derivative (PID) control techniques. In the PID control method, a genetic algorithm is employed to extract the optimal gain values, and PID performance is subsequently evaluated through simulation. The SLAM method involves generating a map by synchronizing data from a 2D LiDAR and a depth camera using Real-Time Appearance-Based Mapping (RTAB-MAP). Experiments are conducted to compare and analyze the performance of the two control methods, visualizing the paths of both the human and the following mobility.
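
As a sketch of the GA-based PID gain tuning step, the code below evolves (Kp, Ki, Kd) against a toy first-order plant. The plant model, the IAE fitness, and all GA parameters are assumptions; the paper's simulation model is not given in the abstract.

```python
import random

def simulate_pid(kp, ki, kd, setpoint=1.0, dt=0.02, steps=250):
    """Integrate a toy first-order plant under PID control; return total |error|."""
    y, integ, prev_err, cost = 0.0, 0.0, setpoint, 0.0
    for _ in range(steps):
        err = setpoint - y
        integ += err * dt
        deriv = (err - prev_err) / dt
        u = kp * err + ki * integ + kd * deriv
        y += (-y + u) * dt          # assumed plant: dy/dt = -y + u
        prev_err = err
        cost += abs(err) * dt       # IAE cost, a common GA fitness choice
    return cost

def ga_tune(pop_size=30, generations=40):
    """Minimal real-coded GA: elitism, blend crossover, Gaussian mutation."""
    pop = [[random.uniform(0, 10) for _ in range(3)] for _ in range(pop_size)]
    for _ in range(generations):
        elite = sorted(pop, key=lambda g: simulate_pid(*g))[: pop_size // 2]
        children = []
        while len(children) < pop_size - len(elite):
            a, b = random.sample(elite, 2)
            child = [(x + y) / 2 + random.gauss(0, 0.3) for x, y in zip(a, b)]
            children.append([max(0.0, g) for g in child])
        pop = elite + children
    return min(pop, key=lambda g: simulate_pid(*g))

kp, ki, kd = ga_tune()
print(f"best gains: kp={kp:.2f}, ki={ki:.2f}, kd={kd:.2f}")
```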

A Study on the Automation of Fish Species Identification and Body Length Measurement System (어종 인식 및 체장 측정 자동화 시스템에 관한 연구)

  • Seung-Beom Kang;Seung-Gyu Kim;Sae-Yong Park;Tae-ho Im
    • Journal of Internet Computing and Services
    • /
    • v.25 no.1
    • /
    • pp.17-27
    • /
    • 2024
  • Overfishing, climate change, and competitive fishing have led to a continuous decline in fishery production. To address these issues, the Total Allowable Catch (TAC) system has been established, which sets annual catch quotas for individual fish species and allows fishing only within those limits. As part of the TAC system, land-based investigators measure the length and height of fish species at auction markets to calculate the weight and TAC depletion. However, the accuracy of the acquired data varies depending on the skill level of the land-based investigators, and the labor-intensive nature of the work makes it unsustainable. To address these issues, this paper proposes a fish species recognition and length measurement system that automatically measures the length, height, and weight of eight TAC-managed fish species using the camera of a smart pad that can measure the distance to the water surface. This system can help to automate the current labor-intensive work, minimize data loss, and facilitate the establishment of the TAC system.
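
A minimal sketch of the length calculation implied here, assuming a pinhole camera model, a calibrated focal length, and the pad's measured distance to the fish plane (all values illustrative):

```python
def pixel_to_real_length(pixel_len, distance_mm, focal_px):
    """Pinhole model: real size = pixel size * distance / focal length (in px)."""
    return pixel_len * distance_mm / focal_px

# Hypothetical values: a detector returns the fish's bounding box,
# and the smart pad reports the camera-to-fish distance.
box = (120, 340, 980, 520)            # x1, y1, x2, y2 in pixels
distance_mm = 750.0                   # measured distance to the fish plane
focal_px = 1500.0                     # focal length from camera calibration

length_mm = pixel_to_real_length(box[2] - box[0], distance_mm, focal_px)
height_mm = pixel_to_real_length(box[3] - box[1], distance_mm, focal_px)

# Weight is commonly estimated from the length-weight relation W = a * L^b;
# a and b here are placeholder species coefficients, not the paper's values.
a, b = 0.012, 3.05
weight_g = a * (length_mm / 10.0) ** b
print(f"length {length_mm:.0f} mm, height {height_mm:.0f} mm, weight {weight_g:.0f} g")
```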

Real-Time Traffic Information and Road Sign Recognitions of Circumstance on Expressway for Vehicles in C-ITS Environments (C-ITS 환경에서 차량의 고속도로 주행 시 주변 환경 인지를 위한 실시간 교통정보 및 안내 표지판 인식)

  • Im, Changjae;Kim, Daewon
    • Journal of the Institute of Electronics and Information Engineers
    • /
    • v.54 no.1
    • /
    • pp.55-69
    • /
    • 2017
  • Recently, the IoT (Internet of Things) environment, in which intelligent objects are linked through networks, has been developing rapidly. Through the IoT, humans can communicate with objects and objects can communicate with each other, and the IoT provides artificially intelligent services combined with situational awareness. One of the industries building on the IoT is the automotive industry. Self-driving vehicles, which are not only fuel-efficient and smooth in traffic but also put top priority on human safety, have become a major topic. For several years, research on recognizing the surrounding environment for self-driving vehicles using sensors, lidar, cameras, and radar has been actively conducted. Currently, based on WAVE (Wireless Access in Vehicular Environments), this research is being accelerated by networking between vehicles and between vehicles and infrastructure. In this paper, recognition of traffic signs on highways was studied as part of the awareness of the surrounding environment for self-driving vehicles. Because traffic signs have standardized formats and fixed installation locations, we provide a learning method and corresponding experimental results on how a vehicle recognizes traffic signs and the additional information on them.
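
Exploiting the fixed colors and standardized shapes of expressway guide signs, a candidate-region stage might be sketched as below with OpenCV. The HSV range, area, and aspect-ratio filters are guesses, and the paper's learning-based recognizer would replace the final print.

```python
import cv2
import numpy as np

def propose_sign_regions(bgr, min_area=2000):
    """Segment green guide-sign candidates by color; ranges are rough guesses."""
    hsv = cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, (40, 80, 60), (85, 255, 255))   # green hue band
    mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, np.ones((7, 7), np.uint8))
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    boxes = []
    for c in contours:
        x, y, w, h = cv2.boundingRect(c)
        # Guide signs are wide rectangles; filter by area and aspect ratio.
        if w * h >= min_area and 1.2 < w / float(h) < 5.0:
            boxes.append((x, y, w, h))
    return boxes

frame = cv2.imread("expressway_frame.jpg")   # hypothetical input frame
if frame is not None:
    for (x, y, w, h) in propose_sign_regions(frame):
        crop = frame[y:y + h, x:x + w]
        # Each crop would then go to a learned classifier / OCR stage.
        print("candidate sign at", (x, y, w, h))
```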

Design of CNN-based Braille Conversion and Voice Output Device for the Blind (시각장애인을 위한 CNN 기반의 점자 변환 및 음성 출력 장치 설계)

  • Seung-Bin Park;Bong-Hyun Kim
    • Journal of Internet of Things and Convergence
    • /
    • v.9 no.3
    • /
    • pp.87-92
    • /
    • 2023
  • As times change, information becomes more diverse, as do the methods of obtaining it. About 80% of the information gained in life is acquired through the visual sense, yet visually impaired people have only a limited ability to interpret visual materials. This is why Braille, a writing system for the blind, emerged. However, the Braille literacy rate among the blind is only 5%, and as demand from blind users for various platforms and materials grows over time, development and production of products for the blind is taking place. One example is Braille books, which appear to have more disadvantages than advantages; unlike for non-disabled people, access to information remains very difficult. In this paper, we design a CNN-based Braille conversion and voice output device to make it easier for visually impaired people to obtain information than with conventional methods. The device aims to improve quality of life by converting books, text images, or handwritten images that are not available in Braille into Braille through camera recognition, with a function that can also convert them into voice according to the needs of the blind.
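
A sketch of the post-recognition stage, assuming the CNN's OCR output is plain text: map it to Unicode Braille cells (Grade-1 English shown here; the device presumably also handles Korean Braille) and optionally speak it with an offline TTS engine such as pyttsx3. The recognized string and the TTS choice are assumptions.

```python
import pyttsx3  # offline TTS engine (assumed available on the device)

# Grade-1 (uncontracted) English Braille dot patterns; dots 1-6 per cell.
DOTS = {
    "a": (1,), "b": (1, 2), "c": (1, 4), "d": (1, 4, 5), "e": (1, 5),
    "f": (1, 2, 4), "g": (1, 2, 4, 5), "h": (1, 2, 5), "i": (2, 4),
    "j": (2, 4, 5), "k": (1, 3), "l": (1, 2, 3), "m": (1, 3, 4),
    "n": (1, 3, 4, 5), "o": (1, 3, 5), "p": (1, 2, 3, 4),
    "q": (1, 2, 3, 4, 5), "r": (1, 2, 3, 5), "s": (2, 3, 4),
    "t": (2, 3, 4, 5), "u": (1, 3, 6), "v": (1, 2, 3, 6),
    "w": (2, 4, 5, 6), "x": (1, 3, 4, 6), "y": (1, 3, 4, 5, 6),
    "z": (1, 3, 5, 6), " ": (),
}

def to_braille(text):
    """Map text to Unicode Braille cells (U+2800 block); dot n sets bit n-1."""
    cells = []
    for ch in text.lower():
        dots = DOTS.get(ch, ())        # unmapped characters become blank cells
        cells.append(chr(0x2800 + sum(1 << (d - 1) for d in dots)))
    return "".join(cells)

recognized = "hello world"             # placeholder for the CNN OCR output
print(to_braille(recognized))          # would drive the Braille cell actuators
engine = pyttsx3.init()                # voice output path, per user preference
engine.say(recognized)
engine.runAndWait()
```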

Automatic gasometer reading system using selective optical character recognition (관심 문자열 인식 기술을 이용한 가스계량기 자동 검침 시스템)

  • Lee, Kyohyuk;Kim, Taeyeon;Kim, Wooju
    • Journal of Intelligence and Information Systems
    • /
    • v.26 no.2
    • /
    • pp.1-25
    • /
    • 2020
  • In this paper, we suggest an application system architecture which provides accurate, fast, and efficient automatic gasometer reading. The system captures a gasometer image using a mobile device camera, transmits the image to a cloud server over a private LTE network, and analyzes the image to extract the device ID and gas usage amount by selective optical character recognition based on deep learning. In general, an image contains many types of characters, and optical character recognition extracts all of them, but some applications need to ignore characters that are not of interest and focus only on specific types. For example, an automatic gasometer reading system only needs to extract the device ID and gas usage amount from gasometer images in order to bill users; strings such as the device type, manufacturer, manufacturing date, and specification are not valuable to the application. Thus, the application has to analyze only the region of interest and specific character types to extract valuable information. We adopted CNN (Convolutional Neural Network) based object detection and CRNN (Convolutional Recurrent Neural Network) technology for selective optical character recognition, which analyzes only the region of interest for selective character extraction. We built three neural networks for the application system: the first is a convolutional neural network which detects the regions of interest containing the gas usage amount and device ID strings; the second is another convolutional neural network which transforms the spatial information of a region of interest into spatially sequential feature vectors; and the third is a bi-directional long short-term memory network which converts the sequential information into character strings by time-series mapping from feature vectors to characters. In this research, the strings of interest are the device ID, which consists of 12 arabic numerals, and the gas usage amount, which consists of 4-5 arabic numerals. All system components are implemented in the Amazon Web Services cloud with an Intel Xeon E5-2686 v4 CPU and an NVIDIA Tesla V100 GPU. The system architecture adopts a master-slave processing structure for efficient and fast parallel processing, coping with about 700,000 requests per day. A mobile device captures a gasometer image and transmits it to the master process in the AWS cloud. The master process runs on the Intel Xeon CPU and pushes each reading request onto an input queue with a FIFO (First In First Out) structure. The slave process consists of the three deep neural networks which conduct the character recognition and runs on the NVIDIA GPU module. The slave process continuously polls the input queue for recognition requests; when a request is present, it converts the image into the device ID string, the gas usage amount string, and the position information of the strings, returns this information to an output queue, and switches back to idle mode to poll the input queue. The master process gets the final information from the output queue and delivers it to the mobile device. We used a total of 27,120 gasometer images for training, validation, and testing of the three deep neural networks: 22,985 images were used for training and validation, and 4,135 images for testing. We randomly split the 22,985 images with an 8:2 ratio into training and validation sets for each training epoch. The 4,135 test images were categorized into 5 types (normal, noise, reflex, scale, and slant): normal data are clean images, noise means images with a noise signal, reflex means images with light reflection in the gasometer region, scale means images with a small object size due to long-distance capture, and slant means images which are not horizontally flat. The final string recognition accuracies for the device ID and gas usage amount on normal data are 0.960 and 0.864, respectively.
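
A minimal sketch of the described master-slave queue pattern, with Python threads and an in-process FIFO queue standing in for the AWS CPU/GPU processes; the recognition function is a stub for the three-network pipeline.

```python
import queue
import threading

input_q: "queue.Queue[bytes]" = queue.Queue()   # FIFO, as in the paper
output_q: "queue.Queue[dict]" = queue.Queue()

def recognize(image_bytes):
    """Stub for the 3-network pipeline (ROI detector -> CRNN -> BiLSTM)."""
    return {"device_id": "000000000000", "usage": "0000", "boxes": []}

def slave_worker():
    # Slave: poll the input queue, run recognition (on GPU), push results.
    while True:
        image = input_q.get()          # blocks until a request arrives
        output_q.put(recognize(image))
        input_q.task_done()

def master_submit(image_bytes):
    # Master: enqueue the mobile capture and wait for the slave's answer.
    input_q.put(image_bytes)
    return output_q.get()

threading.Thread(target=slave_worker, daemon=True).start()
print(master_submit(b"...jpeg bytes..."))
```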

A Study on Releasing Cryptographic Key by Using Face and Iris Information on mobile phones (휴대폰 환경에서 얼굴 및 홍채 정보를 이용한 암호화키 생성에 관한 연구)

  • Han, Song-Yi;Park, Kang-Ryoung;Park, So-Young
    • Journal of the Institute of Electronics Engineers of Korea CI
    • /
    • v.44 no.6
    • /
    • pp.1-9
    • /
    • 2007
  • Recently, as many media functions are converged into the phone, the security requirements of services provided on mobile phones are increasing. Conventional cryptographic keys based on passwords and security cards are used on mobile phones, but they are vulnerable and easily stolen. To overcome this problem, research on generating keys from biometrics has been conducted. However, biometric information is susceptible to environmental variation, whereas a conventional cryptographic system must produce an invariant key every time. We therefore propose a new method of producing a cryptographic key based on "biometric matching-based key release" instead of "biometric-based key generation," using both face and iris information to overcome the instability of uni-modal biometrics. Also, by using the mega-pixel camera embedded in the mobile phone, we offer users the convenience of performing face and iris recognition at the same time. Experimental results showed an EER (Equal Error Rate) of 0.5% when producing the cryptographic key, and an FAR of about 0.002% at an FRR of 25%. In addition, our system can control FAR and FRR through the threshold.
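
A sketch of the key-release idea, assuming score-level fusion of the two modalities: the key is generated once and stored unchanged, and is released only when the fused match score clears a threshold, which is also the knob that trades FAR against FRR. Scores, fusion weights, and the threshold are illustrative.

```python
import secrets

THRESHOLD = 0.80   # operating point; raising it lowers FAR at the cost of FRR

class KeyVault:
    """Release a stored (invariant) key only when the biometric match passes."""
    def __init__(self):
        self._key = secrets.token_bytes(32)   # fixed at enrollment, never regenerated

    def release(self, face_score, iris_score):
        # Simple score-level fusion of the two modalities (equal weights assumed).
        fused = 0.5 * face_score + 0.5 * iris_score
        if fused >= THRESHOLD:
            return self._key                  # key release on successful match
        raise PermissionError("biometric match below threshold")

vault = KeyVault()
# Hypothetical match scores from the phone's face and iris matchers (0..1).
key = vault.release(face_score=0.91, iris_score=0.87)
print(len(key), "byte key released")
```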

The Tunnel Lane Positioning System of an Autonomous Vehicle in the LED Lighting (LED 조명을 이용한 자율주행차용 터널 차로측위 시스템)

  • Jeong, Jae hoon;Lee, Dong heon;Byun, Gi-sig;Cho, Hyung rae;Cho, Yoon ho
    • The Journal of The Korea Institute of Intelligent Transport Systems
    • /
    • v.16 no.1
    • /
    • pp.186-195
    • /
    • 2017
  • Recently, autonomous vehicles have been actively studied. Various technologies such as ITS, Connected Car, V2X, and ADAS exist to realize such autonomous driving. Among them, it is particularly important for the vehicle to recognize where it is on the road in order to change lanes and drive to its destination. Generally, this is done through GPS and camera image processing. However, GPS positioning is unreliable in shaded areas such as tunnels, and camera image processing is limited by the condition of the lane markings and the surrounding environment. In this paper, we propose installing LED lighting for autonomous vehicles in tunnels, which are GPS-shaded areas. After constructing a simulated tunnel LED lighting environment that illuminates each lane with light of a different color temperature, we show that the autonomous vehicle can determine its current lane by analyzing the color temperature. Based on this, the paper proposes a lane positioning technique using tunnel LED lighting.
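
A sketch of the color-temperature analysis, assuming a mean linear-RGB patch from the vehicle camera: convert to CIE XYZ, estimate the correlated color temperature with McCamy's approximation, and pick the lane whose assigned color temperature is nearest. The per-lane CCT table is an assumption.

```python
def rgb_to_cct(r, g, b):
    """Estimate CCT (K) from linear RGB via CIE XYZ and McCamy's approximation."""
    # Linear sRGB to CIE XYZ, D65 white point.
    X = 0.4124 * r + 0.3576 * g + 0.1805 * b
    Y = 0.2126 * r + 0.7152 * g + 0.0722 * b
    Z = 0.0193 * r + 0.1192 * g + 0.9505 * b
    x, y = X / (X + Y + Z), Y / (X + Y + Z)
    n = (x - 0.3320) / (0.1858 - y)
    return 449.0 * n**3 + 3525.0 * n**2 + 6823.3 * n + 5520.33

# Hypothetical lane plan: each lane lit with a distinct color temperature.
LANE_CCT = {1: 3000.0, 2: 4500.0, 3: 6000.0}

def lane_from_patch(mean_rgb):
    """Return the lane whose assigned CCT is nearest to the observed one."""
    cct = rgb_to_cct(*mean_rgb)
    return min(LANE_CCT, key=lambda lane: abs(LANE_CCT[lane] - cct))

print(lane_from_patch((1.0, 0.6, 0.2)))   # warm light -> low CCT -> lane 1
```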

Development of a Low-cost Monocular PSD Motion Capture System with Two Active Markers at Fixed Distance (일정간격의 두 능동마커를 이용한 저가형 단안 PSD 모션캡쳐 시스템 개발)

  • Seo, Pyeong-Won;Kim, Yu-Geon;Han, Chang-Ho;Ryu, Young-Kee;Oh, Choon-Suk
    • Journal of the Institute of Electronics Engineers of Korea SC
    • /
    • v.46 no.2
    • /
    • pp.61-71
    • /
    • 2009
  • In this paper, we propose a low-cost, compact motion capture system that enables motion games to be played on the PS2 (PlayStation 2). Motion capture systems currently used in film production and game development are expensive and bulky, while motion games using a common USB camera are slow and limited to two-dimensional recognition. A PSD sensor, by contrast, is both fast and low-cost. In recent years, 3D motion capture systems using a 2D PSD (Position Sensitive Detector) optical sensor have been developed: one is a multi-PSD motion capture system applying stereo vision, and another is a single-PSD motion capture system based on an optical model. However, there are problems in applying them to motion games. The multi-PSD system is costly and complicated because it uses two or more PSD cameras, and the single-PSD system requires markers with omnidirectionally uniform intensity, which are difficult to make. In this research, we propose a new approach that solves the aforementioned problems: 3D coordinates can be measured when the intensities of the two separated markers are equal. We built a system based on this approach and conducted experiments to evaluate its performance. As a result, we were able to develop a single-camera, low-cost, fast, compact, wide-angle motion capture system adaptable to motion games. The developed system is expected to be useful in animation, movies, and games.
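
The core geometric idea, that a known fixed gap D between the two markers yields depth via the pinhole relation Z = f * D / d (with d the imaged marker separation), can be sketched as follows; the focal length, principal point, and marker centroids are illustrative values.

```python
import math

FOCAL_PX = 800.0       # camera focal length in pixels (from calibration)
MARKER_GAP_MM = 100.0  # known fixed distance between the two active markers

def depth_from_markers(p1, p2, focal_px=FOCAL_PX, gap_mm=MARKER_GAP_MM):
    """Pinhole relation: Z = f * D / d, with d the imaged marker separation."""
    d_px = math.dist(p1, p2)
    return focal_px * gap_mm / d_px

def marker_xyz(p, z_mm, cx=320.0, cy=240.0, focal_px=FOCAL_PX):
    """Back-project an image point to camera coordinates at depth z."""
    x_mm = (p[0] - cx) * z_mm / focal_px
    y_mm = (p[1] - cy) * z_mm / focal_px
    return (x_mm, y_mm, z_mm)

# Hypothetical PSD readings of the two marker centroids (pixels).
p1, p2 = (300.0, 250.0), (380.0, 250.0)
z = depth_from_markers(p1, p2)           # 800 * 100 / 80 = 1000 mm
print("depth:", z, "mm; marker 1 at", marker_xyz(p1, z))
```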