• Title/Summary/Keyword: camera vision


Design and Implementation of a Real-Time Lipreading System Using PCA & HMM (PCA와 HMM을 이용한 실시간 립리딩 시스템의 설계 및 구현)

  • Lee Chi-geun;Lee Eun-suk;Jung Sung-tae;Lee Sang-seol
    • Journal of Korea Multimedia Society / v.7 no.11 / pp.1597-1609 / 2004
  • Many lipreading systems have been proposed to compensate for the drop in speech recognition accuracy in noisy environments. Previous lipreading systems work only under specific conditions such as artificial lighting and a predefined background color. In this paper, we propose a real-time lipreading system that allows the speaker to move and relaxes the restrictions on color and lighting conditions. The proposed system extracts the face and lip regions, together with the essential visual information, from a video sequence captured with a common PC camera, and recognizes uttered words from that information in real time. It uses a hue histogram model to extract the face and lip regions, the mean shift algorithm to track the face of a moving speaker, PCA (Principal Component Analysis) to extract the visual features for training and testing, and an HMM (Hidden Markov Model) as the recognition algorithm; a minimal sketch of this pipeline is given below. The experimental results show that the system achieves a recognition rate of 90% for speaker-dependent lipreading and, when combined with audio speech recognition, raises the speech recognition rate to 40~85% depending on the noise level.

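A minimal sketch of the PCA + HMM pipeline described in the abstract, assuming lip-region frames are already cropped to 32x32 grayscale images; scikit-learn and hmmlearn stand in for the authors' own implementation, and one GaussianHMM per vocabulary word is an assumption.

```python
import numpy as np
from sklearn.decomposition import PCA
from hmmlearn import hmm

def fit_pca(train_frames, n_components=20):
    """train_frames: (N, 32*32) flattened lip images from all training videos."""
    pca = PCA(n_components=n_components)
    pca.fit(train_frames)
    return pca

def train_word_models(pca, sequences_per_word, n_states=5):
    """sequences_per_word: dict word -> list of (T_i, 32*32) frame sequences."""
    models = {}
    for word, seqs in sequences_per_word.items():
        feats = [pca.transform(s) for s in seqs]          # project to eigen-lip space
        X = np.vstack(feats)
        lengths = [f.shape[0] for f in feats]
        m = hmm.GaussianHMM(n_components=n_states, covariance_type="diag", n_iter=50)
        m.fit(X, lengths)                                  # Baum-Welch training per word
        models[word] = m
    return models

def recognize(pca, models, frame_sequence):
    """Pick the word whose HMM gives the highest log-likelihood."""
    feats = pca.transform(frame_sequence)
    return max(models, key=lambda w: models[w].score(feats))
```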

Design and Implementation of the Stop line and Crosswalk Recognition Algorithm for Autonomous UGV (자율 주행 UGV를 위한 정지선과 횡단보도 인식 알고리즘 설계 및 구현)

  • Lee, Jae Hwan;Yoon, Heebyung
    • Journal of the Korean Institute of Intelligent Systems / v.24 no.3 / pp.271-278 / 2014
  • Although the stop line and crosswalk are among the most basic objects a transportation system must recognize, the features that can be extracted from them are very limited, and they are hard to detect not only with image-based recognition but also with laser, RF, and GPS/INS technologies; for this reason, little research has been done in this area. In this paper, an algorithm for recognizing the stop line and crosswalk is designed and implemented using image-based recognition on images from a vision sensor. The algorithm consists of three functions: 'Region of Interest', which selects in advance the area needed for feature extraction in order to speed up processing; 'Color Pattern Inspection', which processes only images in which white is detected above a certain proportion, removing unnecessary computation; and 'Feature Extraction and Recognition', which extracts edge features and compares them with previously built models to identify the stop line and crosswalk. In particular, a case-based feature comparison algorithm lets the system decide whether both the stop line and the crosswalk are present or only one of them. The proposed algorithm also extends previous work by comparing and analyzing the effect of the in-vehicle camera installation, changes in recognition rate with estimated distance, and constraints such as backlight and shadow. A rough sketch of the three-stage pipeline is given below.
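
A rough sketch of the three-stage pipeline named in the abstract (ROI selection, white-ratio gating, edge-based recognition). The OpenCV calls and all threshold values are illustrative assumptions, not the paper's actual parameters.

```python
import cv2
import numpy as np

def detect_stopline_crosswalk(frame_bgr, white_ratio_thresh=0.05):
    h, w = frame_bgr.shape[:2]
    roi = frame_bgr[int(h * 0.5):, :]                       # 1) Region of Interest: lower half of the image

    hsv = cv2.cvtColor(roi, cv2.COLOR_BGR2HSV)              # 2) Color Pattern Inspection:
    white = cv2.inRange(hsv, (0, 0, 180), (180, 40, 255))   #    low-saturation, high-value pixels
    if white.mean() / 255.0 < white_ratio_thresh:
        return None                                         # too little white paint, skip expensive steps

    edges = cv2.Canny(white, 50, 150)                       # 3) Feature Extraction and Recognition
    lines = cv2.HoughLinesP(edges, 1, np.pi / 180, 60,
                            minLineLength=40, maxLineGap=10)
    if lines is None:
        return None
    horizontal = [l for l in lines[:, 0] if abs(l[3] - l[1]) < abs(l[2] - l[0]) * 0.2]
    vertical = [l for l in lines[:, 0] if abs(l[2] - l[0]) < abs(l[3] - l[1]) * 0.2]
    # Case-based decision: one long horizontal band suggests a stop line,
    # many short vertical stripes suggest a crosswalk; both may be present.
    return {"stop_line": len(horizontal) > 0, "crosswalk": len(vertical) >= 4}
```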

Development of On-line Quality Sorting System for Dried Oak Mushroom - 3rd Prototype-

  • 김철수;김기동;조기현;이정택;김진현
    • Agricultural and Biosystems Engineering / v.4 no.1 / pp.8-15 / 2003
  • In Korea, dried oak mushrooms are graded by first classifying them into more than 10 categories based on the state of cap opening, surface pattern, and color; the mushrooms of each category are then further divided into 3 or 4 groups by shape and size, giving 30 to 40 grades in total. Quality evaluation and sorting based on these external visual features are usually done manually. Since the features that determine the grade are distributed over the entire surface of the mushroom, both the front (cap) and back (stem and gill) surfaces must be inspected thoroughly, which is practically impossible for a human inspector when mushrooms are fed continuously on a conveyor. In this paper, with real-time on-line operation in mind, image processing algorithms using an artificial neural network were developed for grading mushrooms. The network operates directly on the raw gray-level image of each fed mushroom captured by the camera, without complex preprocessing such as feature enhancement and extraction, to identify the feeding state and grade the mushroom; a sketch of such a raw-pixel classifier is given below. The algorithms were implemented in a prototype on-line grading and sorting system, designed to simplify the system requirements and the overall mechanism; it consists of automatic mushroom feeding and handling devices, a computer vision system with a lighting chamber, a one-chip microprocessor based controller, and pneumatic actuators. The proposed grading scheme was tested with this prototype. The network was trained on static images using 200 samples (20 grade levels, 10 per grade) and validated with 300 samples (20 grade levels, 15 per grade); by changing the orientation of each sample, 600 test data sets were created, on which the trained network reached a grading accuracy of about 91%. Although image processing itself took less than about 0.3 second per mushroom, the actuating devices and control response brought the average time to 0.6 to 0.7 second per mushroom, giving a processing capacity of 5,000 to 6,000 mushrooms per hour.

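A minimal sketch of a neural-network grader that works directly on raw gray-level pixels, as described above. The 64x64 input size, the network shape, and the use of scikit-learn's MLPClassifier are assumptions; the paper's own network and training procedure are not reproduced here.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

N_GRADES = 20  # grade levels used in the paper's experiment

def train_grader(images, grades):
    """images: (N, 64, 64) uint8 gray images; grades: (N,) integer grade labels."""
    X = images.reshape(len(images), -1) / 255.0        # raw pixels, no feature extraction
    clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500)
    clf.fit(X, grades)
    return clf

def grade(clf, image):
    """Grade a single mushroom image with the trained network."""
    x = image.reshape(1, -1) / 255.0
    return int(clf.predict(x)[0])
```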

The Aesthetic Transformation of Shadow Images and the Extended Imagination (그림자 이미지의 미학적 변용과 확장된 상상력 :디지털 실루엣 애니메이션과 최근 미디어 아트의 흐름을 중심으로)

  • Kim, Young-Ok
    • Cartoon and Animation Studies / s.49 / pp.651-676 / 2017
  • For thousands of years, shadow images have been a representative medium and means of expression for the imagination that lies between consciousness and unconsciousness. Wherever light exists, people can play with their own shadows without special skills and create a fantasy at once. Shadow images have long served as subjects and materials of literature, art, philosophy, and popular culture. In the field of art in particular, artists have experimented with visual stimulation through the uniqueness of simple silhouette images, while in animation the silhouette came to be regarded as a non-mainstream form that is difficult to produce. Recently, however, shadow images have been used more actively in digital art and media art; in this technological environment, various formative imaginations are being expressed with shadow images in a new dimension. This study introduces and analyzes these trends, namely the aesthetic transformations and extended methods, focusing on digital silhouette animation and recent media art works using shadow images. Screen-based silhouette animation that combines digital technology with new approaches freed from conventional methods has removed most of the elements once considered limitations, and those elements have become a matter of choice for directors. In particular, in display environments using various light sources, projection, and camera technology, shadow images are expressed in multi-layered virtual spaces, making a newly extended imagination possible. Through computer vision, it has become possible to find new gazes and spatial images and to use them more flexibly. These changes open new possibilities for using shadow images in different ways.

HMM-based Intent Recognition System using 3D Image Reconstruction Data (3차원 영상복원 데이터를 이용한 HMM 기반 의도인식 시스템)

  • Ko, Kwang-Enu;Park, Seung-Min;Kim, Jun-Yeup;Sim, Kwee-Bo
    • Journal of the Korean Institute of Intelligent Systems / v.22 no.2 / pp.135-140 / 2012
  • The mirror neuron system in the cerebrum handles visual information-based imitative learning: by observing the activation of the mirror neuron system over a specific range, the intention behind an action can be inferred even when part of the action is hidden. The goal of this paper is to apply such imitative learning to a 3D vision-based intelligent system. In our previous research we reconstructed 3D images acquired with a stereo camera using optical flow and an unscented Kalman filter; here the 3D input is a sequential, continuous image stream that includes partially hidden ranges. We use a Hidden Markov Model to recognize the intention of an action from the reconstruction of the hidden range, since its dynamic inference over sequential input data is well suited to tasks such as hand gesture recognition with occlusions. Building on the object outline and feature extraction simulated in the previous research, we generate temporally continuous feature vectors, apply them to the Hidden Markov Model, and classify hand gestures according to intention patterns. The classification result is obtained as a posterior probability, and its accuracy demonstrates the effectiveness of the proposed approach; a minimal sketch of posterior-based classification with per-intention HMMs is given below.
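
A minimal sketch of intention classification by posterior probability over per-intention HMMs, as described above. The feature vectors are assumed to come from the paper's 3D reconstruction step, and hmmlearn is a stand-in for the authors' tooling.

```python
import numpy as np
from hmmlearn import hmm

def train_intention_models(sequences_per_intention, n_states=4):
    """sequences_per_intention: dict intention -> list of (T_i, D) feature sequences."""
    models = {}
    for intention, seqs in sequences_per_intention.items():
        X = np.vstack(seqs)
        lengths = [s.shape[0] for s in seqs]
        m = hmm.GaussianHMM(n_components=n_states, covariance_type="diag", n_iter=50)
        m.fit(X, lengths)
        models[intention] = m
    return models

def classify(models, sequence, priors=None):
    """Return posterior P(intention | sequence), assuming equal priors by default."""
    intentions = list(models)
    loglik = np.array([models[i].score(sequence) for i in intentions])
    if priors is not None:
        loglik += np.log([priors[i] for i in intentions])
    post = np.exp(loglik - loglik.max())     # normalize in a numerically stable way
    post /= post.sum()
    return dict(zip(intentions, post))
```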

Mobile Robot Localization and Mapping using Scale-Invariant Features (스케일 불변 특징을 이용한 이동 로봇의 위치 추정 및 매핑)

  • Lee, Jong-Shill;Shen, Dong-Fan;Kwon, Oh-Sang;Lee, Eung-Hyuk;Hong, Seung-Hong
    • Journal of IKEEE / v.9 no.1 s.16 / pp.7-18 / 2005
  • A key capability of an autonomous mobile robot is to localize itself accurately while simultaneously building a map of the environment. In this paper, we propose a vision-based mobile robot localization and mapping algorithm using scale-invariant features. A camera with a fisheye lens facing the ceiling is attached to the robot to acquire high-level features with scale invariance, which are used in the map building and localization processes. As preprocessing, the fisheye input images are calibrated to remove radial distortion, and labeling and convex hull techniques are used to segment the ceiling region from the wall region. In the initial map building stage, features are computed for the segmented regions and stored in a map database; features are then continuously computed from subsequent input images and matched against the existing map until map building is finished, with unmatched features added to the map. Localization is performed simultaneously with feature matching during map building: whenever features are matched with the existing map, the pose is estimated and the map database is updated at the same time. The proposed method can build a map of a $50m^2$ area in 2 minutes, with a positioning accuracy of ${\pm}13cm$ and an average heading error of ${\pm}3$ degrees. A brief feature-matching sketch follows below.

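A brief sketch of scale-invariant feature extraction and matching against a map database, in the spirit of the paper above. The use of cv2.SIFT_create and a Lowe ratio threshold of 0.75 are assumptions about tooling, not the authors' exact implementation.

```python
import cv2

sift = cv2.SIFT_create()
matcher = cv2.BFMatcher(cv2.NORM_L2)

def extract(gray_image):
    """Keypoints and descriptors for one already-undistorted ceiling image."""
    return sift.detectAndCompute(gray_image, None)

def match_to_map(descriptors, map_descriptors, ratio=0.75):
    """Lowe ratio test; unmatched features would be added to the map database."""
    pairs = matcher.knnMatch(descriptors, map_descriptors, k=2)
    good = []
    for pair in pairs:
        if len(pair) == 2 and pair[0].distance < ratio * pair[1].distance:
            good.append(pair[0])
    return good
```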

Visual-Attention Using Corner Feature Based SLAM in Indoor Environment (실내 환경에서 모서리 특징을 이용한 시각 집중 기반의 SLAM)

  • Shin, Yong-Min;Yi, Chu-Ho;Suh, Il-Hong;Choi, Byung-Uk
    • Journal of the Institute of Electronics Engineers of Korea SC / v.49 no.4 / pp.90-101 / 2012
  • Landmark selection is crucial to successful SLAM (Simultaneous Localization and Mapping) with a monocular camera. In an unknown environment in particular, automatic landmark selection is needed because no prior information about landmarks is available. In this paper, a visual attention system modeled on the human vision system is used to select landmarks automatically. In previous visual attention systems, the edge feature is one of the most important elements of attention; however, when the edge feature is used in cluttered indoor areas, the response in the cluttered regions disappears while the response between flat surfaces grows, and the computational cost increases because responses for four orientations are used. This paper proposes using a corner feature instead to solve these problems. A corner feature also increases the accuracy of data association by concentrating attention on regions that are more complex and informative in indoor environments. Experiments show that the corner-based visual attention system is more effective for SLAM than the previous method; a short corner-detection sketch is given below.
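
A short sketch of corner-based landmark candidate selection, as motivated above. Harris corners via cv2.goodFeaturesToTrack stand in for the paper's saliency computation, and the parameter values are illustrative assumptions.

```python
import cv2
import numpy as np

def corner_landmark_candidates(gray_image, max_corners=50):
    """Return (x, y) positions of salient corner points as landmark candidates."""
    corners = cv2.goodFeaturesToTrack(
        gray_image, maxCorners=max_corners, qualityLevel=0.01,
        minDistance=10, useHarrisDetector=True, k=0.04)
    if corners is None:
        return np.empty((0, 2))
    return corners.reshape(-1, 2)
```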

Geospatial Data Display Technique for Non-Glasses Stereoscopic Monitor (무안경식 입체 모니터를 이용한 지형공간 데이터의 디스플레이 기법)

  • Lee, Seun-Geun;Lee, Dong-Cheon
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.26 no.6 / pp.599-609 / 2008
  • The development of computer and electronic technology has led to innovative progress in spatial informatics and to successful commercialization. Geospatial information technology plays an important role in decision making across many applications, but the usual display media are two-dimensional planes that limit visual perception. Understanding the human visual mechanism of stereo perception makes it possible to implement three-dimensional stereo image display. This paper proposes on-the-fly stereo image generation methods involving various exterior orientation and camera parameters, including exposure station, viewing direction, image size, overlap, and focal length. The collinearity equations and the parameters related to the stereo viewing conditions were solved to generate realistic stereo imagery (the standard form of the collinearity equations is recalled below), and a stereo flying-simulation scene was generated with different viewing locations and directions. Stereo viewing is based on the parallax between two viewing locations; this study implemented anaglyph, polarization, and lenticular stereo display methods. Existing display technology cannot fully convey the three-dimensional, dynamic nature of the real world because 3D spatial information is projected onto a 2D plane; the stereo display methods developed in this study therefore improve geospatial information visualization and GIS applications through realistic stereo views.
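
For reference, the standard photogrammetric collinearity equations relate an image point $(x, y)$ to a ground point $(X, Y, Z)$ through the exposure station $(X_S, Y_S, Z_S)$, the focal length $f$, and the elements $m_{ij}$ of the rotation matrix; this is the textbook form, recalled here as background rather than taken from the paper itself.

```latex
x = -f\,\frac{m_{11}(X - X_S) + m_{12}(Y - Y_S) + m_{13}(Z - Z_S)}
             {m_{31}(X - X_S) + m_{32}(Y - Y_S) + m_{33}(Z - Z_S)}, \qquad
y = -f\,\frac{m_{21}(X - X_S) + m_{22}(Y - Y_S) + m_{23}(Z - Z_S)}
             {m_{31}(X - X_S) + m_{32}(Y - Y_S) + m_{33}(Z - Z_S)}
```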

Gesture Spotting by Web-Camera in Arbitrary Two Positions and Fuzzy Garbage Model (임의 두 지점의 웹 카메라와 퍼지 가비지 모델을 이용한 사용자의 의미 있는 동작 검출)

  • Yang, Seung-Eun
    • KIPS Transactions on Software and Data Engineering / v.1 no.2 / pp.127-136 / 2012
  • Much research on vision-based hand gesture recognition has been conducted to let users operate various electronic devices more easily. To recognize hand gestures accurately, the 3D position must be calculated and meaningful gestures must be distinguished from similar but meaningless ones. This paper describes a simple and cost-effective method for 3D position calculation and gesture spotting (the task of recognizing a meaningful gesture among similar meaningless gestures). The 3D position is obtained by computing the relative position of two cameras through a pan/tilt module and a marker, regardless of where the cameras are placed. A fuzzy garbage model is proposed to provide a variable reference value for deciding whether a user gesture is a command gesture: a fuzzy command gesture model and the fuzzy garbage model each return a score expressing the degree to which an observation belongs to a command gesture or a garbage gesture, respectively; a minimal scoring sketch follows below. A two-stage user adaptation is also proposed, combining off-line (batch) adaptation for inter-personal differences with on-line (incremental) adaptation for intra-personal differences to enhance performance. Experiments were conducted with 5 different users: the command recognition rate exceeds 95% when only one command-like meaningless gesture exists, and exceeds 85% when the command is mixed with many other similar gestures.
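
A minimal sketch of gesture spotting with a command model and a garbage model, in the spirit of the abstract above. The Gaussian membership functions and the decision rule are illustrative assumptions, not the paper's actual fuzzy models.

```python
import numpy as np

def gaussian_membership(x, mean, std):
    """Fuzzy membership of feature vector x in a model with the given mean/std."""
    return float(np.exp(-0.5 * np.sum(((x - mean) / std) ** 2)))

def spot_gesture(x, command_models, garbage_model):
    """Accept the best command gesture only if it scores above the garbage reference."""
    garbage_score = gaussian_membership(x, *garbage_model)
    best_cmd, best_score = None, -1.0
    for name, (mean, std) in command_models.items():
        score = gaussian_membership(x, mean, std)
        if score > best_score:
            best_cmd, best_score = name, score
    if best_score > garbage_score:      # the garbage model supplies the variable threshold
        return best_cmd
    return None                         # treated as a meaningless gesture
```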

Counting and Localizing Occupants using IR-UWB Radar and Machine Learning

  • Ji, Geonwoo;Lee, Changwon;Yun, Jaeseok
    • Journal of the Korea Society of Computer and Information / v.27 no.5 / pp.1-9 / 2022
  • Localization systems can be used in many circumstances, from measuring population movement and supporting rescue work to security applications such as intrusion detection. Vision sensors such as cameras, which are often used for localization, are sensitive to light and temperature and can invade privacy. In this paper, we instead use ultra-wideband (IR-UWB) radar, which is free of these limitations, together with machine learning techniques to estimate the number and locations of occupants in an indoor space behind a wall. We compared four algorithms, including extremely randomized trees, on four tasks: detecting the number of occupants in a classroom; splitting the classroom into 28 locations and identifying which one an occupant is in; subdividing one of those 28 locations into 16 fine-grained positions and identifying the occupant's position; and identifying the positions of two occupants located in different places. All four algorithms performed well, confirming that the number and location of occupants can be detected with high accuracy by machine learning; a minimal classifier sketch follows below. We also considered service expansion through the oneM2M standard platform and expect more services and products to be developed if this technology is applied in various fields.
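
A minimal sketch of occupant counting from IR-UWB radar frames with an extremely randomized trees classifier, one of the four algorithms compared above. The feature layout and preprocessing are assumptions, and scikit-learn is a stand-in for the authors' tooling.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.model_selection import train_test_split

def train_occupancy_counter(radar_frames, occupant_counts):
    """radar_frames: (N, n_range_bins) amplitude profiles; occupant_counts: (N,) labels."""
    X_train, X_test, y_train, y_test = train_test_split(
        radar_frames, occupant_counts, test_size=0.2, random_state=0)
    clf = ExtraTreesClassifier(n_estimators=200, random_state=0)
    clf.fit(X_train, y_train)
    print("held-out accuracy:", clf.score(X_test, y_test))
    return clf
```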