• Title/Summary/Keyword: vision algorithm

Training of a Siamese Network to Build a Tracker without Using Tracking Labels (샴 네트워크를 사용하여 추적 레이블을 사용하지 않는 다중 객체 검출 및 추적기 학습에 관한 연구)

  • Kang, Jungyu;Song, Yoo-Seung;Min, Kyoung-Wook;Choi, Jeong Dan
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.21 no.5 / pp.274-286 / 2022
  • Multi-object tracking has long been studied in computer vision and plays a critical role in applications such as autonomous driving and driving assistance. Multi-object tracking techniques generally consist of a detector that detects objects and a tracker that tracks the detected objects. Various publicly available datasets allow us to train a detector model without much effort. However, there are relatively few publicly available datasets for training a tracker model, and building one's own tracker dataset takes much longer than building a detector dataset. Hence, the detector is often developed separately from the tracker module. However, a separate tracker must be readjusted whenever the preceding detector model is changed. This study proposes a system that can train a model that performs detection and tracking simultaneously using only detector training datasets. In particular, a Siamese network with augmentation is used to compose the detector and tracker. Experiments on public datasets verify that the proposed algorithm can yield a real-time multi-object tracker comparable to state-of-the-art tracker models.

A Study on the Artificial Intelligence-Based Soybean Growth Analysis Method (인공지능 기반 콩 생장분석 방법 연구)

  • Moon-Seok Jeon;Yeongtae Kim;Yuseok Jeong;Hyojun Bae;Chaewon Lee;Song Lim Kim;Inchan Choi
    • Journal of Korea Society of Industrial Information Systems / v.28 no.5 / pp.1-14 / 2023
  • Soybeans are one of the world's top five staple crops and a major source of plant-based protein. Because they are susceptible to climate change, which can significantly impact grain production, the National Agricultural Science Institute is conducting research on crop phenotypes through growth analysis of various soybean varieties. While the capture of growth-progression photos of soybeans is automated, the verification, recording, and analysis of growth stages are currently done manually. In this paper, we designed and trained a YOLOv5s model to detect soybean leaf objects in image data of soybean plants and a Convolutional Neural Network (CNN) model to judge the unfolding status of the detected soybean leaves. We combined these two models and implemented an algorithm that distinguishes layers based on the coordinates of the detected soybean leaves. As a result, we developed a program that takes time-series data of soybeans as input and performs growth analysis. The program can accurately determine the growth stages of soybeans up to the second or third compound leaves.

A Study on the Application of Drone to Prevent the Spread of Green Tides in Lake Environment (호수 환경의 녹조 확산 방지를 위한 드론 적용 방안에 관한 연구)

  • Jin-Taek Lim;Woo-Ram Lee;Sang-Beom Lee
    • Journal of the Institute of Convergence Signal Processing / v.24 no.1 / pp.27-33 / 2023
  • Recently, water shortages have occurred due to climate change, and the need to manage agricultural water has increased because of algal blooms in reservoirs. Existing algae prevention relies on deploying many people on site and misses the optimal spraying time because crews travel by boat. To solve this problem, it is necessary to block contamination in advance and to move in time to spray complex microorganisms uniformly. Control drones are already used for pesticide spraying and can be applied to algae prevention work. In this paper, basic research toward establishing a marine control system was conducted for application to the reservoir environment, and as one of the results, the characteristics of a drone nozzle, a core technology for control drones, were calculated. In particular, existing agricultural control drones were found to produce a non-uniform concentration within the suggested spraying interval; to compensate for this, nozzle positioning and nozzle spraying uniformity were calculated. Based on the experimental results, we develop a core algorithm for establishing an algal bloom monitoring system in the reservoir environment and propose a precision control technology that can be used for marine control work in the future.

Improvement of Face Recognition Algorithm for Residential Area Surveillance System Based on Graph Convolution Network (그래프 컨벌루션 네트워크 기반 주거지역 감시시스템의 얼굴인식 알고리즘 개선)

  • Tan Heyi;Byung-Won Min
    • Journal of Internet of Things and Convergence / v.10 no.2 / pp.1-15 / 2024
  • The construction of smart communities is a new method and an important measure for ensuring the security of residential areas. To solve the problem of low face recognition accuracy caused by facial features being distorted by monitoring camera angles and other external factors, this paper proposes the following optimization strategies in designing a face recognition network. First, a global graph convolution module is designed to encode facial features as graph nodes, and a multi-scale feature enhancement residual module is designed to extract facial keypoint features in conjunction with the global graph convolution module. Second, after the facial keypoints are obtained, they are constructed as a directed graph structure, and graph attention mechanisms are used to enhance the representation power of the graph features. Finally, tensor computations are performed on the graph features of two faces, and the aggregated features are extracted and discriminated by a fully connected layer to determine whether the two individuals' identities are the same. In experimental tests, the network designed in this paper achieves an AUC index of 85.65% for facial keypoint localization on the 300W public dataset and 88.92% on a self-built dataset. In terms of face recognition accuracy, the proposed network achieves 83.41% on the IBUG public dataset and 96.74% on a self-built dataset. Experimental results demonstrate that the network designed in this paper exhibits high detection and recognition accuracy for faces in surveillance videos.

Automated Data Extraction from Unstructured Geotechnical Report based on AI and Text-mining Techniques (AI 및 텍스트 마이닝 기법을 활용한 지반조사보고서 데이터 추출 자동화)

  • Park, Jimin;Seo, Wanhyuk;Seo, Dong-Hee;Yun, Tae-Sup
    • Journal of the Korean Geotechnical Society / v.40 no.4 / pp.69-79 / 2024
  • Field geotechnical data are obtained from various field and laboratory tests and are documented in geotechnical investigation reports. For efficient design and construction, digitizing these geotechnical parameters is essential. However, current practices involve manual data entry, which is time-consuming, labor-intensive, and prone to errors. Thus, this study proposes an automatic data extraction method from geotechnical investigation reports using image-based deep learning models and text-mining techniques. A deep-learning-based page classification model and a text-searching algorithm were employed to classify geotechnical investigation report pages with 100% accuracy. Computer vision algorithms were utilized to identify valid data regions within report pages, and text analysis was used to match and extract the corresponding geotechnical data. The proposed model was validated using a dataset of 205 geotechnical investigation reports, achieving an average data extraction accuracy of 93.0%. Finally, a user-interface-based program was developed to enhance the practical application of the extraction model. It allowed users to upload PDF files of geotechnical investigation reports, automatically analyze these reports, and extract and edit data. This approach is expected to improve the efficiency and accuracy of digitizing geotechnical investigation reports and building geotechnical databases.

Improving Test Accuracy on the MNIST Dataset using a Simple CNN with Batch Normalization

  • Seungbin Lee;Jungsoo Rhee
    • Journal of the Korea Society of Computer and Information / v.29 no.9 / pp.1-7 / 2024
  • In this paper, we propose a Convolutional Neural Network (CNN) equipped with Batch Normalization (BN) for handwritten digit recognition, trained on the MNIST dataset. Aiming to surpass the performance of LeNet-5 by LeCun et al., a 6-layer neural network was designed. The proposed model processes 28×28 pixel images through convolution, max pooling, and fully connected layers, with batch normalization to improve learning stability and performance. The experiment used 60,000 training images and 10,000 test images and applied the Momentum optimization algorithm. The model configuration used 30 filters with a 5×5 filter size, padding 0, stride 1, and ReLU as the activation function. Training used a mini-batch size of 100, 20 epochs in total, and a learning rate of 0.1. As a result, the proposed model achieved a test accuracy of 99.22%, surpassing LeNet-5's 99.05%, and recorded an F1-score of 0.9919. Moreover, the 6-layer model proposed in this paper emphasizes model efficiency, with a simpler structure than LeCun et al.'s LeNet-5 (7-layer model) and the model proposed by Ji, Chun and Kim (10-layer model). The results of this study show potential for real industrial applications such as AI vision inspection systems. The model is expected to be effectively applied in smart factories, particularly in determining whether parts are defective.
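
As a rough illustration of the layer arithmetic implied by the configuration above, the following Python sketch computes the spatial size of the feature maps produced by a 5×5 convolution with padding 0 and stride 1 on a 28×28 input. The helper name `conv_output_size` is hypothetical, and the 2×2 max pool with stride 2 is an assumption for illustration only; the abstract does not state the pooling size.

```python
def conv_output_size(input_size: int, kernel_size: int,
                     padding: int = 0, stride: int = 1) -> int:
    """Output length along one spatial dimension of a conv or pooling layer."""
    return (input_size + 2 * padding - kernel_size) // stride + 1


# Values taken from the abstract: 28x28 input, 30 filters of size 5x5,
# padding 0, stride 1.
side = conv_output_size(28, kernel_size=5, padding=0, stride=1)
feature_map_shape = (30, side, side)

# Assumption (not stated in the abstract): 2x2 max pooling with stride 2.
pooled_side = conv_output_size(side, kernel_size=2, padding=0, stride=2)

print(feature_map_shape)  # (30, 24, 24)
print(pooled_side)        # 12
```

Under these assumptions, the fully connected layer would receive 30 × 12 × 12 = 4,320 inputs per image.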

The Audience Behavior-based Emotion Prediction Model for Personalized Service (고객 맞춤형 서비스를 위한 관객 행동 기반 감정예측모형)

  • Ryoo, Eun Chung;Ahn, Hyunchul;Kim, Jae Kyeong
    • Journal of Intelligence and Information Systems / v.19 no.2 / pp.73-85 / 2013
  • In today's information society, the importance of knowledge services that use information to create value is growing day by day. In addition, with the development of IT, it has become easy to collect and use information, and many companies in a variety of industries actively use customer information for marketing. In the 21st century, companies have been actively using culture and the arts to manage their corporate image and for marketing closely linked to their commercial interests. However, it is difficult for companies to attract or maintain consumers' interest through technology alone. For that reason, many firms tend to carry out cultural activities as a tool for differentiation. Many firms have used customer experience in new marketing strategies in order to respond effectively to a competitive market. Accordingly, the need is rapidly emerging for personalized services that provide a new experience based on personal profile information containing the characteristics of the individual. Personalized service using a customer's individual profile information, such as language, symbols, behavior, and emotions, is therefore very important today. Through this, we will be able to judge the interaction between people and content and to maximize customers' experience and satisfaction. Various related works provide customer-centered services, and emotion recognition research in particular has been emerging recently. Existing studies have performed emotion recognition mostly using bio-signals, and most are voice and face studies, where emotional changes are large. However, there are several difficulties in predicting people's emotions caused by the limitations of equipment and service environments. So, in this paper, we develop an emotion prediction model based on a vision-based interface to overcome existing limitations. Emotion recognition research based on people's gestures and postures has been carried out by several researchers.
This paper developed a model that recognizes people's emotional states through body gesture and posture using the difference image method, and we searched for the optimal validated model for predicting four kinds of emotions. The proposed model aims to automatically determine and predict four human emotions (Sadness, Surprise, Joy, and Disgust). To build the model, an event booth was installed in the KOCCA lobby, and we showed participants suitable stimulating movies to collect their body gestures and postures as their emotions changed. We then extracted body movements using the difference image method and refined the collected data to build the proposed model through a neural network. The proposed model for emotion prediction used three types of time-frame sets (20 frames, 30 frames, and 40 frames), and we adopted the model with the best performance among them. Before building the three kinds of models, the entire set of 97 data samples was divided into learning, test, and validation sets. The proposed model was constructed using an artificial neural network. We used the back-propagation algorithm as the learning method and set the learning rate to 10% and the momentum rate to 10%. The sigmoid function was used as the transform function, and we designed a three-layer perceptron neural network with one hidden layer and four output nodes. Based on the test data set, learning was stopped when it reached 50,000 after reaching the minimum error, in order to explore the point of learning. Finally, we computed each model's accuracy and found the best model for predicting each emotion. The results showed prediction accuracy of 100% for sadness and 96% for joy in the 20-frame model, and 88% for surprise and 98% for disgust in the 30-frame model. The findings of our research are expected to be useful in providing an effective algorithm for personalized services in various industries such as advertisement, exhibition, and performance.
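
The difference image method mentioned above can be sketched in a few lines of Python. This is a minimal, generic frame-differencing example and not the authors' implementation: the names `difference_image` and `motion_amount`, the threshold of 30, and the toy frames are all assumptions for illustration.

```python
from typing import List

Frame = List[List[int]]  # grayscale frame: rows of 0-255 pixel intensities


def difference_image(prev: Frame, curr: Frame, threshold: int = 30) -> Frame:
    """Binary motion mask: 1 where the absolute intensity change exceeds threshold."""
    return [
        [1 if abs(c - p) > threshold else 0 for p, c in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev, curr)
    ]


def motion_amount(mask: Frame) -> float:
    """Fraction of pixels flagged as moving; one possible per-frame feature."""
    total = sum(len(row) for row in mask)
    return sum(sum(row) for row in mask) / total


# Hypothetical 2x3 frames: two pixels change strongly between frames.
prev_frame = [[10, 10, 10], [10, 10, 10]]
curr_frame = [[10, 200, 10], [10, 10, 90]]

mask = difference_image(prev_frame, curr_frame)
print(mask)  # [[0, 1, 0], [0, 0, 1]]
```

A sequence of such per-frame motion features over 20, 30, or 40 frames is one plausible form of input for a small neural network like the one described, although the abstract does not specify the exact features used.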

A Destructive Method in the Connection of the Algorithm and Design in the Digital media - Centered on the Rapid Prototyping Systems of Product Design - (디지털미디어 환경(環境)에서 디자인 특성(特性)에 관한 연구(硏究) - 실내제품(室內製品) 디자인을 중심으로 -)

  • Kim Seok-Hwa
    • Journal of Science of Art and Design / v.5 / pp.87-129 / 2003
  • The purpose of this thesis is to propose a new concept of design for the 21st century, on the basis of a study of the general signification of the structures and signs of industrial product design, by examining the difference between modern and post-modern design, which is expected to lead users to different design practice and interpretation. The starting point of this study is the difference in the styles and patterns of 'Gestalt', the factor of determination in industrial product design, between the post-modern design of the late 20th century and modern design. That is to say, unlike the functional and rational styles of modern product design, the late 20th century is based upon a pluralism characterized by complexity, synthesis, and decorativeness. So far, most previous studies on design seem to have excluded visual aspects and usability and focused only on effective communication of design phenomena. These partial studies on design, blinded by phenomenal aspects, have failed to discover a principle of the fundamental system. However, design varies according to the times, and the transformation of design is reflected in Design Pragnanz to constitute a new text of design. Therefore, it can be argued that Design Pragnanz serves as an essential factor under the influence of the significance of text. In this thesis, therefore, I analyze 20th century product design in the light of Gestalt theory and Design Pragnanz, which have been functioning as the principles of past design. For this study, I attempted to discover the fundamental elements in modern and post-modern designs, and to examine the formal structure of product design, the users' aesthetic preference, and its semantics from an integrative viewpoint.
Also, with reference to the history and theory of design, my emphasis is more on fundamental visual phenomena than on structural analysis or the process of visualization in product design, in order to examine the formal properties of modern and post-modern designs. First, in Chapter 1, 'Issues and Background of the Study', I investigated Gestalt theory and Design Pragnanz, on the premise of a formal distinction between modern and post-modern designs. These theories are founded upon the discussion of visual perception of Gestalt in Germany in the 1910s, in pursuit of a principle of perception centered on human visual perception. In Chapter 2, I dealt with the functionalism of modern design, as preparation for the further study of the product design of the late 20th century. First of all, in Chapter 2-1, I examined the tendency of modern design to focus on functionalism, which can be exemplified by the famous statement 'Form follows function'. Excluding all unessential elements in design, such as decoration, this tendency attained the position of the international style based on the spirit of the Bauhaus, universality and regularity, in search of geometric order, standardization, and rationalization. In Chapter 2-2, I investigated the anthropological viewpoint that modern design started representing culture in a symbolic way, including overall aspects of society such as politics, economics, and ethics, and its criticism of functionalist design that aesthetic value is lost in exchange for excessive simplicity in style. Moreover, I examined the pluralist phenomena in post-modern design such as kitsch, eclecticism, reactionism, hi-tech, and digital design, breaking away from the functionalist purism of modern design. In Chapter 3, I analyzed Gestalt Pragnanz in design in a practical way, against the background of design trends.
To begin with, I selected mass product design among 20th century products as the target of analysis, highlighting representative styles in each category of products. For this analysis, I adopted the theory of J. M. Lehnhardt, who graded in percentages the aesthetic and semantic levels of Pragnanz in design expression, and that of J. K. Grutter, who expressed it in the formula M = O : C. I also employed eight units of dichotomies, according to G. D. Birkhoff's aesthetic criteria, for the purpose of scientific classification of the degree of order and complexity in design, and I analyzed the phenomenal aspects of design form represented in each unit. For Chapter 4, I conducted a questionnaire about the semiological phenomena of Design Pragnanz with 28 units of antonymous adjectives, based upon the research in the previous chapter. Then, I analyzed the process of signification of Design Pragnanz, founded on this research. Furthermore, the interpretation of the analysis served as an explanation of preference, through systematic analysis of Gestalt and Design Pragnanz in product design of the late 20th century. In Chapter 5, I determined the position of Design Pragnanz by integrating the analyses of Gestalt and Pragnanz in modern and post-modern designs. In this process, I revealed the difference of each Design Pragnanz in formal respects, in order to suggest a vision of the future as a result, which will provide systemic and structural stimulation to current design.

Reproducibility of Regional Pulse Wave Velocity in Healthy Subjects

  • Im Jae-Joong;Lee, Nak-Bum;Rhee Moo-Yong;Na Sang-Hun;Kim, Young-Kwon;Lee, Myoung-Mook;Cockcroft John R.
    • International Journal of Vascular Biomedical Engineering / v.4 no.2 / pp.19-24 / 2006
  • Background: Pulse wave velocity (PWV), which is inversely related to the distensibility of an arterial wall, offers a simple and potentially useful approach for the evaluation of cardiovascular diseases. In spite of the clinical importance and widespread use of PWV, there exists no standard either for pulse sensors or for system requirements for accurate pulse wave measurement. The objective of this study was to assess the reproducibility of PWV values using a newly developed PWV measurement system in healthy subjects prior to a large-scale clinical study. Methods: The system used for the study was the PP-1000 (Hanbyul Meditech Co., Korea), which provides regional PWV values based on simultaneous measurements of electrocardiography (ECG), phonocardiography (PCG), and pulse waves from four different arterial sites (carotid, femoral, radial, and dorsalis pedis). Seventeen healthy male subjects with a mean age of 33 years (range 22 to 52 years) without any cardiovascular disease participated in the experiment. Two observers (observer A and B) performed two consecutive measurements on the same subject in a random order. To evaluate system reproducibility, two analyses (within-observer and between-observer) were performed and expressed in terms of mean difference $\pm$ 2SD, as in Bland and Altman plots. Results: Mean and SD of PWVs for aorta, arm, and leg were $7.07 \pm 1.48$ m/sec, $8.43 \pm 1.14$ m/sec, and $8.09 \pm 0.98$ m/sec measured by observer A and $6.76 \pm 1.00$ m/sec, $7.97 \pm 0.80$ m/sec, and $7.97 \pm 0.72$ m/sec by observer B, respectively. Between-observer differences (mean $\pm$ 2SD) for aorta, arm, and leg were $0.14 \pm 0.62$ m/sec, $0.18 \pm 0.84$ m/sec, and $0.07 \pm 0.86$ m/sec, and the correlation coefficients were high, especially 0.93 for aortic PWV.
Within-observer differences (mean $\pm$ 2SD) for aorta, arm, and leg were $0.01 \pm 0.26$ m/sec, $0.02 \pm 0.26$ m/sec, and $0.08 \pm 0.32$ m/sec for observer A and $0.01 \pm 0.24$ m/sec, $0.04 \pm 0.28$ m/sec, and $0.01 \pm 0.20$ m/sec for observer B, respectively. All the measurements showed significantly high correlation coefficients, ranging from 0.94 to 0.99. Conclusion: The PWV measurement system used for the study offers comfortable and simple operation and provides accurate analysis results with high reproducibility. Since the reproducibility of the measurement is critical for diagnosis in clinical use, it is necessary to provide an accurate algorithm for the detection of additional features such as the flow wave, reflection wave, and dicrotic notch from a pulse waveform. This study will be extended to compare PWV values from patients with various vascular risks for clinical application. Data acquired from the study could be used to determine the appropriate sample size for further studies relating to various types of arteriosclerosis-related vascular disease.
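
The reproducibility statistic reported above (mean difference ± 2SD between paired measurements, as in a Bland-Altman analysis) can be computed as in the following Python sketch. The function name `bland_altman` and the sample PWV readings are hypothetical and are not data from the study.

```python
from statistics import mean, stdev
from typing import Sequence, Tuple


def bland_altman(a: Sequence[float], b: Sequence[float]) -> Tuple[float, float, float]:
    """Return (mean difference, lower limit, upper limit) using mean +/- 2 SD.

    a and b are paired measurements of the same subjects, for example
    from two observers or from two consecutive sessions.
    """
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)
    spread = 2 * stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - spread, bias + spread


# Hypothetical aortic PWV readings (m/sec) for four subjects.
observer_a = [7.0, 6.5, 8.1, 7.4]
observer_b = [6.8, 6.6, 7.9, 7.0]

bias, lower, upper = bland_altman(observer_a, observer_b)
print(f"mean difference = {bias:.3f} m/sec, limits = [{lower:.3f}, {upper:.3f}]")
```

A narrow interval around a mean difference close to zero indicates good agreement between the two sets of measurements.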

A Study on the Digital Drawing of Archaeological Relics Using Open-Source Software (오픈소스 소프트웨어를 활용한 고고 유물의 디지털 실측 연구)

  • LEE Hosun;AHN Hyoungki
    • Korean Journal of Heritage: History & Science / v.57 no.1 / pp.82-108 / 2024
  • With the transition of archaeological recording methods from analog to digital, 3D scanning technology has been actively adopted within the field. Research on digital archaeological data gathered from 3D scanning and photogrammetry is continuously being conducted. However, due to cost and manpower issues, most buried cultural heritage organizations are hesitant to adopt such digital technology. This paper aims to present a digital recording method for relics that uses open-source software and photogrammetry, which is believed to be the most efficient of the 3D scanning methods. The digital recording process consists of three stages: acquiring a 3D model, creating a joining map with the edited 3D model, and creating a digital drawing. To enhance accessibility, this method uses only open-source software throughout the entire process. The results of this study confirm that, in the quantitative evaluation, the deviation in numerical measurements between the actual artifact and the 3D model was minimal. In addition, the results of quantitative quality analysis from the open-source software and the commercial software showed high similarity. However, data processing was far faster with the commercial software, which is believed to be a result of the high computational speed of its improved algorithm. In the qualitative evaluation, some differences in mesh and texture quality occurred. In the 3D model generated by open-source software, the following problems occurred: noise on the mesh surface, rough mesh surfaces, and difficulty in confirming the production marks of relics and the expression of patterns. However, some of the open-source software did produce quality comparable to that of commercial software in both quantitative and qualitative evaluations.
Open-source software for editing 3D models was able not only to post-process, match, and merge the 3D model, but also to perform the scale adjustment, joining-surface production, and image rendering necessary for the actual measurement of relics. The final drawing was traced in a CAD program, which is also open-source software. In archaeological research, photogrammetry is applicable to various processes, including excavation, report writing, and research on numerical data from 3D models. With the breakthrough development of computer vision, the types of open-source software have diversified and their performance has significantly improved. With such digital technology now highly accessible, 3D model data acquired in archaeology will be used as basic data for the preservation and active research of cultural heritage.