• Title/Summary/Keyword: State Classification


Driver Drowsiness Detection Model using Image and PPG data Based on Multimodal Deep Learning (이미지와 PPG 데이터를 사용한 멀티모달 딥 러닝 기반의 운전자 졸음 감지 모델)

  • Choi, Hyung-Tak;Back, Moon-Ki;Kang, Jae-Sik;Yoon, Seung-Won;Lee, Kyu-Chul
    • Database Research
    • /
    • v.34 no.3
    • /
    • pp.45-57
    • /
    • 2018
  • Drowsiness while driving is a very dangerous driver condition that can lead directly to a major accident. Traditional drowsiness detection methods exist to assess the driver's condition, but they are limited in producing a generalized recognition of driver state that reflects the individual characteristics of drivers. In recent years, deep learning based state recognition studies have been proposed to recognize the driver's condition. Deep learning has the advantage of extracting features automatically, without human feature engineering, and of deriving a more generalized recognition model. In this study, we propose a state recognition model that is more accurate than existing deep learning methods by learning from images and PPG simultaneously to assess the driver's condition. This paper examines the effect of the driver's image and PPG data on drowsiness detection and tests whether using them together improves the performance of the learning model. We confirmed an accuracy improvement of around 3% when using images and PPG together compared with using images alone. In addition, the multimodal deep learning based model that classifies the driver's condition into three categories achieved a classification accuracy of 96%.
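The abstract does not describe the fusion architecture in detail; the sketch below is a minimal late-fusion illustration in NumPy, where image and PPG embeddings are concatenated and passed to a linear three-class classifier. All dimensions, weights, and the fusion strategy are hypothetical choices, not the paper's model.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    # Numerically stable softmax over the three driver-state logits.
    e = np.exp(z - z.max())
    return e / e.sum()

def fuse_and_classify(img_feat, ppg_feat, W, b):
    """Late fusion: concatenate the image and PPG feature vectors and
    apply a linear classifier over the three driver states."""
    x = np.concatenate([img_feat, ppg_feat])
    return softmax(W @ x + b)

# Hypothetical dimensions: 128-d image embedding, 16-d PPG embedding, 3 states.
img_feat = rng.normal(size=128)
ppg_feat = rng.normal(size=16)
W = rng.normal(size=(3, 128 + 16)) * 0.01
b = np.zeros(3)

probs = fuse_and_classify(img_feat, ppg_feat, W, b)
```

In a trained model the two branches would be learned CNN and signal encoders; the point here is only that fusing both modalities gives the classifier a richer input than either alone.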

Generating Audio Adversarial Examples Using a Query-Efficient Decision-Based Attack (질의 효율적인 의사 결정 공격을 통한 오디오 적대적 예제 생성 연구)

  • Seo, Seong-gwan;Mun, Hyunjun;Son, Baehoon;Yun, Joobeom
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.32 no.1
    • /
    • pp.89-98
    • /
    • 2022
  • As deep learning technology has been applied to various fields, adversarial attack techniques, a security problem for deep learning models, have been actively studied. Adversarial attacks have mainly been studied in the image domain, where fully decision-based attack techniques that need only the model's classification results have recently been developed. In the audio domain, however, research has progressed relatively slowly. In this paper, we apply several decision-based attack techniques to the audio domain and improve on the state-of-the-art attack technique. State-of-the-art decision-based attacks have the disadvantage of requiring many queries for gradient approximation. We improve query efficiency by proposing a method that reduces the vector search space required for gradient approximation. Experimental results showed that the attack success rate increased by 50% and the difference between the original audio and the adversarial examples was reduced by 75%, demonstrating that our method can generate adversarial examples with smaller noise.
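The paper's exact subspace-reduction method is not given in the abstract; the sketch below illustrates the general idea behind decision-based attacks it builds on: a Monte-Carlo gradient-direction estimate that uses only hard labels, with random perturbations restricted to a smaller subspace so fewer queries are needed. The `decision` oracle here is a stand-in, not a real model.

```python
import numpy as np

rng = np.random.default_rng(0)

def decision(x):
    # Stand-in hard-label oracle: returns True when the (hypothetical)
    # model still assigns the adversarial target label to input x.
    return x.sum() > 0.0

def estimate_gradient(x, n_queries, dim_subspace, delta=0.1):
    """Monte-Carlo gradient-direction estimate from hard labels only.
    Restricting perturbations to a low-dimensional subspace (here,
    simply the first dim_subspace coordinates) shrinks the search
    space and so reduces the queries needed per estimate."""
    d = x.size
    grad = np.zeros(d)
    for _ in range(n_queries):
        u = np.zeros(d)
        u[:dim_subspace] = rng.normal(size=dim_subspace)
        u /= np.linalg.norm(u)
        # +1 if the perturbed point keeps the adversarial label, else -1.
        sign = 1.0 if decision(x + delta * u) else -1.0
        grad += sign * u
    return grad / n_queries
```

For audio, a natural restricted subspace might be a band of frequency components rather than raw coordinates; that choice is an assumption here, not the paper's method.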

Shipping Container Load State and Accident Risk Detection Techniques Based Deep Learning (딥러닝 기반 컨테이너 적재 정렬 상태 및 사고 위험도 검출 기법)

  • Yeon, Jeong Hum;Seo, Yong Uk;Kim, Sang Woo;Oh, Se Yeong;Jeong, Jun Ho;Park, Jin Hyo;Kim, Sung-Hee;Youn, Joosang
    • KIPS Transactions on Computer and Communication Systems
    • /
    • v.11 no.11
    • /
    • pp.411-418
    • /
    • 2022
  • Incorrectly loaded containers can easily be knocked down by strong winds, and container collapse accidents can lead to material damage and paralysis of the port system. In this paper, we propose a deep learning-based technique for detecting the container loading state and the risk of accidents. Using Darknet-based YOLO, the container loading state is identified in real time through the corner castings on the top and bottom of the container, and the accident risk is reported to the manager. We present criteria for classifying container alignment states and select efficient learning algorithms based on inference speed, classification accuracy, detection accuracy, and FPS on real embedded devices in the same environment. The study found that YOLOv4 was weaker than YOLOv3 in inference speed and FPS but stronger in classification accuracy and detection accuracy.
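The abstract does not state the alignment criteria; one plausible rule, sketched below with hypothetical pixel thresholds, is to compare the horizontal positions of the corner castings detected on stacked containers and grade the offset.

```python
def alignment_state(top_x, bottom_x, tol_ok=5, tol_warn=15):
    """Classify container stacking alignment from the horizontal offset
    (in pixels) between detected top and bottom corner castings.
    Thresholds are illustrative, not the paper's values."""
    offset = abs(top_x - bottom_x)
    if offset <= tol_ok:
        return "aligned"
    if offset <= tol_warn:
        return "caution"
    return "danger"
```

In the full system, `top_x` and `bottom_x` would come from YOLO bounding-box centers for the corner castings of adjacent containers.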

Classification of the Rusting State of Pipe Using a Laser Displacement Sensor (레이저 변위 센서를 활용한 배관 표면 상태분류)

  • Cheon, Kang-Min;Shin, Baek-Cheon;Shin, Geon-Ho;Go, Jeong-Il;Lee, Jun-Hyeok;Hur, Jang-Wook
    • Journal of the Korean Society of Manufacturing Process Engineers
    • /
    • v.21 no.5
    • /
    • pp.46-52
    • /
    • 2022
  • Although pipes perform various functions in industrial sites and residential spaces, damage from corrosion caused by the external environment may lead to equipment failure or a major accident. For this reason, various studies on safety management are being conducted, but studies that detect corrosion or cracks on the pipe surface using a laser displacement sensor have hardly been conducted. Therefore, in this study, the degree of corrosion of the pipe surface was compared and classified into four corrosion states, and inspection equipment using a laser scanner was manufactured. The corrosion height was calculated from the surface data obtained from the measuring equipment for the four states and applied to various CNN algorithms; training with a modified VGGNet16 with a reduced number of parameters achieved 91% accuracy.
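The abstract mentions computing a corrosion height from laser displacement data before classification; a minimal sketch of that preprocessing step is shown below, with the deviation measure and the grade thresholds chosen purely for illustration.

```python
import numpy as np

def corrosion_grade(profile, thresholds=(0.1, 0.3, 0.6)):
    """Grade a scanned surface profile (laser displacement readings, mm)
    into one of four corrosion states from its peak height deviation
    relative to the median surface level. Thresholds are hypothetical."""
    height = np.max(np.abs(profile - np.median(profile)))
    for grade, t in enumerate(thresholds):
        if height <= t:
            return grade
    return len(thresholds)
```

In the study itself, the computed height maps were fed to CNNs; here the thresholding stands in only to show how a scalar height separates the four states.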

COVID-19 Diagnosis from CXR images through pre-trained Deep Visual Embeddings

  • Khalid, Shahzaib;Syed, Muhammad Shehram Shah;Saba, Erum;Pirzada, Nasrullah
    • International Journal of Computer Science & Network Security
    • /
    • v.22 no.5
    • /
    • pp.175-181
    • /
    • 2022
  • COVID-19 is an acute respiratory syndrome that affects the host's breathing and respiratory system. The first case of the novel disease was reported in 2019; it created a state of emergency around the world and was declared a global pandemic within months, producing a socioeconomic crisis globally. The emergency has made it imperative for professionals to take the necessary measures to diagnose the disease early. The conventional diagnosis for COVID-19 is Polymerase Chain Reaction (PCR) testing. However, in many rural societies these tests are not available or take a long time to provide results. Hence, we propose a COVID-19 classification system based on machine learning and transfer learning models. The proposed approach identifies individuals with COVID-19 and distinguishes them from those who are healthy with the help of Deep Visual Embeddings (DVE). Five state-of-the-art models, VGG-19, ResNet50, Inceptionv3, MobileNetv3, and EfficientNetB7, were used in this study along with five different pooling schemes to perform deep feature extraction. In addition, the features were normalized using standard scaling, and 4-fold cross-validation was used to validate performance over multiple versions of the validation data. The best results of 88.86% UAR, 88.27% specificity, 89.44% sensitivity, 88.62% accuracy, 89.06% precision, and 87.52% F1-score were obtained using ResNet-50 with average pooling and logistic regression with class weighting as the classifier.
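The winning pipeline combines average pooling of CNN feature maps with standard scaling before the classifier; those two steps can be sketched in NumPy as below. The map shape (7, 7, 2048) assumes a ResNet-50-like final feature map, and the data are random placeholders.

```python
import numpy as np

def average_pool(feature_map):
    """Global average pooling: collapse an (H, W, C) CNN feature map
    into a C-dimensional deep visual embedding."""
    return feature_map.mean(axis=(0, 1))

def standard_scale(X):
    """Column-wise standard scaling (zero mean, unit variance),
    applied to the embeddings before the logistic regression."""
    return (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-12)

rng = np.random.default_rng(0)
maps = rng.normal(size=(10, 7, 7, 2048))        # 10 CXR images, ResNet-50-like maps
X = np.stack([average_pool(m) for m in maps])   # (10, 2048) embeddings
Xs = standard_scale(X)
```

The scaled embeddings `Xs` would then be split 4-fold and fed to a class-weighted logistic regression, as the abstract describes.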

A Study on Robust Speech Emotion Feature Extraction Under the Mobile Communication Environment (이동통신 환경에서 강인한 음성 감성특징 추출에 대한 연구)

  • Cho Youn-Ho;Park Kyu-Sik
    • The Journal of the Acoustical Society of Korea
    • /
    • v.25 no.6
    • /
    • pp.269-276
    • /
    • 2006
  • In this paper, we propose an emotion recognition system that can discriminate a human emotional state as neutral or angry from speech captured by a cellular phone in real time. In general, speech transmitted over the mobile network contains environmental noise and network noise, which can cause serious degradation of system performance by distorting the emotional features of the query speech. To minimize the effect of this noise and thereby improve system performance, we adopt a simple MA (Moving Average) filter, which has a relatively simple structure and low computational complexity, to alleviate the distortion in the emotional feature vector. An SFS (Sequential Forward Selection) feature optimization method is then applied to further improve and stabilize system performance. Two pattern recognition methods, k-NN and SVM, are compared for emotional state classification. The experimental results indicate that the proposed method provides stable and successful emotional classification performance of 86.5%, making it useful in application areas such as customer call centers.
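The two preprocessing ideas named in the abstract, MA filtering and SFS, can be sketched generically as below. The window size and the toy scoring function are illustrative; the paper's actual features and classifier-based selection criterion are not specified in the abstract.

```python
import numpy as np

def ma_filter(x, window=5):
    """Moving-average smoothing of a feature trajectory, used to
    suppress mobile-network noise in the emotional feature vector."""
    kernel = np.ones(window) / window
    return np.convolve(x, kernel, mode="same")

def sfs(n_features, score_fn, k):
    """Greedy Sequential Forward Selection: repeatedly add the feature
    index that most improves score_fn over the current subset."""
    selected = []
    remaining = list(range(n_features))
    while len(selected) < k:
        best = max(remaining, key=lambda i: score_fn(selected + [i]))
        selected.append(best)
        remaining.remove(best)
    return selected
```

In the paper's setting, `score_fn` would be classification accuracy (k-NN or SVM) on a validation set for the candidate feature subset.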

Classification of Aβ State From Brain Amyloid PET Images Using Machine Learning Algorithm

  • Chanda Simfukwe;Reeree Lee;Young Chul Youn;Alzheimer’s Disease and Related Dementias in Zambia (ADDIZ) Group
    • Dementia and Neurocognitive Disorders
    • /
    • v.22 no.2
    • /
    • pp.61-68
    • /
    • 2023
  • Background and Purpose: Analyzing brain amyloid positron emission tomography (PET) images to assess β-amyloid (Aβ) deposition in Alzheimer's patients requires much time and effort from physicians, and interpretations may vary between readers. For these reasons, a machine learning model using a convolutional neural network (CNN) was developed as an objective decision tool to classify Aβ-positive and Aβ-negative status from brain amyloid PET images. Methods: A total of 7,344 PET images of 144 subjects were used in this study. 18F-florbetaben PET was administered to all participants, and the criterion for differentiating Aβ-positive and Aβ-negative states was the brain amyloid plaque load (BAPL) score, which depends on the physicians' visual assessment of the PET images. We applied a CNN algorithm trained in batches of 51 PET images per subject directory from two classes, Aβ-positive and Aβ-negative, based on the BAPL scores. Results: The model's average binary classification performance was evaluated after 40 epochs over three trials on the test datasets. The model accuracy for classifying Aβ positivity and Aβ negativity was 95.00±0.02 on the test dataset. The sensitivity and specificity were 96.00±0.02 and 94.00±0.02, respectively, with an area under the curve of 87.00±0.03. Conclusions: Based on this study, the designed CNN model has the potential to be used clinically to screen amyloid PET images.
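Since each subject contributes many PET slices (51 images per subject directory), a subject-level decision must aggregate slice-level predictions. The abstract does not state the aggregation rule; the sketch below assumes simple averaging and thresholding of per-slice Aβ-positive probabilities, which is one common choice.

```python
import numpy as np

def subject_label(slice_probs, threshold=0.5):
    """Aggregate per-slice Abeta-positive probabilities (e.g. 51 PET
    slices per subject) into one subject-level decision by averaging
    and thresholding. Both the rule and threshold are assumptions."""
    return int(np.mean(slice_probs) >= threshold)
```

Alternatives such as majority voting over slice labels would behave similarly for confident models.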

Integration of WFST Language Model in Pre-trained Korean E2E ASR Model

  • Junseok Oh;Eunsoo Cho;Ji-Hwan Kim
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.18 no.6
    • /
    • pp.1692-1705
    • /
    • 2024
  • In this paper, we present a method that integrates a Grammar Transducer as an external language model to enhance the accuracy of the pre-trained Korean End-to-end (E2E) Automatic Speech Recognition (ASR) model. The E2E ASR model utilizes the Connectionist Temporal Classification (CTC) loss function to derive hypothesis sentences from input audio. However, this method reveals a limitation inherent in the CTC approach, as it fails to capture language information from transcript data directly. To overcome this limitation, we propose a fusion approach that combines a clause-level n-gram language model, transformed into a Weighted Finite-State Transducer (WFST), with the E2E ASR model. This approach enhances the model's accuracy and allows for domain adaptation using just additional text data, avoiding the need for further intensive training of the extensive pre-trained ASR model. This is particularly advantageous for Korean, characterized as a low-resource language, which confronts a significant challenge due to limited resources of speech data and available ASR models. Initially, we validate the efficacy of training the n-gram model at the clause-level by contrasting its inference accuracy with that of the E2E ASR model when merged with language models trained on smaller lexical units. We then demonstrate that our approach achieves enhanced domain adaptation accuracy compared to Shallow Fusion, a previously devised method for merging an external language model with an E2E ASR model without necessitating additional training.
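For context, the Shallow Fusion baseline the paper compares against can be sketched in a few lines: at decoding time, each hypothesis is rescored as the acoustic (E2E) log-probability plus a weighted external LM log-probability. The hypotheses, scores, and weight below are made up for illustration; the paper's WFST composition itself is more involved than this.

```python
def shallow_fusion_score(am_logprob, lm_logprob, lam=0.3):
    """Shallow fusion: interpolate the E2E ASR acoustic score with an
    external language-model score. lam is a tunable fusion weight."""
    return am_logprob + lam * lm_logprob

# Hypothetical beam of two hypotheses: (acoustic logprob, LM logprob).
# The LM term lets a slightly worse acoustic hypothesis win if it is
# far more probable language.
hyps = {"hypothesis a": (-3.0, -1.0), "hypothesis b": (-2.9, -4.0)}
best = max(hyps, key=lambda h: shallow_fusion_score(*hyps[h]))
```

The paper's contribution replaces this per-step interpolation with composition against a clause-level n-gram WFST, which it shows adapts better to new domains.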

Development of Mirror Neuron System-based BCI System using Steady-State Visually Evoked Potentials (정상상태시각유발전위를 이용한 Mirror Neuron System 기반 BCI 시스템 개발)

  • Lee, Sang-Kyung;Kim, Jun-Yeup;Park, Seung-Min;Ko, Kwang-Enu;Sim, Kwee-Bo
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.22 no.1
    • /
    • pp.62-68
    • /
    • 2012
  • Steady-State Visually Evoked Potentials (SSVEP) are natural response signals associated with visual stimuli at specific frequencies. Under SSVEP, the occipital lobe region is electrically activated at a frequency matching the stimulus frequency, over a bandwidth from 3.5 Hz to 75 Hz. In this paper, we propose an experimental paradigm for analyzing EEGs based on the properties of SSVEP. First, an experiment is performed to extract the frequency features of EEGs measured from image-based visual stimuli associated with specific objects with affordances, and object-related affordance is measured by using the mirror neuron system based on these frequency features. Linear discriminant analysis (LDA) is then applied to perform online classification of the object patterns associated with the EEG-based affordance data. Using the SSVEP measurement experiment, we propose a Brain-Computer Interface (BCI) system for recognizing a user's inherent intentions. Existing SSVEP application systems, such as spellers, classify EEG patterns based on grid image patterns and their variations. In contrast, our proposed SSVEP-based BCI system performs object pattern classification based on objects of various shapes in input images, and so has higher generality than existing systems.
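The core SSVEP property the paper relies on, occipital activity at the stimulus frequency, is commonly detected by picking the stimulus frequency whose spectral power dominates the EEG segment. The sketch below uses synthetic data and plain FFT power; the paper's actual pipeline (affordance features plus LDA) is richer than this.

```python
import numpy as np

def ssvep_target(eeg, fs, stim_freqs):
    """Pick the stimulus frequency whose spectral power dominates the
    occipital EEG segment (canonical SSVEP frequency detection)."""
    spectrum = np.abs(np.fft.rfft(eeg)) ** 2
    freqs = np.fft.rfftfreq(len(eeg), d=1.0 / fs)
    powers = [spectrum[np.argmin(np.abs(freqs - f))] for f in stim_freqs]
    return stim_freqs[int(np.argmax(powers))]

# Synthetic 2 s occipital trace flickering at 12 Hz, among {8, 12, 15} Hz stimuli.
fs = 250
t = np.arange(2 * fs) / fs
eeg = np.sin(2 * np.pi * 12 * t) + 0.3 * np.random.default_rng(0).normal(size=t.size)
```

A 2 s window gives 0.5 Hz frequency resolution, enough to separate the three candidate stimulus frequencies cleanly.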

Primary Food Commodity Classification of Processed Foods of Plant Origin in the Codex Food Classification (코덱스 식품 분류에서 식물성 가공식품의 원료식품 분류)

  • Mi-Gyung, Lee
    • Journal of Food Hygiene and Safety
    • /
    • v.37 no.6
    • /
    • pp.418-428
    • /
    • 2022
  • The purpose of this study was to obtain the Codex classification information on the primary food commodities (fresh state) of processed foods of plant origin that are included in the Codex Classification of Foods and Animal Feeds, and to investigate whether each primary food commodity is included in the primary food classification of the Food Code of Korea. The results are summarized as follows. First, the Codex classification information (number of classification codes / number of primary food commodity groups into which the fresh commodities of processed foods are classified / number of primary food commodities not included in the Codex Classification) by processed food group was 46/8/0 for dried fruits, 76/11/1 for dried vegetables, 54/4/12 for dried herbs, 36/1/0 for cereal grain milling fractions, 17/4/3 for oils and fats (crude), 34/8/9 for oils and fats (refined), 20/8/0 for fruit juices, 3/2/0 for vegetable juices, and 19 codes for teas (in the Codex Classification, no primary food commodity group for tea exists). Second, the number of primary food commodities not included in the Food Code of Korea was 9 for dried fruits, 14 for dried vegetables, 35 for dried herbs, 0 for cereal grain milling fractions, 6 for teas, 3 for oils and fats (crude), 9 for oils and fats (refined), 2 for fruit juices, and 0 for vegetable juices. Third, it was demonstrated that caution should be exercised when using the Codex Classification because of differences in food classification between Codex and Korea, such as coconut (Codex: as tree nut as well as assorted tropical and sub-tropical fruit) and olive (Codex: as assorted tropical and sub-tropical fruit as well as olives for oil production), and because of special cases in the Codex Classification, such as dried chili pepper (Codex: as spice), tomato juice (Codex: as vegetable for the primary food commodity and as fruit juice for the juice), and ginger (Codex: as spice for the rhizome, with the leaves not included as a primary commodity).