• Title/Summary/Keyword: neural network.

Sound event detection model using self-training based on noisy student model (잡음 학생 모델 기반의 자가 학습을 활용한 음향 사건 검지)

  • Kim, Nam Kyun;Park, Chang-Soo;Kim, Hong Kook;Hur, Jin Ook;Lim, Jeong Eun
    • The Journal of the Acoustical Society of Korea / v.40 no.5 / pp.479-487 / 2021
  • In this paper, we propose a Sound Event Detection (SED) model using self-training based on a noisy student model. The proposed SED model consists of two stages. In the first stage, a mean-teacher model based on a Residual Convolutional Recurrent Neural Network (RCRNN) is constructed to provide target labels for weakly labeled or unlabeled data. In the second stage, a self-training-based noisy student model is constructed by applying different noise types: feature noise, such as time-frequency shift, mixup, and SpecAugment, and dropout-based model noise. In addition, a semi-supervised loss function is applied to train the noisy student model, which acts as label noise injection. The performance of the proposed SED model is evaluated on the validation set of the Detection and Classification of Acoustic Scenes and Events (DCASE) 2020 Challenge Task 4. The experiments show that the single model and the ensemble model of the proposed SED based on the noisy student model improve the F1-score by 4.6% and 3.4%, respectively, compared to the top-ranked model in the DCASE 2020 Challenge Task 4.
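Of the feature-noise types named in this abstract, mixup is the simplest to illustrate. The following is a minimal sketch of the standard mixup formulation, not the authors' implementation: the function names, the list-based feature representation, and the value `alpha=0.2` are assumptions made for illustration.

```python
import random


def mixup(x1, y1, x2, y2, lam):
    """Blend two (features, labels) pairs as lam * first + (1 - lam) * second."""
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y1, y2)]
    return x, y


def sample_lambda(alpha=0.2, rng=random):
    """Draw the mixing weight from Beta(alpha, alpha).

    alpha=0.2 is an assumed value, not one reported in the abstract.
    """
    return rng.betavariate(alpha, alpha)


# Hypothetical two-bin features and two-class multi-hot labels.
mixed_x, mixed_y = mixup([1.0, 2.0], [1.0, 0.0], [3.0, 6.0], [0.0, 1.0], lam=0.25)
```

Because the labels are blended with the same weight as the features, the mixed example carries soft targets, which suits the multi-label setting of sound event detection.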

Dimensionality Reduction of Feature Set for API Call based Android Malware Classification

  • Hwang, Hee-Jin;Lee, Soojin
    • Journal of the Korea Society of Computer and Information / v.26 no.11 / pp.41-49 / 2021
  • All application programs, including malware, call the Application Programming Interface (API) upon execution. Recently, attempts to detect and classify malware based on API Call information, exploiting this characteristic, have been actively studied. However, datasets containing API Call information incur high computational cost and long processing time. In addition, information that does not significantly contribute to the classification of malware may affect the classification accuracy of the learning model. Therefore, in this paper, we propose a method of extracting an essential feature set after reducing the dimensionality of API Call information by applying various feature selection methods. We used CICAndMal2020, a recently released Android malware dataset, for the experiment. After extracting the essential feature set through various feature selection methods, Android malware classification was conducted using a Convolutional Neural Network (CNN) and the results were analyzed. The results showed that the selected feature set or weight priority varies according to the feature selection method. In the case of binary classification, malware was classified with 97% accuracy even when the feature set was reduced to 15% of the total size. In the case of multiclass classification, an average accuracy of 83% was achieved while reducing the feature set to 8% of the total size.
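The reduction step described above, keeping only a fraction of the features ranked by some importance score, can be sketched as follows. The abstract does not name the feature selection methods, so the scores here are placeholders and both function names are hypothetical.

```python
def select_top_fraction(scores, fraction):
    """Return the sorted indices of the highest-scoring features.

    scores: one importance score per feature (how they are computed is an
            assumption; any feature selection method could supply them).
    fraction: share of features to keep, e.g. 0.15 for 15%.
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    keep = max(1, round(fraction * len(scores)))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:keep])


def project(rows, kept):
    """Keep only the selected feature columns of each sample row."""
    return [[row[i] for i in kept] for row in rows]


# Hypothetical scores for four API-call features; keep the top half.
kept = select_top_fraction([0.1, 0.9, 0.3, 0.7], 0.5)
reduced = project([[1, 2, 3, 4], [5, 6, 7, 8]], kept)
```

The reduced rows would then be fed to the classifier in place of the full feature vectors.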

Seq2Seq model-based Prognostics and Health Management of Robot Arm (Seq2Seq 모델 기반의 로봇팔 고장예지 기술)

  • Lee, Yeong-Hyeon;Kim, Kyung-Jun;Lee, Seung-Ik;Kim, Dong-Ju
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.12 no.3 / pp.242-250 / 2019
  • In this paper, we propose a method to predict the failure of an industrial robot using a Seq2Seq (Sequence to Sequence) model, an Artificial Neural Network model for transforming time series data. The proposed method uses the joint current and angle values, which can be measured by the robot itself, without an additional sensor for fault diagnosis. After preprocessing the measured data for training, the Seq2Seq model was trained to convert current to angle. The abnormality degree used for fault diagnosis is the RMSE (Root Mean Squared Error) between the predicted angle and the actual angle over a unit time. The proposed method was evaluated using test data measured under different normal and defective conditions of the robot. When the abnormality degree exceeded the threshold, the sample was classified as a fault, and the fault diagnosis accuracy in the experiment was 96.67%. The proposed method has the merit that it can perform fault prediction without an additional sensor, and the experiment confirmed that high diagnostic performance and efficiency can be achieved without requiring deep expert knowledge of the robot.
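The RMSE-based fault score described in this abstract can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' code: the window length, the threshold value, and the function names are hypothetical, and the predicted angles stand in for the output of the Seq2Seq model.

```python
import math


def windowed_rmse(predicted, actual, window):
    """RMSE between predicted and actual angles over consecutive windows.

    Each window of `window` samples plays the role of one "unit time".
    Trailing samples that do not fill a whole window are ignored.
    """
    if len(predicted) != len(actual):
        raise ValueError("sequences must have the same length")
    scores = []
    for start in range(0, len(predicted) - window + 1, window):
        squared = [
            (p - a) ** 2
            for p, a in zip(predicted[start:start + window],
                            actual[start:start + window])
        ]
        scores.append(math.sqrt(sum(squared) / window))
    return scores


def classify_faults(scores, threshold):
    """Flag a window as faulty when its score exceeds the threshold."""
    return [score > threshold for score in scores]


# Hypothetical angles: the second window deviates strongly from the prediction.
scores = windowed_rmse([0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 3.0, 3.0], window=2)
faults = classify_faults(scores, threshold=1.0)
```

In practice the threshold would be chosen from data recorded under normal operation; the abstract does not say how it was set.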

Pedestrian Classification using CNN's Deep Features and Transfer Learning (CNN의 깊은 특징과 전이학습을 사용한 보행자 분류)

  • Chung, Soyoung;Chung, Min Gyo
    • Journal of Internet Computing and Services / v.20 no.4 / pp.91-102 / 2019
  • In autonomous driving systems, the ability to classify pedestrians in images captured by cameras is very important for pedestrian safety. In the past, pedestrian features were extracted with HOG (Histogram of Oriented Gradients) or SIFT (Scale-Invariant Feature Transform) and then classified using an SVM (Support Vector Machine). However, extracting pedestrian characteristics in such a handcrafted manner has many limitations. Therefore, this paper proposes a method to classify pedestrians reliably and effectively using the deep features of a CNN (Convolutional Neural Network) and transfer learning. We experimented with both the fixed feature extractor and the fine-tuning methods, which are two representative transfer learning techniques. In particular, for the fine-tuning method, we added a new scheme called M-Fine (Modified Fine-tuning), which divides layers into transferred parts and non-transferred parts in three different sizes, and adjusts weights only for layers belonging to the non-transferred parts. Experiments on the INRIA Person data set with five CNN models (VGGNet, DenseNet, Inception V3, Xception, and MobileNet) showed that CNN deep features perform better than handcrafted features such as HOG and SIFT, and that the accuracy of Xception (threshold = 0.5) is the highest at 99.61%. MobileNet, which achieved similar performance to Xception while learning 80% fewer parameters, was the best in terms of efficiency. Among the three transfer learning schemes tested, the fine-tuning method performed best. The performance of the M-Fine method was comparable to or slightly lower than that of the fine-tuning method, but higher than that of the fixed feature extractor method.
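The idea behind M-Fine, updating weights only in the non-transferred part of the network, can be sketched without any deep learning framework. The sketch below assumes that the transferred part consists of the earliest layers, which the abstract does not state; the layer names and the function name are hypothetical.

```python
def split_layers(layer_names, n_transferred):
    """Map each layer name to True if it should be trained, False if frozen.

    Assumption: the first n_transferred layers form the transferred part
    (weights kept as pretrained); the remaining layers are fine-tuned.
    """
    if not 0 <= n_transferred <= len(layer_names):
        raise ValueError("n_transferred is out of range")
    return {name: index >= n_transferred
            for index, name in enumerate(layer_names)}


# Hypothetical four-layer network with the first two layers transferred.
trainable = split_layers(["conv1", "conv2", "conv3", "fc"], n_transferred=2)
```

Setting `n_transferred` to the full depth minus the classifier corresponds to a fixed feature extractor, and setting it to 0 corresponds to ordinary fine-tuning of every layer, so the two baseline schemes are the extremes of the same split.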

A study on the development of severity-adjusted mortality prediction model for discharged patient with acute stroke using machine learning (머신러닝을 이용한 급성 뇌졸중 퇴원 환자의 중증도 보정 사망 예측 모형 개발에 관한 연구)

  • Baek, Seol-Kyung;Park, Jong-Ho;Kang, Sung-Hong;Park, Hye-Jin
    • Journal of the Korea Academia-Industrial cooperation Society / v.19 no.11 / pp.126-136 / 2018
  • The purpose of this study was to develop a severity-adjustment model for predicting mortality in acute stroke patients using machine learning. Using the Korean National Hospital Discharge In-depth Injury Survey from 2006 to 2015, the study population with disease codes I60-I63 (KCD 7) was extracted for further analysis. Three tools were used for the severity adjustment of comorbidity: the Charlson Comorbidity Index (CCI), the Elixhauser Comorbidity Index (ECI), and the Clinical Classification Software (CCS). The severity-adjustment models for mortality prediction in patients with acute stroke were developed using logistic regression, decision tree, neural network, and support vector machine methods. The most common comorbid diseases in stroke patients were hypertension, uncomplicated (43.8%) in the ECI and essential hypertension (43.9%) in the CCS. Among the CCI, ECI, and CCS, the CCS had the highest AUC value and was confirmed as the best severity-adjustment tool. In addition, the AUC values for the CCS model with the variables main diagnosis, gender, age, hospitalization route, and existence of surgery were 0.808 for logistic regression, 0.785 for the decision tree, 0.809 for the neural network, and 0.830 for the support vector machine. Therefore, the best predictive power was achieved by the support vector machine technique. The results of this study can be used in the establishment of health policy in the future.
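The models in this abstract are compared by AUC. As a reminder of what that number measures, the sketch below computes it as the probability that a randomly chosen positive case (death) receives a higher predicted score than a randomly chosen negative case, counting ties as one half. The data are invented for illustration and the function name is hypothetical.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: 1 for a positive case, 0 for a negative case.
    scores: predicted risk for each case, higher meaning more likely positive.
    """
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical predictions: three of the four positive/negative pairs are ordered correctly.
value = auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

The pairwise form is quadratic in the number of cases, so for a survey-sized dataset a rank-based implementation would be the practical choice; the result is the same.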

Dual CNN Structured Sound Event Detection Algorithm Based on Real Life Acoustic Dataset (실생활 음향 데이터 기반 이중 CNN 구조를 특징으로 하는 음향 이벤트 인식 알고리즘)

  • Suh, Sangwon;Lim, Wootaek;Jeong, Youngho;Lee, Taejin;Kim, Hui Yong
    • Journal of Broadcast Engineering / v.23 no.6 / pp.855-865 / 2018
  • Sound event detection is a research area that models human auditory cognitive characteristics by recognizing events in an environment with multiple acoustic events and determining the onset and offset times of each event. DCASE, a research group on acoustic scene classification and sound event detection, holds challenges to encourage the participation of researchers and to stimulate sound event detection research. However, the dataset provided by the DCASE Challenge is relatively small compared to ImageNet, a representative dataset for visual object recognition, and few open acoustic datasets are available. In this study, sound events that can occur indoors and outdoors were collected on a larger scale and annotated to construct a dataset. Furthermore, to improve the performance of the sound event detection task, we developed a dual CNN structured sound event detection system by adding a supplementary neural network to a convolutional neural network to determine the presence of sound events. Finally, we conducted a comparative experiment with the baseline systems of both DCASE 2016 and 2017.

Development of Productivity Prediction Model according to Choke Size and Gas Injection Rate by using ANN(Artificial Neural Network) at Oil Producer (오일 생산정에서 쵸크사이즈와 가스주입량에 따른 생산성 예측 인공신경망 모델 개발)

  • Han, Dong-kwon;Kwon, Sun-il
    • Journal of the Korean Institute of Gas / v.22 no.6 / pp.90-103 / 2018
  • This paper presents the development of two ANN models that can predict an optimum production rate by controlling the choke size in an oil well and the gas injection rate in a gas-lift well. The input data were the solution gas-oil ratio, water cut, reservoir pressure, and choke size or gas injection rate. The output data were the wellhead pressure and production rate. First, the range of each parameter was determined by conducting a sensitivity analysis of the input data for an onshore oil well. In addition, 1,715 sets of training data for the choke size decision model and 1,225 sets for the gas injection rate decision model were generated by nodal analysis. A comparison between the nodal analysis and the ANN on the same reservoir system showed that the correlation factors were very high (>0.99). The mean absolute errors of wellhead pressure and oil production rate were 0.55% and 1.05%, respectively, with the choke size model, and 1.23% and 2.67% with the gas injection rate model. These results indicate that the developed models are highly accurate.

Visual analysis of attention-based end-to-end speech recognition (어텐션 기반 엔드투엔드 음성인식 시각화 분석)

  • Lim, Seongmin;Goo, Jahyun;Kim, Hoirin
    • Phonetics and Speech Sciences / v.11 no.1 / pp.41-49 / 2019
  • An end-to-end speech recognition model consisting of a single integrated neural network model was recently proposed. The end-to-end model does not need several training steps, and its structure is easy to understand. However, it is difficult to understand how the model recognizes speech internally. In this paper, we visualized and analyzed the attention-based end-to-end model to elucidate its internal mechanisms. We compared the acoustic model of the BLSTM-HMM hybrid model with the encoder of the end-to-end model, and visualized them using t-SNE to examine the differences between neural network layers. As a result, we were able to delineate the difference between the acoustic model and the end-to-end model encoder. Additionally, we analyzed the decoder of the end-to-end model from a language model perspective. Finally, we found that improving the decoder of the end-to-end model is necessary to yield higher performance.

A Method for 3D Human Pose Estimation based on 2D Keypoint Detection using RGB-D information (RGB-D 정보를 이용한 2차원 키포인트 탐지 기반 3차원 인간 자세 추정 방법)

  • Park, Seohee;Ji, Myunggeun;Chun, Junchul
    • Journal of Internet Computing and Services / v.19 no.6 / pp.41-51 / 2018
  • Recently, in the field of video surveillance, deep learning based methods have been applied to intelligent video surveillance systems, so that various events such as crimes, fires, and abnormal phenomena can be robustly detected. However, since occlusion occurs due to the loss of 3D information when the 3D real world is projected onto a 2D image, the occlusion problem must be considered in order to accurately detect objects and estimate poses. Therefore, in this paper, we detect moving objects while solving the occlusion problem of the object detection process by adding depth information to the existing RGB information. Then, using a convolutional neural network in the detected region, the positions of the 14 keypoints of the human joint regions are predicted. Finally, to solve the self-occlusion problem occurring in the pose estimation process, a method for 3D human pose estimation is described that extends the range of estimation to 3D space using the predicted 2D keypoints and a deep neural network. The 2D and 3D pose estimation results of this research can be used as data for future human behavior recognition and contribute to the development of industrial technology.

A ground condition prediction ahead of tunnel face utilizing time series analysis of shield TBM data in soil tunnel (토사터널의 쉴드 TBM 데이터 시계열 분석을 통한 막장 전방 예측 연구)

  • Jung, Jee-Hee;Kim, Byung-Kyu;Chung, Heeyoung;Kim, Hae-Mahn;Lee, In-Mo
    • Journal of Korean Tunnelling and Underground Space Association / v.21 no.2 / pp.227-242 / 2019
  • This paper presents a method to predict ground types ahead of a tunnel face utilizing operational data of an earth pressure-balanced (EPB) shield tunnel boring machine (TBM) when running through soil ground. A time series analysis model that was applicable to predicting mixed ground composed of soils and rocks was modified to be applicable to soil tunnels. Using the modified model, the feasibility of choosing soil conditioning materials according to soil type was studied. To do this, self-organizing map (SOM) clustering was performed. First, it was confirmed that the ground types should be classified based on a criterion of 35% passing through the #200 sieve. Then, the possibility of predicting the ground types by employing the modified model, in which the TBM operational data were analyzed, was studied. The efficacy of the modified model is demonstrated by its 98% accuracy in predicting ground types ten rings ahead of the tunnel face. In particular, the average prediction accuracy was approximately 93% in areas where ground type variations occur.