• Title/Summary/Keyword: lightweight learning (경량화 학습)


Fall detection based on acceleration sensor attached to wrist using feature data in frequency space (주파수 공간상의 특징 데이터를 활용한 손목에 부착된 가속도 센서 기반의 낙상 감지)

  • Roh, Jeong Hyun;Kim, Jin Heon
    • Smart Media Journal / v.10 no.3 / pp.31-38 / 2021
  • It is hard to predict when and where a fall will happen, and without rapid follow-up a fall can be life-threatening, so methods that detect falls automatically have become necessary. Among automatic fall detection techniques, schemes that use an IMU (inertial measurement unit) sensor attached to the wrist have difficulty detecting falls because of the wrist's own movement, but they are recognized as easy to wear and highly accessible. To overcome the difficulty of obtaining fall data, this study proposes an algorithm that learns efficiently from a small amount of data using machine learning methods such as KNN (k-nearest neighbors) and SVM (support vector machine). To improve the performance of these classifiers, the study uses feature data acquired in the frequency space. The effect of the proposed algorithm was analyzed through experiments on standard datasets while varying the parameters of the model and of the frequency feature extractor. The proposed algorithm coped adequately with the practical problem that fall data are difficult to obtain. Because it is lighter than other classifiers, it is also easy to implement on small embedded systems where SIMD (single instruction multiple data) processing devices are difficult to mount.
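The combination described in this abstract, frequency-space features fed to a KNN classifier, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the window length, the number of frequency bins, the value of k, and the synthetic signals produced by the hypothetical `make_window` helper are all assumptions.

```python
import cmath
import math
from collections import Counter

def dft_magnitudes(window, n_bins):
    """Magnitudes of the first n_bins non-DC DFT coefficients of a 1-D window."""
    n = len(window)
    mean = sum(window) / n
    centered = [x - mean for x in window]  # remove the constant (gravity) offset
    mags = []
    for k in range(1, n_bins + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(centered))
        mags.append(abs(coeff) / n)
    return mags

def knn_predict(train_feats, train_labels, query, k=3):
    """Majority vote among the k training samples closest to the query."""
    ranked = sorted((math.dist(f, query), lbl)
                    for f, lbl in zip(train_feats, train_labels))
    votes = Counter(lbl for _, lbl in ranked[:k])
    return votes.most_common(1)[0][0]

def make_window(kind, shift, n=32):
    """Hypothetical synthetic acceleration magnitude (in g), for illustration only."""
    if kind == "fall":
        # a single sharp impact spreads energy over many frequency bins
        return [1.0 + (3.0 if t == 10 + shift else 0.0) for t in range(n)]
    # slow periodic arm motion concentrates energy in the lowest bin
    return [1.0 + 0.2 * math.sin(2 * math.pi * t / n + 0.3 * shift) for t in range(n)]

# Tiny training set: three windows per class.
train = [(kind, s) for kind in ("fall", "daily") for s in (0, 1, 2)]
train_feats = [dft_magnitudes(make_window(kind, s), 4) for kind, s in train]
train_labels = [kind for kind, _ in train]

query = dft_magnitudes(make_window("fall", 5), 4)
print(knn_predict(train_feats, train_labels, query))
```

The naive DFT keeps the sketch dependency-free; on real sensor data an FFT routine and windows taken from a labeled dataset would take its place.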

Image Processing and Deep Learning Techniques for Fast Pig's Posture Determining and Head Removal (돼지의 빠른 자세 결정과 머리 제거를 위한 영상처리 및 딥러닝 기법)

  • Ahn, Hanse;Choi, Wonseok;Park, Sunhwa;Chung, Yongwha;Park, Daihee
    • KIPS Transactions on Software and Data Engineering / v.8 no.11 / pp.457-464 / 2019
  • A pig's weight is one of the main factors in determining its health and growth state, shipment timing, breeding environment, and feed ration, so measuring it is important for productivity. To estimate a pig's weight from the number of pig pixels in images acquired with a top-view camera, the posture must be determined and the head removed so that the pixel count is accurate. In this research, we propose a fast and accurate method that determines the pig's posture with a fast image processing technique, finds the head location with a fast deep learning technique, and removes the head with a lightweight image processing technique. First, we determine the pig's posture by comparing the lengths from the center of the pig's body to its outline in the binary image. Then, we train YOLO (a fast deep learning-based object detector) on the locations of the pig's head, body, and hip in images, obtain the head location, and use it to remove the area outside the head. Finally, we find the boundary between head and body using the convex hull and remove the pig's head. In the experimental results, the pig's posture was determined with an accuracy of 0.98 at a processing speed of 250.00 fps, and the pig's head was removed with an accuracy of 0.96 at a processing speed of 48.97 fps.
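The first step of this abstract, measuring lengths from the body center to the outline in a binary image, might look like the sketch below. The abstract does not state how the lengths are compared to decide the posture, so the code stops at computing them; the 4-neighbour outline definition and the toy mask are assumptions.

```python
import math

def centroid_outline_distances(mask):
    """Return (centroid, distances) from the foreground centroid to each outline pixel."""
    rows, cols = len(mask), len(mask[0])
    fg = [(r, c) for r in range(rows) for c in range(cols) if mask[r][c]]
    cy = sum(r for r, _ in fg) / len(fg)
    cx = sum(c for _, c in fg) / len(fg)

    def is_outline(r, c):
        # a foreground pixel is on the outline if any 4-neighbour is
        # background or lies outside the image
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            rr, cc = r + dr, c + dc
            if not (0 <= rr < rows and 0 <= cc < cols) or not mask[rr][cc]:
                return True
        return False

    dists = [math.hypot(r - cy, c - cx) for r, c in fg if is_outline(r, c)]
    return (cy, cx), dists

# Toy binary image: a 3x7 blob standing in for a top-view body silhouette.
mask = [[1 if 1 <= r <= 3 and 1 <= c <= 7 else 0 for c in range(9)]
        for r in range(5)]
centroid, dists = centroid_outline_distances(mask)
print(centroid, min(dists), max(dists))
```

A posture rule could then be built on these lengths, for example on how they vary around the outline, but any such rule would have to come from the paper itself.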

Comparative study of data augmentation methods for fake audio detection (음성위조 탐지에 있어서 데이터 증강 기법의 성능에 관한 비교 연구)

  • KwanYeol Park;Il-Youp Kwak
    • The Korean Journal of Applied Statistics / v.36 no.2 / pp.101-114 / 2023
  • Data augmentation is an effective way to reduce model overfitting because it lets the model see the training dataset from various perspectives. In addition to image augmentation techniques such as rotation, cropping, horizontal flip, and vertical flip, occlusion-based data augmentation methods such as Cutmix and Cutout have been proposed. For models based on speech data, occlusion-based augmentation can be applied after converting the 1D speech signal into a 2D spectrogram. In particular, SpecAugment is an occlusion-based augmentation technique for speech spectrograms. In this study, we compare data augmentation techniques that can be used for fake audio detection. Using data from the ASVspoof2017 and ASVspoof2019 competitions, which were held to detect fake audio, datasets augmented with the occlusion-based methods Cutout, Cutmix, and SpecAugment were used to train an LCNN model. All three augmentation techniques generally improved the performance of the model. Cutmix showed the best performance on ASVspoof2017, Mixup on ASVspoof2019 LA, and SpecAugment on ASVspoof2019 PA. In addition, increasing the number of masks for SpecAugment helped to improve performance. In conclusion, the appropriate augmentation technique differs depending on the situation and the data.
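The occlusion idea behind SpecAugment, zeroing out bands of frequency channels and time steps in a 2D spectrogram, can be sketched as below. This is a simplified illustration under assumptions: the function name `spec_mask`, the mask widths, and the list-of-lists spectrogram layout (rows are frequency channels, columns are time steps) are hypothetical, and the time-warping step of the original technique is omitted.

```python
import random

def spec_mask(spec, max_freq_width, max_time_width, num_masks=1, rng=None):
    """Return a copy of spec with random frequency bands and time bands set to 0."""
    rng = rng or random.Random()
    n_freq, n_time = len(spec), len(spec[0])
    out = [row[:] for row in spec]  # leave the input untouched
    for _ in range(num_masks):
        # frequency mask: f consecutive channels starting at f0
        f = rng.randint(0, max_freq_width)
        f0 = rng.randint(0, n_freq - f)
        for r in range(f0, f0 + f):
            out[r] = [0.0] * n_time
        # time mask: t consecutive steps starting at t0
        t = rng.randint(0, max_time_width)
        t0 = rng.randint(0, n_time - t)
        for row in out:
            for c in range(t0, t0 + t):
                row[c] = 0.0
    return out

# Toy spectrogram of ones: 8 frequency channels by 10 time steps.
spec = [[1.0] * 10 for _ in range(8)]
masked = spec_mask(spec, max_freq_width=3, max_time_width=4,
                   num_masks=2, rng=random.Random(0))
for row in masked:
    print("".join("#" if v else "." for v in row))
```

Raising `num_masks` applies more bands per example, which corresponds to the "number of masks" setting the abstract reports as helpful.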

Lightening of Human Pose Estimation Algorithm Using MobileViT and Transfer Learning

  • Kunwoo Kim;Jonghyun Hong;Jonghyuk Park
    • Journal of the Korea Society of Computer and Information / v.28 no.9 / pp.17-25 / 2023
  • In this paper, we propose a MobileViT-based model that performs human pose estimation with fewer parameters and faster inference. The base model achieves its light weight through a structure that combines features of convolutional neural networks with features of the Vision Transformer. The Transformer, the main mechanism in this study, has become more influential as Transformer-based models outperform convolutional neural network-based models in computer vision. Likewise, in human pose estimation, the Vision Transformer-based ViTPose holds the best performance on human pose estimation benchmarks such as COCO, OCHuman, and MPII. However, because the Vision Transformer has a heavy model structure with a large number of parameters and requires a relatively large amount of computation, training it is costly for users. The base model therefore compensates for the Vision Transformer's lack of inductive bias, which otherwise demands a large amount of computation, with local representations obtained through a convolutional neural network structure. Finally, the proposed model obtained a mean average precision of 0.694 on the MS COCO benchmark with 3.28 GFLOPs and 9.72 million parameters, which are 1/5 and 1/9 of the corresponding figures for ViTPose, respectively.

A Generalized Adaptive Deep Latent Factor Recommendation Model (일반화 적응 심층 잠재요인 추천모형)

  • Kim, Jeongha;Lee, Jipyeong;Jang, Seonghyun;Cho, Yoonho
    • Journal of Intelligence and Information Systems / v.29 no.1 / pp.249-263 / 2023
  • Collaborative Filtering, a representative recommendation system methodology, consists of two approaches: neighbor methods and latent factor models. Among these, the latent factor model using matrix factorization decomposes the user-item interaction matrix into two lower-dimensional rectangular matrices and predicts an item's rating through the product of these matrices. Because the factor vectors inferred from rating patterns capture user and item characteristics, this method is superior to neighbor-based methods in scalability, accuracy, and flexibility. However, it has a fundamental drawback: it struggles to reflect the diversity of individual users' preferences for items that have no ratings. This limitation leads to repetitive and inaccurate recommendations. The Adaptive Deep Latent Factor Model (ADLFM) was developed to address this issue. This model adaptively learns the preferences for each item by using the item description, which provides a detailed summary and explanation of the item. ADLFM takes the item description as input, calculates latent vectors of the user and item, and reflects personal diversity using an attention score. However, because it requires a dataset that includes item descriptions, the domains to which ADLFM can be applied are limited, which restricts its generalization. This study proposes a Generalized Adaptive Deep Latent Factor Recommendation Model, G-ADLFRM, to address the limitations of ADLFM. First, we use the item ID, commonly used in recommendation systems, as input instead of the item description. Additionally, we apply improved deep learning model structures such as Self-Attention, Multi-head Attention, and Multi-Conv1D. We conducted experiments on various datasets with changes to the input and the model structure. When only the input was changed, MAE increased slightly compared to ADLFM because of the accompanying information loss, meaning recommendation performance decreased. However, the average learning speed per epoch improved significantly as the amount of information to be processed decreased. When both the input and the model structure were changed, the best-performing Multi-Conv1D structure showed performance similar to ADLFM, sufficiently counteracting the information loss caused by the input change. We conclude that G-ADLFRM is a new, lightweight, and generalizable model that maintains the performance of the existing ADLFM while enabling fast learning and inference.
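The plain latent factor model that this abstract starts from, predicting a rating as the product of a user factor vector and an item factor vector, can be sketched as follows. This illustrates only the matrix factorization baseline, not ADLFM or G-ADLFRM; the toy ratings, the rank, the learning rate, and the regularization weight are assumptions.

```python
import random

def predict(P, Q, u, i):
    """Predicted rating: dot product of user u's and item i's factor vectors."""
    return sum(pk * qk for pk, qk in zip(P[u], Q[i]))

def train_mf(ratings, n_users, n_items, rank=2, lr=0.01, reg=0.02,
             epochs=300, seed=0):
    """Fit factor matrices P (users x rank) and Q (items x rank) with SGD."""
    rng = random.Random(seed)
    P = [[rng.uniform(0.1, 0.5) for _ in range(rank)] for _ in range(n_users)]
    Q = [[rng.uniform(0.1, 0.5) for _ in range(rank)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - predict(P, Q, u, i)
            for k in range(rank):
                pu, qi = P[u][k], Q[i][k]  # update both from the old values
                P[u][k] += lr * (err * qi - reg * pu)
                Q[i][k] += lr * (err * pu - reg * qi)
    return P, Q

def mae(ratings, P, Q):
    """Mean absolute error over the given (user, item, rating) triples."""
    return sum(abs(r - predict(P, Q, u, i)) for u, i, r in ratings) / len(ratings)

# Toy observed ratings as (user, item, rating); the other cells are unknown.
ratings = [(0, 0, 5), (0, 1, 3), (1, 0, 4), (1, 2, 1), (2, 1, 2), (2, 2, 5)]

P0, Q0 = train_mf(ratings, 3, 3, epochs=0)  # untrained factors
P, Q = train_mf(ratings, 3, 3)              # trained factors
print(mae(ratings, P0, Q0), mae(ratings, P, Q))
print(predict(P, Q, 0, 2))                  # estimate for an unrated cell
```

Because each user has a single fixed vector here, the same preferences are applied to every item, which is the rigidity the adaptive models in the abstract aim to relax.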