Search | Korea Science

LSTM(Long Short-Term Memory)-Based Abnormal Behavior Recognition Using AlphaPose (AlphaPose를 활용한 LSTM(Long Short-Term Memory) 기반 이상행동인식)

Bae, Hyun-Jae;Jang, Gyu-Jin;Kim, Young-Hun;Kim, Jin-Pyung
- KIPS Transactions on Software and Data Engineering, v.10 no.5, pp.187-194, 2021
Human behavior recognition identifies what a person is doing from the movement of their joints, using computer vision techniques from image processing. Combined with deep learning and CCTV, it can serve as a safety accident response service at safety management sites. Existing research includes relatively few deep learning studies that recognize behavior from extracted human joint keypoints, and it has been difficult to manage workers continuously and systematically at safety management sites. To address these problems, this paper proposes a method that recognizes risky behavior using only joint keypoints and joint motion information. AlphaPose, a pose estimation method, was used to extract the body's joint keypoints. The extracted keypoints were fed sequentially into a Long Short-Term Memory (LSTM) model so that it could learn from continuous data. An evaluation of recognition accuracy confirmed that accuracy was high for the "Lying Down" behavior.
https://doi.org/10.3745/KTSDE.2021.10.5.187

Uncooperative Person Recognition Based on Stochastic Information Updates and Environment Estimators

Kim, Hye-Jin;Kim, Dohyung;Lee, Jaeyeon;Jeong, Il-Kwon
- ETRI Journal, v.37 no.2, pp.395-405, 2015
We address the problem of uncooperative person recognition through continuous monitoring. Multiple modalities, such as face, height, clothes color, and voice, can be used when attempting to recognize a person. In general, not all modalities are available for a given frame; furthermore, only some modalities will be useful, because the quality of some frames in a video sequence is too low to recognize a person. We propose a method that uses stochastic information updates of temporal modalities and environment estimators to improve person recognition performance. The environment estimators indicate whether a given modality is reliable enough to be used in a particular instance; with such indicators, meaningless data can be identified and eliminated easily, which increases the overall efficiency of the method. Our proposed method was tested using movie clips acquired in an unconstrained environment that included wide variations of scale and rotation; illumination changes; uncontrolled distances from the camera to users (varying from 0.5 m to 5 m); and natural views of the human body with various types of noise. In this real and challenging scenario, our proposed method achieved outstanding performance.
https://doi.org/10.4218/etrij.15.0114.0037
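
The abstract above does not state the update rule, so the following is only a minimal sketch of the general idea: a belief over identities is updated frame by frame, and any modality whose environment estimator reports low reliability is skipped. The naive-Bayes style multiplication, the function name `update_belief`, the `threshold` parameter, and all numbers are assumptions made for illustration, not the authors' method.

```python
from typing import Dict


def normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores so they sum to 1."""
    total = sum(scores.values())
    return {person: value / total for person, value in scores.items()}


def update_belief(
    belief: Dict[str, float],
    likelihoods: Dict[str, Dict[str, float]],
    reliability: Dict[str, float],
    threshold: float = 0.5,  # hypothetical cut-off for "reliable enough"
) -> Dict[str, float]:
    """Update the identity belief using only modalities judged reliable.

    likelihoods maps modality -> {person: P(observation | person)}.
    reliability maps modality -> score in [0, 1] from a (hypothetical)
    environment estimator.
    """
    posterior = dict(belief)
    for modality, per_person in likelihoods.items():
        if reliability.get(modality, 0.0) < threshold:
            continue  # discard data the estimator flags as unreliable
        for person in posterior:
            # Small floor so an unseen person is not ruled out entirely.
            posterior[person] *= per_person.get(person, 1e-6)
    return normalize(posterior)


if __name__ == "__main__":
    belief = {"alice": 0.5, "bob": 0.5}
    frame_likelihoods = {
        "face": {"alice": 0.9, "bob": 0.1},
        "voice": {"alice": 0.2, "bob": 0.8},  # noisy audio in this frame
    }
    frame_reliability = {"face": 0.8, "voice": 0.1}
    print(update_belief(belief, frame_likelihoods, frame_reliability))
```

Calling `update_belief` once per frame and feeding the result back in as the next `belief` gives a running estimate; in this example the unreliable voice observation does not affect the result.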

Neural Network-based Recognition of Handwritten Hangul Characters in Form's Monetary Fields (전표 금액란에 나타나는 필기 한글의 신경망-기반 인식)

이진선;오일석
- Journal of Korea Society of Industrial Information Systems, v.5 no.1, pp.25-30, 2000
Hangul is regarded as a difficult character set to recognize because of its large number of classes and the shape similarity among different characters. Most conventional research has attempted to recognize the 2,350 commonly used characters; this approach offers generality but suffers from low recognition performance. In contrast, recognizing a small character set that appears in a specific field, such as postal addresses or bank checks, is a more practical approach. This paper describes research on recognizing the handwritten Hangul characters that appear in monetary fields. A modular neural network is adopted for classification, and three kinds of features are tested. An experiment using the standard Hangul database PE92 showed a correct recognition rate of 91.56%.

SEL-RefineMask: A Seal Segmentation and Recognition Neural Network with SEL-FPN

Dun, Ze-dong;Chen, Jian-yu;Qu, Mei-xia;Jiang, Bin
- Journal of Information Processing Systems, v.18 no.3, pp.411-427, 2022
Extracting historical and cultural information from seals in ancient books is of great significance. However, ancient Chinese seal samples are scarce and carving methods are diverse, so traditional digital image processing methods based on greyscale have difficulty achieving good segmentation and recognition performance. Deep learning algorithms have recently been proposed to address this problem, but current neural networks are difficult to train owing to the lack of datasets. To solve these problems, we propose SEL-RefineMask, which combines a selector of feature pyramid network (SEL-FPN) with RefineMask to segment and recognize seals. The SEL-FPN is designed to intelligently select a specific layer representing different scales in the FPN and to reduce the number of anchor frames. We performed experiments using several instance segmentation networks as baseline methods, and the top-1 segmentation result of 64.93% is 5.73% higher than that of humans. The top-1 result of the SEL-RefineMask network reached 67.96%, surpassing the baseline results. After segmentation, a vision transformer was used to recognize the segmentation output, and its accuracy reached 91%. Furthermore, a dataset of seals in ancient Chinese books (SACB) for segmentation and a small seal font (SSF) dataset for recognition were established and are publicly available on the website.
https://doi.org/10.3745/JIPS.02.0174

Trends and Future Directions in Facial Expression Recognition Technology: A Text Mining Analysis Approach (얼굴 표정 인식 기술의 동향과 향후 방향: 텍스트 마이닝 분석을 중심으로)

Insu Jeon;Byeongcheon Lee;Subeen Leem;Jihoon Moon
- Proceedings of the Korea Information Processing Society Conference, 2023.05a, pp.748-750, 2023
Facial expression recognition technology's rapid growth and development have garnered significant attention in recent years. This technology holds immense potential for various applications, making it crucial to stay up-to-date with the latest trends and advancements. Simultaneously, it is essential to identify and address the challenges that impede the technology's progress. Motivated by these factors, this study aims to understand the latest trends, future directions, and challenges in facial expression recognition technology by utilizing text mining to analyze papers published between 2020 and 2023. Our research focuses on discerning which aspects of these papers provide valuable insights into the field's recent developments and issues. By doing so, we aim to present the information in an accessible and engaging manner for readers, enabling them to understand the current state and future potential of facial expression recognition technology. Ultimately, our study seeks to contribute to the ongoing dialogue and facilitate further advancements in this rapidly evolving field.
https://doi.org/10.3745/PKIPS.y2023m05a.748

Exploring the feasibility of fine-tuning large-scale speech recognition models for domain-specific applications: A case study on Whisper model and KsponSpeech dataset

Jungwon Chang;Hosung Nam
- Phonetics and Speech Sciences, v.15 no.3, pp.83-88, 2023
This study investigates the fine-tuning of large-scale Automatic Speech Recognition (ASR) models, specifically OpenAI's Whisper model, for domain-specific applications using the KsponSpeech dataset. The primary research questions address the effectiveness of targeted lexical item emphasis during fine-tuning, its impact on domain-specific performance, and whether the fine-tuned model can maintain generalization capabilities across different languages and environments. Experiments were conducted using two fine-tuning datasets: Set A, a small subset emphasizing specific lexical items, and Set B, consisting of the entire KsponSpeech dataset. Results showed that fine-tuning with targeted lexical items increased recognition accuracy and improved domain-specific performance, with generalization capabilities maintained when fine-tuned with a smaller dataset. For noisier environments, a trade-off between specificity and generalization capabilities was observed. This study highlights the potential of fine-tuning using minimal domain-specific data to achieve satisfactory results, emphasizing the importance of balancing specialization and generalization for ASR models. Future research could explore different fine-tuning strategies and novel technologies such as prompting to further enhance large-scale ASR models' domain-specific performance.
https://doi.org/10.13064/KSSS.2023.15.3.083

A Study on Non-Contact Care Robot System through Deep Learning

Hyun-Sik Ham;Sae Jun Ko
- Journal of the Korea Society of Computer and Information, v.28 no.12, pp.33-40, 2023
As South Korea becomes a super-aging society, demand for elderly welfare services has been rising steadily. However, the current shortage of welfare personnel has emerged as a social issue. To address this challenge, research is actively under way on elderly care robots designed to mitigate the social isolation of the elderly and to provide emergency contact capabilities in critical situations. These functions require direct user contact, however, which is a limitation of conventional elderly care robots. In this paper, we propose a care robot system that can interact with users without direct physical contact. The system builds on commercialized elderly care robots and cameras. We equipped the care robot with an edge device that runs facial expression recognition and action recognition models. The models were trained and validated using publicly available data. Experimental results show that facial expression recognition achieves 96.5% accuracy and action recognition achieves 90.9%, with inference times of 50 ms and 350 ms, respectively. These findings indicate that the proposed system offers efficient and accurate facial expression and action recognition, enabling interaction even in non-contact situations.
https://doi.org/10.9708/jksci.2023.28.12.033

Role of Acrosomal Matrix in Mammalian Fertilization (포유류 수정과정에서 정자 침체기질의 기능)

;George L. Gerton
- Journal of Embryo Transfer, v.16 no.1, pp.61-68, 2001
Sperm competent for fertilization can become capacitated, bind to the zona pellucida (ZP) of an egg in a specific manner, and complete acrosomal exocytosis. Failure to carry out these functions results in infertility. Although the interactions between the ZP and the plasma membrane overlying the sperm acrosome have been considered important for sperm-egg recognition and signalling, recent results have prompted a reassessment of current paradigms concerning these interactions. In this review, we discuss the roles in fertilization of the acrosomal matrix, the particulate component of the acrosomal contents. The general hypothesis is that acrosomal exocytosis leads to the exposure of acrosomal matrix proteins that become de facto extracellular matrix (ECM) on the surface of the sperm head, and that the dynamic interactions of this newly exposed sperm ECM with the egg ECM (the ZP) govern sperm-egg recognition and sperm penetration of the ZP. Information from these experiments may provide new ways to address the poor ZP binding of sperm from some human infertility patients and may offer new avenues for contraception through the disruption of purposeful sperm-ZP binding.

An Experimental Study on the Optimistic Recognition Level of Public Address System as a Soundscape Application Facility (사운드스케이프 적용을 위한 옥외 P.A. 시스템 적정 인지레벨에 관한 실험적 연구)

Song, Min-Jeong;Jang, Gil-Soo;Shin, Hoon;Shin, Young-Gyu;Lee, Tai-Kang
- Proceedings of the Korean Society for Noise and Vibration Engineering Conference, 2006.05a, pp.726-729, 2006
As an active soundscape facility, a P.A. system is a useful instrument for giving a place identity and vitality by playing music, environmental music, birdsong, and similar sounds. To determine the optimal distance and sound level range for the introduced sound, this study measured sound levels at varying distances and collected subject responses by questionnaire. Subjects recommended levels from 64 dB to 71 dB. The optimal level of the introduced sound is related to the level variance of the sound source. The results of this study could be used for street furniture location design and for setting P.A. system output levels.

Handwritten Korean Word Recognition for Address Recognition (주소 인식 시스템을 위한 필기 한글 단어 인식)

권진욱;이관용;변혜란;이일병
- Proceedings of the Korean Institute of Intelligent Systems Conference, 1997.11a, pp.201-204, 1997
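Recently, research has been under way on recognizing addresses automatically so that tasks such as mail sorting can be carried out effectively. Existing studies perform recognition at the level of individual characters and then generate the final result through a simple dictionary-style DB. However, the performance of recognizers for handwritten characters with complex structure, such as Hangul, is still inadequate. It is therefore considered difficult to obtain satisfactory results with current methods, which depend on the performance of the character recognizer. This paper proposes a system that recognizes words by using the connection information between the characters of words appearing in addresses, without relying heavily on the character recognition results. The system consists of a part that recognizes characters using a statistical recognizer and a part that combines the character recognition results and generates the final result through a word-level recognition process. The statistical recognizer was implemented in a simple form using the nearest neighbor method. The word recognition module builds a lexical information network using an HMM so that the relationships among all characters in a word can be represented, and uses this network to recognize the words appearing in addresses. In experiments using the PE92 Hangul character data, good results were obtained despite the poor performance of the statistical recognizer, because the HMM-based lexical information network compensated for it. This word recognition method is expected to be easily applicable to word sets other than addresses.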
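
As a rough illustration of how character-to-character connection information can compensate for a weak character recognizer, the sketch below learns character bigram transitions from a small lexicon and decodes the most likely word with the Viterbi algorithm. It is a simplified stand-in, not the paper's lexical information network: the lexicon, the per-character recognizer scores, the smoothing constant, and the function names `bigram_model` and `viterbi_decode` are all assumptions made for this example.

```python
import math
from collections import defaultdict
from typing import Dict, List

START = "^"  # hypothetical start-of-word marker


def bigram_model(lexicon: List[str], smoothing: float = 1e-3) -> Dict[str, Dict[str, float]]:
    """Estimate smoothed character transition probabilities from a lexicon."""
    counts = defaultdict(lambda: defaultdict(float))
    for word in lexicon:
        prev = START
        for ch in word:
            counts[prev][ch] += 1.0
            prev = ch
    alphabet = sorted({ch for word in lexicon for ch in word})
    model = {}
    for prev in [START] + alphabet:
        total = sum(counts[prev].values()) + smoothing * len(alphabet)
        model[prev] = {ch: (counts[prev][ch] + smoothing) / total for ch in alphabet}
    return model


def viterbi_decode(char_scores: List[Dict[str, float]], model: Dict[str, Dict[str, float]]) -> str:
    """Return the most likely character sequence.

    char_scores holds one dict per character position, mapping each
    candidate character to the recognizer's probability for it.
    """
    alphabet = list(model[START].keys())
    floor = 1e-6  # probability given to characters the recognizer did not propose
    best = {
        ch: (math.log(model[START][ch]) + math.log(char_scores[0].get(ch, floor)), ch)
        for ch in alphabet
    }
    for scores in char_scores[1:]:
        step = {}
        for ch in alphabet:
            # Pick the predecessor that maximizes path score plus transition.
            prev = max(alphabet, key=lambda p: best[p][0] + math.log(model[p][ch]))
            log_prob, path = best[prev]
            emit = math.log(scores.get(ch, floor))
            step[ch] = (log_prob + math.log(model[prev][ch]) + emit, path + ch)
        best = step
    return max(best.values(), key=lambda item: item[0])[1]


if __name__ == "__main__":
    lexicon = ["서울", "부산", "대구"]  # illustrative address words
    # Made-up recognizer output: taking the top character at each position
    # independently would give "서산", which is not in the lexicon.
    scores = [
        {"서": 0.5, "부": 0.3, "대": 0.2},
        {"산": 0.5, "울": 0.45, "구": 0.05},
    ]
    print(viterbi_decode(scores, bigram_model(lexicon)))
```

Because the transition from "서" to "산" never occurs in the lexicon, the decoder prefers a word whose characters are consistent with each other over the per-position best guesses.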

Search Result 224, Processing Time 0.034 seconds
