• Title/Abstract/Keywords: Recognition Model

3,425 search results (processing time: 0.029 seconds)

Model Inversion Attack: Analysis under Gray-box Scenario on Deep Learning based Face Recognition System

  • Khosravy, Mahdi;Nakamura, Kazuaki;Hirose, Yuki;Nitta, Naoko;Babaguchi, Noboru
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol. 15, No. 3
    • /
    • pp.1100-1118
    • /
    • 2021
  • In a wide range of ML applications, the training data contains privacy-sensitive information that should be kept secure. Training an ML system on privacy-sensitive data ties the model to that data. Because the model has been tuned to the training data, it can be abused to estimate that data through a reverse process called a model inversion attack (MIA). Although MIA has been applied to shallow neural network recognizers in the literature and its threat to privacy has been confirmed, its effectiveness against deep learning (DL) models has been in question. This is due to the complexity of the DL model structure, the large number of model parameters, the huge size of the training data, and the large number of users registered to a DL model and hence the large number of class labels. This work first analyses the feasibility of MIA on a deep learning model of a recognition system, namely a face recognizer. Second, unlike the conventional MIA under the white-box scenario, which has partial access to the users' non-sensitive information in addition to the model structure, the MIA is implemented on a deep face recognition system using only the model structure and parameters, without any user information. In this respect, it operates under a semi-white-box, or gray-box, scenario. Experimental results targeting five registered users of a CNN-based face recognition system confirm that users' face images can be regenerated by MIA under a gray-box scenario even for a deep model. For some images the recognition score is low and the generated images are not easily recognizable, but for others the score is high and the facial features of the targeted identities are observable. The objective and subjective evaluations demonstrate that a privacy attack by MIA on a deep recognition system is not only feasible but also a serious threat that will grow as more advanced ML techniques are integrated into MIA.

Gaussian Optimization of Vocabulary Recognition Clustering Model using Configuration Thread Control

  • 안찬식;오상엽
    • Journal of the Korea Society of Computer and Information
    • /
    • Vol. 15, No. 2
    • /
    • pp.127-134
    • /
    • 2010
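  • In methods that share probability distributions for continuous vocabulary recognition, phoneme data for each context is required to generate initial estimates of the model parameters. However, models cannot be constructed for this phoneme data, so the accuracy of the Gaussian model cannot be ensured. To address this, we optimize the Gaussian mixture model of the probability distributions and propose a configuration thread system that supports searching data at the phoneme level. The proposed system uses extended facet classification to provide users with phoneme-level configuration thread information, which improves the accuracy of the Gaussian model. Applying the proposed system yielded a vocabulary-dependent recognition rate of 98.31% and a vocabulary-independent recognition rate of 97.63%.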

A Study on Fashion Design Cognition Using Eye Tracking

  • 이신영
    • Journal of the Korean Society for Clothing Industry
    • /
    • Vol. 23, No. 3
    • /
    • pp.323-336
    • /
    • 2021
  • This study investigated the cognitive process of fashion design images through eye activity tracking. Differences in the cognitive process and gaze activity according to image elements were confirmed. The results of the study are as follows. First, a difference was found between groups in the gaze time for each section according to the model and design. Although model diversity is an important factor leading the interest of observers, the simplicity of the model was deemed more effective for observing the design. Second, the examination of the differences by segments regarding the gaze weight of the image area showed differences for each group. When a similar type of model is repeated, the proportion of face recognition decreases, and the proportion of design recognition time increases. Conversely, when the model diversity is high, the same amount of time is devoted to recognizing the model's face in all the processes. Additionally, there was a difference in the gaze activity in recognizing the same design according to the type of model. These results enabled the confirmation of the importance of the model as an image recognition factor in fashion design. In the fashion industry, it is important to find a cognitive factor that attracts and retains consumers' attention. If the design recognition effect is further maximized by finding service points to be utilized, the brand's sustainability is expected to be enhanced even in the rapidly changing fashion industry.

A Study on the Automatic Speech Control System Using DMS model on Real-Time Windows Environment

  • 이정기;남동선;양진우;김순협
    • The Journal of the Acoustical Society of Korea
    • /
    • Vol. 19, No. 3
    • /
    • pp.51-56
    • /
    • 2000
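  • This paper studies a real-time automatic Windows control system based on speech recognition. The speech model is the variable DMS model, proposed to increase execution speed, and recognition uses a One-Stage DP algorithm built on it. The recognition vocabulary consists of 66 frequently used Windows control commands. To process speech online, we implemented a speech detection algorithm and proposed a variable DMS model in which the number of sections, fixed when building a conventional DMS (Dynamic Multi Section) model, varies with the duration of the input signal. In addition, because user actions in Windows leave some vocabulary words unnecessary as recognition targets in the current state, we propose reconstructing the set of active models to handle this efficiently. Feature vectors are generated with Perceptual Linear Predictive (PLP) analysis, which accounts for human auditory characteristics and extracts features of the speech itself while excluding speaker-specific characteristics. In the evaluation, the variable DMS model and the conventional DMS model achieved nearly identical recognition rates, but recognition speed improved because the proposed model requires less computation. The multi-speaker independent recognition rate was 99.08% and the multi-speaker dependent recognition rate was 99.39%; in a speaker-independent experiment in a real noisy environment, the recognition rate was 96.25%.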

Feature-Strengthened Gesture Recognition Model Based on Dynamic Time Warping for Multi-Users

  • 이석균;엄현민;권혁태
    • KIPS Transactions on Software and Data Engineering
    • /
    • Vol. 5, No. 10
    • /
    • pp.503-510
    • /
    • 2016
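  • The recently proposed FsGr model is a method for accelerometer-based gesture recognition that improves the recognition rate by applying the DTW algorithm in two stages. The FsGr model defines the concept of a similar-gesture set and generates these sets during training. If the first recognition attempt identifies a gesture for which a similar-gesture set is defined, the feature-strengthened parts of the gestures in that set are extracted and a second recognition attempt is made using DTW. However, the same gesture varies considerably with users' physical characteristics such as body size, age, and gender, which limits the applicability of the FsGr model in a multi-user environment. This paper proposes the FsGrM model, which extends FsGr to multi-user environments, and presents a smart TV channel and volume control program that uses it.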
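
For readers unfamiliar with DTW, the following is a minimal sketch of the textbook dynamic time warping distance and a nearest-template classifier built on it. It is not the FsGr or FsGrM model: the second recognition stage and the similar-gesture sets are not reproduced, and the names `dtw_distance`, `classify`, and the one-dimensional toy signals are assumptions made for illustration only.

```python
from math import inf


def dtw_distance(a, b):
    """Textbook DTW distance between two 1-D sequences (absolute difference cost)."""
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def classify(sample, templates):
    """Return the label of the template with the smallest DTW distance (hypothetical helper)."""
    return min(templates, key=lambda label: dtw_distance(sample, templates[label]))


# Toy usage: the signals below are made up for illustration, not accelerometer data.
templates = {"up": [0, 1, 2, 3, 4], "bump": [0, 2, 0]}
label = classify([0, 1, 2, 1, 0], templates)
```

Because DTW allows one sample to align with several samples of the other sequence, two executions of the same gesture at different speeds can still receive a small distance.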

Speech Recognition in Car Noise Environments Using Multiple Models Based on a Hybrid Method of Spectral Subtraction and Residual Noise Masking

  • Song, Myung-Gyu;Jung, Hoi-In;Shim, Kab-Jong;Kim, Hyung-Soon
    • The Journal of the Acoustical Society of Korea
    • /
    • Vol. 18, No. 3E
    • /
    • pp.3-8
    • /
    • 1999
  • In speech recognition for real-world applications, the performance degradation caused by the mismatch between training and testing environments must be overcome. In this paper, to reduce this mismatch, we provide a hybrid method of spectral subtraction and residual noise masking. We also employ a multiple-model approach to improve robustness across various noise environments. In this approach, multiple model sets are built for several noise masking levels, and the model set appropriate for the estimated noise level is selected automatically in the recognition phase. In speaker-independent isolated-word recognition experiments in car noise environments, the proposed method using model sets with only two masking levels reduced the average word error rate by 60% compared with the spectral subtraction method.
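
As background for the abstract above, here is a minimal sketch of plain magnitude-domain spectral subtraction with a spectral floor. It does not reproduce the paper's residual noise masking or its multiple-model selection; the function name `spectral_subtraction` and the parameters `alpha` (over-subtraction factor) and `floor` are assumptions chosen for illustration.

```python
def spectral_subtraction(noisy_mag, noise_mag, alpha=1.0, floor=0.01):
    """Subtract an estimated noise magnitude spectrum from a noisy one, bin by bin.

    alpha and floor are illustrative parameters, not values from the paper.
    """
    if len(noisy_mag) != len(noise_mag):
        raise ValueError("spectra must have the same number of bins")
    cleaned = []
    for y, n in zip(noisy_mag, noise_mag):
        diff = y - alpha * n
        # Spectral floor: never let a bin drop below a small fraction of the
        # noisy magnitude, which also prevents negative magnitudes.
        cleaned.append(max(diff, floor * y))
    return cleaned


# Toy usage with made-up magnitudes for three frequency bins.
result = spectral_subtraction([1.0, 0.5, 0.2], [0.2, 0.2, 0.2])
```

In the last bin the noise estimate equals the observed magnitude, so the floor takes over instead of returning zero.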

Dynamic Human Activity Recognition Based on Improved FNN Model

  • Xu, Wenkai;Lee, Eung-Joo
    • Journal of Korea Multimedia Society
    • /
    • Vol. 15, No. 4
    • /
    • pp.417-424
    • /
    • 2012
  • In this paper, we propose an automatic system that recognizes dynamic human gesture activity, including the Arabic numerals 0 to 9. We assume the gesture trajectory lies almost in a plane, called the principal gesture plane; the least squares method is then used to estimate the plane, and the 3-D trajectory model is projected onto it. An improved FNN model combined with an HMM is proposed for dynamic gesture recognition, combining the HMM's ability to model temporal data with that of a fuzzy neural network. The proposed algorithm shows satisfactory performance and a high recognition rate.
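
The plane estimation and projection step can be sketched as follows. This is one simple way to do a least-squares plane fit, not necessarily the authors' formulation: it assumes the plane can be written as z = a*x + b*y + c (so it is not parallel to the z axis), and the names `fit_plane` and `project_onto_plane` are hypothetical.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def fit_plane(points):
    """Least-squares fit of z = a*x + b*y + c to 3-D points; returns (a, b, c)."""
    n = float(len(points))
    sx = sum(x for x, _, _ in points)
    sy = sum(y for _, y, _ in points)
    sz = sum(z for _, _, z in points)
    sxx = sum(x * x for x, _, _ in points)
    syy = sum(y * y for _, y, _ in points)
    sxy = sum(x * y for x, y, _ in points)
    sxz = sum(x * z for x, _, z in points)
    syz = sum(y * z for _, y, z in points)
    # Normal equations: m * [a, b, c]^T = v
    m = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, n]]
    v = [sxz, syz, sz]
    d = det3(m)
    if abs(d) < 1e-12:
        raise ValueError("points do not determine a unique plane of this form")
    coeffs = []
    for k in range(3):
        # Cramer's rule: replace column k with the right-hand side
        mk = [row[:] for row in m]
        for r in range(3):
            mk[r][k] = v[r]
        coeffs.append(det3(mk) / d)
    return tuple(coeffs)


def project_onto_plane(point, plane):
    """Orthogonally project a 3-D point onto the plane z = a*x + b*y + c."""
    x, y, z = point
    a, b, c = plane
    # Signed offset along the (unnormalized) normal vector (a, b, -1)
    t = (a * x + b * y - z + c) / (a * a + b * b + 1.0)
    return (x - t * a, y - t * b, z + t)
```

A fit that minimizes orthogonal distance (for example via principal component analysis) would also handle planes parallel to the z axis; the form above is used only because it keeps the sketch short.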

Pose-invariant Face Recognition using Cylindrical Model and Stereo Camera

  • 노진우;안병두;;고한석
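    • Institute of Electronics Engineers of Korea: Conference Proceedings
    • /
    • Proceedings of the Institute of Electronics Engineers of Korea 2003 Summer Conference, Vol. IV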
    • /
    • pp.2012-2015
    • /
    • 2003
  • This paper proposes a pose-invariant face recognition method using a cylindrical model and a stereo camera. The paper is divided into two parts: the single input image case and the stereo input image case. In the single-image case, we normalize the face's yaw pose using the cylindrical model; in the stereo case, we normalize the face's pitch pose using the cylindrical model together with the object's pitch pose estimated by stereo geometry. In addition, because two images acquired at the same time are available, the overall recognition rate can be increased by decision-level fusion. Experiments confirmed that our methods increase the recognition rate.

Performance Evaluation of Nonkeyword Modeling and Postprocessing for Vocabulary-independent Keyword Spotting

  • 김형순;김영국;신영욱
    • Speech Sciences
    • /
    • Vol. 10, No. 3
    • /
    • pp.225-239
    • /
    • 2003
  • In this paper, we develop a keyword spotting system using vocabulary-independent speech recognition technique, and investigate several non-keyword modeling and post-processing methods to improve its performance. In order to model non-keyword speech segments, monophone clustering and Gaussian Mixture Model (GMM) are considered. We employ likelihood ratio scoring method for the post-processing schemes to verify the recognition results, and filler models, anti-subword models and N-best decoding results are considered as an alternative hypothesis for likelihood ratio scoring. We also examine different methods to construct anti-subword models. We evaluate the performance of our system on the automatic telephone exchange service task. The results show that GMM-based non-keyword modeling yields better performance than that using monophone clustering. According to the post-processing experiment, the method using anti-keyword model based on Kullback-Leibler distance and N-best decoding method show better performance than other methods, and we could reduce more than 50% of keyword recognition errors with keyword rejection rate of 5%.

An Intelligent Emotion Recognition Model Using Facial and Bodily Expressions

  • Jae Kyeong Kim;Won Kuk Park;Il Young Choi
    • Asia pacific journal of information systems
    • /
    • Vol. 27, No. 1
    • /
    • pp.38-53
    • /
    • 2017
  • As sensor and image processing technologies make it easy to collect information on users' behavior, many researchers have examined automatic emotion recognition based on facial expressions, body expressions, and tone of voice, among others. Specifically, many studies have used normal cameras in the multimodal case using facial and body expressions. Thus, previous studies used a limited amount of information because normal cameras generally produce only two-dimensional images. In the present research, we propose an artificial neural network-based model using a high-definition webcam and Kinect to recognize users' emotions from facial and bodily expressions when watching a movie trailer. We validate the proposed model in a naturally occurring field environment rather than in an artificially controlled laboratory environment. The results of this research will support the wide use of emotion recognition models in advertisements, exhibitions, and interactive shows.