• Title/Abstract/Keywords: unseen model

Search results: 40 items (processing time: 0.022 s)

가변어휘 음성인식기 구현에 관한 연구 (A Study on the Implementation of a Vocabulary-Independent Korean Speech Recognizer)

  • 황병한
    • 한국음향학회:학술대회논문집 / 한국음향학회 1998 Conference Proceedings, Vol. 5 / pp. 60-63 / 1998
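  • This paper describes a vocabulary-independent recognition system in which users can add or change the target vocabulary without any additional training. In vocabulary-independent speech recognition, once the target vocabulary is decided, a word model is built for each word by concatenating the pre-trained phoneme models that correspond to it according to a pronunciation dictionary. The phoneme models are triphones, context-dependent models that take into account the phonemes preceding and following the current phoneme, and an isolated-word recognition system based on continuous-density Hidden Markov Models (HMMs) was implemented. For comparison, recognition experiments were also run with monophones, which are context-independent phoneme models. The system uses MFCCs (Mel Frequency Cepstrum Coefficients) as speech feature vectors. To handle unseen triphones (triphones encountered at test time that were not observed in training), it adopts tree-based clustering, a state-tying method grounded in phonetic knowledge. The phoneme models were trained on the POW (Phonetically Optimized Words) speech database (DB) [1] built by ETRI. The vocabulary-independent recognition experiments used an isolated-word department-name DB [2] of 1,100 utterances, in which 50 speakers each pronounced 22 department names unrelated to the POW DB. In the recognition experiments, the context-independent phoneme models achieved 88.6%, whereas the context-dependent phoneme models achieved a better 96.2%.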
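
The word-model construction described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the lexicon entry, the set of trained triphones, and the function names are hypothetical, and the fallback to the centre monophone is a simple stand-in for the tree-based state tying the paper uses for unseen triphones.

```python
# Hypothetical pronunciation dictionary and set of triphones seen in training.
LEXICON = {"insa": ["i", "n", "s", "a"]}
TRAINED = {"sil-i+n", "i-n+s"}


def to_triphones(phones, boundary="sil"):
    """Expand a phone sequence into left-centre+right triphone labels."""
    padded = [boundary] + list(phones) + [boundary]
    return [
        f"{padded[i - 1]}-{padded[i]}+{padded[i + 1]}"
        for i in range(1, len(padded) - 1)
    ]


def word_model(word, lexicon=LEXICON, trained=TRAINED):
    """Return the sequence of phoneme-model names that make up a word model."""
    units = []
    for tri in to_triphones(lexicon[word]):
        if tri in trained:
            units.append(tri)
        else:
            # Assumption: back off to the centre monophone for unseen triphones.
            # A real system would look up a tied state via the clustering tree.
            units.append(tri.split("-")[1].split("+")[0])
    return units


print(word_model("insa"))
```

Adding a new word under this sketch only requires a new lexicon entry; no acoustic retraining is involved, which is the property the abstract highlights.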

오픈 월드 객체 감지의 현재 트렌드에 대한 리뷰 (Unveiling the Unseen: A Review on current trends in Open-World Object Detection)

  • 이크발 무하마드 알리;김수균
    • 한국컴퓨터정보학회:학술대회논문집 / 한국컴퓨터정보학회 69th Winter Conference Proceedings (2024), Vol. 32, No. 1 / pp. 335-337 / 2024
  • This paper presents a new open-world object detection method emphasizing uncertainty representation in machine learning models. The focus is on adapting to real-world uncertainties by incrementally updating the model's knowledge repository for dynamic scenarios. Applications such as autonomous vehicles benefit from improved multi-class classification accuracy. The paper reviews challenges in existing methodologies, stressing the need for universal detectors capable of handling unknown classes. Proposed future directions include collaboration and the integration of language models to improve the adaptability and applicability of open-world object detection.

Model for Mobile Online Video viewed on Samsung Galaxy Note 5

  • Pal, Debajyoti;Vanijja, Vajirasak
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 11, No. 11 / pp. 5392-5418 / 2017
  • The primary aim of this paper is to propose a non-linear regression based technique for mapping different network Quality of Service (QoS) factors to an integrated end-user Quality of Experience (QoE) or Mean Opinion Score (MOS) value for an online video streaming service on a mobile phone. We use six network QoS factors to determine the user QoE. The contribution of this paper is threefold. First, we investigate the impact of the network QoS factors on the perceived video quality. Next, we individually map each significant network QoS parameter identified in the first stage to the user QoE using a non-linear regression method. The optimal QoS-to-QoE mapping function is chosen based on a decision variable. In the final stage, we evaluate the integrated QoE of the system by taking the combined effect of all the QoS factors considered. Extensive subjective tests involving over 50 people, across a wide variety of video content encoded with the H.265/HEVC and VP9 codecs, were conducted to gather the actual MOS data for the QoS-to-QoE mapping. Our proposed hybrid model has been validated against unseen data and shows good prediction accuracy.

Enhancing Wind Speed and Wind Power Forecasting Using Shape-Wise Feature Engineering: A Novel Approach for Improved Accuracy and Robustness

  • Mulomba Mukendi Christian;Yun Seon Kim;Hyebong Choi;Jaeyoung Lee;SongHee You
    • International Journal of Advanced Culture Technology / Vol. 11, No. 4 / pp. 393-405 / 2023
  • Accurate prediction of wind speed and power is vital for enhancing the efficiency of wind energy systems. Numerous solutions have been implemented to date, demonstrating their potential to improve forecasting. Among these, deep learning is perceived as a revolutionary approach in the field. However, despite its effectiveness, noise in the collected data remains a significant challenge. This noise can diminish the performance of these algorithms, leading to inaccurate predictions. In response, this study explores a novel feature engineering approach that alters the data input shape in both Convolutional Neural Network-Long Short-Term Memory (CNN-LSTM) and Autoregressive models for various forecasting horizons. The results reveal substantial enhancements in model resilience against noise resulting from step increases in data. The approach could achieve 83% accuracy in predicting unseen data up to the 24th step. Furthermore, the method consistently provides high accuracy for short-, mid-, and long-term forecasts, outperforming individual models. These findings pave the way for further research on noise reduction strategies at different forecasting horizons through shape-wise feature engineering.

언어-기반 제로-샷 물체 목표 탐색 이동 작업들을 위한 인공지능 기저 모델들의 활용 (Utilizing AI Foundation Models for Language-Driven Zero-Shot Object Navigation Tasks)

  • 최정현;백호준;박찬솔;김인철
    • 로봇학회논문지 / Vol. 19, No. 3 / pp. 293-310 / 2024
  • In this paper, we propose an agent model for Language-Driven Zero-Shot Object Navigation (L-ZSON) tasks, which takes in a freeform language description of an unseen target object and navigates to find the target object in a previously unexplored environment. In general, an L-ZSON agent should be able to visually ground the target object by understanding its freeform language description and recognizing the corresponding visual object in camera images. Moreover, the L-ZSON agent should also be able to build a rich spatial context map of the unknown environment and decide on efficient exploration actions based on the map until the target object is in the field of view. To address these challenging issues, we propose AML (Agent Model for L-ZSON), a novel L-ZSON agent model that makes effective use of AI foundation models such as a Large Language Model (LLM) and a Vision-Language Model (VLM). To tackle the visual grounding of the target object description, our agent model employs GLEE, a VLM pretrained for locating and identifying arbitrary objects in images and videos in the open-world scenario. To address the exploration policy, the proposed agent model leverages the commonsense knowledge of an LLM to make sequential navigational decisions. Through various quantitative and qualitative experiments with RoboTHOR, a 3D simulation platform, and PASTURE, an L-ZSON benchmark dataset, we show the superior performance of the proposed agent model.

시계열 교차검증을 적용한 2,3-BDO 분리공정 온도예측 모델의 초매개변수 최적화 (Application of Time-series Cross Validation in Hyperparameter Tuning of a Predictive Model for 2,3-BDO Distillation Process)

  • 안나현;최영렬;조형태;김정환
    • Korean Chemical Engineering Research / Vol. 59, No. 4 / pp. 532-541 / 2021
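  • As interest in artificial intelligence has grown, research applying it in the chemical process field has also increased. However, overfitting occurs frequently: an AI-based model that is not sufficiently generalized predicts poorly on new data that was not used for training. Cross-validation is one way to address overfitting. In this study, time-series cross-validation was applied to tune the number of batches and the number of iterations, two hyperparameters of a temperature prediction model for the 2,3-BDO separation process, and was compared with the commonly used K-fold cross-validation. Compared with K-fold cross-validation, time-series cross-validation increased MAPE by 0.61% but reduced RMSE by 9.06% and required 198.29 seconds less training time.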
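
The difference from K-fold cross-validation is that time-series cross-validation never trains on samples that come after the validation samples. The sketch below shows one common variant, an expanding training window. The abstract does not state the exact splitting scheme used in the paper, so the function name and the fold sizing here are assumptions.

```python
def time_series_splits(n_samples, n_splits):
    """Yield (train_idx, test_idx) pairs with an expanding training window.

    Each test fold lies strictly after its training window, so no future
    samples leak into training. Samples left over after the last fold
    (when n_samples is not divisible by n_splits + 1) are not used.
    """
    fold = n_samples // (n_splits + 1)
    if fold == 0:
        raise ValueError("not enough samples for the requested number of splits")
    for k in range(1, n_splits + 1):
        train_end = k * fold
        yield list(range(train_end)), list(range(train_end, train_end + fold))


for train_idx, test_idx in time_series_splits(10, 4):
    print(train_idx, test_idx)
```

In a tuning loop, each candidate hyperparameter setting would be scored on every split and the scores averaged, just as with K-fold cross-validation.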

Dental age estimation using the pulp-to-tooth ratio in canines by neural networks

  • Farhadian, Maryam;Salemi, Fatemeh;Saati, Samira;Nafisi, Nika
    • Imaging Science in Dentistry / Vol. 49, No. 1 / pp. 19-26 / 2019
  • Purpose: It has been proposed that using new prediction methods, such as neural networks based on dental data, could improve age estimation. This study aimed to assess the possibility of exploiting neural networks for estimating age by means of the pulp-to-tooth ratio in canines as a non-destructive, inexpensive, and accurate method. In addition, the predictive performance of neural networks was compared with that of a linear regression model. Materials and Methods: Three hundred subjects, aged 14 to 60 years and well distributed among various age groups, were included in the study. Two statistical software programs, SPSS 21 (IBM Corp., Armonk, NY, USA) and R, were used for statistical analyses. Results: The results indicated that the neural network model generally performed better than the regression model for estimation of age with pulp-to-tooth ratio data. The prediction errors of the developed neural network model were acceptable, with a root mean square error (RMSE) of 4.40 years and a mean absolute error (MAE) of 4.12 years for the unseen dataset. The prediction errors of the regression model were higher than those of the neural network, with an RMSE of 10.26 years and an MAE of 8.17 years for the test dataset. Conclusion: The neural network method showed relatively acceptable performance, with an MAE of 4.12 years. The application of neural networks creates new opportunities to obtain more accurate estimations of age in forensic research.
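
The two error measures reported in this abstract can be computed as in the short sketch below. The age values are made up for illustration and are not data from the study.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error: average magnitude of the prediction errors."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error: penalizes large errors more heavily than MAE."""
    return math.sqrt(
        sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    )


# Hypothetical true and predicted ages in years (illustrative only).
ages_true = [20, 30, 40]
ages_pred = [23, 26, 40]
print(mae(ages_true, ages_pred), rmse(ages_true, ages_pred))
```

RMSE is never smaller than MAE on the same predictions, which is consistent with the RMSE/MAE pairs quoted above.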

적응형 깊이 추정기를 이용한 미지 물체의 자세 예측 (Predicting Unseen Object Pose with an Adaptive Depth Estimator)

  • 송성호;김인철
    • 정보처리학회논문지:소프트웨어 및 데이터공학 / Vol. 11, No. 12 / pp. 509-516 / 2022
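  • Accurate pose prediction of objects in 3D space is an important visual recognition technology widely used in many applications, such as scene understanding in indoor and outdoor environments, robotic object manipulation, autonomous driving, and augmented reality. Most previous work on object pose prediction has the limitation of requiring an accurate 3D CAD model for each target object. In contrast, this paper proposes a new neural network model that can predict the poses of unseen objects using only RGB color images, without 3D CAD models. Using AdaBins, an adaptive depth estimator, the proposed model can itself effectively estimate the depth map of each object, which is needed for unseen object pose prediction. The usefulness and performance of the proposed model are evaluated through various experiments on benchmark datasets.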

PERIOD VARIATIONS OF RT PERSEI

  • Kim, Chun-Hwey
    • Journal of Astronomy and Space Sciences / Vol. 12, No. 2 / pp. 179-195 / 1995
  • RT Per is a close binary whose orbital period has so far varied unpredictably. Although there is no agreement on the mechanism behind the period changes, two interpretations have been suggested and await testing: 1) light-time effects due to unseen 3rd and 4th bodies (Panchatsaram 1981), and 2) abrupt period changes due to internal variations of the system (e.g. mass transfer or mass loss) superimposed on the light-time effect of a 3rd body (Frieboes-Conde & Herczeg 1973). Because the former interpretation can predict the behavior of the orbital period theoretically, we checked whether the recently observed times of minimum light follow the predictions of the first model. We confirmed that the observed times of minimum light have followed the variations calculated from the light-time effects due to the 3rd and 4th bodies suggested by Panchatsaram. In this paper, a total of 626 times of minimum light were reanalyzed in terms of the light-time effects of the 3rd and 4th bodies. We concluded that the eclipsing pair in SVCam system moves in an elliptic orbit about the center of mass of the triple system with a period of about 42.2 yr, while the mass center of the triple is in a light-time orbit about the center of mass of the quadruple system with a period of 120 yr. The mean masses deduced for the 3rd and 4th bodies were $0.89m_\odot$ and $0.82m_\odot$, respectively.
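
For context, the light-time effect of an unseen companion is usually modelled with the standard expression below. The abstract does not give the formula or its symbols, so the notation is an assumption: $\tau$ is the delay in the observed time of minimum, $a_{12}\sin i$ is the projected semi-major axis of the eclipsing pair's orbit about the common center of mass, $c$ is the speed of light, $e$ is the eccentricity, $\nu$ is the true anomaly, and $\omega$ is the longitude of periastron.

```latex
\tau = \frac{a_{12}\sin i}{c}
       \left[ \frac{1 - e^{2}}{1 + e\cos\nu}\,\sin(\nu + \omega) + e\sin\omega \right]
```

With two unseen bodies, as in the first interpretation above, the observed timing residuals would be modelled as the sum of two such terms, one per orbit.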

잠재변수 모형에서의 군집효율을 이용한 변수선택 (Variable selection for latent class analysis using clustering efficiency)

  • 김성경;서병태
    • 응용통계연구 / Vol. 31, No. 6 / pp. 721-732 / 2018
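  • The latent class model is one of the most important tools for finding hidden groups in multivariate categorical data. In real data analysis, however, a model that includes too many observed variables becomes complex and its parameter estimates become less accurate, so finding useful variables without losing information is an important problem. Dean and Raftery (2010) proposed a headlong search algorithm using BIC for variable selection in latent class models. As an alternative, this paper proposes a variable selection method that uses the posterior probabilities of belonging to each latent class, computed from the fitted model. To this end, we present a new statistic that measures the fit of a latent class model and a variable selection algorithm based on it. We also examine the efficiency of the proposed method through simulation studies and real data analysis.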