• Title/Abstract/Keywords: Training Data Size

416 search results (processing time: 0.025 s)

A Survey of Applications of Artificial Intelligence Algorithms in Eco-environmental Modelling

  • Kim, Kang-Suk; Park, Joon-Hong
    • Environmental Engineering Research / Vol. 14, No. 2 / pp. 102-110 / 2009
  • The application of artificial intelligence (AI) approaches in eco-environmental modeling has gradually increased over the last decade. A comprehensive understanding and evaluation of the applicability of this approach to eco-environmental modeling is needed. In this study, we reviewed previous studies that used AI techniques in eco-environmental modeling. Decision Tree (DT) and Artificial Neural Network (ANN) were found to be the major AI algorithms preferred by researchers in ecological and environmental modeling. When the effect of training data size on model prediction accuracy was explored using data from the previous studies, prediction accuracy and training data size showed a nonlinear correlation, which was best described by a hyperbolic saturation function among the tested nonlinear functions, including power and logarithmic functions. The hyperbolic saturation equations are proposed as a guideline for optimizing the size of the training data set, which is critically important in designing the field experiments required to train AI-based eco-environmental models.
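The hyperbolic saturation relationship mentioned in this abstract can be illustrated with a small sketch. The abstract does not give the equation or its coefficients, so the form `accuracy = a_max * n / (k_half + n)`, the names `a_max` and `k_half`, and the synthetic data below are all assumptions made for illustration. The fit uses a double-reciprocal linearization, which is one simple option and not necessarily the procedure the authors used.

```python
def fit_hyperbolic_saturation(sizes, accuracies):
    """Fit accuracy = a_max * n / (k_half + n) by ordinary least squares on the
    double-reciprocal form 1/accuracy = 1/a_max + (k_half / a_max) * (1/n)."""
    xs = [1.0 / n for n in sizes]
    ys = [1.0 / a for a in accuracies]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                   # equals k_half / a_max
    intercept = mean_y - slope * mean_x  # equals 1 / a_max
    a_max = 1.0 / intercept
    k_half = slope * a_max
    return a_max, k_half


def size_for_accuracy(target, a_max, k_half):
    """Training size at which the fitted curve reaches `target` (must be < a_max)."""
    if not 0.0 < target < a_max:
        raise ValueError("target must lie strictly between 0 and a_max")
    return k_half * target / (a_max - target)


# Synthetic, noise-free data generated from assumed values a_max=0.95, k_half=40.
sizes = [10, 20, 50, 100, 200, 500]
accuracies = [0.95 * n / (40 + n) for n in sizes]

a_max, k_half = fit_hyperbolic_saturation(sizes, accuracies)
print(f"a_max={a_max:.3f}, k_half={k_half:.1f}")
print(f"size for 0.90 accuracy: {size_for_accuracy(0.90, a_max, k_half):.0f}")
```

With noisy measurements the reciprocal transform gives heavy weight to small training sizes, so a direct nonlinear least-squares fit may be preferable in practice.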

Speaker Verification with the Constraint of Limited Data

  • Kumari, Thyamagondlu Renukamurthy Jayanthi; Jayanna, Haradagere Siddaramaiah
    • Journal of Information Processing Systems / Vol. 14, No. 4 / pp. 807-823 / 2018
  • Speaker verification system performance depends on the utterance of each speaker. To verify the speaker, important information has to be captured from the utterance. Under the constraint of limited data, speaker verification becomes a challenging task, because the training and testing data last only a few seconds. The feature vectors extracted by single frame size and rate (SFSR) analysis are not sufficient for training and testing speakers in speaker verification. This leads to poor speaker modeling during training and may not yield good decisions during testing. The problem can be addressed by increasing the number of feature vectors extracted from training and testing data of the same duration. To this end, we use multiple frame size (MFS), multiple frame rate (MFR), and multiple frame size and rate (MFSR) analysis techniques for speaker verification under limited data conditions. These analysis techniques extract relatively more feature vectors during training and testing and yield improved modeling and testing for limited data. To demonstrate this, we use mel-frequency cepstral coefficients (MFCC) and linear prediction cepstral coefficients (LPCC) as features. A Gaussian mixture model (GMM) and a GMM-universal background model (GMM-UBM) are used for modeling the speaker. The database used is NIST-2003. The experimental results indicate that MFS, MFR, and MFSR analysis perform markedly better than SFSR analysis. They also show that LPCC-based MFSR analysis performs better than the other analysis techniques and feature extraction techniques.
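To make the idea of "more feature vectors from the same duration" concrete, the sketch below counts how many analysis frames a short utterance yields under each scheme. The abstract does not state the frame sizes, frame shifts, or sampling rate used, so the 8 kHz rate, the 3-second utterance, and the millisecond values here are illustrative assumptions only.

```python
from itertools import product


def count_frames(num_samples, frame_len, hop):
    """Number of complete frames of length frame_len taken every hop samples."""
    if num_samples < frame_len:
        return 0
    return 1 + (num_samples - frame_len) // hop


def total_feature_vectors(num_samples, sample_rate, sizes_ms, shifts_ms):
    """Total frames (one feature vector per frame) over every size/shift pair."""
    total = 0
    for size_ms, shift_ms in product(sizes_ms, shifts_ms):
        frame_len = sample_rate * size_ms // 1000
        hop = sample_rate * shift_ms // 1000
        total += count_frames(num_samples, frame_len, hop)
    return total


SAMPLE_RATE = 8000             # assumed sampling rate in Hz
num_samples = 3 * SAMPLE_RATE  # assumed 3-second utterance

# Assumed settings: sizes and shifts in milliseconds, chosen for illustration.
sfsr = total_feature_vectors(num_samples, SAMPLE_RATE, [20], [10])
mfs = total_feature_vectors(num_samples, SAMPLE_RATE, [10, 15, 20], [10])
mfr = total_feature_vectors(num_samples, SAMPLE_RATE, [20], [5, 10])
mfsr = total_feature_vectors(num_samples, SAMPLE_RATE, [10, 15, 20], [5, 10])

print(f"SFSR={sfsr}, MFS={mfs}, MFR={mfr}, MFSR={mfsr}")
```

Under these assumed settings MFSR produces the most frames, which matches the abstract's claim that the multi-frame techniques extract more feature vectors than SFSR. The sketch only counts frames and does not compute MFCC or LPCC features.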

A Meta-analysis of the Effects of Cardiopulmonary Resuscitation Training

  • 유순규; 이지은
    • 한국응급구조학회지 / Vol. 21, No. 1 / pp. 17-44 / 2017
  • Purpose: This study aimed to identify the effects of cardiopulmonary resuscitation (CPR) training using a meta-analysis by effect size. Methods: The effect sizes for each variable and the overall effect size for the collected data were identified. The homogeneity of the effect sizes and the differences among the average effect sizes for each mediation variable were examined. Results: The overall average effect size for CPR training was 1.747. The homogeneity test of the overall effect size yielded a Q-value of 3716.962, which was statistically significant (p=.000) at $\alpha = .05$. CPR training showed statistically significant differences depending on age (p=.002), sex (p=.006), number of trainees (p=.000), research design (p=.000), training method (p=.027), and practical training tools (p=.000). Conclusion: CPR training can effectively improve knowledge, skills, and attitudes about CPR. The results of this meta-analysis contribute to the development of more effective educational guidelines for future CPR training and the advancement of the CPR education field.
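For readers unfamiliar with the Q statistic reported above, the following sketch shows how a pooled effect size and Cochran's Q are commonly computed under a fixed-effect, inverse-variance model. The abstract does not say which pooling model the authors used, and the effect sizes and variances below are hypothetical values, not data from the study.

```python
def pooled_effect_and_q(effects, variances):
    """Inverse-variance pooled effect size and Cochran's Q homogeneity statistic.

    Returns (pooled_effect, q, degrees_of_freedom). Under homogeneity, Q follows
    approximately a chi-square distribution with k - 1 degrees of freedom.
    """
    weights = [1.0 / v for v in variances]
    total_weight = sum(weights)
    pooled = sum(w * d for w, d in zip(weights, effects)) / total_weight
    q = sum(w * (d - pooled) ** 2 for w, d in zip(weights, effects))
    return pooled, q, len(effects) - 1


# Hypothetical per-study effect sizes and their sampling variances.
effects = [1.2, 2.1, 0.8, 1.9]
variances = [0.04, 0.09, 0.05, 0.06]

pooled, q, df = pooled_effect_and_q(effects, variances)
print(f"pooled effect={pooled:.3f}, Q={q:.2f}, df={df}")
```

A Q value that is large relative to its degrees of freedom, as in the abstract, suggests the studies are heterogeneous, which is why the authors go on to compare subgroups such as age, sex, and training method.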

The Effect of the Number of Training Data on Speech Recognition

  • Lee, Chang-Young
    • The Journal of the Acoustical Society of Korea / Vol. 28, No. 2E / pp. 66-71 / 2009
  • In practical applications of speech recognition, one of the fundamental questions is how much training data should be provided for a specific task. Although plenty of training data would undoubtedly enhance system performance, it comes at a heavy cost. It is therefore crucial to determine the minimum amount of training data that affords a given level of accuracy. For this purpose, we investigate the effect of the number of training data on speaker-independent recognition of isolated words using FVQ/HMM. The results show that the error rate is roughly inversely proportional to the number of training data and grows linearly with the vocabulary size.
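The two trends reported in this abstract can be combined into a rough planning rule. The sketch below assumes the product form `error ≈ c * V / N`, where `V` is the vocabulary size, `N` the number of training data, and `c` a task-dependent constant. That combined form, the constant `c`, and the pilot numbers are assumptions for illustration and are not given in the abstract.

```python
def calibrate_constant(error_rate, num_training, vocab_size):
    """Estimate c in error ≈ c * V / N from a single pilot measurement."""
    return error_rate * num_training / vocab_size


def predicted_error(c, num_training, vocab_size):
    """Predicted error rate under the assumed model error ≈ c * V / N."""
    return c * vocab_size / num_training


def required_training_count(c, vocab_size, target_error):
    """Training count at which the assumed model reaches target_error."""
    return c * vocab_size / target_error


# Hypothetical pilot run: 4% error with 100 training data and a 50-word vocabulary.
c = calibrate_constant(error_rate=0.04, num_training=100, vocab_size=50)

# How much data would a 100-word vocabulary need for 2% error under this model?
needed = required_training_count(c, vocab_size=100, target_error=0.02)
print(f"c={c:.3f}, estimated training count={needed:.0f}")
```

Because the abstract describes the relationship as only roughly inverse, an estimate like this is a starting point for budgeting data collection rather than a guarantee.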

Domain-Adaptation Technique for Semantic Role Labeling with Structural Learning

  • Lim, Soojong; Lee, Changki; Ryu, Pum-Mo; Kim, Hyunki; Park, Sang Kyu; Ra, Dongyul
    • ETRI Journal / Vol. 36, No. 3 / pp. 429-438 / 2014
  • Semantic role labeling (SRL) is a natural-language processing task that aims to detect predicates in text, choose their correct senses, identify their associated arguments, and predict the semantic roles of those arguments. Developing a high-performance SRL system for a domain requires a large amount of manually annotated training data from the same domain. However, SRL training data of sufficient size is available for only a few domains, and constructing SRL training data for a new domain is very expensive. Domain adaptation in SRL can therefore be regarded as an important problem. In this paper, we show that domain adaptation for SRL systems can achieve state-of-the-art performance when it is based on structural learning and exploits a prior model approach. We provide experimental results for three different target domains showing that our method is effective even when only a small amount of training data is available for the target domain. In our experiments, the proposed method outperforms those of other works by about 2% to 5% in F-score.

Performance Improvement of Nearest-neighbor Classification Learning through Prototype Selections

  • 황두성
    • 전자공학회논문지CI / Vol. 49, No. 2 / pp. 53-60 / 2012
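  • In nearest-neighbor classification, the class of an input is predicted as the most frequent class among the selected neighboring training data. Nearest-neighbor learning has no training phase, but because all prepared data take part in prediction, generalization performance depends on the quality of the training data. Consequently, as the training data grow, more memory and longer computation time at prediction are required. This paper proposes a prototype selection algorithm that performs classification with a new training set composed of training data located on the separating boundary. The proposed algorithm selects data located in the boundary region using Tomek links and distances, and decides whether to add a candidate to the prototype set by analyzing its class and distance relationships with the data already selected. In the experiments, the selected prototypes formed a smaller set than the original training data, which can reduce storage and speed up prediction when nearest-neighbor classification is applied.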
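The abstract relies on Tomek links to find data near the class boundary. A Tomek link is a pair of points from different classes that are each other's nearest neighbor. The sketch below shows only that detection step on a hypothetical one-dimensional layout; it is not the paper's full prototype selection algorithm, whose distance-based criteria are not described in the abstract.

```python
import math


def nearest_index(points, i):
    """Index of the point closest to points[i], excluding i itself."""
    best_j, best_d = None, math.inf
    for j, p in enumerate(points):
        if j == i:
            continue
        d = math.dist(points[i], p)
        if d < best_d:
            best_j, best_d = j, d
    return best_j


def tomek_links(points, labels):
    """Return index pairs (i, j), i < j, that are mutual nearest neighbors
    and carry different class labels."""
    links = []
    for i in range(len(points)):
        j = nearest_index(points, i)
        if (j is not None and i < j and labels[i] != labels[j]
                and nearest_index(points, j) == i):
            links.append((i, j))
    return links


# Hypothetical data: two classes laid out along a line, meeting near x = 2.25.
points = [(0.0, 0.0), (1.2, 0.0), (2.0, 0.0), (2.5, 0.0), (4.0, 0.0), (5.0, 0.0)]
labels = [0, 0, 0, 1, 1, 1]

links = tomek_links(points, labels)
boundary = sorted({idx for pair in links for idx in pair})
print("Tomek links:", links)
print("boundary candidates:", boundary)
```

In this toy layout only the two points facing each other across the class boundary form a link, which is the kind of data the proposed algorithm keeps as prototypes. The brute-force neighbor search is quadratic in the number of points and is meant for illustration only.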

Study on the Effect of Discrepancy of Training Sample Population in Neural Network Classification

  • Lee, Sang-Hoon; Kim, Kwang-Eun
    • 대한원격탐사학회지 / Vol. 18, No. 3 / pp. 155-162 / 2002
  • Neural networks have attracted attention as robust classifiers for remotely sensed imagery because of their statistical independence and learning ability. Artificial neural networks have also been reported to be more tolerant of noise and missing data. However, unlike conventional statistical classifiers, which use statistical parameters for classification, a neural network classifier uses individual training samples in the learning stage. The training performance of a neural network is known to be very sensitive to discrepancies in the number of training samples per class. In this paper, the effect of the population discrepancy of training samples among classes was analyzed with a three-layered feed-forward network. A method for reducing the effect was also proposed and tested on a Landsat TM image. The results showed that the training sample size discrepancy should be carefully considered for faster and more accurate training of the network. It was also found that the proposed method, which makes the learning rate a function of the number of training samples in each class, resulted in faster and more accurate training of the network.
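The abstract says the learning rate is made a function of the per-class sample count but does not give the function. As one plausible illustration, the sketch below scales the rate inversely with class size so that heavily sampled classes take smaller steps. The inverse-proportional rule, the function name, and the class counts are assumptions and should not be read as the authors' formula.

```python
def class_learning_rates(class_counts, base_rate):
    """Assumed rule: rate_c = base_rate * n_min / n_c.

    The smallest class trains at base_rate and larger classes at
    proportionally smaller rates, so no class dominates the weight updates.
    """
    smallest = min(class_counts.values())
    return {name: base_rate * smallest / count
            for name, count in class_counts.items()}


# Hypothetical training sample counts per land cover class.
class_counts = {"water": 1000, "forest": 500, "urban": 250}

rates = class_learning_rates(class_counts, base_rate=0.01)
for name, rate in rates.items():
    print(f"{name}: {rate:.4f}")
```

During per-sample training, the rate looked up for each sample's class would replace the single global learning rate in the weight update.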

Standardized Uptake Values Highly Correlate with Tumor Size and Fuhrman Grade in Patients with Clear Cell Renal Cell Carcinoma

  • Polat, Emre Can; Otunctemur, Alper; Ozbek, Emin; Besiroglu, Huseyin; Dursun, Murat; Ozer, Kutan; Horsanali, Mustafa Ozan
    • Asian Pacific Journal of Cancer Prevention / Vol. 15, No. 18 / pp. 7821-7824 / 2014
  • Background: We investigated the correlation between standardized uptake value (SUVmax), tumor size, and Fuhrman grade in patients with renal cell carcinoma (RC). Materials and Methods: We retrospectively analyzed the data of 54 patients with histopathologically diagnosed clear cell renal cell carcinoma who underwent fluorine-18 fluoro-2 deoxyglucose positron emission tomography/computed tomography (F-18 FDG PET/CT) between January 2005 and March 2014. Results: Average tumor sizes were $5.64 \pm 1.85$, $6.85 \pm 2.24$, and $7.98 \pm 2.45$ in the low, medium, and high SUVmax groups, respectively. The Spearman's correlation coefficient between tumor size and SUVmax was 0.385 (p=0.004), and between Fuhrman grade and SUVmax it was 0.578 (p<0.001). Conclusions: SUVmax appears highly correlated with tumor size and Fuhrman grade in patients with histopathologically confirmed clear cell RC. Multicenter studies are needed to provide larger series for more accurate results.
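Spearman's correlation coefficient, used in the results above, is the Pearson correlation computed on ranks, with tied values sharing their average rank. The sketch below implements that definition in plain Python. The paired values are hypothetical and are not the study's patient data.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values receive the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + end) / 2.0 + 1.0
        for k in range(pos, end + 1):
            ranks[order[k]] = avg
        pos = end + 1
    return ranks


def spearman_rho(xs, ys):
    """Spearman's rank correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    mean_x = sum(rx) / len(rx)
    mean_y = sum(ry) / len(ry)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sxx = sum((a - mean_x) ** 2 for a in rx)
    syy = sum((b - mean_y) ** 2 for b in ry)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical paired measurements (for example, tumor size and SUVmax).
tumor_size = [3.1, 4.5, 5.2, 6.8, 7.4, 9.0]
suv_max = [2.0, 3.5, 3.1, 6.2, 5.0, 8.8]

rho = spearman_rho(tumor_size, suv_max)
print(f"Spearman rho = {rho:.3f}")
```

The sketch returns only the coefficient. The p-values quoted in the abstract would require an additional significance test, which is omitted here.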

The Effectiveness of the Training Program at HCL

  • Kumari, Neeraj
    • Asian Journal of Business Environment / Vol. 5, No. 3 / pp. 23-28 / 2015
  • Purpose - The aim of this study is to evaluate the effectiveness of a corporate training program. The case study of HCL Technologies was used to investigate how training programs improve the performance of employees on the job, and to identify unnecessary aspects of the training so that they can be eliminated from future training programs. Research design, data, and methodology - An exploratory research design was used to conduct the study. The research sample included 50 HCL employees. The sampling technique for the data collection was convenience sampling. Results - Training is a crucial process in an organization and thus needs to be well designed. Specifically, training programs should provide adequate knowledge to all employees, ensure correct methods are used for the selection of trainees, and avoid any perception of bias. Conclusions - Employees were not fully satisfied with the separation of the training program into two parts, on-the-job and off-the-job training, but providing sufficient data to employees in advance could help them during the training process.

Estimation of Classification Accuracy of JERS-1 Satellite Imagery according to the Acquisition Method and Size of Training Reference Data

  • 하성룡; 경천구; 박상영; 박대희
    • 한국지리정보학회지 / Vol. 5, No. 1 / pp. 27-37 / 2002
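  • Obtaining a quantitative land cover map is recognized as a very important task in identifying the non-point pollution sources distributed over a watershed. This study evaluated, based on JERS-1 OPS satellite imagery, how the acquisition method and size of training areas affect classification accuracy in land cover classification from satellite imagery. Two techniques were proposed for extracting 0.3%, 0.5%, and 1.0% of the entire study area as training areas. In the first technique, a researcher with prior knowledge of the area extracted the training areas; in the second, the training areas were extracted using geometrically corrected aerial photographs and digital maps. Land cover was classified using the maximum likelihood classification method. The results showed that extracting training areas from aerial photographs and digital maps and then applying maximum likelihood classification improved overall accuracy by up to about 18% compared with user-driven acquisition of training areas. For terrain with complex and diverse land use such as Korea's, satisfactory land cover classification with JERS-1 imagery at a 95% confidence level required extracting at least about 1% of the entire area as training areas.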
