• Title/Abstract/Keywords: model generalization

Two Statistical Models for Automatic Word Spacing of Korean Sentences (한글 문장의 자동 띄어쓰기를 위한 두 가지 통계적 모델)

  • 이도길;이상주;임희석;임해창
    • Journal of KIISE:Software and Applications / v.30 no.3_4 / pp.358-371 / 2003
  • Automatic word spacing is the process of deciding the correct boundaries between words in a sentence that contains spacing errors. It is very important for increasing readability and for conveying the accurate meaning of a text to the reader. Previous statistical approaches to automatic word spacing do not consider the previous spacing state, and thus cannot avoid estimating inaccurate probabilities. In this paper, we propose two statistical word spacing models that solve this problem. The proposed models are based on the observation that automatic word spacing can be regarded as a classification problem, like POS tagging. By generalizing hidden Markov models, the models can consider a broader context and estimate more accurate probabilities. We evaluated the proposed models under a wide range of experimental conditions in order to compare them with the current state of the art, and we also provide a detailed error analysis of our models. The experimental results show that the proposed models achieve a syllable-unit accuracy of 98.33% and an Eojeol-unit precision of 93.06% under an evaluation method that takes compound nouns into account.
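
The classification view described in this abstract can be illustrated with a small sketch. It is not the authors' model: it only shows how a spaced sentence maps to per-syllable tags (1 if a space follows the syllable, 0 otherwise) and how a syllable-unit accuracy could be computed from such tags. The function names `to_tags`, `from_tags`, and `syllable_accuracy` are hypothetical.

```python
from typing import List, Tuple


def to_tags(sentence: str) -> List[Tuple[str, int]]:
    """Label each syllable with 1 if a space follows it, else 0 (assumed tag scheme)."""
    chars = " ".join(sentence.split())  # collapse repeated whitespace
    tagged = []
    for i, ch in enumerate(chars):
        if ch == " ":
            continue
        follows_space = i + 1 < len(chars) and chars[i + 1] == " "
        tagged.append((ch, 1 if follows_space else 0))
    return tagged


def from_tags(tagged: List[Tuple[str, int]]) -> str:
    """Rebuild the spaced sentence from (syllable, tag) pairs."""
    out = []
    for ch, tag in tagged:
        out.append(ch)
        if tag == 1:
            out.append(" ")
    return "".join(out).rstrip()


def syllable_accuracy(gold: str, predicted: str) -> float:
    """Fraction of syllables whose spacing tag matches the gold tag."""
    gold_tags = to_tags(gold)
    pred_tags = to_tags(predicted)
    # Both sentences must contain the same syllables in the same order.
    assert [c for c, _ in gold_tags] == [c for c, _ in pred_tags]
    correct = sum(g == p for (_, g), (_, p) in zip(gold_tags, pred_tags))
    return correct / len(gold_tags)
```

A tagger such as an HMM would then predict the tag sequence for an unspaced input, and `from_tags` would turn the prediction back into a spaced sentence.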

Long-Term Arrival Time Estimation Model Based on Service Time (버스의 정차시간을 고려한 장기 도착시간 예측 모델)

  • Park, Chul Young;Kim, Hong Geun;Shin, Chang Sun;Cho, Yong Yun;Park, Jang Woo
    • KIPS Transactions on Computer and Communication Systems / v.6 no.7 / pp.297-306 / 2017
  • Citizens want more accurate forecast information from the Bus Information System (BIS). However, most bus information systems that use an average-based short-term prediction algorithm produce many errors because they do not consider the effects of traffic flow, signal period, and halting time. In this paper, we try to improve the precision of forecast information by analyzing the factors that influence the error, thereby improving convenience for citizens. We analyzed these factors using BIS data. The analysis shows that the effects of time characteristics and geographical conditions are mixed, and that they affect halting time and passing speed differently. Therefore, the halting time is modeled with a Generalized Additive Model using explanatory variables such as hour, GPS coordinates, and number of routes, and a Hidden Markov Model is used to construct a pattern that reflects the influence of traffic flow on each unit section. With the resulting patterns, accurate real-time forecasting and long-term prediction of route travel time were possible. Finally, a statistical test between observed and predicted data shows that the model is suitable for travel time prediction. As a result, more precise forecast information can be provided to citizens, and we think that long-term forecasting can play an important role in decisions such as route scheduling.

Comprehensive analysis of deep learning-based target classifiers in small and imbalanced active sonar datasets (소량 및 불균형 능동소나 데이터세트에 대한 딥러닝 기반 표적식별기의 종합적인 분석)

  • Geunhwan Kim;Youngsang Hwang;Sungjin Shin;Juho Kim;Soobok Hwang;Youngmin Choo
    • The Journal of the Acoustical Society of Korea / v.42 no.4 / pp.329-344 / 2023
  • In this study, we comprehensively analyze the generalization performance of various deep learning-based active sonar target classifiers when applied to small and imbalanced active sonar datasets. To generate the active sonar datasets, we use data from two oceanic experiments conducted at different times and in different oceans. Each sample in the datasets is a time-frequency domain image extracted from the audio signal of a contact after the detection process. For the comprehensive analysis, we use 22 convolutional neural network (CNN) models. The two datasets are used alternately as the train/validation dataset and the test dataset. To calculate the variance in the output of the target classifiers, training, validation, and testing are repeated 10 times. Hyperparameters for training are optimized using Bayesian optimization. The results demonstrate that shallow CNN models show superior robustness and generalization performance compared to most deep CNN models. These results can serve as a valuable reference for future research on deep learning-based active sonar target classification.

Developing a Neural-Based Credit Evaluation System with Noisy Data (불량 데이타를 포함한 신경망 신용 평가 시스템의 개발)

  • Kim, Jeong-Won;Choi, Jong-Uk;Choi, Hong-Yun;Chuong, Yoon
    • The Transactions of the Korea Information Processing Society / v.1 no.2 / pp.225-236 / 1994
  • Many results reported by neural network researchers claim that the degree of generalization of a neural network system is higher than or at least equal to that of statistical methods. However, such results can be obtained only if the network is trained on sufficiently sound data, containing little noise and being large enough to offset the noise that remains. Real data in many fields, especially business, are not so sound, and networks have frequently failed to reach satisfactory prediction accuracy, that is, a satisfactory degree of generalization. This study discusses how to enhance the degree of generalization with noisy data. The suggestion, obtained through a series of experiments, is to remove inconsistent data by checking for overlaps and inconsistencies. Furthermore, the conclusion of earlier reports is confirmed that the learning mechanism of a neural network takes the average value of two inconsistent data items included in the training set[2]. The interim results of an ongoing research project are reported in this paper: the architecture of the neural network adopted in the project and the overall idea of developing an online credit evaluation system that integrates an expert (reasoning) system and a neural network (learning) system. Another definite result corroborated by this study is that quickprop, adopted as the learning algorithm, learns faster than back propagation even in a very noisy environment.

Generalized Frequency-wavenumber Migration Implemented by the Intrinsic Attenuation Effect (비탄성 매질의 진폭 감쇠 효과를 첨가한 일반화된 주파수-파수 구조보정)

  • Baag Chang-Eob;Shim Jae-Heon
    • The Korean Journal of Petroleum Geology / v.1 no.1 s.1 / pp.47-52 / 1993
  • A method and computational results are presented for 2-D seismic migration in the frequency-wavenumber domain for a laterally and vertically inhomogeneous medium. To take the intrinsic attenuation effect into account in the migration process, a complex-valued wave velocity is used in the wavefield extrapolation operator, improving the generalized frequency-wavenumber migration technique. The imaginary part of the complex-valued wave velocity includes the seismic quality factor Q. In deriving the solution of the wave equation for a medium with inhomogeneous wave velocity and anelasticity, the inhomogeneous medium is mathematically converted to an equivalent system consisting of a homogeneous medium of averaged slowness and an inhomogeneous distribution of hypothetical wave sources. The strength of a hypothetical wave source depends on the deviation of the squared slowness from the averaged value of the medium. Numerical results obtained with the technique show more distinct geologic images than those obtained with conventional generalized frequency-wavenumber migration. In particular, images obscured by wave attenuation due to anelasticity are restored to show sharp structural boundaries. The method will be useful for imaging reflection data acquired in regions of possible petroleum or natural gas reservoirs and in fractured zones.
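
The two ideas in this abstract can be written out as a short worked sketch. It is not the authors' derivation: it assumes a plane wave of the form $e^{i(kx-\omega t)}$, a frequency-independent Q with velocity dispersion ignored, and the symbols $\tilde v$ (complex velocity), $s$ (complex slowness), $s_0$ (averaged slowness), and $P$ (wavefield) are introduced here for illustration. With the opposite Fourier sign convention, the sign of the imaginary term flips.

```latex
% Complex velocity carrying Q, and the amplitude decay it produces
\[
\frac{1}{\tilde v} = \frac{1}{v}\left(1 + \frac{i}{2Q}\right)
\quad\Longrightarrow\quad
e^{i\omega x/\tilde v} = e^{i\omega x/v}\, e^{-\omega x/(2vQ)}
\]

% Splitting the squared slowness s^2 = 1/\tilde v^2 around its average s_0^2
% turns the inhomogeneous medium into a homogeneous one plus a source term
\[
\nabla^2 P + \omega^2 s^2 P = 0
\quad\Longleftrightarrow\quad
\nabla^2 P + \omega^2 s_0^2 P = -\,\omega^2\left(s^2 - s_0^2\right) P
\]
```

The right-hand side of the second equation plays the role of the hypothetical wave source: its strength is proportional to the deviation of the squared slowness from the averaged value, as the abstract states.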

Hybrid POS Tagging with generalized unknown word handling and post error-correction rules (일반화된 미등록어 처리와 오류 수정규칙을 이용한 혼합형 품사태깅)

  • Cha, Jeong-Won;Lee, Won-Il;Lee, Geun-Bae;Lee, Jong-Hyeok
    • Annual Conference on Human and Language Technology / 1997.10a / pp.88-93 / 1997
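  • In this paper, we experimentally compare several statistical models for POS tagging and construct a statistical model based on the results. Using a morpheme pattern dictionary, we develop a general method for handling unknown words that is independent of their position and number, and we combine it with error-correction rules that compensate for the weaknesses of the statistical model, yielding the hybrid POS tagging system $POSTAG^{i}$. The morpheme pattern dictionary used to estimate unknown words is built from Korean syllable information and irregular-conjugation information for predicates, and a multi-Eojeol word dictionary is used to handle collocations that span several Eojeols effectively, improving overall tagging accuracy. The error-correction rules are acquired automatically through the learning method proposed by Brill. Several heuristics are applied during automatic rule extraction so that better and more general rules are extracted. Trained on a POS-tagged corpus of 100,000 morphemes and tested on about 25,000 morphemes not used in training, the system achieved an accuracy of 97.28%.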

A Study on the Guidelines of Key Mapping for Mobile Devices using the Method of Key Card Arranging (Key Card Arranging 기법을 활용한 핸드폰 기기의 Key Mapping 가이드라인에 대한 연구)

  • Choi, Jin-Ho;Kang, Han-Jong;Lee, Keun-Min;Lee, Kyoung-Jin;Kim, Jung-Ha
    • 한국HCI학회:학술대회논문집 / 2006.02b / pp.275-280 / 2006
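  • Since mobile phones began to become commonplace in the 1990s, countless models have been released. As the variety has grown, their functions, purposes, and usage methods have diversified as well, so users are constantly forced to learn new ways of operating the phones that appear every day. This study uses the Key Card Arranging method to propose guidelines for an optimal Key Mapping that fits the mental models of current mobile device users, with the aim of letting users operate information devices with minimal effort. Experiments were conducted with test subjects selected for each of six major Korean mobile phone manufacturers, using the following methodology. First, focusing on the Key Mapping of Hot Keys in mobile phones, data on users' mental models were collected from the selected subjects using the Key Card Arranging method and in-depth interviews. Based on the collected data, quantitative analysis was used to propose an ideal mental model of mobile phone key mapping, and finally guidelines for mobile phone Key Mapping are presented in order to give users an optimal experience.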

Advanced Rake Receiver for Multiple Access M-ary Modulation UWB System in the IEEE Multipath Channel (IEEE 다중경로 채널에서 다중접속 M진 변조 초광대역 시스템을 위한 개선된 Rake 수신기)

  • An, Jinyoung
    • Journal of the Institute of Electronics and Information Engineers / v.51 no.12 / pp.12-19 / 2014
  • In this paper, an advanced UWB (ultra wideband) Rake receiving technique based on a statistical distribution model is studied for the M-ary TH-PPM system with multiple access interference (MAI). Improving the performance of the Rake receiver requires a stochastic model that can flexibly express the behavior of MAI-plus-noise, so the Laplace distribution and the generalized normal Laplace (GNL) model fitted by the kurtosis matching method are considered. The performance of the Rake receiver based on each probability distribution is evaluated in the IEEE multipath fading channel and compared with that of the conventional Rake receiver. The suggested approach shows better BER performance than the conventional Rake receiver.
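
As a rough illustration of why kurtosis is a useful statistic for choosing an interference model, the sketch below computes the sample kurtosis of Gaussian and Laplace samples, whose theoretical values are 3 and 6. It does not reproduce the paper's GNL fitting procedure; the function names and the use of the difference of two exponential variates to draw Laplace samples are choices made here for illustration.

```python
import random
from typing import List, Sequence


def sample_kurtosis(xs: Sequence[float]) -> float:
    """Non-excess sample kurtosis: fourth central moment over squared variance."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return m4 / (m2 * m2)


def gaussian_samples(rng: random.Random, n: int) -> List[float]:
    """Standard normal samples (theoretical kurtosis 3)."""
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def laplace_samples(rng: random.Random, n: int) -> List[float]:
    """Laplace(0, 1) samples as the difference of two Exp(1) variates (theoretical kurtosis 6)."""
    return [rng.expovariate(1.0) - rng.expovariate(1.0) for _ in range(n)]


if __name__ == "__main__":
    rng = random.Random(0)
    print("Gaussian kurtosis:", sample_kurtosis(gaussian_samples(rng, 100_000)))
    print("Laplace kurtosis: ", sample_kurtosis(laplace_samples(rng, 100_000)))
```

Heavier-tailed interference gives a larger kurtosis, so matching the measured kurtosis of MAI-plus-noise is one way to pick distribution parameters that reflect the observed tail behavior.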

Implementation of efficient L-diversity de-identification for large data (대용량 데이터에 대한 효율적인 L-diversity 비식별화 구현)

  • Jeon, Min-Hyuk;Temuujin, Odsuren;Ahn, Jinhyun;Im, Dong-Hyuk
    • Proceedings of the Korea Information Processing Society Conference / 2019.10a / pp.465-467 / 2019
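  • Many organizations and companies now require diverse and massive data, and as a result they increasingly need to collect or purchase data directly, for example from national public data or data brokers. However, personal information cannot be transferred to others without the individual's consent, which makes research on such data difficult. De-identification techniques that prevent a specific individual from being inferred are therefore being studied. The degree of de-identification can be expressed by a model, and the k-anonymity and l-diversity models are currently widely used. Of these, l-diversity includes the conditions of k-anonymity and therefore provides stronger de-identification. Algorithms that satisfy the l-diversity model include The Hardness and Approximation and Anatomy; this paper studies the implementation of Anatomy, which offers high utility because it does not go through a generalization step. In addition, because de-identification must consider the characteristics of the entire dataset, the actual processing load becomes enormous as the data grows. We address this problem with Spark and implement a system that copes as stably as possible as the data size increases.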
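
The relationship between the two models named in this abstract can be made concrete with a small checker. This is a sketch of the definitions only, not of Anatomy or of the paper's Spark implementation: it assumes the "distinct" variant of l-diversity, groups records by identical quasi-identifier values, and uses hypothetical field names in the usage example.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence, Tuple

Record = Dict[str, Hashable]


def group_by_quasi_identifiers(
    records: Sequence[Record], qi_keys: Sequence[str]
) -> Dict[Tuple[Hashable, ...], List[Record]]:
    """Collect records that share the same quasi-identifier values."""
    groups: Dict[Tuple[Hashable, ...], List[Record]] = defaultdict(list)
    for record in records:
        groups[tuple(record[k] for k in qi_keys)].append(record)
    return groups


def is_k_anonymous(records: Sequence[Record], qi_keys: Sequence[str], k: int) -> bool:
    """Every group must contain at least k records."""
    groups = group_by_quasi_identifiers(records, qi_keys)
    return all(len(group) >= k for group in groups.values())


def is_distinct_l_diverse(
    records: Sequence[Record], qi_keys: Sequence[str], sensitive_key: str, l: int
) -> bool:
    """Every group must contain at least l distinct sensitive values."""
    groups = group_by_quasi_identifiers(records, qi_keys)
    return all(
        len({record[sensitive_key] for record in group}) >= l
        for group in groups.values()
    )


if __name__ == "__main__":
    # Hypothetical, already-generalized records for illustration only.
    table = [
        {"zip": "130**", "age": "2*", "disease": "flu"},
        {"zip": "130**", "age": "2*", "disease": "cancer"},
        {"zip": "148**", "age": "3*", "disease": "flu"},
        {"zip": "148**", "age": "3*", "disease": "flu"},
    ]
    print(is_k_anonymous(table, ["zip", "age"], 2))
    print(is_distinct_l_diverse(table, ["zip", "age"], "disease", 2))
```

A group with l distinct sensitive values has at least l records, which is why a table that satisfies distinct l-diversity is also l-anonymous, while the reverse does not hold.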

Non-intrusive Calibration for User Interaction based Gaze Estimation (사용자 상호작용 기반의 시선 검출을 위한 비강압식 캘리브레이션)

  • Lee, Tae-Gyun;Yoo, Jang-Hee
    • Journal of Software Assessment and Valuation / v.16 no.1 / pp.45-53 / 2020
  • In this paper, we describe a new gaze estimation method that acquires calibration data from the user interactions that occur continuously during web browsing and performs calibration naturally while estimating the user's gaze. The proposed non-intrusive calibration is a tuning process that adapts a pre-trained gaze estimation model to a new user using the acquired data. To achieve this, a generalized CNN model for gaze estimation is trained first, and the non-intrusive calibration is then applied to adapt quickly to new users through online learning. In experiments, the gaze estimation model is calibrated with combinations of various user interactions to compare performance, and improved accuracy is achieved compared to existing methods.