• Title/Summary/Keyword: data set

Text-independent Speaker Identification by Bagging VQ Classifier

  • Kyung, Youn-Jeong;Park, Bong-Dae;Lee, Hwang-Soo
    • The Journal of the Acoustical Society of Korea / v.20 no.2E / pp.17-24 / 2001
  • In this paper, we propose a bootstrap aggregating (bagging) vector quantization (VQ) classifier to improve the performance of text-independent speaker recognition systems. The method generates multiple training data sets by resampling the original training data set, constructs a VQ classifier for each, and then integrates the multiple VQ classifiers into a single classifier by voting. Bagging has been proven to greatly improve the performance of unstable classifiers. Through two different experiments, this paper shows that the VQ classifier is unstable. In the first experiment, the bias and variance of a VQ classifier are computed with a waveform database and the variance is compared with that of the classification and regression tree (CART) classifier[1]. The variance of the VQ classifier is shown to be as large as that of the CART classifier. The second experiment involves speaker recognition: the speaker recognition rates vary significantly with minor changes in the training data set. Closed-set, text-independent speaker identification experiments are performed with the TIMIT database to compare the bagging VQ classifier with the conventional VQ classifier. The bagging VQ classifier yields improved performance over the conventional VQ classifier, including when the training data set is small.
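
The resample, train, and vote steps described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the codebook is trained with plain k-means, the decision rule is minimum average distortion, and the function names (`train_codebook`, `train_bagged_vq`, `identify`) and parameters (`n_bags`, `k`) are hypothetical.

```python
import random
from collections import Counter


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_codebook(vectors, k, rng, iters=10):
    """Plain k-means codebook (assumed stand-in for the paper's VQ training)."""
    codebook = rng.sample(vectors, min(k, len(vectors)))
    for _ in range(iters):
        cells = [[] for _ in codebook]
        for v in vectors:
            nearest = min(range(len(codebook)), key=lambda i: sq_dist(v, codebook[i]))
            cells[nearest].append(v)
        # Move each codeword to the mean of its cell; keep it if the cell is empty.
        codebook = [
            tuple(sum(col) / len(cell) for col in zip(*cell)) if cell else codebook[i]
            for i, cell in enumerate(cells)
        ]
    return codebook


def distortion(vectors, codebook):
    """Average distance from each vector to its nearest codeword."""
    return sum(min(sq_dist(v, c) for c in codebook) for v in vectors) / len(vectors)


def train_bagged_vq(train, n_bags=5, k=2, seed=0):
    """Train one set of per-speaker codebooks for each bootstrap resample."""
    rng = random.Random(seed)
    bags = []
    for _ in range(n_bags):
        model = {}
        for speaker, vectors in train.items():
            resample = [rng.choice(vectors) for _ in vectors]  # sample with replacement
            model[speaker] = train_codebook(resample, k, rng)
        bags.append(model)
    return bags


def identify(bags, test_vectors):
    """Each bag votes for its minimum-distortion speaker; the majority wins."""
    votes = Counter(
        min(model, key=lambda s: distortion(test_vectors, model[s])) for model in bags
    )
    return votes.most_common(1)[0][0]
```

Here `train` maps a speaker label to a list of feature tuples; in a real system these would be frame-level spectral features rather than the toy 2-D points used for checking.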

Template Matching-Based Target Recognition Algorithm Development and Verification using SAR Images (SAR 영상을 이용한 템플릿 매칭 기반 자동식별 알고리즘 구현 및 성능시험)

  • Lim, Ho;Chae, Daeyoung;Yoo, Ji Hee;Kwon, Kyung-Il
    • Journal of the Korea Institute of Military Science and Technology / v.17 no.3 / pp.364-377 / 2014
  • In this paper, we develop a target recognition algorithm based on template matching of Synthetic Aperture Radar (SAR) images. For computational efficiency, a Radon transform-based azimuth estimation algorithm is used together with the template matching. The MSTAR data set was divided by depression angle into two groups, a training set and a test set. Template data were generated by rotating and cropping chips from the MSTAR training set using the azimuth estimation algorithm. Template matching between the test data and the template data was then performed under various conditions. The variation in performance with contrast-enhancement preprocessing, which is rarely reported in the open literature, is also presented. The analysis results show that the algorithm could be useful for automatic target recognition using SAR images.
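
The matching step can be illustrated with a short sketch. The abstract does not state which similarity score is used, so normalized correlation is assumed here; the rotation, cropping, and Radon-based azimuth estimation are omitted, and the function names and class labels are hypothetical.

```python
import math


def normalized_correlation(a, b):
    """Zero-mean normalized correlation of two equal-length pixel lists."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(
        sum((x - mean_a) ** 2 for x in a) * sum((y - mean_b) ** 2 for y in b)
    )
    return num / den if den else 0.0


def recognize(chip, templates):
    """Return the label of the template that best matches the test chip.

    `templates` is a list of (label, pixels) pairs; chips are flattened images.
    """
    return max(templates, key=lambda t: normalized_correlation(chip, t[1]))[0]
```

Because the score is normalized, a template that differs from the chip only by a gain or offset still matches perfectly, which is one reason this kind of score is a common choice for intensity images.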

Sketch-based Solid Prototype Modeling System with Dual Data Structure of Point-set Surfaces and Voxels

  • Takeuchi, Ryota;Watanabe, Taichi;Yamakawa, Soji
    • International Journal of CAD/CAM / v.11 no.1 / pp.18-26 / 2011
  • This paper proposes a new solid-shape modeling system based on a lusterware-image illustration. The proposed method reconstructs a three-dimensional solid shape from a set of rough sketches of the kind typically drawn in the early stages of the design process. The sketches do not have to be strictly accurate, and this tolerance to rough input is one of the major advantages of the proposed method. The system creates an initial shape from the silhouette of the input lusterware-images. The user can then edit the initial shape with intuitive cutting and dishing-up operations, which are based on a sketching user interface. To achieve this, the system retains the geometric model in two representations: point-set data and volume data. This dual data structure allows the program to create an initial shape from the input images at little computational cost, and lets the user apply cutting and dishing-up operations without substantially increasing computational and memory requirements. We tested the proposed system by reconstructing solid models of several mechanical parts from rough sketches. The experimental results indicate that the proposed method is useful for prototyping a solid shape.

Quantification of Naproxen in Pharmaceutical Formulation using Near-Infrared Spectrometry (근적외 분광분석법을 이용한 나프록센 정제의 정량분석)

  • Kim Do Hyung;Woo Young Ah;Kim Hyo Jin
    • YAKHAK HOEJI / v.49 no.1 / pp.1-5 / 2005
  • Near-infrared (NIR) spectroscopy has been widely applied in various fields, since it is nondestructive and requires no sample preparation. In this paper, NIR spectroscopy was used to determine the naproxen content of a commercial pharmaceutical preparation: intact naproxen tablets containing 250 mg (65.8% nominal concentration), with NIR spectra collected in the range of 1100-1750 nm. The laboratory-made samples had nominal naproxen concentrations of 46.1-85.5%. The measurements were made by reflection using a fiber-optic probe, and calibration was carried out by partial least squares regression (PLSR). Model validation was performed by randomly splitting the data set into a calibration data set of 63 samples and a validation data set of 42 samples. The developed NIR calibration gave results comparable to the known values of tablets from a laboratory manufacturing process, with a standard error of calibration (SEC) of 1.06% and a standard error of prediction (SEP) of 1.04%. The NIR method showed good accuracy and repeatability. NIR spectroscopic determination in intact tablets shows potential for real-time monitoring of a running production process.
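
The random 63/42 split used for validation can be sketched as below. Only the sample counts come from the abstract; the function name, the seed, and the use of integer sample indices are assumptions for illustration, and the PLSR calibration itself is not shown.

```python
import random


def split_calibration_validation(samples, n_calibration, seed=0):
    """Shuffle a copy of the samples, then cut it into calibration and validation sets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    return shuffled[:n_calibration], shuffled[n_calibration:]


# 105 samples in total: 63 for calibration, 42 for validation.
calibration, validation = split_calibration_validation(range(105), 63)
```

The calibration set would be used to fit the model and the held-out validation set to estimate the prediction error.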

Hyper-Rectangle Based Prototype Selection Algorithm Preserving Class Regions (클래스 영역을 보존하는 초월 사각형에 의한 프로토타입 선택 알고리즘)

  • Baek, Byunghyun;Euh, Seongyul;Hwang, Doosung
    • KIPS Transactions on Software and Data Engineering / v.9 no.3 / pp.83-90 / 2020
  • Prototype selection reduces learning time and storage space by selecting, from the training data, the minimum set of data that represents the partitions within each class. This paper designs a new training data generation method using hyper-rectangles that can be applied to general classification algorithms. A hyper-rectangular region contains data of only one class, and the hyper-rectangles together divide the space of that class. The median of the data within a hyper-rectangle is selected as a prototype to form the new training data, and the size of the hyper-rectangle is adjusted to reflect the data distribution in the class area. A set cover optimization algorithm is proposed to select the minimum prototype set that represents the whole training data. The proposed method reduces the polynomial time complexity of the set cover optimization by using a greedy algorithm and a distance computation without multiplication. In experimental comparisons with hyper-sphere prototype selection, the proposed method is superior in terms of prototype rate and generalization performance.
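
The two properties of a hyper-rectangle stated here (it holds a single class, and its prototype is the median of its members) can be sketched as follows. The box bounds are given by hand in this illustration; how the paper sizes the boxes and runs the set cover step is not shown, and the function names are hypothetical.

```python
from statistics import median


def inside(point, lower, upper):
    """True if the point lies within the axis-aligned box [lower, upper]."""
    return all(lo <= x <= hi for x, lo, hi in zip(point, lower, upper))


def rectangle_prototype(points, labels, lower, upper):
    """Return (prototype, class) for a pure box, or None if it is empty or mixed."""
    members = [(p, y) for p, y in zip(points, labels) if inside(p, lower, upper)]
    classes = {y for _, y in members}
    if len(classes) != 1:
        return None  # empty, or contains more than one class
    # Per-dimension median of the member points.
    prototype = tuple(median(col) for col in zip(*(p for p, _ in members)))
    return prototype, classes.pop()
```

A box that returns `None` because it mixes classes would have to be shrunk before a prototype could be taken from it.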

Deep Learning Model for Electric Power Demand Prediction Using Special Day Separation and Prediction Elements Extention (특수일 분리와 예측요소 확장을 이용한 전력수요 예측 딥 러닝 모델)

  • Park, Jun-Ho;Shin, Dong-Ha;Kim, Chang-Bok
    • Journal of Advanced Navigation Technology / v.21 no.4 / pp.365-370 / 2017
  • This study analyzes the correlation between weekday data and special-day data, which have different power demand patterns, builds a separate data set for each, and proposes reducing the power demand prediction error by using a deep learning network suited to each data set. In addition, we propose a method to improve the prediction rate by adding environmental elements and a separating element to the meteorological elements, which are the basic power demand prediction elements. Power demand for the entire data set was predicted with an LSTM, which is suited to learning time-series data, and power demand for the special-day data was predicted with a DNN. The experimental results show that the prediction rate is improved by adding prediction elements other than the meteorological elements. The average RMSE on the entire data set was 0.2597 for the LSTM and 0.5474 for the DNN, indicating that the LSTM predicted better. The average RMSE on the special-day data set was 0.2201 for the DNN, indicating that the DNN predicted better than the LSTM. The MAPE of the LSTM on the whole data set was 2.74%, and the MAPE on the special-day data was 3.07%.
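
For reference, the two error measures reported in this abstract are conventionally computed as in the sketch below. The standard definitions are assumed, since the abstract does not spell them out, and any scaling the authors applied before computing RMSE is unknown.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


def mape(actual, predicted):
    """Mean absolute percentage error, in percent; actual values must be nonzero."""
    return 100.0 * sum(
        abs((a - p) / a) for a, p in zip(actual, predicted)
    ) / len(actual)
```

RMSE is expressed in the units of the (possibly normalized) demand values, while MAPE is scale-free, which is why the two figures in the abstract are not directly comparable.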

Development of Windows forensic tool for verifying a set of data (윈도우 포렌식 도구의 검증용 데이터 세트의 개발)

  • Kim, Min-Seo;Lee, Sang-jin
    • Journal of the Korea Institute of Information Security & Cryptology / v.25 no.6 / pp.1421-1433 / 2015
  • For accurate forensic analysis of digital devices and computers, validating the reliability of digital forensic tools is very important. Verifying a tool's reliability requires research into and development of data sets to be used as input to the tool. The widely used Windows operating system contains forensic artifacts associated with time and with system behavior. In this paper, we developed a data set for the Windows operating system that allows both kinds of Windows artifacts to be analyzed, and we tested it with published digital forensic tools. The developed data set can be used in the following ways. First, for training, to build the ability to analyze artifacts against a standard. Second, for tool testing, to verify the reliability of digital forensic tools. Lastly, for reuse in the analysis of new artifacts.

An Improvement of the Decision-Making of Categorical Data in Rough Set Analysis (범주형 데이터의 러프집합 분석을 통한 의사결정 향상기법)

  • Park, In-Kyu
    • Journal of Digital Convergence / v.13 no.6 / pp.157-164 / 2015
  • Efficient retrieval of useful information is a prerequisite for an optimal decision-making system. Hence, research on data mining techniques that find useful patterns in various forms of data has progressed as Big Data is increasingly applied for convergence and integration with other industries. Each technique tends to have its own drawbacks, so its ability to retrieve useful information generalizes poorly, and an integrated technique is essential. In this paper, an uncertainty measure of information is calculated: the algebraic probability is measured by Bayesian theory, and then the information entropy of that probability is measured. The proposed measure generates the effective reduct set (i.e., the reduced set of necessary attributes) and formulates the core of the attribute set. Hence, the optimal decision rules are induced. Through a simulation of deciding on contact lenses, the proposed approach is compared with the equivalence and value-reduct theories. As a result, the proposed approach is more general than the previous theories for useful decision-making.
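
One common way to turn conditional probabilities into an entropy-based uncertainty measure for an attribute is the conditional entropy of the decision given that attribute, sketched below. This is an assumed illustration of the general idea, not the paper's exact measure; the function name and the toy attribute names are hypothetical.

```python
import math
from collections import Counter, defaultdict


def conditional_entropy(rows, attr, decision):
    """H(decision | attr) in bits, estimated from a decision table of dict rows."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[attr]].append(row[decision])
    total = len(rows)
    entropy = 0.0
    for decisions in groups.values():
        weight = len(decisions) / total  # P(attr = value)
        counts = Counter(decisions)
        # Entropy of the decision within this attribute value.
        group_entropy = -sum(
            (c / len(decisions)) * math.log2(c / len(decisions))
            for c in counts.values()
        )
        entropy += weight * group_entropy
    return entropy
```

An attribute with low conditional entropy leaves little uncertainty about the decision, so it is a natural candidate for a reduct; an attribute whose conditional entropy equals the entropy of the decision itself carries no information about it.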

Prototype-Based Classification Using Class Hyperspheres (클래스 초월구를 이용한 프로토타입 기반 분류)

  • Lee, Hyun-Jong;Hwang, Doosung
    • KIPS Transactions on Software and Data Engineering / v.5 no.10 / pp.483-488 / 2016
  • In this paper, we propose a prototype-based classification learning method that uses the nearest-neighbor rule. The nearest-neighbor rule is applied to segment the class area of all the training data with hyperspheres, where each hypersphere must cover only data from the same class. The radius of a hypersphere is the midpoint of two distances: the distance to the farthest same-class point and the distance to the nearest other-class point. We transform the prototype selection problem into a set covering problem in order to determine the smallest set of prototypes that covers all the training data. The proposed prototype selection method is designed as a greedy algorithm and can process a large-scale training set in parallel. The prediction rule is the nearest-neighbor rule, and the new training data is the set of prototypes. In experiments, the generalization performance of the proposed method is superior to that of existing methods.
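
The radius rule and the greedy set cover described in this abstract can be sketched as follows. Two assumptions are made: the "farthest same-class point" is taken to be the farthest one that is still closer than the nearest other-class point, and no two points with different labels coincide. The function names are hypothetical, and the parallel processing mentioned in the abstract is not shown.

```python
import math


def hypersphere_covers(points, labels):
    """For each point, the set of same-class indices inside its hypersphere."""
    covers = []
    for i, p in enumerate(points):
        # Distance to the nearest point of another class.
        d_enemy = min(
            math.dist(p, q) for q, y in zip(points, labels) if y != labels[i]
        )
        # Same-class distances inside that limit (always includes the point itself).
        inner = [
            math.dist(p, q)
            for q, y in zip(points, labels)
            if y == labels[i] and math.dist(p, q) < d_enemy
        ]
        radius = (max(inner) + d_enemy) / 2  # midpoint of the two distances
        covers.append({
            j for j, q in enumerate(points)
            if labels[j] == labels[i] and math.dist(p, q) <= radius
        })
    return covers


def greedy_prototypes(covers):
    """Greedy set cover: repeatedly pick the sphere covering the most uncovered points."""
    uncovered = set(range(len(covers)))
    chosen = []
    while uncovered:
        best = max(range(len(covers)), key=lambda i: len(covers[i] & uncovered))
        chosen.append(best)
        uncovered -= covers[best]
    return chosen
```

The indices returned by `greedy_prototypes` identify the training points kept as prototypes; classifying a new point then means applying the nearest-neighbor rule to those prototypes only.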

Improving Classification Performance for Data with Numeric and Categorical Attributes Using Feature Wrapping (특징 래핑을 통한 숫자형 특징과 범주형 특징이 혼합된 데이터의 클래스 분류 성능 향상 기법)

  • Lee, Jae-Sung;Kim, Dae-Won
    • Journal of KIISE:Software and Applications / v.36 no.12 / pp.1024-1027 / 2009
  • In this letter, we evaluate classification performance on mixed numeric and categorical data in order to compare the efficiency of feature filtering and feature wrapping. Because the mixed data is composed of numeric and categorical features, the feature selection method was applied to the data set after discretizing its numeric features. We then choose the feature subset that improves the classification performance on the preprocessed data set. The experimental comparison of classification performance shows that the feature wrapping method is more reliable than the feature filtering method in terms of classification accuracy.
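
A wrapper scores each candidate feature subset by the accuracy of a classifier trained on it, rather than by a classifier-independent statistic as a filter does. The sketch below illustrates that idea on already discretized (categorical) features. The classifier (1-nearest-neighbor with Hamming distance), the leave-one-out evaluation, and the greedy forward search are all assumptions chosen for brevity, not details taken from the letter.

```python
def loo_accuracy(rows, labels, features):
    """Leave-one-out accuracy of 1-NN using Hamming distance on the chosen features."""
    correct = 0
    for i, row in enumerate(rows):
        nearest = min(
            (j for j in range(len(rows)) if j != i),
            key=lambda j: sum(rows[j][f] != row[f] for f in features),
        )
        correct += labels[nearest] == labels[i]
    return correct / len(rows)


def wrapper_forward_selection(rows, labels):
    """Greedily add the feature that most improves accuracy; stop when none does."""
    n_features = len(rows[0])
    selected, best_score = [], 0.0
    improved = True
    while improved:
        improved = False
        for f in range(n_features):
            if f in selected:
                continue
            score = loo_accuracy(rows, labels, selected + [f])
            if score > best_score:
                best_score, best_f, improved = score, f, True
        if improved:
            selected.append(best_f)
    return selected, best_score
```

Because every candidate subset requires a full evaluation of the classifier, a wrapper costs more than a filter, which is the usual trade-off against its higher accuracy.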