• Title/Summary/Keyword: Model Generalization

MARGIN-BASED GENERALIZATION FOR CLASSIFICATIONS WITH INPUT NOISE

  • Choe, Hi Jun;Koh, Hayeong;Lee, Jimin
    • Journal of the Korean Mathematical Society / v.59 no.2 / pp.217-233 / 2022
  • Although machine learning achieves state-of-the-art performance in a variety of fields, a theoretical understanding of how it works is still lacking. Theoretical approaches are now being actively studied, and one line of results concerns the margin and its distribution. In this paper, we focus on the role of the margin under perturbations of inputs and parameters. We show generalization bounds for two cases, a linear model for binary classification and neural networks for multi-class classification, when the inputs carry normally distributed random noise. For binary classification, the additional generalization term caused by the random noise is related to the margin and is exponentially inversely proportional to the noise level. For neural networks, the additional generalization term depends on (input dimension) × (norms of input and weights). These results are obtained using the PAC-Bayesian framework. Because this paper considers random noise and margin together, it should help in better understanding model sensitivity and in constructing robust generalization.

Generalization of Road Network using Logistic Regression

  • Park, Woojin;Huh, Yong
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.37 no.2 / pp.91-97 / 2019
  • In automatic map generalization, the formalization of cartographic principles is important. This study proposes and evaluates a selection method for road network generalization that analyzes existing maps using reverse engineering and formalizes the selection rules for the road network. Existing maps with a 1:5,000 scale and a 1:25,000 scale are compared, and the criteria for selection of the road network data and the relative importance of each network object are determined and analyzed using Töpfer's Radical Law as well as the logistic regression model. The selection model derived from the analysis result is applied to the test data, and road network data for the 1:25,000 scale map are generated from the digital topographic map on a 1:5,000 scale. The selected road network is compared with the existing road network data on the 1:25,000 scale for a qualitative and quantitative evaluation. The result indicates that more than 80% of road objects are matched to existing data.
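
As a rough illustration of the Radical Law mentioned in this abstract, the sketch below uses its basic form, n_target = n_source × sqrt(M_source / M_target), where M is the scale denominator. The function name and the count of 1,000 source road objects are hypothetical, and the paper's actual selection model (which also involves logistic regression) is not reproduced here.

```python
import math


def topfer_target_count(n_source: int, source_denominator: float,
                        target_denominator: float) -> float:
    """Basic form of Töpfer's Radical Law: n_t = n_s * sqrt(M_s / M_t)."""
    return n_source * math.sqrt(source_denominator / target_denominator)


# Hypothetical input: 1,000 road objects on the 1:5,000 source map.
expected_objects = topfer_target_count(1000, 5_000, 25_000)
print(round(expected_objects))  # 447
```

Under this form, going from 1:5,000 to 1:25,000 keeps about sqrt(1/5), or roughly 45%, of the source objects; the law gives only the count, not which objects to keep.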

Improvement of generalization of linear model through data augmentation based on Central Limit Theorem (데이터 증가를 통한 선형 모델의 일반화 성능 개량 (중심극한정리를 기반으로))

  • Hwang, Doohwan
    • Journal of Intelligence and Information Systems / v.28 no.2 / pp.19-31 / 2022
  • In machine learning, the data are usually divided into training data and test data; the model is trained on the training data, and the test data are used to determine its accuracy and generalization performance. For models with low generalization performance, prediction accuracy on new data drops significantly, and the model is said to be overfit. This study presents a method that generates training data based on the central limit theorem, combines it with the existing training data to increase normality, and trains models on the combined data to improve generalization performance. To this end, data were generated using the sample mean and standard deviation of each feature, exploiting the central limit theorem, and new training data were constructed by combining them with the existing training data. To determine the degree of increase in normality, the Kolmogorov-Smirnov normality test was conducted, and it was confirmed that the new training data showed increased normality compared with the existing data. Generalization performance was measured as the difference in prediction accuracy between training data and test data. When the method was applied to K-Nearest Neighbors (KNN), Logistic Regression, and Linear Discriminant Analysis (LDA), generalization performance improved for KNN, a non-parametric technique, and for LDA, which assumes normality in model building.
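
The abstract does not give the exact generation procedure, so the following is only a minimal sketch of one plausible reading: draw synthetic rows from a normal distribution parameterized by each feature's sample mean and standard deviation, then append them to the existing training rows. The function name, the toy data, and the omission of class labels are all assumptions made for illustration.

```python
import random
import statistics


def augment_with_normal_samples(rows, n_new, seed=0):
    """Append n_new synthetic rows drawn feature-by-feature from
    N(sample mean, sample std) of the existing rows (assumed procedure)."""
    rng = random.Random(seed)
    columns = list(zip(*rows))                       # one tuple per feature
    means = [statistics.mean(col) for col in columns]
    stds = [statistics.stdev(col) for col in columns]
    synthetic = [
        [rng.gauss(mu, sigma) for mu, sigma in zip(means, stds)]
        for _ in range(n_new)
    ]
    return [list(row) for row in rows] + synthetic


# Hypothetical two-feature training set.
train = [[1.0, 10.0], [2.0, 12.0], [3.0, 11.0], [4.0, 15.0]]
augmented = augment_with_normal_samples(train, n_new=6)
print(len(augmented))  # 10
```

In a real classification setting the statistics would presumably be computed per class so that the synthetic rows can be labeled; the abstract does not say how this was handled.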

Application of Hydro-Cartographic Generalization on Buildings for 2-Dimensional Inundation Analysis (2차원 침수해석을 위한 수리학적 건물 일반화 기법의 적용)

  • PARK, In-Hyeok;JIN, Gi-Ho;JEON, Ka-Young;HA, Sung-Ryong
    • Journal of the Korean Association of Geographic Information Studies / v.18 no.2 / pp.1-15 / 2015
  • Urban flooding has threatened people and facilities with chemical and physical hazards since the beginning of human civilization. Recent studies have emphasized the integration of data and models for effective urban flood inundation modeling. However, the model set-up process tends to be time-consuming and to require a high level of data processing skill. Furthermore, even when high-resolution grid data are used, inundation depth and velocity in a 2-D inundation model vary with the building treatment method, because undesirable grids are generated and reduce the reliability of the simulation results. A building generalization process, or an enhancement of building orthogonality, is therefore required to minimize the distortion of buildings before building footprints are converted into grid data. This study aims to develop a building generalization method for 2-dimensional inundation analysis to enhance model reliability, and to investigate the effect of the building generalization method on urban inundation in terms of geographical engineering and hydraulic engineering. The results indicate that, to improve the reliability of 2-dimensional inundation analysis, the building generalization method developed in this study should be adopted using a Digital Building Model (DBM) before model implementation in urban areas. The proposed building generalization sequence is aggregation followed by simplification, and the threshold of each method should be determined by considering spatial characteristics and should not exceed the sum of the average and the standard deviation of the building gaps.

Exploring the feasibility of fine-tuning large-scale speech recognition models for domain-specific applications: A case study on Whisper model and KsponSpeech dataset

  • Jungwon Chang;Hosung Nam
    • Phonetics and Speech Sciences / v.15 no.3 / pp.83-88 / 2023
  • This study investigates the fine-tuning of large-scale Automatic Speech Recognition (ASR) models, specifically OpenAI's Whisper model, for domain-specific applications using the KsponSpeech dataset. The primary research questions address the effectiveness of targeted lexical item emphasis during fine-tuning, its impact on domain-specific performance, and whether the fine-tuned model can maintain generalization capabilities across different languages and environments. Experiments were conducted using two fine-tuning datasets: Set A, a small subset emphasizing specific lexical items, and Set B, consisting of the entire KsponSpeech dataset. Results showed that fine-tuning with targeted lexical items increased recognition accuracy and improved domain-specific performance, with generalization capabilities maintained when fine-tuned with a smaller dataset. For noisier environments, a trade-off between specificity and generalization capabilities was observed. This study highlights the potential of fine-tuning using minimal domain-specific data to achieve satisfactory results, emphasizing the importance of balancing specialization and generalization for ASR models. Future research could explore different fine-tuning strategies and novel technologies such as prompting to further enhance large-scale ASR models' domain-specific performance.

Randomized Bagging for Bankruptcy Prediction (랜덤화 배깅을 이용한 재무 부실화 예측)

  • Min, Sung-Hwan
    • Journal of Information Technology Services / v.15 no.1 / pp.153-166 / 2016
  • Ensemble classification is an approach that combines individually trained classifiers in order to improve prediction accuracy over individual classifiers. Ensemble techniques have been shown to be very effective in improving the generalization ability of the classifier. But base classifiers need to be as accurate and diverse as possible in order to enhance the generalization abilities of an ensemble model. Bagging is one of the most popular ensemble methods. In bagging, the different training data subsets are randomly drawn with replacement from the original training dataset. Base classifiers are trained on the different bootstrap samples. In this study we proposed a new bagging variant ensemble model, Randomized Bagging (RBagging) for improving the standard bagging ensemble model. The proposed model was applied to the bankruptcy prediction problem using a real data set and the results were compared with those of the other models. The experimental results showed that the proposed model outperformed the standard bagging model.
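
The standard bagging procedure described in this abstract (bootstrap samples drawn with replacement, one base classifier per sample, predictions combined by vote) can be sketched as follows. This is plain bagging, not the paper's RBagging variant, whose details the abstract does not give. The 1-nearest-neighbour base classifier, the single made-up feature, and the labels are assumptions chosen to keep the example small.

```python
import random
from collections import Counter


def bootstrap_sample(data, rng):
    """Draw len(data) items from data with replacement."""
    return [rng.choice(data) for _ in data]


def one_nn_predict(sample, x):
    """Toy base classifier: label of the nearest point (single feature)."""
    return min(sample, key=lambda pair: abs(pair[0] - x))[1]


def bagging_predict(data, x, n_estimators=25, seed=0):
    """Majority vote over base classifiers, each fit on its own bootstrap sample."""
    rng = random.Random(seed)
    votes = Counter(
        one_nn_predict(bootstrap_sample(data, rng), x)
        for _ in range(n_estimators)
    )
    return votes.most_common(1)[0][0]


# Hypothetical data: (financial ratio, label).
data = [(0.1, "healthy"), (0.2, "healthy"), (0.3, "healthy"),
        (0.7, "bankrupt"), (0.8, "bankrupt"), (0.9, "bankrupt")]
print(bagging_predict(data, 0.15), bagging_predict(data, 0.85))
```

Because each bootstrap sample omits some points and repeats others, the base classifiers differ from one another, which is the source of the diversity the abstract refers to.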

On the design of a teaching unit for the exploration of number patterns in Pascal graphs and triangles applying theoretical generalization. (이론적 일반화를 적용한 파스칼 그래프와 삼각형에 내재된 수의 패턴 탐구를 위한 교수단원의 설계)

  • Kim, Jin Hwan
    • East Asian mathematical journal / v.40 no.2 / pp.209-229 / 2024
  • In this study, we design a teaching unit that constructs Pascal graphs and extended Pascal triangles to explore number patterns inherent in them. This teaching unit is designed to consider the diachronic process of teaching-learning by combining Dörfler's theoretical generalization model with Wittmann's design science ideas, which are applied to the didactical practice of mathematization. In the teaching unit, considering the teaching-learning level of prospective teachers who studied discrete mathematics, we generalize the well-known Pascal triangle and its number patterns to extended Pascal triangles which have directed graphs (called Pascal graphs) as geometric models. In this process, the use of symbols and the introduction of variables are exhibited as important means of generalization. It provides practical experiences of mathematization to prospective teachers by going through various steps of the generalization process targeting symbols. This study reflects Wittmann's intention in that well-understood mathematics and the context of the first type of empirical research as structure-genetic didactical analysis are considered in the design of the learning environment.

Improving Generalization Performance of Neural Networks using Natural Pruning and Bayesian Selection (자연 프루닝과 베이시안 선택에 의한 신경회로망 일반화 성능 향상)

  • 이현진;박혜영;이일병
    • Journal of KIISE:Software and Applications / v.30 no.3_4 / pp.326-338 / 2003
  • The objective of neural network design and model selection is to construct an optimal network with good generalization performance. However, training data include noise and the number of training samples is insufficient, which results in a difference between the true probability distribution and the empirical one. This difference causes the learning parameters to over-fit to the training data and to deviate from the true distribution of the data, which is called the overfitting phenomenon. An overfitted neural network approximates the training data well but gives poor predictions for new, unseen data. As the complexity of the neural network increases, the overfitting phenomenon becomes more severe. In this paper, taking a statistical viewpoint, we propose an integrative process for neural network design and model selection in order to improve generalization performance. First, using natural gradient learning with adaptive regularization, we obtain, with fast convergence, optimal parameters that are not overfitted to the training data. By applying natural pruning to the obtained optimal parameters, we generate several candidate network models of different sizes. Finally, we select an optimal model among the candidates based on the Bayesian Information Criterion. Through computer simulations on benchmark problems, we confirm the generalization and structure optimization performance of the proposed integrative process of learning and model selection.
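
The final selection step in this abstract can be illustrated with the textbook Bayesian Information Criterion, BIC = k ln(n) − 2 ln(L), where k is the number of parameters, n the number of samples, and L the maximized likelihood; the candidate with the lowest BIC is chosen. The candidate names, log-likelihoods, and parameter counts below are invented for illustration, and the paper may use a different form of the criterion.

```python
import math


def bic(log_likelihood: float, n_params: int, n_samples: int) -> float:
    """Textbook BIC: k * ln(n) - 2 * ln(L). Lower is better."""
    return n_params * math.log(n_samples) - 2.0 * log_likelihood


# Hypothetical pruned candidates: name -> (log-likelihood, parameter count).
candidates = {
    "small": (-200.0, 10),
    "medium": (-100.0, 30),
    "large": (-98.0, 80),
}
n_samples = 200

scores = {name: bic(ll, k, n_samples) for name, (ll, k) in candidates.items()}
best = min(scores, key=scores.get)
print(best)  # medium
```

Here the large network fits slightly better than the medium one, but the k ln(n) penalty on its extra parameters outweighs the gain, which is the trade-off the criterion is meant to capture.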

Area-wise relational knowledge distillation

  • Sungchul Cho;Sangje Park;Changwon Lim
    • Communications for Statistical Applications and Methods / v.30 no.5 / pp.501-516 / 2023
  • Knowledge distillation (KD) refers to extracting knowledge from a large and complex model (teacher) and transferring it to a relatively small model (student). This can be done by training the teacher model to obtain the activation function values of the hidden or the output layers and then retraining the student model using the same training data with the obtained values. Recently, relational KD (RKD) has been proposed to extract knowledge about relative differences in training data. This method improved the performance of the student model compared to conventional KDs. In this paper, we propose a new method for RKD by introducing a new loss function. The proposed loss function is defined using the area difference between the teacher model and the student model in a specific hidden layer, and it is shown that the model can be successfully compressed and that its generalization performance can be improved. In the model compression study using audio data, we demonstrate that the accuracy of the model applying the proposed method is up to 1.8% higher than that of the existing method. In the model generalization study, we demonstrate that the model achieves up to 0.5% higher accuracy when the RKD method is introduced into self-KD using image data.

Component classification modeling for component circulation market activation (컴포넌트 유통시장 활성화를 위한 분류체계 모델링)

  • 이서정;조은숙
    • The Journal of Society for e-Business Studies / v.7 no.3 / pp.49-60 / 2002
  • Many researchers have studied component technologies in terms of concept, methodology, and implementation for specific business domains; however, there has been little research on component classification to manage them systematically. In this paper, we suggest a component classification model that can increase component reusability and yield higher software development productivity. The model has four focuses: generalization, abstraction, technology, and size. Generalization indicates which category a component belongs to. Abstraction indicates how specifically a component encapsulates its internals. Technology indicates which platform for a hardware environment a component can be plugged into. Size indicates the physical volume of the component.
