• Title/Summary/Keyword: 결합 학습 (combined learning)


Non-Stationary/Mixed Noise Estimation Algorithm Based on Minimum Statistics and Codebook Driven Short-Term Predictor Parameter Estimation (최소 통계법과 Short-Term 예측계수 코드북을 이용한 Non-Stationary/Mixed 배경잡음 추정 기법)

  • Lee, Myeong-Seok;Noh, Myung-Hoon;Park, Sung-Joo;Lee, Seok-Pil;Kim, Moo-Young
    • The Journal of the Acoustical Society of Korea
    • /
    • v.29 no.3
    • /
    • pp.200-208
    • /
    • 2010
  • In this work, the minimum statistics (MS) algorithm is combined with codebook-driven short-term predictor parameter estimation (CDSTP) to design a speech enhancement algorithm that is robust against various background noise environments. The MS algorithm performs well for stationary noise but less so for non-stationary noise. Conversely, CDSTP handles non-stationary noise efficiently but fails on noise types that were not considered in the training stage. We therefore propose combining CDSTP and MS. Compared with using MS or CDSTP alone, the proposed method yields a better perceptual evaluation of speech quality (PESQ) score, and performs especially well for background noise that mixes stationary and non-stationary components.
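The minimum-statistics idea described above can be sketched without the codebook stage: recursively smooth the noisy power spectrogram and take a sliding-window minimum as the noise-power estimate. This is a simplified illustration, not the authors' implementation; the window length `win` and smoothing factor `alpha` are assumed values.

```python
import numpy as np

def minimum_statistics_noise(power_spec, win=64, alpha=0.85):
    """Simplified minimum-statistics noise estimate (after Martin's method).

    power_spec: (frames, bins) noisy-speech power spectrogram.
    Returns a (frames, bins) noise-power estimate: the sliding-window
    minimum of a recursively smoothed periodogram.
    """
    frames, _ = power_spec.shape
    smoothed = np.empty_like(power_spec)
    smoothed[0] = power_spec[0]
    for t in range(1, frames):            # first-order recursive smoothing
        smoothed[t] = alpha * smoothed[t - 1] + (1 - alpha) * power_spec[t]
    noise = np.empty_like(power_spec)
    for t in range(frames):               # track the minimum over `win` frames
        lo = max(0, t - win + 1)
        noise[t] = smoothed[lo:t + 1].min(axis=0)
    return noise
```

On a stationary noise floor with brief speech-like power bursts, the estimate stays near the floor because the sliding minimum ignores the bursts; this is exactly the behavior that breaks down for non-stationary noise and motivates the codebook combination.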

Model Interpretation through LIME and SHAP Model Sharing (LIME과 SHAP 모델 공유에 의한 모델 해석)

  • Yong-Gil Kim
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.24 no.2
    • /
    • pp.177-184
    • /
    • 2024
  • With data volumes growing rapidly, we use all kinds of complex ensemble and deep learning algorithms to obtain the highest accuracy. How these models predict, classify, recognize, and track unknown data is often hard to explain, and achieving such interpretability has been, and will remain, a goal of intensive research and development in the data science community. A variety of factors, such as lack of data, imbalanced data, and biased data, can affect the decisions rendered by learning models. Many approaches are gaining traction for such interpretations; LIME and SHAP, two state-of-the-art open-source explainability techniques, are now commonly used. However, their outputs can differ. In this context, this study introduces a technique that couples LIME and SHAP, and demonstrates how it can analyze the decisions made by LightGBM and Keras models when classifying transactions as fraudulent on the IEEE CIS dataset.
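To make the LIME side of this concrete, here is a from-scratch sketch of the principle (not the `lime` or `shap` package APIs): perturb the neighborhood of one instance, query the black-box model, and fit a proximity-weighted linear surrogate. The function name, kernel width, and sample count are our own choices.

```python
import numpy as np

def lime_style_explanation(predict_proba, x, n_samples=2000, width=0.75, seed=0):
    """Local surrogate in the spirit of LIME.

    predict_proba: black-box function mapping (n, d) -> probability of class 1.
    x:             the (d,) instance to explain.
    Returns per-feature weights of a proximity-weighted linear fit.
    """
    rng = np.random.default_rng(seed)
    d = x.shape[0]
    Z = x + rng.normal(scale=1.0, size=(n_samples, d))   # perturb around x
    y = predict_proba(Z)                                  # query the black box
    dist2 = ((Z - x) ** 2).sum(axis=1)
    w = np.exp(-dist2 / (2 * width ** 2 * d))             # proximity kernel
    # Weighted least squares with intercept: (A^T W A) beta = A^T W y.
    A = np.hstack([np.ones((n_samples, 1)), Z - x])
    AW = A * w[:, None]
    beta = np.linalg.solve(AW.T @ A + 1e-6 * np.eye(d + 1), AW.T @ y)
    return beta[1:]                                       # feature weights
```

On a toy black box such as sigmoid(3·x0 − 2·x1), the recovered weights come out positive for the first feature and negative for the second, mirroring the local gradient; SHAP assigns analogous per-feature attributions but with game-theoretic guarantees, which is why the two outputs can disagree.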

Neural network with occlusion-resistant and reduced parameters in stereo images (스테레오 영상에서 폐색에 강인하고 축소된 파라미터를 갖는 신경망)

  • Kwang-Yeob Lee;Young-Min Jeon;Jun-Mo Jeong
    • Journal of IKEEE
    • /
    • v.28 no.1
    • /
    • pp.65-71
    • /
    • 2024
  • This paper proposes a neural network that reduces the number of parameters while reducing matching errors in occluded regions, thereby increasing the accuracy of depth maps in stereo matching. Stereo-matching-based object recognition is used in many fields to recognize situations more accurately from images. When a complex image contains many objects, occluded areas arise from overlap between objects and occlusion by the background, lowering the accuracy of the depth map. Existing methods that address this by creating context information and combining it with the cost volume, or by applying RoIselect in the occluded area, increase the complexity of the neural network, making it difficult to train and expensive to implement. In this paper, we build a depthwise separable neural network that enhances regional feature extraction before cost-volume generation, reducing the number of parameters and yielding a network that is robust to occlusion errors. Compared to PSMNet, the proposed network reduced the number of parameters by 30% while improving color error by 5.3% and test loss by 3.6%.
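The parameter saving from depthwise separable convolution follows from standard arithmetic: a depthwise pass costs k·k per input channel and a 1×1 pointwise pass costs c_in·c_out, versus k·k·c_in·c_out for a standard convolution. The layer sizes below are illustrative, not the paper's actual configuration.

```python
def conv_params(k, c_in, c_out):
    """Parameters of a standard k x k convolution (no bias)."""
    return k * k * c_in * c_out

def depthwise_separable_params(k, c_in, c_out):
    """Depthwise k x k (one filter per input channel) + 1x1 pointwise."""
    return k * k * c_in + c_in * c_out

std = conv_params(3, 64, 128)                 # 3*3*64*128 = 73728
ds = depthwise_separable_params(3, 64, 128)   # 3*3*64 + 64*128 = 8768
print(std, ds, ds / std)                      # roughly an 8-9x per-layer saving
```

A single layer saves far more than 30%; the paper's overall 30% figure is smaller because only part of the network (the feature extraction before cost-volume generation) is replaced.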

Prospects and Issues on the Expansion of AI Tech's Influence in Film Creation (AI 기술의 영상제작 분야 영향력 확대에 관한 전망과 쟁점)

  • Hanjin Lee;Minhee Kim;Juwon Yun
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.24 no.4
    • /
    • pp.107-112
    • /
    • 2024
  • One More Pumpkin won the grand prize at the 2023 Dubai International AI Film Festival, and new possibilities were also opened by the International AI and Metaverse Film Festival (GAMFF), held for the first time in Korea, to which 527 works from 42 countries using AI and metaverse technology were submitted; generative works began to stand out in earnest. AI is being used in a variety of areas, including creating and implementing digital characters in combination with VFX, improving the efficiency of video production, and managing the overall production process. This contributes to saving the human and material resources required for production and to significantly improving the quality of the produced videos. However, generative AI also faces ambiguity in copyright attribution, ethical issues inherent in the training datasets, and technical limitations that fall short of human emotion and creativity. Accordingly, this study suggests implications at the levels of production, screening, and use, as generative AI is likely to affect still more areas in the future.

Anomaly detection in blade pitch systems of floating wind turbines using LSTM-Autoencoder (LSTM-Autoencoder를 이용한 부유식 풍력터빈 블레이드 피치 시스템의 이상징후 감지)

  • Seongpil Cho
    • Journal of Aerospace System Engineering
    • /
    • v.18 no.4
    • /
    • pp.43-52
    • /
    • 2024
  • This paper presents an anomaly detection system that uses an LSTM-Autoencoder model to identify early-stage anomalies in the blade pitch system of floating wind turbines. The sensor data used in power plant monitoring systems consists primarily of multivariate time-series data for each component. Comprising two unidirectional LSTM networks, the system uncovers long-term dependencies hidden within sequential time-series data. The autoencoder mechanism, trained solely on normal-state data, effectively classifies abnormal states. By integrating these two networks, the system can reliably detect anomalies. To confirm the effectiveness of the proposed framework, a real multivariate time-series dataset collected from a wind turbine model was employed. The LSTM-Autoencoder model showed robust performance, achieving high classification accuracy.
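The decision rule described above, learning only from normal data and flagging large reconstruction errors, can be sketched independently of the network. Here a moving average stands in for the trained LSTM-Autoencoder, and the mean + 3σ threshold is an assumed choice, not one stated in the abstract.

```python
import numpy as np

rng = np.random.default_rng(42)

def reconstruct(x, win=5):
    """Stand-in for the autoencoder: a centered moving average.
    (A real system would run the trained LSTM-Autoencoder here.)"""
    kernel = np.ones(win) / win
    return np.convolve(x, kernel, mode="same")

def recon_error(x):
    return np.abs(x - reconstruct(x))

# 1) Learn the threshold from NORMAL data only (k = 3 is an assumed choice).
normal = np.sin(np.linspace(0, 20 * np.pi, 2000)) + 0.05 * rng.normal(size=2000)
errs = recon_error(normal)
threshold = errs.mean() + 3.0 * errs.std()

# 2) At monitoring time, flag samples whose error exceeds the threshold.
test_sig = np.sin(np.linspace(0, 2 * np.pi, 200)) + 0.05 * rng.normal(size=200)
test_sig[100] += 3.0                 # injected pitch-system fault
flags = recon_error(test_sig) > threshold
```

Because the threshold is fit exclusively to normal-state reconstruction errors, an unusual sensor pattern (the injected spike) exceeds it even though no faulty examples were seen during training, which is the core of the paper's approach.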

GEase-K: Linear and Nonlinear Autoencoder-based Recommender System with Side Information (GEase-K: 부가 정보를 활용한 선형 및 비선형 오토인코더 기반의 추천시스템)

  • Taebeom Lee;Seung-hak Lee;Min-jeong Ma;Yoonho Cho
    • Journal of Intelligence and Information Systems
    • /
    • v.29 no.3
    • /
    • pp.167-183
    • /
    • 2023
  • In the field of recommender systems, various studies have recently been conducted to model sparse data effectively. Among these, GLocal-K (Global and Local Kernels for Recommender Systems) combines global and local kernels to provide personalized recommendations that consider both global data patterns and individual user characteristics. However, because it relies on kernel tricks, GLocal-K performs worse on highly sparse data and cannot make recommendations for new users or items, since it uses no side information. In this paper, to address these limitations, we propose the GEase-K (Global and EASE kernels for Recommender Systems) model, which incorporates the EASE (Embarrassingly Shallow Autoencoders for Sparse Data) model and leverages side information. First, we substitute EASE for the local kernel in GLocal-K to improve recommendation performance on highly sparse data. EASE, a simple linear structure, is an autoencoder that performs well on extremely sparse data through regularization and learned item similarity. Second, we use side information to alleviate the cold-start problem: a conditional autoencoder structure incorporates side information during training, improving the model's grasp of user-item similarities. By combining linear and nonlinear structures and using side information, GEase-K is resilient on highly sparse data and in cold-start situations. Experimental results show that GEase-K outperforms GLocal-K on the RMSE and MAE metrics on the highly sparse GoodReads and ModCloth datasets. Furthermore, in cold-start experiments with four groups drawn from those datasets, GEase-K again outperforms GLocal-K.
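The "simple linear operational structure" of EASE has a closed-form solution: invert the L2-regularized item-item Gram matrix, rescale each column by its diagonal entry, and zero the diagonal so items cannot recommend themselves. A minimal numpy sketch (the λ value and toy interaction matrix below are illustrative, not from the paper):

```python
import numpy as np

def ease(X, lam=100.0):
    """Closed-form EASE item-item weight matrix.

    X:   (users, items) binary interaction matrix.
    lam: L2 regularization strength.
    Returns B with zero diagonal; recommendation scores are X @ B.
    """
    G = X.T @ X + lam * np.eye(X.shape[1])   # regularized Gram matrix
    P = np.linalg.inv(G)
    B = P / (-np.diag(P))                    # column j divided by -P[j, j]
    np.fill_diagonal(B, 0.0)
    return B
```

On a toy matrix where item 0 co-occurs twice with item 1 and once with item 2, a user who interacted only with item 0 is scored higher on item 1 than item 2, which is the learned-item-similarity behavior the abstract refers to.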

Bankruptcy prediction using an improved bagging ensemble (개선된 배깅 앙상블을 활용한 기업부도예측)

  • Min, Sung-Hwan
    • Journal of Intelligence and Information Systems
    • /
    • v.20 no.4
    • /
    • pp.121-139
    • /
    • 2014
  • Predicting corporate failure has been an important topic in accounting and finance. The costs associated with bankruptcy are high, so the accuracy of bankruptcy prediction matters greatly to financial institutions, and many researchers have addressed the problem over the past three decades. The current research uses ensemble models to improve bankruptcy prediction performance. Ensemble classification combines individually trained classifiers to obtain more accurate predictions than individual models, and ensemble techniques have been shown to be very useful for improving a classifier's generalization ability. Bagging is the most commonly used method for constructing ensemble classifiers: different training subsets are randomly drawn with replacement from the original training dataset, and base classifiers are trained on the different bootstrap samples. Instance selection chooses critical instances while removing irrelevant or harmful ones from the original set. Instance selection and bagging are both well known in data mining, yet few studies have dealt with their integration. This study proposes an improved bagging ensemble based on instance selection using a genetic algorithm (GA) to improve the performance of SVM. GA is an efficient optimization procedure based on the theory of natural selection and evolution: it applies the idea of survival of the fittest by progressively accepting better solutions, searching by maintaining a population of solutions from which better solutions are created rather than making incremental changes to a single solution. The initial population is generated randomly and evolves into the next generation through genetic operators such as selection, crossover, and mutation; solutions encoded as strings are evaluated by the fitness function.
The proposed model consists of two phases: GA-based instance selection and instance-based bagging. In the first phase, GA selects the optimal instance subset that serves as the input data of the bagging model. The chromosome is encoded as a binary string over the instance subset; the population size was set to 100, the maximum number of generations to 150, and the crossover and mutation rates to 0.7 and 0.1, respectively. The prediction accuracy of the model serves as the fitness function: an SVM is trained on the selected instance subset, and its accuracy on the test set is used as the fitness value to avoid overfitting. In the second phase, the optimal instance subset selected in the first phase becomes the input data of the bagging model, with SVMs as base classifiers and majority voting as the combining method. This study applies the proposed model to bankruptcy prediction using a real dataset of Korean companies containing 1,832 externally non-audited firms, of which 916 filed for bankruptcy and 916 did not. Financial ratios categorized as stability, profitability, growth, activity, and cash flow were investigated through a literature review and basic statistical methods, and 8 financial ratios were selected as the final input variables. The whole dataset was separated into training, test, and validation subsets. The proposed model was compared with several baselines, including a single SVM, simple bagging, and instance-selection-based SVM, and McNemar tests were used to examine whether the proposed model significantly outperforms the others. The experimental results show that it does.
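The two phases can be sketched with scikit-learn on synthetic data. The GA settings below are deliberately scaled down from the paper's (population 100, 150 generations) to keep the toy run fast, and the dataset is synthetic rather than the Korean corporate data.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.ensemble import BaggingClassifier

rng = np.random.default_rng(0)

def fitness(mask, X_tr, y_tr, X_val, y_val):
    """Held-out accuracy of an SVM trained on the selected instances
    (the paper uses held-out accuracy as fitness to avoid overfitting)."""
    if mask.sum() < 10 or len(np.unique(y_tr[mask])) < 2:
        return 0.0
    return SVC().fit(X_tr[mask], y_tr[mask]).score(X_val, y_val)

def ga_instance_selection(X_tr, y_tr, X_val, y_val,
                          pop_size=20, gens=10, cx=0.7, mut=0.1):
    """Binary-encoded GA over instance subsets (toy-scale settings)."""
    n = len(y_tr)
    pop = rng.random((pop_size, n)) < 0.5
    for _ in range(gens):
        fit = np.array([fitness(m, X_tr, y_tr, X_val, y_val) for m in pop])
        pop = pop[np.argsort(fit)[::-1]]          # rank; keep the best half
        children = []
        while len(children) < pop_size // 2:
            p1, p2 = pop[rng.integers(0, pop_size // 2, size=2)]
            if rng.random() < cx:                 # one-point crossover
                cut = int(rng.integers(1, n))
                child = np.concatenate([p1[:cut], p2[cut:]])
            else:
                child = p1.copy()
            flip = rng.random(n) < mut            # bit-flip mutation
            children.append(np.where(flip, ~child, child))
        pop = np.vstack([pop[:pop_size // 2], np.array(children)])
    fit = np.array([fitness(m, X_tr, y_tr, X_val, y_val) for m in pop])
    return pop[int(np.argmax(fit))]

X, y = make_classification(n_samples=300, n_features=8, random_state=1)
X_tr, X_rest, y_tr, y_rest = train_test_split(X, y, test_size=0.5, random_state=1)
X_val, X_te, y_val, y_te = train_test_split(X_rest, y_rest, test_size=0.5,
                                            random_state=1)

# Phase 1: GA picks the instance subset; Phase 2: bagging of SVMs
# (majority voting) trained on that subset.
best = ga_instance_selection(X_tr, y_tr, X_val, y_val)
bag = BaggingClassifier(SVC(), n_estimators=10, random_state=1)
bag.fit(X_tr[best], y_tr[best])
print("test accuracy:", bag.score(X_te, y_te))
```

Note the design point the abstract emphasizes: fitness is computed on a held-out set, not on the instances being selected, so the GA cannot simply keep every training point.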

Online news-based stock price forecasting considering homogeneity in the industrial sector (산업군 내 동질성을 고려한 온라인 뉴스 기반 주가예측)

  • Seong, Nohyoon;Nam, Kihwan
    • Journal of Intelligence and Information Systems
    • /
    • v.24 no.2
    • /
    • pp.1-19
    • /
    • 2018
  • Since forecasting stock movements is important both academically and practically, studies on stock price prediction have been actively conducted. Stock price forecasting research is divided by data type into structured and unstructured data, and by method into technical analysis, fundamental analysis, and media-effect analysis. In the big data era, research that combines big data for stock price prediction is actively under way, and such research relies mainly on machine learning techniques. Methods that incorporate media effects have recently attracted particular attention, among which approaches that analyze online news to forecast stock prices are becoming mainstream. Previous studies predicting stock prices from online news mostly perform sentiment analysis of the news, build a separate corpus for each company, and construct a dictionary that predicts stock prices by recording responses to past stock prices. These studies therefore examine the impact of online news on individual companies; for example, stock movements of Samsung Electronics are predicted using only news about Samsung Electronics. More recently, methods that consider influence among highly related companies have also been studied; for example, stock movements of Samsung Electronics are predicted using news about both Samsung Electronics and a closely related company such as LG Electronics. These studies examine the effect of news from an industrial sector, assumed to be homogeneous, on an individual company, with homogeneous industries classified according to the Global Industry Classification Standard. In other words, the existing studies assume that the sectors defined by the Global Industry Classification Standard are homogeneous.
However, existing studies have limitations: they neither account for influential, highly related companies nor reflect the heterogeneity that exists within the same Global Industry Classification Standard sectors. Our examination of various sectors shows that some sectors are not homogeneous groups. To overcome these limitations, our study proposes a methodology that reflects the heterogeneous effects of the industrial sector on the stock price by applying k-means clustering. Multiple Kernel Learning is commonly used to integrate data with different characteristics: it has several kernels, each of which receives different data and contributes to the prediction. To incorporate the effects of the target firm and its related firms simultaneously, we used Multiple Kernel Learning, assigning each kernel to predict stock prices from the financial news of one of the industrial groups produced by k-means cluster analysis around the target firm. To show that the suggested methodology is appropriate, experiments were conducted on three years of online news and stock prices. The results are as follows. (1) The information from industrial sectors related to the target company contains meaningful signal for predicting the target company's stock movements, and the machine learning algorithm has better predictive power when the news of the related companies is considered together with the target company's own news. (2) It is important to vary the number of clusters according to the level of homogeneity in the industrial sector: when stock prices within a sector are homogeneous, relational effects should be used at the level of the whole industry group, or with a small number of clusters;
when they are heterogeneous, the firms should be clustered into groups. This study contributes by demonstrating that firms classified under the Global Industry Classification Standard can be heterogeneous, by suggesting that relevance should be defined through machine learning and statistical analysis rather than taken directly from the classification standard, and by proving the efficiency of a prediction model that reflects this heterogeneity.
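The clustering step can be illustrated with scikit-learn: group firms within one nominal sector by the similarity of their return series, then pool the news of the target firm's cluster for one kernel of the MKL model. The two-latent-factor toy data below is ours, not the paper's.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(7)

# Toy daily-return series for 10 firms in one nominal sector: two latent
# sub-groups, so the sector is NOT homogeneous.
factor_a = rng.normal(size=250)
factor_b = rng.normal(size=250)
returns = np.vstack(
    [f + 0.3 * rng.normal(size=250)
     for f in [factor_a] * 5 + [factor_b] * 5]
)

# Cluster firms by their return series; the news of the firms sharing the
# target firm's cluster would then feed one kernel of the MKL model.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(returns)
target = 0
peers = np.where(labels == labels[target])[0]
print("firms clustered with the target:", peers)
```

Here the clustering recovers the two sub-groups, so the target firm's kernel would use news from its five true peers rather than the whole ten-firm sector; the paper's point (2) is about choosing the number of clusters according to how homogeneous the sector actually is.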

Problems with ERP Education at College and How to Solve the Problems (대학에서의 ERP교육의 문제점 및 개선방안)

  • Kim, Mang-Hee;Ra, Ki-La;Park, Sang-Bong
    • Management & Information Systems Review
    • /
    • v.31 no.2
    • /
    • pp.41-59
    • /
    • 2012
  • ERP is a technique of process innovation. It stands for enterprise resource planning, whose purpose is integrated, total management of enterprise resources. ERP can also be seen as one of the latest management systems, one that uses computers to organically connect all business processes, including marketing, production, and delivery, and to control those processes in real time. Currently, however, it is not easy for local enterprises to find operators to take charge of ERP programs, even when they want to introduce such a resource management system. This makes it urgent to train such operators through ERP education at school. In the field of education, however, the lack of professional ERP instructors and the limited effectiveness of learning programs for industrial applications of ERP are obstacles to producing ERP workers as competent as enterprises require. Within ERP, accounting matters more than any other function; accountants are assuming ever more roles in ERP, so demand for experts in ERP accounting is rising rapidly. This study examined previous research and literature on ERP education, identified problems with current ERP education at college, and proposed solutions, as follows. First, prerequisite learning for ERP, namely education in accounting principles, should be intensified so that students acquire a sufficient theoretical grounding. Second, many different scenarios for trying ERP programs in business settings should be created, and students should be taught to understand the incidents and events in those scenarios and to apply that understanding by running ERP themselves. Third, as noted above, ERP is a system that integrates all enterprise resources, such as marketing, procurement, personnel management, remuneration, and production, under the framework of accounting.
It should be noted that under ERP, business activities are organically connected with the accounting modules; more importantly, those modules should be understood not individually but as parts of a single accounting flow. This study is limited in that it is a literature-based study relying heavily on previous studies, publications, and reports. Future work should compare the efficiency of ERP education before and after applying the improvements proposed here, and should survey students' and professors' perceived effectiveness of current ERP education and compare the two groups' perceptions.


Estimation of TROPOMI-derived Ground-level SO2 Concentrations Using Machine Learning Over East Asia (기계학습을 활용한 동아시아 지역의 TROPOMI 기반 SO2 지상농도 추정)

  • Choi, Hyunyoung;Kang, Yoojin;Im, Jungho
    • Korean Journal of Remote Sensing
    • /
    • v.37 no.2
    • /
    • pp.275-290
    • /
    • 2021
  • Sulfur dioxide (SO2) in the atmosphere is generated mainly by anthropogenic emission sources. It forms ultra-fine particulate matter through chemical reactions and has harmful effects on both the environment and human health. In particular, ground-level SO2 concentrations are closely related to human activities. Satellite observations, such as TROPOMI (TROPOspheric Monitoring Instrument)-derived column density data, enable spatially continuous monitoring of ground-level SO2 concentrations. This study proposes a two-step residual-corrected model that estimates ground-level SO2 concentrations through the synergistic use of satellite data and numerical model output, adopting random forest machine learning. The proposed model was evaluated through three cross-validation schemes (random, spatial, and temporal). It produced slopes of 1.14-1.25, R values of 0.55-0.65, and relative root-mean-square errors of 58-63%, improvements of 10% in slope and 3% in R and rRMSE over the model without residual correction. Performance was slightly lower in Japan, often with overestimation, where the sample size was small and the concentration level relatively low. The spatial and temporal distributions of SO2 produced by the model agreed with those of the in-situ measurements, especially over the Yangtze River Delta in China and the Seoul Metropolitan Area in South Korea, regions strongly shaped by anthropogenic emission sources. The proposed model can be used for long-term monitoring of ground-level SO2 concentrations in both space and time.
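The two-step residual correction can be sketched with scikit-learn on synthetic stand-ins for the satellite and numerical-model predictors. Out-of-fold residuals are used here so the second forest does not merely memorize the first forest's near-perfect training fit; that detail is our assumption, not something stated in the abstract.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(3)

# Synthetic stand-ins for the predictors (e.g., TROPOMI column density,
# numerical-model SO2, meteorology) and the ground-level SO2 target.
X = rng.normal(size=(1500, 5))
y = (2 * X[:, 0] + np.sin(3 * X[:, 1]) + 0.5 * X[:, 2] ** 2
     + 0.1 * rng.normal(size=1500))
X_tr, X_te, y_tr, y_te = X[:1000], X[1000:], y[:1000], y[1000:]

# Step 1: base random forest estimates the concentration.
rf1 = RandomForestRegressor(n_estimators=100, random_state=0).fit(X_tr, y_tr)

# Step 2: a second forest models the residuals of step 1 (computed
# out-of-fold); the final estimate adds the predicted residual back.
oof = cross_val_predict(
    RandomForestRegressor(n_estimators=100, random_state=0), X_tr, y_tr, cv=3)
rf2 = RandomForestRegressor(n_estimators=100, random_state=0).fit(X_tr, y_tr - oof)
pred = rf1.predict(X_te) + rf2.predict(X_te)

rmse_base = float(np.sqrt(np.mean((y_te - rf1.predict(X_te)) ** 2)))
rmse_corrected = float(np.sqrt(np.mean((y_te - pred) ** 2)))
print(rmse_base, rmse_corrected)
```

The residual step acts like one boosting iteration on top of the base forest; whether it helps depends on how much systematic structure the first model leaves behind, which is why the paper reports the correction's gains separately (about 10% in slope, 3% in R and rRMSE).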