• Title/Summary/Keyword: Neural network analysis

Optimal supervised LSA method using selective feature dimension reduction (선택적 자질 차원 축소를 이용한 최적의 지도적 LSA 방법)

  • Kim, Jung-Ho; Kim, Myung-Kyu; Cha, Myung-Hoon; In, Joo-Ho; Chae, Soo-Hoan
    • Science of Emotion and Sensibility / v.13 no.1 / pp.47-60 / 2010
  • Most classification research has relied on kNN (k-Nearest Neighbor) and SVM (Support Vector Machine), known as learning-based models, and on the Bayesian classifier and NNA (Neural Network Algorithm), known as statistics-based methods. However, these approaches face space and time limitations when classifying the very large number of web pages on today's internet. Moreover, most classification studies use a uni-gram feature representation, which does not capture the real meaning of words well. Korean web page classification poses an additional problem because many Korean words have multiple meanings (polysemy). For these reasons, LSA (Latent Semantic Analysis) has been proposed for classification in this environment (large data sets and polysemous words). LSA uses SVD (Singular Value Decomposition), which decomposes the original term-document matrix into three matrices and reduces their dimensions. This makes it possible to create a new low-dimensional semantic space for representing vectors, which makes classification efficient and allows the latent meaning of words or documents (or web pages) to be analyzed. Although LSA is good at classification, it has a drawback: when SVD reduces the dimensions of the matrix and creates the new semantic space, it considers which dimensions represent the vectors well, not which dimensions discriminate between them well. This is why LSA does not improve classification performance as much as expected. In this paper, we propose a new LSA that selects the optimal dimensions for both discriminating and representing vectors, minimizing this drawback and improving performance. The proposed method shows better and more stable performance than other LSA methods in low-dimensional space. In addition, we obtain further improvement in classification by creating and selecting features through stopword removal and statistical weighting.
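
For reference, the SVD step described in this abstract can be written in standard notation as follows. This is the textbook formulation of LSA, not the dimension-selection procedure the paper proposes; the symbols $A$, $U$, $\Sigma$, $V$, $k$, $d$ and $\hat{d}$ are introduced here for illustration only.

```latex
\begin{aligned}
A &= U \Sigma V^{\top} && \text{(SVD of the $m \times n$ term-document matrix)} \\
A_k &= U_k \Sigma_k V_k^{\top} && \text{(keep the $k$ largest singular values)} \\
\hat{d} &= \Sigma_k^{-1} U_k^{\top} d && \text{(fold a document vector $d$ into the $k$-dimensional space)}
\end{aligned}
```

Standard LSA keeps the $k$ dimensions with the largest singular values, which are the ones that best represent the data; the abstract's point is that these are not necessarily the dimensions that best separate the classes.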

VKOSPI Forecasting and Option Trading Application Using SVM (SVM을 이용한 VKOSPI 일 중 변화 예측과 실제 옵션 매매에의 적용)

  • Ra, Yun Seon; Choi, Heung Sik; Kim, Sun Woong
    • Journal of Intelligence and Information Systems / v.22 no.4 / pp.177-192 / 2016
  • Machine learning is a field of artificial intelligence. It refers to an area of computer science concerned with giving machines the ability to perform their own data analysis, decision making, and forecasting. One representative machine learning model is the artificial neural network, a statistical learning algorithm inspired by biological neural networks. Other machine learning models include decision trees, naive Bayes, and the SVM (support vector machine). Among these, we use the SVM in this study because it is mainly used for classification and regression analysis, which fits our study well. The core principle of SVM is to find a reasonable hyperplane that separates different groups in the data space. Given information about the data in any two groups, the SVM model judges to which group new data belongs based on the hyperplane obtained from the given data set. Thus, the more meaningful data there is, the better the machine learning ability. In recent years, many financial experts have focused on machine learning, seeing the potential of combining it with the financial field, where vast amounts of data exist. Machine learning techniques have proved powerful in describing non-stationary and chaotic stock price dynamics, and many studies have successfully forecast stock prices using machine learning algorithms. Recently, financial companies have begun to provide Robo-Advisor services (a compound of Robot and Advisor), which perform various financial tasks through advanced algorithms using huge amounts of rapidly changing data. A Robo-Advisor's main tasks are to advise investors according to their personal investment propensity and to manage their portfolios automatically.
In this study, we propose a method of forecasting the Korean volatility index, VKOSPI, using the SVM model and applying the forecast to real option trading to increase trading performance. VKOSPI is a measure of the future volatility of the KOSPI 200 index based on KOSPI 200 index option prices. It is similar to the VIX index, which is based on S&P 500 option prices in the United States. The Korea Exchange (KRX) calculates and announces the VKOSPI index in real time. Like ordinary volatility, VKOSPI affects option prices. VKOSPI and option prices move in the same direction regardless of the option type (call and put options with various strike prices). If volatility increases, all call and put option premiums increase because the probability that the options will be exercised increases. Investors can track in real time how much an option price rises for a given rise in volatility through Vega, the Black-Scholes measure of an option's sensitivity to changes in volatility. Therefore, accurate forecasting of VKOSPI movements is one of the important factors that can generate profit in option trading. In this study, we verified with real option data that accurate forecasts of VKOSPI can produce large profits in real option trading. To the best of our knowledge, no previous study has predicted the direction of VKOSPI with machine learning and applied the prediction to actual option trading. We predicted daily VKOSPI changes with the SVM model and then entered an intraday option strangle position, which profits as option prices fall, only when VKOSPI was expected to decline during the day. We analyzed the results and tested whether the SVM's predictions are applicable to real option trading.
The results showed that the prediction accuracy for VKOSPI was 57.83% on average and that the number of position entries was 43.2, less than half of the benchmark (100). A small number of trades is an indicator of trading efficiency. In addition, the experiment showed that the trading performance was significantly higher than the benchmark.
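
The statement that call and put premiums both rise with volatility can be illustrated with the Black-Scholes Vega mentioned in this abstract. The sketch below assumes a European option on an underlying that pays no dividends; the function names and all input values are illustrative and are not taken from the paper.

```python
import math


def norm_pdf(x: float) -> float:
    """Standard normal probability density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def norm_cdf(x: float) -> float:
    """Standard normal cumulative distribution, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_d1(spot: float, strike: float, rate: float, vol: float, t: float) -> float:
    """Black-Scholes d1 term (t is time to expiry in years)."""
    return (math.log(spot / strike) + (rate + 0.5 * vol * vol) * t) / (vol * math.sqrt(t))


def bs_price(spot, strike, rate, vol, t, kind="call"):
    """Black-Scholes price of a European call or put (no dividends assumed)."""
    d1 = bs_d1(spot, strike, rate, vol, t)
    d2 = d1 - vol * math.sqrt(t)
    discounted_strike = strike * math.exp(-rate * t)
    if kind == "call":
        return spot * norm_cdf(d1) - discounted_strike * norm_cdf(d2)
    return discounted_strike * norm_cdf(-d2) - spot * norm_cdf(-d1)


def bs_vega(spot, strike, rate, vol, t):
    """Vega: price change per unit change in volatility (same for calls and puts)."""
    return spot * norm_pdf(bs_d1(spot, strike, rate, vol, t)) * math.sqrt(t)


# Hypothetical inputs, chosen only to exercise the functions.
params = dict(spot=100.0, strike=100.0, rate=0.02, t=0.25)
for kind in ("call", "put"):
    low = bs_price(vol=0.20, kind=kind, **params)
    high = bs_price(vol=0.25, kind=kind, **params)
    print(kind, round(low, 4), round(high, 4))
print("vega", round(bs_vega(vol=0.20, **params), 4))
```

Because Vega is positive and identical for calls and puts in this model, a position that is short both, such as a short strangle, gains when volatility falls, which is consistent with the trading rule described in the abstract.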

Comparison of Disk Tension Infiltrometer and van Genuchten-Mualem Model on Estimation of Unsaturated Hydraulic Conductivity (장력 침투계(Disk Tension Infiltrometer)와 van Genuchten-Mualem 모형 적용에 따른 불포화수리 전도도의 비교 해석)

  • Hur, Seung-Oh; Jung, Kang-Ho; Park, Chan-Won; Ha, Sang-Keun; Kim, Geong-Gyu
    • Korean Journal of Soil Science and Fertilizer / v.39 no.5 / pp.259-267 / 2006
  • Hydraulic conductivity is the rate of water flux per unit hydraulic gradient. The van Genuchten-Mualem (VGM) model, which is frequently used to describe the unsaturated state of soils, is a function of soil water potential and soil water content and requires various parameters. This study obtained the VGM parameters using the Rosetta computer program, which is based on neural network analysis. The VGM parameters were Ko (effective saturated hydraulic conductivity), $\theta_r$ (residual soil water content), $\theta_s$ (saturated soil water content), L, n and m. The unsaturated hydraulic conductivity at 10 kPa was calculated using the Rosetta program. Unsaturated hydraulic conductivities of 17 soil series at 1, 3, 5 and 7 kPa were also obtained from the saturated hydraulic conductivity measured with a disk tension infiltrometer, based on Gardner and Wooding's equation. Water flow at a water potential of 3 kPa was very low except in the Namgye, Hagog, Baegsan, Sangju, Seogcheon and Yesan soil series. Unsaturated hydraulic conductivity at 1 kPa was highest for the Samgag soil series, followed in order by the Yesan, Hwabong, Hagog and Baegsan soil series. Those of the Gacheon, Seocheon and Ugog soil series were very low. When the VGM values were compared with the disk tension infiltrometer values, soils without gravel followed an exponential trend, but soils containing gravel showed no trend. In conclusion, the VGM model has limited applicability to unsaturated hydraulic conductivity analysis of Korean agricultural land that contains gravel, has steep slopes, and has shallow soil depth.
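
To make the role of the parameters concrete, here is a minimal sketch of the standard van Genuchten-Mualem conductivity function. It uses the usual Mualem constraint m = 1 - 1/n and a scale parameter `alpha`, which is part of the standard model but is not listed in the abstract; the parameter values are hypothetical and are not Rosetta outputs or results from the paper.

```python
def effective_saturation(h: float, alpha: float, n: float) -> float:
    """van Genuchten effective saturation Se for suction h (h >= 0, units of 1/alpha)."""
    m = 1.0 - 1.0 / n  # Mualem constraint linking m and n
    return (1.0 + (alpha * abs(h)) ** n) ** (-m)


def vgm_conductivity(h: float, ko: float, alpha: float, n: float, pore_l: float = 0.5) -> float:
    """Unsaturated hydraulic conductivity K(h) from the van Genuchten-Mualem model.

    ko     : matching-point (effective saturated) conductivity, sets the units of K
    pore_l : pore-connectivity exponent (the L parameter); 0.5 is a common default
    """
    m = 1.0 - 1.0 / n
    se = effective_saturation(h, alpha, n)
    return ko * se ** pore_l * (1.0 - (1.0 - se ** (1.0 / m)) ** m) ** 2


# Hypothetical parameters: suction in kPa, alpha in 1/kPa, ko in arbitrary units.
for suction in (0.0, 1.0, 3.0, 5.0, 7.0, 10.0):
    k = vgm_conductivity(suction, ko=10.0, alpha=0.5, n=1.5)
    print(f"h = {suction:4.1f} kPa  K = {k:.5f}")
```

At zero suction the function returns `ko`, and conductivity falls steeply as suction increases, which is why values at 3 kPa and above can be very small for many soils.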

Requirement Analysis for Agricultural Meteorology Information Service Systems based on the Fourth Industrial Revolution Technologies (4차 산업혁명 기술에 기반한 농업 기상 정보 시스템의 요구도 분석)

  • Kim, Kwang Soo; Yoo, Byoung Hyun; Hyun, Shinwoo; Kang, DaeGyoon
    • Korean Journal of Agricultural and Forest Meteorology / v.21 no.3 / pp.175-186 / 2019
  • Efforts have been made to introduce climate-smart agriculture (CSA) for adaptation to future climate conditions, which requires the collection and management of site-specific meteorological data. The objectives of this study were to identify the requirements for constructing an agricultural meteorology information service system (AMISS) using technologies that lead to the fourth industrial revolution, e.g., the internet of things (IoT), artificial intelligence, and cloud computing. IoT sensors with low cost and low operating current would be useful for organizing a wireless sensor network (WSN) for the collection and analysis of weather measurement data, which would help assess the productivity of an agricultural ecosystem. It is recommended to extend the spatial extent of the WSN to a rural community, which would benefit a greater number of farms. It is preferable to create big data for agricultural meteorology in order to produce and evaluate site-specific data in rural areas. The digital climate map can be improved using artificial intelligence such as deep neural networks. Furthermore, cloud computing and fog computing would help reduce costs and enhance the user experience of the AMISS. In addition, it would be advantageous to combine environmental data and farm management data, e.g., price data for the produce of interest. A mobile application whose user interface meets the needs of stakeholders would also need to be developed. These fourth industrial revolution technologies would facilitate the development of the AMISS and the wide application of CSA.

A Study on the Effect of the Document Summarization Technique on the Fake News Detection Model (문서 요약 기법이 가짜 뉴스 탐지 모형에 미치는 영향에 관한 연구)

  • Shim, Jae-Seung; Won, Ha-Ram; Ahn, Hyunchul
    • Journal of Intelligence and Information Systems / v.25 no.3 / pp.201-220 / 2019
  • Fake news has emerged as a significant issue over the last few years, igniting discussions and research on how to solve this problem. In particular, studies on automated fact-checking and fake news detection using artificial intelligence and text analysis techniques have drawn attention. Fake news detection research entails a form of document classification; thus, document classification techniques have been widely used in this type of research. However, document summarization techniques have been inconspicuous in this field. At the same time, automatic news summarization services have become popular, and a recent study found that the use of news summarized through abstractive summarization has strengthened the predictive performance of fake news detection models. Therefore, the need to study the integration of document summarization technology in the domestic news data environment has become evident. In order to examine the effect of extractive summarization on the fake news detection model, we first summarized news articles through extractive summarization. Second, we created a summarized news-based detection model. Finally, we compared our model with the full-text-based detection model. The study found that BPN(Back Propagation Neural Network) and SVM(Support Vector Machine) did not exhibit a large difference in performance; however, for DT(Decision Tree), the full-text-based model demonstrated a somewhat better performance. In the case of LR(Logistic Regression), our model exhibited the superior performance. Nonetheless, the results did not show a statistically significant difference between our model and the full-text-based model. Therefore, when the summary is applied, at least the core information of the fake news is preserved, and the LR-based model can confirm the possibility of performance improvement. 
This study features an experimental application of extractive summarization in fake news detection research by employing various machine-learning algorithms. The study's limitations are, essentially, the relatively small amount of data and the lack of comparison between various summarization technologies. Therefore, an in-depth analysis that applies various analytical techniques to a larger data volume would be helpful in the future.

Clustering and classification of residential noise sources in apartment buildings based on machine learning using spectral and temporal characteristics (주파수 및 시간 특성을 활용한 머신러닝 기반 공동주택 주거소음의 군집화 및 분류)

  • Jeong-hun Kim; Song-mi Lee; Su-hong Kim; Eun-sung Song; Jong-kwan Ryu
    • The Journal of the Acoustical Society of Korea / v.42 no.6 / pp.603-616 / 2023
  • In this study, machine learning-based clustering and classification of residential noise in apartment buildings was conducted using frequency and temporal characteristics. First, a residential noise source dataset was constructed. The dataset consisted of floor impact, airborne, plumbing and equipment, environmental, and construction noise. Clustering was performed with the K-Means method. For frequency characteristics, Leq and Lmax values were derived in 1/1 and 1/3 octave bands for each sound source. For temporal characteristics, Leq values were derived every 6 ms through sound pressure level analysis over 5 s. The number of clusters k was determined through the silhouette coefficient and the elbow method. Clustering by frequency characteristics produced three clusters for both the Leq and Lmax analyses. Clustering by temporal characteristics produced 9 clusters for Leq and 11 clusters for Lmax. Clustering by frequency characteristics grouped the sources according to the proportion of the low-frequency band. Then, to utilize the clustering results, the residential noise sources were classified using three machine learning methods. Classification accuracy and f1-score were highest for data labeled with Leq values in 1/3 octave bands, and the best classifier was an Artificial Neural Network (ANN) model using both frequency and temporal features, with 93 % accuracy and a 92 % f1-score.
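
The idea of choosing k with the silhouette coefficient can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not the study's pipeline: the 2-D points stand in for the octave-band features, the farthest-point seeding is chosen only to keep the run deterministic, and the function names are hypothetical.

```python
import math


def dist(a, b):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(points, k, iters=50):
    """Lloyd's algorithm with deterministic farthest-point seeding."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point joins its nearest center.
        labels = [min(range(k), key=lambda j: dist(p, centers[j])) for p in points]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centers.append(centers[j])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return labels


def silhouette(points, labels):
    """Mean silhouette coefficient; singleton clusters score 0 by convention."""
    scores = []
    for i, p in enumerate(points):
        own = [dist(p, q) for j, q in enumerate(points) if labels[j] == labels[i] and j != i]
        if not own:
            scores.append(0.0)
            continue
        a = sum(own) / len(own)  # mean distance to the point's own cluster
        b = min(                 # mean distance to the nearest other cluster
            sum(dist(p, q) for q, lab in zip(points, labels) if lab == other) / labels.count(other)
            for other in set(labels) if other != labels[i]
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Hypothetical toy data: three well-separated groups of four points each.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (0, 10), (0, 11), (1, 10), (1, 11)]
scores = {k: silhouette(points, kmeans(points, k)) for k in range(2, 5)}
best_k = max(scores, key=scores.get)
print(scores, best_k)
```

In practice the silhouette curve is usually read together with the elbow of the within-cluster sum of squares, as the abstract describes, because either criterion alone can be ambiguous on real data.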

A Comparative Study on Failure Prediction Models for Small and Medium Manufacturing Company (중소제조기업의 부실예측모형 비교연구)

  • Hwangbo, Yun; Moon, Jong Geon
    • Asia-Pacific Journal of Business Venturing and Entrepreneurship / v.11 no.3 / pp.1-15 / 2016
  • This study analyzed the prediction capabilities of a multivariate model, a logistic regression model, and an artificial neural network model based on the financial information of small and medium-sized companies listed on KOSDAQ. A total of 166 firms were sampled: 83 companies delisted from 2009 to 2012 and 83 normal companies. The models were trained on 100 companies (50 delisted and 50 normal) selected at random from the 166. The remaining 66 companies were used to verify the accuracy of the models. Each model was designed by carrying out t-tests on 79 financial ratios for the last 5 years and identifying 9 significant variables. The t-tests showed that financial profitability variables were the major variables for predicting financial risk at an early stage, while financial stability and cash flow variables were identified as additional significant variables at a later stage of insolvency. When the prediction capabilities of the models were compared, the logistic regression model exhibited the highest accuracy on the training data, while the artificial neural network model provided the most accurate results on the test data. This study differs from previous research as follows. First, it considered the time-series aspect, in light of the fact that failure proceeds gradually. Second, while previous studies constructed multivariate discriminant models ignoring normality, this study reviewed the normality of the independent variables and performed comparisons with the other models. The policy implication of this study is that the reliability of disclosure documents is important because the symptoms of a firm's failure appear in its financial statements. Therefore, institutional arrangements for restraining moral laxity on the part of accounting firms and their employees should be strengthened.
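
The variable-screening step (t-tests over many financial ratios to keep the significant ones) can be sketched as below. The abstract does not say which t-test variant was used, so Welch's unequal-variance statistic is an assumption here; the ratio names, the sample values, and the cutoff of 2.0 are all hypothetical.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's two-sample t statistic (does not assume equal variances)."""
    return (mean(a) - mean(b)) / math.sqrt(variance(a) / len(a) + variance(b) / len(b))


def screen(failed, normal, threshold=2.0):
    """Return the ratio names whose |t| between the two groups exceeds the threshold.

    The threshold is a rough illustrative cutoff, not a p-value computation.
    """
    return [name for name in failed
            if abs(welch_t(failed[name], normal[name])) > threshold]


# Hypothetical ratios for delisted ("failed") and normal firms.
failed = {
    "roa": [-0.12, -0.08, -0.15, -0.05, -0.10],
    "inventory_turnover": [5.1, 4.8, 5.3, 4.9, 5.0],
}
normal = {
    "roa": [0.04, 0.06, 0.02, 0.08, 0.05],
    "inventory_turnover": [5.0, 5.2, 4.7, 5.1, 4.9],
}
print(screen(failed, normal))
```

A screen like this would be run on the training firms only, so that the held-out firms remain untouched when the models' accuracy is checked.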

Development of DL-MCS Hybrid Expert System for Automatic Estimation of Apartment Remodeling (공동주택 리모델링 자동견적을 위한 DL-MCS Hybrid Expert System 개발)

  • Kim, Jun; Cha, Heesung
    • Korean Journal of Construction Engineering and Management / v.21 no.6 / pp.113-124 / 2020
  • Social movements to improve building performance through the remodeling of aging apartment houses are emerging. To this end, remodeling construction cost analysis, structural analysis, and policy and institutional reviews have been conducted to suggest ways to promote remodeling. However, although methods of analyzing construction costs for apartment remodeling are being proposed for research purposes, their practical applicability is limited. Specifically, they are applicable to cases that have already been completed or are in progress, but practical use also requires cost analysis of cases that will occur in the future, so the analysis methods lack sustainability. To address this, we suggest an automated estimating method. To make construction cost estimates sustainable, deep learning was introduced into the estimating procedure. Specifically, a method was presented for automatically finding the relationships among design elements, work types, and cost increase factors that can occur in apartment remodeling. In addition, Monte Carlo simulation was included in the estimation procedure to compensate for the lack of uncertainty information, which is an inherent limitation of deep learning-based estimation. So that accuracy increases as cases accumulate, a method of improving accuracy by comparing the estimate with the existing accumulated data was also suggested. To validate the sustainability of the automated estimates proposed in this study, learning procedures on 13 cases and cumulative procedures on an additional 2 cases were performed. As a result, a new construction cost estimating procedure that reflects the characteristics of the two additional projects was automatically produced.
In this study, the estimation method was applied to 15 cases. If more cases are accumulated and reflected, the effect of this study is expected to increase.

A Study of Prediction of Daily Water Supply Using ANFIS (ANFIS를 이용한 상수도 1일 급수량 예측에 관한 연구)

  • Rhee, Kyoung-Hoon; Moon, Byoung-Seok; Kang, Il-Hwan
    • Journal of Korea Water Resources Association / v.31 no.6 / pp.821-832 / 1998
  • This study investigates the prediction of daily water supply, which is necessary for the efficient management of a water distribution system. A fuzzy neuron, a form of artificial intelligence, is a neural network into which fuzzy information is input and then processed. In this study, daily water supply was predicted through an adaptive learning method in which a membership function and fuzzy rules were adapted for daily water supply prediction. Prediction methods were investigated using data on the amount of water supplied to the city of Kwangju. To choose variables, four analyses of the input data were conducted: correlation analysis, autocorrelation analysis, partial autocorrelation analysis, and cross-correlation analysis. The input variables were (a) the amount of water supplied, (b) the mean temperature, and (c) the population of the area supplied with water. The variables were combined in an integrated model. A model using only the daily water supply data was also built and validated for the case in which the weather forecast from the meteorological office is not reliable. The proposed models cover accidental cases such as a suspension of water supply. The maximum error rate between the model estimate and the actual measurement was 18.35%, and the average error was lower than 2.36%. The model is expected to be usable for real-time estimation in the operational control of waterworks and water/drain pipes.

Data Mining Analysis of Determinants of Alcohol Problems of Youth from an Ecological Perspective (청년의 문제음주에 미치는 사회생태학적 결정요인에 관한 데이터 마이닝 분석)

  • Lee, Suk-Hyun; Moon, Sang Ho
    • Korean Journal of Social Welfare Studies / v.49 no.4 / pp.65-100 / 2018
  • Korean youth are facing diverse problems. For instance, they are even called the '7 given-up generation', which indicates that they have given up marriage, childbirth, social relationships, housing, dreams and hope. From this standpoint, the study holds that the factors influencing the alcohol problems of youth should be studied from a socio-ecological perspective. It adopted data mining methods, using SAS Enterprise Miner to analyze 2,538 youths. Specifically, the study applied decision tree analysis, an artificial neural network, and logistic analysis and chose the model with the best predictive performance. As a result, the study found that gender, age, smoking, spouse, number of family members, job searching and economic participation are statistically significant determinants of the alcohol problems of youth. Specifically, those who are male, are younger, have a spouse, have fewer family members, are searching for jobs, have more income, and have a job were more prone to alcohol problems. Based on the results, this study proposed measures addressing addiction problems among youth and is expected to contribute to implementing procedures for dealing with alcohol problems.