• Title/Summary/Keyword: Generate Data

Increasing Accuracy of Stock Price Pattern Prediction through Data Augmentation for Deep Learning (데이터 증강을 통한 딥러닝 기반 주가 패턴 예측 정확도 향상 방안)

  • Kim, Youngjun;Kim, Yeojeong;Lee, Insun;Lee, Hong Joo
    • The Journal of Bigdata / v.4 no.2 / pp.1-12 / 2019
  • As artificial intelligence (AI) technology develops, it is being applied to fields such as image, voice, and text processing, and it has shown good results in some of them. Researchers have also tried to use AI to predict the stock market, which is known to be a difficult problem because the market is affected by many factors, including the economy and politics. Within AI, there have been attempts to predict rises and falls in stock prices by learning stock price patterns with various machine learning techniques. This study proposes a way of predicting stock price patterns with a convolutional neural network (CNN). A CNN classifies images by extracting features from them through convolutional layers, so this study classifies candlestick images built from stock data in order to predict patterns. The study has two objectives. The first, referred to as Case 1, is to predict patterns from images made with same-day stock price data. The second, referred to as Case 2, is to predict the next day's stock price pattern from images produced with daily stock price data. In Case 1, two data augmentation methods, random modification and Gaussian noise, are applied to generate more training data, and the generated images are used to fit the model. Given that deep learning requires a large amount of data, the study proposes a data augmentation method for candlestick images. It also compares the accuracies of images with Gaussian noise and of the different classification problems. All data were collected through the OpenAPI provided by DaiShin Securities. Case 1 has five labels, depending on the pattern: up with up closing, up with down closing, down with up closing, down with down closing, and staying. The Case 1 images are created by removing the last candle (-1candle), the last two candles (-2candles), or the last three candles (-3candles) from 60-minute, 30-minute, 10-minute, and 5-minute candle charts. In a 60-minute candle chart, each candle in the image carries 60 minutes of information: an open, high, low, and close price. Case 2 has two labels, up and down, and its images are generated from 60-minute, 30-minute, 10-minute, and 5-minute candle charts without removing any candle. Given the nature of stock data, the study proposes moving the candles in the images instead of using existing data augmentation techniques. The amount by which the candles are moved is defined as the modified value. The average difference in closing prices between candles was 0.0029, so modified values of 0.003, 0.002, 0.001, and 0.00025 are used. The number of images doubled after data augmentation. For the Gaussian noise, the mean was 0 and the variance was 0.01. For both Case 1 and Case 2, the model is based on VGG-Net16, which has 16 layers. Among the 60-minute, 30-minute, 10-minute, and 5-minute candle charts, the 10-minute -1candle images showed the best accuracy, so 10-minute images were used for the rest of the Case 1 experiments. The images with three candles removed were selected for data augmentation and the application of Gaussian noise. The 10-minute -3candle images gave 79.72% accuracy. With a modified value of 0.00025 and 100% of the candles changed, accuracy was 79.92%, and applying Gaussian noise raised it to 80.98%. In Case 2, 60-minute candle charts predicted the next day's pattern with 82.60% accuracy. This study is expected to contribute to further work on predicting stock price patterns from images, and it provides a possible method for augmenting stock data.
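
The two augmentation steps described in this abstract (moving candles by a modified value, and adding Gaussian noise with mean 0 and variance 0.01) can be sketched roughly as follows. This is only an illustration under assumptions that the abstract does not state: prices are taken to be normalised so that a shift of 0.00025 is meaningful, each selected candle is moved up or down as a whole with a random sign, and images are arrays with pixel values in [0, 1]. The function names `shift_candles` and `add_gaussian_noise` are hypothetical.

```python
import numpy as np

def shift_candles(ohlc, modified_value, fraction=1.0, rng=None):
    """Return a copy of `ohlc` with a random subset of candles moved vertically.

    ohlc: array of shape (n_candles, 4) with normalised open, high, low, close.
    modified_value: size of the move, e.g. 0.00025 (value taken from the abstract).
    fraction: share of candles that are moved (1.0 means 100% changed candles).
    The random up/down direction is an assumption of this sketch.
    """
    rng = np.random.default_rng() if rng is None else rng
    out = np.array(ohlc, dtype=float, copy=True)
    n_candles = out.shape[0]
    n_changed = int(round(fraction * n_candles))
    idx = rng.choice(n_candles, size=n_changed, replace=False)
    signs = rng.choice([-1.0, 1.0], size=n_changed)
    # Shift all four prices of each selected candle by the same amount.
    out[idx] += (signs * modified_value)[:, None]
    return out

def add_gaussian_noise(image, mean=0.0, variance=0.01, rng=None):
    """Add Gaussian noise to an image with pixel values in [0, 1] (assumed range)."""
    rng = np.random.default_rng() if rng is None else rng
    noisy = image + rng.normal(mean, np.sqrt(variance), size=image.shape)
    return np.clip(noisy, 0.0, 1.0)
```

Appending the shifted copies to the original set is one way to arrive at the doubled number of images the abstract mentions; how the authors actually combined them is not stated.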

An Integrated Hierarchical Temporal Memory Network for Multi-interval Prediction of Data Streams (데이터 스트림의 다중-간격 예측을 위한 통합된 계층형 시간적 메모리 네트워크)

  • Diao, Jian-Hua;Bae, Sun-Gap;Sim, Myung-Sun;Bae, Jong-Min;Kang, Hyun-Syug
    • Journal of KIISE:Software and Applications / v.37 no.7 / pp.558-567 / 2010
  • A large body of ongoing research aims to develop efficient prediction methods for data streams. These methods provide a single prediction with a fixed time interval. A method for multi-interval prediction (MIP) is needed because, in many cases, different intervals yield different prediction results. In this paper, we propose a solution for MIP based on the Hierarchical Temporal Memory (HTM) model. To solve the MIP problem with HTM, we present an Integrated Hierarchical Temporal Memory (IHTM) network that adds a new node type, Zeta1LastNode, to the original HTM network. Using the hierarchical structure of the IHTM network, different levels of the network learn and model the features of a data stream at different intervals and generate prediction results for those intervals. Performance evaluation shows that, for MIP, the IHTM network is more efficient in memory and time consumption than the original HTM network.

Developing the Satellite Image based e-Thematic Construction and Management System -Case Study of Supporting Forest Administrative Service- (위성영상기반 전자주제도 작성 및 관리시스템 개발 - 산림행정업무지원서비스를 사례연구로 -)

  • Jo, Myung-Hee;Jo, Yun-Won
    • Journal of the Korean Association of Geographic Information Studies / v.9 no.1 / pp.89-100 / 2006
  • The recent rapid development of domestic spatial information technology and the successful construction of the Korean NGIS (National Geographic Information System) have laid the foundation for scientific use and management of the national territory. However, constructing thematic maps to support various administrative services remains a problem, because administrative officials tend to depend on paper maps and inventories to generate, process, update, and manage spatial information. There is therefore a growing need for a GIS that supports the effective construction of various thematic maps. In this study, a highly accurate, satellite-image-based e-thematic map construction and management system was developed to support forest administrative services such as generating user-defined forest thematic maps, modifying them, analyzing them, and outputting them using GIS, GPS, and satellite images. For the case study, the existing paper forest map of Jeju Island was converted into raster and vector data using satellite images to retain more exact location information, so that the system helps manage domestic spatial information scientifically and effectively in less time and supports the standard for domestic spatial information. Moreover, the system serves as a DSS (Decision Support System) for forest administrative affairs by integrating attribute data, managing GPS data, and linking multimedia data. An additional main objective of this study was to obtain a powerful GIS component, called the e-mapping component, which enables interoperability and reusability within this application. In future work, the core ideas and technology of this study could be applied to other official work, such as constructing thematic maps and supporting the related administrative affairs.

Classification of Remote Sensing Data using Random Selection of Training Data and Multiple Classifiers (훈련 자료의 임의 선택과 다중 분류자를 이용한 원격탐사 자료의 분류)

  • Park, No-Wook;Yoo, Hee Young;Kim, Yihyun;Hong, Suk-Young
    • Korean Journal of Remote Sensing / v.28 no.5 / pp.489-499 / 2012
  • This paper presents a classifier ensemble framework for remote sensing data classification that combines classification results generated from both different training sets and different classifiers. The core of the framework is to increase the diversity among classification results, by using both different training sets and different classifiers, in order to improve classification accuracy. First, training sets with different sampling densities are generated and used as inputs for supervised classification with classifiers that have different discrimination capabilities. Then the preliminary classification results are combined via a majority voting scheme to generate a final classification result. A case study of land-cover classification using multi-temporal ENVISAT ASAR data sets illustrates the potential of the framework. In the case study, nine classification results were combined; they were generated from three different training sets and three different classifiers: a maximum likelihood classifier, a multi-layer perceptron classifier, and a support vector machine. The results showed that complementary information on the discrimination of the land-cover classes of interest could be extracted within the proposed framework and that the best classification accuracy was obtained. In comparisons of different combinations, combining classification results from classifiers with little diversity did not improve classification accuracy. It is therefore recommended to ensure greater diversity among classifiers when designing multiple classifier systems.
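
The majority voting step that combines the preliminary classification results can be sketched as below. This is a minimal illustration, not the authors' implementation: the function name `majority_vote` is hypothetical, class labels are assumed to be non-negative integers, and ties are resolved in favour of the smallest label, which is an arbitrary choice made here.

```python
import numpy as np

def majority_vote(label_maps):
    """Combine several per-pixel classifications by majority vote.

    label_maps: array of shape (n_results, n_pixels) with integer class labels,
    e.g. nine rows for three training sets times three classifiers.
    Ties go to the smallest class label (assumption of this sketch).
    """
    label_maps = np.asarray(label_maps)
    n_classes = int(label_maps.max()) + 1
    # counts[c, p] = number of results that assigned class c to pixel p
    counts = np.stack([(label_maps == c).sum(axis=0) for c in range(n_classes)])
    return counts.argmax(axis=0)

# Nine hypothetical classification results for three pixels.
results = np.array([
    [0, 1, 2], [0, 1, 1], [0, 2, 2],
    [1, 1, 2], [0, 1, 2], [0, 1, 0],
    [0, 1, 2], [2, 1, 2], [0, 0, 2],
])
final = majority_vote(results)
```

A vote like this can only help when the individual results disagree in different places, which fits the abstract's observation that combining classifiers with little diversity brought no improvement.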

A Study on Heat Transfer Coefficient of a Perfluorocarbon Heat Pipe (Perfluorocarbon 히트파이프의 열전달 계수에 관한 연구)

  • 강환국;김철주;김재진
    • Journal of Energy Engineering / v.7 no.2 / pp.194-201 / 1998
  • Electric commuter trains that use AC motors need many GTO thyristors and diodes for power control. These semiconductors generate about 1~2 kW of heat, and perfluorocarbon (PFC) heat pipes have been used to cool them for the last two decades. The present study investigated the effects of important design parameters, namely the internal surface structure (grooved or smooth), the fill charge ratio, and the inclination angle from the vertical, on the heat transfer coefficients in both the evaporators and the condensers. To obtain experimental data, several heat pipes with the same geometry (520 mm long, 15.88 mm in diameter) but different fill charge ratios and internal surface structures were designed and fabricated. To predict the heat transfer coefficients, related expressions were examined and the calculated results were compared with the experimental data. Performance tests were conducted with the heat pipes operating in thermosyphon mode. Internal grooves gave large enhancements of the heat transfer coefficient. In these cases, the evaporation heat transfer coefficients ranged from 2 to 5.5 kW/m$^2$K as the heat flux increased from 15 to 45 kW/m$^2$. These experimental data were in good agreement with Rohsenow's expression based on nucleate boiling when a correction factor of $C_R=1.3$ was applied. The condensation heat transfer coefficients ranged from 1.5 to 3.5 kW/m$^2$K, and the data were in good agreement with Nusselt's correlation, based on filmwise condensation on a vertical plate, when a correction factor of $C_N=4$ was chosen. A fill charge ratio of 40~100% was recommended, and the effect of the inclination angle was negligible when the angle was greater than 30$^{\circ}$.
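
For orientation, the classical Nusselt result for the average filmwise condensation coefficient on a vertical plate can be evaluated as in the sketch below. Two things are assumptions here rather than statements from the abstract: the correction factor $C_N$ is treated as a simple multiplier on the classical result, and the function name and the property values in the example are placeholders, not measured PFC properties.

```python
def nusselt_film_condensation_h(rho_l, rho_v, h_fg, k_l, mu_l,
                                length, delta_t, g=9.81, c_n=1.0):
    """Average filmwise condensation coefficient on a vertical plate, W/(m^2 K).

    Classical Nusselt form:
        h = 0.943 * [rho_l (rho_l - rho_v) g h_fg k_l^3 / (mu_l L dT)]^(1/4)
    rho_l, rho_v: liquid and vapour densities, kg/m^3
    h_fg: latent heat, J/kg; k_l: liquid conductivity, W/(m K)
    mu_l: liquid viscosity, Pa s; length: plate height, m
    delta_t: saturation minus wall temperature, K
    c_n: correction factor, applied multiplicatively (assumption of this sketch)
    """
    group = rho_l * (rho_l - rho_v) * g * h_fg * k_l ** 3 / (mu_l * length * delta_t)
    return c_n * 0.943 * group ** 0.25

# Placeholder property values for illustration only.
props = dict(rho_l=1600.0, rho_v=10.0, h_fg=9.0e4, k_l=0.06,
             mu_l=5.0e-4, length=0.2, delta_t=5.0)
h_classical = nusselt_film_condensation_h(**props)
h_corrected = nusselt_film_condensation_h(**props, c_n=4.0)
```

The quarter-power dependence means that doubling the temperature difference lowers the predicted coefficient by only about 16%.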

Selection of Domestic Test Species Suitable for Korean Soil Ecological Risk Assessment (토양생태 위해성평가를 위한 국내 서식 토양독성 시험종 선별 연구)

  • Kim, Shin Woong;Kwak, Jin Il;Yoon, Jin-Yul;Jeong, Seung-Woo;An, Youn-Joo
    • Journal of Korean Society of Environmental Engineers / v.36 no.5 / pp.359-366 / 2014
  • For an efficient and reasonable management scheme for protecting the soil environment, a soil ecological risk assessment (ERA) method should be developed prior to utilization, based on the contemporary uses and situations of each country. The Korean environmental policy focusing on soil protection is currently accelerating the development of the soil ecological risk assessment method. The soil ERA requires toxicological data on various trophic levels in the soil environment, and ultimately uses PNEC (Predicted No Effect Concentration), which is derived from collected toxicological data. Therefore, test species that are used to generate toxicity data are essential for conducting reliable ERA. This study aimed to select domestic test species for potential use in a reliable Korean ERA. Copper (Cu) and Nickel (Ni) were identified as target substances, with toxicity data (Cu, Ni) and standard test methods being collected to determine candidate species. The candidate species were first classified by soil trophic level, and then sorted into final domestic species. Forty out of 166 domestic species were determined as potential standard test species, whereas 17 out of 120 species were determined as potential Cu and Ni test species. Finally, this study presented potential soil test species based on the characteristics of the domestic soil environment, and established a preliminary step toward developing a reliable Korean soil ERA method.

A 3D Terrain Reconstruction System using Navigation Information and Realtime-Updated Terrain Data (항법정보와 실시간 업데이트 지형 데이터를 사용한 3D 지형 재구축 시스템)

  • Baek, In-Sun;Um, Ky-Hyun;Cho, Kyung-Eun
    • Journal of Korea Game Society / v.10 no.6 / pp.157-168 / 2010
  • Terrain is an essential element for constructing a virtual world in which game characters and objects interact with one another. Creating terrain requires a great deal of time and repetitive editing. This paper presents a 3D terrain reconstruction system that creates 3D terrain in virtual space from real terrain data. The system converts the coordinate system of height maps generated from a stereo camera and a laser scanner from global GPS coordinates into the 3D world, using the x- and z-axis vectors of the global GPS coordinate system. It calculates the movement vectors and rotation matrices frame by frame. Terrain meshes are dynamically generated and rendered in virtual areas that are represented as an undirected graph. The rendered meshes are created and updated accurately by correcting errors in the terrain data. In our experiments, the FPS of the system was checked regularly until the terrain was reconstructed, and the visualization quality of the terrain was reviewed. The results show that our system achieves 3 times higher FPS than other terrain management systems using a Quadtree for small areas and a 40% improvement for large areas. The visualized terrain keeps the same shape as the contour of the real terrain. The system could be used as the terrain system of real-time 3D games to generate terrain in real time, and for terrain design work in CG movies.

A Study on Backup Route Setup Scheme in Ad Hoc Networks (애드혹 네트워크에서의 보조 경로 설정 기법에 관한 연구)

  • Jung Se-Won;Lee Chae-Woo
    • Journal of the Institute of Electronics Engineers of Korea TC / v.43 no.8 s.350 / pp.47-58 / 2006
  • Because nodes move, ad hoc networks suffer from problems such as a lower data delivery ratio, higher end-to-end delay, and higher routing overhead. Backup routing schemes try to solve these problems by finding backup routes during the route discovery phase and using them when a route fails. Backup routing schemes generally outperform single-path routing schemes in data delivery ratio, end-to-end delay, and routing overhead when nodes move rapidly. When nodes do not move rapidly, however, backup routing schemes generate more routing traffic than single-path routing schemes because they must exchange packets to find the backup route. In addition, when the backup route fails earlier than the main route, the backup route cannot be used, because in many backup routing algorithms the backup route is found only in the initial route discovery phase. RBR (Reactive Backup Routing Algorithm), proposed in this paper, provides more stable data delivery than previous backup routing schemes through selective maintenance of backup routes and backup route rediscovery. To do so, RBR prioritizes the backup routes and maintains and uses them selectively, which also decreases routing overhead. RBR can also increase the data delivery ratio and decrease delay because it reestablishes the backup route when the network topology changes. For the performance evaluation, the OPNET simulator is used to compare RBR with a single-path routing scheme and some well-known backup routing schemes.

Electrical Characteristics Measurement of Eddy Current Testing Instrument for Steam Generator in NPP (원전 증기발생기 와전류검사 장치의 전기적 특성 측정)

  • Lee, Hee-Jong;Cho, Chan-Hee;Yoo, Hyun-Joo;Moon, Gyoon-Young;Lee, Tae-Hun
    • Journal of the Korean Society for Nondestructive Testing / v.33 no.5 / pp.465-471 / 2013
  • A steam generator in a nuclear power plant is a heat exchanger that converts water into steam using heat produced in the nuclear reactor core; the steam is delivered to the turbine to generate electricity. Because damage to steam generator tubing may impair its ability to perform required safety functions, in terms of both structural integrity and leakage integrity, eddy current testing is performed periodically to evaluate the integrity of the steam generator tubes. This assessment is normally performed during a reactor refueling outage. Currently, eddy current testing of steam generators in nuclear power plants in Korea is performed in accordance with KEPIC & ASME Code requirements. The eddy current testing system consists of a remote data acquisition unit and a data analysis program that evaluates the acquired data. The KEPIC & ASME Code require that electrical properties of the remote data acquisition unit, such as total harmonic distortion, input & output impedance, amplifier linearity & stability, phase linearity, bandwidth & demodulation filter response, analog-to-digital conversion, and channel crosstalk, be measured. This paper presents the measurement requirements for the electrical properties of eddy current testing instruments described in the KEPIC & ASME Code, together with the measurement results for an eddy current testing instrument newly developed by KHNP (Korea Hydro & Nuclear Power Co., LTD).
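
Of the listed electrical properties, total harmonic distortion is the easiest to illustrate numerically. The sketch below uses the common definition (RMS sum of harmonic amplitudes divided by the fundamental amplitude) on a sampled waveform. It is not the measurement procedure prescribed by the KEPIC & ASME Code, which the abstract does not detail; the function name, sample rate, test frequency, and number of harmonics are all assumptions for illustration.

```python
import numpy as np

def total_harmonic_distortion(signal, sample_rate, fundamental_hz, n_harmonics=5):
    """Ratio of the RMS sum of harmonic amplitudes to the fundamental amplitude.

    Uses the nearest FFT bin for each frequency, so the record should contain
    a whole number of fundamental periods to avoid spectral leakage.
    """
    n = len(signal)
    spectrum = np.abs(np.fft.rfft(signal)) / n
    freqs = np.fft.rfftfreq(n, d=1.0 / sample_rate)

    def amplitude_at(freq):
        return spectrum[np.argmin(np.abs(freqs - freq))]

    fundamental = amplitude_at(fundamental_hz)
    harmonics = [amplitude_at(k * fundamental_hz) for k in range(2, n_harmonics + 2)]
    return float(np.sqrt(np.sum(np.square(harmonics))) / fundamental)

# Synthetic one-second test tone: 100 Hz fundamental plus 2nd and 3rd harmonics.
sample_rate = 10_000
t = np.arange(sample_rate) / sample_rate
wave = (np.sin(2 * np.pi * 100 * t)
        + 0.03 * np.sin(2 * np.pi * 200 * t)
        + 0.04 * np.sin(2 * np.pi * 300 * t))
thd = total_harmonic_distortion(wave, sample_rate, 100.0)
```

For this synthetic tone the expected ratio is sqrt(0.03^2 + 0.04^2) = 0.05, that is, 5% distortion.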

A study on the spread of the foot-and-mouth disease in Korea in 2010/2011 (2010/2011년도 한국 발생 구제역 확산에 관한 연구)

  • Hwang, Jihyun;Oh, Changhyuck
    • Journal of the Korean Data and Information Science Society / v.25 no.2 / pp.271-280 / 2014
  • Foot-and-mouth disease (FMD) is a highly infectious and fatal viral livestock disease that affects cloven-hoofed animals, both domestic and wild, and the FMD outbreak in Korea in 2010/2011 was a disaster for the country and its economy. Efforts are therefore being made at the national level to prevent foot-and-mouth disease and to reduce the damage when an outbreak occurs. As one of these efforts, it is useful to study the spread of the disease with a probabilistic model. Indeed, after the FMD epidemic in the UK in 2001, many studies used a variety of stochastic models to examine the spread of the disease in preparation for future outbreaks. For the 2010/2011 outbreak in Korea, however, few studies have used a probabilistic model. This paper assumes a stochastic spatial-temporal susceptible-infectious-removed (SIR) epidemic model for the 2010/2011 FMD outbreak in order to understand the spread of the disease. The infection data for the 2010/2011 outbreak from the Animal and Plant Quarantine Agency and the livestock farm data from the 2011 nationwide census of Statistics Korea lack detailed address information or contain missing values, so we generate detailed address information by randomly allocating farms within the corresponding Si/Gun area. The kernel function is estimated from the infection data, and the susceptibility and transmission of the spatial-temporal stochastic SIR model are determined by simulation.
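
The two ideas in this abstract, randomly allocating farms inside their administrative area and simulating a spatial stochastic SIR process, can be sketched as follows. Everything specific here is an assumption for illustration: Si/Gun areas are stood in for by rectangular bounding boxes, the kernel is a simple exponential in distance (the paper estimates its kernel from infection data), time advances in daily steps, and the names `allocate_farms` and `simulate_sir` and all parameter values are hypothetical.

```python
import numpy as np

def allocate_farms(region_boxes, farm_regions, rng):
    """Place each farm uniformly at random inside its region's bounding box.

    region_boxes: dict mapping region name to (xmin, ymin, xmax, ymax).
    farm_regions: sequence with the region name of every farm.
    """
    coords = np.empty((len(farm_regions), 2))
    for i, region in enumerate(farm_regions):
        xmin, ymin, xmax, ymax = region_boxes[region]
        coords[i] = rng.uniform([xmin, ymin], [xmax, ymax])
    return coords

def simulate_sir(coords, initial_infected, beta, gamma, scale, n_days, rng):
    """Discrete-time stochastic spatial SIR. States: 0 = S, 1 = I, 2 = R."""
    n = len(coords)
    state = np.zeros(n, dtype=int)
    state[initial_infected] = 1
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    kernel = np.exp(-dist / scale)  # assumed kernel shape, not the estimated one
    history = [state.copy()]
    for _ in range(n_days):
        infectious = state == 1
        # Infection pressure on each farm from all currently infectious farms.
        pressure = beta * kernel[:, infectious].sum(axis=1)
        p_infect = 1.0 - np.exp(-pressure)
        new_infected = (state == 0) & (rng.random(n) < p_infect)
        new_removed = infectious & (rng.random(n) < gamma)
        state[new_infected] = 1
        state[new_removed] = 2
        history.append(state.copy())
    return np.array(history)

rng = np.random.default_rng(1)
boxes = {"A": (0.0, 0.0, 10.0, 10.0), "B": (20.0, 0.0, 30.0, 10.0)}
regions = ["A"] * 20 + ["B"] * 20
coords = allocate_farms(boxes, regions, rng)
history = simulate_sir(coords, [0], beta=0.5, gamma=0.2, scale=2.0, n_days=30, rng=rng)
```

Repeating such runs over a grid of `beta` and `gamma` values and comparing the simulated epidemic curves with the observed one is one simple way to choose parameters by simulation; the abstract does not say which procedure the authors used.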