• Title/Summary/Keyword: Prediction Algorithms


Stock Price Prediction by Utilizing Category Neutral Terms: Text Mining Approach (카테고리 중립 단어 활용을 통한 주가 예측 방안: 텍스트 마이닝 활용)

  • Lee, Minsik;Lee, Hong Joo
    • Journal of Intelligence and Information Systems
    • /
    • v.23 no.2
    • /
    • pp.123-138
    • /
    • 2017
  • Since the stock market is driven by the expectations of traders, studies have been conducted to predict stock price movements through analysis of various sources of text data. To predict stock price movements, research has examined not only the relationship between text data and stock price fluctuations, but also trading strategies based on news articles and social media responses. Studies that predict stock price movements have also applied classification algorithms after constructing a term-document matrix, in the same way as other text mining approaches. Because a document contains many words, it is preferable to select the words that contribute most when building the term-document matrix. Based on word frequency, words with too little frequency or importance are removed. Words are also selected according to their contribution, by measuring the degree to which each word helps classify a document correctly. The conventional approach to constructing a term-document matrix is to collect all the documents to be analyzed and select the words that influence the classification. In this study, we analyze the documents for each individual stock and select, as neutral words, the words that are irrelevant to all categories. We then extract the words surrounding each selected neutral word and use them to generate the term-document matrix. This starts from the idea that stock movement is only weakly related to the presence of a neutral word itself, while the words surrounding the neutral word are more likely to affect stock price movements. The generated term-document matrix is then fed to an algorithm that classifies stock price fluctuations. In this study, we first removed stop words and selected neutral words for each stock, and then excluded words that also appear in news articles about other stocks.
Through an online news portal, we collected four months of news articles on the top 10 market-cap stocks. We used the first three months of news data as training data and applied the remaining one month of articles to the model to predict the next day's stock price movements. We used SVM, Boosting, and Random Forest for building models and predicting stock price movements. The stock market was open for a total of 80 days during the four-month period (2016/02/01 ~ 2016/05/31); the first 60 days were used as a training set and the remaining 20 days as a test set. The neutral-word-based algorithm proposed in this study showed better classification performance than a word selection method based on sparsity. This study predicted stock price movements by collecting and analyzing news articles on the top 10 stocks by market cap. We used a term-document-matrix-based classification model to estimate stock price fluctuations and compared the performance of the existing sparsity-based word extraction method with the suggested method of removing words from the term-document matrix. The suggested method differs from the standard word extraction method in that it uses not only the news articles for the corresponding stock but also news items for other stocks to determine which words to extract. In other words, it removed not only the words that appeared in both rises and falls but also the words that appeared commonly in the news for other stocks. When prediction accuracy was compared, the suggested method showed higher accuracy. A limitation of this study is that stock price prediction was set up as a classification of rises and falls, and the experiment was conducted only on the top ten stocks, which do not represent the entire stock market. In addition, it is difficult to assess investment performance, because stock price fluctuation and profit rate may differ.
Therefore, further research using more stocks, together with yield prediction through trading simulation, is necessary.
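The general pipeline the abstract describes (term-document matrix from news text, then an SVM classifying next-day movement) can be sketched as follows. This is a toy illustration with hypothetical articles and labels; the paper's neutral-word and surrounding-word selection is replaced here by scikit-learn's default tokenization for brevity.

```python
# Sketch: term-document matrix + SVM for next-day movement (toy data).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.svm import SVC

# Hypothetical training articles and next-day movement labels (1=up, 0=down)
train_docs = [
    "earnings beat estimates strong demand",
    "profit warning weak outlook",
    "record revenue growth expansion",
    "lawsuit loss decline layoffs",
]
train_labels = [1, 0, 1, 0]

vectorizer = CountVectorizer()
X_train = vectorizer.fit_transform(train_docs)  # term-document matrix

clf = SVC(kernel="linear")
clf.fit(X_train, train_labels)

# Classify an unseen article into rise (1) or fall (0)
X_test = vectorizer.transform(["strong demand drives revenue growth"])
print(clf.predict(X_test))
```

In the study itself the vocabulary would first be pruned (stop words, neutral-word context, words shared with other stocks' news) before the matrix is built.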

CHANGING THE ANIMAL WORLD WITH NIR: SMALL STEPS OR GIANT LEAPS?

  • Flinn, Peter C.
    • Proceedings of the Korean Society of Near Infrared Spectroscopy Conference
    • /
    • 2001.06a
    • /
    • pp.1062-1062
    • /
    • 2001
  • The concept of “precision agriculture” or “site-specific farming” is usually confined to the fields of soil science, crop science and agronomy. However, because plants grow in soil, animals eat plants, and humans eat animal products, it could be argued (perhaps with some poetic licence) that the fields of feed quality, animal nutrition and animal production should also be considered in this context. NIR spectroscopy has proved over the last 20 years that it can provide a firm foundation for quality measurement across all of these fields, and with the continuing developments in instrumentation, computer capacity and software, is now a major cog in the wheel of precision agriculture. There have been a few giant leaps and a lot of small steps in the impact of NIR on the animal world. These have not been confined to the amazing advances in hardware and software, although they would not have occurred without them. Rapid testing of forages, grains and mixed feeds by NIR for nutritional value to livestock is now commonplace in commercial laboratories world-wide. This would never have been possible without the pioneering work done by the USDA NIR Forage Research Network in the 1980s, following the landmark paper of Norris et al. in 1976. The advent of calibration transfer between instruments, algorithms which utilize huge databases for calibration and prediction, and the ability to directly scan whole grains and fresh forages can also be considered as major steps, if not leaps. More adventurous NIR applications have emerged in animal nutrition, with emphasis on estimating the functional properties of feeds, such as in vivo digestibility, voluntary intake, protein degradability and in vitro assays to simulate starch digestion. The potential to monitor the diets of grazing animals by using faecal NIR spectra is also now being realized.
NIR measurements on animal carcasses and even live animals have also been attempted, with varying degrees of success. The use of discriminant analysis in these fields is proving a useful tool. The latest giant leap is likely to be the advent of relatively low-cost, portable and ultra-fast diode array NIR instruments, which can be used “on-site” and also be fitted to forage or grain harvesters. The fodder and livestock industries are no longer satisfied with what we once thought was revolutionary: a 2-3 day laboratory turnaround for feed quality testing. This means that the instrument needs to be taken to the samples rather than vice versa. Considerable research is underway in this area, but the challenge of calibration transfer and maintenance of instrument networks of this type remains. The animal world is currently facing its biggest challenges ever; animal welfare, alleged effects of animal products on human health, and environmental and economic issues are difficult enough, but the current calamities of BSE and foot and mouth disease are “the last straw”. NIR will not, of course, solve all these problems, but it is already proving useful in some of these areas and will continue to do so.


Application of High Resolution Multi-satellite Precipitation Products and a Distributed Hydrological Modeling for Daily Runoff Simulation (고해상도 다중위성 강수자료와 분포형 수문모형의 유출모의 적용)

  • Kim, Jong Pil;Park, Kyung-Won;Jung, Il-Won;Han, Kyung-Soo;Kim, Gwangseob
    • Korean Journal of Remote Sensing
    • /
    • v.29 no.2
    • /
    • pp.263-274
    • /
    • 2013
  • In this study we evaluated the hydrological applicability of multi-satellite precipitation estimates. Three high-resolution global multi-satellite precipitation products, the Tropical Rainfall Measuring Mission (TRMM) Multi-satellite Precipitation Analysis (TMPA), the Global Satellite Mapping of Precipitation (GSMaP), and the Climate Prediction Center (CPC) Morphing technique (CMORPH), were applied to the Coupled Routing and Excess Storage (CREST) model to evaluate their hydrological utility. The CREST model was calibrated from 2002 to 2005 and validated from 2006 to 2009 in the Chungju Dam watershed, including two years of warm-up periods (2002-2003 and 2006-2007). Areal-averaged precipitation time series of the multi-satellite data were compared with those of the ground records. The results indicate that the multi-satellite precipitation can reflect the seasonal variation of precipitation in the Chungju Dam watershed. However, TMPA overestimates the annual and monthly precipitation, while GSMaP and CMORPH underestimate it, during the period from 2002 to 2009. These biases in the multi-satellite precipitation products induce poor performance in hydrological simulation, although TMPA performs better than both GSMaP and CMORPH. Our results indicate that advanced rainfall algorithms may be required to improve their hydrological applicability in South Korea.
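The over/underestimation comparison described above reduces to a relative-bias check of satellite totals against gauge totals. A minimal sketch with hypothetical monthly figures (not the study's data):

```python
# Relative bias of areal-averaged satellite precipitation vs gauge records.
def relative_bias(satellite, gauge):
    """Percent bias of satellite estimates relative to gauge totals."""
    return 100.0 * (sum(satellite) - sum(gauge)) / sum(gauge)

# Hypothetical monthly totals (mm) for one season
gauge  = [120.0, 240.0, 310.0, 180.0]
tmpa   = [150.0, 270.0, 350.0, 200.0]   # tends to overestimate
cmorph = [ 90.0, 200.0, 260.0, 150.0]   # tends to underestimate

print(round(relative_bias(tmpa, gauge), 1))    # positive -> overestimation
print(round(relative_bias(cmorph, gauge), 1))  # negative -> underestimation
```

A positive bias, as the study reports for TMPA, inflates simulated runoff; negative biases, as for GSMaP and CMORPH, deflate it.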

A File System for User Special Functions using Speed-based Prefetch in Embedded Multimedia Systems (임베디드 멀티미디어 재생기에서 속도기반 미리읽기를 이용한 사용자기능 지원 파일시스템)

  • Choe, Tae-Young;Yoon, Hyeon-Ju
    • Journal of KIISE:Computing Practices and Letters
    • /
    • v.14 no.7
    • /
    • pp.625-635
    • /
    • 2008
  • Portable multimedia players have somewhat different properties from general multimedia file servers: single-user ownership, relatively low hardware performance, I/O bursts caused by special user functions, and short software development cycles. Though suitable for processing multiple user requests at a time, general multimedia file systems are not efficient for special user functions such as fast forward/backward. Some methods have been proposed to improve performance and functionality, in which the application programs give prediction hints to the file system. Unfortunately, they require the modification and recompilation of all applications. In this paper, we present a file system that efficiently supports special user functions in embedded multimedia systems using file block allocation, buffer-cache, and prefetch. A prefetch algorithm, SPRA (SPeed-based PRefetch Algorithm), predicts the next block using I/O patterns instead of hints from applications; because it is resident in the file system, it does not affect the application development process. An experimental file system implementation and comparison with Linux readahead-based algorithms shows that the proposed system achieves 4.29% to 52.63% of the turnaround time and 1.01 to 3.09 times the throughput on average.
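The core idea of speed-based prefetch, inferring playback speed and direction from the recent block-access pattern rather than from application hints, can be sketched as below. The function name and the simple last-stride heuristic are illustrative assumptions, not the paper's exact SPRA algorithm.

```python
# Sketch of speed-based prefetch: estimate the access stride from recent
# history and prefetch the blocks that stride implies.
def predict_next_blocks(history, count=3):
    """Return the next `count` block numbers to prefetch, based on the
    stride between the two most recent block accesses."""
    if len(history) < 2:
        return []
    # Stride approximates playback speed: +1 normal play,
    # +4 fast-forward at 4x, negative values for backward play.
    stride = history[-1] - history[-2]
    if stride == 0:
        stride = 1  # paused/repeat: fall back to sequential readahead
    last = history[-1]
    return [last + stride * i for i in range(1, count + 1)]

print(predict_next_blocks([10, 14, 18]))   # 4x fast-forward
print(predict_next_blocks([30, 28, 26]))   # 2x backward play
```

Because the prediction lives entirely in the file-system layer, applications need no modification, which is the property the abstract emphasizes.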

An Empirical Study on Defense Future Technology in Artificial Intelligence (인공지능 분야 국방 미래기술에 관한 실증연구)

  • Ahn, Jin-Woo;Noh, Sang-Woo;Kim, Tae-Hwan;Yun, Il-Woong
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.21 no.5
    • /
    • pp.409-416
    • /
    • 2020
  • Artificial intelligence, which is in the spotlight as the core driving force of the 4th industrial revolution, is expanding its scope to various industrial fields such as smart factories and autonomous driving with the development of high-performance hardware, big data, data processing technology, learning methods, and algorithms. In the field of defense, as the security environment has changed due to a decreasing defense budget, shrinking military service resources, and the universalization of unmanned combat systems, advanced countries are also conducting technical and policy research to incorporate artificial intelligence into areas including recognition systems, decision support, simplification of work processes, and efficient resource utilization. For this reason, the importance of technology-driven planning and investigation is also increasing, in order to discover and research potential future defense technologies. In this study, based on research data collected to derive future defense technologies, we analyzed the characteristic evaluation indicators for future technologies in the field of artificial intelligence and conducted empirical studies. The results confirmed that, for future technologies in the defense AI field, applicability to weapon systems and the economic ripple effect show a significant relationship with the technology's prospects.

Online Document Mining Approach to Predicting Crowdfunding Success (온라인 문서 마이닝 접근법을 활용한 크라우드펀딩의 성공여부 예측 방법)

  • Nam, Suhyeon;Jin, Yoonsun;Kwon, Ohbyung
    • Journal of Intelligence and Information Systems
    • /
    • v.24 no.3
    • /
    • pp.45-66
    • /
    • 2018
  • Crowdfunding has become more popular than angel funding for fundraising by venture companies. Identification of success factors may be useful for fundraisers and investors to make decisions related to crowdfunding projects and predict a priori whether they will be successful or not. Recent studies have suggested several numeric factors, such as project goals and the number of associated SNS, studying how these affect the success of crowdfunding campaigns. However, prediction of the success of crowdfunding campaigns via non-numeric and unstructured data is not yet possible, especially through analysis of structural characteristics of documents introducing projects in need of funding. Analysis of these documents is promising because they are open and inexpensive to obtain. We propose a novel method to predict the success of a crowdfunding project based on the introductory text. To test the performance of the proposed method, in our study, texts related to 1,980 actual crowdfunding projects were collected and empirically analyzed. From the text data set, the following details about the projects were collected: category, number of replies, funding goal, fundraising method, reward, number of SNS followers, number of images and videos, and miscellaneous numeric data. These factors were identified as significant input features to be used in classification algorithms. The results suggest that the proposed method outperforms other recently proposed, non-text-based methods in terms of accuracy, F-score, and elapsed time.

A Study on the Use of Criminal Justice Information Big Data in terms of the Structuralization and Categorization (형사사법정보의 빅데이터 활용방안 연구: 구조화 범주화 관점으로)

  • Kim, Mi Ryung;Roh, Yoon Ju;Kim, Seonghun
    • Journal of the Korean Society for information Management
    • /
    • v.36 no.4
    • /
    • pp.253-277
    • /
    • 2019
  • In the era of the 4th Industrial Revolution, the importance of data is intensifying, but there are many cases where it is not easy to use data due to personal information protection. Although criminal justice information is expected to have various useful values, such as crime prediction and prevention, scientific criminal investigation, and rationalization of sentencing, its use is currently limited by questions of legal interpretation related to privacy protection and criminal justice information. This study proposed converting criminal justice information into 'crime data' and using it as big data through the structuralization and categorization of criminal justice information. Experts verified the legal issues, value in use, and considerations for the generation and use of such 'crime data', and future strategic development plans were identified. Finally, we found that 'crime data' appears to resolve the privacy problem, but it must be explicitly provided for in the laws governing criminal justice information, and it urgently needs to be organized in a standardized form for big data analysis. Future directions are to derive data elements, construct a dictionary thesaurus, define and classify personally sensitive information for data grading, and develop algorithms for structuring unstructured data.

A Novel Compressed Sensing Technique for Traffic Matrix Estimation of Software Defined Cloud Networks

  • Qazi, Sameer;Atif, Syed Muhammad;Kadri, Muhammad Bilal
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.12 no.10
    • /
    • pp.4678-4702
    • /
    • 2018
  • Traffic matrix estimation has always drawn attention from researchers for better network management and future planning. With the advent of high traffic loads due to cloud computing platforms and Software Defined Networking-based tunable routing and traffic management algorithms on the Internet, it is more necessary than ever to be able to predict current and future traffic volumes on the network. For large networks, such an origin-destination traffic prediction problem takes the form of a large under-constrained and under-determined system of equations with a dynamic measurement matrix. Previously, researchers relied on the assumption that the measurement (routing) matrix is stationary, due to which those schemes are not suitable for modern software defined networks. In this work, we present our Compressed Sensing with Dynamic Model Estimation (CS-DME) architecture, suitable for modern software defined networks. Our main contributions are: (1) We formulate an approach in which the measurement matrix in the compressed sensing scheme can be accurately and dynamically estimated through a reformulation of the problem based on traffic demands. (2) We show, by inspection of its eigen-spectrum using two real-world datasets, that a problem formulation using a dynamic measurement matrix based on instantaneous traffic demands may be used instead of a stationary binary routing matrix, which is more suitable for modern Software Defined Networks that are constantly evolving in terms of routing. (3) We also show that linking this compressed measurement matrix dynamically with the measured parameters leads to acceptable estimation of Origin-Destination (OD) traffic flows, with only marginally poorer results than other state-of-the-art schemes relying on fixed measurement matrices.
(4) Furthermore, using this compressed reformulated problem, a new strategy for selecting vantage points for the most efficient traffic matrix estimation is presented, through a secondary compression technique based on a subset of link measurements. Experimental evaluation of the proposed technique using the real-world datasets Abilene and GEANT shows that it is practical for use in modern software defined networks. Further, the performance of the scheme is compared with recent state-of-the-art techniques proposed in the research literature.
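The under-determined system at the heart of the abstract is link loads y = A x with fewer link measurements than OD flows. The toy below illustrates that structure on a made-up 2-link, 4-flow network; a minimum-norm pseudoinverse solution stands in for the paper's compressed-sensing machinery, which additionally exploits sparsity and a dynamically estimated A.

```python
# Toy under-determined OD estimation: 2 monitored links, 4 OD flows.
import numpy as np

A = np.array([[1, 1, 0, 0],    # link 1 carries OD flows 1 and 2
              [0, 1, 1, 1]],   # link 2 carries OD flows 2, 3 and 4
             dtype=float)
x_true = np.array([10.0, 5.0, 0.0, 20.0])   # true OD traffic volumes
y = A @ x_true                               # observed link loads

x_hat = np.linalg.pinv(A) @ y                # minimum-norm estimate
print(np.round(A @ x_hat, 6))                # reproduces the link loads
```

Any x_hat consistent with y reproduces the link loads, which is exactly why extra structure (sparsity, demand-based reformulation of A) is needed to pin down the individual OD flows.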

A RFID Tag Anti-Collision Algorithm Using 4-Bit Pattern Slot Allocation Method (4비트 패턴에 따른 슬롯 할당 기법을 이용한 RFID 태그 충돌 방지 알고리즘)

  • Kim, Young Back;Kim, Sung Soo;Chung, Kyung Ho;Ahn, Kwang Seon
    • Journal of Internet Computing and Services
    • /
    • v.14 no.4
    • /
    • pp.25-33
    • /
    • 2013
  • An arbitration procedure for tag collisions is essential because multiple tags respond simultaneously on the same frequency to the reader's request. This procedure is known as anti-collision, and it is a key technology in RFID systems. In this paper, we propose the 4-Bit Pattern Slot Allocation (4-BPSA) algorithm for high-speed identification of multiple tags. The proposed algorithm is based on a tree algorithm using time slots and identifies tags quickly and efficiently through accurate prediction, treating each slot as a 4-bit pattern according to the slot allocation scheme. Through mathematical performance analysis, we proved that 4-BPSA is an O(n) algorithm by analyzing its worst-case time complexity, and that its performance is improved compared to existing algorithms. In addition, through MATLAB simulation experiments evaluating the algorithm, we verified that 4-BPSA performs an average of 0.7 queries per tag and ensures stable performance regardless of the number of tags.
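For intuition about the tree-based anti-collision family that 4-BPSA belongs to, the sketch below simulates the generic binary query tree (not the paper's 4-bit-pattern slot allocation): on a collision, the reader splits the ID prefix and re-queries until each tag answers alone.

```python
# Generic binary query-tree anti-collision simulation (illustrative).
def identify(tags, prefix=""):
    """Return (identified_ids, number_of_query_slots) for a query tree."""
    responders = [t for t in tags if t.startswith(prefix)]
    if not responders:
        return [], 1                      # idle slot
    if len(responders) == 1:
        return responders, 1              # success slot: tag identified
    ids, queries = [], 1                  # collision slot: split on next bit
    for bit in "01":
        sub_ids, sub_q = identify(tags, prefix + bit)
        ids += sub_ids
        queries += sub_q
    return ids, queries

tags = ["0010", "0111", "1001", "1100"]
ids, queries = identify(tags)
print(sorted(ids), queries)
```

Schemes like 4-BPSA reduce the query count relative to this baseline by predicting which slots will collide from multi-bit patterns instead of branching one bit at a time.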

Heat Transfer Analysis and Experiments of Reinforced Concrete Slabs Using Galerkin Finite Element Method (Galerkin 유한요소법을 이용한 철근콘크리트 슬래브의 열전달해석 및 실험)

  • Han, Byung-Chan;Kim, Yun-Yong;Kwon, Young-Jin;Cho, Chang-Geun
    • Journal of the Korea Concrete Institute
    • /
    • v.24 no.5
    • /
    • pp.567-575
    • /
    • 2012
  • Research was conducted to develop a 2-D nonlinear Galerkin finite element analysis of reinforced concrete structures subjected to high temperature, together with experiments. Algorithms are developed for calculating the closed-form element stiffness of a triangular element with a fully populated material conductance. The validity of the numerical model used in the program is established by comparing predictions from the computer program with results from full-scale fire resistance tests. Details of fire resistance experiments carried out on reinforced concrete slabs, together with their results, are presented. The experimental results indicated that the proposed numerical model and the implemented codes are accurate and reliable. The changes in thermal parameters are discussed from the point of view of changes in structure and chemical composition due to high-temperature exposure. The proposed numerical model takes into account time-varying thermal loads, heat fluctuation affected by convection and radiation, and temperature-dependent material properties. Although this study considered the standard fire scenario for reinforced concrete slabs, other time-temperature relationships can easily be incorporated.
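The closed-form element conductance for a 3-node linear triangle, the building block of the Galerkin heat-transfer formulation described above, can be sketched as below. This is the generic textbook form for isotropic conductivity and unit thickness, not the paper's exact code, which handles fully populated (anisotropic) conductance and temperature-dependent properties.

```python
# Closed-form conductance matrix of a linear triangular element:
# K_e[i,j] = k/(4A) * (b_i*b_j + c_i*c_j) for linear shape functions.
import numpy as np

def triangle_conductance(xy, k=1.0):
    """Element conductance for nodes xy=[(x1,y1),(x2,y2),(x3,y3)]."""
    (x1, y1), (x2, y2), (x3, y3) = xy
    b = np.array([y2 - y3, y3 - y1, y1 - y2])   # shape-function gradients
    c = np.array([x3 - x2, x1 - x3, x2 - x1])
    area = 0.5 * abs(x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2))
    return k / (4.0 * area) * (np.outer(b, b) + np.outer(c, c))

K = triangle_conductance([(0, 0), (1, 0), (0, 1)], k=1.0)
print(np.round(K, 3))   # symmetric, and each row sums to zero
```

The zero row sums reflect that a uniform temperature field produces no heat flux; the transient, convection/radiation, and nonlinear material terms of the full model enter through additional capacity and boundary matrices.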