• Title/Summary/Keyword: smart mining


Stock-Index Invest Model Using News Big Data Opinion Mining (뉴스와 주가 : 빅데이터 감성분석을 통한 지능형 투자의사결정모형)

  • Kim, Yoo-Sin;Kim, Nam-Gyu;Jeong, Seung-Ryul
    • Journal of Intelligence and Information Systems / v.18 no.2 / pp.143-156 / 2012
  • People readily believe that news and stock indices are closely related. They assume that obtaining news before anyone else helps them forecast stock prices and earn large profits, or at least capture an investment opportunity. However, it is not easy to determine how closely the two are related, to make investment decisions based on news, or to verify that such investment information is valid. If the significance of news and its impact on the stock market are analyzed, it becomes possible to extract information that can assist investment decisions. In reality, however, the world is inundated with a massive flow of news in real time, and news is unstructured text. This study proposes a stock-index investment model based on "News Big Data" opinion mining that systematically collects, categorizes, and analyzes news and produces investment information. To verify the validity of the model, the relationship between the results of news opinion mining and the stock index was analyzed empirically using statistical methods. The mining process that converts news into information for investment decision making consists of the following steps. First, news is received from a news provider that collects it in real time, and its information is indexed. Not only the content of the news but also information such as the media outlet, time, and news type is collected and classified, and then reworked into variables from which investment decisions can be inferred. Second, the text of the news content is separated into morphemes to derive words that can indicate polarity, and each word is tagged with positive or negative polarity by comparing it with a sentiment dictionary. Third, the positive or negative polarity of each news item is judged using the indexed classification information and a scoring rule, and the final investment decision information is derived according to daily scoring criteria.
For this study, the KOSPI index and its fluctuation range were collected for the 63 days on which the Korea Exchange was open during the three months from July to September 2011, and news data were collected by parsing 766 articles from economic news outlet "M" among the articles carried under Stock Information > News > Main News on the portal site Naver.com. Over the three months, the stock price index rose on 33 days and fell on 30 days, and the news data comprised 197 articles published before the market opened, 385 during the session, and 184 after the close. Mining the collected news content and comparing the results with stock prices showed that the positive/negative opinion of news content was significantly related to the stock price, and that changes in the stock price index were better explained when news opinion was expressed as a positive/negative ratio rather than as a simple binary judgment of positive or negative. To check whether news affected, or at least preceded, stock price fluctuations, the change in stock price was also compared only with news published before the market opened, and this relationship was likewise found to be statistically significant. In addition, because news covers various types and topics, such as social, economic, and overseas news, corporate earnings, industry conditions, market outlook, and market conditions, the influence on the stock market and the significance of the relationship were expected to differ by news type. Each type of news was therefore compared with stock price fluctuations, and the results showed that news on market conditions, outlook, and overseas events was the most useful for explaining fluctuations in the stock price.
In contrast, news about individual companies was not statistically significant, and its opinion mining values tended to move opposite to the stock price; a likely reason is the appearance of promotional and planned news intended to keep stock prices from falling. Finally, multiple regression analysis and logistic regression analysis were carried out to derive an investment decision function based on the relationship between the positive/negative opinion of news and the stock price. The results showed that the regression equation using the variables for market conditions, outlook, and overseas news published before the market opened was statistically significant, and the classification accuracy of the logistic regression was 70.0% for rises in the stock price, 78.8% for falls, and 74.6% on average. This study first analyzed the relationship between news and stock prices by analyzing and quantifying the sentiment of unstructured news content using opinion mining, one of the big data analysis techniques, and then proposed and verified a smart investment decision-making model that can systematically carry out opinion mining and derive and support investment information. This shows that news can be used as a variable to predict the stock price index for investment, and the model is expected to be usable as a real investment support system once it is implemented as a system and verified in the future.
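
The dictionary-based tagging and positive/negative ratio described in the abstract above can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the toy `SENTIMENT` dictionary, the use of pre-split tokens in place of real morpheme analysis, and the 0.5 threshold are not taken from the paper, whose actual dictionary and scoring rule are not given in the abstract.

```python
from collections import Counter

# Hypothetical toy sentiment dictionary (word -> polarity); not the study's dictionary.
SENTIMENT = {"surge": 1, "gain": 1, "recovery": 1,
             "plunge": -1, "loss": -1, "crisis": -1}


def article_polarity(tokens):
    """Count positive and negative dictionary hits in one tokenized article."""
    counts = Counter(SENTIMENT[t] for t in tokens if t in SENTIMENT)
    return counts[1], counts[-1]


def daily_positive_ratio(articles):
    """Share of positive hits among all polarity hits across one day's articles."""
    pos = neg = 0
    for tokens in articles:
        p, n = article_polarity(tokens)
        pos += p
        neg += n
    total = pos + neg
    # With no polarity words at all, fall back to a neutral 0.5.
    return pos / total if total else 0.5


def daily_signal(articles, threshold=0.5):
    """Map the daily ratio to a rise/fall call (the threshold is an assumption)."""
    return "rise" if daily_positive_ratio(articles) > threshold else "fall"


if __name__ == "__main__":
    # Tokens stand in for the morphemes a real analyzer would produce.
    pre_market_news = [["exports", "surge", "gain"], ["debt", "crisis"]]
    print(daily_positive_ratio(pre_market_news))
    print(daily_signal(pre_market_news))
```

Using a ratio rather than a binary label per day mirrors the abstract's finding that the positive/negative ratio explained index changes better than a simple positive-or-negative judgment.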

A Hybrid Efficient Feature Selection Model for High Dimensional Data Set based on KNHNAES (2013~2015) (KNHNAES (2013~2015) 에 기반한 대형 특징 공간 데이터집 혼합형 효율적인 특징 선택 모델)

  • Kwon, Tae il;Li, Dingkun;Park, Hyun Woo;Ryu, Kwang Sun;Kim, Eui Tak;Piao, Minghao
    • Journal of Digital Contents Society / v.19 no.4 / pp.739-747 / 2018
  • With large feature spaces, feature selection has become an extremely important step in the data mining process, but traditional single-process feature selection methods may no longer be adequate for it. In this paper, we propose a hybrid, efficient feature selection model for high-dimensional data. We applied our model to the KNHNAES data set, and the results show that it outperforms many existing methods by at least 5% in accuracy.

Operational Big Data Analytics platform for Smart Factory (스마트팩토리를 위한 운영빅데이터 분석 플랫폼)

  • Bae, Hyerim;Park, Sanghyuck;Choi, Yulim;Joo, Byeongjun;Sutrisnowati, Riska Asriana;Pulshashi, Iq Reviessay;Putra, Ahmad Dzulfikar Adi;Adi, Taufik Nur;Lee, Sanghwa;Won, Seokrae
    • The Journal of Bigdata / v.1 no.2 / pp.9-19 / 2016
  • Since ICT convergence became a major issue, the German government has pursued the 'Industry 4.0' policy, which triggered the convergence of ICT with manufacturing. This trend is now hitting its stride, and from it we can expect a great leap toward near-perfect quality at low cost. Recently, the Korean government has also been implementing its 'Manufacturing 3.0' policy to upgrade the Korean manufacturing industry, accelerated by many related technologies. In this paper, we developed a customized operational big data analysis platform that implements operational intelligence to improve industrial capability. The platform is built on the Spring framework and the web. In addition, HDFS and Spark architectures help the system analyze massive field data, with streamed data processed by process mining algorithms. Knowledge extracted from the data will support improvements in manufacturing performance.


Numerical simulation of the effect of confining pressure and tunnel depth on the vertical settlement using particle flow code (with direct tensile strength calibration in PFC Modeling)

  • Haeri, Hadi;Sarfarazi, Vahab;Marji, Mohammad Fatehi
    • Smart Structures and Systems / v.25 no.4 / pp.433-446 / 2020
  • In this paper, the effect of confining pressure and tunnel depth on vertical ground settlement is investigated using particle flow code (PFC2D). For this purpose, PFC2D was first calibrated using both tensile and triaxial tests. Then a model with dimensions of 100 m × 100 m was built. A circular tunnel with a diameter of 20 m was drilled in the middle of the model. Also, a rectangular tunnel with a width of 10 m and a length of 20 m was drilled in the model. The center of the tunnel was situated 15 m, 20 m, 25 m, 30 m, 35 m, 40 m, 45 m, 50 m, 55 m and 60 m below the ground surface. These models were subjected to confining pressures of 0.001 GPa, 0.005 GPa, 0.01 GPa, 0.03 GPa, 0.05 GPa and 0.07 GPa. The results show that the volume of the collapse zone remains constant as the distance between the ground surface and the tunnel increases, and that it increases as the confining pressure decreases. The maximum settlement occurs at the top of the tunnel roof, and it occurs when the center of the tunnel is situated 15 m below the ground surface. The settlement decreases as the distance between the tunnel center line and the measuring circles on the ground surface increases. The minimum settlement occurs when the center of the circular tunnel is situated 60 m below the ground surface. It should be noted that the settlement increases as the confining pressure decreases.

A study of Big-data analysis for relationship between students (학생들의 관계성 파악을 위한 빅-데이터 분석에 관한 연구)

  • Hwang, Deuk-Young;Kim, Jin-Mook
    • Convergence Security Journal / v.15 no.4 / pp.113-119 / 2015
  • Recently, cyber violence in schools has been increasing, and the problems it causes are growing more severe by the day. In particular, cyber violence carried out via smartphones is recognized as a very serious social problem. Compared with conventional physical violence, cyberbullying does damage that lasts longer and reaches further, so its effects on students are very serious. We therefore analyze the relationships among students in a classroom and the language they use, in order to identify signs of cyber violence that may occur between classmates. We provide this information to the designated parents, classroom teachers, and school sheriffs to help prevent cyberbullying incidents in the school. For this research, we will design and implement a messenger for the cyber classroom. It consists of several components: a big-data vocabulary, an analyzer, and a communication interface. The proposed messenger can analyze linguistic signs and friendships between students using big-data analysis methods such as text mining, and it can analyze relationships per student and per classroom.

Numerical simulation of compressive to tensile load conversion for determining the tensile strength of ultra-high performance concrete

  • Haeri, Hadi;Mirshekari, Nader;Sarfarazi, Vahab;Marji, Mohammad Fatehi
    • Smart Structures and Systems / v.26 no.5 / pp.605-617 / 2020
  • In this study, the experimental tests for direct tensile strength measurement of Ultra-High Performance Concrete (UHPC) were numerically modeled using the discrete element method (circle-type elements) and the Finite Element Method (FEM). The experimental apparatus used for the laboratory tensile strength measurement is the Compressive-to-Tensile Load Conversion (CTLC) device. In this paper, the failure process, including crack initiation, propagation and coalescence, is studied, and the direct tensile strength of the UHPC specimens is then measured with this novel apparatus, the CTLC device. For this purpose, UHPC members (each containing a central hole) were prepared and placed in the CTLC device, which in turn was placed in the universal testing machine. The direct tensile strength of each member is measured from the direct tensile stress applied to the specimen by the CTLC device. This novel device converts the applied compressive load into a tensile load during the testing process. A UHPC beam specimen of size 150 × 60 × 190 mm with an internal hole of 75 × 60 mm was used in this study. The compressive load was applied to the CTLC device through the universal testing machine at a rate of 0.02 MPa/s. The direct tensile strength of UHPC was found using a new formula based on the present analyses. The numerical simulations in this study give a tensile strength and failure behavior of the UHPC very close to those obtained experimentally with the CTLC device mounted in the universal testing machine. The difference between the experimental and numerical results was found to be nearly 2%. PFC2D simulations of the direct tensile strength specimen and ABAQUS simulations of the tested CTLC specimens both demonstrate the validity and capability of the proposed testing procedure for direct tensile strength measurement of UHPC specimens.

Analysis of Mineral Resource Exploration and Strategy in Australia (호주 광물자원탐사와 전략분석)

  • Kim, Seong-Yong;Heo, Chul-Ho
    • Economic and Environmental Geology / v.51 no.3 / pp.291-307 / 2018
  • Australia ranks first in the world for gold, nickel, iron ore, lead, zinc and uranium, and is ranked in the top five for many other important minerals. Extensions to existing resources will continue to support well-established local production. Some perceive Australia as a mature exploration destination where the easily won near-surface deposits were largely discovered many decades ago. In recent years, Australia has faced increasing global competition for investment spending from all jurisdictions in which mineral exploration is encouraged. Many regional communities face the threat of losing their main economic driver as a number of long-term mines reach the end of their economic life. However, given the trend of increasing mineral demand driven by the 4th industrial revolution, this is also considered an opportunity for Korea to acquire global competitiveness in geoscience and mining technology through smart and digital mining and ICT-convergence technology R&D.

An Energy Consumption Prediction Model for Smart Factory Using Data Mining Algorithms (데이터 마이닝 기반 스마트 공장 에너지 소모 예측 모델)

  • Sathishkumar, VE;Lee, Myeongbae;Lim, Jonghyun;Kim, Yubin;Shin, Changsun;Park, Jangwoo;Cho, Yongyun
    • KIPS Transactions on Software and Data Engineering / v.9 no.5 / pp.153-160 / 2020
  • Energy consumption prediction for industry plays a prominent role in energy management and control systems, because energy demand and supply undergo dynamic and seasonal changes. This paper introduces and explores predictive models of energy consumption for the steel industry. The data used include lagging and leading reactive power, lagging and leading current variables, carbon dioxide emissions (tCO2) and load type. Four statistical models are trained and then evaluated on the test set: (a) Linear Regression (LR), (b) Radial Kernel Support Vector Machine (SVM RBF), (c) Gradient Boosting Machine (GBM), and (d) Random Forest (RF). Root Mean Squared Error (RMSE), Mean Absolute Error (MAE) and Mean Absolute Percentage Error (MAPE) are used to measure the predictive performance of the regression models. When all the predictors are used, the best model, RF, achieves an RMSE of 7.33 on the test set.
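
The three error measures named in the abstract above follow their standard definitions, sketched below in plain Python. The sample values are made up for illustration and have no connection to the paper's steel-industry data.

```python
import math


def rmse(y_true, y_pred):
    """Root Mean Squared Error: square root of the mean squared residual."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean Absolute Error: mean of the absolute residuals."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred):
    """Mean Absolute Percentage Error, in percent (undefined if a true value is 0)."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


if __name__ == "__main__":
    # Made-up energy readings and predictions, for illustration only.
    actual = [100.0, 200.0, 400.0]
    predicted = [110.0, 190.0, 380.0]
    print(f"RMSE={rmse(actual, predicted):.3f}")
    print(f"MAE={mae(actual, predicted):.3f}")
    print(f"MAPE={mape(actual, predicted):.3f}%")
```

RMSE penalizes large errors more heavily than MAE, while MAPE expresses the error relative to the size of the true value, which is why the three are often reported together.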

Design of knowledge search algorithm for PHR based personalized health information system (PHR 기반 개인 맞춤형 건강정보 탐사 알고리즘 설계)

  • SHIN, Moon-Sun
    • Journal of Digital Convergence / v.15 no.4 / pp.191-198 / 2017
  • A PHR-based Personal Health Care Service Platform needs to support intelligent, customized health information services for user convenience. In this paper, we specify an ontology-based health data model for the Personal Health Care Service Platform. We also design a knowledge search algorithm that can be used to find similar health records by applying machine learning and data mining techniques. The axis-based mining algorithm we propose operates on axis attributes in order to improve the relevance of knowledge exploration and to shorten search time by reducing the size of the candidate item set. In addition, the K-Nearest Neighbor algorithm is used to group users according to the similarity of their user profiles. These algorithms improve the efficiency of customized information exploration according to the user's disease and health condition. The proposed algorithm can usefully be applied to the inference process in the Personal Health Care Service Platform, making it possible to recommend customized health information to the user. It can help people manage smart health care in an aging society.
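
As a rough illustration of grouping users by profile similarity with K-Nearest Neighbor, the sketch below ranks stored profiles by Euclidean distance to a query profile. The profile fields, their scaling to the range 0 to 1, the user names, and the choice of Euclidean distance are all assumptions made here; the abstract does not specify the paper's features or distance measure.

```python
import math


def k_nearest_users(profiles, query, k=2):
    """Return the names of the k stored profiles closest to the query profile."""
    ranked = sorted(profiles.items(), key=lambda item: math.dist(item[1], query))
    return [name for name, _ in ranked[:k]]


if __name__ == "__main__":
    # Hypothetical profiles: (scaled age, scaled body-mass index), both in [0, 1].
    profiles = {
        "user_a": (0.20, 0.30),
        "user_b": (0.25, 0.35),
        "user_c": (0.90, 0.80),
        "user_d": (0.85, 0.90),
    }
    new_user = (0.22, 0.31)
    print(k_nearest_users(profiles, new_user, k=2))
```

Health information viewed by the nearest users could then be offered to the new user as a candidate recommendation; features should be scaled to comparable ranges first so that no single attribute dominates the distance.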

A Design of a TV Advertisement Effectiveness Analysis System Using SNS Big-data (SNS Big-data를 활용한 TV 광고 효과 분석 시스템 설계)

  • Lee, Areum;Bang, Jiseon;Kim, Yoonhee
    • KIISE Transactions on Computing Practices / v.21 no.9 / pp.579-586 / 2015
  • As smartphone usage increases, the number of Social Networking Service (SNS) users has also increased exponentially. SNS allows people to exchange their personal opinions efficiently, which makes it possible to collect each individual's reaction to a given event in real time. Nevertheless, new methods need to be developed to collect and analyze people's opinions in real time in order to evaluate the impact of a TV advertisement effectively. Hence, we designed and constructed a system that analyzes the effect of an advertisement in real time using advertisement-related data collected from SNS, specifically Twitter. In detail, Hadoop is used in the system to enable parallel big-data analysis, and various analyses can be carried out through separate numerical analyses of the degrees of mentioning, preference and reliability. The analysis can be made more accurate when reliability is assessed using opinion mining technology. The proposed system is therefore shown to handle and analyze response data for diverse TV advertisements effectively.