• Title/Summary/Keyword: Predictive system


Predicting stock movements based on financial news with systematic group identification (시스템적인 군집 확인과 뉴스를 이용한 주가 예측)

  • Seong, NohYoon;Nam, Kihwan
    • Journal of Intelligence and Information Systems / v.25 no.3 / pp.1-17 / 2019
  • Because stock price forecasting is an important issue both academically and practically, research on stock price prediction has been actively conducted. Stock price forecasting research is classified by whether it uses structured or unstructured data. With structured data such as historical stock prices and financial statements, past studies usually used technical and fundamental analysis. In the big data era, the amount of information has increased rapidly, and artificial intelligence methods that extract meaning by quantifying text, an unstructured data type that accounts for a large share of that information, have developed quickly. With these developments, many attempts are being made to predict stock prices from online news by applying text mining. The methodology adopted in many papers is to forecast a stock's price using news about the target company alone. However, according to previous research, not only news about a target company but also news about related companies can affect its stock price. Finding highly relevant companies is not easy because of market-wide effects and random signals, so existing studies have identified related companies primarily through pre-determined international industry classification standards. Yet recent research shows that the Global Industry Classification Standard (GICS) has varying homogeneity within sectors, which leads to a limitation: forecasting stock prices by taking all firms in a sector together, without isolating the truly relevant companies, can degrade predictive performance. To overcome this limitation, we are the first to combine random matrix theory with text mining for stock prediction. When the dimension of the data is large, the classical limit theorems are no longer suitable because statistical efficiency is reduced; a simple correlation analysis in the financial market therefore does not reveal the true correlation. To solve this issue, we adopt random matrix theory, which is mainly used in econophysics, to remove market-wide effects and random signals and find the true correlation between companies. With the true correlation, we perform cluster analysis to find relevant companies. Based on the clustering, we use a multiple kernel learning algorithm, an ensemble of support vector machines, to incorporate the effects of the target firm and its relevant firms simultaneously; each kernel predicts stock prices from features of the financial news of the target firm or one of its relevant firms. The results of this study are as follows. (1) Following the existing research flow, we confirmed that using news from relevant companies is an effective way to forecast stock prices. (2) When looking for relevant companies, identifying them in the wrong way can lower prediction performance. (3) The proposed approach with random matrix theory outperforms previous studies when cluster analysis is performed on the true correlation obtained by removing market-wide effects and random signals. The contribution of this study is as follows. First, it shows that random matrix theory, used mainly in econophysics, can be combined with artificial intelligence to produce a sound methodology.
This suggests that it is important not only to develop AI algorithms but also to adopt theory from physics, extending existing research that integrated artificial intelligence with complex-system theory through transfer entropy. Second, this study stresses that finding the right companies in the stock market is an important issue, suggesting that it is important not only to study AI algorithms but also to theoretically justify how the input values are chosen. Third, we confirmed that firms grouped under the Global Industry Classification Standard (GICS) may have low relevance to one another, and we suggest that relevance should be defined theoretically rather than simply read off the GICS.
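As an editor's illustration of the eigenvalue-filtering step this abstract describes, the sketch below cleans a stock-return correlation matrix with the Marchenko-Pastur bound from random matrix theory and then clusters firms on the cleaned correlations. It is a minimal sketch under assumed inputs (a (T, N) array of daily returns); the function names are hypothetical, hierarchical clustering is used as one plausible choice of cluster analysis, and the paper's multiple-kernel-learning stage over news features is not shown.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def clean_correlation(returns):
    """returns: (T, N) array of daily returns for N firms over T days."""
    T, N = returns.shape
    corr = np.corrcoef(returns, rowvar=False)
    eigval, eigvec = np.linalg.eigh(corr)        # eigenvalues in ascending order

    # Marchenko-Pastur upper bound: eigenvalues below it are pure noise.
    lam_max = (1 + np.sqrt(N / T)) ** 2

    # Keep the informative band: above the noise bound but excluding the
    # largest eigenvalue, which carries the market-wide mode.
    keep = (eigval > lam_max) & (eigval < eigval[-1])
    cleaned = (eigvec[:, keep] * eigval[keep]) @ eigvec[:, keep].T
    np.fill_diagonal(cleaned, 1.0)
    return np.clip(cleaned, -1.0, 1.0)

def relevant_clusters(returns, n_clusters=10):
    """Group firms by 'true' correlation; a firm's cluster mates are the
    relevant firms whose news would feed the target firm's kernels."""
    corr = clean_correlation(returns)
    dist = np.sqrt(2.0 * (1.0 - corr))           # correlation -> distance
    condensed = dist[np.triu_indices_from(dist, k=1)]
    Z = linkage(condensed, method="average")
    return fcluster(Z, t=n_clusters, criterion="maxclust")
```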

1-month Prediction on Rice Harvest Date in South Korea Based on Dynamically Downscaled Temperature (역학적 규모축소 기온을 이용한 남한지역 벼 수확일 1개월 예측)

  • Jina Hur;Eun-Soon Im;Subin Ha;Yong-Seok Kim;Eung-Sup Kim;Joonlee Lee;Sera Jo;Kyo-Moon Shim;Min-Gu Kang
    • Korean Journal of Agricultural and Forest Meteorology / v.25 no.4 / pp.267-275 / 2023
  • This study predicted the rice harvest date in South Korea using 11 years (2012-2022) of hindcasts based on dynamically downscaled 2 m air temperature at the subseasonal (1-month lead) timescale. To obtain high-resolution (5 km) meteorological information over South Korea, global predictions from the NOAA Climate Forecast System (CFSv2) were dynamically downscaled using the Weather Research and Forecasting (WRF) double-nested modeling system. To estimate the rice harvest date, growing degree days (GDD) were used, accumulating daily temperature from the seeding date (1 Jan.) until the reference sum for harvest (1400℃) is reached, plus 55 days. In terms of maximum (minimum) temperature, the hindcasts tend to have a cold bias of about 1.2℃ (0.1℃) for the rice growth period (May to October) compared to the observations. The harvest date derived from the hindcasts (DOY 289) reproduces the observed date (DOY 280) well, despite a margin of 9 days. The study shows the possibility of obtaining detailed predictive information on the rice harvest date over South Korea based on the dynamical downscaling method.
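The growing-degree-days rule in this abstract is simple enough to state in code. The sketch below is a minimal reading of it, assuming a 0℃ base temperature and an array of daily mean temperatures indexed from the seeding date; both are assumptions for illustration, not details taken from the paper.

```python
import numpy as np

def harvest_doy(daily_mean_temp, threshold=1400.0, lag_days=55):
    """daily_mean_temp: daily mean 2 m air temperature (deg C), index 0 = 1 Jan."""
    gdd = np.cumsum(np.clip(daily_mean_temp, 0.0, None))  # accumulate warm days only
    reach = int(np.argmax(gdd >= threshold))              # first day reaching 1400 deg C
    if gdd[reach] < threshold:
        raise ValueError("GDD never reaches the harvest threshold")
    return reach + 1 + lag_days                           # predicted harvest day-of-year
```

Run on a downscaled hindcast temperature series, this returns a day-of-year comparable to the paper's hindcast estimate (DOY 289) versus observation (DOY 280).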

Assessment of Additional MRI-Detected Breast Lesions Using the Quantitative Analysis of Contrast-Enhanced Ultrasound Scans and Its Comparability with Dynamic Contrast-Enhanced MRI Findings of the Breast (유방자기공명영상에서 추가적으로 발견된 유방 병소에 대한 조영증강 초음파의 정량적 분석을 통한 진단 능력 평가와 동적 조영증강 유방 자기공명영상 결과와의 비교)

  • Sei Young Lee;Ok Hee Woo;Hye Seon Shin;Sung Eun Song;Kyu Ran Cho;Bo Kyoung Seo;Soon Young Hwang
    • Journal of the Korean Society of Radiology / v.82 no.4 / pp.889-902 / 2021
  • Purpose To assess the diagnostic performance of contrast-enhanced ultrasound (CEUS) for additional MR-detected enhancing lesions and to determine whether kinetic pattern results comparable to dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) of the breast can be obtained using the quantitative analysis of CEUS. Materials and Methods In this single-center prospective study, a total of 71 additional MR-detected breast lesions were included. CEUS examination was performed, and lesions were categorized according to the Breast Imaging-Reporting and Data System (BI-RADS). The sensitivity, specificity, and diagnostic accuracy of CEUS were calculated by comparing the BI-RADS category to the final pathology results. The degree of agreement between CEUS and DCE-MRI kinetic patterns was evaluated using weighted kappa. Results On CEUS, 46 lesions were assigned BI-RADS category 4B, 4C, or 5, while 25 lesions were assigned category 3 or 4A. The diagnostic performance of CEUS for enhancing lesions on DCE-MRI was excellent, with 84.9% sensitivity, 94.4% specificity, and 97.8% positive predictive value. A total of 57/71 (80%) lesions had correlating kinetic patterns, showing good agreement (weighted kappa = 0.66) between CEUS and DCE-MRI. Benign lesions showed excellent agreement (weighted kappa = 0.84), and invasive ductal carcinoma (IDC) showed good agreement (weighted kappa = 0.69). Conclusion The diagnostic performance of CEUS for additional MR-detected breast lesions was excellent. Accurate kinetic pattern assessment, fairly comparable to DCE-MRI, can be obtained for benign and IDC lesions using CEUS.
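The weighted kappa statistic used above can be computed directly with scikit-learn. The sketch below uses hypothetical kinetic-pattern labels (e.g., 1 = persistent, 2 = plateau, 3 = washout); the coding scheme and values are illustrative, not the study's data.

```python
from sklearn.metrics import cohen_kappa_score

# Hypothetical per-lesion kinetic patterns read on each modality.
ceus_pattern = [3, 3, 2, 1, 3, 2, 1, 1, 2, 3]
mri_pattern  = [3, 2, 2, 1, 3, 2, 1, 2, 2, 3]

# Linear weighting penalizes near-misses less than distant disagreements.
kappa = cohen_kappa_score(ceus_pattern, mri_pattern, weights="linear")
print(f"weighted kappa = {kappa:.2f}")  # the paper reports 0.66 over all lesions
```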

A Study on People Counting in Public Metro Service using Hybrid CNN-LSTM Algorithm (Hybrid CNN-LSTM 알고리즘을 활용한 도시철도 내 피플 카운팅 연구)

  • Choi, Ji-Hye;Kim, Min-Seung;Lee, Chan-Ho;Choi, Jung-Hwan;Lee, Jeong-Hee;Sung, Tae-Eung
    • Journal of Intelligence and Information Systems / v.26 no.2 / pp.131-145 / 2020
  • In line with the trend of industrial innovation, Internet-of-Things (IoT) technology, utilized in a variety of fields, is emerging as a key element in the creation of new business models and the provision of user-friendly services through its combination with big data. Data accumulated from IoT devices are being used in many ways to build convenience-based smart systems, as they enable customized intelligent systems through analysis of user environments and patterns. Recently, IoT has been applied to innovation in the public domain, for example in smart cities and smart transportation, such as solving traffic and crime problems using CCTV. In particular, when planning underground services or establishing passenger-flow control systems to enhance the convenience of citizens and commuters on congested public transportation such as subways and urban railways, it is necessary to comprehensively consider both the ease of securing real-time service data and the stability of security. However, previous studies that utilize image data face limitations: object-detection performance degrades under privacy constraints and abnormal conditions. The IoT device-based sensor data used in this study are free from privacy issues because they do not require identification of individuals, and can therefore be effectively utilized to build intelligent public services for unspecified people. We utilize IoT-based infrared sensor devices for an intelligent pedestrian tracking system in a metro service that many people use daily, with temperature data measured by the sensors transmitted in real time. The experimental environment for collecting real-time sensor data was established at the equally spaced midpoints of a 4×4 grid in the ceiling of subway entrances where passenger traffic is high, measuring the temperature change for objects entering and leaving the detection spots. The measured data went through preprocessing in which reference values for the 16 areas were set and the differences between the temperatures in the 16 areas and their reference values per unit of time were calculated; this maximizes the signal of movement within the detection area. In addition, values were scaled up by a factor of 10 to reflect temperature differences between areas more sensitively: for example, a sensor reading of 28.5℃ was analyzed as 285. The data collected from the sensors thus have the characteristics of both time series data and image data with 4×4 resolution. Reflecting these characteristics, we propose a hybrid algorithm that combines a CNN, which excels at image classification, with an LSTM, which is especially suitable for analyzing time series data, referred to as CNN-LSTM (Convolutional Neural Network-Long Short Term Memory). In this study, the CNN-LSTM algorithm is used to predict the number of people passing through one of the 4×4 detection areas.
We verified the validity of the proposed model through performance comparisons with other artificial intelligence algorithms: Multi-Layer Perceptron (MLP), Long Short Term Memory (LSTM), and RNN-LSTM (Recurrent Neural Network-Long Short Term Memory). In the experiments, the proposed CNN-LSTM hybrid model showed the best predictive performance among MLP, LSTM, and RNN-LSTM. By utilizing the proposed devices and models, various metro services, such as real-time monitoring of public transport facilities and congestion-based emergency response, are expected to be provided without legal issues concerning personal information. However, the data were collected from only one side of the entrances and over a short period, so validation in other environments remains to be carried out. In the future, the proposed model is expected to gain further reliability if experimental data are collected in more varied environments or if training data are augmented with measurements from other sensors.
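A minimal Keras sketch of the hybrid architecture described above follows: a small CNN reads each 4×4 temperature frame, and an LSTM consumes the resulting feature sequence. The window length, layer sizes, and regression head are assumptions for illustration, not the paper's configuration.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

TIMESTEPS = 30  # assumed length of the sliding window of sensor frames

model = models.Sequential([
    layers.Input(shape=(TIMESTEPS, 4, 4, 1)),          # sequence of 4x4 "images"
    layers.TimeDistributed(layers.Conv2D(16, (2, 2), activation="relu")),
    layers.TimeDistributed(layers.Flatten()),
    layers.LSTM(32),                                    # temporal dependencies
    layers.Dense(1, activation="relu"),                 # predicted passenger count
])
model.compile(optimizer="adam", loss="mse")
model.summary()
```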

An Empirical Study on the Influencing Factors for Big Data Intended Adoption: Focusing on the Strategic Value Recognition and TOE (Technology-Organization-Environment) Framework (빅데이터 도입의도에 미치는 영향요인에 관한 연구: 전략적 가치인식과 TOE(Technology Organizational Environment) Framework을 중심으로)

  • Ka, Hoi-Kwang;Kim, Jin-soo
    • Asia Pacific Journal of Information Systems / v.24 no.4 / pp.443-472 / 2014
  • To survive in the global competitive environment, enterprises should be able to solve various problems and find optimal solutions effectively. Big data is perceived as a tool for solving enterprise problems and improving competitiveness through its varied problem-solving and advanced predictive capabilities. Owing to this remarkable potential, the implementation of big data systems has increased across many enterprises around the world. Big data is currently called the 'crude oil' of the 21st century and is expected to provide competitive superiority. Big data is in the limelight because, while conventional IT technology has reached the limits of what it can enable, big data can be utilized to create new value, such as business optimization and new business creation, through analysis. However, since big data has often been introduced hastily, without considering the strategic value to be deduced and achieved through it, many firms have difficulty deducing that strategic value and utilizing their data. According to a survey of 1,800 IT professionals from 18 countries worldwide, only 28% of corporations were utilizing big data well, and many respondents reported difficulties in strategic value deduction and operation. To introduce big data, the strategic value should be identified and environmental factors such as internal and external regulations and systems should be considered, but these factors have not been well reflected. The cause of failure turned out to be that big data was introduced by following the IT trend and surrounding environment, hastily and before the conditions for introduction were in place. For successful introduction, the strategic value obtainable through big data should be clearly understood and a systematic analysis of the environment and applicability is essential; however, because corporations consider only partial achievements and technological aspects, successful introduction is not being achieved. Previous research shows that most big data studies focus on concepts, cases, and practical suggestions without empirical study. The purpose of this study is to provide a theoretically and practically useful implementation framework and strategies for big data systems by conducting a comprehensive literature review, identifying influencing factors for successful big data systems implementation, and analyzing empirical models. To do this, the factors that can affect the intention to adopt big data were derived by reviewing information systems success factors, strategic value perception factors, environmental factors for information systems introduction, and the big data literature, and a structured questionnaire was developed. The questionnaire was then administered to, and statistical analysis performed on, the people in charge of big data inside corporations.
According to the statistical analysis, the strategic value perception factors and intra-industry environmental factors positively affected the intention to adopt big data. The theoretical, practical, and policy implications of the results are as follows. The first theoretical implication is that this study proposes theoretically grounded factors that affect big data adoption intention, derived by reviewing strategic value perception, environmental factors, and related prior studies, and proposes variables and measurement items that were empirically analyzed and verified. The study is meaningful in that it measured the influence of each variable on adoption intention by verifying the relationships between the independent and dependent variables through a structural equation model. Second, this study defined the independent variables (strategic value perception, environment), the dependent variable (adoption intention), and the moderating variables (type of business and corporate size) for big data adoption intention, and laid a theoretical base for subsequent empirical research in the field by developing measurement items with established reliability and validity. Third, by verifying the significance of the strategic value perception factors and environmental factors proposed in prior studies, this study can aid subsequent empirical research on the factors affecting big data adoption. The practical implications are as follows. First, this study laid an empirical base for the big data field by investigating the cause-and-effect relationship between strategic value perception and environmental factors and adoption intention, and by proposing measurement items with established reliability and validity. Second, this study found that strategic value perception positively affects big data adoption intention, underscoring the importance of strategic value perception. Third, the study proposes that a corporation introducing big data should do so on the basis of a precise analysis of its industry's internal environment. Fourth, this study shows that the size and type of business of the corporation should be considered when introducing big data, since the influencing factors differ by corporate size and business type. The policy implications are as follows. First, more varied utilization of big data is needed. The strategic value of big data can be approached in various ways, in products and services, productivity, decision making, and more, and can be utilized across all business fields, but major domestic corporations limit their consideration to parts of the product and service fields. Accordingly, when introducing big data, it is necessary to review utilization in detail and design the big data system in a form that maximizes the utilization rate. Second, the study identifies, at the introduction stage, the burden of system introduction cost, difficulty in using the system, and lack of credibility of the supplier corporations.
Since global IT corporations dominate the big data market, the big data adoption of domestic corporations cannot help but depend on foreign corporations. Considering that Korea, despite being a world IT power, does not have global IT corporations, big data can be seen as a chance to foster world-class corporations, and the government will need to cultivate star corporations through active policy support. Third, corporations lack internal and external professional manpower for big data introduction and operation. In big data, extracting valuable insight from data matters more than system construction itself; this requires talent equipped with academic knowledge and experience in various fields such as IT, statistics, strategy, and management, and such talent should be cultivated through systematic education. This study laid a theoretical base for empirical research on big data by identifying and verifying the main variables that affect big data adoption intention, and is expected to provide useful guidelines for corporations and policy makers considering big data implementation by analyzing that theoretical base empirically.
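For readers who want to see the shape of the analysis, the sketch below sets up a structural equation model of the kind the abstract describes, using the semopy package with lavaan-style syntax. The indicator names, the synthetic data, and the two-factor structure are placeholders for illustration, not the paper's actual measurement items or results.

```python
import numpy as np
import pandas as pd
import semopy

# Placeholder survey data: 200 respondents, 8 Likert-style items.
rng = np.random.default_rng(0)
items = ["sv1", "sv2", "sv3", "env1", "env2", "env3", "int1", "int2"]
df = pd.DataFrame(rng.normal(size=(200, len(items))), columns=items)

desc = """
# measurement model (latent =~ indicators)
strategic_value =~ sv1 + sv2 + sv3
environment     =~ env1 + env2 + env3
intention       =~ int1 + int2
# structural model: both factors predict adoption intention
intention ~ strategic_value + environment
"""

model = semopy.Model(desc)
model.fit(df)
print(model.inspect())  # path coefficients, standard errors, p-values
```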

A Study of Anomaly Detection for ICT Infrastructure using Conditional Multimodal Autoencoder (ICT 인프라 이상탐지를 위한 조건부 멀티모달 오토인코더에 관한 연구)

  • Shin, Byungjin;Lee, Jonghoon;Han, Sangjin;Park, Choong-Shik
    • Journal of Intelligence and Information Systems / v.27 no.3 / pp.57-73 / 2021
  • Maintenance and failure prevention through anomaly detection of ICT infrastructure are becoming important. System monitoring data are multidimensional time series data, which poses the difficulty of considering the characteristics of multidimensional data and of time series data at once. When dealing with multidimensional data, correlations between variables should be considered, but existing methods, such as probability- or linear-model-based and distance-based approaches, degrade due to the curse of dimensionality. In addition, time series data are commonly preprocessed with sliding-window techniques and time series decomposition for autocorrelation analysis; these techniques increase the dimensionality of the data, so they need to be supplemented. Anomaly detection is an old research field: statistical methods and regression analysis were used in the early days, and there are now active studies applying machine learning and artificial neural network technology. Statistically based methods are difficult to apply when data are non-homogeneous and do not detect local outliers well. Regression-based methods learn a regression formula based on parametric statistics and detect abnormality by comparing predicted and actual values; their performance drops when the model is not solid or when the data contain noise or outliers, and they are restricted to training on data without noise or outliers. The autoencoder, an artificial neural network trained to reproduce its input as closely as possible, has many advantages over existing probabilistic and linear models, cluster analysis, and supervised learning: it can be applied to data that do not satisfy probability-distribution or linearity assumptions, and it can learn without labeled training data. However, it remains limited in identifying local outliers in multidimensional data, and the dimensionality of the data grows greatly due to the characteristics of time series data. In this study, we propose a Conditional Multimodal Autoencoder (CMAE) that enhances anomaly detection performance by considering local outliers and time series characteristics. First, we applied a Multimodal Autoencoder (MAE) to address the limitations of local outlier identification in multidimensional data. Multimodal architectures are commonly used to learn different types of inputs, such as voice and images; the different modalities share the autoencoder's bottleneck and thereby learn their correlations. In addition, a Conditional Autoencoder (CAE) was used to learn the characteristics of time series data effectively without increasing the dimensionality of the data. Conditional inputs usually take category variables, but in this study time was used as the condition to learn periodicity. The proposed CMAE model was verified by comparison with a Unimodal Autoencoder (UAE) and a Multimodal Autoencoder (MAE). The restoration performance for 41 variables was confirmed for the proposed and comparison models. Restoration performance differs by variable: the Memory, Disk, and Network modalities are restored well, with small loss values, in all three autoencoder models.
The Process modality did not show a significant difference across the three models, and the CPU modality showed excellent performance in CMAE. ROC curves were prepared to evaluate anomaly detection performance in the proposed and comparison models, and AUC, accuracy, precision, recall, and F1-score were compared. On all indicators, performance ranked in the order CMAE, MAE, UAE. In particular, the recall of CMAE was 0.9828, confirming that it detects almost all anomalies. The accuracy of the model also improved, to 87.12%, and the F1-score was 0.8883, which is considered suitable for anomaly detection. In practical terms, the proposed model has an additional advantage beyond performance: techniques such as time series decomposition and sliding windows add procedures to manage, and their dimensional increase can slow inference, whereas the proposed model is easy to apply to practical tasks in terms of inference speed and model management.
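A compact functional-API sketch of the conditional multimodal idea follows: each modality gets its own encoder, all encoders share one bottleneck, and a time condition is concatenated at the bottleneck. The modality names match the abstract, but the dimensions and the sin/cos time encoding are assumptions, not the paper's 41-variable setup.

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

modal_dims = {"cpu": 8, "memory": 8, "disk": 8, "network": 9, "process": 8}

inputs, encoded = [], []
for name, dim in modal_dims.items():
    x = layers.Input(shape=(dim,), name=f"{name}_in")
    inputs.append(x)
    encoded.append(layers.Dense(4, activation="relu", name=f"{name}_enc")(x))

time_cond = layers.Input(shape=(2,), name="time_cond")  # e.g. sin/cos of hour
inputs.append(time_cond)

# Shared bottleneck conditioned on time, so periodic behaviour is modelled
# without widening the data itself.
z = layers.Dense(8, activation="relu", name="bottleneck")(
    layers.Concatenate()(encoded + [time_cond]))

outputs = [layers.Dense(dim, name=f"{name}_out")(z)
           for name, dim in modal_dims.items()]

cmae = Model(inputs, outputs)
cmae.compile(optimizer="adam", loss="mse")
# At inference, a large per-sample reconstruction error flags an anomaly.
```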

Correlation between Leaf Size and Seed Weight of Soybean (콩의 잎 크기와 종실 무게와의 상관)

  • Park, Gyu-Hwan;Baek, In Youl;Han, Won Young;Kang, Sung Taek;Choung, Myoung Gun;Ko, Jong Min
    • Korean Journal of Crop Science / v.58 no.4 / pp.383-387 / 2013
  • This study was carried out to examine whether leaf size can be used as a selection criterion for large-seed genotypes in soybean (Glycine max (L.) Merr.) breeding programs. Two hundred twenty-nine soybean germplasms collected in Korea, the United States, China, and Japan were used in this experiment. The areas of the unifoliate leaf, the middle leaflet of the first trifoliate leaf, and the third trifoliate leaf ranged from 3.2 to 33.8 cm², 9.2 to 29.5 cm², and 7.2 to 58.9 cm², respectively. One-hundred-seed weight also showed great variation, from 2.7 to 39.0 g. The average leaf areas of the unifoliate, the middle leaflet of the first trifoliate, and the third trifoliate leaf were 15.7 cm², 18.1 cm², and 32.7 cm², respectively, and the average seed weight was 17.2 g per one hundred seeds. Significantly positive correlations were observed between seed weight and the leaf area of the unifoliate (r = 0.80**), first trifoliate (r = 0.75**), and third trifoliate (r = 0.67**) leaves. Both the leaf length and leaf width of the unifoliate, the middle leaflet of the first trifoliate, and the third trifoliate leaf were significantly positively correlated with seed weight, and both correlations for the unifoliate were higher than for the other leaves. The correlations for leaf width were higher than those for leaf length. The leaf length/width (L/W) ratio of upper leaves was higher than that of lower leaves. Both the leaf area and the leaf width of the unifoliate leaf are the most suitable predictive characteristics for early selection for seed weight in soybean breeding programs.
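The correlations reported above are plain Pearson coefficients, which the short sketch below reproduces with scipy on hypothetical stand-in values (the real study measured 229 germplasms).

```python
import numpy as np
from scipy.stats import pearsonr

unifoliate_area = np.array([12.1, 15.7, 20.3, 9.8, 18.4, 25.0])   # cm^2, illustrative
seed_weight_100 = np.array([14.2, 17.0, 22.5, 10.1, 19.8, 28.3])  # g per 100 seeds

r, p = pearsonr(unifoliate_area, seed_weight_100)
print(f"r = {r:.2f}, p = {p:.4f}")  # the paper reports r = 0.80** for the unifoliate
```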

Experimental Models of Schizophrenia (정신분열병의 실험적 모델)

  • Cheon, Jin-Sook
    • Korean Journal of Biological Psychiatry / v.6 no.2 / pp.153-160 / 1999
  • Animal models can provide a useful tool for the study of some aspects of psychiatric disorders and their treatment. The four criteria for the evaluation of animal models of psychiatric disorders are as follows: 1) similarity of inducing conditions, 2) similarity of behavioral state, 3) common underlying neurobiological mechanisms, and 4) reversal by clinically effective treatment techniques. Several animal models have been proposed for schizophrenia: the phenylethylamine model, L-dopa model, hallucinogen model, cocaine model, amphetamine model, phencyclidine model, noradrenergic reward system lesion model, reticular stimulation model, social isolation model, conditioned avoidance reaction, catalepsy test, paw test, self-stimulation paradigms, latent inhibition paradigms, blocking paradigms, prepulse inhibition of the startle reflex, rodent interaction, social behavior in monkeys, hippocampal damage, high ambient pressure, and models using selective breeding. Among them, animals with bilateral lesions of the hippocampus may provide an adequate model for several symptoms of schizophrenia, and the ketamine model can reproduce negative symptoms and cognitive deficits as well as positive symptoms. In conclusion, no model of schizophrenia is entirely representative of the disease, and findings gleaned from model systems must be interpreted cautiously. Furthermore, the process of developing and validating animal models must work in concert with efforts to identify reliable measures of human phenomenology.


Online news-based stock price forecasting considering homogeneity in the industrial sector (산업군 내 동질성을 고려한 온라인 뉴스 기반 주가예측)

  • Seong, Nohyoon;Nam, Kihwan
    • Journal of Intelligence and Information Systems / v.24 no.2 / pp.1-19 / 2018
  • Since stock movement forecasting is an important issue both academically and practically, studies related to stock price prediction have been actively conducted. Stock price forecasting research is classified by whether it uses structured or unstructured data, and in detail into technical analysis, fundamental analysis, and media-effect analysis. In the big data era, research on stock price prediction that incorporates big data is actively under way, relying mainly on machine learning techniques. Methods that incorporate media effects have been attracting attention recently, among which research that analyzes online news and utilizes it to forecast stock prices has become mainstream. Previous studies predicting stock prices through online news mostly perform sentiment analysis, building a separate corpus for each company and constructing a dictionary that predicts stock prices by recording responses to past price movements. These studies have thus examined the impact of online news on individual companies: for example, stock movements of Samsung Electronics are predicted using only news about Samsung Electronics. Methods that consider influences among highly related companies have also been studied recently: for example, stock movements of Samsung Electronics are predicted using news about Samsung Electronics and a highly related company such as LG Electronics. These studies examine the effect of news from an industrial sector assumed to be homogeneous on an individual company, with homogeneous industries classified according to the Global Industry Classification Standard (GICS); in other words, the analyses assume that GICS sectors are homogeneous. However, existing studies have limitations in that they neither account for influential, highly relevant companies nor reflect the heterogeneity that exists within GICS sectors. Our examination of various sectors shows that some industrial sectors are not homogeneous groups. To overcome these limitations, our study suggests a methodology that reflects the heterogeneous effects of the industrial sector on the stock price by applying k-means clustering. Multiple kernel learning is mainly used to integrate data with various characteristics: it has several kernels, each of which receives and predicts different data. We used multiple kernel learning to incorporate the effects of the target firm and its relevant firms simultaneously; each kernel was assigned to predict stock prices with features of the financial news of the target firm and of the industrial groups produced by k-means cluster analysis. To show that the suggested methodology is appropriate, experiments were conducted on three years of online news and stock prices. The results of this study are as follows. (1) We confirmed that information from industrial sectors related to the target company also contains meaningful information for predicting its stock movements, and that a machine learning algorithm has better predictive power when the news of relevant companies and the target company are considered together.
(2) It is important to predict stock movements with a number of clusters that varies with the level of homogeneity in the industrial sector. In other words, when stock prices are homogeneous within a sector, it is important to use the relational effect at the level of the industry group, without clustering or with only a small number of clusters; when stock prices are heterogeneous within a group, it is important to cluster firms into groups. This study contributes by verifying that firms classified under the Global Industry Classification Standard exhibit heterogeneity and by suggesting that relevance should be defined through machine learning and statistical analysis rather than simply taken from the GICS. It also contributes by demonstrating the efficiency of a prediction model that reflects heterogeneity.
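The clustering step this abstract proposes can be sketched briefly: k-means over firms' return series within one GICS sector, with the number of clusters varied according to how heterogeneous the sector looks. The synthetic data and the choices of k below are illustrative assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans

# Stand-in data: daily returns for 30 firms in one sector over 250 days.
rng = np.random.default_rng(42)
returns = rng.normal(size=(30, 250))

for k in (2, 3, 5):  # more clusters for more heterogeneous sectors
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(returns)
    print(k, np.bincount(labels))  # cluster sizes; cluster mates share a kernel
```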

A Study on Relationship between Physical Elements and Tennis/Golf Elbow

  • Choi, Jungmin;Park, Jungwoo;Kim, Hyunseung
    • Journal of the Ergonomics Society of Korea / v.36 no.3 / pp.183-196 / 2017
  • Objective: The purpose of this research was to assess the agreement between job physical risk factor analysis by ergonomists using ergonomic methods and physical examinations made by occupational physicians on the presence of musculoskeletal disorders of the upper extremities. Background: Ergonomics is the systematic application of principles concerned with the design of devices and working conditions for enhancing human capabilities and optimizing working and living conditions. Proper ergonomic design is necessary to prevent injuries and physical and emotional stress. The major types of ergonomic injuries and incidents are cumulative trauma disorders (CTDs), acute strains, sprains, and system failures. Minimizing the use of excessive force and awkward postures can help to prevent such injuries. Method: Initial data were collected as part of a larger study by the University of Utah Ergonomics and Safety program field data collection teams and medical data collection teams from the Rocky Mountain Center for Occupational and Environmental Health (RMCOEH). Subjects included 173 male and female workers: 83 at Beehive Clothing (a clothing plant), 74 at Autoliv (a plant making air bags for vehicles), and 16 at Deseret Meat (a meat-processing plant). Posture and effort levels were analyzed using a software program developed at the University of Utah (Utah Ergonomic Analysis Tool). The Ergonomic Epicondylitis Model (EEM) was developed to assess the risk of epicondylitis from observable job physical factors. The model considers five job risk factors: (1) intensity of exertion, (2) forearm rotation, (3) wrist posture, (4) elbow compression, and (5) speed of work. Qualitative ratings of these physical factors were determined during video analysis. Personal variables were also investigated to study their relationship with epicondylitis. Logistic regression models were used to determine the association between risk factors and symptoms of epicondyle pain. Results: The results indicate that gender, smoking status, and BMI have an effect on the risk of epicondylitis, but there is no statistically significant relationship between the EEM and epicondylitis. Conclusion: This research studied the relationship between the Ergonomic Epicondylitis Model (EEM) and the occurrence of epicondylitis. The model was not predictive of epicondylitis; however, epicondylitis was clearly associated with some individual risk factors, such as smoking status, gender, and BMI. Based on these results, future research may identify factors that increase the risk of epicondylitis. Application: Although this research used a combination of questionnaires, ergonomic job analysis, and medical job analysis to verify risk factors related to epicondylitis, there are limitations. The sample size was not very large, since only 173 subjects were available, and the study was conducted in only three facilities in Utah: a plant making air bags for vehicles, a meat-processing plant, and a clothing plant. Results may improve if working conditions in other kinds of facilities are considered, so future research should analyze additional subjects in different kinds of facilities. Repetition and duration of a task were not considered as risk factors in this research; these two factors could be associated with epicondylitis, so it could be important to include them in future research.
Psychosocial data and workplace conditions (e.g., low temperature) were also noted during data collection and could be used to further study the prevalence of epicondylitis. Univariate analysis methods could be applied to each variable of the EEM. This research used multivariate analysis, which made it difficult to recognize the separate effect of each variable: univariate analysis deals with one predictor variable at a time, whereas multivariate analysis deals with multiple predictor variables combined in a predetermined manner. Univariate analysis could show how each variable is associated with epicondyle pain, which may allow more appropriate weighting factors to be determined and thereby improve the performance of the EEM.
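The univariate-versus-multivariate contrast suggested above is easy to make concrete with logistic regression in statsmodels. The dataframe, column names, and synthetic values below are hypothetical; only the modelling pattern reflects the study.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Placeholder data shaped like the study: 173 workers, binary outcome.
rng = np.random.default_rng(1)
df = pd.DataFrame({
    "epicondylitis": rng.integers(0, 2, 173),
    "intensity": rng.normal(size=173),        # one EEM physical factor
    "bmi": rng.normal(26, 4, 173),
    "smoker": rng.integers(0, 2, 173),
})

# Univariate: one predictor at a time, as proposed for refining the EEM.
for var in ("intensity", "bmi", "smoker"):
    fit = smf.logit(f"epicondylitis ~ {var}", data=df).fit(disp=0)
    print(var, round(fit.params[var], 3), round(fit.pvalues[var], 3))

# Multivariate: all predictors combined, as in the original analysis.
multi = smf.logit("epicondylitis ~ intensity + bmi + smoker", data=df).fit(disp=0)
print(multi.summary())
```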