• Title/Summary/Keyword: predictive accuracy


Sentiment Analysis for Public Opinion in the Social Network Service (SNS 기반 여론 감성 분석)

  • HA, Sang Hyun;ROH, Tae Hyup
    • The Journal of the Convergence on Culture Technology / v.6 no.1 / pp.111-120 / 2020
  • As an application of big data and artificial intelligence techniques, this study proposes a sentiment-based opinion poll methodology built on unstructured language data, in contrast to conventional opinion polling. As an alternative to sentiment classification models based on existing statistical analysis, real-time Twitter data related to parliamentary elections were collected, and the polarity and intensity of public opinion were analyzed empirically using attribute-based sentiment analysis. To classify the polarity of words used on individual SNS accounts, Lasso and Ridge regression models were trained, the independent variables that most strongly affect the polarity variable were extracted, and the polarity of new Twitter data was estimated with the trained models. A social network analysis of users' relationships with friends on SNS suggested a way to identify peer-group sentiment. Based on what voters expressed on social media, political opinion sentiment analysis was used to predict party approval ratings and to measure the accuracy of the predictive polarity model, confirming the applicability of the sentiment analysis methodology in the political field.
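
The abstract above mentions Lasso and Ridge regression for picking out the variables that most affect polarity. As a minimal sketch of why the two behave differently, the single-feature closed forms below show that Ridge only shrinks a weight while Lasso can set a weak one exactly to zero. The function names, the single-feature setting, and the penalty scaling are assumptions made for illustration; they are not taken from the paper.

```python
def ridge_weight(x, y, lam):
    """Minimise 0.5*sum((y - w*x)^2) + 0.5*lam*w^2 for a single feature x."""
    rho = sum(a * b for a, b in zip(x, y))
    # Assumes x is not all zeros or lam > 0, so the denominator is positive.
    return rho / (sum(a * a for a in x) + lam)


def lasso_weight(x, y, lam):
    """Minimise 0.5*sum((y - w*x)^2) + lam*|w| via soft-thresholding."""
    rho = sum(a * b for a, b in zip(x, y))
    shrunk = max(abs(rho) - lam, 0.0)  # weak correlations are cut to zero
    sign = 1.0 if rho > 0 else -1.0
    # Assumes x is not all zeros.
    return sign * shrunk / sum(a * a for a in x)
```

With a weakly related feature, `lasso_weight` returns zero (the variable is dropped) whereas `ridge_weight` returns a small non-zero value.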

Hybrid Preference Prediction Technique Using Weighting based Data Reliability for Collaborative Filtering Recommendation System (협업 필터링 추천 시스템을 위한 데이터 신뢰도 기반 가중치를 이용한 하이브리드 선호도 예측 기법)

  • Lee, O-Joun;Baek, Yeong-Tae
    • Journal of the Korea Society of Computer and Information / v.19 no.5 / pp.61-69 / 2014
  • Collaborative filtering recommendation builds a subset of similar items or similar users from users' preferences for items and uses it to predict a user's preference for a particular item. Consequently, when the preference matrix has low density, the reliability of recommendations drops sharply. To address this problem, we propose a hybrid preference prediction technique that uses weighting based on data reliability. Preference prediction is carried out by creating a similar-item subset and a similar-user subset, predicting the user's preference from each subset, and merging the two predicted values with weights determined by model conditions. This technique increases the accuracy of user preference prediction and makes it possible to implement a recommendation system that provides highly reliable recommendations even when the density of the preference matrix is low. The effectiveness of the system is verified using Mean Absolute Error. In experiments, the proposed technique shows an average improvement of 21.7% over Hao Ji's technique when the sparsity of the preference matrix exceeds 84%.
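
The merging step described above can be illustrated roughly as follows: an item-based and a user-based prediction are blended with reliability weights, and the result is scored with Mean Absolute Error. How the weights are derived (the paper's "model conditions") is not given in the abstract, so the weights here are plain inputs and all names are hypothetical.

```python
def merge_predictions(item_pred, user_pred, item_weight, user_weight):
    """Blend two preference predictions using non-negative reliability weights."""
    total = item_weight + user_weight
    if total == 0:
        raise ValueError("at least one weight must be positive")
    return (item_weight * item_pred + user_weight * user_pred) / total


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted ratings."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)
```

For instance, `merge_predictions(4.0, 2.0, 3, 1)` leans toward the item-based value because that subset is given three times the weight.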

Model for assessing the contamination of agricultural plants by accidentally released tritium (삼중수소 사고유출로 인한 농작물 오염 평가 모델)

  • Keum, Dong-Kwon;Lee, Han-Soo;Kang, Hee-Suk;Choi, Young-Ho;Lee, Chang-Woo
    • Journal of Radiation Protection and Research / v.30 no.1 / pp.45-54 / 2005
  • A dynamic compartment model was developed to assess the level of contamination of agricultural plants by tritium accidentally released from a nuclear facility. The model consists of a set of interconnected compartments representing the atmosphere, soil, and plant. Three categories of plant are considered: leafy vegetables, grain plants, and tuber plants, each of which is modeled separately to account for the different transport pathways of tritium. The predictive accuracy of the model was tested by analyzing tritium exposure experiments on rice plants. The predicted TFWT (tissue free water tritium) concentration of the rice ear at harvest was strongly affected by the absolute humidity of the air, the ratio of root uptake, and the rate of rainfall, while its OBT (organically bound tritium) concentration was strongly affected by the growing period of the ear, the absolute humidity of the air, and the hydrogen content of the organic phase. The model predictions agreed well with the experimental results for the OBT concentration of the ear.
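
To make the idea of interconnected compartments concrete, here is a generic sketch of a first-order compartment model integrated with explicit Euler steps. It is not the authors' model: the compartment names follow the abstract, but the transfer rates, time step, and the absence of humidity, rainfall, or TFWT/OBT chemistry are simplifying assumptions.

```python
def simulate(initial, rates, dt, steps):
    """Integrate first-order transfers between compartments (explicit Euler).

    initial: dict mapping compartment name -> inventory
    rates:   dict mapping (source, target) -> rate constant [1/time]
    """
    state = dict(initial)
    for _ in range(steps):
        delta = {name: 0.0 for name in state}
        for (src, dst), k in rates.items():
            flow = k * state[src] * dt  # amount moved during this step
            delta[src] -= flow
            delta[dst] += flow
        for name in state:
            state[name] += delta[name]
    return state


# Hypothetical rate constants, chosen only to exercise the sketch.
example = simulate(
    {"atmosphere": 1.0, "soil": 0.0, "plant": 0.0},
    {("atmosphere", "soil"): 0.1, ("atmosphere", "plant"): 0.2, ("soil", "plant"): 0.05},
    dt=0.1,
    steps=100,
)
```

Because every outflow is added to another compartment, the total inventory is conserved, which is a convenient sanity check on any such model.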

Design of Particle Swarm Optimization-based Polynomial Neural Networks (입자 군집 최적화 알고리즘 기반 다항식 신경회로망의 설계)

  • Park, Ho-Sung;Kim, Ki-Sang;Oh, Sung-Kwun
    • The Transactions of The Korean Institute of Electrical Engineers / v.60 no.2 / pp.398-406 / 2011
  • In this paper, we introduce a new architecture of PSO-based Polynomial Neural Networks (PNN) and discuss its comprehensive design methodology. The conventional PNN is based on an extended Group Method of Data Handling (GMDH) and uses a polynomial order (viz. linear, quadratic, or modified quadratic) and a number of node inputs that are fixed (selected in advance by the designer) for the Polynomial Neurons (PNs) in each layer as the network grows. Moreover, there is no guarantee that a conventional PNN generated through learning has the optimal network architecture. The PSO-based PNN yields a structurally optimized network and offers greater flexibility than the conventional PNN. Applying the PSO-based design procedure at each layer of the PNN leads to the selection of preferred PNs with specific local characteristics (such as the number of input variables, the input variables themselves, and the order of the polynomial). Two optimization mechanisms of the PSO-based PNN are explored: structural optimization is realized via PSO, whereas parametric optimization uses standard least-squares learning. To evaluate the performance of the PSO-based PNN, the model is tested on gas furnace process data and pH neutralization process data. To analyze the characteristics of the nonlinear data and to construct an efficient model, the entire data set is partitioned in two ways: Division I (training and testing datasets) and Division II (training, validation, and testing datasets). A comparative analysis shows that the proposed PSO-based PNN achieves higher accuracy and better predictive capability than previously presented intelligent models.
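
The three polynomial orders named in the abstract can be sketched for a two-input polynomial neuron as below. The exact term sets (in particular, "modified quadratic" meaning the quadratic form without squared terms) follow a common convention in the PNN literature and are an assumption here, as are the function names; the coefficients would normally be fitted by least squares, which this sketch omits.

```python
def polynomial_terms(x1, x2, order):
    """Return the basis terms of a two-input polynomial neuron."""
    if order == "linear":
        return [1.0, x1, x2]
    if order == "modified_quadratic":  # assumed: cross term only, no squares
        return [1.0, x1, x2, x1 * x2]
    if order == "quadratic":
        return [1.0, x1, x2, x1 * x1, x2 * x2, x1 * x2]
    raise ValueError(f"unknown polynomial order: {order}")


def neuron_output(coeffs, x1, x2, order):
    """Evaluate the neuron as a weighted sum of its basis terms."""
    terms = polynomial_terms(x1, x2, order)
    if len(coeffs) != len(terms):
        raise ValueError("coefficient count does not match polynomial order")
    return sum(c * t for c, t in zip(coeffs, terms))
```

In a PSO-based design, the order string and the choice of the two inputs would be among the quantities a particle encodes.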

Using Data Mining Techniques to Predict Win-Loss in Korean Professional Baseball Games (데이터마이닝을 활용한 한국프로야구 승패예측모형 수립에 관한 연구)

  • Oh, Younhak;Kim, Han;Yun, Jaesub;Lee, Jong-Seok
    • Journal of Korean Institute of Industrial Engineers / v.40 no.1 / pp.8-17 / 2014
  • In this research, we employed various data mining techniques to build predictive models for win-loss prediction in Korean professional baseball games. Historical data on players and teams were obtained from the official materials provided on the KBO website. From the collected raw data, we prepared two additional types of dataset, in ratio and binary format respectively. The ratio dataset was generated by dividing the away team's records by the corresponding home team's records, while the binary dataset was obtained by comparing the record values. We applied seven classification techniques to the three (raw, ratio, and binary) datasets: decision tree, random forest, logistic regression, neural network, support vector machine, linear discriminant analysis, and quadratic discriminant analysis. Among the 21 (= 3 datasets ${\times}$ 7 techniques) prediction scenarios, the most accurate model was obtained from the random forest technique on the binary dataset, with a prediction accuracy of 84.14%. We also observed that the ratio and binary datasets helped to build better prediction models than the raw data. From the variable selection capability of decision tree, random forest, and stepwise logistic regression, we found that annual salary, earned runs, strikeouts, pitcher's winning percentage, and bases on balls (walks) are important winning factors in a game. This research is distinct from existing studies in that we used three different types of data and various data mining techniques for win-loss prediction in Korean professional baseball games.
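
The derivation of the ratio and binary datasets can be sketched as follows. The abstract only says the binary data come from "comparing the record values", so the rule used here (1 when the away value is larger) is an assumption, as are the field names; the ratio step also assumes no home-team record is zero.

```python
from itertools import product


def to_ratio(away, home):
    """Divide each away-team record by the matching home-team record."""
    return {key: away[key] / home[key] for key in away}


def to_binary(away, home):
    """Assumed rule: 1 if the away-team value exceeds the home-team value."""
    return {key: 1 if away[key] > home[key] else 0 for key in away}


DATASETS = ["raw", "ratio", "binary"]
TECHNIQUES = [
    "decision tree", "random forest", "logistic regression", "neural network",
    "support vector machine", "linear discriminant analysis",
    "quadratic discriminant analysis",
]
# Every dataset paired with every technique gives the 21 scenarios.
SCENARIOS = list(product(DATASETS, TECHNIQUES))
```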

Optimized Structural and Colorimetrical Modeling of Yarn-Dyed Woven Fabrics Based on the Kubelka-Munk Theory (Kubelka-Munk이론에 기반한 사염직물의 최적화된 구조-색채모델링)

  • Chae, Youngjoo
    • Journal of the Korean Society of Clothing and Textiles / v.42 no.3 / pp.503-515 / 2018
  • In this research, the three-dimensional structural and colorimetrical modeling of yarn-dyed woven fabrics was conducted based on the Kubelka-Munk theory (K-M theory) for their accurate color predictions. In the K-M theory for textile color formulation, the absorption and scattering coefficients, denoted K and S, respectively, of a colored fabric are represented using those of the individual colorants or color components used. One-hundred forty woven fabric samples were produced in a wide range of structures and colors using red, yellow, green, and blue yarns. Through the optimization of previous two-dimensional color prediction models by considering the key three-dimensional structural parameters of woven fabrics, three three-dimensional K/S-based color prediction models, that is, linear K/S, linear log K/S, and exponential K/S models, were developed. To evaluate the performance of the three-dimensional color prediction models, the color differences, ${\Delta}L^*$, ${\Delta}C^*$, ${\Delta}h^{\circ}$, and ${\Delta}E_{CMC(2:1)}$, between the predicted and the measured colors of the samples were calculated as error values and then compared with those of previous two-dimensional models. As a result, three-dimensional models have proved to be of substantially higher predictive accuracy than two-dimensional models in all lightness, chroma, and hue predictions with much lower ${\Delta}L^*$, ${\Delta}C^*$, ${\Delta}h^{\circ}$, and the resultant ${\Delta}E_{CMC(2:1)}$ values.
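
For readers unfamiliar with K/S values, the sketch below uses the standard Kubelka-Munk relation for an opaque layer, K/S = (1 - R)^2 / (2R), its inverse, and a simple proportion-weighted linear mix. The mixing function is only a generic illustration of a "linear K/S" idea; the paper's three-dimensional structural parameters and fitted models are not reproduced, and the function names are hypothetical.

```python
import math


def k_over_s(reflectance):
    """Kubelka-Munk K/S from reflectance R, with 0 < R <= 1."""
    if not 0.0 < reflectance <= 1.0:
        raise ValueError("reflectance must be in (0, 1]")
    return (1.0 - reflectance) ** 2 / (2.0 * reflectance)


def reflectance_from_k_over_s(ks):
    """Invert the relation: R = 1 + K/S - sqrt((K/S)^2 + 2*K/S)."""
    return 1.0 + ks - math.sqrt(ks * ks + 2.0 * ks)


def linear_mix(ks_values, fractions):
    """Proportion-weighted sum of component K/S values (illustrative only)."""
    if len(ks_values) != len(fractions):
        raise ValueError("one fraction is needed per component")
    return sum(ks * f for ks, f in zip(ks_values, fractions))
```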

Efficient Transmission Scheme with Viewport Prediction of 360VR Content using Sound Location Information (360VR 콘텐츠의 음원위치정보를 활용한 시점예측 전송기법)

  • Jeong, Eunyoung;Kim, Dong Ho
    • Journal of Broadcast Engineering / v.24 no.6 / pp.1002-1012 / 2019
  • 360VR content requires low latency, such as an immediate response to changes in the viewer's viewport, as well as high-quality video delivery. Efficient transmission that guarantees the QoE (Quality of Experience) of 360VR content under limited bandwidth therefore needs to be considered. Several studies have sought to reduce overall bandwidth consumption by predicting a user's viewport and allocating different bit rates to the area corresponding to the viewport. In this paper, we propose a novel viewport prediction scheme that uses the sound source location information of 360VR content as auditory recognition information, along with visual recognition information. We also propose an efficient transmission algorithm that allocates bit rates appropriately based on the improved viewport prediction. The proposed scheme improves the accuracy of viewport prediction and delivers high-quality video to the tiles corresponding to the user's viewpoint within the limited bandwidth.

Extraction Method of Significant Clinical Tests Based on Data Discretization and Rough Set Approximation Techniques: Application to Differential Diagnosis of Cholecystitis and Cholelithiasis Diseases (데이터 이산화와 러프 근사화 기술에 기반한 중요 임상검사항목의 추출방법: 담낭 및 담석증 질환의 감별진단에의 응용)

  • Son, Chang-Sik;Kim, Min-Soo;Seo, Suk-Tae;Cho, Yun-Kyeong;Kim, Yoon-Nyun
    • Journal of Biomedical Engineering Research / v.32 no.2 / pp.134-143 / 2011
  • Selecting meaningful clinical tests and their reference values from high-dimensional clinical data with an imbalanced class distribution, in which one class is represented by a large number of examples while the other is represented by only a few, is an important but difficult issue for differential diagnosis between similar diseases. For this purpose, this study introduces methods based on the concepts of the discernibility matrix and discernibility function in rough set theory (RST), combined with two discretization approaches: equal-width and equal-frequency discretization. The discretization approaches are used to define the reference values for clinical tests, and the discernibility matrix and function are used to extract a subset of significant clinical tests from the translated nominal attribute values. To show their applicability to the differential diagnosis problem, we applied them to extract the significant clinical tests and their reference values that distinguish a normal group (N = 351) from an abnormal group (N = 101) with either cholecystitis or cholelithiasis. In addition, we investigated not only the selected significant clinical tests and the variations in their reference values, but also the average predictive performance on four evaluation criteria, i.e., accuracy, sensitivity, specificity, and geometric mean, during 10-fold cross-validation. The experimental results confirmed that the rough set approximation methods based on the two discretization approaches give better results with relative frequency than with absolute frequency in terms of the average geometric mean. This shows that the prediction model using relative frequency can be used effectively in classification and prediction problems on clinical data with an imbalanced class distribution.
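
The two discretization approaches and the geometric-mean criterion mentioned above can be sketched as follows. This is a generic illustration, not the paper's procedure: bin counts are free parameters, ties in the equal-frequency version are not treated specially, and the function names are hypothetical.

```python
import math


def equal_width_bins(values, n_bins):
    """Assign each value to one of n_bins intervals of equal width."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / n_bins
    # The maximum value would fall just past the last bin, so clamp it.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def equal_frequency_bins(values, n_bins):
    """Assign values to n_bins groups holding roughly equal counts."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    labels = [0] * len(values)
    for rank, idx in enumerate(order):
        labels[idx] = rank * n_bins // len(values)
    return labels


def geometric_mean(tp, fn, tn, fp):
    """sqrt(sensitivity * specificity); assumes both classes are present."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return math.sqrt(sensitivity * specificity)
```

On skewed measurements such as `[1, 2, 3, 4, 100, 200]`, equal-width binning puts almost everything in the first bin, while equal-frequency binning splits the values evenly. The geometric mean is a common choice for imbalanced classes because it stays low unless both sensitivity and specificity are high.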

Analysis Standardization Layout for Efficient Prediction Model (예측모델 구축을 위한 분석 단계별 레이아웃 표준화 연구)

  • Kim, Hyo-Kwan;Hwang, Won-Yong
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.11 no.5 / pp.543-549 / 2018
  • Prediction is becoming increasingly important because of the uncertain business environment. To implement a predictive model, a number of data engineers and scientists take part in the project, and various prediction ideas are suggested to enhance the model. It takes a long time to validate the model's accuracy, and it is also hard to redesign and develop the code. This study suggests a Lego-like development method for finding the most efficient way to integrate various prediction methodologies into one model. The methodology works by using the same data layout in the development code for every idea. Each idea can therefore be validated separately, and ideas are easy to add and delete because they are developed in Lego form, which can shorten the overall development time. Finally, test results are presented to confirm that the proposed method makes it easy to add and delete ideas.
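
One way to picture the "same data layout for every idea" approach is a small pipeline in which each idea is a function that takes and returns rows in one shared layout, so ideas can be plugged in or removed independently. The layout columns, class name, and example ideas below are all hypothetical; the abstract does not specify them.

```python
LAYOUT = {"period", "actual", "forecast"}  # hypothetical shared columns


def check_layout(rows):
    """Reject any row whose columns differ from the shared layout."""
    for row in rows:
        if set(row) != LAYOUT:
            raise ValueError(f"row does not match layout: {row}")


class Pipeline:
    """Holds named ideas; each idea maps layout rows to layout rows."""

    def __init__(self):
        self.ideas = {}

    def add(self, name, idea):
        self.ideas[name] = idea

    def remove(self, name):
        del self.ideas[name]

    def run(self, rows):
        check_layout(rows)
        for idea in self.ideas.values():
            rows = idea(rows)
            check_layout(rows)  # every idea must preserve the layout
        return rows


def floor_at_zero(rows):
    """Example idea: forecasts cannot be negative."""
    return [{**r, "forecast": max(r["forecast"], 0.0)} for r in rows]


def double_forecast(rows):
    """Example idea: scale every forecast by two."""
    return [{**r, "forecast": r["forecast"] * 2} for r in rows]
```

Because every idea consumes and produces the same layout, each one can be evaluated on its own and swapped without touching the others.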

Predictive System for Unconfined Compressive Strength of Lightweight Treated Soil(LTS) using Deep Learning (딥러닝을 이용한 경량혼합토의 일축압축강도 예측 시스템)

  • Park, Bohyun;Kim, Dookie;Park, Dae-Wook
    • Journal of the Korea Institute for Structural Maintenance and Inspection / v.24 no.3 / pp.18-25 / 2020
  • The unconfined compressive strength of lightweight treated soil (LTS) depends strongly on the mixing ratio. To characterize the relation between the various LTS components and the unconfined compressive strength of LTS, extensive studies have been conducted, proposing normalized factors using regression models based on their experimental results. However, such results obtained from laboratory experiments cannot be expected to give consistent prediction accuracy because of the complicated relation between materials and mix proportions. In this study, a deep neural network model (Deep-LTS), based on experimental test results obtained under various mixing conditions, was applied to predict the unconfined compressive strength. It was found that the unconfined compressive strength of LTS at a given mixing ratio could be reasonably estimated using the proposed Deep-LTS.