• Title/Summary/Keyword: statistical computing


A Co-Evolutionary Computing for Statistical Learning Theory

  • Jun Sung-Hae
    • International Journal of Fuzzy Logic and Intelligent Systems / v.5 no.4 / pp.281-285 / 2005
  • Learning and evolving are two basic elements of data mining. In contrast to classical learning theory, which is based on an objective function that minimizes training error, evolutionary computing offers an efficient approach to constructing an optimal model without minimizing training error. The global search that evolutionary computing performs over the solution space can resolve the local optima problems of learning models. In this research, we combine a co-evolving algorithm with statistical learning theory and propose a co-evolutionary computing method that overcomes the local optima problems of statistical learning theory. We apply the proposed model to classification and prediction problems. In the experiments, we verify the improved performance of our model using data sets from the UCI Machine Learning Repository and KDD Cup 2000.

An Efficient Method for Computing MINQUE Estimators in the Mixed Models

  • Lee, Jang-Taek;Kim, Byung-Chun
    • Journal of the Korean Statistical Society / v.18 no.1 / pp.4-12 / 1989
  • An efficient method for computing minimum norm quadratic unbiased estimates (MINQUE) of variance components in the mixed model is developed. The algorithm, which uses the W-matrix, saves both storage and computing time.


Ubiquitous Computing and Statistics; What's the Connection?

  • Jun, Sung-Hae;Jorn, Hongsuk
    • Communications for Statistical Applications and Methods / v.11 no.2 / pp.287-295 / 2004
  • Mark Weiser introduced ubiquitous computing in his 1991 article 'The Computer for the 21st Century'. It has become a new paradigm following the Internet. The rapid development of mobile computers, wireless networks, and intelligent systems now supports ubiquitous computing environments. Researchers in information science have studied ubiquitous computing, but it has not yet been studied in the Korean statistics community. In this paper, we therefore propose a connection between statistics and ubiquitous computing. As an example, we present an efficient cache hoarding scheme for ubiquitous computing based on statistical methods, and we verify the proposal experimentally.

A Brief Introduction to Soft Computing

  • Hong Dug Hun;Hwang Changha
    • Proceedings of the Korean Statistical Society Conference / 2004.11a / pp.65-66 / 2004
  • The aim of this article is to illustrate what soft computing is and how important it is.


R: AN OVERVIEW AND SOME CURRENT DIRECTIONS

  • Tierney, Luke
    • Journal of the Korean Statistical Society / v.36 no.1 / pp.31-55 / 2007
  • R is an open source language for statistical computing and graphics based on the ACM software award-winning S language. R is widely used for data analysis and has become a major vehicle for making available new statistical methodology. This paper presents an overview of the design philosophy and the development model for R, reviews the basic capabilities of the system, and outlines some current projects that will influence future developments of R.

Microblog User Geolocation by Extracting Local Words Based on Word Clustering and Wrapper Feature Selection

  • Tian, Hechan;Liu, Fenlin;Luo, Xiangyang;Zhang, Fan;Qiao, Yaqiong
    • KSII Transactions on Internet and Information Systems (TIIS) / v.14 no.10 / pp.3972-3988 / 2020
  • Existing methods for microblog user geolocation rely on statistical features to extract local words. The extracted words include many non-local words, which lowers geolocation accuracy. Considering both the statistical and the semantic features of local words, this paper proposes a microblog user geolocation method that extracts local words using word clustering and wrapper feature selection. First, ordinary words without positional indications are filtered out based on statistical features. Second, a word clustering algorithm based on word vectors is proposed: the remaining words are grouped into clusters of semantically similar words according to the distance between their word vectors. Next, a wrapper feature selection algorithm based on sequential backward subset search is proposed. It selects the cluster subset with the best geolocation effect, and the words in the selected clusters are extracted as local words. Finally, a Naive Bayes classifier is trained on the local words to geolocate the microblog user. The proposed method is validated on two different types of microblog data, Twitter and Weibo. The results show that it outperforms two typical existing methods based on statistical features in terms of accuracy, precision, recall, and F1-score.
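
The sequential backward subset search mentioned in this abstract can be sketched generically as follows. This is a minimal illustration, not the authors' implementation: the function name `sequential_backward_search`, the cluster labels, and the `toy_score` function are all hypothetical. In the setting the abstract describes, the score would presumably be the geolocation accuracy of a classifier trained on the words of the candidate cluster subset; here a fixed lookup table stands in for that evaluation.

```python
from typing import Callable, FrozenSet, Iterable, Tuple


def sequential_backward_search(
    items: Iterable[str],
    score: Callable[[FrozenSet[str]], float],
) -> Tuple[FrozenSet[str], float]:
    """Greedy wrapper selection: start from all items, repeatedly drop the
    item whose removal gives the highest score, and return the best subset
    seen along the way."""
    current = frozenset(items)
    best_subset, best_score = current, score(current)
    while len(current) > 1:
        # Evaluate every subset obtained by removing exactly one item.
        # sorted() fixes the evaluation order so ties resolve deterministically.
        trial_score, trial_subset = max(
            ((score(current - {item}), current - {item}) for item in sorted(current)),
            key=lambda pair: pair[0],
        )
        current = trial_subset
        if trial_score > best_score:
            best_subset, best_score = trial_subset, trial_score
    return best_subset, best_score


# Hypothetical per-cluster contributions to accuracy (assumed values, for
# illustration only): two informative clusters and two noisy ones.
TOY_GAIN = {"city_names": 0.30, "dialect": 0.20, "weather": -0.05, "greetings": -0.10}


def toy_score(subset: FrozenSet[str]) -> float:
    """Stand-in for 'train a classifier on this subset and measure accuracy'."""
    return 0.40 + sum(TOY_GAIN[c] for c in sorted(subset))


best_clusters, best_value = sequential_backward_search(TOY_GAIN, toy_score)
print(sorted(best_clusters))  # → ['city_names', 'dialect']
```

The search is greedy, so it evaluates on the order of n² subsets rather than all 2ⁿ, at the cost of possibly missing the globally best subset.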

Applications of R package for statistical engineering (통계공학을 위한 R 패키지 응용)

  • Jang, Dae-Heung
    • The Korean Journal of Applied Statistics / v.33 no.1 / pp.87-105 / 2020
  • Statistical engineering comprises the design of experiments, quality control/management, and reliability engineering. R is a free software environment for statistical computing and graphics that is supported by the R Foundation for Statistical Computing. R packages provide many functions and libraries for statistical engineering, which makes them a useful tool for the field. This paper shows applications of R packages to statistical engineering and suggests an R Task View for statistical engineering.

R programming: Language and Environment for Statistical Computing and Data Visualization (R 프로그래밍: 통계 계산과 데이터 시각화를 위한 환경)

  • Lee, D.H.;Ren, Ye
    • Electronics and Telecommunications Trends / v.28 no.1 / pp.42-51 / 2013
  • The R language is an open source programming language and software environment for statistical computing and data visualization. It is widely used by statisticians and data scientists for developing statistical software and analyzing data. R provides a variety of statistical and graphical techniques, including basic descriptive statistics, linear and nonlinear modeling, conventional and advanced statistical tests, time series analysis, clustering, and simulation. In this paper, we first introduce the R language and investigate its features as a data analytics tool. We then explore the possibility of applying R in the field of data analytics.


An Efficient Computing Method of the Orthogonal Projection Matrix for the Balanced Factorial Design

  • Kim, Byung-Chun;Park, Jong-Tae
    • Journal of the Korean Statistical Society / v.22 no.2 / pp.249-258 / 1993
  • It is well known that the design matrix $X$ for any factorial design can be represented as a product $X = TX_o$, where $T$ is the replication matrix and $X_o$ is the corresponding balanced design matrix. Since $X_o$ consists of a regular arrangement of 0's and 1's, the spectral decomposition of $X_o'X_o$ is easy to find. Using this decomposition, we propose an efficient algorithm for computing the orthogonal projection matrix for a balanced factorial design.
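
As a rough illustration of the objects named in this abstract, the sketch below builds $X = TX_o$ for a small two-factor layout and forms the orthogonal projection onto the column space of $X$ from an eigendecomposition. It uses the textbook definition of the projection with a generic numerical eigensolver, not the paper's efficient algorithm, which the abstract does not spell out. The helper names, the 2 x 3 main-effects layout, and the choice of two replicates per cell are assumptions made for the example.

```python
import numpy as np


def balanced_design_matrix(a: int, b: int) -> np.ndarray:
    """One row per cell (i, j) of an a x b layout:
    [intercept | indicators of factor A | indicators of factor B]."""
    rows = []
    for i in range(a):
        for j in range(b):
            rows.append(np.concatenate(([1.0], np.eye(a)[i], np.eye(b)[j])))
    return np.array(rows)


def projection_from_spectral(X: np.ndarray, tol: float = 1e-10) -> np.ndarray:
    """Orthogonal projection onto col(X), built from the spectral
    decomposition of the symmetric matrix X'X."""
    eigvals, eigvecs = np.linalg.eigh(X.T @ X)
    keep = eigvals > tol  # drop the (numerically) zero eigenvalues
    # Moore-Penrose inverse of X'X assembled from its nonzero eigenpairs.
    pinv = (eigvecs[:, keep] / eigvals[keep]) @ eigvecs[:, keep].T
    return X @ pinv @ X.T


X_o = balanced_design_matrix(2, 3)        # 6 cells, 1 + 2 + 3 = 6 columns
T = np.kron(np.eye(6), np.ones((2, 1)))   # assumed: 2 replicates of every cell
X = T @ X_o                               # 12 x 6 design matrix
P = projection_from_spectral(X)

print(np.allclose(P @ P, P), np.allclose(P, P.T))  # idempotent and symmetric
print(round(float(np.trace(P))))                   # rank of X: 1 + (2-1) + (3-1)
```

With equal replication, $T'T$ is a multiple of the identity, so $X'X$ is a scalar multiple of $X_o'X_o$ and the two matrices share eigenvectors; that is one reason the spectral decomposition of $X_o'X_o$ is the natural object to work with.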


Computing Fractional Bayes Factor Using the Generalized Savage-Dickey Density Ratio

  • Younshik Chung;Lee, Sangjeen
    • Journal of the Korean Statistical Society / v.27 no.4 / pp.385-396 / 1998
  • A method for computing the fractional Bayes factor (FBF) for a point null hypothesis is explained. We propose an alternative form of the FBF that is the product of a density ratio and a quantity obtained using the generalized Savage-Dickey density ratio method. When the alternative form of the FBF is difficult to compute analytically, each term of the proposed form can be estimated by MCMC methods. Finally, two examples are given.
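
For orientation, the standard (non-fractional) Savage-Dickey density ratio and its generalized form for an ordinary Bayes factor can be written as below. The notation is assumed here and does not come from the abstract: $\theta$ is the parameter tested at the point $\theta_0$, $\psi$ is a nuisance parameter, $\pi_0$ and $\pi_1$ are the priors under the null and alternative hypotheses, and $m_0$, $m_1$ are the marginal likelihoods. The paper's fractional version is not reproduced.

```latex
% Savage-Dickey density ratio, valid when \pi_1(\psi \mid \theta_0) = \pi_0(\psi):
\[
  B_{01} = \frac{m_0(x)}{m_1(x)}
         = \frac{\pi_1(\theta_0 \mid x)}{\pi_1(\theta_0)}
\]

% Generalized form, without the condition on the nuisance prior;
% the expectation is taken over \psi \sim \pi_1(\psi \mid \theta_0, x):
\[
  B_{01} = \frac{\pi_1(\theta_0 \mid x)}{\pi_1(\theta_0)}
           \, E\!\left[ \frac{\pi_0(\psi)}{\pi_1(\psi \mid \theta_0)} \right]
\]
```

Both factors in the generalized form are posterior quantities under the alternative model, which is why each can be estimated from MCMC output when no closed form is available.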
