• Title/Abstract/Keywords: statistical computing

417 search results (processing time: 0.02 seconds)

A Co-Evolutionary Computing for Statistical Learning Theory

  • Jun Sung-Hae
    • International Journal of Fuzzy Logic and Intelligent Systems
    • /
    • Vol. 5, No. 4
    • /
    • pp.281-285
    • /
    • 2005
  • Learning and evolution are two foundations of data mining. Classical learning theory builds models by minimizing training error under an objective function, whereas evolutionary computing offers an efficient way to construct an optimal model without minimizing training error directly. Because evolutionary computing searches the solution space globally, it can resolve the local optima problems of learning models. In this research, we combine a co-evolving algorithm with statistical learning theory and propose a co-evolutionary computing approach that overcomes the local optima problems of statistical learning theory. We apply the proposed model to classification and prediction problems. Experiments on data sets from the UCI Machine Learning Repository and KDD Cup 2000 verify the improved performance of our model.

An Efficient Method for Computing MINQUE Estimators in the Mixed Models

  • Lee, Jang-Taek;Kim, Byung-Chun
    • Journal of the Korean Statistical Society
    • /
    • Vol. 18, No. 1
    • /
    • pp.4-12
    • /
    • 1989
  • An efficient method is developed for computing minimum norm quadratic unbiased estimates (MINQUE) of variance components in the mixed model. The algorithm, which uses the W-matrix, saves both storage and computing time.

Ubiquitous Computing and Statistics; What's the Connection?

  • Jun, Sung-Hae;Jorn, Hongsuk
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 11, No. 2
    • /
    • pp.287-295
    • /
    • 2004
  • Mark Weiser introduced ubiquitous computing in his 1991 article 'The Computer for the 21st Century'. It has become a new paradigm following the internet. The rapid development of mobile computers, wireless networks, and intelligent systems now supports ubiquitous computing environments. Researchers in information science and related areas have studied ubiquitous computing, but statistics in Korea has not yet taken up the topic. In this paper, we therefore propose a connection between statistics and ubiquitous computing. As an example, we show an efficient cache hoarding method for ubiquitous computing that uses statistical methods, and we verify the proposal experimentally.

A Brief Introduction to Soft Computing

  • Hong Dug Hun;Hwang Changha
    • Korean Statistical Society: Conference Proceedings
    • /
    • Proceedings of the Korean Statistical Society Conference, 2004
    • /
    • pp.65-66
    • /
    • 2004
  • The aim of this article is to illustrate what soft computing is and how important it is.

R: AN OVERVIEW AND SOME CURRENT DIRECTIONS

  • Tierney, Luke
    • Journal of the Korean Statistical Society
    • /
    • Vol. 36, No. 1
    • /
    • pp.31-55
    • /
    • 2007
  • R is an open source language for statistical computing and graphics based on the ACM software award-winning S language. R is widely used for data analysis and has become a major vehicle for making available new statistical methodology. This paper presents an overview of the design philosophy and the development model for R, reviews the basic capabilities of the system, and outlines some current projects that will influence future developments of R.

Microblog User Geolocation by Extracting Local Words Based on Word Clustering and Wrapper Feature Selection

  • Tian, Hechan;Liu, Fenlin;Luo, Xiangyang;Zhang, Fan;Qiao, Yaqiong
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol. 14, No. 10
    • /
    • pp.3972-3988
    • /
    • 2020
  • Existing methods rely on statistical features to extract local words for microblog user geolocation. The extracted words include many non-local words, which lowers geolocation accuracy. Considering both the statistical and the semantic features of local words, this paper proposes a microblog user geolocation method that extracts local words using word clustering and wrapper feature selection. First, ordinary words without positional indications are filtered out based on statistical features. Second, a word clustering algorithm based on word vectors is proposed: the remaining words are grouped by the distance between their word vectors, so that semantically similar words fall into the same cluster. Next, a wrapper feature selection algorithm based on sequential backward subset search is proposed, which selects the cluster subset with the best geolocation performance. The words in the selected cluster subset are extracted as local words. Finally, a Naive Bayes classifier is trained on the local words to geolocate microblog users. The proposed method is validated on two different types of microblog data, Twitter and Weibo. The results show that it outperforms two typical existing methods based on statistical features in terms of accuracy, precision, recall, and F1-score.
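
The sequential backward subset search named in the abstract above can be sketched in a few lines. This is a minimal, generic illustration, not the authors' implementation: the function name `sequential_backward_selection`, the toy scoring function, and the integer cluster ids are all assumptions made here. In the paper's setting the score would presumably be the geolocation performance of a classifier trained on the words in the chosen clusters.

```python
from typing import Callable, FrozenSet, Iterable, Tuple


def sequential_backward_selection(
    items: Iterable[int],
    score: Callable[[FrozenSet[int]], float],
) -> Tuple[FrozenSet[int], float]:
    """Greedy backward search over subsets of `items`.

    Starts from the full set, repeatedly drops the single item whose
    removal yields the highest score, and returns the best-scoring
    subset seen along the way together with its score.
    """
    current = frozenset(items)
    best_subset, best_score = current, score(current)
    while len(current) > 1:
        # Score every subset obtained by removing exactly one item.
        scored = [(score(current - {item}), current - {item}) for item in current]
        top_score, top_subset = max(scored, key=lambda pair: pair[0])
        current = top_subset
        if top_score > best_score:
            best_subset, best_score = top_subset, top_score
    return best_subset, best_score


# Toy stand-in for "geolocation performance" (hypothetical): clusters 0 and 2
# help, every other cluster hurts.
HELPFUL = frozenset({0, 2})


def toy_score(subset: FrozenSet[int]) -> float:
    return len(subset & HELPFUL) - len(subset - HELPFUL)


selected, selected_score = sequential_backward_selection(range(4), toy_score)
```

Because it is a wrapper method, each candidate subset is scored by the downstream model itself, which costs more than filter-style statistics but measures the quantity of interest directly.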

Applications of R packages for statistical engineering

  • 장대흥
    • The Korean Journal of Applied Statistics
    • /
    • Vol. 33, No. 1
    • /
    • pp.87-105
    • /
    • 2020
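  • Statistical engineering consists of design of experiments, quality control/quality management, and reliability engineering. R is a free, open statistical package with a vast collection of packages for statistical modeling, statistical computing, and statistical graphics. These R packages can serve as a useful basic statistical toolkit for statistical engineering. This paper surveys applications of R packages for statistical engineering and proposes that CRAN Task Views related to statistical engineering are needed.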

R Programming: Language and Environment for Statistical Computing and Data Visualization

  • 이두호
    • Electronics and Telecommunications Trends
    • /
    • Vol. 28, No. 1
    • /
    • pp.42-51
    • /
    • 2013
  • R is an open source programming language and software environment for statistical computing and data visualization. It is widely used by statisticians and data scientists to develop statistical software and to analyze data. R provides a variety of statistical and graphical techniques, including basic descriptive statistics, linear and nonlinear modeling, conventional and advanced statistical tests, time series analysis, clustering, and simulation. In this paper, we first introduce the R language and investigate its features as a data analytics tool. We then explore the applicability of R in the field of data analytics.

An Efficient Computing Method of the Orthogonal Projection Matrix for the Balanced Factorial Design

  • Kim, Byung-Chun;Park, Jong-Tae
    • Journal of the Korean Statistical Society
    • /
    • Vol. 22, No. 2
    • /
    • pp.249-258
    • /
    • 1993
  • It is well known that the design matrix $X$ for any factorial design can be represented as a product $X = TX_o$, where $T$ is the replication matrix and $X_o$ is the corresponding balanced design matrix. Since $X_o$ consists of a regular arrangement of 0's and 1's, the spectral decomposition of $X_o'X_o$ is easy to find. Using this decomposition, we propose an efficient algorithm for computing the orthogonal projection matrix for a balanced factorial design.
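
To make the factorization $X = TX_o$ concrete, the sketch below builds a small example with NumPy and computes the orthogonal projection onto the column space of $X_o$ in two ways: with the pseudoinverse, and from the spectral decomposition of $X_o'X_o$. It is a generic numerical illustration under assumptions made here (a 2 x 2 main-effects model, hypothetical replication counts `reps`, a helper named `projection`), not the efficient algorithm the paper proposes.

```python
import numpy as np

# Balanced 2 x 2 factorial, main-effects model: one row per cell (a, b).
# Columns are 0/1 indicators: intercept, A1, A2, B1, B2.
X_o = np.array([
    [1, 1, 0, 1, 0],
    [1, 1, 0, 0, 1],
    [1, 0, 1, 1, 0],
    [1, 0, 1, 0, 1],
], dtype=float)

# Replication matrix T: cell i is repeated reps[i] times (hypothetical counts).
reps = [2, 1, 3, 2]
T = np.zeros((sum(reps), len(reps)))
row = 0
for cell, r in enumerate(reps):
    T[row:row + r, cell] = 1.0
    row += r

X = T @ X_o  # design matrix of the replicated design


def projection(M: np.ndarray) -> np.ndarray:
    """Orthogonal projection onto the column space of M, via the pseudoinverse."""
    return M @ np.linalg.pinv(M)


P_o = projection(X_o)  # balanced design
P = projection(X)      # replicated design

# Same projection for the balanced design, from the spectral decomposition
# of X_o' X_o: keep eigenvectors with nonzero eigenvalues only.
eigvals, eigvecs = np.linalg.eigh(X_o.T @ X_o)
keep = eigvals > 1e-10
V = eigvecs[:, keep]
P_spec = X_o @ V @ np.diag(1.0 / eigvals[keep]) @ V.T @ X_o.T
```

Both routes give a symmetric, idempotent matrix whose trace equals the rank of the design matrix, which is a quick way to sanity-check any projection computed this way.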

Computing Fractional Bayes Factor Using the Generalized Savage-Dickey Density Ratio

  • Younshik Chung;Lee, Sangjeen
    • Journal of the Korean Statistical Society
    • /
    • Vol. 27, No. 4
    • /
    • pp.385-396
    • /
    • 1998
  • A method for computing the fractional Bayes factor (FBF) for a point null hypothesis is explained. We propose an alternative form of the FBF, the product of a density ratio and a quantity obtained with the generalized Savage-Dickey density ratio method. When the alternative form is difficult to compute analytically, each of its terms can be estimated by MCMC. Finally, two examples are given.
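
For orientation, the two standard quantities the abstract above builds on can be written as follows. These are the usual textbook definitions, stated under the usual assumptions, and not the alternative form derived in the paper. The symbols ($b$ for the training fraction, $\pi_i$ and $f_i$ for the prior and likelihood under model $i$, $\psi$ for nuisance parameters) are notation chosen here.

```latex
% Fractional Bayes factor with training fraction b, 0 < b < 1:
B^{F}_{01}(b) = \frac{q_0(b, x)}{q_1(b, x)}, \qquad
q_i(b, x) = \frac{\int \pi_i(\theta_i)\, f_i(x \mid \theta_i)\, d\theta_i}
                 {\int \pi_i(\theta_i)\, f_i(x \mid \theta_i)^{b}\, d\theta_i}

% Savage-Dickey density ratio for H_0: \theta = \theta_0 against
% H_1: \theta \neq \theta_0 with nuisance parameter \psi,
% assuming \pi_1(\psi \mid \theta_0) = \pi_0(\psi):
B_{01} = \frac{\pi_1(\theta_0 \mid x)}{\pi_1(\theta_0)}
```

The Savage-Dickey ratio needs only the posterior and prior densities of $\theta$ at the null value, which is why its terms lend themselves to estimation from MCMC output.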
