• Title/Summary/Keyword: statistical data processing


On Robust Principal Component Analysis using Neural Networks (신경망을 이용한 로버스트 주성분 분석에 관한 연구)

  • Kim, Sang-Min;Oh, Kwang-Sik;Park, Hee-Joo
    • Journal of the Korean Data and Information Science Society / v.7 no.1 / pp.113-118 / 1996
  • Principal component analysis (PCA) is an essential technique for data compression and feature extraction, and has been widely used in statistical data analysis, communication theory, pattern recognition, and image processing. Oja (1992) found that a linear neuron with a constrained Hebbian learning rule can extract the principal component by using the stochastic gradient ascent method. In practice, real data often contain outliers, which significantly deteriorate the performance of PCA algorithms. In order to make PCA robust, Xu & Yuille (1995) applied statistical physics to the problem of robust principal component analysis (RPCA). Devlin et al. (1981) obtained principal components by using techniques such as M-estimation. The purpose of this paper is to investigate, from the statistical point of view, how Xu & Yuille's (1995) RPCA works under the same simulation conditions as in Devlin et al. (1981).

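As a companion to the abstract above, the sketch below shows the constrained Hebbian update of Oja (1992) for extracting the first principal component by stochastic gradient ascent. It is only a NumPy illustration with our own toy data and learning-rate choices; it does not reproduce the robust weighting of Xu & Yuille (1995).

```python
import numpy as np

def oja_first_pc(X, lr=0.01, epochs=50, seed=0):
    """Estimate the first principal component of centered data X (n x d)
    using Oja's constrained Hebbian rule (stochastic gradient ascent)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = rng.normal(size=d)
    w /= np.linalg.norm(w)
    for _ in range(epochs):
        for x in X[rng.permutation(n)]:
            y = w @ x                    # linear neuron output
            w += lr * y * (x - y * w)    # Hebbian term with built-in normalization
    return w / np.linalg.norm(w)

# Toy check: the learned direction should align with the leading eigenvector
# of the sample covariance matrix.
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 5)) @ np.diag([3.0, 2.0, 1.0, 1.0, 1.0])
X -= X.mean(axis=0)
w = oja_first_pc(X)
v = np.linalg.eigh(np.cov(X.T))[1][:, -1]
print("alignment:", abs(w @ v))          # close to 1.0
```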

Development and Application of Statistical Programs Based on Data and Artificial Intelligence Prediction Model to Improve Statistical Literacy of Elementary School Students (초등학생의 통계적 소양 신장을 위한 데이터와 인공지능 예측모델 기반의 통계프로그램 개발 및 적용)

  • Kim, Yunha;Chang, Hyewon
    • Communications of Mathematical Education / v.37 no.4 / pp.717-736 / 2023
  • The purpose of this study is to develop a statistical program using data and artificial intelligence prediction models and to apply it to one sixth-grade elementary school class in order to see whether it is effective in improving students' statistical literacy. Based on an analysis of problems in today's elementary school statistical education, a 15-session program was developed to encourage elementary students to experience the entire process of statistical problem solving and to make correct predictions by incorporating data, the core element of the Fourth Industrial Revolution era, into AI education. The biggest features of this program are the recognition of the importance of data, a key element of artificial intelligence education, and collection and analysis activities that take context into account using real-life data provided by public data platforms. In addition, since it consists of activities that predict the future based on data by using engineering tools such as Entry and Easy Statistics and by creating an artificial intelligence prediction model, the program focuses on developing communication skills, information processing capability, and critical thinking. As a result of applying this program, not only did it positively affect the statistical literacy of elementary school students, but we also observed students' interest, critical inquiry, and mathematical communication throughout the entire process of statistical problem solving.

Deinterlacing Algorithm Based on Statistical Tests

  • Kim, Yeong-Hwa;Nam, Ji-Ho
    • Journal of the Korean Data and Information Science Society / v.19 no.3 / pp.723-734 / 2008
  • The main reason for deinterlacing is frame-rate conversion. The other reason is, of course, to improve clarity and reduce flicker; using a deinterlacer can help the clarity and stability of the image. Many deinterlacing algorithms, such as ELA and E-ELA, are available in the image processing literature. This paper proposes a new statistical deinterlacing algorithm based on statistical tests such as the Bartlett test, the Levene test, and the Kruskal-Wallis test. The results obtained from the proposed algorithm are comparable to those from many well-known deinterlacers, and the proposed deinterlacers are found to be more efficient than the others.

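The abstract does not spell out how the Bartlett, Levene, and Kruskal-Wallis tests enter the deinterlacing decision, so the sketch below is only a hypothetical reading: the tests flag whether the pixel windows above and below a missing scan line look homogeneous, and plain line averaging or an ELA-style directional interpolation is chosen accordingly. The function names and the decision rule are ours, not the paper's.

```python
import numpy as np
from scipy import stats

def region_is_homogeneous(above, below, alpha=0.05):
    """Test whether the pixel windows above and below a missing scan line look
    like the same population, using the three tests named in the abstract."""
    above = np.asarray(above, float)
    below = np.asarray(below, float)
    if above.std() == 0 or below.std() == 0:          # flat regions: compare means only
        return bool(np.isclose(above.mean(), below.mean()))
    p_bartlett = stats.bartlett(above, below).pvalue  # equal variances (normal data)
    p_levene = stats.levene(above, below).pvalue      # equal variances (robust)
    p_kruskal = stats.kruskal(above, below).pvalue    # same distribution (nonparametric)
    return min(p_bartlett, p_levene, p_kruskal) > alpha

def deinterlace_line(above, below, alpha=0.05):
    """Fill one missing line: plain averaging when the neighborhood is statistically
    homogeneous, otherwise a simple ELA-style directional interpolation."""
    above = np.asarray(above, float)
    below = np.asarray(below, float)
    out = (above + below) / 2.0
    if region_is_homogeneous(above, below, alpha):
        return out
    for j in range(1, len(out) - 1):
        # choose the direction (45, 90, or 135 degrees) with the best-matching neighbors
        candidates = {
            abs(above[j - 1] - below[j + 1]): (above[j - 1] + below[j + 1]) / 2.0,
            abs(above[j] - below[j]): (above[j] + below[j]) / 2.0,
            abs(above[j + 1] - below[j - 1]): (above[j + 1] + below[j - 1]) / 2.0,
        }
        out[j] = candidates[min(candidates)]
    return out
```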

PCA vs. ICA for Face Recognition

  • Lee, Oyoung;Park, Hyeyoung;Park, Seung-Jin
    • Proceedings of the IEEK Conference / 2000.07b / pp.873-876 / 2000
  • The information-theoretic approach to face recognition is based on compact coding, where face images are decomposed into a small set of basis images. The most popular method for compact coding may be principal component analysis (PCA), on which eigenface methods are based. PCA-based methods exploit only the second-order statistical structure of the data, so higher-order statistical dependencies among pixels are not considered. Independent component analysis (ICA) is a signal processing technique whose goal is to express a set of random variables as linear combinations of statistically independent component variables. ICA exploits the high-order statistical structure of the data, which contains important information. In this paper we employ ICA for efficient feature extraction from face images and show that ICA outperforms PCA in the task of face recognition. Experimental results using a simple nearest-neighbor classifier and a multilayer perceptron (MLP) are presented to illustrate the performance of the proposed method.

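A minimal scikit-learn sketch of the comparison described above: face images are projected onto a small set of basis images with PCA and with FastICA, and a nearest-neighbor classifier is trained on the resulting codes. The Olivetti dataset, 50 components, and the other settings are illustrative assumptions, not the paper's experimental setup.

```python
from sklearn.datasets import fetch_olivetti_faces
from sklearn.decomposition import PCA, FastICA
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Illustrative data: Olivetti faces (flattened 64x64 images), not the paper's dataset.
faces = fetch_olivetti_faces()
Xtr, Xte, ytr, yte = train_test_split(
    faces.data, faces.target, test_size=0.25, stratify=faces.target, random_state=0)

coders = [("PCA", PCA(n_components=50, whiten=True, random_state=0)),
          ("ICA", FastICA(n_components=50, max_iter=1000, random_state=0))]

for name, coder in coders:
    Ztr = coder.fit_transform(Xtr)          # project faces onto 50 basis images
    Zte = coder.transform(Xte)
    clf = KNeighborsClassifier(n_neighbors=1).fit(Ztr, ytr)  # nearest-neighbor classifier
    print(name, "accuracy:", clf.score(Zte, yte))
```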

Implementation of a Real-time Data fusion Algorithm for Flight Test Computer (비행시험통제컴퓨터용 실시간 데이터 융합 알고리듬의 구현)

  • Lee, Yong-Jae;Won, Jong-Hoon;Lee, Ja-Sung
    • Journal of the Korea Institute of Military Science and Technology / v.8 no.4 s.23 / pp.24-31 / 2005
  • This paper presents an implementation of a real-time multi-sensor data fusion algorithm for a flight test computer. The sensor data consist of positional information on the target from a radar, a GPS receiver, and an INS. The data fusion algorithm is designed as a 21st-order distributed Kalman filter based on the PVA (position-velocity-acceleration) model with sensor bias states. Fault detection and correction logic is included in the algorithm to handle bad measurements and sensor faults. The statistical parameters for the states are obtained from Monte Carlo simulations and covariance analysis using test tracking data. The designed filter is verified using real data in both post-processing and real-time processing.
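The paper's filter is a 21-state distributed Kalman filter with sensor bias states; as a much smaller stand-in, the sketch below fuses two noisy position sensors with a 1-D position-velocity-acceleration (PVA) model and sequential measurement updates. All matrices and noise levels are assumed values chosen only for illustration.

```python
import numpy as np

dt = 0.1
F = np.array([[1.0, dt, 0.5 * dt**2],
              [0.0, 1.0, dt],
              [0.0, 0.0, 1.0]])          # constant-acceleration (PVA) transition
Q = 0.01 * np.eye(3)                     # assumed process noise
H = np.array([[1.0, 0.0, 0.0]])          # both sensors measure position only
R = {"radar": np.array([[4.0]]), "gps": np.array([[1.0]])}   # assumed sensor noise

def predict(x, P):
    return F @ x, F @ P @ F.T + Q

def update(x, P, z, sensor):
    y = np.array([[z]]) - H @ x                   # innovation
    S = H @ P @ H.T + R[sensor]
    K = P @ H.T @ np.linalg.inv(S)                # Kalman gain
    return x + K @ y, (np.eye(3) - K @ H) @ P

x, P = np.zeros((3, 1)), 10.0 * np.eye(3)
rng = np.random.default_rng(0)
for k in range(100):
    truth = 0.5 * 0.2 * (k * dt) ** 2             # target accelerating at 0.2 m/s^2
    x, P = predict(x, P)
    # sequential fusion: one prediction, then one update per available sensor
    x, P = update(x, P, truth + rng.normal(scale=2.0), "radar")
    x, P = update(x, P, truth + rng.normal(scale=1.0), "gps")
print("position, velocity, acceleration estimate:", x.ravel())
```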

Digital Image Processing of Side Scan Sonar for Underwater Man-made Structure (수중 인공구조물에 대한 사이드스캔소나 탐사자료의 영상처리)

  • Shin, Sung-Ryul;Lim, Min-Hyuk;Kim, Kwang-Eun
    • Journal of Advanced Marine Engineering and Technology / v.33 no.2 / pp.344-354 / 2009
  • Side scan sonar, which uses acoustic waves, plays a very important role in underwater, sea floor, and shallow marine geologic surveys. In this study, we acquired side scan sonar data for the underwater man-made structures, artificial reefs, and fishing grounds installed and distributed in the survey area. We applied digital image processing techniques to the side scan sonar data in order to improve and enhance image quality. We carried out digital image processing with various kinds of filtering in the spatial and frequency domains, and tested filtering parameters such as kernel size, differential operator, and statistical value. We could easily assess the condition, distribution, and environment of the artificial structures through interpretation of the side scan sonar images.
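The kinds of operations listed above (spatial- and frequency-domain filtering, kernel size, differential operators, statistical filters) can be sketched with SciPy as follows. The function name and parameter values are illustrative assumptions, not the settings used in the paper.

```python
import numpy as np
from scipy import ndimage

def enhance_sonar_image(img, kernel_size=5, cutoff=0.15):
    """Illustrative spatial- and frequency-domain filtering of a 2-D side scan
    sonar image; kernel_size and cutoff are example parameters only."""
    img = np.asarray(img, float)

    # Spatial domain: statistical (median) filter to suppress speckle,
    # then a differential (Sobel) operator to emphasize object boundaries.
    smoothed = ndimage.median_filter(img, size=kernel_size)
    edges = np.hypot(ndimage.sobel(smoothed, axis=0), ndimage.sobel(smoothed, axis=1))

    # Frequency domain: simple ideal low-pass filter via the 2-D FFT.
    spectrum = np.fft.fftshift(np.fft.fft2(img))
    rows, cols = img.shape
    y, x = np.ogrid[-rows // 2:rows - rows // 2, -cols // 2:cols - cols // 2]
    mask = (x**2 / (cutoff * cols) ** 2 + y**2 / (cutoff * rows) ** 2) <= 1.0
    lowpassed = np.fft.ifft2(np.fft.ifftshift(spectrum * mask)).real

    return smoothed, edges, lowpassed

# Example call on a random stand-in image:
# smoothed, edges, lowpassed = enhance_sonar_image(np.random.rand(256, 512))
```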

Predicting the Unemployment Rate Using Social Media Analysis

  • Ryu, Pum-Mo
    • Journal of Information Processing Systems / v.14 no.4 / pp.904-915 / 2018
  • We demonstrate how social media content can be used to predict the unemployment rate, a real-world indicator. We present a novel method for predicting the unemployment rate using social media analysis based on natural language processing and statistical modeling. The system collects social media content, including news articles, blogs, and tweets written in Korean, and then extracts data for modeling using part-of-speech tagging and sentiment analysis techniques. The autoregressive integrated moving average with exogenous variables (ARIMAX) and autoregressive with exogenous variables (ARX) models for unemployment rate prediction are fit using the analyzed data. The proposed method quantifies the social moods expressed in social media content, whereas existing methods simply present social tendencies. Our model achieved a 27.9% error reduction compared to a Google Index-based model in terms of the mean absolute percentage error.
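A small statsmodels sketch of the ARIMAX idea described above, with a synthetic sentiment series standing in for the social media signal; the (1, 1, 1) model order and the data are assumptions made only for illustration.

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.statespace.sarimax import SARIMAX

# Synthetic monthly series standing in for the paper's data: an unemployment rate
# and a social-media sentiment score extracted from news/blog/tweet text.
rng = np.random.default_rng(0)
n = 60
sentiment = pd.Series(rng.normal(size=n)).rolling(3, min_periods=1).mean()
unemployment = 3.5 + 0.3 * sentiment + np.cumsum(rng.normal(scale=0.05, size=n))

fit = SARIMAX(unemployment, exog=sentiment, order=(1, 1, 1)).fit(disp=False)

forecast = fit.forecast(steps=1, exog=[[0.2]])    # hypothetical next-month sentiment
insample = fit.fittedvalues
mape = np.mean(np.abs((unemployment[1:] - insample[1:]) / unemployment[1:])) * 100
print("next-month forecast:", float(forecast.iloc[0]))
print("in-sample MAPE: %.1f%%" % mape)             # the error metric used in the paper
```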

Information Technology Infrastructure for Agriculture Genotyping Studies

  • Pardamean, Bens;Baurley, James W.;Perbangsa, Anzaludin S.;Utami, Dwinita;Rijzaani, Habib;Satyawan, Dani
    • Journal of Information Processing Systems / v.14 no.3 / pp.655-665 / 2018
  • In efforts to increase its agricultural productivity, the Indonesian Center for Agricultural Biotechnology and Genetic Resources Research and Development has conducted a variety of genomic studies using high-throughput DNA genotyping and sequencing. The large quantity of data (big data) produced by these biotechnologies requires a high-performance data management system to store, back up, and secure the data. Additionally, these genetic studies are computationally demanding, requiring high-performance processors and memory for data processing and analysis. Reliable network connectivity with large bandwidth for data transfer is essential, as are database applications and statistical tools for cleaning, quality control, querying by specific criteria, and exporting to various formats, all of which are important for generating high-yield crop varieties and improving future agricultural strategies. This manuscript presents a reliable, secure, and scalable information technology infrastructure tailored to Indonesian agriculture genotyping studies.

Elastic modulus in large concrete structures by a sequential hypothesis testing procedure applied to impulse method data

  • Antonaci, Paola;Bocca, Pietro G.;Sellone, Fabrizio
    • Structural Engineering and Mechanics / v.26 no.5 / pp.499-516 / 2007
  • An experimental method denoted the Impulse Method is proposed as a cost-effective non-destructive technique for the on-site evaluation of the concrete elastic modulus in existing structures: on the basis of Hertz's quasi-static theory of elastic impact, and with the aid of simple portable testing equipment, it makes it possible to collect series of local measurements of the elastic modulus easily and in a very short time. A hypothesis testing procedure is developed in order to provide a statistical tool for processing the data collected by means of the Impulse Method and assessing the possible occurrence of significant variations in the elastic modulus without exceeding prescribed error probabilities. It is based on a particular formulation of the well-known sequential probability ratio test and proves to be optimal with respect to the error probabilities and the required number of observations, thus further improving the time-effectiveness of the Impulse Method. The results of an experimental investigation on different types of plain concrete prove the validity of the Impulse Method in estimating the unknown value of the elastic modulus and attest to the effectiveness of the proposed hypothesis testing procedure in identifying significant variations in the elastic modulus.
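The procedure above is based on a sequential probability ratio test; the sketch below is a generic Wald SPRT for a shift in the mean of Gaussian measurements, which conveys the sequential accept/continue/reject logic but is not the paper's specific formulation for impulse-method data. The function, thresholds, and toy data are assumptions for illustration.

```python
import numpy as np

def sprt_mean_shift(samples, mu0, mu1, sigma, alpha=0.05, beta=0.05):
    """Wald's sequential probability ratio test for a mean shift from mu0 to mu1,
    assuming i.i.d. Gaussian measurements with known standard deviation sigma."""
    a = np.log(beta / (1 - alpha))        # lower threshold: accept H0
    b = np.log((1 - beta) / alpha)        # upper threshold: accept H1
    llr = 0.0
    for n, x in enumerate(samples, start=1):
        # log-likelihood ratio increment for one Gaussian observation
        llr += (mu1 - mu0) * (x - (mu0 + mu1) / 2.0) / sigma**2
        if llr <= a:
            return "no significant modulus variation (H0)", n
        if llr >= b:
            return "significant modulus variation (H1)", n
    return "undecided (need more impulse measurements)", len(samples)

# Illustrative use: local modulus estimates (GPa) drifting from a nominal 30 GPa.
rng = np.random.default_rng(0)
data = rng.normal(loc=32.0, scale=2.0, size=50)
print(sprt_mean_shift(data, mu0=30.0, mu1=32.0, sigma=2.0))
```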

Statistical Analysis on the Web Using PHP3 (PHP3를 이용한 웹상에서의 통계분석)

  • Hwang, Jin-Soo;Uhm, Dae-Ho
    • Journal of the Korean Data and Information Science Society / v.10 no.2 / pp.501-510 / 1999
  • We have seen the rapid development of the multimedia industry as computers evolve, and the internet has changed our way of life dramatically. There have been several attempts to teach elementary statistics on the web, but most of them are based on commercial products. The need for statistical data analysis and for decision making based on such analyses is growing. In this article we show one way of reaching that goal by using the server-side scripting language PHP3, together with an extra graphical module and a statistical distribution module, on the web. We demonstrate some elementary exploratory graphical data analysis and statistical inferences. There is plenty of room for improvement to make it a full-blown statistical analysis tool on the web in the near future. All the programs and databases used in our article are public programs. The main engine, PHP3, is included as an Apache web server module, so it is very light and fast. It will be much better in terms of processing speed when PHP4 (Zend) is officially released.
