• Title/Summary/Keyword: statistical data processing

A Study on Classifications of Remote Sensed Multispectral Image Data using Soft Computing Technique - Stressed on Rough Sets - (소프트 컴퓨팅기술을 이용한 원격탐사 다중 분광 이미지 데이터의 분류에 관한 연구 -Rough 집합을 중심으로-)

  • Won Sung-Hyun
    • Management & Information Systems Review / v.3 / pp.15-45 / 1999
  • Computer-based processing of remotely sensed image data is recognized as essential in many fields, including environmental observation, land development, resource surveys, military monitoring, and agricultural yield estimation. In particular, accurate classification and analysis of remotely sensed image data largely determine the reliability of remote sensing image processing systems, and much research has sought to improve this accuracy. Traditionally, such systems process two or three bands selected from the available bands, using statistical separability or wavelength properties as the selection criterion. However, as sensing environments shift from multispectral to hyperspectral, a need has arisen for band selection based on data distribution characteristics rather than on wavelength properties or statistical separability. This paper proposes a band feature extraction method based on rough set theory for efficient data classification in a multispectral band environment. First, a lookup table is built from training data and the properties of the experimental multispectral image data are analyzed; the efficient bands are then selected from the analysis results using the indiscernibility relation of rough set theory. The proposed method is applied to LANDSAT TM data acquired on 2 June 1992. The results show clustering trends similar to those of traditional band selection by wavelength properties, which verifies that the proposed data-driven method can be used to select efficient bands even as the sensing environment changes to hyperspectral bands.
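The indiscernibility-based band selection described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the lookup table `TABLE`, its discretized band values, the class labels, and all function names are hypothetical. The sketch groups training samples that are indiscernible on a band subset, measures how many samples are then classified unambiguously (the rough set dependency degree), and returns the smallest band subsets that classify as well as all bands together.

```python
from collections import defaultdict
from itertools import combinations


def partition(rows, bands):
    """Group row indices into indiscernibility classes over the given bands."""
    classes = defaultdict(list)
    for i, (values, _label) in enumerate(rows):
        key = tuple(values[b] for b in bands)
        classes[key].append(i)
    return list(classes.values())


def dependency(rows, bands):
    """Fraction of rows whose indiscernibility class carries a single label."""
    consistent = 0
    for block in partition(rows, bands):
        labels = {rows[i][1] for i in block}
        if len(labels) == 1:
            consistent += len(block)
    return consistent / len(rows)


def minimal_band_subsets(rows, n_bands):
    """Smallest band subsets that keep the dependency degree of all bands."""
    full = dependency(rows, range(n_bands))
    for size in range(1, n_bands + 1):
        hits = [
            subset
            for subset in combinations(range(n_bands), size)
            if dependency(rows, subset) == full
        ]
        if hits:
            return hits
    return []


# Hypothetical lookup table: (discretized values of bands 0..2, class label).
TABLE = [
    ((0, 1, 0), "water"),
    ((0, 1, 1), "water"),
    ((1, 0, 0), "forest"),
    ((1, 0, 1), "forest"),
    ((1, 1, 0), "urban"),
    ((1, 1, 1), "urban"),
]
```

Calling `minimal_band_subsets(TABLE, 3)` on this toy table shows that bands 0 and 1 together separate the classes, while band 2 adds no discriminating information. The exhaustive subset search is only practical for a handful of bands; a hyperspectral setting would need a heuristic search.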

Image Data Compression Using Laplacian Pyramid Processing and Vector Quantization (라플라시안 피라미드 프로세싱과 백터 양자화 방법을 이용한 영상 데이타 압축)

  • Park, G.H.;Cha, I.H.;Youn, D.H.
    • Proceedings of the KIEE Conference / 1987.07b / pp.1347-1351 / 1987
  • This thesis studies Laplacian pyramid vector quantization, which keeps the compression algorithm simple and remains stable across various kinds of image data. To this end, images are divided into two groups according to their statistical characteristics. At 0.860 bits/pixel and 0.360 bits/pixel, respectively, Laplacian pyramid vector quantization is compared with existing spatial-domain vector quantization and transform coding under the same conditions, using both objective and subjective measures. Laplacian pyramid vector quantization proves much more stable against the statistical characteristics of images than the existing vector quantization and transform coding.
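As a rough illustration of the two building blocks named in this abstract, the Python sketch below builds a simplified one-dimensional Laplacian-style pyramid and performs a nearest-codeword vector quantization lookup. It makes several assumptions that are not in the source: pairwise averaging stands in for the Gaussian filtering normally used, the signal length is divisible by 2 at every level, and the codebook is supplied by hand rather than trained. All names are hypothetical.

```python
def reduce_level(signal):
    """Halve the resolution by averaging adjacent pairs (even length assumed)."""
    return [(signal[i] + signal[i + 1]) / 2.0 for i in range(0, len(signal), 2)]


def expand_level(signal):
    """Double the resolution by repeating every sample."""
    return [value for value in signal for _ in range(2)]


def build_pyramid(signal, levels):
    """Return (residuals, top): per-level detail signals and the coarsest level."""
    residuals = []
    current = [float(v) for v in signal]
    for _ in range(levels):
        coarse = reduce_level(current)
        predicted = expand_level(coarse)
        residuals.append([a - b for a, b in zip(current, predicted)])
        current = coarse
    return residuals, current


def reconstruct(residuals, top):
    """Invert build_pyramid by expanding and adding back each residual."""
    current = list(top)
    for residual in reversed(residuals):
        current = [a + b for a, b in zip(expand_level(current), residual)]
    return current


def quantize(vector, codebook):
    """Index of the codeword closest to vector in squared Euclidean distance."""
    return min(
        range(len(codebook)),
        key=lambda k: sum((a - b) ** 2 for a, b in zip(vector, codebook[k])),
    )
```

In a coder of this kind, the residual levels would be split into small vectors and each vector replaced by the index returned by `quantize`; without quantization, `reconstruct` recovers the input exactly.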

Reengineering of the Data Collection Process for Discharge Abstract Database (퇴원환자 진료정보 DB의 데이터 수집 과정 재설계)

  • Hong, Joon Hyun;Choi, Kwisook;Lee, Eun Mee
    • Quality Improvement in Health Care / v.7 no.1 / pp.106-116 / 2000
  • Background: Severance Hospital is a university hospital with 1,580 beds. A LAN system was installed in the Medical Record Department in 1992, and discharge abstract data have been added to the discharge abstract database (DB). The previous workflow in the Medical Record Department had five levels: 1) chart collection from wards, 2) assembling, 3) abstracting data from medical records onto worksheets by 2 RRAs, 4) checking deficiencies and coding diagnoses and procedures by 4 RRAs, and 5) entering the data into the discharge abstract database by 1 RRA. Processing took an average of 19.3 days from the patient discharge date, which delayed production of the monthly statistical report and drew complaints from users in the hospital. Methods: A CQI team was organized to find a way to shorten the processing time to less than 10 days. The team identified the factors that lengthened processing and integrated the three levels from the 3rd level onward into one. Each of the 7 RRAs performed the integrated level at her own workstation instead of handling one of three separate levels. Processing times before and after the change were compared using 3,846 discharges in April 1999 and 4,189 discharges in August 1999. Results: The average processing time was shortened from 19.3 days to 8.7 days. In particular, the integrated level took only 3.6 days, compared with 12.3 days before the change. The percentage of cases fully processed within 10 days of discharge rose from 2.4% before the integration to 77.6%. The error rate in data input did not increase under the new method.
Conclusions: The integrated processing method has the following advantages: 1) faster production of the monthly statistical report, 2) increased use of discharge abstract data by the Billing Dept., Emergency Room, QI Dept., etc., 3) improved intradepartmental workflow, and 4) better medical record quality through earlier deficiency checks.

Implementation of GrADS and R Scripts for Processing Future Climate Data to Produce Agricultural Climate Information (농업 기후 정보 생산을 위한 미래 기후 자료 처리 GrADS 및 R 프로그램 구현)

  • Lee, Kyu Jong;Lee, Semi;Lee, Byun Woo;Kim, Kwang Soo
    • Atmosphere / v.23 no.2 / pp.237-243 / 2013
  • A set of scripts for GrADS (Grid Analysis and Display System) and R was implemented to produce agricultural climate information from future climate scenarios based on the Representative Concentration Pathways. The GrADS script was used to calculate agricultural climate indices, including growing degree days and cooling degree days. The script generated agricultural climate maps of these indices in a format compatible with common Geographic Information System (GIS) applications. To perform a statistical analysis on the agricultural climate maps, a script for R, an open source statistical software environment, was used. Because a large volume of spatial climate data was produced, parallel processing packages such as SNOW, doSNOW, and foreach were used to perform a simple statistical analysis in the R script. The parallel R script achieved a speedup on workstations with multi-core CPUs.
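The two indices named in this abstract can be sketched in a few lines. The example is written in Python purely for illustration and is not the authors' GrADS or R code; the base temperatures (10 °C for growing degree days, 18 °C for cooling degree days), the daily-mean formula, and the function names are assumptions, since the abstract does not state the definitions used.

```python
def growing_degree_days(tmax, tmin, base=10.0):
    """Sum of daily mean temperature excess over base (assumed base: 10 deg C)."""
    return sum(max(0.0, (hi + lo) / 2.0 - base) for hi, lo in zip(tmax, tmin))


def cooling_degree_days(tmean, base=18.0):
    """Sum of daily mean temperature excess over base (assumed base: 18 deg C)."""
    return sum(max(0.0, t - base) for t in tmean)


# Hypothetical three-day record for one grid cell, in deg C.
TMAX = [20.0, 25.0, 8.0]
TMIN = [10.0, 15.0, 2.0]
TMEAN = [(hi + lo) / 2.0 for hi, lo in zip(TMAX, TMIN)]
```

Applied per grid cell over a season, functions like these would yield one index value per cell, which is the kind of gridded output that can be mapped and passed on for statistical analysis.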

Education Improvement Plan Related to Data Analysis & Processing in the ICT Field for the Era of Hyperconnectivity & Superintelligence

  • LEE, Seung-Woo;LEE, Sangwon
    • International Journal of Advanced Culture Technology / v.9 no.4 / pp.102-109 / 2021
  • Since the 4th Industrial Revolution is built on superintelligence, convergence studies with other fields must provide new insights in order to find optimal solutions and create new ideas. In this paper, we intend to present improvement measures for probability and statistics education, a prerequisite subject for data analysis and processing in the ICT (Information & Communication Technologies) field in the era of superintelligence of the 4th Industrial Revolution. First, this paper aims to strengthen competitiveness through early development and commercialization of new technologies by presenting probability and statistics curricula that need to be linked within the ICT field. Second, it is necessary to present an educational system diagram linking probability and statistics in the ICT field in order to prepare a mid- to long-term response strategy for ICT education in the face of innovative change. Third, through a survey, we intend to present an effective educational operation plan linking probability and statistics to ICT major subjects by analyzing how majors in this field perceive probability and statistics, their importance, and their use.

Statistical System of the CIS Countries

  • Kim, Joo-Hwan
    • Journal of the Korean Data and Information Science Society / v.18 no.4 / pp.1023-1032 / 2007
  • We introduce the statistical systems of the Commonwealth of Independent States (CIS) countries located in Central Asia. At present, the national statistics production system of the Korean National Statistical Office (NSO) is at a very high level, ranking just behind Japan among Asian countries, and the NSO is working to raise the quality of its statistics to the level of the most advanced countries. To achieve an optimal statistics production process, we must understand the methodological aspects as well as the macro-statistical aspects that can be applied to a country's economic planning. Because history repeats itself, it is worthwhile to look at how the statistical systems of other countries developed a century ago. We study the relationships among the CIS countries along with the history of Russian statistics development. Understanding the statistical systems of the CIS countries, including Russia, will help in using their statistics for international comparative studies.

Big Data Smoothing and Outlier Removal for Patent Big Data Analysis

  • Choi, JunHyeog;Jun, Sunghae
    • Journal of the Korea Society of Computer and Information / v.21 no.8 / pp.77-84 / 2016
  • In general statistical analysis, we need to assume normality. If this assumption is not satisfied, we cannot expect good results from statistical data analysis. Most statistical methods for handling outliers and noise also rely on this assumption. However, the assumption does not hold in big data because of its large volume and heterogeneity. We therefore propose a methodology based on box plots and data smoothing for controlling outliers and noise in big data analysis. The proposed methodology does not depend on the normality assumption. We select patent documents as the target big data domain because patent big data analysis is an important issue in the management of technology. We analyze patent documents using big data learning methods for technology analysis. Patent data collected from patent databases around the world are preprocessed and analyzed by text mining and statistics. However, most research on patent big data analysis has not considered the outlier and noise problem, which decreases prediction accuracy and increases the variance of parameter estimates. In this paper, we check for the existence of outliers and noise in patent big data. To determine whether outliers are present, we use box plots and smoothing visualization. We use patent documents related to three-dimensional printing technology to illustrate how the proposed methodology can be used to detect noise in retrieved patent big data.
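The box-plot and smoothing ideas in this abstract can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the authors' procedure: it uses the conventional 1.5 × IQR fences of a box plot and a simple moving average as the smoother, and the sample counts and function names are hypothetical. Neither step relies on a normality assumption.

```python
from statistics import quantiles


def boxplot_outliers(values, whisker=1.5):
    """Values outside the box-plot fences Q1 - w*IQR and Q3 + w*IQR."""
    q1, _median, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - whisker * iqr, q3 + whisker * iqr
    return [v for v in values if v < low or v > high]


def moving_average(values, window=3):
    """Smooth a series with an unweighted sliding window."""
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# Hypothetical keyword counts per patent document, with one extreme value.
COUNTS = [1, 2, 3, 4, 5, 6, 7, 8, 100]
```

Here `boxplot_outliers(COUNTS)` flags only the extreme count, and `moving_average` can then be applied to the remaining series to damp noise before further analysis.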

Spatial Data Analysis using the Kriging Method

  • Jang, Jihui;Hong, Taekyong;NamKung, Pyong
    • Communications for Statistical Applications and Methods / v.10 no.2 / pp.423-432 / 2003
  • Kriging is a spatial estimation technique that uses spatially correlated data observed at different locations to estimate the variable of interest at a new observation point. In this paper, we present variogram estimation and Kriging methods as part of Kriging theory and apply them to actually measured data. We also predict the ozone level at points where it was not measured using the Kriging method, and compare the Ordinary Kriging method with the Inverse Distance Kriging method.
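Two ingredients mentioned in this abstract, the empirical variogram and inverse-distance estimation, can be sketched in Python as below. This is an illustration only: it shows plain inverse distance weighting and a single-lag empirical semivariance, not the ordinary Kriging system or the exact estimators used in the paper. The power parameter, lag tolerance, sample points, and function names are assumptions.

```python
import math


def idw(points, target, power=2.0):
    """Inverse-distance-weighted estimate at target from (x, y, z) samples."""
    num = den = 0.0
    for x, y, z in points:
        d = math.hypot(x - target[0], y - target[1])
        if d == 0.0:
            return z  # target coincides with a sample point
        weight = 1.0 / d ** power
        num += weight * z
        den += weight
    return num / den


def semivariance(points, lag, tol):
    """Empirical semivariance over pairs separated by lag +/- tol, or None."""
    total, count = 0.0, 0
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            xi, yi, zi = points[i]
            xj, yj, zj = points[j]
            if abs(math.hypot(xi - xj, yi - yj) - lag) <= tol:
                total += (zi - zj) ** 2
                count += 1
    return total / (2 * count) if count else None


# Hypothetical ozone readings as (x, y, value).
SAMPLES = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)]
```

Ordinary Kriging differs from this sketch in that its weights come from solving a linear system built from a variogram model fitted to values such as those returned by `semivariance`, rather than from distance alone.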

TRAPR: R Package for Statistical Analysis and Visualization of RNA-Seq Data

  • Lim, Jae Hyun;Lee, Soo Youn;Kim, Ju Han
    • Genomics & Informatics / v.15 no.1 / pp.51-53 / 2017
  • High-throughput transcriptome sequencing, also known as RNA sequencing (RNA-Seq), is a standard technology for measuring gene expression with unprecedented accuracy. Numerous Bioconductor packages have been developed for the statistical analysis of RNA-Seq data. However, these tools focus on specific aspects of the data analysis pipeline and are difficult to integrate with one another because of their disparate data structures and processing methods. They also lack visualization methods to confirm the integrity of the data and the process. In this paper, we propose an R-based RNA-Seq analysis pipeline called TRAPR, an integrated tool that facilitates the statistical analysis and visualization of RNA-Seq expression data. TRAPR provides various functions for data management, the filtering of low-quality data, normalization, transformation, statistical analysis, data visualization, and result visualization that allow researchers to build customized analysis pipelines.

Fault Detection in Semiconductor Manufacturing Using Statistical Method

  • Lim, Woo-Yup;Jeon, Sung-Ik;Han, Seung-Soo;Soh, Dae-Wha;Hong, Sang-Jeen
    • Proceedings of the Korean Institute of Electrical and Electronic Material Engineers Conference / 2009.11a / pp.44-44 / 2009
  • Fault detection is necessary for yield enhancement and cost reduction in semiconductor manufacturing. The sensor data acquired from semiconductor processing tools are too large to analyze easily for the purpose of fault detection and classification (FDC). We studied fault detection techniques based on statistical methods. Multiple regression analysis detected faults well and made it easy to build a model. To achieve real-time operation and fast computation, the large data set was analyzed separately for each step. We also considered interactions and critical factors among the tool parameters and the process.
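One way regression-based fault detection of the kind described in this abstract might work is sketched below in Python. The abstract gives no model details, so everything here is an assumption: a multiple linear regression is fitted to fault-free runs by solving the normal equations, and a new run is flagged when its residual exceeds a fixed limit. The sensor values, the limit, and the function names are hypothetical, and the solver assumes a non-singular system.

```python
def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit(xs, ys):
    """Least-squares coefficients (intercept first) via the normal equations."""
    rows = [[1.0] + list(x) for x in xs]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    return solve(xtx, xty)


def predict(coef, x):
    """Model output for one run with predictor values x."""
    return coef[0] + sum(c * v for c, v in zip(coef[1:], x))


def detect_faults(coef, xs, ys, limit):
    """Indices of runs whose absolute residual exceeds limit."""
    return [
        i for i, (x, y) in enumerate(zip(xs, ys))
        if abs(y - predict(coef, x)) > limit
    ]


# Hypothetical fault-free training runs: two tool parameters and a response.
TRAIN_X = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
TRAIN_Y = [1.0, 3.0, 4.0, 6.0, 8.0]
```

Fitting one small model per process step, rather than a single model over the full sensor log, keeps each fit cheap, which is in line with the step-wise analysis the abstract mentions.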