• Title/Abstract/Keywords: Data interpretation, statistical

Tree-Structured Nonlinear Regression

  • Chang, Young-Jae;Kim, Hyeon-Soo
    • 응용통계연구 / Vol. 24, No. 5 / pp.759-768 / 2011
  • Tree algorithms have been widely developed for regression problems. A key strength of a regression tree is its flexibility of fit: it can capture nonlinearity in the data. In particular, data with sudden structural breaks, such as oil prices and exchange rates, can be fitted well with a simple mixture of a few piecewise linear regression models. Because split points are determined by chi-squared statistics computed from the residuals of the fitted piecewise linear models, and the split variable is chosen by an objective criterion, the resulting fit is reasonable and agrees with a visual interpretation of the data. Piecewise linear regression by a regression tree is thus a useful fitting method that can be applied to datasets with large fluctuations.
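
The piecewise linear idea in this abstract can be sketched as follows. This is a minimal illustration, not the authors' algorithm: it fits two separate least-squares lines on either side of a single split point and keeps the split with the lowest total residual sum of squares. The exhaustive SSE search is an assumption that stands in for the chi-squared residual test mentioned in the abstract, and the names `fit_line` and `best_split` and the toy series are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b, sse)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx if sxx > 0 else 0.0
    a = my - b * mx
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return a, b, sse


def best_split(xs, ys, min_leaf=3):
    """Try every split of the x-sorted data; keep the lowest total SSE.

    Assumption: the SSE search replaces the chi-squared residual test.
    """
    pairs = sorted(zip(xs, ys))
    sx = [p[0] for p in pairs]
    sy = [p[1] for p in pairs]
    best = None
    for i in range(min_leaf, len(sx) - min_leaf + 1):
        if sx[i - 1] == sx[i]:
            continue  # cannot split between tied x values
        left = fit_line(sx[:i], sy[:i])
        right = fit_line(sx[i:], sy[i:])
        total = left[2] + right[2]
        if best is None or total < best["sse"]:
            best = {
                "split": (sx[i - 1] + sx[i]) / 2,  # midpoint between neighbours
                "left": left[:2],    # (intercept, slope) below the split
                "right": right[:2],  # (intercept, slope) above the split
                "sse": total,
            }
    return best


# Invented toy series with a structural break between x = 9 and x = 10.
xs = list(range(20))
ys = [2.0 * x if x < 10 else 50.0 - 1.0 * x for x in xs]
result = best_split(xs, ys)
print(result["split"], result["left"], result["right"])
```

A full regression tree would apply `best_split` recursively to each side and would also choose among several candidate split variables; the sketch stops at one split on one variable.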

A Study on the Calculation and Provision of Accruals-Quality by Big Data Real-Time Predictive Analysis Program

  • Shin, YeounOuk
    • International journal of advanced smart convergence / Vol. 8, No. 3 / pp.193-200 / 2019
  • Accruals-Quality (AQ) is an important proxy for evaluating the quality of accounting information disclosures. High-quality accounting information provides high predictability and precision in earnings disclosures and increases the stock price response. High AQ also makes information more useful in capital markets, for example by mitigating heterogeneity in the interpretation of accounting information. The purpose of this study is to suggest how AQ, which represents the quality of accounting information disclosure, can be converted into digitized data in real time using information technology (IT) and provided to the information environment of financial analysts. AQ also serves as a framework for predictive analysis through a big data log analysis system. This real-time AQ information will help financial analysts increase their activity and reduce information asymmetry. In addition, AQ provided in real time through IT can serve as an important basis for decision-making by users of capital market information, and is expected to give companies an incentive to voluntarily improve the quality of their accounting information disclosures.

Assessment through Statistical Methods of Water Quality Parameters(WQPs) in the Han River in Korea

  • Kim, Jae Hyoun
    • 한국환경보건학회지 / Vol. 41, No. 2 / pp.90-101 / 2015
  • Objective: This study was conducted to develop a chemical oxygen demand (COD) regression model using water quality monitoring data (January 2014) obtained from the Han River auto-monitoring stations. Methods: Surface water quality data from 198 sampling stations along the six major areas were assembled and analyzed to determine the spatial distribution and clustering of monitoring stations based on 18 WQPs, and to build a regression model using selected parameters. Statistical techniques, including combined genetic algorithm-multiple linear regression (GA-MLR), cluster analysis (CA), and principal component analysis (PCA), were used to build a COD model from the water quality data. Results: The best GA-MLR model yielded a 5-descriptor COD model with satisfactory statistical results ($r^2=92.64$, $Q^2_{LOO}=91.45$, $Q^2_{Ext}=88.17$). This approach includes variable selection of the WQPs in order to find the most important factors affecting water quality. Additionally, ordination techniques such as PCA and CA were used to classify monitoring stations. The biplot based on the first two principal components (PCs) of the PCA model identified three distinct groups of stations, which also differ in their correlation with the WQPs; this enables better interpretation of the water quality characteristics at particular stations as of January 2014. Conclusion: This data analysis procedure appears to provide an efficient means of modelling water quality by interpreting and defining its most essential variables, such as TOC and BOD. The water parameters selected in the COD model as most important in contributing to environmental health and water pollution can be used in water quality management strategies. At present, the river is under threat of anthropogenic disturbances during festival periods, especially in upstream areas.

세계 각국의 자원에 대한 통계적 고찰 (Statistical Consideration on the Resources of the Countries in the World)

  • 허문열;최병수;이승천
    • 응용통계연구 / Vol. 22, No. 1 / pp.41-57 / 2009
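  • This paper uses data on population, economy, and other resources for 232 countries of the world to examine statistically which resources affect a country's level of development, human development index, economic power, and OECD membership, and how. The country-level resource data used here are of mixed type, combining continuous and discrete variables, and contain many missing values, so conventional methods are limited in analyzing them. This paper presents a process for exploring such mixed-type data using visual methods and shows the limitations of this approach. To overcome these limitations and to derive scientific conclusions for the given problem by applying an objective criterion, we use mutual information (MI), which is based on the entropy theory of Shannon (1948). There are several ways to estimate mutual information, and the results vary greatly depending on the method. In this paper, MI is estimated by applying the discretization method of Fayyad and Irani (1992). All of the procedures here were carried out using DAVIS (Huh and Song, 2002), a visual exploration tool for multidimensional data.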
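
As a rough illustration of the mutual information step described in this abstract, the sketch below discretizes a continuous variable and computes a plug-in MI estimate against a binary label. It rests on several assumptions: equal-width binning is used as a simple stand-in for the Fayyad and Irani discretization, missing values are not handled, and the function names and the toy data are invented for the example.

```python
import math
from collections import Counter


def equal_width_bins(values, k=4):
    """Map each value to a bin index 0..k-1.

    Assumption: equal-width bins, not the entropy-based
    Fayyad-Irani method named in the abstract.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / k or 1.0  # fall back to 1.0 if all values are equal
    return [min(int((v - lo) / width), k - 1) for v in values]


def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in bits for two discrete sequences."""
    n = len(xs)
    px = Counter(xs)
    py = Counter(ys)
    pxy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x,y) * log2( p(x,y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi


# Invented toy data: a continuous resource measure and a 0/1 membership label.
values = [1, 2, 3, 4, 50, 60, 70, 80]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
bins = equal_width_bins(values, k=2)
print(mutual_information(bins, labels))
```

Because the estimate depends on how the bins are drawn, changing `k` or the binning rule can change the ranking of variables, which is consistent with the abstract's remark that results vary greatly by estimation method.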

데이터 분석 프로젝트 참여한 예비 교사의 통계적 지식에 대한 변화와 데이터 기반 의사 결정의 경험 (Changes in Statistical Knowledge and Experience of Data-driven Decision-making of Pre-service Teachers who Participated in Data Analysis Projects)

  • 서희주;한선영
    • 한국수학교육학회지시리즈E:수학교육논문집 / Vol. 35, No. 2 / pp.153-172 / 2021
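  • The ability to handle data is expected to become especially important in future society, and teacher education that fosters statistical thinking along with statistical knowledge is therefore needed. Accordingly, this study applied a data analysis project developed by the researchers to pre-service teachers, and then examined changes in their statistical knowledge and the content of their experience of data-driven decision-making using statistical thinking. Through the project, the pre-service teachers had the opportunity to analyze real data with technological tools. A mixed-methods design was applied: changes in the pre-service teachers' statistical knowledge were analyzed quantitatively, and their experience of data-driven decision-making was examined qualitatively. The results showed that the pre-service teachers lacked statistical knowledge about the relationship between the population mean and the sample mean, and about estimating and interpreting the population mean. Regarding data-driven decision-making, they differed in the depth of their understanding of the data, the analysis methods, and the analysis results, and these differences arose even among pre-service teachers who worked together in the same group. Based on these results, suggestions for improving the quality of statistics education are discussed.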

수중 인공구조물에 대한 사이드스캔소나 탐사자료의 영상처리 (Digital Image Processing of Side Scan Sonar for Underwater Man-made Structure)

  • 신성렬;임민혁;김광은
    • Journal of Advanced Marine Engineering and Technology / Vol. 33, No. 2 / pp.344-354 / 2009
  • Side scan sonar, which uses acoustic waves, plays a very important role in underwater, sea floor, and shallow marine geologic surveys. In this study, we acquired side scan sonar data for underwater man-made structures, artificial reefs and fishing grounds, installed and distributed in the survey area. We applied digital image processing techniques to the side scan sonar data to enhance image quality, using various kinds of filtering in the spatial domain and the frequency domain. We tested filtering parameters such as kernel size, differential operator, and statistical value. The condition, distribution, and environment of the artificial structures could be readily estimated through interpretation of the side scan sonar images.
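
One way to picture spatial-domain filtering with a kernel size and a statistical value is a median filter, sketched below. The abstract does not say which filters were used, so the choice of the median, the edge-clamping border rule, and the function name `median_filter` are all assumptions made for illustration; the 5x5 patch is invented.

```python
import statistics


def median_filter(image, kernel=3):
    """Replace each pixel with the median of its kernel x kernel neighbourhood.

    Assumption: borders are handled by clamping indices to the image edge.
    """
    h, w = len(image), len(image[0])
    half = kernel // 2
    out = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            window = [
                image[min(max(r + dr, 0), h - 1)][min(max(c + dc, 0), w - 1)]
                for dr in range(-half, half + 1)
                for dc in range(-half, half + 1)
            ]
            out[r][c] = statistics.median(window)
    return out


# Invented 5x5 patch: uniform background with one bright speckle.
patch = [[10] * 5 for _ in range(5)]
patch[2][2] = 255
smoothed = median_filter(patch, kernel=3)
print(smoothed[2][2])
```

A larger `kernel` suppresses larger speckles but also blurs small targets, which is the kind of trade-off that testing the kernel size parameter would expose.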

THE NUMERICAL IMPLEMENTATION OF RISK

  • Lee, Chun-Jin
    • Journal of applied mathematics & informatics / Vol. 2, No. 2 / pp.53-62 / 1995
  • If one is to estimate environmental risk based on data, or predict risk based on expert opinion, the parameter "environmental risk" must be defined precisely so that, when data become available, the numerical values of the estimates and/or predictions can be evaluated. The definition must also be precise so that it can be used successfully in regulatory and litigation activities. This presentation develops a definition that lends itself to statistical analysis and inference and, in addition, to ease of engineering interpretation. Various implications and useful extensions for numerically measuring the mixed effects of several toxicants in two or more dimensions could be developed in further research.

영농청소년의 지역사회참여실태 및 활성화 방안 (A Study on the Realities and Activation of Community Participation of Young Farmers)

  • 이채식;박은식
    • 농촌지도와개발 / Vol. 14, No. 2 / pp.395-415 / 2007
  • The purposes of this study were to investigate the realities of community participation among young farmers and to suggest measures to activate it. Data were collected from 234 young farmers in rural Korea. Using SPSS 13.0 for Windows, frequency analysis, t-tests, ANOVA, LSD for post-hoc interpretation, and factor analysis were employed, with a statistical significance level of .05. The main results and suggestions were as follows: 1) Young farmers were more likely to participate by watching television about the community, discussing with others, and searching the internet about the community, whereas rural youths were less likely to participate by contacting government and parliament. 2) The difficulties young farmers faced in community participation were lack of time and insufficient information about participatory activities. The study suggested that young farmers should be given more opportunities to participate in diverse types of activities, along with practical information.

BIVARIATE ANALYSIS에 의한 월류량에 모의발생에 관한 연구 (A STUDY ON SYNTHETIC GENERATION OF MONTHLY STREAMFLOW BY BIVARIATE ANALYSIS)

  • 서병하;윤용남;강관원
    • 물과 미래 / Vol. 12, No. 2 / pp.63-69 / 1979
  • Sequences of monthly streamflows constitute a non-stationary time series. A purely stochastic model has been applied to data generation for non-stationary time series. Two different methods, single-site and multi-site generation, have been used on hydrologic time series. In this study, the synthetic generation method by bivariate analysis studied by Thomas and Fiering, one of the multi-site models, was applied to historical data on monthly streamflows at two sites on the Nakdong River; to check the validity of this model, the single-site Thomas-Fiering model was also applied. Statistical analysis showed that the bivariate Thomas-Fiering model performed better than the single-site model. Comparison of the means and standard deviations of the historical and generated series, together with interpretation of the cross-correlogram, showed that the model used here performs well in simultaneously generating monthly streamflows at two sites in a river basin.
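
For orientation, the single-site Thomas-Fiering recursion mentioned in this abstract can be sketched as below. This is an illustrative sketch only: it covers the single-site form, not the bivariate two-site model the paper evaluates, it uses normally distributed residuals with no transformation, and the function names and the toy history are invented (no Nakdong River data are used). Each month's flow is the next month's mean, plus a regression term on the current deviation, plus a random term scaled by `sd * sqrt(1 - r^2)`.

```python
import math
import random
import statistics


def corr(a, b):
    """Pearson correlation of two equal-length sequences."""
    ma, mb = statistics.fmean(a), statistics.fmean(b)
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    den = math.sqrt(sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b))
    return num / den if den > 0 else 0.0


def fit_monthly_stats(history):
    """history: list of years, each a list of 12 monthly flows."""
    cols = [[year[m] for year in history] for m in range(12)]
    mean = [statistics.fmean(c) for c in cols]
    sd = [statistics.stdev(c) for c in cols]
    r = []
    for m in range(12):
        if m < 11:
            a, b = cols[m], cols[m + 1]
        else:
            # December pairs with January of the following year
            a, b = cols[11][:-1], cols[0][1:]
        r.append(corr(a, b))
    return mean, sd, r


def generate(history, n_years, seed=0):
    """Generate 12 * n_years synthetic monthly flows, starting in January."""
    mean, sd, r = fit_monthly_stats(history)
    rng = random.Random(seed)
    q = mean[0]  # assumption: start the series at the January mean
    out = [q]
    for step in range(12 * n_years - 1):
        j = step % 12        # current month
        k = (j + 1) % 12     # next month
        b = r[j] * sd[k] / sd[j] if sd[j] > 0 else 0.0
        noise = rng.gauss(0.0, 1.0) * sd[k] * math.sqrt(max(0.0, 1.0 - r[j] ** 2))
        # Raw normal-residual form; it can produce negative flows.
        q = mean[k] + b * (q - mean[j]) + noise
        out.append(q)
    return out


# Invented toy history: 10 years of seasonal flows with random scatter.
_rng = random.Random(1)
history = [
    [50.0 + 30.0 * math.sin(2 * math.pi * m / 12) + _rng.gauss(0.0, 5.0) for m in range(12)]
    for _ in range(10)
]
series = generate(history, n_years=5, seed=42)
print(len(series))
```

Comparing the monthly means and standard deviations of `series` with those of `history` mirrors the validation step the abstract describes; a two-site version would additionally need the cross-correlation between sites.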

공간 데이터베이스를 이용한 1991년 용인지역 산사태 분석 (Landslide Analysis of Yongin Area Using Spatial Database)

  • 이사로;민경덕
    • 자원환경지질 / Vol. 33, No. 4 / pp.321-332 / 2000
  • The purpose of this study is to analyze the landslides that occurred in the Yongin area in 1991 using a spatial database. Landslide locations were detected from aerial photograph interpretation and field surveys. The landslide locations, topography, soil, forest, and geology were compiled into a spatial database using a Geographic Information System (GIS). To establish the landslide occurrence factors, slope, aspect, and curvature were calculated from the topographic database. Texture, material, drainage, and effective thickness of soil were extracted from the soil database, and type, age, diameter, and density of wood were extracted from the forest database. Lithology was extracted from the geological database, and land use was classified from the TM satellite image. Landslides were analyzed using the spatial correlation between landslide locations and the occurrence factors by bivariate probability methods. GIS was used to analyze the large volume of data efficiently, and statistical programs were used to maintain specialty and accuracy. The results can be used as basic data for hazard prevention, land use planning, and construction planning.
