• Title/Abstract/Keywords: Count Data Analysis

Search results: 418 (processing time: 0.021 s)

Hierarchical Bayes Analysis of Longitudinal Poisson Count Data

  • 김달호;신임희;최인순
    • Journal of the Korean Data and Information Science Society / Vol. 13, No. 2 / pp.227-234 / 2002
  • In this paper, we consider hierarchical Bayes generalized linear models for the analysis of longitudinal count data. Specifically, we introduce hierarchical Bayes random effects models. We discuss implementation of the Bayes procedures via Markov chain Monte Carlo (MCMC) integration techniques. The hierarchical Bayes method is illustrated with a real dataset and is compared with other statistical methods.


Bibliometric Approach to Research Assessment: Publication Count, Citation Count, & Author Rank

  • Yang, Kiduk;Lee, Jongwook
    • Journal of Information Science Theory and Practice / Vol. 1, No. 1 / pp.27-41 / 2013
  • We investigated how bibliometric indicators such as publication count and citation count affect the assessment of research performance by computing various bibliometric scores of the works of Korean LIS faculty members and comparing the rankings by those scores. For the study data, we used the publication and citation data of 159 tenure-track faculty members of Library and Information Science departments in 34 Korean universities. The study results showed correlation between publication count and citation count for authors with many publications but the opposite evidence for authors with few publications. The study results suggest that as authors publish more and more work, citations to their work tend to increase along with publication count. However, for junior faculty members who have not yet accumulated enough publications, citations to their work are of great importance in assessing their research performance. The study data also showed that there are marked differences in the magnitude of citations between papers published in Korean journals and papers published in international journals.

Count Data Model을 이용한 중소기업의 정보화 효과 분석 (Analysis on the Effects of the Informatization Level on SMEs through Count Data Model)

  • 황순환
    • 한국IT서비스학회지 / Vol. 3, No. 1 / pp.5-20 / 2004
  • It is generally known that investment in extending the ability to use IT applications enhances firms' productivity by reducing costs, increasing returns, and increasing the speed of operations. Nevertheless, it has been complex and difficult to evaluate concretely the effect of informatization on firms, SMEs (small and medium-sized enterprises) in particular. In this study, I point out the weaknesses of SMEs and analyze the effects of informatization through the count data model. For this analysis, I separate the effects into two parts: organizational effect and personal effect. The conclusion is that the organizational effect is larger than the personal effect, and that the ability to make practical use of IT systems is the most effective item related to informatization level. It will therefore be important to concentrate on raising this ability in order to heighten the competitiveness of SMEs.

Statistical Analysis of Count Rate Data for On-line Seawater Radioactivity Monitoring

  • Lee, Dong-Myung;Cong, Binh Do;Lee, Jun-Ho;Yeo, In-Young;Kim, Cheol-Su
    • Journal of Radiation Protection and Research / Vol. 44, No. 2 / pp.64-71 / 2019
  • Background: It is very difficult to distinguish between a radioactive contamination source and background radiation from natural radionuclides in the marine environment by means of an online monitoring system. The objective of this study was to investigate a statistical process for triggering on abnormal levels of count rate data measured by our on-line seawater radioactivity monitoring system. Materials and Methods: Count rate data sets in time series were collected from 9 monitoring posts. All of the count rate data were measured every 15 minutes from the region of interest (ROI) for $^{137}\mathrm{Cs}$ ($E_{\gamma}=661.6\,\mathrm{keV}$) on the gamma-ray energy spectrum. The Shewhart ($3\sigma$), CUSUM, and Bayesian S-R control chart methods were evaluated, and a comparative analysis of determination methods for count rate data was carried out in terms of the false positive incidence rate. All statistical algorithms were developed in R by the authors. Results and Discussion: The $3\sigma$, CUSUM, and S-R analyses resulted in average false positive incidence rates of $0.164 \pm 0.047\%$, $0.064 \pm 0.0367\%$, and $0.030 \pm 0.018\%$, respectively. The S-R method has a lower value than the $3\sigma$ and CUSUM methods because the Bayesian S-R method uses the information to evaluate a posterior distribution, even though the CUSUM control chart accumulates information from recent data points. When the net count rate and gross count rate measured in time series over a full year at one monitoring post were compared using $3\sigma$ control charts, the two methods resulted in false positive incidence rates of 0.142% and 0.219%, respectively. Conclusion: Bayesian S-R and CUSUM control charts are better suited than the $3\sigma$ control chart for on-line seawater radioactivity monitoring with count rate data in time series. However, a continuously increasing trend is required to differentiate between a false positive and actual radioactive contamination. For the determination of count rate, the net count method is better than the gross count method because of a relatively small variation in the data points.
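
To make the comparison in the abstract above concrete, here is a minimal Python sketch of two of the three control charts it names: a Shewhart $3\sigma$ rule and a one-sided upper CUSUM. It is not the authors' code (the abstract says they worked in R), and it does not cover the Bayesian S-R chart. The function names, the `slack` and `threshold` values, and the sample numbers are all assumptions chosen for illustration.

```python
from statistics import mean, stdev


def shewhart_alarms(rates, baseline, k=3.0):
    """Return indices where a count rate exceeds baseline mean + k * baseline std."""
    mu, sigma = mean(baseline), stdev(baseline)
    return [i for i, x in enumerate(rates) if x > mu + k * sigma]


def cusum_alarms(rates, baseline, slack=0.5, threshold=5.0):
    """One-sided upper CUSUM; slack and threshold are in units of baseline std."""
    mu, sigma = mean(baseline), stdev(baseline)
    s, alarms = 0.0, []
    for i, x in enumerate(rates):
        # Accumulate standardized excess over the baseline, minus the slack.
        s = max(0.0, s + (x - mu) / sigma - slack)
        if s > threshold:
            alarms.append(i)
            s = 0.0  # restart accumulation after signalling
    return alarms


# Hypothetical count rates: a quiet baseline, then a small sustained shift.
baseline = [10.0, 11.0, 9.0, 10.0, 12.0, 8.0, 10.0, 11.0, 9.0, 10.0]
drift = [12.0] * 8

print(shewhart_alarms(drift, baseline))
print(cusum_alarms(drift, baseline))
```

In this toy series no single point crosses the $3\sigma$ limit, while the CUSUM statistic builds up over consecutive readings and eventually signals. That is the usual reason for preferring accumulating charts when shifts are small but sustained.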

An Analysis of Panel Count Data from Multiple random processes

  • 박유성;김희영
    • 한국통계학회:학술대회논문집 / Proceedings of the Korean Statistical Society 2002 Autumn Conference / pp.265-272 / 2002
  • An integer-valued autoregressive integrated (INARI) model is introduced to eliminate stochastic trend and seasonality from time series of count data. The INARI model extends the previous integer-valued ARMA model. We show that it is stationary and ergodic in order to establish asymptotic normality of the conditional least squares estimator. Optimal estimating equations are used to incorporate into the estimation the categorical and serial correlations arising from panel count data and the variations arising from the three random processes through which observations are obtained. Under regularity conditions for martingale sequences, we show asymptotic normality of the estimators obtained from the estimating equations. Using cancer mortality data provided by the U.S. National Center for Health Statistics (NCHS), we apply our results to estimate the probability of cells classified by 4 causes of death and 6 age groups and to forecast the death count of each cell. We also investigate the impact of the three random processes on estimation.


A Bayesian joint model for continuous and zero-inflated count data in developmental toxicity studies

  • Hwang, Beom Seuk
    • Communications for Statistical Applications and Methods / Vol. 29, No. 2 / pp.239-250 / 2022
  • In many applications, we frequently encounter correlated multiple outcomes measured on the same subject. Joint modeling of such multiple outcomes can improve efficiency of inference compared to independent modeling. For instance, in developmental toxicity studies, fetal weight and number of malformed pups are measured on pregnant dams exposed to different levels of a toxic substance, and the association between these outcomes should be taken into account in the model. The number of malformations may have many zeros, which should be analyzed via zero-inflated count models. Motivated by applications in developmental toxicity studies, we propose a Bayesian joint modeling framework for continuous and count outcomes with excess zeros. In our model, a zero-inflated Poisson (ZIP) regression model is used to describe the count data, and subject-specific random effects account for the correlation across the two outcomes. We implement a Bayesian approach using an MCMC procedure with a data augmentation method and adaptive rejection sampling. We apply our proposed model to dose-response analysis in a developmental toxicity study to estimate the benchmark dose in a risk assessment.
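
As a small illustration of the zero-inflated Poisson (ZIP) distribution named in the abstract above, the following Python sketch evaluates its probability mass function and mean. It shows only the standard ZIP definition. It is not the paper's joint model, and the names `zip_pmf`, `zip_mean`, `lam`, and `pi` are assumptions introduced here.

```python
import math


def zip_pmf(y, lam, pi):
    """P(Y = y) for a zero-inflated Poisson.

    lam is the Poisson rate; pi is the probability of a structural (extra) zero.
    """
    poisson = math.exp(-lam) * lam ** y / math.factorial(y)
    extra_zero = pi if y == 0 else 0.0
    return extra_zero + (1.0 - pi) * poisson


def zip_mean(lam, pi):
    """Mean of the ZIP distribution: only the Poisson part contributes."""
    return (1.0 - pi) * lam


# Hypothetical parameter values, for illustration only.
lam, pi = 2.0, 0.3
print(zip_pmf(0, lam, pi), math.exp(-lam))  # ZIP puts more mass at zero than Poisson
print(zip_mean(lam, pi))
```

The extra mass at zero is what lets the model accommodate litters with no malformations more often than a plain Poisson rate would allow.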

가산자료모형을 이용한 송정 해수욕장의 경제적 가치추정: - 비수기 해수욕장의 가치추정 - (Estimating the Economic Value of the Songieong Beach Using A Count Data Model: - Off-season Estimating Value of the Beach -)

  • 허윤정;이승래
    • 수산경영론집 / Vol. 38, No. 2 / pp.79-101 / 2007
  • The purpose of this study is to estimate the economic value of Songjeong Beach in the off-season using an Individual Travel Cost Model (ITCM). Songjeong Beach is located in Busan but far away from the city. These days, however, the increased rate of traffic inflow to Songjeong Beach and the five-day working week are reflected in the trend analysis. Moreover, people's psychological values have changed. For these reasons, off-season visitors to the beach are increasing. The ITCM, like the Contingent Valuation Method and the Hedonic Price Model, is applied to estimate the non-market value of environmental goods. The ITCM is derived from the count data model (i.e., the Poisson and negative binomial models), so this paper compares Poisson and negative binomial count data models to measure tourism demand. The data for the study were collected from visitors at Songjeong Beach over the period from November 1 through November 23, 2006. Interviewers were instructed to interview only individuals, which yielded a sample of 113. A dependent variable that is defined on the non-negative integers and subject to sampling truncation is the result of a truncated count data process. This paper analyzes the effects of determinants on visitors' demand for exhibition using a class of maximum-likelihood regression estimators for count data from truncated samples. The count data and truncated models are used primarily to account for the non-negative integer and truncation properties of tourist trips, as suggested by the economic valuation literature. The results suggest that the truncated negative binomial model mitigates the overdispersion problem and is preferred to the other models in the study. This paper differs from previous studies in two respects. First, it estimates the value of the beach in the off-season. Second, it emphasizes that 'travel cost' includes not only monetary cost but also the opportunity cost of 'travel time'. According to the truncated negative binomial model, the consumer surplus (CS) per trip is estimated at about 199,754 Korean won, and the total economic value was estimated to be 1,288,680 Korean won.
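
For readers unfamiliar with how count data travel cost models produce a per-trip consumer surplus, the standard textbook relations can be sketched as follows. This is a generic illustration under a log-linear demand assumption, shown for the zero-truncated Poisson case for simplicity. The abstract does not state the authors' exact specification, and the symbols $\beta_{tc}$, $TC_i$, $x_i$, and $\gamma$ are introduced here for illustration.

```latex
\begin{align*}
\lambda_i &= \exp\left(\beta_0 + \beta_{tc}\, TC_i + x_i^{\top}\gamma\right), \qquad \beta_{tc} < 0, \\
P(Y_i = y \mid Y_i > 0) &= \frac{e^{-\lambda_i}\lambda_i^{y}}{y!\,\left(1 - e^{-\lambda_i}\right)}, \qquad y = 1, 2, \dots, \\
\mathrm{CS}_i &= \int_{TC_i}^{\infty} \exp\left(\beta_0 + \beta_{tc}\, c + x_i^{\top}\gamma\right) dc = -\frac{\lambda_i}{\beta_{tc}}, \\
\frac{\mathrm{CS}_i}{\lambda_i} &= -\frac{1}{\beta_{tc}}.
\end{align*}
```

The second line is the truncation correction needed because on-site samples contain only people who made at least one trip. The last line shows that, under this functional form, consumer surplus per trip depends only on the travel cost coefficient.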


시각적 평가에 의한 개더 드레이프 형상 분석 (Analysis of Types of Gather Drape with Visual Evaluation)

  • 이명희;정희경
    • 한국의상디자인학회지 / Vol. 7, No. 1 / pp.33-40 / 2005
  • Gathering is a method used to control fullness along a seam line. The purpose of this study was to investigate the relationship between quantitative and qualitative methods, specifically between the effect of gathers and the types of gather drape. The experimental design consists of four factors: (1) three fabrics of different weight and thickness, (2) three stitch densities, (3) five gather ratios, and (4) three grain directions. Therefore, one hundred thirty-five (135) samples were made. The SPSS WIN 10.0 package was used for data analysis. The results of this study were as follows. First, in the frequency analysis, side height, hem line width, node depth, node count, and node width accorded with the recorded result data. Second, in the correlation analysis, side height was related to the front statements. Side height and entire visual were negatively correlated. Hem line width, node depth, and node count were negatively correlated with the section statements, whereas node width was positively correlated with them. Third, in the $\chi^2$ analysis, the front picture parts receiving excellent evaluations were the 1st side height, 3rd hem line width, 4th node depth, 3rd node count, and 3rd node width. The section illustration parts receiving excellent evaluations were the 4th side height, 1st hem line width, 2nd node depth, 3rd node count, and 4th node width.


Analysis of Marginal Count Failure Data by using Covariates

  • Karim, Md.Rezaul;Suzuki, Kazuyuki
    • International Journal of Reliability and Applications / Vol. 4, No. 2 / pp.79-95 / 2003
  • Manufacturers collect and analyze field reliability data to enhance the quality and reliability of their products and to improve customer satisfaction. To reduce the data collecting and maintenance costs, the amount of data maintained for evaluating product quality and reliability should be minimized. With this in mind, some industrial companies assemble warranty databases by gathering data from different sources for a particular time period. This “marginal count failure data” does not provide (i) the number of failures by when the product entered service, (ii) the number of failures by product age, or (iii) information about the effects of the operating season or environment. This article describes a method for estimating age-based claim rates from marginal count failure data. It uses covariates to identify variations in claims relative to variables such as manufacturing characteristics, time of manufacture, operating season or environment. A Poisson model is presented, and the method is illustrated using warranty claims data for two electrical products.


통계적인 기법을 활용한 동질성구간에 따른 교통량 수시조사 효율화 연구 (Determination of a Homogeneous Segment for Short-term Traffic Count Efficiency Using a Statistical Approach)

  • 정유석;오주삼
    • 한국도로학회논문집 / Vol. 17, No. 4 / pp.135-141 / 2015
  • PURPOSES: This study was conducted to determine homogeneous segments and integrate them in order to improve the efficiency of short-term traffic counts. We also attempted to reduce the traffic monitoring budget. METHODS: Based on a statistical approach, homogeneous segments within the same road section are determined. Statistical analyses using the t-test, mean difference, and correlation coefficient are carried out on 10 years (2004-2013) of short-term traffic count data, and the MAPE for fresh data (2014) is evaluated. The correlation coefficient represents the trend in traffic counts, while the mean difference and t-score represent the average traffic count. RESULTS: The statistical analysis suggests that the number of target segments varies with the criteria. For more than 30% of adjacent segments, the correlation coefficient is higher than 0.8. For 36.2% and 19.5% of adjacent segments, the mean difference and t-score are below 20% and 2.8, respectively. According to the effectiveness analysis, the integration criterion based on the mean difference has a larger effect than the t-score criterion. Thus, the mean difference represents traffic volume similarity. CONCLUSIONS: The integration of 47 road segments from 882 adjacent road segments yields a MAPE of 8.87%, which is within an acceptable range. It can reduce the traffic monitoring budget and increase the count to improve the accuracy of traffic volume estimation.
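
The MAPE figure quoted in the abstract above is the mean absolute percentage error. A minimal Python sketch of the usual definition follows. The function name and the sample volumes are hypothetical, and the abstract does not say exactly how the authors aggregated errors across segments.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    actual and predicted are equal-length sequences; actual values must be nonzero.
    """
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical observed volumes on two segments vs. volumes inferred
# from an adjacent segment judged to be homogeneous.
observed = [100.0, 200.0]
inferred = [110.0, 180.0]
print(mape(observed, inferred))
```

A lower MAPE means that counts taken on one segment stand in more accurately for its merged neighbour, which is what justifies dropping a count location.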