• Title/Summary/Keyword: conditional probability and independence

A Framework for Assessing Probability Knowledge and Skills for Middle School Students: A Case of U.S. (중학교 학생들의 확률적 사고 수준 평가 기준 개발 : 미국의 사례)

  • Park, Ji-Yoon;Lee, Kyung-Hwa
    • School Mathematics / v.11 no.1 / pp.1-15 / 2009
  • Several researchers (Jones et al., 1997; Tarr & Jones, 1997; Tarr & Lannin, 2005) have developed frameworks for students' probabilistic thinking. These studies contributed to an understanding of students' thinking in probability by describing levels of thinking. However, the understanding of middle school students' probabilistic thinking has been limited to the concepts of conditional probability and independence. In this study, a framework for understanding middle school students' thinking in probability is built by integrating the works of Jones et al. (1997), Polaki (2005), and Tarr and Jones (1997). As in those works, the description of levels of probabilistic thinking focuses on the concepts and skills of middle school students. The concepts and skills considered necessary for middle school students were drawn from NCTM documents and NAEP frameworks.

A probabilistic information retrieval model by document ranking using term dependencies (용어간 종속성을 이용한 문서 순위 매기기에 의한 확률적 정보 검색)

  • You, Hyun-Jo;Lee, Jung-Jin
    • The Korean Journal of Applied Statistics / v.32 no.5 / pp.763-782 / 2019
  • This paper proposes a probabilistic document ranking model that incorporates term dependencies. Document ranking is a fundamental information retrieval task: sorting the documents in a collection according to their relevance to the user query (Qin et al., Information Retrieval Journal, 13, 346-374, 2010). A probabilistic model computes the conditional probability of the relevance of each document given the query. Most of the widely used models assume term independence because it is challenging to compute the joint probabilities of multiple terms. However, words in natural language texts are highly correlated. In this paper, we assume a multinomial distribution model to calculate the relevance probability of a document by considering the dependency structure of words, and propose an information retrieval model that ranks documents by estimating this probability with the maximum entropy method. Ranking simulation experiments in various multinomial situations show better retrieval results than a model that assumes the independence of words. Document ranking experiments using the real-world LETOR OHSUMED dataset also show better retrieval results.

Study on Teachers' Understanding on Generating Random Number in Monte Carlo Simulation (몬테카를로 시뮬레이션의 난수 생성에 관한 교사들의 이해에 관한 연구)

  • Heo, Nam Gu;Kang, Hyangim
    • School Mathematics / v.17 no.2 / pp.241-255 / 2015
  • The purpose of this study is to analyze teachers' understanding of random number generation in Monte Carlo simulation and to provide educational implications for school practice. The results showed that 70% of the teachers selected incorrect ideas among three types of random-number strategies for solving a probability problem, and that they made several errors when justifying their opinions. The first kind of error was the belief that, in a continuous probability distribution, the probability of a point or boundary equals the value of the probability density function. The second kind of error was a failure to recognize that conditional probability changes the sample space. The third kind of error was the belief that the joint probability distribution satisfies $P(X=x,\;Y=y)=P(X=x){\times}P(Y=y{\mid}X=x)$ only when the two random variables X and Y are independent of each other.
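
The third error can be checked numerically: the product rule $P(X=x,\;Y=y)=P(X=x){\times}P(Y=y{\mid}X=x)$ holds for any joint distribution, not only for independent variables. The following is a minimal sketch; the joint table and helper names are hypothetical and are not taken from the study.

```python
from fractions import Fraction
from itertools import product

# Hypothetical joint distribution of two DEPENDENT binary variables X and Y
# (illustrative values only, not data from the paper).
joint = {
    (0, 0): Fraction(4, 10),
    (0, 1): Fraction(1, 10),
    (1, 0): Fraction(1, 10),
    (1, 1): Fraction(4, 10),
}

def marginal_x(x):
    """P(X = x), summing the joint over all values of Y."""
    return sum(p for (xi, _), p in joint.items() if xi == x)

def marginal_y(y):
    """P(Y = y), summing the joint over all values of X."""
    return sum(p for (_, yi), p in joint.items() if yi == y)

def cond_y_given_x(y, x):
    """P(Y = y | X = x) from the definition of conditional probability."""
    return joint[(x, y)] / marginal_x(x)

pairs = list(product((0, 1), repeat=2))

# Product rule: P(x, y) == P(x) * P(y | x) for every pair.
chain_rule_holds = all(
    joint[(x, y)] == marginal_x(x) * cond_y_given_x(y, x) for x, y in pairs
)

# Independence: P(x, y) == P(x) * P(y) for every pair.
independent = all(
    joint[(x, y)] == marginal_x(x) * marginal_y(y) for x, y in pairs
)

print(chain_rule_holds, independent)
```

Exact fractions are used so that the equality checks are not affected by floating-point rounding. In this table the product rule holds even though X and Y are not independent.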

The Analytic Performance Model of the Superscalar Processor Using Multiple Branch Prediction (독립시행의 정리를 이용하는 수퍼스칼라 프로세서의 다중 분기 예측 성능 모델)

  • 이종복
    • Proceedings of the IEEK Conference / 1999.06a / pp.1009-1012 / 1999
  • An analytical performance model that can predict the performance of a superscalar processor employing multiple branch prediction is introduced. The model is based on the conditional independence probability and the basic block size of instructions, together with the degree of multiple branch prediction, the fetch rate, and the window size of a superscalar architecture. Trace-driven simulation is performed on a subset of the SPEC integer benchmarks, and the measured IPCs are compared with the results derived from the model. As a result, our analytic model could predict the performance of the superscalar processor using multiple branch prediction to within 6.6 percent on average.

Initial Value Selection in Applying an EM Algorithm for Recursive Models of Categorical Variables

  • Jeong, Mi-Sook;Kim, Sung-Ho;Jeong, Kwang-Mo
    • Journal of the Korean Statistical Society / v.27 no.1 / pp.25-55 / 1998
  • Maximum likelihood estimates (MLEs) for recursive models of categorical variables are discussed under an EM framework. Since MLEs by EM often depend on the choice of the initial values for MLEs, we explore reasonable rules for selecting the initial values for EM. Simulation results strongly support the proposed rules.

Application of a weight-of-evidence model to landslide susceptibility analysis Boeun, Korea

  • Moung-Jin, Lee;Yu, Young-Tae
    • Proceedings of the Korean Association of Geographic Information Studies Conference / 2003.04a / pp.65-70 / 2003
  • The weight-of-evidence model, a Bayesian probability model, was applied to evaluating landslide susceptibility using GIS. Using the locations of landslides and a spatial database covering topography, soil, forest, geology, land use, and lineament, the model was applied to calculate each factor's rating in the Boeun area of Korea, which suffered substantial landslide damage following heavy rain in 1998. The factors are slope, aspect, and curvature from the topographic database; soil texture, soil material, soil drainage, soil effective thickness, and topographic type from the soil database; forest type, timber diameter, timber age, and forest density from the forest map; lithology from the geological database; land use from a Landsat TM satellite image; and lineament from an IRS satellite image. Tests of conditional independence were performed to select the factors, allowing 43 combinations of factors to be analyzed. For the analysis, the contrast values, $W^{+}$ and $W^{-}$, taken as each factor's rating, were overlaid to map landslide susceptibility. The results were validated using the observed landslide locations; among the combinations, the combination of slope, curvature, topographic type, timber diameter, geology, and lineament showed the best results. The results can be used for hazard prevention and for planning land use and construction.
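
The abstract does not spell out how the weights are computed. The following is a minimal sketch, assuming the standard weights-of-evidence definitions $W^{+}=\ln[P(B{\mid}D)/P(B{\mid}\bar{D})]$ and $W^{-}=\ln[P(\bar{B}{\mid}D)/P(\bar{B}{\mid}\bar{D})]$, where B is the presence of a factor class and D is a landslide occurrence. The function name and all cell counts are hypothetical, not values from the study.

```python
import math

def weights_of_evidence(n_total, n_landslide, n_class, n_both):
    """Return (W+, W-, contrast) for one factor class from raster cell counts.

    n_total     -- number of cells in the study area
    n_landslide -- cells containing a landslide (D)
    n_class     -- cells belonging to the factor class (B)
    n_both      -- cells that are in the class AND contain a landslide
    Assumes every count leaves the four conditional probabilities
    strictly between 0 and 1; otherwise the logarithms are undefined.
    """
    n_no_landslide = n_total - n_landslide
    p_b_given_d = n_both / n_landslide
    p_b_given_not_d = (n_class - n_both) / n_no_landslide
    w_plus = math.log(p_b_given_d / p_b_given_not_d)
    w_minus = math.log((1 - p_b_given_d) / (1 - p_b_given_not_d))
    return w_plus, w_minus, w_plus - w_minus

# Hypothetical counts: a slope class covering 2,000 of 10,000 cells
# that contains 60 of the 100 landslide cells.
w_plus, w_minus, contrast = weights_of_evidence(10_000, 100, 2_000, 60)
print(w_plus, w_minus, contrast)
```

A positive $W^{+}$ with a negative $W^{-}$ indicates that landslides are over-represented inside the class, and the contrast $W^{+}-W^{-}$ summarizes the strength of that association.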

Dependency-based Framework of Combining Multiple Experts for Recognizing Unconstrained Handwritten Numerals (무제약 필기 숫자를 인식하기 위한 다수 인식기를 결합하는 의존관계 기반의 프레임워크)

  • Kang, Hee-Joong;Lee, Seong-Whan
    • Journal of KIISE:Software and Applications / v.27 no.8 / pp.855-863 / 2000
  • Although the Behavior-Knowledge Space (BKS) method, one of the well-known decision combination methods, does not need any assumptions in combining multiple experts, it theoretically requires exponential storage space for storing and managing the K decisions jointly observed from K experts. That is, combining K experts needs a (K+1)st-order probability distribution. However, it is well known that this distribution becomes unmanageable to store and estimate, even for a small K. To overcome this weakness, previous studies have decomposed the probability distribution into a number of component distributions and approximated it with a product of the component distributions. One such approach applies a conditional independence assumption to the distribution. Another approximates the distribution with a product of only first-order tree dependencies or second-order distributions, as shown in [1]. In this paper, dependencies of higher order than the first are considered in approximating the distribution, and a dependency-based framework is proposed to optimally approximate the (K+1)st-order probability distribution with a product set of dth-order dependencies, where $1{\le}d{\le}K$, and to combine multiple experts based on the product set using the Bayesian formalism. The framework was evaluated experimentally on the standardized CENPARMI database.

Learning Distribution Graphs Using a Neuro-Fuzzy Network for Naive Bayesian Classifier (퍼지신경망을 사용한 네이브 베이지안 분류기의 분산 그래프 학습)

  • Tian, Xue-Wei;Lim, Joon S.
    • Journal of Digital Convergence / v.11 no.11 / pp.409-414 / 2013
  • Naive Bayesian classifiers are a powerful and well-known type of classifier that can be easily induced from a dataset of sample cases. However, their strong conditional independence assumptions can sometimes lead to weak classification performance. Normally, naive Bayesian classifiers use Gaussian distributions to handle continuous attributes and to represent the likelihood of the features conditioned on the classes. The probability density of attributes, however, is not always well fitted by a Gaussian distribution. Another prominent type of classifier is the neuro-fuzzy classifier, which can learn fuzzy rules and fuzzy sets using supervised learning. Since there are specific structural similarities between a neuro-fuzzy classifier and a naive Bayesian classifier, the purpose of this study is to apply learning distribution graphs constructed by a neuro-fuzzy network to naive Bayesian classifiers. We compare Gaussian distribution graphs with fuzzy distribution graphs for the naive Bayesian classifier, applying both types to classify leukemia and colon DNA microarray data sets. The results demonstrate that a naive Bayesian classifier with fuzzy distribution graphs is more reliable than one with Gaussian distribution graphs.
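
For reference, the Gaussian baseline described in this abstract can be sketched as follows. This is a minimal from-scratch illustration of a Gaussian naive Bayesian classifier, not the authors' implementation and not their neuro-fuzzy variant. The function names, the variance floor, and the toy data are assumptions made for illustration.

```python
import math
from collections import defaultdict

def fit(X, y):
    """Estimate a log prior and per-feature Gaussian (mean, variance) per class."""
    rows_by_class = defaultdict(list)
    for row, label in zip(X, y):
        rows_by_class[label].append(row)
    model = {}
    for label, rows in rows_by_class.items():
        columns = list(zip(*rows))
        means = [sum(c) / len(c) for c in columns]
        # Small floor (an assumption) keeps variances strictly positive.
        variances = [
            sum((v - m) ** 2 for v in c) / len(c) + 1e-9
            for c, m in zip(columns, means)
        ]
        model[label] = (math.log(len(rows) / len(X)), means, variances)
    return model

def predict(model, row):
    """Pick the class with the highest log posterior.

    The conditional independence assumption appears as the SUM of
    per-feature Gaussian log likelihoods.
    """
    def log_posterior(label):
        log_prior, means, variances = model[label]
        log_likelihood = sum(
            -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)
            for x, mean, var in zip(row, means, variances)
        )
        return log_prior + log_likelihood
    return max(model, key=log_posterior)

# Hypothetical two-feature toy data with two well-separated classes.
X = [[1.0, 2.1], [1.2, 1.9], [0.8, 2.0], [3.0, 4.1], [3.2, 3.9], [2.9, 4.0]]
y = ["a", "a", "a", "b", "b", "b"]
model = fit(X, y)
print(predict(model, [1.1, 2.0]), predict(model, [3.1, 4.0]))
```

Replacing the Gaussian log density inside `log_posterior` with another per-feature density is the point at which an alternative distribution graph, such as the fuzzy one studied in the paper, would enter.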

An Analysis on Consumer Preference for Attributes of Agricultural Box Scheme (농산물 꾸러미 속성별 소비자선호 분석)

  • Park, Jae-Dong;Kim, Tae-Kyun;Jang, Woo-Whan;Lim, Cheong-Ryong
    • Journal of the Korea Academia-Industrial cooperation Society / v.20 no.1 / pp.329-338 / 2019
  • In this study, we analyze consumer preferences for the attributes of agricultural box schemes and suggest ways to revitalize the business. We estimate the marginal willingness to pay (MWTP) for box scheme attributes using a choice experiment. The attributes are the bundle method, the delivery method, and price. To select an efficient model for statistical analysis, we evaluate the conditional logit model, the heteroscedastic extreme value (HEV) model, the multinomial probit model, and the mixed logit model under different assumptions. The results of these four models show that the bundle method, the delivery method, and price are statistically significant in explaining the probability of participation in a box scheme. Likelihood ratio tests show that the heteroscedastic extreme value model is the most appropriate for our survey data. The results also indicate that the MWTP for a change from fixed type to selection type is KRW 7,096.6. The MWTP values for a change from parcel service to direct delivery and to cold-chain delivery are KRW 3,497.5 and KRW 7,532.7, respectively. The results of this study may contribute to the government's local food policies.
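
As a rough illustration of how MWTP is usually derived from a choice model, the sketch below assumes a linear-in-parameters conditional logit utility, for which MWTP is the negative ratio of an attribute coefficient to the price coefficient. The coefficients and prices are hypothetical and are not the study's estimates.

```python
import math

def choice_probabilities(utilities):
    """Conditional logit choice probabilities (softmax of utilities)."""
    top = max(utilities)  # subtract the maximum for numerical stability
    exps = [math.exp(u - top) for u in utilities]
    total = sum(exps)
    return [e / total for e in exps]

def mwtp(beta_attribute, beta_price):
    """Marginal willingness to pay under a linear utility: -beta_attr / beta_price."""
    return -beta_attribute / beta_price

# Hypothetical coefficients (NOT the paper's estimates).
beta_selection_type = 0.5   # utility gain from selection type over fixed type
beta_price = -0.0001        # utility change per KRW of price

# Two hypothetical alternatives: fixed type at KRW 30,000 and
# selection type at KRW 33,000.
utilities = [
    beta_price * 30_000,
    beta_selection_type + beta_price * 33_000,
]
probs = choice_probabilities(utilities)
mwtp_value = mwtp(beta_selection_type, beta_price)
print(probs, mwtp_value)
```

The ratio has units of KRW because the price coefficient is expressed per KRW, so the utility scale cancels.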