• Title, Summary, Keyword: Distributional Hypothesis

A Statistical Analysis of the Distribution of Sasang Constitutions in Iksan Wonkwang Oriental Medicine (익산원광한의원 내원환자의 체질분포에 관한 통계적 분석)

  • 김종열;김홍기
    • The Journal of Korean Medicine / v.24 no.3 / pp.118-129 / 2003
  • Objective: To learn the distributional characteristics of Sasang constitutions. Methods: We statistically analyzed 1,338 patients who had been treated at Iksan Wonkwang Oriental Medicine during the three years from 2000 to 2002. The data were obtained through the electronic chart developed by Kim Jong-Yeol and analyzed using the statistical package SPSS. Results: The distributional ratio of Soeumin : Soyangin : Taeumin was 22.8 : 29.2 : 47.8. Thus the hypothesis 'the distributional ratio of Soeumin : Soyangin : Taeumin is 2 : 3 : 5' was barely rejected by the $\chi^2$ test for goodness-of-fit at the 5% significance level. When the $\chi^2$ test for homogeneity was applied, the distributional characteristics differed between women and men, and among several age groups, at the 5% significance level. Conclusion: Although the hypothesis 'the distributional ratio of Soeumin : Soyangin : Taeumin is 2 : 3 : 5' was rejected by the $\chi^2$ test at the 5% significance level, the observed distributional ratio was not far from the hypothesized one.
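
The goodness-of-fit test described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' SPSS procedure. The abstract reports only rounded percentages, so the counts below are approximations reconstructed from those percentages and the sample size of 1,338, and the function names are hypothetical. With three categories there are 2 degrees of freedom, for which the $\chi^2$ upper-tail probability has the closed form $e^{-x/2}$.

```python
import math

def chi_square_gof(observed, ratio):
    """Pearson chi-square goodness-of-fit statistic against a hypothesized ratio.

    Returns (statistic, degrees_of_freedom).
    """
    total = sum(observed)
    ratio_sum = sum(ratio)
    # Expected counts if the hypothesized ratio held exactly.
    expected = [total * r / ratio_sum for r in ratio]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return statistic, len(observed) - 1

def p_value_df2(statistic):
    """Upper-tail probability of a chi-square variable with exactly 2 degrees of freedom."""
    return math.exp(-statistic / 2.0)

# ASSUMPTION: counts approximated from the reported 22.8 : 29.2 : 47.8
# percentages of 1,338 patients; the true counts are not in the abstract.
approx_counts = [305, 391, 640]
statistic, df = chi_square_gof(approx_counts, [2, 3, 5])
print(f"chi2 = {statistic:.2f}, df = {df}, p = {p_value_df2(statistic):.3f}")
```

The 5% critical value for 2 degrees of freedom is about 5.99. With these approximate counts the statistic comes out a little above that threshold, which is consistent with the abstract's "barely rejected".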

Application of Bootstrap Method for Change Point Test based on Kernel Density Estimator

  • Kim, Dae-Hak
    • Journal of the Korean Data and Information Science Society / v.15 no.1 / pp.107-117 / 2004
  • This paper considers the change point testing problem. Kernel density estimators are used to construct the proposed change point test statistics. The proposed method can be used to test hypotheses about distributional change as well as parameter change. The bootstrap method is applied to obtain the sampling distribution of the proposed test statistic. Small-sample Monte Carlo simulations were also conducted to show the performance of the proposed method.
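
The general idea of a bootstrap change point test built on kernel density estimates can be sketched as follows. The abstract does not give the test statistic, so everything specific here is an assumption: the statistic is a weighted integrated squared difference between Gaussian kernel density estimates before and after a candidate split, maximized over splits. The bandwidth follows Silverman's rule of thumb, and the null distribution is approximated by resampling the pooled series with replacement. All function names are hypothetical.

```python
import math
import random

def gaussian_kde(sample, h):
    """Return a function x -> Gaussian kernel density estimate with bandwidth h."""
    norm = 1.0 / (len(sample) * h * math.sqrt(2.0 * math.pi))
    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - s) / h) ** 2) for s in sample)
    return density

def change_point_statistic(series, min_seg=5, grid_size=30):
    """Largest weighted squared distance between before/after density estimates."""
    n = len(series)
    lo, hi = min(series), max(series)
    mean = sum(series) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in series) / (n - 1))
    if sd == 0.0:
        return 0.0  # constant series: no evidence of change
    h = 1.06 * sd * n ** (-0.2)  # Silverman's rule of thumb (assumed choice)
    step = (hi - lo) / (grid_size - 1)
    grid = [lo + i * step for i in range(grid_size)]
    best = 0.0
    for k in range(min_seg, n - min_seg + 1):
        before = gaussian_kde(series[:k], h)
        after = gaussian_kde(series[k:], h)
        # Riemann-sum approximation of the integrated squared difference.
        ise = sum((before(x) - after(x)) ** 2 for x in grid) * step
        best = max(best, (k * (n - k) / n ** 2) * ise)
    return best

def bootstrap_p_value(series, n_boot=99, seed=0):
    """Approximate p-value: share of resampled statistics at least as large as observed."""
    rng = random.Random(seed)
    observed = change_point_statistic(series)
    n = len(series)
    exceed = 0
    for _ in range(n_boot):
        # Resampling with replacement destroys any ordering, mimicking "no change".
        resampled = rng.choices(series, k=n)
        if change_point_statistic(resampled) >= observed:
            exceed += 1
    return (exceed + 1) / (n_boot + 1)

if __name__ == "__main__":
    rng = random.Random(1)
    # Hypothetical series whose distribution shifts halfway through.
    data = [rng.gauss(0, 1) for _ in range(15)] + [rng.gauss(5, 1) for _ in range(15)]
    print("bootstrap p-value:", bootstrap_p_value(data, n_boot=49))
```

Because the statistic compares whole density estimates rather than means, it reacts to changes in shape or spread as well as location, which matches the abstract's point about distributional change.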

Word Sense Similarity Clustering Based on Vector Space Model and HAL (벡터 공간 모델과 HAL에 기초한 단어 의미 유사성 군집)

  • Kim, Dong-Sung
    • Korean Journal of Cognitive Science / v.23 no.3 / pp.295-322 / 2012
  • In this paper, we cluster similar word senses by applying the vector space model and HAL (Hyperspace Analog to Language). HAL measures correlation among words within a context window of a certain size (Lund and Burgess 1996). The similarity between a word pair is measured by cosine similarity based on the vector space model, which reduces the distortion of space between high-frequency and low-frequency words (Salton et al. 1975, Widdows 2004). We use PCA (Principal Component Analysis) and SVD (Singular Value Decomposition) to reduce the large number of dimensions produced by the similarity matrix. For sense similarity clustering, we adopt supervised and unsupervised learning methods. For the unsupervised method, we use clustering. For the supervised method, we use SVM (Support Vector Machine), the Naive Bayes classifier, and the Maximum Entropy method.
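
A HAL-style co-occurrence vector and the cosine similarity applied to it can be sketched as follows. This is a simplified illustration, not the paper's implementation. It assumes the commonly described HAL scheme, in which a window slides over the text and nearer neighbours receive larger weights, and it concatenates each word's "preceded by" row with its "followed by" column. The window size, toy sentence, and function names are hypothetical.

```python
import math

def hal_counts(tokens, window=2):
    """counts[w][c]: accumulated weight for c appearing before w within the window.

    A neighbour at distance d contributes window - d + 1, so closer words weigh more.
    """
    counts = {}
    for i, word in enumerate(tokens):
        for distance in range(1, window + 1):
            if i - distance < 0:
                break
            preceding = tokens[i - distance]
            row = counts.setdefault(word, {})
            row[preceding] = row.get(preceding, 0) + (window - distance + 1)
    return counts

def hal_vector(counts, word, vocab):
    """Concatenate the word's row (preceding contexts) and column (following contexts)."""
    row = [counts.get(word, {}).get(c, 0) for c in vocab]
    column = [counts.get(c, {}).get(word, 0) for c in vocab]
    return row + column

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

# Hypothetical toy corpus.
tokens = "the cat drinks milk the dog drinks water".split()
vocab = sorted(set(tokens))
counts = hal_counts(tokens, window=2)
vectors = {w: hal_vector(counts, w, vocab) for w in vocab}

print("cat ~ dog :", cosine_similarity(vectors["cat"], vectors["dog"]))
print("cat ~ milk:", cosine_similarity(vectors["cat"], vectors["milk"]))
```

In this toy corpus "cat" and "dog" share their contexts ("the" before, "drinks" after), so their vectors point in a similar direction, while "cat" and "milk" share none. Cosine similarity depends only on direction, so a word's overall frequency does not dominate the comparison.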

Input Dimension Reduction based on Continuous Word Vector for Deep Neural Network Language Model (Deep Neural Network 언어모델을 위한 Continuous Word Vector 기반의 입력 차원 감소)

  • Kim, Kwang-Ho;Lee, Donghyun;Lim, Minkyu;Kim, Ji-Hwan
    • Phonetics and Speech Sciences / v.7 no.4 / pp.3-8 / 2015
  • In this paper, we investigate an input dimension reduction method that uses continuous word vectors in a deep neural network language model. In the proposed method, continuous word vectors were generated with Google's Word2Vec from a large training corpus to satisfy the distributional hypothesis. Discrete word vectors in 1-of-$|V|$ coding were replaced with their corresponding continuous word vectors. In our implementation, the input dimension was successfully reduced from 20,000 to 600 when a tri-gram language model was used with a vocabulary of 20,000 words. The total training time was reduced from 30 days to 14 days for the Wall Street Journal training corpus (corpus length: 37M words).
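
The replacement of 1-of-$|V|$ inputs with continuous vectors can be illustrated with a small sketch. It shows only how the input length changes, not the paper's network. The embedding table is filled with random numbers as a stand-in for trained Word2Vec vectors, the toy sizes are hypothetical, and concatenating one vector per context word is an assumption, since the abstract does not say how the context is assembled.

```python
import random

def one_hot(index, vocab_size):
    """1-of-|V| coding: a vector of zeros with a single 1 at the word's index."""
    vec = [0.0] * vocab_size
    vec[index] = 1.0
    return vec

def make_embedding_table(vocab_size, dim, seed=0):
    """Stand-in for a trained Word2Vec table (random values, shape illustration only)."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(vocab_size)]

def discrete_input(context_ids, vocab_size):
    """Concatenate one one-hot vector per context word."""
    features = []
    for word_id in context_ids:
        features.extend(one_hot(word_id, vocab_size))
    return features

def continuous_input(context_ids, table):
    """Concatenate one dense vector per context word (a table lookup)."""
    features = []
    for word_id in context_ids:
        features.extend(table[word_id])
    return features

vocab_size, dim = 1000, 30   # hypothetical toy sizes, not the paper's
table = make_embedding_table(vocab_size, dim)
context = [12, 345]          # ids of the two preceding words in a tri-gram model

print(len(discrete_input(context, vocab_size)), len(continuous_input(context, table)))
# prints "2000 60"
```

The dense input length depends only on the embedding dimension and the number of context words, not on the vocabulary size, which is why the first layer of the network shrinks.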

Environmental Equity Analysis of the Accessibility of Urban Neighborhood Parks in Daegu City (대구시 도시근린공원의 접근성에 따른 환경적 형평성 분석)

  • Seo, Hyun-Jin;Jun, Byong-Woon
    • Journal of the Korean Association of Geographic Information Studies / v.14 no.4 / pp.221-237 / 2011
  • This study investigates the environmental equity of accessibility to urban neighborhood parks in the city of Daegu. The spatial distribution of urban neighborhood parks was explored with spatial statistics, and spatial accessibility to them was then evaluated with both minimum distance and coverage approaches. Descriptive and inferential statistics such as the proximity ratio, the Mann-Whitney U test, and logistic regression were used to compare socioeconomic characteristics across different levels of accessibility to the neighborhood parks and then to test the distributional inequity hypothesis. The results from the minimum distance method indicated that Dalseo-gu had the best accessibility to the neighborhood parks while Dong-gu had the worst. The coverage method showed that Dalseo-gu had the best accessibility, whereas Dong-gu and Nam-gu had the worst accessibility to the neighborhood parks at 500 m and 1,000 m buffer distances. A spatial pattern of environmental inequity existed in old towns with respect to population density and the percentage of people under the age of 18. The spatial pattern of environmental inequity in new towns was explored on the basis of the percentage of people over the age of 65, the percentage of people below the poverty level, and the percentage of free of charge rental housing. These results were closely related to the development process of urban parks in Daegu, which was stimulated by the quantitative urban park policy, the urban development process, and residential location patterns such as permanent rental housing and free of charge rental housing. This study further extends existing research on environmental justice, which concerns the distributional inequity of environmental disamenities and hazards, by focusing on environmental amenities such as urban neighborhood parks. The results of this study can be used in making decisions on urban park management and in setting urban park policy that takes the social geography of Daegu into account.
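
The two accessibility measures named in this abstract can be sketched as follows. This is a minimal illustration using straight-line distances on hypothetical planar coordinates in metres. The study's actual GIS workflow, data, and distance definition are not given in the abstract, and the function names are assumptions.

```python
import math

def nearest_distance(point, parks):
    """Minimum distance approach: straight-line distance to the closest park."""
    return min(math.dist(point, park) for park in parks)

def coverage_share(points, parks, buffer_m):
    """Coverage approach: share of points lying within buffer_m of any park."""
    covered = sum(1 for pt in points if nearest_distance(pt, parks) <= buffer_m)
    return covered / len(points)

# Hypothetical coordinates in metres (not data from the study).
parks = [(0.0, 0.0), (1000.0, 0.0)]
homes = [(0.0, 300.0), (500.0, 0.0), (1000.0, 800.0), (2000.0, 0.0)]

for home in homes:
    print(home, "->", nearest_distance(home, parks), "m")
print("share within 500 m:  ", coverage_share(homes, parks, 500))
print("share within 1,000 m:", coverage_share(homes, parks, 1000))
```

The minimum distance measure gives each location a continuous value, while the coverage measure depends on the chosen buffer distance, which is presumably why the abstract reports results at both 500 m and 1,000 m.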

A Study on the Computational Model of Word Sense Disambiguation, based on Corpora and Experiments on Native Speaker's Intuition (직관 실험 및 코퍼스를 바탕으로 한 의미 중의성 해소 계산 모형 연구)

  • Kim, Dong-Sung;Choe, Jae-Woong
    • Korean Journal of Cognitive Science / v.17 no.4 / pp.303-321 / 2006
  • According to Harris's (1966) distributional hypothesis, understanding the meaning of a word is thought to depend on its context. Under this hypothesis about human language ability, this paper proposes a computational model of the native speaker's language processing mechanism for word sense disambiguation, based on two sets of experiments. Among the three computational models discussed in this paper, namely the logic model, the probabilistic model, and the probabilistic inference model, the experiment shows that the logic model is applied first for semantic disambiguation of the key word. Next, if the logic model fails to apply, the probabilistic model becomes most relevant. The three models were also compared with the test results in terms of the Pearson correlation coefficient. It turns out that the logic model best explains human decision behaviour on the ambiguous words, and the probabilistic inference model comes next. The experiment consists of two parts. One involves 30 sentences extracted from a 1-million graphic-word corpus, and the result shows that the agreement rate among native speakers is 98% in terms of word sense disambiguation. The other part of the experiment, which was designed to exclude the logic model effect, is composed of 50 cleft sentences.
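
The comparison between model output and speaker judgments relies on the Pearson correlation coefficient, which can be computed as in the sketch below. The scores are invented purely for illustration and do not come from the paper, and the function name is hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

# HYPOTHETICAL data: a model's preference score for one sense of each test
# sentence, and the share of speakers who chose that sense.
model_scores = [0.9, 0.2, 0.7, 0.4, 0.8]
speaker_share = [0.95, 0.10, 0.60, 0.55, 0.85]
print(f"r = {pearson_r(model_scores, speaker_share):.3f}")
```

A coefficient near 1 would mean the model ranks the senses much as speakers do. Comparing this value across the three models is the kind of comparison the abstract describes.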

The Effect of Organizational Justice on Information Security-Related Role Stress and Negative Behaviors

  • Hwang, Inho;Ahn, SangJoon
    • Journal of the Korea Society of Computer and Information / v.24 no.11 / pp.87-98 / 2019
  • In recent years, many organizations have protected their information resources by investing in information security technology. However, information security threats from insiders have not been reduced. This study proposes a method for reducing information security threats within an organization by mitigating employees' negative information security behaviors. Specifically, the study identifies a relationship between information security-related role stress and negative behavior and examines whether organizational justice mitigates role stress. That is, the purpose of the study is to suggest a mechanism linking organizational justice, information security-related role stress, and negative behavior. Negative behavior consists of avoidance behavior and deviant behavior, and security-related role stress consists of role conflict and role ambiguity. Organizational justice consists of distributional justice, procedural justice, and informational justice. The research model is verified through structural equation modeling. After establishing a research model and hypotheses, we developed a survey questionnaire and collected data from 383 employees whose organizations had already implemented security policies. The findings show that security-related role stress increases negative behavior and that organizational justice mitigates role stress. The results of the analysis suggest a direction for organizational strategy to minimize insiders' security-related negative behaviors.

Distribution of Benthic Diatoms in Tidal Flats of Hampyeong Bay, Korea (함평만 갯벌의 저서규조류 분포 특성)

  • Lee, Hak-Young;Jung, Myoung-Hwa
    • Korean Journal of Environmental Biology / v.29 no.1 / pp.17-22 / 2011
  • The distributional pattern of benthic diatoms in the tidal flats of Hampyeong Bay, Korea, was studied from January to October 2009. Forty-five species of benthic diatoms were identified in the Hampyeong Bay tidal flats, and the most dominant species was Paralia sulcata. The most diverse flora was observed at the Gaip and Songseok sites in April, with 22 species, and the least diverse at the Hyeonhwa site in January. Chlorophyll-a concentrations in the tidal flats ranged over 21.2~31.8 mg $m^{-2}$ at the Hyeonhwa site, 23.6~35.4 mg $m^{-2}$ at the Gaip site, and 24.2~34.3 mg $m^{-2}$ at the Songseok site. Pheopigment concentrations ranged between 25.3 and 45.2 mg $m^{-2}$. The standing crops of benthic diatoms were highest in April and lowest in January, February, and October. The cell volumes of benthic diatoms were highest in April. The taxa and biomass of benthic diatoms were correlated with temperature. With respect to temperature, the benthic diatoms occurred optimally in the range of $14{\sim}17^{\circ}C$.

Effects of Areal Interpolation Methods on Environmental Equity Analysis (면내삽법이 환경적 형평성 분석에 미치는 영향)

  • Jun, Byong-Woon
    • Journal of the Korean association of regional geographers / v.14 no.6 / pp.736-751 / 2008
  • Although a growing number of studies have used a simple areal weighting interpolation method to quantify the demographic characteristics of impacted areas in environmental equity analysis, the results obtained are inevitably imprecise because of the method's unrealistic assumption that population is evenly distributed within a census enumeration unit. Two alternative areal interpolation methods, intelligent areal weighting and regression, can account for distributional biases in the estimation of impacted populations by making use of additional information about the geographic distribution of population. This research explores five areal interpolation methods for estimating the population characteristics of impacted areas in environmental equity analysis and evaluates the sensitivity of the outcomes of environmental equity analysis to the areal interpolation method. This study used GIS techniques to allow areal interpolation to be informed by the distribution of land cover types, as inferred from a satellite image, in both the source and target units. Independent-samples t-test statistics were computed to test the environmental equity hypothesis, while coefficients of variation were calculated to compare the relative variability and consistency of the socioeconomic characteristics of populations at risk across the areal interpolation methods. Results show that the outcomes of environmental equity analysis in the study area are not sensitive to the areal interpolation method used to estimate affected populations, but that the population estimates within the impacted areas vary considerably across methods. This implies that the use of different areal interpolation methods may to some degree alter the statistical results of environmental equity analysis.
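
Simple areal weighting, the baseline method this abstract criticizes, can be sketched as follows. Each census unit contributes population in proportion to the share of its area that falls inside the impacted zone, which is exactly the even-distribution assumption the abstract calls unrealistic. The zone identifiers, populations, and areas are hypothetical, and in practice the overlap areas would come from a GIS overlay.

```python
def areal_weighting(source_zones, overlaps):
    """Estimate the population inside a target area by simple areal weighting.

    source_zones: {zone_id: (population, total_area)}
    overlaps:     {zone_id: area of that zone lying inside the target area}
    Assumes population is spread evenly within each source zone.
    """
    estimate = 0.0
    for zone_id, overlap_area in overlaps.items():
        population, total_area = source_zones[zone_id]
        if not 0 <= overlap_area <= total_area:
            raise ValueError(f"overlap for {zone_id!r} must lie within the zone area")
        estimate += population * overlap_area / total_area
    return estimate

# Hypothetical census units: (population, area in square km).
census_units = {"A": (1000, 4.0), "B": (600, 2.0)}
# Hypothetical overlap of each unit with the impacted area, in square km.
impacted_overlap = {"A": 1.0, "B": 2.0}

print(areal_weighting(census_units, impacted_overlap))
# prints 850.0
```

Methods that use ancillary data such as land cover replace the plain area ratio with a ratio of weighted areas, so that overlap with uninhabited land contributes little or nothing to the estimate.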

An Epistemological Inquiry on the Development of Statistical Concepts (통계적 개념 발달에 관한 인식론적 고찰)

  • Lee, Young-Ha;Nam, Joo-Hyun
    • The Mathematical Education / v.44 no.3 / pp.457-475 / 2005
  • We inquired into what the statistics classes of secondary schools have been aiming at, that is, their epistemological objects. We now recognize that the main obstacle to systematic articulation is the lack of a clear view of what the statistical concepts are. This study focuses on the ingredients of statistical concepts, which are to serve as the ground for the systematic articulation of statistics courses, especially those for school children. We therefore required that these ingredients be i) directly related to the contents of statistics, ii) psychologically developing, iii) as mutually exclusive as possible, and iv) exhaustive enough to cover all statistical concepts. We examined what statisticians have been doing and how, as well as the various previous views on these matters. We suggest that the following three concepts are the core of the conceptual development of statistics: the concept of distributions, the summarizing ability, and the concept of samples. By the concept of distributions we mean the frequency view of each random category, which develops with age from counts through to probability. Summarizing ability is another important resource for connecting one's inquiry with the data set. A summary is not only viewed as a number but should also be understood as reflecting a random phenomenon. Inductive generalization is one of the most hazardous things. Statistical induction is a scientific way of addressing this hazard, and it starts from distinguishing chance from inevitable consequences. One's inductive logic grows along with one's deductive arguments; nevertheless, they are different. The concept of samples reflects one's view of the sample data and the way one combines one's logic with the data within one's hypothesis. With these three in mind, we examined the Korean statistics curriculum from K to 12. Distributional concepts are dealt with throughout but are not well sequenced. Summarization is introduced in the 1st, 5th, 7th, and 10th grades as a numerical value only. One activity on the concept of samples is given in the 6th grade. The curriculum then jumps to statistical reasoning in the elective courses 'Mathematics I' and 'Probability and Statistics' in grades 11-12. We suggest further studies on the developmental stages of these three conceptual features in order to obtain a firm basis for successive statistical articulation.
