• Title/Summary/Keyword: Bayesian statistics

Differences of Cold-heat Patterns between Healthy and Disease Group (건강군과 질환군의 한열지표 차이에 관한 고찰)

  • Kim Ji-Eun;Lee Seung-Gi;Ryu Hwa-Seung;Park Kyung-Mo
    • Journal of Physiology & Pathology in Korean Medicine / v.20 no.1 / pp.224-228 / 2006
  • Pattern identification of exterior-interior syndrome and cold-heat syndrome is one of the most frequently used diagnostic methods in Oriental medicine. No systematic study has analyzed how the 'exterior-interior and cold-heat' characteristics differ between healthy and disease groups. In this study, cold-heat pattern, blood pressure, pulse rate, height, and weight were recorded from 100 healthy subjects and 196 disease subjects aged 30 to 59 years. We used descriptive statistics to analyze the differences between the healthy and disease groups, and a linear regression function, a linear support vector machine, and a Bayesian classifier to distinguish the healthy group from the disease group. The scores for both exterior-heat and interior-cold were higher in the healthy group than in the disease group. This means that in a person belonging to the disease group, the exterior tends to be cold and the interior tends to be hot. These results were not related to age. However, the attempt to separate the healthy group from the disease group using the exterior-interior and cold-heat patterns together with the other vital signs did not perform well. This means that although the two groups show different trends, this information alone could not classify subjects as healthy or diseased.

A Korean Homonym Disambiguation Model Based on Statistics Using Weights (가중치를 이용한 통계 기반 한국어 동형이의어 분별 모델)

  • 김준수;최호섭;옥철영
    • Journal of KIISE:Software and Applications / v.30 no.11 / pp.1112-1123 / 2003
  • Word sense disambiguation (WSD) is one of the most difficult problems in Korean information processing. A Bayesian model using semantic information extracted from a definition corpus (Korean dictionary definitions, 1 million POS-tagged eojeol) achieved an accuracy of 72.08% (nouns 78.12%, verbs 62.45%). This paper proposes a statistical WSD model that uses NPH (New Prior Probability of Homonym sense) and distance weights. We selected 46 homonyms (30 nouns, 16 verbs) that occur with high frequency in the definition corpus and tested the model on 47,977 contexts from the ‘21C Sejong Corpus’ (3.5 million POS-tagged eojeol). The WSD model using NPH improves accuracy by 1.70% on average, and the model using both NPH and distance weights improves it by 2.01%.
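
The general idea of a Bayesian sense classifier with distance weights can be sketched as follows. This is only an illustrative naive Bayes variant, not the model from the paper: the prior is a plain relative frequency (not NPH), the weight `1 / distance` is an assumed weighting scheme, and the function names, sense labels, and toy training data are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def train(examples):
    """Count senses and context words from (sense, [context words]) pairs."""
    sense_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for sense, words in examples:
        sense_counts[sense] += 1
        for word in words:
            word_counts[sense][word] += 1
            vocab.add(word)
    return sense_counts, word_counts, vocab


def disambiguate(model, context):
    """Pick the sense with the highest distance-weighted log posterior.

    context is a list of (word, distance) pairs, where distance >= 1 is the
    number of positions between the context word and the homonym.
    """
    sense_counts, word_counts, vocab = model
    total = sum(sense_counts.values())
    best_sense, best_score = None, -math.inf
    for sense, count in sense_counts.items():
        score = math.log(count / total)  # log prior (relative frequency)
        denom = sum(word_counts[sense].values()) + len(vocab)
        for word, distance in context:
            # Add-one smoothing keeps unseen words from zeroing the score.
            likelihood = (word_counts[sense][word] + 1) / denom
            # Assumed weighting: nearer words count more (weight 1/distance).
            score += (1.0 / distance) * math.log(likelihood)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense


# Hypothetical toy data for one homonym with two senses.
toy_examples = [
    ("fruit", ["eat", "sweet", "tree"]),
    ("fruit", ["eat", "peel"]),
    ("ship", ["sail", "sea", "port"]),
    ("ship", ["sea", "captain"]),
]
toy_model = train(toy_examples)
chosen = disambiguate(toy_model, [("sea", 1), ("eat", 4)])
```

In this toy case the adjacent word "sea" outweighs the more distant word "eat", so the "ship" sense is chosen.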

An estimation method for non-response model using Monte-Carlo expectation-maximization algorithm (Monte-Carlo expectation-maximaization 방법을 이용한 무응답 모형 추정방법)

  • Choi, Boseung;You, Hyeon Sang;Yoon, Yong Hwa
    • Journal of the Korean Data and Information Science Society / v.27 no.3 / pp.587-598 / 2016
  • When the outcome of an election is predicted in advance by various methods, non-response is one of the major issues. A variety of non-response imputation methods may be employed to address it, but the forecasts tend to vary with the method used. In this study, in order to improve electoral forecasts, we studied a model-based method of non-response imputation that applies the Monte Carlo Expectation Maximization (MCEM) algorithm introduced by Wei and Tanner (1990). The MCEM algorithm using maximum likelihood estimates (MLEs) is applied to solve the boundary solution problem under the non-ignorable non-response mechanism. We performed simulation studies to compare the estimation performance of MCEM, maximum likelihood estimation, and a Bayesian estimation method. The results showed that the MCEM method can be a reasonable candidate for non-response model estimation. We also applied the MCEM method to the 2012 Korean presidential election exit poll data and investigated prediction performance using the modified within precinct error (MWPE) criterion (Bautista et al., 2007).
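
The mechanics of MCEM (a Monte Carlo E-step followed by a complete-data M-step) can be illustrated with a deliberately simple toy problem. The sketch below is not the non-response model from the paper: it estimates the rate of an exponential distribution from right-censored data, a case chosen only because the closed-form MLE is known and can be used as a check. The function name, data values, and tuning parameters are hypothetical.

```python
import random


def mcem_exponential_rate(observed, censored_at, n_iter=50, n_draws=2000, seed=0):
    """Estimate an exponential rate from right-censored data with MCEM.

    observed:    fully observed event times
    censored_at: censoring times (the true time is only known to exceed them)
    """
    rng = random.Random(seed)
    n = len(observed) + len(censored_at)
    rate = 1.0  # arbitrary starting value
    for _iteration in range(n_iter):
        # Monte Carlo E-step: impute each censored time from its conditional
        # distribution given the current rate. By memorylessness this is
        # the censoring time plus a fresh exponential draw.
        imputed_total = 0.0
        for _draw in range(n_draws):
            imputed_total += sum(c + rng.expovariate(rate) for c in censored_at)
        expected_censored_sum = imputed_total / n_draws
        # M-step: complete-data MLE of the rate, n / (sum of all times).
        rate = n / (sum(observed) + expected_censored_sum)
    return rate


# Hypothetical data: five observed times and three censored ones.
toy_observed = [1.2, 0.7, 3.1, 2.4, 0.9]
toy_censored = [2.0, 4.0, 3.0]
estimate = mcem_exponential_rate(toy_observed, toy_censored)

# Closed-form MLE for comparison: events / total time at risk.
closed_form = len(toy_observed) / (sum(toy_observed) + sum(toy_censored))
```

Up to Monte Carlo noise, `estimate` should settle near `closed_form`; in models like the one in the abstract no such closed form is available, which is where the simulated E-step becomes useful.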

Study on the Multilevel Effects of Integrated Crisis Intervention Model for the Prevention of Elderly Suicide: Focusing on Suicidal Ideation and Depression (노인자살예방을 위한 통합적 위기개입모델 다층효과 연구: 자살생각·우울을 중심으로)

  • Kim, Eun Joo;Yook, Sung Pil
    • 한국노년학 / v.37 no.1 / pp.173-200 / 2017
  • This study is designed to verify the actual effect on elderly suicide prevention of the integrated crisis intervention service, which has been widely provided across local communities in Gyeonggi Province, focusing on the integrated crisis intervention model developed for the prevention of elderly suicide. The integrated crisis intervention model for local communities and its manual were developed by integrating crisis intervention theory, which incorporates the local community's integrated system approach, with stress vulnerability theory. To analyze the effect, geriatric depression and suicidal ideation scales were adopted, and data were collected as follows. Data were collected from 258 people in the first preliminary test, and then from 184 people in the second test after the integrated crisis intervention service had been provided for 6 months. The third collection was made from 124 people 2 to 3 years later using the backward tracing method. For the analysis, the researcher used R statistical computing to conduct test equating and vertical scaling between measurement points. The researcher then conducted descriptive statistics analysis and univariate analysis of variance, and performed multilevel modeling analysis using Bayesian estimation. The study found that the integrated crisis intervention model developed for elderly suicide prevention has a statistically significant effect on reducing elderly suicide, in terms of elderly depression and suicidal ideation, in the follow-up measurement taken after the crisis intervention compared with the first preliminary scores. The model was found to be effective to the extent of 0.56 for the reduction of depression and 0.39 for the reduction of suicidal ideation. 
However, the backward tracing test conducted 2-3 years after the first crisis intervention found that the improved values had returned to their original state, showing that the effect of the intervention is not maintained for long. Multilevel analysis was conducted to identify factors affecting depression and suicidal ideation, such as the service type (professional counseling, medication, peer counseling), the characteristics of the client (sex, age), the characteristics of the counselor (age, career, major), and the interaction between counselor characteristics and the intervention. It was found that only medication significantly reduces suicidal ideation and that, if the counselor's major is counseling, it further significantly reduces suicidal ideation through its interaction with professional counseling. Furthermore, because the characteristics of suicide prevention experts were found to moderate the intervention effect when the integrated crisis intervention model is applied, primary consideration should be given to the counseling ability of these experts.

Optimal supervised LSA method using selective feature dimension reduction (선택적 자질 차원 축소를 이용한 최적의 지도적 LSA 방법)

  • Kim, Jung-Ho;Kim, Myung-Kyu;Cha, Myung-Hoon;In, Joo-Ho;Chae, Soo-Hoan
    • Science of Emotion and Sensibility / v.13 no.1 / pp.47-60 / 2010
  • Most research on classification has used kNN (k-Nearest Neighbor) and SVM (Support Vector Machine), which are known as learning-based models, and the Bayesian classifier and NNA (Neural Network Algorithm), which are known as statistics-based methods. However, these methods face space and time limitations when classifying the very large number of web pages on today's internet. Moreover, most classification studies use a uni-gram feature representation, which does not represent the real meaning of words well. Korean web page classification poses additional problems because Korean words often have multiple meanings (polysemy). For these reasons, LSA (Latent Semantic Analysis) has been proposed for classification in this environment (large data sets and polysemous words). LSA uses SVD (Singular Value Decomposition), which decomposes the original term-document matrix into three matrices and reduces their dimension. This makes it possible to create a new low-dimensional semantic space for representing vectors, which can make classification efficient and reveal the latent meaning of words or documents (or web pages). Although LSA is good at classification, it has some drawbacks. When SVD reduces the dimensions of the matrix and creates the new semantic space, it considers which dimensions represent the vectors well, not which dimensions discriminate them well. This is why LSA does not improve classification performance as much as expected. In this paper, we propose a new LSA that selects the optimal dimensions for both discriminating and representing vectors, minimizing these drawbacks and improving performance. The proposed method shows better and more stable performance than other LSA methods in a low-dimensional space. In addition, we obtain further improvement in classification by creating and selecting features through stopword reduction and by statistically weighting specific values for them.

Application of Stable Isotopic Niche Space to Large River Monitoring: Analysis of Benthic Macroinvertebrates of the Seungchon Weir (안정동위원소비를 활용한 생태지위면적 분석의 수생태계 평가 가능성 분석: 영산강 승촌보의 저서성 대형무척추동물을 대상으로)

  • Seo, Dong-Hwan;Oh, Hye-Ji;Jin, Mei-Yan;Oda, Yusuke;Kim, Hyun-Woo;Jang, Min-Ho;Choi, Bohyung;Shin, Kyung-Hoon;Lee, Kyung-Lak;Lee, Su-Woong;Chang, Kwang-Hyeon
    • Journal of Environmental Impact Assessment / v.27 no.6 / pp.685-694 / 2018
  • We measured ecological niche space (ENS) using the carbon and nitrogen stable isotope ratios of benthic macroinvertebrates to evaluate its applicability to large river assessment. In particular, we compared the ENSs of selected macroinvertebrates between the upper and lower areas of Seungchon Weir in the Yeongsan River to estimate the impact of the weir on the biological community. We also measured basic water quality and community indices, including the benthic macroinvertebrates index (BMI), to estimate their correlations with the calculated ENS. ENS was calculated using Stable Isotope Bayesian Ellipses in R (package "SIBER"). Seasonal variations in water quality and community indices were found, but there was no apparent trend between the upper and lower areas of the Seungchon Weir in June (before the rainy season) and August (after the rainy season). However, the ENS of benthic macroinvertebrates markedly decreased across the weir in both June and August regardless of changes in water quality. This means that the physical change of the stream caused by the weir decreases the ecological isotopic niche space of benthic macroinvertebrates regardless of water quality, suggesting that physical modification by the weir can affect the interaction between habitat conditions and macroinvertebrates. Therefore, the ecological isotopic niche space can be a useful supplementary indicator for river ecosystem assessment.
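
An isotopic niche area of this kind is commonly summarized as the area of a standard ellipse fitted to the bivariate (carbon, nitrogen) isotope data. The sketch below computes only the plain sample-covariance ellipse area, pi times the square root of the determinant of the covariance matrix, with an optional small-sample factor of (n - 1) / (n - 2). It does not reproduce the Bayesian estimation performed by the SIBER package, and the function name and sample points are hypothetical.

```python
import math


def standard_ellipse_area(points, small_sample_correction=False):
    """Area of the standard ellipse for a list of (x, y) isotope pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Sample variances and covariance (divisor n - 1).
    sxx = sum((x - mean_x) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - mean_y) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / (n - 1)
    # The semi-axes are the square roots of the covariance eigenvalues,
    # so the area equals pi * sqrt(determinant of the covariance matrix).
    area = math.pi * math.sqrt(sxx * syy - sxy ** 2)
    if small_sample_correction:
        area *= (n - 1) / (n - 2)
    return area


# Hypothetical isotope pairs arranged on a square.
toy_points = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
toy_area = standard_ellipse_area(toy_points)
toy_area_corrected = standard_ellipse_area(toy_points, small_sample_correction=True)
```

Comparing such areas between sampling sites, for example upstream and downstream of a weir, is the kind of contrast the abstract describes.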

Models for Estimating Genetic Parameters of Milk Production Traits Using Random Regression Models in Korean Holstein Cattle

  • Cho, C.I.;Alam, M.;Choi, T.J.;Choy, Y.H.;Choi, J.G.;Lee, S.S.;Cho, K.H.
    • Asian-Australasian Journal of Animal Sciences / v.29 no.5 / pp.607-614 / 2016
  • The objectives of the study were to estimate genetic parameters for milk production traits of Holstein cattle using random regression models (RRMs), and to compare the goodness of fit of various RRMs with homogeneous and heterogeneous residual variances. A total of 126,980 test-day milk production records of first-parity Holstein cows collected between 2007 and 2014 by the Dairy Cattle Improvement Center of the National Agricultural Cooperative Federation in South Korea were used. These records included milk yield (MILK), fat yield (FAT), protein yield (PROT), and solids-not-fat yield (SNF). The statistical models included random genetic and permanent environmental effects modeled with Legendre polynomials (LP) of the third to fifth order (L3-L5), fixed effects of herd-test day and year-season at calving, and a fixed regression for the test-day record (third to fifth order). The residual variances in the models were either homogeneous (HOM) or heterogeneous (15 classes, HET15; 60 classes, HET60). A total of nine models (3 polynomial orders $\times$ 3 types of residual variance), namely L3-HOM, L3-HET15, L3-HET60, L4-HOM, L4-HET15, L4-HET60, L5-HOM, L5-HET15, and L5-HET60, were compared using Akaike information criterion (AIC) and/or Schwarz Bayesian information criterion (BIC) statistics to identify the best-fitting model(s) for each trait. The lowest BIC values were observed for the models L5-HET15 (MILK, PROT, SNF) and L4-HET15 (FAT), which fit best. In general, for a given polynomial order, the BIC values of the HET15 models were lower than those of the HET60 models in most cases. This implies that the order of the LP and the type of residual variance affect the goodness of fit of the models. Also, the heterogeneity of residual variances should be considered in test-day analysis. 
The heritability estimates from the best-fitting models ranged from 0.08 to 0.15 for MILK, 0.06 to 0.14 for FAT, 0.08 to 0.12 for PROT, and 0.07 to 0.13 for SNF, depending on days in milk of the first lactation. Genetic variances for the studied traits tended to decrease during the earlier stages of lactation, then increase in the middle, and decrease again at the end of lactation. Given the fit of the models and the differing genetic parameters across lactation stages, genetic parameters could be estimated more accurately from RRMs than from lactation models. Therefore, we suggest using RRMs in place of lactation models for national dairy cattle genetic evaluations of milk production traits in Korea.
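
Model comparison by information criteria of this kind can be sketched with the textbook definitions AIC = 2k - 2 ln L and BIC = k ln n - 2 ln L, where k is the number of estimated parameters, n the number of records, and ln L the maximized log-likelihood. The log-likelihoods, parameter counts, and model labels below are hypothetical and are not taken from the study.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: lower is better."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Schwarz Bayesian information criterion: lower is better."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


# Hypothetical candidates: label -> (maximized log-likelihood, parameter count).
toy_candidates = {
    "model_a": (-1050.0, 12),
    "model_b": (-1040.0, 30),
}
toy_n_obs = 1000

# Pick the candidate with the lowest BIC.
best_by_bic = min(
    toy_candidates,
    key=lambda name: bic(*toy_candidates[name], toy_n_obs),
)
```

Here the richer model has the higher likelihood, but its extra parameters cost more under BIC than the fit gains, which mirrors how a model with more residual classes can lose to a more parsimonious one.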

Genetic Variation of Korean Fir Sub-Populations in Mt. Jiri for the Restoration of Genetic Diversity (유전다양성 복원을 위한 지리산 구상나무 아집단의 유전변이)

  • Ahn, Ji Young;Lim, Hyo-In;Ha, Hyun-Woo;Han, Jingyu;Han, Sim-Hee
    • Journal of Korean Society of Forest Science / v.106 no.4 / pp.417-423 / 2017
  • To provide an ecological restoration strategy that considers the genetic diversity of Abies koreana in Mt. Jiri, the genetic diversity of and genetic differentiation among the sub-populations Banyabong, Byeoksoryeong, and Cheonwangbong were investigated. The average number of alleles ($A$) was 7.8, the average number of effective alleles ($A_e$) was 4.9, observed heterozygosity ($H_o$) was 0.578, and expected heterozygosity ($H_e$) was 0.672. The level of genetic diversity within sub-populations ($H_e=0.672$) was lower than those at both the population ($H_e=0.778$) and species ($H_e=0.759$) levels. However, the level of genetic diversity was high compared with those reported for the genus Abies. Genetic differentiation was 0.014 from F-statistics ($F_{ST}$) and 0.004 from AMOVA (${\Phi}_{ST}$). Bayesian clustering showed almost no genetic differentiation among the sub-populations in Mt. Jiri. Therefore, if seeds are sampled sufficiently by selecting the parameters from the three sub-populations, genetically appropriate materials for ecological restoration could be obtained.
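
For a single locus, the diversity measures named above are usually computed from allele frequencies $p_i$ as $H_e = 1 - \sum_i p_i^2$ and $A_e = 1 / \sum_i p_i^2$. The sketch below applies these standard per-locus definitions to hypothetical frequencies; it does not use the study's data, and averages over several loci (as reported in the abstract) need not satisfy the single-locus relationship between the two values.

```python
def expected_heterozygosity(freqs):
    """Expected heterozygosity at one locus from allele frequencies."""
    return 1.0 - sum(p * p for p in freqs)


def effective_alleles(freqs):
    """Effective number of alleles at one locus from allele frequencies."""
    return 1.0 / sum(p * p for p in freqs)


# Hypothetical locus with four equally frequent alleles.
toy_freqs = [0.25, 0.25, 0.25, 0.25]
toy_he = expected_heterozygosity(toy_freqs)
toy_ae = effective_alleles(toy_freqs)
```

With equal frequencies the effective number of alleles equals the actual number of alleles; uneven frequencies pull it below that count.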

Report of the 3rd Japan-Korea Workshop on Acupuncture and EBM; Protocol development for the acupuncture trial on the osteoarthritis of the knee

  • Jang, Jun-Hyouk;Kenji, Kawakita;Hahn, Seo-Kyung;Park, Hi-Joon;Lee, Seung-Deok;Kim, Yong-Suk;Norihito, Takahashi;Toshiyuki, Shichidou;Kazunori, Itoh;Eiji, Sumiya;Eiji, Furuya;Hitoshi, Yamashita;Hiroshi, Tsukayama
    • Journal of Acupuncture Research / v.23 no.6 / pp.239-254 / 2006
  • The 3rd Japan-Korea Workshop on Acupuncture and EBM was held at Kanazawa on June 16th. From the Korean team, 4 papers were presented. Dr. Hahn introduced a new approach to data analysis for a series of n-of-1 trials using Bayesian statistics. It offered important information for future n-of-1 trials. Dr. Park clearly demonstrated the significance of the various sham devices that have been proposed and stressed the importance of the research question when choosing the control intervention in an RCT. Dr. Lee reported the results of a survey of Korean Medical Doctors (KMD) on their point selection and techniques for distal and local points. Dr. Kim presented the results of a 28-item face-to-face survey of KMD on acupuncture treatment for knee OA. Finally, a draft protocol was introduced by Dr. Kim. The title was "multi-center, a randomized, single blinded, two arms, parallel-group study to compare the effectiveness and safety of 'individualized acupuncture' and 'standardized minimal acupuncture' in Korean and Japanese patients with knee osteoarthritis (Phase IV)". From the Japanese team, 7 speakers presented their comments and proposals on the protocol. Dr. Takahashi introduced several issues regarding n-of-1 trials and pointed out the importance of obtaining generalizability from n-of-1 trials. Dr. Shichidou pointed out the importance of research design, selection of outcome measures, and reduction of biases. Dr. Itoh presented the results of point selection for knee OA based on a literature survey. Dr. Sumiya introduced several differences between KMD and Japanese acupuncturists based on the questionnaire used in the KMD survey. Dr. Furuya demonstrated a result for the press tack needle and its sham device on shoulder stiffness. Dr. Yamashita introduced the results of a literature survey on adverse events caused by acupuncture for knee OA. Dr. Tsukayama stressed the importance of the responsibility of the Institutional Review Board (IRB) for the conduct of clinical trials. 
After several issues were discussed, the need for continued meetings to finalize the protocol was agreed on, and the workshop was closed.

Semantic Topic Selection Method of Document for Classification (문서분류를 위한 의미적 주제선정방법)

  • Ko, Kwang-Sup;Kim, Pan-Koo;Lee, Chang-Hoon;Hwang, Myung-Gwon
    • Journal of the Korea Institute of Information and Communication Engineering / v.11 no.1 / pp.163-172 / 2007
  • The web, as a global network, includes text documents, video, sound, and so on, and connects distributed information using links. As the web has developed, it has accumulated abundant information, most of it in text-based documents. Most users use the web to retrieve the information they want. Numerous studies have therefore addressed text document retrieval using many methods, such as probability, statistics, vector similarity, and Bayesian approaches. These studies, however, could not consider both the subject and the semantics of documents, so users have to search again by hand. Korean documents are especially hard to find because research on Korean document classification is insufficient. To overcome these problems, we propose a Korean document classification method for semantic retrieval. The method first extracts the TF value and RV value of the concepts included in a document and maps them onto U-WIN, a Korean vocabulary dictionary, to select the topic of the document. The method can classify documents semantically, and experiments showed its efficiency.