• Title/Summary/Keyword: Random Numbers

A Review of Multivariate Analysis Studies Applied for Plant Morphology in Korea (국내 식물 형태 연구에 사용된 다변량분석 논문에 대한 재고)

  • Chang, Kae Sun;Oh, Hana;Kim, Hui;Lee, Heung Soo;Chang, Chin-Sung
    • Journal of Korean Society of Forest Science / v.98 no.3 / pp.215-224 / 2009
  • This review examines the role of traditional morphometrics in plant morphological studies, based on 54 studies published from 1997 to 2008 in three major journals and others in Korea, such as Journal of Korean Forestry Society, Korean Journal of Plant Taxonomy, Korean Journal of Breeding, Korean Journal of Apiculture, Journal of Life Science, and Korean Journal of Plant Resources. The two most commonly used data analysis techniques, cluster analysis (CA) and principal components analysis (PCA), are discussed along with other statistical tests. A common problem with PCA lies in its underlying assumptions, such as random sampling and a multivariate normal distribution of the data. The procedure was intended mainly for continuous data and was not efficient for data that were not well summarized by variances or covariances. Likewise, CA was most appropriate for categorical rather than continuous data. Also, CA produced clusters whether or not natural groupings existed, and the results depended on both the similarity measure chosen and the clustering algorithm used. Additional problems with PCA and CA arose when both qualitative and quantitative data were analyzed with a limited number of variables and/or too few samples. Some of these problems may be avoided if a sufficient number of variables (more than 20) and samples (at least 40-50) are considered for morphometric analyses, but we do not think that these methods are almighty tools for data analysts. Instead, we believe that reasonable applications, combined with a focus on the objectives and limitations of each procedure, would be a step forward.

Embryo Production in Superior Hanwoo Donors and Embryo Transfer (우수 한우의 수정란 생산 및 이식)

  • Son D.S.;Han M.H.;Choe C.Y.;Choi S.H.;Cho S.R.;Kim H.J.;Ryu I.S.;Choi S.B.;Lee S.S.;Kim Y.K.;Kim S.K.;Kim S.H.;Shin K.H.
    • Journal of Embryo Transfer / v.21 no.2 / pp.147-156 / 2006
  • The objective of this study was to supply excellent genetic resources to livestock farms by transferring embryos produced by genetically superior Korean cows (Hanwoo). Eighty Hanwoo donors were superovulated with gonadotropin (Folltrpin® or Antorin®) for 4 days, with or without insertion of a progesterone-releasing intravaginal device (CIDR). The collected fresh or frozen-thawed embryos were transferred to 226 farm recipients. The effect on embryo production of CIDR insertion combined with gonadotropin (Folltrpin®) treatment initiated at a random stage of the estrous cycle was evaluated and compared with the conventional superovulation protocol. Moreover, the effect of gonadotropin (Antorin®) dose on embryo yield in CIDR-treated Hanwoo donors was determined. In addition, the effects of embryo type (fresh vs. frozen-thawed), the person performing the transfer, season, and farm on the pregnancy rate were evaluated. In Hanwoo donors, CIDR insertion combined with Folltrpin® treatment, regardless of estrus detection, increased the numbers of total ova (6.5 vs. 5.8) and transferable embryos (3.9 vs. 3.2) compared with the conventional superovulation protocol (p<0.01). In CIDR-treated Hanwoo donors, the higher dose of Antorin® (36 vs. 28 mg) increased the number of transferable embryos (8.3 vs. 5.4, p<0.05). Embryo type (fresh 43.9% vs. frozen-thawed 23.1%) and the person performing the transfer (53.9% vs. 0-16.7%) significantly affected the pregnancy rate after embryo transfer (p<0.01). These results suggest that the CIDR-based superovulation protocol may be used effectively to produce superior Hanwoo embryos, and that multiple ovulation and embryo transfer in Hanwoo might be applied effectively for livestock improvement if the pregnancy rate with frozen-thawed embryos and embryo transfer skills are improved.

Selective Word Embedding for Sentence Classification by Considering Information Gain and Word Similarity (문장 분류를 위한 정보 이득 및 유사도에 따른 단어 제거와 선택적 단어 임베딩 방안)

  • Lee, Min Seok;Yang, Seok Woo;Lee, Hong Joo
    • Journal of Intelligence and Information Systems / v.25 no.4 / pp.105-122 / 2019
  • Dimensionality reduction is one of the methods for handling big data in text mining. For dimensionality reduction, we should consider the density of the data, which has a significant influence on the performance of sentence classification. Higher-dimensional data require a lot of computation, which can eventually lead to high computational cost and overfitting in the model. Thus, a dimension reduction process is necessary to improve the performance of the model. Diverse methods have been proposed, ranging from merely reducing noise in the data, such as misspellings or informal text, to incorporating semantic and syntactic information. In addition, the representation and selection of text features affect the performance of the classifier in sentence classification, which is one of the fields of Natural Language Processing. The common goal of dimension reduction is to find a latent space that is representative of the raw data in the observation space. Existing methods utilize various algorithms for dimensionality reduction, such as feature extraction and feature selection. In addition to these algorithms, word embeddings, which learn low-dimensional vector space representations of words that capture semantic and syntactic information from data, are also utilized. To improve performance, recent studies have suggested methods in which the word dictionary is modified according to the positive and negative scores of pre-defined words. The basic idea of this study is that similar words have similar vector representations. Once the feature selection algorithm has identified words that are not important, we assume that words similar to those words also have no impact on sentence classification. This study proposes two ways to achieve more accurate classification, which perform selective word elimination under specific rules and construct word embeddings based on Word2Vec.
To select words of low importance from the text, we use the information gain algorithm to measure importance and cosine similarity to search for similar words. First, we eliminate words that have comparatively low information gain values from the raw text and form the word embedding. Second, we additionally select words that are similar to the words with low information gain values and form the word embedding. Finally, the filtered text and word embeddings are applied to two deep learning models: a Convolutional Neural Network and an Attention-Based Bidirectional LSTM. This study uses customer reviews of Kindle on Amazon.com, IMDB, and Yelp as datasets and classifies each dataset using the deep learning models. Reviews that received more than five helpful votes and whose ratio of helpful votes was over 70% were classified as helpful reviews. Yelp, however, shows only the number of helpful votes. We extracted 100,000 reviews that received more than five helpful votes from among 750,000 reviews using random sampling. Minimal preprocessing, such as removing numbers and special characters from the text data, was applied to each dataset. To evaluate the proposed methods, we compared their performance with that of Word2Vec and GloVe word embeddings that used all the words. We showed that one of the proposed methods performs better than the embeddings with all the words. Removing unimportant words yields better performance; however, removing too many words lowers performance. Future research should consider diverse preprocessing methods and an in-depth analysis of word co-occurrence to measure similarity values among words. Also, we applied the proposed methods only with Word2Vec. Other embedding methods such as GloVe, fastText, and ELMo can be combined with the proposed methods, making it possible to identify the possible combinations of word embedding methods and elimination methods.
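
The two elimination steps described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the thresholds, and the toy two-dimensional vectors (standing in for Word2Vec embeddings) are all assumptions made for illustration. Information gain is computed here for the binary feature "word appears in the document".

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def information_gain(docs, labels, word):
    """Information gain of the binary feature 'word is present in the document'."""
    present = [y for d, y in zip(docs, labels) if word in d]
    absent = [y for d, y in zip(docs, labels) if word not in d]
    n = len(labels)
    conditional = sum(len(part) / n * entropy(part)
                      for part in (present, absent) if part)
    return entropy(labels) - conditional


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def words_to_eliminate(docs, labels, vectors, ig_threshold, sim_threshold):
    """Return (low-IG words, remaining words similar to a low-IG word).

    Both thresholds are hypothetical tuning parameters.
    """
    vocab = set().union(*docs)
    low_ig = {w for w in vocab
              if information_gain(docs, labels, w) < ig_threshold}
    similar = {w for w in vocab - low_ig
               if any(cosine(vectors[w], vectors[s]) >= sim_threshold
                      for s in low_ig)}
    return low_ig, similar


# Toy corpus: each document is a set of tokens; labels are 1 (helpful) / 0.
docs = [{"great", "the"}, {"bad", "the"}, {"great", "the"},
        {"bad", "the"}, {"great", "this"}, {"bad"}]
labels = [1, 0, 1, 0, 1, 0]
# Toy vectors standing in for Word2Vec embeddings (assumption).
vectors = {"the": [1.0, 0.0], "this": [0.95, 0.1],
           "great": [0.0, 1.0], "bad": [0.1, -1.0]}

low_ig, similar = words_to_eliminate(docs, labels, vectors,
                                     ig_threshold=0.1, sim_threshold=0.9)
print(sorted(low_ig), sorted(similar))
```

In this toy corpus "the" carries no information about the label, and "this" is flagged only in the second step because its vector lies close to that of "the".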

A Comparison of the Effects between Eye-Mask and Light-Off Conditions on Psychiatric Patient Sleep (야간 조명 하 안대와 소등의 수면에 대한 효과 비교)

  • Shin, Juyong;Lim, Kyoung-Ok;Cho, Seongnam;Jang, Soyeong;Cha, Seung-Min;Han, Songyi;Kim, Moojin
    • Sleep Medicine and Psychophysiology / v.28 no.1 / pp.27-33 / 2021
  • Objectives: The purpose of this study is to investigate the difference between the effects of an eye-mask and of light-off on sleep status, as measured by a commercial fitness tracker and a sleep diary, in psychiatric in-patients in correctional facilities where nocturnal light is compulsory. Methods: This study was conducted over 3 consecutive nights. In-patients of the National Forensic Psychiatric Hospital (n = 29) were assigned random subject numbers and slept as usual in the light-on condition on the first night. The subjects slept with eye-masks in the light-on condition on one of the remaining nights and without an eye-mask in the light-off condition on the other. Subjects were asked to sleep wearing a commercial fitness tracker and to keep a sleep diary. The order of these bedroom lighting conditions on the second and third nights was assigned randomly to participants. Results: In the comparison of sleep variables between the light-on condition and the eye-mask condition, Wakefulness After Sleep Onset (WASO) was shorter and sleep satisfaction was higher in the latter (respectively, Z = 3.66, p < 0.017; Z = 2.69, p < 0.017). In the comparison of sleep variables between the light-on and light-off conditions, WASO was shorter and sleep efficiency and sleep satisfaction were higher in the latter (respectively, Z = 2.40, p < 0.017; Z = 3.02, p < 0.017; Z = 3.88, p < 0.017). However, there were no differences in the sleep variables between the eye-mask condition and the light-off condition. Conclusion: Subjective improvements in sleep variables were noted in the sleep diaries of institutionalized psychiatric patients under either the 'eye-mask' or the 'light-off' condition. However, there were no significant differences between the 'eye-mask' and 'light-off' conditions. Therefore, we suggest that psychiatric patients in correctional facilities use eye-masks when sleeping.

Strain Improvement of the Genus Pleurotus by Protoplast Fusion (원형질체(原形質體) 융합(融合)에 의한 느타리버섯속(屬)의 품종개발(品種開發))

  • Yoo, Young-Bok;You, Chang-Hyun;Cha, Dong-Yeul
    • The Korean Journal of Mycology / v.21 no.3 / pp.200-211 / 1993
  • Somatic hybrids of Pleurotus florida ASI 2016 and Pleurotus ostreatus ASI 2018 were obtained by protoplast fusion. The 40 fusants (P1-P40) were examined for yield on fermented and pasteurized rice straw in a tray. Their carpophore yields ranged from 27.0 to 155.2, relative to a parental value of 100 (ASI 2018). The pilei of fusants between orange-white P. florida and dark grey P. ostreatus had mixed colors in the young stage. Other breeding programmes were carried out to develop new varieties with high yield and good quality. A new oyster mushroom variety, Wonhyeongneutaribeosus (P72), was developed at the Agricultural Sciences Institute, Rural Development Administration, in 1990. This P. florida-ostreatus-ostreatus hybrid P72 was selected from 38 protoplast fusion products (P41-P78) between the P. florida-ostreatus recombinant P5-M 43-arg rib and P. ostreatus ASI 2-13-0 2001-19-pro orn. The yield indexes of the 38 hybrids ranged from 40.5 to 152.7 compared with a parental value of 100 (ASI 2001). Hybrid P72 was characterized by a large, semispherical fruiting bundle with long stipes and by small, circular pilei, resulting in lower harvesting cost. A significant increase in carpophore production, attributed to heterosis, was observed in the somatic hybrids. Hybrid P72 was compared with its parents using isozyme analysis. The esterase banding patterns of the hybrids were characterized by new bands. Seven fusion products of four crosses between P. florida ASI 2016 and P. ostreatus ASI 2018 were analysed with respect to the distribution of progenies and the segregation of gene markers by random basidiospore analysis. Segregation of alleles should yield progeny of four genotypes in a Mendelian ratio of 1 : 1 : 1 : 1 for prototrophs, auxotrophs of one parental type, auxotrophs of the other parental type, and auxotrophic recombinants, respectively.
However, in five of these fusants, one parental type, P. ostreatus, was not detected. Basidiospores could yield progeny of 16 genotypes in the cross of the recombinant P5-M43-arg rib × P. ostreatus ASI 2-13-pro orn, but the segregants of three fusants were not clearly detected. The allele ratio of the loci would be expected to be 1 : 1 : 1 : 1 for arg, rib, pro and orn. The ratio, however, would change to 4 : 1 : 1 : 1 with an increasing proportion of arg. In almost all the fusants, prototrophic recombinants were recovered in large numbers relative to auxotrophic markers. Parental genotypes were recovered along with recombinant progeny amounting to 38.68-99.56%. The analysis provides proof of heterokaryosis and strong evidence for haploidy of vegetative nuclei and for a sexual cycle consisting of nuclear fusion and meiosis.
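
As a worked illustration of how observed progeny counts can be compared with an expected segregation ratio such as 1 : 1 : 1 : 1 or 4 : 1 : 1 : 1, the sketch below computes a chi-square goodness-of-fit statistic. The abstract does not state which test, if any, the authors used, and the counts are hypothetical, so this is an assumption-laden example rather than a reproduction of the analysis.

```python
def chi_square_stat(observed, ratio):
    """Chi-square goodness-of-fit statistic of observed counts against a ratio."""
    total = sum(observed)
    ratio_sum = sum(ratio)
    expected = [total * r / ratio_sum for r in ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Critical value of the chi-square distribution for 3 degrees of freedom
# (four genotype classes minus one) at a significance level of 0.05.
CRITICAL_DF3_ALPHA05 = 7.815

# Hypothetical counts of the four genotype classes among 100 basidiospores.
counts = [30, 22, 25, 23]
stat = chi_square_stat(counts, [1, 1, 1, 1])
fits = stat < CRITICAL_DF3_ALPHA05  # True means no significant deviation
print(f"chi-square = {stat:.2f}, consistent with 1:1:1:1: {fits}")
```

With these made-up counts the statistic is well below the critical value, so the 1 : 1 : 1 : 1 hypothesis would not be rejected. A class that is entirely missing, as reported for the P. ostreatus parental type, would inflate the statistic sharply.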

A Time Series Graph based Convolutional Neural Network Model for Effective Input Variable Pattern Learning : Application to the Prediction of Stock Market (효과적인 입력변수 패턴 학습을 위한 시계열 그래프 기반 합성곱 신경망 모형: 주식시장 예측에의 응용)

  • Lee, Mo-Se;Ahn, Hyunchul
    • Journal of Intelligence and Information Systems / v.24 no.1 / pp.167-181 / 2018
  • Over the past decade, deep learning has been in the spotlight among machine learning algorithms. In particular, the CNN (Convolutional Neural Network), known as an effective solution for recognizing and classifying images or voices, has been widely applied to classification and prediction problems. In this study, we investigate how to apply CNNs to business problem solving. Specifically, this study proposes applying a CNN to stock market prediction, one of the most challenging tasks in machine learning research. As mentioned, CNNs are strong at interpreting images. Thus, the model proposed in this study adopts a CNN as a binary classifier that predicts stock market direction (upward or downward) using time series graphs as its inputs. That is, our proposal is to build a machine learning algorithm that mimics the experts called 'technical analysts', who examine graphs of past price movements and predict future financial price movements. Our proposed model, named 'CNN-FG (Convolutional Neural Network using Fluctuation Graph)', consists of five steps. In the first step, it divides the dataset into intervals of 5 days. In step 2, it creates time series graphs for the divided dataset. The size of the image in which each graph is drawn is 40 × 40 pixels, and the graph of each independent variable is drawn in a different color. In step 3, the model converts the images into matrices. Each image is converted into a combination of three matrices in order to express the color values on the R (red), G (green), and B (blue) scales. In the next step, it splits the dataset of graph images into training and validation datasets. We used 80% of the total dataset as the training dataset and the remaining 20% as the validation dataset. In the final step, CNN classifiers are trained using the images of the training dataset.
Regarding the parameters of CNN-FG, we adopted two convolution filters (5 × 5 × 6 and 5 × 5 × 9) in the convolution layer. In the pooling layer, a 2 × 2 max pooling filter was used. The numbers of nodes in the two hidden layers were set to 900 and 32, respectively, and the number of nodes in the output layer was set to 2 (one for the prediction of an upward trend, the other for a downward trend). The activation functions for the convolution layer and the hidden layers were set to ReLU (Rectified Linear Unit), and the one for the output layer to the Softmax function. To validate our model, CNN-FG, we applied it to the prediction of KOSPI200 for 2,026 days in eight years (from 2009 to 2016). To match the proportions of the two groups in the dependent variable (i.e. tomorrow's stock market movement), we selected 1,950 samples by random sampling. Finally, we built the training dataset using 80% of the total dataset (1,560 samples) and the validation dataset using 20% (390 samples). The independent variables of the experimental dataset included twelve technical indicators widely used in previous studies. They include Stochastic %K, Stochastic %D, Momentum, ROC (rate of change), LW %R (Larry Williams' %R), A/D oscillator (accumulation/distribution oscillator), OSCP (price oscillator), CCI (commodity channel index), and so on. To confirm the superiority of CNN-FG, we compared its prediction accuracy with that of other classification models. Experimental results showed that CNN-FG outperforms LOGIT (logistic regression), ANN (artificial neural network), and SVM (support vector machine) with statistical significance. These empirical results imply that converting time series business data into graphs and building CNN-based classification models using these graphs can be effective from the perspective of prediction accuracy.
Thus, this paper sheds light on how to apply deep learning techniques to the domain of business problem solving.
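
The sampling and splitting step described in this abstract (balancing the two classes by random sampling, then an 80/20 training/validation split) can be sketched as follows. This is a minimal illustration, not the authors' code: the function name, the seed, and the per-class day counts (1,051 and 975, chosen only so that they sum to the 2,026 days and yield the 1,950 balanced samples mentioned above) are assumptions.

```python
import random


def balanced_split(labels, train_ratio=0.8, seed=0):
    """Undersample every class to the size of the smallest one, then split.

    Returns two lists of sample indices: (training, validation).
    """
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    # Draw the same number of samples from each class without replacement.
    per_class = min(len(indices) for indices in by_class.values())
    selected = []
    for indices in by_class.values():
        selected.extend(rng.sample(indices, per_class))

    # Shuffle so that both subsets mix the classes, then cut at train_ratio.
    rng.shuffle(selected)
    cut = round(len(selected) * train_ratio)
    return selected[:cut], selected[cut:]


# Hypothetical label vector: 1 = upward day, 0 = downward day.
labels = [1] * 1051 + [0] * 975
train_idx, val_idx = balanced_split(labels)
print(len(train_idx), len(val_idx))
```

Because the split is made after shuffling the pooled sample, the two subsets are only approximately balanced. A stratified split would be needed to keep each subset at exactly 50/50.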