• Title/Summary/Keyword: Random vectors

A Combination Strategy for Construction of Peptide-β2m-H-2Kb Single Chain with Overlap Extension PCR and One-Step Cloning

  • Xu, Tao;Li, Xiaoe;Wu, You;Shahzad, Khawar Ali;Wang, Wei;Zhang, Lei;Shen, Chuanlai
    • Journal of Microbiology and Biotechnology
    • /
    • v.26 no.12
    • /
    • pp.2184-2191
    • /
    • 2016
  • The time-consuming and high-cost preparation of soluble peptide-major histocompatibility complexes (pMHC) currently limits their wide use in monitoring antigen-specific T cells. The single-chain trimer (SCT) of peptide-${\beta}2m$-MHC class I heavy chain was developed as an alternative strategy, but its gene fusion is hindered in many cases owing to the incompatibility between the multiple restriction enzymes and the restriction endonuclease sites of plasmid vectors. In this study, overlap extension PCR and one-step cloning were adopted to overcome this restriction. The SCT gene of the $OVA_{257-264}$ peptide-$(GS_4)_3-{\beta}2m-(GS_4)_4-H-2K^b$ heavy chain was constructed and inserted into plasmid pET28a by overlap extension PCR and one-step cloning, without the requirement of restriction enzymes. The SCT protein was expressed in Escherichia coli, and then purified and refolded. The resulting $H-2K^b/OVA_{257-264}$ complex showed the correct structural conformation and the capability to bind the $OVA_{257-264}$-specific T-cell receptor. The overlap extension PCR and one-step cloning ensure the construction of single-chain MHC class I molecules associated with random epitopes, and will facilitate the preparation of soluble pMHC multimers.

Bivariate ROC Curve (이변량 ROC곡선)

  • Hong, C.S.;Kim, G.C.;Jeong, J.A.
    • Communications for Statistical Applications and Methods
    • /
    • v.19 no.2
    • /
    • pp.277-286
    • /
    • 2012
  • For credit assessment models, ROC curves evaluate classification performance using two univariate cumulative distribution functions, which give the false positive rate and the true positive rate. In this paper, the approach is extended to two bivariate normal distribution functions of default and non-default borrowers. In addition, bivariate ROC curves are proposed to represent the joint cumulative distribution functions by making use of the linear function that passes through the mean vectors of the two score random variables. We explore the classification performance based on the ROC curves obtained from various bivariate normal distributions and analyze it with the corresponding AUROC. The optimal threshold can be derived from the bivariate ROC curve using many well-known classification criteria, and it is possible to establish an optimal cut-off criterion for bivariate mixture distribution functions.
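One way to picture the idea of reducing two bivariate normal score distributions to a single ROC summary is to project both onto the line through their mean vectors. The following is a minimal sketch under that assumption, not the authors' construction: the function names, the projection step, and the closed-form binormal AUROC are illustrative choices.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def project(mean, cov, w):
    """Mean and variance of w . X for a bivariate normal X ~ N(mean, cov)."""
    m = w[0] * mean[0] + w[1] * mean[1]
    v = (w[0] * w[0] * cov[0][0]
         + 2.0 * w[0] * w[1] * cov[0][1]
         + w[1] * w[1] * cov[1][1])
    return m, v


def projected_auroc(mean_d, cov_d, mean_n, cov_n):
    """AUROC of the scores projected onto the line joining the two means.

    mean_d/cov_d describe default borrowers, mean_n/cov_n non-default ones.
    Assumption: higher projected scores indicate non-default.
    """
    dx, dy = mean_n[0] - mean_d[0], mean_n[1] - mean_d[1]
    norm = math.hypot(dx, dy)
    if norm == 0.0:
        raise ValueError("the two mean vectors must differ")
    w = (dx / norm, dy / norm)  # unit direction through both means
    m_d, v_d = project(mean_d, cov_d, w)
    m_n, v_n = project(mean_n, cov_n, w)
    # Binormal AUROC: P(non-default score > default score).
    return normal_cdf((m_n - m_d) / math.sqrt(v_d + v_n))


identity = [[1.0, 0.0], [0.0, 1.0]]
auroc = projected_auroc((0.0, 0.0), identity, (1.0, 1.0), identity)
```

With identity covariances and means one unit apart in each coordinate, the projected distributions are univariate normals, so the usual binormal AUROC formula applies directly.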

Genetic Diversity and Population Structure of Chimaphila japonica in Southern Part of Korea (한국 남부지역의 매화노루발의 유전적 다양성과 집단구조)

  • Joo-Soo Choi;Man-Kyu Huh
    • Journal of Life Science
    • /
    • v.8 no.6
    • /
    • pp.687-694
    • /
    • 1998
  • Enzyme electrophoresis was used to estimate the genetic diversity and population structure of Chimaphila japonica Miq. in Korea. The percentage of polymorphic loci within the enzymes was 48.7%. Genetic diversity at the species level and at the population level was high (Hes=0.278 and Hep=0.222, respectively), whereas the extent of population divergence was relatively low ($G_{ST}$=0.079). $F_{IS}$, a measure of the deviation from random mating within the 7 populations, was 0.355. An indirect estimate of the number of migrants per generation (Nm=2.61) indicates that gene flow is high among Korean populations of the species. In addition, analysis of fixation indices revealed a substantial heterozygosity deficiency in some populations and at some loci. Factors contributing to the high levels of genetic diversity found in the entire species of C. japonica include its wide distribution, long-lived perennial habit, ability to regenerate through rhizomatous spread, outcrossing induced by animal vectors, and occasional pollen dispersal by wind.

Undecided inference using logistic regression for credit evaluation (신용평가에서 로지스틱 회귀를 이용한 미결정자 추론)

  • Hong, Chong-Sun;Jung, Min-Sub
    • Journal of the Korean Data and Information Science Society
    • /
    • v.22 no.2
    • /
    • pp.149-157
    • /
    • 2011
  • Undecided inference can be regarded as a missing data problem under mechanisms such as MAR (missing at random) and MNAR (missing not at random). Under the assumption of MAR, undecided inference makes use of a logistic regression model: the probability of default for the undecided group is obtained with the regression coefficient vectors for the decided group and compared with the probability of default for the decided group. Under the assumption of MNAR, undecided inference makes use of a logistic regression model with an additional feature random vector. Simulation results based on two kinds of real data are obtained and compared. It is found that the misclassification rates are not much different from the rate of the raw data under the assumption of MAR. However, the misclassification rates under the assumption of MNAR are lower than those under the assumption of MAR, and as the ratio of the undecided group increases, the misclassification rates decrease.
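The MAR step described above, fitting coefficients on the decided group and applying them to the undecided group, can be sketched as follows. This is an illustrative sketch only: the single feature, the toy data, the gradient-ascent fit, and all function names are assumptions rather than anything taken from the paper.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_default_prob(weights, x):
    """P(default | x); weights[0] is the intercept."""
    z = weights[0] + sum(w * xk for w, xk in zip(weights[1:], x))
    return sigmoid(z)


def fit_logistic(rows, labels, lr=0.1, n_iter=2000):
    """Fit logistic regression by gradient ascent on the mean log-likelihood."""
    n_features = len(rows[0])
    weights = [0.0] * (n_features + 1)
    for _ in range(n_iter):
        grad = [0.0] * (n_features + 1)
        for x, y in zip(rows, labels):
            err = y - predict_default_prob(weights, x)
            grad[0] += err
            for k, xk in enumerate(x):
                grad[k + 1] += err * xk
        weights = [w + lr * g / len(rows) for w, g in zip(weights, grad)]
    return weights


# Hypothetical decided group: one risk feature, label 1 = default.
decided_x = [[-2.0], [-1.5], [-1.0], [-0.5], [0.5], [1.0], [1.5], [2.0]]
decided_y = [0, 0, 0, 1, 0, 1, 1, 1]
weights = fit_logistic(decided_x, decided_y)

# Hypothetical undecided group: scored with the decided-group coefficients.
undecided_x = [[-2.0], [0.0], [2.0]]
undecided_probs = [predict_default_prob(weights, x) for x in undecided_x]
```

Under MAR the outcome is assumed to depend only on the observed features, which is why reusing the decided-group coefficients is reasonable in this sketch; the MNAR variant with an additional feature random vector is not modeled here.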

MCMC Algorithm for Dirichlet Distribution over Gridded Simplex (그리드 단체 위의 디리슐레 분포에서 마르코프 연쇄 몬테 칼로 표집)

  • Sin, Bong-Kee
    • KIISE Transactions on Computing Practices
    • /
    • v.21 no.1
    • /
    • pp.94-99
    • /
    • 2015
  • With the recent machine learning paradigm of using nonparametric Bayesian statistics and statistical inference based on random sampling, the Dirichlet distribution finds many uses in a variety of graphical models. It is a multivariate generalization of the beta distribution and is defined on a continuous (K-1)-simplex. This paper presents a sampling method for a Dirichlet distribution for the problem of dividing an integer X into a sequence of K integers which sum to X. The target samples in our problem are all positive integer vectors when multiplied by a given X. They must be sampled from the correspondingly gridded simplex. In this paper we develop a Markov Chain Monte Carlo (MCMC) proposal distribution for the neighborhood grid points on the simplex and then present the complete algorithm based on the Metropolis-Hastings algorithm. The proposed algorithm can be used for the Markov model, HMM, and semi-Markov model for accurate state-duration modeling. It can also be used for the Gamma-Dirichlet HMM to model the global-local duration distributions.
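A Metropolis-Hastings sampler over the gridded simplex might look roughly like the sketch below. The abstract does not specify the proposal distribution, so the move used here (shifting one unit between two randomly chosen components) is an assumption, as are the function names and the starting point.

```python
import math
import random


def log_dirichlet_unnorm(counts, alpha):
    """Unnormalized Dirichlet log-density at the grid point counts / sum(counts)."""
    total = sum(counts)
    return sum((a - 1.0) * math.log(c / total) for c, a in zip(counts, alpha))


def mh_step(counts, alpha, rng):
    """One Metropolis-Hastings step: move one unit from component i to j."""
    i, j = rng.sample(range(len(counts)), 2)
    if counts[i] <= 1:
        return counts  # proposal would leave the positive grid: stay put
    proposal = list(counts)
    proposal[i] -= 1
    proposal[j] += 1
    log_ratio = (log_dirichlet_unnorm(proposal, alpha)
                 - log_dirichlet_unnorm(counts, alpha))
    # The neighbor proposal is symmetric, so the acceptance ratio is just
    # the ratio of target densities.
    if rng.random() < math.exp(min(0.0, log_ratio)):
        return proposal
    return counts


def sample_grid_dirichlet(x, alpha, n_steps, seed=0):
    """Sample positive integer vectors of length K = len(alpha) summing to x."""
    k = len(alpha)
    if k < 2 or x < k:
        raise ValueError("need K >= 2 and X >= K for positive integer parts")
    rng = random.Random(seed)
    counts = [x // k] * k          # start from a near-uniform split of x
    for idx in range(x % k):
        counts[idx] += 1
    samples = []
    for _ in range(n_steps):
        counts = mh_step(counts, alpha, rng)
        samples.append(tuple(counts))
    return samples


samples = sample_grid_dirichlet(10, [2.0, 3.0, 5.0], 500, seed=1)
```

Every sample is a vector of K positive integers summing to X, so dividing by X gives a point on the gridded simplex. In practice early samples would normally be discarded as burn-in.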

Detection of Depression Trends in Literary Cyber Writers Using Sentiment Analysis and Machine Learning

  • Faiza Nasir;Haseeb Ahmad;CM Nadeem Faisal;Qaisar Abbas;Mubarak Albathan;Ayyaz Hussain
    • International Journal of Computer Science & Network Security
    • /
    • v.23 no.3
    • /
    • pp.67-80
    • /
    • 2023
  • Nowadays, psychologists consider social media an important tool to examine mental disorders. Among these disorders, depression is one of the most common yet least cured diseases. Since many writers with extensive followings express their feelings on social media and depression is increasing significantly, exploring the literary text shared on social media may reveal multidimensional features of depressive behaviors: (1) Background: Several studies have observed that depressive data contain certain language styles and self-expressing pronouns, but the current study provides evidence that posts with self-expressing pronouns and depressive language styles contain high emotional temperatures. Therefore, the main objective of this study is to examine literary cyber writers' posts to discover symptomatic signs of depression. For this purpose, our research focuses on extracting data from writers' public social media pages, blogs, and communities; (3) Results: To examine the emotional temperatures and sentence usage of the depressive and non-depressive groups, we employed the SentiStrength algorithm as a psycholinguistic method, TF-IDF and N-gram for ranked phrase extraction, and Latent Dirichlet Allocation for topic modelling of the extracted phrases. The results reveal a strong connection between depression and negative emotional temperatures in writers' posts. Moreover, we used Naïve Bayes, Support Vector Machines, Random Forest, and Decision Tree algorithms to validate the classification of depressive and non-depressive content in terms of sentences, phrases, and topics. The results reveal that, compared with the others, the Support Vector Machines algorithm validates the classification while attaining the highest F-score of 79%; (4) Conclusions: Experimental results show that the proposed system performed well in detecting depression trends in literary cyber writers using sentiment analysis.

Corpus of Eye Movements in L3 Spanish Reading: A Prediction Model

  • Hui-Chuan Lu;Li-Chi Kao;Zong-Han Li;Wen-Hsiang Lu;An-Chung Cheng
    • Asia Pacific Journal of Corpus Research
    • /
    • v.5 no.1
    • /
    • pp.23-36
    • /
    • 2024
  • This research centers on the Taiwan Eye-Movement Corpus of Spanish (TECS), a specially created corpus comprising eye-tracking data from Chinese-speaking learners of Spanish as a third language in Taiwan. Its primary purpose is to explore the broad utility of TECS in understanding language learning processes, particularly the initial stages of language learning. Constructing this corpus involves gathering data on eye-tracking, reading comprehension, and language proficiency to develop a machine-learning model that predicts learner behaviors and that subsequently undergoes a predictability test for validation. The focus is on examining attention in input processing and its relationship to language learning outcomes. The TECS eye-tracking data consist of indicators derived from eye movement recordings made while learners read Spanish sentences with temporal references. These indicators are obtained from eye movement experiments focusing on tense verbal inflections and temporal adverbs. Chinese expresses tense using aspect markers, lexical references, and contextual cues, differing significantly from inflectional languages like Spanish. Chinese-speaking learners of Spanish therefore face particular challenges in learning verbal morphology and tenses. The data from the eye movement experiments were structured into feature vectors, with learner behaviors serving as class labels. After categorizing the collected data, we used two types of machine learning methods for classification and regression: Random Forests and the k-nearest neighbors algorithm (KNN). By leveraging these algorithms, we predicted learner behaviors and conducted performance evaluations to enhance our understanding of the nexus between learner behaviors and the language learning process. Future research may further enrich TECS by gathering data from subsequent eye-movement experiments, specifically targeting various Spanish tenses and temporal lexical references during text reading. These endeavors promise to broaden and refine the corpus, advancing our understanding of language processing.
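To make the "feature vectors with class labels" setup concrete, here is a minimal k-nearest neighbors classifier. Everything in it is hypothetical: the two features (imagined as normalized fixation duration and regression rate), the toy values, and the labels are invented for illustration and are not TECS data.

```python
import math
from collections import Counter


def knn_predict(train_vectors, train_labels, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest
    training vectors (Euclidean distance)."""
    distances = sorted(
        (math.dist(vec, query), label)
        for vec, label in zip(train_vectors, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical, already normalized feature vectors:
# (fixation duration, regression rate) per reading trial.
train_vectors = [
    (0.20, 0.10), (0.30, 0.20), (0.25, 0.15),   # trials answered correctly
    (0.80, 0.90), (0.70, 0.80), (0.90, 0.70),   # trials answered incorrectly
]
train_labels = ["correct"] * 3 + ["incorrect"] * 3

prediction_a = knn_predict(train_vectors, train_labels, (0.28, 0.18))
prediction_b = knn_predict(train_vectors, train_labels, (0.85, 0.80))
```

Because KNN relies on raw distances, features on different scales would normally be normalized first, which is why the toy values above are kept in the range 0 to 1.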

Separations and Feature Extractions for Image Signals Using Independent Component Analysis Based on Neural Networks of Efficient Learning Rule (효율적인 학습규칙의 신경망 기반 독립성분분석을 이용한 영상신호의 분리 및 특징추출)

  • Cho, Yong-Hyun
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.13 no.2
    • /
    • pp.200-208
    • /
    • 2003
  • This paper proposes the separation and feature extraction of image signals using independent component analysis (ICA) based on neural networks with an efficient learning rule. The proposed learning rule is a hybrid fixed-point (FP) algorithm based on the secant method and momentum. The secant method is applied to improve performance by simplifying the first-order derivative computation for optimizing the objective function, which is to minimize the mutual information of the independent components. The momentum is applied for high-speed convergence by restraining oscillation in the process of converging to the optimal solution. The proposed algorithm has been applied to composite images generated by a random mixing matrix from 10 images of $512\times512$ pixels. The simulation results show that the proposed algorithm has better separation speed and rate than the FP algorithms based on Newton's method and the secant method. The proposed algorithm has also been applied to extract features using 3 sets of 10,000 image patches from 10 fingerprints of $256\times256$ pixels and from the front and rear of paper money of $480\times225$ pixels, respectively. The simulation results show that the proposed algorithm also has better extraction speed than the other methods. In particular, the 160 basis vectors (features) of $16\times16$ pixels show local features that have the characteristics of spatial frequency and oriented edges in the images.

Floating Point Unit Design for the IEEE754-2008 (IEEE754-2008을 위한 고속 부동소수점 연산기 설계)

  • Hwang, Jin-Ha;Kim, Hyun-Pil;Park, Sang-Su;Lee, Yong-Surk
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.48 no.10
    • /
    • pp.82-90
    • /
    • 2011
  • With the development of smartphone devices, the demand for high-performance floating-point units (FPUs) is increasing. Therefore, we propose a high-speed single-/double-precision FPU design that includes an elementary add/sub unit, an improved multiplier, and compare and convert units. The most commonly used unit, the add/sub unit, is optimized with a parallel rounding unit. Matrix operations are used in complex calculations such as graphics calculations, so we designed a Multiply-Add Fused (MAF) unit instead of a multiplier to calculate matrices more quickly. Branch instructions that are decided by the compare operation are used very frequently in various programs, so we bypassed the result of the compare operation before all the pipeline processes ended to decrease the total execution time. We also included the additional convert operations that were added in the IEEE754-2008 standard. To verify our RTL designs, we chose four hundred thousand test vectors by a weighted random method and simulated each unit. The FPU synthesized with Samsung's 45-nm low-power process satisfied the 600-MHz operating frequency. We also confirmed a reduction in area by comparing the improved FPU with the existing FPU.
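Weighted random test-vector generation for a floating-point unit can be sketched as below. The abstract does not say how the weights were chosen, so the operand classes, the weights, and the function names here are assumptions; the idea is only that corner cases (zeros, subnormals, infinities, NaNs) are drawn more often than uniform random bits would produce them.

```python
import random
import struct

# Hypothetical operand classes and weights (not taken from the paper).
CLASSES = ["zero", "subnormal", "normal", "infinity", "nan"]
WEIGHTS = [1, 2, 5, 1, 1]


def random_double_bits(kind, rng):
    """Build a 64-bit IEEE 754 double-precision pattern of the given class."""
    sign = rng.getrandbits(1) << 63
    if kind == "zero":
        exponent, fraction = 0, 0
    elif kind == "subnormal":
        exponent, fraction = 0, rng.randrange(1, 1 << 52)
    elif kind == "normal":
        exponent, fraction = rng.randrange(1, 0x7FF), rng.getrandbits(52)
    elif kind == "infinity":
        exponent, fraction = 0x7FF, 0
    else:  # NaN: all-ones exponent with a nonzero fraction
        exponent, fraction = 0x7FF, rng.randrange(1, 1 << 52)
    return sign | (exponent << 52) | fraction


def bits_to_float(bits):
    """Reinterpret a 64-bit integer as a Python float."""
    return struct.unpack(">d", struct.pack(">Q", bits))[0]


def make_test_vectors(n, seed=0):
    """Return n (class, bit pattern) pairs drawn with the class weights."""
    rng = random.Random(seed)
    kinds = rng.choices(CLASSES, weights=WEIGHTS, k=n)
    return [(kind, random_double_bits(kind, rng)) for kind in kinds]


vectors = make_test_vectors(200, seed=42)
```

Each bit pattern could be fed to an RTL simulation as an operand, with `bits_to_float` giving a software reference value for comparison.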

Human Visual Perception-Based Quantization For Efficiency HEVC Encoder (HEVC 부호화기 고효율 압축을 위한 인지시각 특징기반 양자화 방법)

  • Kim, Young-Woong;Ahn, Yong-Jo;Sim, Donggyu
    • Journal of Broadcast Engineering
    • /
    • v.22 no.1
    • /
    • pp.28-41
    • /
    • 2017
  • In this paper, a fast encoding algorithm for the High Efficiency Video Coding (HEVC) encoder is studied. For encoding efficiency, the current HEVC reference software divides the input image into Coding Tree Units (CTUs). Each CTU is then re-divided into Coding Units (CUs) up to the maximum depth in the form of a quad-tree for Rate-Distortion Optimization (RDO) in the encoding process. However, this is one of the reasons why the complexity of the encoding process is high. To reduce this complexity, this paper proposes a method that determines the maximum depth of the CU using hierarchical clustering at the pre-processing stage. The hierarchical clustering results represent an average combination of the motion vectors (MVs) of neighboring blocks. Experimental results showed that the proposed method could achieve an average time saving of 16% with minimal BD-rate loss at 1080p video resolution. When combined with the previous fast algorithm, the proposed method could achieve an average time saving of 45.13% with a BD-rate loss of 1.84%.