• Title/Summary/Keyword: model rank


An Application of the Clustering Threshold Gradient Descent Regularization Method for Selecting Genes in Predicting the Survival Time of Lung Carcinomas

  • Lee, Seung-Yeoun;Kim, Young-Chul
    • Genomics & Informatics
    • /
    • v.5 no.3
    • /
    • pp.95-101
    • /
    • 2007
  • In this paper, we consider the variable selection methods in the Cox model when a large number of gene expression levels are involved with survival time. Deciding which genes are associated with survival time has been a challenging problem because of the large number of genes and relatively small sample size (n<

Patent citation network analysis (특허 인용 네트워크 분석)

  • Lee, Minjung;Kim, Yongdai;Jang, Woncheol
    • The Korean Journal of Applied Statistics
    • /
    • v.29 no.4
    • /
    • pp.613-625
    • /
    • 2016
  • The development of technology has changed the world drastically. Patent data analysis helps to understand modern technology trends and predict prospective future technology. In this paper, we analyze the patent citation network using the USPTO data between 1985 and 2012 to identify technology trends. We use network centrality measures that include a PageRank algorithm to find core technologies and identify groups of technology with similar properties with statistical network models.
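As a rough illustration of the PageRank centrality mentioned in the abstract above, the following is a minimal sketch of power-iteration PageRank on a toy citation graph. The patent identifiers, the damping factor of 0.85, and the uniform handling of patents that cite nothing are all assumptions made for illustration; the paper's actual data and settings are not given here.

```python
def pagerank(citations, damping=0.85, tol=1e-10, max_iter=200):
    """Power-iteration PageRank.

    `citations` maps each patent to the list of patents it cites
    (an edge points from the citing patent to the cited patent).
    """
    nodes = sorted(set(citations) | {c for cited in citations.values() for c in cited})
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(max_iter):
        # Rank held by patents that cite nothing is spread uniformly (assumption).
        dangling = sum(rank[v] for v in nodes if not citations.get(v))
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for citing, cited_list in citations.items():
            if cited_list:
                share = damping * rank[citing] / len(cited_list)
                for cited in cited_list:
                    new_rank[cited] += share
        converged = sum(abs(new_rank[v] - rank[v]) for v in nodes) < tol
        rank = new_rank
        if converged:
            break
    return rank


# Hypothetical toy network: P1 is cited by every other patent.
toy_citations = {"P1": [], "P2": ["P1"], "P3": ["P1", "P2"], "P4": ["P1", "P3"]}
scores = pagerank(toy_citations)
core_patent = max(scores, key=scores.get)
```

In this toy graph the most-cited patent receives the highest score, which is the sense in which a centrality measure can point to "core" technologies.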

A new approach to model reduction using matrix pencil method (Matrix Pencil을 이용한 모델 저차화의 새로운 접근방법)

  • 권혁성;정정주;서병설
    • Institute of Control, Robotics and Systems: Conference Proceedings (제어로봇시스템학회:학술대회논문집)
    • /
    • 1997.10a
    • /
    • pp.105-108
    • /
    • 1997
  • This paper proposes a new approach to balanced model reduction using the matrix pencil method. The algorithm converts a full-rank high-order system into a rank-deficient system using a perturbation obtained by the matrix pencil method. The system can then be truncated to the desired low-order system via balanced realization. We compare the approach with other methods and report observations from simulations.


An Orthogonal Representation of Estimable Functions

  • Yi, Seong-Baek
    • Communications for Statistical Applications and Methods
    • /
    • v.15 no.6
    • /
    • pp.837-842
    • /
    • 2008
  • Students taking linear model courses have difficulty determining which parametric functions are estimable when the design matrix of a linear model is rank deficient. In this note, a special form of estimable functions is presented as a linear combination of some orthogonal estimable functions. Here, orthogonality means that the least squares estimators of the estimable functions are uncorrelated and have the same variance. The number of orthogonal estimable functions composing the special form is equal to the rank of the design matrix. The orthogonal estimable functions can be easily obtained through the singular value decomposition of the design matrix.

Variable Selection with Nonconcave Penalty Function on Reduced-Rank Regression

  • Jung, Sang Yong;Park, Chongsun
    • Communications for Statistical Applications and Methods
    • /
    • v.22 no.1
    • /
    • pp.41-54
    • /
    • 2015
  • In this article, we propose nonconcave penalties on a reduced-rank regression model to select variables and estimate coefficients simultaneously. We apply HARD (hard thresholding) and SCAD (smoothly clipped absolute deviation) penalty functions, which are symmetric, have singularities at the origin, and are bounded by a constant to reduce bias. In our simulation study and real data analysis, the new method is compared with an existing variable selection method using an $L_1$ penalty that exhibits competitive performance in prediction and variable selection. Instead of using only one type of penalty function, we use two or three penalty functions simultaneously, taking advantage of the different types of penalty to select relevant predictors and estimate coefficients, and thereby improve the overall performance of model fitting.
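To make the two penalties named in the abstract above concrete, here is a minimal sketch of the SCAD and hard-thresholding penalty functions in the forms commonly given in the variable-selection literature. The parameter names `lam` and `a`, and the default `a = 3.7`, are assumptions; the paper's exact parameterization may differ.

```python
def scad_penalty(theta, lam, a=3.7):
    """SCAD penalty in its common textbook form (assumes lam > 0 and a > 2)."""
    t = abs(theta)
    if t <= lam:
        # Behaves like the L1 penalty near the origin (singular at 0).
        return lam * t
    if t <= a * lam:
        # Quadratic segment that joins the linear part to the constant part.
        return (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))
    # Constant for large coefficients, so they are not shrunk further.
    return lam * lam * (a + 1) / 2


def hard_penalty(theta, lam):
    """Hard-thresholding penalty: lam^2 - (|theta| - lam)^2 for |theta| < lam, else lam^2."""
    t = abs(theta)
    if t < lam:
        return lam * lam - (t - lam) ** 2
    return lam * lam


# Evaluate both penalties on a small grid of coefficient values.
grid = [-5.0, -1.0, 0.0, 0.5, 1.0, 2.0, 5.0]
scad_values = [scad_penalty(x, lam=1.0) for x in grid]
hard_values = [hard_penalty(x, lam=1.0) for x in grid]
```

Both functions are symmetric in `theta`, have a kink at zero, and level off at a constant, which matches the three properties the abstract attributes to them.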

Frame Selection, Hybrid, Modified Weighting Model Rank Method for Robust Text-independent Speaker Identification (강건한 문맥독립 화자식별을 위한 프레임 선택방법, 복합방법, 수정된 가중모델순위 방법)

  • 김민정;오세진;정호열;정현열
    • The Journal of the Acoustical Society of Korea
    • /
    • v.21 no.8
    • /
    • pp.735-743
    • /
    • 2002
  • In this paper, we propose three new text-independent speaker identification methods. First, to exclude frames that do not carry enough of the speaker's vocal characteristics from the maximum likelihood calculation, we propose the FS (Frame Selection) method. This approach selects the important frames by evaluating the difference between the largest and the second-largest likelihood in each frame, and uses only those frames to calculate the likelihood score. The second method, called Hybrid, combines FS with WMR (Weighting Model Rank). It determines the claimed speaker using exponential function weights, instead of the likelihood itself, only on the frames selected by the FS method. The third method, called MWMR (Modified WMR), considers both the original likelihood and its relative position when determining the claimed speaker. It differs from WMR, which takes into account only the relative position of the likelihood. Speaker identification experiments show that all the proposed methods achieve higher identification rates than ML. In addition, Hybrid and MWMR achieve identification rates about 2% and about 3% higher than WMR, respectively.
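The frame selection idea described in the abstract above can be sketched roughly as follows. This is an illustrative reading, not the authors' implementation: the fixed threshold `min_gap`, the fallback to all frames when none is selected, and the toy log-likelihood values are all assumptions.

```python
def identify_speaker(frame_loglik, min_gap):
    """Pick a speaker using only frames with a clear likelihood winner.

    frame_loglik[t][s] is the log-likelihood of frame t under speaker model s.
    A frame is kept when the gap between its largest and second-largest
    log-likelihood is at least `min_gap` (hypothetical threshold).
    Requires at least two speaker models.
    """
    n_speakers = len(frame_loglik[0])
    totals = [0.0] * n_speakers
    kept = 0
    for scores in frame_loglik:
        best, second = sorted(scores, reverse=True)[:2]
        if best - second >= min_gap:
            kept += 1
            for s, value in enumerate(scores):
                totals[s] += value
    if kept == 0:
        # Assumed fallback: no frame passed the test, so score on all frames.
        totals = [sum(frame[s] for frame in frame_loglik) for s in range(n_speakers)]
    winner = max(range(n_speakers), key=totals.__getitem__)
    return winner, kept


# Toy data: 4 frames, 3 speaker models. Frames 2 and 3 are ambiguous.
frames = [
    [-2.0, -6.0, -7.0],
    [-4.0, -3.9, -4.1],
    [-4.0, -3.8, -4.2],
    [-1.5, -5.0, -6.5],
]
winner, kept = identify_speaker(frames, min_gap=1.0)
```

Here the two ambiguous frames are dropped and the decision rests on the two frames in which one model clearly dominates.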

Performance Improvement Methods of a Spoken Chatting System Using SVM (SVM을 이용한 음성채팅시스템의 성능 향상 방법)

  • Ahn, HyeokJu;Lee, SungHee;Song, YeongKil;Kim, HarkSoo
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.4 no.6
    • /
    • pp.261-268
    • /
    • 2015
  • In spoken chatting systems, users' spoken queries are converted to text queries using automatic speech recognition (ASR) engines. If the top-1 results of the ASR engines are incorrect, these errors are propagated to the spoken chatting systems. To improve the top-1 accuracies of ASR engines, we propose a post-processing model that rearranges the top-n outputs of ASR engines using a ranking support vector machine (RankSVM). Separately, a large number of chatting sentences are needed to train chatting systems. If new chatting sentences are not frequently added to the training data, the responses of the chatting systems will soon become outdated. To resolve this problem, we propose a data collection model that automatically selects chatting sentences from TV and movie scenarios using a support vector machine (SVM). In the experiments, the post-processing model showed 4.4% higher precision and 6.4% higher recall than the baseline model (without post-processing). The data collection model showed a high precision of 98.95% and a recall of 57.14%.

An Empirical Study on Statistical Optimization Model for the Portfolio Construction of Sponsored Search Advertising(SSA) (키워드검색광고 포트폴리오 구성을 위한 통계적 최적화 모델에 대한 실증분석)

  • Yang, Hognkyu;Hong, Juneseok;Kim, Wooju
    • Journal of Intelligence and Information Systems
    • /
    • v.25 no.2
    • /
    • pp.167-194
    • /
    • 2019
  • This research starts from four basic concepts that arise when making decisions in keyword bidding: incentive incompatibility, limited information, myopia, and the decision variable. To make these concepts concrete, four framework approaches are designed: a strategic approach for incentive incompatibility, a statistical approach for limited information, alternative optimization for myopia, and a new model approach for the decision variable. The purpose of this research is to propose, through empirical tests, a statistical optimization model for constructing a Sponsored Search Advertising (SSA) portfolio from the sponsor's perspective that can be used in portfolio decision making. Previous research formulates the CTR (click-through rate) estimation model using CPC (cost per click), Rank, Impression, CVR (conversion rate), etc., individually or collectively, as the independent variables. However, many of these variables are not controllable in keyword bidding; only CPC and Rank can be used as decision variables in the bidding system. The classical SSA model is designed on the basic assumption that CPC is the decision variable and CTR is the response variable. However, this classical model faces many hurdles in the estimation of CTR. The main problem is the uncertainty between CPC and Rank: in keyword bidding, CPC fluctuates continuously even at the same Rank. This uncertainty raises questions about the credibility of CTR, along with practical management problems. Sponsors make keyword bidding decisions under limited information, so a strategic portfolio approach based on statistical models is necessary. To solve the problem in the classical SSA model, the new SSA model frame is designed on the basic assumption that Rank is the decision variable. Rank is proposed as the best decision variable for predicting CTR in many papers. Further, most search engine platforms provide options and algorithms that make it possible to bid with Rank.
Sponsors can therefore participate in keyword bidding with Rank, and this paper tests the validity of the new SSA model and its applicability to constructing the optimal portfolio in keyword bidding. The research process is as follows. To perform the optimization analysis for constructing the keyword portfolio under the new SSA model, this study proposes criteria for categorizing keywords, selects representative keywords for each category, shows the non-linear relationship, screens the scenarios for CTR and CPC estimation, selects the best-fit model through a goodness-of-fit (GOF) test, formulates the optimization models, confirms the spillover effects, and suggests a modified optimization model reflecting spillover along with some strategic recommendations. Tests of the optimization models using these CTR/CPC estimation models are performed empirically with the objective functions of (1) maximizing CTR (the CTR optimization model) and (2) maximizing expected profit reflecting CVR (the CVR optimization model). Both the CTR and CVR optimization test results show significant improvements under the suggested SSA model, and that the model is valid for constructing the keyword portfolio using the CTR/CPC estimation models suggested in this study. However, one critical problem is found in the CVR optimization model: important keywords are excluded from the keyword portfolio because of myopia about their low immediate profit. To solve this problem, Markov chain analysis is carried out, and the concepts of Core Transit Keyword (CTK) and Expected Opportunity Profit (EOP) are introduced. The revised CVR optimization model is proposed, tested, and shown to be valid for constructing the portfolio. The strategic guidelines and insights are as follows. Brand keywords are usually dominant in almost every aspect: CTR, CVR, expected profit, etc.
However, the generic keywords are found to be the CTK and to have spillover potential that might increase consumers' awareness and lead them to brand keywords. For this reason, keyword bidding should focus on generic keywords. The contributions of the thesis are to propose a novel SSA model based on Rank as the decision variable, to propose managing the keyword portfolio by categories according to the characteristics of the keywords, to propose Rank-based statistical modelling and management in constructing the keyword portfolio, and, through empirical tests, to propose new strategic guidelines that focus on the CTK together with a modified CVR optimization objective function that reflects the spillover effect instead of the previous expected profit models.

Development of An Automatic Incident Detection Model Using Wilcoxon Rank Sum Test (Wilcoxon Rank Sum Test 기법을 이용한 자동돌발상황검지 모형 개발)

  • 이상민;이승환
    • Journal of Korean Society of Transportation
    • /
    • v.20 no.6
    • /
    • pp.81-98
    • /
    • 2002
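  • This study develops an automatic incident detection model using the Wilcoxon rank sum test. Occupancy data collected from loop vehicle detectors (Loop Vehicle Detection System) installed on expressways were used. Existing detection models detect incidents using thresholds that are difficult to calibrate. In contrast, the proposed model is applied uniformly regardless of location and time-of-day traffic patterns, and detects incidents by continuously comparing the traffic patterns at the incident location with those upstream and downstream using the Wilcoxon rank sum test. Validation tests showed accurate and fast detection (2 to 3 minutes after an incident occurs) regardless of time and location. In addition, detection rate and false alarm rate tests against the existing APID, DES, and DELOS models showed improved detection performance (detection rate: 89.01%, false alarm rate: 0.97%). However, the model fails to detect properly when incident-like events such as compression waves occur; further research on this point would make it a more reliable detection model.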
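As a rough illustration of the statistical test named in the abstract above, the following is a minimal sketch of a two-sided Wilcoxon rank sum test applied to occupancy readings from two detectors. The normal approximation (without a tie correction in the variance), the occupancy values, and the 0.01 significance level are assumptions for illustration; the paper's actual decision rule is not reproduced here.

```python
import math


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank sum test via the normal approximation.

    Returns (W, z, p) where W is the sum of the ranks of sample x.
    Ties receive midranks; the variance omits the tie correction.
    """
    pooled = sorted((value, group) for group, sample in enumerate((x, y)) for value in sample)
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        midrank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = midrank
        i = j + 1
    w = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    n1, n2 = len(x), len(y)
    mean = n1 * (n1 + n2 + 1) / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2))
    return w, z, p


# Hypothetical occupancy readings (%): congestion builds upstream of an incident
# while the downstream detector sees light traffic.
upstream = [32, 35, 38, 41, 36, 39]
downstream = [12, 15, 11, 14, 13, 16]
w, z, p = rank_sum_test(upstream, downstream)
incident_suspected = p < 0.01  # assumed significance level
```

Because the test compares ranks rather than raw values, it needs no location-specific occupancy threshold, which is the advantage the abstract emphasizes over threshold-based models.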

Document Summarization Considering Entailment Relation between Sentences (문장 수반 관계를 고려한 문서 요약)

  • Kwon, Youngdae;Kim, Noo-ri;Lee, Jee-Hyong
    • Journal of KIISE
    • /
    • v.44 no.2
    • /
    • pp.179-185
    • /
    • 2017
  • Document summarization aims to generate a summary that is consistent and contains the highly related sentences in a document. In this study, we implemented a document summarization method that extracts highly related sentences from a whole document by considering both similarities and entailment relations between sentences. Accordingly, we propose a new algorithm, TextRank-NLI, which combines a recurrent neural network based natural language inference model with a graph-based ranking algorithm used in single-document extractive summarization. To evaluate the new algorithm, we conducted experiments using the same datasets as used for the TextRank algorithm. The results indicate that TextRank-NLI improved performance by 2.3% compared to TextRank.