• Title/Abstract/Keywords: Quantile vector

34 search results (processing time: 0.02 seconds)

New Normalization Methods using Support Vector Machine Regression Approach in cDNA Microarray Analysis

  • Sohn, In-Suk;Kim, Su-Jong;Hwang, Chang-Ha;Lee, Jae-Won
    • Korean Society for Bioinformatics: Conference Proceedings
    • /
    • Korean Society for Bioinformatics and Systems Biology, BIOINFO 2005
    • /
    • pp.51-56
    • /
    • 2005
  • There are many sources of systematic variation in cDNA microarray experiments that affect the measured gene expression levels, such as differences in labeling efficiency between the two fluorescent dyes. Print-tip lowess normalization is used in situations where dye biases can depend on a spot's overall intensity and/or spatial location within the array. However, print-tip lowess normalization performs poorly in situations where the error variability for each gene is heterogeneous over intensity ranges. We propose new print-tip normalization methods based on support vector machine regression (SVMR) and support vector machine quantile regression (SVMQR). SVMQR is derived by employing the basic principle of the support vector machine (SVM) for the estimation of linear and nonlinear quantile regressions. We apply the proposed methods to earlier cDNA microarray data from apolipoprotein-AI-knockout (apoAI-KO) mice, diet-induced obese mice, and genistein-fed obese mice. Our statistical analysis shows that the proposed methods perform better than the existing print-tip lowess normalization method.

SVQR with asymmetric quadratic loss function

  • Shim, Jooyong;Kim, Malsuk;Seok, Kyungha
    • Journal of the Korean Data and Information Science Society
    • /
    • Vol. 26, No. 6
    • /
    • pp.1537-1545
    • /
    • 2015
  • Support vector quantile regression (SVQR) can be obtained by applying the support vector machine to quantile regression with a check function in place of the ε-insensitive loss function. This still requires solving a quadratic programming (QP) problem, which is expensive in time and memory. In this paper we propose an SVQR whose objective function is built on an asymmetric quadratic loss function. The proposed method overcomes this weakness of SVQR with the check function. We use an iterative procedure to solve the resulting optimization problem. Furthermore, we introduce a generalized cross validation function to select the hyperparameters that affect the performance of SVQR. Experimental results are then presented to illustrate the performance of the proposed SVQR.
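
The contrast between the check function and an asymmetric quadratic loss can be illustrated with a minimal sketch. The exact loss used in the paper is not given in the abstract; the form below (a squared residual weighted by `tau` or `1 - tau`) is an assumption, and all function names are hypothetical.

```python
def check_loss(r, tau):
    """Check (pinball) loss for residual r at level tau in (0, 1)."""
    return tau * r if r >= 0 else (tau - 1.0) * r


def asymmetric_quadratic_loss(r, tau):
    """Asymmetrically weighted squared loss (assumed form, not taken from the paper)."""
    weight = tau if r >= 0 else 1.0 - tau
    return weight * r * r


def asymmetric_quadratic_grad(r, tau):
    """Derivative of the assumed quadratic loss with respect to r.

    It is continuous at r = 0, whereas the check loss has a kink there.
    """
    weight = tau if r >= 0 else 1.0 - tau
    return 2.0 * weight * r
```

Because the quadratic form is differentiable everywhere, it lends itself to iterative weighted least squares updates rather than a QP solver, which is consistent with the iterative procedure the abstract mentions.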

Properties of alternative VaR for multivariate normal distributions

  • 홍종선;이기쁨
    • Journal of the Korean Data and Information Science Society
    • /
    • Vol. 27, No. 6
    • /
    • pp.1453-1463
    • /
    • 2016
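  • The most widely preferred measure of financial risk is VaR (Value at Risk), which statistically estimates the maximum loss. The VaR for a portfolio composed of several industries is estimated using the variance-covariance matrix and a univariate risk obtained by a transformation that incorporates the specific portfolio. Hong et al. (2016) defined the Vector at Risk based on multivariate quantile vectors and, once a specific portfolio is fixed, proposed one point of the Vector at Risk as the optimal VaR, that is, the alternative VaR (AVaR). This study explores the properties of the AVaR for multivariate normal distributions. Using an empirical example and simulated data from bivariate and trivariate normal distributions with various variance-covariance matrices and portfolio weight vectors, we obtain the AVaR as an alternative maximum loss and compare it with the VaR. We find that the AVaR based on multivariate quantile vectors yields smaller estimates than the VaR, and we discuss this and other properties of the AVaR.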
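
For orientation, the conventional variance-covariance VaR that the abstract uses as its baseline can be sketched as follows. This is only the standard univariate calculation for a normally distributed portfolio return, with loss reported as a positive number; it does not implement the AVaR or the Vector at Risk, whose definitions are not given in the abstract. The function name and the default level `alpha=0.05` are assumptions.

```python
from math import sqrt
from statistics import NormalDist


def portfolio_var(weights, mean, cov, alpha=0.05):
    """Variance-covariance VaR of the portfolio return w'X with X ~ N(mean, cov).

    Returns the negative alpha-quantile of w'X, so a larger value means a larger loss.
    """
    n = len(weights)
    # Mean and variance of the transformed univariate return w'X.
    mu_p = sum(weights[i] * mean[i] for i in range(n))
    var_p = sum(weights[i] * cov[i][j] * weights[j]
                for i in range(n) for j in range(n))
    z = NormalDist().inv_cdf(alpha)  # lower-tail standard normal quantile
    return -(mu_p + z * sqrt(var_p))


# Two uncorrelated assets with unit variance and equal weights.
var_95 = portfolio_var([0.5, 0.5], [0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
```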

Multivariate empirical distribution functions and descriptive methods

  • 홍종선;박준;박용호
    • Journal of the Korean Data and Information Science Society
    • /
    • Vol. 28, No. 1
    • /
    • pp.87-98
    • /
    • 2017
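  • We propose a new definition of the multivariate empirical distribution function for one or more variables, derive its expectation and variance, and confirm that the multivariate empirical distribution function converges to the true distribution function. Based on random samples drawn from bivariate standard normal distributions with various correlation coefficients, we then obtain bivariate empirical distribution functions and propose two graphical methods for representing them on a two-dimensional plane. One represents the function as steps and behaves much like a step function; the other is a plot described by bivariate quantile vectors. Both representations could be drawn in three dimensions but are also easy to implement on a two-dimensional plane, and in general they sufficiently describe all features of the bivariate cumulative distribution function. We therefore show that a trivariate empirical distribution function can also be represented visually. Using bivariate and four-variate empirical examples, we implement and explore the proposed multivariate empirical distribution function and the methods for representing it on a two-dimensional plane.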
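
As a point of reference, the textbook bivariate empirical distribution function counts the share of sample points lying at or below a given point in both coordinates. The sketch below uses that standard definition; the paper's newly proposed definition may differ, and the function name is hypothetical.

```python
def bivariate_ecdf(sample, x, y):
    """Fraction of sample points (xi, yi) with xi <= x and yi <= y."""
    if not sample:
        raise ValueError("sample must be non-empty")
    count = sum(1 for xi, yi in sample if xi <= x and yi <= y)
    return count / len(sample)


points = [(0, 0), (1, 2), (2, 1), (3, 3)]
value = bivariate_ecdf(points, 2, 2)  # (0, 0), (1, 2) and (2, 1) qualify
```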

A concise overview of principal support vector machines and its generalization

  • Jungmin Shin;Seung Jun Shin
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 31, No. 2
    • /
    • pp.235-246
    • /
    • 2024
  • In high-dimensional data analysis, sufficient dimension reduction (SDR) has been considered as an attractive tool for reducing the dimensionality of predictors while preserving regression information. The principal support vector machine (PSVM) (Li et al., 2011) offers a unified approach for both linear and nonlinear SDR. This article comprehensively explores a variety of SDR methods based on the PSVM, which we call principal machines (PM) for SDR. The PM achieves SDR by solving a sequence of convex optimizations akin to popular supervised learning methods, such as the support vector machine, logistic regression, and quantile regression, to name a few. This makes the PM straightforward to handle and extend in both theoretical and computational aspects, as we will see throughout this article.

Multivariate CTE for copula distributions

  • Hong, Chong Sun;Kim, Jae Young
    • Journal of the Korean Data and Information Science Society
    • /
    • Vol. 28, No. 2
    • /
    • pp.421-433
    • /
    • 2017
  • The CTE (conditional tail expectation) is a useful risk management measure for a diversified investment portfolio that can be generally estimated by using a transformed univariate distribution. Hong et al. (2016) proposed a multivariate CTE based on multivariate quantile vectors, and explored its characteristics for multivariate normal distributions. Since most real financial data is not distributed symmetrically, it is problematic to apply the CTE to normal distributions. In order to obtain a multivariate CTE for various kinds of joint distributions, distribution fitting methods using copula functions are proposed in this work. Among the many copula functions, the Clayton, Frank, and Gumbel functions are considered, and the multivariate CTEs are obtained by using their generator functions and parameters. These CTEs are compared with CTEs obtained using other distribution functions. The characteristics of the multivariate CTEs are discussed, as are the properties of the distribution functions and their corresponding accuracy. Finally, conclusions are derived and presented with illustrative examples.

Optimization of Data Recovery using Non-Linear Equalizer in Cellular Mobile Channel

  • 최상호;호광춘;김영권
    • Journal of IKEEE (전기전자학회논문지)
    • /
    • Vol. 5, No. 1
    • /
    • pp.1-7
    • /
    • 2001
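  • This paper studies a CDMA cellular system that uses a nonlinear equalizer for the reverse-link channel. In wireless communication, channel characteristics are generally uncertain, so the probability distribution of the observables cannot be specified by a finite set of parameters. Instead, the m-dimensional sample space is partitioned into a finite number of disjoint regions using quantiles and a vector quantizer based on training samples. The proposed algorithm is based on a piecewise approximation of the regression function using quantiles estimated by the RMSA algorithm and conditional partition moments. The equalizer and detector presented here are highly robust in the sense that they are insensitive to variations in the noise distribution. The main idea is that, under any wireless channel conditions, the robust equalizer and robust partition detector operating on equiprobably partitioned subspaces of the observation space perform better than a conventional equalizer on the unpartitioned observation space. We also apply this concept to a CDMA system and analyze its BER performance.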

Real-Time Prediction for Product Surface Roughness by Support Vector Regression

  • 최수진;이동주
    • Journal of Society of Korea Industrial and Systems Engineering (산업경영시스템학회지)
    • /
    • Vol. 44, No. 3
    • /
    • pp.117-124
    • /
    • 2021
  • The development of IoT and artificial intelligence technology is making manufacturing systems smarter. In this study, data from an acceleration sensor and a current sensor were collected in experiments on the cutting process of SKD11, which is widely used as a material for special mold steel, and the amount of tool wear and the product surface roughness were measured. SVR (Support Vector Regression) is applied to predict the roughness of the product surface in real time from the collected data. SVR, a machine learning technique, is widely used for linear and nonlinear prediction based on the concept of a kernel. In particular, by applying GSVQR (Generalized Support Vector Quantile Regression), overestimation, underestimation, and neutral estimation of product surface roughness are performed and compared. Furthermore, surface roughness is predicted using the linear kernel and the RBF kernel. In terms of accuracy, the results of the RBF kernel are better than those of the linear kernel. Since it is difficult to predict the amount of tool wear in real time, the product surface roughness is also predicted from the acceleration and current data alone, excluding the amount of tool wear. In terms of accuracy, the results obtained without the amount of tool wear were not significantly different from those obtained with it.

Support vector expectile regression using IRWLS procedure

  • Choi, Kook-Lyeol;Shim, Jooyong;Seok, Kyungha
    • Journal of the Korean Data and Information Science Society
    • /
    • Vol. 25, No. 4
    • /
    • pp.931-939
    • /
    • 2014
  • In this paper we propose an iteratively reweighted least squares (IRWLS) procedure to solve the quadratic programming problem of support vector expectile regression with an asymmetrically weighted squared loss function. The proposed procedure makes it easy to select appropriate hyperparameters by using the generalized cross validation function. Through numerical studies on artificial and real data sets, we show the effectiveness of the proposed method in terms of estimation performance.
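
The IRWLS idea can be sketched on the simplest possible case: an unpenalized straight-line expectile fit with a single predictor. This is a simplification for illustration only, not the kernel-based, regularized procedure the paper proposes; the function name, the initial weights, and the stopping rule are assumptions.

```python
def expectile_line_irwls(x, y, tau=0.5, max_iter=100):
    """Fit y ~ a + b*x at expectile level tau by iteratively reweighted least squares."""
    w = [0.5] * len(x)  # equal starting weights give the ordinary least squares fit
    a = b = 0.0
    for _ in range(max_iter):
        # Closed-form weighted least squares for a straight line.
        sw = sum(w)
        swx = sum(wi * xi for wi, xi in zip(w, x))
        swy = sum(wi * yi for wi, yi in zip(w, y))
        swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
        swxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
        denom = sw * swxx - swx * swx
        if denom == 0:
            raise ValueError("x values must not all be identical")
        b = (sw * swxy - swx * swy) / denom
        a = (swy - b * swx) / sw
        # Asymmetric weights: tau for non-negative residuals, 1 - tau otherwise.
        new_w = [tau if yi - (a + b * xi) >= 0 else 1.0 - tau
                 for xi, yi in zip(x, y)]
        if new_w == w:  # weights are stable, so the fit no longer changes
            break
        w = new_w
    return a, b
```

With `tau=0.5` all weights are equal and the result coincides with ordinary least squares; values of `tau` above or below 0.5 pull the line toward the upper or lower part of the data.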

The Limit Distribution of an Invariant Test Statistic for Multivariate Normality

  • Kim, Namhyun
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 12, No. 1
    • /
    • pp.71-86
    • /
    • 2005
  • Testing for normality has always been an important part of statistical methodology. In this paper a test statistic for multivariate normality is proposed. The underlying idea is to consider all linear combinations that reduce to the standard normal distribution under the null hypothesis and to compare their order statistics with the theoretical normal quantiles. The suggested statistic is invariant with respect to nonsingular matrix multiplication and vector addition. We show that the limit distribution of an approximation to the suggested statistic is representable as the supremum over an index set of the integral of a suitable Gaussian process.