• Title/Summary/Keyword: Weighting Schemes


A Study on Negation Handling and Term Weighting Schemes and Their Effects on Mood-based Text Classification (감정 기반 블로그 문서 분류를 위한 부정어 처리 및 단어 가중치 적용 기법의 효과에 대한 연구)

  • Jung, Yu-Chul; Choi, Yoon-Jung; Myaeng, Sung-Hyon
    • Korean Journal of Cognitive Science, v.19 no.4, pp.477-497, 2008
  • Mood classification of blog text is an interesting problem with potential for a variety of Web services. This paper introduces an approach that enhances mood classification through normalized negation n-grams, which contain mood clues, and corpus-specific term weighting (CSTW). Experiments were conducted on blog texts with two different classification methods: Enhanced Mood Flow Analysis (EMFA) and Support Vector Machine based Mood Classification (SVMMC). The results show that the normalized negation n-gram method is effective in dealing with negations and yields gradual improvements in mood classification with EMFA. The experiments on CSTW indicate that choosing an appropriate weighting scheme is important for adequate mood classification performance, since CSTW outperforms both TF*IDF and TF. (An illustrative weighting sketch follows below.)

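The abstract compares corpus-specific term weighting against TF and TF*IDF but does not reproduce the CSTW formula. The sketch below only illustrates the general idea of swapping weighting schemes in a bag-of-words pipeline; the mood-corpus frequency ratio used as the "corpus-specific" weight is a hypothetical stand-in, not the paper's CSTW.

```python
import math
from collections import Counter

def tf_idf(term_counts, doc_freq, n_docs):
    """Plain TF*IDF weighting for one document (one of the paper's baselines)."""
    return {t: c * math.log(n_docs / (1 + doc_freq.get(t, 0)))
            for t, c in term_counts.items()}

def corpus_specific_weight(term_counts, mood_freq, background_freq):
    """Hypothetical corpus-specific weight: terms that occur relatively more
    often in the mood-labelled corpus than in a background corpus get boosted.
    Illustrative stand-in only, not the paper's CSTW formula."""
    return {t: c * math.log(1 + mood_freq.get(t, 0)) /
               math.log(2 + background_freq.get(t, 0))
            for t, c in term_counts.items()}

doc = "not happy at all but strangely calm".split()
counts = Counter(doc)
print(tf_idf(counts, {"happy": 40, "calm": 25}, n_docs=100))
print(corpus_specific_weight(counts, {"happy": 300, "calm": 120},
                             {"happy": 500, "calm": 800}))
```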

Complexity Estimation Based Work Load Balancing for a Parallel Lidar Waveform Decomposition Algorithm

  • Jung, Jin-Ha; Crawford, Melba M.; Lee, Sang-Hoon
    • Korean Journal of Remote Sensing, v.25 no.6, pp.547-557, 2009
  • LIDAR (LIght Detection And Ranging) is an active remote sensing technology that provides 3D coordinates of the Earth's surface by performing range measurements from the sensor. Early small-footprint LIDAR systems recorded multiple discrete returns from the back-scattered energy. Recent advances in LIDAR hardware make it possible to record full digital waveforms of the returned energy. LIDAR waveform decomposition involves separating the return waveform into a mixture of components which are then used to characterize the original data. The most common statistical mixture model used for this process is the Gaussian mixture. Waveform decomposition plays an important role in LIDAR waveform processing, since the resulting components are expected to represent reflection surfaces within waveform footprints; hence the decomposition results ultimately affect the interpretation of LIDAR waveform data. Computational requirements in the waveform decomposition process result from two factors: (1) estimation of the number of components in a mixture and the resulting parameter estimates, which are inter-related and cannot be solved separately, and (2) parameter optimization, which has no closed-form solution and thus must be solved iteratively. The current state-of-the-art airborne LIDAR system acquires more than 50,000 waveforms per second, so decomposing this enormous number of waveforms is challenging on a traditional single-processor architecture. To tackle this issue, four parallel LIDAR waveform decomposition algorithms with different work-load balancing schemes - (1) no weighting, (2) decomposition-results-based linear weighting, (3) decomposition-results-based squared weighting, and (4) decomposition-time-based linear weighting - were developed and tested with varying numbers of processors (8-256). The results were compared in terms of efficiency. Overall, the decomposition-time-based linear weighting approach yielded the best performance among the four. (A minimal load-balancing sketch follows below.)

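As a rough illustration of weighting-based work-load balancing, the sketch below assigns waveform blocks to processors greedily so that the weighted load per processor stays balanced. The weights stand in for the decomposition-time-based linear weighting; the block timings and counts are made-up numbers, not the paper's data or its exact scheduling scheme.

```python
import heapq

def balance_load(block_weights, n_procs):
    """Greedy longest-processing-time assignment: give each block to the
    currently least-loaded processor. block_weights could be, e.g.,
    previously measured decomposition times (time-based linear weighting)."""
    heap = [(0.0, p) for p in range(n_procs)]          # (current load, processor id)
    heapq.heapify(heap)
    assignment = {p: [] for p in range(n_procs)}
    for block, w in sorted(enumerate(block_weights), key=lambda x: -x[1]):
        load, p = heapq.heappop(heap)
        assignment[p].append(block)
        heapq.heappush(heap, (load + w, p))
    return assignment

# made-up per-block decomposition times (seconds) for 12 waveform blocks
times = [0.8, 1.9, 0.4, 2.5, 1.1, 0.7, 3.0, 0.9, 1.4, 0.6, 2.2, 1.0]
print(balance_load(times, n_procs=4))
```
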
Design of Acceptance Control Charts According to the Process Independence, Data Weighting Scheme, Subgrouping, and Use of Charts (프로세스의 독립성, 데이터 가중치 체계, 부분군 형성과 관리도 용도에 따른 합격판정 관리도의 설계)

  • Choi, Sung-Woon
    • Journal of the Korea Safety Management & Science, v.12 no.3, pp.257-262, 2010
  • The study investigates various Acceptance Control Charts (ACCs) based on factors that include process independence, data weighting scheme, subgrouping, and use of control charts. ACCs for highly capable processes with USL - LSL > $6{\sigma}$ are designed from the user's perspective, the producer's perspective, and both perspectives. The ACCs developed in this research can be applied efficiently by using a simple control limit unified with the APL (Acceptable Process Level), RPL (Rejectable Process Level), Type I error $\alpha$, and Type II error $\beta$. The sampling interval of the subgroup accounts for i.i.d. (independent and identically distributed) or auto-correlated processes. Three types of data weighting schemes, chosen according to the reliability of the data, are considered when designing ACCs: Shewhart, Moving Average (MA), and Exponentially Weighted Moving Average (EWMA). Two types of control charts classified by the purpose of improvement are also presented. Overall, $\alpha$, $\beta$, the APL for the nonconforming proportion, and the RPL for the claim proportion can be chosen by practitioners who emphasize productivity and claim defense cost. (A textbook-style control-limit sketch follows below.)

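The abstract ties acceptance control limits to the APL, RPL, $\alpha$, and $\beta$. The sketch below computes the standard textbook one-sided acceptance control limit and the subgroup size that satisfies both risks for a normal process mean; it is a generic illustration, not necessarily the specific design derived in the paper.

```python
from math import ceil, sqrt
from statistics import NormalDist

def acc_design(apl, rpl, sigma, alpha, beta):
    """One-sided (upper) acceptance control chart design.
    apl / rpl: acceptable and rejectable process levels of the mean,
    sigma: process standard deviation, alpha / beta: Type I / Type II risks.
    Returns (subgroup size n, upper acceptance control limit)."""
    z_a = NormalDist().inv_cdf(1 - alpha)
    z_b = NormalDist().inv_cdf(1 - beta)
    n = ceil(((z_a + z_b) * sigma / (rpl - apl)) ** 2)   # both risks satisfied
    uacl = apl + z_a * sigma / sqrt(n)                   # limit anchored at the APL
    return n, uacl

print(acc_design(apl=10.0, rpl=10.6, sigma=0.5, alpha=0.05, beta=0.10))
```
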
MULTIGRID SOLUTION OF THREE DIMENSIONAL BIHARMONIC EQUATIONS WITH DIRICHLET BOUNDARY CONDITIONS OF SECOND KIND

  • Ibrahim, S.A. Hoda; Hassan, Naglaa Ameen
    • Journal of Applied Mathematics & Informatics, v.30 no.1_2, pp.235-244, 2012
  • In this paper, we solve the three-dimensional biharmonic equation with Dirichlet boundary conditions of the second kind using the full multigrid (FMG) algorithm. We derive finite difference approximations for the biharmonic equation on an 18-point compact stencil. The unknown solution and its second derivatives are carried as unknowns at the grid points. In the multigrid method, we use fourth-order interpolation to produce the intermediate unknown function values on the finer grid, and full weighting restriction operators to transfer the residuals to the coarse grid points. A set of test problems gives excellent results. (A minimal restriction-operator sketch follows below.)

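Full weighting restriction transfers a fine-grid residual to the coarse grid by a weighted average of neighbouring fine-grid values. The 1D sketch below shows the standard [1/4, 1/2, 1/4] stencil (its 3D analogue is the 27-point tensor product); it illustrates the operator only, not the paper's 18-point biharmonic discretization.

```python
import numpy as np

def full_weighting_1d(fine):
    """Restrict a 1D fine-grid array (size 2m+1) to the coarse grid (size m+1)
    with the standard full weighting stencil [1/4, 1/2, 1/4]."""
    coarse = fine[::2].astype(float).copy()
    # interior coarse points average the fine-grid neighbours
    coarse[1:-1] = 0.25 * fine[1:-2:2] + 0.5 * fine[2:-1:2] + 0.25 * fine[3::2]
    return coarse

residual_fine = np.sin(np.linspace(0.0, np.pi, 9))   # 9 fine points -> 5 coarse
print(full_weighting_1d(residual_fine))
```
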
A Generalized M-Estimator in Linear Regression

  • Song, Moon-Sup; Park, Chang-Soon; Nam, Ho-Soo
    • Communications for Statistical Applications and Methods, v.1 no.1, pp.27-32, 1994
  • We propose a robust regression estimator which has both a high breakdown point and a bounded influence function. The main contribution of this article is to present a weight function for the generalized M (GM)-estimator. Weighting schemes which control leverage points only, without considering residuals, cannot be efficient, since these schemes inevitably downweight some good leverage points. In this paper we propose a weight function which depends on both the design points and the residuals, so as not to downweight good leverage points. Some motivating illustrations are also given. (A minimal GM-estimation sketch follows below.)

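The paper's specific weight function is not reproduced in this abstract. The sketch below shows a generic Schweppe-type GM-estimator fitted by iteratively reweighted least squares, with a Huber weight for the residuals and a hat-value-based design-point weight as placeholder choices; leverage enters through the scaled residual, so a good leverage point with a small residual keeps its full weight.

```python
import numpy as np

def huber_weight(u, c=1.345):
    """Huber weight psi(u)/u."""
    au = np.abs(u)
    return np.where(au <= c, 1.0, c / np.maximum(au, 1e-12))

def gm_estimator(X, y, n_iter=50):
    """Schweppe-type GM-estimation via IRLS (illustrative, not the paper's
    exact weight function). A high-leverage row is downweighted only when its
    leverage-scaled residual r_i / (s * w_x_i) is also large."""
    H = X @ np.linalg.pinv(X.T @ X) @ X.T
    w_x = np.sqrt(np.clip(1.0 - np.diag(H), 1e-6, 1.0))   # design-point weights
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    for _ in range(n_iter):
        r = y - X @ beta
        s = 1.4826 * np.median(np.abs(r)) + 1e-12          # robust scale (MAD)
        w = huber_weight(r / (s * w_x))                    # IRLS weights
        beta = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * y))
    return beta

rng = np.random.default_rng(0)
X = np.column_stack([np.ones(30), rng.normal(size=30)])
y = 2 + 3 * X[:, 1] + rng.normal(scale=0.5, size=30)
y[0] += 10                                                 # one gross outlier
print(gm_estimator(X, y))
```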

Efficient ICI Self-Cancellation Scheme for OFDM Systems

  • Kim, Kyung-Hwa; Seo, Bangwon
    • ETRI Journal, v.36 no.4, pp.537-544, 2014
  • In this paper, we present a new inter-carrier interference (ICI) self-cancellation scheme - namely, the ISC scheme - for orthogonal frequency-division multiplexing systems to reduce the ICI generated by phase noise (PHN) and residual frequency offset (RFO). The proposed scheme comprises a new ICI cancellation mapping (ICM) scheme at the transmitter and an appropriate method of combining the received signals at the receiver. In the proposed scheme, the transmitted signal is transformed into a real signal through the new ICM using the real property of the transmitted signal, and the fast-varying PHN and RFO are estimated and compensated. Therefore, the ICI caused by fast-varying PHN and RFO is significantly suppressed. We also derive the carrier-to-interference power ratio (CIR) of the proposed scheme by using the symmetric conjugate property of the ICI weighting function and then compare it with those of conventional schemes. Simulation results show that the proposed ISC scheme has a higher CIR and better bit error rate performance than the conventional schemes. (A sketch of the conventional self-cancellation mapping follows below.)

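The details of the paper's new ICM and combining method are not given in this abstract. The sketch below therefore only shows the well-known adjacent-subcarrier ICI self-cancellation baseline: each data symbol is mapped onto a subcarrier pair with opposite signs at the transmitter and differentially combined at the receiver, which is the kind of mapping an ICM scheme generalizes.

```python
import numpy as np

def ici_sc_map(symbols):
    """Conventional ICI self-cancellation mapping: X[2m] = d[m], X[2m+1] = -d[m]."""
    mapped = np.empty(2 * len(symbols), dtype=complex)
    mapped[0::2] = symbols
    mapped[1::2] = -symbols
    return mapped

def ici_sc_combine(received):
    """Receiver-side combining: (Y[2m] - Y[2m+1]) / 2 cancels most adjacent ICI."""
    return 0.5 * (received[0::2] - received[1::2])

data = np.array([1 + 1j, -1 + 1j, 1 - 1j, -1 - 1j]) / np.sqrt(2)   # QPSK symbols
tx = ici_sc_map(data)
rx = tx + 0.05 * (np.random.default_rng(1).standard_normal(len(tx)) * (1 + 1j))
print(np.round(ici_sc_combine(rx), 3))
```
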
Weighted DCT-IF for Image up Scaling

  • Lee, Jae-Yung; Yoon, Sung-Jun; Kim, Jae-Gon; Han, Jong-Ki
    • KSII Transactions on Internet and Information Systems (TIIS), v.13 no.2, pp.790-809, 2019
  • The design of an efficient scaler that enhances edge data is one of the most important issues in video signal applications, because the perceptual quality of the processed image is sensitive to the degradation of edge data. Various conventional scaling schemes have been proposed to enhance edge data. In this paper, we propose an efficient scaling algorithm for this purpose. The proposed method is based on the discrete cosine transform-based interpolation filter (DCT-IF) because it outperforms other scaling algorithms in various configurations. The proposed DCT-IF incorporates weighting parameters that are optimized on training data. Simulation results show that the quality of the resized images produced by the proposed DCT-IF is much higher than that of images produced by conventional schemes, although the proposed DCT-IF is more complex than other conventional scaling algorithms. (A minimal DCT interpolation sketch follows below.)

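The trained weighting parameters are not given in this abstract, so the sketch below shows only the plain (unweighted) DCT interpolation idea behind DCT-IF: take the orthonormal DCT-II of N samples and evaluate the same cosine expansion at fractional positions. The paper's method additionally applies optimized weights to these coefficients.

```python
import numpy as np

def dct_interpolate(samples, positions):
    """Orthonormal DCT-II analysis of n samples, then evaluation of the cosine
    expansion at fractional sample positions (unweighted DCT-IF idea)."""
    samples = np.asarray(samples, dtype=float)
    n = len(samples)
    m = np.arange(n)                      # sample index
    k = np.arange(n)                      # frequency index
    fwd = np.cos(np.pi * (2 * m[None, :] + 1) * k[:, None] / (2 * n))
    scale = np.full(n, np.sqrt(2.0 / n))
    scale[0] = np.sqrt(1.0 / n)
    coeffs = scale * (fwd @ samples)      # orthonormal DCT-II coefficients
    x = np.asarray(positions, dtype=float)
    recon = np.cos(np.pi * (2 * x[:, None] + 1) * k[None, :] / (2 * n))
    return recon @ (scale * coeffs)

sig = np.array([0.0, 1.0, 4.0, 9.0, 16.0, 25.0, 36.0, 49.0])   # samples of x**2
print(np.round(dct_interpolate(sig, [0.0, 2.5, 5.25]), 3))
```
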
Traffic Offloading Algorithm Using Social Context in MEC Environment (MEC 환경에서의 Social Context를 이용한 트래픽 오프로딩 알고리즘)

  • Cheon, Hye-Rim; Lee, Seung-Que; Kim, Jae-Hyun
    • The Journal of Korean Institute of Communications and Information Sciences, v.42 no.2, pp.514-522, 2017
  • Traffic offloading is a promising solution to the explosive growth of mobile traffic. With one such scheme, LIPA/SIPTO (Local IP Access and Selected IP Traffic Offload) offloading, mobile traffic can be offloaded while satisfying the QoS requirements of each application. In addition, because a large share of traffic comes from SNS, traffic offloading should exploit social context. Thus, we propose a LIPA/SIPTO offloading algorithm that uses social context. We define the application selection probability using social context, namely the application popularity. Then, we find the optimal offloading weighting factor that maximizes the QoS (Quality of Service) of small-cell users in terms of effective data rate. Finally, we determine the offloading ratio from this application selection probability and the optimal offloading weighting factor. Performance analysis shows that the effective data rate achievement ratio of the proposed algorithm is similar to that of the conventional one, although the total offloading ratio of the proposed algorithm is about 46 percent of that of the conventional one. (A heavily simplified sketch follows below.)

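A heavily hedged illustration: the abstract only states that the offloading ratio combines an application selection probability (popularity) with an optimal weighting factor. The sketch below assumes a Zipf-like popularity distribution and treats the weighting factor as a given scalar; both are placeholders, not the paper's model or its QoS optimization.

```python
import numpy as np

def zipf_popularity(n_apps, exponent=0.8):
    """Hypothetical application selection probabilities (Zipf-like popularity)."""
    ranks = np.arange(1, n_apps + 1, dtype=float)
    p = ranks ** (-exponent)
    return p / p.sum()

def offloading_ratios(popularity, weighting_factor):
    """Per-application offloading ratio = selection probability x weighting factor,
    clipped to [0, 1]; the optimal weighting factor itself would come from a
    QoS (effective data rate) maximization that is not shown here."""
    return np.clip(popularity * weighting_factor, 0.0, 1.0)

pop = zipf_popularity(5)
print(pop)
print(offloading_ratios(pop, weighting_factor=2.0))
```
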
Preprocessing Technique for Improvement of Speech Recognition in a Car (차량에서의 음성인식율 향상을 위한 전처리 기법)

  • Kim, Hyun-Tae; Park, Jang-Sik
    • The Journal of the Korea Contents Association, v.9 no.1, pp.139-146, 2009
  • This paper addresses a modified spectral subtraction scheme suitable for speech recognition in low signal-to-noise ratio (SNR) noisy environments, such as an automatic speech recognition (ASR) system in a car. Conventional spectral subtraction schemes rely on the SNR, so that attenuation is imposed on the part of the spectrum that appears to have low SNR and accentuation is applied to the part with high SNR. While such a postulation is adequate in high-SNR environments, it is grossly inadequate in low-SNR scenarios such as the car environment. The proposed method focuses specifically on low-SNR noisy environments by using a weighting function to enhance the speech-dominant region of the speech spectrum. Experimental results using in-car voice commands show the superior performance of the proposed method over conventional methods. (A minimal spectral subtraction sketch follows below.)

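A minimal sketch of magnitude spectral subtraction in which the over-subtraction factor is weighted by an a-posteriori SNR estimate, so that speech-dominant bins are preserved and noise-dominant bins are suppressed more strongly. The weighting function here is a simple placeholder, not the one proposed in the paper.

```python
import numpy as np

def weighted_spectral_subtraction(noisy_mag, noise_mag, floor=0.05):
    """Magnitude spectral subtraction with an SNR-dependent over-subtraction
    factor: low-SNR bins are subtracted more aggressively, while speech-dominant
    (high-SNR) bins stay closer to the noisy spectrum."""
    snr = noisy_mag ** 2 / (noise_mag ** 2 + 1e-12)          # a-posteriori SNR
    alpha = 1.0 + 3.0 / (1.0 + snr)                          # placeholder weighting
    clean = noisy_mag - alpha * noise_mag
    return np.maximum(clean, floor * noisy_mag)              # spectral floor

# toy example: one frame of 8 frequency bins
noisy = np.array([2.0, 0.6, 5.0, 0.4, 3.5, 0.5, 0.7, 4.2])
noise = np.full(8, 0.5)
print(np.round(weighted_spectral_subtraction(noisy, noise), 3))
```
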
An Analytical Study on Performance Factors of Automatic Classification based on Machine Learning (기계학습에 기초한 자동분류의 성능 요소에 관한 연구)

  • Kim, Pan Jun
    • Journal of the Korean Society for Information Management, v.33 no.2, pp.33-59, 2016
  • This study examined the factors affecting the performance of automatic classification of domestic conference papers based on machine learning techniques. In particular, focusing on the performance of automatically assigning class labels to papers in the Proceedings of the Conference of the Korean Society for Information Management using the Rocchio algorithm, I investigated the characteristics of the key factors (classifier formation methods, training set size, weighting schemes, and label assigning methods) through diversified experiments. The results show that it is more effective to apply appropriate parameters (${\beta}$, ${\lambda}$) and an adequate training set size (more than 5 years) according to the classification environment and the properties of the document set, and that when performance is equivalent, the use of simpler methods (single weighting schemes) is very efficient. Also, because the classification of domestic papers corresponds to multi-label classification, in which more than one label is assigned to an article, it is necessary to develop an optimal classification model based on the characteristics of the key factors in consideration of this environment. (A minimal Rocchio sketch follows below.)
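The abstract mentions Rocchio with parameters ${\beta}$ and ${\lambda}$. The sketch below shows the usual Rocchio prototype construction, assuming ${\lambda}$ plays the role commonly given to the negative-centroid weight; that reading is an assumption, since the abstract does not define ${\lambda}$, and the document-term weights would come from whichever weighting scheme is under test.

```python
import numpy as np

def rocchio_prototypes(X, labels, beta=16.0, lam=4.0):
    """Build one Rocchio prototype per class:
    prototype_c = beta * centroid(docs in c) - lam * centroid(docs not in c).
    X: (n_docs, n_terms) weighted document-term matrix; labels: class per doc."""
    prototypes = {}
    for c in set(labels):
        in_c = X[[l == c for l in labels]]
        out_c = X[[l != c for l in labels]]
        prototypes[c] = beta * in_c.mean(axis=0) - lam * out_c.mean(axis=0)
    return prototypes

def classify(x, prototypes):
    """Assign the class whose prototype has the largest cosine similarity."""
    def cos(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)
    return max(prototypes, key=lambda c: cos(x, prototypes[c]))

X = np.array([[3.0, 0.0, 1.0], [2.0, 1.0, 0.0], [0.0, 2.0, 3.0], [0.0, 3.0, 2.0]])
labels = ["IR", "IR", "DB", "DB"]
protos = rocchio_prototypes(X, labels)
print(classify(np.array([1.0, 0.0, 0.5]), protos))
```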