• Title/Summary/Keyword: robust performance.


Ensemble Learning with Support Vector Machines for Bond Rating (회사채 신용등급 예측을 위한 SVM 앙상블학습)

  • Kim, Myoung-Jong
    • Journal of Intelligence and Information Systems
    • /
    • v.18 no.2
    • /
    • pp.29-45
    • /
    • 2012
  • Bond rating is regarded as an important event for measuring the financial risk of companies and for determining the investment returns of investors. As a result, predicting companies' credit ratings by applying statistical and machine learning techniques has been a popular research topic. The statistical techniques traditionally used in bond rating include multiple regression, multiple discriminant analysis (MDA), logistic models (LOGIT), and probit analysis. However, one major drawback is that they rest on strict assumptions: linearity, normality, independence among predictor variables, and pre-existing functional forms relating the criterion variables and the predictor variables. These strict assumptions have limited the application of traditional statistics to the real world. Machine learning techniques used in bond rating prediction models include decision trees (DT), neural networks (NN), and the Support Vector Machine (SVM). In particular, SVM is recognized as a new and promising classification and regression method. SVM learns a separating hyperplane that maximizes the margin between two categories. It is simple enough to be analyzed mathematically, yet leads to high performance in practical applications. SVM implements the structural risk minimization principle and seeks to minimize an upper bound of the generalization error. In addition, the solution of SVM may be a global optimum, so overfitting is unlikely to occur. SVM also does not require many training samples, since it builds prediction models using only the representative samples near the class boundaries, called support vectors. A number of experimental studies have indicated that SVM has been successfully applied in a variety of pattern recognition fields. However, there are three major drawbacks that can degrade SVM's performance. 
First, SVM was originally proposed for binary-class classification problems. Methods for combining SVMs for multi-class classification, such as One-Against-One and One-Against-All, have been proposed, but they do not perform as well on multi-class problems as SVM does on binary-class problems. Second, approximation algorithms (e.g., decomposition methods or the sequential minimal optimization algorithm) can reduce the computation time of multi-class training, but may deteriorate classification performance. Third, a major difficulty in multi-class prediction is the data imbalance problem, which occurs when the number of instances in one class greatly outnumbers that in another. Such data sets often yield a default classifier with a skewed boundary, and thus reduced classification accuracy. SVM ensemble learning is one machine learning approach to coping with these drawbacks. Ensemble learning is a method for improving the performance of classification and prediction algorithms. AdaBoost is one of the most widely used ensemble learning techniques. It constructs a composite classifier by sequentially training classifiers while increasing the weights on misclassified observations through iterations. Observations that are incorrectly predicted by previous classifiers are chosen more often than those that are correctly predicted, so boosting attempts to produce new classifiers that better predict the examples on which the current ensemble performs poorly. In this way, it can reinforce the training of misclassified observations of the minority class. This paper proposes multiclass Geometric Mean-based Boosting (MGM-Boost) to resolve the multiclass prediction problem. 
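The AdaBoost re-weighting described above can be sketched in a few lines (a minimal binary-class round for illustration; MGM-Boost's geometric mean-based error term is not reproduced here):

```python
import numpy as np

def adaboost_round(weights, y_true, y_pred):
    # One AdaBoost round: misclassified samples get larger weights so the
    # next base classifier focuses on them.
    miss = y_pred != y_true
    err = np.sum(weights[miss]) / np.sum(weights)   # weighted error rate
    alpha = 0.5 * np.log((1.0 - err) / err)         # vote weight of this classifier
    # up-weight misses, down-weight hits, then renormalize
    new_w = weights * np.exp(np.where(miss, alpha, -alpha))
    return new_w / new_w.sum(), alpha
```

With uniform starting weights and one miss out of four, the missed observation ends up carrying half the total weight, which is exactly the mechanism that reinforces minority-class training.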
Since MGM-Boost introduces the notion of the geometric mean into AdaBoost, its learning process takes into account the geometric mean-based accuracy and errors across classes. This study applies MGM-Boost to a real-world bond rating case for Korean companies to examine its feasibility. 10-fold cross-validation was performed three times with different random seeds to ensure that the comparison among the three classifiers did not happen by chance. In each 10-fold cross-validation, the entire data set is first partitioned into ten equal-sized sets, and each set is in turn used as the test set while the classifier trains on the other nine. That is, the cross-validated folds were tested independently for each algorithm. Through these steps, we obtained results for the classifiers on each of the 30 experiments. In terms of arithmetic mean-based prediction accuracy, MGM-Boost (52.95%) outperforms both AdaBoost (51.69%) and SVM (49.47%). MGM-Boost (28.12%) also shows higher prediction accuracy than AdaBoost (24.65%) and SVM (15.42%) in terms of geometric mean-based prediction accuracy. A t-test is used to examine whether the performance of the classifiers over the 30 folds differs significantly. The results indicate that the performance of MGM-Boost is significantly different from that of the AdaBoost and SVM classifiers at the 1% level. These results mean that MGM-Boost can provide robust and stable solutions to multi-class problems such as bond rating.
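The geometric mean-based accuracy used to compare the classifiers can be illustrated as follows (a sketch with made-up labels, not the paper's data):

```python
import numpy as np

def per_class_recall(y_true, y_pred, classes):
    # recall for each class: fraction of that class correctly predicted
    return np.array([np.mean(y_pred[y_true == c] == c) for c in classes])

def geometric_mean_accuracy(y_true, y_pred, classes):
    # geometric mean of per-class recalls; it collapses toward 0 if any
    # class is missed, so it penalizes classifiers that ignore minority classes
    recalls = per_class_recall(y_true, y_pred, classes)
    return float(np.prod(recalls) ** (1.0 / len(recalls)))
```

For an imbalanced toy set where the minority class is half-missed, the arithmetic accuracy stays high (7/8) while the geometric mean drops sharply, which is why the two measures rank the classifiers differently in the abstract.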

Development of an Automatic 3D Coregistration Technique of Brain PET and MR Images (뇌 PET과 MR 영상의 자동화된 3차원적 합성기법 개발)

  • Lee, Jae-Sung;Kwark, Cheol-Eun;Lee, Dong-Soo;Chung, June-Key;Lee, Myung-Chul;Park, Kwang-Suk
    • The Korean Journal of Nuclear Medicine
    • /
    • v.32 no.5
    • /
    • pp.414-424
    • /
    • 1998
  • Purpose: Cross-modality coregistration of positron emission tomography (PET) and magnetic resonance (MR) images can enhance clinical information. In this study we propose a refined technique to improve the robustness of registration and to implement more realistic visualization of the coregistered images. Materials and Methods: Using the sinogram of the PET emission scan, we extracted a robust head boundary and used the boundary-enhanced PET to coregister PET with MR. Pixels having 10% of the maximum pixel value were considered the boundary of the sinogram, and the boundary pixel values were replaced with the maximum value of the sinogram. One hundred eighty boundary points were extracted at intervals of about 2 degrees from each slice of the MR images using a simple threshold method. The best affine transformation between the two point sets was found by least squares fitting that minimizes the sum of Euclidean distances between the point sets, and the calculation time was reduced using a pre-defined distance map. Finally, we developed an automatic coregistration program using this boundary detection and surface matching technique, and designed a new weighted normalization technique to display the coregistered PET and MR images simultaneously. Results: Using our newly developed method, robust extraction of the head boundary was possible and spatial registration was successfully performed. The mean displacement error was less than 2.0 mm. In visualization of the coregistered images using the weighted normalization method, structures shown in the MR image could be realistically represented. Conclusion: Our refined technique can practically enhance the performance of automated three-dimensional coregistration.
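The least-squares fit of an affine transformation between two boundary point sets, as used in the surface matching step, can be sketched as follows (a minimal 2D version without the distance-map speedup):

```python
import numpy as np

def fit_affine_2d(src, dst):
    # Least-squares affine transform: find M minimizing
    # sum || [x_i, y_i, 1] @ M - dst_i ||^2 over all point pairs.
    X = np.hstack([src, np.ones((src.shape[0], 1))])
    M, *_ = np.linalg.lstsq(X, dst, rcond=None)
    return M  # shape (3, 2): first two rows = linear part, last row = shift

def apply_affine_2d(M, pts):
    # map points through the fitted transform
    X = np.hstack([pts, np.ones((pts.shape[0], 1))])
    return X @ M
```

Given 180 boundary points per slice, a fit like this recovers the transform exactly when the correspondence is noise-free, and in the least-squares sense otherwise.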


A Digital Audio Watermarking Algorithm using 2D Barcode (2차원 바코드를 이용한 오디오 워터마킹 알고리즘)

  • Bae, Kyoung-Yul
    • Journal of Intelligence and Information Systems
    • /
    • v.17 no.2
    • /
    • pp.97-107
    • /
    • 2011
  • Nowadays there are many copyright infringement issues in the Internet world, because digital content on the network can be copied and delivered easily, and the copied version has the same quality as the original. Therefore, copyright owners and content providers want a powerful solution to protect their content. A popular solution was DRM (digital rights management), which is based on encryption technology and rights control. However, DRM-free services were launched after Steve Jobs, the CEO of Apple, proposed a new music service paradigm without DRM, and DRM has since disappeared from the online music market. Even though online music services decided not to adopt DRM, copyright owners and content providers are still searching for a solution to protect their content. A technology that can replace DRM is digital audio watermarking, which can embed copyright information into the music. In this paper, the author proposes a new audio watermarking algorithm with two approaches. First, the watermark information is generated with a two-dimensional barcode that carries an error correction code, so the information can recover itself if the errors fall within the error tolerance. The second is to use the chip sequences of CDMA (code division multiple access). These make the algorithm robust to several malicious attacks. Among the many 2D barcodes, the QR code, one of the matrix barcodes, can express information more freely than the other matrix barcodes. A QR code has square patterns doubled at three of its corners, which indicate the boundary of the symbol. This feature makes the QR code suitable for expressing the watermark information. That is, because the QR code is a two-dimensional, nonlinear matrix code, it can be modulated to spread spectrum and used for the watermarking algorithm. 
The proposed algorithm assigns different spread spectrum sequences to individual users. When the assigned code sequences are orthogonal, we can identify the watermark information of each user from an audio content. The algorithm uses the Walsh code as the orthogonal code. The watermark information is rearranged from the 2D barcode into a 1D sequence and modulated by the Walsh code. The modulated watermark information is embedded into the DCT (discrete cosine transform) domain of the original audio content. For the performance evaluation, three audio samples were used: "Amazing Grace", "Oh! Carol" and "Take Me Home, Country Roads". The attacks for the robustness test were MP3 compression, an echo attack, and a subwoofer boost. The MP3 compression was performed with Cool Edit Pro 2.0, at CBR (constant bit rate) 128 kbps, 44,100 Hz, stereo. The echo attack added an echo with initial volume 70%, decay 75%, and delay 100 msec. The subwoofer boost attack modified the low-frequency part of the Fourier coefficients. The test results showed that the proposed algorithm is robust to these attacks. Under the MP3 attack, the strength of the watermark information was not affected, and the watermark could be detected in all of the sample audios. Under the subwoofer boost attack, the watermark was detected when the strength was 0.3. In the case of the echo attack, the watermark could be identified if the strength was greater than or equal to 0.5.
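A minimal sketch of the Walsh-code spread-spectrum idea follows (time-domain embedding with illustrative parameters; the paper embeds in the DCT domain):

```python
import numpy as np

def walsh_matrix(n):
    # Sylvester-Hadamard construction; rows are mutually orthogonal Walsh codes
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H

def embed(host, bits, code, alpha=0.01):
    # spread each watermark bit (+1/-1) over one code length and add it,
    # scaled by an embedding strength alpha, to the host samples
    chips = np.concatenate([b * code for b in bits])
    out = host.copy()
    out[:len(chips)] += alpha * chips
    return out

def detect(signal, nbits, code):
    # correlate each segment with the user's code; when the codes are
    # orthogonal, the sign of the correlation recovers each bit
    L = len(code)
    return [int(np.sign(signal[i * L:(i + 1) * L] @ code)) for i in range(nbits)]
```

Because each user's Walsh code is orthogonal to the others, correlating with one user's code suppresses the other users' embedded sequences, which is the property the abstract relies on for per-user identification.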

Enhancement of Inter-Image Statistical Correlation for Accurate Multi-Sensor Image Registration (정밀한 다중센서 영상정합을 위한 통계적 상관성의 증대기법)

  • Kim, Kyoung-Soo;Lee, Jin-Hak;Ra, Jong-Beom
    • Journal of the Institute of Electronics Engineers of Korea SP
    • /
    • v.42 no.4 s.304
    • /
    • pp.1-12
    • /
    • 2005
  • Image registration is a process to establish the spatial correspondence between images of the same scene that are acquired at different viewpoints, at different times, or by different sensors. This paper presents a new algorithm for robust registration of images acquired by multiple sensors having different modalities: EO (electro-optic) and IR (infrared) in this paper. Two approaches, feature-based and intensity-based, are usually possible for image registration. In the former, selection of accurate common features is crucial for high performance, but features in the EO image are often not the same as those in the IR image; hence, this approach is inadequate for registering EO/IR images. In the latter, normalized mutual information (NMI) has been widely used as a similarity measure due to its high accuracy and robustness, and NMI-based image registration methods assume that the statistical correlation between the two images is global. Unfortunately, since we find that EO and IR images often do not satisfy this assumption, the registration accuracy is not high enough for some applications. In this paper, we propose a two-stage NMI-based registration method based on the analysis of statistical correlation between EO/IR images. In the first stage, for robust registration, we propose two preprocessing schemes: extraction of statistically correlated regions (ESCR) and enhancement of statistical correlation by filtering (ESCF). For each image, ESCR automatically extracts the regions that are highly correlated to the corresponding regions in the other image, and ESCF adaptively filters each image to enhance the statistical correlation between them. In the second stage, the two output images are registered using an NMI-based algorithm. The proposed method provides promising results for various EO/IR sensor image pairs in terms of accuracy, robustness, and speed.
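The NMI similarity measure at the heart of the second stage can be computed as follows (a basic histogram-based sketch; the bin count is an illustrative choice):

```python
import numpy as np

def normalized_mutual_information(a, b, bins=32):
    # NMI(A, B) = (H(A) + H(B)) / H(A, B); it peaks when the two images
    # are statistically aligned, which is what registration maximizes
    hist, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    pxy = hist / hist.sum()          # joint intensity distribution
    px = pxy.sum(axis=1)             # marginal of image A
    py = pxy.sum(axis=0)             # marginal of image B

    def H(p):
        p = p[p > 0]                 # 0 * log 0 = 0 by convention
        return -np.sum(p * np.log(p))

    return (H(px) + H(py)) / H(pxy.ravel())
```

For identical images the joint histogram is diagonal, so H(A, B) = H(A) and NMI reaches its maximum of 2; misalignment spreads the joint histogram and lowers the score.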

RPC Correction of KOMPSAT-3A Satellite Image through Automatic Matching Point Extraction Using Unmanned Aerial Vehicle Imagery (무인항공기 영상 활용 자동 정합점 추출을 통한 KOMPSAT-3A 위성영상의 RPC 보정)

  • Park, Jueon;Kim, Taeheon;Lee, Changhui;Han, Youkyung
    • Korean Journal of Remote Sensing
    • /
    • v.37 no.5_1
    • /
    • pp.1135-1147
    • /
    • 2021
  • In order to geometrically correct high-resolution satellite imagery, a sensor modeling process that restores the geometric relationship between the satellite sensor and the ground surface at the image acquisition time is required. In general, high-resolution satellites provide RPC (Rational Polynomial Coefficient) information, but the vendor-provided RPC includes geometric distortion caused by the position and orientation of the satellite sensor. GCPs (Ground Control Points) are generally used to correct the RPC errors. The representative method of acquiring GCPs is a field survey to obtain accurate ground coordinates. However, it is difficult to find GCPs in the satellite image due to the quality of the image, land cover change, relief displacement, etc. By using image maps acquired from various sensors as reference data, it is possible to automate the collection of GCPs through an image matching algorithm. In this study, the RPC of a KOMPSAT-3A satellite image was corrected using matching points extracted from UAV (Unmanned Aerial Vehicle) imagery. We propose a pre-processing method for the extraction of matching points between the UAV imagery and the KOMPSAT-3A satellite image. To this end, we compared the characteristics of matching points extracted by independently applying SURF (Speeded-Up Robust Features) and phase correlation, which are representative feature-based and area-based matching methods, respectively. The RPC adjustment parameters were calculated using the matching points extracted by each algorithm. In order to verify the performance and usability of the proposed method, it was compared with the GCP-based RPC correction result. The GCP-based method improved the correction accuracy by 2.14 pixels for the sample and 5.43 pixels for the line compared to the vendor-provided RPC. 
In the proposed method, the SURF and phase correlation matching improved the accuracy of the sample by 0.83 pixels and 1.49 pixels, and that of the line by 4.81 pixels and 5.19 pixels, respectively, compared to the vendor-provided RPC. The experimental results show that the proposed method using UAV imagery is a possible alternative to the GCP-based method for RPC correction.
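Phase correlation, one of the two matching methods compared, can be sketched for a pure integer translation as:

```python
import numpy as np

def phase_correlation(ref, tgt):
    # Estimate the integer translation between two images from the phase of
    # the cross-power spectrum: its inverse FFT peaks at the shift.
    F1 = np.fft.fft2(ref)
    F2 = np.fft.fft2(tgt)
    R = np.conj(F1) * F2
    R /= np.abs(R) + 1e-12           # keep only the phase (whitening)
    corr = np.fft.ifft2(R).real
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    # unwrap shifts larger than half the image size to negative offsets
    shifts = []
    for p, n in zip(peak, corr.shape):
        shifts.append(float(p - n) if p > n // 2 else float(p))
    return tuple(shifts)
```

Because only the spectral phase is kept, the estimate is insensitive to global intensity differences between the two images, which is why area-based matching of this kind can pair imagery from different sensors.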

Segmentation of Airborne LIDAR Data: From Points to Patches (항공 라이다 데이터의 분할: 점에서 패치로)

  • Lee Im-Pyeong
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography
    • /
    • v.24 no.1
    • /
    • pp.111-121
    • /
    • 2006
  • Recently, many studies have attempted to apply airborne LIDAR data to the extraction of urban models. In order to efficiently model the man-made objects that are the main components of these urban models, it is important to automatically extract planar patches from the set of measured three-dimensional points. Although some research has been carried out on their automatic extraction, no published method is yet sufficiently satisfactory in terms of the accuracy and completeness of the segmentation results and their computational efficiency. This study thus aimed to develop an efficient approach to the automatic segmentation of planar patches from the three-dimensional points acquired by an airborne LIDAR system. The proposed method consists of establishing adjacency between the three-dimensional points, grouping small numbers of points into seed patches, and growing the seed patches into surface patches. The core features of this method are to improve the segmentation results by employing a variable threshold, repeatedly updated through statistical analysis during the patch growing process, and to achieve high computational efficiency using priority heaps and sequential least squares adjustment. The proposed method was applied to real LIDAR data to evaluate its performance. Using the proposed method, LIDAR data composed of a huge number of three-dimensional points can be converted into a set of surface patches that constitute a more explicit and robust description. This intermediate conversion can be effectively used to solve object recognition problems such as building extraction.
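The seed-growing idea with a priority heap can be sketched as follows (a simplified version with a fixed distance threshold; the paper updates the threshold statistically and uses sequential least squares rather than refitting from scratch):

```python
import heapq
import numpy as np

def fit_plane(points):
    # least-squares plane through 3D points: centroid plus unit normal
    c = points.mean(axis=0)
    _, _, vt = np.linalg.svd(points - c)
    return c, vt[-1]  # normal = direction of smallest variance

def point_plane_distance(p, centroid, normal):
    return abs((p - centroid) @ normal)

def grow_patch(points, adjacency, seed_idx, threshold):
    # grow a planar patch from seed points: repeatedly pull the candidate
    # closest to the current plane off a priority heap and accept it if its
    # plane distance is under the threshold
    patch = set(seed_idx)
    centroid, normal = fit_plane(points[list(patch)])
    heap = [(point_plane_distance(points[j], centroid, normal), j)
            for i in patch for j in adjacency[i] if j not in patch]
    heapq.heapify(heap)
    while heap:
        d, j = heapq.heappop(heap)
        if j in patch or d > threshold:
            continue
        patch.add(j)
        centroid, normal = fit_plane(points[list(patch)])  # refit with new point
        for k in adjacency[j]:
            if k not in patch:
                heapq.heappush(
                    heap, (point_plane_distance(points[k], centroid, normal), k))
    return patch
```

On a toy cloud where four points are coplanar and one is far off the plane, the off-plane point is rejected while the coplanar neighbor is absorbed.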

A Study on the Current Measurement Using Birefringent Fiber (복굴절 광섬유를 이용한 전류측정에 관한 연구)

  • Jang Nam-Young;Choi Pyung-Suk;Eun Jae-Jeong
    • Journal of the Institute of Convergence Signal Processing
    • /
    • v.6 no.2
    • /
    • pp.59-66
    • /
    • 2005
  • The accuracy of current measurement in a fiber optic current sensor (FOCS), especially a unidirectional polarimetric fiber optic current sensor (PFOCS), is affected by environmental perturbations, such as acoustic vibrations applied to the sensing fiber, and intrinsic perturbations, such as the bending of the sensing fiber wound around a current-carrying wire. These perturbations affect the birefringence properties of the sensing fiber in the sensor head and cause false current readings. Thus, a compensation technique, the reciprocal PFOCS, is used to suppress the perturbations of the unidirectional PFOCS. In this paper, we carried out a numerical analysis of the performance of the reciprocal PFOCS, including the degree-of-polarization error and the false current due to environmental and intrinsic perturbations on the sensing fiber. We also compared the effect of an ordinary mirror with that of a Faraday rotation mirror (FRM) in the reciprocal PFOCS configuration, using two optical source wavelengths, 633 nm and 1300 nm. At 633 nm, using the mirror and the FRM, the degree-of-polarization error is calculated to be 2.3% and 0.0196%, respectively. At 1300 nm, using the mirror and the FRM, the degree-of-polarization error is calculated to be 9.97% and 0.0196%, respectively. The false currents are calculated to be 9.82×10⁻⁹ A and 1.4×10⁻¹⁷ A, respectively, showing that the reciprocal PFOCS is a more robust configuration than the unidirectional PFOCS against environmental and intrinsic perturbations.
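The basic sensing relation behind an FOCS can be illustrated as follows (the idealized Faraday rotation formula for a fiber coil, ignoring the birefringence perturbations the paper analyzes; the Verdet constant and turn count below are illustrative):

```python
import math

MU_0 = 4 * math.pi * 1e-7  # vacuum permeability [T*m/A]

def rotation_angle(current, verdet, turns):
    # For a sensing fiber wound N times around a conductor, Ampere's law gives
    # a closed-path field integral of mu0 * N * I, so the ideal Faraday
    # rotation is theta = V * mu0 * N * I
    # (V: Verdet constant of the fiber, in rad/(T*m)).
    return verdet * MU_0 * turns * current

def current_from_angle(theta, verdet, turns):
    # invert the relation to read the line current from the measured rotation
    return theta / (verdet * MU_0 * turns)
```

Perturbation-induced birefringence distorts the measured rotation away from this ideal relation, which is exactly the false-current error the reciprocal configuration suppresses.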


A Study on Touchless Finger Vein Recognition Robust to the Alignment and Rotation of Finger (손가락 정렬과 회전에 강인한 비 접촉식 손가락 정맥 인식 연구)

  • Park, Kang-Ryoung;Jang, Young-Kyoon;Kang, Byung-Jun
    • The KIPS Transactions:PartB
    • /
    • v.15B no.4
    • /
    • pp.275-284
    • /
    • 2008
  • With recent increases in security requirements, biometric technologies such as fingerprint, face, and iris recognition have been widely used in many applications, including door access control, personal authentication for computers, Internet banking, automatic teller machines, and border-crossing controls. Finger vein recognition uses the unique patterns of finger veins to identify individuals with a high level of accuracy. This paper proposes a new device and methods for touchless finger vein recognition. This research presents the following five advantages compared to previous works. First, by using a minimal guiding structure for the finger tip, sides, and back of the finger, we were able to obtain touchless finger vein images without causing much inconvenience to the user. Second, by using a hot mirror slanted at an angle of 45 degrees in front of the camera, we were able to reduce the depth of the capturing device; consequently, the device could be used in many applications with size limitations, such as mobile phones. Third, we used the holistic texture information of the finger veins based on LBP (Local Binary Patterns) without needing to extract accurate finger vein regions, which reduced the effect of non-uniform illumination, including shaded and highly saturated areas. Fourth, we enhanced recognition performance by excluding non-finger-vein regions. Fifth, when matching the extracted finger vein code with the enrolled one, bit-shifting in both the horizontal and vertical directions reduced the authentic variations caused by the translation and rotation of the finger. Experimental results showed that the EER (Equal Error Rate) was 0.07423% and the total processing time was 91.4 ms.
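The basic 3x3 LBP operator used for the holistic vein texture can be sketched as follows (the neighbor ordering is an illustrative choice):

```python
import numpy as np

def lbp_code(patch):
    # Basic 3x3 LBP: compare the 8 neighbors (clockwise from top-left)
    # against the center pixel and pack the comparison results into one byte.
    # The histogram of these codes over an image forms the texture descriptor.
    center = patch[1, 1]
    order = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for bit, (r, c) in enumerate(order):
        if patch[r, c] >= center:
            code |= 1 << bit
    return code
```

Because each code depends only on intensity comparisons against the local center pixel, the descriptor is largely invariant to monotonic illumination changes, which is why it tolerates the shaded and saturated areas mentioned above.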

Dual Dictionary Learning for Cell Segmentation in Bright-field Microscopy Images (명시야 현미경 영상에서의 세포 분할을 위한 이중 사전 학습 기법)

  • Lee, Gyuhyun;Quan, Tran Minh;Jeong, Won-Ki
    • Journal of the Korea Computer Graphics Society
    • /
    • v.22 no.3
    • /
    • pp.21-29
    • /
    • 2016
  • Cell segmentation is an important but time-consuming and laborious task in biological image analysis. An automated, robust, and fast method is required to overcome such burdensome processes. These needs are, however, challenging due to varied cell shapes and intensities and incomplete boundaries. Precise cell segmentation allows making a pathological diagnosis of tissue samples. A vast body of literature exists on cell segmentation in microscopy images [1]. The majority of existing work is based only on input images and predefined feature models - for example, using a deformable model to extract edge boundaries in the image. Only a handful of recent methods employ data-driven approaches such as supervised learning. In this paper, we propose a novel data-driven cell segmentation algorithm for bright-field microscopy images. The proposed method minimizes an energy formula defined by two dictionaries - one for the input images and the other for their manual segmentation results - and a common sparse code, which aims to find the pixel-level classification by deploying the learned dictionaries on new images. In contrast to deformable models, we do not need prior knowledge of the objects. We also employ convolutional sparse coding and the Alternating Direction Method of Multipliers (ADMM) for fast dictionary learning and energy minimization. Unlike an existing method [1], our method trains both dictionaries concurrently, and is implemented on the GPU for faster performance.
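The L1 shrinkage (soft-thresholding) step that appears in ADMM-based sparse coding updates of this kind can be sketched as:

```python
import numpy as np

def soft_threshold(x, lam):
    # Proximal operator of lam * ||x||_1, the elementwise shrinkage step of
    # ADMM/ISTA sparse coding: sign(x) * max(|x| - lam, 0).
    # Small coefficients are zeroed out, which is what makes the code sparse.
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)
```

In a full convolutional sparse coding solver, this operator is applied once per ADMM iteration to the code variable, alternating with a quadratic (frequency-domain) update of the dictionary fit.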

Outlier Detection from High Sensitive Geiger Mode Imaging LIDAR Data retaining a High Outlier Ratio (높은 이상점 비율을 갖는 고감도 가이거모드 영상 라이다 데이터로부터 이상점 검출)

  • Kim, Seongjoon;Lee, Impyeong;Lee, Youngcheol;Jo, Minsik
    • Korean Journal of Remote Sensing
    • /
    • v.28 no.5
    • /
    • pp.573-586
    • /
    • 2012
  • Point clouds acquired by a LIDAR (Light Detection And Ranging, also LADAR) system often contain erroneous points, called outliers, that do not appear to lie on physical surfaces; these should be carefully detected and eliminated before further processing for applications. Particularly in the case of LIDAR systems employing a high-sensitivity Geiger-mode focal plane array (GmFPA) detector, the outlier ratio is significantly high, which often makes existing algorithms fail to detect the outliers in such a data set. In this paper, we propose a method to discriminate outliers in a point cloud with a high outlier ratio acquired by a GmFPA LIDAR system. The underlying assumption of this method is that a meaningful target surface occupies at least two adjacent pixels and that the ranges from these pixels are similar. We applied the proposed method to simulated LIDAR data of different point densities and outlier ratios and analyzed the performance for different thresholds and data properties. We found that the outlier detection probabilities are about 99% in most cases. We also confirmed that the proposed method is robust to data properties and less sensitive to the thresholds. The method can be effectively utilized for on-line real-time processing and post-processing of GmFPA LIDAR data.
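The two-adjacent-pixel assumption can be turned into a simple outlier test on a range image (a minimal sketch with a fixed tolerance; the paper analyzes the sensitivity to such thresholds):

```python
import numpy as np

def flag_outliers(range_image, tol):
    # Keep a pixel's range measurement only if at least one 8-neighbor reports
    # a similar range, following the assumption that a real target surface
    # occupies at least two adjacent pixels with similar ranges.
    h, w = range_image.shape
    outlier = np.ones((h, w), dtype=bool)
    for i in range(h):
        for j in range(w):
            win = range_image[max(0, i - 1):i + 2, max(0, j - 1):j + 2]
            close = np.abs(win - range_image[i, j]) <= tol
            if close.sum() >= 2:  # the pixel itself plus at least one neighbor
                outlier[i, j] = False
    return outlier
```

An isolated return with a range far from all of its neighbors is flagged, while pixels on a coherent surface pass, regardless of the absolute range values involved.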