• Title/Summary/Keyword: sensitive information

Search Result 2,322, Processing Time 0.03 seconds

Simultaneous Optimization of KNN Ensemble Model for Bankruptcy Prediction (부도예측을 위한 KNN 앙상블 모형의 동시 최적화)

  • Min, Sung-Hwan
    • Journal of Intelligence and Information Systems / v.22 no.1 / pp.139-157 / 2016
  • Bankruptcy involves considerable costs, so it can have significant effects on a country's economy. Thus, bankruptcy prediction is an important issue. Over the past several decades, many researchers have addressed topics associated with bankruptcy prediction. Early research on bankruptcy prediction employed conventional statistical methods such as univariate analysis, discriminant analysis, multiple regression, and logistic regression. Later on, many studies began utilizing artificial intelligence techniques such as inductive learning, neural networks, and case-based reasoning. Currently, ensemble models are being utilized to enhance the accuracy of bankruptcy prediction. Ensemble classification involves combining multiple classifiers to obtain more accurate predictions than those obtained using individual models. Ensemble learning techniques are known to be very useful for improving the generalization ability of the classifier. Base classifiers in the ensemble must be as accurate and diverse as possible in order to enhance the generalization ability of an ensemble model. Commonly used methods for constructing ensemble classifiers include bagging, boosting, and random subspace. The random subspace method selects a random feature subset for each classifier from the original feature space to diversify the base classifiers of an ensemble. Each ensemble member is trained by a randomly chosen feature subspace from the original feature set, and predictions from each ensemble member are combined by an aggregation method. The k-nearest neighbors (KNN) classifier is robust with respect to variations in the dataset but is very sensitive to changes in the feature space. For this reason, KNN is a good classifier for the random subspace method. The KNN random subspace ensemble model has been shown to be very effective for improving an individual KNN model. 
The k parameter of KNN base classifiers and selected feature subsets for base classifiers play an important role in determining the performance of the KNN ensemble model. However, few studies have focused on optimizing the k parameter and feature subsets of base classifiers in the ensemble. This study proposed a new ensemble method that improves upon the performance of the KNN ensemble model by optimizing both the k parameters and the feature subsets of base classifiers. A genetic algorithm was used to optimize the KNN ensemble model and improve the prediction accuracy of the ensemble model. The proposed model was applied to a bankruptcy prediction problem by using a real dataset from Korean companies. The research data included 1800 externally non-audited firms, of which 900 had filed for bankruptcy and 900 had not. Initially, the dataset consisted of 134 financial ratios. Prior to the experiments, 75 financial ratios were selected based on an independent sample t-test of each financial ratio as an input variable and bankruptcy or non-bankruptcy as an output variable. Of these, 24 financial ratios were selected by using a logistic regression backward feature selection method. The complete dataset was separated into two parts: training and validation. The training dataset was further divided into two portions: one for training the model and the other for guarding against overfitting. The prediction accuracy against the latter portion was used as the fitness value in order to avoid overfitting. The validation dataset was used to evaluate the effectiveness of the final model. A 10-fold cross-validation was implemented to compare the performances of the proposed model and other models. To evaluate the effectiveness of the proposed model, the classification accuracy of the proposed model was compared with that of other models. The Q-statistic values and average classification accuracies of base classifiers were investigated.
The experimental results showed that the proposed model outperformed other models, such as the single model and random subspace ensemble model.
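
The random subspace idea described in this abstract can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation: the toy data, the function names, and the candidate values of k are all assumptions, and each member's (k, feature subset) pair is drawn at random here, whereas the paper searches over those pairs with a genetic algorithm.

```python
import random
from collections import Counter


def knn_predict(train_X, train_y, x, k, features):
    """Classify x by majority vote among its k nearest training points,
    using squared Euclidean distance on the given feature indices only."""
    dists = []
    for row, label in zip(train_X, train_y):
        d = sum((row[j] - x[j]) ** 2 for j in features)
        dists.append((d, label))
    dists.sort(key=lambda t: t[0])
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def build_random_subspace(n_features, subspace_size, n_members, k_choices, rng):
    """Draw one (k, feature subset) pair per ensemble member at random.
    A genetic algorithm would search over these pairs instead."""
    return [
        (rng.choice(k_choices), sorted(rng.sample(range(n_features), subspace_size)))
        for _ in range(n_members)
    ]


def ensemble_predict(members, train_X, train_y, x):
    """Aggregate the base KNN predictions by majority vote."""
    votes = Counter(knn_predict(train_X, train_y, x, k, feats) for k, feats in members)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: 4 features, label 1 = bankrupt, 0 = non-bankrupt.
train_X = [
    [0.1, 0.2, 0.1, 0.0], [0.0, 0.1, 0.2, 0.1], [0.2, 0.0, 0.1, 0.2],
    [1.0, 0.9, 1.1, 1.0], [0.9, 1.0, 1.0, 1.1], [1.1, 1.1, 0.9, 0.9],
]
train_y = [0, 0, 0, 1, 1, 1]

members = build_random_subspace(
    n_features=4, subspace_size=2, n_members=5, k_choices=[1, 3, 5],
    rng=random.Random(0),
)
print(ensemble_predict(members, train_X, train_y, [1.0, 1.0, 1.0, 1.0]))
```

Odd values of k and an odd number of members avoid tied votes in this two-class setting.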

DISEASE DIAGNOSED AND DESCRIBED BY NIRS

  • Tsenkova, Roumiana N.
    • Proceedings of the Korean Society of Near Infrared Spectroscopy Conference / 2001.06a / pp.1031-1031 / 2001
  • The mammary gland is made up of remarkably sensitive tissue, which has the capability of producing a large volume of secretion, milk, under normal or healthy conditions. When bacteria enter the gland and establish an infection (mastitis), inflammation is initiated, accompanied by an influx of white cells from the blood stream, by altered secretory function, and by changes in the volume and composition of secretion. Cell numbers in milk are closely associated with inflammation and udder health. These somatic cell counts (SCC) are accepted as the international standard measurement of milk quality in dairy and for mastitis diagnosis. NIR spectra of unhomogenized composite milk samples from 14 cows (healthy and mastitic) were measured 7 days after parturition and during the next 30 days of lactation. Different multivariate analysis techniques were used to diagnose the disease at a very early stage and to determine how the spectral properties of milk vary with its composition and animal health. A PLS model for prediction of SCC based on NIR milk spectra was built. The best accuracy of determination for the 1100-2500 nm range was found using smoothed absorbance data and 10 PLS factors. The standard error of prediction for an independent validation set of samples was 0.382, the correlation coefficient 0.854, and the variation coefficient 7.63%. It was found that SCC determination by NIR milk spectra was indirect and based on the related changes in milk composition. From the spectral changes, we learned that when mastitis occurred, the most significant factors that simultaneously influenced milk spectra were alteration of milk proteins and changes in the ionic concentration of milk. This was consistent with the results we obtained later when we applied two-dimensional correlation spectroscopy (2DCOS). Two-dimensional correlation analysis of NIR milk spectra was done to assess the changes in milk composition that occur when SCC levels vary.
The synchronous correlation map revealed that when SCC increases, protein levels increase while water and lactose levels decrease. Results from the analysis of the asynchronous plot indicated that changes in water and fat absorptions occur before those of other milk components. In addition, the technique was used to assess the changes in milk during a period when SCC levels do not vary appreciably. Results indicated that milk components are in equilibrium and no appreciable change in a given component was seen with respect to another. This was found in both healthy and mastitic animals. However, milk components were found to vary with SCC content regardless of the range considered. This important finding demonstrates that 2-D correlation analysis may be used to track even subtle changes in milk composition in individual cows. To find the right SCC threshold for mastitis diagnosis at cow level, classification of milk samples was performed using soft independent modeling of class analogy (SIMCA) and different spectral data pretreatments. Two levels of SCC, 200 000 cells/mL and 300 000 cells/mL, were set up and compared as thresholds to discriminate between healthy and mastitic cows. The best detection accuracy was found with 200 000 cells/mL as the threshold for mastitis and smoothed absorbance data: 98% of the milk samples in the calibration set and 87% of the samples in the independent test set were correctly classified. When the spectral information was studied, it was found that successful mastitis diagnosis was based on revealing the spectral changes related to the corresponding changes in milk composition. NIRS combined with different ways of spectral data mining can provide a faster and nondestructive alternative to current methods for mastitis diagnosis and a new insight into disease understanding at the molecular level.
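
As a rough illustration of how a synchronous correlation map is read, the sketch below computes the synchronous part of a generalized two-dimensional correlation analysis, Φ(ν1, ν2) = (1/(m−1)) Σ ỹ_j(ν1) ỹ_j(ν2), where ỹ_j is the j-th mean-centered spectrum. The absorbance values and the band labels are invented for the example and are not data from the abstract.

```python
def synchronous_map(spectra):
    """Synchronous 2D correlation map of m spectra sampled at n wavelengths.

    spectra: list of m lists, each holding n absorbance values, ordered by
    the perturbation (here, hypothetically, by increasing SCC).
    Returns an n x n list of lists.
    """
    m = len(spectra)
    n = len(spectra[0])
    # Mean spectrum over the perturbation, used as the reference.
    means = [sum(s[i] for s in spectra) / m for i in range(n)]
    # Dynamic spectra: deviations from the mean spectrum.
    dyn = [[s[i] - means[i] for i in range(n)] for s in spectra]
    return [
        [sum(d[a] * d[b] for d in dyn) / (m - 1) for b in range(n)]
        for a in range(n)
    ]


# Hypothetical two-band example: band 0 ("protein") rises with SCC,
# band 1 ("water") falls with SCC.
spectra = [
    [1.0, 3.0],
    [2.0, 2.0],
    [3.0, 1.0],
]
sync = synchronous_map(spectra)
print(sync[0][1])  # cross peak between the two bands
```

A negative cross peak means the two bands change in opposite directions, which is how a statement such as "protein increases while water decreases" is read off the map. The asynchronous map, which carries the ordering of changes, needs an additional Hilbert-Noda transform and is omitted here.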


Prediction of Seasonal Nitrate Concentration in Springs on the Southern Slope of Jeju Island using Multiple Linear Regression of Geographic Spatial Data (지리 공간 자료의 다중회귀분석을 이용한 제주도 남측사면 용천수의 시기별 질산성 질소 농도 예측)

  • Jung, Youn-Young;Koh, Dong-Chan;Kang, Bong-Rae;Ko, Kyung-Suk;Yu, Yong-Jae
    • Economic and Environmental Geology / v.44 no.2 / pp.135-152 / 2011
  • Nitrate concentrations in springs on the southern slope of Jeju Island were predicted using multiple linear regression (MLR) of spatial variables including hydrogeological parameters and land use characteristics. Springs showed a wide range of nitrate concentrations, from <0.02 to 86 mg/L with a mean of 20 mg/L. Spatial variables were generated for a circular buffer, with the optimal buffer radius assigned as 400 m. Selected regression models were tested using the p values and Durbin-Watson statistics. Explanatory variables were selected using the adjusted $R^2$, Cp (total squared error), AIC (Akaike's Information Criterion), and significance. In addition, mutual linear relations between variables were also considered. A small portion of springs, usually <10% of total samples, were identified as outliers, indicating limitations of MLR using circular buffers. The adjusted $R^2$ of the proposed models improved from 0.75 to 0.87 when outliers were eliminated. In particular, the areal proportion of natural area had the greatest influence on the nitrate concentrations in springs. Among anthropogenic land uses, the influence on nitrate contamination decreases in the order of orchard, residential area, and dry farmland. The quality of springs in the study area is apparently controlled by land uses rather than by hydrogeological parameters. Above all, it is worth highlighting that the contamination susceptibility of springs is highly sensitive to nearby land uses, in particular orchards.
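
Two of the diagnostics named in this abstract have short closed forms, sketched below under the usual textbook definitions: the adjusted $R^2$ is 1 − (1 − R²)(n − 1)/(n − p − 1) for n samples and p explanatory variables, and the Durbin-Watson statistic is Σ(e_t − e_{t−1})² / Σ e_t² over the regression residuals. The function names and the example numbers are assumptions for illustration, not values from the study.

```python
def adjusted_r2(r2, n, p):
    """Adjusted R^2 for a model with p explanatory variables fitted to n samples."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


def durbin_watson(residuals):
    """Durbin-Watson statistic of a residual series.

    Values near 2 suggest no first-order autocorrelation; values toward 0
    suggest positive and values toward 4 negative autocorrelation.
    """
    num = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    den = sum(e * e for e in residuals)
    return num / den


# Hypothetical numbers: R^2 = 0.5 from 11 springs and 2 explanatory variables.
print(adjusted_r2(0.5, 11, 2))
# Hypothetical residuals that alternate in sign (negative autocorrelation).
print(durbin_watson([1.0, -1.0, 1.0, -1.0]))
```

The adjusted value penalizes extra explanatory variables, which is why it is suited to comparing candidate variable sets of different sizes.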

Corporate Credit Rating based on Bankruptcy Probability Using AdaBoost Algorithm-based Support Vector Machine (AdaBoost 알고리즘기반 SVM을 이용한 부실 확률분포 기반의 기업신용평가)

  • Shin, Taek-Soo;Hong, Tae-Ho
    • Journal of Intelligence and Information Systems / v.17 no.3 / pp.25-41 / 2011
  • Recently, support vector machines (SVMs) have been recognized as competitive tools compared with other data mining techniques for solving pattern recognition or classification decision problems. Furthermore, many studies have shown them to be more powerful than traditional artificial neural networks (ANNs) (Amendolia et al., 2003; Huang et al., 2004; Huang et al., 2005; Tay and Cao, 2001; Min and Lee, 2005; Shin et al., 2005; Kim, 2003). The classification decision made by any classifier, whether binary or multi-class, is highly cost-sensitive in financial classification problems such as credit rating: if credit ratings are misclassified, investors or financial decision makers may suffer severe economic losses. Therefore, it is necessary to convert the outputs of the classifier into well-calibrated posterior probabilities and to base multi-class credit ratings on the bankruptcy probabilities. However, SVMs do not inherently provide such probabilities, so a separate method is required to create them (Platt, 1999; Drish, 2001). This paper applied AdaBoost algorithm-based SVMs to bankruptcy prediction, as a binary classification problem, for IT companies in Korea, and then performed multi-class credit rating of the companies by forming a normal-distribution shape of posterior bankruptcy probabilities from the loss functions extracted from the SVMs. The proposed approach also showed that misclassification problems can be minimized by adjusting the credit grade interval ranges, on the condition that each credit grade for credit loan borrowers has its own credit risk, i.e. bankruptcy probability.
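
The two steps described here, turning a raw SVM output into a probability and then cutting the probability range into credit grades, can be sketched as follows. The sigmoid form P(y=1 | f) = 1/(1 + exp(A·f + B)) is the calibration associated with the cited Platt (1999); whether the paper uses exactly this mapping is not stated in the abstract. The parameters A and B, the grade labels, and the interval boundaries are hypothetical.

```python
import math
from bisect import bisect_right


def platt_probability(f, A, B):
    """Map an SVM decision value f to a probability with a fitted sigmoid.

    A and B would normally be fitted on held-out data; with A < 0, larger
    decision values give higher probabilities.
    """
    return 1.0 / (1.0 + math.exp(A * f + B))


def credit_grade(p_bankrupt, upper_bounds, grades):
    """Assign a grade from a bankruptcy probability.

    upper_bounds must be sorted ascending and have one fewer entry than
    grades. A probability equal to a bound falls into the riskier grade.
    """
    return grades[bisect_right(upper_bounds, p_bankrupt)]


# Hypothetical grade intervals: [0, 0.05) A, [0.05, 0.20) B, [0.20, 0.50) C, rest D.
GRADES = ["A", "B", "C", "D"]
BOUNDS = [0.05, 0.20, 0.50]

p = platt_probability(0.0, A=-2.0, B=0.0)  # decision value on the margin
print(p, credit_grade(p, BOUNDS, GRADES))
```

Adjusting the entries of `BOUNDS` corresponds to the interval adjustment mentioned in the abstract: moving a boundary changes which borrowers fall into each grade without retraining the classifier.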

Analysis on TV News Frame on Whistle-Blower: Focused on News Coverages on 'Kim Yong Chul' Claiming Samsung Group's Slush Fund (내부고발에 대한 텔레비전 뉴스 프레임: '김용철' 변호사의 삼성비리 고발사건을 중심으로)

  • Kim, Nam-Il
    • Korean journal of communication and information / v.43 / pp.117-151 / 2008
  • This paper regards former Samsung lawyer Kim Yong-Chul's disclosure of Samsung Group's slush fund as a typical case of internal whistle-blowing. News frames on KBS and SBS TV were examined through comparative analysis. In terms of formal features, the 'episodic news frame' held an absolute majority on both stations. In terms of news sources, the whistle-blower group, such as lawyer Kim Yong-Chul and civic groups, was set against Samsung and state authorities including the Prosecutor and financial agencies. Analysis of the themes of news coverage revealed 5 frames: 'public announcing frame', 'news of conflict frame', 'demanding a close inquiry frame', 'declaration of conscience frame', and 'causing social upheaval frame'. The analysis shows that the 'public announcing frame' was most frequently used in reporting and that KBS and SBS differed in their use of the 'declaration of conscience frame' and the 'causing social upheaval frame'. KBS relatively preferred the 'declaration of conscience frame' while SBS tended to use the 'causing social upheaval frame', from which a relationship with media ownership could be inferred. Both media tended to make light of in-depth coverage of structural issues or fundamental solutions, and both stations treated the situation by stressing the sensitive parts of the event to intrigue audiences. Tracing the change in the 'declaration of conscience frame' through diachronic analysis of framing shows that the additional disclosure by 'Lee Yong Chul', a former secretary, on Nov 19, 2007 increased the frequency of use of the 'declaration of conscience frame'. However, news reporting on the whistle-blower on KBS and SBS generally adhered to a passive attitude of following changes in the surroundings rather than playing an active role in improving social recognition of whistle-blowing, which can contribute to the spread of a negative image of it. Thus it is suggested that terrestrial television broadcasting should regard whistle-blowing as a matter of contradiction in social structures, and active in-depth reporting seems to be needed to improve social recognition of whistle-blowing.


Adaptive Data Hiding Techniques for Secure Communication of Images (영상 보안통신을 위한 적응적인 데이터 은닉 기술)

  • 서영호;김수민;김동욱
    • The Journal of Korean Institute of Communications and Information Sciences / v.29 no.5C / pp.664-672 / 2004
  • Widespread popularity of wireless data communication devices, coupled with the availability of higher bandwidths, has led to an increased user demand for content-rich media such as images and videos. Since such content often tends to be private, sensitive, or paid for, there exists a requirement for securing such communication. However, solutions that rely only on traditional compute-intensive security mechanisms are unsuitable for resource-constrained wireless and embedded devices. In this paper, we propose a selective partial image encryption scheme for image data hiding, which enables highly efficient secure communication of image data to and from resource-constrained wireless devices. The encryption scheme is invoked during the image compression process, with the encryption being performed between the quantizer and the entropy coder stages. Three data selection schemes are proposed: subband selection, data bit selection and random selection. We show that these schemes make secure communication of images feasible for constrained embedded devices. In addition, we demonstrate how these schemes can be dynamically configured to trade off the amount of data hiding achieved against the computation requirements imposed on the wireless devices. Experiments conducted on over 500 test images reveal that, by using our techniques, the fraction of data to be encrypted varies between 0.0244% and 0.39% of the original image size. The peak signal-to-noise ratios (PSNR) of the encrypted images were observed to vary between about 9.5 dB and 7.5 dB. In addition, visual tests indicate that our schemes are capable of providing a high degree of data hiding with much lower computational costs.
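
To make the "data bit selection" idea and the PSNR figure concrete, the sketch below flips one chosen bit of each quantized value with a keystream and then measures PSNR = 10·log10(peak² / MSE). It is an illustration only and not the scheme from the paper: the function names are invented, the input is treated as plain 8-bit values rather than real quantizer output, and Python's `random.Random` stands in for a keystream generator although it is not cryptographically secure.

```python
import math
import random


def encrypt_selected_bit(coeffs, bit, key):
    """XOR one selected bit position of every coefficient with a keystream bit.

    Applying the function twice with the same key restores the input.
    random.Random is used only for illustration; it is NOT a secure cipher.
    """
    rng = random.Random(key)
    return [c ^ (rng.getrandbits(1) << bit) for c in coeffs]


def psnr(original, distorted, peak=255):
    """Peak signal-to-noise ratio in dB between two equal-length sequences."""
    mse = sum((a - b) ** 2 for a, b in zip(original, distorted)) / len(original)
    return float("inf") if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)


# Hypothetical 8-bit samples standing in for quantized image data.
samples = [10, 200, 37, 90, 128, 255, 0, 64]
hidden = encrypt_selected_bit(samples, bit=7, key=1234)   # scramble the top bit
restored = encrypt_selected_bit(hidden, bit=7, key=1234)  # same key undoes it
print(psnr(samples, hidden), restored == samples)
```

Scrambling a high-order bit distorts the image strongly while touching only a small share of the data, which is the trade-off the abstract describes. A low PSNR for the encrypted image is therefore the desired outcome here.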

Feasibility of Automated Detection of Inter-fractional Deviation in Patient Positioning Using Structural Similarity Index: Preliminary Results (Structural Similarity Index 인자를 이용한 방사선 분할 조사간 환자 체위 변화의 자동화 검출능 평가: 초기 보고)

  • Youn, Hanbean;Jeon, Hosang;Lee, Jayeong;Lee, Juhye;Nam, Jiho;Park, Dahl;Kim, Wontaek;Ki, Yongkan;Kim, Donghyun
    • Progress in Medical Physics / v.26 no.4 / pp.258-266 / 2015
  • Modern radiotherapy techniques, which deliver a large dose to patients, require confirming the positions of patients or tumors more accurately by using high-definition X-ray projection images. However, a rapid increase in the patient's exposure and in image information for CT image acquisition may be an additional burden on the patient. In this study, by introducing the structural similarity (SSIM) index, which can effectively extract the structural information of an image, we analyze the differences between daily acquired X-ray images of a patient to verify the accuracy of patient positioning. First, to simulate a moving target, spherical computational phantoms of varying sizes and positions were created to acquire projected images. Differences between the images were automatically detected and analyzed by extracting their SSIM values. In addition, as a clinical test, differences between daily acquired X-ray images of a patient over 12 days were detected in the same way. As a result, we confirmed that the SSIM index varied in the range of 0.85~1 (0.006~1 when a region of interest (ROI) was applied) as the sizes or positions of the phantom changed. The SSIM was more sensitive to changes of the phantom when the ROI was limited to the phantom itself. In the clinical test, the daily change of patient positions was 0.799~0.853 in SSIM values, which described the differences among images well. Therefore, we expect that the SSIM index can provide an objective and quantitative technique to verify the patient position using simple X-ray images, instead of time- and cost-intensive three-dimensional X-ray images.
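
For readers unfamiliar with the index, the sketch below evaluates the standard SSIM formula, SSIM(x, y) = (2μxμy + C1)(2σxy + C2) / ((μx² + μy² + C1)(σx² + σy² + C2)), over a whole image at once. This is a simplification made for brevity: practical implementations, and possibly the one used in the study, compute SSIM in local windows and average the results. The pixel values are invented.

```python
def ssim_global(x, y, dynamic_range=255.0, k1=0.01, k2=0.03):
    """SSIM between two equal-length pixel sequences using a single global window.

    k1 and k2 are the commonly used default stabilizing constants.
    """
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    # Sample variances and covariance.
    vx = sum((a - mx) ** 2 for a in x) / (n - 1)
    vy = sum((b - my) ** 2 for b in y) / (n - 1)
    cxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)
    c1 = (k1 * dynamic_range) ** 2
    c2 = (k2 * dynamic_range) ** 2
    return ((2 * mx * my + c1) * (2 * cxy + c2)) / (
        (mx * mx + my * my + c1) * (vx + vy + c2)
    )


# Hypothetical flattened "images": a reference and a mirrored copy.
reference = [10.0, 50.0, 90.0, 130.0, 170.0, 210.0]
mirrored = reference[::-1]
print(ssim_global(reference, reference), ssim_global(reference, mirrored))
```

Restricting the pixel sequences to a region of interest before calling the function corresponds to the ROI variant in the abstract, where the index became more sensitive to phantom changes.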

Importance of Microtextural and Geochemical Characterizations of Soils on Landslide Sites (산사태지역 토층의 미세조직과 지화학적 특성의 중요성)

  • Kim Kyeong-Su;Choo Chang-Oh;Booh Seong-An;Jeong Gyo-Cheol
    • The Journal of Engineering Geology / v.15 no.4 s.42 / pp.447-462 / 2005
  • The purposes of this study are to evaluate and discuss the importance of geochemical properties of soil materials that play an important role in the occurrence of landslides, using analyses of microtexture, particle size distribution, XRD, and FE-SEM equipped with an energy dispersive spectrometer on soils collected from landslide slopes of gneiss, granite and sedimentary rock areas. Soils from gneiss and granite areas where landslides took place have a higher clay content than those from non-landslide areas, and this is particularly pronounced in the granite area. Therefore, the clay content is considered a sensitive factor in landslides. Clay minerals contained in the soils are illite, chlorite, kaolinite and montmorillonite. The content of clay minerals is highest in soils from the Tertiary sedimentary rocks, with abundant montmorillonite as an expandable species. It is believed that this area was more vulnerable to landslides than other areas because of its high montmorillonite content, even under weak precipitation. Since no conspicuous difference in mineralogy between the landslide area and the non-landslide area can be found, the occurrence of landslides may be influenced not by mineralogy but by local geography and the mechanical properties of soils. Geochemical information on weathering properties, mineralogy, and microtexture of soils is helpful for better understanding the causes and patterns of landslides, together with engineering geological analyses.

Relativeness between Growth and Bio-informations of Aeroponically Grown Tomato as Influenced by Spray Intervals of Nutrient Solution (양액의 분무간격에 따른 분무경재배 토마토의 생장 및 생체정보와의 관련성)

  • 정순주;소원온;지전영남;영목방부
    • Journal of Bio-Environment Control / v.1 no.2 / pp.154-161 / 1992
  • This experiment was carried out to determine the relationship between growth, yield characters and bio-information as influenced by the spray and rest time intervals of nutrient solution. Tomatoes (Lycopersicon esculentum Mill.) were grown in an aeroponic system on misting schedules of continuous spray and of 60 sec, 30 sec and 10 sec spray at 10 min intervals, with full strength Yamazaki's solution recommended for tomato production. The results obtained were as follows: 1. Leaf area was highest in the plot of 30 sec spray and 10 min rest, while the lowest was in the plot of 60 sec spray and 10 min rest. Growth characteristics in terms of dry weight of each organ, number of flowers, number of flowers set and fruit dry weight were greater in the plot of 30 sec spray and 10 min rest than in the other treatments. 2. The number of flowers increased with decreasing dry weight, but the number of flowers set was not significantly different among treatments except for the plot of 60 sec spray and 10 min rest. 3. Leaf dry weight and fruit dry weight were highly correlated, so that the 30 sec spray and 10 min rest plot, which had the highest fruit dry weight, showed the largest leaf area. The continuously sprayed plot showed markedly reduced fruit dry weight relative to leaf area. The optimum spray and rest time of nutrient solution within the range of this experiment was determined to be 30 sec spray and 10 min rest. 4. Solar radiation within the glasshouse during daytime was severely reduced compared with outdoors, and air temperature within the greenhouse was higher than the leaf temperature of the tomato plants. Changes in the environmental factors, solar radiation and temperature, were accompanied by sensitive changes in the bio-information of the tomato leaf. In particular, differences in the spray intervals of nutrient solution greatly affected the changes in bio-information: leaf water potential, stomatal resistance, leaf temperature, etc. 5. The changing patterns of leaf growth as influenced by the spray and rest intervals of nutrient solution were closely related to the leaf water potential, stomatal resistance and leaf temperature. It was demonstrated to be feasible that measurement of the bio-information of the tomato leaf, as influenced by changes in environmental factors, could be used to estimate the amount of growth and fruit yield.


Effective Geophysical Methods in Detecting Subsurface Caves: On the Case of Manjang Cave, Cheju Island (지하 동굴 탐지에 효율적인 지구물리탐사기법 연구: 제주도 만장굴을 대상으로)

  • Kwon, Byung-Doo;Lee, Heui-Soon;Lee, Gyu-Ho;Rim, Hyoung-Rea;Oh, Seok-Hoon
    • Journal of the Korean earth science society / v.21 no.4 / pp.408-422 / 2000
  • Multiple geophysical methods were applied over the Manjang cave area in Cheju Island to compare and contrast the effectiveness of each method for exploration of underground cavities. The methods used were gravity, magnetic, electrical resistivity and GPR (Ground Penetrating Radar) surveys, for which the instruments are portable and operations are relatively economical. We chose seven survey lines and applied appropriate multiple surveys depending on the field conditions. In the case of the magnetic method, two-dimensional grid-type surveys were carried out to cover the survey area. The geophysical survey results reveal the characteristic responses of each method relatively well. Among the applied methods, the electrical resistivity methods appeared to be the most effective in detecting the Manjang Cave and surrounding miscellaneous cavities. In particular, on the inverted resistivity section obtained from the dipole-dipole array data, the two-dimensional distribution of high-resistivity cavities is revealed well. The gravity and magnetic data are easily contaminated by various noises and do not show responses definitive enough to locate and delineate the Manjang cave, but they provide useful information for verifying the dipole-dipole resistivity survey results. The grid-type 2-D magnetic survey data show the trend of cave development well, and may be used as a reconnaissance regional survey for determining survey lines for further detailed explorations. The GPR data respond very sensitively to various shallow volcanic structures, such as thin spaces between lava flows and small cavities, so the response of the main cave cannot be identified. Although each geophysical method provides its own useful information, the integrated interpretation of multiple survey data is most effective for investigation of underground caves.
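
As background for the dipole-dipole data mentioned above, the sketch below converts one field reading into apparent resistivity with the standard geometric factor for that array, ρa = π·n·(n+1)·(n+2)·a·ΔV/I, where a is the dipole length and n the dipole separation factor. The function name and the numbers are illustrative assumptions. The inversion that turns such readings into a resistivity section is a separate, much larger step not shown here.

```python
import math


def dipole_dipole_apparent_resistivity(delta_v, current, a, n):
    """Apparent resistivity (ohm-m) for a dipole-dipole array reading.

    delta_v: measured potential difference (V)
    current: injected current (A)
    a:       dipole length (m)
    n:       separation between the dipoles, in multiples of a
    """
    geometric_factor = math.pi * n * (n + 1) * (n + 2) * a
    return geometric_factor * delta_v / current


# Hypothetical reading: 5 m dipoles, n = 3, 12 mV measured at 0.1 A.
rho_a = dipole_dipole_apparent_resistivity(delta_v=0.012, current=0.1, a=5.0, n=3)
print(rho_a)
```

An air-filled cavity is far more resistive than the surrounding rock, so readings whose current path crosses a cave tend to give elevated apparent resistivity, which is consistent with the high-resistivity anomalies described in the abstract.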
