• Title/Summary/Keyword: mixed data set


An enhanced feature selection filter for classification of microarray cancer data

  • Mazumder, Dilwar Hussain;Veilumuthu, Ramachandran
    • ETRI Journal / v.41 no.3 / pp.358-370 / 2019
  • The main aim of this study is to select the optimal set of genes from microarray cancer datasets that contribute to the prediction of specific cancer types. This study proposes the enhancement of the feature selection filter algorithm based on Joe's normalized mutual information and its use for gene selection. The proposed algorithm is implemented and evaluated on seven benchmark microarray cancer datasets, namely, central nervous system, leukemia (binary), leukemia (3 class), leukemia (4 class), lymphoma, mixed lineage leukemia, and small round blue cell tumor, using five well-known classifiers, including the naive Bayes, radial basis function network, instance-based classifier, decision-based table, and decision tree. An average increase in the prediction accuracy of 5.1% is observed on all seven datasets averaged over all five classifiers. The average reduction in training time is 2.86 seconds. The performance of the proposed method is also compared with those of three other popular mutual information-based feature selection filters, namely, information gain, gain ratio, and symmetric uncertainty. The results are impressive when all five classifiers are used on all the datasets.
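
The three comparison filters named in this abstract (information gain, gain ratio, and symmetric uncertainty) can be sketched for discretized features as follows. This is a minimal illustration using the standard textbook definitions, not the authors' enhanced filter based on Joe's normalized mutual information; the function names and the toy gene columns in the usage note are hypothetical.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def conditional_entropy(labels, feature):
    """H(labels | feature) for two equal-length discrete sequences."""
    n = len(labels)
    groups = {}
    for x, y in zip(feature, labels):
        groups.setdefault(x, []).append(y)
    return sum(len(g) / n * entropy(g) for g in groups.values())


def filter_scores(feature, labels):
    """Score one discrete feature against the class labels."""
    h_x, h_y = entropy(feature), entropy(labels)
    ig = h_y - conditional_entropy(labels, feature)        # information gain
    gr = ig / h_x if h_x > 0 else 0.0                      # gain ratio
    su = 2 * ig / (h_x + h_y) if (h_x + h_y) > 0 else 0.0  # symmetric uncertainty
    return {"information_gain": ig, "gain_ratio": gr, "symmetric_uncertainty": su}


def rank_features(columns, labels, key="symmetric_uncertainty"):
    """Return feature names sorted from most to least relevant."""
    scores = {name: filter_scores(col, labels)[key] for name, col in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)
```

For example, `rank_features({"g1": [0, 0, 1, 1], "g2": [0, 1, 0, 1]}, [0, 0, 1, 1])` places the hypothetical gene `g1` first because it determines the class label, while `g2` carries no information about it.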

Hydrologic evaluation of SWAT considered forest type using MODIS LAI data: a case of Yongdam Dam watershed (MODIS LAI 자료를 활용하여 임상별로 고려한 SWAT의 수문 평가: 용담댐유역을 대상으로)

  • Han, Daeyoung;Lee, Jiwan;Kim, Wonjin;Baek, Seungchul;Kim, Seongjoon
    • Journal of Korea Water Resources Association / v.54 no.11 / pp.875-889 / 2021
  • This study compares and analyzes Soil and Water Assessment Tool (SWAT) results against Terra MODIS (Moderate Resolution Imaging Spectroradiometer) data for coniferous, deciduous, and mixed forest in the watershed upstream of Yongdam Dam (904.4 km²). The hydrologic evaluation period was set to 10 years, from 2010 to 2019, and the applicability of the 8-day MOD15A2 Leaf Area Index (LAI) data was evaluated together with data from three Time Domain Reflectometry (TDR) stations (GB, JC, CC) and evaporation volume (YDD) data from one flux tower (DU). The R² values for coniferous, deciduous, and mixed forest were 0.95, 0.89, and 0.90, respectively; the R² values at the soil moisture and evapotranspiration stations were 0.50 to 0.55 and 0.51, respectively; and for Yongdam inflow, R² was 0.74, RMSE 2.75 mm/day, NSE 0.70, and PBIAS 14.3%. Based on the calibrated and validated watershed model, the annual average evapotranspiration was calculated as 469.7 mm for coniferous, 501. mm for deciduous, and 511.5 mm for mixed forest, and total runoff was estimated at 909.8 mm for coniferous, 860.6 mm for deciduous, and 864.2 mm for mixed forest. Annual average evapotranspiration was higher for deciduous forest, whereas streamflow was higher for coniferous forest. Unlike the other hydrologic components, which showed similar patterns throughout the year, the average annual evapotranspiration of deciduous forest was about 7% higher than that of coniferous forest, owing to the higher evapotranspiration of deciduous forest, which has a high leaf area index in summer and fall. In addition, surface runoff and lateral flow were 9% and 6% higher for deciduous forest, but groundwater flow was 77% higher for coniferous forest. It was therefore confirmed that total runoff decreased in the order of coniferous, mixed, and deciduous forest.
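
The goodness-of-fit statistics reported in this abstract (R², RMSE, NSE, and PBIAS) can be computed from paired observed and simulated series as sketched below. The abstract does not state the exact formulas used, so this sketch assumes the common conventions: R² as the squared Pearson correlation, and PBIAS as 100 × Σ(obs − sim) / Σ(obs), where positive values indicate underestimation by the model. The function names are hypothetical.

```python
import math


def r_squared(obs, sim):
    """Coefficient of determination as the squared Pearson correlation."""
    n = len(obs)
    mean_o, mean_s = sum(obs) / n, sum(sim) / n
    cov = sum((o - mean_o) * (s - mean_s) for o, s in zip(obs, sim))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_s = sum((s - mean_s) ** 2 for s in sim)
    return cov * cov / (var_o * var_s)


def rmse(obs, sim):
    """Root mean square error, in the units of the input series."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_o = sum(obs) / len(obs)
    residual = sum((o - s) ** 2 for o, s in zip(obs, sim))
    spread = sum((o - mean_o) ** 2 for o in obs)
    return 1.0 - residual / spread


def pbias(obs, sim):
    """Percent bias (assumed convention: positive means the model underestimates)."""
    return 100.0 * sum(o - s for o, s in zip(obs, sim)) / sum(obs)
```

With observed values `[1, 2, 3, 4]` and simulated values `[2, 3, 4, 5]`, the series are perfectly correlated (R² = 1) yet NSE drops to 0.2 and PBIAS is −40%, which shows why the abstract reports several statistics rather than R² alone.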

Study on Long-Term Preservation of Hwangnyunhaedok-Tang Pharmacopuncture (황련해독탕 약침의 장기보존시험에 관한 연구)

  • Lee, Jin-Ho;Ha, In-Hyuk;Kim, Me-Riong;Chung, Hwa-Jin;Lee, Jae-Woong;Kim, Min-Jeong;Kim, Eun-Jee;Lee, In-Hee
    • Journal of Korean Medicine Rehabilitation / v.26 no.2 / pp.51-59 / 2016
  • Objectives: We studied the long-term stability of a mixed preparation of distilled and 70% alcohol-extracted Hwangnyunhaedok-tang pharmacopuncture to establish standards for expiration date and quality control. Methods: Three consecutively prepared lots of Hwangnyunhaedok-tang pharmacopuncture were each tested in triplicate, for a total of 5 tests at 3-month intervals over a period of 12 months, for appearance, pH, specific gravity, index component content, endotoxins, microbial sterility, residual organic solvents, heavy metals, and pesticides. Items with no difference by elapsed time were tested at the initial and final timepoints, and data for items with a potential difference by elapsed time were analyzed for trends to establish individual quality control standards. Results: All tested items were stable over the study period, and the expiration date was therefore set at 12 months. Quality control standards were set at 3.66~5.69 for pH and 0.802~1.203 for specific gravity. For index component content, standards were set at 4.96~8.98 μg/vial for berberine, 6.47~10.31 μg/vial for baicalin, and 116.03~189.55 μg/vial for geniposide. Standards for other items with no difference by elapsed time were set according to general Korean herbal medicine standards in the Korean Pharmacopoeia. Conclusions: Manageable expiration date and quality control standards were established through long-term preservation testing of Hwangnyunhaedok-tang pharmacopuncture, furthering the standardization of Korean medicine pharmacopuncture.

Estimation of Biogenic Emissions over South Korea and Its Evaluation Using Air Quality Simulations (남한지역 자연 배출량 산정 및 대기질 모사를 이용한 평가)

  • Kim, Soon-Tae;Moon, Nan-Kyoung;Cho, Kyu-Tak;Byun, Dae-Won W.;Song, Eun-Young
    • Journal of Korean Society for Atmospheric Environment / v.24 no.4 / pp.423-438 / 2008
  • BEIS2 (Biogenic Emissions Inventory System version 2) and BEIS3.12 (BEIS version 3.12) were used to estimate hourly biogenic emissions over South Korea using a set of vegetation data and meteorological data simulated with the MM5 (Mesoscale Model version 5). The two biogenic emission models used different emission factors and responded differently to solar radiation, resulting in a difference of about 10~20% in the nationwide isoprene emission estimates. Among the 11 vegetation classes, mixed forest and deciduous forest were found to be the most important sources of isoprene emissions over South Korea, accounting for ~90% of the total. The simulated isoprene concentrations over the Seoul metropolitan area show diurnal and daily variations that match relatively well with the PAMS (Photochemical Air Monitoring Station) measurements during the period of June 3~June 10, 2004. Compared to BEIS2, BEIS3.12 yielded ~35% higher isoprene concentrations during the daytime and better matched the high peaks observed over the Seoul area. This study showed the importance of vegetation data and emission factors in estimating biogenic emissions. Improving domestic vegetation categories and emission factors is therefore expected to better represent biogenic emissions over South Korea.

A Hangul Script Matching Algorithm for PDA (PDA상에서의 한글 필기체 매칭 알고리즘)

  • Cho, Mi-Gyung;Cho, Hwan-Gue
    • Journal of KIISE:Software and Applications / v.29 no.10 / pp.684-693 / 2002
  • Electronic ink is data stored in the form of handwritten text or script, without conversion to ASCII by handwriting recognition, on pen-based computers and Personal Digital Assistants (PDAs) to support natural and convenient data input. One of the most important issues is how to search electronic ink in order to use it. We propose and implement a script matching algorithm for electronic ink. The proposed algorithm separates the input stroke into a set of primitive strokes using the curvature of the stroke curve. After determining the type of each separated stroke, it produces a stroke feature vector. It then calculates the distance between the stroke feature vector of the input strokes and that of the strokes in the database using dynamic programming. In various experiments, the algorithm showed a high matching rate of over 97.7% for Korean-only script and 94% for data mixing Korean and Chinese characters.
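
The final matching step described in this abstract, a dynamic-programming distance between sequences of stroke feature vectors, can be sketched as below. The abstract does not specify the recurrence or the local cost, so this sketch assumes a dynamic-time-warping style alignment with Euclidean distance between feature vectors; the function names and the database layout are hypothetical.

```python
import math


def vector_distance(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def dp_stroke_distance(query, candidate):
    """Alignment cost between two sequences of stroke feature vectors.

    cost[i][j] is the cheapest way to align the first i query strokes
    with the first j candidate strokes (assumed DTW-style recurrence).
    """
    n, m = len(query), len(candidate)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = vector_distance(query[i - 1], candidate[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # query stroke repeats a candidate stroke
                cost[i][j - 1],      # candidate stroke repeats a query stroke
                cost[i - 1][j - 1],  # one-to-one match
            )
    return cost[n][m]


def best_match(query, database):
    """Return the key of the stored script closest to the query."""
    return min(database, key=lambda name: dp_stroke_distance(query, database[name]))
```

A warping alignment of this kind tolerates a stroke being split into more primitive strokes in one sample than in another, which is one plausible reason to prefer dynamic programming over a position-by-position comparison.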

Corporate Social Responsibility and Financial Performance: The impact of the MSCI ESG Ratings on Korean Firms (기업의 사회책임과 재무성과: 한국기업의 MSCI ESG 평가를 중심으로)

  • Kim, Jinwook;Chung, Sunggon;Park, Cheongkyu
    • Journal of the Korea Academia-Industrial cooperation Society / v.14 no.11 / pp.5586-5593 / 2013
  • This study investigates how the Corporate Social Responsibility (CSR) performance of a firm is associated with its financial performance in the stock market. Prior studies provide mixed evidence on the relation between CSR and financial performance. This study sheds some light on the positive effect of CSR on firms' financial performance. Using a unique set of data on the CSR performance of Korean firms provided by Morgan Stanley Capital International (MSCI), we find that firms' CSR performance is positively associated with their contemporaneous stock returns and Tobin's Q in the Korean market. This finding suggests that stock market participants value firms' CSR activities. This is the first study to provide empirical evidence of a positive association between the CSR performance of Korean firms and their financial performance using MSCI data, which are considered more reliable than the data used in prior CSR studies in Korea.

Characteristics of Unconfined Compressive Strength of Dredged Clay Mixed with Friendly Soil Hardening Agent (준설토와 친토양 경화재 혼합지반의 일축강도특성)

  • Oh, Sewook;Yeon, Yonghum;Kwon, Youngcheul
    • Journal of the Korean GEO-environmental Society / v.17 no.10 / pp.73-81 / 2016
  • Many problems have occurred in recent construction projects on soft ground with low strength and high compressibility. Soil improvement methods have therefore been developed to obtain high strength within a relatively short curing time. The effect of improvement on soft ground was estimated based on laboratory tests using undisturbed marine clay. The deep mixing method using cement has been used in practice for decades to improve the mechanical properties of soft ground. However, previous research focused on the short-term strength of clay treated with about 10% cement. In this paper, cement and Natural Soil Stabilizer (NSS) were used as stabilizing agents to obtain trafficability and mechanical strength of the soft clay. Based on several laboratory tests, an optimum condition was proposed to ensure the mechanical strength and compressibility of soil mixed with cement and NSS as foundation soil. Finally, research data are presented on the applicability of NSS as a stabilizing agent for increasing the mechanical strength of soft clay.

Accident Analysis and Discussion of Circular Intersections based on Land Use and Vehicle Type (토지이용과 차종에 근거한 원형교차로 사고분석 및 논의)

  • Lee, Min Yeong;Park, Byung Ho
    • International Journal of Highway Engineering / v.20 no.2 / pp.75-85 / 2018
  • PURPOSES : This study aimed to analyze traffic accidents at circular intersections, and discuss accident reduction strategies based on land use and vehicle type. METHODS : Traffic accident data from 2010 to 2014 were collected from the "traffic accident analysis system" (TAAS) data set of the Road Traffic Authority. To develop the accident rate model, a multiple linear regression model was used. Explanatory variables such as geometry and traffic volume were used to develop the models. RESULTS : The main results of the study are as follows. First, it was found that the null hypotheses that land use and vehicle type do not affect the accident rate should be rejected. Second, 16 accident rate models, which are statistically significant (with high R² values), were developed. Finally, the area of the central island, number of speed humps, entry lane width, circulatory roadway width, bus stops, and pedestrian crossings were analyzed to determine their effect on accidents according to the type of land use and vehicle. CONCLUSIONS : Through the developed accident rate models, it was revealed that the accident factors at circular intersections changed depending on land use and vehicle type. Thus, selecting the appropriate location of bus stops for trucks, widening entry lanes for cars, and installing splitter islands and optimal lighting for motorcycles were determined to be important for reducing the accident rate. Additionally, the evaluation showed that commercial and mixed land use had a weaker effect on accidents than residential land use.

A dominant hyperrectangle generation technique of classification using IG partitioning (정보이득 분할을 이용한 분류기법의 지배적 초월평면 생성기법)

  • Lee, Hyeong-Il
    • Journal of the Korea Society of Computer and Information / v.19 no.1 / pp.149-156 / 2014
  • NGE (Nested Generalized Exemplar) can improve performance on noisy data while reducing the size of the model. It is an optimal distance-based classification method that uses a matching rule. However, the crossing and overlapping of hyperrectangles generated during learning have been noted as factors that inhibit its performance. In this paper, we propose the DHGen (Dominant Hyperrectangle Generation) algorithm, which avoids overlapping and crossing between hyperrectangles and uses interval weights to split mixed hyperrectangles based on mutual information. DHGen improves classification performance and reduces the number of hyperrectangles by processing the training set in an incremental manner. On benchmark data sets from the UCI Machine Learning Repository, the proposed DHGen showed classification performance comparable to k-NN and better results than the EACH system, which implements the NGE theory.
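
The distance-based matching rule that hyperrectangle classifiers of the NGE family rely on can be sketched as follows: a query point is at distance zero from any hyperrectangle that contains it, and otherwise at the distance to the nearest face. This is a generic, unweighted illustration under that assumption, not the DHGen algorithm itself (its interval weights and mutual-information splitting are not described in enough detail here); the function names and the data layout are hypothetical.

```python
import math


def distance_to_hyperrectangle(point, lower, upper):
    """Euclidean distance from a point to an axis-aligned hyperrectangle.

    Dimensions where the point lies inside [lower, upper] contribute zero.
    """
    total = 0.0
    for x, lo, hi in zip(point, lower, upper):
        if x < lo:
            gap = lo - x
        elif x > hi:
            gap = x - hi
        else:
            gap = 0.0
        total += gap * gap
    return math.sqrt(total)


def classify(point, hyperrectangles):
    """Return the label of the nearest hyperrectangle.

    hyperrectangles is a list of (lower, upper, label) tuples.
    """
    nearest = min(
        hyperrectangles,
        key=lambda h: distance_to_hyperrectangle(point, h[0], h[1]),
    )
    return nearest[2]
```

When two hyperrectangles with different labels overlap, a point in the shared region is at distance zero from both and the rule above cannot separate them, which illustrates why the abstract treats overlapping and crossing as a problem to avoid.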

Analysis of Photoelastic Stress Field Around Inclined Crack Tip by Using Hybrid Technique (하이브리드 기법에 의한 경사균열 팁 주위의 광탄성 응력장 해석)

  • Chen, Lei;Seo, Jin;Lee, Byung-Hee;Kim, Myung-Soo;Baek, Tae-Hyun
    • Transactions of the Korean Society of Mechanical Engineers A / v.34 no.9 / pp.1287-1292 / 2010
  • In this paper, a hybrid technique is presented. First, the isochromatic fringe data of a given set of points are calculated by the finite element method and are used as input data in complex variable formulations. Then the numerical model of the specimen with a central inclined crack is transformed from the physical plane to the complex plane by conformal mapping. The stress field is analyzed and the mixed-mode stress intensity factors are calculated for this complex plane. The stress intensity factors are calculated by the finite element method as well as by a theoretical method and compared with each other. In order to conveniently compare these values with each other, both actual and regenerated photoelastic fringe patterns are multiplied by a factor of two and sharpened by digital image processing.