• Title/Summary/Keyword: mixed data set

Comparison of several criteria for ordering independent components (독립성분의 순서화 방법 비교)

  • Choi, Eunbin;Cho, Sulim;Park, Mira
    • The Korean Journal of Applied Statistics, v.30 no.6, pp.889-899, 2017
  • Independent component analysis (ICA) is a multivariate approach that separates mixed signals into the original signals. It is the most widely used blind source separation technique. Like principal component analysis (PCA) and factor analysis, ICA uses linear transformations, but it differs in that it assumes the original signals are statistically independent and non-Gaussian. PCA has a natural ordering based on the cumulative proportion of explained variance; however, ICA algorithms cannot identify a unique optimal ordering of the components. Setting an order is meaningful because the major components can be used for further analysis such as clustering and low-dimensional graphs. In this paper, we compare the performance of several criteria for determining the order of the components. Kurtosis, absolute value of kurtosis, negentropy, the Kolmogorov-Smirnov statistic and the sum of squared coefficients are considered. The criteria are evaluated by their ability to classify known groups. Two types of data are analyzed for illustration.
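
As a rough illustration of one criterion named in the abstract above (absolute value of kurtosis), the following sketch ranks components by how far their excess kurtosis is from zero. It is not the authors' procedure: the function names `excess_kurtosis` and `order_by_abs_kurtosis` are hypothetical, and the three columns are synthetic stand-ins for estimated components rather than the output of an ICA algorithm.

```python
import numpy as np


def excess_kurtosis(x):
    """Sample excess kurtosis: fourth standardized moment minus 3 (0 for a Gaussian)."""
    z = (x - x.mean()) / x.std()
    return float(np.mean(z ** 4) - 3.0)


def order_by_abs_kurtosis(components):
    """Return column indices sorted by decreasing |excess kurtosis|, plus the scores."""
    scores = np.array([abs(excess_kurtosis(components[:, j]))
                       for j in range(components.shape[1])])
    order = [int(j) for j in np.argsort(-scores)]
    return order, scores


# Synthetic stand-ins for estimated components (an assumption for illustration):
# column 0 is Gaussian, column 1 is Laplace (heavy tails), column 2 is uniform (light tails).
rng = np.random.default_rng(0)
n = 20000
components = np.column_stack([
    rng.normal(size=n),
    rng.laplace(size=n),
    rng.uniform(-1.0, 1.0, size=n),
])

order, scores = order_by_abs_kurtosis(components)
print("component order:", order)
print("abs excess kurtosis:", np.round(scores, 2))
```

Under this criterion the most non-Gaussian component comes first and a near-Gaussian component comes last. Ordering by signed kurtosis instead would rank the light-tailed uniform column below the Gaussian one, which is one way the criteria listed in the abstract can disagree.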

CFD Simulation for Mixture Characteristic of DME-Propane Liquified Fuels (DME-Propane 액화연료의 혼합특성에 대한 CFD 시뮬레이션)

  • Kim, Cha-Hwan;Chun, Seuk-Hoon;Shin, Dong-Woo;Kim, Lae-Hyun;Lee, Hyun-Chan;Baek, Young-Soon
    • Korean Chemical Engineering Research, v.50 no.2, pp.328-333, 2012
  • In this study, a CFD simulation was performed with the commercial CFD code FLUENT for a 3D mixing tank model (1 m in diameter and 2.5 m in height) of DME-propane liquefied fuels. The initial condition placed 146 L of DME in the upper part of the mixing tank and 770 L of propane in the lower part. The mixture characteristics and fluid flow were observed over a 34-hour simulation. The two liquid fuels were mixed uniformly to within 3 mol% after 24 hours and to within 1 mol% after 34 hours. The simulation result after 4 hours was verified against KOGAS experimental data.

Bayesian analysis of finite mixture model with cluster-specific random effects (군집 특정 변량효과를 포함한 유한 혼합 모형의 베이지안 분석)

  • Lee, Hyejin;Kyung, Minjung
    • The Korean Journal of Applied Statistics, v.30 no.1, pp.57-68, 2017
  • Clustering algorithms attempt to find a partition of a finite set of objects into a potentially predetermined number of nonempty subsets. Gibbs sampling of a normal mixture of linear mixed regressions with a Dirichlet prior distribution calculates posterior probabilities when the number of clusters is known. Our approach provides simultaneous partitioning and parameter estimation with the computation of classification probabilities. A Monte Carlo study of curve estimation showed that the model was useful for function estimation. Examples are given to show how these models perform on real data.

The Asymmetric Relationship between Output Volatility and Growth : Evidence from the U.K. Industrial Production (영국 산업생산 자료에 나타난 성장률과 변동성간의 비대칭적 관계)

  • Kim, Jan R.
    • International Area Studies Review, v.14 no.3, pp.86-107, 2010
  • Prior research on the relationship between output volatility and growth has produced mixed results, failing to provide clear empirical evidence on the sign of the relationship. In this paper, we raise the possibility that this failure is due to misspecification in the empirical models previously used, i.e., not taking into account the business cycle dependence of the volatility-growth relation. We start from the conjecture that higher volatility has a qualitatively different effect on growth depending on whether the economy is in expansion or contraction. We estimate a series of ARCH-type models with monthly industrial production data for the U.K., and find strong evidence that the volatility-growth relation is positive when the economy is in expansion, while higher volatility lowers the growth rate in the contraction phase. We also find evidence that the volatility-growth relation estimated in the paper captures a causal relation, not a bidirectional correlation.

The Structural Studies of Peptide P143 Derived from Apo B-100 by NMR

  • Lee, Ji-Eun;Kim, Gil-Hoon;Won, Ho-Shik
    • Journal of the Korean Magnetic Resonance Society, v.25 no.4, pp.58-63, 2021
  • Apolipoprotein B-100 (apo B-100), the main protein component of low-density lipoprotein (LDL), consists of 4,536 amino acids and binds to the LDL receptor. LDL peptides oxidized by malondialdehyde (MDA) or acetylated in vivo act as immunoglobulin (Ig) antigens, and the peptides, each spanning 20 consecutive amino acids (P1-P302), were classified into 7 groups. The biomimetic peptide P143 (IALDD AKINF NEKLS QLQTY), which carries the highest IgG antigen value among the C-group peptides, was selected for structural studies that may shed light on antigen specificity. Experimental results show that P143 has a β-sheet structure in Ile[1]-Asn[9] and an α-helix in Gln[16]-Tyr[20]. Homonuclear 2D-NMR (COSY, TOCSY, NOESY) experiments were carried out for NMR signal assignment and structure determination of P143. On the basis of the completely assigned NMR spectra and proton distance information, distance geometry (DG) and molecular dynamics (MD) calculations were carried out to determine the structure of P143. The proposed structure was selected by comparing experimental NOE spectra with 2D NOE results back-calculated from the determined structure, which showed acceptable agreement. The total root-mean-square deviation (RMSD) of P143 obtained upon superposition of all atoms was within the set range. In solution, P143 has a mixed structure of pseudo α-helix and β-turn (Phe[10] to Glu[12]). These results are consistent with the structure calculated from the experimental NOE data. Structural studies based on NMR may contribute to studies on preventing oxidation in atherosclerosis and to observing the conformational characteristics of apo B-100 in LDL using monoclonal antibodies.

Weather Conditions Drive the Damage Area Caused by Armillaria Root Disease in Coniferous Forests across Poland

  • Pawel Lech;Oksana Mychayliv;Robert Hildebrand;Olga Orman
    • The Plant Pathology Journal, v.39 no.6, pp.548-565, 2023
  • Armillaria root disease affects forests around the world. It occurs in many habitats and causes losses in the infested stands. Weather conditions are important factors for the growth and development of Armillaria species. Yet the relation between the occurrence of damage caused by Armillaria disease and weather variables is still poorly understood. We therefore used generalized linear mixed models to determine the relationship between weather conditions of the current and previous year (temperature, precipitation and their deviation from long-term averages, air humidity and soil temperature) and the incidence of Armillaria-induced damage in young (up to 20 years old) and older (over 20 years old) coniferous stands in selected forest districts across Poland. We used unique data on tree damage incidence from Armillaria root disease, gathered over 23 years (1987-2009), and meteorological parameters from the 24-year period 1986-2009, to reflect the dynamics of damage occurrence and weather conditions. Weather parameters were better predictors of damage caused by Armillaria disease in younger stands than in older ones. The strongest predictor was soil temperature, especially that of the previous year's growing season and the current year's spring. We found that temperature and precipitation in different seasons of the previous year had a more pronounced effect on the young stand area affected by Armillaria. Each stand age class was characterized by a different set of meteorological parameters that explained the area of disease occurrence. Moreover, forest district was included in all models and was thus an important variable in explaining the stand area affected by Armillaria.

Scanning Determination & Observation Features by Sex shown in the Process of Acquiring Visual Information - With the Object of Subway Station Hall Space - (시각정보획득과정에 나타난 주사판정과 성별 주시특성 - 지하철 홀 공간을 대상으로 -)

  • Kim, Jong-Ha;Choi, Gae-Young
    • Korean Institute of Interior Design Journal, v.23 no.6, pp.115-124, 2014
  • This study carried out scanning tests to identify how scanning search differs by the sex of space users, and used the results to assess the validity of the data. Scanning patterns were set up to verify the typology of scanning paths, and the basis for determining scanning paths and the validity of the estimation method were then reviewed. Because observation features depend on sex, analyzing the visual activities involved in acquiring information in a space can reveal the intention and purpose of space users. The findings on scanning patterns by sex are as follows. First, to estimate the process of space-information search, the movement distance at each point of the continuous observation data was extracted from the eye-movement angle, and on that basis eye fixation and eye movement were defined in order to establish scanning-cut characteristics. Second, scanning times were estimated in order to extract the effective observation data used for comparative analysis, which showed that men had more data (3,398.2/64.4%) than women (2,998.2/55.6%). This suggests that men had relatively fewer scanning cuts, which indicates that men acquire more spatial information than women when observing a space. Third, men scanned fewer times than women (58.0 times/2.02 seconds versus 71.9 times/1.39 seconds), while their scanning time was longer; that is, men scan less often than women but take longer per scan. Fourth, combining this result with the generally dominant viewpoint characteristics of each sex indicates that men use mixed scanning and spend a longer time acquiring space information, whereas women use concentrated scanning, either focusing on a single point for a short time or staying at one location for a considerably long time.

Comparison of Korean Classification Models' Korean Essay Score Range Prediction Performance (한국어 학습 모델별 한국어 쓰기 답안지 점수 구간 예측 성능 비교)

  • Cho, Heeryon;Im, Hyeonyeol;Yi, Yumi;Cha, Junwoo
    • KIPS Transactions on Software and Data Engineering, v.11 no.3, pp.133-140, 2022
  • We investigate the performance of deep learning-based Korean language models on the task of predicting the score range of Korean essays written by foreign students. We construct a data set containing a total of 304 essays, which include essays discussing the criteria for choosing a job ('job'), conditions of a happy life ('happ'), relationship between money and happiness ('econ'), and definition of success ('succ'). These essays were labeled according to four letter grades (A, B, C, and D), and a total of eleven essay score range prediction experiments were conducted (i.e., five for predicting the score range of 'job' essays, five for predicting the score range of 'happiness' essays, and one for predicting the score range of mixed topic essays). Three deep learning-based Korean language models, KoBERT, KcBERT, and KR-BERT, were fine-tuned using various training data. Moreover, two traditional probabilistic machine learning classifiers, naive Bayes and logistic regression, were also evaluated. Experimental results show that the deep learning-based Korean language models performed better than the two traditional classifiers, with KR-BERT performing best at 55.83% overall average prediction accuracy. A close second was KcBERT (55.77%), followed by KoBERT (54.91%). The naive Bayes and logistic regression classifiers achieved 52.52% and 50.28%, respectively. Due to the scarcity of training data and the imbalance in class distribution, the overall prediction performance was not high for any classifier. Moreover, the classifiers' vocabulary did not explicitly capture the error features that would help in correctly grading the Korean essays. By overcoming these two limitations, we expect the score range prediction performance to improve.

Development of glufosinate-tolerant GMO detection markers for food safety management (식품안전관리를 위한 제초제 glufosinate 특이적 GM 작물 검출마커 개발)

  • Song, Minji;Qin, Yang;Cho, Younsung;Park, TaeSung;Lim, Myung-Ho
    • Korean Journal of Food Science and Technology, v.52 no.1, pp.40-45, 2020
  • Over 500 genetically modified organisms (GMOs) have been developed since 1996, of which nearly 44% have glufosinate herbicide-tolerant traits. Identifying specific markers for herbicide-tolerant traits is challenging because the DNA sequences of the gene(s) for a trait vary widely depending on the origin of the gene(s), the plant species, and the developers. To develop specific PCR marker(s) for detecting the glufosinate-tolerance trait, the DNA sequences of several pat or bar genes were compared and diverse combinations of PCR primer sets were examined using certified reference materials or transgenic plants. Based on both qualitative and quantitative PCR tests, a primer set specific for pat and non-specific for bar was developed. Additionally, a set of markers that can detect both pat and bar was developed, and the quantitative PCR data indicated that the primer pairs were sensitive enough to detect a mixed seed content rate of 0.1%.

Solvent Extraction of Light (Pr, Nd) and Medium (Tb, Dy) Rare Earth Elements with PC88A of Rare Earth Chloride Solution from Waste Permanent Magnet (폐 영구자석으로부터 회수한 염화희토류용액에서 PC88A를 이용한 경희토류(Pr, Nd)/중희토류(Tb, Dy) 용매추출)

  • Jeon, Su-Byung;Son, InJoon;Lim, Byung-Chul;Kim, Jeong-Mo;Kim, Yeon-Jin;Ha, Tae-Gyu;Yoon, Ho-Sung;Kim, Chul-Joo;Chung, Kyeong-Woo
    • Resources Recycling, v.27 no.3, pp.8-15, 2018
  • The solvent extraction behavior of light rare earth elements (Pr, Nd) and medium rare earth elements (Tb, Dy) in the HCl-PC88A-kerosene extraction system was investigated in order to separate high-purity light rare earths (Pr, Nd) and medium rare earths (Tb, Dy) from a mixed rare earth chloride solution. In the batch test step, it was confirmed that the separation efficiency was good when the extractant (PC88A) concentration was 0.5 M, the equilibrium pH after extraction was 0.8 to 1.0 (initial feed pH 1.3), the hydrochloric acid concentration in the scrubbing solution was 0.1 M, and the hydrochloric acid concentration in the stripping solution was 2.0 M or more. Based on the experimental data obtained from the batch test, the mixer-settler was configured as follows: 4 stages of extraction, 8 stages of scrubbing, 4 stages of stripping, and 3 stages of pickling the organic solution. The mixer-settler was operated for 180 hours, and the operating conditions were continuously adjusted to obtain high-purity light and medium rare earths. Finally, the purity of the light (Pr, Nd) and medium (Tb, Dy) rare earth elements reached the 3N class.