• Title/Summary/Keyword: Functional principal component analysis

Data Visualization using Linear and Non-linear Dimensionality Reduction Methods

  • Kim, Junsuk;Youn, Joosang
    • Journal of the Korea Society of Computer and Information / v.23 no.12 / pp.21-26 / 2018
  • As large amounts of data can now be stored efficiently, methods for extracting meaningful features from big data have become important. In particular, techniques for converting high-dimensional data to low-dimensional data are crucial for data visualization. In this study, principal component analysis (PCA; a linear dimensionality reduction technique) and Isomap (a non-linear dimensionality reduction technique) are introduced and applied to neural big data obtained by functional magnetic resonance imaging (fMRI). First, we investigate how well the physical properties of the stimuli are preserved after dimensionality reduction. We then compare the residual variance of the two methods to quantify the amount of information left unexplained. As a result, the Isomap reduction retains more information than PCA. Our results demonstrate that it is necessary to consider not only linear but also nonlinear characteristics in big data analysis.
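
The residual-variance comparison mentioned in this abstract can be illustrated with a minimal sketch. It assumes the definition common in the Isomap literature (one minus the squared correlation between pairwise distances before and after reduction); the abstract does not state the exact definition the authors used. The data are synthetic, the helper names are hypothetical, and only the PCA side is shown. An Isomap comparison would replace the input-space Euclidean distances with graph geodesic distances.

```python
import numpy as np

def pca_embed(X, k):
    """Project the rows of X onto the top-k principal components."""
    Xc = X - X.mean(axis=0)
    # SVD of the centered data: the rows of Vt are the principal directions.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T

def pairwise_dist(Z):
    """Vector of Euclidean distances between all distinct row pairs of Z."""
    diff = Z[:, None, :] - Z[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=-1))
    return D[np.triu_indices(len(Z), k=1)]

def residual_variance(X, Y):
    """1 - r^2 between distances in the input X and in the embedding Y."""
    r = np.corrcoef(pairwise_dist(X), pairwise_dist(Y))[0, 1]
    return 1.0 - r ** 2

# Synthetic stand-in for high-dimensional data (not fMRI data).
rng = np.random.default_rng(0)
scales = np.array([5.0, 3.0, 1.0, 0.5, 0.5, 0.2, 0.2, 0.1, 0.1, 0.1])
X = rng.normal(size=(60, 10)) * scales

rv_2 = residual_variance(X, pca_embed(X, 2))      # 2-D embedding
rv_full = residual_variance(X, pca_embed(X, 10))  # full rotation, no loss
print(f"residual variance, k=2: {rv_2:.4f}; k=10: {rv_full:.2e}")
```

A lower residual variance means the low-dimensional distances track the original ones more closely, which is the sense in which one method "contains more information" than another.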

Study of age specific lung cancer mortality trends in the US using functional data analysis

  • Tharu, Bhikhari;Pokhrel, Keshav;Aryal, Gokarna;Kafle, Ram C.;Khanal, Netra
    • Communications for Statistical Applications and Methods / v.28 no.2 / pp.119-134 / 2021
  • Lung cancer is one of the leading causes of cancer deaths in the world. Investigation of mortality rates is pivotal to adequately understand the determinants of this disease, allocate public health resources, and apply different control measures. Our study aims to analyze and forecast age-specific US lung cancer mortality trends. We represent mortality rates for different age groups as functions and apply functional principal component analysis to understand the underlying mortality trend with respect to time. The mortality rates of lung cancer have been higher in men than in women. These rates have been decreasing for all age groups since 1990 in men. The same pattern is observed for women since 2000, except for the age group 85 and above. No significant changes in mortality rates in lower age groups have been observed for either gender. Lung cancer mortality rates for males are relatively higher than for females. Ten-year predictions of mortality rates depict a continuous decline for both genders, with no apparent change for lower age groups (below 40).
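
As a rough illustration of functional principal component analysis as used in this kind of study, the sketch below treats each curve as a function sampled on a common, equally spaced grid and eigen-decomposes the discretized covariance operator. This is a generic textbook-style discretization on synthetic curves, not the authors' procedure or their mortality data; the function name `fpca` and all parameters are hypothetical.

```python
import numpy as np

def fpca(curves, t, n_components):
    """FPCA for curves of shape (n_curves, n_grid) on an equally spaced grid t."""
    dt = t[1] - t[0]
    mean = curves.mean(axis=0)
    centered = curves - mean
    # Sample covariance surface evaluated on the grid.
    cov = centered.T @ centered / (len(curves) - 1)
    # Riemann-sum discretization of the covariance operator.
    evals, evecs = np.linalg.eigh(cov * dt)
    order = np.argsort(evals)[::-1][:n_components]
    top = evals[order]
    # Rescale eigenvectors so each eigenfunction satisfies sum(phi^2) * dt = 1.
    phis = evecs[:, order].T / np.sqrt(dt)
    # Scores: discretized inner product of each centered curve with each phi.
    scores = centered @ phis.T * dt
    explained = top / (np.trace(cov) * dt)
    return mean, top, phis, scores, explained

# Synthetic curves: two smooth modes of variation plus small noise.
rng = np.random.default_rng(1)
t = np.linspace(0.0, 1.0, 101)
a = rng.normal(scale=2.0, size=(40, 1))
b = rng.normal(scale=1.0, size=(40, 1))
curves = (a * np.sin(2 * np.pi * t) + b * np.cos(2 * np.pi * t)
          + rng.normal(scale=0.05, size=(40, t.size)))

mean, top, phis, scores, explained = fpca(curves, t, n_components=2)
print("fraction of variance explained by two components:", explained.sum())
```

The leading eigenfunctions in `phis` play the role of the "underlying trends", and the per-curve `scores` are the low-dimensional quantities one would typically model or forecast.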

Daily Gas Demand Forecast Using Functional Principal Component Analysis

  • Choi, Yongok;Park, Haeseong
    • Environmental and Resource Economics Review / v.29 no.4 / pp.419-442 / 2020
  • Natural gas demand in South Korea is determined mainly by heating demand. Accordingly, there is a distinct seasonality in which gas demand increases in winter and decreases in summer. Moreover, the sensitivity of gas demand to temperature has changed over time. This study first introduces a time-varying temperature response function (TRF) to capture the effects of changing seasonality. The temperature effect (TE), estimated by integrating the temperature response function against the daily temperature density, represents the change in gas demand due to variation in the temperature distribution. This study also presents an innovative way of forecasting the daily temperature density by applying functional principal component analysis to daily maximum/minimum temperature forecasts for the five largest cities in Korea. The forecast errors of the temperature density and gas demand decrease by 50% and 80%, respectively, when the proposed forecasted density is used instead of the average daily temperature density.
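
The temperature effect described in this abstract can be written compactly as an integral. The notation below is assumed for illustration only and is not taken from the paper: g_t denotes the temperature response function in period t, f_t the daily temperature density, and x the temperature.

```latex
\mathrm{TE}_t = \int g_t(x)\, f_t(x)\, dx
```

Under this reading, a forecast of the density f_t (here obtained via functional principal component analysis) translates directly into a forecast of the temperature-driven part of demand.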

Analysis of Molecular Pathways in Pancreatic Ductal Adenocarcinomas with a Bioinformatics Approach

  • Wang, Yan;Li, Yan
    • Asian Pacific Journal of Cancer Prevention / v.16 no.6 / pp.2561-2567 / 2015
  • Pancreatic ductal adenocarcinoma (PDAC) is a leading cause of cancer death worldwide. Our study aimed to reveal its molecular mechanisms. Microarray data of GSE15471 (including 39 matching pairs of pancreatic tumor tissues and patient-matched normal tissues) were downloaded from the Gene Expression Omnibus (GEO) database. We identified differentially expressed genes (DEGs) in PDAC tissues compared with normal tissues using the limma package in R. GO and KEGG pathway enrichment analyses were then conducted with the online tool DAVID. In addition, principal component analysis was performed, and a protein-protein interaction (PPI) network was constructed using the STRING database to study relationships between the DEGs. A total of 532 DEGs were identified in the 38 PDAC tissues compared with 33 normal tissues. Principal component analysis of the top 20 DEGs could directly differentiate the PDAC tissues from normal tissues. In the PPI network, 8 of the 20 DEGs were key genes of the collagen family. Additionally, FN1 (fibronectin 1) was a hub node in the network. The genes of the collagen family as well as FN1 were significantly enriched in the complement and coagulation cascades, ECM-receptor interaction, and focal adhesion pathways. Our results suggest that genes of the collagen family and FN1 may play an important role in PDAC progression. Meanwhile, these DEGs and enriched pathways, such as complement and coagulation cascades, ECM-receptor interaction, and focal adhesion, may be important molecular mechanisms involved in the development and progression of PDAC.

A Study on Rural Land Use Planning Technique (I): Sub-regional Analysis by Principal Component Analysis

  • 정하우;박병태
    • Journal of Korean Society of Rural Planning / v.1 no.2 / pp.33-42 / 1995
  • To formulate a rational land use plan on a regional basis, it is a basic precondition to categorize the total planning area into functional subregions using purposely selected indicators. As one quantitative approach to areal categorization in rural areas, principal component analysis (PCA) was introduced and its applicability was tested through a case study of the Sunheung district (called myun in Korea), Youngpoong county, Kyungbuk province, Korea. Areal analysis by PCA was carried out separately on the rurality and the urbanity of each parish-level area (ri in Korea). Using the PCA results, a classification matrix was constructed by categorizing both index scores. Among the 18 ri of the case study area, 12 were classified as rural-dominated areas, 2 as urban-dominated areas, and the remaining 3 as intermediate areas.

Classical testing based on B-splines in functional linear models

  • Sohn, Jihoon;Lee, Eun Ryung
    • The Korean Journal of Applied Statistics / v.32 no.4 / pp.607-618 / 2019
  • A new and interesting task in statistics is to effectively analyze functional data, which arise frequently from advances in modern science and technology in areas such as meteorology and the biomedical sciences. Functional linear regression with a scalar response is a popular functional data analysis technique, and a common problem is to test for a functional association, that is, whether a functional predictor affects the scalar response in the model. Recently, Kong et al. (Journal of Nonparametric Statistics, 28, 813-838, 2016) established classical testing methods for this problem based on functional principal component analysis of the functional predictor, using the resulting eigenfunctions as a basis. However, eigenbasis functions are not generally suitable for regression purposes because they reflect only the variability of the functional predictor, not the functional association of interest in the testing problem. Additionally, the eigenfunctions must be estimated from data, so estimation errors may affect the performance of the testing procedures. To circumvent these issues, we propose a testing method based on a fixed basis such as B-splines and show via simulations that it works well. Simulated and real data examples also illustrate that the proposed testing method provides more effective and intuitive results owing to the localization properties of B-splines.

Investigating the underlying structure of particulate matter concentrations: a functional exploratory data analysis study using California monitoring data

  • Montoya, Eduardo L.
    • Communications for Statistical Applications and Methods / v.25 no.6 / pp.619-631 / 2018
  • Functional data analysis continues to attract interest because advances in technology across many fields have increasingly permitted measurements to be made from continuous processes on a discretized scale. Particulate matter is among the most harmful air pollutants affecting public health and the environment, and levels of PM10 (particles less than 10 micrometers in diameter) for regions of California remain among the highest in the United States. The relatively high frequency of particulate matter sampling enables us to regard the data as functional data. In this work, we investigate the dominant modes of variation of PM10 using functional data analysis methodologies. Our analysis provides insight into the underlying data structure of PM10, and it captures the size and temporal variation of this underlying data structure. In addition, our study shows that certain aspects of size and temporal variation of the underlying PM10 structure are associated with changes in large-scale climate indices that quantify variations of sea surface temperature and atmospheric circulation patterns.

Application of Electronic Nose for Quality Control of the High Quality and Functional Components

  • Noh Bong-Soo
    • Proceedings of the Korean Society of Crop Science Conference / 2006.04a / pp.40-54 / 2006
  • It is not easy to detect high-quality and functional compounds for the quality control of food materials. The electronic nose is an instrument comprising an array of electronic chemical sensors with partial specificity and an appropriate pattern recognition system, capable of recognizing simple or complex odors. It can conduct fast analysis and provide simple and straightforward results, and it is best suited for quality control and process monitoring in the field of functional foods. Applications of the electronic nose in the functional food industry include discrimination of the habitats of medicinal food materials and monitoring of storage processes, lipid oxidation, and quality control of food and/or processing, using principal component analysis, neural network analysis, and an electronic nose based on a GC-SAW sensor. The electronic nose could be useful for a wide variety of quality control tasks in functional food and plant cultivation when traditional analytical instrumental data are correlated with sensory evaluation results or electronic nose data.

Analysis of fMRI Signal Using Independent Component Analysis

  • 문찬홍;나동규;박현욱;유재욱;이은정;변홍식
    • Investigative Magnetic Resonance Imaging / v.3 no.2 / pp.188-195 / 1999
  • fMRI signals are composed of many different signals. It is very difficult to find accurate parameters for a model of the fMRI signal that contains only neural activity, though the signal patterns may be estimated by modeling several signal components. In addition, noise from physiologic motion, motion of the object, and noise from the MR instrument make fMRI signals more difficult to analyze. Therefore, it is not easy to select reference data that accurately reflect neural activity, and analyzing the various signal patterns that contain information on neural activity is an open issue for fMRI post-processing methods. In the present study, fMRI data were analyzed with independent component analysis (ICA), a method that does not need a priori knowledge or reference data. ICA can be more effective than methods based on cross-correlation analysis and can separate signal patterns with delayed responses or motion-related components. A principal component analysis (PCA) threshold, wavelet spatial filtering, and analysis of a part of the whole image can be used to reduce the degrees of freedom of the data before ICA, and these preceding analyses may make the analysis more effective. As a result, the ICA method is expected to be effective when the degrees of freedom of the data are reduced in this way.

Comparison of 12 Isoflavone Profiles of Soybean (Glycine max (L.) Merrill) Seed Sprouts from Three Different Countries

  • Park, Soo-Yun;Kim, Jae Kwang;Kim, Eun-Hye;Kim, Seung-Hyun;Prabakaran, Mayakrishnan;Chung, Ill-Min
    • KOREAN JOURNAL OF CROP SCIENCE / v.63 no.4 / pp.360-377 / 2018
  • The levels of 12 isoflavones were measured in soybean (Glycine max (L.) Merrill) sprouts of 68 genetic varieties from three countries (China, Japan, and Korea). The isoflavone profile differences were analyzed using data mining methods. A principal component analysis (PCA) revealed that the CSRV021 variety was separated from the others by the first two principal components. This variety appears to be most suited for functional food production due to its high isoflavone levels. Partial least squares discriminant analysis (PLS-DA) and orthogonal projections to latent structures discriminant analysis (OPLS-DA) showed that there are meaningful isoflavone compositional differences in samples that have different countries of origin. Hierarchical clustering analysis (HCA) of these phytochemicals resulted in clusters derived from closely related biochemical pathways. These results indicate the usefulness of metabolite profiling combined with chemometrics as a tool for assessing the quality of foods and identifying metabolic links in biological systems.