Search | Korea Science

Gene Set and Pathway Analysis of Microarray Data (마이크로어레이 데이터의 유전자 집합 및 대사 경로 분석)

Kim Seon-Young
- KOGO NEWS / v.6 no.1 / pp.29-33 / 2006
Gene set analysis is a new concept and method for analyzing and interpreting microarray gene expression data; it tries to extract biological meaning from gene expression data at the gene set level rather than at the gene level. Compared with methods that select a few tens or hundreds of genes before gene ontology and pathway analysis, gene set analysis identifies important gene ontology terms and pathways more consistently and performs well even in gene expression data sets with minimal or moderate gene expression changes. Moreover, gene set analysis is useful for comparing multiple gene expression data sets that address similar biological questions. This review briefly summarizes the rationale behind gene set analysis and introduces several algorithms and tools now available for it.

Effective and Statistical Quantification Model for Network Data Comparing (통계적 수량화 방법을 이용한 효과적인 네트워크 데이터 비교 방법)

Cho, Jae-Ik;Kim, Ho-In;Moon, Jong-Sub
- Journal of Broadcast Engineering / v.13 no.1 / pp.86-91 / 2008
In network data analysis, it is essential to study how well the estimation data reflect the population data. This paper compares and analyzes the well-known MIT Lincoln Lab network data, which consist of standard information collectable from the network, with the KDD CUP 99 dataset, which was derived from the MIT/LL data. The protocol information of both datasets was used for the comparison. Correspondence analysis was used for the analysis, SVD for two-dimensional visualization, and weighted Euclidean distance for quantifying the network data.
https://doi.org/10.5909/JBE.2008.13.1.86
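
To illustrate the weighted Euclidean distance mentioned in the abstract above, the following minimal sketch computes the chi-square distance between two protocol-count profiles, which is the weighted Euclidean distance conventionally used in correspondence analysis. The abstract does not state which weights the authors used, so the inverse column masses, the function name `chi_square_distance`, and the sample counts are all assumptions made for illustration.

```python
from math import sqrt


def chi_square_distance(counts_a, counts_b):
    """Weighted Euclidean (chi-square) distance between two count profiles."""
    if len(counts_a) != len(counts_b):
        raise ValueError("both profiles must cover the same categories")
    total_a, total_b = sum(counts_a), sum(counts_b)
    if total_a == 0 or total_b == 0:
        raise ValueError("each profile needs at least one observation")
    grand_total = total_a + total_b
    squared = 0.0
    for a, b in zip(counts_a, counts_b):
        # Column mass: share of all observations that fall in this category.
        column_mass = (a + b) / grand_total
        if column_mass == 0:
            continue  # category absent from both profiles
        # Difference between the two row profiles (relative frequencies).
        diff = a / total_a - b / total_b
        squared += diff * diff / column_mass
    return sqrt(squared)


# Hypothetical packet counts per protocol (TCP, UDP, ICMP) in two datasets.
dataset_x = [700, 200, 100]
dataset_y = [650, 300, 50]
distance = chi_square_distance(dataset_x, dataset_y)
```

Two datasets with the same protocol proportions give a distance of zero regardless of their sizes, which is what makes this measure suitable for comparing a sample with the data it was drawn from.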

Improving Classification Performance for Data with Numeric and Categorical Attributes Using Feature Wrapping (특징 래핑을 통한 숫자형 특징과 범주형 특징이 혼합된 데이터의 클래스 분류 성능 향상 기법)

Lee, Jae-Sung;Kim, Dae-Won
- Journal of KIISE:Software and Applications / v.36 no.12 / pp.1024-1027 / 2009
In this letter, we evaluate classification performance on data with mixed numeric and categorical features in order to compare the effectiveness of feature filtering and feature wrapping. Because the data contain both numeric and categorical features, the feature selection methods were applied after discretizing the numeric features. We then choose the feature subset that improves classification performance on the preprocessed data set. The experimental results show that feature wrapping is more reliable than feature filtering in terms of classification accuracy.

TMY2 Weather data for Korea (TMY2 방식에 의한 국내 기상자료 작성 연구)

Shin, Kee-Shik;Yoon, Chang-Ryuel;Park, Sang-Dong
- 한국신재생에너지학회:학술대회논문집 / 2009.06a / pp.243-246 / 2009
Many building simulation programs are used to evaluate building energy performance, and their capabilities continue to grow. Despite these increased capabilities, the weather data used in building energy performance evaluation are still the same limited set. This often forces users to find weather data such as illuminance, solar radiation, and ground temperature from other sources, or to calculate them. Proper selection of a weather data set is also considered an important factor in a successful building energy simulation. In this paper, we describe TMY2 data, a generalized weather data format, apply it to the Seoul region, and examine how it differs from existing weather data. A database of 23 years of raw weather data has been developed to provide the weather data file for building energy analysis in Seoul.

Design of Cache Memory System for Next Generation CPU (차세대 CPU를 위한 캐시 메모리 시스템 설계)

Jo, Ok-Rae;Lee, Jung-Hoon
- IEMEK Journal of Embedded Systems and Applications / v.11 no.6 / pp.353-359 / 2016
In this paper, we propose a high-performance L1 cache structure for high-clock-rate CPUs. The proposed cache memory consists of three parts: a direct-mapped cache to support fast access time, a two-way set-associative buffer to reduce the miss ratio, and a way-select table. The most recently accessed data are stored in the direct-mapped cache. When data with a high probability of repeated reference are replaced from the direct-mapped cache, they are stored in the two-way set-associative buffer. For high performance and fast access time, only one of the two ways of the set-associative buffer is selectively accessed, based on the way-select table (WST). According to simulation results, access time can be reduced by about 7% and 40% compared with a direct cache and an Intel i7-6700 with twice the space, respectively.
https://doi.org/10.14372/IEMEK.2016.11.6.353
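
The data flow described in the abstract above can be sketched as a small simulation. This is a simplified model under stated assumptions, not the authors' design: "high probability of repeated reference" is approximated by "was hit at least once while in the direct-mapped cache", the buffer replaces its oldest entry, and the way-select table is omitted because the abstract does not describe how it is indexed. The class name `HybridCache` and all parameters are hypothetical.

```python
class HybridCache:
    """Direct-mapped cache backed by a small two-way set-associative buffer."""

    def __init__(self, dm_lines=4, buffer_sets=2):
        self.dm_lines = dm_lines
        self.buffer_sets = buffer_sets
        # Each direct-mapped line holds [block_address, was_reused] or None.
        self.dm = [None] * dm_lines
        # Each buffer set holds at most two block addresses, oldest first.
        self.buffer = [[] for _ in range(buffer_sets)]

    def access(self, block):
        """Return 'dm_hit', 'buffer_hit', or 'miss' for one block access."""
        index = block % self.dm_lines
        line = self.dm[index]
        if line is not None and line[0] == block:
            line[1] = True  # a repeated reference was observed
            return "dm_hit"

        ways = self.buffer[block % self.buffer_sets]
        in_buffer = block in ways
        if in_buffer:
            ways.remove(block)  # the block moves back to the direct-mapped cache

        # Evict the current line; keep it only if it showed reuse (assumption).
        if line is not None and line[1]:
            victim_ways = self.buffer[line[0] % self.buffer_sets]
            victim_ways.append(line[0])
            if len(victim_ways) > 2:
                victim_ways.pop(0)  # drop the oldest of the two ways

        # A block restored from the buffer is treated as already reused.
        self.dm[index] = [block, in_buffer]
        return "buffer_hit" if in_buffer else "miss"


# Blocks 0 and 2 conflict in a two-line direct-mapped cache.
cache = HybridCache(dm_lines=2, buffer_sets=1)
trace = [0, 0, 2, 0, 2]
results = [cache.access(block) for block in trace]
```

In this trace, block 0 is reused before being evicted, so it survives in the buffer and its next access is a buffer hit; block 2 is evicted without reuse and is simply dropped.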

CNN-LSTM based Wind Power Prediction System to Improve Accuracy (정확도 향상을 위한 CNN-LSTM 기반 풍력발전 예측 시스템)

Park, Rae-Jin;Kang, Sungwoo;Lee, Jaehyeong;Jung, Seungmin
- New & Renewable Energy / v.18 no.2 / pp.18-25 / 2022
In this study, we propose a wind power generation prediction system that applies machine learning and data mining. This system increases the utilization rate of new and renewable energy sources. The time-series data set was built by measuring wind speed, wind power generation, and the environmental factors that influence wind speed. The data set was pre-processed so that it could be applied appropriately to the model. The prediction system applied a CNN (Convolutional Neural Network) to the data mining process and then used an LSTM (Long Short-Term Memory) network to learn and make predictions. The accuracy of the proposed system is verified by comparing the predicted data with the actual data, with and without data mining in the prediction model.
https://doi.org/10.7849/ksnre.2022.0001

Statistical Methods for Comparing Predictive Values in Medical Diagnosis

Chanrim Park;Seo Young Park;Hwa Jung Kim;Hee Jung Shin
- Korean Journal of Radiology / v.25 no.7 / pp.656-661 / 2024
Evaluating the performance of a binary diagnostic test, including artificial intelligence classification algorithms, involves measuring sensitivity, specificity, positive predictive value, and negative predictive value. Particularly when comparing the performance of two diagnostic tests applied on the same set of patients, these metrics are crucial for identifying the more accurate test. However, comparing predictive values presents statistical challenges because their denominators depend on the test outcomes, unlike the comparison of sensitivities and specificities. This paper reviews existing methods for comparing predictive values and proposes using the permutation test. The permutation test is an intuitive, non-parametric method suitable for datasets with small sample sizes. We demonstrate each method using a dataset from MRI and combined modality of mammography and ultrasound in diagnosing breast cancer.
https://doi.org/10.3348/kjr.2024.0049
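
A permutation test for comparing the positive predictive values (PPV) of two tests applied to the same patients could look like the following minimal sketch. The abstract does not describe the permutation scheme, so this assumes the common paired approach of randomly swapping the two tests' results within each patient; the function names, the patient data, and the default number of permutations are hypothetical.

```python
import random


def ppv(test, disease):
    """Positive predictive value: diseased share among test-positive patients."""
    outcomes = [d for t, d in zip(test, disease) if t == 1]
    return sum(outcomes) / len(outcomes) if outcomes else None


def paired_permutation_ppv(test_a, test_b, disease, n_perm=2000, seed=0):
    """Two-sided permutation p-value for PPV(A) - PPV(B) on paired data."""
    rng = random.Random(seed)
    ppv_a, ppv_b = ppv(test_a, disease), ppv(test_b, disease)
    if ppv_a is None or ppv_b is None:
        raise ValueError("each test needs at least one positive result")
    observed = ppv_a - ppv_b
    extreme = valid = 0
    for _ in range(n_perm):
        perm_a, perm_b = [], []
        for a, b in zip(test_a, test_b):
            # Under the null hypothesis the two labels are exchangeable
            # within a patient, so swap them with probability 1/2.
            if rng.random() < 0.5:
                a, b = b, a
            perm_a.append(a)
            perm_b.append(b)
        perm_ppv_a, perm_ppv_b = ppv(perm_a, disease), ppv(perm_b, disease)
        if perm_ppv_a is None or perm_ppv_b is None:
            continue  # PPV undefined for this permutation
        valid += 1
        if abs(perm_ppv_a - perm_ppv_b) >= abs(observed) - 1e-12:
            extreme += 1
    # Add-one correction keeps the p-value strictly positive.
    return observed, (extreme + 1) / (valid + 1)


# Hypothetical results for eight patients (1 = positive / diseased).
disease = [1, 1, 1, 0, 0, 1, 0, 0]
test_a = [1, 1, 1, 0, 0, 1, 1, 0]
test_b = [1, 0, 1, 1, 1, 1, 1, 0]
diff, p_value = paired_permutation_ppv(test_a, test_b, disease)
```

The swap-within-patient scheme keeps the disease status fixed and preserves the pairing, which is why the changing denominators of the predictive values do not need a separate variance formula.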

Iowa Liquor Sales Data Predictive Analysis Using Spark

Ankita Paul;Shuvadeep Kundu;Jongwook Woo
- Asia pacific journal of information systems / v.31 no.2 / pp.185-196 / 2021
The paper aims to analyze and predict liquor sales in the state of Iowa by applying machine learning algorithms to build predictive models. We use Azure ML and Spark ML for our predictive analysis, which are a legacy machine learning (ML) system and a Big Data ML system, respectively. We worked on the Iowa liquor sales dataset, which comprises records from 2012 to 2019 in 24 columns and approximately 1.8 million rows. We conclude by comparing the models built with different algorithms and their accuracy in predicting sales using both Azure ML and Spark ML. We find that the Linear Regression model has the highest precision and Decision Forest Regression has the fastest computing time with the sample data set using the legacy Azure ML system. The Decision Tree Regression model in Spark ML has the highest accuracy and the quickest computing time for the entire data set using the Big Data Spark system.
https://doi.org/10.14329/apjis.2021.31.2.185

Design and evaluation of artificial intelligence models for abnormal data detection and prediction

Hae-Jong Joo;Ho-Bin Song
- Journal of Platform Technology / v.11 no.6 / pp.3-12 / 2023
In today's system operation, it is difficult to detect failures and take immediate action when manpower is short relative to the amount of equipment, or when failures occur during vulnerable time periods, which can delay failure recovery. In addition, various algorithms exist for detecting anomalous data, and it is important to select an appropriate algorithm for each problem. In this paper, an ensemble-based isolation forest model was used to efficiently detect multivariate point anomalies that deviate from the mean distribution in a data set generated to predict system failure and minimize service interruption. Because significant changes in memory usage are observed together with changes in CPU usage, an LSTM-Auto Encoder that compares two or more features was used to handle collective anomalies, in which one feature exhibits an abnormal pattern following a change in another. In addition, evaluation indicators are set for the model presented in this study, and the AI model is then evaluated.

Fractal dimension analysis as an easy computational approach to improve breast cancer histopathological diagnosis

Lucas Glaucio da Silva;Waleska Rayanne Sizinia da Silva Monteiro;Tiago Medeiros de Aguiar Moreira;Maria Aparecida Esteves Rabelo;Emílio Augusto Campos Pereira de Assis;Gustavo Torres de Souza
- Applied Microscopy / v.51 / pp.6.1-6.9 / 2021
Histopathology is a well-established standard diagnosis employed for the majority of malignancies, including breast cancer. Nevertheless, despite training and standardization, it is considered operator-dependent and errors are still a concern. Fractal dimension analysis is a computational image processing technique that allows assessing the degree of complexity in patterns. We aimed here at providing a robust and easily attainable method for introducing computer-assisted techniques to histopathology laboratories. Slides from two databases were used: A) Breast Cancer Histopathological; and B) Grand Challenge on Breast Cancer Histology. Set A contained 2480 images from 24 patients with benign alterations, and 5429 images from 58 patients with breast cancer. Set B comprised 100 images of each type: normal tissue, benign alterations, in situ carcinoma, and invasive carcinoma. All images were analyzed with the FracLac algorithm in the ImageJ computational environment to yield the box count fractal dimension (Db) results. Images in set A at 40x magnification were statistically different (p = 0.0003), whereas images at 400x did not present differences in their means. In set B, the mean Db values presented promising statistical differences when comparing normal and/or benign images to in situ and/or invasive carcinoma (all p < 0.0001). Interestingly, there was no difference when comparing normal tissue to benign alterations. These data corroborate previous work in which fractal analysis allowed differentiating malignancies. Computer-aided diagnosis algorithms may benefit from using Db data; specific Db cut-off values may yield ~ 99% specificity in diagnosing breast cancer. Furthermore, because it allows assessing tissue complexity, this tool may be used to understand the progression of the histological alterations in cancer.
https://doi.org/10.1186/s42649-021-00055-w
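
The box count fractal dimension (Db) named in the abstract above can be illustrated with a minimal sketch: count the boxes of side s that contain at least one foreground pixel, then fit the slope of log N(s) against log(1/s). This is a plain single-grid estimate and not the FracLac implementation; the function name, the box sizes, and the test shapes are assumptions chosen so that the expected dimensions are known.

```python
from math import log


def box_count_dimension(pixels, box_sizes):
    """Estimate the box-counting dimension of a non-empty set of (row, col) pixels."""
    xs, ys = [], []
    for size in box_sizes:
        # Boxes of this side length that contain at least one pixel.
        occupied = {(r // size, c // size) for r, c in pixels}
        xs.append(log(1.0 / size))
        ys.append(log(len(occupied)))
    # Least-squares slope of log N(s) versus log(1/s).
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope_num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope_den = sum((x - mean_x) ** 2 for x in xs)
    return slope_num / slope_den


# Shapes with known dimensions: a filled square (2) and a straight line (1).
filled_square = [(r, c) for r in range(16) for c in range(16)]
straight_line = [(0, c) for c in range(16)]
sizes = [1, 2, 4, 8]
square_dim = box_count_dimension(filled_square, sizes)
line_dim = box_count_dimension(straight_line, sizes)
```

A segmented histology image would be passed in as the list of its foreground pixel coordinates; more irregular tissue patterns produce non-integer values between these two reference cases.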

