Trends in Science Big Data Processing Technology

Kim, Hui-Jae; Ju, Gyeong-No; Yun, Chan-Hyeon

Information and Communications Magazine (정보와 통신)

Vol. 29, No. 11 / pp. 11-23 / 2012 / pISSN 1226-4725

The Korean Institute of Communications and Information Sciences (한국통신학회)

Kim, Hui-Jae (KAIST) ;
Ju, Gyeong-No (KAIST) ;
Yun, Chan-Hyeon (KAIST)

Published: 2012.10.31

Abstract

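This article describes trends in science big data processing, that is, technologies for handling large-scale data in scientific fields. The introduction covers the definition of science big data and why it is needed. The main body discusses the emergence of the data-intensive science paradigm and the requirements it places on science big data, followed by science big data processing techniques: collection and cleansing of data sources, storage and management, processing, and analysis. It also introduces, as examples, science big data platforms currently being studied at various institutions and workflow-control-based science big data processing techniques that use MapReduce and similar tools.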
