• Title/Summary/Keyword: Data Scientists

Search Result 3,360, Processing Time 0.031 seconds

The Current State and Tasks of Citizen Science in Korea (한국 시민과학의 현황과 과제)

  • Park, Jin Hee
    • Journal of Science and Technology Studies
    • /
    • v.18 no.2
    • /
    • pp.7-41
    • /
    • 2018
  • Citizen science projects, which originated in citizen data-collecting campaigns organized by governmental institutes and scientific associations, have been implemented in various forms of collaboration with scientists. Their themes have extended from ecology to astronomy, distributed computing, and particle physics. Citizen science can contribute to the advancement of science through cost-effective research based on volunteer data collection, and it enhances the public understanding of science by increasing participants' knowledge. Community-led citizen science projects can raise public awareness of environmental problems and promote participation in environmental problem-solving, and projects based on local tacit knowledge can benefit local environmental policy-making and implementation. These social values have led many countries to develop policies that promote citizen science. The Korean government has also introduced some citizen science projects; however, it must overcome obstacles such as low participation by citizens and scientists. It is important that scientists come to recognize the value of citizen science through successful government-driven projects, and that the tools used to evaluate scientific careers be modified to encourage scientists' participation. Project management should be well planned to intensify citizen participation, and the government should prepare an open-data policy that supports the data reliability of community-led monitoring projects. It is also desirable to build a citizen science network for sharing best practices.

Pedagogical Characteristics Supporting Gifted Science Students' Agentic Participation in the Scientist-led Research and Education (R&E) Program: Focusing on the Positioning of Instructors and Students (전문가 사사 R&E에서 과학영재의 행위주체적 연구 참여를 지원하는 교수적 특성 -교수자와 학생의 위치짓기를 중심으로-)

  • Minjoo Lee;Heesoo Ha
    • Journal of The Korean Association For Science Education
    • /
    • v.43 no.4
    • /
    • pp.351-368
    • /
    • 2023
  • The scientist-led Research and Education (R&E) program aims to strengthen gifted science students' research capabilities under the guidance of scientists. Students' actual experiences in scientist-led R&E activities range from observing how scientists conduct research to actively participating in it. To develop R&E programs that promote student agency, this study aimed to identify the pedagogical characteristics that supported gifted science students' agentic participation in a scientist-led R&E program. We interviewed the learners and scientists of three R&E teams every three months, covering their perceptions of the R&E activities, student participation, and the scientists' support for the activities. The recordings and transcripts of the interviews served as the primary data sources for the analysis. We identified the trajectory of each team's activities as well as the learners' and scientists' dynamic positioning, and on this basis inductively identified the pedagogical characteristics that emerged when the scientists supported students' learning and engagement in research. Three types of student participation were identified: 1) sustained exercise of agency, 2) initial exercise and subsequent discouragement of agency, and 3) continuous non-exercise of agency. Two pedagogical characteristics supported the learners' agentic participation: 1) opportunities for students to take part in research management, and 2) scientist-student interactions encouraging learners to present expert-level ideas. This study contributes to developing pedagogies that foster gifted science students' agentic participation in scientist-led R&E activities.

Toward Generic, Immersive, and Collaborative Solutions to the Data Interoperability Problem which Target End-Users

  • Sanchez-Ruiz, Arturo;Umapathy, Karthikeyan;Hayes, Pat
    • Journal of Computing Science and Engineering
    • /
    • v.3 no.2
    • /
    • pp.127-141
    • /
    • 2009
  • In this paper, we describe our vision of a "just-in-time" initiative to solve the Data Interoperability Problem (a.k.a. INTEROP). We provide an architectural overview of our initiative, which draws upon existing technologies to develop an immersive and collaborative approach aimed at empowering data stakeholders (e.g., data producers and data consumers) with integrated tools to interact and collaborate with each other while directly manipulating visual representations of their data in an immersive environment (e.g., implemented via Second Life). The semantics of these visual representations and the operations associated with the data are supported by ontologies defined using the Common Logic Framework (CL). Data operations gestured by the stakeholders, through their avatars, are translated into a variety of generated resources such as multi-language source code, visualizations, web pages, and web services. The generality of the approach is supported by a plug-in architecture which allows expert users to customize tasks such as data admission, data manipulation in the immersive world, and automatic generation of resources. This approach is designed with a mindset aimed at enabling stakeholders from diverse domains to exchange data and generate new knowledge.

Data Firewall: A TPM-based Security Framework for Protecting Data in Thick Client Mobile Environment

  • Park, Woo-Ram;Park, Chan-Ik
    • Journal of Computing Science and Engineering
    • /
    • v.5 no.4
    • /
    • pp.331-337
    • /
    • 2011
  • Recently, Virtual Desktop Infrastructure (VDI) has been widely adopted to ensure secure protection of enterprise data and provide users with a centrally managed execution environment. However, user experiences may be restricted due to the limited functionalities of thin clients in VDI. If thick client devices like laptops are used, then data leakage may be possible due to malicious software installed in thick client mobile devices. In this paper, we present Data Firewall, a security framework to manage and protect security-sensitive data in thick client mobile devices. Data Firewall consists of three components: Virtual Machine (VM) image management, client VM integrity attestation, and key management for Protected Storage. There are two types of execution VMs managed by Data Firewall: Normal VM and Secure VM. In Normal VM, a user can execute any applications installed in the laptop in the same manner as before. A user can access security-sensitive data only in the Secure VM, for which the integrity should be checked prior to access being granted. All the security-sensitive data are stored in the space called Protected Storage for which the access keys are managed by Data Firewall. Key management and exchange between client and server are handled via Trusted Platform Module (TPM) in the framework. We have analyzed the security characteristics and built a prototype to show the performance overhead of the proposed framework.

A Missing Data Imputation by Combining K Nearest Neighbor with Maximum Likelihood Estimation for Numerical Software Project Data (K-NN과 최대 우도 추정법을 결합한 소프트웨어 프로젝트 수치 데이터용 결측값 대치법)

  • Lee, Dong-Ho;Yoon, Kyung-A;Bae, Doo-Hwan
    • Journal of KIISE:Software and Applications
    • /
    • v.36 no.4
    • /
    • pp.273-282
    • /
    • 2009
  • Missing data is one of the common problems in building analysis or prediction models from software project data. For small software project data sets, imputation methods are known to handle missing data more effectively than deletion methods. While K-nearest-neighbor (KNN) imputation is a suitable method for software project data, it cannot use the non-missing information of incomplete project instances. In this paper, we propose an approach to missing data imputation for numerical software project data that combines K-nearest-neighbor imputation with maximum likelihood estimation; we also extend the average absolute error measure by normalization for more accurate evaluation. Our approach overcomes the limitation of KNN imputation and outperforms it on our real data sets.
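
    The K-NN half of the combined approach can be sketched as follows. This is a minimal illustration only: the paper's maximum-likelihood refinement and normalized error measure are omitted, and the three-column "project" matrix is an invented toy example, not the paper's data.

    ```python
    import numpy as np

    def knn_impute(data, k=2):
        """Fill each NaN with the mean of that column over the k complete
        rows nearest to the incomplete row (distance computed over the
        row's observed columns). K-NN step only; the paper additionally
        applies maximum likelihood estimation, which is omitted here."""
        data = data.astype(float)
        complete = data[~np.isnan(data).any(axis=1)]
        result = data.copy()
        for i, row in enumerate(data):
            miss = np.isnan(row)
            if not miss.any():
                continue
            obs = ~miss
            # Euclidean distance on the columns this row actually has
            dists = np.sqrt(((complete[:, obs] - row[obs]) ** 2).sum(axis=1))
            nearest = complete[np.argsort(dists)[:k]]
            result[i, miss] = nearest[:, miss].mean(axis=0)
        return result

    # Hypothetical numerical project data: one instance is missing a value.
    projects = np.array([
        [10.0, 5.0, 3.0],
        [12.0, 6.0, np.nan],
        [11.0, 5.5, 3.2],
    ])
    filled = knn_impute(projects, k=2)
    ```

    With k=2, the missing value is replaced by the mean of the corresponding column over the two nearest complete instances.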

A Study of Analyzing Realtime Strategy Game Data using Data Mining (Data Mining을 이용한 전략시뮬레이션 게임 데이터 분석)

  • Yong, Hye-Ryeon;Kim, Do-Jin;Hwang, Hyun-Seok
    • Journal of Korea Game Society
    • /
    • v.15 no.4
    • /
    • pp.59-68
    • /
    • 2015
  • Progress in information and communication technology enables data scientists to analyze big data to understand people's daily lives and tacit preferences. A variety of industries are already aware of the potential usefulness of big data analysis; however, it has seen limited use in the game industry. In this research, we adopt data mining techniques to analyze data gathered from a strategic simulation game. Decision Tree, Random Forest, multi-class SVM, and linear regression techniques are used to find the variables most important to users' game levels. We provide practical guides for game design and usability based on the analyzed results.
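
    The variable-ranking step described above can be sketched with one of the listed techniques, Random Forest, using scikit-learn. The feature names and the synthetic play-log data are invented stand-ins (the abstract does not list the paper's actual variables), and the target is constructed so that one feature dominates by design.

    ```python
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    # Synthetic play logs: columns stand in for hypothetical features
    # such as play_time, win_rate, and actions-per-minute.
    rng = np.random.default_rng(0)
    X = rng.random((200, 3))
    # Toy target: "game level" driven entirely by the second feature.
    y = (X[:, 1] > 0.5).astype(int)

    model = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)
    # Impurity-based importances rank the variables' influence on the target.
    importance = model.feature_importances_
    ```

    On real game data, the same `feature_importances_` ranking is what identifies which player variables matter most to game level.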

Parallelization of Recursive Functions for Recursive Data Structures (재귀적 자료구조에 대한 재귀 함수의 병렬화)

  • An, Jun-Seon;Han, Tae-Suk
    • Journal of KIISE:Software and Applications
    • /
    • v.26 no.12
    • /
    • pp.1542-1552
    • /
    • 1999
  • Data parallelism is obtained by applying the same operation to each element of a data collection. In functional languages, iterative computations on data collections are expressed as recursions on recursive data structures. We propose a parallelization method for the data-parallel implementation of such recursive functions. We employ polytypic data-parallel primitives to represent the parallel execution structure of the object programs, which enables data-parallel execution with general recursive data structures such as trees and lists. To transform sequential programs into their parallelized versions, we propose a method to classify the types of parallelism in subexpressions, based on the dependencies of the recursive calls, and to generate data-parallel programs using the appropriate data-parallel primitives.
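
    The core observation can be illustrated with a toy binary tree: when a recursive function's per-node work has no dependence between recursive calls, it is expressible as a polytypic map, and every application of the mapped function could run in parallel. This sketch (tree type and example values invented here) shows the map form only, not the paper's classification or code-generation machinery.

    ```python
    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class Node:
        """A minimal recursive data structure: a binary tree of ints."""
        value: int
        left: Optional['Node'] = None
        right: Optional['Node'] = None

    def tree_map(f, t):
        """Polytypic map over a tree: applies f to every element.
        Each application of f is independent of the others, so the
        recursive calls carry no dependency and can run in parallel."""
        if t is None:
            return None
        return Node(f(t.value), tree_map(f, t.left), tree_map(f, t.right))

    t = Node(2, Node(3), Node(4))
    squared = tree_map(lambda x: x * x, t)
    ```

    A recursion whose result at a node depends on the results of its children (e.g., a tree sum) would not fit a plain map and is where the paper's dependency-based classification comes in.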

Distributed Data Recording on the Optical Disks using LDPC Codes (광학 저장 매체상에 LDPC 코드를 이용한 데이터의 분산 기록 방법)

  • Kim, Tae-Woong;Ryu, Jun-Kil;Park, Chan-Ik
    • Journal of KIISE:Computing Practices and Letters
    • /
    • v.15 no.9
    • /
    • pp.710-714
    • /
    • 2009
  • Optical discs' capacity has increased; a Blu-ray Disc can store up to 25 GB of data. Because of this large capacity, optical discs can substitute for tape devices in backup use. However, an optical disc's surface is exposed, so data can easily be lost through exterior damage such as scratches, and additional reliability must be provided to maintain data for a long time. In this paper, we suggest a writing technique that gives optical discs additional reliability. Redundant data generated by LDPC codes are stored on the disc along with the original data, and both the original and redundant data are scattered over the disc so that a single scratch cannot destroy a large part of the data. By placing data at distances that provide the reliability the user wants, we can enhance optical discs' reliability.
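
    The scattering idea can be sketched with a simple row-in/column-out block interleaver: logically consecutive blocks land far apart on the disc, so one contiguous scratch damages at most a few blocks of any one codeword. This is a generic layout sketch under that assumption; the paper's LDPC encoding and its actual placement policy are not shown.

    ```python
    def interleave(blocks, depth):
        """Write blocks row-wise into rows of width `depth`, then read
        them out column-wise. Consecutive logical blocks end up `depth`
        physical positions apart, so a contiguous scratch spanning fewer
        than `depth` positions hits each row (codeword) at most once."""
        rows = [blocks[i:i + depth] for i in range(0, len(blocks), depth)]
        return [rows[r][c]
                for c in range(depth)
                for r in range(len(rows))
                if c < len(rows[r])]

    # 12 logical blocks laid out with interleave depth 4.
    layout = interleave(list(range(12)), 4)
    ```

    Increasing `depth` trades seek locality for tolerance of longer scratches, which mirrors the user-selectable reliability distance described above.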

Bounding Worst-Case Data Cache Performance by Using Stack Distance

  • Liu, Yu;Zhang, Wei
    • Journal of Computing Science and Engineering
    • /
    • v.3 no.4
    • /
    • pp.195-215
    • /
    • 2009
  • Worst-case execution time (WCET) analysis is critical for hard real-time systems to ensure that different tasks can meet their respective deadlines. While significant progress has been made for WCET analysis of instruction caches, the data cache timing analysis, especially for set-associative data caches, is rather limited. This paper proposes an approach to safely and tightly bounding data cache performance by computing the worst-case stack distance of data cache accesses. Our approach can not only be applied to direct-mapped caches, but also be used for set-associative or even fully-associative caches without increasing the complexity of analysis. Moreover, the proposed approach can statically categorize worst-case data cache misses into cold, conflict, and capacity misses, which can provide useful insights for designers to enhance the worst-case data cache performance. Our evaluation shows that the proposed data cache timing analysis technique can safely and accurately estimate the worst-case data cache performance, and the overestimation as compared to the observed worst-case data cache misses is within 1% on average.
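
    The stack-distance notion underlying the analysis can be sketched concretely: the stack distance of an access is the number of distinct addresses touched since the previous access to the same address. In a fully-associative LRU cache of S lines, an access hits exactly when its stack distance is below S, which is what makes the metric usable for bounding misses. This is a generic trace-level sketch, not the paper's static analysis.

    ```python
    def stack_distances(trace):
        """Compute the LRU stack distance of each access in a trace.
        Distance = position of the address in the LRU stack (0 = most
        recently used); float('inf') marks a first-ever access (a cold
        miss in any cache size)."""
        stack = []   # LRU stack, most recently used address at the front
        dists = []
        for addr in trace:
            if addr in stack:
                d = stack.index(addr)
                stack.remove(addr)
            else:
                d = float('inf')
            dists.append(d)
            stack.insert(0, addr)
        return dists

    # Toy address trace: re-references of 'a' and 'b' have distance 2.
    dists = stack_distances(['a', 'b', 'c', 'a', 'b', 'b'])
    ```

    Bounding the worst-case stack distance of every access, as the paper does statically, then bounds the worst-case number of misses for any given associativity.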

A Data Quality Measuring Tool (데이타 품질 측정 도구)

  • 양자영;최병주
    • Journal of KIISE:Computing Practices and Letters
    • /
    • v.9 no.3
    • /
    • pp.278-288
    • /
    • 2003
  • The quality of software is affected by the quality of the data required to operate it. It is especially important to assure data quality in a knowledge-engineering system that extracts meaningful knowledge from stored data. In this paper, we develop DAQUM, a tool that can measure data quality. This paper presents: 1) the main contents of the DAQUM tool's implementation; and 2) a case study in which DAQUM detects dirty data and measures data quality quantifiably from the end-user's point of view. By controlling and measuring data quality, DAQUM will greatly contribute to improving the quality of software products that mainly process data.
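
    The abstract does not define DAQUM's actual metrics, so as an illustration of quantifying data quality from tabular records, this sketch computes two generic dirty-data indicators (missing-value rate and duplicate-row rate); the function name and sample rows are invented here.

    ```python
    def quality_report(rows):
        """Return simple, quantifiable data-quality indicators for a
        list of records: the fraction of cells that are missing (None)
        and the fraction of rows that are exact duplicates."""
        total = len(rows)
        cells = sum(len(r) for r in rows)
        missing = sum(1 for r in rows for v in r if v is None)
        dupes = total - len({tuple(r) for r in rows})
        return {
            "missing_rate": missing / cells if cells else 0.0,
            "duplicate_rate": dupes / total if total else 0.0,
        }

    # Toy record set: one missing cell, one duplicated row.
    rows = [["a", 1], ["b", None], ["a", 1]]
    report = quality_report(rows)
    ```

    A real measuring tool would track many more defect classes (domain violations, referential breaks, format errors), but each would reduce to a measurable rate in the same way.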