• Title/Summary/Keyword: Big data Processing


Genomic data Analysis System using GenoSync based on SQL in Distributed Environment

  • Seine Jang;Seok-Jae Moon
    • International journal of advanced smart convergence
    • /
    • v.13 no.3
    • /
    • pp.150-155
    • /
    • 2024
  • Genomic data plays a transformative role in medicine, biology, and forensic science, offering insights that drive advancements in clinical diagnosis, personalized medicine, and crime scene investigation. Despite its potential, the integration and analysis of diverse genomic datasets remain challenging due to compatibility issues and the specialized nature of existing tools. This paper presents the GenomeSync system, designed to overcome these limitations by utilizing the Hadoop framework for large-scale data handling and integration. GenomeSync enhances data accessibility and analysis through SQL-based search capabilities and machine learning techniques, facilitating the identification of genetic traits and the resolution of forensic cases. By pre-processing DNA profiles from crime scenes, the system calculates similarity scores to identify and aggregate related genomic data, enabling accurate prediction models and personalized treatment recommendations. GenomeSync offers greater flexibility and scalability, supporting complex analytical needs across industries. Its robust cloud-based infrastructure ensures data integrity and high performance, positioning GenomeSync as a crucial tool for reliable, data-driven decision-making in the genomic era.
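The abstract does not say how the similarity score is computed. As a minimal sketch only, assuming each DNA profile is a mapping from an STR locus name to a pair of allele repeat counts, one simple score is the fraction of alleles shared across the loci both profiles contain. The function name `profile_similarity` and the sample profiles are hypothetical and are not taken from GenomeSync.

```python
from collections import Counter


def profile_similarity(a, b):
    """Return the fraction of shared alleles over loci present in both profiles.

    Each profile maps a locus name to a tuple of two allele repeat counts.
    This scoring rule is an illustrative assumption, not the paper's method.
    """
    shared_loci = a.keys() & b.keys()
    if not shared_loci:
        return 0.0
    matched = 0
    for locus in shared_loci:
        # Multiset intersection counts each shared allele at most once per copy.
        overlap = Counter(a[locus]) & Counter(b[locus])
        matched += sum(overlap.values())
    # Two alleles per locus, so the maximum possible match is 2 * number of loci.
    return matched / (2 * len(shared_loci))


# Made-up profiles for illustration only.
scene = {"D3S1358": (15, 17), "vWA": (16, 16), "FGA": (21, 24)}
reference = {"D3S1358": (15, 17), "vWA": (16, 18), "FGA": (22, 24)}
score = profile_similarity(scene, reference)
```

A score of 1.0 means every allele matches at every shared locus; a real forensic comparison would weight matches by allele frequency, which this sketch deliberately omits.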

FAIR Principle-Based Metadata Assessment Framework (FAIR 원칙 기반 메타데이터 평가 프레임워크)

  • Park, Jin Hyo;Kim, Sung-Hee;Youn, Joosang
    • KIPS Transactions on Computer and Communication Systems
    • /
    • v.11 no.12
    • /
    • pp.461-468
    • /
    • 2022
  • With the development of the big data industry, cases of data utilization services being provided on digital platforms are increasing. Accordingly, data-related research is applying the FAIR principles, which can be used to assess (meta)data quality, service, and function, to data quality evaluation. In particular, the European Open Data Portal applies an assessment model based on the FAIR principles; it conducts a data maturity assessment on this basis and publishes the results in annual reports. However, domestic public data portals do not conduct metadata-based data maturity evaluations. In this paper, we propose and evaluate a new data maturity evaluation model for a big data platform built for multiple domestic public data portals and data transactions, based on the FAIR principles used for data maturity evaluation in Europe's open data portals. The proposed model evaluates the quality of public data portal datasets.

Valid Data Conditions and Discrimination for Machine Learning: Case study on Dataset in the Public Data Portal (기계학습에 유효한 데이터 요건 및 선별: 공공데이터포털 제공 데이터 사례를 통해)

  • Oh, Hyo-Jung;Yun, Bo-Hyun
    • Journal of Internet of Things and Convergence
    • /
    • v.8 no.1
    • /
    • pp.37-43
    • /
    • 2022
  • The fundamental basis of AI technology is learnable data. Recently, the types and amounts of data collected and produced by the government and private companies have been increasing exponentially; however, this growth has not yet led to verified data that can actually be used for machine learning. This study discusses the conditions that data must meet to be usable for machine learning and identifies factors that degrade data quality through case studies. To this end, two representative cases of developing a prediction model using public big data were selected, and data for actual problem solving was collected from the public data portal. Through these cases, we show how the results differ when valid-data screening criteria and post-processing are applied. The ultimate purpose of this study is to argue for the importance of data quality management and the accumulation of valid data, which must fundamentally precede the development of machine learning technology, the core of artificial intelligence.

A Study on Intelligent Skin Image Identification From Social media big data

  • Kim, Hyung-Hoon;Cho, Jeong-Ran
    • Journal of the Korea Society of Computer and Information
    • /
    • v.27 no.9
    • /
    • pp.191-203
    • /
    • 2022
  • In this paper, we developed a system that intelligently identifies skin image data from big data collected from the social media platform Instagram and extracts standardized skin sample data for skin condition diagnosis and management. The proposed system consists of a big data collection and analysis stage, a skin image analysis stage, a training data preparation stage, an artificial neural network training stage, and a skin image identification stage. In the big data collection and analysis stage, big data is collected from Instagram, and image information for skin condition diagnosis and management is stored as the analysis result. In the skin image analysis stage, evaluation and analysis results for the skin image are obtained using traditional image processing techniques. In the training data preparation stage, the training data is prepared by extracting skin sample data from the skin image analysis result. In the artificial neural network training stage, an artificial neural network, AnnSampleSkin, that intelligently predicts the skin image type from this training data is built and completed through training. In the skin image identification stage, skin samples are extracted from images collected from social media, and the image type predictions of the trained AnnSampleSkin network are integrated to intelligently identify the final skin image type. The proposed skin image identification method shows a high identification accuracy of about 92% or more and can provide standardized skin sample image big data. The extracted skin sample set is expected to serve as efficient, standardized skin image data for diagnosing and managing skin conditions.

English Learning Applications Using Big Data Development (빅데이터를 활용한 영어학습 애플리케이션 설계 및 구현)

  • Lee, Jae-hoon;Kim, Seung-beom;Kim, Chang-young;Yang, Won-seok;Kim, Do-woo
    • Annual Conference of KIPS
    • /
    • 2020.11a
    • /
    • pp.644-647
    • /
    • 2020
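  • Recently, interest has been growing in the education field in edutech, which refers to innovating education through the use of IT. A new education system is needed that, rather than simply transmitting knowledge, provides learning tailored to the user's level and lets users monitor their own learning. This paper therefore proposes an English learning application that uses big data. The proposed application was designed and implemented to use big data extracted from English news articles to analyze useful sentences matched to the user's level and automatically generate questions, and to compare the user's voice data with native-speaker pronunciation using a stress analysis algorithm so that pronunciation and stress can be corrected.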

An Integrated Method of Iterative and Incremental Requirement Analysis for Large-Scale Systems (시스템 요구사항 분석을 위한 순환적-점진적 복합 분석방법)

  • Park, Jisung;Lee, Jaeho
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.6 no.4
    • /
    • pp.193-202
    • /
    • 2017
  • Development of intelligent systems involves effective integration of large-scale knowledge processing and understanding, human-machine interaction, and intelligent services. In particular, in our project to develop a self-growing knowledge-based system with inference methodologies that utilize big data technology, we are building a platform called WiseKB as the central knowledge base for storing massive amounts of knowledge and enabling question answering by inference. WiseKB thus requires an effective methodology for analyzing diverse requirements that are entangled with the integration of various components: knowledge representation, resource management, knowledge storing, complex hybrid inference, and knowledge learning. In this paper, we propose an integrated requirement analysis method that blends the traditional sequential method and the iterative-incremental method to achieve efficient requirement analysis for large-scale systems.

Hierarchical Visualization of Cloud-Based Social Network Service Using Fuzzy (퍼지를 이용한 클라우드 기반의 소셜 네트워크 서비스 계층적 시각화)

  • Park, Sun;Kim, Yong-Il;Lee, Seong Ro
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.38B no.7
    • /
    • pp.501-511
    • /
    • 2013
  • Recently, visualization methods for social network services have focused only on presenting network data, without considering efficient processing speed and computational complexity as social network big data grows at an arithmetic rate. This paper proposes a cloud-based visualization method that visualizes the user-focused hierarchical relationships between user nodes on a social network. Because the method uses fuzzy techniques to represent the hierarchical relationships of user nodes, users' social relationships can be understood intuitively, and key-role relationships among users can be identified easily. In addition, the method uses cloud-based Hadoop and Hive for distributed parallel processing of the visualization algorithm, which speeds up the processing of social network big data.

Construction of Sports Event Standard System in the Context of Big Data and Internet of Things

  • Jin Zha
    • Journal of Information Processing Systems
    • /
    • v.20 no.3
    • /
    • pp.337-344
    • /
    • 2024
  • Constructing a standard system for sports events is a complex project. Such a system covers everything from the implementation plan to the evaluation work after an event has been held, and it involves many links. Large-scale sports events have extremely high media value. However, their successful organization and operation face many problems, especially in terms of event safety. Although the organizers of large-scale events have invested many resources in holding sports events safely, violence similar to that of "football hooligans" in Europe continues to occur. At present, the standardization of sports events in China is low compared with Western developed countries, reflecting a late start and a large gap. To understand the state of standardization system construction in China, we summarized data on 15 sports events held in Chengdu in the last 5 years. By analyzing the problems that arose in holding these 15 events and participants' reflections on their experience, we discuss in depth the problems in developing the standard system of sports events. We conclude that the system structure of China's sports events is weak, that athletes have a poor experience, and that many problems remain. We then built a sports event standardization model using the Internet of Things and found in a practical test that it has a good effect. Finally, considering the current state of the sports event standardization system in China, we put forward the following suggestions: laws and regulations related to the standard system of sports events must be formulated at the national level; the integration of sports events and technology must be strengthened at the implementation level; and the quality of human resources in sports event management must be improved. The article puts forward targeted solutions that help improve and complete China's standard system for sports events while also promoting economic development and improving China's international status.

Data Processing and Visualization Method for Retrospective Data Analysis and Research Using Patient Vital Signs (환자의 활력 징후를 이용한 후향적 데이터의 분석과 연구를 위한 데이터 가공 및 시각화 방법)

  • Kim, Su Min;Yoon, Ji Young
    • Journal of Biomedical Engineering Research
    • /
    • v.42 no.4
    • /
    • pp.175-185
    • /
    • 2021
  • Purpose: Vital signs are used to help assess a person's general physical health, give clues to possible diseases, and show progress toward recovery. Researchers are using vital sign data and artificial intelligence (AI) to manage a variety of diseases and predict mortality. To analyze vital sign data using AI, it is important to select and extract vital sign data suitable for the research purpose. Methods: We developed a method to visualize vital signs and early warning scores by processing retrospective vital sign data collected from electronic medical records (EMR) and patient monitoring devices. The vital sign data used for development were obtained from the open EMR big data set MIMIC-III and a wearable patient monitoring device (CareTaker). Data processing and visualization were developed in Python. We applied the processed data to machine learning to predict mortality in ICU patients. Results: We calculated the National Early Warning Score (NEWS) to understand the patient's condition. Vital sign data with different measurement times and frequencies were sampled at equal time intervals, and missing data were interpolated to reconstruct the data. The normal and abnormal states of vital signs were visualized as color-coded graphs. Mortality prediction using the processed data and machine learning achieved an AUC of 0.892. Conclusion: This visualization method will help researchers easily understand a patient's vital sign status over time and extract the necessary data.
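The resampling step described in the Results can be illustrated with a short sketch. It assumes integer timestamps in seconds, strictly increasing, and uses linear interpolation; the abstract does not state which interpolation the authors used, and the function name `resample_linear` and the sample readings are hypothetical.

```python
from bisect import bisect_left


def resample_linear(times, values, step):
    """Resample irregular samples onto a regular grid by linear interpolation.

    times: strictly increasing integer timestamps in seconds (assumed).
    values: one measurement per timestamp.
    step: grid spacing in seconds.
    """
    if len(times) != len(values) or len(times) < 2:
        raise ValueError("need at least two samples of equal length")
    resampled = []
    for g in range(times[0], times[-1] + 1, step):
        i = bisect_left(times, g)  # first index with times[i] >= g
        if times[i] == g:
            # A measurement exists exactly on the grid point.
            resampled.append((g, values[i]))
        else:
            # Interpolate between the two neighbouring measurements.
            t0, t1 = times[i - 1], times[i]
            v0, v1 = values[i - 1], values[i]
            resampled.append((g, v0 + (v1 - v0) * (g - t0) / (t1 - t0)))
    return resampled


# Made-up heart-rate readings taken at irregular times.
heart_rate = resample_linear([0, 40, 100, 180], [80, 84, 90, 82], step=60)
```

Each vital sign can be resampled onto the same grid this way, so that values recorded at different frequencies line up row by row before scoring or visualization.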

Active Peg-in-hole of Chamferless Parts Using Multi-sensors (다중센서를 사용한 챔퍼가 없는 부품의 능동적인 삽입작업)

  • Jeon, Hun-Jong;Kim, Kab-Il;Kim, Dae-Won;Son, Yu-Seck
    • Proceedings of the KIEE Conference
    • /
    • 1993.07a
    • /
    • pp.410-413
    • /
    • 1993
  • This paper analyzes and simulates the chamferless peg-in-hole process for cylindrical parts using a force/torque sensor and a vision sensor. The peg-in-hole process is classified into the normal mode (position error only) and the tilted mode (position and orientation error). The tilted mode is further divided into the small and the big tilted mode according to the relative orientation error. Because the big tilted mode occurs very rarely, most papers have dealt only with the normal or the small tilted mode. However, most errors of the peg-in-hole process occur in the big tilted mode. This paper analyzes and simulates that problem using the force/torque sensor and vision sensor. In the normal mode, fuzzy logic is introduced to combine the data of the force/torque sensor and the vision sensor. The complete processing algorithms and simulations are also presented.
