• Title/Summary/Keyword: R programming (R프로그래밍)

Comparison of Scala and R for Machine Learning in Spark (스파크에서 스칼라와 R을 이용한 머신러닝의 비교)

  • Woo-Seok Ryu
    • The Journal of the Korea institute of electronic communication sciences / v.18 no.1 / pp.85-90 / 2023
  • Data analysis methodology in the healthcare field is shifting from traditional statistics-oriented research methods to predictive research using machine learning. In this study, we survey various machine learning tools and compare several programming models that combine R and Spark, with the aim of applying R, a statistical tool widely used in healthcare, to machine learning. In addition, we compare the performance of a linear regression model implemented in Scala, the base language of Spark, and in R. In the experiment, training time with SparkR was 10 to 20% longer than with Scala. Weighing this performance degradation, SparkR's distributed processing was confirmed to be useful because it allows R, a traditional statistical analysis tool, to be used as is.

Curriculum of Basic Data Science Practices for Non-majors (비전공자 대상 기초 데이터과학 실습 커리큘럼)

  • Hur, Kyeong
    • Journal of Practical Engineering Education / v.12 no.2 / pp.265-273 / 2020
  • In this paper, to design a basic data science practice curriculum as a liberal arts subject for non-majors, we propose an educational method using the Excel (spreadsheet) data analysis tool. Tools for data collection, data processing, and data analysis include Excel, R, Python, and Structured Query Language (SQL). Practicing data science with R, Python, or SQL requires learners to understand a programming language and data structures at the same time. Excel, on the other hand, is a data analysis tool familiar to the general public and carries no burden of learning a programming language. Practicing basic data science with Excel therefore lets learners concentrate on the data science content itself. We propose a one-semester basic data science practice curriculum with weekly Excel practice contents. To demonstrate the substance of the educational content, we present examples of linear regression analysis using the Excel data analysis tools.

Analysis Methodology for Empirical Research in Entrepreneurship Studies (창업연구 실증연구 분석방법론)

  • Lee, Il-Han
    • Proceedings of the Korean Society of Business Venturing Conference (한국벤처창업학회:학술대회논문집) / 2017.04a / pp.17-17 / 2017
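  • Structural Equation Modeling (SEM) is a statistical technique for testing causal relationships and correlations among variables. It was developed in sociology and psychology but is now widely used across many disciplines, including business administration, advertising, education, biology, physical education, medicine, and political science. Because Amos provides both a graphical interface (Amos Graphics) and a Basic interface (Amos Basic), even beginners who lack knowledge of exact program writing or matrices can use icons to analyze complex research models and multi-group analysis models. PLS (Partial Least Squares) is a program that maximizes predictive power by minimizing the residuals or prediction errors that arise during model estimation; that is, PLS-SEM is useful when the sample size is small, when the data are not normally distributed, for formative indicator models, and for complex research models. With the recent big data boom, R has become popular in practice as a tool for analyzing data. R is a statistical programming language and open software environment that supports a large and diverse set of packages for statistics, graphics, data mining, and more. Because the packages provided in R are open source and cover linear and nonlinear modeling, classical statistical analysis, time series analysis, classification, and cluster analysis, R is expected to play an important role not only in practice but also in academia, particularly in social science research that conducts statistics-based empirical analysis.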

SOAP-based Distributed Processing Scheduling Framework: pyBubble (SOAP기반의 분산처리 스케줄링 프레임웍: pyBubble)

  • ;;;R.S.Ramakrishna
    • Proceedings of the Korean Information Science Society Conference / 2004.04a / pp.742-744 / 2004
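  • This paper describes the design and implementation of pyBubble, a parallel processing framework based on SOAP, a web service protocol. Its goal is to ease the difficulty of grid application programming by providing transparency over the complexity of grid middleware. Its main functional element is a scheduling framework that supports an RPC-style programming interface and, through the portability and extensibility of the Python scripting language, makes it possible to grid-enable existing parallel processing applications and to study a variety of resource scheduling schemes. For parallel processing, it supports asynchronous SOAP together with Task-Farming and DAG-based scheduling built on it, with the aim of providing a high-performance grid computing environment.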

Analysis of Linear Time-Invariant Sparse Network and its Computer Programming (sparse 행렬을 이용한 저항 회로망의 해석과 전산프로그래밍)

  • 차균현
    • Journal of the Korean Institute of Telematics and Electronics / v.11 no.2 / pp.1-4 / 1974
  • Matrix inversion is very inefficient for computing direct solutions of the large sparse systems of linear equations that arise in many network problems. This paper describes some computer programming techniques for taking advantage of the sparsity of the admittance matrix. With this method, direct solutions are computed from the sparse matrix. It is possible to gain a significant reduction in computing time, memory, and round-off error. Details of the method, numerical examples, and programming are given.
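
As a rough illustration of the idea in this abstract (and not the paper's own routine), the sketch below solves nodal equations Y v = i by storing only the nonzero entries of the admittance matrix and eliminating directly, without ever forming an inverse. The list-of-dicts storage, the function name `solve_sparse`, and the three-node example matrix are all assumptions made for illustration.

```python
def solve_sparse(rows, rhs):
    """Solve A x = b by Gaussian elimination on a sparse row store.

    rows: list of dicts, where rows[i][j] is the nonzero entry A[i][j]
    rhs:  list of floats, the right-hand side b
    No pivoting is performed; this sketch assumes a diagonally dominant
    matrix, as is typical of admittance matrices of resistive networks.
    """
    n = len(rows)
    a = [dict(r) for r in rows]  # work on copies so the inputs are untouched
    b = list(rhs)

    # Forward elimination: arithmetic is done only on stored nonzeros.
    for k in range(n):
        pivot = a[k][k]
        for i in range(k + 1, n):
            if k not in a[i]:
                continue  # structural zero, nothing to eliminate
            factor = a[i].pop(k) / pivot
            for j, value in a[k].items():
                if j > k:
                    a[i][j] = a[i].get(j, 0.0) - factor * value
            b[i] -= factor * b[k]

    # Back substitution, again visiting only stored entries.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = b[i] - sum(value * x[j] for j, value in a[i].items() if j > i)
        x[i] = s / a[i][i]
    return x


# Hypothetical 3-node admittance matrix (siemens) and current injections (amperes).
Y = [{0: 2.0, 1: -1.0}, {0: -1.0, 1: 2.0, 2: -1.0}, {1: -1.0, 2: 2.0}]
i_src = [1.0, 0.0, 0.0]
print(solve_sparse(Y, i_src))  # node voltages, approximately [0.75, 0.5, 0.25]
```

The row scan in the elimination loop is still linear in the number of nodes; a production solver would also index the nonzeros by column and reorder the nodes to limit fill-in.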

An Exploratory Study on Determinants Affecting R Programming Acceptance (R 프로그래밍 수용 결정 요인에 대한 탐색 연구)

  • Rubianogroot, Jennifer;Namn, Su Hyeon
    • Management & Information Systems Review / v.37 no.1 / pp.139-154 / 2018
  • R is a free and open source system associated with a rich and ever-growing set of libraries of functions developed and submitted by independent end-users. It is recognized as a popular tool for handling and analyzing big data sets. Reflecting these characteristics, R has been gaining popularity among data analysts. However, the antecedents of R technology acceptance have not yet been studied. In this study we identify and investigate cognitive factors that contribute to building user acceptance of R in an educational environment. We extend the existing technology acceptance model by incorporating social norms and software capability. We found that subjective norm, perceived usefulness, and ease of use positively affect the intention to accept R programming. In addition, perceived usefulness is related to subjective norms, perceived ease of use, and software capability. The main difference between this research and previous studies is that the target system is not a stand-alone system. Nor is it static, in the sense that it is not a final version; instead, R is an evolving, open source system. We applied the Technology Acceptance Model (TAM) to the target system, a platform on which diverse applications such as statistical analysis, big data analysis, and visual rendering can be performed. The model presented in this work can be useful both for colleges that plan to invest in new statistical software and for companies that need to pursue future installations of new technologies. We also identified a modified version of TAM that extends the original model with the constructs of subjective norm and software capability. However, one weakness that might limit the reliability and validity of the model is the small sample size.

Uncertain Knowledge Processing for Oriental Medicine Diagnostic Model (한의 진단 모델의 추론 과정에서 발생하는 불확실한 진단 지식의 처리)

  • Shin, Yang-Kyu
    • Journal of the Korean Data and Information Science Society / v.8 no.1 / pp.1-7 / 1997
  • The inference process of a medical expert system is mostly formed by diagnostic knowledge in an if-then rule base. Oriental medicine diagnostic knowledge, however, may involve uncertain knowledge caused by ambiguous concepts. In this paper, we analyze an oriental medicine diagnostic process with a rule-based inference system, and propose a method for representing and processing uncertain oriental medicine diagnostic knowledge using CLP(R), which is a kind of constraint satisfaction program.

Future Trends of Substation Automation System (변전소 자동화 시스템의 발전 추이와 미래)

  • Choi, Dae-Hee;Yang, Hang-Jun;Choi, Young-Jun;Hong, Jung-Gi
    • Proceedings of the KIEE Conference / 2003.07a / pp.531-533 / 2003
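  • From an energy standpoint, substations are installed at points where energy is interconnected, separated, or transformed, and the substation automation system monitors and protects the substation's main equipment and each feeder. With advances in microprocessor, communication, and programming technology, the digitalization of substation systems has accelerated since around 1990, and the systems have also changed structurally. Substation systems are moving away from the rigid configuration of conventional analog systems, and the pace of change is gradually increasing. To capture this trend, this paper describes the current state of development of substation automation systems and the direction of future change.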

Hadoop and MapReduce (하둡과 맵리듀스)

  • Park, Jeong-Hyeok;Lee, Sang-Yeol;Kang, Da Hyun;Won, Joong-Ho
    • Journal of the Korean Data and Information Science Society / v.24 no.5 / pp.1013-1027 / 2013
  • As the need for large-scale data analysis rapidly increases, Hadoop, a platform that realizes large-scale data processing, and MapReduce, the internal computational model of Hadoop, are receiving great attention. This paper reviews the basic concepts of Hadoop and MapReduce that data analysts familiar with statistical programming need, through examples that combine the R programming language and Hadoop.
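
The MapReduce computational model named in this abstract can be sketched in a few lines. The example below is a single-process word count that only mimics the map, shuffle, and reduce phases; it does not use Hadoop, and it is written in Python rather than the R used in the paper's own examples. All function names here are hypothetical.

```python
from collections import defaultdict


def map_phase(record):
    """Map: emit a (word, 1) pair for every word in one input record."""
    for word in record.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key, as the framework would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse the values collected for one key into a total."""
    return key, sum(values)


def map_reduce(records):
    """Run map, shuffle, and reduce in sequence over a list of records."""
    pairs = (pair for record in records for pair in map_phase(record))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


counts = map_reduce(["R and Hadoop", "Hadoop and MapReduce"])
print(counts)  # {'r': 1, 'and': 2, 'hadoop': 2, 'mapreduce': 1}
```

In a real cluster the map and reduce calls run on different machines and the shuffle moves data between them; here the three phases are ordinary function calls so the data flow is easy to follow.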

Algorithm and computerize programming to induce optimized Far-infrared radiation (원적외선 최적화 방사유도 알고리즘과 프로그래밍)

  • Kim, Jae-Yoon;Park, Don-Mork;Park, Young-Han;Park, Rae-Joon
    • The Journal of Korean Physical Therapy / v.13 no.2 / pp.257-264 / 2001
  • To obtain far-infrared (FIR) rays of optimized wavelength and strength, the characteristic algorithm and computer program for FIR-radiating materials must first be derived. In this study, we derived the formula for optimized FIR from physical and mathematical logic and theory, in particular the laws of Planck, Kirchhoff, Wien, and Stefan-Boltzmann. Ultimately, the formula was derived by mathematical integration, since we had to know the molecular wavelength. Based on this formula, we wrote a computer program for optimized FIR radiation, which would be useful for designing a semiconductor (VLSI) as the central control system of an FIR instrument.
