• Title/Summary/Keyword: R statistical package


Predictive Analysis of Problematic Smartphone Use by Machine Learning Technique

  • Kim, Yu Jeong;Lee, Dong Su
    • Journal of the Korea Society of Computer and Information
    • /
    • v.25 no.2
    • /
    • pp.213-219
    • /
    • 2020
  • In this paper, we propose a classification analysis method for diagnosing and predicting problematic smartphone use, in order to provide policy data on a problem that is getting worse year after year. We also attempt to identify the key variables that affect classification. For this purpose, we compared the classification rates of three machine learning methods: Decision Tree, Random Forest, and Support Vector Machine. The data came from 25,465 people who responded to the '2018 Problematic Smartphone Use Survey' provided by the Korea Information Society Agency and were analyzed using the R statistical package (ver. 3.6.2). The three classification techniques showed similar classification rates, and none of the models showed an overfitting problem. The Support Vector Machine had the highest classification rate, followed by Decision Tree and Random Forest. Among smartphone use types, the top three variables affecting the classification rate were the Life Service type, the Information Seeking type, and the Leisure Activity Seeking type.

Factors Influencing Multi-cultural Acceptance of Freshmen in Nursing Colleges (간호대학 신입생의 다문화수용성 영향요인)

  • Jung, Sun-Young
    • Journal of Convergence for Information Technology
    • /
    • v.11 no.10
    • /
    • pp.322-331
    • /
    • 2021
  • This study attempted to identify the multi-cultural acceptance level of freshmen in nursing colleges and to analyze the factors influencing it. Data were collected by questionnaire from 410 first-year nursing students at K University in W City from March 1 to 28, 2021, and frequency analysis, reliability analysis, t-tests, ANOVA, correlation analysis, and multiple regression were conducted using the open-source statistical package R. The multi-cultural acceptance level of freshmen in nursing colleges averaged 77.36 points, indicating a slightly high capacity for multi-cultural acceptance. In the analysis of factors related to multi-cultural acceptance, Korean recognition requirements (𝛽=0.34, p<.001), perceived threat recognition for migrants (𝛽=0.29, p<.001), experience in multi-cultural education (𝛽=0.14, p<.001), and recognition of the appropriate age for multi-cultural education (𝛽=0.20, p<.001) were statistically significant. These results indicate that a regular curriculum and programs related to multi-culturalism need to be developed and actively used for nursing students.

An overview of Hawkes processes and their applications (혹스 과정의 개요 및 응용)

  • Mijeong Kim
    • The Korean Journal of Applied Statistics
    • /
    • v.36 no.4
    • /
    • pp.309-322
    • /
    • 2023
  • The Hawkes process is a point process with self-exciting characteristics. It has mainly been used to describe seismic phenomena in which a main earthquake triggers aftershocks. More recently, it has been used to explain various self-exciting phenomena, such as the spread of infectious diseases and the spread of news on SNS. The Hawkes process can be adapted flexibly to the characteristics of the events by using different types of excitation functions. Because the maximum likelihood estimator is difficult to compute numerically, estimation methods have continued to be improved. In this paper, the conditional intensity function and the excitation function are explained in order to describe the Hawkes process. Existing applications of Hawkes processes in seismology, epidemiology, criminology, and finance are then described, and estimation methods are introduced. Finally, I analyze earthquakes that occurred in Gyeongsang-do, Korea, from November 2017 to December 2022 using the R package ETAS.
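The conditional intensity mentioned in this abstract can be illustrated with a minimal sketch. It assumes the common exponential excitation function, so that lambda(t) = mu + sum over past events t_i of alpha * exp(-beta * (t - t_i)). The function name and the parameter values are hypothetical choices for illustration. They are not taken from the paper and do not reproduce the model fitted by the ETAS package.

```python
import math

def hawkes_intensity(t, event_times, mu, alpha, beta):
    """Conditional intensity of a Hawkes process with an exponential kernel.

    mu    : baseline (background) rate
    alpha : jump in intensity caused by each past event
    beta  : decay rate of that excitation
    Only events strictly before t contribute.
    """
    return mu + sum(
        alpha * math.exp(-beta * (t - ti)) for ti in event_times if ti < t
    )

# Hypothetical event times, chosen only to show the self-exciting effect.
events = [1.0, 1.5, 4.0]

# Before any event the intensity equals the baseline rate.
lam_before = hawkes_intensity(0.5, events, mu=0.2, alpha=0.8, beta=1.0)

# Shortly after two events the intensity is raised above the baseline.
lam_after = hawkes_intensity(1.6, events, mu=0.2, alpha=0.8, beta=1.0)

print(lam_before, lam_after)
```

With this kernel the ratio alpha / beta is the expected number of events triggered directly by one event, and it must be below 1 for the process to be stationary.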

Data visualization of airquality data using R software (R 소프트웨어를 이용한 대기오염 데이터의 시각화)

  • Oh, Youngchang;Park, Eunsik
    • Journal of the Korean Data and Information Science Society
    • /
    • v.26 no.2
    • /
    • pp.399-408
    • /
    • 2015
  • This paper presents the airquality data through several kinds of data visualization and describes its characteristics in relation to statistical methods of analysis. The software R was used as the visualization tool. The airquality data were measured in New York City from May to September 1973. First, a simple exploratory data analysis, combining visualization and analysis, was carried out to find univariate characteristics. Then, through data transformation and multiple regression analysis, a model describing the air quality level was found. After some categorization of the data, the overall features of the data were also explored using box plots, three-dimensional perspective plots, and scatter plots.

Improvement of in vitro Sun Protection Factor Measurement (In vitro SPF 측정법 개선에 관한 연구)

  • 안성연;배지현;이해광;문성준;장이섭
    • Journal of the Society of Cosmetic Scientists of Korea
    • /
    • v.30 no.1
    • /
    • pp.129-133
    • /
    • 2004
  • The major advantage of the in vitro test is that it is a rapid, objective, and cost-effective screening methodology. In vitro tests can provide a formulation tool to identify new fillers that are optimized by combinations of old ones, and they can be used to pre-screen protective formulas prior to in vivo testing in humans. Therefore, the accuracy of in vitro SPF measurement is very important. In this study, we tried to improve the accuracy of in vitro SPF measurement by improving the method of sample application. The outer part of Transpore® tape was used as the substrate for applying samples, and the standard drying time was set at 15 min. The new method, topical application at the light scan areas, gives more accurate and reliable results. This result suggests that a more accurate system for predicting in vivo SPF from in vitro SPF measurement can be established.

Memory Efficient Tri-Matching Algorithm (메모리 효율적인 3군 매칭 알고리즘 구현)

  • Kim, Donggil;Jung, Sung Jae
    • Proceedings of the Korean Society of Computer Information Conference
    • /
    • 2020.07a
    • /
    • pp.393-394
    • /
    • 2020
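  • Studies often build observational data by matching three groups and then carry out statistical analysis on the result. Matching compares the propensity scores of the units in each group to find pairs that are close in distance, so it has to consider as many cases as the Cartesian product of the groups, and its memory requirement is large. Three-group matching in particular is the problem of finding triplets whose pairwise distances are small. Because the three distances among the three units must all be considered, it requires far more memory than two-group matching. As the number of units in each group grows, the memory requirement increases exponentially. The TriMatch function included in an R package is the most widely used program for three-group matching. It is implemented to find the triplets for which the three distances among the three units are shortest. Its memory requirement is very large, so out-of-memory errors often occur when the number of units in each group becomes large. In this study, we propose an algorithm that reduces the memory required for three-group matching. By implementing this algorithm, we expect to obtain stable three-group matching results even as the number of units in each group increases.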
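The memory issue described in this abstract can be illustrated with a small sketch. The abstract does not specify the proposed algorithm, so what follows is not the authors' method and not the TriMatch implementation. It is one generic way to avoid storing the full Cartesian product: triplets are generated lazily and only the closest few are kept. The distance (sum of the three pairwise absolute differences in propensity score), the function names, and the sample scores are all assumptions made for illustration.

```python
import heapq

def triplet_candidates(ps_a, ps_b, ps_c):
    """Lazily yield (total_distance, i, j, k) for every cross-group triplet.

    Nothing is stored here: each triplet is produced, consumed, and dropped.
    The total distance is assumed to be the sum of the three pairwise gaps.
    """
    for i, a in enumerate(ps_a):
        for j, b in enumerate(ps_b):
            for k, c in enumerate(ps_c):
                yield (abs(a - b) + abs(b - c) + abs(a - c), i, j, k)

def closest_triplets(ps_a, ps_b, ps_c, keep):
    """Return the `keep` closest triplets in ascending order of distance.

    heapq.nsmallest holds at most `keep` items at a time, so memory stays
    bounded by `keep` rather than by len(ps_a) * len(ps_b) * len(ps_c).
    """
    return heapq.nsmallest(keep, triplet_candidates(ps_a, ps_b, ps_c))

def greedy_match(candidates):
    """Pick triplets in order of distance, using each unit at most once."""
    used_a, used_b, used_c = set(), set(), set()
    matched = []
    for _dist, i, j, k in candidates:
        if i in used_a or j in used_b or k in used_c:
            continue
        matched.append((i, j, k))
        used_a.add(i)
        used_b.add(j)
        used_c.add(k)
    return matched

# Hypothetical propensity scores for three groups of two units each.
ps_a = [0.10, 0.50]
ps_b = [0.12, 0.55]
ps_c = [0.11, 0.48]

matched = greedy_match(closest_triplets(ps_a, ps_b, ps_c, keep=4))
print(matched)
```

The trade-off is that running time still grows with the product of the group sizes, and a small `keep` can leave some units unmatched. Only the memory footprint is bounded.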


Development of gap filling technique for statistical downscaling of climate change scenario data (기후변화 시나리오 자료의 통계적 상세화를 위한 결측자료 보정 기법 개발)

  • Cho, Jaepil;Kim, Kwang-Hyung;Park, Jihoon
    • Proceedings of the Korea Water Resources Association Conference
    • /
    • 2019.05a
    • /
    • pp.16-16
    • /
    • 2019
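  • To use climate information, including climate change scenarios and seasonal forecast data, in the water resources field, the information must be downscaled in space and time. Downscaling can be divided into dynamical and statistical downscaling, and statistical downscaling requires long-term observations that represent the climate characteristics of the target region. In Korea, weather observations collected from the Automatic Weather System (AWS) and the Automatic Synoptic Observation System (ASOS) are available, but ASOS data, which cover a period of 30 years or more, are the suitable choice for statistical downscaling of climate change scenarios. In regions with poor weather observation infrastructure, such as developing countries, good-quality observations are hard to obtain because of frequent missing values. In this study, we therefore developed a technique and an R-based package that apply basic quality control (QC) and fill in missing data, so that the climate characteristics of a target region can be reproduced from long-term weather observations that contain missing values, and we evaluated their applicability. For the evaluation, we used the precipitation and temperature variables from the observations of 60 ASOS stations that the Korea Meteorological Administration provides after QC. To randomly generate realistic missing-data patterns of up to 50%, we used the daily missing-data patterns of actual observations from developing countries. The QC first performs basic error checks, such as missing or duplicated observation dates and character-type observation values, then a check of temperature against the physically permissible range, and then internal consistency checks, including a comparison of maximum and minimum temperatures and detection of the same value repeated because of instrument malfunction. Missing values are then filled based on a correlation analysis with neighboring weather stations. Finally, they are corrected based on a correlation analysis with gridded data, selected from various satellite and reanalysis datasets by evaluating how well each reproduces daily climate characteristics. For temperature, even a high missing rate had little effect on monthly mean climate characteristics. For precipitation, a missing rate of 5% or more affected monthly mean precipitation and led to underestimation of regional precipitation. Applying the developed QC technique to precipitation data restored the monthly mean climate characteristics well, but the reproduction of daily rainfall events was inadequate.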
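The neighbor-based filling step described in this abstract can be sketched roughly as follows. The abstract does not give the exact procedure, so this is an assumed simplification: pick the neighboring station with the highest Pearson correlation on the days both stations reported, fit a straight line on those days, and use it to estimate the missing values. The function names, station labels, and temperatures are hypothetical and are not part of the package the authors developed.

```python
def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if degenerate)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5

def fill_from_best_neighbor(target, neighbors):
    """Fill None entries in `target` from the best-correlated neighbor.

    target    : list of floats with None marking missing days
    neighbors : dict mapping station name -> list of the same length
    Returns (name_of_neighbor_used, filled_series).
    """
    best_name, best_r, best_pairs = None, -2.0, None
    for name, series in neighbors.items():
        # Use only the days on which both stations have a value.
        pairs = [(s, t) for s, t in zip(series, target)
                 if s is not None and t is not None]
        xs, ys = zip(*pairs)
        r = pearson(xs, ys)
        if r > best_r:
            best_name, best_r, best_pairs = name, r, pairs

    # Ordinary least squares line: target = intercept + slope * neighbor.
    # Assumes the chosen neighbor is not constant on the overlapping days.
    xs, ys = zip(*best_pairs)
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = (sum((x - mx) * (y - my) for x, y in best_pairs)
             / sum((x - mx) ** 2 for x in xs))
    intercept = my - slope * mx

    filled = []
    for t, s in zip(target, neighbors[best_name]):
        if t is None and s is not None:
            filled.append(intercept + slope * s)  # estimated value
        else:
            filled.append(t)  # observed value, or still missing
    return best_name, filled

# Hypothetical daily mean temperatures; the third day is missing.
target = [10.0, 12.0, None, 16.0, 18.0]
neighbors = {
    "A": [11.0, 13.0, 15.0, 17.0, 19.0],  # tracks the target closely
    "B": [5.0, 9.0, 4.0, 8.0, 6.0],       # weakly related
}

best, filled = fill_from_best_neighbor(target, neighbors)
print(best, filled)
```

A linear fit of this kind suits temperature better than daily precipitation, which is consistent with the abstract's finding that daily rainfall events were reproduced poorly.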


Media exposure analysis of official sponsors and general companies of mega sport event (메가 스포츠이벤트의 공식스폰서와 일반기업의 미디어 노출 분석)

  • Kim, Joo-Hak;Cho, Sun-Mi
    • Asia-pacific Journal of Multimedia Services Convergent with Art, Humanities, and Sociology
    • /
    • v.8 no.4
    • /
    • pp.171-181
    • /
    • 2018
  • As sports events account for a growing share of the sports industry, the official sponsor market for sports events is also growing. Because official sponsorships are limited and expensive, some companies approach sporting events through ambush marketing. This study analyzes the differences in media exposure between official sponsors and general companies at mega sport events. To this end, we collected and analyzed text articles covering the period of the 2016 Rio Olympics, the year before it, and the year after it. Articles were collected by web crawling with Python. Morphological and frequency analysis was performed using the KoNLP package and the TM package of the statistical program R. In addition, the opinions of a group of related experts were gathered to classify the companies and organizations appearing in the media as the Organizing Committees for the Olympic Games (OCOGs), official sponsors, or general companies. In the analysis, mentions related to the OCOGs appeared 5,220 times, mentions related to official sponsors appeared 7,845 times, and mentions related to general companies appeared 7,028 times. The frequency of exposure does not differ much between official sponsors and general companies, which implies that ambush marketing is recognized as a strategic marketing technique. The International Olympic Committee (IOC) has to recognize these social phenomena and establish reasonable standards for the marketing activities of official sponsors and general companies. This study will serve as a basis for fair sponsorship and marketing activities at sports events.

Influence Comparison of Customer Satisfaction Factor using Quantile Regression Model (분위회귀모형을 이용한 고객만족도 요인의 영향력 비교)

  • Kim, Seong-Yoon;Kim, Yong-Tae;Lee, Sang-Jun
    • Journal of Digital Convergence
    • /
    • v.13 no.6
    • /
    • pp.125-132
    • /
    • 2015
  • A number of issues are currently being raised about how weights are calculated from customer satisfaction surveys. This study investigated how the weight of each satisfaction factor differs by quantile by comparing an ordinary least squares regression model with a quantile regression model, and it carried out bootstrap verification to test the difference in the influence of the regression coefficients across quantiles. The analysis used R (quantreg package), which is open software. The results showed that the size of each satisfaction factor's influence varied by quantile and that the regression coefficients differed significantly across quantiles. A quantile regression model, which gives the influence of the satisfaction factors for each customer group by satisfaction level, would therefore contribute to planning quantitative convergence policy for customer satisfaction.
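The difference between ordinary least squares and quantile regression comes down to the loss being minimized. The sketch below illustrates the check (pinball) loss that quantile regression minimizes, in the simplest case of fitting a single constant, where the minimizer is a sample quantile. It is an illustration only, with hypothetical function names and made-up scores. It is not the quantreg implementation and uses none of the study's data.

```python
def pinball_loss(residuals, tau):
    """Check loss for quantile level tau (0 < tau < 1).

    Positive residuals (under-prediction) are weighted by tau and negative
    residuals (over-prediction) by 1 - tau.
    """
    return sum(tau * r if r >= 0 else (tau - 1) * r for r in residuals)

def best_constant(values, tau):
    """Constant that minimizes the pinball loss over the observed values.

    The loss is convex and piecewise linear in the constant, so a minimizer
    can always be found among the data points themselves.
    """
    candidates = sorted(set(values))
    return min(
        candidates,
        key=lambda c: pinball_loss([v - c for v in values], tau),
    )

# Hypothetical satisfaction scores with one extreme value.
scores = [1, 2, 3, 4, 100]

median_fit = best_constant(scores, tau=0.5)  # the sample median
upper_fit = best_constant(scores, tau=0.9)   # an upper quantile
mean_fit = sum(scores) / len(scores)         # what least squares would fit

print(median_fit, upper_fit, mean_fit)
```

The least squares fit is pulled toward the extreme value, while each quantile fit describes a different part of the distribution. This is why coefficients estimated at different quantiles can differ, as the study reports.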

A Study on the Meal Kit Product Selection Attributes on Purchasing Behavior and Satisfaction (밀키트(Meal Kit)상품의 선택속성이 구매행동과 만족도에 미치는 영향 연구)

  • Chung, Hyun-Chae;Kim, Chan-Woo
    • The Journal of the Korea Contents Association
    • /
    • v.20 no.6
    • /
    • pp.381-391
    • /
    • 2020
  • The purpose of this study is to investigate the relationship between meal kit product selection attributes, purchasing behavior, and satisfaction. Sampling was conducted over one month among customers with experience of using meal kit products recently launched by a restaurant company, and 287 questionnaires were used for the analysis. To test the hypotheses, regression analysis was performed using the SPSS 20.0 package. The results were as follows. First, for Hypothesis 1, on selection attributes and purchasing behavior, diversity (β = .026) and quality (β = .927) had a significant effect. Second, for Hypothesis 2, on selection attributes and satisfaction, convenience (β = .503) and price (β = .121) had a significant effect. Third, for Hypothesis 3, purchasing behavior (β = .561) had a significant effect on satisfaction. This study is expected to provide basic data for researchers working on meal kit products and a rationale for product development directions and marketing strategies at food service companies.