• Title/Abstract/Keywords: Data analysis study

A Study of Comparison between Cruise Tours in China and U.S.A through Big Data Analytics

  • Shuting, Tao;Kim, Hak-Seon
    • 한국조리학회지 / Vol. 23, No. 6 / pp.1-11 / 2017
  • The purpose of this study was to compare cruise tours in China and the U.S.A. through semantic network analysis of big data, collecting online data with SCTM (Smart Crawling & Text Mining), a data collection and processing program. The data analysis period ran from January 1, 2015 to August 15, 2017; "cruise tour, china" and "cruise tour, usa" were used as keywords to collect related data, and UCINET 6.0 with its packaged NetDraw was used for data analysis. Currently, Chinese cruisers are concerned with cruising destinations, while American cruisers pay more attention to onboard experience and cruising expenditure. CONCOR (convergence of iterated correlations) analysis produced three clusters for Chinese cruise tours: domestic destinations, international destinations, and hospitality tourism. For American cruise tours, four groups emerged: cruise expenditure, onboard experience, cruise brand, and destinations. Since cruise tourism in America is highly developed, this study also aims to provide significant, social network-oriented suggestions for Chinese cruise tourism.
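
As a rough illustration of the CONCOR step named in this abstract, the Python sketch below repeatedly correlates the columns of a matrix until every entry is close to +1 or -1, then splits the columns into two blocks by sign. It is not UCINET's implementation, and the keyword-by-document counts, the keyword labels, and all function names are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if one is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def correlate_columns(matrix):
    """Return the column-by-column correlation matrix."""
    cols = list(zip(*matrix))
    return [[pearson(a, b) for b in cols] for a in cols]


def concor_split(matrix, max_iter=100, tol=1e-6):
    """Split the columns of `matrix` into two blocks by iterated correlation."""
    corr = correlate_columns(matrix)
    for _ in range(max_iter):
        # Stop once every correlation has converged to +1 or -1.
        if all(abs(abs(v) - 1.0) < tol for row in corr for v in row):
            break
        corr = correlate_columns(corr)
    # Columns positively correlated with column 0 form one block.
    block_a = [i for i, v in enumerate(corr[0]) if v > 0]
    block_b = [i for i, v in enumerate(corr[0]) if v <= 0]
    return block_a, block_b


# Hypothetical counts: rows are documents, columns are keywords.
keywords = ["port", "shanghai", "price", "cabin"]
counts = [
    [3, 2, 0, 0],
    [2, 3, 0, 1],
    [4, 3, 1, 0],
    [0, 0, 3, 2],
    [1, 0, 2, 3],
    [0, 1, 4, 4],
]
block_a, block_b = concor_split(counts)
print([keywords[i] for i in block_a], [keywords[i] for i in block_b])
```

Applying the same split again inside each block would yield finer clusters, which is how more than two groups are usually obtained.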

Analysis of Impact Between Data Analysis Performance and Database

  • Kyoungju Min;Jeongyun Cho;Manho Jung;Hyangbae Lee
    • Journal of information and communication convergence engineering / Vol. 21, No. 3 / pp.244-251 / 2023
  • Engineering or humanities data are stored in databases and are often used for search services. While the latest deep-learning technologies, such as BART and BERT, are used for data analysis, humanities data still rely on traditional databases. Representative analysis methods include n-gram and lexical statistical extraction. However, when a database is used, performance limitations are often imposed on the result calculations. This study presents an experimental process using MariaDB on a PC, which is easily accessible in a laboratory, to analyze the impact of the database on data analysis performance. The findings show that the database becomes a bottleneck when analyzing large-scale text data, particularly beyond hundreds of thousands of records. To address this issue, a method was proposed to provide real-time humanities data analysis web services by leveraging the open-source database, with a focus on the Seungjeongwon-Ilgy, one of the largest datasets in the humanities.
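
The n-gram extraction mentioned in this abstract can be sketched in a few lines of Python. This is only an in-memory illustration with whitespace tokenization; it does not reproduce the paper's MariaDB-based process, and the function name and sample sentence are hypothetical.

```python
from collections import Counter


def ngram_counts(text, n=2):
    """Count word n-grams in `text` using simple whitespace tokenization."""
    tokens = text.lower().split()
    # Zip the token list against shifted copies of itself to form n-grams.
    grams = zip(*(tokens[i:] for i in range(n)))
    return Counter(" ".join(gram) for gram in grams)


sample = "the king said the king left"
bigrams = ngram_counts(sample, n=2)
print(bigrams.most_common(3))
```

For text without spaces between words, the same idea would apply to characters rather than whitespace-separated tokens.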

A Method for Deriving Customer Needs in Online Communities by Integrating Big Data and Netnography Analysis: The Case of a Cloth Diaper Online Community (How to Identify Customer Needs Based on Big Data and Netnography Analysis)

  • 박순화;박상혁;오승희
    • 경영정보학연구 / Vol. 21, No. 4 / pp.175-195 / 2019
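  • This study used an integrated big data-netnography model to analyze consumer needs and behavior in an online consumer community. Big data analysis makes it easy to identify correlations but hard to establish causation, so netnography analysis was used alongside it. Netnography, a qualitative research method conducted in online environments, excels at capturing context, but analyzing large volumes of data accumulated over a long time is costly in time and money. Therefore, this study used big data analysis to find patterns in the overall data accumulated on an online community site, identified singular points that called for netnography analysis, and performed netnography analysis only around those points. The causes of the various phenomena revealed by big data analysis could be explained through netnography analysis. In addition, internal structural changes in the community that big data analysis does not readily reveal could also be identified. As a result, this study was able to analyze contextual meaning from unstructured data that big data analysis had previously missed and to explain effectively much of the online consumer behavior that had been difficult to understand. The integrated big data-netnography model proposed here can serve as a good tool for discovering new consumer needs in online environments. Future research will verify and refine the suitability and superiority of the proposed approach by applying it to a variety of cases.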

The effect of missing levels of nesting in multilevel analysis

  • Park, Seho;Chung, Yujin
    • Genomics & Informatics / Vol. 20, No. 3 / pp.34.1-34.11 / 2022
  • Multilevel analysis is an appropriate and powerful tool for analyzing hierarchical structure data widely applied from public health to genomic data. In practice, however, we may lose the information on multiple nesting levels in the multilevel analysis since data may fail to capture all levels of hierarchy, or the top or intermediate levels of hierarchy are ignored in the analysis. In this study, we consider a multilevel linear mixed effect model (LMM) with single imputation that can involve all data hierarchy levels in the presence of missing top or intermediate-level clusters. We evaluate and compare the performance of a multilevel LMM with single imputation with other models ignoring the data hierarchy or missing intermediate-level clusters. To this end, we applied a multilevel LMM with single imputation and other models to hierarchically structured cohort data with some intermediate levels missing and to simulated data with various cluster sizes and missing rates of intermediate-level clusters. A thorough simulation study demonstrated that an LMM with single imputation estimates fixed coefficients and variance components of a multilevel model more accurately than other models ignoring data hierarchy or missing clusters in terms of mean squared error and coverage probability. In particular, when models ignoring data hierarchy or missing clusters were applied, the variance components of random effects were overestimated. We observed similar results from the analysis of hierarchically structured cohort data.

A Study on the Development of the Key Promoting Talent in the 4th Industrial Revolution - Utilizing Six Sigma MBB competency-

  • Kim, Kang Hee;Ree, Sang bok
    • 품질경영학회지 / Vol. 45, No. 4 / pp.677-696 / 2017
  • Purpose: This study suggests that Six Sigma MBBs should be developed as key talent to lead the fourth industrial revolution era by training them in big data processing. Methods: By comparing articles on the fourth industrial revolution with Six Sigma related papers, the common competencies of data scientists and Six Sigma MBBs were identified and the big data analysis capabilities needed by Six Sigma MBBs were derived. Training was then conducted to improve these capabilities so that Six Sigma MBBs could design the algorithms required in the fourth industrial revolution era. Results: Six Sigma MBBs, equipped with knowledge of field site improvement and basic statistics, received 40 hours of big data analysis training and were then asked to design a big data algorithm. Positive results were obtained after applying an AI algorithm that could forecast process defects at a field site. Conclusion: Six Sigma MBBs equipped with big data capability will make the best talent for the fourth industrial revolution era. A Six Sigma MBB has an excellent capability for improving field sites. Utilizing the competencies of MBBs can be a key to success in the fourth industrial revolution. We hope that the results of this study will be shared with many companies and that many more improved case studies will follow.

Analysis of Xiaomi Trends Using Big Data - Based on Customer Perception at Domestic and Global -

  • 이은지;문재영
    • 품질경영학회지 / Vol. 52, No. 2 / pp.323-340 / 2024
  • Purpose: The purpose of this study was to propose useful suggestions by conducting a big data analysis of Xiaomi, using data on customer perception collected in Textom. Methods: Data were collected by scraping social media through the Textom site. Preprocessing removed and organized text that was duplicated, irrelevant, or meaningless. The resulting data were analyzed using Textom and UCINET 6.0 with text analysis, word clouds, TF-IDF, network analysis, and sentiment (emotional) analysis. Results: Although the domestic and global text results for Xiaomi were similar, perceptions in Korea centered on Xiaomi-related smart home products and cost-effectiveness, while perceptions abroad centered on smartphone functions and performance. Both domestically and globally, the perception of Xiaomi was found to be positive, and implications were presented based on these results. Conclusion: Based on the results, if the products' performance or competitiveness is regarded as meaningful in the market, there may be an opportunity to change the overall image of Chinese products.
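
To make the TF-IDF step in this abstract concrete, here is a minimal Python sketch using the plain weighting tf × log(N / df). The exact formula Textom applies is not stated in the abstract, so this variant is an assumption, and the function name and sample documents are hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: score} dict per document, using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: the number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        total = len(tokens)
        scores.append(
            {t: (c / total) * math.log(n_docs / df[t]) for t, c in tf.items()}
        )
    return scores


docs = [
    "xiaomi phone camera",
    "xiaomi vacuum price",
    "xiaomi phone price",
]
for doc_scores in tf_idf(docs):
    print(sorted(doc_scores.items(), key=lambda kv: -kv[1]))
```

A term that appears in every document, such as the brand name here, scores zero under this weighting, which is why TF-IDF tends to surface the words that distinguish one set of posts from another.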

Categorical Data Analysis by Means of Echelon Analysis with Spatial Scan Statistics

  • Moon, Sung-Ho
    • Journal of the Korean Data and Information Science Society / Vol. 15, No. 1 / pp.83-94 / 2004
  • In this study, we analyze categorical data by means of spatial statistics and echelon analysis. To do this, we first determine the hierarchical structure of a given contingency table using an echelon dendrogram; we then detect hotspot candidates, given as the top echelon in the dendrogram. Next, we evaluate spatial scan statistics for the zones of significantly high or low rates based on the likelihood ratio. Finally, we detect hotspots of any size and shape based on spatial scan statistics.
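
The likelihood-ratio evaluation described in this abstract can be illustrated with a common Poisson-based form of the spatial scan statistic. The abstract does not confirm which model the paper uses, so this choice is an assumption; the sketch scores only high-rate zones, and the zone names, counts, and function names are hypothetical.

```python
import math


def poisson_llr(observed, expected, total):
    """Log likelihood ratio for a zone under a Poisson model.

    `observed` and `expected` are the zone's case counts; `total` is the
    number of cases in the whole study area. Assumes 0 < expected < total.
    Zones at or below their expected count score 0.
    """
    if observed <= expected:
        return 0.0
    inside = observed * math.log(observed / expected)
    outside = 0.0
    if total > observed:
        rest = total - observed
        outside = rest * math.log(rest / (total - expected))
    return inside + outside


def best_zone(zones, total):
    """Return the name of the zone with the largest log likelihood ratio."""
    return max(zones, key=lambda name: poisson_llr(*zones[name], total))


# Hypothetical candidate zones: name -> (observed, expected).
zones = {"A": (30, 10.0), "B": (12, 10.0), "C": (5, 10.0)}
print(best_zone(zones, total=100))
```

In practice, significance is usually assessed by recomputing the maximum ratio on randomized data, which this sketch leaves out.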

A Method for Analyzing Web Log of the Hadoop System for Analyzing a Effective Pattern of Web Users

  • 이병주;권정숙;고기철;최용락
    • 한국IT서비스학회지 / Vol. 13, No. 4 / pp.231-243 / 2014
  • Of the various data that corporations can access, web log data are important for the data analysis needed to implement customer relationship management strategies. As the volume of accessible data has increased exponentially with the Internet and the popularization of smartphones, web log data have also grown considerably. As a result, it has become difficult to expand storage flexibly enough to process large amounts of web log data, and extremely hard to implement a system capable of categorizing, analyzing, and processing web log data accumulated over a long period of time. This study therefore set out to apply Hadoop, a distributed processing system that had recently come into the spotlight for its capacity to process large volumes of data, and to propose an efficient analysis plan for large amounts of web logs. The study examined the forms of web logs according to effective web log collection methods and the web log levels using Hadoop, and proposed analysis techniques and Hadoop organization designs accordingly. The study resolved the difficulty of processing large amounts of web log data and presented user activity patterns through web log analysis, thus demonstrating its advantages as a new means of marketing.
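
The kind of web log counting that a Hadoop job distributes across machines can be imitated in memory with a map step and a reduce step. The Python sketch below is not Hadoop code and does not reflect the paper's design; it assumes log lines in Common Log Format, and the regular expression, function names, and sample lines are hypothetical.

```python
import re
from collections import defaultdict

# Assumed Common Log Format: ip ident user [time] "METHOD path proto" status size
LOG_PATTERN = re.compile(
    r'^(\S+) \S+ \S+ \[[^\]]+\] "(\S+) (\S+)[^"]*" (\d{3})'
)


def map_line(line):
    """Map step: emit (path, 1) for each well-formed log line."""
    match = LOG_PATTERN.match(line)
    if match:
        _ip, _method, path, _status = match.groups()
        yield (path, 1)


def reduce_counts(pairs):
    """Reduce step: sum the values emitted for each key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def count_page_hits(lines):
    """Run the map and reduce steps over an iterable of log lines."""
    return reduce_counts(pair for line in lines for pair in map_line(line))


sample_lines = [
    '10.0.0.1 - - [01/Jan/2014:10:00:00 +0900] "GET /index.html HTTP/1.1" 200 512',
    '10.0.0.2 - - [01/Jan/2014:10:00:05 +0900] "GET /cart HTTP/1.1" 200 128',
    '10.0.0.1 - - [01/Jan/2014:10:01:00 +0900] "GET /index.html HTTP/1.1" 304 0',
    "malformed line",
]
print(count_page_hits(sample_lines))
```

Emitting (ip, path) pairs instead of (path, 1) in the map step would be one way to start reconstructing per-user activity patterns.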

Text Mining and Visualization of Unstructured Data Using Big Data Analytical Tool R

  • 남수태;신성윤;진찬용
    • 한국정보통신학회논문지 / Vol. 25, No. 9 / pp.1199-1205 / 2021
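  • In the big data era, it is very important to analyze effectively not only structured data neatly organized in databases but also unstructured big data such as web documents, e-mails, and social data generated in real time on the Internet, social network services, and mobile environments. Big data analysis is the process of creating new value by discovering meaningful new correlations, patterns, and trends in big data stored in data repositories. Using R, a big data analysis tool, this study summarizes and visualizes the results of a frequency analysis of unstructured paper data. The data analyzed were the 104 papers published in the January through May 2021 issues of the journal of the Korea Institute of Information and Communication Engineering (한국정보통신학회). In the final analysis, the most frequently mentioned keyword was "데이터" ("data"), ranking first with 1,538 occurrences. Based on these results, the study presents its limitations and its theoretical and practical implications.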

Automatic Cross-calibration of Multispectral Imagery with Airborne Hyperspectral Imagery Using Spectral Mixture Analysis

  • Yeji, Kim;Jaewan, Choi;Anjin, Chang;Yongil, Kim
    • 한국측량학회지 / Vol. 33, No. 3 / pp.211-218 / 2015
  • The analysis of remote sensing data depends on sensor specifications that provide accurate and consistent measurements. However, it is not easy to establish confidence and consistency in data that are analyzed by different sensors using various radiometric scales. For this reason, the cross-calibration method is used to calibrate remote sensing data with reference image data. In this study, we used an airborne hyperspectral image in order to calibrate a multispectral image. We presented an automatic cross-calibration method to calibrate a multispectral image using hyperspectral data and spectral mixture analysis. The spectral characteristics of the multispectral image were adjusted by linear regression analysis. Optimal endmember sets between two images were estimated by spectral mixture analysis for the linear regression analysis, and bands of hyperspectral image were aggregated based on the spectral response function of the two images. The results were evaluated by comparing the Root Mean Square Error (RMSE), the Spectral Angle Mapper (SAM), and average percentage differences. The results of this study showed that the proposed method corrected the spectral information in the multispectral data by using hyperspectral data, and its performance was similar to the manual cross-calibration. The proposed method demonstrated the possibility of automatic cross-calibration based on spectral mixture analysis.
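
Two of the evaluation measures named in this abstract, RMSE and the Spectral Angle Mapper (SAM), together with a simple gain/offset regression, can be sketched in plain Python as follows. This illustrates the general formulas only, not the paper's cross-calibration pipeline; the function names and sample spectra are hypothetical.

```python
import math


def rmse(a, b):
    """Root mean square error between two equal-length spectra."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def spectral_angle(a, b):
    """Angle in radians between two spectra treated as vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Clamp to [-1, 1] to guard against floating-point rounding.
    cos_angle = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return math.acos(cos_angle)


def fit_gain_offset(x, y):
    """Ordinary least squares fit of y = gain * x + offset."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    gain = sxy / sxx
    return gain, mean_y - gain * mean_x


# Hypothetical band values for one pixel in a reference and a target image.
reference = [0.10, 0.20, 0.40]
target = [0.20, 0.40, 0.80]
print(rmse(reference, target), spectral_angle(reference, target))
print(fit_gain_offset(reference, target))
```

The spectral angle ignores overall brightness, so a spectrum and a scaled copy of it give an angle near zero even when their RMSE is large; this is one reason to report both measures.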