• Title/Summary/Keyword: data crawling

Product Recommendation System based on User Purchase Priority

  • Bang, Jinsuk;Hwang, Doyeun;Jung, Hoekyung
    • Journal of information and communication convergence engineering / v.18 no.1 / pp.55-60 / 2020
  • As personalized customer services create a society that emphasizes individuality, the number of product reviews and the quantity of user data generated on the internet through mobile shopping apps and sites are increasing. Such product review data are classified as unstructured data. With appropriate processing and analysis, unstructured data can be transformed into information that companies and users can employ. However, when analyzing review data, existing systems do not reflect the detailed information they collect, such as user characteristics, purchase preference, or purchase priority. Thus, it is challenging to provide customized recommendations for various users. Therefore, in this study, we have developed a product recommendation system that takes into account the priority that the user selects when searching for and purchasing a product. The recommendation system then processes and analyzes the user's preferences and displays the results. Since the user's preference is considered, the user can obtain more relevant results.

Development of A Uniform And Casual Clothing Recognition System For Patient Care In Nursing Hospitals

  • Yun, Ye-Chan;Kwak, Young-Tae
    • Journal of the Korea Society of Computer and Information / v.25 no.12 / pp.45-53 / 2020
  • The purpose of this paper is to reduce the rate of patient accidents that may occur in nursing hospitals. Specifically, the system determines whether a person approaching a dangerous area belongs to the elderly (patient uniform) group or the practitioner (casual clothing) group, based on the clothing shown on CCTV. We collected the basic training data through web crawling and from nursing hospitals, and then created the model training data with an Image Generator and a labeling program. Due to the limited performance of CCTV, it is difficult to create a model with both high accuracy and high speed. Therefore, we implemented the ResNet model, which has relatively high accuracy, and the YOLO3 model, which has relatively high speed, so that nursing hospitals can choose the model they prefer. As a result, we implemented a model that can distinguish patient uniforms from casual clothes with appropriate accuracy. We believe it will contribute to reducing safety accidents in nursing hospitals by preventing the elderly from entering danger zones.

A Scientific Quantitative Analysis on Vegetables of Joseon Dynasty using the Joseonwangjoshilrok based Data (조선왕조실록 과학계량적 분석을 통한 채소류의 통시적 고찰)

  • Kim, Mi-Hye
    • Journal of the Korean Society of Food Culture / v.36 no.2 / pp.143-157 / 2021
  • This study aimed to analyze the periodic prevalence of vegetables during the Joseon era with the JoseonWangjoSilrok as a reference. The JoseonWangjoSilrok articles were collected from the Guksapyeonchanwewonhwe site, using web-crawling techniques to extract the relevant information. Out of 384,582 search results, 9,560 articles with vegetable-related keywords were found. According to the annual average number of vegetable records during the reigns of the various kings, there were two peaks, in 15th- and 18th-century Joseon. The number of records found was 2,750 in the 18th century, 2,529 in the 15th century, 1,424 in the 16th century, and 1,018 in the 19th century. A Variable Interest Index was designed to ascertain the interest in vegetables of the 27 Joseon kings. The king most interested in vegetables was the 19th king, Sookjong. The second most interested king was Youngjo. There were 5,105 vegetable-related findings within the JoseonWangjoSilrok related to specific species and categories of vegetables. Among the words found, 1,194 were stem-leaves vegetables (23.39%), 1,017 were root vegetables (19.92%), 1,148 were flower-fruit vegetables (22.49%), 1,144 were spice vegetables (22.41%), 95 were mushrooms (1.86%), and 507 were seaweeds (9.93%). Statistical analysis using ANOVA revealed the chronological factors that affected the vegetables' prevalence index.
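
The category shares quoted in this abstract can be checked with a few lines of arithmetic. The following is a minimal sketch, not the author's analysis code: the counts are copied from the abstract, while the names `counts`, `share`, and `shares` are introduced here purely for illustration.

```python
# Category counts as reported in the abstract above.
counts = {
    "stem-leaves vegetables": 1194,
    "root vegetables": 1017,
    "flower-fruit vegetables": 1148,
    "spice vegetables": 1144,
    "mushrooms": 95,
    "seaweeds": 507,
}

# The abstract reports 5,105 findings in total; the counts should sum to that.
total = sum(counts.values())


def share(count, total):
    """Return count as a percentage of total, rounded to two decimals."""
    return round(100 * count / total, 2)


# Hypothetical helper mapping: category name -> percentage share.
shares = {name: share(n, total) for name, n in counts.items()}

for name, pct in shares.items():
    print(f"{name}: {counts[name]} ({pct}%)")
```

The six counts sum to the reported total of 5,105, and each computed share matches the percentage given in the abstract to two decimal places.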

A Study on Usage Frequency of Translated English Phrase Using Google Crawling

  • Kim, Kyuseok;Lee, Hyunno;Lim, Jisoo;Lee, Sungmin
    • Proceedings of the Korea Information Processing Society Conference / 2020.11a / pp.689-692 / 2020
  • People have long studied English with online dictionaries, looking up the meanings of words or example sentences. These days, as AI technologies such as machine learning develop, documents can be translated in real time with translators such as Kakao, Papago, and Google. However, there are still some problems with translation accuracy. AI assistants can be used for real-time interpreting, so such systems are being used to translate web pages and papers into Korean. In this paper, we investigated the usage frequency of combined English phrases from dictionaries by analyzing the number of search results on Google. With the results of this paper, we expect to help people use English more fluently.

A Keyword-Based Big Data Analysis for Individualized Health Activity: Focusing on Methodological Approach

  • Kim, Han-Byul;Bae, Geun-Pyo;Huh, Jun-Ho
    • Proceedings of the Korea Information Processing Society Conference / 2017.04a / pp.540-543 / 2017
  • Big Data, now emerging across the 21st-century global digital economy, will make it possible to solve some of the major issues in our society and economy. One of the main areas where Big Data can be quite useful is medicine and health. Information technology is already used extensively in this area and is expected to expand its field of application further. However, there is still room for improvement in the usage of Big Data, as it is difficult to search the unstructured data contained in Big Data and to collect statistics on them. This limits the wider application of Big Data. The results obtained from Big Data can vary depending on the data collection and analysis methods, and they may be positive or negative, so it is essential that Big Data be handled appropriately for the purpose at hand. Therefore, in this study, a Big Data set was constructed by applying a crawling technique for data mining and was analyzed with R. The data were also visualized for easier recognition, which was effective in developing an individualized health plan from different angles.

Korean Collective Intelligence in Sharing Economy Using R Programming: A Text Mining and Time Series Analysis Approach (R프로그래밍을 활용한 공유경제의 한국인 집단지성: 텍스트 마이닝 및 시계열 분석)

  • Kim, Jae Won;Yun, You Dong;Jung, Yu Jin;Kim, Ki Youn
    • Journal of Internet Computing and Services / v.17 no.5 / pp.151-160 / 2016
  • The purpose of this research is to investigate current Korean popular attitudes and social perceptions of the term 'sharing economy' from a creative or socio-economic point of view. This study discovers and interprets the objective and tangible annual changes and patterns of sociocultural collective intelligence in Korea over the last five years by applying text mining as a big data analysis approach. By crawling and Googling, this study collected a significant amount of time series web meta-data on the theme of the sharing economy on the world wide web from 2010 to 2014. The large volume of raw data concerning the sharing economy was then processed into meaningful, value-added graphs and figures using the word cloud functions of R programming. Notwithstanding the lack of accumulated data or collective intelligence about the sharing economy to date, it is worth noting that this study carried out preliminary research on conducting a time-series big data analysis from the perspective of knowledge management and processing. Thus, the results of this study can be utilized as fundamental data to help understand the academic and industrial aspects of future sharing economy-related markets or consumer behavior.

A Comparison of Starbucks between South Korea and U.S.A. through Big Data Analysis (빅데이터 분석을 통한 한국과 미국의 스타벅스 비교 분석)

  • Jo, Ara;Kim, Hak-Seon
    • Culinary science and hospitality research / v.23 no.8 / pp.195-205 / 2017
  • The purpose of this study was to compare Starbucks in South Korea with Starbucks in the U.S.A. through semantic network analysis of big data. Online data were collected with the SCTM (Smart Crawling & Text Mining) program, a data collecting and processing program developed by the big data research institute at Kyungsung University. The data collection period was from January 1st 2014 to December 7th 2017, and Netdraw, packaged with UCINET 6.0, was utilized for data analysis and visualization. After performing CONCOR (convergence of iterated correlation) analysis and centrality analysis, this study illustrated the current characteristics of Starbucks in Korea and the U.S.A. as reflected in the social network, and the differences between the two countries. Since Starbucks has grown greatly, especially in Korea, this study also aimed to provide significant, social-network-oriented suggestions for Starbucks USA, Starbucks Korea, and the coffee industry as a whole. This study also revealed that big data analytics can generate new insights into variables that have been extensively studied in the existing hospitality literature. In addition, implications for theory and practice as well as directions for future research are discussed.

An Automatic Urban Function District Division Method Based on Big Data Analysis of POI

  • Guo, Hao;Liu, Haiqing;Wang, Shengli;Zhang, Yu
    • Journal of Information Processing Systems / v.17 no.3 / pp.645-657 / 2021
  • Along with the rapid development of the economy, the urban scale has extended rapidly, leading to the formation of different types of urban function districts (UFDs), such as central business, residential and industrial districts. Recognizing the spatial distributions of these districts is of great significance to manage the evolving role of urban planning and further help in developing reliable urban planning programs. In this paper, we propose an automatic UFD division method based on big data analysis of point of interest (POI) data. Considering that the distribution of POI data is unbalanced in a geographic space, a dichotomy-based data retrieval method was used to improve the efficiency of the data crawling process. Further, a POI spatial feature analysis method based on the mean shift algorithm is proposed, where data points with similar attributive characteristics are clustered to form the function districts. The proposed method was thoroughly tested in an actual urban case scenario and the results show its superior performance. Further, the suitability of fit to practical situations reaches 88.4%, demonstrating a reasonable UFD division result.
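
The abstract does not spell out the dichotomy-based retrieval step, so the following is only a sketch of one plausible reading: when POI density is uneven, a query region is halved repeatedly until each cell holds no more than a fixed number of records. The per-query limit `cap`, the functions `count_in` and `bisect_region`, and the synthetic points are all assumptions made for illustration and are not taken from the paper. The mean shift clustering stage is not shown.

```python
import random


def count_in(points, box):
    """Count points inside box = (xmin, ymin, xmax, ymax).

    Stands in for a POI query against a map service (hypothetical).
    Intervals are half-open so that adjacent cells never share a point.
    """
    xmin, ymin, xmax, ymax = box
    return sum(1 for x, y in points if xmin <= x < xmax and ymin <= y < ymax)


def bisect_region(points, box, cap, min_size=1e-6):
    """Recursively halve box along its longer side until each cell holds <= cap points."""
    xmin, ymin, xmax, ymax = box
    width, height = xmax - xmin, ymax - ymin
    if count_in(points, box) <= cap or max(width, height) <= min_size:
        return [box]
    if width >= height:
        mid = (xmin + xmax) / 2
        halves = [(xmin, ymin, mid, ymax), (mid, ymin, xmax, ymax)]
    else:
        mid = (ymin + ymax) / 2
        halves = [(xmin, ymin, xmax, mid), (xmin, mid, xmax, ymax)]
    cells = []
    for half in halves:
        cells.extend(bisect_region(points, half, cap, min_size))
    return cells


random.seed(0)
# Synthetic, unbalanced data: a dense cluster plus sparse background points.
dense = [(random.uniform(0.4, 0.5), random.uniform(0.4, 0.5)) for _ in range(300)]
sparse = [(random.uniform(0, 1), random.uniform(0, 1)) for _ in range(50)]
points = dense + sparse

cells = bisect_region(points, (0.0, 0.0, 1.0, 1.0), cap=40)
print(len(cells), "cells; max per cell:", max(count_in(points, c) for c in cells))
```

Under this reading, sparse areas are covered by a few large cells while dense areas are split finely, so fewer queries are spent where there is little data.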

Water leakage accident analysis of water supply networks using big data analysis technique (R기반 빅데이터 분석기법을 활용한 상수도시스템 누수사고 분석)

  • Hong, Sung-Jin;Yoo, Do-Guen
    • Journal of Korea Water Resources Association / v.55 no.spc1 / pp.1261-1270 / 2022
  • The purpose of this study is to collect, analyze, and utilize information on water leaks, which is otherwise hard to access, by using news search results that people can easily access. We applied a web crawling technique to extract big data news on water leakage accidents in water supply systems and presented a step-by-step algorithm for obtaining accurate leak accident news. In addition, a data analysis technique suitable for water leakage accident information was developed so that additional information, such as the date and time of occurrence, cause, location, damaged facilities, and damage effect, can be obtained. The primary goal of the big data-based leak analysis proposed in this study is to extract meaningful value through comparison with existing waterworks statistics. The proposed method can also be used to respond effectively to consumers or to determine the service level of water supply networks. In other words, the analysis results suggest the need to inform the public more about accidents and similar events, and they can be used to prepare a dissemination and response system that can react quickly in case of an accident.

Proposal of Brand Evaluation Map through Big Data : Focus on The Hyundai Motor's Product Evaluation (빅데이터를 통한 브랜드 평가 맵 제안 : 현대자동차 제품 평가 중심으로)

  • Youn, Dae Myung;Lee, Yong Hyuck;Lee, Bong Gyou
    • Journal of Information Technology Services / v.19 no.4 / pp.1-11 / 2020
  • Through text mining, sentiment analysis, and semiotic analysis, this study aims to reinterpret the meaning of users' emotional words and related words in order to derive strategic elements of brand and design. After selecting a local car manufacturer whose brand is a clear topic of user opinion, we web-crawl the comments about the manufacturer's cars that users have written online. We then analyze the extracted morphemes and their associated words and convert them to fit the marketing mix theory. Through this process, we propose a methodology for supplementing and improving brand elements that evoke negative sentiment among consumers, carrying forward elements that evoke positive sentiment, and managing brands reasonably. In particular, the Map presented in this study is considered fully usable as information for overall brand management.