• Title/Summary/Keyword: Text data


A Study of Perception of Golfwear Using Big Data Analysis (빅데이터를 활용한 골프웨어에 관한 인식 연구)

  • Lee, Areum;Lee, Jin Hwa
    • Fashion & Textile Research Journal
    • /
    • v.20 no.5
    • /
    • pp.533-547
    • /
    • 2018
  • This study examines perceptions of golfwear and related trends using big data, based on major keywords and their associated words. Data were collected from blogs, Jisikin and Tips, news articles, and web cafés on two of the most commonly used search engines (Naver and Daum) using the keywords 'Golfwear' and 'Golf clothes'. Frequency and matrix data for January 1, 2016 to December 31, 2017 were extracted with Textom. From the Textom matrix, degree centrality, closeness centrality, betweenness centrality, and eigenvector centrality were calculated and analyzed with Netminer 4.0. The keyword 'brand' ranked highest in web visibility, followed by 'woman', 'size', 'man', 'fashion', 'sports', 'price', 'store', 'discount', and 'equipment' in the top 10 by frequency. Only the top 30 keywords were included in the centrality calculations because the frequent co-occurrence of keywords made the network density extremely high. The top-ranked keywords by centrality were similar to those by raw frequency. When the frequency was adjusted by subtracting 100 and 500 words, the results differed: keywords that ranked low in the frequency analysis, such as J. Lindberg, ranked high, and the rankings changed for all centrality measures. These findings provide a basis for marketing strategies and for ways to increase the awareness and web visibility of golfwear brands.
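
The abstract above derives centrality scores from a keyword co-occurrence matrix. As a rough illustration of the simplest of the four measures, the following minimal sketch computes normalized degree centrality in plain Python. It does not reproduce Textom or Netminer; the function name `degree_centrality` and the toy counts are assumptions made for illustration and are not data from the study.

```python
from typing import Dict, List


def degree_centrality(keywords: List[str], matrix: List[List[int]]) -> Dict[str, float]:
    """Normalized degree centrality from a symmetric co-occurrence matrix."""
    n = len(keywords)
    if n < 2:
        return {word: 0.0 for word in keywords}
    scores = {}
    for i, word in enumerate(keywords):
        # A keyword is linked to every other keyword it co-occurs with at least once.
        links = sum(1 for j in range(n) if j != i and matrix[i][j] > 0)
        scores[word] = links / (n - 1)
    return scores


# Hypothetical toy counts (assumption): matrix[i][j] is how often
# keywords[i] and keywords[j] appear in the same document.
keywords = ["brand", "woman", "size", "price"]
matrix = [
    [0, 5, 3, 2],
    [5, 0, 4, 0],
    [3, 4, 0, 0],
    [2, 0, 0, 0],
]

scores = degree_centrality(keywords, matrix)
for word, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{word}: {score:.2f}")
```

In a network where almost every keyword pair co-occurs, this binary measure approaches 1 for every keyword, which may be related to why the study restricts the calculation to a limited set of keywords.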

Movie Box-office Analysis using Social Big Data (소셜 빅데이터를 이용한 영화 흥행 요인 분석)

  • Lee, O-Joun;Park, Seung-Bo;Chung, Daul;You, Eun-Soon
    • The Journal of the Korea Contents Association
    • /
    • v.14 no.10
    • /
    • pp.527-538
    • /
    • 2014
  • Demand prediction is a critical issue for the film industry. As social media such as Twitter and Facebook have gained momentum, considerable effort is being devoted to predicting and analyzing hit movies from unstructured text data. To predict trends in commercially successful films, the correlation between data volume and box-office success can be analyzed by estimating how the data vary by period, and opinion mining can be used to assign sentiment polarity scores to the data. However, such a quantitative approach cannot explain why the audience chooses a certain movie or which attributes of a movie are preferred, which has limited efforts to identify the factors driving a movie's commercial success. This study therefore investigates the movie attributes that reflect the interests of the audience. It extracts topic keywords that represent the content of tweets through frequency measurement on collected Twitter data, and analyzes the responses of the audience. The objective is to propose factors that drive a movie's commercial success.
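
The abstract above describes extracting topic keywords from tweets by frequency measurement. A minimal sketch of that idea, using only the Python standard library, might look as follows. The stopword list, the tokenizer, the function name `top_keywords`, and the sample tweets are all assumptions for illustration; the study's actual preprocessing is not described in the abstract.

```python
import re
from collections import Counter
from typing import Iterable, List, Tuple

# Tiny illustrative stopword list (assumption); a real analysis would use a fuller one.
STOPWORDS = {"the", "a", "an", "is", "was", "and", "of", "to", "in", "it", "this", "so"}


def top_keywords(texts: Iterable[str], k: int = 3) -> List[Tuple[str, int]]:
    """Return the k most frequent non-stopword tokens across all texts."""
    counts: Counter = Counter()
    for text in texts:
        tokens = re.findall(r"[a-z']+", text.lower())
        counts.update(t for t in tokens if t not in STOPWORDS)
    return counts.most_common(k)


# Hypothetical tweets (assumption), not data from the study.
tweets = [
    "The soundtrack of this movie is amazing",
    "Amazing cast and a great soundtrack",
    "The cast was great, the plot was thin",
]

top = top_keywords(tweets, k=4)
print(top)
```

Here the most frequent tokens name attributes of the film (soundtrack, cast) rather than only its polarity, which is the kind of signal the abstract contrasts with purely quantitative approaches.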

Similarity Analysis of Hospitalization using Crowding Distance

  • Jung, Yong Gyu;Choi, Young Jin;Cha, Byeong Heon
    • International journal of advanced smart convergence
    • /
    • v.5 no.2
    • /
    • pp.53-58
    • /
    • 2016
  • With the growing use of big data and data mining, it is useful to understand how such techniques can reveal relationships in the healthcare field. This study uses hierarchical methods of data analysis to explore similarities in hospitalization across several New York state counties. It measures the crowding distance of data on age-specific hospitalization periods. Crowding distance is defined as the longest distance, or least similarity, between cities. Clinton is expected to have the greatest distance, while Albany and the other cities are closer because they are connected by the shortest distance at each step. Similarities were stronger across hospital stays categorized by age. Hierarchical clustering, combined with the measurement of crowding distance, can be applied to predict the similarity of hospitalization data across the 10 cities. To improve the performance of hierarchical clustering, text is first converted to an attribute vector and crowding distance is applied, after which congestion distances can be compared. The measured similarity between two objects depends on the measurement method used in clustering, but it is distinct from distance: the smaller the distance, the more similar the two objects are. With this technique, the crowding distance decreased consistently as the similarity between the data increased, which improved the performance of the experiments. Furthermore, the similarity of hospitalization periods by city is expected to be useful when hospital wards are built: by referring to the experimental results, the required size of hospital facilities can be predicted from the expected hospital stay, so that patients in similar areas can be managed efficiently.

Implementation of PLC-Based Multi-modem for Process Automation of Non-destructive Inspection (비파괴검사 공정자동화를 위한 전력선통신 기반 복합통신장치의 구현)

  • Jung, Jun Hwan;Jun, Ho Ik;Kim, Hyun-Sik;Kang, Seog Geun
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.17 no.4
    • /
    • pp.822-828
    • /
    • 2013
  • In this paper, a multi-modem for process automation of non-destructive inspection (NDI), which can generate various kinds of data, is implemented and verified. A variety of data, such as control signals, text data, and image data generated by inspection devices, sensors, and computers, are transmitted to the multi-modem via serial, Ethernet, and coaxial cable. We exploit a communication network in which powerlines are used as the backbone transmission medium. The implemented multi-modem therefore has various ports and corresponding interfaces for data transmission. In practical experiments, the multi-modem maintains an almost constant data rate with little waveform distortion. The experiments also confirm that the modem operates normally under extreme variation of temperature. The multi-modem is therefore expected to contribute significantly to implementing powerline communication (PLC)-based process automation for NDI, in which various kinds of data are generated in practice.

An Analysis of the Network of Interactions among Medicinal Herbs and Their Uses (본초 상호작용 관계망 분석 및 활용 방향)

  • Lee, Jeong-Hyeon;Kwon, Oh-Min
    • Journal of Society of Preventive Korean Medicine
    • /
    • v.17 no.1
    • /
    • pp.1-11
    • /
    • 2013
  • Objectives : The aim of this research is to compile the data on interactions between medicinal herbs that lie scattered across oriental medical books, and to give people easy access to that information by visualizing it. Methods : A part of Bonchogangmok(本草綱目) was selected and its text extracted, and the patterns of interaction were organized into several kinds to establish the base data. To visualize the data, the study converted them into a 'net' file and visualized the interactions between medicinal herbs in Pajek. Three patterns were visualized: 1 medicinal herb, 2 medicinal herbs, and 1 prescription. Using the data on 'Chinese Lacquer(乾漆)' for 1 medicinal herb, the data on 'Licorice(甘草)' and 'Chinese Lacquer(乾漆)' for 2 medicinal herbs, and the data on 'Iijin-tang(二陳湯)' for the prescription, the research analyzed the network with the Kamada-Kawai algorithm in Pajek. Results : The analysis made the meanings visible at a glance, because the scattered and fragmentary meanings were integrated around the medicinal herbs. However, as the number of analyzed medicinal herbs increased, their relationships became more and more complicated, requiring additional work such as filtering. Conclusions : These results can be applied in an online database. If further research expands its scope to include a systematic classification of medicinal herbs or to cover medical books other than Bonchogangmok, it is expected to produce more objective and abundant information.
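
The abstract above mentions converting herb interaction data into a 'net' file for Pajek. As a hedged sketch of what that conversion step could look like, the code below writes an undirected edge list in the basic Pajek layout (a `*Vertices` section with 1-based ids and quoted labels, followed by an `*Edges` section). The function name `to_pajek_net` and the sample pairs are assumptions; in particular, "Herb C" is a placeholder, and the pairs do not represent interactions recorded in Bonchogangmok.

```python
from typing import Dict, Iterable, List, Tuple


def to_pajek_net(edges: Iterable[Tuple[str, str]]) -> str:
    """Serialize undirected (herb, herb) pairs as Pajek .net text."""
    edge_list: List[Tuple[str, str]] = list(edges)
    index: Dict[str, int] = {}
    for a, b in edge_list:
        for name in (a, b):
            if name not in index:
                index[name] = len(index) + 1  # Pajek vertex ids start at 1
    lines = [f"*Vertices {len(index)}"]
    lines += [f'{vertex_id} "{name}"' for name, vertex_id in index.items()]
    lines.append("*Edges")
    lines += [f"{index[a]} {index[b]}" for a, b in edge_list]
    return "\n".join(lines) + "\n"


# Placeholder pairs (assumption), used only to show the file layout.
pairs = [
    ("Licorice", "Chinese Lacquer"),
    ("Licorice", "Herb C"),
]

net_text = to_pajek_net(pairs)
print(net_text)
```

The resulting text can be saved with a `.net` extension and opened in Pajek, where a layout such as Kamada-Kawai can then be applied.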

Study on the Environment Information Providing Method based on Spatial Information Document

  • Choi, Byoung Gil;Na, Young Woo;Kim, Sung Pyo
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography
    • /
    • v.34 no.2
    • /
    • pp.185-194
    • /
    • 2016
  • The purpose of this study is to present a method for providing environment information based on a spatial information document. At present, a large amount of spatial information, including environment information, is being produced, but users need separate software or a separate system to obtain it. Environment information in particular comes in various types, such as ecology, vegetation, and measurement network data. It is therefore necessary to define the form of a spatial information document, and the method for making it, that lets users treat environment information as spatial information without separate software or a separate system. To this end, the types and forms of environment information, the data formats, and the delivery methods used by the government, in particular the Ministry of Environment and local governments, are analyzed. The information is classified into 12 fields, and the data are produced as GIS DBs, measurement network data, text data, and so on. As paper maps decline, a spatial information document that offers display by layer, coordinate data, attribute data, distance and area measurement, location search by coordinates, GPS location linkage, and location display on the map is presented to increase the use of geo-environment information maps. Finally, a standard document specification based on the spatial information document is presented, with usability and readability in mind, in order to provide a variety of environment information without separate software or a separate system.

Implementation of a Display and Analysis Program to improve the Utilization of Radar Rainfall (레이더강우 자료 활용 증진을 위한 표출 및 분석 프로그램 구현)

  • Noh, Hui-Seong
    • Journal of Digital Contents Society
    • /
    • v.19 no.7
    • /
    • pp.1333-1339
    • /
    • 2018
  • As weather-related disasters such as heavy rains have increased, interest in forecasting weather and disasters with radar has grown, and related studies have been actively carried out. Since the Ministry of Environment(ME) established and began operating a national radar network, the use of radar has been emphasized. However, practitioners and researchers who want to use radar data need to understand the characteristics of the data, and they go through much trial and error when converting and calibrating radar data from Universal Format(UF) files. This study therefore developed a Radar Display and Analysis Program(RaDAP) with a Graphical User Interface(GUI), written in the Java programming language, that generates an image file and an ASCII-formatted text file from UF-type radar data. The program can derive the desired radar rainfall data and minimize the time required for analysis. It is therefore expected to help increase the use of radar data in various fields.

The Design and Implementation of a Chatting System Sharing Paths (경로 공유 채팅 시스템의 설계 및 구현)

  • Kim, Dong-Hyun;Lee, Han-Bin;Ban, Chae-Hoon
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.12 no.2
    • /
    • pp.281-286
    • /
    • 2017
  • An SNS is a platform where users build social relationships and share opinions and information. To do so, it supports text, image, and video data. Because the location data of a smart device can now be exploited, SNSs try to use location data. However, since an SNS does not support coordinate data, it provides only the restricted function of sharing an image map instead of a vector map. In this paper, we propose a chatting system that shares paths in order to support coordinate data on a classical SNS. In the proposed system, users who have joined a room can view a vector map of the same area and exchange texts. If a user builds a path on the map, the system propagates the coordinate data of the generated path, and the other users in the room see the path immediately. The implemented chatting system has the benefit of letting users share map-related information through coordinate data.

Method for 3D Visualization of Sound Data (사운드 데이터의 3D 시각화 방법)

  • Ko, Jae-Hyuk
    • Journal of Digital Convergence
    • /
    • v.14 no.7
    • /
    • pp.331-337
    • /
    • 2016
  • The purpose of this study is to provide a method for visualizing sound data as a three-dimensional image. The sound data are visualized according to a set algorithm, after a text-based script that defines the channel range of the sound data has been produced. The algorithm consists of five stages: setting the sound channel range, setting the picture frame for sound visualization, setting the properties of the 3D image unit, extracting the channel range of the sound data, and sound visualization. The 3D visualization is performed in response to at least one operation signal from an input device such as a mouse. For sound files in quantities that an animator cannot finish by conventional means, a comparison of the animator's working time with that of the proposed method showed the proposed 3D visualization method to be a low-cost, highly efficient way to produce creative artistic images. Future research will address real-time visualization of sound data through a rendering process in a game engine.

Technology Clustering Using Textual Information of Reference Titles in Scientific Paper (과학기술 논문의 참고문헌 텍스트 정보를 활용한 기술의 군집화)

  • Park, Inchae;Kim, Songhee;Yoon, Byungun
    • Journal of Korean Society of Industrial and Systems Engineering
    • /
    • v.43 no.2
    • /
    • pp.25-32
    • /
    • 2020
  • Data on patents and scientific papers are considered a useful source for analyzing technological information and have been widely used. Technology big data are analyzed in various ways to identify the latest technological trends and to predict promising future technologies. Clustering is one way to discover new features by creating groups from technology big data. Patents include refined bibliographic information such as patent classification codes, whereas scientific papers do not have bibliographic information suitable for clustering. This research proposes a new approach for clustering scientific paper data that uses the reference titles in each paper. In this approach, the reference titles are treated as textual information, because each reference includes the title of the cited paper, which represents that paper's core content. We collected scientific paper data, extracted the titles of the references, and clustered the papers by measuring text-based similarity. The results of the proposed approach are compared with those of two existing methodologies: one that uses textual information from titles and abstracts, and one that is citation-based. The proposed approach shows a statistically significant difference from the existing approaches and better clustering performance. It is expected to be a useful method for clustering scientific papers.
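
The abstract above clusters papers by the text-based similarity of their reference titles, without stating which similarity measure is used. As one plausible illustration, the sketch below builds a bag-of-words vector from each paper's reference titles and compares papers by cosine similarity. The choice of cosine similarity, the function names `title_vector` and `cosine_similarity`, and the sample titles are all assumptions, not the authors' method or data.

```python
import math
import re
from collections import Counter
from typing import List


def title_vector(reference_titles: List[str]) -> Counter:
    """Bag-of-words counts over all reference titles of one paper."""
    words: Counter = Counter()
    for title in reference_titles:
        words.update(re.findall(r"[a-z]+", title.lower()))
    return words


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors (0.0 if either is empty)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical reference titles for three papers (assumption).
paper_a = title_vector(["Patent clustering with text mining", "Text mining for technology trends"])
paper_b = title_vector(["Text mining of patent claims"])
paper_c = title_vector(["Radar rainfall estimation"])

sim_ab = cosine_similarity(paper_a, paper_b)
sim_ac = cosine_similarity(paper_a, paper_c)
print(f"A vs B: {sim_ab:.3f}, A vs C: {sim_ac:.3f}")
```

A pairwise similarity matrix built this way could then be passed to any clustering procedure; papers A and B share vocabulary and score above zero, while A and C share none.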