• Title/Summary/Keyword: Text data

Randomized Block Size (RBS) Model for Secure Data Storage in Distributed Server

  • Sinha, Keshav;Paul, Partha;Amritanjali, Amritanjali
    • KSII Transactions on Internet and Information Systems (TIIS) / v.15 no.12 / pp.4508-4530 / 2021
  • Distributed data storage services are now widely used. However, the lack of proper security mechanisms leaves user data vulnerable. In this work, we propose a Randomized Block Size (RBS) model for secure data storage in distributed environments. The model works with multifold block sizes encrypted with the Chinese Remainder Theorem-based RSA (C-RSA) technique for end-to-end security of multimedia data. The proposed RBS model has a key generation phase (KGP) for constructing asymmetric keys and a rand generation phase (RGP) for applying optimal asymmetric encryption padding (OAEP) to the original message. Experimental results obtained with text and image files show that the post-encryption file size is not much affected and that data is efficiently encrypted while being stored on the distributed storage server (DSS). Ciphertext size, encryption time, and throughput were considered for performance evaluation, whereas statistical analyses such as similarity measurement, correlation coefficient, histogram, and entropy analysis were used to check image pixel deviation. The number of pixels change rate (NPCR) and unified averaged changed intensity (UACI) were used to check the strength of the proposed encryption technique. The proposed model is robust, with high resilience against eavesdropping, insider attacks, and chosen-plaintext attacks.
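
    The C-RSA step named in the abstract above can be illustrated with a toy sketch of RSA decryption accelerated by the Chinese Remainder Theorem. This is not the authors' implementation: the primes, the exponent, and the function names `make_keys`, `encrypt`, and `decrypt_crt` are assumptions chosen for illustration, and the OAEP padding and randomized block sizes of the RBS model are left out.

```python
# Toy CRT-based RSA. Tiny primes are for illustration only and are NOT secure.
# OAEP padding and the RBS block splitting from the abstract are omitted.

def make_keys(p, q, e):
    """Build a key dictionary containing the CRT parameters (hypothetical layout)."""
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)  # modular inverse of e, needs Python 3.8+
    return {
        "n": n, "e": e, "p": p, "q": q,
        "dp": d % (p - 1),        # private exponent reduced mod p-1
        "dq": d % (q - 1),        # private exponent reduced mod q-1
        "q_inv": pow(q, -1, p),   # q^-1 mod p, used to recombine the halves
    }

def encrypt(m, key):
    """Textbook RSA encryption: c = m^e mod n."""
    return pow(m, key["e"], key["n"])

def decrypt_crt(c, key):
    """Decrypt with two small exponentiations and recombine (Garner's formula)."""
    m1 = pow(c, key["dp"], key["p"])
    m2 = pow(c, key["dq"], key["q"])
    h = (key["q_inv"] * (m1 - m2)) % key["p"]
    return m2 + h * key["q"]

key = make_keys(61, 53, 17)        # assumed toy parameters
cipher = encrypt(65, key)
plain = decrypt_crt(cipher, key)   # recovers 65
```

    The two exponentiations modulo p and q operate on numbers about half the size of n, which is the usual reason for preferring the CRT form on the decryption side.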

A Study on User Perception of Tourism Platform Using Big Data

  • Se-won Jeon;Sung-Woo Park;Youn Ju Ahn;Gi-Hwan Ryu
    • International journal of advanced smart convergence / v.13 no.1 / pp.108-113 / 2024
  • The purpose of this study is to analyze user perceptions of tourism platforms through big data. Data were collected from Naver, Daum, and Google as big data analysis channels. With 'tourism platform' as the keyword for semantic network analysis, a total of 29,265 words were collected. The collection period was two years, from August 31, 2021, to August 31, 2023. Networks of connected keywords were analyzed using the TexTom and Ucinet social network analysis programs. Keywords perceived by tourism platform users include 'travel,' 'diverse,' 'online,' 'service,' 'tourists,' 'reservation,' 'provision,' and 'region.' CONCOR analysis revealed four groups: 'platform information,' 'tourism information and products,' 'activation strategies for tourism platforms,' and 'tourism destination market.' Based on user perception, current status, and trend data on tourism platforms, this study aims to help expand and activate services that meet the needs and preferences of users in the tourism field, as well as platforms tailored to the changing market.

A Study on the Development Direction of Medical Tourism and Wellness Tourism Using Big Data

  • JINHO LEE;Gi-Hwan Ryu
    • International journal of advanced smart convergence / v.13 no.1 / pp.180-184 / 2024
  • Since COVID-19, many foreign tourists have visited Korea for medical tourism. According to statistics from 2022, after COVID-19, the number of foreign patients visiting Korea over two years was 24.8 million, an increase of 70.1% from 2020, which corresponds to about 50% of the 2019 level (Statistics Office, 2023). The purpose of this study is therefore to find the link between medical tourism and wellness tourism using big data and to present a development plan that connects the two. As the research method, 'medical tourism' and 'wellness tourism' were set as central keywords, and text data from the two post-COVID years 2022 to 2023 were compared to find common points. The analysis confirmed that wellness tourism mostly appeared with greater frequency than medical tourism, indicating that wellness tourism occupies a larger share than medical tourism. The word frequency results confirmed that wellness tourism and medical tourism have much in common as complex tourism products, and the 2-gram results suggested that, to attract many medical tourists, local governments need to combine medical tourism clusters and wellness tourism according to each other's characteristics.
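
    The 2-gram check mentioned in the abstract above amounts to counting adjacent word pairs. A minimal sketch follows, assuming the text has already been tokenized; the function name `bigram_counts` and the sample tokens are illustrative and are not data from the study.

```python
from collections import Counter

def bigram_counts(tokens):
    """Count adjacent word pairs (2-grams) in a token list."""
    return Counter(zip(tokens, tokens[1:]))

# Hypothetical tokens, not taken from the paper's corpus.
tokens = "medical tourism cluster wellness tourism cluster medical tourism".split()
counts = bigram_counts(tokens)
top_pairs = counts.most_common(2)  # the two most frequent pairs
```

    Comparing the most frequent pairs for each central keyword is one simple way to look for shared vocabulary between two corpora.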

Clustering Analysis of Films on Box Office Performance : Based on Web Crawling (영화 흥행과 관련된 영화별 특성에 대한 군집분석 : 웹 크롤링 활용)

  • Lee, Jai-Ill;Chun, Young-Ho;Ha, Chunghun
    • Journal of Korean Society of Industrial and Systems Engineering / v.39 no.3 / pp.90-99 / 2016
  • Forecasting box office performance after a film's release is very important for increasing profitability by reducing production and marketing costs. Analysis of psychological factors such as word-of-mouth and expert assessment is essential but hard to perform because of the difficulty of data collection. Information technology such as web crawling and text mining can help overcome this problem. Effective text mining requires categorization of objects. From this perspective, the objective of this study is to provide a framework for classifying films according to their characteristics. Data including psychological factors are collected from Web sites using web crawling. A clustering analysis is conducted to classify films, and a series of one-way ANOVA analyses is conducted to statistically verify the differences in characteristics among groups. The result of the cluster analysis based on reviews and revenues shows that the films can be categorized into four distinct groups and that the differences in characteristics are statistically significant. The first group has high box office sales, and its number of clicks on reviews is higher than that of the other groups. The second group is similar to the first, but its reviews are longer and its box office sales are poor. The third group's audiences prefer documentaries and animations, and its numbers of comments and interests are significantly lower than those of the other groups. The last group's audiences prefer the crime, thriller, and suspense genres. Correspondence analysis is also conducted to match the groups with intrinsic characteristics of films such as genre, movie rating, and nation.
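
    The one-way ANOVA used above to verify group differences compares between-group and within-group variance through an F statistic. The sketch below computes that statistic in plain Python; the function name `one_way_anova_f` and the sample groups are assumptions for illustration, and the numbers are not from the study.

```python
def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of numeric groups."""
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(g) / len(g) for g in groups]

    # Between-group sum of squares: group sizes times squared mean offsets.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-group sum of squares: spread of each value around its group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

# Hypothetical review counts for three clusters of films.
f_stat = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
```

    A large F value relative to the F distribution with the same degrees of freedom indicates that at least one group mean differs; the p-value lookup is left to a statistics library.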

Analysis of Information Education Related Theses Using R Program (R을 활용한 정보교육관련 논문 분석)

  • Park, SunJu
    • Journal of The Korean Association of Information Education / v.21 no.1 / pp.57-66 / 2017
  • Lately, academic interest in big data analysis and social networks has risen prominently. Various academic fields are involved in this social network-based research trend; that is, social networks have been actively used as a research topic in the social sciences as well as in the natural sciences. Accordingly, this paper focuses on text analysis and subsequent social network analysis of Master's and Doctoral dissertations. The results indicate that certain words had a high frequency throughout the entire period, while some words had frequencies that fluctuated between periods. In detail, the words with a high frequency had a higher betweenness centrality, and each period seems to have a distinctive research flow. It was thus found that the subjects of Master's and Doctoral dissertations changed sensitively in response to the development of IT and to changes in the information curriculum of elementary, middle, and high schools. Research related to smart, mobile, smartphone, SNS, application, storytelling, multicultural, and STEAM, which had an increased frequency in period 4, is predicted to continue. Moreover, the topics of robots, programming, coding, algorithms, creativity, interaction, and privacy will also be studied steadily.
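
    Betweenness centrality, the network measure cited above, counts how often a node lies on shortest paths between other nodes. The paper's title indicates the analysis was done in R; the sketch below is a separate Python illustration of Brandes' algorithm for unweighted, undirected graphs, and the function name `betweenness` and the keyword graph are assumptions.

```python
from collections import deque

def betweenness(adj):
    """Unnormalized betweenness centrality for an undirected, unweighted graph."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        order = []                       # nodes in order of non-decreasing distance
        preds = {v: [] for v in adj}     # predecessors on shortest paths from s
        sigma = {v: 0 for v in adj}      # number of shortest paths from s
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:                     # breadth-first search from s
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        while order:                     # accumulate dependencies back toward s
            w = order.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair was counted from both endpoints, so halve the totals.
    return {v: b / 2 for v, b in bc.items()}

# Hypothetical keyword co-occurrence graph shaped like a star.
graph = {
    "education": ["robot", "coding", "steam"],
    "robot": ["education"],
    "coding": ["education"],
    "steam": ["education"],
}
scores = betweenness(graph)
```

    In the star-shaped example every shortest path between two outer keywords passes through the hub, so the hub receives all of the betweenness.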

A Method for Extracting Relationships Between Terms Using Pattern-Based Technique (패턴 기반 기법을 사용한 용어 간 관계 추출 방법)

  • Kim, Young Tae;Kim, Chi Su
    • KIPS Transactions on Software and Data Engineering / v.7 no.8 / pp.281-286 / 2018
  • With the recent increase in the complexity, variety, and volume of available information, interest in and the need for ontology have been rising as a means of extracting meaningful search results from massive data. Although many methods have been proposed for extracting an ontology from natural language text, the extraction produced by most current methods is not consistent with the structure of the ontology. In this paper, we propose a method of automatically creating an ontology by identifying the terms needed to establish the ontology from text in a specific domain and extracting various relationships between the terms using a pattern-based method. To extract the relationships between terms, we propose a method that reduces the size of the search space by taking a matching set of patterns into account and connecting a join-set concept and a pattern array. As a result, this method reduces the size of the search space by 50-95% without removing any useful patterns from the search space.
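
    Pattern-based relation extraction, as named in the abstract above, generally matches lexical templates against text and reads the related terms out of the matched slots. The abstract does not list the paper's patterns, so the two templates below are assumed examples in the style of Hearst patterns, and the join-set and pattern array search-space reduction is not reproduced. The names `PATTERNS` and `extract_relations` are hypothetical.

```python
import re

# Each entry: compiled template, relation label, (subject group, object group).
# These templates are illustrative assumptions, not the paper's pattern set.
PATTERNS = [
    (re.compile(r"(\w+) such as (\w+)"), "is-a", (2, 1)),
    (re.compile(r"(\w+) is a part of (\w+)"), "part-of", (1, 2)),
]

def extract_relations(text):
    """Return (subject, relation, object) triples for every template match."""
    relations = []
    for regex, label, (subj, obj) in PATTERNS:
        for match in regex.finditer(text):
            relations.append((match.group(subj), label, match.group(obj)))
    return relations

triples = extract_relations("sensors such as thermometers")
```

    Trying every template against every sentence is the naive search that grows quickly with the number of patterns, which is the cost a search-space reduction aims to cut.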

CAD/CAM System for 5-Axis Machining of Marine Propeller (프로펠러 5축 가공을 위한 CAD/CAM 시스템)

  • Jae-Woong Youn;Jong-Hwan Park
    • Journal of the Society of Naval Architects of Korea / v.35 no.2 / pp.51-62 / 1998
  • In this paper, a CAD/CAM system for 5-axis machining of model propellers is introduced. The system was developed for personal computers running Windows NT. To enhance productivity, existing text-based design S/W was integrated into this graphic-based system. The Non-Uniform Rational B-Spline (NURBS) method is used to represent the sculptured surfaces of propeller blades and the hub from point data, and surface blending between blade and hub is realized in this system. For 5-axis machining of sculptured surfaces, tool/work collision and interference are checked, and inverse kinematic analysis is performed to generate NC data. In addition, the tool and workpiece are animated on the PC monitor by an NC verification module. Finally, optimal cutting conditions are determined empirically and integrated into this S/W so that the whole process from design to machining can be done automatically.

Crisis Prediction of Regional Industry Ecosystem based on Text Sentiment Analysis Using News Data - Focused on the Automobile Industry in Gwangju - (뉴스 데이터를 활용한 텍스트 감성분석에 따른 지역 산업생태계 위기 예측 - 광주 지역 자동차 산업을 중심으로 -)

  • Kim, Hyun-Ji;Kim, Sung-Jin;Kim, Han-Gook
    • The Journal of the Korea Contents Association / v.20 no.8 / pp.1-9 / 2020
  • As the aging of regional industry ecosystems has gradually become serious, research on measuring and regenerating declining regional industry ecosystems has been actively conducted. However, little research has been done on regional industry ecosystem crises. A crisis emerges abruptly over a short period of time, and it is often impossible to respond after the fact, so a response is needed before the crisis occurs. In other words, it is necessary to detect the crisis early and respond proactively from a long-term perspective. It is therefore necessary to develop a predictive model that can proactively recognize and respond to crises in the regional industry ecosystem. This study examined the possibility of predicting the risk to regional industry and markets from the sentiment scores of news articles using large-scale news data. News sentiment analysis was performed using the Google sentiment analysis API, and the results were organized by month to check their correlation with actual events.
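
    The monthly organization and correlation check described above can be sketched as follows. The sketch assumes each article already carries a numeric sentiment score (no sentiment API is called), and the function names `monthly_mean` and `pearson`, the dates, the scores, and the indicator series are all hypothetical.

```python
from collections import defaultdict
from math import sqrt

def monthly_mean(scored_articles):
    """Average sentiment per month from (ISO date string, score) pairs."""
    buckets = defaultdict(list)
    for date, score in scored_articles:
        buckets[date[:7]].append(score)   # "YYYY-MM" prefix identifies the month
    return {month: sum(v) / len(v) for month, v in sorted(buckets.items())}

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical scored articles and a hypothetical monthly industry indicator.
articles = [("2020-01-03", 0.5), ("2020-01-20", -0.1),
            ("2020-02-11", -0.4), ("2020-03-05", -0.6)]
sentiment = monthly_mean(articles)
indicator = [100.0, 90.0, 80.0]
r = pearson(list(sentiment.values()), indicator)
```

    A coefficient near 1 or -1 on real data would suggest that monthly news sentiment moves with the indicator, although correlation alone does not establish predictive power.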

Affinity and Variety between Words in the Framework of Hypernetwork (하이퍼네트워크에서 본 단어간 긴밀성과 다양성)

  • Kim, Joon-Shik;Park, Chan-Hoon;Lee, Eun-Seok;Zhang, Byoung-Tak
    • Journal of KIISE:Computer Systems and Theory / v.35 no.4 / pp.166-171 / 2008
  • We studied the variety and affinity between successive words in a text document. A number of groups were defined by the frequency of a following word in the whole text (corpus). In previous studies, Zipf's power law was explained by the Chinese restaurant process, and hub nodes were searched for by examining the edge number profile in a scale-free network. We observed both a power law and a hub profile at the same time by studying the conditional frequency and degeneracy of a group. A symmetry between the affinity and the variety between words was found during the data analysis, and this phenomenon can be explained from the viewpoint of "exploitation and exploration." We also remark on a small symmetry-breaking phenomenon in TIPSTER data.
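
    Zipf's power law, referenced above, states that word frequency falls off roughly as a power of frequency rank. One common way to check it is to fit a straight line to log frequency against log rank. The sketch below does this with ordinary least squares; it is a generic illustration rather than the paper's conditional-frequency analysis, and the function name `zipf_exponent` and the synthetic frequencies are assumptions.

```python
from math import log

def zipf_exponent(frequencies):
    """Estimate s in frequency ~ rank^(-s) by least squares on a log-log scale."""
    ranked = sorted(frequencies, reverse=True)        # rank 1 = most frequent
    xs = [log(rank) for rank in range(1, len(ranked) + 1)]
    ys = [log(freq) for freq in ranked]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return -slope                                     # the fitted slope is negative

# Synthetic frequencies that follow an exact 1/rank law.
exponent = zipf_exponent([1000 / rank for rank in range(1, 51)])
```

    The input must contain at least two strictly positive frequencies; on real corpora the fit is usually restricted to a middle range of ranks because the head and tail deviate from a pure power law.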

Visualization of unstructured personal narratives of preterm birth using text network analysis (텍스트 네트워크 분석을 이용한 조산 경험 이야기의 시각화)

  • Kim, Jeung-Im
    • Women's Health Nursing / v.26 no.3 / pp.205-212 / 2020
  • Purpose: This study aimed to identify the components of preterm birth (PTB) through women's personal narratives and to visualize clinical symptom expressions (CSEs). Methods: The participants were 11 women who gave birth before 37 weeks of gestational age. Personal narratives were collected by interactive unstructured storytelling via individual interviews from August 8 to December 4, 2019, after approval by the Institutional Review Board. The textual data were converted to PDF and analyzed using the MAXQDA program (VERBI Software). Results: The participants' mean age was 34.6 (±2.98) years, and five participants had a spontaneous vaginal birth. The following nine components of PTB were identified: obstetric condition, emotional condition, physical condition, medical condition, hospital environment, life-related stress, pregnancy-related stress, spousal support, and informational support. The top three codes were preterm labor, personal characteristics, and premature rupture of membranes, and the codes found for more than half of the participants were short cervix, fear of PTB, concern about fetal well-being, sleep difficulty, insufficient spousal and informational support, and physical difficulties. The top six CSEs were stress, hydramnios, false labor, concern about fetal well-being, true labor pain, and uterine contraction. "Stress" was ranked first in terms of frequency, and "uterine contraction" had individual attributes. Conclusion: The text network analysis of narratives from women who gave birth preterm yielded nine PTB components and six CSEs. These nine components should be included when developing a reliable and valid scale for PTB risk and stress. The CSEs can be applied in assessing preterm labor and can also be considered as strategies for students in women's health nursing practicum.