• Title/Summary/Keyword: Text data

Research for the Element to Analyze the Performance of Modern-Web-Browser Based Applications (모던 웹 브라우저(Modern-Web-Browser) 기반 애플리케이션 성능분석을 위한 요소 연구)

  • Park, Jin-tae;Kim, Hyun-gook;Moon, Il-young
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference
    • /
    • 2018.10a
    • /
    • pp.278-281
    • /
    • 2018
  • Early web technology served to display text information through a browser. As web technology has advanced, browsers can now present large amounts of multimedia data. Web technologies are being applied in a variety of fields such as sensor networks, hardware control, and data collection and analysis for big data and AI services. As a result, standards have been prepared for the Internet of Things, in which a web browser installed on the device interface typically controls sensors via HTTP and provides information to users. In addition, the recent development of WebAssembly has made it possible to run 3D objects and virtual/augmented reality content, written in native C-family languages, that previously could not run in web browsers. Factors used to evaluate existing web applications include performance, network resources, and security. However, because web applications are now applied in so many areas, it is time to revisit these factors. In this paper, we analyze the factors used to assess the performance of a web application. By reviewing each factor, its main points, and where it needs to be supplemented, we intend to establish indicators to guide the development of web-based applications.

An Analysis System for Whole Genomic Sequence Using String B-Tree (스트링 B-트리를 이용한 게놈 서열 분석 시스템)

  • Choe, Jeong-Hyeon;Jo, Hwan-Gyu
    • The KIPS Transactions:PartA
    • /
    • v.8A no.4
    • /
    • pp.509-516
    • /
    • 2001
  • As a result of many genome projects, the genomic sequences of many organisms have been revealed. Various methods, such as global and local alignment, are used to analyze these sequences, and k-mer analysis is one such method. k-mer analysis explores the frequencies of all k-mers, or their symmetry, where a k-mer is a sequence of k consecutive bases. However, existing in-memory algorithms are not applicable to k-mer analysis because a whole genomic sequence is usually a very large text. Therefore, efficient data structures and algorithms are needed. The string B-tree is a data structure that supports external memory and is well suited to pattern matching. In this paper, we improve the string B-tree so that it can be applied efficiently to k-mer analysis, and we show the results of k-mer analysis for C. elegans and 30 other genomic sequences. We present a visualization system that lets users investigate the distribution and symmetry of the frequencies of all k-mers using CGR (Chaos Game Representation). We also describe a method for finding the signature, the part of a sequence that is similar to the whole genomic sequence.
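
As an illustration of the two ideas named in this abstract, the following minimal Python sketch counts overlapping k-mers and computes chaos game coordinates for a short sequence. It is an in-memory toy, not the string B-tree system the paper describes, so it would not scale to whole genomes. The function names and the corner assignment for A, C, G, T are assumptions made for this example.

```python
from collections import Counter

# Corner assignment for the unit square. This is one common convention,
# assumed here; the abstract does not state which layout the authors used.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def count_kmers(sequence: str, k: int) -> Counter:
    """Count every overlapping substring of length k in the sequence."""
    if k <= 0:
        raise ValueError("k must be a positive integer")
    seq = sequence.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def cgr_points(sequence: str) -> list[tuple[float, float]]:
    """Return one chaos game point per base.

    Starting from the centre of the unit square, each base moves the
    current point halfway toward that base's corner.
    """
    x, y = 0.5, 0.5
    points = []
    for base in sequence.upper():
        if base not in CORNERS:
            continue  # skip ambiguous symbols such as N
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2, (y + cy) / 2
        points.append((x, y))
    return points


if __name__ == "__main__":
    print(count_kmers("ACGTACGT", 2))
    print(cgr_points("AG"))
```

Because each point depends on all preceding bases, sequences that end in the same k bases land in the same sub-square of side 1/2^k, which is why a CGR plot can be read as a picture of k-mer frequencies.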

A Study on the Privacy Awareness through Bigdata Analysis (빅데이터 분석을 통한 프라이버시 인식에 관한 연구)

  • Lee, Song-Yi;Kim, Sung-Won;Lee, Hwan-Soo
    • Journal of Digital Convergence
    • /
    • v.17 no.10
    • /
    • pp.49-58
    • /
    • 2019
  • In the era of the 4th industrial revolution, the development of information technology has brought various benefits, but it has also increased social interest in privacy issues. As the possibility of personal privacy violations through big data increases, academic discussion of privacy management has become active. While privacy has traditionally been defined at various levels as a basic human right, most recent research is concerned only with information privacy, that is, online privacy protection. This limited discussion can distort both the theoretical concept and the actual perception of privacy, making academic and social consensus on the concept more difficult. In this study, we analyze the concept of privacy as it appears on the internet, based on 12,000 news articles from a portal site over the past year, and compare the theoretical concept with the socially accepted one. This empirical approach is expected to provide an understanding of the changing concept of privacy and a research direction for conceptualizing privacy in the current situation.

Classifying Sub-Categories of Apartment Defect Repair Tasks: A Machine Learning Approach (아파트 하자 보수 시설공사 세부공종 머신러닝 분류 시스템에 관한 연구)

  • Kim, Eunhye;Ji, HongGeun;Kim, Jina;Park, Eunil;Ohm, Jay Y.
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.10 no.9
    • /
    • pp.359-366
    • /
    • 2021
  • A number of construction companies in Korea invest considerable human and financial resources in building systems for managing apartment defect data and categorizing repair tasks. This study therefore proposes machine learning models that automatically classify defect complaint text data into one of the sub-categories of 'finishing work' (one of the defect repair tasks). The proposed models employ two word representation methods (Bag-of-Words and Term Frequency-Inverse Document Frequency (TF-IDF)) and two machine learning classifiers (Support Vector Machine and Random Forest). In particular, we conducted both binary and multi-class classification tasks over 9 sub-categories of finishing work: home appliance installation work, paperwork, painting work, plastering work, interior masonry work, plaster finishing work, indoor furniture installation work, kitchen facility installation work, and tiling work. The classifier combining the TF-IDF representation with Random Forest achieved more than 90% accuracy, precision, recall, and F1 score. These results suggest that automated defect classification systems can be built on the proposed machine learning models.
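
The two word representations named in this abstract can be sketched in a few lines of standard-library Python. This is a minimal illustration, assuming whitespace tokenization and the textbook weighting tf × log(N / df). The abstract does not state which tokenizer or TF-IDF variant the authors used, and the complaint texts below are invented for the example.

```python
import math
from collections import Counter


def bag_of_words(document: str) -> Counter:
    """Raw term counts for one document (whitespace tokenization assumed)."""
    return Counter(document.lower().split())


def tf_idf(documents: list[str]) -> list[dict[str, float]]:
    """TF-IDF weights per document, using tf * log(N / df)."""
    counts = [bag_of_words(doc) for doc in documents]
    n_docs = len(documents)
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter(term for c in counts for term in c)
    vectors = []
    for c in counts:
        total = sum(c.values())
        vectors.append({
            term: (freq / total) * math.log(n_docs / doc_freq[term])
            for term, freq in c.items()
        })
    return vectors


if __name__ == "__main__":
    # Hypothetical defect complaints, not data from the paper.
    complaints = ["tile cracked in bathroom", "paint peeling in bedroom"]
    for vector in tf_idf(complaints):
        print(vector)
```

A term that occurs in every document, such as "in" above, receives a weight of zero, so the vectors passed to a classifier emphasize the words that distinguish one complaint from another.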

A Study on the Comparison and Semantic Analysis between SNS Big Data, Search Portal Trends and Drug Case Statistics (SNS 빅데이터 및 검색포털 트렌드와 마약류 사건 통계간의 비교 및 의미분석 연구)

  • Choi, Eunjung;Lee, SuRyeon;Kwon, Hyemin;Kim, Myuhngjoo;Lee, Insoo;Lee, Seunghoon
    • Journal of Digital Convergence
    • /
    • v.19 no.2
    • /
    • pp.231-238
    • /
    • 2021
  • SNS data can capture users' thoughts and actions, and search portal trends are a representative service for observing users' interests and how they change. In this paper, we analyzed the relationship between statistics on narcotics incidents and the degree of exposure of narcotics-related words in SNS tweets and in search portal trends. It was confirmed that the SNS and search portal trends matched the prosecution office statistics with a certain time lag. In addition, cluster analysis was performed to understand the meaning of tweets in which narcotics-related words were mentioned. In the 50,000 tweets collected in January 2020, it was possible to find content related to actual drug sales. Therefore, SNS monitoring alone makes it possible to monitor narcotics-related incidents and to find specific sales- or purchase-related information, which can be used in the investigation process. In the future, crime monitoring and prediction systems may be proposed, as related crime analysis may become possible with images as well as text.
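
One simple way to check whether two series move together with a time difference is a lagged Pearson correlation. The abstract does not say how the authors measured the time difference, so the sketch below is an assumed approach rather than their method. The function names and the toy series are hypothetical.

```python
import math


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation of two equal-length, non-constant series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def best_lag(leading: list[float], lagging: list[float],
             max_lag: int) -> tuple[int, float]:
    """Return the shift (0..max_lag) at which `lagging` best follows `leading`."""
    best = (0, float("-inf"))
    for lag in range(max_lag + 1):
        # Pair leading[t] with lagging[t + lag].
        shifted = lagging[lag:]
        n = min(len(leading), len(shifted))
        if n < 3:
            break  # too few overlapping points to correlate
        r = pearson(leading[:n], shifted[:n])
        if r > best[1]:
            best = (lag, r)
    return best


if __name__ == "__main__":
    # Toy data: `cases` repeats `mentions` two periods later.
    mentions = [1, 3, 2, 5, 4, 6, 8, 7]
    cases = [0, 0, 1, 3, 2, 5, 4, 6]
    print(best_lag(mentions, cases, max_lag=3))
```

With real data, the chosen lag should be read with care, since short series and shared seasonal patterns can produce high correlations at several shifts.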

Analysis on Research Trends in Sport Facilities: Focusing on SCOPUS DB (스포츠시설에 관한 연구 동향 분석: SCOPUS DB를 중심으로)

  • Kim, Il-Gwang;Park, Seong-Taek;Park, Su-Sun;Kim, Mi-Suk;Park, Jong-Chul;Jiang, Jialei
    • Journal of Industrial Convergence
    • /
    • v.19 no.6
    • /
    • pp.11-19
    • /
    • 2021
  • The purpose of this study is to explore trends in domestic and international research related to "Sport Facilities" and to seek directions for further research. A total of 1,801 abstracts of papers including "Sport Facilities" were collected from the SCOPUS DB for 2016 to 2020. Topic modeling based on the Latent Dirichlet Allocation (LDA) algorithm implemented in R, TF-IDF, and word clouds generated with Tagxedo were used to analyze the data. As a result, 8 topics were determined to be optimal, and "sports", "facilities", "health", "physical", "data", and "using" were derived as the main keywords for the topics. These results indicate that studies on physical activity, health, and facility use in relation to sports facilities have been actively carried out at home and abroad in recent years. This indicates that papers in the SCOPUS DB are paying attention to the instrumental value of sport facilities, such as health promotion and improving quality of life. Therefore, studies that help participants who use sport facilities for a healthy life should continue to be conducted in the future.

Trend Analysis of Corona Virus(COVID-19) based on Social Media (소셜미디어에 나타난 코로나 바이러스(COVID-19) 인식 분석)

  • Yoon, Sanghoo;Jung, Sangyun;Kim, Young A
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.22 no.5
    • /
    • pp.317-324
    • /
    • 2021
  • This study deals with keywords from social media on domestic portal sites related to COVID-19, which is spreading widely. The data were collected between January 20 and August 15, 2020, and were divided into three periods. The precursor period, from January 20 to February 17, precedes the wide spread of COVID-19; the serious period, from February 18 to April 20, covers the spread in Daegu; and the stable period, up to August 15, covers the decrease in the number of confirmed infections. The top 50 words were extracted and clustered based on TF-IDF. In the analysis, the precursor period keywords corresponded to congestion of the Situation. The frequent keywords in the serious period were Nation and Infection Route, along with instability surrounding the Treatment of COVID-19. The most common keywords in all periods were infection, mask, person, occurrence, confirmation, and information. People's emotions are becoming more positive as time goes by. Cafes and blogs share text containing writers' thoughts and subjective views via the internet, so they are the main information-sharing spaces in the non-face-to-face era caused by COVID-19. However, since selectivity and randomness exist in information delivery, a critical view of the information produced on social media is necessary.

A Case Study on the Application of AI-OCR for Data Transformation of Paper Records (종이기록 데이터화를 위한 AI-OCR 적용 사례연구)

  • Ahn, Sejin;Hwang, Hyunho;Yim, Jin Hee
    • Journal of the Korean Society for Information Management
    • /
    • v.39 no.3
    • /
    • pp.165-193
    • /
    • 2022
  • Digital technology can be said to be at the center of change in the modern work environment. In particular, in public institutions that document their work with records produced by business management systems and document production systems, the record management system is itself the work environment. Gimpo City applied for the 2021 public cloud leading project of the National Information Society Agency (NIA) to respond proactively to the technologies of the 4th industrial revolution, and implemented a public cloud-based AI-OCR technology enhancement project with 330 million won in support. Through this project, paper records were converted into data, going beyond the limitations of non-electronic records, which had been limited to search and image viewing dependent on standardized index values. In addition, a 98% recognition rate was achieved by applying a new technology called AI-OCR. Since digital technology has been used to improve work efficiency, productivity, development cost, and the level of record management service for internal and external users, we would like to share this direction for enhancing expertise in record management and for implementing work environment innovation.

An Autobiographical Narrative Inquiry on the Process of Becoming-Scientist for Science Teachers (과학교사의 과학연구자-되기 과정에 관한 자서전적 내러티브 탐구)

  • Kwan-Young Kim;Sang-Hak Jeon
    • Journal of The Korean Association For Science Education
    • /
    • v.43 no.4
    • /
    • pp.369-387
    • /
    • 2023
  • This study aims to interpret the experience of science research in a graduate school laboratory from the perspective of Gilles Deleuze's concepts of "agencement" and "becoming". The research was conducted as an autobiographical narrative inquiry. The research text is written in a way that tells the story of my science research experience and retells it from the perspective of Gilles Deleuze. In Deleuze's view, science research is a constantly flowing agencement. The science research agencement is composed of a mechanical agencement of various experimental tools-machines and researcher-machines as well as a collective agencement of speech acts such as biological knowledge, experiment protocols, and laboratory rules. Furthermore, science research agencement is fluid as events occur all over the agencement. Data, as a change occurring in the material dimension, is an event and sign that raises problems. It has the agency to influence agencement through an intersubjective relationship with researchers, and the meaning of data is generated in this process. The change of agencement compelled me to perform science practice. I have performed repeated science practice, meaning that my body has constantly been connected to other machines. As a result of this connection, my body has been affected, and the capacity of my body that constitutes the agencement has been augmented. In addition, I was able to be deterritorialized from the existing science research agencement and reterritorialized in a new science research agencement with data. This process of differentiation made my becoming-scientist possible. In sum, this study provides implications for science practice-oriented education by exploring the process of becoming-scientist based on my science research experience.

Liaohe National Park based on big data visualization Visitor Perception Study

  • Qi-Wei Jing;Zi-Yang Liu;Cheng-Kang Zheng
    • Journal of the Korea Society of Computer and Information
    • /
    • v.28 no.4
    • /
    • pp.133-142
    • /
    • 2023
  • National parks are one of the important types of protected area management established by the IUCN and a management model for the effective conservation and sustainable use of natural and cultural heritage in countries around the world. They play important roles in conservation, scientific research, education, recreation, and community development. In the context of big data, this study takes China's Liaohe National Park, a typical representative of global coastal wetlands, as a case study and uses Python to collect tourists' travelogues and reviews from major OTA websites in China as its data source. The text spans 2015 to 2022 and contains 2,998 reviews with 166,588 words in total. The results show that wildlife resources, natural landscape, wetland ecology, and the fishing and hunting culture of northern China are fully reflected in the perceptions of visitors to Liaohe National Park. Visitors have strong positive feelings toward the park, but there is still much room for improvement in supporting services and facilities, public education, and visitor experience and participation.