• Title/Summary/Keyword: Text data

Search Results: 2,959

A study on Metaverse keyword Consumer perception survey after Covid-19 using big Data

  • LEE, JINHO;Byun, Kwang Min;Ryu, Gi Hwan
    • International Journal of Internet, Broadcasting and Communication
    • /
    • v.14 no.4
    • /
    • pp.52-57
    • /
    • 2022
  • In this study, keywords were collected from representative online portal sites such as Naver, Google, and YouTube using the text-mining tool Textom to examine changes in perceptions of the metaverse after COVID-19. Before COVID-19, social media platforms such as KakaoTalk, Facebook, and Twitter were the most frequently mentioned, and among the four metaverse types, consumer awareness was still concentrated on lifelogging. After COVID-19, however, keywords from Roblox, Fortnite, and Zepeto appeared, along with keywords such as Universe, Space, Meta, and World, indicating that the metaverse came to be recognized as a virtual world. As a result, it was confirmed that consumer perception shifted from the lifelogging type of metaverse to the mirror world. Third, keywords such as cryptocurrency, coin, and exchange appeared before COVID-19, and blockchain, the underlying technology, ranked high in word frequency, but after COVID-19 its frequency ranking fell significantly.
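The before/after keyword comparison at the heart of this study can be illustrated with a minimal stdlib sketch. Textom is a commercial platform, so the function below, its name, and the toy corpora are all illustrative assumptions, not the paper's pipeline; it shows only the frequency-shift idea.

```python
from collections import Counter

def keyword_shift(before_texts, after_texts, top_n=5):
    """Compare keyword frequencies between two corpora, as in a
    before/after text-mining comparison. Positive delta = rose after."""
    before = Counter(w for t in before_texts for w in t.lower().split())
    after = Counter(w for t in after_texts for w in t.lower().split())
    shifts = {w: after[w] - before[w] for w in set(before) | set(after)}
    return sorted(shifts.items(), key=lambda kv: kv[1], reverse=True)[:top_n]

rising = keyword_shift(
    ["facebook twitter kakao talk", "blockchain coin exchange"],
    ["roblox metaverse virtual world", "metaverse universe space meta"],
)
print(rising[0])  # -> ('metaverse', 2)
```

A real analysis would tokenize with a morphological analyzer and rank by TF-IDF rather than raw counts, but the comparison step is the same.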

Design and Implementation of Typing Practice Application for Learning Using Web Contents (웹 콘텐츠를 활용한 학습용 타자 연습 어플리케이션의 설계와 구현)

  • Kim, Chaewon;Hwang, Soyoung
    • Journal of Korea Multimedia Society
    • /
    • v.24 no.12
    • /
    • pp.1663-1672
    • /
    • 2021
  • There are various typing practice applications, and research on learning applications that support typing practice has been reported. These services usually rely on their own built-in text. Since learners collect much of their study material from web services, this paper proposes a learning application that increases the learning effect by collecting large amounts of web content and applying it to typing practice. The proposed application is implemented with Tkinter, a GUI module of Python, and uses Python's BeautifulSoup module to extract information from the web. The extracted data are processed with the NLTK module for English and the KoNLPy module for Korean. The operation of the proposed functions is verified through the implementation and experimental results.
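The extract-then-segment step this pipeline describes can be sketched with the standard library alone. The parser class below stands in for BeautifulSoup's `get_text()`, and the line-length cutoff is an arbitrary assumption; the paper's actual modules (Tkinter UI, NLTK/KoNLPy processing) are not reproduced here.

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collects visible text from HTML, skipping script/style content."""
    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip = False
    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip = True
    def handle_endtag(self, tag):
        if tag in ("script", "style"):
            self._skip = False
    def handle_data(self, data):
        if not self._skip and data.strip():
            self.chunks.append(data.strip())

def typing_lines(html, max_len=60):
    """Split extracted text into short lines suitable for typing drills."""
    p = TextExtractor()
    p.feed(html)
    words, line, lines = " ".join(p.chunks).split(), [], []
    for w in words:
        if line and sum(len(x) + 1 for x in line) + len(w) > max_len:
            lines.append(" ".join(line))
            line = []
        line.append(w)
    if line:
        lines.append(" ".join(line))
    return lines

print(typing_lines("<p>Practice typing with <b>web</b> content.</p>"))
```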

Analysis on Trends of No-Code Machine Learning Tools

  • Yo-Seob, Lee;Phil-Joo, Moon
    • International Journal of Advanced Culture Technology
    • /
    • v.10 no.4
    • /
    • pp.412-419
    • /
    • 2022
  • The amount of digital text data is growing exponentially, and many machine learning solutions are used to monitor and manage this data. Artificial intelligence and machine learning appear in many areas of daily life, but the underlying processes and concepts are not easy for most people to understand. At a time when running a machine learning solution requires many experts, no-code machine learning tools are a good alternative. A no-code machine learning tool is a platform that enables machine learning tasks to be performed without engineers or developers. The latest no-code machine learning tools run in the browser, so no additional software needs to be installed, and their simple GUIs make them easy to use. These platforms can save considerable money and time because they demand fewer skills and less code. No-code machine learning tools also make artificial intelligence and machine learning easier to understand. In this paper, we examine no-code machine learning tools and compare their features.

Fake News Checking Tool Based on Siamese Neural Networks and NLP (NLP와 Siamese Neural Networks를 이용한 뉴스 사실 확인 인공지능 연구)

  • Vadim, Saprunov;Kang, Sung-Won;Rhee, Kyung-hyune
    • Annual Conference of KIPS
    • /
    • 2022.05a
    • /
    • pp.627-630
    • /
    • 2022
  • Over the past few years, fake news has become one of the most significant problems. Since it is impossible to prevent people from spreading misinformation, people must analyze the news themselves; because this takes time and effort, the routine part of the analysis should be automated. There are many approaches to this problem, but most analyze only the text of messages and ignore the images. The fake news problem should be addressed with a comprehensive analysis tool to reach better performance. In this paper, we propose training an artificial intelligence with an unsupervised learning algorithm, combined with online data-parsing tools, providing independence from subjective data sets. This would make it more difficult to spread fake news, since people could quickly check whether a news item or article is trustworthy.
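A Siamese setup encodes a claim and a reference article through twin branches and then compares the two embeddings. The sketch below replaces the learned encoder with a hand-built bag-of-words vector (an assumption made to stay dependency-free); only the final similarity comparison mirrors what the paper's network computes.

```python
import math
from collections import Counter

def bow(text):
    """Bag-of-words vector; a Siamese network would use a learned encoder."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    num = sum(a[w] * b[w] for w in a)
    den = (math.sqrt(sum(v * v for v in a.values()))
           * math.sqrt(sum(v * v for v in b.values())))
    return num / den if den else 0.0

claim = "vaccine causes severe illness in all patients"
trusted = "no evidence that the vaccine causes severe illness"
unrelated = "stock market closed higher on friday"
# The claim should score closer to the related reference than to noise.
print(cosine(bow(claim), bow(trusted)) > cosine(bow(claim), bow(unrelated)))
```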

A Study on Automatic Data Tagging for Text-based Training Data Construction (텍스트 기반의 훈련 데이터 구축을 위한 자동 데이터 태깅 작업에 대한 연구)

  • Kim, NaYun;So, Hyeryung;Park, Joonho
    • Annual Conference of KIPS
    • /
    • 2020.11a
    • /
    • pp.1008-1009
    • /
    • 2020
  • Text-based training data requires tagging for each token after the data are collected. A corpus is a body of text, studied mainly in linguistics, in which part-of-speech information for each word is recorded as tags. This study examines tagging for Korean and investigates automatic tagging methods for building a corpus or separate training data from data collected by companies or research institutes, rather than relying on standard Korean corpora.
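The core of such automatic tagging can be sketched as a dictionary lookup that flags unseen tokens for review. A production pipeline would use a Korean morphological analyzer (e.g. KoNLPy's Okt or Mecab); the tiny tag dictionary and tag names below are illustrative assumptions, not the paper's tag set.

```python
# Illustrative tag dictionary; a real system derives tags from a
# morphological analyzer, not a hand-written table.
TAG_DICT = {
    "데이터": "Noun",
    "수집": "Noun",
    "자동": "Noun",
}

def auto_tag(tokens, tag_dict=TAG_DICT, default="Unknown"):
    """Attach a POS tag to each token; tokens missing from the dictionary
    are marked for human review, which is how semi-automatic corpus
    construction typically proceeds."""
    return [(tok, tag_dict.get(tok, default)) for tok in tokens]

print(auto_tag(["데이터", "수집", "엔진"]))
```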

An Ensemble Model for Credit Default Discrimination: Incorporating BERT-based NLP and Transformer

  • Sophot Ky;Ju-Hong Lee
    • Annual Conference of KIPS
    • /
    • 2023.05a
    • /
    • pp.624-626
    • /
    • 2023
  • Credit scoring is a technique used by financial institutions to assess the creditworthiness of potential borrowers. It involves evaluating a borrower's credit history to predict the likelihood of defaulting on a loan. This paper presents an ensemble of two Transformer-based models within a framework for discriminating the default risk of loan applications in credit scoring. The first model is FinBERT, a pretrained NLP model for analyzing the sentiment of financial text. The second is FT-Transformer, a simple adaptation of the Transformer architecture for the tabular domain. Both models are trained on the same underlying dataset, with the only difference being the representation of the data. This multi-modal approach allows us to leverage the unique capabilities of each model and potentially uncover insights that may not be apparent when using a single model alone. We compare our model with two well-known ensemble-based models, Random Forest and Extreme Gradient Boosting.
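The combination step of such an ensemble can be as simple as a weighted average of the two models' default probabilities. The sketch below shows soft voting under that assumption; the paper does not specify its combination rule, and the weight and threshold here are arbitrary.

```python
def ensemble_default_prob(p_text, p_tabular, w_text=0.5):
    """Soft-voting combination of a text model's default probability
    (FinBERT-style) and a tabular model's (FT-Transformer-style).
    The weight w_text is a modeling choice, not from the paper."""
    return w_text * p_text + (1 - w_text) * p_tabular

def classify(p_text, p_tabular, threshold=0.5):
    """Map the combined probability to a decision label."""
    combined = ensemble_default_prob(p_text, p_tabular)
    return "default" if combined >= threshold else "non-default"

print(classify(0.8, 0.3))  # combined = 0.55 -> "default"
```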

Analysis of AI Content Detector Tools

  • Yo-Seob Lee;Phil-Joo Moon
    • International journal of advanced smart convergence
    • /
    • v.12 no.4
    • /
    • pp.154-163
    • /
    • 2023
  • With the rapid development of AI technology, ChatGPT and other AI content creation tools are becoming common, and curious users are adopting them. Unlike search engines, these tools generate results from user prompts, which exposes them to risks of inaccuracy and plagiarism. Unethical users can thus create inappropriate content, raising significant educational and corporate data security concerns. AI content detection is therefore needed: AI-generated text must be identified to address misinformation and trust issues, and alongside the positive use of AI tools, monitoring and regulation of their ethical use is essential. When detecting AI-generated content, a detection tool can be used effectively by choosing the one appropriate for the usage environment and purpose. In this paper, we collect data on AI content detector tools and compare and analyze their functions and characteristics to help meet these needs.
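One family of signals such detectors report is "burstiness": human prose tends to vary sentence length more than model-generated prose. The toy function below computes only that single signal; it is an illustrative heuristic, not any surveyed tool's actual detector, and real tools combine many stronger features (perplexity under a language model, token statistics).

```python
import re
import statistics

def burstiness(text):
    """Population variance of sentence lengths (in words): one simple,
    coarse signal of human-vs-machine prose. A toy heuristic only."""
    sents = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    lengths = [len(s.split()) for s in sents]
    return statistics.pvariance(lengths) if len(lengths) > 1 else 0.0

uniform = "The cat sat here. The dog sat here. The bird sat here."
varied = "Stop. The storm rolled in faster than anyone at the harbor expected."
print(burstiness(varied) > burstiness(uniform))  # -> True
```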

Exploring trends in blockchain publications with topic modeling: Implications for forecasting the emergence of industry applications

  • Jeongho Lee;Hangjung Zo;Tom Steinberger
    • ETRI Journal
    • /
    • v.45 no.6
    • /
    • pp.982-995
    • /
    • 2023
  • Technological innovation generates products, services, and processes that can disrupt existing industries and lead to the emergence of new fields. Distributed ledger technology, or blockchain, offers novel transparency, security, and anonymity characteristics in transaction data that may disrupt existing industries. However, research attention has largely examined its application to finance; less is known about broader applications, particularly in Industry 4.0. This study investigates academic research publications on blockchain and predicts emerging industries using academia-industry dynamics. It adopts latent Dirichlet allocation and dynamic topic models to analyze large text data with a high capacity for dimensionality reduction. Prior studies confirm that research contributes to technological innovation through spillover, including products, processes, and services. This study predicts emerging industries that will likely incorporate blockchain technology using insights from the knowledge structure of publications.
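Dynamic topic models estimate how a topic's share of the literature changes per period. The stdlib sketch below approximates that output with a crude per-year keyword-hit rate; the topic terms, toy records, and function name are assumptions for illustration, not the paper's LDA/DTM machinery.

```python
from collections import defaultdict

def topic_trend(papers, topic_terms):
    """Share of publications per year mentioning any of a topic's terms:
    a crude stand-in for the per-period topic proportions that dynamic
    topic models estimate."""
    hits, totals = defaultdict(int), defaultdict(int)
    for year, abstract in papers:
        totals[year] += 1
        if set(abstract.lower().split()) & topic_terms:
            hits[year] += 1
    return {y: hits[y] / totals[y] for y in sorted(totals)}

papers = [
    (2019, "blockchain for bitcoin payments"),
    (2019, "consensus protocols in finance"),
    (2021, "blockchain supply chain traceability"),
    (2021, "smart contracts for logistics and supply chain"),
]
# A rising share suggests an emerging application area.
print(topic_trend(papers, {"supply", "logistics", "traceability"}))
```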

A Study on Intelligent Document Processing Management using Unstructured Data (비정형 데이터를 활용한 지능형 문서 처리 관리에 관한 연구)

  • Kyoung Hoon Park;Kwang-Kyu Seo
    • Journal of the Semiconductor & Display Technology
    • /
    • v.23 no.2
    • /
    • pp.71-75
    • /
    • 2024
  • This research focuses on efficiently processing unstructured data containing various formulas in the management of terms and rules in domestic insurance documents, using text mining techniques. Through parsing and compilation technology, document context, content, constants, and variables are automatically separated, and errors are verified in document and logic order to improve document accuracy. Document debugging technology identifies errors in the document in real time. Furthermore, it is necessary to anticipate the changes that intelligent document processing will bring to document management work, in particular its impact on documents and tasks that are doubly managed due to various formulas, and to prepare the necessary capabilities.
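The separation of constants and variables from clause text can be sketched with two regular expressions. The `{name}` placeholder convention and the sample clause below are invented for illustration; the paper's parser operates on real insurance terms with far richer grammar.

```python
import re

def split_terms(clause):
    """Separate numeric constants and {named} variables from a clause:
    a toy version of the parsing step intelligent document processing
    automates. The placeholder syntax is an assumed convention."""
    constants = re.findall(r"\d+(?:\.\d+)?%?", clause)
    variables = re.findall(r"\{(\w+)\}", clause)
    return constants, variables

print(split_terms("Pay {insured_amount} x 80% after 90 days"))
# -> (['80%', '90'], ['insured_amount'])
```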


Large Multimodal Model for Context-aware Construction Safety Monitoring

  • Taegeon Kim;Seokhwan Kim;Minkyu Koo;Minwoo Jeong;Hongjo Kim
    • International conference on construction engineering and project management
    • /
    • 2024.07a
    • /
    • pp.415-422
    • /
    • 2024
  • Recent advances in construction automation have led to increased use of deep learning-based computer vision technology for construction monitoring. However, monitoring systems based on supervised learning struggle with recognizing complex risk factors in construction environments, highlighting the need for adaptable solutions. Large multimodal models, pretrained on extensive image-text datasets, present a promising solution with their capability to recognize diverse objects and extract semantic information. This paper proposes a methodology that generates training data for multimodal models, including safety-centric descriptions using GPT-4V, and fine-tunes the LLaVA model using the LoRA method. Experimental results from seven construction site hazard scenarios show that the fine-tuned model accurately assesses safety status in images. These findings underscore the proposed approach's effectiveness in enhancing construction site safety monitoring and illustrate the potential of large multimodal models to tackle domain-specific challenges.
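Fine-tuning a multimodal model in this way requires instruction-style training records pairing an image with a safety-centric conversation. The builder below follows the conversation format commonly used in LLaVA-style training data; the field names, file path, and example description are assumptions for illustration, not the paper's verified schema or data.

```python
import json

def make_record(image_path, description, question="Is this scene safe?"):
    """One training example in a LLaVA-style conversation format.
    In the paper's pipeline the description would come from GPT-4V;
    here it is hand-written for illustration."""
    return {
        "image": image_path,
        "conversations": [
            {"from": "human", "value": f"<image>\n{question}"},
            {"from": "gpt", "value": description},
        ],
    }

rec = make_record(
    "site_001.jpg",
    "A worker stands near the crane load line without a hard hat; unsafe.",
)
print(json.dumps(rec, indent=2))
```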