• Title/Summary/Keyword: text-based search

Search Result 373, Processing Time 0.028 seconds

An Efficient Scheme of Encapsulation Method to Avoid Fragmentation Degradation During TVA Metadata Delivery (TVA 메타데이터 전송과정에서 단편화에 의한 성능 감소를 회피하기 위한 효율적인 캡슐화 방식)

  • Oh, Bong-Jin;Park, Jong-Youl;Kim, Sang-Hyung;Yoo, Kwan-Jong
    • The Journal of Korean Institute of Communications and Information Sciences / v.37 no.7C / pp.627-636 / 2012
  • XML is now widely used to describe services and content in fields such as IPTV and digital broadcasting because of its high readability and extensibility. The TV-Anytime schema and delivery protocol in particular have been adopted as base standards for these services and extended with private functions. However, because XML is text-based, it produces larger documents than traditional methods. Many encoding algorithms, such as EXI, BiM, GZIP, and Fast Infoset, have therefore been proposed to reduce the size of XML documents. Although these algorithms compress XML documents efficiently, they cannot avoid the degradation caused by fragmentation during the encapsulation step. This paper proposes an efficient TV-Anytime encapsulation scheme that uses common string tables to avoid the fragmentation-induced loss of encoding efficiency.
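
The idea of a string table shared across fragments can be illustrated with a minimal sketch. This is not the paper's scheme: the fragments, the tokenizer, and all function names below are hypothetical, and real encoders such as EXI or BiM work at the bit level. The sketch only shows why one common table stores fewer strings than a separate table per fragment.

```python
import re

# Hypothetical fragments of a TV-Anytime-like document (made-up content).
FRAGMENTS = [
    "<ProgramInformation><Title>News</Title><Genre>Info</Genre></ProgramInformation>",
    "<ProgramInformation><Title>Drama</Title><Genre>Fiction</Genre></ProgramInformation>",
]


def tokenize(fragment):
    """Split a fragment into tags and text runs, dropping empty pieces."""
    return [t for t in re.split(r"(<[^>]+>)", fragment) if t]


def build_common_table(fragments):
    """Assign one index per distinct token across ALL fragments."""
    table = {}
    for fragment in fragments:
        for token in tokenize(fragment):
            table.setdefault(token, len(table))
    return table


def encode(fragment, table):
    """Replace each token with its index in the shared table."""
    return [table[token] for token in tokenize(fragment)]


def decode(indices, table):
    """Rebuild the fragment text from table indices."""
    reverse = {index: token for token, index in table.items()}
    return "".join(reverse[i] for i in indices)


common = build_common_table(FRAGMENTS)
encoded = [encode(f, common) for f in FRAGMENTS]

# Strings stored when every fragment carries its own table
# versus when all fragments reference one shared table.
per_fragment_entries = sum(len(set(tokenize(f))) for f in FRAGMENTS)
shared_entries = len(common)
```

In this toy case the repeated tag names are stored once in the shared table instead of once per fragment, which is the kind of redundancy that independent per-fragment encoding cannot remove.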

Development of Artificial Intelligence-based Legal Counseling Chatbot System

  • Park, Koo-Rack
    • Journal of the Korea Society of Computer and Information / v.26 no.3 / pp.29-34 / 2021
  • With the advent of the Fourth Industrial Revolution, IT is converging with existing industries and fields to create services that did not previously exist. In artificial intelligence in particular, chatbots have advanced dramatically along with natural language processing, and a variety of business processes are now handled through them. This study presents a system that uses slot-filling-based chatbot technology to give legal inquiries a structured form: the user enters a question of a predetermined type, and the system returns the answer closest to what the user is looking for. With the proposed system, legal information, which is unstructured text data, can be built into more structured question-and-answer data. In addition, by managing the accumulated Q&A data in a big data storage system such as Apache Hive and reusing it for training, the reliability of the responses can be expected to improve continuously.
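
Slot filling can be sketched in a few lines. The abstract does not list the actual slots, so the slot names, prompts, and the `Dialogue` class below are all hypothetical; the sketch only shows how a dialogue keeps asking for the next missing slot until the inquiry becomes a structured record.

```python
from dataclasses import dataclass, field

# Hypothetical slots for a legal inquiry; the paper's slot set is not given.
REQUIRED_SLOTS = ("case_type", "party_role", "incident_date")

# Hypothetical prompts the chatbot would show for each missing slot.
PROMPTS = {
    "case_type": "What kind of case is it (for example lease, loan, traffic)?",
    "party_role": "Are you the claimant or the respondent?",
    "incident_date": "When did it happen?",
}


@dataclass
class Dialogue:
    slots: dict = field(default_factory=dict)

    def next_missing(self):
        """Return the first required slot that has no value yet, or None."""
        for name in REQUIRED_SLOTS:
            if name not in self.slots:
                return name
        return None

    def fill(self, name, value):
        """Store a non-empty answer for the given slot."""
        value = value.strip()
        if value:
            self.slots[name] = value

    def is_complete(self):
        return self.next_missing() is None

    def as_record(self):
        """Return the structured question record once all slots are filled."""
        return {name: self.slots[name] for name in REQUIRED_SLOTS}


# Usage: simulate a user answering each prompt in turn.
dialogue = Dialogue()
answers = {"case_type": "lease", "party_role": "claimant", "incident_date": "2020-05-01"}
while not dialogue.is_complete():
    slot = dialogue.next_missing()
    dialogue.fill(slot, answers[slot])
record = dialogue.as_record()
```

A record of this shape is the sort of structured question-and-answer data that could then be stored and reused for training.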

A Study on Utilization Method of Information Visualization in the Humanities and Area Studies (인문·지역연구에서의 정보시각화 활용 방안 연구)

  • Kang, Ji-Hoon;Lee, Dong-Yul;Moon, Sang-Ho
    • Asia-pacific Journal of Multimedia Services Convergent with Art, Humanities, and Sociology / v.5 no.5 / pp.59-68 / 2015
  • Because interdisciplinary convergence can reach beyond the borders of individual disciplines, collaborative research across fields can create new and meaningful knowledge. In recent years, the Digital Humanities has attracted particular attention as a convergence of the humanities and ICT. From a research-methodology perspective, the Digital Humanities is a tool that can serve as a convergence system for information uses such as storage, retrieval, sharing, and dissemination. From an information-systems perspective, the Digital Humanities has been built and used in a variety of systems. Among these efforts, studies on information visualization for the Digital Humanities have been actively conducted. Data or information can be visualized in various forms, such as images, multimedia, and interfaces. In this paper, we analyze cases of information visualization in digital humanities systems and propose ways to use them in the Humanities and Area Studies.

Web Site Keyword Selection Method by Considering Semantic Similarity Based on Word2Vec (Word2Vec 기반의 의미적 유사도를 고려한 웹사이트 키워드 선택 기법)

  • Lee, Donghun;Kim, Kwanho
    • The Journal of Society for e-Business Studies / v.23 no.2 / pp.83-96 / 2018
  • Extracting keywords that represent a document is important because they can be used for automated services such as document search, classification, and recommendation, as well as for conveying document information quickly. However, methods that extract keywords from the frequency of words in a website's documents, or from graph algorithms based on word co-occurrence, have two problems: the web page structure can contain many words unrelated to the topic, and the limited performance of Korean tokenizers makes semantic keywords difficult to extract. In this paper, we propose a method that selects candidate keywords based on semantic similarity, addressing both the failure to extract semantic keywords and the poor accuracy of Korean tokenizer analysis. The final semantic keywords are then extracted through a filtering process that removes inconsistent keywords. Experiments on real web pages of small businesses show that the proposed method improves performance by 34.52% over a keyword selection technique based on statistical similarity. This confirms that considering semantic similarity between words and removing inconsistent keywords improves keyword extraction from documents.
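
Filtering candidate keywords by semantic similarity can be sketched as follows. This is not the paper's algorithm: the three-dimensional vectors are toy stand-ins for trained Word2Vec embeddings, and the mean-similarity rule and the 0.5 threshold are assumptions chosen for illustration.

```python
import math

# Toy 3-dimensional vectors standing in for trained Word2Vec embeddings.
VECTORS = {
    "bakery": (0.9, 0.1, 0.0),
    "bread": (0.8, 0.2, 0.1),
    "cake": (0.7, 0.3, 0.0),
    "login": (0.0, 0.1, 0.9),  # page-structure word unrelated to the topic
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def select_keywords(candidates, vectors, threshold=0.5):
    """Keep candidates whose mean similarity to the other candidates
    reaches the (assumed) threshold; drop the inconsistent ones."""
    kept = []
    for word in candidates:
        others = [c for c in candidates if c != word]
        if not others:
            kept.append(word)
            continue
        mean_sim = sum(cosine(vectors[word], vectors[o]) for o in others) / len(others)
        if mean_sim >= threshold:
            kept.append(word)
    return kept


selected = select_keywords(list(VECTORS), VECTORS)
```

With these toy vectors, the off-topic candidate `login` has low similarity to every other candidate and is filtered out, while the three topic words are kept.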

Development of an Integrated DataBase System of Marine Geological and Geophysical Data Around the Korean Peninsula (한반도 해역 해양지질 및 지구물리 자료 통합 DB시스템 개발)

  • KIM, Sung-Dae;BAEK, Sang-Ho;CHOI, Sang-Hwa;PARK, Hyuk-Min
    • Journal of the Korean Association of Geographic Information Studies / v.19 no.2 / pp.47-62 / 2016
  • An integrated database (DB) system was developed to manage the marine geological and geophysical data acquired around the Korean peninsula from 2009 to 2013. Geological data, such as size analysis data, columnar section images, X-ray images, heavy metal data, and organic carbon data of sediment samples, were collected in the form of text files, Excel files, PDF files, and image files. Geophysical data, such as seismic, magnetic, and gravity data, were gathered in the form of SEG-Y binary files, image files, and text files. We collected scientific data from research projects funded by the Ministry of Oceans and Fisheries, data produced by domestic marine organizations, and public data provided by foreign organizations. All the collected data were validated manually and stored in the archive DB according to data processing procedures. A geographic information system (GIS) was developed to manage the spatial information and provide data effectively through a map interface. GIS software was used to import position data from text files, manipulate spatial data, and produce shapefiles. A GIS DB was set up using the Oracle database system and the ArcGIS spatial data engine. A client/server GIS application was developed to support data search, data provision, and visualization of scientific data. It provided complex search functions and on-the-fly visualization using ChartFX and specially developed programs. The system is currently maintained, and newly collected data are added to the DB system every year.

Design of an Intellectual Smart Mirror Application helping Face Makeup (얼굴 메이크업을 도와주는 지능형 스마트 거울 앱의 설계)

  • Oh, Sun Jin;Lee, Yoon Suk
    • The Journal of the Convergence on Culture Technology / v.8 no.5 / pp.497-502 / 2022
  • Young people increasingly prefer visual media to text as a means of distributing and sharing information, and it has become natural to distribute information through YouTube or one-person Internet broadcasts; this is how young people usually get their information. Many young people are also bold and assertive about presenting themselves in unique ways, freely expressing individuality through adventurous face makeup, hair styling, and fashion coordination regardless of gender. Face makeup in particular has become a major interest among men as well as women, and is a primary means of expressing personality. In this study, to meet these demands, we design and implement an intelligent smart mirror application that efficiently retrieves and recommends relevant videos from YouTube or one-person broadcasts produced by well-known professional makeup artists, so that users can apply face makeup that suits their face shape, hair color and style, skin tone, and fashion color and style, and that expresses their individuality. We also introduce an AI technique that provides an optimal solution by learning the user's search patterns and facial features, and we provide detailed images of the made-up face so that users can learn makeup skills step by step.

A Study on the University Education Plan Using ChatGPT for University Students (ChatGPT를 활용한 대학 교육 방안 연구)

  • Hyun-ju Kim;Jinyoung Lee
    • The Journal of the Convergence on Culture Technology / v.10 no.1 / pp.71-79 / 2024
  • ChatGPT, an interactive artificial intelligence (AI) chatbot developed by OpenAI in the U.S., is gaining popularity and drawing strong reactions around the world. Some in academia are concerned that students can use ChatGPT for plagiarism, but ChatGPT is also widely used in positive ways, for example to write marketing or website copy. Some hold that ChatGPT could be a new future for "search," and some analysts say the focus should be on fostering it rather than on excessive regulation. This study analyzed college students' awareness of ChatGPT through a survey of their perceptions. Plagiarism inspection systems were also examined in order to establish an education support model using ChatGPT. Based on this, a university education support model using ChatGPT was constructed. The model is organized around text, digital, and art, with the detailed strategies needed for the era of the Fourth Industrial Revolution placed under each. In addition, the model is configured so that, after the instructor determines the allowable range of ChatGPT-generated content according to the learning goal, the ChatGPT detection function provided by the plagiarism inspection system is used to guide students to use ChatGPT within the permitted range. Linking ChatGPT and the plagiarism inspection system in this way is expected to prevent ChatGPT's capabilities from being abused in education.

Korean Abbreviation Generation using Sequence to Sequence Learning (Sequence-to-sequence 학습을 이용한 한국어 약어 생성)

  • Choi, Su Jeong;Park, Seong-Bae;Kim, Kweon-Yang
    • KIISE Transactions on Computing Practices / v.23 no.3 / pp.183-187 / 2017
  • Smartphone users prefer fast reading and texting, so they frequently abbreviate words and phrases. Abbreviations are now widely used, from chat terms to technical terms. Gathering abbreviations would therefore help many services, including information retrieval and recommendation systems. However, gathering abbreviations manually requires too much effort and cost, because new abbreviations are generated continuously whenever something new, such as a TV program or a phenomenon, appears. Abbreviations therefore need to be generated automatically. Existing methods for generating Korean abbreviations use a rule-based approach, which has two limitations: it cannot generate irregular abbreviations, and it must decide which of the candidate abbreviations generated by the rules is correct. To address these limitations, we propose a method that generates Korean abbreviations automatically using sequence-to-sequence learning. Sequence-to-sequence learning can generate irregular abbreviations and avoids the problem of choosing the correct abbreviation among candidates, so it is well suited to generating Korean abbreviations. We evaluate the proposed method on two types of datasets. The experimental results show that our method is effective for irregular abbreviations.

Character-based Subtitle Generation by Learning of Multimodal Concept Hierarchy from Cartoon Videos (멀티모달 개념계층모델을 이용한 만화비디오 컨텐츠 학습을 통한 등장인물 기반 비디오 자막 생성)

  • Kim, Kyung-Min;Ha, Jung-Woo;Lee, Beom-Jin;Zhang, Byoung-Tak
    • Journal of KIISE / v.42 no.4 / pp.451-458 / 2015
  • Previous multimodal learning methods focus on problem-solving aspects, such as image and video search and tagging, rather than on knowledge acquisition via content modeling. In this paper, we propose the Multimodal Concept Hierarchy (MuCH), a content modeling method that uses a cartoon video dataset, together with a character-based subtitle generation method that uses the learned model. The MuCH model has a multimodal hypernetwork layer, in which the patterns of the words and image patches are represented, and a concept layer, in which each concept variable is represented by a probability distribution over the words and the image patches. Using a Bayesian learning method, the model can learn the characteristics of the characters as concepts from the video subtitles and scene images, and it can generate character-based subtitles from the learned model when text queries are provided. As an experiment, the MuCH model learned concepts from 'Pororo' cartoon videos totaling 268 minutes and generated character-based subtitles. Finally, we compare the results with those of other multimodal learning models. The experimental results indicate that, given the same text query, our model generates more accurate and more character-specific subtitles than other models.

A Study on the Operating Conditions of Lecture Contents in Contactless Online Classes for University Students (대학생 대상 비대면 온라인 수업에서의 강의 콘텐츠 운영 실태 연구)

  • Lee, Jongmoon
    • Journal of the Korean BIBLIA Society for library and Information Science / v.32 no.4 / pp.5-24 / 2021
  • The purpose of this study was to investigate and analyze how lecture content is operated in contactless online classes for university students. First, analysis of the responses of 93 respondents showed that 93.3% took real-time online lectures (47.7%) or recorded video lectures (45.6%). Second, analysis of the content used as textbooks showed that e-books (materials) and paper books (materials) were used together (36.6%), or e-books or electronic materials (36.6% and 37.6% respectively) were used, in both liberal arts (47.3%) and major subjects (39.8%). In addition to textbooks, both major subjects and liberal arts made heavy use of web materials (47.6% and 40.5% respectively) and YouTube materials (33.3% and 48.0% respectively) as external materials. Third, to share lecture content, both liberal arts and major subjects used 'electronic files in the form of PPT or text organized and written by instructors' (62.9% and 58.1% respectively), 'internet materials' (16.7% and 19% respectively), and 'paper books or materials' (10.4% and 12.3% respectively). For lecture content displayed on screen, 93.5% of respondents were satisfied in major subjects and 90.2% in liberal arts. These results suggest the need to develop multimedia-based lecture content and an evaluation solution capable of real-time exam supervision; to develop a task management system capable of AI-based plagiarism search, task guidance, and task evaluation; and to institutionalize a solution to the copyright problems of digitizing lecture materials so that lectures can be given in a ubiquitous environment.