
Metadata Schema Design for Integrated Registry of B2B Business Processes (기업간 비즈니스 프로세스의 통합적 등록저장을 위한 메타데이터 스키마 설계)

  • Kim, Jong-Woo;Kim, Hyoung-Do
    • The Journal of Society for e-Business Studies / v.12 no.2 / pp.195-217 / 2007
  • B2B registries provide a space to register and retrieve the information needed to support B2B transactions among current or potential business partners. Business process specifications are among the most important contents of B2B registries, and they are hard to represent because business processes are complex and dynamic. In addition, several competing specification frameworks currently exist, such as ebXML BPSS, WSBPEL, and BPMN. This paper proposes a metadata schema for registering business process specifications expressed in different specification frameworks. The schema is extensible, so that specifications expressed in a variety of frameworks can be registered, and it extends the level of reuse from whole business process specifications to their components. To show the usefulness of the proposed schema, the paper demonstrates metadata extraction from business process specifications written in two representative XML-based business process specification languages, ebXML BPSS and WSBPEL.

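To make the idea of extracting registry metadata from a WSBPEL specification concrete, here is a minimal sketch in Python. It is not the schema proposed in the paper: the output field names (`framework`, `name`, `target_namespace`, `partner_links`, `operations`) and the sample process are assumptions chosen for illustration. Only the WS-BPEL 2.0 executable-process namespace and the `process`, `partnerLink`, and activity attribute names come from the WS-BPEL standard.

```python
import xml.etree.ElementTree as ET

# Namespace of WS-BPEL 2.0 executable processes.
BPEL_NS = "http://docs.oasis-open.org/wsbpel/2.0/process/executable"

# Hypothetical sample process, invented for this illustration.
SAMPLE = """<process name="PurchaseOrder"
    targetNamespace="http://example.com/po"
    xmlns="http://docs.oasis-open.org/wsbpel/2.0/process/executable">
  <partnerLinks>
    <partnerLink name="buyer" partnerLinkType="lns:buyerLT" myRole="seller"/>
    <partnerLink name="shipper" partnerLinkType="lns:shipLT" partnerRole="shipper"/>
  </partnerLinks>
  <sequence>
    <receive partnerLink="buyer" operation="sendOrder"/>
    <invoke partnerLink="shipper" operation="requestShipping"/>
    <reply partnerLink="buyer" operation="sendOrder"/>
  </sequence>
</process>"""


def extract_bpel_metadata(xml_text):
    """Collect a few descriptive fields from a WS-BPEL process document.

    The returned keys are illustrative assumptions, not the paper's schema.
    """
    root = ET.fromstring(xml_text)
    ns = {"bpel": BPEL_NS}
    # Partner links describe the parties the process interacts with.
    partner_links = [
        link.get("name")
        for link in root.findall("bpel:partnerLinks/bpel:partnerLink", ns)
    ]
    # Any activity carrying an "operation" attribute contributes an operation name.
    operations = sorted(
        {el.get("operation") for el in root.iter() if el.get("operation")}
    )
    return {
        "framework": "WSBPEL",
        "name": root.get("name"),
        "target_namespace": root.get("targetNamespace"),
        "partner_links": partner_links,
        "operations": operations,
    }


if __name__ == "__main__":
    print(extract_bpel_metadata(SAMPLE))
```

An extractor for ebXML BPSS could return the same dictionary shape with a different `framework` value, which is one way a registry record could stay independent of the specification language.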

A Research on Image Metadata Extraction through YCrCb Color Model Analysis for Media Hyper-personalization Recommendation (미디어 초개인화 추천을 위한 YCrCb 컬러 모델 분석을 통한 영상의 메타데이터 추출에 대한 연구)

  • Park, Hyo-Gyeong;Yong, Sung-Jung;You, Yeon-Hwi;Moon, Il-Young
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2021.10a / pp.277-280 / 2021
  • Recently, as a wide variety of content is mass-produced thanks to high accessibility, the media content market has become more active. Users want to find content that suits their taste, and platforms compete on personalized content recommendations. An efficient recommendation system requires high-quality metadata. On existing platforms, the user enters the metadata of an image by hand, which wastes time and money when large amounts of data must be processed. In this paper, for media hyper-personalization recommendation, keyframes are extracted from movie trailers based on the YCrCb color model of the video, and movie genres are distinguished through supervised learning. The authors also intend to propose a plan for using these results to generate metadata in the future.

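The abstract does not say how keyframes are chosen from the YCrCb representation, so the following Python sketch substitutes a simple stand-in rule: a frame becomes a keyframe when its mean (Y, Cr, Cb) values differ from those of the previous keyframe by more than a threshold. The conversion uses the common BT.601-style full-range formulas. The function names, the threshold value, and the representation of a frame as a list of RGB tuples are all assumptions made for illustration.

```python
def rgb_to_ycrcb(r, g, b):
    """Convert one 8-bit RGB pixel to (Y, Cr, Cb) using BT.601-style weights."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cr = (r - y) * 0.713 + 128
    cb = (b - y) * 0.564 + 128
    return y, cr, cb


def frame_signature(frame):
    """Mean (Y, Cr, Cb) over a frame given as a list of (R, G, B) tuples."""
    sums = [0.0, 0.0, 0.0]
    for r, g, b in frame:
        for i, value in enumerate(rgb_to_ycrcb(r, g, b)):
            sums[i] += value
    return tuple(s / len(frame) for s in sums)


def select_keyframes(frames, threshold=20.0):
    """Return indices of frames whose signature departs from the last keyframe.

    The threshold rule is an assumed placeholder, not the paper's method.
    """
    keyframes = []
    last = None
    for index, frame in enumerate(frames):
        signature = frame_signature(frame)
        # Sum of absolute channel differences against the last keyframe.
        if last is None or sum(abs(a - b) for a, b in zip(signature, last)) > threshold:
            keyframes.append(index)
            last = signature
    return keyframes


if __name__ == "__main__":
    dark = [(10, 10, 10)] * 4
    slightly_brighter = [(12, 12, 12)] * 4
    red = [(200, 20, 20)] * 4
    print(select_keyframes([dark, slightly_brighter, red, red]))
```

In this toy input the second frame is nearly identical to the first and the fourth repeats the third, so only the first and third frames are selected. The selected keyframes would then be the inputs to a supervised genre classifier.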

A study on Extraction of Metadata Elements for long-Term Preserving Official Document in EDMS (전자문서관리시스템의 공문서 영구보존을 위한 메타데이터 요소 설정에 관한 연구)

  • Yoo, Jung-Rym
    • Proceedings of the Korean Society for Information Management Conference / 2005.08a / pp.125-132 / 2005
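  • This study aims to establish interoperable preservation metadata elements for the long-term preservation of and access to official documents, the most common and representative type of record produced by public institutions. Specifically, by comparing the metadata elements proposed in ISO 15489, the records management standard, with the metadata elements used in Korea, it examines the preservation metadata items for official documents, which are the core of an electronic document management system. The results can serve as useful baseline material for building standardized records preservation metadata suited to the Korean environment.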


Korean Web Content Extraction using Tag Rank Position and Gradient Boosting (태그 서열 위치와 경사 부스팅을 활용한 한국어 웹 본문 추출)

  • Mo, Jonghoon;Yu, Jae-Myung
    • Journal of KIISE / v.44 no.6 / pp.581-586 / 2017
  • For automatic web scraping, unnecessary components such as menus and advertisements need to be removed from web pages and main contents should be extracted automatically. A content block tends to be located in the middle of a web page. In particular, Korean web documents rarely include metadata and have a complex design; a suitable method of content extraction is therefore needed. Existing content extraction algorithms use the textual and structural features of content blocks because processing visual features requires heavy computation for rendering and image processing. In this paper, we propose a new content extraction method using the tag positions in HTML as a quasi-visual feature. In addition, we develop a tag rank position, a type of tag position not affected by text length, and show that gradient boosting with the tag rank position is a very accurate content extraction method. The result of this paper shows that the content extraction method can be used to collect high-quality text data automatically from various web pages.
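The abstract does not define the tag rank position precisely. One plausible reading, assumed here, is that a tag's position is expressed as its ordinal rank among all start tags in the document, normalized to the range 0 to 1, instead of as a character offset. Under that assumption the feature does not change when the text between tags grows or shrinks. The sketch below uses Python's standard `html.parser`; the class and function names are hypothetical.

```python
from html.parser import HTMLParser


class StartTagCollector(HTMLParser):
    """Record every start tag in document order."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def tag_rank_positions(html):
    """Return (tag, normalized rank) pairs; ranks run from 0.0 to 1.0.

    This normalization is an assumed interpretation of "tag rank position".
    """
    collector = StartTagCollector()
    collector.feed(html)
    collector.close()
    total = len(collector.tags)
    if total <= 1:
        return [(tag, 0.0) for tag in collector.tags]
    return [(tag, rank / (total - 1)) for rank, tag in enumerate(collector.tags)]


if __name__ == "__main__":
    page = (
        "<html><body><div>menu</div><p>"
        + "main text " * 50
        + "</p><div>footer</div></body></html>"
    )
    for tag, position in tag_rank_positions(page):
        print(tag, position)
```

Each block's rank could then be combined with textual and structural features and passed to a gradient boosting classifier (for example, scikit-learn's `GradientBoostingClassifier`) that labels blocks as content or non-content.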

A Study on the Extraction and Integration of Learning Object Meta-data using Web Service of Databases (DBMS의 웹서비스를 이용한 학습객체 메타데이터 추출 및 통합에 관한 연구)

  • Choe, Hyun-Jong
    • Journal of The Korean Association of Information Education / v.7 no.2 / pp.199-206 / 2003
  • XML is emerging as a new development tool for web technology because of its data management capability and its flexibility in data presentation. Accordingly, the reusability and integration of learning objects in web content for computer education, such as text, images, sound, video, and plug-in programs, have been well researched. However, research on storing, extracting, and integrating learning object metadata is needed before an online learning system that integrates and manages such objects can be implemented. This study therefore proposes a new method that uses the web service of a DBMS to extract learning object metadata from a database server located in a 3-tier system. To evaluate the efficiency of the proposed method, a test server and two DBMSs (MS SQL Server 2000 and Oracle 9i) holding 30 metadata entries were implemented, and the response time was measured. The response time was short, but using the method required additional programming with SAX/DOM.


A method for metadata extraction from a collection of records using Named Entity Recognition in Natural Language Processing (자연어 처리의 개체명 인식을 통한 기록집합체의 메타데이터 추출 방안)

  • Chiho Song
    • Journal of Korean Society of Archives and Records Management / v.24 no.2 / pp.65-88 / 2024
  • This pilot study explores a method of extracting metadata values and descriptions from records using named entity recognition (NER), a technique in natural language processing (NLP), a subfield of artificial intelligence. The study focuses on handwritten records from the Guro Industrial Complex, produced during the 1960s and 1970s, comprising approximately 1,200 pages and 80,000 words. After preprocessing the records, which included digitization, the study employed a publicly available language API based on Google's Bidirectional Encoder Representations from Transformers (BERT) language model to recognize named entities within the text. As a result, 173 names of people and 314 names of organizations and institutions were extracted from the Guro Industrial Complex's past records. These extracted entities are expected to serve as direct search terms for accessing the contents of the records. Furthermore, the study identified challenges that arose when applying the theoretical methodology of NLP to real-world records consisting of semi-structured text. It also presents potential solutions and implications to consider when addressing these issues.

Research of Vehicle Navigation Based Video-GIS

  • Feng, Jiang-Fan;Zhu, Guan-Yu;Liu, Zhao-Hong;Li, Yan
    • Journal of Korea Spatial Information System Society / v.11 no.2 / pp.39-44 / 2009
  • To make the effect of a navigation system more direct, the paper proposes a vehicle navigation system based on Video-GIS. It defines a semantic framework whose core is the integration and interaction of video and spatial information; the framework supports full content retrieval based on multimodal metadata extraction and fusion, and supports several wireless access modes. The paper then discusses the requirements of a prototype system, the design and implementation of the framework, and the key ideas and technologies involved. Finally, it points out future research trends.


Product Information Extraction System Based on STEP in CPC Environment (협업적 제품 거래 환경에서 STEP 기반의 제품정보 추출 시스템)

  • Keem, Joon-Hyoung;Park, Sang-Ho;Kim, Hyun
    • Proceedings of the KSME Conference / 2003.11a / pp.1840-1845 / 2003
  • Collaborative product commerce (CPC) supports collaboration in which global enterprises and customers involved in the product life cycle share product information and a collaboration process, and it supports the integration of applications. In this paper, we use a common data schema to solve the interoperability problem of product information shared between enterprises, and we map the enterprises' differing data formats to that common schema. We then implement a CPC Adaptor to integrate distributed product information.


Semantic Indexing for Soccer Videos Using Web-Extracted Information (웹에서 축출된 정보를 이용한 축구 경기의 시맨틱 인덱싱)

  • Hirata, Issao;Kim, Myeong-Hoon;Sull, Sang-Hoon
    • Proceedings of the Korean Information Science Society Conference / 2007.10c / pp.41-45 / 2007
  • The rapid growth of video content production makes it necessary to develop more complex indexing systems that allow efficient searching, retrieval, and presentation of the desired video segments. This paper presents a method for indexing soccer videos through automatic extraction of information from the internet. It defines a metadata structure to formally represent knowledge about soccer matches and provides an automatic method for extracting semantic information from websites. This approach improves the ability to extract more reliable and richer semantic information for soccer videos. Experimental results demonstrate that the proposed method performs efficiently.


Product Information Extraction System Based on STEP in CPC Environment (협업적 제품 거래 환경에서 STEP 기반의 제품정보 추출 시스템)

  • Park, Sang-Ho;Keem, Joon-Hyoung;Kim, Hyun
    • Transactions of the Korean Society of Mechanical Engineers A / v.28 no.5 / pp.648-653 / 2004
  • Collaborative product commerce (CPC) supports collaboration in which global enterprises and customers involved in the product life cycle share product information and a collaboration process, and it supports the integration of applications. In this paper, we use a common data schema to solve the interoperability problem of product information shared between enterprises, and we map the enterprises' differing data formats to that common schema. We then implement a CPC Adaptor to integrate distributed product information.