• Title/Summary/Keyword: Web Indexing


An Efficient Indexing Scheme Considering the Characteristics of Large Scale RDF Data (대규모 RDF 데이터의 특성을 고려한 효율적인 색인 기법)

  • Kim, Kiyeon;Yoon, Jonghyeon;Kim, Cheonjung;Lim, Jongtae;Bok, Kyoungsoo;Yoo, Jaesoo
    • The Journal of the Korea Contents Association
    • /
    • v.15 no.1
    • /
    • pp.9-23
    • /
    • 2015
  • In this paper, we propose a new RDF index scheme that considers the characteristics of large-scale RDF data to improve query processing performance. The proposed scheme creates an S-O index for subjects and objects, since the subjects and objects of RDF triples are used redundantly. To reduce the total size of the index, it constructs a separate P index for the relatively small number of predicates in RDF triples. If a query contains a predicate, we first search the P index, since it is much smaller than the S-O index; otherwise, we first search the S-O index. Performance evaluation shows that the proposed scheme outperforms the existing scheme in terms of query processing time.
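The two-index routing idea above can be sketched with plain dictionaries; the class and method names here are illustrative, not the authors' implementation, and a real system would use disk-resident structures rather than in-memory sets.

```python
class RDFIndex:
    """Minimal sketch: one shared S-O index plus a separate P index."""

    def __init__(self, triples):
        self.so = {}  # S-O index: subjects and objects share one index
        self.p = {}   # P index: kept separate because predicates are few
        for s, p, o in triples:
            self.so.setdefault(s, set()).add((s, p, o))
            self.so.setdefault(o, set()).add((s, p, o))
            self.p.setdefault(p, set()).add((s, p, o))

    def query(self, s=None, p=None, o=None):
        # If the query contains a predicate, probe the smaller P index
        # first; otherwise start from the S-O index.
        if p is not None:
            cands = self.p.get(p, set())
        elif s is not None:
            cands = self.so.get(s, set())
        else:
            cands = self.so.get(o, set())
        return {t for t in cands
                if (s is None or t[0] == s)
                and (p is None or t[1] == p)
                and (o is None or t[2] == o)}
```

Routing on the predicate first shrinks the candidate set cheaply because the predicate vocabulary is far smaller than the subject/object vocabulary.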

The history of high intensity rainfall estimation methods in New Zealand and the latest High Intensity Rainfall Design System (HIRDS.V3)

  • Horrell, Graeme;Pearson, Charles
    • Proceedings of the Korea Water Resources Association Conference
    • /
    • 2011.05a
    • /
    • pp.16-16
    • /
    • 2011
  • Statistics of extreme rainfall play a vital role in engineering practice from the perspective of mitigation and protection of infrastructure and human life from flooding. While flood frequency assessments based on river flood flow data are preferred, the analysis of rainfall data is often more convenient due to the finer spatial nature of rainfall recording networks, often with longer records, and potentially more easily transferable from site to site. Rainfall frequency analysis as a design tool has developed over the years in New Zealand, from Seelye's daily rainfall frequency maps in 1947 to Thompson's web-based tool in 2010. This paper will present a history of the development of New Zealand rainfall frequency analysis methods, and the details of the latest method, so that comparisons may in future be made with the development of Korean methods. One of the main findings in the development of methods was new knowledge on the distribution of New Zealand rainfall extremes. The High Intensity Rainfall Design System (HIRDS.V3) method (Thompson, 2011) is based upon a regional rainfall frequency analysis with the following assumptions: • an "index flood" rainfall regional frequency method, using the median annual maximum rainfall as the indexing variable; • a regional dimensionless growth curve based on the Generalised Extreme Value (GEV) distribution, with goodness-of-fit tests for the GEV, Gumbel (EV1), and Generalised Logistic (GLO) distributions; • mapping of median annual maximum rainfall and parameters of the regional growth curves, using thin-plate smoothing splines, a 2 km × 2 km grid, L-moment statistics, 10 durations from 10 minutes to 72 hours, and a maximum Average Recurrence Interval of 100 years.
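The index-flood calculation described above can be sketched in a few lines: a design rainfall is the site's median annual maximum (the indexing variable) times a dimensionless GEV growth factor. The function names and parameter values below are illustrative, not HIRDS.V3's actual fitted surfaces.

```python
import math

def gev_quantile(p, mu, sigma, xi):
    """GEV quantile function: x_p = mu + (sigma/xi) * ((-ln p)^(-xi) - 1)."""
    if abs(xi) < 1e-9:  # Gumbel (EV1) limit as xi -> 0
        return mu - sigma * math.log(-math.log(p))
    return mu + (sigma / xi) * ((-math.log(p)) ** (-xi) - 1.0)

def design_rainfall(site_median, mu, sigma, xi, ari):
    """Index-flood estimate: site median scaled by the regional growth factor.

    ari is the Average Recurrence Interval in years; mu, sigma, xi are the
    (hypothetical) parameters of the regional dimensionless growth curve.
    """
    p = 1.0 - 1.0 / ari  # non-exceedance probability for the given ARI
    growth = gev_quantile(p, mu, sigma, xi)
    return site_median * growth
```

In a regional analysis the growth-curve parameters would themselves come from L-moment fits mapped over the grid, which this sketch does not attempt.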


Impacts of the Journal Evaluation Program of the Korean Association of Medical Journal Editors (KAMJE) on the Quality of the Member Journals

  • Yang, Hee-Jin;Oh, Se Jeong;Hong, Sung-Tae
    • Journal of Korean Medical Science
    • /
    • v.33 no.48
    • /
    • pp.305.1-305.5
    • /
    • 2018
  • Background: In 1997 the Korean Association of Medical Journal Editors (KAMJE) instituted a program to evaluate member journals. Journals that passed the initial evaluation were indexed in KoreaMed. Here, we report changes in measures of quality of the KAMJE member journals during the last 20 years. Methods: Quality measures used in the study comprised three assessment categories: self-assessment by journal editors, assessment of the journals by KAMJE reviewers, and assessment by Korean health science librarians. Each used detailed criteria to score the journals on a scale of 0 to 5 or 6 in multiple dimensions. We compared scores at baseline evaluation and those after 7 years for 129 journals, and compared improvements in journals indexed vs. not indexed by the Web of Science (Science Citation Index Expanded; SCIE). Results: Among 251 KAMJE member journals at the end of 2015, 227 passed evaluation criteria and 129 (56%) had both baseline and 7-year follow-up assessment data. The journals showed improvement overall (increase in median [interquartile range; IQR] score from baseline, 0.47 [0.64]; 95% confidence interval [CI], 0.44-0.61; P < 0.001) and within each category (median [IQR] increase by editors' assessment, 0.17 [0.83]; 95% CI, 0.04-0.26; P = 0.007; by reviewers', 0.45 [1.00]; 95% CI, 0.29-0.57; P < 0.001; by librarians', 1.75 [1.08]; 95% CI, 1.77-2.18; P < 0.001). Before the foundation of KAMJE in 1996, there were only 5 Korean medical journals indexed in MEDLINE and none in SCIE, but 24 journals were indexed in MEDLINE and 34 in SCIE by 2016. Conclusion: The KAMJE journal evaluation program has successfully contributed to improving the quality of the member journals.

WordNet-Based Category Utility Approach for Author Name Disambiguation (저자명 모호성 해결을 위한 개념망 기반 카테고리 유틸리티)

  • Kim, Je-Min;Park, Young-Tack
    • The KIPS Transactions:PartB
    • /
    • v.16B no.3
    • /
    • pp.225-232
    • /
    • 2009
  • Author name disambiguation is essential for improving the performance of document indexing, retrieval, and web search. Author name disambiguation resolves the conflict that arises when multiple authors share the same name label. This paper introduces a novel approach that exploits ontologies and WordNet-based category utility for author name disambiguation. Our method utilizes author knowledge in the form of a populated ontology that uses various types of properties: titles, abstracts and co-authors of papers, and authors' affiliations. The author ontology has been constructed semi-automatically in the artificial intelligence and semantic web areas using the OWL API and heuristics. Author name disambiguation determines the correct author from various candidate authors in the populated author ontology. Candidate authors are evaluated using the proposed WordNet-based category utility to resolve ambiguity. Category utility is a tradeoff between intra-class similarity and inter-class dissimilarity of author instances, where author instances are described in terms of attribute-value pairs. The WordNet-based category utility is proposed to exploit concept information in WordNet for semantic analysis in disambiguation. In experiments, the WordNet-based category utility increased the number of correctly resolved names by about 10% compared with plain category utility, and achieved an overall accuracy of around 98%.
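The category-utility tradeoff mentioned above (intra-class similarity vs. inter-class dissimilarity over attribute-value pairs) is the classical COBWEB measure; a minimal sketch follows, without the WordNet concept-expansion step the paper adds, and with illustrative attribute names.

```python
from collections import Counter

def category_utility(clusters):
    """Category utility of a partition of instances into clusters.

    clusters: list of clusters; each cluster is a list of instances,
    each instance a dict of attribute -> value pairs.
    CU = (1/k) * sum_c P(c) * [ sum P(A=v|c)^2 - sum P(A=v)^2 ]
    """
    all_inst = [inst for c in clusters for inst in c]
    n = len(all_inst)
    base = Counter((a, v) for inst in all_inst for a, v in inst.items())
    base_sq = sum((cnt / n) ** 2 for cnt in base.values())
    cu = 0.0
    for c in clusters:
        within = Counter((a, v) for inst in c for a, v in inst.items())
        # intra-class predictability minus the baseline expectation
        gain = sum((cnt / len(c)) ** 2 for cnt in within.values()) - base_sq
        cu += (len(c) / n) * gain
    return cu / len(clusters)
```

A partition that groups author instances with matching attributes scores higher than a single mixed cluster, which is what makes the measure usable for choosing among candidate authors.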

k-Interest Places Search Algorithm for Location Search Map Service (위치 검색 지도 서비스를 위한 k관심지역 검색 기법)

  • Cho, Sunghwan;Lee, Gyoungju;Yu, Kiyun
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography
    • /
    • v.31 no.4
    • /
    • pp.259-267
    • /
    • 2013
  • GIS-based web map services are increasingly accessible to the public. Among them, location query services are the most frequently used, but they are currently restricted to single-keyword searches. Although demand is increasing for services that query multiple keywords corresponding to sequential activities (banking, having lunch, watching a movie, and so on) at various POI locations, no such service is yet provided. The objective of this paper is to develop the k-IPS algorithm for quickly and accurately querying multiple POIs that internet users input and locating the search outcomes on a web map. The algorithm utilizes the hierarchical tree structure of the $R^*$-tree indexing technique to produce overlapped geometric regions. A recursive $R^*$-tree index-based spatial join process improves the performance of the current spatial join operation. The performance of the algorithm is tested with spatial queries of 2, 3, and 4 POIs selected from a set of 159 keywords. About 90% of the test outcomes are produced within 0.1 second. The algorithm proposed in this paper is expected to be utilized for a variety of location-based query services, for which demand is increasing, to conveniently support citizens' daily activities.
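The query the k-IPS algorithm answers, one POI per keyword with the combination kept spatially compact, can be illustrated with a naive enumeration over candidate sets; the real method prunes this search with a recursive $R^*$-tree spatial join, which this sketch (with made-up POI data) does not reproduce.

```python
import math
from itertools import product

def k_interest_places(poi_sets, max_span):
    """Naive stand-in for a k-POI query.

    poi_sets: one list of (name, x, y) candidates per query keyword.
    Returns combinations (one POI per keyword) whose bounding-box
    diagonal is at most max_span, sorted by compactness.
    """
    results = []
    for combo in product(*poi_sets):
        xs = [p[1] for p in combo]
        ys = [p[2] for p in combo]
        diag = math.hypot(max(xs) - min(xs), max(ys) - min(ys))
        if diag <= max_span:
            results.append((combo, diag))
    return sorted(results, key=lambda r: r[1])
```

Enumerating the cross product is exponential in the number of keywords; the point of the $R^*$-tree join in the paper is to discard whole subtrees whose bounding rectangles cannot meet the span constraint.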

A Study of the Curriculum Operating Model and Standard Courses for Library & Information Science in Korea (한국문헌정보학 교과과정 운영모형 및 표준교과목 개발에 관한 연구)

  • Noh, Young-Hee;Ahn, in-Ja;Choi, Sang-Ki
    • Journal of the Korean Society for Library and Information Science
    • /
    • v.46 no.2
    • /
    • pp.55-82
    • /
    • 2012
  • This study seeks to develop a curriculum operating model for Korean Library and Information Science, based on investigations into LIS curricula at home and abroad. Standard courses that can be applied to this model were also proposed. This study comprehensively analyzed the contents of domestic and foreign curricula and surveyed current librarians in all types of library fields. As a result, this study proposed required courses, core courses, and elective courses. Six required LIS courses are: Introduction to Library and Information Science, Information Organization, Information Services, Library and Information Center Management, Information Retrieval, and Field Work. Six core LIS courses are: Classification & Cataloging Practice, Subject Information Resources, Collection Development, Digital Library, Introduction to Bibliography, and Introduction to Archive Management. Twenty elective LIS courses include: the General Library and Information Science area (Cultural History of Information, Information Society and Library, Library and Copyright, Research Methods in Library and Information Science), the Information Organization area (Metadata Fundamentals, KORMARC Practice), the Information Services area (Information Literacy Instruction, Reading Guidance, Information User Study), the Library and Information Center Management area (Library Management, including management for different kinds of libraries, Library Information Cooperator, Library Marketing, Non-book Material and Multimedia Management (Contents Management)), the Information Science area (Database Management, including Web DB Management, Indexing and Abstracting, Introduction to Information Science, Understanding Information Science, Automated System of Library, Library Information Network), and the Archival Science area (Preservation Management).

Soccer Video Highlight Building Algorithm using Structural Characteristics of Broadcasted Sports Video (스포츠 중계 방송의 구조적 특성을 이용한 축구동영상 하이라이트 생성 알고리즘)

  • 김재홍;낭종호;하명환;정병희;김경수
    • Journal of KIISE:Software and Applications
    • /
    • v.30 no.7_8
    • /
    • pp.727-743
    • /
    • 2003
  • This paper proposes an automatic highlight-building algorithm for soccer video that uses a structural characteristic of broadcasted sports video: an interesting (or important) event (such as a goal or foul) is followed by a replay shot surrounded by gradual shot-change effects like wipes. This shot-editing rule is used in this paper to analyze the structure of broadcasted soccer video and extract shots involving important events to build a highlight. The method first uses the spatio-temporal image of the video to detect wipe transition effects and zoom-out/in shot changes, which are then used to detect the replay shots. However, using the spatio-temporal image alone to detect wipe transitions requires too many computational resources, and the algorithm must be changed whenever the wipe pattern changes. To solve these problems, a two-pass detection algorithm and a pixel sub-sampling technique are proposed in this paper. Furthermore, to detect zoom-out/in shot changes and replay shots more precisely, the green-area ratio and motion energy are also computed in the proposed scheme. Finally, highlight shots composed of event and player shots are extracted using the pre-detected replay shots and zoom-out/in shot-change points. The proposed algorithm will be useful for web services or broadcasting services that require abstracted soccer video.
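The two-pass idea with pixel sub-sampling generalizes beyond wipes: scan cheaply on a sparse pixel subset to collect candidate transition points, then re-check only those candidates at full resolution. The toy detector below illustrates that structure on frames represented as flat intensity lists; it detects abrupt changes, not the paper's wipe-specific spatio-temporal patterns, and all thresholds are made up.

```python
def detect_transitions(frames, coarse_step=4, threshold=30):
    """Two-pass transition detection with pixel sub-sampling.

    frames: list of frames, each a flat list of pixel intensities.
    Pass 1 compares only every coarse_step-th pixel to cheaply collect
    candidate transition points; pass 2 re-checks them on all pixels.
    """
    def diff(a, b, step):
        idx = range(0, len(a), step)
        return sum(abs(a[i] - b[i]) for i in idx) / len(idx)

    candidates = [t for t in range(1, len(frames))
                  if diff(frames[t - 1], frames[t], coarse_step) > threshold]
    return [t for t in candidates
            if diff(frames[t - 1], frames[t], 1) > threshold]
```

Pass 1 touches only 1/coarse_step of the pixels, so most of the video is rejected at a fraction of the full cost, which is the saving the paper is after.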

A Feature-Based Word Spotting for Content-Based Retrieval of Machine-Printed English Document Images (내용기반의 인쇄체 영문 문서 영상 검색을 위한 특징 기반 단어 검색)

  • Jeong, Gyu-Sik;Gwon, Hui-Ung
    • Journal of KIISE:Software and Applications
    • /
    • v.26 no.10
    • /
    • pp.1204-1218
    • /
    • 1999
  • Most existing digital libraries for document image retrieval provide a limited retrieval service due to their indexing from document titles and/or the content of document abstracts. This paper proposes a word spotting system for full English document image retrieval based on word image shape features. In order to improve not only the efficiency but also the precision of a retrieval system, we develop the system by 1) using a combination of the holistic features that have been used in existing word spotting systems, 2) performing image matching by comparing the order of features in a word in addition to the number of features and their positions, and 3) adopting a two-stage retrieval strategy that obtains retrieval results by image feature matching and then partly applies OCR (Optical Character Recognition) to the results for filtering purposes. The proposed system operates as follows: given a document image, its structure is analyzed and segmented into a set of word regions. Then, word shape features are extracted and stored. Given a user's text query, features are extracted after its corresponding word image is generated. This reference model is compared with the stored features to find similar words. The proposed system was implemented on an IBM-PC in a web environment, and experiments were performed with English document images. Experimental results show the effectiveness of the proposed methods.
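Matching on the *order* of features, not just their count and positions, is naturally expressed as sequence comparison. The sketch below uses edit distance over ordered feature codes; the codes ('a' for ascender, 'd' for descender, 'o' for hole) and the tolerance are illustrative, and the paper's second-stage OCR filter is not reproduced.

```python
def feature_match(query_feats, word_feats, max_dist=1):
    """Match two ordered feature strings by edit distance.

    query_feats / word_feats: feature codes extracted left-to-right
    from a word image (e.g. 'ado' = ascender, descender, hole).
    Returns True if the sequences differ by at most max_dist edits.
    """
    m, n = len(query_feats), len(word_feats)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i
    for j in range(n + 1):
        dp[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if query_feats[i - 1] == word_feats[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,      # deletion
                           dp[i][j - 1] + 1,      # insertion
                           dp[i - 1][j - 1] + cost)  # substitution
    return dp[m][n] <= max_dist
```

Two words with the same multiset of features but a different left-to-right order fail to match, which is exactly the discrimination the order-aware comparison buys over counting features alone.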

An Enhancing Technique for Scan Performance of a Skip List with MVCC (MVCC 지원 스킵 리스트의 범위 탐색 향상 기법)

  • Kim, Leeju;Lee, Eunji
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.20 no.5
    • /
    • pp.107-112
    • /
    • 2020
  • Recently, unstructured data has been rapidly produced by web-based services. NoSQL systems and key-value stores that process unstructured data as key-value pairs are widely used in various applications. In this paper, a study was conducted on the skip list used for in-memory data management in an LSM-tree based key-value store. The skip list used in the key-value store is an insertion-based skip list that does not allow overwriting and processes all changes only by insertion. This behavior can support Multi-Version Concurrency Control (MVCC), which can simultaneously process multiple read/write requests through snapshot isolation. However, since duplicate keys exist in the skip list, performance significantly degrades due to unnecessary node visits during a list traversal. In particular, serious overhead occurs for range queries and scan operations that search a specific range of data in bulk. This paper proposes a newly designed Stride SkipList to reduce this overhead. The Stride SkipList additionally maintains an indexing pointer to the last node of the same key to avoid unnecessary node visits. The proposed scheme is implemented using RocksDB's in-memory component, and the performance evaluation shows that scan performance improves by up to 350 times compared to the existing skip list for various workloads.
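The stride-pointer idea can be shown on a single-level sorted linked list (a real skip list adds the probabilistic tower levels on top, and the paper works inside RocksDB; neither is reproduced here). Each key's newest version sits first in its run and carries a pointer to the run's last, oldest node, so a scan hops over stale versions in one step. All names are illustrative.

```python
class Node:
    def __init__(self, key, value, version):
        self.key, self.value, self.version = key, value, version
        self.next = None
        self.last_same_key = self  # stride pointer: last node of this key's run

class StrideList:
    """Insert-only versioned list with stride pointers over duplicate keys.

    Assumes versions of a key are inserted in increasing order, so the
    newest version is always placed at the head of its key's run.
    """
    def __init__(self):
        self.head = None

    def insert(self, key, value, version):
        node = Node(key, value, version)
        prev, cur = None, self.head
        while cur and cur.key < key:
            prev, cur = cur, cur.next
        node.next = cur
        if cur and cur.key == key:
            # inherit the stride pointer to the run's last (oldest) node
            node.last_same_key = cur.last_same_key
        if prev:
            prev.next = node
        else:
            self.head = node

    def scan(self, lo, hi):
        out, cur = [], self.head
        while cur and cur.key < lo:
            cur = cur.next
        while cur and cur.key <= hi:
            out.append((cur.key, cur.value))   # newest version of this key
            cur = cur.last_same_key.next       # stride over stale versions
        return out
```

Without the stride pointer the scan would visit every stale version of every key in range; with it, the cost depends only on the number of distinct keys.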

Dynamic Management of Equi-Join Results for Multi-Keyword Searches (다중 키워드 검색에 적합한 동등조인 연산 결과의 동적 관리 기법)

  • Lim, Sung-Chae
    • The KIPS Transactions:PartA
    • /
    • v.17A no.5
    • /
    • pp.229-236
    • /
    • 2010
  • With an increasing number of documents on the Internet or in enterprises, it becomes crucial to efficiently support users' queries on those documents. In that situation, the full-text search technique is generally adopted, because it can answer uncontrolled ad-hoc queries by automatically indexing all the keywords found in the documents. The size of index files made for full-text search grows with the number of indexed documents, and thus the disk cost may become too large to process multi-keyword queries against those enlarged index files. To solve this problem, we propose both an index file structure and a management scheme suitable for processing multi-keyword queries against a large volume of index files. For this, we adopt the structure of inverted files, which are widely used in multi-keyword search, as the basic index structure, and modify it into a hierarchical structure for the join and ranking operations performed during query processing. To save disk costs on top of that index structure, we dynamically cache in main memory the results of join operations between two keywords that are highly likely to appear together in users' queries. We also present performance comparisons using a disk cost model to show the performance advantage of the proposed scheme.
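The core of the proposal, an inverted index whose two-keyword equi-join (posting-list intersection) results are kept in memory for likely keyword pairs, can be sketched as follows; the cache policy here (memoize every pair on first use) is a simplification of the paper's prediction-based management, and all names are illustrative.

```python
class InvertedIndex:
    """Inverted index with an in-memory cache of two-keyword join results."""

    def __init__(self):
        self.postings = {}    # keyword -> set of doc ids (on disk in practice)
        self.join_cache = {}  # (kw1, kw2) -> cached intersection

    def add(self, doc_id, words):
        for w in set(words):
            self.postings.setdefault(w, set()).add(doc_id)

    def query(self, w1, w2):
        # Normalize the pair so (a, b) and (b, a) share one cache entry.
        pair = tuple(sorted((w1, w2)))
        if pair not in self.join_cache:
            # Cache miss: perform the equi-join (set intersection) once;
            # in the paper this is the disk-heavy step worth avoiding.
            self.join_cache[pair] = (self.postings.get(w1, set())
                                     & self.postings.get(w2, set()))
        return self.join_cache[pair]
```

On a repeated two-keyword query the cached intersection is returned without touching the posting lists again, which is where the disk savings come from when postings live on disk.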