• Title/Abstract/Keywords: Knowledge-based graph

127 search results (processing time: 0.027 seconds)

Conversion of Large RDF Data using Hash-based ID Mapping Tables with MapReduce Jobs

  • 김인아;이규철
    • 한국정보통신학회:학술대회논문집 / 한국정보통신학회 2021 Fall Conference / pp.236-239 / 2021
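  • With the growth of AI technology, knowledge graphs continue to grow in size. Knowledge graphs are mainly represented in RDF, as sets of linked triples, and many RDF stores convert RDF data into compressed IDs. However, once the RDF data exceeds a certain size, table lookups incur high processing time and memory overhead. This paper proposes a method that performs RDF conversion based on hash ID mapping tables on MapReduce, a distributed parallel framework. The proposed method compresses RDF data into integer-based IDs while shortening processing time and reducing memory overhead. In the experiments, applying the proposed method to about 23 GB of LUBM data reduced its size by a factor of about 3.8 and took about 106 seconds of conversion time.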

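The idea of replacing RDF terms with integer IDs through hash-partitioned mapping tables can be illustrated with a small single-process sketch. This is not the authors' MapReduce implementation: the bucket count, the ID-interleaving scheme, the function names, and the sample triples are all assumptions made for illustration. Each bucket stands in for the partition a reducer would own in a distributed run.

```python
import hashlib
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]


def bucket_of(term: str, num_buckets: int) -> int:
    """Choose a bucket with a stable hash (a stand-in for the shuffle step)."""
    digest = hashlib.md5(term.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_buckets


def build_id_tables(triples: List[Triple], num_buckets: int = 4) -> List[Dict[str, int]]:
    """Build one term -> ID table per bucket; IDs never collide across buckets."""
    tables: List[Dict[str, int]] = [{} for _ in range(num_buckets)]
    for triple in triples:
        for term in triple:
            b = bucket_of(term, num_buckets)
            table = tables[b]
            if term not in table:
                # Bucket b hands out b, b + n, b + 2n, ... so no coordination is needed.
                table[term] = b + len(table) * num_buckets
    return tables


def encode(triples: List[Triple], tables: List[Dict[str, int]]) -> List[Tuple[int, ...]]:
    """Replace every term with its integer ID; only one small table is probed per term."""
    n = len(tables)
    return [tuple(tables[bucket_of(t, n)][t] for t in triple) for triple in triples]


def decode(encoded: List[Tuple[int, ...]], tables: List[Dict[str, int]]) -> List[Tuple[str, ...]]:
    """Invert the mapping to recover the original terms."""
    reverse = {i: term for table in tables for term, i in table.items()}
    return [tuple(reverse[i] for i in triple) for triple in encoded]


# Hypothetical triples in the spirit of LUBM (names are made up).
triples = [
    ("ex:alice", "ex:worksFor", "ex:univ0"),
    ("ex:bob", "ex:worksFor", "ex:univ0"),
    ("ex:alice", "ex:knows", "ex:bob"),
]
tables = build_id_tables(triples)
encoded = encode(triples, tables)
print(encoded)
```

Because a term's bucket is determined by its hash, each lookup touches only one small table rather than a single global one, which is the property that makes this kind of conversion easy to partition across workers.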

Communication Performance of BLE-based IoT Devices and Routers for Tracking Indoor Construction Resources

  • Yoo, Moo-Young;Yoo, Sung Geun;Park, Sangil
    • International Journal of Internet, Broadcasting and Communication / Vol. 11, No. 1 / pp.27-38 / 2019
  • Sensors collect information for Internet of Things (IoT)-based services. However, indoor construction sites have a poor communication environment and many interfering elements that make it difficult to collect sensor information. In this study, a network was constructed between a router and a Bluetooth Low Energy (BLE)-based IoT device built on a serverless IoT framework. This experimental environment was applied to large- and small-scale indoor construction sites. Experiments were performed to test the communication performance of BLE-based IoT devices and routers at indoor construction sites. An analysis of the received signal strength indication (RSSI) graph patterns collected from the communication between the BLE-based IoT devices and routers in different testbed site situations revealed areas with good communication performance and areas with poor communication performance due to interfering factors. The results confirmed that structural components of the building as well as the materials, equipment, and temporary facilities used in indoor construction interfere with the communication performance. Construction project managers will require improved technical knowledge of IoT, such as optimizing the router placement and matching communication between the router and workers, to improve the communication performance for large-scale indoor construction.

Supervised Model for Identifying Differentially Expressed Genes in DNA Microarray Gene Expression Dataset Using Biological Pathway Information

  • Chung, Tae Su;Kim, Keewon;Kim, Ju Han
    • Genomics & Informatics / Vol. 3, No. 1 / pp.30-34 / 2005
  • Microarray technology makes it possible to measure the expression of tens of thousands of genes simultaneously under various experimental conditions. Identifying differentially expressed genes (DEGs) in each single experimental condition is one of the most common first steps in microarray gene expression data analysis. Reasonable choices of thresholds for determining differentially expressed genes are used for the next-step analysis with suitable statistical significance. We present a supervised model for identifying DEGs using pathway information based on the global connectivity structure. Pathway information can be regarded as a collection of biological knowledge; thus we try to determine the optimal threshold so that the resulting connectivity structure is as compatible as possible with the existing pathway information. The significant feature of our model is that it uses established knowledge as a reference to determine the direction of analysis of a microarray dataset. In most previous work, only information intrinsic to the microarray is used for identifying DEGs. We hope that our proposed method can contribute to constructing biologically meaningful structures from microarray datasets.

A Comparative Analysis about Various Editions of Donguibogam

  • 이정현;오준호
    • 한국의사학회지 / Vol. 31, No. 1 / pp.57-70 / 2018
  • Much research has already been done on Donguibogam. However, comparison of specific characters was not done because researchers found it difficult to compare different editions of the text in one place. Recently, important editions have been published on the Internet, making comparison possible. In this paper, the researchers compare eight editions of Donguibogam, including the original edition published in 1613 and seven other editions corrected by the Naeuiwon (Joseon Dynasty National Medical Center). The comparison results were summarized and tabulated, then analyzed and presented in this article as a chart. The result of comparing the characters and the analyzed graph were in agreement. The authors propose that all written and electronic publications of Donguibogam should refer to the other editions implied, quoted, or referenced within the text, include proper citations, and reference the original first edition. Inadequate referencing will pollute future knowledge of this foundational text of Traditional Korean Medicine and may result in the perpetuation of misinformation. Based on accumulated knowledge and study of historical Korean Medicine texts, the Namsan edition made a mistake in the editing process. The year of publication of the Gabsul-yoengyoeng-gegan Edition needs to be studied again and corrections made where appropriate.

A Study on An Enhancement Scheme of Privacy and Anonymity through Convergence of Security Mechanisms in Blockchain Environments

  • 강용혁
    • 한국융합학회논문지 / Vol. 9, No. 11 / pp.75-81 / 2018
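  • Because every transaction in a blockchain is public, anonymity and privacy are becoming increasingly important issues. Public blockchains appear to guarantee anonymity by using public-key addresses in place of user identities, but that anonymity can be weakened by tracing users with various techniques based on the transaction graph. This paper proposes a scheme that converges various security mechanisms to make users hard to trace and thereby protect anonymity and privacy in blockchain environments. The proposed scheme combines k-anonymity, mixing, blind signatures, a multi-phase technique, random selection, and zero-knowledge proofs, and protects anonymity and privacy through incentives and the participation of contributors. Performance analysis shows that, in environments where contributors outnumber colluders, it is difficult to compromise privacy and anonymity through collusion.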

A Practical Study on Didactical Transposition in the Highschool Trigonometric Function for Closer Use of Manipulative, and for More Real, Principle Based

  • 이영하;신정은
    • 대한수학교육학회지:학교수학 / Vol. 11, No. 1 / pp.111-129 / 2009
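  • This study concerns didactical transposition, that is, the transformation of scholarly knowledge into knowledge to be taught, carried out with educational intent. We analyzed the ordering of content and the style of explanation in the trigonometric functions unit of 13 Grade 10-Na mathematics textbooks revised under the 7th national curriculum. Considering pedagogical difficulty, the logical coherence of the exposition, and student understanding together, we examined what difficulties arise in didactical transposition and what alternative transposition methods or practical remedies could address them. To this end, we compared the 13 textbooks with a focus on differences in how they explain the unit, implicitly compared the results with recent theories of teaching methods, and proposed an alternative plan for presenting and teaching the content (centered on circular sectors, graphs of trigonometric functions, their properties, periods, and the law of sines) that we expect to help, at least plausibly, in making new teaching methods easier to apply.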


A Ranking Algorithm for Semantic Web Resources: A Class-oriented Approach

  • 노상규;박현정;박진수
    • Asia pacific journal of information systems / Vol. 17, No. 4 / pp.31-59 / 2007
  • We frequently use search engines to find relevant information on the Web but still end up with too much information. In order to solve this problem of information overload, ranking algorithms have been applied to various domains. As more information becomes available in the future, effectively and efficiently ranking search results will become more critical. In this paper, we propose a ranking algorithm for Semantic Web resources, specifically RDF resources. Traditionally, the importance of a particular Web page is estimated based on the number of keywords found in the page, which is subject to manipulation. In contrast, link analysis methods such as Google's PageRank capitalize on the information inherent in the link structure of the Web graph. PageRank considers a page highly important if it is referred to by many other pages. The degree of importance also increases if the importance of the referring pages is high. Kleinberg's algorithm is another link-structure based ranking algorithm for Web pages. Unlike PageRank, Kleinberg's algorithm utilizes two kinds of scores: the authority score and the hub score. If a page has a high authority score, it is an authority on a given topic and many pages refer to it. A page with a high hub score links to many authoritative pages. As mentioned above, the link-structure based ranking method has been playing an essential role in the World Wide Web (WWW), and its effectiveness and efficiency are now widely recognized. On the other hand, because the Resource Description Framework (RDF) data model forms the foundation of the Semantic Web, any information in the Semantic Web can be expressed as an RDF graph, which makes ranking algorithms for RDF knowledge bases highly important. The RDF graph consists of nodes and directed links, similar to the Web graph. As a result, the link-structure based ranking method seems to be highly applicable to ranking Semantic Web resources. 
However, the information space of the Semantic Web is more complex than that of the WWW. For instance, the WWW can be considered as one huge class, i.e., a collection of Web pages, which has only a single recursive property, i.e., a 'refers to' property corresponding to the hyperlinks. The Semantic Web, however, encompasses various kinds of classes and properties, and consequently, ranking methods used in the WWW should be modified to reflect the complexity of the information space in the Semantic Web. Previous research addressed the ranking problem of query results retrieved from RDF knowledge bases. Mukherjea and Bamba modified Kleinberg's algorithm in order to rank Semantic Web resources. They defined the objectivity score and the subjectivity score of a resource, which correspond to Kleinberg's authority score and hub score, respectively. They concentrated on the diversity of properties and introduced property weights to control the influence of a resource on another resource depending on the characteristic of the property linking the two resources. A node with a high objectivity score is the object of many RDF triples, and a node with a high subjectivity score is the subject of many RDF triples. They developed several kinds of Semantic Web systems in order to validate their technique and showed some experimental results verifying the applicability of their method to the Semantic Web. Despite their efforts, however, there remained some limitations, which they reported in their paper. First, their algorithm is useful only when a Semantic Web system represents most of the knowledge pertaining to a certain domain. In other words, the ratio of links to nodes should be high, or overall resources should be described in detail to a certain degree, for their algorithm to work properly. 
Second, the Tightly-Knit Community (TKC) effect, the phenomenon that pages which are less important yet densely connected have higher scores than ones that are more important but sparsely connected, remains problematic. Third, a resource may have a high score, not because it is actually important, but simply because it is very common and as a consequence has many links pointing to it. In this paper, we examine such ranking problems from a novel perspective and propose a new algorithm which can solve the problems of the previous studies. Our proposed method is based on a class-oriented approach. In contrast to the predicate-oriented approach adopted by the previous research, a user, under our approach, determines the weight of a property by comparing its relative significance to the other properties when evaluating the importance of resources in a specific class. This approach stems from the idea that most queries are supposed to find resources belonging to the same class in the Semantic Web, which consists of many heterogeneous classes in RDF Schema. This approach closely reflects the way that people, in the real world, evaluate something, and will turn out to be superior to the predicate-oriented approach for the Semantic Web. Our proposed algorithm can resolve the TKC effect, and can further shed light on other limitations posed by the previous research. In addition, we propose two ways to incorporate data-type properties, which have not been employed even in cases where they have some significance for resource importance. We designed an experiment to show the effectiveness of our proposed algorithm and the validity of the ranking results, which had not been attempted in previous research. We also conducted a comprehensive mathematical analysis, which was overlooked in previous research. The mathematical analysis enabled us to simplify the calculation procedure. 
Finally, we summarize our experimental results and discuss further research issues.
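The link-analysis background described in this abstract can be made concrete with a short sketch. The code below runs a Kleinberg-style authority/hub iteration over RDF triples, with an optional per-property weight in the spirit of the predicate-oriented prior work the abstract mentions. It is not the paper's class-oriented algorithm; the function names, the weights, and the sample triples are hypothetical.

```python
from math import sqrt
from typing import Dict, List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, property, object)


def normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Scale scores to unit Euclidean length so the iteration stays bounded."""
    norm = sqrt(sum(v * v for v in scores.values())) or 1.0
    return {k: v / norm for k, v in scores.items()}


def weighted_hits(
    triples: List[Triple],
    weights: Optional[Dict[str, float]] = None,
    iterations: int = 50,
) -> Tuple[Dict[str, float], Dict[str, float]]:
    """Return (authority, hub) scores; unlisted properties default to weight 1.0."""
    weights = weights or {}
    nodes = sorted({s for s, _, _ in triples} | {o for _, _, o in triples})
    auth = {n: 1.0 for n in nodes}
    hub = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Authority of a node: weighted sum of hub scores of nodes pointing to it.
        new_auth = {n: 0.0 for n in nodes}
        for s, p, o in triples:
            new_auth[o] += weights.get(p, 1.0) * hub[s]
        # Hub of a node: weighted sum of authority scores of nodes it points to.
        new_hub = {n: 0.0 for n in nodes}
        for s, p, o in triples:
            new_hub[s] += weights.get(p, 1.0) * new_auth[o]
        auth, hub = normalize(new_auth), normalize(new_hub)
    return auth, hub


# Hypothetical RDF-like graph: two papers cite paper3, and one author wrote it.
graph = [
    ("paper1", "cites", "paper3"),
    ("paper2", "cites", "paper3"),
    ("paper1", "cites", "paper2"),
    ("author1", "wrote", "paper3"),
]
authority, hub_score = weighted_hits(graph, weights={"wrote": 0.5})
print(max(authority, key=authority.get), max(hub_score, key=hub_score.get))
```

In the terminology the abstract attributes to Mukherjea and Bamba, the authority score here plays the role of the objectivity score (a node that is the object of many triples) and the hub score plays the role of the subjectivity score.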

A Virtual Battlefield Situation Dataset Generation for Battlefield Analysis based on Artificial Intelligence

  • Cho, Eunji;Jin, Soyeon;Shin, Yukyung;Lee, Woosin
    • 한국컴퓨터정보학회논문지 / Vol. 27, No. 6 / pp.33-42 / 2022
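  • Existing research on intelligent command and control systems answers a commander's questions about the battlefield situation by extracting information from knowledge-base-driven situation data. However, as analyzing intelligence and information reports written in varied natural language in context has become important for situation analysis, research on battlefield situation analysis using artificial intelligence is under way. This paper proposes a method for generating a hypothesis dataset based on simulated battlefield scenarios, in order to provide the dataset needed to develop AI for battlefield situation analysis. The hypothesis dataset is generated by identifying battlefield knowledge elements in simulated scenarios that take real battlefield environments into account. First, when a candidate hypothesis is created, unit hypotheses are generated automatically. The unit hypotheses are combined into similar-identification hypothesis combinations, and related candidate hypotheses are grouped to create set hypotheses. To confirm that a dataset can be generated with the proposed method, we implemented generator software and verified that it can generate the hypothesis dataset.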

Proposal for User-Product Attributes to Enhance Chatbot-Based Personalized Fashion Recommendation Service

  • 안효선;김성훈;최예림
    • 패션비즈니스 / Vol. 27, No. 3 / pp.50-62 / 2023
  • The e-commerce fashion market has experienced remarkable growth, leading to an overwhelming availability of shared information and numerous choices for users. In light of this, chatbots have emerged as a promising technological solution to enhance personalized services in this context. This study aimed to develop user-product attributes for a chatbot-based personalized fashion recommendation service using big data text mining techniques. To accomplish this, over one million consumer reviews from Coupang, an e-commerce platform, were collected and analyzed using frequency analyses to identify the upper-level attributes of users and products. Attribute terms were then assigned to each user-product attribute, including user body shape (body proportion, BMI), user needs (functional, expressive, aesthetic), user TPO (time, place, occasion), product design elements (fit, color, material, detail), product size (label, measurement), and product care (laundry, maintenance). The classification of user-product attributes was found to be applicable to the knowledge graph of the Conversational Path Reasoning model. A testing environment was established to evaluate the usefulness of the attributes based on real e-commerce users and purchased product information. This study is significant in proposing a new research methodology in the field of Fashion Informatics for constructing the knowledge base of a chatbot based on text mining analysis. The proposed research methodology is expected to enhance fashion technology and improve personalized fashion recommendation services and the user experience with chatbots in the e-commerce market.

A Covariance Matrix Estimation Method for Position Uncertainty of the Wheeled Mobile Robot

  • Doh, Nakju Lett;Chung, Wan-Kyun;Youm, Young-Il
    • 제어로봇시스템학회:학술대회논문집 / 제어로봇시스템학회 2003 ICCAS / pp.1933-1938 / 2003
  • A covariance matrix is a tool that expresses the odometry uncertainty of a wheeled mobile robot. The covariance matrix is a key factor in various localization algorithms such as the Kalman filter, topological matching, and so on. However, it is not easy to acquire an accurate covariance matrix because we do not know the real states of the robot. To the best of the authors' knowledge, there seems to be no established result on covariance matrix estimation for odometry. In this paper, we propose a new method which can estimate the covariance matrix from empirical data. It is based on the PC-method and shows good estimation ability. The experimental results validate the performance of the proposed method.

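The abstract above does not describe the PC-method itself, so the following is only a baseline for comparison: the ordinary unbiased sample covariance of end-pose errors collected over repeated runs. It assumes the errors (dx, dy, dtheta) can be measured against some external reference, which is exactly what the abstract says is hard in practice. The function name and the numbers are hypothetical.

```python
from typing import List, Sequence


def sample_covariance(samples: Sequence[Sequence[float]]) -> List[List[float]]:
    """Unbiased sample covariance (divides by n - 1) of equally sized vectors."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    dim = len(samples[0])
    mean = [sum(s[i] for s in samples) / n for i in range(dim)]
    cov = [[0.0] * dim for _ in range(dim)]
    for s in samples:
        d = [s[i] - mean[i] for i in range(dim)]
        for i in range(dim):
            for j in range(dim):
                cov[i][j] += d[i] * d[j]
    return [[c / (n - 1) for c in row] for row in cov]


# Hypothetical end-pose errors (dx [m], dy [m], dtheta [rad]) from five runs.
pose_errors = [
    (0.02, -0.01, 0.004),
    (0.03, 0.00, 0.006),
    (0.01, -0.02, 0.001),
    (0.04, 0.01, 0.007),
    (0.02, -0.01, 0.003),
]
odometry_cov = sample_covariance(pose_errors)
for row in odometry_cov:
    print(row)
```

A matrix estimated this way could be passed to a Kalman filter as the odometry noise term, but it only reflects the specific paths and surfaces sampled.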