• Title/Summary/Keyword: Data annotation


Minimally Supervised Relation Identification from Wikipedia Articles

  • Oh, Heung-Seon;Jung, Yuchul
    • Journal of Information Science Theory and Practice / v.6 no.4 / pp.28-38 / 2018
  • Wikipedia is composed of millions of articles, each of which explains a particular real-world entity in various languages. Since the articles are contributed and edited by a large population of diverse experts with no specific authority, Wikipedia can be seen as a naturally occurring body of human knowledge. In this paper, we propose a method to automatically identify key entities and relations in Wikipedia articles, which can be used for automatic ontology construction. Compared to previous approaches to entity and relation extraction and/or identification from text, our goal is to capture naturally occurring entities and relations from Wikipedia while minimizing the artificiality often introduced when constructing training and testing data. The titles of the articles and anchored phrases in their text are regarded as entities, and their types are automatically classified with minimal training. We attempt to automatically detect and identify possible relations among the entities based on clustering without training data, as opposed to the relation extraction approach, which focuses on improving accuracy in selecting one of several target relations for a given pair of entities. While the relation extraction approach with supervised learning requires a significant amount of annotation effort for a predefined set of relations, our approach attempts to discover relations as they occur naturally. Unlike other unsupervised relation identification work, where automatically identified relations are evaluated against correct relations determined a priori by human judges, we evaluated the appropriateness of the naturally occurring clusters of relations involving person-artifact and person-organization entities and their relation names.

Cascade Network Based Bolt Inspection In High-Speed Train

  • Gu, Xiaodong;Ding, Ji
    • KSII Transactions on Internet and Information Systems (TIIS) / v.15 no.10 / pp.3608-3626 / 2021
  • The detection of bolts is an important task in high-speed train inspection systems, and it is frequently performed to ensure the safety of trains. The difficulty of vision-based bolt inspection lies in small-sample defect detection, which makes an end-to-end network ineffective. In this paper, the problem is resolved in two stages: a detection network followed by cascaded classification networks. For small bolt detection, all bolts, both defective and normal, are annotated together for training, and a new loss function and a new bounding-box selection based on the smallest axis-aligned convex set are proposed. These allow the YOLOv3 network to obtain the accurate position and bounding box of the various bolts. The average precision is greatly improved on PASCAL VOC, MS COCO, and an actual data set. After that, a Siamese network is employed to estimate the status of the bolts. Using the convolutional Siamese network, we are able to get strong results on few-shot classification. Extensive experiments and comparisons on the actual data set show that the system outperforms state-of-the-art algorithms in bolt inspection.

Construction of Training Data and Model Training for YOLOv4-based Factory Operation Safety Management (YOLOv4 기반의 공장 근로자 안전관리를 위한 학습 데이터 구축과 모델 학습)

  • Lee, Taejun;Cho, Minwoo;Song, Jiho;Hwang, Chulhyun;Jung, Heokyung
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2021.05a / pp.252-254 / 2021
  • According to the Institute for Occupational Safety and Health, the number of industrial injuries in 2019 was 109,242, an increase of 6.8% from 2018. In this situation, the government and companies are discussing the development of ICT-based core technologies for preventing on-site safety accidents in the construction field. In this field, technologies using computer vision and artificial intelligence have recently been widely used. In this paper, we built training data for the safety management of factory workers and trained a model based on YOLOv4. We believe this can serve as an initial study for predicting risk situations for workers in factories.


Predicting Learning Achievements with Indicators of Perceived Affordances Based on Different Levels of Content Complexity in Video-based Learning

  • Dasom KIM;Gyeoun JEONG
    • Educational Technology International / v.25 no.1 / pp.27-65 / 2024
  • The purpose of this study was to identify differences in learning patterns according to content complexity in video-based learning environments and to derive variables that have an important effect on learning achievement within particular learning contexts. To achieve our aims, we observed and collected data on learners' cognitive processes through perceived affordances, using behavioral logs and eye movements as specific indicators. These two types of reaction data were collected from 67 male and female university students who watched two learning videos, classified according to their task complexity, through the video learning player. The results showed that when the content complexity level was low, learners tended to navigate using other learners' digital logs, but when it was high, students tended to control the learning process and directly generate their own logs. In addition, using prediction models derived for each content complexity level, we identified the important variables influencing learning achievement in the low content complexity group as those related to video playback and annotation. In comparison, in the high content complexity group, the important variables were related to active navigation of the learning video. This study not only applied novel variables in the field of educational technology but also attempted to provide qualitative observations on the learning process based on a quantitative approach.

The Protostome database (PANM-DB): Version 2.0 release with updated sequences (연체동물 NGS 데이터 분석을 위한 PANM 데이터베이스 업데이트 (Version II))

  • Kang, Se Won;Park, So Young;Patnaik, Bharat Bhusan;Hwang, Hee Ju;Chung, Jong Min;Song, Dae Kwon;Park, Young-Su;Lee, Jun Sang;Han, Yeon Soo;Park, Hong Seog;Lee, Yong Seok
    • The Korean Journal of Malacology / v.32 no.3 / pp.185-188 / 2016
  • PANM-DB (version 1.0) was constructed as a web-based interface for the analysis and annotation of Next-Generation Sequencing (NGS) data of Mollusca, Arthropoda, and Nematoda. The database collected the sequences of Protostomes (Mollusca, Arthropoda, and Nematoda) from the NCBI Taxonomy Browser; these were compiled in multi-FASTA format and stored using the formatdb program. This improved the processing of RNA-seq sequences in terms of speed and hit percentage. PANM-DB has been successfully used for the transcriptome annotation of butterfly, land snail, and other commercial mollusks. We have improved the database by updating it with new sequences; version 2.0 contains a total of 7,571,246 protein sequences (twice as many as version 1.0). Furthermore, the updated version contains the Cephalopoda database. Following these updates, a web interface that runs analyses independently is available, which is an improvement over the mollusk BLAST server. The updated version of PANM-DB will be helpful for the analysis of NGS-based sequencing data of non-model species, especially Mollusca, Arthropoda, and Nematoda.

Gene Co-expression Network Analysis Associated with Acupuncture Treatment of Rheumatoid Arthritis: An Animal Model

  • Ravn, Dea Louise;Mohammadnejad, Afsaneh;Sabaredzovic, Kemal;Li, Weilong;Lund, Jesper;Li, Shuxia;Svendsen, Anders Jorgen;Schwammle, Veit;Tan, Qihua
    • Journal of Acupuncture Research / v.37 no.2 / pp.128-135 / 2020
  • Background: Classical acupuncture is being used in the treatment of rheumatoid arthritis (RA). To explore the biological response to acupuncture, a network-based analysis was performed on gene expression data collected from an animal model of RA treated with acupuncture. Methods: Gene expression data were obtained from published microarray studies on blood samples from rats with collagen-induced arthritis (CIA) and non-CIA rats, both treated with manual acupuncture. Weighted gene co-expression network analysis was performed to identify gene clusters expressed in association with acupuncture treatment time and RA status. Gene ontology and pathway analyses were applied for functional annotation and network visualization. Results: A cluster of 347 genes was identified whose expression was differentially downregulated in association with acupuncture treatment over time, specifically in rats with CIA, with module-RA correlation at 1 hour after acupuncture (-0.27; p < 0.001) and at 34 days after acupuncture (-0.33; p < 0.001). Functional annotation showed highly significant enrichment of porphyrin-containing compound biosynthetic processes (p < 0.001). The network-based analysis also identified a module of 140 genes differentially expressed between CIA and non-CIA rats (p < 0.001). This cluster of genes was enriched for antigen processing and presentation of exogenous peptide antigen (p < 0.001). Other functional gene clusters reported in earlier studies were also observed. Conclusion: The identified gene expression networks and their hub genes could help with understanding the mechanisms involved in the pathogenesis of RA, as well as the effects of acupuncture treatment of RA.

Design and Implementation of Library Information System Using Collective Intelligence and Cloud Computing (집단지성과 클라우드 컴퓨팅을 활용한 도서관 정보시스템 설계 및 구현)

  • Min, Byoung-Won
    • The Journal of the Korea Contents Association / v.11 no.11 / pp.49-61 / 2011
  • Recently, the library has come to be regarded as an integrated knowledge convergence center that can respond to users' various requests for information services. It is therefore necessary to establish a novel information system based on current information and communications technologies. In other words, there is a need to develop mobile information services for portable devices such as smartphones and tablet PCs, and to establish an information system that reflects cloud computing, SaaS, annotation, Library 2.0, and so on. In this paper, we design and implement a library information system using collective intelligence and cloud computing. This information system can adapt to the variety of mobile service paradigms and to the rapidly increasing amount of electronic materials. The advantages of this concept model include resource sharing, multi-tenant support, configuration, and metadata support. In addition, it can offer on-demand software services to users. To test the performance of our system, we perform an effectiveness analysis and a TTA authentication test. The average response time as the data varied was 0.692 seconds, which is very good performance in terms of timing effectiveness. The system also attained maturity level 3 or 4 authentication in TTA tests covering SaaS maturity, performance, and application programs.

Detection Algorithm of Road Damage and Obstacle Based on Joint Deep Learning for Driving Safety (주행 안전을 위한 joint deep learning 기반의 도로 노면 파손 및 장애물 탐지 알고리즘)

  • Shim, Seungbo;Jeong, Jae-Jin
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.20 no.2 / pp.95-111 / 2021
  • As the population decreases in an aging society, the average age of drivers increases. Accordingly, the elderly, who are at high risk of being in an accident, need autonomous-driving vehicles. To secure driving safety on the road, such vehicles require several technologies for responding to various obstacles. Among them, technology is required to recognize static obstacles, such as poor road conditions, as well as dynamic obstacles, such as vehicles, bicycles, and people, that may be encountered while driving. In this study, we propose a deep neural network algorithm capable of simultaneously detecting these two types of obstacles. For this algorithm, we used 1,418 road images and produced annotation data that marks seven categories of dynamic obstacles and labels images to indicate road damage. After training, dynamic obstacles were detected with an average accuracy of 46.22%, and road surface damage was detected with a mean intersection over union of 74.71%. In addition, the average elapsed time required to process a single image is 89 ms, so the algorithm is suitable for personal mobility vehicles, which are slower than ordinary vehicles. In the future, driving safety with personal mobility vehicles is expected to improve through the use of technology that detects road obstacles.

Annotation-guided Code Partitioning Compiler for Homomorphic Encryption Program (지시문을 활용한 동형암호 프로그램 코드 분할 컴파일러)

  • Dongkwan Kim;Yongwoo Lee;Seonyoung Cheon;Heelim Choi;Jaeho Lee;Hoyun Youm;Hanjun Kim
    • The Transactions of the Korea Information Processing Society / v.13 no.7 / pp.291-298 / 2024
  • Despite its wide application, cloud computing raises privacy leakage concerns because users must send their private data to the cloud. Homomorphic encryption (HE) can resolve these concerns by allowing cloud servers to compute on encrypted data without decryption. However, due to the huge computation overhead of HE, simply executing an entire cloud program with HE incurs significant computation cost. Manually partitioning the program and applying HE only to the part that runs in the cloud can reduce the computation overhead. However, manual code partitioning and HE transformation are time-consuming and error-prone. This work proposes a new homomorphic encryption-enabled, annotation-guided code partitioning compiler, called Heapa, for privacy-preserving cloud computing. Heapa allows programmers to annotate the code regions of a program intended for cloud computing. Heapa then analyzes the annotated program, makes a partition plan with a list of the variables that require communication and encryption, and generates a homomorphic encryption-enabled partitioned program. Moreover, Heapa provides not only two region-level partitioning annotations but also two instruction-level annotations, thus enabling fine-grained partitioning and achieving better performance. For six machine learning and deep learning applications, Heapa achieves a 3.61 times geomean performance speedup compared to the non-partitioned cloud computing scheme.

Semantic Representation of Moving Object in Video Data Using Motion Ontology (Motion Ontology를 이용한 비디오내 객체 움직임의 의미표현)

  • Shin, Ju-Hyun;Kim, Pan-Koo
    • Journal of Korea Multimedia Society / v.10 no.1 / pp.117-127 / 2007
  • As the value of multimedia data grows, research on the semantic recognition and retrieval of multimedia information is in strong demand. In this paper, we build a motion ontology and adopt it to represent the meaning of moving objects in video data. Referencing the WordNet structure, we extend its semantic meaning based on a reclassification of motion verbs, which are used to represent the semantic meaning of moving objects. The represented information is recorded in OWL/RDF(S). Because we use ontologies, 'Is-A' and 'Equivalent' reasoning over the data can be expected, and semantic representation of moving objects is possible through ontology-based video annotation. We tested the accuracy of the system against a keyword-based system and obtained an improvement of approximately 10% in system performance.
