• Title/Abstract/Keywords: Semantic Technology

Search results: 950 items (processing time: 0.031 s)

A Protein-Protein Interaction Extraction Approach Based on Large Pre-trained Language Model and Adversarial Training

  • Tang, Zhan;Guo, Xuchao;Bai, Zhao;Diao, Lei;Lu, Shuhan;Li, Lin
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol. 16, No. 3
    • /
    • pp.771-791
    • /
    • 2022
  • Protein-protein interaction (PPI) extraction from original text is important for revealing the molecular mechanism of biological processes. With the rapid growth of biomedical literature, manually extracting PPI has become more time-consuming and laborious. Therefore, the automatic PPI extraction from the raw literature through natural language processing technology has attracted the attention of the majority of researchers. We propose a PPI extraction model based on the large pre-trained language model and adversarial training. It enhances the learning of semantic and syntactic features using BioBERT pre-trained weights, which are built on large-scale domain corpora, and adversarial perturbations are applied to the embedding layer to improve the robustness of the model. Experimental results showed that the proposed model achieved the highest F1 scores (83.93% and 90.31%) on two corpora with large sample sizes, namely, AIMed and BioInfer, respectively, compared with the previous method. It also achieved comparable performance on three corpora with small sample sizes, namely, HPRD50, IEPA, and LLL.
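The idea of perturbing the embedding layer can be sketched as follows. The abstract does not say which adversarial method the authors use, so this sketch assumes an FGM-style perturbation (the loss gradient scaled to a fixed L2 norm). A toy linear classifier stands in for BioBERT, and all names and numbers are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def loss_and_grad(embedding, weights, label):
    """Binary cross-entropy of a linear classifier, plus its gradient w.r.t. the embedding."""
    z = sum(w * e for w, e in zip(weights, embedding))
    p = sigmoid(z)
    loss = -(label * math.log(p) + (1 - label) * math.log(1 - p))
    grad = [(p - label) * w for w in weights]  # d(loss)/d(embedding)
    return loss, grad

def fgm_perturbation(grad, epsilon):
    """Scale the gradient to L2 norm epsilon (assumed FGM-style rule, not confirmed by the source)."""
    norm = math.sqrt(sum(g * g for g in grad))
    if norm == 0.0:
        return [0.0] * len(grad)
    return [epsilon * g / norm for g in grad]

# Hypothetical toy values standing in for a token embedding and classifier weights.
embedding = [0.2, -0.1, 0.4]
weights = [0.5, -0.3, 0.8]
label = 1

clean_loss, grad = loss_and_grad(embedding, weights, label)
delta = fgm_perturbation(grad, epsilon=0.1)
adv_embedding = [e + d for e, d in zip(embedding, delta)]
adv_loss, _ = loss_and_grad(adv_embedding, weights, label)

# In adversarial training, the model would be updated on both losses.
print(f"clean loss={clean_loss:.4f}, adversarial loss={adv_loss:.4f}")
```

The perturbation moves the embedding in the direction that increases the loss, so training on the perturbed input as well is what is expected to improve robustness.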

Gen-Z memory pool system implementation and performance measurement

  • Kwon, Won-ok;Sok, Song-Woo;Park, Chan-ho;Oh, Myeong-Hoon;Hong, Seokbin
    • ETRI Journal
    • /
    • Vol. 44, No. 3
    • /
    • pp.450-461
    • /
    • 2022
  • The Gen-Z protocol is a memory semantic protocol between the memory and CPU used in computer architectures with large memory pools. This study presents the implementation of a Gen-Z hardware system configured according to Gen-Z specification 1.0 and reports its performance. A hardware prototype of a DDR4 Gen-Z memory pool was designed, together with optimized character and block device drivers and a file system for the Gen-Z hardware. The Gen-Z IP was targeted to an FPGA, and a 512 GB Gen-Z memory pool was configured on an x86 server. In the experiments, the latency and throughput of the Gen-Z memory were measured and compared with those of the local memory, SATA SSD, and NVMe using character or block device interfaces. The Gen-Z hardware exhibited superior throughput and latency performance compared with SATA SSD and NVMe at block sizes under 4 kB. The MySQL and File IO benchmarks of Gen-Z showed good write performance across all block sizes and thread counts. It also showed low latency in RocksDB's fillseq dbbench using the ext4 direct access filesystem.

Civil legal relations in the context of adaptation of civil legislation to the legislation of the EU countries in the digital age

  • Kizlova, Olena;Safonchyk, Oksana;Hlyniana, Kateryna;Mazurenko, Svetlana
    • International Journal of Computer Science & Network Security
    • /
    • Vol. 21, No. 12spc
    • /
    • pp.521-525
    • /
    • 2021
  • An essential area is the creation of a single digital market between the EU and Ukraine through information technology. Purpose: to investigate and analyze civil law relations in the field of adaptation of Ukrainian civil law to civil law regulations of the EU. The object of research: Ukrainian civil law and civil law of the EU. The subject of the study is civil law in the context of adaptation of civil law to the legislation of the EU. The following methods of scientific cognition were used during the research: semantic, historical, comparison, analysis and synthesis, generalization. The results of the study show that the harmonization of the legal system of Ukraine with EU law is caused by several goals: successful integration of Ukraine into the EU, legal reforms based on the positive example of EU countries, promoting access of Ukrainian enterprises to the EU market; attracting foreign investment, increasing the welfare of Ukrainian citizens. The adaptation includes three stages, the final of which is the preparation of an expanded program of harmonization of Ukrainian legislation with EU legislation. In the process of adaptation, it is important to take into account the legal history, tradition, features and mentality of Ukraine and before borrowing legal structures to analyze the feasibility of their application in the Ukrainian legal field.

Google Play Malware Detection based on Search Rank Fraud Approach

  • Fareena, N;Yogesh, C;Selvakumar, K;Sai Ramesh, L
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol. 16, No. 11
    • /
    • pp.3723-3737
    • /
    • 2022
  • Google Play is one of the largest Android app markets and contains both free and paid apps. It offers a variety of categories for target users with different needs and purposes. Customers rate each app based on their experience, and an app's position in the search results varies with its average rating. Fraudulent behaviors emerge in apps that engage in search rank abuse and malware proliferation. To detect such behavior, a novel framework is designed that finds and uses the traces left behind by fraudsters to identify both malware and apps subjected to search rank fraud. The approach correlates review activities and uniquely combines the detected review relations with semantic and behavioral signals derived from Google Play app data to identify suspicious apps. The proposed model achieves 90% precision in classifying the collected datasets of malware, fake, and legitimate apps. It finds many fraudulent apps that currently evade Google Bouncer's detection technology. It also helps discover fake reviews by using reviewer relationships and the number of reviews coerced into being positive for each reviewed Google Play Android app.

Transformer-based Language Recognition Technique for Big Data

  • 황치곤;윤창표;이수욱
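    • Korea Institute of Information and Communication Engineering: Conference Proceedings
    • /
    • 2022 Fall Conference of the Korea Institute of Information and Communication Engineering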
    • /
    • pp.267-268
    • /
    • 2022
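  • Recently, big data analysis can draw on a variety of techniques that have emerged with advances in machine learning. Big data collected in the real world lacks automated refinement techniques for identical or similar terms based on semantic analysis of the relationships between words. Big data usually takes the form of sentences, which calls for morphological analysis and sentence understanding. NLP, a technique for analyzing natural language, can capture the relationships between words and understand sentences. In this paper, we study the strengths and weaknesses of applying to big data the Transformer and the Reformer, techniques that compensate for the shortcomings of the RNN, a time-series approach.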

A BIM-based Automated Framework for Formwork Planning on Construction Sites

  • Xu, Maozeng;Mei, Zhongya;Tan, Yi
    • International Conference Proceedings
    • /
    • The 7th International Conference on Construction Engineering and Project Management Summit Forum on Sustainable Construction and Management
    • /
    • pp.52-61
    • /
    • 2017
  • Considering its significant impact on the cost and schedule of construction projects, formwork as one part of temporary facility categories in construction should be arranged precisely. Current practice in the formwork planning is often conducted manually and repetitively, causing low efficiency and time waste. This study proposes an automated framework to generate more accurate and detailed formwork plans by utilizing information from building information modeling (BIM) considering the adequate geometric and semantic information provided by the BIM model. The dimensions and quantities information of elements in a building can be extracted automatically. Then, a rule is prepared for calculating the required forms erected around elements based on the contact areas. Finally, an algorithm of integrating first fit decreasing (FFD) with coordinated bottom left (CBL) is applied to automatically generate the formwork plan. The BIM-based automated planning framework is demonstrated by an illustrative example. The results show that the proposed framework can generate the formwork plan accurately and automatically, and significantly improve the efficiency in the formwork plan and reuse.
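As a rough illustration of the first fit decreasing (FFD) step named in the abstract, the sketch below packs required form lengths into stock panels of a fixed length. It is a generic one-dimensional FFD, not the authors' algorithm: the coordinated bottom left (CBL) placement and the contact-area rule are not reproduced, and the function name, units, and sample lengths are hypothetical.

```python
def first_fit_decreasing(lengths, panel_length):
    """Pack lengths into as few panels as the FFD heuristic finds.

    Items are sorted from longest to shortest, and each item goes into
    the first panel that still has room; a new panel is opened otherwise.
    """
    panels = []
    for length in sorted(lengths, reverse=True):
        if length > panel_length:
            raise ValueError(f"item {length} exceeds panel length {panel_length}")
        for panel in panels:
            if sum(panel) + length <= panel_length:
                panel.append(length)
                break
        else:
            # No existing panel had room, so open a new one.
            panels.append([length])
    return panels

# Hypothetical required form lengths in millimetres and a 2400 mm stock panel.
required = [1200, 800, 2000, 500, 1500, 700]
plan = first_fit_decreasing(required, panel_length=2400)
print(plan)  # [[2000], [1500, 800], [1200, 700, 500]]
```

FFD is a heuristic, so it does not guarantee the minimum number of panels in general, although in this small example three panels is also the lower bound (6700 mm of forms against 2400 mm per panel).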

A Study on the Perception of Grand Canal Heritage Visitors Based on Web Text Analysis: The Pingjiang Historical and Cultural District of Suzhou City as an Example

  • 중청강;징치웨이;남경현
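    • Korea Society of Computer and Information: Conference Proceedings
    • /
    • Proceedings of the 67th Winter Conference of the Korea Society of Computer and Information, 2023, Vol. 31, No. 1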
    • /
    • pp.437-438
    • /
    • 2023
  • This paper takes the Pingjiang historical and cultural district of Suzhou city as an example, collects 1439 visitor review data from Ctrip.com with the help of Python technology, and uses web text analysis to conduct research on high-frequency words, semantic networks and emotional tendencies to comprehensively assess the tourist perception of the Grand Canal heritage. The study found that: natural and humanistic landscape, historical and cultural accumulation, and the style of Jiangnan Canal are fully reflected in the tourists' perception of Pingjiang historical and cultural district; tourists hold strong positive emotion towards Pingjiang Road, however, there is still more room for renovation and improvement of the historical and cultural district. Finally, countermeasure suggestions for improving the tourist perception of the Grand Canal heritage are given in terms of protection first, cultural integration and innovative utilization.

Multidimensional Analysis of Consumers' Opinions from Online Product Reviews

  • Taewook Kim;Dong Sung Kim;Donghyun Kim;Jong Woo Kim
    • Asia pacific journal of information systems
    • /
    • Vol. 29, No. 4
    • /
    • pp.838-855
    • /
    • 2019
  • Online product reviews are a vital source for companies in that they contain consumers' opinions of products. The earlier methods of opinion mining, which involve drawing semantic information from text, have been mostly applied in one dimension. This is not sufficient in itself to elicit reviewers' comprehensive views on products. In this paper, we propose a novel approach in opinion mining by projecting online consumers' reviews in a multidimensional framework to improve review interpretation of products. First, we set up a new framework consisting of six dimensions based on a marketing management theory. To calculate the distances between review sentences and each dimension, we embed the words in reviews using Google's pre-trained word2vec model. We classified each sentence of the reviews into the respective dimensions of our new framework. After the classification, we measured the sentiment degrees for each sentence. The results were plotted using a radar graph in which the axes are the dimensions of the framework. We tested the strategy on Amazon product reviews of the iPhone and Galaxy smartphone series with a total of around 21,000 sentences. The results showed that the radar graphs visually reflected several issues associated with the products. The proposed method is not limited to specific product categories; it can be applied to opinion mining on reviews of any product category.
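The sentence-to-dimension step can be sketched as a nearest-dimension lookup over averaged word vectors. The abstract does not name the six dimensions or the distance measure, so everything here is an assumption: cosine similarity is used, the three dimension names and seed words are hypothetical, and tiny hand-made vectors stand in for the pre-trained word2vec embeddings.

```python
import math

# Hypothetical 3-dimensional toy vectors standing in for pre-trained word embeddings.
TOY_VECTORS = {
    "cheap": [0.9, 0.1, 0.0],
    "price": [1.0, 0.0, 0.0],
    "screen": [0.0, 0.9, 0.2],
    "display": [0.0, 1.0, 0.0],
    "battery": [0.1, 0.1, 0.9],
    "power": [0.0, 0.0, 1.0],
}

# Hypothetical dimensions, each described by seed words (not the paper's framework).
DIMENSION_SEEDS = {
    "cost": ["price"],
    "design": ["display"],
    "performance": ["power"],
}

def mean_vector(words):
    """Average the vectors of known words; return None if no word is known."""
    vectors = [TOY_VECTORS[w] for w in words if w in TOY_VECTORS]
    if not vectors:
        return None
    return [sum(component) / len(vectors) for component in zip(*vectors)]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def classify(sentence):
    """Assign a sentence to the dimension whose seed vector is most similar."""
    sentence_vec = mean_vector(sentence.lower().split())
    if sentence_vec is None:
        return None
    scores = {
        dim: cosine(sentence_vec, mean_vector(seeds))
        for dim, seeds in DIMENSION_SEEDS.items()
    }
    return max(scores, key=scores.get)

print(classify("the screen is great"))
print(classify("cheap battery"))
```

With real embeddings, the per-dimension sentiment scores computed afterwards would give the values plotted on each axis of the radar graph.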

Development of National R&D Information Navigation System Based on Information Filtering and Visualization

  • 이병희;손강렬
    • The Journal of the Korea Contents Association
    • /
    • Vol. 14, No. 4
    • /
    • pp.418-424
    • /
    • 2014
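  • The purpose of this paper is to develop, at the information system development stage, a national R&D information navigation system that is convenient for researchers to use, by converging three types of content (papers, reports, and projects) on the basis of information filtering and visualization. Reflecting the user needs analysis and information visualization elements surveyed in the preceding information service planning stage, we create a screen prototype, build an ontology and RDF for the three content types, and develop the information system by applying information filtering and semantic search technology. To quantify the measure used for information filtering, we propose and implement an R&D navigation index, develop a national R&D information navigation system in which R&D content can be searched comprehensively through information visualization, conduct a design preference survey of 100 people, and test usability with 10 actual users. In the design preference survey, 85% of responses were positive, and the usability test yielded an overall score of 87.2 points, indicating good visibility but a need for further development of personalization features. We expect that the R&D navigation index proposed and implemented in this paper will provide quantitative objectivity and lead to the development of information filtering indices for other content.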

Study on a Methodology for Developing Shanghanlun (傷寒論) Ontology

  • 정태영;김희열;박종현
    • Journal of Physiology & Pathology in Korean Medicine
    • /
    • Vol. 25, No. 5
    • /
    • pp.765-772
    • /
    • 2011
  • Knowledge represented in formal logic is widely used in many domains such as artificial intelligence, information retrieval, and e-commerce. In the medical field, medical record retrieval, hospital information systems, medical data sharing, remote treatment, and expert systems all need knowledge representation technology. To retrieve information intelligently and provide advanced information services, a systematically controlled mechanism is needed to represent and share knowledge. Importantly, medical experts' knowledge should be represented in a form understandable to both computers and humans so that it can be applied to medical information systems that support decision making. It should also have a suitable and efficient structure for its purposes, including reasoning, extensibility of knowledge, data management, accuracy of expression, and diversity. Such a machine-processable representation is called an ontology. An ontology can represent traditional medicine knowledge in a structured and systematic way with visualization, and it can also be used as educational material. The authors therefore developed a Shanghanlun ontology as an example, and through it suggest a methodology for ontology development and a model for structuring traditional medical knowledge. The result can help students learn Shanghanlun through a graphical representation of its knowledge. We analyzed the text of Shanghanlun to construct a relational database containing its original text, symptoms, and herbal formulas. We then classified the terms according to set criteria and defined the structure of the ontology to describe semantic relations between the terms, with particular attention to visual representation. The ontology developed in this study provides a database of the formulas, herbs, symptoms, disease names, and text of Shanghanlun. Contents are easy to retrieve by their semantic relations, which makes it convenient to search and learn the knowledge of Shanghanlun. Searching for a term displays the related concepts, and expanded information is available with a simple click. The ontology has some limitations, such as standardization problems, limited coverage of patterns (證), and errors in Chinese character input. Nevertheless, we believe this research can serve as a foundation for making traditional medicine more structured and systematic, for developing application software, and for use in Shanghanlun education.
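Retrieval by semantic relation, as described in this abstract, can be sketched with a small set of subject-relation-object triples. This is only an illustration of the general idea, not the authors' ontology or database schema: the relation names and the placeholder terms (`FormulaA`, `HerbX`, `Symptom1`, and so on) are hypothetical and carry no medical meaning.

```python
from collections import defaultdict

# Hypothetical placeholder triples: (subject, relation, object).
TRIPLES = [
    ("FormulaA", "contains_herb", "HerbX"),
    ("FormulaA", "contains_herb", "HerbY"),
    ("FormulaA", "treats_symptom", "Symptom1"),
    ("FormulaB", "contains_herb", "HerbY"),
    ("FormulaB", "treats_symptom", "Symptom2"),
]

def build_index(triples):
    """Index every term by its outgoing relations and by the inverse of incoming ones."""
    index = defaultdict(list)
    for subject, relation, obj in triples:
        index[subject].append((relation, obj))
        index[obj].append(("inverse_of_" + relation, subject))
    return index

def related(index, term):
    """Return the concepts directly linked to a term, sorted for stable display."""
    return sorted(index.get(term, []))

index = build_index(TRIPLES)
for relation, concept in related(index, "HerbY"):
    print(relation, concept)
```

Looking up a herb returns the formulas that contain it, and looking up a formula returns its herbs and symptoms, which is the kind of related-concept expansion the abstract describes.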