• Title/Abstract/Keywords: Machine knowledge

643 search results, processing time 0.027 seconds

OryzaGP: rice gene and protein dataset for named-entity recognition

  • Larmande, Pierre;Do, Huy;Wang, Yue
    • Genomics & Informatics / Vol. 17, No. 2 / pp.17.1-17.3 / 2019
  • Text mining has become an important research method in biology, with its original purpose being to extract biological entities, such as genes, proteins and phenotypic traits, to extend knowledge from scientific papers. However, few thorough studies on text mining and application development for plant molecular biology data have been performed, especially for rice, resulting in a lack of datasets available to solve named-entity recognition tasks for this species. Since few benchmarks are available for rice, we faced various difficulties in exploiting advanced machine learning methods for accurate analysis of the rice literature. To evaluate several approaches to automatically extract information from gene/protein entities, we built a new dataset for rice as a benchmark. This dataset is composed of a set of titles and abstracts extracted from scientific papers focusing on the rice species and downloaded from PubMed. During the 5th Biomedical Linked Annotation Hackathon, a portion of the dataset was uploaded to PubAnnotation for sharing. Our ultimate goal is to offer a shared task of rice gene/protein name recognition through the BioNLP Open Shared Tasks framework using the dataset, to facilitate an open comparison and evaluation of different approaches to the task.

Improving methods for normalizing biomedical text entities with concepts from an ontology with (almost) no training data at BLAH5 the CONTES

  • Ferre, Arnaud;Ba, Mouhamadou;Bossy, Robert
    • Genomics & Informatics / Vol. 17, No. 2 / pp.20.1-20.5 / 2019
  • Entity normalization, or entity linking in the general domain, is an information extraction task that aims to annotate/bind multiple words/expressions in raw text with semantic references, such as concepts of an ontology. An ontology consists minimally of a formally organized vocabulary or hierarchy of terms, which captures knowledge of a domain. Presently, machine-learning methods, often coupled with distributional representations, achieve good performance. However, these require large training datasets, which are not always available, especially for tasks in specialized domains. CONTES (CONcept-TErm System) is a supervised method that addresses entity normalization with ontology concepts using small training datasets. CONTES has some limitations: it does not scale well with very large ontologies, it tends to overgeneralize predictions, and it lacks valid representations for out-of-vocabulary words. Here, we propose to assess different methods to reduce the dimensionality in the representation of the ontology. We also propose to calibrate parameters in order to make the predictions more accurate, and to address the problem of out-of-vocabulary words with a specific method.
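
As a rough illustration of the normalization step described in this abstract, the sketch below links a mention to the ontology concept whose vector is closest by cosine similarity. It is not the CONTES implementation: the function names, the concept identifiers, and the two-dimensional vectors are hypothetical, and real vectors would come from trained word embeddings and an ontology representation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def normalize_mention(mention_vec, concept_vecs):
    """Return (concept_id, similarity) for the concept closest to the mention."""
    return max(
        ((cid, cosine(mention_vec, vec)) for cid, vec in concept_vecs.items()),
        key=lambda pair: pair[1],
    )


# Hypothetical concept vectors; the identifiers are placeholders, not real ontology IDs.
concepts = {"CONCEPT_A": [1.0, 0.0], "CONCEPT_B": [0.0, 1.0]}
best_id, score = normalize_mention([0.9, 0.1], concepts)
print(best_id, round(score, 3))
```

A nearest-neighbour lookup like this compares the mention against every concept, which hints at why very large ontologies are costly and why reducing the dimensionality of the concept representation is attractive.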

CacheSCDefender: VMM-based Comprehensive Framework against Cache-based Side-channel Attacks

  • Yang, Chao;Guo, Yunfei;Hu, Hongchao;Liu, Wenyan
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 12, No. 12 / pp.6098-6122 / 2018
  • Cache-based side-channel attacks have attracted more attention along with the development of cloud computing technologies. However, current host-based mitigation methods are either poorly compatible with current cloud infrastructure or too application-specific. Moreover, they defend blindly, without any knowledge of ongoing attacks. In this work, we present CacheSCDefender, a Virtual Machine Monitor (VMM)-based comprehensive defense framework against all levels of cache attacks. In designing CacheSCDefender, we make three key contributions: (1) an attack-aware framework combining our novel dynamic remapping and traditional cache cleansing, which provides a comprehensive defense against all three cases of cache attacks that we identify in this paper; (2) a new defense method called dynamic remapping, which is a developed version of random permutation and is able to deal with two cases of cache attacks; (3) formalization and quantification of the security improvement and performance overhead of our defense, which can be applied to other defense methods. We show that CacheSCDefender is practical for deployment in normal virtualized environments, while providing a favorable security guarantee for virtual machines.

History of Radiation Therapy Technology

  • Huh, Hyun Do;Kim, Seonghoon
    • 한국의학물리학회지:의학물리 / Vol. 31, No. 3 / pp.124-134 / 2020
  • Here we review the evolutionary history of radiation therapy technology through the festschrift of articles in celebration of the 30th anniversary of the Korean Society of Medical Physics (KSMP). Radiation therapy technology used in clinical practice has evolved over a long period of time. Various areas of science, such as medical physics, mechanical engineering, and computer engineering, have contributed to the continual development of new devices and techniques. The scope of this review was restricted to two areas, namely output energy production and functional development, because it is not possible to include all development processes of this technology due to space limitations. The former includes the technological transition process from the initial technique applied to the first model to the latest technique currently used in a variety of machines. The latter has had a direct effect on treatment outcomes and safety, which changed the paradigm of radiation therapy, leading to new guidelines on dose prescriptions, innovation of dose verification tools, new measurement methods and calculation systems for radiation doses, changes in the criteria for errors, and medical law changes in all countries. Various complex developments are covered in this review. To the best of our knowledge, there have been few reviews on this topic and we consider it very meaningful to provide a review in the festschrift in celebration of the 30th anniversary of the KSMP.

Big IoT Healthcare Data Analytics Framework Based on Fog and Cloud Computing

  • Alshammari, Hamoud;El-Ghany, Sameh Abd;Shehab, Abdulaziz
    • Journal of Information Processing Systems / Vol. 16, No. 6 / pp.1238-1249 / 2020
  • Throughout the world, aging populations and doctor shortages have helped drive the increasing demand for smart healthcare systems. Recently, these systems have benefited from the evolution of the Internet of Things (IoT), big data, and machine learning. However, these advances result in the generation of large amounts of data, making healthcare data analysis a major issue. These data have a number of complex properties, such as high dimensionality, irregularity, and sparsity, which make efficient processing difficult to implement. These challenges are met by big data analytics. In this paper, we propose an innovative analytic framework for big healthcare data that are collected either from IoT wearable devices or from archived patient medical images. The proposed method would efficiently address the data heterogeneity problem using middleware between heterogeneous data sources and MapReduce Hadoop clusters. Furthermore, the proposed framework enables the use of both fog computing and cloud platforms to handle the problems faced through online and offline data processing, data storage, and data classification. Additionally, it guarantees robust and secure knowledge of patient medical data.

군집분석의 분할 유용도 점수의 영향 분석 (Impact Analysis of Partition Utility Score in Cluster Analysis)

  • 이계성
    • 문화기술의 융합 / Vol. 7, No. 3 / pp.481-486 / 2021
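  • Machine learning algorithms adopt a criterion function to process data and induce a learning model. The criterion function used in cluster analysis carries some form of preference: it groups similar data together and then defines each cluster by specifying the variables and values that make it up. We examine how the category utility and partition utility scores used in cluster analysis affect clustering results, and analyze what biases they introduce into those results. To identify how the characteristics of the criterion function used in cluster analysis affect the results, this study runs experiments on several data sets and evaluates the outcomes.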
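
The two scores named in this entry can be sketched as follows. The abstract does not give formulas, so this assumes the commonly used definitions for nominal data: the category utility of a cluster is its probability times the gain in summed squared attribute-value probabilities over the whole data set, and the partition utility is the mean category utility over clusters. The function names and the toy data are hypothetical.

```python
from collections import Counter


def _sum_sq_probs(rows):
    """Sum over attributes and values of P(A_i = v)^2 within `rows`."""
    n = len(rows)
    total = 0.0
    for column in zip(*rows):
        total += sum((count / n) ** 2 for count in Counter(column).values())
    return total


def category_utility(rows, labels, cluster):
    """Assumed CU of one cluster: P(C) * (within-cluster score - baseline score)."""
    members = [row for row, label in zip(rows, labels) if label == cluster]
    p_cluster = len(members) / len(rows)
    return p_cluster * (_sum_sq_probs(members) - _sum_sq_probs(rows))


def partition_utility(rows, labels):
    """Assumed PU: mean category utility over the clusters in the partition."""
    clusters = sorted(set(labels))
    return sum(category_utility(rows, labels, c) for c in clusters) / len(clusters)


# Toy nominal data: two attributes, four objects, two perfectly separated clusters.
rows = [("a", "x"), ("a", "x"), ("b", "y"), ("b", "y")]
print(partition_utility(rows, [0, 0, 1, 1]))  # 0.5
```

Because the sum of cluster scores is divided by the number of clusters, this form of the score penalizes partitions with many clusters, which is one example of the kind of built-in preference a criterion function can carry.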

DISEASE FORECAST USING MACHINE LEARNING ALGORITHMS

  • HUSSAIN, MOHAMMED MUZAFFAR;DEVI, S. KALPANA
    • Journal of applied mathematics & informatics / Vol. 40, No. 5_6 / pp.1151-1165 / 2022
  • The key aim of data mining is to extract useful knowledge from available data. With the enormous amount of data kept in files, databases, and repositories in the healthcare sector, it is increasingly important, if not essential, to develop effective means of analyzing and understanding such data so as to extract knowledge that may help in decision-making. Classification is a data mining method that assigns an item to a particular class based on its similarity to previous examples of other items in the data set. We used four classification algorithms, namely Multi-Layer Perceptron, KStar, Bayesian Network, and PART, to build classification models on a malaria data set and compared their performance using the Waikato Environment for Knowledge Analysis (WEKA) running on Java Development Kit 8, then applied an ensemble technique (boosting) to improve the performance of the classification. The results showed that Bayesian Network returned the highest accuracy, 50.05% with boosting, followed by Multi-Layer Perceptron with 49.9% with boosting, then KStar with accuracies of 49.44% and 49.5% with boosting, respectively, while PART had the lowest accuracy, 48.1% with boosting. The study recommends that Bayesian Network be used on malaria data sets in hospitals.

Differentiation of Legal Rules and Individualization of Court Decisions in Criminal, Administrative and Civil Cases: Identification and Assessment Methods

  • Egor, Trofimov;Oleg, Metsker;Georgy, Kopanitsa
    • International Journal of Computer Science & Network Security / Vol. 22, No. 12 / pp.125-131 / 2022
  • The diversity and complexity of criminal, administrative and civil cases resolved by the courts make it difficult to develop universal automated tools for the analysis and evaluation of justice. However, big data generated in the scope of justice gives hope that this problem will be resolved as soon as possible. Applying big data makes it possible to identify typical options for resolving cases, form detailed rules for the individualization of a court decision, and correlate these rules with abstract provisions of law. This approach allows us to partly overcome the contradiction between the abstract and the concrete in law, to automate the analysis of justice and to model e-justice for scientific and practical purposes. The article presents the results of using dimension reduction, SHAP values, and p-values to identify, analyze and evaluate the individualization of justice and the differentiation of legal regulation. Processing and analysis of arrays of court decisions by computational methods make it possible to identify the typical views of courts on questions of fact and questions of law. This knowledge, obtained automatically, is promising for the scientific study of justice issues, the improvement of the prescriptions of the law and the probabilistic prediction of a court decision with a known set of facts.

Multi-factor Evolution for Large-scale Multi-objective Cloud Task Scheduling

  • Tianhao Zhao;Linjie Wu;Di Wu;Jianwei Li;Zhihua Cui
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 17, No. 4 / pp.1100-1122 / 2023
  • Scheduling user-submitted cloud tasks to the appropriate virtual machine (VM) in cloud computing is critical for cloud providers. However, as the demand for cloud resources from user tasks continues to grow, current evolutionary algorithms (EAs) cannot find optimal solutions to large-scale cloud task scheduling problems. In this paper, we first construct a large-scale multi-objective cloud task problem considering the time and cost functions. Second, a multi-objective optimization algorithm based on multi-factor optimization (MFO) is proposed to solve the established problem. This algorithm decomposes the large-scale optimization problem into multiple optimization subproblems, which reduces its computational burden. In addition, the MFO strategy provides the algorithm with a parallel evolutionary paradigm in which multiple subpopulations transfer knowledge implicitly. Finally, simulation experiments and comparisons are performed on a large-scale task scheduling test set on the CloudSim platform. Experimental results show that our algorithm can obtain the best scheduling solution while maintaining good results of the objective function compared with other optimization algorithms.
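
To make the two objectives concrete, here is a minimal sketch of how a candidate schedule might be scored on time and cost. The abstract does not define these functions, so the definitions are assumptions: time is taken as the makespan (the busiest VM's total run time) and cost as run time multiplied by a per-VM price. All names and numbers are hypothetical, and the evolutionary search itself is not shown.

```python
def evaluate_schedule(task_lengths, vm_speeds, vm_prices, assignment):
    """Return (makespan, cost) for a task-to-VM assignment.

    assignment[i] is the index of the VM that runs task i.
    """
    busy = [0.0] * len(vm_speeds)
    for length, vm in zip(task_lengths, assignment):
        busy[vm] += length / vm_speeds[vm]  # run time of this task on its VM
    makespan = max(busy)  # assumed time objective
    cost = sum(t * price for t, price in zip(busy, vm_prices))  # assumed cost objective
    return makespan, cost


def dominates(a, b):
    """True if objective vector `a` Pareto-dominates `b` (both minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


# Three tasks on two VMs: task 0 runs on VM 0, tasks 1 and 2 run on VM 1.
print(evaluate_schedule([100, 200, 300], [10, 20], [1.0, 3.0], [0, 1, 1]))  # (25.0, 85.0)
```

A multi-objective algorithm would use a comparison such as `dominates` to keep the schedules that no other schedule beats on both objectives at once.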

A detailed analysis of nearby young stellar moving groups

  • Lee, Jinhee
    • 천문학회보 / Vol. 44, No. 2 / pp.63.3-63.3 / 2019
  • Nearby young moving groups (NYMGs hereafter) are gravitationally unbound, loose young stellar associations located within 100 pc of the Sun. Since NYMGs are crucial laboratories for studying low-mass stars and planets, intensive searches for NYMG members have been performed. For identification of NYMG members, various strategies and methods have been applied. As a result, the reliability of the claimed memberships is not uniform, which means that a careful membership re-assessment is required. In this study, I developed a NYMG membership probability calculation tool based on Bayesian inference (Bayesian Assessment of Moving Groups: BAMG). For the development of the BAMG tool, I constructed ellipsoidal models for nine NYMGs via iterative and self-consistent processes. Using BAMG, memberships of claimed members in the literature (N~2000) were evaluated, and 35 per cent of members were confirmed as bona fide members of NYMGs. Based on the deficiency of low-mass members that appeared in the mass function built from these bona fide members, low-mass members were identified from Gaia DR2. About 2000 new M dwarf and brown dwarf candidate members were identified. Memberships of ~70 members with RV from Gaia were confirmed, and an additional ~20 members were confirmed via spectroscopic observation. Without relying on previous knowledge about the existence of the nine NYMGs, unsupervised machine learning analyses were applied to NYMG members. K-means and Agglomerative Clustering algorithms yield similar grouping trends. As a result, six previously known groups (TWA, beta-Pic, Carina, Argus, AB Doradus, and Volans-Carina) were rediscovered. The three other known groups are recognized as well; however, they are combined into two new separate groups (ThOr+Columba and TucHor+Columba).
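
The idea of a Bayesian membership probability can be sketched with Bayes' rule over a set of candidate group models. This is not the BAMG tool: as a simplifying assumption, each group is modeled as an axis-aligned Gaussian (a diagonal-covariance stand-in for an ellipsoidal model), and the group names, coordinates, and priors below are hypothetical.

```python
import math


def gaussian_logpdf(x, mean, sigma):
    """Log density of an axis-aligned Gaussian (diagonal covariance)."""
    return sum(
        -0.5 * ((xi - mi) / si) ** 2 - math.log(si * math.sqrt(2 * math.pi))
        for xi, mi, si in zip(x, mean, sigma)
    )


def membership_probabilities(x, models, priors):
    """Posterior P(group | x) from Bayes' rule over all candidate models."""
    log_post = {
        name: math.log(priors[name]) + gaussian_logpdf(x, mean, sigma)
        for name, (mean, sigma) in models.items()
    }
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(log_post.values())
    weights = {name: math.exp(lp - top) for name, lp in log_post.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


# Hypothetical two-dimensional models: (mean, sigma) per group, plus a broad field model.
models = {
    "group_1": ([0.0, 0.0], [1.0, 1.0]),
    "group_2": ([5.0, 5.0], [1.0, 1.0]),
    "field": ([0.0, 0.0], [20.0, 20.0]),
}
priors = {"group_1": 0.1, "group_2": 0.1, "field": 0.8}
print(membership_probabilities([0.5, -0.2], models, priors))
```

Including a broad field model with a large prior keeps a star from being assigned to a group merely because it is the nearest one; a membership probability is high only when the group explains the data better than the field does.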
