• Title/Summary/Keyword: data-based model


Integration of TMA-OM (Tissue Microarray Object Model) with Major Genomic Information

  • Kim Ju-Han
    • Proceedings of the Korean Society for Bioinformatics Conference
    • /
    • 2006.02a
    • /
    • pp.30-36
    • /
    • 2006
  • Tissue microarray (TMA) is an array-based technology that allows the examination of hundreds of tissue samples on a single slide. To handle, exchange, and disseminate TMA data, standard representations are needed of the methods used, the data generated, and the clinical and histopathological information related to TMA data analysis. This study aims to create a comprehensive data model with the flexibility to support diverse experimental designs and the expressivity and extensibility to describe new clinical and histopathological data elements adequately and comprehensively. We designed a Tissue Microarray Object Model (TMA-OM). Both the Array Information and Experimental Procedure models were created by reference to the Microarray Gene Expression Object Model, the Minimum Information Specification For In Situ Hybridization and Immunohistochemistry Experiments (MISFISHIE), and the TMA Data Exchange Specifications (TMA DES). The Clinical and Histopathological Information model was created using the CAP Cancer Protocols and the National Cancer Institute Common Data Elements (NCI CDEs). The MGED Ontology, UMLS, and terms extracted from the CAP Cancer Protocols and NCI CDEs were used to create a controlled vocabulary for unambiguous annotation. We implemented a web-based application for TMA-OM that supports data export in XML format conforming to either the TMA DES or the DTD derived from TMA-OM. TMA-OM provides a comprehensive data model for storage, analysis, and exchange of TMA data and facilitates model-level integration with other biological models.
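As a rough illustration of the kind of XML export such a model enables, the sketch below serializes a few tissue-core records with Python's standard library. The element and attribute names here are invented for illustration and do not reflect the actual TMA DES schema or the TMA-OM DTD.

```python
import xml.etree.ElementTree as ET

def export_tma_block(cores):
    """Serialize a list of TMA core records to an XML string.

    Element and attribute names are illustrative only; a real export
    would conform to the TMA DES schema or the TMA-OM-derived DTD.
    """
    root = ET.Element("tma_block")
    for core in cores:
        elem = ET.SubElement(root, "core", id=core["id"])
        ET.SubElement(elem, "diagnosis").text = core["diagnosis"]
    return ET.tostring(root, encoding="unicode")
```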


Revisiting the Bradley-Terry model and its application to information retrieval

  • Jeon, Jong-June;Kim, Yongdai
    • Journal of the Korean Data and Information Science Society
    • /
    • v.24 no.5
    • /
    • pp.1089-1099
    • /
    • 2013
  • The Bradley-Terry model is widely used for the analysis of pairwise preference data. We explain that its popularity stems not only from easy computation but also from favorable asymptotic properties when the model is misspecified. For information retrieval, which requires analyzing large ranking datasets, we propose using a pseudo-likelihood based on the Bradley-Terry model even when the true model differs from it. We justify this by proving that the ranking estimated from the proposed pseudo-likelihood is consistent whenever the true model belongs to the class of Thurstone models, which is much larger than the Bradley-Terry class.
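To make the model concrete: under Bradley-Terry, item i beats item j with probability p_i / (p_i + p_j), and the maximum-likelihood strengths can be found with the classic minorization-maximization (MM) iteration. The following is a minimal pure-Python sketch, with a win-matrix data layout assumed for illustration; it is not the pseudo-likelihood estimator the paper proposes.

```python
def fit_bradley_terry(wins, n_items, iters=200):
    """Fit Bradley-Terry strengths by the MM algorithm.

    wins[i][j] = number of times item i beat item j (assumed layout).
    Returns strengths normalized to average 1.
    """
    p = [1.0] * n_items
    for _ in range(iters):
        new_p = []
        for i in range(n_items):
            w_i = sum(wins[i])                       # total wins of item i
            denom = sum((wins[i][j] + wins[j][i]) / (p[i] + p[j])
                        for j in range(n_items) if j != i)
            new_p.append(w_i / denom if denom > 0 else p[i])
        s = sum(new_p)                               # fix the scale (identifiability)
        p = [v * n_items / s for v in new_p]
    return p
```

The MM update is guaranteed to increase the likelihood at every step as long as every item has at least one win and one loss.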

Automatic 3D soil model generation for southern part of the European side of Istanbul based on GIS database

  • Sisman, Rafet;Sahin, Abdurrahman;Hori, Muneo
    • Geomechanics and Engineering
    • /
    • v.13 no.6
    • /
    • pp.893-906
    • /
    • 2017
  • Automatic large-scale soil model generation is a critical stage in earthquake hazard simulation of urban areas. Manual model development may cause data losses and is not effective when there are many observations from different soil surveys over a wide area. Geographic information systems (GIS) for storing and analyzing spatial data help scientists generate models automatically. Although the original soil observations were limited to soil profile data, recent developments in mapping technology, interpolation methods, and remote sensing have enabled more advanced soil models. Together with advanced computational technology, much larger volumes of data can be handled, allowing the difficult problem of describing the spatial variation of soil to be addressed. In this study, an algorithm is proposed for automatic three-dimensional soil and velocity model development of the southern part of the European side of Istanbul, next to the Sea of Marmara, based on GIS data. In the proposed algorithm, the bedrock surface is first generated by integrating geological and geophysical measurements. Then, layer surface contacts are integrated with data gathered in vertical borings, and interpolations between the borings are computed on sections automatically. A three-dimensional underground geology model is prepared using the boring data, geologic cross sections, and formation base contours drawn from these data. During model preparation, classification studies are made based on formation models. Then, 3D velocity models are developed using geophysical measurements such as refraction microtremor, array microtremor, and PS logging. The soil and velocity models are integrated to obtain the final soil model. All stages of the algorithm are carried out automatically in the selected urban area. The system directly reads the GIS soil data for the selected part of the urban area, and a 3D soil model is automatically developed for large-scale earthquake hazard simulation studies.
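One common building block for turning scattered boring observations into a continuous surface is inverse-distance weighting (IDW). The sketch below is a generic illustration of that idea, not the specific interpolation used in the paper, and the boring data format is assumed.

```python
def idw_interpolate(borings, x, y, power=2.0):
    """Estimate a surface elevation at (x, y) by inverse-distance weighting.

    borings: list of (xi, yi, zi) boring observations (assumed layout),
    where zi might be, e.g., bedrock elevation at the boring location.
    """
    num = den = 0.0
    for xi, yi, zi in borings:
        d2 = (x - xi) ** 2 + (y - yi) ** 2
        if d2 == 0.0:
            return zi                    # exactly at a boring: use its value
        w = 1.0 / d2 ** (power / 2.0)    # weight decays with distance^power
        num += w * zi
        den += w
    return num / den
```

Evaluating this on a regular grid yields a raster surface that can be stored back into the GIS layer stack.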

Using Structural Changes to support the Neural Networks based on Data Mining Classifiers: Application to the U.S. Treasury bill rates

  • Oh, Kyong-Joo
    • Proceedings of the Korean Data and Information Science Society Conference
    • /
    • 2003.10a
    • /
    • pp.57-72
    • /
    • 2003
  • This article provides integrated neural network models for interest rate forecasting using change-point detection. The model is composed of three phases. The first phase detects successive structural changes in the interest rate dataset. The second phase forecasts the change-point group with data mining classifiers. The final phase forecasts the interest rate with a backpropagation neural network (BPN). Based on this structure, we propose three integrated neural network models, distinguished by the data mining classifier used: (1) a multivariate discriminant analysis (MDA)-supported neural network model, (2) a case-based reasoning (CBR)-supported neural network model, and (3) a BPN-supported neural network model. We then compare these models with a standalone neural network model and determine which of the three classifiers (MDA, CBR, and BPN) performs best. For interest rate forecasting, this study examines the ability of the integrated neural network models to represent structural change.
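As a toy illustration of the first phase, a single structural change in a level series can be located by choosing the split that minimizes within-segment squared error. The paper's actual detection method for successive changes may differ; this is a generic one-change-point sketch.

```python
def detect_change_point(series):
    """Return the index that best splits `series` into two constant-mean
    segments (least total squared error) - a simplified stand-in for
    structural-change detection in an interest rate series."""
    def sse(seg):
        if not seg:
            return 0.0
        m = sum(seg) / len(seg)
        return sum((v - m) ** 2 for v in seg)

    best_k, best_cost = 1, float("inf")
    for k in range(1, len(series)):          # try every split point
        cost = sse(series[:k]) + sse(series[k:])
        if cost < best_cost:
            best_k, best_cost = k, cost
    return best_k
```

Applying this recursively to each segment yields a crude detector of successive structural changes.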


A Study on the Application of Spatial Big Data from Social Networking Service for the Operation of Activity-Based Traffic Model (활동기반 교통모형 분석자료 구축을 위한 소셜네트워크 공간빅데이터 활용방안 연구)

  • Kim, Seung-Hyun;Kim, Joo-Young;Lee, Seung-Jae
    • The Journal of The Korea Institute of Intelligent Transport Systems
    • /
    • v.15 no.4
    • /
    • pp.44-53
    • /
    • 2016
  • The era of Big Data has come, and the importance of Big Data has been growing rapidly. In transportation, the Four-Step Travel Demand Model (FSTDM), a traditional Trip-Based Model (TBM), is reaching its limits. In recent years, traffic demand forecasting using the Activity-Based Model (ABM) has emerged as a new paradigm. Given that transportation means the spatial movement of people and goods in a certain period of time, it is closely associated with spatial data. We therefore mined spatial Big Data from SNS, analyzed the character of these data, and tested their reliability by comparison with the attributes of the TBM. Finally, we built a database from SNS for the operation of the ABM, ran an ABM simulator, and examined the results. Through this research, we successfully created a spatial database from SNS and found possibilities for overcoming technical limitations on using spatial Big Data in the transportation planning process. Moreover, it was an opportunity to seek directions for further research.

Reliability analysis of piles based on proof vertical static load test

  • Dong, Xiaole;Tan, Xiaohui;Lin, Xin;Zhang, Xuejuan;Hou, Xiaoliang;Wu, Daoxiang
    • Geomechanics and Engineering
    • /
    • v.29 no.5
    • /
    • pp.487-496
    • /
    • 2022
  • Most vertical static load tests of piles at construction sites are proof load tests, from which it is difficult to accurately estimate the ultimate bearing capacity and analyze the reliability of piles. Therefore, a reliability analysis method based on proof load-settlement (Q-s) data is proposed in this study. In the proposed method, a simple ultimate limit state function based on the hyperbolic model is established, where the random variables of the reliability analysis include the model factor of the ultimate bearing capacity and the fitting parameters of the hyperbolic model. The model factor M = RuR / RuP is calculated from available destructive Q-s data, where the real value of the ultimate bearing capacity (RuR) is obtained from the complete destructive Q-s data, and the predicted value (RuP) is obtained from the proof Q-s data, that is, the part of the destructive Q-s data recorded before the predetermined load specified in the pile test report. The results demonstrate that the proposed method can easily and effectively perform reliability analysis based on proof Q-s data.
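The hyperbolic load-settlement model can be written Q = s / (a + b s), whose asymptote 1/b is the predicted ultimate capacity. Fitting it reduces to a straight-line fit of s/Q against s, since s/Q = a + b s. A minimal sketch, with the (Q, s) data layout assumed:

```python
def hyperbolic_ultimate_capacity(q_s_data):
    """Fit the hyperbolic model Q = s / (a + b*s) by least squares on
    the linearized form s/Q = a + b*s, and return the predicted
    ultimate capacity Ru = 1/b (the asymptote of Q as s grows).

    q_s_data: list of (Q, s) pairs (assumed layout)."""
    xs = [s for q, s in q_s_data]
    ys = [s / q for q, s in q_s_data]        # linearized ordinate
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))   # regression slope = b
    return 1.0 / b
```

With RuP computed this way from the proof portion of the curve, the model factor follows as M = RuR / RuP, where RuR comes from the complete destructive test.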

Harmonization of IFC 3D Building Model Standards and ISO/STEP AP202 Drawing Standards for 2D Shape Data Representation (IFC 3차원 건축모델표준과 ISO/STEP AP202도면표준의 2차원 형상정보 연계방안)

  • Won, Ji-Sun;Lim, Kyoung-Il;Kim, Seong-Sig
    • Korean Journal of Computational Design and Engineering
    • /
    • v.11 no.6
    • /
    • pp.429-439
    • /
    • 2006
  • The purpose of this study is to support the transition from current 2D drawing-based design to future 3D model-based design. The central theme of this paper is the combination of the STEP-based 2D drawing standard (AP202) and the IFC-based 3D building model standard. To achieve this, two methodologies are proposed: the development of an IFC extension model for 2D shape data representation through harmonization with ISO/STEP AP202, and the development of a mapping solution between the IFC 2D extension model and KOSDIC by constructing an exchange scenario for 2D shape data representation. It is expected that the proposed IFC2X2 2D extension model and mapping solution will form the basis for developing an integrated standards model in the AEC industry.

Recovery of Missing Streamflow Data in a River Basin Based on a Deep Neural Network Model

  • Le, Xuan-Hien;Lee, Giha
    • Proceedings of the Korea Water Resources Association Conference
    • /
    • 2019.05a
    • /
    • pp.156-156
    • /
    • 2019
  • In this study, a gated recurrent unit (GRU) network, a deep neural network (DNN) architecture, is constructed with the aim of restoring missing daily flow data in river basins. The Lai Chau hydrological station, located upstream in the Da River basin (Vietnam), is selected as the target station. The model inputs are observed daily flows for 24 years, from 1961 to 1984 (before the Hoa Binh dam was built), at 5 hydrological stations: 4 gauge stations downstream in the basin and the restoration target station (Lai Chau). The available data are divided by purpose: a 23-year set (1961-1983) was employed for training and validation, split 80% for training and 20% for validation, and a one-year set (1984) was used for testing to objectively verify the performance and accuracy of the model. Although only a modest amount of input data is required, and the Lai Chau station lies upstream on the Da River, the results of the suggested model are in satisfactory agreement with the observed data, with a Nash-Sutcliffe efficiency (NSE) higher than 95%. These findings illustrate the outstanding performance of the GRU network in recovering missing flow data at Lai Chau station. DNN models in general, and GRU networks in particular, therefore have great potential for application in hydrology and hydraulics.
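The Nash-Sutcliffe efficiency quoted above compares the model's squared error with the variance of the observations about their mean, NSE = 1 - Σ(o - s)² / Σ(o - ō)². A direct implementation:

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency.

    1.0 means perfect agreement; 0.0 means the model is no better than
    predicting the observed mean; the study above reports NSE > 0.95.
    """
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    svar = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - sse / svar
```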


Analysis on the Propulsive Performance of Full Scale Ship (실선의 추진성능 해석기법에 관한 연구)

  • Yang, Seung-Il;Kim, Eun-Chan
    • Bulletin of the Korea Institute of Machinery and Metals
    • /
    • s.9
    • /
    • pp.183-191
    • /
    • 1982
  • This report describes a method for analyzing full-scale propulsive performance using model test and full-scale speed trial data. The model test data were analyzed with the computer program "PPTT", based on the "1978 ITTC Performance Prediction Method for Single Screw Ships." The full-scale speed trial data were analyzed with the computer program "SSTT", based on the newly proposed "SRS-KIMM Standard Method of Speed Trial Analysis." Model and full-scale test data were analyzed for a 60,000 DWT bulk carrier, and the correlation between the model and the full-scale ship was studied.


Vacant House Prediction and Important Features Exploration through Artificial Intelligence: In Case of Gunsan (인공지능 기반 빈집 추정 및 주요 특성 분석)

  • Lim, Gyoo Gun;Noh, Jong Hwa;Lee, Hyun Tae;Ahn, Jae Ik
    • Journal of Information Technology Services
    • /
    • v.21 no.3
    • /
    • pp.63-72
    • /
    • 2022
  • The extinction crisis of local cities, caused by population concentration in capital regions, directly increases the number of vacant houses in those cities. According to the Population and Housing Census, Gunsan-si continuously showed an increasing trend of vacant houses from 2015 to 2019. In particular, since Gunsan-si suffers from the doughnut effect and industrial decline, its vacant house problems seem likely to worsen. This study aims to provide the foundation of a system that can predict and deal with buildings at high risk of becoming vacant, by implementing a data-driven vacant house prediction machine learning model. Methodologically, this study analyzes three machine learning models that differ in their data components. The first model is trained on the building register, individually declared land value, house price, and socioeconomic data; the second model is trained on the same data plus POI (Point of Interest) data; and the third model is trained on the same data as the second but excluding water usage and electricity usage data. As a result, the second model shows the best performance based on F1-score. Random Forest, Gradient Boosting Machine, XGBoost, and LightGBM, all tree ensemble methods, show the best performance overall. Additionally, model complexity can be reduced by eliminating independent variables whose absolute correlation coefficient with vacant house status is lower than 0.1. Finally, this study suggests XGBoost- and LightGBM-based machine learning models, which can handle missing values, as the final vacant house prediction models.
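The |r| < 0.1 screening step described above amounts to a simple Pearson-correlation filter over the candidate features. A generic sketch, with a column-oriented data layout assumed for illustration:

```python
def filter_features(columns, target, threshold=0.1):
    """Return indices of feature columns whose absolute Pearson
    correlation with the target reaches `threshold` (the |r| >= 0.1
    rule described above). `columns` is a list of equal-length lists."""
    def pearson(xs, ys):
        n = len(xs)
        mx, my = sum(xs) / n, sum(ys) / n
        cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        vx = sum((x - mx) ** 2 for x in xs) ** 0.5
        vy = sum((y - my) ** 2 for y in ys) ** 0.5
        return cov / (vx * vy) if vx and vy else 0.0   # constant column -> 0

    return [i for i, col in enumerate(columns)
            if abs(pearson(col, target)) >= threshold]
```

The surviving feature indices would then be fed to the tree-ensemble learner of choice.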