• Title/Summary/Keyword: Mining project

2DSpotDB: A Database for the Annotated Two-dimensional Polyacrylamide Gel Electrophoresis of Pathogen Proteins

  • Kim, Dae-Won;Yoo, Won-Gi;Lee, Myoung-Ro;Kim, Yu-Jung;Cho, Shin-Hyeong;Lee, Won-Ja;Ju, Jung-Won
    • Genomics & Informatics / v.9 no.4 / pp.197-199 / 2011
  • The biological interpretation of two-dimensional (2D) gel electrophoresis experiments is a key step toward understanding the functions of biological systems. We here present a web-based integrated database, called 2DSpotDB, for the management of proteome data derived from several pathogens. The 2DSpotDB was established as a part of the management of a pathogen proteome project at the Korea National Institute of Health. The goals of the 2DSpotDB implementation are to store and define important pathogen genes, retrieve information obtained by 2D polyacrylamide gel electrophoresis and mass spectrometry, and create an integrated system to provide pathogen proteome information for biological scientists. This database currently contains 14 gels and information on 387 protein spots, among which 329 proteins were identified and annotated.

Data Mining Techniques for Analyzing Promoter Sequences (프로모터 염기서열 분석을 위한 데이터 마이닝 기법)

  • 김정자;이도헌
    • Journal of the Korea Institute of Information and Communication Engineering / v.4 no.4 / pp.739-744 / 2000
  • As DNA sequences have become known through the Genome Project, techniques for handling molecular-level gene information are being actively researched. Given the sheer volume of known sequence information, it is also urgent to develop new computer algorithms for building databases and analyzing them efficiently. In this respect, this paper studies association rule search algorithms for finding the characteristics revealed by associations between promoter sequences and genes, one of the important research areas in molecular biology. This paper treats biological data, whereas previous search algorithms used transaction data. We therefore design a transformed association rule algorithm that accommodates the data types and biological properties involved. These results will help reduce the time and cost of biological experiments by minimizing the number of candidates.
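
As a rough illustration of the association rule idea mentioned in this abstract, the sketch below computes the two standard rule measures, support and confidence, over sets of features. It is not the transformed algorithm the paper proposes, whose details the abstract does not give; the motif names, gene classes, and function names are all hypothetical.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Estimate P(consequent | antecedent) from the transactions."""
    base = support(antecedent, transactions)
    if base == 0:
        return 0.0
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / base


# Hypothetical toy data: each "transaction" is the set of features
# observed for one gene (motif names and class labels are invented).
genes = [
    {"TATA", "CAAT", "class:housekeeping"},
    {"TATA", "GC", "class:regulated"},
    {"TATA", "CAAT", "class:housekeeping"},
    {"GC", "class:regulated"},
]

print(support({"TATA", "CAAT"}, genes))
print(confidence({"CAAT"}, {"class:housekeeping"}, genes))
```

A rule such as "CAAT implies class:housekeeping" would be kept only if both measures exceed thresholds chosen by the analyst.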

PACRIM SCIENCE APPLICATIONS: A DECADE WITH AIRSAR

  • Milne, A.K.;Tapley, I.J.
    • Proceedings of the KSRS Conference / 2002.10a / pp.428-428 / 2002
  • The scientific objectives of PACRIM (Pacific Rim) are to advance the understanding of polarimetric and interferometric radar and to promote its application in environmental research designed to detect and quantify changes in both the physical and human-dominated ecosystems on the earth's surface. The information derived is used to more readily identify environments at risk and to improve environmental decision making and the management of resources, thereby leading to more effective and sustainable land use practices. PACRIM is a collaborative research project organized by NASA's Mission to Planet Earth, Airborne Sciences Program; the Jet Propulsion Laboratory; CSIRO-COSSA; and the Centre for Remote Sensing and GIS at the University of New South Wales. A decade of working with AIRSAR data (1993-2003) in the Australia-Asian-Pacific region has provided the opportunity for more than 400 investigators from 20 countries to collect, analyse, interpret and apply state-of-the-art radar data to earth-science studies. This has been achieved by scientists working within seven broad research themes: forestry and vegetation; geology and tectonic processes; interferometry; disaster management; coastal analysis; agriculture; and urban and regional development. This paper presents an overview of the three data acquisition missions (1993, 1996 and 2000) and the science research outcomes achieved from analyzing high-quality radar data.

Experimental and numerical investigation of the effect of sample shapes on point load index

  • Haeri, Hadi;Sarfarazi, Vahab;Shemirani, Alireza Bagher;Hosseini, Seyed Shahin
    • Geomechanics and Engineering / v.13 no.6 / pp.1045-1055 / 2017
  • Tensile strength is considered a key property for characterizing rock material in engineering projects. It is determined by direct and indirect methods. The point load test is a useful method for estimating the tensile strength of rocks. In this paper, the effects of rock shape on the point load index of gypsum are investigated by PFC2D simulation. PFC was first calibrated against Brazilian experimental data to ensure that the response of the simulated numerical models conformed to the experiments. In the second step, nineteen models with different shapes were prepared and subjected to the point load test. According to the results, the point load strength index increases as the size of the models increases. It is also found that the shape of particles has no major effect on its tensile strength. Our findings show that the dominant failure pattern for the numerical models is breaking into two pieces. A criterion for determining the tensile strength of gypsum was also derived numerically. The proposed criterion was cross-checked against the results of experimental point load tests.

Feasibility to Expand Complex Wards for Efficient Hospital Management and Quality Improvement

  • CHOI, Eun-Mee;JUNG, Yong-Sik;KWON, Lee-Seung;KO, Sang-Kyun;LEE, Jae-Young;KIM, Myeong-Jong
    • The Journal of Industrial Distribution & Business / v.11 no.12 / pp.7-15 / 2020
  • Purpose: This study explores the feasibility of expanding complex wards at Gangneung Medical Center (GMC) to provide efficient hospital management and high-quality medical services to local residents. Research Design, Data and Methodology: Four research designs were used to achieve the research objectives. We analyzed three months of big data from Social Network Services (SNS). A questionnaire survey was conducted with 219 patients visiting the GMC. A survey of 20 GMC employees was also carried out. The feasibility of expanding the GMC wards was assessed through a focus group interview with 12 internal and external experts. The data from these surveys were analyzed using data mining techniques, frequency analysis, and Importance-Performance Analysis, and the IBM SPSS statistical package was used for data processing. Results: The big data analysis showed that the GMC's recognition on SNS is high. 95.9% of the residents and 100.0% of the employees saw a need for the complex ward extension. In the analysis of expert opinion on the future functions of the GMC, specialized care (△3.3) and public medicine (△1.4) increased significantly. Conclusion: The GMC's complex ward extension is an urgent and indispensable project for providing efficient hospital management and service quality.

TANFIS Classifier Integrated Efficacious Aassistance System for Heart Disease Prediction using CNN-MDRP

  • Bhaskaru, O.;Sreedevi, M.
    • International Journal of Computer Science & Network Security / v.22 no.10 / pp.171-176 / 2022
  • A dramatic rise in the number of people dying from heart disease has prompted efforts to identify it sooner using efficient approaches. A variety of variables, including hereditary factors, contribute to the condition. Current estimation approaches use an automated diagnostic system that fails to attain a high level of accuracy because it includes irrelevant dataset information. This paper presents an effective neural network with convolutional layers for classifying clinical data that is highly class-imbalanced. Traditional approaches rely on massive amounts of data rather than precise predictions. Data must be selected carefully in order to achieve earlier prediction, and analysis is set back if the data obtained are only partially complete. Feature extraction is also a major challenge in classification and prediction, since more data increases the training time of traditional machine learning classifiers. The work integrates the CNN-MDRP classifier (convolutional neural network (CNN)-based efficient multimodal disease risk prediction) with TANFIS (tuned adaptive neuro-fuzzy inference system) for earlier accurate prediction. In this project, data cleaning is performed by transforming partial data in the dataset into informative data. The recommended TANFIS tuning parameters are then improved using a Laplace Gaussian mutation-based grasshopper and moth flame optimization approach (LGM2G). The proposed approach yields a prediction accuracy of 98.40 percent when compared with current algorithms.

Comparison of the Performance of Clustering Analysis using Data Reduction Techniques to Identify Energy Use Patterns

  • Song, Kwonsik;Park, Moonseo;Lee, Hyun-Soo;Ahn, Joseph
    • International conference on construction engineering and project management / 2015.10a / pp.559-563 / 2015
  • Identifying energy use patterns in buildings offers a great opportunity for energy saving. To find which energy use patterns exist, clustering methods such as K-means and hierarchical clustering have commonly been used. For high-dimensional data such as energy use time series, data reduction should be considered to avoid the curse of dimensionality. Principal Component Analysis, Autocorrelation Function, Discrete Fourier Transform and Discrete Wavelet Transform have been widely used to map the original data into lower-dimensional spaces. However, an open issue remains, since the performance of clustering analysis depends on data type, purpose and application. Therefore, we need to understand which data reduction techniques are suitable for energy use management. This research aims to find the best clustering method using energy use data obtained from the Seoul National University campus. The results show that most experiments with data reduction techniques perform better. The results also help facility managers optimally control energy systems such as HVAC to reduce energy use in buildings.
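
To make the data reduction step in this abstract concrete, the sketch below reduces a time series to the magnitudes of its first few Discrete Fourier Transform coefficients, one of the techniques the abstract names. It is a minimal illustration only: the function name, the choice of three coefficients, and the 8-point load profiles are assumptions, and the abstract does not describe the authors' actual pipeline.

```python
import cmath
import math


def dft_features(series, n_coeffs):
    """Return the scaled magnitudes of the first n_coeffs DFT coefficients."""
    n = len(series)
    features = []
    for k in range(n_coeffs):
        # Direct O(n) evaluation of the k-th DFT coefficient.
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(series))
        features.append(abs(coeff) / n)
    return features


# Hypothetical load profiles with the same mean but different shapes.
flat = [5.0] * 8
peaky = [1.0, 1.0, 9.0, 9.0, 1.0, 1.0, 9.0, 9.0]

print(dft_features(flat, 3))
print(dft_features(peaky, 3))
```

The two profiles share the same first feature (the mean) but differ in the third, so a clustering method such as K-means run on these short vectors can still separate them while working in fewer dimensions than the raw series.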

DATA MININING APPROACH TO PARAMETRIC COST ESTIMATE IN EARLY DESIGN STAGE AND ANALYTICAL CHARACTERIZATION ON OLAP (ON-LINE ANALYTICAL PROCESSING)

  • JaeHo Cho;HyunKyun Jung;JaeYoul Chun
    • International conference on construction engineering and project management / 2011.02a / pp.176-181 / 2011
  • The role of a cost modeler is to facilitate the design process through the systematic application of cost factors so as to maintain sensible and economic relationships between cost, quantity, utility and appearance. These relationships help to achieve the client's requirements within an agreed budget. The purpose of this study is to develop a parametric cost estimating model for the early design stage by using the multi-dimensional system of OLAP (On-line Analytical Processing), based on case quantity data related to architectural design features. Parametric cost estimating models have been adopted to support decision making in the early design stage. These models typically use a similar instance or a pattern of historical cases. To use this type of data model effectively, data classification and prediction methods must be defined. One such method is to find the similar class according to an attribute selection measure in the multi-dimensional data model. Therefore, this research analyzes the relevant attributes influenced by architectural design features, using the case-based quantity data of the parametric cost estimating model. Attribute relevance can be analyzed by Analytical Characterization, which helps determine which attributes should be included in the OLAP multi-dimensional model.
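
The abstract refers to an attribute selection measure without naming one. Information gain is one commonly used measure of attribute relevance, so the sketch below uses it purely as an assumed stand-in; the attribute names, cost classes, and rows are hypothetical and not taken from the paper.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Reduction in target entropy obtained by splitting rows on attribute."""
    base = entropy([r[target] for r in rows])
    groups = {}
    for r in rows:
        groups.setdefault(r[attribute], []).append(r[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder


# Hypothetical historical cases with two design attributes and a cost class.
cases = [
    {"structure": "RC", "facade": "glass", "cost_class": "high"},
    {"structure": "RC", "facade": "brick", "cost_class": "high"},
    {"structure": "steel", "facade": "glass", "cost_class": "low"},
    {"structure": "steel", "facade": "brick", "cost_class": "low"},
]

for attr in ("structure", "facade"):
    print(attr, information_gain(cases, attr, "cost_class"))
```

In this toy data, "structure" fully determines the cost class and "facade" carries no information about it, so a relevance analysis of this kind would keep the former as a dimension and drop the latter.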

Suggestion of Urban Regeneration Type Recommendation System Based on Local Characteristics Using Text Mining (텍스트 마이닝을 활용한 지역 특성 기반 도시재생 유형 추천 시스템 제안)

  • Kim, Ikjun;Lee, Junho;Kim, Hyomin;Kang, Juyoung
    • Journal of Intelligence and Information Systems / v.26 no.3 / pp.149-169 / 2020
  • "The Urban Renewal New Deal project", one of the government's major national projects, aims to develop underdeveloped areas by investing 50 trillion won in 100 locations in the first year and 500 over the next four years. This project is drawing keen attention from the media and local governments. However, the project model fails to reflect the original characteristics of each area because it divides project areas into five categories: "Our Neighborhood Restoration, Housing Maintenance Support Type, General Neighborhood Type, Central Urban Type, and Economic Base Type." The keywords for successful urban regeneration in Korea, "resident participation," "regional specialization," "ministerial cooperation" and "public-private cooperation", suggest that when local governments propose urban regeneration projects to the government, it is most important to understand the characteristics of the city accurately and to pursue the projects in a way that suits those characteristics, with the help of local residents and private companies. In addition, considering the gentrification problem, which is one of the side effects of urban regeneration projects, it is important to select and implement urban regeneration types suitable for the characteristics of the area. To address the limitations of the 'Urban Regeneration New Deal Project' methodology, this study proposes a system that uses various machine learning algorithms to recommend urban regeneration types suitable for urban regeneration sites, referring to the urban regeneration types of the '2025 Seoul Metropolitan Government Urban Regeneration Strategy Plan', which was promoted on the basis of regional characteristics. There are four types of urban regeneration in Seoul: "Low-use Low-Level Development, Abandonment, Deteriorated Housing, and Specialization of Historical and Cultural Resources" (Shon and Park, 2017).
In order to identify regional characteristics, approximately 100,000 text data items were collected for 22 regions where the project was carried out, covering the four types of urban regeneration. Using the collected data, we extracted key keywords for each region according to the type of urban regeneration and conducted topic modeling to explore whether there were differences between types. As a result, it was confirmed that a number of topics related to real estate and the economy appeared in old residential areas, while in declining and underdeveloped areas, topics appeared that reflect the characteristics of areas where industrial activities were active in the past. The historical and cultural resource areas contain traces of the past, so many keywords related to the government appeared; political topics and cultural topics resulting from various events could therefore be confirmed. Finally, in low-use and under-developed areas, many topics on real estate and accessibility emerged, indicating good accessibility; these areas mainly had the characteristics of regions where development is planned or likely. Furthermore, a model was implemented that proposes urban regeneration types tailored to regional characteristics for regions other than Seoul. Machine learning was used to implement the model, with training and test data randomly split at an 8:2 ratio. To compare performance across models, the input variables were built in two ways, Count Vector and TF-IDF Vector, and five classifiers were applied: SVM (Support Vector Machine), Decision Tree, Random Forest, Logistic Regression, and Gradient Boosting. Performance was thus compared across a total of 10 models. The model with the highest performance was Gradient Boosting with TF-IDF Vector input, with an accuracy of 97%.
Therefore, the recommendation system proposed in this study is expected to recommend urban regeneration types based on the regional characteristics of new business sites in the process of carrying out urban regeneration projects.
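
The difference between the two input representations compared in this abstract, Count Vector and TF-IDF Vector, can be sketched in a few lines. The version below uses the plain textbook weighting tf × ln(N/df); libraries such as scikit-learn apply smoothing and normalization by default, so their numbers would differ. The tokenized documents and function names are hypothetical, and the classifiers themselves are not shown.

```python
import math
from collections import Counter


def count_vector(tokens, vocabulary):
    """Raw term counts for one document, ordered by vocabulary."""
    counts = Counter(tokens)
    return [counts[term] for term in vocabulary]


def tfidf_vectors(docs):
    """docs is a list of non-empty token lists.

    Returns (vocabulary, vectors) using tf * ln(N / df) weighting.
    """
    vocabulary = sorted({t for d in docs for t in d})
    n_docs = len(docs)
    doc_freq = Counter(t for d in docs for t in set(d))
    idf = {t: math.log(n_docs / doc_freq[t]) for t in vocabulary}
    vectors = []
    for d in docs:
        counts = Counter(d)
        vectors.append([counts[t] / len(d) * idf[t] for t in vocabulary])
    return vocabulary, vectors


# Hypothetical tokenized texts, one per region.
docs = [
    ["housing", "price", "housing"],
    ["factory", "price"],
    ["heritage", "palace"],
]

vocab, vectors = tfidf_vectors(docs)
print(vocab)
print(count_vector(docs[0], vocab))
print(vectors[0])
```

A term such as "price" that appears in several documents receives a lower TF-IDF weight than one confined to a single document, which is the property that distinguishes this representation from raw counts.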

The Analysis of the Visitors' Experiences in Yeonnam-dong before and after the Gyeongui Line Park Project - A Text Mining Approach - (경의선숲길 조성 전후의 연남동 방문자의 경험 분석 - 블로그 텍스트 분석을 중심으로 -)

  • Kim, Sae-Ryung;Choi, Yunwon;Yoon, Heeyeun
    • Journal of the Korean Institute of Landscape Architecture / v.47 no.4 / pp.33-49 / 2019
  • The purpose of this study was to investigate the changes in the experiences of visitors to Yeonnam-dong during the period covering the development of a linear park, the Gyeongui Line Park. This study used a text mining technique to analyze Naver Blog postings by those who visited Yeonnam-dong from June 2013 to May 2017, divided into four periods: June 2013 to May 2014, June 2014 to May 2015, June 2015 to May 2016, and June 2016 to May 2017. The search keywords were 'Yeonnam-dong', 'Gyeongui Line' and 'Yeontral Park', and the data were further refined and resampled. A semantic network analysis was conducted on the basis of word co-occurrences. The results were as follows. Throughout the entire period, the main experience of visitors to Yeonnam-dong was consistently 'food culture', but activities related to 'market', 'browsing', and 'buy' increased. Activities in the park such as 'walk', 'play' and 'rest' newly appeared after the park was built. Moreover, more diverse opinions about Yeonnam-dong were expressed on blogs, and Yeonnam-dong began to be recognized as a place where a variety of activities can be enjoyed. Lastly, when visitors wrote about the theme 'food culture', the scope of the keywords expanded from simple ones such as 'eat', 'photograph' and 'chatting' to 'market', 'browsing', and 'walk'. The sub-themes that appeared with the park also expanded to various topics with the emergence of the Gyeongui Line Book Street. This study analyzed the change in visitors' experiences objectively with text mining, a quantitative methodology. Due to the nature of text mining, however, subjective judgment was inevitably involved in the refining process. Further research is also required to assess the direct relationship between these changes and the park's construction.
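
A semantic network built from word co-occurrences, as described in this abstract, can be sketched as a weighted edge list. The example below counts how many posts each keyword pair shares and then sums edge weights per keyword; the keyword lists are invented for illustration, and the abstract does not state which network measures the authors used.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_edges(posts):
    """Count how many posts each unordered keyword pair appears in together."""
    edges = Counter()
    for keywords in posts:
        # Deduplicate and sort so each pair has one canonical key.
        for a, b in combinations(sorted(set(keywords)), 2):
            edges[(a, b)] += 1
    return edges


def weighted_degree(edges):
    """Sum of edge weights attached to each keyword."""
    degree = Counter()
    for (a, b), weight in edges.items():
        degree[a] += weight
        degree[b] += weight
    return degree


# Hypothetical keyword lists extracted from three blog posts.
posts = [
    ["eat", "photograph", "walk"],
    ["eat", "walk", "park"],
    ["eat", "market"],
]

edges = cooccurrence_edges(posts)
print(edges.most_common(3))
print(weighted_degree(edges).most_common(3))
```

Comparing such edge lists across the four periods is one way the shift from 'eat'-centred keywords toward 'walk' or 'market' could be made visible.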