• Title/Summary/Keyword: Public dataset

Construction of a full-length cDNA library from Pinus koraiensis and analysis of EST dataset (잣나무(Pinus koraiensis)의 cDNA library 제작 및 EST 분석)

  • Kim, Joon-Ki;Im, Su-Bin;Choi, Sun-Hee;Lee, Jong-Suk;Roh, Mark S.;Lim, Yong-Pyo
    • Korean Journal of Agricultural Science / v.38 no.1 / pp.11-16 / 2011
  • In this study, we report the generation and analysis of 1,211 expressed sequence tags (ESTs) from Pinus koraiensis. A cDNA library was generated from young leaf tissue, and a total of 1,211 cDNAs were partially sequenced. EST and unigene sequence quality was determined by computational filtering, manual review, and BLAST analyses. In all, 857 ESTs were retained after removal of vector sequence and filtering for a minimum length of 50 nucleotides. Assembly identified a total of 411 unigenes, consisting of 89 contigs and 322 singletons. We also identified 77 new microsatellite-containing sequences among the unigenes and classified their structure according to repeat unit. In a BLASTX homology search against the NCBI database, 63.1% of ESTs were homologous to sequences of known function and 22.2% matched sequences of putative or unknown function. The remaining 14.6% of ESTs showed no significant similarity to any protein sequence in the public database. Gene ontology (GO) classification showed that the most abundant GO terms were transport, nucleotide binding, and plastid for biological process, molecular function, and cellular component, respectively. The sequence data will be used to characterize potential roles of new genes in Pinus and will serve as a useful genetic resource.
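
The length filter and microsatellite search described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the 50-nucleotide minimum comes from the abstract, while the function names and the repeat thresholds (motifs of 2 to 6 nucleotides repeated at least four times in tandem) are assumptions chosen for the example.

```python
import re

MIN_LENGTH = 50  # minimum EST length in nucleotides, as stated in the abstract


def filter_ests(seqs, min_length=MIN_LENGTH):
    """Keep only sequences that are at least min_length nucleotides long."""
    return [s for s in seqs if len(s) >= min_length]


# Assumed thresholds (not from the abstract): a motif of 2-6 nucleotides
# followed by at least 3 more tandem copies, i.e. 4 or more copies in total.
SSR_PATTERN = re.compile(r"([ACGT]{2,6}?)\1{3,}")


def find_microsatellites(seq):
    """Return (motif, repeat_count, start_index) for each tandem repeat found."""
    hits = []
    for match in SSR_PATTERN.finditer(seq.upper()):
        motif = match.group(1)
        repeat_count = len(match.group(0)) // len(motif)
        hits.append((motif, repeat_count, match.start()))
    return hits


if __name__ == "__main__":
    print(find_microsatellites("TTGACAGAGAGAGAGACCT"))
```

Grouping the hits by motif length (di-, tri-, tetranucleotide, and so on) would then give a classification by repeat unit. Note that this simple pattern also reports long single-nucleotide runs as dinucleotide repeats, which a real pipeline would likely handle separately.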

Nurse Staffing and Health Outcomes of Psychiatric Inpatients: A Secondary Analysis of National Health Insurance Claims Data

  • Park, Suin;Park, Sohee;Lee, Young Joo;Park, Choon-Seon;Jung, Young-Chul;Kim, Sunah
    • Journal of Korean Academy of Nursing / v.50 no.3 / pp.333-348 / 2020
  • Purpose: The present study investigated the association between nurse staffing and health outcomes among psychiatric inpatients in Korea by assessing National Health Insurance claims data. Methods: The dataset included 70,136 patients aged 19 years or older who were inpatients in psychiatric wards for at least two days in 2016 and were treated for mental and behavioral disorders due to use of alcohol; schizophrenia, schizotypal and delusional disorders; or mood disorders across 453 hospitals. Nurse staffing levels were measured in three ways: registered nurse-to-inpatient ratio, registered nurse-to-adjusted inpatient ratio, and nursing staff-to-adjusted inpatient ratio. Patient outcomes included length of stay, readmission within 30 days, psychiatric emergency treatment, use of injected psycholeptics for chemical restraint, and hypnotics use. Relationships between nurse staffing levels and patient outcomes were analyzed considering both patient and system characteristics using multilevel modeling. Results: Multilevel analyses revealed that more inpatients per registered nurse, adjusted inpatients per registered nurse, and adjusted inpatients per nursing staff were associated with longer lengths of stay as well as a higher risk of readmission. More adjusted inpatients per registered nurse and adjusted inpatients per nursing staff were also associated with increased hypnotics use but a lower risk of psychiatric emergency treatment. Nurse staffing levels were not significantly associated with the use of injected psycholeptics for chemical restraint. Conclusion: Lower nurse staffing levels are associated with negative health outcomes of psychiatric inpatients. Policies for improving nurse staffing toward an optimal level should be enacted to facilitate better outcomes for psychiatric inpatients in Korea.

Access to and Utilization of the Open Source Data-related to Adolescent Health (청소년 건강관련 공개자료 접근 및 활용에 관한 고찰)

  • Lee, Jae-Eun;Sung, Jung-Hye;Lee, Won-Jae;Moon, In-Ok
    • The Journal of Korean Society for School & Community Health Education / v.11 no.1 / pp.67-78 / 2010
  • Background & Objectives: Funding agencies increasingly require investigators to share their data with others. However, there is limited guidance on how to access and use shared data. We sought to determine what the common data sharing practices in the U.S.A. are, what data related to adolescent health are freely available, and how to handle large datasets that adopt a complex study design. Methods: The study included only research data related to adolescent health that were collected in the U.S.A. and are accessible without restriction through the internet. Only raw data, not aggregated data, were considered. Major keywords for the web search were "adolescent", "children", "health", and "school". Results: Current approaches to public health data sharing lacked common standards and varied widely because of the data's complex nature, large size, local expertise, and internal procedures. Common data sharing practices include unlimited access, formal screened access, restricted access, and informal exclusive access. The Inter-University Consortium for Political and Social Research and the Centers for Disease Control and Prevention were the best data repositories. "Data on the net" was a search engine for websites providing freely available data. Six freely available datasets related to adolescent health were identified. The importance of, and methods for, incorporating complex research design into analysis were discussed. Conclusion: There have been various attempts to standardize the process for open access and open data using information technology concepts. However, it may not be easy for researchers to adapt to this technology. Therefore, the guidance provided by this study may help researchers improve their access to and use of open source data.

Image Classification Approach for Improving CBIR System Performance (콘텐트 기반의 이미지검색을 위한 분류기 접근방법)

  • Han, Woo-Jin;Sohn, Kyung-Ah
    • The Journal of Korean Institute of Communications and Information Sciences / v.41 no.7 / pp.816-822 / 2016
  • Content-based image retrieval (CBIR) is a method of searching by image features such as local color, texture, and other image content information, in contrast to conventional search based on tags or text labels. In real-life data, the number of images with tags or labels is relatively small, so it is hard to find relevant images with a text-based approach. Existing image search methods based only on image feature similarity have limited performance and do not ensure that the results are what the user expected. In this study, we propose and validate a machine learning based approach to improve the performance of the image search engine. We note that when users search for relevant images with a query image, they expect the retrieved images to belong to the same category as the query. An image classification method is therefore combined with the traditional image feature similarity method. The proposed method is extensively validated on the public PASCAL VOC dataset, which consists of 11,530 images from 20 categories.
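
The abstract does not say how the classifier output and the feature similarity are combined, so the following is only one plausible sketch: images whose predicted category matches the query's are ranked first, and cosine similarity of feature vectors orders the images within each group. The function names, the tuple layout of the database, and the choice of cosine similarity are all assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vec, query_label, database, top_k=3):
    """Rank database images for a query.

    database: list of (image_id, feature_vector, predicted_label) tuples,
    where predicted_label is assumed to come from some image classifier.
    Images sharing the query's predicted label come first; within each
    group, higher feature similarity ranks higher.
    """
    scored = [
        (label == query_label, cosine_similarity(query_vec, vec), image_id)
        for image_id, vec, label in database
    ]
    scored.sort(key=lambda item: (item[0], item[1]), reverse=True)
    return [image_id for _, _, image_id in scored[:top_k]]


if __name__ == "__main__":
    db = [
        ("a", [1.0, 0.0], "cat"),
        ("b", [0.9, 0.1], "dog"),
        ("c", [0.5, 0.5], "cat"),
    ]
    print(retrieve([1.0, 0.0], "cat", db, top_k=2))
```

In this toy database, image "b" is more similar to the query than "c" in feature space, but "c" is ranked ahead of it because it shares the query's predicted category.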

ACCURACY IMPROVEMENT OF LOBLOLLY PINE INVENTORY DATA USING MULTI SENSOR DATASETS

  • Kim, Jin-Woo;Kim, Jong-Hong;Sohn, Hong-Gyoo;Heo, Joon
    • Proceedings of the KSRS Conference / v.2 / pp.590-593 / 2006
  • Timber inventory management includes measuring and updating forest attributes, which provide crucial information for private companies and public organizations in property assessment and environmental monitoring. Field measurement is accurate but time-consuming and inefficient. For this reason, remote sensing technology has become an economical alternative to field measurement. Among several sensors, LiDAR and radar interferometry are known for their efficiency in forest monitoring because they are less influenced by weather and light conditions and provide reasonably accurate vertical and horizontal measurements for a large area in a short period. For example, the Shuttle Radar Topography Mission (SRTM) and the National Elevation Dataset (NED) in the U.S. can provide tree height information and a DSM. LiDAR DSM (the first return) and DEM (the last return) can also provide tree height estimates. For a project site in a loblolly pine plantation in Louisiana in the U.S., the accuracy of the SRTM C-band approach to estimating tree height was assessed against the LiDAR approaches. In addition, SRTM X-band and NED were compared with the results. Plantation year in the inventory GIS, which is directly related to forest age, is highly correlated with the difference between SRTM C-band and NED. As a byproduct, several stands with age mismatches could be recognized using an outlier detection algorithm, and optical satellite images (ETM+) were used to verify the mismatches. The findings of this study were (1) confirmation of the usefulness of the SRTM DSM for forest monitoring and (2) that multiple sensors (radar, LiDAR, ETM+, MODIS) can be used together to improve the accuracy of forest inventory GIS.
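
The tree height estimate mentioned here, a surface model (DSM) minus a ground model (DEM), can be sketched on small grids as below. This is an illustrative sketch only: the function names are hypothetical, the grids are assumed to be co-registered and in the same units, and clamping negative differences to zero is an assumption of this example rather than something the abstract states.

```python
def canopy_height(dsm, dem):
    """Per-cell height estimate: surface elevation minus ground elevation.

    dsm, dem: equally sized 2D lists of elevations (same units, same grid).
    Negative differences are clamped to zero (an assumption of this sketch).
    """
    return [
        [max(surface - ground, 0.0) for surface, ground in zip(dsm_row, dem_row)]
        for dsm_row, dem_row in zip(dsm, dem)
    ]


def mean_height(height_grid):
    """Mean of all cells, e.g. to summarize one forest stand."""
    cells = [h for row in height_grid for h in row]
    return sum(cells) / len(cells)


if __name__ == "__main__":
    dsm = [[30.0, 32.5], [28.0, 25.0]]  # e.g. first-return or radar surface
    dem = [[10.0, 12.5], [10.0, 26.0]]  # e.g. last-return or bare-earth model
    heights = canopy_height(dsm, dem)
    print(heights, mean_height(heights))
```

The same subtraction applies whether the surface comes from LiDAR first returns or from SRTM and the ground from LiDAR last returns or NED; only the input grids change.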

Low-dose CT Image Denoising Using Classification Densely Connected Residual Network

  • Ming, Jun;Yi, Benshun;Zhang, Yungang;Li, Huixin
    • KSII Transactions on Internet and Information Systems (TIIS) / v.14 no.6 / pp.2480-2496 / 2020
  • Considering that high-dose X-ray radiation during CT scans may pose risks to patients, the medical imaging industry has placed increasing emphasis on low-dose CT. Because of the complex statistical characteristics of noise in low-dose CT (LDCT) images, many traditional methods struggle to preserve structural details while suppressing noise and artifacts. Inspired by deep learning techniques, we propose a densely connected residual network (DCRN) for low-dose CT image denoising, which combines dense connections with residual learning. On one hand, dense connections maximize information flow between layers in the network, which helps maintain structural details when denoising images. On the other hand, residual learning paired with batch normalization allows for faster training and better noise reduction performance. The experiments are performed on 100 CT images selected from a public medical dataset, TCIA (The Cancer Imaging Archive). Compared with three other competitive denoising algorithms, both subjective visual quality and objective evaluation indexes, including PSNR, RMSE, MAE, and SSIM, show that the proposed network improves LDCT image quality more effectively while maintaining a low computational cost. On the objective indexes, the proposed method achieves the best results: PSNR 33.67, RMSE 5.659, MAE 1.965, and SSIM 0.9434. For RMSE in particular, the proposed network improves on the best-performing comparison algorithm by 7 percentage points.
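
Three of the objective indexes named in this abstract (MAE, RMSE, and PSNR) have simple closed forms, sketched below for images given as flat lists of pixel values. This is a minimal sketch, not the authors' evaluation code: the function names are hypothetical, the peak value of 255 is an assumption (CT data often use a different dynamic range), and SSIM is omitted because it requires windowed local statistics.

```python
import math


def mae(reference, test):
    """Mean absolute error between two equally sized pixel lists."""
    return sum(abs(r - t) for r, t in zip(reference, test)) / len(reference)


def rmse(reference, test):
    """Root mean squared error between two equally sized pixel lists."""
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    return math.sqrt(mse)


def psnr(reference, test, max_value=255.0):
    """Peak signal-to-noise ratio in dB; max_value is an assumed peak."""
    err = rmse(reference, test)
    if err == 0:
        return float("inf")  # identical images
    return 20.0 * math.log10(max_value / err)


if __name__ == "__main__":
    clean = [0, 10, 20, 30]
    denoised = [0, 10, 20, 34]
    print(mae(clean, denoised), rmse(clean, denoised), psnr(clean, denoised))
```

Higher PSNR and SSIM indicate better quality, whereas lower RMSE and MAE are better, which is worth keeping in mind when reading the reported values.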

CONSTRUCTION OF DATABASE FOR THE DIGITIZED SKY SURVEY I DATA (DIGITIZED SKY SURVEY I 자료의 검색 DB 구축)

  • Sung, Hyun-Il;Sang, Jian;Kim, Sang-Chul;Kim, Bong-Gyu;Yim, In-Sung;Ahn, Young-Suk;Sohn, Sang-Mo-Tony;Yang, Hong-Jin
    • Publications of The Korean Astronomical Society / v.20 no.1 s.24 / pp.55-62 / 2005
  • The First Generation Digitized Sky Survey (DSS-I) is a collection of digitized photographic atlases of the night sky taken from the Palomar Observatory (northern sky) and the Anglo-Australian Observatory (southern sky). DSS-I is widely used by the astronomical community for a number of applications, including object cross-identification and astrometry. However, accessing and retrieving the actual images is nontrivial owing to the huge size (> 60 GB) of the dataset. To facilitate retrieval of DSS-I data for the public, the Korean Astronomical Data Center (KADC) developed a web application that provides not only data retrieval but also visualization functions. The web application consists of several modules developed using Java Applet, Java Servlet, and JavaServer Pages (JSP) technologies. It allows users to retrieve images efficiently in various formats such as FITS, JPEG, GIF, and TIFF, and also offers an interactive visualization tool, ImgViewer, for displaying and analyzing FITS images. To use the web application, users need a Java-enabled web browser.

Fake News Detection for Korean News Using Text Mining and Machine Learning Techniques (텍스트 마이닝과 기계 학습을 이용한 국내 가짜뉴스 예측)

  • Yun, Tae-Uk;Ahn, Hyunchul
    • Journal of Information Technology Applications and Management / v.25 no.1 / pp.19-32 / 2018
  • Fake news is defined as news articles that are intentionally and verifiably false and could mislead readers. The spread of fake news may provoke anxiety, chaos, fear, or irrational decisions among the public. Thus, detecting fake news and preventing its spread has become a very important issue in our society. However, because of the huge amount of fake news produced every day, it is almost impossible for humans to identify it. In this context, researchers have tried over the past years to develop automated fake news detection methods using artificial intelligence techniques. Unfortunately, no prior study has proposed an automated fake news detection method for Korean news. In this study, we aim to detect Korean fake news using text mining and machine learning techniques. Our proposed method consists of two steps. In the first step, the news content to be analyzed is converted to quantified values using various text mining techniques (topic modeling, TF-IDF, and so on). In the second step, classifiers are trained using the values produced in the first step. As classifiers, machine learning techniques such as multiple discriminant analysis, case-based reasoning, artificial neural networks, and support vector machines can be applied. To validate the effectiveness of the proposed method, we collected 200 Korean news items from Seoul National University's FactCheck (http://factcheck.snu.ac.kr), which provides detailed analysis reports from about 20 media outlets and links to source documents for each case. Using this dataset, we will identify which text features are important and which classifiers are effective in detecting Korean fake news.
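
The first step of the method, turning news text into numeric features, can be illustrated with a plain TF-IDF computation. This is a sketch under assumptions, not the authors' implementation: tokenization is assumed to have been done already (Korean text would need a morphological analyzer), the function name is hypothetical, and the weighting shown (term frequency times the natural log of inverse document frequency) is only one of several common TF-IDF variants.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Compute TF-IDF weights for pre-tokenized documents.

    docs: list of token lists. Returns one {term: weight} dict per document,
    using tf = count / document length and idf = ln(N / document frequency).
    """
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append(
            {
                term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
                for term, count in counts.items()
            }
        )
    return vectors


if __name__ == "__main__":
    corpus = [["fake", "news", "claim"], ["news", "report"]]
    for vector in tf_idf(corpus):
        print(vector)
```

Terms that occur in every document (here "news") receive zero weight, while terms specific to one document are weighted up. The resulting vectors would then feed the classifiers of the second step.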

Burden of Virus-associated Liver Cancer in the Arab World, 1990-2010

  • Khan, Gulfaraz;Hashim, M. Jawad
    • Asian Pacific Journal of Cancer Prevention / v.16 no.1 / pp.265-270 / 2015
  • Hepatocellular carcinoma (HCC) is among the top three causes of cancer death worldwide, with hepatitis B and C viruses (HBV/HCV) as the main etiological agents. An up-to-date descriptive epidemiology of the burden of HBV/HCV-associated HCC in the Arab world is lacking. We therefore determined the burden of HBV/HCV-associated HCC deaths in the Arab world using the Global Burden of Disease (GBD) 2010 dataset. GBD 2010 provides, for the first time, deaths specifically attributable to virus-associated HCC. We analyzed the data for the 22 Arab countries by age, sex, and economic status from 1990 to 2010 and compared the findings to global trends. Our analysis revealed that in 2010, an estimated 752,101 deaths occurred from HCC worldwide. Of these, 537,093 (71%) were from HBV/HCV-associated HCC. In the Arab world, 17,638 deaths occurred from HCC, of which 13,558 (77%) were HBV/HCV-linked. From 1990 to 2010, the burden of HBV- and HCV-associated HCC deaths in the Arab world increased by 137% and 216%, respectively, compared to global increases of 62% and 73%. Age-standardized death rates also increased in most of the Arab countries, with the highest rates noted in Mauritania and Egypt. Male gender and low economic status correlated with higher rates. These findings indicate that the burden of HBV/HCV-associated HCC in the Arab world is rising at a much faster rate than in the rest of the world and that urgent public health measures are necessary to abate this trend and diminish the impact on already stretched regional healthcare systems.
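
The percentage shares quoted in this abstract follow directly from the death counts it reports, as the short check below shows. The helper name is hypothetical; the four counts are taken from the abstract.

```python
def share_percent(part, total):
    """Share of `part` in `total`, expressed as a percentage."""
    return 100.0 * part / total


# Death counts for 2010 as reported in the abstract.
global_share = share_percent(537_093, 752_101)  # HBV/HCV-associated, worldwide
arab_share = share_percent(13_558, 17_638)  # HBV/HCV-associated, Arab world

if __name__ == "__main__":
    print(round(global_share, 1), round(arab_share, 1))
```

Both values round to the 71% and 77% stated in the text.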

Police Networks for Criminal Intelligence Functions: Based on Informal Social Network Analysis (경찰 범죄정보 수집 활동의 관계망 분석: 비공식적 사회연결망 분석을 중심으로)

  • Choi, Yeong Jin;Yang, Chang Hoon
    • The Journal of the Korea Contents Association / v.20 no.1 / pp.448-459 / 2020
  • Gathering, producing, and sharing criminal information have recently become critically important intelligence functions of police agencies for improving public safety and national security. However, the inadequacies and barriers that police agencies face in their intelligence functions impede criminal information gathering, intelligence production within an agency, and intelligence sharing with other agencies. In this study, we analyzed informal networks constructed from a survey dataset on information and intelligence sharing among officers in police agencies. The results revealed that the structural properties of intelligence networks differ between police agencies. We found that officers with high indegree and outdegree in a network played a critical role in the dynamics and degree of intelligence gathering and assessment responsibilities. Finally, we found evidence that the potential role of intermediaries triggered relational dynamics for developing and sharing critical information among all police agencies.
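
The indegree and outdegree measures mentioned in this abstract can be computed from a directed edge list as sketched below. The function name and the example edges are hypothetical; an edge (A, B) is assumed here to mean that officer A reports sharing information with officer B.

```python
from collections import Counter


def degree_counts(edges):
    """Return (indegree, outdegree) Counters for a list of directed edges.

    edges: list of (source, target) pairs. Indegree counts incoming ties
    (how many others name this node); outdegree counts outgoing ties.
    """
    outdegree = Counter(source for source, _ in edges)
    indegree = Counter(target for _, target in edges)
    return indegree, outdegree


if __name__ == "__main__":
    # Hypothetical sharing ties among four officers.
    ties = [("A", "B"), ("A", "C"), ("B", "C"), ("D", "A")]
    indegree, outdegree = degree_counts(ties)
    print(indegree.most_common(), outdegree.most_common())
```

Officers who score high on both counts are the kind of central actors the study highlights, and nodes that connect otherwise separate groups would correspond to the intermediaries it describes.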