• Title/Summary/Keyword: Consistency for classification

An Automated Topic Specific Web Crawler Calculating Degree of Relevance (연관도를 계산하는 자동화된 주제 기반 웹 수집기)

  • Seo Hae-Sung;Choi Young-Soo;Choi Kyung-Hee;Jung Gi-Hyun;Noh Sang-Uk
    • Journal of Internet Computing and Services / v.7 no.3 / pp.155-167 / 2006
  • It is desirable for users surfing the Internet to find Web pages that match their interests as closely as possible. Toward this end, this paper presents a topic-specific Web crawler that computes the degree of relevance, collects a cluster of pages for a given topic, and refines the preliminary set of related Web pages using term frequency/document frequency, entropy, and compiled rules. In the experiments, we tested our topic-specific crawler in terms of classification accuracy, crawling efficiency, and crawling consistency. First, the set of rules compiled by CN2 gave the best classification accuracy, ahead of the C4.5 and back-propagation learning algorithms. Second, we measured the classification efficiency to determine the best threshold value affecting the degree of relevance. In the third experiment, the consistency of the crawler was measured as the number of resulting URLs that overlapped across different starting URLs. The experimental results imply that our topic-specific crawler was fairly consistent, regardless of the randomly chosen starting URLs.
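
As a rough illustration of the third experiment, the overlap between crawls launched from different starting URLs can be counted with set operations. The abstract does not give the exact measure, so the ratio below (shared URLs divided by all URLs seen) is an assumption, and the function name and example URLs are hypothetical.

```python
def overlap_consistency(runs):
    """Return (shared_count, shared_ratio) for the URL lists of several crawls.

    shared_ratio is intersection size over union size, which is an
    assumed measure; the paper only reports the number of overlapping URLs.
    """
    url_sets = [set(urls) for urls in runs]
    if not url_sets:
        return 0, 0.0
    shared = set.intersection(*url_sets)   # URLs found by every crawl
    union = set.union(*url_sets)           # URLs found by at least one crawl
    ratio = len(shared) / len(union) if union else 0.0
    return len(shared), ratio


# Hypothetical results of two crawls started from different seed URLs.
runs = [
    ["http://a.example/1", "http://a.example/2", "http://a.example/3"],
    ["http://a.example/2", "http://a.example/3", "http://a.example/4"],
]
count, ratio = overlap_consistency(runs)
print(count, ratio)
```

A ratio close to 1 would indicate that the crawler converges on the same pages regardless of where it starts.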

A Preliminary Study on Interchange of Science and Technology Information through Harmonization of Classification Schemes (분류체계 일치를 통한 과학기술정보 상호 교환 방법에 관한 기초 연구)

  • Hong, Sung-Wha;Seo, Tae-Sul
    • Journal of Information Management / v.35 no.3 / pp.109-123 / 2004
  • The problem of semantic interoperability in science and technology information is frequently raised. A well-established classification scheme can serve as a tool for interchanging information between different databases without semantic inconsistency. However, a practical barrier remains because each database adopts a different classification scheme. Accordingly, it is urgent to harmonize or reconcile those classifications with each other. This paper aims to resolve the semantic inconsistencies that occur when interchanging information between databases with different classification schemes, namely the Standard National Sci-Tech Classification and the Standard KISTI Classification. For this purpose, a conceptual analysis of science and technology is performed, and five consistency/inconsistency types are analyzed based on examples.

A Study on the Notes Analysis of KDC 5th Edition (KDC 제5판의 주기분석에 관한 연구)

  • Chung, Ok-Kyung
    • Journal of the Korean BIBLIA Society for library and Information Science / v.22 no.3 / pp.207-228 / 2011
  • The notes of a classification system improve the accuracy and consistency of classification by providing useful information on classification numbers and items. Although several types of notes are used in KDC, they are not enough to keep up with today's rapidly developing and expanding knowledge. The purpose of this study is to suggest appropriate types of notes and improvements to the notes in KDC 5th edition. To this end, the transition of notes in KDC was analyzed. The notes of DDC 23rd edition, NDC new 9th edition, and KDC 5th edition were also analyzed. Based on this comparison and analysis, problems with the notes in KDC were identified and improvements were suggested.

Classified Chemicals in Accordance with the Globally Harmonized System of Classification and Labeling of Chemicals: Comparison of Lists of the European Union, Japan, Malaysia and New Zealand

  • Yazid, Mohd Fadhil H.A.;Ta, Goh Choo;Mokhtar, Mazlin
    • Safety and Health at Work / v.11 no.2 / pp.152-158 / 2020
  • Background: The Globally Harmonized System of Classification and Labeling of Chemicals (GHS) was developed to enhance chemical classification and hazard communication systems worldwide. However, some of the elements such as building blocks and data sources have the potential to cause "disharmony" to the GHS, particularly in its classification results. It is known that some countries have developed their own lists of classified chemicals in accordance with the GHS to "standardize" the classification results within their respective countries. However, the lists of classified chemicals may not be consistent among these countries. Method: In this study, the lists of classified chemicals developed by the European Union, Japan, Malaysia, and New Zealand were selected for comparison of classification results for carcinogenicity, germ cell mutagenicity, and reproductive toxicity. Results: The findings show that only 54%, 66%, and 37% of the classification results for the carcinogenicity, germ cell mutagenicity, and reproductive toxicity hazard classes, respectively, are the same among the selected countries. This indicates a "moderate" level of consistency among the classified chemicals lists. Conclusion: By using classification results for the carcinogenicity, germ cell mutagenicity, and reproductive toxicity hazard classes, this study demonstrates the "disharmony" in the classification results among the selected countries. We believe that the findings of this study deserve the attention of the relevant international bodies.
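
The kind of percentage reported in the Results can be sketched as the share of chemicals that receive an identical category in every list. This is only a minimal sketch: restricting the comparison to chemicals present in all lists is an assumption, and the chemical names and category values below are placeholders, not data from the study.

```python
def agreement_rate(lists):
    """Fraction of chemicals, among those present in every list,
    that are assigned the same category by all jurisdictions.

    lists maps a jurisdiction name to a dict of chemical -> category.
    """
    tables = list(lists.values())
    if not tables:
        return 0.0
    common = set.intersection(*(set(t) for t in tables))
    if not common:
        return 0.0
    same = sum(1 for chem in common if len({t[chem] for t in tables}) == 1)
    return same / len(common)


# Placeholder data: four made-up chemicals with made-up categories.
lists = {
    "EU":          {"chem_A": "1B", "chem_B": "2",  "chem_C": "1A", "chem_D": "2"},
    "Japan":       {"chem_A": "1B", "chem_B": "1B", "chem_C": "1A", "chem_D": "2"},
    "Malaysia":    {"chem_A": "1B", "chem_B": "2",  "chem_C": "1A", "chem_D": "1B"},
    "New Zealand": {"chem_A": "1B", "chem_B": "2",  "chem_C": "1A", "chem_D": "2"},
}
rate = agreement_rate(lists)
print(f"{rate:.0%}")
```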

A Study on the Structure of Geographical Division in Library Classification System (문헌분류법에서의 지역구분에 관한 연구)

  • Nam, Tae-Woo;Baek, Hae-Kyung;Lee, Hyung-Mi;Jeong, Soo-Jin
    • Journal of Korean Library and Information Science Society / v.39 no.4 / pp.189-214 / 2008
  • The objective of this research is to point out problems in the geographic division structure of the current Korean Decimal Classification (KDC) and to provide solutions. For this purpose, major classification systems were divided into decimal and non-decimal systems and analyzed for their geographical division principles. In addition, the regional division standards of national institutes in Korea, the USA, and Japan were examined. Based on these analyses, we offer suggestions for improving the table of geographical division in KDC4, including the structure, relations, and consistency of the administrative district classifications used by public institutions, and we propose other regional division standards that reflect various geographical conditions in place of the typical administrative districts.

Optimal dwelling time prediction for package tour using K-nearest neighbor classification algorithm

  • Aria Bisma Wahyutama;Mintae Hwang
    • ETRI Journal / v.46 no.3 / pp.473-484 / 2024
  • We introduce a machine learning-based web application to help travel agents plan a package tour schedule. K-nearest neighbor (KNN) classification predicts tourists' optimal dwelling time based on a variety of information to automatically generate a convenient tour schedule. A database collected in collaboration with an established travel agency is fed into the KNN algorithm implemented in the Python language, and the predicted dwelling times are sent to the web application via a RESTful application programming interface provided by the Flask framework. The web application displays a page in which the agents can configure the initial data, predict the optimal dwelling time, and automatically update the tour schedule. In a performance evaluation that simulated a scenario on a computer running the Windows operating system, the average response time was 1.762 s, and the prediction consistency was 100% over 100 iterations.
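
To make the KNN step concrete, here is a minimal pure-Python sketch of majority-vote classification. It is not the authors' implementation: the two features (group size, average age), the dwelling-time classes in minutes, and the training rows are all hypothetical, and the Flask API layer is omitted.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Predict a label by majority vote among the k nearest training points.

    train is a list of (feature_tuple, label) pairs; distance is Euclidean.
    """
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Hypothetical training rows: (group size, average age) -> dwelling time class (min).
train = [
    ((2, 30), 60),
    ((3, 35), 60),
    ((4, 28), 60),
    ((20, 60), 120),
    ((25, 65), 120),
]

# Repeating the same query shows that the prediction does not vary between runs.
predictions = {knn_predict(train, (5, 32)) for _ in range(100)}
print(predictions)
```

Because this form of KNN has no random component, repeated predictions on the same input and training data are identical, which is one plausible reading of the 100% prediction consistency reported above.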

Developing an Automatic Classification System Based on Colon Classification: with Special Reference to the Books housed in Medical and Agricultural Libraries (콜론분류법에 바탕한 자동분류시스템의 개발에 관한 연구 - 농학 및 의학 전문도서관을 사레로 -)

  • Lee Kyung-Ho
    • Journal of the Korean Society for Library and Information Science / v.23 / pp.207-261 / 1992
  • The purpose of this study is (1) to design and test a database which can be automatically classified, and (2) to generate automatic classification numbers by processing the keywords in titles using the code combination method of Colon Classification (CC) as well as automatic recognition of subjects, in order to develop an automatic classification system (Auto BC System) based on CC which can be applied to any research library. To conduct this study, 1,510 words in the fields of agriculture and medicine were selected, analyzed in terms of the [P], [M], [E], [S], [T] facets employed in CC, and included in a database for classification. For the above-mentioned subject fields, the principle of automatic classification was specified in order to generate automatic classification codes as well as to perform automatic subject recognition of the titles included. Whenever necessary, the database can be edited, deleted from, appended to, and reindexed in this automatic classification system. Appendix 1 shows the result of the automatic classification of books in the fields of agriculture and medicine. The results of the study are summarized below. 1. The classification number for the title of a book can be automatically generated by using the facet principles of Colon Classification. 2. The automatic subject recognition of a book is achieved by designing a database making use of a globe-principle, and by specifying the subject field for each word. 3. The automatic subject recognition of input data is achieved by measuring the number of searched words in each subject field. 4. The combination of classification numbers is achieved by flowcharting the classification formula of each subject field. 5. The efficient control of classification numbers is achieved by designing control codes in the database for classification. 6. Automatic classification by means of Auto BC has proved successful in a research library concentrating on a single field. A general library may have some problems in employing this system. Automatic classification through Auto BC has the following advantages: 1. The speed of the classification process can be improved. 2. The revision or updating of classification schemes can be facilitated. 3. Multiple concepts can be expressed in a single classification code. 4. Consistency of classification can be achieved with the classification formula rather than the classifier's subjective judgement. 5. A user's retrieval process can be carried out after combining the classification numbers through keywords relating to the material to be searched. 6. The materials can be classified by a librarian without a subject background. 7. A large body of materials can be quickly classified by means of machine processing. 8. This automatic classification is expected to make a good contribution to the design of a total system for library operations. 9. The information flow among libraries can be promoted owing to the use of the same program for automatic classification.
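
Result 3 above (recognizing the subject by counting matched words per subject field) can be sketched as follows. The word list here is a tiny hypothetical stand-in for the 1,510-word database, and the sketch ignores the [P], [M], [E], [S], [T] facet codes and the generation of class numbers.

```python
from collections import Counter

# Hypothetical word-to-subject table; the study's database is not reproduced here.
SUBJECT_WORDS = {
    "wheat": "agriculture",
    "soil": "agriculture",
    "crop": "agriculture",
    "surgery": "medicine",
    "disease": "medicine",
}


def recognize_subject(title):
    """Return the subject field with the most matching title words, or None."""
    words = title.lower().split()
    hits = Counter(SUBJECT_WORDS[w] for w in words if w in SUBJECT_WORDS)
    return hits.most_common(1)[0][0] if hits else None


print(recognize_subject("Soil and crop disease"))
```

In this example two words match agriculture and one matches medicine, so the title is assigned to agriculture.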

A Study on the 'Religion Class' of DDC (DDC에 있어서 종교류 분류전개상의 제문제)

  • Byun Woo-Yeoul
    • Journal of the Korean Society for Library and Information Science / v.22 / pp.259-304 / 1992
  • This paper examines the 'Religion Class' in the scheme of the DDC. The major findings of the study are summarized as follows. 1. The first edition of DDC was published in 1876 in order to classify the Amherst College Library collections. In spite of continuous study and revision by experts, the framework of the DDC system remains unchanged. Only its subdivisions, reflecting developments in the academic world, have been developed and detailed more elaborately. 2. The division of 200 does not function as generalities for all classes of religion. Therefore, it is necessary to amend the division of 200 to serve as generalities for all the religions of the world. 3. The standard subdivisions for the Christian religion and for non-Christian religions are different. As a result, the mnemonic nature has been weakened by the dual standard subdivisions, and the classification number becomes much longer and more complicated. Therefore, one standard subdivision for all religions of the world is required. 4. Religion science was organized in the late 19th century and has developed continuously, but the DDC does not accommodate religion science as a science. Accordingly, the DDC should be revised to recognize religion science as a science, not as Christian science. 5. The deployment of the classification scheme in Dewey's 200 is severely biased. That is to say, 9 divisions were assigned to the Christian religion, whereas only 1 division was assigned to non-Christian religions. Therefore, an adjustment should be made to allocate subdivisions equally to all religions of the world. 6. The general classification order of religion in religion science is prehistoric, primitive, ancient, modern, and world religion. However, DDC does not accept this general classification order, sticking to its biased expansion towards Christianity. Therefore, DDC must adopt the general classification order of religion used in religion science. 7. Lastly, because of the limitation of decimal notation in DC, DDC does not accommodate new subjects equally and the classification number becomes longer. Therefore, centesimal expansion is proposed in order to shorten the classification number, to enlarge its capacity for including new subjects, and to maintain consistency in the scheme.

Evaluation of Information Consistency of Clinically Significant Drug Interactions in Tyrosine Kinase Inhibitors (타이로신키나아제 억제제의 임상적으로 유의한 약물상호작용 정보 일관성 분석)

  • An, Seulki;Lee, Ju-Yeun;Ah, Young-Mi
    • Korean Journal of Clinical Pharmacy / v.30 no.1 / pp.44-50 / 2020
  • Background: Drug-drug interactions (DDIs) in patients using oral anticancer treatment are more common than in those using injectable anticancer agents. In addition, DDIs related to anticancer treatment are known to cause clinically significant outcomes, such as treatment failure and severe toxicity. To prevent these negative outcomes, significant DDIs are monitored and managed using the information provided in drug databases. We aimed to evaluate the consistency of information on clinically significant DDIs for tyrosine kinase inhibitors (TKIs) between representative drug databases. Methods: We selected clinically significant DDIs involving medications that are co-prescribed with TKIs and met the following criteria: the severity level of DDIs was equal to or greater than "D" in Lexicomp® or "major" in Micromedex®. We then analyzed the consistency of the severity classification and evidence level between the drug databases. Spearman's correlation coefficient was used to identify the relationship between DDI information in the drug databases. Results: In total, 627 DDI pairs were identified as clinically significant; information on these was provided by Lexicomp® and Micromedex® for 571 and 438 pairs, respectively, and both drug databases provided information on 382 DDI pairs. There was no correlation between the severity and evidence level of DDIs provided in the two databases; Spearman's correlation coefficient for Lexicomp® and Micromedex® was -0.009 (p=0.861) and -0.064 (p=0.209), respectively. Conclusion: To judge the significance of DDIs, healthcare providers should consider that the information on DDIs may be different between drug information databases; hence, clinical factors must be considered concurrently.
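
For readers unfamiliar with the statistic used in the Methods, Spearman's coefficient is the Pearson correlation computed on ranks, with tied values sharing their average rank. The sketch below assumes that ordinal levels have been coded as numbers; the coding and the sample values are hypothetical, and no p-value is computed.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the average ranks of x and y.

    Raises ZeroDivisionError if either input is constant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Hypothetical numeric codes for severity and evidence level of six DDI pairs.
severity = [3, 3, 2, 1, 2, 3]
evidence = [1, 3, 2, 2, 1, 2]
rho = spearman(severity, evidence)
print(round(rho, 3))
```

A value near 0, as reported for both databases, means that knowing the severity rank of an interaction says little about the rank of its evidence level.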

Korean Brain Tumor Society Consensus Review for the Practical Recommendations on Glioma Management in Korea

  • Chul-Kee Park;Jong Hee Chang
    • Journal of Korean Neurosurgical Society / v.66 no.3 / pp.308-315 / 2023
  • Recent updates in genomic-integrated glioma classification have caused confusion in current clinical practice, as management protocols and health insurance systems are based on evidence from previous diagnostic classifications. The Korean Brain Tumor Society conducted an electronic questionnaire for society members, asking for their ideas on risk group categorization and preferred treatment for each individual diagnosis listed in the new World Health Organization (WHO) classification of gliomas. Additionally, the current off-label drug use (OLDU) protocols for glioma management approved by the Health Insurance Review and Assessment Service (HIRA) in Korea were investigated. A total of 24 responses were collected from 20 major institutes in Korea. A consensus was reached on the dichotomic definition of risk groups for glioma prognosis, using age, performance status, and extent of resection. In selecting management protocols, there was general consistency in decisions according to the WHO grade and the risk group, regardless of the individual diagnosis. As of December 2022, there were 22 OLDU protocols available for the management of gliomas in Korea. The consensus and available options described in this report will be temporarily helpful until there is an accumulation of evidence for effective management under the new classification system for gliomas.