• Title/Summary/Keyword: Number of clusters

Domain Analysis on the Field of Open Access by Co-Word Analysis: Based on Published Journals of Library and Information Science during 2013 to 2018 (동시출현단어 분석을 활용한 오픈액세스 분야의 지적구조 분석: 2013년부터 2018년까지 출판된 문헌정보학 저널을 기반으로)

  • Kim, Sun-Kyum;Kim, Wan-Jong;Seo, Tae-Sul;Choi, Hyun-Jin
    • Journal of Korean Library and Information Science Society / v.50 no.1 / pp.333-356 / 2019
  • Open access has emerged as an alternative for overcoming the crisis in scholarly communication caused by its reliance on commercial publishers. The purpose of this study is to present an intellectual structure that reflects the latest research trends in the field of open access, to identify how the subject area is structured by using co-word analysis, and to compare the results with an existing study. To this end, a dataset of 761 information science papers was collected from Web of Science for the period from January 2012 to November 2018, and 2,321 keywords in the form of noun phrases were extracted from titles and abstracts. To analyze the intellectual structure of open access, 13 topic clusters were extracted by network analysis, and the keywords with higher centrality were identified by visualizing the intellectual relationships. In addition, after cluster analysis, the relationships were analyzed by plotting the results on a multidimensional scaling map. We expect this research to help guide the future direction of open access research.
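
The co-word step described in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the keyword lists are hypothetical, and the function names `coword_counts` and `weighted_degree` are introduced here only for illustration. The sketch counts how often two keywords appear in the same paper and uses the summed co-occurrence weight as a simple stand-in for centrality.

```python
from collections import Counter
from itertools import combinations


def coword_counts(docs):
    """Count how often each unordered keyword pair appears in the same document."""
    pairs = Counter()
    for keywords in docs:
        # Deduplicate and sort so every pair has one canonical ordering.
        for a, b in combinations(sorted(set(keywords)), 2):
            pairs[(a, b)] += 1
    return pairs


def weighted_degree(pairs):
    """Sum of co-occurrence weights per keyword (a simple centrality proxy)."""
    degree = Counter()
    for (a, b), weight in pairs.items():
        degree[a] += weight
        degree[b] += weight
    return degree


# Hypothetical keyword sets, one per paper (not data from the study).
docs = [
    ["open access", "repository", "license"],
    ["open access", "repository"],
    ["open access", "license", "journal"],
]
counts = coword_counts(docs)
degree = weighted_degree(counts)
```

The resulting pair counts form the weighted edge list of a keyword network, which is the kind of input that clustering and multidimensional scaling would then operate on.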

A Big Data Based Random Motif Frequency Method for Analyzing Human Proteins (인간 단백질 분석을 위한 빅 데이타 기반 RMF 방법)

  • Kim, Eun-Mi;Jeong, Jong-Cheol;Lee, Bae-Ho
    • The Journal of the Korea institute of electronic communication sciences / v.13 no.6 / pp.1397-1404 / 2018
  • Due to the technical difficulties and high cost of obtaining 3-dimensional structure data, sequence-based approaches in proteins have not been widely acknowledged. A motif can be defined as any segment of a protein or gene sequence. Because of this simplicity, motifs have been actively and widely used in various areas. However, the motif itself has not been studied comprehensively. The value of this study, which analyzes human proteins using an artificial intelligence method, falls into three areas: (1) To the best of our knowledge, this research is the first comprehensive motif analysis covering all human proteins in the Protein Data Bank (PDB) associated with Enzyme Commission (EC) numbers and the Structural Classification of Proteins (SCOP) database. (2) We analyze motifs in depth in three categories: pattern, statistical, and functional analysis of clusters. (3) Finally, and most importantly, we propose a random motif frequency (RMF) matrix that can efficiently distinguish the characteristics of proteins by separating interface residues from non-interface residues and clustering protein functions based on big data while varying the size of the random motif.
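
Since the abstract above defines a motif as any segment of a sequence and varies the motif size, a rough Python sketch of segment-frequency counting may help. This is an assumption-laden illustration, not the authors' RMF method: the function name `motif_frequencies` and the example sequence are hypothetical, and the sketch counts every contiguous segment of a given length rather than sampling random motifs.

```python
from collections import Counter


def motif_frequencies(sequence, size):
    """Relative frequency of every contiguous segment of length `size`."""
    total = len(sequence) - size + 1
    if total <= 0:
        return {}  # sequence is shorter than the requested motif size
    counts = Counter(sequence[i:i + size] for i in range(total))
    return {motif: n / total for motif, n in counts.items()}


# Hypothetical amino-acid string (not taken from PDB).
seq = "MKVLAAGKVLA"
freqs = motif_frequencies(seq, 3)
```

Calling the function with several values of `size` gives one frequency profile per motif length, which could then serve as feature vectors for clustering.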

Analysis of data on prevention of school violence based on AI unsupervised learning (AI 비지도 학습 기반의 학교폭력 예방 데이터 분석)

  • Jung, Soyeong;Ma, Youngji;Koo, Dukhoi
    • Korean Association of Information Education: Conference Proceedings (한국정보교육학회:학술대회논문집) / 2021.08a / pp.85-91 / 2021
  • School violence has long been recognized as a social problem, and various efforts have been made to prevent it. In this study, we propose a system that can help prevent school violence by analyzing data on the frequency of conversations between students and identifying peer relationships. The frequency of conversations between students in the class was quantified using a rating scale questionnaire, and these data were grouped into an appropriate number of clusters using the K-means algorithm. Additionally, the homeroom teacher observed the frequency and nature of conversations between students and targeted specific individuals or groups for counseling and intervention, with the aim of reducing school violence. Data analysis revealed that the teachers' qualitative observations were consistent with the quantified data from the student questionnaires, which are therefore applicable as quantitative data for identifying and understanding student relationships within the classroom. The study has potential limitations. The data used are subjective and based on peer evaluations, which can be inconsistent because students may use different criteria to evaluate one another. It is expected that this study will help homeroom teachers in their efforts to prevent school violence by understanding the relationships between students within the classroom.
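
The abstract above does not say how the "appropriate number of clusters" was chosen. As one common option, the sketch below runs plain K-means (Lloyd's algorithm) for several values of k and compares the within-cluster sum of squares, the so-called elbow heuristic. The data points, the deterministic initialization, and the names `kmeans` and `sqdist` are assumptions made for this illustration, not details from the study.

```python
def sqdist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iters=100):
    """Lloyd's algorithm; returns (centroids, labels, inertia). Requires k <= len(points)."""
    # Deterministic start (an assumption): k points spread through the sorted data.
    ordered = sorted(points)
    step = len(ordered) / k
    centroids = [ordered[int(i * step + step / 2)] for i in range(k)]
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda c: sqdist(p, centroids[c]))
                      for p in points]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster goes empty
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    inertia = sum(sqdist(p, centroids[lab]) for p, lab in zip(points, labels))
    return centroids, labels, inertia


# Hypothetical two-feature conversation-frequency scores for six students.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
inertias = {k: kmeans(points, k)[2] for k in (1, 2, 3)}
labels_k2 = kmeans(points, 2)[1]
```

On this toy data the inertia drops sharply from k = 1 to k = 2 and only slightly afterwards, which is the pattern the elbow heuristic reads as "two clusters".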

A Performance Improvement Scheme for a Wireless Internet Proxy Server Cluster (무선 인터넷 프록시 서버 클러스터 성능 개선)

  • Kwak, Hu-Keun;Chung, Kyu-Sik
    • Journal of KIISE:Information Networking / v.32 no.3 / pp.415-426 / 2005
  • Wireless internet, which has become a hot social issue, has limitations that distinguish it from wired internet: low bandwidth, frequent disconnection, low computing power, and small screens on user terminals. It also has technical issues to improve in terms of user mobility, network protocols, security, and so on. A wireless internet server should be scalable enough to handle large-scale traffic from a rapidly growing number of users. In this paper, wireless internet proxy server clusters are used for the wireless internet because their caching, distillation, and clustering functions help overcome the above limitations and meet these needs. TranSend was proposed as a clustering-based wireless internet proxy server, but it has two disadvantages: 1) scalability is difficult to achieve because there is no systematic way to do so, and 2) its structure is complex because of the inefficient communication structure among modules. In our former research, we proposed the All-in-one structure, which can be scaled in a systematic way, but it also has two disadvantages: 1) data sharing among cache servers is not allowed, and 2) its communication structure among modules is complex. In this paper, we propose an improved scheme that has an efficient communication structure among modules and allows data to be shared among cache servers. We performed experiments using 16 PCs, and the results show performance improvements of 54.86% and 4.70% for the proposed system compared to TranSend and the All-in-one system, respectively. Because data are shared among cache servers, the proposed scheme has the advantage of keeping the total cache memory at a fixed size regardless of the number of cache servers. In contrast, in All-in-one the total cache memory size increases in proportion to the number of cache servers, since each cache server must keep all of the cache data.

A Novel of Data Clustering Architecture for Outlier Detection to Electric Power Data Analysis (전력데이터 분석에서 이상점 추출을 위한 데이터 클러스터링 아키텍처에 관한 연구)

  • Jung, Se Hoon;Shin, Chang Sun;Cho, Young Yun;Park, Jang Woo;Park, Myung Hye;Kim, Young Hyun;Lee, Seung Bae;Sim, Chun Bo
    • KIPS Transactions on Software and Data Engineering / v.6 no.10 / pp.465-472 / 2017
  • In the past, researchers mainly used supervised machine learning techniques to analyze power data and investigated pattern identification through data mining. Today, however, data analysis research is reaching the limits of the older data classification and analysis techniques, as the size of electric power data has grown with the possibility of providing data in real time. This study therefore proposes a clustering architecture for analyzing large-scale electric power data. The proposed clustering process addresses the problems of the K-means algorithm, an unsupervised learning technique, and can automate the entire process from the collection of electric power data to their analysis. In the present study, power data were categorized and analyzed at three levels in total: the raw data level, the clustering level, and the user interface level. In addition, the study identified K, the ideal number of clusters, based on principal component analysis and the normal distribution, and proposed a modified K-means algorithm that reduces the data that would be categorized as ideal points in order to increase the efficiency of clustering.

Reviews on Post-synthetic Modification of Metal-Organic Frameworks Membranes (다결정 금속 유기 골격체 분리막의 후처리 성능 제어기술 개발 동향)

  • Hyuk Taek, Kwon;Kiwon, Eum
    • Membrane Journal / v.32 no.6 / pp.367-382 / 2022
  • Numerous metal-organic frameworks (MOFs) produced by periodic combinations of organic ligands and metal ions or metal-oxo clusters have led the way for the creation of energy-efficient membrane-based separations that may serve as viable replacements for traditional thermal counterparts. Although tremendous progress has been made over the past decade in the synthesis of polycrystalline MOF membranes, only a small number of MOFs have been exploited in the relevant research. Intercrystalline defects, or nonselective diffusion routes in polycrystalline membranes, are likely the reason behind the delay. Postsynthetic modifications (PSMs) are newly emerging strategies for providing polycrystalline MOF membrane diversity by leveraging advanced membranes as a platform and improving their separation capabilities via physical and/or chemical treatments; therefore, neither designing and developing MOFs nor tailoring membrane synthesis techniques for focused MOFs is necessary. In this minireview, seven subclasses of PSM techniques that have recently been adapted to polycrystalline MOF membranes are outlined, along with obstacles and future directions.

A Study on Korean Local Governments' Operation of Participatory Budgeting System : Classification by Support Vector Machine Technique (한국 지방자치단체의 주민참여예산제도 운영에 관한 연구 - Support Vector Machine 기법을 이용한 유형 구분)

  • Junhyun Han;Jaemin Ryou;Jayon Bae;Chunghyeok Im
    • The Journal of the Convergence on Culture Technology / v.10 no.3 / pp.461-466 / 2024
  • Korean local governments operate the participatory budgeting system autonomously. This study classifies these entities into clusters. Among diverse machine learning methodologies (Neural Network, Rule Induction (CN2), KNN, Decision Tree, Random Forest, Gradient Boosting, SVM, Naïve Bayes), the Support Vector Machine technique emerged as the most effective in the analysis of 2022 data on Korean municipalities. The first cluster, C1, is characterized by minimal committee activity but a substantial allocation of participatory budgeting; another cluster, C3, comprises cities that exhibit a passive stance. The majority of cities fall into the final cluster, C2, which is noted for its proactive engagement. Overall, most Korean local governments operate the participatory budgeting system in good shape, and only a small number of cities are less active in it. We anticipate that analyzing time-series data from the past decade in follow-up studies will further enhance the reliability of classifying local government types regarding participatory budgeting.

An Intelligent Monitoring System of Semiconductor Processing Equipment using Multiple Time-Series Pattern Recognition (다중 시계열 패턴인식을 이용한 반도체 생산장치의 지능형 감시시스템)

  • Lee, Joong-Jae;Kwon, O-Bum;Kim, Gye-Young
    • The KIPS Transactions:PartD / v.11D no.3 / pp.709-716 / 2004
  • This paper describes an intelligent real-time monitoring system for semiconductor processing equipment that determines whether a wafer in process is normal, using multiple time-series pattern recognition. The proposed system consists of three phases: initialization, learning, and real-time prediction. The initialization phase sets the weights and the effective steps for all parameters of the monitored equipment. The learning phase uses the LBG algorithm to cluster the time-series patterns that are produced and gathered by the equipment while processing wafers. Each pattern has an ACI, which is measured by a tester at the end of a process. The real-time prediction phase matches a time series entered in real time against the clustered patterns using Dynamic Time Warping and finds the best-matching pattern. It then calculates a predicted ACI from a combination of the ACI, the difference, and the weights. Finally, it determines whether the wafer is in or out of spec. The proposed system was tested on data acquired from an etching device. The results show that the error between the estimated ACI and the actually measured ACI decreases remarkably as the amount of learning increases.
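
The matching step in the abstract above relies on Dynamic Time Warping. A minimal Python sketch of the standard DTW recurrence follows, assuming single-valued time series and an absolute-difference cost. The reference patterns, the incoming series, and the helper name `best_match` are hypothetical; the paper's weighting of multiple parameters and its ACI prediction are not modeled.

```python
def dtw_distance(a, b):
    """Dynamic Time Warping distance between two numeric sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = cheapest cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the best of: step in a only, step in b only, step in both.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def best_match(series, patterns):
    """Index of the stored pattern with the smallest DTW distance to `series`."""
    return min(range(len(patterns)), key=lambda k: dtw_distance(series, patterns[k]))


# Hypothetical cluster representatives and an incoming real-time series.
patterns = [[0, 1, 2, 3, 2, 1, 0], [3, 3, 2, 1, 0, 0, 0]]
incoming = [0, 0, 1, 2, 3, 3, 2, 1, 0]
match = best_match(incoming, patterns)
```

Because DTW allows repeated samples to align with a single sample, the time-stretched `incoming` series matches the first pattern at zero cost even though the two sequences have different lengths.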

Bibliometric Analysis on Health Information-Related Research in Korea (국내 건강정보관련 연구에 대한 계량서지학적 분석)

  • Jin Won Kim;Hanseul Lee
    • Journal of the Korean Society for information Management / v.41 no.1 / pp.411-438 / 2024
  • This study aims to identify and comprehensively review health information-related research trends using bibliometric analysis. To this end, 1,193 papers from 2002 to 2023 related to "health information" were collected from the Korea Citation Index (KCI) database and analyzed from diverse aspects: research trends by period, academic fields, intellectual structure, and keyword changes. Results indicated that the number of papers related to health information increased steadily but has been decreasing since 2021. The main academic fields of health information-related research included "biomedical engineering," "preventive medicine/occupational environmental medicine," "law," "nursing," "library and information science," and "interdisciplinary research." Moreover, a co-word analysis was performed to understand the intellectual structure of research related to health information. Applying the parallel nearest neighbor clustering (PNNC) algorithm to identify the structure and clusters of the derived network revealed four clusters and 17 subgroups belonging to them, centering on two conglomerates: "medical engineering perspective on health information" and "social science perspective on health information." An inflection point analysis was performed to track the timing of change in the academic fields and keywords, and common changes were observed between 2010 and 2011. Finally, a strategy diagram was derived from the average publication year and word frequency, and high-frequency keywords were presented in three groups: "promising," "growth," and "mature." Unlike previous studies that mainly focused on content analysis, this study is meaningful in that it viewed the research area related to health information from an integrated perspective using various bibliometric methods.

Structural Segmentation for 3-D Brain Image by Intensity Coherence Enhancement and Classification (명암도 응집성 강화 및 분류를 통한 3차원 뇌 영상 구조적 분할)

  • Kim, Min-Jeong;Lee, Joung-Min;Kim, Myoung-Hee
    • The KIPS Transactions:PartA / v.13A no.5 s.102 / pp.465-472 / 2006
  • Recently, many image segmentation methods have been suggested for extracting human organs or disease-affected areas from huge medical image datasets. However, images of some regions, such as the brain, that contain multiple structures with ambiguous structural borders are difficult to segment structurally. To address this problem, a clustering technique that classifies voxels into a finite number of clusters is often employed. This approach has a drawback, however: it is sensitive to noise because it operates voxel by voxel. Applying an image enhancement method to minimize the influence of noise and sharpen image borders would therefore allow more robust structural segmentation. This research proposes an efficient structural segmentation method based on filtering followed by clustering to extract detailed structures such as white matter, gray matter, and cerebrospinal fluid from brain MR images. First, coherence-enhancing diffusion filtering is applied to sharpen the borders between structures and reduce the noise within them. Fuzzy c-means clustering is then applied to the enhanced images, performing structural segmentation by assigning to each voxel the cluster index of the structure that contains it. Compared with existing clustering approaches that use Gaussian or general anisotropic diffusion filtering, the suggested method showed higher accuracy, measured by how closely it agreed with manual segmentation results. Moreover, by offering fine segmentation of border areas with reproducible results and minimal manual work, it provides efficient diagnostic support for morphological abnormalities in the brain.
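
The clustering stage in the abstract above is fuzzy c-means. The Python sketch below shows the standard membership and center updates on one-dimensional intensities, under several assumptions: the intensity values and initial centers are hypothetical, the fuzzifier is fixed at m = 2, and the diffusion filtering step that precedes clustering in the paper is omitted. The names `memberships` and `fuzzy_c_means` are introduced only for this illustration.

```python
def memberships(values, centers, m=2.0):
    """Fuzzy membership of each value in each cluster (each row sums to 1)."""
    exp = 2.0 / (m - 1.0)
    rows = []
    for x in values:
        d = [abs(x - c) for c in centers]
        if 0.0 in d:
            # Value sits exactly on a center: split full membership among those centers.
            hits = d.count(0.0)
            rows.append([1.0 / hits if di == 0.0 else 0.0 for di in d])
        else:
            rows.append([1.0 / sum((di / dk) ** exp for dk in d) for di in d])
    return rows


def fuzzy_c_means(values, centers, m=2.0, iters=100, tol=1e-6):
    """Alternate membership and center updates until the centers stop moving."""
    centers = list(centers)
    for _ in range(iters):
        u = memberships(values, centers, m)
        new_centers = []
        for i in range(len(centers)):
            weights = [row[i] ** m for row in u]
            new_centers.append(sum(w * x for w, x in zip(weights, values)) / sum(weights))
        shift = max(abs(a - b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, memberships(values, centers, m)


# Hypothetical voxel intensities forming three groups (not real MR data).
intensities = [10, 12, 11, 50, 52, 49, 90, 91, 89]
centers, u = fuzzy_c_means(intensities, centers=[0.0, 45.0, 100.0])
labels = [row.index(max(row)) for row in u]  # hard label = highest membership
```

Taking the highest membership per voxel turns the fuzzy result into a cluster index per voxel, which corresponds to the final assignment step the abstract describes.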