• Title/Summary/Keyword: Time-based Clustering

Topic Analysis of Scholarly Communication Research

  • Ji, Hyun;Cha, Mikyeong
    • Journal of Information Science Theory and Practice / v.9 no.2 / pp.47-65 / 2021
  • This study identifies the specific topics, trends, and structural characteristics of scholarly communication research, based on 1,435 articles published from 1970 to 2018 and indexed in the Scopus database. Latent Dirichlet Allocation topic modeling, time series analysis, and network analysis were used to analyze the topics, trends, and structures, respectively. The results fall into three sets. First, nineteen specific topics were identified, including research resource management and research data, and their shares of the research were roughly even. Second, the time series analysis showed three upward-trending topics (Topic 6: Open Access Publishing, Topic 7: Green Open Access, and Topic 19: Informal Communication) and two downward-trending topics (Topic 11: Researcher Network and Topic 12: Electronic Journal). Third, the network analysis indicated that topics with high mean profile association were related to the institution, and that topics with high triangle betweenness centrality, such as Topic 14: Research Resource Management, shared the citation context. In addition, cluster analysis using parallel nearest neighbor clustering identified six clusters connected with different concepts.

Similarity measurement based on Min-Hash for Preserving Privacy

  • Cha, Hyun-Jong;Yang, Ho-Kyung;Song, You-Jin
    • International Journal of Advanced Culture Technology / v.10 no.2 / pp.240-245 / 2022
  • Because information is valuable, encryption algorithms are heavily used. Encrypted raw data is secure, but problems arise when the decryption key is exposed. In particular, large-scale Internet sites such as Facebook and Amazon suffer serious damage when user data is exposed. Recently, research into a new fourth-generation encryption technology that can protect user-related data without using the key required for encryption has been attracting attention, as has data clustering technology that uses encryption. In this paper, we try to reduce key exposure by using homomorphic encryption, and we aim to maintain privacy through similarity measurement. Previously studied methods of measuring similarity become time-consuming and expensive as the size and scope of the data increase. Min-Hash, by contrast, efficiently estimates the similarity between two sets by comparing their signatures, and it is widely used for anti-plagiarism, graph and image analysis, and genetic analysis. This paper therefore preserves privacy using homomorphic encryption and presents a model for efficient similarity measurement using Min-Hash.
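
As an illustration of the Min-Hash idea mentioned in this abstract, the following is a minimal plaintext sketch, not the authors' model: it omits the homomorphic encryption layer entirely. The function names, the use of salted SHA-256 as the hash family, and the default of 128 hash functions are all assumptions made for the example.

```python
import hashlib


def minhash_signature(items, num_hashes=128):
    """Build a Min-Hash signature: for each salted hash function,
    keep the smallest hash value observed over the set."""
    signature = []
    for seed in range(num_hashes):
        salt = seed.to_bytes(4, "big")  # assumption: the salt selects the hash function
        signature.append(min(
            int.from_bytes(hashlib.sha256(salt + item.encode("utf-8")).digest()[:8], "big")
            for item in items
        ))
    return signature


def estimate_jaccard(sig_a, sig_b):
    """Fraction of signature positions that agree; this estimates Jaccard similarity."""
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)


def exact_jaccard(set_a, set_b):
    """Exact Jaccard similarity, for comparison with the estimate."""
    return len(set_a & set_b) / len(set_a | set_b)


# Hypothetical example sets with a true Jaccard similarity of 50/150.
set_a = {str(i) for i in range(0, 100)}
set_b = {str(i) for i in range(50, 150)}
estimate = estimate_jaccard(minhash_signature(set_a, 256), minhash_signature(set_b, 256))
print(f"exact={exact_jaccard(set_a, set_b):.3f} estimate={estimate:.3f}")
```

The signatures have a fixed length regardless of set size, so comparing two signatures costs the same for small and large sets; more hash functions reduce the variance of the estimate.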

A Method for Determining the Peak Level of Risk in Root Industry Work Environment using Machine Learning (기계학습을 이용한 뿌리산업 작업 환경 위험도 피크레벨 결정방법)

  • Sang-Min Lee;Jun-Yeong Kim;Suk-Chan Kang;Kyung-Jun Kim
    • The Journal of the Korea Institute of Electronic Communication Sciences / v.19 no.1 / pp.127-136 / 2024
  • Because the hazardous working environments and high labor intensity of the root industry can affect workers' health, current regulations focus on measuring and controlling environmental factors on a semi-annual basis. However, quantitative criteria that address workers' health conditions, as opposed to the physical work environment, are lacking. This gap makes it difficult to prevent occupational diseases resulting from continuous exposure to harmful substances below regulatory thresholds. Therefore, this paper proposes a machine learning-based method for determining the peak level of risk in root industry work environments, and uses it to enable real-time safety assessment in workplaces.

Improving the Performance of Radiologists Using Artificial Intelligence-Based Detection Support Software for Mammography: A Multi-Reader Study

  • Jeong Hoon Lee;Ki Hwan Kim;Eun Hye Lee;Jong Seok Ahn;Jung Kyu Ryu;Young Mi Park;Gi Won Shin;Young Joong Kim;Hye Young Choi
    • Korean Journal of Radiology / v.23 no.5 / pp.505-516 / 2022
  • Objective: To evaluate whether artificial intelligence (AI) for detecting breast cancer on mammography can improve the performance and time efficiency of radiologists reading mammograms. Materials and Methods: A commercial deep learning-based software for mammography was validated using external data collected from 200 patients, 100 each with and without breast cancer (40 with benign lesions and 60 without lesions) from one hospital. Ten readers, including five breast specialist radiologists (BSRs) and five general radiologists (GRs), assessed all mammography images using a seven-point scale to rate the likelihood of malignancy in two sessions, with and without the aid of the AI-based software, and the reading time was automatically recorded using a web-based reporting system. Two reading sessions were conducted with a two-month washout period in between. Differences in the area under the receiver operating characteristic curve (AUROC), sensitivity, specificity, and reading time between reading with and without AI were analyzed, accounting for data clustering by readers when indicated. Results: The AUROC of the AI alone, BSR (average across five readers), and GR (average across five readers) groups was 0.915 (95% confidence interval, 0.876-0.954), 0.813 (0.756-0.870), and 0.684 (0.616-0.752), respectively. With AI assistance, the AUROC significantly increased to 0.884 (0.840-0.928) and 0.833 (0.779-0.887) in the BSR and GR groups, respectively (p = 0.007 and p < 0.001, respectively). Sensitivity was improved by AI assistance in both groups (74.6% vs. 88.6% in BSR, p < 0.001; 52.1% vs. 79.4% in GR, p < 0.001), but the specificity did not differ significantly (66.6% vs. 66.4% in BSR, p = 0.238; 70.8% vs. 70.0% in GR, p = 0.689). The average reading time pooled across readers was significantly decreased by AI assistance for BSRs (82.73 vs. 73.04 seconds, p < 0.001) but increased in GRs (35.44 vs. 42.52 seconds, p < 0.001). 
Conclusion: AI-based software improved the performance of radiologists regardless of their experience and affected the reading time.
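
The metrics reported in this abstract can be illustrated with a small sketch. It computes AUROC as the probability that a randomly chosen cancer case receives a higher rating than a randomly chosen non-cancer case (ties counted as one half), plus sensitivity and specificity at a threshold. The ratings below are made-up seven-point scores, not study data, and the sketch does not account for clustering by readers as the study's analysis did.

```python
def auroc(scores_pos, scores_neg):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly,
    with tied pairs counted as half."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def sensitivity_specificity(scores_pos, scores_neg, threshold):
    """Call a case positive when its rating is at or above the threshold."""
    true_pos = sum(s >= threshold for s in scores_pos)
    true_neg = sum(s < threshold for s in scores_neg)
    return true_pos / len(scores_pos), true_neg / len(scores_neg)


# Hypothetical seven-point malignancy ratings (illustrative only).
cancer_ratings = [7, 6, 5, 3]
non_cancer_ratings = [1, 2, 4, 3]
print(auroc(cancer_ratings, non_cancer_ratings))
print(sensitivity_specificity(cancer_ratings, non_cancer_ratings, threshold=4))
```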

A Study on Eigenspace Face Recognition using Wavelet Transform and HMM (웨이블렛 변환과 HMM을 이용한 고유공간 기반 얼굴인식에 관한 연구)

  • Lee, Jung-Jae;Kim, Jong-Min
    • Journal of the Korea Institute of Information and Communication Engineering / v.16 no.10 / pp.2121-2128 / 2012
  • This paper proposes real-time face region detection using the wavelet transform, together with a robust detection algorithm that achieves both computational efficiency and detection performance. The detected face image is recognized by constructing low-dimensional face symbols through principal component analysis. The proposed method is well suited to real-time systems because it requires less computation than existing geometric feature-based or appearance-based methods and maintains a high recognition rate using a minimal amount of information. In addition, to reduce recognition errors, the feature values projected onto the eigenspace are grouped into symbols by a clustering algorithm and used as the input symbols of a Hidden Markov Model. Any input face is then recognized as the face model with the highest probability. In experiments, the proposed method showed better recognition performance, with fewer incorrect matches, than the existing Euclidean and Mahalanobis distance methods.

Research on An Energy Efficient Triangular Shape Routing Protocol based on Clusters (클러스터에 기반한 에너지 효율적 삼각모양 라우팅 프로토콜에 관한 연구)

  • Nurhayati, Nurhayati;Lee, Kyung-Oh
    • Journal of the Korea Society of Computer and Information / v.16 no.9 / pp.115-122 / 2011
  • In this paper, we propose an efficient dynamic workload balancing strategy that improves the performance of high-performance computing systems. The key idea is to minimize the execution time of each job and to maximize system throughput by making effective use of system resources such as CPU and memory. The strategy allocates jobs dynamically by considering the memory demanded by each executing job and the workload status of each node. If a node becomes overloaded by its allocated jobs, the proposed scheme migrates jobs running on the overloaded node to other free nodes, which balances the workload across nodes and reduces job waiting and execution times. Through simulation, we show that the proposed strategy, based on CPU and memory, improves the performance of high-performance computing systems compared with previous strategies.

Temporal Classification Method for Forecasting Power Load Patterns From AMR Data

  • Lee, Heon-Gyu;Shin, Jin-Ho;Park, Hong-Kyu;Kim, Young-Il;Lee, Bong-Jae;Ryu, Keun-Ho
    • Korean Journal of Remote Sensing / v.23 no.5 / pp.393-400 / 2007
  • We present a novel power load prediction method that uses temporal pattern mining on AMR (Automatic Meter Reading) data. Because power load patterns are time-varying and differ greatly by hour, day, week, and so on, traditional data mining alone gives uninformative results. Research on data mining for electric load patterns has also focused on cluster analysis and classification methods. However, despite the usefulness of rules that include a temporal dimension and the fact that AMR data has temporal attributes, these methods were limited to static pattern extraction and did not consider temporal attributes. We therefore propose a new classification method for predicting power load patterns. The main tasks are a clustering method and a temporal classification method. Cluster analysis is used to create load pattern classes and a representative load profile for each class. The classification method then uses the representative load profiles to build a classifier that assigns different load patterns to the existing classes. The proposed classification method is calendar-based temporal mining, which discovers electric load patterns at multiple time granularities. Lastly, we show that the proposed method, applied to AMR data, discovered more interesting patterns.
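
The pipeline described in this abstract (representative load profiles per class, a classifier that assigns new patterns to those classes, and grouping by calendar granularity) might be sketched roughly as follows. This is an assumption-laden illustration rather than the authors' method: the clusters are given by hand instead of being produced by cluster analysis, the classifier is a simple nearest-profile rule, the only calendar granularity is weekday versus weekend, and all names and numbers are hypothetical.

```python
from collections import Counter
from datetime import date
from math import dist


def representative_profile(profiles):
    """Element-wise mean of the equal-length load profiles in one class."""
    return [sum(values) / len(profiles) for values in zip(*profiles)]


def classify(profile, representatives):
    """Label of the representative profile nearest in Euclidean distance."""
    return min(representatives, key=lambda label: dist(profile, representatives[label]))


def calendar_key(day):
    """One coarse calendar granularity: weekday versus weekend."""
    return "weekend" if day.weekday() >= 5 else "weekday"


def patterns_by_calendar(readings, representatives):
    """Count (calendar granularity, load class) pairs over dated profiles."""
    return Counter(
        (calendar_key(day), classify(profile, representatives))
        for day, profile in readings
    )


# Hypothetical clusters of daily load profiles (three periods per day).
clusters = {
    "daytime_peak": [[1.0, 4.0, 1.0], [2.0, 6.0, 2.0]],
    "evening_peak": [[1.0, 1.0, 5.0], [1.0, 3.0, 7.0]],
}
representatives = {label: representative_profile(p) for label, p in clusters.items()}

# Hypothetical dated readings to classify.
readings = [
    (date(2007, 1, 5), [1.5, 5.5, 1.0]),  # a Friday
    (date(2007, 1, 6), [1.0, 2.0, 6.5]),  # a Saturday
    (date(2007, 1, 7), [0.5, 1.5, 5.0]),  # a Sunday
]
counts = patterns_by_calendar(readings, representatives)
print(counts)
```

Finer granularities (hour of day, week of year) could be added by returning a richer key from `calendar_key`.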

Machine Learning Approach for Pattern Analysis of Energy Consumption in Factory (머신러닝 기법을 활용한 공장 에너지 사용량 데이터 분석)

  • Sung, Jong Hoon;Cho, Yeong Sik
    • KIPS Transactions on Computer and Communication Systems / v.8 no.4 / pp.87-92 / 2019
  • This paper describes a pattern analysis of factory energy consumption data using machine learning. Whereas conventional statistical approaches require specific equations that represent the physical characteristics of the plant, a machine learning-based approach uses historical data and computes results effectively. A rule-based approach calculates energy usage from physical equations, but it is hard to identify the exact equations that represent the factory's characteristics and the hidden variables that affect the results. The machine learning approach, by contrast, is useful for quickly finding relations in the data. The factory has several components that directly affect electricity consumption: machines, lighting, computers, and indoor systems such as HVAC (heating, ventilation, and air conditioning). The energy loads from these components are generated in real time, and the data can be represented as time series. Various sensors were installed in the factory to build a database by collecting energy usage data from the components. After a preliminary statistical analysis for data mining, time-series clustering techniques were applied to extract the energy load patterns. This research can contribute to the development of a Factory Energy Management System (FEMS).
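
The abstract does not say which time-series clustering technique was applied. As one common possibility, the sketch below z-normalizes each load series so that clustering compares shape rather than magnitude, then runs a plain k-means with a deterministic farthest-first initialization. The function names, the initialization, and the sample series are assumptions for illustration.

```python
from math import dist
from statistics import mean, pstdev


def z_normalize(series):
    """Rescale a series to zero mean and unit variance (flat series map to zeros)."""
    mu, sigma = mean(series), pstdev(series)
    if sigma == 0:
        return [0.0 for _ in series]
    return [(x - mu) / sigma for x in series]


def kmeans(series_list, k, iterations=20):
    """Plain k-means on equal-length series with farthest-first initialization."""
    centroids = [list(series_list[0])]
    while len(centroids) < k:
        farthest = max(series_list, key=lambda s: min(dist(s, c) for c in centroids))
        centroids.append(list(farthest))
    labels = [0] * len(series_list)
    for _ in range(iterations):
        # Assignment step: each series joins its nearest centroid.
        labels = [min(range(k), key=lambda c: dist(s, centroids[c])) for s in series_list]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [s for s, label in zip(series_list, labels) if label == c]
            if members:
                centroids[c] = [mean(column) for column in zip(*members)]
    return labels, centroids


# Hypothetical load series: two with a midday peak, two with a midday dip.
loads = [[1, 5, 1], [2, 10, 2], [5, 1, 5], [10, 2, 10]]
labels, centroids = kmeans([z_normalize(s) for s in loads], k=2)
print(labels)
```

Because of the normalization, `[1, 5, 1]` and `[2, 10, 2]` land in the same cluster even though their magnitudes differ.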

Property-based Design of Ion-Channel-Targeted Library

  • Ahn, Ji-Young;Nam, Ky-Youb;Chang, Byung-Ha;Yoon, Jeong-Hyeok;Cho, Seung-Joo;Koh, Hun-Yeong;No, Kyoung-Tai
    • Proceedings of the Korean Society for Bioinformatics Conference / 2005.09a / pp.134-138 / 2005
  • Designing an ion-channel-targeted library is a valuable methodology that can aid in selecting and prioritizing compounds with potential ion-channel-likeness for ion-channel-targeted bio-screening from a large pool of commercially available chemicals. The differences in property profiles between 93 ion-channel-active compounds from the MDDR and CMC databases and the ACDSC compounds were classified using suitable descriptors calculated with the preADME software. Through PCA, clustering, and similarity analysis, the compounds capable of ion channel activity were identified in the ACDSC compound pool. The designed library tended to follow the property profile of ion-channel-active compounds, and it can be applied with great time and cost efficiency to ligand-based drug design or virtual high-throughput screening of an enormous small-molecule space.

Segmentation by Benefit Sought in Marketing Channel : A Sequential Approach (추구혜택에 의한 유통시장의 시장세분화 : 순차적 접근)

  • Yi, Seong-Keun;Kim, Jae-Wook;Lee, Seo-Koo
    • Journal of Distribution Research / v.10 no.3 / pp.87-101 / 2005
  • Market segmentation has long been an important issue in marketing, and scholars have developed many models and statistical methods for it. The purpose of this research is to offer insight into market segmentation based on a clustering technique applied to the benefits sought in marketing channels. We propose a sequential approach, in which the market is segmented by a multi-stage method based on the benefits sought in the marketing channel. To implement this approach, we divided the main benefits sought into subcategories; that is, after dividing each benefit sought into more detailed concepts, we segmented the market sequentially.
