• Title/Summary/Keyword: DATA PRE-PROCESSING


Noise-robust electrocardiogram R-peak detection with adaptive filter and variable threshold (적응형 필터와 가변 임계값을 적용하여 잡음에 강인한 심전도 R-피크 검출)

  • Rahman, MD Saifur;Choi, Chul-Hyung;Kim, Si-Kyung;Park, In-Deok;Kim, Young-Pil
    • Journal of the Korea Academia-Industrial cooperation Society / v.18 no.12 / pp.126-134 / 2017
  • There have been numerous studies on extracting the R-peak from electrocardiogram (ECG) signals. However, most detection methods are complicated to implement in a real-time portable electrocardiograph device and require a large amount of computation. R-peak detection requires pre-processing and post-processing of the ECG data to handle baseline drift and to remove noise from the commercial power supply. An adaptive filter technique is widely used for R-peak detection, but the R-peak cannot be detected when the input is lower than a threshold value. Moreover, noise can lead to an erroneous threshold value, which causes problems in detecting the P-peak and T-peak values. To solve these problems, we propose a robust R-peak detection algorithm with low complexity and simple computation. The proposed scheme removes the baseline drift in ECG signals using an adaptive filter to solve the problems involved in threshold extraction. We also propose a technique that extracts an appropriate threshold value automatically from the minimum and maximum values of the filtered ECG signal. To detect the R-peak from the ECG signal, we propose a threshold neighborhood search technique. Through experiments, we confirmed that the proposed method improves R-peak detection accuracy and, by reducing the amount of computation, achieves a detection speed suitable for a mobile system. The experimental results show that the heart rate detection accuracy and sensitivity were very high (about 100%).
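
The three steps named in the abstract (baseline removal, automatic thresholding from the minimum and maximum of the filtered signal, and a search around the threshold crossings) can be sketched roughly as follows. This is a minimal illustration, not the authors' algorithm: the moving-average baseline removal stands in for their adaptive filter, and the `window` and `ratio` parameters as well as all function names are assumptions.

```python
def remove_baseline(signal, window=21):
    """Subtract a centered moving average to suppress slow baseline drift.

    Assumption: a plain moving average stands in for the adaptive filter.
    """
    half = window // 2
    filtered = []
    for i, value in enumerate(signal):
        lo, hi = max(0, i - half), min(len(signal), i + half + 1)
        filtered.append(value - sum(signal[lo:hi]) / (hi - lo))
    return filtered


def auto_threshold(filtered, ratio=0.6):
    """Place the threshold a fixed fraction of the way from min to max.

    Assumption: `ratio` is a hypothetical tuning parameter.
    """
    low, high = min(filtered), max(filtered)
    return low + ratio * (high - low)


def detect_r_peaks(filtered, threshold):
    """Return the index of the largest sample in each above-threshold run."""
    peaks, i, n = [], 0, len(filtered)
    while i < n:
        if filtered[i] > threshold:
            j = i
            while j < n and filtered[j] > threshold:
                j += 1  # extend the run of samples above the threshold
            peaks.append(max(range(i, j), key=lambda k: filtered[k]))
            i = j
        else:
            i += 1
    return peaks


def find_r_peaks(signal, window=21, ratio=0.6):
    """Chain the three steps: filter, derive threshold, search for peaks."""
    filtered = remove_baseline(signal, window)
    return detect_r_peaks(filtered, auto_threshold(filtered, ratio))
```

Calling `find_r_peaks(samples)` on a list of ECG samples returns the sample indices of the detected peaks; dividing the index differences by the sampling rate would give the R-R intervals.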

A study on the classification of research topics based on COVID-19 academic research using Topic modeling (토픽모델링을 활용한 COVID-19 학술 연구 기반 연구 주제 분류에 관한 연구)

  • Yoo, So-yeon;Lim, Gyoo-gun
    • Journal of Intelligence and Information Systems / v.28 no.1 / pp.155-174 / 2022
  • From January 2020 to October 2021, more than 500,000 academic studies related to COVID-19 (Coronavirus-2, a fatal respiratory syndrome) were published. The rapid increase in the number of papers related to COVID-19 places time and technical constraints on healthcare professionals and policy makers who need to find important research quickly. Therefore, in this study, we propose a method of extracting useful information from the text data of extensive literature using the LDA and Word2vec algorithms. Papers related to the keywords to be searched were extracted from papers related to COVID-19, and detailed topics were identified. The data came from the CORD-19 data set on Kaggle, a free academic resource prepared by major research groups and the White House to respond to the COVID-19 pandemic and updated weekly. The research method is divided into two main parts. First, 41,062 articles were collected through data filtering and pre-processing of the abstracts of 47,110 academic papers including full text. For this purpose, the number of publications related to COVID-19 by year was analyzed through exploratory data analysis using a Python program, and the top 10 journals under active research were identified. The LDA and Word2vec algorithms were used to derive research topics related to COVID-19, and after related words were analyzed, similarity was measured. Second, papers containing 'vaccine' and 'treatment' were extracted from among the topics derived from all papers: a total of 4,555 papers related to 'vaccine' and 5,971 papers related to 'treatment'. For each collected paper, detailed topics were analyzed using the LDA and Word2vec algorithms, and a clustering method through PCA dimension reduction was applied to visualize groups of papers with similar themes using the t-SNE algorithm. A noteworthy result of this study is that topics that were not derived from the topic modeling of all papers related to COVID-19 were derived from the topic modeling results for each research topic. For example, as a result of topic modeling for papers related to 'vaccine', a new topic titled Topic 05 'neutralizing antibodies' was extracted. A neutralizing antibody is an antibody that protects cells from infection when a virus enters the body, and it is said to play an important role in the production of therapeutic agents and in vaccine development. In addition, as a result of extracting topics from papers related to 'treatment', a new topic called Topic 05 'cytokine' was discovered. A cytokine storm occurs when the immune cells of the body attack normal cells instead of defending against attacks. To uncover hidden topics that could not be found across the entire set of papers, the papers were classified according to keywords and topic modeling was performed to find detailed topics. In this study, we proposed a method of extracting topics from a large amount of literature using the LDA algorithm and extracting similar words using Skip-gram, the Word2vec model that predicts the surrounding words from the central word. Combining the LDA model and the Word2vec model was intended to improve performance by identifying both the relationship between documents and LDA topics and the relationships among documents captured by Word2vec. In addition, as a clustering method through PCA dimension reduction, we presented a way to classify documents intuitively by using the t-SNE technique to group documents with similar themes into a structured organization. In a situation where the efforts of many researchers to overcome COVID-19 cannot keep up with the rapid publication of academic papers related to COVID-19, we hope this method will save healthcare professionals and policy makers precious time and effort and help them rapidly gain new insights. It is also expected to be used as basic data for researchers exploring new research directions.
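
To make the Skip-gram idea concrete, the sketch below builds the (center word, context word) training pairs that a Skip-gram model learns from. It only illustrates how the training data is formed; it is not the study's pipeline, and the function name and the `window` parameter are assumptions.

```python
def skipgram_pairs(tokens, window=2):
    """Build (center, context) pairs for every token in a tokenized text.

    Each token is paired with the tokens at most `window` positions away,
    which are the words a Skip-gram model is trained to predict.
    """
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # a word is not its own context
                pairs.append((center, tokens[j]))
    return pairs
```

For a pre-processed abstract such as `["neutralizing", "antibodies", "protect", "cells"]`, each word is paired with its neighbours; a real Word2vec implementation would then train embeddings on these pairs so that words sharing contexts end up with similar vectors.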

  • Digital Hologram Compression Technique By Hybrid Video Coding (하이브리드 비디오 코팅에 의한 디지털 홀로그램 압축기술)

    • Seo, Young-Ho;Choi, Hyun-Jun;Kang, Hoon-Jong;Lee, Seung-Hyun;Kim, Dong-Wook
      • Journal of the Institute of Electronics Engineers of Korea SP / v.42 no.5 s.305 / pp.29-40 / 2005
    • As the base of digital holography has grown, compression technology has come under discussion, and an international standard defining compression of 3D images and video has progressed in the form of 3DAV, a part of MPEG. As the case of 3DAV shows, the coding technique is likely to take a hybrid form that merges, refines, or mixes various earlier techniques. We therefore wish to present the relationship between various image/video coding techniques and the digital hologram. In this paper, we propose an efficient coding method for digital holograms using standard compression tools for video and images. First, we convert fringe patterns into video data using a principle of CGH (Computer Generated Hologram), and then encode the data. The proposed compression algorithm is made up of several methods: pre-processing for the transform, local segmentation with global information of the object image, frequency transform for coding, scanning to turn the fringe into a video stream, classification of coefficients, and hybrid video coding. The proposed hybrid compression algorithm combines all of these methods. The tool for still image coding is JPEG2000, and the tools for video coding include various international compression algorithms such as MPEG-2, MPEG-4, and H.264, as well as various lossless compression algorithms. The proposed algorithm showed better reconstruction properties than previous research at far greater compression rates, four to eight times as high. We therefore expect the proposed technique for digital hologram coding to serve as good preliminary research.
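
As a rough illustration of the segmentation and scanning steps, the sketch below cuts a 2D fringe pattern into square blocks and emits them in raster order, so that each block can be treated as one frame of a video sequence. The abstract does not state the block size or the scan order, so the raster scan, the `block` parameter, and the function name are all assumptions.

```python
def segment_into_frames(fringe, block):
    """Split a 2D fringe pattern (list of rows) into block x block frames.

    Assumption: blocks are scanned left to right, top to bottom.
    """
    rows, cols = len(fringe), len(fringe[0])
    frames = []
    for r in range(0, rows, block):
        for c in range(0, cols, block):
            # Cut the same column range out of each row of this block row.
            frames.append([row[c:c + block] for row in fringe[r:r + block]])
    return frames
```

In the pipeline the abstract describes, a frequency transform and coefficient classification would be applied around this step before the frame sequence is handed to a video codec.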

    A Study on Analysis of consumer perception of YouTube advertising using text mining (텍스트 마이닝을 활용한 Youtube 광고에 대한 소비자 인식 분석)

    • Eum, Seong-Won
      • Management & Information Systems Review / v.39 no.2 / pp.181-193 / 2020
    • This study analyzes consumer perception using text mining, a topic of recent interest. We analyzed consumers' perception of Samsung Galaxy by analyzing consumer reviews of Samsung Galaxy YouTube ads. For the analysis, 1,819 consumer reviews of YouTube ads were extracted. Through data pre-processing, keywords for the advertisements were classified and extracted as nouns, adjectives, and adverbs. After that, frequency analysis and sentiment analysis were performed. Finally, clustering was performed through CONCOR. The results are summarized as follows. First, the most frequently mentioned words were Galaxy Note (n = 217), Good (n = 135), Pen (n = 40), and Function (n = 29). From the keywords "Galaxy Note", "Good", "Pen", and "Features", it can be judged that consumers think well of the functional aspects of Samsung mobile phone products and positively recognize the Note Pen. In addition, the recognition of "Samsung Pay", "Innovation", "Design", and "iPhone" shows that Samsung's mobile phone is highly regarded for its innovative design and the functional aspects of Samsung Pay. Second, sentiment analysis of the YouTube advertising showed that the ratio of emotional intensity was positive (75.95%), higher than negative (24.05%). This means that consumers perceive Samsung Galaxy mobile phones positively. In the emotional keyword analysis, the positive keywords extracted were "good", "good", "innovative", "highest", "fast", "pretty", etc., and the negative keywords were "frightening", "I want to cry", "discomfort", "sorry", "no", etc. The implication of this study is that most existing studies of consumer perception of advertisements relied on quantitative analysis methods. In this study, we departed from quantitative research methods for advertising and attempted to analyze consumer perception through qualitative research. This is expected to have a great influence on future research, and I am sure that it will be a starting point for consumer perception research through qualitative research.
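
The frequency analysis and the positive/negative ratio reported above can be sketched in a few lines. This is only an illustration under simple assumptions: reviews are split on whitespace rather than tagged by part of speech, and the two word lists are tiny hypothetical lexicons, not the sentiment dictionary used in the study.

```python
from collections import Counter

# Hypothetical mini-lexicons, for illustration only.
POSITIVE = {"good", "innovative", "fast", "pretty"}
NEGATIVE = {"frightening", "discomfort", "sorry"}


def keyword_frequencies(reviews):
    """Count how often each lowercase token appears across all reviews."""
    counts = Counter()
    for review in reviews:
        counts.update(review.lower().split())
    return counts


def sentiment_ratio(reviews):
    """Return (positive %, negative %) over all sentiment-bearing tokens."""
    pos = neg = 0
    for review in reviews:
        for token in review.lower().split():
            if token in POSITIVE:
                pos += 1
            elif token in NEGATIVE:
                neg += 1
    total = pos + neg
    if total == 0:
        return (0.0, 0.0)  # no sentiment words found
    return (100.0 * pos / total, 100.0 * neg / total)
```

`keyword_frequencies(reviews).most_common(4)` would give a top-four list comparable in form to the one in the abstract.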

    The Effect of Home economic education teaching plans for students in academic and those in vocational high schools' 'Preparation for Successful aging' in the 'Family life in old age' unit -A comparative study between practical problem-teaching lesson plans and instructor-led teaching and learning plans- (인문계와 가사.실업 전문계 고등학생의 '성공적인 노후생활 준비교육'을 위한 가정과 수업의 적용과 효과 -실천적 문제 중심 수업과 강의식 수업을 중심으로-)

    • Lee, Jong-Hui;Cho, Byung-Eun
      • Journal of Korean Home Economics Education Association / v.23 no.4 / pp.105-124 / 2011
    • To achieve this objective, practical problem-teaching lesson plans and instructor-led teaching and learning plans were developed and integrated into the Technology Home Economics, and Human Development curricula at both academic and vocational high schools. The impact of these plans was examined, as were connections between the teaching methods and types of schools. As part of this study, a survey was conducted on 1,263 students in 46 classes at 6 randomly selected high schools: 4 academic and 2 vocational. A total of 9 teachers conducted classes for both experimental and comparative groups between October 2009 and November 2010. Pre- and post-tests were used to study the impact of the lessons on the experimental and comparative groups. In terms of data analysis and statistics processing, this study implemented mean and standard deviations, t-test, and analysis of covariance using the SPSS 12.0 program. The results of this study are as follows. The practical problem-teaching lessons produced more positive results in the students than the instructor-led lessons, in terms of their image of the elderly, their level of knowledge about them, their understanding of their need for welfare services, and preparation for Successful aging. When comparing the results by type of school, the experimental groups at academic high schools appeared to have a more positive image and better understanding of the elderly and their need for welfare services, and were better prepared for Successful aging than during their previous lessons. They also showed an increase in independence from their children in aging. As for the comparative groups, students at academic high schools showed an increase in preparation for Successful aging compared to the previous lessons. 
Finally, as for future research on preparation for aging in high schools, more schools should include this subject in their regular curriculum for Technology Home Economics, Human Development and Home Economics in order to generalize the results, and they need to evaluate the content. Additionally, this study suggests that new high school curricula should include lessons on preparation for aging so that students can deal successfully with our aging society.


    Design of a Crowd-Sourced Fingerprint Mapping and Localization System (군중-제공 신호지도 작성 및 위치 추적 시스템의 설계)

    • Choi, Eun-Mi;Kim, In-Cheol
      • KIPS Transactions on Software and Data Engineering / v.2 no.9 / pp.595-602 / 2013
    • WiFi fingerprinting is well known as an effective localization technique for indoor environments. However, this technique requires a large number of pre-built fingerprint maps covering the entire space. Moreover, because the environment changes, these maps have to be rebuilt or updated periodically by experts. As a way to avoid this problem, crowd-sourced fingerprint mapping has attracted considerable interest from researchers. This approach lets many volunteer users share the WiFi fingerprints they collect in a specific environment. Crowd-sourced fingerprinting can therefore keep fingerprint maps up to date automatically. In most previous systems, however, individual users were asked to enter their positions manually to build their local fingerprint maps. Moreover, those systems have no principled mechanism for keeping fingerprint maps clean by detecting and filtering out erroneous fingerprints collected from multiple users. In this paper, we present the design of a crowd-sourced fingerprint mapping and localization (CMAL) system. The proposed system can not only automatically build and update WiFi fingerprint maps from fingerprint collections provided by multiple smartphone users, but also simultaneously track their positions using the up-to-date maps. The CMAL system consists of multiple clients, which run on individual smartphones to collect fingerprints, and a central server, which maintains a database of fingerprint maps. Each client contains a particle filter-based WiFi SLAM engine that tracks the smartphone user's position and builds a local fingerprint map. The server adopts a Gaussian interpolation-based error filtering algorithm to maintain the integrity of the fingerprint maps. Through various experiments, we show the high performance of our system.
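
For readers unfamiliar with fingerprint-based localization, the sketch below shows the basic lookup that a fingerprint map enables: the observed signal strengths are compared with each stored fingerprint, and the closest position is returned. It is a plain nearest-neighbour match, not the particle-filter SLAM engine or the Gaussian interpolation filter of the CMAL system; the data layout, the `missing_rssi` default, and the function name are assumptions.

```python
import math


def locate(fingerprint_map, observed, missing_rssi=-100.0):
    """Return the map position whose fingerprint is closest to `observed`.

    fingerprint_map: {position: {access_point_id: rssi_dbm}}
    observed:        {access_point_id: rssi_dbm}
    Access points missing on one side are treated as very weak
    (`missing_rssi`, a hypothetical default).
    """
    best_pos, best_dist = None, math.inf
    for pos, stored in fingerprint_map.items():
        access_points = set(stored) | set(observed)
        dist = math.sqrt(sum(
            (stored.get(ap, missing_rssi) - observed.get(ap, missing_rssi)) ** 2
            for ap in access_points
        ))
        if dist < best_dist:
            best_pos, best_dist = pos, dist
    return best_pos
```

A crowd-sourced system would additionally have to decide which uploaded fingerprints to trust before adding them to `fingerprint_map`, which is the role the abstract assigns to the server-side error filtering.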

    Comparative Analysis of the Keywords in Taekwondo News Articles by Year: Applying Topic Modeling Method (태권도 뉴스기사의 연도별 주제어 비교분석: 토픽모델링 적용)

    • Jeon, Minsoo;Lim, Hyosung
      • Journal of Digital Convergence / v.19 no.11 / pp.575-583 / 2021
    • This study aims to analyze Taekwondo trends in news articles by year by applying topic modeling. To examine the Taekwondo trend through media reports, articles including news articles and Taekwondo specialized media articles were collected through Big Kinds of the Korea Press Foundation. The search period was divided into three sections: before 2000, 2001~2010, and 2011~2020. A total of 12,124 items were selected as research data. For topic analysis, pre-processing was performed, and the topics were then analyzed using the LDA algorithm. Python 3 was used for all analysis. First, as a result of analyzing the keywords of media articles by year, 'World' was the most common keyword before 2000, followed by 'South and North Korea' and then 'Olympic'. From 2001 to 2010, 'World' was the most common keyword, followed by 'Association' and 'World Taekwondo'. From 2011 to 2020, 'World', 'Demonstration', and 'Kukkiwon' were the most common keywords, in that order. Second, as a result of analyzing news articles before 2000 by topic modeling, the topics were divided into two categories. Specifically, Topic 1 was selected as 'South-North Korea sports exchange' and Topic 2 was selected as 'Adoption of Olympic demonstration events'. Third, as a result of analyzing news articles from 2001 to 2010 by topic modeling, three topics were selected. Topic 1 was selected as 'Taekwondo Demonstration Performance and Corruption', Topic 2 was selected as 'Muju Taekwondo Park Creation', and Topic 3 was selected as 'World Taekwondo Festival'. Fourth, as a result of analyzing news articles from 2011 to 2020 by topic modeling, three topics were selected. Topic 1 was selected as 'Successful Hosting of the 2018 Pyeongchang Winter Olympics', Topic 2 was selected as 'North-South Korea Taekwondo Joint Demonstration Performance', and Topic 3 was selected as '2017 Muju World Taekwondo Championships'.

    Aspect-Based Sentiment Analysis Using BERT: Developing Aspect Category Sentiment Classification Models (BERT를 활용한 속성기반 감성분석: 속성카테고리 감성분류 모델 개발)

    • Park, Hyun-jung;Shin, Kyung-shik
      • Journal of Intelligence and Information Systems / v.26 no.4 / pp.1-25 / 2020
    • Sentiment Analysis (SA) is a Natural Language Processing (NLP) task that analyzes, from written texts, the sentiments consumers or the public feel about an arbitrary object. Aspect-Based Sentiment Analysis (ABSA) is a finer-grained analysis of the sentiments towards each aspect of an object. Because it has more practical business value, ABSA is drawing attention from both academic and industrial organizations. For example, given a review that says "The restaurant is expensive but the food is really fantastic", general SA evaluates the overall sentiment towards the 'restaurant' as 'positive', while ABSA identifies the restaurant's 'price' aspect as 'negative' and its 'food' aspect as 'positive'. Thus, ABSA enables a more specific and effective marketing strategy. To perform ABSA, it is necessary to identify the aspect terms or aspect categories included in the text and to judge the sentiments towards them. Accordingly, there are four main areas in ABSA: aspect term extraction, aspect category detection, Aspect Term Sentiment Classification (ATSC), and Aspect Category Sentiment Classification (ACSC). ABSA is usually conducted by extracting aspect terms and then performing ATSC to analyze sentiments for the given aspect terms, or by extracting aspect categories and then performing ACSC to analyze sentiments for the given aspect category. Here, an aspect category is expressed in one or more aspect terms, or indirectly inferred from other words. In the preceding example sentence, 'price' and 'food' are both aspect categories, and the aspect category 'food' is expressed by the aspect term 'food' included in the review. If the review sentence includes 'pasta', 'steak', or 'grilled chicken special', these can all be aspect terms for the aspect category 'food'. An aspect category referred to by one or more specific aspect terms in this way is called an explicit aspect. On the other hand, an aspect category like 'price', which has no specific aspect term but can be indirectly inferred from the emotional word 'expensive', is called an implicit aspect. So far, 'aspect category' has been used to avoid confusion with 'aspect term'. From now on, we will treat 'aspect category' and 'aspect' as the same concept and mostly use the word 'aspect' for convenience. Note that ATSC analyzes the sentiment towards given aspect terms, so it deals only with explicit aspects, whereas ACSC handles implicit aspects as well as explicit ones. This study seeks answers to the following issues, ignored in previous studies, that arise when applying the BERT pre-trained language model to ACSC, and derives superior ACSC models. First, is it more effective to reflect the output vectors of the tokens for aspect categories than to use only the final output vector of the [CLS] token as a classification vector? Second, is there any performance difference between the QA (Question Answering) and NLI (Natural Language Inference) types in the sentence-pair configuration of input data? Third, is there any performance difference according to the position of the sentence containing the aspect category in the QA or NLI type sentence-pair configuration of input data? To achieve these research objectives, we implemented 12 ACSC models and conducted experiments on 4 English benchmark datasets. As a result, we derived ACSC models that outperform existing studies without expanding the training dataset. In addition, it was found that reflecting the output vector of the aspect category token is more effective than using only the output vector of the [CLS] token as a classification vector. It was also found that QA type input generally provides better performance than NLI, and that the position of the sentence with the aspect category in the QA type does not affect performance. There may be some differences depending on the characteristics of the dataset, but when using NLI type sentence-pair input, placing the sentence containing the aspect category second seems to provide better performance. The new methodology for designing the ACSC model used in this study could be similarly applied to other studies such as ATSC.
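
The sentence-pair configurations compared in the abstract can be illustrated with a small helper that pairs a review with an auxiliary sentence for the aspect category. The two templates below are hypothetical examples of a QA-style question and an NLI-style pseudo-sentence; the abstract does not give the exact wording the authors used. The `[CLS] A [SEP] B [SEP]` layout is the standard BERT sentence-pair format, which a tokenizer normally adds on its own; it is written out here only to make the ordering visible.

```python
def build_pair(review, aspect, style="QA", aspect_first=False):
    """Build a BERT-style sentence-pair string for one review and aspect.

    style:        "QA" turns the aspect into a question,
                  "NLI" uses the bare aspect as a pseudo-sentence.
    aspect_first: put the aspect sentence before the review if True.
    Both templates are illustrative assumptions.
    """
    if style == "QA":
        auxiliary = f"what do you think of the {aspect} ?"
    elif style == "NLI":
        auxiliary = aspect
    else:
        raise ValueError(f"unknown style: {style!r}")
    first, second = (auxiliary, review) if aspect_first else (review, auxiliary)
    return f"[CLS] {first} [SEP] {second} [SEP]"
```

Varying `style` and `aspect_first` over the same review produces the input variants whose effect on ACSC performance the study compares.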

