• Title/Summary/Keyword: preprocessing technique

Preprocessing Technique for Malicious Comments Detection Considering the Form of Comments Used in the Online Community (온라인 커뮤니티에서 사용되는 댓글의 형태를 고려한 악플 탐지를 위한 전처리 기법)

  • Kim Hae Soo;Kim Mi Hui
    • KIPS Transactions on Computer and Communication Systems / v.12 no.3 / pp.103-110 / 2023
  • As the Internet spread and online communities grew, anonymous communities emerged, and many users exploit anonymity to harm others through aggressive posts and comments. Administrators once checked posts and comments by hand, deleting them and blocking their authors, but as the number of community users increased, manual monitoring became infeasible. Word filtering was introduced first, rejecting any post or comment that contained specific words, but users bypassed the filters, for example by substituting similar words. Deep learning was then used to monitor user posts in real time, but communities have recently begun to use expressions that are not standard Korean words and can be understood only within the community or by a human reader. These characters come in many types and forms, which makes it difficult for an artificial intelligence model to learn all of them. Therefore, in this paper, we propose a preprocessing technique that renders each character of a sentence as an image and, using a CNN model trained on images of Korean consonants, vowels, and spacing, converts characters that only a human could interpret into the characters predicted by the CNN model. Experiments confirmed that the proposed preprocessing technique improved the performance of the LSTM, BiLSTM, and CNN-BiLSTM models by 3.2%, 3.3%, and 4.88%, respectively.

Preprocessing Technique Using a Feature of Character's Stroke (다문자 획의 특성을 이용한 전처리 기법)

  • 이수봉;김우생
    • Proceedings of the Korea Multimedia Society Conference / 2004.05a / pp.758-761 / 2004
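  • Online character recognition technology is used in many new applications such as PDAs and tablet PCs, but it still falls short of letting these advanced tools be used naturally. Therefore, to raise the recognition rate, this paper uses the number of strokes that make up a character during preprocessing, so that recognition is applied only to the HMM models matching that stroke count, reducing both recognition time and errors. The validity of the proposed methods was verified through experiments.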
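A minimal sketch of the stroke-count pruning idea in the abstract above, not the authors' implementation. Candidate models are indexed by stroke count, so only the models whose stroke count matches the input are scored. The names `index_by_stroke_count` and `recognize`, and the `score` callable that stands in for an HMM likelihood, are hypothetical.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Optional, Sequence, Tuple

# One stroke is a pen-down..pen-up trace of (x, y) points.
Stroke = Sequence[Tuple[float, float]]


def index_by_stroke_count(stroke_counts: Dict[str, int]) -> Dict[int, List[str]]:
    """Group character labels by the number of strokes used to write them."""
    index: Dict[int, List[str]] = defaultdict(list)
    for label, n_strokes in stroke_counts.items():
        index[n_strokes].append(label)
    return dict(index)


def recognize(
    strokes: Sequence[Stroke],
    index: Dict[int, List[str]],
    score: Callable[[str, Sequence[Stroke]], float],
) -> Optional[str]:
    """Score only the models whose stroke count matches the input.

    `score` is a placeholder for a per-character model likelihood
    (for example an HMM); it is an assumption of this sketch.
    """
    candidates = index.get(len(strokes), [])
    if not candidates:
        return None  # no model was registered for this stroke count
    return max(candidates, key=lambda label: score(label, strokes))
```

With N models spread over several stroke counts, each input is scored against only one group, which is where the reduction in recognition time would come from.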

User Identification and Session completion in Input Data Preprocessing for Web Mining (웹 마이닝을 위한 입력 데이타의 전처리과정에서 사용자구분과 세션보정)

  • 최영환;이상용
    • Journal of KIISE:Software and Applications / v.30 no.9 / pp.843-849 / 2003
  • Web usage mining is a data mining technique that analyzes web users' usage patterns from large web logs. To apply it, users and user sessions must be identified correctly during preprocessing, but log files in the standard web log format alone are not enough to identify them completely. Identification is complicated by many factors, such as local caches, firewalls, ISPs, user privacy, and cookies, and no definitive method currently solves these problems. The local cache problem in particular is the most difficult obstacle to identifying the user sessions that serve as input to web mining systems. In this paper we propose a heuristic method that solves the local cache problem using only server-side click stream data, such as the referrer log, agent log, and access log, and that identifies user sessions and completes them.
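The abstract above does not spell out its heuristic, so the following is only a sketch of a generic approach to the same problem. Users are separated by (IP, agent), sessions are split on an inactivity timeout, and pages served from the local cache are re-inserted when a request's referrer is an earlier page of the session rather than the most recent one. The `Hit` record, the function names, and the 30-minute timeout are assumptions of this sketch.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Hit:
    ip: str
    agent: str
    time: float    # seconds since some epoch
    url: str
    referrer: str  # empty string when the log has no referrer


def sessionize(hits: List[Hit], timeout: float = 1800.0) -> List[List[Hit]]:
    """Split hits into sessions per (ip, agent), cutting on inactivity."""
    sessions: List[List[Hit]] = []
    current_by_user: Dict[Tuple[str, str], List[Hit]] = {}
    for hit in sorted(hits, key=lambda h: h.time):
        key = (hit.ip, hit.agent)
        current = current_by_user.get(key)
        if current is None or hit.time - current[-1].time > timeout:
            current = []  # start a new session for this user
            sessions.append(current)
            current_by_user[key] = current
        current.append(hit)
    return sessions


def complete_path(session: List[Hit]) -> List[str]:
    """Re-insert back-navigation steps that never reached the server.

    If a hit's referrer is an earlier page of the path (not the last one),
    assume the user pressed "back" through cached pages to reach it.
    """
    path: List[str] = []
    for hit in session:
        if path and hit.referrer in path and path[-1] != hit.referrer:
            # Index of the most recent visit to the referrer page.
            idx = len(path) - 1 - path[::-1].index(hit.referrer)
            # Walk back from the page before the last one down to the referrer.
            path.extend(reversed(path[idx:len(path) - 1]))
        path.append(hit.url)
    return path
```

For a logged sequence A, B, C followed by a request for D whose referrer is A, `complete_path` yields A, B, C, B, A, D.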

Preprocessing Stage of Timing Simulator, TSIM1.0 : Partitioning and Dynamic Waveform Storage Management (Timing Simulator인 TSIM1.0에서의 전처리 과정 : 회로분할과 파형정보처리)

  • Kwon, Oh-Bong;Yoon, Hyun-Ro;Lee, Ki-Jun
    • Journal of the Korean Institute of Telematics and Electronics / v.26 no.3 / pp.153-159 / 1989
  • This paper describes the algorithms employed in the preprocessing stage of the timing simulator TSIM1.0, which is based on the Waveform Relaxation Method (WRM) at the CELL level. The preprocessing stage in TSIM1.0 (1) partitions a given circuit into DC connected blocks (DCBs), (2) forms strongly connected circuits (SCCs), and (3) orders the CELLs. An efficient waveform management technique for the WRM is also described, which allows waveform information to be overwritten to reduce storage requirements. With TSIM1.0, circuits containing up to 5000 MOSFETs can be analyzed within 1 hour of computation time on the IBM PC/AT. Simulation results for several types of MOS digital circuits are given to verify the performance of TSIM1.0.
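The abstract above names the steps of grouping strongly connected circuits and ordering cells but not the algorithms used. As a sketch only, the code below uses Tarjan's strongly connected components algorithm (a standard technique swapped in here, not confirmed to be what TSIM1.0 uses) on a hypothetical cell graph in which an edge `u -> v` means cell `u` drives cell `v`. Reversing Tarjan's output order gives groups in driver-before-driven order.

```python
from typing import Dict, List


def ordered_components(graph: Dict[str, List[str]]) -> List[List[str]]:
    """Return strongly connected groups of cells, drivers first.

    `graph` maps each cell name to the cells it drives (an assumed
    representation). Recursive Tarjan: fine for small sketches, but deep
    graphs would need an iterative version.
    """
    index: Dict[str, int] = {}
    low: Dict[str, int] = {}
    stack: List[str] = []
    on_stack = set()
    result: List[List[str]] = []
    counter = 0

    def visit(v: str) -> None:
        nonlocal counter
        index[v] = low[v] = counter
        counter += 1
        stack.append(v)
        on_stack.add(v)
        for w in graph.get(v, []):
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:
            # v is the root of a component: pop it off the stack.
            component = []
            while True:
                w = stack.pop()
                on_stack.discard(w)
                component.append(w)
                if w == v:
                    break
            result.append(component)

    for v in graph:
        if v not in index:
            visit(v)
    # Tarjan emits components in reverse topological order.
    result.reverse()
    return result
```

Cells that form a feedback loop land in the same group and would be relaxed together, while groups are processed in the returned order.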

Learning data preprocessing technique for improving indoor positioning performance based on machine learning (기계학습 기반의 실내 측위 성능 향상을 위한 학습 데이터 전처리 기법)

  • Kim, Dae-Jin;Hwang, Chi-Gon;Yoon, Chang-Pyo
    • Journal of the Korea Institute of Information and Communication Engineering / v.24 no.11 / pp.1528-1533 / 2020
  • Recently, indoor location recognition technology using Wi-Fi fingerprints has been applied and operated in various industrial fields and public services. With growing interest in machine learning, location recognition based on machine learning over the wireless signal data around a terminal is developing rapidly. However, when the radio signal data required for machine learning is collected, data that is distorted or unsuitable for learning lowers the accuracy of location recognition. In addition, when location recognition relies on data collected at specific locations, it performs poorly at surrounding locations that were not included in the learning. In this paper, we propose a learning data preprocessing technique that improves position recognition results by preprocessing the collected learning data.

An Artificial Intelligent based Learning Model for BIM Elements Usage (건축 부재 사용량 예측을 위한 인공지능 학습 모델)

  • Beom-Su Kim;Jong-Hyeok Park;Soo-Hee Han;Kyung-Jun Kim
    • The Journal of the Korea institute of electronic communication sciences / v.18 no.1 / pp.107-114 / 2023
  • This study describes a method of designing and implementing an artificial intelligence-based learning model for predicting the usage of building members. Artificial intelligence (AI) is widely used in various fields thanks to advances in technology, but in building information management (BIM) it is rarely applied because of the specificity of the field's data and the difficulty of collecting big data. Therefore, AI problems for BIM were identified, and a new preprocessing technique was devised to address the specificity of the data. An AI model was implemented on top of the designed preprocessing technique, and its accuracy in predicting construction component usage was confirmed to be at a level usable in the actual industry.

Driver Group Clustering Technique and Risk Estimation Method for Traffic Accident Prevention

  • Tae-Wook Kim;Ji-Woong Yang;Hyeon-Jin Jung;Han-Jin Lee;Ellen J. Hong
    • Journal of the Korea Society of Computer and Information / v.29 no.8 / pp.53-58 / 2024
  • Traffic accidents are not only a threat to human lives but also pose significant societal costs. Recently, research has been conducted to address the issue of traffic accidents by predicting the risk using deep learning technology and spatiotemporal information of roads. However, while traffic accidents are influenced not only by the spatiotemporal information of roads but also by human factors, research on the latter has been relatively less active. This paper analyzes driver groups and characteristics by applying clustering techniques to a traffic accident dataset and proposes and applies a method to calculate the Risk Level for each driver group and characteristic. In this process, the preprocessing technique suggested in this paper demonstrates a higher Silhouette Score of 0.255 compared to the commonly used One-Hot Embedding & Min-Max Scaling techniques, indicating its suitability as a preprocessing method.
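For readers unfamiliar with the baseline and metric named in the abstract above, here is a small pure-Python sketch of one-hot encoding, min-max scaling, and the mean silhouette score. It illustrates only those standard definitions. The preprocessing technique the paper proposes is not described in the abstract and is not reproduced here. Function names are hypothetical.

```python
import math
from typing import List, Sequence


def one_hot(values: Sequence[str]) -> List[List[float]]:
    """Encode each categorical value as a 0/1 vector over sorted categories."""
    categories = sorted(set(values))
    return [[1.0 if v == c else 0.0 for c in categories] for v in values]


def min_max_scale(values: Sequence[float]) -> List[float]:
    """Rescale numeric values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant column: nothing to scale
    return [(v - lo) / (hi - lo) for v in values]


def silhouette_score(points: List[Sequence[float]], labels: List[int]) -> float:
    """Mean of (b - a) / max(a, b) over all points.

    a = mean distance to the other points of the same cluster,
    b = smallest mean distance to the points of any other cluster.
    Requires at least two clusters; singleton clusters score 0 by convention.
    """
    scores = []
    for i, p in enumerate(points):
        own = [math.dist(p, q) for j, q in enumerate(points)
               if labels[j] == labels[i] and j != i]
        if not own:
            scores.append(0.0)
            continue
        a = sum(own) / len(own)
        b = min(
            sum(math.dist(p, q) for q, lab in zip(points, labels) if lab == other)
            / labels.count(other)
            for other in set(labels) - {labels[i]}
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)
```

Scores close to 1 indicate compact, well-separated clusters, and scores near 0 indicate overlapping clusters, so a preprocessing step can be compared against this baseline by clustering both feature sets and comparing the resulting scores.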

Preprocessing and Facial Feature Robust to Illumination Variations (조명변화에 강인한 전처리 및 얼굴특징)

  • Kim, Dong-Ju;Lee, Sang-Heon;Kim, Hyun-Duk
    • KIPS Transactions on Software and Data Engineering / v.2 no.7 / pp.503-506 / 2013
  • In this paper, we propose a face recognition method that combines the ECSP preprocessing technique, a modified version of the earlier CS-LBP, with the illumination-robust D2D-PCA feature. The proposed method was evaluated on the Yale B database using various binary pattern operators and feature extraction algorithms, such as the well-known PCA and 2D-PCA. As a result, the proposed method showed the best recognition accuracy among the compared approaches, and we confirmed that it is robust to illumination variation.

A Preprocessing Algorithm for Efficient Lossless Compression of Gray Scale Images

  • Kim, Sun-Ja;Hwang, Doh-Yeun;Yoo, Gi-Hyoung;You, Kang-Soo;Kwak, Hoon-Sung
    • 제어로봇시스템학회:학술대회논문집 / 2005.06a / pp.2485-2489 / 2005
  • This paper introduces a new preprocessing scheme that replaces the original data of gray scale images with specially ordered data so that lossless compression performs better. As a preprocessing technique for maximizing the performance of an entropy encoder, the proposed method converts the input image data into a more compressible form. Before a stream of the input image is encoded, the proposed preprocessor counts co-occurrence frequencies for neighboring pixel pairs. It then replaces each pair of adjacent gray values with ordered numbers based on those co-occurrence frequencies. Compressing the reordered image with an entropy encoder can be expected to raise the compression rate because the statistical features of the input image are enhanced. In this paper, we show that the lossless compression rate increased by up to 37.85% when comparing the results of compressing preprocessed and non-preprocessed image data with entropy encoders such as Huffman and arithmetic encoders.
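The abstract above does not give the exact replacement rule, so the following is one possible reading, sketched under that assumption. For every gray value, the values that follow it are ranked by co-occurrence frequency, and each pixel is replaced by the rank of its value given its left neighbor. Frequent transitions then map to small numbers, which skews the symbol distribution in favor of an entropy coder. The function names are hypothetical, and the rank tables would have to be stored as side information for decoding.

```python
from collections import Counter, defaultdict
from typing import Dict, List


def build_rank_tables(pixels: List[int]) -> Dict[int, List[int]]:
    """For each gray value, list its successors from most to least frequent."""
    counts: Dict[int, Counter] = defaultdict(Counter)
    for prev, cur in zip(pixels, pixels[1:]):
        counts[prev][cur] += 1
    # Ties are broken by gray value so the table is deterministic.
    return {
        prev: sorted(ctr, key=lambda v: (-ctr[v], v))
        for prev, ctr in counts.items()
    }


def encode(pixels: List[int], tables: Dict[int, List[int]]) -> List[int]:
    """Keep the first pixel; replace every other pixel by its rank."""
    if not pixels:
        return []
    out = [pixels[0]]
    for prev, cur in zip(pixels, pixels[1:]):
        out.append(tables[prev].index(cur))
    return out


def decode(codes: List[int], tables: Dict[int, List[int]]) -> List[int]:
    """Invert `encode` by looking ranks up in the same tables."""
    if not codes:
        return []
    out = [codes[0]]
    for rank in codes[1:]:
        out.append(tables[out[-1]][rank])
    return out
```

The mapping is lossless because each rank is looked up in the table of the already decoded previous pixel. The output would then be passed to a Huffman or arithmetic coder, which is outside this sketch.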

Radiometric and Geometric Correction of the KITSAT-1 CCD Earth Images (우리별 1호 지구 관측 영상의 방사학적 및 기하학적 보정)

  • 이임평;김태정
    • Korean Journal of Remote Sensing / v.12 no.1 / pp.26-42 / 1996
  • The CCD Earth Images Experiment (CEIE) is one of the main payloads of KITSAT-1. Since its launch on Oct. 10, 1992, the CEIE has taken more than 500 images of the Earth's surface worldwide. An image taken from space differs considerably from the actual feature on the Earth's surface because of various radiometric and geometric distortions. Preprocessing to remove those distortions has to take place before the image data are processed and analyzed further for various applications. This paper describes the preprocessing procedure, including radiometric and geometric correction. [...] e-processing system. The GCP marking using this technique showed sufficient accuracy for KITSAT-1 and KITSAT-2 narrow camera images.