• Title/Summary/Keyword: and Pre-Processing

A Study on the Use of Stopword Corpus for Cleansing Unstructured Text Data (비정형 텍스트 데이터 정제를 위한 불용어 코퍼스의 활용에 관한 연구)

  • Lee, Won-Jo
    • The Journal of the Convergence on Culture Technology / v.8 no.6 / pp.891-897 / 2022
  • In big data analysis, raw text data mostly exists in unstructured forms, so it can be analyzed only after it has been converted into structured form through heuristic pre-processing and computer-based post-processing cleansing. In this study, unnecessary elements are removed from the collected raw data during pre-processing, and stopwords are removed during post-processing, so that the wordcloud technique of the R program, one of the text data analysis techniques, can be applied. A case study of wordcloud analysis was then conducted, which calculates the frequency of occurrence of words and presents high-frequency words as key issues. To address the problems of the existing "nested stopword source code" method of stopword processing with the R wordcloud technique, we propose the use of a "general stopword corpus" and a "user-defined stopword corpus" and conduct a case analysis. The advantages and disadvantages of the proposed "unstructured data cleansing process model" are comparatively verified, and the practical application of wordcloud visualization analysis using the proposed "external corpus cleansing technique" is presented.
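
The idea of keeping stopwords in external corpora rather than nesting them in the analysis code can be sketched as follows. This is a minimal illustration in Python rather than the R workflow the study uses; the two stopword sets, the tokenizer, and the function name `word_frequencies` are hypothetical stand-ins, not the author's corpora or code.

```python
import re
from collections import Counter

# Hypothetical stand-ins for the two external corpora named in the abstract.
# In practice each would be loaded from its own file.
GENERAL_STOPWORDS = {"the", "a", "of", "and", "in", "is", "to"}
USER_STOPWORDS = {"data"}


def word_frequencies(text, stopword_sets):
    """Count word occurrences after removing every word found in any corpus."""
    stopwords = set().union(*stopword_sets)
    # Simple tokenizer (an assumption): lowercase runs of ASCII letters.
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(t for t in tokens if t not in stopwords)


freqs = word_frequencies(
    "The cleansing of text data and the analysis of text",
    [GENERAL_STOPWORDS, USER_STOPWORDS],
)
print(freqs.most_common(3))
```

Because the corpora are passed in as data, a user-defined list can be extended without touching the counting code, which is the kind of separation the abstract argues for.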

An implementation of the high speed image processing board for contact image sensor (Contact image sensor를 위한 고속 영상 처리 보드 구현)

  • Kang, Hyun-Inn;Ju, Yong-Wan;Baek, Kwang-Ryul
    • Journal of Institute of Control, Robotics and Systems / v.5 no.6 / pp.691-697 / 1999
  • This paper describes the implementation of a high-speed image processing board. The board consists of an image acquisition part and an image processing part. The image acquisition part digitizes the image input data from the CIS and saves it to dual-port RAM. Because the dual-port memory sits between the two parts, the image processing part can process large volumes of image data efficiently while the image is being acquired. Most of the image pre-processing is integrated in a large-scale FPGA. We use the ADSP-2181 from Analog Devices, Inc. for the image processing part and use all available DSP memory for the large-volume image data. In particular, IDMA is used to exchange data with an external microprocessor or an external PC, so that the acquired image and the image processing results can be viewed. Finally, we show the implemented board being used for a simulation of image retrieval, one of its typical applications.

Development of a 1-Chip Application-Specific DSP for the Next Generation FAX Image Processing (차세대 팩스 영상처리를 위한 1-Chip Application-Specific DSP 기법)

  • 김재호;강구수;김서규;이진우;이방원;김윤수;조석팔;하성한
    • Journal of the Korean Institute of Telematics and Electronics B / v.31B no.4 / pp.30-39 / 1994
  • A 1-chip, high-quality binarizing VLSI image processor (with an 8-bit ADC, a 6-bit flash ADC, 15K standard cells, and a 1K-word ROM) based on a 10 MIPS 16-bit DSP is implemented for FAX. This image processor (IP) performs image pre-processing, image quality improvement in copying and sending modes, and mixed image processing based on fuzzy theory. In addition, smoothing in the sub-scan direction is applied to data in normal receiving mode, so that the received data is enhanced to resemble fine-mode data. Each algorithm uses the same type of image processing window, and 2-D image processing is implemented with a 1-D line buffer. The fabricated chip was applied to a FAX machine, and the image quality improvement was verified.

PC-based Processing of Shallow Marine Multi-channel Seismic Data (PC기반의 천해저 다중채널 탄성파 자료의 전산처리)

  • 공영세;김국주
    • 한국해양학회지 / v.30 no.2 / pp.116-124 / 1995
  • Shallow marine seismic data were acquired and processed with a newly developed multi-channel (6-channel), PC-based digital recording and processing system. The digital processing system includes pre-processing, a swell-compensation filter, a frequency filter, gain correction, deconvolution, stacking, migration, and plotting. The quality of the processed sections is greatly enhanced in terms of signal-to-noise ratio and vertical and horizontal resolution. The multi-channel digital recording, acquisition, and processing system proved to be an economical, efficient, and easy-to-use tool for shallow marine seismic surveys.

Extraction of Canine Cataract Object for Developing Handy Pre-diagnostic Tool with Fuzzy Stretching and ART2 Learning

  • Kim, Kwang Baek
    • International Journal of Fuzzy Logic and Intelligent Systems / v.16 no.1 / pp.21-26 / 2016
  • Canine cataract develops with aging and can lead to blindness or require surgical treatment if it is not treated in time. The first observation must be made by pet owners, but they lack the equipment and knowledge needed to see the abnormalities. In this paper, we propose an intelligent image processing method that extracts suspected canine cataract objects from photographs taken with non-professional equipment such as ordinary digital cameras and cellular phones, so that even casual dog owners can make a pre-diagnosis of this disease, which may require surgery, as early as possible. The experiment shows that the proposed method is successful in most cases, except when the dog's hair is similar in color to the cataract.

Algorithm for Improving GPS Performance by Data Pre-processing (데이터 사전처리에 의한 GPS 성능 개선 알고리즘)

  • Rhee Jae-Hoon;Hong Won-Chul;Kim Hyun-Soo;Jeon Chang-Wan
    • Journal of Institute of Control, Robotics and Systems / v.12 no.8 / pp.752-758 / 2006
  • A GPS receiver provides a range of information such as calculated position, speed, heading, satellite status, and current time errors. It is well known that GPS signals from a receiver mounted on a moving vehicle are often distorted, contaminated by various kinds of noise, and blocked by tunnels or tall buildings. These effects often prevent correct navigation, especially when a vehicle stops repeatedly or moves at low speed. The signals therefore need to be pre-processed before they can be used in various applications. In this paper, an algorithm for pre-processing the signals is proposed. GPS data obtained from a uNAV GPS receiver are analyzed and classified according to their dynamic characteristics. The proposed algorithm is then applied to the data, and test results are shown to verify its usefulness.

Enhanced Pre echo Control Algorithm for MPEG Audio Coders (MPEG 오디오 부호화기를 위한 향상된 프리 에코 컨트롤 알고리듬)

  • Lee Chang-Joon;Lee Jae-Seong;Park Young-Cheol
    • Journal of Broadcast Engineering / v.11 no.2 s.31 / pp.191-199 / 2006
  • This paper presents an efficient pre-echo control scheme for MPEG audio coders based on psychoacoustic model II (PAM-II). Pre-echo control is the final step in calculating the masking threshold in PAM-II. Its purpose is to minimize the spread of quantization error over the processing frame. In conventional encoders, pre-echo is reduced by restricting the estimated masking threshold so that it does not exceed the one obtained in the previous frame. The conventional method performs pre-echo control not only for short blocks but also for long blocks, which lowers the masking threshold in long blocks and, in turn, increases the quantization noise level of the corresponding blocks. This paper proposes an efficient pre-echo control process. The test results show a mean enhancement of more than 0.4 on the ITU-R 5-point audio impairment scale, especially for complex signals.
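
The conventional restriction described above, in which the estimated masking threshold may not exceed the threshold from the previous frame, can be sketched per band as below. This is a simplified illustration only: the function name, the per-band list representation, and the `scale` factor are assumptions, and it does not reproduce the scheme the paper proposes.

```python
def pre_echo_control(estimated, previous, scale=1.0):
    """Clamp each band's estimated masking threshold to the previous frame's.

    estimated: thresholds estimated for the current frame, one per band.
    previous:  thresholds obtained for the previous frame, one per band.
    scale:     hypothetical multiplier on the previous threshold
               (1.0 means "never exceed the previous frame").
    """
    return [min(e, scale * p) for e, p in zip(estimated, previous)]


current = pre_echo_control([4.0, 1.0, 3.0], [2.0, 2.0, 5.0])
print(current)  # [2.0, 1.0, 3.0]
```

In the first band the estimate (4.0) is pulled down to the previous value (2.0). Applying the same clamp to long blocks is what lowers their masking threshold and raises quantization noise, as the abstract points out.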

Trends in Deep Learning-based Medical Optical Character Recognition (딥러닝 기반의 의료 OCR 기술 동향)

  • Sungyeon Yoon;Arin Choi;Chaewon Kim;Sumin Oh;Seoyoung Sohn;Jiyeon Kim;Hyunhee Lee;Myeongeun Han;Minseo Park
    • The Journal of the Convergence on Culture Technology / v.10 no.2 / pp.453-458 / 2024
  • Optical Character Recognition (OCR) is a technology that recognizes text in images and converts it into digital format. Because of its high recognition performance, deep learning-based OCR is used in many industries that hold large quantities of recorded data. The medical industry has actively introduced deep learning-based OCR to improve medical services. In this paper, we discuss trends in OCR engines and medical OCR and provide a roadmap for the development of medical OCR. Current medical OCR has improved its recognition performance by applying natural language processing to the detected text data. However, recognition performance remains limited, especially for non-standard handwriting and modified text. Developing advanced medical OCR requires building databases of medical data, image pre-processing, and natural language processing.

An Efficient Pre-computing Method for Processing Continuous Skyline Queries in Road Networks (도로망에서 연속적인 스카이라인 절의처리를 위한 효율적인 전처리기법)

  • Jang, Su-Min;Yoo, Jae-Soo
    • Journal of KIISE:Databases / v.36 no.4 / pp.314-320 / 2009
  • Skyline queries have recently received considerable attention in search services. The skyline contains the interesting objects that are not dominated by any other object on all dimensions. Many related works have processed skylines on static data or on moving objects in Euclidean space. This paper, however, assumes that the query point of a skyline query moves continuously in a road network. We propose a new method that efficiently processes continuous skyline queries in road networks by using pre-computed shortest range data of objects. Our experiments show that the proposed method is about 100 times faster than previous methods in terms of query processing time.
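
The dominance relation behind a skyline can be illustrated with a naive static computation. This sketch assumes that smaller values are better in every dimension (for example distance and price) and uses hypothetical function names; it is a brute-force baseline, not the pre-computing method for road networks proposed in the paper.

```python
def dominates(a, b):
    """True if a is no worse than b in every dimension and strictly better in one.

    Assumption: smaller values are better in all dimensions.
    """
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def skyline(points):
    """Return the points that no other point dominates (O(n^2) comparison)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


points = [(1, 9), (3, 3), (4, 4), (9, 1), (5, 8)]
result = skyline(points)
print(result)  # [(1, 9), (3, 3), (9, 1)]
```

Here (4, 4) and (5, 8) are both dominated by (3, 3), so they drop out. A continuous query has to keep this result up to date as the query point moves, which is why pre-computed data can cut query time.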

Design and Implementation of Sensor Network based Autonomous Vehicle Control System (센서 네트워크 기반 자율주행 자동차 제어 시스템 설계 및 구현)

  • Jang, Won-Chul;Kim, Jong-Myon
    • IEMEK Journal of Embedded Systems and Applications / v.7 no.5 / pp.247-253 / 2012
  • This paper presents a sensor-network-based autonomous vehicle system that uses a proposed image processing algorithm. The algorithm consists of pre-processing followed by five image processing stages: coordinate calculation, driving area decision, line segment calculation, steering decision, and acceleration decision. We evaluate the performance of the proposed algorithm on both straight and curved roads. Experimental results indicate that the proposed algorithm works well for autonomous vehicles. However, its control accuracy decreases as speed increases.