• Title/Summary/Keyword: Size Normalization


A Local Alignment Algorithm using Normalization by Functions (함수에 의한 정규화를 이용한 local alignment 알고리즘)

  • Lee, Sun-Ho;Park, Kun-Soo
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.34 no.5_6
    • /
    • pp.187-194
    • /
    • 2007
  • A local alignment algorithm compares two strings and finds a substring pair with size l and similarity s. To find a pair with both sufficient size and high similarity, existing normalization approaches maximize the ratio of the similarity to the size. In this paper, we introduce normalization by functions, which maximizes f(s)/g(l), where f and g are non-decreasing functions. These functions are determined by experiments comparing DNA sequences, in which our normalization by functions finds appropriate local alignments. For a previous algorithm that evaluates similarity using the longest common subsequence, we show that the algorithm can also maximize the function-normalized score f(s)/g(l) without loss of time.
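As a concrete illustration, the function-normalized objective can be stated as a brute-force search over all substring pairs (function and variable names here are hypothetical; the point of the paper's algorithm is to achieve this maximization without the exhaustive enumeration):

```python
def lcs_len(a, b):
    # classic dynamic program for the longest common subsequence length
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, ca in enumerate(a, 1):
        for j, cb in enumerate(b, 1):
            dp[i][j] = dp[i - 1][j - 1] + 1 if ca == cb else max(dp[i - 1][j], dp[i][j - 1])
    return dp[len(a)][len(b)]

def best_normalized_alignment(x, y, f, g):
    # maximize f(s)/g(l) over all non-empty substring pairs, where s is the
    # LCS-based similarity and l the combined size of the pair
    best = (0.0, "", "")
    for i in range(len(x)):
        for j in range(i + 1, len(x) + 1):
            for k in range(len(y)):
                for m in range(k + 1, len(y) + 1):
                    u, v = x[i:j], y[k:m]
                    s = lcs_len(u, v)        # similarity
                    l = len(u) + len(v)      # size
                    score = f(s) / g(l)
                    if score > best[0]:
                        best = (score, u, v)
    return best
```

With f(s) = s and g(l) = l this reduces to the plain similarity-to-size ratio; choosing, for example, f(s) = s² rewards longer high-similarity pairs.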

Building Hybrid Stop-Words Technique with Normalization for Pre-Processing Arabic Text

  • Atwan, Jaffar
    • International Journal of Computer Science & Network Security
    • /
    • v.22 no.7
    • /
    • pp.65-74
    • /
    • 2022
  • In natural language processing, commonly used words such as prepositions are referred to as stop-words; they carry no inherent meaning and are therefore ignored in indexing and retrieval tasks. Removing stop-words from Arabic text significantly reduces the size of a corpus, which improves the effectiveness and performance of Arabic-language processing systems. This study investigated the effectiveness of applying stop-word list elimination with normalization as a preprocessing step. The idea was to merge a statistical method with a linguistic method to attain the best efficacy, and to compare the effects of this two-pronged approach in reducing corpus size for Arabic natural language processing systems. Three stop-word lists were considered: an Arabic Text Lookup Stop-list, a Frequency-based Stop-list using Zipf's law, and a Combined Stop-list. An experiment was conducted using a selected file from the Arabic Newswire data set, in which the size of the corpus was compared after removing the words contained in each list. The results showed that the best reduction in size was achieved by the Combined Stop-list with normalization, with a word-count reduction of 452,930 and a compression rate of 30%.
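A minimal sketch of the two preprocessing steps the abstract combines, normalization followed by stop-word elimination (the character mappings shown are common Arabic normalization rules; the paper's actual stop-lists and rules are not reproduced here):

```python
import re

ALEF_VARIANTS = "\u0622\u0623\u0625"          # آ أ إ  -> bare alef ا
DIACRITICS = re.compile(r"[\u064B-\u0652]")    # tashkeel (vowel) marks

def normalize(token):
    # strip diacritics, unify alef variants, map teh marbuta to heh
    token = DIACRITICS.sub("", token)
    for ch in ALEF_VARIANTS:
        token = token.replace(ch, "\u0627")
    return token.replace("\u0629", "\u0647")

def remove_stopwords(text, stoplist):
    # normalize each token, drop stop-words, report the word-count
    # compression rate used as the evaluation metric in the abstract
    tokens = [normalize(t) for t in text.split()]
    kept = [t for t in tokens if t not in stoplist]
    rate = 1 - len(kept) / len(tokens)
    return kept, rate
```

Normalizing before the stop-list lookup matters: an alef-variant spelling of a stop-word would otherwise slip past an exact-match list.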

Isolated-Word Speech Recognition using Variable-Frame Length Normalization (가변프레임 길이정규화를 이용한 단어음성인식)

  • Sin, Chan-Hu;Lee, Hui-Jeong;Park, Byeong-Cheol
    • The Journal of the Acoustical Society of Korea
    • /
    • v.6 no.4
    • /
    • pp.21-30
    • /
    • 1987
  • Length normalization by variable frame size is proposed as a novel approach to solving the problem that variation in the length of spoken words lowers recognition accuracy. This method has the advantage of shortening recognition time because it reduces the number of frames constituting a word compared with length normalization by a fixed frame size. In this paper, variable frame length normalization is applied to multisection vector quantization, and the efficiency of the method is evaluated in terms of recognition time and accuracy through practical recognition experiments.
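The idea of a frame length that adapts to utterance length can be sketched as follows (a toy per-frame feature stands in for the real acoustic analysis; names are illustrative):

```python
import numpy as np

def variable_frame_normalize(signal, n_frames):
    # the frame length adapts to the utterance length so every word is
    # represented by exactly n_frames frames, regardless of how long it is;
    # trailing samples that do not fill a frame are dropped in this sketch
    frame_len = max(1, len(signal) // n_frames)
    frames = [signal[i * frame_len:(i + 1) * frame_len] for i in range(n_frames)]
    return [float(np.mean(f)) for f in frames]  # toy per-frame feature
```

A short and a long utterance both yield n_frames feature vectors, which is what removes the length variation before vector quantization.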


Comparison of Normalization Methods for Defining Copy Number Variation Using Whole-genome SNP Genotyping Data

  • Kim, Ji-Hong;Yim, Seon-Hee;Jeong, Yong-Bok;Jung, Seong-Hyun;Xu, Hai-Dong;Shin, Seung-Hun;Chung, Yeun-Jun
    • Genomics & Informatics
    • /
    • v.6 no.4
    • /
    • pp.231-234
    • /
    • 2008
  • Precise and reliable identification of CNV is still important to fully understand the effect of CNV on genetic diversity and the background of complex diseases. SNP markers have been used frequently to detect CNVs, but the analysis of SNP chip data for identifying CNV has not been well established. We compared various normalization methods for CNV analysis and suggest an optimal normalization procedure for reliable CNV calls. Four normal Koreans and the NA10851 HapMap male sample were genotyped using the Affymetrix Genome-Wide Human SNP array 5.0. We evaluated the effect of median and quantile normalization to find the optimal normalization for CNV detection based on SNP array data, and also explored the effect of Robust Multichip Average (RMA) background correction for each normalization process. In total, the following four combinations of normalization were tried: 1) median normalization without RMA background correction, 2) quantile normalization without RMA background correction, 3) median normalization with RMA background correction, and 4) quantile normalization with RMA background correction. CNV was called using the SW-ARRAY algorithm. We applied the four combinations and compared their effects using intensity ratio profiles, box plots, and MA plots. When we applied median and quantile normalization without RMA background correction, both methods showed a similar normalization effect, and the final CNV calls were also similar in number and size. In both median and quantile normalization, RMA background correction widened the range of the intensity ratio distribution, which may suggest that RMA background correction helps to detect more CNVs compared to no correction.
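The two normalizations being compared can be sketched as follows, for a probes-by-samples intensity matrix (a simplified version; the paper's full pipeline and the RMA background correction are not reproduced):

```python
import numpy as np

def quantile_normalize(X):
    # force every column (array/sample) to the same distribution:
    # the mean of the column-wise sorted values
    ranks = np.argsort(np.argsort(X, axis=0), axis=0)
    mean_sorted = np.sort(X, axis=0).mean(axis=1)
    return mean_sorted[ranks]

def median_normalize(X):
    # scale each array so its median matches the global median
    med = np.median(X, axis=0)
    return X * (np.median(X) / med)
```

Quantile normalization equalizes the whole intensity distribution across arrays, while median normalization only removes a per-array scale factor; which is preferable for CNV calling is exactly what the study evaluates.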

A Comparison on the Image Normalizations for Image Information Estimation

  • Kang, Hwan-Il;Lim, Seung-Chul;Kim, Kab-Il;Son, Young-I
    • Institute of Control, Robotics and Systems (ICROS): Conference Proceedings
    • /
    • 2005.06a
    • /
    • pp.2385-2388
    • /
    • 2005
  • In this paper, we propose estimation methods for image affine information in computer vision. The first estimation method is based on XYS image normalization, and the second on the image normalization by Pei and Lin. The XYS normalization method turns out to have better performance than the method by Pei and Lin. In addition, we show that rotation and aspect-ratio information can be obtained using the central moments of both the original image and the sensed image. Finally, we propose a modified version of the normalization method so that the size of the image can be controlled.


Semantic Segmentation of Drone Imagery Using Deep Learning for Seagrass Habitat Monitoring (잘피 서식지 모니터링을 위한 딥러닝 기반의 드론 영상 의미론적 분할)

  • Jeon, Eui-Ik;Kim, Seong-Hak;Kim, Byoung-Sub;Park, Kyung-Hyun;Choi, Ock-In
    • Korean Journal of Remote Sensing
    • /
    • v.36 no.2_1
    • /
    • pp.199-215
    • /
    • 2020
  • Seagrass, a marine vascular plant, plays an important role in the marine ecosystem, so seagrass habitats are monitored periodically. Recently, drones that can easily acquire very high-resolution imagery have been used increasingly to monitor seagrass habitats efficiently, and deep learning based on convolutional neural networks has shown excellent performance in semantic segmentation, so studies applying deep learning models have been actively conducted in remote sensing. However, segmentation accuracy varies with the hyperparameters, the deep learning model, and the imagery, and neither image normalization nor tile and batch sizes are standardized. In this study, seagrass habitats were therefore segmented from drone-borne imagery using deep learning, and the results were compared and analyzed with a focus on normalization and tile size. For comparison according to normalization, tile, and batch size, a grayscale image and grayscale imagery converted by Z-score and Min-Max normalization were used. The tile size was increased at a specific interval, while the batch size was set to use as much memory as possible. As a result, the IoU of Z-score normalized imagery was 0.26~0.4 higher than that of the other imagery, and differences of up to 0.09 were found depending on the tile and batch size. Since the results differed according to normalization, tile, and batch size, this experiment shows that these factors require a suitable decision process.
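The two image normalizations compared in the study, plus the tiling step, can be sketched as follows (hypothetical helper names; the segmentation network itself is out of scope here):

```python
import numpy as np

def z_score(img):
    # zero mean, unit variance per image
    return (img - img.mean()) / img.std()

def min_max(img):
    # rescale intensities to [0, 1]
    return (img - img.min()) / (img.max() - img.min())

def tiles(img, size):
    # non-overlapping square tiles of the chosen size, as fed to the network;
    # partial tiles at the borders are skipped in this sketch
    h, w = img.shape
    for y in range(0, h - size + 1, size):
        for x in range(0, w - size + 1, size):
            yield img[y:y + size, x:x + size]
```

Note that Z-score normalization is not bounded to a fixed range, which is one plausible reason its behavior differs from Min-Max imagery in training.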

URL Signatures for Improving URL Normalization (URL 정규화 향상을 위한 URL 서명)

  • Soon, Lay-Ki;Lee, Sang-Ho
    • Journal of KIISE:Databases
    • /
    • v.36 no.2
    • /
    • pp.139-149
    • /
    • 2009
  • In the standard URL normalization mechanism, URLs are normalized syntactically by a set of predefined steps. In this paper, we propose to complement standard URL normalization by incorporating semantically meaningful metadata of the web pages. The metadata taken into consideration are the body texts and the page sizes of the web pages, which can be extracted during HTML parsing. The results of our first, exploratory experiment indicate that body texts are effective in identifying equivalent URLs. Hence, given a URL that has undergone standard normalization, in the second experiment we construct its URL signature by hashing the body text of the associated web page using Message-Digest algorithm 5 (MD5). URLs that share identical signatures are considered equivalent in our scheme. The results of the second experiment show that our proposed URL signatures further reduced redundant URLs by 32.94% compared with standard URL normalization.
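A minimal sketch of the signature construction (MD5 over body text is what the paper names; the whitespace and case canonicalization shown before hashing is an assumption of this sketch):

```python
import hashlib

def url_signature(body_text):
    # collapse whitespace and fold case (assumed canonicalization), then
    # hash the page's body text; URLs whose pages share a signature are
    # treated as equivalent
    canonical = " ".join(body_text.split()).lower()
    return hashlib.md5(canonical.encode("utf-8")).hexdigest()
```

Two syntactically different URLs that serve the same body text then map to the same 32-character hex signature and can be deduplicated with a single hash-table lookup.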

Recognition of Road Surface Marks and Numbers Using Connected Component Analysis and Size Normalization (연결 성분 분석과 크기 정규화를 이용한 도로 노면 표시와 숫자 인식)

  • Jung, Min Chul
    • Journal of the Semiconductor & Display Technology
    • /
    • v.21 no.1
    • /
    • pp.22-26
    • /
    • 2022
  • This paper proposes a new method for the recognition of road surface marks and numbers. The proposed method designates a region of interest on the road surface without first detecting a lane. The road surface markings are extracted by location and size using connected component analysis. Distortion due to the perspective effect is minimized by normalizing the size of the road markings. The road surface marking of each connected component is recognized by matching it against stored road marking templates. The proposed method is implemented in C on a Raspberry Pi 4 system with a camera module for real-time image processing. The system was mounted in a moving vehicle and recorded video like a vehicle black box. Each frame of the recorded video was extracted, and the proposed method was tested on it. The results show that the proposed method successfully recognizes road surface marks and numbers.
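Size normalization of an extracted connected component can be sketched as a nearest-neighbour resample to a fixed template size (the 32×32 target and the helper name are assumptions, not the paper's parameters; the paper's implementation is in C):

```python
import numpy as np

def normalize_size(component, out=(32, 32)):
    # nearest-neighbour resample of a binary component mask to a fixed size,
    # reducing perspective-induced scale differences before template matching
    h, w = component.shape
    ys = np.arange(out[0]) * h // out[0]
    xs = np.arange(out[1]) * w // out[1]
    return component[np.ix_(ys, xs)]
```

After this step, a distant (small) and a nearby (large) instance of the same mark produce masks of identical shape, so a single template per class suffices.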

Affine-Invariant Image Normalization for Log-Polar Images using Momentums

  • Son, Young-Ho;You, Bum-Jae;Oh, Sang-Rok;Park, Gwi-Tae
    • Institute of Control, Robotics and Systems (ICROS): Conference Proceedings
    • /
    • 2003.10a
    • /
    • pp.1140-1145
    • /
    • 2003
  • Image normalization is one of the important areas in pattern recognition. Log-polar images are also useful in the sense that their data size is reduced dramatically compared with conventional images, making faster pattern recognition algorithms possible; moreover, the log-polar image closely resembles the structure of the human eye. However, while a number of studies on visual tracking have been carried out, there has been almost no research on pattern recognition using log-polar images. We propose an image normalization technique for log-polar images using moments, applicable to affine-invariant pattern recognition. We handle the basic distortions of an image, including translation, rotation, scaling, and skew of a log-polar image. The algorithm is demonstrated successfully in a PC-based real-time vision system.
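Moment-based normalization starts from the central moments of the image; a minimal sketch for an ordinary raster image follows (extending this to the log-polar geometry is the paper's contribution, not shown here):

```python
import numpy as np

def central_moment(img, p, q):
    # mu_pq about the image centroid; mu_10 = mu_01 = 0 by construction,
    # while mu_11, mu_20, mu_02 carry the rotation/scale/skew information
    # used to undo affine distortions
    ys, xs = np.mgrid[:img.shape[0], :img.shape[1]]
    m00 = img.sum()
    cx = (xs * img).sum() / m00
    cy = (ys * img).sum() / m00
    return float((((xs - cx) ** p) * ((ys - cy) ** q) * img).sum())
```

Because the moments are computed relative to the centroid, translation is removed first, and the second-order moments then determine the rotation and scaling to normalize away.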


Histogram Equalization Using Centroids of Fuzzy C-Means of Background Speakers' Utterances for Majority Voting Based Speaker Identification (다수 투표 기반의 화자 식별을 위한 배경 화자 데이터의 퍼지 C-Means 중심을 이용한 히스토그램 등화기법)

  • Kim, Myung-Jae;Yang, Il-Ho;Yu, Ha-Jin
    • The Journal of the Acoustical Society of Korea
    • /
    • v.33 no.1
    • /
    • pp.68-74
    • /
    • 2014
  • In a previous work, we proposed a novel approach of histogram equalization using a supplement set which is composed of centroids of Fuzzy C-Means of the background utterances. The performance of the proposed method is affected by the size of the supplement set, but it is difficult to find the best size at the point of recognition. In this paper, we propose a histogram equalization using a supplement set for majority voting based speaker identification. The proposed method identifies test utterances using a majority voting on the histogram equalization methods with various sizes of supplement sets. The proposed method is compared with the conventional feature normalization methods such as CMN(Cepstral Mean Normalization), MVN(Mean and Variance Normalization), and HEQ(Histogram Equalization) and the histogram equalization method using a supplement set.