• Title/Summary/Keyword: sequential and parallel algorithms

Search Result 35, Processing Time 0.021 seconds

Parallel Computation For The Edit Distance Based On The Four-Russians' Algorithm (4-러시안 알고리즘 기반의 편집거리 병렬계산)

  • Kim, Young Ho;Jeong, Ju-Hui;Kang, Dae Woong;Sim, Jeong Seop
    • KIPS Transactions on Computer and Communication Systems
    • /
    • v.2 no.2
    • /
    • pp.67-74
    • /
    • 2013
  • Approximate string matching problems have been studied in diverse fields. Recently, fast approximate string matching algorithms are being used to reduce the time and costs for the next generation sequencing. To measure the amounts of errors between two strings, we use a distance function such as the edit distance. Given two strings X(|X| = m) and Y(|Y| = n) over an alphabet ${\Sigma}$, the edit distance between X and Y is the minimum number of edit operations to convert X into Y. The edit distance between X and Y can be computed using the well-known dynamic programming technique in O(mn) time and space. The edit distance also can be computed using the Four-Russians' algorithm whose preprocessing step runs in $O((3{\mid}{\Sigma}{\mid})^{2t}t^2)$ time and $O((3{\mid}{\Sigma}{\mid})^{2t}t)$ space and the computation step runs in O(mn/t) time and O(mn) space where t represents the size of the block. In this paper, we present a parallelized version of the computation step of the Four-Russians' algorithm. Our algorithm computes the edit distance between X and Y in O(m+n) time using m/t threads. Then we implemented both the sequential version and our parallelized version of the Four-Russians' algorithm using CUDA to compare the execution times. When t = 1 and t = 2, our algorithm runs about 10 times and 3 times faster than the sequential algorithm, respectively.

A Vectorization Technique at Object Code Level (목적 코드 레벨에서의 벡터화 기법)

  • Lee, Dong-Ho;Kim, Ki-Chang
    • The Transactions of the Korea Information Processing Society
    • /
    • v.5 no.5
    • /
    • pp.1172-1184
    • /
    • 1998
  • ILP(Instruction Level Parallelism) processors use code reordering algorithms to expose parallelism in a given sequential program. When applied to a loop, this algorithm produces a software-pipelined loop. In a software-pipelined loop, each iteration contains a sequence of parallel instructions that are composed of data-independent instructions collected across from several iterations. For vector loops, however the software pipelining technique can not expose the maximum parallelism because it schedules the program based only on data-dependencies. This paper proposes to schedule differently for vector loops. We develop an algorithm to detect vector loops at object code level and suggest a new vector scheduling algorithm for them. Our vector scheduling improves the performance because it can schedule not only based on data-dependencies but on loop structure or iteration conditions at the object code level. We compare the resulting schedules with those by software-pipelining techniques in the aspect of performance.

  • PDF

Hybrid CMA-ES/SPGD Algorithm for Phase Control of a Coherent Beam Combining System and its Performance Analysis by Numerical Simulations (CMA-ES/SPGD 이중 알고리즘을 통한 결맞음 빔 결합 시스템 위상제어 및 동작성능에 대한 전산모사 분석)

  • Minsu, Yeo;Hansol, Kim;Yoonchan, Jeong
    • Korean Journal of Optics and Photonics
    • /
    • v.34 no.1
    • /
    • pp.1-12
    • /
    • 2023
  • In this study, we propose a hybrid phase-control algorithm for multi-channel coherent beam combining (CBC) system by combining the covariant matrix adaption evolution strategy (CMA-ES) and stochastic parallel gradient descent (SPGD) algorithms and analyze its operational performance. The proposed hybrid CMA-ES/SPGD algorithm is a sequential process which initially runs the CMA-ES algorithm until the combined final output intensity reaches a preset interim value, and then switches to running the SPGD algorithm to the end of the whole process. For ideal 7-channel and 19-channel all-fiber-based CBC systems, we have found that the mean convergence time can be reduced by about 10% in comparison with the case when the SPGD algorithm is implemented alone. Furthermore, we analyzed a more realistic situation in which some additional phase noise was introduced in the same CBC system. As a result, it is shown that the proposed algorithm reduces the mean convergence time by about 17% for a 7-channel CBC system and 16-27% for a 19-channel system compared to the existing SPGD alone algorithm. We expect that for implementing a CBC system in a real outdoor environment where phase noise cannot be ignored, the hybrid CMA-ES/SPGD algorithm proposed in this study will be exploited very usefully.

A Study on Improved Image Matching Method using the CUDA Computing (CUDA 연산을 이용한 개선된 영상 매칭 방법에 관한 연구)

  • Cho, Kyeongrae;Park, Byungjoon;Yoon, Taebok
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.16 no.4
    • /
    • pp.2749-2756
    • /
    • 2015
  • Recently, Depending on the quality of data increases, the problem of time-consuming to process the image is raised by being required to accelerate the image processing algorithms, in a traditional CPU and CUDA(Compute Unified Device Architecture) based recognition system for computing speed and performance gains compared to OpenMP When character recognition has been learned by the system to measure the input by the character data matching is implemented in an environment that recognizes the region of the well, so that the font of the characters image learning English alphabet are each constant and standardized in size and character an image matching method for calculating the matching has also been implemented. GPGPU (General Purpose GPU) programming platform technology when using the CUDA computing techniques to recognize and use the four cores of Intel i5 2500 with OpenMP to deal quickly and efficiently an algorithm, than the performance of existing CPU does not produce the rate of four times due to the delay of the data of the partition and merge operation proposed a method of improving the rate of speed of about 3.2 times, and the parallel processing of the video card that processes a result, the sequential operation of the process compared to CPU-based who performed the performance gain is about 21 tiems improvement in was confirmed.

Sentiment Analysis of Movie Review Using Integrated CNN-LSTM Mode (CNN-LSTM 조합모델을 이용한 영화리뷰 감성분석)

  • Park, Ho-yeon;Kim, Kyoung-jae
    • Journal of Intelligence and Information Systems
    • /
    • v.25 no.4
    • /
    • pp.141-154
    • /
    • 2019
  • Rapid growth of internet technology and social media is progressing. Data mining technology has evolved to enable unstructured document representations in a variety of applications. Sentiment analysis is an important technology that can distinguish poor or high-quality content through text data of products, and it has proliferated during text mining. Sentiment analysis mainly analyzes people's opinions in text data by assigning predefined data categories as positive and negative. This has been studied in various directions in terms of accuracy from simple rule-based to dictionary-based approaches using predefined labels. In fact, sentiment analysis is one of the most active researches in natural language processing and is widely studied in text mining. When real online reviews aren't available for others, it's not only easy to openly collect information, but it also affects your business. In marketing, real-world information from customers is gathered on websites, not surveys. Depending on whether the website's posts are positive or negative, the customer response is reflected in the sales and tries to identify the information. However, many reviews on a website are not always good, and difficult to identify. The earlier studies in this research area used the reviews data of the Amazon.com shopping mal, but the research data used in the recent studies uses the data for stock market trends, blogs, news articles, weather forecasts, IMDB, and facebook etc. However, the lack of accuracy is recognized because sentiment calculations are changed according to the subject, paragraph, sentiment lexicon direction, and sentence strength. This study aims to classify the polarity analysis of sentiment analysis into positive and negative categories and increase the prediction accuracy of the polarity analysis using the pretrained IMDB review data set. First, the text classification algorithm related to sentiment analysis adopts the popular machine learning algorithms such as NB (naive bayes), SVM (support vector machines), XGboost, RF (random forests), and Gradient Boost as comparative models. Second, deep learning has demonstrated discriminative features that can extract complex features of data. Representative algorithms are CNN (convolution neural networks), RNN (recurrent neural networks), LSTM (long-short term memory). CNN can be used similarly to BoW when processing a sentence in vector format, but does not consider sequential data attributes. RNN can handle well in order because it takes into account the time information of the data, but there is a long-term dependency on memory. To solve the problem of long-term dependence, LSTM is used. For the comparison, CNN and LSTM were chosen as simple deep learning models. In addition to classical machine learning algorithms, CNN, LSTM, and the integrated models were analyzed. Although there are many parameters for the algorithms, we examined the relationship between numerical value and precision to find the optimal combination. And, we tried to figure out how the models work well for sentiment analysis and how these models work. This study proposes integrated CNN and LSTM algorithms to extract the positive and negative features of text analysis. The reasons for mixing these two algorithms are as follows. CNN can extract features for the classification automatically by applying convolution layer and massively parallel processing. LSTM is not capable of highly parallel processing. Like faucets, the LSTM has input, output, and forget gates that can be moved and controlled at a desired time. These gates have the advantage of placing memory blocks on hidden nodes. The memory block of the LSTM may not store all the data, but it can solve the CNN's long-term dependency problem. Furthermore, when LSTM is used in CNN's pooling layer, it has an end-to-end structure, so that spatial and temporal features can be designed simultaneously. In combination with CNN-LSTM, 90.33% accuracy was measured. This is slower than CNN, but faster than LSTM. The presented model was more accurate than other models. In addition, each word embedding layer can be improved when training the kernel step by step. CNN-LSTM can improve the weakness of each model, and there is an advantage of improving the learning by layer using the end-to-end structure of LSTM. Based on these reasons, this study tries to enhance the classification accuracy of movie reviews using the integrated CNN-LSTM model.