• Title/Summary/Keyword: parallel computer processing


Template-Based Object-Order Volume Rendering with Perspective Projection (원형기반 객체순서의 원근 투영 볼륨 렌더링)

  • Koo, Yun-Mo; Lee, Cheol-Hi; Shin, Yeong-Gil
    • Journal of KIISE: Computer Systems and Theory / v.27 no.7 / pp.619-628 / 2000
  • Abstract Perspective views provide a powerful depth cue and thus aid the interpretation of complicated images. The main drawback of current perspective volume rendering is its long execution time. In this paper, we present an efficient perspective volume rendering algorithm based on coherency between rays. Two sets of templates are built for the rays cast from horizontal and vertical scanlines in the intermediate image, which is parallel to one of the volume faces. Each sample along a ray is calculated by interpolating neighboring voxels with the pre-computed weights in the templates. We also solve the problem of uneven sampling rate due to perspective ray divergence by building more templates for regions far away from the viewpoint. Since our algorithm operates in object order, it can avoid redundant access to each voxel and exploit spatial data coherency by using a run-length encoded volume. Experimental results show that the use of templates and object-order processing with a run-length encoded volume provide speedups compared to other approaches. Additionally, the image quality of our algorithm improves because the uneven sampling rate caused by perspective ray divergence is corrected.
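
The core trick described above -- all rays cast from one scanline of the intermediate image share the same sample geometry, so the interpolation weights can be precomputed once and reused -- can be illustrated with a minimal sketch (hypothetical Python, not the authors' implementation; `step` is an invented stand-in for the per-template sample spacing):

```python
import numpy as np

def build_template(num_samples, step):
    """Precompute per-sample voxel offsets and linear interpolation
    weights along a ray; every ray using this template reuses them."""
    positions = np.arange(num_samples) * step
    base = np.floor(positions).astype(int)   # integer voxel index
    w = positions - base                     # fractional weight
    return base, w

def resample_ray(scalars, base, w):
    """Interpolate samples along one ray from the precomputed template."""
    return (1.0 - w) * scalars[base] + w * scalars[base + 1]

scalars = np.random.rand(256)                # one row of voxel values
base, w = build_template(num_samples=200, step=1.27)
samples = resample_ray(scalars, base, w)     # no per-ray weight arithmetic
```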


New VLSI Architecture of Parallel Multiplier-Accumulator Based on Radix-2 Modified Booth Algorithm (Radix-2 MBA 기반 병렬 MAC의 VLSI 구조)

  • Seo, Young-Ho; Kim, Dong-Wook
    • Journal of the Institute of Electronics Engineers of Korea SD / v.45 no.4 / pp.94-104 / 2008
  • In this paper, we propose a new multiplier-and-accumulator (MAC) architecture for high-speed multiply-accumulate arithmetic. Performance is improved by combining the multiplication with the accumulation and devising a hybrid carry-save adder (CSA). Since the accumulator, which has the largest delay in a MAC, is removed and its function absorbed into the CSA, the overall performance is improved. The proposed CSA tree uses a 1's-complement-based radix-2 modified Booth algorithm (MBA) and a modified array for sign extension in order to increase the bit density of the operands. The CSA propagates the carries from the least significant bits of the partial products and generates the least significant bits in advance, decreasing the number of input bits of the final adder. The proposed MAC also accumulates the intermediate results as sum and carry bits rather than as the output of the final adder, improving performance by optimizing the pipeline scheme. The proposed architecture was designed and synthesized with 0.25 μm, 0.18 μm, 0.13 μm, and 90 nm standard CMOS libraries. We analyzed the results, including hardware resources, delay, and pipelining, based on theoretical and experimental estimation, using Sakurai's alpha-power law for delay modeling. The proposed MAC is superior to the standard design in many respects, and its performance is about twice that of previous work at similar clock frequencies.
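
As a rough illustration of the Booth recoding underlying the MBA, here is a minimal software sketch (the paper's contribution is a hardware CSA-tree design; this only shows how Booth digits in {-1, 0, +1} collapse runs of 1s into few nonzero partial products):

```python
def booth_recode(y, bits):
    """Recode a 'bits'-wide two's-complement multiplier into Booth digits."""
    digits = []
    prev = 0                       # implicit bit to the right of the LSB
    for i in range(bits):
        cur = (y >> i) & 1
        digits.append(prev - cur)  # digit_i = b_{i-1} - b_i
        prev = cur
    return digits                  # least significant digit first

def booth_multiply(x, y, bits=8):
    """Multiply via Booth digits: sum of +/- x shifted into position."""
    return sum(d * (x << i) for i, d in enumerate(booth_recode(y, bits)))

assert booth_multiply(13, 11) == 143
assert booth_multiply(7, 0b01111110) == 7 * 126  # run of 1s -> 2 nonzero digits
```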

Study of perception of the visual depth caused by the color correction (입체영상 제작에서 색 보정 결과가 입체감 인지에 미치는 영향 연구)

  • Han, Myung-Hee; Kim, Chee-Yong
    • Journal of Digital Contents Society / v.11 no.2 / pp.177-184 / 2010
  • These days, as digital production techniques have developed, 3D imaging is used in high-end computers and TVs, and research on 3D production techniques is actively in progress. Moreover, after James Cameron's movie 'Avatar', released in 2009, became a box-office hit, 3D imaging came to the fore again. Against this background, we studied the effect of color correction during the post-production stage on perceived visual depth. The purpose of this study is to provide data on how color correction in post-production affects depth perception, to aid the production of effective stereoscopic images. Our basic hypothesis was that color and contrast affect the perceived depth of a 3D image. The experiments showed changes in visual depth, space perception, and sense of depth. Applying these results, we produced a 15-minute 3D advertisement film and found that color correction during post-production was very effective for 3D depth. Left and right images captured with beam-splitter-based and parallel rigs were used for this study, and strong contrast was applied via color correction in post-production after correcting convergence and visual depth during editing. As a result, we could produce images with a strong sense of space and depth.

A Load Balancing Method using Partition Tuning for Pipelined Multi-way Hash Join (다중 해시 조인의 파이프라인 처리에서 분할 조율을 통한 부하 균형 유지 방법)

  • Mun, Jin-Gyu; Jin, Seong-Il; Jo, Seong-Hyeon
    • Journal of KIISE: Databases / v.29 no.3 / pp.180-192 / 2002
  • We investigate the effect of data skew in join attributes on the performance of a pipelined multi-way hash join method, and propose two new hash join methods for the shared-nothing multiprocessor environment. The first proposed method allocates buckets statically in round-robin fashion, and the second allocates buckets dynamically based on a frequency distribution. Using hash-based joins, multiple joins can be pipelined so that the early results from a join, before the whole join is completed, are sent to the next join without being staged on disk. The shared-nothing multiprocessor architecture is known to be more scalable for supporting very large databases; however, this hardware structure is very sensitive to data skew. Unless the pipelined execution of multiple hash joins includes some dynamic load balancing mechanism, the skew effect can severely deteriorate system performance. In this paper, we derive an execution model of the pipeline segment and a cost model, and develop a simulator for the study. As shown by our simulation over a wide range of parameters, join selectivities and relation sizes degrade system performance more as the degree of data skew grows. However, the proposed method, using a large number of buckets and a tuning technique, offers substantial robustness against a wide range of skew conditions.
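
The two bucket-allocation policies contrasted above can be sketched as follows (hypothetical Python; the bucket counts and node count are illustrative, not from the paper's simulation):

```python
from collections import Counter

def round_robin(buckets, num_nodes):
    """Statically assign bucket i to node i mod num_nodes."""
    return {b: i % num_nodes for i, b in enumerate(buckets)}

def frequency_based(bucket_counts, num_nodes):
    """Greedy placement using tuple frequencies: heaviest bucket first,
    always onto the currently least-loaded node."""
    load = [0] * num_nodes
    assign = {}
    for b, cnt in sorted(bucket_counts.items(), key=lambda kv: -kv[1]):
        node = min(range(num_nodes), key=load.__getitem__)
        assign[b] = node
        load[node] += cnt
    return assign

# Skewed join attribute: bucket 0 holds most of the tuples.
counts = Counter({0: 900, 1: 40, 2: 30, 3: 30})
print(round_robin(sorted(counts), 2))   # ignores skew entirely
print(frequency_based(counts, 2))       # isolates the hot bucket
```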

Parallel Computation For The Edit Distance Based On The Four-Russians' Algorithm (4-러시안 알고리즘 기반의 편집거리 병렬계산)

  • Kim, Young Ho; Jeong, Ju-Hui; Kang, Dae Woong; Sim, Jeong Seop
    • KIPS Transactions on Computer and Communication Systems / v.2 no.2 / pp.67-74 / 2013
  • Approximate string matching problems have been studied in diverse fields. Recently, fast approximate string matching algorithms are being used to reduce the time and cost of next-generation sequencing. To measure the amount of error between two strings, we use a distance function such as the edit distance. Given two strings X (|X| = m) and Y (|Y| = n) over an alphabet $\Sigma$, the edit distance between X and Y is the minimum number of edit operations needed to convert X into Y. The edit distance can be computed using the well-known dynamic programming technique in O(mn) time and space. It can also be computed using the Four-Russians' algorithm, whose preprocessing step runs in $O((3|\Sigma|)^{2t}t^2)$ time and $O((3|\Sigma|)^{2t}t)$ space and whose computation step runs in O(mn/t) time and O(mn) space, where t is the block size. In this paper, we present a parallelized version of the computation step of the Four-Russians' algorithm. Our algorithm computes the edit distance between X and Y in O(m+n) time using m/t threads. We implemented both the sequential version and our parallelized version of the Four-Russians' algorithm using CUDA to compare execution times. When t = 1 and t = 2, our algorithm runs about 10 times and 3 times faster than the sequential algorithm, respectively.
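
For reference, the O(mn) dynamic program that the Four-Russians method and the paper's CUDA version accelerate is the standard textbook recurrence (this sketch is not the paper's code):

```python
def edit_distance(x, y):
    m, n = len(x), len(y)
    # d[i][j] = edit distance between x[:i] and y[:j]
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i                      # delete all of x[:i]
    for j in range(n + 1):
        d[0][j] = j                      # insert all of y[:j]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            sub = 0 if x[i - 1] == y[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + sub)  # match / substitution
    return d[m][n]

assert edit_distance("kitten", "sitting") == 3
```

Cells on the same anti-diagonal depend only on earlier anti-diagonals, which is what makes the block-parallel thread scheme possible.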

Planning Evacuation Routes with Load Balancing in Indoor Building Environments (실내 빌딩 환경에서 부하 균등을 고려한 대피경로 산출)

  • Jang, Minsoo; Lim, Kyungshik
    • KIPS Transactions on Computer and Communication Systems / v.5 no.7 / pp.159-172 / 2016
  • This paper presents a novel algorithm for searching evacuation paths in indoor disaster environments. The proposed method significantly improves the time complexity of finding paths to the evacuation exit by introducing a light-weight Disaster Evacuation Graph (DEG) that keeps the graph for a building small. With the DEG, the method also considers load balancing and the bottleneck capacity of paths to the evacuation exit simultaneously. The algorithm consists of two phases: horizontal tiering (HT) and vertical tiering (VT). The HT phase finds a possibly optimal path from anywhere on a specific floor to the evacuation stairs of that floor. After the HT phases of all floors finish in parallel, the VT phase integrates their results to determine an evacuation path from anywhere on a floor to a safety zone of the building, which could be the entrance or the roof. The tiering scheme also defines the range of the graph to be processed. To test the performance of the method, computing times and evacuation times were compared with existing path-search algorithms. The results show that the proposed method is better than the existing algorithms in terms of computing time and evacuation time, and that it can quickly find evacuation routes for evacuees in a large-scale building.
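
A toy sketch of the two-phase idea under an assumed graph model (the DEG itself is not specified here; the node names and weights are invented): run a shortest-path search per floor to that floor's stairs (HT), then chain the per-floor results down to the building exit (VT).

```python
import heapq

def dijkstra(adj, src):
    """Shortest distances from src over an adjacency dict {u: [(v, w), ...]}."""
    dist = {src: 0.0}
    pq = [(0.0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float("inf")):
            continue
        for v, w in adj.get(u, []):
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(pq, (d + w, v))
    return dist

# HT phase: distance from a room to the stairs on each floor (toy graphs).
floor2 = {"room": [("hall", 3)], "hall": [("stairs2", 2)]}
floor1 = {"stairs2": [("lobby", 4)], "lobby": [("exit", 1)]}
ht = dijkstra(floor2, "room")["stairs2"]        # within floor 2
vt = ht + dijkstra(floor1, "stairs2")["exit"]   # VT: chain down to the exit
print(vt)                                       # 5 + 5 = 10
```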

DNN based Robust Speech Feature Extraction and Signal Noise Removal Method Using Improved Average Prediction LMS Filter for Speech Recognition (음성 인식을 위한 개선된 평균 예측 LMS 필터를 이용한 DNN 기반의 강인한 음성 특징 추출 및 신호 잡음 제거 기법)

  • Oh, SangYeob
    • Journal of Convergence for Information Technology / v.11 no.6 / pp.1-6 / 2021
  • In the field of speech recognition, as DNNs have been adopted, the use of speech recognition is increasing; however, the amount of computation for parallel training is larger than for the conventional GMM, and overfitting occurs when the amount of data is small. To solve this problem, we propose an efficient method for robust speech feature extraction and speech signal noise removal even when the amount of data is small. The feature extraction step efficiently extracts speech energy by applying the frame-energy difference of speech together with the zero-crossing and level-crossing ratios, which are affected by the speech signal. To remove noise, the speech signal is filtered with an average-prediction improved LMS filter, which loses little speech information while preserving the intrinsic characteristics of the speech during speech detection. The improved LMS filter processes noise in the input speech signal by adjusting an activation parameter threshold for the input signal. Comparing the proposed method with the conventional frame-energy method, we confirmed that the error rate at the start point of speech improved by 7% and the error rate at the end point improved by 11%.
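
As a baseline illustration, a conventional LMS adaptive noise canceller -- the starting point that the paper's average-prediction variant improves on -- can be sketched as follows (the step size `mu` and filter length are illustrative choices, not values from the paper):

```python
import numpy as np

def lms_denoise(noisy, reference, mu=0.01, taps=8):
    """Adapt FIR weights so the noise reference predicts the noise in
    'noisy'; the residual e is the enhanced speech estimate."""
    w = np.zeros(taps)
    out = np.zeros_like(noisy)
    for n in range(taps, len(noisy)):
        x = reference[n - taps:n][::-1]   # most recent reference samples
        y = w @ x                         # predicted noise component
        e = noisy[n] - y                  # error = speech estimate
        w += 2 * mu * e * x               # LMS weight update
        out[n] = e
    return out

rng = np.random.default_rng(0)
noise = rng.standard_normal(4000)
speech = np.sin(0.05 * np.arange(4000))   # toy "speech" signal
enhanced = lms_denoise(speech + noise, noise)
```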

Comparison of the wall clock time for extracting remote sensing data in Hierarchical Data Format using Geospatial Data Abstraction Library by operating system and compiler (운영 체제와 컴파일러에 따른 Geospatial Data Abstraction Library의 Hierarchical Data Format 형식 원격 탐사 자료 추출 속도 비교)

  • Yoo, Byoung Hyun; Kim, Kwang Soo; Lee, Jihye
    • Korean Journal of Agricultural and Forest Meteorology / v.21 no.1 / pp.65-73 / 2019
  • MODIS (Moderate Resolution Imaging Spectroradiometer) data in Hierarchical Data Format (HDF) have been processed using the Geospatial Data Abstraction Library (GDAL). Because of the relatively large data size, it is preferable to build and install the data analysis tool for the greatest computing performance, which differs by operating system and by the form of distribution, e.g., source code or binary package. The objective of this study was to examine the performance of GDAL for processing HDF files, to guide the construction of a computer system for remote sensing data analysis. Execution times were compared between the environments under which GDAL was installed. The wall clock time was measured after extracting data for each variable in a MODIS data file using a tool built by linking against GDAL, under combinations of operating system (Ubuntu and openSUSE), compiler (GNU and Intel), and distribution form. The MOD07 product, which contains atmosphere data, was processed for eight 2-D variables and two 3-D variables. GDAL compiled with the Intel compiler under Ubuntu had the shortest computation time. Under openSUSE, GDAL compiled with the GNU and Intel compilers performed best for the 2-D and 3-D variables, respectively. The wall clock time was considerably longer for GDAL compiled with the "--with-hdf4=no" configure option or installed via the RPM package manager under openSUSE. These results indicate that the choice of environment under which GDAL is installed, e.g., operating system or compiler, can have a considerable impact on the performance of a system for processing remote sensing data. Applying parallel computing approaches could further improve data-processing performance for HDF files, which merits further evaluation.
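
A minimal sketch of the measurement itself, using GDAL's Python bindings rather than the compiled tool the paper benchmarks (the input file name is hypothetical; the measured time will of course depend on the OS, compiler, and configure options being compared):

```python
import time
from osgeo import gdal

path = "MOD07_L2.A2019001.0000.061.hdf"       # hypothetical MOD07 input file
ds = gdal.Open(path)

start = time.perf_counter()
for subdataset_name, description in ds.GetSubDatasets():
    sub = gdal.Open(subdataset_name)          # one HDF-EOS variable
    arr = sub.ReadAsArray()                   # extract it to a NumPy array
elapsed = time.perf_counter() - start
print(f"extracted all variables in {elapsed:.3f} s")
```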

Benchmark Results of a Monte Carlo Treatment Planning System (몬테카를로 기반 치료계획시스템의 성능평가)

  • Cho, Byung-Chul
    • Progress in Medical Physics / v.13 no.3 / pp.149-155 / 2002
  • Recent advances in radiation transport algorithms, computer hardware performance, and parallel computing make the clinical use of Monte Carlo based dose calculations possible. To compare the speed and accuracy of dose calculations between different codes, benchmark tests were proposed at the XIIth ICCR (International Conference on the Use of Computers in Radiation Therapy, Heidelberg, Germany, 2000). A Monte Carlo treatment planning system comprising 28 Intel Pentium CPUs was implemented for routine clinical use. The purpose of this study was to evaluate the performance of our system using these benchmark tests. The benchmark procedure comprises three parts: a) speed of photon beam dose calculation inside a given phantom of 30.5 cm × 39.5 cm × 30 cm deep, filled with 5 mm³ voxels, to within 2% statistical uncertainty; b) speed of electron beam dose calculation inside the same phantom; and c) accuracy of photon and electron beam calculation inside a heterogeneous slab phantom, compared with reference EGS4/PRESTA results. In the speed benchmark, it took 5.5 minutes to achieve less than 2% statistical uncertainty for 18 MV photon beams. Though the net calculation for electron beams was an order of magnitude faster than for photon beams, the overall calculation time was similar due to the overhead of maintaining parallel processing. Since our Monte Carlo code is EGSnrc, an improved version of EGS4, the accuracy tests of our system showed, as expected, very good agreement with the reference data. In conclusion, our Monte Carlo treatment planning system gives clinically meaningful results. Though more efficient codes such as MCDOSE and VMC++ have been developed, BEAMnrc, based on the EGSnrc code system, may be used for routine clinical Monte Carlo treatment planning in conjunction with the clustering technique.
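
The 2% stopping criterion driving the speed benchmark can be illustrated with a toy sketch (purely illustrative; this is not a particle transport code, and the surrogate "energy deposit per history" distribution is invented):

```python
import random

def run_until_uncertainty(sample, target=0.02, batch=10_000):
    """Accumulate Monte Carlo histories until the relative standard
    error of the tallied mean drops below the target."""
    n = s = s2 = 0
    while True:
        for _ in range(batch):
            x = sample()
            n += 1
            s += x
            s2 += x * x
        mean = s / n
        var = max(s2 / n - mean * mean, 0.0)
        rel = (var / n) ** 0.5 / mean     # std. error of the mean / mean
        if rel < target:
            return mean, rel, n

mean, rel, n = run_until_uncertainty(lambda: random.expovariate(1.0))
print(f"mean={mean:.3f} rel. uncertainty={rel:.3%} after {n} histories")
```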


Methods for Integration of Documents using Hierarchical Structure based on the Formal Concept Analysis (FCA 기반 계층적 구조를 이용한 문서 통합 기법)

  • Kim, Tae-Hwan; Jeon, Ho-Cheol; Choi, Joong-Min
    • Journal of Intelligence and Information Systems / v.17 no.3 / pp.63-77 / 2011
  • The World Wide Web is a very large distributed digital information space. From its origins in 1991, the web has grown to encompass diverse information resources such as personal home pages, online digital libraries, and virtual museums. Some estimates suggest that the web currently includes over 500 billion pages in the deep web. The ability to search and retrieve information from the web efficiently and effectively is an enabling technology for realizing its full potential. With powerful workstations and parallel processing technology, efficiency is not a bottleneck; in fact, some existing search tools sift through gigabyte-size precompiled web indexes in a fraction of a second. But retrieval effectiveness is a different matter. Current search tools retrieve too many documents, of which only a small fraction are relevant to the user query, and the most relevant documents do not necessarily appear at the top of the query output order. Current search tools also cannot retrieve documents related to a retrieved document from the gigantic mass of documents. The most important problem for many current search systems is to increase the quality of search, i.e., to provide related documents and to keep the number of unrelated documents in the results as low as possible. For this problem, CiteSeer proposed ACI (Autonomous Citation Indexing) of the articles on the World Wide Web. A "citation index" indexes the links between articles that researchers make when they cite other articles. Citation indexes are very useful for a number of purposes, including literature search and analysis of the academic literature. In detail, references contained in academic articles are used to give credit to previous work in the literature and provide a link between the "citing" and "cited" articles; a citation index indexes the citations that an article makes, linking the article with the cited works. Citation indexes were originally designed mainly for information retrieval. The citation links allow navigating the literature in unique ways: papers can be located independent of language and of words in the title, keywords, or document, and a citation index allows navigation backward in time (the list of cited articles) and forward in time (which subsequent articles cite the current article?). But CiteSeer cannot index links between articles that researchers do not make, because it indexes only the links researchers make when they cite other articles; for the same reason, CiteSeer does not scale easily. All of these problems motivate the design of a more effective search system. This paper presents a method that extracts a subject and predicate from each sentence in a document. Each document is converted into a tabular form in which the extracted predicates are checked against possible subjects and objects. We build a hierarchical graph of a document using this table and then integrate the graphs of the documents. Using the graph of the entire document collection, we calculate the area of each document relative to the integrated documents and mark relations among the documents by comparing these areas. We also propose a method for the structural integration of documents that retrieves documents from the graph, making it easier for the user to find information. We compared the performance of the proposed approaches with the Lucene search engine using standard ranking formulas. As a result, the F-measure is about 60%, an improvement of about 15%.
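
A toy sketch of the graph-building and integration step under an assumed triple representation (hypothetical code, not the paper's FCA implementation; the area-based relation measure is not modeled here):

```python
import networkx as nx

def doc_graph(doc_id, triples):
    """Build one document's graph from (subject, predicate, object) tuples."""
    g = nx.DiGraph()
    for subj, pred, obj in triples:
        g.add_edge(subj, obj, predicate=pred, doc=doc_id)
    return g

g1 = doc_graph("d1", [("search engine", "retrieves", "documents"),
                      ("citation index", "links", "articles")])
g2 = doc_graph("d2", [("citation index", "supports", "literature search")])

merged = nx.compose(g1, g2)   # integrate the document graphs
shared = set(g1) & set(g2)    # nodes through which d1 and d2 are related
print(shared)                 # {'citation index'}
```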