• Title/Summary/Keyword: General processor

Search Result 298, Processing Time 0.021 seconds

Design and Implementation of Initial OpenSHMEM Based on PCI Express (PCI Express 기반 OpenSHMEM 초기 설계 및 구현)

  • Joo, Young-Woong;Choi, Min
    • KIPS Transactions on Computer and Communication Systems
    • /
    • v.6 no.3
    • /
    • pp.105-112
    • /
    • 2017
  • PCI Express is a bus technology that connects the processor and the peripheral I/O devices that widely used as an industry standard because it has the characteristics of high-speed, low power. In addition, PCI Express is system interconnect technology such as Ethernet and Infiniband used in high-performance computing and computer cluster. PGAS(partitioned global address space) programming model is often used to implement the one-sided RDMA(remote direct memory access) from multi-host systems, such as computer clusters. In this paper, we design and implement a OpenSHMEM API based on PCI Express maintaining the existing features of OpenSHMEM to implement RDMA based on PCI Express. We perform experiment with implemented OpenSHMEM API through a matrix multiplication example from system which PCs connected with NTB(non-transparent bridge) technology of PCI Express. The PCI Express interconnection network is currently very expensive and is not yet widely available to the general public. Nevertheless, we actually implemented and evaluated a PCI Express based interconnection network on the RDK evaluation board. In addition, we have implemented the OpenSHMEM software stack, which is of great interest recently.

8.1 Gbps High-Throughput and Multi-Mode QC-LDPC Decoder based on Fully Parallel Structure (전 병렬구조 기반 8.1 Gbps 고속 및 다중 모드 QC-LDPC 복호기)

  • Jung, Yongmin;Jung, Yunho;Lee, Seongjoo;Kim, Jaeseok
    • Journal of the Institute of Electronics and Information Engineers
    • /
    • v.50 no.11
    • /
    • pp.78-89
    • /
    • 2013
  • This paper proposes a high-throughput and multi-mode quasi-cyclic (QC) low-density parity-check (LDPC) decoder based on a fully parallel structure. The proposed QC-LDPC decoder employs the fully parallel structure to provide very high throughput. The high interconnection complexity, which is the general problem in the fully parallel structure, is solved by using a broadcasting-based sum-product algorithm and proposing a low-complexity cyclic shift network. The high complexity problem, which is caused by using a large amount of check node processors and variable node processors, is solved by proposing a combined check and variable node processor (CCVP). The proposed QC-LDPC decoder can support the multi-mode decoding by proposing a routing-based interconnection network, the flexible CCVP and the flexible cyclic shift network. The proposed QC-LDPC decoder is operated at 100 MHz clock frequency. The proposed QC-LDPC decoder supports multi-mode decoding and provides 8.1 Gbps throughput for a (1944, 1620) QC-LDPC code.

Efficient RMESH Algorithms for Computing the Intersection and the Union of Two Visibility Polygons (두 가시성 다각형의 교집합과 합집합을 구하는 효율적인 RMESH 알고리즘)

  • Kim, Soo-Hwan
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.20 no.2
    • /
    • pp.401-407
    • /
    • 2016
  • We can consider the following problems for two given points p and q in a simple polygon P. (1) Compute the set of points of P which are visible from both p and q. (2) Compute the set of points of P which are visible from either p or q. They are corresponding to the problems which are to compute the intersection and the union of two visibility polygons. In this paper, we consider algorithms for solving these problems on a reconfigurable mesh(in short, RMESH). The algorithm in [1] can compute the intersection of two general polygons in constant time on an RMESH with size O($n^3$), where n is the total number of vertices of two polygons. In this paper, we construct the planar subdivision graph in constant time on an RMESH with size O($n^2$) using the properties of the visibility polygon for preprocessing. Then we present O($log^2n$) time algorithms for computing the union as well as the intersection of two visibility polygons, which improve the processor-time product from O($n^3$) to O($n^2log^2n$).

A Study on the Basic Physical Properties of Water-Soluble Rubber Asphalt-based Coating Waterproofing for Exterior Application (수용성 고무 아스팔트계 도막방수재의 실외 적용을 위한 기본 물성 연구)

  • Kang, Hyo-Jin;Youn, Sung-Hwan;Oh, Sang-Keun
    • Journal of the Korean Recycled Construction Resources Institute
    • /
    • v.8 no.4
    • /
    • pp.553-561
    • /
    • 2020
  • Water-soluble rubber asphalt-based waterproofing material, which is one of the waterproofing materials for building structures, is mainly used indoors (toilet, kitchen, balcony, etc.). In general, asphalt-based materials are used for non-exposed installation, rather than as exposed type as they do not deviate from their usual basic black pigmentation, and water-soluble rubber asphalt-based coating waterproofing materials are basically limited to indoors because of their low physical properties. Accordingly, in order to improve the tensile and elongation properties, a silane coupling agent, an inorganic filler, and a processor oil w ere added to improve the physical properties, and accordingly, the basic physical properties of the outdoor coating waterproofing material quality standard were analyzed. As a result, the water-soluble rubber asphalt coating waterproofing material compared with the exposure quality standard showed a result that exceeded the basic physical property quality standard of silicone rubber in all items under test evaluation, but the tensile strength and tear strength of the first class of urethane rubber were chloroprene. It was found that the performance compared to the quality standards of rubber-based tear strength was about 34.2% to about 40.8%.

Development of RAW Data Storage Equipment for Operation Algorithm research of the Millimeter Wave Tracking Radar (밀리미터파 추적레이더 운용 알고리듬 연구를 위한 RAW 데이터 저장 장비 개발)

  • Choi, Jinkyu;Na, Kyoung-Il;Shin, Youngcheol;Hong, Soonil;Kim, Younjin;Kim, Hongrak;Joo, Jihan;Kim, Sosu
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.22 no.3
    • /
    • pp.57-62
    • /
    • 2022
  • Recently, the tracking radar continues research to develop a new operation algorithm that can acquire and track a target in various environments. In general, modeling similar to the real target and environment is used to develop a new operation algorithm, but there is a limit to modeling the real environment. In this paper, a RAW data storage device was developed to efficiently develop a new operation algorithm required for the tracking radar using millimeter wave to acquire and track the target. The RAW data storage equipment is designed so that the signal processing device of the tracking radar using millimeter wave can save the RAW data output from 8 channels to OOOMSPS. RAW data storage equipment consists of data acquisition equipment and data storage equipment. The data acquisition equipment was implemented using a commercial Xilinx KCU 105 Evaluation KIT capable of high-speed data communication interface, and the data storage equipment was implemented by applying a computer compatible with the commercial Xilinx KCU 105 Evaluation KIT. In this paper, the performance of the implemented RAW data storage equipment was verified through repeated interlocking tests with the signal processing device of the millimeter wave tracking radar.

Efficient Implementation of NIST LWC SPARKLE on 64-Bit ARMv8 (ARMv8 환경에서 NIST LWC SPARKLE 효율적 구현)

  • Hanbeom Shin;Gyusang Kim;Myeonghoon Lee;Insung Kim;Sunyeop Kim;Donggeun Kwon;Seonggyeom Kim;Seogchung Seo;Seokhie Hong
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.33 no.3
    • /
    • pp.401-410
    • /
    • 2023
  • In this paper, we propose optimization methods for implementing SPARKLE, one of the NIST LWC finalists, on a 64-bit ARMv8 processor. The proposed methods consist of two approaches: an implementation using ARM A64 instructions and another using NEON ASIMD instructions. The A64-based implementation is optimized by performing register scheduling to efficiently utilize the available registers on the ARMv8 architecture. By utilizing the optimized A64-based implementation, we can achieve speeds that are 1.69 to 1.81 times faster than the C reference implementation on a Raspberry Pi 4B. The ASIMD-based implementation, on the other hand, optimizes data by parallelizing the ARX-boxes to perform more than three of them concurrently through a single vector instruction. While the general speed of the optimized ASIMD-based implementation is lower than that of the A64-based implementation, it only slows down by 1.2 times compared to the 2.1 times slowdown observed in the A64-based implementation as the block size increases from SPARKLE256 to SPARKLE512. This is an advantage of the ASIMD-based implementation. Therefore, the ASIMD-based implementation is more efficient for SPARKLE variant block cipher or permutation designs with larger block sizes than the original SPARKLE, making it a useful resource.

Sound Engine for Korean Traditional Instruments Using General Purpose Digital Signal Processor (범용 디지털 신호처리기를 이용한 국악기 사운드 엔진 개발)

  • Kang, Myeong-Su;Cho, Sang-Jin;Kwon, Sun-Deok;Chong, Ui-Pil
    • The Journal of the Acoustical Society of Korea
    • /
    • v.28 no.3
    • /
    • pp.229-238
    • /
    • 2009
  • This paper describes a sound engine of Korean traditional instruments, which are the Gayageum and Taepyeongso, by using a TMS320F2812. The Gayageum and Taepyeongso models based on commuted waveguide synthesis (CWS) are required to synthesize each sound. There is an instrument selection button to choose one of instruments in the proposed sound engine, and thus a corresponding sound is produced by the relative model at every certain time. Every synthesized sound sample is transmitted to a DAC (TLV5638) using SPI communication, and it is played through a speaker via an audio interface. The length of the delay line determines a fundamental frequency of a desired sound. In order to determine the length of the delay line, it is needed that the time for synthesizing a sound sample should be checked by using a GPIO. It takes $28.6{\mu}s$ for the Gayageum and $21{\mu}s$ for the Taepyeongso, respectively. It happens that each sound sample is synthesized and transferred to the DAC in an interrupt service routine (ISR) of the proposed sound engine. A timer of the TMS320F2812 has four events for generating interrupts. In this paper, the interrupt is happened by using the period matching event of it, and the ISR is called whenever the interrupt happens, $60{\mu}s$. Compared to original sounds with their spectra, the results are good enough to represent timbres of instruments except 'Mu, Hwang, Tae, Joong' of the Taepyeongso. Moreover, only one sound is produced when playing the Taepyeongso and it takes $21{\mu}s$ for the real-time playing. In the case of the Gayageum, players usually use their two fingers (thumb and middle finger or thumb and index finger), so it takes $57.2{\mu}s$ for the real-time playing.

A Study on the Intelligent Service Selection Reasoning for Enhanced User Satisfaction : Appliance to Cloud Computing Service (사용자 만족도 향상을 위한 지능형 서비스 선정 방안에 관한 연구 : 클라우드 컴퓨팅 서비스에의 적용)

  • Shin, Dong Cheon
    • Journal of Intelligence and Information Systems
    • /
    • v.18 no.3
    • /
    • pp.35-51
    • /
    • 2012
  • Cloud computing is internet-based computing where computing resources are offered over the Internet as scalable and on-demand services. In particular, in case a number of various cloud services emerge in accordance with development of internet and mobile technology, to select and provide services with which service users satisfy is one of the important issues. Most of previous works show the limitation in the degree of user satisfaction because they are based on so called concept similarity in relation to user requirements or are lack of versatility of user preferences. This paper presents cloud service selection reasoning which can be applied to the general cloud service environments including a variety of computing resource services, not limited to web services. In relation to the service environments, there are two kinds of services: atomic service and composite service. An atomic service consists of service attributes which represent the characteristics of service such as functionality, performance, or specification. A composite service can be created by composition of atomic services and other composite services. Therefore, a composite service inherits attributes of component services. On the other hand, the main participants in providing with cloud services are service users, service suppliers, and service operators. Service suppliers can register services autonomously or in accordance with the strategic collaboration with service operators. Service users submit request queries including service name and requirements to the service management system. The service management system consists of a query processor for processing user queries, a registration manager for service registration, and a selection engine for service selection reasoning. In order to enhance the degree of user satisfaction, our reasoning stands on basis of the degree of conformance to user requirements of service attributes in terms of functionality, performance, and specification of service attributes, instead of concept similarity as in ontology-based reasoning. For this we introduce so called a service attribute graph (SAG) which is generated by considering the inclusion relationship among instances of a service attribute from several perspectives like functionality, performance, and specification. Hence, SAG is a directed graph which shows the inclusion relationships among attribute instances. Since the degree of conformance is very close to the inclusion relationship, we can say the acceptability of services depends on the closeness of inclusion relationship among corresponding attribute instances. That is, the high closeness implies the high acceptability because the degree of closeness reflects the degree of conformance among attributes instances. The degree of closeness is proportional to the path length between two vertex in SAG. The shorter path length means more close inclusion relationship than longer path length, which implies the higher degree of conformance. In addition to acceptability, in this paper, other user preferences such as priority for attributes and mandatary options are reflected for the variety of user requirements. Furthermore, to consider various types of attribute like character, number, and boolean also helps to support the variety of user requirements. Finally, according to service value to price cloud services are rated and recommended to users. One of the significances of this paper is the first try to present a graph-based selection reasoning unlike other works, while considering various user preferences in relation with service attributes.