Search | Korea Science

Analysis of GPU Performance and Memory Efficiency according to Task Processing Units (작업 처리 단위 변화에 따른 GPU 성능과 메모리 접근 시간의 관계 분석)

Son, Dong Oh;Sim, Gyu Yeon;Kim, Cheol Hong
- Smart Media Journal / v.4 no.4 / pp.56-63 / 2015
Modern GPUs can execute massively parallel computations by exploiting many GPU cores. The GPGPU architecture, one approach to exploiting the outstanding computational resources of the GPU, effectively executes general-purpose applications as well as graphics applications. In this paper, we investigate how the number of CTAs (Cooperative Thread Arrays) on an SM (Streaming Multiprocessor) affects memory efficiency and performance, since analyzing this relationship offers insight to researchers working to improve GPU performance. Our simulation results show that, for most benchmarks, increasing the number of CTAs on an SM improves performance. Some benchmarks, however, show no performance improvement. This is because the number of CTAs generated from the same kernel is small or the number of CTAs executed simultaneously is insufficient. To classify the performance results more precisely, we also analyze the relationship between performance and memory stalls, DRAM stalls due to interconnect congestion, and pipeline stalls at the memory stage. We expect our analysis to help studies that aim to improve parallelism and memory efficiency on GPGPU architectures.

Synchronized Sampling Structure applied HW/SW platform for LAN-based Digital Substation Protection (LAN 기반 디지털 변전소 보호를 위한 동기 샘플링 구조적용 HW/SW 플랫폼 기술)

Son, Kyou Jung;Nam, Kyung-Deok;An, Gi Sung;Chang, Tae Gyu
- Journal of IKEEE / v.24 no.1 / pp.178-185 / 2020
This paper proposes a HW/SW platform that applies a synchronized sampling structure based on precise time synchronization for LAN-based protection of future digital substations. The integrated software of the proposed platform includes the IEC 61850 protocol, the IEEE 1588 Precision Time Protocol, and the synchronized sampling structure. The proposed platform is expected to provide a basis for applying future protection and control methods based on distributed sensing data by providing synchronized measurement among IEDs. The proposed HW/SW platform was implemented on a TMDXIDK572 multi-core/multi-processor evaluation module, and its time synchronization performance and synchronized sampling function were confirmed through performance tests.
https://doi.org/10.7471/ikeee.2020.24.1.178

Measuring ultrasonic TOF using Zynq baremetal Multiprocessing (Zynq 기반 baremetal 멀티프로세싱에 의한 초음파 TOF 측정)

Kang, Moon ho
- Journal of the Institute of Electronics and Information Engineers / v.54 no.6 / pp.93-99 / 2017
In this research, the TOF (time of flight) of an ultrasonic signal is measured using Xilinx's Zynq SoC (system on chip). The TOF is calculated from the difference between the times that the RF (radio frequency) and ultrasonic signals take to travel a given distance, and the travelling distance is then obtained by multiplying the TOF by the speed of ultrasound in air. For this purpose, an ultrasonic pulse is generated from the Zynq's internal ADC, an FIR (finite impulse response) filter, and a Kalman filter, and an RF reference pulse is generated from an RF interface. Based on baremetal multiprocessing, the Kalman filter and the RF interface are programmed in C on the Zynq's dual processor cores, while the other components are implemented on the Zynq's FPGA. This HW/SW co-design achieved both lower resource utilization and a much shorter design period than the HW-only design. As the design tool, the Vivado IDE (integrated design environment) is used to design the whole signal processing system as hierarchical block diagrams.
https://doi.org/10.5573/ieie.2017.54.6.93
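
The distance calculation described in the abstract above can be sketched as follows. This is a minimal illustration and not the paper's implementation: the function names, the arrival-time inputs, and the speed-of-sound value (343 m/s, roughly dry air at 20 °C) are assumptions made for the example.

```python
# Assumed value: approximate speed of sound in dry air at about 20 degrees C.
SPEED_OF_SOUND_M_PER_S = 343.0


def tof_seconds(rf_arrival_s, ultrasonic_arrival_s):
    """Return the time of flight as the gap between the two arrival times.

    The RF pulse is treated as the (near-instant) reference, so the TOF is
    how much later the ultrasonic pulse arrives.
    """
    tof = ultrasonic_arrival_s - rf_arrival_s
    if tof < 0:
        raise ValueError("ultrasonic pulse cannot arrive before the RF reference")
    return tof


def distance_m(tof_s, speed_m_per_s=SPEED_OF_SOUND_M_PER_S):
    """Travelling distance = TOF multiplied by the speed of ultrasound in air."""
    return tof_s * speed_m_per_s


if __name__ == "__main__":
    tof = tof_seconds(rf_arrival_s=0.0, ultrasonic_arrival_s=0.01)
    print(f"TOF = {tof} s, distance = {distance_m(tof)} m")
```

In practice the speed of sound varies with temperature, so a real measurement would likely calibrate `speed_m_per_s` rather than use a fixed constant.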

IPC-based Dynamic SM management on GPGPU for Executing AES Algorithm

Son, Dong Oh;Choi, Hong Jun;Kim, Cheol Hong
- Journal of the Korea Society of Computer and Information / v.25 no.2 / pp.11-19 / 2020
Modern GPUs can execute general-purpose computation on the graphics processing unit and provide high performance by exploiting the many cores on the GPU. Running the AES algorithm efficiently requires parallel computational resources. However, the computational resources of CPU architectures are insufficient for cryptographic algorithms such as AES, whereas GPU architectures have massively parallel computational resources. Therefore, this paper reduces the time to execute AES by employing the parallel computational resources of the GPGPU. Unfortunately, AES cannot utilize the computational resources of the GPGPU effectively since it is not well suited to the GPGPU architecture. In this paper, an IPC-based dynamic SM management technique is proposed to execute AES efficiently on GPGPU. IPC-based dynamic SM management can increase and decrease the number of active SMs at run time by using IPC. According to simulation results, the proposed technique improves performance by increasing resource utilization compared to the baseline GPGPU architecture. The results show that AES performance improves by 41.2% on average.
https://doi.org/10.9708/jksci.2020.25.02.011

Implementation of An Embedded Communication Translator for Remote Control (원격 제어를 위한 임베디드 통신 변환기 구현)

Lee Byung-Kwon;Chon Young-Suk;Jeon Joong-Nam
- The KIPS Transactions:PartD / v.13D no.3 s.106 / pp.445-454 / 2006
Most industrial measuring instruments are equipped only with serial communication devices. To connect these instruments to the internet, we implement an embedded translator. The device's hardware consists of one WAN port, two LAN ports, and two UARTs, and it functions as a communication translator between serial and internet communication. It also provides a web-based monitoring function that is absent from existing serial-to-ethernet converters. The hardware is implemented using the KS8695 network processor, which has an ARM922T as its processor core. We installed the boa web server and used the CGI function for internet-based remote control, added an IP sharing function that allows a network with private IP addresses to access the internet, and developed a serial-to-ethernet translation program. Finally, we show an application example in which the developed translator remotely monitors a solar energy production system.
https://doi.org/10.3745/KIPSTD.2006.13D.3.445

Vehicle ECU Design Incorporating LIN/CAN Vehicle Interface with Kalman Filter Function (LIN/CAN 차량용 인터페이스와 칼만 필터 기능을 통합한 차량용 ECU 설계)

Jeong, Seonwoo;Kim, Yongbin;Lee, Seongsoo
- Journal of IKEEE / v.25 no.4 / pp.762-765 / 2021
In this paper, an automotive ECU (electronic control unit) with a Kalman filter accelerator is designed and implemented. RISC-V is used as the processor core. An accelerator for Kalman filter matrix operations, a CAN (controller area network) controller for the in-vehicle network, and a LIN (local interconnect network) controller are designed and embedded. The Kalman filter operation consists of a time update process and a measurement update process. The current state variable and its error covariance are estimated in the time update process. The final values are corrected using the input measurement data and the Kalman gain in the measurement update process. Software implementations usually use floating-point multiplication, but this paper uses a fixed-point multiplier, chosen with accuracy analysis in mind, to reduce hardware area. In a 28nm silicon process, its operating frequency, area, and gate count are 100MHz, 0.37mm², and 760k gates, respectively.
https://doi.org/10.7471/ikeee.2021.25.4.762
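
The two-phase structure named in the abstract above (time update, then measurement update) can be illustrated with a scalar Kalman filter. This is only a sketch under simplifying assumptions: it uses a one-dimensional constant-state model and floating-point arithmetic, whereas the paper describes a matrix accelerator with fixed-point multiplication. All function and parameter names (`time_update`, `measurement_update`, `q`, `r`) are hypothetical.

```python
def time_update(x, p, q):
    """Time update: estimate the current state and its error covariance.

    Assumes a constant-state model, so the state carries over unchanged
    and the covariance grows by the process noise q.
    """
    x_pred = x
    p_pred = p + q
    return x_pred, p_pred


def measurement_update(x_pred, p_pred, z, r):
    """Measurement update: correct the estimate with measurement z.

    r is the measurement noise variance; k is the Kalman gain.
    """
    k = p_pred / (p_pred + r)
    x = x_pred + k * (z - x_pred)
    p = (1.0 - k) * p_pred
    return x, p, k


def run_filter(measurements, x0=0.0, p0=1.0, q=1e-4, r=0.1):
    """Run both phases once per measurement and collect the estimates."""
    x, p = x0, p0
    estimates = []
    for z in measurements:
        x_pred, p_pred = time_update(x, p, q)
        x, p, _ = measurement_update(x_pred, p_pred, z, r)
        estimates.append(x)
    return estimates


if __name__ == "__main__":
    print(run_filter([1.1, 0.9, 1.05, 0.98]))
```

A fixed-point version would replace the floating-point multiplications and the division with scaled integer operations, which is where the accuracy analysis mentioned in the abstract would come into play.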

LCD Module Initialization and Panel Display for the Virtual Screen of LN2440SBC Embedded Systems (LN2440SBC 임베디드 시스템의 가상 스크린을 위한 LCD 모듈 초기화 및 패널 디스플레이)

Oh, Sam-Kweon;Park, Geun-Duk;Kim, Byoung-Kuk
- Journal of Advanced Navigation Technology / v.14 no.3 / pp.452-458 / 2010
In an embedded system with restricted computing resources such as system power and CPU, the overhead of displaying data on the screen may significantly affect system performance. This paper describes an initialization method for LCD-driving components such as an ARM core, an LCD controller, and an SPI (serial peripheral interface). It also introduces a pixel display function and a panel display method that uses a virtual screen to reduce the display overhead for an LN2440SBC system with an ARM9-based S3C2440A microprocessor. A virtual screen is a region of memory allocated to be much larger than what is needed to display one image at a time. A specific region of the virtual screen is displayed by assigning it as the view-port region. Such a display is useful in an embedded system when concurrently running tasks produce and display their respective results on the screen; it is especially useful when each task, on its turn, modifies its result only partially rather than entirely before it is displayed. If the tasks running on such a system divide the virtual screen among themselves and use their regions efficiently, the display overhead can be minimized. To compare performance with and without the virtual screen, two different images are displayed in turn and the time consumed to display them is measured. The result shows that display with the virtual screen is about 5 times faster than display without it.
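
The virtual screen and view-port idea in the abstract above can be modelled in a few lines. This is a conceptual sketch only: the names `make_virtual_screen` and `view_port` are hypothetical, and the model copies pixels out of a Python list, whereas an LCD controller would typically be pointed at an offset inside the larger buffer without copying.

```python
def make_virtual_screen(width, height, fill=0):
    """Allocate a pixel buffer larger than the physical panel."""
    return [[fill] * width for _ in range(height)]


def view_port(screen, x, y, w, h):
    """Return the w-by-h region whose top-left corner is at (x, y).

    Switching the displayed image only means choosing a different (x, y);
    the pixel data already in the virtual screen is not redrawn.
    """
    if w <= 0 or h <= 0:
        raise ValueError("view-port size must be positive")
    if x < 0 or y < 0 or y + h > len(screen) or x + w > len(screen[0]):
        raise ValueError("view-port lies outside the virtual screen")
    return [row[x:x + w] for row in screen[y:y + h]]


if __name__ == "__main__":
    # A virtual screen two panels wide: image A on the left, image B on the right.
    panel_w, panel_h = 4, 2
    screen = make_virtual_screen(2 * panel_w, panel_h)
    for row in screen:
        row[panel_w:] = [1] * panel_w  # draw image B once

    print(view_port(screen, 0, 0, panel_w, panel_h))        # shows image A
    print(view_port(screen, panel_w, 0, panel_w, panel_h))  # shows image B
```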

Table-Based Fault Tolerant Routing Method for Voltage-Frequency-Island NoC (Voltage-Frequency-Island NoC를 위한 테이블 기반의 고장 감내 라우팅 기법)

Yoon, Sung Jae;Li, Chang-Lin;Kim, Yong Seok;Han, Tae Hee
- Journal of the Institute of Electronics and Information Engineers / v.53 no.8 / pp.66-75 / 2016
Due to aggressive scaling of device sizes and reduced noise margins, physical defects caused by aging and process variation are continuously increasing. Additionally, with the scaling limitations of metal wires and increasing communication volume, fault-tolerant methods for manycore network-on-chip (NoC) have been actively researched. However, few studies have investigated reliability in NoCs with a voltage-frequency-island (VFI) regime. In this paper, we propose a table-based routing technique that can maintain communication even if link failures occur in the VFI NoC. The output port is selected alternatively between the best routing path and the detour routing path in order to improve reliability with minimal hardware cost. Experimental results show that the proposed method achieves full coverage within 1% faulty links. Compared to $d^2$-LBDR, which also provides a routing method that searches for a detour path in real time, the proposed method produces, on average, 0.8% savings in execution time and 15.9% savings in energy consumption.
https://doi.org/10.5573/ieie.2016.53.8.066

TMC (Tracker Motion Controller) Using Sensors and GPS Implementation and Performance Analysis (센서와 GPS를 이용한 TMC의 구현 및 성능 분석)

Ko, Jae-Hong
- Journal of the Korea Academia-Industrial cooperation Society / v.14 no.2 / pp.828-834 / 2013
In this paper, TMC (Tracker Motion Controller) as one of the many research methods for condensing efficiency improvements can be condensed into efficient solar system configuration to improve the power generation efficiency of the castle with Concentrated solar silicon and photovoltaic systems (CPV) experiments using PV systems. Microprocessor used on the solar system, tracing the development of solar altitude and latitude of each is calculated in real time. Also accept the value from the sensor, motor control and communication with the central control system by calculating the value of the current position of the sun, there is a growing burden on the applicability. Through the way the program is appropriate for solar power systems and sensors hybrid-type algorithm was implemented in the ARM core with built-in TMC, Concentrated CPV system compared to the existing PV systems, through the implementation of the TMC in the country's power generation efficiency compared and analyzed. Sensor method using existing experimental results Concentrated solar power systems to communicate the value of GPS location tracking method hybrid solar horizons in the coordinate system of the sun's azimuth and elevation angles calculated by the program in the calculations of astronomy through experimental results look clear day at high solar irradiation were shown to have a large difference. Stopped after a certain period of time, the sun appears in the blind spot of the sensor, the sensor error that can occur from climate change, however, do not have a cloudy and clear day solar radiation sensor does not keep track of the position of the sun, rather than the sensor of excellence could be found.
It is expected that research is constantly needed for the system with ongoing research for development of solar cell efficiency increases to reduce the production cost of power generation, high efficiency condensing type according to the change of climate with the optimal development of the ability TMC.
https://doi.org/10.5762/KAIS.2013.14.2.828

A Study on the Utilization and Control Method of Hybrid Switching Tap Based Automatic Voltage Regulator on Smart Grid (스마트그리드의 탭 전환 자동 전압 조정기의 다중 스위칭 제어 방법 및 활용 방안에 관한 연구)

Park, Gwang-Yun;Kim, Jung-Ryul;Kim, Byung-Gi
- Journal of the Korea Society of Computer and Information / v.17 no.12 / pp.31-39 / 2012
In this paper, we propose a microprocessor-based automatic voltage regulator (AVR) to reduce consumers' electric energy consumption and to help control peak power demand. The Hybrid Switching Automatic Voltage Regulator (HS-AVR) consists of a toroidal core, several tap control switches, and display and command control parts. The coil forms an autotransformer that has a serial main winding and four parallel auxiliary windings. It controls the output voltage by changing the combination of the coils and the switches. Relays are adopted as the link switches of the coils to minimize loss. To make the connecting and disconnecting times accurate, the relays in the circuit have parallel TRIACs. A software phase-locked loop (PLL) is used to synchronize the switch timings to the voltage waveform. The software PLL reports the zero-crossing and positive/negative peak timings of the input voltage. Traditional voltage transformers and AVRs have the disadvantage of requiring a large capacity to accommodate the maximum inrush current and avoid switch contact damage. In contrast, we propose an AVR with reduced size and increased efficiency that suits every purpose in the smart grid.
https://doi.org/10.9708/jksci/2012.17.12.031
