From WiFi to WiMAX: Efficient GPU-based Parameterized Transceiver across Different OFDM Protocols

  • Li, Rongchun (National Laboratory for Parallel and Distributed Processing, National University of Defense Technology)
  • Dou, Yong (National Laboratory for Parallel and Distributed Processing, National University of Defense Technology)
  • Zhou, Jie (National Laboratory for Parallel and Distributed Processing, National University of Defense Technology)
  • Li, Baofeng (National Laboratory for Parallel and Distributed Processing, National University of Defense Technology)
  • Xu, Jinbo (National Laboratory for Parallel and Distributed Processing, National University of Defense Technology)
  • Received: 2013.05.24
  • Reviewed: 2013.08.11
  • Published: 2013.08.31

Abstract

Orthogonal frequency-division multiplexing (OFDM) has become a popular modulation scheme for wireless protocols because of its spectral efficiency and robustness against multipath interference. Although the components of various OFDM protocols are functionally similar, they remain distinct because each protocol is tailored to the characteristics of its operating environment. Recently, graphics processing units (GPUs) have been used to accelerate physical layer (PHY) signal processing because of their great computational power, high development efficiency, and flexibility. In this paper, we describe the implementation of parameterized baseband modules on GPUs for two different OFDM protocols, namely, 802.11a and 802.16. First, we introduce the modules in the modulator/demodulator parts of the transmitter and receiver and analyze the computational complexity of each module. We then describe how the GPU-based baseband modules of the two protocols are integrated using the parameterized method. We also present the GPU-based implementations and explain how the baseband processing is accelerated to achieve real-time throughput. Finally, the performance of each signal processing module is evaluated and analyzed. The experiments show that the GPU-based 802.11a and 802.16 PHYs meet the real-time requirement and demonstrate good bit error ratio (BER) performance. The performance comparison indicates that our GPU-based modules offer better flexibility and throughput than existing implementations.
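To make the idea of a parameterized module concrete, the following is a minimal CPU-only sketch in plain Python, not the authors' GPU implementation. It shows one OFDM modulation routine driven by a per-protocol parameter record. The names `OfdmParams`, `idft`, and `ofdm_modulate` are hypothetical. The subcarrier counts are the commonly published values for 802.11a (64-point FFT) and the 802.16 OFDM PHY (256-point FFT); the 1/4 cyclic-prefix ratio chosen for 802.16 is an assumption, since that standard allows several ratios. Subcarrier mapping, coding, and interleaving are left out.

```python
import cmath
from dataclasses import dataclass


@dataclass(frozen=True)
class OfdmParams:
    """Per-protocol constants that parameterize a shared OFDM modulator."""
    name: str
    n_fft: int    # IFFT/FFT size
    n_data: int   # data subcarriers per OFDM symbol
    n_pilot: int  # pilot subcarriers per OFDM symbol
    cp_len: int   # cyclic-prefix length in samples

    @property
    def n_null(self) -> int:
        # Guard-band and DC subcarriers that carry no energy.
        return self.n_fft - self.n_data - self.n_pilot


# Assumed parameter sets (802.16 cyclic prefix fixed at 1/4 for illustration).
WIFI_80211A = OfdmParams("802.11a", n_fft=64, n_data=48, n_pilot=4, cp_len=16)
WIMAX_80216 = OfdmParams("802.16 OFDM", n_fft=256, n_data=192, n_pilot=8, cp_len=64)


def idft(freq):
    """Naive O(N^2) inverse DFT; a GPU version would use an FFT library instead."""
    n = len(freq)
    return [
        sum(freq[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def ofdm_modulate(freq_bins, params):
    """Turn one frequency-domain symbol into time samples with a cyclic prefix."""
    if len(freq_bins) != params.n_fft:
        raise ValueError(f"{params.name} expects {params.n_fft} frequency bins")
    time = idft(freq_bins)
    # The cyclic prefix is a copy of the last cp_len time-domain samples.
    return time[-params.cp_len:] + time
```

The same `ofdm_modulate` routine serves both protocols; only the `OfdmParams` record changes. Under these assumed parameters, one modulated symbol is 80 samples long for 802.11a and 320 samples long for 802.16.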

Cited by

  1. GPU-Accelerated Single Image Depth Estimation with Color-Filtered Aperture vol.8, pp.3, 2013, https://doi.org/10.3837/tiis.2014.03.020