References
- B.-G. Nam and H.-J. Yoo, "An Embedded Stream Processor Core Based on Logarithmic Arithmetic for a Low-Power 3-D Graphics SoC", IEEE J. Solid-State Circuits, Vol. 44, No. 5, pp. 1554-1570, 2009. https://doi.org/10.1109/JSSC.2009.2016698
- J. Nickolls and W. J. Dally, "The GPU Computing Era", IEEE Micro, Vol. 30, No. 2, pp. 56-69, 2010. https://doi.org/10.1109/MM.2010.41
- J. E. Stone, D. Gohara, and G. Shi, "OpenCL: A Parallel Programming Standard for Heterogeneous Computing Systems", Computing in Science & Engineering, Vol. 12, No. 3, pp. 66-73, 2010.
- W. Sheng, P. Szymanski, R. Leupers, and G. Ascheid, "Software Migration for Parallel Execution on a Multicore Tablet: A Case Study", The IEEE 7th International Symposium on Embedded Multicore SoC, pp. 26-28, Sep. 2012.
- R. Ubal, J. Sahuquillo, S. Petit, and P. Lopez, "Multi2Sim: a Simulation Framework for CPU-GPU Computing", in Proc. of the 21st International Conference on Parallel Architectures and Compilation Techniques, pp. 335-344, Sep. 2012.
- V. Zakharenko, T. Aamodt, and A. Moshovos, "Characterizing the performance benefits of fused CPU/GPU systems using FusionSim", in Proc. DATE, pp. 685-688, Mar. 2013.
- H. Wang, V. Sathish, R. Singh, M. Schulte, and N. Kim, "Workload and Power Budget Partitioning for Single-Chip Heterogeneous Processors", IEEE/ACM Int. Conf. on Parallel Architecture and Compilation Techniques (PACT), Sep. 2012.
- S. Raghav, C. Pinto, M. Ruggiero, A. Marongiu, D. Atienza, and L. Benini, "Full System Simulation of Many-Core Heterogeneous SoCs using GPU and QEMU Semihosting", in Proceedings of ACM Workshop on General Purpose Processing with Graphics Processing Units, pp. 101-109, Mar. 2012.
- M. T. Yourst, "PTLsim: A Cycle Accurate Full System x86-64 Microarchitectural Simulator", IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), pp. 23-34, 2007.
- S.-T. Shen, S.-Y. Lee, and C.-H. Chen, "Full System Simulation with QEMU: an Approach to Multi-view 3D GPU Design", IEEE Int. Symp. on Circuits and Systems (ISCAS), pp. 3877-3880, May 2010.
- S. Collange, M. Daumas, D. Defour, and D. Parello, "Barra: a Parallel Functional Simulator for GPGPU", IEEE/ACM International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems, pp. 351-360, Aug. 2010.
- F. Bellard, "QEMU, a Fast and Portable Dynamic Translator", in Proceedings of USENIX Annual Technical Conference, pp. 41-46, June 2005.
- A. Bakhoda, G. L. Yuan, W. W. L. Fung, H. Wong, and T. M. Aamodt, "Analyzing CUDA Workloads Using a Detailed GPU Simulator", IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), pp. 163-174, Apr. 2009.