Related Projects and Research
Jeff Bush edited this page Apr 28, 2022 · 10 revisions
Papers and projects based on this design.
- Bush, Jeff, et al. "Nyami: a synthesizable GPU architectural model for general-purpose and graphics-specific workloads." Performance Analysis of Systems and Software (ISPASS), 2015 IEEE International Symposium on. IEEE, 2015. (Microarchitectural details are for the older v1 architecture)
- Blane, AJ. "Do it Yourself Heterogeneous Multicore Platform." SITCON, 2015. source (Integration into the Terasic SoCKit Altera development board)
- Integration of Nyuzi with Cyclone V Hard Processor System
- Jian, Liu. Research and Implementation of Embedded Multi-core GPU Rendering Pipeline (Master's Thesis). University of Electronic Science and Technology, China.
- Bush, Jeff, et al. "NyuziRaster: Optimizing rasterizer performance and energy in the Nyuzi open source GPU." Performance Analysis of Systems and Software (ISPASS), 2016 IEEE International Symposium on. IEEE, 2016.
- Pang, Yalong, et al. "Instruction set extension and hardware acceleration for SVM application toward a vector processor." SoC Design Conference (ISOCC), 2017 International. IEEE, 2017.
- Bauer, Wolfgang, et al. "Programmable HSA Accelerators for Zynq UltraScale+ MPSoC Systems." European Conference on Parallel Processing. Springer, Cham, 2018. (Integration of Nyuzi on Zynq UltraScale+ MPSoC, using HSA Foundation Standards LibHSA and HSAIL/BRIG intermediate language)
- Nyuzi on Xilinx UltraScale+ ZCU102, Institute of Computer Technology (ICT) at TU Wien, Vienna, Austria
- Nyuzi Stack Tracer
- Jinchuan, Zhang. Research and Implementation of Heterogeneous Multiprocessor Embedded Platform (Master's Thesis). University of Electronic Science and Technology, China, 2017. source code (Integration of Nyuzi with ZC706 board)
- Kruppe, Robin, et al. "Extending LLVM for lightweight SPMD vectorization: using SIMD and vector instructions easily from any language." Proceedings of the 2019 IEEE/ACM International Symposium on Code Generation and Optimization. IEEE Press, 2019.
- Compiling Parallel Kernels in Rust (source code for above)
- Cheng, TaiYu, et al. "Logarithm-approximate floating-point multiplier is applicable to power-efficient neural network training." Integration (2020).
- Lim, Hyunyul, Tae Hyun Kim, and Sungho Kang. "Prediction-Based Error Correction for GPU Reliability with Low Overhead." Electronics 9.11 (2020): 1849.
- Masuda, Yutaka, et al. "Critical path isolation and bit-width scaling are highly compatible for voltage over-scalable design." 2021 Design, Automation & Test in Europe Conference & Exhibition (DATE). IEEE, 2021.
- Masuda, Yutaka, et al. "Low-Power Design Methodology of Voltage Over-Scalable Circuit with Critical Path Isolation and Bit-Width Scaling." IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences (2021).