Blogs

Blog: Advice for Undergraduates Committed to Computer Architecture Research

Published on 2025

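If you are passionate about computer architecture and aspire to pursue further study or research in this field, the following advice is meant to help you build a solid foundation: [Read More]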

Blog: How to Write a Qualified Paper

Published on 2023

How can you improve the readability of a paper? [Read More]

Paper: PRADA: Point Cloud Recognition Acceleration via Dynamic Approximation

Published on 2023

Recent point cloud recognition (PCR) tasks tend to utilize deep neural networks (DNNs) for better accuracy. Still, the computational intensity of DNNs makes them far from real-time processing, given the fast-increasing number of points that need to be processed. Because the point cloud represents 3D-shaped discrete objects in the physical... [Read More]

Paper: Real-Time Video Recognition via Decoder-Assisted Neural Network Acceleration Framework

Published on 2022

Because on-chip computing capability for deep neural network (DNN) processing is restricted, high-definition video recognition (VOR) is not easily achievable in real time on a consumer SoC. Although many accelerators have been proposed for fast VOR, they remain isolated from a video decoder’s inherent... [Read More]

Paper: E2SR: An End-to-End Video Codec Assisted System for Super Resolution Acceleration

Published on 2022

Nowadays, high-resolution (HR) videos have become a popular choice for a better viewing experience. Recent works have shown that super-resolution (SR) algorithms can provide superior-quality HR video by applying a deep neural network (DNN) to each low-resolution (LR) frame. Obviously, such per-frame DNN processing is compute-intensive and hampers the... [Read More]

Paper: VR-DANN: Real-Time Video Recognition via Decoder-Assisted Neural Network Acceleration

Published on 2020

Nowadays, high-definition video object recognition (segmentation and detection) is hard to achieve in real time on a consumer SoC due to the limited on-chip computing power for neural network (NN) processing. Although many accelerators have been optimized heavily, they are still isolated from the intrinsic video compression... [Read More]

Paper: GPNPU: Enabling Efficient Hardware-Based Direct Convolution with Multi-Precision Support in GPU Tensor Cores

Published on 2020

To better serve DNN (Deep Neural Network) acceleration, GPUs have migrated to new architectures such as NVIDIA Volta and Turing that incorporate dedicated Tensor Cores. Although good at GEMM (general matrix-matrix multiplication), Tensor Cores are still inefficient on convolutions with certain layer structures. This paper proposes a GPNPU (General-Purpose... [Read More]

Paper: DRQ: Dynamic Region-Based Quantization for Deep Neural Network Acceleration

Published on 2020

Quantization is an effective technique for Deep Neural Network (DNN) inference acceleration. However, conventional quantization techniques are either applied at the network or layer level, which may fail to exploit fine-grained quantization for further speedup, or applied only to kernel weights without paying attention to the feature map dynamics that may... [Read More]

Paper: ITT-RNA: Imperfection Tolerable Training for RRAM-Crossbar-Based Deep Neural-Network Accelerator

Published on 2020

Deep neural networks (DNNs) have gained strong momentum across various applications. The enormous matrix multiplication in DNNs is computation- and memory-intensive. A Resistive Random-Access Memory crossbar (RRAM-crossbar) consisting of memristor cells can naturally carry out matrix-vector multiplication. An RRAM-crossbar-based accelerator therefore has two orders of magnitude of... [Read More]