Remark: * indicates the corresponding author.
Remark: = indicates equal contribution.
2025
[C39] Fangxin Liu, Haomin Li, Bowen Zhu, Zongwu Wang, Zhuoran Song*, Haibing Guan, and Li Jiang. ASDR: Exploiting Adaptive Sampling and Data Reuse for CIM-based Instant Neural Rendering. Accepted by ACM Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS 2026, CCF-A).
[C38] Tianbo Liu, Xinkai Song, Zhifei Yue, Rui Wen, Xing Hu, Zhuoran Song, Yuanbo Wen, Yifan Hao, Wei Li, Zidong Du, Rui Zhang, Jiaming Guo, Di Huang, Shaohui Peng, GuangZhong Sun, Qi Guo, and Tianshi Chen. Cambricon-SR: An Accelerator for Neural Scene Representation with Sparse Encoding Table. Accepted by IEEE/ACM International Symposium on Computer Architecture (ISCA 2025, CCF-A).
[J13] Xuhang Wang, Zhuoran Song*, Chunyu Qi, Fangxin Liu, Naifeng Jing, Li Jiang, and Xiaoyao Liang. RTSA: A Run-Through Sparse Attention Framework for Video Transformer. Accepted by IEEE Transactions on Computers (TC 2025, CCF-A).
[C37] Chunyu Qi, Xuhang Wang, Ruiyang Chen, Yuanzheng Yao, Naifeng Jing, Chen Zhang, Jun Wang, Zhihui Fu, Xiaoyao Liang, and Zhuoran Song*. MHDiff: Memory- and Hardware-Efficient Diffusion Acceleration via Focal Pixel Aware Quantization. Accepted by Design Automation Conference (DAC 2025, CCF-A).
[C36] Yuanzheng Yao, Chen Zhang, Chunyu Qi, Ruiyang Chen, Jun Wang, Zhihui Fu, Naifeng Jing, Xiaoyao Liang, and Zhuoran Song*. SynGPU: Synergizing CUDA and Bit-Serial Tensor Cores for Vision Transformer Acceleration on GPU. Accepted by Design Automation Conference (DAC 2025, CCF-A).
[C35] Ruiyang Chen, Xueyuan Liu, Chunyu Qi, Yuanzheng Yao, Yanan Sun, Xiaoyao Liang, and Zhuoran Song*. SAGA: A Memory-Efficient Accelerator for GANN Construction via Harnessing Vertex Similarity. Accepted by Design Automation Conference (DAC 2025, CCF-A).
[C34] Ruiyang Chen, Xing Li, Xiaoyao Liang, and Zhuoran Song*. GIFTS: Efficient GCN Inference Framework on PyTorch-CPU via Exploring the Sparsity. Accepted by IEEE International Parallel & Distributed Processing Symposium (IPDPS 2025, CCF-B).
[J12] Xuhang Wang, Qiyue Huang, Xing Li, Haozhe Jiang, Qiang Xu, Xiaoyao Liang, and Zhuoran Song*. Vision Transformer Acceleration via a Versatile Attention Optimization Framework. Accepted by IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD 2025, CCF-A).
[C33] Houshu He, Gang Li, Fangxin Liu, Li Jiang, Xiaoyao Liang, and Zhuoran Song*. GSArch: Breaking Memory Barriers in 3D Gaussian Splatting Training via Architectural Support. Accepted by IEEE International Symposium on High-Performance Computer Architecture (HPCA 2025, CCF-A).
[J11] Zhuoran Song*, Jiabei Long, Li Jiang, Naifeng Jing and Xiaoyao Liang. GCNTrain+: A Versatile and Efficient Accelerator for Graph Convolutional Neural Network Training. Accepted by ACM Transactions on Architecture and Code Optimization (TACO 2025, CCF-A).
[J10] Shuai Yuan, Weifeng He, Zhenhua Zhu, Fangxin Liu, Zhuoran Song, Guohao Dai, Guanghui He, and Yanan Sun. HyCTor: A Hybrid CNN-Transformer Network Accelerator With Flexible Weight/Output Stationary Dataflow and Multi-Core Extension. Accepted by IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD 2025, CCF-A).
2024
[C32] Xuan Zhang, Zhuoran Song*, Peng Zhou, Xing Li, Xueyuan Liu, Xiaolong Lin, Zhezhi He, Li Jiang, Naifeng Jing and Xiaoyao Liang. WIFA: A Weight Importance- and Frequency-aware Accelerator for SNN. Accepted by IEEE International Conference on Computer Design (ICCD 2024, CCF-B).
[C31] Zhuoran Song, Houshu He, Fangxin Liu, Yifan Hao, Xinkai Song, Li Jiang, and Xiaoyao Liang. SRender: Boosting Neural Radiance Field Efficiency via Sensitivity-Aware Dynamic Precision Rendering. Accepted by IEEE/ACM International Symposium on Microarchitecture (MICRO 2024, CCF-A).
[J9] Zhuoran Song*, Zhongkai Yu, Xinkai Song, Yifan Hao, Li Jiang, Naifeng Jing and Xiaoyao Liang. Environmental Condition Aware Super-Resolution Acceleration Framework in Server-Client Hierarchies. Accepted by ACM Transactions on Architecture and Code Optimization (TACO 2024, CCF-A). [paper] [cite]
[J8] Xing Li, Zhuoran Song*, Rachata Ausavarungnirun, Xiao Liu, Xueyuan Liu, Xuan Zhang, Xuhang Wang, Jiayao Ling, Gang Li, Naifeng Jing and Xiaoyao Liang. Janus: A Flexible Processing-in-Memory Graph Accelerator Towards Sparsity. Accepted by IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD 2024, CCF-A). [paper] [cite]
[C30] Xuan Zhang, Zhuoran Song*, Zhezhi He, Naifeng Jing, Li Jiang, and Xiaoyao Liang. Watt: A Write-optimized RRAM-based Accelerator for Attention. Accepted by European Conference on Parallel Processing (Euro-Par 2024, CCF-B).
[C29] Fangxin Liu, Ning Yang, Haomin Li, Zongwu Wang, Zhuoran Song, Songwen Pei, Li Jiang. EOS: An Energy-Oriented Attack Framework for Spiking Neural Networks. Accepted by Design Automation Conference (DAC 2024, CCF-A).
[C28] Fangxin Liu, Ning Yang, Haomin Li, Zongwu Wang, Zhuoran Song, Songwen Pei, Li Jiang. INSPIRE: Accelerating Deep Neural Networks via Hardware-friendly Index-Pair Encoding. Accepted by Design Automation Conference (DAC 2024, CCF-A).
[C27] Xueyuan Liu, Zhuoran Song*, Hao Chen, Xing Li, and Xiaoyao Liang. MoC: A Morton-Code-Based Fine-Grained Quantization for Accelerating Point Cloud Neural Networks. Accepted by Design Automation Conference (DAC 2024, CCF-A).
[C26] Xuhang Wang, Zhuoran Song*, and Xiaoyao Liang. InterArch: Video Transformer Acceleration via Inter-Feature Deduplication with Cube-based Dataflow. Accepted by Design Automation Conference (DAC 2024, CCF-A).
[C25] Zhuoran Song*, Chunyu Qi, Yuanzheng Yao, Peng Zhou, Yanyi Zi, Nan Wang, and Xiaoyao Liang. TSAcc: An Efficient Tempo-Spatial Similarity Aware Accelerator for Attention Acceleration. Accepted by Design Automation Conference (DAC 2024, CCF-A).
[C24] Zhuoran Song, Chunyu Qi, Fangxin Liu, Naifeng Jing, and Xiaoyao Liang. CMC: Video Transformer Acceleration via CODEC Assisted Matrix Condensing. Accepted by ACM Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS 2024, CCF-A, Accept rate=14%). [paper] [cite]
[C23] Xueyuan Liu, Zhuoran Song*, Guohao Dai, Gang Li, Can Xiao, Yan Xiang, Dehui Kong, Ke Xu and Xiaoyao Liang. FusionArch: A Fusion-Based Accelerator for Point-Based Point Cloud Neural Networks. Accepted by Design, Automation, and Test in Europe (DATE 2024, CCF-B). Best Paper Award. [paper] [cite]
[C22] Xueyuan Liu, Zhuoran Song*, Xiang Liao, Xing Li, Tao Yang, Fangxin Liu and Xiaoyao Liang. Sava: A Spatial- and Value-Aware Accelerator for Point Cloud Transformers. Accepted by Design, Automation, and Test in Europe (DATE 2024, CCF-B). [paper] [cite]
[C21] Fangxin Liu, Ning Yang, Haomin Li, Zongwu Wang, Zhuoran Song, Songwen Pei, and Li Jiang. SPARK: Scalable and Precision-Aware Acceleration of Neural Networks via Efficient Encoding. Accepted by IEEE International Symposium on High-Performance Computer Architecture (HPCA 2024, CCF-A). [paper] [cite]
2023
[J7] Zhuoran Song, Wanzhen Liu, Tao Yang, Fangxin Liu, Naifeng Jing, and Xiaoyao Liang. A Point Cloud Video Recognition Acceleration Framework Based on Tempo-Spatial Information. Accepted by IEEE Transactions on Parallel and Distributed Systems (TPDS 2023, CCF-A). [paper] [cite]
[C20] Chunyu Qi, Zilong Li, Zhuoran Song* and Xiaoyao Liang. ViTframe: Vision Transformer Acceleration via Informative Frame Selection for Video Recognition. Accepted by 41st IEEE International Conference on Computer Design (ICCD 2023, CCF-B). [paper] [cite]
[C19] Xuhang Wang, Zhuoran Song* and Xiaoyao Liang. RealArch: A Real-Time Scheduler for Mapping Multi-Tenant DNNs on Multi-Core Accelerators. Accepted by 41st IEEE International Conference on Computer Design (ICCD 2023, CCF-B). [paper] [cite]
[C18] Xuan Zhang, Zhuoran Song*, Xing Li, Zhezhi He, Li Jiang, Naifeng Jing and Xiaoyao Liang. HyAcc: A Hybrid CAM-MAC RRAM-based Accelerator for Recommendation Model. Accepted by 41st IEEE International Conference on Computer Design (ICCD 2023, CCF-B). [paper] [cite]
[C17] Xuhang Wang, Zhuoran Song*, Qiyue Huang and Xiaoyao Liang. DEQ: Dynamic Element-wise Quantization for Efficient Attention Architecture. Accepted by 41st IEEE International Conference on Computer Design (ICCD 2023, CCF-B). [paper] [cite]
[C16] Xiaolong Lin, Gang Li, Zizhao Liu, Yadong Liu, Fan Zhang, Zhuoran Song, Naifeng Jing, and Xiaoyao Liang. AdaS: A Fast and Energy-Efficient CNN Accelerator Exploiting Bit-Sparsity. Accepted by Design Automation Conference (DAC 2023, CCF-A). [paper] [cite]
[C15] Zhuoran Song, Heng Lu, Gang Li, Li Jiang, Naifeng Jing and Xiaoyao Liang. PRADA: Point Cloud Recognition Acceleration via Dynamic Approximation. Accepted by Design, Automation and Test in Europe Conference (DATE 2023, CCF-B). Best Paper Award. [paper] [cite]
2022
[J6] Zhuoran Song, Heng Lu, Li Jiang, Naifeng Jing and Xiaoyao Liang. Real-Time Video Recognition via Decoder-Assisted Neural Network Acceleration Framework. Accepted by IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD 2022, CCF-A). [paper] [cite]
[C14] Heng Lu, Zhuoran Song*, Xing Li, Naifeng Jing and Xiaoyao Liang. GCNTrain: A Unified and Efficient Accelerator for Graph Convolutional Neural Networks Training. Accepted by 40th IEEE International Conference on Computer Design (ICCD 2022, CCF-B). [paper] [cite]
[C13] Gang Li, Weixiang Xu, Zhuoran Song, Naifeng Jing, Jian Chen and Xiaoyao Liang. Ristretto: An Atomized Processing Architecture for Sparsity-Condensed Stream Flow in CNN. Accepted by 55th ACM/IEEE International Symposium on Microarchitecture (MICRO 2022, CCF-A). [paper] [cite]
[C12] Xing Li, Rachata Ausavarungnirun, Xiao Liu, Xueyuan Liu, Xuan Zhang, Heng Lu, Zhuoran Song, Naifeng Jing and Xiaoyao Liang. Gzippo: Highly-compact Processing-In-Memory Graph Accelerator Alleviating Sparsity and Redundancy. Accepted by 2022 IEEE/ACM International Conference on Computer-Aided Design (ICCAD 2022, CCF-B). [paper] [cite]
[J5] Zhuoran Song, Naifeng Jing and Xiaoyao Liang. E2-VOR: An End-to-End En/Decoder Architecture for Efficient Video Object Recognition. Accepted by ACM Transactions on Design Automation of Electronic Systems (TODAES 2022, CCF-B). [paper] [cite]
[C11] Zhuoran Song, Zhongkai Yu, Naifeng Jing and Xiaoyao Liang. E2SR: An End-to-End Video CODEC Assisted System for Super Resolution Acceleration. Accepted by Design Automation Conference (DAC 2022, CCF-A). [paper] [cite]
[J4] Tao Yang, Dongyue Li, Fei Ma, Zhuoran Song, Yilong Zhao, Jiaxi Zhang, Fangxin Liu and Li Jiang. PASGCN: An ReRAM-Based PIM Design for GCN with Adaptively Sparsified Graphs. Accepted by IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD 2022, CCF-A). [paper] [cite]
[C10] Tao Yang, Dongyue Li, Zhuoran Song, Yilong Zhao, Fangxin Liu, Zongwu Wang, Zhezhi He and Li Jiang. DTQAtten: Leveraging Dynamic Token-based Quantization for Efficient Attention Architecture. Accepted by ACM/IEEE Design Automation & Test in Europe Conference and Exhibition (DATE 2022, CCF-B). Best Paper Nomination. [paper] [cite]
[C9] Feiyang Wu, Zhuoran Song, Jing Ke, Li Jiang, Naifeng Jing and Xiaoyao Liang. PIPArch: Programmable Image Processing Architecture Using Sliding Array. Accepted by 2021 IEEE Intl Conf on Parallel & Distributed Processing with Applications, Big Data & Cloud Computing, Sustainable Computing & Communications, Social Computing & Networking (ISPA 2022, CCF-C). [paper] [cite]
2021
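[J3] Zhuoran Song and Li Jiang. Evolution of Specialized Architectures and Compression Techniques for Deep Neural Networks (in Chinese). Communications of the CCF, 2021, Issue 3.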
[C8] Zhuoran Song=, Dongyue Li=, Zhezhi He, Xiaoyao Liang and Li Jiang. ReRAM-Sharing: Fine-Grained Weight Sharing for ReRAM-Based Deep Neural Network Accelerator. Accepted by IEEE International Symposium on Circuits and Systems (ISCAS 2021, CCF-C). [paper] [cite]
2020
[C7] Zhuoran Song, Feiyang Wu, Xueyuan Liu, Naifeng Jing and Xiaoyao Liang. VR-DANN: Real-Time Video Recognition via Decoder-Assisted Neural Network Acceleration. Accepted by IEEE/ACM International Symposium on Microarchitecture (MICRO 2020, CCF-A). [paper] [cite]
[C6] Zhuoran Song, Bangqi Fu, Feiyang Wu, Zhaoming Jiang, Li Jiang, Naifeng Jing and Xiaoyao Liang. DRQ: Dynamic Region-Based Quantization for Deep Neural Network Acceleration. Accepted by IEEE/ACM International Symposium on Computer Architecture (ISCA 2020, CCF-A). [paper] [cite]
[C5] Zhuoran Song, Jianfei Wang, Tianjian Li, Li Jiang, Jing Ke, Xiaoyao Liang and Naifeng Jing. GPNPU: Enabling Efficient Hardware-Based Direct Convolution with Multi-Precision Support in GPU Tensor Cores. Accepted by Design Automation Conference (DAC 2020, CCF-A). [paper] [cite]
[J2] Zhuoran Song, Lerong Chen, Tianjian Li, Naifeng Jing, Xiaoyao Liang, Yanan Sun and Li Jiang. ITT-RNA: Imperfection Tolerable Training for RRAM-Crossbar based Deep Neural-network Accelerator. Accepted by IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD 2020, CCF-A). [paper] [cite]
[C4] Zhaoming Jiang, Zhuoran Song, Naifeng Jing and Xiaoyao Liang. PRArch: Pattern-Based Reconfigurable Architecture for Deep Neural Network Acceleration. Accepted by IEEE International Conference on High Performance Computing and Communications (HPCC 2020, CCF-C). [paper] [cite]
[C3] Zhuoran Song, Yilong Zhao, Yanan Sun, Xiaoyao Liang and Li Jiang. ESNreram: An Energy-Efficient Sparse Neural Network Based on Resistive Random-Access Memory. Accepted by Great Lakes Symposium on VLSI (GLSVLSI 2020, CCF-C). [paper] [cite]
2019
[C2] Zhuoran Song, Dongyu Ru, Ru Wang, Hongru Huang, Zhenghao Peng, Jing Ke, Xiaoyao Liang, and Li Jiang. Approximate Random Dropout for DNN training acceleration in GPGPU. Accepted by ACM/IEEE Design Automation & Test in Europe Conference and Exhibition (DATE 2019, CCF-B). [paper] [cite]
[J1] Li Jiang=, Zhuoran Song=, Song H, et al. Energy-Efficient and Quality-Assured Approximate Computing Framework Using a Co-Training Method. Accepted by ACM Transactions on Design Automation of Electronic Systems (TODAES 2019, CCF-B). [paper] [cite]
2018
[C1] Haiyue Song, Li Jiang, Chengwen Xu, Zhuoran Song, Naifeng Jing, Xiaoyao Liang and Qiang Xu. Invocation-driven Neural Approximate Computing with a Multiclass-Classifier and Multiple Approximators. Accepted by ACM/IEEE International Conference on Computer-Aided Design (ICCAD 2018, CCF-B). [paper] [cite]