Tutorials

ASP-DAC 2022 offers attendees a set of three-hour intensive introductions to specific topics. This year, each tutorial will be delivered as an interactive online session, with recorded videos also provided.

  • Date: Monday, January 17, 2022 (9:00 — 16:30)

9:00 — 12:00 (TST) (GMT+8)
  • Tutorial-1 (Room 1): IEEE CEDA DATC RDF and METRICS2.1: Toward a Standard Platform for ML-Enabled EDA and IC Design
  • Tutorial-2 (Room 2): Low-bit Neural Network Computing: Algorithms and Hardware
  • Tutorial-3 (Room 3): Side Channel Analysis: from Concepts to Simulation and Silicon Validation

13:30 — 16:30 (TST) (GMT+8)
  • Tutorial-4 (Room 1): New Techniques in Variational Quantum Algorithms and Their Applications
  • Tutorial-5 (Room 2): Towards Efficient Computation for Sparsity in Future Artificial Intelligence
  • Tutorial-6 (Room 3): Scan-based DfT: Mitigating its Security Vulnerabilities and Building Security Primitives

Tutorial-1: Monday, January 17, 9:00—12:00 (TST) @ Room 1

IEEE CEDA DATC RDF and METRICS2.1: Toward a Standard Platform for ML-Enabled EDA and IC Design

Speakers:
Jinwook Jung (IBM Research)
Andrew B. Kahng (UCSD)
Seungwon Kim (UCSD)
Ravi Varadarajan (UCSD)

Abstract:

Machine learning (ML) for IC design often faces a "small data" challenge: obtaining useful training data for ML-enabled EDA requires enormous time and effort to run multiple P&R flows with various tool settings, constraints, and parameters. In this regard, systematic and scalable execution of hardware design experiments, together with standards for sharing data and models, is an essential element of ML-based EDA and chip design. In this tutorial, we describe the effort undertaken in the IEEE CEDA Design Automation Technical Committee (DATC) toward a standard platform for ML-enabled EDA and IC design. We first give an overview of the challenges in ML-enabled EDA and review related previous efforts. We then present DATC RDF and METRICS2.1, followed by hands-on working examples of (1) large-scale design experiments via cloud deployment, (2) extracting, collecting, and analyzing METRICS data from large-scale experiment results, and (3) a flow auto-tuning framework for PPA optimization via METRICS2.1 realization. We will provide the working examples used throughout the tutorial via a public code repository, which includes a full RTL-to-GDS flow, codebases, Jupyter notebooks, and cloud deployment sample code.
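As a rough illustration of the metrics-driven flow tuning described in item (3), the sketch below loads per-run metrics records and selects the parameter set with the best scalar PPA cost. The JSON field names (`wns_ns`, `total_power_mw`, `core_area_um2`, etc.) and the cost function are hypothetical stand-ins, not the actual METRICS2.1 schema or the tutorial's auto-tuner.

```python
import json

# Hypothetical run records in a METRICS2.1-like spirit; the field names
# and values below are illustrative, not the real METRICS2.1 schema.
runs_json = """
[
  {"params": {"target_util": 0.60, "clock_period_ns": 1.0},
   "metrics": {"wns_ns": -0.05, "total_power_mw": 12.1, "core_area_um2": 48000}},
  {"params": {"target_util": 0.70, "clock_period_ns": 1.0},
   "metrics": {"wns_ns": 0.02, "total_power_mw": 11.4, "core_area_um2": 41000}},
  {"params": {"target_util": 0.80, "clock_period_ns": 1.0},
   "metrics": {"wns_ns": -0.20, "total_power_mw": 10.9, "core_area_um2": 36000}}
]
"""

def ppa_cost(m):
    """Toy scalar PPA cost: heavily penalize negative slack, then trade power vs. area."""
    timing_penalty = 1000.0 * max(0.0, -m["wns_ns"])  # hard penalty if WNS < 0
    return timing_penalty + m["total_power_mw"] + m["core_area_um2"] / 10000.0

runs = json.loads(runs_json)
best = min(runs, key=lambda r: ppa_cost(r["metrics"]))
print(best["params"])  # the only timing-clean run wins here
```

A real auto-tuner would wrap this selection step in a search loop (grid, Bayesian, etc.) that launches new P&R runs; the point here is only that standardized metrics make the objective function trivial to compute across large experiment sets.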

Biography:

Jinwook Jung and Andrew B. Kahng currently serve as chair/co-chair of the IEEE CEDA DATC.
Jinwook Jung is a Research Staff Member at IBM Thomas J. Watson Research Center. At IBM, he works to advance design methodologies for AI hardware accelerators and high-performance microprocessors, leveraging machine learning and cloud infrastructure.
Andrew B. Kahng is on the CSE and ECE faculty at UCSD. His interests span IC physical design, DFM, technology roadmapping, machine learning for IC design and CAD, and open-source platforms to accelerate EDA research and innovation. He serves as PI of the OpenROAD project https://theopenroadproject.org and the https://tilos.ai research institute.
Seungwon Kim is currently a postdoctoral researcher in the VLSI CAD Laboratory in the CSE Department at UCSD. His current research interests include machine learning-based physical design methodology for flow design automation with open-source platforms.
Ravi Varadarajan is currently a graduate student pursuing his Ph.D. in the ECE Department at UCSD. He has over 35 years of industry experience working at Bell Labs, Cadence, Tera Systems, Atrenta, and Synopsys.



Tutorial-2: Monday, January 17, 9:00—12:00 @ Room 2

Low-bit Neural Network Computing: Algorithms and Hardware

Speakers:
Zidong Du (Institute of Computing Technology, Chinese Academy of Sciences)
Haojin Yang (Hasso-Plattner-Institute)
Kai Han (Noah’s Ark Lab, Huawei Technology)

Abstract:

In recent years, deep learning technologies have achieved excellent performance and many breakthroughs in both academia and industry. However, state-of-the-art deep models are computationally expensive and consume large storage space. Deep learning is also strongly demanded by numerous applications in areas such as mobile platforms, wearable devices, autonomous robots, and IoT devices. How to efficiently deploy deep models on such low-power devices has become a challenging research problem. In recent years, low-bit neural network (NN) computing has received much attention due to its potential to reduce the storage and computation complexity of NN inference and training. This tutorial introduces existing efforts on low-bit NN computing in three parts: (1) algorithms toward more accurate low-bit NNs; (2) binary neural network design and inference; (3) low-bit training of NNs.
First, quantized neural networks with low-bit weights and activations are attractive for developing AI accelerators. However, the performance of low-bit neural networks is usually much worse than that of their full-precision counterparts, for two main reasons: the quantization functions are non-differentiable, which increases the optimization difficulty of quantized networks, and the representation capacity of low-bit values is limited, which also limits the performance of quantized networks. This talk will introduce several methods to obtain high-performance low-bit neural networks, including better optimization schemes, better network architectures, and a new quantized adder operator.
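To make the non-differentiability issue concrete, the sketch below implements symmetric uniform quantization plus the straight-through estimator (STE), a common generic workaround in which the rounding step is treated as the identity during backpropagation. This is a textbook illustration, not the specific methods presented in the tutorial.

```python
# Symmetric uniform quantization of a weight onto a signed k-bit grid,
# plus an STE backward pass. round() makes quantize() piecewise constant,
# so its true derivative is zero almost everywhere; STE pretends the
# derivative is 1 inside the clipping range so training can proceed.

def quantize(w, bits=2, max_abs=1.0):
    """Map w onto the signed uniform grid with 2^(bits-1)-1 positive levels."""
    levels = (1 << (bits - 1)) - 1          # e.g. one level per side for 2 bits
    scale = max_abs / levels
    q = round(w / scale) * scale
    return max(-max_abs, min(max_abs, q))   # clip to the representable range

def ste_grad(upstream_grad, w, max_abs=1.0):
    """STE backward: pass the gradient through unchanged inside the clip range."""
    return upstream_grad if -max_abs <= w <= max_abs else 0.0

weights = [0.8, -0.3, 0.1, -0.9]
print([quantize(w, bits=2) for w in weights])  # -> [1.0, 0.0, 0.0, -1.0]
```

With only 2 bits, small weights collapse to zero, which is exactly the limited-representation-capacity problem the abstract mentions; higher bit widths shrink the grid spacing and the resulting quantization error.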
Second, this talk will present recent progress on binary neural networks (BNNs), including their development history and the latest model designs. It will also present preliminary verification results for BNNs, obtained with hardware accelerator simulation, in terms of accuracy and energy consumption. The presenter will further provide an outlook on the prospects and challenges of BNN-based AI accelerators.
Finally, training CNNs is time-consuming and energy-hungry. Many studies have shown that low-bit formats are promising for speeding up CNN inference and improving its energy efficiency, but it is harder for the training phase of CNNs to benefit from such techniques. This talk will introduce the challenges and recent progress of low-bit NN training, and will elaborate on hardware architecture design principles for efficient quantized training.

Biography:

Zidong Du is an associate professor at the Intelligent Processor Research Center, Institute of Computing Technology (ICT), Chinese Academy of Sciences (CAS). His research interests focus on novel architectures for artificial intelligence, including deep learning processors, inexact/approximate computing, neural network architectures, and neuromorphic architectures. He has published over 20 papers in top-tier computer architecture venues, including ASPLOS, MICRO, ISCA, TC, TOCS, and TCAD. For his innovative work on deep learning processors, he won the Best Paper Award at ASPLOS'14, the Distinguished Doctoral Dissertation Award of CAS (40/10000), and the Distinguished Doctoral Dissertation Award of the China Computer Federation (10 per year).
Haojin Yang received his doctoral degree with the final grade "summa cum laude" from the Hasso-Plattner-Institute (HPI) and the University of Potsdam in 2013. From 2015 to 2019, he led the multimedia and machine learning (MML) research group at HPI. He received the German professorial teaching qualification (Habilitation) in July 2019. From November 2019 to October 2020, he was the Edge Computing Lab branch head at AI Labs and Ali-Cloud of Alibaba Group. Currently, he leads the MML research group at HPI and serves as chief scientific advisor of the Huawei edge cloud innovation lab. He has authored and co-authored more than 70 high-quality papers in peer-reviewed international conferences and journals. He has served as a program committee member for leading conferences such as NeurIPS, ICML, ICLR, CVPR, and ICCV.
Kai Han is a senior researcher at Noah's Ark Lab, Huawei Technologies. He received his M.S. degree from Peking University, China. His research interests lie primarily in deep learning, machine learning, and computer vision. He has published 20+ papers at top-tier conferences and in journals, including NeurIPS, ICML, CVPR, and ICCV. He regularly serves as a reviewer for NeurIPS, ICML, ICLR, AAAI, IJCAI, TCSVT, etc.



Tutorial-3: Monday, January 17, 9:00—12:00 @ Room 3

Side Channel Analysis: from Concepts to Simulation and Silicon Validation

Speakers:
Makoto Nagata (Kobe University)
Lang Lin (ANSYS Inc.)
Yier Jin (University of Florida)

Abstract:

Since the report of simple and differential power analysis in the late 1990s, side channel analysis (SCA) has been one of the most important and well-studied topics in hardware security. In this tutorial, we will share our insights and experience on SCA through a combination of presentations, embedded demos, and an interactive panel discussion. The three speakers, from academia and industry, have rich experience and solid track records in hardware security research and practice.
We will start the tutorial with a comprehensive introduction of SCA, including the popular side channels that have been exploited by attackers, common countermeasures, and the simulation based SCA with commercial EDA tools. Then we will present industry proven flows for fast and effective pre-silicon side channel leakage analysis (SCLA) with focus on physical level power and electromagnetic (EM) side channels. Next, we elaborate how to perform on-chip and in-system side-channel leakage measurements and assessments with system-level assembly options on crypto silicon chips with the help of embedded on-chip noise monitor circuits. We will conclude the presentations with some forward-looking discussion on emerging topics such as SCA for security, SCA in AI and machine learning (ML), and pre-silicon SCLA assisted by AI/ML. Multiple short video clips will be embedded to showcase SCA by simulation and silicon measurement.
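As a rough flavor of what simulation-based SCA looks like, the toy sketch below runs a correlation power analysis (CPA) on simulated, noise-free traces: a Hamming-weight leakage model is correlated against observed leakage for every key guess. The random-permutation S-box and the noiseless trace model are deliberate simplifications, nothing like the commercial SCLA flows or silicon measurements covered in the tutorial.

```python
import random

# Toy CPA on simulated traces. The 8-bit S-box is a seeded random
# permutation and the "power trace" is just the Hamming weight of the
# S-box output -- an idealized, noise-free leakage model.

rng = random.Random(0)
SBOX = list(range(256))
rng.shuffle(SBOX)

def hw(x):
    """Hamming weight of a byte."""
    return bin(x).count("1")

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs) ** 0.5
    vy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (vx * vy) if vx and vy else 0.0

SECRET_KEY = 0x3C
plaintexts = [rng.randrange(256) for _ in range(200)]
traces = [hw(SBOX[p ^ SECRET_KEY]) for p in plaintexts]  # simulated leakage

# Rank every key guess by correlation between predicted and observed leakage.
scores = {k: pearson([hw(SBOX[p ^ k]) for p in plaintexts], traces)
          for k in range(256)}
recovered = max(scores, key=scores.get)
print(hex(recovered))  # the correct guess correlates perfectly
```

In real attacks the traces contain measurement noise and algorithmic noise, so the correct key emerges only statistically as more traces are collected; countermeasures such as masking aim to break exactly this correlation.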
No prior knowledge is required to attend this tutorial. The audience is expected to learn the foundations and state of the art in SCA, along with some hands-on skills.

Biography:

Makoto Nagata is currently a professor in the Graduate School of Science, Technology and Innovation, Kobe University, Kobe, Japan. Dr. Nagata has chaired the Technology Directions subcommittee of the International Solid-State Circuits Conference since 2018. He has served as a technical program chair, a symposium chair, and an executive committee member for the Symposium on VLSI Circuits. He has been an AdCom member of the IEEE Solid-State Circuits Society and a distinguished lecturer (DL) of the society, both since 2020. He is an associate editor of the IEEE Transactions on VLSI Systems.
Lang Lin is a technical product manager at ANSYS Inc., based in California, USA. He is passionate about delivering EDA solutions to worldwide customers, with expertise in power integrity, hardware security, and static timing simulation. He previously led CPU design efforts at Intel Corp. and published several power integrity papers at the Intel DTTC conference. He holds a Ph.D. in ECE from the University of Massachusetts and an M.S. from Peking University. He has published research papers in the domains of low-power design, side-channel analysis, and hardware security. He has served on technical program committees and as a reviewer for several conferences, including IEEE ICCAD, HOST, and DAC.
Dr. Yier Jin is the co-chair of the IEEE Hardware Security and Trust Technical Committee. His interests include hardware security, hardware-assisted cybersecurity, and IoT design. He is an IEEE CEDA Distinguished Lecturer.



Tutorial-4: Monday, January 17, 13:30—16:30 @ Room 1

New Techniques in Variational Quantum Algorithms and Their Applications

Speakers:
Tamiya Onodera (IBM Research – Tokyo)
Atsushi Matsuo (IBM Research – Tokyo)
Rudy Raymond (IBM Research – Tokyo)

Abstract:

Variational Quantum Algorithms (VQAs) are important and promising quantum algorithms applicable to near-term quantum devices, with applications in optimization, machine learning, and quantum chemistry. This tutorial introduces new techniques in VQAs, emphasizing the design of their quantum circuits. It starts with a general introduction to IBM Quantum devices and their programming environment. Next, the design of parameterized quantum circuits in VQAs for optimization and machine learning is discussed. Finally, we will present some open problems that may be of interest to the ASP-DAC community.
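One core VQA building block is estimating gradients of a parameterized circuit via the parameter-shift rule. The sketch below is a generic textbook illustration (not code from the tutorial): for a single qubit prepared as RY(theta)|0>, the Z expectation is exactly cos(theta), so the "circuit evaluations" can be simulated classically and the variational loop run end to end.

```python
import math

# Single-parameter VQA toy: minimize <Z> for the state RY(theta)|0>.
# The parameter-shift rule computes the exact gradient from just two
# extra circuit evaluations at theta +/- pi/2.

def expectation_z(theta):
    """Simulated measurement: <0| RY(theta)^dag Z RY(theta) |0> = cos(theta)."""
    return math.cos(theta)

def parameter_shift_grad(theta, shift=math.pi / 2):
    """Gradient from two shifted circuit evaluations (exact for this gate)."""
    return (expectation_z(theta + shift) - expectation_z(theta - shift)) / 2

# Vanilla gradient descent on the classical parameter; optimum is theta = pi.
theta, lr = 0.5, 0.4
for _ in range(100):
    theta -= lr * parameter_shift_grad(theta)
print(round(expectation_z(theta), 4))  # -> -1.0
```

On real hardware each `expectation_z` call is a batch of shots on a quantum device, and the classical optimizer loop is exactly this structure; circuit design determines how expressive the parameterized ansatz is and how trainable its landscape remains.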

Biography:

Dr. Tamiya Onodera is an IBM Distinguished Engineer and Deputy Director of IBM Research - Tokyo, managing the Quantum Computing team there. He joined the research laboratory in 1988 after obtaining his Ph.D. in Information Science from The University of Tokyo. His current research interests are software stacks for AI and quantum computing. He is a Distinguished Scientist of the Association for Computing Machinery (ACM), a Fellow of the Japan Society for Software Science and Technology (JSSST), and a Board Member of the Information Processing Society of Japan (IPSJ). He also serves as Vice-chair of the Quantum Computer Technology Promotion Committee at the Quantum ICT Forum and as Secretary of the IPSJ SIG on Quantum Software.
Atsushi Matsuo is a researcher at IBM Research - Tokyo. He received his M.E. degree in Information Science and Engineering from Ritsumeikan University, Japan, in 2013. His research interests include quantum circuit synthesis, compilers for quantum circuits, and quantum algorithms for current noisy devices, such as variational quantum algorithms. He works on developing Qiskit, especially Qiskit Optimization. He also works on building the quantum community and developing quantum talent for the future; for example, he has participated in Qiskit Camps and hackathons as a coach, and he contributed to the IBM Quantum Challenge, a competitive programming contest for quantum computers, as one of its core members.
Rudy Raymond has more than 15 years of experience applying technical skills in quantum computing, AI, and optimization to real-world problems, leading to projects with industrial clients ranging from IT, telecommunications, education, and finance to the automotive industry. His recent activities include technical consulting on quantum computing and AI for clients in Japan, authoring 100+ peer-reviewed technical papers in optimization, AI, and quantum computing, evaluating patents, delivering invited talks on quantum computing, and serving on the program committees of IPSJ SIGQS (Information Processing Society of Japan, Special Interest Group on Quantum Software), ASP-DAC 2021, and other top AI conferences. He has taught classes introducing near-term quantum computers at the University of Tokyo since 2019 and at Keio University since 2020. He graduated from the Graduate School of Informatics, Kyoto University, in 2006 with the Ph.D. thesis "Studies on Quantum Query Complexities and Quantum Network Coding".



Tutorial-5: Monday, January 17, 13:30—16:30 @ Room 2

Towards Efficient Computation for Sparsity in Future Artificial Intelligence

Speakers:
Fei Sun (Alibaba Group)
Dacheng Liang (Biren Technology)
Yu Wang (Tsinghua University)

Abstract:

With the fast development of Artificial Intelligence (AI), sparsity has become a key enabler for both practical deployment and efficient training in various domains. On the other hand, sparsity does not necessarily translate to efficiency: sparse computation often loses to its dense counterpart in terms of throughput. To enable efficient AI computation based on sparse workloads in the future, this tutorial focuses on the following issues. (1) From the algorithm perspective: compressing deep learning models by introducing zeros into the model weights has proven to be an effective way to reduce model complexity, and it introduces sparsity to the model. The introduction of fine-grained 2:4 sparsity in NVIDIA A100 GPUs has renewed interest in sparse formats that are efficient on commercial hardware. This tutorial compares different pruning methodologies and introduces several sparse representations that are efficient on CPUs and GPUs. (2) From the kernel perspective: many AI applications can be decomposed into several key sparse kernels, so the problem reduces to accelerating those kernels. Unlike dense kernels such as general-purpose matrix multiplication (GEMM), which can approach the peak performance of GPU hardware, sparse kernels such as sparse matrix-matrix multiplication (SpMM) achieve low FLOP utilization, and their performance depends closely on the implementation. In this tutorial, we will introduce how to optimize sparse kernels on GPUs with several simple but effective methods. (3) From the hardware perspective: in existing architectures, the efficiency of dense GEMM operations keeps improving, and their performance is constantly being squeezed. To improve computing efficiency and reduce energy consumption, designers are placing sparse tensor processing in an increasingly important position. The main challenges of sparse tensor processing are limited bandwidth, irregular memory access, and fine-grained data processing.
In this tutorial, we will present some Domain Specific Architectures (DSAs) for sparse tensors in GPGPUs and provide an overview of development trends.
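The fine-grained 2:4 pattern mentioned above has a very simple pruning rule: in every group of 4 consecutive weights, keep only the 2 with the largest magnitudes. The toy sketch below shows just that rule; real flows (e.g. on A100 Sparse Tensor Cores) additionally store compressed values plus 2-bit indices and typically fine-tune the network afterwards to recover accuracy.

```python
# Magnitude-based 2:4 structured pruning of a flat weight row.
# Every aligned group of 4 weights ends up with at most 2 nonzeros,
# which is what lets hardware skip half the multiply-accumulates.

def prune_2_4(weights):
    assert len(weights) % 4 == 0, "2:4 sparsity works on groups of 4"
    pruned = []
    for i in range(0, len(weights), 4):
        group = weights[i:i + 4]
        # Indices of the two largest-magnitude entries in this group.
        keep = sorted(range(4), key=lambda j: abs(group[j]), reverse=True)[:2]
        pruned.extend(w if j in keep else 0.0 for j, w in enumerate(group))
    return pruned

row = [0.1, -0.5, 0.3, 0.02, 0.9, 0.0, -0.7, 0.4]
print(prune_2_4(row))  # -> [0.0, -0.5, 0.3, 0.0, 0.9, 0.0, -0.7, 0.0]
```

Because the 50% sparsity is structured per group rather than scattered arbitrarily, the hardware knows exactly how much work each group skips, which is what makes this format efficient where unstructured sparsity often is not.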

Biography:

Dr. Fei Sun is a research scientist in the DAMO Academy, Alibaba Inc. His research interests include efficient machine learning inference hardware design, algorithm/software/hardware co-design of deep neural network (DNN) implementations, sparse DNN training infrastructure, etc. Prior to the DAMO Academy, he was a software engineer at Facebook Inc., where he worked on the Caffe2/PyTorch ML framework and the Facebook AI performance evaluation platform. Before that, he was an architect at Cadence/Tensilica. Fei Sun received his B.S. from Peking University in 2000 and his Ph.D. from Princeton University in 2005.
Dacheng Liang is a senior researcher at Biren Technology, playing a significant role in the development of its tensor core. Prior to joining Biren, he had over 10 years of experience in GPU shader execution unit and AI-NPU development. His research focuses on sparse data processing architectures, ML methods for computer architecture, and chip design. Dacheng Liang received his Master's degree from the University of Electronic Science and Technology of China and his Bachelor's degree from Southwest Jiaotong University.
Prof. Yu Wang received his B.S. and Ph.D. (with honors) degrees from Tsinghua University, Beijing, in 2002 and 2007. He is currently a tenured professor in the Department of Electronic Engineering, Tsinghua University. His research interests include brain-inspired computing, application-specific hardware computing, parallel circuit analysis, and power/reliability-aware system design methodology. He has authored and coauthored more than 300 papers in refereed journals and conferences. He received Best Paper Awards at ASP-DAC 2019, FPGA 2017, NVMSA 2017, and ISVLSI 2012, a Best Poster Award at HEART 2012, and 10 Best Paper Nominations. He is a recipient of the DAC Under-40 Innovator Award (2018) and the IBM X10 Faculty Award (2010). He served as TPC chair for ICFPT 2019 and 2011 and ISVLSI 2018, finance chair of ISLPED 2012-2016, and track chair for DATE 2017-2019 and GLSVLSI 2018, and has served as a program committee member for leading conferences in these areas, including top EDA conferences such as DAC, DATE, ICCAD, and ASP-DAC, and top FPGA conferences such as FPGA and FPT. He has served as co-editor-in-chief of the ACM SIGDA E-Newsletter; associate editor of the IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, the IEEE Transactions on Circuits and Systems for Video Technology, ACM Transactions on Embedded Computing Systems, ACM Transactions on Design Automation of Electronic Systems, IEEE Embedded Systems Letters, and the Journal of Circuits, Systems, and Computers; and Special Issue editor of the Microelectronics Journal. He currently serves on the ACM SIGDA Executive Committee and the DAC 2021 Executive Committee. He is the co-founder of Deephi Tech (acquired by Xilinx in 2018), a leading deep learning computing platform provider.



Tutorial-6: Monday, January 17, 13:30—16:30 @ Room 3

Scan-based DfT: Mitigating its Security Vulnerabilities and Building Security Primitives

Speakers:
Aijiao Cui (Harbin Institute of Technology (Shenzhen))
Gang Qu (University of Maryland)

Abstract:

Scan chain is one of the most powerful and popular design-for-test (DfT) technologies, as it gives test engineers unrestricted access to the internal states of the core under test. This same convenience has also made the scan chain an exploitable side channel for attackers to steal the cipher key of a cryptographic core or to defeat combinational logic designed for obfuscation. In this tutorial, we will first present the preliminaries of scan-based DfT technology. Then we will illustrate its vulnerabilities to scan side-channel attacks and SAT attacks and review existing countermeasures on how to design secure scan-based DfT to resist these attacks. Next, we will discuss how to utilize scan-based DfT as a security primitive to provide solutions for several hardware security problems, including hardware intellectual property protection, physical unclonable functions, and device authentication.
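The observability that makes scan chains both useful and risky can be shown with a toy shift-register model: in test mode, every internal flip-flop is stitched into one chain, so its captured state can be clocked out serially through a single pin. The class and register contents below are invented for illustration only.

```python
# Toy scan-chain model: the internal flip-flop state captured in test
# mode is shifted out one bit per clock through the scan-out pin, so the
# full internal state is observable from outside the chip.

class ScanChain:
    def __init__(self, flop_bits):
        self.flops = list(flop_bits)  # internal state captured in test mode

    def shift_out(self):
        """Clock the chain once per bit; the whole state leaks out serially."""
        out = []
        while self.flops:
            out.append(self.flops.pop())  # last flop drives the scan-out pin
        return out

# Suppose these 8 flops hold an intermediate round state of a cipher.
round_state = [1, 0, 1, 1, 0, 0, 1, 0]
leaked = ScanChain(round_state).shift_out()
print(leaked)  # reversing the serial stream reconstructs the state
```

A scan attack exploits exactly this: capture a cryptographic intermediate into the flops, switch to test mode, and shift it out; secure scan designs therefore scramble, gate, or authenticate this access path.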
This tutorial targets two groups of audience: (1) graduate students interested in IC testing (in particular scan chains) and security, and (2) researchers and engineers from industry and academia working on IC testing and hardware security. No prior knowledge of scan-based DfT or security is required to attend this tutorial. The audience is expected to learn the foundations and state of the art in secure scan design.

Biography:

Aijiao Cui (S'05-M'10-SM'20) received the B.Eng. and M.Eng. degrees in electronics from Beijing Normal University, Beijing, China, in 2000 and 2003, respectively, and the Ph.D. degree in electrical and electronic engineering from Nanyang Technological University, Singapore, in 2009. From July 2003 to December 2004, she was a Lecturer with Beijing Jiaotong University, Beijing. She was a Research Fellow with the Peking University Shenzhen SoC Laboratory, Shenzhen, from 2009 to 2010, prior to joining the School of Electronic and Information Engineering of Harbin Institute of Technology (Shenzhen) in 2010, where she is currently an Associate Professor. Her current research interests include hardware security and IC testing techniques.
Gang Qu (S'98-M'01-SM'07-F'20) received the Ph.D. degree in computer science from the University of California, Los Angeles. He is currently a Professor with the Department of Electrical and Computer Engineering and the Institute for Systems Research, University of Maryland at College Park, where he leads the Maryland Embedded Systems and Hardware Security (MeshSec) Lab and the Wireless Sensors Laboratory. His primary research interests are in the area of embedded systems and VLSI CAD, with a focus on low-power system design and hardware-related security and trust. He has more than 250 publications and has delivered more than 120 invited talks, which have helped build the hardware security and trust community. Notably, he co-founded the AsianHOST symposium (hardware-oriented security and trust), now in its sixth year. He has served as chair or co-chair for a dozen other conferences and workshops, and has founded or chaired hardware security tracks, including the hardware and system security track at ASP-DAC 2020-2022. He is an enthusiastic teacher who has taught and co-taught various security courses, including VLSI design intellectual property protection, cybersecurity for the smart grid, reverse engineering, and a popular MOOC on hardware security through Coursera.



Last Updated on: November 12, 2021