PRUNING DEEP NEURAL NETWORKS: TOWARDS EFFICIENT MODELS ON THE EDGE

ABSTRACT
Deep Neural Networks (DNNs) can solve extremely challenging tasks by stacking a large number of layers with thousands of neurons and millions of learnable parameters. Their success comes mainly from their ability to learn complex non-linear functions from examples by minimizing a target loss function. However, DNNs are often over-parameterized with respect to the computational resources available at deployment time.
Hence the urgency of considering metrics such as memory footprint and floating-point operations (FLOPs), which are proxies for the model’s complexity. A number of approaches have been proposed toward more efficient deep models, including neural architecture search and knowledge distillation. One of the most natural ways to proceed, however, is to remove all the unnecessary parameters or neurons from a trained model. Pruning a deep model addresses exactly this problem.
In this tutorial session, we will outline the motivations for pruning, some approaches and their limits, highlighting the difference between structured and unstructured pruning.
A hands-on coding session will be offered in the final part of the tutorial, powered by HPC4AI, the high-performance computing infrastructure dedicated to AI (https://hpc4ai.it/) available at the University of Turin.
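
The difference between unstructured and structured pruning mentioned above can be illustrated with a minimal sketch in plain Python. Everything in it is an assumption made for illustration only: the toy weight matrix W, the magnitude-based criterion, and the function names unstructured_prune and structured_prune are hypothetical and are not necessarily the methods presented in the tutorial.

```python
# Illustrative sketch only: magnitude-based pruning of one dense layer.
# The matrix W, the pruning criterion and the function names are assumptions
# made for this example, not the tutorial's actual method.

def unstructured_prune(weights, sparsity):
    """Zero the `sparsity` fraction of individual weights with the smallest |w|.

    The matrix keeps its shape; only single entries become zero.
    Ties at the threshold may zero slightly more than the requested fraction.
    """
    magnitudes = sorted(abs(w) for row in weights for w in row)
    k = int(len(magnitudes) * sparsity)      # number of weights to zero
    if k == 0:
        return [row[:] for row in weights]
    threshold = magnitudes[k - 1]            # largest magnitude that is pruned
    return [[0.0 if abs(w) <= threshold else w for w in row] for row in weights]


def structured_prune(weights, n_remove):
    """Remove the `n_remove` neurons (rows) with the smallest L1 norm.

    The result is a smaller dense matrix. In a real network the matching
    input columns of the next layer would have to be removed as well.
    """
    norms = [sum(abs(w) for w in row) for row in weights]
    by_norm = sorted(range(len(weights)), key=lambda i: norms[i])
    drop = set(by_norm[:n_remove])
    return [row[:] for i, row in enumerate(weights) if i not in drop]


# Toy layer: 4 neurons (rows), 3 inputs each (columns).
W = [
    [0.90, -0.05, 0.40],
    [0.02, 0.01, -0.03],
    [-0.70, 0.60, 0.08],
    [0.10, -0.80, 0.50],
]

sparse = unstructured_prune(W, 0.5)   # same 4x3 shape, half the entries zeroed
compact = structured_prune(W, 1)      # 3x3 matrix, weakest neuron removed

print("unstructured:", sparse)
print("structured:  ", compact)
```

In this sketch the unstructured result keeps the original 4x3 shape, so memory or FLOP savings only materialize with sparse storage formats or sparsity-aware kernels. The structured result is a smaller dense matrix that standard dense routines can use directly.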
SPEAKERS

Enzo Tartaglione, enzo.tartaglione@telecom-paris.fr

Enzo Tartaglione is Maître de Conférences at Télécom Paris and a Hi!Paris chair holder. He received the MS degree in Electronic Engineering from Politecnico di Torino in 2015, cum laude. The same year, he also received a magna cum laude MS in electrical and computer engineering from the University of Illinois at Chicago. In 2016 he was also awarded the MS in Electronics by Politecnico di Milano, cum laude. In 2019 he obtained the PhD in Physics at Politecnico di Torino, cum laude, with the thesis “From Statistical Physics to Algorithms in Deep Neural Systems”. His principal interests include compression, sparsification, pruning and watermarking of deep neural networks, deep learning for medical imaging, privacy-aware learning, data debiasing, regularization for deep learning and neural network growing. His expertise centers on efficient deep learning, with articles published in top conferences and journals in the field.

Attilio Fiandrotti, attilio.fiandrotti@unito.it

Attilio Fiandrotti is an assistant professor at Università di Torino, Department of Computer Science; he also holds a position as associate professor at Télécom Paris, Institut Polytechnique de Paris. He received his PhD in computer science from Politecnico di Torino in 2010 after a visiting period at EPFL Lausanne in 2009. His current research interests include deep-learning-based methods for video compression and methods for learning sparse neural network topologies for embedded applications. His research interests also include network-coding-based cooperative video delivery and robust video streaming over wireless networks. He is the author of more than 50 publications in IEEE top conferences (ICIP, ICASSP, MSSP, ICME) and journals (TMM, TIP, TIFS, TRSGS) and is also a regular reviewer for the same venues. He is the inventor of a number of patent applications and has provided several contributions to the ISO/IEC MPEG standardization activities.

Andrea Bragagnolo, andrea.bragagnolo@unito.it

Andrea Bragagnolo received his Master’s Degree in Computer Science from the University of Turin, Italy, in 2019. He is currently a Ph.D. student at the University of Turin and is employed by Synesthesia, where he works on the development of smart systems for retail workers. His current research interests include pruning and simplification of neural networks and synthetic datasets for object detection.

Iacopo Colonnelli, iacopo.colonnelli@unito.it

Iacopo Colonnelli is a Ph.D. student in Modeling and Data Science at Università di Torino. He received his master’s degree in Computer Engineering from Politecnico di Torino with a thesis on a high-performance parallel tracking algorithm for the ALICE experiment at CERN. His research focuses on both statistical and computational aspects of data analysis at large scale and on workflow modeling and management in heterogeneous distributed architectures.

Marco Grangetto, marco.grangetto@unito.it

Marco Grangetto received the M.S. degree in electrical engineering and the Ph.D. degree from the Politecnico di Torino, Turin, Italy, in 1999 and 2003, respectively. He is currently a Full Professor with the Department of Computer Science, Università di Torino, Turin, where he coordinates research in the area of image processing and computer vision. His research interests are in the fields of multimedia signal processing and networking. His expertise includes wavelets, image and video coding, data compression, video error concealment, error-resilient video coding, computer vision, and biomedical image processing. Dr. Grangetto is a member of Moving Picture, Audio and Data Coding by Artificial Intelligence (MPAI). He was the recipient of the Premio Optime by the Unione Industriale di Torino in September 2000 and of a Fulbright Grant in 2001 for a research period with the Department of Electrical and Computer Engineering, University of California, San Diego, CA, USA. He participated in the ISO standardization activities on Part 11 of the JPEG2000 standard. He was an Associate Editor of the IEEE TRANSACTIONS ON COMMUNICATIONS and IEEE TRANSACTIONS ON MULTIMEDIA.