====== Albert Gural ======
BSEE, California Institute of Technology, 2016
MSEE, Stanford University, 2018
**Email:** agural (AT) stanford (DOT) edu
$[hdcolor #8c1515$]
====== Hardware-Algorithm Co-design for Emerging Machine Learning Accelerators ======
$[/hdcolor$]
Deep neural networks (DNNs) have recently seen a resurgence in popularity due to the increased availability of data and capability of compute. These modern advancements allow DNNs to tackle previously intractable real-world decision problems. To continue enabling these advancements - and to enable them in practice, such as inference on edge devices - we need to continue targeting improvements to the underlying compute capabilities. However, rather than focus on compute hardware in isolation from the algorithmic applications, we co-design the hardware and the algorithms together, tailoring each to the constraints of the other across the application regimes described below.
For applications involving small microcontrollers, on-chip memory is typically the binding constraint, so inference must be scheduled to keep peak activation storage within the limited available SRAM while preserving classification accuracy [1].
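As a first-order illustration of why scheduling matters on such devices, the sketch below compares the peak activation memory of plain layer-by-layer inference against a depth-first schedule that streams through the network with small row buffers. The helper functions, layer shapes, and buffer depth are made-up examples, not the specific memory-optimal schedule derived in [1].

<code python>
# Rough back-of-the-envelope comparison (illustrative shapes, not the scheme
# from [1]): peak activation memory of layer-by-layer inference versus a
# depth-first schedule that only buffers a few rows of each feature map.

def layer_by_layer_peak(shapes):
    """Peak storage when a layer's full input and output maps coexist."""
    return max(hi * wi * ci + ho * wo * co
               for (hi, wi, ci), (ho, wo, co) in zip(shapes, shapes[1:]))

def row_buffer_peak(shapes, rows=3):
    """Depth-first schedule: hold only ~'rows' rows of every feature map."""
    return sum(rows * w * c for (_, w, c) in shapes)

# (height, width, channels) for the input and each layer output of a toy CNN
shapes = [(32, 32, 3), (32, 32, 16), (16, 16, 32), (8, 8, 64)]
print(layer_by_layer_peak(shapes))  # 24576 values at the worst layer pair
print(row_buffer_peak(shapes))      # 4896 values with 3-row buffers
</code>

Even for this toy network, buffering a few rows per layer instead of whole feature maps cuts peak activation storage by roughly 5x, which is the kind of headroom that makes DNN inference fit on a microcontroller at all.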
For latency-critical applications, low-precision fixed-point inference reduces both compute and memory-bandwidth costs, but accuracy degrades unless the quantization ranges are chosen carefully; training the quantization thresholds alongside the network weights helps retain accuracy at low bit widths [2].
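To make the quantization point concrete, here is a minimal NumPy sketch of symmetric fixed-point (fake) quantization governed by a clipping threshold. In [2] the thresholds are learned during training rather than hand-picked; the ''fake_quantize'' helper, the 3-sigma threshold, and the bit width below are illustrative assumptions only.

<code python>
import numpy as np

def fake_quantize(x, threshold, bits=8):
    """Symmetric uniform 'fake' quantization: clip to [-t, t], round to integer
    levels, then rescale to float so downstream layers see the rounding error."""
    t = abs(threshold)
    qmax = 2 ** (bits - 1) - 1                 # e.g. 127 for signed 8-bit
    scale = t / qmax
    q = np.clip(np.round(x / scale), -qmax, qmax)
    return q * scale

rng = np.random.default_rng(0)
w = rng.standard_normal((64, 64)).astype(np.float32)
w_q = fake_quantize(w, threshold=3.0 * w.std(), bits=8)  # hand-picked threshold
print("max |w - w_q|:", float(np.abs(w - w_q).max()))
</code>

The threshold controls a trade-off: a tight threshold clips large weights, while a loose one wastes resolution on the bulk of small weights, which is why learning it from data is attractive.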
For edge applications with large DNNs, DNN weight movement begins to dominate energy costs. In-memory compute (IMC) offers an elegant solution to the problem by requiring nearly no weight movement - do computations where the weights are stored. However, as DNN sizes grow, the chip area required to store these weights becomes a problem. One potential solution is to use emerging nonvolatile memory (NVM) such as resistive RAM (RRAM), which promises high spatial density. To use RRAM, however, we need to understand its non-idealities and their effects on DNN accelerators designed around them.
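One simple way to build intuition for such non-idealities is to inject device-to-device conductance variation into a simulated in-memory matrix-vector multiply and watch how the output error grows. The noise model, its magnitude, and the signed-weight mapping in the sketch below are illustrative assumptions, not measured RRAM behavior.

<code python>
import numpy as np

# First-order feel for RRAM non-ideality: multiplicative conductance variation
# applied to an in-memory matrix-vector multiply (MVM). The lognormal model and
# sigma are assumptions, and the signed-weight mapping is simplified (real
# arrays typically use differential cell pairs).

rng = np.random.default_rng(0)
W = rng.standard_normal((128, 256)) / 16       # "ideal" programmed weights
x = rng.standard_normal(256)                   # input activations

sigma = 0.1                                    # assumed device-to-device spread
W_rram = W * rng.lognormal(mean=0.0, sigma=sigma, size=W.shape)

y_ideal = W @ x
y_rram = W_rram @ x
rel_err = np.linalg.norm(y_rram - y_ideal) / np.linalg.norm(y_ideal)
print(f"relative MVM error at sigma={sigma}: {rel_err:.3f}")
</code>

Sweeping the assumed spread in this kind of model gives a quick sense of how much conductance variation a given DNN can tolerate before accuracy-aware training or circuit-level compensation becomes necessary.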
[1] Gural, Albert, and Boris Murmann. "Memory-Optimal Direct Convolutions for Maximizing Classification Accuracy in Embedded Applications." Proceedings of the 36th International Conference on Machine Learning (ICML), 2019.
[2] Jain, Sambhav R., et al. "Trained Quantization Thresholds for Accurate and Efficient Fixed-Point Inference of Deep Neural Networks." Proceedings of Machine Learning and Systems (MLSys), 2020.