Please use this identifier to cite or link to this item:
https://scholarhub.balamand.edu.lb/handle/uob/7052
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Akkad, Ghattas | en_US |
dc.contributor.author | Mansour, Ali | en_US |
dc.contributor.author | Inaty, Elie | en_US |
dc.date.accessioned | 2023-10-02T07:20:00Z | - |
dc.date.available | 2023-10-02T07:20:00Z | - |
dc.date.issued | 2024-05-11 | - |
dc.identifier.uri | https://scholarhub.balamand.edu.lb/handle/uob/7052 | - |
dc.description.abstract | The exponential increase in generated data and the advances in high-performance computing have paved the way for the use of complex machine learning methods. Indeed, the availability of Graphics Processing Units (GPUs) and Tensor Processing Units (TPUs) has made it possible to train and prototype Deep Neural Networks (DNNs) on large-scale data sets and for a variety of applications, e.g., vision, robotics, and biomedicine. The popularity of these DNNs originates from their efficacy and state-of-the-art inference accuracy. However, this is obtained at the cost of considerably high computational complexity. This drawback makes their implementation on resource-limited edge devices, without a major loss in inference speed and accuracy, a challenging task. To this end, it has become extremely important to design innovative architectures and dedicated accelerators to deploy these DNNs on embedded and reconfigurable processors in a high-performance, low-complexity structure. In this study, we present a survey on recent advances in deep learning accelerators (DLAs) for heterogeneous systems and Reduced Instruction Set Computer (RISC-V) processors, given their open-source nature, accessibility, customizability, and universality. After reading this article, readers should have a comprehensive overview of recent progress in this domain, cutting-edge knowledge of recent embedded machine learning trends, and substantial insights into future research directions and challenges. | en_US |
dc.language.iso | eng | en_US |
dc.subject | Computer architecture | en_US |
dc.subject | Convolutional Neural Network (CNN) | en_US |
dc.subject | Embedded machine learning | en_US |
dc.subject | Field programmable gate arrays | en_US |
dc.subject | Hardware accelerators | en_US |
dc.subject | Pipelines | en_US |
dc.subject | Reduced instruction set computing | en_US |
dc.subject | Registers | en_US |
dc.subject | RISC-V | en_US |
dc.subject | Rockets | en_US |
dc.subject | Surveys | en_US |
dc.subject | Transformers | en_US |
dc.title | Embedded Deep Learning Accelerators: A Survey on Recent Advances | en_US |
dc.type | Journal Article | en_US |
dc.identifier.doi | 10.1109/TAI.2023.3311776 | - |
dc.identifier.scopus | 2-s2.0-85171589455 | - |
dc.identifier.url | https://api.elsevier.com/content/abstract/scopus_id/85171589455 | - |
dc.contributor.affiliation | Department of Computer Engineering | en_US |
dc.description.startpage | 1 | en_US |
dc.description.endpage | 19 | en_US |
dc.date.catalogued | 2023-10-02 | - |
dc.description.status | Published | en_US |
dc.relation.ispartoftext | IEEE Transactions on Artificial Intelligence | en_US |
Appears in Collections: | Department of Computer Engineering |
Scopus citations: 3 (checked on Nov 16, 2024)
Record view(s): 171 (checked on Nov 21, 2024)
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.