Lite-CNN: a high-performance architecture to execute CNNs in low density FPGAs

Bibliographic details
Main author: Véstias, Mário (author)
Other authors: Duarte, Rui (author), De Sousa, Jose (author), Cláudio de Campos Neto, Horácio (author)
Format: conferenceObject
Language: English
Published in: 2018
Subjects:
Full text: http://hdl.handle.net/10400.21/8903
Country: Portugal
Oai: oai:repositorio.ipl.pt:10400.21/8903
Description
Abstract: Due to the computational complexity of Convolutional Neural Networks (CNNs), high-performance platforms are generally considered for their execution. However, CNNs are very useful in embedded systems, and executing them right next to the source of data has many advantages, such as avoiding the need for data communication. In this paper, we propose an architecture for CNN inference (Lite-CNN) that can achieve high performance in low-density FPGAs. Lite-CNN adopts a fixed-point representation for both neurons and weights, which has already been shown to be sufficient for most CNNs. In addition, a simple and well-known reorganization of the dot product halves the number of multiplications. We show implementation results for 8-bit fixed-point on a ZYNQ7020 and extrapolate to other, larger FPGAs. Lite-CNN achieves 410 GOPs on a ZYNQ7020.
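
This record does not say which dot product reorganization the paper uses. One well-known candidate is Winograd's inner-product identity, and the Python sketch below illustrates that identity only as an assumption, not as the authors' implementation. The function names, the sample values, and the even-length requirement are all hypothetical. The idea is that the products of adjacent weights can be precomputed offline and the products of adjacent inputs are shared by every kernel applied to the same window, so each additional dot product costs about n/2 multiplications instead of n.

```python
def dot_direct(x, w):
    """Reference dot product: n multiplications."""
    return sum(a * b for a, b in zip(x, w))


def pair_correction(v):
    """Sum of products of adjacent pairs: v[0]*v[1] + v[2]*v[3] + ...

    For weights this can be computed once offline; for inputs it is
    computed once per window and reused by every kernel.
    """
    return sum(v[2 * j] * v[2 * j + 1] for j in range(len(v) // 2))


def dot_reorganized(x, w, x_corr, w_corr):
    """Dot product using n/2 multiplications plus two precomputed corrections.

    Assumes len(x) == len(w) and that the length is even.
    Uses (x0 + w1) * (x1 + w0) = x0*x1 + x0*w0 + x1*w1 + w0*w1.
    """
    acc = sum(
        (x[2 * j] + w[2 * j + 1]) * (x[2 * j + 1] + w[2 * j])
        for j in range(len(x) // 2)
    )
    return acc - x_corr - w_corr


# Hypothetical 8-bit integer values standing in for fixed-point data.
window = [3, -2, 7, 1]
kernels = [[1, 4, -5, 2], [0, -3, 6, 8]]

x_corr = pair_correction(window)                 # shared across kernels
w_corrs = [pair_correction(k) for k in kernels]  # precomputed per kernel

results = [
    dot_reorganized(window, k, x_corr, wc) for k, wc in zip(kernels, w_corrs)
]
```

The saving only appears when the same input window is reused by many kernels, as in a convolutional layer. For a single isolated dot product the correction terms cancel the gain.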