Journal article
Frontiers in Nanotechnology, 2023
Research Fellow
+44 (0) 7950676030
School of Engineering
1.24D Murchison House, King's Buildings Campus, Edinburgh, EH9 3BF, UK
APA
Agwa, S. O., & Prodromakis, T. (2023). Digital in-memory stochastic computing architecture for vector-matrix multiplication. Frontiers in Nanotechnology.
Chicago/Turabian
Agwa, Shady O., and T. Prodromakis. “Digital In-Memory Stochastic Computing Architecture for Vector-Matrix Multiplication.” Frontiers in Nanotechnology (2023).
MLA
Agwa, Shady O., and T. Prodromakis. “Digital In-Memory Stochastic Computing Architecture for Vector-Matrix Multiplication.” Frontiers in Nanotechnology, 2023.
BibTeX
@article{shady2023a,
title = {Digital in-memory stochastic computing architecture for vector-matrix multiplication},
year = {2023},
journal = {Frontiers in Nanotechnology},
author = {Agwa, Shady O. and Prodromakis, T.}
}
Applications of artificial intelligence currently dominate the technology landscape. Meanwhile, conventional Von Neumann architectures are struggling with the data-movement bottleneck as they try to meet the ever-increasing performance demands of these data-centric applications. Moreover, the cost of vector-matrix multiplication in the binary domain is a major computational bottleneck for these applications. This paper introduces a novel digital in-memory stochastic computing architecture that leverages the simplicity of stochastic computing for in-memory vector-matrix multiplication. The proposed architecture incorporates several new approaches, including a new stochastic number generator with ideal binary-to-stochastic mapping, a best-seeding approach for accurate-enough low stochastic bit-precisions, a hybrid stochastic-binary accumulation approach for vector-matrix multiplication, and the conversion of conventional memory read operations into on-the-fly stochastic multiplication operations with negligible overhead. Thanks to the combination of these approaches, the accuracy analysis of the vector-matrix multiplication benchmark shows that scaling down the stochastic bit-precision from 16-bit to 4-bit achieves nearly the same average error (less than 3%). The derived analytical model of the proposed in-memory stochastic computing architecture demonstrates that the 4-bit stochastic architecture achieves the highest throughput per sub-array (122 Ops/Cycle), which is better than the 16-bit stochastic precision by 4.36x, while still maintaining a small average error of 2.25%.
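To illustrate the core idea behind the abstract, the sketch below shows textbook unipolar stochastic computing: values in [0, 1] are encoded as random bitstreams whose probability of a 1 equals the value, multiplication reduces to a bitwise AND, and partial products are accumulated in the binary domain (the "hybrid stochastic-binary accumulation" style the paper describes). This is a generic, hypothetical sketch for intuition only; the function names are illustrative, and it does not reproduce the paper's specific stochastic number generator, seeding scheme, or in-memory implementation.

```python
import random


def to_stochastic(value, length, rng):
    """Encode a value in [0, 1] as a bitstream with P(bit = 1) = value."""
    return [1 if rng.random() < value else 0 for _ in range(length)]


def from_stochastic(bits):
    """Decode a unipolar bitstream back to a probability estimate."""
    return sum(bits) / len(bits)


def stochastic_multiply(a, b, length=4096, seed=0):
    """Multiply two values in [0, 1] by AND-ing independent bitstreams."""
    rng_a = random.Random(seed)
    rng_b = random.Random(seed + 1)  # streams must be uncorrelated
    sa = to_stochastic(a, length, rng_a)
    sb = to_stochastic(b, length, rng_b)
    return from_stochastic([x & y for x, y in zip(sa, sb)])


def stochastic_vmm(vec, mat, length=4096, seed=0):
    """Vector-matrix multiply: stochastic products, binary accumulation."""
    cols = len(mat[0])
    out = []
    for col in range(cols):
        acc = 0.0  # accumulate partial products in the binary domain
        for row, v in enumerate(vec):
            acc += stochastic_multiply(
                v, mat[row][col], length, seed=seed + 2 * (row * cols + col)
            )
        out.append(acc)
    return out
```

Longer bitstreams trade latency for accuracy, which is why the paper's finding that a 4-bit stochastic precision matches 16-bit accuracy (under its best-seeding approach) matters for throughput.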