- Computing technology entering a new phase
Computers have evolved over half a century into an essential and indispensable backbone of social information infrastructure. And now, it faces a significant turning point. Although Moore’s Law, which doubles the number of transistors in a chip every two years, has contributed to continuously improving computer performance, we might not be able to expect sustainable transistor shrinking anymore. A promising way to make a breakthrough in solving such a critical issue is to exploit emerging devices, such as non-volatile memories, superconductor devices, and nanophotonic devices, to design novel computing platforms aggressively. What should computer architects pay attention to when exploiting such new devices? A key is to find significant potential that MOS-FETs cannot achieve. It is essential to “maximize the advantages of new devices and hide their shortcomings”. However, in many cases, we do not have enough knowledge and experience with such new devices, materials, physics, etc. Therefore, it is strongly required that we interact with device researchers.
- Sharing my experience: A case for superconductor computing
About 20 years ago, I started computer architecture research targeting superconductor SFQ (Single Flux Quantum) logics that operate in a 4.2-kelvin environment with ultra-high-speed (such as over 50 GHz) ultra-low-power features. When I first heard about SFQ, I was very attracted by its ultra-high-speed nature. However, this was later replaced by disappointment due to its bit-serial operations that greatly lower the computing throughput. At that time, there were several technical reasons why they did not consider bit-parallel circuits, but the fundamental barrier was the existence of a pitfall that bit-parallel circuits do not work with an ultra-high-speed frequency due to their complex, long wires. Through a long engagement with device/circuit researchers, we have been able to introduce a gate-level fine-grained deep pipeline architecture that maximizes its performance but also makes wiring simple. Designers can concentrate timing optimization on gate-to-gate wirings in the gate-level pipelined structure. Although many SFQ researchers disagreed with this design style, we have finally demonstrated real chip designs over 50 GHz with bit-parallel operations. You can enjoy our demo movie on how ultra-fast SFQ chips are designed and measured. Recently, we have discussed exploiting SFQ logics for AI inference acceleration with architectural optimization. The importance of resource re-balancing of PEs and on-chip memory was a critical microarchitectural optimization. BTW, physical chip measurement (not simulation) of such emerging devices, as shown in images (A) and (C), is an exciting process. We do manual wire bonding, and sometimes we fail in this process, as you can see in (B). Although this chip does not work anymore, it gives us valuable experience!
- Sharing my experience: A case for nanophotonic computing
In 2015, I joined a project regarding nanophotonic computing. Our role was to explore AI inference architecture by using nanophotonic devices. It is known that we can implement a vector-by-matrix multiplier (VMM) by using optical components called MZI (Mach-Zehnder Interferometer). At the beginning of this project, the priority goal of the device team was to reduce the size of MZI in order to achieve low-latency device operations. To confirm the impact of device size reduction at a system level, we have performed device modeling and developed an architecture exploration platform. Then, we found an interesting observation. Unexpectedly, if we assume a small size of VMM, the device size reduction does not contribute to power efficiency. This is because the critical path of the optical VMM changes with its size, i.e., the other optical component rather than MZI dominates the system frequency if the VMM size is small. On the other hand, if we consider a large size of VMM, the device size reduction has a significant impact, but in this case, the loss of energy in each MZI has to be reduced significantly because a large number of serially connected MZIs finally lose too much energy to detect the accurate final VMM calculation result. This result suggested that “only device size optimization” is insufficient if we consider architecture-level designs.
- Lessons from past experiences
I believe that realizing emerging device computing, e.g., superconductor computing, optical computing, quantum computing, etc., is an exciting computer architecture research direction. I think computer architects should carefully consider the following three points.
- What was considered optimal/failed in research targeting conventional semiconductors is not necessarily the same for novel devices. Don’t be limited by past precedent!
- It is crucial to systematically search for ways to improve performance (or power efficiency, etc.) in as wide an architectural space as possible and feedback the results to device research. This kind of architecture-device co-design can have significant potential, and guiding the exploration of such kind of co-design is an important role of computer architects.
- Device modeling is the key challenge to achieving such co-design. To validate the accuracy of the models, architects need to discuss with device researchers deeply.
Yes, cross-layer interaction is not an easy task because it usually takes a very long time. In many cases, we use different words (technical terms), definitions/assumptions, metrics, and, of course, have different backgrounds. But let’s overcome this difficulty and find a new frontier of computer architecture!
About the Author: Koji Inoue is a professor in the Department of Advanced Information Technology at Kyushu University, Japan. His research interests include emerging device computing (SFQ, nanophotonic, quantum, etc.), power-aware computing, and IoT system designs.
Disclaimer: These posts are written by individual contributors to share their thoughts on the Computer Architecture Today blog for the benefit of the community. Any views or opinions represented in this blog are personal, belong solely to the blog author and do not represent those of ACM SIGARCH or its parent organization, ACM.