The growing popularity of powerful mobile devices has resulted in exponential growth in demand for multimedia applications on these devices. Because they are computation intensive, these complex multimedia applications access embedded memory very frequently and are highly memory dependent. Embedded static random access memories (SRAMs) therefore consume a large amount of power, which limits the battery lifetime of mobile devices. We present low-power techniques for mobile multimedia applications. As an example, by exploiting the nature of pixel data, we achieve reliable operation at 0.36 V under process variation and the NBTI aging effect. The developed memory achieves a 95% reduction in power consumption with no significant degradation in frame quality.
(Architecture/Circuit) Low Power Tri-modal Register Files Design
Modern microprocessors employ register files (RFs) to enhance performance while achieving instruction-level parallelism. However, RFs consume considerable power because they are accessed so frequently. Meanwhile, as technology scales, bias temperature instability (NBTI/PBTI) has become a major reliability concern for RF designers. We present a circuit-architecture co-design technique for power-efficient and reliable tri-modal register files that exploits register activity to improve power efficiency. To meet the design constraints of diverse applications, we develop four possible implementations that trade off power, speed, and design complexity, providing greater design flexibility.
(Circuit) Power Efficient and High Performance Bit-Line for On-chip Memories
In modern microprocessors, the register file read stage is usually on the critical path, and its read latency limits the maximum achievable operating frequency. Accordingly, wide fan-in dynamic circuits are used for the local bit lines (LBL) and global bit lines (GBL) in register files to speed up the read operation. However, there are two main challenges in designing LBL: (1) the bit line structure accounts for up to 70% of the dynamic power consumption of register files, so the power efficiency of LBL has become a great concern; (2) the aggressive scaling of CMOS technology, along with increasing levels of process variation, has adversely impacted the yield of LBL. We implement a clock-delay-unit-combined local bit line (CB-LBL) to enable high read access speed and energy-efficient operation for high-performance register files, achieving 99.8% parametric yield.
(Circuit) Sleep Vector Selection for On-chip Memories
The dual threshold voltage technique is widely applied in dynamic OR circuits to achieve low leakage in register file (RF) design, but its effectiveness is significantly influenced by the sleep vector selected for standby mode. As technology scales into the deep nanometer era, sleep vector selection in dual threshold voltage dynamic OR (DV-OR) circuits becomes challenging due to the impact of process, supply voltage, and temperature (PVT) variations. By analyzing the relationship between PVT variations, leakage characteristics, and sleep vectors in register files, we conduct a comprehensive study of sleep vector selection and develop guidelines for achieving low-leakage, robust register files in modern processors.
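To illustrate why PVT variation complicates the choice, the sketch below picks a sleep vector from a leakage table in two ways: by nominal-corner leakage only, and by worst-case leakage across corners. The table values, the vector names, the corner names, and both function names are hypothetical placeholders chosen for illustration; they are not data or guidelines from the study.

```python
# Hypothetical leakage table in arbitrary units: LEAKAGE[vector][corner].
# All numbers are illustrative placeholders, not measured results.
LEAKAGE = {
    "all_inputs_low":  {"typical": 1.0, "fast_hot": 4.0, "slow_cold": 0.4},
    "all_inputs_high": {"typical": 1.2, "fast_hot": 2.5, "slow_cold": 0.6},
    "mixed":           {"typical": 1.1, "fast_hot": 3.1, "slow_cold": 0.5},
}


def pick_nominal_vector(leakage, corner="typical"):
    """Return the vector with the lowest leakage at a single corner."""
    return min(leakage, key=lambda vec: leakage[vec][corner])


def pick_sleep_vector(leakage):
    """Return the vector whose worst-case leakage over all corners is lowest."""
    return min(leakage, key=lambda vec: max(leakage[vec].values()))


if __name__ == "__main__":
    print("nominal-only choice:", pick_nominal_vector(LEAKAGE))
    print("PVT-aware choice:   ", pick_sleep_vector(LEAKAGE))
```

With these placeholder numbers the two criteria select different vectors, which is the kind of corner dependence that makes a single nominal analysis insufficient. A min-max criterion is only one possible selection rule; a real flow might weight corners by likelihood instead.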
(Circuit/Device) Reliable Register Files
With continued technology scaling, negative bias temperature instability (NBTI) has become one of the major reliability challenges in modern processors. This aging effect is further exacerbated in register files (RFs) for two reasons: (1) the RF is a hot spot in modern processors, and the NBTI effect increases exponentially with temperature; (2) since RFs are accessed very frequently, corrupted data in an RF can easily propagate to other parts of the microprocessor. By exploiting the long-term reliability characteristics of CMOS devices, we develop a hybrid-cell register file that achieves high reliability by storing the most vulnerable bits in robust 8T cells and the other bits in conventional 6T cells. Based on a 32 nm CMOS process, our design improves register file reliability by 11.4% and 24.8% in high-performance and embedded systems, respectively, with negligible overhead.
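The hybrid-cell idea can be sketched as a bit-to-cell assignment. The abstract does not say how the most vulnerable bits are identified, so the example below assumes a simple proxy: a bit position that holds the same value most of the time stresses one side of its cell unevenly, and the positions with the most skewed zero probability receive the 8T cells. The function names, the `robust_budget` parameter, and the sample values are all hypothetical.

```python
def zero_probability(values, width):
    """Fraction of samples in which each bit position holds 0 (index 0 = LSB)."""
    n = len(values)
    return [sum(1 for v in values if not (v >> b) & 1) / n for b in range(width)]


def assign_cells(values, width, robust_budget):
    """Map each bit position to an '8T' or '6T' cell.

    ASSUMPTION: vulnerability is approximated by |P(bit == 0) - 0.5|,
    i.e. how skewed the stored value is. This metric is illustrative
    and is not taken from the design described above.
    """
    p0 = zero_probability(values, width)
    skew = [abs(p - 0.5) for p in p0]
    # Rank bit positions from most to least skewed; ties keep LSB-first order.
    ranked = sorted(range(width), key=lambda b: skew[b], reverse=True)
    robust = set(ranked[:robust_budget])
    return ["8T" if b in robust else "6T" for b in range(width)]


if __name__ == "__main__":
    # Hypothetical trace of small register values: the upper bits are always 0.
    trace = [3, 5, 6, 9]
    print(assign_cells(trace, width=8, robust_budget=4))
```

For this small trace the four upper bit positions are always zero, so they are the ones mapped to 8T cells under the assumed metric.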