%0 Journal Article
%@ 0278-0070
%A Dang, Nam Khanh
%A Ahmed, Akram Ben
%A Abdallah, Abderazek Ben
%A Tran, Xuan Tu
%D 2021
%F SisLab:4430
%I IEEE
%J IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
%T HotCluster: A thermal-aware defect recovery method for Through-Silicon-Vias Towards Reliable 3-D ICs systems
%U https://eprints.uet.vnu.edu.vn/eprints/id/eprint/4430/
%X Through-Silicon Via (TSV) is considered the near-future solution for realizing low-power, high-performance 3D Integrated Circuits (3D-ICs) and 3D Networks-on-Chip (3D-NoCs). However, the lifetime reliability of TSVs is one of the most critical challenges, owing to their fault sensitivity and to the high operating temperature of 3D-ICs, which further accelerates the fault rate. Meanwhile, most current works focus on detecting and correcting TSV defects after manufacturing without considering the impact of high-temperature nodes on lifetime reliability. Besides, recovering defective clusters is also challenging because of costly redundancies. In this work, we present HotCluster: a hotspot-aware self-correction platform for clustering defects in 3D-NoCs to help understand and tackle this problem. We first give a method to predict normalized fault rates and place redundant TSV groups according to each region’s fault rate. In our particular medium fault-rate case (normalized to the coolest area), HotCluster uses about 60% fewer redundancies than uniformly distributed redundancies while keeping a higher ratio of routers working in a normal state. Furthermore, HotCluster integrates both online (weight-based) and offline (max-flow min-cut) mapping algorithms to help the system correct faulty TSV clusters. The experimental results show that both the max-flow min-cut offline method and the weight-based online mode with a redundancy of 0.25 exhibit less than 1% of routers disabled under 50% defect rates.