Computer Architecture Today

Informing the broad computing community about current activities, advances and future directions in computer architecture.
Archive of posts tagged: Errors
Silent Data Corruption at Scale

Hyperscalers are reporting frequent silent data corruptions (SDCs), also known as silent errors or corrupt execution errors (CEEs), in their cloud fleets, caused by silicon manufacturing defects. Notably, SDCs at scale exhibit error occurrence rates on the order of one fault...

Engineering Reliable Persistence

Integrating non-volatile main memories (NVMMs) into the storage/memory hierarchy makes data integrity a critical design consideration. Protecting data in NVMM is a complex problem: media errors and software bugs can corrupt data and the reliability of each memory...
