In this episode of Google's site reliability engineering podcast, host Steve McGhee and co-host Matt Siegler interview Pete Pellerzi, a distinguished engineer on Google's construction team, about the physical infrastructure of Google's data centers. Pete discusses the scale of data center operations, emphasizing Google's community-oriented approach to building campuses and the importance of planning for multiple buildings from the start. The conversation then turns to incident management, highlighting how Google relies on cooperative adaptation during failures such as the countrywide power outage in Chile. Pete also shares advice on building resilience in smaller organizations: find trusted partners and develop business continuity plans. The discussion shifts to next-generation technology, focusing on the increasing density of chips and the adoption of liquid cooling, as well as the use of AI and machine learning to optimize data center cooling plants. The episode concludes with a discussion of the MTTR and MTBF metrics, emphasizing the importance of availability and Google's unique position to leverage its large installed base, work closely with manufacturers, and implement fault-tolerant designs.
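To make the link between MTTR, MTBF, and availability concrete, here is a minimal sketch using the standard steady-state formula, availability = MTBF / (MTBF + MTTR). The formula is general reliability-engineering practice rather than something quoted from the episode, and the function names and sample figures are hypothetical, chosen only for illustration.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability: the fraction of time a system is up.

    mtbf_hours: mean time between failures (average uptime per failure cycle)
    mttr_hours: mean time to repair (average downtime per failure cycle)
    """
    if mtbf_hours <= 0 or mttr_hours < 0:
        raise ValueError("MTBF must be positive and MTTR non-negative")
    return mtbf_hours / (mtbf_hours + mttr_hours)


def annual_downtime_hours(avail: float, hours_per_year: float = 8760.0) -> float:
    """Expected downtime per (non-leap) year for a given availability."""
    return (1.0 - avail) * hours_per_year


if __name__ == "__main__":
    # Illustrative numbers only: one failure every 999 hours, 1 hour to repair.
    a = availability(mtbf_hours=999.0, mttr_hours=1.0)
    print(f"availability: {a:.4f}")
    print(f"expected downtime per year: {annual_downtime_hours(a):.2f} hours")
```

The formula shows why both metrics matter: availability improves either by failing less often (higher MTBF) or by recovering faster (lower MTTR).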