## Design Principles

- Perform operations as code
- Make frequent, small, reversible changes
- Refine operations procedures frequently
- Anticipate failure
- Learn from all operational failures
- Use managed services
- Implement observability for actionable insights

## Best Practice Areas

### Organization

- **Understand priorities:** Analyze customer needs, internal governance, and external factors
- **Empower teams:** Define ownership, responsibilities, and communication channels
- **Invest in teams:** Provide support, resources, and learning opportunities

### Prepare

- **Design for observability:** Monitor data (metrics, logs, traces) across components
- **Streamline deployments:** Make small, reversible changes and leverage CI/CD
- **Ensure operational readiness:** Know the risks and use checklists, procedures, and contingency plans

### Operate

- **Focus on meaningful data:** Track key signals and fix performance issues quickly
- **Measure success by outcomes:** Build, measure, and learn against stakeholder needs
- **Manage events effectively:** Plan for any disruption and respond clearly, based on impact

### Evolve

- **Continuous improvement:** Refine continuously through feedback loops and safe experiments
- **Automate for efficiency:** Free up the team by automating routine tasks and using code
- **Foster a culture of learning:** Encourage collaboration, communication, and the celebration of innovation

## References

[Operational Excellence White Paper](https://docs.aws.amazon.com/wellarchitected/latest/operational-excellence-pillar/welcome.html)