## Design Principles
- Perform operations as code
- Make frequent, small, reversible changes
- Refine operations procedures frequently
- Anticipate failure
- Learn from all operational failures
- Use managed services
- Implement observability for actionable insights
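As a rough illustration of "perform operations as code", "make frequent, small, reversible changes", and "anticipate failure", the sketch below models a runbook as a list of steps, each paired with its own undo. The names `Step` and `run_runbook` are hypothetical and do not come from any AWS service or SDK.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    """One small, reversible change: an action paired with its undo (hypothetical type)."""
    name: str
    apply: Callable[[], None]
    rollback: Callable[[], None]


def run_runbook(steps: List[Step]) -> bool:
    """Apply steps in order; if one fails, undo the completed steps in reverse order."""
    done: List[Step] = []
    for step in steps:
        try:
            step.apply()
            done.append(step)
        except Exception:
            # Anticipate failure: restore the previous state before giving up.
            for completed in reversed(done):
                completed.rollback()
            return False
    return True
```

In this sketch a failing step leaves the system as it was before the runbook started, which is what keeps each change cheap to attempt.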
## Best Practice Areas
### Organization
- **Understand priorities:** Analyze customer needs, internal governance, and external factors
- **Empower teams:** Define ownership, responsibilities, and communication channels
- **Invest in teams:** Provide support, resources, and learning opportunities
### Prepare
- **Design for observability:** Collect metrics, logs, and traces across components
- **Streamline deployments:** Make small, reversible changes and use CI/CD
- **Ensure operational readiness:** Know the risks and use checklists, procedures, and contingency plans
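One possible way to design for observability is to have every operation emit a structured record that carries a metric (duration), a log line, and a trace identifier together. The sketch below uses only the Python standard library; the field names and the helpers `make_event` and `timed` are assumptions for illustration, not a prescribed schema.

```python
import json
import time
import uuid


def make_event(component, operation, duration_ms, trace_id, ok):
    """Serialize one telemetry record as a JSON log line (assumed field names)."""
    return json.dumps(
        {
            "timestamp": time.time(),
            "component": component,
            "operation": operation,
            "duration_ms": duration_ms,
            "trace_id": trace_id,
            "ok": ok,
        },
        sort_keys=True,
    )


def timed(component, operation, fn, trace_id=None):
    """Run fn, then print a record with its duration and outcome, even on failure."""
    trace_id = trace_id or uuid.uuid4().hex
    start = time.perf_counter()
    ok = True
    try:
        return fn()
    except Exception:
        ok = False
        raise
    finally:
        elapsed_ms = (time.perf_counter() - start) * 1000
        print(make_event(component, operation, elapsed_ms, trace_id, ok))
```

Passing the same `trace_id` through calls in different components is what would let the records be joined into a single trace later.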
### Operate
- **Focus on meaningful data:** Track key signals and fix performance issues quickly
- **Measure success by outcomes:** Build, measure, and learn against stakeholder needs
- **Manage events effectively:** Plan for disruptions and respond clearly based on impact
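Responding "based on impact" could be encoded as an explicit rule that maps key signals to a response. The sketch below is one minimal way to do that; the thresholds, the `Severity` levels, and the `classify` function are hypothetical and would need to be set from real stakeholder needs.

```python
from enum import Enum


class Severity(Enum):
    LOW = "log and review later"
    MEDIUM = "notify on-call"
    HIGH = "page on-call and open an incident"


def classify(error_rate: float, customers_affected: int) -> Severity:
    """Map two key signals to a response level (thresholds are illustrative only)."""
    if error_rate >= 0.05 or customers_affected >= 1000:
        return Severity.HIGH
    if error_rate >= 0.01 or customers_affected >= 10:
        return Severity.MEDIUM
    return Severity.LOW
```

Keeping the rule in code makes it reviewable and testable, so the response to an event does not depend on who happens to be on call.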
### Evolve
- **Improve continuously:** Refine operations through feedback loops and safe experiments
- **Automate for efficiency:** Free up the team by automating routine tasks in code
- **Foster a culture of learning:** Encourage collaboration, communication, and celebrating innovation
## References
[Operational Excellence White Paper](https://docs.aws.amazon.com/wellarchitected/latest/operational-excellence-pillar/welcome.html)