The main purpose of a DevOps team is to ensure the system is up and running.
This boils down to two things to look for.
Both of these questions can be answered by a single metric: MTTR, the mean time to recover from an incident.
<aside> ℹ️ DevOps teams should track MTTR (Mean time to recover)
</aside>
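As a rough illustration of the arithmetic behind MTTR, the sketch below averages recovery durations over a list of incidents. The function name `mean_time_to_recover`, the `(started, recovered)` tuple layout, and the sample timestamps are assumptions made for this example; a real team would pull these values from its own incident tracker.

```python
from datetime import datetime, timedelta


def mean_time_to_recover(incidents):
    """Average recovery duration over (started, recovered) datetime pairs.

    Hypothetical helper: the input shape is an assumption for this sketch.
    """
    if not incidents:
        raise ValueError("need at least one incident to compute MTTR")
    # Sum the individual recovery durations, starting from a zero timedelta.
    total = sum(
        (recovered - started for started, recovered in incidents),
        timedelta(),
    )
    return total / len(incidents)


# Made-up sample data: one 30-minute and one 90-minute incident.
incidents = [
    (datetime(2024, 1, 3, 10, 0), datetime(2024, 1, 3, 10, 30)),
    (datetime(2024, 1, 9, 22, 15), datetime(2024, 1, 9, 23, 45)),
]

mttr = mean_time_to_recover(incidents)
print(mttr)  # 1:00:00
```

Tracking this value per week or per month, rather than as one all-time number, makes it easier to see whether recovery is getting faster.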
There are certainly other metrics that DevOps teams should track.
Of all of these, the DevOps team fully owns only one metric, and that is MTTR. The rest can be affected by other developers or even third parties. The team has less control over them, so improving those metrics will be much harder.