Uptime Escalation Matrix #71

Open
opened 2024-07-02 19:08:39 +00:00 by mik-tf · 2 comments
Owner

Todo

We need an escalation matrix to ensure 99.9% uptime for TF.
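
To make the 99.9% target concrete, here is a minimal sketch of the downtime budget it implies. The function name and the periods chosen (day, 30 days, 365 days) are illustrative assumptions, not something defined in this issue.

```python
def downtime_budget_minutes(uptime_target: float, period_hours: float) -> float:
    """Return the downtime allowed, in minutes, for an uptime target over a period."""
    return (1.0 - uptime_target) * period_hours * 60.0


TARGET = 0.999  # the 99.9% uptime target named in this issue

# Illustrative periods (assumption): one day, a 30-day month, a 365-day year.
per_day = downtime_budget_minutes(TARGET, 24)
per_30_days = downtime_budget_minutes(TARGET, 24 * 30)
per_year = downtime_budget_minutes(TARGET, 24 * 365)

print(f"per day:     {per_day:.2f} min")
print(f"per 30 days: {per_30_days:.1f} min")
print(f"per year:    {per_year / 60:.2f} h")
```

At 99.9% the budget is roughly 1.44 minutes per day, 43.2 minutes per 30 days, and 8.76 hours per year, which is why escalation needs to reach someone at any hour.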

Parameters

  • We need 24/7 escalation coverage from the team (this requires setting up shifts)
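
One way to check that a shift plan leaves no gap in the day is sketched below. The shift names and hours are hypothetical placeholders, since no shift plan has been agreed in this issue; only the coverage check itself is the point.

```python
from typing import Dict, List, Tuple

# Hypothetical shifts (assumption): (start_hour, end_hour) in UTC.
# An end hour that is not greater than the start hour wraps past midnight.
SHIFTS: Dict[str, Tuple[int, int]] = {
    "shift_a": (0, 8),
    "shift_b": (8, 16),
    "shift_c": (16, 24),
}


def hours_covered(start: int, end: int) -> List[int]:
    """List the hours of the day (0-23) that a shift covers."""
    if end > start:
        return list(range(start, end))
    # Wrap-around shift, e.g. 22 -> 6 covers 22, 23, 0, ..., 5.
    return list(range(start, 24)) + list(range(0, end))


def uncovered_hours(shifts: Dict[str, Tuple[int, int]]) -> List[int]:
    """Return the hours of the day that no shift covers."""
    covered = set()
    for start, end in shifts.values():
        covered.update(hours_covered(start, end))
    return [hour for hour in range(24) if hour not in covered]


def on_shift(shifts: Dict[str, Tuple[int, int]], hour: int) -> List[str]:
    """Return the names of the shifts active at a given hour."""
    return [name for name, (s, e) in shifts.items() if hour in hours_covered(s, e)]


gaps = uncovered_hours(SHIFTS)
print("uncovered hours:", gaps if gaps else "none")
```

The same check can be run per weekday if weekend shifts differ from weekday ones.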

References

  • We already have an escalation matrix defined in the private tfgrid repo: https://git.ourworld.tf/tfgrid/info_tfgrid_private/src/branch/development/collections/procs_and_docs/escalation_matrix.md
  • Adapt that matrix into a 99.9% uptime escalation matrix

How to Use the Matrix

  • If something is urgent enough -> escalate to the people in the matrix
mik-tf changed title from TF 99.9% Uptime to Uptime Escalation Matrix 2024-07-02 19:09:10 +00:00
mik-tf added this to the (deleted) project 2024-07-02 19:09:14 +00:00
despiegk modified the project from (deleted) to tfgrid_3_17 2024-07-03 12:52:06 +00:00
despiegk modified the project from tfgrid_3_17 to (deleted) 2024-07-03 12:52:11 +00:00
despiegk modified the project from (deleted) to tfgrid_3_17 2024-07-03 12:52:29 +00:00
Owner

We could train the support team and add them to the monitoring group (which needs to be optimized). The support team could be split up to provide 24/7 coverage. Once an issue is confirmed, they can get in touch with the person on call. Let's discuss.

Author
Owner

That's perfect. I like that plan. Yes, let's discuss and make it work. We can coordinate with Sherwin for this phase.

At this point we will have clear documentation on the monitoring system (e.g. alerta.io, with text + video guides).

despiegk added the Story label 2024-07-28 07:58:11 +00:00
Reference: tfgrid/circle_engineering#71