Service Levels for MOJ Applications

Purpose and Context

The Ministry of Justice operates a diverse portfolio of applications ranging from mission-critical systems supporting courts and prisons to administrative utilities. Historically, investment in resilience, recovery, and continuity has been inconsistent and often based on reactive incident response rather than planned, tiered expectations.

This inconsistency has created several problems:

  • Misaligned investment decisions – teams over-engineer low-value systems and under-resource critical services
  • Unclear recovery expectations – there is no consistent framework for defining RTO/RPO targets across the department
  • Inefficient resource allocation – continuity and disaster recovery planning is ad hoc and often duplicative
  • Compliance and audit risk – there is no clear traceability between business impact and architectural decisions

This Service Level framework establishes a five-tier model of business criticality with aligned Recovery Time Objectives (RTO), Recovery Point Objectives (RPO), and architectural expectations. By doing so, we achieve:

  • Consistent architectural decisions across all MOJ applications based on business impact
  • Predictable recovery and continuity capabilities aligned to operational need
  • Efficient resource allocation by right-sizing investment to business criticality
  • Clear accountability for continuity planning and resilience engineering
  • Reduced operational risk by ensuring critical systems meet appropriate availability standards

This framework applies to all applications in the MOJ portfolio, from citizen-facing services and operational systems to internal administrative tools. New systems must be classified during design; existing systems should be re-evaluated during their next major change or upgrade.

The model is grounded in industry best practice:

  • ISO 22301:2019 (Business Continuity Management Systems)
  • ISO 22313:2020 (Business Continuity Management Systems – Guidance on the Use of ISO 22301)
  • NIST SP 800-34 Rev. 1 (Contingency Planning)
  • ServiceNow Recovery Tier Guidance (widely adopted cloud recovery models)

| Level | Description | Typical RTO | Typical RPO | Architectural expectation |
| --- | --- | --- | --- | --- |
| 1: Mission Critical | Immediate harm if unavailable | Minutes | Near-zero | Multi-region active-active; continuous replication |
| 2: Business Critical | Material operational disruption | 4–8 hours | <1 hour | Warm standby; rapid failover |
| 3: Operational | Workarounds available | <24 hours | <24 hours | Cold/warm DR; daily backups |
| 4: Administrative | Low operational impact | 1–3 days | 1–2 days | Standard backup and restore |
| 5: Non-Critical | Minimal impact | Days–weeks | 2+ days | Fix-on-failure |
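
As an illustrative sketch only, the five levels could be encoded as data so that a design's achieved recovery figures can be checked against a level. The names `ServiceLevel`, `SERVICE_LEVELS`, `meets_level` and `strictest_level_met` are hypothetical and not part of any MOJ tooling. The upper bounds come from the per-level parameters on this page and are treated as inclusive. The one-minute stand-in for "near-zero" is an assumption made purely so the value can be compared.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import Optional


@dataclass(frozen=True)
class ServiceLevel:
    level: int
    name: str
    rto_max: Optional[timedelta]  # None means the page gives no fixed upper bound
    rpo_max: Optional[timedelta]  # None means the page gives no fixed upper bound
    architecture: str


# Assumption: "near-zero" RPO is modelled as one minute so it can be compared.
NEAR_ZERO = timedelta(minutes=1)

SERVICE_LEVELS = {
    1: ServiceLevel(1, "Mission Critical", timedelta(minutes=15), NEAR_ZERO,
                    "Multi-region active-active; continuous replication"),
    2: ServiceLevel(2, "Business Critical", timedelta(hours=8), timedelta(hours=1),
                    "Warm standby; rapid failover"),
    3: ServiceLevel(3, "Operational", timedelta(hours=24), timedelta(hours=24),
                    "Cold/warm DR; daily backups"),
    4: ServiceLevel(4, "Administrative", timedelta(hours=72), timedelta(hours=48),
                    "Standard backup and restore"),
    5: ServiceLevel(5, "Non-Critical", None, None, "Fix-on-failure"),
}


def meets_level(level: int, rto: timedelta, rpo: timedelta) -> bool:
    """Return True if the achieved RTO and RPO fall within the level's upper bounds."""
    target = SERVICE_LEVELS[level]
    rto_ok = target.rto_max is None or rto <= target.rto_max
    rpo_ok = target.rpo_max is None or rpo <= target.rpo_max
    return rto_ok and rpo_ok


def strictest_level_met(rto: timedelta, rpo: timedelta) -> int:
    """Return the most demanding level (lowest number) whose bounds are met.

    Level 5 has no upper bounds, so a match is always found.
    """
    return next(level for level in sorted(SERVICE_LEVELS)
                if meets_level(level, rto, rpo))
```

Under these assumptions, a design that recovers within 6 hours and loses at most 30 minutes of data satisfies Level 2 but not Level 1: `strictest_level_met(timedelta(hours=6), timedelta(minutes=30))` returns `2`.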

Level 1 — Mission Critical

Systems whose loss results in immediate and unacceptable harm to life, liberty, security, financial obligations, or statutory duties. These systems typically underpin core operational services where no downtime and no data loss are acceptable.

| Parameter | Value |
| --- | --- |
| RTO | 0–15 minutes |
| RPO | Zero or near-zero |

Continuity and DR expectations

  • Active-active architectures, multi-region failover or hot-standby
  • Automated failover, continuous monitoring, and tested DR drills
  • Regular resilience testing including cyber-attack and infrastructure-level scenarios
  • Must align with a very short Maximum Tolerable Period of Disruption (MTPD)

Level 2 — Business Critical

Systems supporting essential services where downtime produces material operational, financial or reputational impact, but not immediate organisational failure.

| Parameter | Value |
| --- | --- |
| RTO | 4–8 hours |
| RPO | 15 minutes to 1 hour |

Continuity and DR expectations

  • Warm standby, high-availability clustering, and replicated storage
  • Daily backups supplemented by sub-hourly backups or asynchronous replication, since daily backups alone cannot meet an RPO of 1 hour or less
  • Annual DR exercises; scenario testing recommended
  • Clear manual fallback processes to maintain partial operations
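
The relationship between backup frequency and RPO can be sketched with simple arithmetic. The example below is a minimal illustration under a simplifying assumption: a failure just before the next backup loses one full interval of changes, and where replication is in place only the changes not yet replicated are lost. The function names and the 5-minute replication lag are hypothetical; the 1-hour bound is the upper end of the Level 2 RPO range.

```python
from datetime import timedelta
from typing import Optional

# Upper end of the Level 2 RPO range stated on this page.
LEVEL_2_RPO_MAX = timedelta(hours=1)


def worst_case_data_loss(backup_interval: timedelta,
                         replication_lag: Optional[timedelta] = None) -> timedelta:
    """Estimate the largest window of changes that could be lost.

    Assumption: without replication, up to one full backup interval is lost;
    with replication, the loss is the smaller of the lag and the interval.
    """
    if replication_lag is not None:
        return min(backup_interval, replication_lag)
    return backup_interval


def within_rpo(loss: timedelta, rpo_max: timedelta = LEVEL_2_RPO_MAX) -> bool:
    """Return True if the estimated loss does not exceed the RPO bound."""
    return loss <= rpo_max


# Daily backups on their own, then with a hypothetical 5-minute replication lag.
daily_only = worst_case_data_loss(timedelta(days=1))
daily_plus_replication = worst_case_data_loss(timedelta(days=1), timedelta(minutes=5))
half_hourly = worst_case_data_loss(timedelta(minutes=30))
```

In this sketch `within_rpo(daily_only)` is false, while `within_rpo(daily_plus_replication)` and `within_rpo(half_hourly)` are both true, which is why replication or sub-hourly backups matter at this level.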

Level 3 — Operational

Systems supporting day-to-day operations. Temporary outages are tolerable and well-defined workarounds exist.

| Parameter | Value |
| --- | --- |
| RTO | <24 hours |
| RPO | 4–24 hours |

Continuity and DR expectations

  • Daily backups; restore from secondary region or cold standby environment
  • Documented manual operating procedures
  • DR testing twice a year or annually, depending on change velocity

Level 4 — Administrative / Supporting

Internal productivity, administration, and support systems where downtime produces inconvenience but limited operational consequence.

| Parameter | Value |
| --- | --- |
| RTO | 24–72 hours |
| RPO | 24–48 hours |

Continuity and DR expectations

  • Standard backup regimes; recovery from snapshots or cold standby
  • Manual workaround acceptable for several days
  • Periodic review to ensure the system does not become more business-relevant over time

Level 5 — Non-Critical / Low-Impact

Systems with purely convenience-based, informational or peripheral use. Failure does not materially impact organisational outcomes (e.g. internal dashboards with no operational effect).

| Parameter | Value |
| --- | --- |
| RTO | 72 hours to weeks |
| RPO | 48 hours or more |

Continuity and DR expectations

  • Backups at standard intervals
  • Fix-on-failure; no immediate failover required
  • Minimal DR testing; restore as part of BAU backlog

Cross-cutting expectations

Business continuity

  • Each service's classification must link to ISO 22301 Business Impact Analysis (BIA) outputs, including MTPD, RTO, RPO, and the Minimum Business Continuity Objective (MBCO)
  • Each service must define dependencies, upstream/downstream integrations, and manual fallback capability

Disaster recovery

  • Recovery strategies must be justified against cost and risk
  • DR parameters (RTO, RPO, Work Recovery Time (WRT), and Maximum Tolerable Downtime (MTD)) should be documented following the NIST SP 800-34 recommended structure
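
When documenting these parameters it can help to check that they are mutually consistent. The sketch below assumes a common planning convention, not stated on this page, that the time to restore the system (RTO) plus the time to verify data and resume normal work (WRT) should not exceed the MTD. The function names and the example figures are hypothetical.

```python
from datetime import timedelta


def fits_within_mtd(rto: timedelta, wrt: timedelta, mtd: timedelta) -> bool:
    """Return True if restoring (RTO) and resuming work (WRT) fit inside the MTD.

    Assumption: total outage is modelled as RTO + WRT, a common convention.
    """
    return rto + wrt <= mtd


def recovery_headroom(rto: timedelta, wrt: timedelta, mtd: timedelta) -> timedelta:
    """Return the spare time left inside the MTD; negative means it is exceeded."""
    return mtd - (rto + wrt)


# Hypothetical figures for illustration only.
example_rto = timedelta(hours=8)
example_wrt = timedelta(hours=4)
example_mtd = timedelta(hours=24)

consistent = fits_within_mtd(example_rto, example_wrt, example_mtd)
headroom = recovery_headroom(example_rto, example_wrt, example_mtd)
```

With these example figures the parameters are consistent and leave 12 hours of headroom. A negative headroom would suggest that either the recovery strategy or the stated MTD needs revisiting.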

Continuity testing

  • Regular exercises required across all levels, scaling from full failover for Level 1 to table-top for Level 5
  • Plans must be validated regularly to remain effective as systems and priorities change

Architecture alignment

  • Design patterns (HA, DR, geo-redundancy, backup frequency) must be driven by the chosen criticality level
  • Lower RTO → more automation, multi-site redundancy and monitoring
  • Higher RTO → simpler backup/restore approaches

Last reviewed: 28 May 2026. Review status: up to date.
