VCDX Skill Business Continuity Disaster Recovery

Originally posted in September 2015 on vmice.net

Business Continuity is largely self-explanatory: it covers the measures taken to meet a given Service Level Agreement (SLA).

Disaster Recovery is the ability to recover from a disaster. In most cases that means having workloads replicated to a separate site, and these workloads usually have a defined SLA as well.

Here is an example of categories that can be covered:

  • Business Continuity
    • Common SLA times (5×9, 4×9, 3×9, 2×9)
    • Component redundancy (HBAs, memory mirroring, boot drives, CPU availability features (E7 vs E5) etc.)
    • Hardware redundancy (network ports, switches, multiple paths etc.)
    • Features that increase availability (vSphere HA, FT, DB clustering etc.)
    • Backups (agent-based vs image-based (VADP) etc.)
    • Restores (restore tests, security standard requirements)
    • RTO, RPO, WRT, MTD
  • Disaster Recovery
    • Data Mirroring (asynchronous or synchronous)
    • Data Mirroring Transfer Method (FC, Network etc)
    • Automation of recovery
    • Failback
    • Failover
    • Runbooks
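
The RTO, WRT and MTD items above are commonly related as MTD = RTO + WRT: the time to bring the system back plus the time to verify it and resume work must fit inside the maximum tolerable downtime. Here is a minimal sketch of that check, assuming all values are expressed in hours. The `RecoveryTargets` class and its field names are made up for illustration and do not come from any VMware tool.

```python
from dataclasses import dataclass


@dataclass
class RecoveryTargets:
    """Recovery targets for one application, in hours (hypothetical type)."""

    rto_hours: float  # Recovery Time Objective: time to bring the system back
    wrt_hours: float  # Work Recovery Time: time to verify data and resume work
    mtd_hours: float  # Maximum Tolerable Downtime agreed with the business

    def total_recovery_hours(self) -> float:
        # Total outage as seen by the business is RTO + WRT.
        return self.rto_hours + self.wrt_hours

    def fits_mtd(self) -> bool:
        # The design meets the requirement only if RTO + WRT <= MTD.
        return self.total_recovery_hours() <= self.mtd_hours


if __name__ == "__main__":
    # Illustrative numbers only.
    targets = RecoveryTargets(rto_hours=4, wrt_hours=2, mtd_hours=8)
    print(targets.total_recovery_hours(), targets.fits_mtd())
```

RPO is deliberately left out of this check, since it measures acceptable data loss rather than downtime.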

How can the design qualities be impacted?

  • Availability can be impacted by component failures, hardware failures, improper use of HA features, lack of HA features, budget constraints impacting HA features, etc.
  • Manageability can be impacted by management tools (SRM, Zerto, Veeam), HA feature prerequisites (FT), skill levels, integration with internal tools (automation or billing), runbook definition and creation, etc.
  • Performance can be impacted by mirror bandwidth, appliance maximums (vSphere Replication), tape bandwidth, backup type and access, etc.
  • Recoverability can be impacted by backup type, backup fast-restore features, disaster recovery solutions (SRM, Zerto, Veeam), site location and link redundancy, hardware integration features (SRM SRAs, Veeam HP Plugins), etc.
  • Security can be impacted by user access control, encryption of traffic, SSL certificates, encryption of static data, network security etc.
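
To make the mirror bandwidth point concrete, here is a rough sketch that estimates how long it takes to ship a batch of changed data over a replication link. The function name, the decimal-gigabyte convention and the 0.8 link efficiency factor are all assumptions for illustration. Real replication products add compression, deduplication and protocol overhead that this ignores.

```python
def transfer_seconds(changed_gb: float, link_mbps: float,
                     efficiency: float = 0.8) -> float:
    """Estimate seconds needed to replicate `changed_gb` of changed data.

    Assumptions: decimal units (1 GB = 10^9 bytes, 1 Mbps = 10^6 bit/s)
    and a flat `efficiency` factor for usable link capacity.
    """
    if link_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("link_mbps must be > 0 and efficiency in (0, 1]")
    bits_to_send = changed_gb * 8 * 1000**3
    usable_bits_per_second = link_mbps * 1_000_000 * efficiency
    return bits_to_send / usable_bits_per_second


if __name__ == "__main__":
    # Illustrative numbers: 50 GB of changes over a 1 Gbps link.
    seconds = transfer_seconds(changed_gb=50, link_mbps=1000)
    print(f"{seconds / 60:.1f} minutes")
```

If the estimated transfer time for a typical change interval is longer than the RPO, the link is likely too small for asynchronous replication of that workload.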

Rene Van Den Bedem has an excellent post covering lots of BC/DR items as well: http://vcdx133.com/2014/04/24/vcdx-study-plan-bcdr/ and he explains availability in detail in this post: http://vcdx133.com/2015/01/28/vcdx-availability-explained/

Here are some of the questions from the Quizlet:

What is the purpose of a BIA (Business Impact Analysis)? To analyse the impact that downtime of an application would have on the business from a financial perspective: how much it would cost, per hour, per day or per month, to have the service down.
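
As a small worked example of that financial view, the sketch below scales an hourly downtime cost to a day and a month. It assumes a flat cost per hour and a 30-day month, which is a simplification. The function name and the figure used are hypothetical.

```python
# Hours in each reporting period; a 30-day month is an assumption.
HOURS_PER_PERIOD = {"hour": 1, "day": 24, "month": 30 * 24}


def downtime_cost_table(cost_per_hour: float) -> dict[str, float]:
    """Return the cost of continuous downtime per hour, day and month."""
    return {period: cost_per_hour * hours
            for period, hours in HOURS_PER_PERIOD.items()}


if __name__ == "__main__":
    # Illustrative figure only: the service loses 10,000 per hour when down.
    for period, cost in downtime_cost_table(10_000).items():
        print(f"per {period}: {cost:,.0f}")
```

In practice the cost is rarely linear (penalties and reputation damage tend to grow with outage length), so treat this as a starting point for the BIA conversation.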

What are the types of application in regards to criticality to the business? What SLA level is used generally with each?

  • Mission Critical Applications: 5×9
  • Business Critical Applications: 4×9
  • Non-Critical Production: 3×9
  • Test/Dev: 3×9 or 2×9
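
To see what these SLA levels mean in practice, the "number of nines" can be converted into allowed downtime per year. This is a minimal sketch, assuming a 365-day year and availability measured over the whole year. The function names are illustrative.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes; leap years ignored


def availability_percent(nines: int) -> float:
    """Availability for N nines, e.g. 3 -> 99.9 (percent)."""
    return 100 * (1 - 10 ** -nines)


def allowed_downtime_minutes(nines: int) -> float:
    """Maximum downtime per year, in minutes, for N nines of availability."""
    return MINUTES_PER_YEAR * 10 ** -nines


if __name__ == "__main__":
    for nines in (5, 4, 3, 2):
        print(f"{nines}x9 ({availability_percent(nines):.3f}%): "
              f"{allowed_downtime_minutes(nines):.2f} minutes per year")
```

That works out to roughly 5.26 minutes per year for 5×9, 52.56 minutes for 4×9, 8.76 hours for 3×9 and 3.65 days for 2×9.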

Here is a slide from the VCDX slide deck:

[Slide image: VCDX_Skill_BCDR]