Friday, 15 February 2013

99.999% Availability - Really

So… how many times have you heard it or seen it? Vendor A walks into the business that your IT team supports and says: yes, of course our product is highly available. We guarantee it. 99.999% available. All good, the “five nines” have been quoted. What could possibly go wrong?

The business then goes and signs on the dotted line. (See anything wrong here? Yep, the business went and did it again: they let a vendor sell into the business without involving their infrastructure teams. But hey, it's still all good. Vendor A has stated that Product B is 99.999% available, so it's the best possible choice for that mission-critical business application.) What could possibly go wrong?

Many months later, after the business has bought this wonderful piece of infrastructure and started running its business-critical app on said kit, the system integration and the “five nines” promise suddenly look a little more difficult to achieve. Why? Well, the patch management that needed to be completed wasn't really included in the quoted system availability, and the lifecycle management activities that need to be completed also mean “planned downtime”. Surely this eats into our 99.999% availability as well? Yep, pretty much. So what should we learn from this? Probably a few things when dealing with vendors and guarantees of availability:

  • Ensure that the business involves infrastructure at a very early stage, so the vendor cannot mislead with quotes of availability
  • Identify your availability requirements ahead of engaging vendors, or at least understand them well enough to work back from there with the vendor
  • Be sceptical and don't believe the hype!
  • Understand what is counted in this 99.999% availability, i.e. does it include both planned and unplanned outages?
  • Understand patch management regimes and what they may mean for your availability characteristics
  • Understand how lifecycle management is applied and the possible downtime implications whilst running these processes
  • Understand what currency practices the vendor has in place to ensure your equipment stays in support, and how frequently updates will be required
  • Understand the vendor's measurement of availability, i.e. if a storage array is up and running but servicing I/Os slowly, does this count as data available or unavailable?
  • Understand the penalty system that a vendor puts in place for not achieving their guaranteed availability, and how it is invoked. It always helps to motivate a vendor with financial penalties when availability is not being met
  • Put proper measurements in place for your infrastructure, so that when you hold your monthly / yearly vendor updates you can challenge them with appropriate metrics. This again drives good behaviour for us (the consumer)
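
To put some numbers on the points above, here is a minimal Python sketch of the arithmetic. It works out the yearly downtime budget that a given availability figure allows, then compares measured availability with and without planned outages counted. The outage figures, the `Outage` class and the function names are all made up for illustration and are not from any vendor's tooling; the year is assumed to be 365 days.

```python
from dataclasses import dataclass

MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes; leap years ignored


def downtime_budget_minutes(availability_pct, period_minutes=MINUTES_PER_YEAR):
    """Minutes of downtime allowed in the period at the quoted availability."""
    return period_minutes * (1 - availability_pct / 100)


@dataclass
class Outage:
    minutes: float
    planned: bool  # True for patching, upgrades and other lifecycle work


def measured_availability(outages, include_planned=True,
                          period_minutes=MINUTES_PER_YEAR):
    """Availability (%) over the period, optionally ignoring planned outages."""
    down = sum(o.minutes for o in outages if include_planned or not o.planned)
    return 100 * (1 - down / period_minutes)


# Hypothetical year: two planned patch windows and one unplanned failure.
outages = [Outage(120, planned=True),
           Outage(90, planned=True),
           Outage(4, planned=False)]

print(round(downtime_budget_minutes(99.999), 2))                          # 5.26
print(round(measured_availability(outages, include_planned=False), 5))    # 99.99924
print(round(measured_availability(outages, include_planned=True), 3))     # 99.959
```

With these made-up numbers the vendor could claim five nines by counting only the 4-minute unplanned failure, while the business actually saw roughly 99.96% once the patch windows are included. Five nines leaves only about 5.26 minutes a year, so a single planned patch window uses up the whole budget many times over.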

@storagebod has also written a great article on this topic, which can be found here:

Hopefully some food for thought!




