Re: [LUG] Devon is offline...


On 19/07/10 20:09, Rob Beard wrote:
>> Imagine not having an off-site backup for a corporate web site as 
>> important as devon.gov.uk ...
>> Imagine not being able to use any computer in DCC right now...
> Redundancy?
> Yeah they've heard of it!

I'm a big fan of lack of redundancy.

Most of the high availability solutions I've used are either
exceptionally expensive, or not robust, or high maintenance. Sometimes
the smart thing to do is fail. Assuming you've planned for it!

Not that I don't enjoy building high availability systems; it is a great
technical challenge. It's just that sometimes it is smarter not to.

I remember a classic case of HP discovering that they had forgotten to
specify in the design document that a SCSI controller terminate when the
host OS panicked - thus leaving one of their fancy disk fail-over
systems with an unterminated SCSI bus - and so unlikely to fail-over
under certain conditions.

HP discovered that in testing, but only after selling this solution to
various customers as "high availability". In most cases customers would
have paid the price of a decent-sized house to have this extra lack of
redundancy on their systems. Good value? Well maybe, but you might
rightly be aggrieved if it didn't save your bacon when the time came.

At one point I was testing HP kit on a server I was building, only to be
told by the hardware engineers at HP that manually disconnecting SCSI
cables whilst the system was in use would likely void the hardware
warranty, even though this was a failure condition the hardware/software
configuration was supposed to survive without a glitch. So even if you
built a robust solution, HP officially wouldn't let you test it under
realistic conditions. Of course there was no way I was returning the
system to operation without testing that fail-over worked as documented,
even if HP engineers thought it a rather cavalier approach to expensive

