
Re: [LUG] Backup solutions

 

On Fri, 19 Dec 2008, Simon Waters wrote:

> Jonathan Roberts wrote:
>>> Hope this helps
>>
>> Thanks for all the suggestions, everyone. I think rdiff-backup seems
>> like a pretty good solution, though I wish it had some way to preview
>> files before deciding which ones to back up. In this regard, I think
>> OS X Time Machine is the closest to what would suit best, but then
>> that's crippled by being OS X only and not being able to do backups
>> across a network.
>
> Looks like it is crippled by poor underlying technology.
>
> You have various approaches to emulate Time Machine with free (as in
> freedom) software.
>
>
> rsnapshot
>
> rsnapshot seems to try it via the hard-links approach; this is good
> for disk IO, but looks messy to me. This is basically what Time
> Machine does by the looks of it, although I struggled to find a proper
> explanation of Time Machine internals written by someone who knows.

I use "hard links + rsync" type backups extensively. Once set up it's 
easy to use, trivial to get older versions back, and it only takes up as 
much disk space as it needs (one complete copy plus the files that have 
changed).
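
One common way to do it with stock tools is rsync's --link-dest option, 
which hard-links unchanged files against the previous copy. A minimal 
sketch of a daily run (paths are just examples, adjust to taste):

  #!/bin/sh
  # daily "hard links + rsync" backup: unchanged files are
  # hard-linked against yesterday's tree, so every dated copy is a
  # complete snapshot but only changed files take up new space
  TODAY=$(date +%Y-%m-%d)
  rsync -a --delete --link-dest=/backup/daily/latest \
      /home/ /backup/daily/$TODAY/
  # repoint "latest" at the copy we just made
  ln -nsf /backup/daily/$TODAY /backup/daily/latest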

I do this both on a local machine (not as a backup or archive, but for 
accidental file-deletion recovery), and to remote servers for backup and 
archive.

Keeping an archive is easy - once a month you just take one of the daily 
copies out of the system. Then you can have N months of archive plus a 
daily copy.
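
The hard links mean each dated tree is self-contained, so "taking a 
copy out" is just a rename (illustrative paths):

  # promote the first daily copy of the month into the archive;
  # a plain mv is enough as long as it stays on the same filesystem
  mv /backup/daily/2008-12-01 /backup/archive/2008-12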

One thing to note - rsync needs a lot of memory with large filesystems, 
although I suspect that for the average user that's nothing to worry 
about. Also take care when copying hierarchies with hard links in them - 
you need to preserve the links and not make copies. This may require a 
lot of memory too, depending on the copy mechanism used (and time, if it 
has to find each file that's hard-linked to the same file).
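
With rsync that means remembering the -H flag (a sketch, with made-up 
paths):

  # -H preserves hard links between files within the copied tree;
  # without it every snapshot is expanded into a full, independent
  # copy on the target (and -H is where the extra memory goes)
  rsync -aH /backup/daily/ /mnt/newdisk/daily/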

You also need to make sure the target filesystem has enough inodes - a 
mkfs setup "skill" that I suspect is in danger of being lost - having 
said that, the defaults will probably be OK for your average user.
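
For example, something like this at filesystem creation time (ext3 
shown, the figure is only illustrative):

  # one inode per 4KB of space instead of the default; snapshot
  # trees hold huge numbers of directories and file versions, and
  # each of those needs an inode of its own
  mkfs.ext3 -i 4096 /dev/sdb1
  # keep an eye on inode usage afterwards with:
  df -i /backup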

The biggest system I personally worked on had a 9TB backup disk/server 
(RAID-6 over 15 x 750GB drives) which acted as the off-site backup for 
about 8 servers with 0.25 to 1TB of local storage each. It held a 
month's worth of incrementals and took snapshots too (there was on-site 
tape backup for that as well, although doing an archive to tape took 2 
days). The remote site was connected to the main site by a 100Mb LAN 
extension circuit.

My own servers (co-located in Sheffield) are backed up overnight to 
themselves and their neighbours, and also to a server here in sunny 
Devon using this technique; that server also pulls data off a few 
customer sites which have office fileservers I've supplied. The biggest 
issue we have there is data recovery - it's simply not practical to 
restore 60GB of data (which one of my customers has) over the 'net, so 
copying it to a local backup server and taking it to the customer is the 
option here...

(And I note that Entanet has just changed their off-peak hours too, so 
anyone on Entanet who was relying on off-peak starting at 8pm for the 
business service or 10pm for the residential service should beware!)

> lvm2 snapshots
>
> Alternatively you could use LVM with your backup drive and use the LVM
> snapshot utility to make hourly snapshots of the backup volume, then
> just rsync files you want backed up to that volume. A little script to
> keep the LVM snapshots mounted (read only) would be easy to write.

I've never quite trusted LVM, having been "bitten" by it early on. I'd 
suggest using LVM to create a snapshot of a volume, backing that up, 
then releasing the LVM snapshot... I encountered bad performance 
problems and buggy software, but those were early days for LVM. I'd hope 
things have improved since then!
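
The snapshot dance itself is only a few commands. A sketch, assuming a 
volume group vg0 with a logical volume called home (names and sizes are 
made up):

  # snapshot the live volume; the 2G is how much change the
  # snapshot can absorb before it fills up and is invalidated
  lvcreate --snapshot --size 2G --name homesnap /dev/vg0/home
  mount -o ro /dev/vg0/homesnap /mnt/snap
  rsync -a /mnt/snap/ /backup/home/    # back up the frozen view
  umount /mnt/snap
  lvremove -f /dev/vg0/homesnap        # release the snapshot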

> Also most snapshot technology needs attention if used with databases,
> since most databases need some sort of quiescence command run when
> making a hot backup (I wonder if SQLite is affected, with Mozilla
> history files and suchlike?). There is a lot of discussion of Linux
> LVM with MySQL that covers these kinds of issues, but some of the
> solutions are not things to bet the bank on.

For moderate-sized databases (a few GB) I do a text dump of them before 
running the backup. That seems to be the most reliable way, but then 
it's still only a snapshot of the data - it all depends on the frequency 
of updates, etc. and just how valuable the data is... (And how much your 
customer/boss is paying you to keep the data secure :)
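
With MySQL, for instance, that can be a mysqldump from cron just before 
the rsync kicks off (a sketch; credentials and paths are placeholders):

  # dump everything to one dated SQL file; --single-transaction
  # gives a consistent view of InnoDB tables without locking them
  mysqldump --single-transaction --all-databases \
      > /backup/dumps/mysql-$(date +%Y-%m-%d).sql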

Then there's MySQL server replication, which does work very well.

Gordon


-- 
The Mailing List for the Devon & Cornwall LUG
http://mailman.dclug.org.uk/listinfo/list
FAQ: http://www.dcglug.org.uk/linux_adm/list-faq.html