
Re: [LUG] Ensuring data is on disk - 2

To: list@xxxxxxxxxxxxx
Subject: Re: [LUG] Ensuring data is on disk - 2
From: Gordon Henderson <gordon+dcglug@xxxxxxxxxx>
Date: Wed, 14 Jul 2010 12:18:06 +0100 (BST)
Delivered-to: dclug@xxxxxxxxxxxxxxxxxxxxx
Distribution: world

On Wed, 14 Jul 2010, Simon Waters wrote:

The key tools are "smartctl" and "hdparm", but neither is
sufficiently fresh on Debian Lenny to display the relevant attributes to
me (if my drive has them).

Hm. I've been using smartctl for ... well, as long as I've known about it. Lenny's version of smartctl is 5.38 - the one on the project's website is 5.39, so I don't think it's that old... The one thing I wish it did was keep the drive database in a file rather than compiled into the program - that might make updating it a little easier.

And I have to say, I've never found hdparm useful for anything other than the raw benchmark facility (-tT) in recent years - although it was handy in the days when drive DMA was off by default and you wanted to tune multi-sector reads. (However, it might still be handy for that for all I know - especially on very old IDE drives...) Maybe I'll re-read the man page to see what it can do now ;-)

Although both report their own inadequacies
to describe the features of my disk drives - smartctl says one drive has
SMART but can't talk to it, and the other drive has a good SMART health
(despite me having another window open which has counted 1948 bad
blocks and is only 77% finished scanning the disk), and hdparm displays
some unknown attributes for the first mentioned disk (including vendor
extensions), and the other disk isn't recognised at all (apparently it
is a SAMSUNG HD642JI, 640GB).

At that size, I guess it's relatively modern - seems odd that it's not recognised, however I don't think I've ever bought/used a Samsung drive!

I'm also not convinced the 'health' status is good - I have a pair of drives that show good 'health':


  # smartctl -d ata -H /dev/sda
  === START OF READ SMART DATA SECTION ===
  SMART overall-health self-assessment test result: PASSED

Yet this drive shows bad sectors during tests:

Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed: read failure       90%      9816         7855446

And another 20 lines of similar )-:

However, there are no reallocated blocks:

  # smartctl -a -d ata /dev/sda | fgrep Reallocated_Event_Count
  ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0

So what does that tell me )-:
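
Since the single PASSED verdict can disagree with the self-test log, it may be worth looking at several indicators side by side. The following is only a rough sketch: smart_raw and smart_report are hypothetical helper names, and the attribute names in the loop are common ones that vary between vendors, so a given drive may not report all of them.

```shell
# Print the RAW_VALUE column for one named attribute, reading
# "smartctl -A" style output on stdin. RAW_VALUE is the 10th field;
# only its first whitespace-separated token is printed.
smart_raw() {
    awk -v name="$1" '$2 == name { print $10 }'
}

# Hypothetical helper: show several indicators for one drive.
# Needs root and a real device, e.g.  smart_report /dev/sda
# The attribute names below are assumptions; drives differ.
smart_report() {
    dev=$1
    smartctl -d ata -H "$dev"            # overall health verdict
    smartctl -d ata -l selftest "$dev"   # self-test log with first error LBA
    for attr in Reallocated_Sector_Ct Reallocated_Event_Count \
                Current_Pending_Sector Offline_Uncorrectable; do
        printf '%s raw=%s\n' "$attr" \
            "$(smartctl -d ata -A "$dev" | smart_raw "$attr")"
    done
}
```

A drive with read failures in the self-test log but zero reallocations may still show a non-zero pending-sector count, since sectors are typically only remapped when they are next written.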

I sometimes wonder if the drive manufacturers know what they're doing.... I have a set of drives with firmware bugs that cause some of the SMART data to be read incorrectly - at least the makers (WDC in this case) acknowledged it - these drives read their temperatures as about 15C hotter than they really are - so if they can get the temperature reads wrong, what else can they get wrong?


  # hddtemp /dev/sd[a-e]
  /dev/sda: WDC WD2500KS-00MJB0: 52 C
  /dev/sdb: WDC WD2500KS-00MJB0: 52 C
  /dev/sdc: WDC WD2500KS-00MJB0: 52 C
  /dev/sdd: WDC WD2500KS-00MJB0: 52 C
  /dev/sde: WDC WD2500KS-00MJB0: 46 C

In the mean time, as suggested there are several ways to clear the OS
cache (unmount or drop_caches), and one is left relying on luck (or
power cycling) to miss the cache in the disk (mine are 8MB and 16MB so
when writing big backups luck is likely on your side....), at which
point just reading the data should be sufficient to ensure it is safely
on disk, since only a bad block should stop the process but verifying
the data is probably wise.

There's only so much you can cache though, so at some point you must start to read back off the drive - benchmarks rely on that - bonnie will insist on using a file size of 2x memory size for example....
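
The same 2x-memory rule of thumb can be worked out from /proc/meminfo. This is just an illustrative sketch; twice_ram_mib is a made-up name, and it assumes the usual "MemTotal: N kB" line format.

```shell
# Print twice MemTotal in MiB (rounded down), reading
# /proc/meminfo style text on stdin.
twice_ram_mib() {
    awk '/^MemTotal:/ { print int($2 * 2 / 1024) }'
}

# On a live system:  twice_ram_mib < /proc/meminfo
```

The result could then be used as the count= for a dd with bs=1M, when the aim is to read back more data than the OS could possibly have cached.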


So if we write to disk, flush memory cache:

  sync
  echo 3 > /proc/sys/vm/drop_caches

I'd probably unmount it and re-mount it, if practical too.

...then tell the disk to flush its cache.... If it's a whole drive (and not just a partition), then maybe we can power it down and up again - either with hdparm or by writing the right runes to /proc/scsi... but even powered down, I bet the controller is still active...


Hmmmm....

hdparm -f and -F both claim to flush caches (-F the drive's write cache), but there's no mention of a read cache.

I wonder if, as part of the backup, you write a file which is 2x (or more) the size of the disk buffer - so 64MB or 128MB, or whatever it is... Then read this first before reading the 'real' data -


So..

  Do the backup.
  dd if=/dev/zero of=/mnt/dummy.file bs=1M count=256
  umount /mnt   # if practical
  sync
  echo 3 > /proc/sys/vm/drop_caches
  mount /mnt    # if practical
  dd if=/mnt/dummy.file of=/dev/null bs=1M

  then compare backup with original...

Seems a bit crude though...
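
The flush-and-compare steps above could be wrapped up roughly as follows. This is a sketch only: the function names are invented for illustration, flush_os_cache needs root, and none of it touches the drive's own cache (the dummy-file read is still needed for that).

```shell
# Flush dirty pages to disk, then drop the OS page cache (needs root).
flush_os_cache() {
    sync
    echo 3 > /proc/sys/vm/drop_caches
}

# Return 0 if the two directory trees have identical contents.
trees_match() {
    diff -rq "$1" "$2" > /dev/null
}

# verify_backup SRC DST: drop the OS cache, then compare the trees.
verify_backup() {
    flush_os_cache || return 1
    if trees_match "$1" "$2"; then
        echo "backup verified"
    else
        echo "backup DIFFERS"
        return 1
    fi
}
```

Something like "verify_backup /home /mnt/backup/home" (example paths only) would be run as root after reading back the dummy file.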

I do a weekly 'resync' on all my RAID arrays - it doesn't verify that the data is what was written, but it will verify that every sector can be read and that the RAID checksums (R5/R6) are correct, or that the mirrors (R1) agree with each other. mdadm can do it automagically, or you can do it manually -


  echo 'check' > /sys/block/md1/md/sync_action

etc.

The Lenny package only does this once a month though, so I change the crontab.
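
Once a check has been started, the md sysfs directory also reports what it is doing and how many mismatches it found. A small sketch, with md_check_status being a hypothetical name; it assumes the sync_action and mismatch_cnt files that sit alongside each other under /sys/block/mdN/md.

```shell
# md_check_status SYSFS_MD_DIR   (e.g. /sys/block/md1/md)
# Print the current sync action and the mismatch count for one array.
md_check_status() {
    dir=$1
    printf 'action=%s mismatches=%s\n' \
        "$(cat "$dir/sync_action")" "$(cat "$dir/mismatch_cnt")"
}
```

When the action reads "idle" again the check has finished, and a non-zero mismatch count on an R1 or R5/R6 array is worth investigating.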

All this leads to the answer to a question Ben (I think) muttered
recently about who uses tape these days.... Well suddenly it looks a lot
simpler than disks.

Harder to manage though - especially in remote data centres - although "remote hands" or a big "jukebox" changer can help... (but potentially expensive). PITA to get backups from in cases of accidental deletion - that's when I started to do a 'backup' to the same server before dumping to other storage when disks got cheaper than tapes... (DLT tape at the time, then remote servers to disk - the local server would have a few days of data held on it via rsync on a separate partition)

Also one area where I think Microsoft Windows may be doing a better job
at exposing the relevant interface.

More probably the drive manufacturers writing their diagnostics and drivers for MS rather than for anything else.


Anyone with Mac experience know how they do it?

Gordon

--
The Mailing List for the Devon & Cornwall LUG
http://mailman.dclug.org.uk/listinfo/list
FAQ: http://www.dcglug.org.uk/listfaq
