D&C GLug - Home Page


Re: [LUG] Folding@home 'errors'

 

Dan James wrote:
On Tue, 2010-03-16 at 11:10 +0000, tom wrote:
I'll have a look at cpuburn. What baffles me, though, is that I'm using that particular PC all the time and no other programs show any noticeable faults, which I would expect to see at the same time (other than Firefox/Flash issues). I have no special GPU or anything, so F@H is just hammering the CPU/RAM and the odd bit of disk, as would any other app. Nothing crops up in the logs anywhere. I'm probably just looking for an excuse to get one of those 4-core AMDs, but the only thing that fails is F@H, and until I can get something else to fall over I can't even convince myself it's a valid expense!
Tom te tom te tom

Hi all,

I've decided that this is as good a moment as any to de-lurk after a few
months of reading the list!

Tom, your day-to-day usage won't be putting your hardware to use
anywhere near as fully as F@H, unless you're doing something like video
rendering/encoding. I've also found Linux systems to be incredibly
resilient to hardware errors; most recently I had a dual-boot system
pulling a regular BSOD on Windows due to bad RAM, while on Ubuntu the
only indication I had was the occasional crash and restart of Firefox
and the odd bad MD5 checksum.

If your memory checked out fine with memtest I'd say the next thing to
have a look at is the PSU. Bad power can cause all sorts of unexpected
problems, most of them inconsistent and hard to reproduce. It can also
damage your system badly if left unchecked. I'd check voltages in the
BIOS, in the OS and with a multimeter before running any stress testing.
If it is dirty power you don't want to compound the problem and fry your
hardware.
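
As a rough illustration of the OS-side check, the sketch below compares
voltage readings against nominal rail values. It assumes a tolerance of
5%, the figure commonly quoted for the positive ATX rails. The helper
name, the rail labels and the sample readings are all made up for the
example; on a real system the values would come from something like
lm-sensors, and how sensor labels map to rails varies by motherboard.

```python
# Hypothetical helper: flag rails whose measured voltage drifts too far
# from nominal. Rail names, readings and the 5% tolerance are assumptions.

NOMINAL_RAILS = {"+3.3V": 3.3, "+5V": 5.0, "+12V": 12.0}


def out_of_tolerance(readings, tolerance=0.05):
    """Return {rail: measured} for rails outside nominal +/- tolerance."""
    bad = {}
    for rail, measured in readings.items():
        nominal = NOMINAL_RAILS.get(rail)
        if nominal is None:
            continue  # unknown rail label: skip rather than guess
        if abs(measured - nominal) > nominal * tolerance:
            bad[rail] = measured
    return bad


if __name__ == "__main__":
    # Invented sample readings, in volts.
    sample = {"+3.3V": 3.31, "+5V": 4.70, "+12V": 12.2}
    print(out_of_tolerance(sample))
```

Software readings are only as good as the motherboard's sensor chip, so
a result like this is a prompt to get the multimeter out rather than a
verdict on the PSU.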

Stress testing tools like cpuburn are probably the closest you will get
to the kind of work F@H does - Prime95 is one I use regularly. Just make
sure you have effective cooling; I wouldn't run one on a system with
unchecked old thermal paste, for example!

Best wishes,

Dan





You may have missed the point - F@H runs all the time on my system*. It only uses a small amount of the resources available but goes wrong from time to time, while the rest of the system doesn't. Basically the machine is running at 100% all the time.

* I've been running cpuburn flat out for about an hour now and there's no noticeable temperature/voltage change on any of the hardware according to X sensors**. F@H drives it flat out normally...

** Though I have to admit I've no idea what the values should be, but 29C for the CPU and 39C for the motherboard seems fine to me.

As you say, Linux is quite resilient, but presumably if there is a problem it should be possible to get it logged somewhere?
Tom te tom te tom
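
On the logging question, one possible approach is to search the kernel log for lines that tend to accompany hardware trouble. The sketch below is only that: the marker list is illustrative rather than exhaustive, the exact wording of such messages differs between kernel versions, and the function name and sample lines are invented for the example.

```python
import re

# Illustrative markers only; real kernel messages vary between versions.
HARDWARE_HINTS = re.compile(
    r"machine check|hardware error|\bmce\b|\bedac\b|segfault",
    re.IGNORECASE,
)


def find_hardware_hints(lines):
    """Return the log lines that contain any of the illustrative markers."""
    return [line.rstrip("\n") for line in lines if HARDWARE_HINTS.search(line)]


if __name__ == "__main__":
    # Invented sample lines, not taken from any real log.
    sample_log = [
        "kernel: usb 1-1: new high speed USB device\n",
        "kernel: Machine check events logged\n",
        "kernel: someprog[2345]: segfault at 0 ip 0000 sp 0000 error 4\n",
    ]
    for hit in find_hardware_hints(sample_log):
        print(hit)
```

On a real system the input could be the output of dmesg or the lines of a file such as /var/log/syslog. An empty result does not prove the hardware is healthy, since many faults never reach the log.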

--
The Mailing List for the Devon & Cornwall LUG
http://mailman.dclug.org.uk/listinfo/list
FAQ: http://www.dcglug.org.uk/linux_adm/list-faq.html