

Re: [LUG] Folding@home 'errors'

 

On Tue, 2010-03-16 at 12:44 +0000, tom wrote: 
> You may have missed the point - f@h runs all the time on my system*, it 
> only uses a small amount of the resources available but goes wrong from 
> time to time - while the rest of the system doesn't. Basically the 
> machine is running 100% all the time.
> * I've been running cpuburn flat out for about an hour now and there's 
> no noticeable temperature/voltage change on anything over any hardware 
> according to X sensors**. F@H drives it flat out normally... **Though I 
> have to admit I've no idea what values they should be but 29C for a 
> CPU/39C motherboard seems fine to me.
> As you say Linux is quite resilient but presumably if there is a problem 
> it should be possible to get it logged somewhere?
> Tom te tom te tom

I don't think I missed the point - perhaps I was a little vague though,
my apologies.

When you don't need the processing power, F@H will be utilising at least
one of your CPU cores at 100% (depending on how you set it up),
performing large-scale iterative calculations. Day-to-day computing
just doesn't use your CPU like this. If your CPU computes one wrong
value in a desktop program you may never know about it, although it
might crash eventually. In an iterative calculation, where each step
feeds the next, one wrong value invalidates everything which follows.
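
To illustrate what I mean about iterative calculations, here is a rough
Python sketch. It is not anything F@H actually runs: the recurrence, the
constants and the function names are all made up for the example. It
flips a single bit at one step and shows that every later value is wrong.

```python
# Hypothetical toy recurrence: x -> A*x mod M (not F@H's real workload).
M = 2**31 - 1   # prime modulus, chosen only for illustration
A = 48271       # multiplier, chosen only for illustration


def iterate(seed, steps, flip_at=None):
    """Return the values produced by `steps` iterations of x -> A*x mod M.

    If flip_at is given, the lowest bit of the value produced at that
    (0-based) step is flipped, simulating one bad CPU result.
    """
    values = []
    x = seed
    for step in range(steps):
        x = (A * x) % M
        if step == flip_at:
            x ^= 1  # a single wrong bit, once
        values.append(x)
    return values


def first_divergence(a, b):
    """Return the index of the first differing value, or None if equal."""
    for i, (u, v) in enumerate(zip(a, b)):
        if u != v:
            return i
    return None


if __name__ == "__main__":
    clean = iterate(1, 50)
    faulty = iterate(1, 50, flip_at=10)
    start = first_divergence(clean, faulty)
    wrong = sum(c != f for c, f in zip(clean, faulty))
    print(f"first wrong step: {start}, wrong values afterwards: {wrong}")
```

The error happens once, but because each step feeds the next, the run
never recovers from it.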

I've not used cpuburn before. Does it have a way of telling you that it
got a calculation wrong? Prime95 is a good one for this, as it can be
set up to run Mersenne prime calculations and then compare the results
with the known answers. There are also several different tests which use RAM and
CPU/cache to differing levels - pass them all and your system is
probably OK. But remember that the only thing you can ever prove with
these tests is a fault - you can never prove 100% that there is no
fault, no matter how long you run them.
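
The "compare with the known answer" idea can be sketched in a few lines
of Python. This is only an illustration, not how Prime95 is implemented
(it is far more optimised and stresses the hardware much harder); the
function names are my own. It runs the Lucas-Lehmer test for each prime
exponent up to 127 and checks the verdict against the well-known list of
Mersenne prime exponents in that range.

```python
# Exponents p <= 127 (p odd) for which 2**p - 1 is known to be prime.
KNOWN_EXPONENTS = {3, 5, 7, 13, 17, 19, 31, 61, 89, 107, 127}


def is_prime(n):
    """Simple trial division, good enough for small exponents."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def lucas_lehmer(p):
    """Lucas-Lehmer test: True if 2**p - 1 is prime (p an odd prime)."""
    m = (1 << p) - 1
    s = 4
    for _ in range(p - 2):
        s = (s * s - 2) % m
    return s == 0


def self_test(limit=127):
    """Return the exponents whose computed verdict disagrees with the
    known answer. An empty list means no fault was detected."""
    failures = []
    for p in range(3, limit + 1):
        if is_prime(p) and lucas_lehmer(p) != (p in KNOWN_EXPONENTS):
            failures.append(p)
    return failures


if __name__ == "__main__":
    bad = self_test()
    print("mismatches:", bad if bad else "none")
```

Any mismatch means something computed a wrong value. An empty list only
means no fault showed up this time, as per the caveat above.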

If Prime95 throws errors then I would check the PSU first as it's the
most common problem in my experience - stick a spare in there if you
have one and see what happens. It could also be something like a defect
in one part of the CPU or a slightly defective cache - and these things
might never show up in day to day use. It could also of course be
nothing at all, depending on how often F@H rejects your work!

Hope this helps and good luck!

Dan



-- 
The Mailing List for the Devon & Cornwall LUG
http://mailman.dclug.org.uk/listinfo/list
FAQ: http://www.dcglug.org.uk/linux_adm/list-faq.html