D&C GLug - Home Page


Re: [LUG] Folding@home 'errors'

 

Gordon Henderson wrote:
On Tue, 16 Mar 2010, tom wrote:

I run folding@home on 4 machines here and one quite often finishes early:
" Simulation instability has been encountered. The run has entered a
[23:12:46]   state from which no further progress can be made.
[23:12:46] This may be the correct result of the simulation, however if you
[23:12:46]   often see other project units terminating early like this
[23:12:46] too, you may wish to check the stability of your computer (issues
[23:12:46]   such as high temperature, overclocking, etc.)."

There is no overclocking, the CPU is at ~29°C, and memtest can run for days without finding a problem...
Any clues/tips?
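
To put a number on "quite often", one option is to count how many times the instability message appears in the client log. The following is only a rough sketch: the marker string and the "[HH:MM:SS]" timestamp shape are taken from the excerpt above, and the function name is made up for illustration. How the log text is obtained (file name, location) is left to the reader, since that depends on the client version.

```python
import re

# Marker text copied from the log excerpt quoted in this message.
INSTABILITY_MARKER = "Simulation instability has been encountered"
# Assumed timestamp shape, e.g. "[23:12:46]", as seen in the excerpt.
TIMESTAMP = re.compile(r"^\[(\d{2}:\d{2}:\d{2})\]")


def count_instabilities(log_text):
    """Return one entry per instability message: its timestamp, or None
    when the line carries no leading timestamp."""
    hits = []
    for line in log_text.splitlines():
        if INSTABILITY_MARKER in line:
            match = TIMESTAMP.match(line.strip())
            hits.append(match.group(1) if match else None)
    return hits
```

Calling len() on the result for each machine's log gives a like-for-like count, which should show whether the one box really is worse than the other three.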

Once upon a time I worked in the R&D department of an old British supercomputer company... I mainly wrote test & diagnostics, and low-level driver code - worked with the hardware & chip designers, did some design & integration, system building, etc, etc...

And even then, I could get a system to run all my diagnostics for days on end, in & out of the burn-in ovens, and then it would fail miserably when subjected to application code )-:

And even more years ago - I looked after a PDP11 running Unix v6 - every quarter we'd get the DEC engineer in as part of the maintenance contract - he'd hoover the core memory, etc... run all his diagnostics, but I remember him saying that running Unix on them was a much better test than any of his diagnostics ever were!

So you need to think bigger than just memtest - have you tried cpuburn? However, that's just a set of CPU tests. There is a user-land memory tester too - it's 'memtester' under Debian. Potentially not as thorough as memtest86+, but you can run it in conjunction with other things.
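
In the same "test it with something closer to real work" spirit, here is a minimal sketch of a consistency check: run the same deterministic CPU-bound job on every core several times and flag any pass whose answer differs from the first. It is not a replacement for cpuburn or memtester, and everything in it (the function names, worker count and round count) is an assumption chosen for illustration. A mismatch would point at hardware, since the software does identical work every pass; a clean run proves nothing.

```python
import hashlib
from multiprocessing import Pool


def burn(seed, rounds=20000):
    """Deterministic CPU-bound work: repeatedly hash a seeded 1 KiB buffer."""
    data = str(seed).encode() * 1024
    for _ in range(rounds):
        data = hashlib.sha256(data).digest() * 32
    return hashlib.sha256(data).hexdigest()


def find_mismatches(workers=4, passes=3, rounds=20000):
    """Run the same jobs 'passes' times in parallel and return the
    (pass, seed) pairs whose result differs from the first pass."""
    jobs = [(seed, rounds) for seed in range(workers)]
    mismatches = []
    with Pool(workers) as pool:
        reference = pool.starmap(burn, jobs)
        for p in range(1, passes):
            results = pool.starmap(burn, jobs)
            for seed, (want, got) in enumerate(zip(reference, results)):
                if want != got:
                    mismatches.append((p, seed))
    return mismatches


if __name__ == "__main__":
    bad = find_mismatches()
    print("mismatches:", bad if bad else "none")
```

Raising rounds and passes makes it run for longer; running it alongside memtester would load CPU and RAM at the same time, which is closer to what F@H does.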
I'll have a look at cpuburn. What baffles me, though, is that I'm using that particular PC all the time and no other programs show any noticeable faults, which I'd expect to see at the same time (other than Firefox/Flash issues). I have no special GPU or anything, so F@H is just hammering the CPU/RAM and the odd bit of disk, as would any other app. Nothing crops up in the logs anywhere. I'm probably just looking for an excuse to get one of those 4-core AMDs, but the only thing that fails is F@H, and until I can get something else to fall over I can't even convince myself it's a valid expense!
Tom te tom te tom

--
The Mailing List for the Devon & Cornwall LUG
http://mailman.dclug.org.uk/listinfo/list
FAQ: http://www.dcglug.org.uk/linux_adm/list-faq.html