Posts by ChertseyAl
1) Message boards : Number crunching : BOINC's "Project is not highest priority" In Event Log Is Annoying (Message 1010)
Posted 4 Oct 2013 by ChertseyAl
FWIW, the last version of BOINC that worked properly (but only for CPU projects) was 5.10.45, which is what I always install. Version 6.10.xx onwards has serious work-fetch and scheduling problems, and 6.12.xx hid the essential messages tab. I dread to think what state 7.x.x is in, but it ain't going to be pretty!

Sadly I have to run 6.10.60 on a couple of machines so that I can run Radioactive@Home and QCN but I run those with a single project or they get confused and don't fetch any work for anything. What. A. Crock. Of. Sh....

Cheers,

Al.
2) Message boards : Number crunching : WU lasting about 4 hours instead of (Message 998)
Posted 2 Oct 2013 by ChertseyAl
Only two? I've had hundreds of them.

It's a known issue with this project. If you are after as much credit as possible don't crunch this project as it's dire for 'reward'!

Sadly the problem was caused by the usual credit cheaters (who have now moved on) which left the rest of us with the current grim credit granting system.

Best bet is to try a different project.

Cheers,

Al.
3) Questions and Answers : Wish list : How about 30 minute suspend with auto restart (Message 497)
Posted 13 Jan 2013 by ChertseyAl
I thought the standard BOINC 'snooze' function did this, albeit for a fixed time (60 minutes? 30 minutes?).

Never used it myself, but you might like to try it.

Cheers,

Al.
4) Message boards : Number crunching : cool 0.00 credits (Message 385)
Posted 26 Oct 2012 by ChertseyAl
We just switched over to 1 credit per WU


Brave move making it that low :)

FWIW, I just calculated my RAC based on the batch of work I completed a few days ago and it would now be one third of what it was. Also, that would make it the lowest credit of any project I've ever crunched :)

But I guess this is just a temporary measure, so I'll test it again when things have settled down.

Cheers,

Al.
5) Message boards : News : Interview with Ant & Kevin (Message 266)
Posted 4 Oct 2012 by ChertseyAl
Thanks for that - Always nice to get some feedback from projects.

Oh, and tell that guy halfway in to close the door quietly in future :)

Cheers,

Al.
6) Message boards : Number crunching : Validator not running? (Message 265)
Posted 4 Oct 2012 by ChertseyAl
We'll try to implement a purge that leaves the past week's data for users to do their checks, but should substantially reduce the total DB size.


I suspect most of us only need/want a few days of past results, just to check that we haven't got a rogue host or that something has changed with new batches of WUs.

What most people hate is the famous (infamous!) 'Instapurge' that plagued other projects. They know who they are ... Travis and Tobias, see me after class ;) Plenty here will get that reference even if you don't Ant :)

Cheers,

Al.
7) Message boards : Number crunching : WU Error (Message 210)
Posted 7 Sep 2012 by ChertseyAl
I'm seeing a few of these, just 11 so far with 6660 valid ones. I can post links if you can't find them.

Cheers,

Al.
8) Message boards : Number crunching : Incorrect CPU Time (Message 94)
Posted 28 Jul 2012 by ChertseyAl
This is strange behaviour but we are not yet certain of the cause. We are looking into this and hope to have a fix in a few days.
Kevin


Thanks. I'll see if I can find a common factor across hosts, BOINC versions, whatever, before I move off to SIMAP on Monday for a few days.

Oh, maybe this should all be in a different thread in the number crunching forum. Dunno if you can move threads? If not, anything I find, I'll start a new thread elsewhere :)

Cheers,

Al.
9) Message boards : Number crunching : Incorrect CPU Time (Message 93)
Posted 28 Jul 2012 by ChertseyAl

i bet a bottle of bush 21 it's a problem on your side.
maybe the tin-lizzy's, maybe boinc 5.x, but most likely its the app that does not checkpoint.


I don't know what Bush 21 is, so it's a risky wager ;) Send me a bottle to try so that I can assess the risk :)

Unlikely to be my prehistoric hosts as they happily crunch every other project that I partake of :) (Took me a while to figure out the 'tin-lizzy' reference there!) - As you know, I rotate projects every week, so everything gets a good workout.

Also happens on BOINC 6.10.60, so it's not a 5.x problem (agree, therefore, that it's not the same problem as numberfields). Yes, I know 6.10.60 is not very good, but everything later is worse, and I have to run it on 2 machines just for the 2 projects that need it :(

Unlikely to be checkpoint related as I keep app in memory, and the other project I run alongside this one ATM has a 2 week DL for a 6 hour WU, so doesn't swap about and just lets WUs run. Although, just noticed that the host I originally posted about has gone into hi-pri and paused one FMAH task to let another one run. But that's because the estimated completion leapt up to 30 hours from 30 minutes.

It's all a bit odd. I'll keep crunching until Monday (SIMAP time!) and see if there's a pattern.

Oh, now the validator has died again ...

Cheers,

Al.
10) Message boards : Number crunching : Incorrect CPU Time (Message 88)
Posted 28 Jul 2012 by ChertseyAl

it looks much more like a single glitch on your host - all those other wu's look ok


Far from it. Any WU on any of my hosts that is longer than the usual short (half hour) ones is getting granted really low credit. I've had a few of them. Also CPU time longer than run time? No. It's a reliable host. As are all of my others. Something not right.

Cheers,

Al.

p.s. This is an exact replay of numberfields if you recall. Those were happy days ;)

11) Message boards : Number crunching : Incorrect CPU Time (Message 73)
Posted 28 Jul 2012 by ChertseyAl
Just updated that now, we are now giving credit based on runtime.


Something very wrong here. Check this one out:

http://boinc.ucd.ie/fmah/result.php?resultid=359080

Run time 52,193.00
CPU time 152,710.50
Validate state Valid
Credit 15.11

That looks exactly like CreditRandom to me. Numberfields@home was exactly the same at first. I don't think you've fixed it. I'm out.
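For anyone who wants to sanity-check those figures, here is a minimal sketch of the arithmetic. The numbers are copied from the result quoted above; the function name and the idea of normalising to "credit per hour" are just illustrative assumptions, not how the project's validator works.

```python
# Figures copied from the result page quoted above.
run_time_s = 52_193.00
cpu_time_s = 152_710.50
credit = 15.11


def credit_per_hour(credit_granted: float, seconds: float) -> float:
    """Credit granted per hour of the given time measure."""
    return credit_granted / (seconds / 3600.0)


# How many times larger the reported CPU time is than the wall-clock run time.
cpu_to_run_ratio = cpu_time_s / run_time_s

per_run_hour = credit_per_hour(credit, run_time_s)
per_cpu_hour = credit_per_hour(credit, cpu_time_s)

print(f"CPU time / run time:      {cpu_to_run_ratio:.2f}")
print(f"credit per run-time hour: {per_run_hour:.2f}")
print(f"credit per CPU-time hour: {per_cpu_hour:.2f}")
```

The CPU time comes out at nearly three times the run time, which should not be possible for a task started with `--cpu 1`, and the grant works out to roughly one credit per hour of run time.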

Cheers,

Al.
12) Message boards : News : FMaH server will be down for about 12 hours (Message 56)
Posted 25 Jul 2012 by ChertseyAl
Just updated that now, we are now giving credit based on runtime.


Fantastic! Thanks for listening :)

Cheers,

Al.

13) Message boards : News : FMaH server will be down for about 12 hours (Message 54)
Posted 25 Jul 2012 by ChertseyAl
Hopefully you'll optimise OUT CreditRandom, sorry, CreditNew.

My stats today are laughable. Truly random. Across 4 identical machines, average credit varies by a factor of 5. My fastest machine has the lowest RAC of my small farm. A Celeron laptop is getting great credit, a Celeron desktop with twice the CPU speed is getting dismal credit. I keep very detailed stats on all projects across my farm, so I can easily spot projects with rogue credit granting schemes :)

Now, I'm not a Credit Wh*re, but CreditRandom is a real downer for dedicated crunchers. Get rid of it. Not asking for amazing credit, just a system that anyone can understand.

*waits for @frankhagen to join in* ;)

Cheers,

Al.
14) Message boards : News : FM@H up and running (Message 49)
Posted 25 Jul 2012 by ChertseyAl
On one of the other forums a user complained about too many tasks flooding his system, so he aborted them. This then caused problems for Kevin, who had to clean up the mess.


Hmmm. BOINC should handle that automatically and just add another replication. One thing it might be worth doing is setting the option that sends out aborted/failed WUs immediately to trusted hosts, so that they don't hang about until the queue runs dry. Can't remember where to set the flag ATM, but someone will be able to help. Actually, I think they turned that option on over at the asteroids@home project recently, you could look over there.

Oh, and with loads of small WUs flying about, keep an eye on your disc space as the results database will grow quickly :)

Cheers,

Al.
15) Message boards : News : Update (Message 48)
Posted 25 Jul 2012 by ChertseyAl
I was just able to download a few WUs and got through them in about 3 minutes each. Will all of them be like that?


FWIW, I've seen a range of times from a few minutes to 5 hours :)

Cheers,

Al.
16) Message boards : News : Update (Message 38)
Posted 24 Jul 2012 by ChertseyAl
We are having issues with a small percentage (less than 1%) of Windows computers


Yes, I still have the same problem that I had with V1.07 ...

http://boinc.ucd.ie/fmah/result.php?resultid=163738

<core_client_version>5.10.45</core_client_version>
<![CDATA[
<message>
- exit code -148 (0xffffff6c)
</message>
<stderr_txt>
wrapper: starting
12:04:29 (3000): wrapper: running ../../projects/boinc.ucd.ie_fmah/vina_2.0_windows_intelx86.exe (--config conf --receptor receptor.pdb --ligand ligand.pdb --out outfile --exhaustiveness 20 --cpu 1)
can't run app: This application has failed to start because the application configuration is incorrect. Reinstalling the application may fix this problem. (0x36b1)
12:04:29 (3000): called boinc_finish

</stderr_txt>
]]>
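The two forms of the exit code in that message, -148 and 0xffffff6c, are the same 32-bit value read as signed and unsigned. A minimal sketch of the conversion (the helper names are made up for illustration):

```python
def to_unsigned32(code: int) -> int:
    """Reinterpret a (possibly negative) exit code as an unsigned 32-bit value."""
    return code & 0xFFFFFFFF


def to_signed32(value: int) -> int:
    """Reinterpret an unsigned 32-bit value as a signed two's-complement integer."""
    value &= 0xFFFFFFFF
    return value - 0x1_0000_0000 if value & 0x8000_0000 else value


print(hex(to_unsigned32(-148)))   # 0xffffff6c
print(to_signed32(0xFFFFFF6C))    # -148
```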


I grabbed a section of the messages tab - There's loads of this:

24/07/2012 12:03:36|fightmalaria@home|Starting vina_6230_1343063354.179880-38-1ZRO_A.pdbqt-122-3BPW_lig_1.pdbqt-A1-_0
24/07/2012 12:03:36|fightmalaria@home|Starting task vina_6230_1343063354.179880-38-1ZRO_A.pdbqt-122-3BPW_lig_1.pdbqt-A1-_0 using vina version 200
24/07/2012 12:03:36|fightmalaria@home|Starting vina_6230_1343063340.234497-74-1Z6B_A.pdbqt-121-1ONF_lig_2.pdbqt-A1-_0
24/07/2012 12:03:36|fightmalaria@home|Starting task vina_6230_1343063340.234497-74-1Z6B_A.pdbqt-121-1ONF_lig_2.pdbqt-A1-_0 using vina version 200
24/07/2012 12:03:44|fightmalaria@home|[error] Can't rename output file vina_6230_1343063354.179880-38-1ZRO_A.pdbqt-122-3BPW_lig_1.pdbqt-A1-_0_0
24/07/2012 12:03:49|fightmalaria@home|[error] Can't rename output file vina_6230_1343063340.234497-74-1Z6B_A.pdbqt-121-1ONF_lig_2.pdbqt-A1-_0_0
24/07/2012 12:03:49|fightmalaria@home|Computation for task vina_6230_1343063354.179880-38-1ZRO_A.pdbqt-122-3BPW_lig_1.pdbqt-A1-_0 finished
24/07/2012 12:03:49|fightmalaria@home|Output file vina_6230_1343063354.179880-38-1ZRO_A.pdbqt-122-3BPW_lig_1.pdbqt-A1-_0_0 for task vina_6230_1343063354.179880-38-1ZRO_A.pdbqt-122-3BPW_lig_1.pdbqt-A1-_0 absent
24/07/2012 12:03:49|fightmalaria@home|Computation for task vina_6230_1343063340.234497-74-1Z6B_A.pdbqt-121-1ONF_lig_2.pdbqt-A1-_0 finished
24/07/2012 12:03:49|fightmalaria@home|Output file vina_6230_1343063340.234497-74-1Z6B_A.pdbqt-121-1ONF_lig_2.pdbqt-A1-_0_0 for task vina_6230_1343063340.234497-74-1Z6B_A.pdbqt-121-1ONF_lig_2.pdbqt-A1-_0 absent
24/07/2012 12:03:49|fightmalaria@home|Starting vina_6230_1343063350.564181-11-1D5C_A.pdbqt-122-3BPW_lig_1.pdbqt-A1-_0
24/07/2012 12:03:49|fightmalaria@home|Starting task vina_6230_1343063350.564181-11-1D5C_A.pdbqt-122-3BPW_lig_1.pdbqt-A1-_0 using vina version 200
24/07/2012 12:03:49|fightmalaria@home|Starting vina_6230_1343063350.070149-9-2QU8_A.pdbqt-122-3BPW_lig_1.pdbqt-A1-_0
24/07/2012 12:03:49|fightmalaria@home|Starting task vina_6230_1343063350.070149-9-2QU8_A.pdbqt-122-3BPW_lig_1.pdbqt-A1-_0 using vina version 200
24/07/2012 12:03:57|fightmalaria@home|[error] Can't rename output file vina_6230_1343063350.564181-11-1D5C_A.pdbqt-122-3BPW_lig_1.pdbqt-A1-_0_0
24/07/2012 12:04:02|fightmalaria@home|[error] Can't rename output file vina_6230_1343063350.070149-9-2QU8_A.pdbqt-122-3BPW_lig_1.pdbqt-A1-_0_0
24/07/2012 12:04:02|fightmalaria@home|Computation for task vina_6230_1343063350.564181-11-1D5C_A.pdbqt-122-3BPW_lig_1.pdbqt-A1-_0 finished
24/07/2012 12:04:02|fightmalaria@home|Output file vina_6230_1343063350.564181-11-1D5C_A.pdbqt-122-3BPW_lig_1.pdbqt-A1-_0_0 for task vina_6230_1343063350.564181-11-1D5C_A.pdbqt-122-3BPW_lig_1.pdbqt-A1-_0 absent
24/07/2012 12:04:02|fightmalaria@home|Computation for task vina_6230_1343063350.070149-9-2QU8_A.pdbqt-122-3BPW_lig_1.pdbqt-A1-_0 finished
24/07/2012 12:04:02|fightmalaria@home|Output file vina_6230_1343063350.070149-9-2QU8_A.pdbqt-122-3BPW_lig_1.pdbqt-A1-_0_0 for task vina_6230_1343063350.070149-9-2QU8_A.pdbqt-122-3BPW_lig_1.pdbqt-A1-_0 absent
24/07/2012 12:04:02|fightmalaria@home|Starting vina_6230_1343063351.182768-14-1OB3_A.pdbqt-122-3BPW_lig_1.pdbqt-A1-_0
24/07/2012 12:04:02|fightmalaria@home|Starting task vina_6230_1343063351.182768-14-1OB3_A.pdbqt-122-3BPW_lig_1.pdbqt-A1-_0 using vina version 200
24/07/2012 12:04:02|fightmalaria@home|Starting vina_6230_1343063351.283264-15-1NHG_A.pdbqt-122-3BPW_lig_1.pdbqt-A1-_0
24/07/2012 12:04:03|fightmalaria@home|Starting task vina_6230_1343063351.283264-15-1NHG_A.pdbqt-122-3BPW_lig_1.pdbqt-A1-_0 using vina version 200
24/07/2012 12:04:11|fightmalaria@home|[error] Can't rename output file vina_6230_1343063351.182768-14-1OB3_A.pdbqt-122-3BPW_lig_1.pdbqt-A1-_0_0
24/07/2012 12:04:17|fightmalaria@home|[error] Can't rename output file vina_6230_1343063351.283264-15-1NHG_A.pdbqt-122-3BPW_lig_1.pdbqt-A1-_0_0
24/07/2012 12:04:17|fightmalaria@home|Computation for task vina_6230_1343063351.182768-14-1OB3_A.pdbqt-122-3BPW_lig_1.pdbqt-A1-_0 finished
24/07/2012 12:04:17|fightmalaria@home|Output file vina_6230_1343063351.182768-14-1OB3_A.pdbqt-122-3BPW_lig_1.pdbqt-A1-_0_0 for task vina_6230_1343063351.182768-14-1OB3_A.pdbqt-122-3BPW_lig_1.pdbqt-A1-_0 absent
24/07/2012 12:04:17|fightmalaria@home|Computation for task vina_6230_1343063351.283264-15-1NHG_A.pdbqt-122-3BPW_lig_1.pdbqt-A1-_0 finished
24/07/2012 12:04:17|fightmalaria@home|Output file vina_6230_1343063351.283264-15-1NHG_A.pdbqt-122-3BPW_lig_1.pdbqt-A1-_0_0 for task vina_6230_1343063351.283264-15-1NHG_A.pdbqt-122-3BPW_lig_1.pdbqt-A1-_0 absent
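With that much repetition it can help to summarise the log rather than read it. The following is a rough sketch only: it assumes each line has the `date time|project|message` layout seen above, and the category names and function names are invented for this example. The sample lines are the four "task" lines for one WU, copied from the excerpt.

```python
from collections import Counter
from datetime import datetime

LOG = """\
24/07/2012 12:03:36|fightmalaria@home|Starting task vina_6230_1343063354.179880-38-1ZRO_A.pdbqt-122-3BPW_lig_1.pdbqt-A1-_0 using vina version 200
24/07/2012 12:03:44|fightmalaria@home|[error] Can't rename output file vina_6230_1343063354.179880-38-1ZRO_A.pdbqt-122-3BPW_lig_1.pdbqt-A1-_0_0
24/07/2012 12:03:49|fightmalaria@home|Computation for task vina_6230_1343063354.179880-38-1ZRO_A.pdbqt-122-3BPW_lig_1.pdbqt-A1-_0 finished
24/07/2012 12:03:49|fightmalaria@home|Output file vina_6230_1343063354.179880-38-1ZRO_A.pdbqt-122-3BPW_lig_1.pdbqt-A1-_0_0 for task vina_6230_1343063354.179880-38-1ZRO_A.pdbqt-122-3BPW_lig_1.pdbqt-A1-_0 absent
"""


def parse_line(line):
    """Split one log line into (timestamp, project, message)."""
    stamp, project, message = line.split("|", 2)
    return datetime.strptime(stamp, "%d/%m/%Y %H:%M:%S"), project, message


def classify(message):
    """Put a message into one of a few hand-picked categories."""
    if message.startswith("[error] Can't rename output file"):
        return "rename_error"
    if message.startswith("Output file") and message.endswith("absent"):
        return "output_absent"
    if message.startswith("Starting task"):
        return "task_start"
    if message.startswith("Computation for task") and message.endswith("finished"):
        return "task_finish"
    return "other"


def summarise(log_text):
    """Count messages per category and return the seconds spanned by the log."""
    counts = Counter()
    first = last = None
    for line in log_text.splitlines():
        if not line.strip():
            continue
        stamp, _project, message = parse_line(line)
        counts[classify(message)] += 1
        first = stamp if first is None else min(first, stamp)
        last = stamp if last is None else max(last, stamp)
    return counts, (last - first).total_seconds()


counts, span_seconds = summarise(LOG)
print(dict(counts))
print(f"span: {span_seconds:.0f} s")
```

For this one WU the span from start to "finished" is 13 seconds, with one rename error and one missing output file, which fits a task that dies at launch rather than one that actually computes anything.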

So far only one machine affected (same as one of the 2 before), but I suspect the other one will also fail in the same way. Working OK on my other machines though.

Cheers,

Al.
17) Message boards : Number crunching : vina 1.07 computation errors (Message 28)
Posted 21 Jul 2012 by ChertseyAl

had the same over here on one host - try to detach and reattach.


Thanks Frank, I've just done that, we'll see what happens when the new WUs arrive :)

Cheers,

Al.


18) Message boards : Number crunching : vina 1.07 computation errors (Message 21)
Posted 21 Jul 2012 by ChertseyAl
I'm seeing errors on 2 of my machines. All WUs failed on both. Sample WU:

http://boinc.ucd.ie/fmah/result.php?resultid=103565

<core_client_version>5.10.45</core_client_version>
<![CDATA[
<message>
- exit code -148 (0xffffff6c)
</message>
<stderr_txt>
wrapper: starting
08:00:17 (1820): wrapper: running ../../projects/boinc.ucd.ie_fmah/vina_1.7_windows_intelx86.exe (--config conf --receptor receptor.pdb --ligand ligand.pdb --out outfile --cpu 1)
can't run app: This application has failed to start because the application configuration is incorrect. Reinstalling the application may fix this problem. (0x36b1)
08:00:17 (1820): called boinc_finish

</stderr_txt>
]]>


Note that plenty of other projects run happily on those machines, and that this project runs happily on other machines.

As these machines are identical to others I have, I'm suspicious that both of these have (fairly) recently had XP installed on bare metal. Which makes me think I'm missing some runtime library or Java version that's been updated on the other machines.

Of course, it might just have been bad luck or coincidence :)
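The 0x36b1 at the end of the "can't run app" line in the stderr above looks like a Windows system error code. A small sketch of decoding it follows; the lookup table is an assumption filled in from memory of winerror.h and holds only one entry, so check it against Microsoft's published list before relying on it.

```python
# Tiny lookup table of Windows system error codes.
# ASSUMPTION: filled in from memory of winerror.h, not from the project or BOINC.
KNOWN_WINDOWS_ERRORS = {
    14001: "ERROR_SXS_CANT_GEN_ACTCTX",
}


def describe(hex_text: str) -> str:
    """Convert a hex error code string to decimal and look up its name."""
    code = int(hex_text, 16)
    name = KNOWN_WINDOWS_ERRORS.get(code, "unknown (not in this small table)")
    return f"{hex_text} = {code}: {name}"


print(describe("0x36b1"))
```

If that mapping is right, it is a side-by-side (manifest) failure, which in practice often means the Visual C++ runtime the executable was built against is not installed. That would be consistent with the fresh XP installs, but it is only a likely explanation, not a confirmed one.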

Cheers,

Al.






Copyright © 2017 Dr Anthony Chubb