Another ECM project backoff and no new WU

Bugs and wishes for the project yoyo@home
Dunckx
PDA user
Posts: 45
Registered: 12.11.2014 09:26

#1 Post by Dunckx » 07.05.2018 17:51

I had forgotten the signs of this, which I should have remembered. It looks like another failure like the one in my post dated 22-01-18 (viewtopic.php?f=56&t=16833#p173094). Anyhow, the log file contains:

07/05/2018 17:16:27 | yoyo@home | [sched_op] Starting scheduler request
07/05/2018 17:16:27 | yoyo@home | [work_fetch] request: CPU (1355855.83 sec, 6.00 inst) NVIDIA GPU (172800.00 sec, 1.00 inst)
07/05/2018 17:16:27 | yoyo@home | Sending scheduler request: To report completed tasks.
07/05/2018 17:16:27 | yoyo@home | Reporting 200 completed tasks
07/05/2018 17:16:27 | yoyo@home | Requesting new tasks for CPU and NVIDIA GPU
07/05/2018 17:16:27 | yoyo@home | [sched_op] CPU work request: 1355855.83 seconds; 6.00 devices
07/05/2018 17:16:27 | yoyo@home | [sched_op] NVIDIA GPU work request: 172800.00 seconds; 1.00 devices
07/05/2018 17:16:34 | yoyo@home | [error] No close tag in scheduler reply
07/05/2018 17:16:34 | yoyo@home | [sched_op] Deferring communication for 03:47:49
07/05/2018 17:16:34 | yoyo@home | [sched_op] Reason: can't parse scheduler reply
07/05/2018 17:16:34 | | [work_fetch] Request work fetch: RPC complete
07/05/2018 17:16:39 | | [work_fetch] ------- start work fetch state -------
07/05/2018 17:16:39 | | [work_fetch] target work buffer: 86400.00 + 86400.00 sec
07/05/2018 17:16:39 | | [work_fetch] --- project states ---
07/05/2018 17:16:39 | yoyo@home | [work_fetch] REC 6963.600 prio -0.002 can't request work: scheduler RPC backoff (13663.74 sec)

Can you just confirm this is what I think it is? The only way I was able to resolve it last time was by removing and reattaching the project, and I hate to lose a day's results, but I think that there's no option.

Do you want the sched_reply.xml file?
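
Editorial note: the backoff in the log above can be cross-checked with a few lines of Python. This is only a minimal sketch; the helper names `hms_to_seconds` and `find_backoff_seconds` are made up for illustration and are not part of BOINC. It converts the "Deferring communication for 03:47:49" duration into seconds, which should roughly match the "scheduler RPC backoff (13663.74 sec)" figure reported five seconds later.

```python
import re

# Matches the duration in a "[sched_op] Deferring communication for HH:MM:SS" line.
DEFER_RE = re.compile(r"Deferring communication for (\d+:\d{2}:\d{2})")


def hms_to_seconds(hms: str) -> int:
    """Convert an HH:MM:SS string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def find_backoff_seconds(log_lines):
    """Return the first deferral duration found in the log, in seconds, or None."""
    for line in log_lines:
        match = DEFER_RE.search(line)
        if match:
            return hms_to_seconds(match.group(1))
    return None


# Two lines copied from the log in the post above.
log = [
    "07/05/2018 17:16:34 | yoyo@home | [error] No close tag in scheduler reply",
    "07/05/2018 17:16:34 | yoyo@home | [sched_op] Deferring communication for 03:47:49",
]

backoff = find_backoff_seconds(log)
print(backoff)  # 13669
```

The result, 13669 seconds at 17:16:34, is consistent with the 13663.74 seconds shown at 17:16:39.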

yoyo
Board member
Posts: 7835
Registered: 17.12.2002 14:09
Location: Berlin

#2 Post by yoyo » 07.05.2018 18:24

The server's reply message does contain the close tag, but the message is longer than the buffer in the BOINC client, so the close tag didn't make it into the buffer. I think the only way is to reattach the project.
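
Editorial note: the effect described here can be illustrated with a toy model. This is not the BOINC client's actual code. The buffer size (4096 bytes), the shape of the reply, and the function names are all assumptions chosen for illustration. The sketch only shows how a fixed-size buffer can cut off a well-formed reply before its closing tag.

```python
CLOSE_TAG = "</scheduler_reply>"
BUFFER_SIZE = 4096  # hypothetical size; the real client buffer size is not given in this thread


def build_reply(n_results: int) -> str:
    """Build a toy scheduler reply that acknowledges n_results reported tasks."""
    acks = "".join(
        f"<result_ack><name>task_{i}</name></result_ack>\n" for i in range(n_results)
    )
    return f"<scheduler_reply>\n{acks}{CLOSE_TAG}\n"


def read_into_buffer(reply: str, buffer_size: int = BUFFER_SIZE) -> str:
    """Model a fixed-size buffer: anything past buffer_size is silently dropped."""
    return reply[:buffer_size]


def has_close_tag(buffered: str) -> bool:
    """Check whether the closing tag survived in the buffered text."""
    return CLOSE_TAG in buffered


for n in (32, 200):
    buffered = read_into_buffer(build_reply(n))
    print(n, "results -> close tag found:", has_close_tag(buffered))
```

In this model the 32-result reply fits and keeps its close tag, while the 200-result reply is truncated, so a parser would see no close tag even though the server sent one.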

Dunckx
PDA user
Posts: 45
Registered: 12.11.2014 09:26

#3 Post by Dunckx » 07.05.2018 18:54

OK, I thought it was so. Pity. Ah well, there goes a day's crunching!

Thanks for letting me know.

Dunckx
PDA user
Posts: 45
Registered: 12.11.2014 09:26

#4 Post by Dunckx » 15.07.2018 20:03

This has now happened to me twice this month. On 2nd July I got the same out-of-work message and project backoff and had to re-attach the project. Today it happened again; this time resetting the project was enough to fix the fault.

My computer has now lost two days' output in two weeks, about 14% wasted effort.

Please can the BOINC programming team do something about the client buffer size problem?! This is getting to be a real drag, as they say.

Thanks.
Dunckx

Dunckx
PDA user
Posts: 45
Registered: 12.11.2014 09:26

#5 Post by Dunckx » 22.07.2018 11:07

Today it has happened yet again, three times in one month! This time resetting the project was not enough; I had to delete it and re-install it. The new BOINC version 7.12.1 hasn't fixed it.

At least three days of crunching lost in 22!

Dunckx

yoyo
Board member
Posts: 7835
Registered: 17.12.2002 14:09
Location: Berlin

#6 Post by yoyo » 22.07.2018 14:24

Maybe you should set your buffers smaller.
In your first post the message log says the client is reporting 200 results. That is a lot, and it might be the reason the reply message is so long that it doesn't fit into the BOINC client buffer.

ChristianB
Board member
Posts: 1915
Registered: 23.02.2010 22:12

#7 Post by ChristianB » 22.07.2018 14:44

What version of BOINC is that? I thought we increased the buffer size of the sched_reply message for the 7.10.x release.

gemini8
Association member
Posts: 3522
Registered: 31.05.2011 10:30
Location: Hannover

#8 Post by gemini8 » 27.07.2018 09:41

The Rakesearch project suggests the following:
Dear crunchers!
If your powerful machines store a large number of tasks and try to report their completion to the project server all at once, they may face timeouts or errors. Please use the max_tasks_reported option in the cc_config.xml file to reduce the size of requests to the project server and speed up task reporting:

<cc_config>
    ...
    <options>
        ...
        <max_tasks_reported>N</max_tasks_reported>
        ...
    </options>
    ...
</cc_config>
For example, with N = 32 the maximum size of the XML requests to the project server decreases significantly and they are processed faster.
Might this help with the aforementioned problem as well?
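
Editorial note: for anyone who prefers to script the change, here is a minimal Python sketch that sets `max_tasks_reported` in cc_config.xml text. The function name `set_max_tasks_reported` is hypothetical, and the sketch works on a string rather than on a real file, because the location of cc_config.xml depends on the installation. Whether this option actually avoids the truncated scheduler reply is not confirmed anywhere in this thread.

```python
import xml.etree.ElementTree as ET


def set_max_tasks_reported(cc_config_xml: str, n: int) -> str:
    """Return cc_config XML text with <max_tasks_reported> set to n.

    Existing options are kept; <options> is created if it is missing.
    """
    root = ET.fromstring(cc_config_xml)
    if root.tag != "cc_config":
        raise ValueError("expected a <cc_config> root element")
    options = root.find("options")
    if options is None:
        options = ET.SubElement(root, "options")
    elem = options.find("max_tasks_reported")
    if elem is None:
        elem = ET.SubElement(options, "max_tasks_reported")
    elem.text = str(n)
    return ET.tostring(root, encoding="unicode")


# Assumed starting point: a config with an empty <options> section.
example = "<cc_config><options></options></cc_config>"
updated = set_max_tasks_reported(example, 32)
print(updated)
```

After saving the result as cc_config.xml, the client has to re-read its config files (or be restarted) before the setting takes effect.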
Regards, Jens
