* tabled RPM build fails before it succeeds
@ 2010-04-16 17:16 Jeff Garzik
2010-04-16 20:19 ` Pete Zaitcev
` (2 more replies)
0 siblings, 3 replies; 4+ messages in thread
From: Jeff Garzik @ 2010-04-16 17:16 UTC (permalink / raw)
To: Project Hail
The same source, same spec.
Build #1 (fails on x86_64):
http://koji.fedoraproject.org/koji/taskinfo?taskID=2119825
Build #2 (fails on i686):
http://koji.fedoraproject.org/koji/taskinfo?taskID=2120174
Build #3 (success on all platforms):
http://koji.fedoraproject.org/koji/taskinfo?taskID=2120215
* Re: tabled RPM build fails before it succeeds
From: Pete Zaitcev @ 2010-04-16 20:19 UTC (permalink / raw)
To: Jeff Garzik; +Cc: Project Hail
On Fri, 16 Apr 2010 13:16:56 -0400
Jeff Garzik <jeff@garzik.org> wrote:
> Build #1 (fails on x86_64):
> http://koji.fedoraproject.org/koji/taskinfo?taskID=2119825
PASS: prep-db
chunkd[17774]: Waiting for CLD PortFile cld.port
cld[17773]: databases up
cld[17773]: Listening on port 34671
cld[17773]: initialized: nodebug
chunkd[17774]: Using CLD port 34671
chunkd[17775]: Listening on auto port 57521
tabled[17777]: Listening on port 36667
chunkd[17775]: New CLD session created, sid 4D81504D6021608C
chunkd[17775]: initialized
PASS: start-daemon
PASS: pid-exists
PASS: daemon-running
server is not up after 100 s
> Build #2 (fails on i686):
> http://koji.fedoraproject.org/koji/taskinfo?taskID=2120174
Same.
I think it's time to figure it out. Going for a 100s delay
kinda sorta helped, but on second thought: why would any
packets get lost at all, on a loopback?
I'll try to see if I can reproduce this by overloading the
test system.
-- Pete
* Re: tabled RPM build fails before it succeeds
From: Pete Zaitcev @ 2010-05-13 0:49 UTC (permalink / raw)
To: Jeff Garzik; +Cc: Project Hail
On Fri, 16 Apr 2010 13:16:56 -0400
Jeff Garzik <jeff@garzik.org> wrote:
> Build #1 (fails on x86_64):
> http://koji.fedoraproject.org/koji/taskinfo?taskID=2119825
I think current tabled is much better; it should not stumble with
the "100s" thing as much. Unfortunately, it's still not completely
reliable. I see this (although very infrequently):
PASS: prep-db
chunkd[19052]: Waiting for CLD PortFile cld.port
cld[19051]: databases up
cld[19051]: Listening on port 56141
cld[19051]: initialized: nodebug
chunkd[19052]: Using CLD port 56141
tabled[19055]: Listening on port 44610
tabled[19055]: New CLD session created, sid 4C7619861D42473D
tabled[19055]: /chunk-default: open failed, retrying
chunkd[19053]: Listening on auto port 48660
PASS: start-daemon
PASS: pid-exists
PASS: daemon-running
tabled[19055]: /chunk-default: open failed, retrying
tabled[19055]: /chunk-default: open failed, retrying
tabled[19055]: /chunk-default: open failed, retrying
tabled[19055]: /chunk-default: open failed, retrying
<------------ at this point tabled exits
cld[19051]: session timeout, addr ::1 sid 4C7619861D42473D
chunkd[19053]: New CLD session created, sid 4C7619861D42473D
chunkd[19053]: initialized
<------------ great, too late
^Cmake[2]: *** [check-TESTS] Interrupt
So, tabled retries, but gives up too early. Of course the knee-jerk
reaction would be to change the max retries from 5 to 10... The
problem is I have a vague suspicion that something is fishy.
The root of the 100s problem was that CLD gets delayed just
a tiny bit, enough for clients to start and fail the first
round of sessions. That's fine, we deal with it now. But in
the above log CLD seems to be available enough for tabled to
initiate at least, so why does Chunk have to retry?
-- Pete
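
[Editor's sketch] The message above weighs raising the retry count from 5 to 10. A common alternative is to bound retries by elapsed time with a growing delay, so a slow peer gets a fixed time budget rather than a fixed number of attempts. The Python below is only an illustration of that idea, not tabled's code; `retry_until` and all of its parameters are hypothetical names, and it would not by itself explain why the open fails in the first place.

```python
import time


def retry_until(operation, deadline_s=30.0, first_delay_s=0.1, max_delay_s=2.0,
                clock=time.monotonic, sleep=time.sleep):
    """Call operation() until it returns a truthy value or deadline_s elapses.

    Returns (result, attempts); result is None if the deadline passed.
    All names and defaults here are assumptions made for illustration.
    """
    start = clock()
    delay = first_delay_s
    attempts = 0
    while True:
        attempts += 1
        result = operation()
        if result:
            return result, attempts
        remaining = deadline_s - (clock() - start)
        if remaining <= 0:
            return None, attempts
        # Never sleep past the deadline; double the delay up to a cap.
        sleep(min(delay, remaining))
        delay = min(delay * 2, max_delay_s)
```

The `clock` and `sleep` arguments exist only so the loop can be exercised without real waiting.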
* Re: tabled RPM build fails before it succeeds
From: Pete Zaitcev @ 2010-05-28 20:14 UTC (permalink / raw)
To: Jeff Garzik; +Cc: Project Hail
On Fri, 16 Apr 2010 13:16:56 -0400
Jeff Garzik <jeff@garzik.org> wrote:
> Build #1 (fails on x86_64):
> http://koji.fedoraproject.org/koji/taskinfo?taskID=2119825
Oh, that's it. I have had it with these build failures; I give up.
I added "sleep 3" after CLD start in test/start-daemon and it fixed
everything. You know, sometimes it's good to be practical.
-- Pete
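
[Editor's sketch] A fixed "sleep 3" works when startup takes less than three seconds. A common alternative is to poll for a readiness signal with a timeout. The logs in this thread show chunkd waiting for a CLD port file named cld.port, so the sketch below polls for such a file. It assumes the file contains just the port number as decimal text, which the thread does not confirm; `wait_for_port_file` is a hypothetical helper, and the thread does not establish that this would have cured the failures.

```python
import time


def wait_for_port_file(path, timeout_s=10.0, poll_s=0.1):
    """Poll until `path` holds a decimal port number; return it as an int.

    Returns None if the file is missing or not yet valid when timeout_s
    elapses. The file format (digits only) is an assumption.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            with open(path) as f:
                text = f.read().strip()
            if text.isdigit():
                return int(text)
        except FileNotFoundError:
            pass  # the daemon has not written the file yet
        if time.monotonic() >= deadline:
            return None
        time.sleep(poll_s)
```

A test script could call this right after starting the daemon and fail fast on None, instead of sleeping for a fixed interval.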