* timeout 120 teuthology-killl is highly recommended
@ 2015-07-21 16:13 Loic Dachary
2015-07-21 16:29 ` Gregory Farnum
2015-07-21 16:33 ` Yuri Weinstein
0 siblings, 2 replies; 5+ messages in thread
From: Loic Dachary @ 2015-07-21 16:13 UTC (permalink / raw)
To: Ceph Development
Hi Ceph,
Today I did something wrong and that blocked the lab for a good half hour.
a) I ran two teuthology-kill instances simultaneously, which made them deadlock each other.
b) I let them run unattended, only to come back to the terminal 30 minutes later and find them stuck.
Sure, two simultaneous teuthology-kill runs should not deadlock, and that needs to be fixed. But the easy workaround is to just not let it run forever. Even for ~200 jobs it takes at most a minute or two. If it takes longer, it probably means another teuthology-kill is competing, and it should be interrupted and restarted later. From now on I'll do
timeout 120 teuthology-kill .... || echo FAIL!
as a generic safeguard.
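The one-liner above prints the same FAIL! for a timeout and for any other failure. A minimal sketch of a wrapper that tells the two apart, assuming GNU coreutils timeout (which exits with status 124 when it has to stop the command); run_with_deadline is a hypothetical helper name, not part of teuthology:

```shell
# Sketch only: run a command under a deadline and report why it stopped.
# run_with_deadline is a hypothetical name, not a teuthology command.
run_with_deadline() {
    limit=$1
    shift
    timeout "$limit" "$@"
    status=$?
    if [ "$status" -eq 124 ]; then
        # GNU timeout exits with 124 when the deadline expired.
        echo "FAIL: '$*' still running after ${limit}s, interrupted" >&2
    elif [ "$status" -ne 0 ]; then
        echo "FAIL: '$*' exited with status $status" >&2
    fi
    return "$status"
}
```

It would be called as run_with_deadline 120 teuthology-kill .... with the same arguments as before. Keeping the limit as a parameter makes it easy to raise for runs that are expected to take longer.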
Apologies for the troubles.
--
Loïc Dachary, Artisan Logiciel Libre
* Re: timeout 120 teuthology-killl is highly recommended
2015-07-21 16:13 timeout 120 teuthology-killl is highly recommended Loic Dachary
@ 2015-07-21 16:29 ` Gregory Farnum
2015-07-21 16:33 ` Yuri Weinstein
1 sibling, 0 replies; 5+ messages in thread
From: Gregory Farnum @ 2015-07-21 16:29 UTC (permalink / raw)
To: Loic Dachary; +Cc: Ceph Development
On Tue, Jul 21, 2015 at 5:13 PM, Loic Dachary <loic@dachary.org> wrote:
> Hi Ceph,
>
> Today I did something wrong and that blocked the lab for a good half hour.
>
> a) I ran two teuthology-kill instances simultaneously, which made them deadlock each other.
> b) I let them run unattended, only to come back to the terminal 30 minutes later and find them stuck.
>
> Sure, two simultaneous teuthology-kill runs should not deadlock, and that needs to be fixed. But the easy workaround is to just not let it run forever. Even for ~200 jobs it takes at most a minute or two.
Mmm, I'm not sure that's correct if you're killing jobs which are
actually running: teuthology-nuke (which it will invoke) can take a
while, and you definitely don't want to time that out! So beware with
in-progress runs.
-Greg
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
* Re: timeout 120 teuthology-killl is highly recommended
2015-07-21 16:13 timeout 120 teuthology-killl is highly recommended Loic Dachary
2015-07-21 16:29 ` Gregory Farnum
@ 2015-07-21 16:33 ` Yuri Weinstein
2015-07-21 16:38 ` Yuri Weinstein
2015-07-21 16:42 ` Loic Dachary
1 sibling, 2 replies; 5+ messages in thread
From: Yuri Weinstein @ 2015-07-21 16:33 UTC (permalink / raw)
To: Loic Dachary; +Cc: Ceph Development
Loic
I don't run teuthology-kill simultaneously, only sequentially.
As for run time, just as a note: when we use the 'stale' arg and it invokes the ipmitool interface, it does take a while to finish.
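Running strictly one at a time could also be enforced rather than left to habit. A minimal sketch, assuming flock(1) from util-linux is available; run_exclusive and the LOCKFILE path are hypothetical names chosen for illustration, and the lock only protects invocations that go through this wrapper:

```shell
# Sketch only: refuse to start while another wrapped invocation holds the lock.
# run_exclusive and LOCKFILE are hypothetical; flock(1) comes from util-linux.
LOCKFILE=${LOCKFILE:-/tmp/teuthology-kill.lock}

run_exclusive() {
    # -n: fail immediately instead of waiting if the lock is already held.
    # -E 75: exit status flock uses for that conflict case.
    flock -n -E 75 "$LOCKFILE" "$@"
    status=$?
    if [ "$status" -eq 75 ]; then
        echo "another run holds $LOCKFILE, try again later" >&2
    fi
    return "$status"
}
```

With this, run_exclusive teuthology-kill .... would exit right away with status 75 if a second copy is started while the first is still working, instead of the two competing. Note that a command which itself exits with 75 would be reported the same way.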
Thx
YuriW
* Re: timeout 120 teuthology-killl is highly recommended
2015-07-21 16:33 ` Yuri Weinstein
@ 2015-07-21 16:38 ` Yuri Weinstein
2015-07-21 16:42 ` Loic Dachary
1 sibling, 0 replies; 5+ messages in thread
From: Yuri Weinstein @ 2015-07-21 16:38 UTC (permalink / raw)
To: Loic Dachary; +Cc: Ceph Development
I was thinking of teuthology-nuke, though!
Thx
YuriW
* Re: timeout 120 teuthology-killl is highly recommended
2015-07-21 16:33 ` Yuri Weinstein
2015-07-21 16:38 ` Yuri Weinstein
@ 2015-07-21 16:42 ` Loic Dachary
1 sibling, 0 replies; 5+ messages in thread
From: Loic Dachary @ 2015-07-21 16:42 UTC (permalink / raw)
To: Yuri Weinstein; +Cc: Ceph Development
Greg & Yuri: I stand corrected. I should have been less assertive on a topic I know little about. Thanks!
--
Loïc Dachary, Artisan Logiciel Libre