* timeout 120 teuthology-killl is highly recommended
@ 2015-07-21 16:13 Loic Dachary
  2015-07-21 16:29 ` Gregory Farnum
  2015-07-21 16:33 ` Yuri Weinstein
  0 siblings, 2 replies; 5+ messages in thread
From: Loic Dachary @ 2015-07-21 16:13 UTC (permalink / raw)
  To: Ceph Development

Hi Ceph,

Today I did something wrong and that blocked the lab for a good half hour. 

a) I ran two teuthology-kill processes simultaneously, which made them deadlock each other.
b) I let them run unattended, only to come back to the terminal 30 minutes later and find them stuck.

Sure, two simultaneous teuthology-kill runs should not deadlock, and that needs to be fixed. But the easy workaround is simply not to let it run forever. Even for ~200 jobs it takes at most a minute or two. If it takes longer, it probably means another teuthology-kill is competing, and the run should be interrupted and restarted later. From now on I'll do

timeout 120 teuthology-kill .... || echo FAIL!

as a generic safeguard.
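
A minimal sketch of how this safeguard could be wrapped so that a timeout is reported separately from an ordinary failure. It assumes GNU coreutils `timeout`, which exits with status 124 when the time limit is reached. The helper name `run_with_limit` is hypothetical and not part of teuthology; the arguments to teuthology-kill stay elided as in the command above.

```shell
#!/bin/sh
# Hypothetical helper (not part of teuthology):
#   run_with_limit SECONDS COMMAND [ARGS...]
# Runs COMMAND under a time limit and says why it stopped.
run_with_limit() {
    limit=$1
    shift
    status=0
    # "|| status=$?" records a non-zero exit without tripping "set -e"
    timeout "$limit" "$@" || status=$?
    if [ "$status" -eq 124 ]; then
        # GNU timeout exits with 124 when the limit was reached
        echo "TIMEOUT after ${limit}s: $*" >&2
    elif [ "$status" -ne 0 ]; then
        echo "FAIL (exit $status): $*" >&2
    fi
    return "$status"
}

# Usage, with the same 120 second limit as above:
#   run_with_limit 120 teuthology-kill ....
```

The limit is a parameter rather than a constant so that it can be raised for runs that legitimately take longer than a couple of minutes.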

Apologies for the troubles.

-- 
Loïc Dachary, Artisan Logiciel Libre



* Re: timeout 120 teuthology-killl is highly recommended
  2015-07-21 16:13 timeout 120 teuthology-killl is highly recommended Loic Dachary
@ 2015-07-21 16:29 ` Gregory Farnum
  2015-07-21 16:33 ` Yuri Weinstein
  1 sibling, 0 replies; 5+ messages in thread
From: Gregory Farnum @ 2015-07-21 16:29 UTC (permalink / raw)
  To: Loic Dachary; +Cc: Ceph Development

On Tue, Jul 21, 2015 at 5:13 PM, Loic Dachary <loic@dachary.org> wrote:
> Hi Ceph,
>
> Today I did something wrong and that blocked the lab for a good half hour.
>
> a) I ran two teuthology-kill simultaneously and that makes them deadlock each other
> b) I let them run unattended only to come back to the terminal 30 minutes later and see them stuck.
>
> Sure, two teuthology-kill simultaneously should not deadlock and that needs to be fixed. But the easy workaround to avoid that trouble is to just not let it run forever. Even for ~200 jobs it takes at most a minute or two.

Mmm, I'm not sure that's correct if you're killing jobs which are
actually running: teuthology-nuke (which it will invoke) can take a
while and you definitely don't want to time that out! So beware with
in-progress runs.
-Greg
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: timeout 120 teuthology-killl is highly recommended
  2015-07-21 16:13 timeout 120 teuthology-killl is highly recommended Loic Dachary
  2015-07-21 16:29 ` Gregory Farnum
@ 2015-07-21 16:33 ` Yuri Weinstein
  2015-07-21 16:38   ` Yuri Weinstein
  2015-07-21 16:42   ` Loic Dachary
  1 sibling, 2 replies; 5+ messages in thread
From: Yuri Weinstein @ 2015-07-21 16:33 UTC (permalink / raw)
  To: Loic Dachary; +Cc: Ceph Development

Loic

I don't run teuthology-kill simultaneously, only sequentially.
As for run time, just as a note: when we use the 'stale' arg and it invokes the ipmitool interface, it does take a while to finish.


Thx
YuriW


* Re: timeout 120 teuthology-killl is highly recommended
  2015-07-21 16:33 ` Yuri Weinstein
@ 2015-07-21 16:38   ` Yuri Weinstein
  2015-07-21 16:42   ` Loic Dachary
  1 sibling, 0 replies; 5+ messages in thread
From: Yuri Weinstein @ 2015-07-21 16:38 UTC (permalink / raw)
  To: Loic Dachary; +Cc: Ceph Development

I was thinking of teuthology-nuke, though!


Thx
YuriW


* Re: timeout 120 teuthology-killl is highly recommended
  2015-07-21 16:33 ` Yuri Weinstein
  2015-07-21 16:38   ` Yuri Weinstein
@ 2015-07-21 16:42   ` Loic Dachary
  1 sibling, 0 replies; 5+ messages in thread
From: Loic Dachary @ 2015-07-21 16:42 UTC (permalink / raw)
  To: Yuri Weinstein; +Cc: Ceph Development

Greg & Yuri: I stand corrected. I should have been less assertive about a topic I know little about. Thanks!


-- 
Loïc Dachary, Artisan Logiciel Libre


