public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH] remove name length check in a workqueue
@ 2005-08-10 14:19 James Bottomley
  2005-08-10 14:45 ` Ingo Molnar
  0 siblings, 1 reply; 11+ messages in thread
From: James Bottomley @ 2005-08-10 14:19 UTC (permalink / raw)
  To: Andrew Morton, mingo; +Cc: Linux Kernel, SCSI Mailing List

Ingo,

This has been in the workqueue code in day one, for no real reason that
I can see.  We just tripped over it in SCSI because the fibre channel
transport class creates one workqueue per host with the name scsi_wq_%d
which trips this after we get to 100.  Unfortunately we just came across
someone with > 100 host adapters ...

I think the solution is just to get rid of the artificial limit.

James

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -308,8 +308,6 @@ struct workqueue_struct *__create_workqu
 	struct workqueue_struct *wq;
 	struct task_struct *p;
 
-	BUG_ON(strlen(name) > 10);
-
 	wq = kmalloc(sizeof(*wq), GFP_KERNEL);
 	if (!wq)
 		return NULL;



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] remove name length check in a workqueue
  2005-08-10 14:19 [PATCH] remove name length check in a workqueue James Bottomley
@ 2005-08-10 14:45 ` Ingo Molnar
  2005-08-10 17:05   ` Andrew Morton
  0 siblings, 1 reply; 11+ messages in thread
From: Ingo Molnar @ 2005-08-10 14:45 UTC (permalink / raw)
  To: James Bottomley; +Cc: Andrew Morton, Linux Kernel, SCSI Mailing List


yeah ... cannot remember why i have done it originally :-|

Acked-by: Ingo Molnar <mingo@redhat.com>

	Ingo

On Wed, 10 Aug 2005, James Bottomley wrote:

> Ingo,
> 
> This has been in the workqueue code in day one, for no real reason that
> I can see.  We just tripped over it in SCSI because the fibre channel
> transport class creates one workqueue per host with the name scsi_wq_%d
> which trips this after we get to 100.  Unfortunately we just came across
> someone with > 100 host adapters ...
> 
> I think the solution is just to get rid of the artificial limit.
> 
> James
> 
> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -308,8 +308,6 @@ struct workqueue_struct *__create_workqu
>  	struct workqueue_struct *wq;
>  	struct task_struct *p;
>  
> -	BUG_ON(strlen(name) > 10);
> -
>  	wq = kmalloc(sizeof(*wq), GFP_KERNEL);
>  	if (!wq)
>  		return NULL;
> 
> 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] remove name length check in a workqueue
  2005-08-10 14:45 ` Ingo Molnar
@ 2005-08-10 17:05   ` Andrew Morton
  2005-08-10 17:24     ` James Bottomley
  0 siblings, 1 reply; 11+ messages in thread
From: Andrew Morton @ 2005-08-10 17:05 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: James.Bottomley, linux-kernel, linux-scsi

Ingo Molnar <mingo@redhat.com> wrote:
>
> 
> yeah ... cannot remember why i have done it originally :-|
> 

Might it be to do with sizeof(task_struct.comm)?

> 
> On Wed, 10 Aug 2005, James Bottomley wrote:
> 
> > Ingo,
> > 
> > This has been in the workqueue code in day one, for no real reason that
> > I can see.  We just tripped over it in SCSI because the fibre channel
> > transport class creates one workqueue per host with the name scsi_wq_%d
> > which trips this after we get to 100.  Unfortunately we just came across
> > someone with > 100 host adapters ...
> > 
> > I think the solution is just to get rid of the artificial limit.
> > 
> > James
> > 
> > diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> > --- a/kernel/workqueue.c
> > +++ b/kernel/workqueue.c
> > @@ -308,8 +308,6 @@ struct workqueue_struct *__create_workqu
> >  	struct workqueue_struct *wq;
> >  	struct task_struct *p;
> >  
> > -	BUG_ON(strlen(name) > 10);
> > -
> >  	wq = kmalloc(sizeof(*wq), GFP_KERNEL);
> >  	if (!wq)
> >  		return NULL;
> > 
> > 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] remove name length check in a workqueue
  2005-08-10 17:05   ` Andrew Morton
@ 2005-08-10 17:24     ` James Bottomley
  2005-08-10 17:37       ` Andrew Morton
  0 siblings, 1 reply; 11+ messages in thread
From: James Bottomley @ 2005-08-10 17:24 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Ingo Molnar, Linux Kernel, SCSI Mailing List

On Wed, 2005-08-10 at 10:05 -0700, Andrew Morton wrote:
> Ingo Molnar <mingo@redhat.com> wrote:
> > yeah ... cannot remember why i have done it originally :-|

> Might it be to do with sizeof(task_struct.comm)?

But that's 16 bytes not 10; and anyway, it doesn't have to be unique;
set_task_comm just does a strlcpy from the name, so it will be truncated
(same as for a binary with > 15 character name).

James



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] remove name length check in a workqueue
  2005-08-10 17:24     ` James Bottomley
@ 2005-08-10 17:37       ` Andrew Morton
  2005-08-10 17:54         ` James Bottomley
  0 siblings, 1 reply; 11+ messages in thread
From: Andrew Morton @ 2005-08-10 17:37 UTC (permalink / raw)
  To: James Bottomley; +Cc: mingo, linux-kernel, linux-scsi

James Bottomley <James.Bottomley@SteelEye.com> wrote:
>
> On Wed, 2005-08-10 at 10:05 -0700, Andrew Morton wrote:
> > Ingo Molnar <mingo@redhat.com> wrote:
> > > yeah ... cannot remember why i have done it originally :-|
> 
> > Might it be to do with sizeof(task_struct.comm)?
> 
> But that's 16 bytes not 10;

Well we stick a "/%d" at the end, which gets us up to 14 chars, assuming
NR_CPUS < 1000

> and anyway, it doesn't have to be unique;
> set_task_comm just does a strlcpy from the name, so it will be truncated
> (same as for a binary with > 15 character name).

Yup.  But it'd be fairly silly to go adding the /%d, only to have it
truncated off again.

We could truncate the name before adding the CPU number, but it sounds
saner to just prevent anyone passing in excessively long names.  Via
BUG_ON, say ;)

What's the actual problem?

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] remove name length check in a workqueue
  2005-08-10 17:37       ` Andrew Morton
@ 2005-08-10 17:54         ` James Bottomley
  2005-08-10 18:27           ` Andrew Morton
  2005-08-10 18:49           ` Frederic TEMPORELLI - astek
  0 siblings, 2 replies; 11+ messages in thread
From: James Bottomley @ 2005-08-10 17:54 UTC (permalink / raw)
  To: Andrew Morton; +Cc: mingo, Linux Kernel, SCSI Mailing List

On Wed, 2005-08-10 at 10:37 -0700, Andrew Morton wrote:
> > and anyway, it doesn't have to be unique;
> > set_task_comm just does a strlcpy from the name, so it will be truncated
> > (same as for a binary with > 15 character name).
> 
> Yup.  But it'd be fairly silly to go adding the /%d, only to have it
> truncated off again.

Well, but the other alternative is that we hit arbitrary BUG_ON() limits
in systems that create numbered workqueues which is rather contrary to
our scaleability objectives, isn't it?

I think I'd rather the name truncation than have to respond to kernel
BUG()'s.  If someone really has a problem with the appearance of ps,
they can always increase TASK_COMM_LEN.

> We could truncate the name before adding the CPU number, but it sounds
> saner to just prevent anyone passing in excessively long names.  Via
> BUG_ON, say ;)
> 
> What's the actual problem?

What I posted originally; the current SCSI format for a workqueue:
scsi_wq_%d hits the bug after the host number rises to 100, which has
been seen by some enterprise person with > 100 HBAs.

The reason for this name is that the error handler thread is called
scsi_eh_%d; so we could rename all our threads to avoid this, but one
day someone will come along with a huge enough machine to hit whatever
limit we squeeze it down to.

James



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] remove name length check in a workqueue
  2005-08-10 17:54         ` James Bottomley
@ 2005-08-10 18:27           ` Andrew Morton
  2005-08-11 14:37             ` Simon Derr
  2005-08-11 16:22             ` Coywolf Qi Hunt
  2005-08-10 18:49           ` Frederic TEMPORELLI - astek
  1 sibling, 2 replies; 11+ messages in thread
From: Andrew Morton @ 2005-08-10 18:27 UTC (permalink / raw)
  To: James Bottomley; +Cc: mingo, linux-kernel, linux-scsi

James Bottomley <James.Bottomley@SteelEye.com> wrote:
>
> On Wed, 2005-08-10 at 10:37 -0700, Andrew Morton wrote:
> > > and anyway, it doesn't have to be unique;
> > > set_task_comm just does a strlcpy from the name, so it will be truncated
> > > (same as for a binary with > 15 character name).
> > 
> > Yup.  But it'd be fairly silly to go adding the /%d, only to have it
> > truncated off again.
> 
> Well, but the other alternative is that we hit arbitrary BUG_ON() limits
> in systems that create numbered workqueues which is rather contrary to
> our scaleability objectives, isn't it?

Another alternative is to stop passing in such long strings ;)

> > What's the actual problem?
> 
> What I posted originally; the current SCSI format for a workqueue:
> scsi_wq_%d hits the bug after the host number rises to 100, which has
> been seen by some enterprise person with > 100 HBAs.
> 
> The reason for this name is that the error handler thread is called
> scsi_eh_%d; so we could rename all our threads to avoid this, but one
> day someone will come along with a huge enough machine to hit whatever
> limit we squeeze it down to.

OK, well scsi is using single-threaded workqueues anyway.  So we could do:

	if (singlethread)
		BUG_ON(strlen(name) > sizeof(task_struct.comm) - 1);
	else
		BUG_ON(strlen(name) > sizeof(task_struct.comm) - 1 - 4);

which gets you 10,000,000 HBAs.   Enough?

Ho hum, OK, let's just kill the BUG_ON.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] remove name length check in a workqueue
  2005-08-10 17:54         ` James Bottomley
  2005-08-10 18:27           ` Andrew Morton
@ 2005-08-10 18:49           ` Frederic TEMPORELLI - astek
  1 sibling, 0 replies; 11+ messages in thread
From: Frederic TEMPORELLI - astek @ 2005-08-10 18:49 UTC (permalink / raw)
  To: Linux Kernel; +Cc: James Bottomley, SCSI Mailing List

James Bottomley a écrit :

>Well, but the other alternative is that we hit arbitrary BUG_ON() limits
>in systems that create numbered workqueues which is rather contrary to
>our scaleability objectives, isn't it?
>
>I think I'd rather the name truncation than have to respond to kernel
>BUG()'s.  If someone really has a problem with the appearance of ps,
>they can always increase TASK_COMM_LEN.
>
>  
>
>>We could truncate the name before adding the CPU number, but it sounds
>>saner to just prevent anyone passing in excessively long names.  Via
>>BUG_ON, say ;)
>>
>>What's the actual problem?
>>    
>>
>
>What I posted originally; the current SCSI format for a workqueue:
>scsi_wq_%d hits the bug after the host number rises to 100, which has
>been seen by some enterprise person with > 100 HBAs.
>
>The reason for this name is that the error handler thread is called
>scsi_eh_%d; so we could rename all our threads to avoid this, but one
>day someone will come along with a huge enough machine to hit whatever
>limit we squeeze it down to.
>
>James
>
>  
>
In scsi layer (drivers/scsi/hosts.c), wq name length is limited to 
KOBJ_NAME_LEN due to the snprintf .
may be nice to use same limit if BUG_ON is kept... but why NULL isn't 
returned, then ?  ;-)

--
Tempo

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] remove name length check in a workqueue
  2005-08-10 18:27           ` Andrew Morton
@ 2005-08-11 14:37             ` Simon Derr
  2005-08-11 16:22             ` Coywolf Qi Hunt
  1 sibling, 0 replies; 11+ messages in thread
From: Simon Derr @ 2005-08-11 14:37 UTC (permalink / raw)
  To: Andrew Morton; +Cc: James Bottomley, mingo, linux-kernel, linux-scsi

On Wed, 10 Aug 2005, Andrew Morton wrote:

> > What I posted originally; the current SCSI format for a workqueue:
> > scsi_wq_%d hits the bug after the host number rises to 100, which has
> > been seen by some enterprise person with > 100 HBAs.
> > 
> > The reason for this name is that the error handler thread is called
> > scsi_eh_%d; so we could rename all our threads to avoid this, but one
> > day someone will come along with a huge enough machine to hit whatever
> > limit we squeeze it down to.
> 
> OK, well scsi is using single-threaded workqueues anyway.  So we could do:
> 
> 	if (singlethread)
> 		BUG_ON(strlen(name) > sizeof(task_struct.comm) - 1);
> 	else
> 		BUG_ON(strlen(name) > sizeof(task_struct.comm) - 1 - 4);
> 
> which gets you 10,000,000 HBAs.   Enough?

I suppose so, but the problem is slightly worse:

One does not need 100 HBAs to trigger the BUG_ON: 

It is sufficient to have a few HBAs and to insmod/rmmod the driver a few 
times.

Since the host_no is choosen with a mere counter increment 
in scsi_host_alloc():

      shost->host_no = scsi_host_next_hn++; /* XXX(hch): still racy */

Unused `host_no's are not reused and the 100 limit is reached even on 
smaller systems.

I have no idea of why someone would do repeated insmod/rmmods, though.
(But someone did).

	Simon.


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] remove name length check in a workqueue
  2005-08-10 18:27           ` Andrew Morton
  2005-08-11 14:37             ` Simon Derr
@ 2005-08-11 16:22             ` Coywolf Qi Hunt
  1 sibling, 0 replies; 11+ messages in thread
From: Coywolf Qi Hunt @ 2005-08-11 16:22 UTC (permalink / raw)
  To: Andrew Morton; +Cc: James Bottomley, mingo, linux-kernel, linux-scsi

On 8/11/05, Andrew Morton <akpm@osdl.org> wrote:
> James Bottomley <James.Bottomley@SteelEye.com> wrote:
> >
> > On Wed, 2005-08-10 at 10:37 -0700, Andrew Morton wrote:
> > > > and anyway, it doesn't have to be unique;
> > > > set_task_comm just does a strlcpy from the name, so it will be truncated
> > > > (same as for a binary with > 15 character name).
> > >
> > > Yup.  But it'd be fairly silly to go adding the /%d, only to have it
> > > truncated off again.
> >
> > Well, but the other alternative is that we hit arbitrary BUG_ON() limits
> > in systems that create numbered workqueues which is rather contrary to
> > our scaleability objectives, isn't it?
> 
> Another alternative is to stop passing in such long strings ;)
> 
> > > What's the actual problem?
> >
> > What I posted originally; the current SCSI format for a workqueue:
> > scsi_wq_%d hits the bug after the host number rises to 100, which has
> > been seen by some enterprise person with > 100 HBAs.
> >
> > The reason for this name is that the error handler thread is called
> > scsi_eh_%d; so we could rename all our threads to avoid this, but one
> > day someone will come along with a huge enough machine to hit whatever
> > limit we squeeze it down to.
> 
> OK, well scsi is using single-threaded workqueues anyway.  So we could do:
> 
>         if (singlethread)
>                 BUG_ON(strlen(name) > sizeof(task_struct.comm) - 1);
>         else
>                 BUG_ON(strlen(name) > sizeof(task_struct.comm) - 1 - 4);
> 
> which gets you 10,000,000 HBAs.   Enough?
> 
> Ho hum, OK, let's just kill the BUG_ON.

s/BUG_ON/WARN_ON/ ?

-- 
Coywolf Qi Hunt
http://ahbl.org/~coywolf/

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] remove name length check in a workqueue
@ 2005-08-11 18:48 Andreas Herrmann
  0 siblings, 0 replies; 11+ messages in thread
From: Andreas Herrmann @ 2005-08-11 18:48 UTC (permalink / raw)
  To: Simon Derr
  Cc: Andrew Morton, James Bottomley, linux-kernel, linux-scsi,
	linux-scsi-owner, mingo

        Simon Derr <Simon.Derr@bull.net> wrote:

  > It is sufficient to have a few HBAs and to insmod/rmmod the driver a 
few 
  > times.

  > Since the host_no is choosen with a mere counter increment 
  > in scsi_host_alloc():

  >       shost->host_no = scsi_host_next_hn++; /* XXX(hch): still racy */

  > Unused `host_no's are not reused and the 100 limit is reached even on 
  > smaller systems.

  > I have no idea of why someone would do repeated insmod/rmmods, though.
  > (But someone did).

You even don't have to use insmod/rmmod.  On s390 (using zfcp) it
suffices to take adapters offline and online (triggered via VM,
hardware, or within Linux). Just do so about 100 times ... You
know the result.


Regards,

Andreas

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2005-08-11 18:48 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2005-08-10 14:19 [PATCH] remove name length check in a workqueue James Bottomley
2005-08-10 14:45 ` Ingo Molnar
2005-08-10 17:05   ` Andrew Morton
2005-08-10 17:24     ` James Bottomley
2005-08-10 17:37       ` Andrew Morton
2005-08-10 17:54         ` James Bottomley
2005-08-10 18:27           ` Andrew Morton
2005-08-11 14:37             ` Simon Derr
2005-08-11 16:22             ` Coywolf Qi Hunt
2005-08-10 18:49           ` Frederic TEMPORELLI - astek
  -- strict thread matches above, loose matches on Subject: below --
2005-08-11 18:48 Andreas Herrmann

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox