* GFP_ATOMIC allocations...
@ 2002-08-29 0:25 Doug Ledford
2002-08-29 0:47 ` Patrick Mansfield
0 siblings, 1 reply; 13+ messages in thread
From: Doug Ledford @ 2002-08-29 0:25 UTC (permalink / raw)
To: linux-scsi
I'm looking through some of the new code that has been put in the kernel
scsi subsystem lately and I'm seeing a few of the same mistakes that the
old code made. So, I'm pointing out one of them here. There are lots of
places in the SCSI mid layer that are called outside of any spinlocks and
not from interrupt context where there is absolutely no need to use
GFP_ATOMIC allocations for memory. In general, if you don't need to use
GFP_ATOMIC, then don't use it. You're more likely to fail the allocation
and therefore fail the operation. So, if you are working on a section of
code, please think about this when doing allocations and do whatever is
right for the code snippet you are currently hacking. Fixing places in
the existing code that use GFP_ATOMIC needlessly is a plus ;-)
--
Doug Ledford <dledford@redhat.com> 919-754-3700 x44233
Red Hat, Inc.
1801 Varsity Dr.
Raleigh, NC 27606
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: GFP_ATOMIC allocations...
2002-08-29 0:25 Doug Ledford
@ 2002-08-29 0:47 ` Patrick Mansfield
2002-08-29 1:42 ` Doug Ledford
0 siblings, 1 reply; 13+ messages in thread
From: Patrick Mansfield @ 2002-08-29 0:47 UTC (permalink / raw)
To: linux-scsi
On Wed, Aug 28, 2002 at 08:25:34PM -0400, Doug Ledford wrote:
> I'm looking through some of the new code that has been put in the kernel
> scsi subsystem lately and I'm seeing a few of the same mistakes that the
> old code made. So, I'm pointing out one of them here. There are lots of
> places in the SCSI mid layer that are called outside of any spinlocks and
> not from interrupt context where there is absolutely no need to use
> GFP_ATOMIC allocations for memory. In general, if you don't need to use
> GFP_ATOMIC, then don't use it. You're more likely to fail the allocation
> and therefore fail the operation. So, if you are working on a section of
> code, please think about this when doing allocations and do whatever is
> right for the code snippet you are currently hacking. Fixing places in
> the existing code that use GFP_ATOMIC needlessly is a plus ;-)
So, you think all (or most) of the GFP_ATOMIC's in scsi_scan.c should
be GFP_KERNEL? All the kmalloc calls should be at boot time, or via
insmod.
I was wondering about them, but left them to match the previous
scsi_scan.c code.
Do GFP_KERNEL alloc failures during boot time just return NULL?
(Given that there is really no memory left.) I'd hope so, but was
never sure.
I suppose I could change them and boot with mem=something-low and
see what happens.
-- Patrick
* Re: GFP_ATOMIC allocations...
2002-08-29 0:47 ` Patrick Mansfield
@ 2002-08-29 1:42 ` Doug Ledford
2002-08-29 10:47 ` Alan Cox
` (2 more replies)
0 siblings, 3 replies; 13+ messages in thread
From: Doug Ledford @ 2002-08-29 1:42 UTC (permalink / raw)
To: Patrick Mansfield; +Cc: linux-scsi
On Wed, Aug 28, 2002 at 05:47:37PM -0700, Patrick Mansfield wrote:
> So, you think all (or most) of the GFP_ATOMIC's in scsi_scan.c should
> be GFP_KERNEL?
Yep. Until such time as an LLDD actually calls scsi_scan from an interrupt
handler to handle a fibre fabric that has just come back up, it is safe
for this code to use GFP_KERNEL. As for the Fibre Channel issue, I
actually think that it needs to notify the eh thread of the change in loop
state and let the eh thread do the rescan from thread context
instead of interrupt context, so this won't be an issue when we support
fibre loop transitions either.
> All the kmalloc calls should be at boot time, or via
> insmod.
Right, process context with no locks held.
> I was wondering about them, but left them to match the previous
> scsi_scan.c code.
I figured. That's why I pointed it out ;-)
> Do GFP_KERNEL alloc failures during boot time just return NULL?
GFP_ATOMIC returns NULL on failure, GFP_KERNEL *should* never fail until
we get to a true OOM condition because it will block and wait for the vm
subsystem to free us up some space. Under true OOM conditions it will
return NULL AFAIK.
> (Given that there is really no memory left.) I'd hope so, but was
> never sure.
>
> I suppose I could change them and boot with mem=something-low and
> see what happens.
I've known people who had actual, real problems because of the GFP_ATOMIC
allocations in this code area. It was related to bringing up a fibre
controller that had 100s of disks hanging off of it and therefore wanted
to do a *LOT* of atomic allocations in a short period of time, but it was
a real problem nonetheless. Making the allocations non-atomic would
allow the scanning to block waiting on more RAM instead of bailing on
devices.
--
Doug Ledford <dledford@redhat.com> 919-754-3700 x44233
Red Hat, Inc.
1801 Varsity Dr.
Raleigh, NC 27606
* Re: GFP_ATOMIC allocations...
2002-08-29 1:42 ` Doug Ledford
@ 2002-08-29 10:47 ` Alan Cox
2002-08-29 15:58 ` Doug Ledford
2002-08-29 17:10 ` Patrick Mansfield
2002-08-30 16:22 ` Patrick Mansfield
2 siblings, 1 reply; 13+ messages in thread
From: Alan Cox @ 2002-08-29 10:47 UTC (permalink / raw)
To: Doug Ledford; +Cc: Patrick Mansfield, linux-scsi
On Thu, 2002-08-29 at 02:42, Doug Ledford wrote:
> > Do GFP_KERNEL alloc failures during boot time just return NULL?
>
> GFP_ATOMIC returns NULL on failure, GFP_KERNEL *should* never fail until
> we get to a true OOM condition because it will block and wait for the vm
> subsystem to free us up some space. Under true OOM conditions it will
> return NULL AFAIK.
Or maybe deadlock if it's happening due to I/O. If it wants to flush a
page out to that scsi device and the scsi device is doing the allocation,
life gets very unpleasant. SCSI may not be coming from block layer
access with PF_MEM* set either; think about /proc and scsi-generic.
"Here be dragons"
* Re: GFP_ATOMIC allocations...
2002-08-29 10:47 ` Alan Cox
@ 2002-08-29 15:58 ` Doug Ledford
0 siblings, 0 replies; 13+ messages in thread
From: Doug Ledford @ 2002-08-29 15:58 UTC (permalink / raw)
To: Alan Cox; +Cc: Patrick Mansfield, linux-scsi
On Thu, Aug 29, 2002 at 11:47:55AM +0100, Alan Cox wrote:
> On Thu, 2002-08-29 at 02:42, Doug Ledford wrote:
> > > Do GFP_KERNEL alloc failures during boot time just return NULL?
> >
> > GFP_ATOMIC returns NULL on failure, GFP_KERNEL *should* never fail until
> > we get to a true OOM condition because it will block and wait for the vm
> > subsystem to free us up some space. Under true OOM conditions it will
> > return NULL AFAIK.
>
> Or maybe deadlock if it's happening due to I/O.
That would be a trick indeed, since Patrick and I were referring to the
scanning code, so the device isn't even configured for kernel use yet :-P
> If it wants to flush a
> page out to that scsi device and the scsi device is doing the allocation
> life gets very unpleasant.
Of course. This is why the mid layer doesn't typically allocate anything
at all once initial setup of a device is complete. No allocations once
setup is complete means no deadlock.
> SCSI may not be coming from block layer
> access with PF_MEM* set either, think about /proc and scsi-generic
>
> "Here be dragons"
That's why I recommended people think each use through instead of blindly
changing things ;-)
--
Doug Ledford <dledford@redhat.com> 919-754-3700 x44233
Red Hat, Inc.
1801 Varsity Dr.
Raleigh, NC 27606
* Re: GFP_ATOMIC allocations...
@ 2002-08-29 15:59 Bryan Henderson
2002-08-29 16:25 ` Doug Ledford
2002-08-29 16:50 ` Alan Cox
0 siblings, 2 replies; 13+ messages in thread
From: Bryan Henderson @ 2002-08-29 15:59 UTC (permalink / raw)
To: Doug Ledford; +Cc: linux-scsi, patman
>GFP_KERNEL *should* never fail until
>we get to a true OOM condition because it will block and wait for the vm
>subsystem to free us up some space. Under true OOM conditions it will
>return NULL AFAIK.
That (the NULL case) doesn't sound right to me. The purpose of the OOM
Killer is to choose wisely which processes should die when there's simply
not enough memory to go around. And it does it by terminating a process
directly, not by affecting the return value of kmalloc(). So if
kmalloc(GFP_KERNEL) returns NULL when the system is out of memory, the
system is arbitrarily killing whoever is the next guy to ask for memory,
thus defeating the purpose of the OOM Killer.
Incidentally, I've had to explain GFP_ATOMIC to many people, and I believe
the reason for its incorrect usage is that people know it's supposed to be
called by processes that can't wait, and they think that means "I can't
wait for this memory, so be sure you give it to me _now_." Simple
explanations of GFP_ATOMIC give you the impression that it's just a higher
priority request than GFP_KERNEL.
One thing about GFP_ATOMIC that is often overlooked is that what it really
means is that the caller, instead of the VM, is responsible for doing the
waiting and retrying and whatever else is needed to deal with memory
allocation failure. Failing some operation permanently because a
kmalloc(GFP_ATOMIC) call it made failed is usually the wrong thing to do.
Again, many people have just the opposite impression. One person told me
he used GFP_ATOMIC instead of GFP_KERNEL because he needed to be guaranteed
the allocation wouldn't fail.
* Re: GFP_ATOMIC allocations...
2002-08-29 15:59 GFP_ATOMIC allocations Bryan Henderson
@ 2002-08-29 16:25 ` Doug Ledford
2002-08-29 16:50 ` Alan Cox
1 sibling, 0 replies; 13+ messages in thread
From: Doug Ledford @ 2002-08-29 16:25 UTC (permalink / raw)
To: Bryan Henderson; +Cc: linux-scsi, patman
On Thu, Aug 29, 2002 at 08:59:50AM -0700, Bryan Henderson wrote:
>
> Incidentally, I've had to explain GFP_ATOMIC to many people, and I believe
> the reason for its incorrect usage is that people know it's supposed to be
> called by processes that can't wait, and they think that means "I can't
> wait for this memory, so be sure you give it to me _now_." Simple
> explanations of GFP_ATOMIC give you the impression that it's just a higher
> priority request than GFP_KERNEL.
OK, then clarification time ;-) GFP_KERNEL means give me memory that is
kernel addressable, and we are allowed to block while writing stuff out to
disk if necessary. GFP_ATOMIC doesn't really mean give me memory now, but
it does mean make your decision now and either give me my memory or give
me an error condition immediately because I can't be blocked. Of note,
GFP_KERNEL allocations are not allowed to succeed if the amount of free
RAM falls below the free pages low watermark. Instead they must block
until we have freed up some RAM (enough to get the number of free pages
back above the free pages high watermark). Atomic allocations will steal
memory out of the machine well below the free pages low mark. In fact,
it's specifically for these atomic allocations that we keep that free
pages high amount of memory around, and this pool is used to satisfy
atomic allocations long after normal GFP_KERNEL allocations have started
to block.
> One thing about GFP_ATOMIC that is often overlooked is that what it really
> means is that the caller, instead of VM, is responsible for doing the
> waiting and retrying and whatever to deal with memory allocation failure.
> Failing some operation permanently because a kmalloc(GFP_ATOMIC) it did
> failed is usually the wrong thing to do.
This depends on context. Failing an atomic allocation in your interrupt
handler shouldn't result in waiting or retries. And, since we can't allow
ourselves to block in an interrupt handler, atomic allocations are the
only correct method of allocating in an interrupt handler (the scsi
devices don't typically do this, but lots of network drivers do).
Now, additionally, if we are failing an allocation in an interrupt
handler, then often the only thing we can do is bail on the operation.
So, IMHO, when you see places that truly need atomic allocations, then you
should also see a place where, if the allocation fails, we bail
immediately instead of blocking on RAM.
> Again, many people have just the opposite impression. One person told me
> he used GFP_ATOMIC instead of GFP_KERNEL because he needed to be guaranteed
> the allocation wouldn't fail.
GFP_KERNEL == most likely to succeed, eventually
GFP_ATOMIC == most likely to succeed, right now.
In fact, atomic allocations *will* succeed as long as there are enough
free pages to satisfy the request. Regardless of the result, the call
returns without any delay. No sleeping is allowed in these allocations.
--
Doug Ledford <dledford@redhat.com> 919-754-3700 x44233
Red Hat, Inc.
1801 Varsity Dr.
Raleigh, NC 27606
* Re: GFP_ATOMIC allocations...
2002-08-29 15:59 GFP_ATOMIC allocations Bryan Henderson
2002-08-29 16:25 ` Doug Ledford
@ 2002-08-29 16:50 ` Alan Cox
1 sibling, 0 replies; 13+ messages in thread
From: Alan Cox @ 2002-08-29 16:50 UTC (permalink / raw)
To: Bryan Henderson; +Cc: Doug Ledford, linux-scsi, patman
On Thu, 2002-08-29 at 16:59, Bryan Henderson wrote:
> That (the NULL case) doesn't sound right to me. The purpose of the OOM
> Killer is to choose wisely which processes should die when there's simply
> not enough memory to go around.
It can return NULL in such cases. It shouldn't in the current case ever
do this for an allocation exceeding 4K unless you are in an I/O path.
However it's a good idea to program for the "can always happen" case.
* Re: GFP_ATOMIC allocations...
2002-08-29 1:42 ` Doug Ledford
2002-08-29 10:47 ` Alan Cox
@ 2002-08-29 17:10 ` Patrick Mansfield
2002-08-30 16:22 ` Patrick Mansfield
2 siblings, 0 replies; 13+ messages in thread
From: Patrick Mansfield @ 2002-08-29 17:10 UTC (permalink / raw)
To: linux-scsi
On Wed, Aug 28, 2002 at 09:42:15PM -0400, Doug Ledford wrote:
> I've known people that had actual, real problems because of the GFP_ATOMIC
> allocations in this code area. It was related to bringing up a fiber
> controller that had 100s of disks hanging off of it and therefore wanted
> to do a *LOT* of atomic allocations in a short period of time, but it was
> a real problem none the less. Making the allocations non-atomic would
> allow the scanning to block waiting on more ram instead of bailing on
> devices.
>
There are some bad allocations in sd.c - there's now a (single) GFP_DMA
buffer for test unit ready, spinup and the write protect check, where we
used scsi_malloc() in 2.4.
There are a bunch of vmalloc's; most of them should be kmalloc's.
Here's the sd.c code that does the vmalloc:
#define init_mem_lth(x,n)	x = vmalloc((n) * sizeof(*x))
#define zero_mem_lth(x,n)	memset(x, 0, (n) * sizeof(*x))

	init_mem_lth(sd_dsk_arr, sd_template.dev_max);
	if (sd_dsk_arr) {
		zero_mem_lth(sd_dsk_arr, sd_template.dev_max);
		for (k = 0; k < sd_template.dev_max; ++k) {
			sdkp = vmalloc(sizeof(Scsi_Disk));
			if (NULL == sdkp)
				goto cleanup_mem;
			memset(sdkp, 0, sizeof(Scsi_Disk));
			sd_dsk_arr[k] = sdkp;
		}
	}

	init_mem_lth(sd_disks, sd_template.dev_max);
	if (sd_disks)
		zero_mem_lth(sd_disks, sd_template.dev_max);

	init_mem_lth(sd, maxparts);
	if (!sd_dsk_arr || !sd || !sd_disks)
		goto cleanup_mem;
	zero_mem_lth(sd, maxparts);

	return 0;
#undef init_mem_lth
#undef zero_mem_lth
The sdkp = vmalloc() certainly should be kmalloc().
Each of the init_mem_lth calls and its associated variable:

	static Scsi_Disk **sd_dsk_arr;
	init_mem_lth(sd_dsk_arr, sd_template.dev_max);

	static struct gendisk **sd_disks;
	init_mem_lth(sd_disks, sd_template.dev_max);

	struct hd_struct *sd;
	init_mem_lth(sd, maxparts);
dev_max's maximum is N_SD_MAJORS * SCSI_DISKS_PER_MAJOR, right now 128
(8 * 16); even doubling won't hit kmalloc limits if we want an array of
pointers (256 * 4 or 256 * 8 bytes).
So only the last one might need a vmalloc(), since maxparts is max_dev * 16,
and hd_structs are larger.
The only use of sd is in sd_attach to set the part - can we now just
allocate a hd_struct as needed in sd_attach, as part of the struct *p
data? Also it looks like there is never a kfree(p) for the normal
non-error case.
Or, can the size be checked and vmalloc() only called when necessary,
or called if kmalloc() fails?
-- Patrick Mansfield
* Re: GFP_ATOMIC allocations...
2002-08-29 1:42 ` Doug Ledford
2002-08-29 10:47 ` Alan Cox
2002-08-29 17:10 ` Patrick Mansfield
@ 2002-08-30 16:22 ` Patrick Mansfield
2002-08-30 16:46 ` James Bottomley
2002-09-03 16:07 ` Pete Zaitcev
2 siblings, 2 replies; 13+ messages in thread
From: Patrick Mansfield @ 2002-08-30 16:22 UTC (permalink / raw)
To: linux-scsi, zaitcev
On Wed, Aug 28, 2002 at 09:42:15PM -0400, Doug Ledford wrote:
> On Wed, Aug 28, 2002 at 05:47:37PM -0700, Patrick Mansfield wrote:
> > So, you think all (or most) of the GFP_ATOMIC's in scsi_scan.c should
> > be GFP_KERNEL?
>
>
> > All the kmalloc calls should be at boot time, or via
> > insmod.
>
> Right, process context with no locks held.
>
There was past discussion about calling into scsi_build_commandblocks()
while not in user context (via USB etc.); this implies that we could
scan while not in user context.
I don't see how that would work, since the scan sleeps waiting for IO.
Do we really call scan_scsis and scsi_build_commandblocks while not in user
context? Who is the caller (to scsi_register_host)?
Here is Pete's response saying he found such a case, but not where:
http://marc.theaimsgroup.com/?l=linux-scsi&m=101678634608966&w=2
-- Patrick Mansfield
* Re: GFP_ATOMIC allocations...
2002-08-30 16:22 ` Patrick Mansfield
@ 2002-08-30 16:46 ` James Bottomley
2002-08-30 18:58 ` Doug Ledford
2002-09-03 16:07 ` Pete Zaitcev
1 sibling, 1 reply; 13+ messages in thread
From: James Bottomley @ 2002-08-30 16:46 UTC (permalink / raw)
To: Patrick Mansfield; +Cc: linux-scsi, zaitcev
patmans@us.ibm.com said:
> There was past discussion about calling into scsi_build_commandblocks()
> while not in user context (via usb etc.), this implies that we could
> scan while not in user context.
> I don't see how that would work, since the scan sleeps waiting for IO.
>
> Do we really call scan_scsis and scsi_build_commandblocks while not in
> user context? Who is the caller (to scsi_register_host)?
> Here is Pete's response saying he found such a case, but not where:
> http://marc.theaimsgroup.com/?l=linux-scsi&m=101678634608966&w=2
I agree, we should always be in user context here.
What about, instead of Pete's patch, changing the GFP_ATOMIC to GFP_KERNEL and
adding a BUG_ON(in_interrupt()) at the top and see who sends back bug
reports...
James
* Re: GFP_ATOMIC allocations...
2002-08-30 16:46 ` James Bottomley
@ 2002-08-30 18:58 ` Doug Ledford
0 siblings, 0 replies; 13+ messages in thread
From: Doug Ledford @ 2002-08-30 18:58 UTC (permalink / raw)
To: James Bottomley; +Cc: Patrick Mansfield, linux-scsi, zaitcev
On Fri, Aug 30, 2002 at 11:46:02AM -0500, James Bottomley wrote:
> What about, instead of Pete's patch, changing the GFP_ATOMIC to GFP_KERNEL and
> adding a BUG_ON(in_interrupt()) at the top and see who sends back bug
> reports...
Works for me.
--
Doug Ledford <dledford@redhat.com> 919-754-3700 x44233
Red Hat, Inc.
1801 Varsity Dr.
Raleigh, NC 27606
* Re: GFP_ATOMIC allocations...
2002-08-30 16:22 ` Patrick Mansfield
2002-08-30 16:46 ` James Bottomley
@ 2002-09-03 16:07 ` Pete Zaitcev
1 sibling, 0 replies; 13+ messages in thread
From: Pete Zaitcev @ 2002-09-03 16:07 UTC (permalink / raw)
To: Patrick Mansfield; +Cc: linux-scsi, zaitcev
> Date: Fri, 30 Aug 2002 09:22:57 -0700
> From: Patrick Mansfield <patmans@us.ibm.com>
>[...]
> Do we really call scan_scsis and scsi_build_commandblocks while not in user
> context? Who is the caller (to scsi_register_host)?
>
> Here is Pete's response saying he found such a case, but not where:
I did a call graph, but I discarded it since then. IIRC it showed
some paths, but perhaps I made a mistake. Please ignore it.
The "fix" did not go anywhere, actually. Arjan explained that
if a box runs out of atomics, it's bad anyway. We reduced the
queue size in the driver that the guy used.
-- Pete