* Re: ST alloc failures
From: Christoph Hellwig @ 2004-04-02 6:32 UTC
To: Nathan Scott; +Cc: linux-scsi
[linux-scsi is the right list for st problems, moving the thread there]
On Fri, Apr 02, 2004 at 03:13:55PM +1000, Nathan Scott wrote:
> Hi all,
>
> I'm seeing a bunch of large allocation attempts failing from
> the SCSI tape driver when doing dumps and restores ... (this
> is with a stock 2.6.4 kernel).
>
> xfsdump: page allocation failure. order:8, mode:0xd0
> Call Trace:
> [<c013982b>] __alloc_pages+0x33b/0x3d0
> [<c03805ac>] enlarge_buffer+0xdc/0x1b0
> [<c03819a3>] st_map_user_pages+0x33/0x90
> [<c037cf24>] setup_buffering+0xb4/0x160
This looks like the driver first tries to pin down the user pages
(st_map_user_pages), fails, and then has to fall back to an in-kernel
buffer. Can you put some debug printks into st_map_user_pages
to see why it fails? The message itself is harmless; it's the
same thing we had in the XFS log code: the driver tries to allocate
as large a buffer as possible, and if that fails it tries the next
smaller power-of-two size. We should probably add __GFP_NOWARN
here.
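A minimal userspace sketch of the fallback described above, not the st code itself: start at the largest order (order 8 in the trace, i.e. 256 pages) and step down one power of two at a time until an allocation succeeds. The names alloc_largest, sketch_alloc_fn, fragmented_alloc and SKETCH_PAGE_SIZE are made up for illustration, and the 4 kB page size is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumed page size, not taken from the thread */

/* Hypothetical allocator hook so the fallback can be exercised in userspace. */
typedef void *(*sketch_alloc_fn)(size_t bytes);

/*
 * Try to allocate 2^order pages, stepping down one order at a time until an
 * attempt succeeds or min_order has failed as well. Returns the buffer (or
 * NULL) and stores the order actually obtained in *got_order (-1 on failure).
 */
static void *alloc_largest(sketch_alloc_fn alloc, int max_order, int min_order,
                           int *got_order)
{
    assert(alloc != NULL && got_order != NULL);
    assert(min_order >= 0 && min_order <= max_order);

    for (int order = max_order; order >= min_order; order--) {
        void *buf = alloc((size_t)SKETCH_PAGE_SIZE << order);
        if (buf != NULL) {
            *got_order = order;
            return buf;
        }
        /* A failed attempt here is expected and harmless; in the kernel,
         * __GFP_NOWARN is the flag that keeps such attempts quiet. */
    }
    *got_order = -1;
    return NULL;
}

/* Stand-in allocator that refuses anything above 64 KiB, mimicking a
 * fragmented system where only small contiguous chunks are available. */
static void *fragmented_alloc(size_t bytes)
{
    return bytes > 64u * 1024u ? NULL : malloc(bytes);
}
```

With fragmented_alloc and a starting order of 8, the attempts at orders 8 down to 5 fail and the order-4 attempt (64 KiB) succeeds, which is the same pattern that produces the harmless warnings in the log.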
* Re: ST alloc failures
From: Kai Makisara @ 2004-04-03 7:19 UTC
To: Christoph Hellwig; +Cc: Nathan Scott, linux-scsi
On Fri, 2 Apr 2004, Christoph Hellwig wrote:
> [linux-scsi is the right list for st problems, moving the thread there]
>
> On Fri, Apr 02, 2004 at 03:13:55PM +1000, Nathan Scott wrote:
> > Hi all,
> >
> > I'm seeing a bunch of large allocation attempts failing from
> > the SCSI tape driver when doing dumps and restores ... (this
> > is with a stock 2.6.4 kernel).
> >
> > xfsdump: page allocation failure. order:8, mode:0xd0
> > Call Trace:
> > [<c013982b>] __alloc_pages+0x33b/0x3d0
> > [<c03805ac>] enlarge_buffer+0xdc/0x1b0
> > [<c03819a3>] st_map_user_pages+0x33/0x90
> > [<c037cf24>] setup_buffering+0xb4/0x160
>
> This looks like the driver tries to pin down the userpages first
> (st_map_user_pages) but then fails and needs to use an inkernel
> buffer. Can you put some debug printks into st_map_user_pages
> to see why it fails?
Pinning down pages should not fail with most modern hardware except for
the following three cases:
1) A change in 2.6.4 (*) requires st (and sg) to avoid direct transfers
unless the user buffer is aligned on a 512-byte boundary. This means, for
instance, that in most cases transfers from/to malloced/calloced buffers
(typically aligned only on 8- or 16-byte boundaries) are forced to use
bounce buffers.
2) There is a bug in checking the allowed address range. Most SCSI
adapters support 64-bit addresses and so even lots of memory should not
prevent using direct transfers.
3) Some resource shortage that happened just now. This is not a bug.
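To illustrate case 1, here is a minimal userspace sketch of an alignment check and of one way an application could obtain a suitably aligned buffer. The helper names is_aligned and alloc_tape_buffer and the constant SKETCH_DMA_ALIGN are hypothetical; the 512-byte value is the default discussed above, and a given driver may require something else.

```c
#define _POSIX_C_SOURCE 200112L  /* for posix_memalign() */
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed default from the discussion: 512-byte alignment for direct I/O. */
#define SKETCH_DMA_ALIGN 512u

/* Returns 1 if ptr is aligned to `align` bytes; align must be a power of two. */
static int is_aligned(const void *ptr, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return ((uintptr_t)ptr & (align - 1)) == 0;
}

/* Allocate a buffer that satisfies the assumed alignment; NULL on failure.
 * The caller releases it with free(). */
static void *alloc_tape_buffer(size_t bytes)
{
    void *buf = NULL;

    if (posix_memalign(&buf, SKETCH_DMA_ALIGN, bytes) != 0)
        return NULL;
    return buf;
}
```

A buffer from plain malloc() is only guaranteed the platform's basic alignment, so is_aligned(buf, SKETCH_DMA_ALIGN) may or may not hold for it; a buffer from alloc_tape_buffer() always passes the check.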
> The actual message is harmless, it's the
> same thing we had in the XFS log code: It tries to allocate
> an as large as possible buffer and if that fails tries the next
> smaller power of two size. We should probably add an __GFP_NOWARN
> here.
Yes. st prints a message if the allocation finally fails.
(*) Some history for those who have not followed this development:
In 2.6.3, st (and sg) started checking the user buffer alignment with
queue_dma_alignment(). The overall default is 512 bytes. A change was
added to the scsi code to set the limit for SCSI devices to 8 bytes.
In 2.6.4, the code setting the SCSI device alignment to 8 bytes was, for
some reason unknown to me, removed, which put the requirement back at
512 bytes.
The alignment requirement defaults can be set using many strategies. The
current one is very safe. The low-level drivers (practically every
driver in this case) can relax the constraints but this is not currently
done. Another strategy (the 8-byte limit) sets requirements that are safe
for most devices. The exceptions can then enforce stricter limits.
--
Kai
* Re: ST alloc failures
From: Nathan Scott @ 2004-04-06 8:48 UTC
To: Christoph Hellwig, Kai Makisara; +Cc: linux-scsi, linux-xfs
Hi there,
On Sat, Apr 03, 2004 at 10:19:51AM +0300, Kai Makisara wrote:
> On Fri, 2 Apr 2004, Christoph Hellwig wrote:
>
> > [linux-scsi is the right list for st problems, moving the thread there]
> >
> > On Fri, Apr 02, 2004 at 03:13:55PM +1000, Nathan Scott wrote:
> > > Hi all,
> > >
> > > I'm seeing a bunch of large allocation attempts failing from
> > > the SCSI tape driver when doing dumps and restores ... (this
> > > is with a stock 2.6.4 kernel).
> > >
> > > xfsdump: page allocation failure. order:8, mode:0xd0
> > > Call Trace:
> > > [<c013982b>] __alloc_pages+0x33b/0x3d0
> > > [<c03805ac>] enlarge_buffer+0xdc/0x1b0
> > > [<c03819a3>] st_map_user_pages+0x33/0x90
> > > [<c037cf24>] setup_buffering+0xb4/0x160
> >
> > This looks like the driver tries to pin down the userpages first
> > (st_map_user_pages) but then fails and needs to use an inkernel
> > buffer. Can you put some debug printks into st_map_user_pages
> > to see why it fails?
Apologies for the delay; after adding some printks, it looks like
the point where st decides not to pin down the user pages for me is
here in sgl_map_user_pages:
    /* Too big */
    if (nr_pages > max_pages) {
        return -ENOMEM;
    }
In my case nr_pages is always 256 and max_pages is always 96 (I
see this printk a fair few times, and it's always from this point).
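For reference, a small userspace sketch of how a page count like the one above can be derived from a buffer address and length, and how it compares against a limit. It is not the driver code; pages_spanned, fits_sg_limit and SKETCH_PAGE_SHIFT are hypothetical names, and 4 kB pages are assumed (under that assumption 256 pages corresponds to a 1 MB request).

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12  /* assumes 4 kB pages */

/* Number of pages touched by the byte range [uaddr, uaddr + count). */
static unsigned long pages_spanned(uintptr_t uaddr, unsigned long count)
{
    uintptr_t first, last;

    if (count == 0)
        return 0;
    first = uaddr >> SKETCH_PAGE_SHIFT;
    last = (uaddr + count - 1) >> SKETCH_PAGE_SHIFT;
    return (unsigned long)(last - first + 1);
}

/* Mirrors the shape of the "Too big" check: 0 if the range fits within
 * max_pages, -1 if it does not. */
static int fits_sg_limit(uintptr_t uaddr, unsigned long count,
                         unsigned long max_pages)
{
    return pages_spanned(uaddr, count) > max_pages ? -1 : 0;
}
```

Note that a buffer which does not start on a page boundary touches one extra page, so a 1 MB request can span 257 pages rather than 256.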
> Pinning down pages should not fail with most modern hardware except for
> the following three cases:
>
> 1) A change in 2.6.4 (*) mandates st (and sg) not to use direct transfers
> unless the user buffer is aligned at 512 byte boundary. This means, for
> instance, that in most cases transfers from/to malloced/calloced buffers
> are forced to use bounce buffers (alignment at 8 or 16 byte boundaries).
>
> 2) There is a bug in checking the allowed address range. Most SCSI
> adapters support 64-bit addresses and so even lots of memory should not
> prevent using direct transfers.
I guess it's not either of these two, from the printk?
> 3) Some resource shortage that happened just now. This is not a bug.
Hmm... I see this a lot, but I have a fair bit of memory in the machine
(it's during stress and regression testing that I hit this, so I'm not
sure about the exact memory usage at each particular printk I see).
Is this something we should be tuning in xfsdump/xfsrestore, Kai?
(to make smaller requests?)
cheers.
--
Nathan
* Re: ST alloc failures
From: Kai Makisara @ 2004-04-06 19:09 UTC
To: Nathan Scott; +Cc: Christoph Hellwig, linux-scsi, linux-xfs
On Tue, 6 Apr 2004, Nathan Scott wrote:
> Hi there,
>
> On Sat, Apr 03, 2004 at 10:19:51AM +0300, Kai Makisara wrote:
> > On Fri, 2 Apr 2004, Christoph Hellwig wrote:
> >
> > > [linux-scsi is the right list for st problems, moving the thread there]
> > >
> > > On Fri, Apr 02, 2004 at 03:13:55PM +1000, Nathan Scott wrote:
> > > > Hi all,
> > > >
> > > > I'm seeing a bunch of large allocation attempts failing from
> > > > the SCSI tape driver when doing dumps and restores ... (this
> > > > is with a stock 2.6.4 kernel).
> > > >
> > > > xfsdump: page allocation failure. order:8, mode:0xd0
> > > > Call Trace:
> > > > [<c013982b>] __alloc_pages+0x33b/0x3d0
> > > > [<c03805ac>] enlarge_buffer+0xdc/0x1b0
> > > > [<c03819a3>] st_map_user_pages+0x33/0x90
> > > > [<c037cf24>] setup_buffering+0xb4/0x160
> > >
> > > This looks like the driver tries to pin down the userpages first
> > > (st_map_user_pages) but then fails and needs to use an inkernel
> > > buffer. Can you put some debug printks into st_map_user_pages
> > > to see why it fails?
>
> Apologies for the delay; after whacking in some printk's it looks
> like the point st decides to not pin down the user pages for me is
> here in sgl_map_user_pages:
>
>     /* Too big */
>     if (nr_pages > max_pages) {
>         return -ENOMEM;
>     }
>
> In my cases nr_pages is always 256 and max_pages is always 96 (I
> see this printk a fair few times, and it's always from this point).
>
OK. max_pages is the maximum number of scatter/gather segments supported
by the SCSI adapter.
> > Pinning down pages should not fail with most modern hardware except for
> > the following three cases:
> >
> > 1) A change in 2.6.4 (*) mandates st (and sg) not to use direct transfers
> > unless the user buffer is aligned at 512 byte boundary. This means, for
> > instance, that in most cases transfers from/to malloced/calloced buffers
> > are forced to use bounce buffers (alignment at 8 or 16 byte boundaries).
> >
> > 2) There is a bug in checking the allowed address range. Most SCSI
> > adapters support 64-bit addresses and so even lots of memory should not
> > prevent using direct transfers.
>
> I guess it's not either of these two, from the printk?
>
Correct. 1) was something that would have explained why you see this
starting from 2.6.4. I am happy that it is not 2 :-)
> > 3) Some resource shortage that happened just now. This is not a bug.
>
> Hmm... I see this a lot, but I have a fair bit of memory in the machine
> (it's during stress and regression testing that I hit this, so not sure
> about the exact memory usage at each particular printk I see).
>
Having a lot of memory does not help because it gets fragmented, too. st
is trying to allocate big chunks so that it can satisfy the user requests
with the available number of s/g segments even when the user successively
requests bigger and bigger block sizes. Usually, chunks smaller than the
maximum can be used if the user just uses the same block size for
subsequent requests. The driver tries to allocate smaller chunks if
allocation of big chunks fails and the smaller chunks are big enough for
the current user request. This is what happens in your case now. Earlier
allocations with the big chunk size have succeeded and no error messages
have been written.
> Is this something we should be tuning in xfsdump/xfsrestore, Kai?
> (to make smaller requests?)
>
There are actually two problems. As Christoph said, the messages you see
are harmless. I have already sent a patch to linux-scsi that adds
__GFP_NOWARN to the allocation. This should remove these error messages.
The other problem is that you probably would like to use direct transfers
between the xfsdump/xfsrestore buffer and the drive instead of using the
"bounce" buffer in the driver. This is not possible unless the tape
requests are small enough for the SCSI adapter. In your case the limit is
96 pages. You can try to increase this limit but it is not a general
solution.
I would recommend that xfsdump/xfsrestore use smaller requests if possible.
64 pages of 4 kB would make 256 kB. Using this request size should not
limit throughput even with the fastest tape drives.
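The arithmetic can be sketched as follows, assuming the request size is chosen as the largest power-of-two number of pages that fits within the adapter limit. That selection rule is only one plausible way to arrive at 64 pages from a limit of 96, and the function name largest_pow2_request is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Largest request, in bytes, made of a power-of-two number of pages that
 * does not exceed max_pages. Both arguments must be non-zero.
 */
static size_t largest_pow2_request(size_t max_pages, size_t page_size)
{
    size_t pages = 1;

    assert(max_pages > 0 && page_size > 0);
    /* Double while the doubled value would still fit within max_pages. */
    while (pages <= max_pages / 2)
        pages *= 2;
    return pages * page_size;
}
```

With a limit of 96 pages and 4 kB pages this gives 64 pages, i.e. 256 kB. A buffer that is not page-aligned spans one page more than its size suggests, which still leaves plenty of headroom under a limit of 96.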
I would like to make st somehow tell the users when it is using the driver
buffer instead of direct transfers. Some users would probably like to know
this because it limits throughput in some cases. The best idea I have so
far is to log a message once for each open if this happens. Even this
may be too much. Good ideas are welcome.
--
Kai