From: Christoph Hellwig <hch@infradead.org>
To: "Andy Falanga (afalanga)" <afalanga@micron.com>
Cc: "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	linux-scsi@vger.kernel.org, Doug Gilbert <dgilbert@interlog.com>
Subject: Re: block layer copying user io vectors
Date: Fri, 23 Jan 2015 09:06:27 -0800
Message-ID: <20150123170627.GA8652@infradead.org>
In-Reply-To: <60F6FAE47D1BCE4380CC06D18F49789B952F7ED3@NTXBOIMBX02.micron.com>

On Thu, Jan 22, 2015 at 09:33:08PM +0000, Andy Falanga (afalanga) wrote:
> Please CC me directly.
> 
> I am working in kernel 2.6.32 (CentOS 6), trying to increase the upper
> limit of sg from 4 MB to at least 128 MB in a single SCSI command.  At
> first I thought the issue was in sg, but I have tracked it to the
> block layer.

2.6.32 is fairly old, but fortunately for you not too many things should
have changed in this area.

> 
> Thinking I could solve this by using scatter/gather lists, I increased
> the size of each vector from 32 KB to 4 MB.  This worked until I tried
> to send 8 MB; when I do, I get errno EINVAL.  After some tracing, I
> tracked the problem into bio_copy_user_iov().
> 
> This function does something that seems rather strange.  On line 859,
> a for loop determines the number of pages needed to copy the user data
> to kernel space.  Then the memory is allocated (line 886,
> bio_kmalloc()).  Then, strangely, on line 895, there is this
> conditional:

This is because the function can also be used with preallocated pages,
a feature only used by the sg and tape drivers.
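
To make that concrete, the caller side of the preallocated path looks
roughly like this (hand-waved sketch, not the actual sg code; the
reserve_* names and q/rq/ubuf/len are placeholders, field names as in
2.6.32's struct rq_map_data):

/*
 * A driver that owns a preallocated reserve buffer describes it to the
 * block layer through struct rq_map_data.  bio_copy_user_iov() then
 * copies into these pages instead of allocating fresh ones, which is
 * why it recomputes nr_pages from map_data->page_order.
 */
struct rq_map_data md;

md.pages      = reserve_pages;       /* driver-owned page array */
md.page_order = reserve_page_order;  /* each entry is 2^order pages */
md.nr_entries = reserve_nr_entries;  /* valid entries in pages[] */
md.offset     = 0;                   /* byte offset into the first page */

ret = blk_rq_map_user(q, rq, &md, ubuf, len, GFP_KERNEL);

With map_data == NULL the pages are simply allocated per request
instead.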

Make sure your user memory is 4k aligned, and you should be able
to avoid the copy entirely (1).

(1) except that the sg driver disables the direct mapping of user pages
    when using readv/writev.  I can't really see why, and it should be
    fixable by just removing that condition from the if in sg_start_req.
    Alternatively, use the SG_IO ioctl directly on the disk device node,
    which has neither the readv/writev limitation nor a fixed-size
    preallocated page pool.
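
For reference, that second route looks something like this from
userspace (untested sketch; /dev/sdX, the 8 MB size and the READ(10)
starting at LBA 0 are made-up examples, error checking is left out,
and 512-byte logical blocks are assumed):

#include <fcntl.h>
#include <scsi/sg.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>

#define XFER_LEN (8 * 1024 * 1024)    /* one 8 MB transfer */

int main(void)
{
	int fd = open("/dev/sdX", O_RDONLY);   /* disk node, not /dev/sgN */
	unsigned char cdb[10] = { 0x28 };      /* READ(10) from LBA 0 */
	unsigned char sense[32];
	struct sg_io_hdr io;
	void *buf;

	/* page-aligned buffer, so the pages can be mapped directly */
	posix_memalign(&buf, 4096, XFER_LEN);

	/* transfer length in blocks, big-endian in CDB bytes 7-8 */
	cdb[7] = (XFER_LEN / 512) >> 8;
	cdb[8] = (XFER_LEN / 512) & 0xff;

	memset(&io, 0, sizeof(io));
	io.interface_id    = 'S';
	io.dxfer_direction = SG_DXFER_FROM_DEV;
	io.cmd_len         = sizeof(cdb);
	io.cmdp            = cdb;
	io.dxfer_len       = XFER_LEN;
	io.dxferp          = buf;
	io.sbp             = sense;
	io.mx_sb_len       = sizeof(sense);
	io.timeout         = 60000;            /* milliseconds */

	return ioctl(fd, SG_IO, &io) ? 1 : 0;
}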

> 
> if (map_data) {
>     nr_pages = 1 << map_data->page_order;
>     i = map_data->offset / PAGE_SIZE;
> }
> 
> This effectively ignores the number of pages counted earlier (which is
> the case that applies to me), and then apparently disregards whatever
> memory may have been allocated earlier.  Thinking this was the root
> cause, I tried commenting out that branch in bio_copy_user_iov(), but
> still got the same result.  Can someone help me understand what is
> happening in the block layer?

Thread overview: 4+ messages
2015-01-22 21:33 block layer copying user io vectors Andy Falanga (afalanga)
2015-01-23 17:06 ` Christoph Hellwig [this message]
2015-01-30 17:43   ` Andy Falanga (afalanga)
2015-02-02 10:00     ` Christoph Hellwig
