From: Bryan Holty <lgeek@frontiernet.net>
To: Mike Christie <michaelc@cs.wisc.edu>
Cc: Dan Aloni <da-x@monatomic.org>,
James Bottomley <James.Bottomley@steeleye.com>,
linux-scsi <linux-scsi@vger.kernel.org>,
Linux Kernel List <linux-kernel@vger.kernel.org>,
brking@us.ibm.com, dror@xiv.co.il
Subject: Re: [PATCH] scsi: properly count the number of pages in scsi_req_map_sg()
Date: Wed, 22 Mar 2006 06:35:39 -0600 [thread overview]
Message-ID: <200603220635.40251.lgeek@frontiernet.net> (raw)
In-Reply-To: <200603211448.54620.lgeek@frontiernet.net>
On Tuesday 21 March 2006 14:48, Bryan Holty wrote:
> On Tuesday 21 March 2006 13:17, Mike Christie wrote:
> > Bryan Holty wrote:
> > > On Tuesday 21 March 2006 10:19, Dan Aloni wrote:
> > >>On Tue, Mar 21, 2006 at 09:54:54AM -0600, James Bottomley wrote:
> > >>>This is a good email to discuss on the scsi list:
> > >>>linux-scsi@vger.kernel.org; whom I've added to the cc list.
> > >>>
> > >>>On Tue, 2006-03-21 at 10:38 +0200, Dan Aloni wrote:
> > >>>>Improper calculation of the number of pages causes bio_alloc() to
> > >>>>be called with nr_iovecs=0, and slab corruption later.
> > >>>>
> > >>>>For example, a simple scatterlist that fails: {(3644,452), (0, 60)},
> > >>>>(offset, size). bufflen=512 => nr_pages=1 => breakage. The proper
> > >>>>page count for this example is 2.
> > >>>
> > >>>Such a scatterlist would likely violate the device's underlying
> > >>>boundaries and is not legal ... there's supposed to be special code
> > >>>checking the queue alignment and copying the bio to an aligned buffer
> > >>> if the limits are violated. Where are you generating these
> > >>> scatterlists from?
> > >>
> > >>These scatterlists can be generated using the sg driver. Though I am
> > >>actually running a customized version of the sg driver, it seems the
> > >>conversion from a userspace array of sg_iovec_t to scatterlist stays
> > >>the same and also applies to the original driver (see
> > >>st_map_user_pages()).
> > >
> > > Hello,
> > > I am seeing the same issue when using direct io with sg. sg will
> > > perform direct io on any date that is aligned with the devices
> > > dma_align. The default for drivers that do not specify is 512. sg
> > > builds the scatter gather list from the user specified location,
> > > offsetting the first entry in the list if not page aligned. This is
> > > the case that causes the improper allocation of "nr_iovec" in
> > > scsi_req_map_sgand the later slab corruption.
> > >
> > > I don't think it is necessary to calculate nr_pages from the entire
> > > list. Only sgl[0] is allowed to have an offset, so we can calculate
> > > from that as follows.
> > > --- a/drivers/scsi/scsi_lib.c 2006-03-03 13:17:22.000000000 -0600
> > > +++ b/drivers/scsi/scsi_lib.c 2006-03-21 11:36:39.389763804 -0600
> > > @@ -368,12 +368,15 @@
> > > int nsegs, unsigned bufflen, gfp_t gfp)
> > > {
> > > struct request_queue *q = rq->q;
> > > - int nr_pages = (bufflen + PAGE_SIZE - 1) >> PAGE_SHIFT;
> > > + int nr_pages = 0;
> > > unsigned int data_len = 0, len, bytes, off;
> > > struct page *page;
> > > struct bio *bio = NULL;
> > > int i, err, nr_vecs = 0;
> > > -
> > > +
> > > + if (nsegs)
> >
> > you can drop that test
> >
> > > + nr_pages = (bufflen + sgl[0].offset + PAGE_SIZE - 1) >> PAGE_SHIFT;
> > > +
> >
> > I think we can do this without looping but I think this is broken. If we
> > had a slight variant of Dan's example but we have a page and some change
> > in the first entry {3644, 4548} and 0,60} in the last one, that would
> > would only calculate two pages but we want three.
> >
> > I think we can have to calculate the first and last entries but the
> > middle ones we can assume have no offset and lengths that are multiples
> > of a page.
>
> You are absolutely correct.
> bufflen is not page masked and when added to the offset of the first entry,
> will generate the correct nr_pages.
>
> nr_pages = (bufflen + sgl[0].offset + PAGE_SIZE - 1) >> PAGE_SHIFT;
> ==
> nr_pages = ((bufflen & PAGE_MASK) + (PAGE_SIZE-1) + sgl[0].offset +
> sgl[nsegs-1].length) >> PAGE_SHIFT;
>
> Either will calculate correctly.
>
> --
> Bryan Holty
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
Based on above, I think the most intuitive fix would be the offset addition of
the first entry to the initialization of nr_pages.
Without this change, for instance, with 4K io's every sg io that is
dma_aligned for direct io, but not page aligned will cause slab corruption
and an oops
I am able to run a number of tests with sg that cause the boundary to be
crossed, and with this fix there is no slab corruption or data corruption.
Thanks Dan, I had been hunting for this for a couple of days!!
Thoughts??
Signed-off-by: Bryan Holty <lgeek@frontiernet.net>
--- a/drivers/scsi/scsi_lib.c 2006-03-03 13:17:22.000000000 -0600
+++ b/drivers/scsi/scsi_lib.c 2006-03-22 06:09:09.669599539 -0600
@@ -368,7 +368,7 @@
int nsegs, unsigned bufflen, gfp_t gfp)
{
struct request_queue *q = rq->q;
- int nr_pages = (bufflen + PAGE_SIZE - 1) >> PAGE_SHIFT;
+ int nr_pages = (bufflen + sgl[0].offset + PAGE_SIZE - 1) >> PAGE_SHIFT;
unsigned int data_len = 0, len, bytes, off;
struct page *page;
struct bio *bio = NULL;
--
Bryan Holty
next prev parent reply other threads:[~2006-03-22 12:35 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2006-03-21 8:38 [PATCH] scsi: properly count the number of pages in scsi_req_map_sg() Dan Aloni
2006-03-21 15:54 ` James Bottomley
2006-03-21 16:19 ` Dan Aloni
2006-03-21 18:05 ` Bryan Holty
2006-03-21 19:17 ` Mike Christie
2006-03-21 20:48 ` Bryan Holty
2006-03-22 12:35 ` Bryan Holty [this message]
2006-05-26 6:13 ` Mike Christie
2006-05-26 13:23 ` Bryan Holty
2006-03-23 14:52 ` Christoph Hellwig
2006-03-23 16:51 ` Bryan Holty
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=200603220635.40251.lgeek@frontiernet.net \
--to=lgeek@frontiernet.net \
--cc=James.Bottomley@steeleye.com \
--cc=brking@us.ibm.com \
--cc=da-x@monatomic.org \
--cc=dror@xiv.co.il \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-scsi@vger.kernel.org \
--cc=michaelc@cs.wisc.edu \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox