* [PATCH] scsi: properly count the number of pages in scsi_req_map_sg()
@ 2006-03-21 8:38 Dan Aloni
2006-03-21 15:54 ` James Bottomley
0 siblings, 1 reply; 11+ messages in thread
From: Dan Aloni @ 2006-03-21 8:38 UTC (permalink / raw)
To: Linux Kernel List; +Cc: brking, James Bottomley, dror
Improper calculation of the number of pages causes bio_alloc() to
be called with nr_iovecs=0, and slab corruption later.
For example, a simple scatterlist that fails: {(3644,452), (0, 60)},
(offset, size). bufflen=512 => nr_pages=1 => breakage. The proper
page count for this example is 2.
Signed-off-by: Dan Aloni <da-x@monatomic.org>
---
commit 8faa94b01e6fd4518b760ce39a2db0ede9444ded
tree c2e3c6ee5f59a4c1e166e4798ddc6e938f448de2
parent c4a1745aa09fc110afdefea0e5d025043e348bae
author Dan Aloni <da-x@monatomic.org> Tue, 21 Mar 2006 10:19:11 +0200
committer Dan Aloni <da-x@monatomic.org> Tue, 21 Mar 2006 10:19:11 +0200
drivers/scsi/scsi_lib.c | 9 ++++++++-
1 files changed, 8 insertions(+), 1 deletions(-)
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 701a328..a42f3aa 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -368,13 +368,20 @@ static int scsi_req_map_sg(struct reques
int nsegs, unsigned bufflen, gfp_t gfp)
{
struct request_queue *q = rq->q;
- int nr_pages = (bufflen + PAGE_SIZE - 1) >> PAGE_SHIFT;
+ int nr_pages = 0;
unsigned int data_len = 0, len, bytes, off;
struct page *page;
struct bio *bio = NULL;
int i, err, nr_vecs = 0;
for (i = 0; i < nsegs; i++) {
+ off = sgl[i].offset;
+ len = sgl[i].length;
+
+ nr_pages += ((off + len + PAGE_SIZE - 1) >> PAGE_SHIFT) - (off >> PAGE_SHIFT);
+ }
+
+ for (i = 0; i < nsegs; i++) {
page = sgl[i].page;
off = sgl[i].offset;
len = sgl[i].length;
--
Dan Aloni
da-x@monatomic.org, da-x@colinux.org, da-x@gmx.net, dan@xiv.co.il
^ permalink raw reply related [flat|nested] 11+ messages in thread* Re: [PATCH] scsi: properly count the number of pages in scsi_req_map_sg() 2006-03-21 8:38 [PATCH] scsi: properly count the number of pages in scsi_req_map_sg() Dan Aloni @ 2006-03-21 15:54 ` James Bottomley 2006-03-21 16:19 ` Dan Aloni 0 siblings, 1 reply; 11+ messages in thread From: James Bottomley @ 2006-03-21 15:54 UTC (permalink / raw) To: Dan Aloni; +Cc: linux-scsi, Linux Kernel List, brking, dror This is a good email to discuss on the scsi list: linux-scsi@vger.kernel.org; whom I've added to the cc list. On Tue, 2006-03-21 at 10:38 +0200, Dan Aloni wrote: > Improper calculation of the number of pages causes bio_alloc() to > be called with nr_iovecs=0, and slab corruption later. > > For example, a simple scatterlist that fails: {(3644,452), (0, 60)}, > (offset, size). bufflen=512 => nr_pages=1 => breakage. The proper > page count for this example is 2. Such a scatterlist would likely violate the device's underlying boundaries and is not legal ... there's supposed to be special code checking the queue alignment and copying the bio to an aligned buffer if the limits are violated. Where are you generating these scatterlists from? James ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] scsi: properly count the number of pages in scsi_req_map_sg() 2006-03-21 15:54 ` James Bottomley @ 2006-03-21 16:19 ` Dan Aloni 2006-03-21 18:05 ` Bryan Holty 2006-03-23 14:52 ` Christoph Hellwig 0 siblings, 2 replies; 11+ messages in thread From: Dan Aloni @ 2006-03-21 16:19 UTC (permalink / raw) To: James Bottomley; +Cc: linux-scsi, Linux Kernel List, brking, dror On Tue, Mar 21, 2006 at 09:54:54AM -0600, James Bottomley wrote: > This is a good email to discuss on the scsi list: > linux-scsi@vger.kernel.org; whom I've added to the cc list. > > On Tue, 2006-03-21 at 10:38 +0200, Dan Aloni wrote: > > Improper calculation of the number of pages causes bio_alloc() to > > be called with nr_iovecs=0, and slab corruption later. > > > > For example, a simple scatterlist that fails: {(3644,452), (0, 60)}, > > (offset, size). bufflen=512 => nr_pages=1 => breakage. The proper > > page count for this example is 2. > > Such a scatterlist would likely violate the device's underlying > boundaries and is not legal ... there's supposed to be special code > checking the queue alignment and copying the bio to an aligned buffer if > the limits are violated. Where are you generating these scatterlists > from? These scatterlists can be generated using the sg driver. Though I am actually running a customized version of the sg driver, it seems the conversion from a userspace array of sg_iovec_t to scatterlist stays the same and also applies to the original driver (see st_map_user_pages()). -- Dan Aloni da-x@monatomic.org, da-x@colinux.org, da-x@gmx.net, dan@xiv.co.il ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] scsi: properly count the number of pages in scsi_req_map_sg() 2006-03-21 16:19 ` Dan Aloni @ 2006-03-21 18:05 ` Bryan Holty 2006-03-21 19:17 ` Mike Christie 2006-03-23 14:52 ` Christoph Hellwig 1 sibling, 1 reply; 11+ messages in thread From: Bryan Holty @ 2006-03-21 18:05 UTC (permalink / raw) To: Dan Aloni; +Cc: James Bottomley, linux-scsi, Linux Kernel List, brking, dror On Tuesday 21 March 2006 10:19, Dan Aloni wrote: > On Tue, Mar 21, 2006 at 09:54:54AM -0600, James Bottomley wrote: > > This is a good email to discuss on the scsi list: > > linux-scsi@vger.kernel.org; whom I've added to the cc list. > > > > On Tue, 2006-03-21 at 10:38 +0200, Dan Aloni wrote: > > > Improper calculation of the number of pages causes bio_alloc() to > > > be called with nr_iovecs=0, and slab corruption later. > > > > > > For example, a simple scatterlist that fails: {(3644,452), (0, 60)}, > > > (offset, size). bufflen=512 => nr_pages=1 => breakage. The proper > > > page count for this example is 2. > > > > Such a scatterlist would likely violate the device's underlying > > boundaries and is not legal ... there's supposed to be special code > > checking the queue alignment and copying the bio to an aligned buffer if > > the limits are violated. Where are you generating these scatterlists > > from? > > These scatterlists can be generated using the sg driver. Though I am > actually running a customized version of the sg driver, it seems the > conversion from a userspace array of sg_iovec_t to scatterlist stays > the same and also applies to the original driver (see > st_map_user_pages()). Hello, I am seeing the same issue when using direct io with sg. sg will perform direct io on any date that is aligned with the devices dma_align. The default for drivers that do not specify is 512. sg builds the scatter gather list from the user specified location, offsetting the first entry in the list if not page aligned. This is the case that causes the improper allocation of "nr_iovec" in scsi_req_map_sgand the later slab corruption. I don't think it is necessary to calculate nr_pages from the entire list. Only sgl[0] is allowed to have an offset, so we can calculate from that as follows. --- a/drivers/scsi/scsi_lib.c 2006-03-03 13:17:22.000000000 -0600 +++ b/drivers/scsi/scsi_lib.c 2006-03-21 11:36:39.389763804 -0600 @@ -368,12 +368,15 @@ int nsegs, unsigned bufflen, gfp_t gfp) { struct request_queue *q = rq->q; - int nr_pages = (bufflen + PAGE_SIZE - 1) >> PAGE_SHIFT; + int nr_pages = 0; unsigned int data_len = 0, len, bytes, off; struct page *page; struct bio *bio = NULL; int i, err, nr_vecs = 0; - + + if (nsegs) + nr_pages = (bufflen + sgl[0].offset + PAGE_SIZE - 1) >> PAGE_SHIFT; + for (i = 0; i < nsegs; i++) { page = sgl[i].page; off = sgl[i].offset; -- Bryan Holty ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] scsi: properly count the number of pages in scsi_req_map_sg() 2006-03-21 18:05 ` Bryan Holty @ 2006-03-21 19:17 ` Mike Christie 2006-03-21 20:48 ` Bryan Holty 0 siblings, 1 reply; 11+ messages in thread From: Mike Christie @ 2006-03-21 19:17 UTC (permalink / raw) To: Bryan Holty Cc: Dan Aloni, James Bottomley, linux-scsi, Linux Kernel List, brking, dror Bryan Holty wrote: > On Tuesday 21 March 2006 10:19, Dan Aloni wrote: > >>On Tue, Mar 21, 2006 at 09:54:54AM -0600, James Bottomley wrote: >> >>>This is a good email to discuss on the scsi list: >>>linux-scsi@vger.kernel.org; whom I've added to the cc list. >>> >>>On Tue, 2006-03-21 at 10:38 +0200, Dan Aloni wrote: >>> >>>>Improper calculation of the number of pages causes bio_alloc() to >>>>be called with nr_iovecs=0, and slab corruption later. >>>> >>>>For example, a simple scatterlist that fails: {(3644,452), (0, 60)}, >>>>(offset, size). bufflen=512 => nr_pages=1 => breakage. The proper >>>>page count for this example is 2. >>> >>>Such a scatterlist would likely violate the device's underlying >>>boundaries and is not legal ... there's supposed to be special code >>>checking the queue alignment and copying the bio to an aligned buffer if >>>the limits are violated. Where are you generating these scatterlists >>>from? >> >>These scatterlists can be generated using the sg driver. Though I am >>actually running a customized version of the sg driver, it seems the >>conversion from a userspace array of sg_iovec_t to scatterlist stays >>the same and also applies to the original driver (see >>st_map_user_pages()). > > > Hello, > I am seeing the same issue when using direct io with sg. sg will perform > direct io on any date that is aligned with the devices dma_align. The > default for drivers that do not specify is 512. sg builds the scatter gather > list from the user specified location, offsetting the first entry in the list > if not page aligned. This is the case that causes the improper allocation of > "nr_iovec" in scsi_req_map_sgand the later slab corruption. > > I don't think it is necessary to calculate nr_pages from the entire list. > Only sgl[0] is allowed to have an offset, so we can calculate from that as > follows. > --- a/drivers/scsi/scsi_lib.c 2006-03-03 13:17:22.000000000 -0600 > +++ b/drivers/scsi/scsi_lib.c 2006-03-21 11:36:39.389763804 -0600 > @@ -368,12 +368,15 @@ > int nsegs, unsigned bufflen, gfp_t gfp) > { > struct request_queue *q = rq->q; > - int nr_pages = (bufflen + PAGE_SIZE - 1) >> PAGE_SHIFT; > + int nr_pages = 0; > unsigned int data_len = 0, len, bytes, off; > struct page *page; > struct bio *bio = NULL; > int i, err, nr_vecs = 0; > - > + > + if (nsegs) you can drop that test > + nr_pages = (bufflen + sgl[0].offset + PAGE_SIZE - 1) >> PAGE_SHIFT; > + I think we can do this without looping but I think this is broken. If we had a slight variant of Dan's example but we have a page and some change in the first entry {3644, 4548} and 0,60} in the last one, that would would only calculate two pages but we want three. I think we can have to calculate the first and last entries but the middle ones we can assume have no offset and lengths that are multiples of a page. ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] scsi: properly count the number of pages in scsi_req_map_sg() 2006-03-21 19:17 ` Mike Christie @ 2006-03-21 20:48 ` Bryan Holty 2006-03-22 12:35 ` Bryan Holty 0 siblings, 1 reply; 11+ messages in thread From: Bryan Holty @ 2006-03-21 20:48 UTC (permalink / raw) To: Mike Christie Cc: Dan Aloni, James Bottomley, linux-scsi, Linux Kernel List, brking, dror On Tuesday 21 March 2006 13:17, Mike Christie wrote: > Bryan Holty wrote: > > On Tuesday 21 March 2006 10:19, Dan Aloni wrote: > >>On Tue, Mar 21, 2006 at 09:54:54AM -0600, James Bottomley wrote: > >>>This is a good email to discuss on the scsi list: > >>>linux-scsi@vger.kernel.org; whom I've added to the cc list. > >>> > >>>On Tue, 2006-03-21 at 10:38 +0200, Dan Aloni wrote: > >>>>Improper calculation of the number of pages causes bio_alloc() to > >>>>be called with nr_iovecs=0, and slab corruption later. > >>>> > >>>>For example, a simple scatterlist that fails: {(3644,452), (0, 60)}, > >>>>(offset, size). bufflen=512 => nr_pages=1 => breakage. The proper > >>>>page count for this example is 2. > >>> > >>>Such a scatterlist would likely violate the device's underlying > >>>boundaries and is not legal ... there's supposed to be special code > >>>checking the queue alignment and copying the bio to an aligned buffer if > >>>the limits are violated. Where are you generating these scatterlists > >>>from? > >> > >>These scatterlists can be generated using the sg driver. Though I am > >>actually running a customized version of the sg driver, it seems the > >>conversion from a userspace array of sg_iovec_t to scatterlist stays > >>the same and also applies to the original driver (see > >>st_map_user_pages()). > > > > Hello, > > I am seeing the same issue when using direct io with sg. sg will > > perform direct io on any date that is aligned with the devices dma_align. > > The default for drivers that do not specify is 512. sg builds the > > scatter gather list from the user specified location, offsetting the > > first entry in the list if not page aligned. This is the case that > > causes the improper allocation of "nr_iovec" in scsi_req_map_sgand the > > later slab corruption. > > > > I don't think it is necessary to calculate nr_pages from the entire > > list. Only sgl[0] is allowed to have an offset, so we can calculate from > > that as follows. > > --- a/drivers/scsi/scsi_lib.c 2006-03-03 13:17:22.000000000 -0600 > > +++ b/drivers/scsi/scsi_lib.c 2006-03-21 11:36:39.389763804 -0600 > > @@ -368,12 +368,15 @@ > > int nsegs, unsigned bufflen, gfp_t gfp) > > { > > struct request_queue *q = rq->q; > > - int nr_pages = (bufflen + PAGE_SIZE - 1) >> PAGE_SHIFT; > > + int nr_pages = 0; > > unsigned int data_len = 0, len, bytes, off; > > struct page *page; > > struct bio *bio = NULL; > > int i, err, nr_vecs = 0; > > - > > + > > + if (nsegs) > > you can drop that test > > > + nr_pages = (bufflen + sgl[0].offset + PAGE_SIZE - 1) >> PAGE_SHIFT; > > + > > I think we can do this without looping but I think this is broken. If we > had a slight variant of Dan's example but we have a page and some change > in the first entry {3644, 4548} and 0,60} in the last one, that would > would only calculate two pages but we want three. > > I think we can have to calculate the first and last entries but the > middle ones we can assume have no offset and lengths that are multiples > of a page. You are absolutely correct. bufflen is not page masked and when added to the offset of the first entry, will generate the correct nr_pages. nr_pages = (bufflen + sgl[0].offset + PAGE_SIZE - 1) >> PAGE_SHIFT; == nr_pages = ((bufflen & PAGE_MASK) + (PAGE_SIZE-1) + sgl[0].offset + sgl[nsegs-1].length) >> PAGE_SHIFT; Either will calculate correctly. -- Bryan Holty ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] scsi: properly count the number of pages in scsi_req_map_sg() 2006-03-21 20:48 ` Bryan Holty @ 2006-03-22 12:35 ` Bryan Holty 2006-05-26 6:13 ` Mike Christie 0 siblings, 1 reply; 11+ messages in thread From: Bryan Holty @ 2006-03-22 12:35 UTC (permalink / raw) To: Mike Christie Cc: Dan Aloni, James Bottomley, linux-scsi, Linux Kernel List, brking, dror On Tuesday 21 March 2006 14:48, Bryan Holty wrote: > On Tuesday 21 March 2006 13:17, Mike Christie wrote: > > Bryan Holty wrote: > > > On Tuesday 21 March 2006 10:19, Dan Aloni wrote: > > >>On Tue, Mar 21, 2006 at 09:54:54AM -0600, James Bottomley wrote: > > >>>This is a good email to discuss on the scsi list: > > >>>linux-scsi@vger.kernel.org; whom I've added to the cc list. > > >>> > > >>>On Tue, 2006-03-21 at 10:38 +0200, Dan Aloni wrote: > > >>>>Improper calculation of the number of pages causes bio_alloc() to > > >>>>be called with nr_iovecs=0, and slab corruption later. > > >>>> > > >>>>For example, a simple scatterlist that fails: {(3644,452), (0, 60)}, > > >>>>(offset, size). bufflen=512 => nr_pages=1 => breakage. The proper > > >>>>page count for this example is 2. > > >>> > > >>>Such a scatterlist would likely violate the device's underlying > > >>>boundaries and is not legal ... there's supposed to be special code > > >>>checking the queue alignment and copying the bio to an aligned buffer > > >>> if the limits are violated. Where are you generating these > > >>> scatterlists from? > > >> > > >>These scatterlists can be generated using the sg driver. Though I am > > >>actually running a customized version of the sg driver, it seems the > > >>conversion from a userspace array of sg_iovec_t to scatterlist stays > > >>the same and also applies to the original driver (see > > >>st_map_user_pages()). > > > > > > Hello, > > > I am seeing the same issue when using direct io with sg. sg will > > > perform direct io on any date that is aligned with the devices > > > dma_align. The default for drivers that do not specify is 512. sg > > > builds the scatter gather list from the user specified location, > > > offsetting the first entry in the list if not page aligned. This is > > > the case that causes the improper allocation of "nr_iovec" in > > > scsi_req_map_sgand the later slab corruption. > > > > > > I don't think it is necessary to calculate nr_pages from the entire > > > list. Only sgl[0] is allowed to have an offset, so we can calculate > > > from that as follows. > > > --- a/drivers/scsi/scsi_lib.c 2006-03-03 13:17:22.000000000 -0600 > > > +++ b/drivers/scsi/scsi_lib.c 2006-03-21 11:36:39.389763804 -0600 > > > @@ -368,12 +368,15 @@ > > > int nsegs, unsigned bufflen, gfp_t gfp) > > > { > > > struct request_queue *q = rq->q; > > > - int nr_pages = (bufflen + PAGE_SIZE - 1) >> PAGE_SHIFT; > > > + int nr_pages = 0; > > > unsigned int data_len = 0, len, bytes, off; > > > struct page *page; > > > struct bio *bio = NULL; > > > int i, err, nr_vecs = 0; > > > - > > > + > > > + if (nsegs) > > > > you can drop that test > > > > > + nr_pages = (bufflen + sgl[0].offset + PAGE_SIZE - 1) >> PAGE_SHIFT; > > > + > > > > I think we can do this without looping but I think this is broken. If we > > had a slight variant of Dan's example but we have a page and some change > > in the first entry {3644, 4548} and 0,60} in the last one, that would > > would only calculate two pages but we want three. > > > > I think we can have to calculate the first and last entries but the > > middle ones we can assume have no offset and lengths that are multiples > > of a page. > > You are absolutely correct. > bufflen is not page masked and when added to the offset of the first entry, > will generate the correct nr_pages. > > nr_pages = (bufflen + sgl[0].offset + PAGE_SIZE - 1) >> PAGE_SHIFT; > == > nr_pages = ((bufflen & PAGE_MASK) + (PAGE_SIZE-1) + sgl[0].offset + > sgl[nsegs-1].length) >> PAGE_SHIFT; > > Either will calculate correctly. > > -- > Bryan Holty > - > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ Based on above, I think the most intuitive fix would be the offset addition of the first entry to the initialization of nr_pages. Without this change, for instance, with 4K io's every sg io that is dma_aligned for direct io, but not page aligned will cause slab corruption and an oops I am able to run a number of tests with sg that cause the boundary to be crossed, and with this fix there is no slab corruption or data corruption. Thanks Dan, I had been hunting for this for a couple of days!! Thoughts?? Signed-off-by: Bryan Holty <lgeek@frontiernet.net> --- a/drivers/scsi/scsi_lib.c 2006-03-03 13:17:22.000000000 -0600 +++ b/drivers/scsi/scsi_lib.c 2006-03-22 06:09:09.669599539 -0600 @@ -368,7 +368,7 @@ int nsegs, unsigned bufflen, gfp_t gfp) { struct request_queue *q = rq->q; - int nr_pages = (bufflen + PAGE_SIZE - 1) >> PAGE_SHIFT; + int nr_pages = (bufflen + sgl[0].offset + PAGE_SIZE - 1) >> PAGE_SHIFT; unsigned int data_len = 0, len, bytes, off; struct page *page; struct bio *bio = NULL; -- Bryan Holty ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] scsi: properly count the number of pages in scsi_req_map_sg() 2006-03-22 12:35 ` Bryan Holty @ 2006-05-26 6:13 ` Mike Christie 2006-05-26 13:23 ` Bryan Holty 0 siblings, 1 reply; 11+ messages in thread From: Mike Christie @ 2006-05-26 6:13 UTC (permalink / raw) To: Bryan Holty Cc: Dan Aloni, James Bottomley, linux-scsi, Linux Kernel List, brking, dror Bryan Holty wrote: > On Tuesday 21 March 2006 14:48, Bryan Holty wrote: >> On Tuesday 21 March 2006 13:17, Mike Christie wrote: >>> Bryan Holty wrote: >>>> On Tuesday 21 March 2006 10:19, Dan Aloni wrote: >>>>> On Tue, Mar 21, 2006 at 09:54:54AM -0600, James Bottomley wrote: >>>>>> This is a good email to discuss on the scsi list: >>>>>> linux-scsi@vger.kernel.org; whom I've added to the cc list. >>>>>> >>>>>> On Tue, 2006-03-21 at 10:38 +0200, Dan Aloni wrote: >>>>>>> Improper calculation of the number of pages causes bio_alloc() to >>>>>>> be called with nr_iovecs=0, and slab corruption later. >>>>>>> >>>>>>> For example, a simple scatterlist that fails: {(3644,452), (0, 60)}, >>>>>>> (offset, size). bufflen=512 => nr_pages=1 => breakage. The proper >>>>>>> page count for this example is 2. >>>>>> Such a scatterlist would likely violate the device's underlying >>>>>> boundaries and is not legal ... there's supposed to be special code >>>>>> checking the queue alignment and copying the bio to an aligned buffer >>>>>> if the limits are violated. Where are you generating these >>>>>> scatterlists from? >>>>> These scatterlists can be generated using the sg driver. Though I am >>>>> actually running a customized version of the sg driver, it seems the >>>>> conversion from a userspace array of sg_iovec_t to scatterlist stays >>>>> the same and also applies to the original driver (see >>>>> st_map_user_pages()). >>>> Hello, >>>> I am seeing the same issue when using direct io with sg. sg will >>>> perform direct io on any date that is aligned with the devices >>>> dma_align. The default for drivers that do not specify is 512. sg >>>> builds the scatter gather list from the user specified location, >>>> offsetting the first entry in the list if not page aligned. This is >>>> the case that causes the improper allocation of "nr_iovec" in >>>> scsi_req_map_sgand the later slab corruption. >>>> >>>> I don't think it is necessary to calculate nr_pages from the entire >>>> list. Only sgl[0] is allowed to have an offset, so we can calculate >>>> from that as follows. >>>> --- a/drivers/scsi/scsi_lib.c 2006-03-03 13:17:22.000000000 -0600 >>>> +++ b/drivers/scsi/scsi_lib.c 2006-03-21 11:36:39.389763804 -0600 >>>> @@ -368,12 +368,15 @@ >>>> int nsegs, unsigned bufflen, gfp_t gfp) >>>> { >>>> struct request_queue *q = rq->q; >>>> - int nr_pages = (bufflen + PAGE_SIZE - 1) >> PAGE_SHIFT; >>>> + int nr_pages = 0; >>>> unsigned int data_len = 0, len, bytes, off; >>>> struct page *page; >>>> struct bio *bio = NULL; >>>> int i, err, nr_vecs = 0; >>>> - >>>> + >>>> + if (nsegs) >>> you can drop that test >>> >>>> + nr_pages = (bufflen + sgl[0].offset + PAGE_SIZE - 1) >> PAGE_SHIFT; >>>> + >>> I think we can do this without looping but I think this is broken. If we >>> had a slight variant of Dan's example but we have a page and some change >>> in the first entry {3644, 4548} and 0,60} in the last one, that would >>> would only calculate two pages but we want three. >>> >>> I think we can have to calculate the first and last entries but the >>> middle ones we can assume have no offset and lengths that are multiples >>> of a page. >> You are absolutely correct. >> bufflen is not page masked and when added to the offset of the first entry, >> will generate the correct nr_pages. >> >> nr_pages = (bufflen + sgl[0].offset + PAGE_SIZE - 1) >> PAGE_SHIFT; >> == >> nr_pages = ((bufflen & PAGE_MASK) + (PAGE_SIZE-1) + sgl[0].offset + >> sgl[nsegs-1].length) >> PAGE_SHIFT; >> >> Either will calculate correctly. >> >> -- >> Bryan Holty >> - >> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in >> the body of a message to majordomo@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html >> Please read the FAQ at http://www.tux.org/lkml/ > > Based on above, I think the most intuitive fix would be the offset addition of > the first entry to the initialization of nr_pages. > > Without this change, for instance, with 4K io's every sg io that is > dma_aligned for direct io, but not page aligned will cause slab corruption > and an oops > > I am able to run a number of tests with sg that cause the boundary to be > crossed, and with this fix there is no slab corruption or data corruption. > > Thanks Dan, I had been hunting for this for a couple of days!! > > Thoughts?? > > Signed-off-by: Bryan Holty <lgeek@frontiernet.net> > > --- a/drivers/scsi/scsi_lib.c 2006-03-03 13:17:22.000000000 -0600 > +++ b/drivers/scsi/scsi_lib.c 2006-03-22 06:09:09.669599539 -0600 > @@ -368,7 +368,7 @@ > int nsegs, unsigned bufflen, gfp_t gfp) > { > struct request_queue *q = rq->q; > - int nr_pages = (bufflen + PAGE_SIZE - 1) >> PAGE_SHIFT; > + int nr_pages = (bufflen + sgl[0].offset + PAGE_SIZE - 1) >> PAGE_SHIFT; > unsigned int data_len = 0, len, bytes, off; > struct page *page; > struct bio *bio = NULL; > -- Was this patch ok? I was looking at some other problem and noticed this was not merged. The original calculation that I did in the code is wrong. I thought the calculation in this patch was correct, but I have showed that my math skills are lacking so I may be wrong. ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] scsi: properly count the number of pages in scsi_req_map_sg() 2006-05-26 6:13 ` Mike Christie @ 2006-05-26 13:23 ` Bryan Holty 0 siblings, 0 replies; 11+ messages in thread From: Bryan Holty @ 2006-05-26 13:23 UTC (permalink / raw) To: Mike Christie Cc: Dan Aloni, James Bottomley, linux-scsi, Linux Kernel List, brking, dror On Friday 26 May 2006 01:13, Mike Christie wrote: > Bryan Holty wrote: > > On Tuesday 21 March 2006 14:48, Bryan Holty wrote: > >> On Tuesday 21 March 2006 13:17, Mike Christie wrote: > >>> Bryan Holty wrote: > >>>> On Tuesday 21 March 2006 10:19, Dan Aloni wrote: > >>>>> On Tue, Mar 21, 2006 at 09:54:54AM -0600, James Bottomley wrote: > >>>>>> This is a good email to discuss on the scsi list: > >>>>>> linux-scsi@vger.kernel.org; whom I've added to the cc list. > >>>>>> > >>>>>> On Tue, 2006-03-21 at 10:38 +0200, Dan Aloni wrote: > >>>>>>> Improper calculation of the number of pages causes bio_alloc() to > >>>>>>> be called with nr_iovecs=0, and slab corruption later. > >>>>>>> > >>>>>>> For example, a simple scatterlist that fails: {(3644,452), (0, > >>>>>>> 60)}, (offset, size). bufflen=512 => nr_pages=1 => breakage. The > >>>>>>> proper page count for this example is 2. > >>>>>> > >>>>>> Such a scatterlist would likely violate the device's underlying > >>>>>> boundaries and is not legal ... there's supposed to be special code > >>>>>> checking the queue alignment and copying the bio to an aligned > >>>>>> buffer if the limits are violated. Where are you generating these > >>>>>> scatterlists from? > >>>>> > >>>>> These scatterlists can be generated using the sg driver. Though I am > >>>>> actually running a customized version of the sg driver, it seems the > >>>>> conversion from a userspace array of sg_iovec_t to scatterlist stays > >>>>> the same and also applies to the original driver (see > >>>>> st_map_user_pages()). > >>>> > >>>> Hello, > >>>> I am seeing the same issue when using direct io with sg. sg will > >>>> perform direct io on any date that is aligned with the devices > >>>> dma_align. The default for drivers that do not specify is 512. sg > >>>> builds the scatter gather list from the user specified location, > >>>> offsetting the first entry in the list if not page aligned. This is > >>>> the case that causes the improper allocation of "nr_iovec" in > >>>> scsi_req_map_sgand the later slab corruption. > >>>> > >>>> I don't think it is necessary to calculate nr_pages from the > >>>> entire list. Only sgl[0] is allowed to have an offset, so we can > >>>> calculate from that as follows. > >>>> --- a/drivers/scsi/scsi_lib.c 2006-03-03 13:17:22.000000000 -0600 > >>>> +++ b/drivers/scsi/scsi_lib.c 2006-03-21 11:36:39.389763804 -0600 > >>>> @@ -368,12 +368,15 @@ > >>>> int nsegs, unsigned bufflen, gfp_t gfp) > >>>> { > >>>> struct request_queue *q = rq->q; > >>>> - int nr_pages = (bufflen + PAGE_SIZE - 1) >> PAGE_SHIFT; > >>>> + int nr_pages = 0; > >>>> unsigned int data_len = 0, len, bytes, off; > >>>> struct page *page; > >>>> struct bio *bio = NULL; > >>>> int i, err, nr_vecs = 0; > >>>> - > >>>> + > >>>> + if (nsegs) > >>> > >>> you can drop that test > >>> > >>>> + nr_pages = (bufflen + sgl[0].offset + PAGE_SIZE - 1) >> > >>>> PAGE_SHIFT; + > >>> > >>> I think we can do this without looping but I think this is broken. If > >>> we had a slight variant of Dan's example but we have a page and some > >>> change in the first entry {3644, 4548} and 0,60} in the last one, that > >>> would would only calculate two pages but we want three. > >>> > >>> I think we can have to calculate the first and last entries but the > >>> middle ones we can assume have no offset and lengths that are multiples > >>> of a page. > >> > >> You are absolutely correct. > >> bufflen is not page masked and when added to the offset of the first > >> entry, will generate the correct nr_pages. > >> > >> nr_pages = (bufflen + sgl[0].offset + PAGE_SIZE - 1) >> PAGE_SHIFT; > >> == > >> nr_pages = ((bufflen & PAGE_MASK) + (PAGE_SIZE-1) + sgl[0].offset + > >> sgl[nsegs-1].length) >> PAGE_SHIFT; > >> > >> Either will calculate correctly. > >> > >> -- > >> Bryan Holty > >> - > >> To unsubscribe from this list: send the line "unsubscribe linux-kernel" > >> in the body of a message to majordomo@vger.kernel.org > >> More majordomo info at http://vger.kernel.org/majordomo-info.html > >> Please read the FAQ at http://www.tux.org/lkml/ > > > > Based on above, I think the most intuitive fix would be the offset > > addition of the first entry to the initialization of nr_pages. > > > > Without this change, for instance, with 4K io's every sg io that is > > dma_aligned for direct io, but not page aligned will cause slab > > corruption and an oops > > > > I am able to run a number of tests with sg that cause the boundary to be > > crossed, and with this fix there is no slab corruption or data > > corruption. > > > > Thanks Dan, I had been hunting for this for a couple of days!! > > > > Thoughts?? > > > > Signed-off-by: Bryan Holty <lgeek@frontiernet.net> > > > > --- a/drivers/scsi/scsi_lib.c 2006-03-03 13:17:22.000000000 -0600 > > +++ b/drivers/scsi/scsi_lib.c 2006-03-22 06:09:09.669599539 -0600 > > @@ -368,7 +368,7 @@ > > int nsegs, unsigned bufflen, gfp_t gfp) > > { > > struct request_queue *q = rq->q; > > - int nr_pages = (bufflen + PAGE_SIZE - 1) >> PAGE_SHIFT; > > + int nr_pages = (bufflen + sgl[0].offset + PAGE_SIZE - 1) >> PAGE_SHIFT; > > unsigned int data_len = 0, len, bytes, off; > > struct page *page; > > struct bio *bio = NULL; > > -- > > Was this patch ok? I was looking at some other problem and noticed this > was not merged. The original calculation that I did in the code is > wrong. I thought the calculation in this patch was correct, but I have > showed that my math skills are lacking so I may be wrong. Yes, this patch fixes the problem. We have been running with it since it was originally posted without problem. We crash near immediately without the patch in place. We are using sg with direct_io heavily. -- Thanks, Bryan Holty ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] scsi: properly count the number of pages in scsi_req_map_sg() 2006-03-21 16:19 ` Dan Aloni 2006-03-21 18:05 ` Bryan Holty @ 2006-03-23 14:52 ` Christoph Hellwig 2006-03-23 16:51 ` Bryan Holty 1 sibling, 1 reply; 11+ messages in thread From: Christoph Hellwig @ 2006-03-23 14:52 UTC (permalink / raw) To: Dan Aloni; +Cc: James Bottomley, linux-scsi, Linux Kernel List, brking, dror On Tue, Mar 21, 2006 at 06:19:12PM +0200, Dan Aloni wrote: > These scatterlists can be generated using the sg driver. Though I am > actually running a customized version of the sg driver, it seems the > conversion from a userspace array of sg_iovec_t to scatterlist stays > the same and also applies to the original driver (see > st_map_user_pages()). What kernel version did you reproduce this with? Since 2.6.16 sg should obey all request size/alingment limitations. If not that's a bug in scsi_execute_async and it's helpers and should be fixed there. ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] scsi: properly count the number of pages in scsi_req_map_sg() 2006-03-23 14:52 ` Christoph Hellwig @ 2006-03-23 16:51 ` Bryan Holty 0 siblings, 0 replies; 11+ messages in thread From: Bryan Holty @ 2006-03-23 16:51 UTC (permalink / raw) To: Christoph Hellwig Cc: Dan Aloni, James Bottomley, linux-scsi, Linux Kernel List, brking, dror On Thursday 23 March 2006 08:52, Christoph Hellwig wrote: > On Tue, Mar 21, 2006 at 06:19:12PM +0200, Dan Aloni wrote: > > These scatterlists can be generated using the sg driver. Though I am > > actually running a customized version of the sg driver, it seems the > > conversion from a userspace array of sg_iovec_t to scatterlist stays > > the same and also applies to the original driver (see > > st_map_user_pages()). > > What kernel version did you reproduce this with? Since 2.6.16 sg should > obey all request size/alingment limitations. If not that's a bug in > scsi_execute_async and it's helpers and should be fixed there. > - > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ I am able to reproduce this with 2.6.16-rc5 - 2.6.16. There is a problem in scsi_req_map_sg which is called by scsi_execute_async. Currently, scsi_req_map_sg assumes every sgl entry is page aligned. It will cause later slab corruption by under-allocating the number of bio entries if sgl[0].offset + sgl[last].length > PAGE_SIZE. Dan pointed this out, and I have submitted a patch that I believe correctly fixes the issue. Just waiting for some feedback. -- Bryan Holty ^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2006-05-26 13:24 UTC | newest] Thread overview: 11+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2006-03-21 8:38 [PATCH] scsi: properly count the number of pages in scsi_req_map_sg() Dan Aloni 2006-03-21 15:54 ` James Bottomley 2006-03-21 16:19 ` Dan Aloni 2006-03-21 18:05 ` Bryan Holty 2006-03-21 19:17 ` Mike Christie 2006-03-21 20:48 ` Bryan Holty 2006-03-22 12:35 ` Bryan Holty 2006-05-26 6:13 ` Mike Christie 2006-05-26 13:23 ` Bryan Holty 2006-03-23 14:52 ` Christoph Hellwig 2006-03-23 16:51 ` Bryan Holty
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox