[PATCH] dm-io: deal with wandering queue limits when handling REQ_DISCARD and REQ_WRITE

* [PATCH] dm-io: deal with wandering queue limits when handling REQ_DISCARD and REQ_WRITE_SAME
@ 2015-02-27 18:44 Darrick J. Wong
  2015-02-27 19:09 ` [PATCH v2] " Mikulas Patocka
  0 siblings, 1 reply; 8+ messages in thread
From: Darrick J. Wong @ 2015-02-27 18:44 UTC (permalink / raw)
  To: device-mapper development
  Cc: Mike Snitzer, Mikulas Patocka, Martin K. Petersen, agk,
	Srinivas Eeda

Since it's apparently possible that the queue limits for discard and
write same can change while the upper level command is being sliced
and diced, fix up both of them (a) to reject IO if the special command
is unsupported at the start of the function and (b) read the limits
once and let the commands error out on their own if the status happens
to change.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 drivers/md/dm-io.c |   18 +++++++++++++-----
 1 file changed, 13 insertions(+), 5 deletions(-)

diff --git a/drivers/md/dm-io.c b/drivers/md/dm-io.c
index 37de017..d66cfb2 100644
--- a/drivers/md/dm-io.c
+++ b/drivers/md/dm-io.c
@@ -289,9 +289,15 @@ static void do_region(int rw, unsigned region, struct dm_io_region *where,
 	struct request_queue *q = bdev_get_queue(where->bdev);
 	unsigned short logical_block_size = queue_logical_block_size(q);
 	sector_t num_sectors;
-
-	/* Reject unsupported discard requests */
-	if ((rw & REQ_DISCARD) && !blk_queue_discard(q)) {
+	unsigned int special_cmd_max_sectors;
+
+	/* Reject unsupported discard and write same requests */
+	if (rw & REQ_DISCARD)
+		special_cmd_max_sectors = q->limits.max_discard_sectors;
+	else if (rw & REQ_WRITE_SAME)
+		special_cmd_max_sectors = q->limits.max_write_same_sectors;
+	if ((rw & (REQ_DISCARD | REQ_WRITE_SAME)) &&
+	    special_cmd_max_sectors == 0) {
 		dec_count(io, region, -EOPNOTSUPP);
 		return;
 	}
@@ -317,7 +323,8 @@ static void do_region(int rw, unsigned region, struct dm_io_region *where,
 		store_io_and_region_in_bio(bio, io, region);
 
 		if (rw & REQ_DISCARD) {
-			num_sectors = min_t(sector_t, q->limits.max_discard_sectors, remaining);
+			num_sectors = min_t(sector_t, special_cmd_max_sectors,
+					    remaining);
 			bio->bi_iter.bi_size = num_sectors << SECTOR_SHIFT;
 			remaining -= num_sectors;
 		} else if (rw & REQ_WRITE_SAME) {
@@ -326,7 +333,8 @@ static void do_region(int rw, unsigned region, struct dm_io_region *where,
 			 */
 			dp->get_page(dp, &page, &len, &offset);
 			bio_add_page(bio, page, logical_block_size, offset);
-			num_sectors = min_t(sector_t, q->limits.max_write_same_sectors, remaining);
+			num_sectors = min_t(sector_t, special_cmd_max_sectors,
+					    remaining);
 			bio->bi_iter.bi_size = num_sectors << SECTOR_SHIFT;
 
 			offset = 0;

* [PATCH v2] dm-io: deal with wandering queue limits when handling REQ_DISCARD and REQ_WRITE_SAME
  2015-02-27 18:44 [PATCH] dm-io: deal with wandering queue limits when handling REQ_DISCARD and REQ_WRITE_SAME Darrick J. Wong
@ 2015-02-27 19:09 ` Mikulas Patocka
  2015-02-27 19:19   ` Mike Snitzer
  2015-02-27 19:58   ` Mike Snitzer
  0 siblings, 2 replies; 8+ messages in thread
From: Mikulas Patocka @ 2015-02-27 19:09 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: device-mapper development, Martin K. Petersen, Mike Snitzer, agk,
	Srinivas Eeda



On Fri, 27 Feb 2015, Darrick J. Wong wrote:

> Since it's apparently possible that the queue limits for discard and
> write same can change while the upper level command is being sliced
> and diced, fix up both of them (a) to reject IO if the special command
> is unsupported at the start of the function and (b) read the limits
> once and let the commands error out on their own if the status happens
> to change.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

> +	unsigned int special_cmd_max_sectors;
> +
> +	/* Reject unsupported discard and write same requests */
> +	if (rw & REQ_DISCARD)
> +		special_cmd_max_sectors = q->limits.max_discard_sectors;
> +	else if (rw & REQ_WRITE_SAME)
> +		special_cmd_max_sectors = q->limits.max_write_same_sectors;
> +	if ((rw & (REQ_DISCARD | REQ_WRITE_SAME)) &&
> +	    special_cmd_max_sectors == 0) {

That results in an uninitialized variable warning (although the warning is
a false positive). We need the uninitialized_var macro to suppress the
warning.

It's better to use ACCESS_ONCE on variables that may be changing so that 
the compiler doesn't load them multiple times.
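
As a rough illustration of the ACCESS_ONCE point, here is a minimal
userspace sketch, not the dm-io code, assuming GCC or Clang. The macro
mirrors the volatile-cast definition the kernel used at the time;
struct demo_limits, the DEMO_* flags and demo_snapshot_limit() are
hypothetical names invented for this example. Instead of
uninitialized_var, the sketch simply initializes the local to 0, which
also keeps the compiler quiet.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel macro of that era (GCC/Clang only):
 * the volatile cast forces exactly one load at this point. */
#define ACCESS_ONCE(x) (*(volatile __typeof__(x) *)&(x))

/* Hypothetical stand-ins for the request flags and the queue limits. */
#define DEMO_DISCARD    (1 << 0)
#define DEMO_WRITE_SAME (1 << 1)

struct demo_limits {
	unsigned int max_discard_sectors;
	unsigned int max_write_same_sectors;
};

/*
 * Snapshot the limit that applies to this request type.  For a discard
 * or write same request, 0 means "unsupported".  Ordinary reads and
 * writes have no special limit; 0 is returned for them as well, so the
 * caller must check the flags before interpreting the result.
 */
unsigned int demo_snapshot_limit(int rw, struct demo_limits *lim)
{
	unsigned int max_sectors = 0;	/* plain init instead of uninitialized_var */

	assert(lim != NULL);

	if (rw & DEMO_DISCARD)
		max_sectors = ACCESS_ONCE(lim->max_discard_sectors);
	else if (rw & DEMO_WRITE_SAME)
		max_sectors = ACCESS_ONCE(lim->max_write_same_sectors);

	return max_sectors;
}
```

The caller keeps using the returned snapshot for the rest of the
function rather than reading the limits structure again.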

Here I'm sending the updated patch.

Mikulas


From: Mikulas Patocka <mpatocka@redhat.com>

Since it's apparently possible that the queue limits for discard and
write same can change while the upper level command is being sliced
and diced, fix up both of them (a) to reject IO if the special command
is unsupported at the start of the function and (b) read the limits
once and let the commands error out on their own if the status happens
to change.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org

---
 drivers/md/dm-io.c |   18 +++++++++++++-----
 1 file changed, 13 insertions(+), 5 deletions(-)

Index: linux-2.6/drivers/md/dm-io.c
===================================================================
--- linux-2.6.orig/drivers/md/dm-io.c
+++ linux-2.6/drivers/md/dm-io.c
@@ -289,9 +289,15 @@ static void do_region(int rw, unsigned r
 	struct request_queue *q = bdev_get_queue(where->bdev);
 	unsigned short logical_block_size = queue_logical_block_size(q);
 	sector_t num_sectors;
+	unsigned int uninitialized_var(special_cmd_max_sectors);
 
-	/* Reject unsupported discard requests */
-	if ((rw & REQ_DISCARD) && !blk_queue_discard(q)) {
+	/* Reject unsupported discard and write same requests */
+	if (rw & REQ_DISCARD)
+		special_cmd_max_sectors = ACCESS_ONCE(q->limits.max_discard_sectors);
+	else if (rw & REQ_WRITE_SAME)
+		special_cmd_max_sectors = ACCESS_ONCE(q->limits.max_write_same_sectors);
+	if ((rw & (REQ_DISCARD | REQ_WRITE_SAME)) &&
+	    special_cmd_max_sectors == 0) {
 		dec_count(io, region, -EOPNOTSUPP);
 		return;
 	}
@@ -317,7 +323,8 @@ static void do_region(int rw, unsigned r
 		store_io_and_region_in_bio(bio, io, region);
 
 		if (rw & REQ_DISCARD) {
-			num_sectors = min_t(sector_t, q->limits.max_discard_sectors, remaining);
+			num_sectors = min_t(sector_t, special_cmd_max_sectors,
+					    remaining);
 			bio->bi_iter.bi_size = num_sectors << SECTOR_SHIFT;
 			remaining -= num_sectors;
 		} else if (rw & REQ_WRITE_SAME) {
@@ -326,7 +333,8 @@ static void do_region(int rw, unsigned r
 			 */
 			dp->get_page(dp, &page, &len, &offset);
 			bio_add_page(bio, page, logical_block_size, offset);
-			num_sectors = min_t(sector_t, q->limits.max_write_same_sectors, remaining);
+			num_sectors = min_t(sector_t, special_cmd_max_sectors,
+					    remaining);
 			bio->bi_iter.bi_size = num_sectors << SECTOR_SHIFT;
 
 			offset = 0;

* Re: [PATCH v2] dm-io: deal with wandering queue limits when handling REQ_DISCARD and REQ_WRITE_SAME
  2015-02-27 19:09 ` [PATCH v2] " Mikulas Patocka
@ 2015-02-27 19:19   ` Mike Snitzer
  2015-02-27 19:21     ` Mikulas Patocka
  2015-02-27 19:58   ` Mike Snitzer
  1 sibling, 1 reply; 8+ messages in thread
From: Mike Snitzer @ 2015-02-27 19:19 UTC (permalink / raw)
  To: Mikulas Patocka
  Cc: device-mapper development, Srinivas Eeda, Martin K. Petersen, agk,
	Darrick J. Wong

On Fri, Feb 27 2015 at  2:09pm -0500,
Mikulas Patocka <mpatocka@redhat.com> wrote:

> 
> 
> On Fri, 27 Feb 2015, Darrick J. Wong wrote:
> 
> > Since it's apparently possible that the queue limits for discard and
> > write same can change while the upper level command is being sliced
> > and diced, fix up both of them (a) to reject IO if the special command
> > is unsupported at the start of the function and (b) read the limits
> > once and let the commands error out on their own if the status happens
> > to change.
> > 
> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> 
> > +	unsigned int special_cmd_max_sectors;
> > +
> > +	/* Reject unsupported discard and write same requests */
> > +	if (rw & REQ_DISCARD)
> > +		special_cmd_max_sectors = q->limits.max_discard_sectors;
> > +	else if (rw & REQ_WRITE_SAME)
> > +		special_cmd_max_sectors = q->limits.max_write_same_sectors;
> > +	if ((rw & (REQ_DISCARD | REQ_WRITE_SAME)) &&
> > +	    special_cmd_max_sectors == 0) {
> 
> That results in an uninitialized variable warning (although the warning is
> a false positive). We need the uninitialized_var macro to suppress the
> warning.
> 
> It's better to use ACCESS_ONCE on variables that may be changing so that 
> the compiler doesn't load them multiple times.
> 
> Here I'm sending the updated patch.
> 
> Mikulas
> 
> 
> From: Mikulas Patocka <mpatocka@redhat.com>
...

I'm reviewing this now, but just to be clear, this patch will still be
attributed to Darrick.

* Re: [PATCH v2] dm-io: deal with wandering queue limits when handling REQ_DISCARD and REQ_WRITE_SAME
  2015-02-27 19:19   ` Mike Snitzer
@ 2015-02-27 19:21     ` Mikulas Patocka
  0 siblings, 0 replies; 8+ messages in thread
From: Mikulas Patocka @ 2015-02-27 19:21 UTC (permalink / raw)
  To: Mike Snitzer
  Cc: device-mapper development, Srinivas Eeda, Martin K. Petersen, agk,
	Darrick J. Wong



On Fri, 27 Feb 2015, Mike Snitzer wrote:

> On Fri, Feb 27 2015 at  2:09pm -0500,
> Mikulas Patocka <mpatocka@redhat.com> wrote:
> 
> > 
> > 
> > On Fri, 27 Feb 2015, Darrick J. Wong wrote:
> > 
> > > Since it's apparently possible that the queue limits for discard and
> > > write same can change while the upper level command is being sliced
> > > and diced, fix up both of them (a) to reject IO if the special command
> > > is unsupported at the start of the function and (b) read the limits
> > > once and let the commands error out on their own if the status happens
> > > to change.
> > > 
> > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > 
> > > +	unsigned int special_cmd_max_sectors;
> > > +
> > > +	/* Reject unsupported discard and write same requests */
> > > +	if (rw & REQ_DISCARD)
> > > +		special_cmd_max_sectors = q->limits.max_discard_sectors;
> > > +	else if (rw & REQ_WRITE_SAME)
> > > +		special_cmd_max_sectors = q->limits.max_write_same_sectors;
> > > +	if ((rw & (REQ_DISCARD | REQ_WRITE_SAME)) &&
> > > +	    special_cmd_max_sectors == 0) {
> > 
> > That results in an uninitialized variable warning (although the warning is
> > a false positive). We need the uninitialized_var macro to suppress the
> > warning.
> > 
> > It's better to use ACCESS_ONCE on variables that may be changing so that 
> > the compiler doesn't load them multiple times.
> > 
> > Here I'm sending the updated patch.
> > 
> > Mikulas
> > 
> > 
> > From: Mikulas Patocka <mpatocka@redhat.com>
> ...
> 
> I'm reviewing this now, but just to be clear, this patch will still be
> attributed to Darrick.

Yes.

Mikulas

* Re: [PATCH v2] dm-io: deal with wandering queue limits when handling REQ_DISCARD and REQ_WRITE_SAME
  2015-02-27 19:09 ` [PATCH v2] " Mikulas Patocka
  2015-02-27 19:19   ` Mike Snitzer
@ 2015-02-27 19:58   ` Mike Snitzer
  2015-02-27 22:39     ` Mikulas Patocka
  1 sibling, 1 reply; 8+ messages in thread
From: Mike Snitzer @ 2015-02-27 19:58 UTC (permalink / raw)
  To: Mikulas Patocka
  Cc: device-mapper development, Srinivas Eeda, Martin K. Petersen, agk,
	Darrick J. Wong

On Fri, Feb 27 2015 at  2:09pm -0500,
Mikulas Patocka <mpatocka@redhat.com> wrote:

> 
> 
> On Fri, 27 Feb 2015, Darrick J. Wong wrote:
> 
> > Since it's apparently possible that the queue limits for discard and
> > write same can change while the upper level command is being sliced
> > and diced, fix up both of them (a) to reject IO if the special command
> > is unsupported at the start of the function and (b) read the limits
> > once and let the commands error out on their own if the status happens
> > to change.
> > 
> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> 
> > +	unsigned int special_cmd_max_sectors;
> > +
> > +	/* Reject unsupported discard and write same requests */
> > +	if (rw & REQ_DISCARD)
> > +		special_cmd_max_sectors = q->limits.max_discard_sectors;
> > +	else if (rw & REQ_WRITE_SAME)
> > +		special_cmd_max_sectors = q->limits.max_write_same_sectors;
> > +	if ((rw & (REQ_DISCARD | REQ_WRITE_SAME)) &&
> > +	    special_cmd_max_sectors == 0) {
> 
> That results in an uninitialized variable warning (although the warning is
> a false positive). We need the uninitialized_var macro to suppress the
> warning.
> 
> It's better to use ACCESS_ONCE on variables that may be changing so that 
> the compiler doesn't load them multiple times.

I dropped the use of ACCESS_ONCE.  We access queue_limits all over the
block-related code.  If the performance impact is quantifiable then all
accesses should be updated.  Until then, I'm maintaining the status quo.

Slightly tweaked commit is staged for 4.0 here:
https://git.kernel.org/cgit/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=dm-for-4.0&id=e5db29806b99ce2b2640d2e4d4fcb983cea115c5

* Re: [PATCH v2] dm-io: deal with wandering queue limits when handling REQ_DISCARD and REQ_WRITE_SAME
  2015-02-27 19:58   ` Mike Snitzer
@ 2015-02-27 22:39     ` Mikulas Patocka
  2015-02-27 22:55       ` Mike Snitzer
  0 siblings, 1 reply; 8+ messages in thread
From: Mikulas Patocka @ 2015-02-27 22:39 UTC (permalink / raw)
  To: Mike Snitzer
  Cc: device-mapper development, Srinivas Eeda, Martin K. Petersen, agk,
	Darrick J. Wong



On Fri, 27 Feb 2015, Mike Snitzer wrote:

> On Fri, Feb 27 2015 at  2:09pm -0500,
> Mikulas Patocka <mpatocka@redhat.com> wrote:
> 
> > 
> > 
> > On Fri, 27 Feb 2015, Darrick J. Wong wrote:
> > 
> > > Since it's apparently possible that the queue limits for discard and
> > > write same can change while the upper level command is being sliced
> > > and diced, fix up both of them (a) to reject IO if the special command
> > > is unsupported at the start of the function and (b) read the limits
> > > once and let the commands error out on their own if the status happens
> > > to change.
> > > 
> > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > 
> > > +	unsigned int special_cmd_max_sectors;
> > > +
> > > +	/* Reject unsupported discard and write same requests */
> > > +	if (rw & REQ_DISCARD)
> > > +		special_cmd_max_sectors = q->limits.max_discard_sectors;
> > > +	else if (rw & REQ_WRITE_SAME)
> > > +		special_cmd_max_sectors = q->limits.max_write_same_sectors;
> > > +	if ((rw & (REQ_DISCARD | REQ_WRITE_SAME)) &&
> > > +	    special_cmd_max_sectors == 0) {
> > 
> > That results in an uninitialized variable warning (although the warning is
> > a false positive). We need the uninitialized_var macro to suppress the
> > warning.
> > 
> > It's better to use ACCESS_ONCE on variables that may be changing so that 
> > the compiler doesn't load them multiple times.
> 
> I dropped the use of ACCESS_ONCE.  We access queue_limits all over the
> block-related code.  If the performance impact is quantifiable then all
> accesses should be updated.  Until then, I'm maintaining the status quo.

ACCESS_ONCE is not there because of performance. Without ACCESS_ONCE, the 
compiler may reload the variable multiple times and reintroduce the bug
that we are trying to fix.


See this piece of code.

special_cmd_max_sectors = q->limits.max_discard_sectors;
if (special_cmd_max_sectors == 0) {
	dec_count(io, region, -EOPNOTSUPP);
	return;
}
....
num_sectors = special_cmd_max_sectors;
remaining -= num_sectors;


At first sight, it seems that the variable num_sectors can't be zero. But 
in fact, it can. The compiler may eliminate the variable 
special_cmd_max_sectors and translate the code into this:

if (q->limits.max_discard_sectors == 0) {
	dec_count(io, region, -EOPNOTSUPP);
	return;
}
....
num_sectors = q->limits.max_discard_sectors;
remaining -= num_sectors;

- and now we have the same bug that we were trying to fix.

That's why we need ACCESS_ONCE - to prevent the compiler from doing this 
transformation.
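
To make the intended pattern concrete, here is a minimal userspace
sketch, assuming GCC or Clang. DEMO_READ_ONCE and demo_split_region()
are hypothetical names used only for illustration; the macro uses the
same volatile cast as ACCESS_ONCE. The limit is loaded once and checked
once, and the loop uses only the local copy, so a concurrent change to
the limit cannot produce a zero-length chunk.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace equivalent of ACCESS_ONCE (GCC/Clang only). */
#define DEMO_READ_ONCE(x) (*(volatile __typeof__(x) *)&(x))

/*
 * Split "remaining" sectors into chunks of at most *limit sectors.
 * Returns the number of chunks, or -1 if the limit is 0 (unsupported).
 * *limit may be changed by another thread at any time; it is read
 * exactly once, so every chunk is sized from the same snapshot.
 */
long demo_split_region(unsigned int *limit, uint64_t remaining)
{
	unsigned int max_sectors;
	long chunks = 0;

	assert(limit != NULL);

	max_sectors = DEMO_READ_ONCE(*limit);	/* the only load of *limit */
	if (max_sectors == 0)
		return -1;

	while (remaining) {
		/* max_sectors is a nonzero local here, so progress is guaranteed */
		uint64_t n = remaining < max_sectors ? remaining : max_sectors;

		remaining -= n;
		chunks++;
	}
	return chunks;
}
```

Without the volatile access, the compiler would be allowed to size each
chunk from a fresh read of *limit, which is the transformation
described above.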

It's true that the kernel is missing ACCESS_ONCE in many places where it
should be used. But the fact that there is a lot of broken code doesn't mean
that we should write broken code too.

Mikulas

* Re: [PATCH v2] dm-io: deal with wandering queue limits when handling REQ_DISCARD and REQ_WRITE_SAME
  2015-02-27 22:39     ` Mikulas Patocka
@ 2015-02-27 22:55       ` Mike Snitzer
  2015-03-02 16:06         ` Mikulas Patocka
  0 siblings, 1 reply; 8+ messages in thread
From: Mike Snitzer @ 2015-02-27 22:55 UTC (permalink / raw)
  To: Mikulas Patocka
  Cc: device-mapper development, Srinivas Eeda, Martin K. Petersen, agk,
	Darrick J. Wong

On Fri, Feb 27 2015 at  5:39pm -0500,
Mikulas Patocka <mpatocka@redhat.com> wrote:

> 
> 
> On Fri, 27 Feb 2015, Mike Snitzer wrote:
> 
> > On Fri, Feb 27 2015 at  2:09pm -0500,
> > Mikulas Patocka <mpatocka@redhat.com> wrote:
> > 
> > > 
> > > 
> > > On Fri, 27 Feb 2015, Darrick J. Wong wrote:
> > > 
> > > > Since it's apparently possible that the queue limits for discard and
> > > > write same can change while the upper level command is being sliced
> > > > and diced, fix up both of them (a) to reject IO if the special command
> > > > is unsupported at the start of the function and (b) read the limits
> > > > once and let the commands error out on their own if the status happens
> > > > to change.
> > > > 
> > > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > > 
> > > > +	unsigned int special_cmd_max_sectors;
> > > > +
> > > > +	/* Reject unsupported discard and write same requests */
> > > > +	if (rw & REQ_DISCARD)
> > > > +		special_cmd_max_sectors = q->limits.max_discard_sectors;
> > > > +	else if (rw & REQ_WRITE_SAME)
> > > > +		special_cmd_max_sectors = q->limits.max_write_same_sectors;
> > > > +	if ((rw & (REQ_DISCARD | REQ_WRITE_SAME)) &&
> > > > +	    special_cmd_max_sectors == 0) {
> > > 
> > > That results in an uninitialized variable warning (although the warning is
> > > a false positive). We need the uninitialized_var macro to suppress the
> > > warning.
> > > 
> > > It's better to use ACCESS_ONCE on variables that may be changing so that 
> > > the compiler doesn't load them multiple times.
> > 
> > I dropped the use of ACCESS_ONCE.  We access queue_limits all over the
> > block-related code.  If the performance impact is quantifiable then all
> > accesses should be updated.  Until then, I'm maintaining the status quo.
> 
> ACCESS_ONCE is not there because of performance.

Yes, clearly not, I've had "performance" on the brain lately (see
request-based DM merge threads) ;)

I understand this is about correctness concerns.

> Without ACCESS_ONCE, the 
> compiler may reload the variable multiple times and reintroduce the bug
> that we are trying to fix.
> 
> 
> See this piece of code.
> 
> special_cmd_max_sectors = q->limits.max_discard_sectors;
> if (special_cmd_max_sectors == 0) {
> 	dec_count(io, region, -EOPNOTSUPP);
> 	return;
> }
> ....
> num_sectors = special_cmd_max_sectors;
> remaining -= num_sectors;
> 
> 
> At first sight, it seems that the variable num_sectors can't be zero. But 
> in fact, it can. The compiler may eliminate the variable 
> special_cmd_max_sectors and translate the code into this:
> 
> if (q->limits.max_discard_sectors == 0) {
> 	dec_count(io, region, -EOPNOTSUPP);
> 	return;
> }
> ....
> num_sectors = q->limits.max_discard_sectors;
> remaining -= num_sectors;
> 
> - and now we have the same bug that we were trying to fix.
> 
> That's why we need ACCESS_ONCE - to prevent the compiler from doing this 
> transformation.

Do we have proof that a gcc from the last 5-10 years actually does crap
like this?  Again, if so and that compiler is likely to be in
production, this isn't a concern localized to dm-io.  It would be a
rampant problem in the kernel!

> It's true that the kernel is missing ACCESS_ONCE in many places where it
> should be used. But the fact that there is a lot of broken code doesn't mean
> that we should write broken code too.

I'm saying in practice this type of code isn't broken (and that gcc
isn't doing this.. but I have no _real_ proof other than all the other
places we access structures whose members may change).  ACCESS_ONCE() is
one of the biggest warts in all of the kernel code -- and you happen to
be very persistent about adding them.  I appreciate that it is safer to
have them than not but I'm at the point where I'm starting to question
certain uses of ACCESS_ONCE() -- this one just feels unnecessary.

* Re: [PATCH v2] dm-io: deal with wandering queue limits when handling REQ_DISCARD and REQ_WRITE_SAME
  2015-02-27 22:55       ` Mike Snitzer
@ 2015-03-02 16:06         ` Mikulas Patocka
  0 siblings, 0 replies; 8+ messages in thread
From: Mikulas Patocka @ 2015-03-02 16:06 UTC (permalink / raw)
  To: Mike Snitzer
  Cc: device-mapper development, Srinivas Eeda, Martin K. Petersen, agk,
	Darrick J. Wong



On Fri, 27 Feb 2015, Mike Snitzer wrote:

> On Fri, Feb 27 2015 at  5:39pm -0500,
> Mikulas Patocka <mpatocka@redhat.com> wrote:
> 
> > Without ACCESS_ONCE, the 
> > compiler may reload the variable multiple times and reintroduce the bug
> > that we are trying to fix.
> > 
> > 
> > See this piece of code.
> > 
> > special_cmd_max_sectors = q->limits.max_discard_sectors;
> > if (special_cmd_max_sectors == 0) {
> > 	dec_count(io, region, -EOPNOTSUPP);
> > 	return;
> > }
> > ....
> > num_sectors = special_cmd_max_sectors;
> > remaining -= num_sectors;
> > 
> > 
> > At first sight, it seems that the variable num_sectors can't be zero. But 
> > in fact, it can. The compiler may eliminate the variable 
> > special_cmd_max_sectors and translate the code into this:
> > 
> > if (q->limits.max_discard_sectors == 0) {
> > 	dec_count(io, region, -EOPNOTSUPP);
> > 	return;
> > }
> > ....
> > num_sectors = q->limits.max_discard_sectors;
> > remaining -= num_sectors;
> > 
> > - and now we have the same bug that we were trying to fix.
> > 
> > That's why we need ACCESS_ONCE - to prevent the compiler from doing this 
> > transformation.
> 
> Do we have proof that a gcc from the last 5-10 years actually does crap
> like this?  Again, if so and that compiler is likely to be in
> production, this isn't a concern localized to dm-io.  It would be a
> rampant problem in the kernel!
> 
> > It's true that the kernel is missing ACCESS_ONCE in many places where it
> > should be used. But the fact that there is a lot of broken code doesn't mean
> > that we should write broken code too.
> 
> I'm saying in practice this type of code isn't broken (and that gcc
> isn't doing this.. but I have no _real_ proof other than all the other
> places we access structures whose members may change).  ACCESS_ONCE() is
> one of the biggest warts in all of the kernel code -- and you happen to
> be very persistent about adding them.  I appreciate that it is safer to
> have them than not but I'm at the point where I'm starting to question
> certain uses of ACCESS_ONCE() -- this one just feels unnecessary.

Here is an example showing that gcc does load the same value multiple times:

struct s {
        unsigned a, b, c, d;
};

unsigned fn(struct s *s)
{
        unsigned a = s->a;
        s->b = a;
        asm("nop":::"ebx","ecx","edx","esi","edi","ebp");
        s->c = a;
        return s->d;
}

Compile it in 32-bit mode with -m32 -O2; with gcc 4.9.2 you get:

00000000 <fn>:
   0:   55                      push   %ebp
   1:   57                      push   %edi
   2:   56                      push   %esi
   3:   53                      push   %ebx
   4:   8b 44 24 14             mov    0x14(%esp),%eax
   8:   8b 10                   mov    (%eax),%edx	* 1st load of s->a
   a:   89 50 04                mov    %edx,0x4(%eax)
   d:   90                      nop
   e:   8b 08                   mov    (%eax),%ecx	* 2nd load of s->a
  10:   89 48 08                mov    %ecx,0x8(%eax)
  13:   8b 40 0c                mov    0xc(%eax),%eax
  16:   5b                      pop    %ebx
  17:   5e                      pop    %esi
  18:   5f                      pop    %edi
  19:   5d                      pop    %ebp
  1a:   c3                      ret

You see that the value s->a is loaded twice and if another thread modifies 
it, it is possible that s->b and s->c will be different.

The asm statement marks all registers except eax as used. The variable s 
is stored in eax and there is no free register to store the variable a. 
The compiler could decide to either spill the variable a to the stack or 
reload it from s->a - in this case it chooses reload - and the decision is 
right because spilling the variable would result in more instructions than 
reloading it.
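
For comparison, here is a minimal sketch of the same function with the
load of s->a forced through a volatile access, assuming GCC or Clang.
fn_once and DEMO_ACCESS_ONCE are hypothetical names. The
register-clobbering asm is replaced by an empty asm with a "memory"
clobber (a plain compiler barrier) so that the sketch builds on any
architecture; to reproduce the register pressure, substitute the
original asm statement on 32-bit x86. Because the access is volatile,
the compiler has to perform exactly one load of s->a and may not
re-read it later, so s->b and s->c always receive the same value.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace equivalent of ACCESS_ONCE (GCC/Clang only). */
#define DEMO_ACCESS_ONCE(x) (*(volatile __typeof__(x) *)&(x))

struct s {
	unsigned a, b, c, d;
};

unsigned fn_once(struct s *s)
{
	unsigned a;

	assert(s != NULL);

	a = DEMO_ACCESS_ONCE(s->a);		/* single load of s->a */
	s->b = a;
	__asm__ __volatile__("" ::: "memory");	/* compiler barrier only */
	s->c = a;				/* reuses the local copy */
	return s->d;
}
```

Under register pressure the compiler now has to spill the local to the
stack instead of reloading s->a.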

Mikulas
