public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [block] allow blk_flush_policy to return REQ_FSEQ_DATA independent of *FLUSH
@ 2011-08-09 15:24 Jeff Moyer
  2011-08-09 18:29 ` Jens Axboe
  0 siblings, 1 reply; 6+ messages in thread
From: Jeff Moyer @ 2011-08-09 15:24 UTC (permalink / raw)
  To: Tejun Heo; +Cc: linux-kernel, Jens Axboe

Hi,

blk_insert_flush has the following check:

	/*
	 * If there's data but flush is not necessary, the request can be
	 * processed directly without going through flush machinery.  Queue
	 * for normal execution.
	 */
	if ((policy & REQ_FSEQ_DATA) &&
	    !(policy & (REQ_FSEQ_PREFLUSH | REQ_FSEQ_POSTFLUSH))) {
		list_add_tail(&rq->queuelist, &q->queue_head);
		return;
	}

However, blk_flush_policy will not return with policy set to only
REQ_FSEQ_DATA:

static unsigned int blk_flush_policy(unsigned int fflags, struct request *rq)
{
	unsigned int policy = 0;

	if (fflags & REQ_FLUSH) {
		if (rq->cmd_flags & REQ_FLUSH)
			policy |= REQ_FSEQ_PREFLUSH;
		if (blk_rq_sectors(rq))
			policy |= REQ_FSEQ_DATA;
		if (!(fflags & REQ_FUA) && (rq->cmd_flags & REQ_FUA))
			policy |= REQ_FSEQ_POSTFLUSH;
	}
	return policy;
}

Notice that REQ_FSEQ_DATA is only set if REQ_FLUSH is set.  Fix this
mismatch by moving the setting of REQ_FSEQ_DATA outside of the REQ_FLUSH
check.

Tejun notes:

  Hmmm... yes, this can become a correctness issue if (and only if)
  blk_queue_flush() is called to change q->flush_flags while requests
  are in-flight; otherwise, requests wouldn't reach the function at all.
  Also, I think it would be a generally good idea to always set
  FSEQ_DATA if the request has data.

Cheers,
Jeff

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
Acked-by: Tejun Heo <tj@kernel.org>

diff --git a/block/blk-flush.c b/block/blk-flush.c
index bb21e4c..2d162bd 100644
--- a/block/blk-flush.c
+++ b/block/blk-flush.c
@@ -95,11 +95,12 @@ static unsigned int blk_flush_policy(unsigned int fflags, struct request *rq)
 {
 	unsigned int policy = 0;
 
+	if (blk_rq_sectors(rq))
+		policy |= REQ_FSEQ_DATA;
+
 	if (fflags & REQ_FLUSH) {
 		if (rq->cmd_flags & REQ_FLUSH)
 			policy |= REQ_FSEQ_PREFLUSH;
-		if (blk_rq_sectors(rq))
-			policy |= REQ_FSEQ_DATA;
 		if (!(fflags & REQ_FUA) && (rq->cmd_flags & REQ_FUA))
 			policy |= REQ_FSEQ_POSTFLUSH;
 	}

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [block] allow blk_flush_policy to return REQ_FSEQ_DATA independent of *FLUSH
  2011-08-09 15:24 [block] allow blk_flush_policy to return REQ_FSEQ_DATA independent of *FLUSH Jeff Moyer
@ 2011-08-09 18:29 ` Jens Axboe
  0 siblings, 0 replies; 6+ messages in thread
From: Jens Axboe @ 2011-08-09 18:29 UTC (permalink / raw)
  To: Jeff Moyer; +Cc: Tejun Heo, linux-kernel@vger.kernel.org

On 2011-08-09 17:24, Jeff Moyer wrote:
> Hi,
> 
> blk_insert_flush has the following check:
> 
> 	/*
> 	 * If there's data but flush is not necessary, the request can be
> 	 * processed directly without going through flush machinery.  Queue
> 	 * for normal execution.
> 	 */
> 	if ((policy & REQ_FSEQ_DATA) &&
> 	    !(policy & (REQ_FSEQ_PREFLUSH | REQ_FSEQ_POSTFLUSH))) {
> 		list_add_tail(&rq->queuelist, &q->queue_head);
> 		return;
> 	}
> 
> However, blk_flush_policy will not return with policy set to only
> REQ_FSEQ_DATA:
> 
> static unsigned int blk_flush_policy(unsigned int fflags, struct request *rq)
> {
> 	unsigned int policy = 0;
> 
> 	if (fflags & REQ_FLUSH) {
> 		if (rq->cmd_flags & REQ_FLUSH)
> 			policy |= REQ_FSEQ_PREFLUSH;
> 		if (blk_rq_sectors(rq))
> 			policy |= REQ_FSEQ_DATA;
> 		if (!(fflags & REQ_FUA) && (rq->cmd_flags & REQ_FUA))
> 			policy |= REQ_FSEQ_POSTFLUSH;
> 	}
> 	return policy;
> }
> 
> Notice that REQ_FSEQ_DATA is only set if REQ_FLUSH is set.  Fix this
> mismatch by moving the setting of REQ_FSEQ_DATA outside of the REQ_FLUSH
> check.

Thanks Jeff, applied.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [block] allow blk_flush_policy to return REQ_FSEQ_DATA independent of *FLUSH
       [not found] <fa.WShHQz59YhGkw1sSCIqXCXLEJ3Q@ifi.uio.no>
@ 2013-01-08 18:04 ` ajithb.kumar
  2013-01-08 18:45   ` Tejun Heo
  0 siblings, 1 reply; 6+ messages in thread
From: ajithb.kumar @ 2013-01-08 18:04 UTC (permalink / raw)
  To: fa.linux.kernel; +Cc: Tejun Heo, linux-kernel, Jens Axboe

Hi,
Could you please provide clarity on the following.
">   Hmmm... yes, this can become a correctness issue if (and only if)
>   blk_queue_flush() is called to change q->flush_flags while requests
>   are in-flight;"
Could you please clarify as to why is it a correctness issue only if blk_queue_flush() is used to change flush_flags when requests are in flight ?  As I understand, XFS does set WRITE_FLUSH_FUA flag in _xfs_buf_ioapply() function irrespective of whether the underlying device supports flush capabilities or not which will flow into blk_insert_flush().  Is my reading of the code correct and is there a general correctness issue here which potentially results in XFS file system corruption in case of an abrupt shutdown independent of q->flush_flags getting changed while request is in flight.

Thanks,
Ajith

On Tuesday, 9 August 2011 20:54:35 UTC+5:30, Jeff Moyer  wrote:
> Hi,
> 
> blk_insert_flush has the following check:
> 
> 	/*
> 	 * If there's data but flush is not necessary, the request can be
> 	 * processed directly without going through flush machinery.  Queue
> 	 * for normal execution.
> 	 */
> 	if ((policy & REQ_FSEQ_DATA) &&
> 	    !(policy & (REQ_FSEQ_PREFLUSH | REQ_FSEQ_POSTFLUSH))) {
> 		list_add_tail(&rq->queuelist, &q->queue_head);
> 		return;
> 	}
> 
> However, blk_flush_policy will not return with policy set to only
> REQ_FSEQ_DATA:
> 
> static unsigned int blk_flush_policy(unsigned int fflags, struct request *rq)
> {
> 	unsigned int policy = 0;
> 
> 	if (fflags & REQ_FLUSH) {
> 		if (rq->cmd_flags & REQ_FLUSH)
> 			policy |= REQ_FSEQ_PREFLUSH;
> 		if (blk_rq_sectors(rq))
> 			policy |= REQ_FSEQ_DATA;
> 		if (!(fflags & REQ_FUA) && (rq->cmd_flags & REQ_FUA))
> 			policy |= REQ_FSEQ_POSTFLUSH;
> 	}
> 	return policy;
> }
> 
> Notice that REQ_FSEQ_DATA is only set if REQ_FLUSH is set.  Fix this
> mismatch by moving the setting of REQ_FSEQ_DATA outside of the REQ_FLUSH
> check.
> 
> Tejun notes:
> 
>   Hmmm... yes, this can become a correctness issue if (and only if)
>   blk_queue_flush() is called to change q->flush_flags while requests
>   are in-flight; otherwise, requests wouldn't reach the function at all.
>   Also, I think it would be a generally good idea to always set
>   FSEQ_DATA if the request has data.
> 
> Cheers,
> Jeff
> 
> Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
> Acked-by: Tejun Heo <tj@kernel.org>
> 
> diff --git a/block/blk-flush.c b/block/blk-flush.c
> index bb21e4c..2d162bd 100644
> --- a/block/blk-flush.c
> +++ b/block/blk-flush.c
> @@ -95,11 +95,12 @@ static unsigned int blk_flush_policy(unsigned int fflags, struct request *rq)
>  {
>  	unsigned int policy = 0;
>  
> +	if (blk_rq_sectors(rq))
> +		policy |= REQ_FSEQ_DATA;
> +
>  	if (fflags & REQ_FLUSH) {
>  		if (rq->cmd_flags & REQ_FLUSH)
>  			policy |= REQ_FSEQ_PREFLUSH;
> -		if (blk_rq_sectors(rq))
> -			policy |= REQ_FSEQ_DATA;
>  		if (!(fflags & REQ_FUA) && (rq->cmd_flags & REQ_FUA))
>  			policy |= REQ_FSEQ_POSTFLUSH;
>  	}
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [block] allow blk_flush_policy to return REQ_FSEQ_DATA independent of *FLUSH
  2013-01-08 18:04 ` ajithb.kumar
@ 2013-01-08 18:45   ` Tejun Heo
  0 siblings, 0 replies; 6+ messages in thread
From: Tejun Heo @ 2013-01-08 18:45 UTC (permalink / raw)
  To: ajithb.kumar; +Cc: fa.linux.kernel, linux-kernel, Jens Axboe

Hello,

On Tue, Jan 08, 2013 at 10:04:23AM -0800, ajithb.kumar@gmail.com wrote:
> Hi,
> Could you please provide clarity on the following.
> ">   Hmmm... yes, this can become a correctness issue if (and only if)
> >   blk_queue_flush() is called to change q->flush_flags while requests
> >   are in-flight;"
>
> Could you please clarify as to why is it a correctness issue only if
> blk_queue_flush() is used to change flush_flags when requests are in
> flight ?  As I understand, XFS does set WRITE_FLUSH_FUA flag in
> _xfs_buf_ioapply() function irrespective of whether the underlying
> device supports flush capabilities or not which will flow into
> blk_insert_flush().  Is my reading of the code correct and is there
> a general correctness issue here which potentially results in XFS
> file system corruption in case of an abrupt shutdown independent of
> q->flush_flags getting changed while request is in flight.

My memory is kinda fuzzy at this point but if a queue doesn't support
flush, its flush_flags should be zero and
generic_make_request_checks() will clear REQ_FLUSH|REQ_FUA from
bio->bi_rw so we never hit blk_insert_flush() and the request will be
processed as a normal IO one; however, if REQ_FLUSH goes off after a
request passed generic_make_request_checks() but before
blk_flush_policy(), it'll become null op and its data payload won't
get written out to the underlying device, which is data corruption.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [block] allow blk_flush_policy to return REQ_FSEQ_DATA independent of *FLUSH
       [not found]   ` <fa.h/6oWR6kmOBuicuev7/0gdVAyU4@ifi.uio.no>
@ 2013-01-09  4:14     ` Ajith Kumar
  2013-01-13  7:01       ` Ajith Kumar
  0 siblings, 1 reply; 6+ messages in thread
From: Ajith Kumar @ 2013-01-09  4:14 UTC (permalink / raw)
  To: fa.linux.kernel; +Cc: ajithb.kumar, linux-kernel, Jens Axboe

Hello,
Thanks for the response.
A block device driver during initialization would decide if it is capable of supporting FLUSH/FUA or not.  Suppose driver claims FLUSH/FUA support then any bio targeted at this driver with FLUSH bit set(which is commonly set by file system like XFS for doing internal logging) has a data corruption vulnerability in case of an abrupt shutdown.  So, IMO the vulnerability is not open to rare window where driver changes q->flush_flags while IO is in flight, but for a much larger window from time driver comes up and throughout it's life.

Thanks,
Ajith

On Wednesday, 9 January 2013 00:15:31 UTC+5:30, Tejun Heo  wrote:
> Hello,
> 
> 
> 
> On Tue, Jan 08, 2013 at 10:04:23AM -0800, ajithb.kumar@gmail.com wrote:
> 
> > Hi,
> 
> > Could you please provide clarity on the following.
> 
> > ">   Hmmm... yes, this can become a correctness issue if (and only if)
> 
> > >   blk_queue_flush() is called to change q->flush_flags while requests
> 
> > >   are in-flight;"
> 
> >
> 
> > Could you please clarify as to why is it a correctness issue only if
> 
> > blk_queue_flush() is used to change flush_flags when requests are in
> 
> > flight ?  As I understand, XFS does set WRITE_FLUSH_FUA flag in
> 
> > _xfs_buf_ioapply() function irrespective of whether the underlying
> 
> > device supports flush capabilities or not which will flow into
> 
> > blk_insert_flush().  Is my reading of the code correct and is there
> 
> > a general correctness issue here which potentially results in XFS
> 
> > file system corruption in case of an abrupt shutdown independent of
> 
> > q->flush_flags getting changed while request is in flight.
> 
> 
> 
> My memory is kinda fuzzy at this point but if a queue doesn't support
> 
> flush, its flush_flags should be zero and
> 
> generic_make_request_checks() will clear REQ_FLUSH|REQ_FUA from
> 
> bio->bi_rw so we never hit blk_insert_flush() and the request will be
> 
> processed as a normal IO one; however, if REQ_FLUSH goes off after a
> 
> request passed generic_make_request_checks() but before
> 
> blk_flush_policy(), it'll become null op and its data payload won't
> 
> get written out to the underlying device, which is data corruption.
> 
> 
> 
> Thanks.
> 
> 
> 
> -- 
> 
> tejun
> 
> --
> 
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> 
> the body of a message to majordomo@vger.kernel.org
> 
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> Please read the FAQ at  http://www.tux.org/lkml/


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [block] allow blk_flush_policy to return REQ_FSEQ_DATA independent of *FLUSH
  2013-01-09  4:14     ` Ajith Kumar
@ 2013-01-13  7:01       ` Ajith Kumar
  0 siblings, 0 replies; 6+ messages in thread
From: Ajith Kumar @ 2013-01-13  7:01 UTC (permalink / raw)
  To: fa.linux.kernel; +Cc: ajithb.kumar, linux-kernel, Jens Axboe

Hello,
Re-reading the code, I see that I was wrong with the previous comment of mine.  So, in summary

i) If block driver does not support FLUSH and FUA, then __generic_make_request() checks will clear BIO FLUSH and FUA flags and hence blk_insert_flush() will not be invoked.

ii) If block driver clears FLUSH & FUA flags while IO is in flight, then there is possibility of IO missing __generic_make_request() checks and hitting issue with blk_insert_flush() being discussed here.

iii) If block driver sets only REQ_FUA without REQ_FLUSH then __generic_make_request() checks will not clear BIO flags and hence blk_insert_flush() will be invoked which will hit the blk_insert_flush() issue being discussed here.  However, setting REQ_FUA without REQ_FLUSH is not as per documentation and it is invalid for block driver to do so.

Thanks,
Ajith

On Wednesday, 9 January 2013 09:44:55 UTC+5:30, Ajith Kumar  wrote:
> Hello,
> 
> Thanks for the response.
> 
> A block device driver during initialization would decide if it is capable of supporting FLUSH/FUA or not.  Suppose driver claims FLUSH/FUA support then any bio targeted at this driver with FLUSH bit set(which is commonly set by file system like XFS for doing internal logging) has a data corruption vulnerability in case of an abrupt shutdown.  So, IMO the vulnerability is not open to rare window where driver changes q->flush_flags while IO is in flight, but for a much larger window from time driver comes up and throughout it's life.
> 
> 
> 
> Thanks,
> 
> Ajith
> 
> 
> 
> On Wednesday, 9 January 2013 00:15:31 UTC+5:30, Tejun Heo  wrote:
> 
> > Hello,
> 
> > 
> 
> > 
> 
> > 
> 
> > On Tue, Jan 08, 2013 at 10:04:23AM -0800, ajithb.kumar@gmail.com wrote:
> 
> > 
> 
> > > Hi,
> 
> > 
> 
> > > Could you please provide clarity on the following.
> 
> > 
> 
> > > ">   Hmmm... yes, this can become a correctness issue if (and only if)
> 
> > 
> 
> > > >   blk_queue_flush() is called to change q->flush_flags while requests
> 
> > 
> 
> > > >   are in-flight;"
> 
> > 
> 
> > >
> 
> > 
> 
> > > Could you please clarify as to why is it a correctness issue only if
> 
> > 
> 
> > > blk_queue_flush() is used to change flush_flags when requests are in
> 
> > 
> 
> > > flight ?  As I understand, XFS does set WRITE_FLUSH_FUA flag in
> 
> > 
> 
> > > _xfs_buf_ioapply() function irrespective of whether the underlying
> 
> > 
> 
> > > device supports flush capabilities or not which will flow into
> 
> > 
> 
> > > blk_insert_flush().  Is my reading of the code correct and is there
> 
> > 
> 
> > > a general correctness issue here which potentially results in XFS
> 
> > 
> 
> > > file system corruption in case of an abrupt shutdown independent of
> 
> > 
> 
> > > q->flush_flags getting changed while request is in flight.
> 
> > 
> 
> > 
> 
> > 
> 
> > My memory is kinda fuzzy at this point but if a queue doesn't support
> 
> > 
> 
> > flush, its flush_flags should be zero and
> 
> > 
> 
> > generic_make_request_checks() will clear REQ_FLUSH|REQ_FUA from
> 
> > 
> 
> > bio->bi_rw so we never hit blk_insert_flush() and the request will be
> 
> > 
> 
> > processed as a normal IO one; however, if REQ_FLUSH goes off after a
> 
> > 
> 
> > request passed generic_make_request_checks() but before
> 
> > 
> 
> > blk_flush_policy(), it'll become null op and its data payload won't
> 
> > 
> 
> > get written out to the underlying device, which is data corruption.
> 
> > 
> 
> > 
> 
> > 
> 
> > Thanks.
> 
> > 
> 
> > 
> 
> > 
> 
> > -- 
> 
> > 
> 
> > tejun
> 
> > 
> 
> > --
> 
> > 
> 
> > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> 
> > 
> 
> > the body of a message to majordomo@vger.kernel.org
> 
> > 
> 
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> > 
> 
> > Please read the FAQ at  http://www.tux.org/lkml/


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2013-01-13  7:49 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-08-09 15:24 [block] allow blk_flush_policy to return REQ_FSEQ_DATA independent of *FLUSH Jeff Moyer
2011-08-09 18:29 ` Jens Axboe
     [not found] <fa.WShHQz59YhGkw1sSCIqXCXLEJ3Q@ifi.uio.no>
2013-01-08 18:04 ` ajithb.kumar
2013-01-08 18:45   ` Tejun Heo
     [not found] <fa.Y0FHvNmq3PUOPFTOjxSNqjMULHc@ifi.uio.no>
     [not found] ` <fa.brRV/ZItJPfVkucfEAnS6B1Vfow@ifi.uio.no>
     [not found]   ` <fa.h/6oWR6kmOBuicuev7/0gdVAyU4@ifi.uio.no>
2013-01-09  4:14     ` Ajith Kumar
2013-01-13  7:01       ` Ajith Kumar

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox