From: Mike Snitzer <snitzer@redhat.com>
To: Nikos Tsironis <ntsironis@arrikto.com>
Cc: Eric Wheeler <dm-devel@lists.ewheeler.net>,
	dm-devel@redhat.com, thornber@redhat.com, agk@redhat.com
Subject: Re: [PATCH 0/2] dm thin: Flush data device before committing metadata to avoid data corruption
Date: Thu, 5 Dec 2019 10:42:21 -0500
Message-ID: <20191205154221.GA4792@redhat.com>
In-Reply-To: <a60f1571-fff1-8be8-5537-f604747b39c9@arrikto.com>

On Thu, Dec 05 2019 at 10:31am -0500,
Nikos Tsironis <ntsironis@arrikto.com> wrote:

> On 12/4/19 10:17 PM, Mike Snitzer wrote:
> >On Wed, Dec 04 2019 at  2:58pm -0500,
> >Eric Wheeler <dm-devel@lists.ewheeler.net> wrote:
> >
> >>On Wed, 4 Dec 2019, Nikos Tsironis wrote:
> >>
> >>>The thin provisioning target maintains per thin device mappings that map
> >>>virtual blocks to data blocks in the data device.
> >>>
> >>>When we write to a shared block, in case of internal snapshots, or
> >>>provision a new block, in case of external snapshots, we copy the shared
> >>>block to a new data block (COW), update the mapping for the relevant
> >>>virtual block and then issue the write to the new data block.
> >>>
> >>>Suppose the data device has a volatile write-back cache and the
> >>>following sequence of events occurs:
> >>
> >>For those with NV caches, can the data disk flush be optional (maybe as a
> >>table flag)?
> >
> >IIRC block core should avoid issuing the flush if not needed.  I'll have
> >a closer look to verify as much.
> >
> 
> For devices without a volatile write-back cache block core strips off
> the REQ_PREFLUSH and REQ_FUA bits from requests with a payload and
> completes empty REQ_PREFLUSH requests before entering the driver.
> 
> This happens in generic_make_request_checks():
> 
> 		/*
> 		 * Filter flush bio's early so that make_request based
> 		 * drivers without flush support don't have to worry
> 		 * about them.
> 		 */
> 		if (op_is_flush(bio->bi_opf) &&
> 		    !test_bit(QUEUE_FLAG_WC, &q->queue_flags)) {
> 		        bio->bi_opf &= ~(REQ_PREFLUSH | REQ_FUA);
> 		        if (!nr_sectors) {
> 		                status = BLK_STS_OK;
> 		                goto end_io;
> 		        }
> 		}
> 
> If I am not mistaken, it all depends on whether the underlying device
> reports the existence of a write back cache or not.

Yes, thanks for confirming my memory of the situation.
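
The filtering quoted above can be modelled in user space to make the
three outcomes explicit. This is only a sketch: the MODEL_* flag values,
struct model_bio and model_filter_flush() are hypothetical stand-ins,
not kernel definitions, and the queue's QUEUE_FLAG_WC state is reduced
to a plain boolean.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's REQ_PREFLUSH and REQ_FUA bits. */
#define MODEL_REQ_PREFLUSH (1u << 0)
#define MODEL_REQ_FUA      (1u << 1)

/* Simplified model of a bio: only the fields the filter looks at. */
struct model_bio {
	unsigned int opf;        /* operation flags */
	unsigned int nr_sectors; /* payload size; 0 means an empty flush */
};

enum model_result {
	MODEL_PASS_TO_DRIVER,   /* bio continues down to the driver */
	MODEL_COMPLETED_EARLY   /* bio is completed with success before the driver */
};

/*
 * Mirrors the quoted check: if the bio carries a flush/FUA flag and the
 * queue does not report a volatile write-back cache, strip both flags;
 * if nothing is left to write, complete the bio immediately.
 */
static enum model_result model_filter_flush(struct model_bio *bio,
					    bool queue_has_wc)
{
	bool is_flush;

	assert(bio != NULL);
	is_flush = bio->opf & (MODEL_REQ_PREFLUSH | MODEL_REQ_FUA);

	if (is_flush && !queue_has_wc) {
		bio->opf &= ~(MODEL_REQ_PREFLUSH | MODEL_REQ_FUA);
		if (bio->nr_sectors == 0)
			return MODEL_COMPLETED_EARLY;
	}
	return MODEL_PASS_TO_DRIVER;
}
```

In this model an empty flush on a queue without a write-back cache never
reaches the driver, a FUA write on such a queue reaches it as a plain
write, and a queue with a write-back cache sees the flags untouched.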

> You could check this by looking at /sys/block/<device>/queue/write_cache
> If it says "write back" then flushes will be issued.
> 
> In case the sysfs entry reports a "write back" cache for a device with a
> non-volatile write cache, I think you can change the kernel's view of
> the device by writing to this entry (you could also create a udev rule
> for this).
> 
> This way you can set the write cache as write through. This will
> eliminate the cache flushes issued by the kernel, without altering the
> device state (Documentation/block/queue-sysfs.rst).
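
For reference, the sysfs check described in the quoted text could be
done programmatically along these lines. This is a minimal sketch: the
enum and the helper names parse_write_cache() and read_write_cache() are
made up for illustration, and it assumes the attribute contains either
"write back" or "write through" as the quoted text describes.

```c
#include <stdio.h>
#include <string.h>

enum wc_mode { WC_UNKNOWN, WC_WRITE_BACK, WC_WRITE_THROUGH };

/* Classify the contents of a queue/write_cache attribute. */
static enum wc_mode parse_write_cache(const char *text)
{
	if (strncmp(text, "write back", 10) == 0)
		return WC_WRITE_BACK;
	if (strncmp(text, "write through", 13) == 0)
		return WC_WRITE_THROUGH;
	return WC_UNKNOWN;
}

/*
 * Read /sys/block/<dev>/queue/write_cache for the given device name
 * (e.g. "sda"). Returns WC_UNKNOWN if the attribute cannot be read.
 */
static enum wc_mode read_write_cache(const char *dev)
{
	char path[256];
	char buf[32] = "";
	FILE *f;

	snprintf(path, sizeof(path), "/sys/block/%s/queue/write_cache", dev);
	f = fopen(path, "r");
	if (!f)
		return WC_UNKNOWN;
	if (!fgets(buf, sizeof(buf), f))
		buf[0] = '\0';
	fclose(f);
	return parse_write_cache(buf);
}
```

A result of WC_WRITE_BACK corresponds to the case where flushes will be
issued to the device; WC_WRITE_THROUGH corresponds to the case where
block core filters them out.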

I have not delved into this aspect of Linux's capabilities, but it
strikes me as "dangerous" to twiddle device capabilities like this.
Best to fix the driver to properly expose the cache (or not, as the
case may be).  It should also be noted that with DM, the capabilities
are stacked up at device creation time.  So any changes to the
underlying devices will _not_ be reflected in the high-level DM device.

Mike

Thread overview: 18+ messages
2019-12-04 14:07 [PATCH 0/2] dm thin: Flush data device before committing metadata to avoid data corruption Nikos Tsironis
2019-12-04 14:07 ` [PATCH 1/2] dm thin metadata: Add support for a pre-commit callback Nikos Tsironis
2019-12-05 19:40   ` Mike Snitzer
2019-12-05 21:33     ` Nikos Tsironis
2019-12-04 14:07 ` [PATCH 2/2] dm thin: Flush data device before committing metadata Nikos Tsironis
2019-12-04 15:27   ` Joe Thornber
2019-12-04 16:17     ` Nikos Tsironis
2019-12-04 16:39       ` Mike Snitzer
2019-12-04 16:47         ` Nikos Tsironis
2019-12-04 19:58 ` [PATCH 0/2] dm thin: Flush data device before committing metadata to avoid data corruption Eric Wheeler
2019-12-04 20:17   ` Mike Snitzer
2019-12-05 15:31     ` Nikos Tsironis
2019-12-05 15:42       ` Mike Snitzer [this message]
2019-12-05 16:02         ` Nikos Tsironis
2019-12-05 22:34       ` Eric Wheeler
2019-12-06 15:14         ` Nikos Tsironis
2019-12-06 20:06           ` Eric Wheeler
2019-12-09 14:25             ` Nikos Tsironis
