linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: jane.chu@oracle.com
To: Vivek Goyal <vgoyal@redhat.com>,
	Dan Williams <dan.j.williams@intel.com>,
	vishal.l.verma@intel.com, Christoph Hellwig <hch@infradead.org>
Cc: linux-nvdimm@lists.01.org, linux-fsdevel@vger.kernel.org
Subject: Re: [RFC][PATCH] dax: Do not try to clear poison for partial pages
Date: Wed, 5 Feb 2020 12:26:52 -0800	[thread overview]
Message-ID: <f97d1ce2-9003-6b46-cd25-a908dc3bd2c6@oracle.com> (raw)
In-Reply-To: <20200129210337.GA13630@redhat.com>

Hello,

I haven't seen response to this proposal, unsure if there is a different
but related discussion ongoing...

I'd like to express my wish: please make it easier for the pmem 
applications when possible.

If kernel does not clear poison when it could legitimately do so,
applications have to go through lengths to clear poisons.
For Cloud pmem applications that have upper bound on error recovery
time, not clearing poison while zeroing-out is quite undesirable.

Thanks!
-jane

On 1/29/20 1:03 PM, Vivek Goyal wrote:
> I am looking into getting rid of dependency on block device in dax
> path. One such place is __dax_zero_page_range() which checks if
> range being zeroed is aligned to block device logical size, then
> it calls bdev_issue_zeroout() instead of doing memset(). Calling
> blkdev_issue_zeroout() also clears bad blocks and poison if any
> in that range.
> 
> This path is used by iomap_zero_range() which in-turn is being
> used by filesystems to zero partial filesystem system blocks.
> For zeroing full filesystem blocks we seem to be calling
> blkdev_issue_zeroout() which clears bad blocks and poison in that
> range.
> 
> So this code currently only seems to help with partial filesystem
> block zeroing. That too only if truncation/hole_punch happens on
> device logical block size boundary.
> 
> To avoid using blkdev_issue_zeroout() in this path, I proposed another
> patch here which adds another dax operation to zero out a rangex and
> clear poison.
> 
> https://lore.kernel.org/linux-fsdevel/20200123165249.GA7664@redhat.com/
> 
> Thinking more about it, it might be an overkill to solve this problem.
> 
> How about if we just not clear poison/bad blocks when a partial page
> is being zeroed. IOW, users will need to hole_punch whole filesystem
> block worth of data which will free that block and it will be zeroed
> some other time and clear poison in the process. For partial fs block
> truncation/punch_hole, we don't clear poison.
> 
> If above interface is acceptable, then we can get rid of code which
> tries to call blkdev_issue_zeroout() in iomap_zero_range() path and
> we don't have to implement another dax operation.
> 
> Looking for some feedback on this.
> 
> Vivek
>   
> Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
> ---
>   fs/dax.c |   50 +++++++++++++++-----------------------------------
>   1 file changed, 15 insertions(+), 35 deletions(-)
> 
> Index: redhat-linux/fs/dax.c
> ===================================================================
> --- redhat-linux.orig/fs/dax.c	2020-01-29 15:19:18.551902448 -0500
> +++ redhat-linux/fs/dax.c	2020-01-29 15:40:56.584824549 -0500
> @@ -1044,47 +1044,27 @@ static vm_fault_t dax_load_hole(struct x
>   	return ret;
>   }
>   
> -static bool dax_range_is_aligned(struct block_device *bdev,
> -				 unsigned int offset, unsigned int length)
> -{
> -	unsigned short sector_size = bdev_logical_block_size(bdev);
> -
> -	if (!IS_ALIGNED(offset, sector_size))
> -		return false;
> -	if (!IS_ALIGNED(length, sector_size))
> -		return false;
> -
> -	return true;
> -}
> -
>   int __dax_zero_page_range(struct block_device *bdev,
>   		struct dax_device *dax_dev, sector_t sector,
>   		unsigned int offset, unsigned int size)
>   {
> -	if (dax_range_is_aligned(bdev, offset, size)) {
> -		sector_t start_sector = sector + (offset >> 9);
> -
> -		return blkdev_issue_zeroout(bdev, start_sector,
> -				size >> 9, GFP_NOFS, 0);
> -	} else {
> -		pgoff_t pgoff;
> -		long rc, id;
> -		void *kaddr;
> -
> -		rc = bdev_dax_pgoff(bdev, sector, PAGE_SIZE, &pgoff);
> -		if (rc)
> -			return rc;
> -
> -		id = dax_read_lock();
> -		rc = dax_direct_access(dax_dev, pgoff, 1, &kaddr, NULL);
> -		if (rc < 0) {
> -			dax_read_unlock(id);
> -			return rc;
> -		}
> -		memset(kaddr + offset, 0, size);
> -		dax_flush(dax_dev, kaddr + offset, size);
> +	pgoff_t pgoff;
> +	long rc, id;
> +	void *kaddr;
> +
> +	rc = bdev_dax_pgoff(bdev, sector, PAGE_SIZE, &pgoff);
> +	if (rc)
> +		return rc;
> +
> +	id = dax_read_lock();
> +	rc = dax_direct_access(dax_dev, pgoff, 1, &kaddr, NULL);
> +	if (rc < 0) {
>   		dax_read_unlock(id);
> +		return rc;
>   	}
> +	memset(kaddr + offset, 0, size);
> +	dax_flush(dax_dev, kaddr + offset, size);
> +	dax_read_unlock(id);
>   	return 0;
>   }
>   EXPORT_SYMBOL_GPL(__dax_zero_page_range);
> _______________________________________________
> Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
> To unsubscribe send an email to linux-nvdimm-leave@lists.01.org
> 

  parent reply	other threads:[~2020-02-05 20:27 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-01-29 21:03 [RFC][PATCH] dax: Do not try to clear poison for partial pages Vivek Goyal
2020-01-31  5:42 ` Christoph Hellwig
2020-02-05 20:26 ` jane.chu [this message]
2020-02-06  0:37   ` Dan Williams
2020-02-18 19:50     ` Jeff Moyer
2020-02-18 20:45       ` jane.chu
2020-02-18 22:50         ` jane.chu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=f97d1ce2-9003-6b46-cd25-a908dc3bd2c6@oracle.com \
    --to=jane.chu@oracle.com \
    --cc=dan.j.williams@intel.com \
    --cc=hch@infradead.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-nvdimm@lists.01.org \
    --cc=vgoyal@redhat.com \
    --cc=vishal.l.verma@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).