linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* add and use a per-mapping stable writes flag
@ 2023-10-24  6:44 Christoph Hellwig
  2023-10-24  6:44 ` [PATCH 1/3] filemap: add " Christoph Hellwig
                   ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: Christoph Hellwig @ 2023-10-24  6:44 UTC (permalink / raw)
  To: Jens Axboe, Matthew Wilcox
  Cc: Ilya Dryomov, Andrew Morton, linux-block, linux-fsdevel,
	linux-xfs, linux-mm

Hi all

A while ago Ilya pointer out that since commit 1cb039f3dc16 ("bdi:
replace BDI_CAP_STABLE_WRITES with a queue and a sb flag"), the stable
write flag on the queue wasn't used for writes to the block devices
nodes any more, and willy suggested fixing this by adding a stable write
flags on each address_space.  This series implements this fix, and also
fixes the stable write flag when the XFS RT device requires it, but the
main device doesn't (which is probably more a theoretical than a
practical problem).

Diffstat:
 block/bdev.c            |    2 ++
 fs/inode.c              |    2 ++
 fs/xfs/xfs_inode.h      |    8 ++++++++
 fs/xfs/xfs_ioctl.c      |    9 +++++++++
 fs/xfs/xfs_iops.c       |    7 +++++++
 include/linux/pagemap.h |   17 +++++++++++++++++
 mm/page-writeback.c     |    2 +-
 7 files changed, 46 insertions(+), 1 deletion(-)

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 1/3] filemap: add a per-mapping stable writes flag
  2023-10-24  6:44 add and use a per-mapping stable writes flag Christoph Hellwig
@ 2023-10-24  6:44 ` Christoph Hellwig
  2023-10-24 11:43   ` Matthew Wilcox
                     ` (2 more replies)
  2023-10-24  6:44 ` [PATCH 2/3] block: update the stable_writes flag in bdev_add Christoph Hellwig
  2023-10-24  6:44 ` [PATCH 3/3] xfs: respect the stable writes flag on the RT device Christoph Hellwig
  2 siblings, 3 replies; 16+ messages in thread
From: Christoph Hellwig @ 2023-10-24  6:44 UTC (permalink / raw)
  To: Jens Axboe, Matthew Wilcox
  Cc: Ilya Dryomov, Andrew Morton, linux-block, linux-fsdevel,
	linux-xfs, linux-mm

folio_wait_stable waits for writeback to finish before modifying the
contents of a folio again, e.g. to support check summing of the data
in the block integrity code.

Currently this behavior is controlled by the SB_I_STABLE_WRITES flag
on the super_block, which means it is uniform for the entire file system.
This is wrong for the block device pseudofs which is shared by all
block devices, or file systems that can use multiple devices like XFS
witht the RT subvolume or btrfs (although btrfs currently reimplements
folio_wait_stable anyway).

Add a per-address_space AS_STABLE_WRITES flag to control the behavior
in a more fine grained way.  The existing SB_I_STABLE_WRITES is kept
to initialize AS_STABLE_WRITES to the existing default which covers
most cases.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/inode.c              |  2 ++
 include/linux/pagemap.h | 17 +++++++++++++++++
 mm/page-writeback.c     |  2 +-
 3 files changed, 20 insertions(+), 1 deletion(-)

diff --git a/fs/inode.c b/fs/inode.c
index 84bc3c76e5ccb5..ae1a6410b53d7e 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -215,6 +215,8 @@ int inode_init_always(struct super_block *sb, struct inode *inode)
 	lockdep_set_class_and_name(&mapping->invalidate_lock,
 				   &sb->s_type->invalidate_lock_key,
 				   "mapping.invalidate_lock");
+	if (sb->s_iflags & SB_I_STABLE_WRITES)
+		mapping_set_stable_writes(mapping);
 	inode->i_private = NULL;
 	inode->i_mapping = mapping;
 	INIT_HLIST_HEAD(&inode->i_dentry);	/* buggered by rcu freeing */
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 351c3b7f93a14e..8c9608b217b000 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -204,6 +204,8 @@ enum mapping_flags {
 	AS_NO_WRITEBACK_TAGS = 5,
 	AS_LARGE_FOLIO_SUPPORT = 6,
 	AS_RELEASE_ALWAYS,	/* Call ->release_folio(), even if no private data */
+	AS_STABLE_WRITES,	/* must wait for writeback before modifying
+				   folio contents */
 };
 
 /**
@@ -289,6 +291,21 @@ static inline void mapping_clear_release_always(struct address_space *mapping)
 	clear_bit(AS_RELEASE_ALWAYS, &mapping->flags);
 }
 
+static inline bool mapping_stable_writes(const struct address_space *mapping)
+{
+	return test_bit(AS_STABLE_WRITES, &mapping->flags);
+}
+
+static inline void mapping_set_stable_writes(struct address_space *mapping)
+{
+	set_bit(AS_STABLE_WRITES, &mapping->flags);
+}
+
+static inline void mapping_clear_stable_writes(struct address_space *mapping)
+{
+	clear_bit(AS_STABLE_WRITES, &mapping->flags);
+}
+
 static inline gfp_t mapping_gfp_mask(struct address_space * mapping)
 {
 	return mapping->gfp_mask;
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index b8d3d7040a506a..4656534b8f5cc6 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -3110,7 +3110,7 @@ EXPORT_SYMBOL_GPL(folio_wait_writeback_killable);
  */
 void folio_wait_stable(struct folio *folio)
 {
-	if (folio_inode(folio)->i_sb->s_iflags & SB_I_STABLE_WRITES)
+	if (mapping_stable_writes(folio_mapping(folio)))
 		folio_wait_writeback(folio);
 }
 EXPORT_SYMBOL_GPL(folio_wait_stable);
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 2/3] block: update the stable_writes flag in bdev_add
  2023-10-24  6:44 add and use a per-mapping stable writes flag Christoph Hellwig
  2023-10-24  6:44 ` [PATCH 1/3] filemap: add " Christoph Hellwig
@ 2023-10-24  6:44 ` Christoph Hellwig
  2023-10-24 12:04   ` Ilya Dryomov
  2023-10-24 15:01   ` Darrick J. Wong
  2023-10-24  6:44 ` [PATCH 3/3] xfs: respect the stable writes flag on the RT device Christoph Hellwig
  2 siblings, 2 replies; 16+ messages in thread
From: Christoph Hellwig @ 2023-10-24  6:44 UTC (permalink / raw)
  To: Jens Axboe, Matthew Wilcox
  Cc: Ilya Dryomov, Andrew Morton, linux-block, linux-fsdevel,
	linux-xfs, linux-mm

Propagate the per-queue stable_write flags into each bdev inode in bdev_add.
This makes sure devices that require stable writes have it set for I/O
on the block device node as well.

Note that this doesn't cover the case of a flag changing on a live device
yet.  We should handle that as well, but I plan to cover it as part of a
more general rework of how changing runtime paramters on block devices
works.

Fixes: 1cb039f3dc16 ("bdi: replace BDI_CAP_STABLE_WRITES with a queue and a sb flag")
Reported-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 block/bdev.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/block/bdev.c b/block/bdev.c
index f3b13aa1b7d428..04dba25b0019eb 100644
--- a/block/bdev.c
+++ b/block/bdev.c
@@ -425,6 +425,8 @@ void bdev_set_nr_sectors(struct block_device *bdev, sector_t sectors)
 
 void bdev_add(struct block_device *bdev, dev_t dev)
 {
+	if (bdev_stable_writes(bdev))
+		mapping_set_stable_writes(bdev->bd_inode->i_mapping);
 	bdev->bd_dev = dev;
 	bdev->bd_inode->i_rdev = dev;
 	bdev->bd_inode->i_ino = dev;
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 3/3] xfs: respect the stable writes flag on the RT device
  2023-10-24  6:44 add and use a per-mapping stable writes flag Christoph Hellwig
  2023-10-24  6:44 ` [PATCH 1/3] filemap: add " Christoph Hellwig
  2023-10-24  6:44 ` [PATCH 2/3] block: update the stable_writes flag in bdev_add Christoph Hellwig
@ 2023-10-24  6:44 ` Christoph Hellwig
  2023-10-24 15:09   ` Darrick J. Wong
  2 siblings, 1 reply; 16+ messages in thread
From: Christoph Hellwig @ 2023-10-24  6:44 UTC (permalink / raw)
  To: Jens Axboe, Matthew Wilcox
  Cc: Ilya Dryomov, Andrew Morton, linux-block, linux-fsdevel,
	linux-xfs, linux-mm

Update the per-folio stable writes flag dependening on which device an
inode resides on.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/xfs/xfs_inode.h | 8 ++++++++
 fs/xfs/xfs_ioctl.c | 9 +++++++++
 fs/xfs/xfs_iops.c  | 7 +++++++
 3 files changed, 24 insertions(+)

diff --git a/fs/xfs/xfs_inode.h b/fs/xfs/xfs_inode.h
index 0c5bdb91152e1c..682959c8f78cb0 100644
--- a/fs/xfs/xfs_inode.h
+++ b/fs/xfs/xfs_inode.h
@@ -561,6 +561,14 @@ extern void xfs_setup_inode(struct xfs_inode *ip);
 extern void xfs_setup_iops(struct xfs_inode *ip);
 extern void xfs_diflags_to_iflags(struct xfs_inode *ip, bool init);
 
+static inline void xfs_update_stable_writes(struct xfs_inode *ip)
+{
+	if (bdev_stable_writes(xfs_inode_buftarg(ip)->bt_bdev))
+		mapping_set_stable_writes(VFS_I(ip)->i_mapping);
+	else
+		mapping_clear_stable_writes(VFS_I(ip)->i_mapping);
+}
+
 /*
  * When setting up a newly allocated inode, we need to call
  * xfs_finish_inode_setup() once the inode is fully instantiated at
diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
index 55bb01173cde8c..67bf613b3c86bc 100644
--- a/fs/xfs/xfs_ioctl.c
+++ b/fs/xfs/xfs_ioctl.c
@@ -1147,6 +1147,15 @@ xfs_ioctl_setattr_xflags(
 	ip->i_diflags2 = i_flags2;
 
 	xfs_diflags_to_iflags(ip, false);
+
+	/*
+	 * Make the stable writes flag match that of the device the inode
+	 * resides on when flipping the RT flag.
+	 */
+	if (S_ISREG(VFS_I(ip)->i_mode) &&
+	    XFS_IS_REALTIME_INODE(ip) != (fa->fsx_xflags & FS_XFLAG_REALTIME))
+		xfs_update_stable_writes(ip);
+
 	xfs_trans_ichgtime(tp, ip, XFS_ICHGTIME_CHG);
 	xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE);
 	XFS_STATS_INC(mp, xs_ig_attrchg);
diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
index 2b3b05c28e9e48..b8ec045708c318 100644
--- a/fs/xfs/xfs_iops.c
+++ b/fs/xfs/xfs_iops.c
@@ -1298,6 +1298,13 @@ xfs_setup_inode(
 	gfp_mask = mapping_gfp_mask(inode->i_mapping);
 	mapping_set_gfp_mask(inode->i_mapping, (gfp_mask & ~(__GFP_FS)));
 
+	/*
+	 * For real-time inodes update the stable write flags to that of the RT
+	 * device instead of the data device.
+	 */
+	if (S_ISREG(inode->i_mode) && XFS_IS_REALTIME_INODE(ip))
+		xfs_update_stable_writes(ip);
+
 	/*
 	 * If there is no attribute fork no ACL can exist on this inode,
 	 * and it can't have any file capabilities attached to it either.
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/3] filemap: add a per-mapping stable writes flag
  2023-10-24  6:44 ` [PATCH 1/3] filemap: add " Christoph Hellwig
@ 2023-10-24 11:43   ` Matthew Wilcox
  2023-10-24 12:03   ` Ilya Dryomov
  2023-10-24 15:00   ` Darrick J. Wong
  2 siblings, 0 replies; 16+ messages in thread
From: Matthew Wilcox @ 2023-10-24 11:43 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, Ilya Dryomov, Andrew Morton, linux-block,
	linux-fsdevel, linux-xfs, linux-mm

On Tue, Oct 24, 2023 at 08:44:14AM +0200, Christoph Hellwig wrote:
> folio_wait_stable waits for writeback to finish before modifying the
> contents of a folio again, e.g. to support check summing of the data
> in the block integrity code.
> 
> Currently this behavior is controlled by the SB_I_STABLE_WRITES flag
> on the super_block, which means it is uniform for the entire file system.
> This is wrong for the block device pseudofs which is shared by all
> block devices, or file systems that can use multiple devices like XFS
> witht the RT subvolume or btrfs (although btrfs currently reimplements
> folio_wait_stable anyway).
> 
> Add a per-address_space AS_STABLE_WRITES flag to control the behavior
> in a more fine grained way.  The existing SB_I_STABLE_WRITES is kept
> to initialize AS_STABLE_WRITES to the existing default which covers
> most cases.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>

> +++ b/mm/page-writeback.c
> @@ -3110,7 +3110,7 @@ EXPORT_SYMBOL_GPL(folio_wait_writeback_killable);
>   */
>  void folio_wait_stable(struct folio *folio)
>  {
> -	if (folio_inode(folio)->i_sb->s_iflags & SB_I_STABLE_WRITES)
> +	if (mapping_stable_writes(folio_mapping(folio)))
>  		folio_wait_writeback(folio);

What I really like about this is that we've gone from

	folio->mapping->host->i_sb->s_iflags
to
	folio->mapping->flags

which saves us two pointer dereferences.  Sure, probably cached, but
maybe not, and cache misses are expensive.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/3] filemap: add a per-mapping stable writes flag
  2023-10-24  6:44 ` [PATCH 1/3] filemap: add " Christoph Hellwig
  2023-10-24 11:43   ` Matthew Wilcox
@ 2023-10-24 12:03   ` Ilya Dryomov
  2023-10-24 12:09     ` Matthew Wilcox
  2023-10-24 15:00   ` Darrick J. Wong
  2 siblings, 1 reply; 16+ messages in thread
From: Ilya Dryomov @ 2023-10-24 12:03 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, Matthew Wilcox, Andrew Morton, linux-block,
	linux-fsdevel, linux-xfs, linux-mm

On Tue, Oct 24, 2023 at 8:44 AM Christoph Hellwig <hch@lst.de> wrote:
>
> folio_wait_stable waits for writeback to finish before modifying the
> contents of a folio again, e.g. to support check summing of the data
> in the block integrity code.
>
> Currently this behavior is controlled by the SB_I_STABLE_WRITES flag
> on the super_block, which means it is uniform for the entire file system.
> This is wrong for the block device pseudofs which is shared by all
> block devices, or file systems that can use multiple devices like XFS
> witht the RT subvolume or btrfs (although btrfs currently reimplements
> folio_wait_stable anyway).
>
> Add a per-address_space AS_STABLE_WRITES flag to control the behavior
> in a more fine grained way.  The existing SB_I_STABLE_WRITES is kept
> to initialize AS_STABLE_WRITES to the existing default which covers
> most cases.
>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  fs/inode.c              |  2 ++
>  include/linux/pagemap.h | 17 +++++++++++++++++
>  mm/page-writeback.c     |  2 +-
>  3 files changed, 20 insertions(+), 1 deletion(-)
>
> diff --git a/fs/inode.c b/fs/inode.c
> index 84bc3c76e5ccb5..ae1a6410b53d7e 100644
> --- a/fs/inode.c
> +++ b/fs/inode.c
> @@ -215,6 +215,8 @@ int inode_init_always(struct super_block *sb, struct inode *inode)
>         lockdep_set_class_and_name(&mapping->invalidate_lock,
>                                    &sb->s_type->invalidate_lock_key,
>                                    "mapping.invalidate_lock");
> +       if (sb->s_iflags & SB_I_STABLE_WRITES)
> +               mapping_set_stable_writes(mapping);
>         inode->i_private = NULL;
>         inode->i_mapping = mapping;
>         INIT_HLIST_HEAD(&inode->i_dentry);      /* buggered by rcu freeing */
> diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
> index 351c3b7f93a14e..8c9608b217b000 100644
> --- a/include/linux/pagemap.h
> +++ b/include/linux/pagemap.h
> @@ -204,6 +204,8 @@ enum mapping_flags {
>         AS_NO_WRITEBACK_TAGS = 5,
>         AS_LARGE_FOLIO_SUPPORT = 6,
>         AS_RELEASE_ALWAYS,      /* Call ->release_folio(), even if no private data */
> +       AS_STABLE_WRITES,       /* must wait for writeback before modifying
> +                                  folio contents */
>  };
>
>  /**
> @@ -289,6 +291,21 @@ static inline void mapping_clear_release_always(struct address_space *mapping)
>         clear_bit(AS_RELEASE_ALWAYS, &mapping->flags);
>  }
>
> +static inline bool mapping_stable_writes(const struct address_space *mapping)
> +{
> +       return test_bit(AS_STABLE_WRITES, &mapping->flags);
> +}
> +
> +static inline void mapping_set_stable_writes(struct address_space *mapping)
> +{
> +       set_bit(AS_STABLE_WRITES, &mapping->flags);
> +}
> +
> +static inline void mapping_clear_stable_writes(struct address_space *mapping)

Hi Christoph,

Nit: mapping_clear_stable_writes() is unused.

> +{
> +       clear_bit(AS_STABLE_WRITES, &mapping->flags);
> +}
> +
>  static inline gfp_t mapping_gfp_mask(struct address_space * mapping)
>  {
>         return mapping->gfp_mask;
> diff --git a/mm/page-writeback.c b/mm/page-writeback.c
> index b8d3d7040a506a..4656534b8f5cc6 100644
> --- a/mm/page-writeback.c
> +++ b/mm/page-writeback.c
> @@ -3110,7 +3110,7 @@ EXPORT_SYMBOL_GPL(folio_wait_writeback_killable);
>   */
>  void folio_wait_stable(struct folio *folio)
>  {
> -       if (folio_inode(folio)->i_sb->s_iflags & SB_I_STABLE_WRITES)
> +       if (mapping_stable_writes(folio_mapping(folio)))
>                 folio_wait_writeback(folio);
>  }
>  EXPORT_SYMBOL_GPL(folio_wait_stable);
> --
> 2.39.2
>

Tested with RBD which behaves like a DIF/DIX device (i.e. requires
stable pages):

Tested-by: Ilya Dryomov <idryomov@gmail.com>

Thanks,

                Ilya

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/3] block: update the stable_writes flag in bdev_add
  2023-10-24  6:44 ` [PATCH 2/3] block: update the stable_writes flag in bdev_add Christoph Hellwig
@ 2023-10-24 12:04   ` Ilya Dryomov
  2023-10-24 15:01   ` Darrick J. Wong
  1 sibling, 0 replies; 16+ messages in thread
From: Ilya Dryomov @ 2023-10-24 12:04 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, Matthew Wilcox, Andrew Morton, linux-block,
	linux-fsdevel, linux-xfs, linux-mm

On Tue, Oct 24, 2023 at 8:44 AM Christoph Hellwig <hch@lst.de> wrote:
>
> Propagate the per-queue stable_write flags into each bdev inode in bdev_add.
> This makes sure devices that require stable writes have it set for I/O
> on the block device node as well.
>
> Note that this doesn't cover the case of a flag changing on a live device
> yet.  We should handle that as well, but I plan to cover it as part of a
> more general rework of how changing runtime paramters on block devices
> works.
>
> Fixes: 1cb039f3dc16 ("bdi: replace BDI_CAP_STABLE_WRITES with a queue and a sb flag")
> Reported-by: Ilya Dryomov <idryomov@gmail.com>
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  block/bdev.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/block/bdev.c b/block/bdev.c
> index f3b13aa1b7d428..04dba25b0019eb 100644
> --- a/block/bdev.c
> +++ b/block/bdev.c
> @@ -425,6 +425,8 @@ void bdev_set_nr_sectors(struct block_device *bdev, sector_t sectors)
>
>  void bdev_add(struct block_device *bdev, dev_t dev)
>  {
> +       if (bdev_stable_writes(bdev))
> +               mapping_set_stable_writes(bdev->bd_inode->i_mapping);
>         bdev->bd_dev = dev;
>         bdev->bd_inode->i_rdev = dev;
>         bdev->bd_inode->i_ino = dev;
> --
> 2.39.2
>

Tested with RBD which behaves like a DIF/DIX device (i.e. requires
stable pages):

Tested-by: Ilya Dryomov <idryomov@gmail.com>

Thanks,

                Ilya

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/3] filemap: add a per-mapping stable writes flag
  2023-10-24 12:03   ` Ilya Dryomov
@ 2023-10-24 12:09     ` Matthew Wilcox
  2023-10-24 12:45       ` Ilya Dryomov
  0 siblings, 1 reply; 16+ messages in thread
From: Matthew Wilcox @ 2023-10-24 12:09 UTC (permalink / raw)
  To: Ilya Dryomov
  Cc: Christoph Hellwig, Jens Axboe, Andrew Morton, linux-block,
	linux-fsdevel, linux-xfs, linux-mm

On Tue, Oct 24, 2023 at 02:03:36PM +0200, Ilya Dryomov wrote:
> > +static inline void mapping_clear_stable_writes(struct address_space *mapping)
> 
> Hi Christoph,
> 
> Nit: mapping_clear_stable_writes() is unused.

It's used in patch 3

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/3] filemap: add a per-mapping stable writes flag
  2023-10-24 12:09     ` Matthew Wilcox
@ 2023-10-24 12:45       ` Ilya Dryomov
  0 siblings, 0 replies; 16+ messages in thread
From: Ilya Dryomov @ 2023-10-24 12:45 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Christoph Hellwig, Jens Axboe, Andrew Morton, linux-block,
	linux-fsdevel, linux-xfs, linux-mm

On Tue, Oct 24, 2023 at 2:09 PM Matthew Wilcox <willy@infradead.org> wrote:
>
> On Tue, Oct 24, 2023 at 02:03:36PM +0200, Ilya Dryomov wrote:
> > > +static inline void mapping_clear_stable_writes(struct address_space *mapping)
> >
> > Hi Christoph,
> >
> > Nit: mapping_clear_stable_writes() is unused.
>
> It's used in patch 3

My apologies, I was too quick to archive it as specific to XFS.

Thanks,

                Ilya

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/3] filemap: add a per-mapping stable writes flag
  2023-10-24  6:44 ` [PATCH 1/3] filemap: add " Christoph Hellwig
  2023-10-24 11:43   ` Matthew Wilcox
  2023-10-24 12:03   ` Ilya Dryomov
@ 2023-10-24 15:00   ` Darrick J. Wong
  2023-10-24 15:10     ` Matthew Wilcox
  2023-10-24 16:14     ` Christoph Hellwig
  2 siblings, 2 replies; 16+ messages in thread
From: Darrick J. Wong @ 2023-10-24 15:00 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, Matthew Wilcox, Ilya Dryomov, Andrew Morton,
	linux-block, linux-fsdevel, linux-xfs, linux-mm

On Tue, Oct 24, 2023 at 08:44:14AM +0200, Christoph Hellwig wrote:
> folio_wait_stable waits for writeback to finish before modifying the
> contents of a folio again, e.g. to support check summing of the data
> in the block integrity code.
> 
> Currently this behavior is controlled by the SB_I_STABLE_WRITES flag
> on the super_block, which means it is uniform for the entire file system.
> This is wrong for the block device pseudofs which is shared by all
> block devices, or file systems that can use multiple devices like XFS
> witht the RT subvolume or btrfs (although btrfs currently reimplements
> folio_wait_stable anyway).
> 
> Add a per-address_space AS_STABLE_WRITES flag to control the behavior
> in a more fine grained way.  The existing SB_I_STABLE_WRITES is kept
> to initialize AS_STABLE_WRITES to the existing default which covers
> most cases.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  fs/inode.c              |  2 ++
>  include/linux/pagemap.h | 17 +++++++++++++++++
>  mm/page-writeback.c     |  2 +-

For a hot second I wondered if we could get rid of SB_I_STABLE_WRITES
too, but then had an AHA moment when I saw that NFS also sets it.

This looks reasonable,
Reviewed-by: Darrick J. Wong <djwong@kernel.org>

--D

>  3 files changed, 20 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/inode.c b/fs/inode.c
> index 84bc3c76e5ccb5..ae1a6410b53d7e 100644
> --- a/fs/inode.c
> +++ b/fs/inode.c
> @@ -215,6 +215,8 @@ int inode_init_always(struct super_block *sb, struct inode *inode)
>  	lockdep_set_class_and_name(&mapping->invalidate_lock,
>  				   &sb->s_type->invalidate_lock_key,
>  				   "mapping.invalidate_lock");
> +	if (sb->s_iflags & SB_I_STABLE_WRITES)
> +		mapping_set_stable_writes(mapping);
>  	inode->i_private = NULL;
>  	inode->i_mapping = mapping;
>  	INIT_HLIST_HEAD(&inode->i_dentry);	/* buggered by rcu freeing */
> diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
> index 351c3b7f93a14e..8c9608b217b000 100644
> --- a/include/linux/pagemap.h
> +++ b/include/linux/pagemap.h
> @@ -204,6 +204,8 @@ enum mapping_flags {
>  	AS_NO_WRITEBACK_TAGS = 5,
>  	AS_LARGE_FOLIO_SUPPORT = 6,
>  	AS_RELEASE_ALWAYS,	/* Call ->release_folio(), even if no private data */
> +	AS_STABLE_WRITES,	/* must wait for writeback before modifying
> +				   folio contents */
>  };
>  
>  /**
> @@ -289,6 +291,21 @@ static inline void mapping_clear_release_always(struct address_space *mapping)
>  	clear_bit(AS_RELEASE_ALWAYS, &mapping->flags);
>  }
>  
> +static inline bool mapping_stable_writes(const struct address_space *mapping)
> +{
> +	return test_bit(AS_STABLE_WRITES, &mapping->flags);
> +}
> +
> +static inline void mapping_set_stable_writes(struct address_space *mapping)
> +{
> +	set_bit(AS_STABLE_WRITES, &mapping->flags);
> +}
> +
> +static inline void mapping_clear_stable_writes(struct address_space *mapping)
> +{
> +	clear_bit(AS_STABLE_WRITES, &mapping->flags);
> +}
> +
>  static inline gfp_t mapping_gfp_mask(struct address_space * mapping)
>  {
>  	return mapping->gfp_mask;
> diff --git a/mm/page-writeback.c b/mm/page-writeback.c
> index b8d3d7040a506a..4656534b8f5cc6 100644
> --- a/mm/page-writeback.c
> +++ b/mm/page-writeback.c
> @@ -3110,7 +3110,7 @@ EXPORT_SYMBOL_GPL(folio_wait_writeback_killable);
>   */
>  void folio_wait_stable(struct folio *folio)
>  {
> -	if (folio_inode(folio)->i_sb->s_iflags & SB_I_STABLE_WRITES)
> +	if (mapping_stable_writes(folio_mapping(folio)))
>  		folio_wait_writeback(folio);
>  }
>  EXPORT_SYMBOL_GPL(folio_wait_stable);
> -- 
> 2.39.2
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/3] block: update the stable_writes flag in bdev_add
  2023-10-24  6:44 ` [PATCH 2/3] block: update the stable_writes flag in bdev_add Christoph Hellwig
  2023-10-24 12:04   ` Ilya Dryomov
@ 2023-10-24 15:01   ` Darrick J. Wong
  1 sibling, 0 replies; 16+ messages in thread
From: Darrick J. Wong @ 2023-10-24 15:01 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, Matthew Wilcox, Ilya Dryomov, Andrew Morton,
	linux-block, linux-fsdevel, linux-xfs, linux-mm

On Tue, Oct 24, 2023 at 08:44:15AM +0200, Christoph Hellwig wrote:
> Propagate the per-queue stable_write flags into each bdev inode in bdev_add.
> This makes sure devices that require stable writes have it set for I/O
> on the block device node as well.
> 
> Note that this doesn't cover the case of a flag changing on a live device
> yet.  We should handle that as well, but I plan to cover it as part of a
> more general rework of how changing runtime paramters on block devices
> works.

Yessssssssss! :)

> Fixes: 1cb039f3dc16 ("bdi: replace BDI_CAP_STABLE_WRITES with a queue and a sb flag")
> Reported-by: Ilya Dryomov <idryomov@gmail.com>
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Seems sane to me,
Reviewed-by: Darrick J. Wong <djwong@kernel.org>

--D

> ---
>  block/bdev.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/block/bdev.c b/block/bdev.c
> index f3b13aa1b7d428..04dba25b0019eb 100644
> --- a/block/bdev.c
> +++ b/block/bdev.c
> @@ -425,6 +425,8 @@ void bdev_set_nr_sectors(struct block_device *bdev, sector_t sectors)
>  
>  void bdev_add(struct block_device *bdev, dev_t dev)
>  {
> +	if (bdev_stable_writes(bdev))
> +		mapping_set_stable_writes(bdev->bd_inode->i_mapping);
>  	bdev->bd_dev = dev;
>  	bdev->bd_inode->i_rdev = dev;
>  	bdev->bd_inode->i_ino = dev;
> -- 
> 2.39.2
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/3] xfs: respect the stable writes flag on the RT device
  2023-10-24  6:44 ` [PATCH 3/3] xfs: respect the stable writes flag on the RT device Christoph Hellwig
@ 2023-10-24 15:09   ` Darrick J. Wong
  2023-10-24 15:14     ` Matthew Wilcox
  2023-10-24 16:16     ` Christoph Hellwig
  0 siblings, 2 replies; 16+ messages in thread
From: Darrick J. Wong @ 2023-10-24 15:09 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, Matthew Wilcox, Ilya Dryomov, Andrew Morton,
	linux-block, linux-fsdevel, linux-xfs, linux-mm

On Tue, Oct 24, 2023 at 08:44:16AM +0200, Christoph Hellwig wrote:
> Update the per-folio stable writes flag dependening on which device an
> inode resides on.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  fs/xfs/xfs_inode.h | 8 ++++++++
>  fs/xfs/xfs_ioctl.c | 9 +++++++++
>  fs/xfs/xfs_iops.c  | 7 +++++++
>  3 files changed, 24 insertions(+)
> 
> diff --git a/fs/xfs/xfs_inode.h b/fs/xfs/xfs_inode.h
> index 0c5bdb91152e1c..682959c8f78cb0 100644
> --- a/fs/xfs/xfs_inode.h
> +++ b/fs/xfs/xfs_inode.h
> @@ -561,6 +561,14 @@ extern void xfs_setup_inode(struct xfs_inode *ip);
>  extern void xfs_setup_iops(struct xfs_inode *ip);
>  extern void xfs_diflags_to_iflags(struct xfs_inode *ip, bool init);
>  
> +static inline void xfs_update_stable_writes(struct xfs_inode *ip)
> +{
> +	if (bdev_stable_writes(xfs_inode_buftarg(ip)->bt_bdev))
> +		mapping_set_stable_writes(VFS_I(ip)->i_mapping);
> +	else
> +		mapping_clear_stable_writes(VFS_I(ip)->i_mapping);
> +}
> +
>  /*
>   * When setting up a newly allocated inode, we need to call
>   * xfs_finish_inode_setup() once the inode is fully instantiated at
> diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
> index 55bb01173cde8c..67bf613b3c86bc 100644
> --- a/fs/xfs/xfs_ioctl.c
> +++ b/fs/xfs/xfs_ioctl.c
> @@ -1147,6 +1147,15 @@ xfs_ioctl_setattr_xflags(
>  	ip->i_diflags2 = i_flags2;
>  
>  	xfs_diflags_to_iflags(ip, false);
> +
> +	/*
> +	 * Make the stable writes flag match that of the device the inode
> +	 * resides on when flipping the RT flag.
> +	 */
> +	if (S_ISREG(VFS_I(ip)->i_mode) &&
> +	    XFS_IS_REALTIME_INODE(ip) != (fa->fsx_xflags & FS_XFLAG_REALTIME))
> +		xfs_update_stable_writes(ip);

Hmm.  Won't the masking operation here result in the if test comparing 0
or FS_XFLAG_REALTIME to 0 or 1?

Oh.  FS_XFLAG_REALTIME == 1, so that's not an issue in this one case.
That's a bit subtle though, I'd have preferred

	    XFS_IS_REALTIME_INODE(ip) != !!(fa->fsx_xflags & FS_XFLAG_REALTIME))

to make it more obvious that the if test isn't comparing apples to
oranges.

> +
>  	xfs_trans_ichgtime(tp, ip, XFS_ICHGTIME_CHG);
>  	xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE);
>  	XFS_STATS_INC(mp, xs_ig_attrchg);
> diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
> index 2b3b05c28e9e48..b8ec045708c318 100644
> --- a/fs/xfs/xfs_iops.c
> +++ b/fs/xfs/xfs_iops.c
> @@ -1298,6 +1298,13 @@ xfs_setup_inode(
>  	gfp_mask = mapping_gfp_mask(inode->i_mapping);
>  	mapping_set_gfp_mask(inode->i_mapping, (gfp_mask & ~(__GFP_FS)));
>  
> +	/*
> +	 * For real-time inodes update the stable write flags to that of the RT
> +	 * device instead of the data device.
> +	 */
> +	if (S_ISREG(inode->i_mode) && XFS_IS_REALTIME_INODE(ip))
> +		xfs_update_stable_writes(ip);

I wonder if xfs_update_stable_writes should become an empty function for
the CONFIG_XFS_RT=n case, to avoid the atomic flags update?

(The extra code is probably not worth the microoptimization.)

With the !! nit addressed,
Reviewed-by: Darrick J. Wong <djwong@kernel.org>

--D

> +
>  	/*
>  	 * If there is no attribute fork no ACL can exist on this inode,
>  	 * and it can't have any file capabilities attached to it either.
> -- 
> 2.39.2
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/3] filemap: add a per-mapping stable writes flag
  2023-10-24 15:00   ` Darrick J. Wong
@ 2023-10-24 15:10     ` Matthew Wilcox
  2023-10-24 16:14     ` Christoph Hellwig
  1 sibling, 0 replies; 16+ messages in thread
From: Matthew Wilcox @ 2023-10-24 15:10 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: Christoph Hellwig, Jens Axboe, Ilya Dryomov, Andrew Morton,
	linux-block, linux-fsdevel, linux-xfs, linux-mm

On Tue, Oct 24, 2023 at 08:00:53AM -0700, Darrick J. Wong wrote:
> On Tue, Oct 24, 2023 at 08:44:14AM +0200, Christoph Hellwig wrote:
> > Add a per-address_space AS_STABLE_WRITES flag to control the behavior
> > in a more fine grained way.  The existing SB_I_STABLE_WRITES is kept
> > to initialize AS_STABLE_WRITES to the existing default which covers
> > most cases.
> 
> For a hot second I wondered if we could get rid of SB_I_STABLE_WRITES
> too, but then had an AHA moment when I saw that NFS also sets it.

I mean, we could if we're short on flags, or it just offends our
aesthetics to have it.  NFS (and others) would just have to
call mapping_set_stable_writes() in their inode initialisation
routines.  I don't personally find it a big deal either way.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/3] xfs: respect the stable writes flag on the RT device
  2023-10-24 15:09   ` Darrick J. Wong
@ 2023-10-24 15:14     ` Matthew Wilcox
  2023-10-24 16:16     ` Christoph Hellwig
  1 sibling, 0 replies; 16+ messages in thread
From: Matthew Wilcox @ 2023-10-24 15:14 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: Christoph Hellwig, Jens Axboe, Ilya Dryomov, Andrew Morton,
	linux-block, linux-fsdevel, linux-xfs, linux-mm

On Tue, Oct 24, 2023 at 08:09:04AM -0700, Darrick J. Wong wrote:
> > +	if (S_ISREG(VFS_I(ip)->i_mode) &&
> > +	    XFS_IS_REALTIME_INODE(ip) != (fa->fsx_xflags & FS_XFLAG_REALTIME))
> > +		xfs_update_stable_writes(ip);
> 
> Hmm.  Won't the masking operation here result in the if test comparing 0
> or FS_XFLAG_REALTIME to 0 or 1?
> 
> Oh.  FS_XFLAG_REALTIME == 1, so that's not an issue in this one case.
> That's a bit subtle though, I'd have preferred
> 
> 	    XFS_IS_REALTIME_INODE(ip) != !!(fa->fsx_xflags & FS_XFLAG_REALTIME))
> 
> to make it more obvious that the if test isn't comparing apples to
> oranges.

!= !! might be going a bit far.  Would you settle for

XFS_IS_REALTIME_INODE(ip) == !(fa->fsx_xflags & FS_XFLAG_REALTIME)

?  Although none of these read particularly nicely.  Maybe

	XFS_IS_REALTIME_INODE(ip) != ((fa->fsx_xflags & FS_XFLAG_REALTIME) == 0))

Perhaps we need a bool helper for (fa->fsx_xflags & FS_XFLAG_REALTIME)

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/3] filemap: add a per-mapping stable writes flag
  2023-10-24 15:00   ` Darrick J. Wong
  2023-10-24 15:10     ` Matthew Wilcox
@ 2023-10-24 16:14     ` Christoph Hellwig
  1 sibling, 0 replies; 16+ messages in thread
From: Christoph Hellwig @ 2023-10-24 16:14 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: Christoph Hellwig, Jens Axboe, Matthew Wilcox, Ilya Dryomov,
	Andrew Morton, linux-block, linux-fsdevel, linux-xfs, linux-mm

On Tue, Oct 24, 2023 at 08:00:53AM -0700, Darrick J. Wong wrote:
> For a hot second I wondered if we could get rid of SB_I_STABLE_WRITES
> too, but then had an AHA moment when I saw that NFS also sets it.

It's not just NFS, but still the general way of propagating the flag.
I don't really want to add bdev-specific code to new_inode.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/3] xfs: respect the stable writes flag on the RT device
  2023-10-24 15:09   ` Darrick J. Wong
  2023-10-24 15:14     ` Matthew Wilcox
@ 2023-10-24 16:16     ` Christoph Hellwig
  1 sibling, 0 replies; 16+ messages in thread
From: Christoph Hellwig @ 2023-10-24 16:16 UTC (permalink / raw)
  To: Darrick J. Wong
  Cc: Christoph Hellwig, Jens Axboe, Matthew Wilcox, Ilya Dryomov,
	Andrew Morton, linux-block, linux-fsdevel, linux-xfs, linux-mm

On Tue, Oct 24, 2023 at 08:09:04AM -0700, Darrick J. Wong wrote:
> > +	/*
> > +	 * Make the stable writes flag match that of the device the inode
> > +	 * resides on when flipping the RT flag.
> > +	 */
> > +	if (S_ISREG(VFS_I(ip)->i_mode) &&
> > +	    XFS_IS_REALTIME_INODE(ip) != (fa->fsx_xflags & FS_XFLAG_REALTIME))
> > +		xfs_update_stable_writes(ip);
> 
> Hmm.  Won't the masking operation here result in the if test comparing 0
> or FS_XFLAG_REALTIME to 0 or 1?
> 
> Oh.  FS_XFLAG_REALTIME == 1, so that's not an issue in this one case.
> That's a bit subtle though, I'd have preferred
> 
> 	    XFS_IS_REALTIME_INODE(ip) != !!(fa->fsx_xflags & FS_XFLAG_REALTIME))
> 
> to make it more obvious that the if test isn't comparing apples to
> oranges.

This is all copy and pasted from a check a few lines above :)

I guess I could clean it up as well or even add a rt_flag_changed local
variable instead of duplicating the check.

> > +	/*
> > +	 * For real-time inodes update the stable write flags to that of the RT
> > +	 * device instead of the data device.
> > +	 */
> > +	if (S_ISREG(inode->i_mode) && XFS_IS_REALTIME_INODE(ip))
> > +		xfs_update_stable_writes(ip);
> 
> I wonder if xfs_update_stable_writes should become an empty function for
> the CONFIG_XFS_RT=n case, to avoid the atomic flags update?
> 
> (The extra code is probably not worth the microoptimization.)

The compiler already eliminates the code because XFS_IS_REALTIME_INODE(ip)
has a stub for CONFIG_XFS_RT=n that always returns 0.

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2023-10-24 16:17 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-10-24  6:44 add and use a per-mapping stable writes flag Christoph Hellwig
2023-10-24  6:44 ` [PATCH 1/3] filemap: add " Christoph Hellwig
2023-10-24 11:43   ` Matthew Wilcox
2023-10-24 12:03   ` Ilya Dryomov
2023-10-24 12:09     ` Matthew Wilcox
2023-10-24 12:45       ` Ilya Dryomov
2023-10-24 15:00   ` Darrick J. Wong
2023-10-24 15:10     ` Matthew Wilcox
2023-10-24 16:14     ` Christoph Hellwig
2023-10-24  6:44 ` [PATCH 2/3] block: update the stable_writes flag in bdev_add Christoph Hellwig
2023-10-24 12:04   ` Ilya Dryomov
2023-10-24 15:01   ` Darrick J. Wong
2023-10-24  6:44 ` [PATCH 3/3] xfs: respect the stable writes flag on the RT device Christoph Hellwig
2023-10-24 15:09   ` Darrick J. Wong
2023-10-24 15:14     ` Matthew Wilcox
2023-10-24 16:16     ` Christoph Hellwig

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).