From: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
To: Jan Kara <jack@suse.cz>
Cc: linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org,
	Jan Kara <jack@suse.cz>, Christoph Hellwig <hch@infradead.org>,
	"Darrick J . Wong" <djwong@kernel.org>,
	Ojaswin Mujoo <ojaswin@linux.ibm.com>,
	Disha Goel <disgoel@linux.ibm.com>,
	Christoph Hellwig <hch@lst.de>
Subject: Re: [PATCHv5 2/9] fs/buffer.c: Add generic_buffer_fsync implementation
Date: Mon, 17 Apr 2023 17:08:57 +0530
Message-ID: <87o7nmivqm.fsf@doe.com>
In-Reply-To: <20230417110149.mhrksh4owqkfw5pa@quack3>

Jan Kara <jack@suse.cz> writes:

> On Sun 16-04-23 15:38:37, Ritesh Harjani (IBM) wrote:
>> Some of the higher layers like iomap take inode_lock() when calling
>> generic_write_sync().
>> Also writeback already happens from other paths without inode lock,
>> so it's difficult to say that we really need sync_mapping_buffers() to
>> take any inode locking here. Having said that, let's add
>> generic_buffer_fsync() implementation in buffer.c with no
>> inode_lock/unlock() for now so that filesystems like ext2 and
>> ext4's nojournal mode can use it.
>>
>> When ext4 was converted to iomap for direct-io, it already copied its own
>> variant of __generic_file_fsync() without lock. Hence let's add a helper
>> API and use it both in ext2 and ext4.
>>
>> Later we can review other filesystems as well to see if we can make
>> generic_buffer_fsync() which does not take any inode_lock() as the
>> default path.
>>
>> Tested-by: Disha Goel <disgoel@linux.ibm.com>
>> Reviewed-by: Christoph Hellwig <hch@lst.de>
>> Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
>
> There is a problem with generic_buffer_fsync() that it does not call
> blkdev_issue_flush() so the caller is responsible for doing that. That's
> necessary for ext2 & ext4 so fine for now. But historically this was the
> case with generic_file_fsync() as well and that led to many filesystems
> forgetting to flush caches from fsync(2).

Ok, thanks for the details.

> What is our transition plan for
> these filesystems that currently do the cache flush from
> generic_file_fsync()? Do we want to eventually keep generic_file_fsync()
> doing the cache flush and call generic_buffer_fsync() instead of
> __generic_buffer_fsync() from it?

Frankly speaking, I was thinking we would come back to this question
when we start working on those changes. So far I have only looked at it
from the perspective of the ext2 DIO changes.

But since you asked, here is what I think we could do -

Rename generic_file_fsync() to generic_buffers_sync() and move it to
fs/buffer.c. Then:
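/* Sketch only: takes the same arguments as generic_buffer_fsync() below. */
int generic_buffers_sync(struct file *file, loff_t start, loff_t end,
			 bool datasync)
{
	struct inode *inode = file->f_mapping->host;
	int ret = generic_buffers_fsync(file, start, end, datasync);
	if (!ret)
		ret = blkdev_issue_flush(inode->i_sb->s_bdev);
	return ret;
}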

generic_buffers_fsync() is the same function as in this patch, which does
not do the cache flush.
(I will rename it from generic_buffer_fsync() to generic_buffers_fsync().)

Note: the naming is chosen such that:
- sync means it will do fsync followed by cache flush.
- fsync means it will only do the file fsync
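
To make the layering concrete, here is a small userspace model of the two
helpers. It is only a sketch: struct model_dev and the model_*() functions
are hypothetical stand-ins, not kernel APIs, and passing the flush result
back to the caller is an assumption of this sketch.

```c
/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_dev {
	int fsync_result;   /* value the modelled fsync step returns */
	int flush_result;   /* value the modelled cache flush returns */
	int flush_calls;    /* how many times the cache flush was issued */
};

/* Models generic_buffers_fsync(): write back data and metadata, no flush. */
static int model_buffers_fsync(struct model_dev *dev)
{
	return dev->fsync_result;
}

/* Models blkdev_issue_flush(): ask the device to empty its write cache. */
static int model_issue_flush(struct model_dev *dev)
{
	dev->flush_calls++;
	return dev->flush_result;
}

/* Models generic_buffers_sync(): fsync first, flush only on success. */
static int model_buffers_sync(struct model_dev *dev)
{
	int ret = model_buffers_fsync(dev);

	/* Assumption of this sketch: the flush result is returned too. */
	if (!ret)
		ret = model_issue_flush(dev);
	return ret;
}
```

With fsync_result set to 0, model_buffers_sync() issues exactly one flush.
With a nonzero fsync_result it returns that error and never touches the
cache. A caller that does its own flush would call model_buffers_fsync()
directly.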

As I understand it, we would eventually like to kill the
inode_lock() variants of generic_file_fsync() and __generic_file_fsync()
after auditing other filesystem code, right?

Then for now what we need is a generic_buffers_sync() function, which does
not take inode_lock() and also does the cache flush, as required for ext2,
and generic_buffers_fsync(), which does not do any cache flush, as required
by filesystems like ext4.

Does that sound good to you? Is the naming also proper?

If yes, then I can rename the below function to generic_buffers_fsync()
and also create an implementation of generic_buffers_sync().
Then let ext2 and ext4 use them.


-ritesh


>
> 								Honza
>
>> ---
>>  fs/buffer.c                 | 43 +++++++++++++++++++++++++++++++++++++
>>  include/linux/buffer_head.h |  2 ++
>>  2 files changed, 45 insertions(+)
>>
>> diff --git a/fs/buffer.c b/fs/buffer.c
>> index 9e1e2add541e..df98f1966a71 100644
>> --- a/fs/buffer.c
>> +++ b/fs/buffer.c
>> @@ -593,6 +593,49 @@ int sync_mapping_buffers(struct address_space *mapping)
>>  }
>>  EXPORT_SYMBOL(sync_mapping_buffers);
>>
>> +/**
>> + * generic_buffer_fsync - generic buffer fsync implementation
>> + * for simple filesystems with no inode lock
>> + *
>> + * @file:	file to synchronize
>> + * @start:	start offset in bytes
>> + * @end:	end offset in bytes (inclusive)
>> + * @datasync:	only synchronize essential metadata if true
>> + *
>> + * This is a generic implementation of the fsync method for simple
>> + * filesystems which track all non-inode metadata in the buffers list
>> + * hanging off the address_space structure.
>> + */
>> +int generic_buffer_fsync(struct file *file, loff_t start, loff_t end,
>> +			 bool datasync)
>> +{
>> +	struct inode *inode = file->f_mapping->host;
>> +	int err;
>> +	int ret;
>> +
>> +	err = file_write_and_wait_range(file, start, end);
>> +	if (err)
>> +		return err;
>> +
>> +	ret = sync_mapping_buffers(inode->i_mapping);
>> +	if (!(inode->i_state & I_DIRTY_ALL))
>> +		goto out;
>> +	if (datasync && !(inode->i_state & I_DIRTY_DATASYNC))
>> +		goto out;
>> +
>> +	err = sync_inode_metadata(inode, 1);
>> +	if (ret == 0)
>> +		ret = err;
>> +
>> +out:
>> +	/* check and advance again to catch errors after syncing out buffers */
>> +	err = file_check_and_advance_wb_err(file);
>> +	if (ret == 0)
>> +		ret = err;
>> +	return ret;
>> +}
>> +EXPORT_SYMBOL(generic_buffer_fsync);
>> +
>>  /*
>>   * Called when we've recently written block `bblock', and it is known that
>>   * `bblock' was for a buffer_boundary() buffer.  This means that the block at
>> diff --git a/include/linux/buffer_head.h b/include/linux/buffer_head.h
>> index 8f14dca5fed7..3170d0792d52 100644
>> --- a/include/linux/buffer_head.h
>> +++ b/include/linux/buffer_head.h
>> @@ -211,6 +211,8 @@ int inode_has_buffers(struct inode *);
>>  void invalidate_inode_buffers(struct inode *);
>>  int remove_inode_buffers(struct inode *inode);
>>  int sync_mapping_buffers(struct address_space *mapping);
>> +int generic_buffer_fsync(struct file *file, loff_t start, loff_t end,
>> +			 bool datasync);
>>  void clean_bdev_aliases(struct block_device *bdev, sector_t block,
>>  			sector_t len);
>>  static inline void clean_bdev_bh_alias(struct buffer_head *bh)
>> --
>> 2.39.2
>>
> --
> Jan Kara <jack@suse.com>
> SUSE Labs, CR
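
The error handling in the quoted generic_buffer_fsync() can be read as
"stop early only if data writeback fails, otherwise run every step and
report the first error". Below is a userspace model of just that control
flow. It is a sketch under assumptions: struct model_fsync_steps and
model_buffer_fsync() are hypothetical names, and the two inode->i_state
checks are collapsed into a single inode_dirty flag.

```c
/* Hypothetical userspace model of the quoted function's error handling. */
struct model_fsync_steps {
	int write_and_wait;  /* result of the data writeback step */
	int sync_buffers;    /* result of syncing the metadata buffers */
	int inode_dirty;     /* nonzero if the inode itself needs writing */
	int sync_metadata;   /* result of writing the inode */
	int wb_err;          /* writeback error found by the final check */
	int metadata_synced; /* set when the inode write step actually ran */
};

static int model_buffer_fsync(struct model_fsync_steps *s)
{
	int err;
	int ret;

	/* A data writeback failure ends the call immediately. */
	err = s->write_and_wait;
	if (err)
		return err;

	/* From here on every step runs; ret keeps the first error seen. */
	ret = s->sync_buffers;
	if (s->inode_dirty) {
		s->metadata_synced = 1;
		err = s->sync_metadata;
		if (ret == 0)
			ret = err;
	}

	/* Final check, mirroring the "out:" label in the patch. */
	err = s->wb_err;
	if (ret == 0)
		ret = err;
	return ret;
}
```

In this model, an error from the buffer sync step does not stop the inode
from being written, but it is the error the caller sees even if a later
step fails as well.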
