public inbox for linux-ext4@vger.kernel.org
 help / color / mirror / Atom feed
From: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
To: Jan Kara <jack@suse.cz>
Cc: Christoph Hellwig <hch@infradead.org>, Jan Kara <jack@suse.cz>,
	linux-ext4@vger.kernel.org, linux-fsdevel@vger.kernel.org
Subject: Re: [RFC 2/3] ext2: Convert ext2 regular file buffered I/O to use iomap
Date: Thu, 30 Nov 2023 16:29:59 +0530	[thread overview]
Message-ID: <87fs0nik0g.fsf@doe.com> (raw)
In-Reply-To: <20231130101845.mt3hhwbbpnhroefg@quack3>

Jan Kara <jack@suse.cz> writes:

> On Thu 30-11-23 13:15:58, Ritesh Harjani wrote:
>> Ritesh Harjani (IBM) <ritesh.list@gmail.com> writes:
>> 
>> > Ritesh Harjani (IBM) <ritesh.list@gmail.com> writes:
>> >
>> >> Christoph Hellwig <hch@infradead.org> writes:
>> >>
>> >>> On Wed, Nov 22, 2023 at 01:29:46PM +0100, Jan Kara wrote:
>> >>>> writeback bit set. XFS plays the revalidation sequence counter games
>> >>>> because of this so we'd have to do something similar for ext2. Not that I'd
>> >>>> care as much about ext2 writeback performance but it should not be that
>> >>>> hard and we'll definitely need some similar solution for ext4 anyway. Can
>> >>>> you give that a try (as a followup "performance improvement" patch).
>> 
>> ok. So I am re-thinknig over this on why will a filesystem like ext2
>> would require sequence counter check. We don't have collapse range
>> or COW sort of operations, it is only the truncate which can race,
>> but that should be taken care by folio_lock. And even if the partial
>> truncate happens on a folio, since the logical to physical block mapping
>> never changes, it should not matter if the writeback wrote data to a
>> cached entry, right?
>
> Yes, so this is what I think I've already mentioned. As long as we map just
> the block at the current offset (or a block under currently locked folio),
> we are fine and we don't need any kind of sequence counter. But as soon as
> we start caching any kind of mapping in iomap_writepage_ctx we need a way
> to protect from races with truncate. So something like the sequence counter.
>

Why do we need to protect from the race with truncate, is my question here.
So, IMO, truncate will truncate the folio cache first before releasing the FS
blocks. Truncation of the folio cache and the writeback path are
protected using folio_lock()
Truncate will clear the dirty flag of the folio before
releasing the folio_lock() right, so writeback will not even proceed for
folios which are not marked dirty (even if we have a cached wpc entry for
which folio is released from folio cache).

Now coming to the stale cached wpc entry for which truncate is doing a
partial truncation. Say, truncate ended up calling
truncate_inode_partial_folio(). Now for such folio (it remains dirty
after partial truncation) (for which there is a stale cached wpc entry),
when writeback writes to the underlying stale block, there is no harm
with that right?

Also this will "only" happen for folio which was partially truncated.
So why do we need to have sequence counter for protecting against this
race is my question. 

So could this be only needed when existing logical to physical block
mapping changes e.g. like COW or maybe collapse range?

-ritesh

  reply	other threads:[~2023-11-30 11:00 UTC|newest]

Thread overview: 39+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <cover.1700506526.git.ritesh.list@gmail.com>
2023-11-20 19:05 ` [RFC 1/3] ext2: Fix ki_pos update for DIO buffered-io fallback case Ritesh Harjani (IBM)
2023-11-21  4:39   ` Christoph Hellwig
2023-11-21  5:36     ` Ritesh Harjani
2023-11-22  6:51       ` Christoph Hellwig
2023-11-20 19:05 ` [RFC 2/3] ext2: Convert ext2 regular file buffered I/O to use iomap Ritesh Harjani (IBM)
2023-11-20 20:00   ` Matthew Wilcox
2023-11-21  4:44   ` Christoph Hellwig
2023-11-21  5:56     ` Ritesh Harjani
2023-11-21  6:08       ` Christoph Hellwig
2023-11-21  6:15         ` Ritesh Harjani
2023-11-22 12:29   ` Jan Kara
2023-11-22 13:11     ` Christoph Hellwig
2023-11-22 20:26       ` Ritesh Harjani
2023-11-30  3:24         ` Ritesh Harjani
2023-11-30  4:22           ` Matthew Wilcox
2023-11-30  4:37             ` Ritesh Harjani
2023-11-30  4:30           ` Christoph Hellwig
2023-11-30  5:27             ` Ritesh Harjani
2023-11-30  8:22             ` Zhang Yi
2023-11-30  7:45           ` Ritesh Harjani
2023-11-30 10:18             ` Jan Kara
2023-11-30 10:59               ` Ritesh Harjani [this message]
2023-11-30 14:08                 ` Jan Kara
2023-11-30 15:50                   ` Ritesh Harjani
2023-11-30 16:01                     ` Jan Kara
2023-11-30 16:03                     ` Matthew Wilcox
2023-12-01 23:09           ` Dave Chinner
2023-12-05 15:22             ` Ritesh Harjani
2023-12-07  8:58               ` Jan Kara
2023-11-22 22:26       ` Dave Chinner
2023-11-23  4:09         ` Darrick J. Wong
2023-11-23  7:09           ` Christoph Hellwig
2023-11-29  5:37             ` Darrick J. Wong
2023-11-29  6:32               ` Christoph Hellwig
2023-11-29  9:19           ` Dave Chinner
2023-11-23  7:02         ` Christoph Hellwig
2023-11-22 20:25     ` Ritesh Harjani
2023-11-20 19:05 ` [RFC 3/3] ext2: Enable large folio support Ritesh Harjani (IBM)
2023-11-20 20:00   ` Matthew Wilcox

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87fs0nik0g.fsf@doe.com \
    --to=ritesh.list@gmail.com \
    --cc=hch@infradead.org \
    --cc=jack@suse.cz \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox