From: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
To: Matthew Wilcox <willy@infradead.org>
Cc: linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org,
Dave Chinner <david@fromorbit.com>,
Brian Foster <bfoster@redhat.com>,
Ojaswin Mujoo <ojaswin@linux.ibm.com>,
Disha Goel <disgoel@linux.ibm.com>,
Aravinda Herle <araherle@in.ibm.com>
Subject: Re: [RFCv4 3/3] iomap: Support subpage size dirty tracking to improve write performance
Date: Fri, 05 May 2023 08:57:53 +0530 [thread overview]
Message-ID: <87zg6j1mpy.fsf@doe.com> (raw)
In-Reply-To: <ZFPJGDqfcUtI4ptO@casper.infradead.org>
Matthew Wilcox <willy@infradead.org> writes:
> On Thu, May 04, 2023 at 08:21:09PM +0530, Ritesh Harjani (IBM) wrote:
>> @@ -90,12 +116,21 @@ iomap_page_create(struct inode *inode, struct folio *folio, unsigned int flags)
>> else
>> gfp = GFP_NOFS | __GFP_NOFAIL;
>>
>> - iop = kzalloc(struct_size(iop, state, BITS_TO_LONGS(nr_blocks)),
>> + /*
>> + * iop->state tracks two sets of state flags when the
>> + * filesystem block size is smaller than the folio size.
>> + * The first state tracks per-filesystem block uptodate
>> + * and the second tracks per-filesystem block dirty
>> + * state.
>> + */
>
> "per-filesystem"? I think you mean "per-block (uptodate|block) state".
> Using "per-block" naming throughout this patchset might help readability.
> It's currently an awkward mix of "subpage", "sub-page" and "sub-folio".
and subfolio to add.
Yes, I agree it got all mixed up in the comments.
Let me stick to sub-folio (which was what we were using earlier [1])
[1]: https://lore.kernel.org/all/20211216210715.3801857-5-willy@infradead.org/
> It also feels like you're adding a comment to every non-mechanical change
> you make, which isn't necessarily helpful. Changelog, sure, but
> sometimes your comments simply re-state what your change is doing.
>
Sure, I will keep that in mind for next rev to remove unwanted comments.
>> -static void iomap_iop_set_range_uptodate(struct folio *folio,
>> - struct iomap_page *iop, size_t off, size_t len)
>> +static void iomap_iop_set_range(struct folio *folio, struct iomap_page *iop,
>> + size_t off, size_t len, enum iop_state state)
>> {
>> struct inode *inode = folio->mapping->host;
>> - unsigned first = off >> inode->i_blkbits;
>> - unsigned last = (off + len - 1) >> inode->i_blkbits;
>> + unsigned int blks_per_folio = i_blocks_per_folio(inode, folio);
>> + unsigned int first_blk = (off >> inode->i_blkbits);
>> + unsigned int last_blk = ((off + len - 1) >> inode->i_blkbits);
>> + unsigned int nr_blks = last_blk - first_blk + 1;
>> unsigned long flags;
>> - unsigned int nr_blocks = i_blocks_per_folio(inode, folio);
>>
>> - spin_lock_irqsave(&iop->state_lock, flags);
>> - iop_set_range_uptodate(iop, first, last - first + 1, nr_blocks);
>> - if (iop_uptodate_full(iop, nr_blocks))
>> - folio_mark_uptodate(folio);
>> - spin_unlock_irqrestore(&iop->state_lock, flags);
>> + switch (state) {
>> + case IOP_STATE_UPDATE:
>> + if (!iop) {
>> + folio_mark_uptodate(folio);
>> + return;
>> + }
>> + spin_lock_irqsave(&iop->state_lock, flags);
>> + iop_set_range_uptodate(iop, first_blk, nr_blks, blks_per_folio);
>> + if (iop_uptodate_full(iop, blks_per_folio))
>> + folio_mark_uptodate(folio);
>> + spin_unlock_irqrestore(&iop->state_lock, flags);
>> + break;
>> + case IOP_STATE_DIRTY:
>> + if (!iop)
>> + return;
>> + spin_lock_irqsave(&iop->state_lock, flags);
>> + iop_set_range_dirty(iop, first_blk, nr_blks, blks_per_folio);
>> + spin_unlock_irqrestore(&iop->state_lock, flags);
>> + break;
>> + }
>> }
>
> I can't believe this is what Dave wanted you to do. iomap_iop_set_range()
> should be the low-level helper called by iop_set_range_uptodate() and
> iop_set_range_dirty(), not the other way around.
Ok, I see the confusion, I think if we make
iomap_iop_set_range() to iomap_set_range(), then that should be good.
Then it becomes iomap_set_range() calling
iop_set_range_update() & iop_set_range_dirty() as the lower level helper routines.
Based on the the existing code, I believe this ^^^^ is how the heirarchy
should look like. Does it look good then? If yes, I will simply drop the
"_iop" part in the next rev.
<Relevant function list>
iop_set_range_uptodate
iop_clear_range_uptodate
iop_test_uptodate
iop_uptodate_full
iop_set_range_dirty
iop_clear_range_dirty
iop_test_dirty
iomap_page_create
iomap_page_release
iomap_iop_set_range -> iomap_set_range()
iomap_iop_clear_range -> iomap_clear_range()
-ritesh
next prev parent reply other threads:[~2023-05-05 3:28 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-05-04 14:51 [RFCv4 0/3] iomap: Support subpage size dirty tracking to improve write performance Ritesh Harjani (IBM)
2023-05-04 14:51 ` [RFCv4 1/3] iomap: Allocate iop in ->write_begin() early Ritesh Harjani (IBM)
2023-05-04 14:51 ` [RFCv4 2/3] iomap: Change uptodate variable name to state Ritesh Harjani (IBM)
2023-05-04 14:51 ` [RFCv4 3/3] iomap: Support subpage size dirty tracking to improve write performance Ritesh Harjani (IBM)
2023-05-04 15:02 ` Matthew Wilcox
2023-05-05 3:27 ` Ritesh Harjani [this message]
2023-05-05 3:59 ` Matthew Wilcox
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87zg6j1mpy.fsf@doe.com \
--to=ritesh.list@gmail.com \
--cc=araherle@in.ibm.com \
--cc=bfoster@redhat.com \
--cc=david@fromorbit.com \
--cc=disgoel@linux.ibm.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-xfs@vger.kernel.org \
--cc=ojaswin@linux.ibm.com \
--cc=willy@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.