From: Damien Le Moal <damien.lemoal@opensource.wdc.com>
To: Christoph Hellwig <hch@infradead.org>
Cc: linux-fsdevel@vger.kernel.org,
Johannes Thumshirn <johannes.thumshirn@wdc.com>,
Hans Holmberg <hans.holmberg@wdc.com>
Subject: Re: [PATCH] zonefs: Always invalidate last cache page on append write
Date: Wed, 29 Mar 2023 17:27:43 +0900
Message-ID: <46acc134-3f38-2a2d-c2aa-11d2fbee2abc@opensource.wdc.com>
In-Reply-To: <ZCPzbFzjFyiOVDdl@infradead.org>
On 3/29/23 17:14, Christoph Hellwig wrote:
> On Wed, Mar 29, 2023 at 02:58:23PM +0900, Damien Le Moal wrote:
>> +	/*
>> +	 * If the inode block size (sector size) is smaller than the
>> +	 * page size, we may be appending data belonging to an already
>> +	 * cached last page of the inode. So make sure to invalidate that
>> +	 * last cached page. This will always be a no-op for the case where
>> +	 * the block size is equal to the page size.
>> +	 */
>> +	ret = invalidate_inode_pages2_range(inode->i_mapping,
>> +					    iocb->ki_pos >> PAGE_SHIFT, -1);
>> +	if (ret)
>> +		return ret;
>
> The missing truncate here obviously is a bug and needs fixing.
>
> But why does this not follow the logic in __iomap_dio_rw to return
> -ENOTBLK for any error so that the write falls back to buffered I/O?
This is a write to sequential zones, so we cannot use buffered writes. We have
to do a direct write to ensure ordering between writes.

Note that this is the special blocking write case where we issue a zone append.
For async regular writes, we use iomap, so this bug does not exist there. But I
now realize that __iomap_dio_rw() falling back to buffered I/O could also create
an issue with write ordering.
> Also as far as I can tell from reading the code, -1 is not a valid
> end special case for invalidate_inode_pages2_range, so you'll actually
> have to pass a valid end here.
I wondered about that but then saw:

int invalidate_inode_pages2(struct address_space *mapping)
{
	return invalidate_inode_pages2_range(mapping, 0, -1);
}
EXPORT_SYMBOL_GPL(invalidate_inode_pages2);

which tends to indicate that "-1" is fine. The end is passed to
find_get_entries() -> find_get_entry(), where it becomes a "max" pgoff_t, so
using -1 seems fine.
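
To illustrate the two index computations discussed above, here is a minimal
userspace sketch, not kernel code. It assumes 4 KiB pages (DEMO_PAGE_SHIFT)
and uses an unsigned long typedef as a stand-in for the kernel's pgoff_t; the
demo_* helpers are hypothetical names made up for this example. It shows that
a -1 argument arrives in a pgoff_t parameter as the largest possible index,
and that an append position that is not page aligned (block size smaller than
page size) maps to the page that already holds earlier data.

```c
#include <assert.h>
#include <limits.h>

/* Userspace stand-in for the kernel's pgoff_t (an unsigned long). */
typedef unsigned long pgoff_t;

/* Assumption for this sketch only: 4 KiB pages. */
#define DEMO_PAGE_SHIFT 12

/*
 * Hypothetical helper: index of the page containing byte offset pos,
 * mirroring the "iocb->ki_pos >> PAGE_SHIFT" expression in the patch.
 * pos is assumed to be non-negative.
 */
static pgoff_t demo_page_index(long long pos)
{
	return (pgoff_t)(pos >> DEMO_PAGE_SHIFT);
}

/*
 * Hypothetical helper: returns its argument unchanged, to show what
 * value a callee sees when the caller passes -1 for a pgoff_t "end".
 * The int -1 is converted to unsigned long, i.e. ULONG_MAX.
 */
static pgoff_t demo_end_as_pgoff(pgoff_t end)
{
	return end;
}
```

With 512 B blocks, an append at offset 1536 gives demo_page_index(1536) == 0,
that is, the page that may already be cached. With a block size equal to the
page size, the append offset is always page aligned (4096 gives index 1), so
the range starts past any page holding existing data.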
--
Damien Le Moal
Western Digital Research