From: Jan Kara <jack@suse.cz>
To: Mauricio Faria de Oliveira <mfo@canonical.com>
Cc: Jan Kara <jack@suse.cz>,
linux-ext4@vger.kernel.org,
dann frazier <dann.frazier@canonical.com>,
Mauricio Faria de Oliveira <mauricio.foliveira@gmail.com>,
Jan Kara <jack@suse.com>
Subject: Re: [RFC PATCH v2 0/5] ext4/jbd2: data=journal: write-protect pages on transaction commit
Date: Wed, 19 Aug 2020 11:27:35 +0200 [thread overview]
Message-ID: <20200819092735.GI1902@quack2.suse.cz> (raw)
In-Reply-To: <20200810010210.3305322-1-mfo@canonical.com>
Hi Mauricio!
On Sun 09-08-20 22:02:03, Mauricio Faria de Oliveira wrote:
> I'd like to ask for your comments on the patchset so far, and have a few
> questions.
Thanks for taking time to write these patches! I have commented on the
patches and now let's address the remaining questions :)
> Third, my changes/observations/questions:
>
> 1) > Probably the best place to add inode to this list is
> ext4_journalled_writepage().
>
> I think we also need it in ext4_page_mkwrite(), right? See the test
> case, for example:
>
> fd = open("file");
> addr = mmap(fd);
> pwrite(fd, "a", 1, 0);
> addr[0] = 'a';
> press_enter(); // when asked for.
> addr[0] = 'b';
>
> Here pwrite() led to ext4_write_begin() which added the inode to a
> transaction (not to the transaction's inode list).
Hum, right. Good spotting, I didn't think of this possibility. Actually,
when we have to do this in ext4_page_mkwrite() we don't need
ext4_journalled_writepage() at all which is what I've suggested when
commenting on your patch 3/5 anyway to simplify things :).
> 2) Do not return early in ext4_page_mkwrite() 'if we have all the buffers mapped'?
>
> There's this check in ext4_page_mkwrite():
>
> /*
> * Return if we have all the buffers mapped. This avoids the need to do
> * journal_start/journal_stop which can block and take a long time
> */
>
> Corresponding to this line in the debug messages:
>
> [ 20.677639] TESTDBG: ext4_page_mkwrite() :: returning; all buffers mapped for inode ffff8f72eda5fca8
>
> That returns early, *before* do_journal_get_write_access(), which is
> needed in data=journal, regardless of whether all buffers are mapped: we
> could exit ext4_page_mkwrite() then change buffers under commit during
> the race window.
>
> So, it seems we should ignore that for data=journal, which allows us to
> ext4_journal_start() and get a transaction handle, used to add the inode
> to the transaction's inode list (per #1.)
Yes, correct.
> With the changes from 1) and 2), the inode is added to the transaction's
> inode list, and the callback does write-protect the page correctly.
>
> When trying to change the buffer contents later (pressing enter) there
> _is_ a wait in ext4_page_mkwrite() as expected! But it's first at the
> file_update_time() because it calls do_get_write_access() for that inode,
> which is also in the transaction.
>
> I presume that a commit can start after file_update_time() and before
> ext4_journal_start() in ext4_page_mkwrite(), so we'd still have to keep
> these changes to ext4_page_mkwrite() ?
Yes, relying on file_update_time() waiting is definitely too fragile.
> 3) When checking to redirty the page in the writepage callback,
> does a buffer without a journal head means we should redirty
> the page? (for the reason it's not part of the committing txn)
This is going to change after the simplifications I've suggested so there's
no need to worry about this now.
> 4) Should we clear the PageChecked bit?
>
> We don't clear the PageChecked bit, thus later when the writeback
> work kicks in, __ext4_journalled_writepage() adds the page to the
> transaction's inode list again.
>
> Since the page is clean, it just goes into the submit data buffers
> callback, but write_cache_pages() returns before the writepage callback.
>
> Should we try to provent that by, say, clearing the pagechecked bit
> in case we don't have to redirty the page (in the writepage callback) ?
I guess it's difficult to determine when PageChecked bit can be safely
cleared. And I plan to do away with PageChecked bit usage completely in my
cleanup so this should go away anyway...
> 5) Inodes with inline data. I haven't checked in detail yet, but
> would appreciate if you have a brief explanation about them, and
> if they need special handling since they apparently don't get
> do_journal_get_write_access() called (eg, see __ext4_journalled_writepage()).
I think I've commented about this in a particular patch.
Honza
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
prev parent reply other threads:[~2020-08-19 9:27 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-08-10 1:02 [RFC PATCH v2 0/5] ext4/jbd2: data=journal: write-protect pages on transaction commit Mauricio Faria de Oliveira
2020-08-10 1:02 ` [RFC PATCH v2 1/5] jbd2: test case for ext4 data=journal/mmap() journal corruption Mauricio Faria de Oliveira
2020-08-18 14:38 ` Jan Kara
2020-08-19 1:15 ` Mauricio Faria de Oliveira
2020-08-10 1:02 ` [RFC PATCH v2 2/5] jbd2: introduce journal callbacks j_submit|finish_inode_data_buffers Mauricio Faria de Oliveira
2020-08-18 14:52 ` Jan Kara
2020-08-19 1:20 ` Mauricio Faria de Oliveira
2020-08-19 9:16 ` Jan Kara
2020-08-10 1:02 ` [RFC PATCH v2 3/5] ext4: data=journal: write-protect pages on submit inode data buffers callback Mauricio Faria de Oliveira
2020-08-19 8:44 ` Jan Kara
2020-08-19 10:41 ` Jan Kara
2020-08-20 22:55 ` Mauricio Faria de Oliveira
2020-08-21 10:26 ` Jan Kara
2020-08-10 1:02 ` [RFC PATCH v2 4/5] ext4: data=journal: add inode to transaction inode list in ext4_page_mkwrite() Mauricio Faria de Oliveira
2020-08-10 1:02 ` [RFC PATCH v2 5/5] ext4/jbd2: debugging messages Mauricio Faria de Oliveira
2020-08-19 8:46 ` Jan Kara
2020-08-10 1:02 ` [RFC PATCH v2/SETUP SCRIPT] Mauricio Faria de Oliveira
2020-08-10 1:02 ` [RFC PATCH v2/TEST CASE] Mauricio Faria de Oliveira
2020-08-19 9:27 ` Jan Kara [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200819092735.GI1902@quack2.suse.cz \
--to=jack@suse.cz \
--cc=dann.frazier@canonical.com \
--cc=jack@suse.com \
--cc=linux-ext4@vger.kernel.org \
--cc=mauricio.foliveira@gmail.com \
--cc=mfo@canonical.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).