From: Zheng Liu <gnehzuil.liu@gmail.com>
To: linux-ext4@vger.kernel.org
Cc: Jan Kara <jack@suse.cz>
Subject: [RFC] call end_page_writeback after converting unwritten extents in ext4_end_io
Date: Thu, 10 Jan 2013 13:56:17 +0800
Message-ID: <20130110055617.GA4415@gmail.com>
Hi all,
I am trying to handle AIO DIO with O_SYNC using the extent status tree in ext4.
After Christoph's patch series is applied, O_SYNC semantics in ext4 will be
broken. This problem can be fixed using the extent status tree, but then we get
a deadlock, because ext4_sync_file() takes i_mutex and then waits for
i_unwritten==0. So let's consider what happens once Christoph's patches are
applied and the extent status tree is used to ensure O_SYNC semantics for AIO
DIO.

ext4_ind_direct_IO:

->ext4_file_write()
  ->mutex_lock(i_mutex)
  ->ext4_ind_direct_IO()
    [if this is an append dio]
  ->mutex_unlock(i_mutex)

ext4_ext_direct_IO:

->ext4_file_write()
  ->mutex_lock(i_mutex)
  ->ext4_ext_direct_IO()
  ->mutex_unlock(i_mutex)
  ->generic_write_sync()
    ->ext4_sync_file()
      ->mutex_lock(i_mutex)
      ->ext4_flush_unwritten_io()
        ->ext4_do_flush_complete_IO()
          [there is empty list]
        ->ext4_unwritten_wait()
          [wait on i_unwritten==0 because
           in ext4_ext_direct_IO i_unwritten
           has been increased]

kworkd:

->dio_complete()
  ->ext4_end_dio()
    ->ext4_es_convert_unwritten_extents()
      [convert unwritten extents in status
       tree to ensure O_SYNC semantics]
    ->ext4_add_complete_io()
  ->generic_write_sync()
    ->ext4_sync_file()
      ->mutex_lock(i_mutex)
        [*DEADLOCK*]

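
The cycle in the trace above can be checked with a toy wait-for graph. This is only a userspace sketch, not kernel code: TASK_SYNC stands for the task sitting in ext4_sync_file(), TASK_WORKER for the completion worker, and build_model() and has_cycle() are hypothetical names invented for the illustration. It assumes the sync task holds i_mutex while it waits for i_unwritten==0, and that only the worker can bring i_unwritten down to zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: two tasks and the task each one is blocked on. */
enum { TASK_SYNC = 0, TASK_WORKER = 1, TASK_COUNT = 2, TASK_NONE = -1 };

struct wait_graph {
    int waits_for[TASK_COUNT];
};

/*
 * The worker always needs i_mutex, which the sync task holds.
 * The sync task is blocked on the worker only if it waits for
 * i_unwritten==0 (assumption: only the worker decrements it).
 */
static struct wait_graph build_model(bool sync_waits_on_unwritten)
{
    struct wait_graph g;

    g.waits_for[TASK_SYNC] = sync_waits_on_unwritten ? TASK_WORKER : TASK_NONE;
    g.waits_for[TASK_WORKER] = TASK_SYNC;
    return g;
}

/* Follow the waits_for chain from every task; returning to the start is a deadlock. */
static bool has_cycle(const struct wait_graph *g)
{
    assert(g != NULL);
    for (int start = 0; start < TASK_COUNT; start++) {
        int cur = g->waits_for[start];

        for (int steps = 0; cur != TASK_NONE && steps < TASK_COUNT; steps++) {
            if (cur == start)
                return true;
            cur = g->waits_for[cur];
        }
    }
    return false;
}
```

With build_model(true) the two tasks wait on each other and has_cycle() reports the deadlock; with build_model(false) the sync task is not blocked on anyone, so the worker only has to wait until i_mutex is released.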
Thus all we need to do is stop waiting on i_unwritten==0. But, as commit
c278531d describes, that leaves a time window in which data integrity is
broken. So we need to call end_page_writeback() only after the unwritten
extents have been converted in ext4_end_io(). However, if we do that, we get
another deadlock: ext4_convert_unwritten_extents() needs to start a journal
handle, and that can trigger a journal commit. If ext4_write_begin() is called
at that moment, it also starts a journal handle and then waits for writeback in
grab_cache_page_write_begin(). The commit then waits for that handle, the
handle waits for writeback to end, and writeback cannot end until the
conversion has been done.
Now I have an idea to solve this problem. We start the journal handle before
submitting the io request rather than in ext4_convert_unwritten_extents(). The
reason for starting it in ext4_convert_unwritten_extents() is that we need to
calculate the journal credits there. But as far as I understand, the number of
credits does not increase in this function, because the extents have already
been split before the io request is submitted. A 'handle_t *handle' will be
added to ext4_io_end_t and used in ext4_convert_unwritten_extents(). That way
the conversion no longer has to start a handle, so it cannot trigger a journal
commit.
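
The proposal can be sketched in userspace as follows. This is a minimal illustration under assumptions, not the ext4 or jbd2 code: handle_t, io_end_t and struct journal are toy stand-ins for the kernel types, and journal_start(), journal_stop(), submit_io() and convert_unwritten() are hypothetical helpers. The only point is that the handle is obtained at submission time and carried in the io_end, so the conversion never starts one itself.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins; none of these are the kernel's definitions. */
typedef struct {
    int credits;
} handle_t;

struct journal {
    int starts;         /* how many times a handle was started */
    handle_t slot;      /* storage for the single handle in this model */
};

typedef struct {
    handle_t *handle;   /* the proposed new field */
    int unwritten;      /* 1 while the extent is still unwritten */
} io_end_t;

/* Starting a handle is the step that may block on a journal commit. */
static handle_t *journal_start(struct journal *j, int credits)
{
    j->starts++;
    j->slot.credits = credits;
    return &j->slot;
}

static void journal_stop(handle_t *h)
{
    h->credits = 0;
}

/* Submission path: reserve the handle before the io is issued. */
static void submit_io(struct journal *j, io_end_t *io, int credits)
{
    io->handle = journal_start(j, credits);
    io->unwritten = 1;
}

/* Completion path: reuse the handle from the io_end, never start a new one. */
static void convert_unwritten(io_end_t *io)
{
    assert(io->handle != NULL);
    io->unwritten = 0;
    journal_stop(io->handle);
    io->handle = NULL;
}
```

In this model journal_start() is called exactly once per io, from submit_io(), and convert_unwritten() only consumes the stored handle. Whether the credits reserved at submission are really sufficient in ext4 is the open question raised above.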
I hope my description is clear. Any comments or feedback are always welcome.
Jan, I don't know whether you have started working on a fix for this problem.
If there is an update, please let me know.
Thanks,
- Zheng