From: Junxiao Bi <junxiao.bi@oracle.com>
To: Jeff Moyer <jmoyer@redhat.com>
Cc: ocfs2-devel@oss.oracle.com, linux-aio@kvack.org,
mfasheh@suse.com, jlbec@evilplan.org, bcrl@kvack.org,
linux-kernel@vger.kernel.org, stable@vger.kernel.org,
joe.jin@oracle.com
Subject: [Ocfs2-devel] [PATCH 1/2] aio: make kiocb->private NUll in init_sync_kiocb()
Date: Sat, 02 Jun 2012 10:59:05 +0800
Message-ID: <4FC98179.4000400@oracle.com>
In-Reply-To: <x49r4ty69bw.fsf@segfault.boston.devel.redhat.com>
On 06/02/2012 04:55 AM, Jeff Moyer wrote:
> Junxiao Bi<junxiao.bi@oracle.com> writes:
>
>> On 05/31/2012 10:08 PM, Jeff Moyer wrote:
>>> Junxiao Bi<junxiao.bi@oracle.com> writes:
>>>
>>>> Ocfs2 uses kiocb.*private as a flag of unsigned long size. In
>>>> commit a11f7e6 ocfs2: serialize unaligned aio, the unaligned
>>>> io flag is involved in it to serialize the unaligned aio. As
>>>> *private is not initialized in init_sync_kiocb() of do_sync_write(),
>>>> this unaligned io flag may be unexpectedly set in an aligned dio.
>>>> This causes OCFS2_I(inode)->ip_unaligned_aio to be decreased
>>>> to -1 in ocfs2_dio_end_io(), so the following unaligned dio
>>>> will hang forever at ocfs2_aiodio_wait() in ocfs2_file_write_iter().
>>>> We can't initialize this flag in ocfs2_file_write_iter() since
>>>> it may be invoked several times by do_sync_write(). So we initialize
>>>> it in init_sync_kiocb(), which is also useful for other similar uses
>>>> of it in the future.
>>> I don't see any ocfs2_file_write_iter in the upstream kernel.
>>> ocfs2_file_aio_write most certainly could set ->private to 0, it
>>> will only be called once for a given kiocb.
>> From sys_io_submit->..->io_submit_one->aio_run_iocb->aio_rw_vect_retry,
>> it seems that aio_write could be called two times. See the following
>> scenario.
>> 1. A file is opened with the direct io flag, and aio_write is called
>> for the first time in aio_rw_vect_retry. If the direct io cannot be
>> completed, it falls back to buffered io, see line 2329 in
>> aio_write.
> Huh? What's line 2329 in aio_write?
See the following code.
2312         can_do_direct = direct_io;
2313         ret = ocfs2_prepare_inode_for_write(file, ppos,
2314                                             iocb->ki_left, appending,
2315                                             &can_do_direct, &has_refcount);
2316         if (ret < 0) {
2317                 mlog_errno(ret);
2318                 goto out;
2319         }
2320
2321         if (direct_io && !is_sync_kiocb(iocb))
2322                 unaligned_dio = ocfs2_is_io_unaligned(inode, iocb->ki_left,
2323                                                       *ppos);
2324
2325         /*
2326          * We can't complete the direct I/O as requested, fall back to
2327          * buffered I/O.
2328          */
2329         if (direct_io && !can_do_direct) {
2330                 ocfs2_rw_unlock(inode, rw_level);
2331
2332                 have_alloc_sem = 0;
2333                 rw_level = -1;
2334
2335                 direct_io = 0;
2336                 goto relock;
2337         }
The code above shows how direct I/O falls back to buffered I/O. At
line 2313, ocfs2_prepare_inode_for_write() decides whether the direct
I/O can be performed. If it cannot, the variable "can_do_direct" is set
to false, and the variable "direct_io" is then set to 0 at line 2335.
As a result, the code below calls generic_file_buffered_write() rather
than generic_file_direct_write(). If generic_file_buffered_write()
completes only a partial write, its return value "written" becomes the
return value of aio_write, see line 2439. Control then returns to
aio_rw_vect_retry(), where the condition (ret > 0 && iocb->ki_left > 0 &&
opcode == IOCB_CMD_PWRITEV) is true, so aio_write is called a second
time. Because the unaligned I/O flag may have been set in the kiocb
during the first call of aio_write, it may affect the second call of
aio_write even if that call's direct I/O is aligned.
2372         if (direct_io) {
2373                 written = generic_file_direct_write(iocb, iov, &nr_segs, *ppos,
2374                                                     ppos, count, ocount);
2375                 if (written < 0) {
2376                         ret = written;
2377                         goto out_dio;
2378                 }
2379         } else {
2380                 current->backing_dev_info = file->f_mapping->backing_dev_info;
2381                 written = generic_file_buffered_write(iocb, iov, nr_segs, *ppos,
2382                                                       ppos, count, 0);
2383                 current->backing_dev_info = NULL;
2384         }

2438         if (written)
2439                 ret = written;
2440         return ret;
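
To make the sequence described above easier to follow, here is a minimal
user-space sketch of it. It is a toy model, not kernel code: toy_kiocb,
toy_aio_write(), toy_rw_vect_retry() and TOY_UNALIGNED_FLAG are all
hypothetical names, and the sketch assumes that only the first call is
unaligned and that every call writes at most "chunk" bytes. It only shows
how a flag stored in the kiocb on the first call is still visible when the
retry loop issues a second call.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct kiocb; only the two fields this sketch needs. */
struct toy_kiocb {
    unsigned long private_flags; /* models kiocb->private used as a flag word */
    size_t ki_left;              /* bytes still to be written */
};

#define TOY_UNALIGNED_FLAG 1UL

/*
 * Hypothetical model of one aio_write call: it reports at most `chunk`
 * bytes written (a partial write when ki_left is larger) and sets the
 * unaligned flag when this call's I/O is unaligned.
 */
static long toy_aio_write(struct toy_kiocb *iocb, bool unaligned, size_t chunk)
{
    if (unaligned)
        iocb->private_flags |= TOY_UNALIGNED_FLAG;
    return (long)(iocb->ki_left < chunk ? iocb->ki_left : chunk);
}

/*
 * Toy retry loop shaped like the condition quoted above (ret > 0 and
 * ki_left > 0). Assumption: only the first call is unaligned. Returns the
 * number of calls that started with a flag left behind by an earlier call.
 */
static int toy_rw_vect_retry(struct toy_kiocb *iocb, size_t chunk)
{
    int stale_calls = 0;
    bool first = true;
    long ret;

    do {
        /* A later, aligned call still sees the flag from the first call. */
        if (!first && (iocb->private_flags & TOY_UNALIGNED_FLAG))
            stale_calls++;
        ret = toy_aio_write(iocb, first, chunk);
        if (ret > 0)
            iocb->ki_left -= (size_t)ret;
        first = false;
    } while (ret > 0 && iocb->ki_left > 0);

    return stale_calls;
}
```

With ki_left = 8192 and chunk = 4096 the loop makes two calls and the
second one starts with the flag already set. With chunk = 8192 the write
completes in one call and no stale flag is observed.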
>
>> 2. If that buffered io is a partial write, it returns to
>> aio_rw_vect_retry, which issues the second aio_write.
> For the generic case, the fallback to buffered I/O happens in
> __generic_file_aio_write, without bouncing all the way back up the call
> stack to aio_rw_vect_retry. I see in ocfs2, things are a bit different:
>
> retry->aio_rw_vect_retry->ocfs2_file_aio_write->generic_file_direct_write
> ->ocfs2_direct_IO->__blockdev_direct_IO
>
> That last function can return 0 if not all of the data was written via
> direct I/O. At that point, you return all of the way up the chain to
> aio_rw_vect_retry, which checks the return value (ret). If it was 0,
> then it goes ahead and retries the complete I/O. How does that make any
> progress?!
>
> Cheers,
> Jeff