* bio splits unnecessarily due to BH_Boundary in ext3 direct I/O
@ 2013-03-07 8:36 Kazuya Mio
2013-03-07 10:48 ` Jan Kara
0 siblings, 1 reply; 9+ messages in thread
From: Kazuya Mio @ 2013-03-07 8:36 UTC (permalink / raw)
To: jack, akpm, adilger.kernel; +Cc: linux-ext4
I found a performance problem where ext3 direct I/O sends a large number of
bios unnecessarily when the BH_Boundary flag is set on a buffer_head.
When we read/write a file sequentially, we read/write not only
the data blocks but also the indirect blocks, which may not be physically
adjacent to the data blocks. So ext3 sets the BH_Boundary flag to submit
the previous I/O before reading/writing an indirect block.
However, in the case of direct I/O, the size of a buffer_head
can be larger than the blocksize. dio_send_cur_page() checks the BH_Boundary
flag and then calls submit_bio() without calling dio_bio_add_page().
As a result, submit_bio() is called for every single page, which causes high
CPU usage.
The following patch fixes this problem only for ext3. At least ext2/3/4
don't need the BH_Boundary flag for direct I/O, because submit_bio() will be
called when the offset of a buffer_head is discontiguous with the previous one.
---
@@ -926,7 +926,8 @@ int ext3_get_blocks_handle(handle_t *handle, struct inode *inode,
set_buffer_new(bh_result);
got_it:
map_bh(bh_result, inode->i_sb, le32_to_cpu(chain[depth-1].key));
- if (count > blocks_to_boundary)
+ /* set boundary flag for buffered I/O */
+ if (maxblocks == 1 && count > blocks_to_boundary)
set_buffer_boundary(bh_result);
err = count;
/* Clean up and exit */
---
A simple performance test with and without the above patch shows reduced
CPU usage:
-------------------------------------------------
| | I/O time(s)| CPU used(%)| mem used(%)|
-------------------------------------------------
|default | 41.304 | 74.658 | 21.528 |
|patched | 40.948 | 58.325 | 21.857 |
-------------------------------------------------
environment:
kernel: 3.8.0-rc7
CPU: Xeon E3-1220
Memory: 8GB
Test details:
(1) create a 48KB file
(2) write 4096KB with O_DIRECT starting from file offset 48KB (so that only
blocks mapped through indirect blocks are written)
(3) repeat (2) 1000 times
I/O time is the time from (1) through (3); CPU and memory usage were
monitored with the sar command.
When the BH_Boundary flag is set on a buffer_head, submit_bio() should be
called once per buffer_head. But I don't see an impact on the other
filesystems that use BH_Boundary.
Does anyone have any ideas about this problem?
Regards,
Kazuya Mio
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: bio splits unnecessarily due to BH_Boundary in ext3 direct I/O
2013-03-07 8:36 bio splits unnecessarily due to BH_Boundary in ext3 direct I/O Kazuya Mio
@ 2013-03-07 10:48 ` Jan Kara
2013-03-19 8:36 ` Kazuya Mio
From: Jan Kara @ 2013-03-07 10:48 UTC (permalink / raw)
To: Kazuya Mio; +Cc: jack, akpm, adilger.kernel, linux-ext4
[-- Attachment #1: Type: text/plain, Size: 2343 bytes --]
Hello,
On Thu 07-03-13 17:36:07, Kazuya Mio wrote:
> I found a performance problem where ext3 direct I/O sends a large number of
> bios unnecessarily when the BH_Boundary flag is set on a buffer_head.
>
> When we read/write a file sequentially, we read/write not only
> the data blocks but also the indirect blocks, which may not be physically
> adjacent to the data blocks. So ext3 sets the BH_Boundary flag to submit
> the previous I/O before reading/writing an indirect block.
>
> However, in the case of direct I/O, the size of a buffer_head
> can be larger than the blocksize. dio_send_cur_page() checks the BH_Boundary
> flag and then calls submit_bio() without calling dio_bio_add_page().
> As a result, submit_bio() is called for every single page, which causes high
> CPU usage.
Yes, you are right that this is a bug. Thank you for reporting it!
> The following patch fixes this problem only for ext3. At least ext2/3/4
> don't need the BH_Boundary flag for direct I/O, because submit_bio() will be
> called when the offset of a buffer_head is discontiguous with the previous one.
>
> ---
> @@ -926,7 +926,8 @@ int ext3_get_blocks_handle(handle_t *handle, struct inode *inode,
> set_buffer_new(bh_result);
> got_it:
> map_bh(bh_result, inode->i_sb, le32_to_cpu(chain[depth-1].key));
> - if (count > blocks_to_boundary)
> + /* set boundary flag for buffered I/O */
> + if (maxblocks == 1 && count > blocks_to_boundary)
> set_buffer_boundary(bh_result);
> err = count;
> /* Clean up and exit */
> ---
But I'm afraid your fix isn't quite correct. As I read the code,
we will accumulate the bio, then read the indirect block from get_more_blocks(),
and only after that find out the bio won't be contiguous, so we would submit it
then. But the desired sequence is:
* accumulate the bio
* find out it will not be contiguous so submit it
* get_more_blocks() - submits read
I think the proper fix should be in fs/direct-io.c:
...
- sdio->boundary = buffer_boundary(map_bh);
+ if (sdio->blocks_available == this_chunk_blocks)
+ sdio->boundary = buffer_boundary(map_bh);
...
Then we properly mark that the bio should be submitted only if we are mapping
the last part of the extent mapped by the filesystem. Can you give this change
a try (full patch with changelog attached)?
Honza
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR
[-- Attachment #2: 0001-direct-io-Fix-boundary-block-handling.patch --]
[-- Type: text/x-patch, Size: 1491 bytes --]
From c45bc949f7b42ed25f40869ff79664a47bd0979f Mon Sep 17 00:00:00 2001
From: Jan Kara <jack@suse.cz>
Date: Thu, 7 Mar 2013 11:41:58 +0100
Subject: [PATCH] direct-io: Fix boundary block handling
When we read/write a file sequentially, we read/write not only the
data blocks but also the indirect blocks, which may not be physically
adjacent to the data blocks. So filesystems set the BH_Boundary flag to
submit the previous I/O before reading/writing an indirect block.
However, the generic direct IO code mishandles the buffer_boundary() flag:
it sets sdio->boundary before each submit_page_section() call, which results
in sending one-page bios because the underlying code thinks each page is the
last one in the contiguous extent. Fix the problem by setting sdio->boundary
only if the current page really is the last one in the mapped extent.
Reported-by: Kazuya Mio <k-mio@sx.jp.nec.com>
Signed-off-by: Jan Kara <jack@suse.cz>
---
fs/direct-io.c | 3 ++-
1 files changed, 2 insertions(+), 1 deletions(-)
diff --git a/fs/direct-io.c b/fs/direct-io.c
index f853263..e666854 100644
--- a/fs/direct-io.c
+++ b/fs/direct-io.c
@@ -969,7 +969,8 @@ do_holes:
this_chunk_bytes = this_chunk_blocks << blkbits;
BUG_ON(this_chunk_bytes == 0);
- sdio->boundary = buffer_boundary(map_bh);
+ if (sdio->blocks_available == this_chunk_blocks)
+ sdio->boundary = buffer_boundary(map_bh);
ret = submit_page_section(dio, sdio, page,
offset_in_page,
this_chunk_bytes,
--
1.7.1
* Re: bio splits unnecessarily due to BH_Boundary in ext3 direct I/O
2013-03-07 10:48 ` Jan Kara
@ 2013-03-19 8:36 ` Kazuya Mio
2013-03-19 19:31 ` Jan Kara
From: Kazuya Mio @ 2013-03-19 8:36 UTC (permalink / raw)
To: Jan Kara; +Cc: akpm, adilger.kernel, linux-ext4
2013/03/07 19:48, Jan Kara wrote:
> Then we properly mark that the bio should be submitted only if we are mapping
> the last part of the extent mapped by the filesystem. Can you give this change
> a try (full patch with changelog attached)?
Sorry for the late response.
After applying your patch, the problem I reported was fixed.
One remaining concern is that submit_bio() is called twice per buffer_head,
because submit_page_section() calls dio_bio_submit() before adding the old
page (sdio->cur_page) and the current page to struct dio_submit.
Does it work as required?
Regards,
Kazuya Mio
* Re: bio splits unnecessarily due to BH_Boundary in ext3 direct I/O
2013-03-19 8:36 ` Kazuya Mio
@ 2013-03-19 19:31 ` Jan Kara
2013-03-21 8:43 ` Kazuya Mio
From: Jan Kara @ 2013-03-19 19:31 UTC (permalink / raw)
To: Kazuya Mio; +Cc: Jan Kara, akpm, adilger.kernel, linux-ext4
On Tue 19-03-13 17:36:17, Kazuya Mio wrote:
> 2013/03/07 19:48, Jan Kara wrote:
> >Then we properly mark that the bio should be submitted only if we are mapping
> >the last part of the extent mapped by the filesystem. Can you give this change
> >a try (full patch with changelog attached)?
>
> Sorry for the late response.
> After applying your patch, the problem I reported was fixed.
>
> One remaining concern is that submit_bio() is called twice per buffer_head,
> because submit_page_section() calls dio_bio_submit() before adding the old
> page (sdio->cur_page) and the current page to struct dio_submit.
> Does it work as required?
I'm not sure I understand. Looking into dio_send_cur_page(), it seems it may
prematurely submit the bio if sdio->boundary is set - in that case we
should probably first try to add the page to the bio and submit the bio
only after that. Is that what you mean?
Honza
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR
* Re: bio splits unnecessarily due to BH_Boundary in ext3 direct I/O
2013-03-19 19:31 ` Jan Kara
@ 2013-03-21 8:43 ` Kazuya Mio
2013-03-29 17:15 ` Jan Kara
From: Kazuya Mio @ 2013-03-21 8:43 UTC (permalink / raw)
To: Jan Kara; +Cc: akpm, adilger.kernel, linux-ext4
2013/03/20 4:31, Jan Kara wrote:
> I'm not sure I understand. Looking into dio_send_cur_page(), it seems it may
> prematurely submit the bio if sdio->boundary is set - in that case we
> should probably first try to add the page to the bio and submit the bio
> only after that. Is that what you mean?
I think direct I/O handles each page of a buffer_head in the following
three steps:
1. submit sdio->bio if sdio->boundary is set
2. add sdio->cur_page to sdio->bio via dio_new_bio() or dio_bio_add_page()
3. set the current page as sdio->cur_page in submit_page_section()
It is true that dio_send_cur_page() submits the bio if sdio->boundary is set.
However, at that point the bio contains neither sdio->cur_page nor the
current page that do_direct_IO() passed in.
Regards,
Kazuya Mio
* Re: bio splits unnecessarily due to BH_Boundary in ext3 direct I/O
2013-03-21 8:43 ` Kazuya Mio
@ 2013-03-29 17:15 ` Jan Kara
2013-04-01 8:25 ` Kazuya Mio
From: Jan Kara @ 2013-03-29 17:15 UTC (permalink / raw)
To: Kazuya Mio; +Cc: Jan Kara, akpm, adilger.kernel, linux-ext4, linux-fsdevel
[-- Attachment #1: Type: text/plain, Size: 1022 bytes --]
On Thu 21-03-13 17:43:52, Kazuya Mio wrote:
> 2013/03/20 4:31, Jan Kara wrote:
> > I'm not sure I understand. Looking into dio_send_cur_page(), it seems it may
> >prematurely submit the bio if sdio->boundary is set - in that case we
> >should probably first try to add the page to the bio and submit the bio
> >only after that. Is that what you mean?
>
> I think direct I/O handles each page of a buffer_head in the following
> three steps:
> 1. submit sdio->bio if sdio->boundary is set
> 2. add sdio->cur_page to sdio->bio via dio_new_bio() or dio_bio_add_page()
> 3. set the current page as sdio->cur_page in submit_page_section()
>
> It is true that dio_send_cur_page() submits the bio if sdio->boundary is set.
> However, at that point the bio contains neither sdio->cur_page nor the
> current page that do_direct_IO() passed in.
Sorry for not getting back to you earlier. So we agree in our analysis. Do
you agree with the attached patch? Does it help your workload?
Honza
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR
[-- Attachment #2: 0001-direct-io-Submit-bio-after-boundary-buffer-is-added-.patch --]
[-- Type: text/x-patch, Size: 1551 bytes --]
From d9ef6ae45c80b298f7f3b718a101956071709b02 Mon Sep 17 00:00:00 2001
From: Jan Kara <jack@suse.cz>
Date: Fri, 29 Mar 2013 18:05:01 +0100
Subject: [PATCH] direct-io: Submit bio after boundary buffer is added to it
Currently, dio_send_cur_page() submits the bio before the current page
(sdio->cur_page) is added to it if sdio->boundary is set. This is
actually wrong, because sdio->boundary means the current buffer is the
last one before metadata needs to be read. So we should rather submit
the bio *after* the current page is added to it.
Reported-by: Kazuya Mio <k-mio@sx.jp.nec.com>
Signed-off-by: Jan Kara <jack@suse.cz>
---
fs/direct-io.c | 12 ++++++------
1 files changed, 6 insertions(+), 6 deletions(-)
diff --git a/fs/direct-io.c b/fs/direct-io.c
index e666854..2ccde31 100644
--- a/fs/direct-io.c
+++ b/fs/direct-io.c
@@ -672,12 +672,6 @@ static inline int dio_send_cur_page(struct dio *dio, struct dio_submit *sdio,
if (sdio->final_block_in_bio != sdio->cur_page_block ||
cur_offset != bio_next_offset)
dio_bio_submit(dio, sdio);
- /*
- * Submit now if the underlying fs is about to perform a
- * metadata read
- */
- else if (sdio->boundary)
- dio_bio_submit(dio, sdio);
}
if (sdio->bio == NULL) {
@@ -694,6 +688,12 @@ static inline int dio_send_cur_page(struct dio *dio, struct dio_submit *sdio,
BUG_ON(ret != 0);
}
}
+ /*
+ * Submit now if the underlying fs is about to perform a
+ * metadata read
+ */
+ if (sdio->boundary)
+ dio_bio_submit(dio, sdio);
out:
return ret;
}
--
1.7.1
* Re: bio splits unnecessarily due to BH_Boundary in ext3 direct I/O
2013-03-29 17:15 ` Jan Kara
@ 2013-04-01 8:25 ` Kazuya Mio
2013-04-09 15:40 ` Jan Kara
From: Kazuya Mio @ 2013-04-01 8:25 UTC (permalink / raw)
To: Jan Kara; +Cc: akpm, adilger.kernel, linux-ext4, linux-fsdevel
2013/03/30 2:15, Jan Kara wrote:
> Sorry for not getting back to you earlier. So we agree in our analysis. Do
> you agree with the attached patch? Does it help your workload?
With your two patches applied, direct I/O works in the following steps:
1. add sdio->cur_page to sdio->bio via dio_new_bio() or dio_bio_add_page()
2. submit sdio->bio if sdio->boundary is set
3. set the current page as sdio->cur_page in submit_page_section()
One page is still submitted separately, because the bio is submitted before
step 3. For example, when we write 52KB of data with O_DIRECT, the filesystem
would ideally submit two bios (48KB and 4KB). However, after applying your
two patches, the 52KB of data is split into three bios (44KB, 4KB and 4KB).
Regards,
Kazuya Mio
* Re: bio splits unnecessarily due to BH_Boundary in ext3 direct I/O
2013-04-01 8:25 ` Kazuya Mio
@ 2013-04-09 15:40 ` Jan Kara
2013-04-10 2:59 ` Kazuya Mio
From: Jan Kara @ 2013-04-09 15:40 UTC (permalink / raw)
To: Kazuya Mio; +Cc: Jan Kara, akpm, adilger.kernel, linux-ext4, linux-fsdevel
[-- Attachment #1: Type: text/plain, Size: 1201 bytes --]
On Mon 01-04-13 17:25:56, Kazuya Mio wrote:
> 2013/03/30 2:15, Jan Kara wrote:
> > Sorry for not getting back to you earlier. So we agree in our analysis. Do
> >you agree with the attached patch? Does it help your workload?
>
> With your two patches applied, direct I/O works in the following steps:
> 1. add sdio->cur_page to sdio->bio via dio_new_bio() or dio_bio_add_page()
> 2. submit sdio->bio if sdio->boundary is set
> 3. set the current page as sdio->cur_page in submit_page_section()
>
> One page is still submitted separately, because the bio is submitted before
> step 3. For example, when we write 52KB of data with O_DIRECT, the filesystem
> would ideally submit two bios (48KB and 4KB). However, after applying your
> two patches, the 52KB of data is split into three bios (44KB, 4KB and 4KB).
Ah, thanks for pointing that out. It took me a while to get things
right, but with the attached patch (on top of the first patch) I'm
getting all bios at the maximum size. I've also checked that for direct
IO on a ramdisk, I get about 3x faster IO on ext3 with both patches applied.
I also get a decent 10% speedup of dio reads on a standard harddrive.
Honza
--
Jan Kara <jack@suse.cz>
SUSE Labs, CR
[-- Attachment #2: 0001-direct-io-Submit-bio-after-boundary-buffer-is-added-.patch --]
[-- Type: text/x-patch, Size: 2527 bytes --]
From 8e0ae2a449f996e0ea5ad865d3147ee6758743aa Mon Sep 17 00:00:00 2001
From: Jan Kara <jack@suse.cz>
Date: Fri, 29 Mar 2013 18:05:01 +0100
Subject: [PATCH] direct-io: Submit bio after boundary buffer is added to it
Currently, dio_send_cur_page() submits the bio before the current page,
cached as sdio->cur_page, is added to it if sdio->boundary is set. This
is actually wrong, because sdio->boundary means the current buffer is the
last one before metadata needs to be read. So we should rather submit
the bio after the current page is added to it.
Reported-by: Kazuya Mio <k-mio@sx.jp.nec.com>
Signed-off-by: Jan Kara <jack@suse.cz>
---
fs/direct-io.c | 28 +++++++++++-----------------
1 files changed, 11 insertions(+), 17 deletions(-)
diff --git a/fs/direct-io.c b/fs/direct-io.c
index e666854..c57add7 100644
--- a/fs/direct-io.c
+++ b/fs/direct-io.c
@@ -672,12 +672,6 @@ static inline int dio_send_cur_page(struct dio *dio, struct dio_submit *sdio,
if (sdio->final_block_in_bio != sdio->cur_page_block ||
cur_offset != bio_next_offset)
dio_bio_submit(dio, sdio);
- /*
- * Submit now if the underlying fs is about to perform a
- * metadata read
- */
- else if (sdio->boundary)
- dio_bio_submit(dio, sdio);
}
if (sdio->bio == NULL) {
@@ -737,16 +731,6 @@ submit_page_section(struct dio *dio, struct dio_submit *sdio, struct page *page,
sdio->cur_page_block +
(sdio->cur_page_len >> sdio->blkbits) == blocknr) {
sdio->cur_page_len += len;
-
- /*
- * If sdio->boundary then we want to schedule the IO now to
- * avoid metadata seeks.
- */
- if (sdio->boundary) {
- ret = dio_send_cur_page(dio, sdio, map_bh);
- page_cache_release(sdio->cur_page);
- sdio->cur_page = NULL;
- }
goto out;
}
@@ -758,7 +742,7 @@ submit_page_section(struct dio *dio, struct dio_submit *sdio, struct page *page,
page_cache_release(sdio->cur_page);
sdio->cur_page = NULL;
if (ret)
- goto out;
+ return ret;
}
page_cache_get(page); /* It is in dio */
@@ -768,6 +752,16 @@ submit_page_section(struct dio *dio, struct dio_submit *sdio, struct page *page,
sdio->cur_page_block = blocknr;
sdio->cur_page_fs_offset = sdio->block_in_file << sdio->blkbits;
out:
+ /*
+ * If sdio->boundary then we want to schedule the IO now to
+ * avoid metadata seeks.
+ */
+ if (sdio->boundary) {
+ ret = dio_send_cur_page(dio, sdio, map_bh);
+ dio_bio_submit(dio, sdio);
+ page_cache_release(sdio->cur_page);
+ sdio->cur_page = NULL;
+ }
return ret;
}
--
1.7.1
* Re: bio splits unnecessarily due to BH_Boundary in ext3 direct I/O
2013-04-09 15:40 ` Jan Kara
@ 2013-04-10 2:59 ` Kazuya Mio
From: Kazuya Mio @ 2013-04-10 2:59 UTC (permalink / raw)
To: Jan Kara; +Cc: akpm, adilger.kernel, linux-ext4, linux-fsdevel
2013/04/10 0:40, Jan Kara wrote:
> Ah, thanks for pointing that out. It took me a while to get things
> right, but with the attached patch (on top of the first patch) I'm
> getting all bios at the maximum size. I've also checked that for direct
> IO on a ramdisk, I get about 3x faster IO on ext3 with both patches applied.
> I also get a decent 10% speedup of dio reads on a standard harddrive.
I have confirmed that the problem I reported is completely fixed.
Thank you for your help.
Regards,
Kazuya Mio