linux-ext4.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] ext4: Don't call sb_issue_discard() in ext4_free_blocks()
@ 2010-11-08  6:06 Theodore Ts'o
  2010-11-09 16:36 ` Eric Sandeen
  0 siblings, 1 reply; 4+ messages in thread
From: Theodore Ts'o @ 2010-11-08  6:06 UTC (permalink / raw)
  To: Ext4 Developers List; +Cc: Theodore Ts'o, jiayingz

Commit 5c521830cf (ext4: Support discard requests when running in
no-journal mode) attempts to add sb_issue_discard() for data blocks
(in data=writeback mode) and in no-journal mode.  Unfortunately, this
no longer works, because in commit dd3932eddf (block: remove
BLKDEV_IFL_WAIT), sb_issue_discard() only presents a synchronous
interface, and there are times when we call ext4_free_blocks() when we
are are holding a spinlock, or are otherwise in an atomic context.

For now, I've removed the call to sb_issue_discard() to prevent a
deadlock or (if spinlock debugging is enabled) failures like this:

BUG: scheduling while atomic: rc.sysinit/1376/0x00000002
Pid: 1376, comm: rc.sysinit Not tainted 2.6.36-ARCH #1
Call Trace:
[<ffffffff810397ce>] __schedule_bug+0x5e/0x70
[<ffffffff81403110>] schedule+0x950/0xa70
[<ffffffff81060bad>] ? insert_work+0x7d/0x90
[<ffffffff81060fbd>] ? queue_work_on+0x1d/0x30
[<ffffffff81061127>] ? queue_work+0x37/0x60
[<ffffffff8140377d>] schedule_timeout+0x21d/0x360
[<ffffffff812031c3>] ? generic_make_request+0x2c3/0x540
[<ffffffff81402680>] wait_for_common+0xc0/0x150
[<ffffffff81041490>] ? default_wake_function+0x0/0x10
[<ffffffff812034bc>] ? submit_bio+0x7c/0x100
[<ffffffff810680a0>] ? wake_bit_function+0x0/0x40
[<ffffffff814027b8>] wait_for_completion+0x18/0x20
[<ffffffff8120a969>] blkdev_issue_discard+0x1b9/0x210
[<ffffffff811ba03e>] ext4_free_blocks+0x68e/0xb60
[<ffffffff811b1650>] ? __ext4_handle_dirty_metadata+0x110/0x120
[<ffffffff811b098c>] ext4_ext_truncate+0x8cc/0xa70
[<ffffffff810d713e>] ? pagevec_lookup+0x1e/0x30
[<ffffffff81191618>] ext4_truncate+0x178/0x5d0
[<ffffffff810eacbb>] ? unmap_mapping_range+0xab/0x280
[<ffffffff810d8976>] vmtruncate+0x56/0x70
[<ffffffff811925cb>] ext4_setattr+0x14b/0x460
[<ffffffff811319e4>] notify_change+0x194/0x380
[<ffffffff81117f80>] do_truncate+0x60/0x90
[<ffffffff811e08fa>] ? security_inode_permission+0x1a/0x20
[<ffffffff811eaec1>] ? tomoyo_path_truncate+0x11/0x20
[<ffffffff81127539>] do_last+0x5d9/0x770
[<ffffffff811278bd>] do_filp_open+0x1ed/0x680
[<ffffffff8140644f>] ? page_fault+0x1f/0x30
[<ffffffff81132bfc>] ? alloc_fd+0xec/0x140
[<ffffffff81118db1>] do_sys_open+0x61/0x120
[<ffffffff81118e8b>] sys_open+0x1b/0x20
[<ffffffff81002e6b>] system_call_fastpath+0x16/0x1b

https://bugzilla.kernel.org/show_bug.cgi?id=22302

Reported-by: Mathias Burén <mathias.buren@gmail.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Cc: jiayingz@google.com
---
 fs/ext4/mballoc.c |    2 --
 1 files changed, 0 insertions(+), 2 deletions(-)

diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index c58eba3..5b4d4e3 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -4640,8 +4640,6 @@ do_more:
 		 * with group lock held. generate_buddy look at
 		 * them with group lock_held
 		 */
-		if (test_opt(sb, DISCARD))
-			ext4_issue_discard(sb, block_group, bit, count);
 		ext4_lock_group(sb, block_group);
 		mb_clear_bits(bitmap_bh->b_data, bit, count);
 		mb_free_blocks(inode, &e4b, bit, count);
-- 
1.7.3.1

--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH] ext4: Don't call sb_issue_discard() in ext4_free_blocks()
  2010-11-08  6:06 [PATCH] ext4: Don't call sb_issue_discard() in ext4_free_blocks() Theodore Ts'o
@ 2010-11-09 16:36 ` Eric Sandeen
  2010-11-09 18:01   ` Jiaying Zhang
       [not found]   ` <AANLkTikmExLtH0-LKEvH=WoL=o550EJQuO33dRsoFNWB@mail.gmail.com>
  0 siblings, 2 replies; 4+ messages in thread
From: Eric Sandeen @ 2010-11-09 16:36 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: Ext4 Developers List, jiayingz

On 11/8/10 12:06 AM, Theodore Ts'o wrote:
> Commit 5c521830cf (ext4: Support discard requests when running in
> no-journal mode) attempts to add sb_issue_discard() for data blocks
> (in data=writeback mode) and in no-journal mode.  Unfortunately, this
> no longer works, because in commit dd3932eddf (block: remove
> BLKDEV_IFL_WAIT), sb_issue_discard() only presents a synchronous
> interface, and there are times when we call ext4_free_blocks() when we
> are are holding a spinlock, or are otherwise in an atomic context.
> 
> For now, I've removed the call to sb_issue_discard() to prevent a
> deadlock or (if spinlock debugging is enabled) failures like this:

perhaps mount should fail with the combination of -o discard and no
journal present?

-Eric

> BUG: scheduling while atomic: rc.sysinit/1376/0x00000002
> Pid: 1376, comm: rc.sysinit Not tainted 2.6.36-ARCH #1
> Call Trace:
> [<ffffffff810397ce>] __schedule_bug+0x5e/0x70
> [<ffffffff81403110>] schedule+0x950/0xa70
> [<ffffffff81060bad>] ? insert_work+0x7d/0x90
> [<ffffffff81060fbd>] ? queue_work_on+0x1d/0x30
> [<ffffffff81061127>] ? queue_work+0x37/0x60
> [<ffffffff8140377d>] schedule_timeout+0x21d/0x360
> [<ffffffff812031c3>] ? generic_make_request+0x2c3/0x540
> [<ffffffff81402680>] wait_for_common+0xc0/0x150
> [<ffffffff81041490>] ? default_wake_function+0x0/0x10
> [<ffffffff812034bc>] ? submit_bio+0x7c/0x100
> [<ffffffff810680a0>] ? wake_bit_function+0x0/0x40
> [<ffffffff814027b8>] wait_for_completion+0x18/0x20
> [<ffffffff8120a969>] blkdev_issue_discard+0x1b9/0x210
> [<ffffffff811ba03e>] ext4_free_blocks+0x68e/0xb60
> [<ffffffff811b1650>] ? __ext4_handle_dirty_metadata+0x110/0x120
> [<ffffffff811b098c>] ext4_ext_truncate+0x8cc/0xa70
> [<ffffffff810d713e>] ? pagevec_lookup+0x1e/0x30
> [<ffffffff81191618>] ext4_truncate+0x178/0x5d0
> [<ffffffff810eacbb>] ? unmap_mapping_range+0xab/0x280
> [<ffffffff810d8976>] vmtruncate+0x56/0x70
> [<ffffffff811925cb>] ext4_setattr+0x14b/0x460
> [<ffffffff811319e4>] notify_change+0x194/0x380
> [<ffffffff81117f80>] do_truncate+0x60/0x90
> [<ffffffff811e08fa>] ? security_inode_permission+0x1a/0x20
> [<ffffffff811eaec1>] ? tomoyo_path_truncate+0x11/0x20
> [<ffffffff81127539>] do_last+0x5d9/0x770
> [<ffffffff811278bd>] do_filp_open+0x1ed/0x680
> [<ffffffff8140644f>] ? page_fault+0x1f/0x30
> [<ffffffff81132bfc>] ? alloc_fd+0xec/0x140
> [<ffffffff81118db1>] do_sys_open+0x61/0x120
> [<ffffffff81118e8b>] sys_open+0x1b/0x20
> [<ffffffff81002e6b>] system_call_fastpath+0x16/0x1b
> 
> https://bugzilla.kernel.org/show_bug.cgi?id=22302
> 
> Reported-by: Mathias Burén <mathias.buren@gmail.com>
> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
> Cc: jiayingz@google.com
> ---
>  fs/ext4/mballoc.c |    2 --
>  1 files changed, 0 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
> index c58eba3..5b4d4e3 100644
> --- a/fs/ext4/mballoc.c
> +++ b/fs/ext4/mballoc.c
> @@ -4640,8 +4640,6 @@ do_more:
>  		 * with group lock held. generate_buddy look at
>  		 * them with group lock_held
>  		 */
> -		if (test_opt(sb, DISCARD))
> -			ext4_issue_discard(sb, block_group, bit, count);
>  		ext4_lock_group(sb, block_group);
>  		mb_clear_bits(bitmap_bh->b_data, bit, count);
>  		mb_free_blocks(inode, &e4b, bit, count);

--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] ext4: Don't call sb_issue_discard() in ext4_free_blocks()
  2010-11-09 16:36 ` Eric Sandeen
@ 2010-11-09 18:01   ` Jiaying Zhang
       [not found]   ` <AANLkTikmExLtH0-LKEvH=WoL=o550EJQuO33dRsoFNWB@mail.gmail.com>
  1 sibling, 0 replies; 4+ messages in thread
From: Jiaying Zhang @ 2010-11-09 18:01 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: Theodore Ts'o, Ext4 Developers List

I would like to spend some time to see whether we can add
sb_issue_discard() somewhere else for non-journaled mode.
It is a useful feature to be included for both modes.

Jiaying

On Tue, Nov 9, 2010 at 8:36 AM, Eric Sandeen <sandeen@redhat.com> wrote:
>
> On 11/8/10 12:06 AM, Theodore Ts'o wrote:
> > Commit 5c521830cf (ext4: Support discard requests when running in
> > no-journal mode) attempts to add sb_issue_discard() for data blocks
> > (in data=writeback mode) and in no-journal mode.  Unfortunately, this
> > no longer works, because in commit dd3932eddf (block: remove
> > BLKDEV_IFL_WAIT), sb_issue_discard() only presents a synchronous
> > interface, and there are times when we call ext4_free_blocks() when we
> > are are holding a spinlock, or are otherwise in an atomic context.
> >
> > For now, I've removed the call to sb_issue_discard() to prevent a
> > deadlock or (if spinlock debugging is enabled) failures like this:
>
> perhaps mount should fail with the combination of -o discard and no
> journal present?
>
> -Eric
>
> > BUG: scheduling while atomic: rc.sysinit/1376/0x00000002
> > Pid: 1376, comm: rc.sysinit Not tainted 2.6.36-ARCH #1
> > Call Trace:
> > [<ffffffff810397ce>] __schedule_bug+0x5e/0x70
> > [<ffffffff81403110>] schedule+0x950/0xa70
> > [<ffffffff81060bad>] ? insert_work+0x7d/0x90
> > [<ffffffff81060fbd>] ? queue_work_on+0x1d/0x30
> > [<ffffffff81061127>] ? queue_work+0x37/0x60
> > [<ffffffff8140377d>] schedule_timeout+0x21d/0x360
> > [<ffffffff812031c3>] ? generic_make_request+0x2c3/0x540
> > [<ffffffff81402680>] wait_for_common+0xc0/0x150
> > [<ffffffff81041490>] ? default_wake_function+0x0/0x10
> > [<ffffffff812034bc>] ? submit_bio+0x7c/0x100
> > [<ffffffff810680a0>] ? wake_bit_function+0x0/0x40
> > [<ffffffff814027b8>] wait_for_completion+0x18/0x20
> > [<ffffffff8120a969>] blkdev_issue_discard+0x1b9/0x210
> > [<ffffffff811ba03e>] ext4_free_blocks+0x68e/0xb60
> > [<ffffffff811b1650>] ? __ext4_handle_dirty_metadata+0x110/0x120
> > [<ffffffff811b098c>] ext4_ext_truncate+0x8cc/0xa70
> > [<ffffffff810d713e>] ? pagevec_lookup+0x1e/0x30
> > [<ffffffff81191618>] ext4_truncate+0x178/0x5d0
> > [<ffffffff810eacbb>] ? unmap_mapping_range+0xab/0x280
> > [<ffffffff810d8976>] vmtruncate+0x56/0x70
> > [<ffffffff811925cb>] ext4_setattr+0x14b/0x460
> > [<ffffffff811319e4>] notify_change+0x194/0x380
> > [<ffffffff81117f80>] do_truncate+0x60/0x90
> > [<ffffffff811e08fa>] ? security_inode_permission+0x1a/0x20
> > [<ffffffff811eaec1>] ? tomoyo_path_truncate+0x11/0x20
> > [<ffffffff81127539>] do_last+0x5d9/0x770
> > [<ffffffff811278bd>] do_filp_open+0x1ed/0x680
> > [<ffffffff8140644f>] ? page_fault+0x1f/0x30
> > [<ffffffff81132bfc>] ? alloc_fd+0xec/0x140
> > [<ffffffff81118db1>] do_sys_open+0x61/0x120
> > [<ffffffff81118e8b>] sys_open+0x1b/0x20
> > [<ffffffff81002e6b>] system_call_fastpath+0x16/0x1b
> >
> > https://bugzilla.kernel.org/show_bug.cgi?id=22302
> >
> > Reported-by: Mathias Burén <mathias.buren@gmail.com>
> > Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
> > Cc: jiayingz@google.com
> > ---
> >  fs/ext4/mballoc.c |    2 --
> >  1 files changed, 0 insertions(+), 2 deletions(-)
> >
> > diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
> > index c58eba3..5b4d4e3 100644
> > --- a/fs/ext4/mballoc.c
> > +++ b/fs/ext4/mballoc.c
> > @@ -4640,8 +4640,6 @@ do_more:
> >                * with group lock held. generate_buddy look at
> >                * them with group lock_held
> >                */
> > -             if (test_opt(sb, DISCARD))
> > -                     ext4_issue_discard(sb, block_group, bit, count);
> >               ext4_lock_group(sb, block_group);
> >               mb_clear_bits(bitmap_bh->b_data, bit, count);
> >               mb_free_blocks(inode, &e4b, bit, count);
>
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH] ext4: Don't call sb_issue_discard() in ext4_free_blocks()
       [not found]   ` <AANLkTikmExLtH0-LKEvH=WoL=o550EJQuO33dRsoFNWB@mail.gmail.com>
@ 2010-11-09 20:03     ` Ted Ts'o
  0 siblings, 0 replies; 4+ messages in thread
From: Ted Ts'o @ 2010-11-09 20:03 UTC (permalink / raw)
  To: Jiaying Zhang; +Cc: Eric Sandeen, Ext4 Developers List

On Tue, Nov 09, 2010 at 10:00:37AM -0800, Jiaying Zhang wrote:
> I would like to spend some time to see whether we can add
> sb_issue_discard() somewhere else for non-journaled mode.
> It is a useful feature to be included for both modes.

It certainly can be done, but we'll have to do the trim from a kernel
thread or workqueue context.  (Of course, we need to make sure that
the blocks don't get reused until the trim happens --- or, if we want
to use those blocks, that we take them off the to-be-trimmed list
before we reuse them.)

I am a bit concerned about just adding a new thread, though.
Especially if it's per filesystem, since on a system with a very high
spindle/ext4 file system count, this could get a bit crazy.  It's a
bit better in 2.6.36 now that with concurrency managed workqueues,
there's only one workqueue thread per file system, instead of one
workqueue thread per file system per core (so on a system with 50
spindles and 32 cores, there would be 1600 workqueue threads!).

						- Ted


^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2010-11-09 20:03 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-11-08  6:06 [PATCH] ext4: Don't call sb_issue_discard() in ext4_free_blocks() Theodore Ts'o
2010-11-09 16:36 ` Eric Sandeen
2010-11-09 18:01   ` Jiaying Zhang
     [not found]   ` <AANLkTikmExLtH0-LKEvH=WoL=o550EJQuO33dRsoFNWB@mail.gmail.com>
2010-11-09 20:03     ` Ted Ts'o

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).