* [RESEND PATCH] writeback: Judge bdi->dev when set worker desc in bdi_writeback_workfn.
@ 2013-09-10 0:11 majianpeng
2013-09-25 8:08 ` majianpeng
0 siblings, 1 reply; 6+ messages in thread
From: majianpeng @ 2013-09-10 0:11 UTC (permalink / raw)
To: tj; +Cc: axboe, linux-fsdevel
Met a oops when remove a busy-writing disk.The kernel messages are:
[ 253.105528] PGD 1366b6067 PUD 136690067 PMD 0
[ 253.105531] Oops: 0000 [#1] SMP
[ 253.105541] CPU: 3 PID: 748 Comm: kworker/u8:2 Tainted: G W 3.11.0+ #183
[ 253.105542] Hardware name: To Be Filled By O.E.M. To Be Filled By
O.E.M./To be filled by O.E.M., BIOS 080015 01/06/2011[ 253.105543]
Workqueue: writeback bdi_writeback_workfn
[ 253.105545] task: ffff880136d98000 ti: ffff880136dfa000 task.ti:ffff880136dfa000
[ 253.105547] RIP: 0010:[<ffffffff811952ca>] [<ffffffff811952ca>] bdi_writeback_workfn+0x3a/0x3c0
[ 253.105548] RSP: 0018:ffff880136dfbcc8 EFLAGS: 00010292
[ 253.105549] RAX: 0000000000000000 RBX: ffff880136cf4580 RCX: 0000000000000000
[ 253.105549] RDX: ffff880136dc1a70 RSI: 0000000000584000 RDI: ffff880136dc1d68
[ 253.105550] RBP: ffff880136dfbd68 R08: 0000000000000000 R09: ffff880136d98750
[ 253.105551] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88013a812000
[ 253.105552] R13: ffff880139d0b900 R14: ffff880136dc1d68 R15: 0000000000000100
[ 253.105553] FS: 0000000000000000(0000) GS:ffff88013b400000(0000) knlGS:0000000000000000
[ 253.105554] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 253.105555] CR2: 0000000000000050 CR3: 00000001366b5000 CR4: 00000000000407e0
[ 253.105555] Stack:
[ 253.105561] ffff880136dc1d68 ffffffff8106546e ffff880136dc1a70 ffff880136cf4580
[ 253.105563] ffff880136dc1d50 0000000000000000 0000000000000000 0000000000000000
[ 253.105565] ffff880136dfbd68 0000000000000246 ffffffff81065463 0000000000000000
[ 253.105566] Call Trace:
[ 253.105575] [<ffffffff8106546e>] ? process_one_work+0x18e/0x500
[ 253.105577] [<ffffffff81065463>] ? process_one_work+0x183/0x500
[ 253.105579] [<ffffffff810654cc>] process_one_work+0x1ec/0x500
[ 253.105580] [<ffffffff81065463>] ? process_one_work+0x183/0x500
[ 253.105582] [<ffffffff81065c52>] worker_thread+0x122/0x380
[ 253.105584] [<ffffffff81065b30>] ? rescuer_thread+0x310/0x310
[ 253.105586] [<ffffffff8106d67b>] kthread+0xdb/0xe0
[ 253.105589] [<ffffffff8106d5a0>] ? flush_kthread_work+0x1c0/0x1c0
[ 253.105591] [<ffffffff816f6c1c>] ret_from_fork+0x7c/0xb0
[ 253.105593] [<ffffffff8106d5a0>] ? flush_kthread_work+0x1c0/0x1c0
[ 253.105610] Code: e5 41 57 41 56 41 55 41 54 53 48 83 ec 78 48 8b 57
e8 48 89 45 80 48 89 bd 60 ff ff ff 48 8b 82 a8 04 00 00 48 89 95 70 ff
ff ff <48> 8b 70 50 48 85 f6 0f 84 68 02 00 00 31 c0 48 c7 c7 6e e2 a2
[ 253.105611] RIP [<ffffffff811952ca>] bdi_writeback_workfn+0x3a/0x3c0
[ 253.105612] RSP <ffff880136dfbcc8>
[ 253.105613] CR2: 0000000000000050
[ 253.105615] ---[ end trace 60404b6c9a2e6b32 ]---
This bug introduced by commit ef3b101925f2170c.
I think the situation is like:
remove disk flush_thread_work
bdi_destroy()
bdi_unregister()
bdi->dev = NULL
bdi_writeback_workfn()
Signed-off-by: Jianpeng Ma <majianpeng@gmail.com>
---
fs/fs-writeback.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index 68851ff..9beb967 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -1007,7 +1007,8 @@ void bdi_writeback_workfn(struct work_struct *work)
struct backing_dev_info *bdi = wb->bdi;
long pages_written;
- set_worker_desc("flush-%s", dev_name(bdi->dev));
+ set_worker_desc("flush-%s", bdi->dev ?
+ dev_name(bdi->dev) : bdi->name);
current->flags |= PF_SWAPWRITE;
if (likely(!current_is_workqueue_rescuer() ||
--
1.8.1.2
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [RESEND PATCH] writeback: Judge bdi->dev when set worker desc in bdi_writeback_workfn.
2013-09-10 0:11 [RESEND PATCH] writeback: Judge bdi->dev when set worker desc in bdi_writeback_workfn majianpeng
@ 2013-09-25 8:08 ` majianpeng
2013-09-25 22:43 ` Dave Chinner
0 siblings, 1 reply; 6+ messages in thread
From: majianpeng @ 2013-09-25 8:08 UTC (permalink / raw)
To: tj; +Cc: axboe, linux-fsdevel
Hi,
How about this patch?
This bug can easily to reproduce.
dd if=/dev/zero of=/dev/sdb bs=64k
For a while, remove the disk. At my machine, it at most 100% occured.
Thanks!
Jianpeng Ma
>Met a oops when remove a busy-writing disk.The kernel messages are:
>[ 253.105528] PGD 1366b6067 PUD 136690067 PMD 0
>[ 253.105531] Oops: 0000 [#1] SMP
>[ 253.105541] CPU: 3 PID: 748 Comm: kworker/u8:2 Tainted: G W 3.11.0+ #183
>[ 253.105542] Hardware name: To Be Filled By O.E.M. To Be Filled By
>O.E.M./To be filled by O.E.M., BIOS 080015 01/06/2011[ 253.105543]
>Workqueue: writeback bdi_writeback_workfn
>[ 253.105545] task: ffff880136d98000 ti: ffff880136dfa000 task.ti:ffff880136dfa000
>[ 253.105547] RIP: 0010:[<ffffffff811952ca>] [<ffffffff811952ca>] bdi_writeback_workfn+0x3a/0x3c0
>[ 253.105548] RSP: 0018:ffff880136dfbcc8 EFLAGS: 00010292
>[ 253.105549] RAX: 0000000000000000 RBX: ffff880136cf4580 RCX: 0000000000000000
>[ 253.105549] RDX: ffff880136dc1a70 RSI: 0000000000584000 RDI: ffff880136dc1d68
>[ 253.105550] RBP: ffff880136dfbd68 R08: 0000000000000000 R09: ffff880136d98750
>[ 253.105551] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88013a812000
>[ 253.105552] R13: ffff880139d0b900 R14: ffff880136dc1d68 R15: 0000000000000100
>[ 253.105553] FS: 0000000000000000(0000) GS:ffff88013b400000(0000) knlGS:0000000000000000
>[ 253.105554] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
>[ 253.105555] CR2: 0000000000000050 CR3: 00000001366b5000 CR4: 00000000000407e0
>[ 253.105555] Stack:
>[ 253.105561] ffff880136dc1d68 ffffffff8106546e ffff880136dc1a70 ffff880136cf4580
>[ 253.105563] ffff880136dc1d50 0000000000000000 0000000000000000 0000000000000000
>[ 253.105565] ffff880136dfbd68 0000000000000246 ffffffff81065463 0000000000000000
>[ 253.105566] Call Trace:
>[ 253.105575] [<ffffffff8106546e>] ? process_one_work+0x18e/0x500
>[ 253.105577] [<ffffffff81065463>] ? process_one_work+0x183/0x500
>[ 253.105579] [<ffffffff810654cc>] process_one_work+0x1ec/0x500
>[ 253.105580] [<ffffffff81065463>] ? process_one_work+0x183/0x500
>[ 253.105582] [<ffffffff81065c52>] worker_thread+0x122/0x380
>[ 253.105584] [<ffffffff81065b30>] ? rescuer_thread+0x310/0x310
>[ 253.105586] [<ffffffff8106d67b>] kthread+0xdb/0xe0
>[ 253.105589] [<ffffffff8106d5a0>] ? flush_kthread_work+0x1c0/0x1c0
>[ 253.105591] [<ffffffff816f6c1c>] ret_from_fork+0x7c/0xb0
>[ 253.105593] [<ffffffff8106d5a0>] ? flush_kthread_work+0x1c0/0x1c0
>[ 253.105610] Code: e5 41 57 41 56 41 55 41 54 53 48 83 ec 78 48 8b 57
>e8 48 89 45 80 48 89 bd 60 ff ff ff 48 8b 82 a8 04 00 00 48 89 95 70 ff
>ff ff <48> 8b 70 50 48 85 f6 0f 84 68 02 00 00 31 c0 48 c7 c7 6e e2 a2
>[ 253.105611] RIP [<ffffffff811952ca>] bdi_writeback_workfn+0x3a/0x3c0
>[ 253.105612] RSP <ffff880136dfbcc8>
>[ 253.105613] CR2: 0000000000000050
>[ 253.105615] ---[ end trace 60404b6c9a2e6b32 ]---
>
>This bug introduced by commit ef3b101925f2170c.
>I think the situation is like:
>remove disk flush_thread_work
>bdi_destroy()
> bdi_unregister()
> bdi->dev = NULL
> bdi_writeback_workfn()
>
>Signed-off-by: Jianpeng Ma <majianpeng@gmail.com>
>---
> fs/fs-writeback.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
>diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
>index 68851ff..9beb967 100644
>--- a/fs/fs-writeback.c
>+++ b/fs/fs-writeback.c
>@@ -1007,7 +1007,8 @@ void bdi_writeback_workfn(struct work_struct *work)
> struct backing_dev_info *bdi = wb->bdi;
> long pages_written;
>
>- set_worker_desc("flush-%s", dev_name(bdi->dev));
>+ set_worker_desc("flush-%s", bdi->dev ?
>+ dev_name(bdi->dev) : bdi->name);
> current->flags |= PF_SWAPWRITE;
>
> if (likely(!current_is_workqueue_rescuer() ||
>--
>1.8.1.2
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [RESEND PATCH] writeback: Judge bdi->dev when set worker desc in bdi_writeback_workfn.
2013-09-25 8:08 ` majianpeng
@ 2013-09-25 22:43 ` Dave Chinner
2013-09-26 13:35 ` tj
2013-09-26 17:47 ` Christoph Hellwig
0 siblings, 2 replies; 6+ messages in thread
From: Dave Chinner @ 2013-09-25 22:43 UTC (permalink / raw)
To: majianpeng; +Cc: tj, axboe, linux-fsdevel
On Wed, Sep 25, 2013 at 04:08:54PM +0800, majianpeng wrote:
> Hi,
> How about this patch?
> This bug can easily to reproduce.
> dd if=/dev/zero of=/dev/sdb bs=64k
> For a while, remove the disk. At my machine, it at most 100% occured.
...
> >
> >This bug introduced by commit ef3b101925f2170c.
> >I think the situation is like:
> >remove disk flush_thread_work
> >bdi_destroy()
> > bdi_unregister()
> > bdi->dev = NULL
> > bdi_writeback_workfn()
You're just papering over the larger problem, in that the writeback
work is running concurrently with the bdi_unregister() function that
is tearing the bdi down. You should try to fix the underlying race
condition, as documented in bdi_destroy.
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [RESEND PATCH] writeback: Judge bdi->dev when set worker desc in bdi_writeback_workfn.
2013-09-25 22:43 ` Dave Chinner
@ 2013-09-26 13:35 ` tj
2013-09-26 17:47 ` Christoph Hellwig
1 sibling, 0 replies; 6+ messages in thread
From: tj @ 2013-09-26 13:35 UTC (permalink / raw)
To: Dave Chinner; +Cc: majianpeng, axboe, linux-fsdevel
Hey,
On Thu, Sep 26, 2013 at 08:43:51AM +1000, Dave Chinner wrote:
> On Wed, Sep 25, 2013 at 04:08:54PM +0800, majianpeng wrote:
> > Hi,
> > How about this patch?
> > This bug can easily to reproduce.
> > dd if=/dev/zero of=/dev/sdb bs=64k
> > For a while, remove the disk. At my machine, it at most 100% occured.
>
> ...
> > >
> > >This bug introduced by commit ef3b101925f2170c.
> > >I think the situation is like:
> > >remove disk flush_thread_work
> > >bdi_destroy()
> > > bdi_unregister()
> > > bdi->dev = NULL
> > > bdi_writeback_workfn()
>
> You're just papering over the larger problem, in that the writeback
> work is running concurrently with the bdi_unregister() function that
> is tearing the bdi down. You should try to fix the underlying race
> condition, as documented in bdi_destroy.
Given that it's a fix for a reliably reproducible oops, I think
papering over is better than nothing as long as it's clearly labeled
as such. Jianpeng, can you please make it clear both in the
description and patch that this is a bandaid for now and bdi shutdown
sequence needs to be fixed properly later?
Thanks.
--
tejun
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [RESEND PATCH] writeback: Judge bdi->dev when set worker desc in bdi_writeback_workfn.
2013-09-25 22:43 ` Dave Chinner
2013-09-26 13:35 ` tj
@ 2013-09-26 17:47 ` Christoph Hellwig
2013-09-26 22:59 ` Dave Chinner
1 sibling, 1 reply; 6+ messages in thread
From: Christoph Hellwig @ 2013-09-26 17:47 UTC (permalink / raw)
To: Dave Chinner; +Cc: majianpeng, tj, axboe, linux-fsdevel
On Thu, Sep 26, 2013 at 08:43:51AM +1000, Dave Chinner wrote:
> You're just papering over the larger problem, in that the writeback
> work is running concurrently with the bdi_unregister() function that
> is tearing the bdi down. You should try to fix the underlying race
> condition, as documented in bdi_destroy.
The problem is even worse than that, and it's the lack of a proper
refcounting on the bdi as seen by gems like bdi_prune_sb and the moving
of writeback requests in bdi_destroy. The right fix is to dynamically
allocate the bdi (or at least the bdi_writeback), and make sure that we
keep it around as long as a filesystem and thus the writeback code
refers to. Then a block device going away can just set a flag to stop
writeback from trying instead of having the bdi ripped out underneath
it which is guaranteed to fail and historically has failed in a large
number of misterious ways.
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [RESEND PATCH] writeback: Judge bdi->dev when set worker desc in bdi_writeback_workfn.
2013-09-26 17:47 ` Christoph Hellwig
@ 2013-09-26 22:59 ` Dave Chinner
0 siblings, 0 replies; 6+ messages in thread
From: Dave Chinner @ 2013-09-26 22:59 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: majianpeng, tj, axboe, linux-fsdevel
On Thu, Sep 26, 2013 at 10:47:02AM -0700, Christoph Hellwig wrote:
> On Thu, Sep 26, 2013 at 08:43:51AM +1000, Dave Chinner wrote:
> > You're just papering over the larger problem, in that the writeback
> > work is running concurrently with the bdi_unregister() function that
> > is tearing the bdi down. You should try to fix the underlying race
> > condition, as documented in bdi_destroy.
>
> The problem is even worse than that, and it's the lack of a proper
> refcounting on the bdi as seen by gems like bdi_prune_sb and the moving
> of writeback requests in bdi_destroy. The right fix is to dynamically
> allocate the bdi (or at least the bdi_writeback), and make sure that we
> keep it around as long as a filesystem and thus the writeback code
> refers to. Then a block device going away can just set a flag to stop
> writeback from trying instead of having the bdi ripped out underneath
> it which is guaranteed to fail and historically has failed in a large
> number of misterious ways.
Yup, be nice to see this rat-hole cleaned up properly, especially
those silly "pull disk, unplug-runs-sync_fs, fs hangs trying to do
IO to the disk that just got unplugged" bugs we get reported every
so often.
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2013-09-26 22:59 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-09-10 0:11 [RESEND PATCH] writeback: Judge bdi->dev when set worker desc in bdi_writeback_workfn majianpeng
2013-09-25 8:08 ` majianpeng
2013-09-25 22:43 ` Dave Chinner
2013-09-26 13:35 ` tj
2013-09-26 17:47 ` Christoph Hellwig
2013-09-26 22:59 ` Dave Chinner
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).