linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] lock_page() doesn't lock if __wait_on_bit_lock returns -EINTR
@ 2015-12-12 16:23 Chris Mason
  2015-12-12 18:33 ` Linus Torvalds
  0 siblings, 1 reply; 11+ messages in thread
From: Chris Mason @ 2015-12-12 16:23 UTC (permalink / raw)
  To: Linus Torvalds, Peter Zijlstra, Dave Jones, LKML,
	Jon Christopherson

We have two reports of frequent crashes in btrfs where asserts in
clear_page_dirty_for_io() were triggering on missing page locks.

The crashes were much easier to trigger when processes were catching
ctrl-c's, and after much debugging it really looked like lock_page was a
noop.

This recent commit looks pretty suspect to me, and I confirmed that we
were exiting __wait_on_bit_lock() with -EINTR when it was called with
TASK_UNINTERRUPTIBLE

commit 68985633bccb6066bf1803e316fbc6c1f5b796d6
Author: Peter Zijlstra <peterz@infradead.org>
Date:   Tue Dec 1 14:04:04 2015 +0100

    sched/wait: Fix signal handling in bit wait helpers

The patch below is mostly untested, and probably not the right solution.
Dave's trinity run doesn't explode immediately anymore, and I wanted to
get this out for discussion.  A quick look on the list doesn't show
anyone else has tracked this down, sorry if it's a dup.

Reported-by: Dave Jones <dsj@fb.com>, 
Reported-by: Jon Christopherson <jon@jons.org>
Signed-off-by: Chris Mason <clm@fb.com>

diff --git a/kernel/sched/wait.c b/kernel/sched/wait.c
index f10bd87..12f69df 100644
--- a/kernel/sched/wait.c
+++ b/kernel/sched/wait.c
@@ -434,6 +434,8 @@ __wait_on_bit_lock(wait_queue_head_t *wq, struct wait_bit_queue *q,
 		ret = action(&q->key);
 		if (!ret)
 			continue;
+		if (ret == -EINTR && mode == TASK_UNINTERRUPTIBLE)
+			continue;
 		abort_exclusive_wait(wq, &q->wait, mode, &q->key);
 		return ret;
 	} while (test_and_set_bit(q->key.bit_nr, q->key.flags));

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH] lock_page() doesn't lock if __wait_on_bit_lock returns -EINTR
  2015-12-12 16:23 [PATCH] lock_page() doesn't lock if __wait_on_bit_lock returns -EINTR Chris Mason
@ 2015-12-12 18:33 ` Linus Torvalds
  2015-12-12 19:41   ` Linus Torvalds
  0 siblings, 1 reply; 11+ messages in thread
From: Linus Torvalds @ 2015-12-12 18:33 UTC (permalink / raw)
  To: Chris Mason, Linus Torvalds, Peter Zijlstra, Dave Jones, LKML,
	Jon Christopherson

On Sat, Dec 12, 2015 at 8:23 AM, Chris Mason <clm@fb.com> wrote:
> We have two reports of frequent crashes in btrfs where asserts in
> clear_page_dirty_for_io() were triggering on missing page locks.
>
> The crashes were much easier to trigger when processes were catching
> ctrl-c's, and after much debugging it really looked like lock_page was a
> noop.

Hmm. PeterZ has another patch for wait_ob_bit signal handling pending,
but I *think* that one only affects the "killable()" case.

Peter, did that patch also handle just plain "lock_page()" case?

                     Linus

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] lock_page() doesn't lock if __wait_on_bit_lock returns -EINTR
  2015-12-12 18:33 ` Linus Torvalds
@ 2015-12-12 19:41   ` Linus Torvalds
  2015-12-13  0:07     ` Chris Mason
  2015-12-13  9:50     ` Peter Zijlstra
  0 siblings, 2 replies; 11+ messages in thread
From: Linus Torvalds @ 2015-12-12 19:41 UTC (permalink / raw)
  To: Chris Mason, Peter Zijlstra, Dave Jones, LKML, Jon Christopherson
  Cc: NeilBrown, Ingo Molnar, David Howells, Steven Whitehouse

On Sat, Dec 12, 2015 at 10:33 AM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> Peter, did that patch also handle just plain "lock_page()" case?

Looking more at it, I think this all goes back to commit 743162013d40
("sched: Remove proliferation of wait_on_bit() action functions").

Before that, we had wait_on_page_bit() doing:

        __wait_on_bit(page_waitqueue(page), &wait, sleep_on_page,
TASK_UNINTERRUPTIBLE);

and after that, the "sleep_on_page" got changed to "bit_wait_io".

But that is bogus, because sleep_on_page() used to look like this:

    static int sleep_on_page(void *word)
    {
        io_schedule();
        return 0;
    }

while bit_wait_io() looks like this:

    __sched int bit_wait_io(void *word)
    {
        if (signal_pending_state(current->state, current))
                return 1;
        io_schedule();
        return 0;
    }

which is ok, because as long as the task state is
TASK_UNINTERRUPTIBLE, the whole signal_pending_state() thing turns
into a no-op.

So far, so fine.

However, then commit 68985633bccb ("sched/wait: Fix signal handling in
bit wait helpers") _really_ screwed up, and changed the function to

    __sched int bit_wait(struct wait_bit_key *word)
    {
        schedule();
        if (signal_pending(current))
                return -EINTR;
        return 0;
    }

so now it returns an error when no error should happen. Which in turn
makes __wait_on_bit() exit the bit-wait loop early.

It looks like PeterZ's pending patch should fix this, by passing in
the proper TASK_UNINTERRUPTIBLE to the bit_wait_io function, and going
back to signal_pending_state(). PeterZ, did I follow the history of
this correctly?

                   Linus

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] lock_page() doesn't lock if __wait_on_bit_lock returns -EINTR
  2015-12-12 19:41   ` Linus Torvalds
@ 2015-12-13  0:07     ` Chris Mason
  2015-12-14 18:33       ` Dave Jones
  2015-12-13  9:50     ` Peter Zijlstra
  1 sibling, 1 reply; 11+ messages in thread
From: Chris Mason @ 2015-12-13  0:07 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Peter Zijlstra, Dave Jones, LKML, Jon Christopherson, NeilBrown,
	Ingo Molnar, David Howells, Steven Whitehouse

On Sat, Dec 12, 2015 at 11:41:26AM -0800, Linus Torvalds wrote:
> On Sat, Dec 12, 2015 at 10:33 AM, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> >
> > Peter, did that patch also handle just plain "lock_page()" case?
> 
> Looking more at it, I think this all goes back to commit 743162013d40
> ("sched: Remove proliferation of wait_on_bit() action functions").
> 
> It looks like PeterZ's pending patch should fix this, by passing in
> the proper TASK_UNINTERRUPTIBLE to the bit_wait_io function, and going
> back to signal_pending_state(). PeterZ, did I follow the history of
> this correctly?

Looks right to me, I found Peter's patch and have it running now. After
about 6 hours my patch did eventually crash again under trinity.  Btrfs has a
very old (from 2011) bug in the error handling path that trinity is
banging on.

Doing another run with Peter's patch and btrfs fixed up.  The btrfs patch is
small, but not urgent enough to shove in on Sunday.  I'll send for rc6
along with a few others we've queued up.

-chris

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] lock_page() doesn't lock if __wait_on_bit_lock returns -EINTR
  2015-12-12 19:41   ` Linus Torvalds
  2015-12-13  0:07     ` Chris Mason
@ 2015-12-13  9:50     ` Peter Zijlstra
  2015-12-13 15:55       ` Chris Mason
  2015-12-13 20:51       ` Linus Torvalds
  1 sibling, 2 replies; 11+ messages in thread
From: Peter Zijlstra @ 2015-12-13  9:50 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Chris Mason, Dave Jones, LKML, Jon Christopherson, NeilBrown,
	Ingo Molnar, David Howells, Steven Whitehouse

On Sat, Dec 12, 2015 at 11:41:26AM -0800, Linus Torvalds wrote:
> It looks like PeterZ's pending patch should fix this, by passing in
> the proper TASK_UNINTERRUPTIBLE to the bit_wait_io function, and going
> back to signal_pending_state(). PeterZ, did I follow the history of
> this correctly?

Yes. I made a right mess of things with my 'fix'.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] lock_page() doesn't lock if __wait_on_bit_lock returns -EINTR
  2015-12-13  9:50     ` Peter Zijlstra
@ 2015-12-13 15:55       ` Chris Mason
  2015-12-13 20:51       ` Linus Torvalds
  1 sibling, 0 replies; 11+ messages in thread
From: Chris Mason @ 2015-12-13 15:55 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Linus Torvalds, Dave Jones, LKML, Jon Christopherson, NeilBrown,
	Ingo Molnar, David Howells, Steven Whitehouse

On Sun, Dec 13, 2015 at 10:50:01AM +0100, Peter Zijlstra wrote:
> On Sat, Dec 12, 2015 at 11:41:26AM -0800, Linus Torvalds wrote:
> > It looks like PeterZ's pending patch should fix this, by passing in
> > the proper TASK_UNINTERRUPTIBLE to the bit_wait_io function, and going
> > back to signal_pending_state(). PeterZ, did I follow the history of
> > this correctly?
> 
> Yes. I made a right mess of things with my 'fix'.

Our two fixes together made it overnight.  So at least for the
lock_page() case things are extra super fixed.

-chris

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] lock_page() doesn't lock if __wait_on_bit_lock returns -EINTR
  2015-12-13  9:50     ` Peter Zijlstra
  2015-12-13 15:55       ` Chris Mason
@ 2015-12-13 20:51       ` Linus Torvalds
  2015-12-13 21:12         ` Peter Zijlstra
  1 sibling, 1 reply; 11+ messages in thread
From: Linus Torvalds @ 2015-12-13 20:51 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Chris Mason, Dave Jones, LKML, Jon Christopherson, NeilBrown,
	Ingo Molnar, David Howells, Steven Whitehouse

On Sun, Dec 13, 2015 at 1:50 AM, Peter Zijlstra <peterz@infradead.org> wrote:
>
> Yes. I made a right mess of things with my 'fix'.

Can I get the latest version of your fix as a patch (or pull request,
I'm not picky)? I'd like to get this fixed for 4.4-rc5, which is
supposed to go out today..

                   Linus

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] lock_page() doesn't lock if __wait_on_bit_lock returns -EINTR
  2015-12-13 20:51       ` Linus Torvalds
@ 2015-12-13 21:12         ` Peter Zijlstra
  0 siblings, 0 replies; 11+ messages in thread
From: Peter Zijlstra @ 2015-12-13 21:12 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Chris Mason, Dave Jones, LKML, Jon Christopherson, NeilBrown,
	Ingo Molnar, David Howells, Steven Whitehouse

On Sun, Dec 13, 2015 at 12:51:08PM -0800, Linus Torvalds wrote:
> On Sun, Dec 13, 2015 at 1:50 AM, Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > Yes. I made a right mess of things with my 'fix'.
> 
> Can I get the latest version of your fix as a patch (or pull request,
> I'm not picky)? I'd like to get this fixed for 4.4-rc5, which is
> supposed to go out today..

On its way (or there already, since I send it before this one). I did an
allmodconfig build with it on your latest just in case.



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] lock_page() doesn't lock if __wait_on_bit_lock returns -EINTR
  2015-12-13  0:07     ` Chris Mason
@ 2015-12-14 18:33       ` Dave Jones
  2015-12-14 20:01         ` Chris Mason
  2015-12-14 23:59         ` Chris Mason
  0 siblings, 2 replies; 11+ messages in thread
From: Dave Jones @ 2015-12-14 18:33 UTC (permalink / raw)
  To: Chris Mason, Linus Torvalds, Peter Zijlstra, LKML,
	Jon Christopherson, NeilBrown, Ingo Molnar, David Howells,
	Steven Whitehouse

On Sat, Dec 12, 2015 at 07:07:46PM -0500, Chris Mason wrote:
 > On Sat, Dec 12, 2015 at 11:41:26AM -0800, Linus Torvalds wrote:
 > > On Sat, Dec 12, 2015 at 10:33 AM, Linus Torvalds
 > > <torvalds@linux-foundation.org> wrote:
 > > >
 > > > Peter, did that patch also handle just plain "lock_page()" case?
 > > 
 > > Looking more at it, I think this all goes back to commit 743162013d40
 > > ("sched: Remove proliferation of wait_on_bit() action functions").
 > > 
 > > It looks like PeterZ's pending patch should fix this, by passing in
 > > the proper TASK_UNINTERRUPTIBLE to the bit_wait_io function, and going
 > > back to signal_pending_state(). PeterZ, did I follow the history of
 > > this correctly?
 > 
 > Looks right to me, I found Peter's patch and have it running now. After
 > about 6 hours my patch did eventually crash again under trinity.  Btrfs has a
 > very old (from 2011) bug in the error handling path that trinity is
 > banging on.

Is the other bug this one ? I've hit this quite a lot over the last 12 months,
and now that the lock_page bug is fixed this is showing up again.

page:ffffea00110d2700 count:4 mapcount:0 mapping:ffff88045b5160a0 index:0x0
flags: 0x8000000000000806(error|referenced|private)
page dumped because: VM_BUG_ON_PAGE(!PageLocked(page))
------------[ cut here ]------------
kernel BUG at mm/filemap.c:812!
invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC KASAN
CPU: 1 PID: 22062 Comm: trinity-c4 Tainted: G        W       4.4.0-rc5-think+ #1
task: ffff8800bce51b80 ti: ffff8803f7210000 task.ti: ffff8803f7210000
RIP: 0010:[<ffffffffad25e3c7>]  [<ffffffffad25e3c7>] unlock_page+0xa7/0xb0
RSP: 0018:ffff8803f72176b8  EFLAGS: 00010292
RAX: fffff9400221a4e0 RBX: 0000000000001000 RCX: 0000000000000000
RDX: dffffc0000000000 RSI: ffffffffad144aee RDI: ffffea00110d2700
RBP: ffff8803f72176d8 R08: 0000000000000001 R09: 0000000000000001
R10: 0000000000000000 R11: 0000000000000001 R12: ffffea00110d2700
R13: 0000000000000000 R14: ffffea00110d2700 R15: 0000000000000000
FS:  00007f82b1f73700(0000) GS:ffff880468a00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000389120 CR3: 000000040bc54000 CR4: 00000000001406e0
Stack:
 0000000000001000 0000000000000000 0000000000000000 ffffea00110d2700
 ffff8803f7217898 ffffffffc010b777 0000000000000000 ffff880302400040
 0000000000000000 ffff8803f7217728 ffff8803f7217728 ffff8803f7217760
Call Trace:
 [<ffffffffc010b777>] __do_readpage+0xa97/0xd80 [btrfs]
 [<ffffffffad13351f>] ? mark_lock+0x6f/0x8a0
 [<ffffffffadcea0ac>] ? _raw_spin_unlock_irq+0x2c/0x50
 [<ffffffffc00db310>] ? btrfs_real_readdir+0x8d0/0x8d0 [btrfs]
 [<ffffffffc010ace0>] ? extent_write_cache_pages.isra.37.constprop.54+0x540/0x540 [btrfs]
 [<ffffffffad153fea>] ? rcu_read_lock_sched_held+0x8a/0xa0
 [<ffffffffad25e9c4>] ? __add_to_page_cache_locked+0x464/0x500
 [<ffffffffc010bb66>] __extent_read_full_page+0x106/0x120 [btrfs]
 [<ffffffffc00db310>] ? btrfs_real_readdir+0x8d0/0x8d0 [btrfs]
 [<ffffffffc010c09d>] extent_read_full_page+0xad/0x110 [btrfs]
 [<ffffffffc010bff0>] ? set_page_extent_mapped+0x30/0x30 [btrfs]
 [<ffffffffad122f70>] ? __wake_up_locked_key+0x20/0x20
 [<ffffffffad25ea80>] ? add_to_page_cache_locked+0x20/0x20
 [<ffffffffad25f4fb>] ? find_get_entry+0x14b/0x270
 [<ffffffffad25f3b5>] ? find_get_entry+0x5/0x270
 [<ffffffffc00d7a00>] btrfs_readpage+0x40/0x50 [btrfs]
 [<ffffffffc00f17f9>] prepare_uptodate_page+0x39/0x80 [btrfs]
 [<ffffffffc00f19de>] prepare_pages+0x19e/0x210 [btrfs]
 [<ffffffffc00f2d21>] __btrfs_buffered_write+0x351/0x8a0 [btrfs]
 [<ffffffffc00f29d0>] ? btrfs_dirty_pages+0xf0/0xf0 [btrfs]
 [<ffffffffad2619aa>] ? generic_file_direct_write+0x1aa/0x2c0
 [<ffffffffad261800>] ? generic_file_read_iter+0xa00/0xa00
 [<ffffffffc00f866d>] btrfs_file_write_iter+0x6dd/0x800 [btrfs]
 [<ffffffffad2f694d>] __vfs_write+0x21d/0x260
 [<ffffffffad2f6730>] ? __vfs_read+0x260/0x260
 [<ffffffffad12ed32>] ? __lock_is_held+0x92/0xd0
 [<ffffffffad0ee3b1>] ? preempt_count_sub+0xc1/0x120
 [<ffffffffad12cd17>] ? percpu_down_read+0x57/0xa0
 [<ffffffffad2fbd24>] ? __sb_start_write+0xb4/0xf0
 [<ffffffffad2f7736>] vfs_write+0xf6/0x260
 [<ffffffffad2f8d4f>] SyS_write+0xbf/0x160
 [<ffffffffad2f8c90>] ? SyS_read+0x160/0x160
 [<ffffffffad002017>] ? trace_hardirqs_on_thunk+0x17/0x19
 [<ffffffffadceab17>] entry_SYSCALL_64_fastpath+0x12/0x6b
Code: 48 8d 14 80 48 8d 04 50 31 d2 49 8d 3c c6 e8 c1 4b ec ff 5b 41 5c 41 5d 41 5e 5d c3 48 c7 c6 e0 3a ec ad 4c 89 e7 e8 49 22 04 00 <0f> 0b 0f 1f 80 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 53 48 89 
RIP  [<ffffffffad25e3c7>] unlock_page+0xa7/0xb0
 RSP <ffff8803f72176b8>
---[ end trace 5d1ded6801ca2e6c ]---


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] lock_page() doesn't lock if __wait_on_bit_lock returns -EINTR
  2015-12-14 18:33       ` Dave Jones
@ 2015-12-14 20:01         ` Chris Mason
  2015-12-14 23:59         ` Chris Mason
  1 sibling, 0 replies; 11+ messages in thread
From: Chris Mason @ 2015-12-14 20:01 UTC (permalink / raw)
  To: Dave Jones, Linus Torvalds, Peter Zijlstra, LKML,
	Jon Christopherson, NeilBrown, Ingo Molnar, David Howells,
	Steven Whitehouse

On Mon, Dec 14, 2015 at 01:33:56PM -0500, Dave Jones wrote:
> On Sat, Dec 12, 2015 at 07:07:46PM -0500, Chris Mason wrote:
>  > On Sat, Dec 12, 2015 at 11:41:26AM -0800, Linus Torvalds wrote:
>  > > On Sat, Dec 12, 2015 at 10:33 AM, Linus Torvalds
>  > > <torvalds@linux-foundation.org> wrote:
>  > > >
>  > > > Peter, did that patch also handle just plain "lock_page()" case?
>  > > 
>  > > Looking more at it, I think this all goes back to commit 743162013d40
>  > > ("sched: Remove proliferation of wait_on_bit() action functions").
>  > > 
>  > > It looks like PeterZ's pending patch should fix this, by passing in
>  > > the proper TASK_UNINTERRUPTIBLE to the bit_wait_io function, and going
>  > > back to signal_pending_state(). PeterZ, did I follow the history of
>  > > this correctly?
>  > 
>  > Looks right to me, I found Peter's patch and have it running now. After
>  > about 6 hours my patch did eventually crash again under trinity.  Btrfs has a
>  > very old (from 2011) bug in the error handling path that trinity is
>  > banging on.
> 
> Is the other bug this one ? I've hit this quite a lot over the last 12 months,
> and now that the lock_page bug is fixed this is showing up again.
> 
> page:ffffea00110d2700 count:4 mapcount:0 mapping:ffff88045b5160a0 index:0x0
> flags: 0x8000000000000806(error|referenced|private)
> page dumped because: VM_BUG_ON_PAGE(!PageLocked(page))

[ snip ]

>  [<ffffffffc00f17f9>] prepare_uptodate_page+0x39/0x80 [btrfs]
>  [<ffffffffc00f19de>] prepare_pages+0x19e/0x210 [btrfs]

This should be the second call to prepare_uptodate_page() in
prepare_pages().   If we get an error on the first call, and the write
only spans a single page, we'll call prepare_uptodate_page a second time
on an unlocked page.

I'll send out the patch a little later this afternoon.

>  [<ffffffffc00f2d21>] __btrfs_buffered_write+0x351/0x8a0 [btrfs]
>  [<ffffffffc00f29d0>] ? btrfs_dirty_pages+0xf0/0xf0 [btrfs]
>  [<ffffffffad2619aa>] ? generic_file_direct_write+0x1aa/0x2c0
>  [<ffffffffad261800>] ? generic_file_read_iter+0xa00/0xa00
>  [<ffffffffc00f866d>] btrfs_file_write_iter+0x6dd/0x800 [btrfs]
>  [<ffffffffad2f694d>] __vfs_write+0x21d/0x260
>  [<ffffffffad2f6730>] ? __vfs_read+0x260/0x260
>  [<ffffffffad12ed32>] ? __lock_is_held+0x92/0xd0
>  [<ffffffffad0ee3b1>] ? preempt_count_sub+0xc1/0x120
>  [<ffffffffad12cd17>] ? percpu_down_read+0x57/0xa0
>  [<ffffffffad2fbd24>] ? __sb_start_write+0xb4/0xf0
>  [<ffffffffad2f7736>] vfs_write+0xf6/0x260
>  [<ffffffffad2f8d4f>] SyS_write+0xbf/0x160
>  [<ffffffffad2f8c90>] ? SyS_read+0x160/0x160
>  [<ffffffffad002017>] ? trace_hardirqs_on_thunk+0x17/0x19
>  [<ffffffffadceab17>] entry_SYSCALL_64_fastpath+0x12/0x6b

-chris

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] lock_page() doesn't lock if __wait_on_bit_lock returns -EINTR
  2015-12-14 18:33       ` Dave Jones
  2015-12-14 20:01         ` Chris Mason
@ 2015-12-14 23:59         ` Chris Mason
  1 sibling, 0 replies; 11+ messages in thread
From: Chris Mason @ 2015-12-14 23:59 UTC (permalink / raw)
  To: Dave Jones, Linus Torvalds, Peter Zijlstra, LKML,
	Jon Christopherson, NeilBrown, Ingo Molnar, David Howells,
	Steven Whitehouse

On Mon, Dec 14, 2015 at 01:33:56PM -0500, Dave Jones wrote:
> On Sat, Dec 12, 2015 at 07:07:46PM -0500, Chris Mason wrote:
>  > On Sat, Dec 12, 2015 at 11:41:26AM -0800, Linus Torvalds wrote:
>  > > On Sat, Dec 12, 2015 at 10:33 AM, Linus Torvalds
>  > > <torvalds@linux-foundation.org> wrote:
>  > > >
>  > > > Peter, did that patch also handle just plain "lock_page()" case?
>  > > 
>  > > Looking more at it, I think this all goes back to commit 743162013d40
>  > > ("sched: Remove proliferation of wait_on_bit() action functions").
>  > > 
>  > > It looks like PeterZ's pending patch should fix this, by passing in
>  > > the proper TASK_UNINTERRUPTIBLE to the bit_wait_io function, and going
>  > > back to signal_pending_state(). PeterZ, did I follow the history of
>  > > this correctly?
>  > 
>  > Looks right to me, I found Peter's patch and have it running now. After
>  > about 6 hours my patch did eventually crash again under trinity.  Btrfs has a
>  > very old (from 2011) bug in the error handling path that trinity is
>  > banging on.
> 
> Is the other bug this one ? I've hit this quite a lot over the last 12 months,
> and now that the lock_page bug is fixed this is showing up again.

Linus, I'll send this in a pull request, but just to close the loop in
this thread:

From: Chris Mason <clm@fb.com>
Subject: [PATCH] Btrfs: check prepare_uptodate_page() error code earlier

prepare_pages() may end up calling prepare_uptodate_page() twice if our
write only spans a single page.  But if the first call returns an error,
our page will be unlocked and its not safe to call it again.

This bug goes all the way back to 2011, and it's not something commonly
hit.

While we're here, add a more explicit check for the page being truncated
away.  The bare lock_page() alone is protected only by good thoughts and
i_mutex, which we're sure to regret eventually.

Reported-by: Dave Jones <dsj@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
---
 fs/btrfs/file.c | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index 72e7346..0f09526 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -1291,7 +1291,8 @@ out:
  * on error we return an unlocked page and the error value
  * on success we return a locked page and 0
  */
-static int prepare_uptodate_page(struct page *page, u64 pos,
+static int prepare_uptodate_page(struct inode *inode,
+				 struct page *page, u64 pos,
 				 bool force_uptodate)
 {
 	int ret = 0;
@@ -1306,6 +1307,10 @@ static int prepare_uptodate_page(struct page *page, u64 pos,
 			unlock_page(page);
 			return -EIO;
 		}
+		if (page->mapping != inode->i_mapping) {
+			unlock_page(page);
+			return -EAGAIN;
+		}
 	}
 	return 0;
 }
@@ -1324,6 +1329,7 @@ static noinline int prepare_pages(struct inode *inode, struct page **pages,
 	int faili;
 
 	for (i = 0; i < num_pages; i++) {
+again:
 		pages[i] = find_or_create_page(inode->i_mapping, index + i,
 					       mask | __GFP_WRITE);
 		if (!pages[i]) {
@@ -1333,13 +1339,17 @@ static noinline int prepare_pages(struct inode *inode, struct page **pages,
 		}
 
 		if (i == 0)
-			err = prepare_uptodate_page(pages[i], pos,
+			err = prepare_uptodate_page(inode, pages[i], pos,
 						    force_uptodate);
-		if (i == num_pages - 1)
-			err = prepare_uptodate_page(pages[i],
+		if (!err && i == num_pages - 1)
+			err = prepare_uptodate_page(inode, pages[i],
 						    pos + write_bytes, false);
 		if (err) {
 			page_cache_release(pages[i]);
+			if (err == -EAGAIN) {
+				err = 0;
+				goto again;
+			}
 			faili = i - 1;
 			goto fail;
 		}
-- 
2.4.6


^ permalink raw reply related	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2015-12-15  0:00 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-12-12 16:23 [PATCH] lock_page() doesn't lock if __wait_on_bit_lock returns -EINTR Chris Mason
2015-12-12 18:33 ` Linus Torvalds
2015-12-12 19:41   ` Linus Torvalds
2015-12-13  0:07     ` Chris Mason
2015-12-14 18:33       ` Dave Jones
2015-12-14 20:01         ` Chris Mason
2015-12-14 23:59         ` Chris Mason
2015-12-13  9:50     ` Peter Zijlstra
2015-12-13 15:55       ` Chris Mason
2015-12-13 20:51       ` Linus Torvalds
2015-12-13 21:12         ` Peter Zijlstra

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).