public inbox for linux-kernel@vger.kernel.org
From: Oleg Nesterov <oleg@redhat.com>
To: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
	"mingo@kernel.org" <mingo@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Johannes Weiner <hannes@cmpxchg.org>, Neil Brown <neilb@suse.de>,
	Michael Shaver <jmshaver@gmail.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] sched: Avoid that __wait_on_bit_lock() hangs
Date: Tue, 16 Aug 2016 15:06:00 +0200	[thread overview]
Message-ID: <20160816130559.GA14022@redhat.com> (raw)
In-Reply-To: <c2004d9c-1c57-756b-31ad-d88afd459a2b@sandisk.com>

On 08/15, Bart Van Assche wrote:
>
> On 08/13/2016 09:32 AM, Oleg Nesterov wrote:
>> On 08/12, Bart Van Assche wrote:
>>> before I started testing. It took some time
>>> before I could reproduce the hang in truncate_inode_pages_range().
>>
>> all I can say is that this contradicts the previous testing results with
>> my previous patch or with your change in abort_exclusive_wait().
>
> Hello Oleg,
>
> In my opinion, all this means is that we do not yet have a full
> understanding of what is going on.

Sure.

> BTW, I have improved my page lock owner instrumentation patch such that
> it prints a call stack of the lock owner if lock_page() takes too long.
> The following call stack was reported:
>
> __lock_page / pid 8549 / m 0x2: timeout - continuing to wait for 8549
>   [<ffffffff8102b316>] save_stack_trace+0x26/0x50
>   [<ffffffff81152bee>] add_to_page_cache_lru+0x7e/0x170
>   [<ffffffff8121bfc5>] mpage_readpages+0xc5/0x170
>   [<ffffffff81215548>] blkdev_readpages+0x18/0x20
>   [<ffffffff81163a68>] __do_page_cache_readahead+0x268/0x310
>   [<ffffffff811640a8>] force_page_cache_readahead+0xa8/0x100
>   [<ffffffff81164139>] page_cache_sync_readahead+0x39/0x40
>   [<ffffffff81153967>] generic_file_read_iter+0x707/0x920
>   [<ffffffff81215920>] blkdev_read_iter+0x30/0x40
>   [<ffffffff811d4b4b>] __vfs_read+0xbb/0x130
>   [<ffffffff811d4f31>] vfs_read+0x91/0x130
>   [<ffffffff811d62b4>] SyS_read+0x44/0xa0
>   [<ffffffff816281e5>] entry_SYSCALL_64_fastpath+0x18/0xa8
>
> My understanding of mpage_readpages() is that the page unlock happens
> after the readahead I/O has completed (see also page_endio()). So this
> probably means that an I/O request submitted by the readahead code did
> not complete. I will see whether I can find anything that's wrong in
> the block layer.

Perhaps. But that would mean a different problem! Or you didn't wait
long enough. Or your previous testing was wrong.

Because, once again, your changes in abort_exclusive_wait() and my
debugging patch, which adds a wakeup into ClearPageLocked(), suggest
that the problem is NOT that the page is still locked.


I'd still like to know what happens with the last patch I sent (without
any other changes)... but now I am totally confused.

If only I could reproduce this. Or at least understand what you are
doing to hit this bug ;)

Oleg.

  reply	other threads:[~2016-08-16 13:07 UTC|newest]

Thread overview: 33+ messages
2016-08-03 16:35 [PATCH] sched: Avoid that __wait_on_bit_lock() hangs Bart Van Assche
2016-08-03 18:11 ` Peter Zijlstra
2016-08-03 18:56   ` Bart Van Assche
2016-08-03 21:30     ` Oleg Nesterov
2016-08-03 21:51       ` Bart Van Assche
2016-08-04 14:09         ` Peter Zijlstra
2016-08-04 14:31           ` Bart Van Assche
2016-08-05 17:41           ` Bart Van Assche
2016-08-08 10:22             ` Peter Zijlstra
2016-08-08 14:38               ` Bart Van Assche
2016-08-08 16:20                 ` Oleg Nesterov
2016-08-08 18:31                   ` Bart Van Assche
2016-08-09 17:14                     ` Oleg Nesterov
2016-08-09 18:48                       ` Bart Van Assche
2016-08-09 23:10                         ` Bart Van Assche
2016-08-10 10:45                         ` Oleg Nesterov
2016-08-10 16:01                           ` Bart Van Assche
2016-08-10 16:27                             ` Oleg Nesterov
2016-08-10 19:58                           ` Bart Van Assche
2016-08-11 17:36                             ` Oleg Nesterov
2016-08-12 16:16                               ` Oleg Nesterov
2016-08-12 16:27                                 ` Bart Van Assche
2016-08-12 22:47                                 ` Bart Van Assche
2016-08-13 16:32                                   ` Oleg Nesterov
2016-08-15 23:39                                     ` Bart Van Assche
2016-08-16 13:06                                       ` Oleg Nesterov [this message]
2016-08-16 16:54                                         ` Bart Van Assche
2016-08-17 17:30                                           ` Oleg Nesterov
2016-08-13 17:07                                   ` Oleg Nesterov
2016-08-09 23:56                   ` Bart Van Assche
2016-08-10 10:57                     ` Oleg Nesterov
2016-08-10 11:03                       ` Peter Zijlstra
2016-08-04  0:05       ` Bart Van Assche

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save this message as an mbox file, import it into your mail client,
  and reply-to-all from there.

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160816130559.GA14022@redhat.com \
    --to=oleg@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=bart.vanassche@sandisk.com \
    --cc=hannes@cmpxchg.org \
    --cc=jmshaver@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=neilb@suse.de \
    --cc=peterz@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

Be sure your reply has a Subject: header at the top and a blank line
before the message body.