linux-fsdevel.vger.kernel.org archive mirror
From: Jan Kara <jack@suse.cz>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matthieu Baerts <matthieu.baerts@tessares.net>,
	Michael Larabel <Michael@michaellarabel.com>,
	Matthew Wilcox <willy@infradead.org>,
	Amir Goldstein <amir73il@gmail.com>, Ted Ts'o <tytso@google.com>,
	Andreas Dilger <adilger.kernel@dilger.ca>,
	Ext4 Developers List <linux-ext4@vger.kernel.org>,
	Jan Kara <jack@suse.cz>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>
Subject: Re: Kernel Benchmarking
Date: Wed, 16 Sep 2020 12:34:46 +0200
Message-ID: <20200916103446.GB3607@quack2.suse.cz>
In-Reply-To: <CAHk-=wj8Bi5Kiufw8_1SEMmxc0GCO5Nh7TxEt+c1HdKaya=LaA@mail.gmail.com>

On Tue 15-09-20 16:35:45, Linus Torvalds wrote:
> On Tue, Sep 15, 2020 at 12:56 PM Matthieu Baerts
> <matthieu.baerts@tessares.net> wrote:
> >
> > I am sorry, I am not sure how to verify this. I guess it was one
> > processor because I removed "-smp 2" option from qemu. So I guess it
> > switched to a uniprocessor mode.
> 
> Ok, that all sounds fine. So yes, your problem happens even with just
> one CPU, and it's not any subtle SMP race.
> 
> Which is all good - apart from the bug existing in the first place, of
> course. It just reinforces the "it's probably a latent deadlock"
> thing.

Looking at the traces, another theory that occurred to me is that this could
be a "missed wakeup" problem. Reading wait_on_page_bit_common(), I found one
suspicious thing. It isn't a great match, because the problem seems to happen
on UP as well, and I think it's mostly a theoretical issue, but I'll write it
down here anyway:

wait_on_page_bit_common() has:

        spin_lock_irq(&q->lock);
        SetPageWaiters(page);
        if (!trylock_page_bit_common(page, bit_nr, wait))
                __add_wait_queue_entry_tail(q, wait);
        spin_unlock_irq(&q->lock);

where the trylock_page_bit_common() call expands to:

        if (wait->flags & WQ_FLAG_EXCLUSIVE) {
                if (test_and_set_bit(bit_nr, &page->flags))
                        return false;
        } else if (test_bit(bit_nr, &page->flags))
                return false;

Now the suspicious thing is the ordering here. What prevents the compiler
(or the CPU, for that matter) from reordering the SetPageWaiters() call after
the __add_wait_queue_entry_tail() call? I know SetPageWaiters() and
test_and_set_bit() operate on the same long, but is it really guaranteed
that nothing reorders these?

In unlock_page() we have:

        if (clear_bit_unlock_is_negative_byte(PG_locked, &page->flags))
                wake_up_page_bit(page, PG_locked);

So if the reordering happens, clear_bit_unlock_is_negative_byte() could
return false even though we have a waiter queued.
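
To make the hypothesised window concrete, here is a minimal single-threaded
userspace sketch, not kernel code. It replays the two orderings step by step.
The struct model_page type, the helper names and the bit positions (lock in
bit 0, waiters in bit 7) are assumptions made for the illustration, and the
model ignores q->lock entirely, so it only shows what a lost wakeup would look
like, not that one can actually happen.

```c
/* Userspace model only; every name here is hypothetical, not a kernel API. */
#include <assert.h>
#include <stdbool.h>

#define PG_LOCKED   (1UL << 0)   /* assumed bit position */
#define PG_WAITERS  (1UL << 7)   /* assumed: top bit of the low byte */

struct model_page {
	unsigned long flags;
	int queued;   /* waiters currently on the wait queue */
	int woken;    /* waiters the unlocker has woken */
};

/*
 * Models the unlock_page() path: drop the lock bit and wake the queue
 * only if the waiters bit was visible at that moment.
 */
static bool model_unlock(struct model_page *p)
{
	bool had_waiters = p->flags & PG_WAITERS;

	p->flags &= ~PG_LOCKED;
	if (had_waiters) {
		p->woken += p->queued;
		p->queued = 0;
	}
	return had_waiters;
}

/* Models SetPageWaiters(). */
static void waiter_set_waiters(struct model_page *p)
{
	p->flags |= PG_WAITERS;
}

/* Models a non-exclusive waiter: queue only if the page is still locked. */
static void waiter_test_and_queue(struct model_page *p)
{
	if (p->flags & PG_LOCKED)
		p->queued++;
}

/* Returns how many waiters are left queued with nobody to wake them. */
static int replay(bool waiters_store_delayed)
{
	struct model_page p = { .flags = PG_LOCKED };

	assert(p.flags & PG_LOCKED);   /* the page starts out locked */
	if (!waiters_store_delayed) {
		waiter_set_waiters(&p);
		waiter_test_and_queue(&p);
		model_unlock(&p);          /* sees PG_WAITERS, wakes the waiter */
	} else {
		waiter_test_and_queue(&p);
		model_unlock(&p);          /* sees no PG_WAITERS, wakes nobody */
		waiter_set_waiters(&p);    /* store becomes visible too late */
	}
	return p.queued;
}
```

In this model replay(false) leaves nobody queued, while replay(true) leaves
one waiter on the queue that was never woken.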

This seems to have been introduced by commit 2a9127fcf22 ("mm: rewrite
wait_on_page_bit_common() logic"), because before that commit we had
set_current_state() between SetPageWaiters() and test_bit(), which implies a
memory barrier.
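
As a rough userspace analogue of that older ordering, the sketch below uses
C11 atomics: set the waiters bit, issue a full fence, then test the lock bit.
The function name and the bit positions are assumptions for the example, and a
C11 seq_cst fence is only an approximation of the barrier that
set_current_state() implies in the kernel.

```c
/* Userspace analogue only; names and bit positions are hypothetical. */
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define PG_LOCKED   (1UL << 0)   /* assumed bit position */
#define PG_WAITERS  (1UL << 7)   /* assumed bit position */

/* Returns true when the caller should go on to queue itself and sleep. */
static bool mark_waiter_then_check_lock(atomic_ulong *flags)
{
	assert(flags != NULL);

	/* Stand-in for SetPageWaiters(): publish "there is a waiter". */
	atomic_fetch_or_explicit(flags, PG_WAITERS, memory_order_relaxed);

	/*
	 * Stand-in for the barrier implied by set_current_state(): the
	 * store above is ordered before every access that follows, to
	 * this word or to any other memory such as a wait queue.
	 */
	atomic_thread_fence(memory_order_seq_cst);

	/* Stand-in for test_bit(): is the page still locked? */
	return atomic_load_explicit(flags, memory_order_relaxed) & PG_LOCKED;
}
```

With the fence in place, the waiters bit cannot become visible after the
accesses that follow it, which is the property the question above is about.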

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR
