linux-fsdevel.vger.kernel.org archive mirror
From: Tim Chen <tim.c.chen@linux.intel.com>
To: Linus Torvalds <torvalds@linux-foundation.org>,
	Dan Williams <dan.j.williams@intel.com>
Cc: Matthew Wilcox <willy@infradead.org>,
	Brian Foster <bfoster@redhat.com>, Linux-MM <linux-mm@kvack.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	linux-xfs <linux-xfs@vger.kernel.org>,
	Hugh Dickins <hughd@google.com>
Subject: Re: writeback completion soft lockup BUG in folio_wake_bit()
Date: Mon, 24 Oct 2022 12:39:33 -0700
Message-ID: <feb89e52675ed630e52dc8aacfa66feb6f19fd3a.camel@linux.intel.com>
In-Reply-To: <CAHk-=wizsHtGa=7dESxXd6VNU2mdHqhvCv88FB3xcWb3o3iJMw@mail.gmail.com>

On Sun, 2022-10-23 at 15:38 -0700, Linus Torvalds wrote:
> On Wed, Oct 19, 2022 at 6:35 PM Dan Williams
> <dan.j.williams@intel.com> wrote:
> > 
> > A report from a tester with this call trace:
> > 
> >  watchdog: BUG: soft lockup - CPU#127 stuck for 134s! [ksoftirqd/127:782]
> >  RIP: 0010:_raw_spin_unlock_irqrestore+0x19/0x40 [..]
> 
> Whee.
> 
> > ...led me to this thread. This was after I had them force all softirqs
> > to run in ksoftirqd context, and run with rq_affinity == 2 to force
> > I/O completion work to throttle new submissions.
> > 
> > Willy, are these headed upstream:
> > 
> > https://lore.kernel.org/all/YjSbHp6B9a1G3tuQ@casper.infradead.org
> > 
> > ...or am I missing an alternate solution posted elsewhere?
> 
> Can your reporter test that patch? I think it should still apply
> pretty much as-is.. And if we actually had somebody who had a
> test-case that was literally fixed by getting rid of the old bookmark
> code, that would make applying that patch a no-brainer.
> 
> The problem is that the original load that caused us to do that thing
> in the first place isn't repeatable because it was special production
> code - so removing that bookmark code because we _think_ it now hurts
> more than it helps is kind of a big hurdle.
> 
> But if we had some hard confirmation from somebody that "yes, the
> bookmark code is now hurting", that would make it a lot more palatable
> to just remove the code that we just _think_ probably isn't needed
> any more..
> 
> 
I do think that the original locked-page-on-migration problem was fixed
by commit 9a1ea439b16b. Unfortunately, the customer did not respond when
we asked them to test their workload after that patch went into
mainline.

I have no objection to Matthew's fix to remove the bookmark code, now
that it is causing problems in this scenario, which I didn't anticipate
in my original code.

Tim


Thread overview: 27+ messages
2022-03-15 19:07 writeback completion soft lockup BUG in folio_wake_bit() Brian Foster
2022-03-16 20:59 ` Matthew Wilcox
2022-03-16 23:35   ` Linus Torvalds
2022-03-17 15:04     ` Matthew Wilcox
2022-03-17 19:26       ` Linus Torvalds
2022-03-17 21:16         ` Matthew Wilcox
2022-03-17 22:52           ` Dave Chinner
2022-03-18 13:16           ` Jan Kara
2022-03-18 18:56             ` Linus Torvalds
2022-03-19 16:23               ` Theodore Ts'o
2022-03-30 15:55                 ` Christoph Hellwig
2022-03-17 15:31     ` Brian Foster
2022-03-17 13:51   ` Brian Foster
2022-03-18 14:14     ` Brian Foster
2022-03-18 14:45       ` Matthew Wilcox
2022-03-18 18:58         ` Linus Torvalds
2022-10-20  1:35           ` Dan Williams
2022-10-23 22:38             ` Linus Torvalds
2022-10-24 19:39               ` Tim Chen [this message]
2022-10-24 19:43                 ` Linus Torvalds
2022-10-24 20:14                   ` Dan Williams
2022-10-24 20:13               ` Dan Williams
2022-10-24 20:28                 ` Linus Torvalds
2022-10-24 20:35                   ` Dan Williams
2022-10-25 15:58                     ` Arechiga Lopez, Jesus A
2022-10-25 19:19                   ` Matthew Wilcox
2022-10-25 19:20                     ` Linus Torvalds
