From: Jeff Layton <jlayton@kernel.org>
To: Mike Galbraith <efault@gmx.de>,
dai.ngo@oracle.com, Chuck Lever III <chuck.lever@oracle.com>
Cc: Linux NFS Mailing List <linux-nfs@vger.kernel.org>
Subject: Re: [PATCH 1/1] NFSD: fix WARN_ON_ONCE in __queue_delayed_work
Date: Wed, 11 Jan 2023 05:15:18 -0500
Message-ID: <ce3724b88bb2987ac773057f523aa0ed2abacaed.camel@kernel.org>
In-Reply-To: <37c80eaf2f6d8a5d318e2b10e737a1c351b27427.camel@gmx.de>
On Wed, 2023-01-11 at 03:34 +0100, Mike Galbraith wrote:
> On Tue, 2023-01-10 at 11:58 -0800, dai.ngo@oracle.com wrote:
> >
> > On 1/10/23 11:30 AM, Jeff Layton wrote:
> >
> > > >
> > > >
> > > Looking over the traces that Mike posted, I suspect this is the real
> > > bug, particularly if the server is being restarted during this test.
> >
> > Yes, I noticed the WARN_ON_ONCE(timer->function != delayed_work_timer_fn)
> > too and this seems to indicate some kind of corruption. However, I'm not
> > sure if Mike's test restarts the nfs-server service. This could be a bug
> > in work queue module when it's under stress.
>
> My reproducer was to merely mount and traverse/md5sum, while that was
> going on, fire up LTP's min_free_kbytes testcase (memory hog from hell)
> on the server. Systemthing may well be restarting the server service
> in response to oomkill. In fact, the struct delayed_work in question
> at WARN_ON_ONCE() time didn't look the least bit ready for business.
>
> FWIW, I had noticed the missing cancel while eyeballing, and stuck one
> next to the existing one as a hail-mary, but that helped not at all.
>
Ok, thanks, that's good to know.

I still doubt that the problem is the race that Dai seems to think it
is. The workqueue infrastructure has been fairly stable for years. If
there were problems with concurrent tasks queueing the same work, the
kernel would be blowing up all over the place.

> crash> delayed_work ffff8881601fab48
> struct delayed_work {
>   work = {
>     data = {
>       counter = 1
>     },
>     entry = {
>       next = 0x0,
>       prev = 0x0
>     },
>     func = 0x0
>   },
>   timer = {
>     entry = {
>       next = 0x0,
>       pprev = 0x0
>     },
>     expires = 0,
>     function = 0x0,
>     flags = 0
>   },
>   wq = 0x0,
>   cpu = 0
> }
That looks more like a memory scribble or UAF. Merely having multiple
tasks calling queue_work at the same time wouldn't be enough to trigger
this, IMO. It's more likely that the extra locking is changing the
timing of your reproducer somehow.

It might be interesting to turn up KASAN if you're able.
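
To illustrate why the dump above points at a scribbled or never-initialized
structure rather than a queueing race, here is a rough userspace sketch. It
is not kernel code: every model_* name is made up for illustration, and the
structs are heavily simplified stand-ins. It only mirrors two of the sanity
checks in __queue_delayed_work(): that the timer callback is the expected
one, and that the work's list entry is empty, meaning it points to itself.

```c
/* Userspace model only; all model_* names are hypothetical. */
#include <stddef.h>
#include <string.h>

struct model_list_head {
	struct model_list_head *next, *prev;
};

struct model_timer {
	void (*function)(struct model_timer *);
};

struct model_delayed_work {
	unsigned long data;		/* bit 0 models the PENDING flag */
	struct model_list_head entry;
	void (*func)(struct model_delayed_work *);
	struct model_timer timer;
};

/* Stand-in for delayed_work_timer_fn(); only its address matters. */
void model_timer_fn(struct model_timer *t)
{
	(void)t;
}

/* Placeholder work callback used when initializing the model. */
void model_work_fn(struct model_delayed_work *dw)
{
	(void)dw;
}

/* Roughly the state an INIT_DELAYED_WORK()-style setup leaves behind. */
void model_init(struct model_delayed_work *dw,
		void (*fn)(struct model_delayed_work *))
{
	memset(dw, 0, sizeof(*dw));
	dw->entry.next = &dw->entry;	/* empty list: points to itself */
	dw->entry.prev = &dw->entry;
	dw->func = fn;
	dw->timer.function = model_timer_fn;
}

/* Count how many of the two modelled sanity checks would fire. */
int model_failed_checks(const struct model_delayed_work *dw)
{
	int failed = 0;

	if (dw->timer.function != model_timer_fn)
		failed++;		/* wrong or NULL timer callback */
	if (dw->entry.next != &dw->entry)
		failed++;		/* list entry is not "empty" */
	return failed;
}
```

In this model, a struct that is all zeroes apart from the pending bit (as
in the crash output) fails both checks, while one that went through
model_init() passes both. Two tasks racing to queue an initialized work
item would not leave func, timer.function and the list pointers NULL.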
--
Jeff Layton <jlayton@kernel.org>