public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Tejun Heo <tj@kernel.org>
To: Geraldo Nascimento <geraldogabriel@gmail.com>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] workqueue: missing NOT while checking if Workqueue is offline
Date: Sat, 28 May 2022 20:14:23 -1000	[thread overview]
Message-ID: <YpMPPyIZVlBwUrNe@slm.duckdns.org> (raw)
In-Reply-To: <YpMKY88/2tTK319E@geday>

On Sun, May 29, 2022 at 02:53:39AM -0300, Geraldo Nascimento wrote:
> On Sat, May 28, 2022 at 07:24:41PM -1000, Tejun Heo wrote:
> > On Sun, May 29, 2022 at 01:29:32AM -0300, Geraldo Nascimento wrote:
> > > I would like very much to hear the opinion of the maintainers!
> > 
> > I have a hard time understanding what you're trying to do. Can you please
> > slow down and start from describing the problem itself?
> 
> Hi Tejun,
> 
> Sorry for the hurry.
> 
> The problem is best described in https://gitlab.freedesktop.org/drm/amd/-/issues/1898
> 
> From my understanding from the context of __cancel_work_timer() we should not
> ever call __flush_work() but I may be wrong. In the present case as

Yeah, you're wrong.

> described in AMD's GitLab __cancel_work_timer() is being called by
> cancel_delayed_work_sync() inside kfd_process_notifier_release()
> from drivers/gpu/drm/amd/amdkfd/kfd_process.c:1157 (Linux 5.18).

Have you confirmed that that actually is the warning which is triggering? I
don't see how that condition would trigger that late during the boot and the
warning line being reported doesn't match v5.16 source code, so I'm not sure
but skimming the instructon sequence, that's the second UD2 sequence, so I'm
gonna guess that's the second WARN_ON - the !work->func one and someone else
on the gitlab bug report seems to agree too.

It's usually a lot more helpful if the bug report is complete - include the
full warning message with some context at least, make sure that the kernel
you're using is an upstream one or something close enough. If not, point to
the source tree. Also, try to clearly distinguish what you know and what you
suspect. Both can help but mixing them up together tends to cause confusion
for everyone involved.

It just looks like the code is trying to cancel a work item which hasn't
been initialized and what it prolly needs is an ifdef around that cancel
call depending on the config option.

Thanks.

-- 
tejun

  reply	other threads:[~2022-05-29  6:14 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-05-28 20:07 [PATCH] workqueue: missing NOT while checking if Workqueue is offline Geraldo Nascimento
2022-05-29  1:25 ` Geraldo Nascimento
2022-05-29  4:29   ` Geraldo Nascimento
2022-05-29  5:24     ` Tejun Heo
2022-05-29  5:53       ` Geraldo Nascimento
2022-05-29  6:14         ` Tejun Heo [this message]
2022-05-29  6:41           ` Geraldo Nascimento
2022-05-31  4:13           ` Geraldo Nascimento
2022-05-31 17:43             ` Tejun Heo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YpMPPyIZVlBwUrNe@slm.duckdns.org \
    --to=tj@kernel.org \
    --cc=geraldogabriel@gmail.com \
    --cc=jiangshanlai@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox