From: "Chuck Lever" <cel@kernel.org>
To: "Frederic Weisbecker" <frederic@kernel.org>
Cc: "Marco Crivellari" <marco.crivellari@suse.com>,
	linux-kernel@vger.kernel.org, linux-nfs@vger.kernel.org,
	netdev@vger.kernel.org, "Tejun Heo" <tj@kernel.org>,
	"Lai Jiangshan" <jiangshanlai@gmail.com>,
	"Sebastian Andrzej Siewior" <bigeasy@linutronix.de>,
	"Michal Hocko" <mhocko@suse.com>,
	"Trond Myklebust" <trondmy@kernel.org>,
	"Anna Schumaker" <anna@kernel.org>,
	"Chuck Lever" <chuck.lever@oracle.com>,
	"Jeff Layton" <jlayton@kernel.org>, NeilBrown <neil@brown.name>,
	"Olga Kornievskaia" <okorniev@redhat.com>,
	"Dai Ngo" <Dai.Ngo@oracle.com>, "Tom Talpey" <tom@talpey.com>,
	"David S . Miller" <davem@davemloft.net>,
	"Eric Dumazet" <edumazet@google.com>,
	"Jakub Kicinski" <kuba@kernel.org>,
	"Paolo Abeni" <pabeni@redhat.com>,
	"Simon Horman" <horms@kernel.org>
Subject: Re: [RFC PATCH] xprtrdma: Move long delayed work on system_dfl_long_wq
Date: Thu, 30 Apr 2026 10:05:52 -0400
Message-ID: <1e220a70-4318-49de-aaac-332c0a1cfab4@app.fastmail.com>
In-Reply-To: <afNguCraI6AvmZrR@localhost.localdomain>



On Thu, Apr 30, 2026, at 10:01 AM, Frederic Weisbecker wrote:
> On Thu, Apr 30, 2026 at 09:35:20AM -0400, Chuck Lever wrote:
>> 
>> On Thu, Apr 30, 2026, at 4:54 AM, Marco Crivellari wrote:
>> > Currently the code enqueues work items using {queue|mod}_delayed_work()
>> > on system_long_wq. This workqueue should be used when long-running work
>> > is expected, but it is a per-CPU workqueue.
>> >
>> > This is important because queue_delayed_work() queues the work using:
>> >
>> >    queue_delayed_work_on(WORK_CPU_UNBOUND, ...);
>> >
>> > Note that WORK_CPU_UNBOUND = NR_CPUS.
>> >
>> > This would end up calling __queue_delayed_work() that does:
>> >
>> >     if (housekeeping_enabled(HK_TYPE_TIMER)) {
>> >     //      [....]
>> >     } else {
>> >             if (likely(cpu == WORK_CPU_UNBOUND))
>> >                     add_timer_global(timer);
>> >             else
>> >                     add_timer_on(timer, cpu);
>> >     }
>> >
>> > So when cpu == WORK_CPU_UNBOUND the timer is global and is
>> > not using a specific CPU. Later, when __queue_work() is called:
>> >
>> >     if (req_cpu == WORK_CPU_UNBOUND) {
>> >             if (wq->flags & WQ_UNBOUND)
>> >                     cpu = wq_select_unbound_cpu(raw_smp_processor_id());
>> >             else
>> >                     cpu = raw_smp_processor_id();
>> >     }
>> >
>> > Because the wq is not unbound, it takes the CPU where the timer
>> > fired and enqueues the work on that CPU.
>> > The consequence of all of this is that the work can run anywhere,
>> > depending on where the timer fired.
>> >
>> > Recently, a new unbound workqueue specific for long running work has
>> > been added:
>> >
>> >    c116737e972e ("workqueue: Add system_dfl_long_wq for long unbound works")
>> >
>> > So change system_long_wq with system_dfl_long_wq so that the work may
>> > benefit from scheduler task placement.
>> 
>> The patch description confuses me.
>> 
>> The message ends with "the work can run anywhere, depending on where
>> the timer fired." Read literally, "can run anywhere" sounds like a
>> feature, not a bug
>
> A feature, but incomplete :)
>
>> — and the proposed fix (WQ_UNBOUND) also lets it
>> run anywhere, just via a different selection path. Without a sentence
>> saying "and that anywhere includes isolated CPUs, which we don't want,"
>> the reader is left to fill in the gap.
>
> Not quite, global timers don't fire on isolated CPUs. And since it gets enqueued
> on the CPU where it fired, it won't be enqueued on an isolated CPU.
>
>> 
>> So, could the commit message lead with the motivation? My guess is that
>> this is about respecting HK_TYPE_TIMER housekeeping on isolated systems,
>> which system_long_wq cannot do because its per-CPU pool ignores the
>> housekeeping mask once the global timer fires. If that is the case,
>> please say so directly and the mechanism trace becomes a supporting
>> argument rather than the whole argument.
>
> The purpose is explained on the last line:
>
> """
> So change system_long_wq with system_dfl_long_wq so that the work may
> benefit from scheduler task placement.
> """
>
> Arguably this could be elaborated. For example we can change that:
>
> """
> The consequence of all of this is that the work can run anywhere,
> depending on where the timer fired.
> """
>
> into that:
>
> """
> The consequence of all of this is that the work can run on any
> housekeeping CPU, irrespective of the scheduler that knows better
> about the best task placement, which would apply if the work were
> to be queued on an unbound workqueue.
> """
>
> Would that help?

It's still not clearing it up for me.

Does the patch address a bug (work isn't getting rescheduled at
all) or is it merely a minor optimization for certain platforms?

What's the user-visible issue that will be improved with this
change?

-- 
Chuck Lever
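
For readers following the thread, the CPU selection quoted from the
patch description can be modeled in userspace. This is only a sketch
under stated assumptions: the model_* names, the MODEL_* constants and
the 8-CPU count are hypothetical and are not kernel definitions, and
the model ignores housekeeping masks and unbound pool affinity. It
shows one thing: with req_cpu == WORK_CPU_UNBOUND, a per-CPU workqueue
pins the work to the CPU where the timer fired, while an unbound
workqueue leaves the final placement to the scheduler.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model constants; the names echo the kernel's but are local. */
#define MODEL_NR_CPUS     8
#define MODEL_WQ_UNBOUND  0x1u

struct model_placement {
	int  queue_cpu;  /* CPU on which the timer fired and queued the work */
	bool pinned;     /* true: the work item must execute on queue_cpu */
};

/*
 * Model of the req_cpu == WORK_CPU_UNBOUND case discussed in the thread.
 * timer_cpu stands in for raw_smp_processor_id() in the timer callback.
 */
static struct model_placement
model_place(unsigned int wq_flags, int timer_cpu)
{
	struct model_placement p;

	assert(timer_cpu >= 0 && timer_cpu < MODEL_NR_CPUS);

	p.queue_cpu = timer_cpu;
	/*
	 * Per-CPU workqueue: the work runs on the queueing CPU's pool.
	 * Unbound workqueue: a worker is chosen and the scheduler decides
	 * where that worker actually runs.
	 */
	p.pinned = !(wq_flags & MODEL_WQ_UNBOUND);
	return p;
}
```

Calling model_place(0, 3) models the system_long_wq case (pinned to
CPU 3), and model_place(MODEL_WQ_UNBOUND, 3) models the
system_dfl_long_wq case (queued from CPU 3 but not pinned there).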
