Re: [GIT PULL] xfs: new code for 5.15

From: Dennis Zhou <dennis@kernel.org>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: "Darrick J. Wong" <djwong@kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Dennis Zhou <dennis@kernel.org>, Tejun Heo <tj@kernel.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	linux-xfs <linux-xfs@vger.kernel.org>,
	Dave Chinner <david@fromorbit.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Eric Sandeen <sandeen@sandeen.net>,
	Christoph Hellwig <hch@lst.de>
Subject: Re: [GIT PULL] xfs: new code for 5.15
Date: Fri, 3 Sep 2021 14:40:58 -0400
Message-ID: <YTJsOoqaI3FiTkZD@fedora>
In-Reply-To: <CAHk-=whyVPgkAfARB7gMjLEyu0kSxmb6qpqfuE_r6QstAzgHcA@mail.gmail.com>

Hello,

On Thu, Sep 02, 2021 at 08:47:42AM -0700, Linus Torvalds wrote:
> On Tue, Aug 31, 2021 at 2:18 PM Darrick J. Wong <djwong@kernel.org> wrote:
> >
> > As for new features: we now batch inode inactivations in percpu
> > background threads, which sharply decreases frontend thread wait time
> > when performing file deletions and should improve overall directory tree
> > deletion times.
> 
> So no complaints on this one, but I do have a reaction: we have a lot
> of these random CPU hotplug events, and XFS now added another one.
> 
> I don't see that as a problem, but just the _randomness_ of these
> callbacks makes me go "hmm". And that "enum cpuhp_state" thing isn't
> exactly a thing of beauty, and just makes me think there's something
> nasty going on.
> 
> For the new xfs usage, I really get the feeling that it's not that XFS
> actually cares about the CPU states, but that this is literally tied
> to just having percpu state allocated and active, and that maybe it
> would be sensible to have something more specific to that kind of use.
> 
> We have other things that are very similar in nature - like the page
> allocator percpu caches etc, which for very similar reasons want cpu
> dead/online notification.
> 
> I'm only throwing this out as a reaction to this - I'm not sure
> another interface would be good or worthwhile, but that "enum
> cpuhp_state" is ugly enough that I thought I'd rope in Thomas for CPU
> hotplug, and the percpu memory allocation people for comments.
> 
> IOW, just _maybe_ we would want to have some kind of callback model
> for "percpu_alloc()" and it being explicitly about allocations
> becoming available or going away, rather than about CPU state.
> 
> Comments?
> 

I think there are 2 pieces here from percpu's side:
A) Onlining and offlining state related to a percpu alloc.
B) Freeing backing memory of an allocation wrt hotplug.

An RFC was sent out for B) in [1] and you need A) for B).
I can see percpu having a callback model for basic allocations that are
independent, but for anything more complex, that subsystem would need to
register with hotplug anyway. It appears percpu_counter already has
hotplug support. percpu_refcount could be extended as well, but more
complex initialization like the runqueues and slab related allocations
would require work. In short, yes I think A) is doable/reasonable.
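
As a rough illustration of A), here is a userspace sketch (not kernel
code) of an allocation that carries its own online/offline hooks. Every
name in it (pcpu_ops, pcpu_alloc_cb(), pcpu_cpu_online(), and so on) is
hypothetical; nothing like this exists in percpu today. The example
user folds a dead cpu's delta into a global count, which is roughly
what percpu_counter does on cpu death. Backing memory is deliberately
kept across offline, so this covers A) only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define NR_CPUS 4	/* stand-in for the possible-CPU set */

/* Hypothetical per-allocation hooks; not an existing percpu interface. */
struct pcpu_ops {
	void (*online)(void *slot, int cpu, void *priv);
	void (*offline)(void *slot, int cpu, void *priv);
};

struct pcpu_alloc {
	size_t size;			/* bytes per cpu */
	const struct pcpu_ops *ops;
	void *priv;
	bool online[NR_CPUS];
	unsigned char *slots;		/* NR_CPUS * size, kept across offline */
};

/* Allocate zeroed slots for every possible cpu; all cpus start offline. */
static struct pcpu_alloc *pcpu_alloc_cb(size_t size,
					const struct pcpu_ops *ops, void *priv)
{
	struct pcpu_alloc *pa = calloc(1, sizeof(*pa));

	if (!pa)
		return NULL;
	pa->slots = calloc(NR_CPUS, size);
	if (!pa->slots) {
		free(pa);
		return NULL;
	}
	pa->size = size;
	pa->ops = ops;
	pa->priv = priv;
	return pa;
}

static void pcpu_free_cb(struct pcpu_alloc *pa)
{
	free(pa->slots);
	free(pa);
}

static void *pcpu_slot(struct pcpu_alloc *pa, int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	return pa->slots + (size_t)cpu * pa->size;
}

/* In this model percpu, not each subsystem, would drive these two. */
static void pcpu_cpu_online(struct pcpu_alloc *pa, int cpu)
{
	void *slot = pcpu_slot(pa, cpu);

	if (pa->online[cpu])
		return;
	pa->online[cpu] = true;
	if (pa->ops->online)
		pa->ops->online(slot, cpu, pa->priv);
}

static void pcpu_cpu_offline(struct pcpu_alloc *pa, int cpu)
{
	void *slot = pcpu_slot(pa, cpu);

	if (!pa->online[cpu])
		return;
	pa->online[cpu] = false;
	if (pa->ops->offline)
		pa->ops->offline(slot, cpu, pa->priv);
}

/* Example user: a counter that folds a dead cpu's delta into a global. */
struct counter {
	long global;
};

static void counter_fold(void *slot, int cpu, void *priv)
{
	long *delta = slot;
	struct counter *c = priv;

	(void)cpu;
	c->global += *delta;
	*delta = 0;
}

static const struct pcpu_ops counter_ops = { .offline = counter_fold };

/* Sum the global part plus the deltas of the cpus that are online. */
static long counter_sum(struct pcpu_alloc *pa, const struct counter *c)
{
	long sum = c->global;

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (pa->online[cpu])
			sum += *(long *)pcpu_slot(pa, cpu);
	return sum;
}
```

A simple, independent user like this counter only needs the two hooks.
Anything with cross-cpu or ordering dependencies would still have to
register with hotplug directly.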

Freeing the backing memory, B), seems trickier. We would have to
figure out a clean way to handle onlining/offlining racing with new
percpu allocations (adding or removing pages for the corresponding cpu's
chunk). To support A), init and onlining/offlining can be separate
phases, but for B) init/freeing would have to be rolled into
onlining/offlining.

Without freeing, it's not incorrect for for_each_online_cpu() to read a
dead cpu's percpu values, but with freeing it is.
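
A small userspace model (not kernel code) of that difference, assuming
a reader that walks a snapshot of the online mask taken before a cpu
went away. The names pcpu_area, cpu_offline() and sum_from_snapshot()
are made up for this sketch, and a freed slot is modeled as a NULL
pointer so the bad access is detected rather than performed.

```c
#include <stdbool.h>
#include <stdlib.h>

#define NR_CPUS 4	/* stand-in for the possible-CPU set */

struct pcpu_area {
	bool online[NR_CPUS];
	long *slot[NR_CPUS];	/* NULL once the backing memory is freed */
};

/* Back every cpu with memory and mark every cpu online. */
static int area_init(struct pcpu_area *a)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		a->slot[cpu] = calloc(1, sizeof(long));
		if (!a->slot[cpu]) {
			while (cpu-- > 0)
				free(a->slot[cpu]);
			return -1;
		}
		a->online[cpu] = true;
	}
	return 0;
}

static void area_destroy(struct pcpu_area *a)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		free(a->slot[cpu]);	/* free(NULL) is a no-op */
}

/* Copy the online mask, as a reader might do before iterating. */
static void snapshot_online(const struct pcpu_area *a, bool snap[NR_CPUS])
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		snap[cpu] = a->online[cpu];
}

/* Take a cpu offline, optionally releasing its backing memory (case B). */
static void cpu_offline(struct pcpu_area *a, int cpu, bool free_backing)
{
	a->online[cpu] = false;
	if (free_backing) {
		free(a->slot[cpu]);
		a->slot[cpu] = NULL;
	}
}

/*
 * Sum the slots of every cpu set in a possibly stale snapshot.
 * Returns 0 on success, or -1 if the walk would have touched a slot
 * whose backing memory is gone (a use-after-free in real code).
 */
static int sum_from_snapshot(const struct pcpu_area *a,
			     const bool snap[NR_CPUS], long *sum)
{
	long total = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (!snap[cpu])
			continue;
		if (!a->slot[cpu])
			return -1;
		total += *a->slot[cpu];
	}
	*sum = total;
	return 0;
}
```

With free_backing false the stale walk still succeeds and just includes
the dead cpu's last value. With free_backing true the same walk fails,
which is why B) would need init/freeing serialized against readers as
well as against onlining/offlining.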

I guess to summarize, A) seems like it might be a good idea with
init/destruction happening at allocation/freeing times. I'm a little
skeptical of B) in terms of complexity. If y'all think it's a good idea
I can look into it again.

[1] https://lore.kernel.org/lkml/20210601065147.53735-1-bharata@linux.ibm.com/

Thanks,
Dennis

> > Lastly, with this release, two new features have graduated to supported
> > status: inode btree counters (for faster mounts), and support for dates
> > beyond Y2038.
> 
> Oh, I had thought Y2038 was already a non-issue for xfs. Silly me.
> 
>               Linus
