From: Jason Gunthorpe <jgg@mellanox.com>
To: Michal Hocko <mhocko@suse.com>
Cc: "linux-mm@kvack.org" <linux-mm@kvack.org>,
"Jérôme Glisse" <jglisse@redhat.com>,
"Andrea Arcangeli" <aarcange@redhat.com>,
"Christoph Hellwig" <hch@lst.de>
Subject: Re: [PATCH] mm/mmn: prevent unpaired invalidate_start and invalidate_end with non-blocking
Date: Thu, 8 Aug 2019 12:04:07 +0000 [thread overview]
Message-ID: <20190808120402.GA1975@mellanox.com> (raw)
In-Reply-To: <20190808081827.GB18351@dhcp22.suse.cz>
On Thu, Aug 08, 2019 at 10:18:27AM +0200, Michal Hocko wrote:
> On Wed 07-08-19 19:16:32, Jason Gunthorpe wrote:
> > Many users of the mmu_notifier invalidate_range callbacks maintain
> > locking/counters/etc on a paired basis and have long expected that
> > invalidate_range start/end are always paired.
> >
> > The recent change to add non-blocking notifiers breaks this assumption
> > when multiple notifiers are present in the list as an EAGAIN return from a
> > later notifier causes all earlier notifiers to get their
> > invalidate_range_end() skipped.
> >
> > During the development of non-blocking each user was audited to be sure
> > they can skip their invalidate_range_end() if their start returns -EAGAIN,
> > so the only place that has a problem is when there are multiple
> > subscriptions.
> >
> > Due to the RCU locking we can't reliably generate a subset of the linked
> > list representing the notifiers already called, and generate an
> > invalidate_range_end() pairing.
> >
> > Rather than design an elaborate fix, for now, just block non-blocking
> > requests early on if there are multiple subscriptions.
>
> Which means that the oom path cannot really release any memory for
> ranges covered by these notifiers which is really unfortunate because
> that might cover a lot of memory. Especially when the particular range
> might not be tracked at all, right?
Yes, it is a very big hammer to avoid a bug where the locking schemes
get corrupted and the impacted drivers deadlock.
If you really don't like it then we have to push ahead on either an
rcu-safe undo algorithm or some locking thing. I've been looking at
the locking thing, so we can wait a bit more and see.
At least it doesn't seem urgent right now as nobody is reporting
hitting this bug, but we are moving toward cases where a process will
have 4 notififers (amdgpu kfd, hmm, amd iommu, RDMA ODP), so the
chance is higher
> If a different fix is indeed too elaborate then make sure to let users
> known that there is a restriction in place and dump something useful
> into the kernel log.
The 'simple' alternative I see is to use a rcu safe undo algorithm,
such as sorting the hlist. This is not so much code, but it is tricky
stuff.
Jason
next prev parent reply other threads:[~2019-08-08 12:04 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-08-07 19:16 [PATCH] mm/mmn: prevent unpaired invalidate_start and invalidate_end with non-blocking Jason Gunthorpe
2019-08-08 7:00 ` Christoph Hellwig
2019-08-08 8:18 ` Michal Hocko
2019-08-08 12:04 ` Jason Gunthorpe [this message]
2019-08-08 12:13 ` Michal Hocko
2019-10-24 14:15 ` Jason Gunthorpe
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190808120402.GA1975@mellanox.com \
--to=jgg@mellanox.com \
--cc=aarcange@redhat.com \
--cc=hch@lst.de \
--cc=jglisse@redhat.com \
--cc=linux-mm@kvack.org \
--cc=mhocko@suse.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.