From: Uladzislau Rezki <urezki@gmail.com>
To: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Uladzislau Rezki <urezki@gmail.com>,
	Joel Fernandes <joel@joelfernandes.org>,
	LKML <linux-kernel@vger.kernel.org>, RCU <rcu@vger.kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Michal Hocko <mhocko@suse.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	"Theodore Y . Ts'o" <tytso@mit.edu>,
	Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
	Oleksiy Avramchenko <oleksiy.avramchenko@sonymobile.com>,
	willy@infradead.org
Subject: Re: [PATCH 01/16] rcu/tree: Add a work to allocate pages from regular context
Date: Wed, 4 Nov 2020 13:35:53 +0100
Message-ID: <20201104123553.GC17782@pc636>
In-Reply-To: <20201103191822.GC3249@paulmck-ThinkPad-P72>

On Tue, Nov 03, 2020 at 11:18:22AM -0800, Paul E. McKenney wrote:
> On Tue, Nov 03, 2020 at 05:33:50PM +0100, Uladzislau Rezki wrote:
> > On Tue, Nov 03, 2020 at 10:47:23AM -0500, Joel Fernandes wrote:
> > > On Thu, Oct 29, 2020 at 05:50:04PM +0100, Uladzislau Rezki (Sony) wrote:
> > > > The current memory-allocation interface presents the following
> > > > difficulties that this patch is designed to overcome:
> > > > 
> > > > a) If built with CONFIG_PROVE_RAW_LOCK_NESTING, lockdep will
> > > >    complain about a violation ("BUG: Invalid wait context") of the
> > > >    nesting rules. It performs the raw_spinlock vs. spinlock nesting
> > > >    checks, i.e. it is not legal to acquire a spinlock_t while
> > > >    holding a raw_spinlock_t.
> > > > 
> > > >    Internally, kfree_rcu() uses raw_spinlock_t, whereas the
> > > >    "page allocator" internally uses spinlock_t to access
> > > >    its zones. The code can also be broken from a higher-level
> > > >    point of view:
> > > >    <snip>
> > > >        raw_spin_lock(&some_lock);
> > > >        kfree_rcu(some_pointer, some_field_offset);
> > > >    <snip>
> > > > 
> > > > b) If built with CONFIG_PREEMPT_RT, spinlock_t is converted into
> > > >    a sleepable variant. Invoking the page allocator from
> > > >    atomic contexts then leads to "BUG: scheduling while atomic".
> > > > 
> > > > c) call_rcu() is invoked from raw atomic context and kfree_rcu()
> > > >    and kvfree_rcu() are expected to be called from atomic raw context
> > > >    as well.
> > > > 
> > > > Move the page allocation out of the contexts that invoke
> > > > kvfree_rcu() and into a separate worker. When a k[v]free_rcu()
> > > > per-cpu page cache is empty, a fallback mechanism is used and a
> > > > special job is scheduled to refill the per-cpu cache.
> > > 
> > > Looks good, still reviewing here. BTW just for my education, I was wondering
> > > about Thomas's email:
> > > https://lkml.org/lkml/2020/8/11/939
> > > 
> > > If slab allocations in pure raw-atomic context on RT are not allowed or
> > > recommended, should kfree_rcu() be allowed?
> > >
> > Thanks for reviewing, Joel :)
> > 
> > The decision was made that we need to support kfree_rcu() from "real atomic contexts",
> > to align with how it used to be before. We could just convert our local locks
> > to the spinlock_t variant, but that was not Paul's goal; it may be that some users
> > need kfree_rcu() from raw atomic contexts.
> 
> People invoke call_rcu() from raw atomics, and so we should provide
> the same for kfree_rcu().  Yes, people could work around a raw-atomic
> prohibition, but such prohibitions incur constant costs over time in
> terms of development effort, increased bug rate, and increased complexity.
> Yes, this does increase all of those for RCU, but the relative increase
> is negligible, RCU being what it is.
> 
I see your point.

> > > slab can have same issue right? If per-cpu cache is drained, it has to
> > > allocate page from buddy allocator and there's no GFP flag to tell it about
> > > context where alloc is happening from.
> > > 
> > Sounds like that. Apart from that, it might soon turn out that we or somebody
> > else will once more raise the question of something like GFP_RAW or GFP_NOLOCKS.
> > So who knows...
> 
> I would prefer that slab provide some way of dealing with raw atomic
> context, but the maintainers are thus far unconvinced.
> 
I think that once PREEMPT_RT is fully integrated into the kernel, we might get
new users with such a demand. So it is not a closed topic yet, IMHO.

> > > Or are we saying that we want to support kfree on RT from raw atomic
> > > context, even though kmalloc is not supported? I hate to bring up this
> > > elephant in the room, but since I am a part of the people maintaining this
> > > code, I believe I would rather set some rules than supporting unsupported
> > > usages. :-\ (Once I know what is supported and what isn't that is). If indeed
> > > raw atomic kfree_rcu() is a bogus use case because of -RT, then we ought to
> > > put a giant warning rather than supporting it :-(.
> > > 
> > We discussed it several times; the conclusion was that we need to support
> > kfree_rcu() from raw contexts. At least that was a clear signal from Paul
> > to me. I think that if we get a preemptible() that is universally usable, we
> > can drop the patch in question later on.
> 
> Given a universally meaningful preemptible(), we could directly call
> the allocator in some cases.  It might (or might not) still make sense
> to defer the allocation when preemptible() indicated that a direct call
> to the allocator was unsafe.
> 
I do not have a strong opinion here. Given that maintaining such
"deferring" is not considered a big effort, I think we can live
with it.
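
For illustration only, the "deferring" can be pictured with the userspace
sketch below. It is not the code from the patch: struct page_cache,
cache_get_page(), cache_refill_work() and CACHE_TARGET are made-up names,
malloc() stands in for the page allocator, and a plain flag stands in for
queuing a work item.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define CACHE_TARGET 4	/* hypothetical number of cached pages */
#define FAKE_PAGE_SIZE 4096	/* hypothetical page size for the sketch */

/* Hypothetical stand-in for a per-cpu page cache. */
struct page_cache {
	void *pages[CACHE_TARGET];
	size_t nr;		/* pages currently cached */
	bool refill_pending;	/* stands in for a queued work item */
};

/*
 * Fast path, modelling a caller in raw atomic context: it never calls
 * the allocator. On an empty cache it only requests a refill and
 * returns NULL, so the caller has to take its fallback path.
 */
static void *cache_get_page(struct page_cache *pc)
{
	if (pc->nr > 0)
		return pc->pages[--pc->nr];

	pc->refill_pending = true;
	return NULL;
}

/*
 * Worker, modelling regular (sleepable) context: the only place in
 * this sketch where the allocator is invoked.
 */
static void cache_refill_work(struct page_cache *pc)
{
	assert(pc->nr <= CACHE_TARGET);

	while (pc->nr < CACHE_TARGET) {
		void *page = malloc(FAKE_PAGE_SIZE);

		if (!page)
			break;	/* give up; a later request retries */
		pc->pages[pc->nr++] = page;
	}
	pc->refill_pending = false;
}
```

In this model a request against an empty cache fails and only marks the
refill as pending; once cache_refill_work() has run, the next request is
served from the cache without touching the allocator.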

--
Vlad Rezki
