All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ingo Molnar <mingo@kernel.org>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Christian Brauner <christian.brauner@ubuntu.com>,
	Oleg Nesterov <oleg@redhat.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Peter Zijlstra <peterz@infradead.org>,
	Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [GIT PULL] sigqueue cache fix
Date: Mon, 28 Jun 2021 07:14:12 +0200	[thread overview]
Message-ID: <YNlapAKObfeVPoQO@gmail.com> (raw)
In-Reply-To: <CAHk-=wiJq0Ns7_AFRW+rvZcD_m+1t5cYgvQRO-Gbp8TEK1x1bQ@mail.gmail.com>


* Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Sun, Jun 27, 2021 at 11:52 AM Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> >
> > Ok, I may have confused myself looking at all this, but it does all
> > make me think this is dodgy.
> 
> I also couldn't convince myself that the memory ordering is correct
> for the _contents_ of the sigqueue entry that had its pointer cached,
> although I suspect that is purely a theoretical concern (certainly a
> non-issue on x86).
> 
> So I've reverted the sigqueue cache code, in that I haven't heard
> anything back and I'm not going to delay 5.13 over something small and
> easily undone like this.

I concur that it was the safest to revert this, because it was close to the 
final release.

I think the code is safe, but only by accident. The most critical data race 
isn't well-documented, unless I missed something.

The most fundamental race we can have is this:

      CPU#0

      __sigqueue_alloc()

      [ holds sighand->siglock ]
      [ IRQs off. ]

      q = READ_ONCE(t->sigqueue_cache);
      if (!q || sigqueue_flags)
            q = kmem_cache_alloc(sigqueue_cachep, gfp_flags);
      else
            WRITE_ONCE(t->sigqueue_cache, NULL);


                                CPU#1  

                                __sigqueue_free()

                                [ IRQs off. ]

                                if (!READ_ONCE(current->sigqueue_cache))
                                      WRITE_ONCE(current->sigqueue_cache, q);
                                else
                                      kmem_cache_free(sigqueue_cachep, q);

( Let's assume exit_task_sigqueue_cache() happens while there's no new 
  signal sending going on, so that angle is safe. )

Someone confusingly, *alloc() is the consumer and *free() is the producer 
of the sigqueue_cache.

Here's how I see the 3 fundamental races these two pieces of code may have:

 - Producer <-> producer: The producer cannot race with itself, because it 
   only ever produces into current->sigqueue_cache and has interrupts 
   disabled. We don't send signals from NMI context.

 - Consumer <-> consumer: multiple consumers cannot race with themselves, 
   because they serialize on sighand->siglock.

 - Producer <-> consumer: this is the most interesting race, and I think 
   it's unsafe in theory, because the producer doesn't make sure that any 
   previous writes to the actual queue entry (struct sigqueue *q) have 
   reached storage before the new 'free' entry is advertised to consumers.

   So in principle CPU#0 could see a new sigqueue entry and use it, before 
   it's fully freed.

   In *practice* it's probably safe by accident (or by undocumented 
   intent), because there's an atomic op we have shortly before putting the 
   queue entry into the sigqueue_cache, in __sigqueue_free():

         if (atomic_dec_and_test(&q->user->sigpending))
                free_uid(q->user);

   And atomic_dec_and_test() implies a full barrier - although I haven't 
   found the place where we document it and 
   Documentation/memory-ordering.txt is silent on it. We should probably 
   fix that too.

At minimum the patch adding the ->sigqueue_cache should include a 
well-documented race analysis firmly documenting the implicit barrier after 
the atomic_dec_and_test().

Anyway, I agree with the revert.

Thanks,

	Ingo

  reply	other threads:[~2021-06-28  5:14 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-06-24  7:13 [GIT PULL] sigqueue cache fix Ingo Molnar
2021-06-24 16:29 ` Linus Torvalds
2021-06-27 18:52   ` Linus Torvalds
2021-06-27 20:40     ` Linus Torvalds
2021-06-28  5:14       ` Ingo Molnar [this message]
2021-06-28  5:22         ` Ingo Molnar
2021-06-28  5:30           ` Ingo Molnar
2021-06-28 17:06         ` Linus Torvalds
2021-06-28 18:46           ` Ingo Molnar
2021-06-28 19:02             ` Linus Torvalds
2021-07-07  9:47               ` Thomas Gleixner
2021-06-24 16:34 ` pr-tracker-bot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YNlapAKObfeVPoQO@gmail.com \
    --to=mingo@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=christian.brauner@ubuntu.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=oleg@redhat.com \
    --cc=peterz@infradead.org \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.