From: Florian Weimer <fweimer@redhat.com>
To: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	linux-kernel@vger.kernel.org
Cc: Ingo Molnar <mingo@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	libc-alpha@sourceware.org
Subject: rseq CPU ID not correct on 6.0 kernels for pinned threads
Date: Wed, 11 Jan 2023 12:26:08 +0100
Message-ID: <87lem9cnxr.fsf@oldenburg.str.redhat.com>

The glibc test suite contains a test that verifies that sched_getcpu
returns the expected CPU number for a thread that is pinned (via
sched_setaffinity) to a specific CPU.  There are other threads running
which attempt to de-schedule the pinned thread from its CPU.  I believe
the test is correctly doing what it is expected to do; it is invalid
only if one believes that it is okay for the kernel to disregard the
affinity mask for scheduling decisions.

These days, we use the cpu_id rseq field as the return value of
sched_getcpu if the kernel has rseq support (which it has in these
cases).

This test has started failing sporadically for us, some time around
kernel 6.0.  I see occasional failures on a Fedora builder, which runs:

Linux buildvm-x86-26.iad2.fedoraproject.org 6.0.15-300.fc37.x86_64 #1 SMP PREEMPT_DYNAMIC Wed Dec 21 18:33:23 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

I think I've seen it on the x86-64 builder only, but that might just be
an accident.

The failing tests log this output:

=====FAIL: nptl/tst-thread-affinity-pthread.out=====
info: Detected CPU set size (in bits): 64
info: Maximum test CPU: 5
error: Pinned thread 1 ran on impossible cpu 0
error: Pinned thread 0 ran on impossible cpu 0
info: Main thread ran on 4 CPU(s) of 6 available CPU(s)
info: Other threads ran on 6 CPU(s)
=====FAIL: nptl/tst-thread-affinity-pthread2.out=====
info: Detected CPU set size (in bits): 64
info: Maximum test CPU: 5
error: Pinned thread 1 ran on impossible cpu 1
error: Pinned thread 2 ran on impossible cpu 0
error: Pinned thread 3 ran on impossible cpu 3
info: Main thread ran on 5 CPU(s) of 6 available CPU(s)
info: Other threads ran on 6 CPU(s)

I have also encountered one failure locally, but it is rare.  Maybe it's
load-related.  There shouldn't be any CPU unplug or anything like that
involved here.

I am not entirely sure if something is changing CPU affinities from
outside the process (which would be quite wrong, but not a kernel bug).
But in the past, our glibc test has detected real rseq cpu_id
brokenness, so I'm leaning towards that as the cause this time, too.

Thanks,
Florian

