From: Gregory Haskins <ghaskins@novell.com>
To: mingo@elte.hu, rostedt@goodmis.org, tglx@linutronix.de
Cc: ghaskins@novell.com, linux-kernel@vger.kernel.org,
linux-rt-users@vger.kernel.org
Subject: [PATCH RT 1/2] seqlock: make sure that raw_seqlock_t retries readers while writes are pending
Date: Tue, 19 Aug 2008 05:19:18 -0400
Message-ID: <20080819091918.21725.39839.stgit@dev.haskins.net>
In-Reply-To: <20080819091817.21725.81831.stgit@dev.haskins.net>

The seqlock protocol is broken in -rt for raw_seqlock_t objects. This
manifested in my 2.6.26-rt1 kernel as a 500ms (yes, millisecond) spike
which was traced out with ftrace/preemptirqsoff to be originating in
the HRT (hrtimer_interrupt, to be precise). It would occasionally
spin processing the same CLOCK_MONOTONIC timer (the scheduler-tick)
in a tight loop with interrupts disabled. Investigating, it turned out
that the time-basis recorded for "now" early in the interrupt was
momentarily moved 500ms in the future. This caused all timers with
correct expiration times to appear to have expired a long time ago.
Even rescheduling the timer via hrtimer_forward ultimately placed the
timer in an "expired" state since the "now" basis was in the future.

So I began investigating how this time-basis (derived from ktime_get())
could have done this. I observed that ktime_get() readers were able to
successfully read a time value even while another core held a
write-lock on the xtime_lock. Therefore the fundamental issue was
that ktime_get() was able to return transitional states of the
xtime/clocksource infrastructure, which is clearly not intended.

I root-caused the issue to the raw_seqlock_t implementation. It was
missing support for retrying a reader if it finds a write-pending
flag. Investigating further, I think I can speculate why.
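
To make the failure mode concrete, here is a minimal single-threaded
userspace sketch of a reader that enters without looking at the
write-pending (odd) bit. It is not kernel code: struct clock_model and the
model_* helpers are hypothetical names invented for this illustration,
the writer is stepped by hand instead of running on another core, and
memory barriers are omitted because nothing runs concurrently.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model of a seqlock-protected clock; not kernel code. */
struct clock_model {
	unsigned sequence;	/* odd while a write is pending */
	long sec;
	long nsec;
};

static void model_write_begin(struct clock_model *c)
{
	c->sequence++;			/* even -> odd */
}

static void model_write_end(struct clock_model *c)
{
	assert(c->sequence & 1);	/* a write must be pending */
	c->sequence++;			/* odd -> even */
}

/* Reader entry with no write-pending check. */
static unsigned model_read_begin_unchecked(const struct clock_model *c)
{
	return c->sequence;
}

/* Reader exit that only compares sequence numbers. */
static bool model_read_retry(const struct clock_model *c, unsigned iv)
{
	return c->sequence != iv;
}

/*
 * Step a writer halfway through an update, run a complete read section
 * inside that window, and report whether the reader accepted the result.
 */
static bool torn_read_accepted(long *sec, long *nsec)
{
	struct clock_model c = { 0, 10, 999999999L };
	unsigned iv;
	bool accepted;

	model_write_begin(&c);		/* sequence becomes 1 (odd) */
	c.sec = 11;			/* first half of the update */

	iv = model_read_begin_unchecked(&c);
	*sec = c.sec;
	*nsec = c.nsec;
	accepted = !model_read_retry(&c, iv);	/* sequence is still 1 */

	c.nsec = 0;			/* second half of the update */
	model_write_end(&c);		/* sequence becomes 2 (even) */
	return accepted;
}
```

In this model the reader walks away with sec = 11 and nsec = 999999999,
a value the writer never intended to publish, because the sequence
number did not change between entry and exit of the read section.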

Back in April, Ingo and Thomas checked in a fix to mainline for seqlocks,
referenced here:

commit 88a411c07b6fedcfc97b8dc51ae18540bd2beda0
Author: Ingo Molnar <mingo@elte.hu>
Date:   Thu Apr 3 09:06:13 2008 +0200

    seqlock: livelock fix

    Thomas Gleixner debugged a particularly ugly seqlock related livelock:
    do not process the seq-read section if we know it beforehand that the
    test at the end of the section will fail ...

    Signed-off-by: Ingo Molnar <mingo@elte.hu>

Of course, mainline only has seqlock_t. In -rt, we have both seqlock_t
and raw_seqlock_t. It would appear that the merge-resolution for
commit 88a411c07b6 to the -rt branch inadvertently applied one hunk
of the fix to seqlock_t, and the other to raw_seqlock_t. The normal
seqlocks now have two checks for retry, while the raw_seqlocks have none.
This lack of a check is what causes the protocol failure, which ultimately
caused the bad clock info and a latency spike.
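
The "two checks" versus "none" situation can be tabulated with a small
userspace sketch. The two retry expressions are the ones that appear in
this patch; the function names, and the idea of modelling the begin-side
spin as a boolean, are assumptions made purely for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Retry test that also rejects a snapshot taken while a write was pending. */
static bool retry_with_odd_check(unsigned seq_now, unsigned iv)
{
	return ((iv & 1) | (seq_now ^ iv)) != 0;
}

/* Retry test that relies on the begin side having returned an even value. */
static bool retry_compare_only(unsigned seq_now, unsigned iv)
{
	return seq_now != iv;
}

/* Begin-side check: would the reader have to wait before entering? */
static bool begin_must_wait(unsigned seq_now)
{
	return (seq_now & 1) != 0;
}

/*
 * True if a reader whose whole read section runs while the sequence sits
 * at seq (no writer progress) is stopped by at least one enabled check.
 */
static bool reader_is_stopped(unsigned seq, bool check_at_begin,
			      bool odd_check_at_retry)
{
	if (check_at_begin && begin_must_wait(seq))
		return true;
	return odd_check_at_retry ? retry_with_odd_check(seq, seq)
				  : retry_compare_only(seq, seq);
}

/* The two retry tests agree whenever the snapshot is even. */
static void check_retry_tests_agree_for_even(unsigned seq_now, unsigned iv)
{
	assert((iv & 1) == 0);
	assert(retry_with_odd_check(seq_now, iv) ==
	       retry_compare_only(seq_now, iv));
}
```

With a write pending (seq = 1), reader_is_stopped() is true for every
combination except (false, false), which corresponds to the raw_seqlock_t
state described above. Once the begin side guarantees an even snapshot,
the odd test at retry is redundant, which is why the comparison-only
form is sufficient there.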

This patch corrects the above condition by applying the conceptual change
from 88a411c07b6 to both seqlock_t and raw_seqlock_t equally. The observed
problems with the HRT spike are confirmed to no longer be reproducible as
a result.

Signed-off-by: Gregory Haskins <ghaskins@novell.com>
CC: Ingo Molnar <mingo@elte.hu>
CC: Thomas Gleixner <tglx@linutronix.de>
---
include/linux/seqlock.h | 12 ++++++++++--
1 files changed, 10 insertions(+), 2 deletions(-)
diff --git a/include/linux/seqlock.h b/include/linux/seqlock.h
index e6ecb46..345d726 100644
--- a/include/linux/seqlock.h
+++ b/include/linux/seqlock.h
@@ -145,7 +145,7 @@ static inline int __read_seqretry(seqlock_t *sl, unsigned iv)
 	int ret;
 	smp_rmb();
-	ret = (iv & 1) | (sl->sequence ^ iv);
+	ret = (sl->sequence != iv);
 	/*
 	 * If invalid then serialize with the writer, to make sure we
 	 * are not livelocking it:
@@ -228,8 +228,16 @@ static __always_inline int __write_tryseqlock_raw(raw_seqlock_t *sl)
 static __always_inline unsigned __read_seqbegin_raw(const raw_seqlock_t *sl)
 {
-	unsigned ret = sl->sequence;
+	unsigned ret;
+
+repeat:
+	ret = sl->sequence;
 	smp_rmb();
+	if (unlikely(ret & 1)) {
+		cpu_relax();
+		goto repeat;
+	}
+
 	return ret;
 }