* [PATCH v2] ntp: remove accidental integer wrap-around
@ 2024-05-17 20:22 Justin Stitt
2024-05-24 12:09 ` Thomas Gleixner
2024-08-05 14:22 ` [tip: timers/urgent] ntp: Clamp maxerror and esterror to operating range tip-bot2 for Justin Stitt
0 siblings, 2 replies; 8+ messages in thread
From: Justin Stitt @ 2024-05-17 20:22 UTC (permalink / raw)
To: John Stultz, Thomas Gleixner, Stephen Boyd, Nathan Chancellor,
Bill Wendling, Nick Desaulniers
Cc: linux-kernel, llvm, linux-hardening, Justin Stitt
Using syzkaller alongside the newly reintroduced signed integer overflow
sanitizer spits out this report:
UBSAN: signed-integer-overflow in ../kernel/time/ntp.c:461:16
9223372036854775807 + 500 cannot be represented in type 'long'
Call Trace:
<IRQ>
dump_stack_lvl+0x93/0xd0
handle_overflow+0x171/0x1b0
second_overflow+0x2d6/0x500
accumulate_nsecs_to_secs+0x60/0x160
timekeeping_advance+0x1fe/0x890
update_wall_time+0x10/0x30
...
time_maxerror is unconditionally incremented and the result is checked
against NTP_PHASE_LIMIT, but the increment itself can overflow,
resulting in wrap-around to negative space.
The user can supply some crazy values which is causing the overflow. Add
an extra validation step checking that maxerror is reasonable.
Link: https://github.com/llvm/llvm-project/pull/82432 [1]
Closes: https://github.com/KSPP/linux/issues/354
Cc: linux-hardening@vger.kernel.org
Signed-off-by: Justin Stitt <justinstitt@google.com>
---
Changes in v2:
- update commit log (thanks Thomas)
- check for sane user input during validation (thanks Thomas)
- Link to v1: https://lore.kernel.org/r/20240507-b4-sio-ntp-usec-v1-1-15003fc9c2b4@google.com
---
Historically, the signed integer overflow sanitizer did not work in the
kernel due to its interaction with `-fwrapv` but this has since been
changed [1] in the newest version of Clang. It was re-enabled in the
kernel with Commit 557f8c582a9ba8ab ("ubsan: Reintroduce signed overflow
sanitizer").
Here's the syzkaller reproducer:
| #{Threaded:false Repeat:false RepeatTimes:0 Procs:1 Slowdown:1 Sandbox:
| #SandboxArg:0 Leak:false NetInjection:false NetDevices:false
| #NetReset:false Cgroups:false BinfmtMisc:false CloseFDs:false KCSAN:false
| #DevlinkPCI:false NicVF:false USB:false VhciInjection:false Wifi:false
| #IEEE802154:false Sysctl:false Swap:false UseTmpDir:false
| #HandleSegv:false Repro:false Trace:false LegacyOptions:{Collide:false
| #Fault:false FaultCall:0 FaultNth:0}}
| clock_adjtime(0x0, &(0x7f0000000000)={0x5, 0x1, 0x40,
| 0x7fffffffffffffff, 0x8, 0xb2, 0x256, 0x6, 0x5, 0x8001, 0x9, 0x3f, 0x0,
| 0x8000, 0x800, 0x64d, 0x50000, 0x7ff, 0x8000000000000001, 0x1f, 0x3,
| 0xfff, 0x7fffffff, 0x5, 0x100, 0x4})
... which was used against Kees' tree here (v6.9-rc2):
https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git/log/?h=wip/v6.9-rc2/unsigned-overflow-sanitizer
... with this config:
https://gist.github.com/JustinStitt/824976568b0f228ccbcbe49f3dee9bf4
---
kernel/time/timekeeping.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index b58dffc58a8f..321f251c02aa 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -2388,6 +2388,11 @@ static int timekeeping_validate_timex(const struct __kernel_timex *txc)
}
}
+ if (txc->modes & ADJ_MAXERROR) {
+ if (txc->maxerror < 0 || txc->maxerror > NTP_PHASE_LIMIT)
+ return -EINVAL;
+ }
+
/*
* Check for potential multiplication overflows that can
* only happen on 64-bit systems:
---
base-commit: 0106679839f7c69632b3b9833c3268c316c0a9fc
change-id: 20240507-b4-sio-ntp-usec-1a3ab67bdce1
Best regards,
--
Justin Stitt <justinstitt@google.com>
* Re: [PATCH v2] ntp: remove accidental integer wrap-around
2024-05-17 20:22 [PATCH v2] ntp: remove accidental integer wrap-around Justin Stitt
@ 2024-05-24 12:09 ` Thomas Gleixner
2024-05-24 12:44 ` Thomas Gleixner
2024-05-24 22:43 ` Justin Stitt
2024-08-05 14:22 ` [tip: timers/urgent] ntp: Clamp maxerror and esterror to operating range tip-bot2 for Justin Stitt
1 sibling, 2 replies; 8+ messages in thread
From: Thomas Gleixner @ 2024-05-24 12:09 UTC (permalink / raw)
To: Justin Stitt, John Stultz, Stephen Boyd, Nathan Chancellor,
Bill Wendling, Nick Desaulniers
Cc: linux-kernel, llvm, linux-hardening, Justin Stitt,
Miroslav Lichvar
On Fri, May 17 2024 at 20:22, Justin Stitt wrote:
> time_maxerror is unconditionally incremented and the result is checked
> against NTP_PHASE_LIMIT, but the increment itself can overflow,
> resulting in wrap-around to negative space.
>
> The user can supply some crazy values which is causing the overflow. Add
> an extra validation step checking that maxerror is reasonable.
The user can supply any value which can cause an overflow as the input
is unchecked. Add ...
Hmm?
> diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
> index b58dffc58a8f..321f251c02aa 100644
> --- a/kernel/time/timekeeping.c
> +++ b/kernel/time/timekeeping.c
> @@ -2388,6 +2388,11 @@ static int timekeeping_validate_timex(const struct __kernel_timex *txc)
> }
> }
>
> + if (txc->modes & ADJ_MAXERROR) {
> + if (txc->maxerror < 0 || txc->maxerror > NTP_PHASE_LIMIT)
> + return -EINVAL;
> + }
I dug into history to find a Fixes tag. That unearthed something
interesting. Exactly this check used to be there until commit
eea83d896e31 ("ntp: NTP4 user space bits update") which landed in
2.6.30. The change log says:
"If some values for adjtimex() are outside the acceptable range, they
are now simply normalized instead of letting the syscall fail."
The problem with that commit is that it did not do any normalization at
all and just relied on the actual time_maxerror handling in
second_overflow(), which is both insufficient and also prone to that
overflow issue.
So instead of turning the clock back, we might be better off to actually
put the normalization in place at the assignment:
time_maxerror = min(max(0, txc->maxerror), NTP_PHASE_LIMIT);
or something like that.
Miroslav: Any opinion on that?
Thanks,
tglx
* Re: [PATCH v2] ntp: remove accidental integer wrap-around
2024-05-24 12:09 ` Thomas Gleixner
@ 2024-05-24 12:44 ` Thomas Gleixner
2024-05-27 8:26 ` Miroslav Lichvar
2024-05-24 22:43 ` Justin Stitt
1 sibling, 1 reply; 8+ messages in thread
From: Thomas Gleixner @ 2024-05-24 12:44 UTC (permalink / raw)
To: Justin Stitt, John Stultz, Stephen Boyd, Nathan Chancellor,
Bill Wendling, Nick Desaulniers
Cc: linux-kernel, llvm, linux-hardening, Justin Stitt,
Miroslav Lichvar
On Fri, May 24 2024 at 14:09, Thomas Gleixner wrote:
> On Fri, May 17 2024 at 20:22, Justin Stitt wrote:
> I dug into history to find a Fixes tag. That unearthed something
> interesting. Exactly this check used to be there until commit
> eea83d896e31 ("ntp: NTP4 user space bits update") which landed in
> 2.6.30. The change log says:
>
> "If some values for adjtimex() are outside the acceptable range, they
> are now simply normalized instead of letting the syscall fail."
>
> The problem with that commit is that it did not do any normalization at
> all and just relied on the actual time_maxerror handling in
> second_overflow(), which is both insufficient and also prone to that
> overflow issue.
>
> So instead of turning the clock back, we might be better off to actually
> put the normalization in place at the assignment:
>
> time_maxerror = min(max(0, txc->maxerror), NTP_PHASE_LIMIT);
>
> or something like that.
So that commit also removed the sanity check for time_esterror, but
that's not doing anything in the kernel other than being reset in
clear_ntp() and being handed back to user space. No idea what this is
actually used for.
Thanks,
tglx
* Re: [PATCH v2] ntp: remove accidental integer wrap-around
2024-05-24 12:09 ` Thomas Gleixner
2024-05-24 12:44 ` Thomas Gleixner
@ 2024-05-24 22:43 ` Justin Stitt
2024-05-24 22:54 ` Thomas Gleixner
1 sibling, 1 reply; 8+ messages in thread
From: Justin Stitt @ 2024-05-24 22:43 UTC (permalink / raw)
To: Thomas Gleixner
Cc: John Stultz, Stephen Boyd, Nathan Chancellor, Bill Wendling,
Nick Desaulniers, linux-kernel, llvm, linux-hardening,
Miroslav Lichvar
Thomas,
I appreciate you reviewing my patches.
On Fri, May 24, 2024 at 5:09 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> On Fri, May 17 2024 at 20:22, Justin Stitt wrote:
> > time_maxerror is unconditionally incremented and the result is checked
> > against NTP_PHASE_LIMIT, but the increment itself can overflow,
> > resulting in wrap-around to negative space.
> >
> > The user can supply some crazy values which is causing the overflow. Add
> > an extra validation step checking that maxerror is reasonable.
>
> The user can supply any value which can cause an overflow as the input
> is unchecked. Add ...
>
> Hmm?
>
> > diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
> > index b58dffc58a8f..321f251c02aa 100644
> > --- a/kernel/time/timekeeping.c
> > +++ b/kernel/time/timekeeping.c
> > @@ -2388,6 +2388,11 @@ static int timekeeping_validate_timex(const struct __kernel_timex *txc)
> > }
> > }
> >
> > + if (txc->modes & ADJ_MAXERROR) {
> > + if (txc->maxerror < 0 || txc->maxerror > NTP_PHASE_LIMIT)
> > + return -EINVAL;
> > + }
>
> I dug into history to find a Fixes tag. That unearthed something
> interesting. Exactly this check used to be there until commit
> eea83d896e31 ("ntp: NTP4 user space bits update") which landed in
> 2.6.30. The change log says:
Thanks for doing the archaeology.
>
> "If some values for adjtimex() are outside the acceptable range, they
> are now simply normalized instead of letting the syscall fail."
>
> The problem with that commit is that it did not do any normalization at
> all and just relied on the actual time_maxerror handling in
> second_overflow(), which is both insufficient and also prone to that
> overflow issue.
>
> So instead of turning the clock back, we might be better off to actually
> put the normalization in place at the assignment:
>
> time_maxerror = min(max(0, txc->maxerror), NTP_PHASE_LIMIT);
A saturating resolution strategy is one that I've taken with some of
my other overflow patches.
... but how about: clamp(txc->maxerror, 0, NTP_PHASE_LIMIT)
>
> or something like that.
>
> Miroslav: Any opinion on that?
>
> Thanks,
>
> tglx
Anyways, I'm waiting to see how the whole overflow/wraparound
discussion in general evolves and, of course, how the local discussion
about this patch shapes up.
Thanks
Justin
* Re: [PATCH v2] ntp: remove accidental integer wrap-around
2024-05-24 22:43 ` Justin Stitt
@ 2024-05-24 22:54 ` Thomas Gleixner
0 siblings, 0 replies; 8+ messages in thread
From: Thomas Gleixner @ 2024-05-24 22:54 UTC (permalink / raw)
To: Justin Stitt
Cc: John Stultz, Stephen Boyd, Nathan Chancellor, Bill Wendling,
Nick Desaulniers, linux-kernel, llvm, linux-hardening,
Miroslav Lichvar
Justin!
On Fri, May 24 2024 at 15:43, Justin Stitt wrote:
> I appreciate you reviewing my patches.
You're welcome!
> On Fri, May 24, 2024 at 5:09 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>> So instead of turning the clock back, we might be better off to actually
>> put the normalization in place at the assignment:
>>
>> time_maxerror = min(max(0, txc->maxerror), NTP_PHASE_LIMIT);
>
> A saturating resolution strategy is one that I've taken with some of
> my other overflow patches.
>
> ... but how about: clamp(txc->maxerror, 0, NTP_PHASE_LIMIT)
Duh. You are right, but that's too obvious :)
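For readers following along, the two spellings discussed above should produce the same result. The following is a minimal user-space sketch, not kernel code: the `sketch_*` names are hypothetical, and the limit is restated as 16 seconds expressed in microseconds, which is an assumption about the unit.

```c
/* Hypothetical stand-in for NTP_PHASE_LIMIT: 16 s, assumed to be in usec. */
#define SKETCH_PHASE_LIMIT 16000000L

static long sketch_min(long a, long b) { return a < b ? a : b; }
static long sketch_max(long a, long b) { return a > b ? a : b; }

/* The min(max(0, v), LIMIT) form from the first suggestion. */
static long sketch_normalize_minmax(long v)
{
	return sketch_min(sketch_max(0, v), SKETCH_PHASE_LIMIT);
}

/* The clamp(v, 0, LIMIT) form, written out by hand. */
static long sketch_normalize_clamp(long v)
{
	if (v < 0)
		return 0;
	if (v > SKETCH_PHASE_LIMIT)
		return SKETCH_PHASE_LIMIT;
	return v;
}
```

Either form saturates out-of-range input instead of rejecting it, so a later small increment of the stored value stays far away from the limits of `long`.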
Thanks,
tglx
* Re: [PATCH v2] ntp: remove accidental integer wrap-around
2024-05-24 12:44 ` Thomas Gleixner
@ 2024-05-27 8:26 ` Miroslav Lichvar
2024-05-29 8:18 ` Thomas Gleixner
0 siblings, 1 reply; 8+ messages in thread
From: Miroslav Lichvar @ 2024-05-27 8:26 UTC (permalink / raw)
To: Thomas Gleixner
Cc: Justin Stitt, John Stultz, Stephen Boyd, Nathan Chancellor,
Bill Wendling, Nick Desaulniers, linux-kernel, llvm,
linux-hardening
On Fri, May 24, 2024 at 02:44:19PM +0200, Thomas Gleixner wrote:
> On Fri, May 24 2024 at 14:09, Thomas Gleixner wrote:
> > So instead of turning the clock back, we might be better off to actually
> > put the normalization in place at the assignment:
> >
> > time_maxerror = min(max(0, txc->maxerror), NTP_PHASE_LIMIT);
> >
> > or something like that.
Yes, I think that's a better approach. Failing the system call could
break existing applications, e.g. ntpd can be configured to accept a
large root distance and it doesn't clamp the maxerror value, while
updating the PLL offset in the same adjtimex() call.
> So that commit also removed the sanity check for time_esterror, but
> that's not doing anything in the kernel other than being reset in
> clear_ntp() and being handed back to user space. No idea what this is
> actually used for.
It's a lower-bound estimate of the clock error, which applications can
check if it's acceptable for them. I think it should be clamped too.
It doesn't make much sense for it to be larger than the maximum error.
Another possible improvement of adjtimex() would be to set the UNSYNC
flag immediately in the call if maxerror >= 16s to avoid the delay of
up to 1 second for applications which check only that flag instead of
the maxerror value.
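A rough user-space sketch of that idea, under stated assumptions: the struct, the `sketch_*` names and the flag constant are hypothetical stand-ins rather than kernel symbols, and the limit is restated as 16 seconds in microseconds. It only illustrates raising the flag at assignment time instead of waiting for the next once-per-second update.

```c
/* Hypothetical stand-ins; the real kernel symbols and values may differ. */
#define SKETCH_PHASE_LIMIT 16000000L /* 16 s, assumed to be in usec */
#define SKETCH_STA_UNSYNC  0x0040    /* stand-in for the UNSYNC status bit */

struct sketch_ntp_state {
	long maxerror;
	int status;
};

/* Saturate the requested value and flag "unsynchronized" right away. */
static void sketch_set_maxerror(struct sketch_ntp_state *st, long requested)
{
	long v = requested;

	if (v < 0)
		v = 0;
	if (v >= SKETCH_PHASE_LIMIT) {
		v = SKETCH_PHASE_LIMIT;
		st->status |= SKETCH_STA_UNSYNC;
	}
	st->maxerror = v;
}
```

With this shape, an application that checks only the flag would see it in the same call that supplied the large value.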
--
Miroslav Lichvar
* Re: [PATCH v2] ntp: remove accidental integer wrap-around
2024-05-27 8:26 ` Miroslav Lichvar
@ 2024-05-29 8:18 ` Thomas Gleixner
0 siblings, 0 replies; 8+ messages in thread
From: Thomas Gleixner @ 2024-05-29 8:18 UTC (permalink / raw)
To: Miroslav Lichvar
Cc: Justin Stitt, John Stultz, Stephen Boyd, Nathan Chancellor,
Bill Wendling, Nick Desaulniers, linux-kernel, llvm,
linux-hardening
On Mon, May 27 2024 at 10:26, Miroslav Lichvar wrote:
> On Fri, May 24, 2024 at 02:44:19PM +0200, Thomas Gleixner wrote:
>> On Fri, May 24 2024 at 14:09, Thomas Gleixner wrote:
>> > So instead of turning the clock back, we might be better off to actually
>> > put the normalization in place at the assignment:
>> >
>> > time_maxerror = min(max(0, txc->maxerror), NTP_PHASE_LIMIT);
>> >
>> > or something like that.
>
> Yes, I think that's a better approach. Failing the system call could
> break existing applications, e.g. ntpd can be configured to accept a
> large root distance and it doesn't clamp the maxerror value, while
> updating the PLL offset in the same adjtimex() call.
Thanks for confirming. I suspected that, but the original change logs
are pretty useless in that regard.
>> So that commit also removed the sanity check for time_esterror, but
>> that's not doing anything in the kernel other than being reset in
>> clear_ntp() and being handed back to user space. No idea what this is
>> actually used for.
>
> It's a lower-bound estimate of the clock error, which applications can
> check if it's acceptable for them. I think it should be clamped too.
> It doesn't make much sense for it to be larger than the maximum error.
Ok.
> Another possible improvement of adjtimex() would be to set the UNSYNC
> flag immediately in the call if maxerror >= 16s to avoid the delay of
> up to 1 second for applications which check only that flag instead of
> the maxerror value.
That needs to be a separate change.
Thanks,
tglx
* [tip: timers/urgent] ntp: Clamp maxerror and esterror to operating range
2024-05-17 20:22 [PATCH v2] ntp: remove accidental integer wrap-around Justin Stitt
2024-05-24 12:09 ` Thomas Gleixner
@ 2024-08-05 14:22 ` tip-bot2 for Justin Stitt
1 sibling, 0 replies; 8+ messages in thread
From: tip-bot2 for Justin Stitt @ 2024-08-05 14:22 UTC (permalink / raw)
To: linux-tip-commits
Cc: Justin Stitt, Thomas Gleixner, Miroslav Lichvar, x86,
linux-kernel
The following commit has been merged into the timers/urgent branch of tip:
Commit-ID: 87d571d6fb77ec342a985afa8744bb9bb75b3622
Gitweb: https://git.kernel.org/tip/87d571d6fb77ec342a985afa8744bb9bb75b3622
Author: Justin Stitt <justinstitt@google.com>
AuthorDate: Fri, 17 May 2024 20:22:44
Committer: Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Mon, 05 Aug 2024 16:14:14 +02:00
ntp: Clamp maxerror and esterror to operating range
Using syzkaller alongside the newly reintroduced signed integer overflow
sanitizer spits out this report:
UBSAN: signed-integer-overflow in ../kernel/time/ntp.c:461:16
9223372036854775807 + 500 cannot be represented in type 'long'
Call Trace:
handle_overflow+0x171/0x1b0
second_overflow+0x2d6/0x500
accumulate_nsecs_to_secs+0x60/0x160
timekeeping_advance+0x1fe/0x890
update_wall_time+0x10/0x30
time_maxerror is unconditionally incremented and the result is checked
against NTP_PHASE_LIMIT, but the increment itself can overflow, resulting
in wrap-around to negative space.
Before commit eea83d896e31 ("ntp: NTP4 user space bits update") the user
supplied value was sanity checked to be in the operating range. That change
removed the sanity check and relied on clamping in second_overflow() which
does not work correctly when the user supplied value is in the overflow
zone of the '+ 500' operation.
The operation requires CAP_SYS_TIME and the side effect of the overflow is
NTP getting out of sync.
Miroslav confirmed that the input value should be clamped to the operating
range and the same applies to time_esterror. The latter is not used by the
kernel, but the value still should be in the operating range as it was
before the sanity check got removed.
Clamp them to the operating range.
[ tglx: Changed it to clamping and included time_esterror ]
Fixes: eea83d896e31 ("ntp: NTP4 user space bits update")
Signed-off-by: Justin Stitt <justinstitt@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Miroslav Lichvar <mlichvar@redhat.com>
Link: https://lore.kernel.org/all/20240517-b4-sio-ntp-usec-v2-1-d539180f2b79@google.com
Closes: https://github.com/KSPP/linux/issues/354
---
kernel/time/ntp.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/kernel/time/ntp.c b/kernel/time/ntp.c
index 406dccb..502e1e5 100644
--- a/kernel/time/ntp.c
+++ b/kernel/time/ntp.c
@@ -727,10 +727,10 @@ static inline void process_adjtimex_modes(const struct __kernel_timex *txc,
}
if (txc->modes & ADJ_MAXERROR)
- time_maxerror = txc->maxerror;
+ time_maxerror = clamp(txc->maxerror, 0, NTP_PHASE_LIMIT);
if (txc->modes & ADJ_ESTERROR)
- time_esterror = txc->esterror;
+ time_esterror = clamp(txc->esterror, 0, NTP_PHASE_LIMIT);
if (txc->modes & ADJ_TIMECONST) {
time_constant = txc->constant;