* [PATCH] Prevent the loop in timespec_add_ns() to be optimised away
@ 2008-02-22 21:40 Segher Boessenkool
  2008-02-27 23:43 ` Andrew Morton
From: Segher Boessenkool @ 2008-02-22 21:40 UTC
  To: linux-kernel; +Cc: Segher Boessenkool

...since some architectures don't support __udivdi3() (and
we don't want to use that, anyway).

Signed-off-by: Segher Boessenkool <segher@kernel.crashing.org>
---
 include/linux/time.h |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/include/linux/time.h b/include/linux/time.h
index 2091a19..d32ef0a 100644
--- a/include/linux/time.h
+++ b/include/linux/time.h
@@ -174,6 +174,10 @@ static inline void timespec_add_ns(struct timespec *a, u64 ns)
 {
 	ns += a->tv_nsec;
 	while(unlikely(ns >= NSEC_PER_SEC)) {
+		/* The following asm() prevents the compiler from
+		 * optimising this loop into a modulo operation.  */
+		asm("" : "+r"(ns));
+
 		ns -= NSEC_PER_SEC;
 		a->tv_sec++;
 	}
-- 
1.5.4.2.184.gb23b
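The technique in the patch can be tried outside the kernel. The following is a minimal userspace sketch, assuming GCC or Clang (it relies on GNU extended asm). The names `ts_sketch` and `ts_add_ns` are hypothetical stand-ins for the kernel's `struct timespec` and `timespec_add_ns()`, and the final store to `tv_nsec` is an assumption about the part of the function that lies outside the diff context. The empty asm statement emits no instructions; the `"+r"(ns)` operand only tells the compiler that `ns` may be read and modified there, so it can no longer prove what the loop computes and has to keep it as a loop.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Hypothetical userspace stand-in for the kernel's struct timespec. */
struct ts_sketch {
	long tv_sec;
	long tv_nsec;
};

/* Add ns nanoseconds to *a, carrying whole seconds into tv_sec. */
static inline void ts_add_ns(struct ts_sketch *a, uint64_t ns)
{
	ns += a->tv_nsec;
	while (ns >= NSEC_PER_SEC) {
		/*
		 * Empty asm with ns as an in/out register operand: no code
		 * is generated, but the compiler must assume ns may change
		 * here, so it cannot rewrite the loop as a divide/modulo.
		 */
		__asm__("" : "+r"(ns));

		ns -= NSEC_PER_SEC;
		a->tv_sec++;
	}
	/* Assumed final step: store the remaining sub-second part. */
	a->tv_nsec = (long)ns;
}
```

Comparing the generated assembly with and without the asm statement (for example with `-O2 -S` on a 32-bit target) is one way to see whether a given compiler version would otherwise have emitted a division.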
* Re: [PATCH] Prevent the loop in timespec_add_ns() to be optimised away
  2008-02-22 21:40 [PATCH] Prevent the loop in timespec_add_ns() to be optimised away Segher Boessenkool
@ 2008-02-27 23:43 ` Andrew Morton
  2008-02-28 21:58   ` Thomas Gleixner
From: Andrew Morton @ 2008-02-27 23:43 UTC
  To: Segher Boessenkool
  Cc: linux-kernel, segher, Ingo Molnar, Thomas Gleixner, john stultz

On Fri, 22 Feb 2008 22:40:45 +0100
Segher Boessenkool <segher@kernel.crashing.org> wrote:

> ...since some architectures don't support __udivdi3() (and
> we don't want to use that, anyway).
>
> Signed-off-by: Segher Boessenkool <segher@kernel.crashing.org>
> ---
>  include/linux/time.h |    4 ++++
>  1 files changed, 4 insertions(+), 0 deletions(-)
>
> diff --git a/include/linux/time.h b/include/linux/time.h
> index 2091a19..d32ef0a 100644
> --- a/include/linux/time.h
> +++ b/include/linux/time.h
> @@ -174,6 +174,10 @@ static inline void timespec_add_ns(struct timespec *a, u64 ns)
>  {
>  	ns += a->tv_nsec;
>  	while(unlikely(ns >= NSEC_PER_SEC)) {
> +		/* The following asm() prevents the compiler from
> +		 * optimising this loop into a modulo operation.  */
> +		asm("" : "+r"(ns));
> +
>  		ns -= NSEC_PER_SEC;
>  		a->tv_sec++;
>  	}

It's pretty sad that we need to turn this into a loop just because of the
__udivdi3() thing.

otoh, it's rarely occurring, and it could be that the number of times it
loops is usually 1 (if it wasn't zero), so perhaps a loop is faster than a
divide anyway.

This code is probably too large to be inlined.

I queued this patch as needed-in-2.6.25, to-be-merged-via-Thomas.
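For comparison, the divide-based form that the loop stands in for might look like the sketch below. This is an illustration only, not code from the thread: `ts_div_sketch` and `ts_add_ns_div` are hypothetical names. On 32-bit targets GCC typically implements a 64-bit `/` and `%` as calls to the libgcc helpers `__udivdi3()` and `__umoddi3()`, which is the dependency the patch description wants to avoid.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Hypothetical userspace stand-in for the kernel's struct timespec. */
struct ts_div_sketch {
	long tv_sec;
	long tv_nsec;
};

/*
 * Same result as the subtract-in-a-loop version, written with explicit
 * 64-bit division. On a 32-bit target these two operators are usually
 * compiled into calls to libgcc's __udivdi3() and __umoddi3().
 */
static void ts_add_ns_div(struct ts_div_sketch *a, uint64_t ns)
{
	ns += a->tv_nsec;
	a->tv_sec += (long)(ns / NSEC_PER_SEC);
	a->tv_nsec = (long)(ns % NSEC_PER_SEC);
}
```

When the sum stays below two seconds, the loop form does at most one compare and subtract, which is the case the reply above guesses is the common one.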
* Re: [PATCH] Prevent the loop in timespec_add_ns() to be optimised away
  2008-02-27 23:43 ` Andrew Morton
@ 2008-02-28 21:58   ` Thomas Gleixner
  2008-02-28 22:13     ` Andrew Morton
From: Thomas Gleixner @ 2008-02-28 21:58 UTC
  To: Andrew Morton; +Cc: Segher Boessenkool, linux-kernel, Ingo Molnar, john stultz

On Wed, 27 Feb 2008, Andrew Morton wrote:

> On Fri, 22 Feb 2008 22:40:45 +0100
> Segher Boessenkool <segher@kernel.crashing.org> wrote:
>
> > ...since some architectures don't support __udivdi3() (and
> > we don't want to use that, anyway).
> >
> > Signed-off-by: Segher Boessenkool <segher@kernel.crashing.org>
> > ---
> >  include/linux/time.h |    4 ++++
> >  1 files changed, 4 insertions(+), 0 deletions(-)
> >
> > diff --git a/include/linux/time.h b/include/linux/time.h
> > index 2091a19..d32ef0a 100644
> > --- a/include/linux/time.h
> > +++ b/include/linux/time.h
> > @@ -174,6 +174,10 @@ static inline void timespec_add_ns(struct timespec *a, u64 ns)
> >  {
> >  	ns += a->tv_nsec;
> >  	while(unlikely(ns >= NSEC_PER_SEC)) {
> > +		/* The following asm() prevents the compiler from
> > +		 * optimising this loop into a modulo operation.  */
> > +		asm("" : "+r"(ns));
> > +
> >  		ns -= NSEC_PER_SEC;
> >  		a->tv_sec++;
> >  	}
>
> It's pretty sad that we need to turn this into a loop just because of the
> __udivdi3() thing.
>
> otoh, it's rarely occurring, and it could be that the number of times it
> loops is usually 1 (if it wasn't zero), so perhaps a loop is faster than a
> divide anyway.
>
> This code is probably too large to be inlined.
>
> I queued this patch as needed-in-2.6.25, to-be-merged-via-Thomas.

Are you going to send it or should I grab it from the mailing list
myself ?

Thanks,
	tglx
* Re: [PATCH] Prevent the loop in timespec_add_ns() to be optimised away
  2008-02-28 21:58   ` Thomas Gleixner
@ 2008-02-28 22:13     ` Andrew Morton
From: Andrew Morton @ 2008-02-28 22:13 UTC
  To: Thomas Gleixner; +Cc: segher, linux-kernel, mingo, johnstul

On Thu, 28 Feb 2008 22:58:49 +0100 (CET)
Thomas Gleixner <tglx@linutronix.de> wrote:

> On Wed, 27 Feb 2008, Andrew Morton wrote:
>
> > On Fri, 22 Feb 2008 22:40:45 +0100
> > Segher Boessenkool <segher@kernel.crashing.org> wrote:
> >
> > > ...since some architectures don't support __udivdi3() (and
> > > we don't want to use that, anyway).
> > >
> > > Signed-off-by: Segher Boessenkool <segher@kernel.crashing.org>
> > > ---
> > >  include/linux/time.h |    4 ++++
> > >  1 files changed, 4 insertions(+), 0 deletions(-)
> > >
> > > diff --git a/include/linux/time.h b/include/linux/time.h
> > > index 2091a19..d32ef0a 100644
> > > --- a/include/linux/time.h
> > > +++ b/include/linux/time.h
> > > @@ -174,6 +174,10 @@ static inline void timespec_add_ns(struct timespec *a, u64 ns)
> > >  {
> > >  	ns += a->tv_nsec;
> > >  	while(unlikely(ns >= NSEC_PER_SEC)) {
> > > +		/* The following asm() prevents the compiler from
> > > +		 * optimising this loop into a modulo operation.  */
> > > +		asm("" : "+r"(ns));
> > > +
> > >  		ns -= NSEC_PER_SEC;
> > >  		a->tv_sec++;
> > >  	}
> >
> > It's pretty sad that we need to turn this into a loop just because of the
> > __udivdi3() thing.
> >
> > otoh, it's rarely occurring, and it could be that the number of times it
> > loops is usually 1 (if it wasn't zero), so perhaps a loop is faster than a
> > divide anyway.
> >
> > This code is probably too large to be inlined.
> >
> > I queued this patch as needed-in-2.6.25, to-be-merged-via-Thomas.
>
> Are you going to send it or should I grab it from the mailing list
> myself ?
>

My secret-undocumented protocol for handling a patch which I think is
needed in 2.6.current and which is handled via a subsystem tree is:

- queue it up behind all the subsystem trees in -mm's "2.6.25:" section

  - actually it's now "2.6.25"'s "others:" section.  I've split the
    "2.6.25:" section into two parts: "me:" and "others:".

- drop it if it turns up in a subsystem tree

  - and hope like hell that whoever merged it has considered whether it's
    needed in the current kernel cycle, because I just lost track of it.

- let it cook for a few days or more, to give time for people to comment,
  and perhaps to get it into a -mm release

- send it out to the subsystem maintainer tagged as "[patch (for 2.6.25?)]"

- repeat.

patches which I currently have in this state are:

x86-cast-cmpxchg-and-cmpxchg_local-result-for-386-and-486.patch
x86-fix-clearcopy_user_page-declarations-in-pageh.patch
x86-visws-fix-printk-format-warnings.patch
mpt-fusion-dont-oops-if-numphys==0.patch
bluetooth-hci_core-defer-hci_unregister_sysfs.patch
e100-do-suspend-shutdown-like-e1000.patch
dm-raid1-bitops-bug.patch
iova-lockdep-false-alarm-fix.patch
bluetooth-conwise-technology-based-adapters-with-buggy-sco-support-bugzilla-9027.patch
ieee80211-fix-broken-error-handling-in-ieee80211_sta_process_addba_request.patch
fix-typo-in-tick-broadcastc.patch
i8042-use-sgi_has_i8042-to-select-sgi-i8042-handlinig.patch
fix-b43-driver-build-for-arm.patch
apanel-fix-kconfig-dependencies.patch
de2104x-remove-bug_on-when-changing-media-type.patch
scsi-fix-cd-burning-regression.patch
arm-fix-freeing-of-page-tables-in-free_pgd_slow.patch
time-prevent-the-loop-in-timespec_add_ns-from-being-optimised-away.patch
acpi-ec-fix-regression.patch
drivers-acpi-asus_acpic-correct-use-of-and.patch
drivers-media-video-em28xx-correct-use-of-and.patch
drivers-media-video-em28xx-correct-use-of-and-fix.patch
drivers-net-wireless-iwlwifi-iwl-4965c-correct-use-of-and.patch
time-dont-touch-an-offlined-cpus-ts-tick_stopped-in-tick_cancel_sched_timer.patch
scsi-arcmsr-update-driver-version.patch

the way to stop me from continuing to loop is to send me an "acked-by,
please send to Linus for 2.6.25".  I'll then move the patch into -mm's
"me:" section (or even right up next to origin.patch if it's
simple-and-obvious) for sending to LT.