* [PATCH] kernel/timekeeping: move xtime_cache to be in the same cache line as the lock
@ 2010-01-21 15:39 Richard Kennedy
2010-01-21 17:19 ` john stultz
2010-01-26 23:28 ` Andrew Morton
0 siblings, 2 replies; 6+ messages in thread
From: Richard Kennedy @ 2010-01-21 15:39 UTC (permalink / raw)
To: John Stultz, Thomas Gleixner, Martin Schwidefsky
Cc: Ingo Molnar, Andrew Morton, lkml
move xtime_cache to be in the same cache line as the lock
allowing current_kernel_time() to access only one cache line
when running fio write tests on a 2 core machine, on some of the runs
'perf record -e cache_misses' shows current_kernel_time near the top of
the list of cache_misses with 5.5%.
On the other runs it's down at 0.05% so I'm assuming that the difference
is just down to which core the test client get run on.
This patch moves the xtime_cache variable near to the lock so that it
only need to access one cache line.
With this applied it drops the current_kernel_time cache_misses in the
slow case to 4.5%
Signed-off-by: Richard Kennedy <richard@rsk.demon.co.uk>
---
patch against v2.6.33-rc4
compiled & tested on AMD64X2 x86_64
BTW on 64 bit timespec is a 16 byte structure so the aligned 16 doesn't
do much, and on 32bit timepec is 8bytes so this just seems to spread
these variables across more cache lines than necessary. Any ideas what
this is here for?
regards
Richard
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 7faaa32..657e861 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -137,6 +137,7 @@ static inline s64 timekeeping_get_ns_raw(void)
*/
__cacheline_aligned_in_smp DEFINE_SEQLOCK(xtime_lock);
+static struct timespec xtime_cache __attribute__ ((aligned (16)));
/*
* The current time
@@ -165,7 +166,6 @@ struct timespec raw_time;
/* flag for if timekeeping is suspended */
int __read_mostly timekeeping_suspended;
-static struct timespec xtime_cache __attribute__ ((aligned (16)));
void update_xtime_cache(u64 nsec)
{
xtime_cache = xtime;
^ permalink raw reply related [flat|nested] 6+ messages in thread* Re: [PATCH] kernel/timekeeping: move xtime_cache to be in the same cache line as the lock 2010-01-21 15:39 [PATCH] kernel/timekeeping: move xtime_cache to be in the same cache line as the lock Richard Kennedy @ 2010-01-21 17:19 ` john stultz 2010-01-22 11:10 ` Richard Kennedy 2010-01-26 23:28 ` Andrew Morton 1 sibling, 1 reply; 6+ messages in thread From: john stultz @ 2010-01-21 17:19 UTC (permalink / raw) To: Richard Kennedy Cc: Thomas Gleixner, Martin Schwidefsky, Ingo Molnar, Andrew Morton, lkml On Thu, 2010-01-21 at 15:39 +0000, Richard Kennedy wrote: > move xtime_cache to be in the same cache line as the lock > > allowing current_kernel_time() to access only one cache line > > when running fio write tests on a 2 core machine, on some of the runs > 'perf record -e cache_misses' shows current_kernel_time near the top of > the list of cache_misses with 5.5%. > On the other runs it's down at 0.05% so I'm assuming that the difference > is just down to which core the test client get run on. > > This patch moves the xtime_cache variable near to the lock so that it > only need to access one cache line. > With this applied it drops the current_kernel_time cache_misses in the > slow case to 4.5% > > Signed-off-by: Richard Kennedy <richard@rsk.demon.co.uk> Hrm.. I'm hoping to kill off the xtime_cache at some point soon, so I'm not sure if this patch will do much for long. That said, I'm not opposed to it in the mean time, and when xtime_cache does get yanked, I'd appreciate similar performance review to make sure we're not regressing. > --- > patch against v2.6.33-rc4 > compiled & tested on AMD64X2 x86_64 > > > BTW on 64 bit timespec is a 16 byte structure so the aligned 16 doesn't > do much, and on 32bit timepec is 8bytes so this just seems to spread > these variables across more cache lines than necessary. Any ideas what > this is here for? I think it was a copy-paste from the xtime and wall_to_monotonic definitions, which both have the same alignment. thanks -john ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] kernel/timekeeping: move xtime_cache to be in the same cache line as the lock 2010-01-21 17:19 ` john stultz @ 2010-01-22 11:10 ` Richard Kennedy 0 siblings, 0 replies; 6+ messages in thread From: Richard Kennedy @ 2010-01-22 11:10 UTC (permalink / raw) To: john stultz Cc: Thomas Gleixner, Martin Schwidefsky, Ingo Molnar, Andrew Morton, lkml On Thu, 2010-01-21 at 09:19 -0800, john stultz wrote: > > Hrm.. I'm hoping to kill off the xtime_cache at some point soon, so I'm > not sure if this patch will do much for long. That said, I'm not opposed > to it in the mean time, and when xtime_cache does get yanked, I'd > appreciate similar performance review to make sure we're not regressing. > OK, removing it will be even better. I can re-run the test anytime you like, just let me know if you've got a patch that needs testing. > > --- > > patch against v2.6.33-rc4 > > compiled & tested on AMD64X2 x86_64 > > > > > > BTW on 64 bit timespec is a 16 byte structure so the aligned 16 doesn't > > do much, and on 32bit timepec is 8bytes so this just seems to spread > > these variables across more cache lines than necessary. Any ideas what > > this is here for? > > I think it was a copy-paste from the xtime and wall_to_monotonic > definitions, which both have the same alignment. Yes, that's what I thought. In that case I think we can remove all of those attribute aligned, which should be a small improvement for 32 bit builds. regards Richard ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] kernel/timekeeping: move xtime_cache to be in the same cache line as the lock 2010-01-21 15:39 [PATCH] kernel/timekeeping: move xtime_cache to be in the same cache line as the lock Richard Kennedy 2010-01-21 17:19 ` john stultz @ 2010-01-26 23:28 ` Andrew Morton 2010-01-27 12:10 ` Richard Kennedy 1 sibling, 1 reply; 6+ messages in thread From: Andrew Morton @ 2010-01-26 23:28 UTC (permalink / raw) To: Richard Kennedy Cc: John Stultz, Thomas Gleixner, Martin Schwidefsky, Ingo Molnar, lkml On Thu, 21 Jan 2010 15:39:21 +0000 Richard Kennedy <richard@rsk.demon.co.uk> wrote: > move xtime_cache to be in the same cache line as the lock > > allowing current_kernel_time() to access only one cache line Sentences start with capital letters, please. > when running fio write tests on a 2 core machine, on some of the runs > 'perf record -e cache_misses' shows current_kernel_time near the top of > the list of cache_misses with 5.5%. > On the other runs it's down at 0.05% so I'm assuming that the difference > is just down to which core the test client get run on. > > This patch moves the xtime_cache variable near to the lock so that it > only need to access one cache line. > With this applied it drops the current_kernel_time cache_misses in the > slow case to 4.5% > I don't know how reliable this is. I _think_ the compiler and linker are free to place variables of this nature in any old place. Whether any of the current tools actually do that I don't know. Note that one of these variables has file-static scope and the other does not, which perhaps increases the risk that the compiler or linker will go and fiddle with them. To do this reliably one would need to put them in a struct: time.h: extern struct xtime_stuff { seqlock_t _xtime_lock, struct timespec _xtime_cache, } xtime_stuff; #define xtime_lock xtime_stuff._xtime_lock timekeeping.c: struct xtime_stuff { ._xtime_lock = __SEQLOCK_UNLOCKED(xtime_stuff._xtime_lock), }; > BTW on 64 bit timespec is a 16 byte structure so the aligned 16 doesn't > do much, and on 32bit timepec is 8bytes so this just seems to spread > these variables across more cache lines than necessary. Any ideas what > this is here for? Dunno. I had a bit of a peek in the git history but it got complicated and people rarely bother explaining things like this anyway :( ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] kernel/timekeeping: move xtime_cache to be in the same cache line as the lock 2010-01-26 23:28 ` Andrew Morton @ 2010-01-27 12:10 ` Richard Kennedy 2010-01-28 20:16 ` john stultz 0 siblings, 1 reply; 6+ messages in thread From: Richard Kennedy @ 2010-01-27 12:10 UTC (permalink / raw) To: Andrew Morton Cc: John Stultz, Thomas Gleixner, Martin Schwidefsky, Ingo Molnar, lkml On Tue, 2010-01-26 at 15:28 -0800, Andrew Morton wrote: > On Thu, 21 Jan 2010 15:39:21 +0000 > Richard Kennedy <richard@rsk.demon.co.uk> wrote: > > > move xtime_cache to be in the same cache line as the lock > > > > allowing current_kernel_time() to access only one cache line > > Sentences start with capital letters, please. Sorry about that, I will try harder in future ;) > > I don't know how reliable this is. I _think_ the compiler and linker > are free to place variables of this nature in any old place. Whether > any of the current tools actually do that I don't know. Note that one > of these variables has file-static scope and the other does not, which > perhaps increases the risk that the compiler or linker will go and > fiddle with them. > > To do this reliably one would need to put them in a struct: > > time.h: > > extern struct xtime_stuff { > seqlock_t _xtime_lock, > struct timespec _xtime_cache, > } xtime_stuff; > > #define xtime_lock xtime_stuff._xtime_lock > > > timekeeping.c: > > struct xtime_stuff { > ._xtime_lock = __SEQLOCK_UNLOCKED(xtime_stuff._xtime_lock), > }; Thank you, yes that looks like a much better approach. I can do this if it's needed, but John Stultz said he's going to kill the xtime_cache anyway, so it may not be worth it? However I do wonder if we should move all, or at least some, of the variables protected by that xtime_lock into that structure? Then we can manage their placement and they would be easier to find. After only a brief look I see variables in ntp, tick & timekeeping that seem to be protected by that seqlock. > > BTW on 64 bit timespec is a 16 byte structure so the aligned 16 doesn't > > do much, and on 32bit timepec is 8bytes so this just seems to spread > > these variables across more cache lines than necessary. Any ideas what > > this is here for? > > Dunno. I had a bit of a peek in the git history but it got complicated > and people rarely bother explaining things like this anyway :( > regards Richard ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] kernel/timekeeping: move xtime_cache to be in the same cache line as the lock 2010-01-27 12:10 ` Richard Kennedy @ 2010-01-28 20:16 ` john stultz 0 siblings, 0 replies; 6+ messages in thread From: john stultz @ 2010-01-28 20:16 UTC (permalink / raw) To: Richard Kennedy Cc: Andrew Morton, Thomas Gleixner, Martin Schwidefsky, Ingo Molnar, lkml On Wed, 2010-01-27 at 12:10 +0000, Richard Kennedy wrote: > On Tue, 2010-01-26 at 15:28 -0800, Andrew Morton wrote: > > On Thu, 21 Jan 2010 15:39:21 +0000 > > Richard Kennedy <richard@rsk.demon.co.uk> wrote: > > > > > move xtime_cache to be in the same cache line as the lock > > > > > > allowing current_kernel_time() to access only one cache line > > > > Sentences start with capital letters, please. > > Sorry about that, I will try harder in future ;) > > > > > > I don't know how reliable this is. I _think_ the compiler and linker > > are free to place variables of this nature in any old place. Whether > > any of the current tools actually do that I don't know. Note that one > > of these variables has file-static scope and the other does not, which > > perhaps increases the risk that the compiler or linker will go and > > fiddle with them. > > > > To do this reliably one would need to put them in a struct: > > > > time.h: > > > > extern struct xtime_stuff { > > seqlock_t _xtime_lock, > > struct timespec _xtime_cache, > > } xtime_stuff; > > > > #define xtime_lock xtime_stuff._xtime_lock > > > > > > timekeeping.c: > > > > struct xtime_stuff { > > ._xtime_lock = __SEQLOCK_UNLOCKED(xtime_stuff._xtime_lock), > > }; > Thank you, yes that looks like a much better approach. > I can do this if it's needed, but John Stultz said he's going to kill > the xtime_cache anyway, so it may not be worth it? > > However I do wonder if we should move all, or at least some, of the > variables protected by that xtime_lock into that structure? Then we can > manage their placement and they would be easier to find. After only a > brief look I see variables in ntp, tick & timekeeping that seem to be > protected by that seqlock. The xtime lock has been used to protect quite a bit, but that has been slowly cleaned up. Luckily process accounting is no longer done under it (although load time accounting still is - not that its used on the read-side if I recall), but jiffies, tick_next_period, almost all of the values in ntp.c, and the timekeeper structure and friends are all covered by it. Personally I'd love to see the timekeeper structure gain a lock that protects its elements, further removing the need for the external xtime_lock. The timekeeper structure should also pull in xtime and wall_to_monotonic, but there's still quite a bit of arch specific code that needs to be fixed first (see my arch_gettimeoffset patches and read/update_persistent_clock patches from end of December). Trying to resolve the ntp.c and timekeeping.c xtime_lock usage is a little harder. Keeping those files and structures separate makes the time code slightly less confusing (though maybe not as much as I'd hope), but the functionality is so tightly coupled that splitting the xtime lock into two separate locks seems like a little too much. But maybe you have some interesting ideas there? So by all means, please take a shot at it. I'm probably prone to being too conservative with making changes in the hopes that we don't cause too many regressions and if we do they're easy to find. But folks like Martin have been able to make some great cleanups without the world falling apart, so maybe my turtle's pace is unwarranted. thanks -john ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2010-01-28 20:16 UTC | newest] Thread overview: 6+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2010-01-21 15:39 [PATCH] kernel/timekeeping: move xtime_cache to be in the same cache line as the lock Richard Kennedy 2010-01-21 17:19 ` john stultz 2010-01-22 11:10 ` Richard Kennedy 2010-01-26 23:28 ` Andrew Morton 2010-01-27 12:10 ` Richard Kennedy 2010-01-28 20:16 ` john stultz
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox