From: Daniel Thompson <daniel.thompson@linaro.org>
To: Stephen Boyd <sboyd@codeaurora.org>
Cc: Thomas Gleixner <tglx@linutronix.de>,
	John Stultz <john.stultz@linaro.org>,
	linux-kernel@vger.kernel.org, patches@linaro.org,
	linaro-kernel@lists.linaro.org,
	Sumit Semwal <sumit.semwal@linaro.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	Russell King <linux@arm.linux.org.uk>,
	Will Deacon <will.deacon@arm.com>,
	Catalin Marinas <catalin.marinas@arm.com>
Subject: Re: [PATCH v3 2/4] sched_clock: Optimize cache line usage
Date: Thu, 05 Feb 2015 10:21:32 +0000
Message-ID: <54D3442C.3090205@linaro.org>
In-Reply-To: <20150205011407.GB30372@codeaurora.org>

On 05/02/15 01:14, Stephen Boyd wrote:
> On 01/30, Daniel Thompson wrote:
>> diff --git a/kernel/time/sched_clock.c b/kernel/time/sched_clock.c
>> index 3d21a8719444..cb69a47dfee4 100644
>> --- a/kernel/time/sched_clock.c
>> +++ b/kernel/time/sched_clock.c
>> @@ -18,28 +18,44 @@
>>  #include <linux/seqlock.h>
>>  #include <linux/bitops.h>
>>  
>> -struct clock_data {
>> -	ktime_t wrap_kt;
>> +/**
>> + * struct clock_read_data - data required to read from sched_clock
>> + *
> 
> Nitpick: Won't kernel-doc complain that members aren't
> documented?

It does indeed. I'll add descriptions here...


>> + * Care must be taken when updating this structure; it is read by
>> + * some very hot code paths. It occupies <=48 bytes and, when combined
>> + * with the seqcount used to synchronize access, comfortably fits into
>> + * a 64 byte cache line.
>> + */
>> +struct clock_read_data {
>>  	u64 epoch_ns;
>>  	u64 epoch_cyc;
>> -	seqcount_t seq;
>> -	unsigned long rate;
>> +	u64 sched_clock_mask;
>> +	u64 (*read_sched_clock)(void);
>>  	u32 mult;
>>  	u32 shift;
>>  	bool suspended;
>>  };
>>  
>> +/**
>> + * struct clock_data - all data needed for sched_clock (including
>> + *                     registration of a new clock source)
>> + *
> 
> Same comment.

... and here.


>> + * The ordering of this structure has been chosen to optimize cache
>> + * performance. In particular seq and read_data (combined) should fit
>> + * into a single 64 byte cache line.
>> + */
>> +struct clock_data {
>> +	seqcount_t seq;
>> +	struct clock_read_data read_data;
>> +	ktime_t wrap_kt;
>> +	unsigned long rate;
>> +};
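
The size claims in the comments quoted above (read data of at most 48
bytes, and seq plus read_data inside one 64 byte cache line) could be
checked at compile time. What follows is only a userspace sketch, not
kernel code: the model_* names are hypothetical, seqcount_t is assumed to
be a single unsigned int (no lockdep fields), ktime_t is assumed to be a
64-bit integer, and the 64 byte line size is taken from the comment
rather than from any particular CPU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: a seqcount without debug fields is one unsigned int. */
typedef struct { unsigned int sequence; } model_seqcount_t;

/* Mirrors the field order of clock_read_data in the quoted patch. */
struct model_read_data {
	uint64_t epoch_ns;
	uint64_t epoch_cyc;
	uint64_t sched_clock_mask;
	uint64_t (*read_sched_clock)(void);
	uint32_t mult;
	uint32_t shift;
	bool suspended;
};

/* Mirrors the field order of clock_data in the quoted patch. */
struct model_clock_data {
	model_seqcount_t seq;
	struct model_read_data read_data;
	int64_t wrap_kt;	/* assumption: ktime_t modelled as int64_t */
	unsigned long rate;
};

#define MODEL_CACHE_LINE 64	/* line size assumed by the quoted comment */

/* The build fails if the hot data grows past the documented limits. */
static_assert(sizeof(struct model_read_data) <= 48,
	      "read data exceeds 48 bytes");
static_assert(offsetof(struct model_clock_data, wrap_kt) <= MODEL_CACHE_LINE,
	      "seq + read_data spill into a second cache line");

/* Runtime view of the same property, handy in a unit test. */
static bool model_hot_data_fits_one_line(void)
{
	size_t end = offsetof(struct model_clock_data, read_data) +
		     sizeof(struct model_read_data);

	return end <= MODEL_CACHE_LINE;
}
```

Note that this only constrains sizes and offsets. Whether the structure
really sits in one line also depends on how the instance is aligned,
which the sketch does not address.
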
>> @@ -60,15 +79,16 @@ unsigned long long notrace sched_clock(void)
>>  {
>>  	u64 cyc, res;
>>  	unsigned long seq;
>> +	struct clock_read_data *rd = &cd.read_data;
>>  
>>  	do {
>>  		seq = raw_read_seqcount_begin(&cd.seq);
>>  
>> -		res = cd.epoch_ns;
>> -		if (!cd.suspended) {
>> -			cyc = read_sched_clock();
>> -			cyc = (cyc - cd.epoch_cyc) & sched_clock_mask;
>> -			res += cyc_to_ns(cyc, cd.mult, cd.shift);
>> +		res = rd->epoch_ns;
>> +		if (!rd->suspended) {
> 
> Should this have likely() treatment? It would be really nice if
> we could use static branches here to avoid any branch penalty at
> all. I guess that would need some sort of special cased
> stop_machine() though. Or I wonder if we could replace
> rd->read_sched_clock() with a dumb function that returns
> cd.epoch_cyc so that the math turns out to be 0?

Great idea.

Making this code branchless with a special function sounds very much
better than using likely().
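
To make the suggestion concrete, here is a rough userspace model of the
"dumb function" approach. It is a sketch under stated assumptions, not
the eventual patch: the seqcount protection is left out entirely, and
fake_counter, real_read(), suspended_read() and the model_* helpers are
hypothetical names invented for illustration. While suspended, the read
hook returns epoch_cyc, so the cycle delta is 0 and the reader needs no
branch.

```c
#include <assert.h>
#include <stdint.h>

/* Field names follow the quoted patch, minus the 'suspended' flag. */
struct clock_read_data {
	uint64_t epoch_ns;
	uint64_t epoch_cyc;
	uint64_t sched_clock_mask;
	uint64_t (*read_sched_clock)(void);
	uint32_t mult;
	uint32_t shift;
};

static struct clock_read_data rd;
static uint64_t fake_counter;	/* hypothetical stand-in for a hardware counter */

static uint64_t real_read(void)
{
	return fake_counter;
}

/* Returns the epoch itself, so (cyc - epoch_cyc) is always 0. */
static uint64_t suspended_read(void)
{
	return rd.epoch_cyc;
}

static uint64_t cyc_to_ns(uint64_t cyc, uint32_t mult, uint32_t shift)
{
	return (cyc * mult) >> shift;
}

/* Branchless reader: no test of a 'suspended' flag is needed. */
static uint64_t model_sched_clock(void)
{
	uint64_t cyc = rd.read_sched_clock();

	cyc = (cyc - rd.epoch_cyc) & rd.sched_clock_mask;
	return rd.epoch_ns + cyc_to_ns(cyc, rd.mult, rd.shift);
}

/* Fold the elapsed time into the epoch, then freeze the clock. */
static void model_suspend(void)
{
	uint64_t cyc = real_read();
	uint64_t delta = (cyc - rd.epoch_cyc) & rd.sched_clock_mask;

	rd.epoch_ns += cyc_to_ns(delta, rd.mult, rd.shift);
	rd.epoch_cyc = cyc;
	rd.read_sched_clock = suspended_read;
}

/* Restart from the current counter value with the real read hook. */
static void model_resume(void)
{
	rd.epoch_cyc = real_read();
	rd.read_sched_clock = real_read;
}
```

In this model the cost moves from a conditional branch in the reader to
an indirect call that was already there, and the suspend and resume
paths become responsible for swapping the hook.
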

