xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
From: Malcolm Crossley <malcolm.crossley@citrix.com>
To: George Dunlap <george.dunlap@citrix.com>,
	JBeulich@suse.com, ian.campbell@citrix.com,
	andrew.cooper3@citrix.com, Marcos.Matsunaga@oracle.com,
	keir@xen.org, konrad.wilk@oracle.com,
	george.dunlap@eu.citrix.com
Cc: xen-devel@lists.xenproject.org, dario.faggioli@citrix.com,
	stefano.stabellini@citrix.com
Subject: Re: [PATCHv5 1/3] rwlock: Add per-cpu reader-writer lock infrastructure
Date: Tue, 19 Jan 2016 10:29:22 +0000	[thread overview]
Message-ID: <569E1002.6060007@citrix.com> (raw)
In-Reply-To: <5693C4DD.3040907@citrix.com>

On 11/01/16 15:06, Malcolm Crossley wrote:
> On 22/12/15 11:56, George Dunlap wrote:
>> On 18/12/15 16:08, Malcolm Crossley wrote:
>>> <snip>
>>> +
>>> +#ifndef NDEBUG
>>> +#define PERCPU_RW_LOCK_UNLOCKED(owner) { RW_LOCK_UNLOCKED, 0, owner }
>>> +static inline void _percpu_rwlock_owner_check(percpu_rwlock_t **per_cpudata,
>>> +                                         percpu_rwlock_t *percpu_rwlock)
>>> +{
>>> +    ASSERT(per_cpudata == percpu_rwlock->percpu_owner);
>>> +}
>>> +#else
>>> +#define PERCPU_RW_LOCK_UNLOCKED(owner) { RW_LOCK_UNLOCKED, 0 }
>>> +#define _percpu_rwlock_owner_check(data, lock) ((void)0)
>>> +#endif
>>> +
>>> +#define DEFINE_PERCPU_RWLOCK_RESOURCE(l, owner) \
>>> +    percpu_rwlock_t l = PERCPU_RW_LOCK_UNLOCKED(&get_per_cpu_var(owner))
>>> +#define percpu_rwlock_resource_init(l, owner) \
>>> +    (*(l) = (percpu_rwlock_t)PERCPU_RW_LOCK_UNLOCKED(&get_per_cpu_var(owner)))
>>> +
>>> +static inline void _percpu_read_lock(percpu_rwlock_t **per_cpudata,
>>> +                                         percpu_rwlock_t *percpu_rwlock)
>>
>> Is there a particular reason you chose to only use the "owner" value in
>> the struct to verify that the "per_cpudata" argument passed matched the
>> one you expected, rather than just getting rid of the "per_cpudata"
>> argument altogether and always using the pointer in the struct?
> 
> Initially I was aiming to add percpu aspects to the rwlock without increasing
> the size of the rwlock structure itself, this was to keep data cache usage and
> memory allocations the same.
> It became clear that having a global writer_activating barrier would cause the
> read_lock to enter the slow path far too often. So I put the writer_activating
> variable in the percpu_rwlock_t, as writer_activating is just a bool then the
> additional data overhead should be small. Always having a 8 byte pointer may
> add a lot of overhead to data structures contain multiple rwlocks and thus
> cause additional allocation overhead.
>>
>> (i.e., _percpu_read_lock(percpu_rwlock_t *percpu_rwlock) { ...
>> per_cpudata = percpu_rwlock->percpu_owner; ... })
>>
>> I'm not an expert in this sort of micro-optimization, but it seems like
>> you're trading off storing a pointer in your rwlock struct for storing a
>> pointer at every call site.  Since you have to read writer_activating
>> for every lock or unlock anyway,
> 
> writer_activating is not read on the read_unlock path. As these are rwlocks
> then I'm assuming the read lock/unlock paths are more critical for performance.
> So I'd prefer to not do a read of the percpu_rwlock structure if it's not
> required (i.e. on the read unlock path)
> Furthermore, the single byte for the writer_activating variable is likely
> to have been read into cache by accesses to other parts of the data structure
> near the percpu_rwlock_t. If we add additional 8 bytes to the percpu_rwlock_t
> then this may not happen and it may also adjust the cache line alignment aswell.
> 
>> it doesn't seem like you'd actually be
>> saving that many memory fetches; but having only one copy in the cache,
>> rather than one copy per call site, would on the whole reduce both the
>> cache footprint and the total memory used (if only by a few bytes).
> 
> If you put the owner pointer in the percpu_rwlock_t then wouldn't you have
> a copy per instance of percpu_rwlock_t? Surely this would use more cache than
> the handful of call site references to a global variable.
> 
>>
>> It also makes the code cleaner to have only one argument, rather than
>> two which must match; but since in all the places you use it you end up
>> using a wrapper to give you a single argument anyway, I don't think that
>> matters in this case.  (i.e., if there's a good reason for having it at
>> the call site instead if in the struct, I'm fine with this approach).
> 
> If you agree with my reasoning for the cache overhead and performance of the
> read unlock path being better with passing the percpu_data as an argument then
> I propose we keep the patches as is.
> 
Ping? I believe this is the last point of discussion before the patches can go in.

Malcolm

>>
>> Everything else looks good, thanks.
>>
>>  -George
>>
> 
> 

  reply	other threads:[~2016-01-19 10:29 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-12-18 16:08 [PATCHv5 0/3] Implement per-cpu reader-writer locks Malcolm Crossley
2015-12-18 16:08 ` [PATCHv5 1/3] rwlock: Add per-cpu reader-writer lock infrastructure Malcolm Crossley
2015-12-18 16:39   ` Jan Beulich
2015-12-22 11:56   ` George Dunlap
2016-01-11 15:06     ` Malcolm Crossley
2016-01-19 10:29       ` Malcolm Crossley [this message]
2016-01-19 12:25         ` George Dunlap
2016-01-20 15:30       ` George Dunlap
2016-01-21 15:17   ` Ian Campbell
2015-12-18 16:08 ` [PATCHv5 2/3] grant_table: convert grant table rwlock to percpu rwlock Malcolm Crossley
2015-12-18 16:40   ` Jan Beulich
2016-01-21 15:31   ` Ian Campbell
2015-12-18 16:08 ` [PATCHv5 3/3] p2m: convert p2m " Malcolm Crossley
2015-12-22 12:07   ` George Dunlap

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=569E1002.6060007@citrix.com \
    --to=malcolm.crossley@citrix.com \
    --cc=JBeulich@suse.com \
    --cc=Marcos.Matsunaga@oracle.com \
    --cc=andrew.cooper3@citrix.com \
    --cc=dario.faggioli@citrix.com \
    --cc=george.dunlap@citrix.com \
    --cc=george.dunlap@eu.citrix.com \
    --cc=ian.campbell@citrix.com \
    --cc=keir@xen.org \
    --cc=konrad.wilk@oracle.com \
    --cc=stefano.stabellini@citrix.com \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).