xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
From: Malcolm Crossley <malcolm.crossley@citrix.com>
To: George Dunlap <george.dunlap@citrix.com>,
	JBeulich@suse.com, ian.campbell@citrix.com,
	andrew.cooper3@citrix.com, Marcos.Matsunaga@oracle.com,
	keir@xen.org, konrad.wilk@oracle.com,
	george.dunlap@eu.citrix.com
Cc: xen-devel@lists.xenproject.org, dario.faggioli@citrix.com,
	stefano.stabellini@citrix.com
Subject: Re: [PATCHv5 1/3] rwlock: Add per-cpu reader-writer lock infrastructure
Date: Mon, 11 Jan 2016 15:06:05 +0000	[thread overview]
Message-ID: <5693C4DD.3040907@citrix.com> (raw)
In-Reply-To: <56793A8B.1070005@citrix.com>

On 22/12/15 11:56, George Dunlap wrote:
> On 18/12/15 16:08, Malcolm Crossley wrote:
>> <snip>
>> +
>> +#ifndef NDEBUG
>> +#define PERCPU_RW_LOCK_UNLOCKED(owner) { RW_LOCK_UNLOCKED, 0, owner }
>> +static inline void _percpu_rwlock_owner_check(percpu_rwlock_t **per_cpudata,
>> +                                         percpu_rwlock_t *percpu_rwlock)
>> +{
>> +    ASSERT(per_cpudata == percpu_rwlock->percpu_owner);
>> +}
>> +#else
>> +#define PERCPU_RW_LOCK_UNLOCKED(owner) { RW_LOCK_UNLOCKED, 0 }
>> +#define _percpu_rwlock_owner_check(data, lock) ((void)0)
>> +#endif
>> +
>> +#define DEFINE_PERCPU_RWLOCK_RESOURCE(l, owner) \
>> +    percpu_rwlock_t l = PERCPU_RW_LOCK_UNLOCKED(&get_per_cpu_var(owner))
>> +#define percpu_rwlock_resource_init(l, owner) \
>> +    (*(l) = (percpu_rwlock_t)PERCPU_RW_LOCK_UNLOCKED(&get_per_cpu_var(owner)))
>> +
>> +static inline void _percpu_read_lock(percpu_rwlock_t **per_cpudata,
>> +                                         percpu_rwlock_t *percpu_rwlock)
> 
> Is there a particular reason you chose to only use the "owner" value in
> the struct to verify that the "per_cpudata" argument passed matched the
> one you expected, rather than just getting rid of the "per_cpudata"
> argument altogether and always using the pointer in the struct?

Initially I was aiming to add percpu aspects to the rwlock without increasing
the size of the rwlock structure itself, this was to keep data cache usage and
memory allocations the same.
It became clear that having a global writer_activating barrier would cause the
read_lock to enter the slow path far too often. So I put the writer_activating
variable in the percpu_rwlock_t, as writer_activating is just a bool then the
additional data overhead should be small. Always having a 8 byte pointer may
add a lot of overhead to data structures contain multiple rwlocks and thus
cause additional allocation overhead.
> 
> (i.e., _percpu_read_lock(percpu_rwlock_t *percpu_rwlock) { ...
> per_cpudata = percpu_rwlock->percpu_owner; ... })
> 
> I'm not an expert in this sort of micro-optimization, but it seems like
> you're trading off storing a pointer in your rwlock struct for storing a
> pointer at every call site.  Since you have to read writer_activating
> for every lock or unlock anyway,

writer_activating is not read on the read_unlock path. As these are rwlocks
then I'm assuming the read lock/unlock paths are more critical for performance.
So I'd prefer to not do a read of the percpu_rwlock structure if it's not
required (i.e. on the read unlock path)
Furthermore, the single byte for the writer_activating variable is likely
to have been read into cache by accesses to other parts of the data structure
near the percpu_rwlock_t. If we add additional 8 bytes to the percpu_rwlock_t
then this may not happen and it may also adjust the cache line alignment aswell.

> it doesn't seem like you'd actually be
> saving that many memory fetches; but having only one copy in the cache,
> rather than one copy per call site, would on the whole reduce both the
> cache footprint and the total memory used (if only by a few bytes).

If you put the owner pointer in the percpu_rwlock_t then wouldn't you have
a copy per instance of percpu_rwlock_t? Surely this would use more cache than
the handful of call site references to a global variable.

> 
> It also makes the code cleaner to have only one argument, rather than
> two which must match; but since in all the places you use it you end up
> using a wrapper to give you a single argument anyway, I don't think that
> matters in this case.  (i.e., if there's a good reason for having it at
> the call site instead if in the struct, I'm fine with this approach).

If you agree with my reasoning for the cache overhead and performance of the
read unlock path being better with passing the percpu_data as an argument then
I propose we keep the patches as is.

> 
> Everything else looks good, thanks.
> 
>  -George
> 

Thanks for the review and sorry about the delayed reply.

Malcolm

  reply	other threads:[~2016-01-11 15:06 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-12-18 16:08 [PATCHv5 0/3] Implement per-cpu reader-writer locks Malcolm Crossley
2015-12-18 16:08 ` [PATCHv5 1/3] rwlock: Add per-cpu reader-writer lock infrastructure Malcolm Crossley
2015-12-18 16:39   ` Jan Beulich
2015-12-22 11:56   ` George Dunlap
2016-01-11 15:06     ` Malcolm Crossley [this message]
2016-01-19 10:29       ` Malcolm Crossley
2016-01-19 12:25         ` George Dunlap
2016-01-20 15:30       ` George Dunlap
2016-01-21 15:17   ` Ian Campbell
2015-12-18 16:08 ` [PATCHv5 2/3] grant_table: convert grant table rwlock to percpu rwlock Malcolm Crossley
2015-12-18 16:40   ` Jan Beulich
2016-01-21 15:31   ` Ian Campbell
2015-12-18 16:08 ` [PATCHv5 3/3] p2m: convert p2m " Malcolm Crossley
2015-12-22 12:07   ` George Dunlap

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5693C4DD.3040907@citrix.com \
    --to=malcolm.crossley@citrix.com \
    --cc=JBeulich@suse.com \
    --cc=Marcos.Matsunaga@oracle.com \
    --cc=andrew.cooper3@citrix.com \
    --cc=dario.faggioli@citrix.com \
    --cc=george.dunlap@citrix.com \
    --cc=george.dunlap@eu.citrix.com \
    --cc=ian.campbell@citrix.com \
    --cc=keir@xen.org \
    --cc=konrad.wilk@oracle.com \
    --cc=stefano.stabellini@citrix.com \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).