Linux cryptographic layer development
* [PATCH] padata: make the sequence counter an atomic_t
@ 2013-10-02 13:40 Mathias Krause
  2013-10-08 12:08 ` Steffen Klassert
  0 siblings, 1 reply; 7+ messages in thread
From: Mathias Krause @ 2013-10-02 13:40 UTC (permalink / raw)
  To: Steffen Klassert; +Cc: linux-crypto, Mathias Krause

Using a spinlock to atomically increment a counter sounds wrong -- we
have atomic_t for this!

Also move 'seq_nr' to a different cache line than 'lock' to reduce cache
line thrashing. This has the nice side effect of decreasing the size of
struct parallel_data from 192 to 128 bytes for an x86-64 build, i.e.
occupying only two instead of three cache lines.

These changes result in a 5% performance increase on an IPsec test run
using pcrypt.

Btw. the seq_lock spinlock was never explicitly initialized -- one more
reason to get rid of it.

Signed-off-by: Mathias Krause <mathias.krause@secunet.com>
---
 include/linux/padata.h |    3 +--
 kernel/padata.c        |    9 ++++-----
 2 files changed, 5 insertions(+), 7 deletions(-)

diff --git a/include/linux/padata.h b/include/linux/padata.h
index 86292be..4386946 100644
--- a/include/linux/padata.h
+++ b/include/linux/padata.h
@@ -129,10 +129,9 @@ struct parallel_data {
 	struct padata_serial_queue	__percpu *squeue;
 	atomic_t			reorder_objects;
 	atomic_t			refcnt;
+	atomic_t			seq_nr;
 	struct padata_cpumask		cpumask;
 	spinlock_t                      lock ____cacheline_aligned;
-	spinlock_t                      seq_lock;
-	unsigned int			seq_nr;
 	unsigned int			processed;
 	struct timer_list		timer;
 };
diff --git a/kernel/padata.c b/kernel/padata.c
index 07af2c9..2abd25d 100644
--- a/kernel/padata.c
+++ b/kernel/padata.c
@@ -46,6 +46,7 @@ static int padata_index_to_cpu(struct parallel_data *pd, int cpu_index)
 
 static int padata_cpu_hash(struct parallel_data *pd)
 {
+	unsigned int seq_nr;
 	int cpu_index;
 
 	/*
@@ -53,10 +54,8 @@ static int padata_cpu_hash(struct parallel_data *pd)
 	 * seq_nr mod. number of cpus in use.
 	 */
 
-	spin_lock(&pd->seq_lock);
-	cpu_index =  pd->seq_nr % cpumask_weight(pd->cpumask.pcpu);
-	pd->seq_nr++;
-	spin_unlock(&pd->seq_lock);
+	seq_nr = atomic_inc_return(&pd->seq_nr);
+	cpu_index = seq_nr % cpumask_weight(pd->cpumask.pcpu);
 
 	return padata_index_to_cpu(pd, cpu_index);
 }
@@ -429,7 +428,7 @@ static struct parallel_data *padata_alloc_pd(struct padata_instance *pinst,
 	padata_init_pqueues(pd);
 	padata_init_squeues(pd);
 	setup_timer(&pd->timer, padata_reorder_timer, (unsigned long)pd);
-	pd->seq_nr = 0;
+	atomic_set(&pd->seq_nr, -1);
 	atomic_set(&pd->reorder_objects, 0);
 	atomic_set(&pd->refcnt, 0);
 	pd->pinst = pinst;
-- 
1.7.2.5
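
Note on the counter change: the effect of replacing the locked
post-increment with atomic_inc_return() can be sketched in userspace with
C11 atomics. This is a minimal illustration under stated assumptions, not
the kernel code: struct seq_counter and the seq_counter_*() helpers are
hypothetical names, and atomic_fetch_add() plus one stands in for the
kernel's atomic_inc_return(). Because the counter starts at -1, the first
increment returns 0, so the sequence of CPU indices matches the old
"read, then increment" code.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the seq_nr member of struct parallel_data. */
struct seq_counter {
	atomic_uint seq_nr;
};

/* Start at (unsigned)-1 so that the first increment yields 0. */
static void seq_counter_init(struct seq_counter *c)
{
	atomic_init(&c->seq_nr, (unsigned int)-1);
}

/*
 * Userspace analogue of atomic_inc_return(): atomically add 1 and
 * return the new value. atomic_fetch_add() returns the old value,
 * so add 1 again (unsigned arithmetic wraps, which is well defined).
 */
static unsigned int seq_counter_next(struct seq_counter *c)
{
	return atomic_fetch_add(&c->seq_nr, 1u) + 1u;
}

/* Round-robin index in [0, nr_cpus), computed without taking a lock. */
static unsigned int seq_counter_cpu_index(struct seq_counter *c,
					  unsigned int nr_cpus)
{
	assert(nr_cpus > 0);
	return seq_counter_next(c) % nr_cpus;
}
```

Calling seq_counter_cpu_index() repeatedly with nr_cpus = 3 on a freshly
initialized counter yields 0, 1, 2, 0, and so on.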

* Re: [PATCH] padata: make the sequence counter an atomic_t
  2013-10-02 13:40 [PATCH] padata: make the sequence counter an atomic_t Mathias Krause
@ 2013-10-08 12:08 ` Steffen Klassert
  2013-10-25  8:20   ` Mathias Krause
  0 siblings, 1 reply; 7+ messages in thread
From: Steffen Klassert @ 2013-10-08 12:08 UTC (permalink / raw)
  To: Mathias Krause, Herbert Xu; +Cc: linux-crypto

On Wed, Oct 02, 2013 at 03:40:45PM +0200, Mathias Krause wrote:
> Using a spinlock to atomically increment a counter sounds wrong -- we
> have atomic_t for this!
> 
> Also move 'seq_nr' to a different cache line than 'lock' to reduce cache
> line thrashing. This has the nice side effect of decreasing the size of
> struct parallel_data from 192 to 128 bytes for an x86-64 build, i.e.
> occupying only two instead of three cache lines.
> 
> These changes result in a 5% performance increase on an IPsec test run
> using pcrypt.
> 
> Btw. the seq_lock spinlock was never explicitly initialized -- one more
> reason to get rid of it.
> 
> Signed-off-by: Mathias Krause <mathias.krause@secunet.com>

Acked-by: Steffen Klassert <steffen.klassert@secunet.com>

Herbert can you take this one?

* Re: [PATCH] padata: make the sequence counter an atomic_t
  2013-10-08 12:08 ` Steffen Klassert
@ 2013-10-25  8:20   ` Mathias Krause
  2013-10-25  9:26     ` Herbert Xu
  0 siblings, 1 reply; 7+ messages in thread
From: Mathias Krause @ 2013-10-25  8:20 UTC (permalink / raw)
  To: Herbert Xu; +Cc: Steffen Klassert, linux-crypto

On 08.10.2013 14:08, Steffen Klassert wrote:
> On Wed, Oct 02, 2013 at 03:40:45PM +0200, Mathias Krause wrote:
>> Using a spinlock to atomically increment a counter sounds wrong -- we
>> have atomic_t for this!
>>
>> Also move 'seq_nr' to a different cache line than 'lock' to reduce cache
>> line thrashing. This has the nice side effect of decreasing the size of
>> struct parallel_data from 192 to 128 bytes for an x86-64 build, i.e.
>> occupying only two instead of three cache lines.
>>
>> These changes result in a 5% performance increase on an IPsec test run
>> using pcrypt.
>>
>> Btw. the seq_lock spinlock was never explicitly initialized -- one more
>> reason to get rid of it.
>>
>> Signed-off-by: Mathias Krause <mathias.krause@secunet.com>
> 
> Acked-by: Steffen Klassert <steffen.klassert@secunet.com>
> 
> Herbert can you take this one?

Ping, Herbert? Anything wrong with the patch?

* Re: [PATCH] padata: make the sequence counter an atomic_t
  2013-10-25  8:20   ` Mathias Krause
@ 2013-10-25  9:26     ` Herbert Xu
  2013-10-25 10:13       ` Mathias Krause
  2013-10-25 10:14       ` [PATCH RESEND] " Mathias Krause
  0 siblings, 2 replies; 7+ messages in thread
From: Herbert Xu @ 2013-10-25  9:26 UTC (permalink / raw)
  To: Mathias Krause; +Cc: Steffen Klassert, linux-crypto

On Fri, Oct 25, 2013 at 10:20:48AM +0200, Mathias Krause wrote:
> On 08.10.2013 14:08, Steffen Klassert wrote:
> > On Wed, Oct 02, 2013 at 03:40:45PM +0200, Mathias Krause wrote:
> >> Using a spinlock to atomically increment a counter sounds wrong -- we
> >> have atomic_t for this!
> >>
> >> Also move 'seq_nr' to a different cache line than 'lock' to reduce cache
> >> line thrashing. This has the nice side effect of decreasing the size of
> >> struct parallel_data from 192 to 128 bytes for an x86-64 build, i.e.
> >> occupying only two instead of three cache lines.
> >>
> >> These changes result in a 5% performance increase on an IPsec test run
> >> using pcrypt.
> >>
> >> Btw. the seq_lock spinlock was never explicitly initialized -- one more
> >> reason to get rid of it.
> >>
> >> Signed-off-by: Mathias Krause <mathias.krause@secunet.com>
> > 
> > Acked-by: Steffen Klassert <steffen.klassert@secunet.com>
> > 
> > Herbert can you take this one?
> 
> Ping, Herbert? Anything wrong with the patch?

Sorry I don't seem to have this patch in my mail box.  Can you
resend it please?

Thanks!
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

* Re: [PATCH] padata: make the sequence counter an atomic_t
  2013-10-25  9:26     ` Herbert Xu
@ 2013-10-25 10:13       ` Mathias Krause
  2013-10-25 10:14       ` [PATCH RESEND] " Mathias Krause
  1 sibling, 0 replies; 7+ messages in thread
From: Mathias Krause @ 2013-10-25 10:13 UTC (permalink / raw)
  To: Herbert Xu; +Cc: Steffen Klassert, linux-crypto

On 25.10.2013 11:26, Herbert Xu wrote:
> On Fri, Oct 25, 2013 at 10:20:48AM +0200, Mathias Krause wrote:
>> On 08.10.2013 14:08, Steffen Klassert wrote:
>>> On Wed, Oct 02, 2013 at 03:40:45PM +0200, Mathias Krause wrote:
>>>> Using a spinlock to atomically increment a counter sounds wrong -- we
>>>> have atomic_t for this!
>>>>
>>>> Also move 'seq_nr' to a different cache line than 'lock' to reduce cache
>>>> line thrashing. This has the nice side effect of decreasing the size of
>>>> struct parallel_data from 192 to 128 bytes for an x86-64 build, i.e.
>>>> occupying only two instead of three cache lines.
>>>>
>>>> These changes result in a 5% performance increase on an IPsec test run
>>>> using pcrypt.
>>>>
>>>> Btw. the seq_lock spinlock was never explicitly initialized -- one more
>>>> reason to get rid of it.
>>>>
>>>> Signed-off-by: Mathias Krause <mathias.krause@secunet.com>
>>> Acked-by: Steffen Klassert <steffen.klassert@secunet.com>
>>>
>>> Herbert can you take this one?
>> Ping, Herbert? Anything wrong with the patch?
> 
> Sorry I don't seem to have this patch in my mail box.  Can you
> resend it please?

I sent it to linux-crypto and Steffen only. I will resend it directly to
you now.

> 
> Thanks!

* [PATCH RESEND] padata: make the sequence counter an atomic_t
  2013-10-25  9:26     ` Herbert Xu
  2013-10-25 10:13       ` Mathias Krause
@ 2013-10-25 10:14       ` Mathias Krause
  2013-10-30  4:11         ` Herbert Xu
  1 sibling, 1 reply; 7+ messages in thread
From: Mathias Krause @ 2013-10-25 10:14 UTC (permalink / raw)
  To: Herbert Xu; +Cc: Steffen Klassert, linux-crypto, Mathias Krause

Using a spinlock to atomically increment a counter sounds wrong -- we
have atomic_t for this!

Also move 'seq_nr' to a different cache line than 'lock' to reduce cache
line thrashing. This has the nice side effect of decreasing the size of
struct parallel_data from 192 to 128 bytes for an x86-64 build, i.e.
occupying only two instead of three cache lines.

These changes result in a 5% performance increase on an IPsec test run
using pcrypt.

Btw. the seq_lock spinlock was never explicitly initialized -- one more
reason to get rid of it.

Signed-off-by: Mathias Krause <mathias.krause@secunet.com>
Acked-by: Steffen Klassert <steffen.klassert@secunet.com>
---
 include/linux/padata.h |    3 +--
 kernel/padata.c        |    9 ++++-----
 2 files changed, 5 insertions(+), 7 deletions(-)

diff --git a/include/linux/padata.h b/include/linux/padata.h
index 86292be..4386946 100644
--- a/include/linux/padata.h
+++ b/include/linux/padata.h
@@ -129,10 +129,9 @@ struct parallel_data {
 	struct padata_serial_queue	__percpu *squeue;
 	atomic_t			reorder_objects;
 	atomic_t			refcnt;
+	atomic_t			seq_nr;
 	struct padata_cpumask		cpumask;
 	spinlock_t                      lock ____cacheline_aligned;
-	spinlock_t                      seq_lock;
-	unsigned int			seq_nr;
 	unsigned int			processed;
 	struct timer_list		timer;
 };
diff --git a/kernel/padata.c b/kernel/padata.c
index 07af2c9..2abd25d 100644
--- a/kernel/padata.c
+++ b/kernel/padata.c
@@ -46,6 +46,7 @@ static int padata_index_to_cpu(struct parallel_data *pd, int cpu_index)
 
 static int padata_cpu_hash(struct parallel_data *pd)
 {
+	unsigned int seq_nr;
 	int cpu_index;
 
 	/*
@@ -53,10 +54,8 @@ static int padata_cpu_hash(struct parallel_data *pd)
 	 * seq_nr mod. number of cpus in use.
 	 */
 
-	spin_lock(&pd->seq_lock);
-	cpu_index =  pd->seq_nr % cpumask_weight(pd->cpumask.pcpu);
-	pd->seq_nr++;
-	spin_unlock(&pd->seq_lock);
+	seq_nr = atomic_inc_return(&pd->seq_nr);
+	cpu_index = seq_nr % cpumask_weight(pd->cpumask.pcpu);
 
 	return padata_index_to_cpu(pd, cpu_index);
 }
@@ -429,7 +428,7 @@ static struct parallel_data *padata_alloc_pd(struct padata_instance *pinst,
 	padata_init_pqueues(pd);
 	padata_init_squeues(pd);
 	setup_timer(&pd->timer, padata_reorder_timer, (unsigned long)pd);
-	pd->seq_nr = 0;
+	atomic_set(&pd->seq_nr, -1);
 	atomic_set(&pd->reorder_objects, 0);
 	atomic_set(&pd->refcnt, 0);
 	pd->pinst = pinst;
-- 
1.7.2.5
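
Note on the layout change: the size reduction described in the commit
message comes from moving a small member out of the region behind the
cache-line-aligned 'lock' and into the padding in front of it. The sketch
below illustrates that mechanism in userspace C11. It rests on assumptions:
the structs, the 64-byte CACHELINE constant, and all field sizes are
invented for illustration (chosen so the totals match the 192 and 128
bytes quoted above); they are not the real members or sizes of struct
parallel_data.

```c
#include <assert.h>
#include <stddef.h>

#define CACHELINE 64		/* assumed cache line size */

struct lock_stub { int v; };	/* hypothetical 4-byte spinlock */

/* Before: the counter and its own lock sit behind the aligned 'lock'. */
struct layout_before {
	char head[48];		/* stand-in for the members before 'lock' */
	_Alignas(CACHELINE) struct lock_stub lock;
	struct lock_stub seq_lock;
	unsigned int seq_nr;
	unsigned int processed;
	char timer[56];		/* stand-in for the trailing timer member */
};

/* After: seq_lock is gone and seq_nr fills padding in front of 'lock'. */
struct layout_after {
	char head[48];
	unsigned int seq_nr;
	_Alignas(CACHELINE) struct lock_stub lock;
	unsigned int processed;
	char timer[56];
};

/* Return nonzero if two member offsets fall into the same cache line. */
static int same_cache_line(size_t off_a, size_t off_b)
{
	return off_a / CACHELINE == off_b / CACHELINE;
}

/* The struct size is padded to a multiple of its strictest alignment. */
static_assert(sizeof(struct layout_before) % CACHELINE == 0, "padded");
static_assert(sizeof(struct layout_after) % CACHELINE == 0, "padded");
```

With these invented sizes, sizeof() reports three cache lines for
layout_before and two for layout_after, and same_cache_line() applied to
the offsetof() values of 'seq_nr' and 'lock' is true only for
layout_before.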

* Re: [PATCH RESEND] padata: make the sequence counter an atomic_t
  2013-10-25 10:14       ` [PATCH RESEND] " Mathias Krause
@ 2013-10-30  4:11         ` Herbert Xu
  0 siblings, 0 replies; 7+ messages in thread
From: Herbert Xu @ 2013-10-30  4:11 UTC (permalink / raw)
  To: Mathias Krause; +Cc: Steffen Klassert, linux-crypto

On Fri, Oct 25, 2013 at 12:14:15PM +0200, Mathias Krause wrote:
> Using a spinlock to atomically increment a counter sounds wrong -- we
> have atomic_t for this!
> 
> Also move 'seq_nr' to a different cache line than 'lock' to reduce cache
> line thrashing. This has the nice side effect of decreasing the size of
> struct parallel_data from 192 to 128 bytes for an x86-64 build, i.e.
> occupying only two instead of three cache lines.
> 
> These changes result in a 5% performance increase on an IPsec test run
> using pcrypt.
> 
> Btw. the seq_lock spinlock was never explicitly initialized -- one more
> reason to get rid of it.
> 
> Signed-off-by: Mathias Krause <mathias.krause@secunet.com>
> Acked-by: Steffen Klassert <steffen.klassert@secunet.com>

Patch applied.  Thanks!
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
