From: Jesper Dangaard Brouer <brouer@redhat.com>
To: Christoph Lameter <cl@linux.com>
Cc: akpm@linuxfoundation.org, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, penberg@kernel.org, iamjoonsoo@lge.com,
	brouer@redhat.com
Subject: Re: [PATCH 2/3] slub: Support for array operations
Date: Thu, 12 Feb 2015 13:16:49 +1300
Message-ID: <20150212131649.59b70f71@redhat.com>
In-Reply-To: <alpine.DEB.2.11.1502111604510.15061@gentwo.org>

On Wed, 11 Feb 2015 16:06:50 -0600 (CST)
Christoph Lameter <cl@linux.com> wrote:

> On Thu, 12 Feb 2015, Jesper Dangaard Brouer wrote:
> 
> > > > This is quite an expensive lock with irqsave.
[...]
> > > We can require that interrupts are off when the functions are called. Then
> > > we can avoid the "save" part?
> >
> > Yes, we could also do so with an "_irqoff" variant of the func call,
> > but given we are defining the API we can just require this from the
> > start.
> 
> All right. Let's do that then.

Okay. Some measurements to guide this choice.

Measured on my laptop CPU, an i7-2620M @ 2.70GHz:

 * 12.775 ns - "clean" spin_lock_unlock
 * 21.099 ns - irqsave variant spinlock
 * 22.808 ns - "manual" irqsave before spin_lock
 * 14.618 ns - "manual" local_irq_disable + spin_lock
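
As a rough sketch of what the four labels above are assumed to
correspond to (this is a reading of the labels, not a copy of the
benchmark code in the repo), the variants can be written as below. The
kernel primitives are replaced by userspace stand-ins so the sketch
compiles outside the kernel. The stand-ins, the irqs_enabled flag and
the variant_*() names are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-ins (assumptions): a bool models the lock state and
 * a global flag models "interrupts enabled" on the local CPU. */
typedef struct { bool locked; } spinlock_t;
static bool irqs_enabled = true;

static inline void spin_lock(spinlock_t *l)
{
	assert(!l->locked);	/* no contention in this model */
	l->locked = true;
}

static inline void spin_unlock(spinlock_t *l)
{
	assert(l->locked);
	l->locked = false;
}

static inline void local_irq_disable(void) { irqs_enabled = false; }
static inline void local_irq_enable(void)  { irqs_enabled = true; }

/* Save the current state into flags, then disable. */
#define local_irq_save(flags) \
	do { (flags) = irqs_enabled; irqs_enabled = false; } while (0)
/* Put back whatever state was saved. */
#define local_irq_restore(flags) \
	do { irqs_enabled = ((flags) != 0); } while (0)
#define spin_lock_irqsave(l, flags) \
	do { local_irq_save(flags); spin_lock(l); } while (0)
#define spin_unlock_irqrestore(l, flags) \
	do { spin_unlock(l); local_irq_restore(flags); } while (0)

/* "clean" spin_lock_unlock: interrupt state is not touched. */
void variant_clean(spinlock_t *l)
{
	spin_lock(l);
	spin_unlock(l);
}

/* irqsave variant spinlock: combined lock + flags save/restore. */
void variant_irqsave(spinlock_t *l)
{
	unsigned long flags;

	spin_lock_irqsave(l, flags);
	spin_unlock_irqrestore(l, flags);
}

/* "manual" irqsave before spin_lock: same effect, separate calls. */
void variant_manual_irqsave(spinlock_t *l)
{
	unsigned long flags;

	local_irq_save(flags);
	spin_lock(l);
	spin_unlock(l);
	local_irq_restore(flags);
}

/* "manual" local_irq_disable + spin_lock: no flags are saved, so
 * interrupts are unconditionally enabled again on the way out. */
void variant_manual_irqoff(spinlock_t *l)
{
	local_irq_disable();
	spin_lock(l);
	spin_unlock(l);
	local_irq_enable();
}
```

The last variant only fits callers that know the interrupt state on
entry: if it is called with interrupts already off, it leaves them on
afterwards, whereas the two irqsave variants put back whatever state
they found.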

Reproducible via my github repo:
 https://github.com/netoptimizer/prototype-kernel/blob/master/kernel/lib/time_bench_sample.c

The clean spin_lock_unlock is 8.324 ns faster than the irqsave variant.
The irqsave variant is actually faster than expected, as an isolated
local_irq_save_restore measured 13.256 ns on its own, so a naive sum
with the clean lock would be about 26 ns.

The "manual" irqsave is only 1.709 ns slower than the irqsave variant,
which is approximately the cost of an extra function call.

If one can use the non-flags-save version, local_irq_disable, then one
can save 6.481 ns compared to the irqsave variant (on this specific CPU
and kernel config 3.17.8-200.fc20.x86_64).
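
The differences quoted above can be rechecked from the five measured
numbers with a few subtractions. The sketch below stores them in
picoseconds so the arithmetic is exact integer math. The constant and
helper names are made up for illustration, and the "naive sum" line is
only one reading of why the irqsave result was faster than expected.

```c
#include <assert.h>
#include <stdio.h>

/* Measurements from this mail in picoseconds (12.775 ns == 12775 ps). */
enum {
	PS_CLEAN_LOCK       = 12775, /* "clean" spin_lock_unlock */
	PS_IRQSAVE_LOCK     = 21099, /* irqsave variant spinlock */
	PS_MANUAL_IRQSAVE   = 22808, /* "manual" irqsave before spin_lock */
	PS_MANUAL_IRQOFF    = 14618, /* "manual" local_irq_disable + spin_lock */
	PS_IRQ_SAVE_RESTORE = 13256  /* isolated local_irq_save_restore */
};

/* Hypothetical helper: difference a - b in picoseconds. */
static inline long delta_ps(long a, long b)
{
	return a - b;
}

/* Hypothetical helper: print a picosecond value as "x.yyy ns". */
static inline void print_ns(const char *label, long ps)
{
	assert(ps >= 0);	/* the formatting assumes non-negative */
	printf("%-36s %ld.%03ld ns\n", label, ps / 1000, ps % 1000);
}

/* Print each difference discussed in the text. */
void report_deltas(void)
{
	print_ns("irqsave - clean",
		 delta_ps(PS_IRQSAVE_LOCK, PS_CLEAN_LOCK));
	print_ns("manual irqsave - irqsave",
		 delta_ps(PS_MANUAL_IRQSAVE, PS_IRQSAVE_LOCK));
	print_ns("irqsave - manual irq_disable",
		 delta_ps(PS_IRQSAVE_LOCK, PS_MANUAL_IRQOFF));
	/* Naive expectation: clean lock plus isolated save/restore. */
	print_ns("naive sum - irqsave",
		 delta_ps(PS_CLEAN_LOCK + PS_IRQ_SAVE_RESTORE,
			  PS_IRQSAVE_LOCK));
}
```

Calling report_deltas() from a small main() prints the four
differences. The first three correspond to the 8.324 ns, 1.709 ns and
6.481 ns figures in the text.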

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Sr. Network Kernel Developer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer

https://github.com/netoptimizer/prototype-kernel/commit/1471ac60
