* [PATCH 1/5] percpu_ida: Fix data race on cpus_have_tags cpumask
[not found] <cover.1382950629.git.agordeev@redhat.com>
@ 2013-10-28 10:05 ` Alexander Gordeev
2013-10-28 21:20 ` Kent Overstreet
2013-10-28 10:05 ` [PATCH 2/5] percpu_ida: Move waking up waiters out of atomic contexts Alexander Gordeev
` (3 subsequent siblings)
4 siblings, 1 reply; 7+ messages in thread
From: Alexander Gordeev @ 2013-10-28 10:05 UTC (permalink / raw)
To: Kent Overstreet
Cc: Oleg Nesterov, Jens Axboe, Nicholas A. Bellinger, linux-kernel
Function steal_tags() might miss a bit in cpus_have_tags due to
unsynchronized access from percpu_ida_free(). As result, function
percpu_ida_alloc() might enter unwakable sleep. This update adds
memory barriers to prevent the described scenario.
In fact, accesses to cpus_have_tags are fenced by prepare_to_wait()
and wake_up() calls at the moment and the aforementioned sequence
does not appear could hit. Nevertheless, explicit memory barriers
still seem justifiable.
Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
---
lib/percpu_ida.c | 12 ++++++++++--
1 files changed, 10 insertions(+), 2 deletions(-)
diff --git a/lib/percpu_ida.c b/lib/percpu_ida.c
index b0698ea..96cfacf 100644
--- a/lib/percpu_ida.c
+++ b/lib/percpu_ida.c
@@ -68,6 +68,11 @@ static inline void steal_tags(struct percpu_ida *pool,
unsigned cpus_have_tags, cpu = pool->cpu_last_stolen;
struct percpu_ida_cpu *remote;
+ /*
+ * Pairs with smp_wmb() in percpu_ida_free()
+ */
+ smp_rmb();
+
for (cpus_have_tags = cpumask_weight(&pool->cpus_have_tags);
cpus_have_tags * pool->percpu_max_size > pool->nr_tags / 2;
cpus_have_tags--) {
@@ -231,8 +236,11 @@ void percpu_ida_free(struct percpu_ida *pool, unsigned tag)
spin_unlock(&tags->lock);
if (nr_free == 1) {
- cpumask_set_cpu(smp_processor_id(),
- &pool->cpus_have_tags);
+ cpumask_set_cpu(smp_processor_id(), &pool->cpus_have_tags);
+ /*
+ * Pairs with smp_rmb() in steal_tags()
+ */
+ smp_wmb();
wake_up(&pool->wait);
}
--
1.7.7.6
--
Regards,
Alexander Gordeev
agordeev@redhat.com
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH 2/5] percpu_ida: Move waking up waiters out of atomic contexts
[not found] <cover.1382950629.git.agordeev@redhat.com>
2013-10-28 10:05 ` [PATCH 1/5] percpu_ida: Fix data race on cpus_have_tags cpumask Alexander Gordeev
@ 2013-10-28 10:05 ` Alexander Gordeev
2013-10-28 21:21 ` Kent Overstreet
2013-10-28 10:05 ` [PATCH 3/5] percpu_ida: Optimize freeing tags when maximum cache size is 1 Alexander Gordeev
` (2 subsequent siblings)
4 siblings, 1 reply; 7+ messages in thread
From: Alexander Gordeev @ 2013-10-28 10:05 UTC (permalink / raw)
To: Kent Overstreet
Cc: Oleg Nesterov, Jens Axboe, Nicholas A. Bellinger, linux-kernel
Currently percpu_ida_free() waikes up waiters always with local
interrupts disabled and sometimes with pool->lock held. Yet, it
does not appear there is any reason why it could not be done out
of these atomic contexts.
Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
---
lib/percpu_ida.c | 11 +++++++----
1 files changed, 7 insertions(+), 4 deletions(-)
diff --git a/lib/percpu_ida.c b/lib/percpu_ida.c
index 96cfacf..9dd8741 100644
--- a/lib/percpu_ida.c
+++ b/lib/percpu_ida.c
@@ -223,6 +223,7 @@ void percpu_ida_free(struct percpu_ida *pool, unsigned tag)
struct percpu_ida_cpu *tags;
unsigned long flags;
unsigned nr_free;
+ bool wake_up = false;
BUG_ON(tag >= pool->nr_tags);
@@ -241,8 +242,8 @@ void percpu_ida_free(struct percpu_ida *pool, unsigned tag)
* Pairs with smp_rmb() in steal_tags()
*/
smp_wmb();
- wake_up(&pool->wait);
- }
+ wake_up = true;
+ }
if (nr_free == pool->percpu_max_size) {
spin_lock(&pool->lock);
@@ -255,13 +256,15 @@ void percpu_ida_free(struct percpu_ida *pool, unsigned tag)
move_tags(pool->freelist, &pool->nr_free,
tags->freelist, &tags->nr_free,
pool->percpu_batch_size);
-
- wake_up(&pool->wait);
+ wake_up = true;
}
spin_unlock(&pool->lock);
}
local_irq_restore(flags);
+
+ if (wake_up)
+ wake_up(&pool->wait);
}
EXPORT_SYMBOL_GPL(percpu_ida_free);
--
1.7.7.6
--
Regards,
Alexander Gordeev
agordeev@redhat.com
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH 3/5] percpu_ida: Optimize freeing tags when maximum cache size is 1
[not found] <cover.1382950629.git.agordeev@redhat.com>
2013-10-28 10:05 ` [PATCH 1/5] percpu_ida: Fix data race on cpus_have_tags cpumask Alexander Gordeev
2013-10-28 10:05 ` [PATCH 2/5] percpu_ida: Move waking up waiters out of atomic contexts Alexander Gordeev
@ 2013-10-28 10:05 ` Alexander Gordeev
2013-10-28 10:06 ` [PATCH 4/5] percpu_ida: Sanity check initialization parameters Alexander Gordeev
2013-10-28 10:06 ` [PATCH 5/5] percpu_ida: Allow variable maximum number of cached tags Alexander Gordeev
4 siblings, 0 replies; 7+ messages in thread
From: Alexander Gordeev @ 2013-10-28 10:05 UTC (permalink / raw)
To: Kent Overstreet
Cc: Oleg Nesterov, Jens Axboe, Nicholas A. Bellinger, linux-kernel
In case percpu_max_size is 1 percpu_ida_free() sets bit in
the cpumask and immediately transfers the tag to the pool.
As result, the very next call to percpu_ida_alloc() on the
same CPU will have to pull a tag from the pool to the local
cache and so on. Hence, positive effects of local caching
become largely negated.
This update assumes stealing tags is faster than ping-ponging
between local caches and the pool and prevents returning tags
to the pool in case percpu_max_size is 1.
Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
---
lib/percpu_ida.c | 5 ++---
1 files changed, 2 insertions(+), 3 deletions(-)
diff --git a/lib/percpu_ida.c b/lib/percpu_ida.c
index 9dd8741..4adc3e5 100644
--- a/lib/percpu_ida.c
+++ b/lib/percpu_ida.c
@@ -243,9 +243,8 @@ void percpu_ida_free(struct percpu_ida *pool, unsigned tag)
*/
smp_wmb();
wake_up = true;
- }
-
- if (nr_free == pool->percpu_max_size) {
+ } else if ((nr_free == pool->percpu_max_size) &&
+ (pool->percpu_max_size > 1)) {
spin_lock(&pool->lock);
/*
--
1.7.7.6
--
Regards,
Alexander Gordeev
agordeev@redhat.com
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH 4/5] percpu_ida: Sanity check initialization parameters
[not found] <cover.1382950629.git.agordeev@redhat.com>
` (2 preceding siblings ...)
2013-10-28 10:05 ` [PATCH 3/5] percpu_ida: Optimize freeing tags when maximum cache size is 1 Alexander Gordeev
@ 2013-10-28 10:06 ` Alexander Gordeev
2013-10-28 10:06 ` [PATCH 5/5] percpu_ida: Allow variable maximum number of cached tags Alexander Gordeev
4 siblings, 0 replies; 7+ messages in thread
From: Alexander Gordeev @ 2013-10-28 10:06 UTC (permalink / raw)
To: Kent Overstreet
Cc: Oleg Nesterov, Jens Axboe, Nicholas A. Bellinger, linux-kernel
Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
---
lib/percpu_ida.c | 5 +++++
1 files changed, 5 insertions(+), 0 deletions(-)
diff --git a/lib/percpu_ida.c b/lib/percpu_ida.c
index 4adc3e5..1fc89f9 100644
--- a/lib/percpu_ida.c
+++ b/lib/percpu_ida.c
@@ -298,6 +298,11 @@ int __percpu_ida_init(struct percpu_ida *pool, unsigned long nr_tags,
{
unsigned i, cpu, order;
+ if (batch_size > max_size)
+ return -ERANGE;
+ if (!batch_size)
+ return -EINVAL;
+
memset(pool, 0, sizeof(*pool));
init_waitqueue_head(&pool->wait);
--
1.7.7.6
--
Regards,
Alexander Gordeev
agordeev@redhat.com
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH 5/5] percpu_ida: Allow variable maximum number of cached tags
[not found] <cover.1382950629.git.agordeev@redhat.com>
` (3 preceding siblings ...)
2013-10-28 10:06 ` [PATCH 4/5] percpu_ida: Sanity check initialization parameters Alexander Gordeev
@ 2013-10-28 10:06 ` Alexander Gordeev
4 siblings, 0 replies; 7+ messages in thread
From: Alexander Gordeev @ 2013-10-28 10:06 UTC (permalink / raw)
To: Kent Overstreet
Cc: Oleg Nesterov, Jens Axboe, Nicholas A. Bellinger, linux-kernel
Currently a threshold for stealing tags from remote CPUs
is set to a half of the total number of tags. However,
in general case this threshold is a function not only of
the total number of tags and maximum number of tags per
CPU, but also of a usage pattern. Just let percpu_ida
users decide how big this threshold value should be.
Signed-off-by: Alexander Gordeev <agordeev@redhat.com>
---
block/blk-mq-tag.c | 9 +++++----
include/linux/percpu_ida.h | 5 +++--
lib/percpu_ida.c | 7 +++++--
3 files changed, 13 insertions(+), 8 deletions(-)
diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index d64a02f..da0b3dd 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -145,10 +145,11 @@ struct blk_mq_tags *blk_mq_init_tags(unsigned int total_tags,
tags->nr_max_cache = nr_cache;
tags->nr_batch_move = max(1u, nr_cache / 2);
- ret = __percpu_ida_init(&tags->free_tags, tags->nr_tags -
- tags->nr_reserved_tags,
+ ret = __percpu_ida_init(&tags->free_tags,
+ tags->nr_tags - tags->nr_reserved_tags,
tags->nr_max_cache,
- tags->nr_batch_move);
+ tags->nr_batch_move,
+ (tags->nr_tags - tags->nr_reserved_tags) / 2);
if (ret)
goto err_free_tags;
@@ -158,7 +159,7 @@ struct blk_mq_tags *blk_mq_init_tags(unsigned int total_tags,
* no cached. It's fine reserved tags allocation is slow.
*/
ret = __percpu_ida_init(&tags->reserved_tags, reserved_tags,
- 1, 1);
+ 1, 1, 0);
if (ret)
goto err_reserved_tags;
}
diff --git a/include/linux/percpu_ida.h b/include/linux/percpu_ida.h
index 1900bd0..d874cca 100644
--- a/include/linux/percpu_ida.h
+++ b/include/linux/percpu_ida.h
@@ -18,6 +18,7 @@ struct percpu_ida {
unsigned nr_tags;
unsigned percpu_max_size;
unsigned percpu_batch_size;
+ unsigned max_cached;
struct percpu_ida_cpu __percpu *tag_cpu;
@@ -66,11 +67,11 @@ void percpu_ida_free(struct percpu_ida *pool, unsigned tag);
void percpu_ida_destroy(struct percpu_ida *pool);
int __percpu_ida_init(struct percpu_ida *pool, unsigned long nr_tags,
- unsigned long max_size, unsigned long batch_size);
+ unsigned long max_size, unsigned long batch_size, unsigned max_cached);
static inline int percpu_ida_init(struct percpu_ida *pool, unsigned long nr_tags)
{
return __percpu_ida_init(pool, nr_tags, IDA_DEFAULT_PCPU_SIZE,
- IDA_DEFAULT_PCPU_BATCH_MOVE);
+ IDA_DEFAULT_PCPU_BATCH_MOVE, nr_tags / 2);
}
typedef int (*percpu_ida_cb)(unsigned, void *);
diff --git a/lib/percpu_ida.c b/lib/percpu_ida.c
index 1fc89f9..241f8a3 100644
--- a/lib/percpu_ida.c
+++ b/lib/percpu_ida.c
@@ -74,7 +74,7 @@ static inline void steal_tags(struct percpu_ida *pool,
smp_rmb();
for (cpus_have_tags = cpumask_weight(&pool->cpus_have_tags);
- cpus_have_tags * pool->percpu_max_size > pool->nr_tags / 2;
+ cpus_have_tags * pool->percpu_max_size > pool->max_cached;
cpus_have_tags--) {
cpu = cpumask_next(cpu, &pool->cpus_have_tags);
@@ -294,7 +294,7 @@ EXPORT_SYMBOL_GPL(percpu_ida_destroy);
* performance, the workload should not span more cpus than nr_tags / 128.
*/
int __percpu_ida_init(struct percpu_ida *pool, unsigned long nr_tags,
- unsigned long max_size, unsigned long batch_size)
+ unsigned long max_size, unsigned long batch_size, unsigned max_cached)
{
unsigned i, cpu, order;
@@ -302,6 +302,8 @@ int __percpu_ida_init(struct percpu_ida *pool, unsigned long nr_tags,
return -ERANGE;
if (!batch_size)
return -EINVAL;
+ if (max_cached > nr_tags)
+ return -EINVAL;
memset(pool, 0, sizeof(*pool));
@@ -310,6 +312,7 @@ int __percpu_ida_init(struct percpu_ida *pool, unsigned long nr_tags,
pool->nr_tags = nr_tags;
pool->percpu_max_size = max_size;
pool->percpu_batch_size = batch_size;
+ pool->max_cached = max_cached;
/* Guard against overflow */
if (nr_tags > (unsigned) INT_MAX + 1) {
--
1.7.7.6
--
Regards,
Alexander Gordeev
agordeev@redhat.com
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH 1/5] percpu_ida: Fix data race on cpus_have_tags cpumask
2013-10-28 10:05 ` [PATCH 1/5] percpu_ida: Fix data race on cpus_have_tags cpumask Alexander Gordeev
@ 2013-10-28 21:20 ` Kent Overstreet
0 siblings, 0 replies; 7+ messages in thread
From: Kent Overstreet @ 2013-10-28 21:20 UTC (permalink / raw)
To: Alexander Gordeev
Cc: Oleg Nesterov, Jens Axboe, Nicholas A. Bellinger, linux-kernel
On Mon, Oct 28, 2013 at 11:05:06AM +0100, Alexander Gordeev wrote:
> Function steal_tags() might miss a bit in cpus_have_tags due to
> unsynchronized access from percpu_ida_free(). As result, function
> percpu_ida_alloc() might enter unwakable sleep. This update adds
> memory barriers to prevent the described scenario.
>
> In fact, accesses to cpus_have_tags are fenced by prepare_to_wait()
> and wake_up() calls at the moment and the aforementioned sequence
> does not appear could hit. Nevertheless, explicit memory barriers
> still seem justifiable.
Good catch!
Acked-by: Kent Overstreet <kmo@daterainc.com>
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 2/5] percpu_ida: Move waking up waiters out of atomic contexts
2013-10-28 10:05 ` [PATCH 2/5] percpu_ida: Move waking up waiters out of atomic contexts Alexander Gordeev
@ 2013-10-28 21:21 ` Kent Overstreet
0 siblings, 0 replies; 7+ messages in thread
From: Kent Overstreet @ 2013-10-28 21:21 UTC (permalink / raw)
To: Alexander Gordeev
Cc: Oleg Nesterov, Jens Axboe, Nicholas A. Bellinger, linux-kernel
On Mon, Oct 28, 2013 at 11:05:25AM +0100, Alexander Gordeev wrote:
> Currently percpu_ida_free() waikes up waiters always with local
> interrupts disabled and sometimes with pool->lock held. Yet, it
> does not appear there is any reason why it could not be done out
> of these atomic contexts.
This should be a noticable performance boost, nested irqsave/restore is painful.
Acked-by: Kent Overstreet <kmo@daterainc.com>
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2013-10-28 21:21 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <cover.1382950629.git.agordeev@redhat.com>
2013-10-28 10:05 ` [PATCH 1/5] percpu_ida: Fix data race on cpus_have_tags cpumask Alexander Gordeev
2013-10-28 21:20 ` Kent Overstreet
2013-10-28 10:05 ` [PATCH 2/5] percpu_ida: Move waking up waiters out of atomic contexts Alexander Gordeev
2013-10-28 21:21 ` Kent Overstreet
2013-10-28 10:05 ` [PATCH 3/5] percpu_ida: Optimize freeing tags when maximum cache size is 1 Alexander Gordeev
2013-10-28 10:06 ` [PATCH 4/5] percpu_ida: Sanity check initialization parameters Alexander Gordeev
2013-10-28 10:06 ` [PATCH 5/5] percpu_ida: Allow variable maximum number of cached tags Alexander Gordeev
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.