* [PATCH] percpu_ida: Handle out-of-tags gracefully
@ 2014-03-10 14:12 Bart Van Assche
2014-03-11 13:51 ` Alexander Gordeev
0 siblings, 1 reply; 9+ messages in thread
From: Bart Van Assche @ 2014-03-10 14:12 UTC (permalink / raw)
To: Jens Axboe
Cc: Kent Overstreet, Shaohua Li, Christoph Hellwig, Jens Axboe,
Alexander Gordeev, Mike Christie, linux-kernel
Avoid that percpu_ida_alloc() hangs or crashes if there are still
tags are available. Wait until a tag becomes available instead of
giving up when running out of tags temporarily. This patch fixes
the following kernel bug:
------------[ cut here ]------------
kernel BUG at lib/percpu_ida.c:81!
invalid opcode: 0000 [#1] SMP
RIP: 0010:[<ffffffff8120f00e>] [<ffffffff8120f00e>] percpu_ida_alloc+0x33e/0x370
Call Trace:
[<ffffffff811ef95f>] blk_mq_get_tag+0x2f/0x50
[<ffffffff811ed79c>] blk_mq_alloc_rq.isra.17+0x1c/0x90
[<ffffffff811eeb9b>] blk_mq_alloc_request_pinned+0x9b/0x110
[<ffffffff811ef4c6>] blk_mq_make_request+0x426/0x480
[<ffffffff811e28f0>] generic_make_request+0xc0/0x110
[<ffffffff811e29ab>] submit_bio+0x6b/0x140
[<ffffffff8117aabb>] _submit_bh+0x13b/0x220
[<ffffffff8117d70f>] block_read_full_page+0x1ff/0x300
[<ffffffff81181128>] blkdev_readpage+0x18/0x20
[<ffffffff811067b7>] __do_page_cache_readahead+0x277/0x280
[<ffffffff81106d1d>] force_page_cache_readahead+0x8d/0xc0
[<ffffffff81106d9b>] page_cache_sync_readahead+0x4b/0x50
[<ffffffff810fdf05>] generic_file_aio_read+0x4c5/0x700
[<ffffffff8118147b>] blkdev_aio_read+0x4b/0x70
[<ffffffff8114a28a>] do_sync_read+0x5a/0x90
[<ffffffff8114a8cb>] vfs_read+0x9b/0x160
[<ffffffff8114b389>] SyS_read+0x49/0xa0
[<ffffffff81416049>] tracesys+0xd0/0xd5
---[ end trace cdd1a8a7968266cf ]---
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Cc: Kent Overstreet <kmo@daterainc.com>
Cc: Shaohua Li <shli@kernel.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Alexander Gordeev <agordeev@redhat.com>
Cc: Mike Christie <michaelc@cs.wisc.edu>
---
lib/percpu_ida.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/lib/percpu_ida.c b/lib/percpu_ida.c
index 93d145e..170d27c 100644
--- a/lib/percpu_ida.c
+++ b/lib/percpu_ida.c
@@ -73,7 +73,7 @@ static inline void steal_tags(struct percpu_ida *pool,
if (cpu >= nr_cpu_ids) {
cpu = cpumask_first(&pool->cpus_have_tags);
if (cpu >= nr_cpu_ids)
- BUG();
+ break;
}
pool->cpu_last_stolen = cpu;
@@ -189,6 +189,9 @@ int percpu_ida_alloc(struct percpu_ida *pool, int state)
spin_unlock(&pool->lock);
local_irq_restore(flags);
+ if (tags->nr_free)
+ wake_up(&pool->wait);
+
if (tag >= 0 || state == TASK_RUNNING)
break;
--
1.8.4.5
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH] percpu_ida: Handle out-of-tags gracefully
2014-03-10 14:12 [PATCH] percpu_ida: Handle out-of-tags gracefully Bart Van Assche
@ 2014-03-11 13:51 ` Alexander Gordeev
2014-03-11 18:10 ` Bart Van Assche
0 siblings, 1 reply; 9+ messages in thread
From: Alexander Gordeev @ 2014-03-11 13:51 UTC (permalink / raw)
To: Bart Van Assche
Cc: Jens Axboe, Kent Overstreet, Shaohua Li, Christoph Hellwig,
Mike Christie, linux-kernel
On Mon, Mar 10, 2014 at 03:12:33PM +0100, Bart Van Assche wrote:
> Avoid that percpu_ida_alloc() hangs or crashes if there are still
> tags are available. Wait until a tag becomes available instead of
> giving up when running out of tags temporarily. This patch fixes
> the following kernel bug:
Hi Bart,
Few comments below, but the changelog does not correspond to the
actual change in 'Wait until a tag becomes available'.
> ------------[ cut here ]------------
> kernel BUG at lib/percpu_ida.c:81!
> invalid opcode: 0000 [#1] SMP
> RIP: 0010:[<ffffffff8120f00e>] [<ffffffff8120f00e>] percpu_ida_alloc+0x33e/0x370
> Call Trace:
> [<ffffffff811ef95f>] blk_mq_get_tag+0x2f/0x50
> [<ffffffff811ed79c>] blk_mq_alloc_rq.isra.17+0x1c/0x90
> [<ffffffff811eeb9b>] blk_mq_alloc_request_pinned+0x9b/0x110
> [<ffffffff811ef4c6>] blk_mq_make_request+0x426/0x480
> [<ffffffff811e28f0>] generic_make_request+0xc0/0x110
> [<ffffffff811e29ab>] submit_bio+0x6b/0x140
> [<ffffffff8117aabb>] _submit_bh+0x13b/0x220
> [<ffffffff8117d70f>] block_read_full_page+0x1ff/0x300
> [<ffffffff81181128>] blkdev_readpage+0x18/0x20
> [<ffffffff811067b7>] __do_page_cache_readahead+0x277/0x280
> [<ffffffff81106d1d>] force_page_cache_readahead+0x8d/0xc0
> [<ffffffff81106d9b>] page_cache_sync_readahead+0x4b/0x50
> [<ffffffff810fdf05>] generic_file_aio_read+0x4c5/0x700
> [<ffffffff8118147b>] blkdev_aio_read+0x4b/0x70
> [<ffffffff8114a28a>] do_sync_read+0x5a/0x90
> [<ffffffff8114a8cb>] vfs_read+0x9b/0x160
> [<ffffffff8114b389>] SyS_read+0x49/0xa0
> [<ffffffff81416049>] tracesys+0xd0/0xd5
> ---[ end trace cdd1a8a7968266cf ]---
>
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> Cc: Kent Overstreet <kmo@daterainc.com>
> Cc: Shaohua Li <shli@kernel.org>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Jens Axboe <axboe@kernel.dk>
> Cc: Alexander Gordeev <agordeev@redhat.com>
> Cc: Mike Christie <michaelc@cs.wisc.edu>
> ---
> lib/percpu_ida.c | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/lib/percpu_ida.c b/lib/percpu_ida.c
> index 93d145e..170d27c 100644
> --- a/lib/percpu_ida.c
> +++ b/lib/percpu_ida.c
> @@ -73,7 +73,7 @@ static inline void steal_tags(struct percpu_ida *pool,
> if (cpu >= nr_cpu_ids) {
> cpu = cpumask_first(&pool->cpus_have_tags);
> if (cpu >= nr_cpu_ids)
> - BUG();
> + break;
I assume the BUG() above hits? If so, I am failing to understand how
the code gets here. Mind elaborate?
> }
>
> pool->cpu_last_stolen = cpu;
> @@ -189,6 +189,9 @@ int percpu_ida_alloc(struct percpu_ida *pool, int state)
> spin_unlock(&pool->lock);
> local_irq_restore(flags);
>
> + if (tags->nr_free)
> + wake_up(&pool->wait);
> +
How 'tags->nr_free' could be checked out of locks?
Why waking up another thread instead of returning the tag on this CPU?
Why 'percpu_max_size' threshold is ignored?
Anyway, IMHO the above BUG() indicates a problem elsewhere.
> if (tag >= 0 || state == TASK_RUNNING)
> break;
>
> --
> 1.8.4.5
>
--
Regards,
Alexander Gordeev
agordeev@redhat.com
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] percpu_ida: Handle out-of-tags gracefully
2014-03-11 13:51 ` Alexander Gordeev
@ 2014-03-11 18:10 ` Bart Van Assche
2014-03-11 20:48 ` Alexander Gordeev
0 siblings, 1 reply; 9+ messages in thread
From: Bart Van Assche @ 2014-03-11 18:10 UTC (permalink / raw)
To: Alexander Gordeev
Cc: Jens Axboe, Kent Overstreet, Shaohua Li, Christoph Hellwig,
Mike Christie, linux-kernel
On 03/11/14 14:51, Alexander Gordeev wrote:
> On Mon, Mar 10, 2014 at 03:12:33PM +0100, Bart Van Assche wrote:
>> Avoid that percpu_ida_alloc() hangs or crashes if there are still
>> tags are available. Wait until a tag becomes available instead of
>> giving up when running out of tags temporarily. This patch fixes
>> the following kernel bug:
>
> Hi Bart,
>
> Few comments below, but the changelog does not correspond to the
> actual change in 'Wait until a tag becomes available'.
>
>> ------------[ cut here ]------------
>> kernel BUG at lib/percpu_ida.c:81!
>> invalid opcode: 0000 [#1] SMP
>> RIP: 0010:[<ffffffff8120f00e>] [<ffffffff8120f00e>] percpu_ida_alloc+0x33e/0x370
>> Call Trace:
>> [<ffffffff811ef95f>] blk_mq_get_tag+0x2f/0x50
>> [<ffffffff811ed79c>] blk_mq_alloc_rq.isra.17+0x1c/0x90
>> [<ffffffff811eeb9b>] blk_mq_alloc_request_pinned+0x9b/0x110
>> [<ffffffff811ef4c6>] blk_mq_make_request+0x426/0x480
>> [<ffffffff811e28f0>] generic_make_request+0xc0/0x110
>> [<ffffffff811e29ab>] submit_bio+0x6b/0x140
>> [<ffffffff8117aabb>] _submit_bh+0x13b/0x220
>> [<ffffffff8117d70f>] block_read_full_page+0x1ff/0x300
>> [<ffffffff81181128>] blkdev_readpage+0x18/0x20
>> [<ffffffff811067b7>] __do_page_cache_readahead+0x277/0x280
>> [<ffffffff81106d1d>] force_page_cache_readahead+0x8d/0xc0
>> [<ffffffff81106d9b>] page_cache_sync_readahead+0x4b/0x50
>> [<ffffffff810fdf05>] generic_file_aio_read+0x4c5/0x700
>> [<ffffffff8118147b>] blkdev_aio_read+0x4b/0x70
>> [<ffffffff8114a28a>] do_sync_read+0x5a/0x90
>> [<ffffffff8114a8cb>] vfs_read+0x9b/0x160
>> [<ffffffff8114b389>] SyS_read+0x49/0xa0
>> [<ffffffff81416049>] tracesys+0xd0/0xd5
>> ---[ end trace cdd1a8a7968266cf ]---
>>
>> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
>> Cc: Kent Overstreet <kmo@daterainc.com>
>> Cc: Shaohua Li <shli@kernel.org>
>> Cc: Christoph Hellwig <hch@lst.de>
>> Cc: Jens Axboe <axboe@kernel.dk>
>> Cc: Alexander Gordeev <agordeev@redhat.com>
>> Cc: Mike Christie <michaelc@cs.wisc.edu>
>> ---
>> lib/percpu_ida.c | 5 ++++-
>> 1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/lib/percpu_ida.c b/lib/percpu_ida.c
>> index 93d145e..170d27c 100644
>> --- a/lib/percpu_ida.c
>> +++ b/lib/percpu_ida.c
>> @@ -73,7 +73,7 @@ static inline void steal_tags(struct percpu_ida *pool,
>> if (cpu >= nr_cpu_ids) {
>> cpu = cpumask_first(&pool->cpus_have_tags);
>> if (cpu >= nr_cpu_ids)
>> - BUG();
>> + break;
>
> I assume the BUG() above hits? If so, I am failing to understand how
> the code gets here. Mind elaborate?
Hello Alexander,
You are correct, the BUG() mentioned in the call stack in the
description of this patch does indeed correspond with the BUG()
statement in the above code. That BUG() was encountered while testing
the scsi-mq patch series with a workload with a large queue depth. I
think the fact that I hit that BUG() statement means that my workload
was queueing requests faster than these were processed by the SCSI LLD
and hence that percpu_ida_alloc() ran out of tags.
>> }
>>
>> pool->cpu_last_stolen = cpu;
>> @@ -189,6 +189,9 @@ int percpu_ida_alloc(struct percpu_ida *pool, int state)
>> spin_unlock(&pool->lock);
>> local_irq_restore(flags);
>>
>> + if (tags->nr_free)
>> + wake_up(&pool->wait);
>> +
>
> How 'tags->nr_free' could be checked out of locks?
> Why waking up another thread instead of returning the tag on this CPU?
> Why 'percpu_max_size' threshold is ignored?
Sorry but I'm not sure how much experience you have with kernel coding ?
There are several examples in the Linux kernel of kernel drivers where
variables that are shared over CPU's are on purpose read without first
taking the lock that protects these variables. Students are thought at
the university that all accesses of shared variables should be protected
via locking. In the Linux kernel however it is common practice to read a
shared variable without locking if the code that does this will work
fine if such a read returns an previous value of that variable instead
of the latest value.
> Anyway, IMHO the above BUG() indicates a problem elsewhere.
Sorry but I disagree. percpu_ida_alloc() should either return an error
code or keep waiting when out of tags. Invoking BUG() when out of tags
is wrong.
Bart.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] percpu_ida: Handle out-of-tags gracefully
2014-03-11 18:10 ` Bart Van Assche
@ 2014-03-11 20:48 ` Alexander Gordeev
2014-03-12 7:22 ` Bart Van Assche
0 siblings, 1 reply; 9+ messages in thread
From: Alexander Gordeev @ 2014-03-11 20:48 UTC (permalink / raw)
To: Bart Van Assche
Cc: Jens Axboe, Kent Overstreet, Shaohua Li, Christoph Hellwig,
Mike Christie, linux-kernel
On Tue, Mar 11, 2014 at 07:10:18PM +0100, Bart Van Assche wrote:
> > I assume the BUG() above hits? If so, I am failing to understand how
> > the code gets here. Mind elaborate?
>
> Hello Alexander,
>
> You are correct, the BUG() mentioned in the call stack in the
> description of this patch does indeed correspond with the BUG()
> statement in the above code. That BUG() was encountered while testing
> the scsi-mq patch series with a workload with a large queue depth. I
> think the fact that I hit that BUG() statement means that my workload
> was queueing requests faster than these were processed by the SCSI LLD
> and hence that percpu_ida_alloc() ran out of tags.
Function steal_tags() is entered with disabled interrupts and
pool->lock taken. Then the 'for' cycle enters/loops while 'cpus_have_tags'
is not zero. Which means we can not end up with no set bits at all -
and that is the reason why BUG() is (legitimately) placed there.
> Bart.
--
Regards,
Alexander Gordeev
agordeev@redhat.com
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] percpu_ida: Handle out-of-tags gracefully
2014-03-11 20:48 ` Alexander Gordeev
@ 2014-03-12 7:22 ` Bart Van Assche
2014-03-12 8:41 ` Alexander Gordeev
2014-03-12 15:21 ` Alexander Gordeev
0 siblings, 2 replies; 9+ messages in thread
From: Bart Van Assche @ 2014-03-12 7:22 UTC (permalink / raw)
To: Alexander Gordeev
Cc: Jens Axboe, Kent Overstreet, Shaohua Li, Christoph Hellwig,
Mike Christie, linux-kernel
On 03/11/14 21:48, Alexander Gordeev wrote:
> On Tue, Mar 11, 2014 at 07:10:18PM +0100, Bart Van Assche wrote:
>>> I assume the BUG() above hits? If so, I am failing to understand how
>>> the code gets here. Mind elaborate?
>>
>> You are correct, the BUG() mentioned in the call stack in the
>> description of this patch does indeed correspond with the BUG()
>> statement in the above code. That BUG() was encountered while testing
>> the scsi-mq patch series with a workload with a large queue depth. I
>> think the fact that I hit that BUG() statement means that my workload
>> was queueing requests faster than these were processed by the SCSI LLD
>> and hence that percpu_ida_alloc() ran out of tags.
>
> Function steal_tags() is entered with disabled interrupts and
> pool->lock taken. Then the 'for' cycle enters/loops while 'cpus_have_tags'
> is not zero. Which means we can not end up with no set bits at all -
> and that is the reason why BUG() is (legitimately) placed there.
Sorry but the above reasoning is wrong. Even if interrupts are disabled
on one CPU, even if that CPU holds pool->lock, and even if
cpus_have_tags has at least one bit set at the time steal_tags() starts,
it is still possible that another CPU obtains "remote->lock" before
steal_tags() can obtain that lock and that that other CPU causes
remote->nr_free to drop to zero. I am aware the percpu_ida code is not
easy to read due to such complex interactions between CPU cores.
However, my understanding is that the goal of the percpu_ida allocator
was not that its code would be easy to read but that its performance
would be optimal.
Is this sufficient to make you have another look at my patch ?
Thanks,
Bart.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] percpu_ida: Handle out-of-tags gracefully
2014-03-12 7:22 ` Bart Van Assche
@ 2014-03-12 8:41 ` Alexander Gordeev
2014-03-12 10:05 ` Bart Van Assche
2014-03-12 15:21 ` Alexander Gordeev
1 sibling, 1 reply; 9+ messages in thread
From: Alexander Gordeev @ 2014-03-12 8:41 UTC (permalink / raw)
To: Bart Van Assche
Cc: Jens Axboe, Kent Overstreet, Shaohua Li, Christoph Hellwig,
Mike Christie, linux-kernel
On Wed, Mar 12, 2014 at 08:22:22AM +0100, Bart Van Assche wrote:
> > Function steal_tags() is entered with disabled interrupts and
> > pool->lock taken. Then the 'for' cycle enters/loops while 'cpus_have_tags'
> > is not zero. Which means we can not end up with no set bits at all -
> > and that is the reason why BUG() is (legitimately) placed there.
>
> Sorry but the above reasoning is wrong. Even if interrupts are disabled
> on one CPU, even if that CPU holds pool->lock, and even if
> cpus_have_tags has at least one bit set at the time steal_tags() starts,
> it is still possible that another CPU obtains "remote->lock" before
> steal_tags() can obtain that lock and that that other CPU causes
> remote->nr_free to drop to zero. I am aware the percpu_ida code is not
> easy to read due to such complex interactions between CPU cores.
> However, my understanding is that the goal of the percpu_ida allocator
> was not that its code would be easy to read but that its performance
> would be optimal.
>
> Is this sufficient to make you have another look at my patch ?
Yep, makes sense - thanks for the clarification.
Still the hunk below (a) breaks the 'pool->percpu_max_size' threshold
and (b) somehow suboptimal, because you wake another thread while a
free tag was/is on this CPU. If it is still here we would better to
grab it. If not, it was stolen by another thread and we do not need
to wake one (not sure how could it be addressed, though).
In fact, did you try to remove this hunk at all? A following call to
percpu_ida_free() both honors the threshold and wakes a thread, so
your extra wake could be unnecessary.
@@ -189,6 +189,9 @@ int percpu_ida_alloc(struct percpu_ida *pool, int state)
spin_unlock(&pool->lock);
local_irq_restore(flags);
+ if (tags->nr_free)
+ wake_up(&pool->wait);
+
if (tag >= 0 || state == TASK_RUNNING)
break;
--
1.8.4.5
> Thanks,
>
> Bart.
>
--
Regards,
Alexander Gordeev
agordeev@redhat.com
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] percpu_ida: Handle out-of-tags gracefully
2014-03-12 8:41 ` Alexander Gordeev
@ 2014-03-12 10:05 ` Bart Van Assche
0 siblings, 0 replies; 9+ messages in thread
From: Bart Van Assche @ 2014-03-12 10:05 UTC (permalink / raw)
To: Alexander Gordeev
Cc: Jens Axboe, Kent Overstreet, Shaohua Li, Christoph Hellwig,
Mike Christie, linux-kernel
On 03/12/14 09:41, Alexander Gordeev wrote:
> Still the hunk below (a) breaks the 'pool->percpu_max_size' threshold
> and (b) somehow suboptimal, because you wake another thread while a
> free tag was/is on this CPU. If it is still here we would better to
> grab it. If not, it was stolen by another thread and we do not need
> to wake one (not sure how could it be addressed, though).
>
> In fact, did you try to remove this hunk at all? A following call to
> percpu_ida_free() both honors the threshold and wakes a thread, so
> your extra wake could be unnecessary.
>
> @@ -189,6 +189,9 @@ int percpu_ida_alloc(struct percpu_ida *pool, int state)
> spin_unlock(&pool->lock);
> local_irq_restore(flags);
>
> + if (tags->nr_free)
> + wake_up(&pool->wait);
> +
> if (tag >= 0 || state == TASK_RUNNING)
> break;
>
>
Hello Alexander,
You are right, that hunk is not necessary and can be left out. That code
was added while chasing another (unrelated) bug. I will resend this
patch without that hunk.
Bart.
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] percpu_ida: Handle out-of-tags gracefully
2014-03-12 7:22 ` Bart Van Assche
2014-03-12 8:41 ` Alexander Gordeev
@ 2014-03-12 15:21 ` Alexander Gordeev
2014-03-12 16:16 ` Bart Van Assche
1 sibling, 1 reply; 9+ messages in thread
From: Alexander Gordeev @ 2014-03-12 15:21 UTC (permalink / raw)
To: Bart Van Assche
Cc: Jens Axboe, Kent Overstreet, Shaohua Li, Christoph Hellwig,
Mike Christie, linux-kernel
On Wed, Mar 12, 2014 at 08:22:22AM +0100, Bart Van Assche wrote:
> On 03/11/14 21:48, Alexander Gordeev wrote:
> > On Tue, Mar 11, 2014 at 07:10:18PM +0100, Bart Van Assche wrote:
> >>> I assume the BUG() above hits? If so, I am failing to understand how
> >>> the code gets here. Mind elaborate?
> >>
> >> You are correct, the BUG() mentioned in the call stack in the
> >> description of this patch does indeed correspond with the BUG()
> >> statement in the above code. That BUG() was encountered while testing
> >> the scsi-mq patch series with a workload with a large queue depth. I
> >> think the fact that I hit that BUG() statement means that my workload
> >> was queueing requests faster than these were processed by the SCSI LLD
> >> and hence that percpu_ida_alloc() ran out of tags.
> >
> > Function steal_tags() is entered with disabled interrupts and
> > pool->lock taken. Then the 'for' cycle enters/loops while 'cpus_have_tags'
> > is not zero. Which means we can not end up with no set bits at all -
> > and that is the reason why BUG() is (legitimately) placed there.
>
> Sorry but the above reasoning is wrong. Even if interrupts are disabled
> on one CPU, even if that CPU holds pool->lock, and even if
> cpus_have_tags has at least one bit set at the time steal_tags() starts,
> it is still possible that another CPU obtains "remote->lock" before
> steal_tags() can obtain that lock and that that other CPU causes
> remote->nr_free to drop to zero.
I stared at the code again and I am still not getting how the BUG() gets
hit. The scenario you describe is impossible, because the code that checks
'cpus_have_tags' on one CPU and the code which can steal tags on another CPU
is protected by 'pool->lock' - that is the same steal_tags() function.
While 'remote->nr_free' could be dropped on another CPU (in fact from
percpu_ida_alloc(), not from concurrent steal_tags()) it still does not
explain how steal_tags() enters the loop, but fails to locate 'cpus_have_tags'
count of bits.
So although v2 of your patch fixes the crash it does not address the root
cause IMHO.
May be the following bits in percpu_ida_free() need a closer look:
if (nr_free == 1) {
cpumask_set_cpu(smp_processor_id(),
&pool->cpus_have_tags);
wake_up(&pool->wait);
}
I do not see anything suspicious, but may be the fact cpumask_set_cpu()
is out of any lock contributes to the problem? Do not know.
Would you be able to check if i.e. this hack makes the BUG() to go?
Thanks!
diff --git a/lib/percpu_ida.c b/lib/percpu_ida.c
index 93d145e..8715d0e 100644
--- a/lib/percpu_ida.c
+++ b/lib/percpu_ida.c
@@ -233,6 +233,11 @@ void percpu_ida_free(struct percpu_ida *pool, unsigned tag)
nr_free = tags->nr_free;
spin_unlock(&tags->lock);
+ if (!nr_free)
+ goto out;
+
+ spin_lock(&pool->lock);
+
if (nr_free == 1) {
cpumask_set_cpu(smp_processor_id(),
&pool->cpus_have_tags);
@@ -240,7 +245,6 @@ void percpu_ida_free(struct percpu_ida *pool, unsigned tag)
}
if (nr_free == pool->percpu_max_size) {
- spin_lock(&pool->lock);
/*
* Global lock held and irqs disabled, don't need percpu
@@ -253,9 +257,11 @@ void percpu_ida_free(struct percpu_ida *pool, unsigned tag)
wake_up(&pool->wait);
}
- spin_unlock(&pool->lock);
}
+ spin_unlock(&pool->lock);
+
+out:
local_irq_restore(flags);
}
EXPORT_SYMBOL_GPL(percpu_ida_free);
--
Regards,
Alexander Gordeev
agordeev@redhat.com
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH] percpu_ida: Handle out-of-tags gracefully
2014-03-12 15:21 ` Alexander Gordeev
@ 2014-03-12 16:16 ` Bart Van Assche
0 siblings, 0 replies; 9+ messages in thread
From: Bart Van Assche @ 2014-03-12 16:16 UTC (permalink / raw)
To: Alexander Gordeev
Cc: Jens Axboe, Kent Overstreet, Shaohua Li, Christoph Hellwig,
Mike Christie, linux-kernel
On 03/12/14 16:21, Alexander Gordeev wrote:
> While 'remote->nr_free' could be dropped on another CPU it still does not
> explain how steal_tags() enters the loop, but fails to locate 'cpus_have_tags'
> count of bits.
>
> So although v2 of your patch fixes the crash it does not address the root
> cause IMHO.
Hmm ... you are probably right. I will analyze this further and see what
I can come up with.
Bart.
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2014-03-12 16:16 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-03-10 14:12 [PATCH] percpu_ida: Handle out-of-tags gracefully Bart Van Assche
2014-03-11 13:51 ` Alexander Gordeev
2014-03-11 18:10 ` Bart Van Assche
2014-03-11 20:48 ` Alexander Gordeev
2014-03-12 7:22 ` Bart Van Assche
2014-03-12 8:41 ` Alexander Gordeev
2014-03-12 10:05 ` Bart Van Assche
2014-03-12 15:21 ` Alexander Gordeev
2014-03-12 16:16 ` Bart Van Assche
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox