* [PATCH] percpu_ida: Handle out-of-tags gracefully @ 2014-03-10 14:12 Bart Van Assche 2014-03-11 13:51 ` Alexander Gordeev 0 siblings, 1 reply; 9+ messages in thread From: Bart Van Assche @ 2014-03-10 14:12 UTC (permalink / raw) To: Jens Axboe Cc: Kent Overstreet, Shaohua Li, Christoph Hellwig, Jens Axboe, Alexander Gordeev, Mike Christie, linux-kernel Avoid that percpu_ida_alloc() hangs or crashes if there are still tags are available. Wait until a tag becomes available instead of giving up when running out of tags temporarily. This patch fixes the following kernel bug: ------------[ cut here ]------------ kernel BUG at lib/percpu_ida.c:81! invalid opcode: 0000 [#1] SMP RIP: 0010:[<ffffffff8120f00e>] [<ffffffff8120f00e>] percpu_ida_alloc+0x33e/0x370 Call Trace: [<ffffffff811ef95f>] blk_mq_get_tag+0x2f/0x50 [<ffffffff811ed79c>] blk_mq_alloc_rq.isra.17+0x1c/0x90 [<ffffffff811eeb9b>] blk_mq_alloc_request_pinned+0x9b/0x110 [<ffffffff811ef4c6>] blk_mq_make_request+0x426/0x480 [<ffffffff811e28f0>] generic_make_request+0xc0/0x110 [<ffffffff811e29ab>] submit_bio+0x6b/0x140 [<ffffffff8117aabb>] _submit_bh+0x13b/0x220 [<ffffffff8117d70f>] block_read_full_page+0x1ff/0x300 [<ffffffff81181128>] blkdev_readpage+0x18/0x20 [<ffffffff811067b7>] __do_page_cache_readahead+0x277/0x280 [<ffffffff81106d1d>] force_page_cache_readahead+0x8d/0xc0 [<ffffffff81106d9b>] page_cache_sync_readahead+0x4b/0x50 [<ffffffff810fdf05>] generic_file_aio_read+0x4c5/0x700 [<ffffffff8118147b>] blkdev_aio_read+0x4b/0x70 [<ffffffff8114a28a>] do_sync_read+0x5a/0x90 [<ffffffff8114a8cb>] vfs_read+0x9b/0x160 [<ffffffff8114b389>] SyS_read+0x49/0xa0 [<ffffffff81416049>] tracesys+0xd0/0xd5 ---[ end trace cdd1a8a7968266cf ]--- Signed-off-by: Bart Van Assche <bvanassche@acm.org> Cc: Kent Overstreet <kmo@daterainc.com> Cc: Shaohua Li <shli@kernel.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Jens Axboe <axboe@kernel.dk> Cc: Alexander Gordeev <agordeev@redhat.com> Cc: Mike Christie <michaelc@cs.wisc.edu> --- lib/percpu_ida.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/lib/percpu_ida.c b/lib/percpu_ida.c index 93d145e..170d27c 100644 --- a/lib/percpu_ida.c +++ b/lib/percpu_ida.c @@ -73,7 +73,7 @@ static inline void steal_tags(struct percpu_ida *pool, if (cpu >= nr_cpu_ids) { cpu = cpumask_first(&pool->cpus_have_tags); if (cpu >= nr_cpu_ids) - BUG(); + break; } pool->cpu_last_stolen = cpu; @@ -189,6 +189,9 @@ int percpu_ida_alloc(struct percpu_ida *pool, int state) spin_unlock(&pool->lock); local_irq_restore(flags); + if (tags->nr_free) + wake_up(&pool->wait); + if (tag >= 0 || state == TASK_RUNNING) break; -- 1.8.4.5 ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH] percpu_ida: Handle out-of-tags gracefully 2014-03-10 14:12 [PATCH] percpu_ida: Handle out-of-tags gracefully Bart Van Assche @ 2014-03-11 13:51 ` Alexander Gordeev 2014-03-11 18:10 ` Bart Van Assche 0 siblings, 1 reply; 9+ messages in thread From: Alexander Gordeev @ 2014-03-11 13:51 UTC (permalink / raw) To: Bart Van Assche Cc: Jens Axboe, Kent Overstreet, Shaohua Li, Christoph Hellwig, Mike Christie, linux-kernel On Mon, Mar 10, 2014 at 03:12:33PM +0100, Bart Van Assche wrote: > Avoid that percpu_ida_alloc() hangs or crashes if there are still > tags are available. Wait until a tag becomes available instead of > giving up when running out of tags temporarily. This patch fixes > the following kernel bug: Hi Bart, Few comments below, but the changelog does not correspond to the actual change in 'Wait until a tag becomes available'. > ------------[ cut here ]------------ > kernel BUG at lib/percpu_ida.c:81! > invalid opcode: 0000 [#1] SMP > RIP: 0010:[<ffffffff8120f00e>] [<ffffffff8120f00e>] percpu_ida_alloc+0x33e/0x370 > Call Trace: > [<ffffffff811ef95f>] blk_mq_get_tag+0x2f/0x50 > [<ffffffff811ed79c>] blk_mq_alloc_rq.isra.17+0x1c/0x90 > [<ffffffff811eeb9b>] blk_mq_alloc_request_pinned+0x9b/0x110 > [<ffffffff811ef4c6>] blk_mq_make_request+0x426/0x480 > [<ffffffff811e28f0>] generic_make_request+0xc0/0x110 > [<ffffffff811e29ab>] submit_bio+0x6b/0x140 > [<ffffffff8117aabb>] _submit_bh+0x13b/0x220 > [<ffffffff8117d70f>] block_read_full_page+0x1ff/0x300 > [<ffffffff81181128>] blkdev_readpage+0x18/0x20 > [<ffffffff811067b7>] __do_page_cache_readahead+0x277/0x280 > [<ffffffff81106d1d>] force_page_cache_readahead+0x8d/0xc0 > [<ffffffff81106d9b>] page_cache_sync_readahead+0x4b/0x50 > [<ffffffff810fdf05>] generic_file_aio_read+0x4c5/0x700 > [<ffffffff8118147b>] blkdev_aio_read+0x4b/0x70 > [<ffffffff8114a28a>] do_sync_read+0x5a/0x90 > [<ffffffff8114a8cb>] vfs_read+0x9b/0x160 > [<ffffffff8114b389>] SyS_read+0x49/0xa0 > [<ffffffff81416049>] tracesys+0xd0/0xd5 > ---[ end trace cdd1a8a7968266cf ]--- > > Signed-off-by: Bart Van Assche <bvanassche@acm.org> > Cc: Kent Overstreet <kmo@daterainc.com> > Cc: Shaohua Li <shli@kernel.org> > Cc: Christoph Hellwig <hch@lst.de> > Cc: Jens Axboe <axboe@kernel.dk> > Cc: Alexander Gordeev <agordeev@redhat.com> > Cc: Mike Christie <michaelc@cs.wisc.edu> > --- > lib/percpu_ida.c | 5 ++++- > 1 file changed, 4 insertions(+), 1 deletion(-) > > diff --git a/lib/percpu_ida.c b/lib/percpu_ida.c > index 93d145e..170d27c 100644 > --- a/lib/percpu_ida.c > +++ b/lib/percpu_ida.c > @@ -73,7 +73,7 @@ static inline void steal_tags(struct percpu_ida *pool, > if (cpu >= nr_cpu_ids) { > cpu = cpumask_first(&pool->cpus_have_tags); > if (cpu >= nr_cpu_ids) > - BUG(); > + break; I assume the BUG() above hits? If so, I am failing to understand how the code gets here. Mind elaborate? > } > > pool->cpu_last_stolen = cpu; > @@ -189,6 +189,9 @@ int percpu_ida_alloc(struct percpu_ida *pool, int state) > spin_unlock(&pool->lock); > local_irq_restore(flags); > > + if (tags->nr_free) > + wake_up(&pool->wait); > + How 'tags->nr_free' could be checked out of locks? Why waking up another thread instead of returning the tag on this CPU? Why 'percpu_max_size' threshold is ignored? Anyway, IMHO the above BUG() indicates a problem elsewhere. > if (tag >= 0 || state == TASK_RUNNING) > break; > > -- > 1.8.4.5 > -- Regards, Alexander Gordeev agordeev@redhat.com ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] percpu_ida: Handle out-of-tags gracefully 2014-03-11 13:51 ` Alexander Gordeev @ 2014-03-11 18:10 ` Bart Van Assche 2014-03-11 20:48 ` Alexander Gordeev 0 siblings, 1 reply; 9+ messages in thread From: Bart Van Assche @ 2014-03-11 18:10 UTC (permalink / raw) To: Alexander Gordeev Cc: Jens Axboe, Kent Overstreet, Shaohua Li, Christoph Hellwig, Mike Christie, linux-kernel On 03/11/14 14:51, Alexander Gordeev wrote: > On Mon, Mar 10, 2014 at 03:12:33PM +0100, Bart Van Assche wrote: >> Avoid that percpu_ida_alloc() hangs or crashes if there are still >> tags are available. Wait until a tag becomes available instead of >> giving up when running out of tags temporarily. This patch fixes >> the following kernel bug: > > Hi Bart, > > Few comments below, but the changelog does not correspond to the > actual change in 'Wait until a tag becomes available'. > >> ------------[ cut here ]------------ >> kernel BUG at lib/percpu_ida.c:81! >> invalid opcode: 0000 [#1] SMP >> RIP: 0010:[<ffffffff8120f00e>] [<ffffffff8120f00e>] percpu_ida_alloc+0x33e/0x370 >> Call Trace: >> [<ffffffff811ef95f>] blk_mq_get_tag+0x2f/0x50 >> [<ffffffff811ed79c>] blk_mq_alloc_rq.isra.17+0x1c/0x90 >> [<ffffffff811eeb9b>] blk_mq_alloc_request_pinned+0x9b/0x110 >> [<ffffffff811ef4c6>] blk_mq_make_request+0x426/0x480 >> [<ffffffff811e28f0>] generic_make_request+0xc0/0x110 >> [<ffffffff811e29ab>] submit_bio+0x6b/0x140 >> [<ffffffff8117aabb>] _submit_bh+0x13b/0x220 >> [<ffffffff8117d70f>] block_read_full_page+0x1ff/0x300 >> [<ffffffff81181128>] blkdev_readpage+0x18/0x20 >> [<ffffffff811067b7>] __do_page_cache_readahead+0x277/0x280 >> [<ffffffff81106d1d>] force_page_cache_readahead+0x8d/0xc0 >> [<ffffffff81106d9b>] page_cache_sync_readahead+0x4b/0x50 >> [<ffffffff810fdf05>] generic_file_aio_read+0x4c5/0x700 >> [<ffffffff8118147b>] blkdev_aio_read+0x4b/0x70 >> [<ffffffff8114a28a>] do_sync_read+0x5a/0x90 >> [<ffffffff8114a8cb>] vfs_read+0x9b/0x160 >> [<ffffffff8114b389>] SyS_read+0x49/0xa0 >> [<ffffffff81416049>] tracesys+0xd0/0xd5 >> ---[ end trace cdd1a8a7968266cf ]--- >> >> Signed-off-by: Bart Van Assche <bvanassche@acm.org> >> Cc: Kent Overstreet <kmo@daterainc.com> >> Cc: Shaohua Li <shli@kernel.org> >> Cc: Christoph Hellwig <hch@lst.de> >> Cc: Jens Axboe <axboe@kernel.dk> >> Cc: Alexander Gordeev <agordeev@redhat.com> >> Cc: Mike Christie <michaelc@cs.wisc.edu> >> --- >> lib/percpu_ida.c | 5 ++++- >> 1 file changed, 4 insertions(+), 1 deletion(-) >> >> diff --git a/lib/percpu_ida.c b/lib/percpu_ida.c >> index 93d145e..170d27c 100644 >> --- a/lib/percpu_ida.c >> +++ b/lib/percpu_ida.c >> @@ -73,7 +73,7 @@ static inline void steal_tags(struct percpu_ida *pool, >> if (cpu >= nr_cpu_ids) { >> cpu = cpumask_first(&pool->cpus_have_tags); >> if (cpu >= nr_cpu_ids) >> - BUG(); >> + break; > > I assume the BUG() above hits? If so, I am failing to understand how > the code gets here. Mind elaborate? Hello Alexander, You are correct, the BUG() mentioned in the call stack in the description of this patch does indeed correspond with the BUG() statement in the above code. That BUG() was encountered while testing the scsi-mq patch series with a workload with a large queue depth. I think the fact that I hit that BUG() statement means that my workload was queueing requests faster than these were processed by the SCSI LLD and hence that percpu_ida_alloc() ran out of tags. >> } >> >> pool->cpu_last_stolen = cpu; >> @@ -189,6 +189,9 @@ int percpu_ida_alloc(struct percpu_ida *pool, int state) >> spin_unlock(&pool->lock); >> local_irq_restore(flags); >> >> + if (tags->nr_free) >> + wake_up(&pool->wait); >> + > > How 'tags->nr_free' could be checked out of locks? > Why waking up another thread instead of returning the tag on this CPU? > Why 'percpu_max_size' threshold is ignored? Sorry but I'm not sure how much experience you have with kernel coding ? There are several examples in the Linux kernel of kernel drivers where variables that are shared over CPU's are on purpose read without first taking the lock that protects these variables. Students are thought at the university that all accesses of shared variables should be protected via locking. In the Linux kernel however it is common practice to read a shared variable without locking if the code that does this will work fine if such a read returns an previous value of that variable instead of the latest value. > Anyway, IMHO the above BUG() indicates a problem elsewhere. Sorry but I disagree. percpu_ida_alloc() should either return an error code or keep waiting when out of tags. Invoking BUG() when out of tags is wrong. Bart. ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] percpu_ida: Handle out-of-tags gracefully 2014-03-11 18:10 ` Bart Van Assche @ 2014-03-11 20:48 ` Alexander Gordeev 2014-03-12 7:22 ` Bart Van Assche 0 siblings, 1 reply; 9+ messages in thread From: Alexander Gordeev @ 2014-03-11 20:48 UTC (permalink / raw) To: Bart Van Assche Cc: Jens Axboe, Kent Overstreet, Shaohua Li, Christoph Hellwig, Mike Christie, linux-kernel On Tue, Mar 11, 2014 at 07:10:18PM +0100, Bart Van Assche wrote: > > I assume the BUG() above hits? If so, I am failing to understand how > > the code gets here. Mind elaborate? > > Hello Alexander, > > You are correct, the BUG() mentioned in the call stack in the > description of this patch does indeed correspond with the BUG() > statement in the above code. That BUG() was encountered while testing > the scsi-mq patch series with a workload with a large queue depth. I > think the fact that I hit that BUG() statement means that my workload > was queueing requests faster than these were processed by the SCSI LLD > and hence that percpu_ida_alloc() ran out of tags. Function steal_tags() is entered with disabled interrupts and pool->lock taken. Then the 'for' cycle enters/loops while 'cpus_have_tags' is not zero. Which means we can not end up with no set bits at all - and that is the reason why BUG() is (legitimately) placed there. > Bart. -- Regards, Alexander Gordeev agordeev@redhat.com ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] percpu_ida: Handle out-of-tags gracefully 2014-03-11 20:48 ` Alexander Gordeev @ 2014-03-12 7:22 ` Bart Van Assche 2014-03-12 8:41 ` Alexander Gordeev 2014-03-12 15:21 ` Alexander Gordeev 0 siblings, 2 replies; 9+ messages in thread From: Bart Van Assche @ 2014-03-12 7:22 UTC (permalink / raw) To: Alexander Gordeev Cc: Jens Axboe, Kent Overstreet, Shaohua Li, Christoph Hellwig, Mike Christie, linux-kernel On 03/11/14 21:48, Alexander Gordeev wrote: > On Tue, Mar 11, 2014 at 07:10:18PM +0100, Bart Van Assche wrote: >>> I assume the BUG() above hits? If so, I am failing to understand how >>> the code gets here. Mind elaborate? >> >> You are correct, the BUG() mentioned in the call stack in the >> description of this patch does indeed correspond with the BUG() >> statement in the above code. That BUG() was encountered while testing >> the scsi-mq patch series with a workload with a large queue depth. I >> think the fact that I hit that BUG() statement means that my workload >> was queueing requests faster than these were processed by the SCSI LLD >> and hence that percpu_ida_alloc() ran out of tags. > > Function steal_tags() is entered with disabled interrupts and > pool->lock taken. Then the 'for' cycle enters/loops while 'cpus_have_tags' > is not zero. Which means we can not end up with no set bits at all - > and that is the reason why BUG() is (legitimately) placed there. Sorry but the above reasoning is wrong. Even if interrupts are disabled on one CPU, even if that CPU holds pool->lock, and even if cpus_have_tags has at least one bit set at the time steal_tags() starts, it is still possible that another CPU obtains "remote->lock" before steal_tags() can obtain that lock and that that other CPU causes remote->nr_free to drop to zero. I am aware the percpu_ida code is not easy to read due to such complex interactions between CPU cores. However, my understanding is that the goal of the percpu_ida allocator was not that its code would be easy to read but that its performance would be optimal. Is this sufficient to make you have another look at my patch ? Thanks, Bart. ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] percpu_ida: Handle out-of-tags gracefully 2014-03-12 7:22 ` Bart Van Assche @ 2014-03-12 8:41 ` Alexander Gordeev 2014-03-12 10:05 ` Bart Van Assche 2014-03-12 15:21 ` Alexander Gordeev 1 sibling, 1 reply; 9+ messages in thread From: Alexander Gordeev @ 2014-03-12 8:41 UTC (permalink / raw) To: Bart Van Assche Cc: Jens Axboe, Kent Overstreet, Shaohua Li, Christoph Hellwig, Mike Christie, linux-kernel On Wed, Mar 12, 2014 at 08:22:22AM +0100, Bart Van Assche wrote: > > Function steal_tags() is entered with disabled interrupts and > > pool->lock taken. Then the 'for' cycle enters/loops while 'cpus_have_tags' > > is not zero. Which means we can not end up with no set bits at all - > > and that is the reason why BUG() is (legitimately) placed there. > > Sorry but the above reasoning is wrong. Even if interrupts are disabled > on one CPU, even if that CPU holds pool->lock, and even if > cpus_have_tags has at least one bit set at the time steal_tags() starts, > it is still possible that another CPU obtains "remote->lock" before > steal_tags() can obtain that lock and that that other CPU causes > remote->nr_free to drop to zero. I am aware the percpu_ida code is not > easy to read due to such complex interactions between CPU cores. > However, my understanding is that the goal of the percpu_ida allocator > was not that its code would be easy to read but that its performance > would be optimal. > > Is this sufficient to make you have another look at my patch ? Yep, makes sense - thanks for the clarification. Still the hunk below (a) breaks the 'pool->percpu_max_size' threshold and (b) somehow suboptimal, because you wake another thread while a free tag was/is on this CPU. If it is still here we would better to grab it. If not, it was stolen by another thread and we do not need to wake one (not sure how could it be addressed, though). In fact, did you try to remove this hunk at all? A following call to percpu_ida_free() both honors the threshold and wakes a thread, so your extra wake could be unnecessary. @@ -189,6 +189,9 @@ int percpu_ida_alloc(struct percpu_ida *pool, int state) spin_unlock(&pool->lock); local_irq_restore(flags); + if (tags->nr_free) + wake_up(&pool->wait); + if (tag >= 0 || state == TASK_RUNNING) break; -- 1.8.4.5 > Thanks, > > Bart. > -- Regards, Alexander Gordeev agordeev@redhat.com ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] percpu_ida: Handle out-of-tags gracefully 2014-03-12 8:41 ` Alexander Gordeev @ 2014-03-12 10:05 ` Bart Van Assche 0 siblings, 0 replies; 9+ messages in thread From: Bart Van Assche @ 2014-03-12 10:05 UTC (permalink / raw) To: Alexander Gordeev Cc: Jens Axboe, Kent Overstreet, Shaohua Li, Christoph Hellwig, Mike Christie, linux-kernel On 03/12/14 09:41, Alexander Gordeev wrote: > Still the hunk below (a) breaks the 'pool->percpu_max_size' threshold > and (b) somehow suboptimal, because you wake another thread while a > free tag was/is on this CPU. If it is still here we would better to > grab it. If not, it was stolen by another thread and we do not need > to wake one (not sure how could it be addressed, though). > > In fact, did you try to remove this hunk at all? A following call to > percpu_ida_free() both honors the threshold and wakes a thread, so > your extra wake could be unnecessary. > > @@ -189,6 +189,9 @@ int percpu_ida_alloc(struct percpu_ida *pool, int state) > spin_unlock(&pool->lock); > local_irq_restore(flags); > > + if (tags->nr_free) > + wake_up(&pool->wait); > + > if (tag >= 0 || state == TASK_RUNNING) > break; > > Hello Alexander, You are right, that hunk is not necessary and can be left out. That code was added while chasing another (unrelated) bug. I will resend this patch without that hunk. Bart. ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] percpu_ida: Handle out-of-tags gracefully 2014-03-12 7:22 ` Bart Van Assche 2014-03-12 8:41 ` Alexander Gordeev @ 2014-03-12 15:21 ` Alexander Gordeev 2014-03-12 16:16 ` Bart Van Assche 1 sibling, 1 reply; 9+ messages in thread From: Alexander Gordeev @ 2014-03-12 15:21 UTC (permalink / raw) To: Bart Van Assche Cc: Jens Axboe, Kent Overstreet, Shaohua Li, Christoph Hellwig, Mike Christie, linux-kernel On Wed, Mar 12, 2014 at 08:22:22AM +0100, Bart Van Assche wrote: > On 03/11/14 21:48, Alexander Gordeev wrote: > > On Tue, Mar 11, 2014 at 07:10:18PM +0100, Bart Van Assche wrote: > >>> I assume the BUG() above hits? If so, I am failing to understand how > >>> the code gets here. Mind elaborate? > >> > >> You are correct, the BUG() mentioned in the call stack in the > >> description of this patch does indeed correspond with the BUG() > >> statement in the above code. That BUG() was encountered while testing > >> the scsi-mq patch series with a workload with a large queue depth. I > >> think the fact that I hit that BUG() statement means that my workload > >> was queueing requests faster than these were processed by the SCSI LLD > >> and hence that percpu_ida_alloc() ran out of tags. > > > > Function steal_tags() is entered with disabled interrupts and > > pool->lock taken. Then the 'for' cycle enters/loops while 'cpus_have_tags' > > is not zero. Which means we can not end up with no set bits at all - > > and that is the reason why BUG() is (legitimately) placed there. > > Sorry but the above reasoning is wrong. Even if interrupts are disabled > on one CPU, even if that CPU holds pool->lock, and even if > cpus_have_tags has at least one bit set at the time steal_tags() starts, > it is still possible that another CPU obtains "remote->lock" before > steal_tags() can obtain that lock and that that other CPU causes > remote->nr_free to drop to zero. I stared at the code again and I am still not getting how the BUG() gets hit. The scenario you describe is impossible, because the code that checks 'cpus_have_tags' on one CPU and the code which can steal tags on another CPU is protected by 'pool->lock' - that is the same steal_tags() function. While 'remote->nr_free' could be dropped on another CPU (in fact from percpu_ida_alloc(), not from concurrent steal_tags()) it still does not explain how steal_tags() enters the loop, but fails to locate 'cpus_have_tags' count of bits. So although v2 of your patch fixes the crash it does not address the root cause IMHO. May be the following bits in percpu_ida_free() need a closer look: if (nr_free == 1) { cpumask_set_cpu(smp_processor_id(), &pool->cpus_have_tags); wake_up(&pool->wait); } I do not see anything suspicious, but may be the fact cpumask_set_cpu() is out of any lock contributes to the problem? Do not know. Would you be able to check if i.e. this hack makes the BUG() to go? Thanks! diff --git a/lib/percpu_ida.c b/lib/percpu_ida.c index 93d145e..8715d0e 100644 --- a/lib/percpu_ida.c +++ b/lib/percpu_ida.c @@ -233,6 +233,11 @@ void percpu_ida_free(struct percpu_ida *pool, unsigned tag) nr_free = tags->nr_free; spin_unlock(&tags->lock); + if (!nr_free) + goto out; + + spin_lock(&pool->lock); + if (nr_free == 1) { cpumask_set_cpu(smp_processor_id(), &pool->cpus_have_tags); @@ -240,7 +245,6 @@ void percpu_ida_free(struct percpu_ida *pool, unsigned tag) } if (nr_free == pool->percpu_max_size) { - spin_lock(&pool->lock); /* * Global lock held and irqs disabled, don't need percpu @@ -253,9 +257,11 @@ void percpu_ida_free(struct percpu_ida *pool, unsigned tag) wake_up(&pool->wait); } - spin_unlock(&pool->lock); } + spin_unlock(&pool->lock); + +out: local_irq_restore(flags); } EXPORT_SYMBOL_GPL(percpu_ida_free); -- Regards, Alexander Gordeev agordeev@redhat.com ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH] percpu_ida: Handle out-of-tags gracefully 2014-03-12 15:21 ` Alexander Gordeev @ 2014-03-12 16:16 ` Bart Van Assche 0 siblings, 0 replies; 9+ messages in thread From: Bart Van Assche @ 2014-03-12 16:16 UTC (permalink / raw) To: Alexander Gordeev Cc: Jens Axboe, Kent Overstreet, Shaohua Li, Christoph Hellwig, Mike Christie, linux-kernel On 03/12/14 16:21, Alexander Gordeev wrote: > While 'remote->nr_free' could be dropped on another CPU it still does not > explain how steal_tags() enters the loop, but fails to locate 'cpus_have_tags' > count of bits. > > So although v2 of your patch fixes the crash it does not address the root > cause IMHO. Hmm ... you are probably right. I will analyze this further and see what I can come up with. Bart. ^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2014-03-12 16:16 UTC | newest] Thread overview: 9+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2014-03-10 14:12 [PATCH] percpu_ida: Handle out-of-tags gracefully Bart Van Assche 2014-03-11 13:51 ` Alexander Gordeev 2014-03-11 18:10 ` Bart Van Assche 2014-03-11 20:48 ` Alexander Gordeev 2014-03-12 7:22 ` Bart Van Assche 2014-03-12 8:41 ` Alexander Gordeev 2014-03-12 10:05 ` Bart Van Assche 2014-03-12 15:21 ` Alexander Gordeev 2014-03-12 16:16 ` Bart Van Assche
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox