* [PATCH 1/1] block: Range check cpu in blk_cpu_to_group
From: Brian King @ 2010-09-09 17:41 UTC
To: axboe; +Cc: linux-kernel, brking
While testing CPU DLPAR, the following problem was discovered.
We were DLPAR removing the first CPU, which in this case was
logical CPUs 0-3. CPUs 0-2 were already marked offline and
we were in the process of offlining CPU 3. After marking
the CPU inactive and offline in cpu_disable, but before the
cpu was completely idle (cpu_die), we ended up in __make_request
on CPU 3. There we looked at the topology map to see which CPU
to complete the I/O on and found no CPUs in the cpu_sibling_map.
This resulted in the block layer setting the completion cpu
to be NR_CPUS, which then caused an oops when we tried to
complete the I/O.
Fix this by sanity checking the value we return from blk_cpu_to_group
to be a valid cpu value.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
---
block/blk.h | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff -puN block/blk.h~blk_cpu_to_group_range_fix block/blk.h
--- linux-2.6/block/blk.h~blk_cpu_to_group_range_fix 2010-08-31 09:12:18.000000000 -0500
+++ linux-2.6-bjking1/block/blk.h 2010-08-31 16:48:17.000000000 -0500
@@ -142,14 +142,19 @@ static inline int queue_congestion_off_t
 static inline int blk_cpu_to_group(int cpu)
 {
+	int group = NR_CPUS;
 #ifdef CONFIG_SCHED_MC
 	const struct cpumask *mask = cpu_coregroup_mask(cpu);
-	return cpumask_first(mask);
+	group = cpumask_first(mask);
 #elif defined(CONFIG_SCHED_SMT)
-	return cpumask_first(topology_thread_cpumask(cpu));
+	group = cpumask_first(topology_thread_cpumask(cpu));
 #else
 	return cpu;
 #endif
+
+	if (likely(group < NR_CPUS))
+		return group;
+	return cpu;
 }
 /*
_
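For readers following the failure mode described above, here is a minimal
user-space sketch of it, not the kernel code. It assumes a tiny NR_CPUS of 8
and models a sibling mask as a plain unsigned int. The helper mask_first() is
a hypothetical stand-in for cpumask_first(), and both cpu_to_group_*()
functions are illustrative names only. The point it shows: an empty mask
yields NR_CPUS, and the range check falls back to the submitting CPU.

```c
#define NR_CPUS 8	/* assumed small value, for illustration only */

/*
 * Hypothetical stand-in for cpumask_first(): returns the index of the
 * lowest set bit in the mask, or NR_CPUS if the mask is empty.
 */
static int mask_first(unsigned int mask)
{
	int cpu;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		if (mask & (1u << cpu))
			return cpu;
	return NR_CPUS;
}

/* Behavior before the patch: the result is returned unchecked. */
static int cpu_to_group_unchecked(unsigned int sibling_mask, int cpu)
{
	(void)cpu;	/* unused: only the mask decides the result */
	return mask_first(sibling_mask);
}

/* Behavior after the patch: out-of-range results fall back to cpu. */
static int cpu_to_group_checked(unsigned int sibling_mask, int cpu)
{
	int group = mask_first(sibling_mask);

	if (group < NR_CPUS)
		return group;
	return cpu;
}
```

With an empty mask and cpu 3, cpu_to_group_unchecked(0x0, 3) returns NR_CPUS,
which is not a valid CPU number, while cpu_to_group_checked(0x0, 3) returns 3.
When the mask still has siblings set, both variants agree.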
* Re: [PATCH 1/1] block: Range check cpu in blk_cpu_to_group
From: Jens Axboe @ 2010-09-10 10:07 UTC
To: Brian King; +Cc: linux-kernel
On 09/09/2010 07:41 PM, Brian King wrote:
> While testing CPU DLPAR, the following problem was discovered.
> We were DLPAR removing the first CPU, which in this case was
> logical CPUs 0-3. CPUs 0-2 were already marked offline and
> we were in the process of offlining CPU 3. After marking
> the CPU inactive and offline in cpu_disable, but before the
> cpu was completely idle (cpu_die), we ended up in __make_request
> on CPU 3. There we looked at the topology map to see which CPU
> to complete the I/O on and found no CPUs in the cpu_sibling_map.
> This resulted in the block layer setting the completion cpu
> to be NR_CPUS, which then caused an oops when we tried to
> complete the I/O.
>
> Fix this by sanity checking the value we return from blk_cpu_to_group
> to be a valid cpu value.
>
> Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Thanks Brian, applied.
--
Jens Axboe