* [PATCH] powerpc/eeh: Avoid use after free in eeh_handle_special_event()
@ 2017-03-03 4:47 Russell Currey
2017-03-03 5:55 ` Gavin Shan
2017-03-03 5:59 ` Alexey Kardashevskiy
0 siblings, 2 replies; 7+ messages in thread
From: Russell Currey @ 2017-03-03 4:47 UTC (permalink / raw)
To: linuxppc-dev; +Cc: aik, Russell Currey
eeh_handle_special_event() is called when an EEH event is detected but
can't be narrowed down to a specific PE. This function looks through
every PE to find one in an erroneous state, then calls the regular event
handler eeh_handle_normal_event() once it knows which PE has an error.
However, if eeh_handle_normal_event() found that the PE cannot possibly
be recovered, it will remove the PE and associated devices. This leads
to a use after free in eeh_handle_special_event() as it attempts to clear
the "recovering" state on the PE after eeh_handle_normal_event() returns.
Thus, make sure the PE is valid when attempting to clear state in
eeh_handle_special_event().
Cc: <stable@vger.kernel.org> #3.10+
Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Russell Currey <ruscur@russell.cc>
---
arch/powerpc/kernel/eeh_driver.c | 13 +++++++++++++
1 file changed, 13 insertions(+)
diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
index b94887165a10..492397298a2a 100644
--- a/arch/powerpc/kernel/eeh_driver.c
+++ b/arch/powerpc/kernel/eeh_driver.c
@@ -983,6 +983,19 @@ static void eeh_handle_special_event(void)
if (rc == EEH_NEXT_ERR_FROZEN_PE ||
rc == EEH_NEXT_ERR_FENCED_PHB) {
eeh_handle_normal_event(pe);
+
+ /*
+ * eeh_handle_normal_event() can free the PE if it
+ * determines that the PE cannot possibly be recovered.
+ * Make sure the PE still exists before changing its
+ * state.
+ */
+ if (!pe || (pe->type & EEH_PE_INVALID)
+ || (pe->state & EEH_PE_REMOVED)) {
+ pr_warn("EEH: not clearing state on bad PE\n");
+ continue;
+ }
+
eeh_pe_state_clear(pe, EEH_PE_RECOVERING);
} else {
pci_lock_rescan_remove();
--
2.12.0
* Re: [PATCH] powerpc/eeh: Avoid use after free in eeh_handle_special_event()
2017-03-03 4:47 [PATCH] powerpc/eeh: Avoid use after free in eeh_handle_special_event() Russell Currey
@ 2017-03-03 5:55 ` Gavin Shan
2017-03-03 6:05 ` Alexey Kardashevskiy
2017-03-03 5:59 ` Alexey Kardashevskiy
1 sibling, 1 reply; 7+ messages in thread
From: Gavin Shan @ 2017-03-03 5:55 UTC (permalink / raw)
To: Russell Currey; +Cc: linuxppc-dev, aik
On Fri, Mar 03, 2017 at 03:47:18PM +1100, Russell Currey wrote:
>eeh_handle_special_event() is called when an EEH event is detected but
>can't be narrowed down to a specific PE. This function looks through
>every PE to find one in an erroneous state, then calls the regular event
>handler eeh_handle_normal_event() once it knows which PE has an error.
>
>However, if eeh_handle_normal_event() found that the PE cannot possibly
>be recovered, it will remove the PE and associated devices. This leads
>to a use after free in eeh_handle_special_event() as it attempts to clear
>the "recovering" state on the PE after eeh_handle_normal_event() returns.
>
>Thus, make sure the PE is valid when attempting to clear state in
>eeh_handle_special_event().
>
From the changelog, I don't see how the PE is free'd. Could you explain
a bit about it?
>Cc: <stable@vger.kernel.org> #3.10+
>Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru>
>Signed-off-by: Russell Currey <ruscur@russell.cc>
>---
> arch/powerpc/kernel/eeh_driver.c | 13 +++++++++++++
> 1 file changed, 13 insertions(+)
>
>diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
>index b94887165a10..492397298a2a 100644
>--- a/arch/powerpc/kernel/eeh_driver.c
>+++ b/arch/powerpc/kernel/eeh_driver.c
>@@ -983,6 +983,19 @@ static void eeh_handle_special_event(void)
> if (rc == EEH_NEXT_ERR_FROZEN_PE ||
> rc == EEH_NEXT_ERR_FENCED_PHB) {
> eeh_handle_normal_event(pe);
>+
>+ /*
>+ * eeh_handle_normal_event() can free the PE if it
>+ * determines that the PE cannot possibly be recovered.
>+ * Make sure the PE still exists before changing its
>+ * state.
>+ */
>+ if (!pe || (pe->type & EEH_PE_INVALID)
>+ || (pe->state & EEH_PE_REMOVED)) {
>+ pr_warn("EEH: not clearing state on bad PE\n");
>+ continue;
>+ }
>+
This doesn't seem correct. @pe has been set to a valid PE in advance, so
isn't !pe always false? If the PE has been free'd, how can we access
@pe->type here, and how can we make sure the PE_INVALID and PE_REMOVED
flags weren't overwritten by somebody else?
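
A minimal userspace sketch of the first point, assuming nothing about the
real EEH code: demo_pe and both demo_* functions are hypothetical
stand-ins. It only shows that a callee taking the pointer by value cannot
turn the caller's copy into NULL, which is why a !pe test after the call
cannot detect a released PE.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct eeh_pe; only what the sketch needs. */
struct demo_pe {
	int state;
};

/*
 * Stand-in for a handler that drops the PE. It receives a copy of the
 * pointer, so clearing that copy cannot change the caller's variable.
 */
static void demo_handle_event(struct demo_pe *pe)
{
	pe = NULL;	/* only the local copy changes */
	(void)pe;
}

/* Returns 1 if the caller's pointer is still non-NULL after the call. */
static int demo_caller_pointer_survives(void)
{
	struct demo_pe pe_storage = { .state = 0 };
	struct demo_pe *pe = &pe_storage;

	demo_handle_event(pe);
	return pe != NULL;
}
```

The caller still holds the old address afterwards, so any validity
information has to come back through a return value or similar.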
> eeh_pe_state_clear(pe, EEH_PE_RECOVERING);
> } else {
> pci_lock_rescan_remove();
Cheers,
Gavin
* Re: [PATCH] powerpc/eeh: Avoid use after free in eeh_handle_special_event()
2017-03-03 4:47 [PATCH] powerpc/eeh: Avoid use after free in eeh_handle_special_event() Russell Currey
2017-03-03 5:55 ` Gavin Shan
@ 2017-03-03 5:59 ` Alexey Kardashevskiy
2017-03-05 23:22 ` Gavin Shan
1 sibling, 1 reply; 7+ messages in thread
From: Alexey Kardashevskiy @ 2017-03-03 5:59 UTC (permalink / raw)
To: Russell Currey, linuxppc-dev, Gavin Shan
On 03/03/17 15:47, Russell Currey wrote:
> eeh_handle_special_event() is called when an EEH event is detected but
> can't be narrowed down to a specific PE. This function looks through
> every PE to find one in an erroneous state, then calls the regular event
> handler eeh_handle_normal_event() once it knows which PE has an error.
>
> However, if eeh_handle_normal_event() found that the PE cannot possibly
> be recovered, it will remove the PE and associated devices. This leads
> to a use after free in eeh_handle_special_event() as it attempts to clear
> the "recovering" state on the PE after eeh_handle_normal_event() returns.
>
> Thus, make sure the PE is valid when attempting to clear state in
> eeh_handle_special_event().
>
> Cc: <stable@vger.kernel.org> #3.10+
> Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> Signed-off-by: Russell Currey <ruscur@russell.cc>
> ---
> arch/powerpc/kernel/eeh_driver.c | 13 +++++++++++++
> 1 file changed, 13 insertions(+)
>
> diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
> index b94887165a10..492397298a2a 100644
> --- a/arch/powerpc/kernel/eeh_driver.c
> +++ b/arch/powerpc/kernel/eeh_driver.c
> @@ -983,6 +983,19 @@ static void eeh_handle_special_event(void)
> if (rc == EEH_NEXT_ERR_FROZEN_PE ||
> rc == EEH_NEXT_ERR_FENCED_PHB) {
> eeh_handle_normal_event(pe);
> +
> + /*
> + * eeh_handle_normal_event() can free the PE if it
> + * determines that the PE cannot possibly be recovered.
> + * Make sure the PE still exists before changing its
> + * state.
> + */
> + if (!pe || (pe->type & EEH_PE_INVALID)
> + || (pe->state & EEH_PE_REMOVED)) {
The bug is that pe becomes stale after eeh_handle_normal_event() returns,
and dereferencing it afterwards is broken.
> + pr_warn("EEH: not clearing state on bad PE\n");
> + continue;
> + }
> +
> eeh_pe_state_clear(pe, EEH_PE_RECOVERING);
> } else {
> pci_lock_rescan_remove();
>
--
Alexey
* Re: [PATCH] powerpc/eeh: Avoid use after free in eeh_handle_special_event()
2017-03-03 5:55 ` Gavin Shan
@ 2017-03-03 6:05 ` Alexey Kardashevskiy
0 siblings, 0 replies; 7+ messages in thread
From: Alexey Kardashevskiy @ 2017-03-03 6:05 UTC (permalink / raw)
To: Gavin Shan, Russell Currey; +Cc: linuxppc-dev
On 03/03/17 16:55, Gavin Shan wrote:
> On Fri, Mar 03, 2017 at 03:47:18PM +1100, Russell Currey wrote:
>> eeh_handle_special_event() is called when an EEH event is detected but
>> can't be narrowed down to a specific PE. This function looks through
>> every PE to find one in an erroneous state, then calls the regular event
>> handler eeh_handle_normal_event() once it knows which PE has an error.
>>
>> However, if eeh_handle_normal_event() found that the PE cannot possibly
>> be recovered, it will remove the PE and associated devices. This leads
>> to a use after free in eeh_handle_special_event() as it attempts to clear
>> the "recovering" state on the PE after eeh_handle_normal_event() returns.
>>
>> Thus, make sure the PE is valid when attempting to clear state in
>> eeh_handle_special_event().
>>
>
> From the changelog, I don't see how the PE is free'd. Could you explain
> a bit about it?
This is the backtrace at the point where kfree(pe) is called:
dump_stack+0xb0/0xf0 (unreliable)
eeh_rmv_from_parent_pe+0x2f8/0x330
eeh_remove_device+0x128/0x170
pcibios_release_device+0x2c/0x70
pci_release_dev+0x5c/0xb0
device_release+0x58/0xf0
kobject_put+0x144/0x2e0
put_device+0x24/0x40
pci_remove_bus_device+0x14c/0x190
pci_hp_remove_devices+0xac/0x170
eeh_handle_normal_event+0x120/0x560
eeh_handle_special_event+0x328/0x3b0
eeh_handle_event+0x74/0xa0
eeh_event_handler+0x260/0x280
kthread+0x14c/0x190
ret_from_kernel_thread+0x5c/0x74
>
>> Cc: <stable@vger.kernel.org> #3.10+
>> Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru>
>> Signed-off-by: Russell Currey <ruscur@russell.cc>
>> ---
>> arch/powerpc/kernel/eeh_driver.c | 13 +++++++++++++
>> 1 file changed, 13 insertions(+)
>>
>> diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
>> index b94887165a10..492397298a2a 100644
>> --- a/arch/powerpc/kernel/eeh_driver.c
>> +++ b/arch/powerpc/kernel/eeh_driver.c
>> @@ -983,6 +983,19 @@ static void eeh_handle_special_event(void)
>> if (rc == EEH_NEXT_ERR_FROZEN_PE ||
>> rc == EEH_NEXT_ERR_FENCED_PHB) {
>> eeh_handle_normal_event(pe);
>> +
>> + /*
>> + * eeh_handle_normal_event() can free the PE if it
>> + * determines that the PE cannot possibly be recovered.
>> + * Make sure the PE still exists before changing its
>> + * state.
>> + */
>> + if (!pe || (pe->type & EEH_PE_INVALID)
>> + || (pe->state & EEH_PE_REMOVED)) {
>> + pr_warn("EEH: not clearing state on bad PE\n");
>> + continue;
>> + }
>> +
>
> It seems not correct. @pe has set to the valid PE in advance, the !pe is
> always false? If the PE has been free'd, how can we access @pe->type here
> and how can we make sure PE_INVALID and PE_REMOVED flag wasn't overwritten
> by somebody else?
>
>> eeh_pe_state_clear(pe, EEH_PE_RECOVERING);
>> } else {
>> pci_lock_rescan_remove();
>
> Cheers,
> Gavin
>
--
Alexey
* Re: [PATCH] powerpc/eeh: Avoid use after free in eeh_handle_special_event()
2017-03-03 5:59 ` Alexey Kardashevskiy
@ 2017-03-05 23:22 ` Gavin Shan
2017-03-06 1:54 ` Alexey Kardashevskiy
0 siblings, 1 reply; 7+ messages in thread
From: Gavin Shan @ 2017-03-05 23:22 UTC (permalink / raw)
To: Alexey Kardashevskiy; +Cc: Russell Currey, linuxppc-dev, Gavin Shan
On Fri, Mar 03, 2017 at 04:59:11PM +1100, Alexey Kardashevskiy wrote:
>On 03/03/17 15:47, Russell Currey wrote:
>> eeh_handle_special_event() is called when an EEH event is detected but
>> can't be narrowed down to a specific PE. This function looks through
>> every PE to find one in an erroneous state, then calls the regular event
>> handler eeh_handle_normal_event() once it knows which PE has an error.
>>
>> However, if eeh_handle_normal_event() found that the PE cannot possibly
>> be recovered, it will remove the PE and associated devices. This leads
>> to a use after free in eeh_handle_special_event() as it attempts to clear
>> the "recovering" state on the PE after eeh_handle_normal_event() returns.
>>
>> Thus, make sure the PE is valid when attempting to clear state in
>> eeh_handle_special_event().
>>
>> Cc: <stable@vger.kernel.org> #3.10+
>> Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru>
>> Signed-off-by: Russell Currey <ruscur@russell.cc>
>> ---
>> arch/powerpc/kernel/eeh_driver.c | 13 +++++++++++++
>> 1 file changed, 13 insertions(+)
>>
>> diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
>> index b94887165a10..492397298a2a 100644
>> --- a/arch/powerpc/kernel/eeh_driver.c
>> +++ b/arch/powerpc/kernel/eeh_driver.c
>> @@ -983,6 +983,19 @@ static void eeh_handle_special_event(void)
>> if (rc == EEH_NEXT_ERR_FROZEN_PE ||
>> rc == EEH_NEXT_ERR_FENCED_PHB) {
>> eeh_handle_normal_event(pe);
>> +
>> + /*
>> + * eeh_handle_normal_event() can free the PE if it
>> + * determines that the PE cannot possibly be recovered.
>> + * Make sure the PE still exists before changing its
>> + * state.
>> + */
>> + if (!pe || (pe->type & EEH_PE_INVALID)
>> + || (pe->state & EEH_PE_REMOVED)) {
>
>
>The bug is that pe becomes stale after eeh_handle_normal_event() returned
>and dereferencing it afterwards is broken.
>
Correct. It won't cause a kernel crash, as dereferencing @pe accesses the
linear mapped area, whose addresses are always valid. I think the proper
fix would be to have eeh_handle_normal_event() indicate that @pe has been
released, and then not access it any more.
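
A rough userspace sketch of that idea, not the actual kernel change: the
demo_* names, the enum and the "recoverable" parameter are all
hypothetical. The handler reports whether it released the PE, and the
caller only touches the PE when it was kept.

```c
#include <stdbool.h>
#include <stdlib.h>

#define DEMO_PE_RECOVERING 0x1	/* hypothetical flag, mirrors EEH_PE_RECOVERING */

struct demo_pe {
	int state;
};

enum demo_result {
	DEMO_PE_KEPT,		/* PE is still valid after the handler */
	DEMO_PE_RELEASED,	/* handler freed the PE; pointer is stale */
};

/* Stand-in for the normal handler: frees the PE when it is unrecoverable. */
static enum demo_result demo_handle_normal_event(struct demo_pe *pe,
						 bool recoverable)
{
	if (!recoverable) {
		free(pe);
		return DEMO_PE_RELEASED;
	}
	return DEMO_PE_KEPT;
}

/* Returns 1 if the state was cleared, 0 if the PE was gone, -1 on OOM. */
static int demo_handle_special_event(bool recoverable)
{
	struct demo_pe *pe = malloc(sizeof(*pe));
	int cleared;

	if (!pe)
		return -1;
	pe->state = DEMO_PE_RECOVERING;

	/* The return value, not the stale pointer, says whether pe is usable. */
	if (demo_handle_normal_event(pe, recoverable) == DEMO_PE_RELEASED)
		return 0;

	pe->state &= ~DEMO_PE_RECOVERING;
	cleared = (pe->state == 0);
	free(pe);
	return cleared;
}
```

With this shape the caller never reads memory that the handler may have
freed, so no flag inside the PE needs to be trusted.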
>
>
>> + pr_warn("EEH: not clearing state on bad PE\n");
A message like this isn't meaningful, so there is no need to have it.
Messages with the "EEH:" prefix are informative messages, and we
definitely don't need this one here. However, the message might not be
needed at all in the next revision.
>> + continue;
>> + }
>> +
>> eeh_pe_state_clear(pe, EEH_PE_RECOVERING);
>> } else {
>> pci_lock_rescan_remove();
>>
Thanks,
Gavin
* Re: [PATCH] powerpc/eeh: Avoid use after free in eeh_handle_special_event()
2017-03-05 23:22 ` Gavin Shan
@ 2017-03-06 1:54 ` Alexey Kardashevskiy
2017-04-07 3:28 ` Alexey Kardashevskiy
0 siblings, 1 reply; 7+ messages in thread
From: Alexey Kardashevskiy @ 2017-03-06 1:54 UTC (permalink / raw)
To: Gavin Shan; +Cc: Russell Currey, linuxppc-dev
On 06/03/17 10:22, Gavin Shan wrote:
> On Fri, Mar 03, 2017 at 04:59:11PM +1100, Alexey Kardashevskiy wrote:
>> On 03/03/17 15:47, Russell Currey wrote:
>>> eeh_handle_special_event() is called when an EEH event is detected but
>>> can't be narrowed down to a specific PE. This function looks through
>>> every PE to find one in an erroneous state, then calls the regular event
>>> handler eeh_handle_normal_event() once it knows which PE has an error.
>>>
>>> However, if eeh_handle_normal_event() found that the PE cannot possibly
>>> be recovered, it will remove the PE and associated devices. This leads
>>> to a use after free in eeh_handle_special_event() as it attempts to clear
>>> the "recovering" state on the PE after eeh_handle_normal_event() returns.
>>>
>>> Thus, make sure the PE is valid when attempting to clear state in
>>> eeh_handle_special_event().
>>>
>>> Cc: <stable@vger.kernel.org> #3.10+
>>> Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru>
>>> Signed-off-by: Russell Currey <ruscur@russell.cc>
>>> ---
>>> arch/powerpc/kernel/eeh_driver.c | 13 +++++++++++++
>>> 1 file changed, 13 insertions(+)
>>>
>>> diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
>>> index b94887165a10..492397298a2a 100644
>>> --- a/arch/powerpc/kernel/eeh_driver.c
>>> +++ b/arch/powerpc/kernel/eeh_driver.c
>>> @@ -983,6 +983,19 @@ static void eeh_handle_special_event(void)
>>> if (rc == EEH_NEXT_ERR_FROZEN_PE ||
>>> rc == EEH_NEXT_ERR_FENCED_PHB) {
>>> eeh_handle_normal_event(pe);
>>> +
>>> + /*
>>> + * eeh_handle_normal_event() can free the PE if it
>>> + * determines that the PE cannot possibly be recovered.
>>> + * Make sure the PE still exists before changing its
>>> + * state.
>>> + */
>>> + if (!pe || (pe->type & EEH_PE_INVALID)
>>> + || (pe->state & EEH_PE_REMOVED)) {
>>
>>
>> The bug is that pe becomes stale after eeh_handle_normal_event() returned
>> and dereferencing it afterwards is broken.
>>
>
> Correct, it won't cause a kernel crash as @pe is deferencing linear mapped
> area whose address is always valid.
Dereferencing pe would not crash, but dereferencing any pointer read from
the pnv_ioda_pe struct would (as it would be random stuff or poison).
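
To illustrate, here is a userspace sketch under stated assumptions: a
one-slot static "pool" stands in for memory that stays mapped after
release, and release fills it with 0x6b, the byte value the kernel's
POISON_FREE uses when slab poisoning is enabled. The demo_* names and the
"data" member are hypothetical.

```c
#include <stdint.h>
#include <string.h>

#define DEMO_POISON_FREE 0x6b	/* same value as the kernel's POISON_FREE */

struct demo_pe {
	int state;
	void *data;	/* hypothetical pointer member */
};

/* One-slot pool: the storage stays addressable after release. */
static struct demo_pe demo_slot;

/* Simulated free with poisoning: overwrite the whole object. */
static void demo_release(struct demo_pe *pe)
{
	memset(pe, DEMO_POISON_FREE, sizeof(*pe));
}

/* Reads the raw bits of the pointer member after the object was released. */
static uintptr_t demo_read_stale_pointer(void)
{
	struct demo_pe *pe = &demo_slot;
	uintptr_t bits = 0;

	pe->state = 0;
	pe->data = &demo_slot;
	demo_release(pe);

	/* The read itself works; the value is no longer a usable pointer. */
	memcpy(&bits, &pe->data,
	       sizeof(bits) < sizeof(pe->data) ? sizeof(bits) : sizeof(pe->data));
	return bits;
}

/* Returns 1 if every byte of bits is the poison value. */
static int demo_is_poison(uintptr_t bits)
{
	unsigned char raw[sizeof(bits)];
	size_t i;

	memcpy(raw, &bits, sizeof(raw));
	for (i = 0; i < sizeof(raw); i++)
		if (raw[i] != DEMO_POISON_FREE)
			return 0;
	return 1;
}
```

Reading the member succeeds, but following the 0x6b6b... value as an
address is what would fault. Without poisoning the member would instead
hold whatever the next user of that memory wrote there.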
> I think the proper fix would be to use
> eeh_handle_normal_event() to indicate the @pe has been released and don't
> access it any more.
Correct. The problem is that the callstack from my other reply is a bit too
long to make a trivial patch :)
>>
>>
>>> + pr_warn("EEH: not clearing state on bad PE\n");
>
> The message like this isn't meaningful, no need to have it. The messages that
> have prefix "EEH:" is informative messages. We definitely needn't this here.
> However, the message might be not needed in next revision.
>
>>> + continue;
>>> + }
>>> +
>>> eeh_pe_state_clear(pe, EEH_PE_RECOVERING);
>>> } else {
>>> pci_lock_rescan_remove();
>>>
>
> Thanks,
> Gavin
>
--
Alexey
* Re: [PATCH] powerpc/eeh: Avoid use after free in eeh_handle_special_event()
2017-03-06 1:54 ` Alexey Kardashevskiy
@ 2017-04-07 3:28 ` Alexey Kardashevskiy
0 siblings, 0 replies; 7+ messages in thread
From: Alexey Kardashevskiy @ 2017-04-07 3:28 UTC (permalink / raw)
To: Gavin Shan; +Cc: Russell Currey, linuxppc-dev
On 06/03/17 12:54, Alexey Kardashevskiy wrote:
> On 06/03/17 10:22, Gavin Shan wrote:
>> On Fri, Mar 03, 2017 at 04:59:11PM +1100, Alexey Kardashevskiy wrote:
>>> On 03/03/17 15:47, Russell Currey wrote:
>>>> eeh_handle_special_event() is called when an EEH event is detected but
>>>> can't be narrowed down to a specific PE. This function looks through
>>>> every PE to find one in an erroneous state, then calls the regular event
>>>> handler eeh_handle_normal_event() once it knows which PE has an error.
>>>>
>>>> However, if eeh_handle_normal_event() found that the PE cannot possibly
>>>> be recovered, it will remove the PE and associated devices. This leads
>>>> to a use after free in eeh_handle_special_event() as it attempts to clear
>>>> the "recovering" state on the PE after eeh_handle_normal_event() returns.
>>>>
>>>> Thus, make sure the PE is valid when attempting to clear state in
>>>> eeh_handle_special_event().
>>>>
>>>> Cc: <stable@vger.kernel.org> #3.10+
>>>> Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru>
>>>> Signed-off-by: Russell Currey <ruscur@russell.cc>
>>>> ---
>>>> arch/powerpc/kernel/eeh_driver.c | 13 +++++++++++++
>>>> 1 file changed, 13 insertions(+)
>>>>
>>>> diff --git a/arch/powerpc/kernel/eeh_driver.c b/arch/powerpc/kernel/eeh_driver.c
>>>> index b94887165a10..492397298a2a 100644
>>>> --- a/arch/powerpc/kernel/eeh_driver.c
>>>> +++ b/arch/powerpc/kernel/eeh_driver.c
>>>> @@ -983,6 +983,19 @@ static void eeh_handle_special_event(void)
>>>> if (rc == EEH_NEXT_ERR_FROZEN_PE ||
>>>> rc == EEH_NEXT_ERR_FENCED_PHB) {
>>>> eeh_handle_normal_event(pe);
>>>> +
>>>> + /*
>>>> + * eeh_handle_normal_event() can free the PE if it
>>>> + * determines that the PE cannot possibly be recovered.
>>>> + * Make sure the PE still exists before changing its
>>>> + * state.
>>>> + */
>>>> + if (!pe || (pe->type & EEH_PE_INVALID)
>>>> + || (pe->state & EEH_PE_REMOVED)) {
>>>
>>>
>>> The bug is that pe becomes stale after eeh_handle_normal_event() returned
>>> and dereferencing it afterwards is broken.
>>>
>>
>> Correct, it won't cause a kernel crash as @pe is deferencing linear mapped
>> area whose address is always valid.
>
> Dereferencing pe would not crash but dereferencing any pointer from the
> pnv_ioda_pe struct would (as it would random stuff or a poison).
>
>
>> I think the proper fix would be to use
>> eeh_handle_normal_event() to indicate the @pe has been released and don't
>> access it any more.
>
> Correct. The problem is that the callstack from my other reply is a bit too
> long to make an trivial patch :)
Any update on this?
>
>
>
>>>
>>>
>>>> + pr_warn("EEH: not clearing state on bad PE\n");
>>
>> The message like this isn't meaningful, no need to have it. The messages that
>> have prefix "EEH:" is informative messages. We definitely needn't this here.
>> However, the message might be not needed in next revision.
>>
>>>> + continue;
>>>> + }
>>>> +
>>>> eeh_pe_state_clear(pe, EEH_PE_RECOVERING);
>>>> } else {
>>>> pci_lock_rescan_remove();
>>>>
>>
>> Thanks,
>> Gavin
>>
>
>
--
Alexey