linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v1] firmware: arm_scmi: Fix NULL pointer dereference in mailbox_clear_channel
@ 2023-08-29 17:07 Qiujun Huang
  2023-08-30  9:39 ` Sudeep Holla
  0 siblings, 1 reply; 5+ messages in thread
From: Qiujun Huang @ 2023-08-29 17:07 UTC (permalink / raw)
  To: sudeep.holla, cristian.marussi, linux-arm-kernel, linux-kernel

There is a race between the failure of probe and rx_callback (due to a
delayed response).

scmi_probe
scmi_acquire_protocal
do_xfer
 timeout
mailbox_chan_free
                                                    <--- delay response
                                                           rx_callback
mbox_free_channel
cinfo->transport_info = NULL
                                                          mailbox_clear_channel
                                                         dereference cinfo->transport_info

This patch check for it, and clear the channel at mailbox_chan_free.

Signed-off-by: Qiujun Huang <hqjagain@gmail.com>
---
 drivers/firmware/arm_scmi/mailbox.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/firmware/arm_scmi/mailbox.c b/drivers/firmware/arm_scmi/mailbox.c
index 19246ed1f01f..32c5fe0b8c29 100644
--- a/drivers/firmware/arm_scmi/mailbox.c
+++ b/drivers/firmware/arm_scmi/mailbox.c
@@ -230,6 +230,7 @@ static int mailbox_chan_free(int id, void *p, void *data)
 		smbox->chan = NULL;
 		smbox->chan_receiver = NULL;
 		smbox->cinfo = NULL;
+		shmem_clear_channel(smbox->shmem);
 	}

 	return 0;
@@ -284,7 +285,8 @@ static void mailbox_clear_channel(struct scmi_chan_info *cinfo)
 {
 	struct scmi_mailbox *smbox = cinfo->transport_info;

-	shmem_clear_channel(smbox->shmem);
+	if (smbox)
+		shmem_clear_channel(smbox->shmem);
 }

 static bool
--
2.30.2
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH v1] firmware: arm_scmi: Fix NULL pointer dereference in mailbox_clear_channel
  2023-08-29 17:07 [PATCH v1] firmware: arm_scmi: Fix NULL pointer dereference in mailbox_clear_channel Qiujun Huang
@ 2023-08-30  9:39 ` Sudeep Holla
  2023-08-30 13:23   ` Qiujun Huang
  0 siblings, 1 reply; 5+ messages in thread
From: Sudeep Holla @ 2023-08-30  9:39 UTC (permalink / raw)
  To: Qiujun Huang
  Cc: cristian.marussi, linux-arm-kernel, Sudeep Holla, linux-kernel

On Wed, Aug 30, 2023 at 01:07:47AM +0800, Qiujun Huang wrote:
> There is a race between the failure of probe and rx_callback (due to a
> delayed response).
> 
> scmi_probe
> scmi_acquire_protocal
> do_xfer
>  timeout
> mailbox_chan_free
>                                                     <--- delay response
>                                                            rx_callback
> mbox_free_channel
> cinfo->transport_info = NULL
>                                                           mailbox_clear_channel
>                                                          dereference cinfo->transport_info

It is always good to provide the kernel stacktrace which you get when a
NULL pointer is dereference. It helps for review and also to document it.

-- 
Regards,
Sudeep

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v1] firmware: arm_scmi: Fix NULL pointer dereference in mailbox_clear_channel
  2023-08-30  9:39 ` Sudeep Holla
@ 2023-08-30 13:23   ` Qiujun Huang
  2023-08-31 12:34     ` Sudeep Holla
  0 siblings, 1 reply; 5+ messages in thread
From: Qiujun Huang @ 2023-08-30 13:23 UTC (permalink / raw)
  To: Sudeep Holla; +Cc: cristian.marussi, linux-arm-kernel, linux-kernel



> 2023年8月30日 下午5:39,Sudeep Holla <sudeep.holla@arm.com> 写道:
> 
> On Wed, Aug 30, 2023 at 01:07:47AM +0800, Qiujun Huang wrote:
>> There is a race between the failure of probe and rx_callback (due to a
>> delayed response).
>> 
>> scmi_probe
>> scmi_acquire_protocal
>> do_xfer
>> timeout
>> mailbox_chan_free
>>                                                    <--- delay response
>>                                                           rx_callback
>> mbox_free_channel
>> cinfo->transport_info = NULL
>>                                                          mailbox_clear_channel
>>                                                         dereference cinfo->transport_info
> 
> It is always good to provide the kernel stacktrace which you get when a
> NULL pointer is dereference. It helps for review and also to document it.
> 
> -- 
> Regards,
> Sudeep

Get it. Here is the splat.

[    1.942240][    C0] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000048
[    1.942241][    C0] Mem abort info:
[    1.942243][    C0]   ESR = 0x96000005
[    1.944888][    T9] spmi spmi-1: PMIC arbiter version v7 (0x70000000)
[    1.950652][    C0]   EC = 0x25: DABT (current EL), IL = 32 bits
[    1.950653][    C0]   SET = 0, FnV = 0
[    1.950654][    C0]   EA = 0, S1PTW = 0
[    1.950656][    C0] Data abort info:
[    1.950657][    C0]   ISV = 0, ISS = 0x00000005
[    1.950658][    C0]   CM = 0, WnR = 0
[    1.950660][    C0] [0000000000000048] user address but active_mm is swapper
[    1.950663][    C0] Internal error: Oops: 96000005 [#1] PREEMPT SMP
[    2.338929][    C0] pc : mailbox_clear_channel+0x18/0x64
[    2.344384][    C0] lr : scmi_handle_response+0x17c/0x4f4
[    2.349923][    C0] sp : ffffffc010003db0
[    2.354045][    C0] x29: ffffffc010003dc0 x28: ffffffd85263f000 
[    2.360216][    C0] x27: ffffffd851621068 x26: ffffffd84ec815c8 
[    2.366386][    C0] x25: ffffffd85263bf80 x24: ffffffd85263d230 
[    2.372556][    C0] x23: ffffff803cd70cc0 x22: 0000000000000008 
[    2.378726][    C0] x21: ffffff8036cf0df8 x20: ffffffd85161bac8 
[    2.384896][    C0] x19: ffffff8043ffa580 x18: ffffffc010005050 
[    2.391065][    C0] x17: 0000000000000000 x16: 00000000000000d8 
[    2.397234][    C0] x15: ffffffd8507965e8 x14: ffffffd84eaebdf0 
[    2.403404][    C0] x13: 00000000000001ea x12: 0000000000007ffb 
[    2.409574][    C0] x11: 000000000000ffff x10: ffffffd852c5a000 
[    2.415744][    C0] x9 : d7be1a9b75f29500 x8 : 0000000000000000 
[    2.421914][    C0] x7 : 382e31202020205b x6 : ffffffd852c57e7c 
[    2.428084][    C0] x5 : ffffffffffffffff x4 : 0000000000000000 
[    2.434254][    C0] x3 : ffffffd84eae6668 x2 : 0000000000000001 
[    2.440424][    C0] x1 : 0000000000000000 x0 : ffffff8043ffa580 
[    2.446594][    C0] Call trace:
[    2.449819][    C0]  mailbox_clear_channel+0x18/0x64
[    2.454916][    C0]  scmi_handle_response+0x17c/0x4f4
[    2.460100][    C0]  rx_callback+0x60/0xec
[    2.464311][    C0]  mbox_chan_received_data+0x44/0x94
[    2.469584][    C0]  qcom_rimps_rx_interrupt+0xc0/0x144 [qcom_rimps]
[    2.476111][    C0]  __handle_irq_event_percpu+0xa0/0x414
[    2.481652][    C0]  handle_irq_event+0x84/0x1cc
[    2.486393][    C0]  handle_fasteoi_irq+0x150/0x394
[    2.491403][    C0]  __handle_domain_irq+0x114/0x1e4
[    2.496500][    C0]  gic_handle_irq.33979+0x6c/0x2b8
[    2.501597][    C0]  el1_irq+0xe4/0x1c0
[    2.505537][    C0]  cpuidle_enter_state+0x3a4/0xa04
[    2.510634][    C0]  do_idle+0x308/0x574
[    2.514661][    C0]  cpu_startup_entry+0x84/0x90
[    2.519402][    C0]  kernel_init+0x0/0x310
[    2.523611][    C0]  start_kernel+0x0/0x648
[    2.527908][    C0]  start_kernel+0x52c/0x648
[    2.532390][    C0] Code: f800865e a9017bfd 910043fd f9400808 (f9402508) 
[    2.539360][    C0] ---[ end trace da7fdd5fdd7f7f09 ]---
[    2.550088][    C0] Kernel panic - not syncing: Oops: Fatal exception in interrupt


- - -
thanks


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v1] firmware: arm_scmi: Fix NULL pointer dereference in mailbox_clear_channel
  2023-08-30 13:23   ` Qiujun Huang
@ 2023-08-31 12:34     ` Sudeep Holla
  2023-08-31 14:49       ` Qiujun Huang
  0 siblings, 1 reply; 5+ messages in thread
From: Sudeep Holla @ 2023-08-31 12:34 UTC (permalink / raw)
  To: Qiujun Huang; +Cc: cristian.marussi, linux-arm-kernel, linux-kernel

On Wed, Aug 30, 2023 at 09:23:58PM +0800, Qiujun Huang wrote:
> 
> 
> > 2023年8月30日 下午5:39,Sudeep Holla <sudeep.holla@arm.com> 写道:
> > 
> > On Wed, Aug 30, 2023 at 01:07:47AM +0800, Qiujun Huang wrote:
> >> There is a race between the failure of probe and rx_callback (due to a
> >> delayed response).
> >> 
> >> scmi_probe
> >> scmi_acquire_protocal
> >> do_xfer
> >> timeout
> >> mailbox_chan_free
> >>                                                    <--- delay response
> >>                                                           rx_callback
> >> mbox_free_channel
> >> cinfo->transport_info = NULL
> >>                                                          mailbox_clear_channel
> >>                                                         dereference cinfo->transport_info
> > 
> > It is always good to provide the kernel stacktrace which you get when a
> > NULL pointer is dereference. It helps for review and also to document it.
> > 
> > -- 
> > Regards,
> > Sudeep
> 
> Get it. Here is the splat.
> 
> [    1.942240][    C0] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000048
> [    1.942241][    C0] Mem abort info:
> [    1.942243][    C0]   ESR = 0x96000005
> [    1.944888][    T9] spmi spmi-1: PMIC arbiter version v7 (0x70000000)
> [    1.950652][    C0]   EC = 0x25: DABT (current EL), IL = 32 bits
> [    1.950653][    C0]   SET = 0, FnV = 0
> [    1.950654][    C0]   EA = 0, S1PTW = 0
> [    1.950656][    C0] Data abort info:
> [    1.950657][    C0]   ISV = 0, ISS = 0x00000005
> [    1.950658][    C0]   CM = 0, WnR = 0
> [    1.950660][    C0] [0000000000000048] user address but active_mm is swapper
> [    1.950663][    C0] Internal error: Oops: 96000005 [#1] PREEMPT SMP
> [    2.338929][    C0] pc : mailbox_clear_channel+0x18/0x64
> [    2.344384][    C0] lr : scmi_handle_response+0x17c/0x4f4
> [    2.349923][    C0] sp : ffffffc010003db0
> [    2.354045][    C0] x29: ffffffc010003dc0 x28: ffffffd85263f000 
> [    2.360216][    C0] x27: ffffffd851621068 x26: ffffffd84ec815c8 
> [    2.366386][    C0] x25: ffffffd85263bf80 x24: ffffffd85263d230 
> [    2.372556][    C0] x23: ffffff803cd70cc0 x22: 0000000000000008 
> [    2.378726][    C0] x21: ffffff8036cf0df8 x20: ffffffd85161bac8 
> [    2.384896][    C0] x19: ffffff8043ffa580 x18: ffffffc010005050 
> [    2.391065][    C0] x17: 0000000000000000 x16: 00000000000000d8 
> [    2.397234][    C0] x15: ffffffd8507965e8 x14: ffffffd84eaebdf0 
> [    2.403404][    C0] x13: 00000000000001ea x12: 0000000000007ffb 
> [    2.409574][    C0] x11: 000000000000ffff x10: ffffffd852c5a000 
> [    2.415744][    C0] x9 : d7be1a9b75f29500 x8 : 0000000000000000 
> [    2.421914][    C0] x7 : 382e31202020205b x6 : ffffffd852c57e7c 
> [    2.428084][    C0] x5 : ffffffffffffffff x4 : 0000000000000000 
> [    2.434254][    C0] x3 : ffffffd84eae6668 x2 : 0000000000000001 
> [    2.440424][    C0] x1 : 0000000000000000 x0 : ffffff8043ffa580 
> [    2.446594][    C0] Call trace:
> [    2.449819][    C0]  mailbox_clear_channel+0x18/0x64

Is this with latest kernel ? IIUC the mailbox_clear_channel should get called
only for delayed response and notification in this path(scmi_handle_response).
You are saying it happens as part of probe and again IIUC probe doesn't have
any delayed response command. What am I missing ?

Any additional changes in the tree ? My build has much smaller
mailbox_clear_channel.

> [    2.454916][    C0]  scmi_handle_response+0x17c/0x4f4
> [    2.460100][    C0]  rx_callback+0x60/0xec
> [    2.464311][    C0]  mbox_chan_received_data+0x44/0x94
> [    2.469584][    C0]  qcom_rimps_rx_interrupt+0xc0/0x144 [qcom_rimps]

This suggests out of tree driver, any pointers ?

Also I vaguely remember discussing this in the past. Perhaps at [1] or
somewhere else.

--
Regards,
Sudeep

[1] https://lore.kernel.org/linux-arm-kernel/cfa26ff3-c95a-1986-58fc-b49fc9be49d5@quicinc.com/

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v1] firmware: arm_scmi: Fix NULL pointer dereference in mailbox_clear_channel
  2023-08-31 12:34     ` Sudeep Holla
@ 2023-08-31 14:49       ` Qiujun Huang
  0 siblings, 0 replies; 5+ messages in thread
From: Qiujun Huang @ 2023-08-31 14:49 UTC (permalink / raw)
  To: Sudeep Holla; +Cc: cristian.marussi, linux-arm-kernel, linux-kernel

On Thu, Aug 31, 2023 at 8:34 PM Sudeep Holla <sudeep.holla@arm.com> wrote:
>
> On Wed, Aug 30, 2023 at 09:23:58PM +0800, Qiujun Huang wrote:
> >
> >
> > > 2023年8月30日 下午5:39,Sudeep Holla <sudeep.holla@arm.com> 写道:
> > >
> > > On Wed, Aug 30, 2023 at 01:07:47AM +0800, Qiujun Huang wrote:
> > >> There is a race between the failure of probe and rx_callback (due to a
> > >> delayed response).
> > >>
> > >> scmi_probe
> > >> scmi_acquire_protocal
> > >> do_xfer
> > >> timeout
> > >> mailbox_chan_free
> > >>                                                    <--- delay response
> > >>                                                           rx_callback
> > >> mbox_free_channel
> > >> cinfo->transport_info = NULL
> > >>                                                          mailbox_clear_channel
> > >>                                                         dereference cinfo->transport_info
> > >
> > > It is always good to provide the kernel stacktrace which you get when a
> > > NULL pointer is dereference. It helps for review and also to document it.
> > >
> > > --
> > > Regards,
> > > Sudeep
> >
> > Get it. Here is the splat.
> >
> > [    1.942240][    C0] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000048
> > [    1.942241][    C0] Mem abort info:
> > [    1.942243][    C0]   ESR = 0x96000005
> > [    1.944888][    T9] spmi spmi-1: PMIC arbiter version v7 (0x70000000)
> > [    1.950652][    C0]   EC = 0x25: DABT (current EL), IL = 32 bits
> > [    1.950653][    C0]   SET = 0, FnV = 0
> > [    1.950654][    C0]   EA = 0, S1PTW = 0
> > [    1.950656][    C0] Data abort info:
> > [    1.950657][    C0]   ISV = 0, ISS = 0x00000005
> > [    1.950658][    C0]   CM = 0, WnR = 0
> > [    1.950660][    C0] [0000000000000048] user address but active_mm is swapper
> > [    1.950663][    C0] Internal error: Oops: 96000005 [#1] PREEMPT SMP
> > [    2.338929][    C0] pc : mailbox_clear_channel+0x18/0x64
> > [    2.344384][    C0] lr : scmi_handle_response+0x17c/0x4f4
> > [    2.349923][    C0] sp : ffffffc010003db0
> > [    2.354045][    C0] x29: ffffffc010003dc0 x28: ffffffd85263f000
> > [    2.360216][    C0] x27: ffffffd851621068 x26: ffffffd84ec815c8
> > [    2.366386][    C0] x25: ffffffd85263bf80 x24: ffffffd85263d230
> > [    2.372556][    C0] x23: ffffff803cd70cc0 x22: 0000000000000008
> > [    2.378726][    C0] x21: ffffff8036cf0df8 x20: ffffffd85161bac8
> > [    2.384896][    C0] x19: ffffff8043ffa580 x18: ffffffc010005050
> > [    2.391065][    C0] x17: 0000000000000000 x16: 00000000000000d8
> > [    2.397234][    C0] x15: ffffffd8507965e8 x14: ffffffd84eaebdf0
> > [    2.403404][    C0] x13: 00000000000001ea x12: 0000000000007ffb
> > [    2.409574][    C0] x11: 000000000000ffff x10: ffffffd852c5a000
> > [    2.415744][    C0] x9 : d7be1a9b75f29500 x8 : 0000000000000000
> > [    2.421914][    C0] x7 : 382e31202020205b x6 : ffffffd852c57e7c
> > [    2.428084][    C0] x5 : ffffffffffffffff x4 : 0000000000000000
> > [    2.434254][    C0] x3 : ffffffd84eae6668 x2 : 0000000000000001
> > [    2.440424][    C0] x1 : 0000000000000000 x0 : ffffff8043ffa580
> > [    2.446594][    C0] Call trace:
> > [    2.449819][    C0]  mailbox_clear_channel+0x18/0x64
>
> Is this with latest kernel ? IIUC the mailbox_clear_channel should get called
no, it's on 5.10-stable kernel
> only for delayed response and notification in this path(scmi_handle_response).
> You are saying it happens as part of probe and again IIUC probe doesn't have
> any delayed response command. What am I missing ?
>
> Any additional changes in the tree ? My build has much smaller
> mailbox_clear_channel.
>
> > [    2.454916][    C0]  scmi_handle_response+0x17c/0x4f4
> > [    2.460100][    C0]  rx_callback+0x60/0xec
> > [    2.464311][    C0]  mbox_chan_received_data+0x44/0x94
> > [    2.469584][    C0]  qcom_rimps_rx_interrupt+0xc0/0x144 [qcom_rimps]
>
> This suggests out of tree driver, any pointers ?
>
> Also I vaguely remember discussing this in the past. Perhaps at [1] or
> somewhere else.
Get that. It looks painful
>
> --
> Regards,
> Sudeep
>
> [1] https://lore.kernel.org/linux-arm-kernel/cfa26ff3-c95a-1986-58fc-b49fc9be49d5@quicinc.com/

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2023-08-31 14:50 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-08-29 17:07 [PATCH v1] firmware: arm_scmi: Fix NULL pointer dereference in mailbox_clear_channel Qiujun Huang
2023-08-30  9:39 ` Sudeep Holla
2023-08-30 13:23   ` Qiujun Huang
2023-08-31 12:34     ` Sudeep Holla
2023-08-31 14:49       ` Qiujun Huang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).