public inbox for linux-usb@vger.kernel.org
* [PATCH] usb: gadget: f_fs: Invalidate io_data when USB request is dequeued or completed
@ 2025-03-28 16:17 Frode Isaksen
  2025-03-28 21:02 ` Greg KH
  0 siblings, 1 reply; 6+ messages in thread
From: Frode Isaksen @ 2025-03-28 16:17 UTC (permalink / raw)
  To: linux-usb, Thinh.Nguyen; +Cc: gregkh, fisaksen, Frode Isaksen

From: Frode Isaksen <frode@meta.com>

Invalidate io_data by setting req->context to NULL when the USB request
is dequeued or completed, and check for a NULL io_data in
ffs_epfile_io_complete().
req->context is cleared on exit from ffs_epfile_io() because io_data is
allocated on the stack and becomes invalid after that point.
ffs_epfile_io_complete() may be called after ffs_epfile_io() has
returned if wait_for_completion_interruptible() is interrupted.
This fixes a use-after-free error with the following call stack:

Unable to handle kernel paging request at virtual address ffffffc02f7bbcc0
pc : ffs_epfile_io_complete+0x30/0x48
lr : usb_gadget_giveback_request+0x30/0xf8
Call trace:
ffs_epfile_io_complete+0x30/0x48
usb_gadget_giveback_request+0x30/0xf8
dwc3_remove_requests+0x264/0x2e8
dwc3_gadget_pullup+0x1d0/0x250
kretprobe_trampoline+0x0/0xc4
usb_gadget_remove_driver+0x40/0xf4
usb_gadget_unregister_driver+0xdc/0x178
unregister_gadget_item+0x40/0x6c
ffs_closed+0xd4/0x10c
ffs_data_clear+0x2c/0xf0
ffs_data_closed+0x178/0x1ec
ffs_ep0_release+0x24/0x38
__fput+0xe8/0x27c

Signed-off-by: Frode Isaksen <frode@meta.com>
---
This bug was discovered on a Meta Quest 3 device, and the fix was tested there (no more crashes seen).
Also tested on a TI AM62x board.
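
For readers who want to see the lifetime problem in isolation, here is a
minimal userspace sketch of the idea described above. It is not the
f_fs.c code: struct request, struct io_data, on_complete() and
submit_interrupted() are hypothetical stand-ins, and the "interrupted
wait" is only simulated. It shows a submitter clearing the context
pointer before its stack-allocated io_data goes away, and a completion
callback that skips a request whose context has been cleared.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for the kernel structures. */
struct io_data {
	int status;
};

struct request {
	void *context;	/* points at the submitter's io_data, or NULL */
	int status;	/* non-zero on error */
	int actual;	/* bytes transferred on success */
};

/*
 * Completion callback: skip requests whose context was invalidated.
 * Returns 1 if io_data was written, 0 if the request was ignored.
 */
static int on_complete(struct request *req)
{
	struct io_data *io = req->context;

	if (io == NULL)
		return 0;

	io->status = req->status ? req->status : req->actual;
	return 1;
}

/*
 * Submitter whose wait is "interrupted". io_data lives in this stack
 * frame, so the context pointer is cleared before returning.
 */
static int submit_interrupted(struct request *req)
{
	struct io_data io = { 0 };

	req->context = &io;
	/* ... the wait would be interrupted here, request still pending ... */
	req->context = NULL;
	return -EINTR;
}
```

The sketch is single-threaded on purpose, so it says nothing about what
happens when the callback runs concurrently with the submitter.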

 drivers/usb/gadget/function/f_fs.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/usb/gadget/function/f_fs.c b/drivers/usb/gadget/function/f_fs.c
index 2dea9e42a0f8..f1be0a5c0bd0 100644
--- a/drivers/usb/gadget/function/f_fs.c
+++ b/drivers/usb/gadget/function/f_fs.c
@@ -738,6 +738,9 @@ static void ffs_epfile_io_complete(struct usb_ep *_ep, struct usb_request *req)
 {
 	struct ffs_io_data *io_data = req->context;
 
+	if (WARN_ON(io_data == NULL))
+		return;
+
 	if (req->status)
 		io_data->status = req->status;
 	else
@@ -1126,6 +1129,7 @@ static ssize_t ffs_epfile_io(struct file *file, struct ffs_io_data *io_data)
 			spin_lock_irq(&epfile->ffs->eps_lock);
 			if (epfile->ep != ep) {
 				ret = -ESHUTDOWN;
+				req->context = NULL;
 				goto error_lock;
 			}
 			/*
@@ -1140,6 +1144,7 @@ static ssize_t ffs_epfile_io(struct file *file, struct ffs_io_data *io_data)
 			interrupted = io_data->status < 0;
 		}
 
+		req->context = NULL;
 		if (interrupted)
 			ret = -EINTR;
 		else if (io_data->read && io_data->status > 0)
-- 
2.48.1



* Re: [PATCH] usb: gadget: f_fs: Invalidate io_data when USB request is dequeued or completed
  2025-03-28 16:17 [PATCH] usb: gadget: f_fs: Invalidate io_data when USB request is dequeued or completed Frode Isaksen
@ 2025-03-28 21:02 ` Greg KH
  2025-03-31  8:18   ` Frode Isaksen
  0 siblings, 1 reply; 6+ messages in thread
From: Greg KH @ 2025-03-28 21:02 UTC (permalink / raw)
  To: Frode Isaksen; +Cc: linux-usb, Thinh.Nguyen, Frode Isaksen

On Fri, Mar 28, 2025 at 05:17:15PM +0100, Frode Isaksen wrote:
> From: Frode Isaksen <frode@meta.com>
> 
> Invalidate io_data by setting context to NULL when USB request is
> dequeued or completed, and check for NULL io_data in epfile_io_complete().
> The invalidation of io_data in req->context is done when exiting
> epfile_io(), since then io_data will become invalid as it is allocated
> on the stack.
> The epfile_io_complete() may be called after ffs_epfile_io() returns
> in case the wait_for_completion_interruptible() is interrupted.
> This fixes a use-after-free error with the following call stack:
> 
> Unable to handle kernel paging request at virtual address ffffffc02f7bbcc0
> pc : ffs_epfile_io_complete+0x30/0x48
> lr : usb_gadget_giveback_request+0x30/0xf8
> Call trace:
> ffs_epfile_io_complete+0x30/0x48
> usb_gadget_giveback_request+0x30/0xf8
> dwc3_remove_requests+0x264/0x2e8
> dwc3_gadget_pullup+0x1d0/0x250
> kretprobe_trampoline+0x0/0xc4
> usb_gadget_remove_driver+0x40/0xf4
> usb_gadget_unregister_driver+0xdc/0x178
> unregister_gadget_item+0x40/0x6c
> ffs_closed+0xd4/0x10c
> ffs_data_clear+0x2c/0xf0
> ffs_data_closed+0x178/0x1ec
> ffs_ep0_release+0x24/0x38
> __fput+0xe8/0x27c
> 
> Signed-off-by: Frode Isaksen <frode@meta.com>
> ---
> This bug was discovered, tested and fixed (no more crashes seen) on Meta Quest 3 device.
> Also tested on T.I. AM62x board.

What commit id does this fix?  Should it go to stable?

> 
>  drivers/usb/gadget/function/f_fs.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/drivers/usb/gadget/function/f_fs.c b/drivers/usb/gadget/function/f_fs.c
> index 2dea9e42a0f8..f1be0a5c0bd0 100644
> --- a/drivers/usb/gadget/function/f_fs.c
> +++ b/drivers/usb/gadget/function/f_fs.c
> @@ -738,6 +738,9 @@ static void ffs_epfile_io_complete(struct usb_ep *_ep, struct usb_request *req)
>  {
>  	struct ffs_io_data *io_data = req->context;
>  
> +	if (WARN_ON(io_data == NULL))
> +		return;

If this happens you just crashed the box (remember about panic-on-warn,
which is still set in a few billion Linux systems these days...)

Just handle the issue properly, no need to dump the stack and crash a
device.

But, what keeps io_data from changing after you have checked it?  Where
is the lock here?
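
To make the locking question concrete, here is a rough userspace sketch
in which the check and the use of the context pointer happen under the
same lock that the submitter takes to clear it. All names (struct
request, attach(), detach(), complete_locked()) are hypothetical, and
the pthread mutex is only a stand-in: a kernel version would need a lock
that is usable from the completion path, and this sketch does not claim
which one that should be.

```c
#include <pthread.h>
#include <stddef.h>

struct io_data {
	int status;
};

struct request {
	pthread_mutex_t lock;	/* guards context */
	void *context;		/* submitter's io_data, or NULL */
	int status;
};

/* Submitter side: publish io_data before queuing the request. */
static void attach(struct request *req, struct io_data *io)
{
	pthread_mutex_lock(&req->lock);
	req->context = io;
	pthread_mutex_unlock(&req->lock);
}

/* Submitter side: must be called before io_data goes out of scope. */
static void detach(struct request *req)
{
	pthread_mutex_lock(&req->lock);
	req->context = NULL;
	pthread_mutex_unlock(&req->lock);
}

/*
 * Completion side: the NULL check and the write are one critical
 * section, so detach() cannot slip in between them.
 * Returns 1 if the status was delivered, 0 if the request was detached.
 */
static int complete_locked(struct request *req)
{
	struct io_data *io;
	int delivered = 0;

	pthread_mutex_lock(&req->lock);
	io = req->context;
	if (io != NULL) {
		io->status = req->status;
		delivered = 1;
	}
	pthread_mutex_unlock(&req->lock);

	return delivered;
}
```

With this shape, detach() returning means no callback is still inside
the critical section, which is the guarantee a bare NULL check lacks.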

thanks,

greg k-h


* Re: [PATCH] usb: gadget: f_fs: Invalidate io_data when USB request is dequeued or completed
  2025-03-28 21:02 ` Greg KH
@ 2025-03-31  8:18   ` Frode Isaksen
  2025-03-31  8:57     ` Greg KH
  0 siblings, 1 reply; 6+ messages in thread
From: Frode Isaksen @ 2025-03-31  8:18 UTC (permalink / raw)
  To: Greg KH; +Cc: linux-usb, Thinh.Nguyen, Frode Isaksen

On 3/28/25 10:02 PM, Greg KH wrote:
> On Fri, Mar 28, 2025 at 05:17:15PM +0100, Frode Isaksen wrote:
>> From: Frode Isaksen <frode@meta.com>
>>
>> Invalidate io_data by setting context to NULL when USB request is
>> dequeued or completed, and check for NULL io_data in epfile_io_complete().
>> The invalidation of io_data in req->context is done when exiting
>> epfile_io(), since then io_data will become invalid as it is allocated
>> on the stack.
>> The epfile_io_complete() may be called after ffs_epfile_io() returns
>> in case the wait_for_completion_interruptible() is interrupted.
>> This fixes a use-after-free error with the following call stack:
>>
>> Unable to handle kernel paging request at virtual address ffffffc02f7bbcc0
>> pc : ffs_epfile_io_complete+0x30/0x48
>> lr : usb_gadget_giveback_request+0x30/0xf8
>> Call trace:
>> ffs_epfile_io_complete+0x30/0x48
>> usb_gadget_giveback_request+0x30/0xf8
>> dwc3_remove_requests+0x264/0x2e8
>> dwc3_gadget_pullup+0x1d0/0x250
>> kretprobe_trampoline+0x0/0xc4
>> usb_gadget_remove_driver+0x40/0xf4
>> usb_gadget_unregister_driver+0xdc/0x178
>> unregister_gadget_item+0x40/0x6c
>> ffs_closed+0xd4/0x10c
>> ffs_data_clear+0x2c/0xf0
>> ffs_data_closed+0x178/0x1ec
>> ffs_ep0_release+0x24/0x38
>> __fput+0xe8/0x27c
>>
>> Signed-off-by: Frode Isaksen <frode@meta.com>
>> ---
>> This bug was discovered, tested and fixed (no more crashes seen) on Meta Quest 3 device.
>> Also tested on T.I. AM62x board.
> What commit id does this fix?  Should it go to stable?

This has always been there, so there is no specific commit that
introduced it.

Will add the Cc tag to stable in v2.

>
>>   drivers/usb/gadget/function/f_fs.c | 5 +++++
>>   1 file changed, 5 insertions(+)
>>
>> diff --git a/drivers/usb/gadget/function/f_fs.c b/drivers/usb/gadget/function/f_fs.c
>> index 2dea9e42a0f8..f1be0a5c0bd0 100644
>> --- a/drivers/usb/gadget/function/f_fs.c
>> +++ b/drivers/usb/gadget/function/f_fs.c
>> @@ -738,6 +738,9 @@ static void ffs_epfile_io_complete(struct usb_ep *_ep, struct usb_request *req)
>>   {
>>   	struct ffs_io_data *io_data = req->context;
>>   
>> +	if (WARN_ON(io_data == NULL))
>> +		return;
> If this happens you just crashed the box (remember about panic-on-warn,
> which is still set in a few billion Linux systems these days...)
>
> Just handle the issue properly, no need to dump the stack and crash a
> device.
OK, removing the WARN_ON for v2.
>
> But, what keeps io_data from changing after you have checked it?  Where
> is the lock here?

There is no lock here, as I didn't want to introduce extra complexity 
(and bugs...). But this code has been running without a crash on 
millions of devices for more than a year.

Thanks,

Frode

>
> thanks,
>
> greg k-h




* Re: [PATCH] usb: gadget: f_fs: Invalidate io_data when USB request is dequeued or completed
  2025-03-31  8:18   ` Frode Isaksen
@ 2025-03-31  8:57     ` Greg KH
  2025-03-31 13:17       ` Frode Isaksen
  0 siblings, 1 reply; 6+ messages in thread
From: Greg KH @ 2025-03-31  8:57 UTC (permalink / raw)
  To: Frode Isaksen; +Cc: linux-usb, Thinh.Nguyen, Frode Isaksen

On Mon, Mar 31, 2025 at 10:18:29AM +0200, Frode Isaksen wrote:
> On 3/28/25 10:02 PM, Greg KH wrote:
> > On Fri, Mar 28, 2025 at 05:17:15PM +0100, Frode Isaksen wrote:
> > > From: Frode Isaksen <frode@meta.com>
> > > 
> > > Invalidate io_data by setting context to NULL when USB request is
> > > dequeued or completed, and check for NULL io_data in epfile_io_complete().
> > > The invalidation of io_data in req->context is done when exiting
> > > epfile_io(), since then io_data will become invalid as it is allocated
> > > on the stack.
> > > The epfile_io_complete() may be called after ffs_epfile_io() returns
> > > in case the wait_for_completion_interruptible() is interrupted.
> > > This fixes a use-after-free error with the following call stack:
> > > 
> > > Unable to handle kernel paging request at virtual address ffffffc02f7bbcc0
> > > pc : ffs_epfile_io_complete+0x30/0x48
> > > lr : usb_gadget_giveback_request+0x30/0xf8
> > > Call trace:
> > > ffs_epfile_io_complete+0x30/0x48
> > > usb_gadget_giveback_request+0x30/0xf8
> > > dwc3_remove_requests+0x264/0x2e8
> > > dwc3_gadget_pullup+0x1d0/0x250
> > > kretprobe_trampoline+0x0/0xc4
> > > usb_gadget_remove_driver+0x40/0xf4
> > > usb_gadget_unregister_driver+0xdc/0x178
> > > unregister_gadget_item+0x40/0x6c
> > > ffs_closed+0xd4/0x10c
> > > ffs_data_clear+0x2c/0xf0
> > > ffs_data_closed+0x178/0x1ec
> > > ffs_ep0_release+0x24/0x38
> > > __fput+0xe8/0x27c
> > > 
> > > Signed-off-by: Frode Isaksen <frode@meta.com>
> > > ---
> > > This bug was discovered, tested and fixed (no more crashes seen) on Meta Quest 3 device.
> > > Also tested on T.I. AM62x board.
> > What commit id does this fix?  Should it go to stable?
> 
> This has always been there, so there is no specific commit when this was
> added.
> 
> Will add the Cc tag to stable in v2.
> 
> > 
> > >   drivers/usb/gadget/function/f_fs.c | 5 +++++
> > >   1 file changed, 5 insertions(+)
> > > 
> > > diff --git a/drivers/usb/gadget/function/f_fs.c b/drivers/usb/gadget/function/f_fs.c
> > > index 2dea9e42a0f8..f1be0a5c0bd0 100644
> > > --- a/drivers/usb/gadget/function/f_fs.c
> > > +++ b/drivers/usb/gadget/function/f_fs.c
> > > @@ -738,6 +738,9 @@ static void ffs_epfile_io_complete(struct usb_ep *_ep, struct usb_request *req)
> > >   {
> > >   	struct ffs_io_data *io_data = req->context;
> > > +	if (WARN_ON(io_data == NULL))
> > > +		return;
> > If this happens you just crashed the box (remember about panic-on-warn,
> > which is still set in a few billion Linux systems these days...)
> > 
> > Just handle the issue properly, no need to dump the stack and crash a
> > device.
> OK, removing the WARN_ON for v2.
> > 
> > But, what keeps io_data from changing after you have checked it?  Where
> > is the lock here?
> 
> There is no lock here, as I didn't want to introduce extra complexity (and
> bugs...). But this code has been running without a crash on millions of
> devices for more than a year.

The fix has?  Great, but again, you need to at least say why this value
will not change right after testing for it, otherwise you have just
reduced the race window, not removed it.
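
A small deterministic model of that remaining window, for illustration
only. The callback is split into its two steps (load the pointer, then
use it) so the submitter's exit can be replayed between them by hand.
Every name here is hypothetical, and the alive flag merely models "the
stack frame is gone"; real code would have no such flag to consult and
would simply write to stale memory.

```c
#include <stddef.h>

struct io_data {
	int alive;	/* 0 once the owning stack frame is gone (model only) */
	int status;
};

struct request {
	struct io_data *context;
	int status;
};

/* Callback step 1: load the context pointer (the NULL check happens here). */
static struct io_data *cb_check(struct request *req)
{
	return req->context;
}

/* Submitter returning: clear the context, then its frame disappears. */
static void submitter_exit(struct request *req, struct io_data *io)
{
	req->context = NULL;
	io->alive = 0;
}

/*
 * Callback step 2: use the pointer obtained in step 1.
 * Returns 1 on a valid write, 0 if the pointer was NULL, and -1 where
 * real code would have written to a dead stack frame.
 */
static int cb_use(struct io_data *io, int status)
{
	if (io == NULL)
		return 0;
	if (!io->alive)
		return -1;
	io->status = status;
	return 1;
}
```

Running cb_check(), then submitter_exit(), then cb_use() with the
pointer from the first step yields -1: the check passed, yet the use is
stale. Only when the check happens after the exit does it return 0.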

thanks,

greg k-h


* Re: [PATCH] usb: gadget: f_fs: Invalidate io_data when USB request is dequeued or completed
  2025-03-31  8:57     ` Greg KH
@ 2025-03-31 13:17       ` Frode Isaksen
  2025-03-31 13:23         ` Greg KH
  0 siblings, 1 reply; 6+ messages in thread
From: Frode Isaksen @ 2025-03-31 13:17 UTC (permalink / raw)
  To: Greg KH; +Cc: linux-usb, Thinh.Nguyen, Frode Isaksen

On 3/31/25 10:57 AM, Greg KH wrote:
> On Mon, Mar 31, 2025 at 10:18:29AM +0200, Frode Isaksen wrote:
>> On 3/28/25 10:02 PM, Greg KH wrote:
>>> On Fri, Mar 28, 2025 at 05:17:15PM +0100, Frode Isaksen wrote:
>>>> From: Frode Isaksen <frode@meta.com>
>>>>
>>>> Invalidate io_data by setting context to NULL when USB request is
>>>> dequeued or completed, and check for NULL io_data in epfile_io_complete().
>>>> The invalidation of io_data in req->context is done when exiting
>>>> epfile_io(), since then io_data will become invalid as it is allocated
>>>> on the stack.
>>>> The epfile_io_complete() may be called after ffs_epfile_io() returns
>>>> in case the wait_for_completion_interruptible() is interrupted.
>>>> This fixes a use-after-free error with the following call stack:
>>>>
>>>> Unable to handle kernel paging request at virtual address ffffffc02f7bbcc0
>>>> pc : ffs_epfile_io_complete+0x30/0x48
>>>> lr : usb_gadget_giveback_request+0x30/0xf8
>>>> Call trace:
>>>> ffs_epfile_io_complete+0x30/0x48
>>>> usb_gadget_giveback_request+0x30/0xf8
>>>> dwc3_remove_requests+0x264/0x2e8
>>>> dwc3_gadget_pullup+0x1d0/0x250
>>>> kretprobe_trampoline+0x0/0xc4
>>>> usb_gadget_remove_driver+0x40/0xf4
>>>> usb_gadget_unregister_driver+0xdc/0x178
>>>> unregister_gadget_item+0x40/0x6c
>>>> ffs_closed+0xd4/0x10c
>>>> ffs_data_clear+0x2c/0xf0
>>>> ffs_data_closed+0x178/0x1ec
>>>> ffs_ep0_release+0x24/0x38
>>>> __fput+0xe8/0x27c
>>>>
>>>> Signed-off-by: Frode Isaksen <frode@meta.com>
>>>> ---
>>>> This bug was discovered, tested and fixed (no more crashes seen) on Meta Quest 3 device.
>>>> Also tested on T.I. AM62x board.
>>> What commit id does this fix?  Should it go to stable?
>> This has always been there, so there is no specific commit when this was
>> added.
>>
>> Will add the Cc tag to stable in v2.
>>
>>>>    drivers/usb/gadget/function/f_fs.c | 5 +++++
>>>>    1 file changed, 5 insertions(+)
>>>>
>>>> diff --git a/drivers/usb/gadget/function/f_fs.c b/drivers/usb/gadget/function/f_fs.c
>>>> index 2dea9e42a0f8..f1be0a5c0bd0 100644
>>>> --- a/drivers/usb/gadget/function/f_fs.c
>>>> +++ b/drivers/usb/gadget/function/f_fs.c
>>>> @@ -738,6 +738,9 @@ static void ffs_epfile_io_complete(struct usb_ep *_ep, struct usb_request *req)
>>>>    {
>>>>    	struct ffs_io_data *io_data = req->context;
>>>> +	if (WARN_ON(io_data == NULL))
>>>> +		return;
>>> If this happens you just crashed the box (remember about panic-on-warn,
>>> which is still set in a few billion Linux systems these days...)
>>>
>>> Just handle the issue properly, no need to dump the stack and crash a
>>> device.
>> OK, removing the WARN_ON for v2.
>>> But, what keeps io_data from changing after you have checked it?  Where
>>> is the lock here?
>> There is no lock here, as I didn't want to introduce extra complexity (and
>> bugs...). But this code has been running without a crash on millions of
>> devices for more than a year.
> The fix has?  Great, but again, you need to at least say why this value
> will not change right after testing for it, otherwise you have just
> reduced the race window, not removed it.

I agree that this is only reducing the race window and not eliminating 
it completely, but I have no idea how to fix this easily.

Thanks,

Frode

>
> thanks,
>
> greg k-h




* Re: [PATCH] usb: gadget: f_fs: Invalidate io_data when USB request is dequeued or completed
  2025-03-31 13:17       ` Frode Isaksen
@ 2025-03-31 13:23         ` Greg KH
  0 siblings, 0 replies; 6+ messages in thread
From: Greg KH @ 2025-03-31 13:23 UTC (permalink / raw)
  To: Frode Isaksen; +Cc: linux-usb, Thinh.Nguyen, Frode Isaksen

On Mon, Mar 31, 2025 at 03:17:08PM +0200, Frode Isaksen wrote:
> On 3/31/25 10:57 AM, Greg KH wrote:
> > On Mon, Mar 31, 2025 at 10:18:29AM +0200, Frode Isaksen wrote:
> > > On 3/28/25 10:02 PM, Greg KH wrote:
> > > > On Fri, Mar 28, 2025 at 05:17:15PM +0100, Frode Isaksen wrote:
> > > > > From: Frode Isaksen <frode@meta.com>
> > > > > 
> > > > > Invalidate io_data by setting context to NULL when USB request is
> > > > > dequeued or completed, and check for NULL io_data in epfile_io_complete().
> > > > > The invalidation of io_data in req->context is done when exiting
> > > > > epfile_io(), since then io_data will become invalid as it is allocated
> > > > > on the stack.
> > > > > The epfile_io_complete() may be called after ffs_epfile_io() returns
> > > > > in case the wait_for_completion_interruptible() is interrupted.
> > > > > This fixes a use-after-free error with the following call stack:
> > > > > 
> > > > > Unable to handle kernel paging request at virtual address ffffffc02f7bbcc0
> > > > > pc : ffs_epfile_io_complete+0x30/0x48
> > > > > lr : usb_gadget_giveback_request+0x30/0xf8
> > > > > Call trace:
> > > > > ffs_epfile_io_complete+0x30/0x48
> > > > > usb_gadget_giveback_request+0x30/0xf8
> > > > > dwc3_remove_requests+0x264/0x2e8
> > > > > dwc3_gadget_pullup+0x1d0/0x250
> > > > > kretprobe_trampoline+0x0/0xc4
> > > > > usb_gadget_remove_driver+0x40/0xf4
> > > > > usb_gadget_unregister_driver+0xdc/0x178
> > > > > unregister_gadget_item+0x40/0x6c
> > > > > ffs_closed+0xd4/0x10c
> > > > > ffs_data_clear+0x2c/0xf0
> > > > > ffs_data_closed+0x178/0x1ec
> > > > > ffs_ep0_release+0x24/0x38
> > > > > __fput+0xe8/0x27c
> > > > > 
> > > > > Signed-off-by: Frode Isaksen <frode@meta.com>
> > > > > ---
> > > > > This bug was discovered, tested and fixed (no more crashes seen) on Meta Quest 3 device.
> > > > > Also tested on T.I. AM62x board.
> > > > What commit id does this fix?  Should it go to stable?
> > > This has always been there, so there is no specific commit when this was
> > > added.
> > > 
> > > Will add the Cc tag to stable in v2.
> > > 
> > > > >    drivers/usb/gadget/function/f_fs.c | 5 +++++
> > > > >    1 file changed, 5 insertions(+)
> > > > > 
> > > > > diff --git a/drivers/usb/gadget/function/f_fs.c b/drivers/usb/gadget/function/f_fs.c
> > > > > index 2dea9e42a0f8..f1be0a5c0bd0 100644
> > > > > --- a/drivers/usb/gadget/function/f_fs.c
> > > > > +++ b/drivers/usb/gadget/function/f_fs.c
> > > > > @@ -738,6 +738,9 @@ static void ffs_epfile_io_complete(struct usb_ep *_ep, struct usb_request *req)
> > > > >    {
> > > > >    	struct ffs_io_data *io_data = req->context;
> > > > > +	if (WARN_ON(io_data == NULL))
> > > > > +		return;
> > > > If this happens you just crashed the box (remember about panic-on-warn,
> > > > which is still set in a few billion Linux systems these days...)
> > > > 
> > > > Just handle the issue properly, no need to dump the stack and crash a
> > > > device.
> > > OK, removing the WARN_ON for v2.
> > > > But, what keeps io_data from changing after you have checked it?  Where
> > > > is the lock here?
> > > There is no lock here, as I didn't want to introduce extra complexity (and
> > > bugs...). But this code has been running without a crash on millions of
> > > devices for more than a year.
> > The fix has?  Great, but again, you need to at least say why this value
> > will not change right after testing for it, otherwise you have just
> > reduced the race window, not removed it.
> 
> I agree that this is only reducing the race window and not eliminating it
> completely, but I have no idea how to fix this easily.

The comment in the code explains where the race can not happen, which
implies where it can happen, so perhaps that is a good start?

good luck!

greg k-h


end of thread, other threads:[~2025-03-31 13:24 UTC | newest]
