public inbox for v9fs@lists.linux.dev
 help / color / mirror / Atom feed
* [PATCH] 9p/xen: mark 9p transport device as closing when removing it
@ 2025-12-08 19:51 Ariadne Conill
  2025-12-09 10:41 ` Demi Marie Obenour
  2025-12-09 20:45 ` Marek Marczykowski-Górecki
  0 siblings, 2 replies; 8+ messages in thread
From: Ariadne Conill @ 2025-12-08 19:51 UTC (permalink / raw)
  To: v9fs
  Cc: xen-devel, asmadeus, linux_oss, lucho, ericvh, Ariadne Conill,
	Juergen Gross, Stefano Stabellini, Alex Zenla

We need to do this so that we can signal to the other end that the
device is being removed, so that it will release its claim on the
underlying memory allocation.  Otherwise releasing the grant-table
entries is deferred resulting in a kernel oops since the pages have
already been freed.

Cc: Juergen Gross <jgross@suse.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Fixes: 71ebd71921e45 ("xen/9pfs: connect to the backend")
Signed-off-by: Ariadne Conill <ariadne@ariadne.space>
Signed-off-by: Alex Zenla <alex@edera.dev>
---
 net/9p/trans_xen.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/9p/trans_xen.c b/net/9p/trans_xen.c
index b9ff69c7522a..cde283c42dc6 100644
--- a/net/9p/trans_xen.c
+++ b/net/9p/trans_xen.c
@@ -312,6 +312,7 @@ static void xen_9pfs_front_remove(struct xenbus_device *dev)
 {
 	struct xen_9pfs_front_priv *priv = dev_get_drvdata(&dev->dev);
 
+	xenbus_switch_state(dev, XenbusStateClosing);
 	dev_set_drvdata(&dev->dev, NULL);
 	xen_9pfs_front_free(priv);
 }
-- 
2.52.0


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] 9p/xen: mark 9p transport device as closing when removing it
  2025-12-08 19:51 [PATCH] 9p/xen: mark 9p transport device as closing when removing it Ariadne Conill
@ 2025-12-09 10:41 ` Demi Marie Obenour
  2025-12-09 17:09   ` Ariadne Conill
  2025-12-18  8:14   ` Jürgen Groß
  2025-12-09 20:45 ` Marek Marczykowski-Górecki
  1 sibling, 2 replies; 8+ messages in thread
From: Demi Marie Obenour @ 2025-12-09 10:41 UTC (permalink / raw)
  To: Ariadne Conill, v9fs
  Cc: xen-devel, asmadeus, linux_oss, lucho, ericvh, Juergen Gross,
	Stefano Stabellini, Alex Zenla


[-- Attachment #1.1.1: Type: text/plain, Size: 1839 bytes --]

On 12/8/25 14:51, Ariadne Conill wrote:
> We need to do this so that we can signal to the other end that the
> device is being removed, so that it will release its claim on the
> underlying memory allocation.  Otherwise releasing the grant-table
> entries is deferred resulting in a kernel oops since the pages have
> already been freed.

I don't think this is sufficient.  The backend can simply refuse
to release the grants.  The frontend needs to ensure that the pages
are not freed until the grant table entries are freed.  Right now,
the backend can cause a use-after-free in the frontend, and my
understanding of the Xen Project's security policy is that this is
a security vulnerability in the frontend code.

My instinct is that the core Xen code should take a reference on
each page before granting it to another domain, and not release that
reference until the pages are no longer granted.  This should prevent
any use-after-free problems if I understand Linux core MM correctly.

> Cc: Juergen Gross <jgross@suse.com>
> Cc: Stefano Stabellini <sstabellini@kernel.org>
> Fixes: 71ebd71921e45 ("xen/9pfs: connect to the backend")
> Signed-off-by: Ariadne Conill <ariadne@ariadne.space>
> Signed-off-by: Alex Zenla <alex@edera.dev>
> ---
>  net/9p/trans_xen.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/net/9p/trans_xen.c b/net/9p/trans_xen.c
> index b9ff69c7522a..cde283c42dc6 100644
> --- a/net/9p/trans_xen.c
> +++ b/net/9p/trans_xen.c
> @@ -312,6 +312,7 @@ static void xen_9pfs_front_remove(struct xenbus_device *dev)
>  {
>  	struct xen_9pfs_front_priv *priv = dev_get_drvdata(&dev->dev);
>  
> +	xenbus_switch_state(dev, XenbusStateClosing);
>  	dev_set_drvdata(&dev->dev, NULL);
>  	xen_9pfs_front_free(priv);
>  }


-- 
Sincerely,
Demi Marie Obenour (she/her/hers)

[-- Attachment #1.1.2: OpenPGP public key --]
[-- Type: application/pgp-keys, Size: 7253 bytes --]

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] 9p/xen: mark 9p transport device as closing when removing it
  2025-12-09 10:41 ` Demi Marie Obenour
@ 2025-12-09 17:09   ` Ariadne Conill
  2025-12-09 17:12     ` Demi Marie Obenour
  2025-12-18  8:14   ` Jürgen Groß
  1 sibling, 1 reply; 8+ messages in thread
From: Ariadne Conill @ 2025-12-09 17:09 UTC (permalink / raw)
  To: Demi Marie Obenour
  Cc: v9fs, xen-devel, asmadeus, linux_oss, lucho, ericvh,
	Juergen Gross, Stefano Stabellini, Alex Zenla

Hello,

> On Dec 9, 2025, at 2:43 AM, Demi Marie Obenour <demiobenour@gmail.com> wrote:
> 
> On 12/8/25 14:51, Ariadne Conill wrote:
>> We need to do this so that we can signal to the other end that the
>> device is being removed, so that it will release its claim on the
>> underlying memory allocation.  Otherwise releasing the grant-table
>> entries is deferred resulting in a kernel oops since the pages have
>> already been freed.
> 
> I don't think this is sufficient.  The backend can simply refuse
> to release the grants.  The frontend needs to ensure that the pages
> are not freed until the grant table entries are freed.  Right now,
> the backend can cause a use-after-free in the frontend, and my
> understanding of the Xen Project's security policy is that this is
> a security vulnerability in the frontend code.
> 
> My instinct is that the core Xen code should take a reference on
> each page before granting it to another domain, and not release that
> reference until the pages are no longer granted.  This should prevent
> any use-after-free problems if I understand Linux core MM correctly.

Yes, there are other issues in the 9p transport that are likely in play here.  In our internal testing, we confirm this is not a full fix for hotplugging 9p transport devices, but no such claim of a complete fix has been made here or in the Matrix thread.

However, this is one defect that is contributing to the overall hotplugging problem and should be merged regardless: if the driver isn’t telling the other side to disconnect, the other side will never release the grants to begin with.

Ariadne

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] 9p/xen: mark 9p transport device as closing when removing it
  2025-12-09 17:09   ` Ariadne Conill
@ 2025-12-09 17:12     ` Demi Marie Obenour
  2025-12-09 17:18       ` Ariadne Conill
  0 siblings, 1 reply; 8+ messages in thread
From: Demi Marie Obenour @ 2025-12-09 17:12 UTC (permalink / raw)
  To: Ariadne Conill
  Cc: v9fs, xen-devel, asmadeus, linux_oss, lucho, ericvh,
	Juergen Gross, Stefano Stabellini, Alex Zenla


[-- Attachment #1.1.1: Type: text/plain, Size: 1923 bytes --]

On 12/9/25 12:09, Ariadne Conill wrote:
> Hello,
> 
>> On Dec 9, 2025, at 2:43 AM, Demi Marie Obenour <demiobenour@gmail.com> wrote:
>>
>> On 12/8/25 14:51, Ariadne Conill wrote:
>>> We need to do this so that we can signal to the other end that the
>>> device is being removed, so that it will release its claim on the
>>> underlying memory allocation.  Otherwise releasing the grant-table
>>> entries is deferred resulting in a kernel oops since the pages have
>>> already been freed.
>>
>> I don't think this is sufficient.  The backend can simply refuse
>> to release the grants.  The frontend needs to ensure that the pages
>> are not freed until the grant table entries are freed.  Right now,
>> the backend can cause a use-after-free in the frontend, and my
>> understanding of the Xen Project's security policy is that this is
>> a security vulnerability in the frontend code.
>>
>> My instinct is that the core Xen code should take a reference on
>> each page before granting it to another domain, and not release that
>> reference until the pages are no longer granted.  This should prevent
>> any use-after-free problems if I understand Linux core MM correctly.
> 
> Yes, there are other issues in the 9p transport that are likely in play here.  In our internal testing, we confirm this is not a full fix for hotplugging 9p transport devices, but no such claim of a complete fix has been made here or in the Matrix thread.
> 
> However, this is one defect that is contributing to the overall hotplugging problem and should be merged regardless: if the driver isn’t telling the other side to disconnect, the other side will never release the grants to begin with.
> 
> Ariadne

I definitely agree that this should be merged!

Is this code-path triggerable by the backend at will, or only during
teardown by the toolstack?
-- 
Sincerely,
Demi Marie Obenour (she/her/hers)

[-- Attachment #1.1.2: OpenPGP public key --]
[-- Type: application/pgp-keys, Size: 7253 bytes --]

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] 9p/xen: mark 9p transport device as closing when removing it
  2025-12-09 17:12     ` Demi Marie Obenour
@ 2025-12-09 17:18       ` Ariadne Conill
  0 siblings, 0 replies; 8+ messages in thread
From: Ariadne Conill @ 2025-12-09 17:18 UTC (permalink / raw)
  To: Demi Marie Obenour
  Cc: v9fs, xen-devel, asmadeus, linux_oss, lucho, ericvh,
	Juergen Gross, Stefano Stabellini, Alex Zenla

Hello,

> On Dec 9, 2025, at 9:12 AM, Demi Marie Obenour <demiobenour@gmail.com> wrote:
> 
> On 12/9/25 12:09, Ariadne Conill wrote:
>> Hello,
>> 
>>>> On Dec 9, 2025, at 2:43 AM, Demi Marie Obenour <demiobenour@gmail.com> wrote:
>>> 
>>> On 12/8/25 14:51, Ariadne Conill wrote:
>>>> We need to do this so that we can signal to the other end that the
>>>> device is being removed, so that it will release its claim on the
>>>> underlying memory allocation.  Otherwise releasing the grant-table
>>>> entries is deferred resulting in a kernel oops since the pages have
>>>> already been freed.
>>> 
>>> I don't think this is sufficient.  The backend can simply refuse
>>> to release the grants.  The frontend needs to ensure that the pages
>>> are not freed until the grant table entries are freed.  Right now,
>>> the backend can cause a use-after-free in the frontend, and my
>>> understanding of the Xen Project's security policy is that this is
>>> a security vulnerability in the frontend code.
>>> 
>>> My instinct is that the core Xen code should take a reference on
>>> each page before granting it to another domain, and not release that
>>> reference until the pages are no longer granted.  This should prevent
>>> any use-after-free problems if I understand Linux core MM correctly.
>> 
>> Yes, there are other issues in the 9p transport that are likely in play here.  In our internal testing, we confirm this is not a full fix for hotplugging 9p transport devices, but no such claim of a complete fix has been made here or in the Matrix thread.
>> 
>> However, this is one defect that is contributing to the overall hotplugging problem and should be merged regardless: if the driver isn’t telling the other side to disconnect, the other side will never release the grants to begin with.
>> 
>> Ariadne
> 
> I definitely agree that this should be merged!
> 
> Is this code-path triggerable by the backend at will, or only during
> teardown by the toolstack?

In practice it only happens when the toolstack tears down a 9p transport.

In theory it can happen in any situation where the backend decides to tear down the transport, which may or may not actually be mediated by the toolstack.

Ariadne

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] 9p/xen: mark 9p transport device as closing when removing it
  2025-12-08 19:51 [PATCH] 9p/xen: mark 9p transport device as closing when removing it Ariadne Conill
  2025-12-09 10:41 ` Demi Marie Obenour
@ 2025-12-09 20:45 ` Marek Marczykowski-Górecki
  1 sibling, 0 replies; 8+ messages in thread
From: Marek Marczykowski-Górecki @ 2025-12-09 20:45 UTC (permalink / raw)
  To: Ariadne Conill
  Cc: v9fs, xen-devel, asmadeus, linux_oss, lucho, ericvh,
	Juergen Gross, Stefano Stabellini, Alex Zenla

[-- Attachment #1: Type: text/plain, Size: 1437 bytes --]

On Mon, Dec 08, 2025 at 11:51:55AM -0800, Ariadne Conill wrote:
> We need to do this so that we can signal to the other end that the
> device is being removed, so that it will release its claim on the
> underlying memory allocation.  Otherwise releasing the grant-table
> entries is deferred resulting in a kernel oops since the pages have
> already been freed.
> 
> Cc: Juergen Gross <jgross@suse.com>
> Cc: Stefano Stabellini <sstabellini@kernel.org>
> Fixes: 71ebd71921e45 ("xen/9pfs: connect to the backend")
> Signed-off-by: Ariadne Conill <ariadne@ariadne.space>
> Signed-off-by: Alex Zenla <alex@edera.dev>
> ---
>  net/9p/trans_xen.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/net/9p/trans_xen.c b/net/9p/trans_xen.c
> index b9ff69c7522a..cde283c42dc6 100644
> --- a/net/9p/trans_xen.c
> +++ b/net/9p/trans_xen.c
> @@ -312,6 +312,7 @@ static void xen_9pfs_front_remove(struct xenbus_device *dev)
>  {
>  	struct xen_9pfs_front_priv *priv = dev_get_drvdata(&dev->dev);
>  
> +	xenbus_switch_state(dev, XenbusStateClosing);

I think you need to wait for the backend to acknowledge it by switching
its state, before continuing. See for example
xennet_remove()->xennet_bus_close() in drivers/net/xen-netfront.c.

>  	dev_set_drvdata(&dev->dev, NULL);
>  	xen_9pfs_front_free(priv);
>  }
> -- 
> 2.52.0
> 
> 

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] 9p/xen: mark 9p transport device as closing when removing it
  2025-12-09 10:41 ` Demi Marie Obenour
  2025-12-09 17:09   ` Ariadne Conill
@ 2025-12-18  8:14   ` Jürgen Groß
  2025-12-20  2:02     ` Demi Marie Obenour
  1 sibling, 1 reply; 8+ messages in thread
From: Jürgen Groß @ 2025-12-18  8:14 UTC (permalink / raw)
  To: Demi Marie Obenour, Ariadne Conill, v9fs
  Cc: xen-devel, asmadeus, linux_oss, lucho, ericvh, Stefano Stabellini,
	Alex Zenla


[-- Attachment #1.1.1: Type: text/plain, Size: 1605 bytes --]

On 09.12.25 11:41, Demi Marie Obenour wrote:
> On 12/8/25 14:51, Ariadne Conill wrote:
>> We need to do this so that we can signal to the other end that the
>> device is being removed, so that it will release its claim on the
>> underlying memory allocation.  Otherwise releasing the grant-table
>> entries is deferred resulting in a kernel oops since the pages have
>> already been freed.
> 
> I don't think this is sufficient.  The backend can simply refuse
> to release the grants.  The frontend needs to ensure that the pages
> are not freed until the grant table entries are freed.  Right now,
> the backend can cause a use-after-free in the frontend, and my
> understanding of the Xen Project's security policy is that this is
> a security vulnerability in the frontend code.
> 
> My instinct is that the core Xen code should take a reference on
> each page before granting it to another domain, and not release that
> reference until the pages are no longer granted.  This should prevent
> any use-after-free problems if I understand Linux core MM correctly.

I looked at this in detail now.

I don't think we have a security bug right now, but the interfaces regarding
granting pages to other domains should probably be reworked like you suggest.

Currently it is the caller who needs to handle page references correctly,
while this should be done by the grant handling (having to either issue
get_page() or to pass NULL for the page pointer, in case you don't want the
underlying page to be freed by gnttab_end_foreign_access(), is far from
intuitive).


Juergen

[-- Attachment #1.1.2: OpenPGP public key --]
[-- Type: application/pgp-keys, Size: 3743 bytes --]

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 495 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] 9p/xen: mark 9p transport device as closing when removing it
  2025-12-18  8:14   ` Jürgen Groß
@ 2025-12-20  2:02     ` Demi Marie Obenour
  0 siblings, 0 replies; 8+ messages in thread
From: Demi Marie Obenour @ 2025-12-20  2:02 UTC (permalink / raw)
  To: Jürgen Groß, Ariadne Conill, v9fs
  Cc: xen-devel, asmadeus, linux_oss, lucho, ericvh, Stefano Stabellini,
	Alex Zenla


[-- Attachment #1.1.1: Type: text/plain, Size: 2657 bytes --]

On 12/18/25 03:14, Jürgen Groß wrote:
> On 09.12.25 11:41, Demi Marie Obenour wrote:
>> On 12/8/25 14:51, Ariadne Conill wrote:
>>> We need to do this so that we can signal to the other end that the
>>> device is being removed, so that it will release its claim on the
>>> underlying memory allocation.  Otherwise releasing the grant-table
>>> entries is deferred resulting in a kernel oops since the pages have
>>> already been freed.
>>
>> I don't think this is sufficient.  The backend can simply refuse
>> to release the grants.  The frontend needs to ensure that the pages
>> are not freed until the grant table entries are freed.  Right now,
>> the backend can cause a use-after-free in the frontend, and my
>> understanding of the Xen Project's security policy is that this is
>> a security vulnerability in the frontend code.
>>
>> My instinct is that the core Xen code should take a reference on
>> each page before granting it to another domain, and not release that
>> reference until the pages are no longer granted.  This should prevent
>> any use-after-free problems if I understand Linux core MM correctly.
> 
> I looked at this in detail now.
> 
> I don't think we have a security bug right now, but the interfaces regarding
> granting pages to other domains should probably be reworked like you suggest.
> 
> Currently it is the caller who needs to handle page references correctly,
> while this should be done by the grant handling (having to either issue
> get_page() or to pass NULL for the page pointer, in case you don't want the
> underlying page to be freed by gnttab_end_foreign_access(), is far from
> intuitive).

Unfortunately, I don't think this is going to work.  Page refcounts
are going away.  Taking a reference on kmalloc()'d memory already
runs into a VM_BUG_ON_FOLIO().  So the only reasonable approach
is a callback that is run when the grant is no longer accessible by
the backend.  It could well be that this is another example of a bug
in 9p that also affects virtio.

More details about the memory management and
virtio-9p situation can be found in the thread at
<https://lore.kernel.org/lkml/20251210-virtio_trans_iter-1-1-92eee6d8b6db@codewreck.org/t/#u>.

A much better API would allow the grantor of a page to revoke the
grantee's access, with subsequent accesses by the grantee redirected
to a scratch page.  For PVH/HVM guests I expect this to be possible,
but for PV guests it would require Xen to modify the guest's own
page tables.  My understanding is that this isn't allowed as it would
break the guest.
-- 
Sincerely,
Demi Marie Obenour (she/her/hers)

[-- Attachment #1.1.2: OpenPGP public key --]
[-- Type: application/pgp-keys, Size: 7253 bytes --]

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2025-12-20  2:02 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-12-08 19:51 [PATCH] 9p/xen: mark 9p transport device as closing when removing it Ariadne Conill
2025-12-09 10:41 ` Demi Marie Obenour
2025-12-09 17:09   ` Ariadne Conill
2025-12-09 17:12     ` Demi Marie Obenour
2025-12-09 17:18       ` Ariadne Conill
2025-12-18  8:14   ` Jürgen Groß
2025-12-20  2:02     ` Demi Marie Obenour
2025-12-09 20:45 ` Marek Marczykowski-Górecki

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox