Re: [PATCH] xen/xenbus: better handle backend crash

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

From: "Marek Marczykowski-Górecki" <marmarek@invisiblethingslab.com>
To: "Jürgen Groß" <jgross@suse.com>
Cc: linux-kernel@vger.kernel.org,
	Stefano Stabellini <sstabellini@kernel.org>,
	Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>,
	Peng Jiang <jiang.peng9@zte.com.cn>,
	Qiu-ji Chen <chenqiuji666@gmail.com>,
	Jason Andryuk <jason.andryuk@amd.com>,
	"moderated list:XEN HYPERVISOR INTERFACE"
	<xen-devel@lists.xenproject.org>
Subject: Re: [PATCH] xen/xenbus: better handle backend crash
Date: Fri, 6 Feb 2026 17:57:37 +0100	[thread overview]
Message-ID: <aYYdgXMKnMCfXob7@mail-itl> (raw)
In-Reply-To: <e1060ba2-7985-4e80-9d84-fb738758e3b3@suse.com>

[-- Attachment #1: Type: text/plain, Size: 8644 bytes --]

On Thu, Jan 29, 2026 at 08:02:35AM +0100, Jürgen Groß wrote:
> On 26.01.26 08:08, Jürgen Groß wrote:
> > On 17.11.25 12:06, Jürgen Groß wrote:
> > > On 02.11.25 04:20, Marek Marczykowski-Górecki wrote:
> > > > When the backend domain crashes, coordinated device cleanup is not
> > > > possible (as it involves waiting for the backend state change). In that
> > > > case, toolstack forcefully removes frontend xenstore entries.
> > > > xenbus_dev_changed() handles this case, and triggers device cleanup.
> > > > It's possible that toolstack manages to connect new device in that
> > > > place, before xenbus_dev_changed() notices the old one is missing. If
> > > > that happens, new one won't be probed and will forever remain in
> > > > XenbusStateInitialising.
> > > > 
> > > > Fix this by checking backend-id and if it changes, consider it
> > > > unplug+plug operation. It's important that cleanup on such unplug
> > > > doesn't modify xenstore entries (especially the "state" key) as it
> > > > belong to the new device to be probed - changing it would derail
> > > > establishing connection to the new backend (most likely, closing the
> > > > device before it was even connected). Handle this case by setting new
> > > > xenbus_device->vanished flag to true, and check it before changing state
> > > > entry.
> > > > 
> > > > And even if xenbus_dev_changed() correctly detects the device was
> > > > forcefully removed, the cleanup handling is still racy. Since this whole
> > > > handling doesn't happend in a single xenstore transaction, it's possible
> > > > that toolstack might put a new device there already. Avoid re-creating
> > > > the state key (which in the case of loosing the race would actually
> > > > close newly attached device).
> > > > 
> > > > The problem does not apply to frontend domain crash, as this case
> > > > involves coordinated cleanup.
> > > > 
> > > > Problem originally reported at
> > > > https://lore.kernel.org/xen-devel/aOZvivyZ9YhVWDLN@mail-itl/T/#t,
> > > > including reproduction steps.
> > > > 
> > > > Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
> > > 
> > > Sorry I didn't get earlier to this.
> > > 
> > > My main problem with this patch is that it is basically just papering over
> > > a more general problem.
> > > 
> > > You are just making the problem much more improbable, but not impossible to
> > > occur again. In case the new driver domain has the same domid as the old one
> > > you can still have the same race.
> > > 
> > > The clean way to handle that would be to add a unique Id in Xenstore to each
> > > device on the backend side, which can be tested on the frontend side to
> > > match. In case it doesn't match, an old device with the same kind and devid
> > > can be cleaned up.
> > > 
> > > The unique Id would obviously need to be set by the Xen tools inside the
> > > transaction writing the initial backend Xenstore nodes, as doing that from
> > > the backend would add another potential ambiguity by the driver domain
> > > choosing the same unique id as the previous one did.
> > > 
> > > The question is whether something like your patch should be used as a
> > > fallback in case there is no unique Id on the backend side of the device
> > > due to a too old Xen version.
> > 
> > I think I have found a solution which is much more simple, as it doesn't
> > need any change of the protocol or any addition of new identifiers.
> > 
> > When creating a new PV device, Xen tools will always write all generic
> > frontend- and backend-nodes. This includes the frontend state, which is
> > initialized as XenbusStateInitialising.
> > 
> > The Linux kernel's xenbus driver is already storing the last known state
> > of a xenbus device in struct xenbus_device. When changing the state, the
> > xenbus driver is even reading the state from Xenstore (even if only for
> > making sure the path is still existing). So all what is needed is to check,
> > whether the read current state is matching the locally saved state. If it
> > is not matching AND the read state is XenbusStateInitialising, you can be
> > sure that the backend has been replaced.
> > 
> > Handling this will need to check the return value of xenbus_switch_state()
> > in all related drivers, but this is just a more or less mechanical change.
> > 
> > I'll prepare a patch series for that.
> 
> In the end the result is more like your patch, avoiding the need to modify
> all drivers.
> 
> I just added my idea to your patch and modified some of your code to be more
> simple. I _think_ I have covered all possible scenarios now, resulting in
> the need to keep the backend id check in case the backend died during the
> early init phase of the device.
> 
> Could you please verify the attached patch is working for you?

Thanks for the patch!

I ran it through relevant tests, and I got inconsistent results.
Specifically, sometimes, the domU hangs (actually, just one vCPU spins
inside xenwatch thread). Last console messages are:

    systemd[626]: Starting dconf.service - User preferences database...
    gnome-keyring-daemon[975]: ␛[0;1;39mcouldn't access control socket: /run/user/1000/keyring/control: No such file or directory␛[0m
    gnome-keyring-daemon[975]: ␛[0;1;38:5:185mdiscover_other_daemon: 0␛[0m
    xen vif-0: xenbus: state reset occurred, reconnecting
    gnome-keyring-daemon[974]: ␛[0;1;39mcouldn't access control socket: /run/user/1000/keyring/control: No such file or directory␛[0m
    gnome-keyring-daemon[976]: ␛[0;1;39mcouldn't access control socket: /run/user/1000/keyring/control: No such file or directory␛[0m
    gnome-keyring-daemon[976]: ␛[0;1;38:5:185mdiscover_other_daemon: 0␛[0m
    gnome-keyring-daemon[974]: ␛[0;1;38:5:185mdiscover_other_daemon: 0␛[0m
    xen vif-0: xenbus: state reset occurred, reconnecting
    systemd[626]: Started dconf.service - User preferences database.
    xen_netfront: Initialising Xen virtual ethernet driver
    vif vif-0: xenbus: state reset occurred, reconnecting

And the call trace of the spinning xenwatch thread is:

    task:xenwatch        state:D stack:0     pid:64    tgid:64    ppid:2      task_flags:0x288040 flags:0x00080000
    Call Trace:
     <TASK>
     __schedule+0x2f3/0x780
     schedule+0x27/0x80
     xs_wait_for_reply+0xab/0x1f0
     ? __pfx_autoremove_wake_function+0x10/0x10
     xs_talkv+0xec/0x200
     xs_single+0x4a/0x70
     xenbus_gather+0xe4/0x1a0
     xenbus_read_driver_state+0x42/0x70
     xennet_bus_close+0x113/0x2c0 [xen_netfront]
     ? __pfx_autoremove_wake_function+0x10/0x10
     xennet_remove+0x16/0x80 [xen_netfront]
     xenbus_dev_remove+0x71/0xf0
     device_release_driver_internal+0x19c/0x200
     bus_remove_device+0xc6/0x130
     device_del+0x160/0x3e0
     device_unregister+0x17/0x60
     xenbus_dev_changed.cold+0x5e/0x6b
     ? __pfx_xenwatch_thread+0x10/0x10
     xenwatch_thread+0x92/0x1c0
     ? __pfx_autoremove_wake_function+0x10/0x10
     kthread+0xfc/0x240
     ? __pfx_kthread+0x10/0x10
     ret_from_fork+0xf5/0x110
     ? __pfx_kthread+0x10/0x10
     ret_from_fork_asm+0x1a/0x30
     </TASK>
    task:xenbus          state:S stack:0     pid:63    tgid:63    ppid:2      task_flags:0x208040 flags:0x00080000
    Call Trace:
     <TASK>
     __schedule+0x2f3/0x780
     ? __pfx_xenbus_thread+0x10/0x10
     schedule+0x27/0x80
     xenbus_thread+0x1a8/0x200
     ? __pfx_autoremove_wake_function+0x10/0x10
     kthread+0xfc/0x240
     ? __pfx_kthread+0x10/0x10
     ret_from_fork+0xf5/0x110
     ? __pfx_kthread+0x10/0x10
     ret_from_fork_asm+0x1a/0x30
     </TASK>

(technically, `top` says it's the xenbus thread spinning, but it looks
like the actual issue is in xenwatch one)

Note that other xenwatch actions in this domU are not executed, for
example `xl sysrq` does nothing. Not surprising, given xenwatch thread
is busy... Fortunately, it blocks only a single vCPU, so I'm able to
interact with the domU over console (to get the above traces).

It isn't a reliable failure, in this test run it failed once, out of 4
related tests.

The specific test is: https://github.com/QubesOS/qubes-core-admin/blob/main/qubes/tests/integ/network.py#L234
In short:
1. Start a domU
2. Pause it
3. Attach network (backend is != dom0)
4. Unpause

TBH, I'm not sure why the "state reset occurred" message is triggered at
all, I think it shouldn't be in this case...

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

next prev parent reply	other threads:[~2026-02-06 16:57 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-11-02  3:20 [PATCH] xen/xenbus: better handle backend crash Marek Marczykowski-Górecki
2025-11-13 18:43 ` Marek Marczykowski-Górecki
2025-11-17 11:06 ` Jürgen Groß
2026-01-26  7:08   ` Jürgen Groß
2026-01-29  7:02     ` Jürgen Groß
2026-02-06 16:57       ` Marek Marczykowski-Górecki [this message]
2026-02-09  7:30         ` Jürgen Groß
2026-02-09  9:02         ` Jürgen Groß
2026-02-09 20:46           ` Marek Marczykowski-Górecki

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aYYdgXMKnMCfXob7@mail-itl \
    --to=marmarek@invisiblethingslab.com \
    --cc=chenqiuji666@gmail.com \
    --cc=jason.andryuk@amd.com \
    --cc=jgross@suse.com \
    --cc=jiang.peng9@zte.com.cn \
    --cc=linux-kernel@vger.kernel.org \
    --cc=oleksandr_tyshchenko@epam.com \
    --cc=sstabellini@kernel.org \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox