Re: [PATCH] xen/xenbus: better handle backend crash

All of lore.kernel.org
 help / color / mirror / Atom feed

From: "Marek Marczykowski-Górecki" <marmarek@invisiblethingslab.com>
To: "Jürgen Groß" <jgross@suse.com>
Cc: linux-kernel@vger.kernel.org,
	Stefano Stabellini <sstabellini@kernel.org>,
	Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>,
	Peng Jiang <jiang.peng9@zte.com.cn>,
	Qiu-ji Chen <chenqiuji666@gmail.com>,
	Jason Andryuk <jason.andryuk@amd.com>,
	"moderated list:XEN HYPERVISOR INTERFACE"
	<xen-devel@lists.xenproject.org>
Subject: Re: [PATCH] xen/xenbus: better handle backend crash
Date: Mon, 9 Feb 2026 21:46:46 +0100	[thread overview]
Message-ID: <aYpHth_TvtA8hg4J@mail-itl> (raw)
In-Reply-To: <abd6f188-4c5e-4e56-9dbf-3bc942622b6f@suse.com>

[-- Attachment #1: Type: text/plain, Size: 9649 bytes --]

On Mon, Feb 09, 2026 at 10:02:28AM +0100, Jürgen Groß wrote:
> On 06.02.26 17:57, Marek Marczykowski-Górecki wrote:
> > On Thu, Jan 29, 2026 at 08:02:35AM +0100, Jürgen Groß wrote:
> > > On 26.01.26 08:08, Jürgen Groß wrote:
> > > > On 17.11.25 12:06, Jürgen Groß wrote:
> > > > > On 02.11.25 04:20, Marek Marczykowski-Górecki wrote:
> > > > > > When the backend domain crashes, coordinated device cleanup is not
> > > > > > possible (as it involves waiting for the backend state change). In that
> > > > > > case, toolstack forcefully removes frontend xenstore entries.
> > > > > > xenbus_dev_changed() handles this case, and triggers device cleanup.
> > > > > > It's possible that toolstack manages to connect new device in that
> > > > > > place, before xenbus_dev_changed() notices the old one is missing. If
> > > > > > that happens, new one won't be probed and will forever remain in
> > > > > > XenbusStateInitialising.
> > > > > > 
> > > > > > Fix this by checking backend-id and if it changes, consider it
> > > > > > unplug+plug operation. It's important that cleanup on such unplug
> > > > > > doesn't modify xenstore entries (especially the "state" key) as it
> > > > > > belong to the new device to be probed - changing it would derail
> > > > > > establishing connection to the new backend (most likely, closing the
> > > > > > device before it was even connected). Handle this case by setting new
> > > > > > xenbus_device->vanished flag to true, and check it before changing state
> > > > > > entry.
> > > > > > 
> > > > > > And even if xenbus_dev_changed() correctly detects the device was
> > > > > > forcefully removed, the cleanup handling is still racy. Since this whole
> > > > > > handling doesn't happend in a single xenstore transaction, it's possible
> > > > > > that toolstack might put a new device there already. Avoid re-creating
> > > > > > the state key (which in the case of loosing the race would actually
> > > > > > close newly attached device).
> > > > > > 
> > > > > > The problem does not apply to frontend domain crash, as this case
> > > > > > involves coordinated cleanup.
> > > > > > 
> > > > > > Problem originally reported at
> > > > > > https://lore.kernel.org/xen-devel/aOZvivyZ9YhVWDLN@mail-itl/T/#t,
> > > > > > including reproduction steps.
> > > > > > 
> > > > > > Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
> > > > > 
> > > > > Sorry I didn't get earlier to this.
> > > > > 
> > > > > My main problem with this patch is that it is basically just papering over
> > > > > a more general problem.
> > > > > 
> > > > > You are just making the problem much more improbable, but not impossible to
> > > > > occur again. In case the new driver domain has the same domid as the old one
> > > > > you can still have the same race.
> > > > > 
> > > > > The clean way to handle that would be to add a unique Id in Xenstore to each
> > > > > device on the backend side, which can be tested on the frontend side to
> > > > > match. In case it doesn't match, an old device with the same kind and devid
> > > > > can be cleaned up.
> > > > > 
> > > > > The unique Id would obviously need to be set by the Xen tools inside the
> > > > > transaction writing the initial backend Xenstore nodes, as doing that from
> > > > > the backend would add another potential ambiguity by the driver domain
> > > > > choosing the same unique id as the previous one did.
> > > > > 
> > > > > The question is whether something like your patch should be used as a
> > > > > fallback in case there is no unique Id on the backend side of the device
> > > > > due to a too old Xen version.
> > > > 
> > > > I think I have found a solution which is much more simple, as it doesn't
> > > > need any change of the protocol or any addition of new identifiers.
> > > > 
> > > > When creating a new PV device, Xen tools will always write all generic
> > > > frontend- and backend-nodes. This includes the frontend state, which is
> > > > initialized as XenbusStateInitialising.
> > > > 
> > > > The Linux kernel's xenbus driver is already storing the last known state
> > > > of a xenbus device in struct xenbus_device. When changing the state, the
> > > > xenbus driver is even reading the state from Xenstore (even if only for
> > > > making sure the path is still existing). So all what is needed is to check,
> > > > whether the read current state is matching the locally saved state. If it
> > > > is not matching AND the read state is XenbusStateInitialising, you can be
> > > > sure that the backend has been replaced.
> > > > 
> > > > Handling this will need to check the return value of xenbus_switch_state()
> > > > in all related drivers, but this is just a more or less mechanical change.
> > > > 
> > > > I'll prepare a patch series for that.
> > > 
> > > In the end the result is more like your patch, avoiding the need to modify
> > > all drivers.
> > > 
> > > I just added my idea to your patch and modified some of your code to be more
> > > simple. I _think_ I have covered all possible scenarios now, resulting in
> > > the need to keep the backend id check in case the backend died during the
> > > early init phase of the device.
> > > 
> > > Could you please verify the attached patch is working for you?
> > 
> > Thanks for the patch!
> > 
> > I ran it through relevant tests, and I got inconsistent results.
> > Specifically, sometimes, the domU hangs (actually, just one vCPU spins
> > inside xenwatch thread). Last console messages are:
> > 
> >      systemd[626]: Starting dconf.service - User preferences database...
> >      gnome-keyring-daemon[975]: ␛[0;1;39mcouldn't access control socket: /run/user/1000/keyring/control: No such file or directory␛[0m
> >      gnome-keyring-daemon[975]: ␛[0;1;38:5:185mdiscover_other_daemon: 0␛[0m
> >      xen vif-0: xenbus: state reset occurred, reconnecting
> >      gnome-keyring-daemon[974]: ␛[0;1;39mcouldn't access control socket: /run/user/1000/keyring/control: No such file or directory␛[0m
> >      gnome-keyring-daemon[976]: ␛[0;1;39mcouldn't access control socket: /run/user/1000/keyring/control: No such file or directory␛[0m
> >      gnome-keyring-daemon[976]: ␛[0;1;38:5:185mdiscover_other_daemon: 0␛[0m
> >      gnome-keyring-daemon[974]: ␛[0;1;38:5:185mdiscover_other_daemon: 0␛[0m
> >      xen vif-0: xenbus: state reset occurred, reconnecting
> >      systemd[626]: Started dconf.service - User preferences database.
> >      xen_netfront: Initialising Xen virtual ethernet driver
> >      vif vif-0: xenbus: state reset occurred, reconnecting
> > 
> > And the call trace of the spinning xenwatch thread is:
> > 
> >      task:xenwatch        state:D stack:0     pid:64    tgid:64    ppid:2      task_flags:0x288040 flags:0x00080000
> >      Call Trace:
> >       <TASK>
> >       __schedule+0x2f3/0x780
> >       schedule+0x27/0x80
> >       xs_wait_for_reply+0xab/0x1f0
> >       ? __pfx_autoremove_wake_function+0x10/0x10
> >       xs_talkv+0xec/0x200
> >       xs_single+0x4a/0x70
> >       xenbus_gather+0xe4/0x1a0
> >       xenbus_read_driver_state+0x42/0x70
> >       xennet_bus_close+0x113/0x2c0 [xen_netfront]
> >       ? __pfx_autoremove_wake_function+0x10/0x10
> >       xennet_remove+0x16/0x80 [xen_netfront]
> >       xenbus_dev_remove+0x71/0xf0
> >       device_release_driver_internal+0x19c/0x200
> >       bus_remove_device+0xc6/0x130
> >       device_del+0x160/0x3e0
> >       device_unregister+0x17/0x60
> >       xenbus_dev_changed.cold+0x5e/0x6b
> >       ? __pfx_xenwatch_thread+0x10/0x10
> >       xenwatch_thread+0x92/0x1c0
> >       ? __pfx_autoremove_wake_function+0x10/0x10
> >       kthread+0xfc/0x240
> >       ? __pfx_kthread+0x10/0x10
> >       ret_from_fork+0xf5/0x110
> >       ? __pfx_kthread+0x10/0x10
> >       ret_from_fork_asm+0x1a/0x30
> >       </TASK>
> >      task:xenbus          state:S stack:0     pid:63    tgid:63    ppid:2      task_flags:0x208040 flags:0x00080000
> >      Call Trace:
> >       <TASK>
> >       __schedule+0x2f3/0x780
> >       ? __pfx_xenbus_thread+0x10/0x10
> >       schedule+0x27/0x80
> >       xenbus_thread+0x1a8/0x200
> >       ? __pfx_autoremove_wake_function+0x10/0x10
> >       kthread+0xfc/0x240
> >       ? __pfx_kthread+0x10/0x10
> >       ret_from_fork+0xf5/0x110
> >       ? __pfx_kthread+0x10/0x10
> >       ret_from_fork_asm+0x1a/0x30
> >       </TASK>
> > 
> > (technically, `top` says it's the xenbus thread spinning, but it looks
> > like the actual issue is in xenwatch one)
> > 
> > Note that other xenwatch actions in this domU are not executed, for
> > example `xl sysrq` does nothing. Not surprising, given xenwatch thread
> > is busy... Fortunately, it blocks only a single vCPU, so I'm able to
> > interact with the domU over console (to get the above traces).
> > 
> > It isn't a reliable failure, in this test run it failed once, out of 4
> > related tests.
> > 
> > The specific test is: https://github.com/QubesOS/qubes-core-admin/blob/main/qubes/tests/integ/network.py#L234
> > In short:
> > 1. Start a domU
> > 2. Pause it
> > 3. Attach network (backend is != dom0)
> > 4. Unpause
> > 
> > TBH, I'm not sure why the "state reset occurred" message is triggered at
> > all, I think it shouldn't be in this case...
> > 
> 
> Second try.

This time it's green: https://openqa.qubes-os.org/tests/166851 :)
You can treat it as T-by tag.

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

     prev parent reply	other threads:[~2026-02-09 20:47 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-11-02  3:20 [PATCH] xen/xenbus: better handle backend crash Marek Marczykowski-Górecki
2025-11-13 18:43 ` Marek Marczykowski-Górecki
2025-11-17 11:06 ` Jürgen Groß
2026-01-26  7:08   ` Jürgen Groß
2026-01-29  7:02     ` Jürgen Groß
2026-02-06 16:57       ` Marek Marczykowski-Górecki
2026-02-09  7:30         ` Jürgen Groß
2026-02-09  9:02         ` Jürgen Groß
2026-02-09 20:46           ` Marek Marczykowski-Górecki [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aYpHth_TvtA8hg4J@mail-itl \
    --to=marmarek@invisiblethingslab.com \
    --cc=chenqiuji666@gmail.com \
    --cc=jason.andryuk@amd.com \
    --cc=jgross@suse.com \
    --cc=jiang.peng9@zte.com.cn \
    --cc=linux-kernel@vger.kernel.org \
    --cc=oleksandr_tyshchenko@epam.com \
    --cc=sstabellini@kernel.org \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.