From: "Marek Marczykowski-Górecki" <marmarek@invisiblethingslab.com>
To: Jason Andryuk <jason.andryuk@amd.com>
Cc: Juergen Gross <jgross@suse.com>,
Stefano Stabellini <sstabellini@kernel.org>,
Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>,
Boris Ostrovsky <boris.ostrovsky@oracle.com>,
stable@vger.kernel.org, xen-devel@lists.xenproject.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH] xenbus: Use kref to track req lifetime
Date: Thu, 8 May 2025 22:18:32 +0200
Message-ID: <aB0Rmd1PCxA_7Gch@mail-itl>
In-Reply-To: <20250506210935.5607-1-jason.andryuk@amd.com>
On Tue, May 06, 2025 at 05:09:33PM -0400, Jason Andryuk wrote:
> Marek reported seeing a NULL pointer fault in the xenbus_thread
> callstack:
> BUG: kernel NULL pointer dereference, address: 0000000000000000
> RIP: e030:__wake_up_common+0x4c/0x180
> Call Trace:
> <TASK>
> __wake_up_common_lock+0x82/0xd0
> process_msg+0x18e/0x2f0
> xenbus_thread+0x165/0x1c0
>
> process_msg+0x18e is req->cb(req). req->cb is set to xs_wake_up(), a
> thin wrapper around wake_up(), or xenbus_dev_queue_reply(). It seems
> like it was xs_wake_up() in this case.
>
> It seems like req may have woken up the xs_wait_for_reply(), which
> kfree()ed the req. When xenbus_thread resumes, it faults on the zero-ed
> data.
>
> Linux Device Drivers 2nd edition states:
> "Normally, a wake_up call can cause an immediate reschedule to happen,
> meaning that other processes might run before wake_up returns."
> ... which would match the behaviour observed.
>
> Change to keeping two krefs on each request. One for the caller, and
> one for xenbus_thread. Each will kref_put() when finished, and the last
> will free it.
>
> This use of kref matches the description in
> Documentation/core-api/kref.rst
>
> Link: https://lore.kernel.org/xen-devel/ZO0WrR5J0xuwDIxW@mail-itl/
> Reported-by: "Marek Marczykowski-Górecki" <marmarek@invisiblethingslab.com>
> Fixes: fd8aa9095a95 ("xen: optimize xenbus driver for multiple concurrent xenstore accesses")
> Cc: stable@vger.kernel.org
> Signed-off-by: Jason Andryuk <jason.andryuk@amd.com>
> ---
> Kinda RFC-ish as I don't know if it fixes Marek's issue. This does seem
> like the correct approach if we are seeing req free()ed out from under
> xenbus_thread.
Thanks for the patch! I don't have an easy way to test whether it
definitely fixes the issue (due to the poor reproduction rate), but it
looks very likely. I did run it through our CI and at least there it
didn't crash (but again, the crash doesn't happen often).
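
For readers following the thread, the two-reference scheme described in
the quoted commit message can be sketched in userspace C. This is only an
analogy under stated assumptions: it uses C11 atomics and pthreads instead
of the kernel's kref and wait queues, and every name in it (demo_req,
demo_req_put, demo_reply_thread, demo_request_lifetime) is hypothetical
and does not come from the patch.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for a xenbus request; not the kernel structure. */
struct demo_req {
    atomic_int refs;         /* plays the role of struct kref */
    pthread_mutex_t lock;
    pthread_cond_t wake;
    int replied;
    atomic_int *free_count;  /* lets the demo check the request is freed once */
};

/* Drop one reference; whoever drops the last one frees the request. */
static void demo_req_put(struct demo_req *req)
{
    /* atomic_fetch_sub returns the previous value: 1 means we were last. */
    if (atomic_fetch_sub(&req->refs, 1) == 1) {
        atomic_fetch_add(req->free_count, 1);
        pthread_cond_destroy(&req->wake);
        pthread_mutex_destroy(&req->lock);
        free(req);
    }
}

/* Reply side: wakes the waiter, then still touches req afterwards. */
static void *demo_reply_thread(void *arg)
{
    struct demo_req *req = arg;

    pthread_mutex_lock(&req->lock);
    req->replied = 1;
    pthread_cond_signal(&req->wake);
    /* The waiter may already be running; our own reference keeps req valid. */
    pthread_mutex_unlock(&req->lock);
    demo_req_put(req);
    return NULL;
}

/* Returns how many times the request was freed, or -1 on setup failure. */
int demo_request_lifetime(void)
{
    atomic_int frees = 0;
    pthread_t tid;
    struct demo_req *req = malloc(sizeof(*req));

    if (!req)
        return -1;
    atomic_init(&req->refs, 2);  /* one for the caller, one for the reply side */
    pthread_mutex_init(&req->lock, NULL);
    pthread_cond_init(&req->wake, NULL);
    req->replied = 0;
    req->free_count = &frees;

    if (pthread_create(&tid, NULL, demo_reply_thread, req) != 0) {
        demo_req_put(req);  /* drop the reply side's reference */
        demo_req_put(req);  /* drop the caller's reference */
        return -1;
    }

    pthread_mutex_lock(&req->lock);
    while (!req->replied)
        pthread_cond_wait(&req->wake, &req->lock);
    pthread_mutex_unlock(&req->lock);
    demo_req_put(req);  /* caller is done; frees only if the reply side is too */

    pthread_join(tid, NULL);
    return atomic_load(&frees);
}
```

Calling demo_request_lifetime() from a main() should return 1 whichever
side finishes first, because neither side frees the request directly. With
a plain free() in the waiter instead of demo_req_put(), the reply thread
could still be inside pthread_mutex_unlock() on freed memory, which is the
same shape as the fault in the trace above.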
--
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab