linux-hyperv.vger.kernel.org archive mirror
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: Naman Jain <namjain@linux.microsoft.com>
Cc: "K . Y . Srinivasan" <kys@microsoft.com>,
	Haiyang Zhang <haiyangz@microsoft.com>,
	Wei Liu <wei.liu@kernel.org>, Dexuan Cui <decui@microsoft.com>,
	Stephen Hemminger <stephen@networkplumber.org>,
	linux-hyperv@vger.kernel.org, linux-kernel@vger.kernel.org,
	stable@kernel.org, Saurabh Sengar <ssengar@linux.microsoft.com>,
	Michael Kelley <mhklinux@outlook.com>,
	Long Li <longli@microsoft.com>
Subject: Re: [PATCH] uio_hv_generic: Fix sysfs creation path for ring buffer
Date: Tue, 25 Feb 2025 09:39:33 +0100
Message-ID: <2025022515-lasso-carrot-4e1d@gregkh>
In-Reply-To: <9ee65987-4353-42c6-b517-d6f52428f718@linux.microsoft.com>

On Tue, Feb 25, 2025 at 02:04:43PM +0530, Naman Jain wrote:
> 
> 
> On 2/25/2025 11:42 AM, Greg Kroah-Hartman wrote:
> > On Tue, Feb 25, 2025 at 10:50:01AM +0530, Naman Jain wrote:
> > > On regular bootup, devices get registered to vmbus first, so when
> > > uio_hv_generic driver for a particular device type is probed,
> > > the device is already initialized and added, so sysfs creation in
> > > > uio_hv_generic probe works fine. However, when the device is removed
> > > > and brought back, the channel is rescinded and the device is again
> > > > registered to vmbus. This time, the uio_hv_generic driver is
> > > > already registered to probe for that device, so sysfs
> > > > creation is attempted before the device is completely initialized.
> > > 
> > > Fix this by moving the core logic of sysfs creation for ring buffer,
> > > from uio_hv_generic to HyperV's vmbus driver, where rest of the sysfs
> > > attributes for the channels are defined. While doing that, make use
> > > of attribute groups and macros, instead of creating sysfs directly,
> > > to ensure better error handling and code flow.
> > > 
> > > Problem path:
> > > vmbus_device_register
> > >      device_register
> > >          uio_hv_generic probe
> > >                      sysfs_create_bin_file (fails here)
> > >          kset_create_and_add (dependency)
> > >          vmbus_add_channel_kobj (dependency)
> > > 
> > > Fixes: 9ab877a6ccf8 ("uio_hv_generic: make ring buffer attribute for primary channel")
> > > Cc: stable@kernel.org
> > > Suggested-by: Saurabh Sengar <ssengar@linux.microsoft.com>
> > > Suggested-by: Michael Kelley <mhklinux@outlook.com>
> > > Signed-off-by: Naman Jain <namjain@linux.microsoft.com>
> > > ---
> > > Hi,
> > > This is the first patch after initial RFC was posted.
> > > https://lore.kernel.org/all/20250214064351.8994-1-namjain@linux.microsoft.com/
> > > 
> > > Changes since RFC patch:
> > > * Different approach to solve the problem is proposed (credits to
> > >    Michael Kelley).
> > > * Core logic for sysfs creation moved out of uio_hv_generic, to VMBus
> > >    drivers where rest of the sysfs attributes for a VMBus channel
> > >    are defined. (addressed Greg's comments)
> > > * Used attribute groups instead of sysfs_create* functions, and bundled
> > >    ring attribute with other attributes for the channel sysfs.
> > > 
> > > Error logs:
> > > 
> > > [   35.574120] ------------[ cut here ]------------
> > > [   35.574122] WARNING: CPU: 0 PID: 10 at fs/sysfs/file.c:591 sysfs_create_bin_file+0x81/0x90
> > > [   35.574168] Workqueue: hv_pri_chan vmbus_add_channel_work
> > > [   35.574172] RIP: 0010:sysfs_create_bin_file+0x81/0x90
> > > [   35.574197] Call Trace:
> > > [   35.574199]  <TASK>
> > > [   35.574200]  ? show_regs+0x69/0x80
> > > [   35.574217]  ? __warn+0x8d/0x130
> > > [   35.574220]  ? sysfs_create_bin_file+0x81/0x90
> > > [   35.574222]  ? report_bug+0x182/0x190
> > > [   35.574225]  ? handle_bug+0x5b/0x90
> > > [   35.574244]  ? exc_invalid_op+0x19/0x70
> > > [   35.574247]  ? asm_exc_invalid_op+0x1b/0x20
> > > [   35.574252]  ? sysfs_create_bin_file+0x81/0x90
> > > [   35.574255]  hv_uio_probe+0x1e7/0x410 [uio_hv_generic]
> > > [   35.574271]  vmbus_probe+0x3b/0x90
> > > [   35.574275]  really_probe+0xf4/0x3b0
> > > [   35.574279]  __driver_probe_device+0x8a/0x170
> > > [   35.574282]  driver_probe_device+0x23/0xc0
> > > [   35.574285]  __device_attach_driver+0xb5/0x140
> > > [   35.574288]  ? __pfx___device_attach_driver+0x10/0x10
> > > [   35.574291]  bus_for_each_drv+0x86/0xe0
> > > [   35.574294]  __device_attach+0xc1/0x200
> > > [   35.574297]  device_initial_probe+0x13/0x20
> > > [   35.574315]  bus_probe_device+0x99/0xa0
> > > [   35.574318]  device_add+0x647/0x870
> > > [   35.574320]  ? hrtimer_init+0x28/0x70
> > > [   35.574323]  device_register+0x1b/0x30
> > > [   35.574326]  vmbus_device_register+0x83/0x130
> > > [   35.574328]  vmbus_add_channel_work+0x135/0x1a0
> > > [   35.574331]  process_one_work+0x177/0x340
> > > [   35.574348]  worker_thread+0x2b2/0x3c0
> > > [   35.574350]  kthread+0xe3/0x1f0
> > > [   35.574353]  ? __pfx_worker_thread+0x10/0x10
> > > [   35.574356]  ? __pfx_kthread+0x10/0x10
> > > 
> > > ---
> > >   drivers/hv/hyperv_vmbus.h    |  4 +++
> > >   drivers/hv/vmbus_drv.c       | 62 ++++++++++++++++++++++++++++++++++++
> > >   drivers/uio/uio_hv_generic.c | 34 ++------------------
> > >   include/linux/hyperv.h       |  3 ++
> > >   4 files changed, 72 insertions(+), 31 deletions(-)
> > > 
> > > diff --git a/drivers/hv/hyperv_vmbus.h b/drivers/hv/hyperv_vmbus.h
> > > index 29780f3a7478..e0c7b75e6c7a 100644
> > > --- a/drivers/hv/hyperv_vmbus.h
> > > +++ b/drivers/hv/hyperv_vmbus.h
> > > @@ -477,4 +477,8 @@ static inline int hv_debug_add_dev_dir(struct hv_device *dev)
> > >   #endif /* CONFIG_HYPERV_TESTING */
> > > +/* Create and remove sysfs entry for memory mapped ring buffers for a channel */
> > > +int hv_create_ring_sysfs(struct vmbus_channel *channel);
> > > +int hv_remove_ring_sysfs(struct vmbus_channel *channel);
> > > +
> > >   #endif /* _HYPERV_VMBUS_H */
> > > diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
> > > index 22afebfc28ff..0110643bad3f 100644
> > > --- a/drivers/hv/vmbus_drv.c
> > > +++ b/drivers/hv/vmbus_drv.c
> > > @@ -1802,6 +1802,39 @@ static ssize_t subchannel_id_show(struct vmbus_channel *channel,
> > >   }
> > >   static VMBUS_CHAN_ATTR_RO(subchannel_id);
> > > +/* Functions to create sysfs interface to allow mmap of the ring buffers.
> > > + * The ring buffer is allocated as contiguous memory by vmbus_open
> > > + */
> > > +static int hv_mmap_ring_buffer(struct vmbus_channel *channel, struct vm_area_struct *vma)
> > > +{
> > > +	void *ring_buffer = page_address(channel->ringbuffer_page);
> > > +
> > > +	if (channel->state != CHANNEL_OPENED_STATE)
> > > +		return -ENODEV;
> > > +
> > > +	return vm_iomap_memory(vma, virt_to_phys(ring_buffer),
> > > +			       channel->ringbuffer_pagecount << PAGE_SHIFT);
> > > +}
> > > +
> > > +static int hv_mmap_ring_buffer_wrapper(struct file *filp, struct kobject *kobj,
> > > +				       const struct bin_attribute *attr,
> > > +				       struct vm_area_struct *vma)
> > > +{
> > > +	struct vmbus_channel *channel = container_of(kobj, struct vmbus_channel, kobj);
> > > +
> > > +	if (!channel->mmap_ring_buffer)
> > > +		return -ENODEV;
> > > +	return channel->mmap_ring_buffer(channel, vma);
> > 
> > What is preventing mmap_ring_buffer from being set to NULL right after
> > checking it and then calling it here?  I see no locks here or where you
> > are assigning this variable at all, so what is preventing these types of
> > races?
> > 
> > thanks,
> > 
> > greg k-h
> 
> Thank you so much for reviewing.
> I spent some time trying to understand whether this race condition can
> happen, and the execution flow seems to be sequential for a particular
> channel of a device.
> 
> Unless hv_uio_remove (which sets channel->mmap_ring_buffer to NULL) can be
> called in parallel with hv_uio_probe (which sets
> channel->mmap_ring_buffer to non-NULL), I doubt a race can happen here.
> 
> Code Flow: (R, W-> Read, Write to channel->mmap_ring_buffer)
> 
> vmbus_device_register
>   device_register
>     hv_uio_probe
>       hv_create_ring_sysfs (W to non NULL)
>         sysfs_update_group
>           vmbus_chan_attr_is_visible (R)
>   vmbus_add_channel_kobj
>     sysfs_create_group
>       vmbus_chan_attr_is_visible  (R)
>       hv_mmap_ring_buffer_wrapper (critical section)
> 
> hv_uio_remove
>   hv_remove_ring_sysfs (W to NULL)

Yes, and right in here someone mmaps the file.

I think you can race here, no locks at all feels wrong.
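
To illustrate the concern, here is a minimal userspace sketch of one way
to close a check-then-call window on a callback pointer. It is a model
under stated assumptions, not the patch and not kernel code: struct
demo_channel, demo_set_cb(), demo_mmap_wrapper() and demo_do_mmap() are
hypothetical names, and a pthread mutex stands in for whatever lock the
driver might choose. The check and the call happen under the same lock
that the setter takes, so the pointer cannot be cleared in between.

```c
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct vmbus_channel. */
struct demo_channel {
	pthread_mutex_t lock;                      /* guards mmap_cb */
	int (*mmap_cb)(struct demo_channel *chan); /* models channel->mmap_ring_buffer */
	int mapped;                                /* number of successful callback runs */
};

/* Models the driver-provided mmap handler (assumed, for illustration). */
static int demo_do_mmap(struct demo_channel *chan)
{
	chan->mapped++;
	return 0;
}

static void demo_channel_init(struct demo_channel *chan)
{
	pthread_mutex_init(&chan->lock, NULL);
	chan->mmap_cb = NULL;
	chan->mapped = 0;
}

/*
 * Models both the "W to non NULL" and "W to NULL" steps: every write
 * to the callback pointer is made while holding the lock.
 */
static void demo_set_cb(struct demo_channel *chan,
			int (*cb)(struct demo_channel *chan))
{
	pthread_mutex_lock(&chan->lock);
	chan->mmap_cb = cb;
	pthread_mutex_unlock(&chan->lock);
}

/*
 * Models the wrapper: the NULL check and the call are done under the
 * same lock, so a concurrent demo_set_cb(chan, NULL) either completes
 * before the check or waits until the callback has returned.
 */
static int demo_mmap_wrapper(struct demo_channel *chan)
{
	int ret = -ENODEV;

	pthread_mutex_lock(&chan->lock);
	if (chan->mmap_cb)
		ret = chan->mmap_cb(chan);
	pthread_mutex_unlock(&chan->lock);

	return ret;
}
```

In this model a caller sees -ENODEV before the callback is set and after
it is cleared, and the callback runs only while it is set. Whether a
mutex is the right primitive in the real driver, as opposed to avoiding
the runtime pointer entirely, is a separate question.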

Messing with sysfs groups and files like this is rough, and almost never
a good idea.  Why can't you just do this all at once with the default
groups?  Why is this being added/removed out-of-band?

thanks,

greg k-h
