All of lore.kernel.org
 help / color / mirror / Atom feed
From: Bjorn Helgaas <helgaas@kernel.org>
To: Jiwei Sun <sjiwei@163.com>
Cc: nirmal.patel@linux.intel.com, jonathan.derrick@linux.dev,
	paul.m.stillwell.jr@intel.com, lpieralisi@kernel.org,
	kw@linux.com, robh@kernel.org, bhelgaas@google.com,
	linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org,
	sunjw10@lenovo.com, ahuang12@lenovo.com,
	Pawel Baldysiak <pawel.baldysiak@intel.com>,
	Alexey Obitotskiy <aleksey.obitotskiy@intel.com>,
	Tomasz Majchrzak <tomasz.majchrzak@intel.com>
Subject: Re: [PATCH v3] PCI: vmd: Create domain symlink before pci_bus_add_devices()
Date: Tue, 9 Jul 2024 15:59:38 -0500	[thread overview]
Message-ID: <20240709205938.GA194355@bhelgaas> (raw)
In-Reply-To: <20240605124844.24293-1-sjiwei@163.com>

[+cc Pawel, Alexey, Tomasz for mdadm history]

On Wed, Jun 05, 2024 at 08:48:44PM +0800, Jiwei Sun wrote:
> From: Jiwei Sun <sunjw10@lenovo.com>
> 
> During booting into the kernel, the following error message appears:
> 
>   (udev-worker)[2149]: nvme1n1: '/sbin/mdadm -I /dev/nvme1n1'(err) 'mdadm: Unable to get real path for '/sys/bus/pci/drivers/vmd/0000:c7:00.5/domain/device''
>   (udev-worker)[2149]: nvme1n1: '/sbin/mdadm -I /dev/nvme1n1'(err) 'mdadm: /dev/nvme1n1 is not attached to Intel(R) RAID controller.'
>   (udev-worker)[2149]: nvme1n1: '/sbin/mdadm -I /dev/nvme1n1'(err) 'mdadm: No OROM/EFI properties for /dev/nvme1n1'
>   (udev-worker)[2149]: nvme1n1: '/sbin/mdadm -I /dev/nvme1n1'(err) 'mdadm: no RAID superblock on /dev/nvme1n1.'
>   (udev-worker)[2149]: nvme1n1: Process '/sbin/mdadm -I /dev/nvme1n1' failed with exit code 1.
> 
> This symptom prevents the OS from booting successfully.

I guess the root filesystem must be on a RAID device, and it's the
failure to assemble that RAID device that prevents OS boot?  The
messages are just details about why the assembly failed?

> After a NVMe disk is probed/added by the nvme driver, the udevd executes
> some rule scripts by invoking mdadm command to detect if there is a
> mdraid associated with this NVMe disk. The mdadm determines if one
> NVMe devce is connected to a particular VMD domain by checking the
> domain symlink. Here is the root cause:

Can you tell us something about what makes this a vmd-specific issue?

I guess vmd is the only driver that creates a "domain" symlink, so
*that* part is vmd-specific.  But I guess there's something in mdadm
or its configuration that looks for that symlink?

I suppose it has to do with the mdadm code at [1] and the commit at
[2]?

[1] https://github.com/md-raid-utilities/mdadm/blob/96b8035a09b6449ea99f2eb91f9ba4f6912e5bd6/platform-intel.c#L199
[2] https://github.com/md-raid-utilities/mdadm/commit/60f0f54d6f5227f229e7131d34f93f76688b085f

I assume this is a race between vmd_enable_domain() and mdadm?  And
vmd_enable_domain() only loses the race sometimes?  Trying to figure
out why this hasn't been reported before or on non-VMD configurations.
Now that I found [2], the non-VMD part is obvious, but I'm still
curious about why we haven't seen it before.

The VMD device is sort of like another host bridge, and I wouldn't
think mdadm would normally care about a host bridge, but it looks like
mdadm does need to know about VMD for some reason.

> Thread A                   Thread B             Thread mdadm
> vmd_enable_domain
>   pci_bus_add_devices
>     __driver_probe_device
>      ...
>      work_on_cpu
>        schedule_work_on
>        : wakeup Thread B
>                            nvme_probe
>                            : wakeup scan_work
>                              to scan nvme disk
>                              and add nvme disk
>                              then wakeup udevd
>                                                 : udevd executes
>                                                   mdadm command
>        flush_work                               main
>        : wait for nvme_probe done                ...
>     __driver_probe_device                        find_driver_devices
>     : probe next nvme device                     : 1) Detect the domain
>     ...                                            symlink; 2) Find the
>     ...                                            domain symlink from
>     ...                                            vmd sysfs; 3) The
>     ...                                            domain symlink is not
>     ...                                            created yet, failed
>   sysfs_create_link
>   : create domain symlink
> 
> sysfs_create_link() is invoked at the end of vmd_enable_domain().
> However, this implementation introduces a timing issue, where mdadm
> might fail to retrieve the vmd symlink path because the symlink has not
> been created yet.
> 
> Fix the issue by creating VMD domain symlinks before invoking
> pci_bus_add_devices().
> 
> Signed-off-by: Jiwei Sun <sunjw10@lenovo.com>
> Suggested-by: Adrian Huang <ahuang12@lenovo.com>
> ---
> v3 changes:
>  - Per Paul's comment, move sysfs_remove_link() after
>    pci_stop_root_bus()
> 
> v2 changes:
>  - Add "()" after function names in subject and commit log
>  - Move sysfs_create_link() after vmd_attach_resources()
> 
>  drivers/pci/controller/vmd.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c
> index 87b7856f375a..4e7fe2e13cac 100644
> --- a/drivers/pci/controller/vmd.c
> +++ b/drivers/pci/controller/vmd.c
> @@ -925,6 +925,9 @@ static int vmd_enable_domain(struct vmd_dev *vmd, unsigned long features)
>  		dev_set_msi_domain(&vmd->bus->dev,
>  				   dev_get_msi_domain(&vmd->dev->dev));
>  
> +	WARN(sysfs_create_link(&vmd->dev->dev.kobj, &vmd->bus->dev.kobj,
> +			       "domain"), "Can't create symlink to domain\n");
> +
>  	vmd_acpi_begin();
>  
>  	pci_scan_child_bus(vmd->bus);
> @@ -964,9 +967,6 @@ static int vmd_enable_domain(struct vmd_dev *vmd, unsigned long features)
>  	pci_bus_add_devices(vmd->bus);
>  
>  	vmd_acpi_end();
> -
> -	WARN(sysfs_create_link(&vmd->dev->dev.kobj, &vmd->bus->dev.kobj,
> -			       "domain"), "Can't create symlink to domain\n");
>  	return 0;
>  }
>  
> @@ -1042,8 +1042,8 @@ static void vmd_remove(struct pci_dev *dev)
>  {
>  	struct vmd_dev *vmd = pci_get_drvdata(dev);
>  
> -	sysfs_remove_link(&vmd->dev->dev.kobj, "domain");
>  	pci_stop_root_bus(vmd->bus);
> +	sysfs_remove_link(&vmd->dev->dev.kobj, "domain");
>  	pci_remove_root_bus(vmd->bus);
>  	vmd_cleanup_srcu(vmd);
>  	vmd_detach_resources(vmd);
> -- 
> 2.27.0
> 

  parent reply	other threads:[~2024-07-09 20:59 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-06-05 12:48 [PATCH v3] PCI: vmd: Create domain symlink before pci_bus_add_devices() Jiwei Sun
2024-06-05 16:57 ` Nirmal Patel
2024-07-06  3:22 ` Krzysztof Wilczyński
2024-07-09 20:59 ` Bjorn Helgaas [this message]
2024-07-10 13:29   ` Jiwei Sun
2024-07-10 22:16     ` Bjorn Helgaas
2024-07-11  1:32       ` Jiwei Sun
2024-07-11 16:12         ` Bjorn Helgaas

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240709205938.GA194355@bhelgaas \
    --to=helgaas@kernel.org \
    --cc=ahuang12@lenovo.com \
    --cc=aleksey.obitotskiy@intel.com \
    --cc=bhelgaas@google.com \
    --cc=jonathan.derrick@linux.dev \
    --cc=kw@linux.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=lpieralisi@kernel.org \
    --cc=nirmal.patel@linux.intel.com \
    --cc=paul.m.stillwell.jr@intel.com \
    --cc=pawel.baldysiak@intel.com \
    --cc=robh@kernel.org \
    --cc=sjiwei@163.com \
    --cc=sunjw10@lenovo.com \
    --cc=tomasz.majchrzak@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.