Linux SCSI subsystem development
* Re: [PATCH] drivers: base: Introduce a new kernel parameter driver_sync_probe=
       [not found]     ` <2023120724-overstep-gesture-75be@gregkh>
@ 2023-12-07 11:59       ` Yafang Shao
  2023-12-07 12:12         ` Greg KH
  0 siblings, 1 reply; 7+ messages in thread
From: Yafang Shao @ 2023-12-07 11:59 UTC (permalink / raw)
  To: Greg KH, jejb, martin.petersen; +Cc: rafael, linux-kernel, linux-scsi

On Thu, Dec 7, 2023 at 6:19 PM Greg KH <gregkh@linuxfoundation.org> wrote:
>
> On Wed, Dec 06, 2023 at 10:08:40PM +0800, Yafang Shao wrote:
> > On Wed, Dec 6, 2023 at 9:31 PM Greg KH <gregkh@linuxfoundation.org> wrote:
> > >
> > > On Wed, Dec 06, 2023 at 11:53:55AM +0000, Yafang Shao wrote:
> > > > After upgrading our kernel from version 4.19 to 6.1, certain regressions
> > > > occurred due to the driver's asynchronous probe behavior. Specifically,
> > > > the SCSI driver transitioned to an asynchronous probe by default, resulting
> > > > in a non-fixed root disk behavior. In the prior 4.19 kernel, the root disk
> > > > was consistently identified as /dev/sda. However, with kernel 6.1, the root
> > > > disk can be any of /dev/sdX, leading to issues for applications reliant on
> > > > /dev/sda, notably impacting monitoring systems monitoring the root disk.
> > >
> > > Device names are never guaranteed to be stable, ALWAYS use persistent
> > > names like a filesystem label or other ways.  Look at /dev/disk/ for the
> > > needed ways to do this properly.
> >
> > The root disk is typically identified as /dev/sda or /dev/vda, right?
>
> Depends on your system.  It can also be identified, in the proper way,
> as /dev/disk/by-uuid/eef0abc1-4039-4c3f-a123-81fc99999993 if you want
> (note, fake uuid, use your own disk uuid please.)
>
> Why not do that?  That's the most stable and recommended way of doing
> things.

Adapting to this change isn't straightforward, especially for a large
fleet of servers. Our monitoring system needs to accommodate and
adjust accordingly.
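
As an illustration of how a monitoring agent could discover the root
device at run time instead of assuming /dev/sda, here is a minimal
Python sketch. It assumes the usual /proc/mounts layout (source, mount
point, filesystem type, options, ...). The function name root_source is
hypothetical, and the reported source may be something other than a
plain /dev/sdXN path on some systems (for example a device-mapper node).

```python
def root_source(mounts_text):
    """Return the source of the last mount whose mount point is "/", or None."""
    source = None
    for line in mounts_text.splitlines():
        fields = line.split()
        # Assumed layout: source, mount point, fs type, options, dump, pass.
        if len(fields) >= 2 and fields[1] == "/":
            source = fields[0]  # a later entry for "/" shadows an earlier one
    return source


# Usage (on a Linux host):
#     with open("/proc/mounts") as f:
#         print(root_source(f.read()))
```

The parser takes the file contents as a string so it can be exercised
without touching the running system.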

>
> > This is because the root disk, which houses the operating system,
> > cannot be removed or hotplugged.
>
> Not true at all, happens for many systems (think about how systems that
> run their whole OS out of ram work...)
>
> > Therefore, it usually remains as the
> > first disk in the system. With the synchronous probe, the root disk
> > maintains a stable and consistent identification.
> >
> > >
> > > > To address this, a new kernel parameter 'driver_sync_probe=' is introduced
> > > > to enforce synchronous probe behavior for specific drivers.
> > >
> > > This should be a per-bus thing, not a driver-specific thing as drivers
> > > for the same bus could have differing settings here which would cause a
> > > mess.
> > >
> > > Please just revert the scsi bus functionality if you have had
> > > regressions here, it's not a driver-core thing to do.
> >
> > Are you suggesting a reversal of the asynchronous probe code in the
> > SCSI driver?
>
> For your broken scsi driver, yes.
>
> > While reverting to synchronous probing could ensure
> > stability, it's worth noting that asynchronous probing can potentially
> > shorten the reboot duration under specific conditions. Thus, there
> > might be some resistance to reverting this change as it offers
> > performance benefits in certain scenarios. That's why I prefer to
> > introduce a kernel parameter for it.
>
> I don't want to add a new parameter that we need to support for forever
> and add to the complexity of the system unless it is REALLY needed.

BTW, since there's already a 'driver_async_probe=', introducing
another 'driver_sync_probe=' wouldn't significantly increase the
maintenance overhead.

> Please work with the scsi developers to resolve the issue for your
> hardware, as it's been working for everyone else for well over a year
> now, right?

I have added the SCSI maintainers to this thread.

I'm uncertain whether it's possible to add SCSI kernel parameters
selectively. If that's not feasible, we'll need to maintain the
following modification in our local kernel:

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index e934779..8148d12 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -607,7 +607,7 @@ static void sd_set_flush_flag(struct scsi_disk *sdkp)
                .name           = "sd",
                .owner          = THIS_MODULE,
                .probe          = sd_probe,
-               .probe_type     = PROBE_PREFER_ASYNCHRONOUS,
+               .probe_type     = PROBE_FORCE_SYNCHRONOUS,
                .remove         = sd_remove,
                .shutdown       = sd_shutdown,
                .pm             = &sd_pm_ops,




--
Regards
Yafang


* Re: [PATCH] drivers: base: Introduce a new kernel parameter driver_sync_probe=
  2023-12-07 11:59       ` [PATCH] drivers: base: Introduce a new kernel parameter driver_sync_probe= Yafang Shao
@ 2023-12-07 12:12         ` Greg KH
  2023-12-07 12:36           ` Yafang Shao
  0 siblings, 1 reply; 7+ messages in thread
From: Greg KH @ 2023-12-07 12:12 UTC (permalink / raw)
  To: Yafang Shao; +Cc: jejb, martin.petersen, rafael, linux-kernel, linux-scsi

On Thu, Dec 07, 2023 at 07:59:03PM +0800, Yafang Shao wrote:
> On Thu, Dec 7, 2023 at 6:19 PM Greg KH <gregkh@linuxfoundation.org> wrote:
> >
> > On Wed, Dec 06, 2023 at 10:08:40PM +0800, Yafang Shao wrote:
> > > On Wed, Dec 6, 2023 at 9:31 PM Greg KH <gregkh@linuxfoundation.org> wrote:
> > > >
> > > > On Wed, Dec 06, 2023 at 11:53:55AM +0000, Yafang Shao wrote:
> > > > > After upgrading our kernel from version 4.19 to 6.1, certain regressions
> > > > > occurred due to the driver's asynchronous probe behavior. Specifically,
> > > > > the SCSI driver transitioned to an asynchronous probe by default, resulting
> > > > > in a non-fixed root disk behavior. In the prior 4.19 kernel, the root disk
> > > > > was consistently identified as /dev/sda. However, with kernel 6.1, the root
> > > > > disk can be any of /dev/sdX, leading to issues for applications reliant on
> > > > > /dev/sda, notably impacting monitoring systems monitoring the root disk.
> > > >
> > > > > Device names are never guaranteed to be stable, ALWAYS use persistent
> > > > > names like a filesystem label or other ways.  Look at /dev/disk/ for the
> > > > needed ways to do this properly.
> > >
> > > The root disk is typically identified as /dev/sda or /dev/vda, right?
> >
> > Depends on your system.  It can also be identified, in the proper way,
> > as /dev/disk/by-uuid/eef0abc1-4039-4c3f-a123-81fc99999993 if you want
> > (note, fake uuid, use your own disk uuid please.)
> >
> > Why not do that?  That's the most stable and recommended way of doing
> > things.
> 
> Adapting to this change isn't straightforward, especially for a large
> fleet of servers. Our monitoring system needs to accommodate and
> adjust accordingly.

Agreed, that can be rough.  But as this is an issue that was caused by a
scsi core change, perhaps the scsi developers can describe why it's ok.

But really, device naming has ALWAYS been known to not be
deterministic, which is why Pat and I did all the driver core work 20+
years ago so that you have the ability to properly name your devices in
a way that is deterministic.  Using the kernel name like sda is NOT
using that functionality, so while it has been nice to see that it has
been stable for you for a while, you are playing with fire here and will
get burned one day when the firmware in your devices decides to change
response times.
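
To make the point concrete, here is a minimal Python sketch of resolving
a persistent name to whatever kernel name it happens to have on the
current boot. It assumes the entries under /dev/disk/by-*/ are symlinks
to the kernel device nodes. The function name kernel_name and the link
name in the usage comment are hypothetical.

```python
import os


def kernel_name(persistent_path):
    """Resolve a /dev/disk/by-*/ symlink to the kernel name it points at now."""
    target = os.path.realpath(persistent_path)
    return os.path.basename(target)


# Usage (the link name below is a placeholder, use one from your own system):
#     kernel_name("/dev/disk/by-id/wwn-0xEXAMPLE")
```

A tool that only understands kernel names can be handed the result of
this lookup at start-up, so the configuration itself never mentions sda.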

> > > While reverting to synchronous probing could ensure
> > > stability, it's worth noting that asynchronous probing can potentially
> > > shorten the reboot duration under specific conditions. Thus, there
> > > might be some resistance to reverting this change as it offers
> > > performance benefits in certain scenarios. That's why I prefer to
> > > introduce a kernel parameter for it.
> >
> > I don't want to add a new parameter that we need to support for forever
> > and add to the complexity of the system unless it is REALLY needed.
> 
> BTW, since there's already a 'driver_async_probe=', introducing
> another 'driver_sync_probe=' wouldn't significantly increase the
> maintenance overhead.

Any new code adds maintenance overhead and complexity, so you have to
justify it's existance especially when you are not going to be the one
maintaining it :)

thanks,

greg k-h


* Re: [PATCH] drivers: base: Introduce a new kernel parameter driver_sync_probe=
  2023-12-07 12:12         ` Greg KH
@ 2023-12-07 12:36           ` Yafang Shao
  2023-12-08  5:36             ` Greg KH
  0 siblings, 1 reply; 7+ messages in thread
From: Yafang Shao @ 2023-12-07 12:36 UTC (permalink / raw)
  To: Greg KH; +Cc: jejb, martin.petersen, rafael, linux-kernel, linux-scsi

On Thu, Dec 7, 2023 at 8:12 PM Greg KH <gregkh@linuxfoundation.org> wrote:
>
> On Thu, Dec 07, 2023 at 07:59:03PM +0800, Yafang Shao wrote:
> > On Thu, Dec 7, 2023 at 6:19 PM Greg KH <gregkh@linuxfoundation.org> wrote:
> > >
> > > On Wed, Dec 06, 2023 at 10:08:40PM +0800, Yafang Shao wrote:
> > > > On Wed, Dec 6, 2023 at 9:31 PM Greg KH <gregkh@linuxfoundation.org> wrote:
> > > > >
> > > > > On Wed, Dec 06, 2023 at 11:53:55AM +0000, Yafang Shao wrote:
> > > > > > After upgrading our kernel from version 4.19 to 6.1, certain regressions
> > > > > > occurred due to the driver's asynchronous probe behavior. Specifically,
> > > > > > the SCSI driver transitioned to an asynchronous probe by default, resulting
> > > > > > in a non-fixed root disk behavior. In the prior 4.19 kernel, the root disk
> > > > > > was consistently identified as /dev/sda. However, with kernel 6.1, the root
> > > > > > disk can be any of /dev/sdX, leading to issues for applications reliant on
> > > > > > /dev/sda, notably impacting monitoring systems monitoring the root disk.
> > > > >
> > > > > > Device names are never guaranteed to be stable, ALWAYS use persistent
> > > > > > names like a filesystem label or other ways.  Look at /dev/disk/ for the
> > > > > needed ways to do this properly.
> > > >
> > > > The root disk is typically identified as /dev/sda or /dev/vda, right?
> > >
> > > Depends on your system.  It can also be identified, in the proper way,
> > > as /dev/disk/by-uuid/eef0abc1-4039-4c3f-a123-81fc99999993 if you want
> > > (note, fake uuid, use your own disk uuid please.)
> > >
> > > Why not do that?  That's the most stable and recommended way of doing
> > > things.
> >
> > Adapting to this change isn't straightforward, especially for a large
> > fleet of servers. Our monitoring system needs to accommodate and
> > adjust accordingly.
>
> Agreed, that can be rough.  But as this is an issue that was caused by a
> scsi core change, perhaps the scsi developers can describe why it's ok.
>
> But really, device naming has ALWAYS been known to not be
> deterministic, which is why Pat and I did all the driver core work 20+
> years ago so that you have the ability to properly name your devices in
> a way that is deterministic.  Using the kernel name like sda is NOT
> using that functionality, so while it has been nice to see that it has
> been stable for you for a while, you are playing with fire here and will
> get burned one day when the firmware in your devices decides to change
> response times.

I agree that using UUID is a better approach. However, it's worth
noting that the widely used IO monitoring tool 'iostat' faces
challenges when working with UUIDs. This indicates that there's a
significant amount of work ahead of us in this aspect.
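
One possible way around a tool that only reports kernel names is to read
the counters directly and relabel them. The Python sketch below assumes
the /proc/diskstats layout described in the kernel's iostats
documentation (major, minor, name, then reads completed as the first
counter and writes completed as the fifth). The function names and the
alias value in the usage comment are hypothetical.

```python
def parse_diskstats(text):
    """Map kernel device name -> (reads completed, writes completed)."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 8:
            continue  # skip blank or truncated lines
        # fields[2] is the device name; fields[3] and fields[7] are the
        # first and fifth counters (reads completed, writes completed).
        stats[fields[2]] = (int(fields[3]), int(fields[7]))
    return stats


def relabel(stats, aliases):
    """Re-key stats by a persistent alias where one is known."""
    return {aliases.get(name, name): counters for name, counters in stats.items()}


# Usage (on a Linux host; the alias is a placeholder):
#     with open("/proc/diskstats") as f:
#         print(relabel(parse_diskstats(f.read()), {"sda": "wwn-0xEXAMPLE"}))
```

Devices without a known alias keep their kernel name, so nothing is
silently dropped from the report.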


>
> > > > While reverting to synchronous probing could ensure
> > > > stability, it's worth noting that asynchronous probing can potentially
> > > > shorten the reboot duration under specific conditions. Thus, there
> > > > might be some resistance to reverting this change as it offers
> > > > performance benefits in certain scenarios. That's why I prefer to
> > > > introduce a kernel parameter for it.
> > >
> > > I don't want to add a new parameter that we need to support for forever
> > > and add to the complexity of the system unless it is REALLY needed.
> >
> > BTW, since there's already a 'driver_async_probe=', introducing
> > another 'driver_sync_probe=' wouldn't significantly increase the
> > maintenance overhead.
>
> Any new code adds maintenance overhead and complexity, so you have to
> justify its existence, especially when you are not going to be the one
> maintaining it :)

Understood.


-- 
Regards
Yafang


* Re: [PATCH] drivers: base: Introduce a new kernel parameter driver_sync_probe=
  2023-12-07 12:36           ` Yafang Shao
@ 2023-12-08  5:36             ` Greg KH
  2023-12-08  6:49               ` Yafang Shao
  0 siblings, 1 reply; 7+ messages in thread
From: Greg KH @ 2023-12-08  5:36 UTC (permalink / raw)
  To: Yafang Shao; +Cc: jejb, martin.petersen, rafael, linux-kernel, linux-scsi

On Thu, Dec 07, 2023 at 08:36:56PM +0800, Yafang Shao wrote:
> On Thu, Dec 7, 2023 at 8:12 PM Greg KH <gregkh@linuxfoundation.org> wrote:
> >
> > On Thu, Dec 07, 2023 at 07:59:03PM +0800, Yafang Shao wrote:
> > > On Thu, Dec 7, 2023 at 6:19 PM Greg KH <gregkh@linuxfoundation.org> wrote:
> > > >
> > > > On Wed, Dec 06, 2023 at 10:08:40PM +0800, Yafang Shao wrote:
> > > > > On Wed, Dec 6, 2023 at 9:31 PM Greg KH <gregkh@linuxfoundation.org> wrote:
> > > > > >
> > > > > > On Wed, Dec 06, 2023 at 11:53:55AM +0000, Yafang Shao wrote:
> > > > > > > After upgrading our kernel from version 4.19 to 6.1, certain regressions
> > > > > > > occurred due to the driver's asynchronous probe behavior. Specifically,
> > > > > > > the SCSI driver transitioned to an asynchronous probe by default, resulting
> > > > > > > in a non-fixed root disk behavior. In the prior 4.19 kernel, the root disk
> > > > > > > was consistently identified as /dev/sda. However, with kernel 6.1, the root
> > > > > > > disk can be any of /dev/sdX, leading to issues for applications reliant on
> > > > > > > /dev/sda, notably impacting monitoring systems monitoring the root disk.
> > > > > >
> > > > > > > Device names are never guaranteed to be stable, ALWAYS use persistent
> > > > > > > names like a filesystem label or other ways.  Look at /dev/disk/ for the
> > > > > > needed ways to do this properly.
> > > > >
> > > > > The root disk is typically identified as /dev/sda or /dev/vda, right?
> > > >
> > > > Depends on your system.  It can also be identified, in the proper way,
> > > > as /dev/disk/by-uuid/eef0abc1-4039-4c3f-a123-81fc99999993 if you want
> > > > (note, fake uuid, use your own disk uuid please.)
> > > >
> > > > Why not do that?  That's the most stable and recommended way of doing
> > > > things.
> > >
> > > Adapting to this change isn't straightforward, especially for a large
> > > fleet of servers. Our monitoring system needs to accommodate and
> > > adjust accordingly.
> >
> > Agreed, that can be rough.  But as this is an issue that was caused by a
> > scsi core change, perhaps the scsi developers can describe why it's ok.
> >
> > But really, device naming has ALWAYS been known to not be
> > deterministic, which is why Pat and I did all the driver core work 20+
> > years ago so that you have the ability to properly name your devices in
> > a way that is deterministic.  Using the kernel name like sda is NOT
> > using that functionality, so while it has been nice to see that it has
> > been stable for you for a while, you are playing with fire here and will
> > get burned one day when the firmware in your devices decides to change
> > response times.
> 
> I agree that using UUID is a better approach. However, it's worth
> noting that the widely used IO monitoring tool 'iostat' faces
> challenges when working with UUIDs. This indicates that there's a
> significant amount of work ahead of us in this aspect.

That indicates that iostat needs to be fixed as this has been an option
that people rely on for 20+ years now.  Or use a better tool :)

thanks,

greg k-h


* Re: [PATCH] drivers: base: Introduce a new kernel parameter driver_sync_probe=
  2023-12-08  5:36             ` Greg KH
@ 2023-12-08  6:49               ` Yafang Shao
  2023-12-08  7:15                 ` Greg KH
  0 siblings, 1 reply; 7+ messages in thread
From: Yafang Shao @ 2023-12-08  6:49 UTC (permalink / raw)
  To: Greg KH; +Cc: jejb, martin.petersen, rafael, linux-kernel, linux-scsi

On Fri, Dec 8, 2023 at 1:36 PM Greg KH <gregkh@linuxfoundation.org> wrote:
>
> On Thu, Dec 07, 2023 at 08:36:56PM +0800, Yafang Shao wrote:
> > On Thu, Dec 7, 2023 at 8:12 PM Greg KH <gregkh@linuxfoundation.org> wrote:
> > >
> > > On Thu, Dec 07, 2023 at 07:59:03PM +0800, Yafang Shao wrote:
> > > > On Thu, Dec 7, 2023 at 6:19 PM Greg KH <gregkh@linuxfoundation.org> wrote:
> > > > >
> > > > > On Wed, Dec 06, 2023 at 10:08:40PM +0800, Yafang Shao wrote:
> > > > > > On Wed, Dec 6, 2023 at 9:31 PM Greg KH <gregkh@linuxfoundation.org> wrote:
> > > > > > >
> > > > > > > On Wed, Dec 06, 2023 at 11:53:55AM +0000, Yafang Shao wrote:
> > > > > > > > After upgrading our kernel from version 4.19 to 6.1, certain regressions
> > > > > > > > occurred due to the driver's asynchronous probe behavior. Specifically,
> > > > > > > > the SCSI driver transitioned to an asynchronous probe by default, resulting
> > > > > > > > in a non-fixed root disk behavior. In the prior 4.19 kernel, the root disk
> > > > > > > > was consistently identified as /dev/sda. However, with kernel 6.1, the root
> > > > > > > > disk can be any of /dev/sdX, leading to issues for applications reliant on
> > > > > > > > /dev/sda, notably impacting monitoring systems monitoring the root disk.
> > > > > > >
> > > > > > > > Device names are never guaranteed to be stable, ALWAYS use persistent
> > > > > > > > names like a filesystem label or other ways.  Look at /dev/disk/ for the
> > > > > > > needed ways to do this properly.
> > > > > >
> > > > > > The root disk is typically identified as /dev/sda or /dev/vda, right?
> > > > >
> > > > > Depends on your system.  It can also be identified, in the proper way,
> > > > > as /dev/disk/by-uuid/eef0abc1-4039-4c3f-a123-81fc99999993 if you want
> > > > > (note, fake uuid, use your own disk uuid please.)
> > > > >
> > > > > Why not do that?  That's the most stable and recommended way of doing
> > > > > things.
> > > >
> > > > Adapting to this change isn't straightforward, especially for a large
> > > > fleet of servers. Our monitoring system needs to accommodate and
> > > > adjust accordingly.
> > >
> > > Agreed, that can be rough.  But as this is an issue that was caused by a
> > > scsi core change, perhaps the scsi developers can describe why it's ok.
> > >
> > > But really, device naming has ALWAYS been known to not be
> > > deterministic, which is why Pat and I did all the driver core work 20+
> > > years ago so that you have the ability to properly name your devices in
> > > a way that is deterministic.  Using the kernel name like sda is NOT
> > > using that functionality, so while it has been nice to see that it has
> > > been stable for you for a while, you are playing with fire here and will
> > > get burned one day when the firmware in your devices decides to change
> > > response times.
> >
> > I agree that using UUID is a better approach. However, it's worth
> > noting that the widely used IO monitoring tool 'iostat' faces
> > challenges when working with UUIDs. This indicates that there's a
> > significant amount of work ahead of us in this aspect.
>
> That indicates that iostat needs to be fixed as this has been an option
> that people rely on for 20+ years now.  Or use a better tool :)

The issue arises when a disk contains multiple partitions, such as
/dev/sda1 and /dev/sda2. In this case, using 'iostat -j UUID' can only
display 'sda' since only its partitions possess UUIDs. Uncertain how
to address it yet.
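
For reference, the partition-to-disk relationship is visible in sysfs,
so a script can go from a partition (found through its filesystem UUID)
to the whole disk. The Python sketch below assumes that each entry under
/sys/class/block is a symlink into the device tree, that a partition's
directory sits inside its parent disk's directory, and that partitions
carry a "partition" attribute file. The function name parent_disk is
hypothetical.

```python
import os


def parent_disk(name, sysfs_block="/sys/class/block"):
    """Return the whole-disk kernel name for a partition, or name for a whole disk."""
    real = os.path.realpath(os.path.join(sysfs_block, name))
    # Assumption: only partitions have a "partition" attribute, and their
    # sysfs directory is a child of the parent disk's directory.
    if os.path.exists(os.path.join(real, "partition")):
        return os.path.basename(os.path.dirname(real))
    return name


# Usage (on a Linux host):
#     parent_disk("sda1")
```

Combined with a by-uuid lookup, this yields the disk that holds a given
filesystem without relying on the name being sda.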

-- 
Regards
Yafang


* Re: [PATCH] drivers: base: Introduce a new kernel parameter driver_sync_probe=
  2023-12-08  6:49               ` Yafang Shao
@ 2023-12-08  7:15                 ` Greg KH
  2023-12-08  7:26                   ` Yafang Shao
  0 siblings, 1 reply; 7+ messages in thread
From: Greg KH @ 2023-12-08  7:15 UTC (permalink / raw)
  To: Yafang Shao; +Cc: jejb, martin.petersen, rafael, linux-kernel, linux-scsi

On Fri, Dec 08, 2023 at 02:49:39PM +0800, Yafang Shao wrote:
> On Fri, Dec 8, 2023 at 1:36 PM Greg KH <gregkh@linuxfoundation.org> wrote:
> >
> > On Thu, Dec 07, 2023 at 08:36:56PM +0800, Yafang Shao wrote:
> > > On Thu, Dec 7, 2023 at 8:12 PM Greg KH <gregkh@linuxfoundation.org> wrote:
> > > >
> > > > On Thu, Dec 07, 2023 at 07:59:03PM +0800, Yafang Shao wrote:
> > > > > On Thu, Dec 7, 2023 at 6:19 PM Greg KH <gregkh@linuxfoundation.org> wrote:
> > > > > >
> > > > > > On Wed, Dec 06, 2023 at 10:08:40PM +0800, Yafang Shao wrote:
> > > > > > > On Wed, Dec 6, 2023 at 9:31 PM Greg KH <gregkh@linuxfoundation.org> wrote:
> > > > > > > >
> > > > > > > > On Wed, Dec 06, 2023 at 11:53:55AM +0000, Yafang Shao wrote:
> > > > > > > > > After upgrading our kernel from version 4.19 to 6.1, certain regressions
> > > > > > > > > occurred due to the driver's asynchronous probe behavior. Specifically,
> > > > > > > > > the SCSI driver transitioned to an asynchronous probe by default, resulting
> > > > > > > > > in a non-fixed root disk behavior. In the prior 4.19 kernel, the root disk
> > > > > > > > > was consistently identified as /dev/sda. However, with kernel 6.1, the root
> > > > > > > > > disk can be any of /dev/sdX, leading to issues for applications reliant on
> > > > > > > > > /dev/sda, notably impacting monitoring systems monitoring the root disk.
> > > > > > > >
> > > > > > > > > Device names are never guaranteed to be stable, ALWAYS use persistent
> > > > > > > > > names like a filesystem label or other ways.  Look at /dev/disk/ for the
> > > > > > > > needed ways to do this properly.
> > > > > > >
> > > > > > > The root disk is typically identified as /dev/sda or /dev/vda, right?
> > > > > >
> > > > > > Depends on your system.  It can also be identified, in the proper way,
> > > > > > as /dev/disk/by-uuid/eef0abc1-4039-4c3f-a123-81fc99999993 if you want
> > > > > > (note, fake uuid, use your own disk uuid please.)
> > > > > >
> > > > > > Why not do that?  That's the most stable and recommended way of doing
> > > > > > things.
> > > > >
> > > > > Adapting to this change isn't straightforward, especially for a large
> > > > > fleet of servers. Our monitoring system needs to accommodate and
> > > > > adjust accordingly.
> > > >
> > > > Agreed, that can be rough.  But as this is an issue that was caused by a
> > > > scsi core change, perhaps the scsi developers can describe why it's ok.
> > > >
> > > > But really, device naming has ALWAYS been known to not be
> > > > deterministic, which is why Pat and I did all the driver core work 20+
> > > > years ago so that you have the ability to properly name your devices in
> > > > a way that is deterministic.  Using the kernel name like sda is NOT
> > > > using that functionality, so while it has been nice to see that it has
> > > > been stable for you for a while, you are playing with fire here and will
> > > > get burned one day when the firmware in your devices decides to change
> > > > response times.
> > >
> > > I agree that using UUID is a better approach. However, it's worth
> > > noting that the widely used IO monitoring tool 'iostat' faces
> > > challenges when working with UUIDs. This indicates that there's a
> > > significant amount of work ahead of us in this aspect.
> >
> > That indicates that iostat needs to be fixed as this has been an option
> > that people rely on for 20+ years now.  Or use a better tool :)
> 
> The issue arises when a disk contains multiple partitions, such as
> /dev/sda1 and /dev/sda2. In this case, using 'iostat -j UUID' can only
> display 'sda' since only its partitions possess UUIDs. Uncertain how
> to address it yet.

Then use one of the many other unique ids that are in /dev/disk/
today.  You have loads of things to choose from:
	$ ls /dev/disk/
	by-diskseq  by-id  by-label  by-partlabel  by-partuuid  by-path  by-uuid

You have a plethora of choices here, use whatever works best for your
systems.  This is a userspace decision to make, not a kernel one, as
this is a policy choice of yours.
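
To survey which of these identifiers exist for each device on a given
machine, a short Python sketch like the following may help. It assumes
every subdirectory of /dev/disk holds symlinks to kernel device nodes.
The function name persistent_aliases is hypothetical.

```python
import os


def persistent_aliases(disk_dir="/dev/disk"):
    """Map kernel name -> list of 'by-xxx/link' aliases currently pointing at it."""
    aliases = {}
    for scheme in sorted(os.listdir(disk_dir)):
        scheme_dir = os.path.join(disk_dir, scheme)
        if not os.path.isdir(scheme_dir):
            continue
        for link in sorted(os.listdir(scheme_dir)):
            # Resolve the symlink and keep only the kernel name (e.g. "sda1").
            target = os.path.realpath(os.path.join(scheme_dir, link))
            aliases.setdefault(os.path.basename(target), []).append(f"{scheme}/{link}")
    return aliases


# Usage (on a Linux host):
#     for name, links in persistent_aliases().items():
#         print(name, links)
```

Whole disks typically show up under schemes such as by-id or by-path
even when they have no filesystem UUID of their own.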

good luck!

greg k-h


* Re: [PATCH] drivers: base: Introduce a new kernel parameter driver_sync_probe=
  2023-12-08  7:15                 ` Greg KH
@ 2023-12-08  7:26                   ` Yafang Shao
  0 siblings, 0 replies; 7+ messages in thread
From: Yafang Shao @ 2023-12-08  7:26 UTC (permalink / raw)
  To: Greg KH; +Cc: jejb, martin.petersen, rafael, linux-kernel, linux-scsi

On Fri, Dec 8, 2023 at 3:15 PM Greg KH <gregkh@linuxfoundation.org> wrote:
>
> On Fri, Dec 08, 2023 at 02:49:39PM +0800, Yafang Shao wrote:
> > On Fri, Dec 8, 2023 at 1:36 PM Greg KH <gregkh@linuxfoundation.org> wrote:
> > >
> > > On Thu, Dec 07, 2023 at 08:36:56PM +0800, Yafang Shao wrote:
> > > > On Thu, Dec 7, 2023 at 8:12 PM Greg KH <gregkh@linuxfoundation.org> wrote:
> > > > >
> > > > > On Thu, Dec 07, 2023 at 07:59:03PM +0800, Yafang Shao wrote:
> > > > > > On Thu, Dec 7, 2023 at 6:19 PM Greg KH <gregkh@linuxfoundation.org> wrote:
> > > > > > >
> > > > > > > On Wed, Dec 06, 2023 at 10:08:40PM +0800, Yafang Shao wrote:
> > > > > > > > On Wed, Dec 6, 2023 at 9:31 PM Greg KH <gregkh@linuxfoundation.org> wrote:
> > > > > > > > >
> > > > > > > > > On Wed, Dec 06, 2023 at 11:53:55AM +0000, Yafang Shao wrote:
> > > > > > > > > > After upgrading our kernel from version 4.19 to 6.1, certain regressions
> > > > > > > > > > occurred due to the driver's asynchronous probe behavior. Specifically,
> > > > > > > > > > the SCSI driver transitioned to an asynchronous probe by default, resulting
> > > > > > > > > > in a non-fixed root disk behavior. In the prior 4.19 kernel, the root disk
> > > > > > > > > > was consistently identified as /dev/sda. However, with kernel 6.1, the root
> > > > > > > > > > disk can be any of /dev/sdX, leading to issues for applications reliant on
> > > > > > > > > > /dev/sda, notably impacting monitoring systems monitoring the root disk.
> > > > > > > > >
> > > > > > > > > > Device names are never guaranteed to be stable, ALWAYS use persistent
> > > > > > > > > > names like a filesystem label or other ways.  Look at /dev/disk/ for the
> > > > > > > > > needed ways to do this properly.
> > > > > > > >
> > > > > > > > The root disk is typically identified as /dev/sda or /dev/vda, right?
> > > > > > >
> > > > > > > Depends on your system.  It can also be identified, in the proper way,
> > > > > > > as /dev/disk/by-uuid/eef0abc1-4039-4c3f-a123-81fc99999993 if you want
> > > > > > > (note, fake uuid, use your own disk uuid please.)
> > > > > > >
> > > > > > > Why not do that?  That's the most stable and recommended way of doing
> > > > > > > things.
> > > > > >
> > > > > > Adapting to this change isn't straightforward, especially for a large
> > > > > > fleet of servers. Our monitoring system needs to accommodate and
> > > > > > adjust accordingly.
> > > > >
> > > > > Agreed, that can be rough.  But as this is an issue that was caused by a
> > > > > scsi core change, perhaps the scsi developers can describe why it's ok.
> > > > >
> > > > > But really, device naming has ALWAYS been known to not be
> > > > > deterministic, which is why Pat and I did all the driver core work 20+
> > > > > years ago so that you have the ability to properly name your devices in
> > > > > a way that is deterministic.  Using the kernel name like sda is NOT
> > > > > using that functionality, so while it has been nice to see that it has
> > > > > been stable for you for a while, you are playing with fire here and will
> > > > > get burned one day when the firmware in your devices decides to change
> > > > > response times.
> > > >
> > > > I agree that using UUID is a better approach. However, it's worth
> > > > noting that the widely used IO monitoring tool 'iostat' faces
> > > > challenges when working with UUIDs. This indicates that there's a
> > > > significant amount of work ahead of us in this aspect.
> > >
> > > That indicates that iostat needs to be fixed as this has been an option
> > > that people rely on for 20+ years now.  Or use a better tool :)
> >
> > The issue arises when a disk contains multiple partitions, such as
> > /dev/sda1 and /dev/sda2. In this case, using 'iostat -j UUID' can only
> > display 'sda' since only its partitions possess UUIDs. Uncertain how
> > to address it yet.
>
> Then use one of the many other unique ids that are in /dev/disk/
> today.  You have loads of things to choose from:
>         $ ls /dev/disk/
>         by-diskseq  by-id  by-label  by-partlabel  by-partuuid  by-path  by-uuid
>
> You have a plethora of choices here, use whatever works best for your
> systems.  This is a userspace decision to make, not a kernel one, as
> this is a policy choice of yours.
>

Indeed, there are alternative methods besides using UUIDs. This
example serves to highlight that UUIDs might not cover all scenarios,
similar to other IDs listed under /dev/disk/.

-- 
Regards
Yafang


end of thread, other threads:[~2023-12-08  7:26 UTC | newest]

Thread overview: 7+ messages
     [not found] <20231206115355.4319-1-laoar.shao@gmail.com>
     [not found] ` <2023120644-pry-worried-22a2@gregkh>
     [not found]   ` <CALOAHbDtFKDh7C0NYeZ0xBV1z3AsNBDdnL7qRtWOrGbaU7W9VQ@mail.gmail.com>
     [not found]     ` <2023120724-overstep-gesture-75be@gregkh>
2023-12-07 11:59       ` [PATCH] drivers: base: Introduce a new kernel parameter driver_sync_probe= Yafang Shao
2023-12-07 12:12         ` Greg KH
2023-12-07 12:36           ` Yafang Shao
2023-12-08  5:36             ` Greg KH
2023-12-08  6:49               ` Yafang Shao
2023-12-08  7:15                 ` Greg KH
2023-12-08  7:26                   ` Yafang Shao
