From: John Garry <john.g.garry@oracle.com>
To: Bart Van Assche <bvanassche@acm.org>,
"Martin K . Petersen" <martin.petersen@oracle.com>
Cc: linux-scsi@vger.kernel.org, Hannes Reinecke <hare@suse.de>,
"James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Subject: Re: [PATCH v4 04/29] scsi: core: Support allocating a pseudo SCSI device
Date: Tue, 16 Sep 2025 09:21:09 +0100
Message-ID: <4c865666-a2d9-4037-9762-4bed3490a4ea@oracle.com>
In-Reply-To: <20250912182340.3487688-5-bvanassche@acm.org>
On 12/09/2025 19:21, Bart Van Assche wrote:
> From: Hannes Reinecke <hare@suse.de>
>
> Allocate a pseudo SCSI device if 'nr_reserved_cmds' has been set. Pseudo
> SCSI devices have the SCSI ID <max_id>:U64_MAX so they won't clash with
> any devices the LLD might create. Pseudo SCSI devices are excluded from
> scanning and will not show up in sysfs. Additionally, pseudo SCSI
> devices are skipped by shost_for_each_device(). This prevents that the
> SCSI error handler tries to submit a reset to a non-existent logical unit.
>
> Do not allocate a budget map for pseudo SCSI devices since the
> cmd_per_lun limit does not apply to pseudo SCSI devices.
IDGI, in the v3 series you said that you would allocate the budget map:
https://lore.kernel.org/linux-scsi/20250827000816.2370150-1-bvanassche@acm.org/T/#m13c361e081b886b9318238b6dc05b571840b9698
FWIW, I still think that it is worth allocating the budget map for the
pseudo sdev and making the queue depth the same as nr_reserved_cmds
via a scsi_change_queue_depth() call.
If we want to optimise budget code handling, then I think that it is
worth doing later. The whole budget map and cmd_per_lun handling is
murky IMHO.
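To make the suggestion concrete, here is a rough user-space model of what I mean, not kernel code. The struct and the model_* helpers are made up for illustration; the real budget map is an sbitmap and the real call would be scsi_change_queue_depth(). The idea is just that the pseudo sdev gets a budget whose depth equals the number of reserved commands, so the normal budget path needs no special casing:

```c
#include <stdbool.h>

/* Hypothetical model of a per-device budget: a plain counter, not an sbitmap. */
struct budget_model {
	unsigned int depth;	/* queue depth of the device */
	unsigned int in_use;	/* budget tokens currently held */
};

/* Models setting the queue depth; a zero depth is ignored in this model. */
static void model_change_queue_depth(struct budget_model *b, unsigned int depth)
{
	if (depth > 0)
		b->depth = depth;
}

/* Take one budget token; fails once 'depth' commands are in flight. */
static bool model_get_budget(struct budget_model *b)
{
	if (b->in_use >= b->depth)
		return false;
	b->in_use++;
	return true;
}

/* Give one budget token back when a command completes. */
static void model_put_budget(struct budget_model *b)
{
	if (b->in_use)
		b->in_use--;
}
```

With the depth set to the reserved command count, the budget should never be tighter than the reserved tag space itself, which is why I think any optimisation here can wait.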
>
> Do not perform queue depth ramp up / ramp down for pseudo SCSI devices.
Are we ever even going to try to ramp up or down for the pseudo sdev?
I suppose we could see it if there is some internal reserved command
error, right?
>
> Pseudo SCSI devices will be used to send internal commands to a storage
> device.
>
> Cc: John Garry <john.g.garry@oracle.com>
> Signed-off-by: Hannes Reinecke <hare@suse.de>
> [ bvanassche: edited patch description / renamed host_sdev into
> pseudo_sdev / unexported scsi_get_host_dev() / modified error path in
> scsi_get_pseudo_dev() / skip pseudo devices in __scsi_iterate_devices()
> and also when calling sdev_init(), sdev_configure() and sdev_destroy().
> See also
> https://lore.kernel.org/linux-scsi/20211125151048.103910-2-hare@suse.de/ ]
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> ---
> drivers/scsi/hosts.c | 8 +++++
> drivers/scsi/scsi.c | 9 +++--
> drivers/scsi/scsi_error.c | 3 ++
> drivers/scsi/scsi_priv.h | 2 ++
> drivers/scsi/scsi_scan.c | 69 +++++++++++++++++++++++++++++++++++++-
> drivers/scsi/scsi_sysfs.c | 5 ++-
> include/scsi/scsi_device.h | 16 +++++++++
> include/scsi/scsi_host.h | 6 ++++
> 8 files changed, 114 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
> index 9bb7f0114763..986586bf67dc 100644
> --- a/drivers/scsi/hosts.c
> +++ b/drivers/scsi/hosts.c
> @@ -307,6 +307,14 @@ int scsi_add_host_with_dma(struct Scsi_Host *shost, struct device *dev,
> if (error)
> goto out_del_dev;
>
> + if (sht->nr_reserved_cmds) {
> + shost->pseudo_sdev = scsi_get_pseudo_dev(shost);
> + if (!shost->pseudo_sdev) {
> + error = -ENOMEM;
> + goto out_del_dev;
> + }
> + }
> +
> scsi_proc_host_add(shost);
> scsi_autopm_put_host(shost);
> return error;
> diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
> index ff6b0973d3b4..2d2a52c3ef49 100644
> --- a/drivers/scsi/scsi.c
> +++ b/drivers/scsi/scsi.c
> @@ -257,6 +257,8 @@ EXPORT_SYMBOL(scsi_change_queue_depth);
> */
> int scsi_track_queue_full(struct scsi_device *sdev, int depth)
> {
> + if (scsi_device_is_pseudo_dev(sdev))
> + return 0;
>
> /*
> * Don't let QUEUE_FULLs on the same
> @@ -828,8 +830,11 @@ struct scsi_device *__scsi_iterate_devices(struct Scsi_Host *shost,
> spin_lock_irqsave(shost->host_lock, flags);
> while (list->next != &shost->__devices) {
> next = list_entry(list->next, struct scsi_device, siblings);
> - /* skip devices that we can't get a reference to */
> - if (!scsi_device_get(next))
> + /*
> + * Skip pseudo devices and also devices for which
> + * scsi_device_get() fails.
> + */
> + if (!scsi_device_is_pseudo_dev(next) && !scsi_device_get(next))
> break;
looks ok
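For anyone reading along, the point of the ordering in that condition is that the && short-circuits, so no reference is ever taken on the pseudo device. A small user-space model of the loop follows. The struct and the model_* names are stand-ins, and an array replaces the shost->__devices list:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for struct scsi_device; only what the loop needs is modelled. */
struct sdev_model {
	uint64_t lun;
	bool gettable;	/* whether taking a reference would succeed */
	int refs;	/* references taken so far */
};

static bool model_is_pseudo_dev(const struct sdev_model *sdev)
{
	return sdev->lun == UINT64_MAX;
}

/* Same convention as scsi_device_get(): 0 on success, nonzero on failure. */
static int model_device_get(struct sdev_model *sdev)
{
	if (!sdev->gettable)
		return -1;
	sdev->refs++;
	return 0;
}

/*
 * Return the first device at or after 'start' that is not a pseudo device
 * and for which a reference could be taken, or NULL if there is none.
 */
static struct sdev_model *model_iterate(struct sdev_model *devs, size_t n,
					size_t start)
{
	for (size_t i = start; i < n; i++) {
		struct sdev_model *next = &devs[i];

		/* Short-circuit: the get is never attempted on a pseudo device. */
		if (!model_is_pseudo_dev(next) && !model_device_get(next))
			return next;
	}
	return NULL;
}
```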
> next = NULL;
> list = list->next;
> diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
> index 746ff6a1f309..540d82974529 100644
> --- a/drivers/scsi/scsi_error.c
> +++ b/drivers/scsi/scsi_error.c
> @@ -749,6 +749,9 @@ static void scsi_handle_queue_ramp_up(struct scsi_device *sdev)
> const struct scsi_host_template *sht = sdev->host->hostt;
> struct scsi_device *tmp_sdev;
>
> + if (scsi_device_is_pseudo_dev(sdev))
> + return;
> +
> if (!sht->track_queue_depth ||
> sdev->queue_depth >= sdev->max_queue_depth)
> return;
> diff --git a/drivers/scsi/scsi_priv.h b/drivers/scsi/scsi_priv.h
> index 5b2b19f5e8ec..da3bc87ac5a6 100644
> --- a/drivers/scsi/scsi_priv.h
> +++ b/drivers/scsi/scsi_priv.h
> @@ -135,6 +135,8 @@ extern int scsi_complete_async_scans(void);
> extern int scsi_scan_host_selected(struct Scsi_Host *, unsigned int,
> unsigned int, u64, enum scsi_scan_mode);
> extern void scsi_forget_host(struct Scsi_Host *);
> +struct scsi_device *scsi_get_pseudo_dev(struct Scsi_Host *);
> +bool scsi_device_is_pseudo_dev(struct scsi_device *sdev);
>
> /* scsi_sysctl.c */
> #ifdef CONFIG_SYSCTL
> diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
> index de039efef290..a3523f964bc1 100644
> --- a/drivers/scsi/scsi_scan.c
> +++ b/drivers/scsi/scsi_scan.c
> @@ -225,6 +225,8 @@ static int scsi_realloc_sdev_budget_map(struct scsi_device *sdev,
> int ret;
> struct sbitmap sb_backup;
>
> + WARN_ON_ONCE(scsi_device_is_pseudo_dev(sdev));
> +
> depth = min_t(unsigned int, depth, scsi_device_max_queue_depth(sdev));
>
> /*
> @@ -349,6 +351,9 @@ static struct scsi_device *scsi_alloc_sdev(struct scsi_target *starget,
>
> scsi_sysfs_device_initialize(sdev);
>
> + if (scsi_device_is_pseudo_dev(sdev))
> + return sdev;
> +
> depth = sdev->host->cmd_per_lun ?: 1;
>
> /*
> @@ -1070,6 +1075,9 @@ static int scsi_add_lun(struct scsi_device *sdev, unsigned char *inq_result,
>
> sdev->sdev_bflags = *bflags;
>
> + if (scsi_device_is_pseudo_dev(sdev))
> + return SCSI_SCAN_LUN_PRESENT;
> +
> /*
> * No need to freeze the queue as it isn't reachable to anyone else yet.
> */
> @@ -1213,6 +1221,12 @@ static int scsi_probe_and_add_lun(struct scsi_target *starget,
> if (!sdev)
> goto out;
>
> + if (scsi_device_is_pseudo_dev(sdev)) {
> + if (bflagsp)
> + *bflagsp = BLIST_NOLUN;
> + return SCSI_SCAN_LUN_PRESENT;
> + }
> +
> result = kmalloc(result_len, GFP_KERNEL);
> if (!result)
> goto out_free_sdev;
> @@ -2084,12 +2098,65 @@ void scsi_forget_host(struct Scsi_Host *shost)
> restart:
> spin_lock_irqsave(shost->host_lock, flags);
> list_for_each_entry(sdev, &shost->__devices, siblings) {
> - if (sdev->sdev_state == SDEV_DEL)
> + if (scsi_device_is_pseudo_dev(sdev) ||
> + sdev->sdev_state == SDEV_DEL)
> continue;
maybe this would be neater with separate if statements
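Something like the following is what I have in mind. This is only a user-space sketch: the struct, the enum, and the model_* helpers are stand-ins rather than the kernel definitions, and an array replaces the device list.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the kernel types; only the fields used here are modelled. */
enum sdev_state_model { MODEL_SDEV_RUNNING, MODEL_SDEV_DEL };

struct sdev_model {
	uint64_t lun;
	enum sdev_state_model state;
};

static bool model_is_pseudo_dev(const struct sdev_model *sdev)
{
	return sdev->lun == UINT64_MAX;
}

/*
 * Return the first device that the main loop of scsi_forget_host() would
 * remove, or NULL if there is none.
 */
static const struct sdev_model *
model_next_to_remove(const struct sdev_model *devs, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		const struct sdev_model *sdev = &devs[i];

		/* The pseudo device is removed last, outside this loop. */
		if (model_is_pseudo_dev(sdev))
			continue;
		/* Already being deleted. */
		if (sdev->state == MODEL_SDEV_DEL)
			continue;
		return sdev;
	}
	return NULL;
}
```

Each skip condition then gets its own comment, which I find easier to read than the combined test.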
> spin_unlock_irqrestore(shost->host_lock, flags);
> __scsi_remove_device(sdev);
> goto restart;
> }
> spin_unlock_irqrestore(shost->host_lock, flags);
> +
> + /*
> + * Remove the pseudo device last since it may be needed during removal
> + * of other SCSI devices.
> + */
> + if (shost->pseudo_sdev)
> + __scsi_remove_device(shost->pseudo_sdev);
looks ok
> }
>
> +/**
> + * scsi_get_pseudo_dev() - Attach a pseudo SCSI device to a SCSI host
> + * @shost: Host that needs a pseudo SCSI device
> + *
> + * Lock status: None assumed.
> + *
> + * Returns: The scsi_device or NULL
> + *
> + * Notes:
> + * Attach a single scsi_device to the Scsi_Host. The primary aim for this
> + * device is to serve as a container from which SCSI commands can be
> + * allocated. Each SCSI command will carry a command tag allocated by the
> + * block layer. These SCSI commands can be used by the LLDD to send
> + * internal or passthrough commands without having to manage tag allocation
> + * inside the LLDD.
> + */
> +struct scsi_device *scsi_get_pseudo_dev(struct Scsi_Host *shost)
> +{
> + struct scsi_device *sdev = NULL;
> + struct scsi_target *starget;
> +
> + guard(mutex)(&shost->scan_mutex);
> +
> + if (!scsi_host_scan_allowed(shost))
> + goto out;
When would/could this fail? It seems to me that the shost add would also
fail (if this failed), right?
> +
> + starget = scsi_alloc_target(&shost->shost_gendev, 0, shost->max_id);
> + if (!starget)
> + goto out;
> +
> + sdev = scsi_alloc_sdev(starget, U64_MAX, NULL);
> + if (!sdev) {
> + scsi_target_reap(starget);
> + goto put_target;
> + }
> +
> + sdev->borken = 0;
> +
> +put_target:
> + /* See also the get_device(dev) call in scsi_alloc_target(). */
> + put_device(&starget->dev);
> +
> +out:
> + return sdev;
> +}
looks ok
> diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
> index 169af7d47ce7..22f76a1ca23b 100644
> --- a/drivers/scsi/scsi_sysfs.c
> +++ b/drivers/scsi/scsi_sysfs.c
> @@ -1406,6 +1406,9 @@ int scsi_sysfs_add_sdev(struct scsi_device *sdev)
> int error;
> struct scsi_target *starget = sdev->sdev_target;
>
> + if (WARN_ON_ONCE(scsi_device_is_pseudo_dev(sdev)))
> + return -EINVAL;
> +
> error = scsi_target_add(starget);
> if (error)
> return error;
> @@ -1513,7 +1516,7 @@ void __scsi_remove_device(struct scsi_device *sdev)
> kref_put(&sdev->host->tagset_refcnt, scsi_mq_free_tags);
> cancel_work_sync(&sdev->requeue_work);
>
> - if (sdev->host->hostt->sdev_destroy)
> + if (!scsi_device_is_pseudo_dev(sdev) && sdev->host->hostt->sdev_destroy)
> sdev->host->hostt->sdev_destroy(sdev);
> transport_destroy_device(dev);
>
> diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h
> index 6d6500148c4b..3846f5dfc51c 100644
> --- a/include/scsi/scsi_device.h
> +++ b/include/scsi/scsi_device.h
> @@ -589,6 +589,22 @@ static inline unsigned int sdev_id(struct scsi_device *sdev)
> #define scmd_id(scmd) sdev_id((scmd)->device)
> #define scmd_channel(scmd) sdev_channel((scmd)->device)
>
> +/**
> + * scsi_device_is_pseudo_dev() - Whether a device is a pseudo SCSI device.
> + * @sdev: SCSI device to examine
> + *
> + * A pseudo SCSI device can be used to allocate SCSI commands but does not show
> + * up in sysfs. Additionally, the logical unit information in *@sdev is made up.
> + *
> + * This function tests the LUN number instead of comparing @sdev with
> + * @sdev->host->pseudo_sdev because this function may be called before
> + * @sdev->host->pseudo_sdev has been initialized.
> + */
> +static inline bool scsi_device_is_pseudo_dev(struct scsi_device *sdev)
> +{
> + return sdev->lun == U64_MAX;
could you also check sdev->host->pseudo_sdev == sdev?
I suppose just checking the lun is simpler
> +}
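On the LUN check versus the pointer comparison: here is a minimal user-space sketch of the window that the kernel-doc comment describes, with made-up struct and function names. While scsi_get_pseudo_dev() is still running, the host pointer has not been assigned yet, so only the LUN test identifies the device:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct host_model;

/* Stand-in for struct scsi_device. */
struct sdev_model {
	struct host_model *host;
	uint64_t lun;
};

/* Stand-in for struct Scsi_Host. */
struct host_model {
	struct sdev_model *pseudo_sdev;	/* NULL until allocation has returned */
};

/* The test used in the patch: works as soon as the LUN has been set. */
static bool is_pseudo_by_lun(const struct sdev_model *sdev)
{
	return sdev->lun == UINT64_MAX;
}

/* The alternative: only works after host->pseudo_sdev has been assigned. */
static bool is_pseudo_by_pointer(const struct sdev_model *sdev)
{
	return sdev->host->pseudo_sdev == sdev;
}
```

In this model both tests agree once the pointer has been stored, but during allocation the pointer test reports false for the pseudo device.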
> +
> /*
> * checks for positions of the SCSI state machine
> */
> diff --git a/include/scsi/scsi_host.h b/include/scsi/scsi_host.h
> index 91eb3f52b3d0..3bfb53cf5dfc 100644
> --- a/include/scsi/scsi_host.h
> +++ b/include/scsi/scsi_host.h
> @@ -721,6 +721,12 @@ struct Scsi_Host {
> /* ldm bits */
> struct device shost_gendev, shost_dev;
>
> + /*
> + * A SCSI device structure used for sending internal commands to the
> + * HBA. There is no corresponding logical unit inside the SCSI device.
> + */
> + struct scsi_device *pseudo_sdev;
> +
> /*
> * Points to the transport data (if any) which is allocated
> * separately