Linux SCSI subsystem development
From: John Garry <john.g.garry@oracle.com>
To: Bart Van Assche <bvanassche@acm.org>,
	"Martin K . Petersen" <martin.petersen@oracle.com>
Cc: linux-scsi@vger.kernel.org, Hannes Reinecke <hare@suse.de>,
	"James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Subject: Re: [PATCH v4 04/29] scsi: core: Support allocating a pseudo SCSI device
Date: Tue, 16 Sep 2025 09:21:09 +0100
Message-ID: <4c865666-a2d9-4037-9762-4bed3490a4ea@oracle.com>
In-Reply-To: <20250912182340.3487688-5-bvanassche@acm.org>

On 12/09/2025 19:21, Bart Van Assche wrote:
> From: Hannes Reinecke <hare@suse.de>
> 
> Allocate a pseudo SCSI device if 'nr_reserved_cmds' has been set. Pseudo
> SCSI devices have the SCSI ID <max_id>:U64_MAX so they won't clash with
> any devices the LLD might create. Pseudo SCSI devices are excluded from
> scanning and will not show up in sysfs. Additionally, pseudo SCSI
> devices are skipped by shost_for_each_device(). This prevents that the
> SCSI error handler tries to submit a reset to a non-existent logical unit.
> 
> Do not allocate a budget map for pseudo SCSI devices since the
> cmd_per_lun limit does not apply to pseudo SCSI devices.

I don't get it. In the v3 series you said that you would allocate the budget map:
https://lore.kernel.org/linux-scsi/20250827000816.2370150-1-bvanassche@acm.org/T/#m13c361e081b886b9318238b6dc05b571840b9698

FWIW, I still think that it is worth allocating the budget map for the
pseudo sdev and making the queue depth the same as nr_reserved_cmds
via a scsi_change_queue_depth() call.

If we want to optimise the budget handling code, then I think that is
worth doing later. The whole budget map and cmd_per_lun handling is
murky IMHO.

> 
> Do not perform queue depth ramp up / ramp down for pseudo SCSI devices.

Are we ever even going to attempt queue depth ramp up or ramp down for
the pseudo sdev?

I suppose we could see it if some internal reserved command fails,
right?

> 
> Pseudo SCSI devices will be used to send internal commands to a storage
> device.
> 
> Cc: John Garry <john.g.garry@oracle.com>
> Signed-off-by: Hannes Reinecke <hare@suse.de>
> [ bvanassche: edited patch description / renamed host_sdev into
>    pseudo_sdev / unexported scsi_get_host_dev() / modified error path in
>    scsi_get_pseudo_dev() / skip pseudo devices in __scsi_iterate_devices()
>    and also when calling sdev_init(), sdev_configure() and sdev_destroy().
>    See also
>    https://lore.kernel.org/linux-scsi/20211125151048.103910-2-hare@suse.de/ ]
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> ---
>   drivers/scsi/hosts.c       |  8 +++++
>   drivers/scsi/scsi.c        |  9 +++--
>   drivers/scsi/scsi_error.c  |  3 ++
>   drivers/scsi/scsi_priv.h   |  2 ++
>   drivers/scsi/scsi_scan.c   | 69 +++++++++++++++++++++++++++++++++++++-
>   drivers/scsi/scsi_sysfs.c  |  5 ++-
>   include/scsi/scsi_device.h | 16 +++++++++
>   include/scsi/scsi_host.h   |  6 ++++
>   8 files changed, 114 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/scsi/hosts.c b/drivers/scsi/hosts.c
> index 9bb7f0114763..986586bf67dc 100644
> --- a/drivers/scsi/hosts.c
> +++ b/drivers/scsi/hosts.c
> @@ -307,6 +307,14 @@ int scsi_add_host_with_dma(struct Scsi_Host *shost, struct device *dev,
>   	if (error)
>   		goto out_del_dev;
>   
> +	if (sht->nr_reserved_cmds) {
> +		shost->pseudo_sdev = scsi_get_pseudo_dev(shost);
> +		if (!shost->pseudo_sdev) {
> +			error = -ENOMEM;
> +			goto out_del_dev;
> +		}
> +	}
> +
>   	scsi_proc_host_add(shost);
>   	scsi_autopm_put_host(shost);
>   	return error;
> diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
> index ff6b0973d3b4..2d2a52c3ef49 100644
> --- a/drivers/scsi/scsi.c
> +++ b/drivers/scsi/scsi.c
> @@ -257,6 +257,8 @@ EXPORT_SYMBOL(scsi_change_queue_depth);
>    */
>   int scsi_track_queue_full(struct scsi_device *sdev, int depth)
>   {
> +	if (scsi_device_is_pseudo_dev(sdev))
> +		return 0;
>   
>   	/*
>   	 * Don't let QUEUE_FULLs on the same
> @@ -828,8 +830,11 @@ struct scsi_device *__scsi_iterate_devices(struct Scsi_Host *shost,
>   	spin_lock_irqsave(shost->host_lock, flags);
>   	while (list->next != &shost->__devices) {
>   		next = list_entry(list->next, struct scsi_device, siblings);
> -		/* skip devices that we can't get a reference to */
> -		if (!scsi_device_get(next))
> +		/*
> +		 * Skip pseudo devices and also devices for which
> +		 * scsi_device_get() fails.
> +		 */
> +		if (!scsi_device_is_pseudo_dev(next) && !scsi_device_get(next))
>   			break;

looks ok

>   		next = NULL;
>   		list = list->next;
> diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
> index 746ff6a1f309..540d82974529 100644
> --- a/drivers/scsi/scsi_error.c
> +++ b/drivers/scsi/scsi_error.c
> @@ -749,6 +749,9 @@ static void scsi_handle_queue_ramp_up(struct scsi_device *sdev)
>   	const struct scsi_host_template *sht = sdev->host->hostt;
>   	struct scsi_device *tmp_sdev;
>   
> +	if (scsi_device_is_pseudo_dev(sdev))
> +		return;
> +
>   	if (!sht->track_queue_depth ||
>   	    sdev->queue_depth >= sdev->max_queue_depth)
>   		return;
> diff --git a/drivers/scsi/scsi_priv.h b/drivers/scsi/scsi_priv.h
> index 5b2b19f5e8ec..da3bc87ac5a6 100644
> --- a/drivers/scsi/scsi_priv.h
> +++ b/drivers/scsi/scsi_priv.h
> @@ -135,6 +135,8 @@ extern int scsi_complete_async_scans(void);
>   extern int scsi_scan_host_selected(struct Scsi_Host *, unsigned int,
>   				   unsigned int, u64, enum scsi_scan_mode);
>   extern void scsi_forget_host(struct Scsi_Host *);
> +struct scsi_device *scsi_get_pseudo_dev(struct Scsi_Host *);
> +bool scsi_device_is_pseudo_dev(struct scsi_device *sdev);
>   
>   /* scsi_sysctl.c */
>   #ifdef CONFIG_SYSCTL
> diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
> index de039efef290..a3523f964bc1 100644
> --- a/drivers/scsi/scsi_scan.c
> +++ b/drivers/scsi/scsi_scan.c
> @@ -225,6 +225,8 @@ static int scsi_realloc_sdev_budget_map(struct scsi_device *sdev,
>   	int ret;
>   	struct sbitmap sb_backup;
>   
> +	WARN_ON_ONCE(scsi_device_is_pseudo_dev(sdev));
> +
>   	depth = min_t(unsigned int, depth, scsi_device_max_queue_depth(sdev));
>   
>   	/*
> @@ -349,6 +351,9 @@ static struct scsi_device *scsi_alloc_sdev(struct scsi_target *starget,
>   
>   	scsi_sysfs_device_initialize(sdev);
>   
> +	if (scsi_device_is_pseudo_dev(sdev))
> +		return sdev;
> +
>   	depth = sdev->host->cmd_per_lun ?: 1;
>   
>   	/*
> @@ -1070,6 +1075,9 @@ static int scsi_add_lun(struct scsi_device *sdev, unsigned char *inq_result,
>   
>   	sdev->sdev_bflags = *bflags;
>   
> +	if (scsi_device_is_pseudo_dev(sdev))
> +		return SCSI_SCAN_LUN_PRESENT;
> +
>   	/*
>   	 * No need to freeze the queue as it isn't reachable to anyone else yet.
>   	 */
> @@ -1213,6 +1221,12 @@ static int scsi_probe_and_add_lun(struct scsi_target *starget,
>   	if (!sdev)
>   		goto out;
>   
> +	if (scsi_device_is_pseudo_dev(sdev)) {
> +		if (bflagsp)
> +			*bflagsp = BLIST_NOLUN;
> +		return SCSI_SCAN_LUN_PRESENT;
> +	}
> +
>   	result = kmalloc(result_len, GFP_KERNEL);
>   	if (!result)
>   		goto out_free_sdev;
> @@ -2084,12 +2098,65 @@ void scsi_forget_host(struct Scsi_Host *shost)
>    restart:
>   	spin_lock_irqsave(shost->host_lock, flags);
>   	list_for_each_entry(sdev, &shost->__devices, siblings) {
> -		if (sdev->sdev_state == SDEV_DEL)
> +		if (scsi_device_is_pseudo_dev(sdev) ||
> +		    sdev->sdev_state == SDEV_DEL)
>   			continue;

Maybe this would be neater with separate if statements.
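
For illustration only, here is a minimal userspace sketch of what splitting the condition might look like. The model_* types and helpers are hypothetical stand-ins and not the kernel definitions. The loop only counts the devices that would be handed to the removal path, and it leaves out the host_lock handling and the restart logic of scsi_forget_host().

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
enum model_sdev_state { MODEL_SDEV_RUNNING, MODEL_SDEV_DEL };

struct model_sdev {
	uint64_t lun;
	enum model_sdev_state state;
};

/* Mirrors the LUN-based test from the patch (U64_MAX in the kernel). */
static bool model_is_pseudo_dev(const struct model_sdev *sdev)
{
	return sdev->lun == UINT64_MAX;
}

/* Count the devices that the loop would pass to the removal path. */
static size_t model_count_removable(const struct model_sdev *devs, size_t n)
{
	size_t count = 0;

	for (size_t i = 0; i < n; i++) {
		/* The pseudo device is removed last, outside this loop. */
		if (model_is_pseudo_dev(&devs[i]))
			continue;
		/* Devices that are already being deleted are skipped. */
		if (devs[i].state == MODEL_SDEV_DEL)
			continue;
		count++;
	}
	return count;
}
```

With one comment per condition, each skip reason is documented next to the check it belongs to.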

>   		spin_unlock_irqrestore(shost->host_lock, flags);
>   		__scsi_remove_device(sdev);
>   		goto restart;
>   	}
>   	spin_unlock_irqrestore(shost->host_lock, flags);
> +
> +	/*
> +	 * Remove the pseudo device last since it may be needed during removal
> +	 * of other SCSI devices.
> +	 */
> +	if (shost->pseudo_sdev)
> +		__scsi_remove_device(shost->pseudo_sdev);

looks ok

>   }
>   
> +/**
> + * scsi_get_pseudo_dev() - Attach a pseudo SCSI device to a SCSI host
> + * @shost: Host that needs a pseudo SCSI device
> + *
> + * Lock status: None assumed.
> + *
> + * Returns:     The scsi_device or NULL
> + *
> + * Notes:
> + *	Attach a single scsi_device to the Scsi_Host. The primary aim for this
> + *	device is to serve as a container from which SCSI commands can be
> + *	allocated. Each SCSI command will carry a command tag allocated by the
> + *	block layer. These SCSI commands can be used by the LLDD to send
> + *	internal or passthrough commands without having to manage tag allocation
> + *	inside the LLDD.
> + */
> +struct scsi_device *scsi_get_pseudo_dev(struct Scsi_Host *shost)
> +{
> +	struct scsi_device *sdev = NULL;
> +	struct scsi_target *starget;
> +
> +	guard(mutex)(&shost->scan_mutex);
> +
> +	if (!scsi_host_scan_allowed(shost))
> +		goto out;

When would/could this fail? It seems to me that the shost add would also
fail if this failed, right?

> +
> +	starget = scsi_alloc_target(&shost->shost_gendev, 0, shost->max_id);
> +	if (!starget)
> +		goto out;
> +
> +	sdev = scsi_alloc_sdev(starget, U64_MAX, NULL);
> +	if (!sdev) {
> +		scsi_target_reap(starget);
> +		goto put_target;
> +	}
> +
> +	sdev->borken = 0;
> +
> +put_target:
> +	/* See also the get_device(dev) call in scsi_alloc_target(). */
> +	put_device(&starget->dev);
> +
> +out:
> +	return sdev;
> +}

looks ok

> diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
> index 169af7d47ce7..22f76a1ca23b 100644
> --- a/drivers/scsi/scsi_sysfs.c
> +++ b/drivers/scsi/scsi_sysfs.c
> @@ -1406,6 +1406,9 @@ int scsi_sysfs_add_sdev(struct scsi_device *sdev)
>   	int error;
>   	struct scsi_target *starget = sdev->sdev_target;
>   
> +	if (WARN_ON_ONCE(scsi_device_is_pseudo_dev(sdev)))
> +		return -EINVAL;
> +
>   	error = scsi_target_add(starget);
>   	if (error)
>   		return error;
> @@ -1513,7 +1516,7 @@ void __scsi_remove_device(struct scsi_device *sdev)
>   	kref_put(&sdev->host->tagset_refcnt, scsi_mq_free_tags);
>   	cancel_work_sync(&sdev->requeue_work);
>   
> -	if (sdev->host->hostt->sdev_destroy)
> +	if (!scsi_device_is_pseudo_dev(sdev) && sdev->host->hostt->sdev_destroy)
>   		sdev->host->hostt->sdev_destroy(sdev);
>   	transport_destroy_device(dev);
>   
> diff --git a/include/scsi/scsi_device.h b/include/scsi/scsi_device.h
> index 6d6500148c4b..3846f5dfc51c 100644
> --- a/include/scsi/scsi_device.h
> +++ b/include/scsi/scsi_device.h
> @@ -589,6 +589,22 @@ static inline unsigned int sdev_id(struct scsi_device *sdev)
>   #define scmd_id(scmd) sdev_id((scmd)->device)
>   #define scmd_channel(scmd) sdev_channel((scmd)->device)
>   
> +/**
> + * scsi_device_is_pseudo_dev() - Whether a device is a pseudo SCSI device.
> + * @sdev: SCSI device to examine
> + *
> + * A pseudo SCSI device can be used to allocate SCSI commands but does not show
> + * up in sysfs. Additionally, the logical unit information in *@sdev is made up.
> + *
> + * This function tests the LUN number instead of comparing @sdev with
> + * @sdev->host->pseudo_sdev because this function may be called before
> + * @sdev->host->pseudo_sdev has been initialized.
> + */
> +static inline bool scsi_device_is_pseudo_dev(struct scsi_device *sdev)
> +{
> +	return sdev->lun == U64_MAX;

Could you also check sdev->host->pseudo_sdev == sdev?

I suppose just checking the LUN is simpler.
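
To make the trade-off concrete, here is a small userspace sketch that contrasts the two checks. The toy_* types are hypothetical stand-ins and not the kernel structures. It assumes, as the kernel-doc comment above states, that the test may run before the host's pseudo_sdev pointer has been assigned.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for struct Scsi_Host and struct scsi_device. */
struct toy_host;

struct toy_sdev {
	uint64_t lun;
	struct toy_host *host;
};

struct toy_host {
	struct toy_sdev *pseudo_sdev; /* NULL until allocation has finished */
};

/* LUN-based test, modelled on the patch (U64_MAX in the kernel). */
static bool toy_is_pseudo_by_lun(const struct toy_sdev *sdev)
{
	return sdev->lun == UINT64_MAX;
}

/* Pointer-based alternative: only valid once host->pseudo_sdev is set. */
static bool toy_is_pseudo_by_ptr(const struct toy_sdev *sdev)
{
	return sdev->host->pseudo_sdev == sdev;
}
```

In this model the pointer comparison returns false for a device whose LUN is already set but whose host has not yet stored the pointer, which corresponds to the window during allocation. The LUN comparison has no such window.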

> +}
> +
>   /*
>    * checks for positions of the SCSI state machine
>    */
> diff --git a/include/scsi/scsi_host.h b/include/scsi/scsi_host.h
> index 91eb3f52b3d0..3bfb53cf5dfc 100644
> --- a/include/scsi/scsi_host.h
> +++ b/include/scsi/scsi_host.h
> @@ -721,6 +721,12 @@ struct Scsi_Host {
>   	/* ldm bits */
>   	struct device		shost_gendev, shost_dev;
>   
> +	/*
> +	 * A SCSI device structure used for sending internal commands to the
> +	 * HBA. There is no corresponding logical unit inside the SCSI device.
> +	 */
> +	struct scsi_device *pseudo_sdev;
> +
>   	/*
>   	 * Points to the transport data (if any) which is allocated
>   	 * separately

