dm-devel.redhat.com archive mirror
* Persistent Reservation API V3
@ 2015-08-26 16:03 Christoph Hellwig
  0 siblings, 0 replies; 6+ messages in thread
From: Christoph Hellwig @ 2015-08-26 16:03 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-scsi, linux-nvme, dm-devel, linux-api, linux-kernel

This series adds support for a simplified Persistent Reservation API
to the block layer.  The intent is that both in-kernel and userspace
consumers can use the API instead of having to hand-craft SCSI or NVMe
commands through the various passthrough interfaces.  It also adds
DM support, as getting reservations through dm-multipath is a major
pain with the current scheme.

NVMe support currently isn't included as I don't have a multihost
NVMe setup to test on, but Keith offered to test it and I'll have
a patch for it shortly.

The ioctl API is documented in Documentation/block/pr.txt, but to
fully understand the concept you'll have to read up on the SPC spec.
PRs are so complicated that trying to rephrase them into different
terminology is just going to create confusion.
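
As a rough illustration of what a userspace consumer of such an ioctl API might look like, here is a minimal sketch. It assumes the structure layouts and ioctl numbers of the interface as it later appeared in mainline <linux/pr.h>; the V3 patches in this thread may differ. It also assumes the common Linux ioctl number encoding (x86, arm, arm64; powerpc, mips, sparc and alpha encode it differently). The helper names pr_register, pr_reserve and register_and_reserve are made up for this example.

```python
import fcntl
import os
import struct

# Assumption: common Linux ioctl encoding (direction in bits 30-31,
# size in bits 16-29, type in bits 8-15, number in bits 0-7).
_IOC_WRITE = 1


def _IOW(type_char, nr, size):
    """Build a write-direction ioctl request number."""
    return (_IOC_WRITE << 30) | (size << 16) | (ord(type_char) << 8) | nr


# Assumption: layouts as in mainline <linux/pr.h>.
PR_REGISTRATION = struct.Struct("=QQII")  # old_key, new_key, flags, __pad
PR_RESERVATION = struct.Struct("=QII")    # key, type, flags

IOC_PR_REGISTER = _IOW("p", 200, PR_REGISTRATION.size)
IOC_PR_RESERVE = _IOW("p", 201, PR_RESERVATION.size)

PR_WRITE_EXCLUSIVE = 1      # one of the reservation types
PR_FL_IGNORE_KEY = 1 << 0   # the "ignore existing key" flag


def pr_register(fd, new_key, old_key=0, flags=0):
    """Register new_key for this path (hypothetical helper)."""
    arg = PR_REGISTRATION.pack(old_key, new_key, flags, 0)
    fcntl.ioctl(fd, IOC_PR_REGISTER, arg)


def pr_reserve(fd, key, pr_type=PR_WRITE_EXCLUSIVE, flags=0):
    """Take a reservation of pr_type using a registered key."""
    arg = PR_RESERVATION.pack(key, pr_type, flags)
    fcntl.ioctl(fd, IOC_PR_RESERVE, arg)


def register_and_reserve(path, key):
    """Open a block device, register key, then reserve with it."""
    fd = os.open(path, os.O_RDWR)
    try:
        pr_register(fd, key)
        pr_reserve(fd, key)
    finally:
        os.close(fd)
```

A call such as register_and_reserve("/dev/sdX", 0x1234) (the device path is a placeholder) would raise OSError if the device or driver rejects the request; the calls usually need elevated privileges.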

Note that Mike wants to take the DM patches through the DM
tree, so they are only included for reference.

I also have a set of simple test tools available at:

	git://git.infradead.org/users/hch/pr-tests.git

Changes since V2:
  - added an ignore flag to the reserve operation as well, and redid
    the ioctl API to have general flags fields
  - rebased on top of the latest block layer tree updates
Changes since V1:
  - rename DM ->ioctl to ->prepare_ioctl
  - rename dm_get_ioctl_table to dm_get_live_table_for_ioctl
  - merge two DM patches into one
  - various spelling fixes


* Persistent Reservation API V3
@ 2015-08-26 16:06 Christoph Hellwig
  2015-08-26 16:10 ` Christoph Hellwig
  0 siblings, 1 reply; 6+ messages in thread
From: Christoph Hellwig @ 2015-08-26 16:06 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-scsi, linux-nvme, dm-devel, linux-api, linux-kernel



* Re: Persistent Reservation API V3
  2015-08-26 16:06 Persistent Reservation API V3 Christoph Hellwig
@ 2015-08-26 16:10 ` Christoph Hellwig
  0 siblings, 0 replies; 6+ messages in thread
From: Christoph Hellwig @ 2015-08-26 16:10 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, linux-scsi, linux-nvme, dm-devel, linux-api,
	linux-kernel

Meh, looks like the train wifi is too bad to send out a whole patch
series.  I'll resend once I've arrived.


* Persistent Reservation API V3
@ 2015-08-26 16:56 Christoph Hellwig
       [not found] ` <1440608214-14497-1-git-send-email-hch-jcswGhMUV9g@public.gmane.org>
  0 siblings, 1 reply; 6+ messages in thread
From: Christoph Hellwig @ 2015-08-26 16:56 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-scsi, linux-nvme, dm-devel, linux-api, linux-kernel



* Re: Persistent Reservation API V3
       [not found] ` <1440608214-14497-1-git-send-email-hch-jcswGhMUV9g@public.gmane.org>
@ 2015-08-29  1:33   ` Jeremy Linton
  2015-08-29 13:52     ` Christoph Hellwig
  0 siblings, 1 reply; 6+ messages in thread
From: Jeremy Linton @ 2015-08-29  1:33 UTC (permalink / raw)
  To: Christoph Hellwig, Jens Axboe
  Cc: linux-scsi, linux-nvme, dm-devel, linux-api, linux-kernel

Hello,
	So, looking at this, I don't see how it supports the algorithm I've been using
for years. For that algorithm to successfully migrate PRs across multiple paths
on a single machine without affecting other possible users (who may legitimately
have PR'ed the same device), I need PR_IN SA 1 (READ RESERVATION) to ensure the
current node owns the reservation before attempting to preempt it on another
path. This can also ensure that the device hasn't been reserved with a legacy
reservation.
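
To make the ownership check described above concrete, here is a small sketch. It is not the code actually in use. It assumes the READ RESERVATION parameter data layout from SPC (big-endian PRGENERATION and ADDITIONAL LENGTH, then the reservation key at byte 8 and the scope/type byte at offset 21), and it assumes the caller has already fetched that buffer through some passthrough interface. The function names are invented for illustration.

```python
import struct


def parse_read_reservation(buf):
    """Parse PERSISTENT RESERVE IN / READ RESERVATION parameter data.

    Returns (generation, holder_key, pr_type). holder_key and pr_type
    are None when no reservation is currently held.
    """
    generation, additional_length = struct.unpack_from(">II", buf, 0)
    if additional_length == 0:
        return generation, None, None
    (holder_key,) = struct.unpack_from(">Q", buf, 8)
    pr_type = buf[21] & 0x0F  # low nibble is TYPE, high nibble is SCOPE
    return generation, holder_key, pr_type


def safe_to_preempt(buf, our_key):
    """Only preempt on another path if this node's key holds the reservation."""
    _, holder_key, _ = parse_read_reservation(buf)
    return holder_key == our_key
```

Note that this simple comparison does not cover the all-registrants reservation types, where the reported holder key is not a single owner's key.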

	So, this leads me to two more general questions. The first is: why isn't the PR
API simply exported to filesystems as a general reserve/release, so that the PR
happens during mount/dismount? Then DM and friends can be set up to transparently
migrate or share the reservation, rather than depending on userspace to handle
these operations.
	Also, it seems to me the use of CLEAR is extremely dangerous in any environment
where actual arbitration or sharing of the resource is taking place.


	thanks,



* Re: Persistent Reservation API V3
  2015-08-29  1:33   ` Jeremy Linton
@ 2015-08-29 13:52     ` Christoph Hellwig
  0 siblings, 0 replies; 6+ messages in thread
From: Christoph Hellwig @ 2015-08-29 13:52 UTC (permalink / raw)
  To: Jeremy Linton
  Cc: Christoph Hellwig, Jens Axboe, linux-scsi, linux-nvme, dm-devel,
	linux-api, linux-kernel

On Fri, Aug 28, 2015 at 08:33:24PM -0500, Jeremy Linton wrote:
> Hello,
> 	So, looking at this, I don't see how it supports the algorithm I've been using
> for years. For that algorithm to successfully migrate PRs across multiple paths
> on a single machine without affecting other possible users (who may legitimately
> have PR'ed the same device), I need PR_IN SA 1 (READ RESERVATION) to ensure the
> current node owns the reservation before attempting to preempt it on another
> path. This can also ensure that the device hasn't been reserved with a legacy
> reservation.

Do you have any code describing this in more detail?  We could add the
read side as well if there is strong interest.

> 	So, this leads me to two more general questions. The first is: why isn't the PR
> API simply exported to filesystems as a general reserve/release, so that the PR
> happens during mount/dismount? Then DM and friends can be set up to transparently
> migrate or share the reservation, rather than depending on userspace to handle
> these operations.

The API can be used by file systems, and my upcoming NFS SCSI layout
support was the main reason to write this.

> 	Also, it seems to me the use of CLEAR is extremely dangerous in any environment
> where actual arbitration or sharing of the resource is taking place.

Yes, but having it as a specific API isn't any more dangerous than
having it issued using SG_IO.  Reservations really only make sense if
you assume every user of an LU is actually cooperating in some way
and not actively hostile.
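
For readers less familiar with why CLEAR is singled out, here is a toy in-memory model of the difference between RELEASE and CLEAR, based on the general SPC semantics: RELEASE drops the reservation but keeps registrations, while CLEAR drops the reservation and every registration. It is only an illustration; the class ToyLU and its methods are invented and model no real device or kernel code.

```python
class ToyLU:
    """Toy model of a logical unit's persistent reservation state."""

    def __init__(self):
        self.registrants = set()  # registered reservation keys
        self.holder = None        # key holding the reservation, if any

    def register(self, key):
        self.registrants.add(key)

    def reserve(self, key):
        if key not in self.registrants:
            raise PermissionError("not registered")
        if self.holder not in (None, key):
            raise PermissionError("reservation conflict")
        self.holder = key

    def release(self, key):
        # Only the holder can release; registrations stay in place.
        if self.holder == key:
            self.holder = None

    def clear(self, key):
        # Any registrant can clear; everyone's registration is lost.
        if key not in self.registrants:
            raise PermissionError("not registered")
        self.holder = None
        self.registrants.clear()
```

In this model any single registered, cooperating-or-not node can wipe the state for all others with one clear() call, which is the risk both sides of the exchange acknowledge.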

