* [PATCH 0 of 3] Patches to alter BLKIF_OP_TRIM to BLKIF_OP_DISCARD (v4)
@ 2011-10-12 22:12 Konrad Rzeszutek Wilk
  2011-10-12 22:12 ` [PATCH 1 of 3] interface: rename of trim to discard in blkif.h Konrad Rzeszutek Wilk
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Konrad Rzeszutek Wilk @ 2011-10-12 22:12 UTC (permalink / raw)
  To: xen-devel, Ian.Campbell, JBeulich; +Cc: konrad.wilk

This is v4 of the patch series, split into three parts that:

 1). rename trim->discard
 2). flesh out the description
 3). add the secure-discard option.


* [PATCH 1 of 3] interface: rename of trim to discard in blkif.h
  2011-10-12 22:12 [PATCH 0 of 3] Patches to alter BLKIF_OP_TRIM to BLKIF_OP_DISCARD (v4) Konrad Rzeszutek Wilk
@ 2011-10-12 22:12 ` Konrad Rzeszutek Wilk
  2011-10-12 22:12 ` [PATCH 2 of 3] interface: Flesh out the BLKIF_OP_DISCARD description Konrad Rzeszutek Wilk
  2011-10-12 22:12 ` [PATCH 3 of 3] interface: add 'discard-secure' and BLKIF_DISCARD_SECURE Konrad Rzeszutek Wilk
  2 siblings, 0 replies; 10+ messages in thread
From: Konrad Rzeszutek Wilk @ 2011-10-12 22:12 UTC (permalink / raw)
  To: xen-devel, Ian.Campbell, JBeulich; +Cc: konrad.wilk

# HG changeset patch
# User Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
# Date 1318457224 14400
# Node ID 88b7814df143169a1cf946a9881ae2ecea9693bd
# Parent  4b0907c6a08c348962bd976c2976257b412408be
interface: rename of trim to discard in blkif.h

Just a simple sed s/trim/discard/. We are ignoring the comments
which are incorrect.

The reason for the name change is that TRIM is specific to ATA
while the operation can be done on top of SCSI interfaces too.
Hence the rename to something more generic.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

diff -r 4b0907c6a08c -r 88b7814df143 xen/include/public/io/blkif.h
--- a/xen/include/public/io/blkif.h	Tue Oct 11 12:02:58 2011 +0100
+++ b/xen/include/public/io/blkif.h	Wed Oct 12 18:07:04 2011 -0400
@@ -82,25 +82,25 @@
  */
 #define BLKIF_OP_RESERVED_1        4
 /*
- * Recognised only if "feature-trim" is present in backend xenbus info.
- * The "feature-trim" node contains a boolean indicating whether trim
- * requests are likely to succeed or fail. Either way, a trim request
+ * Recognised only if "feature-discard" is present in backend xenbus info.
+ * The "feature-discard" node contains a boolean indicating whether discard
+ * requests are likely to succeed or fail. Either way, a discard request
  * may fail at any time with BLKIF_RSP_EOPNOTSUPP if it is unsupported by
  * the underlying block-device hardware. The boolean simply indicates whether
- * or not it is worthwhile for the frontend to attempt trim requests.
- * If a backend does not recognise BLKIF_OP_TRIM, it should *not*
- * create the "feature-trim" node!
+ * or not it is worthwhile for the frontend to attempt discard requests.
+ * If a backend does not recognise BLKIF_OP_DISCARD, it should *not*
+ * create the "feature-discard" node!
  * 
- * Trim operation is a request for the underlying block device to mark
- * extents to be erased. Trim operations are passed with sector_number as the
- * sector index to begin trim operations at and nr_sectors as the number of
- * sectors to be trimmed. The specified sectors should be trimmed if the
- * underlying block device supports trim operations, or a BLKIF_RSP_EOPNOTSUPP
- * should be returned. More information about trim operations at:
+ * Discard operation is a request for the underlying block device to mark
+ * extents to be erased. Discard operations are passed with sector_number as the
+ * sector index to begin discard operations at and nr_sectors as the number of
+ * sectors to be discarded. The specified sectors should be discarded if the
+ * underlying block device supports discard operations, or a BLKIF_RSP_EOPNOTSUPP
+ * should be returned. More information about discard operations at:
  * http://t13.org/Documents/UploadedDocuments/docs2008/
  *     e07154r6-Data_Set_Management_Proposal_for_ATA-ACS2.doc
  */
-#define BLKIF_OP_TRIM              5
+#define BLKIF_OP_DISCARD           5
 
 /*
  * Maximum scatter/gather segments per request.
@@ -135,17 +135,17 @@ typedef struct blkif_request blkif_reque
 
 /*
  * Cast to this structure when blkif_request.operation == BLKIF_OP_TRIM
- * sizeof(struct blkif_request_trim) <= sizeof(struct blkif_request)
+ * sizeof(struct blkif_request_discard) <= sizeof(struct blkif_request)
  */
-struct blkif_request_trim {
-    uint8_t        operation;    /* BLKIF_OP_TRIM                        */
+struct blkif_request_discard {
+    uint8_t        operation;    /* BLKIF_OP_DISCARD                     */
     uint8_t        reserved;     /*                                      */
     blkif_vdev_t   handle;       /* same as for read/write requests      */
     uint64_t       id;           /* private guest value, echoed in resp  */
     blkif_sector_t sector_number;/* start sector idx on disk             */
-    uint64_t       nr_sectors;   /* number of contiguous sectors to trim */
+    uint64_t       nr_sectors;   /* number of contiguous sectors to discard*/
 };
-typedef struct blkif_request_trim blkif_request_trim_t;
+typedef struct blkif_request_discard blkif_request_discard_t;
 
 struct blkif_response {
     uint64_t        id;              /* copied from request */


* [PATCH 2 of 3] interface: Flesh out the BLKIF_OP_DISCARD description
  2011-10-12 22:12 [PATCH 0 of 3] Patches to alter BLKIF_OP_TRIM to BLKIF_OP_DISCARD (v4) Konrad Rzeszutek Wilk
  2011-10-12 22:12 ` [PATCH 1 of 3] interface: rename of trim to discard in blkif.h Konrad Rzeszutek Wilk
@ 2011-10-12 22:12 ` Konrad Rzeszutek Wilk
  2011-10-13  8:00   ` Ian Campbell
  2011-10-12 22:12 ` [PATCH 3 of 3] interface: add 'discard-secure' and BLKIF_DISCARD_SECURE Konrad Rzeszutek Wilk
  2 siblings, 1 reply; 10+ messages in thread
From: Konrad Rzeszutek Wilk @ 2011-10-12 22:12 UTC (permalink / raw)
  To: xen-devel, Ian.Campbell, JBeulich; +Cc: konrad.wilk

# HG changeset patch
# User Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
# Date 1318457227 14400
# Node ID 15c2d70dbac3e31c2d74b6700e1bb5f8a7d8256e
# Parent  88b7814df143169a1cf946a9881ae2ecea9693bd
interface: Flesh out the BLKIF_OP_DISCARD description.

We flesh out the details of what is expected of 'feature-discard' and
which extra parameters the frontend can read from the backend. Those
extra parameters are 'discard-alignment' and 'discard-granularity'.

Acked-by: Jan Beulich <JBeulich@suse.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

diff -r 88b7814df143 -r 15c2d70dbac3 xen/include/public/io/blkif.h
--- a/xen/include/public/io/blkif.h	Wed Oct 12 18:07:04 2011 -0400
+++ b/xen/include/public/io/blkif.h	Wed Oct 12 18:07:07 2011 -0400
@@ -83,22 +83,42 @@
 #define BLKIF_OP_RESERVED_1        4
 /*
  * Recognised only if "feature-discard" is present in backend xenbus info.
- * The "feature-discard" node contains a boolean indicating whether discard
- * requests are likely to succeed or fail. Either way, a discard request
+ * The "feature-discard" node contains a boolean indicating whether trim
+ * (ATA) or unmap (SCSI) - conviently called discard requests are likely
+ * to succeed or fail. Either way, a discard request
  * may fail at any time with BLKIF_RSP_EOPNOTSUPP if it is unsupported by
  * the underlying block-device hardware. The boolean simply indicates whether
  * or not it is worthwhile for the frontend to attempt discard requests.
  * If a backend does not recognise BLKIF_OP_DISCARD, it should *not*
  * create the "feature-discard" node!
- * 
+ *
  * Discard operation is a request for the underlying block device to mark
- * extents to be erased. Discard operations are passed with sector_number as the
+ * extents to be erased. However, discard does not guarantee that the blocks
+ * will be erased from the device - it is just a hint to the device
+ * controller that these blocks are no longer in use. What the device
+ * controller does with that information is left to the controller.
+ * Discard operations are passed with sector_number as the
  * sector index to begin discard operations at and nr_sectors as the number of
  * sectors to be discarded. The specified sectors should be discarded if the
- * underlying block device supports discard operations, or a BLKIF_RSP_EOPNOTSUPP
- * should be returned. More information about discard operations at:
+ * underlying block device supports trim (ATA) or unmap (SCSI) operations,
+ * or a BLKIF_RSP_EOPNOTSUPP  should be returned.
+ * More information about trim/unmap operations at:
  * http://t13.org/Documents/UploadedDocuments/docs2008/
  *     e07154r6-Data_Set_Management_Proposal_for_ATA-ACS2.doc
+ * http://www.seagate.com/staticfiles/support/disc/manuals/
+ *     Interface%20manuals/100293068c.pdf
+ * The backend can optionally provide two extra XenBus attributes to
+ * further optimize the discard functionality:
+ * 'discard-aligment' - Devices that support discard functionality may
+ * internally allocate space in units that are bigger than the exported
+ * logical block size. The discard-alignment parameter indicates how many bytes
+ * the beginning of the partition is offset from the internal allocation unit's
+ * natural alignment.
+ * 'discard-granularity'  - Devices that support discard functionality may
+ * internally allocate space using units that are bigger than the logical block
+ * size. The discard-granularity parameter indicates the size of the internal
+ * allocation unit in bytes if reported by the device. Otherwise the
+ * discard-granularity will be set to match the device's physical block size.
  */
 #define BLKIF_OP_DISCARD           5


* [PATCH 3 of 3] interface: add 'discard-secure' and BLKIF_DISCARD_SECURE
  2011-10-12 22:12 [PATCH 0 of 3] Patches to alter BLKIF_OP_TRIM to BLKIF_OP_DISCARD (v4) Konrad Rzeszutek Wilk
  2011-10-12 22:12 ` [PATCH 1 of 3] interface: rename of trim to discard in blkif.h Konrad Rzeszutek Wilk
  2011-10-12 22:12 ` [PATCH 2 of 3] interface: Flesh out the BLKIF_OP_DISCARD description Konrad Rzeszutek Wilk
@ 2011-10-12 22:12 ` Konrad Rzeszutek Wilk
  2011-10-13  7:59   ` Ian Campbell
  2 siblings, 1 reply; 10+ messages in thread
From: Konrad Rzeszutek Wilk @ 2011-10-12 22:12 UTC (permalink / raw)
  To: xen-devel, Ian.Campbell, JBeulich; +Cc: konrad.wilk

# HG changeset patch
# User Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
# Date 1318457231 14400
# Node ID 50850daec7f0486ee7ca69b3d4cb58b4d340a5a7
# Parent  15c2d70dbac3e31c2d74b6700e1bb5f8a7d8256e
interface: add 'discard-secure' and BLKIF_DISCARD_SECURE

Alter the 'reserved' uint8_t so that it is used as a 'flag' field. It
currently carries only one flag: BLKIF_DISCARD_SECURE.

That flag can only be set if the backend has set 'discard-secure' to one.
If the backend has not set 'discard-secure' to one, the flag has no
effect.

Acked-by: Jan Beulich <JBeulich@suse.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

diff -r 15c2d70dbac3 -r 50850daec7f0 xen/include/public/io/blkif.h
--- a/xen/include/public/io/blkif.h	Wed Oct 12 18:07:07 2011 -0400
+++ b/xen/include/public/io/blkif.h	Wed Oct 12 18:07:11 2011 -0400
@@ -107,7 +107,7 @@
  *     e07154r6-Data_Set_Management_Proposal_for_ATA-ACS2.doc
  * http://www.seagate.com/staticfiles/support/disc/manuals/
  *     Interface%20manuals/100293068c.pdf
- * The backend can optionally provide two extra XenBus attributes to
+ * The backend can optionally provide three extra XenBus attributes to
  * further optimize the discard functionality:
  * 'discard-aligment' - Devices that support discard functionality may
  * internally allocate space in units that are bigger than the exported
@@ -119,6 +119,9 @@
  * size. The discard-granularity parameter indicates the size of the internal
  * allocation unit in bytes if reported by the device. Otherwise the
  * discard-granularity will be set to match the device's physical block size.
+ * 'discard-secure' - All copies of the discarded sectors (potentially created
+ * by garbage collection) must also be erased.  To use this feature, the flag
+ * BLKIF_DISCARD_SECURE must be set in the blkif_request_trim.
  */
 #define BLKIF_OP_DISCARD           5
 
@@ -159,7 +162,8 @@ typedef struct blkif_request blkif_reque
  */
 struct blkif_request_discard {
     uint8_t        operation;    /* BLKIF_OP_DISCARD                     */
-    uint8_t        reserved;     /*                                      */
+    uint8_t        flag;         /* BLKIF_DISCARD_SECURE or zero         */
+#define BLKIF_DISCARD_SECURE (1<<0)  /* ignored if discard-secure=0      */
     blkif_vdev_t   handle;       /* same as for read/write requests      */
     uint64_t       id;           /* private guest value, echoed in resp  */
     blkif_sector_t sector_number;/* start sector idx on disk             */
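
A minimal sketch of how a frontend might fill in the request described by
this patch. The struct layout follows the diff above; the typedef widths,
the helper name fill_discard_request() and its backend_secure parameter
(standing in for the value read from the 'discard-secure' XenBus node) are
assumptions made for illustration, not part of the interface.

```c
#include <stdint.h>
#include <string.h>

/* Assumed local stand-ins for the typedefs in blkif.h. */
typedef uint16_t blkif_vdev_t;
typedef uint64_t blkif_sector_t;

#define BLKIF_OP_DISCARD      5
#define BLKIF_DISCARD_SECURE  (1 << 0)

struct blkif_request_discard {
    uint8_t        operation;     /* BLKIF_OP_DISCARD                    */
    uint8_t        flag;          /* BLKIF_DISCARD_SECURE or zero        */
    blkif_vdev_t   handle;        /* same as for read/write requests     */
    uint64_t       id;            /* private guest value, echoed in resp */
    blkif_sector_t sector_number; /* start sector idx on disk            */
    uint64_t       nr_sectors;    /* number of contiguous sectors        */
};

/*
 * Hypothetical helper: build a discard request. The secure flag is set
 * only when the caller wants it AND the backend advertised
 * 'discard-secure' as one; otherwise the flag stays zero.
 */
static void fill_discard_request(struct blkif_request_discard *req,
                                 blkif_vdev_t handle, uint64_t id,
                                 blkif_sector_t first, uint64_t nr,
                                 int want_secure, int backend_secure)
{
    memset(req, 0, sizeof(*req));
    req->operation     = BLKIF_OP_DISCARD;
    req->handle        = handle;
    req->id            = id;
    req->sector_number = first;
    req->nr_sectors    = nr;
    if (want_secure && backend_secure)
        req->flag |= BLKIF_DISCARD_SECURE;
}
```

Leaving the flag clear when the backend has not advertised 'discard-secure'
matches the commit message: the flag would have no effect in that case, so
the frontend gains nothing by setting it.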


* Re: [PATCH 3 of 3] interface: add 'discard-secure' and BLKIF_DISCARD_SECURE
  2011-10-12 22:12 ` [PATCH 3 of 3] interface: add 'discard-secure' and BLKIF_DISCARD_SECURE Konrad Rzeszutek Wilk
@ 2011-10-13  7:59   ` Ian Campbell
  2011-10-13 14:12     ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 10+ messages in thread
From: Ian Campbell @ 2011-10-13  7:59 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: xen-devel@lists.xensource.com, JBeulich@suse.com

On Wed, 2011-10-12 at 23:12 +0100, Konrad Rzeszutek Wilk wrote:
> # HG changeset patch
> # User Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> # Date 1318457231 14400
> # Node ID 50850daec7f0486ee7ca69b3d4cb58b4d340a5a7
> # Parent  15c2d70dbac3e31c2d74b6700e1bb5f8a7d8256e
> interface: add 'discard-secure' and BLKIF_DISCARD_SECURE
> 
> Alter the 'reserved' uint8_t to be used a 'flag'. We use only for
> one flag: BLKIF_DISCARD_SECURE.
> 
> That flag can only be set if the backend has set 'discard-secure' to one.
> If backend has not set 'discard-secure' to one, that flag will have no
> effect.
> 
> Acked-by: Jan Beulich <JBeulich@suse.com>
> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> 
> diff -r 15c2d70dbac3 -r 50850daec7f0 xen/include/public/io/blkif.h
> --- a/xen/include/public/io/blkif.h	Wed Oct 12 18:07:07 2011 -0400
> +++ b/xen/include/public/io/blkif.h	Wed Oct 12 18:07:11 2011 -0400
> @@ -107,7 +107,7 @@
>   *     e07154r6-Data_Set_Management_Proposal_for_ATA-ACS2.doc
>   * http://www.seagate.com/staticfiles/support/disc/manuals/
>   *     Interface%20manuals/100293068c.pdf
> - * The backend can optionally provide two extra XenBus attributes to
> + * The backend can optionally provide three extra XenBus attributes to
                                         these

avoids patching (or more likely forgetting to patch) this line every
time we add an attribute.

>   * further optimize the discard functionality:
>   * 'discard-aligment' - Devices that support discard functionality may
>   * internally allocate space in units that are bigger than the exported
> @@ -119,6 +119,9 @@
>   * size. The discard-granularity parameter indicates the size of the internal
>   * allocation unit in bytes if reported by the device. Otherwise the
>   * discard-granularity will be set to match the device's physical block size.
> + * 'discard-secure' - All copies of the discarded sectors (potentially created
> + * by garbage collection) must also be erased.  To use this feature, the flag
> + * BLKIF_DISCARD_SECURE must be set in the blkif_request_trim.

Stray "trim" here.

>   */
>  #define BLKIF_OP_DISCARD           5

It just occurred to me that if reusing the reserved field is going to
prove a problem we could have had BLKIF_OP_DISCARD_SECURE. I think we've
got things under control though.

>  
> @@ -159,7 +162,8 @@ typedef struct blkif_request blkif_reque
>   */
>  struct blkif_request_discard {
>      uint8_t        operation;    /* BLKIF_OP_DISCARD                     */
> -    uint8_t        reserved;     /*                                      */
> +    uint8_t        flag;         /* BLKIF_DISCARD_SECURE or zero         */
> +#define BLKIF_DISCARD_SECURE (1<<0)  /* ignored if discard-secure=0      */
>      blkif_vdev_t   handle;       /* same as for read/write requests      */
>      uint64_t       id;           /* private guest value, echoed in resp  */
>      blkif_sector_t sector_number;/* start sector idx on disk             */
> 
> 


* Re: [PATCH 2 of 3] interface: Flesh out the BLKIF_OP_DISCARD description
  2011-10-12 22:12 ` [PATCH 2 of 3] interface: Flesh out the BLKIF_OP_DISCARD description Konrad Rzeszutek Wilk
@ 2011-10-13  8:00   ` Ian Campbell
  2011-10-13 14:32     ` Konrad Rzeszutek Wilk
                       ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Ian Campbell @ 2011-10-13  8:00 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: xen-devel@lists.xensource.com, JBeulich@suse.com

Thanks for splitting these out.

On Wed, 2011-10-12 at 23:12 +0100, Konrad Rzeszutek Wilk wrote:
[...]
> + * The backend can optionally provide two extra XenBus attributes to
> + * further optimize the discard functionality:
> + * 'discard-aligment' - Devices that support discard functionality may
> + * internally allocate space in units that are bigger than the exported
> + * logical block size. The discard-alignment parameter indicates how many bytes
> + * the beginning of the partition is offset from the internal allocation unit's
> + * natural alignment.

So this is to account for the case where a physical device can discard
e.g. 128K blocks at a time but the VBD (a better term than "partition"
in the context, I think) starts at e.g. offset 64K within that
underlying device?

Does this mean that the virtual device can discard the first 64K (and
thereafter in 128K chunks), or that it cannot because that would overlap
the first 64K of that block which belongs to something else? Or that it
can try but it may or may not succeed. What about if the secure flag is
set? 

Could we simplify and say that blkback won't expose discard support
unless the underlying block device is correctly aligned for it? i.e.
encourage people to align their underlying storage correctly? Presumably
doing that has other benefits?

> + * 'discard-granularity'  - Devices that support discard functionality may
> + * internally allocate space using units that are bigger than the logical block
> + * size. The discard-granularity parameter indicates the size of the internal
> + * allocation unit in bytes if reported by the device. Otherwise the
> + * discard-granularity will be set to match the device's physical block size.

This is effectively the minimum size you can discard? (modulo the
sub-block at the front arising from discard-alignment).

Presumably the granularity sized blocks are self aligned to that same ?
(again modulo the sub-block at the beginning).

Would there be any benefit to having both these numbers in logical-block
sized units instead of bytes? The rest of the interface typically uses
sectors/segments.

Ian.

>   */
>  #define BLKIF_OP_DISCARD           5
>  
> 
> 


* Re: [PATCH 3 of 3] interface: add 'discard-secure' and BLKIF_DISCARD_SECURE
  2011-10-13  7:59   ` Ian Campbell
@ 2011-10-13 14:12     ` Konrad Rzeszutek Wilk
  0 siblings, 0 replies; 10+ messages in thread
From: Konrad Rzeszutek Wilk @ 2011-10-13 14:12 UTC (permalink / raw)
  To: Ian Campbell; +Cc: xen-devel@lists.xensource.com, JBeulich@suse.com

On Thu, Oct 13, 2011 at 08:59:28AM +0100, Ian Campbell wrote:
> On Wed, 2011-10-12 at 23:12 +0100, Konrad Rzeszutek Wilk wrote:
> > # HG changeset patch
> > # User Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> > # Date 1318457231 14400
> > # Node ID 50850daec7f0486ee7ca69b3d4cb58b4d340a5a7
> > # Parent  15c2d70dbac3e31c2d74b6700e1bb5f8a7d8256e
> > interface: add 'discard-secure' and BLKIF_DISCARD_SECURE
> > 
> > Alter the 'reserved' uint8_t to be used a 'flag'. We use only for
> > one flag: BLKIF_DISCARD_SECURE.
> > 
> > That flag can only be set if the backend has set 'discard-secure' to one.
> > If backend has not set 'discard-secure' to one, that flag will have no
> > effect.
> > 
> > Acked-by: Jan Beulich <JBeulich@suse.com>
> > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> > 
> > diff -r 15c2d70dbac3 -r 50850daec7f0 xen/include/public/io/blkif.h
> > --- a/xen/include/public/io/blkif.h	Wed Oct 12 18:07:07 2011 -0400
> > +++ b/xen/include/public/io/blkif.h	Wed Oct 12 18:07:11 2011 -0400
> > @@ -107,7 +107,7 @@
> >   *     e07154r6-Data_Set_Management_Proposal_for_ATA-ACS2.doc
> >   * http://www.seagate.com/staticfiles/support/disc/manuals/
> >   *     Interface%20manuals/100293068c.pdf
> > - * The backend can optionally provide two extra XenBus attributes to
> > + * The backend can optionally provide three extra XenBus attributes to
>                                          these
> 
> avoids patching (or more likely forgetting to patch) this line every
> time we add an attribute.
> 
> >   * further optimize the discard functionality:
> >   * 'discard-aligment' - Devices that support discard functionality may
> >   * internally allocate space in units that are bigger than the exported
> > @@ -119,6 +119,9 @@
> >   * size. The discard-granularity parameter indicates the size of the internal
> >   * allocation unit in bytes if reported by the device. Otherwise the
> >   * discard-granularity will be set to match the device's physical block size.
> > + * 'discard-secure' - All copies of the discarded sectors (potentially created
> > + * by garbage collection) must also be erased.  To use this feature, the flag
> > + * BLKIF_DISCARD_SECURE must be set in the blkif_request_trim.
> 
> Stray "trim" here.

Duh!
> 
> >   */
> >  #define BLKIF_OP_DISCARD           5
> 
> It just occurred to me that if reusing the reserved field is going to
> prove a problem we could have had BLKIF_OP_DISCARD_SECURE. I think we've
> got things under control though.

Yeah, I think the reserved->flag attribute worked out nicely.
> 
> >  
> > @@ -159,7 +162,8 @@ typedef struct blkif_request blkif_reque
> >   */
> >  struct blkif_request_discard {
> >      uint8_t        operation;    /* BLKIF_OP_DISCARD                     */
> > -    uint8_t        reserved;     /*                                      */
> > +    uint8_t        flag;         /* BLKIF_DISCARD_SECURE or zero         */
> > +#define BLKIF_DISCARD_SECURE (1<<0)  /* ignored if discard-secure=0      */
> >      blkif_vdev_t   handle;       /* same as for read/write requests      */
> >      uint64_t       id;           /* private guest value, echoed in resp  */
> >      blkif_sector_t sector_number;/* start sector idx on disk             */
> > 
> > 
> 


* Re: [PATCH 2 of 3] interface: Flesh out the BLKIF_OP_DISCARD description
  2011-10-13  8:00   ` Ian Campbell
@ 2011-10-13 14:32     ` Konrad Rzeszutek Wilk
  2011-10-13 15:04     ` Konrad Rzeszutek Wilk
  2011-10-13 15:32     ` Other things we need to do with backend/blkfront Was:Re: " Konrad Rzeszutek Wilk
  2 siblings, 0 replies; 10+ messages in thread
From: Konrad Rzeszutek Wilk @ 2011-10-13 14:32 UTC (permalink / raw)
  To: Ian Campbell; +Cc: xen-devel@lists.xensource.com, JBeulich@suse.com

On Thu, Oct 13, 2011 at 09:00:07AM +0100, Ian Campbell wrote:
> Thanks for splitting these out.
> 
> On Wed, 2011-10-12 at 23:12 +0100, Konrad Rzeszutek Wilk wrote:
> [...]
> > + * The backend can optionally provide two extra XenBus attributes to
> > + * further optimize the discard functionality:
> > + * 'discard-aligment' - Devices that support discard functionality may
> > + * internally allocate space in units that are bigger than the exported
> > + * logical block size. The discard-alignment parameter indicates how many bytes
> > + * the beginning of the partition is offset from the internal allocation unit's
> > + * natural alignment.
> 

[note: I copied the Documentation/ABI/testing/sysfs-block contents]

> So this is to account for the case where a physical device can discard
> e.g. 128K blocks at a time but the VBD (a better term than "partition"
> in the context, I think) starts at e.g. offset 64K within that
> underlying device?

Yes. And the tools, such as 'fdisk/gparted' can take advantage of that
and create the partitions^H^H^VBDs at the proper spots.

> 
> Does this mean that the virtual device can discard the first 64K (and
> thereafter in 128K chunks), or that it cannot because that would overlap
> the first 64K of that block which belongs to something else? Or that it
> can try but it may or may not succeed. What about if the secure flag is
> set? 

They are all "best try, but we might fail."
> 
> Could we simplify and say that blkback won't expose discard support
> unless the underlying block device is correctly aligned for it? i.e.

I am not sure how we would do that. The discard support works for
full devices, not LVM volumes, not partitions. So if the user does not
set up the partitions correctly, it will try to discard but not do a very
good job.

The way Linux currently reports that the alignment is off is by
exporting discard_alignment as -1 if the device is improperly aligned
(/sys/block/sda/discard_alignment).

> encourage people to align their underlying storage correctly? Presumably
> doing that has other benefits?

It does that automatically if the user uses the newer tools
like parted/fdisk.
> 
> > + * 'discard-granularity'  - Devices that support discard functionality may
> > + * internally allocate space using units that are bigger than the logical block
> > + * size. The discard-granularity parameter indicates the size of the internal
> > + * allocation unit in bytes if reported by the device. Otherwise the
> > + * discard-granularity will be set to match the device's physical block size.
> 
> This is effectively the minimum size you can discard? (modulo the
> sub-block at the front arising from discard-alignment).

Yes.
> 
> Presumably the granularity sized blocks are self aligned to that same ?
> (again modulo the sub-block at the beginning).

Yes.

> 
> Would there be any benefit to having both these numbers in logical-block
> sized units instead of bytes? The rest of the interface typically uses
> sectors/segments.

Uhh, I would prefer not to, as we would have to convert those values
back to bytes when providing them to the block API. And the backend would
have to convert from bytes to sectors/segments again.

But this got me thinking: I don't think we actually figure out the
correct block size. Meaning we just hard-code 512. But then I am not
sure what Linux is doing either:

scsi 2:0:0:0: Direct-Access     ATA      INTEL SSDSA2M080 2CV1 PQ: 0 ANSI: 5
sd 2:0:0:0: [sda] 156301488 512-byte logical blocks: (80.0 GB/74.5 GiB)
sd 2:0:0:0: Attached scsi generic sg0 type 0
scsi 3:0:0:0: Direct-Access     ATA      ST3250410AS      3.AA PQ: 0 ANSI: 5
sd 3:0:0:0: [sdb] 488397168 512-byte logical blocks: (250 GB/232 GiB)
sd 3:0:0:0: Attached scsi generic sg1 type 0

And logical_block_size is 512, discard_granularity is 512, and
discard_alignment is zero.
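
To make the granularity/alignment discussion concrete, here is a rough
sketch of how a frontend could shrink a sector range so that both ends
fall on allocation-unit boundaries. It assumes one particular reading of
the two attributes: a byte offset p is a boundary when
p % granularity == alignment % granularity. The 512-byte sector size is
hard-coded, as noted above, and the names clamp_discard() and
struct discard_range are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SECTOR_SIZE 512u   /* assumption: hard-coded sector size */

struct discard_range {
    uint64_t first_sector;
    uint64_t nr_sectors;
};

/*
 * Shrink [start, start + nr) to the largest sub-range whose ends both sit
 * on allocation-unit boundaries. granularity and alignment are in bytes,
 * as published by the backend. Returns 1 and fills *out when a non-empty
 * range remains, 0 otherwise. Overflow of start + nr is not handled.
 */
static int clamp_discard(uint64_t start, uint64_t nr,
                         uint64_t granularity, uint64_t alignment,
                         struct discard_range *out)
{
    uint64_t gran, align, end, rem, first, edge, last;

    assert(out != NULL);
    if (granularity == 0 || granularity % SECTOR_SIZE != 0 ||
        alignment % SECTOR_SIZE != 0)
        return 0;

    gran  = granularity / SECTOR_SIZE;          /* unit size in sectors   */
    align = (alignment / SECTOR_SIZE) % gran;   /* boundary phase, < gran */
    end   = start + nr;

    /* Round the start up to the next boundary. */
    rem   = (start + gran - align) % gran;
    first = rem ? start + (gran - rem) : start;

    /* Round the end down to the previous boundary. */
    edge = (end + gran - align) % gran;
    if (edge > end)
        return 0;
    last = end - edge;

    if (last <= first)
        return 0;
    out->first_sector = first;
    out->nr_sectors   = last - first;
    return 1;
}
```

With the 128K/64K example from earlier in the thread, a request for
sectors 0..1023 is reduced to sectors 128..895. With the values reported
above (granularity 512, alignment zero) every range passes through
unchanged. Since discard is only a hint, dropping the unaligned head and
tail loses nothing that was guaranteed.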


* Re: [PATCH 2 of 3] interface: Flesh out the BLKIF_OP_DISCARD description
  2011-10-13  8:00   ` Ian Campbell
  2011-10-13 14:32     ` Konrad Rzeszutek Wilk
@ 2011-10-13 15:04     ` Konrad Rzeszutek Wilk
  2011-10-13 15:32     ` Other things we need to do with backend/blkfront Was:Re: " Konrad Rzeszutek Wilk
  2 siblings, 0 replies; 10+ messages in thread
From: Konrad Rzeszutek Wilk @ 2011-10-13 15:04 UTC (permalink / raw)
  To: Ian Campbell; +Cc: xen-devel@lists.xensource.com, JBeulich@suse.com

On Thu, Oct 13, 2011 at 09:00:07AM +0100, Ian Campbell wrote:
> Thanks for splitting these out.
> 
> On Wed, 2011-10-12 at 23:12 +0100, Konrad Rzeszutek Wilk wrote:
> [...]
> > + * The backend can optionally provide two extra XenBus attributes to
> > + * further optimize the discard functionality:
> > + * 'discard-aligment' - Devices that support discard functionality may
> > + * internally allocate space in units that are bigger than the exported
> > + * logical block size. The discard-alignment parameter indicates how many bytes
> > + * the beginning of the partition is offset from the internal allocation unit's
> > + * natural alignment.
> 
> So this is to account for the case where a physical device can discard
> e.g. 128K blocks at a time but the VBD (a better term than "partition"
> in the context, I think) starts at e.g. offset 64K within that
> underlying device?
> 
> Does this mean that the virtual device can discard the first 64K (and
> thereafter in 128K chunks), or that it cannot because that would overlap

[edit: I don't think I answered this question]
Yes.
> the first 64K of that block which belongs to something else? Or that it
> can try but it may or may not succeed. What about if the secure flag is


* Other things we need to do with backend/blkfront Was:Re: Re: [PATCH 2 of 3] interface: Flesh out the BLKIF_OP_DISCARD description
  2011-10-13  8:00   ` Ian Campbell
  2011-10-13 14:32     ` Konrad Rzeszutek Wilk
  2011-10-13 15:04     ` Konrad Rzeszutek Wilk
@ 2011-10-13 15:32     ` Konrad Rzeszutek Wilk
  2 siblings, 0 replies; 10+ messages in thread
From: Konrad Rzeszutek Wilk @ 2011-10-13 15:32 UTC (permalink / raw)
  To: Ian Campbell; +Cc: xen-devel@lists.xensource.com, JBeulich@suse.com

> Could we simplify and say that blkback won't expose discard support
> unless the underlying block device is correctly aligned for it? i.e.
> encourage people to align their underlying storage correctly? Presumably
> doing that has other benefits?

It got me thinking that we could do this, but I do not think that should
be spelled out in the interface. Rather, it is up to the backend to either
expose it or not. A check for -1 in the backend should do it. Keep
in mind that the discard operation is a hint, nothing else.
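
A small sketch of what such a check could look like, assuming the backend
has already read the text of the underlying device's discard_alignment
attribute and that a negative value marks a misaligned device, as
described earlier in this thread. The function name discard_alignment_ok()
is hypothetical.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Return true when the attribute text holds a non-negative integer, i.e.
 * the device does not report itself as misaligned. Anything that fails
 * to parse is treated as "do not expose discard".
 */
static bool discard_alignment_ok(const char *text)
{
    char *end = NULL;
    long value;

    if (text == NULL)
        return false;

    errno = 0;
    value = strtol(text, &end, 10);
    if (errno != 0 || end == text)
        return false;                 /* out of range or no digits */
    if (*end != '\0' && *end != '\n')
        return false;                 /* trailing garbage          */

    return value >= 0;
}
```

A backend using this would simply skip creating the "feature-discard"
node when the function returns false, which keeps the policy out of the
interface definition.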

It also got me thinking about the alignment offset, which we do not
expose to the frontend. That is the one where the 63-sector DOS partition
ends up skewing the whole disk layout. That is separate from discard
operations.

It is more of a XenBus attribute. Then there is also the device serial
number, which we don't expose either.
