From: Hannes Reinecke <hare@suse.de>
To: Shaun Tancheff <shaun.tancheff@seagate.com>
Cc: Jens Axboe <axboe@fb.com>,
linux-block@vger.kernel.org,
"Martin K. Petersen" <martin.petersen@oracle.com>,
Christoph Hellwig <hch@lst.de>,
Damien Le Moal <damien.lemoal@hgst.com>,
linux-scsi@vger.kernel.org,
Sathya Prakash <sathya.prakash@broadcom.com>
Subject: Re: [PATCH 0/9] block/scsi: Implement SMR drive support
Date: Sat, 9 Apr 2016 10:01:45 +0200
Message-ID: <5708B6E9.50400@suse.de>
In-Reply-To: <CAJVOszAp_u2JQcDxbVatPKkFN7TKp7kxqauBBvSrmcWjeuSYDg@mail.gmail.com>
On 04/08/2016 08:35 PM, Shaun Tancheff wrote:
> On Mon, Apr 4, 2016 at 5:00 PM, Hannes Reinecke <hare@suse.de> wrote:
>> Hi all,
>>
>> here's a patchset implementing SMR (shingled magnetic recording)
>> device support for the block and SCSI layer.
>>
>> There are two main parts to it:
>> - mapping the 'RESET WRITE POINTER' command to the 'discard' functionality.
>> The 'RESET WRITE POINTER' operation is pretty close to the existing
>> 'discard' functionality with the 'discard_zeroes_blocks' bit set.
>> So I've added a new 'reset_wp' provisioning mode for this.
>
> Completely agree; the REQ_OP_DISCARD -> Reset WP translation
> seems like a good idea. I have tried something similar and ended up
> essentially adding a 'reset wp' flag instead.
> Now I am optimistic to see if I can use your patch to get the
> discard -> reset wp working in my device mapper.
>
It works quite well here with my setup, although I've tripped across two
caveats:
- We currently don't handle conventional zones.
  It would make sense to fall back to normal block zeroing here.
- Issuing 'RESET WP' is dead slow (at least on the prototypes I've had).
  Short-circuiting it for empty zones is a _major_ performance win here;
  the time for issuing discards for an entire drive is reduced by
  several orders of magnitude. So you absolutely need an in-kernel
  zone tree for this.
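The short-circuit described above can be sketched in user-space C as follows. This is a minimal illustration under stated assumptions, not the code from the patchset: `struct zone`, `zone_lookup()` and `discard_decide()` are hypothetical names, and a plain binary search tree stands in for the kernel rbtree.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical zone descriptor; field names are illustrative only. */
enum zone_type { ZONE_CONVENTIONAL, ZONE_SEQ_WRITE_REQUIRED };

struct zone {
	uint64_t start;              /* first sector of the zone */
	uint64_t len;                /* zone length in sectors */
	uint64_t wp;                 /* write pointer, absolute sector */
	enum zone_type type;
	struct zone *left, *right;   /* plain BST links, stand-in for rb_node */
};

/* What to do with a discard that targets a given zone. */
enum discard_action {
	DISCARD_SKIP,       /* zone already empty, nothing to send */
	DISCARD_RESET_WP,   /* issue RESET WRITE POINTER */
	DISCARD_ZEROOUT,    /* conventional zone, fall back to zeroing */
	DISCARD_ERROR       /* no zone covers this sector */
};

/* Walk the tree and return the zone containing 'sector', or NULL. */
static const struct zone *zone_lookup(const struct zone *node, uint64_t sector)
{
	while (node) {
		if (sector < node->start)
			node = node->left;
		else if (sector >= node->start + node->len)
			node = node->right;
		else
			return node;
	}
	return NULL;
}

/* Decide how to handle a discard without touching the drive. */
static enum discard_action discard_decide(const struct zone *root,
					  uint64_t sector)
{
	const struct zone *z = zone_lookup(root, sector);

	if (!z)
		return DISCARD_ERROR;
	if (z->type == ZONE_CONVENTIONAL)
		return DISCARD_ZEROOUT;

	/* The cached write pointer must lie inside the zone. */
	assert(z->wp >= z->start && z->wp <= z->start + z->len);

	/* Write pointer at zone start means the zone is empty. */
	if (z->wp == z->start)
		return DISCARD_SKIP;
	return DISCARD_RESET_WP;
}
```

The decision is made purely from the cached zone state, which is why a discard over an empty zone costs a tree lookup rather than a command to the drive.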
>> - Adding a 'zone' pointer to the request queue. This pointer holds an
>> RB-tree with the zone information, which can be used by other layers
>> to access the write pointer.
>
> Here is where I have some concerns. Having a common in-kernel
> shadow of the drive's zone state seems problematic to me.
>
Well, this is the general SMR programming model, is it not?
And as already pointed out above you really want this tree to be present
to avoid unnecessary RESET WP calls.
You also need it to format READ calls correctly for host-managed drives;
from my understanding of the programming model any READ call crossing
the write pointer will be aborted.
You could easily circumvent that by splitting the READ call in two
parts, one up to the write pointer and another beyond it. For that, again,
you need the zone tree.
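A minimal sketch of that split, assuming the READ lies within a single zone whose write pointer is already known from the zone tree. `struct read_split` and `split_read_at_wp()` are hypothetical names used only for illustration; what to do with the part beyond the write pointer (submit it separately or fill it with zeroes) is left to the caller.

```c
#include <assert.h>
#include <stdint.h>

/* Result of cutting one READ at the write pointer. */
struct read_split {
	uint64_t below_start, below_len;   /* sectors before the write pointer */
	uint64_t above_start, above_len;   /* sectors at or after the write pointer */
};

/*
 * Split the range [sector, sector + nr_sectors) at 'wp'.
 * A part with length 0 means no request is needed for that side.
 */
static struct read_split split_read_at_wp(uint64_t wp, uint64_t sector,
					  uint64_t nr_sectors)
{
	struct read_split s = { sector, 0, sector, 0 };
	uint64_t end = sector + nr_sectors;

	assert(end >= sector);   /* reject ranges that wrap around */

	if (end <= wp) {
		/* Entirely below the write pointer: one normal READ. */
		s.below_len = nr_sectors;
		s.above_start = end;
	} else if (sector >= wp) {
		/* Entirely at or beyond the write pointer. */
		s.above_len = nr_sectors;
	} else {
		/* Crosses the write pointer: cut exactly at wp. */
		s.below_len = wp - sector;
		s.above_start = wp;
		s.above_len = end - wp;
	}
	return s;
}
```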
> Also if I am understanding the direction here it is to hold the zone
> information in an rbtree. Since that comes to just under 30,000
> entries I think it would be better to shift to an array of
> write pointer offsets.
>
The thing is that using an rbtree might actually be faster than an
array; the rbtree entries easily fit into the processor cache, whereas
the array doesn't. So you might end up with slower access when using
arrays, despite them being easier to code.
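For comparison, the array layout proposed in the quoted text might look like the sketch below. It assumes a uniform zone size (`ZONE_SECTORS`, here 256 MiB with 512-byte sectors, which is an assumption and not taken from this thread), and `struct wp_table` and its helpers are hypothetical names. With 32-bit offsets, 30,000 entries occupy 120,000 bytes; which layout is faster in practice would have to be measured.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed uniform zone size: 256 MiB in 512-byte sectors. */
#define ZONE_SECTORS 524288u

/* One write pointer offset per zone, indexed by zone number. */
struct wp_table {
	uint32_t nr_zones;
	uint32_t *wp_off;   /* offset of the write pointer within the zone */
};

/* Allocate the table with every zone empty. Returns 0 on success. */
static int wp_table_init(struct wp_table *t, uint32_t nr_zones)
{
	t->wp_off = calloc(nr_zones, sizeof(*t->wp_off));
	if (!t->wp_off)
		return -1;
	t->nr_zones = nr_zones;
	return 0;
}

/* Absolute write pointer of the zone containing 'sector'. */
static uint64_t wp_table_get(const struct wp_table *t, uint64_t sector)
{
	uint64_t idx = sector / ZONE_SECTORS;

	assert(idx < t->nr_zones);
	return idx * ZONE_SECTORS + t->wp_off[idx];
}

/* Advance the write pointer after a successful write of 'nr' sectors. */
static void wp_table_advance(struct wp_table *t, uint64_t sector, uint32_t nr)
{
	uint64_t idx = sector / ZONE_SECTORS;

	assert(idx < t->nr_zones);
	assert((uint64_t)t->wp_off[idx] + nr <= ZONE_SECTORS);
	t->wp_off[idx] += nr;
}
```

The lookup is a single division and one array access, but this only works while all zones have the same size; a tree keyed by start sector does not need that assumption.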
> At the moment my translation layer keeps track of activity and state
> of all the zones on the drive so that is how I have been handling
> the zone data up to this point.
>
As outlined above: Any driver/filesystem needs access to the zone states
as it might need to align its internal structures to the zones.
But you also need to keep track of the zones in the SCSI layer so as to
format the RESET WP correctly. Which means you basically need a common tree.
As you might've seen I've also programmed my own zoned device-mapper
device, caching individual zones. We should discuss whether those two
approaches can be merged, to end up with a common device-mapper target.
Cheers,
Hannes
--
Dr. Hannes Reinecke zSeries & Storage
hare@suse.de +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)