From: Christoph Hellwig <hch@infradead.org>
To: Matias Bjorling <m@bjorling.me>
Cc: Christoph Hellwig <hch@infradead.org>,
	keith.busch@intel.com, javier@paletta.io,
	linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org,
	axboe@fb.com, linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH 1/5 v2] blk-mq: Add prep/unprep support
Date: Sat, 18 Apr 2015 13:16:10 -0700	[thread overview]
Message-ID: <20150418201610.GB20311@infradead.org> (raw)
In-Reply-To: <5531FD7F.8070809@bjorling.me>

On Sat, Apr 18, 2015 at 08:45:19AM +0200, Matias Bjorling wrote:
> The low-level drivers will be NVMe and vendors' own PCIe drivers. They're
> very generic in nature. Each driver would duplicate the same work, and
> both could have normal and open-channel drives attached.

I didn't say the work should move into the driver, but rather that the
driver should talk to the open channel SSD code directly instead of
hooking into the core block code.

> I'd like to keep blk-mq in the loop. I don't think it will be pretty to
> have two data paths in the drivers. For blk-mq, bios are split/merged on
> the way down, so the actual physical addresses aren't known before the
> IO is diced to the right size.

But you _do_ have two different data paths already.  Nothing says you
can't use blk-mq for your data path, but it should be a separate entry
point, similar to how a SCSI disk and an MMC device both use the block
layer but still use different entry points.
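
To sketch what that could look like -- purely illustrative, not code
from this patch set, and every ocssd_* name below is invented -- the
open channel driver would own its own blk-mq entry point, roughly:

#include <linux/blk-mq.h>
#include <linux/blkdev.h>
#include <linux/string.h>

static int ocssd_queue_rq(struct blk_mq_hw_ctx *hctx,
			  const struct blk_mq_queue_data *bd)
{
	struct request *rq = bd->rq;

	blk_mq_start_request(rq);
	/*
	 * Physical flash address translation would happen here, inside
	 * the open-channel code, rather than in a prep/unprep hook
	 * wired into the core block layer.
	 */
	blk_mq_end_request(rq, 0);	/* pretend instant completion */
	return BLK_MQ_RQ_QUEUE_OK;
}

static struct blk_mq_ops ocssd_mq_ops = {
	.queue_rq	= ocssd_queue_rq,
	.map_queue	= blk_mq_map_queue,
};

static struct request_queue *ocssd_init_queue(struct blk_mq_tag_set *set)
{
	struct request_queue *q;

	memset(set, 0, sizeof(*set));
	set->ops		= &ocssd_mq_ops;
	set->nr_hw_queues	= 1;
	set->queue_depth	= 64;
	set->numa_node		= NUMA_NO_NODE;

	if (blk_mq_alloc_tag_set(set))
		return NULL;

	q = blk_mq_init_queue(set);
	if (IS_ERR(q)) {
		blk_mq_free_tag_set(set);
		return NULL;
	}
	return q;
}

That keeps all of blk-mq's tagging and queueing, but nothing in the
generic submission path needs to know about open channel devices.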

> The reason it shouldn't be under a single block device is that a target
> should be able to provide a global address space. That allows the address
> space to grow/shrink dynamically with the disks, giving a continuously
> growing address space where disks can be added/removed as requirements
> grow or flash ages -- not on a sector level, but on a flash block level.

I don't understand what you mean with a single block device here, but I
suspect we're talking past each other somehow.

> >>In the future, applications can have an API to get/put flash blocks
> >>directly (using the blk_nvm_[get/put]_blk interface).
> >
> >s/application/filesystem/?
> >
> 
> Applications. The goal is that key-value stores, e.g. RocksDB, Aerospike,
> Ceph and similar, have direct access to flash storage. There won't be a
> kernel file-system in between.
> 
> The get/put interface can be seen as a space reservation interface that
> defines where a given process is allowed to access the storage media.
> 
> It can also be seen as providing a block allocator in the kernel, while
> applications implement the rest of the "file-system" in user-space,
> optimized specifically for their data structures. This makes a lot of
> sense for a small subset of database applications (LSM-trees, fractal
> trees, etc.).

While we'll need a proper API for that first, it's just another reason
why we shouldn't shoehorn the open channel SSD support into the block
layer.
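
For a rough illustration of the caller side of that reservation
interface: only the blk_nvm_get_blk/blk_nvm_put_blk names come from
this thread, so the argument types, the nvm_* structures and the GC
flag below are all assumptions, not the actual interface:

struct nvm_lun;			/* a parallel unit on the device (assumed) */
struct nvm_block;		/* one erasable flash block (assumed) */

/* Prototypes as discussed; exact signatures are assumptions. */
struct nvm_block *blk_nvm_get_blk(struct nvm_lun *lun, int is_gc);
void blk_nvm_put_blk(struct nvm_block *blk);

/*
 * A user-space LSM-style store (through whatever syscall/ioctl ends up
 * exposing this) would reserve one flash block per append-only segment
 * and return it once the segment is compacted away:
 */
static struct nvm_block *reserve_segment(struct nvm_lun *lun)
{
	return blk_nvm_get_blk(lun, 0);	/* 0: not a GC allocation */
}

static void release_segment(struct nvm_block *blk)
{
	blk_nvm_put_blk(blk);
}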

Thread overview: 23+ messages
2015-04-15 12:34 [PATCH 0/5 v2] Support for Open-Channel SSDs Matias Bjørling
2015-04-15 12:34 ` [PATCH 1/5 v2] blk-mq: Add prep/unprep support Matias Bjørling
2015-04-17  6:34   ` Christoph Hellwig
2015-04-17  8:15     ` Matias Bjørling
2015-04-17 17:46       ` Christoph Hellwig
2015-04-18  6:45         ` Matias Bjorling
2015-04-18 20:16           ` Christoph Hellwig [this message]
2015-04-19 18:12             ` Matias Bjorling
2015-04-15 12:34 ` [PATCH 2/5 v2] blk-mq: Support for Open-Channel SSDs Matias Bjørling
2015-04-16  9:10   ` Paul Bolle
2015-04-16 10:23     ` Matias Bjørling
2015-04-16 11:34       ` Paul Bolle
2015-04-16 13:29         ` Matias Bjørling
2015-04-15 12:34 ` [PATCH 3/5 v2] lightnvm: RRPC target Matias Bjørling
2015-04-16  9:12   ` Paul Bolle
2015-04-15 12:34 ` [PATCH 4/5 v2] null_blk: LightNVM support Matias Bjørling
2015-04-15 12:34 ` [PATCH 5/5 v2] nvme: " Matias Bjørling
2015-04-16 14:55   ` Keith Busch
2015-04-16 15:14     ` Javier González
2015-04-16 15:52       ` Keith Busch
2015-04-16 16:01         ` James R. Bergsten
2015-04-16 16:12           ` Keith Busch
2015-04-16 17:17     ` Matias Bjorling
