From: Sagi Grimberg <sagig@dev.mellanox.co.il>
To: Boaz Harrosh <boaz@plexistor.com>,
Chuck Lever <chuck.lever@oracle.com>,
lsf-pc@lists.linux-foundation.org,
Dan Williams <dan.j.williams@intel.com>,
Yigal Korman <yigal@plexistor.com>
Cc: Linux NFS Mailing List <linux-nfs@vger.kernel.org>,
Linux RDMA Mailing List <linux-rdma@vger.kernel.org>,
linux-fsdevel <linux-fsdevel@vger.kernel.org>,
Jan Kara <jack@suse.cz>, Ric Wheeler <rwheeler@redhat.com>
Subject: Re: [LSF/MM TOPIC/ATTEND] RDMA passive target
Date: Wed, 27 Jan 2016 19:27:44 +0200
Message-ID: <56A8FE10.7000309@dev.mellanox.co.il>
In-Reply-To: <56A8F646.5020003@plexistor.com>
Hey Boaz,
> RDMA passive target
> ~~~~~~~~~~~~~~~~~~~
>
> The idea is to have a storage brick that exports a very
> low level pure RDMA API to access its memory based storage.
> The brick might be battery-backed volatile memory, or
> pmem based. In any case the brick might offer a much higher
> capacity than memory by "tiering" to slower media,
> which is enabled by the API.
>
> The API is simple:
>
> 1. Alloc_2M_block_at_virtual_address (ADDR_64_BIT)
> ADDR_64_BIT is any virtual address and defines the logical ID of the block.
> If the ID is already allocated an error is returned.
> If storage is exhausted return => ENOSPC
> 2. Free_2M_block_at_virtual_address (ADDR_64_BIT)
> Space for logical ID is returned to free store and the ID becomes free for
> a new allocation.
> 3. map_virtual_address(ADDR_64_BIT, flags) => RDMA handle
> previously allocated virtual address is locked in memory and an RDMA handle
> is returned.
> Flags: read-only, read-write, shared and so on...
> 4. unmap__virtual_address(ADDR_64_BIT)
> At this point the brick can write data to slower storage if memory space
> is needed. The RDMA handle from [3] is revoked.
> 5. List_mapped_IDs
> An extent based list of all allocated ranges. (This is usually used on
> mount or after a crash)
My understanding is that you're describing a wire protocol, correct?
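To make the semantics of the five operations quoted above concrete, here is a minimal in-memory sketch of the state a brick would have to track. It is only a toy model under stated assumptions, not a wire format and not anyone's implementation: the class and method names are hypothetical, the error codes other than ENOSPC are guesses (the proposal only says "an error is returned"), the handle is an opaque integer rather than a real RDMA key, and the flags argument is accepted but ignored.

```python
import errno

BLOCK_SIZE = 2 * 1024 * 1024  # 2M blocks, as in the quoted proposal


class BrickError(Exception):
    """Carries an errno value; every code except ENOSPC is an assumption."""

    def __init__(self, code):
        super().__init__(errno.errorcode.get(code, str(code)))
        self.errno = code


class DumbBrick:
    """Hypothetical model of the brick's bookkeeping, no RDMA involved."""

    def __init__(self, capacity_blocks):
        self.capacity_blocks = capacity_blocks
        self.allocated = set()   # logical IDs (64-bit virtual addresses)
        self.mapped = {}         # logical ID -> opaque handle
        self._next_handle = 1

    def alloc(self, addr):
        # Op 1: an already allocated ID is an error (EEXIST is assumed);
        # exhausted storage is ENOSPC, as the proposal states.
        if addr in self.allocated:
            raise BrickError(errno.EEXIST)
        if len(self.allocated) >= self.capacity_blocks:
            raise BrickError(errno.ENOSPC)
        self.allocated.add(addr)

    def free(self, addr):
        # Op 2: the ID becomes available again. Refusing to free a block
        # that is still mapped is an assumption of this sketch.
        if addr not in self.allocated:
            raise BrickError(errno.ENOENT)
        if addr in self.mapped:
            raise BrickError(errno.EBUSY)
        self.allocated.remove(addr)

    def map(self, addr, flags=0):
        # Op 3: "lock" the block and hand back a handle. Returning the
        # existing handle on a repeated map is an assumption.
        if addr not in self.allocated:
            raise BrickError(errno.ENOENT)
        if addr not in self.mapped:
            self.mapped[addr] = self._next_handle
            self._next_handle += 1
        return self.mapped[addr]

    def unmap(self, addr):
        # Op 4: revoke the handle; the block may now be tiered out.
        if addr not in self.mapped:
            raise BrickError(errno.ENOENT)
        del self.mapped[addr]

    def list_extents(self):
        # Op 5: extent-based list of allocated ranges as (start, length),
        # merging blocks that sit back to back in the virtual space.
        extents = []
        for addr in sorted(self.allocated):
            if extents and extents[-1][0] + extents[-1][1] == addr:
                start, length = extents[-1]
                extents[-1] = (start, length + BLOCK_SIZE)
            else:
                extents.append((addr, BLOCK_SIZE))
        return extents
```

The model treats logical IDs as opaque keys and does not check for overlapping 2M ranges. Whatever the wire encoding ends up being, this bookkeeping (or something close to it) is what would need persistence semantics on the target side.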
> The dumb brick is not the network allocator / storage manager at all, and it
> is not a smart target / server like an iSER target or pNFS DS. A SW-defined
> application can do that on top of the dumb brick. The motivation is a low-level,
> very low latency API+library, which can be built upon for higher protocols or
> used directly for a very low latency cluster.
> It does, however, manage a virtual allocation map of logical to physical mapping
> of the 2M blocks.
The challenge in my mind would be to have persistence semantics in
place.
>
> Currently both drivers, initiator and target, are in the kernel, but with the
> latest advancements by Dan Williams it can be implemented in user mode as well.
> Almost.
>
> The almost is because:
> 1. If the target is over a /dev/pmemX then all is fine we have 2M contiguous
> memory blocks.
> 2. If the target is over an FS, we have a proposal pending for a falloc_2M_flag
> to ask the FS for contiguous 2M allocations only. If any of the 2M allocations
> fails then return ENOSPC from falloc. This way we guarantee that each 2M block can be
> mapped by a single RDMA handle.
Umm, you don't need the 2M to be contiguous in order to represent it
as a single RDMA handle. If that were true, iSER would never have worked.
Or maybe I misunderstood what you meant...
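
To illustrate the point about contiguity, here is a small sketch of how a registered memory region can be thought of: one handle and one contiguous I/O virtual address range on the wire, backed by a list of pages that may be scattered anywhere in physical memory. This is a simplified model assuming page-list style registration with 4K pages; the names (MemoryRegion, translate, rkey as a plain integer) are hypothetical and are not the verbs API.

```python
PAGE_SIZE = 4096
REGION_SIZE = 2 * 1024 * 1024  # one 2M block


class MemoryRegion:
    """Toy model: a single handle covering many non-contiguous pages."""

    def __init__(self, iova, page_addrs, rkey):
        if iova % PAGE_SIZE or any(p % PAGE_SIZE for p in page_addrs):
            raise ValueError("addresses must be page aligned in this model")
        self.iova = iova                # start address the remote peer uses
        self.pages = list(page_addrs)   # physical page addresses, any order
        self.rkey = rkey                # the single handle given to the peer
        self.length = len(self.pages) * PAGE_SIZE

    def translate(self, addr):
        """Map an address inside the region to the backing physical address."""
        offset = addr - self.iova
        if not 0 <= offset < self.length:
            raise ValueError("address outside the registered region")
        index, within = divmod(offset, PAGE_SIZE)
        return self.pages[index] + within


def scattered_pages(count, base=0x10000000, stride=3):
    """Build a deliberately non-contiguous page list for the example."""
    return [base + i * stride * PAGE_SIZE for i in range(count)]
```

Registering `scattered_pages(REGION_SIZE // PAGE_SIZE)` at some IOVA yields a region whose length is exactly 2M, and the peer only ever sees the IOVA range and the one key. The translation to scattered pages is done on the target side, which is the same mechanism that lets iSER register scatter-gather lists.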