Linux RDMA and InfiniBand development
From: "lizhijian@fujitsu.com" <lizhijian@fujitsu.com>
To: Bob Pearson <rpearsonhpe@gmail.com>
Cc: Zhu Yanjun <zyjzyj2000@gmail.com>,
	"linux-rdma@vger.kernel.org" <linux-rdma@vger.kernel.org>,
	"Hack, Jenny (Ft. Collins)" <jhack@hpe.com>,
	Frank Zago <frank.zago@hpe.com>, Jason Gunthorpe <jgg@ziepe.ca>
Subject: Re: [RFC] Alternative design for fast register physical memory
Date: Wed, 8 Jun 2022 11:23:03 +0000
Message-ID: <46f9f366-d31e-c600-4ac6-7ec440d6baff@fujitsu.com>
In-Reply-To: <20220527124240.GB2960187@ziepe.ca>

Hey Bob

Sorry to disturb you. Are you going to revert it to a single map, as Jason suggested?


Thanks
Zhijian


On 27/05/2022 20:42, Jason Gunthorpe wrote:
> On Tue, May 24, 2022 at 05:28:00PM -0500, Bob Pearson wrote:
>
>> We have a work around by fencing all the local operations which more
>> or less works but will have bad performance.  The maps used in FMRs
>> have fairly short lifetimes but definitely longer than we can
>> support today. I am trying to work out the semantics of everything.
> IBTA specifies the fence requirements, I thought we decided RXE or
> maybe even lustre wasn't following the spec?
>
>> To make this all recoverable in the face of errors let there be more
>> than one map present for an FMR indexed by the key portion of the
>> l/rkeys.
> Real HW doesn't have more than one map, this seems like the wrong
> direction.
>
> As we discussed, there is something wrong with how rxe is processing
> its queues, it isn't following IBTA-defined behaviors in the
> exceptional cases.
>
>> Alternative view of FMRs:
>>
>> verb: ib_alloc_mr(pd, max_num_sg)			- create an empty MR object with no maps
>> 							  with l/rkey = [index, key] with index
>> 							  fixed and key some initial value.
>>
>> verb: ib_update_fast_reg_key(mr, newkey)		- update key portion of l/rkey
>>
>> verb: ib_map_mr_sg(mr, sg, sg_nents, sg_offset)		- create a new map from allocated memory
>> 							  or by re-using an INVALID map. Maps are
>> 							  all the same size (max_num_sg). The
>> 							  key (index) of this map is the current
>> 							  key from l/rkey. The initial state of
>> 							  the map is FREE. (and thus not usable
>> 							  until a REG_MR work request is used.)
> More than one map is nonsense, real HW has a single map, a MR object is that
> single map.
>
>> This is an improvement over the current state. At the moment we have
>> only two maps one for making new ones and one for doing IO. There is
>> no room to back up but at the moment the retry logic assumes that
>> you can which is false. This can be fixed easily by forcing all
>> local operations to be fenced which is what we are doing at the
>> moment at HPE. This can insert long delays between every new FMR
>> instance.  By allowing three maps and then fencing we can back up
>> one broken IO operation without too much of a delay.
> IMHO you need to go back to one map and fix the queue processing
> logic to be spec compliant.
>
> Jason
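
For reference, the l/rkey = [index, key] layout mentioned in the quoted
proposal can be sketched in plain userspace C. This is only an
illustration: struct mr_keys, key_index(), key_var() and
update_fast_reg_key() are hypothetical names, and the 24-bit index /
8-bit key split is assumed from the way the kernel's
ib_update_fast_reg_key() helper replaces only the low 8 bits of lkey
and rkey.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the lkey/rkey pair carried by an MR. */
struct mr_keys {
	uint32_t lkey;
	uint32_t rkey;
};

/* Upper 24 bits: the index, fixed for the lifetime of the MR. */
static inline uint32_t key_index(uint32_t lrkey)
{
	return lrkey >> 8;
}

/* Low 8 bits: the variable key portion chosen by the consumer. */
static inline uint8_t key_var(uint32_t lrkey)
{
	return (uint8_t)(lrkey & 0xffu);
}

/* Replace only the key portion, leaving the index untouched. */
static inline void update_fast_reg_key(struct mr_keys *mr, uint8_t newkey)
{
	assert(mr != NULL);
	mr->lkey = (mr->lkey & 0xffffff00u) | newkey;
	mr->rkey = (mr->rkey & 0xffffff00u) | newkey;
}
```

With a single map per MR, as Jason suggests, changing the key portion
only changes which l/rkey value addresses that one map; the index, and
therefore the MR object, stays the same.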

Thread overview: 4+ messages
2022-05-24 22:28 [RFC] Alternative design for fast register physical memory Bob Pearson
2022-05-26  6:01 ` lizhijian
2022-05-27 12:42 ` Jason Gunthorpe
2022-06-08 11:23   ` lizhijian [this message]
