From: "lizhijian@fujitsu.com" <lizhijian@fujitsu.com>
To: Bob Pearson <rpearsonhpe@gmail.com>
Cc: Zhu Yanjun <zyjzyj2000@gmail.com>,
"linux-rdma@vger.kernel.org" <linux-rdma@vger.kernel.org>,
"Hack, Jenny (Ft. Collins)" <jhack@hpe.com>,
Frank Zago <frank.zago@hpe.com>, Jason Gunthorpe <jgg@ziepe.ca>
Subject: Re: [RFC] Alternative design for fast register physical memory
Date: Wed, 8 Jun 2022 11:23:03 +0000
Message-ID: <46f9f366-d31e-c600-4ac6-7ec440d6baff@fujitsu.com>
In-Reply-To: <20220527124240.GB2960187@ziepe.ca>
Hey Bob,
Sorry to disturb you. Are you going to revert to a single map, as Jason suggested?
Thanks
Zhijian
On 27/05/2022 20:42, Jason Gunthorpe wrote:
> On Tue, May 24, 2022 at 05:28:00PM -0500, Bob Pearson wrote:
>
>> We have a workaround, fencing all the local operations, which more
>> or less works but will have bad performance. The maps used in FMRs
>> have fairly short lifetimes but definitely longer than we can
>> support today. I am trying to work out the semantics of everything.
> IBTA specifies the fence requirements, I thought we decided RXE or
> maybe even lustre wasn't following the spec?
>
>> To make this all recoverable in the face of errors let there be more
>> than one map present for an FMR indexed by the key portion of the
>> l/rkeys.
> Real HW doesn't have more than one map, this seems like the wrong
> direction.
>
> As we discussed, there is something wrong with how rxe is processing
> its queues; it isn't following IBTA-defined behavior in the
> exceptional cases.
>
>> Alternative view of FMRs:
>>
>> verb: ib_alloc_mr(pd, max_num_sg) - create an empty MR object with no maps
>>       with l/rkey = [index, key] with index
>>       fixed and key some initial value.
>>
>> verb: ib_update_fast_reg_key(mr, newkey) - update key portion of l/rkey
>>
>> verb: ib_map_mr_sg(mr, sg, sg_nents, sg_offset) - create a new map from allocated memory
>>       or by re-using an INVALID map. Maps are
>>       all the same size (max_num_sg). The
>>       key (index) of this map is the current
>>       key from l/rkey. The initial state of
>>       the map is FREE. (and thus not usable
>>       until a REG_MR work request is used.)
> More than one map is nonsense, real HW has a single map, a MR object is that
> single map.
>
>> This is an improvement over the current state. At the moment we have
>> only two maps, one for making new ones and one for doing IO. There is
>> no room to back up, but at the moment the retry logic assumes that
>> you can, which is false. This can be fixed easily by forcing all
>> local operations to be fenced, which is what we are doing at the
>> moment at HPE. This can insert long delays between every new FMR
>> instance. By allowing three maps and then fencing we can back up
>> one broken IO operation without too much of a delay.
> IMHO you need to go back to one map and fix the queue processing
> logic to be spec compliant.
>
> Jason
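
For readers unfamiliar with the "l/rkey = [index, key]" notation in the quoted RFC, here is a minimal sketch of that split. It assumes the usual layout of a 32-bit l/rkey with the index in the upper 24 bits and the key in the low 8 bits, which is the byte that ib_update_fast_reg_key() rewrites. The helper names (mr_index, mr_key, make_rkey, update_key) are made up for illustration and are not rxe or ib_core functions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: [ 24-bit index | 8-bit key ] in a 32-bit l/rkey. */
#define KEY_BITS   8u
#define KEY_MASK   0xffu
#define INDEX_MAX  0xffffffu

/* Fixed part: identifies the MR object for its whole lifetime. */
static inline uint32_t mr_index(uint32_t rkey)
{
    return rkey >> KEY_BITS;
}

/* Variable part: changes on each fast-register cycle. */
static inline uint8_t mr_key(uint32_t rkey)
{
    return (uint8_t)(rkey & KEY_MASK);
}

/* Build an l/rkey from its two parts (hypothetical helper). */
static inline uint32_t make_rkey(uint32_t index, uint8_t key)
{
    assert(index <= INDEX_MAX);
    return (index << KEY_BITS) | key;
}

/* Replace only the key byte and leave the index untouched. */
static inline uint32_t update_key(uint32_t rkey, uint8_t newkey)
{
    return (rkey & ~(uint32_t)KEY_MASK) | newkey;
}
```

With this layout, the disagreement in the thread is about what the key byte selects: in the RFC it would pick one of several maps attached to the same index, whereas in the single-map model Jason describes it only validates access to the one map the MR owns.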