Linux RDMA and InfiniBand development
 help / color / mirror / Atom feed
From: Bob Pearson <rpearsonhpe@gmail.com>
To: "lizhijian@fujitsu.com" <lizhijian@fujitsu.com>,
	Haris Iqbal <haris.iqbal@ionos.com>
Cc: Yanjun Zhu <yanjun.zhu@linux.dev>, Jason Gunthorpe <jgg@ziepe.ca>,
	"linux-rdma@vger.kernel.org" <linux-rdma@vger.kernel.org>,
	Leon Romanovsky <leon@kernel.org>,
	Guoqing Jiang <guoqing.jiang@linux.dev>
Subject: Re: [RESEND RFC PATCH for-next] Revert "RDMA/rxe: Create duplicate mapping tables for FMRs"
Date: Mon, 25 Jul 2022 14:15:02 -0500	[thread overview]
Message-ID: <30b8e464-c3dc-47c2-4636-9fd02a2cd6cd@gmail.com> (raw)
In-Reply-To: <9f67703e-a030-c467-dfd6-6b4cf0538e7f@fujitsu.com>

On 7/24/22 23:00, lizhijian@fujitsu.com wrote:
> 
> 
> On 22/07/2022 02:18, Bob Pearson wrote:
>> On 7/20/22 05:50, Haris Iqbal wrote:
>>> On Wed, Jul 20, 2022 at 12:22 PM Li Zhijian <lizhijian@fujitsu.com> wrote:
>>>> Below 2 commits will be reverted:
>>>>       8ff5f5d9d8cf ("RDMA/rxe: Prevent double freeing rxe_map_set()")
>>>>       647bf13ce944 ("RDMA/rxe: Create duplicate mapping tables for FMRs")
>>>>
>>>> The community has a few bug reports which pointed this commit at last.
>>>> Some proposals are raised up in the meantime but all of them have no
>>>> follow-up operation.
>>>>
>>>> The previous commit led the map_set of FMR to be not avaliable any more if
>>>> the MR is registered again after invalidating. Although the mentioned
>>>> patch try to fix a potential race in building/accessing the same table
>>>> for fast memory regions, it broke rnbd etc ULPs. Since the latter could
>>>> be worse, revert this patch.
>>>>
>>>> With previous commit, it's observed that a same MR in rnbd server will
>>>> trigger below code path:
>>> Looks Good. I tested the patch against rdma for-next and it solves the
>>> problem mentioned in the commit.
>>> One small nitpick. It should be rtrs, and not rnbd in the commit message.
>>>
>>> Feel free to add my,
>>>
>>> Tested-by: Md Haris Iqbal <haris.iqbal@ionos.com>
>>>
>> Li,
>>
>> It has been a while since this was added. If I recall there was a problem in rnfs
>> that this was supposed to fix. It was also supposed to allow overlap of using the
>> previous mappings and the driver creating new ones. But it seems that most fmr
>> based ulps don't require it, maybe all. Before we do this we should make sure that
>> blktests, srp, lustre, rnfs, etc all work. Have these been tested?
> 
> blktests(nvme over RXE and srp) works fine after this reverting.
> lustre and rnfs have not tested because I have no lustre and rnfs local environment currently.
> 
> I do wish to know what's the original problem you fixed in 647bf13ce944 ("RDMA/rxe: Create duplicate mapping tables for FMRs")
> Could we have other approaches for it such as add locks to prevent the potential *race*.
> 
> I agreed on the view[1]("you need to go back to one map") from Jason
> 
> [1]: https://lore.kernel.org/all/20220527124240.GB2960187@ziepe.ca/
> 
> Thanks
> ZHijian
> 
>>
>> Bob

Li,

I agree. You can add

Reviewed-by: Bob Pearson <rpearsonhpe@gmail.com>

Our Lustre testing is still on older versions of the driver with one map and it works fine.
I am not able to reproduce the rnfs results from last year so I just don't know.

I still have failures in srp blktests but I doubt it is related to this issue. Tests 002 and 011
seem to hang and I have never been able to figure out why.

I suspect that mr->state can be racy. It is a state machine that can trigger changes from client code or
tasklet code on the request side or the response side. I don't have solid evidence that this has happened
but it seems to me like a good idea to guard the state machine with a spin lock. I will post a patch that
does that.

Bob

  reply	other threads:[~2022-07-25 19:15 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-07-20 10:29 [RESEND RFC PATCH for-next] Revert "RDMA/rxe: Create duplicate mapping tables for FMRs" Li Zhijian
2022-07-20 10:50 ` Haris Iqbal
2022-07-21 18:18   ` Bob Pearson
2022-07-25  4:00     ` lizhijian
2022-07-25 19:15       ` Bob Pearson [this message]
2022-07-21 22:19   ` Bob Pearson
2022-07-22 10:43     ` Haris Iqbal
2022-07-22 15:13       ` Jason Gunthorpe
2022-07-25  1:13     ` lizhijian
2022-07-25  2:16       ` Guoqing Jiang
2022-07-25  9:25         ` lizhijian

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=30b8e464-c3dc-47c2-4636-9fd02a2cd6cd@gmail.com \
    --to=rpearsonhpe@gmail.com \
    --cc=guoqing.jiang@linux.dev \
    --cc=haris.iqbal@ionos.com \
    --cc=jgg@ziepe.ca \
    --cc=leon@kernel.org \
    --cc=linux-rdma@vger.kernel.org \
    --cc=lizhijian@fujitsu.com \
    --cc=yanjun.zhu@linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox