target-devel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Sagi Grimberg <sagi@grimberg.me>
To: Laurence Oberman <loberman@redhat.com>,
	Max Gurtovoy <mgurtovoy@nvidia.com>,
	David Jeffery <djeffery@redhat.com>
Cc: linux-rdma@vger.kernel.org, target-devel@vger.kernel.org
Subject: Re: [Patch 0/2] iscsit/isert deadlock prevention under heavy I/O
Date: Wed, 16 Mar 2022 16:39:31 +0200	[thread overview]
Message-ID: <165698e9-9ccd-c840-b926-48f2ee7c6dcb@grimberg.me> (raw)
In-Reply-To: <720ebd1f98ab3c709443176011f51d6e3ed37272.camel@redhat.com>


>>>>>>> Hi David,
>>>>>>>
>>>>>>> thanks for the report.
>>>>>>>
>>>>>>> Please check how we fixed that in NVMf in Sagi's commit:
>>>>>>>
>>>>>>> nvmet-rdma: fix possible bogus dereference under heavy load
>>>>>>> (commit:
>>>>>>> 8407879c4e0d77)
>>>>>>>
>>>>>>> Maybe this can be done in isert and will solve this problem
>>>>>>> in
>>>>>>> a simpler
>>>>>>> way.
>>>>>>>
>>>>>>> is it necessary to change max_cmd_sn ?
>>>>>>>
>>>>>>>
>>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> Sure, there are alternative methods which could fix this
>>>>>> immediate
>>>>>> issue. e.g. We could make the command structs for scsi
>>>>>> commands
>>>>>> get
>>>>>> allocated from a mempool. Is there a particular reason you
>>>>>> don't
>>>>>> want
>>>>>> to do anything to modify max_cmd_sn behavior?
>>>>>
>>>>> according to the description the command was parsed successful
>>>>> and
>>>>> sent
>>>>> to the initiator.
>>>>>
>>>>
>>>> Yes.
>>>>
>>>>> Why do we need to change the window ? it's just a race of
>>>>> putting
>>>>> the
>>>>> context back to the pool.
>>>>>
>>>>> And this race is rare.
>>>>>
>>>>
>>>> Sure, it's going to be rare. Systems using isert targets with
>>>> infiniband are going to be naturally rare. It's part of why I
>>>> left
>>>> the
>>>> max_cmd_sn behavior untouched for non-isert iscsit since they
>>>> seem to
>>>> be fine as is. But it's easily and regularly triggered by some
>>>> systems
>>>> which use isert, so worth fixing.
>>>>
>>>>>> I didn't do something like this as it seems to me to go
>>>>>> against
>>>>>> the
>>>>>> intent of the design. It makes the iscsi window mostly
>>>>>> meaningless in
>>>>>> some conditions and complicates any allocation path since it
>>>>>> now
>>>>>> must
>>>>>> gracefully and sanely handle an iscsi_cmd/isert_cmd not
>>>>>> existing.
>>>>>> I
>>>>>> assume special commands like task-management, logouts, and
>>>>>> pings
>>>>>> would
>>>>>> need a separate allocation source to keep from being dropped
>>>>>> under
>>>>>> memory load.
>>>>>
>>>>> it won't be dropped. It would be allocated dynamically and
>>>>> freed
>>>>> (instead of putting it back to the pool).
>>>>>
>>>>
>>>> If it waits indefinitely for an allocation it ends up with a
>>>> variation
>>>> of the original problem under memory pressure. If it waits for
>>>> allocation on isert receive, then receive stalls under memory
>>>> pressure
>>>> and won't process the completions which would have released the
>>>> other
>>>> iscsi_cmd structs just needing final acknowledgement.
>>
>> If your system is under such memory pressure can you can't allocate
>> few
>> bytes for isert response, the silent drop
>>
>> of the command is your smallest problem. You need to keep the system
>> from crashing. And we do that in my suggestion.
>>
>>>>
>>>> David Jeffery
>>>>
>>>
>>> Folks this is a pending issue stopping a customer from making
>>> progress.
>>> They run Oracle and very high workloads on EDR 100 so David fixed
>>> this
>>> fosusing on the needs of the isert target changes etc.
>>>
>>> Are you able to give us technical reasons why David's patch is not
>>> suitable and why we he would have to start from scratch.
>>
>> You shouldn't start from scratch. You did all the investigation and
>> the
>> debugging already.
>>
>> Coding a solution is the small part after you understand the root
>> cause.
>>
>>>
>>> We literally spent weeks on this and built another special lab for
>>> fully testing EDR 100.
>>> This issue was pending in a BZ for some time and Mellnox had eyes
>>> on it
>>> then but this latest suggestion was never put forward in that BZ to
>>> us.
>>
>> Mellanox maintainers saw this issue few days before you sent it
>> upstream. I suggested sending it upstream and have a discussion here
>> since it has nothing to do with Mellanox adapters and Mellanox SW
>> stack
>> MLNX_OFED.
>>
>> Our job as maintainers and reviewers in the community is to see the
>> big
>> picture and suggest solutions that not always same as posted in the
>> mailing list.
>>
>>>
>>> Sincerely
>>> Laurence
>>>
>>
>>
> 
> Hi Max

Hey,

> The issue was reported with the OFED stack at the customer, so its why
> we opened the BZ to get the Mallnox partners engineers engaged.
> We had them then see if it also existed with the inbox stack which it
> did.
> Sergey worked a little bit on the issue but did not have the same
> suggestion you provivided and asked David for help.

I think you can move the corporate discussions offline.

> We will be happy to take the fix you propose doing it your way. May I
> that the engineewrs to work on this the most and understand the code
> best propose a fix your way.

I tend to agree with Max, I looked into the patch and I can't say that
we know for a fact that incrementing the cmdsn after releasing the
iscsi cmd will not introduce anything else (although looks fine at
a high-level).

Is there any measure-able performance implication?

Max, doing dynamic allocation is also a valid fix.

  reply	other threads:[~2022-03-16 14:39 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-03-11 17:57 [Patch 0/2] iscsit/isert deadlock prevention under heavy I/O David Jeffery
2022-03-11 17:57 ` [PATCH 1/2] isert: support for unsolicited NOPIN with no response David Jeffery
2022-03-11 17:57 ` [PATCH 2/2] iscsit: increment max_cmd_sn for isert on command release David Jeffery
2022-03-11 19:08 ` [Patch 0/2] iscsit/isert deadlock prevention under heavy I/O Laurence Oberman
2022-03-13  9:59 ` Max Gurtovoy
2022-03-14 13:57   ` David Jeffery
2022-03-14 14:52     ` Max Gurtovoy
2022-03-14 15:55       ` David Jeffery
2022-03-14 17:40         ` Laurence Oberman
2022-03-16 10:38           ` Max Gurtovoy
2022-03-16 13:07             ` Laurence Oberman
2022-03-16 14:39               ` Sagi Grimberg [this message]
2022-03-16 15:26                 ` Laurence Oberman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=165698e9-9ccd-c840-b926-48f2ee7c6dcb@grimberg.me \
    --to=sagi@grimberg.me \
    --cc=djeffery@redhat.com \
    --cc=linux-rdma@vger.kernel.org \
    --cc=loberman@redhat.com \
    --cc=mgurtovoy@nvidia.com \
    --cc=target-devel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).