From: Vlad Buslov <vladbu@mellanox.com>
To: John Hurley <john.hurley@netronome.com>
Cc: Vlad Buslov <vladbu@mellanox.com>, Jiri Pirko <jiri@mellanox.com>,
"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
"simon.horman@netronome.com" <simon.horman@netronome.com>,
"jakub.kicinski@netronome.com" <jakub.kicinski@netronome.com>,
"louis.peens@netronome.com" <louis.peens@netronome.com>,
"oss-drivers@netronome.com" <oss-drivers@netronome.com>
Subject: Re: [RFC net-next v2 1/1] net: sched: prevent duplicate flower rules from tcf_proto destroy race
Date: Fri, 1 Nov 2019 15:08:12 +0000
Message-ID: <vbfd0eby6qv.fsf@mellanox.com>
In-Reply-To: <CAK+XE==iPdOq65FK1_MvTr7x6=dRQDYe7kASLDEkux+D5zUh+g@mail.gmail.com>
On Fri 01 Nov 2019 at 16:54, John Hurley <john.hurley@netronome.com> wrote:
> On Fri, Nov 1, 2019 at 1:01 PM Vlad Buslov <vladbu@mellanox.com> wrote:
>>
>>
>> On Thu 31 Oct 2019 at 21:08, John Hurley <john.hurley@netronome.com> wrote:
>> > When a new filter is added to cls_api, the function
>> > tcf_chain_tp_insert_unique() looks up the protocol/priority/chain to
>> > determine if the tcf_proto is duplicated in the chain's hashtable. It then
>> > creates a new entry or continues with an existing one. In cls_flower, this
>> > allows the function fl_ht_insert_unique to determine if a filter is a
>> > duplicate and reject appropriately, meaning that the duplicate will not be
>> > passed to drivers via the offload hooks. However, when a tcf_proto is
>> > destroyed it is removed from its chain before a hardware remove hook is
>> > hit. This can lead to a race whereby the driver has not received the
>> > remove message but duplicate flows can be accepted. This, in turn, can
>> > lead to the offload driver receiving incorrect duplicate flows and out of
>> > order add/delete messages.
>> >
>> > Prevent duplicates by utilising an approach suggested by Vlad Buslov. A
>> > hash table per block stores each unique chain/protocol/prio being
>> > destroyed. This entry is only removed when the full destroy (and hardware
>> > offload) has completed. If a new flow is being added with the same
>> > identifiers as a tcf_proto being destroyed, then the add request is replayed
>> > until the destroy is complete.
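(For reference, a rough sketch of the mechanism described above. This is
illustrative only, not the actual patch: the struct and function names are
hypothetical, locking and error handling are simplified, and it assumes the
filter-add path can already replay a request when it sees -EAGAIN.)

#include <linux/hashtable.h>
#include <linux/jhash.h>
#include <linux/spinlock.h>
#include <linux/types.h>

/* Hypothetical per-block state; in the patch this would live in
 * struct tcf_block and be set up with hash_init()/spin_lock_init()
 * when the block is created.
 */
struct destroying_protos {
	DECLARE_HASHTABLE(ht, 8);	/* 2^8 = 256 buckets, count illustrative */
	spinlock_t lock;
};

/* Hypothetical stand-in for the identifying fields of a tcf_proto,
 * with the hash node embedded so no separate key allocation is needed.
 */
struct tp_ids {
	u32 chain_index;
	u32 prio;
	u32 protocol;
	struct hlist_node destroy_ht_node;
};

static u32 destroy_hash(u32 chain_index, u32 prio, u32 protocol)
{
	return jhash_3words(chain_index, prio, protocol, 0);
}

/* Destroy path: record the identity before the tp is unlinked from its
 * chain; only call destroy_done() once the hardware remove has completed.
 */
static void mark_destroying(struct destroying_protos *d, struct tp_ids *tp)
{
	spin_lock(&d->lock);
	hash_add(d->ht, &tp->destroy_ht_node,
		 destroy_hash(tp->chain_index, tp->prio, tp->protocol));
	spin_unlock(&d->lock);
}

static void destroy_done(struct destroying_protos *d, struct tp_ids *tp)
{
	spin_lock(&d->lock);
	hash_del(&tp->destroy_ht_node);
	spin_unlock(&d->lock);
}

/* Add path: if a tp with the same chain/prio/protocol is still being
 * torn down, return -EAGAIN so the caller replays the add later.
 */
static int check_destroying(struct destroying_protos *d,
			    u32 chain_index, u32 prio, u32 protocol)
{
	struct tp_ids *tp;
	u32 key = destroy_hash(chain_index, prio, protocol);
	int ret = 0;

	spin_lock(&d->lock);
	hash_for_each_possible(d->ht, tp, destroy_ht_node, key) {
		if (tp->chain_index == chain_index && tp->prio == prio &&
		    tp->protocol == protocol) {
			ret = -EAGAIN;
			break;
		}
	}
	spin_unlock(&d->lock);
	return ret;
}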
>> >
>> > v1->v2:
>> > - put tcf_proto into block->proto_destroy_ht rather than creating new key
>> > (Vlad Buslov)
>> > - add an error log for cases where hash entry insertion fails. Previously
>> > it failed silently with no indication. (Vlad Buslov)
>> >
>> > Fixes: 8b64678e0af8 ("net: sched: refactor tp insert/delete for concurrent execution")
>> > Signed-off-by: John Hurley <john.hurley@netronome.com>
>> > Reviewed-by: Simon Horman <simon.horman@netronome.com>
>> > Reported-by: Louis Peens <louis.peens@netronome.com>
>> > ---
>>
>> Hi John,
>>
>> Patch looks good! However, I think we can simplify it even more and
>> remove duplication of data in tcf_proto (hashtable key contains copy of
>> data that is already available in the struct itself) and remove all
>> dynamic memory allocations. I have refactored your patch accordingly
>> (attached). WDYT?
>>
>
> Hi Vlad,
> Thanks for the review/modifications.
> The changes look good to me but I can fire it through our test setup to be sure.
>
> The only thing I'm a bit concerned about here is the size of the
> static allocation.
> We are now defining quite a large hash table per block (65536 buckets).
> It's hard to know exactly how many elements will be in this at any one time.
> With flushes of large chains it may well become quite full, but I'd
> expect that it will be empty a majority of the time.
> A resizable hash seems a bit more appropriate here.
> Do you have any opinions on this?
>
> John
Yeah, I agree that 65536 buckets is quite an overkill for this. I think
cls API assumes that there are not too many tp instances, because they
are attached to the chain through a regular linked list and each lookup
is a linear search. Given that, I assume the proto_destroy_ht size can
be safely reduced to 256 buckets, if not fewer. A resizable table is an
option, but that also sounds like overkill to me, since the linear
search over the list of chains attached to a block, or over the list of
tp's attached to a chain, will be the main bottleneck if a user creates
hundreds of thousands of proto instances.
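
To put rough numbers on it (illustrative only, assuming a 64-bit kernel
where each bucket head is one pointer; the struct name below is made up,
only the arithmetic matters):

#include <linux/hashtable.h>

struct block_sizing_sketch {
	/* 16 bits -> 2^16 = 65536 buckets
	 *         -> 65536 * 8 bytes = 512 KiB of empty bucket heads
	 *            per block, before a single entry is added.
	 */
	/* DECLARE_HASHTABLE(proto_destroy_ht, 16); */

	/* 8 bits -> 2^8 = 256 buckets
	 *        -> 256 * 8 bytes = 2 KiB per block.
	 */
	DECLARE_HASHTABLE(proto_destroy_ht, 8);
};

Collisions in the smaller table just degrade to a short hlist walk,
which is cheap next to the linear chain/tp lookups mentioned above.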
>
>> [...]
>>
Thread overview: 5+ messages
2019-10-31 19:08 [RFC net-next v2 1/1] net: sched: prevent duplicate flower rules from tcf_proto destroy race John Hurley
2019-11-01 13:01 ` Vlad Buslov
2019-11-01 14:54 ` John Hurley
2019-11-01 15:08 ` Vlad Buslov [this message]
2019-11-01 15:27 ` John Hurley