From: Oz Shlomo <ozsh@nvidia.com>
To: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>, <netdev@vger.kernel.org>,
<netfilter-devel@vger.kernel.org>,
Saeed Mahameed <saeedm@nvidia.com>,
"Paul Blakey" <paulb@nvidia.com>
Subject: Re: [PATCH nf-next] netfilter: flowtable: separate replace, destroy and stats to different workqueues
Date: Thu, 25 Mar 2021 10:46:12 +0200
Message-ID: <b89d8340-ca1c-1424-bbaa-0e85d37a84bb@nvidia.com>
In-Reply-To: <YFutK3Mn+h5OWNXe@horizon.localdomain>

Hi Marcelo,

On 3/24/2021 11:20 PM, Marcelo Ricardo Leitner wrote:
> On Wed, Mar 24, 2021 at 01:24:53PM +0200, Oz Shlomo wrote:
>> Hi,
>
> Hi,
>
>>
>> On 3/24/2021 3:38 AM, Pablo Neira Ayuso wrote:
>>> Hi Marcelo,
>>>
>>> On Mon, Mar 22, 2021 at 03:09:51PM -0300, Marcelo Ricardo Leitner wrote:
>>>> On Wed, Mar 03, 2021 at 05:11:47PM +0100, Pablo Neira Ayuso wrote:
>>> [...]
>>>>> Or perhaps making the cookie unique is sufficient? The cookie refers to
>>>>> the memory address, but memory can be recycled very quickly. If the
>>>>> cookie helps to catch the reorder scenario, then the conntrack id
>>>>> could be used as the cookie instead of the memory address.
>>>>
>>>> Something like this, if I got the idea right, would be even better. If
>>>> the entry actually expired before it had a chance of being offloaded,
>>>> there is no point in offloading it just to remove it right away.
>>>
>>> It would be interesting to explore the idea you describe. Maybe a flag
>>> can be set on stale objects, or the stale object can simply be removed
>>> from the offload queue. I guess it should then be possible to regain
>>> control over the list of pending requests as a batch that is passed
>>> through a single queue_work() call.
>>>
>>
>> Removing stale objects is a good optimization for cases where the rate of
>> established connections exceeds the hardware offload insertion rate.
>> However, with a single workqueue design, a burst of del commands may
>> postpone connection offload tasks. Postponed offloads cause additional
>> packets to go through software, creating a chain effect that may diminish
>> the system's connection rate.
>
> Right. I didn't intend to object to multiqueues. I'm sorry if it
> sounded that way.
>
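To make the intended split a bit more concrete: the patch essentially routes
each command type to its own workqueue, along the lines of the sketch below
(exact names may differ slightly from the posted hunk):

static struct workqueue_struct *nf_flow_offload_add_wq;
static struct workqueue_struct *nf_flow_offload_del_wq;
static struct workqueue_struct *nf_flow_offload_stats_wq;

/* dispatch each offload command to its own workqueue so that a burst of
 * destroy/stats work cannot delay pending inserts
 */
static void flow_offload_queue_work(struct flow_offload_work *offload)
{
        if (offload->cmd == FLOW_CLS_REPLACE)
                queue_work(nf_flow_offload_add_wq, &offload->work);
        else if (offload->cmd == FLOW_CLS_DESTROY)
                queue_work(nf_flow_offload_del_wq, &offload->work);
        else
                queue_work(nf_flow_offload_stats_wq, &offload->work);
}
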
>>
>> Marcelo, AFAIU add/del are synchronized by design since the del is
>> triggered by the gc thread. A del workqueue item will be instantiated
>> only after a connection is in hardware.
>
> They were synchronized, but after this patch, not anymore AFAICT:
>
> tcf_ct_flow_table_add()
>   flow_offload_add()
>     if (nf_flowtable_hw_offload(flow_table)) {
>             __set_bit(NF_FLOW_HW, &flow->flags);          [A]
>             nf_flow_offload_add(flow_table, flow);
>             ^--- schedules on _add workqueue
>
> then the gc thread:
> nf_flow_offload_gc_step()
>     if (nf_flow_has_expired(flow) || nf_ct_is_dying(flow->ct))
>         set_bit(NF_FLOW_TEARDOWN, &flow->flags);
>
>     if (test_bit(NF_FLOW_TEARDOWN, &flow->flags)) {
>         ^-- can also be set by tcf_ct_flow_table_lookup()
>             on FINs, by calling flow_offload_teardown()
>         if (test_bit(NF_FLOW_HW, &flow->flags)) {
>             ^--- this is set in [A], even if the _add is still queued
>             if (!test_bit(NF_FLOW_HW_DYING, &flow->flags))
>                 nf_flow_offload_del(flow_table, flow);
>
> nf_flow_offload_del()
>     offload = nf_flow_offload_work_alloc(flowtable, flow, FLOW_CLS_DESTROY);
>     if (!offload)
>         return;
>
>     set_bit(NF_FLOW_HW_DYING, &flow->flags);
>     flow_offload_queue_work(offload);
>
> NF_FLOW_HW_DYING only avoids a double _del here.
>
> Maybe I'm just missing it, but I'm not seeing how removals would only
> happen after the entry is actually offloaded. That is, if the add queue
> is very long and the datapath sees a FIN, it seems the next gc iteration
> could try to remove the entry before it is actually offloaded. I think
> this is what Pablo meant in his original reply here too, hence his idea
> of having add/del work on the same queue.
>
The work item will not be allocated if a hw offload command is already
pending for the flow:

nf_flow_offload_work_alloc()
    if (test_and_set_bit(NF_FLOW_HW_PENDING, &flow->flags))
        return NULL;
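
For completeness, NF_FLOW_HW_PENDING is cleared only after the queued command
has actually been processed. From memory, the work handler in
nf_flow_table_offload.c looks roughly like this (paraphrased, not the exact
hunk from the patch):

static void flow_offload_work_handler(struct work_struct *work)
{
        struct flow_offload_work *offload = container_of(work,
                                struct flow_offload_work, work);

        switch (offload->cmd) {
        case FLOW_CLS_REPLACE:
                flow_offload_work_add(offload);
                break;
        case FLOW_CLS_DESTROY:
                flow_offload_work_del(offload);
                break;
        case FLOW_CLS_STATS:
                flow_offload_work_stats(offload);
                break;
        default:
                WARN_ON_ONCE(1);
        }

        /* a new command for this flow may only be queued from here on */
        clear_bit(NF_FLOW_HW_PENDING, &offload->flow->flags);
        kfree(offload);
}

So while the FLOW_CLS_REPLACE work is still queued, nf_flow_offload_work_alloc()
fails for the destroy, NF_FLOW_HW_DYING is not set, and the gc thread simply
retries the teardown on a later iteration, after the add has gone through.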