All of lore.kernel.org
 help / color / mirror / Atom feed
From: Eric Dumazet <eric.dumazet@gmail.com>
To: Paul Blakey <paulb@mellanox.com>,
	Eric Dumazet <eric.dumazet@gmail.com>,
	Saeed Mahameed <saeedm@mellanox.com>,
	Oz Shlomo <ozsh@mellanox.com>,
	Jakub Kicinski <jakub.kicinski@netronome.com>,
	Vlad Buslov <vladbu@mellanox.com>,
	David Miller <davem@davemloft.net>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	Jiri Pirko <jiri@mellanox.com>, Roi Dayan <roid@mellanox.com>
Subject: Re: [PATCH net-next v6 1/3] net/sched: act_ct: Create nf flow table per zone
Date: Sun, 8 Mar 2020 13:42:15 -0700	[thread overview]
Message-ID: <5ab39d86-7ed3-b9b8-e6c2-2d96a3bd1f83@gmail.com> (raw)
In-Reply-To: <69fa856f-4aaf-4f54-7324-009cdbf26e38@mellanox.com>



On 3/8/20 12:15 AM, Paul Blakey wrote:
> On 3/8/2020 10:11 AM, Paul Blakey wrote:
> 
>> iirc I did the spin lock bh because we can be called from queue work rcu handler , so I wanted to disable soft irq.
>>
>> I got a possible deadlock splat for that.
> 
> Here I meant this call rcu:
> 
> static void tcf_ct_cleanup(struct tc_action *a)
> {
>> -------struct tcf_ct_params *params;
>> -------struct tcf_ct *c = to_ct(a);
> 
>> -------params = rcu_dereference_protected(c->params, 1);
>> -------if (params)
>> ------->-------call_rcu(&params->rcu, tcf_ct_params_free);
> }
> 

Yes, understood, but to solve this problem we had many other choices,
and still keeping GFP_KERNEL allocations and a mutex for control path.

Have you read my patch ?

By not even trying to get a spinlock in tcf_ct_flow_table_put(),
and instead use a refcount for ->ref, we avoid having this issue in the first place.

static void tcf_ct_flow_table_put(struct tcf_ct_params *params)
{
        struct tcf_ct_flow_table *ct_ft = params->ct_ft;

        if (refcount_dec_and_test(&params->ct_ft->ref)) {
                rhashtable_remove_fast(&zones_ht, &ct_ft->node, zones_params);
                INIT_RCU_WORK(&ct_ft->rwork, tcf_ct_flow_table_cleanup_work);
                queue_rcu_work(act_ct_wq, &ct_ft->rwork);
        }
}

> static void tcf_ct_params_free(struct rcu_head *head)
> {
>> -------struct tcf_ct_params *params = container_of(head,
>> ------->------->------->------->------->-------    struct tcf_ct_params, rcu);
> 
>> -------tcf_ct_flow_table_put(params);
> 
> ...
> 
> 
>>
>>
>> On 3/7/2020 10:53 PM, Eric Dumazet wrote:
>>
>>> On 3/7/20 12:12 PM, Eric Dumazet wrote:
>>>> On 3/3/20 7:57 AM, Paul Blakey wrote:
>>>>> Use the NF flow tables infrastructure for CT offload.
>>>>>
>>>>> Create a nf flow table per zone.
>>>>>
>>>>> Next patches will add FT entries to this table, and do
>>>>> the software offload.
>>>>>
>>>>> Signed-off-by: Paul Blakey <paulb@mellanox.com>
>>>>> Reviewed-by: Jiri Pirko <jiri@mellanox.com>
>>>>> ---
>>>>>   v4->v5:
>>>>>     Added reviewed by Jiri, thanks!
>>>>>   v3->v4:
>>>>>     Alloc GFP_ATOMIC
>>>>>   v2->v3:
>>>>>     Ditch re-locking to alloc, and use atomic allocation
>>>>>   v1->v2:
>>>>>     Use spin_lock_bh instead of spin_lock, and unlock for alloc (as it can sleep)
>>>>>     Free ft on last tc act instance instead of last instance + last offloaded tuple,
>>>>>     this removes cleanup cb and netfilter patches, and is simpler
>>>>>     Removed accidental mlx5/core/en_tc.c change
>>>>>     Removed reviewed by Jiri - patch changed
>>>>>
>>>>> +	err = nf_flow_table_init(&ct_ft->nf_ft);
>>>> This call is going to allocate a rhashtable (GFP_KERNEL allocations that might sleep)
>>>>
>>>> Since you still hold zones_lock spinlock, a splat should occur.
>>>>
>>>> "BUG: sleeping function called from invalid context in  ..."
>>>>
>>>> DEBUG_ATOMIC_SLEEP=y is your friend.
>>>>
>>>> And it is always a good thing to make sure a patch does not trigger a lockdep splat
>>>>
>>>> CONFIG_PROVE_LOCKING=y
>>> Also abusing a spinlock and GFP_ATOMIC allocations in control path is highly discouraged.
>>>
>>> I can not test the following fix, any objections before I submit this officially ?
>>>
>>> diff --git a/net/sched/act_ct.c b/net/sched/act_ct.c
>>> index 23eba61f0f819212a3522c3c63b938d0b8d997e2..3d9e678d7d5336f1746035745b091bea0dcb5fdd 100644
>>> --- a/net/sched/act_ct.c
>>> +++ b/net/sched/act_ct.c
>>> @@ -35,15 +35,15 @@
>>>  
>>>  static struct workqueue_struct *act_ct_wq;
>>>  static struct rhashtable zones_ht;
>>> -static DEFINE_SPINLOCK(zones_lock);
>>> +static DEFINE_MUTEX(zones_mutex);
>>>  
>>>  struct tcf_ct_flow_table {
>>>         struct rhash_head node; /* In zones tables */
>>>  
>>>         struct rcu_work rwork;
>>>         struct nf_flowtable nf_ft;
>>> +       refcount_t ref;
>>>         u16 zone;
>>> -       u32 ref;
>>>  
>>>         bool dying;
>>>  };
>>> @@ -64,14 +64,15 @@ static int tcf_ct_flow_table_get(struct tcf_ct_params *params)
>>>         struct tcf_ct_flow_table *ct_ft;
>>>         int err = -ENOMEM;
>>>  
>>> -       spin_lock_bh(&zones_lock);
>>> +       mutex_lock(&zones_mutex);
>>>         ct_ft = rhashtable_lookup_fast(&zones_ht, &params->zone, zones_params);
>>> -       if (ct_ft)
>>> -               goto take_ref;
>>> +       if (ct_ft && refcount_inc_not_zero(&ct_ft->ref))
>>> +               goto out_unlock;
>>>  
>>> -       ct_ft = kzalloc(sizeof(*ct_ft), GFP_ATOMIC);
>>> +       ct_ft = kzalloc(sizeof(*ct_ft), GFP_KERNEL);
>>>         if (!ct_ft)
>>>                 goto err_alloc;
>>> +       refcount_set(&ct_ft->ref, 1);
>>>  
>>>         ct_ft->zone = params->zone;
>>>         err = rhashtable_insert_fast(&zones_ht, &ct_ft->node, zones_params);
>>> @@ -84,10 +85,9 @@ static int tcf_ct_flow_table_get(struct tcf_ct_params *params)
>>>                 goto err_init;
>>>  
>>>         __module_get(THIS_MODULE);
>>> -take_ref:
>>> +out_unlock:
>>>         params->ct_ft = ct_ft;
>>> -       ct_ft->ref++;
>>> -       spin_unlock_bh(&zones_lock);
>>> +       mutex_unlock(&zones_mutex);
>>>  
>>>         return 0;
>>>  
>>> @@ -96,7 +96,7 @@ static int tcf_ct_flow_table_get(struct tcf_ct_params *params)
>>>  err_insert:
>>>         kfree(ct_ft);
>>>  err_alloc:
>>> -       spin_unlock_bh(&zones_lock);
>>> +       mutex_unlock(&zones_mutex);
>>>         return err;
>>>  }
>>>  
>>> @@ -116,13 +116,11 @@ static void tcf_ct_flow_table_put(struct tcf_ct_params *params)
>>>  {
>>>         struct tcf_ct_flow_table *ct_ft = params->ct_ft;
>>>  
>>> -       spin_lock_bh(&zones_lock);
>>> -       if (--params->ct_ft->ref == 0) {
>>> +       if (refcount_dec_and_test(&params->ct_ft->ref)) {
>>>                 rhashtable_remove_fast(&zones_ht, &ct_ft->node, zones_params);
>>>                 INIT_RCU_WORK(&ct_ft->rwork, tcf_ct_flow_table_cleanup_work);
>>>                 queue_rcu_work(act_ct_wq, &ct_ft->rwork);
>>>         }
>>> -       spin_unlock_bh(&zones_lock);
>>>  }
>>>  
>>>  static void tcf_ct_flow_table_add(struct tcf_ct_flow_table *ct_ft,
>>>

  reply	other threads:[~2020-03-08 20:42 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-03-03 15:57 [PATCH net-next v6 0/3] act_ct: Software offload of conntrack_in Paul Blakey
2020-03-03 15:57 ` [PATCH net-next v6 1/3] net/sched: act_ct: Create nf flow table per zone Paul Blakey
2020-03-03 16:51   ` Marcelo Ricardo Leitner
2020-03-07 20:12   ` Eric Dumazet
2020-03-07 20:53     ` Eric Dumazet
2020-03-08  8:11       ` Paul Blakey
2020-03-08  8:15         ` Paul Blakey
2020-03-08 20:42           ` Eric Dumazet [this message]
2020-03-09  8:02             ` Paul Blakey
2020-03-03 15:57 ` [PATCH net-next v6 2/3] net/sched: act_ct: Offload established connections to flow table Paul Blakey
2020-03-03 16:51   ` Marcelo Ricardo Leitner
2020-03-03 15:57 ` [PATCH net-next v6 3/3] net/sched: act_ct: Software offload of established flows Paul Blakey
2020-03-03 16:51   ` Marcelo Ricardo Leitner
2020-03-03 18:32   ` Nikolay Aleksandrov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5ab39d86-7ed3-b9b8-e6c2-2d96a3bd1f83@gmail.com \
    --to=eric.dumazet@gmail.com \
    --cc=davem@davemloft.net \
    --cc=jakub.kicinski@netronome.com \
    --cc=jiri@mellanox.com \
    --cc=netdev@vger.kernel.org \
    --cc=ozsh@mellanox.com \
    --cc=paulb@mellanox.com \
    --cc=roid@mellanox.com \
    --cc=saeedm@mellanox.com \
    --cc=vladbu@mellanox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.