Date: Thu, 11 Apr 2019 14:13:42 +0300
From: Ido Schimmel
To: Vlad Buslov
Cc: netdev@vger.kernel.org, jhs@mojatatu.com, xiyou.wangcong@gmail.com, jiri@resnulli.us, davem@davemloft.net, john.hurley@netronome.com
Subject: Re: [PATCH net-next] net: sched: flower: insert filter to ht before offloading it to hw
Message-ID: <20190411111342.GA29053@splinter>
References: <20190405175626.4123-1-vladbu@mellanox.com>
In-Reply-To: <20190405175626.4123-1-vladbu@mellanox.com>

On Fri, Apr 05, 2019 at 08:56:26PM +0300, Vlad Buslov wrote:
> John reports:
>
> Recent refactoring of fl_change aims to use the classifier spinlock to
> avoid the need for rtnl lock. In doing so, the fl_hw_replace_filter()
> function was moved to before the lock is taken.
> This can create problems for drivers if duplicate filters are created
> (common in ovs tc offload due to filters being triggered by user-space
> matches).
>
> Drivers registered for such filters will now receive multiple copies of
> the same rule, each with a different cookie value. This means that the
> drivers would need to do a full match field lookup to determine
> duplicates, repeating work that will happen in flower __fl_lookup().
> Currently, drivers do not expect to receive duplicate filters.
>
> To fix this, verify that filter with same key is not present in flower
> classifier hash table and insert the new filter to the flower hash table
> before offloading it to hardware. Implement helper function
> fl_ht_insert_unique() to atomically verify/insert a filter.
>
> This change makes filter visible to fast path at the beginning of
> fl_change() function, which means it can no longer be freed directly in
> case of error. Refactor fl_change() error handling code to deallocate the
> filter with rcu timeout.
>
> Fixes: 620da4860827 ("net: sched: flower: refactor fl_change")
> Reported-by: John Hurley
> Signed-off-by: Vlad Buslov

Vlad,

Our regression machines all hit a NULL pointer dereference [1] which I
bisected to this patch. I created this reproducer that you can use:

ip netns add ns1
ip -n ns1 link add dev dummy1 type dummy
tc -n ns1 qdisc add dev dummy1 clsact
tc -n ns1 filter add dev dummy1 ingress pref 1 proto ip \
	flower skip_hw src_ip 192.0.2.1 action drop
ip netns del ns1

Can you please look into this?
Thanks

[1]
[    5.332176] BUG: unable to handle kernel NULL pointer dereference at 0000000000000004
[    5.334372] #PF error: [normal kernel read fault]
[    5.335619] PGD 0 P4D 0
[    5.336360] Oops: 0000 [#1] SMP
[    5.337249] CPU: 0 PID: 7 Comm: kworker/u2:0 Not tainted 5.1.0-rc4-custom-01473-g526bb57a6ad6 #1374
[    5.339232] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20180531_142017-buildhw-08.phx2.fedoraproject.org-1.fc28 04/01/2014
[    5.341982] Workqueue: netns cleanup_net
[    5.342843] RIP: 0010:__fl_put+0x24/0xb0
[    5.343808] Code: 84 00 00 00 00 00 3e ff 8f f0 03 00 00 0f 88 da 7b 14 00 74 01 c3 80 bf f4 03 00 00 00 0f 84 83 00 00 00 4c 8b 87 c8 01 00 00 <41> 8b 50 04 49 8d 70 04 85 d2 74 60 8d 4a 01 39 ca 7f 52 81 fa fe
[    5.348099] RSP: 0018:ffffabe300663be0 EFLAGS: 00010202
[    5.349223] RAX: ffff9ea4ba1aff00 RBX: ffff9ea4b99af400 RCX: ffffabe300663c67
[    5.350572] RDX: 00000000000004c5 RSI: 0000000000000000 RDI: ffff9ea4b99af400
[    5.351919] RBP: ffff9ea4ba28e900 R08: 0000000000000000 R09: ffffffff9d1b0075
[    5.353272] R10: ffffeb2884e66b80 R11: ffffffff9dc4dcd8 R12: ffff9ea4b99af408
[    5.354635] R13: ffff9ea4b99ae400 R14: ffff9ea4b9a47800 R15: ffff9ea4b99ae000
[    5.355988] FS: 0000000000000000(0000) GS:ffff9ea4bba00000(0000) knlGS:0000000000000000
[    5.357436] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    5.358530] CR2: 0000000000000004 CR3: 00000001398fa004 CR4: 0000000000160ef0
[    5.359876] Call Trace:
[    5.360360] __fl_delete+0x223/0x3b0
[    5.361008] fl_destroy+0xb4/0x130
[    5.361641] tcf_proto_destroy+0x15/0x40
[    5.362429] tcf_chain_flush+0x4e/0x60
[    5.363125] __tcf_block_put+0xb4/0x150
[    5.363805] clsact_destroy+0x30/0x40
[    5.364507] qdisc_destroy+0x44/0x110
[    5.365218] dev_shutdown+0x6e/0xa0
[    5.365821] rollback_registered_many+0x25d/0x510
[    5.366724] ? netdev_run_todo+0x221/0x280
[    5.367485] unregister_netdevice_many+0x15/0xa0
[    5.368355] default_device_exit_batch+0x13f/0x170
[    5.369268] ? wait_woken+0x80/0x80
[    5.369910] cleanup_net+0x19a/0x280
[    5.370558] process_one_work+0x1f5/0x3f0
[    5.371326] worker_thread+0x28/0x3c0
[    5.372038] ? process_one_work+0x3f0/0x3f0
[    5.372755] kthread+0x10d/0x130
[    5.373358] ? __kthread_create_on_node+0x180/0x180
[    5.374298] ret_from_fork+0x35/0x40
[    5.374934] CR2: 0000000000000004
[    5.375454] ---[ end trace c20e7f74127772e5 ]---
[    5.376284] RIP: 0010:__fl_put+0x24/0xb0
[    5.377003] Code: 84 00 00 00 00 00 3e ff 8f f0 03 00 00 0f 88 da 7b 14 00 74 01 c3 80 bf f4 03 00 00 00 0f 84 83 00 00 00 4c 8b 87 c8 01 00 00 <41> 8b 50 04 49 8d 70 04 85 d2 74 60 8d 4a 01 39 ca 7f 52 81 fa fe
[    5.380269] RSP: 0018:ffffabe300663be0 EFLAGS: 00010202
[    5.381237] RAX: ffff9ea4ba1aff00 RBX: ffff9ea4b99af400 RCX: ffffabe300663c67
[    5.382551] RDX: 00000000000004c5 RSI: 0000000000000000 RDI: ffff9ea4b99af400
[    5.383972] RBP: ffff9ea4ba28e900 R08: 0000000000000000 R09: ffffffff9d1b0075
[    5.385314] R10: ffffeb2884e66b80 R11: ffffffff9dc4dcd8 R12: ffff9ea4b99af408
[    5.386616] R13: ffff9ea4b99ae400 R14: ffff9ea4b9a47800 R15: ffff9ea4b99ae000
[    5.387986] FS: 0000000000000000(0000) GS:ffff9ea4bba00000(0000) knlGS:0000000000000000
[    5.389512] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    5.390546] CR2: 0000000000000004 CR3: 00000001398fa004 CR4: 0000000000160ef0