From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: stable@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
patches@lists.linux.dev, Stefano Brivio <sbrivio@redhat.com>,
Florian Westphal <fw@strlen.de>,
Pablo Neira Ayuso <pablo@netfilter.org>,
Sasha Levin <sashal@kernel.org>
Subject: [PATCH 6.1 37/64] netfilter: nft_set_pipapo: store index in scratch maps
Date: Tue, 13 Feb 2024 18:21:23 +0100 [thread overview]
Message-ID: <20240213171845.915219273@linuxfoundation.org> (raw)
In-Reply-To: <20240213171844.702064831@linuxfoundation.org>
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Florian Westphal <fw@strlen.de>
[ Upstream commit 76313d1a4aa9e30d5b43dee5efd8bcd4d8250006 ]
Pipapo needs a scratchpad area to keep state during matching.
This state can be large and thus cannot reside on stack.
Each set preallocates percpu areas for this.
On each match stage, one scratchpad half starts with all-zero and the other
is inited to all-ones.
At the end of each stage, the half that starts with all-ones is
always zero. Before next field is tested, pointers to the two halves
are swapped, i.e. resmap pointer turns into fill pointer and vice versa.
After the last field has been processed, pipapo stashes the
index toggle in a percpu variable, with assumption that next packet
will start with the all-zero half and sets all bits in the other to 1.
This isn't reliable.
There can be multiple sets and we can't be sure that the upper
and lower half of all set scratch map is always in sync (lookups
can be conditional), so one set might have swapped, but other might
not have been queried.
Thus we need to keep the index per-set-and-cpu, just like the
scratchpad.
Note that this bug fix is incomplete, there is a related issue.
avx2 and normal implementation might use slightly different areas of the
map array space due to the avx2 alignment requirements, so
m->scratch (generic/fallback implementation) and ->scratch_aligned
(avx) may partially overlap. scratch and scratch_aligned are not distinct
objects, the latter is just the aligned address of the former.
After this change, write to scratch_align->map_index may write to
scratch->map, so this issue becomes more prominent, we can set to 1
a bit in the supposedly-all-zero area of scratch->map[].
A followup patch will remove the scratch_aligned and makes generic and
avx code use the same (aligned) area.
Its done in a separate change to ease review.
Fixes: 3c4287f62044 ("nf_tables: Add set type for arbitrary concatenation of ranges")
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
net/netfilter/nft_set_pipapo.c | 41 ++++++++++++++++++-----------
net/netfilter/nft_set_pipapo.h | 14 ++++++++--
net/netfilter/nft_set_pipapo_avx2.c | 15 +++++------
3 files changed, 44 insertions(+), 26 deletions(-)
diff --git a/net/netfilter/nft_set_pipapo.c b/net/netfilter/nft_set_pipapo.c
index 4e1cc31729b8..fbd0dbf9b965 100644
--- a/net/netfilter/nft_set_pipapo.c
+++ b/net/netfilter/nft_set_pipapo.c
@@ -342,9 +342,6 @@
#include "nft_set_pipapo_avx2.h"
#include "nft_set_pipapo.h"
-/* Current working bitmap index, toggled between field matches */
-static DEFINE_PER_CPU(bool, nft_pipapo_scratch_index);
-
/**
* pipapo_refill() - For each set bit, set bits from selected mapping table item
* @map: Bitmap to be scanned for set bits
@@ -412,6 +409,7 @@ bool nft_pipapo_lookup(const struct net *net, const struct nft_set *set,
const u32 *key, const struct nft_set_ext **ext)
{
struct nft_pipapo *priv = nft_set_priv(set);
+ struct nft_pipapo_scratch *scratch;
unsigned long *res_map, *fill_map;
u8 genmask = nft_genmask_cur(net);
const u8 *rp = (const u8 *)key;
@@ -422,15 +420,17 @@ bool nft_pipapo_lookup(const struct net *net, const struct nft_set *set,
local_bh_disable();
- map_index = raw_cpu_read(nft_pipapo_scratch_index);
-
m = rcu_dereference(priv->match);
if (unlikely(!m || !*raw_cpu_ptr(m->scratch)))
goto out;
- res_map = *raw_cpu_ptr(m->scratch) + (map_index ? m->bsize_max : 0);
- fill_map = *raw_cpu_ptr(m->scratch) + (map_index ? 0 : m->bsize_max);
+ scratch = *raw_cpu_ptr(m->scratch);
+
+ map_index = scratch->map_index;
+
+ res_map = scratch->map + (map_index ? m->bsize_max : 0);
+ fill_map = scratch->map + (map_index ? 0 : m->bsize_max);
memset(res_map, 0xff, m->bsize_max * sizeof(*res_map));
@@ -460,7 +460,7 @@ bool nft_pipapo_lookup(const struct net *net, const struct nft_set *set,
b = pipapo_refill(res_map, f->bsize, f->rules, fill_map, f->mt,
last);
if (b < 0) {
- raw_cpu_write(nft_pipapo_scratch_index, map_index);
+ scratch->map_index = map_index;
local_bh_enable();
return false;
@@ -477,7 +477,7 @@ bool nft_pipapo_lookup(const struct net *net, const struct nft_set *set,
* current inactive bitmap is clean and can be reused as
* *next* bitmap (not initial) for the next packet.
*/
- raw_cpu_write(nft_pipapo_scratch_index, map_index);
+ scratch->map_index = map_index;
local_bh_enable();
return true;
@@ -1114,12 +1114,12 @@ static int pipapo_realloc_scratch(struct nft_pipapo_match *clone,
int i;
for_each_possible_cpu(i) {
- unsigned long *scratch;
+ struct nft_pipapo_scratch *scratch;
#ifdef NFT_PIPAPO_ALIGN
- unsigned long *scratch_aligned;
+ void *scratch_aligned;
#endif
-
- scratch = kzalloc_node(bsize_max * sizeof(*scratch) * 2 +
+ scratch = kzalloc_node(struct_size(scratch, map,
+ bsize_max * 2) +
NFT_PIPAPO_ALIGN_HEADROOM,
GFP_KERNEL, cpu_to_node(i));
if (!scratch) {
@@ -1138,7 +1138,16 @@ static int pipapo_realloc_scratch(struct nft_pipapo_match *clone,
*per_cpu_ptr(clone->scratch, i) = scratch;
#ifdef NFT_PIPAPO_ALIGN
- scratch_aligned = NFT_PIPAPO_LT_ALIGN(scratch);
+ /* Align &scratch->map (not the struct itself): the extra
+ * %NFT_PIPAPO_ALIGN_HEADROOM bytes passed to kzalloc_node()
+ * above guarantee we can waste up to those bytes in order
+ * to align the map field regardless of its offset within
+ * the struct.
+ */
+ BUILD_BUG_ON(offsetof(struct nft_pipapo_scratch, map) > NFT_PIPAPO_ALIGN_HEADROOM);
+
+ scratch_aligned = NFT_PIPAPO_LT_ALIGN(&scratch->map);
+ scratch_aligned -= offsetof(struct nft_pipapo_scratch, map);
*per_cpu_ptr(clone->scratch_aligned, i) = scratch_aligned;
#endif
}
@@ -2132,7 +2141,7 @@ static int nft_pipapo_init(const struct nft_set *set,
m->field_count = field_count;
m->bsize_max = 0;
- m->scratch = alloc_percpu(unsigned long *);
+ m->scratch = alloc_percpu(struct nft_pipapo_scratch *);
if (!m->scratch) {
err = -ENOMEM;
goto out_scratch;
@@ -2141,7 +2150,7 @@ static int nft_pipapo_init(const struct nft_set *set,
*per_cpu_ptr(m->scratch, i) = NULL;
#ifdef NFT_PIPAPO_ALIGN
- m->scratch_aligned = alloc_percpu(unsigned long *);
+ m->scratch_aligned = alloc_percpu(struct nft_pipapo_scratch *);
if (!m->scratch_aligned) {
err = -ENOMEM;
goto out_free;
diff --git a/net/netfilter/nft_set_pipapo.h b/net/netfilter/nft_set_pipapo.h
index 25a75591583e..de96e1a01dc0 100644
--- a/net/netfilter/nft_set_pipapo.h
+++ b/net/netfilter/nft_set_pipapo.h
@@ -130,6 +130,16 @@ struct nft_pipapo_field {
union nft_pipapo_map_bucket *mt;
};
+/**
+ * struct nft_pipapo_scratch - percpu data used for lookup and matching
+ * @map_index: Current working bitmap index, toggled between field matches
+ * @map: store partial matching results during lookup
+ */
+struct nft_pipapo_scratch {
+ u8 map_index;
+ unsigned long map[];
+};
+
/**
* struct nft_pipapo_match - Data used for lookup and matching
* @field_count Amount of fields in set
@@ -142,9 +152,9 @@ struct nft_pipapo_field {
struct nft_pipapo_match {
int field_count;
#ifdef NFT_PIPAPO_ALIGN
- unsigned long * __percpu *scratch_aligned;
+ struct nft_pipapo_scratch * __percpu *scratch_aligned;
#endif
- unsigned long * __percpu *scratch;
+ struct nft_pipapo_scratch * __percpu *scratch;
size_t bsize_max;
struct rcu_head rcu;
struct nft_pipapo_field f[];
diff --git a/net/netfilter/nft_set_pipapo_avx2.c b/net/netfilter/nft_set_pipapo_avx2.c
index 52e0d026d30a..78213c73af2e 100644
--- a/net/netfilter/nft_set_pipapo_avx2.c
+++ b/net/netfilter/nft_set_pipapo_avx2.c
@@ -71,9 +71,6 @@
#define NFT_PIPAPO_AVX2_ZERO(reg) \
asm volatile("vpxor %ymm" #reg ", %ymm" #reg ", %ymm" #reg)
-/* Current working bitmap index, toggled between field matches */
-static DEFINE_PER_CPU(bool, nft_pipapo_avx2_scratch_index);
-
/**
* nft_pipapo_avx2_prepare() - Prepare before main algorithm body
*
@@ -1120,11 +1117,12 @@ bool nft_pipapo_avx2_lookup(const struct net *net, const struct nft_set *set,
const u32 *key, const struct nft_set_ext **ext)
{
struct nft_pipapo *priv = nft_set_priv(set);
- unsigned long *res, *fill, *scratch;
+ struct nft_pipapo_scratch *scratch;
u8 genmask = nft_genmask_cur(net);
const u8 *rp = (const u8 *)key;
struct nft_pipapo_match *m;
struct nft_pipapo_field *f;
+ unsigned long *res, *fill;
bool map_index;
int i, ret = 0;
@@ -1146,10 +1144,11 @@ bool nft_pipapo_avx2_lookup(const struct net *net, const struct nft_set *set,
kernel_fpu_end();
return false;
}
- map_index = raw_cpu_read(nft_pipapo_avx2_scratch_index);
- res = scratch + (map_index ? m->bsize_max : 0);
- fill = scratch + (map_index ? 0 : m->bsize_max);
+ map_index = scratch->map_index;
+
+ res = scratch->map + (map_index ? m->bsize_max : 0);
+ fill = scratch->map + (map_index ? 0 : m->bsize_max);
/* Starting map doesn't need to be set for this implementation */
@@ -1221,7 +1220,7 @@ bool nft_pipapo_avx2_lookup(const struct net *net, const struct nft_set *set,
out:
if (i % 2)
- raw_cpu_write(nft_pipapo_avx2_scratch_index, !map_index);
+ scratch->map_index = !map_index;
kernel_fpu_end();
return ret >= 0;
--
2.43.0
next prev parent reply other threads:[~2024-02-13 17:24 UTC|newest]
Thread overview: 80+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-02-13 17:20 [PATCH 6.1 00/64] 6.1.78-rc1 review Greg Kroah-Hartman
2024-02-13 17:20 ` [PATCH 6.1 01/64] ext4: regenerate buddy after block freeing failed if under fc replay Greg Kroah-Hartman
2024-02-13 17:20 ` [PATCH 6.1 02/64] dmaengine: fsl-dpaa2-qdma: Fix the size of dma pools Greg Kroah-Hartman
2024-02-13 17:20 ` [PATCH 6.1 03/64] dmaengine: ti: k3-udma: Report short packet errors Greg Kroah-Hartman
2024-02-13 17:20 ` [PATCH 6.1 04/64] dmaengine: fsl-qdma: Fix a memory leak related to the status queue DMA Greg Kroah-Hartman
2024-02-13 17:20 ` [PATCH 6.1 05/64] dmaengine: fsl-qdma: Fix a memory leak related to the queue command DMA Greg Kroah-Hartman
2024-02-13 17:20 ` [PATCH 6.1 06/64] phy: renesas: rcar-gen3-usb2: Fix returning wrong error code Greg Kroah-Hartman
2024-02-13 17:20 ` [PATCH 6.1 07/64] dmaengine: fix is_slave_direction() return false when DMA_DEV_TO_DEV Greg Kroah-Hartman
2024-02-13 17:20 ` [PATCH 6.1 08/64] phy: ti: phy-omap-usb2: Fix NULL pointer dereference for SRP Greg Kroah-Hartman
2024-02-13 17:20 ` [PATCH 6.1 09/64] cifs: failure to add channel on iface should bump up weight Greg Kroah-Hartman
2024-02-13 17:20 ` [PATCH 6.1 10/64] drm/msms/dp: fixed link clock divider bits be over written in BPC unknown case Greg Kroah-Hartman
2024-02-13 17:20 ` [PATCH 6.1 11/64] drm/msm/dp: return correct Colorimetry for DP_TEST_DYNAMIC_RANGE_CEA case Greg Kroah-Hartman
2024-02-13 17:20 ` [PATCH 6.1 12/64] drm/msm/dpu: check for valid hw_pp in dpu_encoder_helper_phys_cleanup Greg Kroah-Hartman
2024-02-13 17:20 ` [PATCH 6.1 13/64] net: stmmac: xgmac: fix handling of DPP safety error for DMA channels Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 14/64] wifi: mac80211: fix waiting for beacons logic Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 15/64] netdevsim: avoid potential loop in nsim_dev_trap_report_work() Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 16/64] net: atlantic: Fix DMA mapping for PTP hwts ring Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 17/64] selftests: net: cut more slack for gro fwd tests Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 18/64] selftests: net: avoid just another constant wait Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 19/64] tunnels: fix out of bounds access when building IPv6 PMTU error Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 20/64] atm: idt77252: fix a memleak in open_card_ubr0 Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 21/64] octeontx2-pf: Fix a memleak otx2_sq_init Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 22/64] hwmon: (aspeed-pwm-tacho) mutex for tach reading Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 23/64] hwmon: (coretemp) Fix out-of-bounds memory access Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 24/64] hwmon: (coretemp) Fix bogus core_id to attr name mapping Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 25/64] inet: read sk->sk_family once in inet_recv_error() Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 26/64] drm/i915/gvt: Fix uninitialized variable in handle_mmio() Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 27/64] rxrpc: Fix response to PING RESPONSE ACKs to a dead call Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 28/64] tipc: Check the bearer type before calling tipc_udp_nl_bearer_add() Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 29/64] af_unix: Call kfree_skb() for dead unix_(sk)->oob_skb in GC Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 30/64] ppp_async: limit MRU to 64K Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 31/64] selftests: cmsg_ipv6: repeat the exact packet Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 32/64] netfilter: nft_compat: narrow down revision to unsigned 8-bits Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 33/64] netfilter: nft_compat: reject unused compat flag Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 34/64] netfilter: nft_compat: restrict match/target protocol to u16 Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 35/64] drm/amd/display: Implement bounds check for stream encoder creation in DCN301 Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 36/64] netfilter: nft_ct: reject direction for ct id Greg Kroah-Hartman
2024-02-13 17:21 ` Greg Kroah-Hartman [this message]
2024-02-13 17:21 ` [PATCH 6.1 38/64] netfilter: nft_set_pipapo: add helper to release pcpu scratch area Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 39/64] netfilter: nft_set_pipapo: remove scratch_aligned pointer Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 40/64] fs/ntfs3: Fix an NULL dereference bug Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 41/64] scsi: core: Move scsi_host_busy() out of host lock if it is for per-command Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 42/64] blk-iocost: Fix an UBSAN shift-out-of-bounds warning Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 43/64] fs: dlm: dont put dlm_local_addrs on heap Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 44/64] mtd: parsers: ofpart: add workaround for #size-cells 0 Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 45/64] ALSA: usb-audio: Add delay quirk for MOTU M Series 2nd revision Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 46/64] ALSA: usb-audio: Add a quirk for Yamaha YIT-W12TX transmitter Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 47/64] ALSA: usb-audio: add quirk for RODE NT-USB+ Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 48/64] USB: serial: qcserial: add new usb-id for Dell Wireless DW5826e Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 49/64] USB: serial: option: add Fibocom FM101-GL variant Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 50/64] USB: serial: cp210x: add ID for IMST iM871A-USB Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 51/64] usb: dwc3: host: Set XHCI_SG_TRB_CACHE_SIZE_QUIRK Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 52/64] usb: host: xhci-plat: Add support for XHCI_SG_TRB_CACHE_SIZE_QUIRK Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 53/64] xhci: process isoc TD properly when there was a transaction error mid TD Greg Kroah-Hartman
2024-02-13 18:47 ` Michał Pecio
2024-02-14 14:23 ` Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 54/64] xhci: handle isoc Babble and Buffer Overrun events properly Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 55/64] hrtimer: Report offline hrtimer enqueue Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 56/64] Input: i8042 - fix strange behavior of touchpad on Clevo NS70PU Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 57/64] Input: atkbd - skip ATKBD_CMD_SETLEDS when skipping ATKBD_CMD_GETID Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 58/64] io_uring/net: fix sr->len for IORING_OP_RECV with MSG_WAITALL and buffers Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 59/64] Revert "ASoC: amd: Add new dmi entries for acp5x platform" Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 60/64] vhost: use kzalloc() instead of kmalloc() followed by memset() Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 61/64] RDMA/irdma: Fix support for 64k pages Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 62/64] f2fs: add helper to check compression level Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 63/64] block: treat poll queue enter similarly to timeouts Greg Kroah-Hartman
2024-02-13 17:21 ` [PATCH 6.1 64/64] clocksource: Skip watchdog check for large watchdog intervals Greg Kroah-Hartman
2024-02-13 19:03 ` [PATCH 6.1 00/64] 6.1.78-rc1 review SeongJae Park
2024-02-13 19:34 ` Pavel Machek
2024-02-13 21:04 ` Kelsey Steele
2024-02-13 21:15 ` Florian Fainelli
2024-02-13 22:46 ` Allen
2024-02-14 0:17 ` Shuah Khan
2024-02-14 9:03 ` Jon Hunter
2024-02-14 13:08 ` Greg Kroah-Hartman
2024-02-14 13:15 ` Jon Hunter
2024-02-14 14:23 ` Greg Kroah-Hartman
2024-02-14 11:07 ` Naresh Kamboju
2024-02-14 11:22 ` Yann Sionneau
2024-02-14 16:44 ` Sven Joachim
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240213171845.915219273@linuxfoundation.org \
--to=gregkh@linuxfoundation.org \
--cc=fw@strlen.de \
--cc=pablo@netfilter.org \
--cc=patches@lists.linux.dev \
--cc=sashal@kernel.org \
--cc=sbrivio@redhat.com \
--cc=stable@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).