public inbox for stable@vger.kernel.org
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: stable@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	patches@lists.linux.dev, cgroups@vger.kernel.org,
	Nadia Pinaeva <n.m.pinaeva@gmail.com>,
	Florian Westphal <fw@strlen.de>,
	Pablo Neira Ayuso <pablo@netfilter.org>,
	Sasha Levin <sashal@kernel.org>
Subject: [PATCH 6.1 52/63] netfilter: nft_socket: make cgroupsv2 matching work with namespaces
Date: Mon, 16 Sep 2024 13:44:31 +0200	[thread overview]
Message-ID: <20240916114222.900057413@linuxfoundation.org> (raw)
In-Reply-To: <20240916114221.021192667@linuxfoundation.org>

6.1-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Florian Westphal <fw@strlen.de>

[ Upstream commit 7f3287db654395f9c5ddd246325ff7889f550286 ]

When running in container environmment, /sys/fs/cgroup/ might not be
the real root node of the sk-attached cgroup.

Example:

In container:
% stat /sys//fs/cgroup/
Device: 0,21    Inode: 2214  ..
% stat /sys/fs/cgroup/foo
Device: 0,21    Inode: 2264  ..

The expectation would be for:

  nft add rule .. socket cgroupv2 level 1 "foo" counter

to match traffic from a process that got added to "foo" via
"echo $pid > /sys/fs/cgroup/foo/cgroup.procs".

However, 'level 3' is needed to make this work.

Seen from initial namespace, the complete hierarchy is:

% stat /sys/fs/cgroup/system.slice/docker-.../foo
  Device: 0,21    Inode: 2264 ..

i.e. hierarchy is
0    1               2              3
/ -> system.slice -> docker-1... -> foo

... but the container doesn't know that its "/" is the "docker-1.."
cgroup.  Current code will retrieve the 'system.slice' cgroup node
and store its kn->id in the destination register, so compare with
2264 ("foo" cgroup id) will not match.

Fetch "/" cgroup from ->init() and add its level to the level we try to
extract.  cgroup root-level is 0 for the init-namespace or the level
of the ancestor that is exposed as the cgroup root inside the container.

In the above case, cgrp->level of "/" resolved in the container is 2
(docker-1...scope/) and request for 'level 1' will get adjusted
to fetch the actual level (3).

v2: use CONFIG_SOCK_CGROUP_DATA, eval function depends on it.
    (kernel test robot)

Cc: cgroups@vger.kernel.org
Fixes: e0bb96db96f8 ("netfilter: nft_socket: add support for cgroupsv2")
Reported-by: Nadia Pinaeva <n.m.pinaeva@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 net/netfilter/nft_socket.c | 41 +++++++++++++++++++++++++++++++++++---
 1 file changed, 38 insertions(+), 3 deletions(-)

diff --git a/net/netfilter/nft_socket.c b/net/netfilter/nft_socket.c
index 0f37738e4b26..8722f712c019 100644
--- a/net/netfilter/nft_socket.c
+++ b/net/netfilter/nft_socket.c
@@ -9,7 +9,8 @@
 
 struct nft_socket {
 	enum nft_socket_keys		key:8;
-	u8				level;
+	u8				level;		/* cgroupv2 level to extract */
+	u8				level_user;	/* cgroupv2 level provided by userspace */
 	u8				len;
 	union {
 		u8			dreg;
@@ -53,6 +54,28 @@ nft_sock_get_eval_cgroupv2(u32 *dest, struct sock *sk, const struct nft_pktinfo
 	memcpy(dest, &cgid, sizeof(u64));
 	return true;
 }
+
+/* process context only, uses current->nsproxy. */
+static noinline int nft_socket_cgroup_subtree_level(void)
+{
+	struct cgroup *cgrp = cgroup_get_from_path("/");
+	int level;
+
+	if (!cgrp)
+		return -ENOENT;
+
+	level = cgrp->level;
+
+	cgroup_put(cgrp);
+
+	if (WARN_ON_ONCE(level > 255))
+		return -ERANGE;
+
+	if (WARN_ON_ONCE(level < 0))
+		return -EINVAL;
+
+	return level;
+}
 #endif
 
 static struct sock *nft_socket_do_lookup(const struct nft_pktinfo *pkt)
@@ -174,9 +197,10 @@ static int nft_socket_init(const struct nft_ctx *ctx,
 	case NFT_SOCKET_MARK:
 		len = sizeof(u32);
 		break;
-#ifdef CONFIG_CGROUPS
+#ifdef CONFIG_SOCK_CGROUP_DATA
 	case NFT_SOCKET_CGROUPV2: {
 		unsigned int level;
+		int err;
 
 		if (!tb[NFTA_SOCKET_LEVEL])
 			return -EINVAL;
@@ -185,6 +209,17 @@ static int nft_socket_init(const struct nft_ctx *ctx,
 		if (level > 255)
 			return -EOPNOTSUPP;
 
+		err = nft_socket_cgroup_subtree_level();
+		if (err < 0)
+			return err;
+
+		priv->level_user = level;
+
+		level += err;
+		/* Implies a giant cgroup tree */
+		if (WARN_ON_ONCE(level > 255))
+			return -EOPNOTSUPP;
+
 		priv->level = level;
 		len = sizeof(u64);
 		break;
@@ -209,7 +244,7 @@ static int nft_socket_dump(struct sk_buff *skb,
 	if (nft_dump_register(skb, NFTA_SOCKET_DREG, priv->dreg))
 		return -1;
 	if (priv->key == NFT_SOCKET_CGROUPV2 &&
-	    nla_put_be32(skb, NFTA_SOCKET_LEVEL, htonl(priv->level)))
+	    nla_put_be32(skb, NFTA_SOCKET_LEVEL, htonl(priv->level_user)))
 		return -1;
 	return 0;
 }
-- 
2.43.0




  parent reply	other threads:[~2024-09-16 12:01 UTC|newest]

Thread overview: 78+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-09-16 11:43 [PATCH 6.1 00/63] 6.1.111-rc1 review Greg Kroah-Hartman
2024-09-16 11:43 ` [PATCH 6.1 01/63] ksmbd: override fsids for share path check Greg Kroah-Hartman
2024-09-16 11:43 ` [PATCH 6.1 02/63] ksmbd: override fsids for smb2_query_info() Greg Kroah-Hartman
2024-09-16 11:43 ` [PATCH 6.1 03/63] usbnet: ipheth: fix carrier detection in modes 1 and 4 Greg Kroah-Hartman
2024-09-16 11:43 ` [PATCH 6.1 04/63] net: ethernet: use ip_hdrlen() instead of bit shift Greg Kroah-Hartman
2024-09-16 11:43 ` [PATCH 6.1 05/63] drm: panel-orientation-quirks: Add quirk for Ayn Loki Zero Greg Kroah-Hartman
2024-09-16 11:43 ` [PATCH 6.1 06/63] drm: panel-orientation-quirks: Add quirk for Ayn Loki Max Greg Kroah-Hartman
2024-09-16 11:43 ` [PATCH 6.1 07/63] net: phy: vitesse: repair vsc73xx autonegotiation Greg Kroah-Hartman
2024-09-16 11:43 ` [PATCH 6.1 08/63] powerpc/mm: Fix boot warning with hugepages and CONFIG_DEBUG_VIRTUAL Greg Kroah-Hartman
2024-09-16 11:43 ` [PATCH 6.1 09/63] btrfs: update target inodes ctime on unlink Greg Kroah-Hartman
2024-09-16 11:43 ` [PATCH 6.1 10/63] Input: ads7846 - ratelimit the spi_sync error message Greg Kroah-Hartman
2024-09-16 11:43 ` [PATCH 6.1 11/63] Input: synaptics - enable SMBus for HP Elitebook 840 G2 Greg Kroah-Hartman
2024-09-16 11:43 ` [PATCH 6.1 12/63] HID: multitouch: Add support for GT7868Q Greg Kroah-Hartman
2024-09-16 11:43 ` [PATCH 6.1 13/63] scripts: kconfig: merge_config: config files: add a trailing newline Greg Kroah-Hartman
2024-09-16 11:43 ` [PATCH 6.1 14/63] platform/surface: aggregator_registry: Add Support for Surface Pro 10 Greg Kroah-Hartman
2024-09-16 11:43 ` [PATCH 6.1 15/63] platform/surface: aggregator_registry: Add support for Surface Laptop Go 3 Greg Kroah-Hartman
2024-09-16 11:43 ` [PATCH 6.1 16/63] drm/msm/adreno: Fix error return if missing firmware-name Greg Kroah-Hartman
2024-09-16 11:43 ` [PATCH 6.1 17/63] Input: i8042 - add Fujitsu Lifebook E756 to i8042 quirk table Greg Kroah-Hartman
2024-09-16 11:43 ` [PATCH 6.1 18/63] smb/server: fix return value of smb2_open() Greg Kroah-Hartman
2024-09-16 11:43 ` [PATCH 6.1 19/63] NFSv4: Fix clearing of layout segments in layoutreturn Greg Kroah-Hartman
2024-09-16 11:43 ` [PATCH 6.1 20/63] NFS: Avoid unnecessary rescanning of the per-server delegation list Greg Kroah-Hartman
2024-09-16 11:44 ` [PATCH 6.1 21/63] platform/x86: panasonic-laptop: Fix SINF array out of bounds accesses Greg Kroah-Hartman
2024-09-16 11:44 ` [PATCH 6.1 22/63] platform/x86: panasonic-laptop: Allocate 1 entry extra in the sinf array Greg Kroah-Hartman
2024-09-16 11:44 ` [PATCH 6.1 23/63] mptcp: pm: Fix uaf in __timer_delete_sync Greg Kroah-Hartman
2024-09-16 11:44 ` [PATCH 6.1 24/63] arm64: dts: rockchip: fix eMMC/SPI corruption when audio has been used on RK3399 Puma Greg Kroah-Hartman
2024-09-16 11:44 ` [PATCH 6.1 25/63] arm64: dts: rockchip: override BIOS_DISABLE signal via GPIO hog " Greg Kroah-Hartman
2024-09-16 11:44 ` [PATCH 6.1 26/63] minmax: reduce min/max macro expansion in atomisp driver Greg Kroah-Hartman
2024-09-16 11:44 ` [PATCH 6.1 27/63] net: tighten bad gso csum offset check in virtio_net_hdr Greg Kroah-Hartman
2024-09-16 11:44 ` [PATCH 6.1 28/63] dm-integrity: fix a race condition when accessing recalc_sector Greg Kroah-Hartman
2024-09-16 11:44 ` [PATCH 6.1 29/63] mm: avoid leaving partial pfn mappings around in error case Greg Kroah-Hartman
2024-09-16 11:44 ` [PATCH 6.1 30/63] net: xilinx: axienet: Fix race in axienet_stop Greg Kroah-Hartman
2024-09-16 11:44 ` [PATCH 6.1 31/63] pmdomain: ti: Add a null pointer check to the omap_prm_domain_init Greg Kroah-Hartman
2024-09-16 11:44 ` [PATCH 6.1 32/63] fs/ntfs3: Use kvfree to free memory allocated by kvmalloc Greg Kroah-Hartman
2024-09-16 11:44 ` [PATCH 6.1 33/63] arm64: dts: rockchip: fix PMIC interrupt pin in pinctrl for ROCK Pi E Greg Kroah-Hartman
2024-09-16 11:44 ` [PATCH 6.1 34/63] eeprom: digsy_mtc: Fix 93xx46 driver probe failure Greg Kroah-Hartman
2024-09-16 11:44 ` [PATCH 6.1 35/63] cxl/core: Fix incorrect vendor debug UUID define Greg Kroah-Hartman
2024-09-16 11:44 ` [PATCH 6.1 36/63] selftests/bpf: Support SOCK_STREAM in unix_inet_redir_to_connected() Greg Kroah-Hartman
2024-09-16 11:44 ` [PATCH 6.1 37/63] hwmon: (pmbus) Conditionally clear individual status bits for pmbus rev >= 1.2 Greg Kroah-Hartman
2024-09-16 11:44 ` [PATCH 6.1 38/63] ice: fix accounting for filters shared by multiple VSIs Greg Kroah-Hartman
2024-09-16 11:44 ` [PATCH 6.1 39/63] igb: Always call igb_xdp_ring_update_tail() under Tx lock Greg Kroah-Hartman
2024-09-16 11:44 ` [PATCH 6.1 40/63] net/mlx5: Update the list of the PCI supported devices Greg Kroah-Hartman
2024-09-16 11:44 ` [PATCH 6.1 41/63] net/mlx5e: Add missing link modes to ptys2ethtool_map Greg Kroah-Hartman
2024-09-16 11:44 ` [PATCH 6.1 42/63] net/mlx5: Explicitly set scheduling element and TSAR type Greg Kroah-Hartman
2024-09-16 11:44 ` [PATCH 6.1 43/63] net/mlx5: Add missing masks and QoS bit masks for scheduling elements Greg Kroah-Hartman
2024-09-16 11:44 ` [PATCH 6.1 44/63] net/mlx5: Correct TASR typo into TSAR Greg Kroah-Hartman
2024-09-16 11:44 ` [PATCH 6.1 45/63] net/mlx5: Verify support for scheduling element and TSAR type Greg Kroah-Hartman
2024-09-16 11:44 ` [PATCH 6.1 46/63] net/mlx5: Fix bridge mode operations when there are no VFs Greg Kroah-Hartman
2024-09-16 11:44 ` [PATCH 6.1 47/63] fou: fix initialization of grc Greg Kroah-Hartman
2024-09-16 11:44 ` [PATCH 6.1 48/63] octeontx2-af: Set XOFF on other child transmit schedulers during SMQ flush Greg Kroah-Hartman
2024-09-16 11:44 ` [PATCH 6.1 49/63] octeontx2-af: Modify SMQ flush sequence to drop packets Greg Kroah-Hartman
2024-09-16 11:44 ` [PATCH 6.1 50/63] net: ftgmac100: Enable TX interrupt to avoid TX timeout Greg Kroah-Hartman
2024-09-16 11:44 ` [PATCH 6.1 51/63] netfilter: nft_socket: fix sk refcount leaks Greg Kroah-Hartman
2024-09-16 11:44 ` Greg Kroah-Hartman [this message]
2024-09-16 11:44 ` [PATCH 6.1 53/63] net: dpaa: Pad packets to ETH_ZLEN Greg Kroah-Hartman
2024-09-16 11:44 ` [PATCH 6.1 54/63] spi: nxp-fspi: fix the KASAN report out-of-bounds bug Greg Kroah-Hartman
2024-09-16 11:44 ` [PATCH 6.1 55/63] soundwire: stream: Revert "soundwire: stream: fix programming slave ports for non-continous port maps" Greg Kroah-Hartman
2024-09-16 11:44 ` [PATCH 6.1 56/63] dma-buf: heaps: Fix off-by-one in CMA heap fault handler Greg Kroah-Hartman
2024-09-16 11:44 ` [PATCH 6.1 57/63] drm/amdgpu/atomfirmware: Silence UBSAN warning Greg Kroah-Hartman
2024-09-16 11:44 ` [PATCH 6.1 58/63] spi: geni-qcom: Convert to platform remove callback returning void Greg Kroah-Hartman
2024-09-16 11:44 ` [PATCH 6.1 59/63] spi: geni-qcom: Undo runtime PM changes at driver exit time Greg Kroah-Hartman
2024-09-16 11:44 ` [PATCH 6.1 60/63] spi: geni-qcom: Fix incorrect free_irq() sequence Greg Kroah-Hartman
2024-09-16 11:44 ` [PATCH 6.1 61/63] drm/i915/guc: prevent a possible int overflow in wq offsets Greg Kroah-Hartman
2024-09-16 11:44 ` [PATCH 6.1 62/63] pinctrl: meteorlake: Add Arrow Lake-H/U ACPI ID Greg Kroah-Hartman
2024-09-16 11:44 ` [PATCH 6.1 63/63] ASoC: meson: axg-card: fix use-after-free Greg Kroah-Hartman
2024-09-16 17:28 ` [PATCH 6.1 00/63] 6.1.111-rc1 review Peter Schneider
2024-09-17  9:38 ` Yann Sionneau
2024-09-17  9:56 ` Mark Brown
2024-09-17 14:43 ` Naresh Kamboju
2024-09-18  6:19   ` Greg Kroah-Hartman
2024-09-18 13:56     ` Naresh Kamboju
2024-09-18 12:08   ` Georgi Djakov
2024-09-18 13:54     ` Naresh Kamboju
2024-09-25 15:42     ` Dan Carpenter
2024-10-08 23:43       ` Georgi Djakov
2024-09-17 15:18 ` Jon Hunter
2024-09-17 19:06 ` Pavel Machek
2024-09-17 21:29 ` Florian Fainelli
2024-09-17 22:42 ` Ron Economos

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240916114222.900057413@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=cgroups@vger.kernel.org \
    --cc=fw@strlen.de \
    --cc=n.m.pinaeva@gmail.com \
    --cc=pablo@netfilter.org \
    --cc=patches@lists.linux.dev \
    --cc=sashal@kernel.org \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox