public inbox for stable@vger.kernel.org
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: stable@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	patches@lists.linux.dev, cgroups@vger.kernel.org,
	Nadia Pinaeva <n.m.pinaeva@gmail.com>,
	Florian Westphal <fw@strlen.de>,
	Pablo Neira Ayuso <pablo@netfilter.org>
Subject: [PATCH 6.10 51/58] netfilter: nft_socket: make cgroupsv2 matching work with namespaces
Date: Fri, 27 Sep 2024 14:23:53 +0200	[thread overview]
Message-ID: <20240927121720.899632594@linuxfoundation.org> (raw)
In-Reply-To: <20240927121718.789211866@linuxfoundation.org>

6.10-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Florian Westphal <fw@strlen.de>

commit 7f3287db654395f9c5ddd246325ff7889f550286 upstream.

When running in container environmment, /sys/fs/cgroup/ might not be
the real root node of the sk-attached cgroup.

Example:

In container:
% stat /sys//fs/cgroup/
Device: 0,21    Inode: 2214  ..
% stat /sys/fs/cgroup/foo
Device: 0,21    Inode: 2264  ..

The expectation would be for:

  nft add rule .. socket cgroupv2 level 1 "foo" counter

to match traffic from a process that got added to "foo" via
"echo $pid > /sys/fs/cgroup/foo/cgroup.procs".

However, 'level 3' is needed to make this work.

Seen from initial namespace, the complete hierarchy is:

% stat /sys/fs/cgroup/system.slice/docker-.../foo
  Device: 0,21    Inode: 2264 ..

i.e. hierarchy is
0    1               2              3
/ -> system.slice -> docker-1... -> foo

... but the container doesn't know that its "/" is the "docker-1.."
cgroup.  Current code will retrieve the 'system.slice' cgroup node
and store its kn->id in the destination register, so compare with
2264 ("foo" cgroup id) will not match.

Fetch "/" cgroup from ->init() and add its level to the level we try to
extract.  cgroup root-level is 0 for the init-namespace or the level
of the ancestor that is exposed as the cgroup root inside the container.

In the above case, cgrp->level of "/" resolved in the container is 2
(docker-1...scope/) and request for 'level 1' will get adjusted
to fetch the actual level (3).

v2: use CONFIG_SOCK_CGROUP_DATA, eval function depends on it.
    (kernel test robot)

Cc: cgroups@vger.kernel.org
Fixes: e0bb96db96f8 ("netfilter: nft_socket: add support for cgroupsv2")
Reported-by: Nadia Pinaeva <n.m.pinaeva@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/netfilter/nft_socket.c |   41 ++++++++++++++++++++++++++++++++++++++---
 1 file changed, 38 insertions(+), 3 deletions(-)

--- a/net/netfilter/nft_socket.c
+++ b/net/netfilter/nft_socket.c
@@ -9,7 +9,8 @@
 
 struct nft_socket {
 	enum nft_socket_keys		key:8;
-	u8				level;
+	u8				level;		/* cgroupv2 level to extract */
+	u8				level_user;	/* cgroupv2 level provided by userspace */
 	u8				len;
 	union {
 		u8			dreg;
@@ -53,6 +54,28 @@ nft_sock_get_eval_cgroupv2(u32 *dest, st
 	memcpy(dest, &cgid, sizeof(u64));
 	return true;
 }
+
+/* process context only, uses current->nsproxy. */
+static noinline int nft_socket_cgroup_subtree_level(void)
+{
+	struct cgroup *cgrp = cgroup_get_from_path("/");
+	int level;
+
+	if (!cgrp)
+		return -ENOENT;
+
+	level = cgrp->level;
+
+	cgroup_put(cgrp);
+
+	if (WARN_ON_ONCE(level > 255))
+		return -ERANGE;
+
+	if (WARN_ON_ONCE(level < 0))
+		return -EINVAL;
+
+	return level;
+}
 #endif
 
 static struct sock *nft_socket_do_lookup(const struct nft_pktinfo *pkt)
@@ -174,9 +197,10 @@ static int nft_socket_init(const struct
 	case NFT_SOCKET_MARK:
 		len = sizeof(u32);
 		break;
-#ifdef CONFIG_CGROUPS
+#ifdef CONFIG_SOCK_CGROUP_DATA
 	case NFT_SOCKET_CGROUPV2: {
 		unsigned int level;
+		int err;
 
 		if (!tb[NFTA_SOCKET_LEVEL])
 			return -EINVAL;
@@ -185,6 +209,17 @@ static int nft_socket_init(const struct
 		if (level > 255)
 			return -EOPNOTSUPP;
 
+		err = nft_socket_cgroup_subtree_level();
+		if (err < 0)
+			return err;
+
+		priv->level_user = level;
+
+		level += err;
+		/* Implies a giant cgroup tree */
+		if (WARN_ON_ONCE(level > 255))
+			return -EOPNOTSUPP;
+
 		priv->level = level;
 		len = sizeof(u64);
 		break;
@@ -209,7 +244,7 @@ static int nft_socket_dump(struct sk_buf
 	if (nft_dump_register(skb, NFTA_SOCKET_DREG, priv->dreg))
 		return -1;
 	if (priv->key == NFT_SOCKET_CGROUPV2 &&
-	    nla_put_be32(skb, NFTA_SOCKET_LEVEL, htonl(priv->level)))
+	    nla_put_be32(skb, NFTA_SOCKET_LEVEL, htonl(priv->level_user)))
 		return -1;
 	return 0;
 }



  parent reply	other threads:[~2024-09-27 12:29 UTC|newest]

Thread overview: 71+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-09-27 12:23 [PATCH 6.10 00/58] 6.10.12-rc1 review Greg Kroah-Hartman
2024-09-27 12:23 ` [PATCH 6.10 01/58] ASoC: SOF: mediatek: Add missing board compatible Greg Kroah-Hartman
2024-09-27 12:23 ` [PATCH 6.10 02/58] ASoC: mediatek: mt8188: Mark AFE_DAC_CON0 register as volatile Greg Kroah-Hartman
2024-09-27 12:23 ` [PATCH 6.10 03/58] ASoC: allow module autoloading for table db1200_pids Greg Kroah-Hartman
2024-09-27 12:23 ` [PATCH 6.10 04/58] ASoC: allow module autoloading for table board_ids Greg Kroah-Hartman
2024-09-27 12:23 ` [PATCH 6.10 05/58] ALSA: hda/realtek - Fixed ALC256 headphone no sound Greg Kroah-Hartman
2024-09-27 12:23 ` [PATCH 6.10 06/58] ALSA: hda/realtek - FIxed ALC285 " Greg Kroah-Hartman
2024-09-27 12:23 ` [PATCH 6.10 07/58] scsi: lpfc: Fix overflow build issue Greg Kroah-Hartman
2024-09-27 12:23 ` [PATCH 6.10 08/58] pinctrl: at91: make it work with current gpiolib Greg Kroah-Hartman
2024-09-27 12:23 ` [PATCH 6.10 09/58] hwmon: (asus-ec-sensors) remove VRM temp X570-E GAMING Greg Kroah-Hartman
2024-09-27 12:23 ` [PATCH 6.10 10/58] microblaze: dont treat zero reserved memory regions as error Greg Kroah-Hartman
2024-09-27 12:23 ` [PATCH 6.10 11/58] platform/x86: asus-wmi: Fix spurious rfkill on UX8406MA Greg Kroah-Hartman
2024-09-27 12:23 ` [PATCH 6.10 12/58] platform/x86: x86-android-tablets: Make Lenovo Yoga Tab 3 X90F DMI match less strict Greg Kroah-Hartman
2024-09-27 12:23 ` [PATCH 6.10 13/58] net: ftgmac100: Ensure tx descriptor updates are visible Greg Kroah-Hartman
2024-09-27 12:23 ` [PATCH 6.10 14/58] LoongArch: Define ARCH_IRQ_INIT_FLAGS as IRQ_NOPROBE Greg Kroah-Hartman
2024-09-27 12:23 ` [PATCH 6.10 15/58] LoongArch: KVM: Invalidate guest steal time address on vCPU reset Greg Kroah-Hartman
2024-09-27 12:23 ` [PATCH 6.10 16/58] wifi: iwlwifi: lower message level for FW buffer destination Greg Kroah-Hartman
2024-09-27 12:23 ` [PATCH 6.10 17/58] wifi: iwlwifi: mvm: fix iwl_mvm_scan_fits() calculation Greg Kroah-Hartman
2024-09-27 12:23 ` [PATCH 6.10 18/58] wifi: iwlwifi: mvm: fix iwl_mvm_max_scan_ie_fw_cmd_room() Greg Kroah-Hartman
2024-09-27 12:23 ` [PATCH 6.10 19/58] wifi: iwlwifi: mvm: pause TCM when the firmware is stopped Greg Kroah-Hartman
2024-09-27 12:23 ` [PATCH 6.10 20/58] wifi: iwlwifi: mvm: dont wait for tx queues if firmware is dead Greg Kroah-Hartman
2024-09-27 12:23 ` [PATCH 6.10 21/58] wifi: mac80211: free skb on error path in ieee80211_beacon_get_ap() Greg Kroah-Hartman
2024-09-27 12:23 ` [PATCH 6.10 22/58] wifi: iwlwifi: clear trans->state earlier upon error Greg Kroah-Hartman
2024-09-27 12:23 ` [PATCH 6.10 23/58] can: m_can: Limit coalescing to peripheral instances Greg Kroah-Hartman
2024-09-27 12:23 ` [PATCH 6.10 24/58] can: mcp251xfd: mcp251xfd_ring_init(): check TX-coalescing configuration Greg Kroah-Hartman
2024-09-27 12:23 ` [PATCH 6.10 25/58] ASoC: Intel: soc-acpi-cht: Make Lenovo Yoga Tab 3 X90F DMI match less strict Greg Kroah-Hartman
2024-09-27 12:23 ` [PATCH 6.10 26/58] ASoC: intel: fix module autoloading Greg Kroah-Hartman
2024-09-27 12:23 ` [PATCH 6.10 27/58] ASoC: google: " Greg Kroah-Hartman
2024-09-27 12:23 ` [PATCH 6.10 28/58] ASoC: tda7419: " Greg Kroah-Hartman
2024-09-27 12:23 ` [PATCH 6.10 29/58] ASoC: " Greg Kroah-Hartman
2024-09-27 12:23 ` [PATCH 6.10 30/58] ASoC: mediatek: mt8188-mt6359: Modify key Greg Kroah-Hartman
2024-09-27 12:23 ` [PATCH 6.10 31/58] spi: spidev: Add an entry for elgin,jg10309-01 Greg Kroah-Hartman
2024-09-27 12:23 ` [PATCH 6.10 32/58] ASoC: amd: yc: Add a quirk for MSI Bravo 17 (D7VEK) Greg Kroah-Hartman
2024-09-27 12:23 ` [PATCH 6.10 33/58] clk: qcom: gcc-sm8650: Dont use shared clk_ops for QUPs Greg Kroah-Hartman
2024-09-27 12:23 ` [PATCH 6.10 34/58] ALSA: hda: add HDMI codec ID for Intel PTL Greg Kroah-Hartman
2024-09-27 12:23 ` [PATCH 6.10 35/58] drm: komeda: Fix an issue related to normalized zpos Greg Kroah-Hartman
2024-09-27 12:23 ` [PATCH 6.10 36/58] spi: bcm63xx: Enable module autoloading Greg Kroah-Hartman
2024-09-27 12:23 ` [PATCH 6.10 37/58] smb: client: fix hang in wait_for_response() for negproto Greg Kroah-Hartman
2024-09-27 12:23 ` [PATCH 6.10 38/58] platform/x86/amd: pmf: Make ASUS GA403 quirk generic Greg Kroah-Hartman
2024-09-27 12:23 ` [PATCH 6.10 39/58] ice: check for XDP rings instead of bpf program when unconfiguring Greg Kroah-Hartman
2024-09-27 12:23 ` [PATCH 6.10 40/58] x86/hyperv: Set X86_FEATURE_TSC_KNOWN_FREQ when Hyper-V provides frequency Greg Kroah-Hartman
2024-09-27 12:23 ` [PATCH 6.10 41/58] tools: hv: rm .*.cmd when make clean Greg Kroah-Hartman
2024-09-27 12:23 ` [PATCH 6.10 42/58] drm/amd/pm: fix the pp_dpm_pcie issue on smu v14.0.2/3 Greg Kroah-Hartman
2024-09-27 12:23 ` [PATCH 6.10 43/58] spi: spidev: Add missing spi_device_id for jg10309-01 Greg Kroah-Hartman
2024-09-27 12:23 ` [PATCH 6.10 44/58] ocfs2: add bounds checking to ocfs2_xattr_find_entry() Greg Kroah-Hartman
2024-09-27 12:23 ` [PATCH 6.10 45/58] ocfs2: strict bound check before memcmp in ocfs2_xattr_find_entry() Greg Kroah-Hartman
2024-09-27 12:23 ` [PATCH 6.10 46/58] drm: Use XArray instead of IDR for minors Greg Kroah-Hartman
2024-09-27 12:23 ` [PATCH 6.10 47/58] accel: " Greg Kroah-Hartman
2024-09-27 12:23 ` [PATCH 6.10 48/58] drm: Expand max DRM device number to full MINORBITS Greg Kroah-Hartman
2024-09-27 12:23 ` [PATCH 6.10 49/58] powercap/intel_rapl: Add support for AMD family 1Ah Greg Kroah-Hartman
2024-09-27 12:23 ` [PATCH 6.10 50/58] powercap/intel_rapl: Fix the energy-pkg event for AMD CPUs Greg Kroah-Hartman
2024-09-27 12:23 ` Greg Kroah-Hartman [this message]
2024-09-27 12:23 ` [PATCH 6.10 52/58] netfilter: nft_socket: Fix a NULL vs IS_ERR() bug in nft_socket_cgroup_subtree_level() Greg Kroah-Hartman
2024-09-27 12:23 ` [PATCH 6.10 53/58] Bluetooth: btintel_pcie: Allocate memory for driver private data Greg Kroah-Hartman
2024-09-27 12:23 ` [PATCH 6.10 54/58] nvme-pci: qdepth 1 quirk Greg Kroah-Hartman
2024-09-27 12:23 ` [PATCH 6.10 55/58] can: mcp251xfd: properly indent labels Greg Kroah-Hartman
2024-09-27 12:23 ` [PATCH 6.10 56/58] can: mcp251xfd: move mcp251xfd_timestamp_start()/stop() into mcp251xfd_chip_start/stop() Greg Kroah-Hartman
2024-09-27 12:23 ` [PATCH 6.10 57/58] USB: serial: pl2303: add device id for Macrosilicon MS3020 Greg Kroah-Hartman
2024-09-27 12:24 ` [PATCH 6.10 58/58] USB: usbtmc: prevent kernel-usb-infoleak Greg Kroah-Hartman
2024-09-27 15:51 ` [PATCH 6.10 00/58] 6.10.12-rc1 review Allen
2024-09-27 17:12 ` Peter Schneider
2024-09-27 18:36 ` Jon Hunter
2024-09-27 19:40 ` Florian Fainelli
2024-09-28 12:44 ` Naresh Kamboju
2024-09-28 17:14 ` Shuah Khan
2024-09-29  8:28 ` Ron Economos
2024-09-29 10:55 ` Kexy Biscuit
2024-09-29 11:22 ` Muhammad Usama Anjum
2024-09-30  8:46 ` Pavel Machek
2024-10-01 14:24 ` Guenter Roeck
2024-10-01 14:38   ` Greg Kroah-Hartman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240927121720.899632594@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=cgroups@vger.kernel.org \
    --cc=fw@strlen.de \
    --cc=n.m.pinaeva@gmail.com \
    --cc=pablo@netfilter.org \
    --cc=patches@lists.linux.dev \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox