From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: stable@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	patches@lists.linux.dev, cgroups@vger.kernel.org,
	Nadia Pinaeva <n.m.pinaeva@gmail.com>,
	Florian Westphal <fw@strlen.de>,
	Pablo Neira Ayuso <pablo@netfilter.org>
Subject: [PATCH 6.10 51/58] netfilter: nft_socket: make cgroupsv2 matching work with namespaces
Date: Fri, 27 Sep 2024 14:23:53 +0200
Message-ID: <20240927121720.899632594@linuxfoundation.org>
In-Reply-To: <20240927121718.789211866@linuxfoundation.org>

6.10-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Florian Westphal <fw@strlen.de>

commit 7f3287db654395f9c5ddd246325ff7889f550286 upstream.

When running in a container environment, /sys/fs/cgroup/ might not be
the real root node of the sk-attached cgroup.

Example:

In container:
% stat /sys/fs/cgroup/
Device: 0,21    Inode: 2214  ..
% stat /sys/fs/cgroup/foo
Device: 0,21    Inode: 2264  ..

The expectation would be for:

  nft add rule .. socket cgroupv2 level 1 "foo" counter

to match traffic from a process that got added to "foo" via
"echo $pid > /sys/fs/cgroup/foo/cgroup.procs".

However, 'level 3' is needed to make this work.

Seen from initial namespace, the complete hierarchy is:

% stat /sys/fs/cgroup/system.slice/docker-.../foo
  Device: 0,21    Inode: 2264 ..

i.e. hierarchy is
0    1               2              3
/ -> system.slice -> docker-1... -> foo

... but the container doesn't know that its "/" is the "docker-1.."
cgroup.  For 'level 1', current code retrieves the 'system.slice' cgroup
node and stores its kn->id in the destination register, so the comparison
with 2264 (the "foo" cgroup id) does not match.

Fetch the "/" cgroup from ->init() and add its level to the level we try
to extract.  The cgroup root level is 0 in the init namespace; otherwise
it is the level of the ancestor that is exposed as the cgroup root inside
the container.

In the above case, cgrp->level of "/" resolved in the container is 2
(docker-1...scope/) and a request for 'level 1' is adjusted to fetch
the actual level (3).
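
The level arithmetic described above can be illustrated with a small
userspace sketch.  This is not kernel code: struct toy_cgroup,
toy_ancestor(), toy_adjust_level() and toy_lookup() are hypothetical
names made up for this illustration, and the tree simply mirrors the
example hierarchy from this changelog (with the elided "docker-1..."
name kept as-is).  It assumes the socket's process sits in "foo".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for struct cgroup: only what the argument needs. */
struct toy_cgroup {
	const char *name;
	int level;			/* 0 for the real root */
	struct toy_cgroup *parent;
};

/* Example hierarchy as seen from the initial namespace. */
static struct toy_cgroup toy_tree[4] = {
	{ "/",            0, NULL },
	{ "system.slice", 1, &toy_tree[0] },
	{ "docker-1...",  2, &toy_tree[1] },
	{ "foo",          3, &toy_tree[2] },
};

/* Walk up from cgrp to its ancestor at an absolute level, or NULL. */
static struct toy_cgroup *toy_ancestor(struct toy_cgroup *cgrp, int level)
{
	if (level < 0 || level > cgrp->level)
		return NULL;
	while (cgrp && cgrp->level > level)
		cgrp = cgrp->parent;
	return cgrp;
}

/*
 * Adjustment sketched after the ->init() change: add the level of the
 * namespace's "/" cgroup to the user-provided level.  Returns -1 when
 * the result does not fit the 0..255 range.
 */
static int toy_adjust_level(int ns_root_level, int user_level)
{
	int level = ns_root_level + user_level;

	if (ns_root_level < 0 || user_level < 0 || level > 255)
		return -1;
	return level;
}

/* Name of the cgroup that would be extracted for a socket in "foo". */
static const char *toy_lookup(int ns_root_level, int user_level)
{
	struct toy_cgroup *sk_cgrp = &toy_tree[3];
	struct toy_cgroup *anc;
	int level = toy_adjust_level(ns_root_level, user_level);

	if (level < 0)
		return NULL;
	anc = toy_ancestor(sk_cgrp, level);
	return anc ? anc->name : NULL;
}
```

With a namespace root level of 0 (the old behaviour, no adjustment),
toy_lookup(0, 1) yields "system.slice", which is the mismatch described
above.  With the container's root level of 2, toy_lookup(2, 1) resolves
to "foo", the same result that previously required 'level 3'.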

v2: use CONFIG_SOCK_CGROUP_DATA, eval function depends on it.
    (kernel test robot)

Cc: cgroups@vger.kernel.org
Fixes: e0bb96db96f8 ("netfilter: nft_socket: add support for cgroupsv2")
Reported-by: Nadia Pinaeva <n.m.pinaeva@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/netfilter/nft_socket.c |   41 ++++++++++++++++++++++++++++++++++++++---
 1 file changed, 38 insertions(+), 3 deletions(-)

--- a/net/netfilter/nft_socket.c
+++ b/net/netfilter/nft_socket.c
@@ -9,7 +9,8 @@
 
 struct nft_socket {
 	enum nft_socket_keys		key:8;
-	u8				level;
+	u8				level;		/* cgroupv2 level to extract */
+	u8				level_user;	/* cgroupv2 level provided by userspace */
 	u8				len;
 	union {
 		u8			dreg;
@@ -53,6 +54,28 @@ nft_sock_get_eval_cgroupv2(u32 *dest, st
 	memcpy(dest, &cgid, sizeof(u64));
 	return true;
 }
+
+/* process context only, uses current->nsproxy. */
+static noinline int nft_socket_cgroup_subtree_level(void)
+{
+	struct cgroup *cgrp = cgroup_get_from_path("/");
+	int level;
+
+	if (!cgrp)
+		return -ENOENT;
+
+	level = cgrp->level;
+
+	cgroup_put(cgrp);
+
+	if (WARN_ON_ONCE(level > 255))
+		return -ERANGE;
+
+	if (WARN_ON_ONCE(level < 0))
+		return -EINVAL;
+
+	return level;
+}
 #endif
 
 static struct sock *nft_socket_do_lookup(const struct nft_pktinfo *pkt)
@@ -174,9 +197,10 @@ static int nft_socket_init(const struct
 	case NFT_SOCKET_MARK:
 		len = sizeof(u32);
 		break;
-#ifdef CONFIG_CGROUPS
+#ifdef CONFIG_SOCK_CGROUP_DATA
 	case NFT_SOCKET_CGROUPV2: {
 		unsigned int level;
+		int err;
 
 		if (!tb[NFTA_SOCKET_LEVEL])
 			return -EINVAL;
@@ -185,6 +209,17 @@ static int nft_socket_init(const struct
 		if (level > 255)
 			return -EOPNOTSUPP;
 
+		err = nft_socket_cgroup_subtree_level();
+		if (err < 0)
+			return err;
+
+		priv->level_user = level;
+
+		level += err;
+		/* Implies a giant cgroup tree */
+		if (WARN_ON_ONCE(level > 255))
+			return -EOPNOTSUPP;
+
 		priv->level = level;
 		len = sizeof(u64);
 		break;
@@ -209,7 +244,7 @@ static int nft_socket_dump(struct sk_buf
 	if (nft_dump_register(skb, NFTA_SOCKET_DREG, priv->dreg))
 		return -1;
 	if (priv->key == NFT_SOCKET_CGROUPV2 &&
-	    nla_put_be32(skb, NFTA_SOCKET_LEVEL, htonl(priv->level)))
+	    nla_put_be32(skb, NFTA_SOCKET_LEVEL, htonl(priv->level_user)))
 		return -1;
 	return 0;
 }


