linux-fsdevel.vger.kernel.org archive mirror
From: Kuniyuki Iwashima <kuniyu@amazon.com>
To: "David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
	Jeff Layton <jlayton@kernel.org>,
	Chuck Lever <chuck.lever@oracle.com>,
	Luis Chamberlain <mcgrof@kernel.org>,
	Kees Cook <keescook@chromium.org>,
	Iurii Zaikin <yzaikin@google.com>
Cc: Kuniyuki Iwashima <kuniyu@amazon.com>,
	Kuniyuki Iwashima <kuni1840@gmail.com>, <netdev@vger.kernel.org>,
	<linux-fsdevel@vger.kernel.org>
Subject: [PATCH v1 net-next 02/13] sysctl: Support LOCK_MAND for read/write.
Date: Thu, 25 Aug 2022 17:04:34 -0700
Message-ID: <20220826000445.46552-3-kuniyu@amazon.com>
In-Reply-To: <20220826000445.46552-1-kuniyu@amazon.com>

The preceding patch added LOCK_MAND support for flock(), and this patch
adds read/write protection on sysctl knobs.  A read or write returns
-EPERM if another open file holds a mandatory lock on the knob that does
not permit that operation (LOCK_READ for reads, LOCK_WRITE for writes).
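
The allow/deny decision can be modelled in userspace as a small truth
table.  This is a minimal sketch, not the kernel code: it assumes the
flag values from the uapi fcntl header (LOCK_MAND 32, LOCK_READ 64,
LOCK_WRITE 128), and sysctl_access_denied() is a hypothetical name used
only for illustration.

```c
#include <stdbool.h>

/* Fallback definitions; values assumed from the Linux uapi fcntl header. */
#ifndef LOCK_MAND
#define LOCK_MAND	32
#endif
#ifndef LOCK_READ
#define LOCK_READ	64
#endif
#ifndef LOCK_WRITE
#define LOCK_WRITE	128
#endif

/*
 * flags: the LOCK_MAND/LOCK_READ/LOCK_WRITE bits held by another open
 * file on the same knob, or 0 if no such mandatory lock exists.
 * Returns true if the access should fail (with -EPERM in the patch).
 */
bool sysctl_access_denied(int flags, bool write)
{
	if (!(flags & LOCK_MAND))
		return false;

	/* LOCK_READ permits concurrent reads, LOCK_WRITE concurrent writes. */
	return !(flags & (write ? LOCK_WRITE : LOCK_READ));
}
```

With these assumptions, a plain LOCK_MAND blocks both reads and writes
from other openers, while LOCK_MAND | LOCK_READ blocks only writes.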

The following patches introduce sysctl knobs which are read in clone() or
unshare() to control a per-netns hash table size for TCP/UDP.  In such a
case, we can use write protection to guarantee the hash table's size for
the child netns.
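
From userspace, the write protection described above might be used
roughly as follows.  This is only a sketch under assumptions: the helper
name lock_knob_against_writes() is hypothetical, the knob path is
supplied by the caller because the knobs are introduced by later
patches, and the lock only has the described effect on a kernel with
this series applied.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/* Fallback definitions; values assumed from the Linux uapi fcntl header. */
#ifndef LOCK_MAND
#define LOCK_MAND	32
#endif
#ifndef LOCK_READ
#define LOCK_READ	64
#endif

/*
 * Open a sysctl knob and take a mandatory lock that still allows other
 * processes to read it but not to write it.
 * Returns the locked fd, or -1 with errno set on failure.
 */
int lock_knob_against_writes(const char *knob_path)
{
	int fd, saved_errno;

	fd = open(knob_path, O_RDONLY);
	if (fd < 0)
		return -1;

	if (flock(fd, LOCK_MAND | LOCK_READ) < 0) {
		saved_errno = errno;
		close(fd);
		errno = saved_errno;
		return -1;
	}

	return fd;
}
```

A caller would hold the returned fd across clone() or unshare(), so that
the value read for the child netns cannot be changed by another process
in between, and then close() the fd to drop the lock.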

The difference from BPF_PROG_TYPE_CGROUP_SYSCTL is that a BPF prog
requires processes to be in the same cgroup to allow/deny read/write to
sysctl knobs, while the lock does not depend on cgroup membership.

Note that read protection may be of little use, especially for sysctl
knobs whose values can be learned in another way.  For example,
fs.nr_open can be inferred by opening too many files and checking the
error, and net.ipv4.tcp_syn_retries by dropping SYNs and dumping packets.

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
---
 fs/locks.c            | 26 ++++++++++++++++++++++++++
 fs/proc/proc_sysctl.c | 25 ++++++++++++++++++++++++-
 include/linux/fs.h    |  1 +
 3 files changed, 51 insertions(+), 1 deletion(-)

diff --git a/fs/locks.c b/fs/locks.c
index 03ff10a3165e..c858c6c61920 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -890,6 +890,32 @@ static bool flock_locks_conflict(struct file_lock *caller_fl,
 	return locks_conflict(caller_fl, sys_fl);
 }
 
+int flock_mandatory_locked(struct file *filp)
+{
+	struct file_lock_context *ctx;
+	struct file_lock *fl;
+	int flags = 0;
+
+	ctx = smp_load_acquire(&file_inode(filp)->i_flctx);
+	if (!ctx)
+		goto out;
+
+	spin_lock(&ctx->flc_lock);
+	list_for_each_entry(fl, &ctx->flc_flock, fl_list) {
+		if (!(fl->fl_type & LOCK_MAND))
+			continue;
+
+		if (fl->fl_file != filp)
+			flags = fl->fl_type & (LOCK_MAND | LOCK_RW);
+
+		break;
+	}
+	spin_unlock(&ctx->flc_lock);
+out:
+	return flags;
+}
+EXPORT_SYMBOL(flock_mandatory_locked);
+
 void
 posix_test_lock(struct file *filp, struct file_lock *fl)
 {
diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
index 021e83fe831f..ce2755670970 100644
--- a/fs/proc/proc_sysctl.c
+++ b/fs/proc/proc_sysctl.c
@@ -561,10 +561,30 @@ static struct dentry *proc_sys_lookup(struct inode *dir, struct dentry *dentry,
 	return err;
 }
 
+static bool proc_mandatory_locked(struct file *filp, int write)
+{
+	int flags = flock_mandatory_locked(filp);
+
+	if (flags & LOCK_MAND) {
+		if (write) {
+			if (flags & LOCK_WRITE)
+				return false;
+		} else {
+			if (flags & LOCK_READ)
+				return false;
+		}
+
+		return true;
+	}
+
+	return false;
+}
+
 static ssize_t proc_sys_call_handler(struct kiocb *iocb, struct iov_iter *iter,
 		int write)
 {
-	struct inode *inode = file_inode(iocb->ki_filp);
+	struct file *filp = iocb->ki_filp;
+	struct inode *inode = file_inode(filp);
 	struct ctl_table_header *head = grab_header(inode);
 	struct ctl_table *table = PROC_I(inode)->sysctl_entry;
 	size_t count = iov_iter_count(iter);
@@ -582,6 +602,9 @@ static ssize_t proc_sys_call_handler(struct kiocb *iocb, struct iov_iter *iter,
 	if (sysctl_perm(head, table, write ? MAY_WRITE : MAY_READ))
 		goto out;
 
+	if (proc_mandatory_locked(filp, write))
+		goto out;
+
 	/* if that can happen at all, it should be -EINVAL, not -EISDIR */
 	error = -EINVAL;
 	if (!table->proc_handler)
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 9eced4cc286e..5d1d4b10a868 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1164,6 +1164,7 @@ extern void locks_copy_conflock(struct file_lock *, struct file_lock *);
 extern void locks_remove_posix(struct file *, fl_owner_t);
 extern void locks_remove_file(struct file *);
 extern void locks_release_private(struct file_lock *);
+int flock_mandatory_locked(struct file *filp);
 extern void posix_test_lock(struct file *, struct file_lock *);
 extern int posix_lock_file(struct file *, struct file_lock *, struct file_lock *);
 extern int locks_delete_block(struct file_lock *);
-- 
2.30.2

