From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org, Miklos Szeredi <mszeredi@redhat.com>,
Linus Torvalds <torvalds@linux-foundation.org>
Subject: [PATCH 4.14 04/38] af_unix: fix garbage collect vs MSG_PEEK
Date: Mon, 2 Aug 2021 15:44:26 +0200 [thread overview]
Message-ID: <20210802134334.972631041@linuxfoundation.org> (raw)
In-Reply-To: <20210802134334.835358048@linuxfoundation.org>
From: Miklos Szeredi <mszeredi@redhat.com>
commit cbcf01128d0a92e131bd09f1688fe032480b65ca upstream.
unix_gc() assumes that candidate sockets can never gain an external
reference (i.e. be installed into an fd) while the unix_gc_lock is
held. Except for MSG_PEEK this is guaranteed by modifying inflight
count under the unix_gc_lock.
MSG_PEEK does not touch any variable protected by unix_gc_lock (file
count is not), yet it needs to be serialized with garbage collection.
Do this by locking/unlocking unix_gc_lock:
1) increment file count
2) lock/unlock barrier to make sure incremented file count is visible
to garbage collection
3) install file into fd
This is a lock barrier (unlike smp_mb()) that ensures that garbage
collection is run completely before or completely after the barrier.
Cc: <stable@vger.kernel.org>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
net/unix/af_unix.c | 51 +++++++++++++++++++++++++++++++++++++++++++++++++--
1 file changed, 49 insertions(+), 2 deletions(-)
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -1521,6 +1521,53 @@ out:
return err;
}
+static void unix_peek_fds(struct scm_cookie *scm, struct sk_buff *skb)
+{
+ scm->fp = scm_fp_dup(UNIXCB(skb).fp);
+
+ /*
+ * Garbage collection of unix sockets starts by selecting a set of
+ * candidate sockets which have reference only from being in flight
+ * (total_refs == inflight_refs). This condition is checked once during
+ * the candidate collection phase, and candidates are marked as such, so
+ * that non-candidates can later be ignored. While inflight_refs is
+ * protected by unix_gc_lock, total_refs (file count) is not, hence this
+ * is an instantaneous decision.
+ *
+ * Once a candidate, however, the socket must not be reinstalled into a
+ * file descriptor while the garbage collection is in progress.
+ *
+ * If the above conditions are met, then the directed graph of
+ * candidates (*) does not change while unix_gc_lock is held.
+ *
+ * Any operations that changes the file count through file descriptors
+ * (dup, close, sendmsg) does not change the graph since candidates are
+ * not installed in fds.
+ *
+ * Dequeing a candidate via recvmsg would install it into an fd, but
+ * that takes unix_gc_lock to decrement the inflight count, so it's
+ * serialized with garbage collection.
+ *
+ * MSG_PEEK is special in that it does not change the inflight count,
+ * yet does install the socket into an fd. The following lock/unlock
+ * pair is to ensure serialization with garbage collection. It must be
+ * done between incrementing the file count and installing the file into
+ * an fd.
+ *
+ * If garbage collection starts after the barrier provided by the
+ * lock/unlock, then it will see the elevated refcount and not mark this
+ * as a candidate. If a garbage collection is already in progress
+ * before the file count was incremented, then the lock/unlock pair will
+ * ensure that garbage collection is finished before progressing to
+ * installing the fd.
+ *
+ * (*) A -> B where B is on the queue of A or B is on the queue of C
+ * which is on the queue of listening socket A.
+ */
+ spin_lock(&unix_gc_lock);
+ spin_unlock(&unix_gc_lock);
+}
+
static int unix_scm_to_skb(struct scm_cookie *scm, struct sk_buff *skb, bool send_fds)
{
int err = 0;
@@ -2146,7 +2193,7 @@ static int unix_dgram_recvmsg(struct soc
sk_peek_offset_fwd(sk, size);
if (UNIXCB(skb).fp)
- scm.fp = scm_fp_dup(UNIXCB(skb).fp);
+ unix_peek_fds(&scm, skb);
}
err = (flags & MSG_TRUNC) ? skb->len - skip : size;
@@ -2387,7 +2434,7 @@ unlock:
/* It is questionable, see note in unix_dgram_recvmsg.
*/
if (UNIXCB(skb).fp)
- scm.fp = scm_fp_dup(UNIXCB(skb).fp);
+ unix_peek_fds(&scm, skb);
sk_peek_offset_fwd(sk, chunk);
next prev parent reply other threads:[~2021-08-02 13:53 UTC|newest]
Thread overview: 44+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-08-02 13:44 [PATCH 4.14 00/38] 4.14.242-rc1 review Greg Kroah-Hartman
2021-08-02 13:44 ` [PATCH 4.14 01/38] selftest: fix build error in tools/testing/selftests/vm/userfaultfd.c Greg Kroah-Hartman
2021-08-02 13:44 ` [PATCH 4.14 02/38] KVM: x86: determine if an exception has an error code only when injecting it Greg Kroah-Hartman
2021-08-02 13:44 ` [PATCH 4.14 03/38] net: split out functions related to registering inflight socket files Greg Kroah-Hartman
2021-08-02 13:44 ` Greg Kroah-Hartman [this message]
2021-08-02 13:44 ` [PATCH 4.14 05/38] workqueue: fix UAF in pwq_unbound_release_workfn() Greg Kroah-Hartman
2021-08-02 13:44 ` [PATCH 4.14 06/38] net/802/mrp: fix memleak in mrp_request_join() Greg Kroah-Hartman
2021-08-02 13:44 ` [PATCH 4.14 07/38] net/802/garp: fix memleak in garp_request_join() Greg Kroah-Hartman
2021-08-02 13:44 ` [PATCH 4.14 08/38] net: annotate data race around sk_ll_usec Greg Kroah-Hartman
2021-08-02 13:44 ` [PATCH 4.14 09/38] sctp: move 198 addresses from unusable to private scope Greg Kroah-Hartman
2021-08-02 13:44 ` [PATCH 4.14 10/38] hfs: add missing clean-up in hfs_fill_super Greg Kroah-Hartman
2021-08-02 13:44 ` [PATCH 4.14 11/38] hfs: fix high memory mapping in hfs_bnode_read Greg Kroah-Hartman
2021-08-02 13:44 ` [PATCH 4.14 12/38] hfs: add lock nesting notation to hfs_find_init Greg Kroah-Hartman
2021-08-02 13:44 ` [PATCH 4.14 13/38] ARM: dts: versatile: Fix up interrupt controller node names Greg Kroah-Hartman
2021-08-02 13:44 ` [PATCH 4.14 14/38] virtio_net: Do not pull payload in skb->head Greg Kroah-Hartman
2021-08-02 13:44 ` Greg Kroah-Hartman
2021-08-02 13:44 ` [PATCH 4.14 15/38] gro: ensure frag0 meets IP header alignment Greg Kroah-Hartman
2021-08-02 13:44 ` [PATCH 4.14 16/38] x86/kvm: fix vcpu-id indexed array sizes Greg Kroah-Hartman
2021-08-02 13:44 ` [PATCH 4.14 17/38] ocfs2: fix zero out valid data Greg Kroah-Hartman
2021-08-02 13:44 ` [PATCH 4.14 18/38] ocfs2: issue zeroout to EOF blocks Greg Kroah-Hartman
2021-08-02 13:44 ` [PATCH 4.14 19/38] can: raw: raw_setsockopt(): fix raw_rcv panic for sock UAF Greg Kroah-Hartman
2021-08-02 13:44 ` [PATCH 4.14 20/38] can: mcba_usb_start(): add missing urb->transfer_dma initialization Greg Kroah-Hartman
2021-08-02 13:44 ` [PATCH 4.14 21/38] can: usb_8dev: fix memory leak Greg Kroah-Hartman
2021-08-02 13:44 ` [PATCH 4.14 22/38] can: ems_usb: " Greg Kroah-Hartman
2021-08-02 13:44 ` [PATCH 4.14 23/38] can: esd_usb2: " Greg Kroah-Hartman
2021-08-02 13:44 ` [PATCH 4.14 24/38] NIU: fix incorrect error return, missed in previous revert Greg Kroah-Hartman
2021-08-02 13:44 ` [PATCH 4.14 25/38] nfc: nfcsim: fix use after free during module unload Greg Kroah-Hartman
2021-08-02 13:44 ` [PATCH 4.14 26/38] x86/asm: Ensure asm/proto.h can be included stand-alone Greg Kroah-Hartman
2021-08-02 13:44 ` [PATCH 4.14 27/38] cfg80211: Fix possible memory leak in function cfg80211_bss_update Greg Kroah-Hartman
2021-08-02 13:44 ` [PATCH 4.14 28/38] netfilter: conntrack: adjust stop timestamp to real expiry value Greg Kroah-Hartman
2021-08-02 13:44 ` [PATCH 4.14 29/38] netfilter: nft_nat: allow to specify layer 4 protocol NAT only Greg Kroah-Hartman
2021-08-02 13:44 ` [PATCH 4.14 30/38] tipc: fix sleeping in tipc accept routine Greg Kroah-Hartman
2021-08-02 13:44 ` [PATCH 4.14 31/38] mlx4: Fix missing error code in mlx4_load_one() Greg Kroah-Hartman
2021-08-02 13:44 ` [PATCH 4.14 32/38] net: llc: fix skb_over_panic Greg Kroah-Hartman
2021-08-02 13:44 ` [PATCH 4.14 33/38] net/mlx5: Fix flow table chaining Greg Kroah-Hartman
2021-08-02 13:44 ` [PATCH 4.14 34/38] sctp: fix return value check in __sctp_rcv_asconf_lookup Greg Kroah-Hartman
2021-08-02 13:44 ` [PATCH 4.14 35/38] tulip: windbond-840: Fix missing pci_disable_device() in probe and remove Greg Kroah-Hartman
2021-08-02 13:44 ` [PATCH 4.14 36/38] sis900: " Greg Kroah-Hartman
2021-08-02 13:44 ` [PATCH 4.14 37/38] can: hi311x: fix a signedness bug in hi3110_cmd() Greg Kroah-Hartman
2021-08-02 13:45 ` [PATCH 4.14 38/38] Revert "perf map: Fix dso->nsinfo refcounting" Greg Kroah-Hartman
2021-08-03 13:17 ` [PATCH 4.14 00/38] 4.14.242-rc1 review Naresh Kamboju
2021-08-03 14:50 ` Jon Hunter
2021-08-03 19:15 ` Guenter Roeck
2021-08-04 2:56 ` Samuel Zou
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20210802134334.972631041@linuxfoundation.org \
--to=gregkh@linuxfoundation.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mszeredi@redhat.com \
--cc=stable@vger.kernel.org \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.