All of lore.kernel.org
 help / color / mirror / Atom feed
From: Yewon Choi <woni9911@gmail.com>
To: Wenjia Zhang <wenjia@linux.ibm.com>,
	Jan Karcher <jaka@linux.ibm.com>,
	"D. Wythe" <alibuda@linux.alibaba.com>,
	Tony Lu <tonylu@linux.alibaba.com>,
	Wen Gu <guwen@linux.alibaba.com>,
	"David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
	linux-s390@vger.kernel.org, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org
Cc: "Dae R. Jeong" <threeearcat@gmail.com>
Subject: Re: net/smc: Buggy reordering scenario in smc socket
Date: Mon, 15 Apr 2024 20:16:54 +0900	[thread overview]
Message-ID: <Zh0Mpr5fZqFUGzkb@libra05> (raw)
In-Reply-To: <CAFgxCDwA8Lv8LLwEpJur3FKs=Gkkc0KE=bx7Q1Do2+iwdAzoCw@mail.gmail.com>


On Mon, Apr 15, 2024 at 8:02 PM Yewon Choi <woni9911@gmail.com> wrote:
> Hello,
> we suspect some buggy scenario due to memory reordering in concurrent
> execution
> of setsockopt() and sendmmsg().
>
> (CPU 1) setsockopt():
>     case TCP_FASTOPEN_NO_COOKIE:
>         ...
>         smc_switch_to_fallback():
>             clcsock->file = sk.sk_socket->file; // (1)
>             clcsock->file->private_data = clcsock; // (2)
>
> (CPU 2) __sys_sendmmsg():
>     sockfd_lookup_light():
>         sock_from_file():
>             sock = file->private_data; // (3)
>     ...
>     fput_light(sock->file, fput_needed): // (4)
>         fput():
>             refcount_dec_and_test(sock->file->f_count) // null-ptr-deref
>
> There is no memory barrier between (1) and (2), so (1) might be reordered
> after
> (2) is written to memory. Then, execution order can be (2)->(3)->(4)->(1)
> and (4) will read uninitialized value which may cause system crash.
>
>
> This kind of reordering may happen in smc_ulp_init():
>
> (CPU 1) smc_ulp_init():
>     ...
>     smcsock->file = tcp->file; // (5)
>         smcsock->file->private_data = smcsock; // (6)
>
> Execution order can be (6)->(3)->(4)->(5), showing same symptom as above.
>
>
> One possible solution seems to be adding release semantic in (2) and (6).
>
> diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
> index 4b52b3b159c0..37c23ef3e2d5 100644
> --- a/net/smc/af_smc.c
> +++ b/net/smc/af_smc.c
> @@ -921,7 +921,7 @@ static int smc_switch_to_fallback(struct smc_sock
> *smc, int reason_code)
>         trace_smc_switch_to_fallback(smc, reason_code);
>         if (smc->sk.sk_socket && smc->sk.sk_socket->file) {
>                 smc->clcsock->file = smc->sk.sk_socket->file;
> -               smc->clcsock->file->private_data = smc->clcsock;
> +               smp_store_release(&smc->clcsock->file->private_data,
> smc->clcsock);
>                 smc->clcsock->wq.fasync_list =
>                         smc->sk.sk_socket->wq.fasync_list;
>                 smc->sk.sk_socket->wq.fasync_list = NULL;
> @@ -3410,7 +3410,7 @@ static int smc_ulp_init(struct sock *sk)
>
>         /* replace tcp socket to smc */
>         smcsock->file = tcp->file;
> -       smcsock->file->private_data = smcsock;
> +       smp_store_release(&smcsock->file->private_data, smcsock);
>         smcsock->file->f_inode = SOCK_INODE(smcsock); /* replace inode
> when sock_close */
>         smcsock->file->f_path.dentry->d_inode = SOCK_INODE(smcsock); /*
> dput() in __fput */
>         tcp->file = NULL;
>
> I think we don't need memory barrier between (3) and (4) because there are
> critical section between (3) and (4), so lock(lock_sock/release_sock) will
> do this.
>
>
> Could you check these? If confirmed to be a bug, we will send a patch.
>
> Best Regards,
> Yewon Choi
>

Additionally, we found that below line (1) in smc_ulp_init() triggers 
kernel panic even when normaly executed. 

smc_ulp_init():
    ...
    tcp->file = NULL; // (1)

It can be triggered by simple system calls: 
    int sk = socket(0xa, 0x1, 0)
    setsockopt(sk, 0x6, 0x1f, "smc", sizeof("smc"))

[350998.391059] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000018
[350998.391980] Mem abort info:
[350998.392288]   ESR = 0x0000000096000006
[350998.392691]   EC = 0x25: DABT (current EL), IL = 32 bits
[350998.393252]   SET = 0, FnV = 0
[350998.393586]   EA = 0, S1PTW = 0
[350998.396496]   FSC = 0x06: level 2 translation fault
[350998.399755] Data abort info:
[350998.400720]   ISV = 0, ISS = 0x00000006, ISS2 = 0x00000000
[350998.402329]   CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[350998.404023]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[350998.405543] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000047e44000
[350998.406735] [0000000000000018] pgd=080000004b288003, p4d=080000004b288003, pud=080000004aea9003, pmd=0000000000000000
[350998.409243] Internal error: Oops: 0000000096000006 [#1] PREEMPT SMP
[350998.409996] Modules linked in:
[350998.410404] CPU: 1 PID: 2936860 Comm: tls Not tainted 6.8.0-rc5-00163-gffd2cb6b718e-dirty #45
[350998.411462] Hardware name: linux,dummy-virt (DT)
[350998.412050] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[350998.412923] pc : fput+0x20/0x188
[350998.413349] lr : __sys_setsockopt+0xb4/0xc0
[350998.413889] sp : ffff800080443d90
[350998.414325] x29: ffff800080443d90 x28: ffff0000051cc740 x27: 0000000000000000
[350998.415218] x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000
[350998.416112] x23: 0000000000000004 x22: 00000000004613c8 x21: 000000000000001f
[350998.417007] x20: 0000000000000006 x19: 0000000000000000 x18: 0000000000000001
[350998.417909] x17: ffffc369333ee3cc x16: ffffc36933410ad8 x15: ffffc369335203a8
[350998.418797] x14: ffffc36933520188 x13: ffffc36932426dc0 x12: ffffc36932426cf4
[350998.419621] x11: ffffc36932426bec x10: ffffc36933522a34 x9 : 0000000fffffffe0
[350998.420447] x8 : ffffc3693351ef8c x7 : ffff00000a790578 x6 : ffff00000a790558
[350998.421273] x5 : ffff00000a790420 x4 : ffff0000051cc740 x3 : 0000000000000001
[350998.422105] x2 : 0000000000000000 x1 : 0000000000000018 x0 : ffffffffffffffff
[350998.422932] Call trace:
[350998.423231]  fput+0x20/0x188
[350998.423583]  __sys_setsockopt+0xb4/0xc0
[350998.424041]  __arm64_sys_setsockopt+0x28/0x38
[350998.424557]  invoke_syscall+0x48/0x114
[350998.425006]  el0_svc_common+0x3c/0xe8
[350998.425444]  do_el0_svc+0x20/0x2c
[350998.425844]  el0_svc+0x34/0xb8
[350998.426235]  el0t_64_sync_handler+0x13c/0x158
[350998.426749]  el0t_64_sync+0x190/0x194
[350998.427187] Code: aa0003f3 d503201f 92800000 91006261 (f8e00020) 
[350998.427893] ---[ end trace 0000000000000000 ]---
[350998.428460] Kernel panic - not syncing: Oops: Fatal exception
[350998.429126] SMP: stopping secondary CPUs
[350998.429617] Kernel Offset: 0x4368b2400000 from 0xffff800080000000
[350998.430335] PHYS_OFFSET: 0x40000000
[350998.430752] CPU features: 0x0,00000021,7002014a,2140720b
[350998.431371] Memory Limit: none
[350998.431741] ---[ end Kernel panic - not syncing: Oops: Fatal exception ]---

Could you check this, too?

Yewon Choi

  parent reply	other threads:[~2024-04-15 11:17 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-04-15 11:02 net/smc: Buggy reordering scenario in smc socket Yewon Choi
     [not found] ` <CAFgxCDwA8Lv8LLwEpJur3FKs=Gkkc0KE=bx7Q1Do2+iwdAzoCw@mail.gmail.com>
2024-04-15 11:16   ` Yewon Choi [this message]
2024-04-16  2:15     ` Tony Lu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Zh0Mpr5fZqFUGzkb@libra05 \
    --to=woni9911@gmail.com \
    --cc=alibuda@linux.alibaba.com \
    --cc=davem@davemloft.net \
    --cc=edumazet@google.com \
    --cc=guwen@linux.alibaba.com \
    --cc=jaka@linux.ibm.com \
    --cc=kuba@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-s390@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=threeearcat@gmail.com \
    --cc=tonylu@linux.alibaba.com \
    --cc=wenjia@linux.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.