netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Martin Habets <habetsm.xilinx@gmail.com>
To: Ding Hui <dinghui@sangfor.com.cn>
Cc: davem@davemloft.net, edumazet@google.com, kuba@kernel.org,
	pabeni@redhat.com, ecree.xilinx@gmail.com,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	pengdonglin@sangfor.com.cn, huangcun@sangfor.com.cn
Subject: Re: [RFC PATCH net] sfc: Fix use-after-free due to selftest_work
Date: Fri, 14 Apr 2023 13:33:49 +0100	[thread overview]
Message-ID: <ZDlILcEDU16YgTT/@gmail.com> (raw)
In-Reply-To: <a7cb0d9e-7519-f90c-bd58-eab9ee82b3dc@sangfor.com.cn>

On Fri, Apr 14, 2023 at 07:03:41PM +0800, Ding Hui wrote:
> On 2023/4/14 17:44, Martin Habets wrote:
> > On Thu, Apr 13, 2023 at 04:35:08PM +0800, Ding Hui wrote:
> > > On 2023/4/13 15:37, Martin Habets wrote:
> > > > On Wed, Apr 12, 2023 at 08:50:13AM +0800, Ding Hui wrote:
> > > > > There is a use-after-free scenario that is:
> > > > > 
> > > > > When netif_running() is false, user set mac address or vlan tag to VF,
> > > > > the xxx_set_vf_mac() or xxx_set_vf_vlan() will invoke efx_net_stop()
> > > > > and efx_net_open(), since netif_running() is false, the port will not
> > > > > start and keep port_enabled false, but selftest_worker is scheduled
> > > > > in efx_net_open().
> > > > > 
> > > > > If we remove the device before selftest_worker run, the efx is freed,
> > > > > then we will get a UAF in run_timer_softirq() like this:
> > > > > 
> > > > > [ 1178.907941] ==================================================================
> > > > > [ 1178.907948] BUG: KASAN: use-after-free in run_timer_softirq+0xdea/0xe90
> > > > > [ 1178.907950] Write of size 8 at addr ff11001f449cdc80 by task swapper/47/0
> > > > > [ 1178.907950]
> > > > > [ 1178.907953] CPU: 47 PID: 0 Comm: swapper/47 Kdump: loaded Tainted: G           O     --------- -t - 4.18.0 #1
> > > > > [ 1178.907954] Hardware name: SANGFOR X620G40/WI2HG-208T1061A, BIOS SPYH051032-U01 04/01/2022
> > > > > [ 1178.907955] Call Trace:
> > > > > [ 1178.907956]  <IRQ>
> > > > > [ 1178.907960]  dump_stack+0x71/0xab
> > > > > [ 1178.907963]  print_address_description+0x6b/0x290
> > > > > [ 1178.907965]  ? run_timer_softirq+0xdea/0xe90
> > > > > [ 1178.907967]  kasan_report+0x14a/0x2b0
> > > > > [ 1178.907968]  run_timer_softirq+0xdea/0xe90
> > > > > [ 1178.907971]  ? init_timer_key+0x170/0x170
> > > > > [ 1178.907973]  ? hrtimer_cancel+0x20/0x20
> > > > > [ 1178.907976]  ? sched_clock+0x5/0x10
> > > > > [ 1178.907978]  ? sched_clock_cpu+0x18/0x170
> > > > > [ 1178.907981]  __do_softirq+0x1c8/0x5fa
> > > > > [ 1178.907985]  irq_exit+0x213/0x240
> > > > > [ 1178.907987]  smp_apic_timer_interrupt+0xd0/0x330
> > > > > [ 1178.907989]  apic_timer_interrupt+0xf/0x20
> > > > > [ 1178.907990]  </IRQ>
> > > > > [ 1178.907991] RIP: 0010:mwait_idle+0xae/0x370
> > > > > 
> > > > > I am thinking about several ways to fix the issue:
> > > > > 
> > > > > [1] In this RFC, I cancel the selftest_worker unconditionally in
> > > > > efx_pci_remove().
> > > > > 
> > > > > [2] Add a test condition, only invoke efx_selftest_async_start() when
> > > > > efx->port_enabled is true in efx_net_open().
> > > > > 
> > > > > [3] Move invoking efx_selftest_async_start() from efx_net_open() to
> > > > > efx_start_all() or efx_start_port(), that matching cancel action in
> > > > > efx_stop_port().
> > > > 
> > > > I think moving this to efx_start_port() is best, as you say to match
> > > > the cancel in efx_stop_port().
> > > > 
> > > 
> > > If moving to efx_start_port(), should we worry about that IRQ_TIMEOUT
> > > is still enough?
> > 
> > 1 second is a long time for a machine running code, so it does not worry me.
> > 
> > > I'm not sure if there is a long time waiting from starting of schedule
> > > selftest_work to the ending of efx_net_open().
> > 
> > I see your point. Looking at efx_start_all() there is the call to
> > efx_start_datapath() after the call to efx_net_open(), which takes a
>                                          ^^^^^^^^^^^^
> Do you mean efx_start_port()?

Woops, yes that what I meant.

> > relatively long time (well under 200ms though).
> > Logically it would be better to move efx_selftest_async_start() after this
> > call. What do you think?
> 
> Agree with you.
> 
> > The point here is that efx_start_all() calls efx_start_port() early, and
> > efx_stop_all() also calls efx_stop_port() early. The calling sequence is
> > correct but they are not the strict inverse of each other.
> > 
> 
> Yeah, that is what I noticed monitor_work does.
> Then I'll move efx_selftest_async_start() into efx_start_all(), follows
> the monitor_work.

Sounds good.

Thanks,
Martin

> -- 
> Thanks,
> - Ding Hui

      reply	other threads:[~2023-04-14 12:33 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-04-12  0:50 [RFC PATCH net] sfc: Fix use-after-free due to selftest_work Ding Hui
2023-04-12 22:34 ` Jacob Keller
2023-04-13  1:12   ` Ding Hui
2023-04-13  7:52   ` Martin Habets
2023-04-13  8:11     ` Ding Hui
2023-04-13  7:37 ` Martin Habets
2023-04-13  8:35   ` Ding Hui
2023-04-14  9:44     ` Martin Habets
2023-04-14 11:03       ` Ding Hui
2023-04-14 12:33         ` Martin Habets [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZDlILcEDU16YgTT/@gmail.com \
    --to=habetsm.xilinx@gmail.com \
    --cc=davem@davemloft.net \
    --cc=dinghui@sangfor.com.cn \
    --cc=ecree.xilinx@gmail.com \
    --cc=edumazet@google.com \
    --cc=huangcun@sangfor.com.cn \
    --cc=kuba@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=pengdonglin@sangfor.com.cn \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).