netdev.vger.kernel.org archive mirror
From: Jakub Kicinski <kuba@kernel.org>
To: Lin Liu <lin.liu@citrix.com>
Cc: Juergen Gross <jgross@suse.com>,
	Stefano Stabellini <sstabellini@kernel.org>,
	Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>,
	"David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Paolo Abeni <pabeni@redhat.com>,
	"moderated list:XEN HYPERVISOR INTERFACE" 
	<xen-devel@lists.xenproject.org>,
	"open list:NETWORKING DRIVERS" <netdev@vger.kernel.org>,
	open list <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] drivers/net/netfront: Fix NULL sring after live migration
Date: Wed, 30 Nov 2022 21:26:45 -0800
Message-ID: <20221130212645.06a36158@kernel.org>
In-Reply-To: <20221129061702.60629-1-lin.liu@citrix.com>

On Tue, 29 Nov 2022 06:17:02 +0000 Lin Liu wrote:
> A NAPI is set up for each network sring to poll data into the kernel.
> The sring with the source host is destroyed before live migration, and
> a new sring with the target host is set up after live migration.
> The NAPI for the old sring is not deleted until the new sring is set up
> with the target host after migration. With busy_poll/busy_read enabled,
> the NAPI can be polled before it is deleted when the VM resumes.
> 
> [50116.602938] BUG: unable to handle kernel NULL pointer dereference at
> 0000000000000008
> [50116.603047] IP: xennet_poll+0xae/0xd20
> [50116.603090] PGD 0 P4D 0
> [50116.603118] Oops: 0000 [#1] SMP PTI
> [50116.604624] Call Trace:
> [50116.604674]  ? finish_task_switch+0x71/0x230
> [50116.604745]  ? timerqueue_del+0x1d/0x40
> [50116.604807]  ? hrtimer_try_to_cancel+0xb5/0x110
> [50116.604882]  ? xennet_alloc_rx_buffers+0x2a0/0x2a0
> [50116.604958]  napi_busy_loop+0xdb/0x270
> [50116.605017]  sock_poll+0x87/0x90
> [50116.605066]  do_sys_poll+0x26f/0x580
> [50116.605125]  ? tracing_map_insert+0x1d4/0x2f0
> [50116.605196]  ? event_hist_trigger+0x14a/0x260

You can trim all the ' ? ' entries from the stack trace, 
and the time stamps, FWIW. Makes it easier to read.

> [50116.613598]  ? finish_task_switch+0x71/0x230
> [50116.614131]  ? __schedule+0x256/0x890
> [50116.614640]  ? recalc_sigpending+0x1b/0x50
> [50116.615144]  ? xen_sched_clock+0x15/0x20
> [50116.615643]  ? __rb_reserve_next+0x12d/0x140
> [50116.616138]  ? ring_buffer_lock_reserve+0x123/0x3d0
> [50116.616634]  ? event_triggers_call+0x87/0xb0
> [50116.617138]  ? trace_event_buffer_commit+0x1c4/0x210
> [50116.617625]  ? xen_clocksource_get_cycles+0x15/0x20
> [50116.618112]  ? ktime_get_ts64+0x51/0xf0
> [50116.618578]  SyS_ppoll+0x160/0x1a0
> [50116.619029]  ? SyS_ppoll+0x160/0x1a0
> [50116.619475]  do_syscall_64+0x73/0x130
> [50116.619901]  entry_SYSCALL_64_after_hwframe+0x41/0xa6
> ...
> [50116.806230] RIP: xennet_poll+0xae/0xd20 RSP: ffffb4f041933900
> [50116.806772] CR2: 0000000000000008
> [50116.807337] ---[ end trace f8601785b354351c ]---
> 
> The Xen frontend should remove the NAPIs for the old srings before live
> migration, as the bound srings are destroyed.
> 
> There is a tiny window between the srings being set to NULL and
> the NAPIs being disabled. It is safe, as the NAPI threads are still
> frozen at that time.
> 

Since this is a fix, please add a Fixes tag and put [PATCH net]
in the subject.

> diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
> index 9af2b027c19c..dc404e05970c 100644
> --- a/drivers/net/xen-netfront.c
> +++ b/drivers/net/xen-netfront.c
> @@ -1862,6 +1862,12 @@ static int netfront_resume(struct xenbus_device *dev)
>  	netif_tx_unlock_bh(info->netdev);
>  
>  	xennet_disconnect_backend(info);
> +
> +	rtnl_lock();
> +	if (info->queues)
> +		xennet_destroy_queues(info);
> +	rtnl_unlock();

Now that all callers of xennet_disconnect_backend() destroy the queues
soon after, can we just move the queue destruction into
xennet_disconnect_backend()?
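
To illustrate the idea, here is a minimal user-space sketch, not driver
code. The model_* functions and both structs are hypothetical stand-ins
for the netfront state. Locking (rtnl_lock() in the patch) is left out,
and whether every caller of xennet_disconnect_backend() can tolerate
the teardown at that point would need checking against the real driver.

```c
/* User-space model only; every name below is a hypothetical stand-in. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

struct model_queue {
	int *sring;            /* stand-in for the shared ring */
	bool napi_registered;  /* stand-in for a NAPI that can still be polled */
};

struct model_info {
	struct model_queue *queues;
	size_t num_queues;
};

/* Models queue destruction: unregister each NAPI, then free the array. */
static void model_destroy_queues(struct model_info *info)
{
	for (size_t i = 0; i < info->num_queues; i++)
		info->queues[i].napi_registered = false;
	free(info->queues);
	info->queues = NULL;
	info->num_queues = 0;
}

/*
 * Models disconnect with the teardown folded in: the rings go away and
 * the NAPIs go with them, so no caller can forget the second step.
 */
static void model_disconnect_backend(struct model_info *info)
{
	for (size_t i = 0; i < info->num_queues; i++) {
		free(info->queues[i].sring);
		info->queues[i].sring = NULL;
	}
	if (info->queues)
		model_destroy_queues(info);
}

/* Allocate n queues, each with a ring and a registered NAPI. */
static int model_create_queues(struct model_info *info, size_t n)
{
	assert(info->queues == NULL);
	info->queues = calloc(n, sizeof(*info->queues));
	if (!info->queues)
		return -1;
	info->num_queues = n;
	for (size_t i = 0; i < n; i++) {
		info->queues[i].sring = calloc(1, sizeof(int));
		if (!info->queues[i].sring) {
			/* Undo the partial setup before reporting failure. */
			model_disconnect_backend(info);
			return -1;
		}
		info->queues[i].napi_registered = true;
	}
	return 0;
}

/* Counts NAPIs that could still be polled although their ring is gone. */
static size_t model_stale_napis(const struct model_info *info)
{
	size_t stale = 0;

	for (size_t i = 0; i < info->num_queues; i++)
		if (info->queues[i].napi_registered && !info->queues[i].sring)
			stale++;
	return stale;
}
```

With this shape, model_stale_napis() stays at zero after
model_disconnect_backend() returns, which is the property the oops
above shows being violated.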

>  	return 0;
>  }
>  


Thread overview: 2+ messages
2022-11-29  6:17 [PATCH] drivers/net/netfront: Fix NULL sring after live migration Lin Liu
2022-12-01  5:26 ` Jakub Kicinski [this message]
