From: "Marek Marczykowski-Górecki" <marmarek@invisiblethingslab.com>
To: David Vrabel <david.vrabel@citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>,
Boris Ostrovsky <boris.ostrovsky@oracle.com>,
Annie Li <annie.li@oracle.com>,
xen-devel <xen-devel@lists.xen.org>,
netdev@vger.kernel.org
Subject: Re: xen-netfront crash when detaching network while some network activity
Date: Tue, 17 Nov 2015 03:45:15 +0100
Message-ID: <20151117024515.GN976@mail-itl>
In-Reply-To: <20151021185734.GD31646@mail-itl>
On Wed, Oct 21, 2015 at 08:57:34PM +0200, Marek Marczykowski-Górecki wrote:
> On Wed, May 27, 2015 at 12:03:12AM +0200, Marek Marczykowski-Górecki wrote:
> > On Tue, May 26, 2015 at 11:56:00AM +0100, David Vrabel wrote:
> > > On 22/05/15 12:49, Marek Marczykowski-Górecki wrote:
> > > > Hi all,
> > > >
> > > > I'm experiencing a xen-netfront crash when doing xl network-detach while
> > > > some network activity is going on at the same time. It happens only when
> > > > the domU has more than one vcpu. Not sure if this matters, but the backend
> > > > is in another domU (not dom0). I'm using Xen 4.2.2. It happens on kernel
> > > > 3.9.4 and 4.1-rc1 as well.
> > > >
> > > > Steps to reproduce:
> > > > 1. Start the domU with some network interface
> > > > 2. Run 'ping -f some-IP' inside it
> > > > 3. Run 'xl network-detach NAME 0'
> > >
> > > There's a use-after-free in xennet_remove(). Does this patch fix it?
> >
> > Unfortunately not. Note that the crash is in xennet_disconnect_backend,
> > which is called before xennet_destroy_queues in xennet_remove.
> > I've tried adding napi_disable and even netif_napi_del just after
> > napi_synchronize in xennet_disconnect_backend (which would probably
> > cause a crash when the same cleanup runs again later), but it doesn't
> > help - the crash is the same (still in gnttab_end_foreign_access called
> > from xennet_disconnect_backend).
>
> Finally I've found some more time to debug this... All tests redone on
> v4.3-rc6 frontend and 3.18.17 backend.
>
> Looking at xennet_tx_buf_gc(), I have the impression that the shared page
> (queue->grant_tx_page[id]) is (or should be) freed by some means other than
> (indirectly) calling free_page via gnttab_end_foreign_access. Maybe the bug
> is that the page _is_ actually freed somewhere else already? At least changing
> gnttab_end_foreign_access to gnttab_end_foreign_access_ref makes the crash
> go away.
>
> Relevant xennet_tx_buf_gc fragment:
> 	gnttab_end_foreign_access_ref(
> 		queue->grant_tx_ref[id], GNTMAP_readonly);
> 	gnttab_release_grant_reference(
> 		&queue->gref_tx_head, queue->grant_tx_ref[id]);
> 	queue->grant_tx_ref[id] = GRANT_INVALID_REF;
> 	queue->grant_tx_page[id] = NULL;
> 	add_id_to_freelist(&queue->tx_skb_freelist, queue->tx_skbs, id);
> 	dev_kfree_skb_irq(skb);
>
> And a similar fragment from xennet_release_tx_bufs:
> 	get_page(queue->grant_tx_page[i]);
> 	gnttab_end_foreign_access(queue->grant_tx_ref[i],
> 				  GNTMAP_readonly,
> 				  (unsigned long)page_address(queue->grant_tx_page[i]));
> 	queue->grant_tx_page[i] = NULL;
> 	queue->grant_tx_ref[i] = GRANT_INVALID_REF;
> 	add_id_to_freelist(&queue->tx_skb_freelist, queue->tx_skbs, i);
> 	dev_kfree_skb_irq(skb);
>
> Note that both call dev_kfree_skb_irq, but the former uses
> gnttab_end_foreign_access_ref, while the latter uses gnttab_end_foreign_access.
> Also note that the crash is in gnttab_end_foreign_access, so before
> dev_kfree_skb_irq. If this were a double free, I'd expect the crash in the latter.
>
> This change was introduced by cefe007 "xen-netfront: fix resource leak in
> netfront". I'm not sure if changing gnttab_end_foreign_access back to
> gnttab_end_foreign_access_ref would not (re)introduce some memory leak.
>
> Let me paste again the error message:
> [ 73.718636] page:ffffea000043b1c0 count:0 mapcount:0 mapping: (null) index:0x0
> [ 73.718661] flags: 0x3ffc0000008000(tail)
> [ 73.718684] page dumped because: VM_BUG_ON_PAGE(atomic_read(&page->_count) == 0)
> [ 73.718725] ------------[ cut here ]------------
> [ 73.718743] kernel BUG at include/linux/mm.h:338!
>
> Also it all looks quite strange - there is a get_page() call just before
> gnttab_end_foreign_access, but page->_count is still 0. Maybe it has something
> to do with how get_page() works on "tail" pages (whatever that means)?
>
> static inline void get_page(struct page *page)
> {
> 	if (unlikely(PageTail(page)))
> 		if (likely(__get_page_tail(page)))
> 			return;
> 	/*
> 	 * Getting a normal page or the head of a compound page
> 	 * requires to already have an elevated page->_count.
> 	 */
> 	VM_BUG_ON_PAGE(atomic_read(&page->_count) <= 0, page);
> 	atomic_inc(&page->_count);
> }
>
> which (I think) ends up in:
>
> static inline void __get_page_tail_foll(struct page *page,
> 					bool get_page_head)
> {
> 	/*
> 	 * If we're getting a tail page, the elevated page->_count is
> 	 * required only in the head page and we will elevate the head
> 	 * page->_count and tail page->_mapcount.
> 	 *
> 	 * We elevate page_tail->_mapcount for tail pages to force
> 	 * page_tail->_count to be zero at all times to avoid getting
> 	 * false positives from get_page_unless_zero() with
> 	 * speculative page access (like in
> 	 * page_cache_get_speculative()) on tail pages.
> 	 */
> 	VM_BUG_ON_PAGE(atomic_read(&page->first_page->_count) <= 0, page);
> 	if (get_page_head)
> 		atomic_inc(&page->first_page->_count);
> 	get_huge_page_tail(page);
> }
>
> So the use counter is incremented in page->first_page->_count, not
> page->_count. According to the comment, it should also influence
> page->_mapcount, but the error message says it does not.
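
To make the tail-page rule from the quoted comment concrete, here is a small
userspace C model. It is only a sketch under stated assumptions: struct
toy_page, toy_get_page() and toy_put_would_bug() are hypothetical names, not
kernel APIs, and the starting counter values are arbitrary. It assumes, as the
comment says, that taking a reference on a tail page raises the head's count
and the tail's mapcount while the tail's own count stays at zero.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct page; NOT the kernel definition. */
struct toy_page {
	int count;                   /* models page->_count */
	int mapcount;                /* models page->_mapcount (simplified) */
	bool tail;                   /* models PageTail(page) */
	struct toy_page *first_page; /* head of the compound page, if tail */
};

/*
 * Models the rule described in the quoted comment: a reference taken on
 * a tail page is accounted in the head page's count and in the tail's
 * mapcount; the tail's own count is left untouched.
 */
static void toy_get_page(struct toy_page *page)
{
	if (page->tail) {
		page->first_page->count++;
		page->mapcount++;
		return;
	}
	page->count++;
}

/*
 * Models a check of the form "count of this very page must not be 0"
 * (the condition shown in the BUG message). Returns true when such a
 * check would fire for the given page.
 */
static bool toy_put_would_bug(const struct toy_page *page)
{
	return page->count == 0;
}
```

With a head whose count is 1 and a tail pointing at it, toy_get_page(&tail)
leaves head.count at 2 and tail.count at 0, so toy_put_would_bug(&tail) still
returns true. The model does not explain why the dump above reports
mapcount:0.
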
>
> Any ideas?
Ping?
--
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?