From: Niklas Cassel <niklas.cassel@axis.com>
To: Jose Abreu <Jose.Abreu@synopsys.com>
Cc: Joao Pinto <Joao.Pinto@synopsys.com>,
linux-netdev <netdev@vger.kernel.org>,
Giuseppe CAVALLARO <peppe.cavallaro@st.com>,
alexandre.torgue@st.com
Subject: Re: Commit 05cf0d1bf4 ("net: stmmac: free an skb first when there are no longer any descriptors using it") breaks stmmac?
Date: Wed, 6 Dec 2017 13:14:13 +0100 [thread overview]
Message-ID: <20171206121412.GB18005@axis.com> (raw)
In-Reply-To: <f638c527-75d7-1b98-ae7f-978f08652856@synopsys.com>
On Thu, Nov 30, 2017 at 04:50:38PM +0000, Jose Abreu wrote:
> Hi Niklas,
>
> Thanks for the detailed explanation!
>
> On 30-11-2017 03:51, Niklas Cassel wrote:
> >
> > Could you try to see if the following patch
> > solves your problem:
> > (still don't see why it should be needed,
> > considering stmmac_tx_clean() should set all non-NULL
> > entries to NULL)
> >
> > diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> > index 1763e48c84e2..1d30b20b3740 100644
> > --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> > +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> > @@ -2830,6 +2830,7 @@ static netdev_tx_t stmmac_tso_xmit(struct sk_buff *skb, struct net_device *dev)
> >
> > tx_q->tx_skbuff_dma[first_entry].buf = des;
> > tx_q->tx_skbuff_dma[first_entry].len = skb_headlen(skb);
> > + tx_q->tx_skbuff[first_entry] = NULL;
> >
> > first->des0 = cpu_to_le32(des);
> >
> > @@ -3003,6 +3004,8 @@ static netdev_tx_t stmmac_xmit(struct sk_buff *skb, struct net_device *dev)
> >
> > first = desc;
> >
> > + tx_q->tx_skbuff[first_entry] = NULL;
> > +
> > enh_desc = priv->plat->enh_desc;
> > /* To program the descriptors according to the size of the frame */
> > if (enh_desc)
>
> I confirm that applying 05cf0d1bf4 ("net: stmmac: free an skb
> first when there are no longer any descriptors using it") and
> this patch makes my setup work perfectly in multi-queue
> configuration. Can you send an official patch to -net?
>
> You can add my:
> Tested-by: Jose Abreu <joabreu@synopsys.com>
>
> Also, maybe we should try to understand why stmmac_tx_clean() is
> not setting the entry to NULL?
Hello Jose,
It is not easy to decode the databook, however,
after reading the databook several times more,
looking at:
3.3.2.2 Tx DMA Operation: OSP Mode
"The DMA writes the status, with a cleared Own bit, to the
corresponding TDES3, thus closing the descriptor."
So closing the descriptor means clearing the own bit.
Now looking at the mode we are interested in:
3.3.2.1 Tx DMA Operation: Default (Non-OSP) Mode
8. If an Ethernet packet is stored over data buffers in multiple descriptors,
the DMA closes the intermediate descriptor and fetches the next descriptor.
Steps 3 through 7 are repeated until the end-of-Ethernet-packet data is
transferred to the MTL.
So I think that my initial understanding was correct after all,
i.e., the commit 05cf0d1bf4 ("net: stmmac: free an skb first when
there are no longer any descriptors using it") is needed to make
sure that the DMA is not using skbs that might have been freed.
How certain are you that this is the offending commit?
Your crash is on v4.14-rc5, I can see an interesting
commit: 8d5f4b071749 ("stmmac: Don't access tx_q->dirty_tx before
netif_tx_lock"), which got included first in v4.14-rc6.
I've tried to understand if this could be a use-after-free,
but I don't see it.
An skb has a certain queue-mapping, so it shouldn't be in more
than one queue.
tx_q->tx_skbuff is allocated with kmalloc_array(), but is
initialized in init_dma_tx_desc_rings().
Other than that, it is only used in stmmac_tso_xmit()/stmmac_xmit()
and stmmac_tx_clean(). stmmac_tx_clean() will free and non-NULL skb,
and set tx_q->tx_skbuff[entry] to NULL.
I don't see why stmmac does:
for (i = 0; i < nfrags; i++) {
...
tx_q->tx_skbuff[tx_q->cur_tx] = NULL;
since it is initialized to NULL in init_dma_tx_desc_rings(),
and if it has been cleaned, it should be set to NULL again.
If we haven't cleaned for a long time, we will not be able
to queue more skbs, since stmmac_tso_xmit()/stmmac_xmit()
checks current "clean" entries, see stmmac_tx_avail().
I removed the tx_q->tx_skbuff[tx_q->cur_tx] = NULL
from both stmmac_tso_xmit()/stmmac_xmit() locally,
and I could not see any problems with ping (different sizes)
or with iperf3.
Could you try to reproduce the bug with CONFIG_KASAN enabled?
(And please don't cut anything from the oops message).
Regards,
Niklas
next prev parent reply other threads:[~2017-12-06 12:14 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-11-27 14:41 Commit 05cf0d1bf4 ("net: stmmac: free an skb first when there are no longer any descriptors using it") breaks stmmac? Jose Abreu
2017-11-30 3:51 ` Niklas Cassel
2017-11-30 16:50 ` Jose Abreu
2017-12-06 12:14 ` Niklas Cassel [this message]
2018-02-13 13:33 ` Niklas Cassel
2018-02-16 9:34 ` Jose Abreu
2018-02-16 10:10 ` Niklas Cassel
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20171206121412.GB18005@axis.com \
--to=niklas.cassel@axis.com \
--cc=Joao.Pinto@synopsys.com \
--cc=Jose.Abreu@synopsys.com \
--cc=alexandre.torgue@st.com \
--cc=netdev@vger.kernel.org \
--cc=peppe.cavallaro@st.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).