qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Richard Purdie <richard.purdie@linuxfoundation.org>
To: Peter Crosthwaite <crosthwaitepeter@gmail.com>
Cc: Peter Maydell <peter.maydell@linaro.org>,
	qemu-devel <qemu-devel@nongnu.org>
Subject: Re: [Qemu-devel] Segfault using qemu-system-arm in smc91c111
Date: Sun, 06 Sep 2015 15:21:53 +0100	[thread overview]
Message-ID: <1441549313.24871.218.camel@linuxfoundation.org> (raw)
In-Reply-To: <CAPokK=p9GUx7OeJ7xNwbnJkzd7JMLVayhgOCf0XapHnW9M9JLA@mail.gmail.com>

On Sat, 2015-09-05 at 13:30 -0700, Peter Crosthwaite wrote:
> On Fri, Sep 4, 2015 at 10:30 AM, Peter Maydell <peter.maydell@linaro.org> wrote:
> > On 4 September 2015 at 18:20, Richard Purdie
> > <richard.purdie@linuxfoundation.org> wrote:
> >> On Fri, 2015-09-04 at 13:43 +0100, Richard Purdie wrote:
> >>> On Fri, 2015-09-04 at 12:31 +0100, Peter Maydell wrote:
> >>> > On 4 September 2015 at 12:24, Richard Purdie
> >>> > <richard.purdie@linuxfoundation.org> wrote:
> >>> > > So just based on that, yes, seems that the rx_fifo looks to be
> >>> > > overrunning. I can add the asserts but I think it would just confirm
> >>> > > this.
> >>> >
> >>> > Yes, the point of adding assertions is to confirm a hypothesis.
> >>>
> >>> I've now confirmed that it does indeed trigger the assert in
> >>> smc91c111_receive().
> >>
> >> I just tried an experiment where I put:
> >>
> >>     if (s->rx_fifo_len >= NUM_PACKETS)
> >>         return -1;
> >>
> >> into smc91c111_receive() and my reproducer stops reproducing the
> >> problem.
> 
> Does it just stop the crash or does it eliminate the problem
> completely with a fully now-working network?

It stops the crash, the network works great.

> >> I also noticed can_receive() could also have a check on buffer
> >> availability. Would one of these changes be the correct fix here?
> >
> > The interesting question is why smc91c111_allocate_packet() doesn't
> > fail in this situation. We only have NUM_PACKETS worth of storage,
> > shared between the tx and rx buffers, so how could we both have
> > already filled the rx_fifo and have a spare packet for the allocate
> > function to return?
> 
> Maybe this:
> 
>             case 5: /* Release.  */
>                 smc91c111_release_packet(s, s->packet_num);
>                 break;
> 
> The guest is able to free an allocated packet without the accompanying
> pop of tx/rx fifo. This may suggest some sort of guest error?
> 
> The fix depends on the behaviour of the real hardware. If that MMIO op
> is supposed to dequeue the corresponding queue entry then we may need
> to patch that logic to do search the queues and dequeue it. Otherwise
> we need to find out the genuine length of the rx queue, and clamp it
> without something like Richards patch. There are a few other bits and
> pieces that suggest the guest can have independent control of the
> queues and allocated buffers but i'm confused as to how the rx fifo
> length can get up to 10 in any case.

I think I have a handle on what is going on. smc91c111_release_packet()
changes s->allocated() but not rx_fifo. can_receive() only looks at
s->allocated. We can trigger new network packets to arrive from
smc91c111_release_packet() which calls qemu_flush_queued_packets()
*before* we change rx_fifo and this can loop.

The patch below which explicitly orders the qemu_flush_queued_packets()
call resolved the test case I was able to reproduce this problem in.

So there are three ways to fix this, either can_receive() needs to check
both s->allocated() and rx_fifo, or the code is more explicit about when
qemu_flush_queued_packets() is called (as per my patch below), or the
case 4 where smc91c111_release_packet() and then
smc91c111_pop_rx_fifo(s) is called is reversed. I also tested the latter
which also works, albeit with more ugly code.

The problem is much more reproducible with the assert btw, booting a
qemu image with this and hitting the network interface with scp of a few
large files is usually enough.

So which patch would be preferred? :)

Cheers,

Richard



Index: qemu-2.4.0/hw/net/smc91c111.c
===================================================================
--- qemu-2.4.0.orig/hw/net/smc91c111.c
+++ qemu-2.4.0/hw/net/smc91c111.c
@@ -185,7 +185,6 @@ static void smc91c111_release_packet(smc
     s->allocated &= ~(1 << packet);
     if (s->tx_alloc == 0x80)
         smc91c111_tx_alloc(s);
-    qemu_flush_queued_packets(qemu_get_queue(s->nic));
 }
 
 /* Flush the TX FIFO.  */
@@ -237,9 +236,11 @@ static void smc91c111_do_tx(smc91c111_st
             }
         }
 #endif
-        if (s->ctr & CTR_AUTO_RELEASE)
+        if (s->ctr & CTR_AUTO_RELEASE) {
             /* Race?  */
             smc91c111_release_packet(s, packetnum);
+            qemu_flush_queued_packets(qemu_get_queue(s->nic));
+        }
         else if (s->tx_fifo_done_len < NUM_PACKETS)
             s->tx_fifo_done[s->tx_fifo_done_len++] = packetnum;
         qemu_send_packet(qemu_get_queue(s->nic), p, len);
@@ -379,9 +380,11 @@ static void smc91c111_writeb(void *opaqu
                     smc91c111_release_packet(s, s->rx_fifo[0]);
                 }
                 smc91c111_pop_rx_fifo(s);
+                qemu_flush_queued_packets(qemu_get_queue(s->nic));
                 break;
             case 5: /* Release.  */
                 smc91c111_release_packet(s, s->packet_num);
+                qemu_flush_queued_packets(qemu_get_queue(s->nic));
                 break;
             case 6: /* Add to TX FIFO.  */
                 smc91c111_queue_tx(s, s->packet_num);

  reply	other threads:[~2015-09-06 14:22 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-09-04 10:25 [Qemu-devel] Segfault using qemu-system-arm in smc91c111 Richard Purdie
2015-09-04 10:45 ` Peter Maydell
2015-09-04 11:24   ` Richard Purdie
2015-09-04 11:31     ` Peter Maydell
2015-09-04 12:43       ` Richard Purdie
2015-09-04 17:20         ` Richard Purdie
2015-09-04 17:30           ` Peter Maydell
2015-09-05 20:30             ` Peter Crosthwaite
2015-09-06 14:21               ` Richard Purdie [this message]
2015-09-06 18:37                 ` Peter Crosthwaite
2015-09-06 23:26                   ` Richard Purdie
2015-09-07  0:48                     ` Peter Crosthwaite
2015-09-07  7:09                       ` Richard Purdie
2015-09-07 18:05                         ` Peter Crosthwaite
2015-09-07  7:18                       ` Richard Purdie
2015-09-07  7:47                       ` Richard Purdie
2015-09-07  9:21                         ` Peter Maydell
2015-09-07 18:12                           ` Peter Crosthwaite
2015-09-08  9:55                           ` Jason Wang
2015-09-07 18:42                   ` Peter Maydell

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1441549313.24871.218.camel@linuxfoundation.org \
    --to=richard.purdie@linuxfoundation.org \
    --cc=crosthwaitepeter@gmail.com \
    --cc=peter.maydell@linaro.org \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).