From: Richard Purdie <richard.purdie@linuxfoundation.org>
To: Peter Crosthwaite <crosthwaitepeter@gmail.com>
Cc: Peter Maydell <peter.maydell@linaro.org>,
	qemu-devel <qemu-devel@nongnu.org>
Subject: Re: [Qemu-devel] Segfault using qemu-system-arm in smc91c111
Date: Sun, 06 Sep 2015 15:21:53 +0100
Message-ID: <1441549313.24871.218.camel@linuxfoundation.org>
In-Reply-To: <CAPokK=p9GUx7OeJ7xNwbnJkzd7JMLVayhgOCf0XapHnW9M9JLA@mail.gmail.com>

On Sat, 2015-09-05 at 13:30 -0700, Peter Crosthwaite wrote:
> On Fri, Sep 4, 2015 at 10:30 AM, Peter Maydell <peter.maydell@linaro.org> wrote:
> > On 4 September 2015 at 18:20, Richard Purdie
> > <richard.purdie@linuxfoundation.org> wrote:
> >> On Fri, 2015-09-04 at 13:43 +0100, Richard Purdie wrote:
> >>> On Fri, 2015-09-04 at 12:31 +0100, Peter Maydell wrote:
> >>> > On 4 September 2015 at 12:24, Richard Purdie
> >>> > <richard.purdie@linuxfoundation.org> wrote:
> >>> > > So just based on that, yes, seems that the rx_fifo looks to be
> >>> > > overrunning. I can add the asserts but I think it would just confirm
> >>> > > this.
> >>> >
> >>> > Yes, the point of adding assertions is to confirm a hypothesis.
> >>>
> >>> I've now confirmed that it does indeed trigger the assert in
> >>> smc91c111_receive().
> >>
> >> I just tried an experiment where I put:
> >>
> >>     if (s->rx_fifo_len >= NUM_PACKETS)
> >>         return -1;
> >>
> >> into smc91c111_receive() and my reproducer stops reproducing the
> >> problem.
> 
> Does it just stop the crash or does it eliminate the problem
> completely with a fully now-working network?

It stops the crash, the network works great.

> >> I also noticed can_receive() could also have a check on buffer
> >> availability. Would one of these changes be the correct fix here?
> >
> > The interesting question is why smc91c111_allocate_packet() doesn't
> > fail in this situation. We only have NUM_PACKETS worth of storage,
> > shared between the tx and rx buffers, so how could we both have
> > already filled the rx_fifo and have a spare packet for the allocate
> > function to return?
> 
> Maybe this:
> 
>             case 5: /* Release.  */
>                 smc91c111_release_packet(s, s->packet_num);
>                 break;
> 
> The guest is able to free an allocated packet without the accompanying
> pop of tx/rx fifo. This may suggest some sort of guest error?
> 
> The fix depends on the behaviour of the real hardware. If that MMIO op
> is supposed to dequeue the corresponding queue entry then we may need
> to patch that logic to search the queues and dequeue it. Otherwise
> we need to find out the genuine length of the rx queue, and clamp it
> without something like Richard's patch. There are a few other bits and
> pieces that suggest the guest can have independent control of the
> queues and allocated buffers but I'm confused as to how the rx fifo
> length can get up to 10 in any case.

I think I have a handle on what is going on. smc91c111_release_packet()
changes s->allocated but not rx_fifo, and can_receive() only looks at
s->allocated. smc91c111_release_packet() also calls
qemu_flush_queued_packets(), so new network packets can arrive *before*
we change rx_fifo, and this can loop.
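
To make the ordering problem concrete, here is a minimal standalone model
of it. This is only a sketch, not the QEMU code: the Model struct,
FIFO_SLOTS, remove_and_release() and peak_rx_fifo_len() are hypothetical
names, NUM_PACKETS is simply set to 4 for the model, and the FIFO is
deliberately oversized so the overrun can be recorded instead of
corrupting memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NUM_PACKETS 4
#define FIFO_SLOTS  (2 * NUM_PACKETS)  /* oversized so the model can record an overrun */

/* Hypothetical stand-in for the device state; not the real smc91c111_state. */
typedef struct {
    int allocated;             /* bitmask of packet buffers in use */
    int rx_fifo[FIFO_SLOTS];
    int rx_fifo_len;
    int peak_len;              /* largest rx_fifo_len seen so far */
    int pending;               /* packets waiting in the backend queue */
} Model;

/* As described above: only the allocation bitmask is consulted. */
static bool can_receive(const Model *s)
{
    return s->allocated != (1 << NUM_PACKETS) - 1;
}

/* Take the first free buffer and append it to the RX FIFO. */
static void receive(Model *s)
{
    for (int i = 0; i < NUM_PACKETS; i++) {
        if (!(s->allocated & (1 << i))) {
            assert(s->rx_fifo_len < FIFO_SLOTS);
            s->allocated |= 1 << i;
            s->rx_fifo[s->rx_fifo_len++] = i;
            if (s->rx_fifo_len > s->peak_len) {
                s->peak_len = s->rx_fifo_len;
            }
            return;
        }
    }
}

/* Deliver queued packets for as long as the device claims it can take them. */
static void flush_queued_packets(Model *s)
{
    while (s->pending > 0 && can_receive(s)) {
        s->pending--;
        receive(s);
    }
}

static void pop_rx_fifo(Model *s)
{
    s->rx_fifo_len--;
    memmove(&s->rx_fifo[0], &s->rx_fifo[1], s->rx_fifo_len * sizeof(int));
}

/* Models "remove and release": free the head buffer, then pop the FIFO. */
static void remove_and_release(Model *s, bool flush_inside_release)
{
    s->allocated &= ~(1 << s->rx_fifo[0]);
    if (flush_inside_release) {
        flush_queued_packets(s);   /* runs before the FIFO has shrunk */
    }
    pop_rx_fifo(s);
    if (!flush_inside_release) {
        flush_queued_packets(s);   /* runs after the FIFO has shrunk */
    }
}

/* Fill the FIFO, drain it a few times, and report the peak FIFO length. */
int peak_rx_fifo_len(bool flush_inside_release)
{
    Model s = { .pending = 3 * NUM_PACKETS };

    flush_queued_packets(&s);
    for (int i = 0; i < NUM_PACKETS; i++) {
        remove_and_release(&s, flush_inside_release);
    }
    return s.peak_len;
}
```

In this model, peak_rx_fifo_len(true) goes one entry past NUM_PACKETS,
while peak_rx_fifo_len(false) stays within it.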

The patch below, which explicitly orders the qemu_flush_queued_packets()
calls, resolves the test case in which I was able to reproduce this problem.

So there are three ways to fix this: either can_receive() needs to check
both s->allocated and rx_fifo, or the code is more explicit about when
qemu_flush_queued_packets() is called (as per my patch below), or the
order in case 4, where smc91c111_release_packet() and then
smc91c111_pop_rx_fifo(s) are called, is reversed. I also tested the latter,
which also works, albeit with uglier code.
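
For comparison, the first option (having can_receive() look at both
pieces of state) might look roughly like the sketch below. It is written
against a hypothetical ModelState struct rather than the real device
state, with NUM_PACKETS set to 4 for illustration, so treat it as a
statement of intent and not as a tested patch.

```c
#include <stdbool.h>

#define NUM_PACKETS 4

/* Hypothetical reduced state: just the two fields the check needs. */
typedef struct {
    int allocated;     /* bitmask of packet buffers in use */
    int rx_fifo_len;   /* number of entries currently in the RX FIFO */
} ModelState;

/* Refuse packets unless there is both a free buffer and a free FIFO slot. */
bool model_can_receive(const ModelState *s)
{
    if (s->allocated == (1 << NUM_PACKETS) - 1) {
        return false;  /* no packet buffer left to allocate */
    }
    if (s->rx_fifo_len >= NUM_PACKETS) {
        return false;  /* buffer available, but the RX FIFO is already full */
    }
    return true;
}
```

The second check covers the window where a buffer has been released but
the matching rx_fifo entry has not been popped yet.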

The problem is much more reproducible with the assert, btw: booting a
qemu image with it and hitting the network interface with scp of a few
large files is usually enough.

So which patch would be preferred? :)

Cheers,

Richard



Index: qemu-2.4.0/hw/net/smc91c111.c
===================================================================
--- qemu-2.4.0.orig/hw/net/smc91c111.c
+++ qemu-2.4.0/hw/net/smc91c111.c
@@ -185,7 +185,6 @@ static void smc91c111_release_packet(smc
     s->allocated &= ~(1 << packet);
     if (s->tx_alloc == 0x80)
         smc91c111_tx_alloc(s);
-    qemu_flush_queued_packets(qemu_get_queue(s->nic));
 }
 
 /* Flush the TX FIFO.  */
@@ -237,9 +236,11 @@ static void smc91c111_do_tx(smc91c111_st
             }
         }
 #endif
-        if (s->ctr & CTR_AUTO_RELEASE)
+        if (s->ctr & CTR_AUTO_RELEASE) {
             /* Race?  */
             smc91c111_release_packet(s, packetnum);
+            qemu_flush_queued_packets(qemu_get_queue(s->nic));
+        }
         else if (s->tx_fifo_done_len < NUM_PACKETS)
             s->tx_fifo_done[s->tx_fifo_done_len++] = packetnum;
         qemu_send_packet(qemu_get_queue(s->nic), p, len);
@@ -379,9 +380,11 @@ static void smc91c111_writeb(void *opaqu
                     smc91c111_release_packet(s, s->rx_fifo[0]);
                 }
                 smc91c111_pop_rx_fifo(s);
+                qemu_flush_queued_packets(qemu_get_queue(s->nic));
                 break;
             case 5: /* Release.  */
                 smc91c111_release_packet(s, s->packet_num);
+                qemu_flush_queued_packets(qemu_get_queue(s->nic));
                 break;
             case 6: /* Add to TX FIFO.  */
                 smc91c111_queue_tx(s, s->packet_num);
