From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: Dmitry Fleytman <dmitry@daynix.com>
Cc: qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] e1000e migration
Date: Mon, 15 May 2017 17:14:46 +0100
Message-ID: <20170515161445.GA2324@work-vm>
In-Reply-To: <6F78534C-7A67-43F4-A9E8-E532BCD16976@daynix.com>

* Dmitry Fleytman (dmitry@daynix.com) wrote:
> Hello Dave,
> 
> It looks like we identified the problem.
> 
> We are working on a fix and will send it as soon as it is ready.

Thanks!

Dave

> ~Dmitry.
> 
> Sent from my iPhone
> 
> > On 15 May 2017, at 12:22, Dr. David Alan Gilbert <dgilbert@redhat.com> wrote:
> > 
> > * Dmitry Fleytman (dmitry@daynix.com) wrote:
> >> Hello Dave,
> > 
> > Hi Dmitry,
> >  Thanks for the reply.
> > 
> >> We are trying to reproduce this issue on our systems but with no luck so far…
> > 
> > Note our QE hit this with both a Win8.1 and a win2012r2 guest - although
> > the 2012r2 is reported to have recovered after a few minutes.
> > 2016 apparently works OK.
> > 
> >> From what you describe it looks like some bit in ICR is not being cleared by the driver.
> >> This usually means that this bit should never be set in that specific interrupt mode.
> >> 
> >> Could you please check which bit is not cleared and who sets it?
> > 
> > The full set of e1000e_irq_pending_interrupts after migration is:
> > 23004@1494519346.673905:e1000e_irq_pending_interrupts ICR PENDING: 0x100000 (ICR: 0x80100082, IMS: 0x1f00004)
> > 23004@1494519346.674787:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x80100082, IMS: 0x1e00004)
> > 23004@1494519346.674946:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x80100082, IMS: 0x1e00004)
> > 23004@1494519346.675119:e1000e_irq_pending_interrupts ICR PENDING: 0x200000 (ICR: 0x80300082, IMS: 0x1e00004)
> > 23004@1494519346.675302:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x80100082, IMS: 0x1c00004)
> >  <repeated lots>
> > 23004@1494519346.716279:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x80300082, IMS: 0x1c00004)
> > 23004@1494519346.716380:e1000e_irq_pending_interrupts ICR PENDING: 0x1000000 (ICR: 0x813000c2, IMS: 0x1c00004)
> > 23004@1494519346.717040:e1000e_irq_pending_interrupts ICR PENDING: 0x1000000 (ICR: 0x813000c2, IMS: 0x1400004)
> > 23004@1494519346.717276:e1000e_irq_pending_interrupts ICR PENDING: 0x1000000 (ICR: 0x813000c2, IMS: 0x1000004)
> > 23004@1494519346.717443:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x813000c2, IMS: 0x4)
> > 23004@1494519346.717567:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x813000c2, IMS: 0x4)
> > 23004@1494519346.717782:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x813000c2, IMS: 0x4)
> > 23004@1494519346.717918:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x813000c2, IMS: 0x4)
> > 23004@1494519346.718319:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x813000c2, IMS: 0x4)
> > 23004@1494519346.718523:e1000e_irq_pending_interrupts ICR PENDING: 0x200000 (ICR: 0x813000c2, IMS: 0xa00004)
> > 23004@1494519346.718684:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x811000c2, IMS: 0x4)
> > 23004@1494519346.718890:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x811000c2, IMS: 0x4)
> > 23004@1494519346.719034:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x811000c2, IMS: 0xa00004)
> > 23004@1494519346.719130:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x811000c2, IMS: 0xa00004)
> >  <repeats>
> > 23004@1494519346.722699:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x811000c2, IMS: 0xa00004)
> > 23004@1494519346.722868:e1000e_irq_pending_interrupts ICR PENDING: 0x200000 (ICR: 0x813000c2, IMS: 0xa00004)
> > 23004@1494519346.723068:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x811000c2, IMS: 0x800004)
> >  <repeats>
> > 23004@1494519346.731198:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x813000c2, IMS: 0x800004)
> > 23004@1494519346.731422:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x813000c2, IMS: 0x4)
> > 23004@1494519346.731930:e1000e_irq_pending_interrupts ICR PENDING: 0x200000 (ICR: 0x813000c2, IMS: 0xa00004)
> > 23004@1494519346.732082:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x811000c2, IMS: 0x4)
> > 23004@1494519346.732274:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x811000c2, IMS: 0x4)
> > 23004@1494519346.732404:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x811000c2, IMS: 0xa00004)
> > 23004@1494519346.732504:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x811000c2, IMS: 0xa00004)
> > 23004@1494519346.784150:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x815000c2, IMS: 0xa00004)
> > 23004@1494519346.786506:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x815000c2, IMS: 0xa00004)
> > 23004@1494519346.786534:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x815000c2, IMS: 0xa00004)
> > 23004@1494519346.789644:e1000e_irq_pending_interrupts ICR PENDING: 0x1000000 (ICR: 0x815000c2, IMS: 0x1a00004)
> > 23004@1494519346.789864:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x815000c2, IMS: 0xa00004)
> > 23004@1494519346.789992:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x815000c2, IMS: 0xa00004)
> > 23004@1494519346.790413:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x815000c2, IMS: 0xa00004)
> > 23004@1494519346.790539:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x815000c2, IMS: 0xa00004)
> > 23004@1494519346.792593:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x815000c2, IMS: 0xa00004)
> > 23004@1494519346.792620:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x815000c2, IMS: 0xa00004)
> > 23004@1494519346.795943:e1000e_irq_pending_interrupts ICR PENDING: 0x1000000 (ICR: 0x815000c2, IMS: 0x1a00004)
> > 
> > and then I think we get stuck in a cycle where this one always fires
> > repeatedly.  I think that's the 'Other' interrupt (bit 24) firing,
> > probably because of the receive overrun.  One thing I've not figured
> > out is why the receive overrun happens: is it because we really have
> > a very heavy packet rate, or because something has stopped receiving
> > them?
> > The network I'm testing on does have a fair amount of broadcast
> > traffic.
> > 
> > Dave
> > 
> >> Regards,
> >> Dmitry
> >> 
> >>> On 11 May 2017, at 15:36 PM, Dr. David Alan Gilbert <dgilbert@redhat.com> wrote:
> >>> 
> >>> Hi Dmitry,
> >>> Have you seen any problems with e1000e migration under windows?
> >>> I've got a repeatable case where after migration with e1000e windows
> >>> hangs/almost hangs.
> >>> I'm seeing the e1000e generate interrupts at a very high
> >>> rate (maybe ~1000 per second-ish?) after migration.
> >>> 
> >>> Some versions of qemu do it and some don't, but my attempts
> >>> at bisection lead me to code that should be irrelevant.
> >>> 
> >>> Prior to migration I see:
> >>> 
> >>> 36461@1494504466.711929:e1000e_irq_pending_interrupts ICR PENDING: 0x100000 (ICR: 0x80100082, IMS: 0x1f00004)
> >>> 36461@1494504466.711992:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x80000082, IMS: 0x1a00004)
> >>> 36461@1494504466.712076:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x80000082, IMS: 0x1f00004)
> >>> 36461@1494504466.712245:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x80000082, IMS: 0x1a00004)
> >>> 36461@1494504466.712332:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x80000082, IMS: 0x1f00004)
> >>> 
> >>> where I think the ICR bits mean:
> >>>     31 - int asserted
> >>>     20 - RxQ0 - receive queue 0 interrupt
> >>>     7  - RXT0 - receiver timer interrupt
> >>>     1  - TXQE - Transmit Queue empty
> >>> 
> >>> after migration it varies more, I'm seeing mostly:
> >>> 21977@1494504516.320707:e1000e_irq_pending_interrupts ICR PENDING: 0x1000000 (ICR: 0x815000c2, IMS: 0x1a00004)
> >>>     31 - int asserted
> >>>     24 - 'Other'
> >>>     22 - TxQ0 interrupt
> >>>     20 - RxQ0 interrupt
> >>>     07 - RXT0 Receiver timer interrupt
> >>>     06 - RXO - Receiver overrun
> >>>     01 - TXQE - Transmit queue empty
> >>> 
> >>> For reference this is https://bugzilla.redhat.com/show_bug.cgi?id=1447935
> >>> 
> >>> Dave
> >>> --
> >>> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> >> 
> > --
> > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
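
For anyone reading the traces in this thread, the ICR values can be decoded
mechanically. The following is a minimal sketch, not code from QEMU: the
helper names decode_icr and pending are made up for illustration, the bit
names are only the ones listed in the quoted mail, and it assumes that the
traced 'ICR PENDING' value is simply ICR masked by IMS (which matches every
trace line quoted above that was checked, but is an assumption here).

```python
# Bit names exactly as listed in the thread; any other set bit is
# reported as "bit<N>" rather than guessed at.
ICR_BIT_NAMES = {
    31: "INT_ASSERTED",
    24: "Other",
    22: "TxQ0",
    20: "RxQ0",
    7: "RXT0",
    6: "RXO",
    1: "TXQE",
}


def decode_icr(value):
    """Return the names of the set bits in a 32-bit value, highest bit first."""
    return [ICR_BIT_NAMES.get(bit, f"bit{bit}")
            for bit in range(31, -1, -1) if value & (1 << bit)]


def pending(icr, ims):
    """Assumption: 'ICR PENDING' in the trace is the ICR masked by IMS."""
    return icr & ims


# One pre-migration and one post-migration (ICR, IMS) pair from the traces.
for icr, ims in [(0x80100082, 0x1f00004), (0x815000c2, 0x1a00004)]:
    p = pending(icr, ims)
    print(f"ICR {icr:#010x}: {', '.join(decode_icr(icr))}")
    print(f"  pending {p:#x}: {', '.join(decode_icr(p)) or 'none'}")
```

Under that assumption, the post-migration pair leaves only bit 24 ('Other')
pending, which is consistent with the reading above that the 'Other'
interrupt is the one firing repeatedly.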

Thread overview: 5+ messages
2017-05-11 12:36 [Qemu-devel] e1000e migration Dr. David Alan Gilbert
2017-05-14 13:14 ` Dmitry Fleytman
2017-05-15  9:22   ` Dr. David Alan Gilbert
2017-05-15 16:13     ` Dmitry Fleytman
2017-05-15 16:14       ` Dr. David Alan Gilbert [this message]
