From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: Dmitry Fleytman <dmitry@daynix.com>
Cc: qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] e1000e migration
Date: Mon, 15 May 2017 17:14:46 +0100
Message-ID: <20170515161445.GA2324@work-vm>
In-Reply-To: <6F78534C-7A67-43F4-A9E8-E532BCD16976@daynix.com>

* Dmitry Fleytman (dmitry@daynix.com) wrote:
> Hello Dave,
> 
> It looks like we identified the problem.
> 
> We are working on a fix and will send it as soon as it is ready.

Thanks!

Dave

> ~Dmitry.
> 
> Sent from my iPhone
> 
> > On 15 May 2017, at 12:22, Dr. David Alan Gilbert <dgilbert@redhat.com> wrote:
> > 
> > * Dmitry Fleytman (dmitry@daynix.com) wrote:
> >> Hello Dave,
> > 
> > Hi Dmitry,
> >  Thanks for the reply.
> > 
> >> We are trying to reproduce this issue on our systems but with no luck so far…
> > 
> > Note our QE hit this with both a Win8.1 and a Win2012r2 guest - although
> > the 2012r2 guest is reported to have recovered after a few minutes.
> > 2016 apparently works OK.
> > 
> >> From what you describe it looks like some bit in ICR is not being cleared by the driver.
> >> This usually means that this bit should never be set in that specific interrupt mode.
> >> 
> >> Could you please check which bit is not cleared and who sets it?
> > 
> > The full set of e1000e_irq_pending_interrupts after migration is:
> > 23004@1494519346.673905:e1000e_irq_pending_interrupts ICR PENDING: 0x100000 (ICR: 0x80100082, IMS: 0x1f00004)
> > 23004@1494519346.674787:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x80100082, IMS: 0x1e00004)
> > 23004@1494519346.674946:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x80100082, IMS: 0x1e00004)
> > 23004@1494519346.675119:e1000e_irq_pending_interrupts ICR PENDING: 0x200000 (ICR: 0x80300082, IMS: 0x1e00004)
> > 23004@1494519346.675302:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x80100082, IMS: 0x1c00004)
> >  <repeated lots>
> > 23004@1494519346.716279:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x80300082, IMS: 0x1c00004)
> > 23004@1494519346.716380:e1000e_irq_pending_interrupts ICR PENDING: 0x1000000 (ICR: 0x813000c2, IMS: 0x1c00004)
> > 23004@1494519346.717040:e1000e_irq_pending_interrupts ICR PENDING: 0x1000000 (ICR: 0x813000c2, IMS: 0x1400004)
> > 23004@1494519346.717276:e1000e_irq_pending_interrupts ICR PENDING: 0x1000000 (ICR: 0x813000c2, IMS: 0x1000004)
> > 23004@1494519346.717443:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x813000c2, IMS: 0x4)
> > 23004@1494519346.717567:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x813000c2, IMS: 0x4)
> > 23004@1494519346.717782:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x813000c2, IMS: 0x4)
> > 23004@1494519346.717918:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x813000c2, IMS: 0x4)
> > 23004@1494519346.718319:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x813000c2, IMS: 0x4)
> > 23004@1494519346.718523:e1000e_irq_pending_interrupts ICR PENDING: 0x200000 (ICR: 0x813000c2, IMS: 0xa00004)
> > 23004@1494519346.718684:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x811000c2, IMS: 0x4)
> > 23004@1494519346.718890:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x811000c2, IMS: 0x4)
> > 23004@1494519346.719034:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x811000c2, IMS: 0xa00004)
> > 23004@1494519346.719130:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x811000c2, IMS: 0xa00004)
> >  <repeats>
> > 23004@1494519346.722699:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x811000c2, IMS: 0xa00004)
> > 23004@1494519346.722868:e1000e_irq_pending_interrupts ICR PENDING: 0x200000 (ICR: 0x813000c2, IMS: 0xa00004)
> > 23004@1494519346.723068:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x811000c2, IMS: 0x800004)
> >  <repeats>
> > 23004@1494519346.731198:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x813000c2, IMS: 0x800004)
> > 23004@1494519346.731422:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x813000c2, IMS: 0x4)
> > 23004@1494519346.731930:e1000e_irq_pending_interrupts ICR PENDING: 0x200000 (ICR: 0x813000c2, IMS: 0xa00004)
> > 23004@1494519346.732082:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x811000c2, IMS: 0x4)
> > 23004@1494519346.732274:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x811000c2, IMS: 0x4)
> > 23004@1494519346.732404:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x811000c2, IMS: 0xa00004)
> > 23004@1494519346.732504:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x811000c2, IMS: 0xa00004)
> > 23004@1494519346.784150:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x815000c2, IMS: 0xa00004)
> > 23004@1494519346.786506:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x815000c2, IMS: 0xa00004)
> > 23004@1494519346.786534:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x815000c2, IMS: 0xa00004)
> > 23004@1494519346.789644:e1000e_irq_pending_interrupts ICR PENDING: 0x1000000 (ICR: 0x815000c2, IMS: 0x1a00004)
> > 23004@1494519346.789864:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x815000c2, IMS: 0xa00004)
> > 23004@1494519346.789992:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x815000c2, IMS: 0xa00004)
> > 23004@1494519346.790413:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x815000c2, IMS: 0xa00004)
> > 23004@1494519346.790539:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x815000c2, IMS: 0xa00004)
> > 23004@1494519346.792593:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x815000c2, IMS: 0xa00004)
> > 23004@1494519346.792620:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x815000c2, IMS: 0xa00004)
> > 23004@1494519346.795943:e1000e_irq_pending_interrupts ICR PENDING: 0x1000000 (ICR: 0x815000c2, IMS: 0x1a00004)
> > 
> > and then I think we get stuck in a cycle where this last one is always the
> > one that fires repeatedly.  I think that's the 'Other' cause firing, probably
> > because of the receive overrun.  One thing I've not
> > figured out is why the receive overrun happens - is that because we
> > really have a very heavy packet rate, or is it because something has
> > stopped receiving them?
> > The network I'm testing on does have a fair amount of broadcast traffic
> > on.
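
The PENDING values in the trace above look consistent with
PENDING = ICR & IMS, i.e. a cause counts as pending when it is set in ICR
and unmasked in IMS.  That relation is an assumption read off the numbers,
not something confirmed against the device model here.  A minimal Python
sketch that checks it against a few (ICR, IMS, PENDING) triples copied from
the trace; pending() and pending_bits() are hypothetical helper names:

```python
# (ICR, IMS, PENDING) triples copied from the trace lines quoted above.
SAMPLES = [
    (0x80100082, 0x1F00004, 0x100000),
    (0x80300082, 0x1E00004, 0x200000),
    (0x813000C2, 0x1C00004, 0x1000000),
    (0x815000C2, 0x0A00004, 0x0),
    (0x815000C2, 0x1A00004, 0x1000000),
]


def pending(icr, ims):
    # Assumption: a cause is pending when it is set in ICR and unmasked in IMS.
    return icr & ims


def pending_bits(icr, ims):
    # Bit positions (0..31) of the pending causes, lowest first.
    value = pending(icr, ims)
    return [bit for bit in range(32) if (value >> bit) & 1]


for icr, ims, logged in SAMPLES:
    # Each logged PENDING value should match the computed mask.
    assert pending(icr, ims) == logged

print(pending_bits(0x815000C2, 0x1A00004))  # prints [24]
```

For the line that keeps repeating, the only pending bit is 24, which matches
the reading that it is the 'Other' cause that fires.
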
> > 
> > Dave
> > 
> >> Regards,
> >> Dmitry
> >> 
> >>> On 11 May 2017, at 15:36 PM, Dr. David Alan Gilbert <dgilbert@redhat.com> wrote:
> >>> 
> >>> Hi Dmitry,
> >>> Have you seen any problems with e1000e migration under Windows?
> >>> I've got a repeatable case where, after migration with e1000e, Windows
> >>> hangs or almost hangs.
> >>> I'm seeing the e1000e generate interrupts at a very high
> >>> rate (maybe ~1000 per second?) after migration.
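
A rough rate can be estimated from the trace timestamps rather than by eye.
The sketch below is only an illustration: the line format
(pid@seconds.microseconds:event arguments) is inferred from the trace lines
in this thread, the helper names are made up, and it counts trace events,
which is not necessarily the same thing as interrupts delivered to the guest.
The three sample lines are picked from the trace purely to exercise the
parser, so the number it prints is not a measurement.

```python
import re

# Format inferred from the trace lines quoted in this thread:
#   <pid>@<seconds>.<microseconds>:<event name> <arguments>
TRACE_RE = re.compile(r"^(\d+)@(\d+)\.(\d{6}):(\w+)\s")


def event_times_us(lines, event):
    """Yield timestamps in integer microseconds for lines of the given event."""
    for line in lines:
        m = TRACE_RE.match(line)
        if m and m.group(4) == event:
            yield int(m.group(2)) * 1_000_000 + int(m.group(3))


def events_per_second(lines, event):
    """Average event rate between the first and last matching line."""
    times = list(event_times_us(lines, event))
    if len(times) < 2 or times[-1] == times[0]:
        return None  # not enough data to estimate a rate
    return (len(times) - 1) * 1_000_000 / (times[-1] - times[0])


sample = [
    "23004@1494519346.784150:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x815000c2, IMS: 0xa00004)",
    "23004@1494519346.789644:e1000e_irq_pending_interrupts ICR PENDING: 0x1000000 (ICR: 0x815000c2, IMS: 0x1a00004)",
    "23004@1494519346.795943:e1000e_irq_pending_interrupts ICR PENDING: 0x1000000 (ICR: 0x815000c2, IMS: 0x1a00004)",
]
print(events_per_second(sample, "e1000e_irq_pending_interrupts"))
```

Timestamps are kept as integer microseconds so that no precision is lost on
values this large.
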
> >>> 
> >>> Some versions of qemu do it and some don't, but my attempts
> >>> at bisection lead me to code that should be irrelevant.
> >>> 
> >>> Prior to migration I see:
> >>> 
> >>> 36461@1494504466.711929:e1000e_irq_pending_interrupts ICR PENDING: 0x100000 (ICR: 0x80100082, IMS: 0x1f00004)
> >>> 36461@1494504466.711992:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x80000082, IMS: 0x1a00004)
> >>> 36461@1494504466.712076:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x80000082, IMS: 0x1f00004)
> >>> 36461@1494504466.712245:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x80000082, IMS: 0x1a00004)
> >>> 36461@1494504466.712332:e1000e_irq_pending_interrupts ICR PENDING: 0x0 (ICR: 0x80000082, IMS: 0x1f00004)
> >>> 
> >>> where I think the ICR bits mean:
> >>>     31 - int asserted
> >>>     20 - RxQ0 - receive queue 0 interrupt
> >>>     7  - RXT0 - receiver timer interrupt
> >>>     1  - TXQE - Transmit Queue empty
> >>> 
> >>> After migration it varies more; I'm mostly seeing:
> >>> 21977@1494504516.320707:e1000e_irq_pending_interrupts ICR PENDING: 0x1000000 (ICR: 0x815000c2, IMS: 0x1a00004)
> >>>     31 - int asserted
> >>>     24 - 'Other'
> >>>     22 - TxQ0 interrupt
> >>>     20 - RxQ0 interrupt
> >>>     07 - RXT0 Receiver timer interrupt
> >>>     06 - RXO - Receiver overrun
> >>>     01 - TXQE - Transmit queue empty
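
The bit positions listed above can be checked mechanically.  Below is a
minimal Python sketch that turns an ICR value into a list of names.  The
ICR_BITS table and the decode_icr() helper are illustrative names made up
for this note; the table holds only the bits mentioned in this message
(with bit 6 written as RXO, receiver overrun) and is not copied from QEMU's
register definitions.

```python
# Bit names as listed in the message above; an illustration only,
# not a copy of QEMU's register definitions.
ICR_BITS = {
    31: "INT_ASSERTED",
    24: "OTHER",
    22: "TXQ0",
    20: "RXQ0",
    7: "RXT0",
    6: "RXO",
    1: "TXQE",
}


def decode_icr(value):
    """Return the names of the set bits, highest bit first.

    Bits that are not in ICR_BITS are reported as 'bitN'.
    """
    names = []
    for bit in range(31, -1, -1):
        if value & (1 << bit):
            names.append(ICR_BITS.get(bit, f"bit{bit}"))
    return names


print(decode_icr(0x80100082))  # value seen before migration
print(decode_icr(0x815000C2))  # value seen after migration
```

The first call gives the four causes listed for the pre-migration value and
the second gives the seven listed for the post-migration value.
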
> >>> 
> >>> For reference this is https://bugzilla.redhat.com/show_bug.cgi?id=1447935
> >>> 
> >>> Dave
> >>> --
> >>> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> >> 
> > --
> > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

Thread overview: 5+ messages
2017-05-11 12:36 [Qemu-devel] e1000e migration Dr. David Alan Gilbert
2017-05-14 13:14 ` Dmitry Fleytman
2017-05-15  9:22   ` Dr. David Alan Gilbert
2017-05-15 16:13     ` Dmitry Fleytman
2017-05-15 16:14       ` Dr. David Alan Gilbert [this message]
