All of lore.kernel.org
 help / color / mirror / Atom feed
From: Keller, Jacob E <jacob.e.keller@intel.com>
To: intel-wired-lan@osuosl.org
Subject: [Intel-wired-lan] [net v2 0/5] igb: fix ptp suspend/resume issue
Date: Tue, 17 May 2016 22:04:43 +0000	[thread overview]
Message-ID: <1463522683.10583.26.camel@intel.com> (raw)
In-Reply-To: <309B89C4C689E141A5FF6A0C5FB2118B81EFE3FA@ORSMSX101.amr.corp.intel.com>

On Tue, 2016-05-17 at 22:01 +0000, Brown, Aaron F wrote:
> > 
> > From: Keller, Jacob E
> > Sent: Tuesday, May 17, 2016 2:54 PM
> > To: Kirsher, Jeffrey T <jeffrey.t.kirsher@intel.com>; Brown, Aaron
> > F
> > <aaron.f.brown@intel.com>; intel-wired-lan at lists.osuosl.org
> > Cc: sagar.tv at gmail.com
> > Subject: Re: [Intel-wired-lan] [net v2 0/5] igb: fix ptp
> > suspend/resume issue
> > 
> > On Tue, 2016-05-17 at 21:05 +0000, Brown, Aaron F wrote:
> > > 
> > > > 
> > > > 
> > > > From: Keller, Jacob E
> > > > 
Sent: Tuesday, May 17, 2016 1:47 PM
> > > > To: Kirsher, Jeffrey T <jeffrey.t.kirsher@intel.com>; Brown,
> > > > Aaron
> > > > F
> > > > <aaron.f.brown@intel.com>; intel-wired-lan at lists.osuosl.org
> > > > Cc: sagar.tv at gmail.com
> > > > Subject: Re: [Intel-wired-lan] [net v2 0/5] igb: fix ptp
> > > > suspend/resume issue
> > > > 
> > > > On Mon, 2016-05-16 at 19:29 -0700, Jeff Kirsher wrote:
> > > > > 
> > > > > 
> > > > > On Tue, 2016-05-17 at 01:57 +0000, Brown, Aaron F wrote:
> > > > > > 
> > > > > > 
> > > > > > 
> > > > > > > 
> > > > > > > 
> > > > > > > 
> > > > > > > From: Intel-wired-lan [mailto:intel-wired-lan-bounces at lis
> > > > > > > ts.o
> > > > > > > suos
> > > > > > > l.org]
> > > > > > On
> > > > > > > 
> > > > > > > 
> > > > > > > 
> > > > > > > Behalf Of Jacob Keller
> > > > > > > Sent: Wednesday, May 11, 2016 4:18 PM
> > > > > > > To: Intel Wired LAN <intel-wired-lan@lists.osuosl.org>
> > > > > > > Cc: Vidya Sagar <sagar.tv@gmail.com>
> > > > > > > Subject: [Intel-wired-lan] [net v2 0/5] igb: fix ptp
> > > > > > > suspend/resume
> > > > > > issue
> > > > > > > 
> > > > > > > 
> > > > > > > 
> > > > > > > 
> > > > > > > This patch series (properly) fixes the issue with igb's
> > > > > > > workqueue
> > > > > > > item
> > > > > > > for overflow check from causing a surprise remove event.
> > > > > > > To
> > > > > > > do
> > > > > > > this,
> > > > > > > properly suspend the workqueue items in suspend and then
> > > > > > > resume
> > > > > > > them
> > > > > > > again during the resume flow.
> > > > > > > 
> > > > > > > The patch series has a few extra steps to reduce code
> > > > > > > duplication
> > > > > > > and
> > > > > > > implement suspend and resume properly, which makes the
> > > > > > > overall
> > > > > > > fix a
> > > > > > bit
> > > > > > > 
> > > > > > > 
> > > > > > > 
> > > > > > > more complicated, and thus review is welcome.
> > > > > > > 
> > > > > > > A smaller fix would be to implement suspend and resume
> > > > > > > irrespective of
> > > > > > > the current igb_ptp_stop and igb_ptp_init but this seems
> > > > > > > more
> > > > > > > likely to
> > > > > > > introduce bugs especially if either function ever changes
> > > > > > > in
> > > > > > > the
> > > > > > future.
> > > > > > > 
> > > > > > > 
> > > > > > > 
> > > > > > > 
> > > > > > > In addition, the ptp_flags variable is added mostly to
> > > > > > > simplify
> > > > > > > the
> > > > > > work
> > > > > > > 
> > > > > > > 
> > > > > > > 
> > > > > > > of writing several complex MAC type checks in the ptp
> > > > > > > code
> > > > > > > 
while
> > > > > > > doing
> > > > > > > this.
> > > > > > > 
> > > > > > > Jacob Keller (5):
> > > > > > > ?? igb: introduce ptp_flags variable and use it to
> > > > > > > replace
> > > > > > > IGB_FLAG_PTP
> > > > > > > ?? igb: introduce IGB_PTP_OVERFLOW_CHECK flag
> > > > > > > ?? igb: introduce igb_ptp_resume function
> > > > > > > ?? igb: implement igb_ptp_suspend
> > > > > > > ?? igb: call igb_ptp_suspend/igb_ptp_resume during
> > > > > > > suspend/resume
> > > > > > > cycle
> > > > > > > 
> > > > > > > ? drivers/net/ethernet/intel/igb/igb.h????? |?? 8 ++-
> > > > > > > ? drivers/net/ethernet/intel/igb/igb_main.c |?? 4 +-
> > > > > > > ? drivers/net/ethernet/intel/igb/igb_ptp.c? | 110
> > > > > > > ++++++++++++++++----
> > > > > > ---------
> > > > > > > 
> > > > > > > 
> > > > > > > 
> > > > > > > -
> > > > > > > ? 3 files changed, 68 insertions(+), 54 deletions(-)
> > > > > > I have not isolated it to the exact patch yet, but one of
> > > > > > the
> > > > > > patches in
> > > > > > this series is causing my systems to lock up with a call
> > > > > > trace.? I
> > > > > > am
> > > > > > currently unable to capture the trace in any form other
> > > > > > than a
> > > > > > bitmap
> > > > > > (which I'll send to Jacob but am not attaching here.)? The
> > > > > > trace is
> > > > > > really several splats a few minutes apart.? The exact text
> > > > > > /
> > > > > > procedure
> > > > > > calls of the first one seems to vary, but it seems to be in
> > > > > > a
> > > > > > wakeup
> > > > > > routing with "do_page_fault", "? _raw_spin_lock_irq", "?
> > > > > > timecounter_read", "? _raw_spin_lock_irqsave",
> > > > > > "igb_ptp_gettime_82576"
> > > > > > and "igb_ptp_overflow_check" showing up prominently in at
> > > > > > least
> > > > > > a
> > > > > > few
> > > > > > instances.? Usually it moves to the next trace before I can
> > > > > > get
> > > > > > a
> > > > > > snapshot.? The follow on trace is where it usually stops
> > > > > > with a
> > > > > > RIP:,
> > > > > > bunch of hex, stack info and a Call Trace saying
> > > > > > "arch_cpu_idle",
> > > > > > "default_idle_call", "cpu_startup_entry" and
> > > > > > "start_secondary"
> > > > > > called
> > > > > > out.
> > > > > Andrew thought it was with patch 3 in the series, at least
> > > > > that
> > > > > is
> > > > > what his
> > > > > initial git bisect was telling him.
> > > > > 
> > > > > I am going to go ahead and drop the entire series for now, so
> > > > > that we
> > > > > can
> > > > > work offline to resolve the issue.
> > > > Yep. I'm investigating the traces. I'll focus my initial
> > > > efforts at
> > > > patch 3.
> > > I can confirm Andrew's bisection and that the issue does not
> > > start
> > > occurring until patch 3 is included.
> > 
> > Does this just occur from loading? or does it occur after some
> > action
> > on your part?
> I initially ran into it just loading the driver and launching a
> BAT.??But after trying to figure out which section of the BAT was
> breaking it, I noticed another system I had simply loaded the driver
> on would lock up after a while as well.??I do not know if it shows up
> sooner or later if the interface is getting traffic across it, but it
> shows up either way.

Ok, so you're not doing suspend resume, and it's just loading the
driver and waiting.

Thanks,
Jake

      reply	other threads:[~2016-05-17 22:04 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-05-11 23:18 [Intel-wired-lan] [net v2 0/5] igb: fix ptp suspend/resume issue Jacob Keller
2016-05-11 23:18 ` [Intel-wired-lan] [net v2 1/5] igb: introduce ptp_flags variable and use it to replace IGB_FLAG_PTP Jacob Keller
2016-05-11 23:18 ` [Intel-wired-lan] [net v2 2/5] igb: introduce IGB_PTP_OVERFLOW_CHECK flag Jacob Keller
2016-05-11 23:18 ` [Intel-wired-lan] [net v2 3/5] igb: introduce igb_ptp_resume function Jacob Keller
2016-05-11 23:18 ` [Intel-wired-lan] [net v2 4/5] igb: implement igb_ptp_suspend Jacob Keller
2016-05-11 23:18 ` [Intel-wired-lan] [net v2 5/5] igb: call igb_ptp_suspend/igb_ptp_resume during suspend/resume cycle Jacob Keller
2016-05-17  1:57 ` [Intel-wired-lan] [net v2 0/5] igb: fix ptp suspend/resume issue Brown, Aaron F
2016-05-17  2:29   ` Jeff Kirsher
2016-05-17 20:46     ` Keller, Jacob E
2016-05-17 21:05       ` Brown, Aaron F
2016-05-17 21:54         ` Keller, Jacob E
2016-05-17 22:01           ` Brown, Aaron F
2016-05-17 22:04             ` Keller, Jacob E [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1463522683.10583.26.camel@intel.com \
    --to=jacob.e.keller@intel.com \
    --cc=intel-wired-lan@osuosl.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.