From: Keller, Jacob E <jacob.e.keller@intel.com>
To: intel-wired-lan@osuosl.org
Subject: [Intel-wired-lan] [PATCH v2 2/2] e1000e: Fix ptp time reset on network interruption
Date: Thu, 14 Apr 2016 23:25:25 +0000
Message-ID: <1460676325.28210.18.camel@intel.com>
In-Reply-To: <20160414224237.GA18429@hobbes.lan20.walshnetwork.net>

On Thu, 2016-04-14 at 18:42 -0400, Brian Walsh wrote:
> On Thu, Apr 14, 2016 at 06:21:09PM +0000, Keller, Jacob E wrote:
> > 
> > On Thu, 2016-04-14 at 11:08 -0400, Brian Walsh wrote:
> > > 
> > > On Thu, Apr 14, 2016 at 03:11:45AM +0000, Brown, Aaron F wrote:
> > > > 
> > > > > 
> > > > > From: Intel-wired-lan
> > > > > [mailto:intel-wired-lan-bounces at lists.osuosl.org] On
> > > > > Behalf Of Brian Walsh
> > > > > Sent: Tuesday, April 12, 2016 8:23 PM
> > > > > To: intel-wired-lan at lists.osuosl.org
> > > > > Subject: [Intel-wired-lan] [PATCH v2 2/2] e1000e: Fix ptp
> > > > > time reset on network interruption
> > > > > 
> > > > > Time is resetting on any interruption of network
> > > > > connectivity. This causes the clock to jump around by the
> > > > > leapsecond offset. It should only reset when the device is
> > > > > initialized.
> > > > > 
> > > > > Signed-off-by: Brian Walsh <brian@walsh.ws>
> > > > > ---
> > > > >  drivers/net/ethernet/intel/e1000e/netdev.c | 22 +++++++++++-----------
> > > > >  1 file changed, 11 insertions(+), 11 deletions(-)
> > > > > 
> > > > This patch introduces a Call Trace and panic for me on a
> > > > handful of regression systems. I usually see this on e1000e
> > > > driver load, but on one system it appeared only under traffic
> > > > stress. It seems to show up mostly on older hardware: the trace
> > > > has been spotted on a system with an 82573 LOM, and on another
> > > > system with a pair of 80003ES2LAN controllers and an add-in
> > > > 82572. The following trace was taken via a serial console from
> > > > a system with an 82574L and 82579L LOM on the board, after the
> > > > system had been running randomish netperf traffic for an hour
> > > > or so. The trace on driver load is similar to the first call
> > > > trace of this series, but the system generally did not recover
> > > > enough to print the follow-on messages:
> > > > 
> > > This patch seems to be causing issues on other systems. I am
> > > running it on about 30 units, all with the same card. I also have
> > > linuxptp running at the same time.
> > > 
> > > Would there be some other way to address the problem that I am
> > > trying to fix with this patch?
> > > 
> > > Basically, if the network connection between the device and the
> > > 1588 clock is interrupted for a period of time, the hardware clock
> > > switches from being on TAI time to thinking that the time is now
> > > UTC time. This causes the system time to fluctuate by the
> > > leap-second offset.
> > > 
> > > I was able to reproduce this problem with a 1588 clock source
> > > using IPv4 UDP by temporarily dropping UDP traffic on ports 319
> > > and 320 through iptables.
> > > 
> > > Moving the clock reset so that it happens only in initialization
> > > fixed the problem for me.
> > > 
> > > Brian
> > Moving the clock reset to initialization seems like the correct
> > behavior to me.
> > 
> > Thanks,
> > Jake
> It looks like resetting the System Time Register (SYSTIM) base
> frequency has to occur. That is why the divide-by-zero error is
> happening. timecounter_init should not need to be called anywhere
> other than initialization.
> 
> I will put together another patch and test it on my equipment and see
> if that does any better.
> 
> Brian
> 

I have a patch that should resolve your issue. I will send it to you
momentarily.

timecounter_init must occur during reset because the hardware SYSTIM
register will have been reset. However, it does NOT need to occur
during the SIOCSHWTSTAMP ioctl as it does now. I have a proposed fix;
if you could test it, that would be great.
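
To illustrate the distinction, here is a toy userspace model. It is not
the proposed patch and not driver code: every toy_* name is hypothetical,
one counter tick is assumed to equal one nanosecond, and the 36 second
TAI offset is only an illustrative value. It shows how re-initializing
the timecounter on the ioctl path throws away an offset that a PTP daemon
had applied, while re-initializing only on reset keeps it.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_NSEC_PER_SEC 1000000000ULL
/* Illustrative TAI-UTC offset; an assumption for this sketch only. */
#define TOY_TAI_OFFSET_NS (36 * TOY_NSEC_PER_SEC)

/* Hypothetical stand-in for the hardware SYSTIM counter. */
struct toy_hw { uint64_t systim; };

/* Hypothetical, simplified timecounter: 1 counter tick == 1 ns. */
struct toy_timecounter { uint64_t cycle_last; uint64_t nsec; };

/* Re-base the software clock on the current hardware counter value. */
static void toy_timecounter_init(struct toy_timecounter *tc,
                                 const struct toy_hw *hw, uint64_t start_ns)
{
    tc->cycle_last = hw->systim;
    tc->nsec = start_ns;
}

/* Advance the software clock by the ticks elapsed since the last read. */
static uint64_t toy_timecounter_read(struct toy_timecounter *tc,
                                     const struct toy_hw *hw)
{
    tc->nsec += hw->systim - tc->cycle_last;
    tc->cycle_last = hw->systim;
    return tc->nsec;
}

/* Reset path: the hardware counter is cleared, so re-init is required. */
static void toy_reset(struct toy_hw *hw, struct toy_timecounter *tc,
                      uint64_t wall_ns)
{
    hw->systim = 0;
    toy_timecounter_init(tc, hw, wall_ns);
}

/* ioctl path: the hardware counter keeps running; re-init is optional. */
static void toy_hwtstamp_ioctl(const struct toy_hw *hw,
                               struct toy_timecounter *tc,
                               uint64_t wall_ns, int reinit)
{
    if (reinit)
        toy_timecounter_init(tc, hw, wall_ns);
}

/* Returns (clock - wall time) in ns after a timestamping ioctl. */
static int64_t toy_offset_after_ioctl(int reinit_on_ioctl)
{
    struct toy_hw hw = { 0 };
    struct toy_timecounter tc = { 0, 0 };
    uint64_t wall_ns = 1000 * TOY_NSEC_PER_SEC;

    toy_reset(&hw, &tc, wall_ns);        /* device initialization */
    tc.nsec += TOY_TAI_OFFSET_NS;        /* PTP daemon steps clock to TAI */

    hw.systim += 5 * TOY_NSEC_PER_SEC;   /* five seconds pass */
    wall_ns += 5 * TOY_NSEC_PER_SEC;

    toy_hwtstamp_ioctl(&hw, &tc, wall_ns, reinit_on_ioctl);
    return (int64_t)(toy_timecounter_read(&tc, &hw) - wall_ns);
}

/* Self-check: re-init on ioctl loses the offset; skipping it keeps it. */
static void toy_self_check(void)
{
    assert(toy_offset_after_ioctl(1) == 0);
    assert(toy_offset_after_ioctl(0) == (int64_t)TOY_TAI_OFFSET_NS);
}
```

Calling toy_self_check() from a main() exercises both cases. In the
model, the reset path still has to re-base the clock because the counter
itself went back to zero; only the ioctl path can safely skip it.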

Thanks,
Jake
