From: george anzinger <george@mvista.com>
To: Per Gregers Bilse <bilse@qbfox.com>
Cc: linux-kernel@vger.kernel.org
Subject: Re: 2.4.18 clock warps 4294 seconds
Date: Mon, 29 Jul 2002 11:39:32 -0700
Message-ID: <3D458BE4.60C7FB77@mvista.com>
In-Reply-To: 200207261022.LAA08395@spirit.qbfox.com
Per Gregers Bilse wrote:
>
> On Jul 25, 10:34am, george anzinger <george@mvista.com> wrote:
> > You have the number a bit low. If I recall, this is an 800
>
> Yes, I figured that, and bumped it up a lot (to 1e9). Of course, since
> setting the trap, things have been fine, including no loss of NTP synch.-/
> Let's see over the weekend.
>
> > The first thing I would check is that you are using DMA for
> > your disc transfers. To the best of my knowledge, the
>
> Yes, both machines and both disks use DMA, and also allow interrupts
> ("unmaskirq" option) during disk transfers, here's from hdparm(8):
>
> /dev/hda:
> multcount = 16 (on)
> I/O support = 1 (32-bit)
> unmaskirq = 1 (on)
> using_dma = 1 (on)
> keepsettings = 0 (off)
> nowerr = 0 (off)
> readonly = 0 (off)
> readahead = 8 (on)
> geometry = 2434/255/63, sectors = 39102336, start = 0
>
> The only slightly unusual thing is that both machines use soft RAID,
> I don't know if that code might be doing something. But the problem
> occurred at the same time as I made an application change (debug/logging)
> that vastly -reduced- disk I/O.
>
> Anyway, I've been looking through archived log files, and found a few
> entries from the 2.4.7-10 kernel that looked interesting, here's a pair:
>
> Feb 23 04:07:52 vulpes kernel: probable hardware bug: clock timer configuration lost - probably a VIA686a motherboard.
> Feb 23 04:07:52 vulpes kernel: probable hardware bug: restoring chip configuration.
>
> Both machines indeed have identical VIA686a motherboards. The messages
> come from code in timer_interrupt() in time.c:
>
> /* read Pentium cycle counter */
>
> rdtscl(last_tsc_low);
>
> spin_lock(&i8253_lock);
> outb_p(0x00, 0x43); /* latch the count ASAP */
>
> count = inb_p(0x40); /* read the latched count */
> count |= inb(0x40) << 8;
>
> /* VIA686a test code... reset the latch if count > max */
> if (count > LATCH) {
> static int last_whine;
> outb_p(0x34, 0x43);
> outb_p(LATCH & 0xff, 0x40);
> outb(LATCH >> 8, 0x40);
> count = LATCH - 1;
> if(time_after(jiffies, last_whine))
> {
> printk(KERN_WARNING "probable hardware bug: clock timer configuration lost - probably a VIA686a motherboard.\n");
> printk(KERN_WARNING "probable hardware bug: restoring chip configuration.\n");
> last_whine = jiffies + HZ;
> }
> }
>
> spin_unlock(&i8253_lock);
>
> The "if (count > LATCH)" block has been taken out of the 2.4.18
I am not sure it was ever in the kernel in that form. Are
you sure you did not put some patch in here?
> kernel, while similar code is in do_slow_gettimeoffset() in both
> the 2.4.7-10 and 2.4.18 kernels. I'm not sufficiently familiar
> with the hardware and the code to know if this is significant,
> but it does seem that there are some known hardware bugs which
> the earlier kernel tried to address (but with limited or no success).
I wish I knew more about this hardware bug. The test
suggests that the chip is not resetting the latch on
interrupt, but rather that it just rolls over (or under).
This would cause the count to reach zero again (and,
hopefully, interrupt) in about 50 ms. On the other hand, the
chip could be switching modes, and only the "0x34" mode will
continue to interrupt without the chip being reprogrammed.
In this case, it is hard to understand how the system keeps
ANY time at all.
The above "fix" detects the count not being reset and
reprograms the chip; it does not attempt to correct for any
lost time.
>
> Anyway, let's see what happens over the weekend.
>
> Thanks.
>
> -- Per
--
George Anzinger george@mvista.com
High-res-timers:
http://sourceforge.net/projects/high-res-timers/
Real time sched: http://sourceforge.net/projects/rtsched/
Preemption patch:
http://www.kernel.org/pub/linux/kernel/people/rml