From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <47F3A600.2050308@domain.hid>
Date: Wed, 02 Apr 2008 17:28:00 +0200
From: Sebastian Smolorz
MIME-Version: 1.0
References: <20080402012645.506e53ef.Cornelius.Koepp@domain.hid>
 <47F34C0D.6090809@domain.hid> <47F37579.7080601@domain.hid>
 <47F37BF8.6000401@domain.hid>
 <2ff1a98a0804020546v5eaa8ff4q100ad4820d4ad015@domain.hid>
 <47F38365.8070008@domain.hid>
In-Reply-To: <47F38365.8070008@domain.hid>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Subject: Re: [Xenomai-core] latencys drifting into negative (Xenomai 2.4.2/2.4.3)
List-Id: "Xenomai life and development \(bug reports, patches, discussions\)"
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: Gilles Chanteperdrix
Cc: Jan Kiszka, xenomai-core, Cornelius Köpp

Sebastian Smolorz wrote:
> Gilles Chanteperdrix wrote:
>> On Wed, Apr 2, 2008 at 2:28 PM, Jan Kiszka wrote:
>>> Sebastian Smolorz wrote:
>>> > Jan Kiszka wrote:
>>> >>
>>> >> 2.3.x did not incorporate the new TSC-to-ns conversion. Maybe it is
>>> >> not a PIC vs. APIC thing, but rather a rounding problem of larger TSC
>>> >> values (that naturally show up when the system runs for a longer time).
>>> >
>>> > This hint seems to point into the right direction. I tried out a
>>> > modified pod_32.h (xnarch_tsc_to_ns() commented out) so that the old
>>> > implementation in include/asm-generic/bits/pod.h was used. The drifting
>>> > bug disappeared. So there seems so be a buggy x86-specific
>>> > implementation of this routine.
>>>
>>> Hmm, maybe even a conceptional issue: the multiply-shift-based
>>> xnarch_tsc_to_ns is not as precise as the still multiply-divide-based
>>> xnarch_ns_to_tsc. So when converting from tsc over ns back to tsc, we
>>> may loose some bits, maybe too many bits...
>> If you want to know whether llmulshft implementation is broken on x86
>> or if there is a design issue, you can attempt to use the generic
>> implementation on x86.
>>
>
> You mean not using rthal_llmulshft() in arith_32.h and instead using
> __rthal_generic_llmulshft()? I tried this and it's also suffering from
> the drift although it seems that the drift per time unit is smaller in
> the generic case.

That was only a subjective impression: the drift caused by
rthal_llmulshft() and __rthal_generic_llmulshft() is the same.

> I will try to get some numbers to compare the values
> returned from rthal_llmulshft(), __rthal_generic_llmulshft() and
> __rthal_generic_ullimd().
>

Here are some results. The latency test ran for one hour and 46 minutes
(kernel mode). I measured the difference between the return value of
__rthal_generic_ullimd(), which I considered correct, and the return
value of rthal_llmulshft(). This difference increases over time.

At start:            21 ns
After 1 minute:      50 ns
After 4 minutes:    132 ns
After 8 minutes:    238 ns
After 86 minutes:  2342 ns

The higher the TSC value, the bigger the error in the conversion of the
TSC value to nanoseconds. The converted value is too small. This
explains why latency reports decreasing values over time.

Houston, we have a problem.

--
Sebastian
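
For illustration, here is a minimal sketch of why a multiply-shift
conversion can fall behind a multiply-divide conversion as the TSC value
grows. It is not the Xenomai code: the function names are hypothetical,
the 1.5 GHz clock and the shift of 31 are assumed values chosen only for
the example, and it relies on the GCC/Clang `unsigned __int128`
extension to avoid overflow. The scale factor is truncated when it is
precomputed, so every tick is converted to slightly too few nanoseconds
and the shortfall accumulates with the tick count.

```c
#include <assert.h>
#include <stdint.h>

/* GCC/Clang extension; keeps the 64x64-bit products from overflowing. */
typedef unsigned __int128 u128;

#define NSEC_PER_SEC 1000000000ULL

/* Reference conversion (multiply-divide): floor(tsc * 1e9 / freq_hz). */
uint64_t tsc_to_ns_div(uint64_t tsc, uint64_t freq_hz)
{
    return (uint64_t)(((u128)tsc * NSEC_PER_SEC) / freq_hz);
}

/* Precomputed scale factor: floor(1e9 * 2^shift / freq_hz).
 * The truncation in this division is the source of the systematic error. */
uint64_t mulshift_factor(uint64_t freq_hz, unsigned shift)
{
    assert(freq_hz != 0 && shift <= 32);
    return (uint64_t)(((u128)NSEC_PER_SEC << shift) / freq_hz);
}

/* Fast conversion (multiply-shift): floor(tsc * mul / 2^shift). */
uint64_t tsc_to_ns_mulshift(uint64_t tsc, uint64_t mul, unsigned shift)
{
    return (uint64_t)(((u128)tsc * mul) >> shift);
}

/* Hypothetical helper: how many ns the multiply-shift result lags behind
 * the reference after `seconds` of uptime at the given clock frequency. */
uint64_t conversion_error_ns(uint64_t seconds, uint64_t freq_hz, unsigned shift)
{
    uint64_t tsc = seconds * freq_hz;
    uint64_t mul = mulshift_factor(freq_hz, shift);

    /* The truncated factor never overestimates, so this cannot underflow. */
    return tsc_to_ns_div(tsc, freq_hz) - tsc_to_ns_mulshift(tsc, mul, shift);
}
```

Calling conversion_error_ns(60, 1500000000ULL, 31) and
conversion_error_ns(5160, 1500000000ULL, 31) gives the shortfall after
1 minute and after 86 minutes under these assumed parameters. The error
grows roughly linearly with uptime and the multiply-shift value is
always the smaller one, which is the same behaviour as in the
measurements above, although the absolute numbers depend on the real
clock frequency and shift.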