* Re: Isochronous streaming with VT6315 OHCI
From: Peter Hurley @ 2014-07-17 13:16 UTC
To: Remy Bruno; +Cc: Len Brown, linux1394-devel, Rafael J. Wysocki, Linux PM list
[ +cc intel_idle maintainers ]
On 07/16/2014 02:28 PM, Remy Bruno wrote:
> A follow-up and solution!
>
> Digging further, I found that sleeping 100 us in a loop, instead of
> busy-looping, is enough to make both streams work without drops. So I
> looked more closely at what the CPU does while idle and found the
> intel_idle driver. My CPU uses states C1-IVB to C7-IVB.
> By removing the states C3-IVB to C7-IVB from the driver's list, the
> drops disappear!
>
> So my working explanation is this: if all CPU cores are in one of the
> states C3-IVB to C7-IVB at the same time, then I can get drops.
> If I force one or all cores to stay in a normal running state or in the
> states C1/C1E, then the OHCI drops disappear.
>
> I'm not familiar at all with these CPU Cn states, but I may look
> deeper into this. At least, I now have a clean solution.
>
> I noticed that the offending states are those with
> CPUIDLE_FLAG_TLB_FLUSHED. Maybe that is related to my problem; I will try
> more things tomorrow.
>
> I would be interested if anyone can comment on this. I would also like
> to know whether removing states C3-C7 from my kernel is a problem (and,
> of course, whether you think a kernel patch would be worth proposing
> once the precise cause is identified).
>
> Regards,
>
> Rémy
>
> On Wednesday 16 July at 15:59, Remy Bruno wrote:
>> First, I noticed that the latest quirk settings for this VIA VT6315 make
>> the whole firewire stack hang completely very quickly, the same way it
>> did before my rewrite of the AVS/ISO user stack (i.e. when using data
>> payloads of more than 1k), most probably because interrupts disappear.
>> So I'm not sure the latest quirk changes were such a good idea... For
>> now, I reverted to the "QUIRK_CYCLE_TIMER | QUIRK_NO_MSI" settings, as it
>> works less badly this way...
>>
>> Some more information about the issue I reported:
>> - this double streaming 16 channels 96kHz works fine with an old kernel
>> (ie old firewire stack)
>> - I noticed more or less the same behaviour with a TI XIO2200A OHCI.
>> "More or less" because the 100% CPU load trick does not fix the
>> problem so magically (even though it reduces the number of drops)
>> - for debugging purposes, I changed the skip-to-self behaviour to a
>> skip-to-next behaviour (i.e. I changed the skip address of each context
>> descriptor block from the block itself to the next block), expecting
>> to get an xferstatus for the non-transmitted packet. I could
>> get the memory dump corresponding to this skip:
>>
>> 34775280 02000008 00000000 347752D3 00000000 .........Rw4....
>> 34775290 000244A0 04080000 00000000 00000000 .D..............
>> 347752A0 00000008 347752C0 00000000 00000000 .....Rw4........
>> 347752B0 180C0400 3478B400 347752D3 841103B7 ......x4.Rw4....
>> 347752C0 20001002 D1A20490 00000000 00000000 ... ............
>>
>> 347752D0 02000008 00000000 34775314 00000000 .........Sw4....
>> 347752E0 000244A0 00080000 00000000 00000000 .D..............
>> 347752F0 180C0008 34775300 34775314 841103B8 .....Sw4.Sw4.... <- cycle 0x03B8
>> 34775300 30001002 FFFF0490 00000000 00000000 ...0............
>>
>> 34775310 02000008 00000000 34775364 00000000 ........dSw4....
>> 34775320 000244A0 04080000 00000000 00000000 .D..............
>> 34775330 00000008 34775350 00000000 00000000 ....PSw4........
>> 34775340 180C0400 3478B800 34775364 00000000 ......x4dSw4.... <- Nothing written by OHCI!!
>> 34775350 30001002 D1B60490 00000000 00000000 ...0............
>>
>> 34775360 02000008 00000000 347753B4 00000000 .........Sw4....
>> 34775370 000244A0 04080000 00000000 00000000 .D..............
>> 34775380 00000008 347753A0 00000000 00000000 .....Sw4........
>> 34775390 180C0400 3478BC00 347753B4 841103BA ......x4.Sw4.... <- cycle 0x03BA
>> 347753A0 40001002 D1CA0490 00000000 00000000 ...@............
>>
>> The skipped packet begins here at address 34775310 and we can see that
>> the status was written by the OHCI for packets before and after
>> (corresponding to cycles 0x03B8 and 0x03BA), but the OHCI didn't write
>> anything for cycle 0x03B9.
>>
>>
>> On Wednesday 16 July at 14:11, Clemens Ladisch wrote:
>>>> - the drops occur more or less in the middle of an IRQ period (typically
>>>> on the 10th or 13th cycle of a period when I use IRQ periods of 20),
>>>> so this does not seem to be a kind of kernel-going-past-the
>>>> -last-packet problem.
>>>
>>> How many periods are you using?
>>
>> I'm using 4 periods of 20 cycles
>>
>>> What happens with a period length of, say, 200 frames?
>>
>> The same thing, drops occur quickly
>>
>>>> - cherry on the cake: the drops stop completely if one of the CPU cores
>>>> (I'm using an Intel i3-3220T with 2 hyperthreaded cores) is 100%
>>>> loaded!
>>>
>>> Strange. I have seen large differences in asynchronous throughput (due
>>> to smaller interrupt wakeup latencies), but isochronous transfers should
>>> not be affected by CPU latencies as long as the queue does not underrun.
>>
>> Yes, I checked this is not an underrun. See above: the OHCI continued to
>> stream, it just skipped a cycle
>>
>>> The only difference I see is that a CPU going to sleep will flush its
>>> caches, but this should reduce the latencies when the controller is
>>> reading that data from RAM, so the effect should be the exact opposite.
>>
>> Interesting.
>> But I'm not an expert (at all) on CPU caches and the like... I
>> was not even able to find the code that the CPU executes in the kernel
>> when sleeping (even though I grepped the kernel for "idle" and "hlt"),
>> aka the "swapper" process if I'm right.
>>
>> I will continue investigations...
>>
>> Regards,
>> Rémy
>>
>> --
>> Rémy BRUNO
>> Trinnov Audio
>> remy.bruno@trinnov.com / http://www.trinnov.com
>> 5 rue Edmond Michelet, 93360 Neuilly-Plaisance, France
>> Tel: +33 (0)1 47 06 61 37
>> Mob: +33 (0)6 83 04 01 31
>>
>> ------------------------------------------------------------------------------
>> Want fast and easy access to all the code in your enterprise? Index and
>> search up to 200,000 lines of code with a free copy of Black Duck
>> Code Sight - the same software that powers the world's largest code
>> search on Ohloh, the Black Duck Open Hub! Try it now.
>> http://p.sf.net/sfu/bds
>> _______________________________________________
>> mailing list linux1394-devel@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/linux1394-devel
>
* Re: Isochronous streaming with VT6315 OHCI
From: Remy Bruno @ 2014-07-17 16:52 UTC
To: Peter Hurley
Cc: Remy Bruno, linux1394-devel, Len Brown, Linux PM list,
Rafael J. Wysocki
Hi,

Here is a summary/update of the problem and solutions I found.

This occurs with the following:
- linux 3.11.10
- firewire OHCI VIA VT6315
- motherboard DH61DL
- CPU Intel(R) Core(TM) i3-3220T CPU @ 2.80GHz (2 hyperthreaded cores,
  thus 4 virtual cores)

* The problem *

The firewire OHCI skips some cycles (roughly one every 5-10 ms) when
streaming to 2 DICEs, 16 channels at 96 kHz each. No cycle skips occur
with only one stream. I checked that this is not due to some kind of
software underrun. The OHCI simply fails to transmit one cycle
(sometimes two in a row) for some unknown reason (no xferstatus written,
skip address taken by the OHCI, streaming continues afterwards).
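
To illustrate how such a skip shows up in the descriptor dump quoted earlier in this thread, here is a minimal Python sketch. It assumes that the last quadlet of each descriptor block holds xferStatus in its upper 16 bits and a timestamp in its lower 16 bits, with the cycle count in the low 13 bits; this matches the "cycle 0x03B8" and "cycle 0x03BA" annotations in the dump but is an assumption about the layout, not something verified here. The function name is hypothetical. At 8000 isochronous cycles per second (one every 125 us), one skip every 5-10 ms corresponds to roughly one skipped cycle in every 40-80.

```python
CYCLES_PER_SECOND = 8000  # one isochronous cycle every 125 us


def find_unwritten_cycles(status_quadlets):
    """Return the cycle numbers of descriptors whose status was never written.

    Assumed layout (see the text above): bits 31..16 = xferStatus,
    bits 12..0 = cycle count. A zero xferStatus is taken to mean that
    the controller wrote nothing for that descriptor.
    """
    missing = []
    prev_cycle = None
    for quadlet in status_quadlets:
        xfer_status = quadlet >> 16
        if xfer_status == 0:
            # No status: infer the cycle from the previous written one.
            if prev_cycle is not None:
                prev_cycle = (prev_cycle + 1) % CYCLES_PER_SECOND
                missing.append(prev_cycle)
        else:
            prev_cycle = quadlet & 0x1FFF
    return missing


# Status quadlets copied from the four descriptor blocks in the dump.
dump = [0x841103B7, 0x841103B8, 0x00000000, 0x841103BA]
print([hex(c) for c in find_unwritten_cycles(dump)])  # prints ['0x3b9']
```

The sketch only infers the missing cycle from its neighbours; it cannot tell why the controller skipped it.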

* The solution *

The issue disappears completely (no cycle skips in more than 12 hours)
as soon as one core (out of the 4 virtual ones) is prevented from
entering C3 or deeper idle states. No matter which core it is, as long
as only the C1/C1E idle states are allowed on one of the cores (there is
no C2 state on this CPU), I don't get any cycle skips at all.

This can be done, for example, by running a process that busy-loops on
one core: "while (1);". It can also be done by disabling cpuidle in the
kernel (acpi and intel), or by writing "1" to the relevant
/sys/devices/system/cpu/cpuX/cpuidle/stateY/disable files.
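
The sysfs variant can be sketched in Python as follows. This is only an illustration under assumptions: the function name and the `keep` parameter are hypothetical, and it relies on each stateY directory exposing a `name` file next to `disable`, with intel_idle names of the form "C3-IVB" as mentioned earlier in the thread. Writing to these files requires root.

```python
from pathlib import Path


def disable_deep_idle_states(cpu_dir, keep=("POLL", "C1", "C1E")):
    """Write "1" to the disable file of every idle state not listed in keep.

    cpu_dir is a per-CPU sysfs directory such as
    /sys/devices/system/cpu/cpu0. Returns the names of the states
    that were disabled.
    """
    disabled = []
    for state in sorted(Path(cpu_dir, "cpuidle").glob("state[0-9]*")):
        name = (state / "name").read_text().strip()
        # Names look like "C3-IVB"; compare only the part before "-".
        if name.split("-")[0] in keep:
            continue
        (state / "disable").write_text("1")
        disabled.append(name)
    return disabled
```

Calling `disable_deep_idle_states("/sys/devices/system/cpu/cpu0")` would apply this to a single CPU, which according to the observation above is enough. The setting is not persistent across reboots.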

The reason for this is completely obscure to me. As an experiment, I
added CPUIDLE_FLAG_TLB_FLUSHED to the C1E state, but allowing that state
still caused no skips (so the problem is not due to the TLB flush that
occurs when switching to C3+ states). So, for some reason, the OHCI
(DMA?) sometimes fails to read from or write to memory when all cores
are in a C3+ state at the same time. Note that it is far from
systematic: all cores being in a C3+ state does not imply that a skip
will occur.
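
For anyone trying to reproduce this, a rough way to see how often each CPU enters the deep states is to sample the per-state counters in the same cpuidle sysfs directories. The sketch below assumes each stateY directory provides `name`, `usage` (entry count) and `time` (residency in microseconds) files; the function names are hypothetical. Note the limitation: these counters are per CPU, so they cannot show whether all cores were in a C3+ state at the same instant.

```python
from pathlib import Path


def read_idle_residency(cpu_dir):
    """Return {state name: (entry count, residency in us)} for one CPU."""
    result = {}
    for state in sorted(Path(cpu_dir, "cpuidle").glob("state[0-9]*")):
        name = (state / "name").read_text().strip()
        usage = int((state / "usage").read_text())
        time_us = int((state / "time").read_text())
        result[name] = (usage, time_us)
    return result


def residency_delta(before, after):
    """Difference between two samples taken with read_idle_residency()."""
    return {
        name: (after[name][0] - before[name][0],
               after[name][1] - before[name][1])
        for name in after
        if name in before
    }
```

Taking one sample before and one after a streaming test, then passing both to `residency_delta`, gives the number of entries and the time spent in each state during the test.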

FYI, I'm still getting cycle skips with a PCIe TI XIO2200A firewire OHCI
card. The fix above reduces the number of skips but does not eliminate
them with this card.

Best regards,
Rémy

PS: Please note for further discussion that I'm not subscribed to the
linux-pm@vger.kernel.org mailing list.

--
Rémy BRUNO
Trinnov Audio
remy.bruno@trinnov.com / http://www.trinnov.com
5 rue Edmond Michelet, 93360 Neuilly-Plaisance, France
Tel: +33 (0)1 47 06 61 37
Mob: +33 (0)6 83 04 01 31