* Re: Problem with hwlat detector in smp_processor_id()
@ 2009-07-22 9:18 John Kacur
0 siblings, 0 replies; 6+ messages in thread
From: John Kacur @ 2009-07-22 9:18 UTC (permalink / raw)
To: linux-rt-users, Wolfgang Steinwender, Carsten Emde, tglx
Cc: linux-kernel, Clark Williams, Jon Masters, Peter Zijlstra
I've tested this patch against 2.6.29.6-rt23 and 2.6.31-rc3-rtx
and it seems to solve the problem for me. I have not tested against vanilla
2.6.31-rc3 yet, but it looks to me like it should be applied there too.
(not just as an extra -rt patch)
^ permalink raw reply [flat|nested] 6+ messages in thread
* Problem with hwlat detector in smp_processor_id()
@ 2009-07-09 11:32 Wolfgang Steinwender
2009-07-09 14:36 ` Carsten Emde
2009-07-09 14:36 ` Jon Masters
0 siblings, 2 replies; 6+ messages in thread
From: Wolfgang Steinwender @ 2009-07-09 11:32 UTC (permalink / raw)
To: linux-rt-users
Hello everyone,
I'm doing some testing here. When I run
the hardware latency test (hwlatdetect.py), I
get the following messages:
hwlat_detector: version 1.0.0
BUG: using smp_processor_id() in preemptible [00000000] code:
hwlatdetect/3755
caller is debug_sample_fread+0x138/0x1ea [hwlat_detector]
Pid: 3755, comm: hwlatdetect Tainted: G N
2.6.29.5-M-jen80-rtpae-debug #1
Call Trace:
[<c035be0b>] ? printk+0x14/0x19
[<c02403df>] debug_smp_processor_id+0xb3/0xc8
[<f855061f>] debug_sample_fread+0x138/0x1ea [hwlat_detector]
[<c020fed8>] ? security_file_permission+0x14/0x16
[<c01b8a65>] ? rw_verify_area+0x8f/0xb1
[<f85504e7>] ? debug_sample_fread+0x0/0x1ea [hwlat_detector]
[<c01b93b5>] vfs_read+0x8e/0x138
[<c01b9502>] sys_read+0x40/0x65
[<c0102cdc>] sysenter_do_call+0x12/0x2d
The message is printed for every poll.
Manually loading the hwlat module and reading "sample" produces
the same message.
I'm using a SuSE kernel with RT-Patches from j.eng (recompiled
with hwlat Module). Version should roughly be 2.6.29.5-rt20,
PAE is enabled. CPU is a dual core.
The offending source could be:
mutex_lock(&ring_buffer_mutex);
e = ring_buffer_consume(ring_buffer, smp_processor_id(), NULL);
I'm wondering, because it seems to run at least for some people.
Can it be a problem with my kernel?
Best regards,
W. Steinwender
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: Problem with hwlat detector in smp_processor_id()
2009-07-09 11:32 Wolfgang Steinwender
@ 2009-07-09 14:36 ` Carsten Emde
2009-08-10 14:07 ` Wolfgang Steinwender
2009-07-09 14:36 ` Jon Masters
1 sibling, 1 reply; 6+ messages in thread
From: Carsten Emde @ 2009-07-09 14:36 UTC (permalink / raw)
To: Wolfgang Steinwender; +Cc: linux-rt-users
On 07/09/2009 01:32 PM, Wolfgang Steinwender wrote:
> [..] When I'm trying the hardware latency test (hwlatdetect.py) I'm
> getting the following messages:
> hwlat_detector: version 1.0.0
> BUG: using smp_processor_id() in preemptible [00000000] code:
> hwlatdetect/3755
> caller is debug_sample_fread+0x138/0x1ea [hwlat_detector] [..]
Does the attached patch help?
Carsten.
[-- Attachment #2: hwlat_detector-avoid-smp_processor_id.patch --]
[-- Type: text/x-patch, Size: 849 bytes --]
Index: linux-2.6.29.5-rt22/drivers/misc/hwlat_detector.c
===================================================================
--- linux-2.6.29.5-rt22.orig/drivers/misc/hwlat_detector.c
+++ linux-2.6.29.5-rt22/drivers/misc/hwlat_detector.c
@@ -191,17 +191,11 @@ static struct sample *buffer_get_sample(
 	if (!sample)
 		return NULL;
 
-	/* ring_buffers are per-cpu but we just want any value */
-	/* so we'll start with this cpu and try others if not */
-	/* Steven is planning to add a generic mechanism */
 	mutex_lock(&ring_buffer_mutex);
-	e = ring_buffer_consume(ring_buffer, smp_processor_id(), NULL);
-	if (!e) {
-		for_each_online_cpu(cpu) {
-			e = ring_buffer_consume(ring_buffer, cpu, NULL);
-			if (e)
-				break;
-		}
+	for_each_online_cpu(cpu) {
+		e = ring_buffer_consume(ring_buffer, cpu, NULL);
+		if (e)
+			break;
 	}
 
 	if (e) {
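
As a rough illustration of what the patched loop does, here is a user-space Python model. It is not kernel code: the per-CPU ring buffers are modeled as deques, and `consume_any` is a hypothetical name chosen for this sketch. The point is only that the reader walks every online CPU's buffer in turn and never has to ask which CPU it is currently running on, which is what triggered the smp_processor_id() warning in preemptible code.

```python
from collections import deque


def consume_any(per_cpu_buffers):
    """Return (cpu, event) from the first non-empty per-CPU buffer, or None.

    Models the patched loop: try each CPU in order and stop at the first
    buffer that yields an event.
    """
    for cpu in sorted(per_cpu_buffers):
        buf = per_cpu_buffers[cpu]
        if buf:
            return cpu, buf.popleft()
    return None


# Two modeled CPUs; only CPU 1 has recorded samples.
buffers = {0: deque(), 1: deque(["sample-a", "sample-b"])}
first = consume_any(buffers)
second = consume_any(buffers)
third = consume_any(buffers)   # every buffer is empty by now
```

The model drops the "try the current CPU first" shortcut exactly as the patch does; since any sample will do, the order in which the buffers are visited does not matter.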
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: Problem with hwlat detector in smp_processor_id()
2009-07-09 14:36 ` Carsten Emde
@ 2009-08-10 14:07 ` Wolfgang Steinwender
2009-08-10 18:58 ` Clark Williams
0 siblings, 1 reply; 6+ messages in thread
From: Wolfgang Steinwender @ 2009-08-10 14:07 UTC (permalink / raw)
To: linux-rt-users; +Cc: Carsten Emde
Carsten Emde wrote:
> Does the attached patch help?
Sorry for the late reply. I have now switched to linux-2.6.29.6-rt23
(which has the patch included) and verified that the problem is
solved. Reverting the patch brings the problem back.
Now the error messages have disappeared, but I really cannot
tell if the test is doing something at all.
Here's the output from running the python script from rt-tests-50:
$> hwlatdetect --debug
debugging prints turned on
looking for modules
module path: /lib/modules/2.6.29.6-rt23-pae-debug/kernel/drivers/misc
checking
/lib/modules/2.6.29.6-rt23-pae-debug/kernel/drivers/misc/hwlat_detector.ko
not mounting debugfs
test duration is 120s
hwlatdetect: test duration 120 seconds
parameters:
Latency threshold: 10us
Sample window: 1000000us
Sample width: 500000us
Non-sampling period: 500000us
Output File: None
Starting test
Starting hardware latency detection for 120 seconds
enabling detector module
first attempt at enable
detector module enabled
disabling detector module
first attempt at disable
detector module disabled
Hardware latency detection done (0 samples)
test finished
Max Latency: 0us
Samples recorded: 0
Samples exceeding threshold: 0
not umounting debugfs
The output from the hwlat_detector module is:
hwlat_detector: version 1.0.0
To me, the output "Samples recorded: 0" means that no samples have
been read at all. Or am I misinterpreting the output?
It is also not possible for me to cat the sample entry
when the module is enabled: "strace cat sample"
just waits forever:
open("sample", O_RDONLY|O_LARGEFILE) = 3
fstat64(3, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0
read(3,
Is there anything else I can try?
Best regards,
W. Steinwender
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: Problem with hwlat detector in smp_processor_id()
2009-08-10 14:07 ` Wolfgang Steinwender
@ 2009-08-10 18:58 ` Clark Williams
0 siblings, 0 replies; 6+ messages in thread
From: Clark Williams @ 2009-08-10 18:58 UTC (permalink / raw)
To: Wolfgang Steinwender; +Cc: linux-rt-users, Carsten Emde
On Mon, 10 Aug 2009 16:07:27 +0200
Wolfgang Steinwender <wolfgang@psysteme.de> wrote:
> Carsten Emde wrote:
> > Does the attached patch help?
>
> Sorry for the late reply. I now switched to linux-2.6.29.6-rt23
> (which has the patch included) and verified that the problem is
> solved. Reverting the patch gives the problem again.
>
> Now the error messages have disappeared, but I really cannot
> tell if the test is doing something at all.
>
> Here's the output from running the python script from rt-tests-50:
> $> hwlatdetect --debug
> debugging prints turned on
> looking for modules
> module path: /lib/modules/2.6.29.6-rt23-pae-debug/kernel/drivers/misc
> checking
> /lib/modules/2.6.29.6-rt23-pae-debug/kernel/drivers/misc/hwlat_detector.ko
> not mounting debugfs
> test duration is 120s
> hwlatdetect: test duration 120 seconds
> parameters:
> Latency threshold: 10us
> Sample window: 1000000us
> Sample width: 500000us
> Non-sampling period: 500000us
> Output File: None
>
> Starting test
> Starting hardware latency detection for 120 seconds
> enabling detector module
> first attempt at enable
> detector module enabled
> disabling detector module
> first attempt at disable
> detector module disabled
> Hardware latency detection done (0 samples)
> test finished
> Max Latency: 0us
> Samples recorded: 0
> Samples exceeding threshold: 0
> not umounting debugfs
>
> The output from the hwlat_detector module is:
> hwlat_detector: version 1.0.0
>
> For me, the output "Samples recorded: 0" means that no samples have
> been read at all. Or do I misinterpret the output?
Wolfgang,
The kernel module behavior changed on me. Originally the smi_detector.ko
module just streamed sample data out, most of it being samples of zero
(meaning no gaps in time seen). When Jon re-worked it to use the
ring-buffer structure and renamed it to hwlat_detector.ko, he changed it
to provide sample data only when a sample exceeds the specified threshold.
So, long answer to a short question: yes, you interpreted the output
correctly. There were no gaps above the threshold in the TSC values
read by the sampling thread.
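
To make the change in behavior concrete, here is a small Python sketch of threshold-only reporting. It is a user-space model, not the module's code: `record_gaps` and the timestamp lists are made up for illustration, and the only thing taken from the description above is that a gap is reported only when it exceeds the threshold.

```python
def record_gaps(timestamps_us, threshold_us):
    """Return the gaps between consecutive timestamps that exceed threshold_us."""
    samples = []
    for prev, cur in zip(timestamps_us, timestamps_us[1:]):
        gap = cur - prev
        if gap > threshold_us:      # gaps at or below the threshold are dropped
            samples.append(gap)
    return samples


quiet = [0, 1, 2, 3, 4]      # steady readings, no large gap
noisy = [0, 1, 2, 40, 41]    # one 38us gap between the 3rd and 4th reading

quiet_samples = record_gaps(quiet, 10)
noisy_samples = record_gaps(noisy, 10)
```

With a 10us threshold the quiet run produces an empty list, which corresponds to the "Samples recorded: 0" line in the hwlatdetect output.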
>
> It is also not possible for me to cat the sample entry
> when the module is enabled: "strace cat sample"
> just waits forever:
> open("sample", O_RDONLY|O_LARGEFILE) = 3
> fstat64(3, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0
> read(3,
>
> Is there anything else I can try?
>
Due to the change in behavior above, the hwlatdetect python script
now opens the "sample" entry with O_NDELAY and polls that descriptor.
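
A minimal sketch of that non-blocking pattern, for anyone who wants to read the entry by hand. This is not the code from the hwlatdetect script: `read_available` is a hypothetical helper, a pipe stands in for the debugfs "sample" entry, and the payload written to it is invented. On Linux, O_NDELAY is the same flag as O_NONBLOCK.

```python
import os
import select


def read_available(fd, timeout_ms):
    """Poll fd for up to timeout_ms and return the bytes that are ready (b'' if none)."""
    poller = select.poll()
    poller.register(fd, select.POLLIN)
    if not poller.poll(timeout_ms):
        return b""                  # nothing arrived in time; the caller carries on
    try:
        return os.read(fd, 4096)
    except BlockingIOError:
        return b""


# A pipe stands in for the "sample" entry in this sketch.
rfd, wfd = os.pipe()
os.set_blocking(rfd, False)         # same effect as opening with O_NONBLOCK / O_NDELAY

empty = read_available(rfd, 50)     # no data yet: returns instead of hanging
os.write(wfd, b"12\n")              # invented payload, for illustration only
data = read_available(rfd, 50)

os.close(rfd)
os.close(wfd)
```

For the real entry the descriptor would come from something like os.open(path, os.O_RDONLY | os.O_NDELAY), where path is wherever debugfs is mounted on your system. A plain "cat sample" uses a blocking read, which is why it sits there until a sample over the threshold shows up.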
Clark
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: Problem with hwlat detector in smp_processor_id()
2009-07-09 11:32 Wolfgang Steinwender
2009-07-09 14:36 ` Carsten Emde
@ 2009-07-09 14:36 ` Jon Masters
1 sibling, 0 replies; 6+ messages in thread
From: Jon Masters @ 2009-07-09 14:36 UTC (permalink / raw)
To: Wolfgang Steinwender; +Cc: linux-rt-users
On Thu, 2009-07-09 at 13:32 +0200, Wolfgang Steinwender wrote:
> e = ring_buffer_consume(ring_buffer, smp_processor_id(), NULL);
>
> I'm wondering, because it seems to run at least for some people.
> Can it be a problem with my kernel?
Indeed. Can you post some more information - the config file as an
attachment, along with the specifics of what you recompiled?
Jon.
^ permalink raw reply [flat|nested] 6+ messages in thread