blktrace / relay: bad trace

linux-btrace.vger.kernel.org archive mirror

* blktrace / relay: bad trace
@ 2009-03-05 16:23 Martin Peschke
  2009-03-09  5:10 ` Tom Zanussi
                   ` (3 more replies)
  0 siblings, 4 replies; 6+ messages in thread
From: Martin Peschke @ 2009-03-05 16:23 UTC (permalink / raw)
  To: linux-btrace

Hi,

I keep running into bad traces, at least on System z.

(See http://marc.info/?l=linux-btrace&m=122709472202537&w=2 for an
earlier post).

Looking at the data as read by blktrace from relay files,
I found fragments of old traces partially overlaying other traces.

In the hexdumped trace below, I have added "|"s as delimiters between
traces. The corrupted part of the trace - a fragment containing the
sequence number 0x4d1d3 - is in parentheses.

12d8ca0 |6561 7407 0005 6440 0000 00fb ef9d 6496  -> sequence 0x56440
12d8cb0  0000 0000 000f 4790 0002 0000 4001 0011
12d8cc0  0000 1445 0080 0020 0000 0001 0000 0018
12d8cd0  0000 0001 0001 007d 0000 0000 0001 3e40
12d8ce0  0000 0000 001b badc|6561 7407 0005 6441  -> sequence 0x56441
12d8cf0  0000 00fb ef9d 79e3 0000 0000 000f 4790
12d8d00  0002 0000 0181 0008 0000 1445 0080 0020
12d8d10  0000 0001 0000 0000|6561 7407 0005 6442  -> sequence 0x56442
12d8d20 (0000 1444 0080 0020 0000 0001 0000 0000
12d8d30 |6561 7407 0004 d1d3 0000 00e3 4a8b 1544  -> sequence 0x4d1d3
12d8d40  0000 0000 0015 5728 0003 0000 4001 0011     old trace!!
12d8d50  0000 1447 0080 0020 0000 0001 0000 0018)
12d8d60 |6561 7407 0005 6443 0000 00fb f04d c32a  -> sequence 0x56443
12d8d70  0000 0000 000f 4928 0000 8000 4001 0011
12d8d80  0000 1445 0080 0020 0000 0001 0000 0018
12d8d90  0000 0001 0001 007f 0000 0000 0000 55d8
12d8da0  0000 0000 0008 1070|6561 7407 0005 6444  -> sequence 0x56444
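
A minimal sketch of how one record header in the dump above can be decoded
by hand. The field order and widths follow the header printout that appears
later in this thread (magic, sequence, time, sector, bytes, action, pid,
device, cpu, error, pdu_len). Big-endian byte order is an assumption based
on the dump coming from System z, and parse_header is a hypothetical helper
name, not part of blktrace.

```python
import struct

# 48-byte header: 2 x u32, 2 x u64, 5 x u32, 2 x u16.
# ">" selects big-endian without padding (assumed for this System z dump).
HEADER = struct.Struct(">IIQQIIIIIHH")
FIELDS = ("magic", "sequence", "time", "sector", "bytes", "action",
          "pid", "device", "cpu", "error", "pdu_len")


def parse_header(hex_words):
    """Decode one trace header from hexdump-style 16-bit words."""
    raw = bytes.fromhex("".join(hex_words.split()))
    return dict(zip(FIELDS, HEADER.unpack(raw[:HEADER.size])))


# Trace 0x56441, copied from between the "|" delimiters in the dump above.
good = parse_header("""
    6561 7407 0005 6441 0000 00fb ef9d 79e3 0000 0000 000f 4790
    0002 0000 0181 0008 0000 1445 0080 0020 0000 0001 0000 0000
""")

for name in FIELDS:
    print(f"{name:8s} 0x{good[name]:x}")
```

Decoded this way, trace 0x56441 has a pdu_len of 0, so the next header
follows immediately, which matches the "|" delimiter placed after it.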

The same fragment containing sequence 0x4d1d3 originally appeared in
this context:

10d8d70  0000 0000 0013 970e|6561 7407 0004 d1d2  -> sequence 0x4d1d2
10d8d80  0000 00e3 4a82 8773 0000 0000 0039 4d20
10d8d90  0001 0000 0181 0008(0000 1444 0080 0020
10d8da0  0000 0001 0000 0000|6561 7407 0004 d1d3  -> sequence 0x4d1d3
10d8db0  0000 00e3 4a8b 1544 0000 0000 0015 5728
10d8dc0  0003 0000 4001 0011 0000 1447 0080 0020
10d8dd0  0000 0001 0000 0018)0000 0001 0001 0080
10d8de0  0000 0000 0001 156c 0000 0000 001a 10c8
10d8df0 |6561 7407 0004 d1d4 0000 00e3 4a8b 239c  -> sequence 0x4d1d4
10d8e00  0000 0000 0015 5728 0003 0000 0181 0008
10d8e10  0000 1447 0080 0020 0000 0001 0000 0000
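
One cheap way to spot such a stale fragment is to look for sequence numbers
that go backwards. The sketch below assumes the records come from a single
stream in which sequence numbers only ever increase; find_regressions is a
hypothetical helper, and the input list is simply the sequence numbers
annotated in the first dump.

```python
def find_regressions(sequences):
    """Return (index, previous, current) for every sequence number that is
    not larger than the one before it."""
    return [(i, prev, cur)
            for i, (prev, cur) in enumerate(zip(sequences, sequences[1:]), 1)
            if cur <= prev]


# Sequence numbers in the order they appear in the first dump.
observed = [0x56440, 0x56441, 0x56442, 0x4d1d3, 0x56443, 0x56444]

for index, prev, cur in find_regressions(observed):
    print(f"record {index}: 0x{cur:x} follows 0x{prev:x}")
```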

Looks like a kernel issue to me, blktrace or relay.

Is there anything I can do to help fix this?
Which debug data would be needed?
Any idea on what should be done next?

Could this issue be caused by some race in __blk_add_trace()?
I don't see one, though...

Or, could it be related to relay subbuffer switching, padding etc.?

I am using a recent git-kernel and a recent blktrace (v2).

Thanks,
Martin


* Re: blktrace / relay: bad trace
  2009-03-05 16:23 blktrace / relay: bad trace Martin Peschke
@ 2009-03-09  5:10 ` Tom Zanussi
  2009-03-09 14:23 ` Martin Peschke
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 6+ messages in thread
From: Tom Zanussi @ 2009-03-09  5:10 UTC (permalink / raw)
  To: linux-btrace

Hi,

On Thu, 2009-03-05 at 17:23 +0100, Martin Peschke wrote:
> Hi,
> 
> I keep running into bad traces, at least on System z.
> 
> (See http://marc.info/?l=linux-btrace&m=122709472202537&w=2 for an
> earlier post).
> 
> Looking at the data as read by blktrace from relay files,
> I found fragments of old traces partially overlaying other traces.
> 
> In the hexdump-ed trace below, I have added "|"s as delimiters between
> traces. The corrupted part of the trace - a fragment containing the
> sequence number 0x4d1d3 - is in parenthesis.
> 
> 12d8ca0 |6561 7407 0005 6440 0000 00fb ef9d 6496  -> sequence 0x56440
> 12d8cb0  0000 0000 000f 4790 0002 0000 4001 0011
> 12d8cc0  0000 1445 0080 0020 0000 0001 0000 0018
> 12d8cd0  0000 0001 0001 007d 0000 0000 0001 3e40
> 12d8ce0  0000 0000 001b badc|6561 7407 0005 6441  -> sequence 0x56441
> 12d8cf0  0000 00fb ef9d 79e3 0000 0000 000f 4790
> 12d8d00  0002 0000 0181 0008 0000 1445 0080 0020
> 12d8d10  0000 0001 0000 0000|6561 7407 0005 6442  -> sequence 0x56442
> 12d8d20 (0000 1444 0080 0020 0000 0001 0000 0000
> 12d8d30 |6561 7407 0004 d1d3 0000 00e3 4a8b 1544  -> sequence 0x4d1d3
> 12d8d40  0000 0000 0015 5728 0003 0000 4001 0011     old trace!!
> 12d8d50  0000 1447 0080 0020 0000 0001 0000 0018)
> 12d8d60 |6561 7407 0005 6443 0000 00fb f04d c32a  -> sequence 0x56443
> 12d8d70  0000 0000 000f 4928 0000 8000 4001 0011
> 12d8d80  0000 1445 0080 0020 0000 0001 0000 0018
> 12d8d90  0000 0001 0001 007f 0000 0000 0000 55d8
> 12d8da0  0000 0000 0008 1070|6561 7407 0005 6444  -> sequence 0x56444
> 
> The same fragment containing sequence 0x4d1d3 originally appeared in
> this context:
> 
> 10d8d70  0000 0000 0013 970e|6561 7407 0004 d1d2  -> sequence 0x4d1d2
> 10d8d80  0000 00e3 4a82 8773 0000 0000 0039 4d20
> 10d8d90  0001 0000 0181 0008(0000 1444 0080 0020
> 10d8da0  0000 0001 0000 0000|6561 7407 0004 d1d3  -> sequence 0x4d1d3
> 10d8db0  0000 00e3 4a8b 1544 0000 0000 0015 5728
> 10d8dc0  0003 0000 4001 0011 0000 1447 0080 0020
> 10d8dd0  0000 0001 0000 0018)0000 0001 0001 0080
> 10d8de0  0000 0000 0001 156c 0000 0000 001a 10c8
> 10d8df0 |6561 7407 0004 d1d4 0000 00e3 4a8b 239c  -> sequence 0x4d1d4
> 10d8e00  0000 0000 0015 5728 0003 0000 0181 0008
> 10d8e10  0000 1447 0080 0020 0000 0001 0000 0000
> 
> Looks like a kernel issue to me, blktrace or relay.
> 
> Is there anything I can do in order to help fixing this?
> Which debug data would be needed?
> Any idea on what should be done next?
> 
> Could this issue be caused by some race in __blk_add_trace()?
> I don't see one, though...
> 
> Or, could it be related to relay subbuffer switching, padding etc.?
> 

It's definitely good to see hexdumps, but it's hard to tell from a
single one. The best way to make progress would be to provide enough
information for anyone to be able to reproduce it on more common
hardware, i.e. x86/x86_64.  Does it happen only on System z as far as
you know (I haven't seen it on x86_64)?  Do you get dropped events every
time it happens (which would indicate a buffer-full related condition)
or never?  Does it happen only in pipeline mode, or also in normal
to-disk logging?  Are you using the default sub-buffer sizes or
something else?  Are you only logging certain event types?  Is it a
recent regression, since it doesn't seem to have been a problem before?

From looking at the trace, it could be related to some garbage being
erroneously read from padding that could contain old event data that
should be skipped over, but it doesn't happen on a sub-buffer boundary
as I'd expect in that case.  Also I notice that the distance between the
original event and its second, stale copy is almost exactly 2 MB, which is
interesting, but hard to know if it's significant without more examples.
It may be the relay read producing these effects, but it may also be the
blktrace userspace buffering of that read data that has a bug.  Also I
see that the bad trace is one that has a large pdu payload, which is cut
off in the second trace, which could point to erroneous pdu handling in
blktrace userspace.
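
The distance mentioned above can be checked with a few lines of arithmetic.
The two offsets are read off the hexdumps in the original report (the magic
of the 0x4d1d3 header in each dump). The buffer geometry of 4 sub-buffers of
512 KiB per CPU is an assumption about the blktrace defaults and is not
stated anywhere in this thread.

```python
# Offsets of the 0x4d1d3 header magic in the two hexdumps. In the second
# dump the header starts 8 bytes into the 16-byte line at 0x10d8da0.
stale_copy = 0x12d8d30
original = 0x10d8da0 + 8

# Assumed defaults: 4 sub-buffers of 512 KiB each per CPU.
subbuf_size = 512 * 1024
n_subbufs = 4
ring_size = subbuf_size * n_subbufs

distance = stale_copy - original
print(f"distance  = {distance} bytes (0x{distance:x})")
print(f"ring size = {ring_size} bytes (0x{ring_size:x})")
print(f"shortfall = {ring_size - distance} bytes")
```

Under that assumption the stale copy sits 120 bytes short of one full lap
of the per-CPU buffer. File offsets do not map directly onto buffer
offsets, so this is only a hint, not proof of where the old data came from.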

There are several moving parts here, both in kernel and userspace, that
could be contributing to the problem - it's hard to tell which one
without more data and/or the ability to reproduce it and probably
supplement with some ad hoc tracing from the event write/read paths.

Tom

> I am using a recent git-kernel and a recent blktrace (v2).
> 
> Thanks,
> Martin
> 


* Re: blktrace / relay: bad trace
  2009-03-05 16:23 blktrace / relay: bad trace Martin Peschke
  2009-03-09  5:10 ` Tom Zanussi
@ 2009-03-09 14:23 ` Martin Peschke
  2009-03-10 17:19 ` Alan D. Brunelle
  2009-03-11 11:59 ` Martin Peschke
  3 siblings, 0 replies; 6+ messages in thread
From: Martin Peschke @ 2009-03-09 14:23 UTC (permalink / raw)
  To: linux-btrace

Hello Tom,
thanks for your reply.

On Mon, 2009-03-09 at 00:10 -0500, Tom Zanussi wrote:
> On Thu, 2009-03-05 at 17:23 +0100, Martin Peschke wrote:
> 
> It's definitely good to see hexdumps, but it's hard to tell from a
> single one - the best way to make progress would be to provide enough
> information for anyone to be able to reproduce it on more common
> hardware ie. x86/x86_64.

I run an internal I/O workload generator against an FCP disk with 4
paths. My blktrace command line:

blktrace -a issue -a complete -o - /dev/sda /dev/sdu /dev/sdao /dev/sdbi \
  | blkiomon -I 100 -b blkiomon.out

> Does it only happen only on systemZ as far as
> you know (I haven't seen it on x86_64)?

Don't know yet. I am trying to reproduce the issue on my laptop... but
it's not quite the same setup.

>  Do you get dropped events every
> time it happens (which would indicate a buffer-full related condition)
> or never?

no dropped events

> Does it only happen in pipeline mode or normal to-disk
> logging?

both

> Are you using the default sub-buffer sizes or something else?

defaults

> Only logging certain event types?

issue/dispatch and complete

> Is it a recent regression, since it
> doesn't seem to have been a problem before, etc.?

no recent regression - I saw it last summer too

> It may be the relay read producing these effects, but it may also be the
> blktrace userspace buffering of that read data that has a bug.

I am quite sure it is a kernel issue. I have patched relay in order to
"poison" relay buffers - 0x55 for re-used buffers, 0x66 for padding,
0x77 for consumed buffers:

---
 kernel/relay.c |    5 +++++
 1 file changed, 5 insertions(+)

--- a/kernel/relay.c
+++ b/kernel/relay.c
@@ -735,6 +735,8 @@ size_t relay_switch_subbuf(struct rchan_
 		old_subbuf = buf->subbufs_produced % buf->chan->n_subbufs;
 		buf->padding[old_subbuf] = buf->prev_padding;
 		buf->subbufs_produced++;
+		memset((char *)buf->data + buf->offset, 0x66,
+		       buf->prev_padding);
 		if (buf->dentry)
 			buf->dentry->d_inode->i_size +=
 				buf->chan->subbuf_size -
@@ -767,6 +769,8 @@ size_t relay_switch_subbuf(struct rchan_
 	if (unlikely(length + buf->offset > buf->chan->subbuf_size))
 		goto toobig;
 
+	memset(buf->data, 0x55, buf->chan->subbuf_size);
+
 	return length;
 
 toobig:
@@ -1112,6 +1116,7 @@ static int subbuf_read_actor(size_t read
 		desc->error = -EFAULT;
 		ret = 0;
 	}
+	memset(from, 0x77, ret);
 	desc->arg.data += ret;
 	desc->written += ret;
 	desc->count -= ret;


This is what I get in user space then:

--- suspiscous pdu_len ---
magic    0x65617407
sequence 0x00003baf
time     0x5555555555555555
sector   0x5555555555555555
bytes    0x55555555
action   0x55555555
pid      0x55555555
device   0x55555555
cpu      0x55555555
error    0x5555
pdu_len  0x5555

65617407 00003baf 55555555 55555555 55555555 55555555 55555555 55555555
55555555 55555555 55555555 55555555
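
A small sketch for classifying poison bytes in such a record. The three
byte values and their meanings are taken from the patch description above;
poison_runs and the min_len threshold are hypothetical, chosen only for
illustration.

```python
# Poison values from the relay debugging patch described above.
POISON = {0x55: "re-used buffer", 0x66: "padding", 0x77: "consumed buffer"}


def poison_runs(raw, min_len=4):
    """Return (offset, length, label) for each run of a poison byte that is
    at least min_len bytes long."""
    runs, start = [], 0
    while start < len(raw):
        end = start
        while end < len(raw) and raw[end] == raw[start]:
            end += 1
        if raw[start] in POISON and end - start >= min_len:
            runs.append((start, end - start, POISON[raw[start]]))
        start = end
    return runs


# The 48 header bytes shown above.
dump = """
65617407 00003baf 55555555 55555555 55555555 55555555 55555555 55555555
55555555 55555555 55555555 55555555
"""
header = bytes.fromhex("".join(dump.split()))

for offset, length, label in poison_runs(header):
    print(f"offset {offset}: {length} bytes of {label} poison")
```

For the record above this reports a single run: everything after the first
8 bytes (magic and sequence) carries the 0x55 pattern.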

> Also I
> see that the bad trace is one that has a large pdu payload, which is cut
> off in the second trace, which could point to erroneous pdu handling in
> blktrace userspace.

Don't worry about the bad trace. It's junk anyway, because a broken
pdu_len of the previous trace made blktrace look for the next trace at a
wrong offset.
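
The knock-on effect of a broken pdu_len can be shown with synthetic data.
This is only a sketch of a reader that advances by header size plus
pdu_len; it is not the blktrace reader itself. The record builder, the
big-endian layout and the stop-on-bad-magic policy are all assumptions made
for the example.

```python
import struct

HEADER = struct.Struct(">IIQQIIIIIHH")  # 48-byte header, big-endian assumed
MAGIC = 0x65617407                       # magic value seen in the dumps


def walk(stream):
    """Yield (offset, sequence, ok) per record, following pdu_len."""
    offset = 0
    while offset + HEADER.size <= len(stream):
        fields = HEADER.unpack_from(stream, offset)
        magic, sequence, pdu_len = fields[0], fields[1], fields[-1]
        ok = magic == MAGIC
        yield offset, sequence, ok
        if not ok:
            break  # lost sync; stop instead of guessing
        offset += HEADER.size + pdu_len


def record(sequence, pdu=b"", pdu_len=None):
    """Build a synthetic record; pdu_len can be overridden to corrupt it."""
    length = len(pdu) if pdu_len is None else pdu_len
    return HEADER.pack(MAGIC, sequence, 0, 0, 0, 0, 0, 0, 0, 0, length) + pdu


clean = record(1, b"\0" * 24) + record(2) + record(3)
broken = record(1, b"\0" * 24, pdu_len=8) + record(2) + record(3)

print(list(walk(clean)))
print(list(walk(broken)))
```

In the broken stream the first record claims 8 payload bytes while 24 are
present, so the reader lands inside the payload and finds no magic there.
Everything it decodes from that point on is junk, even though the following
records are intact.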


Martin



* Re: blktrace / relay: bad trace
  2009-03-05 16:23 blktrace / relay: bad trace Martin Peschke
  2009-03-09  5:10 ` Tom Zanussi
  2009-03-09 14:23 ` Martin Peschke
@ 2009-03-10 17:19 ` Alan D. Brunelle
  2009-03-11 11:59 ` Martin Peschke
  3 siblings, 0 replies; 6+ messages in thread
From: Alan D. Brunelle @ 2009-03-10 17:19 UTC (permalink / raw)
  To: linux-btrace

Hi Martin -

What version of the kernel are you running on? I'm experiencing some bad
stuff the last couple of days, and it's in the blktrace/relay arena:
http://lkml.org/lkml/2009/3/10/331 - this is with 2.6.29-rc[67]...

For some reason the first e-mail isn't showing up correctly on LKML (I
got it via e-mail OK), anyways, that stack looked like:

------------[ cut here ]------------
kernel BUG at mm/slab.c:3002!
invalid opcode: 0000 [#1] SMP
last sysfs file: /sys/devices/system/cpu/cpu15/cache/index2/shared_cpu_map
CPU 6
Modules linked in: xfs exportfs fuse ext2 loop dm_mod sd_mod crc_t10dif
bnx2 ipmi_si sg qla2xxx shpchp scsi_transport_fc sr_mod rtc_cmos button
container ipmi_msghandler hpilo hpwdt rtc_core pci_hotplug pcspkr
rtc_lib cdrom scsi_tgt serio_raw usbhid hid ehci_hcd uhci_hcd ohci_hcd
usbcore edd ext3 mbcache jbd fan ide_pci_generic amd74xx ide_core
pata_amd thermal processor thermal_sys hwmon cciss ata_generic libata
scsi_mod
Pid: 11346, comm: blktrace Tainted: G    B      2.6.29-rc7 #3 ProLiant
DL585 G5
RIP: 0010:[<ffffffff802c5099>]  [<ffffffff802c5099>]
cache_alloc_refill+0x107/0x229
RSP: 0018:ffff88081384d9e8  EFLAGS: 00010046
RAX: 0000000000000070 RBX: ffff88187fc01340 RCX: 0000000000000015
RDX: ffff88187c032000 RSI: ffff88187c682000 RDI: ffff88187fc01350
RBP: ffff88081384da28 R08: ffff88187fc01360 R09: 00000000000000d2
R10: ffff8817f4b9eabf R11: 000000000000000a R12: ffff88187c762c00
R13: 0000000000000027 R14: ffff88087fc00040 R15: 00000000000492d0
FS:  00007f3b2d6806f0(0000) GS:ffff88187c7671c0(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00007f3b2d022f30 CR3: 000000183c883000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process blktrace (pid: 11346, threadinfo ffff88081384c000, task
ffff88082e5ae140)
Stack:
 ffff88081384da78 ffffffff802b7061 000000021384da18 0000000000000002
 ffff88087fc00040 00000000000080d0 0000000000000292 ffff88181f992ec0
 ffff88081384da68 ffffffff802c4cb1 0000000077c6c910 ffff88187a89fc80
Call Trace:
 [<ffffffff802b7061>] ? alloc_vmap_area+0x1fe/0x211
 [<ffffffff802c4cb1>] kmem_cache_alloc_node+0x9a/0xe6
 [<ffffffff80289a49>] ? relay_open_buf+0x9f/0x23c
 [<ffffffff802c56a2>] __kmalloc_node+0x43/0x45
 [<ffffffff802b79af>] __vmalloc_area_node+0x76/0x14b
 [<ffffffff80289a49>] ? relay_open_buf+0x9f/0x23c
 [<ffffffff802b7b00>] __vmalloc_node+0x7c/0x8c
 [<ffffffff80289a49>] ? relay_open_buf+0x9f/0x23c
 [<ffffffff802b7c34>] vmalloc+0x1f/0x21
 [<ffffffff80289a49>] relay_open_buf+0x9f/0x23c
 [<ffffffff8028a4b3>] relay_open+0x144/0x218
 [<ffffffff8036a643>] do_blk_trace_setup+0x1a4/0x59b
 [<ffffffff8036aa7e>] blk_trace_setup+0x44/0x75
 [<ffffffff8036ad56>] blk_trace_ioctl+0x9a/0xcf
 [<ffffffff802d4685>] ? path_put+0x2c/0x30
 [<ffffffff80361dd8>] blkdev_ioctl+0x803/0x853
 [<ffffffff802d615b>] ? putname+0x30/0x39
 [<ffffffff802d80be>] ? user_path_at+0x5d/0x8c
 [<ffffffff802e2e67>] ? mntput_no_expire+0x31/0x18f
 [<ffffffff802d4685>] ? path_put+0x2c/0x30
 [<ffffffff802f10f3>] block_ioctl+0x38/0x3c
 [<ffffffff802d9690>] vfs_ioctl+0x2a/0x78
 [<ffffffff802d9b24>] do_vfs_ioctl+0x446/0x482
 [<ffffffff8024ff46>] ? do_sigaction+0x166/0x187
 [<ffffffff802d9bb5>] sys_ioctl+0x55/0x77
 [<ffffffff8020c42a>] system_call_fastpath+0x16/0x1b
Code: 00 00 00 48 8b 33 48 39 de 75 14 48 8b 73 20 c7 43 60 01 00 00 00
4c 39 c6 0f 84 a6 00 00 00 8b 46 20 41 3b 86 18 10 00 00 72 33 <0f> 0b
eb fe ff c0 41 8b 0c 24 41 8b 96 0c 10 00 00 89 46 20 8b
RIP  [<ffffffff802c5099>] cache_alloc_refill+0x107/0x229
 RSP <ffff88081384d9e8>
hpwdt: An NMI occurred, but unable to determine source.
hpwdt: An NMI occurred, but unable to determine source.
hpwdt: An NMI occurred, but unable to determine source.
hpwdt: An NMI occurred, but unable to determine source.
hpwdt: An NMI occurred, but unable to determine source.
hpwdt: An NMI occurred, but unable to determine source.
hpwdt: An NMI occurred, but unable to determine source.
hpwdt: An NMI occurred, but unable to determine source.
hpwdt: An NMI occurred, but unable to determine source.
hpwdt: An NMI occurred, but unable to determine source.
hpwdt: An NMI occurred, but unable to determine source.
hpwdt: An NMI occurred, but unable to determine source.
hpwdt: An NMI occurred, but unable to determine source.
hpwdt: An NMI occurred, but unable to determine source.
hpwdt: An NMI occurred, but unable to determine source.
Initializing cgroup subsys cpuset
Initializing cgroup subsys cpu
Linux version 2.6.29-rc7 (root@seatpost) (gcc version 4.3.2
[gcc-4_3-branch revision 141291] (SUSE Linux) ) #3 SMP Tue Mar 10
10:15:07 EDT 2009
Command line: root=/dev/cciss/c0d2p3 text resume=/dev/cciss/c0d2p2
vga=0x317 console=ttyS1,115200N8 elevator=deadline sysrq=1 reset_devices
irqpoll maxcpus=1  memmap=exactmap memmap=640K@0K memmap=130412K@17024K
elfcorehdr=147436K memmap=32K#2095416K
KERNEL supported cpus:
  Intel GenuineIntel
  AMD AuthenticAMD
  Centaur CentaurHauls

Alan


* Re: blktrace / relay: bad trace
  2009-03-05 16:23 blktrace / relay: bad trace Martin Peschke
                   ` (2 preceding siblings ...)
  2009-03-10 17:19 ` Alan D. Brunelle
@ 2009-03-11 11:59 ` Martin Peschke
  2009-03-11 12:03   ` Alan D. Brunelle
  3 siblings, 1 reply; 6+ messages in thread
From: Martin Peschke @ 2009-03-11 11:59 UTC (permalink / raw)
  To: linux-btrace

Hi Alan,
I get bad data from relay in user space. I don't see a crash similar to
the one you have posted. My issue appears to be older anyway.
The last kernel I have tried out is 2.6.29-rc7.

Martin




* Re: blktrace / relay: bad trace
  2009-03-11 11:59 ` Martin Peschke
@ 2009-03-11 12:03   ` Alan D. Brunelle
  0 siblings, 0 replies; 6+ messages in thread
From: Alan D. Brunelle @ 2009-03-11 12:03 UTC (permalink / raw)
  To: linux-s390, linux-btrace

Martin Peschke wrote:
> Hi Alan,
> I get bad data from relay in user space. I don't see a crash similar to
> the one you have posted. My issue appears to be older anyway.
> The last kernel I have tried out is 2.6.29-rc7.
> 
> Martin

Looking like a false alarm on my part anyhow: bad firmware leading to
memory corruption(!) - amazing, but seemingly true...

Alan
