Re: blktrace / relay: bad trace

From: Martin Peschke <mpeschke@linux.vnet.ibm.com>
To: linux-btrace@vger.kernel.org
Subject: Re: blktrace / relay: bad trace
Date: Mon, 09 Mar 2009 14:23:24 +0000
Message-ID: <1236608604.7354.38.camel@kitka.ibm.com>
In-Reply-To: <1236270223.10040.61.camel@kitka.ibm.com>

Hello Tom,
thanks for your reply.

On Mon, 2009-03-09 at 00:10 -0500, Tom Zanussi wrote:
> On Thu, 2009-03-05 at 17:23 +0100, Martin Peschke wrote:
> 
> It's definitely good to see hexdumps, but it's hard to tell from a
> single one - the best way to make progress would be to provide enough
> information for anyone to be able to reproduce it on more common
> hardware ie. x86/x86_64.

I run an internal I/O workload generator against an FCP disk with 4
paths. My blktrace command line:

blktrace -a issue -a complete -o - /dev/sda /dev/sdu /dev/sdao /dev/sdbi \
  | blkiomon -I 100 -b blkiomon.out

> Does it happen only on systemZ as far as
> you know (I haven't seen it on x86_64)?

Don't know yet. I am trying to reproduce the issue on my laptop... but
it's not quite the same setup.

>  Do you get dropped events every
> time it happens (which would indicate a buffer-full related condition)
> or never?

no dropped events

> Does it only happen in pipeline mode or normal to-disk
> logging?

both

> Are you using the default sub-buffer sizes or something else?

defaults

> Only logging certain event types?

issue/dispatch and complete

> Is it a recent regression, since it
> doesn't seem to have been a problem before, etc.?

no recent regression - I saw it last summer too

> It may be the relay read producing these effects, but it may also be the
> blktrace userspace buffering of that read data that has a bug.

I am quite sure it is a kernel issue. I have patched relay in order to
"poison" relay buffers - 0x55 for re-used buffers, 0x66 for padding,
0x77 for consumed buffers:

---
 kernel/relay.c |    5 +++++
 1 file changed, 5 insertions(+)

--- a/kernel/relay.c
+++ b/kernel/relay.c
@@ -735,6 +735,8 @@ size_t relay_switch_subbuf(struct rchan_
 		old_subbuf = buf->subbufs_produced % buf->chan->n_subbufs;
 		buf->padding[old_subbuf] = buf->prev_padding;
 		buf->subbufs_produced++;
+		memset((char *)buf->data + buf->offset, 0x66,
+		       buf->prev_padding);
 		if (buf->dentry)
 			buf->dentry->d_inode->i_size +=
 				buf->chan->subbuf_size -
@@ -767,6 +769,8 @@ size_t relay_switch_subbuf(struct rchan_
 	if (unlikely(length + buf->offset > buf->chan->subbuf_size))
 		goto toobig;
 
+	memset(buf->data, 0x55, buf->chan->subbuf_size);
+
 	return length;
 
 toobig:
@@ -1112,6 +1116,7 @@ static int subbuf_read_actor(size_t read
 		desc->error = -EFAULT;
 		ret = 0;
 	}
+	memset(from, 0x77, ret);
 	desc->arg.data += ret;
 	desc->written += ret;
 	desc->count -= ret;


This is what I get in user space then:

--- suspiscous pdu_len ---
magic    0x65617407
sequence 0x00003baf
time     0x5555555555555555
sector   0x5555555555555555
bytes    0x55555555
action   0x55555555
pid      0x55555555
device   0x55555555
cpu      0x55555555
error    0x5555
pdu_len  0x5555

65617407 00003baf 55555555 55555555 55555555 55555555 55555555 55555555
55555555 55555555 55555555 55555555
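
For reference, a minimal user-space sketch of how a header like this can
be checked for the poison values from the patch above. The helper name
poisoned_tail() is hypothetical and not part of blktrace. The sketch
assumes the 48 header bytes are in memory in the order they appear in
the hexdump.

```c
#include <assert.h>
#include <stddef.h>

/* Poison values written by the relay debugging patch above. */
#define POISON_REUSED   0x55	/* re-used sub-buffer */
#define POISON_PADDING  0x66	/* padding at the end of a sub-buffer */
#define POISON_CONSUMED 0x77	/* data already copied to user space */

static int is_poison(unsigned char c)
{
	return c == POISON_REUSED || c == POISON_PADDING ||
	       c == POISON_CONSUMED;
}

/*
 * Hypothetical helper: find the run of identical poison bytes that
 * ends at the last byte of [p, p + len). Returns the offset at which
 * that run starts and stores the poison value in *which. If the last
 * byte is not a poison value, returns len and stores 0.
 */
static size_t poisoned_tail(const unsigned char *p, size_t len,
			    unsigned char *which)
{
	size_t off = len;

	assert(p != NULL && which != NULL);

	*which = 0;
	if (len == 0 || !is_poison(p[len - 1]))
		return len;

	*which = p[len - 1];
	while (off > 0 && p[off - 1] == *which)
		off--;
	return off;
}
```

Applied to the 48 bytes dumped above, this reports a 0x55 run starting
at offset 8, i.e. only magic and sequence hold trace data.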

> Also I
> see that the bad trace is one that has a large pdu payload, which is cut
> off in the second trace, which could point to erroneous pdu handling in
> blktrace userspace.

Don't worry about the bad trace. It's junk anyway, because a broken
pdu_len of the previous trace made blktrace look for the next trace at a
wrong offset.
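
To illustrate why one broken pdu_len spoils everything after it: a
reader finds the next trace by skipping the fixed-size header plus
pdu_len payload bytes. The following is only a sketch under
assumptions, not blktrace's actual parser. struct trace_hdr mirrors the
field order of the dump above, walk_traces() is a hypothetical helper,
the magic is treated as a 24-bit signature plus a version byte, and the
data is assumed to be in host byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Field order and widths as in the dump above, 48 bytes in total. */
struct trace_hdr {
	uint32_t magic;
	uint32_t sequence;
	uint64_t time;
	uint64_t sector;
	uint32_t bytes;
	uint32_t action;
	uint32_t pid;
	uint32_t device;
	uint32_t cpu;
	uint16_t error;
	uint16_t pdu_len;	/* payload bytes following this header */
};

/* Assumption: upper 24 bits are the signature, low byte the version. */
#define TRACE_MAGIC      0x65617400u
#define TRACE_MAGIC_MASK 0xffffff00u

/*
 * Hypothetical helper: walk the traces in [buf, buf + len) and return
 * how many were accepted. *stop receives the offset at which walking
 * ended; it equals len only if the whole buffer parsed cleanly.
 */
static size_t walk_traces(const unsigned char *buf, size_t len,
			  size_t *stop)
{
	struct trace_hdr h;
	size_t off = 0, n = 0;

	assert(buf != NULL && stop != NULL);

	while (len - off >= sizeof(h)) {
		memcpy(&h, buf + off, sizeof(h));
		if ((h.magic & TRACE_MAGIC_MASK) != TRACE_MAGIC)
			break;		/* not a trace header */
		if (h.pdu_len > len - off - sizeof(h))
			break;		/* payload runs past the buffer */
		n++;
		/* the next header is expected right behind the payload */
		off += sizeof(h) + h.pdu_len;
	}
	*stop = off;
	return n;
}
```

With three back-to-back headers and no payload, a pdu_len of 4 in the
first header is enough to make the walk stop after one trace, because
the second header is then read 4 bytes too late and its magic no longer
matches.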


Martin
