From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:54650)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <dgilbert@redhat.com>) id 1WPtki-0002t7-W7
	for qemu-devel@nongnu.org; Tue, 18 Mar 2014 09:08:26 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <dgilbert@redhat.com>) id 1WPtkd-0007qs-0K
	for qemu-devel@nongnu.org; Tue, 18 Mar 2014 09:08:20 -0400
Received: from mx1.redhat.com ([209.132.183.28]:36990)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <dgilbert@redhat.com>) id 1WPtkc-0007qT-9Z
	for qemu-devel@nongnu.org; Tue, 18 Mar 2014 09:08:14 -0400
Received: from int-mx12.intmail.prod.int.phx2.redhat.com
	(int-mx12.intmail.prod.int.phx2.redhat.com [10.5.11.25])
	by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id s2ID8ArV011278
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK)
	for <qemu-devel@nongnu.org>; Tue, 18 Mar 2014 09:08:11 -0400
Date: Tue, 18 Mar 2014 13:08:07 +0000
From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Message-ID: <20140318130806.GB2715@work-vm>
References: <20140314201719.GJ2513@work-vm>
 <8738ifr881.fsf@elfo.mitica>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <8738ifr881.fsf@elfo.mitica>
Subject: Re: [Qemu-devel] semantics of qemu_peek_buffer ?
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: Juan Quintela <quintela@redhat.com>
Cc: qemu-devel@nongnu.org

* Juan Quintela (quintela@redhat.com) wrote:
> "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
> > Hi Juan,
> >   What are the semantics of 'qemu_peek_buffer'?
> >
> >  - is it supposed to guarantee (if there are no errors) that
> >    it will read 'size' bytes? (i.e. it should block)
> 
> 
> We are talking qemu, specifically migration.  "quarantee" is always a
> too strong word O:-)

It would be nice for people to expect it!

> Once told that, qemu_peek_buffer() should always read the amount of
> stuff it has been asked (or terun one error.

OK.

> > There are currently two users of it:
> >    * qemu_read_buffer which spins filling it's buffer up
> >      with repeated calls to qemu_peek_buffer
> 
> <if people are using grep, function in qemu_get_buffer()>
> 
> if we ask for "size" bytes, and there are less that that size, we are in trouble.
> 
> 
> >    * vmstate_subsection_load that returns if the size read
> >      doesn't match what it was expecting
> 
> This is look-ahead of "size" chars, and has all the problems that you
> can think of read-ahead of more than one char.  How is it used in
> sub-sections:
> 
> - we read that the 1st char is a subsection number (but it could not be
>   a subsection).
> - we read the size
> - we read the subsection name of that size
> 
> - we search for a subsection with that name, if it exist, we then read
>   them properly.  if it don't exist, we abort the whole subsection idea,
>   nad continue as if the subsection number hadn't exist in the 1st place.
> 
> > I can't see how both of them can be right.
> >
> > The problem I'm seeing is that in my world I've got a 
> > qemu_peek_buffer of 8 bytes, and with a repeated virt-test
> > local tcp migration it's failing about 1 in 8 times;
> > here is some debug:
> >
> > 19:51:15 INFO | [qemu output] qemu_peek_buffer refill case (pre); size=8 offset=0 index=32764 pending=4 buf_index=32764 buf_size=32768 pos=23302795
> > 19:51:15 INFO | [qemu output] qemu_fill_buffer got 1
> > 19:51:15 INFO | [qemu output] qemu_peek_buffer refill case (post); size=8 offset=0 index=0 pending=5 buf_index=0 buf_size=5 pos=23302796
> > 19:51:15 INFO | [qemu output] qemu_peek_buffer (size>pending); size=8 offset=0 index=0 pending=5 buf_index=0 buf_size=5 pos=23302796
> >
> > i.e. I asked for 8 bytes, there were 4 in the buffer, it called fill buffer, which got one
> > more byte, and thus it returned me 5.
> 
> Uh, oh.  That shouldn't happen.  could you try to change the
>    "if (pending < size)"
> to
>    "while (pending < size)"
> 
> remember that you need to handle errors on that loop?

Yes, I think that will work - I think really the loop that is in qemu_get_buffer()
needs to move up into qemu_peek_buffer() (with qemu_fill_buffer returning the
number of bytes it has read).

> BTW, what are you doing when you get that error?  It is vmstate_load or
> qemu_get_buffer()?  and for what fiel?

In my migration visitor world, in the compatibility code for the current format,
I'm using qemu_peek_buffer to peek the 8 byte header on ram pages to figure
out which type of page encoding we're using, so this means I'm calling qemu_peek_buffer
once per page (with 8 bytes peek'd each time), so it's calling it a lot.

> > I think what vmstate_subsection_load wants (and what I want) is something
> > like qemu_read_buffer but which doesn't advance it's pointer, i.e. to read
> > a header, decide it's not for me and let the next function along use it.
> 
> Problem was to fix the case when there is less that <size> bytes into
> the buffer.  i.e. we start to search for a string that is bigger than
> the remaining (already read) buffer.
> 
> > vmstate_subsection_load doesn't look like it flags an error if it
> > doesn't read enough; I guess the effect will be just to fail to 
> > load a migration in some interesting way.
> 
> Error checking is missing there.  If we don't get enough space, we
> really want to fail the subsection read.

Yes, I think it's possible this code could hit the problem I've seen,
but it's much less likely because there are a lot less subsections
than there are pages.
I think the kernel can return only a few bytes, possibly because when
qemu_fill_buffer fills the buffer it can read a weird number of bytes
depending how much buffer space is left; so the kernel could have a few
left for the next read.

Dave
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK