From mboxrd@z Thu Jan 1 00:00:00 1970
From: Chunyan Liu
Subject: question about SIGSEGV in datacopier_readable in libxl_aoutil.c
Date: Tue, 3 Sep 2013 15:01:39 +0800
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="===============8480423951575893172=="
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: "xen-devel@lists.xensource.com"
List-Id: xen-devel@lists.xenproject.org

--===============8480423951575893172==
Content-Type: multipart/alternative; boundary=089e01161660b1a3e504e57542c9

--089e01161660b1a3e504e57542c9
Content-Type: text/plain; charset=UTF-8

Hi, List,

I'm trying to add migration APIs to the libvirt libxl driver. When testing HVM
migration, libxl_domain_suspend on the source side often hits a SIGSEGV in
libxl_aoutils.c: datacopier_readable, at the malloc() call:

        if (!buf || buf->used >= sizeof(buf->buf)) {
            buf = malloc(sizeof(*buf));

I suspect the heap is corrupted somehow, but I couldn't confirm the root cause.
I ran valgrind to look for clues; the following is the output right before the
SIGSEGV.

# valgrind --leak-check=full /usr/sbin/libvirtd -l -d
[snip]
==7510== Syscall param read(buf) points to unaddressable byte(s)
==7510==    at 0x8ECC76D: ??? (syscall-template.S:82)
==7510==    by 0x14AB3070: datacopier_readable (unistd.h:45)
==7510==    by 0x14AB833C: afterpoll_internal (libxl_event.c:995)
==7510==    by 0x14AB8F16: eventloop_iteration (libxl_event.c:1440)
==7510==    by 0x14AB9439: libxl__ao_inprogress (libxl_event.c:1685)
==7510==    by 0x14A9ABF7: libxl_domain_suspend (libxl.c:785)
==7510==    by 0x148404B3: libxlDomainMigratePerform3 (libxl_driver.c:5100)
==7510==    by 0x5390CFA: virDomainMigratePerform3 (libvirt.c:7050)
==7510==    by 0x12C262: remoteDispatchDomainMigratePerform3Helper (remote.c:3507)
==7510==    by 0x53EACBE: virNetServerProgramDispatch (virnetserverprogram.c:435)
==7510==    by 0x53EBCBD: virNetServerProcessMsg (virnetserver.c:165)
==7510==    by 0x53EC912: virNetServerHandleJob (virnetserver.c:186)
==7510==  Address 0x18a409ec is 0 bytes after a block of size 28 alloc'd
==7510==    at 0x4C26FFB: calloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==7510==    by 0x14AAECB6: libxl__zalloc (libxl_internal.c:83)
==7510==    by 0x14AB33B0: libxl__datacopier_prefixdata (libxl_aoutils.c:92)
==7510==    by 0x14AA6CDC: libxl__domain_save_device_model (libxl_dom.c:1447)
==7510==    by 0x14AA8097: libxl__xc_domain_save_done (libxl_dom.c:1382)
==7510==    by 0x14AB4268: helper_done (libxl_save_callout.c:332)
==7510==    by 0x14AB4CA2: helper_exited (libxl_save_callout.c:317)
==7510==    by 0x14ABB274: childproc_reaped (libxl_fork.c:264)
==7510==    by 0x14ABB97A: libxl__fork_selfpipe_woken (libxl_fork.c:300)
==7510==    by 0x14AB83A0: afterpoll_internal (libxl_event.c:1008)
==7510==    by 0x14AB8F16: eventloop_iteration (libxl_event.c:1440)
==7510==    by 0x14AB9439: libxl__ao_inprogress (libxl_event.c:1685)
==7510==
--7510-- VALGRIND INTERNAL ERROR: Valgrind received a signal 11 (SIGSEGV) - exiting
--7510-- si_code=80;  Faulting address: 0x0;  sp: 0x406ad5da0

I couldn't find a clear problem in the code, but after changing the code a
little, it started working. The change follows.

--- a/tools/libxl/libxl_aoutils.c
+++ b/tools/libxl/libxl_aoutils.c
@@ -89,7 +89,8 @@ void libxl__datacopier_prefixdata(libxl_

     assert(len < dc->maxsz - dc->used);

-    buf = libxl__zalloc(NOGC, sizeof(*buf) - sizeof(buf->buf) + len);
+//    buf = libxl__zalloc(NOGC, sizeof(*buf) - sizeof(buf->buf) + len);
+    buf = libxl__zalloc(NOGC, sizeof(libxl__datacopier_buf));
     buf->used = len;
     memcpy(buf->buf, data, len);

@@ -141,10 +142,11 @@ static void datacopier_readable(libxl__e
         libxl__datacopier_buf *buf =
             LIBXL_TAILQ_LAST(&dc->bufs, libxl__datacopier_bufs);
         if (!buf || buf->used >= sizeof(buf->buf)) {
-            buf = malloc(sizeof(*buf));
-            if (!buf) libxl__alloc_failed(CTX, __func__, 1, sizeof(*buf));
-            buf->used = 0;
-            LIBXL_TAILQ_INSERT_TAIL(&dc->bufs, buf, entry);
+            libxl__datacopier_buf *newbuf = malloc(sizeof(libxl__datacopier_buf));
+            if (!newbuf) libxl__alloc_failed(CTX, __func__, 1, sizeof(libxl__datacopier_buf));
+            newbuf->used = 0;
+            LIBXL_TAILQ_INSERT_TAIL(&dc->bufs, newbuf, entry);
+            buf = newbuf;
         }
         int r = read(ev->fd,
                      buf->buf + buf->used,

Could anybody familiar with this part of the code take a look at it?

Thanks,
Chunyan

--089e01161660b1a3e504e57542c9
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
--089e01161660b1a3e504e57542c9--

--===============8480423951575893172==
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

--===============8480423951575893172==--
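
One reading that seems consistent with the valgrind report in the message above: the 28-byte block was allocated by libxl__datacopier_prefixdata with sizeof(*buf) - sizeof(buf->buf) + len, so it holds only the header plus len data bytes, while datacopier_readable decides whether the last buffer has room by comparing buf->used against sizeof(buf->buf). A truncated buffer would then look almost empty, and a read() into it could run past the end of the allocation. The C sketch below only does the size arithmetic for that scenario. struct demo_buf, DEMO_BUFSZ, and both helper functions are hypothetical stand-ins rather than libxl code, and the read length DEMO_BUFSZ - used is an assumption, since the quoted diff cuts off before that argument.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for libxl__datacopier_buf: a small header followed
 * by a fixed-size data array. The field layout and the 1000-byte size are
 * assumptions made for this sketch. */
#define DEMO_BUFSZ 1000

struct demo_buf {
    struct demo_buf *next;      /* stands in for the TAILQ entry */
    int used;                   /* bytes of buf[] already filled */
    char buf[DEMO_BUFSZ];
};

/* Size produced by the prefixdata-style expression
 * sizeof(*buf) - sizeof(buf->buf) + len. */
static size_t truncated_alloc_size(size_t len)
{
    return sizeof(struct demo_buf) - DEMO_BUFSZ + len;
}

/* Number of bytes a reader could write past the end of an allocation of
 * alloc_size bytes when it reads DEMO_BUFSZ - used bytes to buf + used,
 * i.e. when it trusts sizeof(buf->buf) instead of the real allocation size.
 * That read length is an assumption, not taken from the quoted code. */
static size_t overrun_bytes(size_t alloc_size, size_t used)
{
    assert(used <= DEMO_BUFSZ);
    assert(alloc_size >= offsetof(struct demo_buf, buf) + used);

    /* Offset just past the last byte such a read may touch. */
    size_t write_end = offsetof(struct demo_buf, buf) + DEMO_BUFSZ;
    return write_end > alloc_size ? write_end - alloc_size : 0;
}
```

Under these assumptions, overrun_bytes(truncated_alloc_size(4), 4) is non-zero, while overrun_bytes(sizeof(struct demo_buf), 4) is 0. That would match the first hunk of the change above making the crash go away: once prefixdata allocates a full-size buffer, the capacity check in datacopier_readable describes the real allocation again.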