From: "Roger Pau Monné" <roger.pau@citrix.com>
To: Ian Campbell <ian.campbell@citrix.com>
Cc: Ross Lagerwall <ross.lagerwall@citrix.com>,
Andrew Cooper <andrew.cooper3@citrix.com>,
Ian Jackson <Ian.Jackson@eu.citrix.com>,
Wei Liu <wei.liu2@citrix.com>,
xen-devel@lists.xen.org
Subject: Re: [PATCH v3 6/6] tools/libxl: Adjust datacopiers POLLHUP handling when the fd is also readable
Date: Wed, 1 Apr 2015 12:34:24 +0200 [thread overview]
Message-ID: <551BC9B0.10000@citrix.com> (raw)
In-Reply-To: <1427712044.13935.284.camel@citrix.com>
Hello,
El 30/03/15 a les 12.40, Ian Campbell ha escrit:
> On Thu, 2015-03-26 at 16:20 +0100, Roger Pau Monné wrote:
>> El 16/03/15 a les 14.29, Ross Lagerwall ha escrit:
>>> From: Andrew Cooper <andrew.cooper3@citrix.com>
>>>
>>> POLLHUP|POLLIN is a valid revent to receive when there is readable data in a
>>> pipe, but the writable fd has been closed. This occurs in migration v2 when
>>> the legacy conversion process (which transforms the data inline) completes and
>>> exits successfully.
>>>
>>> In the case that there is data to read, suppress the POLLHUP. POSIX states
>>> that the hangup state is latched[1], which means it will reoccur on subsequent
>>> poll() calls. The datacopier is thus provided the opportunity to read until
>>> EOF, if possible.
>>>
>>> A POLLHUP on its own is treated exactly as before, indicating a different
>>> error with the fd.
>>>
>>> [1] http://pubs.opengroup.org/onlinepubs/9699919799/functions/poll.html
>>
>> This patch breaks booting pygrub guests on FreeBSD, it seems to trigger
>> some kind of error when pygrub exits:
>>
>> libxl: error: libxl_bootloader.c:596:bootloader_copyfail: unexpected eof copying bootloader output
>> libxl: error: libxl_create.c:1129:domcreate_rebuild_done: cannot (re-)build domain: -3
>>
>> I will try to look into it, have you tested if the same happens on
>> Linux?
>
> Ross/Andrew, have you tried this on Linux?
>
> Roger, Does xl -vvv or even strace give any clues as to the sequences of
> events which lead to the error?
Here is the output of xl -vvv, it doesn't seem to contain any useful
info apart from what I've already posted:
# xl -vvv create debian.cfg
Parsing config from debian.cfg
libxl: debug: libxl_create.c:1521:do_domain_create: ao 0x802834080: create: how=0x0 callback=0x0 poller=0x80281f0a0
libxl: debug: libxl_device.c:269:libxl__device_disk_set_backend: Disk vdev=xvda spec.backend=unknown
libxl: debug: libxl_device.c:298:libxl__device_disk_set_backend: Disk vdev=xvda, using backend phy
libxl: debug: libxl_create.c:924:initiate_domain_create: running bootloader
libxl: debug: libxl_device.c:269:libxl__device_disk_set_backend: Disk vdev=(null) spec.backend=phy
libxl: debug: libxl.c:3065:libxl__device_disk_local_initiate_attach: locally attaching PHY disk /dev/zvol/tank/debian
libxl: debug: libxl_bootloader.c:411:bootloader_disk_attached_cb: Config bootloader value: pygrub
libxl: debug: libxl_bootloader.c:427:bootloader_disk_attached_cb: Checking for bootloader in libexec path: /usr/local/lib/xen/bin/pygrub
libxl: debug: libxl_create.c:1537:do_domain_create: ao 0x802834080: inprogress: poller=0x80281f0a0, flags=i
libxl: debug: libxl_event.c:577:libxl__ev_xswatch_register: watch w=0x8028367a0 wpath=/local/domain/2 token=3/0: register slotnum=3
libxl: debug: libxl_event.c:1942:libxl__ao_progress_report: ao 0x802834080: progress report: ignored
libxl: debug: libxl_bootloader.c:539:bootloader_gotptys: executing bootloader: /usr/local/lib/xen/bin/pygrub
libxl: debug: libxl_bootloader.c:543:bootloader_gotptys: bootloader arg: /usr/local/lib/xen/bin/pygrub
libxl: debug: libxl_bootloader.c:543:bootloader_gotptys: bootloader arg: --output=/var/run/xen/bootloader.2.out
libxl: debug: libxl_bootloader.c:543:bootloader_gotptys: bootloader arg: --output-format=simple0
libxl: debug: libxl_bootloader.c:543:bootloader_gotptys: bootloader arg: --output-directory=/var/run/xen/bootloader.2.d
libxl: debug: libxl_bootloader.c:543:bootloader_gotptys: bootloader arg: /dev/zvol/tank/debian
libxl: debug: libxl_event.c:514:watchfd_callback: watch w=0x8028367a0 wpath=/local/domain/2 token=3/0: event epath=/local/domain/2
libxl: error: libxl_bootloader.c:596:bootloader_copyfail: unexpected eof copying bootloader output
libxl: debug: libxl_bootloader.c:638:bootloader_finished: bootloader completed
libxl: debug: libxl_event.c:615:libxl__ev_xswatch_deregister: watch w=0x8028367a0 wpath=/local/domain/2 token=3/0: deregister slotnum=3
libxl: error: libxl_create.c:1138:domcreate_rebuild_done: cannot (re-)build domain: -3
libxl: info: libxl.c:1699:devices_destroy_cb: forked pid 763 for destroy of domain 2
libxl: debug: libxl_event.c:1766:libxl__ao_complete: ao 0x802834080: complete, rc=-3
libxl: debug: libxl_event.c:1738:libxl__ao__destroy: ao 0x802834080: destroy
xc: debug: hypercall buffer: total allocations:44 total releases:44
xc: debug: hypercall buffer: current allocations:0 maximum allocations:2
xc: debug: hypercall buffer: cache current size:2
xc: debug: hypercall buffer: cache hits:35 misses:2 toobig:7
And this is (I think) the relevant output from ktrace/kdump:
766 xl RET read 20/0x14
766 xl CALL read(0xc,0x80280702f,0x3cd)
766 xl RET read -1 errno 35 Resource temporarily unavailable
766 xl CALL gettimeofday(0x7fffffffde90,0)
766 xl RET gettimeofday 0
766 xl CALL poll(0x802816180,0x6,0xffffffff)
766 xl RET poll 1
766 xl CALL gettimeofday(0x7fffffffde90,0)
766 xl RET gettimeofday 0
766 xl CALL write(0x10,0x802807014,0x1b)
766 xl GIO fd 16 wrote 27 bytes
0x0000 1b5b 3f31 6c1b 3e1b 5b32 343b 3148 0000 0000 000d 1b5b |.[?1l.>.[24;1H.......[|
0x0016 3f31 6c1b 3e |?1l.>|
766 xl RET write 27/0x1b
766 xl CALL gettimeofday(0x7fffffffde90,0)
766 xl RET gettimeofday 0
766 xl CALL poll(0x802816180,0x5,0xffffffff)
766 xl RET poll 1
766 xl PSIG SIGCHLD caught handler=0x80114bdf0 mask=0x0 code=CLD_EXITED
766 xl CALL sigprocmask(SIG_SETMASK,0x7fffffffd81c,0)
766 xl RET sigprocmask 0
766 xl CALL write(0xe,0x800af1fd2,0x1)
766 xl GIO fd 14 wrote 1 byte
"\0"
766 xl RET write 1
766 xl CALL sigreturn(0x7fffffffd440)
766 xl RET sigreturn JUSTRETURN
766 xl CALL gettimeofday(0x7fffffffde90,0)
766 xl RET gettimeofday 0
766 xl CALL read(0xc,0x802807014,0x3e8)
766 xl GIO fd 12 read 0 bytes
""
766 xl RET read 0
766 xl CALL write(0x2,0x7fffffffd0b0,0x7)
766 xl GIO fd 2 wrote 7 bytes
"libxl: "
766 xl RET write 7
766 xl CALL write(0x2,0x7fffffffd0b0,0x7)
766 xl GIO fd 2 wrote 7 bytes
"error: "
> Did/does commit 916735e1814d "tools/libxl: Restore errnoval behavior for
> datacopier callback" make any difference?
No, I still get the same error. I will try to take a closer look into this
when I finish some pending stuff...
Roger.
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
next prev parent reply other threads:[~2015-04-01 10:34 UTC|newest]
Thread overview: 29+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-03-16 13:29 [PATCH v3 1/6] tools/libxl: Introduce min and max macros Ross Lagerwall
2015-03-16 13:29 ` [PATCH v3 2/6] tools/libxl: Update datacopier to support sending data only Ross Lagerwall
2015-03-16 13:29 ` [PATCH v3 3/6] tools/libxl: Avoid overrunning static buffer with prefixdata Ross Lagerwall
2015-03-16 13:29 ` [PATCH v3 4/6] tools/libxl: Allow limiting amount copied by datacopier Ross Lagerwall
2015-03-18 11:12 ` Ian Campbell
2015-03-16 13:29 ` [PATCH v3 5/6] tools/libxl: Extend datacopier to support reading into a buffer Ross Lagerwall
2015-03-18 11:18 ` Ian Campbell
2015-04-01 15:53 ` Ian Jackson
2015-04-01 15:59 ` Ian Campbell
2015-03-16 13:29 ` [PATCH v3 6/6] tools/libxl: Adjust datacopiers POLLHUP handling when the fd is also readable Ross Lagerwall
2015-03-18 11:31 ` Ian Campbell
2015-03-26 15:20 ` Roger Pau Monné
2015-03-30 10:40 ` Ian Campbell
2015-04-01 10:34 ` Roger Pau Monné [this message]
2015-04-01 14:36 ` Andrew Cooper
2015-04-02 15:03 ` [PATCH 0/3] datacopier POLLHUP fixes " Ian Jackson
2015-04-02 15:04 ` [PATCH 1/3] Revert "tools/libxl: Adjust datacopiers POLLHUP handling when the fd is also readable" Ian Jackson
2015-04-02 15:08 ` Ian Campbell
2015-04-02 15:27 ` Ian Jackson
2015-04-02 15:10 ` Andrew Cooper
2015-04-02 15:11 ` Ian Jackson
2015-04-02 15:04 ` [PATCH 2/3] libxl: Cope with pipes which signal POLLHUP|POLLIN on read eof Ian Jackson
2015-04-02 15:13 ` Ian Campbell
2015-04-02 15:04 ` [PATCH 3/3] libxl: datacopier: Avoid theoretical eof/POLLHUP race Ian Jackson
2015-04-02 15:15 ` Ian Campbell
2015-04-02 16:29 ` Ian Jackson
2015-04-07 11:14 ` Roger Pau Monné
2015-04-07 12:26 ` Ian Jackson
2015-04-02 15:09 ` [PATCH 0/3] datacopier POLLHUP fixes handling when the fd is also readable Ian Jackson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=551BC9B0.10000@citrix.com \
--to=roger.pau@citrix.com \
--cc=Ian.Jackson@eu.citrix.com \
--cc=andrew.cooper3@citrix.com \
--cc=ian.campbell@citrix.com \
--cc=ross.lagerwall@citrix.com \
--cc=wei.liu2@citrix.com \
--cc=xen-devel@lists.xen.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.