* Tagging Xen 4.0.0 first release candidate
@ 2010-01-05 6:42 Keir Fraser
2010-01-05 8:56 ` Jan Beulich
` (2 more replies)
0 siblings, 3 replies; 22+ messages in thread
From: Keir Fraser @ 2010-01-05 6:42 UTC (permalink / raw)
To: xen-devel
I plan to tag -rc1 later this week. If you have any outstanding patches,
please send them to the list now.
-- Keir
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Tagging Xen 4.0.0 first release candidate
2010-01-05 6:42 Tagging Xen 4.0.0 first release candidate Keir Fraser
@ 2010-01-05 8:56 ` Jan Beulich
2010-01-05 9:06 ` Keir Fraser
2010-01-06 13:50 ` Tagging Xen 4.0.0 first release candidate / pygrub dom0 caching bug Pasi Kärkkäinen
2010-01-06 15:00 ` Tagging Xen 4.0.0 first release candidate Christian Tramnitz
2 siblings, 1 reply; 22+ messages in thread
From: Jan Beulich @ 2010-01-05 8:56 UTC (permalink / raw)
To: Keir Fraser; +Cc: xen-devel
>>> Keir Fraser <keir.fraser@eu.citrix.com> 05.01.10 07:42 >>>
>I plan to tag -rc1 later this week. If you have any outstanding patches,
>please send them to the list now.
While I have a kernel-side draft patch implementing a replacement
privcmd mmap batch with proper error indication close to ready, I
haven't even started on the libxc and qemu ones yet. To fully
finish up the kernel side I wanted to wait for your, and possibly
others', opinions on the lifted single shot mapping I suggested
earlier today.
Plus the way errors are to be propagated may be controversial: Contrary
to the original plan, after the paging patches went in, using a
simple bit field won't do anymore, as we now need at least two bits
for indicating all possible states. Right now I'm simply using an array
of ints (returning the actual error codes):
typedef struct privcmd_mmap_batch {
    unsigned int num;             /* number of pages to populate */
    domid_t dom;                  /* target domain */
    __u64 addr;                   /* virtual address */
    const xen_pfn_t __user *arr;  /* array of mfns */
    int __user *err;              /* array of error codes */
} privcmd_mmap_batch_t;
but that could be considered overkill. A non-extensible alternative
would be two bit fields (one for error indications, the other for
paged-out ones), and another possibility would be to at least
use __s16 instead of int for the array to reduce the virtual address
space needed. A third possibility, helping in those cases where the
caller doesn't need the MFN array for anything other than passing it to the ioctl,
could be to explicitly allow the two pointers to hold the same
address (i.e. documenting that the output will never overwrite
unconsumed input).
In any case I'm of the opinion that the tools limitations with the old
ioctl should be eliminated before 4.0 gets released - I had hoped
that someone with better knowledge of the tools than I have
would approach this, but since no-one showed up I'll try to.
Jan
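The trade-off Jan describes above (an array of int error codes, an array of __s16, or two parallel bit fields) can be made concrete with a small sketch. This is only an illustration under stated assumptions: the enum, struct and function names below are hypothetical and do not come from the draft patch, and the bit-field layout (one "error" bit and one "paged-out" bit per page) is one possible reading of the "two bit fields" alternative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-page states; names are illustrative only. */
enum page_state { PAGE_OK = 0, PAGE_ERROR = 1, PAGE_PAGED_OUT = 2 };

/* Two parallel bit fields: bit i of err[] marks a failed page,
 * bit i of paged[] marks a paged-out page. */
struct two_bitfields {
    uint8_t *err;
    uint8_t *paged;
};

static void bitfields_set(struct two_bitfields *bf, size_t i, enum page_state s)
{
    uint8_t mask = (uint8_t)(1u << (i % 8));

    assert(bf && bf->err && bf->paged);
    /* Clear both bits first so a page is never in two states at once. */
    bf->err[i / 8]   &= (uint8_t)~mask;
    bf->paged[i / 8] &= (uint8_t)~mask;
    if (s == PAGE_ERROR)
        bf->err[i / 8] |= mask;
    else if (s == PAGE_PAGED_OUT)
        bf->paged[i / 8] |= mask;
}

static enum page_state bitfields_get(const struct two_bitfields *bf, size_t i)
{
    uint8_t mask = (uint8_t)(1u << (i % 8));

    assert(bf && bf->err && bf->paged);
    if (bf->err[i / 8] & mask)
        return PAGE_ERROR;
    if (bf->paged[i / 8] & mask)
        return PAGE_PAGED_OUT;
    return PAGE_OK;
}

/* Bytes of result storage needed for `num` pages under each layout. */
static size_t bytes_int_array(size_t num)     { return num * sizeof(int); }
static size_t bytes_s16_array(size_t num)     { return num * sizeof(int16_t); }
static size_t bytes_two_bitfields(size_t num) { return 2 * ((num + 7) / 8); }
```

The bit fields are the most compact of the three but can only report that a page failed, not why; the integer arrays cost more address space but carry the actual error code, which is the extensibility argument made in the message.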
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Tagging Xen 4.0.0 first release candidate
2010-01-05 8:56 ` Jan Beulich
@ 2010-01-05 9:06 ` Keir Fraser
2010-01-05 16:00 ` Alex Williamson
0 siblings, 1 reply; 22+ messages in thread
From: Keir Fraser @ 2010-01-05 9:06 UTC (permalink / raw)
To: Jan Beulich; +Cc: xen-devel@lists.xensource.com
On 05/01/2010 08:56, "Jan Beulich" <JBeulich@novell.com> wrote:
>>>> Keir Fraser <keir.fraser@eu.citrix.com> 05.01.10 07:42 >>>
>> I plan to tag -rc1 later this week. If you have any outstanding patches,
>> please send them to the list now.
>
> While I have a kernel side draft patch implementing a replacement
> privcmd mmap with proper error indication batch close to ready, I
> don't have the libxc and qemu ones even started, yet. To fully
> finish up the kernel side I wanted to wait for your and possibly
> other's opinions on the lifted single shot mapping I suggested
> earlier today.
That sounded okay to me.
> In any case I'm of the opinion that the tools limitations with the old
> ioctl should be eliminated before 4.0 gets released - I had hoped
> that someone with better knowledge of the tools than I have
> would approach this, but since no-one showed up I'll try to.
It's a bit late for 4.0.0 really. The lack of interest is probably due to the
lack of people hitting the 43-bit limitation built into the current interface.
Just about no one is anywhere near close to it.
-- Keir
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Tagging Xen 4.0.0 first release candidate
2010-01-05 9:06 ` Keir Fraser
@ 2010-01-05 16:00 ` Alex Williamson
2010-01-05 16:06 ` Keir Fraser
0 siblings, 1 reply; 22+ messages in thread
From: Alex Williamson @ 2010-01-05 16:00 UTC (permalink / raw)
To: Keir Fraser; +Cc: xen-devel@lists.xensource.com, Jan Beulich
On Tue, Jan 5, 2010 at 2:06 AM, Keir Fraser <keir.fraser@eu.citrix.com> wrote:
> On 05/01/2010 08:56, "Jan Beulich" <JBeulich@novell.com> wrote:
>
>> In any case I'm of the opinion that the tools limitations with the old
>> ioctl should be eliminated before 4.0 gets released - I had hoped
>> that someone with better knowledge of the tools than I have
>> would approach this, but since no-one showed up I'll try to.
>
> It's a bit late for 4.0.0 really. The lack of interest is probably the lack
> of people hitting the 43-bit limitation built into the current interface.
> Just about no one is anywhere near close to it.
Perhaps due to the lack of x86 processors supporting more than 40 bits
of physical address space that are currently on the market. But we
know that's going to change fairly shortly and x86 will finally get
support for more than 1TB. That opens the doors for hardware vendors
to create interesting configurations and maybe not worry so much about
compressing the address space into a contiguous block. If Xen can't
support at least a 44-bit physical address space within the 4.x
lifetime, it could become a serious limiting factor. It seems rather
shortsighted not to prepare for it now, especially given the
opportunity we have at a major version break. Thanks,
Alex
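The figures in this exchange follow from simple arithmetic, sketched below. The only assumption is a 4 KiB page size (PAGE_SHIFT of 12); the function names are illustrative and not taken from Xen.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12   /* assumes 4 KiB pages */

/* Size in bytes of an address space that is `bits` wide. */
static uint64_t addr_space_bytes(unsigned bits)
{
    assert(bits < 64);
    return UINT64_C(1) << bits;
}

/* Number of bits needed to hold a page frame number for that space. */
static unsigned frame_number_bits(unsigned bits)
{
    assert(bits >= PAGE_SHIFT);
    return bits - PAGE_SHIFT;
}
```

With these definitions, 40 bits corresponds to 1 TiB, 43 bits to 8 TiB and 44 bits to 16 TiB, while a 43-bit physical address space needs 31-bit frame numbers and a 44-bit one needs 32.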
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Tagging Xen 4.0.0 first release candidate
2010-01-05 16:00 ` Alex Williamson
@ 2010-01-05 16:06 ` Keir Fraser
2010-01-19 13:57 ` Tagging Xen 4.0.0 first release candidate, 44-bit address space support Pasi Kärkkäinen
0 siblings, 1 reply; 22+ messages in thread
From: Keir Fraser @ 2010-01-05 16:06 UTC (permalink / raw)
To: Alex Williamson; +Cc: xen-devel@lists.xensource.com, Jan Beulich
On 05/01/2010 16:00, "Alex Williamson" <alex.williamson@hp.com> wrote:
>> It's a bit late for 4.0.0 really. The lack of interest is probably the lack
>> of people hitting the 43-bit limitation built into the current interface.
>> Just about no one is anywhere near close to it.
>
> Perhaps due to the lack of x86 processors supporting more than 40-bits
> of physical address space that are currently on the market. But we
> know that's going to change fairly shortly and x86 will finally get
> support for more than 1TB. That opens the doors for hardware vendors
> to create interesting configurations and maybe not worry so much about
> compressing the address space into a contiguous block. If Xen can't
> support at least a 44-bit physical address space within the 4.x
> lifetime, it could become a serious limiting factor. It seems rather
> shortsighted not to prepare for it now, especially given the
> opportunity we have at a major version break. Thanks,
If it's considered important we can hold up until next week.
-- Keir
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Tagging Xen 4.0.0 first release candidate / pygrub dom0 caching bug
2010-01-05 6:42 Tagging Xen 4.0.0 first release candidate Keir Fraser
2010-01-05 8:56 ` Jan Beulich
@ 2010-01-06 13:50 ` Pasi Kärkkäinen
2010-01-21 12:28 ` Pasi Kärkkäinen
2010-01-06 15:00 ` Tagging Xen 4.0.0 first release candidate Christian Tramnitz
2 siblings, 1 reply; 22+ messages in thread
From: Pasi Kärkkäinen @ 2010-01-06 13:50 UTC (permalink / raw)
To: Keir Fraser; +Cc: xen-devel
On Tue, Jan 05, 2010 at 06:42:05AM +0000, Keir Fraser wrote:
> I plan to tag -rc1 later this week. If you have any outstanding patches,
> please send them to the list now.
>
Hmm.. I just remembered this pygrub bug:
https://bugzilla.redhat.com/show_bug.cgi?id=466681
pygrub doesn't use O_DIRECT so sometimes it gets old information
from dom0 kernel cache - and fails to use the updated domU grub.conf.
Red Hat seems to have patches available for testing, though not for
xen-unstable.
I've personally hit this bug many times.
-- Pasi
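For readers unfamiliar with what "using O_DIRECT" involves, here is a minimal sketch of an uncached read on Linux. It is not pygrub code (pygrub is written in Python) and not the Red Hat patch; the function name, the 4096-byte alignment and the fallback to a buffered read are all assumptions made for illustration.

```c
#define _GNU_SOURCE          /* exposes O_DIRECT in <fcntl.h> on Linux/glibc */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define DIRECT_ALIGN 4096    /* assumed alignment; real devices may differ */

/* Read up to `cap` bytes from the start of `path`, bypassing the page cache
 * where the filesystem allows it. Returns bytes copied, or -1 on error. */
static ssize_t read_uncached(const char *path, char *out, size_t cap)
{
    void *buf = NULL;
    ssize_t n;
    int fd;

    assert(path && out);
    fd = open(path, O_RDONLY | O_DIRECT);
    if (fd < 0 && errno == EINVAL)
        fd = open(path, O_RDONLY);   /* filesystem without O_DIRECT support */
    if (fd < 0)
        return -1;

    /* O_DIRECT requires the buffer, offset and length to be aligned. */
    if (posix_memalign(&buf, DIRECT_ALIGN, DIRECT_ALIGN) != 0) {
        close(fd);
        return -1;
    }
    n = read(fd, buf, DIRECT_ALIGN);
    if (n > 0) {
        if ((size_t)n > cap)
            n = (ssize_t)cap;
        memcpy(out, buf, (size_t)n);
    }
    free(buf);
    close(fd);
    return n;
}
```

The alignment requirement is the main practical cost: every tool that touches the guest's disk from dom0 would need the same treatment to be safe from stale cached data.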
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Tagging Xen 4.0.0 first release candidate
2010-01-05 6:42 Tagging Xen 4.0.0 first release candidate Keir Fraser
2010-01-05 8:56 ` Jan Beulich
2010-01-06 13:50 ` Tagging Xen 4.0.0 first release candidate / pygrub dom0 caching bug Pasi Kärkkäinen
@ 2010-01-06 15:00 ` Christian Tramnitz
2010-01-06 21:42 ` Ian Pratt
2010-01-07 11:51 ` Pasi Kärkkäinen
2 siblings, 2 replies; 22+ messages in thread
From: Christian Tramnitz @ 2010-01-06 15:00 UTC (permalink / raw)
To: xen-devel
Was this a little too late for the 4.0 discussion or was just no one
interested?
http://permalink.gmane.org/gmane.comp.emulators.xen.devel/75902
Best regards,
Christian
^ permalink raw reply [flat|nested] 22+ messages in thread
* RE: Re: Tagging Xen 4.0.0 first release candidate
2010-01-06 15:00 ` Tagging Xen 4.0.0 first release candidate Christian Tramnitz
@ 2010-01-06 21:42 ` Ian Pratt
2010-01-06 21:52 ` Pasi Kärkkäinen
2010-01-07 11:51 ` Pasi Kärkkäinen
1 sibling, 1 reply; 22+ messages in thread
From: Ian Pratt @ 2010-01-06 21:42 UTC (permalink / raw)
To: Christian Tramnitz, xen-devel@lists.xensource.com; +Cc: Ian Pratt
> Was this a little too late for the 4.0 discussion or was just no one
> interested?
> http://permalink.gmane.org/gmane.comp.emulators.xen.devel/75902
XCP contains a small ISO image with the Citrix PV drivers on it. I believe they've been made to work on xen-unstable; however, the binaries are freely distributable but not open source.
Ian
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Re: Tagging Xen 4.0.0 first release candidate
2010-01-06 21:42 ` Ian Pratt
@ 2010-01-06 21:52 ` Pasi Kärkkäinen
0 siblings, 0 replies; 22+ messages in thread
From: Pasi Kärkkäinen @ 2010-01-06 21:52 UTC (permalink / raw)
To: Ian Pratt; +Cc: xen-devel@lists.xensource.com, Christian Tramnitz
On Wed, Jan 06, 2010 at 09:42:19PM +0000, Ian Pratt wrote:
> > Was this a little too late for the 4.0 discussion or was just no one
> > interested?
> > http://permalink.gmane.org/gmane.comp.emulators.xen.devel/75902
>
> XCP contains a small iso image with the Citrix PV drivers on. I believe they've been made to work on xen-unstable, however the binaries are freely distributable but not open source.
>
xen-3.4-testing.hg also has support for the XCP Windows PV drivers, so the upcoming Xen 3.4.3 will support them as well.
-- Pasi
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Re: Tagging Xen 4.0.0 first release candidate
2010-01-06 15:00 ` Tagging Xen 4.0.0 first release candidate Christian Tramnitz
2010-01-06 21:42 ` Ian Pratt
@ 2010-01-07 11:51 ` Pasi Kärkkäinen
1 sibling, 0 replies; 22+ messages in thread
From: Pasi Kärkkäinen @ 2010-01-07 11:51 UTC (permalink / raw)
To: Christian Tramnitz; +Cc: xen-devel
On Wed, Jan 06, 2010 at 04:00:34PM +0100, Christian Tramnitz wrote:
> Was this a little too late for the 4.0 discussion or was just no one
> interested?
> http://permalink.gmane.org/gmane.comp.emulators.xen.devel/75902
>
What exactly do you mean by "bundling" the gplpv drivers with a Xen release?
What would it help with? Xen is distributed as a source tarball anyway.
The gplpv drivers are already available as a binary from the author's website.
I think it's more flexible when the gplpv drivers are distributed as a
separate package, and not as part of Xen.
-- Pasi
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Tagging Xen 4.0.0 first release candidate, 44-bit address space support
2010-01-05 16:06 ` Keir Fraser
@ 2010-01-19 13:57 ` Pasi Kärkkäinen
2010-01-19 14:02 ` Keir Fraser
0 siblings, 1 reply; 22+ messages in thread
From: Pasi Kärkkäinen @ 2010-01-19 13:57 UTC (permalink / raw)
To: Keir Fraser; +Cc: xen-devel@lists.xensource.com, Jan Beulich, Alex Williamson
On Tue, Jan 05, 2010 at 04:06:50PM +0000, Keir Fraser wrote:
> On 05/01/2010 16:00, "Alex Williamson" <alex.williamson@hp.com> wrote:
>
> >> It's a bit late for 4.0.0 really. The lack of interest is probably the lack
> >> of people hitting the 43-bit limitation built into the current interface.
> >> Just about no one is anywhere near close to it.
> >
> > Perhaps due to the lack of x86 processors supporting more than 40-bits
> > of physical address space that are currently on the market. But we
> > know that's going to change fairly shortly and x86 will finally get
> > support for more than 1TB. That opens the doors for hardware vendors
> > to create interesting configurations and maybe not worry so much about
> > compressing the address space into a contiguous block. If Xen can't
> > support at least a 44-bit physical address space within the 4.x
> > lifetime, it could become a serious limiting factor. It seems rather
> > shortsighted not to prepare for it now, especially given the
> > opportunity we have at a major version break. Thanks,
>
> If it's considered important we can hold up until next week.
>
Any progress with this 44-bit address space support?
-- Pasi
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Tagging Xen 4.0.0 first release candidate, 44-bit address space support
2010-01-19 13:57 ` Tagging Xen 4.0.0 first release candidate, 44-bit address space support Pasi Kärkkäinen
@ 2010-01-19 14:02 ` Keir Fraser
2010-01-19 14:22 ` Pasi Kärkkäinen
0 siblings, 1 reply; 22+ messages in thread
From: Keir Fraser @ 2010-01-19 14:02 UTC (permalink / raw)
To: Pasi Kärkkäinen
Cc: xen-devel@lists.xensource.com, Jan Beulich, Alex Williamson
On 19/01/2010 13:57, "Pasi Kärkkäinen" <pasik@iki.fi> wrote:
>>> Perhaps due to the lack of x86 processors supporting more than 40-bits
>>> of physical address space that are currently on the market. But we
>>> know that's going to change fairly shortly and x86 will finally get
>>> support for more than 1TB. That opens the doors for hardware vendors
>>> to create interesting configurations and maybe not worry so much about
>>> compressing the address space into a contiguous block. If Xen can't
>>> support at least a 44-bit physical address space within the 4.x
>>> lifetime, it could become a serious limiting factor. It seems rather
>>> shortsighted not to prepare for it now, especially given the
>>> opportunity we have at a major version break. Thanks,
>>
>> If it's considered important we can hold up until next week.
>
> Any progress with this 44-bit address space support?
Yes, it's checked in now, ahead of 4.0.0-rc2.
-- Keir
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Tagging Xen 4.0.0 first release candidate, 44-bit address space support
2010-01-19 14:02 ` Keir Fraser
@ 2010-01-19 14:22 ` Pasi Kärkkäinen
0 siblings, 0 replies; 22+ messages in thread
From: Pasi Kärkkäinen @ 2010-01-19 14:22 UTC (permalink / raw)
To: Keir Fraser; +Cc: xen-devel@lists.xensource.com, Jan Beulich, Alex Williamson
On Tue, Jan 19, 2010 at 02:02:46PM +0000, Keir Fraser wrote:
> On 19/01/2010 13:57, "Pasi Kärkkäinen" <pasik@iki.fi> wrote:
>
> >>> Perhaps due to the lack of x86 processors supporting more than 40-bits
> >>> of physical address space that are currently on the market. But we
> >>> know that's going to change fairly shortly and x86 will finally get
> >>> support for more than 1TB. That opens the doors for hardware vendors
> >>> to create interesting configurations and maybe not worry so much about
> >>> compressing the address space into a contiguous block. If Xen can't
> >>> support at least a 44-bit physical address space within the 4.x
> >>> lifetime, it could become a serious limiting factor. It seems rather
> >>> shortsighted not to prepare for it now, especially given the
> >>> opportunity we have at a major version break. Thanks,
> >>
> >> If it's considered important we can hold up until next week.
> >
> > Any progress with this 44-bit address space support?
>
> Yes, it's checked in now, ahead of 4.0.0-rc2.
>
Oh, nice, I missed that.
Thanks!
-- Pasi
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Tagging Xen 4.0.0 first release candidate / pygrub dom0 caching bug
2010-01-06 13:50 ` Tagging Xen 4.0.0 first release candidate / pygrub dom0 caching bug Pasi Kärkkäinen
@ 2010-01-21 12:28 ` Pasi Kärkkäinen
2010-01-21 15:39 ` Jan Beulich
2010-01-21 18:44 ` Daniel Stodden
0 siblings, 2 replies; 22+ messages in thread
From: Pasi Kärkkäinen @ 2010-01-21 12:28 UTC (permalink / raw)
To: Keir Fraser; +Cc: xen-devel
On Wed, Jan 06, 2010 at 03:50:05PM +0200, Pasi Kärkkäinen wrote:
> On Tue, Jan 05, 2010 at 06:42:05AM +0000, Keir Fraser wrote:
> > I plan to tag -rc1 later this week. If you have any outstanding patches,
> > please send them to the list now.
> >
>
> Hmm.. I just remembered this pygrub bug:
> https://bugzilla.redhat.com/show_bug.cgi?id=466681
>
> pygrub doesn't use O_DIRECT so sometimes it gets old information
> from dom0 kernel cache - and fails to use the updated domU grub.conf.
>
> Redhat seems to have patches available for testing.. not for
> xen-unstable though.
>
> I've personally hit this bug many times.
>
It seems the Red Hat guys have a fix available. They fixed the problem by
patching the dom0 kernel blkback.
More details about the fix here:
https://bugzilla.redhat.com/show_bug.cgi?id=466681
Should this be applied to the 2.6.18-xen and pv_ops dom0 kernels as well?
-- Pasi
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Tagging Xen 4.0.0 first release candidate / pygrub dom0 caching bug
2010-01-21 12:28 ` Pasi Kärkkäinen
@ 2010-01-21 15:39 ` Jan Beulich
2010-01-21 18:44 ` Daniel Stodden
1 sibling, 0 replies; 22+ messages in thread
From: Jan Beulich @ 2010-01-21 15:39 UTC (permalink / raw)
To: Keir Fraser, Pasi Kärkkäinen; +Cc: xen-devel
>>> Pasi Kärkkäinen<pasik@iki.fi> 21.01.10 13:28 >>>
>On Wed, Jan 06, 2010 at 03:50:05PM +0200, Pasi Kärkkäinen wrote:
>> pygrub doesn't use O_DIRECT so sometimes it gets old information
>> from dom0 kernel cache - and fails to use the updated domU grub.conf.
>>
>> Redhat seems to have patches available for testing.. not for
>> xen-unstable though.
>>
>> I've personally hit this bug many times.
>>
>
>It seems Redhat guys have a fix available.. they fixed the problem by
>patching dom0 kernel blkback.
>
>More details about the fix here:
>https://bugzilla.redhat.com/show_bug.cgi?id=466681
>
>Should this be applied to 2.6.18-xen and pv_ops dom0 kernels as well?
Yes, please.
Jan
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Tagging Xen 4.0.0 first release candidate / pygrub dom0 caching bug
2010-01-21 12:28 ` Pasi Kärkkäinen
2010-01-21 15:39 ` Jan Beulich
@ 2010-01-21 18:44 ` Daniel Stodden
2010-01-21 19:16 ` Daniel Stodden
1 sibling, 1 reply; 22+ messages in thread
From: Daniel Stodden @ 2010-01-21 18:44 UTC (permalink / raw)
To: Pasi Kärkkäinen; +Cc: xen-devel@lists.xensource.com, Keir Fraser
On Thu, 2010-01-21 at 07:28 -0500, Pasi Kärkkäinen wrote:
> On Wed, Jan 06, 2010 at 03:50:05PM +0200, Pasi Kärkkäinen wrote:
> > On Tue, Jan 05, 2010 at 06:42:05AM +0000, Keir Fraser wrote:
> > > I plan to tag -rc1 later this week. If you have any outstanding patches,
> > > please send them to the list now.
> > >
> >
> > Hmm.. I just remembered this pygrub bug:
> > https://bugzilla.redhat.com/show_bug.cgi?id=466681
> >
> > pygrub doesn't use O_DIRECT so sometimes it gets old information
> > from dom0 kernel cache - and fails to use the updated domU grub.conf.
> >
> > Redhat seems to have patches available for testing.. not for
> > xen-unstable though.
> >
> > I've personally hit this bug many times.
> >
>
> It seems Redhat guys have a fix available.. they fixed the problem by
> patching dom0 kernel blkback.
>
> More details about the fix here:
> https://bugzilla.redhat.com/show_bug.cgi?id=466681
>
> Should this be applied to 2.6.18-xen and pv_ops dom0 kernels as well?
Only to 2.6.18.
It's obsolete after 2.6.27.
O_DIRECT gained page cache invalidation in the meantime.
Cheers,
Daniel
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Tagging Xen 4.0.0 first release candidate / pygrub dom0 caching bug
2010-01-21 18:44 ` Daniel Stodden
@ 2010-01-21 19:16 ` Daniel Stodden
2010-01-21 19:37 ` Ian Campbell
0 siblings, 1 reply; 22+ messages in thread
From: Daniel Stodden @ 2010-01-21 19:16 UTC (permalink / raw)
To: Pasi Kärkkäinen; +Cc: xen-devel@lists.xensource.com, Keir Fraser
On Thu, 2010-01-21 at 13:44 -0500, Daniel Stodden wrote:
> On Thu, 2010-01-21 at 07:28 -0500, Pasi Kärkkäinen wrote:
> > On Wed, Jan 06, 2010 at 03:50:05PM +0200, Pasi Kärkkäinen wrote:
> > > On Tue, Jan 05, 2010 at 06:42:05AM +0000, Keir Fraser wrote:
> > > > I plan to tag -rc1 later this week. If you have any outstanding patches,
> > > > please send them to the list now.
> > > >
> > >
> > > Hmm.. I just remembered this pygrub bug:
> > > https://bugzilla.redhat.com/show_bug.cgi?id=466681
> > >
> > > pygrub doesn't use O_DIRECT so sometimes it gets old information
> > > from dom0 kernel cache - and fails to use the updated domU grub.conf.
> > >
> > > Redhat seems to have patches available for testing.. not for
> > > xen-unstable though.
> > >
> > > I've personally hit this bug many times.
> > >
> >
> > It seems Redhat guys have a fix available.. they fixed the problem by
> > patching dom0 kernel blkback.
> >
> > More details about the fix here:
> > https://bugzilla.redhat.com/show_bug.cgi?id=466681
> >
> > Should this be applied to 2.6.18-xen and pv_ops dom0 kernels as well?
>
> Only to 2.6.18.
>
> It's obsolete after 2.6.27.
> O_DIRECT gained page cache invalidation in the meantime.
Aiiee, sorry. I guess this one only applies to tapdisks. The page cache
invalidation only covers the filemap. That obviously won't fix blkback
bios on raw devices.
Ian Campbell recently noted he came across a different fix, which adds
direct-io to e2fsprogs.
http://www.spinics.net/lists/linux-ext4/msg16992.html
Any opinions on the tradeoff?
Daniel
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Tagging Xen 4.0.0 first release candidate / pygrub dom0 caching bug
2010-01-21 19:16 ` Daniel Stodden
@ 2010-01-21 19:37 ` Ian Campbell
2010-01-21 21:01 ` Pasi Kärkkäinen
0 siblings, 1 reply; 22+ messages in thread
From: Ian Campbell @ 2010-01-21 19:37 UTC (permalink / raw)
To: Daniel Stodden; +Cc: xen-devel@lists.xensource.com, Keir Fraser
On Thu, 2010-01-21 at 19:16 +0000, Daniel Stodden wrote:
> On Thu, 2010-01-21 at 13:44 -0500, Daniel Stodden wrote:
> > On Thu, 2010-01-21 at 07:28 -0500, Pasi Kärkkäinen wrote:
> > > On Wed, Jan 06, 2010 at 03:50:05PM +0200, Pasi Kärkkäinen wrote:
> > > > On Tue, Jan 05, 2010 at 06:42:05AM +0000, Keir Fraser wrote:
> > > > > I plan to tag -rc1 later this week. If you have any outstanding patches,
> > > > > please send them to the list now.
> > > > >
> > > >
> > > > Hmm.. I just remembered this pygrub bug:
> > > > https://bugzilla.redhat.com/show_bug.cgi?id=466681
> > > >
> > > > pygrub doesn't use O_DIRECT so sometimes it gets old information
> > > > from dom0 kernel cache - and fails to use the updated domU grub.conf.
> > > >
> > > > Redhat seems to have patches available for testing.. not for
> > > > xen-unstable though.
> > > >
> > > > I've personally hit this bug many times.
> > > >
> > >
> > > It seems Redhat guys have a fix available.. they fixed the problem by
> > > patching dom0 kernel blkback.
> > >
> > > More details about the fix here:
> > > https://bugzilla.redhat.com/show_bug.cgi?id=466681
> > >
> > > Should this be applied to 2.6.18-xen and pv_ops dom0 kernels as well?
> >
> > Only to 2.6.18.
> >
> > It's obsolete after 2.6.27.
> > O_DIRECT gained page cache invalidation in the meantime.
>
> Aiiee, sorry. I guess this one only applies to tapdisks. The page cache
> invalidation only covers the filemap. That obviously won't fix blkback
> bios on raw devices.
>
> Ian Campbell recently noted he came across a different fix, which adds
> direct-io to e2fsprogs.
I noted the thread because the root problem seemed interesting and
worthy of investigation, but I should have made it clear that I didn't
think messing with direct-io in e2fsprogs was the correct solution. I
think the majority of the participants in the thread thought that too.
The biggest problem is that it only solves the issue in the one specific
case of things which use e2fsprogs and not in general; we can't go around
adding O_DIRECT to everything which might be used to access these disks.
Ian.
>
> http://www.spinics.net/lists/linux-ext4/msg16992.html
>
> Any opinions on the tradeoff?
>
> Daniel
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Tagging Xen 4.0.0 first release candidate / pygrub dom0 caching bug
2010-01-21 19:37 ` Ian Campbell
@ 2010-01-21 21:01 ` Pasi Kärkkäinen
2010-01-21 21:53 ` Daniel Stodden
0 siblings, 1 reply; 22+ messages in thread
From: Pasi Kärkkäinen @ 2010-01-21 21:01 UTC (permalink / raw)
To: Ian Campbell; +Cc: xen-devel@lists.xensource.com, Keir Fraser, Daniel Stodden
On Thu, Jan 21, 2010 at 07:37:27PM +0000, Ian Campbell wrote:
> On Thu, 2010-01-21 at 19:16 +0000, Daniel Stodden wrote:
> > On Thu, 2010-01-21 at 13:44 -0500, Daniel Stodden wrote:
> > > On Thu, 2010-01-21 at 07:28 -0500, Pasi Kärkkäinen wrote:
> > > > On Wed, Jan 06, 2010 at 03:50:05PM +0200, Pasi Kärkkäinen wrote:
> > > > > On Tue, Jan 05, 2010 at 06:42:05AM +0000, Keir Fraser wrote:
> > > > > > I plan to tag -rc1 later this week. If you have any outstanding patches,
> > > > > > please send them to the list now.
> > > > > >
> > > > >
> > > > > Hmm.. I just remembered this pygrub bug:
> > > > > https://bugzilla.redhat.com/show_bug.cgi?id=466681
> > > > >
> > > > > pygrub doesn't use O_DIRECT so sometimes it gets old information
> > > > > from dom0 kernel cache - and fails to use the updated domU grub.conf.
> > > > >
> > > > > Redhat seems to have patches available for testing.. not for
> > > > > xen-unstable though.
> > > > >
> > > > > I've personally hit this bug many times.
> > > > >
> > > >
> > > > It seems Redhat guys have a fix available.. they fixed the problem by
> > > > patching dom0 kernel blkback.
> > > >
> > > > More details about the fix here:
> > > > https://bugzilla.redhat.com/show_bug.cgi?id=466681
> > > >
> > > > Should this be applied to 2.6.18-xen and pv_ops dom0 kernels as well?
> > >
> > > Only to 2.6.18.
> > >
> > > It's obsolete after 2.6.27.
> > > O_DIRECT gained page cache invalidation in the meantime.
> >
> > Aiiee, sorry. I guess this one only applies to tapdisks. The page cache
> > invalidation only covers the filemap. That obviously won't fix blkback
> > bios on raw devices.
> >
> > Ian Campbell recently noted he came across a different fix, which adds
> > direct-io to e2fsprogs.
>
> I noted the thread because the root problem seemed interesting and
> worthy of investigation, but I should have made it clear that I didn't
> think messing with direct-io in e2fsprogs was the correct solution. I
> think the majority of the participants in the thread thought that too.
> The biggest problem is that it only solves the issue in the one specific
> case of things which use e2fsprogs and not in general, we can't go round
> adding O_DIRECT to everything which might be used to access these disks.
>
Yeah, it should be fixed in blkback. Who knows, some users might be using
other tools in dom0 as well, not just pygrub.
-- Pasi
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Tagging Xen 4.0.0 first release candidate / pygrub dom0 caching bug
2010-01-21 21:01 ` Pasi Kärkkäinen
@ 2010-01-21 21:53 ` Daniel Stodden
2010-01-27 9:27 ` Pasi Kärkkäinen
0 siblings, 1 reply; 22+ messages in thread
From: Daniel Stodden @ 2010-01-21 21:53 UTC (permalink / raw)
To: Pasi Kärkkäinen
Cc: Ian Campbell, xen-devel@lists.xensource.com, Keir Fraser
On Thu, 2010-01-21 at 16:01 -0500, Pasi Kärkkäinen wrote:
> On Thu, Jan 21, 2010 at 07:37:27PM +0000, Ian Campbell wrote:
> > On Thu, 2010-01-21 at 19:16 +0000, Daniel Stodden wrote:
> > > On Thu, 2010-01-21 at 13:44 -0500, Daniel Stodden wrote:
> > > > On Thu, 2010-01-21 at 07:28 -0500, Pasi Kärkkäinen wrote:
> > > > > On Wed, Jan 06, 2010 at 03:50:05PM +0200, Pasi Kärkkäinen wrote:
> > > > > > On Tue, Jan 05, 2010 at 06:42:05AM +0000, Keir Fraser wrote:
> > > > > > > I plan to tag -rc1 later this week. If you have any outstanding patches,
> > > > > > > please send them to the list now.
> > > > > > >
> > > > > >
> > > > > > Hmm.. I just remembered this pygrub bug:
> > > > > > https://bugzilla.redhat.com/show_bug.cgi?id=466681
> > > > > >
> > > > > > pygrub doesn't use O_DIRECT so sometimes it gets old information
> > > > > > from dom0 kernel cache - and fails to use the updated domU grub.conf.
> > > > > >
> > > > > > Redhat seems to have patches available for testing.. not for
> > > > > > xen-unstable though.
> > > > > >
> > > > > > I've personally hit this bug many times.
> > > > > >
> > > > >
> > > > > It seems Redhat guys have a fix available.. they fixed the problem by
> > > > > patching dom0 kernel blkback.
> > > > >
> > > > > More details about the fix here:
> > > > > https://bugzilla.redhat.com/show_bug.cgi?id=466681
> > > > >
> > > > > Should this be applied to 2.6.18-xen and pv_ops dom0 kernels as well?
> > > >
> > > > Only to 2.6.18.
> > > >
> > > > It's obsolete after 2.6.27.
> > > > O_DIRECT gained page cache invalidation in the meantime.
> > >
> > > Aiiee, sorry. I guess this one only applies to tapdisks. The page cache
> > > invalidation only covers the filemap. That obviously won't fix blkback
> > > bios on raw devices.
> > >
> > > Ian Campbell recently noted he came across a different fix, which adds
> > > direct-io to e2fsprogs.
> >
> > I noted the thread because the root problem seemed interesting and
> > worthy of investigation, but I should have made it clear that I didn't
> > think messing with direct-io in e2fsprogs was the correct solution. I
> > think the majority of the participants in the thread thought that too.
> > The biggest problem is that it only solves the issue in the one specific
> > case of things which use e2fsprogs and not in general, we can't go round
> > adding O_DIRECT to everything which might be used to access these disks.
> >
>
> Yeah, it should be fixed in blkback.. who knows, some users might be using
> other tools in dom0 as well, not just pygrub.
Fully agreed.
But: One thing about the RHEL patch isn't immediately clear to me. The
invalidate step apparently goes into the VBD creation.
I don't see why this is sufficient; my understanding was that pygrub
would rather read stale data after boot, then run, then shut down, then
reboot.
Which rather suggests flushing during shutdown (?).
Or rather on both ends, because 1) installing a guest by copying a VDI
image, 2) failing to properly close the raw device to get the caches
flushed, before 3) booting the VM is another potential problem.
We used to see the latter becoming an issue in the past.
Thanks,
Daniel
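The "flush, then invalidate" step discussed above has a rough userspace analogue, sketched below. This is not the RHEL blkback patch, which works inside the kernel; it only illustrates the two operations using calls that exist on Linux (fsync, the BLKFLSBUF ioctl for block devices, and posix_fadvise for regular files). The function name is hypothetical, and BLKFLSBUF normally needs root privileges.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <linux/fs.h>      /* BLKFLSBUF */
#include <sys/ioctl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Write back and then drop dom0's cached pages for an open file or block
 * device. Returns 0 on success, or an errno value on failure. */
static int flush_and_drop_cache(int fd)
{
    struct stat st;

    assert(fd >= 0);
    if (fstat(fd, &st) != 0)
        return errno;
    if (fsync(fd) != 0)            /* push dirty pages to the backing store */
        return errno;
    if (S_ISBLK(st.st_mode))       /* block device: flush and invalidate buffers */
        return ioctl(fd, BLKFLSBUF, 0) == 0 ? 0 : errno;
    /* Regular file: advisory request to drop its cached pages. */
    return posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
}
```

Calling something like this both before a guest boots and after it shuts down would correspond to the "on both ends" suggestion: the first call covers data written from dom0 (for example a copied image), the second covers data the guest changed behind dom0's cache.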
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Tagging Xen 4.0.0 first release candidate / pygrub dom0 caching bug
2010-01-21 21:53 ` Daniel Stodden
@ 2010-01-27 9:27 ` Pasi Kärkkäinen
2010-01-28 19:34 ` Daniel Stodden
0 siblings, 1 reply; 22+ messages in thread
From: Pasi Kärkkäinen @ 2010-01-27 9:27 UTC (permalink / raw)
To: Daniel Stodden; +Cc: Ian Campbell, xen-devel@lists.xensource.com, Keir Fraser
On Thu, Jan 21, 2010 at 01:53:21PM -0800, Daniel Stodden wrote:
> On Thu, 2010-01-21 at 16:01 -0500, Pasi Kärkkäinen wrote:
> > On Thu, Jan 21, 2010 at 07:37:27PM +0000, Ian Campbell wrote:
> > > On Thu, 2010-01-21 at 19:16 +0000, Daniel Stodden wrote:
> > > > On Thu, 2010-01-21 at 13:44 -0500, Daniel Stodden wrote:
> > > > > On Thu, 2010-01-21 at 07:28 -0500, Pasi Kärkkäinen wrote:
> > > > > > On Wed, Jan 06, 2010 at 03:50:05PM +0200, Pasi Kärkkäinen wrote:
> > > > > > > On Tue, Jan 05, 2010 at 06:42:05AM +0000, Keir Fraser wrote:
> > > > > > > > I plan to tag -rc1 later this week. If you have any outstanding patches,
> > > > > > > > please send them to the list now.
> > > > > > > >
> > > > > > >
> > > > > > > Hmm.. I just remembered this pygrub bug:
> > > > > > > https://bugzilla.redhat.com/show_bug.cgi?id=466681
> > > > > > >
> > > > > > > pygrub doesn't use O_DIRECT so sometimes it gets old information
> > > > > > > from dom0 kernel cache - and fails to use the updated domU grub.conf.
> > > > > > >
> > > > > > > Redhat seems to have patches available for testing.. not for
> > > > > > > xen-unstable though.
> > > > > > >
> > > > > > > I've personally hit this bug many times.
> > > > > > >
> > > > > >
> > > > > > It seems Redhat guys have a fix available.. they fixed the problem by
> > > > > > patching dom0 kernel blkback.
> > > > > >
> > > > > > More details about the fix here:
> > > > > > https://bugzilla.redhat.com/show_bug.cgi?id=466681
> > > > > >
> > > > > > Should this be applied to 2.6.18-xen and pv_ops dom0 kernels aswell?
> > > > >
> > > > > Only to 2.6.18.
> > > > >
> > > > > It's obsolete after 2.6.27.
> > > > > O_DIRECT gained page cache invalidation in the meantime.
> > > >
> > > > Aiiee, sorry. I guess this one only applies to tapdisks. The page cache
> > > > invalidation only covers the filemap. That obviously won't fix blkback
> > > > bios on raw devices.
> > > >
> > > > Ian Campbell recently noted he came across a different fix, which adds
> > > > direct-io to e2fsprogs.
> > >
> > > I noted the thread because the root problem seemed interesting and
> > > worthy of investigation, but I should have made it clear that I didn't
> > > think messing with direct-io in e2fsprogs was the correct solution. I
> > > think the majority of the participants in the thread thought that too.
> > > The biggest problem is that it only solves the issue in the one specific
> > > case of things which use e2fsprogs and not in general, we can't go round
> > > adding O_DIRECT to everything which might be used to access these disks.
> > >
> >
> > Yeah, it should be fixed in blkback.. who knows, some users might be using
> > other tools in dom0 aswell, not just pygrub.
>
> Fully agreed.
>
> But: One thing about the rhel patch isn't immediately clear to me. The
> invalidate step apparently goes into the VBD creation.
>
With the RH blkback kernel patch/fix:

1) xm create domU
2) pygrub runs, caching stuff in the dom0 kernel cache
3) domU is started, the patched blkback driver flushes the dom0 kernel cache when the disk backend is created
4) grub.conf is modified in the guest
5) domU shuts down

6) xm create domU
7) pygrub runs and gets the updated grub.conf, because the dom0 kernel cache is empty after the flush in 3)
8) domU is started, blkback again flushes the dom0 kernel cache to prevent future problems

That's how I understood it..

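
The sequence above can be sketched as a toy model. This is not blkback or pygrub code, just a simulation under the assumptions stated in this thread: pygrub reads go through the dom0 cache, guest I/O via blkfront bypasses it, and the patched backend discards cached entries at creation. All names are hypothetical.

```python
class Dom0Model:
    """Toy model: one guest disk plus the dom0 cache in front of it."""

    def __init__(self, grub_conf, flush_on_backend_create):
        self.disk = grub_conf   # what is really on the guest disk
        self.cache = None       # dom0's cached copy, if any
        self.flush_on_backend_create = flush_on_backend_create

    def pygrub_read(self):
        # Buffered read: served from the dom0 cache when an entry exists.
        if self.cache is None:
            self.cache = self.disk
        return self.cache

    def create_backend(self):
        # The patched blkback discards cached entries at VBD creation.
        if self.flush_on_backend_create:
            self.cache = None

    def guest_write(self, grub_conf):
        # Guest I/O reaches the disk without touching the dom0 cache.
        self.disk = grub_conf


def boot_twice(flush_on_backend_create):
    dom0 = Dom0Model("kernel-A", flush_on_backend_create)
    first = dom0.pygrub_read()     # steps 1-2
    dom0.create_backend()          # step 3
    dom0.guest_write("kernel-B")   # step 4 (step 5, shutdown, changes nothing here)
    second = dom0.pygrub_read()    # steps 6-7
    dom0.create_backend()          # step 8
    return first, second
```

In this model boot_twice(False) makes the second pygrub run see the old grub.conf, while boot_twice(True) sees the update.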
> I don't see why this is sufficient, my understanding was that pygrub
> would rather read stale data after boot, then run, then shutdown, then
> reboot.
>
> Which rather suggests flushing during shutdown (?).
>
Disk I/O from the domU blkfront is not cached in dom0,
so it's enough to flush during the disk backend creation?
pygrub is the only player here that gets stuff into the dom0 cache.
> Or rather on both ends. Because 1) installing a guest by copying a VDI
> image 2) failing to properly close the raw device to get the caches
> flushed before 3) booting the VM is another potential problem.
>
> We used to see the latter becoming an issue in the past.
>
I guess it wouldn't hurt to also flush during shutdown.. ?
-- Pasi
^ permalink raw reply [flat|nested] 22+ messages in thread
* Re: Tagging Xen 4.0.0 first release candidate / pygrub dom0 caching bug
2010-01-27 9:27 ` Pasi Kärkkäinen
@ 2010-01-28 19:34 ` Daniel Stodden
0 siblings, 0 replies; 22+ messages in thread
From: Daniel Stodden @ 2010-01-28 19:34 UTC (permalink / raw)
To: Pasi Kärkkäinen; +Cc: xen-devel@lists.xensource.com
On Wed, 2010-01-27 at 04:27 -0500, Pasi Kärkkäinen wrote:
> On Thu, Jan 21, 2010 at 01:53:21PM -0800, Daniel Stodden wrote:
> > On Thu, 2010-01-21 at 16:01 -0500, Pasi Kärkkäinen wrote:
> > > On Thu, Jan 21, 2010 at 07:37:27PM +0000, Ian Campbell wrote:
> > > > On Thu, 2010-01-21 at 19:16 +0000, Daniel Stodden wrote:
> > > > > On Thu, 2010-01-21 at 13:44 -0500, Daniel Stodden wrote:
> > > > > > On Thu, 2010-01-21 at 07:28 -0500, Pasi Kärkkäinen wrote:
> > > > > > > On Wed, Jan 06, 2010 at 03:50:05PM +0200, Pasi Kärkkäinen wrote:
> > > > > > > > On Tue, Jan 05, 2010 at 06:42:05AM +0000, Keir Fraser wrote:
> > > > > > > > > I plan to tag -rc1 later this week. If you have any outstanding patches,
> > > > > > > > > please send them to the list now.
> > > > > > > > >
> > > > > > > >
> > > > > > > > Hmm.. I just remembered this pygrub bug:
> > > > > > > > https://bugzilla.redhat.com/show_bug.cgi?id=466681
> > > > > > > >
> > > > > > > > pygrub doesn't use O_DIRECT so sometimes it gets old information
> > > > > > > > from dom0 kernel cache - and fails to use the updated domU grub.conf.
> > > > > > > >
> > > > > > > > Redhat seems to have patches available for testing.. not for
> > > > > > > > xen-unstable though.
> > > > > > > >
> > > > > > > > I've personally hit this bug many times.
> > > > > > > >
> > > > > > >
> > > > > > > It seems Redhat guys have a fix available.. they fixed the problem by
> > > > > > > patching dom0 kernel blkback.
> > > > > > >
> > > > > > > More details about the fix here:
> > > > > > > https://bugzilla.redhat.com/show_bug.cgi?id=466681
> > > > > > >
> > > > > > > Should this be applied to 2.6.18-xen and pv_ops dom0 kernels aswell?
> > > > > >
> > > > > > Only to 2.6.18.
> > > > > >
> > > > > > It's obsolete after 2.6.27.
> > > > > > O_DIRECT gained page cache invalidation in the meantime.
> > > > >
> > > > > Aiiee, sorry. I guess this one only applies to tapdisks. The page cache
> > > > > invalidation only covers the filemap. That obviously won't fix blkback
> > > > > bios on raw devices.
> > > > >
> > > > > Ian Campbell recently noted he came across a different fix, which adds
> > > > > direct-io to e2fsprogs.
> > > >
> > > > I noted the thread because the root problem seemed interesting and
> > > > worthy of investigation, but I should have made it clear that I didn't
> > > > think messing with direct-io in e2fsprogs was the correct solution. I
> > > > think the majority of the participants in the thread thought that too.
> > > > The biggest problem is that it only solves the issue in the one specific
> > > > case of things which use e2fsprogs and not in general, we can't go round
> > > > adding O_DIRECT to everything which might be used to access these disks.
> > > >
> > >
> > > Yeah, it should be fixed in blkback.. who knows, some users might be using
> > > other tools in dom0 aswell, not just pygrub.
> >
> > Fully agreed.
> >
> > But: One thing about the rhel patch isn't immediately clear to me. The
> > invalidate step apparently goes into the VBD creation.
> >
>
> With the RH kernel blkback kernel patch/fix:
>
> 1) xm create domU
> 2) pygrub runs, caching stuff in dom0 kernel cache
> 3) domU is started, the patched blkback driver flushes dom0 kernel cache when the disk backend is created
> 4) grub.conf is modified in the guest
> 5) domU shuts down
>
> 6) xm create domU
> 7) pygrub runs, and gets the new updated grub.conf, since there's nothing in the dom0 kernel cache, since it was flushed in 3)
> 8) domU is started, blkback again flushes the dom0 kernel cache to prevent future problems
>
> That'w how I understood it..
>
> > I don't see why this is sufficient, my understanding was that pygrub
> > would rather read stale data after boot, then run, then shutdown, then
> > reboot.
> >
> > Which rather suggests flushing during shutdown (?).
> >
>
> disk IO from the domU blkfront is not cached in dom0,
> so it's enough to flush during the disk backend creation?
Yes. By flush I meant discarding the cache entries left from 2), not
some writeback.
You're right, it doesn't matter as long as the disk is not buffered
somewhere while still opened by the backend.
> pygrub is the only player here who gets stuff in the dom0 cache.
> > Or rather on both ends. Because 1) installing a guest by copying a VDI
> > image 2) failing to properly close the raw device to get the caches
> > flushed before 3) booting the VM is another potential problem.
> >
> > We used to see the latter becoming an issue in the past.
> >
>
> I guess it wouldn't hurt to also flush during shutdown.. ?
I don't really mind.
XenServer accesses the disks only by attaching them to dom0; right now
that's actually my preferred alternative. The blkback thing can hardly
hurt, so let's pull it in.
I'd mainly wonder whether a little barrier utility for the control stack
wouldn't be a far more flexible solution. So instead of patching some
particular backend to fix the world of e2fsprogs, or (worse) patching
e2fsprogs themselves, let pygrub call some program which in turn does
some ioctl(barrier) magic on the node to discard potentially stale
mappings.
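
A rough sketch of what such a utility might look like, assuming the Linux BLKFLSBUF ioctl (the one behind `blockdev --flushbufs`) is a suitable stand-in for the "barrier"; that choice is an assumption, not something settled in this thread. The function name is hypothetical, and the ioctl parameter exists only so the call can be replaced in tests. Running it against a real device normally needs root.

```python
import fcntl
import os

# BLKFLSBUF from <linux/fs.h>: _IO(0x12, 97)
BLKFLSBUF = 0x1261


def discard_buffer_cache(device_path, ioctl=fcntl.ioctl):
    """Ask the kernel to flush and drop cached buffers for a block device.

    Hypothetical helper; `ioctl` is injectable for testing only.
    """
    fd = os.open(device_path, os.O_RDONLY)
    try:
        ioctl(fd, BLKFLSBUF, 0)
    finally:
        os.close(fd)
```

The control stack (or pygrub itself) would call this on the guest's disk node right before reading grub.conf, so no backend or filesystem tool needs patching.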
Daniel
^ permalink raw reply [flat|nested] 22+ messages in thread
end of thread, other threads:[~2010-01-28 19:34 UTC | newest]
Thread overview: 22+ messages
2010-01-05 6:42 Tagging Xen 4.0.0 first release candidate Keir Fraser
2010-01-05 8:56 ` Jan Beulich
2010-01-05 9:06 ` Keir Fraser
2010-01-05 16:00 ` Alex Williamson
2010-01-05 16:06 ` Keir Fraser
2010-01-19 13:57 ` Tagging Xen 4.0.0 first release candidate, 44-bit address space support Pasi Kärkkäinen
2010-01-19 14:02 ` Keir Fraser
2010-01-19 14:22 ` Pasi Kärkkäinen
2010-01-06 13:50 ` Tagging Xen 4.0.0 first release candidate / pygrub dom0 caching bug Pasi Kärkkäinen
2010-01-21 12:28 ` Pasi Kärkkäinen
2010-01-21 15:39 ` Jan Beulich
2010-01-21 18:44 ` Daniel Stodden
2010-01-21 19:16 ` Daniel Stodden
2010-01-21 19:37 ` Ian Campbell
2010-01-21 21:01 ` Pasi Kärkkäinen
2010-01-21 21:53 ` Daniel Stodden
2010-01-27 9:27 ` Pasi Kärkkäinen
2010-01-28 19:34 ` Daniel Stodden
2010-01-06 15:00 ` Tagging Xen 4.0.0 first release candidate Christian Tramnitz
2010-01-06 21:42 ` Ian Pratt
2010-01-06 21:52 ` Pasi Kärkkäinen
2010-01-07 11:51 ` Pasi Kärkkäinen