* QLogic Fibre Channel HBA Support
@ 2004-07-04 8:38 Steve Traugott
2004-07-04 9:02 ` Keir Fraser
2004-07-04 9:11 ` Ian Pratt
0 siblings, 2 replies; 28+ messages in thread
From: Steve Traugott @ 2004-07-04 8:38 UTC (permalink / raw)
To: xen-devel
Hey Folks,
Still hunting for better alternatives vs. NFS roots -- does anyone know
what I'd need to do to get the driver for QLogic fibre HBA's working, so
I can host VBD's from a SAN?
Steve
--
Stephen G. Traugott (KG6HDQ)
UNIX/Linux Infrastructure Architect, TerraLuna LLC
stevegt@TerraLuna.Org
http://www.stevegt.com -- http://Infrastructures.Org
-------------------------------------------------------
This SF.Net email sponsored by Black Hat Briefings & Training.
Attend Black Hat Briefings & Training, Las Vegas July 24-29 -
digital self defense, top technical experts, no vendor pitches,
unmatched networking opportunities. Visit www.blackhat.com
^ permalink raw reply [flat|nested] 28+ messages in thread
* Re: QLogic Fibre Channel HBA Support
2004-07-04 8:38 Steve Traugott
@ 2004-07-04 9:02 ` Keir Fraser
2004-07-04 19:47 ` Steve Traugott
2004-07-04 9:11 ` Ian Pratt
1 sibling, 1 reply; 28+ messages in thread
From: Keir Fraser @ 2004-07-04 9:02 UTC (permalink / raw)
To: Steve Traugott; +Cc: xen-devel
We now run device drivers unmodified with privileged guest OSes. If
you use the unstable tree you would now compile all your drivers into
the DOM0 kernel (including the standard qlogic driver), and you can
then export chunks of disc space to other domains via a
high-performance shared-memory interface.
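A minimal sketch of how one might check that a driver option is compiled into a domain 0 kernel config rather than built as a module or left out. The option name CONFIG_EXAMPLE_FC_HBA and the sample config text are hypothetical, made up for illustration; the real option for a given QLogic driver has to be looked up in the kernel tree being built.

```python
def kernel_option_state(config_text, option):
    """Classify a kernel config option as 'built-in', 'module', 'other' or 'absent'."""
    for line in config_text.splitlines():
        line = line.strip()
        # Skip comments such as "# CONFIG_FOO is not set" and blank lines.
        if line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        if name == option:
            if value == "y":
                return "built-in"
            if value == "m":
                return "module"
            return "other"  # string or numeric option, not a y/m toggle
    return "absent"


# Hypothetical config fragment; CONFIG_EXAMPLE_FC_HBA is not a real option.
SAMPLE_CONFIG = """\
CONFIG_SCSI=y
# CONFIG_SCSI_DEBUG is not set
CONFIG_EXAMPLE_FC_HBA=m
"""

state = kernel_option_state(SAMPLE_CONFIG, "CONFIG_EXAMPLE_FC_HBA")
if state != "built-in":
    print("driver is %s in this config, not compiled in" % state)
```

In practice the text would come from the .config of the domain 0 kernel build; the check only reads the file and says nothing about whether the driver actually binds to the HBA.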
-- Keir
> Hey Folks,
>
> Still hunting for better alternatives vs. NFS roots -- does anyone know
> what I'd need to do to get the driver for QLogic fibre HBA's working, so
> I can host VBD's from a SAN?
>
> Steve
* Re: QLogic Fibre Channel HBA Support
2004-07-04 8:38 Steve Traugott
2004-07-04 9:02 ` Keir Fraser
@ 2004-07-04 9:11 ` Ian Pratt
2004-07-04 20:31 ` Steve Traugott
1 sibling, 1 reply; 28+ messages in thread
From: Ian Pratt @ 2004-07-04 9:11 UTC (permalink / raw)
To: Steve Traugott; +Cc: xen-devel, Ian.Pratt
> Still hunting for better alternatives vs. NFS roots -- does anyone know
> what I'd need to do to get the driver for QLogic fibre HBA's working, so
> I can host VBD's from a SAN?
If there's a driver in Linux, it should just work if you use the
unstable-xeno tree and modify the config for the domain 0 linux
to add the driver.
In the new tree, rather than using our own virtual disk code we're
planning on using Linux's standard LVM2 code to enable
physical partitions to be sliced and diced. The tool support for
this isn't quite there yet.
An alternative to using a FCAL SAN is to use iSCSI. I've found
that the Linux Cisco iSCSI initiator code works nicely, and can
either talk to a hardware iSCSI target or to the Ardistech Linux
iSCSI s/w target. I've generally configured it such that the
domain talks iSCSI directly (using an initrd to enable root to be
on the iSCSI volume). Others have configured iSCSI in domain 0
and then exported the partitions to other domains as block
devices using the normal VBD mechanism.
Ian
* Re: QLogic Fibre Channel HBA Support
2004-07-04 9:02 ` Keir Fraser
@ 2004-07-04 19:47 ` Steve Traugott
0 siblings, 0 replies; 28+ messages in thread
From: Steve Traugott @ 2004-07-04 19:47 UTC (permalink / raw)
To: Keir Fraser; +Cc: xen-devel
Fantastic. That's what I was hoping to hear (since I already ordered
the fibre channel gear). :-}
On Sun, Jul 04, 2004 at 10:02:25AM +0100, Keir Fraser wrote:
>
> We now run device drivers unmodified with privileged guest OSes. If
> you use the unstable tree you would now compile all your drivers into
> the DOM0 kernel (including the standard qlogic driver), and you can
> then export chunks of disc space to other domains via a
> high-performance shared-memory interface.
>
> -- Keir
--
Stephen G. Traugott (KG6HDQ)
UNIX/Linux Infrastructure Architect, TerraLuna LLC
stevegt@TerraLuna.Org
http://www.stevegt.com -- http://Infrastructures.Org
* Re: QLogic Fibre Channel HBA Support
2004-07-04 9:11 ` Ian Pratt
@ 2004-07-04 20:31 ` Steve Traugott
2004-07-04 20:50 ` Ian Pratt
2004-07-05 9:40 ` Niraj Tolia
0 siblings, 2 replies; 28+ messages in thread
From: Steve Traugott @ 2004-07-04 20:31 UTC (permalink / raw)
To: Ian Pratt; +Cc: xen-devel
On Sun, Jul 04, 2004 at 10:11:38AM +0100, Ian Pratt wrote:
>
> > Still hunting for better alternatives vs. NFS roots -- does anyone know
> > what I'd need to do to get the driver for QLogic fibre HBA's working, so
> > I can host VBD's from a SAN?
>
> If there's a driver in Linux, it should just work if you use the
> unstable-xeno tree and modify the config for the domain 0 linux
> to add the driver.
So, how stable is unstable these days? I.E. would you trust it to host
other people's guests?
> In the new tree, rather than using our own virtual disk code we're
> planning on using Linux's standard LVM2 code to enable
> physical partitions to be sliced and diced. The tool support for
> this isn't quite there yet.
That sounds good -- you mean tools support as in python? I should be
able to help if that's the case.
> An alternative to using a FCAL SAN is to use iSCSI. I've found
> that the Linux Cisco iSCSI initiator code works nicely, and can
> either talk to a hardware iSCSI target or to the Ardistech Linux
> iSCSI s/w target. I've generally configured it such that the
> domain talks iSCSI directly (using an initrd to enable root to be
> on the iSCSI volume). Others have configured iSCSI in domain 0
> and then exported the partitions to other domains as block
> devices using the normal VBD mechanism.
I'd need to use the iSCSI in domain 0 approach (other people's
guests...), haven't tried it due to lack of hardware targets, didn't get
warm fuzzies from Ardistech's code -- you've had no problems with it
though?
Steve
--
Stephen G. Traugott (KG6HDQ)
UNIX/Linux Infrastructure Architect, TerraLuna LLC
stevegt@TerraLuna.Org
http://www.stevegt.com -- http://Infrastructures.Org
* Re: QLogic Fibre Channel HBA Support
2004-07-04 20:31 ` Steve Traugott
@ 2004-07-04 20:50 ` Ian Pratt
2004-07-05 18:24 ` Steve Traugott
2004-07-05 9:40 ` Niraj Tolia
1 sibling, 1 reply; 28+ messages in thread
From: Ian Pratt @ 2004-07-04 20:50 UTC (permalink / raw)
To: Steve Traugott; +Cc: Ian Pratt, xen-devel
> On Sun, Jul 04, 2004 at 10:11:38AM +0100, Ian Pratt wrote:
> >
> > > Still hunting for better alternatives vs. NFS roots -- does anyone know
> > > what I'd need to do to get the driver for QLogic fibre HBA's working, so
> > > I can host VBD's from a SAN?
> >
> > If there's a driver in Linux, it should just work if you use the
> > unstable-xeno tree and modify the config for the domain 0 linux
> > to add the driver.
>
> So, how stable is unstable these days? I.E. would you trust it to host
> other people's guests?
I think Xen and XenLinux 2.4.26 are pretty damn stable -- at
least I know of no outstanding bugs that have been posted to the
list.
The new control tools still have a few stability issues, but
since they're restartable it's typically not a showstopper.
Armed with good bug reports I'm sure Mike will be able to resolve
this. There are a few extra features and fixes we need to get
into the tools, then I think we'll be ready for a 2.0-alpha
release. I'd hope that all users of 1.2 would test it out for
us.
> > In the new tree, rather than using our own virtual disk code we're
> > planning on using Linux's standard LVM2 code to enable
> > physical partitions to be sliced and diced. The tool support for
> > this isn't quite there yet.
>
> That sounds good -- you mean tools support as in python? I should be
> able to help if that's the case.
Having other people who understand the new xend and xm tool would
be very helpful. It's all a bit too object-oriented for me to
understand :-)
> > An alternative to using a FCAL SAN is to use iSCSI. I've found
> > that the Linux Cisco iSCSI initiator code works nicely, and can
> > either talk to a hardware iSCSI target or to the Ardistech Linux
> > iSCSI s/w target. I've generally configured it such that the
> > domain talks iSCSI directly (using an initrd to enable root to be
> > on the iSCSI volume). Others have configured iSCSI in domain 0
> > and then exported the partitions to other domains as block
> > devices using the normal VBD mechanism.
>
> I'd need to use the iSCSI in domain 0 approach (other people's
> guests...), haven't tried it due to lack of hardware targets, didn't get
> warm fuzzies from Ardistech's code -- you've had no problems with it
> though?
I've mainly used a NetApp filer h/w target, so I haven't really
got enough experience to say whether the Ardistech code is
stable or not. There's always enbd, nbd, gnbd which are all
simple enough to believe that they work...
Ian
* Re: QLogic Fibre Channel HBA Support
@ 2004-07-05 4:39 Brian Wolfe
0 siblings, 0 replies; 28+ messages in thread
From: Brian Wolfe @ 2004-07-05 4:39 UTC (permalink / raw)
To: Steve Traugott; +Cc: xen-devel
I've been working on putting together a system to manage clusters of
Xen nodes. I can't go into a lot of detail yet since it's only in the
design stage right now.
As for stable enough to host, I'm using a 1.3 version from May or June
(can't remember the version off the top of my head) to do virtual
server hosting for my own machines and for several clients.
Interestingly enough, because of the setup I moved to I'm getting
better overall performance using 2 main NFS servers with 8-disk raid-5
arrays than with individual machines using mirror sets!
I will be looking for anyone interested in using the prototype tools
later once we have picked up the new set that Ian and friends are
coding for everyone, and have rewritten things based on them. :)
* Re: QLogic Fibre Channel HBA Support
2004-07-04 20:31 ` Steve Traugott
2004-07-04 20:50 ` Ian Pratt
@ 2004-07-05 9:40 ` Niraj Tolia
2004-07-05 10:11 ` Ian Pratt
1 sibling, 1 reply; 28+ messages in thread
From: Niraj Tolia @ 2004-07-05 9:40 UTC (permalink / raw)
To: Steve Traugott; +Cc: Ian Pratt, xen-devel
> > An alternative to using a FCAL SAN is to use iSCSI. I've found
> > that the Linux Cisco iSCSI initiator code works nicely, and can
> > either talk to a hardware iSCSI target or to the Ardistech Linux
> > iSCSI s/w target. I've generally configured it such that the
> > domain talks iSCSI directly (using an initrd to enable root to be
> > on the iSCSI volume). Others have configured iSCSI in domain 0
> > and then exported the partitions to other domains as block
> > devices using the normal VBD mechanism.
>
> I'd need to use the iSCSI in domain 0 approach (other people's
> guests...), haven't tried it due to lack of hardware targets, didn't get
> warm fuzzies from Ardistech's code -- you've had no problems with it
> though?
>
FWIW, I managed to get the following combinations of iSCSI
initiators/targets to work. Initiators ran in domain 0 while targets
ran on a vanilla linux system (no xen). I haven't really stressed the
systems though.
UNH Initiator <--> UNH Target
Cisco Initiator <--> UNH Target
Intel PRO/1000 T (Hardware) Initiator <--> UNH Target
A good site that might help is <http://zaal.org/iscsi/>.
Niraj
* Re: QLogic Fibre Channel HBA Support
2004-07-05 9:40 ` Niraj Tolia
@ 2004-07-05 10:11 ` Ian Pratt
0 siblings, 0 replies; 28+ messages in thread
From: Ian Pratt @ 2004-07-05 10:11 UTC (permalink / raw)
To: Niraj Tolia; +Cc: Steve Traugott, Ian Pratt, xen-devel, Ian.Pratt
> FWIW, I managed to get the following combinations of iSCSI
> initiators/targets to work. Initiators ran in domain 0 while targets
> ran on a vanilla linux system (no xen). I haven't really stressed the
> systems though.
>
> UNH Initiator <--> UNH Target
> Cisco Initiator <--> UNH Target
> Intel PRO/1000 T (Hardware) Initiator <--> UNH Target
>
> A good site that might help is <http://zaal.org/iscsi/>.
Interesting -- I hadn't come across the UNH target or initiator.
The guy on zaal.org is pretty damning of both the UNH and Ardis
targets, but has set about writing his own:
http://sourceforge.net/projects/iscsitarget/
This actually looks pretty good, even though it's still an alpha
release. I'd be interested to hear if anyone has tried it in
anger.
Ian
* Re: QLogic Fibre Channel HBA Support
2004-07-04 20:50 ` Ian Pratt
@ 2004-07-05 18:24 ` Steve Traugott
2004-07-05 18:48 ` Wim Coekaerts
2004-07-05 20:14 ` Ian Pratt
0 siblings, 2 replies; 28+ messages in thread
From: Steve Traugott @ 2004-07-05 18:24 UTC (permalink / raw)
To: Ian Pratt; +Cc: xen-devel
On Sun, Jul 04, 2004 at 09:50:57PM +0100, Ian Pratt wrote:
> > I'd need to use the iSCSI in domain 0 approach (other people's
> > guests...), haven't tried it due to lack of hardware targets, didn't get
> > warm fuzzies from Ardistech's code -- you've had no problems with it
> > though?
>
> I've mainly used a NetApp filer h/w target, so I haven't really
> got enough experience to say whether the Ardistech code is
> stable or not. There's always enbd, nbd, gnbd which are all
> simple enough to believe that they work...
I can see how that works using initrd to mount *nbd as root in guest
domains, but what about using *nbd in dom0 and then allocating that as
VDs to the other domains? Is that supposed to work? I remember
something about needing physical raw partitions for VBDs, at least under
1.2. Am I missing something?
(For anyone curious, if using *nbd I would need to keep it in dom0,
rather than in each guest, for both security and maintainability. For a
public Xenoserver, uid 0 on the guests is assumed to be untrusted.)
Steve
--
Stephen G. Traugott (KG6HDQ)
UNIX/Linux Infrastructure Architect, TerraLuna LLC
stevegt@TerraLuna.Org
http://www.stevegt.com -- http://Infrastructures.Org
* Re: QLogic Fibre Channel HBA Support
[not found] <200407050431.i654VV130769@roton.TerraLuna.Org>
@ 2004-07-05 18:33 ` Steve Traugott
2004-07-05 20:19 ` Ian Pratt
0 siblings, 1 reply; 28+ messages in thread
From: Steve Traugott @ 2004-07-05 18:33 UTC (permalink / raw)
To: Brian Wolfe; +Cc: xen-devel
Hi Brian,
On Mon, Jul 05, 2004 at 04:31:33AM +0000, Brian Wolfe wrote:
> As for stable enough to host, I'm using a 1.3 version from May or June
> (can't remember the version off the top of my head) to do virtual
> server hosting for my own machines and for several clients.
> Interestingly enough, because of the setup I moved to I'm getting
> better overall performance using 2 main NFS servers with 8-disk raid-5
> arrays than with individual machines using mirror sets!
When using NFS roots you haven't seen any hung clients? I was getting a
lot of those, got rid of a lot of the hangs by using the /dev/urandom
workaround, but still got a few after that, attributed to a long-lived
Linux NFS client kernel bug. See these threads:
05 May 2004: xenolinux /dev/random
12 May 2004: Xen hangs with NFS root under high loads
I'm sure 1.3 fixes the first, but what about the second?
Steve
--
Stephen G. Traugott (KG6HDQ)
UNIX/Linux Infrastructure Architect, TerraLuna LLC
stevegt@TerraLuna.Org
http://www.stevegt.com -- http://Infrastructures.Org
* Re: QLogic Fibre Channel HBA Support
2004-07-05 18:24 ` Steve Traugott
@ 2004-07-05 18:48 ` Wim Coekaerts
2004-07-05 20:22 ` Ian Pratt
2004-07-05 20:14 ` Ian Pratt
1 sibling, 1 reply; 28+ messages in thread
From: Wim Coekaerts @ 2004-07-05 18:48 UTC (permalink / raw)
To: Steve Traugott; +Cc: Ian Pratt, xen-devel
being able to share devices across domains is interesting...
in particular for us ... I would hope that it will be possible to do at
some point.
we have cluster filesystems that can run on each domain. as long as we
have persistent shared block devices we should be fine.
in fact this way we can run a database in cluster mode, as we require
shared disk. without it, xen's usability there would be limited.
might not matter to most but I'm playing around with it for the db to
see if this is useful for us.
there is a lot of potential
On Mon, Jul 05, 2004 at 11:24:26AM -0700, Steve Traugott wrote:
> On Sun, Jul 04, 2004 at 09:50:57PM +0100, Ian Pratt wrote:
> > > I'd need to use the iSCSI in domain 0 approach (other people's
> > > guests...), haven't tried it due to lack of hardware targets, didn't get
> > > warm fuzzies from Ardistech's code -- you've had no problems with it
> > > though?
> >
> > I've mainly used a NetApp filer h/w target, so I haven't really
> > got enough experience to say whether the Ardistech code is
> > stable or not. There's always enbd, nbd, gnbd which are all
> > simple enough to believe that they work...
>
> I can see how that works using initrd to mount *nbd as root in guest
> domains, but what about using *nbd in dom0 and then allocating that as
> VDs to the other domains? Is that supposed to work? I remember
> something about needing physical raw partitions for VBDs, at least under
> 1.2. Am I missing something?
>
> (For anyone curious, if using *nbd I would need to keep it in dom0,
> rather than in each guest, for both security and maintainability. For a
> public Xenoserver, uid 0 on the guests is assumed to be untrusted.)
>
> Steve
* Re: QLogic Fibre Channel HBA Support
@ 2004-07-05 18:54 Brian Wolfe
0 siblings, 0 replies; 28+ messages in thread
From: Brian Wolfe @ 2004-07-05 18:54 UTC (permalink / raw)
To: Steve Traugott; +Cc: xen-devel
Yeah, I'm still seeing a hang on the very last domain in the list. I
haven't been able to track down what is causing it exactly. Up until
now I thought that its unique aspect of running apache with php5 beta 1
was causing it, with OOM due to php5b1 memory leaks....
Other than that, none of the other domains are hanging... even with
nfs1 periodically rebooting spontaneously (hardware issue, I'm tracking
it).
I'm waiting for 2.0-alpha to be officially released before I start
monkeying with the bleeding edge again. Once that happens we will start
prototyping the xenctld in python using the existing tools.
* Re: QLogic Fibre Channel HBA Support
2004-07-05 18:24 ` Steve Traugott
2004-07-05 18:48 ` Wim Coekaerts
@ 2004-07-05 20:14 ` Ian Pratt
2004-07-05 22:53 ` Steve Traugott
1 sibling, 1 reply; 28+ messages in thread
From: Ian Pratt @ 2004-07-05 20:14 UTC (permalink / raw)
To: Steve Traugott; +Cc: Ian Pratt, xen-devel
> > I've mainly used a NetApp filer h/w target, so I haven't really
> > got enough experience to say whether the Ardistech code is
> > stable or not. There's always enbd, nbd, gnbd which are all
> > simple enough to believe that they work...
>
> I can see how that works using initrd to mount *nbd as root in guest
> domains, but what about using *nbd in dom0 and then allocating that as
> VDs to the other domains? Is that supposed to work? I remember
> something about needing physical raw partitions for VBDs, at least under
> 1.2. Am I missing something?
You can, in principle, export anything that dom0 sees as a block
device e.g. sda7, nbd0, vg0 as a block device in another domain
e.g. sda1, hda1.
Hence, you should be able to implement the equivalent of Xen 1.2
VDs just by using Linux's existing LVM mechanism.
The current tools don't quite allow the full set of functionality
that Xen implements, but adding support for LVM partitions should be
easy.
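A rough sketch of how such a dom0-to-guest device mapping might be written down, assuming the 'phy:backend,frontend,mode' disk string used in xm domain config files from later Xen releases. The tools in the unstable tree at the time of this thread may have used a different syntax, and the helper name disk_spec is hypothetical.

```python
def disk_spec(backend_dev, frontend_dev, writable=True):
    """Build a 'phy:' disk string mapping a dom0 block device to a guest device.

    Assumption: the 'phy:<dom0 device>,<guest device>,<r|w>' format of later
    xm config files; it is not confirmed for the tree discussed here.
    """
    for dev in (backend_dev, frontend_dev):
        if not dev or "," in dev:
            raise ValueError("bad device name: %r" % (dev,))
    mode = "w" if writable else "r"
    return "phy:%s,%s,%s" % (backend_dev, frontend_dev, mode)


# A dom0 partition exported read-write as the guest's sda1, and an nbd
# device set up in dom0 exported read-only as the guest's hda1.
disk = [
    disk_spec("sda7", "sda1"),
    disk_spec("nbd0", "hda1", writable=False),
]
print(disk)  # ['phy:sda7,sda1,w', 'phy:nbd0,hda1,r']
```

The guest only ever sees sda1 and hda1; whether the backing store is a local partition, an nbd device or an LVM volume stays a domain 0 detail.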
> (For anyone curious, if using *nbd I would need to keep it in dom0,
> rather than in each guest, for both security and maintainability. For a
> public Xenoserver, uid 0 on the guests is assumed to be untrusted.)
Each domain could use nbd directly and have its own separate area
of disk and its own uid space. The only issue would be that if
you were using nbd in each domain directly it would be quite
apparent to the user of each VM. E.g. an initrd would be required
to use nbd as a root file system. By running nbd in domain0 you
can hide all of this stuff from the other domains.
Ian
* Re: QLogic Fibre Channel HBA Support
2004-07-05 18:33 ` Steve Traugott
@ 2004-07-05 20:19 ` Ian Pratt
2004-07-05 22:02 ` Steve Traugott
0 siblings, 1 reply; 28+ messages in thread
From: Ian Pratt @ 2004-07-05 20:19 UTC (permalink / raw)
To: Steve Traugott; +Cc: Brian Wolfe, xen-devel, Ian.Pratt
> When using NFS roots you haven't seen any hung clients? I was getting a
> lot of those, got rid of a lot of the hangs by using the /dev/urandom
> workaround, but still got a few after that, attributed to a long-lived
> Linux NFS client kernel bug. See these threads:
>
> 05 May 2004: xenolinux /dev/random
> 12 May 2004: Xen hangs with NFS root under high loads
I'd be interested to know if /dev/random generates enough entropy
to work as expected in xen unstable. I'd certainly hope that it
would.
I've seen standard Linux 2.4.26 hang with NFS root under high
load. 2.4.27 is about to come out. If we're really lucky someone
might have fixed it.
BTW: I've found enabling swap to a block device seems to
significantly reduce the likelihood of a hang with NFS root.
Ian
* Re: QLogic Fibre Channel HBA Support
2004-07-05 18:48 ` Wim Coekaerts
@ 2004-07-05 20:22 ` Ian Pratt
0 siblings, 0 replies; 28+ messages in thread
From: Ian Pratt @ 2004-07-05 20:22 UTC (permalink / raw)
To: Wim Coekaerts; +Cc: Steve Traugott, Ian Pratt, xen-devel
> being able to share devices across domains is interesting...
> in particular for us ... I would hope that it will be possible to do at
> some point.
There should be no problem with multiple writer sharing of
devices across domains. Of course, make sure you're using a file
system that expects this (e.g. GFS) otherwise you'll trash the
file system for sure!
Xen 1.2 supported this too, but the control tools deliberately
made it hard to do (you had to enable 'expert mode') otherwise
people would accidentally try sharing ext2/3 file systems and end
up trashing them.
Ian
* Re: QLogic Fibre Channel HBA Support
2004-07-05 20:19 ` Ian Pratt
@ 2004-07-05 22:02 ` Steve Traugott
0 siblings, 0 replies; 28+ messages in thread
From: Steve Traugott @ 2004-07-05 22:02 UTC (permalink / raw)
To: Ian Pratt; +Cc: Brian Wolfe, xen-devel
On Mon, Jul 05, 2004 at 09:19:18PM +0100, Ian Pratt wrote:
> BTW: I've found enabling swap to a block device seems to
> significantly reduce the likelihood of a hang with NFS root.
I've been using local disk VD swap, NFS root.
--
Stephen G. Traugott (KG6HDQ)
UNIX/Linux Infrastructure Architect, TerraLuna LLC
stevegt@TerraLuna.Org
http://www.stevegt.com -- http://Infrastructures.Org
* Re: QLogic Fibre Channel HBA Support
[not found] <200407051852.i65IqL124320@roton.TerraLuna.Org>
@ 2004-07-05 22:12 ` Steve Traugott
0 siblings, 0 replies; 28+ messages in thread
From: Steve Traugott @ 2004-07-05 22:12 UTC (permalink / raw)
To: Brian Wolfe; +Cc: xen-devel
On Mon, Jul 05, 2004 at 06:52:21PM +0000, Brian Wolfe wrote:
> Yeah, I'm still seeing a hang on the very last domain in the list.
*Interesting* -- I hadn't noticed anything like that, but I wasn't
looking for it. It's not always the same root filesystem, is it? If
you shuffle the order is it still always the last one that hangs?
> I haven't been able to track down what is causing it exactly. Up until
> now I thought that its unique aspect of running apache with php5 beta
> 1 was causing it with OOM due to php5b1 memory leaks....
That sounds an awful lot like the /dev/random bug we had in 1.2. I was
never able to duplicate it on purpose, but after linking /dev/random ->
/dev/urandom most of the hangs went away.
Steve
--
Stephen G. Traugott (KG6HDQ)
UNIX/Linux Infrastructure Architect, TerraLuna LLC
stevegt@TerraLuna.Org
http://www.stevegt.com -- http://Infrastructures.Org
* Re: QLogic Fibre Channel HBA Support
2004-07-05 20:14 ` Ian Pratt
@ 2004-07-05 22:53 ` Steve Traugott
2004-07-05 23:23 ` Ian Pratt
0 siblings, 1 reply; 28+ messages in thread
From: Steve Traugott @ 2004-07-05 22:53 UTC (permalink / raw)
To: Ian Pratt; +Cc: xen-devel
On Mon, Jul 05, 2004 at 09:14:15PM +0100, Ian Pratt wrote:
> You can, in principle, export anything that dom0 sees as a block
> device e.g. sda7, nbd0, vg0 as a block device in another domain
> e.g. sda1, hda1.
What about loop devices? Should they work? This lookup_raw_partn()
exception (in 1.2) might be what threw me off:
xendev1:~# dd if=/dev/zero of=/tmp/d0 bs=1M count=100
100+0 records in
100+0 records out
xendev1:~# losetup /dev/loop0 /tmp/d0
xendev1:~# xc_vd_tool.py initialise /dev/loop0 4
Formatting for virtual disks
Device: /dev/loop0
Extent size: 4MB
Traceback (most recent call last):
File "/usr/bin/xc_vd_tool.py", line 53, in ?
rc = XenoUtil.vd_format(dev, extent_size)
File "/usr/lib/python2.2/site-packages/XenoUtil.py", line 261, in vd_format
part_info = lookup_raw_partn(partition)[0]
TypeError: unsubscriptable object
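The last frame of the traceback subscripts the return value of lookup_raw_partn() directly. "unsubscriptable object" is what Python raises when that value is something like None, which would fit a lookup that does not recognise /dev/loop0 as a partition. The sketch below reproduces that failure mode and shows one way a caller could guard against it. Everything in it is an assumption for illustration: lookup_raw_partn_stub and its return shape are hypothetical stand-ins, not the real XenoUtil code.

```python
def lookup_raw_partn_stub(partition):
    """Hypothetical stand-in for a partition lookup.

    Assumed behaviour: a list of matches for a known partition,
    None for anything it does not recognise.
    """
    known = {"/dev/sda7": [{"device": "/dev/sda7"}]}  # made-up entry
    return known.get(partition)


def first_partition_info(partition, lookup=lookup_raw_partn_stub):
    """Return the first match, or raise a readable error instead of
    letting a None result reach the [0] subscript."""
    result = lookup(partition)
    if not result:
        raise ValueError(f"{partition}: not recognised as a raw partition")
    return result[0]


if __name__ == "__main__":
    # Unguarded subscript, as in the traceback: fails with TypeError.
    try:
        lookup_raw_partn_stub("/dev/loop0")[0]
    except TypeError as exc:
        print("unguarded:", exc)

    # Guarded version: fails with a message naming the device.
    try:
        first_partition_info("/dev/loop0")
    except ValueError as exc:
        print("guarded:", exc)
```

A guard like this only improves the error message; it does not make loop devices usable.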
> Hence, you should be able to implement the equivalent of Xen 1.2
> VDs just by using Linux's existing LVM mechanism.
I'm trying to picture how this would work -- LVM volume groups on top of
the dom0 VBDs, and then xc.vbd_create() to assign LVM logical volumes to
other domains?
> The current tools don't quite allow the full set of functionality
> that Xen implements, but adding support for LVM partitions should be
> easy.
What's missing?
Steve
--
Stephen G. Traugott (KG6HDQ)
UNIX/Linux Infrastructure Architect, TerraLuna LLC
stevegt@TerraLuna.Org
http://www.stevegt.com -- http://Infrastructures.Org
* Re: QLogic Fibre Channel HBA Support
2004-07-05 22:53 ` Steve Traugott
@ 2004-07-05 23:23 ` Ian Pratt
0 siblings, 0 replies; 28+ messages in thread
From: Ian Pratt @ 2004-07-05 23:23 UTC (permalink / raw)
To: Steve Traugott; +Cc: Ian Pratt, xen-devel
> On Mon, Jul 05, 2004 at 09:14:15PM +0100, Ian Pratt wrote:
> > You can, in principle, export anything that dom0 sees as a block
> > device e.g. sda7, nbd0, vg0 as a block device in another domain
> > e.g. sda1, hda1.
>
> What about loop devices? Should they work? This lookup_raw_partn()
> exception (in 1.2) might be what threw me off:
They won't work in 1.2, but unstable tree has the capability
internally.
I fear the current user space tools may still currently give you
the lookup error, but I hope we can get this fixed soon. [It's
possible that just commenting out the partition check might
work...]
I haven't tried it, but using a combination of a loop back device
(pointing at a sparse file) and Bin Ren's CoW block driver should
then provide CoW sparse disks. Alternatively, it may be possible
to do something similar with LVM2 snapshots.
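For reference, a sparse backing file of the kind a loopback device could point at can be created without writing any data. This is a minimal sketch, assuming a filesystem that supports holes; the function name and the 100 MB size (chosen to match the dd example earlier in the thread) are illustrative only, and it does not cover the loop device or CoW driver setup.

```python
import os
import tempfile


def create_sparse_file(path, size_bytes):
    """Create (or truncate) a file whose apparent size is size_bytes.

    No data is written, so on filesystems that support holes the file
    occupies almost no disk space until blocks are actually written.
    """
    with open(path, "wb") as f:
        f.truncate(size_bytes)
    return os.stat(path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        st = create_sparse_file(os.path.join(tmp, "d0"), 100 * 1024 * 1024)
        # On Linux, st_blocks counts 512-byte units actually allocated.
        print("apparent size:", st.st_size)
        print("allocated bytes:", st.st_blocks * 512)
```

Unlike dd from /dev/zero, which writes every block, truncating to the target size leaves allocation to happen lazily as the guest writes.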
Ian
* Re: QLogic Fibre Channel HBA Support
@ 2004-07-06 3:40 Brian Wolfe
0 siblings, 0 replies; 28+ messages in thread
From: Brian Wolfe @ 2004-07-06 3:40 UTC (permalink / raw)
To: Steve Traugott; +Cc: xen-devel
[-- Attachment #1: Type: text/plain, Size: 1200 bytes --]
Well, after having to restart another of the domains on that xen server I discovered that the last in the list is the one that would lock up once every 24 hours. (not certain of the exact timeframe since it locks during the night.)
I've disabled the cron entries that generate high workload to eliminate that as a possible cause. Next is to try running the virtual server on another xen host that is running an older version of xen and xenolinux, see if that still hangs it.
I'll report more as I find it. However, I'm still swamped with house repairs. 8-( a 10' section of wall that's destroyed by termites, a bad kitchen remodeling that has gone rotten (literally), and a bathroom shower install where they "forgot" to add the drain part that mates the drain hole to the drain pipe. I may have to post a pic of the shower issue to show how stupid it is. ;-P
After that I've still got lots of catchup to do since I've been swamped for the last 2 months. *sigh* This had better be worth it, getting into a house that is. Either way, in approximately 2 to 3 weeks I'll be back full force working on xen setups and beating the crap out of the system. ;) Expect to receive LOTS of nitpicking. *grin*
[-- Attachment #2: Type: text/plain, Size: 83 bytes --]
On Mon, 5 Jul 2004 15:12:56 -0700 Steve Traugott <stevegt@TerraLuna.Org>
said...
[-- Attachment #3: Type: text/plain, Size: 899 bytes --]
On Mon, Jul 05, 2004 at 06:52:21PM +0000, Brian Wolfe wrote:
> Yeah, I'm still seeing a hang on the very last domain in the list.
*Interesting* -- I hadn't noticed anything like that, but I wasn't
looking for it. It's not always the same root filesystem, is it? If
you shuffle the order is it still always the last one that hangs?
> I haven't been able to track down what is causing it exactly. Up until
> now I thought that its unique aspect of running apache with php5 beta
> 1 was causing it with OOM due to php5b1 memory leaks....
That sounds an awful lot like the /dev/random bug we had in 1.2. I was
never able to duplicate it on purpose, but after linking /dev/random ->
/dev/urandom most of the hangs went away.
Steve
--
Stephen G. Traugott (KG6HDQ)
UNIX/Linux Infrastructure Architect, TerraLuna LLC
stevegt@TerraLuna.Org
http://www.stevegt.com -- http://Infrastructures.Org
* Re: QLogic Fibre Channel HBA Support
[not found] <200407060323.i663Nl107730@roton.TerraLuna.Org>
@ 2004-07-06 5:06 ` Steve Traugott
2004-07-06 6:22 ` Keir Fraser
2004-07-06 7:26 ` Ian Pratt
0 siblings, 2 replies; 28+ messages in thread
From: Steve Traugott @ 2004-07-06 5:06 UTC (permalink / raw)
To: Brian Wolfe; +Cc: xen-devel
On Tue, Jul 06, 2004 at 03:23:47AM +0000, Brian Wolfe wrote:
> Well, after having to restart another of the domains on that xen
> server I discovered that the last in the list is the one that would
> lock up once every 24 hours. (not certain of the exact timeframe since
> it locks during the night.)
>
> I've disabled the cron entries that generate high workload to
> eliminate that as a possible cause.
This may do it -- once I told my beta users that high workloads and/or
lots of entropy use could cause hangs, they started nice'ing and
--bwlimit'ing their ssh rsyncs, and there hasn't been a hang for a few
weeks. No proof whether I was hitting another entropy bug or just
workload. (Reminder to lurkers -- I'm running 1.2 with /dev/random
major,minor set to 1,9 -- same as /dev/urandom.)
> Next is to try running the virtual server on another xen host that is
> running an older version of xen and xenolinux, see if that still hangs
> it.
I think it should get worse if it changes at all. Your /dev/random is
1,8, right?
> I'll report more as I find it. However, I'm still swamped with house
> repairs. 8-( a 10' section of wall that's destroyed by termites, a bad
> kitchen remodeling that has gone rotten (literally), and a bathroom
> shower install where they "forgot" to add the drain part that mates
> the drain hole to the drain pipe. I may have to post a pic of the
> shower issue to show how stupid it is. ;-P
I feel your pain. ;-) Building a home server closet here (this is
silly valley; it'll probably even increase the house value). It's been
an 18-month project so far, 80 amp subpanel, dedicated air conditioner
in the closet, that sort of thing. Lots of conduit, wiring, structural,
attic, and drywall work. Gotta deal with the fiberglass insulation this
week -- my wife says I keep putting that part off, I keep telling her
she has no idea how itchy that stuff is... :-} The good news is, when
it's finally done it'll be able to serve as a minimal disaster recovery
site for the Xen infrastructure we're building at our shop. Woo hoo.
Steve
--
Stephen G. Traugott (KG6HDQ)
UNIX/Linux Infrastructure Architect, TerraLuna LLC
stevegt@TerraLuna.Org
http://www.stevegt.com -- http://Infrastructures.Org
* Re: QLogic Fibre Channel HBA Support
2004-07-06 5:06 ` QLogic Fibre Channel HBA Support Steve Traugott
@ 2004-07-06 6:22 ` Keir Fraser
2004-07-06 7:26 ` Ian Pratt
1 sibling, 0 replies; 28+ messages in thread
From: Keir Fraser @ 2004-07-06 6:22 UTC (permalink / raw)
To: Steve Traugott; +Cc: Brian Wolfe, xen-devel
> On Tue, Jul 06, 2004 at 03:23:47AM +0000, Brian Wolfe wrote:
> > Well, after having to restart another of the domains on that xen
> > server I discovered that the last in the list is the one that would
> > lock up once every 24 hours. (not certain of the exact timeframe since
> > it locks during the night.)
> >
> > I've disabled the cron entries that generate high workload to
> > eliminate that as a possible cause.
>
> This may do it -- once I told my beta users that high workloads and/or
> lots of entropy use could cause hangs, they started nice'ing and
> --bwlimit'ing their ssh rsyncs, and there hasn't been a hang for a few
> weeks. No proof whether I was hitting another entropy bug or just
> workload. (Reminder to lurkers -- I'm running 1.2 with /dev/random
> major,minor set to 1,9 -- same as /dev/urandom.)
>
> > Next is to try running the virtual server on another xen host that is
> > running an older version of xen and xenolinux, see if that still hangs
> > it.
>
> I think it should get worse if it changes at all. Your /dev/random is
> 1,8, right?
If you use /dev/urandom (1,9) then you will never block while reading
entropy. The /only/ time anything can block while extracting entropy is
when a user application reads from /dev/random (1,8).
If you've linked /dev/random -> /dev/urandom then any lockups must be
due to bugs elsewhere.
Compare:
while [ true ] ; do dd if=/dev/urandom of=/dev/null bs=512 count=1024 2>/dev/null && echo -n "." ; done
while [ true ] ; do dd if=/dev/random of=/dev/null bs=512 count=1024 2>/dev/null && echo -n "." ; done
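Before running the comparison it can help to confirm which device each path actually resolves to, since a symlink or a re-made node (1,9 instead of 1,8) changes which behaviour you get. The following is a minimal sketch for Linux; the function names are illustrative, and the blocking/non-blocking labels simply restate the distinction described above.

```python
import os
import stat

# Linux character-device numbers discussed in this thread.
RANDOM_DEV = (1, 8)   # /dev/random: may block when entropy runs low
URANDOM_DEV = (1, 9)  # /dev/urandom: never blocks


def classify(major, minor):
    """Name the behaviour implied by a device's (major, minor) pair."""
    if (major, minor) == RANDOM_DEV:
        return "blocking"
    if (major, minor) == URANDOM_DEV:
        return "non-blocking"
    return "unknown"


def device_numbers(path):
    """Return (major, minor) for a character device, or None otherwise.

    os.stat() follows symlinks, so a /dev/random -> /dev/urandom link
    reports the numbers of /dev/urandom.
    """
    try:
        st = os.stat(path)
    except OSError:
        return None
    if not stat.S_ISCHR(st.st_mode):
        return None
    return os.major(st.st_rdev), os.minor(st.st_rdev)


if __name__ == "__main__":
    for path in ("/dev/random", "/dev/urandom"):
        nums = device_numbers(path)
        label = classify(*nums) if nums else "not a character device"
        print(path, nums, label)
```

If /dev/random reports (1, 9), both dd loops above read from the same non-blocking source and should behave identically.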
-- Keir
* Re: QLogic Fibre Channel HBA Support
2004-07-06 5:06 ` QLogic Fibre Channel HBA Support Steve Traugott
2004-07-06 6:22 ` Keir Fraser
@ 2004-07-06 7:26 ` Ian Pratt
2004-07-06 12:53 ` Steve Traugott
1 sibling, 1 reply; 28+ messages in thread
From: Ian Pratt @ 2004-07-06 7:26 UTC (permalink / raw)
To: Steve Traugott; +Cc: Brian Wolfe, xen-devel, Ian.Pratt
> No proof whether I was hitting another entropy bug or just
> workload. (Reminder to lurkers -- I'm running 1.2 with /dev/random
> major,minor set to 1,9 -- same as /dev/urandom.)
If you temporarily run out of entropy, it should only ever be the
particular user-space process that's reading from /dev/random
that blocks. Everything else should carry on fine in the meantime.
In the unstable tree, AFAIK all interrupt sources are correctly
adding entropy to the kernel's entropy pool -- there are just
fewer bits of entropy generated per second in a VM.
If people are still finding that some heavy users of /dev/random
are blocking unexpectedly (e.g. apache during startup) then we'll
need to think what to do. One grim hack would be to modify the
guest kernel to make it less conservative about its estimate of
entropy generated. An alternative would be to have Xen handle
entropy generation centrally (from all physical interrupt
sources) and then have a special random driver in each guest.
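One way to check whether a guest is short of entropy is to sample the kernel's own estimate over time. The sketch below assumes a Linux guest that exposes /proc/sys/kernel/random/entropy_avail; the function names and the sampling interval are illustrative, not part of any Xen tool.

```python
import time

ENTROPY_AVAIL = "/proc/sys/kernel/random/entropy_avail"


def parse_entropy(text):
    """Convert the file's contents (a decimal bit count) to an int."""
    return int(text.strip())


def read_entropy(path=ENTROPY_AVAIL):
    """Return the kernel's current entropy estimate in bits, or None
    if the file is missing or unreadable."""
    try:
        with open(path) as f:
            return parse_entropy(f.read())
    except (OSError, ValueError):
        return None


def sample(count=3, interval=1.0, path=ENTROPY_AVAIL):
    """Collect `count` readings, `interval` seconds apart."""
    readings = []
    for i in range(count):
        readings.append(read_entropy(path))
        if i < count - 1:
            time.sleep(interval)
    return readings


if __name__ == "__main__":
    print(sample(count=3, interval=0.5))
```

Readings that sit near zero while a daemon such as apache is starting would point at entropy starvation rather than some other cause.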
Ian
* Re: QLogic Fibre Channel HBA Support
2004-07-06 7:26 ` Ian Pratt
@ 2004-07-06 12:53 ` Steve Traugott
2004-07-06 13:31 ` Ian Pratt
0 siblings, 1 reply; 28+ messages in thread
From: Steve Traugott @ 2004-07-06 12:53 UTC (permalink / raw)
To: Ian Pratt; +Cc: Brian Wolfe, xen-devel
On Tue, Jul 06, 2004 at 08:26:43AM +0100, Ian Pratt wrote:
>
> > No proof whether I was hitting another entropy bug or just
> > workload. (Reminder to lurkers -- I'm running 1.2 with /dev/random
> > major,minor set to 1,9 -- same as /dev/urandom.)
>
> If you temporarily run out of entropy, it should only ever be the
> particular user-space process that's reading from /dev/random
> that blocks. Everything else should carry on fine in the meantime.
A little hard to tell -- in our case, hangs happen days after startup,
are pingable, but can't ssh in, don't respond to http requests, don't
show new syslog entries, etc. Some of these (maybe not syslog) might be
explained by named hanging, for instance, since it uses /dev/random (but
I haven't looked to see what named does with it -- maybe only sortlist
randomization). That plus a dose of not knowing what to look for at the
time could have made them look like a total inability to execute
userspace code and/or write to the root filesystem.
Are these symptoms consistent with what you know about the NFS bug?
I now have users executing a simple controller script themselves via ssh
to reboot machines when they hang -- it might make sense to start adding
diagnostic data collection to that. Right now it only makes a short log
entry -- and what I said a couple of days ago about "no hangs in a
month" was wrong. There are recent log entries; people have just been
rebooting their own hangs and have stopped telling me.
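A first step toward that diagnostic collection could be recording, just before each reboot, how the hung guest looks from outside. The sketch below is an assumption about what such a probe might do, not the actual controller script: it separates "no TCP connection at all" from "handshake completed but the service never spoke", which roughly maps to the "pingable but can't ssh in" symptom. Function names and port choices are illustrative.

```python
import socket
import time


def probe(host, port, timeout=3.0):
    """Classify one TCP service as seen from outside the guest.

    "unreachable": the connection attempt failed or timed out
    "silent":      the TCP handshake completed but no data arrived
    "responding":  the service sent data (e.g. an sshd version banner)
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as s:
            try:
                data = s.recv(256)
            except socket.timeout:
                return "silent"
            return "responding" if data else "silent"
    except OSError:
        return "unreachable"


def log_line(host, ports, timeout=3.0):
    """Build one timestamped log entry covering several ports."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    results = " ".join(f"{p}={probe(host, p, timeout)}" for p in ports)
    return f"{stamp} {host} {results}"


if __name__ == "__main__":
    # Port 22 is used as an example; sshd normally sends a banner first.
    print(log_line("localhost", [22], timeout=1.0))
```

"silent" is only meaningful for services that speak first, such as sshd; an HTTP server waits for the client's request, so it will look silent even when healthy.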
> In the unstable tree, AFAIK all interrupt sources are correctly
> adding entropy to the kernel's entropy pool -- there are just
> fewer bits of entropy generated per second in a VM.
>
> If people are still finding that some heavy users of /dev/random
> are blocking unexpectedly (e.g. apache during startup) then we'll
> need to think what to do. One grim hack would be to modify the
> guest kernel to make it less conservative about its estimate of
> entropy generated. An alternative would be to have Xen handle
> entropy generation centrally (from all physical interrupt
> sources) and then have a special random driver in each guest.
The only thing that bothers me about the latter is the special driver
needed in the guests. On the upside, I bet you could generate plenty of
entropy if it were done centrally. If so, I'm wondering if it would be
mathematically safe to include non-interrupt activities of other guests
in the pool as well.
Steve
--
Stephen G. Traugott (KG6HDQ)
UNIX/Linux Infrastructure Architect, TerraLuna LLC
stevegt@TerraLuna.Org
http://www.stevegt.com -- http://Infrastructures.Org
* Re: QLogic Fibre Channel HBA Support
2004-07-06 12:53 ` Steve Traugott
@ 2004-07-06 13:31 ` Ian Pratt
2004-07-06 13:59 ` Steve Traugott
0 siblings, 1 reply; 28+ messages in thread
From: Ian Pratt @ 2004-07-06 13:31 UTC (permalink / raw)
To: Steve Traugott; +Cc: Ian Pratt, Brian Wolfe, xen-devel
> > If you temporarily run out of entropy, it should only ever be the
> > particular user-space process that's reading from /dev/random
> > that blocks. Everything else should carry on fine in the meantime.
>
> A little hard to tell -- in our case, hangs happen days after startup,
> are pingable, but can't ssh in, don't respond to http requests, don't
> show new syslog entries, etc. Some of these (maybe not syslog) might be
> explained by named hanging, for instance, since it uses /dev/random (but
> I haven't looked to see what named does with it -- maybe only sortlist
> randomization). That plus a dose of not knowing what to look for at the
> time could have made them look like a total inability to execute
> userspace code and/or write to the root filesystem.
If you've got a console connection it should be pretty easy to
tell whether it's just one process blocked on /dev/random or the
whole of the kernel sick due to an NFS deadlock.
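From a console login, one rough way to tell the two cases apart is to look at process states. The sketch below lists processes in uninterruptible sleep (state D in /proc/<pid>/stat), which is where tasks stuck on a hard NFS mount typically end up, whereas a reader waiting on /dev/random normally sleeps interruptibly. That mapping is a rule of thumb rather than a guarantee, and the function names are illustrative.

```python
import os


def parse_stat(line):
    """Extract (pid, command, state) from a /proc/<pid>/stat line.

    The command is wrapped in parentheses and may itself contain spaces
    or parentheses, so split on the last ')' rather than on whitespace.
    """
    open_paren = line.index("(")
    close_paren = line.rindex(")")
    pid = int(line[:open_paren])
    comm = line[open_paren + 1:close_paren]
    state = line[close_paren + 1:].split()[0]
    return pid, comm, state


def uninterruptible(proc_root="/proc"):
    """List (pid, command) for every process currently in state 'D'."""
    found = []
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return found
    for entry in entries:
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited or entry unreadable
        if state == "D":
            found.append((pid, comm))
    return found


if __name__ == "__main__":
    for pid, comm in uninterruptible():
        print(pid, comm)
```

Many unrelated processes stuck in D at once would suggest a kernel-level stall; a single sleeping reader of /dev/random would not show up here.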
I believe the former to be most likely solved, which does suggest
the latter. Once you have iSCSI or the SAN set up I'd hope the
hangs disappear.
Ian
* Re: QLogic Fibre Channel HBA Support
2004-07-06 13:31 ` Ian Pratt
@ 2004-07-06 13:59 ` Steve Traugott
2004-07-06 14:21 ` Ian Pratt
0 siblings, 1 reply; 28+ messages in thread
From: Steve Traugott @ 2004-07-06 13:59 UTC (permalink / raw)
To: Ian Pratt; +Cc: Brian Wolfe, xen-devel
On Tue, Jul 06, 2004 at 02:31:48PM +0100, Ian Pratt wrote:
> If you've got a console connection it should be pretty easy to
> tell whether it's just one process blocked on /dev/random or the
> whole of the kernel sick due to an NFS deadlock.
What would we be looking at? 'q' output?
> I believe the former to be most likely solved, which does suggest
> the latter. Once you have iSCSI or the SAN set up I'd hope the
> hangs disappear.
I'm betting that's the case -- otherwise I'm going to have a small
surplus of fibre channel gear. ;-)
It's going to be at least a week before the equipment arrives, another
week for me to get a chance to build and evaluate unstable on it. I'll
keep the list posted.
Steve
--
Stephen G. Traugott (KG6HDQ)
UNIX/Linux Infrastructure Architect, TerraLuna LLC
stevegt@TerraLuna.Org
http://www.stevegt.com -- http://Infrastructures.Org
* Re: QLogic Fibre Channel HBA Support
2004-07-06 13:59 ` Steve Traugott
@ 2004-07-06 14:21 ` Ian Pratt
0 siblings, 0 replies; 28+ messages in thread
From: Ian Pratt @ 2004-07-06 14:21 UTC (permalink / raw)
To: Steve Traugott; +Cc: Ian Pratt, Brian Wolfe, xen-devel
> On Tue, Jul 06, 2004 at 02:31:48PM +0100, Ian Pratt wrote:
> > If you've got a console connection it should be pretty easy to
> > tell whether it's just one process blocked on /dev/random or the
> > whole of the kernel sick due to an NFS deadlock.
>
> What would we be looking at? 'q' output?
I was referring to the domain's console rather than the serial
console.
In the unstable tree, if you're running a 'getty' on the domain's
tty0 you should be able to log in and poke around even if
the domain's networking or sshd is screwed. The domain's console
will be presented as a TCP socket on domain0 (e.g.
"xencons localhost 9601" )
For domain0, you should be able to login over the serial line
with a getty on ttyS0.
Ian
Thread overview: 28+ messages
[not found] <200407060323.i663Nl107730@roton.TerraLuna.Org>
2004-07-06 5:06 ` QLogic Fibre Channel HBA Support Steve Traugott
2004-07-06 6:22 ` Keir Fraser
2004-07-06 7:26 ` Ian Pratt
2004-07-06 12:53 ` Steve Traugott
2004-07-06 13:31 ` Ian Pratt
2004-07-06 13:59 ` Steve Traugott
2004-07-06 14:21 ` Ian Pratt
2004-07-06 3:40 Brian Wolfe
[not found] <200407051852.i65IqL124320@roton.TerraLuna.Org>
2004-07-05 22:12 ` Steve Traugott
-- strict thread matches above, loose matches on Subject: below --
2004-07-05 18:54 Brian Wolfe
[not found] <200407050431.i654VV130769@roton.TerraLuna.Org>
2004-07-05 18:33 ` Steve Traugott
2004-07-05 20:19 ` Ian Pratt
2004-07-05 22:02 ` Steve Traugott
2004-07-05 4:39 Brian Wolfe
2004-07-04 8:38 Steve Traugott
2004-07-04 9:02 ` Keir Fraser
2004-07-04 19:47 ` Steve Traugott
2004-07-04 9:11 ` Ian Pratt
2004-07-04 20:31 ` Steve Traugott
2004-07-04 20:50 ` Ian Pratt
2004-07-05 18:24 ` Steve Traugott
2004-07-05 18:48 ` Wim Coekaerts
2004-07-05 20:22 ` Ian Pratt
2004-07-05 20:14 ` Ian Pratt
2004-07-05 22:53 ` Steve Traugott
2004-07-05 23:23 ` Ian Pratt
2004-07-05 9:40 ` Niraj Tolia
2004-07-05 10:11 ` Ian Pratt