* Re: system lockup when starting secondary domains
[not found] <20040513221648.W77678@demos.bsdclusters.com>
@ 2004-05-14 6:35 ` Keir Fraser
2004-05-14 20:56 ` Kip Macy
0 siblings, 1 reply; 26+ messages in thread
From: Keir Fraser @ 2004-05-14 6:35 UTC (permalink / raw)
To: Kip Macy; +Cc: xen-devel
> Good news:
> I can now mount LUNs over iSCSI using the Adaptec HW initiator running
> Adaptec's driver in DOM0.
>
> The bad news is that when I try to export the LUN to another domain, the
> machine stops responding. I've attached the kernel config for both dom0
> and the non-privileged domains as well as the configuration file I'm
> using.
>
> Please let me know of anything I can do to help track this down.
>
>
> Trivia:
> DOM0 stops responding to ping after this. The second domain will start
> responding to ping at some point - but ssh does not appear to be
> starting.
>
> The LUN contains the same contents as the local IDE drive except for
> /etc/sysconfig/network-scripts/ifcfg-eth0 and /etc/fstab.
When you create a new domain, its virtual interface gets bridged to
eth0. Unfortunately this means that eth0 loses IP abilities.
The fix for now is to run a script something like the following before
creating the first domain:
/sbin/ifconfig nbe-br 128.232.38.20 netmask 255.255.240.0 up
/usr/sbin/brctl addif nbe-br eth0
/sbin/ip r d 128.232.32.0/20 dev eth0
/sbin/ip r a 128.232.32.0/20 dev nbe-br
/sbin/ip r d default via 128.232.32.1 dev eth0
/sbin/ip r a default via 128.232.32.1 dev nbe-br
i.e., attach your IP/netmask to device nbe-br. Also, any route that
references eth0 should be replaced with one that refers to nbe-br.
-- Keir
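The six commands above follow one pattern: give the bridge the address, enslave the NIC, then move each route from the NIC to the bridge. A minimal sketch of that pattern for other addresses follows. The function name and its parameters are hypothetical and are not part of any Xen tool; it only builds the command strings and runs nothing.

```python
import ipaddress


def bridge_migration_commands(bridge, nic, address, netmask, gateway):
    """Build the commands that move an IP address and its routes from nic to bridge.

    Hypothetical helper for illustration only; it returns strings and executes nothing.
    """
    # Derive the directly connected network (e.g. 128.232.32.0/20) from address + netmask.
    network = ipaddress.ip_interface(f"{address}/{netmask}").network
    return [
        f"/sbin/ifconfig {bridge} {address} netmask {netmask} up",
        f"/usr/sbin/brctl addif {bridge} {nic}",
        f"/sbin/ip route del {network} dev {nic}",
        f"/sbin/ip route add {network} dev {bridge}",
        f"/sbin/ip route del default via {gateway} dev {nic}",
        f"/sbin/ip route add default via {gateway} dev {bridge}",
    ]


if __name__ == "__main__":
    for cmd in bridge_migration_commands(
        "nbe-br", "eth0", "128.232.38.20", "255.255.240.0", "128.232.32.1"
    ):
        print(cmd)
```

With the values from the message, the route lines come out for 128.232.32.0/20, matching the script above (written there with the abbreviated forms "ip r d" and "ip r a").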
-------------------------------------------------------
This SF.Net email is sponsored by: SourceForge.net Broadband
Sign-up now for SourceForge Broadband and get the fastest
6.0/768 connection for only $19.95/mo for the first 3 months!
http://ads.osdn.com/?ad_id=2562&alloc_id=6184&op=click
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: Re: system lockup when starting secondary domains
2004-05-14 6:35 ` system lockup when starting secondary domains Keir Fraser
@ 2004-05-14 20:56 ` Kip Macy
2004-05-14 21:02 ` never mind was " Kip Macy
2004-05-14 21:25 ` telnet xend Kip Macy
0 siblings, 2 replies; 26+ messages in thread
From: Kip Macy @ 2004-05-14 20:56 UTC (permalink / raw)
To: Keir Fraser; +Cc: xen-devel
Thanks. Next question.
On the console of the domain I'm creating I see:
Checking root filesystem
[/sbin/fsck.ext3 (1) -- /] fsck.ext3 -a /dev/sda3
/dev/sda3 is mounted. e2fsck: Cannot continue, aborting.
[FAILED]
*** An error occurred during the file system check.
*** Dropping you to a shell; the system will reboot
I'm going to disable fsck to work around this - but what am I likely
doing wrong?
-Kip
^ permalink raw reply [flat|nested] 26+ messages in thread
* never mind was Re: Re: system lockup when starting secondary domains
2004-05-14 20:56 ` Kip Macy
@ 2004-05-14 21:02 ` Kip Macy
2004-05-14 21:25 ` telnet xend Kip Macy
1 sibling, 0 replies; 26+ messages in thread
From: Kip Macy @ 2004-05-14 21:02 UTC (permalink / raw)
To: Keir Fraser; +Cc: xen-devel
The root filesystem has to be mounted read-only at boot:
cmdline_root = "root=/dev/sda3 ro"
I had it set to rw.
-Kip
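The failure above is the boot scripts running fsck on a root filesystem that is already mounted read-write, so the kernel command line needs "ro" for the check to proceed. A minimal sketch of a sanity check on such a string before creating a domain follows. The function is hypothetical and not part of the Xen tools; it assumes that when both tokens appear, the later one takes effect.

```python
def root_mount_mode(cmdline):
    """Return 'ro' or 'rw' for a kernel command line, or None if neither token is present.

    Hypothetical helper; assumes a later ro/rw token overrides an earlier one.
    """
    mode = None
    for token in cmdline.split():
        if token in ("ro", "rw"):
            mode = token  # remember the most recent ro/rw token seen
    return mode


if __name__ == "__main__":
    cmdline_root = "root=/dev/sda3 ro"
    if root_mount_mode(cmdline_root) != "ro":
        print("warning: root is not mounted read-only; the boot-time fsck may abort")
    else:
        print("root will be mounted read-only")
```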
^ permalink raw reply [flat|nested] 26+ messages in thread
* telnet xend
2004-05-14 20:56 ` Kip Macy
2004-05-14 21:02 ` never mind was " Kip Macy
@ 2004-05-14 21:25 ` Kip Macy
2004-05-14 21:33 ` Ian Pratt
1 sibling, 1 reply; 26+ messages in thread
From: Kip Macy @ 2004-05-14 21:25 UTC (permalink / raw)
To: Keir Fraser; +Cc: xen-devel
Please remind me what the proper way is to interact with non-privileged
consoles. Without any telnet negotiation no control characters get
transmitted and there is no notion of what type of terminal the client
is running.
-Kip
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: telnet xend
2004-05-14 21:25 ` telnet xend Kip Macy
@ 2004-05-14 21:33 ` Ian Pratt
2004-05-14 21:48 ` Kip Macy
0 siblings, 1 reply; 26+ messages in thread
From: Ian Pratt @ 2004-05-14 21:33 UTC (permalink / raw)
To: Kip Macy; +Cc: Keir Fraser, xen-devel, Ian.Pratt
> Please remind me what the proper way is to interact with non-privileged
> consoles. Without any telnet negotiation no control characters get
> transmitted and there is no notion of what type of terminal the client
> is running.
"xencons <machine> <port>" is what we use. Any raw terminal
program should work.
xend should probably have support to spot a telnet client and do
the necessary negotiation to put the client into raw character
mode.
Ian
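For illustration, the negotiation mentioned above is conventionally done by the server offering the ECHO and SUPPRESS-GO-AHEAD options, which makes common telnet clients switch to character-at-a-time mode. The sketch below only builds the bytes a server would send. It is not xend code, the function names are hypothetical, and the constants are the standard telnet values from RFC 854, 857 and 858.

```python
# Telnet protocol constants (RFC 854, RFC 857, RFC 858).
IAC = 255   # "interpret as command" escape byte
WILL = 251  # sender offers to enable an option on its side
ECHO = 1    # option: the server echoes typed characters
SGA = 3     # option: suppress go-ahead


def char_mode_negotiation():
    """Bytes a server could send first to request character-at-a-time mode (hypothetical helper)."""
    return bytes([IAC, WILL, ECHO, IAC, WILL, SGA])


def escape_console_data(data):
    """Double any 0xFF byte so console output is not mistaken for a telnet command."""
    return data.replace(bytes([IAC]), bytes([IAC, IAC]))


if __name__ == "__main__":
    print(char_mode_negotiation().hex())
```

A raw client such as xencons needs none of this, which is why it works as-is.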
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: telnet xend
2004-05-14 21:33 ` Ian Pratt
@ 2004-05-14 21:48 ` Kip Macy
2004-05-14 22:37 ` Kip Macy
0 siblings, 1 reply; 26+ messages in thread
From: Kip Macy @ 2004-05-14 21:48 UTC (permalink / raw)
To: Ian Pratt; +Cc: Keir Fraser, xen-devel
That works great.
I just ran "shutdown -r now" in the non-privileged domain and I'm now
seeing an endless stream of the messages below. DOM0 is now
unresponsive. Any suggestions?
Thanks for your help.
KERNEL: assertion (flags&MSG_PEEK) failed at tcp.c(1540)
KERNEL: assertion (skb==NULL || before(tp->copied_seq,
TCP_SKB_CB(skb)->end_seq)) failed at tcp.c(1290)
KERNEL: assertion (tp->copied_seq == tp->rcv_nxt ||
(flags&(MSG_PEEK|MSG_TRUNC))) failed at tcp.c(1603)
KERNEL: assertion (flags&MSG_PEEK) failed at tcp.c(1540)
KERNEL: assertion (skb==NULL || before(tp->copied_seq,
TCP_SKB_CB(skb)->end_seq)) failed at tcp.c(1290)
KERNEL: assertion (tp->copied_seq == tp->rcv_nxt ||
(flags&(MSG_PEEK|MSG_TRUNC))) failed at tcp.c(1603)
KERNEL: assertion (flags&MSG_PEEK) failed at tcp.c(1540)
KERNEL: assertion (skb==NULL || before(tp->copied_seq,
TCP_SKB_CB(skb)->end_seq)) failed at tcp.c(1290)
KERNEL: assertion (tp->copied_seq == tp->rcv_nxt ||
(flags&(MSG_PEEK|MSG_TRUNC))) failed at tcp.c(1603)
KERNEL: assertion (flags&MSG_PEEK) failed at tcp.c(1540)
KERNEL: assertion (skb==NULL || before(tp->copied_seq,
TCP_SKB_CB(skb)->end_seq)) failed at tcp.c(1290)
KERNEL: assertion (tp->copied_seq == tp->rcv_nxt ||
(flags&(MSG_PEEK|MSG_TRUNC))) failed at tcp.c(1603)
KERNEL: assertion (flags&MSG_PEEK) failed at tcp.c(1540)
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: telnet xend
2004-05-14 21:48 ` Kip Macy
@ 2004-05-14 22:37 ` Kip Macy
2004-05-14 22:39 ` Kip Macy
2004-05-14 22:49 ` Ian Pratt
0 siblings, 2 replies; 26+ messages in thread
From: Kip Macy @ 2004-05-14 22:37 UTC (permalink / raw)
To: Ian Pratt; +Cc: Keir Fraser, xen-devel
The machine now locks up while spitting out the error message below when
the non-privileged domain is initially *started*.
Let me know what information you need.
-Kip
On Fri, 14 May 2004, Kip Macy wrote:
>
> That works great.
>
> I just did a shutdown -r now in the non-privileged domain and now I'm
> seeing an endless stream of the messages below. DOM0 is now
> unresponsive. Any suggestions?
>
> Thanks for your help.
>
>
> KERNEL: assertion (flags&MSG_PEEK) failed at tcp.c(1540)
> KERNEL: assertion (skb==NULL || before(tp->copied_seq,
> TCP_SKB_CB(skb)->end_seq)) failed at tcp.c(1290)
> KERNEL: assertion (tp->copied_seq == tp->rcv_nxt ||
> (flags&(MSG_PEEK|MSG_TRUNC))) failed at tcp.c(1603)
> KERNEL: assertion (flags&MSG_PEEK) failed at tcp.c(1540)
> KERNEL: assertion (skb==NULL || before(tp->copied_seq,
> TCP_SKB_CB(skb)->end_seq)) failed at tcp.c(1290)
> KERNEL: assertion (tp->copied_seq == tp->rcv_nxt ||
> (flags&(MSG_PEEK|MSG_TRUNC))) failed at tcp.c(1603)
> KERNEL: assertion (flags&MSG_PEEK) failed at tcp.c(1540)
> KERNEL: assertion (skb==NULL || before(tp->copied_seq,
> TCP_SKB_CB(skb)->end_seq)) failed at tcp.c(1290)
> KERNEL: assertion (tp->copied_seq == tp->rcv_nxt ||
> (flags&(MSG_PEEK|MSG_TRUNC))) failed at tcp.c(1603)
> KERNEL: assertion (flags&MSG_PEEK) failed at tcp.c(1540)
> KERNEL: assertion (skb==NULL || before(tp->copied_seq,
> TCP_SKB_CB(skb)->end_seq)) failed at tcp.c(1290)
> KERNEL: assertion (tp->copied_seq == tp->rcv_nxt ||
> (flags&(MSG_PEEK|MSG_TRUNC))) failed at tcp.c(1603)
> KERNEL: assertion (flags&MSG_PEEK) failed at tcp.c(1540)
>
>
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: telnet xend
2004-05-14 22:37 ` Kip Macy
@ 2004-05-14 22:39 ` Kip Macy
2004-05-14 22:49 ` Ian Pratt
1 sibling, 0 replies; 26+ messages in thread
From: Kip Macy @ 2004-05-14 22:39 UTC (permalink / raw)
To: Ian Pratt; +Cc: Keir Fraser, xen-devel
And this is the last thing that the non-priv domain prints out before
the network goes dead:
Binding to the NIS domain: [ OK ]
Listening for an NIS domain server.
Starting automount:
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: telnet xend
2004-05-14 22:37 ` Kip Macy
2004-05-14 22:39 ` Kip Macy
@ 2004-05-14 22:49 ` Ian Pratt
2004-05-14 23:14 ` Kip Macy
2004-05-15 4:26 ` suspending a domain in the ngio world Kip Macy
1 sibling, 2 replies; 26+ messages in thread
From: Ian Pratt @ 2004-05-14 22:49 UTC (permalink / raw)
To: Kip Macy; +Cc: Ian Pratt, Keir Fraser, xen-devel
> The machine now locks up while spitting out the error message below when
> the non-privileged domain is initially *started*.
>
> > KERNEL: assertion (flags&MSG_PEEK) failed at tcp.c(1540)
> > KERNEL: assertion (skb==NULL || before(tp->copied_seq,
> > TCP_SKB_CB(skb)->end_seq)) failed at tcp.c(1290)
> > KERNEL: assertion (tp->copied_seq == tp->rcv_nxt ||
I've never seen anything like this. Did you build the kernel
yourself? What version of gcc? (We use 3.2.2 as per RH9)
Can you reproduce with one of our nightly builds?
The TCP stack is clearly seriously confused. It's hard to imagine
how Xen could cause this.
Ian
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: telnet xend
2004-05-14 22:49 ` Ian Pratt
@ 2004-05-14 23:14 ` Kip Macy
2004-05-15 4:26 ` suspending a domain in the ngio world Kip Macy
1 sibling, 0 replies; 26+ messages in thread
From: Kip Macy @ 2004-05-14 23:14 UTC (permalink / raw)
To: Ian Pratt; +Cc: Keir Fraser, xen-devel
Do your nightly builds have xenolinux binaries for nodev xen?
-Kip
^ permalink raw reply [flat|nested] 26+ messages in thread
* RE: telnet xend
@ 2004-05-14 23:28 Neugebauer, Rolf
2004-05-15 0:46 ` Kip Macy
0 siblings, 1 reply; 26+ messages in thread
From: Neugebauer, Rolf @ 2004-05-14 23:28 UTC (permalink / raw)
To: Ian Pratt, Kip Macy; +Cc: Keir Fraser, xen-devel
> -----Original Message-----
> From: xen-devel-admin@lists.sourceforge.net [mailto:xen-devel-
> admin@lists.sourceforge.net] On Behalf Of Ian Pratt
> Sent: 14 May 2004 23:49
> To: Kip Macy
> Cc: Ian Pratt; Keir Fraser; xen-devel@lists.sourceforge.net;
> Ian.Pratt@cl.cam.ac.uk
> Subject: Re: [Xen-devel] telnet xend
>
> > The machine now locks up while spitting out the error message below when
> > the non-privileged domain is initially *started*.
> >
> > > KERNEL: assertion (flags&MSG_PEEK) failed at tcp.c(1540)
> > > KERNEL: assertion (skb==NULL || before(tp->copied_seq,
> > > TCP_SKB_CB(skb)->end_seq)) failed at tcp.c(1290)
> > > KERNEL: assertion (tp->copied_seq == tp->rcv_nxt ||
>
> I've never seen anything like this. Did you build the kernel
> yourself? What version of gcc? (We use 3.2.2 as per RH9)
I think I have seen these before as well, although not with the recent,
i.e., today's, checkins. This is in ngio (nodev=y) world, right?
> Can you reproduce with one of our nightly builds?
I saw these occasionally while running ttcp tests in ngio land a day or
two ago. Could you try the latest bk version? Keir checked in a few fixes
recently.
Rolf
> The TCP stack is clearly seriously confused. It's hard to imagine
> how Xen could cause this.
>
> Ian
>
>
^ permalink raw reply [flat|nested] 26+ messages in thread
* RE: telnet xend
2004-05-14 23:28 telnet xend Neugebauer, Rolf
@ 2004-05-15 0:46 ` Kip Macy
2004-05-15 1:39 ` Kip Macy
0 siblings, 1 reply; 26+ messages in thread
From: Kip Macy @ 2004-05-15 0:46 UTC (permalink / raw)
To: Neugebauer, Rolf; +Cc: Ian Pratt, Keir Fraser, xen-devel
> I think I have seen these before as well, although not with the recent,
> i.e., today's, checkins. This is in ngio (nodev=y) world, right?
Yes.
>
> Could you try the latest bk version? Keir checked in a few fixes
> recently.
>
I'll try that and let you know.
-Kip
^ permalink raw reply [flat|nested] 26+ messages in thread
* RE: telnet xend
2004-05-15 0:46 ` Kip Macy
@ 2004-05-15 1:39 ` Kip Macy
2004-05-15 8:59 ` Ian Pratt
0 siblings, 1 reply; 26+ messages in thread
From: Kip Macy @ 2004-05-15 1:39 UTC (permalink / raw)
To: Neugebauer, Rolf; +Cc: Ian Pratt, Keir Fraser, xen-devel
Latest sources were much better behaved. Thanks.
I now have secondary domains up and running with iSCSI LUNs as the root
device. DOM1 doesn't appear to have any difficulty saturating the
iSCSI initiator.
I'm extremely pleased.
kmacy@xen-vm0 pwd
/tmp
kmacy@xen-vm0 time dd if=/dev/zero of=bwout bs=1048576 count=1024
1024+0 records in
1024+0 records out
0.000u 0.920s 0:10.73 8.5% 0+0k 0+0io 134pf+0w
kmacy@xen-vm0 time dd of=/dev/null if=bwout bs=1048576 count=1024
1024+0 records in
1024+0 records out
0.000u 0.000s 0:19.51 0.0% 0+0k 0+0io 137pf+0w
kmacy@xen-vm0 cat /proc/meminfo
        total:    used:    free:  shared: buffers:  cached:
Mem:  64434176 62726144  1708032        0  2285568 27713536
Swap:        0        0        0
MemTotal:        62924 kB
-Kip
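To put numbers on the dd runs above: each one moves 1024 blocks of 1048576 bytes, and the third field of the time output is the elapsed time (0:10.73 and 0:19.51). A small sketch of the arithmetic follows; the function name is only illustrative.

```python
def throughput_mib_per_s(block_size, count, seconds):
    """Average throughput in MiB/s for count blocks of block_size bytes moved in the given seconds."""
    return block_size * count / (1024 * 1024) / seconds


# Figures taken from the dd/time transcript above.
write = throughput_mib_per_s(1048576, 1024, 10.73)
read = throughput_mib_per_s(1048576, 1024, 19.51)

if __name__ == "__main__":
    print(f"write: {write:.1f} MiB/s, read: {read:.1f} MiB/s")
```

That works out to roughly 95.4 MiB/s for the write and 52.5 MiB/s for the read. These are averages over whole runs and say nothing about caching effects on either side.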
^ permalink raw reply [flat|nested] 26+ messages in thread
* suspending a domain in the ngio world
2004-05-14 22:49 ` Ian Pratt
2004-05-14 23:14 ` Kip Macy
@ 2004-05-15 4:26 ` Kip Macy
2004-05-15 5:17 ` Kip Macy
2004-05-15 8:16 ` Keir Fraser
1 sibling, 2 replies; 26+ messages in thread
From: Kip Macy @ 2004-05-15 4:26 UTC (permalink / raw)
To: Ian Pratt; +Cc: xen-devel
Does xc_linux_save.c need to change for ngio?
The following command:
./xc_dom_control.py suspend 4 /tmp/xen-vm0.core
never completes.
This is all I see in the output of strace (many times over):
mlock(0xbffff170, 72) = 0
ioctl(3, SNDCTL_DSP_RESET, 0xbffff130) = 0
munlock(0xbffff170, 72) = 0
select(0, NULL, NULL, NULL, {0, 100000}) = 0 (Timeout)
mlock(0xbffff170, 72) = 0
ioctl(3, SNDCTL_DSP_RESET, 0xbffff130) = 0
munlock(0xbffff170, 72) = 0
select(0, NULL, NULL, NULL, {0, 100000}) = 0 (Timeout)
mlock(0xbffff170, 72) = 0
ioctl(3, SNDCTL_DSP_RESET, 0xbffff130) = 0
-Kip
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: suspending a domain in the ngio world
2004-05-15 4:26 ` suspending a domain in the ngio world Kip Macy
@ 2004-05-15 5:17 ` Kip Macy
2004-05-15 8:16 ` Keir Fraser
1 sibling, 0 replies; 26+ messages in thread
From: Kip Macy @ 2004-05-15 5:17 UTC (permalink / raw)
To: Ian Pratt; +Cc: xen-devel
It looks like a change in the DOM0 interface:
>>> xc.domain_getinfo()
[{'cpu_time': 297676674252L, 'stopped': 0, 'name': 'Domain-0', 'mem_kb':
257112, 'dom': 0L, 'running': 1, 'maxmem_kb': 262144, 'cpu': 0},
{'cpu_time': 9552521165L, 'stopped': 0, 'name': 'This is VM 2',
'mem_kb': 65536, 'dom': 6L, 'running': 0, 'maxmem_kb': 65536, 'cpu': 0}]
but the domain is in fact stopped.
-Kip
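The strace output earlier in this subthread shows a 100 ms select timeout between calls, which looks like a polling loop waiting for a state that is never reported. A minimal sketch of such a loop with a deadline follows, so that a 'stopped' flag that never changes ends in a timeout instead of a hang. This is not the xc_dom_control.py code. The get_info argument is a stand-in for something like xc.domain_getinfo, and only the 'dom' and 'stopped' keys shown in the output above are assumed.

```python
import time


def wait_until_stopped(get_info, dom, timeout=10.0, interval=0.1):
    """Poll get_info() until domain dom reports 'stopped', or give up after timeout seconds.

    get_info is a hypothetical callable returning a list of dicts with 'dom' and 'stopped' keys.
    Returns True if the domain was seen stopped, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        for entry in get_info():
            if entry["dom"] == dom and entry["stopped"]:
                return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


if __name__ == "__main__":
    # Fake data shaped like the output above: the flag never becomes 1.
    fake = lambda: [{"dom": 6, "stopped": 0, "running": 0}]
    print(wait_until_stopped(fake, 6, timeout=0.5))
```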
On Fri, 14 May 2004, Kip Macy wrote:
> Does xc_linux_save.c need to change for ngio?
>
> The following command:
> ./xc_dom_control.py suspend 4 /tmp/xen-vm0.core
>
> never completes.
>
> This all I see in the output of strace (many times over)
>
> mlock(0xbffff170, 72) = 0
> ioctl(3, SNDCTL_DSP_RESET, 0xbffff130) = 0
> munlock(0xbffff170, 72) = 0
> select(0, NULL, NULL, NULL, {0, 100000}) = 0 (Timeout)
> mlock(0xbffff170, 72) = 0
> ioctl(3, SNDCTL_DSP_RESET, 0xbffff130) = 0
> munlock(0xbffff170, 72) = 0
> select(0, NULL, NULL, NULL, {0, 100000}) = 0 (Timeout)
> mlock(0xbffff170, 72) = 0
> ioctl(3, SNDCTL_DSP_RESET, 0xbffff130) = 0
>
>
>
> -Kip
* Re: suspending a domain in the ngio world
2004-05-15 4:26 ` suspending a domain in the ngio world Kip Macy
2004-05-15 5:17 ` Kip Macy
@ 2004-05-15 8:16 ` Keir Fraser
2004-05-15 15:51 ` Kip Macy
1 sibling, 1 reply; 26+ messages in thread
From: Keir Fraser @ 2004-05-15 8:16 UTC (permalink / raw)
To: Kip Macy; +Cc: Ian Pratt, xen-devel
Suspend/resume won't work with ngio at the moment. It'll be a few
weeks at least before we tackle merging the two features.
-- Keir
> Does xc_linux_save.c need to change for ngio?
>
> The following command:
> ./xc_dom_control.py suspend 4 /tmp/xen-vm0.core
>
> never completes.
>
> This is all I see in the output of strace (many times over):
>
> mlock(0xbffff170, 72) = 0
> ioctl(3, SNDCTL_DSP_RESET, 0xbffff130) = 0
> munlock(0xbffff170, 72) = 0
> select(0, NULL, NULL, NULL, {0, 100000}) = 0 (Timeout)
> mlock(0xbffff170, 72) = 0
> ioctl(3, SNDCTL_DSP_RESET, 0xbffff130) = 0
> munlock(0xbffff170, 72) = 0
> select(0, NULL, NULL, NULL, {0, 100000}) = 0 (Timeout)
> mlock(0xbffff170, 72) = 0
> ioctl(3, SNDCTL_DSP_RESET, 0xbffff130) = 0
>
>
>
> -Kip
* Re: telnet xend
2004-05-15 1:39 ` Kip Macy
@ 2004-05-15 8:59 ` Ian Pratt
2004-05-15 15:58 ` Kip Macy
0 siblings, 1 reply; 26+ messages in thread
From: Ian Pratt @ 2004-05-15 8:59 UTC (permalink / raw)
To: Kip Macy; +Cc: Neugebauer, Rolf, Ian Pratt, Keir Fraser, xen-devel
> I now have secondary domains up and running with iSCSI luns as the root
> device. DOM1 doesn't appear to have any difficulty saturating the
> iSCSI initiator.
>
> I'm extremely pleased.
Great! Just to check I understand what you're doing: you're
running an iSCSI initiator in dom0 that is talking to an iSCSI
disk array over GigE, and then re-exporting this as xen block
devices to other domains. (?)
Which iSCSI initiator are you using? Do you know of a compatible
iSCSI target (disk) implementation for Linux? (For those of us
that don't have an iSCSI array to play with.)
Do you know of any iSCSI initiator implementations that support
root fs on iSCSI? (I know this isn't relevant for your setup in
dom0, but in some circumstances it would be nice to have the
domains talking direct.)
Thanks,
Ian
* RE: telnet xend
@ 2004-05-15 9:40 Neugebauer, Rolf
2004-05-15 15:59 ` Kip Macy
0 siblings, 1 reply; 26+ messages in thread
From: Neugebauer, Rolf @ 2004-05-15 9:40 UTC (permalink / raw)
To: Ian Pratt, Kip Macy; +Cc: Keir Fraser, xen-devel
> -----Original Message-----
> From: Ian Pratt [mailto:Ian.Pratt@cl.cam.ac.uk]
> Sent: 15 May 2004 09:59
> To: Kip Macy
> Cc: Neugebauer, Rolf; Ian Pratt; Keir Fraser;
> xen-devel@lists.sourceforge.net; Ian.Pratt@cl.cam.ac.uk
> Subject: Re: [Xen-devel] telnet xend
>
>
> > I now have secondary domains up and running with iSCSI luns as the root
> > device. DOM1 doesn't appear to have any difficulty saturating the
> > iSCSI initiator.
> >
> > I'm extremely pleased.
>
> Great! Just to check I understand what you're doing: you're
> running an iSCSI initiator in dom0 that is talking to an iSCSI
> disk array over GigE, and then re-exporting this as xen block
> devices to other domains. (?)
>
> Which iSCSI initiator are you using? Do you know of a compatible
> iSCSI target (disk) implementation for Linux? (For those of us
> that don't have an iSCSI array to play with.)
AFAIK the Intel iSCSI code on sf.net has a sample target implementation,
but the last time I looked it only contained target code for a RAM disk
and a fake user-mode disk (a file on a normal file system).
A quick look through the UNH implementation
(http://sourceforge.net/projects/unh-iscsi/) suggests that their target
code can be used to export real disks.
> Do you know of any iSCSI initiator implementations that support
> root fs on iSCSI? (I know this isn't relevant for your setup in
> dom0, but in some circumstances it would be nice to have the
> domains talking direct.)
I found this mini howto:
http://eludicate.com/~bolen/iscsi/
It suggests that root fs with the Cisco iSCSI software initiator might
work.
Rolf
> Thanks,
> Ian
* Re: suspending a domain in the ngio world
2004-05-15 8:16 ` Keir Fraser
@ 2004-05-15 15:51 ` Kip Macy
2004-05-15 16:12 ` Keir Fraser
0 siblings, 1 reply; 26+ messages in thread
From: Kip Macy @ 2004-05-15 15:51 UTC (permalink / raw)
To: Keir Fraser; +Cc: Ian Pratt, xen-devel
Ok - thanks. Is this because all the dependencies for stopping a domain
are not in place? Or are the interfaces in flux in general? What I'm
trying to do is get the bits in place for core debugging of non-
privileged domains.
-Kip
On Sat, 15 May 2004, Keir Fraser wrote:
>
> Suspend/resume won't work with ngio at the moment. It'll be a few
> weeks at least before we tackle merging the two features.
>
> -- Keir
>
> > Does xc_linux_save.c need to change for ngio?
> >
> > The following command:
> > ./xc_dom_control.py suspend 4 /tmp/xen-vm0.core
> >
> > never completes.
> >
> > This is all I see in the output of strace (many times over):
> >
> > mlock(0xbffff170, 72) = 0
> > ioctl(3, SNDCTL_DSP_RESET, 0xbffff130) = 0
> > munlock(0xbffff170, 72) = 0
> > select(0, NULL, NULL, NULL, {0, 100000}) = 0 (Timeout)
> > mlock(0xbffff170, 72) = 0
> > ioctl(3, SNDCTL_DSP_RESET, 0xbffff130) = 0
> > munlock(0xbffff170, 72) = 0
> > select(0, NULL, NULL, NULL, {0, 100000}) = 0 (Timeout)
> > mlock(0xbffff170, 72) = 0
> > ioctl(3, SNDCTL_DSP_RESET, 0xbffff130) = 0
> >
> >
> >
> > -Kip
* Re: telnet xend
2004-05-15 8:59 ` Ian Pratt
@ 2004-05-15 15:58 ` Kip Macy
0 siblings, 0 replies; 26+ messages in thread
From: Kip Macy @ 2004-05-15 15:58 UTC (permalink / raw)
To: Ian Pratt; +Cc: Neugebauer, Rolf, Keir Fraser, xen-devel
> Great! Just to check I understand what you're doing: you're
> running an iSCSI initiator in dom0 that is talking to an iSCSI
> disk array over GigE, and then re-exporting this as xen block
> devices to other domains. (?)
I'm using Adaptec's iSCSI hardware initiator. The driver is the latest
version I downloaded from their website. A NetApp filer is the device
exporting LUNs over iSCSI.
> Which iSCSI initiator are you using? Do you know of a compatible
> iSCSI target (disk) implementation for Linux? (For those of us
> that don't have an iSCSI array to play with.)
I can provide a user-mode version if a working Linux one isn't found.
The UNH code is not the cleanest. The few people I know who've tried it
have had problems.
> Do you know of any iSCSI initiator implementations that support
> root fs on iSCSI? (I know this isn't relevant for your setup in
> dom0, but in some circumstances it would be nice to have the
> domains talking direct.)
The HW initiator can supposedly boot off of LUNs now. Britt Bolen
(mentioned in another e-mail) figured out the contortions required to get
iSCSI root with the Cisco initiator.
-Kip
* RE: telnet xend
2004-05-15 9:40 Neugebauer, Rolf
@ 2004-05-15 15:59 ` Kip Macy
0 siblings, 0 replies; 26+ messages in thread
From: Kip Macy @ 2004-05-15 15:59 UTC (permalink / raw)
To: Neugebauer, Rolf; +Cc: Ian Pratt, Keir Fraser, xen-devel
Britt did some of the blocks work here so he did that for his own use. I
only know of a couple people using it.
-Kip
>
> I found this mini howto:
> http://eludicate.com/~bolen/iscsi/
>
> Suggests that root fs with the CISCO iSCSI software initiator might
> work.
>
> Rolf
>
> > Thanks,
> > Ian
* Re: suspending a domain in the ngio world
2004-05-15 15:51 ` Kip Macy
@ 2004-05-15 16:12 ` Keir Fraser
2004-05-15 17:02 ` Kip Macy
0 siblings, 1 reply; 26+ messages in thread
From: Keir Fraser @ 2004-05-15 16:12 UTC (permalink / raw)
To: Kip Macy; +Cc: xen-devel
> Ok - thanks. Is this because all the dependencies for stopping a domain
> are not in place? Or are the interfaces in flux in general? What I'm
> trying to do is get the bits in place for core debugging of non-
> privileged domains.
xend doesn't do setup/teardown of i/o connections properly yet. What's
there is a very basic lashup to create very simple configurations --
but not enough to suspend/resume them.
-- Keir
* Re: suspending a domain in the ngio world
2004-05-15 16:12 ` Keir Fraser
@ 2004-05-15 17:02 ` Kip Macy
2004-05-15 17:43 ` Keir Fraser
0 siblings, 1 reply; 26+ messages in thread
From: Kip Macy @ 2004-05-15 17:02 UTC (permalink / raw)
To: Keir Fraser; +Cc: xen-devel
questions:
1)
Does this extend as far as not being able to stop without destroying at
all?
kmacy@curly r ./xc_dom_control.py list
Dom Name Mem(kb) CPU State Time(ms)
0 Domain-0 257128 0 r- 28516
1 This is VM 2 64920 0 -- 5227
kmacy@curly r ./xc_dom_control.py stop 1
return code 0
kmacy@curly r ./xc_dom_control.py list
Dom Name Mem(kb) CPU State Time(ms)
0 Domain-0 257052 0 r- 28769
1 This is VM 2 64996 0 -- 5245
I can still interact with the domain over its console.
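A small sketch that compares the two listings above, assuming the column layout shown (Dom, Name, Mem(kb), CPU, State, Time(ms)). The function name parse_list is hypothetical and is not part of xc_dom_control.py, and no meaning is assumed for the individual State flags; the point is only that the State column for domain 1 is identical before and after the stop request. Names may contain spaces, so each row is split from the right.

```python
# Listings copied from the xc_dom_control.py output above.
BEFORE = """\
Dom Name Mem(kb) CPU State Time(ms)
0 Domain-0 257128 0 r- 28516
1 This is VM 2 64920 0 -- 5227
"""

AFTER = """\
Dom Name Mem(kb) CPU State Time(ms)
0 Domain-0 257052 0 r- 28769
1 This is VM 2 64996 0 -- 5245
"""


def parse_list(text):
    """Parse the listing into {dom_id: row_dict}, skipping the header line."""
    rows = {}
    for line in text.strip().splitlines()[1:]:
        dom, rest = line.split(None, 1)
        # The last four fields are fixed; whatever precedes them is the name.
        name, mem_kb, cpu, state, time_ms = rest.rsplit(None, 4)
        rows[int(dom)] = {
            'name': name,
            'mem_kb': int(mem_kb),
            'cpu': int(cpu),
            'state': state,
            'time_ms': int(time_ms),
        }
    return rows


if __name__ == '__main__':
    before, after = parse_list(BEFORE), parse_list(AFTER)
    for dom in sorted(before):
        changed = before[dom]['state'] != after[dom]['state']
        print(dom, before[dom]['name'], before[dom]['state'],
              after[dom]['state'], 'changed' if changed else 'unchanged')
```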
============================================================
2)
I take it that many of the following are expected right now when
destroying a domain with I/O in flight:
(XEN) DOM0: (file=memory.c, line=935) Unknown domain '2'
(file=main.c, line=266) Failed MMU update transferring to DOM2
========================================================
3)
I just did the following:
[root@xen-vm0 ~]$ while (1)
while? dd if=/dev/zero of=/tmp/bwout count=1024 bs=1024k
while? end
1024+0 records in
1024+0 records out
1024+0 records in
1024+0 records out
1024+0 records in
1024+0 records out
1024+0 records in
1024+0 records out
1024+0 records in
1024+0 records out
and then I saw this on the machine console:
__alloc_pages: 0-order allocation failed (gfp=0x1d2/0)
VM: killing process python
__alloc_pages: 0-order allocation failed (gfp=0x1d2/0)
__alloc_pages: 0-order allocation failed (gfp=0x1d2/0)
__alloc_pages: 0-order allocation failed (gfp=0x1d2/0)
__alloc_pages: 0-order allocation failed (gfp=0x1d2/0)
__alloc_pages: 0-order allocation failed (gfp=0x1d2/0)
VM: killing process syslogd
__alloc_pages: 0-order allocation failed (gfp=0x1d2/0)
__alloc_pages: 0-order allocation failed (gfp=0x1d2/0)
__alloc_pages: 0-order allocation failed (gfp=0x1d2/0)
__alloc_pages: 0-order allocation failed (gfp=0x1d2/0)
__alloc_pages: 0-order allocation failed (gfp=0x1d2/0)
VM: killing process sendmail
__alloc_pages: 0-order allocation failed (gfp=0x1d2/0)
__alloc_pages: 0-order allocation failed (gfp=0xf0/0)
__alloc_pages: 0-order allocation failed (gfp=0x1d2/0)
VM: killing process ypbind
__alloc_pages: 0-order allocation failed (gfp=0x1d2/0)
VM: killing process ypbind
__alloc_pages: 0-order allocation failed (gfp=0x1d2/0)
__alloc_pages: 0-order allocation failed (gfp=0x1d2/0)
__alloc_pages: 0-order allocation failed (gfp=0x1d2/0)
VM: killing process sshd
__alloc_pages: 0-order allocation failed (gfp=0x1d2/0)
VM: killing process sshd
__alloc_pages: 0-order allocation failed (gfp=0x1d2/0)
VM: killing process tcsh
__alloc_pages: 0-order allocation failed (gfp=0xf0/0)
__alloc_pages: 0-order allocation failed (gfp=0x1d2/0)
__alloc_pages: 0-order allocation failed (gfp=0x1d2/0)
__alloc_pages: 0-order allocation failed (gfp=0xf0/0)
__alloc_pages: 0-order allocation failed (gfp=0xf0/0)
__alloc_pages: 0-order allocation failed (gfp=0xf0/0)
__alloc_pages: 0-order allocation failed (gfp=0x1d2/0)
VM: killing process crond
__alloc_pages: 0-order allocation failed (gfp=0x1d2/0)
VM: killing process crond
(XEN) (file=traps.c, line=469) GPF (0004): fc520e08 -> fc52e2d2
__alloc_pages: 0-order allocation failed (gfp=0x1d2/0)
VM: killing process portmap
__alloc_pages: 0-order allocation failed (gfp=0x1d2/0)
VM: killing process umount
I guess memory management is a work in progress?
Thanks.
-Kip
On Sat, 15 May 2004, Keir Fraser wrote:
> > Ok - thanks. Is this because all the dependencies for stopping a domain
> > are not in place? Or are the interfaces in flux in general? What I'm
> > trying to do is get the bits in place for core debugging of non-
> > privileged domains.
>
> xend doesn't do setup/teardown of i/o connections properly yet. What's
> there is a very basic lashup to create very simple configurations --
> but not enough to suspend/resume them.
>
> -- Keir
* Re: suspending a domain in the ngio world
2004-05-15 17:02 ` Kip Macy
@ 2004-05-15 17:43 ` Keir Fraser
2004-05-15 18:10 ` Kip Macy
0 siblings, 1 reply; 26+ messages in thread
From: Keir Fraser @ 2004-05-15 17:43 UTC (permalink / raw)
To: Kip Macy; +Cc: Keir Fraser, xen-devel
> questions:
> 1)
> Does this extend as far as not being able to stop without destroying at
> all?
> I can still interact with the domain over its console.
I checked in a fix for this a couple of hours ago -- it was stopping
a domain from dying except via a forced destroy from DOM0 (e.g.,
/sbin/reboot within the domain itself wouldn't work). So a stop
request should now stop the domain. Won't be much use though as I'm
pretty sure it won't start up again happily!
> ============================================================
> 2)
> I take it that many of the following are expected right now when
> destroying a domain with I/O in flight:
>
> (XEN) DOM0: (file=memory.c, line=935) Unknown domain '2'
> (file=main.c, line=266) Failed MMU update transferring to DOM2
Yep, I see this. As I said: xend can just about set up a basic
interface between a guest and a device-driver backend. It's not got
functionality for tearing the interface down properly, which leaves
the backend driver in a confused state, getting you a bunch of
(fairly harmless) errors.
> [root@xen-vm0 ~]$ while (1)
> while? dd if=/dev/zero of=/tmp/bwout count=1024 bs=1024k
> while? end
> 1024+0 records in
> 1024+0 records out
> 1024+0 records in
> 1024+0 records out
> 1024+0 records in
> 1024+0 records out
> 1024+0 records in
> __alloc_pages: 0-order allocation failed (gfp=0x1d2/0)
> VM: killing process umount
> I guess memory management is a work in progress?
This is within DOM1 (i.e., not DOM0) right? If so, I guess that doing
this 'dd' test within DOM0 doesn't get you similar messages?
This is rather unexpected -- if you could add a stack backtrace to the
out-of-memory path in the page allocator (page_alloc.c in Xenolinux)
and post me that with the kernel image (vmlinux) then I'll see what I
can work out. I guess I haven't tested all that hard so there might be
a memory leak.
-- Keir
* Re: suspending a domain in the ngio world
2004-05-15 17:43 ` Keir Fraser
@ 2004-05-15 18:10 ` Kip Macy
2004-05-15 23:21 ` Keir Fraser
0 siblings, 1 reply; 26+ messages in thread
From: Kip Macy @ 2004-05-15 18:10 UTC (permalink / raw)
To: Keir Fraser; +Cc: xen-devel
The dd is running in DOM1. The OOM killer is getting run in DOM0.
There is clearly a memory leak in the block I/O path.
DOM0 is curly and DOM1 is xen-vm0.
A large amount of memory has already been leaked:
kmacy@curly cat /proc/meminfo
total: used: free: shared: buffers: cached:
Mem: 262565888 205619200 56946688 0 23339008 28123136
==
[root@xen-vm0 ~]$ dd if=/dev/zero of=/tmp/bwout bs=1024k count=256
==
kmacy@curly cat /proc/meminfo
total: used: free: shared: buffers: cached:
Mem: 262565888 214687744 47878144 0 23339008 28123136
==
[root@xen-vm0 ~]$ dd if=/dev/zero of=/tmp/bwout count=256 bs=1024k
256+0 records in
256+0 records out
==
kmacy@curly cat /proc/meminfo | head -3
total: used: free: shared: buffers: cached:
Mem: 262565888 223727616 38838272 0 23339008 28123136
==
[root@xen-vm0 ~]$ dd if=/dev/zero of=/tmp/bwout count=256 bs=1024k
256+0 records in
256+0 records out
==
kmacy@curly cat /proc/meminfo | head -2
total: used: free: shared: buffers: cached:
Mem: 262565888 232873984 29691904 0 23339008 28123136
So roughly 9MB is leaked per 256MB written, i.e. ~36MB for every 1GB
transferred.
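The leak rate can be checked from the 'used' figures in the /proc/meminfo snapshots above. This is a sketch using only those numbers; the names USED_BYTES, leak_per_run and leak_per_gib are illustrative, and it assumes each later snapshot was taken after exactly one 256MB dd run in DOM1. The buffers and cached columns are identical in all four snapshots, so the growth in 'used' is not page cache.

```python
# 'used' column of /proc/meminfo in DOM0, in bytes, copied from the
# snapshots above. Each later value follows one dd run in DOM1.
USED_BYTES = [205619200, 214687744, 223727616, 232873984]

# Bytes written per dd run: bs=1024k count=256.
RUN_BYTES = 256 * 1024 * 1024


def leak_per_run(used):
    """Growth in 'used' between consecutive snapshots, in bytes."""
    return [later - earlier for earlier, later in zip(used, used[1:])]


def leak_per_gib(used, run_bytes=RUN_BYTES):
    """Average bytes leaked in DOM0 per GiB written by the guest."""
    total_leak = used[-1] - used[0]
    total_written = run_bytes * (len(used) - 1)
    return total_leak * (1024 ** 3) / total_written


if __name__ == '__main__':
    print(leak_per_run(USED_BYTES))   # bytes leaked per 256MB run
    print(leak_per_gib(USED_BYTES))   # bytes leaked per GiB written
```

Each run grows 'used' by about 9.0 to 9.1 million bytes, which averages out to about 36.3 million bytes per GiB written.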
I can give you a stack backtrace of the memory allocation failure in
DOM0 if you like, but as far as I can tell the horse has long since left
the barn at that point.
> This is within DOM1 (i.e., not DOM0) right? If so, I guess that doing
> this 'dd' test within DOM0 doesn't get you similar messages?
>
> This is rather unexpected -- if you could add a stack backtrace to the
> out-of-memory path in the page allocator (page_alloc.c in Xenolinux)
> and post me that with the kernel image (vmlinux) then I'll see what I
> can work out. I guess I haven't tested all that hard so there might be
> a memory leak.
On a side note - I don't need suspend/restore, I just need coredump and
almost immediately after that PTRACE_STOP. So long as I can stop the
domain long enough to write out its state I have what I need.
* Re: suspending a domain in the ngio world
2004-05-15 18:10 ` Kip Macy
@ 2004-05-15 23:21 ` Keir Fraser
0 siblings, 0 replies; 26+ messages in thread
From: Keir Fraser @ 2004-05-15 23:21 UTC (permalink / raw)
To: Kip Macy; +Cc: xen-devel
> The dd is running in DOM1. The OOM killer is getting run in DOM0.
> There is clearly a memory leak in the block I/O path.
Now fixed. It turned out to be rather blatant.
> On a side note - I don't need suspend/restore, I just need coredump and
> almost immediately after that PTRACE_STOP. So long as I can stop the
> domain long enough to write out its state I have what I need.
A pause operation will be coming up soon, as part of a cleanup of the
scheduler interface in Xen. This will fix the problem that there's
currently no way to stop a domain without having it suspend itself.
-- Keir
end of thread, other threads:[~2004-05-15 23:21 UTC | newest]
Thread overview: 26+ messages
[not found] <20040513221648.W77678@demos.bsdclusters.com>
2004-05-14 6:35 ` system lockup when starting secondary domains Keir Fraser
2004-05-14 20:56 ` Kip Macy
2004-05-14 21:02 ` never mind was " Kip Macy
2004-05-14 21:25 ` telnet xend Kip Macy
2004-05-14 21:33 ` Ian Pratt
2004-05-14 21:48 ` Kip Macy
2004-05-14 22:37 ` Kip Macy
2004-05-14 22:39 ` Kip Macy
2004-05-14 22:49 ` Ian Pratt
2004-05-14 23:14 ` Kip Macy
2004-05-15 4:26 ` suspending a domain in the ngio world Kip Macy
2004-05-15 5:17 ` Kip Macy
2004-05-15 8:16 ` Keir Fraser
2004-05-15 15:51 ` Kip Macy
2004-05-15 16:12 ` Keir Fraser
2004-05-15 17:02 ` Kip Macy
2004-05-15 17:43 ` Keir Fraser
2004-05-15 18:10 ` Kip Macy
2004-05-15 23:21 ` Keir Fraser
2004-05-14 23:28 telnet xend Neugebauer, Rolf
2004-05-15 0:46 ` Kip Macy
2004-05-15 1:39 ` Kip Macy
2004-05-15 8:59 ` Ian Pratt
2004-05-15 15:58 ` Kip Macy
-- strict thread matches above, loose matches on Subject: below --
2004-05-15 9:40 Neugebauer, Rolf
2004-05-15 15:59 ` Kip Macy