* [Qemu-devel] Network code on AMD64
@ 2005-07-21 11:32 Paul LeoNerd Evans
2005-07-21 15:25 ` Jim C. Brown
0 siblings, 1 reply; 5+ messages in thread
From: Paul LeoNerd Evans @ 2005-07-21 11:32 UTC (permalink / raw)
To: qemu-devel
Apologies if this issue has already been solved, by the way; I've only
just joined the mailing list...
I've been running 0.7.0 on an AMD64, and noticed that DHCP doesn't work.
I further observe that a build of the same source, running the same
image, works fine on an i386. Being familiar with fixing small code bugs
on AMD64, I had a good look through the code for any 64bit issues that
might arise (usually assumptions that "long" is 32 bits wide)...
I found two places where this happens, and fixed them; see the patch below.
With these changes, DHCP now works.
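To illustrate the kind of assumption involved, here is a minimal sketch. It is not taken from the qemu source: hdr_native and hdr_fixed are hypothetical, cut-down stand-ins for the bootp header touched by the patch below, and the helper names are made up for the example. It shows how a field declared as "unsigned long" shifts everything after it once long is 64 bits wide, while the fixed-width version keeps the on-the-wire layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical cut-down header using native C types (pre-patch style). */
struct hdr_native {
    uint8_t        op, htype, hlen, hops;
    unsigned long  xid;     /* 4 bytes on i386, 8 bytes on LP64 AMD64 */
    unsigned short secs;
    unsigned short unused;
    uint32_t       ciaddr;
};

/* Same header with fixed-width types (post-patch style). */
struct hdr_fixed {
    uint8_t  op, htype, hlen, hops;
    uint32_t xid;
    uint16_t secs;
    uint16_t unused;
    uint32_t ciaddr;
};

/* Offset of the first field after xid/secs/unused in each layout. */
size_t native_ciaddr_offset(void) { return offsetof(struct hdr_native, ciaddr); }
size_t fixed_ciaddr_offset(void)  { return offsetof(struct hdr_fixed, ciaddr); }

/* Print both layouts so the host's behaviour can be inspected. */
void print_layouts(void)
{
    /* The fixed-width layout does not depend on the width of long. */
    assert(fixed_ciaddr_offset() == 12);

    printf("sizeof(unsigned long)    = %zu\n", sizeof(unsigned long));
    printf("native: ciaddr at offset %zu, struct size %zu\n",
           native_ciaddr_offset(), sizeof(struct hdr_native));
    printf("fixed:  ciaddr at offset %zu, struct size %zu\n",
           fixed_ciaddr_offset(), sizeof(struct hdr_fixed));
}
```

Calling print_layouts() on an i386 build and on an AMD64 build should show the native offsets diverging while the fixed ones stay put, which is consistent with DHCP packets being misparsed only on the 64-bit host.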
But there's a problem. Currently my only test image is a Windows 98SE
install - not best known for being able to properly debug - I shall have
to test with a decent Knoppix or something like that... But I find that
if I start up IE, it attempts a connection to its default homepage, then
Qemu itself segfaults. Normally I'd fire up gdb at this stage and have a
good look around, but I gather from documentation that the internals of
qemu are far from standard, and I might be somewhat out of my depth here.
I thought I'd report here anyway; maybe someone with more development
experience could pick it up, or at least, give me some suggestions of
tests to run. I'm quite familiar with C in general, and Linux coding, but
I've never done anything like the dynamic translation stuff that qemu is
doing here...
Also, I shall try to come up with a minimal test case using a Linux
image; maybe if I provide an image that reliably boots and segfaults
qemu..?
diff -urN qemu-0.7.0-orig/slirp/bootp.h qemu-0.7.0/slirp/bootp.h
--- qemu-0.7.0-orig/slirp/bootp.h 2005-04-27 21:52:05.000000000 +0100
+++ qemu-0.7.0/slirp/bootp.h 2005-07-20 20:33:45.413577774 +0100
@@ -97,9 +97,9 @@
     uint8_t bp_htype;
     uint8_t bp_hlen;
     uint8_t bp_hops;
-    unsigned long bp_xid;
-    unsigned short bp_secs;
-    unsigned short unused;
+    uint32_t bp_xid;
+    uint16_t bp_secs;
+    uint16_t unused;
     struct in_addr bp_ciaddr;
     struct in_addr bp_yiaddr;
     struct in_addr bp_siaddr;
diff -urN qemu-0.7.0-orig/slirp/ip.h qemu-0.7.0/slirp/ip.h
--- qemu-0.7.0-orig/slirp/ip.h 2005-04-27 21:52:05.000000000 +0100
+++ qemu-0.7.0/slirp/ip.h 2005-07-20 20:33:45.413577774 +0100
@@ -209,7 +209,7 @@
  * Overlay for ip header used by other protocols (tcp, udp).
  */
 struct ipovly {
-    caddr32_t ih_next, ih_prev; /* for protocol sequence q's */
+    uint32_t ih_next, ih_prev; /* for protocol sequence q's */
     u_int8_t ih_x1; /* (unused) */
     u_int8_t ih_pr; /* protocol */
     int16_t ih_len; /* protocol length */
--
Paul "LeoNerd" Evans
leonerd@leonerd.org.uk
ICQ# 4135350 | Registered Linux# 179460
http://www.leonerd.org.uk/
* Re: [Qemu-devel] Network code on AMD64
2005-07-21 11:32 [Qemu-devel] Network code on AMD64 Paul LeoNerd Evans
@ 2005-07-21 15:25 ` Jim C. Brown
2005-07-21 16:18 ` Julian Seward
2005-07-21 21:55 ` Paul LeoNerd Evans
0 siblings, 2 replies; 5+ messages in thread
From: Jim C. Brown @ 2005-07-21 15:25 UTC (permalink / raw)
To: qemu-devel
On Thu, Jul 21, 2005 at 12:32:32PM +0100, Paul LeoNerd Evans wrote:
> Apologies if this issue has already been solved, by the way; I've only
> just joined the mailing list...
>
No, this problem has come up a couple of times, but until now no one has
actually tried to fix it.
Good job.
> But there's a problem. Currently my only test image is a Windows 98SE
> install - not best known for being able to properly debug - I shall have
> to test with a decent Knoppix or something like that... But I find that
> if I start up IE, it attempts a connection to its default homepage, then
> Qemu itself segfaults. Normally I'd fire up gdb at this stage and have a
> good look around, but I gather from documentation that the internals of
> qemu are far from standard, and I might be somewhat out of my depth here.
>
qemu does a lot of strange things, but the hardware emulation code (e.g. the
code that emulates the ne2k) as well as the server emulation code (e.g. the
code that emulates a dhcp server or the code that handles the proxying of tcp/ip
requests) can easily be debugged using gdb. I've done it many times myself. Only
the translated machine code itself cannot be viewed this way (for obvious
reasons).
> I thought I'd report here anyway; maybe someone with more development
> experience could pick it up, or at least, give me some suggestions of
> tests to run. I'm quite familiar with C in general, and Linux coding, but
> I've never done anything like the dynamic translation stuff that qemu is
> doing here...
>
Odds are good that the dynamic translation code isn't where the segfault is
occurring, and as I said, the rest of qemu is perfectly debuggable in gdb.
--
Infinite complexity begets infinite beauty.
Infinite precision begets infinite perfection.
* Re: [Qemu-devel] Network code on AMD64
2005-07-21 15:25 ` Jim C. Brown
@ 2005-07-21 16:18 ` Julian Seward
2005-07-21 21:55 ` Paul LeoNerd Evans
1 sibling, 0 replies; 5+ messages in thread
From: Julian Seward @ 2005-07-21 16:18 UTC (permalink / raw)
To: qemu-devel; +Cc: Jim C. Brown
> > Qemu itself segfaults. Normally I'd fire up gdb at this stage and have a
> > good look around,
Why don't you fire up Valgrind and have a good look around? It can
find all manner of bad stuff that GDB doesn't find, like out-of-
bounds memory accesses and use of uninitialised values that are often
the root causes of segfaults. At least, that's what lots of Valgrind
users tell us :-) Recent Valgrinds should be able to run QEMU-softmmu
variants.
J
* Re: [Qemu-devel] Network code on AMD64
2005-07-21 15:25 ` Jim C. Brown
2005-07-21 16:18 ` Julian Seward
@ 2005-07-21 21:55 ` Paul LeoNerd Evans
2005-07-21 22:58 ` Paul LeoNerd Evans
1 sibling, 1 reply; 5+ messages in thread
From: Paul LeoNerd Evans @ 2005-07-21 21:55 UTC (permalink / raw)
To: qemu-devel
On Thu, Jul 21, 2005 at 11:25:43AM -0400, Jim C. Brown wrote:
> > But there's a problem. Currently my only test image is a Windows 98SE
> > install - not best known for being able to properly debug - I shall have
> > to test with a decent Knoppix or something like that... But I find that
> > if I start up IE, it attempts a connection to its default homepage, then
> > Qemu itself segfaults. Normally I'd fire up gdb at this stage and have a
> > good look around, but I gather from documentation that the internals of
> > qemu are far from standard, and I might be somewhat out of my depth here.
I have determined, by the way, a much more precise location for the bug.
I can start a Knoppix image, which can reliably resolve hostnames, and
ping the host machine. I then tried a http-over-telnet, to test TCP. I
connect, send/receive data just fine. The moment I Ctrl+C the telnet,
that's when qemu dies. So I suspect the bug is related to the TCP close
code. I shall investigate further...
--
Paul "LeoNerd" Evans
leonerd@leonerd.org.uk
ICQ# 4135350 | Registered Linux# 179460
http://www.leonerd.org.uk/
* Re: [Qemu-devel] Network code on AMD64
2005-07-21 21:55 ` Paul LeoNerd Evans
@ 2005-07-21 22:58 ` Paul LeoNerd Evans
0 siblings, 0 replies; 5+ messages in thread
From: Paul LeoNerd Evans @ 2005-07-21 22:58 UTC (permalink / raw)
To: qemu-devel
On Thu, Jul 21, 2005 at 10:55:22PM +0100, Paul LeoNerd Evans wrote:
> I have determined, by the way, a much more precise location for the bug.
> I can start a Knoppix image, which can reliably resolve hostnames, and
> ping the host machine. I then tried a http-over-telnet, to test TCP. I
> connect, send/receive data just fine. The moment I Ctrl+C the telnet,
> that's when qemu dies. So I suspect the bug is related to the TCP close
> code. I shall investigate further...
Maybe there are some developers around who know the slirp code better than
I do... But I'm finding something truly bizarre here.
slirp/tcp_input.c lines 137-139:
    for (q = (struct tcpiphdr *)tp->seg_next; q != (struct tcpiphdr *)tp;
         q = (struct tcpiphdr *)q->ti_next)
        if (SEQ_GT(q->ti_seq, ti->ti_seq))
            break;
We're using tp->seg_next and q->ti_next as pointers to an in-memory
struct.
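For readers less familiar with this idiom, the loop above walks a circular queue in which the control block itself acts as the list head, stopping at the first queued segment that sorts after the new one. The following is a minimal sketch of that pattern using ordinary pointers. It is not the slirp code: struct seg, queue_init, find_successor and insert_before are hypothetical names, and SEQ_GT is assumed to be the usual BSD-style wraparound comparison.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Wraparound-safe "a is later than b" (assumed BSD-style definition). */
#define SEQ_GT(a, b) ((int32_t)((uint32_t)(a) - (uint32_t)(b)) > 0)

/* Hypothetical queue node; stands in for struct tcpiphdr. */
struct seg {
    struct seg *next, *prev;
    uint32_t    seq;
};

/* Make an empty circular queue: the head links to itself. */
void queue_init(struct seg *head)
{
    head->next = head->prev = head;
}

/* Return the first queued segment whose seq is after new_seq,
 * or the head itself if there is no such segment. */
struct seg *find_successor(struct seg *head, uint32_t new_seq)
{
    struct seg *q;

    for (q = head->next; q != head; q = q->next)
        if (SEQ_GT(q->seq, new_seq))
            break;
    return q;
}

/* Link s into the queue immediately before pos. */
void insert_before(struct seg *pos, struct seg *s)
{
    s->next = pos;
    s->prev = pos->prev;
    pos->prev->next = s;
    pos->prev = s;
}
```

With real pointer members the walk is safe on any host; the question raised next is what happens when those link fields are only 32 bits wide.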
But the types of tp's fields are defined as:
#if SIZEOF_CHAR_P == 4
typedef struct tcpiphdr *tcpiphdrp_32;
#else
typedef u_int32_t tcpiphdrp_32;
#endif
struct tcpcb {
    tcpiphdrp_32 seg_next; /* sequencing queue */
    tcpiphdrp_32 seg_prev;
    ...
}
Which I find odd, seeing as therefore we're using a u_int32_t as a
pointer to a struct..? Sounds oddly dangerous.
Similarly, ti_next is really a macro for ti_i.ih_next, which is
also typed as uint32_t.
As sizeof(int) == sizeof(void*) on i386 platforms, I'm guessing that's
why the code works there. Seems quite broken here on AMD64 where
sizeof(void*) == 8.
Seems to me an overloaded use of fields to mean ints in some cases, and
pointers in others...
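A minimal sketch of why this is dangerous, assuming nothing about the slirp code beyond what is quoted above: struct node, squeeze, survives_32bit and report are hypothetical names. It stores a pointer in a 32-bit field and checks whether the value read back still matches, without ever dereferencing the result.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical node; stands in for struct tcpiphdr. */
struct node {
    int payload;
};

/* Store a pointer in a 32-bit field, as a u_int32_t link would. */
uint32_t squeeze(const struct node *p)
{
    /* Upper bits are silently dropped when pointers are 64 bits wide. */
    return (uint32_t)(uintptr_t)p;
}

/* Return 1 if the pointer survives a round trip through 32 bits. */
int survives_32bit(const struct node *p)
{
    uintptr_t full = (uintptr_t)p;

    return (uintptr_t)squeeze(p) == full;
}

/* Print what happens to one pointer on this host. */
void report(const struct node *p)
{
    printf("sizeof(void *) = %zu, pointer %p %s a 32-bit field\n",
           sizeof(void *), (const void *)p,
           survives_32bit(p) ? "fits in" : "does not fit in");
}
```

On i386 every pointer fits, so the cast back is harmless. On AMD64 any object living above the 4 GiB mark comes back as a different address, and following it is the sort of thing that would produce a segfault like the one seen here.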
--
Paul "LeoNerd" Evans
leonerd@leonerd.org.uk
ICQ# 4135350 | Registered Linux# 179460
http://www.leonerd.org.uk/