On Sat, Dec 09, 2006 at 11:43:05PM -0800, Andrew Morton wrote:
>
> hm, AOE on alpha - who'd have imagined?
>
> James, are you able to get us a copy of the oops trace?

It turns out the aoe driver wasn't setting up the linear part of the
skb correctly, and so the data was being offset when the skb was
linearized for cards that didn't do scatter gather.

James Boddington reports that everything's fine with a RealTek 8139
card he has but not fine with several other cards:

  I just tried a cheap RealTek card. A rtl8139 pci 100MBit. AoE in
  2.6.19 works with the realtek card.

  Did your test again with the realtek. The results look like the
  results I got with AoE v23. Correct offset and the data not
  truncated.

  Which leaves me with both ne2k-pci and 3com, AoE not working.
  Rtl8139 and AoE does work.

I haven't got confirmation, but we think that the other cards don't
perform scatter gather. We can replicate JB's problem by turning off
the scatter gather feature in an Intel gigabit network card using
ethtool.

The attached patch fixes the offsets when scatter gather is not in use
by setting up the linear part of the skb correctly. After applying
this patch to 2.6.19.1, everything looks great for writes like:

  echo AaAbAcAdAe > /dev/etherd/e0.2

... and when using ext3 on the device. However, we see problems when
using XFS.

It appears that the XFS problem is unrelated, because the kernel's new
lock debugging sees some problems that appear in the attached
netconsole log.

XFS passes us a bio with a pointer to a page that has a reference
count of zero, which causes problems when __pskb_pull_tail does a put
on the page. With ext3, we only got pages with a reference count
greater than zero.

The problem doesn't appear when scatter gather is turned on, although
the same locking issues are logged. (See third attachment.)

-- 
Ed L Cashin
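
To illustrate the kind of offset described above, here is a toy
userspace model. It is only a sketch, not the attached patch and not
the aoe driver code. The struct toy_skb and the toy_* helpers are
hypothetical names made up for this example, and the real sk_buff
layout is more involved. The assumption it demonstrates is simply that
linearizing appends the fragment data right after whatever length the
linear part claims to have, so a wrong linear length shifts the
payload.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for an sk_buff: a linear area plus one page fragment. */
struct toy_skb {
	unsigned char linear[128];	/* linear area, headers live here */
	size_t linear_len;		/* bytes of linear[] claimed in use */
	const unsigned char *frag;	/* payload still sitting in a "page" */
	size_t frag_len;		/* bytes in the fragment */
};

/* Total packet length: linear bytes plus fragment bytes. */
static size_t toy_len(const struct toy_skb *skb)
{
	return skb->linear_len + skb->frag_len;
}

/*
 * Build a packet with hdr_len header bytes copied into the linear area
 * and the payload left in a fragment. claimed_linear is the length the
 * (hypothetical) driver records for the linear part; it is correct only
 * when it equals hdr_len. Returns 0 on success, -1 on bad sizes.
 */
static int toy_build(struct toy_skb *skb, size_t claimed_linear,
		     const unsigned char *hdr, size_t hdr_len,
		     const unsigned char *payload, size_t payload_len)
{
	if (hdr_len > sizeof(skb->linear) ||
	    claimed_linear > sizeof(skb->linear))
		return -1;
	memset(skb, 0, sizeof(*skb));
	memcpy(skb->linear, hdr, hdr_len);
	skb->linear_len = claimed_linear;
	skb->frag = payload;
	skb->frag_len = payload_len;
	return 0;
}

/*
 * Copy the fragment to the tail of the linear area, roughly what must
 * happen before transmit when the card cannot do scatter gather.
 * Returns 0 on success, -1 if the fragment does not fit.
 */
static int toy_linearize(struct toy_skb *skb)
{
	assert(skb->linear_len <= sizeof(skb->linear));
	if (skb->frag_len > sizeof(skb->linear) - skb->linear_len)
		return -1;
	if (skb->frag_len)
		memcpy(skb->linear + skb->linear_len, skb->frag,
		       skb->frag_len);
	skb->linear_len += skb->frag_len;
	skb->frag = NULL;
	skb->frag_len = 0;
	return 0;
}
```

With a 4-byte header and claimed_linear set to 4, the payload ends up
at offset 4 after toy_linearize(). With claimed_linear set to 6 it
ends up at offset 6, two bytes late, which is the flavor of offset
that shows up only on the non-scatter-gather path.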