From: Ingo Molnar <mingo@elte.hu>
To: Vegard Nossum <vegard.nossum@gmail.com>
Cc: linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
Pekka Enberg <penberg@cs.helsinki.fi>,
"Rafael J. Wysocki" <rjw@sisk.pl>
Subject: Re: [bug, netconsole, SLUB] BUG skbuff_head_cache: Poison overwritten
Date: Fri, 18 Jul 2008 01:52:54 +0200
Message-ID: <20080717235254.GA6833@elte.hu>
In-Reply-To: <19f34abd0807171615s5b477d4cr22d3e9444bcf65df@mail.gmail.com>
* Vegard Nossum <vegard.nossum@gmail.com> wrote:
> On Thu, Jul 17, 2008 at 11:42 PM, Ingo Molnar <mingo@elte.hu> wrote:
> >
> > A regression to v2.6.26:
> >
> > I started getting this skb-head corruption message today, on a T60
> > laptop with e1000:
> >
> > PM: Removing info for No Bus:vcs11
> > device: 'vcs11': device_create_release
> > =============================================================================
> > BUG skbuff_head_cache: Poison overwritten
> > -----------------------------------------------------------------------------
> >
> > INFO: 0xf658ae9c-0xf658ae9c. First byte 0x6a instead of 0x6b
>
> 1. Notice the range. It's just a single byte.
> 2. Notice the value. It's off by one: 0x6a is 0x6b - 1, i.e. a --.
>
> Probably a stray decrement of a uint8_t somewhere on a freed object?
>
> The offset from the beginning of the object is 0xf658ae9c - 0xf658ae00
> = 0x9c.
>
> How big is a struct sk_buff? Hm.. it is in fact quite big. Now what
> member has offset 0x9c? Seems to depend on your config. Is there any
> way you can figure it out, Ingo? I'll try it with your config too.
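
The arithmetic in the quoted analysis can be checked with a small user-space sketch. It assumes, as the quoted text does, that the object starts at 0xf658ae00, and it takes 0x6b as the expected poison byte from the SLUB report. The function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Offset of a faulting address inside the object that contains it. */
static uint32_t offset_in_object(uint32_t fault_addr, uint32_t object_base)
{
	assert(fault_addr >= object_base);
	return fault_addr - object_base;
}

/* Signed difference between the byte found and the expected poison byte. */
static int poison_delta(uint8_t found, uint8_t expected)
{
	return (int)found - (int)expected;
}
```

Calling `offset_in_object(0xf658ae9c, 0xf658ae00)` gives 0x9c, and `poison_delta(0x6a, 0x6b)` gives -1, so the byte found is one less than the poison value.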

hmm ... your analysis gave me a wonderful albeit admittedly remote idea:

If only we had some kernel technology that could track and validate
memory accesses, and point out the cases where we access uninitialized
memory, just like Valgrind?

... something like kmemcheck? ;-)

So i booted that box with tip/master and kmemcheck enabled. (plus a few
fixlets to make networking allocations be properly tracked by
kmemcheck.)

It was a slow bootup and long wait, but it gave a few hits here:

kmemcheck: Caught 8-bit read from uninitialized memory (f653ad24)
iiiiiiiiiiiiiiiiuuuuuuuuuuuuuuuuuuuuuiuuuuuuuuuuuuuuuuuuuuuuuuuu
^
Pid: 2484, comm: arping Not tainted (2.6.26-tip #20187)
EIP: 0060:[<c05e973c>] EFLAGS: 00010282 CPU: 0
EIP is at __copy_skb_header+0x7c/0x100
EAX: 00000000 EBX: f653acc0 ECX: f653ac00 EDX: f653ac00
ESI: f653ac50 EDI: f653ad10 EBP: c09b9e84 ESP: c09ddaa8
DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
CR0: 8005003b CR2: f71c2700 CR3: 36513000 CR4: 000006d0
DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
DR6: ffff4ff0 DR7: 00000400
[<c05e97e7>] __skb_clone+0x27/0xe0
[<c05eb101>] skb_clone+0x41/0x60
[<c065cbf1>] packet_rcv+0xc1/0x290
[<c05f07ad>] netif_receive_skb+0x20d/0x400
[<c03b2aa7>] e1000_receive_skb+0x47/0x180
[<c03b3983>] e1000_clean_rx_irq+0x223/0x2e0
[<c03b225b>] e1000_clean+0x5b/0x200
[<c05f29db>] net_rx_action+0xfb/0x160
[<c0129092>] __do_softirq+0x82/0xf0
[<c0105b8a>] call_on_stack+0x1a/0x30

false positive? Find below the quick hacks i did to pre-initialize skb
allocations that have RX DMA into them.

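The pre-initialization idea can be sketched in user space. This is only an analogy for what the hacks below do in the kernel (`__GFP_ZERO` on the allocation, or a `memset()` up to `offsetof(struct sk_buff, tail)`). `struct demo_buf` and `demo_alloc()` are hypothetical stand-ins, not kernel APIs.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for an skb-like object: header fields first,
 * then members that the allocator always sets explicitly.
 */
struct demo_buf {
	unsigned int len;
	unsigned char cloned;
	unsigned char pkt_type;
	unsigned char *tail;	/* zeroing stops before this member */
	unsigned char *end;
};

/*
 * Allocate a demo_buf and zero only the header part, so that a later
 * copy of the whole header never reads bytes that were never written.
 */
static struct demo_buf *demo_alloc(void)
{
	struct demo_buf *buf = malloc(sizeof(*buf));

	if (!buf)
		return NULL;
	memset(buf, 0, offsetof(struct demo_buf, tail));
	buf->tail = NULL;
	buf->end = NULL;
	return buf;
}
```

Zeroing only up to `tail` keeps the cost low while still covering the flag bytes and any padding between them, which is where a byte-granular checker would otherwise see uninitialized reads.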
another one is:
kmemcheck: Caught 8-bit read from uninitialized memory (f653a902)
iiuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuu
^
Pid: 2575, comm: hcid Not tainted (2.6.26-tip #20187)
EIP: 0060:[<c02b9926>] EFLAGS: 00010293 CPU: 0
EIP is at __copy_to_user_ll+0x46/0x70
EAX: 00000004 EBX: b7f3c478 ECX: 00000002 EDX: f653a900
ESI: f653a902 EDI: b7f3c47a EBP: f668ceec ESP: c09ddbc8
DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
CR0: 8005003b CR2: f71c2700 CR3: 3668d000 CR4: 000006d0
DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
DR6: ffff4ff0 DR7: 00000400
[<c02b9e4a>] copy_to_user+0x3a/0x50
[<c068b1b0>] hci_get_dev_list+0x100/0x120
[<c068fe53>] hci_sock_ioctl+0x143/0x2c0
[<c05e6a41>] sock_ioctl+0xc1/0x1d0
[<c0187aad>] vfs_ioctl+0x2d/0x90
[<c0187d7b>] do_vfs_ioctl+0x26b/0x2d0
[<c0187e37>] sys_ioctl+0x57/0x70
[<c0103c01>] sysenter_past_esp+0x6a/0x91
[<ffffffff>] 0xffffffff

this might actually be genuine use of uninitialized memory, hm? Or
perhaps gcc optimizing out bitmasks and kmemcheck not coping with it?

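One way a byte-granular checker can trip over bitfields, sketched in user space: a store to a single bitfield member is typically compiled as a read-modify-write of the whole storage unit, so the neighbouring bits are read even though the C code never mentions them. `struct demo_req` is a hypothetical layout loosely modelled on the bitfields in `inet_request_sock`, and `demo_req_init()` is an illustrative helper, not kernel code.

```c
#include <string.h>

/* Hypothetical bitfield group: 13 bits used plus 3 named filler bits. */
struct demo_req {
	unsigned int snd_wscale : 4,
		     rcv_wscale : 4,
		     tstamp_ok  : 1,
		     sack_ok    : 1,
		     wscale_ok  : 1,
		     ecn_ok     : 1,
		     acked      : 1,
		     filler     : 3;
};

/*
 * Zero the whole object first, then set individual fields. Without the
 * memset, each store below may read neighbour bits that were never
 * written.
 */
static void demo_req_init(struct demo_req *req, unsigned int snd_wscale,
			  unsigned int sack_ok)
{
	memset(req, 0, sizeof(*req));
	req->snd_wscale = snd_wscale;
	req->sack_ok = sack_ok;
}
```

Naming the filler bits, as the patch below does with `__filler`, makes it possible to initialize them explicitly instead of relying on a blanket `memset()`.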
a third type was this:
kmemcheck: Caught 8-bit read from uninitialized memory (f653a2a4)
iiiiiiiiiiiiiiiiuuuuuuuuuuuuuuuuuuuuuiuuuuuuuuuuuuuuuuuuuuuuuuuu
^
Pid: 2771, comm: ssh Not tainted (2.6.26-tip #20187)
EIP: 0060:[<c05e973c>] EFLAGS: 00010282 CPU: 0
EIP is at __copy_skb_header+0x7c/0x100
EAX: 00000000 EBX: f653a240 ECX: f6762000 EDX: f6762000
ESI: f6762050 EDI: f653a290 EBP: f675cd28 ESP: c09ddce8
DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
CR0: 8005003b CR2: f71c2700 CR3: 367e3000 CR4: 000006d0
DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
DR6: ffff4ff0 DR7: 00000400
[<c05e97e7>] __skb_clone+0x27/0xe0
[<c05eb101>] skb_clone+0x41/0x60
[<c062fbd1>] tcp_transmit_skb+0x41/0x800
[<c06328c3>] tcp_connect+0x293/0x330
[<c0636676>] tcp_v4_connect+0x3d6/0x550
[<c0642359>] inet_stream_connect+0x1b9/0x240
[<c05e4e66>] sys_connect+0x86/0xa0
[<c05e66d0>] sys_socketcall+0x220/0x260
[<c0103c01>] sysenter_past_esp+0x6a/0x91
[<ffffffff>] 0xffffffff

this too is likely a false positive related to RX packets?

none of this looks netconsole related.

I'll keep the box running under kmemcheck - maybe something pops up.

	Ingo

------------------->
Subject: kmemcheck/net hacks
From: Ingo Molnar <mingo@elte.hu>
---
 include/asm-generic/siginfo.h |    8 ++++++++
 include/linux/fs.h            |    4 ++--
 include/linux/netdevice.h     |    4 ++--
 include/linux/skbuff.h        |    6 +++++-
 include/net/inet_sock.h       |    3 ++-
 include/net/tcp.h             |   11 +++++++++++
 kernel/signal.c               |   12 ++++++++++++
 net/core/skbuff.c             |    6 ++++++
 net/ipv4/tcp_output.c         |    4 ++++
 9 files changed, 52 insertions(+), 6 deletions(-)

Index: linux/include/asm-generic/siginfo.h
===================================================================
--- linux.orig/include/asm-generic/siginfo.h
+++ linux/include/asm-generic/siginfo.h
@@ -278,11 +278,19 @@ void do_schedule_next_timer(struct sigin
static inline void copy_siginfo(struct siginfo *to, struct siginfo *from)
{
+#ifdef CONFIG_KMEMCHECK
+ memcpy(to, from, sizeof(*to));
+#else
+ /*
+ * Optimization, only copy up to the size of the largest known
+ * union member:
+ */
if (from->si_code < 0)
memcpy(to, from, sizeof(*to));
else
/* _sigchld is currently the largest know union member */
memcpy(to, from, __ARCH_SI_PREAMBLE_SIZE + sizeof(from->_sifields._sigchld));
+#endif
}
#endif
Index: linux/include/linux/fs.h
===================================================================
--- linux.orig/include/linux/fs.h
+++ linux/include/linux/fs.h
@@ -922,8 +922,8 @@ struct file_lock {
struct pid *fl_nspid;
wait_queue_head_t fl_wait;
struct file *fl_file;
- unsigned char fl_flags;
- unsigned char fl_type;
+ unsigned int fl_flags;
+ unsigned int fl_type;
loff_t fl_start;
loff_t fl_end;
Index: linux/include/linux/netdevice.h
===================================================================
--- linux.orig/include/linux/netdevice.h
+++ linux/include/linux/netdevice.h
@@ -199,8 +199,8 @@ struct dev_addr_list
{
struct dev_addr_list *next;
u8 da_addr[MAX_ADDR_LEN];
- u8 da_addrlen;
- u8 da_synced;
+ unsigned int da_addrlen;
+ unsigned int da_synced;
int da_users;
int da_gusers;
};
Index: linux/include/linux/skbuff.h
===================================================================
--- linux.orig/include/linux/skbuff.h
+++ linux/include/linux/skbuff.h
@@ -1208,7 +1208,11 @@ static inline void __skb_queue_purge(str
static inline struct sk_buff *__dev_alloc_skb(unsigned int length,
gfp_t gfp_mask)
{
- struct sk_buff *skb = alloc_skb(length + NET_SKB_PAD, gfp_mask);
+ struct sk_buff *skb;
+#ifdef CONFIG_KMEMCHECK
+ gfp_mask |= __GFP_ZERO;
+#endif
+ skb = alloc_skb(length + NET_SKB_PAD, gfp_mask);
if (likely(skb))
skb_reserve(skb, NET_SKB_PAD);
return skb;
Index: linux/include/net/inet_sock.h
===================================================================
--- linux.orig/include/net/inet_sock.h
+++ linux/include/net/inet_sock.h
@@ -72,7 +72,8 @@ struct inet_request_sock {
sack_ok : 1,
wscale_ok : 1,
ecn_ok : 1,
- acked : 1;
+ acked : 1,
+ __filler : 3;
struct ip_options *opt;
};
Index: linux/include/net/tcp.h
===================================================================
--- linux.orig/include/net/tcp.h
+++ linux/include/net/tcp.h
@@ -966,6 +966,17 @@ static inline void tcp_openreq_init(stru
tcp_rsk(req)->rcv_isn = TCP_SKB_CB(skb)->seq;
req->mss = rx_opt->mss_clamp;
req->ts_recent = rx_opt->saw_tstamp ? rx_opt->rcv_tsval : 0;
+#ifdef CONFIG_KMEMCHECK
+ /* bitfield init */
+ ireq->snd_wscale =
+ ireq->rcv_wscale =
+ ireq->tstamp_ok =
+ ireq->sack_ok =
+ ireq->wscale_ok =
+ ireq->ecn_ok =
+ ireq->acked =
+ ireq->__filler = 0;
+#endif
ireq->tstamp_ok = rx_opt->tstamp_ok;
ireq->sack_ok = rx_opt->sack_ok;
ireq->snd_wscale = rx_opt->snd_wscale;
Index: linux/kernel/signal.c
===================================================================
--- linux.orig/kernel/signal.c
+++ linux/kernel/signal.c
@@ -841,6 +841,12 @@ static int send_signal(int sig, struct s
list_add_tail(&q->list, &pending->list);
switch ((unsigned long) info) {
case (unsigned long) SEND_SIG_NOINFO:
+ /*
+ * Make sure we always have a fully initialized
+ * siginfo struct:
+ */
+ memset(&q->info, 0, sizeof(q->info));
+
q->info.si_signo = sig;
q->info.si_errno = 0;
q->info.si_code = SI_USER;
@@ -848,6 +854,12 @@ static int send_signal(int sig, struct s
q->info.si_uid = current->uid;
break;
case (unsigned long) SEND_SIG_PRIV:
+ /*
+ * Make sure we always have a fully initialized
+ * siginfo struct:
+ */
+ memset(&q->info, 0, sizeof(q->info));
+
q->info.si_signo = sig;
q->info.si_errno = 0;
q->info.si_code = SI_KERNEL;
Index: linux/net/core/skbuff.c
===================================================================
--- linux.orig/net/core/skbuff.c
+++ linux/net/core/skbuff.c
@@ -225,6 +225,9 @@ struct sk_buff *__alloc_skb(unsigned int
struct sk_buff *child = skb + 1;
atomic_t *fclone_ref = (atomic_t *) (child + 1);
+#ifdef CONFIG_KMEMCHECK
+ memset(child, 0, offsetof(struct sk_buff, tail));
+#endif
skb->fclone = SKB_FCLONE_ORIG;
atomic_set(fclone_ref, 1);
@@ -257,6 +260,9 @@ struct sk_buff *__netdev_alloc_skb(struc
int node = dev_to_node(&dev->dev);
struct sk_buff *skb;
+#ifdef CONFIG_KMEMCHECK
+ gfp_mask |= __GFP_ZERO;
+#endif
skb = __alloc_skb(length + NET_SKB_PAD, gfp_mask, 0, node);
if (likely(skb)) {
skb_reserve(skb, NET_SKB_PAD);
Index: linux/net/ipv4/tcp_output.c
===================================================================
--- linux.orig/net/ipv4/tcp_output.c
+++ linux/net/ipv4/tcp_output.c
@@ -333,6 +333,10 @@ static inline void TCP_ECN_send(struct s
static void tcp_init_nondata_skb(struct sk_buff *skb, u32 seq, u8 flags)
{
skb->csum = 0;
+ skb->local_df = skb->cloned = skb->ip_summed = skb->nohdr =
+ skb->nfctinfo = 0;
+ skb->pkt_type = skb->fclone = skb->ipvs_property = skb->peeked =
+ skb->nf_trace = 0;
TCP_SKB_CB(skb)->flags = flags;
TCP_SKB_CB(skb)->sacked = 0;