* [B.A.T.M.A.N.] [PATCH] batman-adv-kernelland: Fix memory corruption bug
@ 2008-12-04 1:14 Scott Raynel
2008-12-04 2:30 ` Marek Lindner
2008-12-04 11:35 ` Simon Wunderlich
0 siblings, 2 replies; 6+ messages in thread
From: Scott Raynel @ 2008-12-04 1:14 UTC
To: The list for a Better Approach To Mobile Ad-hoc Networking
Hi there,
I've been spending some time tracking down a bug that's been causing
memory corruption followed by random kernel panics. Thanks to the
kernel's slab memory debugger I tracked it down to a kfree in send.c
that was freeing a block of memory that had been written to past the
end of its allocation.
Turned out to be a simple typo, which I've fixed in the following
patch. When resizing the packet_buff struct in batman_if, the new
length was being updated but the old length was being used for the
kmalloc(), causing something later to think it had more memory
allocated to write to, hence writing past the end of the allocation.
Signed-off-by: Scott Raynel <scottraynel@gmail.com>
Index: send.c
===================================================================
--- send.c (revision 1105)
+++ send.c (working copy)
@@ -159,7 +159,7 @@
if ((hna_local_changed) && (batman_if->if_num == 0)) {
new_len = sizeof(struct batman_packet) + (num_hna * ETH_ALEN);
- new_buf = kmalloc(batman_if->pack_buff_len, GFP_ATOMIC);
+ new_buf = kmalloc(new_len, GFP_ATOMIC);
/* keep old buffer if kmalloc should fail */
if (new_buf) {
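The effect of the fix can be sketched in user space. The following is a minimal illustration, not the module's code: `struct iface`, `resize_pack_buff()` and `HEADER_LEN` are hypothetical names, `calloc()` stands in for `kmalloc()`, and `HEADER_LEN` stands in for `sizeof(struct batman_packet)`. It shows the allocation size and the stored length both being derived from the same `new_len`, with the old buffer kept if the allocation fails.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define ETH_ALEN   6   /* length of an Ethernet MAC address in bytes */
#define HEADER_LEN 16  /* hypothetical stand-in for sizeof(struct batman_packet) */

/* Hypothetical, simplified stand-in for the per-interface structure. */
struct iface {
    unsigned char *pack_buff;
    size_t pack_buff_len;
};

/*
 * Resize the packet buffer to hold the header plus num_hna addresses.
 * Returns 0 on success, or -1 if the allocation failed, in which case
 * the old buffer and the old length are left untouched.
 */
int resize_pack_buff(struct iface *ifp, size_t num_hna)
{
    size_t new_len = HEADER_LEN + num_hna * ETH_ALEN;
    size_t keep;
    unsigned char *new_buf;

    /* Allocate with new_len, the same value that is stored below. */
    new_buf = calloc(1, new_len);
    if (!new_buf)
        return -1;

    /* Carry over the header portion of the old buffer, if there is one. */
    keep = ifp->pack_buff_len < HEADER_LEN ? ifp->pack_buff_len : HEADER_LEN;
    if (ifp->pack_buff)
        memcpy(new_buf, ifp->pack_buff, keep);

    free(ifp->pack_buff);
    ifp->pack_buff = new_buf;
    ifp->pack_buff_len = new_len;
    return 0;
}
```

With the pre-patch behaviour, the allocation would have used the old `pack_buff_len` while the stored length became `new_len`, so any writer trusting the stored length could run past the end of the buffer once `num_hna` grew.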
Cheers,
--
Scott Raynel
WAND Network Research Group
Department of Computer Science
University of Waikato
New Zealand
* Re: [B.A.T.M.A.N.] [PATCH] batman-adv-kernelland: Fix memory corruption bug
2008-12-04 1:14 [B.A.T.M.A.N.] [PATCH] batman-adv-kernelland: Fix memory corruption bug Scott Raynel
@ 2008-12-04 2:30 ` Marek Lindner
2008-12-04 11:35 ` Simon Wunderlich
1 sibling, 0 replies; 6+ messages in thread
From: Marek Lindner @ 2008-12-04 2:30 UTC
To: The list for a Better Approach To Mobile Ad-hoc Networking
Hey,
> Turned out to be a simple typo, which I've fixed in the following
> patch. When resizing the packet_buff struct in batman_if, the new
> length was being updated but the old length was being used for the
> kmalloc(), causing something later to think it had more memory
> allocated to write to, hence writing past the end of the allocation.
wow - nice catch !
I happily applied your patch (revision 1173). :-)
Regards,
Marek
* Re: [B.A.T.M.A.N.] [PATCH] batman-adv-kernelland: Fix memory corruption bug
2008-12-04 1:14 [B.A.T.M.A.N.] [PATCH] batman-adv-kernelland: Fix memory corruption bug Scott Raynel
2008-12-04 2:30 ` Marek Lindner
@ 2008-12-04 11:35 ` Simon Wunderlich
2008-12-05 10:40 ` Scott Raynel
1 sibling, 1 reply; 6+ messages in thread
From: Simon Wunderlich @ 2008-12-04 11:35 UTC
To: The list for a Better Approach To Mobile Ad-hoc Networking
Hey Scott,
thank you very much for the fix! Can you confirm if this bug is related
to https://dev.open-mesh.net/batman/ticket/86 ?
This bug has very likely been caused by a memory corruption, but i
couldn't find where. (i have not experienced any kernel panics by this
however ...).
Thanks, best regards
Simon
On Thu, Dec 04, 2008 at 02:14:27PM +1300, Scott Raynel wrote:
> Hi there,
>
> I've been spending some time tracking down a bug that's been causing
> memory corruption followed by random kernel panics. Thanks to the
> kernel's slab memory debugger I tracked it down to a kfree in send.c
> that was freeing a block of memory that had been written to past the
> end of its allocation.
>
> Turned out to be a simple typo, which I've fixed in the following
> patch. When resizing the packet_buff struct in batman_if, the new
> length was being updated but the old length was being used for the
> kmalloc(), causing something later to think it had more memory
> allocated to write to, hence writing past the end of the allocation.
>
> Signed-off-by: Scott Raynel <scottraynel@gmail.com>
>
> Index: send.c
> ===================================================================
> --- send.c (revision 1105)
> +++ send.c (working copy)
> @@ -159,7 +159,7 @@
> if ((hna_local_changed) && (batman_if->if_num == 0)) {
>
> new_len = sizeof(struct batman_packet) + (num_hna * ETH_ALEN);
> - new_buf = kmalloc(batman_if->pack_buff_len, GFP_ATOMIC);
> + new_buf = kmalloc(new_len, GFP_ATOMIC);
>
> /* keep old buffer if kmalloc should fail */
> if (new_buf) {
>
>
> Cheers,
>
> --
> Scott Raynel
> WAND Network Research Group
> Department of Computer Science
> University of Waikato
> New Zealand
* Re: [B.A.T.M.A.N.] [PATCH] batman-adv-kernelland: Fix memory corruption bug
2008-12-04 11:35 ` Simon Wunderlich
@ 2008-12-05 10:40 ` Scott Raynel
2008-12-05 19:51 ` Simon Wunderlich
0 siblings, 1 reply; 6+ messages in thread
From: Scott Raynel @ 2008-12-05 10:40 UTC
To: The list for a Better Approach To Mobile Ad-hoc Networking
Hi Simon,
On 5/12/2008, at 12:35 AM, Simon Wunderlich wrote:
> Hey Scott,
>
> thank you very much for the fix! Can you confirm if this bug is
> related
> to https://dev.open-mesh.net/batman/ticket/86 ?
> This bug has very likely been caused by a memory corruption, but i
> couldn't find where. (i have not experienced any kernel panics by
> this
> however ...).
It is quite possible that they are related. The slab error states that
a memory allocation was overwritten - the same problem as my patch
fixed. However, I can't confirm whether it is the same memory
allocation or a different one. The stack trace I got specifically
mentioned the kfree() in send_own_packet(), whereas this stack trace
does not.
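One reason a trace can point at a kfree() rather than at the code that did the overwriting is that guard-byte checks of this kind typically run when the object is freed. A minimal user-space sketch of the idea follows, assuming a simple trailing guard region; `guarded_alloc()`, `guard_intact()`, `guarded_free()`, `REDZONE_LEN` and `REDZONE_BYTE` are hypothetical names for illustration and do not reflect the kernel's actual slab debugging implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define REDZONE_LEN  8     /* size of the guard region, chosen for this sketch */
#define REDZONE_BYTE 0xCC  /* arbitrary guard pattern, chosen for this sketch */

/* Allocate len usable bytes followed by a guard region filled with a pattern. */
void *guarded_alloc(size_t len)
{
    unsigned char *p = malloc(len + REDZONE_LEN);

    if (!p)
        return NULL;
    memset(p + len, REDZONE_BYTE, REDZONE_LEN);
    return p;
}

/* Return 1 if the guard region after the len usable bytes is intact, else 0. */
int guard_intact(const void *ptr, size_t len)
{
    const unsigned char *guard = (const unsigned char *)ptr + len;
    size_t i;

    for (i = 0; i < REDZONE_LEN; i++) {
        if (guard[i] != REDZONE_BYTE)
            return 0;
    }
    return 1;
}

/*
 * Free the block and report whether an overrun was detected.
 * Returns 0 if the guard was intact, -1 if it had been overwritten.
 * The check happens here, at free time, not when the bad write occurred.
 */
int guarded_free(void *ptr, size_t len)
{
    int ok = guard_intact(ptr, len);

    free(ptr);
    return ok ? 0 : -1;
}
```

In this model the overrun is only noticed inside `guarded_free()`, so the function that frees the block is the one that shows up in the report, while the writer that caused the damage may be somewhere else entirely.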
Is that bug easily reproducible? It will be a couple of days before I
can try to look at it.
Also, the stack trace is confusing as it appears to indicate a kfree()
within hardif_min_mtu(), which I can't find :)
I'll try to do some stress testing of the module with the slab
debugger turned on for a while and see what happens.
Cheers,
--
Scott Raynel
WAND Network Research Group
Department of Computer Science
University of Waikato
New Zealand
* Re: [B.A.T.M.A.N.] [PATCH] batman-adv-kernelland: Fix memory corruption bug
2008-12-05 10:40 ` Scott Raynel
@ 2008-12-05 19:51 ` Simon Wunderlich
2008-12-12 9:08 ` Scott Raynel
0 siblings, 1 reply; 6+ messages in thread
From: Simon Wunderlich @ 2008-12-05 19:51 UTC
To: The list for a Better Approach To Mobile Ad-hoc Networking
Hey Scott,
On Fri, Dec 05, 2008 at 11:40:30PM +1300, Scott Raynel wrote:
> Hi Simon,
>
> On 5/12/2008, at 12:35 AM, Simon Wunderlich wrote:
>
> >Hey Scott,
> >
> >thank you very much for the fix! Can you confirm if this bug is
> >related
> >to https://dev.open-mesh.net/batman/ticket/86 ?
> >This bug has very likely been caused by a memory corruption, but i
> >couldn't find where. (i have not experienced any kernel panics by
> >this
> >however ...).
>
>
> It is quite possible that they are related. The slab error states that
> a memory allocation was overwritten - the same problem as my patch
> fixed. However, I can't confirm whether it is the same memory
> allocation or a different one. The stack trace I got specifically
> mentioned the kfree() in send_own_packet(), whereas this stack trace
> does not.
>
> Is that bug easily reproducible? It will be a couple of days before I
> can try to look at it.
Yep, it was quite easy: just turn it on and off a few times. (echo
device and nothing into /proc/net/batman-adv/interfaces). The warning
appeared after 10 times in my qemu instance. No crash, only this warning.
>
> Also, the stack trace is confusing as it appears to indicate a kfree()
> within hardif_min_mtu(), which I can't find :)
That's the problem, that is what confused me at this point. :/
>
> I'll try to do some stress testing of the module with the slab
> debugger turned on for a while and see what happens.
Sounds great. Thanks for your hard work. :)
best regards,
Simon
* Re: [B.A.T.M.A.N.] [PATCH] batman-adv-kernelland: Fix memory corruption bug
2008-12-05 19:51 ` Simon Wunderlich
@ 2008-12-12 9:08 ` Scott Raynel
0 siblings, 0 replies; 6+ messages in thread
From: Scott Raynel @ 2008-12-12 9:08 UTC
To: The list for a Better Approach To Mobile Ad-hoc Networking
Hi Simon,
On 6/12/2008, at 8:51 AM, Simon Wunderlich wrote:
> Hey Scott,
>
> On Fri, Dec 05, 2008 at 11:40:30PM +1300, Scott Raynel wrote:
>> Hi Simon,
>>
>> On 5/12/2008, at 12:35 AM, Simon Wunderlich wrote:
>>
>>> Hey Scott,
>>>
>>> thank you very much for the fix! Can you confirm if this bug is
>>> related
>>> to https://dev.open-mesh.net/batman/ticket/86 ?
>>> This bug has very likely been caused by a memory corruption, but i
>>> couldn't find where. (i have not experienced any kernel panics by
>>> this
>>> however ...).
>>
>>
>> It is quite possible that they are related. The slab error states
>> that
>> a memory allocation was overwritten - the same problem as my patch
>> fixed. However, I can't confirm whether it is the same memory
>> allocation or a different one. The stack trace I got specifically
>> mentioned the kfree() in send_own_packet(), whereas this stack trace
>> does not.
>>
>> Is that bug easily reproducible? It will be a couple of days before I
>> can try to look at it.
>
> Yep, it was quite easy: just turn it on and off a few times. (echo
> device and nothing into /proc/net/batman-adv/interfaces). The warning
> appeared after 10 times in my qemu instance. No crash, only this
> warning.
I can't reproduce this bug before my patch is applied because the bug
it fixes always gets in the way :)
After applying the patch I seem to be able to consistently lock up the
system by adding and removing an interface from the batman device
several times. The box still replies to pings, but I can't SSH in.
This does not trigger the slab debugger. I've looked at using the
magic SysRq interface to see what's going on and by printing the
current task it appears to be hanging during the
cancel_rearming_delayed_work() call in shutdown_module(). This might
be related to the scheduling-while-atomic bugs. I'll keep looking into
this as I get time, but things are pretty busy here at the moment.
Cheers,
--
Scott Raynel
WAND Network Research Group
Department of Computer Science
University of Waikato
New Zealand