From: Kurt Van Dijck <dev.kurt@vandijck-laurijssen.be>
To: Maxime Jayat <maxime.jayat@mobile-devices.fr>
Cc: linux-can@vger.kernel.org
Subject: Re: J1939 Memory corruption
Date: Wed, 20 Jul 2016 09:36:19 +0200 [thread overview]
Message-ID: <20160720073619.GC5233@airbook> (raw)
In-Reply-To: <CA+8_ywZWX9ECCvsRM5NOXaD9TvL1Sg=WuSGELfmAhzYRn6Z_GA@mail.gmail.com>
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset=utf-8, Size: 7091 bytes --]
Hey Maxime,
I added linux-can in CC:
I appreciate your comments.
I will try to provide you with proper answers.
I'll first fresh up my memory, and look into the code.
I hope to come back on this before the weekend.
Kind regards,
Kurt
--- Original message ---
> Date: Tue, 19 Jul 2016 15:15:24 +0200
> From: Maxime Jayat <maxime.jayat@mobile-devices.fr>
> To: kurt@vandijck-laurijssen.be
> Subject: J1939 Memory corruption
>
> Hello Kurt,
> I am a user of your J1939 implementation for linux. Previously I was
> using it with a linux 2.6.32 with success. Recently we moved to a 4.1
> kernel, so I am currently testing the new J1939 implementation (the
> 'direct' one). I merged the linux j1939d-v4.1 branch into my tree and
> it works correctly. Not having to patch and use iproute2 is greatly
> appreciated.
> However, I am able to reliably reproduce a kernel panic, or at least a
> kref warning which leads to a random kernel crash some time later.
> I have attached a C program reproducing the problem, but here is a
> quick description:
> - I bind a J1939 socket to source address 0x80 with name
> 0x7777777777777777
> - I send an address claim for this source address
> - Another "external" ECU sends an address claim for 0x80 with name
> 0x1111111111111111 (which has higher priority, so "wins" the source
> address).
> - I re-bind my J1939 socket to source address 0x81.
> This leads to:
> ------------[ cut here ]------------
> WARNING: CPU: 0 PID: 861 at include/linux/kref.h:47
> j1939_name_local_get+0x128/0x13c()
> Modules linked in:
> CPU: 0 PID: 861 Comm: j1939 Not tainted 4.1.24-linux4sam_5.3 #64
> Hardware name: Atmel SAMA5
> [<c001721c>] (unwind_backtrace) from [<c00145e8>]
> (show_stack+0x20/0x24)
> [<c00145e8>] (show_stack) from [<c0697db0>] (dump_stack+0x24/0x28)
> [<c0697db0>] (dump_stack) from [<c0027a14>]
> (warn_slowpath_common+0x94/0xc0)
> [<c0027a14>] (warn_slowpath_common) from [<c0027afc>]
> (warn_slowpath_null+0x2c/0x34)
> [<c0027afc>] (warn_slowpath_null) from [<c05a00ec>]
> (j1939_name_local_get+0x128/0x13c)
> [<c05a00ec>] (j1939_name_local_get) from [<c05a42c8>]
> (j1939sk_bind+0x208/0x360)
> [<c05a42c8>] (j1939sk_bind) from [<c049aef4>] (SyS_bind+0x6c/0x9c)
> [<c049aef4>] (SyS_bind) from [<c0010860>] (ret_fast_syscall+0x0/0x60)
> ---[ end trace 93231ae522a8b287 ]---
> FYI, candump during this:
> (029.642505)Â can0Â 18EEFF80Â Â [8]Â 77 77 77 77 77 77 77 77
> (001.001629)Â can0Â 18EEFF80Â Â [8]Â 11 11 11 11 11 11 11 11
> (001.234503)Â can0Â 18EEFF81Â Â [8]Â 77 77 77 77 77 77 77 77
> I tried to understand to problem and arrived to the following (and
> incomplete) conclusions:
> It seems to be a mismatch between the ecu list in priv->ecus and the
> refcounting of the ECUs.
> When the external address claim is received, __j1939_ecu_unregister is
> called by j1939_process_address_claim for my ECU and it is removed from
> the priv->ecus list but there are still some references to it, so it is
> not freed.
> Then, when re-binding the socket, j1939_name_local_put is called which
> calls __j1939_ecu_get_register which does not find the name in the
> priv->ecus list, so it reallocates a new ECU and insert it in the
> priv->ecus list. However j1939_name_local_put immediately drops the ECU
> reference, so the ECU is freed, but it is still in the priv->ecus list!
> So the list contains a reference to a freed structure. The previous
> warning appears when kref tries to take a reference while the count is
> already at zero. At this point the kernel panics, either when closing
> the socket, or randomly a few seconds later because the memory is
> corrupted.
> I have trouble following where all the krefs to the ecus are taken and
> if it all makes sense everywhere. __j1939_ecu_get_register can either
> allocate a new ECU and hold a reference (kref_init initializes at 1),
> or return an existing ECU without touching the reference count. How is
> it possible to know if the caller must call put_j1939_ecu when done
> with the ecu?
> By the way, __j1939_ecu_get_register has a unused "create_if_necessary"
> parameter.
> Although I probably could fix these issues with some time, I would
> appreciate any help you can give me.
> Thanks in advance,
> --
> Maxime Jayat
> #include <errno.h>
> #include <inttypes.h>
> #include <net/if.h>
> #include <stdio.h>
> #include <stdlib.h>
> #include <string.h>
> #include <sys/ioctl.h>
> #include <sys/socket.h>
> #include <sys/types.h>
> #include <unistd.h>
> /* linux can headers copied into currect directory */
> #include "can.h"
> #include "raw.h"
> #include "j1939.h"
>
>
> static int open_j1939_socket(const char *iface, uint64_t name, uint8_t sa)
> {
> int ret, sock;
> int value;
> struct sockaddr_can saddr = {
> .can_family = AF_CAN,
> .can_addr.j1939 = {
> .name = name,
> .addr = sa,
> .pgn = 0x0ee00,
> },
> };
>
> sock = ret = socket(PF_CAN, SOCK_DGRAM | SOCK_CLOEXEC, CAN_J1939);
> if (ret < 0) {
> goto fail;
> }
>
> ret = setsockopt(sock, SOL_SOCKET, SO_BINDTODEVICE,
> iface, strnlen(iface, 40));
> if (ret < 0) {
> goto fail;
> }
>
> ret = bind(sock, (void *)&saddr, sizeof(saddr));
> if (ret < 0) {
> goto fail;
> }
> return sock;
>
> fail:
> close(sock);
> return -1;
> }
>
> static int open_raw_socket(const char *iface)
> {
> int s;
> struct sockaddr_can addr;
> struct ifreq ifr;
> s = socket(PF_CAN, SOCK_RAW, CAN_RAW);
>
> strcpy(ifr.ifr_name, iface);
> if (ioctl(s, SIOCGIFINDEX, &ifr) < 0) {
> perror("SIOCGIFINDEX");
> return 1;
> }
> addr.can_family = AF_CAN;
> addr.can_ifindex = ifr.ifr_ifindex;
>
> if (bind(s, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
> perror("bind");
> return 1;
> }
>
> return s;
> }
>
>
> int main(void)
> {
> int s1939, sraw;
> uint64_t name = 0x7777777777777777ll;
> uint8_t buf[8];
>
> s1939 = open_j1939_socket("can0", name, /* src address */ 0x80);
> sraw = open_raw_socket("can0");
>
> memcpy(buf, &name, 8);
> /* Send address claim for src address 0x80 */
> send(s1939, buf, 8, 0);
> sleep(1);
> {
> /* Simulate an ecu claiming src address 0x80 with smaller NAME */
> struct can_frame frame = {
> .can_id = 0x18eeff80 | CAN_EFF_FLAG,
> .can_dlc = 8,
> .data = { 0x11, 0x11, 0x11, 0x11, 0x11, 0x11, 0x11, 0x11 }
> };
> write(sraw, &frame, sizeof(frame));
> }
> sleep(1);
> /* Rebind to src address 0x81 */
> struct sockaddr_can saddr = {
> .can_family = AF_CAN,
> .can_addr.j1939 = {
> .name = name,
> .addr = 0x81,
> .pgn = 0x0ee00,
> },
> };
> bind(s1939, (void *)&saddr, sizeof(saddr));
> /* Memory corrupted */
>
> close(s1939);
> close(sraw);
> }
parent reply other threads:[~2016-07-20 7:49 UTC|newest]
Thread overview: expand[flat|nested] mbox.gz Atom feed
[parent not found: <CA+8_ywZWX9ECCvsRM5NOXaD9TvL1Sg=WuSGELfmAhzYRn6Z_GA@mail.gmail.com>]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20160720073619.GC5233@airbook \
--to=dev.kurt@vandijck-laurijssen.be \
--cc=linux-can@vger.kernel.org \
--cc=maxime.jayat@mobile-devices.fr \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.