* Re: BUG: unable to handle kernel NULL pointer dereference in rds_send_xmit
[not found] <001a1145ac5480242305609956b3@google.com>
@ 2017-12-18 16:28 ` Santosh Shilimkar
2017-12-18 17:12 ` David Miller
0 siblings, 1 reply; 6+ messages in thread
From: Santosh Shilimkar @ 2017-12-18 16:28 UTC (permalink / raw)
To: syzbot, davem, linux-kernel, linux-rdma, netdev, rds-devel,
syzkaller-bugs
On 12/18/2017 12:43 AM, syzbot wrote:
> Hello,
>
> syzkaller hit the following crash on
> 6084b576dca2e898f5c101baef151f7bfdbb606d
> git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/master
> compiler: gcc (GCC) 7.1.1 20170620
> .config is attached
> Raw console output is attached.
>
> Unfortunately, I don't have any reproducer for this bug yet.
>
>
> BUG: unable to handle kernel NULL pointer dereference at 0000000000000028
> program syz-executor6 is using a deprecated SCSI ioctl, please convert
> it to SG_IO
> IP: rds_send_xmit+0x80/0x930 net/rds/send.c:186
Looks like another one tripping on an empty transport. The patch below
should most likely address it, but we will test whether it does.
diff --git a/net/rds/send.c b/net/rds/send.c
index 7244d2e..e2d0eaa 100644
--- a/net/rds/send.c
+++ b/net/rds/send.c
@@ -183,7 +183,7 @@ int rds_send_xmit(struct rds_conn_path *cp)
 		goto out;
 	}
-	if (conn->c_trans->xmit_path_prepare)
+	if (conn->c_trans && conn->c_trans->xmit_path_prepare)
 		conn->c_trans->xmit_path_prepare(cp);
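The guard in the patch can be illustrated outside the kernel. The following is a minimal userspace sketch, not the RDS code itself: the struct layouts, the `prepared` field, and the names `demo_prepare` and `path_prepare` are assumptions made for illustration. Only `c_trans`, `cp_conn`, and `xmit_path_prepare` come from the thread.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the RDS structures; only the fields
 * needed to show the guarded call are modelled. */
struct conn_path;

struct transport {
	/* Optional hook: a transport may leave this NULL. */
	void (*xmit_path_prepare)(struct conn_path *cp);
};

struct connection {
	struct transport *c_trans;	/* may be NULL in this model */
};

struct conn_path {
	struct connection *cp_conn;
	int prepared;			/* set by the demo hook below */
};

/* Demo hook (hypothetical): records that it ran. */
static void demo_prepare(struct conn_path *cp)
{
	cp->prepared = 1;
}

/* Same condition as the patched line: test the transport pointer
 * before testing or calling the optional hook.
 * Returns 1 if the hook ran, 0 if it was skipped. */
static int path_prepare(struct conn_path *cp)
{
	struct connection *conn = cp->cp_conn;

	if (conn->c_trans && conn->c_trans->xmit_path_prepare) {
		conn->c_trans->xmit_path_prepare(cp);
		return 1;
	}
	return 0;
}
```

With `c_trans` set to NULL, `path_prepare()` returns 0 instead of faulting. Note that this only avoids the dereference; it does not explain how a connection without a transport reached the send path.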
* Re: BUG: unable to handle kernel NULL pointer dereference in rds_send_xmit
2017-12-18 16:28 ` BUG: unable to handle kernel NULL pointer dereference in rds_send_xmit Santosh Shilimkar
@ 2017-12-18 17:12 ` David Miller
2017-12-18 17:16 ` Santosh Shilimkar
2017-12-18 17:22 ` [rds-devel] " Sowmini Varadhan
0 siblings, 2 replies; 6+ messages in thread
From: David Miller @ 2017-12-18 17:12 UTC (permalink / raw)
To: santosh.shilimkar
Cc: bot+aaf54a8c644d559d34dedcf3126aac68a20c9e63, linux-kernel,
linux-rdma, netdev, rds-devel, syzkaller-bugs
From: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Date: Mon, 18 Dec 2017 08:28:05 -0800
> On 12/18/2017 12:43 AM, syzbot wrote:
>> Hello,
>> syzkaller hit the following crash on
>> 6084b576dca2e898f5c101baef151f7bfdbb606d
>> git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/master
>> compiler: gcc (GCC) 7.1.1 20170620
>> .config is attached
>> Raw console output is attached.
>> Unfortunately, I don't have any reproducer for this bug yet.
>> BUG: unable to handle kernel NULL pointer dereference at
>> 0000000000000028
>> program syz-executor6 is using a deprecated SCSI ioctl, please convert
>> it to SG_IO
>> IP: rds_send_xmit+0x80/0x930 net/rds/send.c:186
>
> Looks like another one tripping on an empty transport. The patch below
> should most likely address it, but we will test whether it does.
>
> diff --git a/net/rds/send.c b/net/rds/send.c
> index 7244d2e..e2d0eaa 100644
> --- a/net/rds/send.c
> +++ b/net/rds/send.c
> @@ -183,7 +183,7 @@ int rds_send_xmit(struct rds_conn_path *cp)
>  		goto out;
>  	}
>
> -	if (conn->c_trans->xmit_path_prepare)
> +	if (conn->c_trans && conn->c_trans->xmit_path_prepare)
>  		conn->c_trans->xmit_path_prepare(cp);
We're seeming to accumulate a lot of checks like this, maybe there
is a more general way to deal with this problem?
* Re: BUG: unable to handle kernel NULL pointer dereference in rds_send_xmit
2017-12-18 17:12 ` David Miller
@ 2017-12-18 17:16 ` Santosh Shilimkar
2017-12-18 17:22 ` [rds-devel] " Sowmini Varadhan
1 sibling, 0 replies; 6+ messages in thread
From: Santosh Shilimkar @ 2017-12-18 17:16 UTC (permalink / raw)
To: David Miller
Cc: bot+aaf54a8c644d559d34dedcf3126aac68a20c9e63, linux-kernel,
linux-rdma, netdev, rds-devel, syzkaller-bugs
On 12/18/2017 9:12 AM, David Miller wrote:
> From: Santosh Shilimkar <santosh.shilimkar@oracle.com>
> Date: Mon, 18 Dec 2017 08:28:05 -0800
>
>> On 12/18/2017 12:43 AM, syzbot wrote:
>>> Hello,
>>> syzkaller hit the following crash on
>>> 6084b576dca2e898f5c101baef151f7bfdbb606d
>>> git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/master
>>> compiler: gcc (GCC) 7.1.1 20170620
>>> .config is attached
>>> Raw console output is attached.
>>> Unfortunately, I don't have any reproducer for this bug yet.
>>> BUG: unable to handle kernel NULL pointer dereference at
>>> 0000000000000028
>>> program syz-executor6 is using a deprecated SCSI ioctl, please convert
>>> it to SG_IO
>>> IP: rds_send_xmit+0x80/0x930 net/rds/send.c:186
>>
>> Looks like another one tripping on an empty transport. The patch below
>> should most likely address it, but we will test whether it does.
>>
>> diff --git a/net/rds/send.c b/net/rds/send.c
>> index 7244d2e..e2d0eaa 100644
>> --- a/net/rds/send.c
>> +++ b/net/rds/send.c
>> @@ -183,7 +183,7 @@ int rds_send_xmit(struct rds_conn_path *cp)
>>  		goto out;
>>  	}
>>
>> -	if (conn->c_trans->xmit_path_prepare)
>> +	if (conn->c_trans && conn->c_trans->xmit_path_prepare)
>>  		conn->c_trans->xmit_path_prepare(cp);
>
> We're seeming to accumulate a lot of checks like this, maybe there
> is a more general way to deal with this problem?
>
Agreed. Some of these additional transport hooks were added later,
for the specific transports that need them. I will review this overall
and see if it can be addressed generically.
Regards,
Santosh
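One generic option, sketched here purely as an illustration and not as anything proposed in this thread, is to make the per-call-site checks unnecessary: fill unset optional hooks with a no-op at registration time, and resolve a missing transport to a static no-op transport. All names below (`null_transport`, `noop_path_hook`, `conn_transport`, `transport_fill_defaults`, `send_xmit_prepare`) and the `xmit_path_complete` member are hypothetical.

```c
#include <stddef.h>

struct conn_path;

/* Hypothetical, trimmed-down transport ops table. */
struct transport {
	const char *t_name;
	void (*xmit_path_prepare)(struct conn_path *cp);
	void (*xmit_path_complete)(struct conn_path *cp);
};

struct connection {
	const struct transport *c_trans;	/* may be NULL in this model */
};

struct conn_path {
	struct connection *cp_conn;
};

/* Shared do-nothing hook. */
static void noop_path_hook(struct conn_path *cp)
{
	(void)cp;
}

/* Placeholder used whenever a connection has no transport. */
static const struct transport null_transport = {
	.t_name = "none",
	.xmit_path_prepare = noop_path_hook,
	.xmit_path_complete = noop_path_hook,
};

/* Single accessor: callers never see a NULL transport. */
static const struct transport *conn_transport(const struct connection *conn)
{
	return conn->c_trans ? conn->c_trans : &null_transport;
}

/* Run once when a transport registers: any optional hook it did not
 * supply is pointed at the no-op, so call sites need no NULL test. */
static void transport_fill_defaults(struct transport *t)
{
	if (!t->xmit_path_prepare)
		t->xmit_path_prepare = noop_path_hook;
	if (!t->xmit_path_complete)
		t->xmit_path_complete = noop_path_hook;
}

/* A call site then reduces to one unconditional line. */
static void send_xmit_prepare(struct conn_path *cp)
{
	conn_transport(cp->cp_conn)->xmit_path_prepare(cp);
}
```

The trade-off is that a no-op default can hide a connection that should never have been without a transport, so this only suits hooks that are truly optional.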
* Re: [rds-devel] BUG: unable to handle kernel NULL pointer dereference in rds_send_xmit
2017-12-18 17:12 ` David Miller
2017-12-18 17:16 ` Santosh Shilimkar
@ 2017-12-18 17:22 ` Sowmini Varadhan
2018-01-30 22:22 ` Eric Biggers
1 sibling, 1 reply; 6+ messages in thread
From: Sowmini Varadhan @ 2017-12-18 17:22 UTC (permalink / raw)
To: David Miller
Cc: santosh.shilimkar, rds-devel,
bot+aaf54a8c644d559d34dedcf3126aac68a20c9e63, linux-rdma, netdev,
syzkaller-bugs, linux-kernel
> From: Santosh Shilimkar <santosh.shilimkar@oracle.com>
> Date: Mon, 18 Dec 2017 08:28:05 -0800
:
> > Looks like another one tripping on an empty transport. The patch below
> > should most likely address it, but we will test whether it does.
that was my first thought, but it cannot be the case here: rds_sendmsg
etc itself would have bombed if that were the case, and the packet
would never have gotten queued.
This is unlike f3069c6d33, where an application skips the transport
binding (either misses the explicit bind, or gets the wrong transport
due to an implicit bind) before it triggers the setsockopt.
I suspect that the problem is that the conn (and thus c_trans)
has gotten destroyed, but the cp_send_w work got incorrectly
re-queued. For example, rds_cong_queue_updates() (because the
peer sent a congestion update) can happen in softirq context,
and would end up requeuing work in the middle of rds_conn_destroy,
after we have assumed that everything is quiesced.
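The suspected ordering problem can be modelled in userspace. The sketch below is an assumption-laden illustration, not the RDS code: it uses a pthread mutex as a stand-in for kernel synchronization (the commit named later in this thread uses RCU), and the names `destroy_pending`, `queued`, `queue_send_work`, and `conn_path_destroy` are hypothetical. The idea is that every enqueue site goes through one helper that refuses to queue once teardown has begun.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical model of a connection path with pending send work. */
struct conn_path {
	pthread_mutex_t lock;	/* stand-in for kernel synchronization */
	bool destroy_pending;	/* set once teardown has begun */
	int queued;		/* number of send work items pending */
};

/* Every enqueue site calls this instead of queueing directly.
 * Returns false, and queues nothing, once teardown has started. */
static bool queue_send_work(struct conn_path *cp)
{
	bool ok;

	pthread_mutex_lock(&cp->lock);
	ok = !cp->destroy_pending;
	if (ok)
		cp->queued++;
	pthread_mutex_unlock(&cp->lock);
	return ok;
}

/* Teardown: mark the path first, then cancel what is already queued.
 * Because the flag and the counter change under the same lock, no
 * enqueue can slip in between the two steps. */
static void conn_path_destroy(struct conn_path *cp)
{
	pthread_mutex_lock(&cp->lock);
	cp->destroy_pending = true;
	cp->queued = 0;		/* models cancelling queued work */
	pthread_mutex_unlock(&cp->lock);
}
```

A late caller, such as the congestion-update path described above, then gets `false` back from `queue_send_work()` rather than re-arming work on a path that is being freed.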
On (12/18/17 12:12), David Miller wrote:
>
> We're seeming to accumulate a lot of checks like this, maybe there
> is a more general way to deal with this problem?
Yeah, I was thinking about this.. let me try to reproduce this in-house
and get back with a patchset.
--Sowmini
* Re: [rds-devel] BUG: unable to handle kernel NULL pointer dereference in rds_send_xmit
2017-12-18 17:22 ` [rds-devel] " Sowmini Varadhan
@ 2018-01-30 22:22 ` Eric Biggers
2018-01-30 22:28 ` Sowmini Varadhan
0 siblings, 1 reply; 6+ messages in thread
From: Eric Biggers @ 2018-01-30 22:22 UTC (permalink / raw)
To: Sowmini Varadhan
Cc: David Miller, santosh.shilimkar, rds-devel,
bot+aaf54a8c644d559d34dedcf3126aac68a20c9e63, linux-rdma, netdev,
syzkaller-bugs, linux-kernel
On Mon, Dec 18, 2017 at 12:22:51PM -0500, Sowmini Varadhan wrote:
> > From: Santosh Shilimkar <santosh.shilimkar@oracle.com>
> > Date: Mon, 18 Dec 2017 08:28:05 -0800
> :
> > > Looks like another one tripping on an empty transport. The patch below
> > > should most likely address it, but we will test whether it does.
>
> that was my first thought, but it cannot be the case here: rds_sendmsg
> etc itself would have bombed if that were the case, and the packet
> would never have gotten queued.
>
> This is unlike f3069c6d33, where an application skips the transport
> binding (either misses the explicit bind, or gets the wrong transport
> due to an implicit bind) before it triggers the setsockopt.
>
> I suspect that the problem is that the conn (and thus c_trans)
> has gotten destroyed, but the cp_send_w work got incorrectly
> re-queued. For example, rds_cong_queue_updates() (because the
> peer sent a congestion update) can happen in softirq context,
> and would end up requeuing work in the middle of rds_conn_destroy,
> after we have assumed that everything is quiesced.
>
> On (12/18/17 12:12), David Miller wrote:
> >
> > We're seeming to accumulate a lot of checks like this, maybe there
> > is a more general way to deal with this problem?
>
> Yeah, I was thinking about this.. let me try to reproduce this in-house
> and get back with a patchset.
>
I assume you weren't able to reproduce this? This crash hasn't been seen again,
and it was reported while KASAN was accidentally disabled in the syzbot kconfig
due to a change to the kconfig menus in linux-next. So this crash was possibly
caused by slab corruption elsewhere.
I am invalidating the bug for syzbot so it will report the same crash signature
again if it occurs, but if you think there is a real bug feel free to keep
looking into it.
#syz invalid
Thanks,
Eric
* Re: [rds-devel] BUG: unable to handle kernel NULL pointer dereference in rds_send_xmit
2018-01-30 22:22 ` Eric Biggers
@ 2018-01-30 22:28 ` Sowmini Varadhan
0 siblings, 0 replies; 6+ messages in thread
From: Sowmini Varadhan @ 2018-01-30 22:28 UTC (permalink / raw)
To: Eric Biggers
Cc: David Miller, santosh.shilimkar, rds-devel,
bot+aaf54a8c644d559d34dedcf3126aac68a20c9e63, linux-rdma, netdev,
syzkaller-bugs, linux-kernel
On (01/30/18 14:22), Eric Biggers wrote:
>
> I assume you weren't able to reproduce this? This crash hasn't been
> seen again,
:
> I am invalidating the bug for syzbot so it will report the same crash
> signature
> again if it occurs, but if you think there is a real bug feel free to keep
> looking into it.
Correct, I was not able to reproduce this. However, based on code
inspection, I came up with
commit 3db6e0d172c94bd9953a1347c55ffb64b1d2e74f
rds: use RCU to synchronize work-enqueue with connection teardown
Marking it invalid sounds good to me.
--Sowmini