From: Leon Romanovsky <leon-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
To: "Michael J. Ruhl"
<michael.j.ruhl-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: Re: [PATCH] RDMA/netlink: OOPs in rdma_nl_rcv_msg() from misinterpreted flag
Date: Fri, 20 Oct 2017 10:37:24 +0300
Message-ID: <20171020073724.GY2106@mtr-leonro.local>
In-Reply-To: <20171019213859.26124.37851.stgit-K+u1se/DcYrLESAwzcoQNrvm/XP+8Wra@public.gmane.org>
On Thu, Oct 19, 2017 at 05:40:59PM -0400, Michael J. Ruhl wrote:
> I was playing with the ibacm service and discovered an issue
> the other day.
>
> If no provider library is present (I removed libacmp.so, and the
> provider keyword in the opts.cfg file is libacmp), when a resolve
> request is posted, the kernel will crash with the following Oops:
>
> BUG: unable to handle kernel NULL pointer dereference at (null)
> IP: (null)
> PGD 10543f1067 P4D 10543f1067 PUD 1033f93067 PMD 0
> Oops: 0010 [#1] SMP
> Modules linked in: rpcrdma ib_isert iscsi_target_mod
> target_core_mod ib_iser libiscsi scsi_transport_iscsi ib_ipoib rdma_ucm ib_u
> ib_uverbs ib_umad rdma_cm ib_cm iw_cm dm_mirror dm_region_hash dm_log dm_mod
> dax sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm irqbypass
> crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel crypto_si
> glue_helper cryptd hfi1 rdmavt iTCO_wdt iTCO_vendor_support ib_core mei_me
> lpc_ich pcspkr mei ioatdma sg shpchp i2c_i801 mfd_core wmi ipmi_si ipmi_devi
> ipmi_msghandler acpi_power_meter acpi_pad nfsd auth_rpcgss nfs_acl lockd gra
> sunrpc ip_tables ext4 mbcache jbd2 sd_mod mgag200 drm_kms_helper syscopyarea
> sysfillrect sysimgblt fb_sys_fops ttm igb ahci crc32c_intel ptp libahci
> pps_core drm dca libata i2c_algo_bit i2c_core
> CPU: 54 PID: 9841 Comm: ibacm Tainted: G I 4.14.0-rc2+ #6
> Hardware name: Intel Corporation S2600WT2/S2600WT2, BIOS SE5C610.86B.01.01.0008.021120151325 02/11/2015
> task: ffff880855f42d00 task.stack: ffffc900246b4000
> RIP: 0010: (null)
> RSP: 0018:ffffc900246b7bc8 EFLAGS: 00010246
> RAX: ffffffff81dbe9e0 RBX: ffff881058bb1000 RCX: 0000000000000000
> RDX: 0000000000001100 RSI: ffff881058bb1320 RDI: ffff881056362000
> RBP: ffffc900246b7bf8 R08: 0000000000000ec0 R09: 0000000000001100
> R10: ffff8810573a5000 R11: 0000000000000000 R12: ffff881056362000
> R13: 0000000000000ec0 R14: ffff881058bb1320 R15: 0000000000000ec0
> FS: 00007fe0ba5a38c0(0000) GS:ffff88105f080000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 0000000000000000 CR3: 0000001056f5d003 CR4: 00000000001606e0
> Call Trace:
> ? netlink_dump+0x12c/0x290
> __netlink_dump_start+0x186/0x1f0
> rdma_nl_rcv_msg+0x193/0x1b0 [ib_core]
> rdma_nl_rcv+0xdc/0x130 [ib_core]
> netlink_unicast+0x181/0x240
> netlink_sendmsg+0x2c2/0x3b0
> sock_sendmsg+0x38/0x50
> SYSC_sendto+0x102/0x190
> ? __audit_syscall_entry+0xaf/0x100
> ? syscall_trace_enter+0x1d0/0x2b0
> ? __audit_syscall_exit+0x209/0x290
> SyS_sendto+0xe/0x10
> do_syscall_64+0x67/0x1b0
> entry_SYSCALL64_slow_path+0x25/0x25
> RIP: 0033:0x7fe0b9db2a63
> RSP: 002b:00007ffc55edc260 EFLAGS: 00000293 ORIG_RAX: 000000000000002c
> RAX: ffffffffffffffda RBX: 0000000000000010 RCX: 00007fe0b9db2a63
> RDX: 0000000000000010 RSI: 00007ffc55edc280 RDI: 000000000000000d
> RBP: 00007ffc55edc670 R08: 00007ffc55edc270 R09: 000000000000000c
> R10: 0000000000000000 R11: 0000000000000293 R12: 00007ffc55edc280
> R13: 000000000260b400 R14: 000000000000000d R15: 0000000000000001
> Code: Bad RIP value.
> RIP: (null) RSP: ffffc900246b7bc8
> CR2: 0000000000000000
> ---[ end trace 8d67abcfd10ec209 ]---
> Kernel panic - not syncing: Fatal exception
> Kernel Offset: disabled
> ---[ end Kernel panic - not syncing: Fatal exception
> ------------[ cut here ]------------
>
> The issue is that in rdma_nl_rcv_msg(), the check
> 'if (flags & NLM_F_DUMP)' is not completely correct.
>
> NLM_F_DUMP is two bits NLM_F_ROOT | NLM_F_MATCH.
>
> ibacm sends a RDMA_NL_LS response with the RDMA_NL_LS_F_ERR bit set
> if an error occurs in the service (like no provider being available,
> or ACM_STATUS_ENODATA, etc.).
>
> NLM_F_ROOT == (0x100) == RDMA_NL_LS_F_ERR.
>
> The current code thinks that it sees a NLM_F_DUMP flag and incorrectly calls
> the .dump() callback.
Hi Michael,
Thanks for the report and for the excellent analysis. You are right that
RDMA_NL_LS_F_ERR has the same value as NLM_F_ROOT, and that is bad, but
I don't think it is the actual root cause.
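To make the collision concrete, here is a minimal userspace sketch. The flag values are local copies of the uapi definitions, and the two helper functions are hypothetical names used only for illustration; the "strict" variant is one possible way to tighten the test, not necessarily what the patch does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local copies of the uapi flag values, for illustration only. */
#define NLM_F_REQUEST    0x001
#define NLM_F_ROOT       0x100
#define NLM_F_MATCH      0x200
#define NLM_F_DUMP       (NLM_F_ROOT | NLM_F_MATCH)
#define RDMA_NL_LS_F_ERR 0x0100

/* The check as described in the report: true if ANY dump bit is set. */
static inline bool looks_like_dump_loose(uint16_t flags)
{
	return (flags & NLM_F_DUMP) != 0;
}

/* Hypothetical stricter check: true only if BOTH dump bits are set. */
static inline bool looks_like_dump_strict(uint16_t flags)
{
	return (flags & NLM_F_DUMP) == NLM_F_DUMP;
}
```

With these definitions, an LS response carrying only RDMA_NL_LS_F_ERR passes the loose check and fails the strict one, while a genuine NLM_F_REQUEST | NLM_F_DUMP request passes both.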
In case of errors, the LS was supposed to send an NLMSG_ERROR message rather
than overload the general nlmsg_flags, which is awful. However, I don't know
whether it is feasible to fix the current implementation without breaking the
UAPI contract.
In the meantime, can we implement dummy dumpit functions for the LS,
which reuse ib_nl_is_good_ip_resp?
I prefer this solution over yours because it doesn't mix LS specifics into the
general decision function and keeps the LS anomalies in the LS-relevant code.
Also, returning 0 in response to a message with the NLM_F_DUMP flag when there
is no dumpit function is wrong. The user should be made aware that
something was wrong with the request.
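As a rough userspace sketch of that last point: the struct, the function names, and the choice of -EINVAL below are all hypothetical and are not the ib_core code. The idea is only that a message carrying a dump bit with no dump handler registered gets an error back instead of a NULL call or a silent 0.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Local copies of the uapi flag values, for illustration only. */
#define NLM_F_ROOT  0x100
#define NLM_F_MATCH 0x200
#define NLM_F_DUMP  (NLM_F_ROOT | NLM_F_MATCH)

/* Hypothetical stand-in for one entry of a per-operation callback table. */
struct sketch_cb {
	int (*doit)(void);
	int (*dump)(void);
};

/* Trivial handler used to populate the table in the usage example. */
static inline int sketch_doit(void)
{
	return 0;
}

/* Dispatch one message; a missing handler is reported as an error. */
static inline int sketch_dispatch(const struct sketch_cb *cb, unsigned int flags)
{
	if (flags & NLM_F_DUMP) {
		if (!cb->dump)
			return -EINVAL; /* no dump handler: tell the caller */
		return cb->dump();
	}
	if (!cb->doit)
		return -EINVAL;
	return cb->doit();
}
```

For an entry that registers only sketch_doit, dispatching with flags 0x100 returns -EINVAL rather than jumping through a NULL pointer, and dispatching with no dump bits calls the doit handler as usual.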
Thanks
>
> The included patch is an attempt to fix this issue. This patch fixes the
> issue that I am seeing, but I am not sure how to test the messages for
> RDMA_NL_RDMA_CM or RDMA_NL_IWCM (or any message that uses the
> NLM_F_DUMP bits).
>
> If anyone has some knowledge of these services, any extra testing would
> be welcomed.
>
> If the patch has no issues or comments, I will formally re-submit it
> (through my usual channel Denny).
>
> Thanks,
>
> Mike
>
>
> ---
>
> Michael J. Ruhl (1):
> RDMA/netlink: OOPs in rdma_nl_rcv_msg() from misinterpreted flag
>
>
> drivers/infiniband/core/netlink.c | 7 +++++--
> 1 files changed, 5 insertions(+), 2 deletions(-)
>
> --
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html