From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-10.1 required=3.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY, SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4C1B6C433FF for ; Thu, 1 Aug 2019 12:14:55 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 1D1F3214DA for ; Thu, 1 Aug 2019 12:14:55 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1564661695; bh=r7wJ2M7jka9SZ1Xml6hsIHdcRHOGVN6u4vKgkMc3iMA=; h=From:To:Cc:Subject:Date:List-ID:From; b=f18hOavuMkAEqhMqLzUrEs1Aj2F1RGddtkqiXNfxkK8CftPNqCYZuDCE07GUp9Jt2 ulnvZTTfyrGHUHnl/VmIpQvQAsJYfxwklxf4pyBq4x1yY/0S/hE7Ve/DcFROh1T2dR e3Z0UhNnZ75gs3LRbotVF6CxTB0lGgiDjscmLNXQ= Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726422AbfHAMOy (ORCPT ); Thu, 1 Aug 2019 08:14:54 -0400 Received: from mail.kernel.org ([198.145.29.99]:58638 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725790AbfHAMOy (ORCPT ); Thu, 1 Aug 2019 08:14:54 -0400 Received: from localhost (unknown [193.47.165.251]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 77DFC20838; Thu, 1 Aug 2019 12:14:52 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1564661693; bh=r7wJ2M7jka9SZ1Xml6hsIHdcRHOGVN6u4vKgkMc3iMA=; h=From:To:Cc:Subject:Date:From; b=peMf9rrekQy/wJdcAKq1OBbxPzhA+KNibs7KbbaeOOB6fCbLJm2LHs7JpjyDvLV5k Q1PLAXb7kOqpoRo+hOiaHYbcuNcR7MtUcZXxwnJbbOOvUCHnmjun+YidLLuVKZq8Vy MJbc9wSaeJExjuzVbPfzehLs0fA3LmJDUAJL6xUE= From: Leon Romanovsky To: Doug Ledford , Jason Gunthorpe Cc: Jack Morgenstein , RDMA mailing list , Jason Gunthorpe , Leon Romanovsky Subject: [PATCH rdma-rc] IB/mad: Fix use-after-free in ib mad completion handling Date: Thu, 1 Aug 2019 15:14:49 +0300 Message-Id: <20190801121449.24973-1-leon@kernel.org> X-Mailer: git-send-email 2.21.0 MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Sender: linux-rdma-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org From: Jack Morgenstein We encountered a use-after-free bug when unloading the driver: [ 3562.116059] BUG: KASAN: use-after-free in ib_mad_post_receive_mads+0xddc/0xed0 [ib_core] [ 3562.117233] Read of size 4 at addr ffff8882ca5aa868 by task kworker/u13:2/23862 [ 3562.118385] [ 3562.119519] CPU: 2 PID: 23862 Comm: kworker/u13:2 Tainted: G OE 5.1.0-for-upstream-dbg-2019-05-19_16-44-30-13 #1 [ 3562.121806] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu2 04/01/2014 [ 3562.123075] Workqueue: ib-comp-unb-wq ib_cq_poll_work [ib_core] [ 3562.124383] Call Trace: [ 3562.125640] dump_stack+0x9a/0xeb [ 3562.126911] print_address_description+0xe3/0x2e0 [ 3562.128223] ? ib_mad_post_receive_mads+0xddc/0xed0 [ib_core] [ 3562.129545] __kasan_report+0x15c/0x1df [ 3562.130866] ? ib_mad_post_receive_mads+0xddc/0xed0 [ib_core] [ 3562.132174] kasan_report+0xe/0x20 [ 3562.133514] ib_mad_post_receive_mads+0xddc/0xed0 [ib_core] [ 3562.134835] ? find_mad_agent+0xa00/0xa00 [ib_core] [ 3562.136158] ? qlist_free_all+0x51/0xb0 [ 3562.137498] ? mlx4_ib_sqp_comp_worker+0x1970/0x1970 [mlx4_ib] [ 3562.138833] ? quarantine_reduce+0x1fa/0x270 [ 3562.140171] ? kasan_unpoison_shadow+0x30/0x40 [ 3562.141522] ib_mad_recv_done+0xdf6/0x3000 [ib_core] [ 3562.142880] ? _raw_spin_unlock_irqrestore+0x46/0x70 [ 3562.144277] ? ib_mad_send_done+0x1810/0x1810 [ib_core] [ 3562.145649] ? mlx4_ib_destroy_cq+0x2a0/0x2a0 [mlx4_ib] [ 3562.147008] ? _raw_spin_unlock_irqrestore+0x46/0x70 [ 3562.148380] ? debug_object_deactivate+0x2b9/0x4a0 [ 3562.149814] __ib_process_cq+0xe2/0x1d0 [ib_core] [ 3562.151195] ib_cq_poll_work+0x45/0xf0 [ib_core] [ 3562.152577] process_one_work+0x90c/0x1860 [ 3562.153959] ? pwq_dec_nr_in_flight+0x320/0x320 [ 3562.155320] worker_thread+0x87/0xbb0 [ 3562.156687] ? __kthread_parkme+0xb6/0x180 [ 3562.158058] ? process_one_work+0x1860/0x1860 [ 3562.159429] kthread+0x320/0x3e0 [ 3562.161391] ? kthread_park+0x120/0x120 [ 3562.162744] ret_from_fork+0x24/0x30 ... [ 3562.187615] Freed by task 31682: [ 3562.188602] save_stack+0x19/0x80 [ 3562.189586] __kasan_slab_free+0x11d/0x160 [ 3562.190571] kfree+0xf5/0x2f0 [ 3562.191552] ib_mad_port_close+0x200/0x380 [ib_core] [ 3562.192538] ib_mad_remove_device+0xf0/0x230 [ib_core] [ 3562.193538] remove_client_context+0xa6/0xe0 [ib_core] [ 3562.194514] disable_device+0x14e/0x260 [ib_core] [ 3562.195488] __ib_unregister_device+0x79/0x150 [ib_core] [ 3562.196462] ib_unregister_device+0x21/0x30 [ib_core] [ 3562.197439] mlx4_ib_remove+0x162/0x690 [mlx4_ib] [ 3562.198408] mlx4_remove_device+0x204/0x2c0 [mlx4_core] [ 3562.199381] mlx4_unregister_interface+0x49/0x1d0 [mlx4_core] [ 3562.200356] mlx4_ib_cleanup+0xc/0x1d [mlx4_ib] [ 3562.201329] __x64_sys_delete_module+0x2d2/0x400 [ 3562.202288] do_syscall_64+0x95/0x470 [ 3562.203277] entry_SYSCALL_64_after_hwframe+0x49/0xbe The problem was that the MAD PD was deallocated before the MAD CQ. There was completion work pending for the CQ when the PD got deallocated. When the mad completion handling reached procedure ib_mad_post_receive_mads(), we got a use-after-free bug in the following line of code in that procedure: sg_list.lkey = qp_info->port_priv->pd->local_dma_lkey; (the pd pointer in the above line is no longer valid, because the pd has been deallocated). We fix this by allocating the PD before the CQ in procedure ib_mad_port_open(), and deallocating the PD after freeing the CQ in procedure ib_mad_port_close(). Since the CQ completion work queue is flushed during ib_free_cq(), no completions will be pending for that CQ when the PD is later deallocated. Note that freeing the CQ before deallocating the PD is the practice in the ULPs. Fixes: 4be90bc60df4 ("IB/mad: Remove ib_get_dma_mr calls") Signed-off-by: Jack Morgenstein Signed-off-by: Leon Romanovsky --- drivers/infiniband/core/mad.c | 20 ++++++++++---------- 1 file changed, 10 insertions(+), 10 deletions(-) diff --git a/drivers/infiniband/core/mad.c b/drivers/infiniband/core/mad.c index cc99479b2c09..9947d16edef2 100644 --- a/drivers/infiniband/core/mad.c +++ b/drivers/infiniband/core/mad.c @@ -3224,18 +3224,18 @@ static int ib_mad_port_open(struct ib_device *device, if (has_smi) cq_size *= 2; + port_priv->pd = ib_alloc_pd(device, 0); + if (IS_ERR(port_priv->pd)) { + dev_err(&device->dev, "Couldn't create ib_mad PD\n"); + ret = PTR_ERR(port_priv->pd); + goto error3; + } + port_priv->cq = ib_alloc_cq(port_priv->device, port_priv, cq_size, 0, IB_POLL_UNBOUND_WORKQUEUE); if (IS_ERR(port_priv->cq)) { dev_err(&device->dev, "Couldn't create ib_mad CQ\n"); ret = PTR_ERR(port_priv->cq); - goto error3; - } - - port_priv->pd = ib_alloc_pd(device, 0); - if (IS_ERR(port_priv->pd)) { - dev_err(&device->dev, "Couldn't create ib_mad PD\n"); - ret = PTR_ERR(port_priv->pd); goto error4; } @@ -3278,11 +3278,11 @@ static int ib_mad_port_open(struct ib_device *device, error7: destroy_mad_qp(&port_priv->qp_info[0]); error6: - ib_dealloc_pd(port_priv->pd); -error4: ib_free_cq(port_priv->cq); cleanup_recv_queue(&port_priv->qp_info[1]); cleanup_recv_queue(&port_priv->qp_info[0]); +error4: + ib_dealloc_pd(port_priv->pd); error3: kfree(port_priv); @@ -3312,8 +3312,8 @@ static int ib_mad_port_close(struct ib_device *device, int port_num) destroy_workqueue(port_priv->wq); destroy_mad_qp(&port_priv->qp_info[1]); destroy_mad_qp(&port_priv->qp_info[0]); - ib_dealloc_pd(port_priv->pd); ib_free_cq(port_priv->cq); + ib_dealloc_pd(port_priv->pd); cleanup_recv_queue(&port_priv->qp_info[1]); cleanup_recv_queue(&port_priv->qp_info[0]); /* XXX: Handle deallocation of MAD registration tables */ -- 2.20.1