public inbox for linux-rdma@vger.kernel.org
 help / color / mirror / Atom feed
From: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
To: dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org
Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	Dean Luick <dean.luick-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>,
	Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Subject: [PATCH 4/4] IB/hfi1: Fix mm_struct use after free
Date: Tue, 16 Aug 2016 13:27:03 -0700	[thread overview]
Message-ID: <20160816202634.24004.54031.stgit@scvm10.sc.intel.com> (raw)
In-Reply-To: <20160816202133.24004.31487.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>

From: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

Testing with CONFIG_SLUB_DEBUG_ON=y resulted in the kernel panic below.

This is the result of the mm_struct sometimes being free'd prior to
hfi1_file_close being called.

This was due to the combination of 2 reasons:

1) hfi1_file_close is deferred in process exit and it therefore may not
   be called synchronously with process exit.
2) exit_mm is called prior to exit_files in do_exit.  Normally this is ok
   however, our kernel bypass code requires us to have access to the
   mm_struct for house keeping both at "normal" close time as well as at
   process exit.

Therefore, the fix is to simply keep a reference to the mm_struct until
we are done with it.

[ 3006.340150] general protection fault: 0000 [#1] SMP
[ 3006.346469] Modules linked in: hfi1 rdmavt rpcrdma ib_isert iscsi_target_mod
ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod
 ib_srp scsi_transport_srp ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm
 ib_cm iw_cm dm_mirror dm_region_hash dm_log dm_mod snd_hda_code
 c_realtek iTCO_wdt snd_hda_codec_generic iTCO_vendor_support sb_edac edac_core
 x86_pkg_temp_thermal intel_powerclamp coretemp kvm irqbypass c
 rct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel lrw snd_hda_intel
 gf128mul snd_hda_codec glue_helper snd_hda_core ablk_helper sn
 d_hwdep cryptd snd_seq snd_seq_device snd_pcm snd_timer snd soundcore pcspkr
 shpchp mei_me sg lpc_ich mei i2c_i801 mfd_core ioatdma ipmi_devi
 ntf wmi ipmi_si ipmi_msghandler acpi_cpufreq nfsd auth_rpcgss nfs_acl lockd
 grace sunrpc ip_tables ext4 jbd2 mbcache mlx4_en ib_core sr_mod s
 d_mod cdrom crc32c_intel mgag200 drm_kms_helper syscopyarea sysfillrect igb
 sysimgblt fb_sys_fops ptp mlx4_core ttm isci pps_core ahci drm li
 bsas libahci dca firewire_ohci i2c_algo_bit scsi_transport_sas firewire_core
 crc_itu_t i2c_core libata [last unloaded: mlx4_ib]
 [ 3006.461759] CPU: 16 PID: 11624 Comm: mpi_stress Not tainted 4.7.0-rc5+ #1
 [ 3006.469915] Hardware name: Intel Corporation W2600CR ........../W2600CR, BIOS SE5C600.86B.01.08.0003.022620131521 02/26/2013
 [ 3006.483027] task: ffff8804102f0040 ti: ffff8804102f8000 task.ti: ffff8804102f8000
 [ 3006.491971] RIP: 0010:[<ffffffff810f0383>]  [<ffffffff810f0383>] __lock_acquire+0xb3/0x19e0
 [ 3006.501905] RSP: 0018:ffff8804102fb908  EFLAGS: 00010002
 [ 3006.508447] RAX: 6b6b6b6b6b6b6b6b RBX: 0000000000000001 RCX: 0000000000000000
 [ 3006.517012] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff880410b56a40
 [ 3006.525569] RBP: ffff8804102fb9b0 R08: 0000000000000001 R09: 0000000000000000
 [ 3006.534119] R10: ffff8804102f0040 R11: 0000000000000000 R12: 0000000000000000
 [ 3006.542664] R13: ffff880410b56a40 R14: 0000000000000000 R15: 0000000000000000
 [ 3006.551203] FS:  00007ff478c08700(0000) GS:ffff88042e200000(0000) knlGS:0000000000000000
 [ 3006.560814] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 [ 3006.567806] CR2: 00007f667f5109e0 CR3: 0000000001c06000 CR4: 00000000000406e0
 [ 3006.576352] Stack:
 [ 3006.579157]  ffffffff8124b819 ffffffffffffffff 0000000000000000 ffff8804102fb940
 [ 3006.588072]  0000000000000002 0000000000000000 ffff8804102f0040 0000000000000007
 [ 3006.596971]  0000000000000006 ffff8803cad6f000 0000000000000000 ffff8804102f0040
 [ 3006.605878] Call Trace:
 [ 3006.609220]  [<ffffffff8124b819>] ? uncharge_batch+0x109/0x250
 [ 3006.616382]  [<ffffffff810f2313>] lock_acquire+0xd3/0x220
 [ 3006.623056]  [<ffffffffa0a30bfc>] ? hfi1_release_user_pages+0x7c/0xa0 [hfi1]
 [ 3006.631593]  [<ffffffff81775579>] down_write+0x49/0x80
 [ 3006.638022]  [<ffffffffa0a30bfc>] ? hfi1_release_user_pages+0x7c/0xa0 [hfi1]
 [ 3006.646569]  [<ffffffffa0a30bfc>] hfi1_release_user_pages+0x7c/0xa0 [hfi1]
 [ 3006.654898]  [<ffffffffa0a2efb6>] cacheless_tid_rb_remove+0x106/0x330 [hfi1]
 [ 3006.663417]  [<ffffffff810efd36>] ? mark_held_locks+0x66/0x90
 [ 3006.670498]  [<ffffffff817771f6>] ? _raw_spin_unlock_irqrestore+0x36/0x60
 [ 3006.678741]  [<ffffffffa0a2f1ee>] tid_rb_remove+0xe/0x10 [hfi1]
 [ 3006.686010]  [<ffffffffa0a0c5d5>] hfi1_mmu_rb_unregister+0xc5/0x100 [hfi1]
 [ 3006.694387]  [<ffffffffa0a2fcb9>] hfi1_user_exp_rcv_free+0x39/0x120 [hfi1]
 [ 3006.702732]  [<ffffffffa09fc6ea>] hfi1_file_close+0x17a/0x330 [hfi1]
 [ 3006.710489]  [<ffffffff81263e9a>] __fput+0xfa/0x230
 [ 3006.716595]  [<ffffffff8126400e>] ____fput+0xe/0x10
 [ 3006.722696]  [<ffffffff810b95c6>] task_work_run+0x86/0xc0
 [ 3006.729379]  [<ffffffff81099933>] do_exit+0x323/0xc40
 [ 3006.735672]  [<ffffffff8109a2dc>] do_group_exit+0x4c/0xc0
 [ 3006.742371]  [<ffffffff810a7f55>] get_signal+0x345/0x940
 [ 3006.748958]  [<ffffffff810340c7>] do_signal+0x37/0x700
 [ 3006.755328]  [<ffffffff8127872a>] ? poll_select_set_timeout+0x5a/0x90
 [ 3006.763146]  [<ffffffff811609cb>] ? __audit_syscall_exit+0x1db/0x260
 [ 3006.770853]  [<ffffffff8110f3e3>] ? rcu_read_lock_sched_held+0x93/0xa0
 [ 3006.778765]  [<ffffffff812347a4>] ? kfree+0x1e4/0x2a0
 [ 3006.784986]  [<ffffffff8108e75a>] ? exit_to_usermode_loop+0x33/0xac
 [ 3006.792551]  [<ffffffff8108e785>] exit_to_usermode_loop+0x5e/0xac
 [ 3006.799907]  [<ffffffff81003dca>] do_syscall_64+0x12a/0x190
 [ 3006.806664]  [<ffffffff81777a7f>] entry_SYSCALL64_slow_path+0x25/0x25
 [ 3006.814396] Code: 24 08 44 89 44 24 10 89 4c 24 18 e8 a8 d8 ff ff 48 85 c0
 8b 4c 24 18 44 8b 44 24 10 44 8b 4c 24 08 4c 8b 14 24 0f 84 30
 08 00 00 <f0> ff 80 98 01 00 00 8b 3d 48 ad be 01 45 8b a2 90 0b 00 00 85
 [ 3006.837158] RIP  [<ffffffff810f0383>] __lock_acquire+0xb3/0x19e0
 [ 3006.844401]  RSP <ffff8804102fb908>
 [ 3006.851170] ---[ end trace b7b9f21cf06c27df ]---
 [ 3006.927420] Kernel panic - not syncing: Fatal exception
 [ 3006.933954] Kernel Offset: disabled
 [ 3006.940961] ---[ end Kernel panic - not syncing: Fatal exception
 [ 3006.948249] ------------[ cut here ]------------

Fixes: 3faa3d9a308e ("IB/hfi1: Make use of mm consistent")
Reviewed-by: Dean Luick <dean.luick-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 drivers/infiniband/hw/hfi1/file_ops.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/drivers/infiniband/hw/hfi1/file_ops.c b/drivers/infiniband/hw/hfi1/file_ops.c
index 1ecbec1..c249f19 100644
--- a/drivers/infiniband/hw/hfi1/file_ops.c
+++ b/drivers/infiniband/hw/hfi1/file_ops.c
@@ -183,6 +183,7 @@ static int hfi1_file_open(struct inode *inode, struct file *fp)
 	if (fd) {
 		fd->rec_cpu_num = -1; /* no cpu affinity by default */
 		fd->mm = current->mm;
+		atomic_inc(&fd->mm->mm_count);
 	}
 
 	fp->private_data = fd;
@@ -779,6 +780,7 @@ static int hfi1_file_close(struct inode *inode, struct file *fp)
 	mutex_unlock(&hfi1_mutex);
 	hfi1_free_ctxtdata(dd, uctxt);
 done:
+	mmdrop(fdata->mm);
 	kobject_put(&dd->kobj);
 	kfree(fdata);
 	return 0;

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

  parent reply	other threads:[~2016-08-16 20:27 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-08-16 20:23 [PATCH 0/4] hfi1/rdmavt: For 4.8 rc Dennis Dalessandro
     [not found] ` <20160816202133.24004.31487.stgit-9QXIwq+3FY+1XWohqUldA0EOCMrvLtNR@public.gmane.org>
2016-08-16 20:25   ` [PATCH 1/4] IB/hfi1: Return invalid field for non-QSFP CableInfo queries Dennis Dalessandro
2016-08-16 20:26   ` [PATCH 2/4] IB/hfi1: Improve J_KEY generation Dennis Dalessandro
2016-08-16 20:26   ` [PATCH 3/4] IB/rdmvat: Fix double vfree() in rvt_create_qp() error path Dennis Dalessandro
2016-08-16 20:27   ` Dennis Dalessandro [this message]
2016-08-22 18:41   ` [PATCH 0/4] hfi1/rdmavt: For 4.8 rc Doug Ledford

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160816202634.24004.54031.stgit@scvm10.sc.intel.com \
    --to=dennis.dalessandro-ral2jqcrhueavxtiumwx3w@public.gmane.org \
    --cc=dean.luick-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org \
    --cc=dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org \
    --cc=ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org \
    --cc=linux-rdma-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox