From: Jason Gunthorpe <jgg@nvidia.com>
To: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Sagi Grimberg <sagi@grimberg.me>, Max Gurtovoy <maxg@nvidia.com>,
	<linux-rdma@vger.kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Doug Ledford <dledford@redhat.com>
Subject: Re: [PATCH] IB/iser: Remove in_interrupt() usage.
Date: Fri, 27 Nov 2020 09:03:14 -0400
Message-ID: <20201127130314.GE552508@nvidia.com>
In-Reply-To: <20201127123455.scnqc7xvuwwofdp2@linutronix.de>

On Fri, Nov 27, 2020 at 01:34:55PM +0100, Sebastian Andrzej Siewior wrote:
> On 2020-11-26 16:53:57 [-0400], Jason Gunthorpe wrote:
> > > +++ b/drivers/infiniband/ulp/iser/iscsi_iser.c
> > > @@ -187,12 +187,8 @@ iser_initialize_task_headers(struct iscsi_task *task,
> > >  	struct iser_device *device = iser_conn->ib_conn.device;
> > >  	struct iscsi_iser_task *iser_task = task->dd_data;
> > >  	u64 dma_addr;
> > > -	const bool mgmt_task = !task->sc && !in_interrupt();
> > >  	int ret = 0;
> > 
> > Why do you think the task->sc doesn't matter?
> 
> Based on the call paths I checked, there was no evidence that
> state_mutex can be acquired. If I remove locking here then `mgmt_task'
> is no longer needed.

That only says there is no recursive deadlock..

> How should task->sc matter?

I was able to get the internal bug report that led to commit
7414dde0a6c3a.

The issue here is that the state_mutex is protecting this check:

	if (unlikely(iser_conn->state != ISER_CONN_UP)) {

Which indicates that this:

        dma_addr = ib_dma_map_single(device->ib_device, (void *)tx_desc,

Won't crash because iser_conn->ib_conn is invalid. The notes say that
the iSCSI stack is in some state where data traffic won't flow but
management traffic is still possible. I suppose this is some fast path,
so it was "optimized" to eliminate the lock for data traffic.

A call chain of interest for the lock at least is:

Nov  3 12:24:37 rsws10 BUG: unable to handle kernel NULL pointer dereference
Nov  3 12:24:37 rsws10 Pid: 5245, comm: scsi_eh_5 Tainted: GF          O 3.8.13-16.2.1.el6uek.x86_64 #1 IBM System x3550 M3 -[7944KEG]-/90Y4784
[..]
Nov  3 12:24:37 rsws10  [<ffffffffa069d628>] iscsi_iser_task_init+0x28/0x70 [ib_iser]
Nov  3 12:24:37 rsws10  [<ffffffffa0610029>] iscsi_prep_mgmt_task+0x129/0x150 [libiscsi]
Nov  3 12:24:37 rsws10  [<ffffffffa061354c>] __iscsi_conn_send_pdu+0x23c/0x310 [libiscsi]
Nov  3 12:24:37 rsws10  [<ffffffffa0614277>] iscsi_exec_task_mgmt_fn+0x37/0x290 [libiscsi]
Nov  3 12:24:37 rsws10  [<ffffffff813c2694>] ? scsi_send_eh_cmnd+0xd4/0x3a0
Nov  3 12:24:37 rsws10  [<ffffffff810c39df>] ? module_refcount+0x9f/0xc0
Nov  3 12:24:37 rsws10  [<ffffffffa061497b>] iscsi_eh_device_reset+0x1bb/0x2d0 [libiscsi]
Nov  3 12:24:37 rsws10  [<ffffffff813c3119>] scsi_eh_bus_device_reset+0xb9/0x1e0
Nov  3 12:24:37 rsws10  [<ffffffff813c3f60>] ? scsi_unjam_host+0x1f0/0x1f0
Nov  3 12:24:37 rsws10  [<ffffffff813c3cbe>] scsi_eh_ready_devs+0x5e/0x110
Nov  3 12:24:37 rsws10  [<ffffffff813c3f60>] ? scsi_unjam_host+0x1f0/0x1f0
Nov  3 12:24:37 rsws10  [<ffffffff813c3e5d>] scsi_unjam_host+0xed/0x1f0
Nov  3 12:24:37 rsws10  [<ffffffff813c3f60>] ? scsi_unjam_host+0x1f0/0x1f0
Nov  3 12:24:37 rsws10  [<ffffffff813c40c8>] scsi_error_handler+0x168/0x1c0
Nov  3 12:24:37 rsws10  [<ffffffff813c3f60>] ? scsi_unjam_host+0x1f0/0x1f0
Nov  3 12:24:37 rsws10  [<ffffffff81082a6e>] kthread+0xce/0xe0
Nov  3 12:24:37 rsws10  [<ffffffff810829a0>] ? kthread_freezable_should_stop+0x70/0x70
Nov  3 12:24:37 rsws10  [<ffffffff8159b66c>] ret_from_fork+0x7c/0xb0
Nov  3 12:24:37 rsws10  [<ffffffff810829a0>] ? kthread_freezable_should_stop+0x70/0x70

So, I think the usual 'pass in atomic context flag' is really needed
here.
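
As a rough illustration of that pattern (a userspace sketch, not the
driver code): the caller, which knows whether it may sleep, passes that
in as a flag, and the callee takes the mutex only when sleeping is
allowed. Here pthread_mutex_t stands in for state_mutex, and struct conn,
struct device, map_headers(), init_task_headers() and may_sleep are all
made-up names used only for this example.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names exist in the driver. */
enum conn_state { CONN_DOWN, CONN_UP };

struct device {
	int id;
};

struct conn {
	pthread_mutex_t state_mutex;	/* stand-in for the kernel mutex */
	enum conn_state state;
	struct device *dev;		/* NULL once the connection is torn down */
};

/* Stand-in for the DMA mapping step: it dereferences dev. */
static int map_headers(const struct device *dev)
{
	return dev->id;
}

/*
 * may_sleep comes from the caller, which knows its own context.
 * The callee does not try to detect the context itself.
 */
static int init_task_headers(struct conn *conn, bool may_sleep, int *mapped)
{
	int ret = 0;

	assert(conn != NULL && mapped != NULL);

	if (may_sleep)
		pthread_mutex_lock(&conn->state_mutex);

	/* The state check is what keeps map_headers() off a dead device. */
	if (conn->state != CONN_UP) {
		ret = -ENODEV;
		goto out;
	}

	*mapped = map_headers(conn->dev);
out:
	if (may_sleep)
		pthread_mutex_unlock(&conn->state_mutex);
	return ret;
}
```

In this model a management path running in process context would call
init_task_headers(conn, true, &mapped), while a data path that cannot
sleep would pass false and skip the lock.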

Jason
