linux-scsi.vger.kernel.org archive mirror
* isci, INTx mode, race condition
@ 2015-08-19 16:01 Stefan Fausser
  2015-08-24 14:14 ` Artur Paszkiewicz
  0 siblings, 1 reply; 2+ messages in thread
From: Stefan Fausser @ 2015-08-19 16:01 UTC
  To: intel-linux-scu, artur.paszkiewicz, JBottomley, linux-scsi

[-- Attachment #1: Type: text/plain, Size: 1711 bytes --]

Dear all,

attached are two patches for the "isci" module (CONFIG_SCSI_ISCI).

Both patches apply to the current Linux kernel from git (4.2.0-rc7).

The first patch (init.patch) is for reproducing the problem with the 
"Intel(R) C600 SAS Controller" in INTx Mode, see below. The second patch 
(host.patch) is for fixing this problem.

The problem:

With the first patch, "init.patch", applied, the "Intel(R) C600 SAS 
Controller" (abbreviated SAS below) generates level-triggered INTx 
interrupts instead of (edge-triggered) MSI-X interrupts.

In the ISR (isci_intx_isr), the driver determines whether the interrupt 
is due to normal operation (a normal interrupt) or an error. In the 
case of a normal interrupt, a tasklet is scheduled to handle it. 
However, the ISR leaves the interrupts unmasked, so the SAS device may 
raise the next interrupt after the ISR has returned and before the 
tasklet has run.

Thus, with "init.patch" applied and on my system (Intel C600 chipset 
series), the SAS device repeatedly raises the level-triggered interrupt 
and the tasklet that should handle it never gets to run. This results 
in a soft lockup on the executing core.
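
To make the starvation concrete, here is a minimal user-space sketch in C.
It is a toy model, not isci code: the struct and all toy_* names are
hypothetical, and it assumes that the level-triggered line stays asserted
while completion work is outstanding and that a hard interrupt always
preempts deferred work.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy user-space model; every name here is hypothetical, not isci code. */
struct toy_hba {
	bool irq_masked;      /* models the device's interrupt mask */
	bool tasklet_pending; /* set when the ISR schedules the tasklet */
	int completions;      /* work the device still wants serviced */
	int isr_runs;
	int tasklet_runs;
};

/* Assumption: a level-triggered line stays asserted while work is
 * outstanding and the device is not masked. */
static bool toy_line_asserted(const struct toy_hba *h)
{
	return h->completions > 0 && !h->irq_masked;
}

/* ISR without masking: schedule the tasklet, leave the mask alone. */
static void toy_isr_unmasked(struct toy_hba *h)
{
	h->isr_runs++;
	h->tasklet_pending = true;
}

/* Deferred work: drains the outstanding completions. */
static void toy_tasklet(struct toy_hba *h)
{
	h->tasklet_runs++;
	h->completions = 0;
	h->tasklet_pending = false;
}

/* One simulated CPU: on each step a hard IRQ wins over deferred work. */
static void toy_run_unmasked(struct toy_hba *h, int steps)
{
	assert(steps > 0);
	for (int i = 0; i < steps; i++) {
		if (toy_line_asserted(h))
			toy_isr_unmasked(h);
		else if (h->tasklet_pending)
			toy_tasklet(h);
	}
}
```

Starting from one outstanding completion and running toy_run_unmasked()
for 1000 steps, the ISR runs on every step and toy_tasklet() is never
reached, which mirrors the soft lockup described above.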

In my investigations, the problem described above occurs in all Linux 
kernel versions from 3.5 up to the present.

The fix:

With the second patch, "host.patch", applied, the INTx ISR masks the 
interrupts in the case of a normal interrupt. Thus, the device cannot 
raise the interrupt again before the handling tasklet has run. In the 
tasklet (see sci_controller_completion_handler), the interrupts are 
unmasked again.
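
The mask-in-ISR, unmask-in-tasklet pattern can be sketched in user space
as follows. This is a toy model under stated assumptions, not the driver:
the struct and all sim_* names are hypothetical, the line is assumed to
stay asserted while work is outstanding and unmasked, and a hard interrupt
is assumed to win over deferred work on every step.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy user-space model; every name here is hypothetical, not isci code. */
struct sim_hba {
	bool irq_masked;      /* models the interrupt_mask register */
	bool tasklet_pending; /* set when the ISR schedules the tasklet */
	int completions;      /* work the device still wants serviced */
	int isr_runs;
	int tasklet_runs;
};

/* Assumption: the level-triggered line is asserted only while work is
 * outstanding and the device is not masked. */
static bool sim_line_asserted(const struct sim_hba *h)
{
	return h->completions > 0 && !h->irq_masked;
}

/* ISR with the fix: mask first, then schedule the tasklet. The mask
 * models the writel(0xFF000000, ...) to interrupt_mask in host.patch. */
static void sim_isr_masking(struct sim_hba *h)
{
	h->isr_runs++;
	h->irq_masked = true;
	h->tasklet_pending = true;
}

/* Tasklet: drain the work, then unmask, as the message describes for
 * the completion handler. */
static void sim_tasklet(struct sim_hba *h)
{
	h->tasklet_runs++;
	h->completions = 0;
	h->tasklet_pending = false;
	h->irq_masked = false;
}

/* Returns the number of steps until the system is idle, or -1 if it
 * is still busy after max_steps. */
static int sim_run(struct sim_hba *h, int max_steps)
{
	assert(max_steps > 0);
	for (int i = 0; i < max_steps; i++) {
		if (sim_line_asserted(h))
			sim_isr_masking(h);
		else if (h->tasklet_pending)
			sim_tasklet(h);
		else
			return i; /* nothing asserted, nothing pending */
	}
	return -1;
}
```

In this model the ISR runs once, the tasklet runs once, and the device
ends up unmasked and idle; without the mask in the ISR, the tasklet
branch would never be reached while work is outstanding.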

Please let me know if you need any other information.

Kind Regards,

Stefan


[-- Attachment #2: host.patch --]
[-- Type: text/x-patch, Size: 558 bytes --]

Signed-off-by: Stefan Fausser <info@real-time-systems.com>

--- linux/drivers/scsi/isci/host.c.orig	2015-08-19 15:23:28.000000000 +0200
+++ linux/drivers/scsi/isci/host.c	2015-08-19 16:08:30.000000000 +0200
@@ -612,6 +612,7 @@ irqreturn_t isci_intx_isr(int vec, void 
 
 	if (sci_controller_isr(ihost)) {
 		writel(SMU_ISR_COMPLETION, &ihost->smu_registers->interrupt_status);
+		writel(0xFF000000, &ihost->smu_registers->interrupt_mask);
 		tasklet_schedule(&ihost->completion_tasklet);
 		ret = IRQ_HANDLED;
 	} else if (sci_controller_error_isr(ihost)) {

[-- Attachment #3: init.patch --]
[-- Type: text/x-patch, Size: 448 bytes --]

Signed-off-by: Stefan Fausser <info@real-time-systems.com>

--- linux/drivers/scsi/isci/init.c.orig	2015-08-19 15:23:36.000000000 +0200
+++ linux/drivers/scsi/isci/init.c	2015-08-19 15:47:47.000000000 +0200
@@ -345,6 +345,7 @@ static int isci_setup_interrupts(struct 
 	struct isci_host *ihost;
 	struct isci_pci_info *pci_info = to_pci_info(pdev);
 
+	goto intx;
 	/*
 	 *  Determine the number of vectors associated with this
 	 *  PCI function.

* Re: isci, INTx mode, race condition
  2015-08-19 16:01 isci, INTx mode, race condition Stefan Fausser
@ 2015-08-24 14:14 ` Artur Paszkiewicz
  0 siblings, 0 replies; 2+ messages in thread
From: Artur Paszkiewicz @ 2015-08-24 14:14 UTC
  To: Stefan Fausser, intel-linux-scu, JBottomley, linux-scsi,
	linux-kernel

On 08/19/2015 06:01 PM, Stefan Fausser wrote:
> Dear all,
> 
> attached are two patches for the "isci" module (CONFIG_SCSI_ISCI).
> 
> Both patches apply to the current Linux kernel, retrieved by GIT (4.2.0-rc7).
> 
> The first patch (init.patch) is for reproducing the problem with the "Intel(R) C600 SAS Controller" in INTx Mode, see below. The second patch (host.patch) is for fixing this problem.
> 
> The problem:
> 
> By applying the first patch "init.patch", the "Intel(R) C600 SAS Controller" (now abbreviated by SAS) generates level-triggered INTx Interrupts instead of (edge-triggered) MSI-X Interrupts.
> 
> In the ISR (isci_intx_isr), the controller determines if the interrupt is due to a normal operation (normal interrupt) or an error. In the case of a normal interrupt, a tasklet is scheduled that should handle the normal interrupt. However, in the ISR, the interrupts are left unmasked and the SAS device may trigger the next interrupt after the ISR has left and before the tasklet has been scheduled.
> 
> Thus, with this patch "init.patch" and on my system (Intel C600 chipset series), the SAS device repeatedly level-triggers the interrupt and the tasklet to handle the interrupt never gets scheduled. This will result in a soft-lockup on the executing core.
> 
> In my investigations, the above described problem occurs in all Linux kernel versions starting from 3.5 and up to today.
> 
> The fix:
> 
> By applying the second patch "host.patch", the interrupts are masked in the INTx ISR in case of a normal interrupt. Thus, the scheduler has enough time to schedule the handling tasklet. In the tasklet (see sci_controller_completion_handler), the interrupts are unmasked again.
> 
> Please let me know if you need any other information.
> 
> Kind Regards,
> 
> Stefan
> 

Hi Stefan,

I tried to reproduce this issue using just the init.patch and I had no
soft lockups, but they started occurring after I added some delays in
isci_intx_isr(). The host.patch fixed this and I think the solution is
correct.

Acked-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>

Thanks,
Artur
