* BUG in sas event handling mechanism???
@ 2006-09-22 17:40 malahal
2006-09-22 18:17 ` Alexis Bruemmer
2006-09-22 18:47 ` Luben Tuikov
0 siblings, 2 replies; 4+ messages in thread
From: malahal @ 2006-09-22 17:40 UTC (permalink / raw)
To: linux-scsi
I see a PORTE_BYTES_DMAED event, followed by a PHYE_LOSS_OF_SIGNAL event
and then followed by a PORTE_BYTES_DMAED event on the same phy. The code
seems to just drop the last event because of the not yet processed first
event. So, it just processes the first two events in that order. In
other words, the link doesn't get used at all!
Of course, I get timeouts because of the Vitesse/AIC94xx combination while
doing discovery. :-(
Maybe, this is what Luben is talking about when he said,
"Priority Queue Without Duplication Implementation"?
Thanks, Malahal.
* Re: BUG in sas event handling mechanism???
2006-09-22 17:40 BUG in sas event handling mechanism??? malahal
@ 2006-09-22 18:17 ` Alexis Bruemmer
2006-09-22 18:32 ` malahal
2006-09-22 18:47 ` Luben Tuikov
1 sibling, 1 reply; 4+ messages in thread
From: Alexis Bruemmer @ 2006-09-22 18:17 UTC (permalink / raw)
To: malahal; +Cc: linux-scsi
On Fri, 2006-09-22 at 10:40 -0700, malahal@us.ibm.com wrote:
> I see a PORTE_BYTES_DMAED event, followed by a PHYE_LOSS_OF_SIGNAL event
> and then followed by a PORTE_BYTES_DMAED event on the same phy. The code
> seems to just drop the last event because of the not yet processed first
> event. So, it just processes the first two events in that order. In
> other words, the link doesn't get used at all!
I am seeing a very similar issue with hot-plugging on the x260 systems
with the internal expanders. If a disk is pulled and plugged back in
right away then the PORTE_BROADCAST_RCVD event that was triggered when
the disk is plugged back in is dropped, causing the disk to never be
rediscovered. If there is enough of a delay between unplugging and
plugging then both PORTE_BROADCAST_RCVD events are processed correctly.
Any ideas on what would cause these events to be dropped?
--Alexis
* Re: BUG in sas event handling mechanism???
2006-09-22 18:17 ` Alexis Bruemmer
@ 2006-09-22 18:32 ` malahal
0 siblings, 0 replies; 4+ messages in thread
From: malahal @ 2006-09-22 18:32 UTC (permalink / raw)
To: Alexis Bruemmer; +Cc: linux-scsi
Alexis Bruemmer [alexisb@us.ibm.com] wrote:
> On Fri, 2006-09-22 at 10:40 -0700, malahal@us.ibm.com wrote:
> > I see a PORTE_BYTES_DMAED event, followed by a PHYE_LOSS_OF_SIGNAL event
> > and then followed by a PORTE_BYTES_DMAED event on the same phy. The code
> > seems to just drop the last event because of the not yet processed first
> > event. So, it just processes the first two events in that order. In
> > other words, the link doesn't get used at all!
> I am seeing a very similar issue with hot-plugging on the x260 systems
> with the internal expanders. If a disk is pulled and plugged back in
> right away then the PORTE_BROADCAST_RCVD event that was triggered when
> the disk is plugged back in is dropped, causing the disk to never be
> rediscovered. If there is enough of a delay between unplugging and
> plugging then both PORTE_BROADCAST_RCVD events are processed correctly.
>
> Any ideas on what would cause these events to be dropped?
The code in sas_queue_event() drops an event if there is one already in
the work queue. I don't know enough about the PORTE_BROADCAST_RCVD event
to say whether it is OK to drop the event if one is already pending...
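
The drop-if-pending behavior described above can be modeled with a small
userspace sketch. This is not the libsas code. It assumes a per-phy bitmask
of pending events, and every name in it (sketch_phy, sketch_queue_event,
the SKETCH_* values) is hypothetical and exists only for illustration. It
replays the reported sequence: BYTES_DMAED, LOSS_OF_SIGNAL, BYTES_DMAED,
with no processing in between.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical event numbers for this sketch; not the libsas enum values. */
enum sketch_event {
    SKETCH_PORTE_BYTES_DMAED = 0,
    SKETCH_PHYE_LOSS_OF_SIGNAL = 1,
    SKETCH_EVENT_COUNT
};

#define SKETCH_MAX_QUEUED 8

/* One phy: a bitmask of event types that are queued but not yet processed. */
struct sketch_phy {
    unsigned long pending;                      /* one bit per event type */
    enum sketch_event order[SKETCH_MAX_QUEUED]; /* accepted events, in order */
    size_t accepted;
    size_t dropped;
};

/*
 * Queue an event unless the same event type is still pending.
 * Returns true if the event was queued, false if it was dropped.
 */
bool sketch_queue_event(struct sketch_phy *phy, enum sketch_event ev)
{
    unsigned long bit;

    assert(ev < SKETCH_EVENT_COUNT);
    bit = 1UL << ev;

    if (phy->pending & bit) {
        phy->dropped++;          /* duplicate of a pending event: lost */
        return false;
    }

    assert(phy->accepted < SKETCH_MAX_QUEUED);
    phy->pending |= bit;
    phy->order[phy->accepted++] = ev;
    return true;
}

/* Stand-in for the worker: handle everything queued, then clear the mask. */
void sketch_process_all(struct sketch_phy *phy)
{
    phy->pending = 0;
    phy->accepted = 0;
}

/* Replay the reported sequence and return how many events were dropped. */
size_t sketch_replay_reported_sequence(void)
{
    struct sketch_phy phy = {0};

    sketch_queue_event(&phy, SKETCH_PORTE_BYTES_DMAED);
    sketch_queue_event(&phy, SKETCH_PHYE_LOSS_OF_SIGNAL);
    sketch_queue_event(&phy, SKETCH_PORTE_BYTES_DMAED); /* still pending */

    return phy.dropped;
}
```

In this model the second BYTES_DMAED is rejected because the first one has
not been processed yet, so only the first two events are ever handled. If
sketch_process_all() runs between the events, nothing is dropped, which
matches the observation that a long enough delay avoids the problem.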
Thanks, Malahal.
* Re: BUG in sas event handling mechanism???
2006-09-22 17:40 BUG in sas event handling mechanism??? malahal
2006-09-22 18:17 ` Alexis Bruemmer
@ 2006-09-22 18:47 ` Luben Tuikov
1 sibling, 0 replies; 4+ messages in thread
From: Luben Tuikov @ 2006-09-22 18:47 UTC (permalink / raw)
To: malahal, linux-scsi
--- malahal@us.ibm.com wrote:
> I see a PORTE_BYTES_DMAED event, followed by a PHYE_LOSS_OF_SIGNAL event
> and then followed by a PORTE_BYTES_DMAED event on the same phy. The code
> seems to just drop the last event because of the not yet processed first
> event. So, it just processes the first two events in that order. In
> other words, the link doesn't get used at all!
Hi Malahal,
This is of course wrong and the original SAS Stack as written by me
doesn't process this in the way you outlined above, and doesn't lose
*any* events. In the above example, the device is _properly discovered_.
You should hold Bottomley and LTC directly responsible for this.
Not "someone" or "the community".
Here are their patches directly responsible for this:
Bottomley: http://marc.theaimsgroup.com/?l=linux-scsi&m=114218113500117&w=2
Bruemmer: http://marc.theaimsgroup.com/?l=linux-scsi&m=114235935625301&w=2
Bruemmer: http://marc.theaimsgroup.com/?l=linux-scsi&m=114721079722569&w=2
> Of course, I get timeouts because of the Vitesse/AIC94xx combination while
> doing discovery. :-(
Don't blame the hardware. Software should be robust enough to handle
the worst cases, in fact any case.
> Maybe, this is what Luben is talking about when he said,
> "Priority Queue Without Duplication Implementation"?
Yes.
Good luck Malahal!
Luben
end of thread, other threads:[~2006-09-22 18:47 UTC | newest]
Thread overview: 4+ messages
2006-09-22 17:40 BUG in sas event handling mechanism??? malahal
2006-09-22 18:17 ` Alexis Bruemmer
2006-09-22 18:32 ` malahal
2006-09-22 18:47 ` Luben Tuikov