* [PATCH] "killing" sg_last(), and discussion
From: Jeff Garzik @ 2007-10-31 8:49 UTC (permalink / raw)
To: Jens Axboe; +Cc: linux-ide, LKML
I looked into killing sg_last(), but really, this is the best it's gonna
get (moving sg_last to libata-core.c).
You could maybe kill one use with caching, but in the other sg_last()
callsites there isn't another s/g loop we can stick a "last_sg = sg;"
into.
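To illustrate the caching idea mentioned above, here is a minimal userspace
sketch. It is not libata code: struct mock_sg and mock_total_len() are
hypothetical stand-ins for the kernel's struct scatterlist and for whatever
loop a caller already runs over the list. The point is only that a loop which
walks the list anyway can record the final entry for free.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the kernel's struct scatterlist. */
struct mock_sg {
	unsigned int length;	/* bytes covered by this entry */
	struct mock_sg *next;	/* NULL on the final entry */
};

/*
 * Sum the segment lengths and, as a side effect, remember the final
 * entry, so that no second sg_last()-style walk is needed afterwards.
 * *last_sg is set to NULL when the list is empty.
 */
static unsigned int mock_total_len(struct mock_sg *sgl,
				   struct mock_sg **last_sg)
{
	struct mock_sg *sg;
	unsigned int total = 0;

	assert(last_sg != NULL);
	*last_sg = NULL;

	for (sg = sgl; sg != NULL; sg = sg->next) {
		total += sg->length;
		*last_sg = sg;	/* the "last_sg = sg;" caching step */
	}
	return total;
}
```

This only helps at call sites that already iterate over the whole list; where
no such loop exists, a separate walk is still required.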
libata is stuck because we undertake the highly unusual operation of
fiddling with the final S/G element, to enforce 32-bit alignment.
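The arithmetic behind "enforce 32-bit alignment" can be sketched as follows.
This is an illustration only, not the libata implementation: mock_pad_len()
and mock_padded_len() are hypothetical helper names, and the sketch assumes
the goal is simply to round a byte count up to the next multiple of 4.

```c
#include <assert.h>

/* Number of pad bytes needed to round len up to a multiple of 4 (0..3). */
static unsigned int mock_pad_len(unsigned int len)
{
	return (4 - (len & 3)) & 3;
}

/* Length of the transfer once the final element has been padded. */
static unsigned int mock_padded_len(unsigned int len)
{
	unsigned int padded = len + mock_pad_len(len);

	assert((padded & 3) == 0);	/* always dword-aligned afterwards */
	return padded;
}
```

For example, a 13-byte transfer needs 3 pad bytes and becomes 16 bytes, while
a 512-byte transfer needs none.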
Of course we could eliminate all that nasty fiddling/padding
completely, including sg_last(), if other areas of the kernel would
guarantee ahead of time that buffer lengths are always a multiple
of 4........
Jeff
[the obvious patch follows, moving sg_last()...]
drivers/ata/libata-core.c | 24 ++++++++++++++++++++++++
drivers/ata/sata_nv.c | 1 -
drivers/ata/sata_promise.c | 1 -
drivers/ata/sata_qstor.c | 1 -
include/linux/scatterlist.h | 34 ----------------------------------
5 files changed, 24 insertions(+), 37 deletions(-)
diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index 63035d7..4f0acc5 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -4480,6 +4480,30 @@ static unsigned int ata_dev_init_params(struct ata_device *dev,
}
/**
+ * sg_last - return the last scatterlist entry in a list
+ * @sgl: First entry in the scatterlist
+ * @nents: Number of entries in the scatterlist
+ *
+ * Description:
+ * Should only be used casually; it (currently) scans the entire list
+ * to get the last entry.
+ *
+ * Note that the @sgl@ pointer passed in need not be the first one,
+ * the important bit is that @nents@ denotes the number of entries that
+ * exist from @sgl@.
+ **/
+static inline struct scatterlist *sg_last(struct scatterlist *sgl,
+					  unsigned int nents)
+{
+	struct scatterlist *sg, *ret = NULL;
+	unsigned int i;
+
+	for_each_sg(sgl, sg, nents, i)
+		ret = sg;
+	return ret;
+}
+
+/**
* ata_sg_clean - Unmap DMA memory associated with command
* @qc: Command containing DMA memory to be released
*
diff --git a/drivers/ata/sata_nv.c b/drivers/ata/sata_nv.c
index 35b2df2..6546913 100644
--- a/drivers/ata/sata_nv.c
+++ b/drivers/ata/sata_nv.c
@@ -1985,7 +1985,6 @@ static void nv_swncq_fill_sg(struct ata_queued_cmd *qc)
struct nv_swncq_port_priv *pp = ap->private_data;
struct ata_prd *prd;
- WARN_ON(qc->__sg == NULL);
WARN_ON(qc->n_elem == 0 && qc->pad_len == 0);
prd = pp->prd + ATA_MAX_PRD * qc->tag;
diff --git a/drivers/ata/sata_promise.c b/drivers/ata/sata_promise.c
index 825e717..1fba9d4 100644
--- a/drivers/ata/sata_promise.c
+++ b/drivers/ata/sata_promise.c
@@ -547,7 +547,6 @@ static void pdc_fill_sg(struct ata_queued_cmd *qc)
if (!(qc->flags & ATA_QCFLAG_DMAMAP))
return;
- WARN_ON(qc->__sg == NULL);
WARN_ON(qc->n_elem == 0 && qc->pad_len == 0);
idx = 0;
diff --git a/drivers/ata/sata_qstor.c b/drivers/ata/sata_qstor.c
index 6d43ba7..664e3f7 100644
--- a/drivers/ata/sata_qstor.c
+++ b/drivers/ata/sata_qstor.c
@@ -276,7 +276,6 @@ static unsigned int qs_fill_sg(struct ata_queued_cmd *qc)
unsigned int nelem;
u8 *prd = pp->pkt + QS_CPB_BYTES;
- WARN_ON(qc->__sg == NULL);
WARN_ON(qc->n_elem == 0 && qc->pad_len == 0);
nelem = 0;
diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
index 32326c2..ab94464 100644
--- a/include/linux/scatterlist.h
+++ b/include/linux/scatterlist.h
@@ -130,40 +130,6 @@ static inline struct scatterlist *sg_next(struct scatterlist *sg)
for (__i = 0, sg = (sglist); __i < (nr); __i++, sg = sg_next(sg))
/**
- * sg_last - return the last scatterlist entry in a list
- * @sgl: First entry in the scatterlist
- * @nents: Number of entries in the scatterlist
- *
- * Description:
- * Should only be used casually, it (currently) scan the entire list
- * to get the last entry.
- *
- * Note that the @sgl@ pointer passed in need not be the first one,
- * the important bit is that @nents@ denotes the number of entries that
- * exist from @sgl@.
- *
- **/
-static inline struct scatterlist *sg_last(struct scatterlist *sgl,
- unsigned int nents)
-{
-#ifndef ARCH_HAS_SG_CHAIN
- struct scatterlist *ret = &sgl[nents - 1];
-#else
- struct scatterlist *sg, *ret = NULL;
- unsigned int i;
-
- for_each_sg(sgl, sg, nents, i)
- ret = sg;
-
-#endif
-#ifdef CONFIG_DEBUG_SG
- BUG_ON(sgl[0].sg_magic != SG_MAGIC);
- BUG_ON(!sg_is_last(ret));
-#endif
- return ret;
-}
-
-/**
* sg_chain - Chain two sglists together
* @prv: First scatterlist
* @prv_nents: Number of entries in prv
* Re: [PATCH] "killing" sg_last(), and discussion
From: Boaz Harrosh @ 2007-10-31 10:19 UTC (permalink / raw)
To: Jeff Garzik; +Cc: Jens Axboe, linux-ide, LKML
On Wed, Oct 31 2007 at 10:49 +0200, Jeff Garzik <jeff@garzik.org> wrote:
> I looked into killing sg_last(), but really, this is the best it's gonna
> get (moving sg_last to libata-core.c).
>
> You could maybe kill one use with caching, but in the other sg_last()
> callsites there isn't another s/g loop we can stick a "last_sg = sg;"
> into.
>
> libata is stuck because we undertake the highly unusual operation of
> fiddling with the final S/G element, to enforce 32-bit alignment.
>
> Of course we could eliminate all that nasty fiddling/padding
> completely, including sg_last(), if other areas of the kernel would
> guarantee ahead of time that buffer lengths are always a multiple
> of 4........
>
> Jeff
>
OK, now I'm confused. I thought that ULDs can give you SGs
that are actually longer than bufflen and that, in the end,
bufflen should govern the transfer length.
Now, FS_PC commands are sector-aligned, so you do not have
problems with those.
The BLOCK_PC commands have two main sources that I know of.
One is sg and bsg from user mode, which can easily enforce
4-byte alignment. The second is kernel services, about 80% of
which go through scsi_execute(). All of these can be found
and fixed, starting with scsi_execute(). Another place could be
blk_rq_map_sg(), since all I/Os are bio-based; it can enforce
alignment too.
I would start by sticking in a WARN_ON(qc->pad_len) to
see whether it triggers and, if so, what the sources are.
Please note that the code already assumes 4-byte alignment
at the start of the transfer; otherwise
the first SG could also have a non-aligned length, which
is not checked for.
Boaz
* Re: [PATCH] "killing" sg_last(), and discussion
From: Jeff Garzik @ 2007-10-31 10:29 UTC (permalink / raw)
To: Boaz Harrosh; +Cc: Jens Axboe, linux-ide, LKML
Boaz Harrosh wrote:
> On Wed, Oct 31 2007 at 10:49 +0200, Jeff Garzik <jeff@garzik.org> wrote:
>> I looked into killing sg_last(), but really, this is the best it's gonna
>> get (moving sg_last to libata-core.c).
>>
>> You could maybe kill one use with caching, but in the other sg_last()
>> callsites there isn't another s/g loop we can stick a "last_sg = sg;"
>> into.
>>
>> libata is stuck because we undertake the highly unusual operation of
>> fiddling with the final S/G element, to enforce 32-bit alignment.
>>
>> Of course we could eliminate all that nasty fiddling/padding
>> completely, including sg_last(), if other areas of the kernel would
>> guarantee ahead of time that buffer lengths are always a multiple
>> of 4........
>>
>> Jeff
>>
> OK, now I'm confused. I thought that ULDs can give you SGs
> that are actually longer than bufflen and that, in the end,
> bufflen should govern the transfer length.
>
> Now, FS_PC commands are sector-aligned, so you do not have
> problems with those.
>
> The BLOCK_PC commands have two main sources that I know of.
> One is sg and bsg from user mode, which can easily enforce
> 4-byte alignment. The second is kernel services, about 80% of
> which go through scsi_execute(). All of these can be found
> and fixed, starting with scsi_execute(). Another place could be
> blk_rq_map_sg(), since all I/Os are bio-based; it can enforce
> alignment too.
>
> I would start by sticking in a WARN_ON(qc->pad_len) to
> see whether it triggers and, if so, what the sources are.
The whole qc->pad_len etc. machinery was added because it solved
problems in the field with ATAPI devices. So sr or some userland
application is sending lengths that are not padded to 32-bit boundary,
probably because plenty of trivial commands can send or return odd
amounts of data.
This used to be irrelevant, but now with SATA, even PIO data xfer
(normally what is used for non-READ/WRITE CDBs) must be 32-bit aligned
because both SATA DMA and SATA PIO are converted into dword-based SATA
FIS's on the wire.
Jeff
* Re: [PATCH] "killing" sg_last(), and discussion
From: Boaz Harrosh @ 2007-10-31 11:13 UTC (permalink / raw)
To: Jeff Garzik; +Cc: Jens Axboe, linux-ide, LKML
On Wed, Oct 31 2007 at 12:29 +0200, Jeff Garzik <jeff@garzik.org> wrote:
> Boaz Harrosh wrote:
>> On Wed, Oct 31 2007 at 10:49 +0200, Jeff Garzik <jeff@garzik.org> wrote:
>>> I looked into killing sg_last(), but really, this is the best it's gonna
>>> get (moving sg_last to libata-core.c).
>>>
>>> You could maybe kill one use with caching, but in the other sg_last()
>>> callsites there isn't another s/g loop we can stick a "last_sg = sg;"
>>> into.
>>>
>>> libata is stuck because we undertake the highly unusual operation of
>>> fiddling with the final S/G element, to enforce 32-bit alignment.
>>>
>>> Of course we could eliminate all that nasty fiddling/padding
>>> completely, including sg_last(), if other areas of the kernel would
>>> guarantee ahead of time that buffer lengths are always a multiple
>>> of 4........
>>>
>>> Jeff
>>>
>> OK, now I'm confused. I thought that ULDs can give you SGs
>> that are actually longer than bufflen and that, in the end,
>> bufflen should govern the transfer length.
>>
>> Now, FS_PC commands are sector-aligned, so you do not have
>> problems with those.
>>
>> The BLOCK_PC commands have two main sources that I know of.
>> One is sg and bsg from user mode, which can easily enforce
>> 4-byte alignment. The second is kernel services, about 80% of
>> which go through scsi_execute(). All of these can be found
>> and fixed, starting with scsi_execute(). Another place could be
>> blk_rq_map_sg(), since all I/Os are bio-based; it can enforce
>> alignment too.
>>
>> I would start by sticking in a WARN_ON(qc->pad_len) to
>> see whether it triggers and, if so, what the sources are.
>
> The whole qc->pad_len etc. machinery was added because it solved
> problems in the field with ATAPI devices. So sr or some userland
> application is sending lengths that are not padded to 32-bit boundary,
> probably because plenty of trivial commands can send or return odd
> amounts of data.
>
> This used to be irrelevant, but now with SATA, even PIO data xfer
> (normally what is used for non-READ/WRITE CDBs) must be 32-bit aligned
> because both SATA DMA and SATA PIO are converted into dword-based SATA
> FIS's on the wire.
>
> Jeff
>
>
>
Two things:
1. Then why not fix blk_rq_map_sg() to enforce the alignment? Also, I bet
that these "problems in the field" are from pre-2.6.18 kernels, and that this
is no longer the case. Why not put in that WARN_ON(qc->pad_len) and prove me
wrong?
2. Just checking bufflen is enough. Since you are already assuming that the
first SG's offset is aligned, if the last SG's length is odd then so is
bufflen. (You are already assuming that the SGs' total length matches bufflen.)
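The reasoning in point 2 can be checked with a small userspace sketch. It
rests on an assumption that is only implied above: every SG entry except the
last has a length that is a multiple of 4. Under that assumption the total
length and the last entry's length leave the same remainder modulo 4, so
testing bufflen alone is sufficient. All names below are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if every entry except the final one is a multiple of 4 bytes. */
static int mock_leading_aligned(const unsigned int *lens, size_t n)
{
	size_t i;

	for (i = 0; i + 1 < n; i++)
		if (lens[i] & 3)
			return 0;
	return 1;
}

/* Total transfer length, i.e. what bufflen is assumed to equal. */
static unsigned int mock_sum(const unsigned int *lens, size_t n)
{
	unsigned int total = 0;
	size_t i;

	for (i = 0; i < n; i++)
		total += lens[i];
	return total;
}

/*
 * Returns 1 if the total length and the last entry's length have the
 * same remainder modulo 4, which always holds when
 * mock_leading_aligned() is true.
 */
static int mock_residue_matches(const unsigned int *lens, size_t n)
{
	assert(n > 0);
	return (mock_sum(lens, n) & 3) == (lens[n - 1] & 3);
}
```

If an earlier entry is not a multiple of 4 (for example lengths 5 and 3), the
equivalence breaks: the total is aligned but the last entry is not.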
Boaz