[PATCH 0/4] revert the commit 22a9189f (cdrom: use kmalloced buffers instead of buffers on stack)

linux-scsi.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH 0/4] revert the commit 22a9189f (cdrom: use kmalloced buffers instead of buffers on stack)
@ 2008-05-20  4:58 FUJITA Tomonori
  2008-05-20  4:58 ` [PATCH 1/4] block: use ARCH_KMALLOC_MINALIGN as the default dma pad mask FUJITA Tomonori
                   ` (2 more replies)
  0 siblings, 3 replies; 45+ messages in thread
From: FUJITA Tomonori @ 2008-05-20  4:58 UTC (permalink / raw)
  To: linux-scsi, linux-ide
  Cc: jens.axboe, tsbogend, bzolnier, James.Bottomley, jeff, akpm,
	davem, fujita.tomonori

The goal of this patchset is reverting the commit
22a9189fd073db3d03a4cf8b8c098aa207602de1 (cdrom: use kmalloced buffers
instead of buffers on stack).

http://lkml.org/lkml/2008/4/21/634

The commit is using kmalloced buffers for cdrom packet commands to
avoid stack corruption on non coherent platforms. But allocating a
small buffer like this is not nice (unnecessary complicity):

+	buffer = kmalloc(8, GFP_KERNEL);

I tried to remove generic_packet() and convert cdrom users to use the
block queue like pkt_generic_packet (as Jens suggested in the thread
if I correctly understand), but it turned out that it needs tricky
surgery (like handling ssleep and retries for packet commands in ide).

Then I found that we can easily handle packet commands on non coherent
platforms. The diffstat is pretty small except for the revert. All
this patchset does is just setting the dma_pad_mask to
ARCH_KMALLOC_MINALIGN.

Only scsi and ide-cd do DMA generic_packet. In the case of scsi,
sr_packet uses blk_rq_map_kern (the commit
68154e90c9d1492d570671ae181d9a8f8530da55) post 2.6.25. So if we set
the dma padding on non coherent platforms, sr_packet uses allocated
pages properly.

In the case of IDE, ide-cd has a mechanism to handle alignment and
padding for SG_IO. So we can easily exploit it for packet commands.

If some architectures can't do DMA on stack, we also need to a new
queue_flag like QUEUE_FLAG_NO_DMA_ON_STACK in addtion of this
patchset.

The diffstat is:

 block/blk-settings.c      |   30 +++++-
 drivers/ata/libata-scsi.c |    3 +-
 drivers/cdrom/cdrom.c     |  274 +++++++++++++++------------------------------
 drivers/ide/ide-cd.c      |   17 ++-
 include/linux/blkdev.h    |    1 +
 5 files changed, 134 insertions(+), 191 deletions(-)

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH 1/4] block: use ARCH_KMALLOC_MINALIGN as the default dma pad mask
  2008-05-20  4:58 [PATCH 0/4] revert the commit 22a9189f (cdrom: use kmalloced buffers instead of buffers on stack) FUJITA Tomonori
@ 2008-05-20  4:58 ` FUJITA Tomonori
  2008-05-20  4:58   ` [PATCH 2/4] block: add blk_queue_update_dma_pad FUJITA Tomonori
  2008-05-20  9:31   ` [PATCH 1/4] block: use ARCH_KMALLOC_MINALIGN as the default dma pad mask Andrew Morton
  2008-05-21 22:11 ` [PATCH 0/4] revert the commit 22a9189f (cdrom: use kmalloced buffers instead of buffers on stack) Thomas Bogendoerfer
  2008-05-27 18:58 ` Bartlomiej Zolnierkiewicz
  2 siblings, 2 replies; 45+ messages in thread
From: FUJITA Tomonori @ 2008-05-20  4:58 UTC (permalink / raw)
  To: linux-scsi, linux-ide
  Cc: jens.axboe, tsbogend, bzolnier, James.Bottomley, jeff, akpm,
	davem, FUJITA Tomonori

This sets the default dma pad mask to ARCH_KMALLOC_MINALIGN in
blk_queue_make_request(). It affects only non-coherent platforms.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Jens Axboe <jens.axboe@oracle.com>
---
 block/blk-settings.c |    6 ++++++
 1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/block/blk-settings.c b/block/blk-settings.c
index 8dd8641..781d1bf 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -84,6 +84,11 @@ EXPORT_SYMBOL(blk_queue_softirq_done);
  **/
 void blk_queue_make_request(struct request_queue *q, make_request_fn *mfn)
 {
+#ifndef ARCH_KMALLOC_MINALIGN
+#define ARCH_KMALLOC_MINALIGN 1
+#endif
+	unsigned int min_align = ARCH_KMALLOC_MINALIGN;
+
 	/*
 	 * set defaults
 	 */
@@ -98,6 +103,7 @@ void blk_queue_make_request(struct request_queue *q, make_request_fn *mfn)
 	blk_queue_max_sectors(q, SAFE_MAX_SECTORS);
 	blk_queue_hardsect_size(q, 512);
 	blk_queue_dma_alignment(q, 511);
+	blk_queue_dma_pad(q, min_align - 1);
 	blk_queue_congestion_threshold(q);
 	q->nr_batching = BLK_BATCH_REQ;
 
-- 
1.5.4.2


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 2/4] block: add blk_queue_update_dma_pad
  2008-05-20  4:58 ` [PATCH 1/4] block: use ARCH_KMALLOC_MINALIGN as the default dma pad mask FUJITA Tomonori
@ 2008-05-20  4:58   ` FUJITA Tomonori
  2008-05-20  4:58     ` [PATCH 3/4] ide: use the dma safe check for REQ_TYPE_ATA_PC FUJITA Tomonori
  2008-05-20  9:31   ` [PATCH 1/4] block: use ARCH_KMALLOC_MINALIGN as the default dma pad mask Andrew Morton
  1 sibling, 1 reply; 45+ messages in thread
From: FUJITA Tomonori @ 2008-05-20  4:58 UTC (permalink / raw)
  To: linux-scsi, linux-ide
  Cc: jens.axboe, tsbogend, bzolnier, James.Bottomley, jeff, akpm,
	davem, FUJITA Tomonori, Tejun Heo

This adds blk_queue_update_dma_pad to prevent LLDs from overwriting
the dma pad mask wrongly (we added blk_queue_update_dma_alignment due
to the same reason).

This also converts libata to use blk_queue_update_dma_pad instead of
blk_queue_dma_pad.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Tejun Heo <htejun@gmail.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
---
 block/blk-settings.c      |   24 ++++++++++++++++++++----
 drivers/ata/libata-scsi.c |    3 ++-
 include/linux/blkdev.h    |    1 +
 3 files changed, 23 insertions(+), 5 deletions(-)

diff --git a/block/blk-settings.c b/block/blk-settings.c
index 781d1bf..941777c 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -308,11 +308,10 @@ EXPORT_SYMBOL(blk_queue_stack_limits);
  * @q:     the request queue for the device
  * @mask:  pad mask
  *
- * Set pad mask.  Direct IO requests are padded to the mask specified.
+ * Set dma pad mask.
  *
- * Appending pad buffer to a request modifies ->data_len such that it
- * includes the pad buffer.  The original requested data length can be
- * obtained using blk_rq_raw_data_len().
+ * Appending pad buffer to a request modifies the last entry of a
+ * scatter list such that it includes the pad buffer.
  **/
 void blk_queue_dma_pad(struct request_queue *q, unsigned int mask)
 {
@@ -321,6 +320,23 @@ void blk_queue_dma_pad(struct request_queue *q, unsigned int mask)
 EXPORT_SYMBOL(blk_queue_dma_pad);
 
 /**
+ * blk_queue_update_dma_pad - update pad mask
+ * @q:     the request queue for the device
+ * @mask:  pad mask
+ *
+ * Update dma pad mask.
+ *
+ * Appending pad buffer to a request modifies the last entry of a
+ * scatter list such that it includes the pad buffer.
+ **/
+void blk_queue_update_dma_pad(struct request_queue *q, unsigned int mask)
+{
+	if (mask > q->dma_pad_mask)
+		q->dma_pad_mask = mask;
+}
+EXPORT_SYMBOL(blk_queue_update_dma_pad);
+
+/**
  * blk_queue_dma_drain - Set up a drain buffer for excess dma.
  * @q:  the request queue for the device
  * @dma_drain_needed: fn which returns non-zero if drain is necessary
diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
index aeb6e01..60c0e28 100644
--- a/drivers/ata/libata-scsi.c
+++ b/drivers/ata/libata-scsi.c
@@ -885,7 +885,8 @@ static int ata_scsi_dev_config(struct scsi_device *sdev,
 		/* set the min alignment and padding */
 		blk_queue_update_dma_alignment(sdev->request_queue,
 					       ATA_DMA_PAD_SZ - 1);
-		blk_queue_dma_pad(sdev->request_queue, ATA_DMA_PAD_SZ - 1);
+		blk_queue_update_dma_pad(sdev->request_queue,
+					 ATA_DMA_PAD_SZ - 1);
 
 		/* configure draining */
 		buf = kmalloc(ATAPI_MAX_DRAIN, q->bounce_gfp | GFP_KERNEL);
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index d2a1b71..740d68f 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -749,6 +749,7 @@ extern void blk_queue_max_segment_size(struct request_queue *, unsigned int);
 extern void blk_queue_hardsect_size(struct request_queue *, unsigned short);
 extern void blk_queue_stack_limits(struct request_queue *t, struct request_queue *b);
 extern void blk_queue_dma_pad(struct request_queue *, unsigned int);
+extern void blk_queue_update_dma_pad(struct request_queue *, unsigned int);
 extern int blk_queue_dma_drain(struct request_queue *q,
 			       dma_drain_needed_fn *dma_drain_needed,
 			       void *buf, unsigned int size);
-- 
1.5.4.2


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 3/4] ide: use the dma safe check for REQ_TYPE_ATA_PC
  2008-05-20  4:58   ` [PATCH 2/4] block: add blk_queue_update_dma_pad FUJITA Tomonori
@ 2008-05-20  4:58     ` FUJITA Tomonori
  2008-05-20  4:58       ` [PATCH 4/4] cdrom: revert commit 22a9189 (cdrom: use kmalloced buffers instead of buffers on stack) FUJITA Tomonori
  2008-05-28  6:43       ` [PATCH 3/4] ide: use the dma safe check for REQ_TYPE_ATA_PC Borislav Petkov
  0 siblings, 2 replies; 45+ messages in thread
From: FUJITA Tomonori @ 2008-05-20  4:58 UTC (permalink / raw)
  To: linux-scsi, linux-ide
  Cc: jens.axboe, tsbogend, bzolnier, James.Bottomley, jeff, akpm,
	davem, FUJITA Tomonori

This uses the dma safe check for REQ_TYPE_ATA_PC. The dma safe check
is used for only sg requests but it should be used for other non fs
commands.

This uses blk_queue_update_dma_pad to make the intention clear though
ide don't use the blk APIs so it doesn't change anything.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
---
 drivers/ide/ide-cd.c |   17 ++++++++++++-----
 1 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/drivers/ide/ide-cd.c b/drivers/ide/ide-cd.c
index 68e7f19..1d8d510 100644
--- a/drivers/ide/ide-cd.c
+++ b/drivers/ide/ide-cd.c
@@ -1178,10 +1178,15 @@ static ide_startstop_t cdrom_do_block_pc(ide_drive_t *drive, struct request *rq)
 	info->dma = 0;
 
 	/* sg request */
-	if (rq->bio) {
-		int mask = drive->queue->dma_alignment;
-		unsigned long addr =
-			(unsigned long)page_address(bio_page(rq->bio));
+	if (rq->bio || ((rq->cmd_type == REQ_TYPE_ATA_PC) && rq->data_len)) {
+		struct request_queue *q = drive->queue;
+		unsigned int alignment;
+		unsigned long addr;
+
+		if (rq->bio)
+			addr = (unsigned long)bio_data(rq->bio);
+		else
+			addr = (unsigned long)rq->data;
 
 		info->dma = drive->using_dma;
 
@@ -1191,7 +1196,8 @@ static ide_startstop_t cdrom_do_block_pc(ide_drive_t *drive, struct request *rq)
 		 * NOTE! The "len" and "addr" checks should possibly have
 		 * separate masks.
 		 */
-		if ((rq->data_len & 15) || (addr & mask))
+		alignment = queue_dma_alignment(q) | q->dma_pad_mask;
+		if (addr & alignment || rq->data_len & alignment)
 			info->dma = 0;
 	}
 
@@ -1869,6 +1875,7 @@ static int ide_cdrom_setup(ide_drive_t *drive)
 
 	blk_queue_prep_rq(drive->queue, ide_cdrom_prep_fn);
 	blk_queue_dma_alignment(drive->queue, 31);
+	blk_queue_update_dma_pad(drive->queue, 15);
 	drive->queue->unplug_delay = (1 * HZ) / 1000;
 	if (!drive->queue->unplug_delay)
 		drive->queue->unplug_delay = 1;
-- 
1.5.4.2


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH 4/4] cdrom: revert commit 22a9189 (cdrom: use kmalloced buffers instead of buffers on stack)
  2008-05-20  4:58     ` [PATCH 3/4] ide: use the dma safe check for REQ_TYPE_ATA_PC FUJITA Tomonori
@ 2008-05-20  4:58       ` FUJITA Tomonori
  2008-05-20  7:33         ` Elias Oltmanns
  2008-05-28  6:43       ` [PATCH 3/4] ide: use the dma safe check for REQ_TYPE_ATA_PC Borislav Petkov
  1 sibling, 1 reply; 45+ messages in thread
From: FUJITA Tomonori @ 2008-05-20  4:58 UTC (permalink / raw)
  To: linux-scsi, linux-ide
  Cc: jens.axboe, tsbogend, bzolnier, James.Bottomley, jeff, akpm,
	davem, FUJITA Tomonori

The commit 22a9189fd073db3d03a4cf8b8c098aa207602de1 (cdrom: use
kmalloced buffers instead of buffers on stack) is introduced to use
kmalloced buffers for packet commands to avoid stack corruption on non
coherent platforms.

SCSI cdrom uses blk_rq_map_kern (the commit
68154e90c9d1492d570671ae181d9a8f8530da55) post 2.6.25. So sr_packet
uses allocated pages when necessary. IDE cdrom has a mechnism to
handle alignment and padding. So we don't need this extra complexitiy
in cdrom.c.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Jens Axboe <jens.axboe@oracle.com>
---
 drivers/cdrom/cdrom.c |  274 +++++++++++++++++--------------------------------
 1 files changed, 93 insertions(+), 181 deletions(-)

diff --git a/drivers/cdrom/cdrom.c b/drivers/cdrom/cdrom.c
index 69f26eb..a5da356 100644
--- a/drivers/cdrom/cdrom.c
+++ b/drivers/cdrom/cdrom.c
@@ -461,37 +461,27 @@ int cdrom_get_media_event(struct cdrom_device_info *cdi,
 			  struct media_event_desc *med)
 {
 	struct packet_command cgc;
-	unsigned char *buffer;
-	struct event_header *eh;
-	int ret = 1;
-
-	buffer = kmalloc(8, GFP_KERNEL);
-	if (!buffer)
-		return -ENOMEM;
+	unsigned char buffer[8];
+	struct event_header *eh = (struct event_header *) buffer;
 
-	eh = (struct event_header *)buffer;
-
-	init_cdrom_command(&cgc, buffer, 8, CGC_DATA_READ);
+	init_cdrom_command(&cgc, buffer, sizeof(buffer), CGC_DATA_READ);
 	cgc.cmd[0] = GPCMD_GET_EVENT_STATUS_NOTIFICATION;
 	cgc.cmd[1] = 1;		/* IMMED */
 	cgc.cmd[4] = 1 << 4;	/* media event */
-	cgc.cmd[8] = 8;
+	cgc.cmd[8] = sizeof(buffer);
 	cgc.quiet = 1;
 
 	if (cdi->ops->generic_packet(cdi, &cgc))
-		goto err;
+		return 1;
 
 	if (be16_to_cpu(eh->data_len) < sizeof(*med))
-		goto err;
+		return 1;
 
 	if (eh->nea || eh->notification_class != 0x4)
-		goto err;
+		return 1;
 
-	memcpy(med, buffer + sizeof(*eh), sizeof(*med));
-	ret = 0;
-err:
-	kfree(buffer);
-	return ret;
+	memcpy(med, &buffer[sizeof(*eh)], sizeof(*med));
+	return 0;
 }
 
 /*
@@ -501,82 +491,68 @@ err:
 static int cdrom_mrw_probe_pc(struct cdrom_device_info *cdi)
 {
 	struct packet_command cgc;
-	char *buffer;
-	int ret = 1;
-
-	buffer = kmalloc(16, GFP_KERNEL);
-	if (!buffer)
-		return -ENOMEM;
+	char buffer[16];
 
-	init_cdrom_command(&cgc, buffer, 16, CGC_DATA_READ);
+	init_cdrom_command(&cgc, buffer, sizeof(buffer), CGC_DATA_READ);
 
 	cgc.timeout = HZ;
 	cgc.quiet = 1;
 
 	if (!cdrom_mode_sense(cdi, &cgc, MRW_MODE_PC, 0)) {
 		cdi->mrw_mode_page = MRW_MODE_PC;
-		ret = 0;
+		return 0;
 	} else if (!cdrom_mode_sense(cdi, &cgc, MRW_MODE_PC_PRE1, 0)) {
 		cdi->mrw_mode_page = MRW_MODE_PC_PRE1;
-		ret = 0;
+		return 0;
 	}
-	kfree(buffer);
-	return ret;
+
+	return 1;
 }
 
 static int cdrom_is_mrw(struct cdrom_device_info *cdi, int *write)
 {
 	struct packet_command cgc;
 	struct mrw_feature_desc *mfd;
-	unsigned char *buffer;
+	unsigned char buffer[16];
 	int ret;
 
 	*write = 0;
-	buffer = kmalloc(16, GFP_KERNEL);
-	if (!buffer)
-		return -ENOMEM;
 
-	init_cdrom_command(&cgc, buffer, 16, CGC_DATA_READ);
+	init_cdrom_command(&cgc, buffer, sizeof(buffer), CGC_DATA_READ);
 
 	cgc.cmd[0] = GPCMD_GET_CONFIGURATION;
 	cgc.cmd[3] = CDF_MRW;
-	cgc.cmd[8] = 16;
+	cgc.cmd[8] = sizeof(buffer);
 	cgc.quiet = 1;
 
 	if ((ret = cdi->ops->generic_packet(cdi, &cgc)))
-		goto err;
+		return ret;
 
 	mfd = (struct mrw_feature_desc *)&buffer[sizeof(struct feature_header)];
-	if (be16_to_cpu(mfd->feature_code) != CDF_MRW) {
-		ret = 1;
-		goto err;
-	}
+	if (be16_to_cpu(mfd->feature_code) != CDF_MRW)
+		return 1;
 	*write = mfd->write;
 
 	if ((ret = cdrom_mrw_probe_pc(cdi))) {
 		*write = 0;
+		return ret;
 	}
-err:
-	kfree(buffer);
-	return ret;
+
+	return 0;
 }
 
 static int cdrom_mrw_bgformat(struct cdrom_device_info *cdi, int cont)
 {
 	struct packet_command cgc;
-	unsigned char *buffer;
+	unsigned char buffer[12];
 	int ret;
 
 	printk(KERN_INFO "cdrom: %sstarting format\n", cont ? "Re" : "");
 
-	buffer = kmalloc(12, GFP_KERNEL);
-	if (!buffer)
-		return -ENOMEM;
-
 	/*
 	 * FmtData bit set (bit 4), format type is 1
 	 */
-	init_cdrom_command(&cgc, buffer, 12, CGC_DATA_WRITE);
+	init_cdrom_command(&cgc, buffer, sizeof(buffer), CGC_DATA_WRITE);
 	cgc.cmd[0] = GPCMD_FORMAT_UNIT;
 	cgc.cmd[1] = (1 << 4) | 1;
 
@@ -603,7 +579,6 @@ static int cdrom_mrw_bgformat(struct cdrom_device_info *cdi, int cont)
 	if (ret)
 		printk(KERN_INFO "cdrom: bgformat failed\n");
 
-	kfree(buffer);
 	return ret;
 }
 
@@ -663,17 +638,16 @@ static int cdrom_mrw_set_lba_space(struct cdrom_device_info *cdi, int space)
 {
 	struct packet_command cgc;
 	struct mode_page_header *mph;
-	char *buffer;
+	char buffer[16];
 	int ret, offset, size;
 
-	buffer = kmalloc(16, GFP_KERNEL);
-	if (!buffer)
-		return -ENOMEM;
+	init_cdrom_command(&cgc, buffer, sizeof(buffer), CGC_DATA_READ);
 
-	init_cdrom_command(&cgc, buffer, 16, CGC_DATA_READ);
+	cgc.buffer = buffer;
+	cgc.buflen = sizeof(buffer);
 
 	if ((ret = cdrom_mode_sense(cdi, &cgc, cdi->mrw_mode_page, 0)))
-		goto err;
+		return ret;
 
 	mph = (struct mode_page_header *) buffer;
 	offset = be16_to_cpu(mph->desc_length);
@@ -683,70 +657,55 @@ static int cdrom_mrw_set_lba_space(struct cdrom_device_info *cdi, int space)
 	cgc.buflen = size;
 
 	if ((ret = cdrom_mode_select(cdi, &cgc)))
-		goto err;
+		return ret;
 
 	printk(KERN_INFO "cdrom: %s: mrw address space %s selected\n", cdi->name, mrw_address_space[space]);
-	ret = 0;
-err:
-	kfree(buffer);
-	return ret;
+	return 0;
 }
 
 static int cdrom_get_random_writable(struct cdrom_device_info *cdi,
 			      struct rwrt_feature_desc *rfd)
 {
 	struct packet_command cgc;
-	char *buffer;
+	char buffer[24];
 	int ret;
 
-	buffer = kmalloc(24, GFP_KERNEL);
-	if (!buffer)
-		return -ENOMEM;
-
-	init_cdrom_command(&cgc, buffer, 24, CGC_DATA_READ);
+	init_cdrom_command(&cgc, buffer, sizeof(buffer), CGC_DATA_READ);
 
 	cgc.cmd[0] = GPCMD_GET_CONFIGURATION;	/* often 0x46 */
 	cgc.cmd[3] = CDF_RWRT;			/* often 0x0020 */
-	cgc.cmd[8] = 24;		        /* often 0x18 */
+	cgc.cmd[8] = sizeof(buffer);		/* often 0x18 */
 	cgc.quiet = 1;
 
 	if ((ret = cdi->ops->generic_packet(cdi, &cgc)))
-		goto err;
+		return ret;
 
 	memcpy(rfd, &buffer[sizeof(struct feature_header)], sizeof (*rfd));
-	ret = 0;
-err:
-	kfree(buffer);
-	return ret;
+	return 0;
 }
 
 static int cdrom_has_defect_mgt(struct cdrom_device_info *cdi)
 {
 	struct packet_command cgc;
-	char *buffer;
+	char buffer[16];
 	__be16 *feature_code;
 	int ret;
 
-	buffer = kmalloc(16, GFP_KERNEL);
-	if (!buffer)
-		return -ENOMEM;
-
-	init_cdrom_command(&cgc, buffer, 16, CGC_DATA_READ);
+	init_cdrom_command(&cgc, buffer, sizeof(buffer), CGC_DATA_READ);
 
 	cgc.cmd[0] = GPCMD_GET_CONFIGURATION;
 	cgc.cmd[3] = CDF_HWDM;
-	cgc.cmd[8] = 16;
+	cgc.cmd[8] = sizeof(buffer);
 	cgc.quiet = 1;
 
 	if ((ret = cdi->ops->generic_packet(cdi, &cgc)))
-		goto err;
+		return ret;
 
 	feature_code = (__be16 *) &buffer[sizeof(struct feature_header)];
 	if (be16_to_cpu(*feature_code) == CDF_HWDM)
-		ret = 0;
-err:
-	kfree(buffer);
-	return ret;
+		return 0;
+
+	return 1;
 }
 
 
@@ -837,14 +796,10 @@ static int cdrom_mrw_open_write(struct cdrom_device_info *cdi)
 static int mo_open_write(struct cdrom_device_info *cdi)
 {
 	struct packet_command cgc;
-	char *buffer;
+	char buffer[255];
 	int ret;
 
-	buffer = kmalloc(255, GFP_KERNEL);
-	if (!buffer)
-		return -ENOMEM;
-
-	init_cdrom_command(&cgc, buffer, 4, CGC_DATA_READ);
+	init_cdrom_command(&cgc, &buffer, 4, CGC_DATA_READ);
 	cgc.quiet = 1;
 
 	/*
@@ -861,15 +816,10 @@ static int mo_open_write(struct cdrom_device_info *cdi)
 	}
 
 	/* drive gave us no info, let the user go ahead */
-	if (ret) {
-		ret = 0;
-		goto err;
-	}
+	if (ret)
+		return 0;
 
-	ret = buffer[3] & 0x80;
-err:
-	kfree(buffer);
-	return ret;
+	return buffer[3] & 0x80;
 }
 
 static int cdrom_ram_open_write(struct cdrom_device_info *cdi)
@@ -892,19 +842,15 @@ static int cdrom_ram_open_write(struct cdrom_device_info *cdi)
 static void cdrom_mmc3_profile(struct cdrom_device_info *cdi)
 {
 	struct packet_command cgc;
-	char *buffer;
+	char buffer[32];
 	int ret, mmc3_profile;
 
-	buffer = kmalloc(32, GFP_KERNEL);
-	if (!buffer)
-		return;
-
-	init_cdrom_command(&cgc, buffer, 32, CGC_DATA_READ);
+	init_cdrom_command(&cgc, buffer, sizeof(buffer), CGC_DATA_READ);
 
 	cgc.cmd[0] = GPCMD_GET_CONFIGURATION;
 	cgc.cmd[1] = 0;
 	cgc.cmd[2] = cgc.cmd[3] = 0;		/* Starting Feature Number */
-	cgc.cmd[8] = 32;		        /* Allocation Length */
+	cgc.cmd[8] = sizeof(buffer);		/* Allocation Length */
 	cgc.quiet = 1;
 
 	if ((ret = cdi->ops->generic_packet(cdi, &cgc)))
@@ -913,7 +859,6 @@ static void cdrom_mmc3_profile(struct cdrom_device_info *cdi)
 		mmc3_profile = (buffer[6] << 8) | buffer[7];
 
 	cdi->mmc3_profile = mmc3_profile;
-	kfree(buffer);
 }
 
 static int cdrom_is_dvd_rw(struct cdrom_device_info *cdi)
@@ -1628,15 +1573,12 @@ static void setup_send_key(struct packet_command *cgc, unsigned agid, unsigned t
 static int dvd_do_auth(struct cdrom_device_info *cdi, dvd_authinfo *ai)
 {
 	int ret;
-	u_char *buf;
+	u_char buf[20];
 	struct packet_command cgc;
 	struct cdrom_device_ops *cdo = cdi->ops;
-	rpc_state_t *rpc_state;
-
-	buf = kzalloc(20, GFP_KERNEL);
-	if (!buf)
-		return -ENOMEM;
+	rpc_state_t rpc_state;
 
+	memset(buf, 0, sizeof(buf));
 	init_cdrom_command(&cgc, buf, 0, CGC_DATA_READ);
 
 	switch (ai->type) {
@@ -1647,7 +1589,7 @@ static int dvd_do_auth(struct cdrom_device_info *cdi, dvd_authinfo *ai)
 		setup_report_key(&cgc, ai->lsa.agid, 0);
 
 		if ((ret = cdo->generic_packet(cdi, &cgc)))
-			goto err;
+			return ret;
 
 		ai->lsa.agid = buf[7] >> 6;
 		/* Returning data, let host change state */
@@ -1658,7 +1600,7 @@ static int dvd_do_auth(struct cdrom_device_info *cdi, dvd_authinfo *ai)
 		setup_report_key(&cgc, ai->lsk.agid, 2);
 
 		if ((ret = cdo->generic_packet(cdi, &cgc)))
-			goto err;
+			return ret;
 
 		copy_key(ai->lsk.key, &buf[4]);
 		/* Returning data, let host change state */
@@ -1669,7 +1611,7 @@ static int dvd_do_auth(struct cdrom_device_info *cdi, dvd_authinfo *ai)
 		setup_report_key(&cgc, ai->lsc.agid, 1);
 
 		if ((ret = cdo->generic_packet(cdi, &cgc)))
-			goto err;
+			return ret;
 
 		copy_chal(ai->lsc.chal, &buf[4]);
 		/* Returning data, let host change state */
@@ -1686,7 +1628,7 @@ static int dvd_do_auth(struct cdrom_device_info *cdi, dvd_authinfo *ai)
 		cgc.cmd[2] = ai->lstk.lba >> 24;
 
 		if ((ret = cdo->generic_packet(cdi, &cgc)))
-			goto err;
+			return ret;
 
 		ai->lstk.cpm = (buf[4] >> 7) & 1;
 		ai->lstk.cp_sec = (buf[4] >> 6) & 1;
@@ -1700,7 +1642,7 @@ static int dvd_do_auth(struct cdrom_device_info *cdi, dvd_authinfo *ai)
 		setup_report_key(&cgc, ai->lsasf.agid, 5);
 		
 		if ((ret = cdo->generic_packet(cdi, &cgc)))
-			goto err;
+			return ret;
 
 		ai->lsasf.asf = buf[7] & 1;
 		break;
@@ -1713,7 +1655,7 @@ static int dvd_do_auth(struct cdrom_device_info *cdi, dvd_authinfo *ai)
 		copy_chal(&buf[4], ai->hsc.chal);
 
 		if ((ret = cdo->generic_packet(cdi, &cgc)))
-			goto err;
+			return ret;
 
 		ai->type = DVD_LU_SEND_KEY1;
 		break;
@@ -1726,7 +1668,7 @@ static int dvd_do_auth(struct cdrom_device_info *cdi, dvd_authinfo *ai)
 
 		if ((ret = cdo->generic_packet(cdi, &cgc))) {
 			ai->type = DVD_AUTH_FAILURE;
-			goto err;
+			return ret;
 		}
 		ai->type = DVD_AUTH_ESTABLISHED;
 		break;
@@ -1737,23 +1679,24 @@ static int dvd_do_auth(struct cdrom_device_info *cdi, dvd_authinfo *ai)
 		cdinfo(CD_DVD, "entering DVD_INVALIDATE_AGID\n"); 
 		setup_report_key(&cgc, ai->lsa.agid, 0x3f);
 		if ((ret = cdo->generic_packet(cdi, &cgc)))
-			goto err;
+			return ret;
 		break;
 
 	/* Get region settings */
 	case DVD_LU_SEND_RPC_STATE:
 		cdinfo(CD_DVD, "entering DVD_LU_SEND_RPC_STATE\n");
 		setup_report_key(&cgc, 0, 8);
+		memset(&rpc_state, 0, sizeof(rpc_state_t));
+		cgc.buffer = (char *) &rpc_state;
 
 		if ((ret = cdo->generic_packet(cdi, &cgc)))
-			goto err;
+			return ret;
 
-		rpc_state = (rpc_state_t *)buf;
-		ai->lrpcs.type = rpc_state->type_code;
-		ai->lrpcs.vra = rpc_state->vra;
-		ai->lrpcs.ucca = rpc_state->ucca;
-		ai->lrpcs.region_mask = rpc_state->region_mask;
-		ai->lrpcs.rpc_scheme = rpc_state->rpc_scheme;
+		ai->lrpcs.type = rpc_state.type_code;
+		ai->lrpcs.vra = rpc_state.vra;
+		ai->lrpcs.ucca = rpc_state.ucca;
+		ai->lrpcs.region_mask = rpc_state.region_mask;
+		ai->lrpcs.rpc_scheme = rpc_state.rpc_scheme;
 		break;
 
 	/* Set region settings */
@@ -1764,23 +1707,20 @@ static int dvd_do_auth(struct cdrom_device_info *cdi, dvd_authinfo *ai)
 		buf[4] = ai->hrpcs.pdrc;
 
 		if ((ret = cdo->generic_packet(cdi, &cgc)))
-			goto err;
+			return ret;
 		break;
 
 	default:
 		cdinfo(CD_WARNING, "Invalid DVD key ioctl (%d)\n", ai->type);
-		ret = -ENOTTY;
-		goto err;
+		return -ENOTTY;
 	}
-	ret = 0;
-err:
-	kfree(buf);
-	return ret;
+
+	return 0;
 }
 
 static int dvd_read_physical(struct cdrom_device_info *cdi, dvd_struct *s)
 {
-	unsigned char *buf, *base;
+	unsigned char buf[21], *base;
 	struct dvd_layer *layer;
 	struct packet_command cgc;
 	struct cdrom_device_ops *cdo = cdi->ops;
@@ -1789,11 +1729,7 @@ static int dvd_read_physical(struct cdrom_device_info *cdi, dvd_struct *s)
 	if (layer_num >= DVD_LAYERS)
 		return -EINVAL;
 
-	buf = kmalloc(21, GFP_KERNEL);
-	if (!buf)
-		return -ENOMEM;
-
-	init_cdrom_command(&cgc, buf, 21, CGC_DATA_READ);
+	init_cdrom_command(&cgc, buf, sizeof(buf), CGC_DATA_READ);
 	cgc.cmd[0] = GPCMD_READ_DVD_STRUCTURE;
 	cgc.cmd[6] = layer_num;
 	cgc.cmd[7] = s->type;
@@ -1805,7 +1741,7 @@ static int dvd_read_physical(struct cdrom_device_info *cdi, dvd_struct *s)
 	cgc.quiet = 1;
 
 	if ((ret = cdo->generic_packet(cdi, &cgc)))
-		goto err;
+		return ret;
 
 	base = &buf[4];
 	layer = &s->physical.layer[layer_num];
@@ -1829,24 +1765,17 @@ static int dvd_read_physical(struct cdrom_device_info *cdi, dvd_struct *s)
 	layer->end_sector_l0 = base[13] << 16 | base[14] << 8 | base[15];
 	layer->bca = base[16] >> 7;
 
-	ret = 0;
-err:
-	kfree(buf);
-	return ret;
+	return 0;
 }
 
 static int dvd_read_copyright(struct cdrom_device_info *cdi, dvd_struct *s)
 {
 	int ret;
-	u_char *buf;
+	u_char buf[8];
 	struct packet_command cgc;
 	struct cdrom_device_ops *cdo = cdi->ops;
 
-	buf = kmalloc(8, GFP_KERNEL);
-	if (!buf)
-		return -ENOMEM;
-
-	init_cdrom_command(&cgc, buf, 8, CGC_DATA_READ);
+	init_cdrom_command(&cgc, buf, sizeof(buf), CGC_DATA_READ);
 	cgc.cmd[0] = GPCMD_READ_DVD_STRUCTURE;
 	cgc.cmd[6] = s->copyright.layer_num;
 	cgc.cmd[7] = s->type;
@@ -1854,15 +1783,12 @@ static int dvd_read_copyright(struct cdrom_device_info *cdi, dvd_struct *s)
 	cgc.cmd[9] = cgc.buflen & 0xff;
 
 	if ((ret = cdo->generic_packet(cdi, &cgc)))
-		goto err;
+		return ret;
 
 	s->copyright.cpst = buf[4];
 	s->copyright.rmi = buf[5];
 
-	ret = 0;
-err:
-	kfree(buf);
-	return ret;
+	return 0;
 }
 
 static int dvd_read_disckey(struct cdrom_device_info *cdi, dvd_struct *s)
@@ -1894,33 +1820,26 @@ static int dvd_read_disckey(struct cdrom_device_info *cdi, dvd_struct *s)
 static int dvd_read_bca(struct cdrom_device_info *cdi, dvd_struct *s)
 {
 	int ret;
-	u_char *buf;
+	u_char buf[4 + 188];
 	struct packet_command cgc;
 	struct cdrom_device_ops *cdo = cdi->ops;
 
-	buf = kmalloc(4 + 188, GFP_KERNEL);
-	if (!buf)
-		return -ENOMEM;
-
-	init_cdrom_command(&cgc, buf, 4 + 188, CGC_DATA_READ);
+	init_cdrom_command(&cgc, buf, sizeof(buf), CGC_DATA_READ);
 	cgc.cmd[0] = GPCMD_READ_DVD_STRUCTURE;
 	cgc.cmd[7] = s->type;
 	cgc.cmd[9] = cgc.buflen & 0xff;
 
 	if ((ret = cdo->generic_packet(cdi, &cgc)))
-		goto err;
+		return ret;
 
 	s->bca.len = buf[0] << 8 | buf[1];
 	if (s->bca.len < 12 || s->bca.len > 188) {
 		cdinfo(CD_WARNING, "Received invalid BCA length (%d)\n", s->bca.len);
-		ret = -EIO;
-		goto err;
+		return -EIO;
 	}
 	memcpy(s->bca.value, &buf[4], s->bca.len);
-	ret = 0;
-err:
-	kfree(buf);
-	return ret;
+
+	return 0;
 }
 
 static int dvd_read_manufact(struct cdrom_device_info *cdi, dvd_struct *s)
@@ -2020,13 +1939,9 @@ static int cdrom_read_subchannel(struct cdrom_device_info *cdi,
 {
 	struct cdrom_device_ops *cdo = cdi->ops;
 	struct packet_command cgc;
-	char *buffer;
+	char buffer[32];
 	int ret;
 
-	buffer = kmalloc(32, GFP_KERNEL);
-	if (!buffer)
-		return -ENOMEM;
-
 	init_cdrom_command(&cgc, buffer, 16, CGC_DATA_READ);
 	cgc.cmd[0] = GPCMD_READ_SUBCHANNEL;
 	cgc.cmd[1] = 2;     /* MSF addressing */
@@ -2035,7 +1950,7 @@ static int cdrom_read_subchannel(struct cdrom_device_info *cdi,
 	cgc.cmd[8] = 16;
 
 	if ((ret = cdo->generic_packet(cdi, &cgc)))
-		goto err;
+		return ret;
 
 	subchnl->cdsc_audiostatus = cgc.buffer[1];
 	subchnl->cdsc_format = CDROM_MSF;
@@ -2050,10 +1965,7 @@ static int cdrom_read_subchannel(struct cdrom_device_info *cdi,
 	subchnl->cdsc_absaddr.msf.second = cgc.buffer[10];
 	subchnl->cdsc_absaddr.msf.frame = cgc.buffer[11];
 
-	ret = 0;
-err:
-	kfree(buffer);
-	return ret;
+	return 0;
 }
 
 /*
-- 
1.5.4.2


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* Re: [PATCH 4/4] cdrom: revert commit 22a9189 (cdrom: use kmalloced buffers instead of buffers on stack)
  2008-05-20  4:58       ` [PATCH 4/4] cdrom: revert commit 22a9189 (cdrom: use kmalloced buffers instead of buffers on stack) FUJITA Tomonori
@ 2008-05-20  7:33         ` Elias Oltmanns
  2008-05-20  7:42           ` Jim Paris
  0 siblings, 1 reply; 45+ messages in thread
From: Elias Oltmanns @ 2008-05-20  7:33 UTC (permalink / raw)
  To: FUJITA Tomonori
  Cc: linux-scsi, linux-ide, jens.axboe, tsbogend, bzolnier,
	James.Bottomley, jeff, akpm, davem

FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> wrote:
> The commit 22a9189fd073db3d03a4cf8b8c098aa207602de1 (cdrom: use
> kmalloced buffers instead of buffers on stack) is introduced to use
> kmalloced buffers for packet commands to avoid stack corruption on non
> coherent platforms.
>
> SCSI cdrom uses blk_rq_map_kern (the commit
> 68154e90c9d1492d570671ae181d9a8f8530da55) post 2.6.25. So sr_packet
> uses allocated pages when necessary. IDE cdrom has a mechnism to
> handle alignment and padding. So we don't need this extra complexitiy
> in cdrom.c.
>
> Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
> Cc: Jens Axboe <jens.axboe@oracle.com>
[...]
> @@ -837,14 +796,10 @@ static int cdrom_mrw_open_write(struct cdrom_device_info *cdi)
>  static int mo_open_write(struct cdrom_device_info *cdi)
>  {
>  	struct packet_command cgc;
> -	char *buffer;
> +	char buffer[255];
>  	int ret;
>  
> -	buffer = kmalloc(255, GFP_KERNEL);
> -	if (!buffer)
> -		return -ENOMEM;
> -
> -	init_cdrom_command(&cgc, buffer, 4, CGC_DATA_READ);
> +	init_cdrom_command(&cgc, &buffer, 4, CGC_DATA_READ);
				 ^
Careful here.

Regards,

Elias

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 4/4] cdrom: revert commit 22a9189 (cdrom: use kmalloced buffers instead of buffers on stack)
  2008-05-20  7:33         ` Elias Oltmanns
@ 2008-05-20  7:42           ` Jim Paris
  2008-05-20  8:14             ` Elias Oltmanns
  0 siblings, 1 reply; 45+ messages in thread
From: Jim Paris @ 2008-05-20  7:42 UTC (permalink / raw)
  To: Elias Oltmanns
  Cc: FUJITA Tomonori, linux-scsi, linux-ide, jens.axboe, tsbogend,
	bzolnier, James.Bottomley, jeff, akpm, davem

Elias Oltmanns wrote:
> > +	char buffer[255];
> >  	int ret;
> >  
> > -	buffer = kmalloc(255, GFP_KERNEL);
> > -	if (!buffer)
> > -		return -ENOMEM;
> > -
> > -	init_cdrom_command(&cgc, buffer, 4, CGC_DATA_READ);
> > +	init_cdrom_command(&cgc, &buffer, 4, CGC_DATA_READ);
> 				 ^
> Careful here.

They're the same in C.

-jim

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 4/4] cdrom: revert commit 22a9189 (cdrom: use kmalloced buffers instead of buffers on stack)
  2008-05-20  7:42           ` Jim Paris
@ 2008-05-20  8:14             ` Elias Oltmanns
  0 siblings, 0 replies; 45+ messages in thread
From: Elias Oltmanns @ 2008-05-20  8:14 UTC (permalink / raw)
  To: Jim Paris
  Cc: FUJITA Tomonori, linux-scsi, linux-ide, jens.axboe, tsbogend,
	bzolnier, James.Bottomley, jeff, akpm, davem

Jim Paris <jim@jtan.com> wrote:
> Elias Oltmanns wrote:
>> > +	char buffer[255];
>
>> >  	int ret;
>> >  
>> > -	buffer = kmalloc(255, GFP_KERNEL);
>> > -	if (!buffer)
>> > -		return -ENOMEM;
>> > -
>> > -	init_cdrom_command(&cgc, buffer, 4, CGC_DATA_READ);
>> > +	init_cdrom_command(&cgc, &buffer, 4, CGC_DATA_READ);
>> 				 ^
>> Careful here.
>
> They're the same in C.

Oh yes, I see. Sorry for the noise.

Regards,

Elias

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 3/4] ide: use the dma safe check for REQ_TYPE_ATA_PC
  2008-05-20  4:58     ` [PATCH 3/4] ide: use the dma safe check for REQ_TYPE_ATA_PC FUJITA Tomonori
  2008-05-20  4:58       ` [PATCH 4/4] cdrom: revert commit 22a9189 (cdrom: use kmalloced buffers instead of buffers on stack) FUJITA Tomonori
@ 2008-05-28  6:43       ` Borislav Petkov
  1 sibling, 0 replies; 45+ messages in thread
From: Borislav Petkov @ 2008-05-28  6:43 UTC (permalink / raw)
  To: FUJITA Tomonori
  Cc: linux-scsi, linux-ide, jens.axboe, tsbogend, bzolnier,
	James.Bottomley, jeff, akpm, davem

On Tue, May 20, 2008 at 01:58:33PM +0900, FUJITA Tomonori wrote:
> This uses the dma safe check for REQ_TYPE_ATA_PC. The dma safe check
> is used for only sg requests but it should be used for other non fs
> commands.
> 
> This uses blk_queue_update_dma_pad to make the intention clear though
> ide don't use the blk APIs so it doesn't change anything.
> 
> Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
> Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
> ---
>  drivers/ide/ide-cd.c |   17 ++++++++++++-----
>  1 files changed, 12 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/ide/ide-cd.c b/drivers/ide/ide-cd.c
> index 68e7f19..1d8d510 100644
> --- a/drivers/ide/ide-cd.c
> +++ b/drivers/ide/ide-cd.c
> @@ -1178,10 +1178,15 @@ static ide_startstop_t cdrom_do_block_pc(ide_drive_t *drive, struct request *rq)
>  	info->dma = 0;
>  
>  	/* sg request */
> -	if (rq->bio) {
> -		int mask = drive->queue->dma_alignment;
> -		unsigned long addr =
> -			(unsigned long)page_address(bio_page(rq->bio));
> +	if (rq->bio || ((rq->cmd_type == REQ_TYPE_ATA_PC) && rq->data_len)) {
> +		struct request_queue *q = drive->queue;
> +		unsigned int alignment;
> +		unsigned long addr;
> +
> +		if (rq->bio)
> +			addr = (unsigned long)bio_data(rq->bio);
> +		else
> +			addr = (unsigned long)rq->data;
>  
>  		info->dma = drive->using_dma;
>  
> @@ -1191,7 +1196,8 @@ static ide_startstop_t cdrom_do_block_pc(ide_drive_t *drive, struct request *rq)
>  		 * NOTE! The "len" and "addr" checks should possibly have
>  		 * separate masks.
>  		 */
> -		if ((rq->data_len & 15) || (addr & mask))
> +		alignment = queue_dma_alignment(q) | q->dma_pad_mask;
> +		if (addr & alignment || rq->data_len & alignment)
>  			info->dma = 0;
>  	}
>  
> @@ -1869,6 +1875,7 @@ static int ide_cdrom_setup(ide_drive_t *drive)
>  
>  	blk_queue_prep_rq(drive->queue, ide_cdrom_prep_fn);
>  	blk_queue_dma_alignment(drive->queue, 31);
> +	blk_queue_update_dma_pad(drive->queue, 15);
>  	drive->queue->unplug_delay = (1 * HZ) / 1000;
>  	if (!drive->queue->unplug_delay)
>  		drive->queue->unplug_delay = 1;

Looks good to me.

Acked-by: Borislav Petkov <petkovbb@gmail.com>
-- 
Regards/Gruß,
    Boris.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 1/4] block: use ARCH_KMALLOC_MINALIGN as the default dma pad mask
  2008-05-20  4:58 ` [PATCH 1/4] block: use ARCH_KMALLOC_MINALIGN as the default dma pad mask FUJITA Tomonori
  2008-05-20  4:58   ` [PATCH 2/4] block: add blk_queue_update_dma_pad FUJITA Tomonori
@ 2008-05-20  9:31   ` Andrew Morton
  2008-05-20  9:38     ` Herbert Xu
  2008-05-20  9:55     ` Paul Mundt
  1 sibling, 2 replies; 45+ messages in thread
From: Andrew Morton @ 2008-05-20  9:31 UTC (permalink / raw)
  To: FUJITA Tomonori
  Cc: linux-scsi, linux-ide, jens.axboe, tsbogend, bzolnier,
	James.Bottomley, jeff, davem, linux-mm, Herbert Xu

On Tue, 20 May 2008 13:58:31 +0900 FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> wrote:

> This sets the default dma pad mask to ARCH_KMALLOC_MINALIGN in
> blk_queue_make_request(). It affects only non-coherent platforms.
> 
> Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
> Cc: Jens Axboe <jens.axboe@oracle.com>
> ---
>  block/blk-settings.c |    6 ++++++
>  1 files changed, 6 insertions(+), 0 deletions(-)
> 
> diff --git a/block/blk-settings.c b/block/blk-settings.c
> index 8dd8641..781d1bf 100644
> --- a/block/blk-settings.c
> +++ b/block/blk-settings.c
> @@ -84,6 +84,11 @@ EXPORT_SYMBOL(blk_queue_softirq_done);
>   **/
>  void blk_queue_make_request(struct request_queue *q, make_request_fn *mfn)
>  {
> +#ifndef ARCH_KMALLOC_MINALIGN
> +#define ARCH_KMALLOC_MINALIGN 1
> +#endif
> +	unsigned int min_align = ARCH_KMALLOC_MINALIGN;
> +
>  	/*
>  	 * set defaults
>  	 */
> @@ -98,6 +103,7 @@ void blk_queue_make_request(struct request_queue *q, make_request_fn *mfn)
>  	blk_queue_max_sectors(q, SAFE_MAX_SECTORS);
>  	blk_queue_hardsect_size(q, 512);
>  	blk_queue_dma_alignment(q, 511);
> +	blk_queue_dma_pad(q, min_align - 1);
>  	blk_queue_congestion_threshold(q);
>  	q->nr_batching = BLK_BATCH_REQ;

urgh.  This ARCH_KMALLOC_MINALIGN thing has the smell of an expedient
hack which is now growing.

Look at what crypto did (which seems to be a lot worse):

/*
 * The macro CRYPTO_MINALIGN_ATTR (along with the void * type in the actual
 * declaration) is used to ensure that the crypto_tfm context structure is
 * aligned correctly for the given architecture so that there are no alignment
 * faults for C data types.  In particular, this is required on platforms such
 * as arm where pointers are 32-bit aligned but there are data types such as
 * u64 which require 64-bit alignment.
 */
#if defined(ARCH_KMALLOC_MINALIGN)
#define CRYPTO_MINALIGN ARCH_KMALLOC_MINALIGN
#elif defined(ARCH_SLAB_MINALIGN)
#define CRYPTO_MINALIGN ARCH_SLAB_MINALIGN
#else
#define CRYPTO_MINALIGN __alignof__(unsigned long long)
#endif

So here you're using it for "dma aligment" whereas crypto is using it
(or ARCH_SLAB_MINALIGN!) for "cpu 64-bit alignment".

Why does ARCH_KMALLOC_MINALIGN even exist?  What is its mandate?  Sigh.

It's not really related to your patch (although your patch compounds
the problem a little).  But we should sit down and work out what we
actually want to do here.  Something like:

In each architecture's arch/foo/Kconfig, define

	CONFIG_ARCH_DMA_ALIGN

and

	CONFIG_ARCH_64BIT_POINTER_ALIGN

and then use them.  Note that these have nothing to do with each other,
as far as I can tell.

Which leaves the question: "what should slab use"?  Maybe
CONFIG_ARCH_DMA_ALIGN?  But that depends what ARCH_KMALLOC_MINALIGN is
supposed to exist for.

ick.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 1/4] block: use ARCH_KMALLOC_MINALIGN as the default dma pad mask
  2008-05-20  9:31   ` [PATCH 1/4] block: use ARCH_KMALLOC_MINALIGN as the default dma pad mask Andrew Morton
@ 2008-05-20  9:38     ` Herbert Xu
  2008-05-20  9:52       ` Andrew Morton
  2008-05-20 13:25       ` FUJITA Tomonori
  2008-05-20  9:55     ` Paul Mundt
  1 sibling, 2 replies; 45+ messages in thread
From: Herbert Xu @ 2008-05-20  9:38 UTC (permalink / raw)
  To: Andrew Morton
  Cc: FUJITA Tomonori, linux-scsi, linux-ide, jens.axboe, tsbogend,
	bzolnier, James.Bottomley, jeff, davem, linux-mm

On Tue, May 20, 2008 at 02:31:29AM -0700, Andrew Morton wrote:
>
> So here you're using it for "dma aligment" whereas crypto is using it
> (or ARCH_SLAB_MINALIGN!) for "cpu 64-bit alignment".

No the 64-bit alignment is just an example.  The purpose of
CRYPTO_MINALIGN is pretty much the same as ARCH_KMALLOC_MINALIGN,
i.e., the minimum alignment guaranteed by kmalloc.  The only
reason it exists is because ARCH_KMALLOC_MINALIGN isn't defined
on all platforms.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 1/4] block: use ARCH_KMALLOC_MINALIGN as the default dma pad mask
  2008-05-20  9:38     ` Herbert Xu
@ 2008-05-20  9:52       ` Andrew Morton
  2008-05-20  9:58         ` FUJITA Tomonori
  2008-05-20 11:32         ` Herbert Xu
  2008-05-20 13:25       ` FUJITA Tomonori
  1 sibling, 2 replies; 45+ messages in thread
From: Andrew Morton @ 2008-05-20  9:52 UTC (permalink / raw)
  To: Herbert Xu
  Cc: FUJITA Tomonori, linux-scsi, linux-ide, jens.axboe, tsbogend,
	bzolnier, James.Bottomley, jeff, davem, linux-mm

On Tue, 20 May 2008 17:38:20 +0800 Herbert Xu <herbert@gondor.apana.org.au> wrote:

> On Tue, May 20, 2008 at 02:31:29AM -0700, Andrew Morton wrote:
> >
> > So here you're using it for "dma aligment" whereas crypto is using it
> > (or ARCH_SLAB_MINALIGN!) for "cpu 64-bit alignment".
> 
> No the 64-bit alignment is just an example.  The purpose of
> CRYPTO_MINALIGN is pretty much the same as ARCH_KMALLOC_MINALIGN,
> i.e., the minimum alignment guaranteed by kmalloc.  The only
> reason it exists is because ARCH_KMALLOC_MINALIGN isn't defined
> on all platforms.

I'm struggling to understand what you're saying here.

The comment you have there over the CRYPTO_MINALIGN definition is quite
specific.  Is it wrong?

Whether the mapping between CRYPTO_MINALIGN and ARCH_KMALLOC_MINALIGN
is abusive is (I find) hard to say, because first one would need to be
able to say what ARCH_KMALLOC_MINALIGN is for.  I expect it was for DMA
purposes.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 1/4] block: use ARCH_KMALLOC_MINALIGN as the default dma pad mask
  2008-05-20  9:52       ` Andrew Morton
@ 2008-05-20  9:58         ` FUJITA Tomonori
  2008-05-20 11:32         ` Herbert Xu
  1 sibling, 0 replies; 45+ messages in thread
From: FUJITA Tomonori @ 2008-05-20  9:58 UTC (permalink / raw)
  To: akpm
  Cc: herbert, fujita.tomonori, linux-scsi, linux-ide, jens.axboe,
	tsbogend, bzolnier, James.Bottomley, jeff, davem, linux-mm

On Tue, 20 May 2008 02:52:31 -0700
Andrew Morton <akpm@linux-foundation.org> wrote:

> On Tue, 20 May 2008 17:38:20 +0800 Herbert Xu <herbert@gondor.apana.org.au> wrote:
> 
> > On Tue, May 20, 2008 at 02:31:29AM -0700, Andrew Morton wrote:
> > >
> > > So here you're using it for "dma aligment" whereas crypto is using it
> > > (or ARCH_SLAB_MINALIGN!) for "cpu 64-bit alignment".
> > 
> > No the 64-bit alignment is just an example.  The purpose of
> > CRYPTO_MINALIGN is pretty much the same as ARCH_KMALLOC_MINALIGN,
> > i.e., the minimum alignment guaranteed by kmalloc.  The only
> > reason it exists is because ARCH_KMALLOC_MINALIGN isn't defined
> > on all platforms.
> 
> I'm struggling to understand what you're saying here.
> 
> The comment you have there over the CRYPTO_MINALIGN definition is quite
> specific.  Is it wrong?
> 
> Whether the mapping between CRYPTO_MINALIGN and ARCH_KMALLOC_MINALIGN
> is abusive is (I find) hard to say, because first one would need to be
> able to say what ARCH_KMALLOC_MINALIGN is for.  I expect it was for DMA
> purposes.

Yeah, ARCH_KMALLOC_MINALIGN is for DMA. Non coherent platforms define
it.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 1/4] block: use ARCH_KMALLOC_MINALIGN as the default dma pad mask
  2008-05-20  9:52       ` Andrew Morton
  2008-05-20  9:58         ` FUJITA Tomonori
@ 2008-05-20 11:32         ` Herbert Xu
  1 sibling, 0 replies; 45+ messages in thread
From: Herbert Xu @ 2008-05-20 11:32 UTC (permalink / raw)
  To: Andrew Morton
  Cc: FUJITA Tomonori, linux-scsi, linux-ide, jens.axboe, tsbogend,
	bzolnier, James.Bottomley, jeff, davem, linux-mm

On Tue, May 20, 2008 at 02:52:31AM -0700, Andrew Morton wrote:
> 
> The comment you have there over the CRYPTO_MINALIGN definition is quite
> specific.  Is it wrong?

No it's not wrong, but the comment isn't about CRYPTO_MINALIGN.
The comment is talking about CRYPTO_MINALIGN_ATTR, which uses
CRYPTO_MINALIGN.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 1/4] block: use ARCH_KMALLOC_MINALIGN as the default dma pad mask
  2008-05-20  9:38     ` Herbert Xu
  2008-05-20  9:52       ` Andrew Morton
@ 2008-05-20 13:25       ` FUJITA Tomonori
  2008-05-20 15:34         ` Herbert Xu
  1 sibling, 1 reply; 45+ messages in thread
From: FUJITA Tomonori @ 2008-05-20 13:25 UTC (permalink / raw)
  To: herbert
  Cc: akpm, fujita.tomonori, linux-scsi, linux-ide, jens.axboe,
	tsbogend, bzolnier, James.Bottomley, jeff, davem, linux-mm

On Tue, 20 May 2008 17:38:20 +0800
Herbert Xu <herbert@gondor.apana.org.au> wrote:

> On Tue, May 20, 2008 at 02:31:29AM -0700, Andrew Morton wrote:
> >
> > So here you're using it for "dma aligment" whereas crypto is using it
> > (or ARCH_SLAB_MINALIGN!) for "cpu 64-bit alignment".
> 
> No the 64-bit alignment is just an example.  The purpose of
> CRYPTO_MINALIGN is pretty much the same as ARCH_KMALLOC_MINALIGN,
> i.e., the minimum alignment guaranteed by kmalloc.  The only
> reason it exists is because ARCH_KMALLOC_MINALIGN isn't defined
> on all platforms.

struct ablkcipher_request {
	struct crypto_async_request base;

	unsigned int nbytes;

	void *info;

	struct scatterlist *src;
	struct scatterlist *dst;

	void *__ctx[] CRYPTO_MINALIGN_ATTR;
};

Does someone do DMA from/to __ctx?

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 1/4] block: use ARCH_KMALLOC_MINALIGN as the default dma pad mask
  2008-05-20 13:25       ` FUJITA Tomonori
@ 2008-05-20 15:34         ` Herbert Xu
  2008-05-20 15:53           ` Plans for libsas and SAS-2? Marushak, Nathan
  2008-05-20 16:09           ` [PATCH 1/4] block: use ARCH_KMALLOC_MINALIGN as the default dma pad mask FUJITA Tomonori
  0 siblings, 2 replies; 45+ messages in thread
From: Herbert Xu @ 2008-05-20 15:34 UTC (permalink / raw)
  To: FUJITA Tomonori
  Cc: akpm, linux-scsi, linux-ide, jens.axboe, tsbogend, bzolnier,
	James.Bottomley, jeff, davem, linux-mm

On Tue, May 20, 2008 at 10:25:25PM +0900, FUJITA Tomonori wrote:
>
> Does someone do DMA from/to __ctx?

Nobody.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Plans for libsas and SAS-2?
  2008-05-20 15:34         ` Herbert Xu
@ 2008-05-20 15:53           ` Marushak, Nathan
  2008-05-20 16:09           ` [PATCH 1/4] block: use ARCH_KMALLOC_MINALIGN as the default dma pad mask FUJITA Tomonori
  1 sibling, 0 replies; 45+ messages in thread
From: Marushak, Nathan @ 2008-05-20 15:53 UTC (permalink / raw)
  To: linux-scsi

Are there plans to add SAS-2 support (e.g. zoning, new broadcast
primitives, etc.) to libsas?

Thanks,
Nate

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 1/4] block: use ARCH_KMALLOC_MINALIGN as the default dma pad mask
  2008-05-20 15:34         ` Herbert Xu
  2008-05-20 15:53           ` Plans for libsas and SAS-2? Marushak, Nathan
@ 2008-05-20 16:09           ` FUJITA Tomonori
  2008-05-21  1:26             ` Herbert Xu
  1 sibling, 1 reply; 45+ messages in thread
From: FUJITA Tomonori @ 2008-05-20 16:09 UTC (permalink / raw)
  To: herbert
  Cc: fujita.tomonori, akpm, linux-scsi, linux-ide, jens.axboe,
	tsbogend, bzolnier, James.Bottomley, jeff, davem, linux-mm

On Tue, 20 May 2008 23:34:24 +0800
Herbert Xu <herbert@gondor.apana.org.au> wrote:

> On Tue, May 20, 2008 at 10:25:25PM +0900, FUJITA Tomonori wrote:
> >
> > Does someone do DMA from/to __ctx?
> 
> Nobody.

Then, you don't need to use ARCH_KMALLOC_MINALIGN. 8 bytes align works
for you on all the architectures.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 1/4] block: use ARCH_KMALLOC_MINALIGN as the default dma pad mask
  2008-05-20 16:09           ` [PATCH 1/4] block: use ARCH_KMALLOC_MINALIGN as the default dma pad mask FUJITA Tomonori
@ 2008-05-21  1:26             ` Herbert Xu
  2008-05-21  1:36               ` FUJITA Tomonori
  0 siblings, 1 reply; 45+ messages in thread
From: Herbert Xu @ 2008-05-21  1:26 UTC (permalink / raw)
  To: FUJITA Tomonori
  Cc: akpm, linux-scsi, linux-ide, jens.axboe, tsbogend, bzolnier,
	James.Bottomley, jeff, davem, linux-mm

On Wed, May 21, 2008 at 01:09:41AM +0900, FUJITA Tomonori wrote:
>
> Then, you don't need to use ARCH_KMALLOC_MINALIGN. 8 bytes align works
> for you on all the architectures.

DMA isn't the only thing that requires alignment.  The CPU needs
it too.  Also using a constant like 8 is broken because if we used
a value larger than the alignment guaranteed by kmalloc then the
context may end up unaligned.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 1/4] block: use ARCH_KMALLOC_MINALIGN as the default dma pad mask
  2008-05-21  1:26             ` Herbert Xu
@ 2008-05-21  1:36               ` FUJITA Tomonori
  2008-05-21  3:16                 ` Herbert Xu
  0 siblings, 1 reply; 45+ messages in thread
From: FUJITA Tomonori @ 2008-05-21  1:36 UTC (permalink / raw)
  To: herbert
  Cc: fujita.tomonori, akpm, linux-scsi, linux-ide, jens.axboe,
	tsbogend, bzolnier, James.Bottomley, jeff, davem, linux-mm

On Wed, 21 May 2008 09:26:22 +0800
Herbert Xu <herbert@gondor.apana.org.au> wrote:

> On Wed, May 21, 2008 at 01:09:41AM +0900, FUJITA Tomonori wrote:
> >
> > Then, you don't need to use ARCH_KMALLOC_MINALIGN. 8 bytes align works
> > for you on all the architectures.
> 
> DMA isn't the only thing that requires alignment.  The CPU needs
> it too.  Also using a constant like 8 is broken because if we used
> a value larger than the alignment guaranteed by kmalloc then the
> context may end up unaligned.

ARCH_KMALLOC_MINALIGN represents DMA alignment since we guarantee
kmalloced buffers can be used for DMA.

Only non coherent architecutures defines ARCH_KMALLOC_MINALIGN to the
cache line size since an DMA'able object within a structure isn't
sharing a cache line with some other object (note that it not about
only alignment).

For your case, the alignment requirement for a pointer is appropriate
(it's about the CPU alignment requirement that you talk about. 8 bytes
alignment always works).

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 1/4] block: use ARCH_KMALLOC_MINALIGN as the default dma pad mask
  2008-05-21  1:36               ` FUJITA Tomonori
@ 2008-05-21  3:16                 ` Herbert Xu
  2008-05-21  6:54                   ` FUJITA Tomonori
  0 siblings, 1 reply; 45+ messages in thread
From: Herbert Xu @ 2008-05-21  3:16 UTC (permalink / raw)
  To: FUJITA Tomonori
  Cc: akpm, linux-scsi, linux-ide, jens.axboe, tsbogend, bzolnier,
	James.Bottomley, jeff, davem, linux-mm

On Wed, May 21, 2008 at 10:36:51AM +0900, FUJITA Tomonori wrote:
>
> ARCH_KMALLOC_MINALIGN represents DMA alignment since we guarantee
> kmalloced buffers can be used for DMA.

That may be why it was created, but that is not its only application.
In particular, it forms part of the calculation of the minimum
alignment guaranteed by kmalloc which is why it's used in crpyto.

Of course, if some kind soul would move this calculation into a
header file then we wouldn't be having this discussion.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 1/4] block: use ARCH_KMALLOC_MINALIGN as the default dma pad mask
  2008-05-21  3:16                 ` Herbert Xu
@ 2008-05-21  6:54                   ` FUJITA Tomonori
  2008-05-21  8:47                     ` Herbert Xu
  0 siblings, 1 reply; 45+ messages in thread
From: FUJITA Tomonori @ 2008-05-21  6:54 UTC (permalink / raw)
  To: herbert
  Cc: fujita.tomonori, akpm, linux-scsi, linux-ide, jens.axboe,
	tsbogend, bzolnier, James.Bottomley, jeff, davem, linux-mm

On Wed, 21 May 2008 11:16:46 +0800
Herbert Xu <herbert@gondor.apana.org.au> wrote:

> On Wed, May 21, 2008 at 10:36:51AM +0900, FUJITA Tomonori wrote:
> >
> > ARCH_KMALLOC_MINALIGN represents DMA alignment since we guarantee
> > kmalloced buffers can be used for DMA.
> 
> That may be why it was created, but that is not its only application.

Currently, it's only applicaiton.


> In particular, it forms part of the calculation of the minimum
> alignment guaranteed by kmalloc which is why it's used in crpyto.
> 
> Of course, if some kind soul would move this calculation into a
> header file then we wouldn't be having this discussion.

As explained, with the current way we define ARCH_KMALLOC_MINALIGN,
crypto doesn't need to use it. But to make it clear, we had better
clean up these defines, such as renaming it an appropriate name like
ARCH_DMA_ALIGN.

I'll send patches shortly.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 1/4] block: use ARCH_KMALLOC_MINALIGN as the default dma pad mask
  2008-05-21  6:54                   ` FUJITA Tomonori
@ 2008-05-21  8:47                     ` Herbert Xu
  2008-05-21  9:34                       ` FUJITA Tomonori
  0 siblings, 1 reply; 45+ messages in thread
From: Herbert Xu @ 2008-05-21  8:47 UTC (permalink / raw)
  To: FUJITA Tomonori
  Cc: akpm, linux-scsi, linux-ide, jens.axboe, tsbogend, bzolnier,
	James.Bottomley, jeff, davem, linux-mm

On Wed, May 21, 2008 at 03:54:14PM +0900, FUJITA Tomonori wrote:
>
> As explained, with the current way we define ARCH_KMALLOC_MINALIGN,
> crypto doesn't need to use it. But to make it clear, we had better
> clean up these defines, such as renaming it an appropriate name like
> ARCH_DMA_ALIGN.

No you don't understand the way crypto is using it.  We need to
know exactly the minimum alignment guaranteed by kmalloc.  Too much
or too little are both buggy.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 1/4] block: use ARCH_KMALLOC_MINALIGN as the default dma pad mask
  2008-05-21  8:47                     ` Herbert Xu
@ 2008-05-21  9:34                       ` FUJITA Tomonori
  2008-05-21 10:05                         ` Herbert Xu
  0 siblings, 1 reply; 45+ messages in thread
From: FUJITA Tomonori @ 2008-05-21  9:34 UTC (permalink / raw)
  To: herbert
  Cc: fujita.tomonori, akpm, linux-scsi, linux-ide, jens.axboe,
	tsbogend, bzolnier, James.Bottomley, jeff, davem, linux-mm

On Wed, 21 May 2008 16:47:00 +0800
Herbert Xu <herbert@gondor.apana.org.au> wrote:

> On Wed, May 21, 2008 at 03:54:14PM +0900, FUJITA Tomonori wrote:
> >
> > As explained, with the current way we define ARCH_KMALLOC_MINALIGN,
> > crypto doesn't need to use it. But to make it clear, we had better
> > clean up these defines, such as renaming it an appropriate name like
> > ARCH_DMA_ALIGN.
> 
> No you don't understand the way crypto is using it.  We need to
> know exactly the minimum alignment guaranteed by kmalloc.  Too much
> or too little are both buggy.

Why do crypto need to know exactly the minimum alignment guaranteed by
kmalloc? Can you tell me an example how the alignment breaks crypto?

For me, the way crypto use it is idential to what the hostdata in
struct Scsi_Host does.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 1/4] block: use ARCH_KMALLOC_MINALIGN as the default dma pad mask
  2008-05-21  9:34                       ` FUJITA Tomonori
@ 2008-05-21 10:05                         ` Herbert Xu
  2008-05-21 11:01                           ` FUJITA Tomonori
  0 siblings, 1 reply; 45+ messages in thread
From: Herbert Xu @ 2008-05-21 10:05 UTC (permalink / raw)
  To: FUJITA Tomonori
  Cc: akpm, linux-scsi, linux-ide, jens.axboe, tsbogend, bzolnier,
	James.Bottomley, jeff, davem, linux-mm

On Wed, May 21, 2008 at 06:34:45PM +0900, FUJITA Tomonori wrote:
>
> Why do crypto need to know exactly the minimum alignment guaranteed by
> kmalloc? Can you tell me an example how the alignment breaks crypto?

It's used to make the context aligned so that for most algorithms
we can get to the context without going through ALIGN_PTR.  Only
algorithms requiring alignments bigger than that offered by kmalloc
would have to use ALIGN_PTR.  This is important because the context
is used on the fast path, i.e., for AES every block has to access
the context.

If we used an alignment value is bigger than that guaranteed by
kmalloc then this would break because the context may end up
unaligned.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 1/4] block: use ARCH_KMALLOC_MINALIGN as the default dma pad mask
  2008-05-21 10:05                         ` Herbert Xu
@ 2008-05-21 11:01                           ` FUJITA Tomonori
  2008-05-21 11:25                             ` Herbert Xu
  0 siblings, 1 reply; 45+ messages in thread
From: FUJITA Tomonori @ 2008-05-21 11:01 UTC (permalink / raw)
  To: herbert
  Cc: fujita.tomonori, akpm, linux-scsi, linux-ide, jens.axboe,
	tsbogend, bzolnier, James.Bottomley, jeff, davem, linux-mm

On Wed, 21 May 2008 18:05:29 +0800
Herbert Xu <herbert@gondor.apana.org.au> wrote:

> On Wed, May 21, 2008 at 06:34:45PM +0900, FUJITA Tomonori wrote:
> >
> > Why do crypto need to know exactly the minimum alignment guaranteed by
> > kmalloc? Can you tell me an example how the alignment breaks crypto?
> 
> It's used to make the context aligned so that for most algorithms
> we can get to the context without going through ALIGN_PTR.  Only

How many bytes does the context need to be aligned?

I'm still not sure what you mean. You referred to aes, so let me use
aes as an example.

I think 'we can get to the context' means that accessing to
crypto_aes_ctx from struct crypto_tfm like this:

struct crypto_aes_ctx *ctx = crypto_tfm_ctx(tfm);


struct crypto_tfm and crypto_tfm_ctx are defined as:

struct crypto_tfm {

	u32 crt_flags;

	union {
		struct ablkcipher_tfm ablkcipher;
		struct aead_tfm aead;
		struct blkcipher_tfm blkcipher;
		struct cipher_tfm cipher;
		struct hash_tfm hash;
		struct compress_tfm compress;
	} crt_u;

	struct crypto_alg *__crt_alg;

	void *__crt_ctx[] CRYPTO_MINALIGN_ATTR;
};

static inline void *crypto_tfm_ctx(struct crypto_tfm *tfm)
{
	return tfm->__crt_ctx;
}

struct crypto_aes_ctx is placed right after struct crypto_tfm.

My question is why __crt_ctx needs ARCH_KMALLOC_MINALIGN alignment,
e.g., could be 128 bytes.


> algorithms requiring alignments bigger than that offered by kmalloc
> would have to use ALIGN_PTR.  This is important because the context
> is used on the fast path, i.e., for AES every block has to access
> the context.

Why do algorithms require alignments bigger than ARCH_KMALLOC_MINALIGN?


> If we used an alignment value is bigger than that guaranteed by
> kmalloc then this would break because the context may end up
> unaligned.
> 
> Cheers,
> -- 
> Visit Openswan at http://www.openswan.org/
> Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
> Home Page: http://gondor.apana.org.au/~herbert/
> PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 1/4] block: use ARCH_KMALLOC_MINALIGN as the default dma pad mask
  2008-05-21 11:01                           ` FUJITA Tomonori
@ 2008-05-21 11:25                             ` Herbert Xu
  2008-05-21 12:09                               ` FUJITA Tomonori
  0 siblings, 1 reply; 45+ messages in thread
From: Herbert Xu @ 2008-05-21 11:25 UTC (permalink / raw)
  To: FUJITA Tomonori
  Cc: akpm, linux-scsi, linux-ide, jens.axboe, tsbogend, bzolnier,
	James.Bottomley, jeff, davem, linux-mm

On Wed, May 21, 2008 at 08:01:12PM +0900, FUJITA Tomonori wrote:
>
> Why do algorithms require alignments bigger than ARCH_KMALLOC_MINALIGN?

Because the hardware may require it.  For example, the VIA Padlock
will fault unless the buffers are 16-byte aligned (it being an
x86-32 platform).

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 1/4] block: use ARCH_KMALLOC_MINALIGN as the default dma pad mask
  2008-05-21 11:25                             ` Herbert Xu
@ 2008-05-21 12:09                               ` FUJITA Tomonori
  2008-05-21 12:22                                 ` Herbert Xu
  0 siblings, 1 reply; 45+ messages in thread
From: FUJITA Tomonori @ 2008-05-21 12:09 UTC (permalink / raw)
  To: herbert
  Cc: fujita.tomonori, akpm, linux-scsi, linux-ide, jens.axboe,
	tsbogend, bzolnier, James.Bottomley, jeff, davem, linux-mm

On Wed, 21 May 2008 19:25:54 +0800
Herbert Xu <herbert@gondor.apana.org.au> wrote:

> On Wed, May 21, 2008 at 08:01:12PM +0900, FUJITA Tomonori wrote:
> >
> > Why do algorithms require alignments bigger than ARCH_KMALLOC_MINALIGN?
> 
> Because the hardware may require it.  For example, the VIA Padlock
> will fault unless the buffers are 16-byte aligned (it being an
> x86-32 platform).

OK, thanks. So it's about hardware requrement. Let me make sure if I
understand crypto alignment issue.

__crt_ctx needs ARCH_KMALLOC_MINALIGN alignment only because of crypto
hardware. If I misunderstand it, can you answer my question in the
previous mail (it's the part that you cut)? That is, why does
__crt_ctx need ARCH_KMALLOC_MINALIGN alignment with software
algorithms.

The VIA Padlock likes 16-byte aligned __crt_ctx. On x86-32 platform,
ARCH_KMALLOC_MINALIGN is not defined, so __crt_ctx is 8-byte
aligned. struct aes_ctx of The VIA Padlock may not be aligned so you
may need ALIGN hack every time.

But ARCH_KMALLOC_MINALIGN is 128 bytes on some architectures. In this
case, __crt_ctx is 128-byte aligned and struct aes_ctx of The VIA
Padlock is guaranteed to be aligned nicely.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 1/4] block: use ARCH_KMALLOC_MINALIGN as the default dma pad mask
  2008-05-21 12:09                               ` FUJITA Tomonori
@ 2008-05-21 12:22                                 ` Herbert Xu
  2008-05-21 12:46                                   ` FUJITA Tomonori
  0 siblings, 1 reply; 45+ messages in thread
From: Herbert Xu @ 2008-05-21 12:22 UTC (permalink / raw)
  To: FUJITA Tomonori
  Cc: akpm, linux-scsi, linux-ide, jens.axboe, tsbogend, bzolnier,
	James.Bottomley, jeff, davem, linux-mm

On Wed, May 21, 2008 at 09:09:58PM +0900, FUJITA Tomonori wrote:
>
> OK, thanks. So it's about hardware requrement. Let me make sure if I
> understand crypto alignment issue.
> 
> __crt_ctx needs ARCH_KMALLOC_MINALIGN alignment only because of crypto
> hardware. If I misunderstand it, can you answer my question in the
> previous mail (it's the part that you cut)? That is, why does
> __crt_ctx need ARCH_KMALLOC_MINALIGN alignment with software
> algorithms.

Because the same structure is used for all algorithms!

Why is this so hard to understand?
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 1/4] block: use ARCH_KMALLOC_MINALIGN as the default dma pad mask
  2008-05-21 12:22                                 ` Herbert Xu
@ 2008-05-21 12:46                                   ` FUJITA Tomonori
  2008-05-21 12:55                                     ` FUJITA Tomonori
  2008-05-21 13:18                                     ` Herbert Xu
  0 siblings, 2 replies; 45+ messages in thread
From: FUJITA Tomonori @ 2008-05-21 12:46 UTC (permalink / raw)
  To: herbert
  Cc: fujita.tomonori, akpm, linux-scsi, linux-ide, jens.axboe,
	tsbogend, bzolnier, James.Bottomley, jeff, davem, linux-mm

On Wed, 21 May 2008 20:22:18 +0800
Herbert Xu <herbert@gondor.apana.org.au> wrote:

> On Wed, May 21, 2008 at 09:09:58PM +0900, FUJITA Tomonori wrote:
> >
> > OK, thanks. So it's about hardware requrement. Let me make sure if I
> > understand crypto alignment issue.
> > 
> > __crt_ctx needs ARCH_KMALLOC_MINALIGN alignment only because of crypto
> > hardware. If I misunderstand it, can you answer my question in the
> > previous mail (it's the part that you cut)? That is, why does
> > __crt_ctx need ARCH_KMALLOC_MINALIGN alignment with software
> > algorithms.
> 
> Because the same structure is used for all algorithms!

No, you misunderstand my question. I meant, software algorithms don't
need ARCH_KMALLOC_MINALIGN alignment for __crt_ctx and if we are fine
with using the ALIGN hack for crypto hardware every time (like
aes_ctx_common), crypto doesn't need ARCH_KMALLOC_MINALIGN alignment
for __crt_ctx. Is this right?

> 
> Why is this so hard to understand?

Because there are few architecture that defines
ARCH_KMALLOC_MINALIGN. So if crypto hardware needs alignement, it's
likely the hardware alignement is larger than __crt_ctx alignment. As
a result, you have to use ALIGN_PTR. So It's hard to understand using
ARCH_KMALLOC_MINALIGN here. I don't know about crypto hardware, but I
wonder if we can use a static alignment like 64 bytes here, which may
work for most of crypto hardware. Or if there are not many users of
crypto hardware, it may be fine to use ALIGN_PTR for the hardware.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 1/4] block: use ARCH_KMALLOC_MINALIGN as the default dma pad mask
  2008-05-21 12:46                                   ` FUJITA Tomonori
@ 2008-05-21 12:55                                     ` FUJITA Tomonori
  2008-05-21 13:19                                       ` Herbert Xu
  2008-05-21 13:18                                     ` Herbert Xu
  1 sibling, 1 reply; 45+ messages in thread
From: FUJITA Tomonori @ 2008-05-21 12:55 UTC (permalink / raw)
  To: fujita.tomonori
  Cc: herbert, akpm, linux-scsi, linux-ide, jens.axboe, tsbogend,
	bzolnier, James.Bottomley, jeff, davem, linux-mm

On Wed, 21 May 2008 21:46:24 +0900
FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> wrote:

> On Wed, 21 May 2008 20:22:18 +0800
> Herbert Xu <herbert@gondor.apana.org.au> wrote:
> 
> > On Wed, May 21, 2008 at 09:09:58PM +0900, FUJITA Tomonori wrote:
> > >
> > > OK, thanks. So it's about hardware requrement. Let me make sure if I
> > > understand crypto alignment issue.
> > > 
> > > __crt_ctx needs ARCH_KMALLOC_MINALIGN alignment only because of crypto
> > > hardware. If I misunderstand it, can you answer my question in the
> > > previous mail (it's the part that you cut)? That is, why does
> > > __crt_ctx need ARCH_KMALLOC_MINALIGN alignment with software
> > > algorithms.
> > 
> > Because the same structure is used for all algorithms!
> 
> No, you misunderstand my question. I meant, software algorithms don't
> need ARCH_KMALLOC_MINALIGN alignment for __crt_ctx and if we are fine
> with using the ALIGN hack for crypto hardware every time (like
> aes_ctx_common), crypto doesn't need ARCH_KMALLOC_MINALIGN alignment
> for __crt_ctx. Is this right?
> 
> 
> > 
> > Why is this so hard to understand?
> 
> Because there are few architecture that defines
> ARCH_KMALLOC_MINALIGN. So if crypto hardware needs alignement, it's
> likely the hardware alignement is larger than __crt_ctx alignment. As
> a result, you have to use ALIGN_PTR. So It's hard to understand using
> ARCH_KMALLOC_MINALIGN here. I don't know about crypto hardware, but I
> wonder if we can use a static alignment like 64 bytes here, which may
> work for most of crypto hardware. Or if there are not many users of

Oops, scratch the static alignment. It's impossible.


> crypto hardware, it may be fine to use ALIGN_PTR for the hardware.

I still wonder it's acceptable or not.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 1/4] block: use ARCH_KMALLOC_MINALIGN as the default dma pad mask
  2008-05-21 12:55                                     ` FUJITA Tomonori
@ 2008-05-21 13:19                                       ` Herbert Xu
  0 siblings, 0 replies; 45+ messages in thread
From: Herbert Xu @ 2008-05-21 13:19 UTC (permalink / raw)
  To: FUJITA Tomonori
  Cc: akpm, linux-scsi, linux-ide, jens.axboe, tsbogend, bzolnier,
	James.Bottomley, jeff, davem, linux-mm

On Wed, May 21, 2008 at 09:55:15PM +0900, FUJITA Tomonori wrote:
>
> > crypto hardware, it may be fine to use ALIGN_PTR for the hardware.
> 
> I still wonder it's acceptable or not.

Normally I would say yes.  But because the context poitner is
used on the most performance-critical path of the crypto API,
I'd rather not use it unless necessary.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 1/4] block: use ARCH_KMALLOC_MINALIGN as the default dma pad mask
  2008-05-21 12:46                                   ` FUJITA Tomonori
  2008-05-21 12:55                                     ` FUJITA Tomonori
@ 2008-05-21 13:18                                     ` Herbert Xu
  2008-05-22  1:14                                       ` FUJITA Tomonori
  1 sibling, 1 reply; 45+ messages in thread
From: Herbert Xu @ 2008-05-21 13:18 UTC (permalink / raw)
  To: FUJITA Tomonori
  Cc: akpm, linux-scsi, linux-ide, jens.axboe, tsbogend, bzolnier,
	James.Bottomley, jeff, davem, linux-mm

On Wed, May 21, 2008 at 09:46:24PM +0900, FUJITA Tomonori wrote:
>
> No, you misunderstand my question. I meant, software algorithms don't
> need ARCH_KMALLOC_MINALIGN alignment for __crt_ctx and if we are fine
> with using the ALIGN hack for crypto hardware every time (like
> aes_ctx_common), crypto doesn't need ARCH_KMALLOC_MINALIGN alignment
> for __crt_ctx. Is this right?

The padlock isn't the only hardware device that will require
such alignment.  Now that we have the async interface there will
be more.

> Because there are few architecture that defines
> ARCH_KMALLOC_MINALIGN. So if crypto hardware needs alignement, it's

You keep going back to ARCH_KMALLOC_MINALIGN.  But this has *nothing*
to do with ARCH_KMALLOC_MINALIGN.  The only reason it appears at
all in the crypto code is because it's one of the parameters used
to calculate the minimum alignment guaranteed by kmalloc.

If there were a macro KMALLOC_MINALIGN which did what its name says
then I'd gladly use it.

Cheeres,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 1/4] block: use ARCH_KMALLOC_MINALIGN as the default dma pad mask
  2008-05-21 13:18                                     ` Herbert Xu
@ 2008-05-22  1:14                                       ` FUJITA Tomonori
  2008-05-22  1:19                                         ` David Miller
  0 siblings, 1 reply; 45+ messages in thread
From: FUJITA Tomonori @ 2008-05-22  1:14 UTC (permalink / raw)
  To: herbert
  Cc: fujita.tomonori, akpm, linux-scsi, linux-ide, jens.axboe,
	tsbogend, bzolnier, James.Bottomley, jeff, davem, linux-mm

On Wed, 21 May 2008 21:18:11 +0800
Herbert Xu <herbert@gondor.apana.org.au> wrote:

> On Wed, May 21, 2008 at 09:46:24PM +0900, FUJITA Tomonori wrote:
> >
> > No, you misunderstand my question. I meant, software algorithms don't
> > need ARCH_KMALLOC_MINALIGN alignment for __crt_ctx and if we are fine
> > with using the ALIGN hack for crypto hardware every time (like
> > aes_ctx_common), crypto doesn't need ARCH_KMALLOC_MINALIGN alignment
> > for __crt_ctx. Is this right?
> 
> The padlock isn't the only hardware device that will require
> such alignment.  Now that we have the async interface there will
> be more.

Ok, so it's all about crypto hardware requirement. In other words, if
we accept for potential performance drop of crypto hardware, crypto
can drop this alignment.

> > Because there are few architecture that defines
> > ARCH_KMALLOC_MINALIGN. So if crypto hardware needs alignement, it's
> 
> You keep going back to ARCH_KMALLOC_MINALIGN.  But this has *nothing*
> to do with ARCH_KMALLOC_MINALIGN.  The only reason it appears at
> all in the crypto code is because it's one of the parameters used
> to calculate the minimum alignment guaranteed by kmalloc.

No, you misunderstand what I meant. I'm talking about the minimum
alignment guaranteed by kmalloc too.

What I'm trying to asking you, on the majority of architectures, the
minimum alignment guaranteed by kmalloc (8 bytes) is too small for
algorithms requiring alignments (that is, crypto hardware requiring
alignments). As a result, the former in the following your logic
doesn't happens for most of us. Your logic is:

=
It's used to make the context aligned so that for most algorithms we
can get to the context without going through ALIGN_PTR. Only
algorithms requiring alignments bigger than that offered by kmalloc
would have to use ALIGN_PTR.
=

The former preferable path (algorithms requiring alignments are
smaller than the minimum alignment guaranteed by kmalloc) happens only
on some powerpc, arm, and mips architectures. Do I misunderstand
something?

If you put a hack in __crypto_alloc_tfm and crypto_free_tfm to return
aligned tmf to algorithms (and use aligned attribute for __crt_ctx),
then the your logic would makes sense.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 1/4] block: use ARCH_KMALLOC_MINALIGN as the default dma pad mask
  2008-05-22  1:14                                       ` FUJITA Tomonori
@ 2008-05-22  1:19                                         ` David Miller
  2008-05-22  1:21                                           ` Herbert Xu
  2008-05-22  1:32                                           ` FUJITA Tomonori
  0 siblings, 2 replies; 45+ messages in thread
From: David Miller @ 2008-05-22  1:19 UTC (permalink / raw)
  To: fujita.tomonori
  Cc: herbert, akpm, linux-scsi, linux-ide, jens.axboe, tsbogend,
	bzolnier, James.Bottomley, jeff, linux-mm

From: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Date: Thu, 22 May 2008 10:14:12 +0900

> On Wed, 21 May 2008 21:18:11 +0800
> Herbert Xu <herbert@gondor.apana.org.au> wrote:
> 
> > On Wed, May 21, 2008 at 09:46:24PM +0900, FUJITA Tomonori wrote:
> > >
> > > No, you misunderstand my question. I meant, software algorithms don't
> > > need ARCH_KMALLOC_MINALIGN alignment for __crt_ctx and if we are fine
> > > with using the ALIGN hack for crypto hardware every time (like
> > > aes_ctx_common), crypto doesn't need ARCH_KMALLOC_MINALIGN alignment
> > > for __crt_ctx. Is this right?
> > 
> > The padlock isn't the only hardware device that will require
> > such alignment.  Now that we have the async interface there will
> > be more.
> 
> Ok, so it's all about crypto hardware requirement. In other words, if
> we accept for potential performance drop of crypto hardware, crypto
> can drop this alignment.

It sounds to me that Herbert is saying that the VIA crypto hardware
will malfunction if not given an aligned address, rather than simply
go more slowly.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 1/4] block: use ARCH_KMALLOC_MINALIGN as the default dma pad mask
  2008-05-22  1:19                                         ` David Miller
@ 2008-05-22  1:21                                           ` Herbert Xu
  2008-05-22  1:32                                           ` FUJITA Tomonori
  1 sibling, 0 replies; 45+ messages in thread
From: Herbert Xu @ 2008-05-22  1:21 UTC (permalink / raw)
  To: David Miller
  Cc: fujita.tomonori, akpm, linux-scsi, linux-ide, jens.axboe,
	tsbogend, bzolnier, James.Bottomley, jeff, linux-mm

On Wed, May 21, 2008 at 06:19:45PM -0700, David Miller wrote:
>
> > Ok, so it's all about crypto hardware requirement. In other words, if
> > we accept for potential performance drop of crypto hardware, crypto
> > can drop this alignment.
> 
> It sounds to me that Herbert is saying that the VIA crypto hardware
> will malfunction if not given an aligned address, rather than simply
> go more slowly.

Yes, in general hardware crypto that needs alignment requires it.
In VIA's case it will generate a GPF.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 1/4] block: use ARCH_KMALLOC_MINALIGN as the default dma pad mask
  2008-05-22  1:19                                         ` David Miller
  2008-05-22  1:21                                           ` Herbert Xu
@ 2008-05-22  1:32                                           ` FUJITA Tomonori
  2008-05-22  1:56                                             ` Herbert Xu
  1 sibling, 1 reply; 45+ messages in thread
From: FUJITA Tomonori @ 2008-05-22  1:32 UTC (permalink / raw)
  To: davem
  Cc: fujita.tomonori, herbert, akpm, linux-scsi, linux-ide, jens.axboe,
	tsbogend, bzolnier, James.Bottomley, jeff, linux-mm

On Wed, 21 May 2008 18:19:45 -0700 (PDT)
David Miller <davem@davemloft.net> wrote:

> From: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
> Date: Thu, 22 May 2008 10:14:12 +0900
> 
> > On Wed, 21 May 2008 21:18:11 +0800
> > Herbert Xu <herbert@gondor.apana.org.au> wrote:
> > 
> > > On Wed, May 21, 2008 at 09:46:24PM +0900, FUJITA Tomonori wrote:
> > > >
> > > > No, you misunderstand my question. I meant, software algorithms don't
> > > > need ARCH_KMALLOC_MINALIGN alignment for __crt_ctx and if we are fine
> > > > with using the ALIGN hack for crypto hardware every time (like
> > > > aes_ctx_common), crypto doesn't need ARCH_KMALLOC_MINALIGN alignment
> > > > for __crt_ctx. Is this right?
> > > 
> > > The padlock isn't the only hardware device that will require
> > > such alignment.  Now that we have the async interface there will
> > > be more.
> > 
> > Ok, so it's all about crypto hardware requirement. In other words, if
> > we accept for potential performance drop of crypto hardware, crypto
> > can drop this alignment.
> 
> It sounds to me that Herbert is saying that the VIA crypto hardware
> will malfunction if not given an aligned address, rather than simply
> go more slowly.

I understand that.

VIA crypto driver has the following code to get proper alignment:

static inline struct aes_ctx *aes_ctx_common(void *ctx)
{
	unsigned long addr = (unsigned long)ctx;
	unsigned long align = PADLOCK_ALIGNMENT;

	if (align <= crypto_tfm_ctx_alignment())
		align = 1;
	return (struct aes_ctx *)ALIGN(addr, align);
}

What he insists is:

When crypto hardware alignment is smaller than the minimum alignment
guaranteed by kmalloc, the above function is faster since ALIGN is
nullified. That's why crypto uses the minimum alignment guaranteed by
kmalloc.


What I asking is:

On most architectures, the minimum alignment guaranteed by kmalloc is
too small (8 bytes). This ideal story doesn't happen to most of us.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 1/4] block: use ARCH_KMALLOC_MINALIGN as the default dma pad mask
  2008-05-22  1:32                                           ` FUJITA Tomonori
@ 2008-05-22  1:56                                             ` Herbert Xu
  0 siblings, 0 replies; 45+ messages in thread
From: Herbert Xu @ 2008-05-22  1:56 UTC (permalink / raw)
  To: FUJITA Tomonori
  Cc: davem, akpm, linux-scsi, linux-ide, jens.axboe, tsbogend,
	bzolnier, James.Bottomley, jeff, linux-mm

On Thu, May 22, 2008 at 10:32:21AM +0900, FUJITA Tomonori wrote:
>
> What I asking is:
> 
> On most architectures, the minimum alignment guaranteed by kmalloc is
> too small (8 bytes). This ideal story doesn't happen to most of us.

Right, so the real issue is that what we have here is a lower
bound of the kmalloc alignment.  In reality, the kmalloc return
values are cache-line aligned when debugging is off.  So if you
can think of a way of getting a better bound at compile time,
I'm all ears.

Otherwise this discussion seems to be completely pointless.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 1/4] block: use ARCH_KMALLOC_MINALIGN as the default dma pad mask
  2008-05-20  9:31   ` [PATCH 1/4] block: use ARCH_KMALLOC_MINALIGN as the default dma pad mask Andrew Morton
  2008-05-20  9:38     ` Herbert Xu
@ 2008-05-20  9:55     ` Paul Mundt
  1 sibling, 0 replies; 45+ messages in thread
From: Paul Mundt @ 2008-05-20  9:55 UTC (permalink / raw)
  To: Andrew Morton
  Cc: FUJITA Tomonori, linux-scsi, linux-ide, jens.axboe, tsbogend,
	bzolnier, James.Bottomley, jeff, davem, linux-mm, Herbert Xu

On Tue, May 20, 2008 at 02:31:29AM -0700, Andrew Morton wrote:
> Why does ARCH_KMALLOC_MINALIGN even exist?  What is its mandate?  Sigh.
> 
> It's not really related to your patch (although your patch compounds
> the problem a little).  But we should sit down and work out what we
> actually want to do here.  Something like:
> 
> In each architecture's arch/foo/Kconfig, define
> 
> 	CONFIG_ARCH_DMA_ALIGN
> 
> and
> 
> 	CONFIG_ARCH_64BIT_POINTER_ALIGN
> 
> and then use them.  Note that these have nothing to do with each other,
> as far as I can tell.
> 
> Which leaves the question: "what should slab use"?  Maybe
> CONFIG_ARCH_DMA_ALIGN?  But that depends what ARCH_KMALLOC_MINALIGN is
> supposed to exist for.
> 
> ick.
> 
The only platforms that set ARCH_KMALLOC_MINALIGN appear to do so for DMA
alignment reasons, so your Kconfig option there seems reasonable.

The ARCH_SLAB_MINALIGN you can blame me for:

	http://marc.info/?l=linux-kernel&m=110227138116749&w=2

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 0/4] revert the commit 22a9189f (cdrom: use kmalloced buffers instead of buffers on stack)
  2008-05-20  4:58 [PATCH 0/4] revert the commit 22a9189f (cdrom: use kmalloced buffers instead of buffers on stack) FUJITA Tomonori
  2008-05-20  4:58 ` [PATCH 1/4] block: use ARCH_KMALLOC_MINALIGN as the default dma pad mask FUJITA Tomonori
@ 2008-05-21 22:11 ` Thomas Bogendoerfer
  2008-05-22  1:13   ` FUJITA Tomonori
  2008-05-27 18:58 ` Bartlomiej Zolnierkiewicz
  2 siblings, 1 reply; 45+ messages in thread
From: Thomas Bogendoerfer @ 2008-05-21 22:11 UTC (permalink / raw)
  To: FUJITA Tomonori
  Cc: linux-scsi, linux-ide, jens.axboe, bzolnier, James.Bottomley,
	jeff, akpm, davem

On Tue, May 20, 2008 at 01:58:30PM +0900, FUJITA Tomonori wrote:
> The goal of this patchset is reverting the commit
> 22a9189fd073db3d03a4cf8b8c098aa207602de1 (cdrom: use kmalloced buffers
> instead of buffers on stack).
> 
> http://lkml.org/lkml/2008/4/21/634
> 
> The commit is using kmalloced buffers for cdrom packet commands to
> avoid stack corruption on non coherent platforms. But allocating a
> small buffer like this is not nice (unnecessary complicity):
> 
> +	buffer = kmalloc(8, GFP_KERNEL);

If I read the bio code correctly with your patch non coherent system will
be punished a lot more. bio_copy_kern will allocate a whole page for this
8 bytes. Please correct me, if I'm wrong otherwise I tend to NAK that 
approach.

> If some architectures can't do DMA on stack, we also need to a new
> queue_flag like QUEUE_FLAG_NO_DMA_ON_STACK in addtion of this
> patchset.

DMA-mapping.txt tells that DMA on stack is forbidden. IMHO we either need
to revoke that first or deal with it. 

Thomas.

-- 
Crap can work. Given enough thrust pigs will fly, but it's not necessary a
good idea.                                                [ RFC1925, 2.3 ]

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 0/4] revert the commit 22a9189f (cdrom: use kmalloced buffers instead of buffers on stack)
  2008-05-21 22:11 ` [PATCH 0/4] revert the commit 22a9189f (cdrom: use kmalloced buffers instead of buffers on stack) Thomas Bogendoerfer
@ 2008-05-22  1:13   ` FUJITA Tomonori
  2008-05-22  1:18     ` David Miller
  0 siblings, 1 reply; 45+ messages in thread
From: FUJITA Tomonori @ 2008-05-22  1:13 UTC (permalink / raw)
  To: tsbogend
  Cc: fujita.tomonori, linux-scsi, linux-ide, jens.axboe, bzolnier,
	James.Bottomley, jeff, akpm, davem

On Thu, 22 May 2008 00:11:13 +0200
tsbogend@alpha.franken.de (Thomas Bogendoerfer) wrote:

> On Tue, May 20, 2008 at 01:58:30PM +0900, FUJITA Tomonori wrote:
> > The goal of this patchset is reverting the commit
> > 22a9189fd073db3d03a4cf8b8c098aa207602de1 (cdrom: use kmalloced buffers
> > instead of buffers on stack).
> > 
> > http://lkml.org/lkml/2008/4/21/634
> > 
> > The commit is using kmalloced buffers for cdrom packet commands to
> > avoid stack corruption on non coherent platforms. But allocating a
> > small buffer like this is not nice (unnecessary complicity):
> > 
> > +	buffer = kmalloc(8, GFP_KERNEL);
> 
> If I read the bio code correctly with your patch non coherent system will
> be punished a lot more. bio_copy_kern will allocate a whole page for this
> 8 bytes. Please correct me, if I'm wrong otherwise I tend to NAK that 
> approach.

You read the block layer code correctly, but I don't think allocating
one page frame is expensive.

We still have many places to do DMA on stack and need to fix them.

There are still many places (scsi_execute users) in the scsi mid-layer
that uses stack addresses for DMA.

drivers/block/pktcdvd.c always uses stack addresses for DMA for
generic packets.

I don't think that using kmalloc/kfree all over the place is a good
idea. We have a mechanism to do I/Os via a temporary buffer for a
buffer that we can't do DMA against. This patch just uses the mechnism
instead of spreading kmalloc/kfree everywhere. Note that of course
this patch fixes the problems that I explained above.

> > If some architectures can't do DMA on stack, we also need to a new
> > queue_flag like QUEUE_FLAG_NO_DMA_ON_STACK in addtion of this
> > patchset.
> 
> DMA-mapping.txt tells that DMA on stack is forbidden. IMHO we either need
> to revoke that first or deal with it. 

Yeah, it would be better to postpone the 4th patch until we finish
this issue. But I guess that we have enough time to do that until the
next merge window.

DMA on stack is forbidden because of non coherent architecutes and
architectures can't uses stack addresses for DMA? What architectures
can't uses stack addresses for DMA? Would it be better to just forbid
using stack addresses for DMA on all the architectures at all times?

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 0/4] revert the commit 22a9189f (cdrom: use kmalloced buffers instead of buffers on stack)
  2008-05-22  1:13   ` FUJITA Tomonori
@ 2008-05-22  1:18     ` David Miller
  2008-05-22  8:43       ` James Bottomley
  0 siblings, 1 reply; 45+ messages in thread
From: David Miller @ 2008-05-22  1:18 UTC (permalink / raw)
  To: fujita.tomonori
  Cc: tsbogend, linux-scsi, linux-ide, jens.axboe, bzolnier,
	James.Bottomley, jeff, akpm

From: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Date: Thu, 22 May 2008 10:13:31 +0900

> DMA on stack is forbidden because of non coherent architecutes and
> architectures can't uses stack addresses for DMA? What architectures
> can't uses stack addresses for DMA? Would it be better to just forbid
> using stack addresses for DMA on all the architectures at all times?

Rather, the real problem is that some architectures map the kernel
stack virtually, and as a result virt_to_page() and things like that
will not work.

It really is fully not working to put DMA buffers on the stack
in these cases.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 0/4] revert the commit 22a9189f (cdrom: use kmalloced buffers instead of buffers on stack)
  2008-05-22  1:18     ` David Miller
@ 2008-05-22  8:43       ` James Bottomley
  2008-05-26  9:17         ` FUJITA Tomonori
  0 siblings, 1 reply; 45+ messages in thread
From: James Bottomley @ 2008-05-22  8:43 UTC (permalink / raw)
  To: David Miller
  Cc: fujita.tomonori, tsbogend, linux-scsi, linux-ide, jens.axboe,
	bzolnier, jeff, akpm

On Wed, 2008-05-21 at 18:18 -0700, David Miller wrote:
> From: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
> Date: Thu, 22 May 2008 10:13:31 +0900
> 
> > DMA on stack is forbidden because of non coherent architecutes and
> > architectures can't uses stack addresses for DMA? What architectures
> > can't uses stack addresses for DMA? Would it be better to just forbid
> > using stack addresses for DMA on all the architectures at all times?
> 
> Rather, the real problem is that some architectures map the kernel
> stack virtually, and as a result virt_to_page() and things like that
> will not work.
> 
> It really is fully not working to put DMA buffers on the stack
> in these cases.

Actually, for SCSI only, that problem is fixed.  Our decision to
re-route all I/O, including kernel internal I/O through the block
mapping algorithms (really so we don't have to worry about GFP_DMA any
more since we can now use the block bounce buffers) means that all SCSI
submission routines will accept vmalloc'd and even stack buffers
perfectly correctly.

However (before anyone gets any ideas), the stack is incredibly
dangerous to do DMA on because it's packed with current data.  If you
don't get the alignment requirements right on a non-coherent platform,
you'll end up corrupting data because of cache line interference issues.
For this reason, DMA to stack is still technically illegal.

James

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 0/4] revert the commit 22a9189f (cdrom: use kmalloced buffers instead of buffers on stack)
  2008-05-22  8:43       ` James Bottomley
@ 2008-05-26  9:17         ` FUJITA Tomonori
  0 siblings, 0 replies; 45+ messages in thread
From: FUJITA Tomonori @ 2008-05-26  9:17 UTC (permalink / raw)
  To: James.Bottomley
  Cc: davem, fujita.tomonori, tsbogend, linux-scsi, linux-ide,
	jens.axboe, bzolnier, jeff, akpm

On Thu, 22 May 2008 09:43:12 +0100
James Bottomley <James.Bottomley@HansenPartnership.com> wrote:

> On Wed, 2008-05-21 at 18:18 -0700, David Miller wrote:
> > From: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
> > Date: Thu, 22 May 2008 10:13:31 +0900
> > 
> > > DMA on stack is forbidden because of non coherent architecutes and
> > > architectures can't uses stack addresses for DMA? What architectures
> > > can't uses stack addresses for DMA? Would it be better to just forbid
> > > using stack addresses for DMA on all the architectures at all times?
> > 
> > Rather, the real problem is that some architectures map the kernel
> > stack virtually, and as a result virt_to_page() and things like that
> > will not work.
> > 
> > It really is fully not working to put DMA buffers on the stack
> > in these cases.
> 
> Actually, for SCSI only, that problem is fixed.  Our decision to
> re-route all I/O, including kernel internal I/O through the block
> mapping algorithms (really so we don't have to worry about GFP_DMA any
> more since we can now use the block bounce buffers) means that all SCSI
> submission routines will accept vmalloc'd and even stack buffers
> perfectly correctly.

Yeah, post 2.6.24, blk_rq_map_kern was fixed to use the bounce
buffers. But the block layer doesn't take care of stack buffers or non
coherent architectures wrt the bounce buffers. If we fix that, all
SCSI submission routines are fine. That's the issue that this patchset
tries to fix.


> However (before anyone gets any ideas), the stack is incredibly
> dangerous to do DMA on because it's packed with current data.  If you
> don't get the alignment requirements right on a non-coherent platform,
> you'll end up corrupting data because of cache line interference issues.
> For this reason, DMA to stack is still technically illegal.
> 
> James
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH 0/4] revert the commit 22a9189f (cdrom: use kmalloced buffers instead of buffers on stack)
  2008-05-20  4:58 [PATCH 0/4] revert the commit 22a9189f (cdrom: use kmalloced buffers instead of buffers on stack) FUJITA Tomonori
  2008-05-20  4:58 ` [PATCH 1/4] block: use ARCH_KMALLOC_MINALIGN as the default dma pad mask FUJITA Tomonori
  2008-05-21 22:11 ` [PATCH 0/4] revert the commit 22a9189f (cdrom: use kmalloced buffers instead of buffers on stack) Thomas Bogendoerfer
@ 2008-05-27 18:58 ` Bartlomiej Zolnierkiewicz
  2 siblings, 0 replies; 45+ messages in thread
From: Bartlomiej Zolnierkiewicz @ 2008-05-27 18:58 UTC (permalink / raw)
  To: FUJITA Tomonori
  Cc: linux-scsi, linux-ide, jens.axboe, tsbogend, James.Bottomley,
	jeff, akpm, davem, Borislav Petkov


On Tuesday 20 May 2008, FUJITA Tomonori wrote:
> The goal of this patchset is reverting the commit
> 22a9189fd073db3d03a4cf8b8c098aa207602de1 (cdrom: use kmalloced buffers
> instead of buffers on stack).
> 
> http://lkml.org/lkml/2008/4/21/634
> 
> The commit is using kmalloced buffers for cdrom packet commands to
> avoid stack corruption on non coherent platforms. But allocating a
> small buffer like this is not nice (unnecessary complicity):
> 
> +	buffer = kmalloc(8, GFP_KERNEL);
> 
> I tried to remove generic_packet() and convert cdrom users to use the
> block queue like pkt_generic_packet (as Jens suggested in the thread
> if I correctly understand), but it turned out that it needs tricky
> surgery (like handling ssleep and retries for packet commands in ide).
> 
> Then I found that we can easily handle packet commands on non coherent
> platforms. The diffstat is pretty small except for the revert. All
> this patchset does is just setting the dma_pad_mask to
> ARCH_KMALLOC_MINALIGN.
> 
> Only scsi and ide-cd do DMA generic_packet. In the case of scsi,
> sr_packet uses blk_rq_map_kern (the commit
> 68154e90c9d1492d570671ae181d9a8f8530da55) post 2.6.25. So if we set
> the dma padding on non coherent platforms, sr_packet uses allocated
> pages properly.
> 
> In the case of IDE, ide-cd has a mechanism to handle alignment and
> padding for SG_IO. So we can easily exploit it for packet commands.
> 
> If some architectures can't do DMA on stack, we also need to a new
> queue_flag like QUEUE_FLAG_NO_DMA_ON_STACK in addtion of this
> patchset.
> 
> The diffstat is:
> 
>  block/blk-settings.c      |   30 +++++-
>  drivers/ata/libata-scsi.c |    3 +-
>  drivers/cdrom/cdrom.c     |  274 +++++++++++++++------------------------------
>  drivers/ide/ide-cd.c      |   17 ++-
>  include/linux/blkdev.h    |    1 +
>  5 files changed, 134 insertions(+), 191 deletions(-)

The patchset looks good to me after a quick look but Borislav is a more
appropriate person to contact on ide-cd specific patches.

Thanks,
Bart

^ permalink raw reply	[flat|nested] 45+ messages in thread

end of thread, other threads:[~2008-05-28  6:43 UTC | newest]

Thread overview: 45+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-05-20  4:58 [PATCH 0/4] revert the commit 22a9189f (cdrom: use kmalloced buffers instead of buffers on stack) FUJITA Tomonori
2008-05-20  4:58 ` [PATCH 1/4] block: use ARCH_KMALLOC_MINALIGN as the default dma pad mask FUJITA Tomonori
2008-05-20  4:58   ` [PATCH 2/4] block: add blk_queue_update_dma_pad FUJITA Tomonori
2008-05-20  4:58     ` [PATCH 3/4] ide: use the dma safe check for REQ_TYPE_ATA_PC FUJITA Tomonori
2008-05-20  4:58       ` [PATCH 4/4] cdrom: revert commit 22a9189 (cdrom: use kmalloced buffers instead of buffers on stack) FUJITA Tomonori
2008-05-20  7:33         ` Elias Oltmanns
2008-05-20  7:42           ` Jim Paris
2008-05-20  8:14             ` Elias Oltmanns
2008-05-28  6:43       ` [PATCH 3/4] ide: use the dma safe check for REQ_TYPE_ATA_PC Borislav Petkov
2008-05-20  9:31   ` [PATCH 1/4] block: use ARCH_KMALLOC_MINALIGN as the default dma pad mask Andrew Morton
2008-05-20  9:38     ` Herbert Xu
2008-05-20  9:52       ` Andrew Morton
2008-05-20  9:58         ` FUJITA Tomonori
2008-05-20 11:32         ` Herbert Xu
2008-05-20 13:25       ` FUJITA Tomonori
2008-05-20 15:34         ` Herbert Xu
2008-05-20 15:53           ` Plans for libsas and SAS-2? Marushak, Nathan
2008-05-20 16:09           ` [PATCH 1/4] block: use ARCH_KMALLOC_MINALIGN as the default dma pad mask FUJITA Tomonori
2008-05-21  1:26             ` Herbert Xu
2008-05-21  1:36               ` FUJITA Tomonori
2008-05-21  3:16                 ` Herbert Xu
2008-05-21  6:54                   ` FUJITA Tomonori
2008-05-21  8:47                     ` Herbert Xu
2008-05-21  9:34                       ` FUJITA Tomonori
2008-05-21 10:05                         ` Herbert Xu
2008-05-21 11:01                           ` FUJITA Tomonori
2008-05-21 11:25                             ` Herbert Xu
2008-05-21 12:09                               ` FUJITA Tomonori
2008-05-21 12:22                                 ` Herbert Xu
2008-05-21 12:46                                   ` FUJITA Tomonori
2008-05-21 12:55                                     ` FUJITA Tomonori
2008-05-21 13:19                                       ` Herbert Xu
2008-05-21 13:18                                     ` Herbert Xu
2008-05-22  1:14                                       ` FUJITA Tomonori
2008-05-22  1:19                                         ` David Miller
2008-05-22  1:21                                           ` Herbert Xu
2008-05-22  1:32                                           ` FUJITA Tomonori
2008-05-22  1:56                                             ` Herbert Xu
2008-05-20  9:55     ` Paul Mundt
2008-05-21 22:11 ` [PATCH 0/4] revert the commit 22a9189f (cdrom: use kmalloced buffers instead of buffers on stack) Thomas Bogendoerfer
2008-05-22  1:13   ` FUJITA Tomonori
2008-05-22  1:18     ` David Miller
2008-05-22  8:43       ` James Bottomley
2008-05-26  9:17         ` FUJITA Tomonori
2008-05-27 18:58 ` Bartlomiej Zolnierkiewicz

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).