public inbox for linux-mtd@lists.infradead.org
 help / color / mirror / Atom feed
* [patch 02/13] jffs2 summary allocation: don't use vmalloc()
@ 2008-07-30 19:34 akpm
  2008-07-30 22:39 ` David Brownell
  0 siblings, 1 reply; 14+ messages in thread
From: akpm @ 2008-07-30 19:34 UTC (permalink / raw)
  To: dwmw2; +Cc: david-b, linux-mtd, trimarchimichael, akpm, jwboyer, rmk

From: Michael Trimarchi <trimarchimichael@yahoo.it>

arm:

Unable to handle kernel NULL pointer dereference at virtual address 00000000
pgd = c0004000
[00000000] *pgd=00000000
stopped custom tracer.
Internal error: Oops: 817 [#1] PREEMPT
Modules linked in:
CPU: 0    Not tainted  (2.6.24-rc5-rt1 #37)
PC is at dma_cache_maint+0x40/0x80
LR is at atmel_spi_transfer+0x94/0x178

This is because the SPI layer is using DMA transfers to support jffs2 I/O, and
apparently jffs2 isn't used to having DMA done against this buffer.  DMA
against vmalloced memory plain isn't allowed.

This patch will probably break all sorts of things because that buffer is
&*large*: up to half a meg.

So this patch isn't mergeable.  I'll hang onto it to bug dmwm2 with when he
reincarnates.

Cc: David Brownell <david-b@pacbell.net>
Cc: Josh Boyer <jwboyer@gmail.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/jffs2/summary.c |    5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff -puN fs/jffs2/summary.c~jffs2-summary-allocation-dont-use-vmalloc fs/jffs2/summary.c
--- a/fs/jffs2/summary.c~jffs2-summary-allocation-dont-use-vmalloc
+++ a/fs/jffs2/summary.c
@@ -17,7 +17,6 @@
 #include <linux/pagemap.h>
 #include <linux/crc32.h>
 #include <linux/compiler.h>
-#include <linux/vmalloc.h>
 #include "nodelist.h"
 #include "debug.h"
 
@@ -30,7 +29,7 @@ int jffs2_sum_init(struct jffs2_sb_info 
 		return -ENOMEM;
 	}
 
-	c->summary->sum_buf = vmalloc(c->sector_size);
+	c->summary->sum_buf = kmalloc(c->sector_size, GFP_KERNEL);
 
 	if (!c->summary->sum_buf) {
 		JFFS2_WARNING("Can't allocate buffer for writing out summary information!\n");
@@ -49,7 +48,7 @@ void jffs2_sum_exit(struct jffs2_sb_info
 
 	jffs2_sum_disable_collecting(c->summary);
 
-	vfree(c->summary->sum_buf);
+	kfree(c->summary->sum_buf);
 	c->summary->sum_buf = NULL;
 
 	kfree(c->summary);
_

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [patch 02/13] jffs2 summary allocation: don't use vmalloc()
  2008-07-30 19:34 [patch 02/13] jffs2 summary allocation: don't use vmalloc() akpm
@ 2008-07-30 22:39 ` David Brownell
  2008-07-30 22:46   ` Andrew Morton
                     ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: David Brownell @ 2008-07-30 22:39 UTC (permalink / raw)
  To: linux-mtd, dwmw2; +Cc: trimarchimichael, jwboyer, akpm, rmk

> This patch will probably break all sorts of things because that buffer is
> &*large*: up to half a meg.
>
> So this patch isn't mergeable.  I'll hang onto it to bug dmwm2 with when he
> reincarnates.

I'm still asking whether MTD folk have any plans to make that stack DMA-safe...
more than just the SPI flash drivers (mtd_dataflash, m25p80) could benefit
from DMA support, so I'd hope it's at least being considered.

If the answer is "no" then (a) the MTD interface specs need to finally say
they pass DMA-unsafe addresses, and (b) those SPI flash drivers are going
to need updates.

- Dae


> @@ -30,7 +29,7 @@ int jffs2_sum_init(struct jffs2_sb_info 
>  		return -ENOMEM;
>  	}
>  
> -	c->summary->sum_buf = vmalloc(c->sector_size);
> +	c->summary->sum_buf = kmalloc(c->sector_size, GFP_KERNEL);
>  
>  	if (!c->summary->sum_buf) {
>  		JFFS2_WARNING("Can't allocate buffer for writing out summary information!\n");

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [patch 02/13] jffs2 summary allocation: don't use vmalloc()
  2008-07-30 22:39 ` David Brownell
@ 2008-07-30 22:46   ` Andrew Morton
  2008-07-31  5:10     ` David Brownell
  2008-07-31  5:15   ` Artem Bityutskiy
  2008-07-31  7:33   ` David Woodhouse
  2 siblings, 1 reply; 14+ messages in thread
From: Andrew Morton @ 2008-07-30 22:46 UTC (permalink / raw)
  To: David Brownell; +Cc: trimarchimichael, dwmw2, jwboyer, linux-mtd, rmk

On Wed, 30 Jul 2008 15:39:24 -0700
David Brownell <david-b@pacbell.net> wrote:

> > This patch will probably break all sorts of things because that buffer is
> > &*large*: up to half a meg.
> >
> > So this patch isn't mergeable.  I'll hang onto it to bug dmwm2 with when he
> > reincarnates.
> 
> I'm still asking whether MTD folk have any plans to make that stack DMA-safe...
> more than just the SPI flash drivers (mtd_dataflash, m25p80) could benefit
> from DMA support, so I'd hope it's at least being considered.
> 
> If the answer is "no" then (a) the MTD interface specs need to finally say
> they pass DMA-unsafe addresses, and (b) those SPI flash drivers are going
> to need updates.

Well yes.  It's been four months since this bug (it goes oops!) was
reported and afaik there's been no discussion or consideration or
anything else.

I don't know how much of a problem this bug is in the real world, but
it's taking an awful long time to get fixed?

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [patch 02/13] jffs2 summary allocation: don't use vmalloc()
  2008-07-30 22:46   ` Andrew Morton
@ 2008-07-31  5:10     ` David Brownell
  2008-07-31  5:37       ` Artem Bityutskiy
  0 siblings, 1 reply; 14+ messages in thread
From: David Brownell @ 2008-07-31  5:10 UTC (permalink / raw)
  To: Andrew Morton; +Cc: trimarchimichael, dwmw2, jwboyer, linux-mtd, rmk

On Wednesday 30 July 2008, Andrew Morton wrote:
> On Wed, 30 Jul 2008 15:39:24 -0700
> David Brownell <david-b@pacbell.net> wrote:
> 
> > I'm still asking whether MTD folk have any plans to make that stack DMA-safe...
> > more than just the SPI flash drivers (mtd_dataflash, m25p80) could benefit
> > from DMA support, so I'd hope it's at least being considered.
> > 
> > If the answer is "no" then (a) the MTD interface specs need to finally say
> > they pass DMA-unsafe addresses, and (b) those SPI flash drivers are going
> > to need updates.
> 
> Well yes.  It's been four months since this bug (it goes oops!) was
> reported and afaik there's been no discussion or consideration or
> anything else.

Well, the appended is at least starting to address (a) ... under the
assumption the MTD folk aren't making the stack more DMA-friendly.
Presumably the same thing should be true of panic_write(), the OOB
operations (for NAND), and the OTP operations.

I actually think it makes the most technical sense to make the MTD
stack talk to its low level drivers the way most code seems to:
only pass DMA-safe buffers.


> I don't know how much of a problem this bug is in the real world, but
> it's taking an awful long time to get fixed?

I'm not keen on spending time on a "fix" that both worsens performance
and is also wrong ... which is why I've asked the broader question about
the MTD stack and DMA.  It's taking a long time to get an answer to what
should be a simple question...

That, and the fact that the board I have two types of SPI flash hooked up
to currently has a misbehaving SPI driver, which I think someone (else)
is going to be fixing soon.  So testing such changes is impractical, even
if I assume the MTD stack is unlkely to change...

Not all the SPI controllers use DMA, so only some boards suffer from
this problem in any case.

- Dave


---
 include/linux/mtd/mtd.h |   10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

--- a/include/linux/mtd/mtd.h	2008-07-30 20:46:02.000000000 -0700
+++ b/include/linux/mtd/mtd.h	2008-07-30 21:20:37.000000000 -0700
@@ -149,8 +149,14 @@ struct mtd_info {
 	void (*unpoint) (struct mtd_info *mtd, loff_t from, size_t len);
 
 
-	int (*read) (struct mtd_info *mtd, loff_t from, size_t len, size_t *retlen, u_char *buf);
-	int (*write) (struct mtd_info *mtd, loff_t to, size_t len, size_t *retlen, const u_char *buf);
+	/*
+	 * Read/write operations are given *VIRTUAL* addresses, which
+	 * must not be passed as-is to dma mapping operations.
+	 */
+	int (*read) (struct mtd_info *mtd, loff_t from,
+			size_t len, size_t *retlen, u_char *virt);
+	int (*write) (struct mtd_info *mtd, loff_t to,
+			size_t len, size_t *retlen, const u_char *virt);
 
 	/* In blackbox flight recorder like scenarios we want to make successful
 	   writes in interrupt context. panic_write() is only intended to be

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [patch 02/13] jffs2 summary allocation: don't use vmalloc()
  2008-07-30 22:39 ` David Brownell
  2008-07-30 22:46   ` Andrew Morton
@ 2008-07-31  5:15   ` Artem Bityutskiy
  2008-07-31  6:56     ` David Brownell
  2008-07-31  7:33   ` David Woodhouse
  2 siblings, 1 reply; 14+ messages in thread
From: Artem Bityutskiy @ 2008-07-31  5:15 UTC (permalink / raw)
  To: David Brownell; +Cc: linux-mtd, trimarchimichael, jwboyer, dwmw2, akpm, rmk

Hi,

On Wed, 2008-07-30 at 15:39 -0700, David Brownell wrote:
> > This patch will probably break all sorts of things because that buffer is
> > &*large*: up to half a meg.
> >
> > So this patch isn't mergeable.  I'll hang onto it to bug dmwm2 with when he
> > reincarnates.
> 
> I'm still asking whether MTD folk have any plans to make that stack DMA-safe...
> more than just the SPI flash drivers (mtd_dataflash, m25p80) could benefit
> from DMA support, so I'd hope it's at least being considered.
> 
> If the answer is "no" then (a) the MTD interface specs need to finally say
> they pass DMA-unsafe addresses, and (b) those SPI flash drivers are going
> to need updates.

We use vmalloc() in both UBI and UBIFS because we need to allocate a
large (of eraseblock size) buffer. So this is not just JFFS2. Using
kmalloc() for this does not seem to be a good idea for me, because
indeed the buffer size may be up to 512KiB, and may even grow at some
point to 1MiB.

Using kmalloc() would mean that at some point we would be unable to
allocate these buffers at one go and would have to do things is
fractions smaller than eraseblock size, which is not always easy. So I
am not really sure what is better - to add complexity to JFFS2/UBI/UBIFS
or to teach low levels (which do DMA) to deal with physically
noncontinuous buffers (e.g., DMA only one RAM page at a time).

-- 
Best regards,
Artem Bityutskiy (Битюцкий Артём)

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [patch 02/13] jffs2 summary allocation: don't use vmalloc()
  2008-07-31  5:10     ` David Brownell
@ 2008-07-31  5:37       ` Artem Bityutskiy
  2008-07-31  6:57         ` David Brownell
  0 siblings, 1 reply; 14+ messages in thread
From: Artem Bityutskiy @ 2008-07-31  5:37 UTC (permalink / raw)
  To: David Brownell
  Cc: linux-mtd, trimarchimichael, Andrew Morton, dwmw2, jwboyer, rmk


On Wed, 2008-07-30 at 22:10 -0700, David Brownell wrote:
> -	int (*read) (struct mtd_info *mtd, loff_t from, size_t len, size_t *retlen, u_char *buf);
> -	int (*write) (struct mtd_info *mtd, loff_t to, size_t len, size_t *retlen, const u_char *buf);
> +	/*
> +	 * Read/write operations are given *VIRTUAL* addresses, which
> +	 * must not be passed as-is to dma mapping operations.
> +	 */
> +	int (*read) (struct mtd_info *mtd, loff_t from,
> +			size_t len, size_t *retlen, u_char *virt);
> +	int (*write) (struct mtd_info *mtd, loff_t to,
> +			size_t len, size_t *retlen, const u_char *virt);
>  
>  	/* In blackbox flight recorder like scenarios we want to make successful
>  	   writes in interrupt context. panic_write() is only intended to be

Can SPI driver detect if the memory was kmalloc()'ed or vmalloc()'ed? Or
if not, we could add one more argument to read()/write() which tells
whether the memory is contiguous or not. Then the driver would chose how
to deal with the buffer.

-- 
Best regards,
Artem Bityutskiy (Битюцкий Артём)

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [patch 02/13] jffs2 summary allocation: don't use vmalloc()
  2008-07-31  5:15   ` Artem Bityutskiy
@ 2008-07-31  6:56     ` David Brownell
  2008-07-31  8:00       ` Artem Bityutskiy
  0 siblings, 1 reply; 14+ messages in thread
From: David Brownell @ 2008-07-31  6:56 UTC (permalink / raw)
  To: dedekind; +Cc: linux-mtd, trimarchimichael, jwboyer, dwmw2, akpm, rmk

On Wednesday 30 July 2008, Artem Bityutskiy wrote:
> We use vmalloc() in both UBI and UBIFS because we need to allocate a
> large (of eraseblock size) buffer.

In this case, the erase blocks are often small ... many would be
4KB (or less) if JFFS2 didn't jack them up to 8KB minimum, but some
of the flash chips supported by m25p80 are more like NOR.


> So this is not just JFFS2. Using 
> kmalloc() for this does not seem to be a good idea for me, because
> indeed the buffer size may be up to 512KiB, and may even grow at some
> point to 1MiB.

Yeah, nobody's saying kmalloc() is the right answer.  The questions
include who's going to change what, given that this part of the MTD
driver interface has previously been unspecified. 

(DataFlash support has been around since 2003 or so; only *this year*
did anyone suggest that buffers handed to it wouldn't work with the
standard DMA mapping operations, and that came up in the context of
a newish JFFS2 "summary" feature ...)


Another perspective comes from looking at it bottom up, starting with
what the various kinds of flash do.

 - NOR (which I'll assume for discussion is all CFI) hasn't previously
   considered DMA ... although the drivers/dma stuff might handle its
   memcpy on some platforms.  (I measured it on a few systems and saw
   no performance wins however; IMO the interface overheads hurt it.)

 - NAND only does reads/writes of smallish pages ... in conjunction
   with hardware ECC, DMA can help (*) but that only uses small buffers.
   Some NAND drivers *do* use DMA ... Blackfin looks like it assumes
   the buffers are always in adjacent pages, fwiw, and PXA3 looks like
   it always uses a bounce buffer (not very speedy).

 - SPI (two drivers) often does writes of smaller pages than NAND, but
   can read out the entire flash chip in a single operation.  (Which is
   handy for bootstrapping and suchlike.)

So right *now* the main trouble spot with DMA seems to be SPI, initially
with the newish summary support, although some troubles may be lurking
with NAND too (which has an easier time using DMA than NOR).


> Using kmalloc() would mean that at some point we would be unable to
> allocate these buffers at one go and would have to do things is
> fractions smaller than eraseblock size, which is not always easy. So I
> am not really sure what is better - to add complexity to JFFS2/UBI/UBIFS
> or to teach low levels (which do DMA)

Midlayers *could* use drivers/dma to shrink cpu memcpy costs, if
they wanted.  Not sure I'd advise it just now though ... just
saying that more than the lowest levels could do DMA.


> 	to deal with physically 
> noncontinuous buffers (e.g., DMA only one RAM page at a time).

I suppose I'd rather see some mid-layer utilities offloading the
DMA from the lower level drivers.  It seems wrong to expect two
drivers to do the same kind of virtual-buffer to physical-pages
mappings.  There's probably even a utility to do that, leaving
just the task of using it when the lowest level driver (the one
called by MTD-over-SPI drivers like m25p80/dataflash) does DMA.

Comments?

- Dave

(*) As I noted in the context of a different patch:  why doesn't the
    generic NAND code use readsw()/writesw() to get a speedup even
    for PIO based access?  I thought a 16% improvement (ARM9) over
    the current I/O loops would be compelling ...

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [patch 02/13] jffs2 summary allocation: don't use vmalloc()
  2008-07-31  5:37       ` Artem Bityutskiy
@ 2008-07-31  6:57         ` David Brownell
  2008-07-31  8:03           ` Artem Bityutskiy
  0 siblings, 1 reply; 14+ messages in thread
From: David Brownell @ 2008-07-31  6:57 UTC (permalink / raw)
  To: dedekind; +Cc: linux-mtd, trimarchimichael, Andrew Morton, dwmw2, jwboyer, rmk

On Wednesday 30 July 2008, Artem Bityutskiy wrote:
> Can SPI driver detect if the memory was kmalloc()'ed or vmalloc()'ed?

The general driver policy is to use one or the other and never mix
them.  Pretty much like always calling with a lock held, or never;
or always calling in contexts that can sleep, etc.  (kmalloc and
siblings being an exception, with gfp_t.)

For clarity:  the stack in question is like

	jffs2, ubifs, etc
	mtd_dataflash or m25p80
	spi_master (generally SOC-specific) sometimes using DMA
	... talking to flash chip using 4 wire signaling ...

Inputs to the SPI stack are required to be DMA-safe -- vmalloc bad.
Not all the SPI controllers support DMA either.

Inputs to the MTD layer have previously been unspecified, but now seem
to be assumed to be DMA-unsafe -- vmalloc ok.


> Or if not, we could add one more argument to read()/write() which tells
> whether the memory is contiguous or not. Then the driver would chose how
> to deal with the buffer.

Is updating all the MTD users in such a way a real possibility?

- Dave
 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [patch 02/13] jffs2 summary allocation: don't use vmalloc()
  2008-07-30 22:39 ` David Brownell
  2008-07-30 22:46   ` Andrew Morton
  2008-07-31  5:15   ` Artem Bityutskiy
@ 2008-07-31  7:33   ` David Woodhouse
  2008-07-31  8:21     ` David Brownell
  2 siblings, 1 reply; 14+ messages in thread
From: David Woodhouse @ 2008-07-31  7:33 UTC (permalink / raw)
  To: David Brownell; +Cc: trimarchimichael, jwboyer, linux-mtd, akpm, rmk

On Wed, 2008-07-30 at 15:39 -0700, David Brownell wrote:
> > This patch will probably break all sorts of things because that buffer is
> > &*large*: up to half a meg.
> >
> > So this patch isn't mergeable.  I'll hang onto it to bug dmwm2 with when he
> > reincarnates.
> 
> I'm still asking whether MTD folk have any plans to make that stack DMA-safe...
> more than just the SPI flash drivers (mtd_dataflash, m25p80) could benefit
> from DMA support, so I'd hope it's at least being considered.
> 
> If the answer is "no" then (a) the MTD interface specs need to finally say
> they pass DMA-unsafe addresses, and (b) those SPI flash drivers are going
> to need updates.

Yes. We're planning to change the API so it looks a lot more like the
block layer, and a lot less like the one we inherited from PCMCIA in the
1990s. One of the design considerations we've spoken about has been the
fact that we want to make DMA work properly. 

-- 
dwmw2

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [patch 02/13] jffs2 summary allocation: don't use vmalloc()
  2008-07-31  6:56     ` David Brownell
@ 2008-07-31  8:00       ` Artem Bityutskiy
  2008-07-31  8:48         ` David Woodhouse
  0 siblings, 1 reply; 14+ messages in thread
From: Artem Bityutskiy @ 2008-07-31  8:00 UTC (permalink / raw)
  To: David Brownell; +Cc: linux-mtd, trimarchimichael, jwboyer, dwmw2, akpm, rmk

On Wed, 2008-07-30 at 23:56 -0700, David Brownell wrote:
> > So this is not just JFFS2. Using 
> > kmalloc() for this does not seem to be a good idea for me, because
> > indeed the buffer size may be up to 512KiB, and may even grow at some
> > point to 1MiB.
> 
> Yeah, nobody's saying kmalloc() is the right answer.  The questions
> include who's going to change what, given that this part of the MTD
> driver interface has previously been unspecified. 
> 
> (DataFlash support has been around since 2003 or so; only *this year*
> did anyone suggest that buffers handed to it wouldn't work with the
> standard DMA mapping operations, and that came up in the context of
> a newish JFFS2 "summary" feature ...)

I've just glanced to JFFS2, and this sum_buf does not have to be of
eraseblock size. It should be something like a couple of NAND pages in
size, or, say, 5-10% of eraseblock size. So I would say, in this
particular case JFFS2 may be fixed an kmalloc() may be used.

The idea of this summary stuff is to speed up mount time. JFFS2, while
writing to an EB, remembers information about written nodes in
c->summary->sum_list_head. Then, when the eraseblock is close to be
full, it creates a summary node, which contains an array of information
about each node in this EB. And this summary node is written to the end
of the eraseblock. And, when JFFS2 is mounted it reads this summary node
from the end of EB, instead of scanning whole EB, which speeds up
mounting.

Obviously, JFFS2 does not need eraseblock size buffer for the summary
node. This can be fixed and the problem may be "forgotten" for some
period of time :-)


> Another perspective comes from looking at it bottom up, starting with
> what the various kinds of flash do.
> 
>  - NOR (which I'll assume for discussion is all CFI) hasn't previously
>    considered DMA ... although the drivers/dma stuff might handle its
>    memcpy on some platforms.  (I measured it on a few systems and saw
>    no performance wins however; IMO the interface overheads hurt it.)
> 
>  - NAND only does reads/writes of smallish pages ... in conjunction
>    with hardware ECC, DMA can help (*) but that only uses small buffers.
Yeah, of NAND page size which is 4KiB at max. now AFAIK. But it may grow
at some point.

>    Some NAND drivers *do* use DMA ... Blackfin looks like it assumes
>    the buffers are always in adjacent pages, fwiw, and PXA3 looks like
>    it always uses a bounce buffer (not very speedy).
> 
>  - SPI (two drivers) often does writes of smaller pages than NAND, but
>    can read out the entire flash chip in a single operation.  (Which
> is
>    handy for bootstrapping and suchlike.)

Yeah, it seems that if we just fix this sum_buf in JFFS2 then anyone is
going to be happy. And we may hope that someone soon would change mtd
interfaces as well.

> Midlayers *could* use drivers/dma to shrink cpu memcpy costs, if
> they wanted.  Not sure I'd advise it just now though ... just
> saying that more than the lowest levels could do DMA.

Yeah, you are right, I did not think about this. For UBIFS that could be
a good optimization, because profiling shows it substantial amount of
time in memcpy().

> I suppose I'd rather see some mid-layer utilities offloading the
> DMA from the lower level drivers.  It seems wrong to expect two
> drivers to do the same kind of virtual-buffer to physical-pages
> mappings.  There's probably even a utility to do that, leaving
> just the task of using it when the lowest level driver (the one
> called by MTD-over-SPI drivers like m25p80/dataflash) does DMA.

Hmm, interesting idea. Is something like this is used somewhere in the
kernel?

-- 
Best regards,
Artem Bityutskiy (Битюцкий Артём)

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [patch 02/13] jffs2 summary allocation: don't use vmalloc()
  2008-07-31  6:57         ` David Brownell
@ 2008-07-31  8:03           ` Artem Bityutskiy
  0 siblings, 0 replies; 14+ messages in thread
From: Artem Bityutskiy @ 2008-07-31  8:03 UTC (permalink / raw)
  To: David Brownell
  Cc: linux-mtd, trimarchimichael, Andrew Morton, dwmw2, jwboyer, rmk


On Wed, 2008-07-30 at 23:57 -0700, David Brownell wrote:
> > Or if not, we could add one more argument to read()/write() which tells
> > whether the memory is contiguous or not. Then the driver would chose how
> > to deal with the buffer.
> 
> Is updating all the MTD users in such a way a real possibility?

I need to look closer, but at the first glance it seems that it should
be possible. There are not so many mtd->read/write users.

-- 
Best regards,
Artem Bityutskiy (Битюцкий Артём)

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [patch 02/13] jffs2 summary allocation: don't use vmalloc()
  2008-07-31  7:33   ` David Woodhouse
@ 2008-07-31  8:21     ` David Brownell
  0 siblings, 0 replies; 14+ messages in thread
From: David Brownell @ 2008-07-31  8:21 UTC (permalink / raw)
  To: David Woodhouse; +Cc: trimarchimichael, jwboyer, linux-mtd, akpm, rmk

On Thursday 31 July 2008, David Woodhouse wrote:
> Yes. We're planning to change the API so it looks a lot more like the
> block layer, and a lot less like the one we inherited from PCMCIA in the
> 1990s.

Yeah, there are still a few "old" driver structures lingering.  And
I expect the Linux community understands a LOT more about MTD than
it did way back in the day ... :)


> One of the design considerations we've spoken about has been the 
> fact that we want to make DMA work properly. 

Glad to hear that.  But am I right in reading between the lines that
this isn't "soon"?  So that the issue addressed by $SUBJECT won't be
able to rely on that as its fix.

- Dave

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [patch 02/13] jffs2 summary allocation: don't use vmalloc()
  2008-07-31  8:00       ` Artem Bityutskiy
@ 2008-07-31  8:48         ` David Woodhouse
  2008-07-31  9:09           ` David Woodhouse
  0 siblings, 1 reply; 14+ messages in thread
From: David Woodhouse @ 2008-07-31  8:48 UTC (permalink / raw)
  To: dedekind; +Cc: David Brownell, linux-mtd, trimarchimichael, jwboyer, akpm, rmk

On Thu, 2008-07-31 at 11:00 +0300, Artem Bityutskiy wrote:
> I've just glanced to JFFS2, and this sum_buf does not have to be of
> eraseblock size. It should be something like a couple of NAND pages in
> size, or, say, 5-10% of eraseblock size. So I would say, in this
> particular case JFFS2 may be fixed an kmalloc() may be used.

That's a reasonable answer in the short term; good idea. And we don't
have to worry too much about proving a lower bound on the size we'll
want to use for the summary -- because the summary is optional, it's OK
just to abort it if it would have overflowed our truncated buffer. 

Something like this should do the trick...

diff --git a/fs/jffs2/summary.c b/fs/jffs2/summary.c
index 629af01..6caf1e1 100644
--- a/fs/jffs2/summary.c
+++ b/fs/jffs2/summary.c
@@ -23,6 +23,8 @@
 
 int jffs2_sum_init(struct jffs2_sb_info *c)
 {
+	uint32_t sum_size = max_t(uint32_t, c->sector_size, MAX_SUMMARY_SIZE);
+
 	c->summary = kzalloc(sizeof(struct jffs2_summary), GFP_KERNEL);
 
 	if (!c->summary) {
@@ -30,7 +32,7 @@ int jffs2_sum_init(struct jffs2_sb_info *c)
 		return -ENOMEM;
 	}
 
-	c->summary->sum_buf = vmalloc(c->sector_size);
+	c->summary->sum_buf = kmalloc(sum_size, GFP_KERNEL);
 
 	if (!c->summary->sum_buf) {
 		JFFS2_WARNING("Can't allocate buffer for writing out summary information!\n");
@@ -49,7 +51,7 @@ void jffs2_sum_exit(struct jffs2_sb_info *c)
 
 	jffs2_sum_disable_collecting(c->summary);
 
-	vfree(c->summary->sum_buf);
+	kfree(c->summary->sum_buf);
 	c->summary->sum_buf = NULL;
 
 	kfree(c->summary);
@@ -665,7 +667,7 @@ crc_err:
 /* Write summary data to flash - helper function for jffs2_sum_write_sumnode() */
 
 static int jffs2_sum_write_data(struct jffs2_sb_info *c, struct jffs2_eraseblock *jeb,
-					uint32_t infosize, uint32_t datasize, int padsize)
+				uint32_t infosize, uint32_t datasize, int padsize)
 {
 	struct jffs2_raw_summary isum;
 	union jffs2_sum_mem *temp;
@@ -676,6 +678,26 @@ static int jffs2_sum_write_data(struct jffs2_sb_info *c, struct jffs2_eraseblock
 	int ret;
 	size_t retlen;
 
+	if (padsize + datasize > MAX_SUMMARY_SIZE) {
+		/* It won't fit in the buffer. Abort summary for this jeb */
+		jffs2_sum_disable_collecting(c->summary);
+
+		JFFS2_WARNING("Summary too big (%d data, %d pad) in eraseblock at %08x\n",
+			      datasize, padsize, jeb->offset);
+		/* Non-fatal */
+		return 0;
+	}
+	/* Is there enough space for summary? */
+	if (padsize < 0) {
+		/* don't try to write out summary for this jeb */
+		jffs2_sum_disable_collecting(c->summary);
+
+		JFFS2_WARNING("Not enough space for summary, padsize = %d\n",
+			      padsize);
+		/* Non-fatal */
+		return 0;
+	}
+
 	memset(c->summary->sum_buf, 0xff, datasize);
 	memset(&isum, 0, sizeof(isum));
 
@@ -821,7 +843,7 @@ int jffs2_sum_write_sumnode(struct jffs2_sb_info *c)
 {
 	int datasize, infosize, padsize;
 	struct jffs2_eraseblock *jeb;
-	int ret;
+	int ret = 0;
 
 	dbg_summary("called\n");
 
@@ -841,16 +863,6 @@ int jffs2_sum_write_sumnode(struct jffs2_sb_info *c)
 	infosize += padsize;
 	datasize += padsize;
 
-	/* Is there enough space for summary? */
-	if (padsize < 0) {
-		/* don't try to write out summary for this jeb */
-		jffs2_sum_disable_collecting(c->summary);
-
-		JFFS2_WARNING("Not enough space for summary, padsize = %d\n", padsize);
-		spin_lock(&c->erase_completion_lock);
-		return 0;
-	}
-
 	ret = jffs2_sum_write_data(c, jeb, infosize, datasize, padsize);
 	spin_lock(&c->erase_completion_lock);
 	return ret;
diff --git a/fs/jffs2/summary.h b/fs/jffs2/summary.h
index 8bf34f2..60207a2 100644
--- a/fs/jffs2/summary.h
+++ b/fs/jffs2/summary.h
@@ -13,6 +13,12 @@
 #ifndef JFFS2_SUMMARY_H
 #define JFFS2_SUMMARY_H
 
+/* Limit summary size to 64KiB so that we can kmalloc it. If the summary
+   is larger than that, we have to just ditch it and avoid using summary
+   for the eraseblock in question... and it probably doesn't hurt us much
+   anyway. */
+#define MAX_SUMMARY_SIZE 65536
+
 #include <linux/uio.h>
 #include <linux/jffs2.h>
 



-- 
dwmw2

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [patch 02/13] jffs2 summary allocation: don't use vmalloc()
  2008-07-31  8:48         ` David Woodhouse
@ 2008-07-31  9:09           ` David Woodhouse
  0 siblings, 0 replies; 14+ messages in thread
From: David Woodhouse @ 2008-07-31  9:09 UTC (permalink / raw)
  To: dedekind; +Cc: David Brownell, linux-mtd, trimarchimichael, jwboyer, akpm, rmk

On Thu, 2008-07-31 at 09:48 +0100, David Woodhouse wrote:
> And we don't have to worry too much about proving a lower bound on the
> size we'll want to use for the summary

Or indeed whether it's an upper bound or a lower bound we'd need to
calculate. More tea required...

-- 
dwmw2

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2008-07-31  9:09 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-07-30 19:34 [patch 02/13] jffs2 summary allocation: don't use vmalloc() akpm
2008-07-30 22:39 ` David Brownell
2008-07-30 22:46   ` Andrew Morton
2008-07-31  5:10     ` David Brownell
2008-07-31  5:37       ` Artem Bityutskiy
2008-07-31  6:57         ` David Brownell
2008-07-31  8:03           ` Artem Bityutskiy
2008-07-31  5:15   ` Artem Bityutskiy
2008-07-31  6:56     ` David Brownell
2008-07-31  8:00       ` Artem Bityutskiy
2008-07-31  8:48         ` David Woodhouse
2008-07-31  9:09           ` David Woodhouse
2008-07-31  7:33   ` David Woodhouse
2008-07-31  8:21     ` David Brownell

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox