From: Artem Bityutskiy <dedekind@infradead.org>
To: David Brownell <david-b@pacbell.net>
Cc: linux-mtd@lists.infradead.org, trimarchimichael@yahoo.it,
jwboyer@gmail.com, dwmw2@infradead.org,
akpm@linux-foundation.org, rmk@arm.linux.org.uk
Subject: Re: [patch 02/13] jffs2 summary allocation: don't use vmalloc()
Date: Thu, 31 Jul 2008 11:00:58 +0300 [thread overview]
Message-ID: <1217491258.9432.20.camel@sauron> (raw)
In-Reply-To: <200807302356.29835.david-b@pacbell.net>
On Wed, 2008-07-30 at 23:56 -0700, David Brownell wrote:
> > So this is not just JFFS2. Using
> > kmalloc() for this does not seem to be a good idea for me, because
> > indeed the buffer size may be up to 512KiB, and may even grow at some
> > point to 1MiB.
>
> Yeah, nobody's saying kmalloc() is the right answer. The questions
> include who's going to change what, given that this part of the MTD
> driver interface has previously been unspecified.
>
> (DataFlash support has been around since 2003 or so; only *this year*
> did anyone suggest that buffers handed to it wouldn't work with the
> standard DMA mapping operations, and that came up in the context of
> a newish JFFS2 "summary" feature ...)
I've just glanced at JFFS2, and this sum_buf does not have to be of
eraseblock size. It should be something like a couple of NAND pages in
size, or, say, 5-10% of the eraseblock size. So I would say, in this
particular case JFFS2 may be fixed and kmalloc() may be used.
The idea of this summary stuff is to speed up mount time. JFFS2, while
writing to an EB, remembers information about the written nodes in
c->summary->sum_list_head. Then, when the eraseblock is close to being
full, it creates a summary node, which contains an array of information
about each node in this EB. This summary node is written to the end of
the eraseblock. When JFFS2 is mounted, it reads this summary node from
the end of the EB instead of scanning the whole EB, which speeds up
mounting.
Obviously, JFFS2 does not need an eraseblock-sized buffer for the
summary node. This can be fixed and the problem may be "forgotten" for
some period of time :-)
> Another perspective comes from looking at it bottom up, starting with
> what the various kinds of flash do.
>
> - NOR (which I'll assume for discussion is all CFI) hasn't previously
> considered DMA ... although the drivers/dma stuff might handle its
> memcpy on some platforms. (I measured it on a few systems and saw
> no performance wins however; IMO the interface overheads hurt it.)
>
> - NAND only does reads/writes of smallish pages ... in conjunction
> with hardware ECC, DMA can help (*) but that only uses small buffers.
Yeah, of NAND page size, which is 4KiB at most now, AFAIK. But it may
grow at some point.
> Some NAND drivers *do* use DMA ... Blackfin looks like it assumes
> the buffers are always in adjacent pages, fwiw, and PXA3 looks like
> it always uses a bounce buffer (not very speedy).
>
> - SPI (two drivers) often does writes of smaller pages than NAND, but
>   can read out the entire flash chip in a single operation. (Which is
>   handy for bootstrapping and suchlike.)
Yeah, it seems that if we just fix this sum_buf in JFFS2 then everyone
is going to be happy. And we may hope that someone will soon change the
MTD interfaces as well.
> Midlayers *could* use drivers/dma to shrink cpu memcpy costs, if
> they wanted. Not sure I'd advise it just now though ... just
> saying that more than the lowest levels could do DMA.
Yeah, you are right, I did not think about this. For UBIFS that could be
a good optimization, because profiling shows it spends a substantial
amount of time in memcpy().
> I suppose I'd rather see some mid-layer utilities offloading the
> DMA from the lower level drivers. It seems wrong to expect two
> drivers to do the same kind of virtual-buffer to physical-pages
> mappings. There's probably even a utility to do that, leaving
> just the task of using it when the lowest level driver (the one
> called by MTD-over-SPI drivers like m25p80/dataflash) does DMA.
Hmm, interesting idea. Is something like this used somewhere in the
kernel?
--
Best regards,
Artem Bityutskiy (Битюцкий Артём)
Thread overview: 14+ messages
2008-07-30 19:34 [patch 02/13] jffs2 summary allocation: don't use vmalloc() akpm
2008-07-30 22:39 ` David Brownell
2008-07-30 22:46 ` Andrew Morton
2008-07-31 5:10 ` David Brownell
2008-07-31 5:37 ` Artem Bityutskiy
2008-07-31 6:57 ` David Brownell
2008-07-31 8:03 ` Artem Bityutskiy
2008-07-31 5:15 ` Artem Bityutskiy
2008-07-31 6:56 ` David Brownell
2008-07-31 8:00 ` Artem Bityutskiy [this message]
2008-07-31 8:48 ` David Woodhouse
2008-07-31 9:09 ` David Woodhouse
2008-07-31 7:33 ` David Woodhouse
2008-07-31 8:21 ` David Brownell