Re: [PATCH] LogFS take three

linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: "Jörn Engel" <joern@lazybastard.org>
To: Pekka J Enberg <penberg@cs.helsinki.fi>
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-mtd@lists.infradead.org, akpm@osdl.org,
	Albert Cahalan <acahalan@gmail.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Jan Engelhardt <jengelh@linux01.gwdg.de>,
	Evgeniy Polyakov <johnpol@2ka.mipt.ru>, Greg KH <greg@kroah.com>,
	Ingo Oeser <ioe-lkml@rameria.de>
Subject: Re: [PATCH] LogFS take three
Date: Wed, 16 May 2007 14:26:06 +0200	[thread overview]
Message-ID: <20070516122603.GE5472@lazybastard.org> (raw)
In-Reply-To: <Pine.LNX.4.64.0705161319060.5152@sbz-30.cs.Helsinki.FI>

On Wed, 16 May 2007 13:21:11 +0300, Pekka J Enberg wrote:
> 
> > +#define LOGFS_BUG(sb) do {           \
> > +     struct super_block *__sb = sb;  \
> > +     logfs_crash_dump(__sb);         \
> > +     BUG();                          \
> > +} while(0)
> 
> Note that BUG() can be a no-op so dumping something on disk might not make 
> sense there. This seems useful, but you probably need to make this bit 
> more generic so that using BUG() proper in your filesystem code does the 
> right thing. Inventing your own wrapper should be avoided.

Hmm.  I am not sure how this could be generic and still make sense.

LogFS has some unused write-once space in the superblock segment.
Embedded people always have problems debugging and were suggesting using
this to store debug information.  That allows me to ask for a filesystem
image and get both the offending image plus a crash dump.  It also
allows me to abort mounting if I ever see an existing crash dump (not
implemented yet).  "First failure data capture" was an old IBM slogal
and the "first" really makes a difference.

So how should I deal with BUG() as a no-op?  I _think_ it should not
make a difference.  People choosing that option should know that their
kernels will continue running with errors and can potentially do any
kind of damage.  My wrapper doesn't seem to change that one way or
another.

What I _should_ do is make the filesystem read-only after this point.
That would actually be quite simple.  Onto my list.

> > +static inline struct logfs_super *LOGFS_SUPER(struct super_block *sb)
> > +{
> > +     return sb->s_fs_info;
> > +}
> > +
> > +static inline struct logfs_inode *LOGFS_INODE(struct inode *inode)
> > +{
> > +     return container_of(inode, struct logfs_inode, vfs_inode);
> > +}
> 
> No need for upper case in function names.

Will change.

> > +int logfs_memcpy(void *in, void *out, size_t inlen, size_t outlen)
> > +{
> > +     if (outlen < inlen)
> > +             return -EIO;
> > +     memcpy(out, in, inlen);
> > +     return inlen;
> > +}
> 
> Please drop this wrapper function. It's better to open-code the error
> handling in the callers (there are total of three of them).

Is it?  The advantage of having the wrapper is that the compression case
and the non-compressed case look identical and the code just repeats the
same pattern twice.

I'll try to open-code it and see how it looks.

> > +/* FIXME: combine with per-sb journal variant */
> > +static unsigned char compressor_buf[LOGFS_MAX_OBJECTSIZE];
> > +static DEFINE_MUTEX(compr_mutex);
> 
> This looks fishy. All reads and writes are serialized by compr_mutex 
> because they share a scratch buffer for compression and uncompression?

s/fishy/lame/

Will move to superblock.

> > +/* FIXME: all this mess should get replaced by using the page cache */
> > +static void fixup_from_wbuf(struct super_block *sb, struct logfs_area 
> *area,
> > +             void *read, u64 ofs, size_t readlen)
> > +{
> 
> Indeed. And I think you're getting some more trouble because of this... 

More trouble?

Sadly squeezing bugs out of that beast was easier than wrapping my brain
around the proper solution.  And now it works and has moved to a less
urgent position in my todo list.

> > +int logfs_segment_read(struct super_block *sb, void *buf, u64 ofs)
> > +{
> > +     struct logfs_object_header *h;
> > +     u16 len;
> > +     int err, bs = sb->s_blocksize;
> > +
> > +     mutex_lock(&compr_mutex);
> > +     err = wbuf_read(sb, ofs, bs + LOGFS_HEADERSIZE, compressor_buf);
> > +     if (err)
> > +             goto out;
> > +     h = (void*)compressor_buf;
> > +     len = be16_to_cpu(h->len);
> > +
> > +     switch (h->compr) {
> > +     case COMPR_NONE:
> > +             logfs_memcpy(compressor_buf + LOGFS_HEADERSIZE, buf, bs, 
> bs);
> > +             break;
> 
> Seems wasteful to first read the data in a scratch buffer and then 
> memcpy() it immediately for the COMPR_NONE case. Any reason why we can't 
> read a full struct page, for example, and simply use that if we don't need 
> to uncompress anything?

No technical reason.  Just a case of "make it work now and optimize
later".

> > +     case COMPR_ZLIB:
> > +             err = logfs_uncompress(compressor_buf + LOGFS_HEADERSIZE, 
> buf,
> > +                             len, bs);
> > +             BUG_ON(err);
> > +             break;
> 
> Not claiming to undestand your on-disk format, but wouldn't it make more 
> sense if we knew whether a given segment is compressed or not _before_ we 
> actually read it?

[ Objects are the units that get compressed.  Segments can contain both
compressed and uncompressed objects. ]

It is a trade-off.  Each object has a 24 Byte header plus X Bytes of
data.  Whether the data is compressed or not is indicated in the header.
So we have to do one of two things.  Either:
- read the complete object, then either uncompress or memcpy, or
- read just the header, then either read uncompressed data or read
  compressed data and uncompress.

The former does an unnecessary memcpy() in the uncompressed case.  The
latter does an extra IO operation.  I haven't done any benchmarks to see
which variant is faster on which hardware.

An advantage of the second approach is that no buffer and no mutex is
needed in the uncompressed case.  That could help read performance on an
SMP system.  I'll put it on my list, just not at the top.  Ok?

Jörn

-- 
When in doubt, punt.  When somebody actually complains, go back and fix it...
The 90% solution is a good thing.
-- Rob Landley

next prev parent reply	other threads:[~2007-05-16 12:26 UTC|newest]

Thread overview: 87+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-05-15 15:19 [PATCH] LogFS take three Jörn Engel
2007-05-15 15:21 ` Review status (Re: [PATCH] LogFS take three) Jörn Engel
2007-05-17 16:03   ` Evgeniy Polyakov
2007-05-17 17:10     ` Jörn Engel
2007-05-20 17:30       ` Evgeniy Polyakov
2007-05-23 12:58         ` Jörn Engel
2007-05-23 15:07           ` Evgeniy Polyakov
2007-05-23 15:14             ` Jörn Engel
2007-05-23 15:53               ` Evgeniy Polyakov
2007-05-15 18:37 ` [PATCH] LogFS take three Sam Ravnborg
2007-05-15 19:10   ` Jörn Engel
2007-05-15 19:07 ` John Stoffel
2007-05-15 19:19   ` Jörn Engel
2007-05-16  4:54     ` David Woodhouse
2007-05-16 11:09       ` Jörn Engel
2007-05-16 11:34         ` Jamie Lokier
2007-05-16 11:38           ` Artem Bityutskiy
2007-05-16 12:25             ` Jamie Lokier
2007-05-16 12:49               ` Jörn Engel
2007-05-16 11:50           ` Jörn Engel
2007-05-16 12:06             ` CaT
2007-05-17 17:07               ` Jan Engelhardt
2007-05-16 12:29             ` Evgeniy Polyakov
2007-05-16 12:55               ` Jörn Engel
2007-05-17 17:08                 ` Jan Engelhardt
2007-05-16 13:41             ` John Stoffel
2007-05-16 13:53               ` Jörn Engel
2007-05-16 14:04                 ` David Woodhouse
2007-05-16 14:13                   ` Artem Bityutskiy
2007-05-16 14:17                   ` Kevin Bowling
2007-05-17  9:12                     ` Pavel Machek
2007-05-17  9:52                       ` David Woodhouse
2007-05-19 15:00                     ` Bill Davidsen
2007-05-16 14:29                 ` CaT
2007-05-17 17:14                 ` Jan Engelhardt
2007-05-19 14:45                 ` David Weinehall
2007-05-19 16:17                   ` Jamie Lokier
2007-05-19 16:27                     ` Evgeniy Polyakov
2007-05-17 16:59         ` Jan Engelhardt
2007-05-17 11:39     ` Arnd Bergmann
2007-05-15 20:37 ` Andrew Morton
2007-05-16  0:06   ` Jörn Engel
2007-05-16  2:11     ` Jörn Engel
2007-05-16  5:22     ` Willy Tarreau
2007-05-16 11:23       ` Jörn Engel
2007-05-17 17:26     ` Jan Engelhardt
2007-05-17 17:45       ` Evgeniy Polyakov
2007-05-18  1:00         ` Kyle Moffett
2007-05-18  6:16           ` Jan Engelhardt
2007-05-16 12:07   ` David Woodhouse
2007-05-16 15:34     ` Andrew Morton
2007-05-16 15:49       ` David Woodhouse
2007-05-16 16:41         ` Jörn Engel
     [not found]           ` <7fe698080705162312t4e7ed90byd10ef8e664027b17@mail.gmail.com>
2007-05-17  6:25             ` David Woodhouse
2007-05-17  8:20               ` Dongjun Shin
2007-05-17  8:43                 ` David Woodhouse
2007-05-17 12:05                   ` Jörn Engel
2007-05-17 20:18                 ` Pavel Machek
2007-05-18  0:01                   ` Dongjun Shin
2007-05-18  6:17                     ` Jan Engelhardt
2007-05-18  6:47                       ` David Woodhouse
2007-05-19 15:18                 ` Bill Davidsen
2007-05-19  6:15   ` Rob Landley
2007-05-19  9:24     ` Jan Engelhardt
2007-05-19 17:26       ` Rob Landley
2007-05-15 23:26 ` Albert Cahalan
2007-05-16  0:09   ` Jörn Engel
2007-05-16 11:18   ` Jamie Lokier
2007-05-16 12:08     ` Pekka Enberg
2007-05-16 12:58       ` Jörn Engel
2007-05-16  2:37 ` Roland Dreier
2007-05-16 11:35   ` Jörn Engel
2007-05-16 10:21 ` Pekka J Enberg
2007-05-16 12:26   ` Jörn Engel [this message]
2007-05-16 12:36     ` Pekka Enberg
2007-05-16 12:38       ` Pekka Enberg
2007-05-16 13:20       ` Jörn Engel
2007-05-17 20:58         ` Pekka Enberg
2007-05-17 21:36           ` Arnd Bergmann
2007-05-17 21:50             ` Jörn Engel
2007-05-16 19:17 ` Pavel Machek
2007-05-16 19:23   ` Jörn Engel
2007-05-17 15:08 ` Arnd Bergmann
2007-05-17 20:21   ` Jörn Engel
2007-05-17 21:00     ` Arnd Bergmann
2007-05-17 21:30       ` Jörn Engel
2007-05-17 22:01     ` Jamie Lokier

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20070516122603.GE5472@lazybastard.org \
    --to=joern@lazybastard.org \
    --cc=acahalan@gmail.com \
    --cc=akpm@osdl.org \
    --cc=greg@kroah.com \
    --cc=ioe-lkml@rameria.de \
    --cc=jengelh@linux01.gwdg.de \
    --cc=johnpol@2ka.mipt.ru \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mtd@lists.infradead.org \
    --cc=penberg@cs.helsinki.fi \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).