From: Jörn Engel
Subject: Re: [Patch 15/18] fs/logfs/super.c
Date: Sun, 10 Jun 2007 19:38:29 +0200
Message-ID: <20070610173828.GA32619@lazybastard.org>
References: <20070603183845.GA8952@lazybastard.org>
	<20070603184927.GP8952@lazybastard.org>
	<200706101827.50948.arnd@arndb.de>
In-Reply-To: <200706101827.50948.arnd@arndb.de>
To: Arnd Bergmann
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-mtd@lists.infradead.org, akpm@osdl.org, Sam Ravnborg,
	John Stoffel, David Woodhouse, Jamie Lokier, Artem Bityutskiy, CaT,
	Jan Engelhardt, Evgeniy Polyakov, David Weinehall, Willy Tarreau,
	Kyle Moffett, Dongjun Shin, Pavel Machek, Bill Davidsen,
	Thomas Gleixner, Albert Cahalan, Pekka Enberg, Roland Dreier,
	Ondrej Zajicek, Ulisses Furquim

On Sun, 10 June 2007 18:27:49 +0200, Arnd Bergmann wrote:
> On Sunday 03 June 2007, Jörn Engel wrote:
> > +static int mtdwrite(struct super_block *sb, loff_t ofs, size_t len, void *buf)
> > +{
> > +	struct logfs_super *super = logfs_super(sb);
> > +	struct mtd_info *mtd = super->s_mtd;
> > +	struct inode *inode = super->s_dev_inode;
> > +	size_t retlen;
> > +	loff_t page_start, page_end;
> > +	int ret;
> > +
> > +	if (super->s_flags & LOGFS_SB_FLAG_RO)
> > +		return -EROFS;
> > +
> > +	BUG_ON((ofs >= mtd->size) || (len > mtd->size - ofs));
> > +	BUG_ON(ofs != (ofs >> super->s_writeshift) << super->s_writeshift);
> > +	BUG_ON(len > PAGE_CACHE_SIZE);
> > +	page_start = ofs & PAGE_CACHE_MASK;
> > +	page_end = PAGE_CACHE_ALIGN(ofs + len) - 1;
> > +	truncate_inode_pages_range(&inode->i_data, page_start, page_end);
> > +	ret = mtd->write(mtd, ofs, len, &retlen, buf);
> > +	if (ret || (retlen != len))
> > +		return -EIO;
> > +
> > +	return 0;
> > +}
>
> It seems that these functions are completely synchronous and afaics, the
> writes are called in the pdflush context, effectively blocking out operations
> on other file systems at the time. Not sure if this is something that
> should be fixed, but it seems to limit scalability on mtd backends.
>
> It seems that jffs2 has the same behaviour.

Most flash chips are synchronous.  Some support erase suspend - an erase
operation can be indefinitely suspended to give faster read and write
operations priority.  That's about it.

For the foreseeable future I believe synchronous operations are the
correct way to deal with raw flash chips.

> > +static int bdwrite(struct super_block *sb, loff_t to, size_t len, void *buf)
> > +{
> > +	struct block_device *bdev = logfs_super(sb)->s_bdev;
> > +	struct address_space *mapping = bdev->bd_inode->i_mapping;
> > +	struct page *page;
> > +	long index = to >> PAGE_SHIFT;
> > +	long offset = to & (PAGE_SIZE-1);
> > +	long copylen;
> > +
> > +	while (len) {
> > +		copylen = min((ulong)len, PAGE_SIZE - offset);
> > +
> > +		page = read_cache_page(mapping, index,
> > +				(filler_t*)mapping->a_ops->readpage, NULL);
> > +		if (!page)
> > +			return -ENOMEM;
> > +		if (IS_ERR(page))
> > +			return PTR_ERR(page);
> > +		lock_page(page);
> > +		memcpy(page_address(page) + offset, buf, copylen);
> > +		set_page_dirty(page);
> > +		unlock_page(page);
> > +		page_cache_release(page);
> > +
> > +		buf += copylen;
> > +		len -= copylen;
> > +		offset = 0;
> > +		index++;
> > +	}
> > +	return 0;
> > +}
>
> How about using submit_bio here instead of going to the page cache?
> That would avoid doubling all the memory consumption here.

That may make sense, yes.  "May", because there is no simple mapping
between physical data and logical data.  In ext3, everything is
block-aligned, usually to 4KiB == PAGE_SIZE.  So the exact same content
would exist in two pages.  In LogFS, data is compressed and
byte-aligned.  A bdev page can contain several full objects that after
uncompression get stored in one page each.

> > +/*
> > + * logfs_crash_dump - dump debug information to device
> > + *
> > + * The LogFS superblock only occupies part of a segment.  This function will
> > + * write as much debug information as it can gather into the spare space.
> > + */
> > +void logfs_crash_dump(struct super_block *sb)
> > +{
> > +	struct logfs_super *super = logfs_super(sb);
> > +	int i, blockno = 2, bs = sb->s_blocksize;
> > +	void *scratch = super->s_wblock[0];
> > +	void *stack = (void *) ((ulong)current & ~0x1fffUL);
> > +
> > +	/* all wbufs */
> > +	for (i=0; i
> > +		void *wbuf = super->s_area[i]->a_wbuf;
> > +		u64 ofs = sb->s_blocksize + i*super->s_writesize;
> > +		mtdwrite(sb, ofs, super->s_writesize, wbuf);
> > +	}
>
> shouldn't this use the ->write() function instead of hardcoding mtdwrite.

It should and I've already fixed it since sending the patches.

> > +module_init(logfs_init);
> > +module_exit(logfs_exit);
>
> You are missing at least a MODULE_LICENSE and should probably
> add the MODULE_AUTHOR and MODULE_DESCRIPTION tags as well.
> Right now, it's impossible to even load the module, because
> it uses a few GPL-only symbols.

Indeed.  Will do.

Jörn

-- 
A defeated army first battles and then seeks victory.
-- Sun Tzu