public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Andrew Morton <akpm@osdl.org>
To: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Cc: linux-kernel@vger.kernel.org
Subject: Re: [EXPERIMENT,RFC] FAT: Add "flush" option for hotplug devices
Date: Thu, 20 Oct 2005 13:50:14 -0700	[thread overview]
Message-ID: <20051020135014.2289fa01.akpm@osdl.org> (raw)
In-Reply-To: <871x2gf8f5.fsf@devron.myhome.or.jp>

OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> wrote:
>
> Hi,
> 
> This adds new "flush" option on experiment for hotplug devices.
> 
> Current implementation of "flush" option does,
> 
> 	- synchronizing data pages at ->release() (last close(2))
> 	- if user's work seems to be done (fs is not active), all
> 	  metadata syncs by pdflush()

Seems like a sensible thing to do.

> This option would provide kind of sane progress, and dirty buffers is
> flushed more frequently (if fs is not active).

Your implementation doesn't really do this.  bdi_write_congested() only
returns true if the device is super-busy.  To determine whether it's "not
active" we'd need to peek at the queue's disk_stat accounting, or at the
queue's outstanding read/write requests.  We covered this a couple of weeks
ago in the context of Con's swap prefetch work.

>  This option doesn't
> provide any robustness (robustness is provided by other options), but
> probably the option is proper for hotplug devices.

Well...  It does a full fsync_super() - that's pretty robust.

> +EXPORT_SYMBOL(filemap_write_and_wait);

_GPL please.

> +EXPORT_SYMBOL(fsync_super);

Ditto

> +/*
> + * Copyright (C) 2005, OGAWA Hirofumi
> + * Released under GPL v2.
> + */
> +
> +#include <linux/fs.h>
> +#include <linux/blkdev.h>
> +#include <linux/writeback.h>
> +#include <linux/msdos_fs.h>
> +
> +int fat_sync_fdata(struct inode *inode, struct file *filp)
> +{
> +	int err = 0;
> +
> +	if (filp->f_mode & FMODE_WRITE) {
> +#if 1
> +		current->flags |= PF_SYNCWRITE;
> +		err = filemap_write_and_wait(inode->i_mapping);
> +		current->flags &= ~PF_SYNCWRITE;
> +#else
> +		down(&inode->i_sem);
> +#if 1
> +		err = generic_osync_inode(inode, inode->i_mapping, OSYNC_DATA);
> +#else
> +		err = filp->f_op->fsync(filp, filp->f_dentry, 1);
> +#endif
> +		up(&inode->i_sem);
> +#endif
> +	}
> +	return err;
> +}

Can't we just split up do_fsync() a bit and use that?

> +static void fat_pdflush_handler(unsigned long arg)
> +{
> +	struct super_block *sb = (struct super_block *)arg;
> +	fsync_super(sb);
> +}

It would be nice if /proc/sys/vm/dirty_writeback_centisecs was a per-fs
thing.   That's non-trivial.

> +static void fat_flush_timer(unsigned long data)
> +{
> +	struct super_block *sb = (struct super_block *)data;
> +	struct msdos_sb_info *sbi = MSDOS_SB(sb);
> +	struct backing_dev_info *bdi = blk_get_backing_dev_info(sb->s_bdev);
> +	unsigned long last_flush_jiff;
> +
> +	if (bdi_write_congested(bdi)) {

As indicated above, this won't be very effective.

> +		mod_timer(&sbi->flush_timer, jiffies + (HZ / 10));
> +		return;
> +	}
> +
> +	last_flush_jiff = sbi->last_flush_jiff;
> +
> +	if (!time_after_eq(jiffies, last_flush_jiff + (HZ / 2))) {
> +		mod_timer(&sbi->flush_timer, last_flush_jiff + (HZ / 2));
> +		return;
> +	}

What's the above doing?

> +	if (pdflush_operation(fat_pdflush_handler, (unsigned long)sb) < 0)
> +		mod_timer(&sbi->flush_timer, jiffies + HZ);
> +}
> +
> +void __fat_mark_flush(struct super_block *sb)
> +{
> +	struct msdos_sb_info *sbi = MSDOS_SB(sb);
> +
> +	sbi->last_flush_jiff = jiffies;
> +	/*
> +	 * make sure by smb_wmb() that dirty buffers before here is
> +	 * processed at the timer routine.
> +	 */
> +	smp_wmb();
> +
> +	if (!timer_pending(&sbi->flush_timer))
> +		mod_timer(&sbi->flush_timer, jiffies + HZ);
> +}
> +EXPORT_SYMBOL(__fat_mark_flush);

_GPL?

> +void fat_flush_stop(struct super_block *sb)
> +{
> +	del_timer_sync(&MSDOS_SB(sb)->flush_timer);
> +}

whoops, the pdflush_operation could still be in progress.

To avoid umount races I think the pdflush callback is going to need to take
sb_lock, increment s_count, take ->s_umount, test ->s_root.  Like, for
example, __sync_inodes.

> +void fat_flush_init(struct super_block *sb)
> +{
> +	struct msdos_sb_info *sbi = MSDOS_SB(sb);
> +	init_timer(&sbi->flush_timer);
> +	sbi->flush_timer.data = (unsigned long)sb;
> +	sbi->flush_timer.function = fat_flush_timer;

-mm has setup_timer()

> +
> +static inline void fat_mark_flush(struct super_block *sb)
> +{
> +	if (MSDOS_SB(sb)->options.flush)
> +		__fat_mark_flush(sb);
> +}

It'd be nice to make this a more generic thing, so other filesystems can
use it without copying lots of code.

> +		case Opt_flush:

MS_FLUSH?   I added MS_DIRSYNC a few years ago - it wasn't too complex.



  reply	other threads:[~2005-10-20 20:50 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2005-10-20 16:41 [EXPERIMENT,RFC] FAT: Add "flush" option for hotplug devices OGAWA Hirofumi
2005-10-20 20:50 ` Andrew Morton [this message]
2005-10-20 21:08   ` running linux TCP/IP stack on a real slow machine Kallol Biswas
2005-10-22  7:45   ` [EXPERIMENT,RFC] FAT: Add "flush" option for hotplug devices OGAWA Hirofumi
2005-10-22 10:15 ` Christoph Hellwig
2005-10-22 10:45   ` OGAWA Hirofumi
2005-10-22 10:56     ` Christoph Hellwig
2005-10-22 11:24       ` OGAWA Hirofumi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20051020135014.2289fa01.akpm@osdl.org \
    --to=akpm@osdl.org \
    --cc=hirofumi@mail.parknet.co.jp \
    --cc=linux-kernel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox