From: "Martin K. Petersen" <martin.petersen@oracle.com>
To: Ric Wheeler <ricwheeler@gmail.com>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>,
	Lukas Czerner <lczerner@redhat.com>,
	linux-ext4@vger.kernel.org, tytso@mit.edu, sandeen@redhat.com
Subject: Re: [PATCH] e2fsck: Discard free data and inode blocks.
Date: Fri, 22 Oct 2010 17:19:16 -0400
Message-ID: <yq1sjzyncx7.fsf@sermon.lab.mkp.net>
In-Reply-To: <4CC1D694.3040006@gmail.com> (Ric Wheeler's message of "Fri, 22 Oct 2010 14:23:16 -0400")

>>>>> "Ric" == Ric Wheeler <ricwheeler@gmail.com> writes:

Ric> Just to further confuse things, if we just want to zero a device,
Ric> there is the (relatively old) WRITE_SAME command that arrays
Ric> use. Note that it is quite a bit faster than doing this from the
Ric> server since you only transfer over one block of data and the disk
Ric> firmware does the rest - no data transfer for each block once you
Ric> start.

Ric> It can certainly take a long, long time, but would be faster than
Ric> zeroing a drive with write() system calls :)

I took some stabs at this in the spring. While it looked like a good
idea on paper, it turned out not to be a huge win unless the FC link was
heavily congested with traffic to other devices.

First of all, many drives have a cap on the maximum number of blocks
that can be written using one WRITE SAME command. Typically you can only
write 16-32 MB at a time. So I needed to have a bunch of magic to
scale down and retry while attempting to find the sweet spot.
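
To illustrate the scale-down-and-retry idea, here is a minimal sketch in
Python. It is not taken from the patches mentioned in this message. The
names find_write_same_limit, try_write_same and make_fake_device are
hypothetical, and the fake device merely stands in for a drive that
rejects any request above an unadvertised cap.

```python
def find_write_same_limit(try_write_same, start_blocks, min_blocks=1):
    """Halve the request size until the device accepts it.

    try_write_same(nblocks) is a hypothetical callback that issues one
    WRITE SAME of nblocks blocks and returns True on success.
    Returns the first accepted size, or 0 if nothing was accepted.
    """
    nblocks = start_blocks
    while nblocks >= min_blocks:
        if try_write_same(nblocks):
            return nblocks
        nblocks //= 2  # scale down and retry
    return 0


def make_fake_device(cap_blocks):
    """Stand-in for a drive with an unadvertised per-command cap."""
    def try_write_same(nblocks):
        return nblocks <= cap_blocks
    return try_write_same


if __name__ == "__main__":
    device = make_fake_device(cap_blocks=40000)
    print(find_write_same_limit(device, start_blocks=1 << 20))
```

Plain halving only lands on a size at or below the cap, not necessarily
on the cap itself, so a real implementation would probably want to
remember the result per device rather than probe on every request.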

Fred tried to convince T10 that it would be nice to have a field in the
block limits VPD that would indicate the max WRITE SAME blocks a device
supported. But T10 thought that was a bad idea and the proposal was
rejected. Otherwise I would have wired that up and we could have handled
generic WRITE SAME like we do the discard case.

The other problem is that the WRITE SAME may take a looong time. And so
we need special timeouts in place to prevent regular error handling from
kicking in while the drive is busy wiping stuff.

I guess we could just pick a number (16 MB, maybe) and define that as
the max. Picking a low number also has the benefit of being less likely
to interfere with timeouts.
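
A fixed cap like that amounts to splitting every request into bounded
pieces. The sketch below shows the arithmetic only. The function name
write_same_chunks, the 512-byte block size and the generator interface
are assumptions made for illustration, and nothing here talks to a real
device.

```python
def write_same_chunks(start_lba, nr_blocks, block_size=512,
                      max_bytes=16 * 1024 * 1024):
    """Yield (lba, nblocks) pieces no larger than max_bytes each.

    block_size and max_bytes are illustrative defaults: 512-byte
    logical blocks and the 16 MB cap floated in the text above.
    """
    max_blocks = max_bytes // block_size
    lba = start_lba
    remaining = nr_blocks
    while remaining > 0:
        n = min(remaining, max_blocks)
        yield lba, n
        lba += n
        remaining -= n


if __name__ == "__main__":
    for lba, n in write_same_chunks(0, 70000):
        print(lba, n)
```

With these defaults each piece is at most 32768 blocks, so the time any
single command can take is bounded by the cap rather than by the size of
the whole range being wiped.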

If there's interest I'll be happy to revisit my patches...

-- 
Martin K. Petersen	Oracle Linux Engineering

