From: Emil Larsson <d99-ela@nada.kth.se>
To: Valdis.Kletnieks@vt.edu
Cc: John Richard Moser <nigelenki@comcast.net>, reiserfs-list@namesys.com
Subject: Re: Interesting deletion idea
Date: Sat, 09 Oct 2004 08:43:49 +0200
Message-ID: <416788A5.7010301@nada.kth.se>
In-Reply-To: <200410090022.i990Mgcj001091@turing-police.cc.vt.edu>

Valdis.Kletnieks@vt.edu wrote:

>On Fri, 08 Oct 2004 19:52:14 EDT, John Richard Moser said:
>
>>I thought the DOD algorithm was 7 pass?
>
>Citation please?  If you have a better reference than DoD 5220.22-M,
>feel free to share it.
>
To the best of my knowledge, "DOD 7-pass" or similar expressions refer 
to the sequential use of the "e", "c" and "e" overwrite methods as 
described in DoD 5220.22-M / NISPOM 8-306. There is no basis for this in 
any official regulations or guidelines that I have been made privy to - 
it's just typical "more-must-be-better" thinking.

>>If this is going on rapidly, there's no point in trying to completely
>>destroy the disk for *every* logical operation; but buffering the
>>operations and then only doing the most recent one, and destroying the
>>area before that one exactly, would be OK.  The idea is that rapid
>>overwrites from userspace get collapsed into a single overwrite; and
>>then the kernel overwrites a bunch of times before flushing that data to
>>disk to securely erase it.
>
>The point is that you have no really good way to know beforehand that
>the flurry of writes is over, and it's time to collapse the writes into
>a single write.  
>
>To demonstrate using your example:
>
>FILE *a = fopen("/some/file.txt", "r+");
>fseek(a, 0, SEEK_SET);
>fputc('N', a);
>fseek(a, 0, SEEK_SET);
>fputc('D', a);
>fseek(a, 0, SEEK_SET);
>fputc('X', a);
>
>At what point do you do the overwrite?  You place it just before the
>fputc 'X' - but you can't really delay it until then, rather than doing
>it at the 'N' or 'D', unless you *know* that the 'X' one will happen
>'Soon Enough'.
>There's also the point that fputc() is stdio and buffered by default,
>unless you've called fflush() or setlinebuf() or similar.  Even if you
>look at the read()/write() syscall level, the Linux kernel will almost
>certainly automatically do most of the needed collapsing in the buffer
>cache code (look at fs/buffer.c for the gory details) - in fact, most
>of the time, you need to use fsync() or similar to *force* the data to
>actually get to the disk (often, the data doesn't go out until long after
>the process has actually exited - and then there's the different way
>that the different I/O elevators schedule things, just to add another
>layer of unpredictability into things).  The end result is that it's
>a lot harder than it looks to get this right...
>
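A minimal sketch of the buffering point above, assuming a POSIX system.
The helper name overwrite_first_byte() and its sync_each flag are
hypothetical, chosen for illustration only. It replays the 'N', 'D', 'X'
sequence with lseek()/write(); unless fsync() is called after each write,
nothing obliges the kernel to send the intermediate values to the disk.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write each byte of seq[0..n-1] to offset 0 of `path`,
 * one after another. If sync_each is nonzero, fsync() follows every write so
 * each value is pushed toward the device; otherwise the intermediate values
 * may only ever exist in the kernel's cache.
 * Returns 0 on success, -1 on error. */
int overwrite_first_byte(const char *path, const char *seq, size_t n,
                         int sync_each)
{
    assert(path != NULL && seq != NULL);

    int fd = open(path, O_WRONLY | O_CREAT, 0600);
    if (fd < 0)
        return -1;

    for (size_t i = 0; i < n; i++) {
        /* Rewind to the start of the file, then overwrite one byte. */
        if (lseek(fd, 0, SEEK_SET) == (off_t)-1 ||
            write(fd, &seq[i], 1) != 1) {
            close(fd);
            return -1;
        }
        if (sync_each && fsync(fd) != 0) {
            close(fd);
            return -1;
        }
    }

    /* Final fsync(): without it even the last byte may linger in the cache. */
    int rc = (fsync(fd) == 0) ? 0 : -1;
    if (close(fd) != 0)
        rc = -1;
    return rc;
}
```

Calling overwrite_first_byte("/some/file.txt", "NDX", 3, 0) leaves 'X' as
the first byte of the file; whether 'N' or 'D' ever reached the disk is up
to the cache and the I/O elevator, and even fsync() says nothing about a
drive's own write cache.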
>In addition, doing the overwrite at *THAT* point is *the wrong point* - as
>you're about to overwrite the block at least once *anyhow*.  You *really* need to
>be doing erasing in the handling for the unlink() and (f)truncate() syscalls,
>because *that* is the point you're freeing the disk blocks - and the point of
>erasing is to prohibit scavenging of old data off the disk.  This has the added
>benefit of being something you *can* do basically at the filesystem's leisure,
>subject to a requirement that you return blocks to the free list fast enough
>to prevent disk space exhaustion (which is trickier than it looks - under heavy
>file create/write/read/unlink loads, you need to be doing it as fast as possible
>at exactly the time you have the least idle bandwidth - at worst case, a 3-pass
>erase of all blocks will limit you to 25% of the effective write bandwidth in a
>steady-state high-load situation).
>
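The 25% figure follows from simple counting, sketched here under the
assumption that in steady state every block written with new data is
matched by one freed block that must be overwritten N times, so each
useful block costs 1 + N block writes. The function name is hypothetical.

```c
#include <assert.h>

/* Fraction of raw write bandwidth left for new data when every freed block
 * is overwritten erase_passes times before it is reused. Assumes a steady
 * state in which blocks are freed as fast as they are written. */
double effective_write_fraction(int erase_passes)
{
    assert(erase_passes >= 0);
    /* One write for the data itself plus erase_passes overwrites. */
    return 1.0 / (1.0 + (double)erase_passes);
}
```

With erase_passes = 3 this gives 1 / (1 + 3) = 0.25, the worst case quoted
above; with no erase passes it is 1.0.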
>Also, you *really* need to be *very* careful regarding write barriers and the
>like - look at the linux-kernel archives for the last few months, which contain
>a *long* series of threads about the problems on IDE.
>
>Basically, if the drive has a write cache on it, you have to either disable
>it or jump through some *real* hoops in order to get strictly correct write
>barrier semantics (and on some drives, the situation is totally impossible).
>
/Emil

