public inbox for linux-ext4@vger.kernel.org
From: "Theodore Ts'o" <tytso@mit.edu>
To: "Artem S. Tashkinov" <aros@gmx.com>
Cc: Andreas Dilger <adilger@dilger.ca>, linux-ext4@vger.kernel.org
Subject: Re: A possible way to reduce free space fragmentation?
Date: Sun, 2 Feb 2025 15:15:57 -0500
Message-ID: <20250202154319.GB12129@macsyma-2.local>
In-Reply-To: <ba25991f-43ff-4412-8978-27ad8198e347@gmx.com>

On Sat, Feb 01, 2025 at 03:48:16PM +0000, Artem S. Tashkinov wrote:
> 
> 
> On 2/1/25 3:38 PM, Andreas Dilger wrote:
> > It should be possible to run "find $DIR -type f -size -1M | xargs e4defrag" to only defragment files below 1MB (or whatever you consider "small").
> 
> I have smaller files completely defragmented already.
> 
> The issue is a dozen 50-250MB files that span multiple extents (up to
> 30).

How big are the extents?  If you are performing large sequential
reads, a few seeks every few megabytes is really not a big deal from a
performance perspective, and it's certainly not worth the huge amount
of time that a perfect defragmentation would take (since that would
require moving smaller files out of the way to free up enough
contiguous space for a big file).
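
One way to put numbers on the extent-size question is to parse the
output of "filefrag -v" (from e2fsprogs) for the file in question. The
following is a minimal sketch, not a polished tool. It assumes the
usual "filefrag -v" column layout, and the helper names
(extent_sizes, seek_overhead) as well as the 10 ms seek time and
150 MB/s throughput defaults are illustrative assumptions, not
measurements.

```python
import re

# Matches an extent row of `filefrag -v`, e.g.
#    0:        0..    2047:      34816..     36863:   2048:
# and captures the extent length in blocks.
EXTENT_RE = re.compile(r"^\s*\d+:\s*\d+\.\.\s*\d+:\s*\d+\.\.\s*\d+:\s*(\d+):")
# Matches the "(N blocks of 4096 bytes)" part of the file size line.
BLOCK_RE = re.compile(r"blocks? of (\d+) bytes")


def extent_sizes(filefrag_output):
    """Return the size in bytes of each extent listed in `filefrag -v` output."""
    block_size = 4096  # fallback if the block size line is missing
    sizes = []
    for line in filefrag_output.splitlines():
        m = BLOCK_RE.search(line)
        if m:
            block_size = int(m.group(1))
            continue
        m = EXTENT_RE.match(line)
        if m:
            sizes.append(int(m.group(1)) * block_size)
    return sizes


def seek_overhead(sizes, seek_ms=10.0, mb_per_s=150.0):
    """Rough ratio of seek time to transfer time for one sequential read.

    Assumes one seek between each pair of consecutive extents; both
    defaults are assumed figures for a rotating disk.
    """
    transfer_s = sum(sizes) / (mb_per_s * 1e6)
    seek_s = max(len(sizes) - 1, 0) * seek_ms / 1000.0
    return seek_s / transfer_s if transfer_s else 0.0


# Hand-written sample in the `filefrag -v` layout (not real output).
SAMPLE = """Filesystem type is: ef53
File size of big.bin is 12582912 (3072 blocks of 4096 bytes)
 ext:     logical_offset:        physical_offset: length:   expected: flags:
   0:        0..    2047:      34816..     36863:   2048:
   1:     2048..    3071:     100352..    101375:   1024:      36864: last,eof
big.bin: 2 extents found
"""

if __name__ == "__main__":
    sizes = extent_sizes(SAMPLE)
    print("extent sizes (MiB):", [s / 2**20 for s in sizes])
    print("seek/transfer ratio:", round(seek_overhead(sizes), 3))
```

To use it on a real file, feed it the text produced by
"filefrag -v /path/to/file". The overhead figure depends entirely on
the assumed seek time and throughput, so substitute values that match
the actual device; on flash storage the seek term is close to zero.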

This is why Windows defraggers have mostly fallen out of favor, and
why no one has really found it worthwhile to invest more effort in
improving e4defrag (either the userspace program or the underlying
kernel infrastructure).

> > > ext4 has no free space defragmentation and at most you can use e4defrag
> > > to defragment individual files. I now have a 24GB ext4 filesystem that
> > > has only 7GB of space occupied however it has small files scattered all
> > > over it and now bigger files occupy more than one extent and I cannot
> > > reduce fragmentation to zero. One way to approach that would be to
> > > shrink the volume and then defragment it but that will involve a ton of
> > > disk writes and unnecessary wear and tear. Is it possible to modify the
> > > e4defrag utility to move small defragmented files, so that they are
> > > placed consecutively instead of being randomly spread all over the disk?

Anything is *possible*.  Whether anyone thinks it's worth their
development time is a different question.  Many years ago, at a
face-to-face ext4 developers' get-together, we had sketched out some
ideas for how we might do this.  It included ways to block certain
areas of the disk from being used for normal block allocation, and an
extended ext4-specific fallocate-like ioctl which e4defrag could use
to allocate blocks in a specific portion of the file system.

But no company has a business case where implementing this feature
would have a positive return on investment; no hobbyist has been
interested in doing it in their free time; and unfortunately, this is
too complicated a project for a Google Summer of Code, Outreachy, or
other intern project.

If you're interested, I'm happy to chat.  But basically, this is a
"patches are welcome; send us code and we'll be happy to review them"
sort of situation.

Cheers,

						- Ted
						



Thread overview: 4+ messages
2025-01-31 19:01 A possible way to reduce free space fragmentation? Artem S. Tashkinov
2025-02-01 15:38 ` Andreas Dilger
2025-02-01 15:48   ` Artem S. Tashkinov
2025-02-02 20:15     ` Theodore Ts'o [this message]
