linux-btrfs.vger.kernel.org archive mirror
* "No space left on device"
@ 2013-11-14 19:54 Leonidas Spyropoulos
  2013-11-14 21:00 ` Hugo Mills
  0 siblings, 1 reply; 4+ messages in thread
From: Leonidas Spyropoulos @ 2013-11-14 19:54 UTC (permalink / raw)
  To: linux-btrfs@vger.kernel.org

Hello,

I've been following this list for years, and I see this message come up
in various situations. Sometimes it's a genuine problem: there really
isn't enough space. In other cases it's a by-product of something else.
I have seen this error personally on a broken system (and I never
figured out what had happened).
I know btrfs is still experimental, but I just want to make sure my
expectations are not out of sync with everyone else's.

As an end user, when I see an error like this, the first thing I do is
check the free space (using the 'df' command) [1]. If I see more than 7%
free I usually assume it's OK (depending on the size of the partition as
well).

- Is this assumption unreasonable on a btrfs filesystem? Is there a
formula to calculate how much space btrfs _might_ need?
- It's probably not your job, but can df report correct sizes for btrfs?
I've seen some threads about showing the actual space occupied by data
and/or metadata with the btrfs command. Can we expect this to someday be
incorporated into the df command?
- In cases where btrfs reports this error but something else is causing
it, can we expect better error handling from btrfs, so the end user is
pointed in the right direction?

[1]: one could argue that an end user should use the btrfs commands
instead, but let's leave that for now.

Apologies if these have already been answered or are already on the
roadmap.

Thanks in advance; your comments are appreciated.

Kind regards,
Leonidas



* Re: "No space left on device"
  2013-11-14 19:54 "No space left on device" Leonidas Spyropoulos
@ 2013-11-14 21:00 ` Hugo Mills
  2013-11-15  9:25   ` Duncan
  0 siblings, 1 reply; 4+ messages in thread
From: Hugo Mills @ 2013-11-14 21:00 UTC (permalink / raw)
  To: Leonidas Spyropoulos; +Cc: linux-btrfs@vger.kernel.org

On Thu, Nov 14, 2013 at 07:54:21PM +0000, Leonidas Spyropoulos wrote:
> Hello,
> 
> I've been following this list for years, and I see this message come up
> in various situations. Sometimes it's a genuine problem: there really
> isn't enough space. In other cases it's a by-product of something else.
> I have seen this error personally on a broken system (and I never
> figured out what had happened).
> I know btrfs is still experimental, but I just want to make sure my
> expectations are not out of sync with everyone else's.
> 
> As an end user, when I see an error like this, the first thing I do is
> check the free space (using the 'df' command) [1]. If I see more than 7%
> free I usually assume it's OK (depending on the size of the partition as
> well).

   The problem is that there are two kinds of space (data and metadata),
and either can run out. The FS won't, currently, attempt to reallocate
space from one to the other -- hence the recommendation for a filtered
balance to fix it (one of the side-effects of a balance is that it can
free up unused or little-used allocation chunks).
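
   A sketch of such a filtered balance, for illustration (the mount
point and the 5% usage threshold are examples, not recommendations):

   # rewrite only data chunks that are at most 5% used
   $ sudo btrfs balance start -dusage=5 /mnt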

> - Is this assumption unreasonable on a btrfs filesystem? Is there a
> formula to calculate how much space btrfs _might_ need?

   Not really. I'd expect to need something in the range 250-1500 MiB
of headroom, depending on the size of the filesystem (and on the size
of the metadata).

> - It's probably not your job, but can df report correct sizes for btrfs?
> I've seen some threads about showing the actual space occupied by data
> and/or metadata with the btrfs command. Can we expect this to someday be
> incorporated into the df command?

   Sadly, no. The POSIX API for df doesn't contain enough information
to give an accurate representation of the space on the FS.
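
   For comparison (hypothetical, simplified numbers): df collapses
everything into a single free-space figure, while btrfs fi df reports
the data and metadata pools separately -- here the metadata pool is
nearly full even though df still shows plenty of room:

   $ df -h /mnt
   Filesystem      Size  Used Avail Use% Mounted on
   /dev/sda1       500G  470G   10G  95% /mnt

   $ btrfs fi df /mnt
   Data, single: total=480.00GiB, used=469.50GiB
   Metadata, DUP: total=5.00GiB, used=4.98GiB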

> - In cases where btrfs reports this error but something else is causing
> it, can we expect better error handling from btrfs, so the end user is
> pointed in the right direction?

   Again, we're constrained by the POSIX API here -- we only have
ENOSPC to represent the error condition. I suspect the best we could
do (possibly) is print something in dmesg, where users don't tend to
look anyway.

   There's a future project on the books to make the FS attempt to
recover unused or little-used chunks, which should reduce the odds of
the ENOSPC showing up in an "unexpected" way (i.e. running out of
metadata).

   Hugo.

> [1]: one could argue that an end user should use the btrfs commands
> instead, but let's leave that for now.
> 
> Apologies if these have already been answered or are already on the
> roadmap.
> 
> Thanks in advance; your comments are appreciated.
> 
> Kind regards,
> Leonidas
> 

-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
  PGP key: 65E74AC0 from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
                 --- emacs: Eats Memory and Crashes. ---                 


* Re: "No space left on device"
  2013-11-14 21:00 ` Hugo Mills
@ 2013-11-15  9:25   ` Duncan
  0 siblings, 0 replies; 4+ messages in thread
From: Duncan @ 2013-11-15  9:25 UTC (permalink / raw)
  To: linux-btrfs

Hugo Mills posted on Thu, 14 Nov 2013 21:00:56 +0000 as excerpted:

>> Is there a formula to calculate how much space btrfs _might_ need?
> 
> Not really. I'd expect to need something in the range 250-1500 MiB of
> headroom, depending on the size of the filesystem (and on the size of
> the metadata).

As a somewhat more concrete answer...

While recently doing a bit of research on something else, I came across 
comments that on a large enough filesystem, data chunks default to 1 GiB, 
while metadata chunks default to 256 MiB.

And we know that data mode defaults to SINGLE, while metadata mode 
defaults to DUP.

So on a default single-device btrfs of several gigs or more, assuming 
the files being manipulated are under 1 GiB in size, keeping an 
unallocated space reserve of 1.5 GiB should be reasonable.  That's enough 
unallocated space to allocate one more 1 GiB data chunk, plus one more 
256 MiB metadata chunk, doubled to half a GiB due to DUP mode.  Obviously 
in the single-mode-metadata case, the metadata requirement would be only 
a single copy, so 256 MiB for it: 1.25 GiB total unallocated, minimum.
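
As a back-of-envelope check of that arithmetic (shell arithmetic, values 
in MiB):

$ echo $(( 1024 + 2 * 256 ))   # one data chunk plus one DUP metadata chunk
1536

That's the 1.5 GiB figure above; drop the factor of two for single-mode 
metadata and you get 1280 MiB, i.e. 1.25 GiB.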

btrfs filesystem show is the command used to see what your allocated 
space for a filesystem looks like, per device.  However, it doesn't 
report UNALLOCATED space, only size and used (a.k.a. allocated), so an 
admin must do the math to figure out the unallocated space.
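
For example, given a hypothetical devid line like:

        devid    1 size 500.00GiB used 498.00GiB path /dev/sda1

unallocated is 500 - 498 = 2 GiB, comfortably above the 1.5 GiB reserve 
computed above.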

If the files being manipulated are over a gig in size, round up to the 
nearest whole GiB for the data, and add another half GiB to cover the 
quarter-gig DUP metadata case.
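
E.g. for a hypothetical 2.4 GiB file:

$ echo $(( 3 * 1024 + 2 * 256 ))   # 3 GiB data, rounded up, plus DUP metadata
3584

so keep roughly 3.5 GiB unallocated to be safe.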

If the filesystem is under a gig in size, btrfs defaults to mixed 
data+metadata, with chunks of 256 MiB if there's space, but apparently 
rather more flexibility in order to better utilize all available 
space.  At such "small" sizes [1], full allocation with nothing left to 
allocate is common, but one does hope that people using filesystems of 
that size have a good idea of what will be going on them, and won't /
need/ to allocate further chunks after the initial filesystem 
population.  And quite in contrast to the multi-TB filesystems, 
rebalancing such a filesystem in order to recover lost space should be 
relatively fast even on spinning rust.

For filesystems of 1 GiB up to, say, 10 GiB, it's a more open question, 
although at that size there's still a rather good chance that the 
sysadmin has a reasonably good idea what's going on the filesystem and 
has planned accordingly, with some "reasonable" level of over-allocation 
for future-proofing and plan fuzziness.  Rebalances should still 
complete in reasonable time as well, so it shouldn't be a /huge/ problem 
unless the admin simply isn't tracking the situation.

The multi-device situation adds another dimension.  Apparently, except 
for single mode, btrfs at this point only ever allocates in pairs (plus 
raid5/6 parity chunks if applicable, and pairs of pairs in raid10 
mode), regardless of the number of devices available, which does 
simplify calculations to some degree.

Btrfs' multi-device default (for >1 GiB per-device sizes, anyway) is 
single data, raid1 metadata.  So to reserve space for one chunk of either 
type, we'd need at least 1 GiB unallocated on ONE device to allow at 
least one single-mode data chunk allocation, PLUS at least 256 MiB 
unallocated on each of TWO devices to cover at least one raid1-mode 
metadata chunk allocation.  Thus, with two devices, we'd require at least 
1.25 GiB free/unallocated on one device (1 GiB data chunk plus one copy 
of the 256 MiB metadata chunk), and 256 MiB on the other (the second copy 
of the metadata).  On a filesystem of three or more devices, either that 
works, OR 256 MiB on each of two devices (for the raid1 metadata) plus 
1 GiB on a third (for the data).

For raid1 data, the 1 GiB data chunks must have two copies, each on its 
own device, and the above multi-device default scenario modifies 
accordingly.  2-device case: 1.25 GiB minimum unallocated on each device 
(one copy each of a data and a metadata chunk).  3-device case: that, OR 
1.25/1.0/0.25 GiB.  4-or-more-device case: either of those, or 
1.0/1.0/0.25/0.25 GiB.

For single metadata plus default single data, we're back to the 1.25 GiB 
total case, in two separate chunks of 1 GiB and 256 MiB, either on 
separate devices or the same device.
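
Summarizing the cases above (minimum unallocated space needed to permit 
one more allocation of each chunk type, default chunk sizes assumed):

  data/metadata profile      minimum unallocated
  single/DUP    (1 device)   1.5 GiB
  single/single (1 device)   1.25 GiB
  single/raid1  (2 devices)  1.25 GiB + 0.25 GiB
  raid1/raid1   (2 devices)  1.25 GiB + 1.25 GiB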

I haven't personally played with the raid0 case as it doesn't fit my use-
case, but the wiki documentation suggests that it still allocates chunks 
only in pairs, striping the data/metadata across the pair.  So we're 
looking at a minimum of 1 GiB on each of two separate devices for a raid0 
data chunk allocation (which would then hold two gigs of data), and a 
minimum of 256 MiB on each of two separate devices for a raid0 metadata 
chunk allocation (which would hold half a gig of metadata).  Permutations 
are, as they say, "left as an exercise for the reader."  =:^)

Apparently raid10 mode is pairs of pairs, so it allocates in sets of 
four.  Metadata: 256 MiB on each of four separate devices, for 512 MiB 
of metadata capacity.  Data: 1 GiB on each of four separate devices, 
holding 2 GiB worth of data.  Again, permutations "left as an exercise 
for the reader."

Finally, there's the mixed data/metadata chunk mode that's the default on 
<1 GiB filesystems.  Default chunk sizes there are 256 MiB, with the same 
pair-allocation rules for multi-device filesystems as above.  But as 
discussed under the single device case, these filesystems are often 
capacity-planned and fully allocated from the beginning, with no further 
chunk allocation necessary once the filesystem is populated.

That leaves raid5/6.  With the caveat that these raid modes aren't yet 
ready for normal use (even more so than the still-experimental btrfs as 
a whole, where good backups are STRONGLY RECOMMENDED; with raid5/6 mode, 
REALLY expect your data to be eaten for breakfast, so do NOT use it in 
its present form for anything but temporary testing!)...

raid5 should work like raid0 above, but requires one more device's chunk 
reserved for the raid5 parity, thus reserving in threes with no 
additional capacity over raid0.  raid6 is the same but with yet another 
chunk reserved, thus reserving in fours.  Again, permutations "left as 
an exercise for the reader."

Presumably raid50/60 will be possible with little change in the code once 
raid5/6 stabilize, since it's a logical combination with raid0, with the 
required parallel chunk reservation 6 and 8 devices wide respectively, 
but AFAIK, that's not even supported at all yet, and even if it is, it's 
hardly worth trying since the raid5/6 component remains so highly 
unstable at this point.

And of course there's N-way mirroring on the roadmap as well, but 
implementation remains some way out, beyond raid5/6 normalization.  When 
it comes, its parallel chunk reservation characteristics can be predicted 
based on the raid1 discussion above, extended from it by multiplying by 
the N in the N-way mirroring, instead of by a hard-coded two, as done in 
the current raid1 case.  (This is actually a case I'm strongly interested 
in, 3-way-mirroring, perhaps even in the raid10 variant thus requiring 
six devices minimum, but given btrfs history to date and current progress 
on raid5/6, I don't expect to see it in anything like normalized form 
until well into next year, perhaps a year from now, at the earliest.)

---
[1] Re < 1 GiB being "small", I still can't help but think of my first 
computer when I mention that, a 486-class machine with a 130 MB (128 MiB 
or some such, half the size of my /boot and 1/128th the size of my main 
memory, today!) hard drive, and that was early 90s, so while I've a bit 
of computer experience I'm still a relative newcomer compared to many in 
the *ix community.  It was several disk upgrade generations later when I 
got my first gig-sized drive, and it sure didn't seem "small" at the 
time!  My how times do change!

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



* "No space left on device"
@ 2017-10-06  7:10 Nick Gilmour
  0 siblings, 0 replies; 4+ messages in thread
From: Nick Gilmour @ 2017-10-06  7:10 UTC (permalink / raw)
  To: linux-btrfs

Hi all,

I'm getting a "No space left on device" error on a VM in VirtualBox.
It started when I was trying to convert the .vdi to .img. I wanted to
shrink the size of the disk first, and I followed the steps from here:
https://superuser.com/questions/529149/how-to-compact-virtualboxs-vdi-file-size#529183

and then I got the error:
$ dd if=/dev/zero of=/tmp/bigemptyfile bs=4096k
dd: error writing '/tmp/bigemptyfile': No space left on device
94174+0 records in
94173+0 records out
394990190592 bytes (395 GB, 368 GiB) copied, 685.984 s, 576 MB/s
$ rm /tmp/bigemptyfile

After I rebooted I got only a terminal prompt, no desktop environment,
and I couldn't start one with startx (even bash completion doesn't
work).

I have tried balancing, following the steps from here:
https://unix.stackexchange.com/questions/174446/btrfs-error-error-during-balancing-no-space-left-on-device

but I keep getting:
"Done, had to relocate 0 out of XX chunks"
no matter how far I increase the dusage parameter.
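
The steps from that page amount to something like the following,
repeated with increasing usage values (mount point illustrative):

$ sudo btrfs balance start -dusage=0 /
$ sudo btrfs balance start -dusage=5 /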


Debug information (copy & paste was not possible, so some text is
missing...):

uname -a
Linux VM-Ubuntu 4.4.0-83-generic

btrfs --version
btrfs-progs v4.4

btrfs fi show
Label: none  uuid: x
        Total devices 1 FS bytes used 473.68GiB
        devid    1 size 492.00GiB used 492.00GiB path /dev/sda1

Label: 'extra'  uuid: y
        Total devices 1 FS bytes used 112.00KiB
        devid    1 size 100.00GiB used 2.02GiB path /dev/sdb1

btrfs fi df /home
Data, single: total=462.23GiB, used=462.23GiB
System, DUP: total=8.00MiB, used=80.00KiB
GlobalReserve, single: total=512.00MiB, used=160.00KiB

dmesg > dmesg.log
dmesg: write failed: No space left on device
dmesg: write error


Any ideas how I can fix this?
Thanks.

Regards,
Nick

