* EXT4 ENOSPC Bug
@ 2008-11-29 13:18 Andres Freund
2008-11-29 20:32 ` Jonathan Bastien-Filiatrault
` (3 more replies)
0 siblings, 4 replies; 39+ messages in thread
From: Andres Freund @ 2008-11-29 13:18 UTC (permalink / raw)
To: Theodore Tso, adilger; +Cc: LKML, linux-ext4
[-- Attachment #1: Type: text/plain, Size: 2795 bytes --]
Hi Ted, Hi Andreas, hi all,
On a testsystem (spare laptop) with ext4 as root filesystem sometimes the
system starts to return ENOSPC to all write/create syscalls.
Sometimes the system runs without problems, at other times it starts having
problems soon after boot.
A reboot resolves the problem temporarily.
I don't see a specific usage triggering the problem.
Deleting some files sometimes allows the creation (just touch $unused_filename)
of some files, but not many.
Anything I can do to help you to debug the problem?
Andres
Kernel is: 2.6.28-rc6-andres-00007-ged31348
# df /
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/mapper/root_crypt
302855628 110032848 192822780 37% /
# df -i /
Filesystem Inodes IUsed IFree IUse% Mounted on
/dev/mapper/root_crypt
19234816 786691 18448125 5% /
# tune2fs -l /dev/mapper/root_crypt
tune2fs 1.41.3 (12-Oct-2008)
Filesystem volume name: root
Last mounted on: <not available>
Filesystem UUID: b81f5bab-9544-45a9-8f15-aa7012ae2522
Filesystem magic number: 0xEF53
Filesystem revision #: 1 (dynamic)
Filesystem features: has_journal ext_attr resize_inode dir_index filetype
needs_recovery extent flex_bg sparse_super large_file huge_file uninit_bg
dir_nlink extra_isize
Filesystem flags: signed_directory_hash
Default mount options: (none)
Filesystem state: clean
Errors behavior: Continue
Filesystem OS type: Linux
Inode count: 19234816
Block count: 76921099
Reserved block count: 0
Free blocks: 48205695
Free inodes: 18448125
First block: 0
Block size: 4096
Fragment size: 4096
Reserved GDT blocks: 1005
Blocks per group: 32768
Fragments per group: 32768
Inodes per group: 8192
Inode blocks per group: 512
Flex block group size: 16
Filesystem created: Tue Sep 30 11:57:33 2008
Last mount time: Thu Nov 27 14:30:58 2008
Last write time: Thu Nov 27 14:30:58 2008
Mount count: 116
Maximum mount count: 200
Last checked: Mon Oct 6 13:54:09 2008
Check interval: 31104000 (12 months)
Next check after: Thu Oct 1 13:54:09 2009
Reserved blocks uid: 0 (user root)
Reserved blocks gid: 0 (group root)
First inode: 11
Inode size: 256
Required extra isize: 28
Desired extra isize: 28
Journal inode: 8
First orphan inode: 288694
Default directory hash: tea
Directory Hash Seed: d0501622-2c99-4b0d-9a0c-d6b6705ff4f2
Journal backup: inode blocks
[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 197 bytes --]
^ permalink raw reply [flat|nested] 39+ messages in thread* Re: EXT4 ENOSPC Bug 2008-11-29 13:18 EXT4 ENOSPC Bug Andres Freund @ 2008-11-29 20:32 ` Jonathan Bastien-Filiatrault 2008-11-29 21:15 ` Theodore Tso ` (2 subsequent siblings) 3 siblings, 0 replies; 39+ messages in thread From: Jonathan Bastien-Filiatrault @ 2008-11-29 20:32 UTC (permalink / raw) To: Andres Freund; +Cc: Theodore Tso, adilger, LKML, linux-ext4 Andres Freund wrote: > Hi Ted, Hi Andreas, hi all, > > On a testsystem (spare laptop) with ext4 as root filesystem sometimes the > system starts to return ENOSPC to all write/create syscalls. > Sometimes the system runs without problems, at other times it starts having > problems soon after boot. > A reboot resolves the problem temporarily. I have had the same problem. > > I don't see a specific usage triggering the problem. Using Vuze(formerly azureus) seemed to trigger the problem. My system can run fine for days before it happens. > > Deleting some files sometimes allows the creation (just touch $unused_filename) > of some files, but not many. > > Anything I can do to help you to debug the problem? > > Andres I do not have much more info to give you, this is mostly a "me too". Thanks, Jonathan ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: EXT4 ENOSPC Bug 2008-11-29 13:18 EXT4 ENOSPC Bug Andres Freund 2008-11-29 20:32 ` Jonathan Bastien-Filiatrault @ 2008-11-29 21:15 ` Theodore Tso 2008-11-29 21:31 ` Andres Freund 2008-12-01 12:34 ` Andres Freund 2008-12-10 0:07 ` Andres Freund 3 siblings, 1 reply; 39+ messages in thread From: Theodore Tso @ 2008-11-29 21:15 UTC (permalink / raw) To: Andres Freund; +Cc: adilger, LKML, linux-ext4 On Sat, Nov 29, 2008 at 02:18:19PM +0100, Andres Freund wrote: > Hi Ted, Hi Andreas, hi all, > > On a testsystem (spare laptop) with ext4 as root filesystem sometimes the > system starts to return ENOSPC to all write/create syscalls. > Sometimes the system runs without problems, at other times it starts having > problems soon after boot. Hmm.... Is there anything interesting log messages relating to ext4 or jbd2 showing up in /var/log/messages or in dmesg? - Ted ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: EXT4 ENOSPC Bug 2008-11-29 21:15 ` Theodore Tso @ 2008-11-29 21:31 ` Andres Freund 0 siblings, 0 replies; 39+ messages in thread From: Andres Freund @ 2008-11-29 21:31 UTC (permalink / raw) To: Theodore Tso; +Cc: adilger, LKML, linux-ext4 [-- Attachment #1: Type: text/plain, Size: 631 bytes --] On Saturday 29 November 2008 22:15:33 Theodore Tso wrote: > On Sat, Nov 29, 2008 at 02:18:19PM +0100, Andres Freund wrote: > > On a testsystem (spare laptop) with ext4 as root filesystem sometimes the > > system starts to return ENOSPC to all write/create syscalls. > > Sometimes the system runs without problems, at other times it starts > > having problems soon after boot. > Hmm.... Is there anything interesting log messages relating to ext4 or > jbd2 showing up in /var/log/messages or in dmesg? Nothing else than telling that barriers are not used (I am using dm-crypt...), but that shouldnt be related. Andres [-- Attachment #2: This is a digitally signed message part. --] [-- Type: application/pgp-signature, Size: 197 bytes --] ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: EXT4 ENOSPC Bug 2008-11-29 13:18 EXT4 ENOSPC Bug Andres Freund 2008-11-29 20:32 ` Jonathan Bastien-Filiatrault 2008-11-29 21:15 ` Theodore Tso @ 2008-12-01 12:34 ` Andres Freund 2008-12-01 19:42 ` Andreas Dilger 2008-12-10 0:07 ` Andres Freund 3 siblings, 1 reply; 39+ messages in thread From: Andres Freund @ 2008-12-01 12:34 UTC (permalink / raw) To: Theodore Tso; +Cc: adilger, LKML, linux-ext4 [-- Attachment #1: Type: text/plain, Size: 199 bytes --] Hi Ted, hi all, Are there any additional informations you could use? The filesystem is a bit big for you to download unfortunately (and the testdata it contains a bit to sensitive). Andres [-- Attachment #2: This is a digitally signed message part. --] [-- Type: application/pgp-signature, Size: 197 bytes --] ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: EXT4 ENOSPC Bug 2008-12-01 12:34 ` Andres Freund @ 2008-12-01 19:42 ` Andreas Dilger 2008-12-01 20:16 ` Andres Freund 2008-12-02 15:26 ` Henrique de Moraes Holschuh 0 siblings, 2 replies; 39+ messages in thread From: Andreas Dilger @ 2008-12-01 19:42 UTC (permalink / raw) To: Andres Freund; +Cc: Theodore Tso, LKML, linux-ext4 On Dec 01, 2008 13:34 +0100, Andres Freund wrote: > Are there any additional informations you could use? > The filesystem is a bit big for you to download unfortunately (and the > testdata it contains a bit to sensitive). Som information that would be useful: - did you run "e2fsck -f" to see if there were any errors in the filesystem? - do you run any specific applications that seem to trigger the problem (e.g. Vuze (formerly azureus) as was reported by another user) - do the applications writing to this file have any unusual IO pattern (e.g. mmap IO, lots of write+truncate+write on the same file, etc) We discussed the creation of a debugging patch to help diagnose this problem. It looks like you have already compiled your own kernel, so I assume it would be possible for you to run with an additional patch? Cheers, Andreas -- Andreas Dilger Sr. Staff Engineer, Lustre Group Sun Microsystems of Canada, Inc. ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: EXT4 ENOSPC Bug 2008-12-01 19:42 ` Andreas Dilger @ 2008-12-01 20:16 ` Andres Freund 2008-12-02 7:57 ` Theodore Tso 2008-12-02 15:26 ` Henrique de Moraes Holschuh 1 sibling, 1 reply; 39+ messages in thread From: Andres Freund @ 2008-12-01 20:16 UTC (permalink / raw) To: Andreas Dilger; +Cc: Theodore Tso, LKML, linux-ext4 [-- Attachment #1: Type: text/plain, Size: 1762 bytes --] Hi Andreas, Hi all, On Monday 01 December 2008 20:42:04 Andreas Dilger wrote: > On Dec 01, 2008 13:34 +0100, Andres Freund wrote: > > Are there any additional informations you could use? > > The filesystem is a bit big for you to download unfortunately (and the > > testdata it contains a bit to sensitive). > - did you run "e2fsck -f" to see if there were any errors in the > filesystem? Yes, no errors. > - do you run any specific applications that seem to trigger the > problem (e.g. Vuze (formerly azureus) as was reported by another user) No. The Problem occured sometimes shortly after boot (Once even before X was up) sometimes only after days of full usage. Boot starts postgres, thats the only thing that propably always ran. But I have seen it while postgres was idle - i.e. just some polls and stats. > - do the applications writing to this file have any unusual IO pattern > (e.g. mmap IO, lots of write+truncate+write on the same file, etc) Its not a single file, but the whole filesystem, having problems. > We discussed the creation of a debugging patch to help diagnose this > problem. It looks like you have already compiled your own kernel, so > I assume it would be possible for you to run with an additional patch? Sure, no problem. At least if the patch is not expensive during run time. The system produces test-data for testing our development software and I cant take it down for too long. Thats the reason why I can use something like ext4 on such a system at all ;-) It sometimes takes quite a while to reproduce the problem though. The system ran stable for 4 days so far. Same kernel as the last time the error occured though. Any specific debug options I should enable? Andres [-- Attachment #2: This is a digitally signed message part. --] [-- Type: application/pgp-signature, Size: 197 bytes --] ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: EXT4 ENOSPC Bug 2008-12-01 20:16 ` Andres Freund @ 2008-12-02 7:57 ` Theodore Tso 2008-12-02 14:58 ` Andres Freund 0 siblings, 1 reply; 39+ messages in thread From: Theodore Tso @ 2008-12-02 7:57 UTC (permalink / raw) To: Andres Freund; +Cc: Andreas Dilger, LKML, linux-ext4 [-- Attachment #1: Type: text/plain, Size: 2025 bytes --] Hi Andres, What we suspect may be happening is that somehow, there is something in your workload which is causing a leak or a double-increment of the number of blocks reserved so that delayed allocation will succeed. Basically, when the system writes to a block that hasn't been previously allocated by filesystme, we reserve a data block so that when we finally do the late delayed allocation, we know that a free block will be available. When we do finally determine where the block will be located on the filesystem, we decrement the reserved block counter. If somehow the reserved block counter is growing and not shrinking when it should, that could lead to the problem which you describe. So.... could you apply this patch, attached below. You can trigger it using the attached program, debug-ioctl. If the filesystem is quiscent, and you've typed sync once or twice, you should get the following in your printk logs: [ 2742.603886] ext4 debug delalloc of dm-4 [ 2742.603948] ext4: dirty blocks 0 free blocks 7324359 [ 2742.603960] ext4 debug delalloc done If you do have some dirty blocks that haen't been flushed out to disk, it might look liket this: [ 2758.653682] ext4 debug delalloc of dm-4 [ 2758.653697] ext4: dirty blocks 172 free blocks 7324439 [ 2758.653703] s_dirty list: [ 2758.653708] ino 401167: 79 2 [ 2758.653713] ino 401200: 2 2 [ 2758.653718] ino 401197: 3 2 .... [ 2758.653828] ext4 debug delalloc done If our theory is correct, I suspect you will start to see the number of dirty blocks grown over time, even before you start seeing ENOSPC errors (which will happen when the number of dirty blocks exceeds the number of free blocks). In that case, the list of inodes that have data and metadata blocks reserved will hopefully tell us soemthing about what might be going on. Just run the debug-ioctl command giving a filename or directory within the filesystem where you want to deump out the debugging information; it will be dumpd out in the dmesg buffer. - Ted [-- Attachment #2: debug-delalloc-patch --] [-- Type: text/plain, Size: 2478 bytes --] diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h index ba18b81..43f12f6 100644 --- a/fs/ext4/ext4.h +++ b/fs/ext4/ext4.h @@ -313,6 +313,8 @@ struct ext4_new_group_data { #define EXT4_IOC_FIEMAP_INO _IOWR('f', 18, struct fiemap_ino) #define EXT4_IOC_MOVE_VICTIM _IOW('f', 19, struct ext4_extents_info) +#define EXT4_IOC_DEBUG_DELALLOC _IO('f', 42) + /* * ioctl commands in 32 bit emulation */ diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c index 15050d3..be60cfc 100644 --- a/fs/ext4/ioctl.c +++ b/fs/ext4/ioctl.c @@ -18,6 +18,17 @@ #include "ext4_jbd2.h" #include "ext4.h" +static void print_inode_dealloc_info(struct inode *inode) +{ + if (!EXT4_I(inode)->i_reserved_data_blocks || + !EXT4_I(inode)->i_reserved_meta_blocks) + return; + + printk(KERN_DEBUG "ino %lu: %u %u\n", inode->i_ino, + EXT4_I(inode)->i_reserved_data_blocks, + EXT4_I(inode)->i_reserved_meta_blocks); +} + long ext4_ioctl(struct file *filp, unsigned int cmd, unsigned long arg) { struct inode *inode = filp->f_dentry->d_inode; @@ -346,6 +357,42 @@ setversion_out: return err; } + case EXT4_IOC_DEBUG_DELALLOC: + { + extern spinlock_t inode_lock; + struct super_block *sb = inode->i_sb; + struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb); + struct inode *inode; + + printk(KERN_DEBUG "ext4 debug delalloc of %s\n", sb->s_id); + printk(KERN_DEBUG "ext4: dirty blocks %lld free blocks %lld\n", + percpu_counter_sum(&sbi->s_dirtyblocks_counter), + percpu_counter_sum(&sbi->s_freeblocks_counter)); + spin_lock(&inode_lock); + if (!list_empty(&sb->s_dirty)) { + printk(KERN_DEBUG "s_dirty list:\n"); + list_for_each_entry(inode, &sb->s_dirty, i_list) { + print_inode_dealloc_info(inode); + } + } + if (!list_empty(&sb->s_io)) { + printk(KERN_DEBUG "s_io list:\n"); + list_for_each_entry(inode, &sb->s_io, i_list) { + print_inode_dealloc_info(inode); + } + } + if (!list_empty(&sb->s_more_io)) { + printk(KERN_DEBUG "s_more_io list:\n"); + list_for_each_entry(inode, &sb->s_more_io, i_list) { + print_inode_dealloc_info(inode); + } + } + spin_unlock(&inode_lock); + printk(KERN_DEBUG "ext4 debug delalloc done\n"); + return 0; + } + + default: return -ENOTTY; } @@ -389,6 +436,7 @@ long ext4_compat_ioctl(struct file *file, unsigned int cmd, unsigned long arg) cmd = EXT4_IOC_SETRSVSZ; break; case EXT4_IOC_GROUP_ADD: + case EXT4_IOC_DEBUG_DELALLOC: break; default: return -ENOIOCTLCMD; [-- Attachment #3: debug-ioctl.c --] [-- Type: text/x-csrc, Size: 863 bytes --] /* * trigger the debugging ioctl */ #include <stdio.h> #include <string.h> #include <unistd.h> #include <stdlib.h> #include <fcntl.h> #include <sys/ioctl.h> #include <sys/mount.h> /* For Linux, define BLKFLSBUF if necessary */ #if (!defined(EXT4_IOC_DEBUG_DELALLOC) && defined(__linux__)) #define EXT4_IOC_DEBUG_DELALLOC _IO('f', 42) #endif const char *progname; static void usage(void) { fprintf(stderr, "Usage: %s disk\n", progname); exit(1); } int main(int argc, char **argv) { int fd; progname = argv[0]; if (argc != 2) usage(); fd = open(argv[1], O_RDONLY, 0); if (fd < 0) { perror("open"); exit(1); } #ifdef EXT4_IOC_DEBUG_DELALLOC if (ioctl(fd, EXT4_IOC_DEBUG_DELALLOC, 0) < 0) { perror("ioctl BLKFLSBUF"); exit(1); } return 0; #else fprintf(stderr, "EXT4_IOC_DEBUG_DELALLOC ioctl not supported!\n"); return 1; #endif } ^ permalink raw reply related [flat|nested] 39+ messages in thread
* Re: EXT4 ENOSPC Bug 2008-12-02 7:57 ` Theodore Tso @ 2008-12-02 14:58 ` Andres Freund 2008-12-02 16:47 ` Theodore Tso 0 siblings, 1 reply; 39+ messages in thread From: Andres Freund @ 2008-12-02 14:58 UTC (permalink / raw) To: Theodore Tso, Andreas Dilger; +Cc: LKML, linux-ext4 [-- Attachment #1: Type: text/plain, Size: 735 bytes --] Hi Ted, Hi Andreas, On Tuesday 02 December 2008 08:57:12 Theodore Tso wrote: > So.... could you apply this patch, attached below. You can trigger it > using the attached program, debug-ioctl. If the filesystem is > quiscent, and you've typed sync once or twice, you should get the > following in your printk logs: Ok. The system now runs (without problems) with the patch enabled and I can get the debug output. 30 Minutes after boot the system still returns to zero dirty blocks and the free blocks seem to stay in a sensible range. I will let it run for a while and report back if either something interesting happens or the problem reappears and I am seeing no significant amount of dirty blocks. Andres [-- Attachment #2: This is a digitally signed message part. --] [-- Type: application/pgp-signature, Size: 197 bytes --] ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: EXT4 ENOSPC Bug 2008-12-02 14:58 ` Andres Freund @ 2008-12-02 16:47 ` Theodore Tso 2008-12-02 17:47 ` Andres Freund 0 siblings, 1 reply; 39+ messages in thread From: Theodore Tso @ 2008-12-02 16:47 UTC (permalink / raw) To: Andres Freund; +Cc: Andreas Dilger, LKML, linux-ext4 On Tue, Dec 02, 2008 at 03:58:53PM +0100, Andres Freund wrote: > Ok. The system now runs (without problems) with the patch enabled and I can > get the debug output. > 30 Minutes after boot the system still returns to zero dirty blocks and the > free blocks seem to stay in a sensible range. > I will let it run for a while and report back if either something interesting > happens or the problem reappears and I am seeing no significant amount of dirty > blocks. You say you are using Postgres, right? Something you might try to see if it triggers the problem it is creating a new database and then restoring some database dump/backup into that new database. Some databases expand into a new table space (or whatever terminology Postgres uses) by random writes into a sparse portion of the file. This could be triggering the problem, or at least trigger the problem more quickly. The other thing I wanted to ask is whether "df" was showing the 37% in-use statistic at the time, or was that after you rebooted. And although I hate to ask it, you're sure this isn't the standard "delete an in-use file but not get the space back" Unix trap, right? Thanks, regards, - Ted ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: EXT4 ENOSPC Bug 2008-12-02 16:47 ` Theodore Tso @ 2008-12-02 17:47 ` Andres Freund 2008-12-02 20:33 ` Theodore Tso 0 siblings, 1 reply; 39+ messages in thread From: Andres Freund @ 2008-12-02 17:47 UTC (permalink / raw) To: Theodore Tso; +Cc: Andreas Dilger, LKML, linux-ext4 [-- Attachment #1: Type: text/plain, Size: 1772 bytes --] Hi, On Tuesday 02 December 2008 17:47:09 Theodore Tso wrote: > You say you are using Postgres, right? Something you might try to see > if it triggers the problem it is creating a new database and then > restoring some database dump/backup into that new database. Some > databases expand into a new table space (or whatever terminology > Postgres uses) by random writes into a sparse portion of the file. > This could be triggering the problem, or at least trigger the problem > more quickly. I tried that - I have seen no problems so far. But it is not the first time I did not see the problem for some time. Btw, postgres just creates the database by copying over a default database. For an easy test with sparse files, I created a big one, set it up as a loop device, created a filesystem and ran some stuff in it. No Problem so far. > The other thing I wanted to ask is whether "df" was showing the 37% > in-use statistic at the time, or was that after you rebooted. It definitely was before a reboot. And there were plenty of both, inodes and blocks. I think that I have seen the problem on metadata only changes (find /tmp -type f|xargs touch) as well, but sometimes metadata changes were possible while file creation was not. Another Datapoint: File deletion sometimes made it possible to create more files, but by far not as much as the space freed. > And although I hate to ask it, you're sure this isn't the standard "delete > an in-use file but not get the space back" Unix trap, right? There were over 200GB free, so I doubt that. I don't know what could have caused an allocation of so much space unnoticed in an idle system multiple times. But I do understand the reason for the question ;-) Andres [-- Attachment #2: This is a digitally signed message part. --] [-- Type: application/pgp-signature, Size: 197 bytes --] ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: EXT4 ENOSPC Bug 2008-12-02 17:47 ` Andres Freund @ 2008-12-02 20:33 ` Theodore Tso 2008-12-03 0:37 ` Jonathan Bastien-Filiatrault 0 siblings, 1 reply; 39+ messages in thread From: Theodore Tso @ 2008-12-02 20:33 UTC (permalink / raw) To: Andres Freund; +Cc: Andreas Dilger, LKML, linux-ext4 On Tue, Dec 02, 2008 at 06:47:24PM +0100, Andres Freund wrote: > > I think that I have seen the problem on metadata only changes (find > /tmp -type f|xargs touch) as well, but sometimes metadata changes > were possible while file creation was not. Hmm... really? Was the error message ENOSPC, or something else? The only way I can see that it might be ENOSPC would be if we got an error in the jbd2 layer, in start_this_handle(), but that would have resulted in a kernel printk: if (nblocks > journal->j_max_transaction_buffers) { printk(KERN_ERR "JBD: %s wants too many credits (%d > %d)\n", current->comm, nblocks, journal->j_max_transaction_buffers); ret = -ENOSPC; goto out; } If that wasn't too long ago, you might want to search your old system log files in /var/log for any "JBD" or "ext4" messages. That might be an important clue, although that seems rather unlikely to me. Or in the case where metadata changes were failing, perhaps they were doing so with some other error that ENOSPC? - Ted ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: EXT4 ENOSPC Bug 2008-12-02 20:33 ` Theodore Tso @ 2008-12-03 0:37 ` Jonathan Bastien-Filiatrault 2008-12-03 0:40 ` Jonathan Bastien-Filiatrault 2008-12-03 15:34 ` Aneesh Kumar K.V 0 siblings, 2 replies; 39+ messages in thread From: Jonathan Bastien-Filiatrault @ 2008-12-03 0:37 UTC (permalink / raw) To: Theodore Tso, Andres Freund, Andreas Dilger, LKML, linux-ext4 Theodore Tso wrote: > Hmm... really? Was the error message ENOSPC, or something else? > I can confirm I get a ENOSPC on open/creat. Here is the relevant output from strace touch test: open("test", O_WRONLY|O_CREAT|O_NOCTTY|O_NONBLOCK, 0666) = -1 ENOSPC (No space left on device) And the df -h output (first filesystem in list): Filesystem Size Used Avail Use% Mounted on /dev/mapper/vg0-root 88G 60G 25G 71% / tmpfs 1.6G 12K 1.6G 1% /lib/init/rw udev 10M 112K 9.9M 2% /dev tmpfs 1.6G 0 1.6G 0% /dev/shm /dev/sda1 92M 19M 69M 22% /boot none 1.6G 3.3M 1.6G 1% /tmp Cheers, Jonathan ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: EXT4 ENOSPC Bug 2008-12-03 0:37 ` Jonathan Bastien-Filiatrault @ 2008-12-03 0:40 ` Jonathan Bastien-Filiatrault 2008-12-03 4:37 ` Theodore Tso 2008-12-03 15:34 ` Aneesh Kumar K.V 1 sibling, 1 reply; 39+ messages in thread From: Jonathan Bastien-Filiatrault @ 2008-12-03 0:40 UTC (permalink / raw) To: Theodore Tso, Andres Freund, Andreas Dilger, LKML, linux-ext4 Jonathan Bastien-Filiatrault wrote: > Theodore Tso wrote: > >> Hmm... really? Was the error message ENOSPC, or something else? >> > I can confirm I get a ENOSPC on open/creat. Here is the relevant output > from strace touch test: > (Sorry for replying to myself) This happened while compiling a kernel with ccache enabled (with a relatively low hit rate for this specific build). Vuze was also opened, but with no active download. Cheers, Jonathan ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: EXT4 ENOSPC Bug 2008-12-03 0:40 ` Jonathan Bastien-Filiatrault @ 2008-12-03 4:37 ` Theodore Tso 0 siblings, 0 replies; 39+ messages in thread From: Theodore Tso @ 2008-12-03 4:37 UTC (permalink / raw) To: Jonathan Bastien-Filiatrault Cc: Andres Freund, Andreas Dilger, LKML, linux-ext4 On Tue, Dec 02, 2008 at 07:40:43PM -0500, Jonathan Bastien-Filiatrault wrote: > Jonathan Bastien-Filiatrault wrote: > > Theodore Tso wrote: > > > >> Hmm... really? Was the error message ENOSPC, or something else? > >> > > I can confirm I get a ENOSPC on open/creat. Here is the relevant output > > from strace touch test: > > > (Sorry for replying to myself) > > This happened while compiling a kernel with ccache enabled (with a > relatively low hit rate for this specific build). Vuze was also opened, > but with no active download. Hmm... I build with ccache all the time. I normally don't run a bittorrent client, though. What kernel version are you using? Any chance you can apply the patch which I sent earlier in this thread, trigger the debugging ioctl when it happens again? Thanks, - Ted ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: EXT4 ENOSPC Bug 2008-12-03 0:37 ` Jonathan Bastien-Filiatrault 2008-12-03 0:40 ` Jonathan Bastien-Filiatrault @ 2008-12-03 15:34 ` Aneesh Kumar K.V 2008-12-03 17:23 ` Theodore Tso 1 sibling, 1 reply; 39+ messages in thread From: Aneesh Kumar K.V @ 2008-12-03 15:34 UTC (permalink / raw) To: Jonathan Bastien-Filiatrault Cc: Theodore Tso, Andres Freund, Andreas Dilger, LKML, linux-ext4 On Tue, Dec 02, 2008 at 07:37:20PM -0500, Jonathan Bastien-Filiatrault wrote: > Theodore Tso wrote: > > > Hmm... really? Was the error message ENOSPC, or something else? > > > I can confirm I get a ENOSPC on open/creat. Here is the relevant output > from strace touch test: > > open("test", O_WRONLY|O_CREAT|O_NOCTTY|O_NONBLOCK, 0666) = -1 ENOSPC (No > space left on device) > Can you make sure you have the below patch in the kernel. c001077f4003fa75793bb62979baa6241dd8eb19 commit c001077f4003fa75793bb62979baa6241dd8eb19 Author: Eric Sandeen <sandeen@redhat.com> Date: Tue Aug 19 22:19:50 2008 -0400 ext4: Fix bug where we return ENOSPC even though we have plenty of inodes -aneesh ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: EXT4 ENOSPC Bug 2008-12-03 15:34 ` Aneesh Kumar K.V @ 2008-12-03 17:23 ` Theodore Tso 2008-12-03 18:18 ` Andres Freund 0 siblings, 1 reply; 39+ messages in thread From: Theodore Tso @ 2008-12-03 17:23 UTC (permalink / raw) To: Aneesh Kumar K.V Cc: Jonathan Bastien-Filiatrault, Andres Freund, Andreas Dilger, LKML, linux-ext4 [-- Attachment #1: Type: text/plain, Size: 930 bytes --] On Wed, Dec 03, 2008 at 09:04:41PM +0530, Aneesh Kumar K.V wrote: > > Can you make sure you have the below patch in the kernel. > > c001077f4003fa75793bb62979baa6241dd8eb19 > > commit c001077f4003fa75793bb62979baa6241dd8eb19 > Author: Eric Sandeen <sandeen@redhat.com> > Date: Tue Aug 19 22:19:50 2008 -0400 > > ext4: Fix bug where we return ENOSPC even though we have plenty of inodes > Mmmm, good point, thanks. I had been assuming this was caused by some failure in the delayed allocation code with block accounting, but we also had a bug fix that was causing a problem with inode allocation. That doesn't explain a report of an ENOSPC error with metadata only changes were failing (i.e., touching a file that already exists), although I don't think we've gotten a lot of information about that scenario and it feels a little unconfirmed to me still... So, yes, this patch may solve the issue for you. - Ted [-- Attachment #2: 0001-ext4-Fix-bug-where-we-return-ENOSPC-even-though-we.patch --] [-- Type: text/x-diff, Size: 1601 bytes --] >From c001077f4003fa75793bb62979baa6241dd8eb19 Mon Sep 17 00:00:00 2001 From: Eric Sandeen <sandeen@redhat.com> Date: Tue, 19 Aug 2008 22:19:50 -0400 Subject: [PATCH] ext4: Fix bug where we return ENOSPC even though we have plenty of inodes The find_group_flex() function starts with best_flex as the parent_fbg_group, which happens to have 0 inodes free. Some of the flex groups searched have free blocks and free inodes, but the flex_freeb_ratio is < 10, so they're skipped. Then when a group is compared to the current "best" flex group, it does not have more free blocks than "best", so it is skipped as well. This continues until no flex group with free inodes is found which has a proper ratio or which has more free blocks than the "best" group, and we're left with a "best" group that has 0 inodes free, and we return -ENOSPC. We fix this by changing the logic so that if the current "best" flex group has no inodes free, and the current one does have room, it is promoted to the next "best." Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> --- fs/ext4/ialloc.c | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/fs/ext4/ialloc.c b/fs/ext4/ialloc.c index 655e760..f344834 100644 --- a/fs/ext4/ialloc.c +++ b/fs/ext4/ialloc.c @@ -351,7 +351,7 @@ find_close_to_parent: goto found_flexbg; } - if (best_flex < 0 || + if (flex_group[best_flex].free_inodes == 0 || (flex_group[i].free_blocks > flex_group[best_flex].free_blocks && flex_group[i].free_inodes)) -- 1.6.0.4.8.g36f27.dirty ^ permalink raw reply related [flat|nested] 39+ messages in thread
* Re: EXT4 ENOSPC Bug 2008-12-03 17:23 ` Theodore Tso @ 2008-12-03 18:18 ` Andres Freund 0 siblings, 0 replies; 39+ messages in thread From: Andres Freund @ 2008-12-03 18:18 UTC (permalink / raw) To: Theodore Tso Cc: Aneesh Kumar K.V, Jonathan Bastien-Filiatrault, Andreas Dilger, LKML, linux-ext4 [-- Attachment #1: Type: text/plain, Size: 1065 bytes --] On Wednesday 03 December 2008 18:23:09 Theodore Tso wrote: > On Wed, Dec 03, 2008 at 09:04:41PM +0530, Aneesh Kumar K.V wrote: > > Can you make sure you have the below patch in the kernel. > > c001077f4003fa75793bb62979baa6241dd8eb19 Yes, I do have that patch in kernel (verified through git and looking at the source). Its a part of linus' tree. > Mmmm, good point, thanks. I had been assuming this was caused by some > failure in the delayed allocation code with block accounting, but we > also had a bug fix that was causing a problem with inode allocation. > > That doesn't explain a report of an ENOSPC error with metadata only > changes were failing (i.e., touching a file that already exists), > although I don't think we've gotten a lot of information about that > scenario and it feels a little unconfirmed to me still... Unfortunately I am not absolutely sure (I reported this) as well, this is the reason I did not report it at first but only after no culprit could be found. Still no luck reproducing it once more. Andres [-- Attachment #2: This is a digitally signed message part. --] [-- Type: application/pgp-signature, Size: 197 bytes --] ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: EXT4 ENOSPC Bug 2008-12-01 19:42 ` Andreas Dilger 2008-12-01 20:16 ` Andres Freund @ 2008-12-02 15:26 ` Henrique de Moraes Holschuh 1 sibling, 0 replies; 39+ messages in thread From: Henrique de Moraes Holschuh @ 2008-12-02 15:26 UTC (permalink / raw) To: Andreas Dilger; +Cc: Andres Freund, Theodore Tso, LKML, linux-ext4 On Mon, 01 Dec 2008, Andreas Dilger wrote: > - do the applications writing to this file have any unusual IO pattern > (e.g. mmap IO, lots of write+truncate+write on the same file, etc) Vuze does a massive ammount of sparse writes. It can also be configured to pre-allocate space (either by creating an empty non-sparse file, or by telling the kernel that the file is expected to grow to size foo and to please pre-allocate that in the filesystem. I don't recall how this is done). -- "One disk to rule them all, One disk to find them. One disk to bring them all and in the darkness grind them. In the Land of Redmond where the shadows lie." -- The Silicon Valley Tarot Henrique Holschuh ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: EXT4 ENOSPC Bug 2008-11-29 13:18 EXT4 ENOSPC Bug Andres Freund ` (2 preceding siblings ...) 2008-12-01 12:34 ` Andres Freund @ 2008-12-10 0:07 ` Andres Freund 2009-02-16 11:37 ` Andres Freund 3 siblings, 1 reply; 39+ messages in thread From: Andres Freund @ 2008-12-10 0:07 UTC (permalink / raw) To: Theodore Tso Cc: adilger, LKML, linux-ext4, Jonathan Bastien-Filiatrault, Aneesh Kumar K.V [-- Attachment #1: Type: text/plain, Size: 1289 bytes --] Hi Ted, Very Strange: I booted (after an crash which I did not see) a somewhat older kernel and I once more caught the issue. Unfortunately I do not have your patch in this older kernel. older kernel: 2.6.28-rc5-andres-00070-g4e14e83 kernel I rean whole last week: 2.6.28-rc6-andres-00007-ged31348 There seem to have been no significant ext4 related changes to the kernel between those. for i in `seq 1 1000`;do touch tmp$i;done touch: cannot touch `tmp139': No space left on device ... rm tmp*;for i in `seq 1 200`;do dd if=/dev/zero of=tmp$i bs=1024k count=1;done dd: opening `tmp139': No space left on device 139 .. Various different blocksizes and filesizes all return the same result. This all hints to the patch Anesh linked. But it is included in 2.6.28-rc5. Any idea how that could happen? The kernel install is just a "make;make install; update-initramfs -c -k `make kernelrelease`; update-grub" run... A short idea how I could verify on binary level what kernelrelease it really is? If a somehow corrupted compile (murked ccache?) is the culprit of this (in addition to my inaccuracies) I am very sorry for all your work. I let this kernel run for the next hours if you want to run some tests... Thanks, Andres [-- Attachment #2: This is a digitally signed message part. --] [-- Type: application/pgp-signature, Size: 197 bytes --] ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: EXT4 ENOSPC Bug 2008-12-10 0:07 ` Andres Freund @ 2009-02-16 11:37 ` Andres Freund 2009-02-16 15:01 ` Theodore Tso 0 siblings, 1 reply; 39+ messages in thread From: Andres Freund @ 2009-02-16 11:37 UTC (permalink / raw) To: Theodore Tso Cc: adilger, LKML, linux-ext4, Jonathan Bastien-Filiatrault, Aneesh Kumar K.V Hi All, "Luckily", to prove that I am not completly mad I recently found this problem again. On a more recent kernel: 2.6.29-rc3-andres-00498-g68e80d5 (upstream 2.6.29 + Theodore's ext4 debug patches) Again I got: open("/home/andres/tt", O_WRONLY|O_CREAT|O_NOCTTY|O_NONBLOCK, 0666) = -1 ENOSPC (No space left on device) When doing `touch /home/andres/tt` Again df does not show something suspicious: Filesystem Inodes IUsed IFree IUse% Mounted on /dev/mapper/root_crypt 19234816 826296 18408520 5% / tmpfs 491025 3 491022 1% /lib/init/rw varrun 491025 96 490929 1% /var/run varlock 491025 3 491022 1% /var/lock udev 491025 5517 485508 2% /dev tmpfs 491025 1 491024 1% /dev/shm /dev/sda2 61312 110 61202 1% /boot Filesystem 1K-blocks Used Available Use% Mounted on /dev/mapper/root_crypt 302855628 202819548 100036080 67% / tmpfs 1964100 0 1964100 0% /lib/init/rw varrun 1964100 300 1963800 1% /var/run varlock 1964100 0 1964100 0% /var/lock udev 1964100 92 1964008 1% /dev tmpfs 1964100 0 1964100 0% /dev/shm /dev/sda2 964532 634052 281484 70% /boot Delayed allocations are not showing anything suspicious: [72586.050538] ext4 debug delalloc of dm-0 [72586.050547] ext4: dirty blocks 22 free blocks 25008749 [72586.050551] s_dirty list: [72586.050556] ino 534064: 2 2 [72586.050559] ino 539981: 2 2 [72586.050563] ino 524506: 5 2 [72586.050566] ino 184479: 1 2 [72586.050571] ino 167172: 2 2 [72586.050574] ext4 debug delalloc done Any debugging ideas? I got this twice in the last two months, so its not really easy to reproduce :-( Andres PS: All information is taken out of the running kernel while having the problem meminfo: MemTotal: 3928200 kB MemFree: 23312 kB Buffers: 239564 kB Cached: 2383280 kB SwapCached: 0 kB Active: 1490948 kB Inactive: 2149360 kB Active(anon): 879880 kB Inactive(anon): 294356 kB Active(file): 611068 kB Inactive(file): 1855004 kB Unevictable: 52 kB Mlocked: 52 kB SwapTotal: 0 kB SwapFree: 0 kB Dirty: 52 kB Writeback: 4 kB AnonPages: 1017356 kB Mapped: 129440 kB Slab: 123316 kB SReclaimable: 94512 kB SUnreclaim: 28804 kB PageTables: 26936 kB NFS_Unstable: 0 kB Bounce: 0 kB WritebackTmp: 0 kB CommitLimit: 1964100 kB Committed_AS: 1169664 kB VmallocTotal: 34359738367 kB VmallocUsed: 341580 kB VmallocChunk: 34359392035 kB HugePages_Total: 0 HugePages_Free: 0 HugePages_Rsvd: 0 HugePages_Surp: 0 Hugepagesize: 2048 kB DirectMap4k: 86016 kB DirectMap2M: 4003840 kB ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: EXT4 ENOSPC Bug 2009-02-16 11:37 ` Andres Freund @ 2009-02-16 15:01 ` Theodore Tso 2009-02-16 15:27 ` Andres Freund 0 siblings, 1 reply; 39+ messages in thread From: Theodore Tso @ 2009-02-16 15:01 UTC (permalink / raw) To: Andres Freund Cc: adilger, LKML, linux-ext4, Jonathan Bastien-Filiatrault, Aneesh Kumar K.V On Mon, Feb 16, 2009 at 12:37:19PM +0100, Andres Freund wrote: > Hi All, > > "Luckily", to prove that I am not completly mad I recently found this > problem again. On a more recent kernel: > 2.6.29-rc3-andres-00498-g68e80d5 > (upstream 2.6.29 + Theodore's ext4 debug patches) > > Again I got: > open("/home/andres/tt", O_WRONLY|O_CREAT|O_NOCTTY|O_NONBLOCK, 0666) = -1 > ENOSPC (No space left on device) So /home/andres/tt does not exist, correct? Does this happen for any attempt to create a new zero-length file using the "touch" command? (Even in other directories) Can you append to a file using "cat /etc/mailcap >> some_existing_file"? If it is "yes" to the first two questions, and "no" the third, it looks like this might be an inode allocation failure as opposed to a block allocation failure. Once this triggers, does it reliably continue to fail if you reboot? Or does it go away when you reboot? - Ted ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: EXT4 ENOSPC Bug 2009-02-16 15:01 ` Theodore Tso @ 2009-02-16 15:27 ` Andres Freund 2009-02-16 19:00 ` Theodore Tso 0 siblings, 1 reply; 39+ messages in thread From: Andres Freund @ 2009-02-16 15:27 UTC (permalink / raw) To: Theodore Tso, adilger, LKML, linux-ext4, Jonathan Bastien-Filiatrault, "Aneesh Kumar K.V" <aneesh. Hi, On 02/16/2009 04:01 PM, Theodore Tso wrote: > On Mon, Feb 16, 2009 at 12:37:19PM +0100, Andres Freund wrote: >> "Luckily", to prove that I am not completly mad I recently found this >> problem again. On a more recent kernel: >> 2.6.29-rc3-andres-00498-g68e80d5 >> (upstream 2.6.29 + Theodore's ext4 debug patches) >> >> Again I got: >> open("/home/andres/tt", O_WRONLY|O_CREAT|O_NOCTTY|O_NONBLOCK, 0666) = -1 >> ENOSPC (No space left on device) > > So /home/andres/tt does not exist, correct? yes > Does this happen for any > attempt to create a new zero-length file using the "touch" command? > (Even in other directories) Can you append to a file using "cat > /etc/mailcap>> some_existing_file"? Happens at the moment only for not existing files in every directory i tried. > Once this triggers, does it reliably continue to fail if you reboot? Yes and no. It triggers as long as I don't delete any files: Out of another mail: for i in `seq 1 1000`;do touch tmp$i;done touch: cannot touch `tmp139': No space left on device ... rm tmp*;for i in `seq 1 200`;do dd if=/dev/zero of=tmp$i bs=1024k count=1;done dd: opening `tmp139': No space left on device 139 .. So, yes, seems to be an inode allocation problem. > Or does it go away when you reboot? Reboot fixes it. Andres ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: EXT4 ENOSPC Bug 2009-02-16 15:27 ` Andres Freund @ 2009-02-16 19:00 ` Theodore Tso 2009-02-17 17:21 ` Alex Buell ` (2 more replies) 0 siblings, 3 replies; 39+ messages in thread From: Theodore Tso @ 2009-02-16 19:00 UTC (permalink / raw) To: Andres Freund, Alex Buell Cc: adilger, LKML, linux-ext4, Jonathan Bastien-Filiatrault, Aneesh Kumar K.V On Mon, Feb 16, 2009 at 04:27:03PM +0100, Andres Freund wrote: > > So, yes, seems to be an inode allocation problem. > Andres, Alex, others, I'm pretty sure the ENOSPC problem which you both found is an inode allocation problem. Some of you seem to have an easier time reproducing it than others; could you try this patch, and periodically scan your system logs for the message "ext4: find_group_flex failed, fallback succeeded"? If the problem goes away for you, and you find the occasional aforemention message in your system log, that will confirm what I suspect, which is the bug is in fs/ext4/inode.c's find_group_flex() function. (If I'm wrong, the fallback code will activate only when the filesystem is genuinely out of inodes, which should be very rare.) More comments are in the patch header. My current long-term plan for dealing with this is to enhance find_group_orlov() to and find_group_other() to understand about flex_bg's. - Ted commit 1012e25b371b203164e4766a98f1e696df68b56d Author: Theodore Ts'o <tytso@mit.edu> Date: Mon Feb 16 13:51:16 2009 -0500 ext4: Add fallback for find_group_flex This is a workaround for find_group_flex() which badly needs to be replaced. One of its problems (besides ignoring the Orlov algorithm) is that it is a bit hyperactive about returning failure under suspicious circumstances. This can lead to spurious ENOSPC failures. Work around this for now by retrying the search using find_group_other() if find_group_flex() returns -1. If find_group_other() succeeds when find_group_flex(), log a warning message. I can't quite find the motivation to spend effort working on fixing find_group_flex() given that I want to replace it all anyway (and in fact work on the replacement code is underway), so we may leave the workaround for as long as find_group_flex() stays in the kernel... Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> diff --git a/fs/ext4/ialloc.c b/fs/ext4/ialloc.c index a200059..21080ab 100644 --- a/fs/ext4/ialloc.c +++ b/fs/ext4/ialloc.c @@ -715,6 +715,13 @@ struct inode *ext4_new_inode(handle_t *handle, struct inode *dir, int mode) if (sbi->s_log_groups_per_flex) { ret2 = find_group_flex(sb, dir, &group); + if (ret2 == -1) { + ret2 = find_group_other(sb, dir, &group); + if (ret2 == 0) + printk(KERN_NOTICE "ext4: find_group_flex " + "failed, fallback succeeded dir %lu\n", + dir->i_ino); + } goto got_group; } ^ permalink raw reply related [flat|nested] 39+ messages in thread
* Re: EXT4 ENOSPC Bug 2009-02-16 19:00 ` Theodore Tso @ 2009-02-17 17:21 ` Alex Buell 2009-02-17 17:36 ` Andres Freund 2009-02-17 18:13 ` Eric Sandeen 2 siblings, 0 replies; 39+ messages in thread From: Alex Buell @ 2009-02-17 17:21 UTC (permalink / raw) To: Theodore Tso Cc: Andres Freund, adilger, LKML, linux-ext4, Jonathan Bastien-Filiatrault, Aneesh Kumar K.V On Mon, 16 Feb 2009 14:00:01 -0500, I waved a wand and this message magically appears in front of Theodore Tso: > On Mon, Feb 16, 2009 at 04:27:03PM +0100, Andres Freund wrote: > > > > So, yes, seems to be an inode allocation problem. > > Andres, Alex, others, > > I'm pretty sure the ENOSPC problem which you both found is an inode > allocation problem. Some of you seem to have an easier time > reproducing it than others; could you try this patch, and periodically > scan your system logs for the message "ext4: find_group_flex failed, > fallback succeeded"? If the problem goes away for you, and you find > the occasional aforemention message in your system log, that will > confirm what I suspect, which is the bug is in fs/ext4/inode.c's > find_group_flex() function. (If I'm wrong, the fallback code will > activate only when the filesystem is genuinely out of inodes, which > should be very rare.) OK, I had to go look through the archives on linux-ext4 mailing list to see what the context was. For myself, this used to happen at least once a week with 2.6.26, and less frequently with 2.6.27. I think that 2.6.28 with your patch should get rid of that problem altogether. I will of course get in touch should I see any more of these find_group_flex failures as that would mean your patch worked. Thanks for your work on tracking this one down! -- http://www.munted.org.uk Fearsome grindings. ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: EXT4 ENOSPC Bug 2009-02-16 19:00 ` Theodore Tso 2009-02-17 17:21 ` Alex Buell @ 2009-02-17 17:36 ` Andres Freund 2009-02-18 21:18 ` Andres Freund 2009-02-17 18:13 ` Eric Sandeen 2 siblings, 1 reply; 39+ messages in thread From: Andres Freund @ 2009-02-17 17:36 UTC (permalink / raw) To: Theodore Tso, Alex Buell, adilger, LKML, linux-ext4, Jonathan Bastien-Filiatrault <jo On 02/16/2009 08:00 PM, Theodore Tso wrote: > On Mon, Feb 16, 2009 at 04:27:03PM +0100, Andres Freund wrote: >> So, yes, seems to be an inode allocation problem. > I'm pretty sure the ENOSPC problem which you both found is an inode > allocation problem. Some of you seem to have an easier time > reproducing it than others; could you try this patch, and periodically > scan your system logs for the message "ext4: find_group_flex failed, > fallback succeeded"? If the problem goes away for you, and you find > the occasional aforemention message in your system log, that will > confirm what I suspect, which is the bug is in fs/ext4/inode.c's > find_group_flex() function. (If I'm wrong, the fallback code will > activate only when the filesystem is genuinely out of inodes, which > should be very rare.) > > More comments are in the patch header. My current long-term plan for > dealing with this is to enhance find_group_orlov() to and > find_group_other() to understand about flex_bg's. Ok. I am now running with the patch enabled on two machines - but as the issue occured only 2 times in nearly 2 months on two machines... Andres ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: EXT4 ENOSPC Bug 2009-02-17 17:36 ` Andres Freund @ 2009-02-18 21:18 ` Andres Freund 2009-02-18 21:29 ` Theodore Tso 0 siblings, 1 reply; 39+ messages in thread From: Andres Freund @ 2009-02-18 21:18 UTC (permalink / raw) To: Theodore Tso, Alex Buell, adilger, LKML, linux-ext4, Jonathan Bastien-Filiatrault <jo Hi All, On 02/17/2009 06:36 PM, Andres Freund wrote: > On 02/16/2009 08:00 PM, Theodore Tso wrote: >> On Mon, Feb 16, 2009 at 04:27:03PM +0100, Andres Freund wrote: >>> So, yes, seems to be an inode allocation problem. >> I'm pretty sure the ENOSPC problem which you both found is an inode >> allocation problem. Some of you seem to have an easier time >> reproducing it than others; could you try this patch, and periodically >> scan your system logs for the message "ext4: find_group_flex failed, >> fallback succeeded"? If the problem goes away for you, and you find >> the occasional aforemention message in your system log, that will >> confirm what I suspect, which is the bug is in fs/ext4/inode.c's >> find_group_flex() function. (If I'm wrong, the fallback code will >> activate only when the filesystem is genuinely out of inodes, which >> should be very rare.) >> More comments are in the patch header. My current long-term plan for >> dealing with this is to enhance find_group_orlov() to and >> find_group_other() to understand about flex_bg's. > Ok. I am now running with the patch enabled on two machines - but as the > issue occured only 2 times in nearly 2 months on two machines... Didn't take that long: On one of the machines I got several thousand of: [10379.575904] ext4: find_group_flex failed, fallback succeeded dir 416319 [10379.576002] ext4: find_group_flex failed, fallback succeeded dir 416319 [10379.579981] ext4: find_group_flex failed, fallback succeeded dir 416319 [10379.580097] ext4: find_group_flex failed, fallback succeeded dir 416319 (with different directories) No userspace visible behaviour. So it seems you were right. It seems sensible to put that patch without printk in the kernel until the issue is fully solved... Andres ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: EXT4 ENOSPC Bug 2009-02-18 21:18 ` Andres Freund @ 2009-02-18 21:29 ` Theodore Tso 2009-02-19 2:18 ` Andres Freund 2009-02-27 3:57 ` Andres Freund 0 siblings, 2 replies; 39+ messages in thread From: Theodore Tso @ 2009-02-18 21:29 UTC (permalink / raw) To: Andres Freund Cc: Alex Buell, adilger, LKML, linux-ext4, Jonathan Bastien-Filiatrault, Aneesh Kumar K.V On Wed, Feb 18, 2009 at 10:18:57PM +0100, Andres Freund wrote: > On one of the machines I got several thousand of: > > [10379.575904] ext4: find_group_flex failed, fallback succeeded dir 416319 > [10379.576002] ext4: find_group_flex failed, fallback succeeded dir 416319 > [10379.579981] ext4: find_group_flex failed, fallback succeeded dir 416319 > [10379.580097] ext4: find_group_flex failed, fallback succeeded dir 416319 > (with different directories) Ok, that's good. Good to know the workaround works. Can you send me a dumpe2fs of the filesystem in question? I'm curious what was going on... > No userspace visible behaviour. > > So it seems you were right. It seems sensible to put that patch without > printk in the kernel until the issue is fully solved... Thanks for the report. I'll push the workaround patch to Linus for 2.6.29 to avoid this problem for now. I recently sent to linux-ext4 for comment a patch to revamp the Orlov allocator for flex_bg and to use that instead of find_group_flex(), but no way that's going into 2.6.29 at this point.... - Ted ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: EXT4 ENOSPC Bug 2009-02-18 21:29 ` Theodore Tso @ 2009-02-19 2:18 ` Andres Freund 2009-02-19 3:22 ` Theodore Tso 2009-02-27 3:57 ` Andres Freund 1 sibling, 1 reply; 39+ messages in thread From: Andres Freund @ 2009-02-19 2:18 UTC (permalink / raw) To: Theodore Tso, Alex Buell, adilger, LKML, linux-ext4, Jonathan Bastien-Filiatrault <jo On 02/18/2009 10:29 PM, Theodore Tso wrote: > Ok, that's good. Good to know the workaround works. > Can you send me a dumpe2fs of the filesystem in question? I'm curious > what was going on... Will do as soon as I am at the same place as the machine. I guess thats only interesting to you privately (size and so on)? > Thanks for the report. I'll push the workaround patch to Linus for > 2.6.29 to avoid this problem for now. I recently sent to linux-ext4 > for comment a patch to revamp the Orlov allocator for flex_bg and to > use that instead of find_group_flex(), but no way that's going into > 2.6.29 at this point.... Would it be helpfull if I test that patch? Andres ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: EXT4 ENOSPC Bug 2009-02-19 2:18 ` Andres Freund @ 2009-02-19 3:22 ` Theodore Tso 2009-02-19 15:46 ` Eric Sandeen 0 siblings, 1 reply; 39+ messages in thread From: Theodore Tso @ 2009-02-19 3:22 UTC (permalink / raw) To: Andres Freund Cc: Alex Buell, adilger, LKML, linux-ext4, Jonathan Bastien-Filiatrault, Aneesh Kumar K.V On Thu, Feb 19, 2009 at 03:18:45AM +0100, Andres Freund wrote: > On 02/18/2009 10:29 PM, Theodore Tso wrote: >> Ok, that's good. Good to know the workaround works. >> Can you send me a dumpe2fs of the filesystem in question? I'm curious >> what was going on... > Will do as soon as I am at the same place as the machine. I guess thats > only interesting to you privately (size and so on)? > >> Thanks for the report. I'll push the workaround patch to Linus for >> 2.6.29 to avoid this problem for now. I recently sent to linux-ext4 >> for comment a patch to revamp the Orlov allocator for flex_bg and to >> use that instead of find_group_flex(), but no way that's going into >> 2.6.29 at this point.... > Would it be helpfull if I test that patch? > Sure, I'll take all of the testing I can get. :-) The patch is in the ext4 patch queue, and I sent them to the ext4 patch queue. The patch is also in patch work: http://patchwork.ozlabs.org/patch/23343/ The patch which I sent you earlier (available below) is a prequisite: http://patchwork.ozlabs.org/patch/23228/ - Ted ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: EXT4 ENOSPC Bug 2009-02-19 3:22 ` Theodore Tso @ 2009-02-19 15:46 ` Eric Sandeen 2009-02-23 2:02 ` Theodore Tso 0 siblings, 1 reply; 39+ messages in thread From: Eric Sandeen @ 2009-02-19 15:46 UTC (permalink / raw) To: Theodore Tso, Andres Freund, Alex Buell, adilger, LKML, linux-ext4 Theodore Tso wrote: > On Thu, Feb 19, 2009 at 03:18:45AM +0100, Andres Freund wrote: >> On 02/18/2009 10:29 PM, Theodore Tso wrote: >>> Ok, that's good. Good to know the workaround works. >>> Can you send me a dumpe2fs of the filesystem in question? I'm curious >>> what was going on... >> Will do as soon as I am at the same place as the machine. I guess thats >> only interesting to you privately (size and so on)? >> >>> Thanks for the report. I'll push the workaround patch to Linus for >>> 2.6.29 to avoid this problem for now. I recently sent to linux-ext4 >>> for comment a patch to revamp the Orlov allocator for flex_bg and to >>> use that instead of find_group_flex(), but no way that's going into >>> 2.6.29 at this point.... >> Would it be helpfull if I test that patch? >> > > Sure, I'll take all of the testing I can get. :-) > > The patch is in the ext4 patch queue, and I sent them to the ext4 > patch queue. The patch is also in patch work: > > http://patchwork.ozlabs.org/patch/23343/ > > The patch which I sent you earlier (available below) is a prequisite: > > http://patchwork.ozlabs.org/patch/23228/ Ted, I hope the printk will be removed or at least ratelimited before it gets upstream? Thanks, -Eric ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: EXT4 ENOSPC Bug 2009-02-19 15:46 ` Eric Sandeen @ 2009-02-23 2:02 ` Theodore Tso 0 siblings, 0 replies; 39+ messages in thread From: Theodore Tso @ 2009-02-23 2:02 UTC (permalink / raw) To: Eric Sandeen Cc: Andres Freund, Alex Buell, adilger, LKML, linux-ext4, Jonathan Bastien-Filiatrault, Aneesh Kumar K.V On Thu, Feb 19, 2009 at 09:46:51AM -0600, Eric Sandeen wrote: > > Ted, I hope the printk will be removed or at least ratelimited before it > gets upstream? > Yes, I've added a ratelimit. - Ted ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: EXT4 ENOSPC Bug 2009-02-18 21:29 ` Theodore Tso 2009-02-19 2:18 ` Andres Freund @ 2009-02-27 3:57 ` Andres Freund 1 sibling, 0 replies; 39+ messages in thread From: Andres Freund @ 2009-02-27 3:57 UTC (permalink / raw) To: Theodore Tso, Alex Buell, adilger, LKML, linux-ext4, Jonathan Bastien-Filiatrault <jo On 02/18/2009 10:29 PM, Theodore Tso wrote: >> [10379.575904] ext4: find_group_flex failed, fallback succeeded dir 416319 >> [10379.576002] ext4: find_group_flex failed, fallback succeeded dir 416319 >> [10379.579981] ext4: find_group_flex failed, fallback succeeded dir 416319 >> [10379.580097] ext4: find_group_flex failed, fallback succeeded dir 416319 >> (with different directories) > Can you send me a dumpe2fs of the filesystem in question? I'm curious > what was going on... Unfortunately the system was rebooted, before I had the chance to do the dump - since then the problem has not reemerged. Would a dump after reboot still be usefull? Andres ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: EXT4 ENOSPC Bug 2009-02-16 19:00 ` Theodore Tso 2009-02-17 17:21 ` Alex Buell 2009-02-17 17:36 ` Andres Freund @ 2009-02-17 18:13 ` Eric Sandeen 2009-02-17 20:08 ` Eric Sandeen [not found] ` <499B1935.10906@redhat.com> 2 siblings, 2 replies; 39+ messages in thread From: Eric Sandeen @ 2009-02-17 18:13 UTC (permalink / raw) To: Theodore Tso, Andres Freund, Alex Buell, adilger, LKML, linux-ext4 Theodore Tso wrote: > On Mon, Feb 16, 2009 at 04:27:03PM +0100, Andres Freund wrote: >> So, yes, seems to be an inode allocation problem. >> > > Andres, Alex, others, > > I'm pretty sure the ENOSPC problem which you both found is an inode > allocation problem. Some of you seem to have an easier time > reproducing it than others; could you try this patch, and periodically > scan your system logs for the message "ext4: find_group_flex failed, > fallback succeeded"? If the problem goes away for you, and you find > the occasional aforemention message in your system log, that will > confirm what I suspect, which is the bug is in fs/ext4/inode.c's > find_group_flex() function. (If I'm wrong, the fallback code will > activate only when the filesystem is genuinely out of inodes, which > should be very rare.) > > More comments are in the patch header. My current long-term plan for > dealing with this is to enhance find_group_orlov() to and > find_group_other() to understand about flex_bg's. Ok, I finally got to where I can reliably hit this. Just as I was about to install an ext4 with this patch in place, and the bug was preventing the new initrd creation ;) But worked around that, and: ext4: find_group_flex failed, fallback succeeded dir 258402 ext4: find_group_flex failed, fallback succeeded dir 258402 ext4: find_group_flex failed, fallback succeeded dir 258402 ext4: find_group_flex failed, fallback succeeded dir 258402 .... I'll see if I can dig a bit more as to why the find_group_flex failed, if you think it's worth it, Ted. -Eric ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: EXT4 ENOSPC Bug 2009-02-17 18:13 ` Eric Sandeen @ 2009-02-17 20:08 ` Eric Sandeen [not found] ` <499B1935.10906@redhat.com> 1 sibling, 0 replies; 39+ messages in thread From: Eric Sandeen @ 2009-02-17 20:08 UTC (permalink / raw) To: Theodore Tso, Andres Freund, Alex Buell, adilger, LKML, linux-ext4 Eric Sandeen wrote: > Theodore Tso wrote: >> On Mon, Feb 16, 2009 at 04:27:03PM +0100, Andres Freund wrote: >>> So, yes, seems to be an inode allocation problem. >>> >> Andres, Alex, others, >> >> I'm pretty sure the ENOSPC problem which you both found is an inode >> allocation problem. Some of you seem to have an easier time >> reproducing it than others; could you try this patch, and periodically >> scan your system logs for the message "ext4: find_group_flex failed, >> fallback succeeded"? If the problem goes away for you, and you find >> the occasional aforemention message in your system log, that will >> confirm what I suspect, which is the bug is in fs/ext4/inode.c's >> find_group_flex() function. (If I'm wrong, the fallback code will >> activate only when the filesystem is genuinely out of inodes, which >> should be very rare.) >> >> More comments are in the patch header. My current long-term plan for >> dealing with this is to enhance find_group_orlov() to and >> find_group_other() to understand about flex_bg's. > > Ok, I finally got to where I can reliably hit this. Just as I was about > to install an ext4 with this patch in place, and the bug was preventing > the new initrd creation ;) But worked around that, and: > > ext4: find_group_flex failed, fallback succeeded dir 258402 > ext4: find_group_flex failed, fallback succeeded dir 258402 > ext4: find_group_flex failed, fallback succeeded dir 258402 > ext4: find_group_flex failed, fallback succeeded dir 258402 > .... > > I'll see if I can dig a bit more as to why the find_group_flex failed, > if you think it's worth it, Ted. FWIW my problem seems to be different than others have encountered; mine persists past reboot, while other reporters have said that a reboot (remount) makes the problem go away. I seem to be encountering some silliness in find_group_flex when 2 out of 3 groups are full (I "only" have 55k inodes left, all in the last group). -Eric ^ permalink raw reply [flat|nested] 39+ messages in thread
[parent not found: <499B1935.10906@redhat.com>]
* Re: EXT4 ENOSPC Bug [not found] ` <499B1935.10906@redhat.com> @ 2009-02-17 22:00 ` Theodore Tso 2009-02-17 22:30 ` Alex Buell 1 sibling, 0 replies; 39+ messages in thread From: Theodore Tso @ 2009-02-17 22:00 UTC (permalink / raw) To: Eric Sandeen Cc: Andres Freund, Alex Buell, adilger, LKML, linux-ext4, Jonathan Bastien-Filiatrault, Aneesh Kumar K.V On Tue, Feb 17, 2009 at 02:08:21PM -0600, Eric Sandeen wrote: > FWIW my problem seems to be different than others have encountered; mine > persists past reboot, while other reporters have said that a reboot > (remount) makes the problem go away. It might or might not be the same problem, since the reporters were doing this on a mounted root partition, and on a filesystem quite a bit larger than your test filesystem; so it could be that the act of shutting down and rebooting created/deleted various pid files, and purturbed the filesystem to make the problem go away. The other possibility is that it is the flex_bg specific counters which were introduced specifically for find_group_flex. I'm not wild about them since they mean we have to take an extra flex_bg specific spin lock for every block and inode allocation. The Orlov algorithm only needs the information when allocating directories, and since those are rarer than file allocations, I think it should be OK to simply sum up the necessary fields at directory allocation time instead of trying to maintain separate counters (which could possibly get corrupted, although I couldn't see a way that they could be getting out of sync with reality). - Ted ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: EXT4 ENOSPC Bug [not found] ` <499B1935.10906@redhat.com> 2009-02-17 22:00 ` Theodore Tso @ 2009-02-17 22:30 ` Alex Buell 2009-02-17 22:56 ` Eric Sandeen 1 sibling, 1 reply; 39+ messages in thread From: Alex Buell @ 2009-02-17 22:30 UTC (permalink / raw) To: Eric Sandeen Cc: Theodore Tso, Andres Freund, adilger, LKML, linux-ext4, Jonathan Bastien-Filiatrault, Aneesh Kumar K.V On Tue, 17 Feb 2009 14:08:21 -0600, I waved a wand and this message magically appears in front of Eric Sandeen: > FWIW my problem seems to be different than others have encountered; > mine persists past reboot, while other reporters have said that a > reboot (remount) makes the problem go away. > > I seem to be encountering some silliness in find_group_flex when 2 out > of 3 groups are full (I "only" have 55k inodes left, all in the last > group). I've discovered a forced fsck clears this. HTH. -- http://www.munted.org.uk Fearsome grindings. ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: EXT4 ENOSPC Bug 2009-02-17 22:30 ` Alex Buell @ 2009-02-17 22:56 ` Eric Sandeen 2009-02-17 22:59 ` Alex Buell 0 siblings, 1 reply; 39+ messages in thread From: Eric Sandeen @ 2009-02-17 22:56 UTC (permalink / raw) To: Alex Buell Cc: Theodore Tso, Andres Freund, adilger, LKML, linux-ext4, Jonathan Bastien-Filiatrault, Aneesh Kumar K.V Alex Buell wrote: > On Tue, 17 Feb 2009 14:08:21 -0600, I waved a wand and this message > magically appears in front of Eric Sandeen: > >> FWIW my problem seems to be different than others have encountered; >> mine persists past reboot, while other reporters have said that a >> reboot (remount) makes the problem go away. >> >> I seem to be encountering some silliness in find_group_flex when 2 out >> of 3 groups are full (I "only" have 55k inodes left, all in the last >> group). > > I've discovered a forced fsck clears this. HTH. Do you have the output of the fsck run? -Eric ^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: EXT4 ENOSPC Bug 2009-02-17 22:56 ` Eric Sandeen @ 2009-02-17 22:59 ` Alex Buell 0 siblings, 0 replies; 39+ messages in thread From: Alex Buell @ 2009-02-17 22:59 UTC (permalink / raw) To: Eric Sandeen Cc: Theodore Tso, Andres Freund, adilger, LKML, linux-ext4, Jonathan Bastien-Filiatrault, Aneesh Kumar K.V On Tue, 17 Feb 2009 16:56:23 -0600, I waved a wand and this message magically appears in front of Eric Sandeen: > Alex Buell wrote: > > On Tue, 17 Feb 2009 14:08:21 -0600, I waved a wand and this message > > magically appears in front of Eric Sandeen: > > > >> FWIW my problem seems to be different than others have encountered; > >> mine persists past reboot, while other reporters have said that a > >> reboot (remount) makes the problem go away. > >> > >> I seem to be encountering some silliness in find_group_flex when 2 > >> out of 3 groups are full (I "only" have 55k inodes left, all in > >> the last group). > > > > I've discovered a forced fsck clears this. HTH. > > Do you have the output of the fsck run? I'm afraid not, there's no way to save the output on a forced fsck reboot. -- http://www.munted.org.uk Fearsome grindings. ^ permalink raw reply [flat|nested] 39+ messages in thread
end of thread, other threads:[~2009-02-27 3:57 UTC | newest]
Thread overview: 39+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-11-29 13:18 EXT4 ENOSPC Bug Andres Freund
2008-11-29 20:32 ` Jonathan Bastien-Filiatrault
2008-11-29 21:15 ` Theodore Tso
2008-11-29 21:31 ` Andres Freund
2008-12-01 12:34 ` Andres Freund
2008-12-01 19:42 ` Andreas Dilger
2008-12-01 20:16 ` Andres Freund
2008-12-02 7:57 ` Theodore Tso
2008-12-02 14:58 ` Andres Freund
2008-12-02 16:47 ` Theodore Tso
2008-12-02 17:47 ` Andres Freund
2008-12-02 20:33 ` Theodore Tso
2008-12-03 0:37 ` Jonathan Bastien-Filiatrault
2008-12-03 0:40 ` Jonathan Bastien-Filiatrault
2008-12-03 4:37 ` Theodore Tso
2008-12-03 15:34 ` Aneesh Kumar K.V
2008-12-03 17:23 ` Theodore Tso
2008-12-03 18:18 ` Andres Freund
2008-12-02 15:26 ` Henrique de Moraes Holschuh
2008-12-10 0:07 ` Andres Freund
2009-02-16 11:37 ` Andres Freund
2009-02-16 15:01 ` Theodore Tso
2009-02-16 15:27 ` Andres Freund
2009-02-16 19:00 ` Theodore Tso
2009-02-17 17:21 ` Alex Buell
2009-02-17 17:36 ` Andres Freund
2009-02-18 21:18 ` Andres Freund
2009-02-18 21:29 ` Theodore Tso
2009-02-19 2:18 ` Andres Freund
2009-02-19 3:22 ` Theodore Tso
2009-02-19 15:46 ` Eric Sandeen
2009-02-23 2:02 ` Theodore Tso
2009-02-27 3:57 ` Andres Freund
2009-02-17 18:13 ` Eric Sandeen
2009-02-17 20:08 ` Eric Sandeen
[not found] ` <499B1935.10906@redhat.com>
2009-02-17 22:00 ` Theodore Tso
2009-02-17 22:30 ` Alex Buell
2009-02-17 22:56 ` Eric Sandeen
2009-02-17 22:59 ` Alex Buell
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).