* Bad ext3/nfs DoS bug
@ 2006-07-17 13:01 James
2006-07-17 13:06 ` Marcel Holtmann
` (2 more replies)
0 siblings, 3 replies; 32+ messages in thread
From: James @ 2006-07-17 13:01 UTC (permalink / raw)
To: linux-kernel
I've tried contacting the relevant maintainers directly,
and it's even in the kernel bugzilla, but nothing's happened
and it's been over a month now. No-one seems to be doing anyting
about this. Is one meant to post this to bugtraq or what?
Here's the bug: http://bugzilla.kernel.org/show_bug.cgi?id=6828
(exploit code follows)
> We found this rather surprising behaviour when debugging a
> network card for one of our embedded systems. There was a
> bus problem that occasionally caused the network card to
> place random data in the outgoing packets. We were using
> NFS root, as we hadn't written drivers for the block
> devices yet, and discovered our Linux NFS servers getting
> ext3 errors. It turned out that the 3com cards we have in
> the servers lie about checking UDP checksums, and passed
> the rubbish to knfsd where it was causing the problem.
>
> Here's an example one of our widgets (dcm503) is talking
> to an NFS server (dufftown)
>
> 17:28:38.535011 dcm503.guralp.local.984095109 > dufftown.guralp.local.nfs: 116
> lookup fh Unknown/1 "" (DF) (ttl 64, id 0, len 144)
> 4500 0090 0000 4000 4011 3d45 0a52 01fa
> c0a8 3024 03ff 0801 007c 8e9c 3aa8 1985
> 0000 0000 0000 0002 0001 86a3 0000 0002
> 0000 0004 0000 0001 0000 001c 028f 5b0c
> 0000 0006 6463 6d35 3033 0000 0000 0000
> 0000 0000 0000 0000 0000 0000 0000 0000
> 0100 0001 0021 0003 3d26 3d00 4a2f ffff
> 3d00 2c08 c923 0000 0000 0000 0000 0000
> 0000 0000 000a 6d6f 756e 7470 6f69 6e74
>
> so what's happened here is 4a2f ffff should have been 4a2f
> xxxx but the network card has missed the clock on the bus
> and gotten ffff instead
>
> nfsd_dispatch: vers 2 proc 4
> nfsd: LOOKUP 32: 01000001 03002100 003d263d ffff2f4a 082c003d 000023c9
> nfsd: nfsd_lookup(fh 32: 01000001 03002100 003d263d ffff2f4a 082c003d
> 000023c9, )
> nfsd: fh_verify(32: 01000001 03002100 003d263d ffff2f4a 082c003d 000023c9)
>
> so here the client does a V2 lookup with a DH which has
> gotten screwed up by my clients network card, this is
> received by my server, gets past the UDP checksum code
> (thank you 3com) and ends up at knfsd.
>
> knfsd passes this to fh_verify which decodes it to be hde3
> and inode 4294913866 (0xffff2f4a)
>
> that then gets passed to ext3 which then panics.
>
> EXT3-fs error (device hde3): ext3_get_inode_block: bad inode number:
> 4294913866
>
> marks the file system as containing an error, and remounts
> the system read only.
>
> Obviously this is sub optimal, and a fairly horrid DoS
> since anyone can craft a UDP packet, with a bogus FH in
> it. Whilst this is for V2_LOOKUP it works for all of the
> V2 procedures we tried.
>
exploit code is available at
http://www.madingley.org/uploaded/crash-nfs.tar.gz
James.
^ permalink raw reply [flat|nested] 32+ messages in thread* Re: Bad ext3/nfs DoS bug 2006-07-17 13:01 Bad ext3/nfs DoS bug James @ 2006-07-17 13:06 ` Marcel Holtmann 2006-07-17 18:43 ` Marcel Holtmann 2006-07-18 7:55 ` Marcel Holtmann 2 siblings, 0 replies; 32+ messages in thread From: Marcel Holtmann @ 2006-07-17 13:06 UTC (permalink / raw) To: James; +Cc: linux-kernel Hi James, > I've tried contacting the relevant maintainers directly, > and it's even in the kernel bugzilla, but nothing's happened > and it's been over a month now. No-one seems to be doing anyting > about this. Is one meant to post this to bugtraq or what? good contact points are security@kernel.org and vendor-sec@lst.de if this problem has a security implication. Regards Marcel ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: Bad ext3/nfs DoS bug 2006-07-17 13:01 Bad ext3/nfs DoS bug James 2006-07-17 13:06 ` Marcel Holtmann @ 2006-07-17 18:43 ` Marcel Holtmann 2006-07-18 7:55 ` Marcel Holtmann 2 siblings, 0 replies; 32+ messages in thread From: Marcel Holtmann @ 2006-07-17 18:43 UTC (permalink / raw) To: James; +Cc: linux-kernel Hi James, > I've tried contacting the relevant maintainers directly, > and it's even in the kernel bugzilla, but nothing's happened > and it's been over a month now. No-one seems to be doing anyting > about this. Is one meant to post this to bugtraq or what? please use CVE-2006-3468 for any further reference of this issue. Regards Marcel ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: Bad ext3/nfs DoS bug 2006-07-17 13:01 Bad ext3/nfs DoS bug James 2006-07-17 13:06 ` Marcel Holtmann 2006-07-17 18:43 ` Marcel Holtmann @ 2006-07-18 7:55 ` Marcel Holtmann 2006-07-18 14:56 ` James 2 siblings, 1 reply; 32+ messages in thread From: Marcel Holtmann @ 2006-07-18 7:55 UTC (permalink / raw) To: James; +Cc: linux-kernel Hi James, > I've tried contacting the relevant maintainers directly, > and it's even in the kernel bugzilla, but nothing's happened > and it's been over a month now. No-one seems to be doing anyting > about this. Is one meant to post this to bugtraq or what? > > Here's the bug: http://bugzilla.kernel.org/show_bug.cgi?id=6828 > (exploit code follows) > > > We found this rather surprising behaviour when debugging a > > network card for one of our embedded systems. There was a > > bus problem that occasionally caused the network card to > > place random data in the outgoing packets. We were using > > NFS root, as we hadn't written drivers for the block > > devices yet, and discovered our Linux NFS servers getting > > ext3 errors. It turned out that the 3com cards we have in > > the servers lie about checking UDP checksums, and passed > > the rubbish to knfsd where it was causing the problem. > > > > Here's an example one of our widgets (dcm503) is talking > > to an NFS server (dufftown) > > > > 17:28:38.535011 dcm503.guralp.local.984095109 > dufftown.guralp.local.nfs: 116 > > lookup fh Unknown/1 "" (DF) (ttl 64, id 0, len 144) > > 4500 0090 0000 4000 4011 3d45 0a52 01fa > > c0a8 3024 03ff 0801 007c 8e9c 3aa8 1985 > > 0000 0000 0000 0002 0001 86a3 0000 0002 > > 0000 0004 0000 0001 0000 001c 028f 5b0c > > 0000 0006 6463 6d35 3033 0000 0000 0000 > > 0000 0000 0000 0000 0000 0000 0000 0000 > > 0100 0001 0021 0003 3d26 3d00 4a2f ffff > > 3d00 2c08 c923 0000 0000 0000 0000 0000 > > 0000 0000 000a 6d6f 756e 7470 6f69 6e74 > > > > so what's happened here is 4a2f ffff should have been 4a2f > > xxxx but the network card has missed the clock on the bus > > and gotten ffff instead > > > > nfsd_dispatch: vers 2 proc 4 > > nfsd: LOOKUP 32: 01000001 03002100 003d263d ffff2f4a 082c003d 000023c9 > > nfsd: nfsd_lookup(fh 32: 01000001 03002100 003d263d ffff2f4a 082c003d > > 000023c9, ) > > nfsd: fh_verify(32: 01000001 03002100 003d263d ffff2f4a 082c003d 000023c9) > > > > so here the client does a V2 lookup with a DH which has > > gotten screwed up by my clients network card, this is > > received by my server, gets past the UDP checksum code > > (thank you 3com) and ends up at knfsd. > > > > knfsd passes this to fh_verify which decodes it to be hde3 > > and inode 4294913866 (0xffff2f4a) > > > > that then gets passed to ext3 which then panics. > > > > EXT3-fs error (device hde3): ext3_get_inode_block: bad inode number: > > 4294913866 > > > > marks the file system as containing an error, and remounts > > the system read only. > > > > Obviously this is sub optimal, and a fairly horrid DoS > > since anyone can craft a UDP packet, with a bogus FH in > > it. Whilst this is for V2_LOOKUP it works for all of the > > V2 procedures we tried. > > > > exploit code is available at > > http://www.madingley.org/uploaded/crash-nfs.tar.gz so I used your exploit and I could reproduce it on every 2.6 kernel, I tried so far. However with a 2.4 kernel I see the error messages, but it doesn't get remounted read-only. Did you run tests with 2.4 kernels? Regards Marcel ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: Bad ext3/nfs DoS bug 2006-07-18 7:55 ` Marcel Holtmann @ 2006-07-18 14:56 ` James 2006-07-18 15:22 ` Marcel Holtmann 0 siblings, 1 reply; 32+ messages in thread From: James @ 2006-07-18 14:56 UTC (permalink / raw) To: Marcel Holtmann; +Cc: James, linux-kernel > so I used your exploit and I could reproduce it on every 2.6 kernel, I > tried so far. That must have been a lot of fscks. > However with a 2.4 kernel I see the error messages, but it > doesn't get remounted read-only. Did you run tests with 2.4 kernels? no, I don't have any to hand, but someone is preparing one now. Is NFS subtree checking on by default in 2.4? James. ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: Bad ext3/nfs DoS bug 2006-07-18 14:56 ` James @ 2006-07-18 15:22 ` Marcel Holtmann 2006-07-18 15:23 ` James 0 siblings, 1 reply; 32+ messages in thread From: Marcel Holtmann @ 2006-07-18 15:22 UTC (permalink / raw) To: James; +Cc: linux-kernel Hi James, > > so I used your exploit and I could reproduce it on every 2.6 kernel, I > > tried so far. > > That must have been a lot of fscks. it wasn't that many. For some obvious reasons I only tested the RHEL and Fedora kernels and vanilla plus stable series. > > However with a 2.4 kernel I see the error messages, but it > > doesn't get remounted read-only. Did you run tests with 2.4 kernels? > > no, I don't have any to hand, but someone is preparing one > now. Is NFS subtree checking on by default in 2.4? I haven't checked within the code, but the manual page exports(5) states subtree checking as being on by default. And it doesn't mention any difference to 2.4 kernels. What is the reason behind your question? Does disabling subtree checking changes something? Regards Marcel ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: Bad ext3/nfs DoS bug 2006-07-18 15:22 ` Marcel Holtmann @ 2006-07-18 15:23 ` James 2006-07-18 20:18 ` Marcel Holtmann 0 siblings, 1 reply; 32+ messages in thread From: James @ 2006-07-18 15:23 UTC (permalink / raw) To: Marcel Holtmann; +Cc: James, linux-kernel > What is the reason behind your question? Does disabling subtree checking > changes something? just that the iget() call which causes the problem in 2.6 is happening in the subtree checking code, not based on analysis of the flow. James. ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: Bad ext3/nfs DoS bug 2006-07-18 15:23 ` James @ 2006-07-18 20:18 ` Marcel Holtmann 2006-07-19 9:28 ` James 0 siblings, 1 reply; 32+ messages in thread From: Marcel Holtmann @ 2006-07-18 20:18 UTC (permalink / raw) To: James; +Cc: linux-kernel Hi James, > > What is the reason behind your question? Does disabling subtree checking > > changes something? > > just that the iget() call which causes the problem in 2.6 > is happening in the subtree checking code, not based on > analysis of the flow. just did a quick test with the RHEL4 kernel (2.6.9 based) and subtree_check and no_subtree_check export option. No difference and in both cases it gets remounted read-only. Regards Marcel ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: Bad ext3/nfs DoS bug 2006-07-18 20:18 ` Marcel Holtmann @ 2006-07-19 9:28 ` James 2006-07-19 15:55 ` Jan Kara 0 siblings, 1 reply; 32+ messages in thread From: James @ 2006-07-19 9:28 UTC (permalink / raw) To: Marcel Holtmann; +Cc: James, linux-kernel So what happens next? Is the ext3 maintainer on sabatical, or am I expected to submit a patch to fix this? J. ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: Bad ext3/nfs DoS bug 2006-07-19 9:28 ` James @ 2006-07-19 15:55 ` Jan Kara 2006-07-20 4:46 ` Neil Brown 0 siblings, 1 reply; 32+ messages in thread From: Jan Kara @ 2006-07-19 15:55 UTC (permalink / raw) To: James; +Cc: Marcel Holtmann, linux-kernel, akpm, sct > So what happens next? Is the ext3 maintainer on sabatical, > or am I expected to submit a patch to fix this? I guess people are mostly busy with OLS and such so maybe they missed the discussion.. Giving CC to relevant people to catch their attention :) Andrew, Stephen: James has come across a nasty bug (potentially remote DoS). NFS extracts inode number from a filehandle and the inode number eventually ends in ext3_read_inode(). Now if the inode number is bogus, ext3_get_inode_block() calls ext3_error() and filesystem is remounted RO or whatever else is configured. That is quite undesirable in this case. Now the easy "fix" is to change ext3_error() to ext3_warning() but an attacker flooding your logs with warnings is probably not good either and in case the inode number comes from ext3 itself we should really do ext3_error() as there is some corruption in the fs. Better fix would be to add a flag to read_inode() saying that the inode number is from untrusted source (but that means changing a prototype of a function every fs uses) and change export_iget() to pass this flag. Yet another solution would be to make ext3 implement its own get_dentry() export function and pass the flag internally... What do you think is the best solution? Honza -- Jan Kara <jack@suse.cz> SuSE CR Labs ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: Bad ext3/nfs DoS bug 2006-07-19 15:55 ` Jan Kara @ 2006-07-20 4:46 ` Neil Brown 2006-07-20 16:06 ` Jan Kara 2006-07-21 0:42 ` Marcel Holtmann 0 siblings, 2 replies; 32+ messages in thread From: Neil Brown @ 2006-07-20 4:46 UTC (permalink / raw) To: Jan Kara; +Cc: James, Marcel Holtmann, linux-kernel, akpm, sct On Wednesday July 19, jack@suse.cz wrote: > > So what happens next? Is the ext3 maintainer on sabatical, > > or am I expected to submit a patch to fix this? > I guess people are mostly busy with OLS and such so maybe they missed > the discussion.. Giving CC to relevant people to catch their attention > :) > Andrew, Stephen: James has come across a nasty bug (potentially remote > DoS). NFS extracts inode number from a filehandle and the inode number > eventually ends in ext3_read_inode(). Now if the inode number is bogus, > ext3_get_inode_block() calls ext3_error() and filesystem is remounted > RO or whatever else is configured. That is quite undesirable in this > case. > Now the easy "fix" is to change ext3_error() to ext3_warning() but an > attacker flooding your logs with warnings is probably not good either > and in case the inode number comes from ext3 itself we should really do > ext3_error() as there is some corruption in the fs. > Better fix would be to add a flag to read_inode() saying that the inode > number is from untrusted source (but that means changing a prototype of a > function every fs uses) and change export_iget() to pass this flag. Yet > another solution would be to make ext3 implement its own get_dentry() export > function and pass the flag internally... > What do you think is the best solution? I think that a good solution (hard to say if it is the best) is to remove that error message altogether, and put it where inode numbers are read out of directories. Something like the following patch - compile tested only. NeilBrown Avoid triggering ext3_error on bad NFS file handle The inode number out of an NFS file handle gets passed eventually to ext3_get_inode_block without any checking. If ext3_get_inode_block allows it to trigger a error, then bad filehandles can have unpleasant effect. So remove the call to ext3_error there and put a matching check in ext3/namei.c where inode numbers are read of storage. Signed-off-by: Neil Brown <neilb@suse.de> ### Diffstat output ./fs/ext3/inode.c | 10 ++++++---- ./fs/ext3/namei.c | 19 +++++++++++++++++-- 2 files changed, 23 insertions(+), 6 deletions(-) diff .prev/fs/ext3/inode.c ./fs/ext3/inode.c --- .prev/fs/ext3/inode.c 2006-07-20 14:41:07.000000000 +1000 +++ ./fs/ext3/inode.c 2006-07-20 14:42:04.000000000 +1000 @@ -2405,11 +2405,13 @@ static ext3_fsblk_t ext3_get_inode_block if ((ino != EXT3_ROOT_INO && ino != EXT3_JOURNAL_INO && ino != EXT3_RESIZE_INO && ino < EXT3_FIRST_INO(sb)) || - ino > le32_to_cpu(EXT3_SB(sb)->s_es->s_inodes_count)) { - ext3_error(sb, "ext3_get_inode_block", - "bad inode number: %lu", ino); + ino > le32_to_cpu(EXT3_SB(sb)->s_es->s_inodes_count)) + /* This error already checked for in namei.c unless we + * are looking at an NFS filehandle, in which case, + * no error reported is needed + */ return 0; - } + block_group = (ino - 1) / EXT3_INODES_PER_GROUP(sb); if (block_group >= EXT3_SB(sb)->s_groups_count) { ext3_error(sb,"ext3_get_inode_block","group >= groups count"); diff .prev/fs/ext3/namei.c ./fs/ext3/namei.c --- .prev/fs/ext3/namei.c 2006-07-20 14:39:51.000000000 +1000 +++ ./fs/ext3/namei.c 2006-07-20 14:44:23.000000000 +1000 @@ -1000,7 +1000,14 @@ static struct dentry *ext3_lookup(struct if (bh) { unsigned long ino = le32_to_cpu(de->inode); brelse (bh); - inode = iget(dir->i_sb, ino); + if ((ino != EXT3_ROOT_INO && ino != EXT3_JOURNAL_INO && + ino != EXT3_RESIZE_INO && ino < EXT3_FIRST_INO(dir->i_sb)) || + ino > le32_to_cpu(EXT3_SB(dir->i_sb)->s_es->s_inodes_count)) { + ext3_error(dir->i_sb, "ext3_lookup", + "bad inode number: %lu", ino); + inode = NULL; + } else + inode = iget(dir->i_sb, ino); if (!inode) return ERR_PTR(-EACCES); @@ -1028,7 +1035,15 @@ struct dentry *ext3_get_parent(struct de return ERR_PTR(-ENOENT); ino = le32_to_cpu(de->inode); brelse(bh); - inode = iget(child->d_inode->i_sb, ino); + + if ((ino != EXT3_ROOT_INO && ino != EXT3_JOURNAL_INO && + ino != EXT3_RESIZE_INO && ino < EXT3_FIRST_INO(child->d_inode->i_sb)) || + ino > le32_to_cpu(EXT3_SB(child->d_inode->i_sb)->s_es->s_inodes_count)) { + ext3_error(child->d_inode->i_sb, "ext3_get_parent", + "bad inode number: %lu", ino); + inode = NULL; + } else + inode = iget(child->d_inode->i_sb, ino); if (!inode) return ERR_PTR(-EACCES); ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: Bad ext3/nfs DoS bug 2006-07-20 4:46 ` Neil Brown @ 2006-07-20 16:06 ` Jan Kara 2006-07-20 20:11 ` James 2006-07-21 6:39 ` Neil Brown 2006-07-21 0:42 ` Marcel Holtmann 1 sibling, 2 replies; 32+ messages in thread From: Jan Kara @ 2006-07-20 16:06 UTC (permalink / raw) To: Neil Brown; +Cc: James, Marcel Holtmann, linux-kernel, akpm, sct > On Wednesday July 19, jack@suse.cz wrote: > > > So what happens next? Is the ext3 maintainer on sabatical, > > > or am I expected to submit a patch to fix this? > > I guess people are mostly busy with OLS and such so maybe they missed > > the discussion.. Giving CC to relevant people to catch their attention > > :) > > Andrew, Stephen: James has come across a nasty bug (potentially remote > > DoS). NFS extracts inode number from a filehandle and the inode number > > eventually ends in ext3_read_inode(). Now if the inode number is bogus, > > ext3_get_inode_block() calls ext3_error() and filesystem is remounted > > RO or whatever else is configured. That is quite undesirable in this > > case. > > Now the easy "fix" is to change ext3_error() to ext3_warning() but an > > attacker flooding your logs with warnings is probably not good either > > and in case the inode number comes from ext3 itself we should really do > > ext3_error() as there is some corruption in the fs. > > Better fix would be to add a flag to read_inode() saying that the inode > > number is from untrusted source (but that means changing a prototype of a > > function every fs uses) and change export_iget() to pass this flag. Yet > > another solution would be to make ext3 implement its own get_dentry() export > > function and pass the flag internally... > > What do you think is the best solution? > > I think that a good solution (hard to say if it is the best) is to > remove that error message altogether, and put it where inode numbers > are read out of directories. Something like the following patch - > compile tested only. Yes, that looks fine too. I did not realize that we get the inode number only in a few places. Maybe we could wrap the checks in a function (possibly inline) so that the checks are just in one place? Honza -- Jan Kara <jack@suse.cz> SuSE CR Labs ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: Bad ext3/nfs DoS bug 2006-07-20 16:06 ` Jan Kara @ 2006-07-20 20:11 ` James 2006-07-21 6:44 ` Neil Brown 2006-07-21 6:39 ` Neil Brown 1 sibling, 1 reply; 32+ messages in thread From: James @ 2006-07-20 20:11 UTC (permalink / raw) To: Jan Kara; +Cc: Neil Brown, James, Marcel Holtmann, linux-kernel, akpm, sct Excellent - thank you. Neil - out of interest did you ever get my direct mail to you details below: Date: Tue, 6 Jun 2006 18:15:17 +0100 Subject: Bogus FH in NFS request crashes file system code Message-ID: <20060606171517.GA2601@selene.humbnet.james.local> J. ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: Bad ext3/nfs DoS bug 2006-07-20 20:11 ` James @ 2006-07-21 6:44 ` Neil Brown 0 siblings, 0 replies; 32+ messages in thread From: Neil Brown @ 2006-07-21 6:44 UTC (permalink / raw) To: James; +Cc: Jan Kara, Marcel Holtmann, linux-kernel, akpm, sct On Thursday July 20, 20@madingley.org wrote: > Excellent - thank you. > > Neil - out of interest did you ever get my direct mail > to you details below: > > Date: Tue, 6 Jun 2006 18:15:17 +0100 > Subject: Bogus FH in NFS request crashes file system code > Message-ID: <20060606171517.GA2601@selene.humbnet.james.local> > > J. Hmmm.... cannot find it in my mailbox, and it doesn't ring any bell. The first I remember of this was early July shortly before going on vacation.... That doesn't mean it didn't arrive though :-( My apologies if I did miss it. NeilBrown ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: Bad ext3/nfs DoS bug 2006-07-20 16:06 ` Jan Kara 2006-07-20 20:11 ` James @ 2006-07-21 6:39 ` Neil Brown 2006-07-21 14:24 ` Jan Kara 2006-07-22 0:06 ` Andrew Morton 1 sibling, 2 replies; 32+ messages in thread From: Neil Brown @ 2006-07-21 6:39 UTC (permalink / raw) To: Jan Kara; +Cc: James, Marcel Holtmann, linux-kernel, akpm, sct On Thursday July 20, jack@suse.cz wrote: > Yes, that looks fine too. I did not realize that we get the inode > number only in a few places. Maybe we could wrap the checks in a > function (possibly inline) so that the checks are just in one place? Like this? NeilBrown Avoid triggering ext3_error on bad NFS file handle The inode number out of an NFS file handle gets passed eventually to ext3_get_inode_block without any checking. If ext3_get_inode_block allows it to trigger a error, then bad filehandles can have unpleasant effect. So remove the call to ext3_error there and put a matching check in ext3/namei.c where inode numbers are read of storage. Signed-off-by: Neil Brown <neilb@suse.de> ### Diffstat output ./fs/ext3/inode.c | 13 ++++++------- ./fs/ext3/namei.c | 15 +++++++++++++-- ./include/linux/ext3_fs.h | 9 +++++++++ 3 files changed, 28 insertions(+), 9 deletions(-) diff .prev/fs/ext3/inode.c ./fs/ext3/inode.c --- .prev/fs/ext3/inode.c 2006-07-20 14:41:07.000000000 +1000 +++ ./fs/ext3/inode.c 2006-07-21 16:36:32.000000000 +1000 @@ -2402,14 +2402,13 @@ static ext3_fsblk_t ext3_get_inode_block struct buffer_head *bh; struct ext3_group_desc * gdp; - - if ((ino != EXT3_ROOT_INO && ino != EXT3_JOURNAL_INO && - ino != EXT3_RESIZE_INO && ino < EXT3_FIRST_INO(sb)) || - ino > le32_to_cpu(EXT3_SB(sb)->s_es->s_inodes_count)) { - ext3_error(sb, "ext3_get_inode_block", - "bad inode number: %lu", ino); + if (!ext3_valid_inum(sb, ino)) + /* This error already checked for in namei.c unless we + * are looking at an NFS filehandle, in which case, + * no error reported is needed + */ return 0; - } + block_group = (ino - 1) / EXT3_INODES_PER_GROUP(sb); if (block_group >= EXT3_SB(sb)->s_groups_count) { ext3_error(sb,"ext3_get_inode_block","group >= groups count"); diff .prev/fs/ext3/namei.c ./fs/ext3/namei.c --- .prev/fs/ext3/namei.c 2006-07-20 14:39:51.000000000 +1000 +++ ./fs/ext3/namei.c 2006-07-21 16:36:09.000000000 +1000 @@ -1000,7 +1000,12 @@ static struct dentry *ext3_lookup(struct if (bh) { unsigned long ino = le32_to_cpu(de->inode); brelse (bh); - inode = iget(dir->i_sb, ino); + if (!ext3_valid_inum(dir->i_sb, ino)) { + ext3_error(dir->i_sb, "ext3_lookup", + "bad inode number: %lu", ino); + inode = NULL; + } else + inode = iget(dir->i_sb, ino); if (!inode) return ERR_PTR(-EACCES); @@ -1028,7 +1033,13 @@ struct dentry *ext3_get_parent(struct de return ERR_PTR(-ENOENT); ino = le32_to_cpu(de->inode); brelse(bh); - inode = iget(child->d_inode->i_sb, ino); + + if (!ext3_valid_inum(child->d_inode->i_sb, ino)) { + ext3_error(child->d_inode->i_sb, "ext3_get_parent", + "bad inode number: %lu", ino); + inode = NULL; + } else + inode = iget(child->d_inode->i_sb, ino); if (!inode) return ERR_PTR(-EACCES); diff .prev/include/linux/ext3_fs.h ./include/linux/ext3_fs.h --- .prev/include/linux/ext3_fs.h 2006-07-21 16:34:01.000000000 +1000 +++ ./include/linux/ext3_fs.h 2006-07-21 16:35:55.000000000 +1000 @@ -492,6 +492,15 @@ static inline struct ext3_inode_info *EX { return container_of(inode, struct ext3_inode_info, vfs_inode); } + +static inline int ext3_valid_inum(struct super_block *sb, unsigned long ino) +{ + return ino == EXT3_ROOT_INO || + ino == EXT3_JOURNAL_INO || + ino == EXT3_RESIZE_INO || + (ino > EXT3_FIRST_INO(sb) && + ino <= le32_to_cpu(EXT3_SB(sb)->s_es->s_inodes_count)); +} #else /* Assume that user mode programs are passing in an ext3fs superblock, not * a kernel struct super_block. This will allow us to call the feature-test ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: Bad ext3/nfs DoS bug 2006-07-21 6:39 ` Neil Brown @ 2006-07-21 14:24 ` Jan Kara 2006-07-22 0:06 ` Andrew Morton 1 sibling, 0 replies; 32+ messages in thread From: Jan Kara @ 2006-07-21 14:24 UTC (permalink / raw) To: Neil Brown; +Cc: James, Marcel Holtmann, linux-kernel, akpm, sct > On Thursday July 20, jack@suse.cz wrote: > > Yes, that looks fine too. I did not realize that we get the inode > > number only in a few places. Maybe we could wrap the checks in a > > function (possibly inline) so that the checks are just in one place? > > Like this? Yes. > NeilBrown > > > Avoid triggering ext3_error on bad NFS file handle > > The inode number out of an NFS file handle gets passed > eventually to ext3_get_inode_block without any checking. > If ext3_get_inode_block allows it to trigger a error, > then bad filehandles can have unpleasant effect. > > So remove the call to ext3_error there and put a matching > check in ext3/namei.c where inode numbers are read of storage. > > Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Jan Kara <jack@suse.cz> Honza > ### Diffstat output > ./fs/ext3/inode.c | 13 ++++++------- > ./fs/ext3/namei.c | 15 +++++++++++++-- > ./include/linux/ext3_fs.h | 9 +++++++++ > 3 files changed, 28 insertions(+), 9 deletions(-) > > diff .prev/fs/ext3/inode.c ./fs/ext3/inode.c > --- .prev/fs/ext3/inode.c 2006-07-20 14:41:07.000000000 +1000 > +++ ./fs/ext3/inode.c 2006-07-21 16:36:32.000000000 +1000 > @@ -2402,14 +2402,13 @@ static ext3_fsblk_t ext3_get_inode_block > struct buffer_head *bh; > struct ext3_group_desc * gdp; > > - > - if ((ino != EXT3_ROOT_INO && ino != EXT3_JOURNAL_INO && > - ino != EXT3_RESIZE_INO && ino < EXT3_FIRST_INO(sb)) || > - ino > le32_to_cpu(EXT3_SB(sb)->s_es->s_inodes_count)) { > - ext3_error(sb, "ext3_get_inode_block", > - "bad inode number: %lu", ino); > + if (!ext3_valid_inum(sb, ino)) > + /* This error already checked for in namei.c unless we > + * are looking at an NFS filehandle, in which case, > + * no error reported is needed > + */ > return 0; > - } > + > block_group = (ino - 1) / EXT3_INODES_PER_GROUP(sb); > if (block_group >= EXT3_SB(sb)->s_groups_count) { > ext3_error(sb,"ext3_get_inode_block","group >= groups count"); > > diff .prev/fs/ext3/namei.c ./fs/ext3/namei.c > --- .prev/fs/ext3/namei.c 2006-07-20 14:39:51.000000000 +1000 > +++ ./fs/ext3/namei.c 2006-07-21 16:36:09.000000000 +1000 > @@ -1000,7 +1000,12 @@ static struct dentry *ext3_lookup(struct > if (bh) { > unsigned long ino = le32_to_cpu(de->inode); > brelse (bh); > - inode = iget(dir->i_sb, ino); > + if (!ext3_valid_inum(dir->i_sb, ino)) { > + ext3_error(dir->i_sb, "ext3_lookup", > + "bad inode number: %lu", ino); > + inode = NULL; > + } else > + inode = iget(dir->i_sb, ino); > > if (!inode) > return ERR_PTR(-EACCES); > @@ -1028,7 +1033,13 @@ struct dentry *ext3_get_parent(struct de > return ERR_PTR(-ENOENT); > ino = le32_to_cpu(de->inode); > brelse(bh); > - inode = iget(child->d_inode->i_sb, ino); > + > + if (!ext3_valid_inum(child->d_inode->i_sb, ino)) { > + ext3_error(child->d_inode->i_sb, "ext3_get_parent", > + "bad inode number: %lu", ino); > + inode = NULL; > + } else > + inode = iget(child->d_inode->i_sb, ino); > > if (!inode) > return ERR_PTR(-EACCES); > > diff .prev/include/linux/ext3_fs.h ./include/linux/ext3_fs.h > --- .prev/include/linux/ext3_fs.h 2006-07-21 16:34:01.000000000 +1000 > +++ ./include/linux/ext3_fs.h 2006-07-21 16:35:55.000000000 +1000 > @@ -492,6 +492,15 @@ static inline struct ext3_inode_info *EX > { > return container_of(inode, struct ext3_inode_info, vfs_inode); > } > + > +static inline int ext3_valid_inum(struct super_block *sb, unsigned long ino) > +{ > + return ino == EXT3_ROOT_INO || > + ino == EXT3_JOURNAL_INO || > + ino == EXT3_RESIZE_INO || > + (ino > EXT3_FIRST_INO(sb) && > + ino <= le32_to_cpu(EXT3_SB(sb)->s_es->s_inodes_count)); > +} > #else > /* Assume that user mode programs are passing in an ext3fs superblock, not > * a kernel struct super_block. This will allow us to call the feature-test -- Jan Kara <jack@suse.cz> SuSE CR Labs ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: Bad ext3/nfs DoS bug 2006-07-21 6:39 ` Neil Brown 2006-07-21 14:24 ` Jan Kara @ 2006-07-22 0:06 ` Andrew Morton 2006-07-22 13:17 ` Theodore Tso 1 sibling, 1 reply; 32+ messages in thread From: Andrew Morton @ 2006-07-22 0:06 UTC (permalink / raw) To: Neil Brown Cc: jack, 20, marcel, linux-kernel, sct, Theodore Ts'o, Andreas Dilger On Fri, 21 Jul 2006 16:39:32 +1000 Neil Brown <neilb@suse.de> wrote: > Avoid triggering ext3_error on bad NFS file handle > > The inode number out of an NFS file handle gets passed > eventually to ext3_get_inode_block without any checking. > If ext3_get_inode_block allows it to trigger a error, > then bad filehandles can have unpleasant effect. > > So remove the call to ext3_error there and put a matching > check in ext3/namei.c where inode numbers are read of storage. > There are strange things happening in here. > +static inline int ext3_valid_inum(struct super_block *sb, unsigned long ino) > +{ > + return ino == EXT3_ROOT_INO || > + ino == EXT3_JOURNAL_INO || > + ino == EXT3_RESIZE_INO || > + (ino > EXT3_FIRST_INO(sb) && > + ino <= le32_to_cpu(EXT3_SB(sb)->s_es->s_inodes_count)); > +} One would expect the inode validity test to be (ino >= EXT3_FIRST_INO(sb)) && (ino < ...->s_inodes_count)) but even this assumes that s_inodes_count is misnamed and really should be called s_last_inode_plus_one. If it is not misnamed then the validity test should be (ino >= EXT3_FIRST_INO(sb)) && (ino < EXT3_FIRST_INO(sb) + ...->s_inodes_count)) Look through the filesystem for other uses of EXT3_FIRST_INO(). It's all rather fishily inconsistent. Ted, Andreas: do you think we could come up with statements describing exactly what the values in EXT3_FIRST_INO() and in ->s_inodes_count represent? Thanks. Also Neil, I wonder whether this patch of yours still permits NFS clients to access the journal and resize inodes in undesirable ways? ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: Bad ext3/nfs DoS bug 2006-07-22 0:06 ` Andrew Morton @ 2006-07-22 13:17 ` Theodore Tso 2006-07-25 1:56 ` Andrew Morton 0 siblings, 1 reply; 32+ messages in thread From: Theodore Tso @ 2006-07-22 13:17 UTC (permalink / raw) To: Andrew Morton Cc: Neil Brown, jack, 20, marcel, linux-kernel, sct, Andreas Dilger Sorry, OLS and some work-related emergencies have been hitting hard this week, so I've been deferred doing a full review of Jan's patch. Hopefully after OLS I'll be able to comment more fully. On Fri, Jul 21, 2006 at 05:06:27PM -0700, Andrew Morton wrote: > There are strange things happening in here. > > > +static inline int ext3_valid_inum(struct super_block *sb, unsigned long ino) > > +{ > > + return ino == EXT3_ROOT_INO || > > + ino == EXT3_JOURNAL_INO || > > + ino == EXT3_RESIZE_INO || > > + (ino > EXT3_FIRST_INO(sb) && > > + ino <= le32_to_cpu(EXT3_SB(sb)->s_es->s_inodes_count)); > > +} > > One would expect the inode validity test to be > > (ino >= EXT3_FIRST_INO(sb)) && (ino < ...->s_inodes_count)) > > but even this assumes that s_inodes_count is misnamed and really should be > called s_last_inode_plus_one. If it is not misnamed then the validity test > should be > > (ino >= EXT3_FIRST_INO(sb)) && > (ino < EXT3_FIRST_INO(sb) + ...->s_inodes_count)) > > Look through the filesystem for other uses of EXT3_FIRST_INO(). It's all > rather fishily inconsistent. I don't think there's anything in consistent. Basically, inodes are 1 based (inode number 0 in a directory entry means that the file is deleted and the directory entry is freed). Hence valid inode numbers are between 1 and s_inodes_count, inclusive. Inodes between 1 and EXT3_FIRST_INO-1 (inclusive) are reserved, are should always be marked as in use in the inode allocation bitmap. EXT3_FIRST_INO (which is 11 on all ext3 filesystems to date) is generally the lost+found directory, but that's just because it is the first file/directory which is allocated by mke2fs. So EXT3_FIRST_INO representes the first inode which can be allocated by userspace. (The root inode doesn't fall in this category because it can never be deleted or reallocated after the filesystem is created, and as a nod to Unix filesystem history, it has inode #2). The net of all of this is the inode validity test should be: (ino >= EXT3_FIRST_INO(sb)) && (ino <= ...->s_inodes_count)) However, I would suggest that we *not* allow remote NFS users to get access to the journal inode or the resize inode, please? That's only going to cause mischief of the DoS attack kind..... - Ted ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: Bad ext3/nfs DoS bug 2006-07-22 13:17 ` Theodore Tso @ 2006-07-25 1:56 ` Andrew Morton 2006-07-25 2:21 ` Neil Brown 2006-07-25 2:36 ` Neil Brown 0 siblings, 2 replies; 32+ messages in thread From: Andrew Morton @ 2006-07-25 1:56 UTC (permalink / raw) To: Theodore Tso; +Cc: neilb, jack, 20, marcel, linux-kernel, sct, adilger On Sat, 22 Jul 2006 09:17:59 -0400 Theodore Tso <tytso@mit.edu> wrote: > Sorry, OLS and some work-related emergencies have been hitting hard > this week, so I've been deferred doing a full review of Jan's patch. > Hopefully after OLS I'll be able to comment more fully. > > On Fri, Jul 21, 2006 at 05:06:27PM -0700, Andrew Morton wrote: > > There are strange things happening in here. > > > > > +static inline int ext3_valid_inum(struct super_block *sb, unsigned long ino) > > > +{ > > > + return ino == EXT3_ROOT_INO || > > > + ino == EXT3_JOURNAL_INO || > > > + ino == EXT3_RESIZE_INO || > > > + (ino > EXT3_FIRST_INO(sb) && > > > + ino <= le32_to_cpu(EXT3_SB(sb)->s_es->s_inodes_count)); > > > +} > > > > One would expect the inode validity test to be > > > > (ino >= EXT3_FIRST_INO(sb)) && (ino < ...->s_inodes_count)) > > > > but even this assumes that s_inodes_count is misnamed and really should be > > called s_last_inode_plus_one. If it is not misnamed then the validity test > > should be > > > > (ino >= EXT3_FIRST_INO(sb)) && > > (ino < EXT3_FIRST_INO(sb) + ...->s_inodes_count)) > > > > Look through the filesystem for other uses of EXT3_FIRST_INO(). It's all > > rather fishily inconsistent. > > I don't think there's anything in consistent. Basically, inodes are 1 > based (inode number 0 in a directory entry means that the file is > deleted and the directory entry is freed). Hence valid inode numbers > are between 1 and s_inodes_count, inclusive. > > Inodes between 1 and EXT3_FIRST_INO-1 (inclusive) are reserved, are > should always be marked as in use in the inode allocation bitmap. > EXT3_FIRST_INO (which is 11 on all ext3 filesystems to date) is > generally the lost+found directory, but that's just because it is the > first file/directory which is allocated by mke2fs. So EXT3_FIRST_INO > representes the first inode which can be allocated by userspace. (The > root inode doesn't fall in this category because it can never be > deleted or reallocated after the filesystem is created, and as a nod > to Unix filesystem history, it has inode #2). > > The net of all of this is the inode validity test should be: > > (ino >= EXT3_FIRST_INO(sb)) && (ino <= ...->s_inodes_count)) I agree; I made that change. > However, I would suggest that we *not* allow remote NFS users to get > access to the journal inode or the resize inode, please? That's only > going to cause mischief of the DoS attack kind..... <looks at Neil> <then looks at ext2> ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: Bad ext3/nfs DoS bug 2006-07-25 1:56 ` Andrew Morton @ 2006-07-25 2:21 ` Neil Brown 2006-07-26 17:12 ` Eric Sandeen 2006-07-25 2:36 ` Neil Brown 1 sibling, 1 reply; 32+ messages in thread From: Neil Brown @ 2006-07-25 2:21 UTC (permalink / raw) To: Andrew Morton; +Cc: Theodore Tso, jack, 20, marcel, linux-kernel, sct, adilger On Monday July 24, akpm@osdl.org wrote: > On Sat, 22 Jul 2006 09:17:59 -0400 > Theodore Tso <tytso@mit.edu> wrote: > > The net of all of this is the inode validity test should be: > > > > (ino >= EXT3_FIRST_INO(sb)) && (ino <= ...->s_inodes_count)) > > I agree; I made that change. > Yeh, my bad. I double checked the second comparison but not the first :-( > > However, I would suggest that we *not* allow remote NFS users to get > > access to the journal inode or the resize inode, please? That's only > > going to cause mischief of the DoS attack kind..... > > <looks at Neil> See, I had a funny feeling that someone was watching me .... I doubt there is much room for a real problem here. To get to the point of IO on any of these files, you would need root access to the whole filesystem anyway. The follow patch should do what is suggested though. > > <then looks at ext2> Hmmm. are two looks allowed in the same email? NeilBrown ---------------------------- Make ext3 reject filehandles referring to special files. Inodes earlier than the 'first' inode (e.g. journal, resize) should be rejected early - except the root inode. Signed-off-by: Neil Brown <neilb@suse.de> ### Diffstat output ./fs/exportfs/expfs.c | 4 +++- ./fs/ext3/super.c | 15 +++++++++++++++ ./include/linux/fs.h | 2 ++ 3 files changed, 20 insertions(+), 1 deletion(-) diff .prev/fs/exportfs/expfs.c ./fs/exportfs/expfs.c --- .prev/fs/exportfs/expfs.c 2006-07-25 12:19:21.000000000 +1000 +++ ./fs/exportfs/expfs.c 2006-07-25 12:16:12.000000000 +1000 @@ -392,7 +392,8 @@ out: } -static struct dentry *export_iget(struct super_block *sb, unsigned long ino, __u32 generation) +struct dentry *export_iget(struct super_block *sb, unsigned long ino, + __u32 generation) { /* iget isn't really right if the inode is currently unallocated!! @@ -434,6 +435,7 @@ static struct dentry *export_iget(struct } return result; } +EXPORT_SYMBOL_GPL(export_iget); static struct dentry *get_object(struct super_block *sb, void *vobjp) diff .prev/fs/ext3/super.c ./fs/ext3/super.c --- .prev/fs/ext3/super.c 2006-07-25 12:19:21.000000000 +1000 +++ ./fs/ext3/super.c 2006-07-25 12:19:43.000000000 +1000 @@ -554,6 +554,20 @@ static int ext3_show_options(struct seq_ return 0; } + +static struct dentry *ext3_get_dentry(struct super_block *sb, void *vobjp) +{ + __u32 *objp = vobjp; + unsigned long ino = objp[0]; + __u32 generation = objp[1]; + + if (ino != EXT3_ROOT_INO && ino < EXT3_FIRST_INO(sb)) + return ERR_PTR(-ESTALE); + + return export_iget(sb, ino, generation); +} + + #ifdef CONFIG_QUOTA #define QTYPE2NAME(t) ((t)==USRQUOTA?"user":"group") #define QTYPE2MOPT(on, t) ((t)==USRQUOTA?((on)##USRJQUOTA):((on)##GRPJQUOTA)) @@ -622,6 +636,7 @@ static struct super_operations ext3_sops static struct export_operations ext3_export_ops = { .get_parent = ext3_get_parent, + .get_dentry = ext3_get_dentry, }; enum { diff .prev/include/linux/fs.h ./include/linux/fs.h --- .prev/include/linux/fs.h 2006-07-25 12:19:21.000000000 +1000 +++ ./include/linux/fs.h 2006-07-25 12:15:32.000000000 +1000 @@ -1381,6 +1381,8 @@ extern struct dentry * find_exported_dentry(struct super_block *sb, void *obj, void *parent, int (*acceptable)(void *context, struct dentry *de), void *context); +struct dentry *export_iget(struct super_block *sb, unsigned long ino, + __u32 generation); struct file_system_type { const char *name; ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: Bad ext3/nfs DoS bug 2006-07-25 2:21 ` Neil Brown @ 2006-07-26 17:12 ` Eric Sandeen 2006-07-26 23:53 ` Neil Brown 0 siblings, 1 reply; 32+ messages in thread From: Eric Sandeen @ 2006-07-26 17:12 UTC (permalink / raw) To: Neil Brown Cc: Andrew Morton, Theodore Tso, jack, 20, marcel, linux-kernel, sct, adilger > +EXPORT_SYMBOL_GPL(export_iget); ... > +static struct dentry *ext3_get_dentry(struct super_block *sb, void *vobjp) > +{ > + __u32 *objp = vobjp; > + unsigned long ino = objp[0]; > + __u32 generation = objp[1]; > + > + if (ino != EXT3_ROOT_INO && ino < EXT3_FIRST_INO(sb)) > + return ERR_PTR(-ESTALE); > + > + return export_iget(sb, ino, generation); > +} Hm, with this, ext3.ko has a new dependency on exportfs.ko. Is that desirable/acceptable? -Eric ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: Bad ext3/nfs DoS bug 2006-07-26 17:12 ` Eric Sandeen @ 2006-07-26 23:53 ` Neil Brown 2006-07-27 18:32 ` Eric Sandeen 0 siblings, 1 reply; 32+ messages in thread From: Neil Brown @ 2006-07-26 23:53 UTC (permalink / raw) To: Eric Sandeen Cc: Andrew Morton, Theodore Tso, jack, 20, marcel, linux-kernel, sct, adilger On Wednesday July 26, sandeen@sandeen.net wrote: > > +EXPORT_SYMBOL_GPL(export_iget); > ... > > +static struct dentry *ext3_get_dentry(struct super_block *sb, void *vobjp) > > +{ > > + __u32 *objp = vobjp; > > + unsigned long ino = objp[0]; > > + __u32 generation = objp[1]; > > + > > + if (ino != EXT3_ROOT_INO && ino < EXT3_FIRST_INO(sb)) > > + return ERR_PTR(-ESTALE); > > + > > + return export_iget(sb, ino, generation); > > +} > > Hm, with this, ext3.ko has a new dependency on exportfs.ko. Is that > desirable/acceptable? Drat, you're right. No, I don't think that is what we want. I'll do it differently in a day or so. Thanks, NeilBrown ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: Bad ext3/nfs DoS bug 2006-07-26 23:53 ` Neil Brown @ 2006-07-27 18:32 ` Eric Sandeen 2006-07-27 19:12 ` Christoph Hellwig 0 siblings, 1 reply; 32+ messages in thread From: Eric Sandeen @ 2006-07-27 18:32 UTC (permalink / raw) To: Neil Brown Cc: Andrew Morton, Theodore Tso, jack, 20, marcel, linux-kernel, sct, adilger Neil Brown wrote: > On Wednesday July 26, sandeen@sandeen.net wrote: >> Hm, with this, ext3.ko has a new dependency on exportfs.ko. Is that >> desirable/acceptable? > > Drat, you're right. > No, I don't think that is what we want. > I'll do it differently in a day or so. Would moving export_iget into fs/inode.c & exporting it from there be a reasonable way to go? At least ext2 & ext3 both have this need (prevent nfs access to special inodes) so putting the bulk of what they need for get_dentry (i.e. export_iget) somewhere common seems like a decent option to me. -Eric ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: Bad ext3/nfs DoS bug 2006-07-27 18:32 ` Eric Sandeen @ 2006-07-27 19:12 ` Christoph Hellwig 2006-07-28 0:34 ` Neil Brown 2006-07-28 13:27 ` Peter Staubach 0 siblings, 2 replies; 32+ messages in thread From: Christoph Hellwig @ 2006-07-27 19:12 UTC (permalink / raw) To: Eric Sandeen Cc: Neil Brown, Andrew Morton, Theodore Tso, jack, 20, marcel, linux-kernel, sct, adilger On Thu, Jul 27, 2006 at 01:32:43PM -0500, Eric Sandeen wrote: > Neil Brown wrote: > >On Wednesday July 26, sandeen@sandeen.net wrote: > > >>Hm, with this, ext3.ko has a new dependency on exportfs.ko. Is that > >>desirable/acceptable? > > > >Drat, you're right. > >No, I don't think that is what we want. > >I'll do it differently in a day or so. > > Would moving export_iget into fs/inode.c & exporting it from there be a > reasonable way to go? At least ext2 & ext3 both have this need (prevent > nfs access to special inodes) so putting the bulk of what they need for > get_dentry (i.e. export_iget) somewhere common seems like a decent > option to me. Nope. The right fix is to not make ext2/3 rely on export_iget at all. Please implement the proper export_operations instead, similar to e.g. xfs. export_iget is a horrible layering violation that needs to go away long-term, not move into core code. ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: Bad ext3/nfs DoS bug 2006-07-27 19:12 ` Christoph Hellwig @ 2006-07-28 0:34 ` Neil Brown 2006-07-28 13:27 ` Peter Staubach 1 sibling, 0 replies; 32+ messages in thread From: Neil Brown @ 2006-07-28 0:34 UTC (permalink / raw) To: Christoph Hellwig Cc: Eric Sandeen, Andrew Morton, Theodore Tso, jack, 20, marcel, linux-kernel, sct, adilger On Thursday July 27, hch@infradead.org wrote: > On Thu, Jul 27, 2006 at 01:32:43PM -0500, Eric Sandeen wrote: > > Neil Brown wrote: > > >I'll do it differently in a day or so. > > > > Would moving export_iget into fs/inode.c & exporting it from there be a > > reasonable way to go? At least ext2 & ext3 both have this need (prevent > > nfs access to special inodes) so putting the bulk of what they need for > > get_dentry (i.e. export_iget) somewhere common seems like a decent > > option to me. > > Nope. The right fix is to not make ext2/3 rely on export_iget at all. Please > implement the proper export_operations instead, similar to e.g. xfs. > > export_iget is a horrible layering violation that needs to go away long-term, > not move into core code. Agreed. I've just submitted revised patches. They effectively open-code export_iget in a local implemention of the get_dentry export_operation. This should give the problem with no unpleasant exports or layering issues. NeilBrown ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: Bad ext3/nfs DoS bug 2006-07-27 19:12 ` Christoph Hellwig 2006-07-28 0:34 ` Neil Brown @ 2006-07-28 13:27 ` Peter Staubach 2006-07-28 13:30 ` Christoph Hellwig 1 sibling, 1 reply; 32+ messages in thread From: Peter Staubach @ 2006-07-28 13:27 UTC (permalink / raw) To: Christoph Hellwig Cc: Eric Sandeen, Neil Brown, Andrew Morton, Theodore Tso, jack, 20, marcel, linux-kernel, sct, adilger Christoph Hellwig wrote: >On Thu, Jul 27, 2006 at 01:32:43PM -0500, Eric Sandeen wrote: > > >>Neil Brown wrote: >> >> >>>On Wednesday July 26, sandeen@sandeen.net wrote: >>> >>> >>>>Hm, with this, ext3.ko has a new dependency on exportfs.ko. Is that >>>>desirable/acceptable? >>>> >>>> >>>Drat, you're right. >>>No, I don't think that is what we want. >>>I'll do it differently in a day or so. >>> >>> >>Would moving export_iget into fs/inode.c & exporting it from there be a >>reasonable way to go? At least ext2 & ext3 both have this need (prevent >>nfs access to special inodes) so putting the bulk of what they need for >>get_dentry (i.e. export_iget) somewhere common seems like a decent >>option to me. >> >> > >Nope. The right fix is to not make ext2/3 rely on export_iget at all. Please >implement the proper export_operations instead, similar to e.g. xfs. > >export_iget is a horrible layering violation that needs to go away long-term, >not move into core code. > Since export_iget() doesn't actually involve any code which has anything to do with the NFS server exports data structures, what exactly is the objection? Is it truly better to duplicate code than to use a common routine which can be documented? Thanx... ps ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: Bad ext3/nfs DoS bug 2006-07-28 13:27 ` Peter Staubach @ 2006-07-28 13:30 ` Christoph Hellwig 0 siblings, 0 replies; 32+ messages in thread From: Christoph Hellwig @ 2006-07-28 13:30 UTC (permalink / raw) To: Peter Staubach Cc: Christoph Hellwig, Eric Sandeen, Neil Brown, Andrew Morton, Theodore Tso, jack, 20, marcel, linux-kernel, sct, adilger On Fri, Jul 28, 2006 at 09:27:01AM -0400, Peter Staubach wrote: > Since export_iget() doesn't actually involve any code which has anything to > do with the NFS server exports data structures, what exactly is the > objection? > Is it truly better to duplicate code than to use a common routine which > can be documented? export_iget calls iget() which assumes a lot about how a filesystem works. Generally no one should call iget outside of filesystem code (export_iget is the only such occurance) and should be replaced by opencoding iget_locked & co on filesystems where it helps or a simple_iget that takes a callback similar to the current read_inode method. By moving export_iget to core code you encourage people to use it, and that's the last thing we want. Btw, you folks might want to ping Al Viro, he had patches to fix various nfsd vs icache issues a while ago. > > Thanx... > > ps ---end quoted text--- ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: Bad ext3/nfs DoS bug 2006-07-25 1:56 ` Andrew Morton 2006-07-25 2:21 ` Neil Brown @ 2006-07-25 2:36 ` Neil Brown 2006-07-25 18:27 ` Eric Sandeen 1 sibling, 1 reply; 32+ messages in thread From: Neil Brown @ 2006-07-25 2:36 UTC (permalink / raw) To: Andrew Morton; +Cc: Theodore Tso, jack, 20, marcel, linux-kernel, sct, adilger On Monday July 24, akpm@osdl.org wrote: > > <then looks at ext2> Yeh.... this patch is somewhat different to the others, yet similar... I didn't bother changing ext2_get_inode at all, but just defined a get_dentry export_op so that bad inode numbers never get to ext2_get_inode via nfsd. So in this case I had to test for the upper bound as well as the lower bound. Putting it another way, ext3_get_dentry reject certain inums that are known to be a problem. ext2_get_dentry allows only those inums that could possibly be ok. So if you (anyone) prefer one approach over the other, making the change so they both fs take the same approach would be trivial. Compile tested only NeilBrown ------------------------------------ Have ext2 reject file handles with bad inode numbers early. This prevents bad inode numbers from triggering errors in ext2_get_inode. Signed-off-by: Neil Brown <neilb@suse.de> ### Diffstat output ./fs/ext2/super.c | 15 +++++++++++++++ 1 file changed, 15 insertions(+) diff .prev/fs/ext2/super.c ./fs/ext2/super.c --- .prev/fs/ext2/super.c 2006-07-25 12:29:10.000000000 +1000 +++ ./fs/ext2/super.c 2006-07-25 12:31:04.000000000 +1000 @@ -251,6 +251,20 @@ static struct super_operations ext2_sops #endif }; +static struct dentry *ext2_get_dentry(struct super_block *sb, void *vobjp) +{ + __u32 *objp = vobjp; + unsigned long ino = objp[0]; + __u32 generation = objp[1]; + + if (ino != EXT2_ROOT_INO && ino < EXT2_FIRST_INO(sb)) + return ERR_PTR(-ESTALE); + if (ino > le32_to_cpu(EXT2_SB(sb)->s_es->s_inodes_count)) + return ERR_PTR(-ESTALE); + + return export_iget(sb, ino, generation); +} + /* Yes, most of these are left as NULL!! * A NULL value implies the default, which works with ext2-like file * systems, but can be improved upon. @@ -258,6 +272,7 @@ static struct super_operations ext2_sops */ static struct export_operations ext2_export_ops = { .get_parent = ext2_get_parent, + .get_dentry = ext2_get_dentry, }; static unsigned long get_sb_block(void **data) ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: Bad ext3/nfs DoS bug 2006-07-25 2:36 ` Neil Brown @ 2006-07-25 18:27 ` Eric Sandeen 0 siblings, 0 replies; 32+ messages in thread From: Eric Sandeen @ 2006-07-25 18:27 UTC (permalink / raw) To: Neil Brown Cc: Andrew Morton, Theodore Tso, jack, 20, marcel, linux-kernel, sct, adilger Neil Brown wrote: > Putting it another way, > ext3_get_dentry reject certain inums that are known to be a problem. > ext2_get_dentry allows only those inums that could possibly be ok. > > So if you (anyone) prefer one approach over the other, making the > change so they both fs take the same approach would be trivial. I like the 2nd approach - seems simpler, takes care of everything in ->get_dentry, right?. But I think your original patch is all that will work for 2.4 kernels... -Eric ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: Bad ext3/nfs DoS bug 2006-07-20 4:46 ` Neil Brown 2006-07-20 16:06 ` Jan Kara @ 2006-07-21 0:42 ` Marcel Holtmann 2006-07-21 12:29 ` Andrew Morton 1 sibling, 1 reply; 32+ messages in thread From: Marcel Holtmann @ 2006-07-21 0:42 UTC (permalink / raw) To: Neil Brown; +Cc: Jan Kara, James, linux-kernel, akpm, sct Hi Neil, > > > So what happens next? Is the ext3 maintainer on sabatical, > > > or am I expected to submit a patch to fix this? > > I guess people are mostly busy with OLS and such so maybe they missed > > the discussion.. Giving CC to relevant people to catch their attention > > :) > > Andrew, Stephen: James has come across a nasty bug (potentially remote > > DoS). NFS extracts inode number from a filehandle and the inode number > > eventually ends in ext3_read_inode(). Now if the inode number is bogus, > > ext3_get_inode_block() calls ext3_error() and filesystem is remounted > > RO or whatever else is configured. That is quite undesirable in this > > case. > > Now the easy "fix" is to change ext3_error() to ext3_warning() but an > > attacker flooding your logs with warnings is probably not good either > > and in case the inode number comes from ext3 itself we should really do > > ext3_error() as there is some corruption in the fs. > > Better fix would be to add a flag to read_inode() saying that the inode > > number is from untrusted source (but that means changing a prototype of a > > function every fs uses) and change export_iget() to pass this flag. Yet > > another solution would be to make ext3 implement its own get_dentry() export > > function and pass the flag internally... > > What do you think is the best solution? > > I think that a good solution (hard to say if it is the best) is to > remove that error message altogether, and put it where inode numbers > are read out of directories. Something like the following patch - > compile tested only. I tested your patch and it works for me. So can someone with ext3 knowledge review and then propose it for upstream inclusion. Regards Marcel ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: Bad ext3/nfs DoS bug 2006-07-21 0:42 ` Marcel Holtmann @ 2006-07-21 12:29 ` Andrew Morton 0 siblings, 0 replies; 32+ messages in thread From: Andrew Morton @ 2006-07-21 12:29 UTC (permalink / raw) To: Marcel Holtmann; +Cc: neilb, jack, 20, linux-kernel, sct On Fri, 21 Jul 2006 02:42:13 +0200 Marcel Holtmann <marcel@holtmann.org> wrote: > I tested your patch and it works for me. So can someone with ext3 > knowledge review and then propose it for upstream inclusion. Yup, I have it save away for next week's resumption of work. ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: Bad ext3/nfs DoS bug
@ 2006-07-22 3:38 linux
0 siblings, 0 replies; 32+ messages in thread
From: linux @ 2006-07-22 3:38 UTC (permalink / raw)
To: akpm, linux-kernel
>> +static inline int ext3_valid_inum(struct super_block *sb, unsigned long ino)
>> +{
>> + return ino == EXT3_ROOT_INO ||
>> + ino == EXT3_JOURNAL_INO ||
>> + ino == EXT3_RESIZE_INO ||
>> + (ino > EXT3_FIRST_INO(sb) &&
>> + ino <= le32_to_cpu(EXT3_SB(sb)->s_es->s_inodes_count));
>> +}
>
> One would expect the inode validity test to be
>
> (ino >= EXT3_FIRST_INO(sb)) && (ino < ...->s_inodes_count))
>
> but even this assumes that s_inodes_count is misnamed and really should be
> called s_last_inode_plus_one. If it is not misnamed then the validity test
> should be
>
> (ino >= EXT3_FIRST_INO(sb)) &&
> (ino < EXT3_FIRST_INO(sb) + ...->s_inodes_count))
>
> Look through the filesystem for other uses of EXT3_FIRST_INO(). It's all
> rather fishily inconsistent.
Er... I'm not an authoritative speaker, but it seems very simple to me.
Inodes are indexed starting from 1; the index 0 is reserved, and the first
inode on disk is number 1.
Thus, potentially valid inode numbers are 1 through s_inodes_count,
inclusive. Thus the <=. If this were a standard 0-based index, it would
be <, but it's not.
Further, a range of low inode numbers are reserved for special purposes.
Only a few inode numbers in this range are valid. However, these
numbers are still assigned space in the inode tables.
The only confusing term is EXT3_FIRST_INO, which is actually
more like EXT3_RESERVED_INODES. The same 1-based indexing explains
the use of > rather than >= there.
^ permalink raw reply [flat|nested] 32+ messages in threadend of thread, other threads:[~2006-07-28 13:30 UTC | newest] Thread overview: 32+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2006-07-17 13:01 Bad ext3/nfs DoS bug James 2006-07-17 13:06 ` Marcel Holtmann 2006-07-17 18:43 ` Marcel Holtmann 2006-07-18 7:55 ` Marcel Holtmann 2006-07-18 14:56 ` James 2006-07-18 15:22 ` Marcel Holtmann 2006-07-18 15:23 ` James 2006-07-18 20:18 ` Marcel Holtmann 2006-07-19 9:28 ` James 2006-07-19 15:55 ` Jan Kara 2006-07-20 4:46 ` Neil Brown 2006-07-20 16:06 ` Jan Kara 2006-07-20 20:11 ` James 2006-07-21 6:44 ` Neil Brown 2006-07-21 6:39 ` Neil Brown 2006-07-21 14:24 ` Jan Kara 2006-07-22 0:06 ` Andrew Morton 2006-07-22 13:17 ` Theodore Tso 2006-07-25 1:56 ` Andrew Morton 2006-07-25 2:21 ` Neil Brown 2006-07-26 17:12 ` Eric Sandeen 2006-07-26 23:53 ` Neil Brown 2006-07-27 18:32 ` Eric Sandeen 2006-07-27 19:12 ` Christoph Hellwig 2006-07-28 0:34 ` Neil Brown 2006-07-28 13:27 ` Peter Staubach 2006-07-28 13:30 ` Christoph Hellwig 2006-07-25 2:36 ` Neil Brown 2006-07-25 18:27 ` Eric Sandeen 2006-07-21 0:42 ` Marcel Holtmann 2006-07-21 12:29 ` Andrew Morton -- strict thread matches above, loose matches on Subject: below -- 2006-07-22 3:38 linux
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox