* Re: [PATCH v2] BTRFS/NFSD: provide more unique inode number for btrfs export [not found] ` <163038594541.7591.11109978693705593957@noble.neil.brown.name> @ 2021-09-01 7:20 ` Christoph Hellwig 2021-09-01 15:22 ` J. Bruce Fields 2021-09-02 4:06 ` NeilBrown 0 siblings, 2 replies; 19+ messages in thread From: Christoph Hellwig @ 2021-09-01 7:20 UTC (permalink / raw) To: NeilBrown Cc: Christoph Hellwig, J. Bruce Fields, Chuck Lever, linux-nfs, Josef Bacik, linux-fsdevel On Tue, Aug 31, 2021 at 02:59:05PM +1000, NeilBrown wrote: > Making the change purely in btrfs is simply not possible. There is no > way for btrfs to provide nfsd with a different inode number. To move > the bulk of the change into btrfs code we would need - at the very least > - some way for nfsd to provide the filehandle when requesting stat > information. We would also need to provide a reference filehandle when > requesting a dentry->filehandle conversion. Cluttering the > export_operations like that just for btrfs doesn't seem like the right > balance. I agree that cluttering kstat is not ideal, but it was a case > of choosing the minimum change for the maximum effect. So you're papering over a btrfs bug by piling up cludges in the nsdd code that has not business even knowing about this btrfs bug, while leaving other users of inodes numbers and file handles broken? If you only care about file handles: this is what the export operations are for. If you care about inode numbers: well, it is up to btrfs to generate uniqueue inode numbers. It currently doesn't do that, and no amount of papering over that in nfsd is going to fix the issue. If XORing a little more entropy into the inode number is a good enough band aid (and I strongly disagree with that), do it inside btrfs for every place they report the inode number. There is nothing NFS-specific about that. ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2] BTRFS/NFSD: provide more unique inode number for btrfs export 2021-09-01 7:20 ` [PATCH v2] BTRFS/NFSD: provide more unique inode number for btrfs export Christoph Hellwig @ 2021-09-01 15:22 ` J. Bruce Fields 2021-09-02 4:14 ` NeilBrown 2021-09-02 7:11 ` Christoph Hellwig 2021-09-02 4:06 ` NeilBrown 1 sibling, 2 replies; 19+ messages in thread From: J. Bruce Fields @ 2021-09-01 15:22 UTC (permalink / raw) To: Christoph Hellwig Cc: NeilBrown, Chuck Lever, linux-nfs, Josef Bacik, linux-fsdevel On Wed, Sep 01, 2021 at 08:20:06AM +0100, Christoph Hellwig wrote: > On Tue, Aug 31, 2021 at 02:59:05PM +1000, NeilBrown wrote: > > Making the change purely in btrfs is simply not possible. There is no > > way for btrfs to provide nfsd with a different inode number. To move > > the bulk of the change into btrfs code we would need - at the very least > > - some way for nfsd to provide the filehandle when requesting stat > > information. We would also need to provide a reference filehandle when > > requesting a dentry->filehandle conversion. Cluttering the > > export_operations like that just for btrfs doesn't seem like the right > > balance. I agree that cluttering kstat is not ideal, but it was a case > > of choosing the minimum change for the maximum effect. > > So you're papering over a btrfs bug by piling up cludges in the nsdd > code that has not business even knowing about this btrfs bug, while > leaving other users of inodes numbers and file handles broken? > > If you only care about file handles: this is what the export operations > are for. If you care about inode numbers: well, it is up to btrfs > to generate uniqueue inode numbers. It currently doesn't do that, and > no amount of papering over that in nfsd is going to fix the issue. > > If XORing a little more entropy It's stronger than "a little more entropy". We know enough about how the numbers being XOR'd grow to know that collisions are only going to happen in some extreme use cases. (If I understand correctly.) > into the inode number is a good enough band aid (and I strongly > disagree with that), do it inside btrfs for every place they report > the inode number. There is nothing NFS-specific about that. Neil tried something like that: https://lore.kernel.org/linux-nfs/162761259105.21659.4838403432058511846@noble.neil.brown.name/ "The patch below, which is just a proof-of-concept, changes btrfs to report a uniform st_dev, and different (64bit) st_ino in different subvols." (Though actually you're proposing keeping separate st_dev?) I looked back through a couple threads to try to understand why we couldn't do that (on new filesystems, with a mkfs option to choose new or old behavior) and still don't understand. But the threads are long. There are objections to a new mount option (which seem obviously wrong; this should be a persistent feature of the on-disk filesystem). --b. ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2] BTRFS/NFSD: provide more unique inode number for btrfs export 2021-09-01 15:22 ` J. Bruce Fields @ 2021-09-02 4:14 ` NeilBrown 2021-09-05 16:07 ` J. Bruce Fields 2021-09-02 7:11 ` Christoph Hellwig 1 sibling, 1 reply; 19+ messages in thread From: NeilBrown @ 2021-09-02 4:14 UTC (permalink / raw) To: J. Bruce Fields Cc: Christoph Hellwig, Chuck Lever, linux-nfs, Josef Bacik, linux-fsdevel On Thu, 02 Sep 2021, J. Bruce Fields wrote: > I looked back through a couple threads to try to understand why we > couldn't do that (on new filesystems, with a mkfs option to choose new > or old behavior) and still don't understand. But the threads are long. > > There are objections to a new mount option (which seem obviously wrong; > this should be a persistent feature of the on-disk filesystem). I hadn't thought much (if at all) about a persistent filesystem feature flag. I'll try that now. There are two features of interest. One is completely unique inode numbers, the other is reporting different st_dev for different subvolumes. I think these need to be kept separate, though the second would depend on the first. They would be similar to my "inumbits" and "numdevs" mount options, though with less flexibility. I think that they would need strong semantics to be acceptable - "mostly unique" isn't really acceptable once we are changing the on-disk data. The "unique inode numbers" bit (UIN) would require that file object-ids fit in some number of bits (maybe 40) and that subvolume numbers fit in the remaining bits (24) and would then combine them together for the inode number. This could obviously be set at mkfs time. Could it be set on an unmounted filesystem? The "single-dev" flag (SD) could be toggled any time that UIN was set, and mkfs would default it on if UIN was selected. If UIN was in effect, then creating a subvol beyond the permitted max would have to fail. 24 bits is small enough that we would probably want a warning of impending doom - maybe at 23 bits? The current 48bits doesn't need that. Similarly creating an inode beyond 40bits would have to fail. This is probably more problematic and so might need more warnings. Do we want a warning each time any subvol crosses some limit? If not we would need a flag for each warning. What should a sysadmin do when they see the warning? If 40 bit an unacceptable limit of the total number of inodes in a subvol, or is it only a problem because of btrfs' practice of never reusing object-ids? Backup-and-restore would compact object-ids, but would be a big cost. Off-line reindexing would be cheaper (does anyone else remember using "renum" programs with BASIC??). Online lazy re-indexing might be possible if the inode number was maintained separately from the object-id and an atomic "switch which inode number to use" could be done at mount time. Setting UIN on an existing filesystem would require checking that only 24bit are used for subvolumes (easy) and that only 40 were usgd for objects in any individual subvolume (presumably that would require checking all subvolumes, which might take a little while, but shouldn't take more than a few minutes. Doing this would break any indexes that might be created over files, and would probably upset any active NFS mounts, and would likely have other problems. Se it would need to be a well-documented step with clear rewards. An alternative to renumbering would be to maintain file-ids and subvolume-ids which are separate from the object-id. Apparently reusing subvolume object-ids is not possible and reusing file object-ids is quite costly. If the file-id were separate from the object-id, these problems would vanish. This would require extra space in the inode (there are several reserved u64s, so that isn't a problem) and space in each directory entry (might be more of a problem). It would also require some way to keep track of used (or unused) id numbers. This avoids the cost of renumbering, by spreading it out over every creation. I suspect the average inode-creation overhead could be kept quite low, but not quite zero. I believe that some code *knows* that the root of any btrfs subvolumes has inode number 256. systemd seems to use this. I have no idea what else might depend on inode numbers in some way. I suspect that if we tried to roll out a change like this, either almost no-one would use it (if it wasn't the default), or things would start breaking (if it was). I'm not against breaking things, but we need to be sure there is a solution for fixing them, and I'm certainly not up to doing that myself. So yes - I think that using a mkfs option would open up other avenues for a solution. There would still be a lot of work to find something that continues to meet everyone's needs. The advantage of an nfsd-focusses solution is that we can have working code today with minimal down-sides. I'm certainly not prepared to go digging through btrfs code to determine how to implement a btrfs-only solution without strong buy-in from btrfs maintainers. NeilBrown ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2] BTRFS/NFSD: provide more unique inode number for btrfs export 2021-09-02 4:14 ` NeilBrown @ 2021-09-05 16:07 ` J. Bruce Fields 2021-09-06 1:29 ` NeilBrown 0 siblings, 1 reply; 19+ messages in thread From: J. Bruce Fields @ 2021-09-05 16:07 UTC (permalink / raw) To: NeilBrown Cc: Christoph Hellwig, Chuck Lever, linux-nfs, Josef Bacik, linux-fsdevel On Thu, Sep 02, 2021 at 02:14:17PM +1000, NeilBrown wrote: > On Thu, 02 Sep 2021, J. Bruce Fields wrote: > > I looked back through a couple threads to try to understand why we > > couldn't do that (on new filesystems, with a mkfs option to choose new > > or old behavior) and still don't understand. But the threads are long. > > > > There are objections to a new mount option (which seem obviously wrong; > > this should be a persistent feature of the on-disk filesystem). > > I hadn't thought much (if at all) about a persistent filesystem feature > flag. I'll try that now. > > There are two features of interest. One is completely unique inode > numbers, the other is reporting different st_dev for different > subvolumes. I think these need to be kept separate, though the second > would depend on the first. They would be similar to my "inumbits" and > "numdevs" mount options, though with less flexibility. I think that > they would need strong semantics to be acceptable - "mostly unique" > isn't really acceptable once we are changing the on-disk data. I don't quite follow that. Also the "on-disk data" here is literally just one more flag bit in some superblock field, right? > I believe that some code *knows* that the root of any btrfs subvolumes > has inode number 256. systemd seems to use this. I have no idea what > else might depend on inode numbers in some way. Looking. Ugh, yes, there's abtrfs_might_be_subvol that takes a struct stat and returns: return S_ISDIR(st->st_mode) && st->st_ino == 256; I wonder why it does that? Are there situations where all it has is a file descriptor (so it can't easily compare st_dev with the parent?) And if you NFS-export and wanted to answer the same question on the client side, I wonder what you'd do. --b. ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2] BTRFS/NFSD: provide more unique inode number for btrfs export 2021-09-05 16:07 ` J. Bruce Fields @ 2021-09-06 1:29 ` NeilBrown 2021-09-11 14:12 ` Amir Goldstein 0 siblings, 1 reply; 19+ messages in thread From: NeilBrown @ 2021-09-06 1:29 UTC (permalink / raw) To: J. Bruce Fields Cc: Christoph Hellwig, Chuck Lever, linux-nfs, Josef Bacik, linux-fsdevel On Mon, 06 Sep 2021, J. Bruce Fields wrote: > On Thu, Sep 02, 2021 at 02:14:17PM +1000, NeilBrown wrote: > > On Thu, 02 Sep 2021, J. Bruce Fields wrote: > > > I looked back through a couple threads to try to understand why we > > > couldn't do that (on new filesystems, with a mkfs option to choose new > > > or old behavior) and still don't understand. But the threads are long. > > > > > > There are objections to a new mount option (which seem obviously wrong; > > > this should be a persistent feature of the on-disk filesystem). > > > > I hadn't thought much (if at all) about a persistent filesystem feature > > flag. I'll try that now. > > > > There are two features of interest. One is completely unique inode > > numbers, the other is reporting different st_dev for different > > subvolumes. I think these need to be kept separate, though the second > > would depend on the first. They would be similar to my "inumbits" and > > "numdevs" mount options, though with less flexibility. I think that > > they would need strong semantics to be acceptable - "mostly unique" > > isn't really acceptable once we are changing the on-disk data. > > I don't quite follow that. I agree it is a bit of a leap. > > Also the "on-disk data" here is literally just one more flag bit in some > superblock field, right? Maybe. I *could* be just one bit. But even "just one bit" is, I think, more of a support commitement than adding a mount option. Mount options are fairly obvious to the user. super-blocks not as much. So "just one bit" might still be "one more question" than the supoort people need to ask when handling a problem report. When I wrote that I was thinking about how I would be comfortable with if I were a btrfs maintainer. And I don't think I'd like to spend and on-disk change and only gain a "mostly harmless" solution. Christoph's comment about possible vulnerabilities are probably part of this. I think that over NFS, concern about a user being able to synthesise an inode number conflict is probably "mountain out of mole hill" territory. However for local access, I cannot convince myself that it won't be a problem. I can imagine (incautiously written) auditing scans getting confused, and while auditing over NFS doesn't make much sense, auditing locally does. > > > I believe that some code *knows* that the root of any btrfs subvolumes > > has inode number 256. systemd seems to use this. I have no idea what > > else might depend on inode numbers in some way. > > Looking. Ugh, yes, there's abtrfs_might_be_subvol that takes a struct > stat and returns: > > return S_ISDIR(st->st_mode) && st->st_ino == 256; > > I wonder why it does that? Are there situations where all it has is a > file descriptor (so it can't easily compare st_dev with the parent?) > And if you NFS-export and wanted to answer the same question on the > client side, I wonder what you'd do. There are also a few references to BTRFS_FIRST_FREE_OBJECTID which is 256. Uses seem to include: - managing quotas, which fits with my idea that subvols are like project-quota trees. - optionally preventing "rm -r" from removing subvols - some switching to/from "readonly" which I cannot follow - some special handling of user home-directories when they are subvols These are probably reasonable and do point to subvols being a little bit like separate filesytems. These would break if we changed local inode numbers. The project-quota management and the read-only setting are not available via NFS so changing the inode number seen that way is not likely to matter as much. In any case, detecting "256" is only useful if you can also detect "is btrfs", and you cannot do that of NFS. Once upon a time ext[234] had a set of inode flags and xfs separately had a bunch of inode flags. These are now unified (to a degree) in 'struct fsattr' accessed by FS_IOC_FSGETXATTR and FS_IOC_FSSETXATTR. btrfs supports that interface, but doesn't appear to have extended it for subvol-specific things - preferring to create btrfs-specific ioctls instead. Maybe they weren't designed to be extensible enough. Maybe what we really need is for a bunch of diverse filesystem developers to get together and agree on some new common interface for subvolume management, including coming up with some sort of definition of what a subvolume "is". Until that happens (and the new interfaces are implemented and widely used) I can only see two possible solutions to the current NFS-export-of-btrfs problem: 1/ change nfsd to export a different inode number to the one btrfs uses (or maybe a different fsid, but that is problematic in other ways) 2/ change userspace to check filehandles and not assume two things are the same if their filehandles are different. Maybe I should write a patch for fts_read() and see how much glibc folk will hate it. NeilBrown ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2] BTRFS/NFSD: provide more unique inode number for btrfs export 2021-09-06 1:29 ` NeilBrown @ 2021-09-11 14:12 ` Amir Goldstein 2021-09-13 0:43 ` NeilBrown 0 siblings, 1 reply; 19+ messages in thread From: Amir Goldstein @ 2021-09-11 14:12 UTC (permalink / raw) To: NeilBrown Cc: J. Bruce Fields, Christoph Hellwig, Chuck Lever, Linux NFS Mailing List, Josef Bacik, linux-fsdevel, Theodore Tso, Jan Kara > Maybe what we really need is for a bunch of diverse filesystem > developers to get together and agree on some new common interface for > subvolume management, including coming up with some sort of definition > of what a subvolume "is". Neil, Seeing that LSF/MM is not expected to gather in the foreseen future, would you like to submit this as a topic for discussion in LPC Filesystem MC [1]? I know this is last minute, but we've just extended the CFP deadline until Sep 15 (MC is on Sep 21), so if you post a proposal, I think we will be able to fit this session in the final schedule. Granted, I don't know how many of the stakeholders plan to attend the LPC Filesystem MC, but at least Josef should be there ;) I do have one general question about the expected behavior - In his comment to the LWN article [2], Josef writes: "The st_dev thing is unfortunate, but again is the result of a lack of interfaces. Very early on we had problems with rsync wandering into snapshots and copying loads of stuff. Find as well would get tripped up. The way these tools figure out if they've wandered into another file system is if the st_dev is different..." If your plan goes through to export the main btrfs filesystem and subvolumes as a uniform st_dev namespace to the NFS client, what's to stop those old issues from remerging on NFS exported btrfs? IOW, the user experience you are trying to solve is inability of 'find' to traverse the unified btrfs namespace, but Josef's comment indicates that some users were explicitly unhappy from 'find' trying to traverse into subvolumes to begin with. So is there really a globally expected user experience? If not, then I really don't see how an nfs export option can be avoided. Thanks, Amir. [1] https://www.linuxplumbersconf.org/event/11/page/104-accepted-microconferences#cont-filesys [2] https://lwn.net/Articles/867509/ ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2] BTRFS/NFSD: provide more unique inode number for btrfs export 2021-09-11 14:12 ` Amir Goldstein @ 2021-09-13 0:43 ` NeilBrown 2021-09-13 10:04 ` Amir Goldstein 0 siblings, 1 reply; 19+ messages in thread From: NeilBrown @ 2021-09-13 0:43 UTC (permalink / raw) To: Amir Goldstein Cc: J. Bruce Fields, Christoph Hellwig, Chuck Lever, Linux NFS Mailing List, Josef Bacik, linux-fsdevel, Theodore Tso, Jan Kara On Sun, 12 Sep 2021, Amir Goldstein wrote: > > Maybe what we really need is for a bunch of diverse filesystem > > developers to get together and agree on some new common interface for > > subvolume management, including coming up with some sort of definition > > of what a subvolume "is". > > Neil, > > Seeing that LSF/MM is not expected to gather in the foreseen future, would > you like to submit this as a topic for discussion in LPC Filesystem MC [1]? > I know this is last minute, but we've just extended the CFP deadline > until Sep 15 (MC is on Sep 21), so if you post a proposal, I think we will > be able to fit this session in the final schedule. Thanks for the suggestion. Maybe that is a good idea... But I don't personally find face-to-face interactions particularly useful - though other people obviously do. I need thinking time after receiving new ideas, so I can be sure that I understand them properly. Face-to-face doesn't allow me that thinking time. So: no, I won't be proposing anything for LPC. > > Granted, I don't know how many of the stakeholders plan to attend > the LPC Filesystem MC, but at least Josef should be there ;) > > I do have one general question about the expected behavior - > In his comment to the LWN article [2], Josef writes: > > "The st_dev thing is unfortunate, but again is the result of a lack of > interfaces. > Very early on we had problems with rsync wandering into snapshots and > copying loads of stuff. Find as well would get tripped up. > The way these tools figure out if they've wandered into another file system > is if the st_dev is different..." > > If your plan goes through to export the main btrfs filesystem and > subvolumes as a uniform st_dev namespace to the NFS client, > what's to stop those old issues from remerging on NFS exported btrfs? That comment from Josef was interesting.... It doesn't align with Commit 3394e1607eaf ("Btrfs: Give each subvol and snapshot their own anonymous devid") when Chris Mason introduced the per-subvol device number with the justification that: Each subvolume has its own private inode number space, and so we need to fill in different device numbers for each subvolume to avoid confusing applications. But I understand that history can be messy and maybe there were several justifications of which Josef remembers one and Chris reported another. If rsync did, in fact, wander into subvols and didn't get put off by the duplicate inode numbers (like 'find' does), then it would still do that when accessing btrfs over NFS. This has always been the case. Chris' "fix" only affected local access, it didn't change NFS access at all. > > IOW, the user experience you are trying to solve is inability of 'find' > to traverse the unified btrfs namespace, but Josef's comment indicates > that some users were explicitly unhappy from 'find' trying to traverse > into subvolumes to begin with. I believe that even 12 years ago, find would have complained if it saw a directory with the same inode as an ancestor. Chris's fix wouldn't prevent find from entering in that case, because it wouldn't enter anyway. > > So is there really a globally expected user experience? No. Everybody wants what they want. There is some overlap, not no guarantees. That is the unavoidable consequence of ignoring standards when implementing functionality. > If not, then I really don't see how an nfs export option can be avoided. And I really don't see how an nfs export option would help... Different people within and organisation and using the same export might have different expectations. Thanks, NeilBrown > > Thanks, > Amir. > > [1] https://www.linuxplumbersconf.org/event/11/page/104-accepted-microconferences#cont-filesys > [2] https://lwn.net/Articles/867509/ > > ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2] BTRFS/NFSD: provide more unique inode number for btrfs export 2021-09-13 0:43 ` NeilBrown @ 2021-09-13 10:04 ` Amir Goldstein 2021-09-13 22:59 ` NeilBrown 0 siblings, 1 reply; 19+ messages in thread From: Amir Goldstein @ 2021-09-13 10:04 UTC (permalink / raw) To: NeilBrown Cc: J. Bruce Fields, Christoph Hellwig, Chuck Lever, Linux NFS Mailing List, Josef Bacik, linux-fsdevel, Theodore Tso, Jan Kara > > I do have one general question about the expected behavior - > > In his comment to the LWN article [2], Josef writes: > > > > "The st_dev thing is unfortunate, but again is the result of a lack of > > interfaces. > > Very early on we had problems with rsync wandering into snapshots and > > copying loads of stuff. Find as well would get tripped up. > > The way these tools figure out if they've wandered into another file system > > is if the st_dev is different..." > > > > If your plan goes through to export the main btrfs filesystem and > > subvolumes as a uniform st_dev namespace to the NFS client, > > what's to stop those old issues from remerging on NFS exported btrfs? > > That comment from Josef was interesting.... It doesn't align with > Commit 3394e1607eaf ("Btrfs: Give each subvol and snapshot their own anonymous devid") > when Chris Mason introduced the per-subvol device number with the > justification that: > Each subvolume has its own private inode number space, and so we need > to fill in different device numbers for each subvolume to avoid confusing > applications. > > But I understand that history can be messy and maybe there were several > justifications of which Josef remembers one and Chris reported > another. > I don't see a contradiction between the two reasons. Reporting two different objects with the same st_ino;st_dev is a problem and so is rsync wandering into subvolumes is another problem. Separate st_dev solves the first problem and leaves the behavior or rsync in the hands of the user (i.e. rsync --one-file-system). > If rsync did, in fact, wander into subvols and didn't get put off by the > duplicate inode numbers (like 'find' does), then it would still do that > when accessing btrfs over NFS. This has always been the case. Chris' > "fix" only affected local access, it didn't change NFS access at all. > Right, so the right fix IMO would be to provide similar semantics to the NFS client, like your first patch set tried to do. > > > > IOW, the user experience you are trying to solve is inability of 'find' > > to traverse the unified btrfs namespace, but Josef's comment indicates > > that some users were explicitly unhappy from 'find' trying to traverse > > into subvolumes to begin with. > > I believe that even 12 years ago, find would have complained if it saw a > directory with the same inode as an ancestor. Chris's fix wouldn't > prevent find from entering in that case, because it wouldn't enter > anyway. > > > > > So is there really a globally expected user experience? > > No. Everybody wants what they want. There is some overlap, not no > guarantees. That is the unavoidable consequence of ignoring standards > when implementing functionality. > > > If not, then I really don't see how an nfs export option can be avoided. > > And I really don't see how an nfs export option would help... Different > people within and organisation and using the same export might have > different expectations. That's true. But if admin decides to export a specific btrfs mount as a non-unified filesystem, then NFS clients can decide whether ot not to auto-mount the exported subvolumes and different users on the client machine can decide if they want to rsync or rsync --one-file-system, just as they would with local btrfs. And maybe I am wrong, but I don't see how the decision on whether to export a non-unified btrfs can be made a btrfs option or a nfsd global option, that's why I ended up with export option. Thanks, Amir. ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2] BTRFS/NFSD: provide more unique inode number for btrfs export 2021-09-13 10:04 ` Amir Goldstein @ 2021-09-13 22:59 ` NeilBrown 2021-09-14 5:45 ` Amir Goldstein 0 siblings, 1 reply; 19+ messages in thread From: NeilBrown @ 2021-09-13 22:59 UTC (permalink / raw) To: Amir Goldstein Cc: J. Bruce Fields, Christoph Hellwig, Chuck Lever, Linux NFS Mailing List, Josef Bacik, linux-fsdevel, Theodore Tso, Jan Kara On Mon, 13 Sep 2021, Amir Goldstein wrote: > > Right, so the right fix IMO would be to provide similar semantics > to the NFS client, like your first patch set tried to do. > Like every other approach, this sounds good and sensible ... until you examine the details. For NFSv3 (RFC1813) this would be a protocol violation. Section 3.3.3 (LOOKUP) says: A server will not allow a LOOKUP operation to cross a mountpoint to the root of a different filesystem, even if the filesystem is exported. The filesystem is represented by the fsid, so this implies that the fsid of an object reported by LOOKUP must be the same as the fsid of the directory used in the LOOKUP. Linux NFS does allow this restriction to be bypassed with the "crossmnt" export option. Maybe if crossmnt were given it would be defensible to change the fsid - if crossmnt is not given, we leave the current behaviour. Note that this is a hack and while it is extremely useful, it does not produce a seemly experience. You can get exactly the same problems with "find" - just not as uniformly (mounting with "-o noac" makes them uniform). For NFSv4, we need to provide a "mounted-on" fileid for any mountpoint. btrfs doesn't have a mounted-on fileid that can be used. We can fake something that might work reasonably well - but it would be fake. (but then ... btrfs already provided bogus information in getdents when there is a subvol root in the directory). But these are relatively minor. The bigger problem is /proc/mounts. If btrfs maintainers were willing to have every active subvolume appear in /proc/mounts, then I would be happy to fiddle the NFS fsid and allow every active NFS/btrfs subvolume to appear in /proc/mounts on the NFS client. But they aren't. So I am not. > > And I really don't see how an nfs export option would help... Different > > people within and organisation and using the same export might have > > different expectations. > > That's true. > But if admin decides to export a specific btrfs mount as a non-unified > filesystem, then NFS clients can decide whether ot not to auto-mount the > exported subvolumes and different users on the client machine can decide > if they want to rsync or rsync --one-file-system, just as they would with > local btrfs. > > And maybe I am wrong, but I don't see how the decision on whether to > export a non-unified btrfs can be made a btrfs option or a nfsd global > option, that's why I ended up with export option. Just because a btrfs option and global nfsd option are bad, that doesn't mean an export option must be good. It needs to be presented and defended on its own merits. My current opinion (and I must admit I am feeling rather jaded about the whole thing), is that while btrfs is a very interesting and valuable experiment in fs design, it contains core mistakes that cannot be incrementally fixed. It should be marked as legacy with all current behaviour declared as intentional and not subject to change. This would make way for a new "betrfs" which was designed based on all that we have learned. It would use the same code base, but present a more coherent interface. Exactly what that interface would be has yet to be decided, but we would not be bound to maintain anything just because btrfs supports it. NeilBrown ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2] BTRFS/NFSD: provide more unique inode number for btrfs export 2021-09-13 22:59 ` NeilBrown @ 2021-09-14 5:45 ` Amir Goldstein 2021-09-20 22:09 ` NeilBrown 0 siblings, 1 reply; 19+ messages in thread From: Amir Goldstein @ 2021-09-14 5:45 UTC (permalink / raw) To: NeilBrown Cc: J. Bruce Fields, Christoph Hellwig, Chuck Lever, Linux NFS Mailing List, Josef Bacik, linux-fsdevel, Theodore Tso, Jan Kara On Tue, Sep 14, 2021 at 1:59 AM NeilBrown <neilb@suse.de> wrote: > > On Mon, 13 Sep 2021, Amir Goldstein wrote: > > > > Right, so the right fix IMO would be to provide similar semantics > > to the NFS client, like your first patch set tried to do. > > > > Like every other approach, this sounds good and sensible ... until > you examine the details. > > For NFSv3 (RFC1813) this would be a protocol violation. > Section 3.3.3 (LOOKUP) says: > A server will not allow a LOOKUP operation to cross a mountpoint to > the root of a different filesystem, even if the filesystem is > exported. > > The filesystem is represented by the fsid, so this implies that the fsid > of an object reported by LOOKUP must be the same as the fsid of the > directory used in the LOOKUP. > > Linux NFS does allow this restriction to be bypassed with the "crossmnt" > export option. Maybe if crossmnt were given it would be defensible to > change the fsid - if crossmnt is not given, we leave the current > behaviour. Note that this is a hack and while it is extremely useful, > it does not produce a seemly experience. You can get exactly the same > problems with "find" - just not as uniformly (mounting with "-o noac" > makes them uniform). > I don't understand why we would need to talk about NFSv3. This btrfs export issue has been with us for a while. I see no reason to address it for old protocols if we can address it with a new protocol with better support for the concept of fsid hierarchy. > For NFSv4, we need to provide a "mounted-on" fileid for any mountpoint. > btrfs doesn't have a mounted-on fileid that can be used. We can fake > something that might work reasonably well - but it would be fake. (but > then ... btrfs already provided bogus information in getdents when there > is a subvol root in the directory). > That seems easy to solve by passing some flag to ->encode_fh() or if the behavior is persistent in btrfs by some mkfs/module/mount option then btrfs_encode_fh() will always encode the subvol root inode as resident of the parent tree-id, because nfsd anyway does not ->encode_fh() for export roots, right? > But these are relatively minor. The bigger problem is /proc/mounts. If > btrfs maintainers were willing to have every active subvolume appear in > /proc/mounts, then I would be happy to fiddle the NFS fsid and allow > every active NFS/btrfs subvolume to appear in /proc/mounts on the NFS > client. But they aren't. So I am not. > I don't understand why you need to tie the two together. I would suggest: 1. Export different fsid's per subvols to NFSv4 based on statx() exported tree-id 2. NFS client side uses user configuration to determine which subvols to auto-mount 3. [optional] Provide a way to configure btrfs using mkfs/module/mount option to behave locally the same as the NFS client, which will allow user configuration to determine with subvols to auto-mount locally I admit that my understanding of the full picture is limited, but I don't understand why #3 is a strict dependency for #1 and #2. > > > And I really don't see how an nfs export option would help... Different > > > people within and organisation and using the same export might have > > > different expectations. > > > > That's true. > > But if admin decides to export a specific btrfs mount as a non-unified > > filesystem, then NFS clients can decide whether ot not to auto-mount the > > exported subvolumes and different users on the client machine can decide > > if they want to rsync or rsync --one-file-system, just as they would with > > local btrfs. > > > > And maybe I am wrong, but I don't see how the decision on whether to > > export a non-unified btrfs can be made a btrfs option or a nfsd global > > option, that's why I ended up with export option. > > Just because a btrfs option and global nfsd option are bad, that doesn't > mean an export option must be good. It needs to be presented and > defended on its own merits. > > My current opinion (and I must admit I am feeling rather jaded about the > whole thing), is that while btrfs is a very interesting and valuable > experiment in fs design, it contains core mistakes that cannot be > incrementally fixed. It should be marked as legacy with all current > behaviour declared as intentional and not subject to change. This would > make way for a new "betrfs" which was designed based on all that we have > learned. It would use the same code base, but present a more coherent > interface. Exactly what that interface would be has yet to be decided, > but we would not be bound to maintain anything just because btrfs > supports it. > There is no need for a new driver name (like ext3=>ext4) Both ext4 and xfs have features that can be determined in mkfs time. This user experience change does not involve on-disk format changes, so it is a much easier case, because at least technically, there is nothing preventing an administrator from turning the user experience feature on and off with proper care of the consequences. Which brings me to another point. This discussion presents several technical challenges and you have been very creative in presenting technical solutions, but I think that the nature of the problem is more on the administrative side. I see this as an unfortunate flaw in our design process, when filesystem developers have long discussions about issues where some of the material stakeholders (i.e. administrators) are not in the loop. But I do not have very good ideas on how to address this flaw. Thanks, Amir. ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2] BTRFS/NFSD: provide more unique inode number for btrfs export 2021-09-14 5:45 ` Amir Goldstein @ 2021-09-20 22:09 ` NeilBrown 0 siblings, 0 replies; 19+ messages in thread From: NeilBrown @ 2021-09-20 22:09 UTC (permalink / raw) To: Amir Goldstein Cc: J. Bruce Fields, Christoph Hellwig, Chuck Lever, Linux NFS Mailing List, Josef Bacik, linux-fsdevel, Theodore Tso, Jan Kara On Tue, 14 Sep 2021, Amir Goldstein wrote: > On Tue, Sep 14, 2021 at 1:59 AM NeilBrown <neilb@suse.de> wrote: > > > > On Mon, 13 Sep 2021, Amir Goldstein wrote: > > > > > > Right, so the right fix IMO would be to provide similar semantics > > > to the NFS client, like your first patch set tried to do. > > > > > > > Like every other approach, this sounds good and sensible ... until > > you examine the details. > > > > For NFSv3 (RFC1813) this would be a protocol violation. > > Section 3.3.3 (LOOKUP) says: > > A server will not allow a LOOKUP operation to cross a mountpoint to > > the root of a different filesystem, even if the filesystem is > > exported. > > > > The filesystem is represented by the fsid, so this implies that the fsid > > of an object reported by LOOKUP must be the same as the fsid of the > > directory used in the LOOKUP. > > > > Linux NFS does allow this restriction to be bypassed with the "crossmnt" > > export option. Maybe if crossmnt were given it would be defensible to > > change the fsid - if crossmnt is not given, we leave the current > > behaviour. Note that this is a hack and while it is extremely useful, > > it does not produce a seemly experience. You can get exactly the same > > problems with "find" - just not as uniformly (mounting with "-o noac" > > makes them uniform). > > > > I don't understand why we would need to talk about NFSv3. > This btrfs export issue has been with us for a while. > I see no reason to address it for old protocols if we can address > it with a new protocol with better support for the concept of fsid hierarchy. > > > For NFSv4, we need to provide a "mounted-on" fileid for any mountpoint. > > btrfs doesn't have a mounted-on fileid that can be used. We can fake > > something that might work reasonably well - but it would be fake. (but > > then ... btrfs already provided bogus information in getdents when there > > is a subvol root in the directory). > > > > That seems easy to solve by passing some flag to ->encode_fh() > or if the behavior is persistent in btrfs by some mkfs/module/mount option > then btrfs_encode_fh() will always encode the subvol root inode as > resident of the parent tree-id, because nfsd anyway does not ->encode_fh() > for export roots, right? ->encode_fh has nothing to do with getting the mounted-on fileid. With a normal mount point, there are two inodes, one in each vfsmount. We can call ->getattr to get kstat info including the inode number. nfsd does that for the underlying vfsmnt/inode to get the mounted-on fileid. What should it do for btrfs "subvols"? > > > But these are relatively minor. The bigger problem is /proc/mounts. If > > btrfs maintainers were willing to have every active subvolume appear in > > /proc/mounts, then I would be happy to fiddle the NFS fsid and allow > > every active NFS/btrfs subvolume to appear in /proc/mounts on the NFS > > client. But they aren't. So I am not. > > > > I don't understand why you need to tie the two together. Because they are the same thing. The most concrete reason is that any name that appears in /proc/mounts is public. People understand that when they mount filesystems. People don't need to understand that when creating private subvols. There is anecdotal evidence that people might expect subvol paths to be private. If they then access those subvols via NFS, the names suddenly become public. > I would suggest: > 1. Export different fsid's per subvols to NFSv4 based on statx() > exported tree-id > 2. NFS client side uses user configuration to determine which subvols > to auto-mount That is a non-started. Subvols currently don't need mounting, they transparently appear. Requiring client-side configuration would be a major cost for some users. > 3. [optional] Provide a way to configure btrfs using mkfs/module/mount option > to behave locally the same as the NFS client, which will allow > user configuration > to determine with subvols to auto-mount locally > > I admit that my understanding of the full picture is limited, but I don't > understand why #3 is a strict dependency for #1 and #2. > > > > > And I really don't see how an nfs export option would help... Different > > > > people within and organisation and using the same export might have > > > > different expectations. > > > > > > That's true. > > > But if admin decides to export a specific btrfs mount as a non-unified > > > filesystem, then NFS clients can decide whether ot not to auto-mount the > > > exported subvolumes and different users on the client machine can decide > > > if they want to rsync or rsync --one-file-system, just as they would with > > > local btrfs. > > > > > > And maybe I am wrong, but I don't see how the decision on whether to > > > export a non-unified btrfs can be made a btrfs option or a nfsd global > > > option, that's why I ended up with export option. > > > > Just because a btrfs option and global nfsd option are bad, that doesn't > > mean an export option must be good. It needs to be presented and > > defended on its own merits. > > > > My current opinion (and I must admit I am feeling rather jaded about the > > whole thing), is that while btrfs is a very interesting and valuable > > experiment in fs design, it contains core mistakes that cannot be > > incrementally fixed. It should be marked as legacy with all current > > behaviour declared as intentional and not subject to change. This would > > make way for a new "betrfs" which was designed based on all that we have > > learned. It would use the same code base, but present a more coherent > > interface. Exactly what that interface would be has yet to be decided, > > but we would not be bound to maintain anything just because btrfs > > supports it. > > > > There is no need for a new driver name (like ext3=>ext4) > Both ext4 and xfs have features that can be determined in mkfs time. > This user experience change does not involve on-disk format changes, > so it is a much easier case, because at least technically, there is nothing > preventing an administrator from turning the user experience feature > on and off with proper care of the consequences. > > Which brings me to another point. > This discussion presents several technical challenges and you have > been very creative in presenting technical solutions, but I think that > the nature of the problem is more on the administrative side. > > I see this as an unfortunate flaw in our design process, when > filesystem developers have long discussions about issues where > some of the material stakeholders (i.e. administrators) are not in the loop. > But I do not have very good ideas on how to address this flaw. I agree this is more than a technical question. I don't see it as particularly an admin issue, because non-admin users can create subvols. I see it as a conceptual problem. What is a "subvol"? What do we want it to be. Does it make sense for the subvol namespace to align with the filesystem namespace? Subvols are more than directories, but less than filesystems. How can be best characterise them and thing about them? Are they directories with extra features, or filesystems with some limitation (and some extra features)? Or are they something completely new? What sort of identity information do applications *need* about files an filesystems and how can we best provide that within the context of existing APIs? NeilBrown ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2] BTRFS/NFSD: provide more unique inode number for btrfs export 2021-09-01 15:22 ` J. Bruce Fields 2021-09-02 4:14 ` NeilBrown @ 2021-09-02 7:11 ` Christoph Hellwig 1 sibling, 0 replies; 19+ messages in thread From: Christoph Hellwig @ 2021-09-02 7:11 UTC (permalink / raw) To: J. Bruce Fields Cc: Christoph Hellwig, NeilBrown, Chuck Lever, linux-nfs, Josef Bacik, linux-fsdevel On Wed, Sep 01, 2021 at 11:22:51AM -0400, J. Bruce Fields wrote: > It's stronger than "a little more entropy". We know enough about how > the numbers being XOR'd grow to know that collisions are only going to > happen in some extreme use cases. (If I understand correctly.) Do we know that a malicious attacker can't reproduce the collisions? Because that is the case to worry about. > > into the inode number is a good enough band aid (and I strongly > > disagree with that), do it inside btrfs for every place they report > > the inode number. There is nothing NFS-specific about that. > > Neil tried something like that: > > https://lore.kernel.org/linux-nfs/162761259105.21659.4838403432058511846@noble.neil.brown.name/ > > "The patch below, which is just a proof-of-concept, changes > btrfs to report a uniform st_dev, and different (64bit) st_ino > in different subvols." > > (Though actually you're proposing keeping separate st_dev?) No, I'm not suggestion to keep a separate st_dev in that case. So the above scheme looks like the most reasonable (or least unreasonable) of the approaches I've seen so far. I have to admit I've only noticed it now given how deep it was hidden in a thread that I only followed bit while on vacation. > I looked back through a couple threads to try to understand why we > couldn't do that (on new filesystems, with a mkfs option to choose new > or old behavior) and still don't understand. But the threads are long. > > There are objections to a new mount option (which seem obviously wrong; > this should be a persistent feature of the on-disk filesystem). Yes. Anything like this needs to be persisted. But a mount option might still be a reasonable way to set that persistent flag. ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2] BTRFS/NFSD: provide more unique inode number for btrfs export 2021-09-01 7:20 ` [PATCH v2] BTRFS/NFSD: provide more unique inode number for btrfs export Christoph Hellwig 2021-09-01 15:22 ` J. Bruce Fields @ 2021-09-02 4:06 ` NeilBrown 2021-09-02 7:16 ` Christoph Hellwig 1 sibling, 1 reply; 19+ messages in thread From: NeilBrown @ 2021-09-02 4:06 UTC (permalink / raw) To: Christoph Hellwig Cc: Christoph Hellwig, J. Bruce Fields, Chuck Lever, linux-nfs, Josef Bacik, linux-fsdevel On Wed, 01 Sep 2021, Christoph Hellwig wrote: > On Tue, Aug 31, 2021 at 02:59:05PM +1000, NeilBrown wrote: > > Making the change purely in btrfs is simply not possible. There is no > > way for btrfs to provide nfsd with a different inode number. To move > > the bulk of the change into btrfs code we would need - at the very least > > - some way for nfsd to provide the filehandle when requesting stat > > information. We would also need to provide a reference filehandle when > > requesting a dentry->filehandle conversion. Cluttering the > > export_operations like that just for btrfs doesn't seem like the right > > balance. I agree that cluttering kstat is not ideal, but it was a case > > of choosing the minimum change for the maximum effect. > > So you're papering over a btrfs bug by piling up cludges in the nsdd > code that has not business even knowing about this btrfs bug, while > leaving other users of inodes numbers and file handles broken? > > If you only care about file handles: this is what the export operations > are for. If you care about inode numbers: well, it is up to btrfs > to generate uniqueue inode numbers. It currently doesn't do that, and > no amount of papering over that in nfsd is going to fix the issue. > > If XORing a little more entropy into the inode number is a good enough > band aid (and I strongly disagree with that), do it inside btrfs for > every place they report the inode number. There is nothing NFS-specific > about that. > Hi Christoph, I have to say that I struggle with some of these conversations with you. I don't know if it is deliberate on your part, or inadvertent, or purely in my imagination, but your attitude often seems combative. I find that to be a disincentive to continuing to engage, which I need to work hard to overcome. If I'm misunderstanding you, I apologise and simply ask that you do what you can to compensate for my apparent sensitivity. Your attitude seems to be that this is a btrfs problem and must be fixed in btrfs. I agree about the source of the problem - specifically: Commit 3394e1607eaf ("Btrfs: Give each subvol and snapshot their own anonymous devid") took a wrong turn. But I don't think we can completely isolate any part of the kernel, and we need to work together to solve problems that affect us, no matter the cause. Similarly our code needs to work together. Highlighting the various problems with the proposed solution doesn't help a lot - they are fairly obvious. Proposing solutions would be much more helpful, and I have no doubt that your different experience and perspective could help me see things that I have missed. Any help that you can provide would certainly be appreciated. Thanks, NeilBrown ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2] BTRFS/NFSD: provide more unique inode number for btrfs export 2021-09-02 4:06 ` NeilBrown @ 2021-09-02 7:16 ` Christoph Hellwig 2021-09-02 7:53 ` Miklos Szeredi 0 siblings, 1 reply; 19+ messages in thread From: Christoph Hellwig @ 2021-09-02 7:16 UTC (permalink / raw) To: NeilBrown Cc: Christoph Hellwig, J. Bruce Fields, Chuck Lever, linux-nfs, Josef Bacik, linux-fsdevel On Thu, Sep 02, 2021 at 02:06:54PM +1000, NeilBrown wrote: > Hi Christoph, > I have to say that I struggle with some of these conversations with > you. > I don't know if it is deliberate on your part, or inadvertent, or > purely in my imagination, but your attitude often seems combative. I > find that to be a disincentive to continuing to engage, which I need to > work hard to overcome. If I'm misunderstanding you, I apologise and > simply ask that you do what you can to compensate for my apparent > sensitivity. I would not call it combative. It is sticking to the points and the red lines. > Your attitude seems to be that this is a btrfs problem and must be > fixed in btrfs. Yes. ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2] BTRFS/NFSD: provide more unique inode number for btrfs export 2021-09-02 7:16 ` Christoph Hellwig @ 2021-09-02 7:53 ` Miklos Szeredi 2021-09-02 14:16 ` Frank Filz 0 siblings, 1 reply; 19+ messages in thread From: Miklos Szeredi @ 2021-09-02 7:53 UTC (permalink / raw) To: Christoph Hellwig Cc: NeilBrown, J. Bruce Fields, Chuck Lever, Linux NFS list, Josef Bacik, linux-fsdevel On Thu, 2 Sept 2021 at 09:18, Christoph Hellwig <hch@infradead.org> wrote: > > Your attitude seems to be that this is a btrfs problem and must be > > fixed in btrfs. > > Yes. st_ino space issues affect overlayfs as well. The two problems are not the same, but do share some characteristics. And this limitation will likely come up again in the future. I suspect the long term solution would involve introducing new userspace API (variable length inode numbers) and deprecating st_ino. E.g. make stat return an error if the inode number doesn't fit into st_ino and add a statx extension to return the full number. But this would be a long process... Thanks, Miklos ^ permalink raw reply [flat|nested] 19+ messages in thread
* RE: [PATCH v2] BTRFS/NFSD: provide more unique inode number for btrfs export 2021-09-02 7:53 ` Miklos Szeredi @ 2021-09-02 14:16 ` Frank Filz 2021-09-02 23:02 ` NeilBrown 0 siblings, 1 reply; 19+ messages in thread From: Frank Filz @ 2021-09-02 14:16 UTC (permalink / raw) To: 'Miklos Szeredi', 'Christoph Hellwig' Cc: 'NeilBrown', 'J. Bruce Fields', 'Chuck Lever', 'Linux NFS list', 'Josef Bacik', linux-fsdevel > On Thu, 2 Sept 2021 at 09:18, Christoph Hellwig <hch@infradead.org> wrote: > > > > Your attitude seems to be that this is a btrfs problem and must be > > > fixed in btrfs. > > > > Yes. > > st_ino space issues affect overlayfs as well. The two problems are > not the same, but do share some characteristics. And this limitation will likely > come up again in the future. > > I suspect the long term solution would involve introducing new userspace API > (variable length inode numbers) and deprecating st_ino. > E.g. make stat return an error if the inode number doesn't fit into st_ino and add > a statx extension to return the full number. But this would be a long process... But then what do we do where fileid in NFS is only 64 bits? The solution of giving each subvol a separate fsid is the only real solution to enlarging the NFS fileid space, however that has downsides on the client side. Frank ^ permalink raw reply [flat|nested] 19+ messages in thread
* RE: [PATCH v2] BTRFS/NFSD: provide more unique inode number for btrfs export 2021-09-02 14:16 ` Frank Filz @ 2021-09-02 23:02 ` NeilBrown 0 siblings, 0 replies; 19+ messages in thread From: NeilBrown @ 2021-09-02 23:02 UTC (permalink / raw) To: Frank Filz Cc: 'Miklos Szeredi', 'Christoph Hellwig', 'J. Bruce Fields', 'Chuck Lever', 'Linux NFS list', 'Josef Bacik', linux-fsdevel On Fri, 03 Sep 2021, Frank Filz wrote: > > On Thu, 2 Sept 2021 at 09:18, Christoph Hellwig <hch@infradead.org> wrote: > > > > > > Your attitude seems to be that this is a btrfs problem and must be > > > > fixed in btrfs. > > > > > > Yes. > > > > st_ino space issues affect overlayfs as well. The two problems are > > not the same, but do share some characteristics. And this limitation will likely > > come up again in the future. > > > > I suspect the long term solution would involve introducing new userspace API > > (variable length inode numbers) and deprecating st_ino. > > E.g. make stat return an error if the inode number doesn't fit into st_ino and add > > a statx extension to return the full number. But this would be a long process... > > But then what do we do where fileid in NFS is only 64 bits? Hence my suggestion that user-space should move to using the filehandle. Two file handles being different doesn't guarantee that the two objects are different, but the problems caused by incorrectly assuming two things are different are much less than the problems caused by incorrectly assuming two things are the same. > > The solution of giving each subvol a separate fsid is the only real > solution to enlarging the NFS fileid space, however that has downsides > on the client side. And this doesn't help overlayfs... NeilBrown ^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH/RFC 00/11] expose btrfs subvols in mount table correctly @ 2021-07-27 22:37 NeilBrown 2021-08-13 1:45 ` [PATCH] VFS/BTRFS/NFSD: provide more unique inode number for btrfs export NeilBrown 0 siblings, 1 reply; 19+ messages in thread From: NeilBrown @ 2021-07-27 22:37 UTC (permalink / raw) To: Christoph Hellwig, Josef Bacik, J. Bruce Fields, Chuck Lever, Chris Mason, David Sterba, Alexander Viro Cc: linux-fsdevel, linux-nfs, linux-btrfs There are long-standing problems with btrfs subvols, particularly in relation to whether and how they are exposed in the mount table. - /proc/self/mountinfo reports the major:minor device number for each filesystem and when a btrfs subvol is explicitly mounted, the number reported is wrong - it does not match what stat() reports for the mountpoint. - when subvol are not explicitly mounted, they don't appear in mountinfo at all. Consequences include that a tool which uses stat() to find the dev of the filesystem, then searches mountinfo for that filesystem, will not find it. Some tools (e.g. findmnt) appear to have been enhanced to cope with this strangeness, but it would be best to make btrfs behave more normally. - nfsd cannot currently see the transition to subvol, so reports the main volume and all subvols to the client as being in the same filesystem. As inode numbers are not unique across all subvols, this can confuse clients. In particular, 'find' is likely to report a loop. subvols can be made to appear in mountinfo using automounts. However nfsd does not cope well with automounts. It assumes all filesystems to be exported are already mounted. So adding automounts to btrfs would break nfsd. We can enhance nfsd to understand that some automounts can be managed. "internal mounts" where a filesystem provides an automount point and mounts its own directories, can be handled differently by nfsd. This series addresses all these issues. After a few enhancements to the VFS to provide needed support, they enhance exportfs and nfsd to cope with the concept of internal mounts, and then enhance btrfs to provide them. The NFSv3 support is incomplete. I'm not sure we can make it work "perfectly". A normal nfsv3 mount seem to work well enough, but if mounted with '-o noac', it loses track of the mounted-on inode number and complains about inode numbers changing. My basic test for these is to mount a btrfs filesystem which contains subvols, nfs-export it and mount it with nfsv3 and nfsv4, then run 'find' in each of the filesystem and check the contents of /proc/self/mountinfo. The first patch simply fixes the dev number in mountinfo and could possibly be tagged for -stable. NeilBrown --- NeilBrown (11): VFS: show correct dev num in mountinfo VFS: allow d_automount to create in-place bind-mount. VFS: pass lookup_flags into follow_down() VFS: export lookup_mnt() VFS: new function: mount_is_internal() nfsd: include a vfsmount in struct svc_fh exportfs: Allow filehandle lookup to cross internal mount points. nfsd: change get_parent_attributes() to nfsd_get_mounted_on() nfsd: Allow filehandle lookup to cross internal mount points. btrfs: introduce mapping function from location to inum btrfs: use automount to bind-mount all subvol roots. fs/btrfs/btrfs_inode.h | 12 +++ fs/btrfs/inode.c | 111 ++++++++++++++++++++++++++- fs/btrfs/super.c | 1 + fs/exportfs/expfs.c | 100 ++++++++++++++++++++---- fs/fhandle.c | 2 +- fs/internal.h | 1 - fs/namei.c | 6 +- fs/namespace.c | 32 +++++++- fs/nfsd/export.c | 4 +- fs/nfsd/nfs3xdr.c | 40 +++++++--- fs/nfsd/nfs4proc.c | 9 ++- fs/nfsd/nfs4xdr.c | 106 ++++++++++++------------- fs/nfsd/nfsfh.c | 44 +++++++---- fs/nfsd/nfsfh.h | 3 +- fs/nfsd/nfsproc.c | 5 +- fs/nfsd/vfs.c | 162 +++++++++++++++++++++++---------------- fs/nfsd/vfs.h | 12 +-- fs/nfsd/xdr4.h | 2 +- fs/overlayfs/namei.c | 5 +- fs/xfs/xfs_ioctl.c | 12 ++- include/linux/exportfs.h | 4 +- include/linux/mount.h | 4 + include/linux/namei.h | 2 +- 23 files changed, 490 insertions(+), 189 deletions(-) -- Signature ^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH] VFS/BTRFS/NFSD: provide more unique inode number for btrfs export 2021-07-27 22:37 [PATCH/RFC 00/11] expose btrfs subvols in mount table correctly NeilBrown @ 2021-08-13 1:45 ` NeilBrown 2021-08-15 7:39 ` Goffredo Baroncelli 0 siblings, 1 reply; 19+ messages in thread From: NeilBrown @ 2021-08-13 1:45 UTC (permalink / raw) To: Christoph Hellwig, Josef Bacik, J. Bruce Fields, Chuck Lever, Chris Mason, David Sterba, Alexander Viro Cc: linux-fsdevel, linux-nfs, linux-btrfs [[This patch is a minimal patch which addresses the current problems with nfsd and btrfs, in a way which I think is most supportable, least surprising, and least likely to impact any future attempts to more completely fix the btrfs file-identify problem]] BTRFS does not provide unique inode numbers across a filesystem. It *does* provide unique inode numbers with a subvolume and uses synthetic device numbers for different subvolumes to ensure uniqueness for device+inode. nfsd cannot use these varying device numbers. If nfsd were to synthesise different stable filesystem ids to give to the client, that would cause subvolumes to appear in the mount table on the client, even though they don't appear in the mount table on the server. Also, NFSv3 doesn't support changing the filesystem id without a new explicit mount on the client (this is partially supported in practice, but violates the protocol specification). So currently, the roots of all subvolumes report the same inode number in the same filesystem to NFS clients and tools like 'find' notice that a directory has the same identity as an ancestor, and so refuse to enter that directory. This patch allows btrfs (or any filesystem) to provide a 64bit number that can be xored with the inode number to make the number more unique. Rather than the client being certain to see duplicates, with this patch it is possible but extremely rare. The number than btrfs provides is a swab64() version of the subvolume identifier. This has most entropy in the high bits (the low bits of the subvolume identifer), while the inoe has most entropy in the low bits. The result will always be unique within a subvolume, and will almost always be unique across the filesystem. Signed-off-by: NeilBrown <neilb@suse.de> --- fs/btrfs/inode.c | 4 ++++ fs/nfsd/nfs3xdr.c | 17 ++++++++++++++++- fs/nfsd/nfs4xdr.c | 9 ++++++++- fs/nfsd/xdr3.h | 2 ++ include/linux/stat.h | 17 +++++++++++++++++ 5 files changed, 47 insertions(+), 2 deletions(-) diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 0117d867ecf8..989fdf2032d5 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -9195,6 +9195,10 @@ static int btrfs_getattr(struct user_namespace *mnt_userns, generic_fillattr(&init_user_ns, inode, stat); stat->dev = BTRFS_I(inode)->root->anon_dev; + if (BTRFS_I(inode)->root->root_key.objectid != BTRFS_FS_TREE_OBJECTID) + stat->ino_uniquifier = + swab64(BTRFS_I(inode)->root->root_key.objectid); + spin_lock(&BTRFS_I(inode)->lock); delalloc_bytes = BTRFS_I(inode)->new_delalloc_bytes; inode_bytes = inode_get_bytes(inode); diff --git a/fs/nfsd/nfs3xdr.c b/fs/nfsd/nfs3xdr.c index 0a5ebc52e6a9..669e2437362a 100644 --- a/fs/nfsd/nfs3xdr.c +++ b/fs/nfsd/nfs3xdr.c @@ -340,6 +340,7 @@ svcxdr_encode_fattr3(struct svc_rqst *rqstp, struct xdr_stream *xdr, { struct user_namespace *userns = nfsd_user_namespace(rqstp); __be32 *p; + u64 ino; u64 fsid; p = xdr_reserve_space(xdr, XDR_UNIT * 21); @@ -377,7 +378,10 @@ svcxdr_encode_fattr3(struct svc_rqst *rqstp, struct xdr_stream *xdr, p = xdr_encode_hyper(p, fsid); /* fileid */ - p = xdr_encode_hyper(p, stat->ino); + ino = stat->ino; + if (stat->ino_uniquifier && stat->ino_uniquifier != ino) + ino ^= stat->ino_uniquifier; + p = xdr_encode_hyper(p, ino); p = encode_nfstime3(p, &stat->atime); p = encode_nfstime3(p, &stat->mtime); @@ -1151,6 +1155,17 @@ svcxdr_encode_entry3_common(struct nfsd3_readdirres *resp, const char *name, if (xdr_stream_encode_item_present(xdr) < 0) return false; /* fileid */ + if (!resp->dir_have_uniquifier) { + struct kstat stat; + if (fh_getattr(&resp->fh, &stat) == nfs_ok) + resp->dir_ino_uniquifier = stat.ino_uniquifier; + else + resp->dir_ino_uniquifier = 0; + resp->dir_have_uniquifier = 1; + } + if (resp->dir_ino_uniquifier && + resp->dir_ino_uniquifier != ino) + ino ^= resp->dir_ino_uniquifier; if (xdr_stream_encode_u64(xdr, ino) < 0) return false; /* name */ diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c index 7abeccb975b2..ddccf849c29c 100644 --- a/fs/nfsd/nfs4xdr.c +++ b/fs/nfsd/nfs4xdr.c @@ -3114,10 +3114,14 @@ nfsd4_encode_fattr(struct xdr_stream *xdr, struct svc_fh *fhp, fhp->fh_handle.fh_size); } if (bmval0 & FATTR4_WORD0_FILEID) { + u64 ino = stat.ino; + if (stat.ino_uniquifier && + stat.ino_uniquifier != stat.ino) + ino ^= stat.ino_uniquifier; p = xdr_reserve_space(xdr, 8); if (!p) goto out_resource; - p = xdr_encode_hyper(p, stat.ino); + p = xdr_encode_hyper(p, ino); } if (bmval0 & FATTR4_WORD0_FILES_AVAIL) { p = xdr_reserve_space(xdr, 8); @@ -3285,6 +3289,9 @@ nfsd4_encode_fattr(struct xdr_stream *xdr, struct svc_fh *fhp, if (err) goto out_nfserr; ino = parent_stat.ino; + if (parent_stat.ino_uniquifier && + parent_stat.ino_uniquifier != ino) + ino ^= parent_stat.ino_uniquifier; } p = xdr_encode_hyper(p, ino); } diff --git a/fs/nfsd/xdr3.h b/fs/nfsd/xdr3.h index 933008382bbe..b4f9f3c71f72 100644 --- a/fs/nfsd/xdr3.h +++ b/fs/nfsd/xdr3.h @@ -179,6 +179,8 @@ struct nfsd3_readdirres { struct xdr_buf dirlist; struct svc_fh scratch; struct readdir_cd common; + u64 dir_ino_uniquifier; + int dir_have_uniquifier; unsigned int cookie_offset; struct svc_rqst * rqstp; diff --git a/include/linux/stat.h b/include/linux/stat.h index fff27e603814..a5188f42ed81 100644 --- a/include/linux/stat.h +++ b/include/linux/stat.h @@ -46,6 +46,23 @@ struct kstat { struct timespec64 btime; /* File creation time */ u64 blocks; u64 mnt_id; + /* + * BTRFS does not provide unique inode numbers within a filesystem, + * depending on a synthetic 'dev' to provide uniqueness. + * NFSd cannot make use of this 'dev' number so clients often see + * duplicate inode numbers. + * For BTRFS, 'ino' is unlikely to use the high bits. It puts + * another number in ino_uniquifier which: + * - has most entropy in the high bits + * - is different precisely when 'dev' is different + * - is stable across unmount/remount + * NFSd can xor this with 'ino' to get a substantially more unique + * number for reporting to the client. + * The ino_uniquifier for a directory can reasonably be applied + * to inode numbers reported by the readdir filldir callback. + * It is NOT currently exported to user-space. + */ + u64 ino_uniquifier; }; #endif -- 2.32.0 ^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH] VFS/BTRFS/NFSD: provide more unique inode number for btrfs export 2021-08-13 1:45 ` [PATCH] VFS/BTRFS/NFSD: provide more unique inode number for btrfs export NeilBrown @ 2021-08-15 7:39 ` Goffredo Baroncelli 2021-08-15 19:35 ` Roman Mamedov 0 siblings, 1 reply; 19+ messages in thread From: Goffredo Baroncelli @ 2021-08-15 7:39 UTC (permalink / raw) To: NeilBrown, Christoph Hellwig, Josef Bacik, J. Bruce Fields, Chuck Lever, Chris Mason, David Sterba, Alexander Viro Cc: linux-fsdevel, linux-nfs, linux-btrfs Hi All, On 8/13/21 3:45 AM, NeilBrown wrote: > > [[This patch is a minimal patch which addresses the current problems > with nfsd and btrfs, in a way which I think is most supportable, least > surprising, and least likely to impact any future attempts to more > completely fix the btrfs file-identify problem]] > > BTRFS does not provide unique inode numbers across a filesystem. > It *does* provide unique inode numbers with a subvolume and > uses synthetic device numbers for different subvolumes to ensure > uniqueness for device+inode. > > nfsd cannot use these varying device numbers. If nfsd were to > synthesise different stable filesystem ids to give to the client, that > would cause subvolumes to appear in the mount table on the client, even > though they don't appear in the mount table on the server. Also, NFSv3 > doesn't support changing the filesystem id without a new explicit > mount on the client (this is partially supported in practice, but > violates the protocol specification). I am sure that it was discussed already but I was unable to find any track of this discussion. But if the problem is the collision between the inode number of different subvolume in the nfd export, is it simpler if the export is truncated to the subvolume boundary ? It would be more coherent with the current behavior of vfs+nfsd. In fact in btrfs a subvolume is a complete filesystem, with an "own synthetic" device. We could like or not this solution, but this solution is the more aligned to the unix standard, where for each filesystem there is a pair (device, inode-set). NFS (by default) avoids to cross the boundary between the filesystems. So why in BTRFS this should be different ? May be an opt-in option would solve the backward compatibility (i.e. to avoid problem with setup which relies on this behaviour). > So currently, the roots of all subvolumes report the same inode number > in the same filesystem to NFS clients and tools like 'find' notice that > a directory has the same identity as an ancestor, and so refuse to > enter that directory. > > This patch allows btrfs (or any filesystem) to provide a 64bit number > that can be xored with the inode number to make the number more unique. > Rather than the client being certain to see duplicates, with this patch > it is possible but extremely rare. > > The number than btrfs provides is a swab64() version of the subvolume > identifier. This has most entropy in the high bits (the low bits of the > subvolume identifer), while the inoe has most entropy in the low bits. > The result will always be unique within a subvolume, and will almost > always be unique across the filesystem. > > Signed-off-by: NeilBrown <neilb@suse.de> > --- > fs/btrfs/inode.c | 4 ++++ > fs/nfsd/nfs3xdr.c | 17 ++++++++++++++++- > fs/nfsd/nfs4xdr.c | 9 ++++++++- > fs/nfsd/xdr3.h | 2 ++ > include/linux/stat.h | 17 +++++++++++++++++ > 5 files changed, 47 insertions(+), 2 deletions(-) > > diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c > index 0117d867ecf8..989fdf2032d5 100644 > --- a/fs/btrfs/inode.c > +++ b/fs/btrfs/inode.c > @@ -9195,6 +9195,10 @@ static int btrfs_getattr(struct user_namespace *mnt_userns, > generic_fillattr(&init_user_ns, inode, stat); > stat->dev = BTRFS_I(inode)->root->anon_dev; > > + if (BTRFS_I(inode)->root->root_key.objectid != BTRFS_FS_TREE_OBJECTID) > + stat->ino_uniquifier = > + swab64(BTRFS_I(inode)->root->root_key.objectid); > + > spin_lock(&BTRFS_I(inode)->lock); > delalloc_bytes = BTRFS_I(inode)->new_delalloc_bytes; > inode_bytes = inode_get_bytes(inode); > diff --git a/fs/nfsd/nfs3xdr.c b/fs/nfsd/nfs3xdr.c > index 0a5ebc52e6a9..669e2437362a 100644 > --- a/fs/nfsd/nfs3xdr.c > +++ b/fs/nfsd/nfs3xdr.c > @@ -340,6 +340,7 @@ svcxdr_encode_fattr3(struct svc_rqst *rqstp, struct xdr_stream *xdr, > { > struct user_namespace *userns = nfsd_user_namespace(rqstp); > __be32 *p; > + u64 ino; > u64 fsid; > > p = xdr_reserve_space(xdr, XDR_UNIT * 21); > @@ -377,7 +378,10 @@ svcxdr_encode_fattr3(struct svc_rqst *rqstp, struct xdr_stream *xdr, > p = xdr_encode_hyper(p, fsid); > > /* fileid */ > - p = xdr_encode_hyper(p, stat->ino); > + ino = stat->ino; > + if (stat->ino_uniquifier && stat->ino_uniquifier != ino) > + ino ^= stat->ino_uniquifier; > + p = xdr_encode_hyper(p, ino); > > p = encode_nfstime3(p, &stat->atime); > p = encode_nfstime3(p, &stat->mtime); > @@ -1151,6 +1155,17 @@ svcxdr_encode_entry3_common(struct nfsd3_readdirres *resp, const char *name, > if (xdr_stream_encode_item_present(xdr) < 0) > return false; > /* fileid */ > + if (!resp->dir_have_uniquifier) { > + struct kstat stat; > + if (fh_getattr(&resp->fh, &stat) == nfs_ok) > + resp->dir_ino_uniquifier = stat.ino_uniquifier; > + else > + resp->dir_ino_uniquifier = 0; > + resp->dir_have_uniquifier = 1; > + } > + if (resp->dir_ino_uniquifier && > + resp->dir_ino_uniquifier != ino) > + ino ^= resp->dir_ino_uniquifier; > if (xdr_stream_encode_u64(xdr, ino) < 0) > return false; > /* name */ > diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c > index 7abeccb975b2..ddccf849c29c 100644 > --- a/fs/nfsd/nfs4xdr.c > +++ b/fs/nfsd/nfs4xdr.c > @@ -3114,10 +3114,14 @@ nfsd4_encode_fattr(struct xdr_stream *xdr, struct svc_fh *fhp, > fhp->fh_handle.fh_size); > } > if (bmval0 & FATTR4_WORD0_FILEID) { > + u64 ino = stat.ino; > + if (stat.ino_uniquifier && > + stat.ino_uniquifier != stat.ino) > + ino ^= stat.ino_uniquifier; > p = xdr_reserve_space(xdr, 8); > if (!p) > goto out_resource; > - p = xdr_encode_hyper(p, stat.ino); > + p = xdr_encode_hyper(p, ino); > } > if (bmval0 & FATTR4_WORD0_FILES_AVAIL) { > p = xdr_reserve_space(xdr, 8); > @@ -3285,6 +3289,9 @@ nfsd4_encode_fattr(struct xdr_stream *xdr, struct svc_fh *fhp, > if (err) > goto out_nfserr; > ino = parent_stat.ino; > + if (parent_stat.ino_uniquifier && > + parent_stat.ino_uniquifier != ino) > + ino ^= parent_stat.ino_uniquifier; > } > p = xdr_encode_hyper(p, ino); > } > diff --git a/fs/nfsd/xdr3.h b/fs/nfsd/xdr3.h > index 933008382bbe..b4f9f3c71f72 100644 > --- a/fs/nfsd/xdr3.h > +++ b/fs/nfsd/xdr3.h > @@ -179,6 +179,8 @@ struct nfsd3_readdirres { > struct xdr_buf dirlist; > struct svc_fh scratch; > struct readdir_cd common; > + u64 dir_ino_uniquifier; > + int dir_have_uniquifier; > unsigned int cookie_offset; > struct svc_rqst * rqstp; > > diff --git a/include/linux/stat.h b/include/linux/stat.h > index fff27e603814..a5188f42ed81 100644 > --- a/include/linux/stat.h > +++ b/include/linux/stat.h > @@ -46,6 +46,23 @@ struct kstat { > struct timespec64 btime; /* File creation time */ > u64 blocks; > u64 mnt_id; > + /* > + * BTRFS does not provide unique inode numbers within a filesystem, > + * depending on a synthetic 'dev' to provide uniqueness. > + * NFSd cannot make use of this 'dev' number so clients often see > + * duplicate inode numbers. > + * For BTRFS, 'ino' is unlikely to use the high bits. It puts > + * another number in ino_uniquifier which: > + * - has most entropy in the high bits > + * - is different precisely when 'dev' is different > + * - is stable across unmount/remount > + * NFSd can xor this with 'ino' to get a substantially more unique > + * number for reporting to the client. > + * The ino_uniquifier for a directory can reasonably be applied > + * to inode numbers reported by the readdir filldir callback. > + * It is NOT currently exported to user-space. > + */ > + u64 ino_uniquifier; > }; Why don't rename "ino_uniquifier" as "ino_and_subvolume" and leave to the filesystem the work to combine the inode and the subvolume-id ? I am worried that the logic is split between the filesystem, which synthesizes the ino_uniquifier, and to NFS which combine to the inode. I am thinking that this combination is filesystem specific; for BTRFS is a simple xor but for other filesystem may be a more complex operation, so leaving an half in the filesystem and another half to the NFS seems to not optimal if other filesystem needs to use ino_uniquifier. > #endif > -- gpg @keyserver.linux.it: Goffredo Baroncelli <kreijackATinwind.it> Key fingerprint BBF5 1610 0B64 DAC6 5F7D 17B2 0EDA 9B37 8B82 E0B5 ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] VFS/BTRFS/NFSD: provide more unique inode number for btrfs export 2021-08-15 7:39 ` Goffredo Baroncelli @ 2021-08-15 19:35 ` Roman Mamedov 2021-08-15 22:17 ` NeilBrown 0 siblings, 1 reply; 19+ messages in thread From: Roman Mamedov @ 2021-08-15 19:35 UTC (permalink / raw) To: Goffredo Baroncelli Cc: NeilBrown, Christoph Hellwig, Josef Bacik, J. Bruce Fields, Chuck Lever, Chris Mason, David Sterba, Alexander Viro, linux-fsdevel, linux-nfs, linux-btrfs On Sun, 15 Aug 2021 09:39:08 +0200 Goffredo Baroncelli <kreijack@libero.it> wrote: > I am sure that it was discussed already but I was unable to find any track > of this discussion. But if the problem is the collision between the inode > number of different subvolume in the nfd export, is it simpler if the export > is truncated to the subvolume boundary ? It would be more coherent with the > current behavior of vfs+nfsd. See this bugreport thread which started it all: https://www.spinics.net/lists/linux-btrfs/msg111172.html In there the reporting user replied that it is strongly not feasible for them to export each individual snapshot. > In fact in btrfs a subvolume is a complete filesystem, with an "own > synthetic" device. We could like or not this solution, but this solution is > the more aligned to the unix standard, where for each filesystem there is a > pair (device, inode-set). NFS (by default) avoids to cross the boundary > between the filesystems. So why in BTRFS this should be different ? From the user point of view subvolumes are basically directories; that they are "complete filesystems"* is merely a low-level implementation detail. * well except they are not, as you cannot 'dd' a subvolume to another blockdevice. > Why don't rename "ino_uniquifier" as "ino_and_subvolume" and leave to the > filesystem the work to combine the inode and the subvolume-id ? > > I am worried that the logic is split between the filesystem, which > synthesizes the ino_uniquifier, and to NFS which combine to the inode. I am > thinking that this combination is filesystem specific; for BTRFS is a simple > xor but for other filesystem may be a more complex operation, so leaving an > half in the filesystem and another half to the NFS seems to not optimal if > other filesystem needs to use ino_uniquifier. I wondered a bit myself, what are the downsides of just doing the uniquefication inside Btrfs, not leaving that to NFSD? I mean not even adding the extra stat field, just return the inode itself with that already applied. Surely cannot be any worse collision-wise, than different subvolumes straight up having the same inode numbers as right now? Or is it a performance concern, always doing more work, for something which only NFSD has needed so far. -- With respect, Roman ^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH] VFS/BTRFS/NFSD: provide more unique inode number for btrfs export 2021-08-15 19:35 ` Roman Mamedov @ 2021-08-15 22:17 ` NeilBrown 2021-08-23 4:05 ` [PATCH v2] BTRFS/NFSD: " NeilBrown 0 siblings, 1 reply; 19+ messages in thread From: NeilBrown @ 2021-08-15 22:17 UTC (permalink / raw) To: Roman Mamedov Cc: Goffredo Baroncelli, Christoph Hellwig, Josef Bacik, J. Bruce Fields, Chuck Lever, Chris Mason, David Sterba, Alexander Viro, linux-fsdevel, linux-nfs, linux-btrfs On Mon, 16 Aug 2021, Roman Mamedov wrote: > > I wondered a bit myself, what are the downsides of just doing the > uniquefication inside Btrfs, not leaving that to NFSD? > > I mean not even adding the extra stat field, just return the inode itself with > that already applied. Surely cannot be any worse collision-wise, than > different subvolumes straight up having the same inode numbers as right now? > > Or is it a performance concern, always doing more work, for something which > only NFSD has needed so far. Any change in behaviour will have unexpected consequences. I think the btrfs maintainers perspective is they they don't want to change behaviour if they don't have to (which is reasonable) and that currently they don't have to (which probably means that users aren't complaining loudly enough). NFS export of BTRFS is already demonstrably broken and users are complaining loudly enough that I can hear them .... though I think it has been broken like this for 10 years, do I wonder that I didn't hear them before. If something is perceived as broken, then a behaviour change that appears to fix it is more easily accepted. However, having said that I now see that my latest patch is not ideal. It changes the inode numbers associated with filehandles of objects in the non-root subvolume. This will cause the Linux NFS client to treat the object as 'stale' For most objects this is a transient annoyance. Reopen the file or restart the process and all should be well again. However if the inode number of the mount point changes, you will need to unmount and remount. That is more somewhat more of an annoyance. There are a few ways to handle this more gracefully. 1/ We could get btrfs to hand out new filehandles as well as new inode numbers, but still accept the old filehandles. Then we could make the inode number reported be based on the filehandle. This would be nearly seamless but rather clumsy to code. I'm not *very* keen on this idea, but it is worth keeping in mind. 2/ We could add a btrfs mount option to control whether the uniquifier was set or not. This would allow the sysadmin to choose when to manage any breakage. I think this is my preference, but Josef has declared an aversion to mount options. 3/ We could add a module parameter to nfsd to control whether the uniquifier is merged in. This again gives the sysadmin control, and it can be done despite any aversion from btrfs maintainers. But I'd need to overcome any aversion from the nfsd maintainers, and I don't know how strong that would be yet. (A new export option isn't really appropriate. It is much more work to add an export option than the add a mount option). I don't know.... maybe I should try harder to like option 1, or at least verify if it works as expected and see how ugly the code really is. NeilBrown ^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH v2] BTRFS/NFSD: provide more unique inode number for btrfs export 2021-08-15 22:17 ` NeilBrown @ 2021-08-23 4:05 ` NeilBrown 2021-08-23 8:17 ` kernel test robot 0 siblings, 1 reply; 19+ messages in thread From: NeilBrown @ 2021-08-23 4:05 UTC (permalink / raw) To: Chris Mason, David Sterba, Christoph Hellwig, Josef Bacik, J. Bruce Fields, Chuck Lever Cc: Roman Mamedov, Goffredo Baroncelli, Alexander Viro, linux-fsdevel, linux-nfs, linux-btrfs BTRFS does not provide unique inode numbers across a filesystem. It only provide unique inode numbers within a subvolume and uses synthetic device numbers for different subvolumes to ensure uniqueness for device+inode. nfsd cannot use these varying synthetic device numbers. If nfsd were to synthesise different stable filesystem ids to give to the client, that would cause subvolumes to appear in the mount table on the client, even though they don't appear in the mount table on the server. Also, NFSv3 doesn't support changing the filesystem id without a new explicit mount on the client (this is partially supported in practice, but violates the protocol specification and has problems in some edge cases). So currently, the roots of all subvolumes report the same inode number in the same filesystem to NFS clients and tools like 'find' notice that a directory has the same identity as an ancestor, and so refuse to enter that directory. This patch allows btrfs (or any filesystem) to provide a 64bit number that can be xored with the inode number to make the number more unique. Rather than the client being certain to see duplicates, with this patch it is possible but extremely rare. The number that btrfs provides is a swab64() version of the subvolume identifier. This has most entropy in the high bits (the low bits of the subvolume identifer), while the inode has most entropy in the low bits. The result will always be unique within a subvolume, and will almost always be unique across the filesystem. If an upgrade of the NFS server caused all inode numbers in an exportfs BTRFS filesystem to appear to the client to change, the client may not handle this well. The Linux client will cause any open files to become 'stale'. If the mount point changed inode number, the whole mount would become inaccessible. To avoid this, an unused byte in the filehandle (fh_auth) has been repurposed as "fh_options". (The use of #defines make fh_flags a problematic choice). The new behaviour of uniquifying inode number is only activated when this bit is set. NFSD will only set this bit in filehandles it reports if the filehandle of the parent (provided by the client) contains the bit, or if - the filehandle for the parent is not provided or is for a different export and - the filehandle refers to a BTRFS filesystem. Thus if you have a BTRFS filesystem originally mounted from a server without this patch, the flag will never be set and the current behaviour will continue. Only once you re-mount the filesystem (or the filesystem is re-auto-mounted) will the inode numbers change. When that happens, it is likely that the filesystem st_dev number seen on the client will change anyway. Signed-off-by: NeilBrown <neilb@suse.de> --- fs/btrfs/inode.c | 4 ++++ fs/nfsd/nfs3xdr.c | 15 ++++++++++++++- fs/nfsd/nfs4xdr.c | 7 ++++--- fs/nfsd/nfsfh.c | 13 +++++++++++-- fs/nfsd/nfsfh.h | 22 ++++++++++++++++++++++ fs/nfsd/xdr3.h | 2 ++ include/linux/stat.h | 18 ++++++++++++++++++ include/uapi/linux/nfsd/nfsfh.h | 18 ++++++++++++------ 8 files changed, 87 insertions(+), 12 deletions(-) diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 0117d867ecf8..989fdf2032d5 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -9195,6 +9195,10 @@ static int btrfs_getattr(struct user_namespace *mnt_userns, generic_fillattr(&init_user_ns, inode, stat); stat->dev = BTRFS_I(inode)->root->anon_dev; + if (BTRFS_I(inode)->root->root_key.objectid != BTRFS_FS_TREE_OBJECTID) + stat->ino_uniquifier = + swab64(BTRFS_I(inode)->root->root_key.objectid); + spin_lock(&BTRFS_I(inode)->lock); delalloc_bytes = BTRFS_I(inode)->new_delalloc_bytes; inode_bytes = inode_get_bytes(inode); diff --git a/fs/nfsd/nfs3xdr.c b/fs/nfsd/nfs3xdr.c index 0a5ebc52e6a9..19d14f11f79a 100644 --- a/fs/nfsd/nfs3xdr.c +++ b/fs/nfsd/nfs3xdr.c @@ -340,6 +340,7 @@ svcxdr_encode_fattr3(struct svc_rqst *rqstp, struct xdr_stream *xdr, { struct user_namespace *userns = nfsd_user_namespace(rqstp); __be32 *p; + u64 ino; u64 fsid; p = xdr_reserve_space(xdr, XDR_UNIT * 21); @@ -377,7 +378,8 @@ svcxdr_encode_fattr3(struct svc_rqst *rqstp, struct xdr_stream *xdr, p = xdr_encode_hyper(p, fsid); /* fileid */ - p = xdr_encode_hyper(p, stat->ino); + ino = nfsd_uniquify_ino(fhp, stat); + p = xdr_encode_hyper(p, ino); p = encode_nfstime3(p, &stat->atime); p = encode_nfstime3(p, &stat->mtime); @@ -1151,6 +1153,17 @@ svcxdr_encode_entry3_common(struct nfsd3_readdirres *resp, const char *name, if (xdr_stream_encode_item_present(xdr) < 0) return false; /* fileid */ + if (!resp->dir_have_uniquifier) { + struct kstat stat; + if (fh_getattr(&resp->fh, &stat) == nfs_ok) + resp->dir_ino_uniquifier = + nfsd_ino_uniquifier(&resp->fh, &stat); + else + resp->dir_ino_uniquifier = 0; + resp->dir_have_uniquifier = true; + } + if (resp->dir_ino_uniquifier != ino) + ino ^= resp->dir_ino_uniquifier; if (xdr_stream_encode_u64(xdr, ino) < 0) return false; /* name */ diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c index 7abeccb975b2..5ed894ceebb0 100644 --- a/fs/nfsd/nfs4xdr.c +++ b/fs/nfsd/nfs4xdr.c @@ -3114,10 +3114,11 @@ nfsd4_encode_fattr(struct xdr_stream *xdr, struct svc_fh *fhp, fhp->fh_handle.fh_size); } if (bmval0 & FATTR4_WORD0_FILEID) { + u64 ino = nfsd_uniquify_ino(fhp, &stat); p = xdr_reserve_space(xdr, 8); if (!p) goto out_resource; - p = xdr_encode_hyper(p, stat.ino); + p = xdr_encode_hyper(p, ino); } if (bmval0 & FATTR4_WORD0_FILES_AVAIL) { p = xdr_reserve_space(xdr, 8); @@ -3274,7 +3275,7 @@ nfsd4_encode_fattr(struct xdr_stream *xdr, struct svc_fh *fhp, p = xdr_reserve_space(xdr, 8); if (!p) - goto out_resource; + goto out_resource; /* * Get parent's attributes if not ignoring crossmount * and this is the root of a cross-mounted filesystem. @@ -3284,7 +3285,7 @@ nfsd4_encode_fattr(struct xdr_stream *xdr, struct svc_fh *fhp, err = get_parent_attributes(exp, &parent_stat); if (err) goto out_nfserr; - ino = parent_stat.ino; + ino = nfsd_uniquify_ino(fhp, &parent_stat); } p = xdr_encode_hyper(p, ino); } diff --git a/fs/nfsd/nfsfh.c b/fs/nfsd/nfsfh.c index c475d2271f9c..e97ed957a379 100644 --- a/fs/nfsd/nfsfh.c +++ b/fs/nfsd/nfsfh.c @@ -172,7 +172,7 @@ static __be32 nfsd_set_fh_dentry(struct svc_rqst *rqstp, struct svc_fh *fhp) if (--data_left < 0) return error; - if (fh->fh_auth_type != 0) + if ((fh->fh_options & ~NFSD_FH_OPTION_ALL) != 0) return error; len = key_len(fh->fh_fsid_type) / 4; if (len == 0) @@ -569,6 +569,7 @@ fh_compose(struct svc_fh *fhp, struct svc_export *exp, struct dentry *dentry, struct inode * inode = d_inode(dentry); dev_t ex_dev = exp_sb(exp)->s_dev; + u8 options = 0; dprintk("nfsd: fh_compose(exp %02x:%02x/%ld %pd2, ino=%ld)\n", MAJOR(ex_dev), MINOR(ex_dev), @@ -585,6 +586,14 @@ fh_compose(struct svc_fh *fhp, struct svc_export *exp, struct dentry *dentry, /* If we have a ref_fh, then copy the fh_no_wcc setting from it. */ fhp->fh_no_wcc = ref_fh ? ref_fh->fh_no_wcc : false; + if (ref_fh && ref_fh->fh_export == exp) { + options = ref_fh->fh_handle.fh_options; + } else { + /* Set options as needed */ + if (exp->ex_path.mnt->mnt_sb->s_magic == BTRFS_SUPER_MAGIC) + options |= NFSD_FH_OPTION_INO_UNIQUIFY; + } + if (ref_fh == fhp) fh_put(ref_fh); @@ -615,7 +624,7 @@ fh_compose(struct svc_fh *fhp, struct svc_export *exp, struct dentry *dentry, } else { fhp->fh_handle.fh_size = key_len(fhp->fh_handle.fh_fsid_type) + 4; - fhp->fh_handle.fh_auth_type = 0; + fhp->fh_handle.fh_options = options; mk_fsid(fhp->fh_handle.fh_fsid_type, fhp->fh_handle.fh_fsid, diff --git a/fs/nfsd/nfsfh.h b/fs/nfsd/nfsfh.h index 6106697adc04..1144a98c2951 100644 --- a/fs/nfsd/nfsfh.h +++ b/fs/nfsd/nfsfh.h @@ -84,6 +84,28 @@ enum fsid_source { }; extern enum fsid_source fsid_source(const struct svc_fh *fhp); +enum nfsd_fh_options { + NFSD_FH_OPTION_INO_UNIQUIFY = 1, /* BTRFS only */ + + NFSD_FH_OPTION_ALL = 1 +}; + +static inline u64 nfsd_ino_uniquifier(const struct svc_fh *fhp, + const struct kstat *stat) +{ + if (fhp->fh_handle.fh_options & NFSD_FH_OPTION_INO_UNIQUIFY) + return stat->ino_uniquifier; + return 0; +} + +static inline u64 nfsd_uniquify_ino(const struct svc_fh *fhp, + const struct kstat *stat) +{ + u64 u = nfsd_ino_uniquifier(fhp, stat); + if (u != stat->ino) + return stat->ino ^ u; + return stat->ino; +} /* * This might look a little large to "inline" but in all calls except diff --git a/fs/nfsd/xdr3.h b/fs/nfsd/xdr3.h index 933008382bbe..d9b6c8314bbb 100644 --- a/fs/nfsd/xdr3.h +++ b/fs/nfsd/xdr3.h @@ -179,6 +179,8 @@ struct nfsd3_readdirres { struct xdr_buf dirlist; struct svc_fh scratch; struct readdir_cd common; + u64 dir_ino_uniquifier; + bool dir_have_uniquifier; unsigned int cookie_offset; struct svc_rqst * rqstp; diff --git a/include/linux/stat.h b/include/linux/stat.h index fff27e603814..0f3f74d302f8 100644 --- a/include/linux/stat.h +++ b/include/linux/stat.h @@ -46,6 +46,24 @@ struct kstat { struct timespec64 btime; /* File creation time */ u64 blocks; u64 mnt_id; + /* + * BTRFS does not provide unique inode numbers within a filesystem, + * depending on a synthetic 'dev' to provide uniqueness. + * NFSd cannot make use of this 'dev' number so clients often see + * duplicate inode numbers. + * For BTRFS, 'ino' is unlikely to use the high bits until the filesystem + * has created a great many inodes. + * It puts another number in ino_uniquifier which: + * - has most entropy in the high bits + * - is different precisely when 'dev' is different + * - is stable across unmount/remount + * NFSd can xor this with 'ino' to get a substantially more unique + * number for reporting to the client. + * The ino_uniquifier for a directory can reasonably be applied + * to inode numbers reported by the readdir filldir callback. + * It is NOT currently exported to user-space. + */ + u64 ino_uniquifier; }; #endif diff --git a/include/uapi/linux/nfsd/nfsfh.h b/include/uapi/linux/nfsd/nfsfh.h index 427294dd56a1..59311df4b476 100644 --- a/include/uapi/linux/nfsd/nfsfh.h +++ b/include/uapi/linux/nfsd/nfsfh.h @@ -38,11 +38,17 @@ struct nfs_fhbase_old { * The file handle starts with a sequence of four-byte words. * The first word contains a version number (1) and three descriptor bytes * that tell how the remaining 3 variable length fields should be handled. - * These three bytes are auth_type, fsid_type and fileid_type. + * These three bytes are options, fsid_type and fileid_type. * * All four-byte values are in host-byte-order. * - * The auth_type field is deprecated and must be set to 0. + * The options field (previously auth_type) can be used when nfsd behaviour + * needs to change in a non-compatible way, usually for some specific + * filesystem. Options should only be set in filehandles for filesystems which + * need them. + * Current values: + * 1 - BTRFS only. Cause stat->ino_uniquifier to be used to improve inode + * number uniqueness. * * The fsid_type identifies how the filesystem (or export point) is * encoded. @@ -67,7 +73,7 @@ struct nfs_fhbase_new { union { struct { __u8 fb_version_aux; /* == 1, even => nfs_fhbase_old */ - __u8 fb_auth_type_aux; + __u8 fb_options_aux; __u8 fb_fsid_type_aux; __u8 fb_fileid_type_aux; __u32 fb_auth[1]; @@ -76,7 +82,7 @@ struct nfs_fhbase_new { }; struct { __u8 fb_version; /* == 1, even => nfs_fhbase_old */ - __u8 fb_auth_type; + __u8 fb_options; __u8 fb_fsid_type; __u8 fb_fileid_type; __u32 fb_auth_flex[]; /* flexible-array member */ @@ -106,11 +112,11 @@ struct knfsd_fh { #define fh_version fh_base.fh_new.fb_version #define fh_fsid_type fh_base.fh_new.fb_fsid_type -#define fh_auth_type fh_base.fh_new.fb_auth_type +#define fh_options fh_base.fh_new.fb_options #define fh_fileid_type fh_base.fh_new.fb_fileid_type #define fh_fsid fh_base.fh_new.fb_auth_flex /* Do not use, provided for userspace compatiblity. */ -#define fh_auth fh_base.fh_new.fb_auth +#define fh_auth fh_base.fh_new.fb_options #endif /* _UAPI_LINUX_NFSD_FH_H */ -- 2.32.0 ^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [PATCH v2] BTRFS/NFSD: provide more unique inode number for btrfs export 2021-08-23 4:05 ` [PATCH v2] BTRFS/NFSD: " NeilBrown @ 2021-08-23 8:17 ` kernel test robot 0 siblings, 0 replies; 19+ messages in thread From: kernel test robot @ 2021-08-23 8:17 UTC (permalink / raw) To: NeilBrown, Chris Mason, David Sterba, Christoph Hellwig, Josef Bacik, J. Bruce Fields, Chuck Lever Cc: clang-built-linux, kbuild-all, Roman Mamedov, Goffredo Baroncelli, Alexander Viro, linux-fsdevel [-- Attachment #1: Type: text/plain, Size: 5472 bytes --] Hi NeilBrown, Thank you for the patch! Yet something to improve: [auto build test ERROR on nfs/linux-next] [also build test ERROR on hch-configfs/for-next linus/master v5.14-rc7 next-20210820] [cannot apply to kdave/for-next] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch] url: https://github.com/0day-ci/linux/commits/NeilBrown/BTRFS-NFSD-provide-more-unique-inode-number-for-btrfs-export/20210823-120718 base: git://git.linux-nfs.org/projects/trondmy/linux-nfs.git linux-next config: hexagon-randconfig-r045-20210822 (attached as .config) compiler: clang version 14.0.0 (https://github.com/llvm/llvm-project 79b55e5038324e61a3abf4e6a9a949c473edd858) reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # https://github.com/0day-ci/linux/commit/e99ff00e4055532e35c592b50809761d82f87595 git remote add linux-review https://github.com/0day-ci/linux git fetch --no-tags linux-review NeilBrown/BTRFS-NFSD-provide-more-unique-inode-number-for-btrfs-export/20210823-120718 git checkout e99ff00e4055532e35c592b50809761d82f87595 # save the attached .config to linux build tree mkdir build_dir COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross O=build_dir ARCH=hexagon SHELL=/bin/bash If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot <lkp@intel.com> All errors (new ones prefixed by >>): >> fs/nfsd/nfsfh.c:593:44: error: use of undeclared identifier 'BTRFS_SUPER_MAGIC' if (exp->ex_path.mnt->mnt_sb->s_magic == BTRFS_SUPER_MAGIC) ^ >> fs/nfsd/nfsfh.c:593:44: error: use of undeclared identifier 'BTRFS_SUPER_MAGIC' >> fs/nfsd/nfsfh.c:593:44: error: use of undeclared identifier 'BTRFS_SUPER_MAGIC' 3 errors generated. vim +/BTRFS_SUPER_MAGIC +593 fs/nfsd/nfsfh.c 557 558 __be32 559 fh_compose(struct svc_fh *fhp, struct svc_export *exp, struct dentry *dentry, 560 struct svc_fh *ref_fh) 561 { 562 /* ref_fh is a reference file handle. 563 * if it is non-null and for the same filesystem, then we should compose 564 * a filehandle which is of the same version, where possible. 565 * Currently, that means that if ref_fh->fh_handle.fh_version == 0xca 566 * Then create a 32byte filehandle using nfs_fhbase_old 567 * 568 */ 569 570 struct inode * inode = d_inode(dentry); 571 dev_t ex_dev = exp_sb(exp)->s_dev; 572 u8 options = 0; 573 574 dprintk("nfsd: fh_compose(exp %02x:%02x/%ld %pd2, ino=%ld)\n", 575 MAJOR(ex_dev), MINOR(ex_dev), 576 (long) d_inode(exp->ex_path.dentry)->i_ino, 577 dentry, 578 (inode ? inode->i_ino : 0)); 579 580 /* Choose filehandle version and fsid type based on 581 * the reference filehandle (if it is in the same export) 582 * or the export options. 583 */ 584 set_version_and_fsid_type(fhp, exp, ref_fh); 585 586 /* If we have a ref_fh, then copy the fh_no_wcc setting from it. */ 587 fhp->fh_no_wcc = ref_fh ? ref_fh->fh_no_wcc : false; 588 589 if (ref_fh && ref_fh->fh_export == exp) { 590 options = ref_fh->fh_handle.fh_options; 591 } else { 592 /* Set options as needed */ > 593 if (exp->ex_path.mnt->mnt_sb->s_magic == BTRFS_SUPER_MAGIC) 594 options |= NFSD_FH_OPTION_INO_UNIQUIFY; 595 } 596 597 if (ref_fh == fhp) 598 fh_put(ref_fh); 599 600 if (fhp->fh_locked || fhp->fh_dentry) { 601 printk(KERN_ERR "fh_compose: fh %pd2 not initialized!\n", 602 dentry); 603 } 604 if (fhp->fh_maxsize < NFS_FHSIZE) 605 printk(KERN_ERR "fh_compose: called with maxsize %d! %pd2\n", 606 fhp->fh_maxsize, 607 dentry); 608 609 fhp->fh_dentry = dget(dentry); /* our internal copy */ 610 fhp->fh_export = exp_get(exp); 611 612 if (fhp->fh_handle.fh_version == 0xca) { 613 /* old style filehandle please */ 614 memset(&fhp->fh_handle.fh_base, 0, NFS_FHSIZE); 615 fhp->fh_handle.fh_size = NFS_FHSIZE; 616 fhp->fh_handle.ofh_dcookie = 0xfeebbaca; 617 fhp->fh_handle.ofh_dev = old_encode_dev(ex_dev); 618 fhp->fh_handle.ofh_xdev = fhp->fh_handle.ofh_dev; 619 fhp->fh_handle.ofh_xino = 620 ino_t_to_u32(d_inode(exp->ex_path.dentry)->i_ino); 621 fhp->fh_handle.ofh_dirino = ino_t_to_u32(parent_ino(dentry)); 622 if (inode) 623 _fh_update_old(dentry, exp, &fhp->fh_handle); 624 } else { 625 fhp->fh_handle.fh_size = 626 key_len(fhp->fh_handle.fh_fsid_type) + 4; 627 fhp->fh_handle.fh_options = options; 628 629 mk_fsid(fhp->fh_handle.fh_fsid_type, 630 fhp->fh_handle.fh_fsid, 631 ex_dev, 632 d_inode(exp->ex_path.dentry)->i_ino, 633 exp->ex_fsid, exp->ex_uuid); 634 635 if (inode) 636 _fh_update(fhp, exp, dentry); 637 if (fhp->fh_handle.fh_fileid_type == FILEID_INVALID) { 638 fh_put(fhp); 639 return nfserr_opnotsupp; 640 } 641 } 642 643 return 0; 644 } 645 --- 0-DAY CI Kernel Test Service, Intel Corporation https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org [-- Attachment #2: .config.gz --] [-- Type: application/gzip, Size: 29470 bytes --] ^ permalink raw reply [flat|nested] 19+ messages in thread
end of thread, other threads:[~2021-09-20 22:11 UTC | newest] Thread overview: 19+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- [not found] <162995209561.7591.4202079352301963089@noble.neil.brown.name> [not found] ` <162995778427.7591.11743795294299207756@noble.neil.brown.name> [not found] ` <YSkQ31UTVDtBavOO@infradead.org> [not found] ` <163010550851.7591.9342822614202739406@noble.neil.brown.name> [not found] ` <YSnhHl0HDOgg07U5@infradead.org> [not found] ` <163038594541.7591.11109978693705593957@noble.neil.brown.name> 2021-09-01 7:20 ` [PATCH v2] BTRFS/NFSD: provide more unique inode number for btrfs export Christoph Hellwig 2021-09-01 15:22 ` J. Bruce Fields 2021-09-02 4:14 ` NeilBrown 2021-09-05 16:07 ` J. Bruce Fields 2021-09-06 1:29 ` NeilBrown 2021-09-11 14:12 ` Amir Goldstein 2021-09-13 0:43 ` NeilBrown 2021-09-13 10:04 ` Amir Goldstein 2021-09-13 22:59 ` NeilBrown 2021-09-14 5:45 ` Amir Goldstein 2021-09-20 22:09 ` NeilBrown 2021-09-02 7:11 ` Christoph Hellwig 2021-09-02 4:06 ` NeilBrown 2021-09-02 7:16 ` Christoph Hellwig 2021-09-02 7:53 ` Miklos Szeredi 2021-09-02 14:16 ` Frank Filz 2021-09-02 23:02 ` NeilBrown 2021-07-27 22:37 [PATCH/RFC 00/11] expose btrfs subvols in mount table correctly NeilBrown 2021-08-13 1:45 ` [PATCH] VFS/BTRFS/NFSD: provide more unique inode number for btrfs export NeilBrown 2021-08-15 7:39 ` Goffredo Baroncelli 2021-08-15 19:35 ` Roman Mamedov 2021-08-15 22:17 ` NeilBrown 2021-08-23 4:05 ` [PATCH v2] BTRFS/NFSD: " NeilBrown 2021-08-23 8:17 ` kernel test robot
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).