* size limit of extended attributes
From: Björn JACKE @ 2015-04-30 11:33 UTC
To: linux-fsdevel

Hi,

currently there is a hard limit on the size of extended attributes:
XATTR_SIZE_MAX is set to 65536.

Is there a reason for that limit, or can it be removed so that only the
filesystem remains as the limiting factor here?

In Samba we need to be able to store data in extended attributes which can be
much bigger than 64k. AIX, for example, supports EAs of 16TB size in JFS2;
Solaris is able to handle streams like files and also doesn't have special size
limitations for EAs. Can we get rid of the tiny size limit in Linux also?

Björn
-- 
SerNet GmbH, Bahnhofsallee 1b, 37081 Göttingen
☎ +49-551-370000-0, ℻ +49-551-370000-9
AG Göttingen, HRB 2816, GF: Dr. Johannes Loxen
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
* Re: size limit of extended attributes
From: Trond Myklebust @ 2015-04-30 12:38 UTC
To: Björn JACKE; +Cc: Linux FS-devel Mailing List

On Thu, Apr 30, 2015 at 7:33 AM, Björn JACKE <bj@sernet.de> wrote:
>
> Hi,
>
> currently there is the hard limit for the size of extended attributes, of
> XATTR_SIZE_MAX set to 65536.
>
> Is there a reason for that limit or can this limit be removed, so that only the
> filesystem stays as a limiting factor here?
>
> In Samba we need to be able to store data in extended attributes which can be
> much bigger than 64k. AIX for example supports EAs of 16TB size in JFS2,
> Solaris is able to handle streams like files and also doesn't have special size
> limitations for EAs. Can we get rid of the tiny size limit in Linux also?

It's going to be a real treat watching Samba write a 16TB memory
buffer using the xattr API.

Why can't you work around this by using the xattr as a symlink-like
object that points to the real stored data? That way you also have
available the file semantics that you need in order to write these
large objects.

Trying to squeeze an API for subfiles into the xattr API is a dead end.

Cheers
  Trond
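The "symlink-like" workaround suggested above could look roughly like the following. This is only a minimal sketch, not anything Samba or the kernel implements: the attribute name `user.stream_pointer` and all helper names are hypothetical, and `os.setxattr`/`os.getxattr` are only available in Python on Linux. The xattr stores nothing but the path of an ordinary file that holds the large data, so the data itself is read and written with normal file semantics.

```python
import os

# Hypothetical attribute name, chosen for this illustration only.
POINTER_ATTR = "user.stream_pointer"
XATTR_SIZE_MAX = 65536  # kernel limit on a single xattr value


def encode_pointer(target_path: str) -> bytes:
    """Encode the path of the real data file as a small xattr value."""
    value = os.fsencode(target_path)
    if len(value) > XATTR_SIZE_MAX:
        raise ValueError("pointer itself exceeds the xattr size limit")
    return value


def decode_pointer(value: bytes) -> str:
    """Turn a stored xattr value back into a path."""
    return os.fsdecode(value)


def attach_stream(path: str, target_path: str) -> None:
    """Record on `path` where its large data lives (Linux only)."""
    os.setxattr(path, POINTER_ATTR, encode_pointer(target_path))


def open_stream(path: str, mode: str = "rb"):
    """Follow the pointer and open the data file with normal file I/O."""
    target = decode_pointer(os.getxattr(path, POINTER_ATTR))
    return open(target, mode)
```

Note that the data file lives separately from the file that points to it, so a plain `mv` or `cp` of the original does not carry it along; that is the drawback raised in the reply that follows.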
* Re: size limit of extended attributes
From: Björn JACKE @ 2015-04-30 13:57 UTC
To: Trond Myklebust; +Cc: Linux FS-devel Mailing List

On 2015-04-30 at 08:38 -0400 Trond Myklebust sent off:
> It's going to be a real treat watching Samba write a 16TB memory
> buffer using the xattr API.

you might have fun watching that but this is not really relevant for my
question :-)

> Why can't you work around this by using the xattr as a symlink-like
> object that points to the real stored data? That way you also have
> available the file semantics that you need in order to write these
> large objects.

because that wouldn't be connected to the file any more. It would also have to
be outside the filesystem tree. Moving files on the command line without Samba
would become impossible also. We don't want to implement crude hacks like this,
this is why I ask for removing the size limit in the kernel instead. To get a
proposal for a crude hack for Samba, there would be no need to talk about it on
fsdevel.

> Trying to squeeze an API for subfiles into the xattr API is a dead end.

depends. Works quite well on Solaris. But I didn't ask for that API anyway.

The actual question, which you did not touch at all was:

Can we get rid of the 64k size limit for EAs? The API on AIX is the same as on
Linux. But there is a huge size limit - which would help us already a lot.

Björn
* Re: size limit of extended attributes
From: Theodore Ts'o @ 2015-04-30 14:35 UTC
To: Björn JACKE; +Cc: Trond Myklebust, Linux FS-devel Mailing List

On Thu, Apr 30, 2015 at 03:57:37PM +0200, Björn JACKE wrote:
> > Trying to squeeze an API for subfiles into the xattr API is a dead end.
>
> depends. Works quite well on Solaris. But I didn't ask for that API anyway.

Yes, it works really well for hiding rootkits that can't be found
using the usual tools. Trust me, we don't want to go there. Even
Windows has more or less abandoned that misbegotten idea of alternate
file streams, which get dropped when you try to download a file using
http, etc., etc., etc.

					- Ted
* Re: size limit of extended attributes
From: Boaz Harrosh @ 2015-04-30 14:46 UTC
To: Björn JACKE, Trond Myklebust; +Cc: Linux FS-devel Mailing List

On 04/30/2015 04:57 PM, Björn JACKE wrote:
>
> Can we get rid of the 64k size limit for EAs? The API on AIX is the same as on
> Linux. But there is a huge size limit - which would help us already a lot.
>

If memory serves me correctly the xattr is always written in one shot from
offset 0..xattr_size. At least it is so on the VFS API side.

That said, I think there are also two mem-copies on the way to the FS from the
user buffer.

How will you reliably write something bigger than CONST-X in one call? On
all arches. Same goes for read.

> Björn
>

The hacks need not be so ugly and they can be well documented and published
as a public STD.

	./foo-with-streams                    (With xattrs info)
	./.foo-with-streams.__STREAMES__/     (backpointer xattrs-info)
	./.foo-with-streams.__STREAMES__/SA
	./.foo-with-streams.__STREAMES__/SB
	...

And some special mod bits on the .foo-with-streams.__STREAMES__ directory.

The smb read-dir parser removes those hidden directories.

Cheers
Boaz
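The naming scheme proposed above can be sketched in a few lines. This is an illustrative sketch only: the function names are hypothetical, the `__STREAMES__` spelling is kept exactly as written in the proposal, and the "special mod bits" and backpointer xattrs are left out.

```python
import os

# Suffix spelled exactly as in the proposal above.
STREAM_SUFFIX = ".__STREAMES__"


def stream_dir(path: str) -> str:
    """Hidden sibling directory that would hold the streams of `path`."""
    parent, name = os.path.split(path)
    return os.path.join(parent, "." + name + STREAM_SUFFIX)


def stream_path(path: str, stream_name: str) -> str:
    """Path of one named stream (for example "SA") belonging to `path`."""
    return os.path.join(stream_dir(path), stream_name)


def visible_entries(names):
    """Filter a directory listing the way the SMB readdir parser would:
    drop the hidden stream directories, keep everything else."""
    return [
        n for n in names
        if not (n.startswith(".") and n.endswith(STREAM_SUFFIX))
    ]
```

For example, `stream_path("./foo-with-streams", "SA")` yields `./.foo-with-streams.__STREAMES__/SA`, matching the layout in the message.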
* Re: size limit of extended attributes
From: Björn JACKE @ 2015-04-30 16:06 UTC
To: Boaz Harrosh; +Cc: Trond Myklebust, Linux FS-devel Mailing List

On 2015-04-30 at 17:46 +0300 Boaz Harrosh sent off:
> The hacks need not be so ugly and they can be well documented and published
> as a public STD.
> 	./foo-with-streams                    (With xattrs info)
> 	./.foo-with-streams.__STREAMES__/     (backpointer xattrs-info)
> 	./.foo-with-streams.__STREAMES__/SA
> 	./.foo-with-streams.__STREAMES__/SB
> 	...
> And some special mod bits on the .foo-with-streams.__STREAMES__ directory
>
> The smb read-dir parser removes those hidden directories

any hack like this would mean that for long file names which are close to
NAME_MAX this will not work. Yes, such long file names are being used. Apart
from that, the meta data is detached from the file, which also makes this
workaround quite sub-optimal. Still the only real solution I see would be
bigger EA sizes. I was hoping that this would not be a big challenge for the
Linux kernel.

Björn
* Re: size limit of extended attributes
From: Boaz Harrosh @ 2015-04-30 16:56 UTC
To: Björn JACKE; +Cc: Trond Myklebust, Linux FS-devel Mailing List

On 04/30/2015 07:06 PM, Björn JACKE wrote:
> On 2015-04-30 at 17:46 +0300 Boaz Harrosh sent off:
>> The hacks need not be so ugly and they can be well documented and published
>> as a public STD.
>> 	./foo-with-streams                    (With xattrs info)
>> 	./.foo-with-streams.__STREAMES__/     (backpointer xattrs-info)
>> 	./.foo-with-streams.__STREAMES__/SA
>> 	./.foo-with-streams.__STREAMES__/SB
>> 	...
>> And some special mod bits on the .foo-with-streams.__STREAMES__ directory
>>
>> The smb read-dir parser removes those hidden directories
>
> any hack like this would mean that for long file names which are close to
> NAME_MAX this will not work. Yes, such long file names are being used.
>

Solvable: use the infamous ....xxxx~1 encoding solution for the dirs.

> Apart from the
> fact that the meta data is detached from the file, which also makes this
> workaround quite sub-optimal. Still the only real solution I see would be
> bigger EA sizes. I was hoping that this would be not a big challenge for the
> Linux kernel.
>

Again you are ignoring my point. If the FS would like to (easily) keep these
xattrs for you, you have a POSIX API problem. You will need an alternate API
to be able to read/write these big xattrs in chunks.

It might be possible to make that 64K, say, 2M, but you need a CONST-MAX size.
With the current API, is there a number that will satisfy you?

> Björn
>

Cheers
Boaz
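The NAME_MAX objection and the "~1" answer above can be made concrete with a small sketch. The message only names the "~1 encoding"; the exact truncation rule below (cut the base name to fit, append `~N`, count up on collisions) is an assumption made for illustration, as are the function name and the `taken` parameter. NAME_MAX is 255 bytes on Linux.

```python
NAME_MAX = 255                   # bytes per path component on Linux
STREAM_SUFFIX = ".__STREAMES__"  # spelling as in the proposal


def stream_dir_name(name: str, taken=()) -> str:
    """Name of the hidden stream directory for the file `name`.

    If "." + name + suffix fits in NAME_MAX bytes it is used as is.
    Otherwise the base name is truncated and a "~N" tag is appended,
    with N counting up until the result is not already in `taken`.
    """
    candidate = "." + name + STREAM_SUFFIX
    if len(candidate.encode()) <= NAME_MAX:
        return candidate

    n = 1
    while True:
        tag = f"~{n}"
        # Bytes left for the base name after the dot, the tag and the suffix.
        budget = NAME_MAX - 1 - len(STREAM_SUFFIX) - len(tag)
        # Truncate on bytes; drop a multi-byte character cut in half.
        base = name.encode()[:budget].decode(errors="ignore")
        candidate = "." + base + tag + STREAM_SUFFIX
        if candidate not in taken:
            return candidate
        n += 1
```

Once a name has been shortened, the directory name alone no longer identifies its owner, so a lookup would have to rely on something like the backpointer xattrs mentioned in the proposal. The sketch does not address the objection that the metadata is detached from the file.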
* Re: size limit of extended attributes
From: Björn JACKE @ 2015-05-05 13:34 UTC
To: Boaz Harrosh; +Cc: Trond Myklebust, Linux FS-devel Mailing List

On 2015-04-30 at 19:56 +0300 Boaz Harrosh sent off:
> Solvable use the infamous ....xxxx~1 encoding solution for the dirs

I mentioned already that this is not a good solution in any way. Apart from the
fact that clients would then not be allowed to create those file names, it
decouples the meta-data from the files, and that asks for obvious
interoperability troubles. No need to discuss any external data storage here,
really.

> > Apart from the
> > fact that the meta data is detached from the file, which also makes this
> > workaround quite sub-optimal. Still the only real solution I see would be
> > bigger EA sizes. I was hoping that this would be not a big challenge for the
> > Linux kernel.
> >
>
> Again you are ignoring my point. If the FS would like to (easily) keep these
> xattrs for you, you have a POSIX API problem. You will need an alternate API
> to be able to read/write these big xattrs in chunks.
>
> It might be possible to make that 64K say 2M but you need a CONST-MAX size.
> with current API, is there a number that will satisfy you?

the most prominent consumers of that data are OS X clients. 64MB might be a
good number for most clients, even those that make quite heavy use of EAs. It
would be a start if the hard-coded kernel EA size limit vanished. Once that is
done, we might start looking at extending the current xattr API as needed, or
maybe even think of coming up with an alternative API to access EAs.

Björn
* Re: size limit of extended attributes
From: Dave Chinner @ 2015-05-01 15:43 UTC
To: Björn JACKE; +Cc: Trond Myklebust, Linux FS-devel Mailing List

On Thu, Apr 30, 2015 at 03:57:37PM +0200, Björn JACKE wrote:
> Can we get rid of the 64k size limit for EAs? The API on AIX is the same as on
> Linux. But there is a huge size limit - which would help us already a lot.

No - the maximum xattr size of 64k is encoded into the on-disk
format of many filesystems and that's not a simple thing to change.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
* Re: size limit of extended attributes
From: Björn JACKE @ 2015-05-05 13:38 UTC
To: Dave Chinner; +Cc: Trond Myklebust, Linux FS-devel Mailing List

On 2015-05-02 at 01:43 +1000 Dave Chinner sent off:
> On Thu, Apr 30, 2015 at 03:57:37PM +0200, Björn JACKE wrote:
> > Can we get rid of the 64k size limit for EAs? The API on AIX is the same as on
> > Linux. But there is a huge size limit - which would help us already a lot.
>
> No - the maximum xattr size of 64k is encoded into the on-disk
> format of many filesystems and that's not a simple thing to change.

I know ext4 even has a much lower limit. But some filesystems don't have a
limit, and there the kernel 64k limit kicks in. The EA size limit of some
file systems could also be increased. As there is a real use case for that, I
guess filesystems would like to be able to support larger EA sizes.

Björn
* Re: size limit of extended attributes
From: Jan Kara @ 2015-05-05 15:43 UTC
To: Björn JACKE; +Cc: Dave Chinner, Trond Myklebust, Linux FS-devel Mailing List

On Tue 05-05-15 15:38:11, Björn JACKE wrote:
> On 2015-05-02 at 01:43 +1000 Dave Chinner sent off:
> > On Thu, Apr 30, 2015 at 03:57:37PM +0200, Björn JACKE wrote:
> > > Can we get rid of the 64k size limit for EAs? The API on AIX is the same as on
> > > Linux. But there is a huge size limit - which would help us already a lot.
> >
> > No - the maximum xattr size of 64k is encoded into the on-disk
> > format of many filesystems and that's not a simple thing to change.
>
> I know ext4 even has a much lower limit. But some filesystems don't have a
> limit, and there the kernel 64k limit strikes in. The EA size limit of some
> file systems could also be increased. As there is a real use case for that I
> guess filesystems would like to be able to support larger EA sizes.

Yeah, so XFS could support more than 64K in principle, if I read it correctly.
Supporting it for ext4 would mean an on-disk format change - doable but
requires some non-trivial effort.

Regarding the API, the issue with larger xattr sizes is that currently we
copy the whole xattr into kernel memory and process it in one go. Currently
that's OK, but if you want xattrs that are megabytes in size, it may become an
effective way to DoS a system. So to support that we'd need to change at
least the API from the VFS into filesystems so that xattrs could be processed
in smaller chunks. Again doable but quite some work.

All in all I don't think this is going to happen unless someone interested
in this invests a significant amount of time to make it happen.

								Honza
-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR
* Re: size limit of extended attributes
From: Dave Chinner @ 2015-05-06 1:31 UTC
To: Jan Kara; +Cc: Björn JACKE, Trond Myklebust, Linux FS-devel Mailing List

On Tue, May 05, 2015 at 05:43:48PM +0200, Jan Kara wrote:
> On Tue 05-05-15 15:38:11, Björn JACKE wrote:
> > On 2015-05-02 at 01:43 +1000 Dave Chinner sent off:
> > > On Thu, Apr 30, 2015 at 03:57:37PM +0200, Björn JACKE wrote:
> > > > Can we get rid of the 64k size limit for EAs? The API on AIX is the same as on
> > > > Linux. But there is a huge size limit - which would help us already a lot.
> > >
> > > No - the maximum xattr size of 64k is encoded into the on-disk
> > > format of many filesystems and that's not a simple thing to change.
> >
> > I know ext4 even has a much lower limit. But some filesystems don't have a
> > limit, and there the kernel 64k limit strikes in. The EA size limit of some
> > file systems could also be increased. As there is a real use case for that I
> > guess filesystems would like to be able to support larger EA sizes.
>
> Yeah, so XFS could support more than 64K in principle if I read it correctly.
> Supporting it for ext4 would mean ondisk format change - doable but
> requires some non-trivial effort.

In theory, the on-disk attribute format for XFS can support 2^32
bytes for a remote attribute, but the attribute btree itself can't
really support arbitrary-length xattrs in its mappings with any
sort of performance or flexibility.

And I really mean that performance thing - remote attributes in XFS
are written *synchronously* in the syscall because we need them on
disk before we commit the transaction that updates all the metadata
that points to them.

> Regarding the API, the issue with larger xattr size is that currently we
> copy whole xattr into kernel memory and process it in one go. Currently
> that's OK but if you want xattrs that have megabytes, it may become an
> effective way to DOS a system. So to support that we'd need to change at
> least the API from VFS into filesystems so that xattrs could be processed
> in smaller chunks. Again doable but quite some work.

Well, it's not just the VFS that would need to support this.
Attributes currently are not designed to be extended or partially
overwritten. Attributes are replaced in whole when they are changed,
and the filesystem implementations reflect that.

Again, going back to crash resiliency, XFS has a 3-step attribute
replacement algorithm to guarantee that a crash during or soon after
the operation will leave you with either the old or new value. It's
designed around userspace providing new attributes in whole, so
something like partial writes or extends will need *significant*
amounts of redesign and rework.

Oh, and what about all of the utilities that you rely on for backups,
restore, copying files, etc.? They all think:

$ grep -R XATTR /usr/include/linux/limits.h |grep MAX
#define XATTR_NAME_MAX   255    /* # chars in an extended attribute name */
#define XATTR_SIZE_MAX 65536    /* size of an extended attribute value (64k) */
#define XATTR_LIST_MAX 65536    /* size of extended attribute namelist (64k) */
$

And so if we change the kernel, we are suddenly creating files that
all our existing tools can't deal with. Now, that means I've got to
update xfs_repair, xfsdump, xfs_restore, xfs_db, xfs_fsr, etc. to
support arbitrarily sized attributes, not to mention the special XFS
ioctl kernel interfaces they use...

> All in all I don't think this is going to happen unless someone interested
> in this invests significant amount of time to make this happen.

Compared to how much work it is for the file server application to
map file streams to a directory+files on demand, it makes no sense
to invent a completely new xattr API and have to implement it in all
the required supporting infrastructure.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
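To illustrate the point about existing tools, here is a minimal sketch of the kind of check a backup or copy utility might perform, using the three constants from the `limits.h` output quoted above. The function name is hypothetical and no real tool is claimed to contain this exact code; the point is only that such limits end up compiled into userspace.

```python
import os

# Values as shown in the limits.h grep output quoted above.
XATTR_NAME_MAX = 255    # bytes in an extended attribute name
XATTR_SIZE_MAX = 65536  # bytes in an extended attribute value (64k)
XATTR_LIST_MAX = 65536  # bytes in the extended attribute name list (64k)


def fits_xattr_limits(name: str, value: bytes) -> bool:
    """Return True if a name/value pair is within the limits a tool
    built against these constants would expect to see."""
    name_len = len(os.fsencode(name))
    return 0 < name_len <= XATTR_NAME_MAX and len(value) <= XATTR_SIZE_MAX
```

A tool that sizes its buffers from these constants would treat a 64k value as acceptable and anything one byte larger as impossible, which is why raising the kernel limit alone would leave such tools unable to handle the resulting files.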