From: Dave Chinner
Subject: Re: size limit of extended attributes
Date: Wed, 6 May 2015 11:31:34 +1000
Message-ID: <20150506013134.GN15810@dastard>
References: <20150501154351.GC15810@dastard> <20150505154348.GA16223@quack.suse.cz>
In-Reply-To: <20150505154348.GA16223@quack.suse.cz>
To: Jan Kara
Cc: Björn JACKE, Trond Myklebust, Linux FS-devel Mailing List

On Tue, May 05, 2015 at 05:43:48PM +0200, Jan Kara wrote:
> On Tue 05-05-15 15:38:11, Björn JACKE wrote:
> > On 2015-05-02 at 01:43 +1000 Dave Chinner sent off:
> > > On Thu, Apr 30, 2015 at 03:57:37PM +0200, Björn JACKE wrote:
> > > > Can we get rid of the 64k size limit for EAs? The API on AIX is the same as on
> > > > Linux. But there is a huge size limit - which would help us already a lot.
> > >
> > > No - the maximum xattr size of 64k is encoded into the on-disk
> > > format of many filesystems and that's not a simple thing to change.
> >
> > I know ext4 even has a much lower limit. But some filesystems don't have a
> > limit, and there the kernel 64k limit strikes in. The EA size limit of some
> > file systems could also be increased. As there is a real use case for that I
> > guess filesystems would like to be able to support larger EA sizes.
>
>   Yeah, so XFS could support more than 64K in principle if I look correct.
> Supporting it for ext4 would mean ondisk format change - doable but
> requires some non-trivial effort.

In theory, the on-disk attribute format for XFS can support 2^32
bytes for a remote attribute, but the attribute btree itself can't
really support arbitrary length xattrs in its mappings with any sort
of performance or flexibility. And I really mean that performance
thing - remote attributes in XFS are written *synchronously* in the
syscall because we need them on disk before we commit the
transaction that updates all the metadata that points to them.

> Regarding the API, the issue with larger xattr size is that currently we
> copy whole xattr into kernel memory and process it in one go. Currently
> that's OK but if you want xattrs that have megabytes, it may become an
> effective way to DOS a system. So to support that we'd need to change at
> least the API from VFS into filesystems so that xattrs could be processed
> in smaller chunks. Again doable but quite some work.

Well, it's not just the VFS that would need to support this.
Attributes currently are not designed to be extended or partially
overwritten. Attributes are replaced in whole when they are changed,
and the filesystem implementations reflect that.

Again, going back to crash resiliency, XFS has a 3-step attribute
replacement algorithm to guarantee that a crash during or soon after
the operation will leave you with either the old or new value. It's
designed around userspace providing new attributes in whole, so
something like partial writes or extends will need *significant*
amounts of redesign and rework.

Oh, and what about all of the utilities that you rely on for backups,
restore, copying files, etc.?
They all think:

$ grep -R XATTR /usr/include/linux/limits.h |grep MAX
#define XATTR_NAME_MAX   255    /* # chars in an extended attribute name */
#define XATTR_SIZE_MAX 65536    /* size of an extended attribute value (64k) */
#define XATTR_LIST_MAX 65536    /* size of extended attribute namelist (64k) */
$

And so if we change the kernel, we suddenly are creating files that
all our existing tools can't deal with. Now, that means I've got to
update xfs_repair, xfsdump, xfs_restore, xfs_db, xfs_fsr, etc. to
support arbitrarily sized attributes, not to mention the special XFS
ioctl kernel interfaces they use...

> All in all I don't think this is going to happen unless someone interested
> in this invests significant amount of time to make this happen.

Compared to how much work it is for the file server application to
map file streams to a directory+files on demand, it makes no sense
to invent a completely new xattr API and have to implement it in all
the required supporting infrastructure.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
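
For illustration only, here is a minimal sketch of what "map file
streams to a directory+files on demand" could look like in a userspace
file server. It is not taken from any existing server. The sidecar
layout ("<file>.streams/<name>"), the "user." prefix, and the function
names are all assumptions made for the example. Values that fit under
XATTR_SIZE_MAX are tried as a normal xattr first; anything larger, or
anything the filesystem refuses, goes to a sidecar file that is
replaced in whole via write, fsync and rename, so a reader sees either
the old or the new value. Stream names are assumed not to contain "/".

```python
import os

XATTR_SIZE_MAX = 65536  # value from the limits.h output quoted in the mail


def sidecar_path(path, stream):
    """Hypothetical layout: '<file>.streams/<stream name>'."""
    return os.path.join(path + ".streams", stream)


def write_stream(path, stream, data):
    """Store a named stream and return which backend was used."""
    name = "user." + stream
    target = sidecar_path(path, stream)

    # Small values: try a plain xattr first. The filesystem may still
    # refuse (lower limit, no xattr support), so fall through on error.
    if len(data) <= XATTR_SIZE_MAX and hasattr(os, "setxattr"):
        try:
            os.setxattr(path, name, data)
            if os.path.exists(target):
                os.remove(target)  # drop a stale sidecar copy
            return "xattr"
        except OSError:
            pass

    # Large values: write a temp file, fsync it, then rename it over
    # the old copy so the value is always replaced in whole.
    os.makedirs(os.path.dirname(target), exist_ok=True)
    tmp = target + ".tmp"
    with open(tmp, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, target)

    if hasattr(os, "removexattr"):
        try:
            os.removexattr(path, name)  # drop a stale xattr copy
        except OSError:
            pass
    return "sidecar"


def read_stream(path, stream):
    """Read a stream back, preferring the sidecar file if one exists."""
    target = sidecar_path(path, stream)
    if os.path.exists(target):
        with open(target, "rb") as f:
            return f.read()
    return os.getxattr(path, "user." + stream)
```

Because the large values live in ordinary files, existing backup and
copy tools handle them without needing to know about a larger xattr
limit, although they would have to carry the sidecar directory along
with the file.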