Linux userland API discussions
* Re: [PATCH v4 14/16] ext4: add basic fs-verity support
From: Eric Biggers @ 2019-06-18 17:51 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: Darrick J . Wong, linux-api, Dave Chinner, linux-f2fs-devel,
	linux-fscrypt, linux-fsdevel, Jaegeuk Kim, linux-integrity,
	linux-ext4, Linus Torvalds, Christoph Hellwig, Victor Hsieh
In-Reply-To: <20190615153112.GO6142@mit.edu>

On Sat, Jun 15, 2019 at 11:31:12AM -0400, Theodore Ts'o wrote:
> On Thu, Jun 06, 2019 at 08:52:03AM -0700, Eric Biggers wrote:
> > +/*
> > + * Format of ext4 verity xattr.  This points to the location of the verity
> > + * descriptor within the file data rather than containing it directly because
> > + * the verity descriptor *must* be encrypted when ext4 encryption is used.  But,
> > + * ext4 encryption does not encrypt xattrs.
> > + */
> > +struct fsverity_descriptor_location {
> > +	__le32 version;
> > +	__le32 size;
> > +	__le64 pos;
> > +};
> 
> What's the benefit of storing the location in an xattr as opposed to
> just keying it off the end of i_size, rounded up to next page size (or
> 64k) as I had suggested earlier?
> 
> Using an xattr burns xattr space, which is a limited resource, and it
> adds some additional code complexity.  Do the benefits outweigh the
> added complexity?
> 
> 						- Ted

It means that only the fs/verity/ support layer has to be aware of the format of
the fsverity_descriptor, and the filesystem can just treat it as an opaque blob.

Otherwise the filesystem would need to read the first 'sizeof(struct
fsverity_descriptor)' bytes and use those to calculate the size as
'sizeof(struct fsverity_descriptor) + le32_to_cpu(desc.sig_size)', then read the
rest.  Is this what you have in mind?
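
For illustration only, the two-step size calculation described above could
look roughly like the userspace sketch below.  The header layout and all
demo_* names are hypothetical stand-ins, not the real struct
fsverity_descriptor; only the "fixed header plus le32 sig_size" arithmetic is
taken from the text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixed-size header; NOT the real struct fsverity_descriptor. */
struct demo_desc_header {
	uint8_t version;
	uint8_t reserved[3];
	uint8_t sig_size_le[4];	/* little-endian length of trailing signature */
};

/* Portable stand-in for le32_to_cpu() operating on a byte array. */
static uint32_t demo_le32_to_cpu(const uint8_t b[4])
{
	return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
	       ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}

/* Total size: fixed header plus the variable-length signature after it. */
static size_t demo_desc_total_size(const struct demo_desc_header *hdr)
{
	assert(hdr != NULL);
	return sizeof(*hdr) + demo_le32_to_cpu(hdr->sig_size_le);
}
```

The point of the sketch is that the caller has to understand the header
format before it knows how many more bytes to read, which is the coupling the
xattr approach avoids.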

Alternatively the filesystem could prepend the fsverity_descriptor with its
size, similar to how in the v1 and v2 patchsets there was an fsverity_footer
appended to the fsverity_descriptor.  But an xattr seems a cleaner approach to
store a few bytes that don't need to be encrypted.

Putting the verity descriptor before the Merkle tree also means that we'd have
to pass the desc_size to ->begin_enable_verity(), ->read_merkle_tree_page(), and
->write_merkle_tree_block(), versus just passing the merkle_tree_size to
->end_enable_verity().  This would be easy, but it would still add a bit of
complexity in the fsverity_operations rather than reduce it.

It's also somewhat nice to have the version number in the xattr, in case we ever
introduce a new fs-verity format for ext4 or f2fs.

So to me, it doesn't seem like the other possible solutions are better.

- Eric


* Re: [PATCH] mm, memcg: Report number of memcg caches in slabinfo
From: Waiman Long @ 2019-06-18 16:59 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	Andrew Morton, linux-mm, linux-kernel, Roman Gushchin,
	Johannes Weiner, Shakeel Butt, Vladimir Davydov, linux-api
In-Reply-To: <20190618123750.GG3318@dhcp22.suse.cz>

On 6/18/19 8:37 AM, Michal Hocko wrote:
> On Mon 17-06-19 10:50:23, Waiman Long wrote:
>> On 6/17/19 10:38 AM, Michal Hocko wrote:
>>> [Cc linux-api]
>>>
>>> On Mon 17-06-19 10:21:49, Waiman Long wrote:
>>>> There are concerns about memory leaks from extensive use of memory
>>>> cgroups as each memory cgroup creates its own set of kmem caches. There
>>>> is a possibility that the memcg kmem caches may remain even after the
>>>> memory cgroup removal.
>>>>
>>>> Therefore, it will be useful to show how many memcg caches are present
>>>> for each of the kmem caches.
>>> How is a user going to use that information?  Btw. Don't we have an
>>> interface to display the number of (dead) cgroups?
>> The interface to report dead cgroups is for cgroup v2 (cgroup.stat)
>> only. I don't think there is a way to find that for cgroup v1.
> Doesn't debug_legacy_files provide the information for both cgroups
> APIs?

Not really. The debug controller doesn't provide information about the
number of dead cgroups, for instance. Of course, we can always add that
information there. Also the debug controller is not typically configured
into a production kernel.


>> Also the
>> number of memcg kmem caches may not be the same as the number of
>> memcg's. It can range from 0 to above the number of memcg's.  So it is
>> an interesting number by itself.
> Is this useful enough to put into slabinfo? Doesn't this sound more like
> a debugfs kinda a thing?

I guess it is probably more on the debug side of things. I added it to
slabinfo as the data is readily available. It will be much more work if
we need to export the data via debugfs.

We are seeing the kmem_cache slab growing continuously over time when
running container-based workloads. Roman's kmem_cache reparenting
patch will hopefully solve a major part of the problem, but we still
need a way to confirm that by looking at how many memcg kmem_caches are
associated with each root kmem_cache.

Cheers,
Longman


* Re: [PATCH v4 13/16] fs-verity: support builtin file signatures
From: Eric Biggers @ 2019-06-18 16:58 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: Darrick J . Wong, linux-api, Dave Chinner, linux-f2fs-devel,
	linux-fscrypt, linux-fsdevel, Jaegeuk Kim, linux-integrity,
	linux-ext4, Linus Torvalds, Christoph Hellwig, Victor Hsieh
In-Reply-To: <20190615152143.GN6142@mit.edu>

On Sat, Jun 15, 2019 at 11:21:43AM -0400, Theodore Ts'o wrote:
> On Thu, Jun 06, 2019 at 08:52:02AM -0700, Eric Biggers wrote:
> > From: Eric Biggers <ebiggers@google.com>
> > 
> > To meet some users' needs, add optional support for having fs-verity
> > handle a portion of the authentication policy in the kernel.  An
> > ".fs-verity" keyring is created to which X.509 certificates can be
> > added; then a sysctl 'fs.verity.require_signatures' can be set to cause
> > the kernel to enforce that all fs-verity files contain a signature of
> > their file measurement by a key in this keyring.
> 
> I think it might be a good idea to allow the require_signatures
> setting to be set on a per-file system basis, via a mount option?  We
> could plumb it in via a flag in fsverity_info, set by the file system.

Perhaps, but this is something that can be added later, so I think we should
hold off on it until someone needs it.

> 
> Other than this feature request, looks good; you can add:
> 
> Reviewed-off-by: Theodore Ts'o <tytso@mit.edu>
> 

I assume you mean "Reviewed-by" :-)

- Eric


* Re: [PATCH v7 22/27] binfmt_elf: Extract .note.gnu.property from an ELF file
From: Dave Martin @ 2019-06-18 16:50 UTC (permalink / raw)
  To: Florian Weimer
  Cc: Yu-cheng Yu, Peter Zijlstra, Thomas Gleixner, x86, H. Peter Anvin,
	Ingo Molnar, linux-kernel, linux-doc, linux-mm, linux-arch,
	linux-api, Arnd Bergmann, Andy Lutomirski, Balbir Singh,
	Borislav Petkov, Cyrill Gorcunov, Dave Hansen,
	Eugene Syromiatnikov, H.J. Lu, Jann Horn, Jonathan Corbet,
	Kees Cook, Mike Kravetz
In-Reply-To: <8736k67tdc.fsf@oldenburg2.str.redhat.com>

On Tue, Jun 18, 2019 at 06:25:51PM +0200, Florian Weimer wrote:
> * Dave Martin:
> 
> > On Tue, Jun 18, 2019 at 09:00:35AM -0700, Yu-cheng Yu wrote:
> >> On Tue, 2019-06-18 at 18:05 +0200, Florian Weimer wrote:
> >> > * Yu-cheng Yu:
> >> > 
> >> > > > I assumed that it would also parse the main executable and make
> >> > > > adjustments based on that.
> >> > > 
> >> > > Yes, Linux also looks at the main executable's header, but not its
> >> > > NT_GNU_PROPERTY_TYPE_0 if there is a loader.
> >> > > 
> >> > > > 
> >> > > > ld.so can certainly provide whatever the kernel needs.  We need to tweak
> >> > > > the existing loader anyway.
> >> > > > 
> >> > > > No valid statically-linked binaries exist today, so this is not a
> >> > > > consideration at this point.
> >> > > 
> >> > > So from kernel, we look at only PT_GNU_PROPERTY?
> >> > 
> >> > If you don't parse notes/segments in the executable for CET, then yes.
> >> > We can put PT_GNU_PROPERTY into the loader.
> >> 
> >> Thanks!
> >
> > Would this require the kernel and ld.so to be updated in a particular
> > order to avoid breakage?  I don't know enough about RHEL to know how
> > controversial that might be.
> 
> There is no official ld.so that will work with the current userspace
> interface (in this patch submission).  Upstream glibc needs to be
> updated anyway, so yet another change isn't much of an issue.  This is
> not a problem; we knew that something like this might happen.
> 
> Sure, people need a new binutils with backports for PT_GNU_PROPERTY, but
> given that only very few people will build CET binaries with older
> binutils, I think that's not a real issue either.

OK, just wanted to check we weren't missing any requirement for x86.

This approach should satisfy the requirement for arm64 nicely.
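
As a rough userspace sketch of the "no PT_GNU_PROPERTY -> no note" lookup
discussed in this thread: scan the program headers once and give up if the
segment type is absent, rather than searching every PT_NOTE.  The function
name is made up for illustration, and the fallback #define assumes the p_type
value 0x6474e553 for toolchains whose <elf.h> predates the constant.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>

#ifndef PT_GNU_PROPERTY
#define PT_GNU_PROPERTY 0x6474e553	/* assumed value for older <elf.h> */
#endif

/*
 * Return the first PT_GNU_PROPERTY program header, or NULL if there is none.
 * No PT_NOTE fallback is attempted.
 */
static const Elf64_Phdr *demo_find_gnu_property(const Elf64_Phdr *phdrs,
						size_t n)
{
	size_t i;

	assert(phdrs != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (phdrs[i].p_type == PT_GNU_PROPERTY)
			return &phdrs[i];
	}
	return NULL;
}
```

This is only the lookup step; parsing the property note that the segment
points to is not shown.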

Cheers
---Dave


* Re: [PATCH v4 10/16] fs-verity: implement FS_IOC_ENABLE_VERITY ioctl
From: Eric Biggers @ 2019-06-18 16:50 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: Darrick J . Wong, linux-api, Dave Chinner, linux-f2fs-devel,
	linux-fscrypt, linux-fsdevel, Jaegeuk Kim, linux-integrity,
	linux-ext4, Linus Torvalds, Christoph Hellwig, Victor Hsieh
In-Reply-To: <20190615150821.GK6142@mit.edu>

On Sat, Jun 15, 2019 at 11:08:21AM -0400, Theodore Ts'o wrote:
> On Thu, Jun 06, 2019 at 08:51:59AM -0700, Eric Biggers wrote:
> > From: Eric Biggers <ebiggers@google.com>
> > 
> > Add a function for filesystems to call to implement the
> > FS_IOC_ENABLE_VERITY ioctl.  This ioctl enables fs-verity on a file.
> > 
> > See the "FS_IOC_ENABLE_VERITY" section of
> > Documentation/filesystems/fsverity.rst for the documentation.
> > 
> > Signed-off-by: Eric Biggers <ebiggers@google.com>
> 
> > diff --git a/fs/verity/enable.c b/fs/verity/enable.c
> > new file mode 100644
> > index 000000000000..7e7ef9d3c376
> > --- /dev/null
> > +++ b/fs/verity/enable.c
> > +	/* Tell the filesystem to finish enabling verity on the file */
> > +	err = vops->end_enable_verity(filp, desc, desc_size, params.tree_size);
> > +	if (err) {
> > +		fsverity_err(inode, "%ps() failed with err %d",
> > +			     vops->end_enable_verity, err);
> > +		fsverity_free_info(vi);
> > +	} else {
> > +		/* Successfully enabled verity */
> > +
> > +		WARN_ON(!IS_VERITY(inode));
> > +
> > +		/*
> > +		 * Readers can start using ->i_verity_info immediately, so it
> > +		 * can't be rolled back once set.  So don't set it until just
> > +		 * after the filesystem has successfully enabled verity.
> > +		 */
> > +		fsverity_set_info(inode, vi);
> > +	}
> 
> If end_enable_verity() returns success, and IS_VERITY is not set, I
> would think that we should report the error via fsverity_err() and
> return an error to userspace, and *not* call fsverity_set_info().  I
> don't think the stack trace printed by WARN_ON is going to be very
> interesting, since the call path which gets us to enable_verity() is
> not going to be surprising.
> 

I want to keep it as WARN_ON() because if it happens it's a kernel bug, and
WARNs are reported as bugs by automated tools.  But I can do the following so it
returns an error code too:

@@ -229,11 +235,12 @@ static int enable_verity(struct file *filp,
 		fsverity_err(inode, "%ps() failed with err %d",
 			     vops->end_enable_verity, err);
 		fsverity_free_info(vi);
+	} else if (WARN_ON(!IS_VERITY(inode))) {
+		err = -EINVAL;
+		fsverity_free_info(vi);
 	} else {
 		/* Successfully enabled verity */
 
-		WARN_ON(!IS_VERITY(inode));
-
 		/*
 		 * Readers can start using ->i_verity_info immediately, so it
 		 * can't be rolled back once set.  So don't set it until just

> > +
> > +	if (inode->i_size <= 0) {
> > +		err = -EINVAL;
> > +		goto out_unlock;
> > +	}
> 
> How hard would it be to support fsverity for zero-length files?  There
> would be no Merkle tree, but there still would be an fsverity header
> file on which we can calculate a checksum for the digital signature.
> 
>      	      	     	       	 - Ted
> 

Empty files would have to be special-cased, e.g. defining the root hash to be
all 0's, since there are no blocks to checksum.  It would be straightforward,
but it would still be a special case, e.g.:

diff --git a/fs/verity/enable.c b/fs/verity/enable.c
index ee9dd578e59fb..e859a2b6a4310 100644
--- a/fs/verity/enable.c
+++ b/fs/verity/enable.c
@@ -112,6 +112,12 @@ static int build_merkle_tree(struct inode *inode,
 	unsigned int level;
 	int err = -ENOMEM;
 
+	if (inode->i_size == 0) {
+		/* Empty file is a special case; root hash is all 0's */
+		memset(root_hash, 0, params->digest_size);
+		return 0;
+	}
+

On the other hand, *not* supporting empty files is a special case from the
user's point of view.  It means that fs-verity isn't supported on every possible
file.  Thinking about it, that's probably worse than having a special case in
the *implementation*.

So now I'm leaning towards changing it to support empty files.

- Eric


* Re: [PATCH v4 07/16] fs-verity: add the hook for file ->open()
From: Eric Biggers @ 2019-06-18 16:35 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: Darrick J . Wong, linux-api, Dave Chinner, linux-f2fs-devel,
	linux-fscrypt, linux-fsdevel, Jaegeuk Kim, linux-integrity,
	linux-ext4, Linus Torvalds, Christoph Hellwig, Victor Hsieh
In-Reply-To: <20190615144207.GH6142@mit.edu>

On Sat, Jun 15, 2019 at 10:42:07AM -0400, Theodore Ts'o wrote:
> On Thu, Jun 06, 2019 at 08:51:56AM -0700, Eric Biggers wrote:
> > From: Eric Biggers <ebiggers@google.com>
> > 
> > Add the fsverity_file_open() function, which prepares an fs-verity file
> > to be read from.  If not already done, it loads the fs-verity descriptor
> > from the filesystem and sets up an fsverity_info structure for the inode
> > which describes the Merkle tree and contains the file measurement.  It
> > also denies all attempts to open verity files for writing.
> > 
> > This commit also begins the include/linux/fsverity.h header, which
> > declares the interface between fs/verity/ and filesystems.
> > 
> > Signed-off-by: Eric Biggers <ebiggers@google.com>
> 
> Looks good; you can add:
> 
> Reviewed-off-by: Theodore Ts'o <tytso@mit.edu>
> 
> 						- Ted
> 
> > +/*
> > + * Validate the given fsverity_descriptor and create a new fsverity_info from
> > + * it.  The signature (if present) is also checked.
> > + */
> > +struct fsverity_info *fsverity_create_info(const struct inode *inode,
> > +					   const void *_desc, size_t desc_size)
> 
> Well, technically it's not checked (yet).  It doesn't get checked
> until [PATCH 13/16]: support builtin file signatures.  If we want to
> be really nit-picky, that portion of the comment could be moved to
> later in the series.
> 

Yes, I missed this when splitting out the patches.  I'll move it to patch 13.

- Eric


* Re: [PATCH v4 05/16] fs-verity: add Kconfig and the helper functions for hashing
From: Eric Biggers @ 2019-06-18 16:32 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: Darrick J . Wong, linux-api, Dave Chinner, linux-f2fs-devel,
	linux-fscrypt, linux-fsdevel, Jaegeuk Kim, linux-integrity,
	linux-ext4, Linus Torvalds, Christoph Hellwig, Victor Hsieh
In-Reply-To: <20190615125731.GF6142@mit.edu>

On Sat, Jun 15, 2019 at 08:57:31AM -0400, Theodore Ts'o wrote:
> On Thu, Jun 06, 2019 at 08:51:54AM -0700, Eric Biggers wrote:
> > From: Eric Biggers <ebiggers@google.com>
> > 
> > Add the beginnings of the fs/verity/ support layer, including the
> > Kconfig option and various helper functions for hashing.  To start, only
> > SHA-256 is supported, but other hash algorithms can easily be added.
> > 
> > Signed-off-by: Eric Biggers <ebiggers@google.com>
> 
> Looks good; you can add:
> 
> Reviewed-off-by: Theodore Ts'o <tytso@mit.edu>
> 
> One thought for consideration below...
> 
> 
> > +
> > +/*
> > + * Maximum depth of the Merkle tree.  Up to 64 levels are theoretically possible
> > + * with a very small block size, but we'd like to limit stack usage during
> > + * verification, and in practice this is plenty.  E.g., with SHA-256 and 4K
> > + * blocks, a file with size UINT64_MAX bytes needs just 8 levels.
> > + */
> > +#define FS_VERITY_MAX_LEVELS		16
> 
> Maybe we should make FS_VERITY_MAX_LEVELS 8 for now?  This is an
> implementation-level restriction, and currently we don't support any
> architectures that have a page size < 4k.  We can always bump this
> number up in the future if it ever becomes necessary, and limiting max
> levels to 8 saves almost 100 bytes of stack space in verify_page().
> 
> 						- Ted

Yes, I agree.  I'll reduce MAX_LEVELS to 8 for now and tweak the comment.
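
For reference, the "8 levels" figure in the quoted comment can be checked with
a small userspace sketch like the one below.  It assumes a level is counted
each time a set of hashes is packed into blocks, until a single block remains;
the function name is hypothetical and this is not the fs/verity/ code itself.

```c
#include <assert.h>
#include <stdint.h>

/* Number of Merkle tree levels needed for a file of 'file_size' bytes. */
static unsigned int demo_merkle_levels(uint64_t file_size,
				       unsigned int block_size,
				       unsigned int digest_size)
{
	uint64_t hashes_per_block;
	uint64_t blocks;
	unsigned int levels = 0;

	/* Need at least 2 hashes per block or the tree never shrinks. */
	assert(block_size > 0 && digest_size > 0 &&
	       digest_size <= block_size / 2);
	hashes_per_block = block_size / digest_size;

	/* ceil(file_size / block_size), written to avoid overflow. */
	blocks = file_size / block_size + (file_size % block_size != 0);

	while (blocks > 1) {
		/* One hash per block; pack the hashes into the next level. */
		blocks = blocks / hashes_per_block +
			 (blocks % hashes_per_block != 0);
		levels++;
	}
	return levels;
}
```

With SHA-256 (32-byte digests) and 4K blocks there are 128 hashes per block,
so each level shrinks the block count by a factor of 128: 2^52 data blocks
reduce to one block after 8 steps.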

- Eric


* Re: [PATCH v4 01/16] fs-verity: add a documentation file
From: Eric Biggers @ 2019-06-18 16:31 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: Darrick J . Wong, linux-api, Dave Chinner, linux-f2fs-devel,
	linux-fscrypt, linux-fsdevel, Jaegeuk Kim, linux-integrity,
	linux-ext4, Linus Torvalds, Christoph Hellwig, Victor Hsieh
In-Reply-To: <20190615123920.GB6142@mit.edu>

Hi Ted,

On Sat, Jun 15, 2019 at 08:39:20AM -0400, Theodore Ts'o wrote:
> On Thu, Jun 06, 2019 at 08:51:50AM -0700, Eric Biggers wrote:
> > From: Eric Biggers <ebiggers@google.com>
> > 
> > Add a documentation file for fs-verity, covering....
> > 
> > Signed-off-by: Eric Biggers <ebiggers@google.com>
> 
> Looks good; you can add:
> 
> Reviewed-by: Theodore Ts'o <tytso@mit.edu>
> 
> 
> One minor design point below:
> 
> > +ext4 stores the verity metadata (Merkle tree and fsverity_descriptor)
> > +past the end of the file, starting at the first page fully beyond
>                                                    ^^^^
> > +i_size.  This approach works because (a) verity files are readonly,
> > +and (b) pages fully beyond i_size aren't visible to userspace but can
> > +be read/written internally by ext4 with only some relatively small
> > +changes to ext4.  This approach avoids having to depend on the
> > +EA_INODE feature and on rearchitecturing ext4's xattr support to
> > +support paging multi-gigabyte xattrs into memory, and to support
> > +encrypting xattrs.  Note that the verity metadata *must* be encrypted
> > +when the file is, since it contains hashes of the plaintext data.
> 
> If we ever want to support mounting, say, a file system with 4k blocks
> and fsverity enabled on an architecture with a 16k or 64k page size,
> then "page" in that first sentence will need to become "block".  At
> the moment we only support fsverity when page size == block size, so
> it's not an issue.
> 
> However, it's worth reflecting on what this means.  In order to
> satisfy this requirement (from the mmap man page):
> 
>        A file is mapped in multiples of the page size.  For a file
>        that is not a multiple of the page size, the remaining memory
>        is zeroed when mapped...
> 
> we're going to have to special case how the last page gets mmaped.
> The simplest way to do this will be to map in an anonymous page which
> just has the blocks that are part of the data block copied in, and the
> rest of the page can be zero'ed.
> 
> One thing we might consider doing just to make life much easier for
> ourselves (should we ever want to support page size != block size ---
> which I could imagine some folks like Chandan might find desirable) is
> to specify that the fsverity metadata begins at an offset which begins
> at i_size rounded up to the next 64k boundary, which should handle all
> current and future architectures' page sizes.
> 

Thanks for the review.  Good point; I think we should just go with the "always
round up to the next 64K boundary" method.  Special-casing how the last page
gets mmap()ed seems like it would be really painful.

Since there can be a hole between the end of the file and the start of the
verity metadata, this doesn't even necessarily use any additional disk space.
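
A minimal sketch of the rounding, for illustration: the helper name and
constant are made up, and it assumes an i_size that already sits exactly on a
64K boundary maps to itself.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_VERITY_ALIGN 65536ULL	/* 64K, a power of two */

/* Offset at which the verity metadata would start for a given i_size. */
static uint64_t demo_verity_metadata_pos(uint64_t i_size)
{
	/* Guard against wraparound in the addition below. */
	assert(i_size <= UINT64_MAX - (DEMO_VERITY_ALIGN - 1));
	return (i_size + DEMO_VERITY_ALIGN - 1) & ~(DEMO_VERITY_ALIGN - 1);
}
```

For example, a 4096-byte file would place its metadata at offset 65536, with
the range in between left as a hole.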

For consistency and since there is little downside I think I'll do the same for
f2fs too, though f2fs doesn't currently support PAGE_SIZE != 4096 at all anyway.

- Eric


* Re: [PATCH v7 22/27] binfmt_elf: Extract .note.gnu.property from an ELF file
From: Florian Weimer @ 2019-06-18 16:25 UTC (permalink / raw)
  To: Dave Martin
  Cc: Yu-cheng Yu, Peter Zijlstra, Thomas Gleixner, x86, H. Peter Anvin,
	Ingo Molnar, linux-kernel, linux-doc, linux-mm, linux-arch,
	linux-api, Arnd Bergmann, Andy Lutomirski, Balbir Singh,
	Borislav Petkov, Cyrill Gorcunov, Dave Hansen,
	Eugene Syromiatnikov, H.J. Lu, Jann Horn, Jonathan Corbet,
	Kees Cook, Mike Kravetz
In-Reply-To: <20190618162005.GF2790@e103592.cambridge.arm.com>

* Dave Martin:

> On Tue, Jun 18, 2019 at 09:00:35AM -0700, Yu-cheng Yu wrote:
>> On Tue, 2019-06-18 at 18:05 +0200, Florian Weimer wrote:
>> > * Yu-cheng Yu:
>> > 
>> > > > I assumed that it would also parse the main executable and make
>> > > > adjustments based on that.
>> > > 
>> > > Yes, Linux also looks at the main executable's header, but not its
>> > > NT_GNU_PROPERTY_TYPE_0 if there is a loader.
>> > > 
>> > > > 
>> > > > ld.so can certainly provide whatever the kernel needs.  We need to tweak
>> > > > the existing loader anyway.
>> > > > 
>> > > > No valid statically-linked binaries exist today, so this is not a
>> > > > consideration at this point.
>> > > 
>> > > So from kernel, we look at only PT_GNU_PROPERTY?
>> > 
>> > If you don't parse notes/segments in the executable for CET, then yes.
>> > We can put PT_GNU_PROPERTY into the loader.
>> 
>> Thanks!
>
> Would this require the kernel and ld.so to be updated in a particular
> order to avoid breakage?  I don't know enough about RHEL to know how
> controversial that might be.

There is no official ld.so that will work with the current userspace
interface (in this patch submission).  Upstream glibc needs to be
updated anyway, so yet another change isn't much of an issue.  This is
not a problem; we knew that something like this might happen.

Sure, people need a new binutils with backports for PT_GNU_PROPERTY, but
given that only very few people will build CET binaries with older
binutils, I think that's not a real issue either.

Thanks,
Florian


* Re: [PATCH v7 22/27] binfmt_elf: Extract .note.gnu.property from an ELF file
From: Dave Martin @ 2019-06-18 16:20 UTC (permalink / raw)
  To: Yu-cheng Yu
  Cc: Florian Weimer, Peter Zijlstra, Thomas Gleixner, x86,
	H. Peter Anvin, Ingo Molnar, linux-kernel, linux-doc, linux-mm,
	linux-arch, linux-api, Arnd Bergmann, Andy Lutomirski,
	Balbir Singh, Borislav Petkov, Cyrill Gorcunov, Dave Hansen,
	Eugene Syromiatnikov, H.J. Lu, Jann Horn, Jonathan Corbet,
	Kees Cook, Mike Kravetz
In-Reply-To: <b0491cb517ba377da6496fe91a98fdbfca4609a9.camel@intel.com>

On Tue, Jun 18, 2019 at 09:00:35AM -0700, Yu-cheng Yu wrote:
> On Tue, 2019-06-18 at 18:05 +0200, Florian Weimer wrote:
> > * Yu-cheng Yu:
> > 
> > > > I assumed that it would also parse the main executable and make
> > > > adjustments based on that.
> > > 
> > > Yes, Linux also looks at the main executable's header, but not its
> > > NT_GNU_PROPERTY_TYPE_0 if there is a loader.
> > > 
> > > > 
> > > > ld.so can certainly provide whatever the kernel needs.  We need to tweak
> > > > the existing loader anyway.
> > > > 
> > > > No valid statically-linked binaries exist today, so this is not a
> > > > consideration at this point.
> > > 
> > > So from kernel, we look at only PT_GNU_PROPERTY?
> > 
> > If you don't parse notes/segments in the executable for CET, then yes.
> > We can put PT_GNU_PROPERTY into the loader.
> 
> Thanks!

Would this require the kernel and ld.so to be updated in a particular
order to avoid breakage?  I don't know enough about RHEL to know how
controversial that might be.

Also:

What about static binaries distributed as part of RHEL?

A user would also reasonably expect static binaries built using the
distro toolchain to work on top of the distro kernel...  which might
be broken by this.


(When I say "broken" I mean that the binary would run, but CET
protections would be silently turned off.)

Cheers
---Dave


* Re: [PATCH v7 22/27] binfmt_elf: Extract .note.gnu.property from an ELF file
From: Florian Weimer @ 2019-06-18 16:05 UTC (permalink / raw)
  To: Yu-cheng Yu
  Cc: Dave Martin, Peter Zijlstra, Thomas Gleixner, x86, H. Peter Anvin,
	Ingo Molnar, linux-kernel, linux-doc, linux-mm, linux-arch,
	linux-api, Arnd Bergmann, Andy Lutomirski, Balbir Singh,
	Borislav Petkov, Cyrill Gorcunov, Dave Hansen,
	Eugene Syromiatnikov, H.J. Lu, Jann Horn, Jonathan Corbet,
	Kees Cook, Mike Kravetz, N
In-Reply-To: <1ca57aaae8a2121731f2dcb1a137b92eed39a0d2.camel@intel.com>

* Yu-cheng Yu:

>> I assumed that it would also parse the main executable and make
>> adjustments based on that.
>
> Yes, Linux also looks at the main executable's header, but not its
> NT_GNU_PROPERTY_TYPE_0 if there is a loader.
>
>> 
>> ld.so can certainly provide whatever the kernel needs.  We need to tweak
>> the existing loader anyway.
>> 
>> No valid statically-linked binaries exist today, so this is not a
>> consideration at this point.
>
> So from kernel, we look at only PT_GNU_PROPERTY?

If you don't parse notes/segments in the executable for CET, then yes.
We can put PT_GNU_PROPERTY into the loader.

Thanks,
Florian


* Re: [PATCH v7 22/27] binfmt_elf: Extract .note.gnu.property from an ELF file
From: Yu-cheng Yu @ 2019-06-18 16:00 UTC (permalink / raw)
  To: Florian Weimer
  Cc: Dave Martin, Peter Zijlstra, Thomas Gleixner, x86, H. Peter Anvin,
	Ingo Molnar, linux-kernel, linux-doc, linux-mm, linux-arch,
	linux-api, Arnd Bergmann, Andy Lutomirski, Balbir Singh,
	Borislav Petkov, Cyrill Gorcunov, Dave Hansen,
	Eugene Syromiatnikov, H.J. Lu, Jann Horn, Jonathan Corbet,
	Kees Cook, Mike Kravetz, N
In-Reply-To: <87blyu7ubf.fsf@oldenburg2.str.redhat.com>

On Tue, 2019-06-18 at 18:05 +0200, Florian Weimer wrote:
> * Yu-cheng Yu:
> 
> > > I assumed that it would also parse the main executable and make
> > > adjustments based on that.
> > 
> > Yes, Linux also looks at the main executable's header, but not its
> > NT_GNU_PROPERTY_TYPE_0 if there is a loader.
> > 
> > > 
> > > ld.so can certainly provide whatever the kernel needs.  We need to tweak
> > > the existing loader anyway.
> > > 
> > > No valid statically-linked binaries exist today, so this is not a
> > > consideration at this point.
> > 
> > So from kernel, we look at only PT_GNU_PROPERTY?
> 
> If you don't parse notes/segments in the executable for CET, then yes.
> We can put PT_GNU_PROPERTY into the loader.

Thanks!


* Re: [PATCH v7 22/27] binfmt_elf: Extract .note.gnu.property from an ELF file
From: Yu-cheng Yu @ 2019-06-18 15:53 UTC (permalink / raw)
  To: Florian Weimer
  Cc: Dave Martin, Peter Zijlstra, Thomas Gleixner, x86, H. Peter Anvin,
	Ingo Molnar, linux-kernel, linux-doc, linux-mm, linux-arch,
	linux-api, Arnd Bergmann, Andy Lutomirski, Balbir Singh,
	Borislav Petkov, Cyrill Gorcunov, Dave Hansen,
	Eugene Syromiatnikov, H.J. Lu, Jann Horn, Jonathan Corbet,
	Kees Cook, Mike Kravetz, N
In-Reply-To: <87pnna7v1d.fsf@oldenburg2.str.redhat.com>

On Tue, 2019-06-18 at 17:49 +0200, Florian Weimer wrote:
> * Yu-cheng Yu:
> 
> > The kernel looks at only ld-linux.  Other applications are loaded by ld-linux.
> > So the issues are limited to three versions of ld-linux's.  Can we somehow
> > update those??
> 
> I assumed that it would also parse the main executable and make
> adjustments based on that.

Yes, Linux also looks at the main executable's header, but not its
NT_GNU_PROPERTY_TYPE_0 if there is a loader.

> 
> ld.so can certainly provide whatever the kernel needs.  We need to tweak
> the existing loader anyway.
> 
> No valid statically-linked binaries exist today, so this is not a
> consideration at this point.

So from kernel, we look at only PT_GNU_PROPERTY?

Yu-cheng


* Re: [PATCH v7 22/27] binfmt_elf: Extract .note.gnu.property from an ELF file
From: Florian Weimer @ 2019-06-18 15:49 UTC (permalink / raw)
  To: Yu-cheng Yu
  Cc: Dave Martin, Peter Zijlstra, Thomas Gleixner, x86, H. Peter Anvin,
	Ingo Molnar, linux-kernel, linux-doc, linux-mm, linux-arch,
	linux-api, Arnd Bergmann, Andy Lutomirski, Balbir Singh,
	Borislav Petkov, Cyrill Gorcunov, Dave Hansen,
	Eugene Syromiatnikov, H.J. Lu, Jann Horn, Jonathan Corbet,
	Kees Cook, Mike Kravetz, N
In-Reply-To: <d54fe81be77b9edd8578a6d208c72cd7c0b8c1dd.camel@intel.com>

* Yu-cheng Yu:

> The kernel looks at only ld-linux.  Other applications are loaded by ld-linux. 
> So the issues are limited to three versions of ld-linux's.  Can we somehow
> update those??

I assumed that it would also parse the main executable and make
adjustments based on that.

ld.so can certainly provide whatever the kernel needs.  We need to tweak
the existing loader anyway.

No valid statically-linked binaries exist today, so this is not a
consideration at this point.

Thanks,
Florian


* Re: [PATCH v7 22/27] binfmt_elf: Extract .note.gnu.property from an ELF file
From: Yu-cheng Yu @ 2019-06-18 14:58 UTC (permalink / raw)
  To: Dave Martin, Peter Zijlstra
  Cc: Florian Weimer, Thomas Gleixner, x86, H. Peter Anvin, Ingo Molnar,
	linux-kernel, linux-doc, linux-mm, linux-arch, linux-api,
	Arnd Bergmann, Andy Lutomirski, Balbir Singh, Borislav Petkov,
	Cyrill Gorcunov, Dave Hansen, Eugene Syromiatnikov, H.J. Lu,
	Jann Horn, Jonathan Corbet, Kees Cook, Mike Kravetz, Nadav Amit,
	Ole
In-Reply-To: <20190618133223.GD2790@e103592.cambridge.arm.com>

On Tue, 2019-06-18 at 14:32 +0100, Dave Martin wrote:
> On Tue, Jun 18, 2019 at 02:55:12PM +0200, Peter Zijlstra wrote:
> > On Tue, Jun 18, 2019 at 02:47:00PM +0200, Florian Weimer wrote:
> > > * Peter Zijlstra:
> > > 
> > > > I'm not sure I read Thomas' comment like that. In my reading keeping the
> > > > PT_NOTE fallback is exactly one of those 'fly workarounds'. By not
> > > > supporting PT_NOTE only the 'fine' people already shit^Hpping this out
> > > > of tree are affected, and we don't have to care about them at all.
> > > 
> > > Just to be clear here: There was an ABI document that required PT_NOTE
> > > parsing.
> > 
> > URGH.
> > 
> > > The Linux kernel does *not* define the x86-64 ABI, it only
> > > implements it.  The authoritative source should be the ABI document.
> > > 
> > > In this particularly case, so far anyone implementing this ABI extension
> > > tried to provide value by changing it, sometimes successfully.  Which
> > > makes me wonder why we even bother to maintain ABI documentation.  The
> > > kernel is just very late to the party.
> > 
> > How can the kernel be late to the party if all of this is spinning
> > wheels without kernel support?
> 
> PT_GNU_PROPERTY is mentioned and allocated a p_type value in hjl's
> spec [1], but otherwise seems underspecified.
> 
> In particular, it's not clear whether a PT_GNU_PROPERTY phdr _must_ be
> emitted for NT_GNU_PROPERTY_TYPE_0.  While it seems a no-brainer to emit
> it, RHEL's linker already doesn't IIUC, and there are binaries in the
> wild.
> 
> Maybe this phdr type is a late addition -- I haven't attempted to dig
> through the history.
> 
> 
> For arm64 we don't have this out-of-tree legacy to support, so we can
> avoid exhaustively searching for the note: no PT_GNU_PROPERTY ->
> no note.
> 
> So, can we do the same for x86, forcing RHEL to carry some code out of
> tree to support their legacy binaries?  Or do we accept that there is
> already a de facto ABI and try to be compatible with it?
> 
> 
> From my side, I want to avoid duplication between x86 and arm64, and
> keep unneeded complexity out of the ELF loader where possible.

Hi Florian,

The kernel looks at only ld-linux.  Other applications are loaded by ld-linux. 
So the issues are limited to three versions of ld-linux's.  Can we somehow
update those??

Thanks,
Yu-cheng


* Re: [PATCH 25/25] fsinfo: Add API documentation [ver #13]
From: David Howells @ 2019-06-18 14:01 UTC (permalink / raw)
  To: Alan Jenkins
  Cc: dhowells, viro, raven, linux-api, linux-fsdevel, linux-kernel,
	mszeredi
In-Reply-To: <d88b3276-a81a-d5a6-76d6-1a01376aa31c@gmail.com>

Alan Jenkins <alan.christopher.jenkins@gmail.com> wrote:

> > +    elements in the FSINFO_ATTR_MOUNT_CHROOT list.
> 
> FSINFO_ATTR_MOUNT_CHROOT -> FSINFO_ATTR_MOUNT_CHILDREN

I've applied your changes.

David


* Re: [PATCH 09/25] vfs: Allow mount information to be queried by fsinfo() [ver #13]
From: David Howells @ 2019-06-18 14:00 UTC (permalink / raw)
  To: Joel Fernandes
  Cc: dhowells, viro, raven, linux-api, linux-fsdevel, linux-kernel,
	mszeredi
In-Reply-To: <20190601160822.GA77761@google.com>

Joel Fernandes <joel@joelfernandes.org> wrote:

> > +	record.mnt_id = m->mnt_id;
> > +	record.notify_counter = atomic_read(&m->mnt_notify_counter);
> > +	store_mount_fsinfo(params, &record);
> > +
> > +	rcu_read_unlock();
> 
> Not super familiar with this code, but wanted to check with you:
> 
> Here, if the rcu_read_lock is supposed to protect the RCU list, can
> rcu_read_lock() scope be reduced to just wrapping around the
> list_for_each_entry_rcu?

Done.

David


* Re: [PATCH v7 22/27] binfmt_elf: Extract .note.gnu.property from an ELF file
From: Dave Martin @ 2019-06-18 13:32 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Florian Weimer, Thomas Gleixner, Yu-cheng Yu, x86, H. Peter Anvin,
	Ingo Molnar, linux-kernel, linux-doc, linux-mm, linux-arch,
	linux-api, Arnd Bergmann, Andy Lutomirski, Balbir Singh,
	Borislav Petkov, Cyrill Gorcunov, Dave Hansen,
	Eugene Syromiatnikov, H.J. Lu, Jann Horn, Jonathan Corbet,
	Kees Cook, Mike Kravetz, N
In-Reply-To: <20190618125512.GJ3419@hirez.programming.kicks-ass.net>

On Tue, Jun 18, 2019 at 02:55:12PM +0200, Peter Zijlstra wrote:
> On Tue, Jun 18, 2019 at 02:47:00PM +0200, Florian Weimer wrote:
> > * Peter Zijlstra:
> > 
> > > I'm not sure I read Thomas' comment like that. In my reading keeping the
> > > PT_NOTE fallback is exactly one of those 'fly workarounds'. By not
> > > supporting PT_NOTE only the 'fine' people already shit^Hpping this out
> > > of tree are affected, and we don't have to care about them at all.
> > 
> > Just to be clear here: There was an ABI document that required PT_NOTE
> > parsing.
> 
> URGH.
> 
> > The Linux kernel does *not* define the x86-64 ABI, it only
> > implements it.  The authoritative source should be the ABI document.
> >
> > In this particular case, so far anyone implementing this ABI extension
> > tried to provide value by changing it, sometimes successfully.  Which
> > makes me wonder why we even bother to maintain ABI documentation.  The
> > kernel is just very late to the party.
> 
> How can the kernel be late to the party if all of this is spinning
> wheels without kernel support?

PT_GNU_PROPERTY is mentioned and allocated a p_type value in hjl's
spec [1], but otherwise seems underspecified.

In particular, it's not clear whether a PT_GNU_PROPERTY phdr _must_ be
emitted for NT_GNU_PROPERTY_TYPE_0.  While it seems a no-brainer to emit
it, RHEL's linker already doesn't IIUC, and there are binaries in the
wild.

Maybe this phdr type is a late addition -- I haven't attempted to dig
through the history.


For arm64 we don't have this out-of-tree legacy to support, so we can
avoid exhaustively searching for the note: no PT_GNU_PROPERTY ->
no note.

So, can we do the same for x86, forcing RHEL to carry some code out of
tree to support their legacy binaries?  Or do we accept that there is
already a de facto ABI and try to be compatible with it?


From my side, I want to avoid duplication between x86 and arm64, and
keep unneeded complexity out of the ELF loader where possible.
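
To make the two options concrete, here is a rough userspace sketch of the
lookup policy.  It is not code from the patch series: the helper name
find_property_phdr() and the allow_note_fallback flag are made up for
illustration.  arm64 would pass 0; an x86 kernel that stays compatible with
the binaries already in the wild would pass 1.  The fallback below only
remembers the first PT_NOTE segment, whereas a real fallback would have to
walk every PT_NOTE segment looking for NT_GNU_PROPERTY_TYPE_0.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>

/* Older <elf.h> versions may lack this; the value is the one in hjl's spec. */
#ifndef PT_GNU_PROPERTY
#define PT_GNU_PROPERTY 0x6474e553
#endif

/*
 * Hypothetical helper: pick the program header in which to look for
 * NT_GNU_PROPERTY_TYPE_0.  A PT_GNU_PROPERTY entry always wins.  If there
 * is none and allow_note_fallback is nonzero, return the first PT_NOTE
 * entry as a candidate; otherwise return NULL ("no PT_GNU_PROPERTY ->
 * no note").
 */
static const Elf64_Phdr *find_property_phdr(const Elf64_Phdr *phdrs, size_t n,
					    int allow_note_fallback)
{
	const Elf64_Phdr *note = NULL;
	size_t i;

	assert(phdrs != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (phdrs[i].p_type == PT_GNU_PROPERTY)
			return &phdrs[i];
		if (allow_note_fallback && !note && phdrs[i].p_type == PT_NOTE)
			note = &phdrs[i];
	}
	return note;
}
```

With allow_note_fallback == 0 the cost for binaries without the property is
a single pass over the program headers and nothing else.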

Cheers
---Dave


[1] https://github.com/hjl-tools/linux-abi/wiki/Linux-Extensions-to-gABI


* Re: [PATCH v7 22/27] binfmt_elf: Extract .note.gnu.property from an ELF file
From: Peter Zijlstra @ 2019-06-18 12:55 UTC (permalink / raw)
  To: Florian Weimer
  Cc: Dave Martin, Thomas Gleixner, Yu-cheng Yu, x86, H. Peter Anvin,
	Ingo Molnar, linux-kernel, linux-doc, linux-mm, linux-arch,
	linux-api, Arnd Bergmann, Andy Lutomirski, Balbir Singh,
	Borislav Petkov, Cyrill Gorcunov, Dave Hansen,
	Eugene Syromiatnikov, H.J. Lu, Jann Horn, Jonathan Corbet,
	Kees Cook, Mike Kravetz, Nad
In-Reply-To: <87ef3r9i2j.fsf@oldenburg2.str.redhat.com>

On Tue, Jun 18, 2019 at 02:47:00PM +0200, Florian Weimer wrote:
> * Peter Zijlstra:
> 
> > I'm not sure I read Thomas' comment like that. In my reading keeping the
> > PT_NOTE fallback is exactly one of those 'fly workarounds'. By not
> > supporting PT_NOTE only the 'fine' people already shit^Hpping this out
> > of tree are affected, and we don't have to care about them at all.
> 
> Just to be clear here: There was an ABI document that required PT_NOTE
> parsing.

URGH.

> The Linux kernel does *not* define the x86-64 ABI, it only
> implements it.  The authoritative source should be the ABI document.
>
> In this particular case, so far anyone implementing this ABI extension
> tried to provide value by changing it, sometimes successfully.  Which
> makes me wonder why we even bother to maintain ABI documentation.  The
> kernel is just very late to the party.

How can the kernel be late to the party if all of this is spinning
wheels without kernel support?


* Re: [PATCH v7 22/27] binfmt_elf: Extract .note.gnu.property from an ELF file
From: Florian Weimer @ 2019-06-18 12:47 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Dave Martin, Thomas Gleixner, Yu-cheng Yu, x86, H. Peter Anvin,
	Ingo Molnar, linux-kernel, linux-doc, linux-mm, linux-arch,
	linux-api, Arnd Bergmann, Andy Lutomirski, Balbir Singh,
	Borislav Petkov, Cyrill Gorcunov, Dave Hansen,
	Eugene Syromiatnikov, H.J. Lu, Jann Horn, Jonathan Corbet,
	Kees Cook, Mike Kravetz, Nad
In-Reply-To: <20190618124122.GH3419@hirez.programming.kicks-ass.net>

* Peter Zijlstra:

> I'm not sure I read Thomas' comment like that. In my reading keeping the
> PT_NOTE fallback is exactly one of those 'fly workarounds'. By not
> supporting PT_NOTE only the 'fine' people already shit^Hpping this out
> of tree are affected, and we don't have to care about them at all.

Just to be clear here: There was an ABI document that required PT_NOTE
parsing.  The Linux kernel does *not* define the x86-64 ABI, it only
implements it.  The authoritative source should be the ABI document.

In this particular case, so far anyone implementing this ABI extension
tried to provide value by changing it, sometimes successfully.  Which
makes me wonder why we even bother to maintain ABI documentation.  The
kernel is just very late to the party.

Thanks,
Florian


* Re: [PATCH v7 22/27] binfmt_elf: Extract .note.gnu.property from an ELF file
From: Peter Zijlstra @ 2019-06-18 12:41 UTC (permalink / raw)
  To: Dave Martin
  Cc: Thomas Gleixner, Florian Weimer, Yu-cheng Yu, x86, H. Peter Anvin,
	Ingo Molnar, linux-kernel, linux-doc, linux-mm, linux-arch,
	linux-api, Arnd Bergmann, Andy Lutomirski, Balbir Singh,
	Borislav Petkov, Cyrill Gorcunov, Dave Hansen,
	Eugene Syromiatnikov, H.J. Lu, Jann Horn, Jonathan Corbet,
	Kees Cook, Mike Kravetz, N
In-Reply-To: <20190618091248.GB2790@e103592.cambridge.arm.com>

On Tue, Jun 18, 2019 at 10:12:50AM +0100, Dave Martin wrote:
> On Mon, Jun 17, 2019 at 02:20:40PM +0200, Thomas Gleixner wrote:
> > On Mon, 17 Jun 2019, Florian Weimer wrote:
> > > * Dave Martin:
> > > > On Tue, Jun 11, 2019 at 12:31:34PM -0700, Yu-cheng Yu wrote:
> > > >> We can probably check PT_GNU_PROPERTY first, and fallback (based on ld-linux
> > > >> version?) to PT_NOTE scanning?
> > > >
> > > > For arm64, we can check for PT_GNU_PROPERTY and then give up
> > > > unconditionally.
> > > >
> > > > For x86, we would fall back to PT_NOTE scanning, but this will add a bit
> > > > of cost to binaries that don't have NT_GNU_PROPERTY_TYPE_0.  The ld.so
> > > > version doesn't tell you what ELF ABI a given executable conforms to.
> > > >
> > > > Since this sounds like it's largely a distro-specific issue, maybe there
> > > > could be a Kconfig option to turn the fallback PT_NOTE scanning on?
> > > 
> > > I'm worried that this causes interop issues similarly to what we see
> > > with VSYSCALL today.  If we need both and a way to disable it, it should
> > > be something like a personality flag which can be configured for each
> > > process tree separately.  Ideally, we'd settle on one correct approach
> > > (i.e., either always process both, or only process PT_GNU_PROPERTY) and
> > > enforce that.
> > 
> > > Choose one, and only the one which makes technical sense and is not some
> > > horrible vehicle.
> > > 
> > > Every time we did those 'oh we need to make x fly workarounds' we regretted
> > > it sooner rather than later.
> 
> So I guess that points to keeping PT_NOTE scanning always available as a
> fallback on x86.  This sucks a bit, but if there are binaries already in
> the wild that rely on this, I don't think we have much choice...

I'm not sure I read Thomas' comment like that. In my reading keeping the
PT_NOTE fallback is exactly one of those 'fly workarounds'. By not
supporting PT_NOTE only the 'fine' people already shit^Hpping this out
of tree are affected, and we don't have to care about them at all.


* Re: [PATCH] mm, memcg: Report number of memcg caches in slabinfo
From: Michal Hocko @ 2019-06-18 12:37 UTC (permalink / raw)
  To: Waiman Long
  Cc: Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	Andrew Morton, linux-mm, linux-kernel, Roman Gushchin,
	Johannes Weiner, Shakeel Butt, Vladimir Davydov, linux-api
In-Reply-To: <9e165eae-e354-04c4-6362-0f80fe819469@redhat.com>

On Mon 17-06-19 10:50:23, Waiman Long wrote:
> On 6/17/19 10:38 AM, Michal Hocko wrote:
> > [Cc linux-api]
> >
> > On Mon 17-06-19 10:21:49, Waiman Long wrote:
> >> There are concerns about memory leaks from extensive use of memory
> >> cgroups as each memory cgroup creates its own set of kmem caches. There
> >> is a possibility that the memcg kmem caches may remain even after the
> >> memory cgroup removal.
> >>
> >> Therefore, it will be useful to show how many memcg caches are present
> >> for each of the kmem caches.
> > How is a user going to use that information?  Btw. Don't we have an
> > interface to display the number of (dead) cgroups?
> 
> The interface to report dead cgroups is for cgroup v2 (cgroup.stat)
> only. I don't think there is a way to find that for cgroup v1.

Doesn't debug_legacy_files provide the information for both cgroups
APIs?

> Also the
> number of memcg kmem caches may not be the same as the number of
> memcg's. It can range from 0 to above the number of memcg's.  So it is
> an interesting number by itself.

Is this useful enough to put into slabinfo? Doesn't this sound more like
a debugfs kind of thing?

> From the user perspective, if the number is way above the number of
> memcg's, there is probably something wrong there.

-- 
Michal Hocko
SUSE Labs


* Re: [PATCH v7 22/27] binfmt_elf: Extract .note.gnu.property from an ELF file
From: Dave Martin @ 2019-06-18  9:12 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Florian Weimer, Yu-cheng Yu, x86, H. Peter Anvin, Ingo Molnar,
	linux-kernel, linux-doc, linux-mm, linux-arch, linux-api,
	Arnd Bergmann, Andy Lutomirski, Balbir Singh, Borislav Petkov,
	Cyrill Gorcunov, Dave Hansen, Eugene Syromiatnikov, H.J. Lu,
	Jann Horn, Jonathan Corbet, Kees Cook, Mike Kravetz, Nadav Amit,
	Oleg
In-Reply-To: <alpine.DEB.2.21.1906171418220.1854@nanos.tec.linutronix.de>

On Mon, Jun 17, 2019 at 02:20:40PM +0200, Thomas Gleixner wrote:
> On Mon, 17 Jun 2019, Florian Weimer wrote:
> > * Dave Martin:
> > > On Tue, Jun 11, 2019 at 12:31:34PM -0700, Yu-cheng Yu wrote:
> > >> We can probably check PT_GNU_PROPERTY first, and fallback (based on ld-linux
> > >> version?) to PT_NOTE scanning?
> > >
> > > For arm64, we can check for PT_GNU_PROPERTY and then give up
> > > unconditionally.
> > >
> > > For x86, we would fall back to PT_NOTE scanning, but this will add a bit
> > > of cost to binaries that don't have NT_GNU_PROPERTY_TYPE_0.  The ld.so
> > > version doesn't tell you what ELF ABI a given executable conforms to.
> > >
> > > Since this sounds like it's largely a distro-specific issue, maybe there
> > > could be a Kconfig option to turn the fallback PT_NOTE scanning on?
> > 
> > I'm worried that this causes interop issues similarly to what we see
> > with VSYSCALL today.  If we need both and a way to disable it, it should
> > be something like a personality flag which can be configured for each
> > process tree separately.  Ideally, we'd settle on one correct approach
> > (i.e., either always process both, or only process PT_GNU_PROPERTY) and
> > enforce that.
> 
> Choose one, and only the one which makes technical sense and is not some
> horrible vehicle.
> 
> Every time we did those 'oh we need to make x fly workarounds' we regretted
> it sooner rather than later.

So I guess that points to keeping PT_NOTE scanning always available as a
fallback on x86.  This sucks a bit, but if there are binaries already in
the wild that rely on this, I don't think we have much choice...

I'd still favour a Kconfig option to allow this support to be suppressed
by arches that don't have a similar legacy to be compatible with.
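
For a sense of what the fallback involves, below is a rough sketch of
walking the notes in one PT_NOTE segment to find NT_GNU_PROPERTY_TYPE_0.
It is illustrative userspace C, not the code from the series:
find_gnu_property_note() is a hypothetical name, the segment is assumed to
have been read into a buffer already, and the note alignment is supplied by
the caller (assumed to be a power of two, typically the segment's p_align).

```c
#include <elf.h>
#include <stddef.h>
#include <string.h>

#ifndef NT_GNU_PROPERTY_TYPE_0
#define NT_GNU_PROPERTY_TYPE_0 5
#endif

/* Round v up to a multiple of align (align must be a power of two). */
static size_t note_round_up(size_t v, size_t align)
{
	return (v + align - 1) & ~(align - 1);
}

/*
 * Hypothetical helper: scan the notes in buf[0..len) and return the offset
 * of the descriptor of the first "GNU" / NT_GNU_PROPERTY_TYPE_0 note, or -1
 * if there is none or a note is truncated.
 */
static long find_gnu_property_note(const unsigned char *buf, size_t len,
				   size_t align)
{
	size_t off = 0;

	if (align < 4)
		align = 4;	/* notes are at least word aligned */

	while (off + sizeof(Elf64_Nhdr) <= len) {
		Elf64_Nhdr nh;
		size_t name_off, desc_off;

		/* Copy the header out to avoid unaligned access. */
		memcpy(&nh, buf + off, sizeof(nh));
		name_off = off + sizeof(nh);
		desc_off = note_round_up(name_off + nh.n_namesz, align);

		/* Reject notes whose name or descriptor runs past the end. */
		if (desc_off > len || nh.n_descsz > len - desc_off)
			return -1;

		if (nh.n_type == NT_GNU_PROPERTY_TYPE_0 &&
		    nh.n_namesz == 4 &&
		    memcmp(buf + name_off, "GNU", 4) == 0)
			return (long)desc_off;

		off = note_round_up(desc_off + nh.n_descsz, align);
	}
	return -1;
}
```

Every binary without the property pays for this walk over all of its
notes, which is the cost an arch without the legacy would like to
configure out.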

Cheers
---Dave


* Re: [PATCH] uapi: avoid namespace conflict in linux/posix_types.h
From: Florian Weimer @ 2019-06-18  7:44 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Joseph Myers, Arnd Bergmann, Linux API, linux-arch, Netdev,
	Laura Abbott, Paul Burton, Deepa Dinamani,
	Linux List Kernel Mailing
In-Reply-To: <CAHk-=wjF6ek4v04w2O3CuOaauDERfdyduW+h=u9uN5ja1ObLzQ@mail.gmail.com>

* Linus Torvalds:

> On Mon, Jun 17, 2019 at 11:19 AM Florian Weimer <fweimer@redhat.com> wrote:
>> >
>> > Unlike the "val[]" thing, I don't think anybody is supposed to access
>> > those fields directly.
>>
>> Well, glibc already calls it __val …
>
> Hmm. If user space already doesn't see the "val[]" array anyway, I
> guess we could just do that in the kernel too.
>
> Looking at the glibc headers I have for fds_bits, glibc seems to do
> *both* fds_bits[] and __fds_bits[] depending on __USE_XOPEN or not.
>
> Anyway, that all implies to me that we might as well just go the truly
> mindless way, and just do the double underscores and not bother with
> renaming any files.
>
> I thought people actually might care about the "val[]" name because I
> find that in documentation, but since apparently it's already not
> visible to user space anyway, that can't be true.
>
> I guess that makes the original patch acceptable, and we should just
> do the same thing to fds_bits..

Hah.

I think Arnd's original patch already had both.  So it's ready to go in
after all?

Thanks,
Florian


* Re: [PATCH] uapi: avoid namespace conflict in linux/posix_types.h
From: Linus Torvalds @ 2019-06-17 18:48 UTC (permalink / raw)
  To: Florian Weimer
  Cc: Joseph Myers, Arnd Bergmann, Linux API, linux-arch, Netdev,
	Laura Abbott, Paul Burton, Deepa Dinamani,
	Linux List Kernel Mailing
In-Reply-To: <87a7egdqgr.fsf@oldenburg2.str.redhat.com>

On Mon, Jun 17, 2019 at 11:19 AM Florian Weimer <fweimer@redhat.com> wrote:
> >
> > Unlike the "val[]" thing, I don't think anybody is supposed to access
> > those fields directly.
>
> Well, glibc already calls it __val …

Hmm. If user space already doesn't see the "val[]" array anyway, I
guess we could just do that in the kernel too.

Looking at the glibc headers I have for fds_bits, glibc seems to do
*both* fds_bits[] and __fds_bits[] depending on __USE_XOPEN or not.

Anyway, that all implies to me that we might as well just go the truly
mindless way, and just do the double underscores and not bother with
renaming any files.

I thought people actually might care about the "val[]" name because I
find that in documentation, but since apparently it's already not
visible to user space anyway, that can't be true.

I guess that makes the original patch acceptable, and we should just
do the same thing to fds_bits..
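
As a small illustration of why the double underscores are the mindless
fix: a program may legitimately define val or fds_bits as macros before
including the header, which breaks any struct member with that name, while
names beginning with a double underscore are reserved for the
implementation.  The types below are hypothetical stand-ins, not the real
definitions from linux/posix_types.h.

```c
/* A conforming program is free to define ordinary identifiers as macros. */
#define val 42
#define fds_bits 7

/*
 * Hypothetical stand-ins for the uapi types.  The members use reserved
 * names, so the macros above cannot rewrite them.
 */
typedef struct {
	int __val[2];
} example_fsid_t;

typedef struct {
	unsigned long __fds_bits[1024 / (8 * sizeof(long))];
} example_fd_set;

/*
 * Had the member been spelled plain "val", the macro above would have
 * turned its declaration into "int 42[2];" and the header would no longer
 * compile.
 */
static int example_fsid_is_zero(const example_fsid_t *id)
{
	return id->__val[0] == 0 && id->__val[1] == 0;
}
```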

                     Linus


