Linux userland API discussions


* Re: [PATCH v2 0/3] initramfs: add support for xattrs in the initial ram disk
From: Mimi Zohar @ 2019-05-10 11:49 UTC (permalink / raw)
  To: Roberto Sassu, Rob Landley, viro
  Cc: linux-security-module, linux-integrity, initramfs, linux-api,
	linux-fsdevel, linux-kernel, zohar, silviu.vlasceanu,
	dmitry.kasatkin, takondra, kamensky, hpa, arnd, james.w.mcmechan
In-Reply-To: <bf0d02fc-d6ce-ef1d-bb7d-7ca14432c6fd@huawei.com>

On Fri, 2019-05-10 at 08:56 +0200, Roberto Sassu wrote:
> On 5/9/2019 8:34 PM, Rob Landley wrote:
> > On 5/9/19 6:24 AM, Roberto Sassu wrote:

> >> The difference with another proposal
> >> (https://lore.kernel.org/patchwork/cover/888071/) is that xattrs can be
> >> included in an image without changing the image format, as opposed to
> >> defining a new one. As seen from the discussion, if a new format has to be
> >> defined, it should fix the issues of the existing format, which requires
> >> more time.
> > 
> > So you've explicitly chosen _not_ to address Y2038 while you're there.
> 
> Can you be more specific?

Right, this patch set avoids incrementing the CPIO magic number and
the resulting changes required (e.g. increasing the timestamp field
size), by including a file with the security xattrs in the CPIO.  In
either case, including the security xattrs in the initramfs header or
as a separate file, the initramfs, itself, needs to be signed.

Mimi

* Re: [PATCH v2 0/3] initramfs: add support for xattrs in the initial ram disk
From: Roberto Sassu @ 2019-05-10  6:56 UTC (permalink / raw)
  To: Rob Landley, viro
  Cc: linux-security-module, linux-integrity, initramfs, linux-api,
	linux-fsdevel, linux-kernel, zohar, silviu.vlasceanu,
	dmitry.kasatkin, takondra, kamensky, hpa, arnd, james.w.mcmechan
In-Reply-To: <fca8e601-1144-1bb8-c007-518651f624a5@landley.net>

On 5/9/2019 8:34 PM, Rob Landley wrote:
> On 5/9/19 6:24 AM, Roberto Sassu wrote:
>> This patch set aims at solving the following use case: appraise files from
>> the initial ram disk. To do that, IMA checks the signature/hash from the
>> security.ima xattr. Unfortunately, this use case cannot be implemented
>> currently, as the CPIO format does not support xattrs.
>>
>> This proposal consists in marshaling pathnames and xattrs in a file called
>> .xattr-list. They are unmarshaled by the CPIO parser after all files have
>> been extracted.
> 
> So it's in-band signalling that has a higher peak memory requirement.

This can be modified. Now I allocate the memory necessary for the path
and all xattrs of a file (max: .xattr-list size - 10 bytes). I could
process each xattr individually (max: 255 + 1 + 65536 bytes).
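
As a rough check of that second bound, here is a small C sketch. The
constants carry the same values as XATTR_NAME_MAX and XATTR_SIZE_MAX in
<linux/limits.h>; the helper name is made up for illustration, and
reading the "+ 1" as the NUL terminator of the name is an assumption.

```c
#include <stddef.h>

/* Same values as XATTR_NAME_MAX / XATTR_SIZE_MAX in <linux/limits.h>,
 * redefined so the sketch builds without kernel headers. */
#define SKETCH_XATTR_NAME_MAX 255
#define SKETCH_XATTR_SIZE_MAX 65536

/* Hypothetical helper: worst-case buffer when one xattr is processed
 * at a time (name + assumed NUL terminator + value). */
static size_t xattr_entry_peak_bytes(void)
{
	return SKETCH_XATTR_NAME_MAX + 1 + SKETCH_XATTR_SIZE_MAX;
}
```

That is 65792 bytes per xattr, independent of the size of .xattr-list.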


>> The difference with another proposal
>> (https://lore.kernel.org/patchwork/cover/888071/) is that xattrs can be
>> included in an image without changing the image format, as opposed to
>> defining a new one. As seen from the discussion, if a new format has to be
>> defined, it should fix the issues of the existing format, which requires
>> more time.
> 
> So you've explicitly chosen _not_ to address Y2038 while you're there.

Can you be more specific?

Thanks

Roberto


> Rob
> 

-- 
HUAWEI TECHNOLOGIES Duesseldorf GmbH, HRB 56063
Managing Director: Bo PENG, Jian LI, Yanli SHI

* Re: [PATCH v2 17/18] fpga: dfl: fme: add global error reporting support
From: Wu Hao @ 2019-05-10  2:23 UTC (permalink / raw)
  To: Alan Tull
  Cc: Moritz Fischer, linux-fpga, linux-kernel, linux-api, Luwei Kang,
	Ananda Ravuri, Xu Yilun
In-Reply-To: <CANk1AXRpBe=8Jh+_ZMfARSdXZmrQaN3jc0AfxoX2nP5sLESv2A@mail.gmail.com>

On Thu, May 09, 2019 at 11:27:36AM -0500, Alan Tull wrote:
> On Mon, Apr 29, 2019 at 4:13 AM Wu Hao <hao.wu@intel.com> wrote:
> 
> Hi Hao,
> 
> The changes look good.  There's one easy to fix thing that Greg has
> pointed out recently on another patch (below).
> 
> >
> > This patch adds support for global error reporting for the FPGA
> > Management Engine (FME). It introduces sysfs interfaces to
> > report different errors detected by the hardware, and allows
> > users to clear errors or inject errors for testing purposes.
> >
> > Signed-off-by: Luwei Kang <luwei.kang@intel.com>
> > Signed-off-by: Ananda Ravuri <ananda.ravuri@intel.com>
> > Signed-off-by: Xu Yilun <yilun.xu@intel.com>
> > Signed-off-by: Wu Hao <hao.wu@intel.com>
> 
> Acked-by: Alan Tull <atull@kernel.org>
> 
> > ---
> > v2: fix issues found in sysfs doc.
> >     fix returned error code issues for writable sysfs interfaces.
> >     (use -EINVAL if input doesn't match error code)
> >     reorder the sysfs groups in code.
> 
> > +static ssize_t revision_show(struct device *dev, struct device_attribute *attr,
> > +                            char *buf)
> > +{
> > +       struct device *err_dev = dev->parent;
> > +       void __iomem *base;
> > +
> > +       base = dfl_get_feature_ioaddr_by_id(err_dev, FME_FEATURE_ID_GLOBAL_ERR);
> > +
> > +       return scnprintf(buf, PAGE_SIZE, "%u\n", dfl_feature_revision(base));
> 
> Greg is discouraging use of scnprintf for sysfs attributes where it's
> not needed [1].
> 
> Please fix this up in the attributes added in this patchset.  Besides
> that, looks good, I added my Ack.

Sure, will fix them in the next patchset.

thanks a lot!

Hao

> 
> Alan
> 
> > +}
> > +static DEVICE_ATTR_RO(revision);
> 
> [1] https://lkml.org/lkml/2019/4/25/1050

* Re: [PATCH v3 09/11] platform/x86: asus-wmi: Control RGB keyboard backlight
From: Pavel Machek @ 2019-05-09 22:34 UTC (permalink / raw)
  To: Dan Murphy
  Cc: Yurii Pavlovskyi, Andy Shevchenko, Jacek Anaszewski,
	Linux LED Subsystem, Corentin Chary, Darren Hart, Andy Shevchenko,
	Daniel Drake, acpi4asus-user, Platform Driver,
	Linux Kernel Mailing List, linux-api
In-Reply-To: <2f26dd9e-ada7-8e20-c810-a647854c338c@ti.com>

Hi!

> >> Yes, please. We have common interface for LED drivers; this needs to
> >> use it.
> > 
> > That is indeed a better option and I did in fact consider this first and
> > even did a test implementation. The discoveries were:
> > 1. The WMI methods are write-only and only written all at once in a
> > transaction manner (also invoking solely first RGB-interface method has no
> > effect until some other keyboard backlight method is called).

Write-only is not a problem, right? Nor should the transactions be. Just
cache the values in the kernel.

> > 2. In addition to RGB there are several control values, which switch
> > effects, speed and enable or disable the backlight under specific
> > conditions or switch whether it is set temporarily or permanently (not that
> > these are critical functionalities, but for the sake of
> > completeness).

Yep, let's ignore that for now.

> > 3. The EC is really slow
> > # time bash -c "echo 1 > /sys/devices/platform/faustus/kbbl_set"
> > 
> > real	0m0,691s
> > user	0m0,000s
> > sys	0m0,691s
> > 
> > (please ignore the sysfs-path there, it's essentially the same code running
> > as in this patch). It is consistently the same for both temporary and permanent
> > configuration. Writing after every change would take about (6+)x of that.
> > Not that it's that unbearable though as it is not likely to be
> > done often.

Yup, this is quite ugly.

What about simply ignoring changes as they happen, and then setting
RGB channels when nothing changes for 10msec?
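
Something along these lines, as a userspace sketch only. All names here
(struct kbbl_cache, kbbl_set_channel(), kbbl_poll()) are made up for the
example and do not come from the patch; a real driver would presumably
use a delayed work item instead of polling and call the WMI method where
the comment says so.

```c
#include <stdbool.h>
#include <stdint.h>

#define KBBL_SETTLE_MS 10	/* quiet period suggested above */

/* Hypothetical cached state; none of these names come from the patch. */
struct kbbl_cache {
	uint8_t rgb[3];			/* last values requested by userspace */
	uint64_t last_change_ms;	/* time of the most recent change */
	bool dirty;			/* cache not yet written to the EC */
	unsigned int flushes;		/* number of slow EC transactions */
};

/* Record a channel change; touches only the cache, never the hardware. */
static void kbbl_set_channel(struct kbbl_cache *c, int ch, uint8_t val,
			     uint64_t now_ms)
{
	c->rgb[ch] = val;
	c->last_change_ms = now_ms;
	c->dirty = true;
}

/* Flush once nothing has changed for KBBL_SETTLE_MS.
 * Returns true if a transaction was issued. */
static bool kbbl_poll(struct kbbl_cache *c, uint64_t now_ms)
{
	if (!c->dirty || now_ms - c->last_change_ms < KBBL_SETTLE_MS)
		return false;

	/* A real driver would write c->rgb[] to the EC here. */
	c->flushes++;
	c->dirty = false;
	return true;
}
```

With that, setting R, G and B within a few milliseconds of each other
costs one EC transaction instead of three.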

> > I was not quite happy with that implementation so I opted for writing sort
> > of sysfs wrapper instead that would allow same sort of transactions as
> > provided by BIOS. I agree that it's a non-standard solution.
> > 
> > If I understood correctly, the typical current RGB led_class devices from
> > the Linux tree currently provide channels as separate LEDs. There are also
> > blink / pattern options present, I guess one could misuse them for setting
> > effects and speed. So one could make 3 devices for RGB + 3 for awake,
> > sleep, boot modes + 1 for setting effect / speed.

Take a look at "pattern" trigger. That should give you effect/speed
options. .. for single channel.

> > I'd guess the end solution might be also either something like combination
> > of both approaches (RGB leds + separate sysfs interface) or some extension
> > of the led_class device interface. Dropping support of the non-essential
> > features for the sake of uniformity of ABI would also be an option to
> > consider (exposing just three RGB LEDs with brightness only), not a happy one
> > though.
> > 
> > In any case this looks like it might need some additional research,
> > discussion, development, and a pair of iterations so I tend to separate
> > this patch from the series and post it extra after the others are through
> > to avoid dragging 10+ patches around.

Separate patch certainly makes sense.

Best regards,
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

* Re: [PATCH v3 09/11] platform/x86: asus-wmi: Control RGB keyboard backlight
From: Pavel Machek @ 2019-05-09 22:15 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Dan Murphy, Yurii Pavlovskyi, Jacek Anaszewski,
	Linux LED Subsystem, Corentin Chary, Darren Hart, Andy Shevchenko,
	Daniel Drake, acpi4asus-user, Platform Driver,
	Linux Kernel Mailing List, linux-api
In-Reply-To: <CAHp75VcSVumVg74==bM3cBcZZ2iUNDnUao6h9Q6ktcyEuAKDew@mail.gmail.com>

On Fri 2019-05-10 00:06:11, Andy Shevchenko wrote:
> On Thu, May 9, 2019 at 11:45 PM Dan Murphy <dmurphy@ti.com> wrote:
> > On 5/9/19 2:04 PM, Yurii Pavlovskyi wrote:
> > We are working on a framework for this.
> >
> > Please see this series
> > https://lore.kernel.org/patchwork/project/lkml/list/?series=390141
> >
> > It is still a work in progress
> 
> Side question:
> Have you considered converting existing color LED controllers? (It
> seems to me that your proposal lacks the idea of keeping backward
> compatibility with the existing controllers, where the user may create
> a sysfs node based on an arbitrary label, while it's good to have
> multicolor infrastructure like in your proposal. Did I miss
> something?)

That's undecided at the moment. We have enough fun trying to figure
out reasonable interface...


									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

* Re: [PATCH v3 09/11] platform/x86: asus-wmi: Control RGB keyboard backlight
From: Dan Murphy @ 2019-05-09 21:44 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Yurii Pavlovskyi, Pavel Machek, Jacek Anaszewski,
	Linux LED Subsystem, Corentin Chary, Darren Hart, Andy Shevchenko,
	Daniel Drake, acpi4asus-user, Platform Driver,
	Linux Kernel Mailing List, linux-api
In-Reply-To: <CAHp75VcSVumVg74==bM3cBcZZ2iUNDnUao6h9Q6ktcyEuAKDew@mail.gmail.com>

Andy

On 5/9/19 4:06 PM, Andy Shevchenko wrote:
> On Thu, May 9, 2019 at 11:45 PM Dan Murphy <dmurphy@ti.com> wrote:
>> On 5/9/19 2:04 PM, Yurii Pavlovskyi wrote:
>> We are working on a framework for this.
>>
>> Please see this series
>> https://lore.kernel.org/patchwork/project/lkml/list/?series=390141
>>
>> It is still a work in progress
> 
> Side question:
> Have you considered converting existing color LED controllers? (It
> seems to me that your proposal lacks the idea of keeping backward
> compatibility with the existing controllers, where the user may create
> a sysfs node based on an arbitrary label, while it's good to have
> multicolor infrastructure like in your proposal. Did I miss
> something?)
> 
> 

Yes that is part of the work that is in progress.
The LED driver should be able to register either a single color LED or a group of colored LEDs.

This can be based on a firmware entry and on which LED framework the driver chooses to register to: either the
multicolor framework or the base LED framework.  Of course we can put this in code and keep it
out of the firmware nodes; again, that's why it is WIP.

I have converted a couple of drivers in my testing that support RGB modules or have an RGB cluster used to mix
colors.

If the product wants to expose a single red LED via the label then they use legacy registration.
If the product wants to expose RGBW as a single group then the multicolor framework should be registered too.

Dan

* Re: [PATCH v3 09/11] platform/x86: asus-wmi: Control RGB keyboard backlight
From: Andy Shevchenko @ 2019-05-09 21:06 UTC (permalink / raw)
  To: Dan Murphy
  Cc: Yurii Pavlovskyi, Pavel Machek, Jacek Anaszewski,
	Linux LED Subsystem, Corentin Chary, Darren Hart, Andy Shevchenko,
	Daniel Drake, acpi4asus-user, Platform Driver,
	Linux Kernel Mailing List, linux-api
In-Reply-To: <2f26dd9e-ada7-8e20-c810-a647854c338c@ti.com>

On Thu, May 9, 2019 at 11:45 PM Dan Murphy <dmurphy@ti.com> wrote:
> On 5/9/19 2:04 PM, Yurii Pavlovskyi wrote:
> We are working on a framework for this.
>
> Please see this series
> https://lore.kernel.org/patchwork/project/lkml/list/?series=390141
>
> It is still a work in progress

Side question:
Have you considered converting existing color LED controllers? (It
seems to me that your proposal lacks the idea of keeping backward
compatibility with the existing controllers, where the user may create
a sysfs node based on an arbitrary label, while it's good to have
multicolor infrastructure like in your proposal. Did I miss
something?)

-- 
With Best Regards,
Andy Shevchenko

* Re: [PATCH v3 09/11] platform/x86: asus-wmi: Control RGB keyboard backlight
From: Dan Murphy @ 2019-05-09 20:45 UTC (permalink / raw)
  To: Yurii Pavlovskyi, Pavel Machek, Andy Shevchenko
  Cc: Jacek Anaszewski, Linux LED Subsystem, Corentin Chary,
	Darren Hart, Andy Shevchenko, Daniel Drake, acpi4asus-user,
	Platform Driver, Linux Kernel Mailing List, linux-api
In-Reply-To: <52e73640-9fbf-437b-537a-7b3dc167052f@gmail.com>

Yurii

On 5/9/19 2:04 PM, Yurii Pavlovskyi wrote:
> First of all, thanks to Andy for all the review comments!
> 
> I will implement all the ones that I didn't directly answer on as well and
> update this series shortly.
> 
> Regarding this patch,
> 
> On 08.05.19 19:12, Pavel Machek wrote:
>>> Shouldn't be the LED subsystem driver for this?
>>
>> Yes, please. We have common interface for LED drivers; this needs to
>> use it.
> 
> That is indeed a better option and I did in fact consider this first and
> even did a test implementation. The discoveries were:
> 1. The WMI methods are write-only and only written all at once in a
> transaction manner (also invoking solely first RGB-interface method has no
> effect until some other keyboard backlight method is called).
> 2. In addition to RGB there are several control values, which switch
> effects, speed and enable or disable the backlight under specific
> conditions or switch whether it is set temporarily or permanently (not that
> these are critical functionalities, but for the sake of completeness).
> 3. The EC is really slow
> # time bash -c "echo 1 > /sys/devices/platform/faustus/kbbl_set"
> 
> real	0m0,691s
> user	0m0,000s
> sys	0m0,691s
> 
> (please ignore the sysfs-path there, it's essentially the same code running
> as in this patch). It is consistently the same for both temporary and permanent
> configuration. Writing after every change would take about (6+)x of that.
> Not that it's that unbearable though as it is not likely to be done often.
> 
> I was not quite happy with that implementation so I opted for writing sort
> of sysfs wrapper instead that would allow same sort of transactions as
> provided by BIOS. I agree that it's a non-standard solution.
> 
> If I understood correctly, the typical current RGB led_class devices from
> the Linux tree currently provide channels as separate LEDs. There are also
> blink / pattern options present, I guess one could misuse them for setting
> effects and speed. So one could make 3 devices for RGB + 3 for awake,
> sleep, boot modes + 1 for setting effect / speed.
> 
> I'd guess the end solution might be also either something like combination
> of both approaches (RGB leds + separate sysfs interface) or some extension
> of the led_class device interface. Dropping support of the non-essential
> features for the sake of uniformity of ABI would also be an option to
> consider (exposing just three RGB LEDs with brightness only), not a happy one
> though.
> 
> In any case this looks like it might need some additional research,
> discussion, development, and a pair of iterations so I tend to separate
> this patch from the series and post it extra after the others are through
> to avoid dragging 10+ patches around.
> 
> Any suggestions on how to do this properly would be appreciated. That's the
> best I could come up with at the moment.
> 

We are working on a framework for this.

Please see this series
https://lore.kernel.org/patchwork/project/lkml/list/?series=390141

It is still a work in progress

> Thanks,
> Yurii
> 

* Re: [PATCH v3 09/11] platform/x86: asus-wmi: Control RGB keyboard backlight
From: Yurii Pavlovskyi @ 2019-05-09 19:04 UTC (permalink / raw)
  To: Pavel Machek, Andy Shevchenko
  Cc: Jacek Anaszewski, Linux LED Subsystem, Corentin Chary,
	Darren Hart, Andy Shevchenko, Daniel Drake, acpi4asus-user,
	Platform Driver, Linux Kernel Mailing List, linux-api
In-Reply-To: <20190508171229.GA22024@amd>

First of all, thanks to Andy for all the review comments!

I will implement all the ones that I didn't directly answer on as well and
update this series shortly.

Regarding this patch,

On 08.05.19 19:12, Pavel Machek wrote:
>> Shouldn't be the LED subsystem driver for this?
> 
> Yes, please. We have common interface for LED drivers; this needs to
> use it.

That is indeed a better option and I did in fact consider this first and
even did a test implementation. The discoveries were:
1. The WMI methods are write-only and only written all at once in a
transaction manner (also invoking solely first RGB-interface method has no
effect until some other keyboard backlight method is called).
2. In addition to RGB there are several control values, which switch
effects, speed and enable or disable the backlight under specific
conditions or switch whether it is set temporarily or permanently (not that
these are critical functionalities, but for the sake of completeness).
3. The EC is really slow
# time bash -c "echo 1 > /sys/devices/platform/faustus/kbbl_set"

real	0m0,691s
user	0m0,000s
sys	0m0,691s

(please ignore the sysfs-path there, it's essentially the same code running
as in this patch). It is consistently the same for both temporary and permanent
configuration. Writing after every change would take about (6+)x of that.
Not that it's that unbearable though as it is not likely to be done often.

I was not quite happy with that implementation so I opted for writing sort
of sysfs wrapper instead that would allow same sort of transactions as
provided by BIOS. I agree that it's a non-standard solution.

If I understood correctly, the typical current RGB led_class devices from
the Linux tree currently provide channels as separate LEDs. There are also
blink / pattern options present, I guess one could misuse them for setting
effects and speed. So one could make 3 devices for RGB + 3 for awake,
sleep, boot modes + 1 for setting effect / speed.

I'd guess the end solution might be also either something like combination
of both approaches (RGB leds + separate sysfs interface) or some extension
of the led_class device interface. Dropping support of the non-essential
features for the sake of uniformity of ABI would also be an option to
consider (exposing just three RGB LEDs with brightness only), not a happy one
though.

In any case this looks like it might need some additional research,
discussion, development, and a pair of iterations so I tend to separate
this patch from the series and post it extra after the others are through
to avoid dragging 10+ patches around.

Any suggestions on how to do this properly would be appreciated. That's the
best I could come up with at the moment.

Thanks,
Yurii

* Re: [PATCH v2 0/3] initramfs: add support for xattrs in the initial ram disk
From: Rob Landley @ 2019-05-09 18:34 UTC (permalink / raw)
  To: Roberto Sassu, viro
  Cc: linux-security-module, linux-integrity, initramfs, linux-api,
	linux-fsdevel, linux-kernel, zohar, silviu.vlasceanu,
	dmitry.kasatkin, takondra, kamensky, hpa, arnd, james.w.mcmechan
In-Reply-To: <20190509112420.15671-1-roberto.sassu@huawei.com>

On 5/9/19 6:24 AM, Roberto Sassu wrote:
> This patch set aims at solving the following use case: appraise files from
> the initial ram disk. To do that, IMA checks the signature/hash from the
> security.ima xattr. Unfortunately, this use case cannot be implemented
> currently, as the CPIO format does not support xattrs.
> 
> This proposal consists in marshaling pathnames and xattrs in a file called
> .xattr-list. They are unmarshaled by the CPIO parser after all files have
> been extracted.

So it's in-band signalling that has a higher peak memory requirement.

> The difference with another proposal
> (https://lore.kernel.org/patchwork/cover/888071/) is that xattrs can be
> included in an image without changing the image format, as opposed to
> defining a new one. As seen from the discussion, if a new format has to be
> defined, it should fix the issues of the existing format, which requires
> more time.

So you've explicitly chosen _not_ to address Y2038 while you're there.

Rob

* Re: [PATCH v2 17/18] fpga: dfl: fme: add global error reporting support
From: Alan Tull @ 2019-05-09 16:27 UTC (permalink / raw)
  To: Wu Hao
  Cc: Moritz Fischer, linux-fpga, linux-kernel, linux-api, Luwei Kang,
	Ananda Ravuri, Xu Yilun
In-Reply-To: <1556528151-17221-18-git-send-email-hao.wu@intel.com>

On Mon, Apr 29, 2019 at 4:13 AM Wu Hao <hao.wu@intel.com> wrote:

Hi Hao,

The changes look good.  There's one easy to fix thing that Greg has
pointed out recently on another patch (below).

>
> This patch adds support for global error reporting for the FPGA
> Management Engine (FME). It introduces sysfs interfaces to
> report different errors detected by the hardware, and allows
> users to clear errors or inject errors for testing purposes.
>
> Signed-off-by: Luwei Kang <luwei.kang@intel.com>
> Signed-off-by: Ananda Ravuri <ananda.ravuri@intel.com>
> Signed-off-by: Xu Yilun <yilun.xu@intel.com>
> Signed-off-by: Wu Hao <hao.wu@intel.com>

Acked-by: Alan Tull <atull@kernel.org>

> ---
> v2: fix issues found in sysfs doc.
>     fix returned error code issues for writable sysfs interfaces.
>     (use -EINVAL if input doesn't match error code)
>     reorder the sysfs groups in code.

> +static ssize_t revision_show(struct device *dev, struct device_attribute *attr,
> +                            char *buf)
> +{
> +       struct device *err_dev = dev->parent;
> +       void __iomem *base;
> +
> +       base = dfl_get_feature_ioaddr_by_id(err_dev, FME_FEATURE_ID_GLOBAL_ERR);
> +
> +       return scnprintf(buf, PAGE_SIZE, "%u\n", dfl_feature_revision(base));

Greg is discouraging use of scnprintf for sysfs attributes where it's
not needed [1].

Please fix this up in the attributes added in this patchset.  Besides
that, looks good, I added my Ack.
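
For illustration, a userspace sketch of the shape the callback might
take after the change. The kernel helpers are replaced by stand-ins:
mock_feature_revision() and SKETCH_PAGE_SIZE are invented for this
example, so this is not the actual driver code. The reasoning, as I
understand it, is that sysfs hands show() a page-sized buffer, so a
short fixed-format value does not need the length-bounded variant.

```c
#include <stdio.h>
#include <sys/types.h>

/* Stand-in for the kernel's PAGE_SIZE; callers pass a buffer this big. */
#define SKETCH_PAGE_SIZE 4096

/* Stand-in for dfl_feature_revision(base); returns a fixed value. */
static unsigned int mock_feature_revision(void)
{
	return 1;
}

/* Shape of a sysfs show() callback: format one value into buf and
 * return the number of bytes written. */
static ssize_t revision_show_sketch(char *buf)
{
	return sprintf(buf, "%u\n", mock_feature_revision());
}
```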

Alan

> +}
> +static DEVICE_ATTR_RO(revision);

[1] https://lkml.org/lkml/2019/4/25/1050

* Re: [PATCH linux-next v10 0/7] ptrace: add PTRACE_GET_SYSCALL_INFO request
From: Oleg Nesterov @ 2019-05-09 16:14 UTC (permalink / raw)
  To: Dmitry V. Levin, Andrew Morton
  Cc: Elvira Khabirova, Eugene Syromyatnikov, Andy Lutomirski,
	Benjamin Herrenschmidt, Greentime Hu, Helge Deller,
	James E.J. Bottomley, James Hogan, Kees Cook, Michael Ellerman,
	Paul Burton, Paul Mackerras, Ralf Baechle, Richard Kuo,
	Shuah Khan, Vincent Chen, linux-api, linux-hexagon, linux-kernel,
	linux-kselftest, linux-mips
In-Reply-To: <20190415234307.GA9364@altlinux.org>

On 04/16, Dmitry V. Levin wrote:
>
> [Andrew, could you take this patchset into your tree, please?]

Just in case...

I have already acked 6/7.

The other patches look good to me too; I just don't think I can actually
review these non-x86 changes.

Oleg.

* Re: [PATCH v8 05/16] sched/core: Allow sched_setattr() to use the current policy
From: Patrick Bellasi @ 2019-05-09 14:59 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, linux-pm, linux-api, Ingo Molnar, Tejun Heo,
	Rafael J . Wysocki, Vincent Guittot, Viresh Kumar, Paul Turner,
	Quentin Perret, Dietmar Eggemann, Morten Rasmussen, Juri Lelli,
	Todd Kjos, Joel Fernandes, Steve Muckle, Suren Baghdasaryan
In-Reply-To: <20190508192131.GD32547@worktop.programming.kicks-ass.net>

On 08-May 21:21, Peter Zijlstra wrote:
> On Tue, Apr 02, 2019 at 11:41:41AM +0100, Patrick Bellasi wrote:
> > diff --git a/include/uapi/linux/sched.h b/include/uapi/linux/sched.h
> > index 22627f80063e..075c610adf45 100644
> > --- a/include/uapi/linux/sched.h
> > +++ b/include/uapi/linux/sched.h
> > @@ -40,6 +40,8 @@
> >  /* SCHED_ISO: reserved but not implemented yet */
> >  #define SCHED_IDLE		5
> >  #define SCHED_DEADLINE		6
> > +/* Must be the last entry: used to sanity check attr.policy values */
> > +#define SCHED_POLICY_MAX	SCHED_DEADLINE
> 
> This is a wee bit sad to put in a uapi header; but yeah, where else :/
> 
> Another option would be something like:
> 
> enum {
> 	SCHED_NORMAL = 0,
> 	SCHED_FIFO = 1,
> 	SCHED_RR = 2,
> 	SCHED_BATCH = 3,
> 	/* SCHED_ISO = 4, reserved */
> 	SCHED_IDLE = 5,
> 	SCHED_DEADLINE = 6,
> 	SCHED_POLICY_NR
> };
> 
> >  /* Can be ORed in to make sure the process is reverted back to SCHED_NORMAL on fork */
> >  #define SCHED_RESET_ON_FORK     0x40000000
> > @@ -50,9 +52,11 @@
> >  #define SCHED_FLAG_RESET_ON_FORK	0x01
> >  #define SCHED_FLAG_RECLAIM		0x02
> >  #define SCHED_FLAG_DL_OVERRUN		0x04
> > +#define SCHED_FLAG_KEEP_POLICY		0x08
> >  
> >  #define SCHED_FLAG_ALL	(SCHED_FLAG_RESET_ON_FORK	| \
> >  			 SCHED_FLAG_RECLAIM		| \
> > -			 SCHED_FLAG_DL_OVERRUN)
> > +			 SCHED_FLAG_DL_OVERRUN		| \
> > +			 SCHED_FLAG_KEEP_POLICY)
> >  
> >  #endif /* _UAPI_LINUX_SCHED_H */
> > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > index d368ac26b8aa..20efb32e1a7e 100644
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -4907,8 +4907,17 @@ SYSCALL_DEFINE3(sched_setattr, pid_t, pid, struct sched_attr __user *, uattr,
> >  	if (retval)
> >  		return retval;
> >  
> > -	if ((int)attr.sched_policy < 0)
> > +	/*
> > +	 * A valid policy is always required from userspace, unless
> > +	 * SCHED_FLAG_KEEP_POLICY is set and the current policy
> > +	 * is enforced for this call.
> > +	 */
> > +	if (attr.sched_policy > SCHED_POLICY_MAX &&
> > +	    !(attr.sched_flags & SCHED_FLAG_KEEP_POLICY)) {
> >  		return -EINVAL;
> > +	}
> 
> And given I just looked at those darn SCHED_* things, I now note the
> above does 'funny' things when passed: attr.policy=4.

Looking more closely at the code, I see now that we don't really need that
check anymore. Indeed, v8 introduced the support to change policy
specific and independent attributes at the same time. Thus:

1. the policy validity is already checked in:

     SYSCALL sched_setattr()
       sched_setattr()
         __sched_setscheduler()
            valid_policy()

   which knows how to deal with attr.policy=4 (i.e. -EINVAL)

2. when we pass in SCHED_FLAG_KEEP_POLICY we force the current policy
   by setting attr.sched_policy = SETPARAM_POLICY, so we just need a
   non-negative policy to be defined (usually 0 by default).

Thus, I'll remove the new #define and update the check above to be just:

	if (attr.sched_flags & SCHED_FLAG_KEEP_POLICY)
		attr.sched_policy = SETPARAM_POLICY;
	else if ((int)attr.sched_policy < 0)
		return -EINVAL;

which should cover the additional case:

   you can syscall with just SCHED_FLAG_KEEP_POLICY set if you want to
   change only cross-policy attributes.
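
To make the two stages concrete, here is a userspace sketch of the
behaviour described above. struct attr_sketch, entry_check() and
valid_policy_sketch() are stand-ins invented for the example; the value
of SETPARAM_POLICY is assumed to be -1 as in kernel/sched/core.c, and
valid_policy_sketch() is only a simplified model of valid_policy().

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define SCHED_FLAG_KEEP_POLICY	0x08	/* value from the quoted patch */
#define SETPARAM_POLICY		(-1)	/* assumed, as in kernel/sched/core.c */

/* Only the two fields of struct sched_attr that the check looks at. */
struct attr_sketch {
	uint32_t sched_policy;
	uint64_t sched_flags;
};

/* Syscall-entry check as proposed above. */
static int entry_check(struct attr_sketch *attr)
{
	if (attr->sched_flags & SCHED_FLAG_KEEP_POLICY)
		attr->sched_policy = SETPARAM_POLICY;
	else if ((int)attr->sched_policy < 0)
		return -EINVAL;
	return 0;
}

/* Simplified model of valid_policy(): policies 0-3, 5 and 6 are known,
 * 4 (SCHED_ISO) is reserved and therefore rejected. */
static bool valid_policy_sketch(int policy)
{
	return policy >= 0 && policy <= 6 && policy != 4;
}
```

So attr.policy=4 passes the entry check but is still rejected by the
later validity check, while KEEP_POLICY overrides whatever policy value
userspace passed in.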

> > +	if (attr.sched_flags & SCHED_FLAG_KEEP_POLICY)
> > +		attr.sched_policy = SETPARAM_POLICY;
> >  
> >  	rcu_read_lock();
> >  	retval = -ESRCH;

-- 
#include <best/regards.h>

Patrick Bellasi

* Re: [PATCH v2 12/18] fpga: dfl: afu: add error reporting support.
From: Alan Tull @ 2019-05-09 14:41 UTC (permalink / raw)
  To: Wu Hao; +Cc: Moritz Fischer, linux-fpga, linux-kernel, linux-api, Xu Yilun
In-Reply-To: <1556528151-17221-13-git-send-email-hao.wu@intel.com>

On Mon, Apr 29, 2019 at 4:12 AM Wu Hao <hao.wu@intel.com> wrote:
>
> Error reporting is an important private feature. It reports errors
> detected on the port and accelerated function unit (AFU). It introduces
> several sysfs interfaces to allow userspace to check and clear
> errors detected by hardware.
>
> Signed-off-by: Xu Yilun <yilun.xu@intel.com>
> Signed-off-by: Wu Hao <hao.wu@intel.com>

Acked-by: Alan Tull <atull@kernel.org>

Thanks!
Alan

> ---
> v2: add more error code description for error clear sysfs in doc.
>     return -EINVAL instead of -EBUSY when input error code doesn't
>     match in error clear sysfs.
> ---
>  Documentation/ABI/testing/sysfs-platform-dfl-port |  39 ++++
>  drivers/fpga/Makefile                             |   1 +
>  drivers/fpga/dfl-afu-error.c                      | 225 ++++++++++++++++++++++
>  drivers/fpga/dfl-afu-main.c                       |   4 +
>  drivers/fpga/dfl-afu.h                            |   4 +
>  5 files changed, 273 insertions(+)
>  create mode 100644 drivers/fpga/dfl-afu-error.c

* Re: [PATCH v8 00/16] Add utilization clamping support
From: Patrick Bellasi @ 2019-05-09 13:09 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, linux-pm, linux-api, Ingo Molnar, Tejun Heo,
	Rafael J . Wysocki, Vincent Guittot, Viresh Kumar, Paul Turner,
	Quentin Perret, Dietmar Eggemann, Morten Rasmussen, Juri Lelli,
	Todd Kjos, Joel Fernandes, Steve Muckle, Suren Baghdasaryan
In-Reply-To: <20190509130215.GV2623@hirez.programming.kicks-ass.net>

On 09-May 15:02, Peter Zijlstra wrote:
> On Tue, Apr 02, 2019 at 11:41:36AM +0100, Patrick Bellasi wrote:
> > Series Organization
> > ===================
> > 
> > The series is organized into these main sections:
> > 
> >  - Patches [01-07]: Per task (primary) API
> >  - Patches [08-09]: Schedutil integration for FAIR and RT tasks
> >  - Patches [10-11]: Integration with EAS's energy_compute()
> 
> Aside from the comments already provided, I think this is starting to
> look really good.

Thanks Peter for the very useful review...
 
> Thanks!
> 
> >  - Patches [12-16]: Per task group (secondary) API
> 
> I still have to stare at these, but maybe a little later...

... I'll soon post a v9 to factor in all the last comments from this
round so that you have a better base for when you wanna start looking
at the cgroup bits.

-- 
#include <best/regards.h>

Patrick Bellasi

* Re: [PATCH v8 04/16] sched/core: uclamp: Add system default clamps
From: Patrick Bellasi @ 2019-05-09 13:04 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-kernel, linux-pm, linux-api, Ingo Molnar, Tejun Heo,
	Rafael J . Wysocki, Vincent Guittot, Viresh Kumar, Paul Turner,
	Quentin Perret, Dietmar Eggemann, Morten Rasmussen, Juri Lelli,
	Todd Kjos, Joel Fernandes, Steve Muckle, Suren Baghdasaryan
In-Reply-To: <20190509115307.GS2623@hirez.programming.kicks-ass.net>

On 09-May 13:53, Peter Zijlstra wrote:
> On Thu, May 09, 2019 at 10:10:57AM +0100, Patrick Bellasi wrote:
> > On 08-May 21:15, Peter Zijlstra wrote:
> > > On Wed, May 08, 2019 at 09:07:33PM +0200, Peter Zijlstra wrote:
> > > > On Tue, Apr 02, 2019 at 11:41:40AM +0100, Patrick Bellasi wrote:
> > > > > +static inline struct uclamp_se
> > > > > +uclamp_eff_get(struct task_struct *p, unsigned int clamp_id)
> > > > > +{
> > > > > +	struct uclamp_se uc_req = p->uclamp_req[clamp_id];
> > > > > +	struct uclamp_se uc_max = uclamp_default[clamp_id];
> > > > > +
> > > > > +	/* System default restrictions always apply */
> > > > > +	if (unlikely(uc_req.value > uc_max.value))
> > > > > +		return uc_max;
> > > > > +
> > > > > +	return uc_req;
> > > > > +}
> > > > > +
> > > > > +static inline unsigned int
> > > > > +uclamp_eff_bucket_id(struct task_struct *p, unsigned int clamp_id)
> > > > > +{
> > > > > +	struct uclamp_se uc_eff;
> > > > > +
> > > > > +	/* Task currently refcounted: use back-annotated (effective) bucket */
> > > > > +	if (p->uclamp[clamp_id].active)
> > > > > +		return p->uclamp[clamp_id].bucket_id;
> > > > > +
> > > > > +	uc_eff = uclamp_eff_get(p, clamp_id);
> > > > > +
> > > > > +	return uc_eff.bucket_id;
> > > > > +}
> > > > > +
> > > > > +unsigned int uclamp_eff_value(struct task_struct *p, unsigned int clamp_id)
> > > > > +{
> > > > > +	struct uclamp_se uc_eff;
> > > > > +
> > > > > +	/* Task currently refcounted: use back-annotated (effective) value */
> > > > > +	if (p->uclamp[clamp_id].active)
> > > > > +		return p->uclamp[clamp_id].value;
> > > > > +
> > > > > +	uc_eff = uclamp_eff_get(p, clamp_id);
> > > > > +
> > > > > +	return uc_eff.value;
> > > > > +}
> > > > 
> > > > This is 'wrong' because:
> > > > 
> > > >   uclamp_eff_value(p,id) := uclamp_eff(p,id).value
> > > 
> > > > Clearly I meant to say the above does not hold with the given
> > > implementation, while the naming would suggest it does.
> > 
> > > Not sure I completely get your point...
> 
> the point is that uclamp_eff_get() doesn't do the back annotate thing
> and therefore returns something entirely different from
> uclamp_eff_{bucket_id,value}(), where the naming would suggest it in
> fact returns the same thing.
> 
> > > > Which seems to suggest the uclamp_eff_*() functions want another name.
> > 
> > That function returns the effective value of a task, which is either:
> >  1. the back annotated value for a RUNNABLE task
> > or
> >  2. the aggregation of task-specific, system-default and cgroup values
> >     for a non RUNNABLE task.
> 
> Right, but uclamp_eff_get() doesn't do 1, while the other two do do it.
> And that is confusing.

I see, right.

> > > > Also, suppose the above would be true; does GCC really generate better
> > > > code for the LHS compared to the RHS?
> > 
> > > It generates "sane" code which implements the above logic and allows
> > to know that whenever we call uclamp_eff_value(p,id) we get the most
> > updated effective value for a task, independently from its {!}RUNNABLE
> > state.
> > 
> > > I would keep the function but, since Suren also complained about
> > the name... perhaps I should come up with a better name? Proposals?
> 
> Right, so they should move to the patch where they're needed, but I was

Yes, I'll move _value() to 10/16:

   sched/core: uclamp: Add uclamp_util_with()

where we actually need to access the clamp value and...

> wondering why you'd not written something like:
> 
> static inline
> struct uclamp_se uclamp_active(struct task_struct *p, unsigned int clamp_id)
> {
> 	if (p->uclamp[clamp_id].active)
> 		return p->uclamp[clamp_id];
> 
> 	return uclamp_eff(p, clamp_id);
> }
> 
> And then used:
> 
> 	uclamp_active(p, id).{value,bucket_id}
> 
> - OR -
> 
> have uclamp_eff() include the active thing, afaict the callsite in
> uclamp_rq_inc_id() guarantees !active.
> 
> In any case, I'm thinking the foo().member notation saves us from having
> to have two almost identical functions and the 'inline' part should get
> GCC to generate sane code.

... look into this approach, seems reasonable and actually better to read.

Thanks

-- 
#include <best/regards.h>

Patrick Bellasi

^ permalink raw reply

* Re: [PATCH v8 00/16] Add utilization clamping support
From: Peter Zijlstra @ 2019-05-09 13:02 UTC (permalink / raw)
  To: Patrick Bellasi
  Cc: linux-kernel, linux-pm, linux-api, Ingo Molnar, Tejun Heo,
	Rafael J . Wysocki, Vincent Guittot, Viresh Kumar, Paul Turner,
	Quentin Perret, Dietmar Eggemann, Morten Rasmussen, Juri Lelli,
	Todd Kjos, Joel Fernandes, Steve Muckle, Suren Baghdasaryan
In-Reply-To: <20190402104153.25404-1-patrick.bellasi@arm.com>

On Tue, Apr 02, 2019 at 11:41:36AM +0100, Patrick Bellasi wrote:
> Series Organization
> ===================
> 
> The series is organized into these main sections:
> 
>  - Patches [01-07]: Per task (primary) API
>  - Patches [08-09]: Schedutil integration for FAIR and RT tasks
>  - Patches [10-11]: Integration with EAS's energy_compute()

Aside from the comments already provided, I think this is starting to
look really good.

Thanks!

>  - Patches [12-16]: Per task group (secondary) API

I still have to stare at these, but maybe a little later...

^ permalink raw reply

* Re: [PATCH v8 11/16] sched/fair: uclamp: Add uclamp support to energy_compute()
From: Peter Zijlstra @ 2019-05-09 12:51 UTC (permalink / raw)
  To: Patrick Bellasi
  Cc: linux-kernel, linux-pm, linux-api, Ingo Molnar, Tejun Heo,
	Rafael J . Wysocki, Vincent Guittot, Viresh Kumar, Paul Turner,
	Quentin Perret, Dietmar Eggemann, Morten Rasmussen, Juri Lelli,
	Todd Kjos, Joel Fernandes, Steve Muckle, Suren Baghdasaryan
In-Reply-To: <20190402104153.25404-12-patrick.bellasi@arm.com>

On Tue, Apr 02, 2019 at 11:41:47AM +0100, Patrick Bellasi wrote:
> @@ -6484,11 +6494,29 @@ compute_energy(struct task_struct *p, int dst_cpu, struct perf_domain *pd)
>  		 * it will not appear in its pd list and will not be accounted
>  		 * by compute_energy().
>  		 */
> -		for_each_cpu_and(cpu, perf_domain_span(pd), cpu_online_mask) {
> -			util = cpu_util_next(cpu, p, dst_cpu);
> -			util = schedutil_energy_util(cpu, util);
> -			max_util = max(util, max_util);
> -			sum_util += util;
> +		for_each_cpu_and(cpu, pd_mask, cpu_online_mask) {
> +			util_cfs = cpu_util_next(cpu, p, dst_cpu);
> +
> +			/*
> +			 * Busy time computation: utilization clamping is not
> +			 * required since the ratio (sum_util / cpu_capacity)
> +			 * is already enough to scale the EM reported power
> +			 * consumption at the (eventually clamped) cpu_capacity.
> +			 */
> +			sum_util += schedutil_cpu_util(cpu, util_cfs, cpu_cap,
> +						       ENERGY_UTIL, NULL);
> +
> +			/*
> +			 * Performance domain frequency: utilization clamping
> +			 * must be considered since it affects the selection
> +			 * of the performance domain frequency.
> +			 * NOTE: in case RT tasks are running, by default the
> +			 * FREQUENCY_UTIL's utilization can be max OPP.
> +			 */
> +			tsk = cpu == dst_cpu ? p : NULL;
> +			cpu_util = schedutil_cpu_util(cpu, util_cfs, cpu_cap,
> +						      FREQUENCY_UTIL, tsk);
> +			max_util = max(max_util, cpu_util);
>  		}

That's a bit unfortunate; having to do both variants here, but I see
why. Nothing to be done about it I suppose.
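
The two aggregations in the quoted hunk can be illustrated with a small
user-space toy model. This is only a sketch under assumptions: struct
cpu_sample, clamp_ul() and aggregate_util() are hypothetical names, and
the per-CPU clamp range stands in for whatever schedutil_cpu_util()
applies for FREQUENCY_UTIL. The busy-time sum uses the raw utilization,
while the frequency estimate takes the maximum of the clamped values.

```c
/* Hypothetical per-CPU snapshot; not a kernel structure. */
struct cpu_sample {
	unsigned long util;      /* estimated utilization */
	unsigned long clamp_min; /* assumed lower clamp for this CPU */
	unsigned long clamp_max; /* assumed upper clamp for this CPU */
};

static unsigned long clamp_ul(unsigned long v, unsigned long lo,
			      unsigned long hi)
{
	if (v < lo)
		return lo;
	if (v > hi)
		return hi;
	return v;
}

/* Walk the CPUs once and produce both aggregates. */
static void aggregate_util(const struct cpu_sample *cpus, int nr,
			   unsigned long *sum_util, unsigned long *max_util)
{
	*sum_util = 0;
	*max_util = 0;

	for (int i = 0; i < nr; i++) {
		unsigned long clamped = clamp_ul(cpus[i].util,
						 cpus[i].clamp_min,
						 cpus[i].clamp_max);

		/* Busy time: unclamped utilization is summed. */
		*sum_util += cpus[i].util;

		/* Frequency selection: clamped utilization, max over CPUs. */
		if (clamped > *max_util)
			*max_util = clamped;
	}
}
```

Both values come out of the same loop, which is the "both variants"
cost mentioned above.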

^ permalink raw reply

* Re: [PATCH v8 05/16] sched/core: Allow sched_setattr() to use the current policy
From: Peter Zijlstra @ 2019-05-09 11:55 UTC (permalink / raw)
  To: Patrick Bellasi
  Cc: linux-kernel, linux-pm, linux-api, Ingo Molnar, Tejun Heo,
	Rafael J . Wysocki, Vincent Guittot, Viresh Kumar, Paul Turner,
	Quentin Perret, Dietmar Eggemann, Morten Rasmussen, Juri Lelli,
	Todd Kjos, Joel Fernandes, Steve Muckle, Suren Baghdasaryan
In-Reply-To: <20190509091807.7d3iykkn3oj4b737@e110439-lin>

On Thu, May 09, 2019 at 10:18:07AM +0100, Patrick Bellasi wrote:
> On 08-May 21:21, Peter Zijlstra wrote:
> > On Tue, Apr 02, 2019 at 11:41:41AM +0100, Patrick Bellasi wrote:
> > > diff --git a/include/uapi/linux/sched.h b/include/uapi/linux/sched.h
> > > index 22627f80063e..075c610adf45 100644
> > > --- a/include/uapi/linux/sched.h
> > > +++ b/include/uapi/linux/sched.h
> > > @@ -40,6 +40,8 @@
> > >  /* SCHED_ISO: reserved but not implemented yet */
> > >  #define SCHED_IDLE		5
> > >  #define SCHED_DEADLINE		6
> > > +/* Must be the last entry: used to sanity check attr.policy values */
> > > +#define SCHED_POLICY_MAX	SCHED_DEADLINE
> > 
> > This is a wee bit sad to put in a uapi header; but yeah, where else :/
> > 
> > Another option would be something like:
> > 
> > enum {
> > 	SCHED_NORMAL = 0,
> > 	SCHED_FIFO = 1,
> > 	SCHED_RR = 2,
> > 	SCHED_BATCH = 3,
> > 	/* SCHED_ISO = 4, reserved */
> > 	SCHED_IDLE = 5,
> > 	SCHED_DEADLINE = 6,
> > 	SCHED_POLICY_NR
> > };
> 
> I just wanted to minimize the changes by keeping the same structure...
> If you prefer the above I can add a refactoring patch just to update
> existing definitions before adding this patch...

Right; I've no idea really. The thing that started all this was adding
that define to UAPI. Maybe we can do without it and instead put in a
comment to check sched_setattr() any time we add a new policy and just
hard code the thing.
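
For comparison, a minimal user-space sketch of the enum variant with a
trailing sentinel. The POLICY_* names, policy_in_range() and
policy_is_known() are hypothetical and chosen so they do not clash with
the SCHED_* macros from <sched.h>; the numeric values follow the enum
quoted above.

```c
#include <stdbool.h>

enum policy_id {
	POLICY_NORMAL = 0,
	POLICY_FIFO = 1,
	POLICY_RR = 2,
	POLICY_BATCH = 3,
	/* 4 is reserved (SCHED_ISO in the quoted header) */
	POLICY_IDLE = 5,
	POLICY_DEADLINE = 6,
	POLICY_NR /* sentinel: one past the last defined policy */
};

/* Range check only: the sentinel tracks the last entry automatically. */
static bool policy_in_range(int policy)
{
	return policy >= 0 && policy < POLICY_NR;
}

/* Stricter check that also rejects the reserved slot. */
static bool policy_is_known(int policy)
{
	return policy_in_range(policy) && policy != 4;
}
```

Note that a plain range check still accepts the reserved value 4, so
whichever form is used, the sanity check needs to account for the hole.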

^ permalink raw reply

* Re: [PATCH v8 04/16] sched/core: uclamp: Add system default clamps
From: Peter Zijlstra @ 2019-05-09 11:53 UTC (permalink / raw)
  To: Patrick Bellasi
  Cc: linux-kernel, linux-pm, linux-api, Ingo Molnar, Tejun Heo,
	Rafael J . Wysocki, Vincent Guittot, Viresh Kumar, Paul Turner,
	Quentin Perret, Dietmar Eggemann, Morten Rasmussen, Juri Lelli,
	Todd Kjos, Joel Fernandes, Steve Muckle, Suren Baghdasaryan
In-Reply-To: <20190509091057.ckef2ley4eswyzds@e110439-lin>

On Thu, May 09, 2019 at 10:10:57AM +0100, Patrick Bellasi wrote:
> On 08-May 21:15, Peter Zijlstra wrote:
> > On Wed, May 08, 2019 at 09:07:33PM +0200, Peter Zijlstra wrote:
> > > On Tue, Apr 02, 2019 at 11:41:40AM +0100, Patrick Bellasi wrote:
> > > > +static inline struct uclamp_se
> > > > +uclamp_eff_get(struct task_struct *p, unsigned int clamp_id)
> > > > +{
> > > > +	struct uclamp_se uc_req = p->uclamp_req[clamp_id];
> > > > +	struct uclamp_se uc_max = uclamp_default[clamp_id];
> > > > +
> > > > +	/* System default restrictions always apply */
> > > > +	if (unlikely(uc_req.value > uc_max.value))
> > > > +		return uc_max;
> > > > +
> > > > +	return uc_req;
> > > > +}
> > > > +
> > > > +static inline unsigned int
> > > > +uclamp_eff_bucket_id(struct task_struct *p, unsigned int clamp_id)
> > > > +{
> > > > +	struct uclamp_se uc_eff;
> > > > +
> > > > +	/* Task currently refcounted: use back-annotated (effective) bucket */
> > > > +	if (p->uclamp[clamp_id].active)
> > > > +		return p->uclamp[clamp_id].bucket_id;
> > > > +
> > > > +	uc_eff = uclamp_eff_get(p, clamp_id);
> > > > +
> > > > +	return uc_eff.bucket_id;
> > > > +}
> > > > +
> > > > +unsigned int uclamp_eff_value(struct task_struct *p, unsigned int clamp_id)
> > > > +{
> > > > +	struct uclamp_se uc_eff;
> > > > +
> > > > +	/* Task currently refcounted: use back-annotated (effective) value */
> > > > +	if (p->uclamp[clamp_id].active)
> > > > +		return p->uclamp[clamp_id].value;
> > > > +
> > > > +	uc_eff = uclamp_eff_get(p, clamp_id);
> > > > +
> > > > +	return uc_eff.value;
> > > > +}
> > > 
> > > This is 'wrong' because:
> > > 
> > >   uclamp_eff_value(p,id) := uclamp_eff(p,id).value
> > 
> > Clearly I meant to say the above does not hold with the given
> > implementation, while the naming would suggest it does.
> 
> Not sure I completely get your point...

the point is that uclamp_eff_get() doesn't do the back annotate thing
and therefore returns something entirely different from
uclamp_eff_{bucket_id,value}(), where the naming would suggest it in
fact returns the same thing.

> > > Which seems to suggest the uclamp_eff_*() functions want another name.
> 
> That function returns the effective value of a task, which is either:
>  1. the back annotated value for a RUNNABLE task
> or
>  2. the aggregation of task-specific, system-default and cgroup values
>     for a non RUNNABLE task.

Right, but uclamp_eff_get() doesn't do 1, while the other two do do it.
And that is confusing.

> > > Also, suppose the above would be true; does GCC really generate better
> > > code for the LHS compared to the RHS?
> 
> It generates "sane" code which implements the above logic and allows
> to know that whenever we call uclamp_eff_value(p,id) we get the most
> updated effective value for a task, independently from its {!}RUNNABLE
> state.
> 
> I would keep the function but, since Suren also complained about
> the name... perhaps I should come up with a better name? Proposals?

Right, so they should move to the patch where they're needed, but I was
wondering why you'd not written something like:

static inline
struct uclamp_se uclamp_active(struct task_struct *p, unsigned int clamp_id)
{
	if (p->uclamp[clamp_id].active)
		return p->uclamp[clamp_id];

	return uclamp_eff(p, clamp_id);
}

And then used:

	uclamp_active(p, id).{value,bucket_id}

- OR -

have uclamp_eff() include the active thing, afaict the callsite in
uclamp_rq_inc_id() guarantees !active.

In any case, I'm thinking the foo().member notation saves us from having
to have two almost identical functions and the 'inline' part should get
GCC to generate sane code.
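
A self-contained user-space sketch of the foo().member idea, for
illustration only. struct task, UCLAMP_CNT and the default values are
stand-ins invented here, not the kernel definitions; the field and
function names follow the snippets quoted in this thread.

```c
#include <stdbool.h>

/* Toy stand-in for the kernel's struct uclamp_se. */
struct uclamp_se {
	unsigned int value;
	unsigned int bucket_id;
	bool active;
};

#define UCLAMP_CNT 2 /* assumed: one min and one max clamp */

/* Toy stand-in for the relevant part of struct task_struct. */
struct task {
	struct uclamp_se uclamp_req[UCLAMP_CNT]; /* requested clamps */
	struct uclamp_se uclamp[UCLAMP_CNT];     /* back-annotated clamps */
};

/* Assumed system defaults, made up for this example. */
static struct uclamp_se uclamp_default[UCLAMP_CNT] = {
	{ .value = 1024, .bucket_id = 4 },
	{ .value = 1024, .bucket_id = 4 },
};

/* Aggregation only: the request, restricted by the system default. */
static inline struct uclamp_se
uclamp_eff(const struct task *p, unsigned int clamp_id)
{
	struct uclamp_se uc_req = p->uclamp_req[clamp_id];
	struct uclamp_se uc_max = uclamp_default[clamp_id];

	if (uc_req.value > uc_max.value)
		return uc_max;

	return uc_req;
}

/* Back-annotated clamp if the task is refcounted, aggregation otherwise. */
static inline struct uclamp_se
uclamp_active(const struct task *p, unsigned int clamp_id)
{
	if (p->uclamp[clamp_id].active)
		return p->uclamp[clamp_id];

	return uclamp_eff(p, clamp_id);
}
```

Callers then write uclamp_active(p, id).value or
uclamp_active(p, id).bucket_id, so a single helper covers both fields.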

^ permalink raw reply

* [PATCH v2 3/3] initramfs: introduce do_readxattrs()
From: Roberto Sassu @ 2019-05-09 11:24 UTC (permalink / raw)
  To: viro
  Cc: linux-security-module, linux-integrity, initramfs, linux-api,
	linux-fsdevel, linux-kernel, zohar, silviu.vlasceanu,
	dmitry.kasatkin, takondra, kamensky, hpa, arnd, rob,
	james.w.mcmechan, Roberto Sassu
In-Reply-To: <20190509112420.15671-1-roberto.sassu@huawei.com>

This patch adds support for an alternative method to add xattrs to files in
the rootfs filesystem. Instead of extracting them directly from the ram
disk image, they are extracted from a regular file called .xattr-list, that
can be added by any ram disk generator available today.

.xattr-list can be generated by executing:

$ getfattr --absolute-names -d -P -R -e hex -m - \
      <file list> | xattr.awk -b > ${initdir}/.xattr-list

where the content of the xattr.awk script is:

#! /usr/bin/awk -f
{
  if (!length($0)) {
    printf("%.10x%s\0", len, file);
    for (x in xattr) {
      printf("%.8x%s\0", xattr_len[x], x);
      for (i = 0; i < length(xattr[x]) / 2; i++) {
        printf("%c", strtonum("0x"substr(xattr[x], i * 2 + 1, 2)));
      }
    }
    i = 0;
    delete xattr;
    delete xattr_len;
    next;
  };
  if (i == 0) {
    file=$3;
    len=length(file) + 8 + 1;
  }
  if (i > 0) {
    split($0, a, "=");
    xattr[a[1]]=substr(a[2], 3);
    xattr_len[a[1]]=length(a[1]) + 1 + 8 + length(xattr[a[1]]) / 2;
    len+=xattr_len[a[1]];
  };
  i++;
}
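
For readers who prefer C to awk, the byte layout the script emits for a
single path with a single xattr can be sketched as below. This is an
illustrative user-space sketch, not part of the patch: put_xattr() and
put_entry() are hypothetical helpers, and the length arithmetic simply
mirrors the script as posted (the path length field is printed with 10
hex digits but counted as 8).

```c
#include <stdio.h>
#include <string.h>

/* Append one xattr record: 8 hex digits, name, NUL, raw value bytes. */
static size_t put_xattr(char *dst, const char *name,
			const void *val, size_t val_len)
{
	size_t name_len = strlen(name);
	size_t rec_len = 8 + name_len + 1 + val_len;

	sprintf(dst, "%.8zx", rec_len);         /* NUL overwritten below */
	memcpy(dst + 8, name, name_len + 1);    /* name plus terminator */
	memcpy(dst + 8 + name_len + 1, val, val_len);

	return rec_len;
}

/* Encode one path entry carrying a single xattr; returns bytes written. */
static size_t put_entry(char *dst, const char *path, const char *name,
			const void *val, size_t val_len)
{
	size_t path_len = strlen(path);
	size_t xattr_len = 8 + strlen(name) + 1 + val_len;
	/* Same formula as the script: length(file) + 8 + 1 + xattr sizes. */
	size_t len = path_len + 8 + 1 + xattr_len;
	size_t off = 0;

	sprintf(dst, "%.10zx", len);            /* 10 hex digits on output */
	off += 10;
	memcpy(dst + off, path, path_len + 1);  /* path plus terminator */
	off += path_len + 1;
	off += put_xattr(dst + off, name, val, val_len);

	return off;
}
```

With this formula the stored length is two less than the number of
bytes written for the entry. The reader in the diff below subtracts
sizeof(sizeof(hdr.p_size)), which is sizeof(size_t) and therefore 8 on
typical 64-bit builds, so the two sides agree in that configuration.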

Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
---
 init/initramfs.c | 89 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 89 insertions(+)

diff --git a/init/initramfs.c b/init/initramfs.c
index 98c2aa4b5ab4..91f35a84c592 100644
--- a/init/initramfs.c
+++ b/init/initramfs.c
@@ -11,6 +11,9 @@
 #include <linux/utime.h>
 #include <linux/file.h>
 
+#define XATTR_LIST_FILENAME ".xattr-list"
+
+
 static ssize_t __init xwrite(int fd, const char *p, size_t count)
 {
 	ssize_t out = 0;
@@ -451,6 +454,91 @@ static int __init do_setxattrs(void)
 	return 0;
 }
 
+struct path_hdr {
+	char p_size[10]; /* total size including p_size field */
+	char p_data[];  /* <path>\0<xattrs> */
+};
+
+static int __init do_readxattrs(void)
+{
+	struct path_hdr hdr;
+	char str[sizeof(hdr.p_size) + 1];
+	unsigned long file_entry_size;
+	size_t size, name_buf_size, total_size;
+	struct kstat st;
+	int ret, fd;
+
+	ret = vfs_lstat(XATTR_LIST_FILENAME, &st);
+	if (ret < 0)
+		return ret;
+
+	total_size = st.size;
+
+	fd = ksys_open(XATTR_LIST_FILENAME, O_RDONLY, 0);
+	if (fd < 0)
+		return fd;
+
+	while (total_size) {
+		size = ksys_read(fd, (char *)&hdr, sizeof(hdr));
+		if (size != sizeof(hdr)) {
+			ret = -EIO;
+			goto out;
+		}
+
+		total_size -= size;
+
+		memcpy(str, hdr.p_size, sizeof(hdr.p_size));
+		ret = kstrtoul(str, 16, &file_entry_size);
+		if (ret < 0)
+			goto out;
+
+		file_entry_size -= sizeof(sizeof(hdr.p_size));
+		if (file_entry_size > total_size) {
+			ret = -EINVAL;
+			goto out;
+		}
+
+		name_buf = vmalloc(file_entry_size);
+		if (!name_buf) {
+			ret = -ENOMEM;
+			goto out;
+		}
+
+		size = ksys_read(fd, name_buf, file_entry_size);
+		if (size != file_entry_size) {
+			ret = -EIO;
+			goto out_free;
+		}
+
+		total_size -= size;
+
+		name_buf_size = strnlen(name_buf, file_entry_size);
+		if (name_buf_size == file_entry_size) {
+			ret = -EINVAL;
+			goto out_free;
+		}
+
+		xattr_buf = name_buf + name_buf_size + 1;
+		xattr_len = file_entry_size - name_buf_size - 1;
+
+		ret = do_setxattrs();
+		vfree(name_buf);
+		name_buf = NULL;
+
+		if (ret < 0)
+			break;
+	}
+out_free:
+	vfree(name_buf);
+out:
+	ksys_close(fd);
+
+	if (ret < 0)
+		error("Unable to parse xattrs");
+
+	return ret;
+}
+
 static __initdata int (*actions[])(void) = {
 	[Start]		= do_start,
 	[Collect]	= do_collect,
@@ -554,6 +642,7 @@ static char * __init unpack_to_rootfs(char *buf, unsigned long len)
 		buf += my_inptr;
 		len -= my_inptr;
 	}
+	do_readxattrs();
 	dir_utime();
 	kfree(name_buf);
 	kfree(symlink_buf);
-- 
2.17.1

^ permalink raw reply related

* [PATCH v2 2/3] initramfs: set extended attributes
From: Roberto Sassu @ 2019-05-09 11:24 UTC (permalink / raw)
  To: viro
  Cc: linux-security-module, linux-integrity, initramfs, linux-api,
	linux-fsdevel, linux-kernel, zohar, silviu.vlasceanu,
	dmitry.kasatkin, takondra, kamensky, hpa, arnd, rob,
	james.w.mcmechan, Roberto Sassu
In-Reply-To: <20190509112420.15671-1-roberto.sassu@huawei.com>

From: Mimi Zohar <zohar@linux.vnet.ibm.com>

This patch adds xattrs to a file, with name and value taken from a supplied
buffer. The data format is:

<xattr #N data len (ASCII, 8 chars)><xattr #N name>\0<xattr #N value>

[kamensky: fixed restoring of xattrs for symbolic links by using
           sys_lsetxattr() instead of sys_setxattr()]

[sassu: removed state management, kept only do_setxattrs(), replaced
        sys_lsetxattr() with ksys_lsetxattr(), added check for
        xattr_entry_size, added check for hdr->c_size, replaced strlen()
        with strnlen()]
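
The record format described above can be exercised in user space with a
small decoder. This is a hedged sketch rather than the kernel code:
struct xattr_rec and parse_xattr_rec() are hypothetical names, and the
checks only approximate those in do_setxattrs().

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

#define XATTR_HDR_LEN 8 /* ASCII hex length field, as in the format above */

struct xattr_rec {
	const char *name;  /* points into the input buffer */
	const char *value; /* raw bytes, not NUL-terminated */
	size_t value_len;
};

/*
 * Parse one record from buf, which holds avail bytes.
 * Returns the record size consumed, or 0 if the record is malformed.
 */
static size_t parse_xattr_rec(const char *buf, size_t avail,
			      struct xattr_rec *out)
{
	char str[XATTR_HDR_LEN + 1];
	const char *name, *nul;
	unsigned long entry_size;
	size_t payload;

	if (avail < XATTR_HDR_LEN)
		return 0;

	/* Accept only hex digits in the length field. */
	for (int i = 0; i < XATTR_HDR_LEN; i++) {
		if (!isxdigit((unsigned char)buf[i]))
			return 0;
	}

	memcpy(str, buf, XATTR_HDR_LEN);
	str[XATTR_HDR_LEN] = '\0';
	entry_size = strtoul(str, NULL, 16);

	/* The size includes the length field and must fit in the buffer. */
	if (entry_size <= XATTR_HDR_LEN || entry_size > avail)
		return 0;

	payload = entry_size - XATTR_HDR_LEN;
	name = buf + XATTR_HDR_LEN;
	nul = memchr(name, '\0', payload);
	if (!nul)
		return 0; /* name is not terminated inside the record */

	out->name = name;
	out->value = nul + 1;
	out->value_len = payload - (size_t)(nul - name) - 1;

	return entry_size;
}
```

A caller would loop, advancing by the returned size until the buffer is
exhausted or a 0 return signals a malformed record.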

Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Signed-off-by: Victor Kamensky <kamensky@cisco.com>
Signed-off-by: Taras Kondratiuk <takondra@cisco.com>
Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
---
 init/initramfs.c | 63 ++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 61 insertions(+), 2 deletions(-)

diff --git a/init/initramfs.c b/init/initramfs.c
index 4749e1115eef..98c2aa4b5ab4 100644
--- a/init/initramfs.c
+++ b/init/initramfs.c
@@ -146,7 +146,8 @@ static __initdata time64_t mtime;
 
 static __initdata unsigned long ino, major, minor, nlink;
 static __initdata umode_t mode;
-static __initdata unsigned long body_len, name_len;
+static __initdata u32 name_len, xattr_len;
+static __initdata u64 body_len;
 static __initdata uid_t uid;
 static __initdata gid_t gid;
 static __initdata unsigned rdev;
@@ -218,7 +219,7 @@ static void __init read_into(char *buf, unsigned size, enum state next)
 	}
 }
 
-static __initdata char *header_buf, *symlink_buf, *name_buf;
+static __initdata char *header_buf, *symlink_buf, *name_buf, *xattr_buf;
 
 static int __init do_start(void)
 {
@@ -392,6 +393,64 @@ static int __init do_symlink(void)
 	return 0;
 }
 
+struct xattr_hdr {
+	char c_size[8]; /* total size including c_size field */
+	char c_data[];  /* <name>\0<value> */
+};
+
+static int __init do_setxattrs(void)
+{
+	char *buf = xattr_buf;
+	char *bufend = buf + xattr_len;
+	struct xattr_hdr *hdr;
+	char str[sizeof(hdr->c_size) + 1];
+
+	if (!xattr_len)
+		return 0;
+
+	str[sizeof(hdr->c_size)] = 0;
+
+	while (buf < bufend) {
+		char *xattr_name, *xattr_value;
+		unsigned long xattr_entry_size;
+		unsigned long xattr_name_size, xattr_value_size;
+		int ret;
+
+		if (buf + sizeof(hdr->c_size) > bufend) {
+			error("malformed xattrs");
+			break;
+		}
+
+		hdr = (struct xattr_hdr *)buf;
+		memcpy(str, hdr->c_size, sizeof(hdr->c_size));
+		ret = kstrtoul(str, 16, &xattr_entry_size);
+		buf += xattr_entry_size;
+		if (ret || buf > bufend || !xattr_entry_size) {
+			error("malformed xattrs");
+			break;
+		}
+
+		xattr_name = hdr->c_data;
+		xattr_name_size = strnlen(xattr_name,
+					xattr_entry_size - sizeof(hdr->c_size));
+		if (xattr_name_size == xattr_entry_size - sizeof(hdr->c_size)) {
+			error("malformed xattrs");
+			break;
+		}
+
+		xattr_value = xattr_name + xattr_name_size + 1;
+		xattr_value_size = buf - xattr_value;
+
+		ret = ksys_lsetxattr(name_buf, xattr_name, xattr_value,
+				     xattr_value_size, 0);
+
+		pr_debug("%s: %s size: %lu val: %s (ret: %d)\n", name_buf,
+			 xattr_name, xattr_value_size, xattr_value, ret);
+	}
+
+	return 0;
+}
+
 static __initdata int (*actions[])(void) = {
 	[Start]		= do_start,
 	[Collect]	= do_collect,
-- 
2.17.1

^ permalink raw reply related

* [PATCH v2 1/3] fs: add ksys_lsetxattr() wrapper
From: Roberto Sassu @ 2019-05-09 11:24 UTC (permalink / raw)
  To: viro
  Cc: linux-security-module, linux-integrity, initramfs, linux-api,
	linux-fsdevel, linux-kernel, zohar, silviu.vlasceanu,
	dmitry.kasatkin, takondra, kamensky, hpa, arnd, rob,
	james.w.mcmechan, Roberto Sassu
In-Reply-To: <20190509112420.15671-1-roberto.sassu@huawei.com>

Similarly to commit 03450e271a16 ("fs: add ksys_fchmod() and do_fchmodat()
helpers and ksys_chmod() wrapper; remove in-kernel calls to syscall"), this
patch introduces the ksys_lsetxattr() helper to avoid in-kernel calls to
the sys_lsetxattr() syscall.

Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
---
 fs/xattr.c               | 9 ++++++++-
 include/linux/syscalls.h | 3 +++
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/fs/xattr.c b/fs/xattr.c
index 0d6a6a4af861..422b3d481edb 100644
--- a/fs/xattr.c
+++ b/fs/xattr.c
@@ -484,11 +484,18 @@ SYSCALL_DEFINE5(setxattr, const char __user *, pathname,
 	return path_setxattr(pathname, name, value, size, flags, LOOKUP_FOLLOW);
 }
 
+int ksys_lsetxattr(const char __user *pathname,
+		   const char __user *name, const void __user *value,
+		   size_t size, int flags)
+{
+	return path_setxattr(pathname, name, value, size, flags, 0);
+}
+
 SYSCALL_DEFINE5(lsetxattr, const char __user *, pathname,
 		const char __user *, name, const void __user *, value,
 		size_t, size, int, flags)
 {
-	return path_setxattr(pathname, name, value, size, flags, 0);
+	return ksys_lsetxattr(pathname, name, value, size, flags);
 }
 
 SYSCALL_DEFINE5(fsetxattr, int, fd, const char __user *, name,
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index e446806a561f..b639f13cd1f8 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -1260,6 +1260,9 @@ int ksys_ipc(unsigned int call, int first, unsigned long second,
 	unsigned long third, void __user * ptr, long fifth);
 int compat_ksys_ipc(u32 call, int first, int second,
 	u32 third, u32 ptr, u32 fifth);
+int ksys_lsetxattr(const char __user *pathname,
+		   const char __user *name, const void __user *value,
+		   size_t size, int flags);
 
 /*
  * The following kernel syscall equivalents are just wrappers to fs-internal
-- 
2.17.1

^ permalink raw reply related

* [PATCH v2 0/3] initramfs: add support for xattrs in the initial ram disk
From: Roberto Sassu @ 2019-05-09 11:24 UTC (permalink / raw)
  To: viro
  Cc: linux-security-module, linux-integrity, initramfs, linux-api,
	linux-fsdevel, linux-kernel, zohar, silviu.vlasceanu,
	dmitry.kasatkin, takondra, kamensky, hpa, arnd, rob,
	james.w.mcmechan, Roberto Sassu

This patch set aims at solving the following use case: appraise files from
the initial ram disk. To do that, IMA checks the signature/hash from the
security.ima xattr. Unfortunately, this use case cannot be implemented
currently, as the CPIO format does not support xattrs.

This proposal consists in marshaling pathnames and xattrs in a file called
.xattr-list. They are unmarshaled by the CPIO parser after all files have
been extracted.

The difference from v1 (https://lkml.org/lkml/2018/11/22/1182) is that all
xattrs are stored in a single file and not per file (solves the file name
limitation issue, as it is not necessary to add a suffix to files
containing xattrs).

The difference with another proposal
(https://lore.kernel.org/patchwork/cover/888071/) is that xattrs can be
included in an image without changing the image format, as opposed to
defining a new one. As seen from the discussion, if a new format has to be
defined, it should fix the issues of the existing format, which requires
more time.

To fulfill both requirements, adding support for xattrs in a short time and
defining a new image format properly, this patch set takes an incremental
approach: it introduces a parser of xattrs that can be used either if
xattrs are in a regular file or directly added to the image (this patch set
reuses patch 9/15 of the existing proposal); in addition, it introduces a
wrapper of the xattr parser, to read xattrs from a file.

The changes introduced by this patch set don't cause any compatibility
issue: kernels without the xattr parser simply extract .xattr-list and
don't unmarshal xattrs; kernels with the xattr parser don't unmarshal
xattrs if .xattr-list is not found in the image.

From the kernel space perspective, backporting this functionality to older
kernels should be very easy. It is sufficient to add a call to the new
function do_readxattrs(). From the user space perspective, no change is
required for the use case. A new dracut module (module-setup.sh) will
execute:

getfattr --absolute-names -d -P -R -e hex -m security.ima \
    <file list> | xattr.awk -b > ${initdir}/.xattr-list

where xattr.awk is the script that marshals xattrs (see patch 3/3). The
same can be done with the initramfs-tools ram disk generator.

Changelog

v1:

- move xattr unmarshaling to CPIO parser

Mimi Zohar (1):
  initramfs: set extended attributes

Roberto Sassu (2):
  fs: add ksys_lsetxattr() wrapper
  initramfs: introduce do_readxattrs()

 fs/xattr.c               |   9 ++-
 include/linux/syscalls.h |   3 +
 init/initramfs.c         | 152 ++++++++++++++++++++++++++++++++++++++-
 3 files changed, 161 insertions(+), 3 deletions(-)

-- 
2.17.1

^ permalink raw reply

* Re: [PATCH v8 06/16] sched/core: uclamp: Extend sched_setattr() to support utilization clamping
From: Patrick Bellasi @ 2019-05-09  9:24 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Suren Baghdasaryan, LKML, linux-pm, linux-api, Ingo Molnar,
	Tejun Heo, Rafael J . Wysocki, Vincent Guittot, Viresh Kumar,
	Paul Turner, Quentin Perret, Dietmar Eggemann, Morten Rasmussen,
	Juri Lelli, Todd Kjos, Joel Fernandes, Steve Muckle
In-Reply-To: <20190508194439.GF32547@worktop.programming.kicks-ass.net>

On 08-May 21:44, Peter Zijlstra wrote:
> On Tue, May 07, 2019 at 12:13:47PM +0100, Patrick Bellasi wrote:
> > On 17-Apr 15:26, Suren Baghdasaryan wrote:
> > > On Tue, Apr 2, 2019 at 3:42 AM Patrick Bellasi <patrick.bellasi@arm.com> wrote:
> 
> > > > @@ -1056,6 +1100,13 @@ static void __init init_uclamp(void)
> > > >  #else /* CONFIG_UCLAMP_TASK */
> > > >  static inline void uclamp_rq_inc(struct rq *rq, struct task_struct *p) { }
> > > >  static inline void uclamp_rq_dec(struct rq *rq, struct task_struct *p) { }
> > > > +static inline int uclamp_validate(struct task_struct *p,
> > > > +                                 const struct sched_attr *attr)
> > > > +{
> > > > +       return -ENODEV;
> > > 
> > > ENOSYS might be more appropriate?
> > 
> > Yep, agree, thanks!
> 
> No, -ENOSYS (see the comment) is special in that it indicates the whole
> system call is unavailable; that is most certainly not the case!

Yep, noted. Thanks.

-- 
#include <best/regards.h>

Patrick Bellasi

^ permalink raw reply
