The Linux Kernel Mailing List
* Re: [PATCH v3 2/2] FPGA: Add TS-7300 FPGA manager
From: Florian Fainelli @ 2016-12-14 19:08 UTC (permalink / raw)
  To: Hartley Sweeten, Moritz Fischer
  Cc: Linux Kernel Mailing List, linux-arm-kernel, Alan Tull,
	Russell King, rmallon@gmail.com, linux-fpga@vger.kernel.org
In-Reply-To: <SN1PR0101MB1565A30EFE8AFEB319B84124D09A0@SN1PR0101MB1565.prod.exchangelabs.com>

On 12/14/2016 10:58 AM, Hartley Sweeten wrote:
> On Wednesday, December 14, 2016 11:55 AM, Florian Fainelli wrote:
>> My understanding is that, yes, this triggers the final write. You are
>> right that ts73xx_fpga_write() can be called multiple times. It sounds
>> like what my write_complete function does right now is just return that
>> we successfully completed the bitstream write, but this snippet that you
>> are quoting should actually be moved into write_complete.
> 
> Florian,
> 
> I'm in the process of getting a TS-7300 board so I can help test this. Hopefully
> I will have it by next week.

Great! I got a few things on my list that have not been submitted yet:

- tmp124 support through drivers/hwmon/lm70.c
- specific memcpy_{from,to}io accessors for ethoc from the FPGA
- serial port support for the UARTs from the FPGA

And some other things that are giving me issues at the moment, like
SPI_3WIRE support for spi-ep93xx so I can configure the tmp124 to send
alarms/have temperature thresholds.

My branch is here:

https://github.com/ffainelli/linux/tree/ts72xx

Cheers
-- 
Florian


* Re: [v2] net:ethernet:cavium:octeon:octeon_mgmt: Handle return NULL error from devm_ioremap
From: arvind Yadav @ 2016-12-14 19:05 UTC (permalink / raw)
  To: Florian Fainelli; +Cc: netdev, linux-kernel
In-Reply-To: <262b3fdb-754b-a1f1-bd6a-3b15d72063b4@gmail.com>

Hi,

As per your suggestion, I have changed the subject.

Thanks

On Thursday 15 December 2016 12:24 AM, Florian Fainelli wrote:
> On 12/14/2016 10:39 AM, arvind Yadav wrote:
>> Hi David,
>>
>> I have given my comment.
>>
>> Thanks
>> Arvind
>>
>> On Wednesday 14 December 2016 11:44 PM, David Daney wrote:
>>> On 12/14/2016 10:06 AM, arvind Yadav wrote:
>>>> Yes, I have seen this error. We have a device with very little memory.
>>>> Basically it's an OMAP2 board. We have to port Android L to it.
>>>> It has a 3.10 kernel. On this device, we were getting page
>>>> allocation failures.
>>> This makes absolutely no sense to me.  OCTEON is a mips64 SoC with a
>>> ton of memory where ioremap can never fail, and it doesn't run
>>> Android, and you are talking about OMAP2.
>>            -I just gave that as an example of where I have seen an ioremap
>> issue. Please don't relate the two. I know it will not fail now. ioremap
>> returns NULL on failure, and we should catch this error. Other MIPS SoC
>> drivers have the same check. It's just a check that will not impact any
>> functionality or performance of this driver. It will avoid a NULL pointer
>> error. If a function can return an error, we should catch it.
> Your patch subject should also be changed to insert spaces after the
> colons, so this would be:
>
> net: ethernet: cavium: octeon: octeon_mgmt:


* [v3] net: ethernet: cavium: octeon: octeon_mgmt: Handle return NULL error from devm_ioremap
From: Arvind Yadav @ 2016-12-14 19:03 UTC (permalink / raw)
  To: peter.chen, fw, david.daney, f.fainelli; +Cc: netdev, linux-kernel

If devm_ioremap() fails, it returns NULL, and the kernel can then run
into a NULL-pointer dereference. Add an error check to avoid that.

Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
---
 drivers/net/ethernet/cavium/octeon/octeon_mgmt.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/net/ethernet/cavium/octeon/octeon_mgmt.c b/drivers/net/ethernet/cavium/octeon/octeon_mgmt.c
index 4ab404f..33c2fec 100644
--- a/drivers/net/ethernet/cavium/octeon/octeon_mgmt.c
+++ b/drivers/net/ethernet/cavium/octeon/octeon_mgmt.c
@@ -1479,6 +1479,12 @@ static int octeon_mgmt_probe(struct platform_device *pdev)
 	p->agl = (u64)devm_ioremap(&pdev->dev, p->agl_phys, p->agl_size);
 	p->agl_prt_ctl = (u64)devm_ioremap(&pdev->dev, p->agl_prt_ctl_phys,
 					   p->agl_prt_ctl_size);
+	if (!p->mix || !p->agl || !p->agl_prt_ctl) {
+		dev_err(&pdev->dev, "failed to map I/O memory\n");
+		result = -ENOMEM;
+		goto err;
+	}
+
 	spin_lock_init(&p->lock);
 
 	skb_queue_head_init(&p->tx_list);
-- 
2.7.4
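
The check added by the patch above can be modelled outside the kernel. The
following is a userspace sketch only, not driver code: fake_ioremap() and
struct fake_dev are hypothetical stand-ins for devm_ioremap() and the
driver's private data, and a mapping failure is simulated by passing a size
the fake window cannot hold.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for devm_ioremap(): returns NULL on failure. */
static void *fake_ioremap(uintptr_t phys, size_t size)
{
	static char window[64];

	(void)phys;
	if (size == 0 || size > sizeof(window))
		return NULL;	/* simulated mapping failure */
	return window;
}

/* Hypothetical private data; the real driver stores the mappings as u64. */
struct fake_dev {
	uintptr_t mix;
	uintptr_t agl;
	uintptr_t agl_prt_ctl;
};

/* Map all three regions, then test them in one place, as the patch does. */
static int fake_probe(struct fake_dev *p, size_t mix_size,
		      size_t agl_size, size_t ctl_size)
{
	p->mix = (uintptr_t)fake_ioremap(0x1000, mix_size);
	p->agl = (uintptr_t)fake_ioremap(0x2000, agl_size);
	p->agl_prt_ctl = (uintptr_t)fake_ioremap(0x3000, ctl_size);

	if (!p->mix || !p->agl || !p->agl_prt_ctl)
		return -ENOMEM;	/* bail out before anything dereferences NULL */

	return 0;
}
```

Checking all three pointers after the last mapping keeps the error path in a
single spot, at the cost of not reporting which mapping failed.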


* Re: [PATCH 1/3] mm, trace: extract COMPACTION_STATUS and ZONE_TYPE to a common header
From: Michal Hocko @ 2016-12-14 19:02 UTC (permalink / raw)
  To: kbuild test robot
  Cc: kbuild-all, Andrew Morton, Vlastimil Babka, David Rientjes,
	Johannes Weiner, linux-mm, LKML
In-Reply-To: <201612150127.1D06IAf6%fengguang.wu@intel.com>

On Thu 15-12-16 01:32:06, kbuild test robot wrote:
> Hi Michal,
> 
> [auto build test ERROR on tip/perf/core]
> [also build test ERROR on v4.9 next-20161214]
> [if your patch is applied to the wrong git tree, please drop us a note to help improve the system]
> 
> url:    https://github.com/0day-ci/linux/commits/Michal-Hocko/mm-oom-add-oom-detection-tracepoints/20161214-231225
> config: x86_64-randconfig-s2-12142134 (attached as .config)
> compiler: gcc-4.4 (Debian 4.4.7-8) 4.4.7
> reproduce:
>         # save the attached .config to linux build tree
>         make ARCH=x86_64 
> 
> All error/warnings (new ones prefixed by >>):
> 
>    In file included from include/trace/trace_events.h:361,
>                     from include/trace/define_trace.h:95,
>                     from include/trace/events/compaction.h:356,
>                     from mm/compaction.c:43:
>    include/trace/events/compaction.h: In function 'trace_raw_output_mm_compaction_end':
> >> include/trace/events/compaction.h:134: error: expected expression before ',' token
>    include/trace/events/compaction.h: In function 'trace_raw_output_mm_compaction_suitable_template':
>    include/trace/events/compaction.h:195: error: expected expression before ',' token
> >> include/trace/events/compaction.h:195: warning: missing braces around initializer
>    include/trace/events/compaction.h:195: warning: (near initialization for 'symbols[0]')
> >> include/trace/events/compaction.h:195: error: initializer element is not constant
>    include/trace/events/compaction.h:195: error: (near initialization for 'symbols[0].mask')

Interesting. I am pretty sure that my config battery has
CONFIG_COMPACTION=n. Not sure which part of your config made a change.
Anyway, I've added it to my collection. And with the below diff it passes
all my configs.
---
From 921bf07b8684ded5f076904cc6baa875b52c3a1e Mon Sep 17 00:00:00 2001
From: Michal Hocko <mhocko@suse.com>
Date: Wed, 14 Dec 2016 18:56:44 +0100
Subject: [PATCH] fold me "mm, trace: extract COMPACTION_STATUS and ZONE_TYPE
 to a common header"

0-day has reported:
   In file included from include/trace/trace_events.h:361,
                    from include/trace/define_trace.h:95,
                    from include/trace/events/compaction.h:356,
                    from mm/compaction.c:43:
   include/trace/events/compaction.h: In function 'trace_raw_output_mm_compaction_end':
>> include/trace/events/compaction.h:134: error: expected expression before ',' token
   include/trace/events/compaction.h: In function 'trace_raw_output_mm_compaction_suitable_template':
   include/trace/events/compaction.h:195: error: expected expression before ',' token
>> include/trace/events/compaction.h:195: warning: missing braces around initializer
   include/trace/events/compaction.h:195: warning: (near initialization for 'symbols[0]')
>> include/trace/events/compaction.h:195: error: initializer element is not constant
   include/trace/events/compaction.h:195: error: (near initialization for 'symbols[0].mask')

CONFIG_COMPACTION=n so COMPACTION_STATUS is not defined properly.

Signed-off-by: Michal Hocko <mhocko@suse.com>
---
 include/trace/events/compaction.h | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/include/trace/events/compaction.h b/include/trace/events/compaction.h
index 2334faa56323..0a18ab6483ff 100644
--- a/include/trace/events/compaction.h
+++ b/include/trace/events/compaction.h
@@ -131,6 +131,7 @@ TRACE_EVENT(mm_compaction_begin,
 		__entry->sync ? "sync" : "async")
 );
 
+#ifdef CONFIG_COMPACTION
 TRACE_EVENT(mm_compaction_end,
 	TP_PROTO(unsigned long zone_start, unsigned long migrate_pfn,
 		unsigned long free_pfn, unsigned long zone_end, bool sync,
@@ -164,6 +165,7 @@ TRACE_EVENT(mm_compaction_end,
 		__entry->sync ? "sync" : "async",
 		__print_symbolic(__entry->status, COMPACTION_STATUS))
 );
+#endif
 
 TRACE_EVENT(mm_compaction_try_to_compact_pages,
 
@@ -192,6 +194,7 @@ TRACE_EVENT(mm_compaction_try_to_compact_pages,
 		__entry->prio)
 );
 
+#ifdef CONFIG_COMPACTION
 DECLARE_EVENT_CLASS(mm_compaction_suitable_template,
 
 	TP_PROTO(struct zone *zone,
@@ -239,7 +242,6 @@ DEFINE_EVENT(mm_compaction_suitable_template, mm_compaction_suitable,
 	TP_ARGS(zone, order, ret)
 );
 
-#ifdef CONFIG_COMPACTION
 DECLARE_EVENT_CLASS(mm_compaction_defer_template,
 
 	TP_PROTO(struct zone *zone, int order),
-- 
2.10.2

-- 
Michal Hocko
SUSE Labs
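
For readers wondering why an undefined list produces "expected expression
before ',' token": the sketch below assumes that __print_symbolic() expands
its second argument into an array initializer followed by a terminator
entry, roughly as modelled here. All names (HAVE_COMPACTION, STATUS_LIST,
struct sym, status_name) are hypothetical and only illustrate the failure
mode and the guard.

```c
#include <stddef.h>

struct sym {
	long mask;
	const char *name;
};

/* Hypothetical stand-in for CONFIG_COMPACTION; comment it out to test. */
#define HAVE_COMPACTION 1

#ifdef HAVE_COMPACTION
#define STATUS_LIST { 0, "skipped" }, { 1, "deferred" }
#else
/* Empty list: "{ STATUS_LIST, { -1, NULL } }" becomes "{ , { -1, NULL } }". */
#define STATUS_LIST
#endif

/* Guard the user of the list, as the fold-in patch guards the trace events. */
#ifdef HAVE_COMPACTION
static const char *status_name(long status)
{
	static const struct sym symbols[] = { STATUS_LIST, { -1, NULL } };
	const struct sym *s;

	for (s = symbols; s->name; s++)
		if (s->mask == status)
			return s->name;
	return "unknown";
}
#endif
```

Without the guard, the empty expansion leaves a leading comma in the
initializer, which matches the first error in the 0-day report.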


* Re: Question about regulator API
From: Mark Brown @ 2016-12-14 18:14 UTC (permalink / raw)
  To: Harald Geyer; +Cc: Liam Girdwood, linux-kernel
In-Reply-To: <E1cHAvO-0000Qg-Hv@stardust.g4.wien.funkfeuer.at>


On Wed, Dec 14, 2016 at 03:52:54PM +0100, Harald Geyer wrote:

> Thus the following constraints should be met:
> * When user space asks the driver to read a device, it needs to "claim"
>   the supply and ensure that it has been up for at least 2 seconds
>   before proceeding to read the HW. The supply would be "locked" enabled.
>   (I think this is standard regulator API.)
> * When HW failure is detected, the driver needs to tell the supply to
>   turn off for at least 2 seconds, wait for all other devices to release
>   the supply so it can actually be turned off, then wait for the off period,
>   then start over again.
>   (This is easy only with getting the supply exclusively.)

You need to use a notification to figure out when the supply is actually
off.

> * It should be possible to read multiple devices quickly when everything
>   is okay and working. (Having 6 2-second delays accumulate would be quite
>   annoying.)
>   (This won't work with exclusive supply usage.)

Similarly using a notification to discover when the supply is on would
help here.

> * Optionally: If all devices are idle the supply would be enabled if short
>   response time is desired, but disabled if power saving is desired.
> 
> Can this somehow be solved with the existing API?
> If not, do you think it would be reasonable/possible to extend the API
> to cover situations like the one described above?

This doesn't feel like a regulator API problem exactly, a lot of what
you're talking about here seems like you really need the devices to
cooperate with each other and know what they're doing in order to work
well together.  From a regulator API point of view my first thought is
to use notifications to hook into the actual power on/off transitions
and then expressing the bit where the devices all work together at a
higher level.  It may be that a lot of that higher level coordination
just falls out of normal usage patterns so doesn't need explicitly
implementing, assuming the devices are only powered up during reads
anyway.  Using regulator_disable_deferred() in the driver may help
grease the wheels in terms of avoiding needless power bounces and
delays - just wait a little while before powering down in case you need
to power up again very soon after.

You'd end up with the devices all ignoring each other but keeping track
of when the supply was last enabled and disabled and individually
keeping timers to make sure that the needed delays are taken care of.
Userspace would then turn up and read all the devices, they'd then do
the enables and disables as though they were working alone but
coordinate through the notifications.  If device A powered things up
then device B would know the power was already on via the notifications
and when it came on so could take account of that when doing its own
delays.
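
The per-device bookkeeping described above might look something like the
following userspace sketch. It is an illustration under assumptions, not
regulator API code: struct supply_tracker, supply_event() and
remaining_wait_ms() are hypothetical names, supply_event() stands for
whatever a driver would do in its enable/disable notification callback, and
the 2000 ms figure is the settle time mentioned in the question.

```c
#include <stdbool.h>
#include <stdint.h>

#define SETTLE_MS 2000	/* supply must be up this long before a read */

/* Each device keeps its own view of the shared supply. */
struct supply_tracker {
	bool on;
	uint64_t last_on_ms;	/* time of the most recent off->on transition */
};

/* Would be driven by power on/off notifications for the shared supply. */
static void supply_event(struct supply_tracker *t, bool on, uint64_t now_ms)
{
	if (on && !t->on)
		t->last_on_ms = now_ms;	/* only a real transition restarts the clock */
	t->on = on;
}

/* How long this device still has to wait before touching the hardware. */
static uint64_t remaining_wait_ms(const struct supply_tracker *t,
				  uint64_t now_ms)
{
	uint64_t up_ms;

	if (!t->on)
		return SETTLE_MS;	/* caller must enable, then wait in full */

	up_ms = now_ms - t->last_on_ms;
	return up_ms >= SETTLE_MS ? 0 : SETTLE_MS - up_ms;
}
```

With this, device B reading shortly after device A powered the supply up
only waits for the remainder of the settle time rather than a fresh two
seconds.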



* Re: [PATCH v2 3/3] kvm: svm: Use the hardware provided GPA instead of page walk
From: Brijesh Singh @ 2016-12-14 18:39 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: brijesh.singh, kvm, thomas lendacky, rkrcmar, joro, x86,
	linux-kernel, mingo, hpa, tglx, bp
In-Reply-To: <ed5a3921-00b0-9214-0767-62585fb6b166@redhat.com>


On 12/14/2016 11:23 AM, Paolo Bonzini wrote:
>
>
> On 14/12/2016 18:07, Brijesh Singh wrote:
>>>
>>
>> Since now we are going to perform multiple conditional checks before
>> concluding that it's safe to use the HW-provided GPA, how about we add two
>> functions "emulator_is_rep_string_op" and "emulator_is_two_mem_op" to
>> emulate.c and use them inside x86.c to determine whether it's
>> safe to use the HW-provided GPA?
>
> Why not export only emulator_can_use_gpa from emulate.c?  (So in the end
> leaving emulator_is_string_op in emulate.c was the right thing to do, it
> was just the test that was wrong :)).
>

Actually, I was not sure if putting emulator_can_use_gpa() in emulate.c
was the right thing - mainly because emulate.c does not deal with GPAs. I
will go with your advice and put it in emulate.c; it makes things easier :)


> The patch below is still missing the check for cross-page MMIO.  Your
> reference to the BKDG only covers MMCONFIG (sometimes referred to as
> ECAM), not MMIO in general.  Doing AND or OR into video memory for
> example is perfectly legal, and I'm fairly sure that some obscure legacy
> software does PUSH/POP into vram as well!
>
>

I used your code snippet below to detect cross-page MMIO access. After
applying these changes, the cross-page MMIO read/write unit test passes
just fine. I will include it in the patch.

 > Actually there is a nice trick you can do to support cross-page
 > MMIO access detection:

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 37cd31645d45..754d251dc611 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4549,6 +4549,7 @@ static int emulator_read_write_onepage(unsigned long addr, void *val,
  	 */
  	if (vcpu->arch.gpa_available &&
  	    !emulator_can_use_hw_gpa(ctxt) &&
+	    (addr & ~PAGE_MASK) == (exception->address & ~PAGE_MASK) &&
  	    vcpu_is_mmio_gpa(vcpu, addr, exception->address, write)) {
  		gpa = exception->address;
  		goto mmio;
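
The added condition compares only the offset within the page. A minimal
standalone sketch of that comparison follows, assuming 4 KiB pages; the
DEMO_ macros and same_page_offset() are hypothetical names, not KVM
symbols. The reading assumed here is that an access crossing a page
boundary gets split into per-page pieces, so the second piece starts at
offset 0 and no longer matches the offset of the GPA reported for the
faulting address.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_PAGE_SIZE 4096ULL
#define DEMO_PAGE_MASK (~(DEMO_PAGE_SIZE - 1))

/*
 * True when the virtual address and the hardware-reported physical
 * address point at the same byte offset inside their pages.
 */
static bool same_page_offset(uint64_t gva, uint64_t gpa)
{
	return (gva & ~DEMO_PAGE_MASK) == (gpa & ~DEMO_PAGE_MASK);
}
```

For example, a piece at virtual address 0x7f0000001ffc matches a reported
GPA of 0xfee00ffc, while the follow-on piece at 0x7f0000002000 does not, so
it would fall back to the normal page walk.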


* Re: [GIT PULL] ext4 updates for 4.10
From: Johannes Weiner @ 2016-12-14 18:49 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Theodore Ts'o, Linux Kernel Mailing List,
	linux-ext4@vger.kernel.org
In-Reply-To: <CA+55aFzpr1r7MywLFGNyxRSkzwj7G8N2AdGNM22-poLMMv5QQw@mail.gmail.com>

On Wed, Dec 14, 2016 at 09:20:43AM -0800, Linus Torvalds wrote:
> [ Johannes added to participants due to radix tree changes ]
> 
> On Tue, Dec 13, 2016 at 12:00 PM, Theodore Ts'o <tytso@mit.edu> wrote:
> >
> > This merge request includes the dax-4.0-iomap-pmd branch which is
> > needed for both ext4 and xfs dax changes to use iomap for DAX.  It
> > also includes the fscrypt branch which is needed for ubifs encryption
> > work as well as ext4 encryption and fscrypt cleanups.
> 
> Can you double-check my merge resolution with the radix tree changes
> wrt the DAX changes, please? It looked straightforward, but it's an
> area where I'd really like people to actually look again and test..

The radix tree bits look correct to me.


* Re: [v2] net:ethernet:cavium:octeon:octeon_mgmt: Handle return NULL error from devm_ioremap
From: Florian Fainelli @ 2016-12-14 18:54 UTC (permalink / raw)
  To: arvind Yadav; +Cc: netdev, linux-kernel
In-Reply-To: <a4c42b7c-89bb-efa7-e6e6-86e620ee1897@gmail.com>

On 12/14/2016 10:39 AM, arvind Yadav wrote:
> Hi David,
> 
> I have given my comment.
> 
> Thanks
> Arvind
> 
> On Wednesday 14 December 2016 11:44 PM, David Daney wrote:
>> On 12/14/2016 10:06 AM, arvind Yadav wrote:
>>> Yes, I have seen this error. We have a device with very little memory.
>>> Basically it's an OMAP2 board. We have to port Android L to it.
>>> It has a 3.10 kernel. On this device, we were getting page
>>> allocation failures.
>>
>> This makes absolutely no sense to me.  OCTEON is a mips64 SoC with a
>> ton of memory where ioremap can never fail, and it doesn't run
>> Android, and you are talking about OMAP2.
>           -I just gave that as an example of where I have seen an ioremap
> issue. Please don't relate the two. I know it will not fail now. ioremap
> returns NULL on failure, and we should catch this error. Other MIPS SoC
> drivers have the same check. It's just a check that will not impact any
> functionality or performance of this driver. It will avoid a NULL pointer
> error. If a function can return an error, we should catch it.

Your patch subject should also be changed to insert spaces after the
colons, so this would be:

net: ethernet: cavium: octeon: octeon_mgmt:
-- 
Florian


* [RFC 4/4] Introduce CONFIG_READONLY_USERMODEHELPER
From: Greg KH @ 2016-12-14 18:51 UTC (permalink / raw)
  To: kernel-hardening; +Cc: linux-kernel
In-Reply-To: <20161214185000.GA3930@kroah.com>

If you can write to kernel memory, an "easy" way to get the kernel to
run any application is to change the pointer of one of the usermode
helper program names.  To try to mitigate this, create a new config
option, CONFIG_READONLY_USERMODEHELPER.

This option only allows "predefined" binaries to be called.  A number of
drivers and subsystems allow for the name of the binary to be changed,
and this config option disables that capability, so be aware of that.

Note:  Still a proof-of-concept at this point in time, doesn't cover all
of the call_usermodehelper() calls just yet, including the "fun" of
coredumps, it's still a work in progress.

Not-Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 arch/x86/kernel/cpu/mcheck/mce.c | 12 ++++++++----
 drivers/block/drbd/drbd_int.h    |  6 +++++-
 drivers/block/drbd/drbd_main.c   |  5 +++++
 drivers/video/fbdev/uvesafb.c    | 19 ++++++++++++++-----
 fs/nfs/cache_lib.c               | 12 ++++++++++--
 include/linux/reboot.h           |  2 ++
 kernel/ksysfs.c                  |  6 +++++-
 kernel/reboot.c                  |  3 +++
 kernel/sysctl.c                  |  4 ++++
 lib/kobject_uevent.c             |  3 +++
 security/Kconfig                 | 17 +++++++++++++++++
 11 files changed, 76 insertions(+), 13 deletions(-)

diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index 00ef43233e03..92a2ef8ffe3e 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -2337,15 +2337,16 @@ static ssize_t set_bank(struct device *s, struct device_attribute *attr,
 }
 
 static ssize_t
-show_trigger(struct device *s, struct device_attribute *attr, char *buf)
+trigger_show(struct device *s, struct device_attribute *attr, char *buf)
 {
 	strcpy(buf, mce_helper);
 	strcat(buf, "\n");
 	return strlen(mce_helper) + 1;
 }
 
-static ssize_t set_trigger(struct device *s, struct device_attribute *attr,
-				const char *buf, size_t siz)
+#ifndef CONFIG_READONLY_USERMODEHELPER
+static ssize_t trigger_store(struct device *s, struct device_attribute *attr,
+			     const char *buf, size_t siz)
 {
 	char *p;
 
@@ -2358,6 +2359,10 @@ static ssize_t set_trigger(struct device *s, struct device_attribute *attr,
 
 	return strlen(mce_helper) + !!p;
 }
+static DEVICE_ATTR_RW(trigger);
+#else
+static DEVICE_ATTR_RO(trigger);
+#endif
 
 static ssize_t set_ignore_ce(struct device *s,
 			     struct device_attribute *attr,
@@ -2415,7 +2420,6 @@ static ssize_t store_int_with_restart(struct device *s,
 	return ret;
 }
 
-static DEVICE_ATTR(trigger, 0644, show_trigger, set_trigger);
 static DEVICE_INT_ATTR(tolerant, 0644, mca_cfg.tolerant);
 static DEVICE_INT_ATTR(monarch_timeout, 0644, mca_cfg.monarch_timeout);
 static DEVICE_BOOL_ATTR(dont_log_ce, 0644, mca_cfg.dont_log_ce);
diff --git a/drivers/block/drbd/drbd_int.h b/drivers/block/drbd/drbd_int.h
index a139a34f1f1e..e21ab2bcc482 100644
--- a/drivers/block/drbd/drbd_int.h
+++ b/drivers/block/drbd/drbd_int.h
@@ -75,7 +75,11 @@ extern int fault_rate;
 extern int fault_devs;
 #endif
 
-extern char drbd_usermode_helper[];
+extern
+#ifdef CONFIG_READONLY_USERMODEHELPER
+       const
+#endif
+             char drbd_usermode_helper[];
 
 
 /* This is used to stop/restart our threads.
diff --git a/drivers/block/drbd/drbd_main.c b/drivers/block/drbd/drbd_main.c
index 8f51eccc8de7..41c988e9cdf2 100644
--- a/drivers/block/drbd/drbd_main.c
+++ b/drivers/block/drbd/drbd_main.c
@@ -108,9 +108,14 @@ int proc_details;       /* Detail level in proc drbd*/
 
 /* Module parameter for setting the user mode helper program
  * to run. Default is /sbin/drbdadm */
+#ifdef CONFIG_READONLY_USERMODEHELPER
+const
+#endif
 char drbd_usermode_helper[80] = "/sbin/drbdadm";
 
+#ifndef CONFIG_READONLY_USERMODEHELPER
 module_param_string(usermode_helper, drbd_usermode_helper, sizeof(drbd_usermode_helper), 0644);
+#endif
 
 /* in 2.6.x, our device mapping and config info contains our virtual gendisks
  * as member "struct gendisk *vdisk;"
diff --git a/drivers/video/fbdev/uvesafb.c b/drivers/video/fbdev/uvesafb.c
index 98af9e02959b..0328d70a4afb 100644
--- a/drivers/video/fbdev/uvesafb.c
+++ b/drivers/video/fbdev/uvesafb.c
@@ -30,7 +30,11 @@ static struct cb_id uvesafb_cn_id = {
 	.idx = CN_IDX_V86D,
 	.val = CN_VAL_V86D_UVESAFB
 };
+#ifdef CONFIG_READONLY_USERMODEHELPER
+static const char v86d_path[PATH_MAX] = "/sbin/v86d";
+#else
 static char v86d_path[PATH_MAX] = "/sbin/v86d";
+#endif
 static char v86d_started;	/* has v86d been started by uvesafb? */
 
 static const struct fb_fix_screeninfo uvesafb_fix = {
@@ -114,7 +118,7 @@ static int uvesafb_helper_start(void)
 	};
 
 	char *argv[] = {
-		v86d_path,
+		(char *)v86d_path,
 		NULL,
 	};
 
@@ -1883,19 +1887,22 @@ static int uvesafb_setup(char *options)
 }
 #endif /* !MODULE */
 
-static ssize_t show_v86d(struct device_driver *dev, char *buf)
+static ssize_t v86d_show(struct device_driver *dev, char *buf)
 {
 	return snprintf(buf, PAGE_SIZE, "%s\n", v86d_path);
 }
 
-static ssize_t store_v86d(struct device_driver *dev, const char *buf,
+#ifndef CONFIG_READONLY_USERMODEHELPER
+static ssize_t v86d_store(struct device_driver *dev, const char *buf,
 		size_t count)
 {
 	strncpy(v86d_path, buf, PATH_MAX);
 	return count;
 }
-
-static DRIVER_ATTR(v86d, S_IRUGO | S_IWUSR, show_v86d, store_v86d);
+static DRIVER_ATTR_RW(v86d);
+#else
+static DRIVER_ATTR_RO(v86d);
+#endif
 
 static int uvesafb_init(void)
 {
@@ -2017,8 +2024,10 @@ MODULE_PARM_DESC(mode_option,
 module_param(vbemode, ushort, 0);
 MODULE_PARM_DESC(vbemode,
 	"VBE mode number to set, overrides the 'mode' option");
+#ifndef CONFIG_READONLY_USERMODEHELPER
 module_param_string(v86d, v86d_path, PATH_MAX, 0660);
 MODULE_PARM_DESC(v86d, "Path to the v86d userspace helper.");
+#endif
 
 MODULE_LICENSE("GPL");
 MODULE_AUTHOR("Michal Januszewski <spock@gentoo.org>");
diff --git a/fs/nfs/cache_lib.c b/fs/nfs/cache_lib.c
index 6de15709d024..32a739e909d2 100644
--- a/fs/nfs/cache_lib.c
+++ b/fs/nfs/cache_lib.c
@@ -20,13 +20,20 @@
 #define NFS_CACHE_UPCALL_PATHLEN 256
 #define NFS_CACHE_UPCALL_TIMEOUT 15
 
+#ifdef CONFIG_READONLY_USERMODEHELPER
+static const char nfs_cache_getent_prog[NFS_CACHE_UPCALL_PATHLEN] =
+#else
 static char nfs_cache_getent_prog[NFS_CACHE_UPCALL_PATHLEN] =
+#endif
 				"/sbin/nfs_cache_getent";
 static unsigned long nfs_cache_getent_timeout = NFS_CACHE_UPCALL_TIMEOUT;
 
+#ifndef CONFIG_READONLY_USERMODEHELPER
 module_param_string(cache_getent, nfs_cache_getent_prog,
 		sizeof(nfs_cache_getent_prog), 0600);
 MODULE_PARM_DESC(cache_getent, "Path to the client cache upcall program");
+#endif
+
 module_param_named(cache_getent_timeout, nfs_cache_getent_timeout, ulong, 0600);
 MODULE_PARM_DESC(cache_getent_timeout, "Timeout (in seconds) after which "
 		"the cache upcall is assumed to have failed");
@@ -39,7 +46,7 @@ int nfs_cache_upcall(struct cache_detail *cd, char *entry_name)
 		NULL
 	};
 	char *argv[] = {
-		nfs_cache_getent_prog,
+		(char *)nfs_cache_getent_prog,
 		cd->name,
 		entry_name,
 		NULL
@@ -48,7 +55,8 @@ int nfs_cache_upcall(struct cache_detail *cd, char *entry_name)
 
 	if (nfs_cache_getent_prog[0] == '\0')
 		goto out;
-	ret = call_usermodehelper(argv[0], argv, envp, UMH_WAIT_EXEC);
+	ret = call_usermodehelper(nfs_cache_getent_prog, argv, envp,
+				  UMH_WAIT_EXEC);
 	/*
 	 * Disable the upcall mechanism if we're getting an ENOENT or
 	 * EACCES error. The admin can re-enable it on the fly by using
diff --git a/include/linux/reboot.h b/include/linux/reboot.h
index a7ff409f386d..52a43b062942 100644
--- a/include/linux/reboot.h
+++ b/include/linux/reboot.h
@@ -68,7 +68,9 @@ extern int C_A_D; /* for sysctl */
 void ctrl_alt_del(void);
 
 #define POWEROFF_CMD_PATH_LEN	256
+#ifndef CONFIG_READONLY_USERMODEHELPER
 extern char poweroff_cmd[POWEROFF_CMD_PATH_LEN];
+#endif
 
 extern void orderly_poweroff(bool force);
 extern void orderly_reboot(void);
diff --git a/kernel/ksysfs.c b/kernel/ksysfs.c
index ee1bc1bb8feb..9158fb36cfae 100644
--- a/kernel/ksysfs.c
+++ b/kernel/ksysfs.c
@@ -44,6 +44,7 @@ static ssize_t uevent_helper_show(struct kobject *kobj,
 {
 	return sprintf(buf, "%s\n", uevent_helper);
 }
+#ifndef CONFIG_READONLY_USERMODEHELPER
 static ssize_t uevent_helper_store(struct kobject *kobj,
 				   struct kobj_attribute *attr,
 				   const char *buf, size_t count)
@@ -57,7 +58,10 @@ static ssize_t uevent_helper_store(struct kobject *kobj,
 	return count;
 }
 KERNEL_ATTR_RW(uevent_helper);
-#endif
+#else
+KERNEL_ATTR_RO(uevent_helper);
+#endif	/* CONFIG_READONLY_USERMODEHELPER */
+#endif	/* CONFIG_UEVENT_HELPER */
 
 #ifdef CONFIG_PROFILING
 static ssize_t profiling_show(struct kobject *kobj,
diff --git a/kernel/reboot.c b/kernel/reboot.c
index bd30a973fe94..1b1764f0eb30 100644
--- a/kernel/reboot.c
+++ b/kernel/reboot.c
@@ -386,6 +386,9 @@ void ctrl_alt_del(void)
 		kill_cad_pid(SIGINT, 1);
 }
 
+#ifdef CONFIG_READONLY_USERMODEHELPER
+static const
+#endif
 char poweroff_cmd[POWEROFF_CMD_PATH_LEN] = "/sbin/poweroff";
 static const char reboot_cmd[] = "/sbin/reboot";
 
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 39b3368f6de6..0b75e1aa8d82 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -662,6 +662,7 @@ static struct ctl_table kern_table[] = {
 	},
 #endif
 #ifdef CONFIG_UEVENT_HELPER
+#ifndef CONFIG_READONLY_USERMODEHELPER
 	{
 		.procname	= "hotplug",
 		.data		= &uevent_helper,
@@ -670,6 +671,7 @@ static struct ctl_table kern_table[] = {
 		.proc_handler	= proc_dostring,
 	},
 #endif
+#endif
 #ifdef CONFIG_CHR_DEV_SG
 	{
 		.procname	= "sg-big-buff",
@@ -1079,6 +1081,7 @@ static struct ctl_table kern_table[] = {
 		.proc_handler	= proc_dointvec,
 	},
 #endif
+#ifndef CONFIG_READONLY_USERMODEHELPER
 	{
 		.procname	= "poweroff_cmd",
 		.data		= &poweroff_cmd,
@@ -1086,6 +1089,7 @@ static struct ctl_table kern_table[] = {
 		.mode		= 0644,
 		.proc_handler	= proc_dostring,
 	},
+#endif
 #ifdef CONFIG_KEYS
 	{
 		.procname	= "keys",
diff --git a/lib/kobject_uevent.c b/lib/kobject_uevent.c
index 9a2b811966eb..a8f087d7687d 100644
--- a/lib/kobject_uevent.c
+++ b/lib/kobject_uevent.c
@@ -29,6 +29,9 @@
 
 u64 uevent_seqnum;
 #ifdef CONFIG_UEVENT_HELPER
+#ifdef CONFIG_READONLY_USERMODEHELPER
+const
+#endif
 char uevent_helper[UEVENT_HELPER_PATH_LEN] = CONFIG_UEVENT_HELPER_PATH;
 #endif
 #ifdef CONFIG_NET
diff --git a/security/Kconfig b/security/Kconfig
index 118f4549404e..47e8011c6261 100644
--- a/security/Kconfig
+++ b/security/Kconfig
@@ -158,6 +158,23 @@ config HARDENED_USERCOPY_PAGESPAN
 	  been removed. This config is intended to be used only while
 	  trying to find such users.
 
+config READONLY_USERMODEHELPER
+	bool "Make User Mode Helper program names read-only"
+	default n
+	help
+	  Some user mode helper program names can be changed at runtime
+	  by userspace programs.  Prevent this from happening by "hard
+	  coding" all user mode helper program names at kernel build
+	  time, moving the names into read-only memory, making it harder
+	  for any arbitrary program to be run as root if something were
+	  to go wrong.
+
+	  Note, some subsystems and drivers allow their user mode helper
+	  binary to be changed with a module parameter, sysctl, sysfs
+	  file, or some combination of these.  Enabling this option
+	  prevents the binary name from being changed, which might not be
+	  good for some systems.
+
 source security/selinux/Kconfig
 source security/smack/Kconfig
 source security/tomoyo/Kconfig
-- 
2.10.2
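
The recurring shape in this RFC is a helper path that is const when the
option is set and writable otherwise. The sketch below models that in plain
C under stated assumptions: DEMO_READONLY_HELPER stands in for
CONFIG_READONLY_USERMODEHELPER, and HELPER_CONST, demo_helper and
demo_set_helper() are hypothetical names. Unlike the patch, which compiles
the store functions out entirely, this sketch keeps a setter that reports
failure so the difference is visible at run time.

```c
#include <stddef.h>

/* Hypothetical stand-in for the config option; remove to allow writes. */
#define DEMO_READONLY_HELPER 1

#ifdef DEMO_READONLY_HELPER
#define HELPER_CONST const
#else
#define HELPER_CONST
#endif

/* With the option set, the path is const and can live in read-only data. */
static HELPER_CONST char demo_helper[64] = "/sbin/poweroff";

/* Returns 0 on success, -1 when the path is fixed at build time. */
static int demo_set_helper(const char *path)
{
#ifdef DEMO_READONLY_HELPER
	(void)path;
	return -1;
#else
	size_t i;

	for (i = 0; i + 1 < sizeof(demo_helper) && path[i]; i++)
		demo_helper[i] = path[i];
	demo_helper[i] = '\0';
	return 0;
#endif
}
```

A single qualifier macro is one way to avoid repeating the #ifdef around
every declaration; whether that reads better than the open-coded form in
the patch is a matter of taste.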


* [PATCH 3/4] Make static usermode helper binaries constant
From: Greg KH @ 2016-12-14 18:50 UTC (permalink / raw)
  To: kernel-hardening; +Cc: linux-kernel
In-Reply-To: <20161214185000.GA3930@kroah.com>


There are a number of usermode helper binaries that are "hard coded" in
the kernel today, so mark them as "const" to make it harder for someone
to change where the variables point to.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/macintosh/windfarm_core.c          | 2 +-
 drivers/net/hamradio/baycom_epp.c          | 2 +-
 drivers/pnp/pnpbios/core.c                 | 5 +++--
 drivers/staging/greybus/svc_watchdog.c     | 4 ++--
 drivers/staging/rtl8192e/rtl8192e/rtl_dm.c | 6 +++---
 fs/nfsd/nfs4layouts.c                      | 6 ++++--
 security/keys/request_key.c                | 7 ++++---
 7 files changed, 18 insertions(+), 14 deletions(-)

diff --git a/drivers/macintosh/windfarm_core.c b/drivers/macintosh/windfarm_core.c
index 465d770ab0bb..1b317cbb73cf 100644
--- a/drivers/macintosh/windfarm_core.c
+++ b/drivers/macintosh/windfarm_core.c
@@ -74,7 +74,7 @@ static inline void wf_notify(int event, void *param)
 
 static int wf_critical_overtemp(void)
 {
-	static char * critical_overtemp_path = "/sbin/critical_overtemp";
+	static const char * critical_overtemp_path = "/sbin/critical_overtemp";
 	char *argv[] = { critical_overtemp_path, NULL };
 	static char *envp[] = { "HOME=/",
 				"TERM=linux",
diff --git a/drivers/net/hamradio/baycom_epp.c b/drivers/net/hamradio/baycom_epp.c
index 78dbc44540f6..321cffa8dbe4 100644
--- a/drivers/net/hamradio/baycom_epp.c
+++ b/drivers/net/hamradio/baycom_epp.c
@@ -299,7 +299,7 @@ static inline void baycom_int_freq(struct baycom_state *bc)
  *    eppconfig_path should be setable  via /proc/sys.
  */
 
-static char eppconfig_path[256] = "/usr/sbin/eppfpga";
+static const char eppconfig_path[256] = "/usr/sbin/eppfpga";
 
 static char *envp[] = { "HOME=/", "TERM=linux", "PATH=/usr/bin:/bin", NULL };
 
diff --git a/drivers/pnp/pnpbios/core.c b/drivers/pnp/pnpbios/core.c
index c38a5b9733c8..614aae6fcc0f 100644
--- a/drivers/pnp/pnpbios/core.c
+++ b/drivers/pnp/pnpbios/core.c
@@ -98,6 +98,7 @@ static struct completion unload_sem;
  */
 static int pnp_dock_event(int dock, struct pnp_docking_station_info *info)
 {
+	static const char *sbin_pnpbios = "/sbin/pnpbios";
 	char *argv[3], **envp, *buf, *scratch;
 	int i = 0, value;
 
@@ -112,7 +113,7 @@ static int pnp_dock_event(int dock, struct pnp_docking_station_info *info)
 	 * integrated into the driver core and use the usual infrastructure
 	 * like sysfs and uevents
 	 */
-	argv[0] = "/sbin/pnpbios";
+	argv[0] = sbin_pnpbios;
 	argv[1] = "dock";
 	argv[2] = NULL;
 
@@ -139,7 +140,7 @@ static int pnp_dock_event(int dock, struct pnp_docking_station_info *info)
 			   info->location_id, info->serial, info->capabilities);
 	envp[i] = NULL;
 
-	value = call_usermodehelper(argv [0], argv, envp, UMH_WAIT_EXEC);
+	value = call_usermodehelper(sbin_pnpbios, argv, envp, UMH_WAIT_EXEC);
 	kfree(buf);
 	kfree(envp);
 	return 0;
diff --git a/drivers/staging/greybus/svc_watchdog.c b/drivers/staging/greybus/svc_watchdog.c
index 3729460fb954..db32ec0f0e80 100644
--- a/drivers/staging/greybus/svc_watchdog.c
+++ b/drivers/staging/greybus/svc_watchdog.c
@@ -44,14 +44,14 @@ static int svc_watchdog_pm_notifier(struct notifier_block *notifier,
 
 static void greybus_reset(struct work_struct *work)
 {
-	static char start_path[256] = "/system/bin/start";
+	static const char start_path[256] = "/system/bin/start";
 	static char *envp[] = {
 		"HOME=/",
 		"PATH=/sbin:/vendor/bin:/system/sbin:/system/bin:/system/xbin",
 		NULL,
 	};
 	static char *argv[] = {
-		start_path,
+		(char *)start_path,
 		"unipro_reset",
 		NULL,
 	};
diff --git a/drivers/staging/rtl8192e/rtl8192e/rtl_dm.c b/drivers/staging/rtl8192e/rtl8192e/rtl_dm.c
index 9bc284812c30..5f0c2cdf32d1 100644
--- a/drivers/staging/rtl8192e/rtl8192e/rtl_dm.c
+++ b/drivers/staging/rtl8192e/rtl8192e/rtl_dm.c
@@ -268,7 +268,7 @@ void rtl92e_dm_watchdog(struct net_device *dev)
 static void _rtl92e_dm_check_ac_dc_power(struct net_device *dev)
 {
 	struct r8192_priv *priv = rtllib_priv(dev);
-	static char *ac_dc_script = "/etc/acpi/wireless-rtl-ac-dc-power.sh";
+	static const char *ac_dc_script = "/etc/acpi/wireless-rtl-ac-dc-power.sh";
 	char *argv[] = {ac_dc_script, DRV_NAME, NULL};
 	static char *envp[] = {"HOME=/",
 			"TERM=linux",
@@ -1823,7 +1823,7 @@ static void _rtl92e_dm_check_rf_ctrl_gpio(void *data)
 	enum rt_rf_power_state eRfPowerStateToSet;
 	bool bActuallySet = false;
 	char *argv[3];
-	static char *RadioPowerPath = "/etc/acpi/events/RadioPower.sh";
+	static const char *RadioPowerPath = "/etc/acpi/events/RadioPower.sh";
 	static char *envp[] = {"HOME=/", "TERM=linux", "PATH=/usr/bin:/bin",
 			       NULL};
 
@@ -1862,7 +1862,7 @@ static void _rtl92e_dm_check_rf_ctrl_gpio(void *data)
 		else
 			argv[1] = "RFON";
 
-		argv[0] = RadioPowerPath;
+		argv[0] = (char *)RadioPowerPath;
 		argv[2] = NULL;
 		call_usermodehelper(RadioPowerPath, argv, envp, UMH_WAIT_PROC);
 	}
diff --git a/fs/nfsd/nfs4layouts.c b/fs/nfsd/nfs4layouts.c
index 42aace4fc4c8..4ce019b9d5a9 100644
--- a/fs/nfsd/nfs4layouts.c
+++ b/fs/nfsd/nfs4layouts.c
@@ -613,6 +613,7 @@ nfsd4_cb_layout_fail(struct nfs4_layout_stateid *ls)
 {
 	struct nfs4_client *clp = ls->ls_stid.sc_client;
 	char addr_str[INET6_ADDRSTRLEN];
+	static const char *nfsd_recall_failed = "/sbin/nfsd-recall-failed";
 	static char *envp[] = {
 		"HOME=/",
 		"TERM=linux",
@@ -628,12 +629,13 @@ nfsd4_cb_layout_fail(struct nfs4_layout_stateid *ls)
 		"nfsd: client %s failed to respond to layout recall. "
 		"  Fencing..\n", addr_str);
 
-	argv[0] = "/sbin/nfsd-recall-failed";
+	argv[0] = (char *)nfsd_recall_failed;
 	argv[1] = addr_str;
 	argv[2] = ls->ls_file->f_path.mnt->mnt_sb->s_id;
 	argv[3] = NULL;
 
-	error = call_usermodehelper(argv[0], argv, envp, UMH_WAIT_PROC);
+	error = call_usermodehelper(nfsd_recall_failed, argv, envp,
+				    UMH_WAIT_PROC);
 	if (error) {
 		printk(KERN_ERR "nfsd: fence failed for client %s: %d!\n",
 			addr_str, error);
diff --git a/security/keys/request_key.c b/security/keys/request_key.c
index 43affcf10b22..e79cdcd704b5 100644
--- a/security/keys/request_key.c
+++ b/security/keys/request_key.c
@@ -72,7 +72,7 @@ static void umh_keys_cleanup(struct subprocess_info *info)
 /*
  * Call a usermode helper with a specific session keyring.
  */
-static int call_usermodehelper_keys(char *path, char **argv, char **envp,
+static int call_usermodehelper_keys(const char *path, char **argv, char **envp,
 					struct key *session_keyring, int wait)
 {
 	struct subprocess_info *info;
@@ -95,6 +95,7 @@ static int call_sbin_request_key(struct key_construction *cons,
 				 const char *op,
 				 void *aux)
 {
+	static const char *request_key = "/sbin/request-key";
 	const struct cred *cred = current_cred();
 	key_serial_t prkey, sskey;
 	struct key *key = cons->key, *authkey = cons->authkey, *keyring,
@@ -161,7 +162,7 @@ static int call_sbin_request_key(struct key_construction *cons,
 
 	/* set up the argument list */
 	i = 0;
-	argv[i++] = "/sbin/request-key";
+	argv[i++] = (char *)request_key;
 	argv[i++] = (char *) op;
 	argv[i++] = key_str;
 	argv[i++] = uid_str;
@@ -172,7 +173,7 @@ static int call_sbin_request_key(struct key_construction *cons,
 	argv[i] = NULL;
 
 	/* do it */
-	ret = call_usermodehelper_keys(argv[0], argv, envp, keyring,
+	ret = call_usermodehelper_keys(request_key, argv, envp, keyring,
 				       UMH_WAIT_PROC);
 	kdebug("usermode -> 0x%x", ret);
 	if (ret >= 0) {
-- 
2.10.2

* [PATCH 2/4] drbd: rename "usermode_helper" to "drbd_usermode_helper"
From: Greg KH @ 2016-12-14 18:50 UTC (permalink / raw)
  To: kernel-hardening; +Cc: linux-kernel
In-Reply-To: <20161214185000.GA3930@kroah.com>


Nothing like having a very generic global variable in a tiny driver
subsystem to make a mess of the global namespace...

Anyway, clean it up in anticipation of making drbd_usermode_helper
read-only in a future patch.

Note, there are many other "generic" named global variables in the drbd
subsystem, someone should fix those up one day before they hit a linking
error.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/block/drbd/drbd_int.h  |  2 +-
 drivers/block/drbd/drbd_main.c |  4 ++--
 drivers/block/drbd/drbd_nl.c   | 20 ++++++++++----------
 3 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/drivers/block/drbd/drbd_int.h b/drivers/block/drbd/drbd_int.h
index 4cb8f21ff4ef..a139a34f1f1e 100644
--- a/drivers/block/drbd/drbd_int.h
+++ b/drivers/block/drbd/drbd_int.h
@@ -75,7 +75,7 @@ extern int fault_rate;
 extern int fault_devs;
 #endif
 
-extern char usermode_helper[];
+extern char drbd_usermode_helper[];
 
 
 /* This is used to stop/restart our threads.
diff --git a/drivers/block/drbd/drbd_main.c b/drivers/block/drbd/drbd_main.c
index 83482721bc01..8f51eccc8de7 100644
--- a/drivers/block/drbd/drbd_main.c
+++ b/drivers/block/drbd/drbd_main.c
@@ -108,9 +108,9 @@ int proc_details;       /* Detail level in proc drbd*/
 
 /* Module parameter for setting the user mode helper program
  * to run. Default is /sbin/drbdadm */
-char usermode_helper[80] = "/sbin/drbdadm";
+char drbd_usermode_helper[80] = "/sbin/drbdadm";
 
-module_param_string(usermode_helper, usermode_helper, sizeof(usermode_helper), 0644);
+module_param_string(usermode_helper, drbd_usermode_helper, sizeof(drbd_usermode_helper), 0644);
 
 /* in 2.6.x, our device mapping and config info contains our virtual gendisks
  * as member "struct gendisk *vdisk;"
diff --git a/drivers/block/drbd/drbd_nl.c b/drivers/block/drbd/drbd_nl.c
index f35db29cac76..9edc6fb95f19 100644
--- a/drivers/block/drbd/drbd_nl.c
+++ b/drivers/block/drbd/drbd_nl.c
@@ -344,7 +344,7 @@ int drbd_khelper(struct drbd_device *device, char *cmd)
 			 (char[60]) { }, /* address */
 			NULL };
 	char mb[14];
-	char *argv[] = {usermode_helper, cmd, mb, NULL };
+	char *argv[] = {drbd_usermode_helper, cmd, mb, NULL };
 	struct drbd_connection *connection = first_peer_device(device)->connection;
 	struct sib_info sib;
 	int ret;
@@ -359,19 +359,19 @@ int drbd_khelper(struct drbd_device *device, char *cmd)
 	 * write out any unsynced meta data changes now */
 	drbd_md_sync(device);
 
-	drbd_info(device, "helper command: %s %s %s\n", usermode_helper, cmd, mb);
+	drbd_info(device, "helper command: %s %s %s\n", drbd_usermode_helper, cmd, mb);
 	sib.sib_reason = SIB_HELPER_PRE;
 	sib.helper_name = cmd;
 	drbd_bcast_event(device, &sib);
 	notify_helper(NOTIFY_CALL, device, connection, cmd, 0);
-	ret = call_usermodehelper(usermode_helper, argv, envp, UMH_WAIT_PROC);
+	ret = call_usermodehelper(drbd_usermode_helper, argv, envp, UMH_WAIT_PROC);
 	if (ret)
 		drbd_warn(device, "helper command: %s %s %s exit code %u (0x%x)\n",
-				usermode_helper, cmd, mb,
+				drbd_usermode_helper, cmd, mb,
 				(ret >> 8) & 0xff, ret);
 	else
 		drbd_info(device, "helper command: %s %s %s exit code %u (0x%x)\n",
-				usermode_helper, cmd, mb,
+				drbd_usermode_helper, cmd, mb,
 				(ret >> 8) & 0xff, ret);
 	sib.sib_reason = SIB_HELPER_POST;
 	sib.helper_exit_code = ret;
@@ -396,24 +396,24 @@ enum drbd_peer_state conn_khelper(struct drbd_connection *connection, char *cmd)
 			 (char[60]) { }, /* address */
 			NULL };
 	char *resource_name = connection->resource->name;
-	char *argv[] = {usermode_helper, cmd, resource_name, NULL };
+	char *argv[] = {drbd_usermode_helper, cmd, resource_name, NULL };
 	int ret;
 
 	setup_khelper_env(connection, envp);
 	conn_md_sync(connection);
 
-	drbd_info(connection, "helper command: %s %s %s\n", usermode_helper, cmd, resource_name);
+	drbd_info(connection, "helper command: %s %s %s\n", drbd_usermode_helper, cmd, resource_name);
 	/* TODO: conn_bcast_event() ?? */
 	notify_helper(NOTIFY_CALL, NULL, connection, cmd, 0);
 
-	ret = call_usermodehelper(usermode_helper, argv, envp, UMH_WAIT_PROC);
+	ret = call_usermodehelper(drbd_usermode_helper, argv, envp, UMH_WAIT_PROC);
 	if (ret)
 		drbd_warn(connection, "helper command: %s %s %s exit code %u (0x%x)\n",
-			  usermode_helper, cmd, resource_name,
+			  drbd_usermode_helper, cmd, resource_name,
 			  (ret >> 8) & 0xff, ret);
 	else
 		drbd_info(connection, "helper command: %s %s %s exit code %u (0x%x)\n",
-			  usermode_helper, cmd, resource_name,
+			  drbd_usermode_helper, cmd, resource_name,
 			  (ret >> 8) & 0xff, ret);
 	/* TODO: conn_bcast_event() ?? */
 	notify_helper(NOTIFY_RESPONSE, NULL, connection, cmd, ret);
-- 
2.10.2

* [PATCH 1/4] kmod: make usermodehelper path a const string
From: Greg KH @ 2016-12-14 18:50 UTC (permalink / raw)
  To: kernel-hardening; +Cc: linux-kernel
In-Reply-To: <20161214185000.GA3930@kroah.com>

This is in preparation for making it so that usermode helper programs
can't be changed, if desired, by userspace.  We will tackle the mess of
cleaning up the write-ability of argv and env later, that's going to
take more work, for much less gain...

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 include/linux/kmod.h | 7 ++++---
 kernel/kmod.c        | 4 ++--
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/include/linux/kmod.h b/include/linux/kmod.h
index fcfd2bf14d3f..c4e441e00db5 100644
--- a/include/linux/kmod.h
+++ b/include/linux/kmod.h
@@ -56,7 +56,7 @@ struct file;
 struct subprocess_info {
 	struct work_struct work;
 	struct completion *complete;
-	char *path;
+	const char *path;
 	char **argv;
 	char **envp;
 	int wait;
@@ -67,10 +67,11 @@ struct subprocess_info {
 };
 
 extern int
-call_usermodehelper(char *path, char **argv, char **envp, int wait);
+call_usermodehelper(const char *path, char **argv, char **envp, int wait);
 
 extern struct subprocess_info *
-call_usermodehelper_setup(char *path, char **argv, char **envp, gfp_t gfp_mask,
+call_usermodehelper_setup(const char *path, char **argv, char **envp,
+			  gfp_t gfp_mask,
 			  int (*init)(struct subprocess_info *info, struct cred *new),
 			  void (*cleanup)(struct subprocess_info *), void *data);
 
diff --git a/kernel/kmod.c b/kernel/kmod.c
index 0277d1216f80..0c216b76afca 100644
--- a/kernel/kmod.c
+++ b/kernel/kmod.c
@@ -516,7 +516,7 @@ static void helper_unlock(void)
  * Function must be runnable in either a process context or the
  * context in which call_usermodehelper_exec is called.
  */
-struct subprocess_info *call_usermodehelper_setup(char *path, char **argv,
+struct subprocess_info *call_usermodehelper_setup(const char *path, char **argv,
 		char **envp, gfp_t gfp_mask,
 		int (*init)(struct subprocess_info *info, struct cred *new),
 		void (*cleanup)(struct subprocess_info *info),
@@ -613,7 +613,7 @@ EXPORT_SYMBOL(call_usermodehelper_exec);
  * This function is the equivalent to use call_usermodehelper_setup() and
  * call_usermodehelper_exec().
  */
-int call_usermodehelper(char *path, char **argv, char **envp, int wait)
+int call_usermodehelper(const char *path, char **argv, char **envp, int wait)
 {
 	struct subprocess_info *info;
 	gfp_t gfp_mask = (wait == UMH_NO_WAIT) ? GFP_ATOMIC : GFP_KERNEL;
-- 
2.10.2

* [RFC 0/4] make call_usermodehelper a bit more "safe"
From: Greg KH @ 2016-12-14 18:50 UTC (permalink / raw)
  To: kernel-hardening; +Cc: linux-kernel

Hi all,

Here's a proof-of-concept patch series that tries to work to address the
issue of call_usermodehelper being abused to have the kernel call any
userspace binary with full root permissions.

The issue is that if you end up getting write access to kernel memory,
if you change the string '/sbin/hotplug' to point to
'/home/hacked/my_binary', then the next uevent that the system makes
will call this binary instead of the "trusted" one.

The series addresses this by moving the location of the binary into read-only
memory.  This works for a number of call_usermodehelper strings, as they
are specified at build or configuration time.  But, some subsystems have
the option to let userspace change the value at runtime, so those values
can't live in read-only memory.  To resolve this I've created a new
configuration option, CONFIG_READONLY_USERMODEHELPER, which prevents
those values from being changed at runtime.
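
As a rough userspace illustration of the idea above (not the code from
the last patch of the series, which is not shown here): when a build-time
switch is on, the helper path is a const object, which toolchains
typically place in read-only data, and any attempt to change it is
refused. READONLY_HELPER, set_helper_path() and get_helper_path() are
hypothetical names invented for this sketch.

```c
#include <errno.h>
#include <string.h>

/* Hypothetical stand-in for CONFIG_READONLY_USERMODEHELPER.
 * Set to 0 to model the old behaviour (path changeable at runtime). */
#define READONLY_HELPER 1

#if READONLY_HELPER
/* const object: normally emitted into read-only data */
static const char helper_path[] = "/sbin/hotplug";
#else
/* writable buffer: anything with write access to it can redirect the helper */
static char helper_path[256] = "/sbin/hotplug";
#endif

/* Returns 0 on success, -EPERM if the path is fixed at build time,
 * -ENAMETOOLONG if new_path does not fit in the buffer. */
static int set_helper_path(const char *new_path)
{
#if READONLY_HELPER
	(void)new_path;
	return -EPERM;
#else
	if (strlen(new_path) >= sizeof(helper_path))
		return -ENAMETOOLONG;
	strcpy(helper_path, new_path);
	return 0;
#endif
}

/* Callers only ever see a const pointer, matching the const char *path
 * argument that call_usermodehelper() takes after patch 1/4. */
static const char *get_helper_path(void)
{
	return helper_path;
}
```

With READONLY_HELPER set to 1, set_helper_path("/home/hacked/my_binary")
returns -EPERM and get_helper_path() keeps returning "/sbin/hotplug".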

Yes, this changes existing functionality, but I'm willing to bet that
almost no one ever changes these binary locations, or if they do, they
can set them to the "correct" location at build time.

This all happens in the last patch of the series.  Note, I haven't
caught all places in the kernel that have these options, the messiest
being coredumps, which I haven't addressed yet, and is going to be a
pain.

This last patch is hacky, and I'm not really happy about it, so I'm
posting it here as an RFC to see what others think.

As a contrast, grsec does try to mitigate this same problem, but it does
so by looking at the location of the binary that is about to be run, and
only allowing a small whitelist of directories that are "allowed" to be
used.  That's a much simpler solution, but also feels hacky to me in a
way given that it's a whitelist and encompasses whole system directories
(i.e. /sbin/).  My patchset requires that each caller of
call_usermodehelper be audited, which is a pain, and new callers will
need to be watched for as well, which also isn't any good.
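
For comparison, the whitelist style of check described above could look
roughly like the following userspace sketch. This is not grsec's actual
code; the directory list and the function name helper_path_allowed() are
assumptions made up for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical whitelist; the directories are chosen for illustration only. */
static const char *const allowed_dirs[] = {
	"/sbin/", "/usr/sbin/", "/bin/", "/usr/bin/",
};

/* Returns true if path is absolute, has no "/../" component, and names
 * something below one of the whitelisted directories. */
static bool helper_path_allowed(const char *path)
{
	size_t i;

	if (path[0] != '/' || strstr(path, "/../") != NULL)
		return false;

	for (i = 0; i < sizeof(allowed_dirs) / sizeof(allowed_dirs[0]); i++) {
		size_t n = strlen(allowed_dirs[i]);

		/* prefix must match and something must follow the directory */
		if (strncmp(path, allowed_dirs[i], n) == 0 && path[n] != '\0')
			return true;
	}
	return false;
}
```

Such a check accepts "/sbin/hotplug" and rejects "/home/hacked/my_binary",
but it also accepts every other binary below the listed directories,
which is the coarseness mentioned above.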

So, anyone have any better ideas?  Is this approach worth it?  Or should
we just go down the "whitelist" path?

Note, the first 3 patches in this series will be submitted for inclusion
either way, as they are good cleanups, and change no functionality at
all, and resolve this issue automatically for some subsystems with no
downside.

thanks,

greg k-h

* Re: [PATCH v2 3/3] kvm: svm: Use the hardware provided GPA instead of page walk
From: Paolo Bonzini @ 2016-12-14 18:47 UTC (permalink / raw)
  To: Brijesh Singh
  Cc: kvm, thomas lendacky, rkrcmar, joro, x86, linux-kernel, mingo,
	hpa, tglx, bp
In-Reply-To: <8ddf2c3b-833e-4c88-ff14-d1826171703e@amd.com>



On 14/12/2016 19:39, Brijesh Singh wrote:
> 
> On 12/14/2016 11:23 AM, Paolo Bonzini wrote:
>>
>>
>> On 14/12/2016 18:07, Brijesh Singh wrote:
>>>>
>>>
>>> Since we are now going to perform multiple conditional checks before
>>> concluding that it's safe to use the HW provided GPA, how about we add two
>>> functions "emulator_is_rep_string_op" and "emulator_is_two_mem_op" to
>>> emulate.c and use these functions inside x86.c to determine if it's
>>> safe to use the HW provided GPA?
>>
>> Why not export only emulator_can_use_gpa from emulate.c?  (So in the end
>> leaving emulator_is_string_op in emulate.c was the right thing to do, it
>> was just the test that was wrong :)).
>>
> 
> Actually, I was not sure if putting emulator_can_use_gpa() in emulate.c
> was the right thing, mainly because emulate.c does not deal with GPAs. I
> will go with your advice and put it in emulate.c; that makes it easy :)
> 
> 
>> The patch below is still missing the check for cross-page MMIO.  Your
>> reference to the BKDG only covers MMCONFIG (sometimes referred to as
>> ECAM), not MMIO in general.  Doing AND or OR into video memory for
>> example is perfectly legal, and I'm fairly sure that some obscure legacy
>> software does PUSH/POP into vram as well!
>>
>>
> 
> I used your below code snippet to detect cross-page MMIO access. After
> applying these changes cross-page MMIO read/write unit test is passing
> just fine. I will include it in patch.

Great, thanks.  I hope we can include it in 4.10.

Paolo

* Re: [PATCH v3 1/2] ARM: ep93xx: Register ts73xx-fpga manager driver for TS-7300
From: Florian Fainelli @ 2016-12-14 18:47 UTC (permalink / raw)
  To: Moritz Fischer
  Cc: Linux Kernel Mailing List, linux-arm-kernel, Alan Tull,
	Russell King, Ryan Mallon, H Hartley Sweeten, linux-fpga
In-Reply-To: <CAAtXAHcO4bN2YmrEabiivnBJLXdU56j4rv4A12FL_XKuwc5Dxw@mail.gmail.com>

On 12/13/2016 10:14 PM, Moritz Fischer wrote:
> Hi Florian,
> 
> On Tue, Dec 13, 2016 at 6:35 PM, Florian Fainelli <f.fainelli@gmail.com> wrote:
>> Register the TS-7300 FPGA manager device drivers which allows us to load
>> bitstreams into the on-board Altera Cyclone II FPGA.
>>
>> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
>> ---
>>  arch/arm/mach-ep93xx/ts72xx.c | 26 ++++++++++++++++++++++++++
>>  1 file changed, 26 insertions(+)
>>
>> diff --git a/arch/arm/mach-ep93xx/ts72xx.c b/arch/arm/mach-ep93xx/ts72xx.c
>> index 3b39ea353d30..acf72ea670ef 100644
>> --- a/arch/arm/mach-ep93xx/ts72xx.c
>> +++ b/arch/arm/mach-ep93xx/ts72xx.c
>> @@ -230,6 +230,28 @@ static struct ep93xx_eth_data __initdata ts72xx_eth_data = {
>>         .phy_id         = 1,
>>  };
>>
>> +#if IS_ENABLED(CONFIG_FPGA_MGR_TS73XX)
>> +
>> +/* Relative to EP93XX_CS1_PHYS_BASE */
>> +#define TS73XX_FPGA_LOADER_BASE                0x03c00000
>> +
>> +static struct resource ts73xx_fpga_resources[] = {
>> +       {
>> +               .start  = EP93XX_CS1_PHYS_BASE + TS73XX_FPGA_LOADER_BASE,
>> +               .end    = EP93XX_CS1_PHYS_BASE + TS73XX_FPGA_LOADER_BASE + 1,
>> +               .flags  = IORESOURCE_MEM,
>> +       },
>> +};
>> +
>> +static struct platform_device ts73xx_fpga_device = {
>> +       .name   = "ts73xx-fpga-mgr",
>> +       .id     = -1,
>> +       .resource = ts73xx_fpga_resources,
>> +       .num_resources = ARRAY_SIZE(ts73xx_fpga_resources),
>> +};
>> +
>> +#endif
>> +
>>  static void __init ts72xx_init_machine(void)
>>  {
>>         ep93xx_init_devices();
>> @@ -238,6 +260,10 @@ static void __init ts72xx_init_machine(void)
>>         platform_device_register(&ts72xx_wdt_device);
>>
>>         ep93xx_register_eth(&ts72xx_eth_data, 1);
>> +#if IS_ENABLED(CONFIG_FPGA_MGR_TS73XX)
>> +       if (board_is_ts7300())
>> +               platform_device_register(&ts73xx_fpga_device);
>> +#endif
>>  }
>>
>>  MACHINE_START(TS72XX, "Technologic Systems TS-72xx SBC")
>> --
>> 2.9.3
>>
> 
> I think this is backwards, shouldn't this be your [PATCH 2/2]?
> Otherwise you're using
> the driver before you added it.

I can definitely re-order the patches, although I don't think this
really makes a difference: a driver without a device does nothing, and
vice versa.
-- 
Florian

* [PATCH v3 3/3] random: use siphash24 instead of md5 for get_random_int/long
From: Jason A. Donenfeld @ 2016-12-14 18:46 UTC (permalink / raw)
  To: Netdev, kernel-hardening, LKML, linux-crypto
  Cc: Jason A. Donenfeld, Jean-Philippe Aumasson, Ted Tso
In-Reply-To: <20161214184605.24006-1-Jason@zx2c4.com>

This duplicates the current algorithm for get_random_int/long, but uses
siphash24 instead. This comes with several benefits. It's certainly
faster and more cryptographically secure than MD5. This patch also
hashes the pid, entropy, and timestamp as fixed width fields, in order
to increase diffusion.

The previous md5 algorithm used a per-cpu md5 state, which caused
successive calls to the function to chain upon each other. While it's
not entirely clear that this kind of chaining is absolutely necessary
when using a secure PRF like siphash24, it can't hurt, and the timing of
the call chain does add a degree of natural entropy. So, in keeping with
this design, instead of the massive per-cpu 64-byte md5 state, there is
instead a per-cpu previously returned value for chaining.
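
The chaining scheme described above can be sketched in userspace as
follows. This is only an illustration of the data flow: toy_prf() is a
made-up, non-cryptographic mixer standing in for siphash24(), and
struct chain_state and next_random() are hypothetical names. In the
patch the chaining value is a per-cpu u64; here it is a plain struct.

```c
#include <stdint.h>

/* Toy keyed mixer standing in for siphash24(). NOT cryptographically
 * secure; it only shows how the inputs are combined. */
static uint64_t toy_prf(uint64_t chaining, uint64_t ts, uint64_t entropy,
			uint64_t pid, uint64_t key)
{
	uint64_t x = (chaining ^ key)
		   + ts * 0x9e3779b97f4a7c15ULL
		   + entropy * 0xc2b2ae3d27d4eb4fULL
		   + pid * 0x165667b19e3779f9ULL;

	/* splitmix64-style finalizer */
	x ^= x >> 30;
	x *= 0xbf58476d1ce4e5b9ULL;
	x ^= x >> 27;
	x *= 0x94d049bb133111ebULL;
	x ^= x >> 31;
	return x;
}

/* Stand-in for the per-cpu chaining variable. */
struct chain_state {
	uint64_t chaining;	/* previously returned value */
};

/* Mix the previous output with the fresh inputs, then remember the
 * result so that the next call depends on this one. */
static uint64_t next_random(struct chain_state *st, uint64_t ts,
			    uint64_t entropy, uint64_t pid, uint64_t key)
{
	uint64_t ret = toy_prf(st->chaining, ts, entropy, pid, key);

	st->chaining = ret;
	return ret;
}
```

Because each output is stored back into the state, two successive calls
with identical timestamp, entropy and pid inputs still differ; only the
8-byte previous value has to be kept between calls.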

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Cc: Jean-Philippe Aumasson <jeanphilippe.aumasson@gmail.com>
Cc: Ted Tso <tytso@mit.edu>
---
Changes from v2->v3:

  - Structs are no longer packed, to mitigate slow byte-by-byte assignment.

 drivers/char/random.c | 52 ++++++++++++++++++++++++++++++++-------------------
 1 file changed, 33 insertions(+), 19 deletions(-)

diff --git a/drivers/char/random.c b/drivers/char/random.c
index d6876d506220..b1c2e3b26430 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -262,6 +262,7 @@
 #include <linux/syscalls.h>
 #include <linux/completion.h>
 #include <linux/uuid.h>
+#include <linux/siphash.h>
 #include <crypto/chacha20.h>
 
 #include <asm/processor.h>
@@ -2042,7 +2043,7 @@ struct ctl_table random_table[] = {
 };
 #endif 	/* CONFIG_SYSCTL */
 
-static u32 random_int_secret[MD5_MESSAGE_BYTES / 4] ____cacheline_aligned;
+static u8 random_int_secret[SIPHASH24_KEY_LEN] __aligned(SIPHASH24_ALIGNMENT);
 
 int random_int_secret_init(void)
 {
@@ -2050,8 +2051,7 @@ int random_int_secret_init(void)
 	return 0;
 }
 
-static DEFINE_PER_CPU(__u32 [MD5_DIGEST_WORDS], get_random_int_hash)
-		__aligned(sizeof(unsigned long));
+static DEFINE_PER_CPU(u64, get_random_int_chaining);
 
 /*
  * Get a random word for internal kernel use only. Similar to urandom but
@@ -2061,19 +2061,26 @@ static DEFINE_PER_CPU(__u32 [MD5_DIGEST_WORDS], get_random_int_hash)
  */
 unsigned int get_random_int(void)
 {
-	__u32 *hash;
 	unsigned int ret;
+	struct {
+		u64 chaining;
+		unsigned long ts;
+		unsigned long entropy;
+		pid_t pid;
+		char end[];
+	} __aligned(SIPHASH24_ALIGNMENT) combined;
+	u64 *chaining;
 
 	if (arch_get_random_int(&ret))
 		return ret;
 
-	hash = get_cpu_var(get_random_int_hash);
-
-	hash[0] += current->pid + jiffies + random_get_entropy();
-	md5_transform(hash, random_int_secret);
-	ret = hash[0];
-	put_cpu_var(get_random_int_hash);
-
+	chaining = get_cpu_ptr(&get_random_int_chaining);
+	combined.chaining = *chaining;
+	combined.ts = jiffies;
+	combined.entropy = random_get_entropy();
+	combined.pid = current->pid;
+	ret = *chaining = siphash24((u8 *)&combined, offsetof(typeof(combined), end), random_int_secret);
+	put_cpu_ptr(chaining);
 	return ret;
 }
 EXPORT_SYMBOL(get_random_int);
@@ -2083,19 +2090,26 @@ EXPORT_SYMBOL(get_random_int);
  */
 unsigned long get_random_long(void)
 {
-	__u32 *hash;
 	unsigned long ret;
+	struct {
+		u64 chaining;
+		unsigned long ts;
+		unsigned long entropy;
+		pid_t pid;
+		char end[];
+	} __aligned(SIPHASH24_ALIGNMENT) combined;
+	u64 *chaining;
 
 	if (arch_get_random_long(&ret))
 		return ret;
 
-	hash = get_cpu_var(get_random_int_hash);
-
-	hash[0] += current->pid + jiffies + random_get_entropy();
-	md5_transform(hash, random_int_secret);
-	ret = *(unsigned long *)hash;
-	put_cpu_var(get_random_int_hash);
-
+	chaining = get_cpu_ptr(&get_random_int_chaining);
+	combined.chaining = *chaining;
+	combined.ts = jiffies;
+	combined.entropy = random_get_entropy();
+	combined.pid = current->pid;
+	ret = *chaining = siphash24((u8 *)&combined, offsetof(typeof(combined), end), random_int_secret);
+	put_cpu_ptr(chaining);
 	return ret;
 }
 EXPORT_SYMBOL(get_random_long);
-- 
2.11.0

* [PATCH v3 1/3] siphash: add cryptographically secure hashtable function
From: Jason A. Donenfeld @ 2016-12-14 18:46 UTC (permalink / raw)
  To: Netdev, kernel-hardening, LKML, linux-crypto
  Cc: Jason A. Donenfeld, Jean-Philippe Aumasson, Daniel J . Bernstein,
	Linus Torvalds, Eric Biggers, David Laight
In-Reply-To: <20161214035927.30004-1-Jason@zx2c4.com>

SipHash is a 64-bit keyed hash function that is actually a
cryptographically secure PRF, like HMAC. Except SipHash is super fast,
and is meant to be used as a hashtable keyed lookup function.

SipHash isn't just some new trendy hash function. It's been around for a
while, and there really isn't anything that comes remotely close to
being useful in the way SipHash is. With that said, why do we need this?

There are a variety of attacks known as "hashtable poisoning" in which an
attacker crafts inputs that all hash to the same value, and then proceeds
to fill up all entries of a single hash bucket. This is a realistic and
well-known denial-of-service vector.
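
To make the attack concrete, here is a small userspace sketch using a toy
unkeyed multiply-by-31 string hash. The hash is an assumption chosen
because its collisions are easy to construct by hand; it is not jhash and
not anything from this patch. toy_hash() and bucket_of() are hypothetical
names.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy unkeyed string hash: h = h * 31 + c for each byte. */
static uint32_t toy_hash(const char *s)
{
	uint32_t h = 0;

	while (*s)
		h = h * 31 + (unsigned char)*s++;
	return h;
}

/* Bucket selection as a simple table might do it. */
static size_t bucket_of(const char *s, size_t nbuckets)
{
	return toy_hash(s) % nbuckets;
}
```

With this hash, "Aa" and "BB" collide (65*31 + 97 == 66*31 + 66), so any
concatenation of those two blocks, such as "AaAa", "AaBB", "BBAa" and
"BBBB", lands in the same bucket. An attacker can generate 2^n such keys
of length 2n offline. With a keyed PRF, the attacker cannot precompute
colliding inputs without knowing the secret key.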

Linux developers already seem to be aware that this is an issue, and
various places that use hash tables in, say, a network context, use a
non-cryptographically secure function (usually jhash) and then try to
twiddle with the key on a time basis (or in many cases just do nothing
and hope that nobody notices). While this is an admirable attempt at
solving the problem, it doesn't actually fix it. SipHash fixes it.

(It fixes it in such a sound way that you could even build a stream
cipher out of SipHash that would resist modern cryptanalysis.)

There are a number of places in the kernel that are vulnerable to
hashtable poisoning attacks, either via userspace vectors or network
vectors, and there's not a reliable mechanism inside the kernel at the
moment to fix it. The first step toward fixing these issues is actually
getting a secure primitive into the kernel for developers to use. Then
we can, bit by bit, port things over to it as deemed appropriate.

Secondly, a few places are using MD5 for creating secure sequence
numbers, port numbers, or fast random numbers. Siphash is a faster, more
fitting, and more secure replacement for MD5 in those situations.

Dozens of languages are already using this internally for their hash
tables. Some of the BSDs already use this in their kernels. SipHash is
a widely known high-speed solution to a widely known problem, and it's
time we catch up.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Cc: Jean-Philippe Aumasson <jeanphilippe.aumasson@gmail.com>
Cc: Daniel J. Bernstein <djb@cr.yp.to>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Eric Biggers <ebiggers3@gmail.com>
Cc: David Laight <David.Laight@aculab.com>
---
Changes from v2->v3:

  - There is now a fast aligned version of the function and a not-as-fast
    unaligned version. The requirements for each have been documented in
    a docbook-style comment. As well, the header now contains a constant
    for the expected alignment.

  - The test suite has been updated to check both the unaligned and aligned
    version of the function.

 include/linux/siphash.h |  30 ++++++++++
 lib/Kconfig.debug       |   6 +-
 lib/Makefile            |   5 +-
 lib/siphash.c           | 153 ++++++++++++++++++++++++++++++++++++++++++++++++
 lib/test_siphash.c      |  85 +++++++++++++++++++++++++++
 5 files changed, 274 insertions(+), 5 deletions(-)
 create mode 100644 include/linux/siphash.h
 create mode 100644 lib/siphash.c
 create mode 100644 lib/test_siphash.c

diff --git a/include/linux/siphash.h b/include/linux/siphash.h
new file mode 100644
index 000000000000..82dc1a911a2e
--- /dev/null
+++ b/include/linux/siphash.h
@@ -0,0 +1,30 @@
+/* Copyright (C) 2016 Jason A. Donenfeld <Jason@zx2c4.com>
+ *
+ * This file is provided under a dual BSD/GPLv2 license.
+ *
+ * SipHash: a fast short-input PRF
+ * https://131002.net/siphash/
+ */
+
+#ifndef _LINUX_SIPHASH_H
+#define _LINUX_SIPHASH_H
+
+#include <linux/types.h>
+
+enum siphash_lengths {
+	SIPHASH24_KEY_LEN = 16,
+	SIPHASH24_ALIGNMENT = 8
+};
+
+u64 siphash24(const u8 *data, size_t len, const u8 key[SIPHASH24_KEY_LEN]);
+
+#ifdef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
+static inline u64 siphash24_unaligned(const u8 *data, size_t len, const u8 key[SIPHASH24_KEY_LEN])
+{
+	return siphash24(data, len, key);
+}
+#else
+u64 siphash24_unaligned(const u8 *data, size_t len, const u8 key[SIPHASH24_KEY_LEN]);
+#endif
+
+#endif /* _LINUX_SIPHASH_H */
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index e6327d102184..32bbf689fc46 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1843,9 +1843,9 @@ config TEST_HASH
 	tristate "Perform selftest on hash functions"
 	default n
 	help
-	  Enable this option to test the kernel's integer (<linux/hash,h>)
-	  and string (<linux/stringhash.h>) hash functions on boot
-	  (or module load).
+	  Enable this option to test the kernel's integer (<linux/hash.h>),
+	  string (<linux/stringhash.h>), and siphash (<linux/siphash.h>)
+	  hash functions on boot (or module load).
 
 	  This is intended to help people writing architecture-specific
 	  optimized versions.  If unsure, say N.
diff --git a/lib/Makefile b/lib/Makefile
index 50144a3aeebd..71d398b04a74 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -22,7 +22,8 @@ lib-y := ctype.o string.o vsprintf.o cmdline.o \
 	 sha1.o chacha20.o md5.o irq_regs.o argv_split.o \
 	 flex_proportions.o ratelimit.o show_mem.o \
 	 is_single_threaded.o plist.o decompress.o kobject_uevent.o \
-	 earlycpio.o seq_buf.o nmi_backtrace.o nodemask.o win_minmax.o
+	 earlycpio.o seq_buf.o siphash.o \
+	 nmi_backtrace.o nodemask.o win_minmax.o
 
 lib-$(CONFIG_MMU) += ioremap.o
 lib-$(CONFIG_SMP) += cpumask.o
@@ -44,7 +45,7 @@ obj-$(CONFIG_TEST_HEXDUMP) += test_hexdump.o
 obj-y += kstrtox.o
 obj-$(CONFIG_TEST_BPF) += test_bpf.o
 obj-$(CONFIG_TEST_FIRMWARE) += test_firmware.o
-obj-$(CONFIG_TEST_HASH) += test_hash.o
+obj-$(CONFIG_TEST_HASH) += test_hash.o test_siphash.o
 obj-$(CONFIG_TEST_KASAN) += test_kasan.o
 obj-$(CONFIG_TEST_KSTRTOX) += test-kstrtox.o
 obj-$(CONFIG_TEST_LKM) += test_module.o
diff --git a/lib/siphash.c b/lib/siphash.c
new file mode 100644
index 000000000000..32acdc26234f
--- /dev/null
+++ b/lib/siphash.c
@@ -0,0 +1,153 @@
+/* Copyright (C) 2015-2016 Jason A. Donenfeld <Jason@zx2c4.com>
+ * Copyright (C) 2012-2014 Jean-Philippe Aumasson <jeanphilippe.aumasson@gmail.com>
+ * Copyright (C) 2012-2014 Daniel J. Bernstein <djb@cr.yp.to>
+ *
+ * This file is provided under a dual BSD/GPLv2 license.
+ *
+ * SipHash: a fast short-input PRF
+ * https://131002.net/siphash/
+ */
+
+#include <linux/siphash.h>
+#include <linux/kernel.h>
+#include <asm/unaligned.h>
+
+#if defined(CONFIG_DCACHE_WORD_ACCESS) && BITS_PER_LONG == 64
+#include <linux/dcache.h>
+#include <asm/word-at-a-time.h>
+#endif
+
+#define SIPROUND \
+	do { \
+	v0 += v1; v1 = rol64(v1, 13); v1 ^= v0; v0 = rol64(v0, 32); \
+	v2 += v3; v3 = rol64(v3, 16); v3 ^= v2; \
+	v0 += v3; v3 = rol64(v3, 21); v3 ^= v0; \
+	v2 += v1; v1 = rol64(v1, 17); v1 ^= v2; v2 = rol64(v2, 32); \
+	} while(0)
+
+static inline u16 le16_to_cpuvp(const void *p)
+{
+	return le16_to_cpup(p);
+}
+static inline u32 le32_to_cpuvp(const void *p)
+{
+	return le32_to_cpup(p);
+}
+static inline u64 le64_to_cpuvp(const void *p)
+{
+	return le64_to_cpup(p);
+}
+
+/**
+ * siphash24 - compute 64-bit siphash24 PRF value
+ * @data: buffer to hash, must be aligned to SIPHASH24_ALIGNMENT
+ * @len: size of @data
+ * @key: key buffer of size SIPHASH24_KEY_LEN, must be aligned to SIPHASH24_ALIGNMENT
+ */
+u64 siphash24(const u8 *data, size_t len, const u8 key[SIPHASH24_KEY_LEN])
+{
+	u64 v0 = 0x736f6d6570736575ULL;
+	u64 v1 = 0x646f72616e646f6dULL;
+	u64 v2 = 0x6c7967656e657261ULL;
+	u64 v3 = 0x7465646279746573ULL;
+	u64 b = ((u64)len) << 56;
+	u64 k0 = le64_to_cpuvp(key);
+	u64 k1 = le64_to_cpuvp(key + sizeof(u64));
+	u64 m;
+	const u8 *end = data + len - (len % sizeof(u64));
+	const u8 left = len & (sizeof(u64) - 1);
+	v3 ^= k1;
+	v2 ^= k0;
+	v1 ^= k1;
+	v0 ^= k0;
+	for (; data != end; data += sizeof(u64)) {
+		m = le64_to_cpuvp(data);
+		v3 ^= m;
+		SIPROUND;
+		SIPROUND;
+		v0 ^= m;
+	}
+#if defined(CONFIG_DCACHE_WORD_ACCESS) && BITS_PER_LONG == 64
+	if (left)
+		b |= le64_to_cpu((__force __le64)(load_unaligned_zeropad(data) & bytemask_from_count(left)));
+#else
+	switch (left) {
+	case 7: b |= ((u64)data[6]) << 48;
+	case 6: b |= ((u64)data[5]) << 40;
+	case 5: b |= ((u64)data[4]) << 32;
+	case 4: b |= le32_to_cpuvp(data); break;
+	case 3: b |= ((u64)data[2]) << 16;
+	case 2: b |= le16_to_cpuvp(data); break;
+	case 1: b |= data[0];
+	}
+#endif
+	v3 ^= b;
+	SIPROUND;
+	SIPROUND;
+	v0 ^= b;
+	v2 ^= 0xff;
+	SIPROUND;
+	SIPROUND;
+	SIPROUND;
+	SIPROUND;
+	return (v0 ^ v1) ^ (v2 ^ v3);
+}
+EXPORT_SYMBOL(siphash24);
+
+#ifndef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
+/**
+ * siphash24_unaligned - compute 64-bit siphash24 PRF value, without alignment requirements
+ * @data: buffer to hash
+ * @len: size of @data
+ * @key: key buffer of size SIPHASH24_KEY_LEN
+ */
+u64 siphash24_unaligned(const u8 *data, size_t len, const u8 key[SIPHASH24_KEY_LEN])
+{
+	u64 v0 = 0x736f6d6570736575ULL;
+	u64 v1 = 0x646f72616e646f6dULL;
+	u64 v2 = 0x6c7967656e657261ULL;
+	u64 v3 = 0x7465646279746573ULL;
+	u64 b = ((u64)len) << 56;
+	u64 k0 = get_unaligned_le64(key);
+	u64 k1 = get_unaligned_le64(key + sizeof(u64));
+	u64 m;
+	const u8 *end = data + len - (len % sizeof(u64));
+	const u8 left = len & (sizeof(u64) - 1);
+	v3 ^= k1;
+	v2 ^= k0;
+	v1 ^= k1;
+	v0 ^= k0;
+	for (; data != end; data += sizeof(u64)) {
+		m = get_unaligned_le64(data);
+		v3 ^= m;
+		SIPROUND;
+		SIPROUND;
+		v0 ^= m;
+	}
+#if defined(CONFIG_DCACHE_WORD_ACCESS) && BITS_PER_LONG == 64
+	if (left)
+		b |= le64_to_cpu((__force __le64)(load_unaligned_zeropad(data) & bytemask_from_count(left)));
+#else
+	switch (left) {
+	case 7: b |= ((u64)data[6]) << 48;
+	case 6: b |= ((u64)data[5]) << 40;
+	case 5: b |= ((u64)data[4]) << 32;
+	case 4: b |= get_unaligned_le32(data); break;
+	case 3: b |= ((u64)data[2]) << 16;
+	case 2: b |= get_unaligned_le16(data); break;
+	case 1: b |= data[0];
+	}
+#endif
+	v3 ^= b;
+	SIPROUND;
+	SIPROUND;
+	v0 ^= b;
+	v2 ^= 0xff;
+	SIPROUND;
+	SIPROUND;
+	SIPROUND;
+	SIPROUND;
+	return (v0 ^ v1) ^ (v2 ^ v3);
+}
+EXPORT_SYMBOL(siphash24_unaligned);
+#endif
diff --git a/lib/test_siphash.c b/lib/test_siphash.c
new file mode 100644
index 000000000000..69ac94dec366
--- /dev/null
+++ b/lib/test_siphash.c
@@ -0,0 +1,85 @@
+/* Test cases for siphash.c
+ *
+ * Copyright (C) 2015-2016 Jason A. Donenfeld <Jason@zx2c4.com>
+ *
+ * This file is provided under a dual BSD/GPLv2 license.
+ *
+ * SipHash: a fast short-input PRF
+ * https://131002.net/siphash/
+ */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include <linux/siphash.h>
+#include <linux/kernel.h>
+#include <linux/string.h>
+#include <linux/errno.h>
+#include <linux/module.h>
+
+/* Test vectors taken from official reference source available at:
+ *     https://131002.net/siphash/siphash24.c
+ */
+static const u64 test_vectors[64] = {
+	0x726fdb47dd0e0e31ULL, 0x74f839c593dc67fdULL, 0x0d6c8009d9a94f5aULL,
+	0x85676696d7fb7e2dULL, 0xcf2794e0277187b7ULL, 0x18765564cd99a68dULL,
+	0xcbc9466e58fee3ceULL, 0xab0200f58b01d137ULL, 0x93f5f5799a932462ULL,
+	0x9e0082df0ba9e4b0ULL, 0x7a5dbbc594ddb9f3ULL, 0xf4b32f46226bada7ULL,
+	0x751e8fbc860ee5fbULL, 0x14ea5627c0843d90ULL, 0xf723ca908e7af2eeULL,
+	0xa129ca6149be45e5ULL, 0x3f2acc7f57c29bdbULL, 0x699ae9f52cbe4794ULL,
+	0x4bc1b3f0968dd39cULL, 0xbb6dc91da77961bdULL, 0xbed65cf21aa2ee98ULL,
+	0xd0f2cbb02e3b67c7ULL, 0x93536795e3a33e88ULL, 0xa80c038ccd5ccec8ULL,
+	0xb8ad50c6f649af94ULL, 0xbce192de8a85b8eaULL, 0x17d835b85bbb15f3ULL,
+	0x2f2e6163076bcfadULL, 0xde4daaaca71dc9a5ULL, 0xa6a2506687956571ULL,
+	0xad87a3535c49ef28ULL, 0x32d892fad841c342ULL, 0x7127512f72f27cceULL,
+	0xa7f32346f95978e3ULL, 0x12e0b01abb051238ULL, 0x15e034d40fa197aeULL,
+	0x314dffbe0815a3b4ULL, 0x027990f029623981ULL, 0xcadcd4e59ef40c4dULL,
+	0x9abfd8766a33735cULL, 0x0e3ea96b5304a7d0ULL, 0xad0c42d6fc585992ULL,
+	0x187306c89bc215a9ULL, 0xd4a60abcf3792b95ULL, 0xf935451de4f21df2ULL,
+	0xa9538f0419755787ULL, 0xdb9acddff56ca510ULL, 0xd06c98cd5c0975ebULL,
+	0xe612a3cb9ecba951ULL, 0xc766e62cfcadaf96ULL, 0xee64435a9752fe72ULL,
+	0xa192d576b245165aULL, 0x0a8787bf8ecb74b2ULL, 0x81b3e73d20b49b6fULL,
+	0x7fa8220ba3b2eceaULL, 0x245731c13ca42499ULL, 0xb78dbfaf3a8d83bdULL,
+	0xea1ad565322a1a0bULL, 0x60e61c23a3795013ULL, 0x6606d7e446282b93ULL,
+	0x6ca4ecb15c5f91e1ULL, 0x9f626da15c9625f3ULL, 0xe51b38608ef25f57ULL,
+	0x958a324ceb064572ULL
+};
+
+static int __init siphash_test_init(void)
+{
+	u8 in[64] __aligned(SIPHASH24_ALIGNMENT);
+	u8 k[16] __aligned(SIPHASH24_ALIGNMENT);
+	u8 in_unaligned[65];
+	u8 k_unaligned[65];
+	u8 i;
+	int ret = 0;
+
+	for (i = 0; i < 16; ++i) {
+		k[i] = i;
+		k_unaligned[i + 1] = i;
+	}
+	for (i = 0; i < 64; ++i) {
+		in[i] = i;
+		in_unaligned[i + 1] = i;
+		if (siphash24(in, i, k) != test_vectors[i]) {
+			pr_info("self-test aligned %u: FAIL\n", i + 1);
+			ret = -EINVAL;
+		}
+		if (siphash24_unaligned(in_unaligned + 1, i, k_unaligned + 1) != test_vectors[i]) {
+			pr_info("self-test unaligned %u: FAIL\n", i + 1);
+			ret = -EINVAL;
+		}
+	}
+	if (!ret)
+		pr_info("self-tests: pass\n");
+	return ret;
+}
+
+static void __exit siphash_test_exit(void)
+{
+}
+
+module_init(siphash_test_init);
+module_exit(siphash_test_exit);
+
+MODULE_AUTHOR("Jason A. Donenfeld <Jason@zx2c4.com>");
+MODULE_LICENSE("Dual BSD/GPL");
-- 
2.11.0
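
For anyone who wants to sanity-check the vectors outside the kernel, the
following is a rough userspace sketch of the same SipHash-2-4 computation.
It is not the kernel code: siphash24_sketch() and load_le64() are names
made up for this example, and all loads are done byte by byte, so there is
no alignment requirement and no word-at-a-time path.

```c
#include <stddef.h>
#include <stdint.h>

#define ROTL64(x, b) (((x) << (b)) | ((x) >> (64 - (b))))

/* Same round function as SIPROUND in lib/siphash.c above. */
#define SIPROUND \
	do { \
	v0 += v1; v1 = ROTL64(v1, 13); v1 ^= v0; v0 = ROTL64(v0, 32); \
	v2 += v3; v3 = ROTL64(v3, 16); v3 ^= v2; \
	v0 += v3; v3 = ROTL64(v3, 21); v3 ^= v0; \
	v2 += v1; v1 = ROTL64(v1, 17); v1 ^= v2; v2 = ROTL64(v2, 32); \
	} while (0)

/* Load a little-endian 64-bit value one byte at a time. */
static uint64_t load_le64(const uint8_t *p)
{
	uint64_t v = 0;
	int i;

	for (i = 7; i >= 0; i--)
		v = (v << 8) | p[i];
	return v;
}

/* Hypothetical userspace counterpart of siphash24(); key is 16 bytes. */
uint64_t siphash24_sketch(const uint8_t *data, size_t len, const uint8_t key[16])
{
	uint64_t k0 = load_le64(key);
	uint64_t k1 = load_le64(key + 8);
	uint64_t v0 = 0x736f6d6570736575ULL ^ k0;
	uint64_t v1 = 0x646f72616e646f6dULL ^ k1;
	uint64_t v2 = 0x6c7967656e657261ULL ^ k0;
	uint64_t v3 = 0x7465646279746573ULL ^ k1;
	uint64_t b = ((uint64_t)len) << 56;	/* length goes in the top byte */
	const uint8_t *end = data + (len - len % 8);
	size_t left = len % 8;
	uint64_t m;
	size_t i;

	/* Compression: two rounds per full 8-byte word. */
	for (; data != end; data += 8) {
		m = load_le64(data);
		v3 ^= m;
		SIPROUND;
		SIPROUND;
		v0 ^= m;
	}

	/* Fold the 0..7 trailing bytes into the final word. */
	for (i = 0; i < left; i++)
		b |= ((uint64_t)data[i]) << (8 * i);

	v3 ^= b;
	SIPROUND;
	SIPROUND;
	v0 ^= b;

	/* Finalization: four rounds. */
	v2 ^= 0xff;
	SIPROUND;
	SIPROUND;
	SIPROUND;
	SIPROUND;
	return (v0 ^ v1) ^ (v2 ^ v3);
}
```

With key bytes 0..15 and input bytes 0..i-1, the result for length i should
match test_vectors[i] from lib/test_siphash.c, for example
0x726fdb47dd0e0e31 for the empty input.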

^ permalink raw reply related

* [PATCH v3 2/3] secure_seq: use siphash24 instead of md5_transform
From: Jason A. Donenfeld @ 2016-12-14 18:46 UTC (permalink / raw)
  To: Netdev, kernel-hardening, LKML, linux-crypto
  Cc: Jason A. Donenfeld, Andi Kleen, David Miller, David Laight
In-Reply-To: <20161214184605.24006-1-Jason@zx2c4.com>

This gives a clear speed and security improvement. SipHash is both
faster and more solid crypto than the aging MD5.

Rather than manually filling MD5 buffers, we simply lay out the input
in an anonymous struct, for which gcc generates rather efficient code.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Miller <davem@davemloft.net>
Cc: David Laight <David.Laight@aculab.com>
---
Changes from v2->v3:

  - Structs are no longer packed, to mitigate slow byte-by-byte assignment.
  - A typo has been fixed in the port number assignment.
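
To illustrate the layout trick, here is a small userspace sketch of what
offsetof(typeof(combined), end) buys over sizeof(). The struct names and
HASH_ALIGNMENT are stand-ins invented for this example (8 is only an
assumed value for SIPHASH24_ALIGNMENT), and named structs are used instead
of anonymous ones so that plain offsetof() works without typeof.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HASH_ALIGNMENT 8	/* assumed stand-in for SIPHASH24_ALIGNMENT */

/* Stand-in for the IPv4 sequence number input (addresses plus two ports). */
struct v4_tuple {
	uint32_t saddr;
	uint32_t daddr;
	uint16_t sport;
	uint16_t dport;
	char end[];	/* zero-size marker; its offset is the hashed length */
} __attribute__((aligned(HASH_ALIGNMENT)));

/* Stand-in for the IPv4 ephemeral port input (addresses plus one port). */
struct v4_port_tuple {
	uint32_t saddr;
	uint32_t daddr;
	uint16_t dport;
	char end[];
} __attribute__((aligned(HASH_ALIGNMENT)));

/* Neither layout has interior padding, so every hashed byte is a member. */
static_assert(offsetof(struct v4_tuple, dport) == 10, "unexpected layout");
static_assert(offsetof(struct v4_port_tuple, dport) == 8, "unexpected layout");

/* Number of bytes that would be handed to the hash. */
size_t v4_tuple_hashed_len(void)
{
	return offsetof(struct v4_tuple, end);
}

size_t v4_port_tuple_hashed_len(void)
{
	return offsetof(struct v4_port_tuple, end);
}

/* Storage size, which includes the trailing padding added for alignment. */
size_t v4_tuple_storage_len(void)
{
	return sizeof(struct v4_tuple);
}
```

On common ABIs the first struct occupies 16 bytes but only 12 are hashed,
and the second hashes 10; using sizeof() instead would feed trailing
padding to the hash.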

 net/core/secure_seq.c | 166 ++++++++++++++++++++++++++------------------------
 1 file changed, 85 insertions(+), 81 deletions(-)

diff --git a/net/core/secure_seq.c b/net/core/secure_seq.c
index 88a8e429fc3e..00eb141c981b 100644
--- a/net/core/secure_seq.c
+++ b/net/core/secure_seq.c
@@ -1,3 +1,5 @@
+/* Copyright (C) 2016 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved. */
+
 #include <linux/kernel.h>
 #include <linux/init.h>
 #include <linux/cryptohash.h>
@@ -8,14 +10,14 @@
 #include <linux/ktime.h>
 #include <linux/string.h>
 #include <linux/net.h>
-
+#include <linux/siphash.h>
 #include <net/secure_seq.h>
 
 #if IS_ENABLED(CONFIG_IPV6) || IS_ENABLED(CONFIG_INET)
+#include <linux/in6.h>
 #include <net/tcp.h>
-#define NET_SECRET_SIZE (MD5_MESSAGE_BYTES / 4)
 
-static u32 net_secret[NET_SECRET_SIZE] ____cacheline_aligned;
+static u8 net_secret[SIPHASH24_KEY_LEN] __aligned(SIPHASH24_ALIGNMENT);
 
 static __always_inline void net_secret_init(void)
 {
@@ -44,44 +46,41 @@ static u32 seq_scale(u32 seq)
 u32 secure_tcpv6_sequence_number(const __be32 *saddr, const __be32 *daddr,
 				 __be16 sport, __be16 dport, u32 *tsoff)
 {
-	u32 secret[MD5_MESSAGE_BYTES / 4];
-	u32 hash[MD5_DIGEST_WORDS];
-	u32 i;
-
+	const struct {
+		struct in6_addr saddr;
+		struct in6_addr daddr;
+		__be16 sport;
+		__be16 dport;
+		char end[];
+	} __aligned(SIPHASH24_ALIGNMENT) combined = {
+		.saddr = *(struct in6_addr *)saddr,
+		.daddr = *(struct in6_addr *)daddr,
+		.sport = sport,
+		.dport = dport
+	};
+	u64 hash;
 	net_secret_init();
-	memcpy(hash, saddr, 16);
-	for (i = 0; i < 4; i++)
-		secret[i] = net_secret[i] + (__force u32)daddr[i];
-	secret[4] = net_secret[4] +
-		(((__force u16)sport << 16) + (__force u16)dport);
-	for (i = 5; i < MD5_MESSAGE_BYTES / 4; i++)
-		secret[i] = net_secret[i];
-
-	md5_transform(hash, secret);
-
-	*tsoff = sysctl_tcp_timestamps == 1 ? hash[1] : 0;
-	return seq_scale(hash[0]);
+	hash = siphash24((const u8 *)&combined, offsetof(typeof(combined), end), net_secret);
+	*tsoff = sysctl_tcp_timestamps == 1 ? (hash >> 32) : 0;
+	return seq_scale(hash);
 }
 EXPORT_SYMBOL(secure_tcpv6_sequence_number);
 
 u32 secure_ipv6_port_ephemeral(const __be32 *saddr, const __be32 *daddr,
 			       __be16 dport)
 {
-	u32 secret[MD5_MESSAGE_BYTES / 4];
-	u32 hash[MD5_DIGEST_WORDS];
-	u32 i;
-
+	const struct {
+		struct in6_addr saddr;
+		struct in6_addr daddr;
+		__be16 dport;
+		char end[];
+	} __aligned(SIPHASH24_ALIGNMENT) combined = {
+		.saddr = *(struct in6_addr *)saddr,
+		.daddr = *(struct in6_addr *)daddr,
+		.dport = dport
+	};
 	net_secret_init();
-	memcpy(hash, saddr, 16);
-	for (i = 0; i < 4; i++)
-		secret[i] = net_secret[i] + (__force u32) daddr[i];
-	secret[4] = net_secret[4] + (__force u32)dport;
-	for (i = 5; i < MD5_MESSAGE_BYTES / 4; i++)
-		secret[i] = net_secret[i];
-
-	md5_transform(hash, secret);
-
-	return hash[0];
+	return siphash24((const u8 *)&combined, offsetof(typeof(combined), end), net_secret);
 }
 EXPORT_SYMBOL(secure_ipv6_port_ephemeral);
 #endif
@@ -91,33 +90,39 @@ EXPORT_SYMBOL(secure_ipv6_port_ephemeral);
 u32 secure_tcp_sequence_number(__be32 saddr, __be32 daddr,
 			       __be16 sport, __be16 dport, u32 *tsoff)
 {
-	u32 hash[MD5_DIGEST_WORDS];
-
+	const struct {
+		__be32 saddr;
+		__be32 daddr;
+		__be16 sport;
+		__be16 dport;
+		char end[];
+	} __aligned(SIPHASH24_ALIGNMENT) combined = {
+		.saddr = saddr,
+		.daddr = daddr,
+		.sport = sport,
+		.dport = dport
+	};
+	u64 hash;
 	net_secret_init();
-	hash[0] = (__force u32)saddr;
-	hash[1] = (__force u32)daddr;
-	hash[2] = ((__force u16)sport << 16) + (__force u16)dport;
-	hash[3] = net_secret[15];
-
-	md5_transform(hash, net_secret);
-
-	*tsoff = sysctl_tcp_timestamps == 1 ? hash[1] : 0;
-	return seq_scale(hash[0]);
+	hash = siphash24((const u8 *)&combined, offsetof(typeof(combined), end), net_secret);
+	*tsoff = sysctl_tcp_timestamps == 1 ? (hash >> 32) : 0;
+	return seq_scale(hash);
 }
 
 u32 secure_ipv4_port_ephemeral(__be32 saddr, __be32 daddr, __be16 dport)
 {
-	u32 hash[MD5_DIGEST_WORDS];
-
+	const struct {
+		__be32 saddr;
+		__be32 daddr;
+		__be16 dport;
+		char end[];
+	} __aligned(SIPHASH24_ALIGNMENT) combined = {
+		.saddr = saddr,
+		.daddr = daddr,
+		.dport = dport
+	};
 	net_secret_init();
-	hash[0] = (__force u32)saddr;
-	hash[1] = (__force u32)daddr;
-	hash[2] = (__force u32)dport ^ net_secret[14];
-	hash[3] = net_secret[15];
-
-	md5_transform(hash, net_secret);
-
-	return hash[0];
+	return siphash24((const u8 *)&combined, offsetof(typeof(combined), end), net_secret);
 }
 EXPORT_SYMBOL_GPL(secure_ipv4_port_ephemeral);
 #endif
@@ -126,21 +131,23 @@ EXPORT_SYMBOL_GPL(secure_ipv4_port_ephemeral);
 u64 secure_dccp_sequence_number(__be32 saddr, __be32 daddr,
 				__be16 sport, __be16 dport)
 {
-	u32 hash[MD5_DIGEST_WORDS];
+	const struct {
+		__be32 saddr;
+		__be32 daddr;
+		__be16 sport;
+		__be16 dport;
+		char end[];
+	} __aligned(SIPHASH24_ALIGNMENT) combined = {
+		.saddr = saddr,
+		.daddr = daddr,
+		.sport = sport,
+		.dport = dport
+	};
 	u64 seq;
-
 	net_secret_init();
-	hash[0] = (__force u32)saddr;
-	hash[1] = (__force u32)daddr;
-	hash[2] = ((__force u16)sport << 16) + (__force u16)dport;
-	hash[3] = net_secret[15];
-
-	md5_transform(hash, net_secret);
-
-	seq = hash[0] | (((u64)hash[1]) << 32);
+	seq = siphash24((const u8 *)&combined, offsetof(typeof(combined), end), net_secret);
 	seq += ktime_get_real_ns();
 	seq &= (1ull << 48) - 1;
-
 	return seq;
 }
 EXPORT_SYMBOL(secure_dccp_sequence_number);
@@ -149,26 +156,23 @@ EXPORT_SYMBOL(secure_dccp_sequence_number);
 u64 secure_dccpv6_sequence_number(__be32 *saddr, __be32 *daddr,
 				  __be16 sport, __be16 dport)
 {
-	u32 secret[MD5_MESSAGE_BYTES / 4];
-	u32 hash[MD5_DIGEST_WORDS];
+	const struct {
+		struct in6_addr saddr;
+		struct in6_addr daddr;
+		__be16 sport;
+		__be16 dport;
+		char end[];
+	} __aligned(SIPHASH24_ALIGNMENT) combined = {
+		.saddr = *(struct in6_addr *)saddr,
+		.daddr = *(struct in6_addr *)daddr,
+		.sport = sport,
+		.dport = dport
+	};
 	u64 seq;
-	u32 i;
-
 	net_secret_init();
-	memcpy(hash, saddr, 16);
-	for (i = 0; i < 4; i++)
-		secret[i] = net_secret[i] + (__force u32)daddr[i];
-	secret[4] = net_secret[4] +
-		(((__force u16)sport << 16) + (__force u16)dport);
-	for (i = 5; i < MD5_MESSAGE_BYTES / 4; i++)
-		secret[i] = net_secret[i];
-
-	md5_transform(hash, secret);
-
-	seq = hash[0] | (((u64)hash[1]) << 32);
+	seq = siphash24((const u8 *)&combined, offsetof(typeof(combined), end), net_secret);
 	seq += ktime_get_real_ns();
 	seq &= (1ull << 48) - 1;
-
 	return seq;
 }
 EXPORT_SYMBOL(secure_dccpv6_sequence_number);
-- 
2.11.0

^ permalink raw reply related

* Re: [GIT PULL] f2fs update for 4.10
From: Jaegeuk Kim @ 2016-12-14 18:40 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Linux Kernel Mailing List, Linux FS Dev Mailing List,
	Linux F2FS Dev Mailing List
In-Reply-To: <CA+55aFw4dRsn0dG2RO7sgKFDF--jjb8_ZgQg_ghL+tpDKH63Tw@mail.gmail.com>

On 12/14, Linus Torvalds wrote:
> On Mon, Dec 12, 2016 at 2:15 PM, Jaegeuk Kim <jaegeuk@kernel.org> wrote:
> >
> > Could you please consider this pull request?
> 
> Pulled. Mind double-checking my resolution wrt commit 70fd76140a6c
> ("block,fs: use REQ_* flags directly")?

Thank you, and the resolution looks good to me as well.

Thanks,

> 
>                      Linus

^ permalink raw reply

* Re: [v2] net:ethernet:cavium:octeon:octeon_mgmt: Handle return NULL error from devm_ioremap
From: arvind Yadav @ 2016-12-14 18:39 UTC (permalink / raw)
  To: David Daney, peter.chen, fw, david.daney; +Cc: netdev, linux-kernel
In-Reply-To: <f113cad3-e368-679a-c56b-3c8c57e1a07b@caviumnetworks.com>

Hi David,

I have given my comments inline below.

Thanks
Arvind

On Wednesday 14 December 2016 11:44 PM, David Daney wrote:
> On 12/14/2016 10:06 AM, arvind Yadav wrote:
>> Yes, I have seen this error. We have a device with very less memory.
>> Basically it's OMAP2 board. We have to port Android L on this.
>> It's has 3.10 kernel version. In this device, we were getting Page
>> allocation failure.
>
> This makes absolutely no sense to me.  OCTEON is a mips64 SoC with a 
> ton of memory where ioremap can never fail, and it doesn't run 
> Android, and you are talking about OMAP2.
           -I just gave that as an example of where I have seen an ioremap
failure; please don't relate the two. I know it will not fail here now, but
ioremap returns NULL on failure, and we should catch that error. Other MIPS
SoC drivers have the same check. It is just a check that will not impact any
functionality or performance of this driver, and it avoids a NULL pointer
dereference. If a function can return an error, we should catch it.
>
> Q1: Have you observed a failure on the device for which you are 
> modifying the driver?
          -No, I did not observe this error.
>
> Q2: Have you tested the patch on hardware that uses the driver you are 
> modifying by running network traffic through the Ethernet interface 
> this driver controls?
         -Right now we cannot test this kind of failure.
>
> If you cannot answer yes to both of those questions, then you should 
> probably note in the changelog that the patch is untested.
>

> David.
>
>
>> Vmalloc size was not enough to run all application. So we have decide to
>> increase vmalloc reserve space. once we increases Vmalloc space.
>> We start getting ioremap falilure. Kernel is getting NULL-pointer
>> dereference error.
>>
>> Here, It's just check to avoid any kernel crash because of ioremap 
>> failure.
>> We can keep this check to avoid this kind of scenario.
>>
>> Thanks
>> -Arvind
>>
>>
>> On Wednesday 14 December 2016 11:02 PM, David Daney wrote:
>>> On 12/14/2016 08:25 AM, Arvind Yadav wrote:
>>>> Here, If devm_ioremap will fail. It will return NULL.
>>>> Kernel can run into a NULL-pointer dereference.
>>>> This error check will avoid NULL pointer dereference.
>>>>
>>> i
>>> Have you ever seen this failure in the wild?
>>>
>>> How was the patch tested?
>>>
>>> Thanks,
>>> David Daney
>>>
>>>
>>>> Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
>>>> ---
>>>>  drivers/net/ethernet/cavium/octeon/octeon_mgmt.c | 6 ++++++
>>>>  1 file changed, 6 insertions(+)
>>>>
>>>> diff --git a/drivers/net/ethernet/cavium/octeon/octeon_mgmt.c
>>>> b/drivers/net/ethernet/cavium/octeon/octeon_mgmt.c
>>>> index 4ab404f..33c2fec 100644
>>>> --- a/drivers/net/ethernet/cavium/octeon/octeon_mgmt.c
>>>> +++ b/drivers/net/ethernet/cavium/octeon/octeon_mgmt.c
>>>> @@ -1479,6 +1479,12 @@ static int octeon_mgmt_probe(struct
>>>> platform_device *pdev)
>>>>      p->agl = (u64)devm_ioremap(&pdev->dev, p->agl_phys, p->agl_size);
>>>>      p->agl_prt_ctl = (u64)devm_ioremap(&pdev->dev, 
>>>> p->agl_prt_ctl_phys,
>>>>                         p->agl_prt_ctl_size);
>>>> +    if (!p->mix || !p->agl || !p->agl_prt_ctl) {
>>>> +        dev_err(&pdev->dev, "failed to map I/O memory\n");
>>>> +        result = -ENOMEM;
>>>> +        goto err;
>>>> +    }
>>>> +
>>>>      spin_lock_init(&p->lock);
>>>>
>>>>      skb_queue_head_init(&p->tx_list);
>>>>
>>>
>>

^ permalink raw reply

* Re: [RFC] perf/x86/intel: Account interrupts for PEBS errors
From: Peter Zijlstra @ 2016-12-14 18:07 UTC (permalink / raw)
  To: Jiri Olsa; +Cc: Andi Kleen, lkml, Alexander Shishkin, Vince Weaver, Ingo Molnar
In-Reply-To: <20161214165036.GB9180@krava>

On Wed, Dec 14, 2016 at 05:50:36PM +0100, Jiri Olsa wrote:
> 
> I also fail to reproduce on other than snb_x (model 45) server

Reproduces on my ivb-ep (model 62) as well.

> thoughts?

cute find :-)

> +++ b/arch/x86/events/intel/ds.c
> @@ -1389,9 +1389,13 @@ static void intel_pmu_drain_pebs_nhm(struct pt_regs *iregs)
>  			continue;
>  
>  		/* log dropped samples number */
> -		if (error[bit])
> +		if (error[bit]) {
>  			perf_log_lost_samples(event, error[bit]);
>  
> +			if (perf_event_account_interrupt(event, 1))

Seems a bit daft to expose the .throttle argument, since that would be
the only point of calling this.




> +static int __perf_event_overflow(struct perf_event *event,
> +				   int throttle, struct perf_sample_data *data,
> +				   struct pt_regs *regs)
> +{
> +	int events = atomic_read(&event->event_limit);
> +	struct hw_perf_event *hwc = &event->hw;
> +	int ret = 0;
> +
> +	/*
> +	 * Non-sampling counters might still use the PMI to fold short
> +	 * hardware counters, ignore those.
> +	 */
> +	if (unlikely(!is_sampling_event(event)))
> +		return 0;
> +
> +	ret = perf_event_account_interrupt(event, throttle);
> +
>  	if (event->attr.freq) {
>  		u64 now = perf_clock();
>  		s64 delta = now - hwc->freq_time_stamp;

Arguably, everything in __perf_event_overflow() except for calling of
->overflow_handler() should be done I think.

^ permalink raw reply

* RE: [PATCH] perf tools: ignore zombie process for user profile
From: Liang, Kan @ 2016-12-14 18:26 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: acme@kernel.org, linux-kernel@vger.kernel.org, mingo@redhat.com,
	peterz@infradead.org, jolsa@kernel.org, Hunter, Adrian,
	andi@firstfloor.org
In-Reply-To: <20161214175911.GA14085@krava>



> 
> On Wed, Dec 14, 2016 at 12:48:05PM -0500, kan.liang@intel.com wrote:
> > From: Kan Liang <kan.liang@intel.com>
> >
> > If user has zombie process, the perf record -u will error out.
> > Here is an example.
> >  $ ./testd &
> >  [1] 23796
> >  $ sudo perf record -e cycles -u kan
> >  Error:
> >  The sys_perf_event_open() syscall returned with 3 (No such process)
> > for  event (cycles).
> >  /bin/dmesg may provide additional information.
> >  No CONFIG_PERF_EVENTS=y kernel support configured?
> >
> > The source code of testd is as below.
> >  int main() {
> >
> > 	if (fork())
> > 	{
> > 		while (1);
> > 	}
> > 	return 0;
> >  }
> >
> > Zombie process is dead process. It is meaningless to profile it.
> > It's better to ignore it for user profile.
> 
> I recently posted different patch for same issue:
>   http://marc.info/?l=linux-kernel&m=148153895827359&w=2

The change below confuses me.
+	/* The system wide setup does not work with threads. */
+	if (!evsel->system_wide)
+		return false;
It looks like the comment is inconsistent with the code.


Your original patch does not fix the issue for me.
But if I change the above code as below, the issue is fixed.
	if (evsel->system_wide)
		return false;
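
For reference, the zombie state that testd leaves behind can be reproduced
without the busy loop. This is only a minimal sketch, assuming Linux/POSIX
waitid() with WNOWAIT; zombie_visible_until_reaped() is a name made up for
this example and has nothing to do with the perf code.

```c
#define _DEFAULT_SOURCE
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Returns 1 if an exited but unreaped child (a zombie) still owns its PID
 * and the PID goes away once the parent reaps it, 0 if not, and -1 on a
 * setup error.
 */
int zombie_visible_until_reaped(void)
{
	siginfo_t info;
	int seen_as_zombie, gone_after_reap;
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0)
		_exit(0);	/* the child dies at once */

	/* Wait for the child to exit but leave it unreaped (WNOWAIT). */
	if (waitid(P_PID, (id_t)pid, &info, WEXITED | WNOWAIT) < 0)
		return -1;

	/* Signal 0 only checks that the PID exists; true for a zombie. */
	seen_as_zombie = (kill(pid, 0) == 0);

	/* Reap the child; only now is the PID released. */
	if (waitpid(pid, NULL, 0) != pid)
		return -1;
	gone_after_reap = (kill(pid, 0) == -1 && errno == ESRCH);

	return seen_as_zombie && gone_after_reap;
}
```

The point is that the dead child's PID stays in the process table until
the parent reaps it. In testd the parent spins forever and never does.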

Thanks,
Kan

^ permalink raw reply

* [GIT PULL] Audit patches for v4.10
From: Paul Moore @ 2016-12-14 18:27 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-audit, linux-kernel

Hi Linus,

After the small number of patches for v4.9, we've got a much bigger pile for 
v4.10.

The bulk of these patches involve a rework of the audit backlog queue to 
enable us to move the netlink multicasting out of the task/thread that 
generates the audit record and into the kernel thread that emits the record 
(just like we do for the audit unicast to auditd).  While we were playing 
with the backlog queue(s) we fixed a number of other little problems with 
the code, and from all the testing so far things look to be in much better 
shape now.  Doing this also allowed us to re-enable disabling IRQs for some 
netns operations ("netns: avoid disabling irq for netns id").  The remaining 
patches fix some small problems that are well documented in the commit 
descriptions, as well as adding session ID filtering support.

You will likely hit two merge conflicts, one in net/core/net_namespace.c and 
one in include/uapi/linux/audit.h, both are easily resolved so I won't 
bother you with that here.  If you have questions, you know how to find me.

Thanks,
-Paul

---
The following changes since commit c8d2bc9bc39ebea8437fd974fdbc21847bb897a3:

  Linux 4.8 (2016-10-02 16:24:33 -0700)

are available in the git repository at:

  git://git.infradead.org/users/pcmoore/audit stable-4.10

for you to fetch changes up to 533c7b69c764ad5febb3e716899f43a75564fcab:

  audit: use proper refcount locking on audit_sock
         (2016-12-14 13:06:04 -0500)

----------------------------------------------------------------
Alexey Dobriyan (1):
      audit: less stack usage for /proc/*/loginuid

Paul Moore (9):
      audit: fixup audit_init()
      audit: queue netlink multicast sends just like we do for unicast sends
      audit: rename the queues and kauditd related functions
      audit: rework the audit queue handling
      audit: rework audit_log_start()
      audit: wake up kauditd_thread after auditd registers
      audit: handle a clean auditd shutdown with grace
      audit: don't ever sleep on a command record/message
      netns: avoid disabling irq for netns id

Richard Guy Briggs (5):
      audit: tame initialization warning len_abuf in audit_log_execve_info
      audit: skip sessionid sentinel value when auto-incrementing
      audit: add support for session ID user filter
      audit: move kaudit thread start from auditd registration to
             kaudit init (#2)
      audit: use proper refcount locking on audit_sock

Steve Grubb (1):
      audit: fix formatting of AUDIT_CONFIG_CHANGE events

 fs/proc/base.c             |   2 +-
 include/uapi/linux/audit.h |   5 +-
 kernel/audit.c             | 532 ++++++++++++++++++++++++---------------
 kernel/audit_fsnotify.c    |   5 +-
 kernel/audit_tree.c        |   3 +-
 kernel/audit_watch.c       |   5 +-
 kernel/auditfilter.c       |   5 +-
 kernel/auditsc.c           |  12 +-
 net/core/net_namespace.c   |  35 ++-
 9 files changed, 361 insertions(+), 243 deletions(-)

-- 
paul moore
security @ redhat

^ permalink raw reply

* Re: [PATCH v2] infiniband: remove WARN that is not kernel bug
From: Leon Romanovsky @ 2016-12-14 18:27 UTC (permalink / raw)
  To: Doug Ledford
  Cc: Jason Gunthorpe, Dmitry Vyukov, syzkaller, Valdis.Kletnieks,
	sean.hefty, Hal Rosenstock, linux-rdma, LKML
In-Reply-To: <b5343f5e-68a8-5a01-5fa9-04e9181e4082@redhat.com>

[-- Attachment #1: Type: text/plain, Size: 848 bytes --]

On Wed, Dec 14, 2016 at 01:16:45PM -0500, Doug Ledford wrote:
> On 11/21/2016 12:38 PM, Leon Romanovsky wrote:
> > On Mon, Nov 21, 2016 at 09:52:53AM -0700, Jason Gunthorpe wrote:
> >> On Mon, Nov 21, 2016 at 02:14:08PM +0200, Leon Romanovsky wrote:
> >>>>
> >>>> In ib_ucm_write function there is a wrong prefix:
> >>>>
> >>>> + pr_err_once("ucm_write: process %d (%s) tried to do something hinky\n",
> >>>
> >>> I did it intentionally to have the same errors for all flows.
> >>
> >> Lets actually use a good message too please?
> >>
> >>  pr_err_once("ucm_write: process %d (%s) changed security contexts after opening FD, this is not allowed.\n",
> >>
> >> Jason
>
> I applied Leon's reworked version of this patch, thanks.

Thanks Doug,
I already forgot about it :)

>
> --
> Doug Ledford <dledford@redhat.com>
>     GPG Key ID: 0E572FDD
>




[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply

* Re: [PATCH] Input: evdev: fix queueing of SYN_DROPPED event for EVIOCG[type] IOCTL case
From: Aniroop Mathur @ 2016-12-14 18:24 UTC (permalink / raw)
  To: Dmitry Torokhov
  Cc: linux-input@vger.kernel.org, linux-kernel@vger.kernel.org,
	Aniroop Mathur, SAMUEL SEQUEIRA, Rahul Mahale
In-Reply-To: <1480018303-4220-1-git-send-email-a.mathur@samsung.com>

Hello Mr. Torokhov,

Could you kindly provide an update on this patch?
Thanks!

Best Regards,
Aniroop Mathur


On Fri, Nov 25, 2016 at 1:41 AM, Aniroop Mathur <a.mathur@samsung.com> wrote:
> Currently, when EVIOCG[type] ioctl call is issued and bits_to_user fails,
> then SYN_DROPPED event is inserted in the event queue always.
>
> However, it is not compulsory that some events are flushed out on every
> EVIOCG[type] ioctl call like in case of empty event queue and in case when
> EVIOCG[type] ioctl is issued for say A type of events but event queue does
> not have any A type of events but some other type of events.
>
> Therefore, insert SYN_DROPPED event only when some events have been flushed
> out from event queue plus bits_to_user fails.
>
> Signed-off-by: Aniroop Mathur <a.mathur@samsung.com>
> ---
>  drivers/input/evdev.c | 12 +++++++++---
>  1 file changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/input/evdev.c b/drivers/input/evdev.c
> index e9ae3d5..f8b295e 100644
> --- a/drivers/input/evdev.c
> +++ b/drivers/input/evdev.c
> @@ -108,9 +108,11 @@ static bool __evdev_is_filtered(struct evdev_client *client,
>  }
>
>  /* flush queued events of type @type, caller must hold client->buffer_lock */
> -static void __evdev_flush_queue(struct evdev_client *client, unsigned int type)
> +static unsigned int __evdev_flush_queue(struct evdev_client *client,
> +                                       unsigned int type)
>  {
>         unsigned int i, head, num;
> +       unsigned int drop_count = 0;
>         unsigned int mask = client->bufsize - 1;
>         bool is_report;
>         struct input_event *ev;
> @@ -129,9 +131,11 @@ static void __evdev_flush_queue(struct evdev_client *client, unsigned int type)
>
>                 if (ev->type == type) {
>                         /* drop matched entry */
> +                       drop_count++;
>                         continue;
>                 } else if (is_report && !num) {
>                         /* drop empty SYN_REPORT groups */
> +                       drop_count++;
>                         continue;
>                 } else if (head != i) {
>                         /* move entry to fill the gap */
> @@ -151,6 +155,7 @@ static void __evdev_flush_queue(struct evdev_client *client, unsigned int type)
>         }
>
>         client->head = head;
> +       return drop_count;
>  }
>
>  static void __evdev_queue_syn_dropped(struct evdev_client *client)
> @@ -920,6 +925,7 @@ static int evdev_handle_get_val(struct evdev_client *client,
>         int ret;
>         unsigned long *mem;
>         size_t len;
> +       unsigned int drop_count = 0;
>
>         len = BITS_TO_LONGS(maxbit) * sizeof(unsigned long);
>         mem = kmalloc(len, GFP_KERNEL);
> @@ -933,12 +939,12 @@ static int evdev_handle_get_val(struct evdev_client *client,
>
>         spin_unlock(&dev->event_lock);
>
> -       __evdev_flush_queue(client, type);
> +       drop_count = __evdev_flush_queue(client, type);
>
>         spin_unlock_irq(&client->buffer_lock);
>
>         ret = bits_to_user(mem, maxbit, maxlen, p, compat);
> -       if (ret < 0)
> +       if (ret < 0 && drop_count > 0)
>                 evdev_queue_syn_dropped(client);
>
>         kfree(mem);
> --
> 2.6.2
>
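
The rule in the quoted patch (queue SYN_DROPPED only when the flush really
removed events and the copy to user space failed) can be modelled in
userspace roughly as follows. This is a simplified sketch, not the evdev
code: the queue is a flat array rather than a ring buffer, the SYN_REPORT
group handling is left out, and struct ev, flush_type() and
need_syn_dropped() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

struct ev {
	unsigned int type;
	int value;
};

/*
 * Remove every queued event of @type, compacting the remaining entries
 * in place. Updates *count and returns the number of events dropped.
 */
size_t flush_type(struct ev *q, size_t *count, unsigned int type)
{
	size_t i, kept = 0, dropped = 0;

	assert(q && count);
	for (i = 0; i < *count; i++) {
		if (q[i].type == type) {
			dropped++;	/* drop matched entry */
			continue;
		}
		q[kept++] = q[i];	/* move entry to fill the gap */
	}
	*count = kept;
	return dropped;
}

/* Mirrors the patched condition: ret < 0 && drop_count > 0. */
int need_syn_dropped(int copy_ret, size_t dropped)
{
	return copy_ret < 0 && dropped > 0;
}
```

With an empty queue, or a queue holding only other event types,
flush_type() returns 0 and a failed copy no longer produces a SYN_DROPPED.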

^ permalink raw reply

