public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [RFC] Unnecessary work and noise from mce code in suspend/resume path
@ 2014-05-14 19:02 Luck, Tony
  2014-05-23 10:54 ` Borislav Petkov
  0 siblings, 1 reply; 3+ messages in thread
From: Luck, Tony @ 2014-05-14 19:02 UTC (permalink / raw)
  To: linux-kernel; +Cc: Borislav Petkov, Chen Gong, Aiden Park

When we suspend a laptop we offline all but one processor. But
the mce code registers on a notify chain so it can clean up
some sysfs entries. Part of that code calls device_unregister()
which will fire kobject_uevent() which might wake up some user
code that is watching for such things.

Patch below from Aiden Park works around the issue by avoiding 
the device_unregister()/device_register() by just keeping track
of the original device registrations.

1) Is there a better way to avoid the kobject_uevent()?
2) What user code actually uses all these sysfs files?
   Some test code in mcelog(8) uses:
        /sys/devices/system/machinecheck/machinecheck0/trigger
   Some code in mce-test package adjusts
        /sys/devices/system/machinecheck/machinecheck0/tolerant
   But neither of these make use of all the per-cpu instances
   created/destroyed on every suspend/resume (or logical processor
   offline).  Should we just move these out of per-cpu directories
   and scrap this whole block of code?
3) checkpatch isn't happy about use of NR_CPUS I suspect the
   usage here warrants for_each_possible_cpu()

-Tony Luck
---

diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
old mode 100644
new mode 100755
index 68317c80de7f..e49302f24ce5
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -113,6 +113,8 @@ static DEFINE_PER_CPU(struct work_struct, mce_work);
 
 static void (*quirk_no_way_out)(int bank, struct mce *m, struct pt_regs *regs);
 
+static struct device *mce_devices[NR_CPUS];
+
 /*
  * CPU/chipset specific EDAC code can register a notifier call here to print
  * MCE errors in a human-readable form.
@@ -2060,7 +2062,18 @@ static int mce_syscore_suspend(void)
 
 static void mce_syscore_shutdown(void)
 {
+	int i = 0;
+	struct device *dev;
+
 	mce_disable_error_reporting();
+
+	for (i = 0; i < NR_CPUS; i++) {
+		dev = mce_devices[i];
+		if (dev) {
+			device_unregister(dev);
+			mce_devices[i] = NULL;
+		}
+	}
 }
 
 /*
@@ -2272,23 +2285,27 @@ static void mce_device_release(struct device *dev)
 static int mce_device_create(unsigned int cpu)
 {
 	struct device *dev;
-	int err;
+	int err = 0;
 	int i, j;
 
 	if (!mce_available(&boot_cpu_data))
 		return -EIO;
 
-	dev = kzalloc(sizeof *dev, GFP_KERNEL);
-	if (!dev)
-		return -ENOMEM;
-	dev->id  = cpu;
-	dev->bus = &mce_subsys;
-	dev->release = &mce_device_release;
+	dev = mce_devices[cpu];
+	if (!dev) {
+		dev = kzalloc(sizeof *dev, GFP_KERNEL);
+		if (!dev)
+			return -ENOMEM;
+		dev->id  = cpu;
+		dev->bus = &mce_subsys;
+		dev->release = &mce_device_release;
 
-	err = device_register(dev);
-	if (err) {
-		put_device(dev);
-		return err;
+		err = device_register(dev);
+		if (err) {
+			put_device(dev);
+			return err;
+		}
+		mce_devices[cpu] = dev;
 	}
 
 	for (i = 0; mce_device_attrs[i]; i++) {
@@ -2313,6 +2330,7 @@ error:
 		device_remove_file(dev, mce_device_attrs[i]);
 
 	device_unregister(dev);
+	mce_devices[cpu] = NULL;
 
 	return err;
 }
@@ -2331,7 +2349,6 @@ static void mce_device_remove(unsigned int cpu)
 	for (i = 0; i < mca_cfg.banks; i++)
 		device_remove_file(dev, &mce_banks[i].attr);
 
-	device_unregister(dev);
 	cpumask_clear_cpu(cpu, mce_device_initialized);
 	per_cpu(mce_device, cpu) = NULL;
 }
@@ -2450,6 +2467,7 @@ static __init int mcheck_init_device(void)
 
 	cpu_notifier_register_begin();
 	for_each_online_cpu(i) {
+		mce_devices[i] = NULL;
 		err = mce_device_create(i);
 		if (err) {
 			cpu_notifier_register_done();

^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [RFC] Unnecessary work and noise from mce code in suspend/resume path
  2014-05-14 19:02 [RFC] Unnecessary work and noise from mce code in suspend/resume path Luck, Tony
@ 2014-05-23 10:54 ` Borislav Petkov
  2014-05-23 18:53   ` Luck, Tony
  0 siblings, 1 reply; 3+ messages in thread
From: Borislav Petkov @ 2014-05-23 10:54 UTC (permalink / raw)
  To: Luck, Tony; +Cc: linux-kernel, Chen Gong, Aiden Park

On Wed, May 14, 2014 at 12:02:32PM -0700, Luck, Tony wrote:
> When we suspend a laptop we offline all but one processor. But
> the mce code registers on a notify chain so it can clean up
> some sysfs entries. Part of that code calls device_unregister()
> which will fire kobject_uevent() which might wake up some user
> code that is watching for such things.

The issue being?

It is not clear from the text what actually is the problem.

> Patch below from Aiden Park works around the issue by avoiding 
> the device_unregister()/device_register() by just keeping track
> of the original device registrations.
> 
> 1) Is there a better way to avoid the kobject_uevent()?
> 2) What user code actually uses all these sysfs files?
>    Some test code in mcelog(8) uses:
>         /sys/devices/system/machinecheck/machinecheck0/trigger
>    Some code in mce-test package adjusts
>         /sys/devices/system/machinecheck/machinecheck0/tolerant
>    But neither of these make use of all the per-cpu instances
>    created/destroyed on every suspend/resume (or logical processor
>    offline).  Should we just move these out of per-cpu directories
>    and scrap this whole block of code?

Oh, I'd love to.

Especially "trigger" - it wants to be in debugfs anyway. The "tolerant"
deal we probably need to discuss a bit more. Exposing MCE tolerance
levels even to root is not such a good idea and this tolerant thing is
primarily used in testing, AFAICT, so that the box doesn't actually
panic when you inject errors and so on...

> 3) checkpatch isn't happy about use of NR_CPUS I suspect the
>    usage here warrants for_each_possible_cpu()

Right.

> -Tony Luck
> ---
> 
> diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
> old mode 100644
> new mode 100755

Something decided to make mce.c an executable script :-)

> index 68317c80de7f..e49302f24ce5
> --- a/arch/x86/kernel/cpu/mcheck/mce.c
> +++ b/arch/x86/kernel/cpu/mcheck/mce.c
> @@ -113,6 +113,8 @@ static DEFINE_PER_CPU(struct work_struct, mce_work);
>  
>  static void (*quirk_no_way_out)(int bank, struct mce *m, struct pt_regs *regs);
>  
> +static struct device *mce_devices[NR_CPUS];
> +
>  /*
>   * CPU/chipset specific EDAC code can register a notifier call here to print
>   * MCE errors in a human-readable form.
> @@ -2060,7 +2062,18 @@ static int mce_syscore_suspend(void)
>  
>  static void mce_syscore_shutdown(void)
>  {
> +	int i = 0;
> +	struct device *dev;
> +
>  	mce_disable_error_reporting();
> +
> +	for (i = 0; i < NR_CPUS; i++) {

WARNING: usage of NR_CPUS is often wrong - consider using cpu_possible(), num_possible_cpus(), for_each_possible_cpu(), etc
#64: FILE: arch/x86/kernel/cpu/mcheck/mce.c:2070:
+       for (i = 0; i < NR_CPUS; i++) {


> +		dev = mce_devices[i];
> +		if (dev) {
> +			device_unregister(dev);
> +			mce_devices[i] = NULL;
> +		}
> +	}
>  }
>  
>  /*
> @@ -2272,23 +2285,27 @@ static void mce_device_release(struct device *dev)
>  static int mce_device_create(unsigned int cpu)
>  {
>  	struct device *dev;
> -	int err;
> +	int err = 0;
>  	int i, j;
>  
>  	if (!mce_available(&boot_cpu_data))
>  		return -EIO;
>  
> -	dev = kzalloc(sizeof *dev, GFP_KERNEL);
> -	if (!dev)
> -		return -ENOMEM;
> -	dev->id  = cpu;
> -	dev->bus = &mce_subsys;
> -	dev->release = &mce_device_release;
> +	dev = mce_devices[cpu];
> +	if (!dev) {
> +		dev = kzalloc(sizeof *dev, GFP_KERNEL);

Please run through checkpatch:

WARNING: sizeof *dev should be sizeof(*dev)
#93: FILE: arch/x86/kernel/cpu/mcheck/mce.c:2296:
+               dev = kzalloc(sizeof *dev, GFP_KERNEL);

> +		if (!dev)
> +			return -ENOMEM;
> +		dev->id  = cpu;
> +		dev->bus = &mce_subsys;
> +		dev->release = &mce_device_release;
>  
> -	err = device_register(dev);
> -	if (err) {
> -		put_device(dev);
> -		return err;
> +		err = device_register(dev);
> +		if (err) {
> +			put_device(dev);
> +			return err;
> +		}
> +		mce_devices[cpu] = dev;

Ok, in any case, AFAIU it is a mechanism to cache those mce_devices and
not generate them every time we come from suspend.

And we have those mce_device per cpu vars which we use too. So this one
adds another array of mce_devices.

Dumb question: would it be possible to save us all that init/teardown
everytime and simply toggle their visibility in sysfs instead? I mean,
maybe play with device_create_file/device_remove_file only and hand it
down those struct device *dev things we have cached per-cpu.

This will definitely save us the kobject_uevent() calls.

But it could be that I'm missing something obvious...

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

^ permalink raw reply	[flat|nested] 3+ messages in thread

* RE: [RFC] Unnecessary work and noise from mce code in suspend/resume path
  2014-05-23 10:54 ` Borislav Petkov
@ 2014-05-23 18:53   ` Luck, Tony
  0 siblings, 0 replies; 3+ messages in thread
From: Luck, Tony @ 2014-05-23 18:53 UTC (permalink / raw)
  To: Borislav Petkov; +Cc: linux-kernel@vger.kernel.org, Chen Gong, Park, Aiden

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="utf-8", Size: 818 bytes --]

>> When we suspend a laptop we offline all but one processor. But
>> the mce code registers on a notify chain so it can clean up
>> some sysfs entries. Part of that code calls device_unregister()
>> which will fire kobject_uevent() which might wake up some user
>> code that is watching for such things.
>
> The issue being?
>
> It is not clear from the text what actually is the problem.

We are trying to suspend - and yet we are now having udev
woken up to run some scripts ... that will burn power ... which
is what we are trying to avoid when we shut the lid of the
laptop.  There might be some Android-isms involved too
that I don't understand.

-Tony
ÿôèº{.nÇ+‰·Ÿ®‰­†+%ŠËÿ±éݶ\x17¥Šwÿº{.nÇ+‰·¥Š{±þG«éÿŠ{ayº\x1dʇڙë,j\a­¢f£¢·hšïêÿ‘êçz_è®\x03(­éšŽŠÝ¢j"ú\x1a¶^[m§ÿÿ¾\a«þG«éÿ¢¸?™¨è­Ú&£ø§~á¶iO•æ¬z·švØ^\x14\x04\x1a¶^[m§ÿÿÃ\fÿ¶ìÿ¢¸?–I¥

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2014-05-23 18:53 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-05-14 19:02 [RFC] Unnecessary work and noise from mce code in suspend/resume path Luck, Tony
2014-05-23 10:54 ` Borislav Petkov
2014-05-23 18:53   ` Luck, Tony

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox