* mmiotracer hangs the system
@ 2016-08-02 10:08 Andy Shevchenko
  2016-08-02 10:36 ` Andy Shevchenko
  2016-08-02 15:05 ` Andy Shevchenko
  0 siblings, 2 replies; 20+ messages in thread
From: Andy Shevchenko @ 2016-08-02 10:08 UTC
To: linux-kernel@vger.kernel.org; +Cc: Paul E. McKenney, Steven Rostedt

Hi!

I'm trying to use mmio tracer with recent kernels (in this particular
case today's linux-next).

# mount -t debugfs none /sys/kernel/debug/

# echo mmiotrace > /sys/kernel/debug/tracing/current_tracer
[ 869.673145] in mmio_trace_init
[ 869.714170] mmiotrace: Disabling non-boot CPUs...
[ 869.729938] Cannot set affinity for irq 169
[ 869.735765] smpboot: CPU 1 is now offline
[ 869.746662] mmiotrace: CPU1 is down.
[ 869.757896] smpboot: CPU 2 is now offline
[ 869.773572] mmiotrace: CPU2 is down.
[ 869.781768] smpboot: CPU 3 is now offline
[ 869.789495] mmiotrace: CPU3 is down.
[ 869.793515] mmiotrace: enabled.

# echo 1 > /sys/kernel/debug/tracing/tracing_on
[ 869.802634] in mmio_trace_start

# echo 0000:00:18.1 > /sys/bus/pci/drivers/intel-lpss/unbind
[ 883.625744] mmiotrace: Unmapping ffffc90000854000.
[ 883.633925] mmiotrace: Unmapping ffffc90000852000.
[ 883.644580] mmiotrace: Unmapping ffffc90000850000.

# echo 0000:00:18.1 > /sys/bus/pci/drivers/intel-lpss/bind
[ 889.525125] mmiotrace: ioremap_*(0x9242e200, 0x100) = ffffc90000856200

[ 910.533911] INFO: rcu_sched detected stalls on CPUs/tasks:
[ 910.540052] (detected by 0, t=21002 jiffies, g=348, c=347, q=0)
[ 910.546790] All QSes seen, last rcu_sched kthread activity 21002 (4295577777-4295556775), jiffies_till_next_fqs=3, root ->qsmask 0x0
[ 910.560142] sh R running task 12336 1289 1 0x20020008
[ 910.568055] ffffffff81e422c0 ffff88017fc03e20 ffffffff81085296 ffff88017fc18500
[ 910.576366] ffffffff81e422c0 ffff88017fc03e88 ffffffff810b547d 0000000000000000
[ 910.584675] ffffffff810f67ec 000000000000015c 0000000000000000 000000000000015c
[ 910.592980] Call Trace:
[ 910.595715] <IRQ> [<ffffffff81085296>] sched_show_task+0xb6/0x110
[ 910.602748] [<ffffffff810b547d>] rcu_check_callbacks+0x84d/0x850
[ 910.609573] [<ffffffff810f67ec>] ? __acct_update_integrals+0x2c/0xb0
[ 910.616788] [<ffffffff810c9150>] ? tick_sched_do_timer+0x30/0x30
[ 910.623613] [<ffffffff810ba34a>] update_process_times+0x2a/0x50
[ 910.630343] [<ffffffff810c8bb1>] tick_sched_handle.isra.12+0x31/0x40
[ 910.637560] [<ffffffff810c9188>] tick_sched_timer+0x38/0x70
[ 910.643902] [<ffffffff810bacba>] __hrtimer_run_queues+0xda/0x250
[ 910.650734] [<ffffffff810bb3f3>] hrtimer_interrupt+0xa3/0x190
[ 910.657272] [<ffffffff8103ead3>] local_apic_timer_interrupt+0x33/0x50
[ 910.664584] [<ffffffff8103f588>] smp_apic_timer_interrupt+0x38/0x50
[ 910.671705] [<ffffffff8190dd6f>] apic_timer_interrupt+0x7f/0x90
[ 910.678427] <EOI> [<ffffffff814a717f>] ? intel_lpss_probe+0x7f/0x5f0
[ 910.685739] [<ffffffff814a716b>] ? intel_lpss_probe+0x6b/0x5f0
[ 910.692364] [<ffffffff8170e5df>] ? raw_pci_write+0x1f/0x40
[ 910.698610] [<ffffffff8136e825>] ? pci_bus_write_config_byte+0x55/0x70
[ 910.706022] [<ffffffff813781b1>] ? pcibios_set_master+0x51/0x80
[ 910.712753] [<ffffffff814a7836>] intel_lpss_pci_probe+0x76/0xb0
[ 910.719479] [<ffffffff813797e0>] local_pci_probe+0x40/0xa0
[ 910.725719] [<ffffffff811fce44>] ? sysfs_do_create_link_sd.isra.2+0x64/0xa0
[ 910.733617] [<ffffffff8137ab46>] pci_device_probe+0xd6/0x120
[ 910.740058] [<ffffffff8148679f>] driver_probe_device+0x21f/0x430
[ 910.746883] [<ffffffff81484c4f>] bind_store+0x10f/0x160
[ 910.752836] [<ffffffff81484150>] drv_attr_store+0x20/0x30
[ 910.758983] [<ffffffff811fc312>] sysfs_kf_write+0x32/0x40
[ 910.765129] [<ffffffff811fb863>] kernfs_fop_write+0x113/0x190
[ 910.771663] [<ffffffff81185343>] __vfs_write+0x23/0x120
[ 910.777607] [<ffffffff812cfd46>] ? security_file_permission+0x36/0xb0
[ 910.784918] [<ffffffff810998dd>] ? percpu_down_read+0xd/0x50
[ 910.791351] [<ffffffff81186403>] vfs_write+0xb3/0x1b0
[ 910.797103] [<ffffffff81187711>] SyS_write+0x41/0xa0
[ 910.802758] [<ffffffff81002b2e>] do_int80_syscall_32+0x4e/0xa0
[ 910.809389] [<ffffffff8190f2aa>] entry_INT80_compat+0x2a/0x40
[ 910.815925] rcu_sched kthread starved for 21002 jiffies! g348 c347 f0x2 RCU_GP_WAIT_FQS(3) ->state=0x0
[ 910.826351] rcu_sched R running task 14896 7 2 0x00000000
[ 910.834260] ffff88017ab7bd88 ffff880179c17080 ffff88017ab34b00 0000000000000000
[ 910.842550] ffff88017ab7c000 ffff88017ab7bdd0 00000000ffffffff 0000000000000000
[ 910.850836] ffff88017fc0ec40 ffff88017ab7bda0 ffffffff81909370 000000010008feaa
[ 910.859118] Call Trace:
[ 910.861852] [<ffffffff81909370>] schedule+0x30/0x80
[ 910.867408] [<ffffffff8190c3b9>] schedule_timeout+0x209/0x410
[ 910.873938] [<ffffffff810b8760>] ? init_timer_key+0xa0/0xa0
[ 910.880267] [<ffffffff81097aab>] ? prepare_to_swait+0x5b/0x80
[ 910.886793] [<ffffffff810b3e09>] rcu_gp_kthread+0x479/0x800
[ 910.893124] [<ffffffff810b3990>] ? call_rcu_sched+0x20/0x20
[ 910.899458] [<ffffffff81079f54>] kthread+0xc4/0xe0
[ 910.904917] [<ffffffff8190d3cf>] ret_from_fork+0x1f/0x40
[ 910.910961] [<ffffffff81079e90>] ? kthread_worker_fn+0x160/0x160

Is it a bug in the driver or somewhere else?

--
With Best Regards,
Andy Shevchenko

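The reproduction steps above can be collected into a small shell sketch. The helper names (write_file, mmiotrace_rebind), the TRACING variable and the DRY_RUN switch are assumptions made for illustration and do not come from the report; the sketch also assumes debugfs is already mounted at /sys/kernel/debug.

```shell
#!/bin/sh
# Sketch of the reproduction steps from the report.
# Hypothetical helpers: write_file, mmiotrace_rebind, DRY_RUN, TRACING.

TRACING=${TRACING:-/sys/kernel/debug/tracing}

# Write a value to a sysfs/debugfs file, or only print the equivalent
# command when DRY_RUN is set (handy for checking paths first).
write_file() {
    if [ -n "${DRY_RUN:-}" ]; then
        printf 'echo %s > %s\n' "$1" "$2"
    else
        printf '%s\n' "$1" > "$2"
    fi
}

# Usage: mmiotrace_rebind <pci-address> <driver-name>
mmiotrace_rebind() {
    dev=$1
    drv=/sys/bus/pci/drivers/$2

    write_file mmiotrace "$TRACING/current_tracer"  # log shows non-boot CPUs going offline here
    write_file 1 "$TRACING/tracing_on"
    write_file "$dev" "$drv/unbind"                 # log shows "mmiotrace: Unmapping ..." here
    write_file "$dev" "$drv/bind"                   # the stall was reported during this probe
}
```

With DRY_RUN=1 set, `mmiotrace_rebind 0000:00:18.1 intel-lpss` only prints the four writes instead of performing them, which avoids hanging the machine while checking the paths.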
* Re: mmiotracer hangs the system
From: Andy Shevchenko @ 2016-08-02 10:36 UTC
To: linux-kernel@vger.kernel.org; +Cc: Paul E. McKenney, Steven Rostedt

On Tue, Aug 2, 2016 at 1:08 PM, Andy Shevchenko
<andy.shevchenko@gmail.com> wrote:
> Hi!
>
> I'm trying to use mmio tracer with recent kernels (in this particular
> case today's linux-next).

Additional info.

I took v4.4.16 and added the following to the default x86_64_defconfig:

+CONFIG_MMIOTRACE=y
+CONFIG_SERIAL_8250_DW=y
+CONFIG_MFD_INTEL_LPSS=y
+CONFIG_MFD_INTEL_LPSS_PCI=y

The problem still reproduces.

I can't test earlier kernels because the mentioned driver (intel-lpss)
was introduced in v4.4.

--
With Best Regards,
Andy Shevchenko

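A minimal sketch of how the four options listed above could be kept as a config fragment. The file name mmiotrace-lpss.config is hypothetical, and the merge step shown in the comments assumes a kernel source tree that ships scripts/kconfig/merge_config.sh; whether the options' dependencies are satisfied is not checked here.

```shell
# Write the four options from the message into a fragment file
# (hypothetical name, created in the current directory).
cat > mmiotrace-lpss.config <<'EOF'
CONFIG_MMIOTRACE=y
CONFIG_SERIAL_8250_DW=y
CONFIG_MFD_INTEL_LPSS=y
CONFIG_MFD_INTEL_LPSS_PCI=y
EOF

# From the top of a kernel tree (not run by this sketch):
#   make x86_64_defconfig
#   ./scripts/kconfig/merge_config.sh -m .config mmiotrace-lpss.config
#   make olddefconfig
```

Keeping the delta in a fragment makes it easy to reapply the same four options on each kernel version being tested.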
* Re: mmiotracer hangs the system
From: Andy Shevchenko @ 2016-08-02 15:05 UTC
To: linux-kernel@vger.kernel.org, Karol Herbst
Cc: Paul E. McKenney, Steven Rostedt, Ingo Molnar

+Cc: Karol, Ingo

On Tue, Aug 2, 2016 at 1:08 PM, Andy Shevchenko
<andy.shevchenko@gmail.com> wrote:
> Hi!
>
> I'm trying to use mmio tracer with recent kernels (in this particular
> case today's linux-next).

Tested on other board and found that v4.5 works while v4.5.7 doesn't.
Bisecting to

commit d62a28a60562a8ba82e67e13c268245f37e796cb
Author: Karol Herbst <nouveau@karolherbst.de>
Date: Thu Mar 3 02:03:11 2016 +0100

x86/mm/kmmio: Fix mmiotrace for hugepages

commit cfa52c0cfa4d727aa3e457bf29aeff296c528a08 upstream.

Reverting _helps_ for x86 and x86_64 builds.

>
> # mount -t debugfs none /sys/kernel/debug/
>
> # echo mmiotrace > /sys/kernel/debug/tracing/current_tracer
> [ 869.673145] in mmio_trace_init
> [ 869.714170] mmiotrace: Disabling non-boot CPUs...
> [ 869.729938] Cannot set affinity for irq 169
> [ 869.735765] smpboot: CPU 1 is now offline
> [ 869.746662] mmiotrace: CPU1 is down.
> [ 869.757896] smpboot: CPU 2 is now offline
> [ 869.773572] mmiotrace: CPU2 is down.
> [ 869.781768] smpboot: CPU 3 is now offline
> [ 869.789495] mmiotrace: CPU3 is down.
> [ 869.793515] mmiotrace: enabled.
>
> # echo 1 > /sys/kernel/debug/tracing/tracing_on
> [ 869.802634] in mmio_trace_start
>
> # echo 0000:00:18.1 > /sys/bus/pci/drivers/intel-lpss/unbind
> [ 883.625744] mmiotrace: Unmapping ffffc90000854000.
> [ 883.633925] mmiotrace: Unmapping ffffc90000852000.
> [ 883.644580] mmiotrace: Unmapping ffffc90000850000.
>
> # echo 0000:00:18.1 > /sys/bus/pci/drivers/intel-lpss/bind
> [ 889.525125] mmiotrace: ioremap_*(0x9242e200, 0x100) = ffffc90000856200
>
> [ 910.533911] INFO: rcu_sched detected stalls on CPUs/tasks:
> [ 910.540052] (detected by 0, t=21002 jiffies, g=348, c=347, q=0)
> [ 910.546790] All QSes seen, last rcu_sched kthread activity 21002
> (4295577777-4295556775), jiffies_till_next_fqs=3, root ->qsmask 0x0
> [ 910.560142] sh R running task 12336 1289 1 0x20020008
> [ 910.568055] ffffffff81e422c0 ffff88017fc03e20 ffffffff81085296
> ffff88017fc18500
> [ 910.576366] ffffffff81e422c0 ffff88017fc03e88 ffffffff810b547d
> 0000000000000000
> [ 910.584675] ffffffff810f67ec 000000000000015c 0000000000000000
> 000000000000015c
> [ 910.592980] Call Trace:
> [ 910.595715] <IRQ> [<ffffffff81085296>] sched_show_task+0xb6/0x110
> [ 910.602748] [<ffffffff810b547d>] rcu_check_callbacks+0x84d/0x850
> [ 910.609573] [<ffffffff810f67ec>] ? __acct_update_integrals+0x2c/0xb0
> [ 910.616788] [<ffffffff810c9150>] ? tick_sched_do_timer+0x30/0x30
> [ 910.623613] [<ffffffff810ba34a>] update_process_times+0x2a/0x50
> [ 910.630343] [<ffffffff810c8bb1>] tick_sched_handle.isra.12+0x31/0x40
> [ 910.637560] [<ffffffff810c9188>] tick_sched_timer+0x38/0x70
> [ 910.643902] [<ffffffff810bacba>] __hrtimer_run_queues+0xda/0x250
> [ 910.650734] [<ffffffff810bb3f3>] hrtimer_interrupt+0xa3/0x190
> [ 910.657272] [<ffffffff8103ead3>] local_apic_timer_interrupt+0x33/0x50
> [ 910.664584] [<ffffffff8103f588>] smp_apic_timer_interrupt+0x38/0x50
> [ 910.671705] [<ffffffff8190dd6f>] apic_timer_interrupt+0x7f/0x90
> [ 910.678427] <EOI> [<ffffffff814a717f>] ? intel_lpss_probe+0x7f/0x5f0
> [ 910.685739] [<ffffffff814a716b>] ? intel_lpss_probe+0x6b/0x5f0
> [ 910.692364] [<ffffffff8170e5df>] ? raw_pci_write+0x1f/0x40
> [ 910.698610] [<ffffffff8136e825>] ? pci_bus_write_config_byte+0x55/0x70
> [ 910.706022] [<ffffffff813781b1>] ? pcibios_set_master+0x51/0x80
> [ 910.712753] [<ffffffff814a7836>] intel_lpss_pci_probe+0x76/0xb0
> [ 910.719479] [<ffffffff813797e0>] local_pci_probe+0x40/0xa0
> [ 910.725719] [<ffffffff811fce44>] ? sysfs_do_create_link_sd.isra.2+0x64/0xa0
> [ 910.733617] [<ffffffff8137ab46>] pci_device_probe+0xd6/0x120
> [ 910.740058] [<ffffffff8148679f>] driver_probe_device+0x21f/0x430
> [ 910.746883] [<ffffffff81484c4f>] bind_store+0x10f/0x160
> [ 910.752836] [<ffffffff81484150>] drv_attr_store+0x20/0x30
> [ 910.758983] [<ffffffff811fc312>] sysfs_kf_write+0x32/0x40
> [ 910.765129] [<ffffffff811fb863>] kernfs_fop_write+0x113/0x190
> [ 910.771663] [<ffffffff81185343>] __vfs_write+0x23/0x120
> [ 910.777607] [<ffffffff812cfd46>] ? security_file_permission+0x36/0xb0
> [ 910.784918] [<ffffffff810998dd>] ? percpu_down_read+0xd/0x50
> [ 910.791351] [<ffffffff81186403>] vfs_write+0xb3/0x1b0
> [ 910.797103] [<ffffffff81187711>] SyS_write+0x41/0xa0
> [ 910.802758] [<ffffffff81002b2e>] do_int80_syscall_32+0x4e/0xa0
> [ 910.809389] [<ffffffff8190f2aa>] entry_INT80_compat+0x2a/0x40
> [ 910.815925] rcu_sched kthread starved for 21002 jiffies! g348 c347
> f0x2 RCU_GP_WAIT_FQS(3) ->state=0x0
> [ 910.826351] rcu_sched R running task 14896 7 2 0x00000000
> [ 910.834260] ffff88017ab7bd88 ffff880179c17080 ffff88017ab34b00
> 0000000000000000
> [ 910.842550] ffff88017ab7c000 ffff88017ab7bdd0 00000000ffffffff
> 0000000000000000
> [ 910.850836] ffff88017fc0ec40 ffff88017ab7bda0 ffffffff81909370
> 000000010008feaa
> [ 910.859118] Call Trace:
> [ 910.861852] [<ffffffff81909370>] schedule+0x30/0x80
> [ 910.867408] [<ffffffff8190c3b9>] schedule_timeout+0x209/0x410
> [ 910.873938] [<ffffffff810b8760>] ? init_timer_key+0xa0/0xa0
> [ 910.880267] [<ffffffff81097aab>] ? prepare_to_swait+0x5b/0x80
> [ 910.886793] [<ffffffff810b3e09>] rcu_gp_kthread+0x479/0x800
> [ 910.893124] [<ffffffff810b3990>] ? call_rcu_sched+0x20/0x20
> [ 910.899458] [<ffffffff81079f54>] kthread+0xc4/0xe0
> [ 910.904917] [<ffffffff8190d3cf>] ret_from_fork+0x1f/0x40
> [ 910.910961] [<ffffffff81079e90>] ? kthread_worker_fn+0x160/0x160
>
>
> Is it a bug in the driver or somewhere else?
>
> --
> With Best Regards,
> Andy Shevchenko

--
With Best Regards,
Andy Shevchenko

* Re: mmiotracer hangs the system
From: Andy Shevchenko @ 2016-08-02 15:07 UTC
To: linux-kernel@vger.kernel.org, Karol Herbst
Cc: Paul E. McKenney, Steven Rostedt, Ingo Molnar

Use another Karol's address (found in MAINTAINERS)

On Tue, Aug 2, 2016 at 6:05 PM, Andy Shevchenko
<andy.shevchenko@gmail.com> wrote:
> +Cc: Karol, Ingo
>
> On Tue, Aug 2, 2016 at 1:08 PM, Andy Shevchenko
> <andy.shevchenko@gmail.com> wrote:
>> Hi!
>>
>> I'm trying to use mmio tracer with recent kernels (in this particular
>> case today's linux-next).
>
> Tested on other board and found that v4.5 works while v4.5.7 doesn't.
> Bisecting to
>
> commit d62a28a60562a8ba82e67e13c268245f37e796cb
> Author: Karol Herbst <nouveau@karolherbst.de>
> Date: Thu Mar 3 02:03:11 2016 +0100
>
> x86/mm/kmmio: Fix mmiotrace for hugepages
>
> commit cfa52c0cfa4d727aa3e457bf29aeff296c528a08 upstream.
>
> Reverting _helps_ for x86 and x86_64 builds.
>
>>
>> # mount -t debugfs none /sys/kernel/debug/
>>
>> # echo mmiotrace > /sys/kernel/debug/tracing/current_tracer
>> [ 869.673145] in mmio_trace_init
>> [ 869.714170] mmiotrace: Disabling non-boot CPUs...
>> [ 869.729938] Cannot set affinity for irq 169
>> [ 869.735765] smpboot: CPU 1 is now offline
>> [ 869.746662] mmiotrace: CPU1 is down.
>> [ 869.757896] smpboot: CPU 2 is now offline
>> [ 869.773572] mmiotrace: CPU2 is down.
>> [ 869.781768] smpboot: CPU 3 is now offline
>> [ 869.789495] mmiotrace: CPU3 is down.
>> [ 869.793515] mmiotrace: enabled.
>>
>> # echo 1 > /sys/kernel/debug/tracing/tracing_on
>> [ 869.802634] in mmio_trace_start
>>
>> # echo 0000:00:18.1 > /sys/bus/pci/drivers/intel-lpss/unbind
>> [ 883.625744] mmiotrace: Unmapping ffffc90000854000.
>> [ 883.633925] mmiotrace: Unmapping ffffc90000852000.
>> [ 883.644580] mmiotrace: Unmapping ffffc90000850000.
>>
>> # echo 0000:00:18.1 > /sys/bus/pci/drivers/intel-lpss/bind
>> [ 889.525125] mmiotrace: ioremap_*(0x9242e200, 0x100) = ffffc90000856200
>>
>> [ 910.533911] INFO: rcu_sched detected stalls on CPUs/tasks:
>> [ 910.540052] (detected by 0, t=21002 jiffies, g=348, c=347, q=0)
>> [ 910.546790] All QSes seen, last rcu_sched kthread activity 21002
>> (4295577777-4295556775), jiffies_till_next_fqs=3, root ->qsmask 0x0
>> [ 910.560142] sh R running task 12336 1289 1 0x20020008
>> [ 910.568055] ffffffff81e422c0 ffff88017fc03e20 ffffffff81085296
>> ffff88017fc18500
>> [ 910.576366] ffffffff81e422c0 ffff88017fc03e88 ffffffff810b547d
>> 0000000000000000
>> [ 910.584675] ffffffff810f67ec 000000000000015c 0000000000000000
>> 000000000000015c
>> [ 910.592980] Call Trace:
>> [ 910.595715] <IRQ> [<ffffffff81085296>] sched_show_task+0xb6/0x110
>> [ 910.602748] [<ffffffff810b547d>] rcu_check_callbacks+0x84d/0x850
>> [ 910.609573] [<ffffffff810f67ec>] ? __acct_update_integrals+0x2c/0xb0
>> [ 910.616788] [<ffffffff810c9150>] ? tick_sched_do_timer+0x30/0x30
>> [ 910.623613] [<ffffffff810ba34a>] update_process_times+0x2a/0x50
>> [ 910.630343] [<ffffffff810c8bb1>] tick_sched_handle.isra.12+0x31/0x40
>> [ 910.637560] [<ffffffff810c9188>] tick_sched_timer+0x38/0x70
>> [ 910.643902] [<ffffffff810bacba>] __hrtimer_run_queues+0xda/0x250
>> [ 910.650734] [<ffffffff810bb3f3>] hrtimer_interrupt+0xa3/0x190
>> [ 910.657272] [<ffffffff8103ead3>] local_apic_timer_interrupt+0x33/0x50
>> [ 910.664584] [<ffffffff8103f588>] smp_apic_timer_interrupt+0x38/0x50
>> [ 910.671705] [<ffffffff8190dd6f>] apic_timer_interrupt+0x7f/0x90
>> [ 910.678427] <EOI> [<ffffffff814a717f>] ? intel_lpss_probe+0x7f/0x5f0
>> [ 910.685739] [<ffffffff814a716b>] ? intel_lpss_probe+0x6b/0x5f0
>> [ 910.692364] [<ffffffff8170e5df>] ? raw_pci_write+0x1f/0x40
>> [ 910.698610] [<ffffffff8136e825>] ? pci_bus_write_config_byte+0x55/0x70
>> [ 910.706022] [<ffffffff813781b1>] ? pcibios_set_master+0x51/0x80
>> [ 910.712753] [<ffffffff814a7836>] intel_lpss_pci_probe+0x76/0xb0
>> [ 910.719479] [<ffffffff813797e0>] local_pci_probe+0x40/0xa0
>> [ 910.725719] [<ffffffff811fce44>] ? sysfs_do_create_link_sd.isra.2+0x64/0xa0
>> [ 910.733617] [<ffffffff8137ab46>] pci_device_probe+0xd6/0x120
>> [ 910.740058] [<ffffffff8148679f>] driver_probe_device+0x21f/0x430
>> [ 910.746883] [<ffffffff81484c4f>] bind_store+0x10f/0x160
>> [ 910.752836] [<ffffffff81484150>] drv_attr_store+0x20/0x30
>> [ 910.758983] [<ffffffff811fc312>] sysfs_kf_write+0x32/0x40
>> [ 910.765129] [<ffffffff811fb863>] kernfs_fop_write+0x113/0x190
>> [ 910.771663] [<ffffffff81185343>] __vfs_write+0x23/0x120
>> [ 910.777607] [<ffffffff812cfd46>] ? security_file_permission+0x36/0xb0
>> [ 910.784918] [<ffffffff810998dd>] ? percpu_down_read+0xd/0x50
>> [ 910.791351] [<ffffffff81186403>] vfs_write+0xb3/0x1b0
>> [ 910.797103] [<ffffffff81187711>] SyS_write+0x41/0xa0
>> [ 910.802758] [<ffffffff81002b2e>] do_int80_syscall_32+0x4e/0xa0
>> [ 910.809389] [<ffffffff8190f2aa>] entry_INT80_compat+0x2a/0x40
>> [ 910.815925] rcu_sched kthread starved for 21002 jiffies! g348 c347
>> f0x2 RCU_GP_WAIT_FQS(3) ->state=0x0
>> [ 910.826351] rcu_sched R running task 14896 7 2 0x00000000
>> [ 910.834260] ffff88017ab7bd88 ffff880179c17080 ffff88017ab34b00
>> 0000000000000000
>> [ 910.842550] ffff88017ab7c000 ffff88017ab7bdd0 00000000ffffffff
>> 0000000000000000
>> [ 910.850836] ffff88017fc0ec40 ffff88017ab7bda0 ffffffff81909370
>> 000000010008feaa
>> [ 910.859118] Call Trace:
>> [ 910.861852] [<ffffffff81909370>] schedule+0x30/0x80
>> [ 910.867408] [<ffffffff8190c3b9>] schedule_timeout+0x209/0x410
>> [ 910.873938] [<ffffffff810b8760>] ? init_timer_key+0xa0/0xa0
>> [ 910.880267] [<ffffffff81097aab>] ? prepare_to_swait+0x5b/0x80
>> [ 910.886793] [<ffffffff810b3e09>] rcu_gp_kthread+0x479/0x800
>> [ 910.893124] [<ffffffff810b3990>] ? call_rcu_sched+0x20/0x20
>> [ 910.899458] [<ffffffff81079f54>] kthread+0xc4/0xe0
>> [ 910.904917] [<ffffffff8190d3cf>] ret_from_fork+0x1f/0x40
>> [ 910.910961] [<ffffffff81079e90>] ? kthread_worker_fn+0x160/0x160
>>
>>
>> Is it a bug in the driver or somewhere else?
>>
>> --
>> With Best Regards,
>> Andy Shevchenko
>
>
>
> --
> With Best Regards,
> Andy Shevchenko

--
With Best Regards,
Andy Shevchenko

* Re: mmiotracer hangs the system
From: Steven Rostedt @ 2016-08-02 15:31 UTC
To: Andy Shevchenko
Cc: linux-kernel@vger.kernel.org, Karol Herbst, Paul E. McKenney, Ingo Molnar

On Tue, 2 Aug 2016 18:07:38 +0300
Andy Shevchenko <andy.shevchenko@gmail.com> wrote:

> Use another Karol's address (found in MAINTAINERS)
>
> On Tue, Aug 2, 2016 at 6:05 PM, Andy Shevchenko
> <andy.shevchenko@gmail.com> wrote:
> > +Cc: Karol, Ingo
> >
> > On Tue, Aug 2, 2016 at 1:08 PM, Andy Shevchenko
> > <andy.shevchenko@gmail.com> wrote:
> >> Hi!
> >>
> >> I'm trying to use mmio tracer with recent kernels (in this particular
> >> case today's linux-next).
> >
> > Tested on other board and found that v4.5 works while v4.5.7 doesn't.
> > Bisecting to
> >
> > commit d62a28a60562a8ba82e67e13c268245f37e796cb
> > Author: Karol Herbst <nouveau@karolherbst.de>
> > Date: Thu Mar 3 02:03:11 2016 +0100
> >
> > x86/mm/kmmio: Fix mmiotrace for hugepages
> >
> > commit cfa52c0cfa4d727aa3e457bf29aeff296c528a08 upstream.
> >
> > Reverting _helps_ for x86 and x86_64 builds.
> >

That commit was added in 4.6. Does that kernel work? Maybe it was a bad
backport?

-- Steve

* Re: mmiotracer hangs the system
From: Andy Shevchenko @ 2016-08-02 16:08 UTC
To: Steven Rostedt
Cc: linux-kernel@vger.kernel.org, Karol Herbst, Paul E. McKenney, Ingo Molnar

On Tue, Aug 2, 2016 at 6:31 PM, Steven Rostedt <rostedt@goodmis.org> wrote:
> On Tue, 2 Aug 2016 18:07:38 +0300
> Andy Shevchenko <andy.shevchenko@gmail.com> wrote:
>
>> Use another Karol's address (found in MAINTAINERS)
>>
>> On Tue, Aug 2, 2016 at 6:05 PM, Andy Shevchenko
>> <andy.shevchenko@gmail.com> wrote:
>> > +Cc: Karol, Ingo
>> >
>> > On Tue, Aug 2, 2016 at 1:08 PM, Andy Shevchenko
>> > <andy.shevchenko@gmail.com> wrote:
>> >> Hi!
>> >>
>> >> I'm trying to use mmio tracer with recent kernels (in this particular
>> >> case today's linux-next).
>> >
>> > Tested on other board and found that v4.5 works while v4.5.7 doesn't.
>> > Bisecting to
>> >
>> > commit d62a28a60562a8ba82e67e13c268245f37e796cb
>> > Author: Karol Herbst <nouveau@karolherbst.de>
>> > Date: Thu Mar 3 02:03:11 2016 +0100
>> >
>> > x86/mm/kmmio: Fix mmiotrace for hugepages
>> >
>> > commit cfa52c0cfa4d727aa3e457bf29aeff296c528a08 upstream.
>> >
>> > Reverting _helps_ for x86 and x86_64 builds.
>> >
>
> That commit was added in 4.6. Does that kernel work? Maybe it was a bad
> backport?

I don't think so, since linux-next doesn't work until I revert this commit.

I can try exactly v4.6 (yep, I tried stable versions, including
v4.4.16 that's why all of them failed to me) if you still would like
me to do so.

--
With Best Regards,
Andy Shevchenko

* Re: mmiotracer hangs the system
From: Steven Rostedt @ 2016-08-02 16:13 UTC
To: Andy Shevchenko
Cc: linux-kernel@vger.kernel.org, Karol Herbst, Paul E. McKenney, Ingo Molnar

On Tue, 2 Aug 2016 19:08:24 +0300
Andy Shevchenko <andy.shevchenko@gmail.com> wrote:

> I don't think so, since linux-next doesn't work until I revert this commit.
>
> I can try exactly v4.6 (yep, I tried stable versions, including
> v4.4.16 that's why all of them failed to me) if you still would like
> me to do so.

If linux-next doesn't work, then don't bother.

That commit obviously broke something and you'll probably need help
from Karol to fix it.

Thanks,

-- Steve

* Re: mmiotracer hangs the system
From: karol herbst @ 2016-08-03 18:24 UTC
To: Steven Rostedt
Cc: Andy Shevchenko, linux-kernel@vger.kernel.org, Paul E. McKenney, Ingo Molnar

hi all,

mhhh exactly this commit fixed mmiotrace for me and a few other
nouveau devs on x86_64. Also the error log doesn't really show a
problem inside the tracer?

Maybe it would be helpful to provide full dmesg in the case of error,
just to rule silly things out. I will try to figure out how that bug
could happen at all.

Greetings

Karol

2016-08-02 18:13 GMT+02:00 Steven Rostedt <rostedt@goodmis.org>:
> On Tue, 2 Aug 2016 19:08:24 +0300
> Andy Shevchenko <andy.shevchenko@gmail.com> wrote:
>
>> I don't think so, since linux-next doesn't work until I revert this commit.
>>
>> I can try exactly v4.6 (yep, I tried stable versions, including
>> v4.4.16 that's why all of them failed to me) if you still would like
>> me to do so.
>
> If linux-next doesn't work, then don't bother.
>
> That commit obviously broke something and you'll probably need help
> from Karol to fix it.
>
> Thanks,
>
> -- Steve

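As a cross-check when reading the stall report from the first message of the thread, the jiffies figures and the dmesg timestamps can be compared directly. The sketch below only redoes that arithmetic; the variable names are made up for illustration, and reading 21002 jiffies as roughly 21 seconds assumes HZ=1000, which the thread does not state.

```shell
# Values copied from the RCU stall report in the first message.
now=4295577777             # jiffies when the stall was reported
last_activity=4295556775   # jiffies at the last rcu_sched kthread activity
ioremap_ts=889.525125      # dmesg timestamp of the ioremap_*() line
stall_ts=910.533911        # dmesg timestamp of the stall warning

# Jiffies elapsed since the rcu_sched kthread last ran.
stall_jiffies=$((now - last_activity))
echo "stalled for $stall_jiffies jiffies"

# Wall-clock gap between the two dmesg lines, in seconds.
# awk is used because POSIX shell arithmetic is integer-only.
gap=$(awk -v a="$ioremap_ts" -v b="$stall_ts" 'BEGIN { printf "%.3f", b - a }')
echo "dmesg gap: $gap s"
```

Under the HZ=1000 assumption both figures come out at about 21 seconds, which would place the last rcu_sched kthread activity close to the ioremap_*() line printed during the bind.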
* Re: mmiotracer hangs the system
From: karol herbst @ 2016-08-19 10:35 UTC
To: Steven Rostedt
Cc: Andy Shevchenko, linux-kernel@vger.kernel.org, Paul E. McKenney, Ingo Molnar

Hi everybody,

is there any update on that issue I missed somehow? I really don't
want to leave the mmiotracer in a state, where it breaks something
while fixing other issues.

But for now, without being able to even reproduce the issue, I can't
really do much, because the code in the current state looks sane to
me. Maybe this case includes the mmiotracer cleaning things up and
arms new region for mmiotracing and that's why it fails? Besides that,
I have no idea and no way to reproduce this, so I can't help this way.

Greetings

2016-08-02 18:13 GMT+02:00 Steven Rostedt <rostedt@goodmis.org>:
> On Tue, 2 Aug 2016 19:08:24 +0300
> Andy Shevchenko <andy.shevchenko@gmail.com> wrote:
>
>> I don't think so, since linux-next doesn't work until I revert this commit.
>>
>> I can try exactly v4.6 (yep, I tried stable versions, including
>> v4.4.16 that's why all of them failed to me) if you still would like
>> me to do so.
>
> If linux-next doesn't work, then don't bother.
>
> That commit obviously broke something and you'll probably need help
> from Karol to fix it.
>
> Thanks,
>
> -- Steve

* Re: mmiotracer hangs the system
From: Andy Shevchenko @ 2016-08-19 13:02 UTC
To: karol herbst
Cc: Steven Rostedt, linux-kernel@vger.kernel.org, Paul E. McKenney, Ingo Molnar

On Fri, Aug 19, 2016 at 1:35 PM, karol herbst <karolherbst@gmail.com> wrote:
> is there any update on that issue I missed somehow? I really don't
> want to leave the mmiotracer in a state, where it breaks something
> while fixing other issues.

No updates. I'm busy right now with more priority tasks and revert
works for me. Issue is reproducible in my case 100%.

So, I would able to attach dmesg in case it would be helpful.
Otherwise tell me exact instructions how to debug the issue.

Here you are:
http://pastebin.com/raw/VfTZENt7

> But for now, without being able to even reproduce the issue, I can't
> really do much, because the code in the current state looks sane to
> me. Maybe this case includes the mmiotracer cleaning things up and
> arms new region for mmiotracing and that's why it fails? Besides that,
> I have no idea and no way to reproduce this, so I can't help this way.

Maybe. First thing happened is iounmap().

> 2016-08-02 18:13 GMT+02:00 Steven Rostedt <rostedt@goodmis.org>:
> > On Tue, 2 Aug 2016 19:08:24 +0300
> > Andy Shevchenko <andy.shevchenko@gmail.com> wrote:
> >
> >> I don't think so, since linux-next doesn't work until I revert this commit.
> >>
> >> I can try exactly v4.6 (yep, I tried stable versions, including
> >> v4.4.16 that's why all of them failed to me) if you still would like
> >> me to do so.
> >
> > If linux-next doesn't work, then don't bother.
> >
> > That commit obviously broke something and you'll probably need help
> > from Karol to fix it.

--
With Best Regards,
Andy Shevchenko

* Re: mmiotracer hangs the system
From: karol herbst @ 2016-08-19 15:08 UTC
To: Andy Shevchenko
Cc: Steven Rostedt, linux-kernel@vger.kernel.org, Paul E. McKenney, Ingo Molnar

2016-08-19 15:02 GMT+02:00 Andy Shevchenko <andy.shevchenko@gmail.com>:
> On Fri, Aug 19, 2016 at 1:35 PM, karol herbst <karolherbst@gmail.com> wrote:
>> is there any update on that issue I missed somehow? I really don't
>> want to leave the mmiotracer in a state, where it breaks something
>> while fixing other issues.
>
> No updates. I'm busy right now with more priority tasks and revert
> works for me. Issue is reproducible in my case 100%.
>

Is there something I could do with a "normal" haswell desktop system
to reproduce this issue?

I'll try to play around the next days a bit and maybe I find something
that works out here as well. It seems to be related to
unmapping-mapping cycles.

Because if this only happens with the pwm-lpss driver, it may be
really troublesome to debug, because I don't really know the code that
well to be sure where the issue might be.

> So, I would able to attach dmesg in case it would be helpful.
> Otherwise tell me exact instructions how to debug the issue.
>
> Here you are:
> http://pastebin.com/raw/VfTZENt7
>
>> But for now, without being able to even reproduce the issue, I can't
>> really do much, because the code in the current state looks sane to
>> me. Maybe this case includes the mmiotracer cleaning things up and
>> arms new region for mmiotracing and that's why it fails? Besides that,
>> I have no idea and no way to reproduce this, so I can't help this way.
>
> Maybe. First thing happened is iounmap().
>
>> 2016-08-02 18:13 GMT+02:00 Steven Rostedt <rostedt@goodmis.org>:
>> > On Tue, 2 Aug 2016 19:08:24 +0300
>> > Andy Shevchenko <andy.shevchenko@gmail.com> wrote:
>> >
>> >> I don't think so, since linux-next doesn't work until I revert this commit.
>> >>
>> >> I can try exactly v4.6 (yep, I tried stable versions, including
>> >> v4.4.16 that's why all of them failed to me) if you still would like
>> >> me to do so.
>> >
>> > If linux-next doesn't work, then don't bother.
>> >
>> > That commit obviously broke something and you'll probably need help
>> > from Karol to fix it.
>
> --
> With Best Regards,
> Andy Shevchenko

* Re: mmiotracer hangs the system
From: Andy Shevchenko @ 2016-08-19 15:35 UTC
To: karol herbst
Cc: Steven Rostedt, linux-kernel@vger.kernel.org, Paul E. McKenney, Ingo Molnar

On Fri, Aug 19, 2016 at 6:08 PM, karol herbst <karolherbst@gmail.com> wrote:
> 2016-08-19 15:02 GMT+02:00 Andy Shevchenko <andy.shevchenko@gmail.com>:
>> On Fri, Aug 19, 2016 at 1:35 PM, karol herbst <karolherbst@gmail.com> wrote:
>>> is there any update on that issue I missed somehow? I really don't
>>> want to leave the mmiotracer in a state, where it breaks something
>>> while fixing other issues.
>>
>> No updates. I'm busy right now with more priority tasks and revert
>> works for me. Issue is reproducible in my case 100%.
>>
>
> Is there something I could do with a "normal" haswell desktop system
> to reproduce this issue?

Try LPSS UART device(s)

>
> I'll try to play around the next days a bit and maybe I find something
> that works out here as well. It seems to be related to
> unmapping-mapping cycles.

That is the only thing I would think of.

>
> Because if this only happens with the pwm-lpss driver,

It has nothing to do with pwm-lpss since it's a HS UART and served by
intel-lpss driver.

> it may be
> really troublesome to debug, because I don't really know the code that
> well to be sure where the issue might be.
>
>> So, I would able to attach dmesg in case it would be helpful.
>> Otherwise tell me exact instructions how to debug the issue.
>>
>> Here you are:
>> http://pastebin.com/raw/VfTZENt7
>>
>>> But for now, without being able to even reproduce the issue, I can't
>>> really do much, because the code in the current state looks sane to
>>> me. Maybe this case includes the mmiotracer cleaning things up and
>>> arms new region for mmiotracing and that's why it fails? Besides that,
>>> I have no idea and no way to reproduce this, so I can't help this way.
>>
>> Maybe. First thing happened is iounmap().

--
With Best Regards,
Andy Shevchenko

* Re: mmiotracer hangs the system 2016-08-19 15:35 ` Andy Shevchenko @ 2016-08-19 18:23 ` karol herbst 2016-08-19 20:46 ` Karol Herbst 1 sibling, 0 replies; 20+ messages in thread From: karol herbst @ 2016-08-19 18:23 UTC (permalink / raw) To: Andy Shevchenko Cc: Steven Rostedt, linux-kernel@vger.kernel.org, Paul E. McKenney, Ingo Molnar 2016-08-19 17:35 GMT+02:00 Andy Shevchenko <andy.shevchenko@gmail.com>: > On Fri, Aug 19, 2016 at 6:08 PM, karol herbst <karolherbst@gmail.com> wrote: >> 2016-08-19 15:02 GMT+02:00 Andy Shevchenko <andy.shevchenko@gmail.com>: >>> On Fri, Aug 19, 2016 at 1:35 PM, karol herbst <karolherbst@gmail.com> wrote: >>>> is there any update on that issue I missed somehow? I really don't >>>> want to leave the mmiotracer in a state, where it breaks something >>>> while fixing other issues. >>> >>> No updates. I'm busy right now with more priority tasks and revert >>> works for me. Issue is reproducible in my case 100%. >>> >> >> Is there something I could do with a "normal" haswell desktop system >> to reproduce this issue? > > Try LPSS UART device(s) > isn't this a skylake thing? Because my CPU and motherboard is a bit older than this. >> >> I'll try to play around the next days a bit and maybe I find something >> that works out here as well. It seems to be related to >> unmapping-mapping cycles. > > That is the only thing I would think of. > >> >> Because if this only happens with the pwm-lpss driver, > > It has nothing to do with pwm-lpss since it's a HS UART and served by > intel-lpss driver. > >> it may be >> really troublesome to debug, because I don't really know the code that >> well to be sure where the issue might be. >> >>> So, I would able to attach dmesg in case it would be helpful. >>> Otherwise tell me exact instructions how to debug the issue. 
>>> >>> Here you are: >>> http://pastebin.com/raw/VfTZENt7 >>> >>>> But for now, without being able to even reproduce the issue, I can't >>>> really do much, because the code in the current state looks sane to >>>> me. Maybe this case includes the mmiotracer cleaning things up and >>>> arms new region for mmiotracing and that's why it fails? Besides that, >>>> I have no idea and no way to reproduce this, so I can't help this way. >>> >>> Maybe. First thing happened is iounmap(). > > > -- > With Best Regards, > Andy Shevchenko ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: mmiotracer hangs the system 2016-08-19 15:35 ` Andy Shevchenko 2016-08-19 18:23 ` karol herbst @ 2016-08-19 20:46 ` Karol Herbst 2016-08-19 21:50 ` Andy Shevchenko 2016-10-13 21:12 ` Karol Herbst 1 sibling, 2 replies; 20+ messages in thread From: Karol Herbst @ 2016-08-19 20:46 UTC (permalink / raw) To: Andy Shevchenko Cc: Steven Rostedt, linux-kernel@vger.kernel.org, Paul E. McKenney, Ingo Molnar Hi again, I was able to get a crash/freeze/something while unbinding/binding my nvidia gpu from nouveau. Guess that means something is odd. I will investigate this more over the weekend. 2016-08-19 17:35 GMT+02:00 Andy Shevchenko <andy.shevchenko@gmail.com>: > On Fri, Aug 19, 2016 at 6:08 PM, karol herbst <karolherbst@gmail.com> wrote: >> 2016-08-19 15:02 GMT+02:00 Andy Shevchenko <andy.shevchenko@gmail.com>: >>> On Fri, Aug 19, 2016 at 1:35 PM, karol herbst <karolherbst@gmail.com> wrote: >>>> is there any update on that issue I missed somehow? I really don't >>>> want to leave the mmiotracer in a state, where it breaks something >>>> while fixing other issues. >>> >>> No updates. I'm busy right now with more priority tasks and revert >>> works for me. Issue is reproducible in my case 100%. >>> >> >> Is there something I could do with a "normal" haswell desktop system >> to reproduce this issue? > > Try LPSS UART device(s) > >> >> I'll try to play around the next days a bit and maybe I find something >> that works out here as well. It seems to be related to >> unmapping-mapping cycles. > > That is the only thing I would think of. > >> >> Because if this only happens with the pwm-lpss driver, > > It has nothing to do with pwm-lpss since it's a HS UART and served by > intel-lpss driver. > >> it may be >> really troublesome to debug, because I don't really know the code that >> well to be sure where the issue might be. >> >>> So, I would able to attach dmesg in case it would be helpful. >>> Otherwise tell me exact instructions how to debug the issue. 
>>> >>> Here you are: >>> http://pastebin.com/raw/VfTZENt7 >>> >>>> But for now, without being able to even reproduce the issue, I can't >>>> really do much, because the code in the current state looks sane to >>>> me. Maybe this case includes the mmiotracer cleaning things up and >>>> arms new region for mmiotracing and that's why it fails? Besides that, >>>> I have no idea and no way to reproduce this, so I can't help this way. >>> >>> Maybe. First thing happened is iounmap(). > > > -- > With Best Regards, > Andy Shevchenko ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: mmiotracer hangs the system 2016-08-19 20:46 ` Karol Herbst @ 2016-08-19 21:50 ` Andy Shevchenko 2016-10-13 21:12 ` Karol Herbst 1 sibling, 0 replies; 20+ messages in thread From: Andy Shevchenko @ 2016-08-19 21:50 UTC (permalink / raw) To: Karol Herbst Cc: Steven Rostedt, linux-kernel@vger.kernel.org, Paul E. McKenney, Ingo Molnar On Fri, Aug 19, 2016 at 11:46 PM, Karol Herbst <karolherbst@gmail.com> wrote: > I was able to get a crash/freeze/something while unbinding/binding my > nvidia gpu from nouveau. > > Guess that means something is odd. I will investigate this more over > the weekend. Thanks. Will wait for further updates. -- With Best Regards, Andy Shevchenko ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: mmiotracer hangs the system 2016-08-19 20:46 ` Karol Herbst 2016-08-19 21:50 ` Andy Shevchenko @ 2016-10-13 21:12 ` Karol Herbst 2016-10-22 16:02 ` Andy Shevchenko 1 sibling, 1 reply; 20+ messages in thread From: Karol Herbst @ 2016-10-13 21:12 UTC (permalink / raw) To: Andy Shevchenko Cc: Steven Rostedt, linux-kernel@vger.kernel.org, Paul E. McKenney, Ingo Molnar Sorry for the delay in fixing that bug. I got occupied with other things and didn't really get to the issue again. It is the next item on my todo list, though, and I hope I will be able to get a fix ready this weekend. I think I might know where the issue is, but I haven't confirmed it yet. Again, sorry for the delay. Karol 2016-08-19 22:46 GMT+02:00 Karol Herbst <karolherbst@gmail.com>: > Hi again, > > I was able to get a crash/freeze/something while unbinding/binding my > nvidia gpu from nouveau. > > Guess that means something is odd. I will investigate this more over > the weekend. > > 2016-08-19 17:35 GMT+02:00 Andy Shevchenko <andy.shevchenko@gmail.com>: >> On Fri, Aug 19, 2016 at 6:08 PM, karol herbst <karolherbst@gmail.com> wrote: >>> 2016-08-19 15:02 GMT+02:00 Andy Shevchenko <andy.shevchenko@gmail.com>: >>>> On Fri, Aug 19, 2016 at 1:35 PM, karol herbst <karolherbst@gmail.com> wrote: >>>>> is there any update on that issue I missed somehow? I really don't >>>>> want to leave the mmiotracer in a state, where it breaks something >>>>> while fixing other issues. >>>> >>>> No updates. I'm busy right now with more priority tasks and revert >>>> works for me. Issue is reproducible in my case 100%. >>>> >>> >>> Is there something I could do with a "normal" haswell desktop system >>> to reproduce this issue? >> >> Try LPSS UART device(s) >> >>> >>> I'll try to play around the next days a bit and maybe I find something >>> that works out here as well. It seems to be related to >>> unmapping-mapping cycles. >> >> That is the only thing I would think of. 
>> >>> >>> Because if this only happens with the pwm-lpss driver, >> >> It has nothing to do with pwm-lpss since it's a HS UART and served by >> intel-lpss driver. >> >>> it may be >>> really troublesome to debug, because I don't really know the code that >>> well to be sure where the issue might be. >>> >>>> So, I would able to attach dmesg in case it would be helpful. >>>> Otherwise tell me exact instructions how to debug the issue. >>>> >>>> Here you are: >>>> http://pastebin.com/raw/VfTZENt7 >>>> >>>>> But for now, without being able to even reproduce the issue, I can't >>>>> really do much, because the code in the current state looks sane to >>>>> me. Maybe this case includes the mmiotracer cleaning things up and >>>>> arms new region for mmiotracing and that's why it fails? Besides that, >>>>> I have no idea and no way to reproduce this, so I can't help this way. >>>> >>>> Maybe. First thing happened is iounmap(). >> >> >> -- >> With Best Regards, >> Andy Shevchenko ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: mmiotracer hangs the system 2016-10-13 21:12 ` Karol Herbst @ 2016-10-22 16:02 ` Andy Shevchenko 2016-11-19 10:56 ` Karol Herbst 0 siblings, 1 reply; 20+ messages in thread From: Andy Shevchenko @ 2016-10-22 16:02 UTC (permalink / raw) To: Karol Herbst Cc: Steven Rostedt, linux-kernel@vger.kernel.org, Paul E. McKenney, Ingo Molnar On Fri, Oct 14, 2016 at 12:12 AM, Karol Herbst <karolherbst@gmail.com> wrote: > sorry for the delay fixing that bug. I got occupied with other things > and didn't really got to the issue again, it is on my todo list as the > next item though and I hope I will be able to get a fix ready this > weekend. I think I might know where the issue is, but didn't confirm > it yet. Thanks. I'm still using the revert. Feel free to Cc me when you have some material to test. > > Again, sorry for the delay. > > Karol > > 2016-08-19 22:46 GMT+02:00 Karol Herbst <karolherbst@gmail.com>: >> Hi again, >> >> I was able to get a crash/freeze/something while unbinding/binding my >> nvidia gpu from nouveau. >> >> Guess that means something is odd. I will investigate this more over >> the weekend. >> >> 2016-08-19 17:35 GMT+02:00 Andy Shevchenko <andy.shevchenko@gmail.com>: >>> On Fri, Aug 19, 2016 at 6:08 PM, karol herbst <karolherbst@gmail.com> wrote: >>>> 2016-08-19 15:02 GMT+02:00 Andy Shevchenko <andy.shevchenko@gmail.com>: >>>>> On Fri, Aug 19, 2016 at 1:35 PM, karol herbst <karolherbst@gmail.com> wrote: >>>>>> is there any update on that issue I missed somehow? I really don't >>>>>> want to leave the mmiotracer in a state, where it breaks something >>>>>> while fixing other issues. >>>>> >>>>> No updates. I'm busy right now with more priority tasks and revert >>>>> works for me. Issue is reproducible in my case 100%. >>>>> >>>> >>>> Is there something I could do with a "normal" haswell desktop system >>>> to reproduce this issue? 
>>> >>> Try LPSS UART device(s) >>> >>>> >>>> I'll try to play around the next days a bit and maybe I find something >>>> that works out here as well. It seems to be related to >>>> unmapping-mapping cycles. >>> >>> That is the only thing I would think of. >>> >>>> >>>> Because if this only happens with the pwm-lpss driver, >>> >>> It has nothing to do with pwm-lpss since it's a HS UART and served by >>> intel-lpss driver. >>> >>>> it may be >>>> really troublesome to debug, because I don't really know the code that >>>> well to be sure where the issue might be. >>>> >>>>> So, I would able to attach dmesg in case it would be helpful. >>>>> Otherwise tell me exact instructions how to debug the issue. >>>>> >>>>> Here you are: >>>>> http://pastebin.com/raw/VfTZENt7 >>>>> >>>>>> But for now, without being able to even reproduce the issue, I can't >>>>>> really do much, because the code in the current state looks sane to >>>>>> me. Maybe this case includes the mmiotracer cleaning things up and >>>>>> arms new region for mmiotracing and that's why it fails? Besides that, >>>>>> I have no idea and no way to reproduce this, so I can't help this way. >>>>> >>>>> Maybe. First thing happened is iounmap(). >>> >>> >>> -- >>> With Best Regards, >>> Andy Shevchenko -- With Best Regards, Andy Shevchenko ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: mmiotracer hangs the system 2016-10-22 16:02 ` Andy Shevchenko @ 2016-11-19 10:56 ` Karol Herbst 2016-11-24 20:50 ` Karol Herbst 0 siblings, 1 reply; 20+ messages in thread From: Karol Herbst @ 2016-11-19 10:56 UTC (permalink / raw) To: Andy Shevchenko Cc: Steven Rostedt, linux-kernel@vger.kernel.org, Paul E. McKenney, Ingo Molnar This is odd. I found a bug related to nouveau (modprobe/bind doesn't return). It may be unrelated to your issue, or it may be exactly this: the binding of the device doesn't return and, depending on the kind of driver, that would hang the system. So maybe it is the same issue. Anyway, could you try to trace with the attached patch? Maybe the additional output would help me to verify it. Currently I am working on the bugfix I mentioned above, and this may also fix your issue. I was still able to get a working mmiotrace file, even though the device binding didn't finish. Is this the same for you? (Try "cat /sys/kernel/debug/tracing/trace_pipe > some_file" and see if the file contains anything useful.) This really looks like an odd issue, because the mmiotracer still behaves as expected. 2016-10-22 18:02 GMT+02:00 Andy Shevchenko <andy.shevchenko@gmail.com>: > On Fri, Oct 14, 2016 at 12:12 AM, Karol Herbst <karolherbst@gmail.com> wrote: >> sorry for the delay fixing that bug. I got occupied with other things >> and didn't really got to the issue again, it is on my todo list as the >> next item though and I hope I will be able to get a fix ready this >> weekend. I think I might know where the issue is, but didn't confirm >> it yet. > > Thanks.I'm still using revert. Feel free to Cc me when you will have > some material to test. > >> >> Again, sorry for the delay. >> >> Karol >> >> 2016-08-19 22:46 GMT+02:00 Karol Herbst <karolherbst@gmail.com>: >>> Hi again, >>> >>> I was able to get a crash/freeze/something while unbinding/binding my >>> nvidia gpu from nouveau. >>> >>> Guess that means something is odd. 
I will investigate this more over >>> the weekend. >>> >>> 2016-08-19 17:35 GMT+02:00 Andy Shevchenko <andy.shevchenko@gmail.com>: >>>> On Fri, Aug 19, 2016 at 6:08 PM, karol herbst <karolherbst@gmail.com> wrote: >>>>> 2016-08-19 15:02 GMT+02:00 Andy Shevchenko <andy.shevchenko@gmail.com>: >>>>>> On Fri, Aug 19, 2016 at 1:35 PM, karol herbst <karolherbst@gmail.com> wrote: >>>>>>> is there any update on that issue I missed somehow? I really don't >>>>>>> want to leave the mmiotracer in a state, where it breaks something >>>>>>> while fixing other issues. >>>>>> >>>>>> No updates. I'm busy right now with more priority tasks and revert >>>>>> works for me. Issue is reproducible in my case 100%. >>>>>> >>>>> >>>>> Is there something I could do with a "normal" haswell desktop system >>>>> to reproduce this issue? >>>> >>>> Try LPSS UART device(s) >>>> >>>>> >>>>> I'll try to play around the next days a bit and maybe I find something >>>>> that works out here as well. It seems to be related to >>>>> unmapping-mapping cycles. >>>> >>>> That is the only thing I would think of. >>>> >>>>> >>>>> Because if this only happens with the pwm-lpss driver, >>>> >>>> It has nothing to do with pwm-lpss since it's a HS UART and served by >>>> intel-lpss driver. >>>> >>>>> it may be >>>>> really troublesome to debug, because I don't really know the code that >>>>> well to be sure where the issue might be. >>>>> >>>>>> So, I would able to attach dmesg in case it would be helpful. >>>>>> Otherwise tell me exact instructions how to debug the issue. >>>>>> >>>>>> Here you are: >>>>>> http://pastebin.com/raw/VfTZENt7 >>>>>> >>>>>>> But for now, without being able to even reproduce the issue, I can't >>>>>>> really do much, because the code in the current state looks sane to >>>>>>> me. Maybe this case includes the mmiotracer cleaning things up and >>>>>>> arms new region for mmiotracing and that's why it fails? 
Besides that, >>>>>>> I have no idea and no way to reproduce this, so I can't help this way. >>>>>> >>>>>> Maybe. First thing happened is iounmap(). >>>> >>>> >>>> -- >>>> With Best Regards, >>>> Andy Shevchenko > > > > -- > With Best Regards, > Andy Shevchenko ^ permalink raw reply [flat|nested] 20+ messages in thread
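The capture requested in the message above (reading trace_pipe while the device is unbound and bound again) could be scripted roughly as follows. This is only a sketch under stated assumptions: the function names start_capture, stop_capture and rebind_device are hypothetical helpers, debugfs is assumed to be mounted already, and the paths are the ones used in the commands quoted in this thread.

```shell
#!/bin/sh
# Sketch only: capture mmiotrace output while a PCI device is rebound.
# The function names are hypothetical; the debugfs/sysfs paths follow the
# commands quoted earlier in this thread.

# Location of the tracing directory (debugfs must already be mounted).
TRACING_DIR=${TRACING_DIR:-/sys/kernel/debug/tracing}

# Select the mmiotrace tracer and start draining trace_pipe into file $1.
start_capture() {
    echo mmiotrace > "$TRACING_DIR/current_tracer" || return 1
    cat "$TRACING_DIR/trace_pipe" > "$1" &
    CAPTURE_PID=$!
}

# Switch back to the nop tracer and stop the background reader.
stop_capture() {
    echo nop > "$TRACING_DIR/current_tracer"
    kill "$CAPTURE_PID" 2>/dev/null || true
    wait "$CAPTURE_PID" 2>/dev/null || true
    return 0
}

# Unbind device $2 from the driver whose sysfs directory is $1, then bind it again.
rebind_device() {
    echo "$2" > "$1/unbind" && echo "$2" > "$1/bind"
}
```

Run as root, a session matching the original report might be: start_capture some_file, then rebind_device /sys/bus/pci/drivers/intel-lpss 0000:00:18.1, then stop_capture. The reader runs in the background, so some_file may keep filling even when the bind step does not return; whether anything is written before a full system hang is not something this sketch can guarantee.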
* Re: mmiotracer hangs the system 2016-11-19 10:56 ` Karol Herbst @ 2016-11-24 20:50 ` Karol Herbst 0 siblings, 0 replies; 20+ messages in thread From: Karol Herbst @ 2016-11-24 20:50 UTC (permalink / raw) To: Andy Shevchenko Cc: Steven Rostedt, linux-kernel@vger.kernel.org, Paul E. McKenney, Ingo Molnar [-- Attachment #1: Type: text/plain, Size: 4044 bytes --] sorry for that, but I forgot the patch 2016-11-19 11:56 GMT+01:00 Karol Herbst <karolherbst@gmail.com>: > this is odd, I found a bug related to nouveau (modprobe/bind doesn't > return), but that isn't related to your issue at all or maybe it is > exactly this, cause the binding of the device doesn't return and > depending on the kind of driver, it would hang the system... yeah, > maybe it is the same issue. > > anyway, could you try to trace with the attached patch? Maybe the > additional output would help me to verify it. Currently I am working > on the bugfix I mentioned above and this may also fix your issue. I > was still able to get a working mmiotrace file, even if the dvice > binding didn't finish. Is this the same for you? (try cat > "/sys/kernel/debug/tracing/trace_pipe > some_file"; and see if this > contains anything usefull). > > This really looks like an odd issue, because the mmiotracer still > behaves as expected. > > 2016-10-22 18:02 GMT+02:00 Andy Shevchenko <andy.shevchenko@gmail.com>: >> On Fri, Oct 14, 2016 at 12:12 AM, Karol Herbst <karolherbst@gmail.com> wrote: >>> sorry for the delay fixing that bug. I got occupied with other things >>> and didn't really got to the issue again, it is on my todo list as the >>> next item though and I hope I will be able to get a fix ready this >>> weekend. I think I might know where the issue is, but didn't confirm >>> it yet. >> >> Thanks.I'm still using revert. Feel free to Cc me when you will have >> some material to test. >> >>> >>> Again, sorry for the delay. 
>>> >>> Karol >>> >>> 2016-08-19 22:46 GMT+02:00 Karol Herbst <karolherbst@gmail.com>: >>>> Hi again, >>>> >>>> I was able to get a crash/freeze/something while unbinding/binding my >>>> nvidia gpu from nouveau. >>>> >>>> Guess that means something is odd. I will investigate this more over >>>> the weekend. >>>> >>>> 2016-08-19 17:35 GMT+02:00 Andy Shevchenko <andy.shevchenko@gmail.com>: >>>>> On Fri, Aug 19, 2016 at 6:08 PM, karol herbst <karolherbst@gmail.com> wrote: >>>>>> 2016-08-19 15:02 GMT+02:00 Andy Shevchenko <andy.shevchenko@gmail.com>: >>>>>>> On Fri, Aug 19, 2016 at 1:35 PM, karol herbst <karolherbst@gmail.com> wrote: >>>>>>>> is there any update on that issue I missed somehow? I really don't >>>>>>>> want to leave the mmiotracer in a state, where it breaks something >>>>>>>> while fixing other issues. >>>>>>> >>>>>>> No updates. I'm busy right now with more priority tasks and revert >>>>>>> works for me. Issue is reproducible in my case 100%. >>>>>>> >>>>>> >>>>>> Is there something I could do with a "normal" haswell desktop system >>>>>> to reproduce this issue? >>>>> >>>>> Try LPSS UART device(s) >>>>> >>>>>> >>>>>> I'll try to play around the next days a bit and maybe I find something >>>>>> that works out here as well. It seems to be related to >>>>>> unmapping-mapping cycles. >>>>> >>>>> That is the only thing I would think of. >>>>> >>>>>> >>>>>> Because if this only happens with the pwm-lpss driver, >>>>> >>>>> It has nothing to do with pwm-lpss since it's a HS UART and served by >>>>> intel-lpss driver. >>>>> >>>>>> it may be >>>>>> really troublesome to debug, because I don't really know the code that >>>>>> well to be sure where the issue might be. >>>>>> >>>>>>> So, I would able to attach dmesg in case it would be helpful. >>>>>>> Otherwise tell me exact instructions how to debug the issue. 
>>>>>>> >>>>>>> Here you are: >>>>>>> http://pastebin.com/raw/VfTZENt7 >>>>>>> >>>>>>>> But for now, without being able to even reproduce the issue, I can't >>>>>>>> really do much, because the code in the current state looks sane to >>>>>>>> me. Maybe this case includes the mmiotracer cleaning things up and >>>>>>>> arms new region for mmiotracing and that's why it fails? Besides that, >>>>>>>> I have no idea and no way to reproduce this, so I can't help this way. >>>>>>> >>>>>>> Maybe. First thing happened is iounmap(). >>>>> >>>>> >>>>> -- >>>>> With Best Regards, >>>>> Andy Shevchenko >> >> >> >> -- >> With Best Regards, >> Andy Shevchenko [-- Attachment #2: 0001-temp-hack.patch --] [-- Type: text/x-patch, Size: 2760 bytes --] From 92aea447a776f10aad0a2e971b5f2b208a1161d2 Mon Sep 17 00:00:00 2001 From: Karol Herbst <nouveau@karolherbst.de> Date: Thu, 24 Nov 2016 21:46:27 +0100 Subject: [PATCH] temp hack --- arch/x86/mm/kmmio.c | 29 +++++++++++++++++++++++------ 1 file changed, 23 insertions(+), 6 deletions(-) diff --git a/arch/x86/mm/kmmio.c b/arch/x86/mm/kmmio.c index afc47f5c9531..a002ee314a0c 100644 --- a/arch/x86/mm/kmmio.c +++ b/arch/x86/mm/kmmio.c @@ -97,11 +97,16 @@ static DEFINE_PER_CPU(struct kmmio_context, kmmio_ctx); static struct kmmio_probe *get_kmmio_probe(unsigned long addr) { struct kmmio_probe *p; + struct kmmio_probe *result = NULL; list_for_each_entry_rcu(p, &kmmio_probes, list) { - if (addr >= p->addr && addr < (p->addr + p->len)) - return p; + if (addr >= p->addr && addr < (p->addr + p->len)) { + if (!result) + result = p; + else + printk(KERN_ERR " %s collision detected %lu", __FUNCTION__, addr); + } } - return NULL; + return result; } /* You must be holding RCU read lock. 
*/ @@ -109,6 +114,7 @@ static struct kmmio_fault_page *get_kmmio_fault_page(unsigned long addr) { struct list_head *head; struct kmmio_fault_page *f; + struct kmmio_fault_page *result = NULL; unsigned int l; pte_t *pte = lookup_address(addr, &l); @@ -116,11 +122,16 @@ static struct kmmio_fault_page *get_kmmio_fault_page(unsigned long addr) return NULL; addr &= page_level_mask(l); head = kmmio_page_list(addr); + list_for_each_entry_rcu(f, head, list) { - if (f->addr == addr) - return f; + if (f->addr == addr) { + if (!result) + return f; + else + printk(KERN_ERR " %s collision detected %lu", __FUNCTION__, addr); + } } - return NULL; + return result; } static void clear_pmd_presence(pmd_t *pmd, bool clear, pmdval_t *old) @@ -375,6 +386,7 @@ static int add_kmmio_fault_page(unsigned long addr) { struct kmmio_fault_page *f; + printk(KERN_WARNING " %s %lx", __FUNCTION__, addr); f = get_kmmio_fault_page(addr); if (f) { if (!f->count) @@ -406,6 +418,7 @@ static void release_kmmio_fault_page(unsigned long addr, { struct kmmio_fault_page *f; + printk(KERN_WARNING " %s %lx", __FUNCTION__, addr); f = get_kmmio_fault_page(addr); if (!f) return; @@ -445,6 +458,8 @@ int register_kmmio_probe(struct kmmio_probe *p) } pte = lookup_address(p->addr, &l); + printk(KERN_WARNING " %s %lx %u", __FUNCTION__, p->addr, l); + if (!pte) { ret = -EINVAL; goto out; @@ -537,6 +552,8 @@ void unregister_kmmio_probe(struct kmmio_probe *p) if (!pte) return; + printk(KERN_WARNING " %s %lx %u", __FUNCTION__, p->addr, l); + spin_lock_irqsave(&kmmio_lock, flags); while (size < size_lim) { release_kmmio_fault_page(p->addr + size, &release_list); -- 2.11.0.rc2 ^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: mmiotracer hangs the system 2016-08-19 10:35 ` karol herbst 2016-08-19 13:02 ` Andy Shevchenko @ 2016-08-19 13:34 ` Steven Rostedt 1 sibling, 0 replies; 20+ messages in thread From: Steven Rostedt @ 2016-08-19 13:34 UTC (permalink / raw) To: Andy Shevchenko Cc: karol herbst, linux-kernel@vger.kernel.org, Paul E. McKenney, Ingo Molnar Andy, OK, the ball is in your court. Karol can't reproduce it, thus it will require you sending debug information back so we can get this solved. -- Steve On Fri, 19 Aug 2016 12:35:24 +0200 karol herbst <karolherbst@gmail.com> wrote: > Hi everybody, > > is there any update on that issue I missed somehow? I really don't > want to leave the mmiotracer in a state, where it breaks something > while fixing other issues. > > But for now, without being able to even reproduce the issue, I can't > really do much, because the code in the current state looks sane to > me. Maybe this case includes the mmiotracer cleaning things up and > arms new region for mmiotracing and that's why it fails? Besides that, > I have no idea and no way to reproduce this, so I can't help this way. > > Greetings > > 2016-08-02 18:13 GMT+02:00 Steven Rostedt <rostedt@goodmis.org>: > > On Tue, 2 Aug 2016 19:08:24 +0300 > > Andy Shevchenko <andy.shevchenko@gmail.com> wrote: > > > >> I don't think so, since linux-next doesn't work until I revert this commit. > >> > >> I can try exactly v4.6 (yep, I tried stable versions, including > >> v4.4.16 that's why all of them failed to me) if you still would like > >> me to do so. > > > > If linux-next doesn't work, then don't bother. > > > > That commit obviously broke something and you'll probably need help > > from Karol to fix it. > > > > Thanks, > > > > -- Steve ^ permalink raw reply [flat|nested] 20+ messages in thread