linux-tegra.vger.kernel.org archive mirror
* Re: [PATCH tty-next v5 5/6] serial: 8250: Switch to nbcon console
       [not found] ` <20250107212702.169493-6-john.ogness@linutronix.de>
@ 2025-01-15 16:21   ` Jon Hunter
  2025-01-15 16:54     ` John Ogness
  2025-10-08 15:56     ` John Ogness
  0 siblings, 2 replies; 16+ messages in thread
From: Jon Hunter @ 2025-01-15 16:21 UTC
  To: John Ogness, Greg Kroah-Hartman
  Cc: Jiri Slaby, Petr Mladek, Sergey Senozhatsky, Steven Rostedt,
	Thomas Gleixner, Esben Haabendal, linux-serial, linux-kernel,
	Andy Shevchenko, Arnd Bergmann, Tony Lindgren, Niklas Schnelle,
	Serge Semin, linux-tegra@vger.kernel.org

Hi John,

On 07/01/2025 21:27, John Ogness wrote:
> Implement the necessary callbacks to switch the 8250 console driver
> to perform as an nbcon console.
> 
> Add implementations for the nbcon console callbacks:
> 
>    ->write_atomic()
>    ->write_thread()
>    ->device_lock()
>    ->device_unlock()
> 
> and add CON_NBCON to the initial @flags.
> 
> All register accesses in the callbacks are within unsafe sections.
> The ->write_atomic() and ->write_thread() callbacks allow safe
> handover/takeover per byte and add a preceding newline if they
> take over from another context mid-line.
> 
> For the ->write_atomic() callback, a new irq_work is used to defer
> modem control since it may be called from a context that does not
> allow waking up tasks.
> 
> Note: A new __serial8250_clear_IER() is introduced for direct
> clearing of UART_IER. This will allow the lockdep check to be
> restored in serial8250_clear_IER() in a follow-up commit.
> 
> Signed-off-by: John Ogness <john.ogness@linutronix.de>


I have noticed a suspend regression on -next for some of our 32-bit
Tegra (ARM) devices (Tegra20, Tegra30 and Tegra124). Bisect points to
this commit, and reverting it on top of -next (along with reverting
'serial: 8250: Revert "drop lockdep annotation from
serial8250_clear_IER()"') fixes the issue. So far I have not dug in any
further. Unfortunately, I don't have any logs to show whether there is
a crash or something else happening, but I will see if there is any
more info I can get.

Thanks
Jon

-- 
nvpublic


^ permalink raw reply	[flat|nested] 16+ messages in thread
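
The handover behaviour described in the quoted commit message (output is
emitted byte by byte, and a writer that takes over mid-line first emits a
newline) can be illustrated with a toy userspace model. This is only a
sketch in Python, not the kernel implementation: `ToyConsole`, `write()`,
the `owner` argument and `max_chars` are hypothetical names invented for
the illustration.

```python
class ToyConsole:
    """Toy model of a console shared by several printing contexts."""

    def __init__(self):
        self.output = []        # characters "sent to the UART"
        self.last_owner = None  # context that called write() most recently
        self.mid_line = False   # True if the last character was not "\n"

    def write(self, owner, text, max_chars=None):
        """Emit text one character at a time on behalf of owner.

        max_chars models losing ownership partway through a message:
        only that many characters are emitted before the writer stops.
        Returns the number of characters actually emitted from text.
        """
        # Taking over from another context in the middle of a line:
        # start on a fresh line so the two messages do not run together.
        if self.mid_line and owner != self.last_owner:
            self._emit("\n")
        self.last_owner = owner
        chars = text if max_chars is None else text[:max_chars]
        for ch in chars:
            self._emit(ch)
        return len(chars)

    def _emit(self, ch):
        self.output.append(ch)
        self.mid_line = ch != "\n"


con = ToyConsole()
con.write("thread", "normal message\n")
con.write("thread", "interrupted mess", max_chars=11)  # ownership lost mid-line
con.write("atomic", "panic output\n")                  # takes over, adds "\n"
print("".join(con.output), end="")
```

In this model the second writer's output starts on its own line even though
the first writer stopped after "interrupted"; a writer that continues its
own partial line does not get an extra newline.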

* Re: [PATCH tty-next v5 5/6] serial: 8250: Switch to nbcon console
  2025-01-15 16:21   ` [PATCH tty-next v5 5/6] serial: 8250: Switch to nbcon console Jon Hunter
@ 2025-01-15 16:54     ` John Ogness
  2025-01-16 10:27       ` Jon Hunter
  2025-10-08 15:56     ` John Ogness
  1 sibling, 1 reply; 16+ messages in thread
From: John Ogness @ 2025-01-15 16:54 UTC
  To: Jon Hunter, Greg Kroah-Hartman
  Cc: Jiri Slaby, Petr Mladek, Sergey Senozhatsky, Steven Rostedt,
	Thomas Gleixner, Esben Haabendal, linux-serial, linux-kernel,
	Andy Shevchenko, Arnd Bergmann, Tony Lindgren, Niklas Schnelle,
	Serge Semin, linux-tegra@vger.kernel.org

On 2025-01-15, Jon Hunter <jonathanh@nvidia.com> wrote:
> I have noticed a suspend regression on -next for some of our 32-bit
> Tegra (ARM) devices (Tegra20, Tegra30 and Tegra124). Bisect points to
> this commit, and reverting it on top of -next (along with reverting
> 'serial: 8250: Revert "drop lockdep annotation from
> serial8250_clear_IER()"') fixes the issue. So far I have not dug in any
> further. Unfortunately, I don't have any logs to show whether there is
> a crash or something else happening, but I will see if there is any
> more info I can get.

Do you at least know if it is failing to suspend or failing to resume
(based on power consumption)?

John

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH tty-next v5 5/6] serial: 8250: Switch to nbcon console
  2025-01-15 16:54     ` John Ogness
@ 2025-01-16 10:27       ` Jon Hunter
  2025-01-16 10:38         ` John Ogness
  0 siblings, 1 reply; 16+ messages in thread
From: Jon Hunter @ 2025-01-16 10:27 UTC
  To: John Ogness, Greg Kroah-Hartman
  Cc: Jiri Slaby, Petr Mladek, Sergey Senozhatsky, Steven Rostedt,
	Thomas Gleixner, Esben Haabendal, linux-serial, linux-kernel,
	Andy Shevchenko, Arnd Bergmann, Tony Lindgren, Niklas Schnelle,
	Serge Semin, linux-tegra@vger.kernel.org


On 15/01/2025 16:54, John Ogness wrote:
> On 2025-01-15, Jon Hunter <jonathanh@nvidia.com> wrote:
>> I have noticed a suspend regression on -next for some of our 32-bit
>> Tegra (ARM) devices (Tegra20, Tegra30 and Tegra124). Bisect points to
>> this commit, and reverting it on top of -next (along with reverting
>> 'serial: 8250: Revert "drop lockdep annotation from
>> serial8250_clear_IER()"') fixes the issue. So far I have not dug in any
>> further. Unfortunately, I don't have any logs to show whether there is
>> a crash or something else happening, but I will see if there is any
>> more info I can get.
> 
> Do you at least know if it is failing to suspend or failing to resume
> (based on power consumption)?


Unfortunately, I don't. These are farm boards and so nothing local I can 
get my hands on. For some reason all the serial console logs are not 
available and so I am going to talk to the farm team about fixing that 
because we should at least have serial logs.

Thanks
Jon

-- 
nvpublic


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH tty-next v5 5/6] serial: 8250: Switch to nbcon console
  2025-01-16 10:27       ` Jon Hunter
@ 2025-01-16 10:38         ` John Ogness
  2025-01-16 10:41           ` Jon Hunter
  0 siblings, 1 reply; 16+ messages in thread
From: John Ogness @ 2025-01-16 10:38 UTC
  To: Jon Hunter, Greg Kroah-Hartman
  Cc: Jiri Slaby, Petr Mladek, Sergey Senozhatsky, Steven Rostedt,
	Thomas Gleixner, Esben Haabendal, linux-serial, linux-kernel,
	Andy Shevchenko, Arnd Bergmann, Tony Lindgren, Niklas Schnelle,
	Serge Semin, linux-tegra@vger.kernel.org

On 2025-01-16, Jon Hunter <jonathanh@nvidia.com> wrote:
>> Do you at least know if it is failing to suspend or failing to resume
>> (based on power consumption)?
>
>
> Unfortunately, I don't. These are farm boards and so nothing local I can 
> get my hands on. For some reason all the serial console logs are not 
> available and so I am going to talk to the farm team about fixing that 
> because we should at least have serial logs.

Can you confirm that the board is actually booting? The suspend code for
8250_tegra.c is quite simple. I am wondering if the farm tests are
failing somewhere else, such as the atomic printing during early boot.

John

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH tty-next v5 5/6] serial: 8250: Switch to nbcon console
  2025-01-16 10:38         ` John Ogness
@ 2025-01-16 10:41           ` Jon Hunter
  2025-01-20 16:23             ` Thierry Reding
  0 siblings, 1 reply; 16+ messages in thread
From: Jon Hunter @ 2025-01-16 10:41 UTC
  To: John Ogness, Greg Kroah-Hartman
  Cc: Jiri Slaby, Petr Mladek, Sergey Senozhatsky, Steven Rostedt,
	Thomas Gleixner, Esben Haabendal, linux-serial, linux-kernel,
	Andy Shevchenko, Arnd Bergmann, Tony Lindgren, Niklas Schnelle,
	Serge Semin, linux-tegra@vger.kernel.org


On 16/01/2025 10:38, John Ogness wrote:
> On 2025-01-16, Jon Hunter <jonathanh@nvidia.com> wrote:
>>> Do you at least know if it is failing to suspend or failing to resume
>>> (based on power consumption)?
>>
>>
>> Unfortunately, I don't. These are farm boards and so nothing local I can
>> get my hands on. For some reason all the serial console logs are not
>> available and so I am going to talk to the farm team about fixing that
>> because we should at least have serial logs.
> 
> Can you confirm that the board is actually booting? The suspend code for
> 8250_tegra.c is quite simple. I am wondering if the farm tests are
> failing somewhere else, such as the atomic printing during early boot.


Yes they are all booting fine. I have an independent boot test and that 
is passing. It is just the suspend test that is failing.

Thanks
Jon

-- 
nvpublic


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH tty-next v5 5/6] serial: 8250: Switch to nbcon console
  2025-01-16 10:41           ` Jon Hunter
@ 2025-01-20 16:23             ` Thierry Reding
  2025-01-20 16:34               ` Thierry Reding
  0 siblings, 1 reply; 16+ messages in thread
From: Thierry Reding @ 2025-01-20 16:23 UTC
  To: Jon Hunter
  Cc: John Ogness, Greg Kroah-Hartman, Jiri Slaby, Petr Mladek,
	Sergey Senozhatsky, Steven Rostedt, Thomas Gleixner,
	Esben Haabendal, linux-serial, linux-kernel, Andy Shevchenko,
	Arnd Bergmann, Tony Lindgren, Niklas Schnelle, Serge Semin,
	linux-tegra@vger.kernel.org

On Thu, Jan 16, 2025 at 10:41:08AM +0000, Jon Hunter wrote:
> 
> On 16/01/2025 10:38, John Ogness wrote:
> > On 2025-01-16, Jon Hunter <jonathanh@nvidia.com> wrote:
> > > > Do you at least know if it is failing to suspend or failing to resume
> > > > (based on power consumption)?
> > > 
> > > 
> > > Unfortunately, I don't. These are farm boards and so nothing local I can
> > > get my hands on. For some reason all the serial console logs are not
> > > available and so I am going to talk to the farm team about fixing that
> > > because we should at least have serial logs.
> > 
> > Can you confirm that the board is actually booting? The suspend code for
> > 8250_tegra.c is quite simple. I am wondering if the farm tests are
> > failing somewhere else, such as the atomic printing during early boot.
> 
> 
> Yes they are all booting fine. I have an independent boot test and that is
> passing. It is just the suspend test that is failing.

I was able to capture logs, but unfortunately they don't provide much
insight either. On the first try it doesn't suspend and goes back to
userspace after a second or so:

--- >8 ---
-sh-5.1# rtcwake --device /dev/rtc1 --mode mem --seconds 5
rtcwake: assuming RTC uses UTC ...
rtcwake: wakeup from "mem" using /dev/rtc1 at Thu Jan  1 00:01:00 1970
[   36.332486] PM: suspend entry (deep)
[   36.332832] Filesystems sync: 0.000 seconds
[   36.369331] +1.8V_RUN_CAM: disabling
[   36.373884] +2.8V_RUN_CAM: disabling
[   36.375571] +1.2V_RUN_CAM_FRONT: disabling
[   36.380359] +1.05V_RUN_CAM_REAR: disabling
[   36.387399] +3.3V_RUN_TOUCH: disabling
[   36.390808] +2.8V_RUN_CAM_AF: disabling
[   36.393621] +1.8V_RUN_VPP_FUSE: disabling
[   36.408218] Freezing user space processes
[   36.413660] Freezing user space processes completed (elapsed 0.005 seconds)
[   36.413680] OOM killer disabled.
[   36.413693] Freezing remaining freezable tasks
[   36.415033] Freezing remaining freezable tasks completed (elapsed 0.001 seconds)
[   36.428474] drm drm: [drm:drm_client_dev_suspend] fbdev: ret=0
[   36.428527] drm drm: [drm:drm_atomic_state_init] Allocated atomic state 2e5cd010
[   36.428547] drm drm: [drm:drm_atomic_get_crtc_state] Added [CRTC:47:crtc-0] 6a6be0ef state to 2e5cd010
[   36.428561] drm drm: [drm:drm_atomic_get_crtc_state] Added [CRTC:63:crtc-1] 00d818c2 state to 2e5cd010
[   36.428574] drm drm: [drm:drm_atomic_get_plane_state] Added [PLANE:32:plane-0] 4e145b7d state to 2e5cd010
[   36.428587] drm drm: [drm:drm_atomic_get_plane_state] Added [PLANE:36:plane-1] dbf67d12 state to 2e5cd010
[   36.428597] drm drm: [drm:drm_atomic_get_plane_state] Added [PLANE:40:plane-2] 763d8809 state to 2e5cd010
[   36.428608] drm drm: [drm:drm_atomic_get_plane_state] Added [PLANE:44:plane-3] b6eabcf1 state to 2e5cd010
[   36.428617] drm drm: [drm:drm_atomic_get_plane_state] Added [PLANE:48:plane-4] 7863878c state to 2e5cd010
[   36.428628] drm drm: [drm:drm_atomic_get_plane_state] Added [PLANE:52:plane-5] 54b8029c state to 2e5cd010
[   36.428638] drm drm: [drm:drm_atomic_get_plane_state] Added [PLANE:56:plane-6] 364063af state to 2e5cd010
[   36.428648] drm drm: [drm:drm_atomic_get_plane_state] Added [PLANE:60:plane-7] e1c11dfb state to 2e5cd010
[   36.428662] drm drm: [drm:drm_atomic_get_connector_state] Added [CONNECTOR:65:HDMI-A-1] 5cb32770 state to 2e5cd010
[   36.428674] drm drm: [drm:drm_atomic_state_init] Allocated atomic state 832943c7
[   36.428682] drm drm: [drm:drm_atomic_get_crtc_state] Added [CRTC:47:crtc-0] f09cf73d state to 832943c7
[   36.428691] drm drm: [drm:drm_atomic_add_affected_planes] Adding all current planes for [CRTC:47:crtc-0] to 832943c7
[   36.428700] drm drm: [drm:drm_atomic_add_affected_connectors] Adding all current connectors for [CRTC:47:crtc-0] to 832943c7
[   36.428711] drm drm: [drm:drm_atomic_get_crtc_state] Added [CRTC:63:crtc-1] 2700922c state to 832943c7
[   36.428720] drm drm: [drm:drm_atomic_add_affected_planes] Adding all current planes for [CRTC:63:crtc-1] to 832943c7
[   36.428727] drm drm: [drm:drm_atomic_add_affected_connectors] Adding all current connectors for [CRTC:63:crtc-1] to 832943c7
[   36.428737] drm drm: [drm:drm_atomic_check_only] checking 832943c7
[   36.428759] drm drm: [drm:drm_atomic_commit] committing 832943c7
[   36.428881] drm drm: [drm:drm_atomic_state_default_clear] Clearing atomic state 832943c7
[   36.428897] drm drm: [drm:__drm_atomic_state_free] Freeing atomic state 832943c7
[   36.429085] r8169 0000:01:00.0 eth0: Link is Down
[   36.713236] Disabling non-boot CPUs ...
-sh-5.1#
--- >8 ---

A second attempt soft-hangs:

--- >8 ---
-sh-5.1# rtcwake --device /dev/rtc1 --mode mem --seconds 5
rtcwake: assuming RTC uses UTC ...
rtcwake: wakeup from "mem" using /dev/rtc1 at Thu Jan  1 00:01:10 1970
--- >8 ---

Here "soft-hang" means that nothing happens after this and I can't
SIGINT out of it. However, the serial console still seems to be
responsive.

Thierry

^ permalink raw reply	[flat|nested] 16+ messages in thread
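
The two outcomes reported above (rtcwake returning to the shell after about
a second, and rtcwake never returning) could be told apart automatically
with a small wrapper. The following is a rough Python sketch under stated
assumptions, not a tool used in this thread: `classify()`, `try_suspend()`,
the `slack` margin and the 60-second timeout are hypothetical. Only the
rtcwake options shown in the log (`--device`, `--mode`, `--seconds`) are
used. Whether the timeout can fire at all in the hang case depends on where
the kernel is stuck, and `subprocess.run()` waits for the child after
killing it, so an unkillable rtcwake would block the wrapper as well.

```python
import subprocess
import time


def classify(timed_out, elapsed, seconds, slack=1.0):
    """Classify one suspend attempt from how rtcwake behaved.

    slack allows for the alarm firing slightly early (assumed margin).
    """
    if timed_out:
        return "hang"        # rtcwake did not return within the timeout
    if elapsed < seconds - slack:
        return "no-suspend"  # back at the prompt before the alarm was due
    return "ok"


def try_suspend(device="/dev/rtc1", seconds=5, timeout=60):
    """Run rtcwake once and classify the result (needs root and an RTC)."""
    cmd = ["rtcwake", "--device", device, "--mode", "mem",
           "--seconds", str(seconds)]
    # CLOCK_BOOTTIME (Linux only) keeps counting while the system is suspended.
    start = time.clock_gettime(time.CLOCK_BOOTTIME)
    try:
        subprocess.run(cmd, timeout=timeout, check=False)
        timed_out = False
    except subprocess.TimeoutExpired:
        timed_out = True
    elapsed = time.clock_gettime(time.CLOCK_BOOTTIME) - start
    return classify(timed_out, elapsed, seconds)
```

With the numbers from the first log above (back at the prompt roughly one
second after a 5-second alarm was requested), `classify(False, 1.0, 5)`
yields "no-suspend".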

* Re: [PATCH tty-next v5 5/6] serial: 8250: Switch to nbcon console
  2025-01-20 16:23             ` Thierry Reding
@ 2025-01-20 16:34               ` Thierry Reding
  2025-01-27 14:54                 ` Jon Hunter
  0 siblings, 1 reply; 16+ messages in thread
From: Thierry Reding @ 2025-01-20 16:34 UTC
  To: Jon Hunter
  Cc: John Ogness, Greg Kroah-Hartman, Jiri Slaby, Petr Mladek,
	Sergey Senozhatsky, Steven Rostedt, Thomas Gleixner,
	Esben Haabendal, linux-serial, linux-kernel, Andy Shevchenko,
	Arnd Bergmann, Tony Lindgren, Niklas Schnelle, Serge Semin,
	linux-tegra@vger.kernel.org

On Mon, Jan 20, 2025 at 05:23:26PM +0100, Thierry Reding wrote:
> [...]

To clarify, this was on top of next-20250120, and with the patches that
Jon mentioned reverted, suspend/resume is fixed for me as well.

I do have a local device that I can test on, so if there are any patches
you want me to try, or any options to enable to get more information,
please let me know.

Thanks,
Thierry

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH tty-next v5 5/6] serial: 8250: Switch to nbcon console
  2025-01-20 16:34               ` Thierry Reding
@ 2025-01-27 14:54                 ` Jon Hunter
  2025-01-27 15:20                   ` Petr Mladek
  2025-01-27 15:21                   ` John Ogness
  0 siblings, 2 replies; 16+ messages in thread
From: Jon Hunter @ 2025-01-27 14:54 UTC
  To: Thierry Reding, John Ogness
  Cc: Greg Kroah-Hartman, Jiri Slaby, Petr Mladek, Sergey Senozhatsky,
	Steven Rostedt, Thomas Gleixner, Esben Haabendal, linux-serial,
	linux-kernel, Andy Shevchenko, Arnd Bergmann, Tony Lindgren,
	Niklas Schnelle, Serge Semin, linux-tegra@vger.kernel.org

Hi John,

On 20/01/2025 16:34, Thierry Reding wrote:
> On Mon, Jan 20, 2025 at 05:23:26PM +0100, Thierry Reding wrote:
>> [...]
> 
> To clarify, this was on top of next-20250120, and with the patches that
> Jon mentioned reverted, suspend/resume is fixed for me as well.
> 
> I do have a local device that I can test on, so if there are any patches
> you want me to try, or any options to enable to get more information,
> please let me know.


Any feedback on this? Our boards are still broken with this change.

Thanks!
Jon

-- 
nvpublic


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH tty-next v5 5/6] serial: 8250: Switch to nbcon console
  2025-01-27 14:54                 ` Jon Hunter
@ 2025-01-27 15:20                   ` Petr Mladek
  2025-01-27 15:21                   ` John Ogness
  1 sibling, 0 replies; 16+ messages in thread
From: Petr Mladek @ 2025-01-27 15:20 UTC
  To: Jon Hunter
  Cc: Thierry Reding, John Ogness, Greg Kroah-Hartman, Jiri Slaby,
	Sergey Senozhatsky, Steven Rostedt, Thomas Gleixner,
	Esben Haabendal, linux-serial, linux-kernel, Andy Shevchenko,
	Arnd Bergmann, Tony Lindgren, Niklas Schnelle, Serge Semin,
	linux-tegra@vger.kernel.org

On Mon 2025-01-27 14:54:25, Jon Hunter wrote:
> Hi John,
> 
> On 20/01/2025 16:34, Thierry Reding wrote:
> > On Mon, Jan 20, 2025 at 05:23:26PM +0100, Thierry Reding wrote:
> > > [...]
> > 
> > To clarify, this was on top of next-20250120, and with the patches
> > that Jon mentioned reverted, suspend/resume is fixed for me as well.
> > 
> > I do have a local device that I can test on, so if there's any patches
> > you want me to try, or any options to enable to get more information,
> > please let me know.
> 
> 
> Any feedback on this? Our boards are still broken with this change.

AFAIK, this patch has been reverted in linux-next last week, see
https://lore.kernel.org/r/84wmenqm03.fsf@jogness.linutronix.de

John had a training. I believe that he would look at these
suspend-related problems before sending another revision
of the patchset.

Best Regards,
Petr


* Re: [PATCH tty-next v5 5/6] serial: 8250: Switch to nbcon console
  2025-01-27 14:54                 ` Jon Hunter
  2025-01-27 15:20                   ` Petr Mladek
@ 2025-01-27 15:21                   ` John Ogness
  2025-01-27 16:13                     ` Jon Hunter
  1 sibling, 1 reply; 16+ messages in thread
From: John Ogness @ 2025-01-27 15:21 UTC (permalink / raw)
  To: Jon Hunter, Thierry Reding
  Cc: Greg Kroah-Hartman, Jiri Slaby, Petr Mladek, Sergey Senozhatsky,
	Steven Rostedt, Thomas Gleixner, Esben Haabendal, linux-serial,
	linux-kernel, Andy Shevchenko, Arnd Bergmann, Tony Lindgren,
	Niklas Schnelle, Serge Semin, linux-tegra@vger.kernel.org

Hi Jon,

On 2025-01-27, Jon Hunter <jonathanh@nvidia.com> wrote:
> Any feedback on this? Our boards are still broken with this change.

I have not yet been able to reproduce it (mostly battling brokenness in
am335x pm dependencies). For now the change has been reverted in
linux-next. I will send you a patch once I have something to send.

John


* Re: [PATCH tty-next v5 5/6] serial: 8250: Switch to nbcon console
  2025-01-27 15:21                   ` John Ogness
@ 2025-01-27 16:13                     ` Jon Hunter
  0 siblings, 0 replies; 16+ messages in thread
From: Jon Hunter @ 2025-01-27 16:13 UTC (permalink / raw)
  To: John Ogness, Thierry Reding
  Cc: Greg Kroah-Hartman, Jiri Slaby, Petr Mladek, Sergey Senozhatsky,
	Steven Rostedt, Thomas Gleixner, Esben Haabendal, linux-serial,
	linux-kernel, Andy Shevchenko, Arnd Bergmann, Tony Lindgren,
	Niklas Schnelle, Serge Semin, linux-tegra@vger.kernel.org


On 27/01/2025 15:21, John Ogness wrote:
> Hi Jon,
> 
> On 2025-01-27, Jon Hunter <jonathanh@nvidia.com> wrote:
>> Any feedback on this? Our boards are still broken with this change.
> 
> I have not yet been able to reproduce it (mostly battling brokenness in
> am335x pm dependencies). For now the change has been reverted in
> linux-next. I will send you a patch once I have something to send.


OK, great. And yes, it appears to be passing since next-20250123. I should 
have checked!

Thanks
Jon

-- 
nvpublic



* Re: [PATCH tty-next v5 5/6] serial: 8250: Switch to nbcon console
  2025-01-15 16:21   ` [PATCH tty-next v5 5/6] serial: 8250: Switch to nbcon console Jon Hunter
  2025-01-15 16:54     ` John Ogness
@ 2025-10-08 15:56     ` John Ogness
  2025-10-08 19:21       ` Jon Hunter
  1 sibling, 1 reply; 16+ messages in thread
From: John Ogness @ 2025-10-08 15:56 UTC (permalink / raw)
  To: Jon Hunter, Greg Kroah-Hartman
  Cc: Jiri Slaby, Petr Mladek, Sergey Senozhatsky, Steven Rostedt,
	Thomas Gleixner, Esben Haabendal, linux-serial, linux-kernel,
	Andy Shevchenko, Arnd Bergmann, Tony Lindgren, Niklas Schnelle,
	Serge Semin, linux-tegra@vger.kernel.org

Hi Jon,

On 2025-01-15, Jon Hunter <jonathanh@nvidia.com> wrote:
> I have noticed a suspend regression on -next for some of our 32-bit 
> Tegra (ARM) devices (Tegra20, Tegra30 and Tegra124). Bisect is pointing 
> to this commit and reverting this on top of -next (along with reverting 
> "serial: 8250: Revert "drop lockdep annotation from 
> serial8250_clear_IER()") fixes the issue. So far I have not dug in any 
> further. Unfortunately, I don't have any logs to see if there is some 
> crash or something happening but I will see if there is any more info I 
> can get.

I have been looking into reproducing this using other 8250/ARM boards
(BeagleBone Black and Phytec WEGA). Unfortunately it is just showing me
all kinds of other brokenness (in mainline) and essentially making it
impossible to confirm that I am seeing what you are seeing, since
suspend/resume is randomly dying without my 8250-nbcon patch.

Before I start spending weeks investigating/fixing most likely totally
unrelated PM or BSP issues, is it possible that I could receive one of
the boards you mentioned so that I can reproduce and debug the actual
problem you are reporting? If this is possible, feel free to take this
conversation offline so that we can discuss delivery details. Thanks!

John Ogness


* Re: [PATCH tty-next v5 5/6] serial: 8250: Switch to nbcon console
  2025-10-08 15:56     ` John Ogness
@ 2025-10-08 19:21       ` Jon Hunter
  2025-10-09 10:04         ` Thierry Reding
  0 siblings, 1 reply; 16+ messages in thread
From: Jon Hunter @ 2025-10-08 19:21 UTC (permalink / raw)
  To: John Ogness, Greg Kroah-Hartman, Thierry Reding
  Cc: Jiri Slaby, Petr Mladek, Sergey Senozhatsky, Steven Rostedt,
	Thomas Gleixner, Esben Haabendal, linux-serial, linux-kernel,
	Andy Shevchenko, Arnd Bergmann, Tony Lindgren, Niklas Schnelle,
	Serge Semin, linux-tegra@vger.kernel.org

Hi John,

On 08/10/2025 16:56, John Ogness wrote:
> Hi Jon,
> 
> On 2025-01-15, Jon Hunter <jonathanh@nvidia.com> wrote:
>> I have noticed a suspend regression on -next for some of our 32-bit
>> Tegra (ARM) devices (Tegra20, Tegra30 and Tegra124). Bisect is pointing
>> to this commit and reverting this on top of -next (along with reverting
>> "serial: 8250: Revert "drop lockdep annotation from
>> serial8250_clear_IER()") fixes the issue. So far I have not dug in any
>> further. Unfortunately, I don't have any logs to see if there is some
>> crash or something happening but I will see if there is any more info I
>> can get.
> 
> I have been looking into reproducing this using other 8250/ARM boards
> (BeagleBone Black and Phytec WEGA). Unfortunately it is just showing me
> all kinds of other brokenness (in mainline) and essentially making it
> impossible to confirm that I am seeing what you are seeing, since
> suspend/resume is randomly dying without my 8250-nbcon patch.
> 
> Before I start spending weeks investigating/fixing most likely totally
> unrelated PM or BSP issues, is it possible that I could receive one of
> the boards you mentioned so that I can reproduce and debug the actual
> problem you are reporting? If this is possible, feel free to take this
> conversation offline so that we can discuss delivery details. Thanks!

These boards are really old now and so I don't really have any that we 
can ship. It would be great to get this change merged as I see that it 
is needed for RT support. I could see if I can resurrect a Tegra124 
Jetson TK1 and test again on that to see if we can get some logs.

Thierry, do you have a Tegra124 Jetson TK1 handy to test this change on?

Jon

-- 
nvpublic



* Re: [PATCH tty-next v5 5/6] serial: 8250: Switch to nbcon console
  2025-10-08 19:21       ` Jon Hunter
@ 2025-10-09 10:04         ` Thierry Reding
  2025-10-09 11:49           ` John Ogness
  2025-10-09 12:54           ` Petr Mladek
  0 siblings, 2 replies; 16+ messages in thread
From: Thierry Reding @ 2025-10-09 10:04 UTC (permalink / raw)
  To: John Ogness
  Cc: Jon Hunter, Greg Kroah-Hartman, Thierry Reding, Jiri Slaby,
	Petr Mladek, Sergey Senozhatsky, Steven Rostedt, Thomas Gleixner,
	Esben Haabendal, linux-serial, linux-kernel, Andy Shevchenko,
	Arnd Bergmann, Tony Lindgren, Niklas Schnelle, Serge Semin,
	linux-tegra@vger.kernel.org


On Wed, Oct 08, 2025 at 08:21:49PM +0100, Jon Hunter wrote:
> Hi John,
> 
> On 08/10/2025 16:56, John Ogness wrote:
> > Hi Jon,
> > 
> > On 2025-01-15, Jon Hunter <jonathanh@nvidia.com> wrote:
> > > I have noticed a suspend regression on -next for some of our 32-bit
> > > Tegra (ARM) devices (Tegra20, Tegra30 and Tegra124). Bisect is pointing
> > > to this commit and reverting this on top of -next (along with reverting
> > > "serial: 8250: Revert "drop lockdep annotation from
> > > serial8250_clear_IER()") fixes the issue. So far I have not dug in any
> > > further. Unfortunately, I don't have any logs to see if there is some
> > > crash or something happening but I will see if there is any more info I
> > > can get.
> > 
> > I have been looking into reproducing this using other 8250/ARM boards
> > (BeagleBone Black and Phytec WEGA). Unfortunately it is just showing me
> > all kinds of other brokenness (in mainline) and essentially making it
> > impossible to confirm that I am seeing what you are seeing, since
> > suspend/resume is randomly dying without my 8250-nbcon patch.
> > 
> > Before I start spending weeks investigating/fixing most likely totally
> > unrelated PM or BSP issues, is it possible that I could receive one of
> > the boards you mentioned so that I can reproduce and debug the actual
> > problem you are reporting? If this is possible, feel free to take this
> > conversation offline so that we can discuss delivery details. Thanks!
> 
> These boards are really old now and so I don't really have any that we can
> ship. It would be great to get this change merged as I see that it is needed
> for RT support. I could see if I can resurrect a Tegra124 Jetson TK1 and
> test again on that to see if we can get some logs.
> 
> Thierry, do you have a Tegra124 Jetson TK1 handy to test this change on?

Yes, I do. I reapplied patches 5 and 6 from the series (resolved a tiny
conflict for patch 5) and reran the tests. Same results as back in
January, though. Basically the first suspend doesn't work (it exits back
to userspace after a few seconds) and the second attempt then hangs. No
idea why that would be happening.

I looked a bit at the code, but nothing jumped out that would explain
this. Not that I'm very familiar with any of this code, or the specifics
needed by nbcon. no_console_suspend doesn't have any noticeable effect,
other than providing a few more messages during suspend, but nothing
that would indicate what's going wrong.

John, I'm happy to test any other patches if you've got any ideas on
what could be wrong.

Thierry



* Re: [PATCH tty-next v5 5/6] serial: 8250: Switch to nbcon console
  2025-10-09 10:04         ` Thierry Reding
@ 2025-10-09 11:49           ` John Ogness
  2025-10-09 12:54           ` Petr Mladek
  1 sibling, 0 replies; 16+ messages in thread
From: John Ogness @ 2025-10-09 11:49 UTC (permalink / raw)
  To: Thierry Reding
  Cc: Jon Hunter, Greg Kroah-Hartman, Thierry Reding, Jiri Slaby,
	Petr Mladek, Sergey Senozhatsky, Steven Rostedt, Thomas Gleixner,
	Esben Haabendal, linux-serial, linux-kernel, Andy Shevchenko,
	Arnd Bergmann, Tony Lindgren, Niklas Schnelle, Serge Semin,
	linux-tegra@vger.kernel.org

Hi Thierry,

On 2025-10-09, Thierry Reding <thierry.reding@gmail.com> wrote:
>> Thierry, do you have a Tegra124 Jetson TK1 handy to test this change on?
>
> Yes, I do. I reapplied patches 5 and 6 from the series (resolved a tiny
> conflict for patch 5) and reran the tests. Same results as back in
> January, though. Basically the first suspend doesn't work (it exits back
> to userspace after a few seconds) and the second attempt then hangs. No
> idea why that would be happening.

[...]

> John, I'm happy to test any other patches if you've got any ideas on
> what could be wrong.

Thanks for your support. I created a branch on a public git repository
[0] so that we have a common source to work with. I will push additional
debug-commits on top.

I am taking this email-debugging-session offlist. I will post again to
this thread once we come to some conclusion.

John

[0] https://github.com/Linutronix/linux (branch nvidia/debug-8250-nbcon)


* Re: [PATCH tty-next v5 5/6] serial: 8250: Switch to nbcon console
  2025-10-09 10:04         ` Thierry Reding
  2025-10-09 11:49           ` John Ogness
@ 2025-10-09 12:54           ` Petr Mladek
  1 sibling, 0 replies; 16+ messages in thread
From: Petr Mladek @ 2025-10-09 12:54 UTC (permalink / raw)
  To: Thierry Reding
  Cc: John Ogness, Jon Hunter, Greg Kroah-Hartman, Thierry Reding,
	Jiri Slaby, Sergey Senozhatsky, Steven Rostedt, Thomas Gleixner,
	Esben Haabendal, linux-serial, linux-kernel, Andy Shevchenko,
	Arnd Bergmann, Tony Lindgren, Niklas Schnelle, Serge Semin,
	Jacky Bai, linux-tegra@vger.kernel.org

On Thu 2025-10-09 12:04:13, Thierry Reding wrote:
> On Wed, Oct 08, 2025 at 08:21:49PM +0100, Jon Hunter wrote:
> > Hi John,
> > 
> > On 08/10/2025 16:56, John Ogness wrote:
> > > Hi Jon,
> > > 
> > > On 2025-01-15, Jon Hunter <jonathanh@nvidia.com> wrote:
> > > > I have noticed a suspend regression on -next for some of our 32-bit
> > > > Tegra (ARM) devices (Tegra20, Tegra30 and Tegra124). Bisect is pointing
> > > > to this commit and reverting this on top of -next (along with reverting
> > > > "serial: 8250: Revert "drop lockdep annotation from
> > > > serial8250_clear_IER()") fixes the issue. So far I have not dug in any
> > > > further. Unfortunately, I don't have any logs to see if there is some
> > > > crash or something happening but I will see if there is any more info I
> > > > can get.
> > > 
> > > I have been looking into reproducing this using other 8250/ARM boards
> > > (BeagleBone Black and Phytec WEGA). Unfortunately it is just showing me
> > > all kinds of other brokenness (in mainline) and essentially making it
> > > impossible to confirm that I am seeing what you are seeing, since
> > > suspend/resume is randomly dying without my 8250-nbcon patch.
> > > 
> > > Before I start spending weeks investigating/fixing most likely totally
> > > unrelated PM or BSP issues, is it possible that I could receive one of
> > > the boards you mentioned so that I can reproduce and debug the actual
> > > problem you are reporting? If this is possible, feel free to take this
> > > conversation offline so that we can discuss delivery details. Thanks!
> > 
> > These boards are really old now and so I don't really have any that we can
> > ship. It would be great to get this change merged as I see that it is needed
> > for RT support. I could see if I can resurrect a Tegra124 Jetson TK1 and
> > test again on that to see if we can get some logs.
> > 
> > Thierry, do you have a Tegra124 Jetson TK1 handy to test this change on?
> 
> Yes, I do. I reapplied patches 5 and 6 from the series (resolved a tiny
> conflict for patch 5) and reran the tests. Same results as back in
> January, though. Basically the first suspend doesn't work (it exits back
> to userspace after a few seconds)

I remember a mail from Jacky Bai (added to Cc; the mail was off-list).
It pointed out that the ARM-specific suspend code checks whether
there are pending interrupts and, if there are, cancels the suspend,
see
https://github.com/ARM-software/arm-trusted-firmware/blob/f831058437f281e70c2409a9b79828116d4c2915/lib/psci/psci_suspend.c#L154

It might explain this first failure after a timeout.

A possible solution would be to avoid waking consoles in vprintk_emit()
when they are suspended. The functions nbcon_kthreads_wake(),
defer_console_output(), and wake_up_klogd() queue irq_work,
which cannot be processed while interrupts are disabled, ...
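
To illustrate the idea, here is a minimal userspace sketch in C. It is
not kernel code and not a proposed patch: struct console_model,
model_emit() and model_resume() are hypothetical names, and the
"wakeups" counter only stands in for the irq_work that the functions
named above would queue. It assumes that skipping the wakeup while
suspended is safe as long as one wakeup is queued on resume.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct console_model {
	bool suspended;        /* true while consoles are suspended */
	unsigned int records;  /* messages stored in the ring buffer */
	unsigned int wakeups;  /* deferred wakeups queued (irq_work stand-in) */
};

/*
 * Store a message. A wakeup is queued only when consoles are not
 * suspended, so no deferred work is left pending during suspend.
 * Returns true if a wakeup was queued.
 */
static bool model_emit(struct console_model *c)
{
	assert(c != NULL);

	c->records++;
	if (c->suspended)
		return false;

	c->wakeups++;
	return true;
}

/*
 * Leave the suspended state and queue a single wakeup so that records
 * stored while suspended still get printed.
 */
static void model_resume(struct console_model *c)
{
	assert(c != NULL);

	c->suspended = false;
	c->wakeups++;
}
```

The point of the sketch is only that the message is always stored,
while the wakeup is the part that gets skipped. Whether pending
irq_work is really what cancels the suspend on these boards is still
unconfirmed.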


> and the second attempt then hangs. No
> idea why that would be happening.

No possible explanation comes to my mind :-(

> I looked a bit at the code, but nothing jumped out that would explain
> this. Not that I'm very familiar with any of this code, or the specifics
> needed by nbcon. no_console_suspend doesn't have any noticeable effect,
> other than providing a few more messages during suspend, but nothing
> that would indicate what's going wrong.
> 
> John, I'm happy to test any other patches if you've got any ideas on
> what could be wrong.

Best Regards,
Petr


end of thread, other threads:[~2025-10-09 12:54 UTC | newest]

Thread overview: 16+ messages
     [not found] <20250107212702.169493-1-john.ogness@linutronix.de>
     [not found] ` <20250107212702.169493-6-john.ogness@linutronix.de>
2025-01-15 16:21   ` [PATCH tty-next v5 5/6] serial: 8250: Switch to nbcon console Jon Hunter
2025-01-15 16:54     ` John Ogness
2025-01-16 10:27       ` Jon Hunter
2025-01-16 10:38         ` John Ogness
2025-01-16 10:41           ` Jon Hunter
2025-01-20 16:23             ` Thierry Reding
2025-01-20 16:34               ` Thierry Reding
2025-01-27 14:54                 ` Jon Hunter
2025-01-27 15:20                   ` Petr Mladek
2025-01-27 15:21                   ` John Ogness
2025-01-27 16:13                     ` Jon Hunter
2025-10-08 15:56     ` John Ogness
2025-10-08 19:21       ` Jon Hunter
2025-10-09 10:04         ` Thierry Reding
2025-10-09 11:49           ` John Ogness
2025-10-09 12:54           ` Petr Mladek
