kernelnewbies.kernelnewbies.org archive mirror
* Assistance Needed for Kernel mode driver Soft Lockup Issue
@ 2024-10-19 19:09 Muni Sekhar
  2024-10-20  6:46 ` Philipp Hortmann
  2024-10-20 12:48 ` Denis Kirjanov
  0 siblings, 2 replies; 5+ messages in thread
From: Muni Sekhar @ 2024-10-19 19:09 UTC (permalink / raw)
  To: kernelnewbies,
	kernel-hardening-sc.1597159196.oakfigcenbmaokmiekdo-munisekharrms=gmail.com,
	LKML

Dear Linux Kernel Developers,

I am encountering a soft lockup issue in my system related to the
continuous while loop in the empty_rx_fifo() function. Below is the
relevant code:


#include <linux/io.h> // For readw()

#define FIFO_STATUS 0x0014
#define FIFO_MAN_READ 0x0015
#define RX_FIFO_EMPTY 0x01 // Assuming RX_FIFO_EMPTY is defined as 0x01

static inline uint16_t read16_shifted(void __iomem *addr, u32 offset)
{
    // Left shift the offset by 1 and add it to the base address
    void __iomem *target_addr = addr + (offset << 1);
    // Read the 16-bit value from the calculated address
    uint16_t value = readw(target_addr);
    return value;
}

void empty_rx_fifo(void __iomem *addr)
{
    // Keep reading from the FIFO until it's empty
    while (!(read16_shifted(addr, FIFO_STATUS) & RX_FIFO_EMPTY)) {
        read16_shifted(addr, FIFO_MAN_READ);
    }
}

Explanation:
read16_shifted() reads a 16-bit value from a register: it shifts the
offset left by 1 (offset << 1), adds the result to the base address,
and reads the value at that address.

The empty_rx_fifo() function is designed to clear out the RX FIFO, but
it triggers soft lockups. The kernel log shows repeated soft lockup
messages, roughly 28 seconds apart (as per the kernel log timestamps).
Here's an example log:

watchdog: BUG: soft lockup - CPU#0 stuck for 23s!

In all cases, the RIP points to:
RIP: 0010:read16_shifted+0x11/0x20
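
To make the addressing in the explanation above concrete, here is a
minimal userspace sketch of the same arithmetic. It is not the driver
code: reg_byte_offset() and model_read16_shifted() are hypothetical
names, and a plain array stands in for the __iomem window that readw()
would access.

```c
#include <stddef.h>
#include <stdint.h>

#define FIFO_STATUS   0x0014
#define FIFO_MAN_READ 0x0015

/* Byte offset of a register, given its offset as used by the driver.
 * Mirrors the (offset << 1) step in read16_shifted(). */
static inline size_t reg_byte_offset(uint32_t offset)
{
    return (size_t)offset << 1;
}

/* Userspace stand-in for readw() (assumption: a plain uint16_t array
 * models the device's register window). */
static inline uint16_t model_read16_shifted(const uint16_t *regs,
                                            uint32_t offset)
{
    return regs[reg_byte_offset(offset) / sizeof(uint16_t)];
}
```

With the defines above, FIFO_STATUS (0x14) lands at byte offset 0x28
and FIFO_MAN_READ (0x15) at byte offset 0x2A.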


Analysis:
The soft lockup seems to be caused by the continuous while loop in the
empty_rx_fifo() function. The RX FIFO takes a considerable amount of
time to empty, sometimes up to 1000 seconds. As a result, from the
first occurrence of the soft lockup trace, the log repeats
approximately every 28 seconds for the entire 1000 seconds duration.
After 1000 seconds, the system resumes normal operation.

Questions:
1. How should I best handle this kind of issue? Even if the hardware
takes time, I would like advice on the best approach to prevent these
lockups.
2. Do soft lockup issues auto-recover like this? Is this something I
should consider serious, or can it be ignored?

I would appreciate any guidance on how to resolve or mitigate this problem.


-- 
Thanks,
Sekhar

_______________________________________________
Kernelnewbies mailing list
Kernelnewbies@kernelnewbies.org
https://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies


* Re: Assistance Needed for Kernel mode driver Soft Lockup Issue
  2024-10-19 19:09 Assistance Needed for Kernel mode driver Soft Lockup Issue Muni Sekhar
@ 2024-10-20  6:46 ` Philipp Hortmann
  2024-10-20 12:48 ` Denis Kirjanov
  1 sibling, 0 replies; 5+ messages in thread
From: Philipp Hortmann @ 2024-10-20  6:46 UTC (permalink / raw)
  To: Muni Sekhar, kernelnewbies,
	kernel-hardening-sc.1597159196.oakfigcenbmaokmiekdo-munisekharrms=gmail.com,
	LKML

On 19.10.24 21:09, Muni Sekhar wrote:
> Dear Linux Kernel Developers,
> 
> I am encountering a soft lockup issue in my system related to the
> continuous while loop in the empty_rx_fifo() function. Below is the
> relevant code:


Hi Muni,

I am missing some basic information.
What system are you using?
What kernel are you using? Which version? Which git repo?
What are the driver and the device you are talking about?

Thanks for your support.

Bye Philipp


* Re: Assistance Needed for Kernel mode driver Soft Lockup Issue
  2024-10-19 19:09 Assistance Needed for Kernel mode driver Soft Lockup Issue Muni Sekhar
  2024-10-20  6:46 ` Philipp Hortmann
@ 2024-10-20 12:48 ` Denis Kirjanov
  2024-10-20 15:01   ` Muni Sekhar
  1 sibling, 1 reply; 5+ messages in thread
From: Denis Kirjanov @ 2024-10-20 12:48 UTC (permalink / raw)
  To: Muni Sekhar
  Cc: kernel-hardening-sc.1597159196.oakfigcenbmaokmiekdo-munisekharrms=gmail.com@lists.openwall.com,
	LKML, kernelnewbies



On Saturday, 19 October 2024, Muni Sekhar <munisekharrms@gmail.com> wrote:

> Questions:
> 1. How should I best handle this kind of issue? Even if the hardware
> takes time, I would like advice on the best approach to prevent these
> lockups.


I guess you can switch to an interrupt-driven model, or run a thread
that checks the status there (that is, it checks for RX empty and
releases the CPU in between).
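
The thread-based suggestion above could look roughly like the sketch
below. This is a userspace model, not driver code: struct fifo_model,
drain_with_yield(), count_yields() and the yield_fn hook are all
hypothetical, and the hook stands in for whatever gives up the CPU in
a real kernel thread (for example cond_resched() or a sleeping delay).

```c
#include <stdint.h>

#define RX_FIFO_EMPTY 0x01

/* Hypothetical model of the device: 'pending' words are still queued. */
struct fifo_model {
    unsigned int pending;
};

static uint16_t model_read_status(const struct fifo_model *f)
{
    return f->pending == 0 ? RX_FIFO_EMPTY : 0;
}

static uint16_t model_read_data(struct fifo_model *f)
{
    if (f->pending > 0)
        f->pending--;
    return 0; /* the payload does not matter for this sketch */
}

/* Hook that gives the CPU away between batches of reads. */
typedef void (*yield_fn)(void *ctx);

/* Example hook: only counts how often the loop yielded. */
static void count_yields(void *ctx)
{
    (*(unsigned int *)ctx)++;
}

/* Drain until the FIFO reports empty, yielding after every 'batch'
 * words. Returns the number of words read. */
static unsigned int drain_with_yield(struct fifo_model *f, unsigned int batch,
                                     yield_fn yield, void *ctx)
{
    unsigned int drained = 0;

    while (!(model_read_status(f) & RX_FIFO_EMPTY)) {
        model_read_data(f);
        drained++;
        if (batch != 0 && drained % batch == 0)
            yield(ctx);
    }
    return drained;
}
```

The batch size is a tuning knob: a small batch yields often and keeps
latency for other tasks low, a large batch drains faster.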

> 2. Do soft lockup issues auto-recover like this? Is this something I
> should consider serious, or can it be ignored?


The kernel is telling you that your CPU is stuck in that loop instead
of doing something useful.




-- 
Regards / Mit besten Grüßen,
Denis


* Re: Assistance Needed for Kernel mode driver Soft Lockup Issue
  2024-10-20 12:48 ` Denis Kirjanov
@ 2024-10-20 15:01   ` Muni Sekhar
  2024-10-20 22:11     ` Tom Mitchell
  0 siblings, 1 reply; 5+ messages in thread
From: Muni Sekhar @ 2024-10-20 15:01 UTC (permalink / raw)
  To: Denis Kirjanov
  Cc: kernel-hardening-sc.1597159196.oakfigcenbmaokmiekdo-munisekharrms=gmail.com@lists.openwall.com,
	LKML, kernelnewbies

On Sun, Oct 20, 2024 at 6:18 PM Denis Kirjanov <kirjanov@gmail.com> wrote:
>
>
>
> On Saturday, 19 October 2024, Muni Sekhar <munisekharrms@gmail.com> wrote:
>>
>> Questions:
>> 1. How should I best handle this kind of issue? Even if the hardware
>> takes time, I would like advice on the best approach to prevent these
>> lockups.
>
>
> I guess that you can switch on interrupt model or run a thread to check
> the status there (here I mean check RX empty and release cpu)

Thanks for your response.

Switching to an interrupt model should resolve it, but unfortunately,
the hardware I am using does not support interrupts for this
functionality.
Would adding udelay() in the while loop after every few iterations
help avoid CPU hogging, allowing other processes to take control of
the CPU?
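
One point worth keeping in mind for the question above: udelay() is a
busy-wait, so it does not hand the CPU to anyone else. A sleeping delay
(usleep_range(), msleep()) or cond_resched() does, but only in process
context. Whatever the delay, bounding the loop avoids waiting forever
on hardware that never reports empty. Below is a minimal userspace
sketch of that bounded pattern. All names in it (struct rx_model,
drain_bounded(), rx_sleep()) are hypothetical. In a driver, the helpers
in <linux/iopoll.h> such as readw_poll_timeout() provide a similar
poll, sleep and timeout loop for a single register condition.

```c
#include <errno.h>
#include <stdint.h>

#define RX_FIFO_EMPTY 0x01

/* Hypothetical model: the FIFO reports empty once 'pending' reaches 0;
 * 'stuck' models hardware that never reports empty. */
struct rx_model {
    unsigned int pending;
    int stuck;
    unsigned int sleeps; /* how often the loop gave up the CPU */
};

static uint16_t rx_status(const struct rx_model *m)
{
    return (!m->stuck && m->pending == 0) ? RX_FIFO_EMPTY : 0;
}

static void rx_read_word(struct rx_model *m)
{
    if (m->pending > 0)
        m->pending--;
}

/* Stand-in for a sleeping delay; here it only counts the calls. */
static void rx_sleep(struct rx_model *m)
{
    m->sleeps++;
}

/* Drain until empty, sleeping after every 'words_per_round' reads.
 * Gives up with -ETIMEDOUT after 'max_rounds' rounds. */
static int drain_bounded(struct rx_model *m, unsigned int words_per_round,
                         unsigned int max_rounds)
{
    unsigned int round, i;

    for (round = 0; round < max_rounds; round++) {
        for (i = 0; i < words_per_round; i++) {
            if (rx_status(m) & RX_FIFO_EMPTY)
                return 0;
            rx_read_word(m);
        }
        rx_sleep(m);
    }
    return (rx_status(m) & RX_FIFO_EMPTY) ? 0 : -ETIMEDOUT;
}
```

The caller can then decide what a timeout means for the device (retry
later, reset the FIFO, report an error) instead of spinning.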



-- 
Thanks,
Sekhar


* Re: Assistance Needed for Kernel mode driver Soft Lockup Issue
  2024-10-20 15:01   ` Muni Sekhar
@ 2024-10-20 22:11     ` Tom Mitchell
  0 siblings, 0 replies; 5+ messages in thread
From: Tom Mitchell @ 2024-10-20 22:11 UTC (permalink / raw)
  To: Muni Sekhar
  Cc: kernel-hardening-sc.1597159196.oakfigcenbmaokmiekdo-munisekharrms=gmail.com@lists.openwall.com,
	Denis Kirjanov, LKML, kernelnewbies



--
Tom M


On Sun, Oct 20, 2024 at 08:02 Muni Sekhar <munisekharrms@gmail.com> wrote:

> On Sun, Oct 20, 2024 at 6:18 PM Denis Kirjanov <kirjanov@gmail.com> wrote:
>
> >>
> >> Analysis:
> >> The soft lockup seems to be caused by the continuous while loop in the
> >> empty_rx_fifo() function. The RX FIFO takes a considerable amount of
> >> time to empty, sometimes up to 1000 seconds. As a result, from the
> >> first occurrence of




> >
> > I guess that you can switch on interrupt model or run a thread to check
> > the status there (here I mean check RX empty and release cpu)
>
> Thanks for your response.
>
> Switching to an interrupt model should resolve it, but unfortunately,
> the hardware I am using does not support interrupts for this
> functionality.
> Would adding udelay() in the while loop after every few iterations
> help avoid CPU hogging, allowing other processes to take control of
> the CPU?
>
> >
> >> 2. Do soft lockup issues auto-recover like this? Is this something I
> >> should consider serious, or can it be ignored?
> >
> >
> > The kernel tells you that your cpu resource is stuck instead of doing
> > something useful
> >
> >>
> >> I would appreciate any guidance on how to resolve or mitigate this
> >> problem.
>

Do as little as you can to drain the FIFO into a buffer (one of three).
How deep is the FIFO?
Set a data flow block when the second buffer is full.
Do the math on the data rate the device delivers.
Then add an interruptible thread to process the buffers (the shifting
and whatever else is needed).
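
A minimal sketch of how the three-buffer scheme above might be laid
out, as a single-threaded userspace model. Everything here is an
assumption for illustration: the names (struct rx_pool, pool_push(),
pool_oldest(), pool_release()), the tiny buffer size, and the lack of
locking, which a real driver would need between the drain path and the
worker thread.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NBUF      3
#define BUF_WORDS 4 /* tiny on purpose; size it from the FIFO depth */

struct rx_pool {
    uint16_t data[NBUF][BUF_WORDS];
    size_t fill;        /* words in the buffer being filled */
    unsigned int head;  /* index of the buffer being filled */
    unsigned int tail;  /* index of the oldest full buffer */
    unsigned int full;  /* full buffers waiting for the worker */
    bool flow_blocked;  /* ask the device to stop sending */
};

/* Drain path: store one word. Returns false if all buffers are full. */
static bool pool_push(struct rx_pool *p, uint16_t word)
{
    if (p->full == NBUF)
        return false;
    p->data[p->head][p->fill++] = word;
    if (p->fill == BUF_WORDS) {
        p->fill = 0;
        p->head = (p->head + 1) % NBUF;
        p->full++;
        if (p->full >= 2) /* second buffer full: block the flow */
            p->flow_blocked = true;
    }
    return true;
}

/* Worker: look at the oldest full buffer, or NULL if there is none. */
static const uint16_t *pool_oldest(const struct rx_pool *p)
{
    return p->full ? p->data[p->tail] : NULL;
}

/* Worker: hand the oldest buffer back once it has been processed. */
static void pool_release(struct rx_pool *p)
{
    if (p->full == 0)
        return;
    p->tail = (p->tail + 1) % NBUF;
    p->full--;
    if (p->full < 2)
        p->flow_blocked = false;
}
```

The third buffer is the slack that absorbs words still arriving after
the flow block has been raised; how much slack is enough follows from
the device's data rate.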

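A delay in the loop can be a good thing, but note that udelay() itself
busy-waits; to let other work proceed, use a sleeping delay such as
usleep_range() or msleep(), or call cond_resched(), from process context.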

A user space driver pinned to a dedicated core can also work.

Interrupts can be costly.




