From: Philippe Gerum <rpm@xenomai.org>
To: "François Legal" <devel@thom.fr.eu.org>
Cc: "Florian Bezdeka" <florian.bezdeka@siemens.com>,
	"Jan Kiszka" <jan.kiszka@siemens.com>,
	xenomai@xenomai.org, 孙涛 <suntaoworks@163.com>
Subject: Re: Switching from xenomai 3.2 IPIPE to xenomai 3.3 Dovetail
Date: Mon, 18 Nov 2024 16:33:58 +0100
Message-ID: <87v7wkimjd.fsf@xenomai.org>
In-Reply-To: <1017-673b5c00-37-2c5d8800@259496998> ("François Legal"'s message of "Mon, 18 Nov 2024 16:22:57 +0100")

François Legal <devel@thom.fr.eu.org> writes:

> On Monday, November 18, 2024 15:49 CET, Philippe Gerum <rpm@xenomai.org> wrote:
>  
>> Florian Bezdeka <florian.bezdeka@siemens.com> writes:
>> 
>> > On Mon, 2024-11-18 at 14:37 +0100, Philippe Gerum wrote:
>> >> François Legal <devel@thom.fr.eu.org> writes:
>> >> 
>> >> > On Monday, November 18, 2024 13:06 CET, Florian Bezdeka <florian.bezdeka@siemens.com> wrote:
>> >> >  
>> >> > > 
>> >> > > [ Updated CC list to merge both mail threads ]
>> >> > > 
>> >> > > On Mon, 2024-11-18 at 12:37 +0100, Philippe Gerum wrote:
>> >> > > > François Legal <devel@thom.fr.eu.org> writes:
>> >> > > > 
>> >> > > > > On Monday, November 18, 2024 11:32 CET, Florian Bezdeka <florian.bezdeka@siemens.com> wrote:
>> >> > > > >   
>> >> > > > > > On Mon, 2024-11-18 at 11:21 +0100, François Legal wrote:
>> >> > > > > > > On Monday, November 18, 2024 10:01 CET, Jan Kiszka <jan.kiszka@siemens.com> wrote:
>> >> > > > > > >  
>> >> > > > > > > > On 18.11.24 09:50, François Legal wrote:
>> >> > > > > > > > > Hello,
>> >> > > > > > > > > 
>> >> > > > > > > > > Running on an ARMv7 Cortex-A9 platform, I'm trying to switch from Xenomai 3.2 (Linux 5.4, I-pipe) to Xenomai 3.3 (Linux 5.15, Dovetail).
>> >> > > > > > > > > 
>> >> > > > > > > > > I can successfully boot my system with Linux 5.15 + CONFIG_DOVETAIL, but as soon as I enable Xenomai (CONFIG_XENOMAI), I get stuck at boot in a do_idle loop. Did I miss anything?
>> >> > > > > > > > > 
>> >> > > > > > > > > Attached is my config.
>> >> > > > > > > > 
>> >> > > > > > > > Can you factor out from it what the SoC is, if you patched the kernel
>> >> > > > > > > > for it (in the past and now), if you had to adopt Dovetail etc.?
>> >> > > > > > > > 
>> >> > > > > > > > Jan
>> >> > > > > > > 
>> >> > > > > > > The SoC is a Xilinx Zynq 7000. I have custom patches for custom FPGA peripherals (most of which have been successfully ported to 5.15).
>> >> > > > > > 
>> >> > > > > > We had a similar report for the zynq 7020 a couple of days ago.
>> >> > > > > > See [1].
>> >> > > > > > 
>> >> > > > > > What happens if you disable CONFIG_SMP? (Not that this should be the
>> >> > > > > > final solution, but it might help to track the issue down...)
>> >> > > > > > 
>> >> > > > > > Florian
>> >> > > > > > 
>> >> > > > > 
>> >> > > > > Disabling CONFIG_SMP makes my system boot again.
>> >> > > > > How can I help work this out?
>> >> > > > > 
>> >> > > > > François
>> >> > > > > 
>> >> > > > 
>> >> > > > This looks like an issue with the proxy tick device. First thing is to
>> >> > > > disable CONFIG_XENOMAI, enabling CONFIG_IRQ_PIPELINE_TORTURE_TEST. If
>> >> > > > the kernel still hangs, we may have a hint about the reason
>> >> > > > why. Alternatively, you could keep CONFIG_XENOMAI in, booting the kernel
>> >> > > > with "xenomai.state=stopped". If no hang occurs at boot anymore, there
>> >> > > > may be an issue with proxying the timer device on this SoC.
>> >> > > > 
>> >> > 
>> >> > CONFIG_IRQ_PIPELINE_TORTURE_TEST seems to report OK (attached boot log CONFIG_IRQ_PIPELINE_TORTURE_TEST.boot).
>> >> > 
>> >> > Starting with "xenomai.state=stopped" works (attached boot log xenomai.state-stopped.boot).
>> >> > Starting Xenomai with corectl afterwards seems to work and does not hang the system:
>> >> > root@Arkens_SV:~# /usr/xenomai/sbin/corectl -start
>> >> > [  117.328878] CPU1: proxy tick device registered (333.33MHz)
>> >> > [  117.328883] CPU0: proxy tick device registered (333.33MHz)
>> >> > [  117.339995] [Xenomai] services started
>> >> > 
>> >> > 
>> >> 
>> >> Ok, so another usual suspect for this issue: some weirdness in the CPU
>> >> idling code, which causes a (timer) interrupt to linger indefinitely in
>> >> some per-CPU interrupt log. A typical scenario is as follows:
>> >> 
>> >> CPUx
>> >> ----
>> >> 
>> >>      default_idle_call() {
>> >> 
>> >>       ...
>> >> <IRQ> -> logged because local_irq_disable() in effect, _however_ we do NOT
>> >>          expect hard irqs to be enabled [1] before entering the idle call
>> >>          next => issue is there.
>> >>       arch_cpu_idle();
>> >>       ...
>> >>       /* per-CPU log is not flushed, because of [1] */
>> >>      }
>> >> 
>> >> At the end of the day, the (most likely) timer IRQ is marked as pending
>> >> in the log, but never played, so the kernel activity stalls on that CPU,
>> >> especially if the boot CPU is involved.
>> >> 
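
The stall described in the quoted scenario can be illustrated with a small user-space model. This is only a sketch under stated assumptions: struct cpu_model and the model_*() helpers are hypothetical names made up for the illustration, not Dovetail or kernel APIs, and the model reduces the per-CPU interrupt log to a simple counter. It shows that an IRQ logged while the in-band stage is stalled is only handled if something later synchronizes the log.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names exist in Dovetail. */
struct cpu_model {
	bool stalled;		/* in-band stage stalled (virtual IRQs off) */
	unsigned int pending;	/* IRQs logged but not played yet */
	unsigned int handled;	/* IRQs delivered to the in-band handler */
};

/* An IRQ arrives: deliver it at once if unstalled, otherwise log it. */
static void model_irq(struct cpu_model *c)
{
	assert(c);
	if (c->stalled)
		c->pending++;
	else
		c->handled++;
}

/* Unstall the stage and play everything that was logged meanwhile. */
static void model_unstall_and_sync(struct cpu_model *c)
{
	assert(c);
	c->stalled = false;
	c->handled += c->pending;
	c->pending = 0;
}

/*
 * One pass through a simplified idle path. 'flush_on_exit' says whether
 * the exit path synchronizes the log. Returns the number of IRQs that
 * are still pending afterwards.
 */
static unsigned int model_idle_pass(struct cpu_model *c, bool flush_on_exit)
{
	c->stalled = true;	/* local_irq_disable() in effect */
	model_irq(c);		/* timer IRQ sneaks in before the idle call */
	/* arch_cpu_idle() would wait here for an event already logged. */
	if (flush_on_exit)
		model_unstall_and_sync(c);
	return c->pending;
}
```

With flush_on_exit set, model_idle_pass() leaves nothing pending and the IRQ is counted as handled. With it cleared, the IRQ stays in the log and is never handled, which is the situation the scenario attributes to the unflushed per-CPU log.
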
>> >
>> > Taken from the boot log:
>> >
>> > [    0.000024] clocksource: arm_global_timer: freq: 166666665 Hz, mask: 0xffffffffffffffff max_cycles: 0x26703d7dd8, max_idle_ns: 440795208065 ns
>> > [    0.012959] clocksource: jiffies: freq: 0 Hz, mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 19112604462750000 ns
>> > [    0.587268] clocksource: Switched to clocksource arm_global_timer
>> > [    1.065121] clocksource: ttc_clocksource: freq: 54253 Hz, mask: 0xffff max_cycles: 0xffff, max_idle_ns: 537538477 ns
>> >
>> > So the last active clocksource is ttc_clocksource
>> > (drivers/clocksource/timer-cadence-ttc.c).
>> > If I'm correct the OOB enablement is missing for this driver.
>> >
>> > Philippe already pointed us to
>> > https://v4.xenomai.org/dovetail/porting/timer/.
>> >
>> > Could someone try that? I have no matching HW at hand...
>> >
>> 
>> The following code is a 100%, happily untested attempt to enable the ttc
>> device for pipelining (timer events and reading its clock source
>> directly from user-space via mmio as well). I don't have any zynq hw at
>> hand to test it either.
>> 
>> diff --git a/drivers/clocksource/timer-cadence-ttc.c b/drivers/clocksource/timer-cadence-ttc.c
>> index 0d52e28fea4de..fcab778448bba 100644
>> --- a/drivers/clocksource/timer-cadence-ttc.c
>> +++ b/drivers/clocksource/timer-cadence-ttc.c
>> @@ -84,11 +84,11 @@ struct ttc_timer_clocksource {
>>  	u32			scale_clk_ctrl_reg_old;
>>  	u32			scale_clk_ctrl_reg_new;
>>  	struct ttc_timer	ttc;
>> -	struct clocksource	cs;
>> +	struct clocksource_user_mmio	cs;
>>  };
>>  
>>  #define to_ttc_timer_clksrc(x) \
>> -		container_of(x, struct ttc_timer_clocksource, cs)
>> +		container_of(x, struct ttc_timer_clocksource, cs.mmio.clksrc)
>>  
>>  struct ttc_timer_clockevent {
>>  	struct ttc_timer		ttc;
>> @@ -143,24 +143,11 @@ static irqreturn_t ttc_clock_event_interrupt(int irq, void *dev_id)
>>  	/* Acknowledge the interrupt and call event handler */
>>  	readl_relaxed(timer->base_addr + TTC_ISR_OFFSET);
>>  
>> -	ttce->ce.event_handler(&ttce->ce);
>> +	clockevents_handle_event(&ttce->ce);
>>  
>>  	return IRQ_HANDLED;
>>  }
>>  
>> -/**
>> - * __ttc_clocksource_read - Reads the timer counter register
>> - *
>> - * returns: Current timer counter register value
>> - **/
>> -static u64 __ttc_clocksource_read(struct clocksource *cs)
>> -{
>> -	struct ttc_timer *timer = &to_ttc_timer_clksrc(cs)->ttc;
>> -
>> -	return (u64)readl_relaxed(timer->base_addr +
>> -				TTC_COUNT_VAL_OFFSET);
>> -}
>> -
>>  static u64 notrace ttc_sched_clock_read(void)
>>  {
>>  	return readl_relaxed(ttc_sched_clock_val_reg);
>> @@ -320,6 +307,7 @@ static int ttc_rate_change_clocksource_cb(struct notifier_block *nb,
>>  static int __init ttc_setup_clocksource(struct clk *clk, void __iomem *base,
>>  					 u32 timer_width)
>>  {
>> +	struct clocksource_mmio_regs mmr = { 0 };
>>  	struct ttc_timer_clocksource *ttccs;
>>  	int err;
>>  
>> @@ -347,11 +335,11 @@ static int __init ttc_setup_clocksource(struct clk *clk, void __iomem *base,
>>  		pr_warn("Unable to register clock notifier.\n");
>>  
>>  	ttccs->ttc.base_addr = base;
>> -	ttccs->cs.name = "ttc_clocksource";
>> -	ttccs->cs.rating = 200;
>> -	ttccs->cs.read = __ttc_clocksource_read;
>> -	ttccs->cs.mask = CLOCKSOURCE_MASK(timer_width);
>> -	ttccs->cs.flags = CLOCK_SOURCE_IS_CONTINUOUS;
>> +	ttccs->cs.mmio.clksrc.name = "ttc_clocksource";
>> +	ttccs->cs.mmio.clksrc.rating = 200;
>> +	ttccs->cs.mmio.clksrc.read = clocksource_mmio_readl_up;
>> +	ttccs->cs.mmio.clksrc.mask = CLOCKSOURCE_MASK(timer_width);
>> +	ttccs->cs.mmio.clksrc.flags = CLOCK_SOURCE_IS_CONTINUOUS;
>>  
>>  	/*
>>  	 * Setup the clock source counter to be an incrementing counter
>> @@ -364,7 +352,10 @@ static int __init ttc_setup_clocksource(struct clk *clk, void __iomem *base,
>>  	writel_relaxed(CNT_CNTRL_RESET,
>>  		     ttccs->ttc.base_addr + TTC_CNT_CNTRL_OFFSET);
>>  
>> -	err = clocksource_register_hz(&ttccs->cs, ttccs->ttc.freq / PRESCALE);
>> +	mmr.reg_lower = ttccs->ttc.base_addr + TTC_COUNT_VAL_OFFSET;
>> +	mmr.bits_lower = 32;
>> +
>> +	err = clocksource_user_mmio_init(&ttccs->cs, &mmr, ttccs->ttc.freq / PRESCALE);
>>  	if (err) {
>>  		kfree(ttccs);
>>  		return err;
>> @@ -431,7 +422,8 @@ static int __init ttc_setup_clockevent(struct clk *clk,
>>  
>>  	ttcce->ttc.base_addr = base;
>>  	ttcce->ce.name = "ttc_clockevent";
>> -	ttcce->ce.features = CLOCK_EVT_FEAT_PERIODIC | CLOCK_EVT_FEAT_ONESHOT;
>> +	ttcce->ce.features = CLOCK_EVT_FEAT_PERIODIC |
>> +		CLOCK_EVT_FEAT_ONESHOT | CLOCK_EVT_FEAT_PIPELINE;
>>  	ttcce->ce.set_next_event = ttc_set_next_event;
>>  	ttcce->ce.set_state_shutdown = ttc_shutdown;
>>  	ttcce->ce.set_state_periodic = ttc_set_periodic;
>> 
>> -- 
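
The quoted patch changes to_ttc_timer_clksrc() so that container_of() walks back from a struct clocksource nested two levels deep (cs.mmio.clksrc). The sketch below checks in user space that such a nested member path resolves to the enclosing object. The structure definitions are simplified stand-ins whose fields are assumptions; only the nesting path is taken from the patch. The container_of() macro here is a plain offsetof()-based version without the kernel's type checking, and ttc_from_clocksource() is a hypothetical helper added for the illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel types; field lists are assumed. */
struct clocksource {
	const char *name;
	int rating;
};

struct clocksource_mmio {
	void *reg;
	struct clocksource clksrc;
};

struct clocksource_user_mmio {
	struct clocksource_mmio mmio;
	int id;
};

struct ttc_timer_clocksource {
	unsigned int scale_clk_ctrl_reg_old;
	unsigned int scale_clk_ctrl_reg_new;
	struct clocksource_user_mmio cs;
};

/* Plain offsetof()-based variant, without the kernel's type checks. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Same nested member path as in the patch. */
#define to_ttc_timer_clksrc(x) \
	container_of(x, struct ttc_timer_clocksource, cs.mmio.clksrc)

/* Hypothetical helper: map the embedded clocksource to its container. */
static struct ttc_timer_clocksource *
ttc_from_clocksource(struct clocksource *cs)
{
	assert(cs);
	return to_ttc_timer_clksrc(cs);
}
```

Passing the address of the embedded cs.mmio.clksrc member of some struct ttc_timer_clocksource to ttc_from_clocksource() yields the address of that enclosing structure, which is what a read callback receiving only a struct clocksource pointer relies on.
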
>
> I just gave it a try, but with no success. I had tried a patch of my own too, with the same result.
> I wonder, however, whether this is really the problem here: when I get stuck at boot, the last declared time source is the Arm Global Timer (arm_global_timer):
>
> Here:
> [    0.614817] clocksource: Switched to clocksource arm_global_timer
>
> [    0.628544] NET: Registered PF_INET protocol family
> [    0.633699] IP idents hash table entries: 16384 (order: 5, 131072 bytes, linear)
> [    0.642170] tcp_listen_portaddr_hash hash table entries: 512 (order: 0, 4096 bytes, linear)
> [    0.650587] Table-perturb hash table entries: 65536 (order: 6, 262144 bytes, linear)
> [    0.658339] TCP established hash table entries: 8192 (order: 3, 32768 bytes, linear)
> [    0.666173] TCP bind hash table entries: 8192 (order: 4, 65536 bytes, linear)
> [    0.673540] TCP: Hash tables configured (established 8192 bind 8192)
> [    0.679986] UDP hash table entries: 512 (order: 2, 16384 bytes, linear)
> [    0.686665] UDP-Lite hash table entries: 512 (order: 2, 16384 bytes, linear)
> [    0.693937] NET: Registered PF_UNIX/PF_LOCAL protocol family
> [    0.700355] RPC: Registered named UNIX socket transport module.
> [    0.706274] RPC: Registered udp transport module.
> [    0.710989] RPC: Registered tcp transport module.
> [    0.715691] RPC: Registered tcp NFSv4.1 backchannel transport module.
> [    0.722914] [Xenomai] scheduling class idle registered.
> [    0.728138] [Xenomai] scheduling class rt registered.
> [    0.733243] IRQ pipeline: high-priority Xenomai stage added.
> [    0.740572] CPU0: proxy tick device registered (333.33MHz)
> [    0.740575] CPU1: proxy tick device registered (333.33MHz)
> [    0.752600] [Xenomai] Cobalt v3.3
> [    0.756129] workingset: timestamp_bits=30 max_order=18 bucket_order=0
> [    0.879595] NET: Registered PF_ALG protocol family
> [    0.884507] bounce: pool size: 64 pages
> [    0.888358] io scheduler mq-deadline registered
> [    0.896523] dma-pl330 f8003000.dmac: Loaded driver for PL330 DMAC-241330
> [    0.903234] dma-pl330 f8003000.dmac:         DBUFF-128x8bytes Num_Chans-8 Num_Peri-4 Num_Events-16
> [    0.919795] brd: module loaded
>
> then it hangs from here
>

What if you disable CONFIG_CPU_IDLE?

-- 
Philippe.
