* [Xenomai] "inconsistent lock state" on boot-up
@ 2014-11-09 10:07 Stoidner, Christoph
2014-11-09 15:53 ` Gilles Chanteperdrix
0 siblings, 1 reply; 47+ messages in thread
From: Stoidner, Christoph @ 2014-11-09 10:07 UTC (permalink / raw)
To: xenomai@xenomai.org
Hi all,
I am using Linux 3.10.32 and ipipe-core-3.10.32-arm-4.patch on a Freescale i.MX28.
When booting the kernel, the message "inconsistent lock state" is printed (see below). Does anyone have an idea why this happens? It is the same with kernel 3.10.18 and the corresponding ipipe patch. With Linux 3.4.6 and ipipe 3.4.6-arm-4 the message does not appear.
I would very much like to understand whether this message could lead to any problems, since I am seeing unpredictable crashes of my Xenomai-based application program (e.g. with "segmentation fault" or "scheduling while atomic" messages).
Here is the kernel output of 3.10.32 and ipipe-core-3.10.32-arm-4.patch:
[ 0.000000] Booting Linux on physical CPU 0x0
[ 0.000000] Linux version 3.10.32-ipipe (stch@Kubuntu-Default) (gcc version 4.6.2 (arvero ARM tools 2013-08-05) ) #3 PREEMPT Thu Nov 6 14:54:08 CET 2014
[ 0.000000] CPU: ARM926EJ-S [41069265] revision 5 (ARMv5TEJ), cr=00053177
[ 0.000000] CPU: VIVT data cache, VIVT instruction cache
[ 0.000000] Machine: Freescale MXS (Device Tree), model: Viessmann Vitocom 100
[ 0.000000] Memory policy: ECC disabled, Data cache writeback
[ 0.000000] On node 0 totalpages: 32768
[ 0.000000] free_area_init_node: node 0, pgdat c06fa4e8, node_mem_map c0ca5000
[ 0.000000] Normal zone: 256 pages used for memmap
[ 0.000000] Normal zone: 0 pages reserved
[ 0.000000] Normal zone: 32768 pages, LIFO batch:7
[ 0.000000] pcpu-alloc: s0 r0 d32768 u32768 alloc=1*32768
[ 0.000000] pcpu-alloc: [0] 0
[ 0.000000] Built 1 zonelists in Zone order, mobility grouping on. Total pages: 32512
[ 0.000000] Kernel command line: fec.macaddr=0x00,0xD0,0x93,0x2A,0xCC,0x40 ip=192.168.200.20:192.168.200.125:192.168.1.254:255.255.255.0::eth0:off root=/dev/nfs rw nfsroot=192.168.200.125:/srv/nfs/rtos-linux-rootfs,v3,tcp rw console=ttyAMA0,115200
[ 0.000000] PID hash table entries: 512 (order: -1, 2048 bytes)
[ 0.000000] Dentry cache hash table entries: 16384 (order: 4, 65536 bytes)
[ 0.000000] Inode-cache hash table entries: 8192 (order: 3, 32768 bytes)
[ 0.000000] Memory: 128MB = 128MB total
[ 0.000000] Memory: 116928k/116928k available, 14144k reserved, 0K highmem
[ 0.000000] Virtual kernel memory layout:
[ 0.000000] vector : 0xffff0000 - 0xffff1000 ( 4 kB)
[ 0.000000] fixmap : 0xfff00000 - 0xfffe0000 ( 896 kB)
[ 0.000000] vmalloc : 0xc8800000 - 0xff000000 ( 872 MB)
[ 0.000000] lowmem : 0xc0000000 - 0xc8000000 ( 128 MB)
[ 0.000000] modules : 0xbf000000 - 0xc0000000 ( 16 MB)
[ 0.000000] .text : 0xc0008000 - 0xc067dce4 (6616 kB)
[ 0.000000] .init : 0xc067e000 - 0xc06b4588 ( 218 kB)
[ 0.000000] .data : 0xc06b6000 - 0xc06fed30 ( 292 kB)
[ 0.000000] .bss : 0xc06fed30 - 0xc0c9ff80 (5765 kB)
[ 0.000000] SLUB: HWalign=32, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
[ 0.000000] Preemptible hierarchical RCU implementation.
[ 0.000000] NR_IRQS:16 nr_irqs:16 16
[ 0.000000] of_irq_init: children remain, but no parents
[ 0.000000] I-pipe, 24.000 MHz clocksource
[ 0.000000] sched_clock: 32 bits at 24MHz, resolution 41ns, wraps every 178956ms
[ 0.000000] I-pipe: ARM926EJ-S detected, disabling wfi instruction in idle loop
[ 0.000000] Interrupt pipeline (release #4)
[ 0.000000] Console: colour dummy device 80x30
[ 0.000000] Lock dependency validator: Copyright (c) 2006 Red Hat, Inc., Ingo Molnar
[ 0.000000] ... MAX_LOCKDEP_SUBCLASSES: 8
[ 0.000000] ... MAX_LOCK_DEPTH: 48
[ 0.000000] ... MAX_LOCKDEP_KEYS: 8191
[ 0.000000] ... CLASSHASH_SIZE: 4096
[ 0.000000] ... MAX_LOCKDEP_ENTRIES: 16384
[ 0.000000] ... MAX_LOCKDEP_CHAINS: 32768
[ 0.000000] ... CHAINHASH_SIZE: 16384
[ 0.000000] memory used by lock dependency info: 3695 kB
[ 0.000000] per task-struct memory footprint: 1152 bytes
[ 0.002748] Calibrating delay loop... 226.09 BogoMIPS (lpj=1130496)
[ 0.071047] pid_max: default: 32768 minimum: 301
[ 0.071902] Mount-cache hash table entries: 512
[ 0.079998] CPU: Testing write buffer coherency: ok
[ 0.083826] Setting up static identity map for 0xc04acc20 - 0xc04acc78
[ 0.091221]
[ 0.091279] =================================
[ 0.091308] [ INFO: inconsistent lock state ]
[ 0.091344] 3.10.32-ipipe #3 Not tainted
[ 0.091366] ---------------------------------
[ 0.091392] inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage.
[ 0.091427] kthreadd/9 [HC0[0]:SC0[0]:HE1:SE1] takes:
[ 0.091452] (std_spinlock_raw(&rq->lock)){?.....}, at: [<c0049608>] try_to_wake_up+0x80/0x12c
[ 0.091572] {IN-HARDIRQ-W} state was registered at:
[ 0.091599] [<c005eb40>] __lock_acquire+0xac8/0x1aec
[ 0.091657] [<c006017c>] lock_acquire+0xc0/0x184
[ 0.091702] [<c04abf18>] _raw_spin_lock+0x40/0x50
[ 0.091764] [<c004a1a8>] scheduler_tick+0x20/0xdc
[ 0.091815] [<c002c4ec>] update_process_times+0x58/0x68
[ 0.091867] [<c0058e8c>] tick_handle_periodic+0x18/0x8c
[ 0.091909] [<c03ae6f4>] mxs_timer_interrupt+0x34/0x40
[ 0.091960] [<c006ccc8>] handle_irq_event_percpu+0x68/0x314
[ 0.092013] [<c006cfb0>] handle_irq_event+0x3c/0x5c
[ 0.092056] [<c006fb78>] handle_level_irq+0x6c/0xcc
[ 0.092106] [<c006c5b4>] generic_handle_irq+0x20/0x30
[ 0.092148] [<c000f9a0>] handle_IRQ+0x30/0x84
[ 0.092203] [<c0077c14>] __ipipe_do_sync_stage+0x2b0/0x2fc
[ 0.092259] [<c00085bc>] __ipipe_grab_irq+0x2c/0x70
[ 0.092301] [<c000e624>] __irq_svc+0x44/0x70
[ 0.092342] [<c06a8f54>] calibrate_delay+0x360/0x4e8
[ 0.092401] [<c067e95c>] start_kernel+0x25c/0x2f0
[ 0.092457] [<40008040>] 0x40008040
[ 0.092498] irq event stamp: 6
[ 0.092523] hardirqs last enabled at (5): [<c0060d84>] debug_check_no_locks_freed+0xd8/0x170
[ 0.092576] hardirqs last disabled at (6): [<c04ac000>] _raw_spin_lock_irqsave+0x34/0x80
[ 0.092636] softirqs last enabled at (0): [<c001b454>] copy_process+0x2a4/0x107c
[ 0.092701] softirqs last disabled at (0): [< (null)>] (null)
[ 0.092733]
[ 0.092733] other info that might help us debug this:
[ 0.092761] Possible unsafe locking scenario:
[ 0.092761]
[ 0.092786] CPU0
[ 0.092804] ----
[ 0.092821] lock(std_spinlock_raw(&rq->lock));
[ 0.092858] <Interrupt>
[ 0.092876] lock(std_spinlock_raw(&rq->lock));
[ 0.092913]
[ 0.092913] *** DEADLOCK ***
[ 0.092913]
[ 0.092950] 3 locks held by kthreadd/9:
[ 0.092969] #0: (&x->wait){+.....}, at: [<c00489c0>] complete+0x1c/0x5c
[ 0.093069] #1: (std_spinlock_raw(&p->pi_lock)){+.....}, at: [<c00495a8>] try_to_wake_up+0x20/0x12c
[ 0.093165] #2: (std_spinlock_raw(&rq->lock)){?.....}, at: [<c0049608>] try_to_wake_up+0x80/0x12c
[ 0.093260]
[ 0.093260] stack backtrace:
[ 0.093306] CPU: 0 PID: 9 Comm: kthreadd Not tainted 3.10.32-ipipe #3
[ 0.093386] [<c0013ed8>] (unwind_backtrace+0x0/0xf0) from [<c0011c10>] (show_stack+0x10/0x14)
[ 0.093457] [<c0011c10>] (show_stack+0x10/0x14) from [<c04a50a0>] (print_usage_bug.part.28+0x218/0x280)
[ 0.093523] [<c04a50a0>] (print_usage_bug.part.28+0x218/0x280) from [<c005df38>] (mark_lock+0x528/0x668)
[ 0.093585] [<c005df38>] (mark_lock+0x528/0x668) from [<c0060a38>] (mark_held_locks+0x9c/0x120)
[ 0.093646] [<c0060a38>] (mark_held_locks+0x9c/0x120) from [<c0060b70>] (trace_hardirqs_on_caller+0xb4/0x1e0)
[ 0.093707] [<c0060b70>] (trace_hardirqs_on_caller+0xb4/0x1e0) from [<c000e654>] (__ipipe_fast_svc_irq_exit+0x4/0x10)
[ 0.093771] [<c000e654>] (__ipipe_fast_svc_irq_exit+0x4/0x10) from [<c0049028>] (update_rq_clock+0x48/0x58)
[ 0.093832] [<c0049028>] (update_rq_clock+0x48/0x58) from [<c00490fc>] (enqueue_task+0x18/0x70)
[ 0.093893] [<c00490fc>] (enqueue_task+0x18/0x70) from [<c0049618>] (try_to_wake_up+0x90/0x12c)
[ 0.093952] [<c0049618>] (try_to_wake_up+0x90/0x12c) from [<c0046e3c>] (__wake_up_common+0x54/0x94)
[ 0.094012] [<c0046e3c>] (__wake_up_common+0x54/0x94) from [<c00489ec>] (complete+0x48/0x5c)
[ 0.094071] [<c00489ec>] (complete+0x48/0x5c) from [<c003ef78>] (kthread+0x7c/0xb0)
[ 0.094132] [<c003ef78>] (kthread+0x7c/0xb0) from [<c000eb34>] (ret_from_fork+0x18/0x24)
[ 0.097934] devtmpfs: initialized
[ 0.102894] pinctrl core: initialized pinctrl subsystem
[ 0.104625] regulator-dummy: no parameters
[ 0.105613] NET: Registered protocol family 16
[ 0.107080] DMA: preallocated 256 KiB pool for atomic coherent allocations
[ 0.122886] gpiochip_add: registered GPIOs 0 to 31 on device: gpio.0
[ 0.125709] gpiochip_add: registered GPIOs 32 to 63 on device: gpio.1
[ 0.128234] gpiochip_add: registered GPIOs 64 to 95 on device: gpio.2
[ 0.130826] gpiochip_add: registered GPIOs 96 to 127 on device: gpio.3
[ 0.133570] gpiochip_add: registered GPIOs 128 to 159 on device: gpio.4
[ 0.150654] Serial: AMBA PL011 UART driver
[ 0.151544] 80074000.serial: ttyAMA0 at MMIO 0x80074000 (irq = 225) is a PL011 rev2
[ 0.898747] console [ttyAMA0] enabled
[ 0.930718] bio: create slab <bio-0> at 0
[ 0.943158] mxs-dma 80004000.dma-apbh: initialized
[ 0.954720] mxs-dma 80024000.dma-apbx: initialized
[ 0.960462] of_get_named_gpio_flags exited with status 124
[ 0.966563] vddio-sd0: 3300 mV
[ 0.970520] of_get_named_gpio_flags: can't parse gpios property
[ 0.977004] 3P3V: 3300 mV
[ 0.980476] of_get_named_gpio_flags exited with status 125
[ 0.986546] fec-3v3: 3300 mV
[ 0.990375] of_get_named_gpio_flags exited with status 122
[ 0.996481] usb0_vbus: 5000 mV
[ 1.001710] SCSI subsystem initialized
[ 1.006680] usbcore: registered new interface driver usbfs
[ 1.012713] usbcore: registered new interface driver hub
[ 1.018681] usbcore: registered new device driver usb
[ 1.025394] pps_core: LinuxPPS API ver. 1 registered
[ 1.030404] pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo Giometti <giometti@linux.it>
[ 1.039859] PTP clock support registered
[ 1.045411] Advanced Linux Sound Architecture Driver Initialized.
[ 1.054090] Switching to clocksource ipipe_tsc
[ 1.221815] NET: Registered protocol family 2
[ 1.228393] TCP established hash table entries: 1024 (order: 1, 8192 bytes)
[ 1.236257] TCP bind hash table entries: 1024 (order: 3, 36864 bytes)
[ 1.243306] TCP: Hash tables configured (established 1024 bind 1024)
[ 1.250050] TCP: reno registered
[ 1.253378] UDP hash table entries: 256 (order: 2, 20480 bytes)
[ 1.259726] UDP-Lite hash table entries: 256 (order: 2, 20480 bytes)
[ 1.267635] NET: Registered protocol family 1
[ 1.273650] RPC: Registered named UNIX socket transport module.
[ 1.279881] RPC: Registered udp transport module.
[ 1.284637] RPC: Registered tcp transport module.
[ 1.289495] RPC: Registered tcp NFSv4.1 backchannel transport module.
[ 1.296613] NetWinder Floating Point Emulator V0.97 (double precision)
[ 1.307586] I-pipe: head domain Xenomai registered.
[ 1.312699] Xenomai: hal/arm started.
[ 1.316959] Xenomai: scheduling class idle registered.
[ 1.322159] Xenomai: scheduling class rt registered.
[ 1.349552] Xenomai: real-time nucleus v2.6.3 (Lies and Truths) loaded.
[ 1.356216] Xenomai: debug mode enabled.
[ 1.361259] Xenomai: starting @CHIP-RTOS services.
[ 1.366196] Xenomai: starting native API services.
[ 1.371211] Xenomai: starting POSIX services.
[ 1.376029] Xenomai: starting RTDM services.
[ 1.436475] NFS: Registering the id_resolver key type
[ 1.442015] Key type id_resolver registered
[ 1.446266] Key type id_legacy registered
[ 1.450607] jffs2: version 2.2. (NAND) © 2001-2006 Red Hat, Inc.
[ 1.459635] msgmni has been set to 228
[ 1.470890] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 249)
[ 1.478351] io scheduler noop registered (default)
[ 1.486256] of_dma_request_slave_channel: dma-names property missing or empty
[ 1.493691] uart-pl011 80074000.serial: no DMA platform data
[ 1.500737] 80072000.serial: ttyAPP4 at MMIO 0x80072000 (irq = 224) is a 80072000.serial
[ 1.510672] mxs-auart 80072000.serial: Found APPUART 3.1.0
[ 1.548255] of_get_named_gpio_flags exited with status 17
[ 1.668496] libphy: fec_enet_mii_bus: probed
[ 1.675807] usbcore: registered new interface driver asix
[ 1.681869] usbcore: registered new interface driver ax88179_178a
[ 1.688302] usbcore: registered new interface driver cdc_ether
[ 1.694882] usbcore: registered new interface driver smsc95xx
[ 1.701232] usbcore: registered new interface driver net1080
[ 1.707224] usbcore: registered new interface driver cdc_subset
[ 1.713634] usbcore: registered new interface driver zaurus
[ 1.720035] usbcore: registered new interface driver cdc_ncm
[ 1.725754] ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
[ 1.732963] usbcore: registered new interface driver usb-storage
[ 1.742273] ci_hdrc ci_hdrc.0: doesn't support gadget
[ 1.747423] ci_hdrc ci_hdrc.0: EHCI Host Controller
[ 1.752764] ci_hdrc ci_hdrc.0: new USB bus registered, assigned bus number 1
[ 1.778903] ci_hdrc ci_hdrc.0: USB 2.0 started, EHCI 1.00
[ 1.787744] hub 1-0:1.0: USB hub found
[ 1.791825] hub 1-0:1.0: 1 port detected
[ 1.798368] mousedev: PS/2 mouse device common for all mice
[ 1.809289] stmp3xxx-rtc 80056000.rtc: rtc core: registered 80056000.rtc as rtc0
[ 1.817578] i2c /dev entries driver
[ 1.823272] stmp3xxx_rtc_wdt stmp3xxx_rtc_wdt: initialized watchdog with heartbeat 19s
[ 1.833634] of_get_named_gpio_flags exited with status 76
[ 1.878880] mxs-mmc 80012000.ssp: initialized
[ 1.887802] usbcore: registered new interface driver usbhid
[ 1.893631] usbhid: USB HID core driver
[ 1.902393] mxs-lradc 80050000.lradc: Touchscreen not enabled.
[ 1.925086] TCP: cubic registered
[ 1.928491] NET: Registered protocol family 17
[ 1.934010] Key type dns_resolver registered
[ 1.941379] registered taskstats version 1
[ 1.951463] stmp3xxx-rtc 80056000.rtc: setting system clock to 1970-01-01 00:21:15 UTC (1275)
[ 1.973991] mmc0: BKOPS_EN bit is not set
[ 1.991482] mmc0: new high speed MMC card at address 0001
[ 1.998963] fec 800f0000.ethernet eth0: Freescale FEC PHY driver [SMSC LAN8710/LAN8720] (mii_bus:phy_addr=800f0000.etherne:00, irq=-1)
[ 2.011564] mmcblk0: mmc0:0001 SEM02G 1.82 GiB
[ 2.016804] mmcblk0boot0: mmc0:0001 SEM02G partition 1 1.00 MiB
[ 2.024103] mmcblk0boot1: mmc0:0001 SEM02G partition 2 1.00 MiB
[ 2.035524] mmcblk0: p1 p2 p3 p4 < p5 p6 p7 >
[ 2.053456] mmcblk0boot1: unknown partition table
[ 2.062682] mmcblk0boot0: unknown partition table
[ 3.989508] libphy: 800f0000.etherne:00 - Link is Up - 100/Full
[ 4.019719] IP-Config: Gateway not on directly connected network
[ 4.025792] ALSA device list:
[ 4.028962] No soundcards found.
[ 4.066526] VFS: Mounted root (nfs filesystem) on device 0:11.
[ 4.074463] devtmpfs: mounted
[ 4.078876] Freeing unused kernel memory: 216K (c067e000 - c06b4000)
Thanks in advance,
Christoph
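
The report above boils down to one rule: lockdep remembers, per lock class, the contexts in which a lock was taken, and complains once the same class has been seen both inside a hard interrupt handler ({IN-HARDIRQ-W}) and held while hardirqs are marked as enabled ({HARDIRQ-ON-W}). In the log, the first state is registered from scheduler_tick() in the timer interrupt; the second is flagged from trace_hardirqs_on_caller() while kthreadd holds rq->lock. The following is a minimal user-space sketch of that rule in C, not kernel code. The names struct lock_class, mark_usage() and the two flags are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy usage bits, loosely mirroring lockdep's IN-HARDIRQ-W and
 * HARDIRQ-ON-W states. Hypothetical names, not the kernel's. */
enum usage_bit {
    USED_IN_HARDIRQ       = 1u << 0,
    HELD_WITH_HARDIRQS_ON = 1u << 1
};

struct lock_class {
    unsigned int usage_mask;
};

/* Record one observation of the lock being held.
 * Returns false as soon as both usages have been seen for the same
 * class, which is the "inconsistent lock state" condition. */
bool mark_usage(struct lock_class *lc, bool in_hardirq, bool hardirqs_enabled)
{
    const unsigned int both = USED_IN_HARDIRQ | HELD_WITH_HARDIRQS_ON;

    assert(lc != NULL);
    if (in_hardirq)
        lc->usage_mask |= USED_IN_HARDIRQ;
    else if (hardirqs_enabled)
        lc->usage_mask |= HELD_WITH_HARDIRQS_ON;

    return (lc->usage_mask & both) != both;
}
```

In this model a lock that is taken in interrupt context and otherwise only held with interrupts off never trips the check. A single observation with interrupts marked as on is enough to trip it, whether or not interrupts were really on at that point.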
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Xenomai] "inconsistent lock state" on boot-up
2014-11-09 10:07 [Xenomai] "inconsistent lock state" on boot-up Stoidner, Christoph
@ 2014-11-09 15:53 ` Gilles Chanteperdrix
2014-11-10 9:08 ` Stoidner, Christoph
0 siblings, 1 reply; 47+ messages in thread
From: Gilles Chanteperdrix @ 2014-11-09 15:53 UTC (permalink / raw)
To: Stoidner, Christoph; +Cc: xenomai@xenomai.org
On Sun, Nov 09, 2014 at 10:07:54AM +0000, Stoidner, Christoph wrote:
> Hi all,
>
> I am using Linux 3.10.32 and ipipe-core-3.10.32-arm-4.patch on a
> Freescale i.MX28.
>
> When booting the kernel, the message "inconsistent lock state" is
> printed (see below). Does anyone have an idea why this happens? It is
> the same with kernel 3.10.18 and the corresponding ipipe patch. With
> Linux 3.4.6 and ipipe 3.4.6-arm-4 the message does not appear.
Do you have the same message with exactly the same kernel
configuration, only with CONFIG_XENOMAI and CONFIG_IPIPE disabled?
>
> I would very much like to understand whether this message could lead
> to any problems, since I am seeing unpredictable crashes of my
> Xenomai-based application program (e.g. with "segmentation fault"
> or "scheduling while atomic" messages).
Do you have FCSE enabled? If yes, did you try disabling it? Same
with unlocked context switch.
--
Gilles.
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Xenomai] "inconsistent lock state" on boot-up
2014-11-09 15:53 ` Gilles Chanteperdrix
@ 2014-11-10 9:08 ` Stoidner, Christoph
2014-11-10 12:33 ` Stoidner, Christoph
2014-11-10 12:43 ` Gilles Chanteperdrix
0 siblings, 2 replies; 47+ messages in thread
From: Stoidner, Christoph @ 2014-11-10 9:08 UTC (permalink / raw)
To: xenomai@xenomai.org
Hi Gilles,
> Do you have the same message with exactly the same kernel
> configuration, only with CONFIG_XENOMAI and CONFIG_IPIPE disabled?
When CONFIG_XENOMAI and CONFIG_IPIPE are disabled the message does not
appear on boot-up.
> Do you have FCSE enabled? If yes, did you try disabling it? same
> with unlocked context switch.
FCSE is already completely disabled.
Do you have an idea how to overcome the problem?
Regards,
Christoph
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Xenomai] "inconsistent lock state" on boot-up
2014-11-10 9:08 ` Stoidner, Christoph
@ 2014-11-10 12:33 ` Stoidner, Christoph
2014-11-10 12:44 ` Gilles Chanteperdrix
2014-11-10 12:43 ` Gilles Chanteperdrix
1 sibling, 1 reply; 47+ messages in thread
From: Stoidner, Christoph @ 2014-11-10 12:33 UTC (permalink / raw)
To: xenomai@xenomai.org
Hi again,
I have now disabled CONFIG_XENOMAI but left CONFIG_IPIPE enabled. The error message on boot-up still appears.
Since no one else has reported this problem, I assume it happens only on a specific architecture (in my case: i.MX28 => ARMv5). Is there anything ARMv5-specific in ipipe's locking mechanism?
Regards,
Christoph
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Xenomai] "inconsistent lock state" on boot-up
2014-11-10 9:08 ` Stoidner, Christoph
2014-11-10 12:33 ` Stoidner, Christoph
@ 2014-11-10 12:43 ` Gilles Chanteperdrix
2014-11-10 14:52 ` Jan Kiszka
2014-11-11 17:33 ` Stoidner, Christoph
1 sibling, 2 replies; 47+ messages in thread
From: Gilles Chanteperdrix @ 2014-11-10 12:43 UTC (permalink / raw)
To: Stoidner, Christoph; +Cc: xenomai@xenomai.org
On Mon, Nov 10, 2014 at 09:08:47AM +0000, Stoidner, Christoph wrote:
>
> Hi Gilles,
>
> > Do you have the same message with exactly the same kernel
> > configuration, only with CONFIG_XENOMAI and CONFIG_IPIPE disabled?
>
> When CONFIG_XENOMAI and CONFIG_IPIPE are disabled the message does not
> appear on boot-up.
>
> > Do you have FCSE enabled? If yes, did you try disabling it? same
> > with unlocked context switch.
>
> FCSE is already completely disabled.
>
> Do you have an idea how to overcome the problem?
I am not sure the lockdep message really is a problem. lockdep could
be confused by the fact that the hardware interrupts are not off
when running the I-pipe, or because we are missing some bit in the
I-pipe ARM-specific code to get it to look at the virtual mask
instead of the hardware mask.
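
To illustrate the distinction between the two masks, here is a small user-space sketch in C. It assumes a simplified model in which the pipeline keeps a per-CPU "root stalled" flag as the virtual mask for Linux, while the hardware mask (the PSR I bit on ARM) stays clear so the head domain can still receive interrupts. All names (struct cpu_model, root_irq_disable() and so on) are hypothetical and not taken from the I-pipe sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified per-CPU interrupt state under an interrupt pipeline. */
struct cpu_model {
    bool hw_irqs_masked; /* hardware mask, e.g. the PSR I bit on ARM */
    bool root_stalled;   /* virtual mask maintained for the Linux (root) domain */
};

/* What "disable interrupts" means for Linux code in this model:
 * only the virtual mask is set, the hardware stays unmasked. */
void root_irq_disable(struct cpu_model *cpu)
{
    assert(cpu != NULL);
    cpu->root_stalled = true;
}

void root_irq_enable(struct cpu_model *cpu)
{
    assert(cpu != NULL);
    cpu->root_stalled = false;
}

/* Two ways a checker could decide whether "IRQs are off". */
bool irqs_off_by_hardware_mask(const struct cpu_model *cpu)
{
    return cpu->hw_irqs_masked;
}

bool irqs_off_by_virtual_mask(const struct cpu_model *cpu)
{
    return cpu->root_stalled;
}
```

After root_irq_disable(), the virtual view reports interrupts as off while the hardware view still reports them as on. A checker that consults the wrong one of the two would see a critical section as running with interrupts enabled.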
As for the "scheduling while atomic" messages and the random
segmentation faults, you should use the I-pipe tracer, configure it
with enough back trace points, something like 1000 or 10000, and
trigger a trace freeze in the kernel code when the problem happens.
Also, "scheduling while atomic" may happen if you call some Linux
service which reschedules from primary mode. You can try enabling
I-pipe debugging, and in fact all Xenomai debugging, to try and catch
such mistakes. This is especially important if you are running a
custom skin.
--
Gilles.
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Xenomai] "inconsistent lock state" on boot-up
2014-11-10 12:33 ` Stoidner, Christoph
@ 2014-11-10 12:44 ` Gilles Chanteperdrix
0 siblings, 0 replies; 47+ messages in thread
From: Gilles Chanteperdrix @ 2014-11-10 12:44 UTC (permalink / raw)
To: Stoidner, Christoph; +Cc: xenomai@xenomai.org
On Mon, Nov 10, 2014 at 12:33:40PM +0000, Stoidner, Christoph wrote:
>
> Hi again,
>
> I have now disabled CONFIG_XENOMAI but left CONFIG_IPIPE enabled. The error message on boot-up still appears.
>
> Since no one else has reported this problem, I assume it happens only on a specific architecture (in my case: i.MX28 => ARMv5). Is there anything ARMv5-specific in ipipe's locking mechanism?
This error message probably happens to anyone enabling lockdep with
I-pipe on ARM. You are not the first to report it.
--
Gilles.
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Xenomai] "inconsistent lock state" on boot-up
2014-11-10 12:43 ` Gilles Chanteperdrix
@ 2014-11-10 14:52 ` Jan Kiszka
2014-11-10 15:56 ` Gilles Chanteperdrix
2014-11-11 17:33 ` Stoidner, Christoph
1 sibling, 1 reply; 47+ messages in thread
From: Jan Kiszka @ 2014-11-10 14:52 UTC (permalink / raw)
To: Gilles Chanteperdrix, Stoidner, Christoph; +Cc: xenomai@xenomai.org
On 2014-11-10 13:43, Gilles Chanteperdrix wrote:
> On Mon, Nov 10, 2014 at 09:08:47AM +0000, Stoidner, Christoph wrote:
>>
>> Hi Gilles,
>>
>>> Do you have the same message with exactly the same kernel
>>> configuration, only with CONFIG_XENOMAI and CONFIG_IPIPE disabled?
>>
>> When CONFIG_XENOMAI and CONFIG_IPIPE are disabled the message does not
>> appear on boot-up.
>>
>>> Do you have FCSE enabled? If yes, did you try disabling it? same
>>> with unlocked context switch.
>>
>> FCSE is already completely disabled.
>>
>> Do you have an idea how to overcome the problem?
>
> I am not sure the lockdep message really is a problem. lockdep could
> be confused by the fact that the hardware interrupts are not off
> when running the I-pipe, or because we are missing some bit in the
> I-pipe arm specific code to get it looking at the virtual mask
> instead of the hardware mask.
>
> As for the scheduling while atomic and random segmentation fault,
> you should use the I-pipe tracer, configure it with enough back
> trace points, something like 1000 or 10000, and trigger a trace
> freeze in the kernel code when the problem happens.
>
> Also, for the "scheduling while atomic", it may happen if you call
> some Linux service which reschedules from primary mode, you can try
> enabling I-pipe debugging, and in fact all Xenomai debugging, to try
> and catch such mistakes. This is especially important if you are
> running a custom skin.
"Scheduling while atomic" may have the same cause as the lockdep report:
some changes of I-pipe mess up the IRQ state tracing of Linux. I just
started to look into this issue again. We tried earlier but got distracted.
Jan
--
Siemens AG, Corporate Technology, CT RTC ITP SES-DE
Corporate Competence Center Embedded Linux
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Xenomai] "inconsistent lock state" on boot-up
2014-11-10 14:52 ` Jan Kiszka
@ 2014-11-10 15:56 ` Gilles Chanteperdrix
2014-11-10 18:29 ` Jan Kiszka
0 siblings, 1 reply; 47+ messages in thread
From: Gilles Chanteperdrix @ 2014-11-10 15:56 UTC (permalink / raw)
To: Jan Kiszka; +Cc: xenomai@xenomai.org
On Mon, Nov 10, 2014 at 03:52:41PM +0100, Jan Kiszka wrote:
> On 2014-11-10 13:43, Gilles Chanteperdrix wrote:
> > On Mon, Nov 10, 2014 at 09:08:47AM +0000, Stoidner, Christoph wrote:
> >>
> >> Hi Gilles,
> >>
> >>> Do you have the same message with exactly the same kernel
> >>> configuration, only with CONFIG_XENOMAI and CONFIG_IPIPE disabled?
> >>
> >> When CONFIG_XENOMAI and CONFIG_IPIPE are disabled the message does not
> >> appear on boot-up.
> >>
> >>> Do you have FCSE enabled? If yes, did you try disabling it? same
> >>> with unlocked context switch.
> >>
> >> FCSE is already completely disabled.
> >>
> >> Do you have an idea how to overcome the problem?
> >
> > I am not sure the lockdep message really is a problem. lockdep could
> > be confused by the fact that the hardware interrupts are not off
> > when running the I-pipe, or because we are missing some bit in the
> > I-pipe arm specific code to get it looking at the virtual mask
> > instead of the hardware mask.
> >
> > As for the scheduling while atomic and random segmentation fault,
> > you should use the I-pipe tracer, configure it with enough back
> > trace points, something like 1000 or 10000, and trigger a trace
> > freeze in the kernel code when the problem happens.
> >
> > Also, for the "scheduling while atomic", it may happen if you call
> > some Linux service which reschedules from primary mode, you can try
> > enabling I-pipe debugging, and in fact all Xenomai debugging, to try
> > and catch such mistakes. This is especially important if you are
> > running a custom skin.
>
> "Scheduling while atomic" may have the same reason why lockdep stumbles:
> some changes of I-pipe mess up the IRQ state tracing of Linux. I just
> started to look into this issue again. We tried earlier but got distracted.
I doubt that very much. Though I never run with lockdep, I sometimes
run with CONFIG_PREEMPT, and never saw this message. From what I can
see, the "scheduling while atomic" message is based on the
preempt_count only and does not use irqs_disabled() (which by the
way is known to work with I-pipe on ARM as well, so, if something is
broken, that should be something more obscure).
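
The point about preempt_count can be sketched as a small user-space model in C. It is a deliberate simplification: the real kernel check looks at specific fields and offsets within preempt_count, which are not reproduced here, and struct task_model and the function names are hypothetical. The model only shows that the decision depends on a nesting counter and that the interrupt mask state is not consulted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy task with a preemption nesting counter. */
struct task_model {
    int preempt_count;
};

void model_preempt_disable(struct task_model *t)
{
    assert(t != NULL);
    t->preempt_count++;
}

void model_preempt_enable(struct task_model *t)
{
    assert(t != NULL && t->preempt_count > 0);
    t->preempt_count--;
}

/* Would entering the scheduler now be reported as
 * "scheduling while atomic"? The irqs_masked argument is ignored on
 * purpose: in this model only the counter matters. */
bool would_report_scheduling_while_atomic(const struct task_model *t,
                                          bool irqs_masked)
{
    (void)irqs_masked;
    assert(t != NULL);
    return t->preempt_count != 0;
}
```

Under this model, a wrong view of the interrupt mask alone would not produce the message. Something would have to leave the counter unbalanced, or call into the scheduler from a context where the counter is legitimately non-zero.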
--
Gilles.
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Xenomai] "inconsistent lock state" on boot-up
2014-11-10 15:56 ` Gilles Chanteperdrix
@ 2014-11-10 18:29 ` Jan Kiszka
2014-11-10 19:46 ` Gilles Chanteperdrix
0 siblings, 1 reply; 47+ messages in thread
From: Jan Kiszka @ 2014-11-10 18:29 UTC (permalink / raw)
To: Gilles Chanteperdrix; +Cc: xenomai@xenomai.org
On 2014-11-10 16:56, Gilles Chanteperdrix wrote:
> On Mon, Nov 10, 2014 at 03:52:41PM +0100, Jan Kiszka wrote:
>> On 2014-11-10 13:43, Gilles Chanteperdrix wrote:
>>> On Mon, Nov 10, 2014 at 09:08:47AM +0000, Stoidner, Christoph wrote:
>>>>
>>>> Hi Gilles,
>>>>
>>>>> Do you have the same message with exactly the same kernel
>>>>> configuration, only with CONFIG_XENOMAI and CONFIG_IPIPE disabled?
>>>>
>>>> When CONFIG_XENOMAI and CONFIG_IPIPE are disabled the message does not
>>>> appear on boot-up.
>>>>
>>>>> Do you have FCSE enabled? If yes, did you try disabling it? same
>>>>> with unlocked context switch.
>>>>
>>>> FCSE is already completely disabled.
>>>>
>>>> Do you have an idea how to overcome the problem?
>>>
>>> I am not sure the lockdep message really is a problem. lockdep could
>>> be confused by the fact that the hardware interrupts are not off
>>> when running the I-pipe, or because we are missing some bit in the
>>> I-pipe arm specific code to get it looking at the virtual mask
>>> instead of the hardware mask.
>>>
>>> As for the scheduling while atomic and random segmentation fault,
>>> you should use the I-pipe tracer, configure it with enough back
>>> trace points, something like 1000 or 10000, and trigger a trace
>>> freeze in the kernel code when the problem happens.
>>>
>>> Also, for the "scheduling while atomic", it may happen if you call
>>> some Linux service which reschedules from primary mode, you can try
>>> enabling I-pipe debugging, and in fact all Xenomai debugging, to try
>>> and catch such mistakes. This is especially important if you are
>>> running a custom skin.
>>
>> "Scheduling while atomic" may have the same reason why lockdep stumbles:
>> some changes of I-pipe mess up the IRQ state tracing of Linux. I just
>> started to look into this issue again. We tried earlier but got distracted.
>
> I doubt that very much. Though I never run with lockdep, I sometimes
> run with CONFIG_PREEMPT, and never saw this message. From what I can
> see, the "scheduling while atomic" message is based on the
> preempt_count only and does not use irqs_disabled() (which by the
> way is known to work with I-pipe on ARM as well, so, if something is
> broken, that should be something more obscure).
Let's see. I think I've identified one wrong path:
diff --git a/arch/arm/kernel/entry-header.S b/arch/arm/kernel/entry-header.S
index d32f8bd..ab911f8 100644
--- a/arch/arm/kernel/entry-header.S
+++ b/arch/arm/kernel/entry-header.S
@@ -198,7 +198,10 @@
 #ifdef CONFIG_TRACE_IRQFLAGS
 	@ The parent context IRQs must have been enabled to get here in
 	@ the first place, so there's no point checking the PSR I bit.
-	bl	trace_hardirqs_on
+	tst	\rpsr, #PSR_I_BIT
+	bleq	trace_hardirqs_off
+	tst	\rpsr, #PSR_I_BIT
+	blne	trace_hardirqs_on
 #endif
 	.else
 	@ IRQs off again before pulling preserved data off the stack
This is probably not a proper fix, but with that change applied, the
warning is gone. Now the question is what to really test for when
returning here. I suppose we want the pipeline state of root here -
should I use __ipipe_check_root_interruptible?
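
For readers less used to ARM condition codes, the decision made by the four added instructions can be written out as a small C model. tst ANDs the saved PSR with PSR_I_BIT and sets the Z flag when the result is zero, bleq branches when Z is set, and blne branches when it is clear. The first function mirrors the patch as posted. The second is only a hypothetical illustration of "use the pipeline state of root" and assumes a boolean root-stalled flag that is not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>

#define PSR_I_BIT 0x00000080u /* ARM PSR bit 7: set means IRQs are masked */

enum trace_call {
    CALL_TRACE_HARDIRQS_OFF,
    CALL_TRACE_HARDIRQS_ON
};

/* Mirrors the patched sequence:
 *   tst  rpsr, #PSR_I_BIT
 *   bleq trace_hardirqs_off   (taken when the I bit is clear)
 *   blne trace_hardirqs_on    (taken when the I bit is set)   */
enum trace_call patched_exit_trace(unsigned int rpsr)
{
    bool i_bit_clear = (rpsr & PSR_I_BIT) == 0;

    return i_bit_clear ? CALL_TRACE_HARDIRQS_OFF : CALL_TRACE_HARDIRQS_ON;
}

/* Hypothetical alternative: decide from the virtual (root) state
 * instead of the hardware bit. root_stalled is an assumed input. */
enum trace_call root_state_exit_trace(bool root_stalled)
{
    return root_stalled ? CALL_TRACE_HARDIRQS_OFF : CALL_TRACE_HARDIRQS_ON;
}
```

As the comment in the patched file notes, the parent context must have had IRQs enabled for the interrupt to be taken at all. If so, the saved PSR on this path has the I bit clear and the patched sequence always reports "off" to the tracer. That would be consistent with the warning disappearing without the underlying state question being settled.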
For reference, here is a trace that relates to a lockdep report:
| #func -155 __save_stack_trace+0x14 (save_stack_trace+0x30)
| #func -157 save_stack_trace+0x10 (save_trace+0x3c)
:| #func -159 __ipipe_bugon_irqs_enabled+0x10 (__ipipe_fast_svc_irq_exit+0x4)
:| #func -160 __ipipe_check_root_interruptible+0x10 (__irq_svc+0x48)
:| #func -161 __ipipe_exit_irq+0x10 (__ipipe_grab_irq+0x48)
:| #func -164 __ipipe_set_irq_pending+0x10 (__ipipe_dispatch_irq+0x1f0)
:| #func -167 irq_gc_mask_disable_reg+0x10 (omap_mask_ack_irq+0x18)
:| #func -168 omap_mask_ack_irq+0x10 (__ipipe_ack_level_irq+0x30)
:| #func -169 __ipipe_ack_level_irq+0x10 (__ipipe_dispatch_irq+0x6c)
:| #func -171 irq_to_desc+0x10 (__ipipe_dispatch_irq+0xc8)
:| #func -174 irq_to_desc+0x10 (__ipipe_dispatch_irq+0xb8)
:| #func -175 __ipipe_dispatch_irq+0x10 (__ipipe_grab_irq+0x40)
:| #func -177 __ipipe_grab_irq+0x10 (omap3_intc_handle_irq+0x94)
:| #func -179 irq_find_mapping+0x14 (omap3_intc_handle_irq+0x88)
:| #func -180 omap3_intc_handle_irq+0x10 (__irq_svc+0x44)
: #func -184 update_curr.constprop.48+0x14 (dequeue_task_fair+0x30)
: #func -184 dequeue_task_fair+0x10 (dequeue_task+0x38)
: #func -186 update_rq_clock.part.71+0x10 (dequeue_task+0x4c)
: #func -187 dequeue_task+0x14 (deactivate_task+0x38)
: #func -187 deactivate_task+0x10 (__schedule+0x2b4)
: #func -188 do_raw_spin_lock+0x14 (_raw_spin_lock_irq+0x7c)
+func -190 _raw_spin_lock_irq+0x14 (__schedule+0x84)
+func -190 ipipe_root_only+0x10 (__schedule+0x5c)
| #func -191 ipipe_root_only+0x10 (ipipe_unstall_root+0x1c)
#func -192 ipipe_unstall_root+0x10 (rcu_sched_qs+0xa0)
+func -193 rcu_sched_qs+0x10 (__schedule+0x48)
+func -194 __schedule+0x14 (schedule+0x40)
+func -195 schedule+0x10 (smpboot_thread_fn+0x108)
The ":" at the beginning stands for !current->hardirqs_enabled.
Jan
--
Siemens AG, Corporate Technology, CT RTC ITP SES-DE
Corporate Competence Center Embedded Linux
^ permalink raw reply related [flat|nested] 47+ messages in thread
* Re: [Xenomai] "inconsistent lock state" on boot-up
2014-11-10 18:29 ` Jan Kiszka
@ 2014-11-10 19:46 ` Gilles Chanteperdrix
2014-11-10 19:51 ` Gilles Chanteperdrix
2014-11-10 19:55 ` Jan Kiszka
0 siblings, 2 replies; 47+ messages in thread
From: Gilles Chanteperdrix @ 2014-11-10 19:46 UTC (permalink / raw)
To: Jan Kiszka; +Cc: xenomai@xenomai.org
On Mon, Nov 10, 2014 at 07:29:58PM +0100, Jan Kiszka wrote:
> On 2014-11-10 16:56, Gilles Chanteperdrix wrote:
> > On Mon, Nov 10, 2014 at 03:52:41PM +0100, Jan Kiszka wrote:
> >> On 2014-11-10 13:43, Gilles Chanteperdrix wrote:
> >>> On Mon, Nov 10, 2014 at 09:08:47AM +0000, Stoidner, Christoph wrote:
> >>>>
> >>>> Hi Gilles,
> >>>>
> >>>>> Do you have the same message with exactly the same kernel
> >>>>> configuration, only with CONFIG_XENOMAI and CONFIG_IPIPE disabled?
> >>>>
> >>>> When CONFIG_XENOMAI and CONFIG_IPIPE are disabled the message does not
> >>>> appear on boot-up.
> >>>>
> >>>>> Do you have FCSE enabled? If yes, did you try disabling it? same
> >>>>> with unlocked context switch.
> >>>>
> >>>> FCSE is already disabled at all.
> >>>>
> >>>> Do you have an idea how to overcome the problem?
> >>>
> >>> I am not sure the lockdep message really is a problem. lockdep could
> >>> be confused by the fact that the hardware interrupts are not off
> >>> when running the I-pipe, or because we are missing some bit in the
> >>> I-pipe arm specific code to get it looking at the virtual mask
> >>> instead of the hardware mask.
> >>>
> >>> As for the scheduling while atomic and random segmentation fault,
> >>> you should use the I-pipe tracer, configure it with enough back
> >>> trace points, something like 1000 or 10000, and trigger a trace
> >>> freeze in the kernell code when the problem happens.
> >>>
> >>> Also, for the "scheduling while atomic", it may happen if you call
> >>> some Linux service which reschedules from primary mode, you can try
> >>> enabling I-pipe debugging, and in fact all Xenomai debugging, to try
> >>> and catch such mistakes. This is especially important if you are
> >>> running a custom skin.
> >>
> >> "Scheduling while atomic" may have the same reason why lockdep stumbles:
> >> some changes of I-pipe messe up with IRQ state tracing of Linux. I just
> >> started to look into this issue again. We tried earlier but got distracted.
> >
> > I doubt that very much. Though I never run with lockdep, I sometimes
> > run with CONFIG_PREEMPT, and never saw this message. From what I can
> > see, the "scheduling while atomic" message is based on the
> > preempt_count only and does not use irqs_disabled() (which by the
> > way is known to work with I-pipe on ARM as well, so, if something is
> > broken, that should be something more obscure).
>
> Let's see. I think I've identified one wrong path:
>
> diff --git a/arch/arm/kernel/entry-header.S b/arch/arm/kernel/entry-header.S
> index d32f8bd..ab911f8 100644
> --- a/arch/arm/kernel/entry-header.S
> +++ b/arch/arm/kernel/entry-header.S
> @@ -198,7 +198,10 @@
> #ifdef CONFIG_TRACE_IRQFLAGS
> @ The parent context IRQs must have been enabled to get here in
> @ the first place, so there's no point checking the PSR I bit.
> - bl trace_hardirqs_on
> + tst \rpsr, #PSR_I_BIT
> + bleq trace_hardirqs_off
> + tst \rpsr, #PSR_I_BIT
> + blne trace_hardirqs_on
> #endif
> .else
> @ IRQs off again before pulling preserved data off the stack
>
> This is probably no fix, but a with that change applied, the warning is
> gone. Now the question is what to really test for when returning here. I
> suppose we want the pipeline state of root here - should I
> __ipipe_check_root_interruptible?
This does not make sense, read the comment above that change: there
is no way an interrupt can be taken, and svc_entry thus entered, with
interrupts off. Besides, this is mainline code, so it would be a
problem for mainline too. We are necessarily returning to a place
where hardware irqs were on.
To me the problem is rather that we enter
trace_hardirqs_on/trace_hardirqs_off when in the xenomai domain.
We can try and fix that, but this will result in a hell of an entry.S
to maintain. I would rather exit early in
trace_hardirqs_on/trace_hardirqs_off if the current domain is not root.
--
Gilles.
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Xenomai] "inconsistent lock state" on boot-up
2014-11-10 19:46 ` Gilles Chanteperdrix
@ 2014-11-10 19:51 ` Gilles Chanteperdrix
2014-11-10 19:55 ` Jan Kiszka
1 sibling, 0 replies; 47+ messages in thread
From: Gilles Chanteperdrix @ 2014-11-10 19:51 UTC (permalink / raw)
To: Jan Kiszka; +Cc: xenomai@xenomai.org
On Mon, Nov 10, 2014 at 08:46:06PM +0100, Gilles Chanteperdrix wrote:
> On Mon, Nov 10, 2014 at 07:29:58PM +0100, Jan Kiszka wrote:
> > On 2014-11-10 16:56, Gilles Chanteperdrix wrote:
> > > On Mon, Nov 10, 2014 at 03:52:41PM +0100, Jan Kiszka wrote:
> > >> On 2014-11-10 13:43, Gilles Chanteperdrix wrote:
> > >>> On Mon, Nov 10, 2014 at 09:08:47AM +0000, Stoidner, Christoph wrote:
> > >>>>
> > >>>> Hi Gilles,
> > >>>>
> > >>>>> Do you have the same message with exactly the same kernel
> > >>>>> configuration, only with CONFIG_XENOMAI and CONFIG_IPIPE disabled?
> > >>>>
> > >>>> When CONFIG_XENOMAI and CONFIG_IPIPE are disabled the message does not
> > >>>> appear on boot-up.
> > >>>>
> > >>>>> Do you have FCSE enabled? If yes, did you try disabling it? same
> > >>>>> with unlocked context switch.
> > >>>>
> > >>>> FCSE is already disabled at all.
> > >>>>
> > >>>> Do you have an idea how to overcome the problem?
> > >>>
> > >>> I am not sure the lockdep message really is a problem. lockdep could
> > >>> be confused by the fact that the hardware interrupts are not off
> > >>> when running the I-pipe, or because we are missing some bit in the
> > >>> I-pipe arm specific code to get it looking at the virtual mask
> > >>> instead of the hardware mask.
> > >>>
> > >>> As for the scheduling while atomic and random segmentation fault,
> > >>> you should use the I-pipe tracer, configure it with enough back
> > >>> trace points, something like 1000 or 10000, and trigger a trace
> > >>> freeze in the kernell code when the problem happens.
> > >>>
> > >>> Also, for the "scheduling while atomic", it may happen if you call
> > >>> some Linux service which reschedules from primary mode, you can try
> > >>> enabling I-pipe debugging, and in fact all Xenomai debugging, to try
> > >>> and catch such mistakes. This is especially important if you are
> > >>> running a custom skin.
> > >>
> > >> "Scheduling while atomic" may have the same reason why lockdep stumbles:
> > >> some changes of I-pipe messe up with IRQ state tracing of Linux. I just
> > >> started to look into this issue again. We tried earlier but got distracted.
> > >
> > > I doubt that very much. Though I never run with lockdep, I sometimes
> > > run with CONFIG_PREEMPT, and never saw this message. From what I can
> > > see, the "scheduling while atomic" message is based on the
> > > preempt_count only and does not use irqs_disabled() (which by the
> > > way is known to work with I-pipe on ARM as well, so, if something is
> > > broken, that should be something more obscure).
> >
> > Let's see. I think I've identified one wrong path:
> >
> > diff --git a/arch/arm/kernel/entry-header.S b/arch/arm/kernel/entry-header.S
> > index d32f8bd..ab911f8 100644
> > --- a/arch/arm/kernel/entry-header.S
> > +++ b/arch/arm/kernel/entry-header.S
> > @@ -198,7 +198,10 @@
> > #ifdef CONFIG_TRACE_IRQFLAGS
> > @ The parent context IRQs must have been enabled to get here in
> > @ the first place, so there's no point checking the PSR I bit.
> > - bl trace_hardirqs_on
> > + tst \rpsr, #PSR_I_BIT
> > + bleq trace_hardirqs_off
> > + tst \rpsr, #PSR_I_BIT
> > + blne trace_hardirqs_on
> > #endif
> > .else
> > @ IRQs off again before pulling preserved data off the stack
> >
> > This is probably no fix, but a with that change applied, the warning is
> > gone. Now the question is what to really test for when returning here. I
> > suppose we want the pipeline state of root here - should I
> > __ipipe_check_root_interruptible?
>
> This does not make sense, read the comment above that change: there
> is no way an interrupt can be taken, and so entering svc_entry, with
> interrupts off. Besides this is mainline code, so it would be a
> problem for mainline too. We are necessarily returning to a place
> where hardware irqs were on.
>
> To me the problem is rather that we enter
> trace_hardirqs_on/trace_hardirqs_off when in the xenomai domain.
> We can try and fix that, but this will result in a hell of entry.S
> to maintain.I would rather exit early in
> trace_hardirqs_on/trace_hardirqs_off if current domain is not root.
The whole code in kernel/trace/trace_irqsoff.c clearly cannot
be called from the real-time domain: it uses local_irq_save,
raw_smp_processor_id, preempt_count. So, really, the only sane fix
is
if (!ipipe_root_p)
return;
at the beginning of trace_hardirqs_on/trace_hardirqs_off.
--
Gilles.
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Xenomai] "inconsistent lock state" on boot-up
2014-11-10 19:46 ` Gilles Chanteperdrix
2014-11-10 19:51 ` Gilles Chanteperdrix
@ 2014-11-10 19:55 ` Jan Kiszka
2014-11-10 20:00 ` Gilles Chanteperdrix
1 sibling, 1 reply; 47+ messages in thread
From: Jan Kiszka @ 2014-11-10 19:55 UTC (permalink / raw)
To: Gilles Chanteperdrix; +Cc: xenomai@xenomai.org
On 2014-11-10 20:46, Gilles Chanteperdrix wrote:
> On Mon, Nov 10, 2014 at 07:29:58PM +0100, Jan Kiszka wrote:
>> On 2014-11-10 16:56, Gilles Chanteperdrix wrote:
>>> On Mon, Nov 10, 2014 at 03:52:41PM +0100, Jan Kiszka wrote:
>>>> On 2014-11-10 13:43, Gilles Chanteperdrix wrote:
>>>>> On Mon, Nov 10, 2014 at 09:08:47AM +0000, Stoidner, Christoph wrote:
>>>>>>
>>>>>> Hi Gilles,
>>>>>>
>>>>>>> Do you have the same message with exactly the same kernel
>>>>>>> configuration, only with CONFIG_XENOMAI and CONFIG_IPIPE disabled?
>>>>>>
>>>>>> When CONFIG_XENOMAI and CONFIG_IPIPE are disabled the message does not
>>>>>> appear on boot-up.
>>>>>>
>>>>>>> Do you have FCSE enabled? If yes, did you try disabling it? same
>>>>>>> with unlocked context switch.
>>>>>>
>>>>>> FCSE is already disabled at all.
>>>>>>
>>>>>> Do you have an idea how to overcome the problem?
>>>>>
>>>>> I am not sure the lockdep message really is a problem. lockdep could
>>>>> be confused by the fact that the hardware interrupts are not off
>>>>> when running the I-pipe, or because we are missing some bit in the
>>>>> I-pipe arm specific code to get it looking at the virtual mask
>>>>> instead of the hardware mask.
>>>>>
>>>>> As for the scheduling while atomic and random segmentation fault,
>>>>> you should use the I-pipe tracer, configure it with enough back
>>>>> trace points, something like 1000 or 10000, and trigger a trace
>>>>> freeze in the kernell code when the problem happens.
>>>>>
>>>>> Also, for the "scheduling while atomic", it may happen if you call
>>>>> some Linux service which reschedules from primary mode, you can try
>>>>> enabling I-pipe debugging, and in fact all Xenomai debugging, to try
>>>>> and catch such mistakes. This is especially important if you are
>>>>> running a custom skin.
>>>>
>>>> "Scheduling while atomic" may have the same reason why lockdep stumbles:
>>>> some changes of I-pipe messe up with IRQ state tracing of Linux. I just
>>>> started to look into this issue again. We tried earlier but got distracted.
>>>
>>> I doubt that very much. Though I never run with lockdep, I sometimes
>>> run with CONFIG_PREEMPT, and never saw this message. From what I can
>>> see, the "scheduling while atomic" message is based on the
>>> preempt_count only and does not use irqs_disabled() (which by the
>>> way is known to work with I-pipe on ARM as well, so, if something is
>>> broken, that should be something more obscure).
>>
>> Let's see. I think I've identified one wrong path:
>>
>> diff --git a/arch/arm/kernel/entry-header.S b/arch/arm/kernel/entry-header.S
>> index d32f8bd..ab911f8 100644
>> --- a/arch/arm/kernel/entry-header.S
>> +++ b/arch/arm/kernel/entry-header.S
>> @@ -198,7 +198,10 @@
>> #ifdef CONFIG_TRACE_IRQFLAGS
>> @ The parent context IRQs must have been enabled to get here in
>> @ the first place, so there's no point checking the PSR I bit.
>> - bl trace_hardirqs_on
>> + tst \rpsr, #PSR_I_BIT
>> + bleq trace_hardirqs_off
>> + tst \rpsr, #PSR_I_BIT
>> + blne trace_hardirqs_on
>> #endif
>> .else
>> @ IRQs off again before pulling preserved data off the stack
>>
>> This is probably no fix, but a with that change applied, the warning is
>> gone. Now the question is what to really test for when returning here. I
>> suppose we want the pipeline state of root here - should I
>> __ipipe_check_root_interruptible?
>
> This does not make sense, read the comment above that change: there
> is no way an interrupt can be taken, and so entering svc_entry, with
> interrupts off. Besides this is mainline code, so it would be a
> problem for mainline too. We are necessarily returning to a place
> where hardware irqs were on.
Did you also look at the trace I posted?
Jan
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Xenomai] "inconsistent lock state" on boot-up
2014-11-10 19:55 ` Jan Kiszka
@ 2014-11-10 20:00 ` Gilles Chanteperdrix
2014-11-10 20:02 ` Jan Kiszka
0 siblings, 1 reply; 47+ messages in thread
From: Gilles Chanteperdrix @ 2014-11-10 20:00 UTC (permalink / raw)
To: Jan Kiszka; +Cc: xenomai@xenomai.org
On Mon, Nov 10, 2014 at 08:55:26PM +0100, Jan Kiszka wrote:
> On 2014-11-10 20:46, Gilles Chanteperdrix wrote:
> > On Mon, Nov 10, 2014 at 07:29:58PM +0100, Jan Kiszka wrote:
> >> On 2014-11-10 16:56, Gilles Chanteperdrix wrote:
> >>> On Mon, Nov 10, 2014 at 03:52:41PM +0100, Jan Kiszka wrote:
> >>>> On 2014-11-10 13:43, Gilles Chanteperdrix wrote:
> >>>>> On Mon, Nov 10, 2014 at 09:08:47AM +0000, Stoidner, Christoph wrote:
> >>>>>>
> >>>>>> Hi Gilles,
> >>>>>>
> >>>>>>> Do you have the same message with exactly the same kernel
> >>>>>>> configuration, only with CONFIG_XENOMAI and CONFIG_IPIPE disabled?
> >>>>>>
> >>>>>> When CONFIG_XENOMAI and CONFIG_IPIPE are disabled the message does not
> >>>>>> appear on boot-up.
> >>>>>>
> >>>>>>> Do you have FCSE enabled? If yes, did you try disabling it? same
> >>>>>>> with unlocked context switch.
> >>>>>>
> >>>>>> FCSE is already disabled at all.
> >>>>>>
> >>>>>> Do you have an idea how to overcome the problem?
> >>>>>
> >>>>> I am not sure the lockdep message really is a problem. lockdep could
> >>>>> be confused by the fact that the hardware interrupts are not off
> >>>>> when running the I-pipe, or because we are missing some bit in the
> >>>>> I-pipe arm specific code to get it looking at the virtual mask
> >>>>> instead of the hardware mask.
> >>>>>
> >>>>> As for the scheduling while atomic and random segmentation fault,
> >>>>> you should use the I-pipe tracer, configure it with enough back
> >>>>> trace points, something like 1000 or 10000, and trigger a trace
> >>>>> freeze in the kernell code when the problem happens.
> >>>>>
> >>>>> Also, for the "scheduling while atomic", it may happen if you call
> >>>>> some Linux service which reschedules from primary mode, you can try
> >>>>> enabling I-pipe debugging, and in fact all Xenomai debugging, to try
> >>>>> and catch such mistakes. This is especially important if you are
> >>>>> running a custom skin.
> >>>>
> >>>> "Scheduling while atomic" may have the same reason why lockdep stumbles:
> >>>> some changes of I-pipe messe up with IRQ state tracing of Linux. I just
> >>>> started to look into this issue again. We tried earlier but got distracted.
> >>>
> >>> I doubt that very much. Though I never run with lockdep, I sometimes
> >>> run with CONFIG_PREEMPT, and never saw this message. From what I can
> >>> see, the "scheduling while atomic" message is based on the
> >>> preempt_count only and does not use irqs_disabled() (which by the
> >>> way is known to work with I-pipe on ARM as well, so, if something is
> >>> broken, that should be something more obscure).
> >>
> >> Let's see. I think I've identified one wrong path:
> >>
> >> diff --git a/arch/arm/kernel/entry-header.S b/arch/arm/kernel/entry-header.S
> >> index d32f8bd..ab911f8 100644
> >> --- a/arch/arm/kernel/entry-header.S
> >> +++ b/arch/arm/kernel/entry-header.S
> >> @@ -198,7 +198,10 @@
> >> #ifdef CONFIG_TRACE_IRQFLAGS
> >> @ The parent context IRQs must have been enabled to get here in
> >> @ the first place, so there's no point checking the PSR I bit.
> >> - bl trace_hardirqs_on
> >> + tst \rpsr, #PSR_I_BIT
> >> + bleq trace_hardirqs_off
> >> + tst \rpsr, #PSR_I_BIT
> >> + blne trace_hardirqs_on
> >> #endif
> >> .else
> >> @ IRQs off again before pulling preserved data off the stack
> >>
> >> This is probably no fix, but a with that change applied, the warning is
> >> gone. Now the question is what to really test for when returning here. I
> >> suppose we want the pipeline state of root here - should I
> >> __ipipe_check_root_interruptible?
> >
> > This does not make sense, read the comment above that change: there
> > is no way an interrupt can be taken, and so entering svc_entry, with
> > interrupts off. Besides this is mainline code, so it would be a
> > problem for mainline too. We are necessarily returning to a place
> > where hardware irqs were on.
>
> Did you also look at the trace I posted?
Yes, but I did not see what I am supposed to see. The only thing I
see is that these trace functions should never have been called from
rt domain in the first place.
Note that the reason this trace_irqs stuff is not working well
may be that parts of it are commented out with CONFIG_IPIPE
(see asm_trace_hardirqs_on_cond, asm_trace_hardirqs_off).
--
Gilles.
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Xenomai] "inconsistent lock state" on boot-up
2014-11-10 20:00 ` Gilles Chanteperdrix
@ 2014-11-10 20:02 ` Jan Kiszka
2014-11-10 20:06 ` Gilles Chanteperdrix
0 siblings, 1 reply; 47+ messages in thread
From: Jan Kiszka @ 2014-11-10 20:02 UTC (permalink / raw)
To: Gilles Chanteperdrix; +Cc: xenomai@xenomai.org
On 2014-11-10 21:00, Gilles Chanteperdrix wrote:
> On Mon, Nov 10, 2014 at 08:55:26PM +0100, Jan Kiszka wrote:
>> On 2014-11-10 20:46, Gilles Chanteperdrix wrote:
>>> On Mon, Nov 10, 2014 at 07:29:58PM +0100, Jan Kiszka wrote:
>>>> On 2014-11-10 16:56, Gilles Chanteperdrix wrote:
>>>>> On Mon, Nov 10, 2014 at 03:52:41PM +0100, Jan Kiszka wrote:
>>>>>> On 2014-11-10 13:43, Gilles Chanteperdrix wrote:
>>>>>>> On Mon, Nov 10, 2014 at 09:08:47AM +0000, Stoidner, Christoph wrote:
>>>>>>>>
>>>>>>>> Hi Gilles,
>>>>>>>>
>>>>>>>>> Do you have the same message with exactly the same kernel
>>>>>>>>> configuration, only with CONFIG_XENOMAI and CONFIG_IPIPE disabled?
>>>>>>>>
>>>>>>>> When CONFIG_XENOMAI and CONFIG_IPIPE are disabled the message does not
>>>>>>>> appear on boot-up.
>>>>>>>>
>>>>>>>>> Do you have FCSE enabled? If yes, did you try disabling it? same
>>>>>>>>> with unlocked context switch.
>>>>>>>>
>>>>>>>> FCSE is already disabled at all.
>>>>>>>>
>>>>>>>> Do you have an idea how to overcome the problem?
>>>>>>>
>>>>>>> I am not sure the lockdep message really is a problem. lockdep could
>>>>>>> be confused by the fact that the hardware interrupts are not off
>>>>>>> when running the I-pipe, or because we are missing some bit in the
>>>>>>> I-pipe arm specific code to get it looking at the virtual mask
>>>>>>> instead of the hardware mask.
>>>>>>>
>>>>>>> As for the scheduling while atomic and random segmentation fault,
>>>>>>> you should use the I-pipe tracer, configure it with enough back
>>>>>>> trace points, something like 1000 or 10000, and trigger a trace
>>>>>>> freeze in the kernell code when the problem happens.
>>>>>>>
>>>>>>> Also, for the "scheduling while atomic", it may happen if you call
>>>>>>> some Linux service which reschedules from primary mode, you can try
>>>>>>> enabling I-pipe debugging, and in fact all Xenomai debugging, to try
>>>>>>> and catch such mistakes. This is especially important if you are
>>>>>>> running a custom skin.
>>>>>>
>>>>>> "Scheduling while atomic" may have the same reason why lockdep stumbles:
>>>>>> some changes of I-pipe messe up with IRQ state tracing of Linux. I just
>>>>>> started to look into this issue again. We tried earlier but got distracted.
>>>>>
>>>>> I doubt that very much. Though I never run with lockdep, I sometimes
>>>>> run with CONFIG_PREEMPT, and never saw this message. From what I can
>>>>> see, the "scheduling while atomic" message is based on the
>>>>> preempt_count only and does not use irqs_disabled() (which by the
>>>>> way is known to work with I-pipe on ARM as well, so, if something is
>>>>> broken, that should be something more obscure).
>>>>
>>>> Let's see. I think I've identified one wrong path:
>>>>
>>>> diff --git a/arch/arm/kernel/entry-header.S b/arch/arm/kernel/entry-header.S
>>>> index d32f8bd..ab911f8 100644
>>>> --- a/arch/arm/kernel/entry-header.S
>>>> +++ b/arch/arm/kernel/entry-header.S
>>>> @@ -198,7 +198,10 @@
>>>> #ifdef CONFIG_TRACE_IRQFLAGS
>>>> @ The parent context IRQs must have been enabled to get here in
>>>> @ the first place, so there's no point checking the PSR I bit.
>>>> - bl trace_hardirqs_on
>>>> + tst \rpsr, #PSR_I_BIT
>>>> + bleq trace_hardirqs_off
>>>> + tst \rpsr, #PSR_I_BIT
>>>> + blne trace_hardirqs_on
>>>> #endif
>>>> .else
>>>> @ IRQs off again before pulling preserved data off the stack
>>>>
>>>> This is probably no fix, but a with that change applied, the warning is
>>>> gone. Now the question is what to really test for when returning here. I
>>>> suppose we want the pipeline state of root here - should I
>>>> __ipipe_check_root_interruptible?
>>>
>>> This does not make sense, read the comment above that change: there
>>> is no way an interrupt can be taken, and so entering svc_entry, with
>>> interrupts off. Besides this is mainline code, so it would be a
>>> problem for mainline too. We are necessarily returning to a place
>>> where hardware irqs were on.
>>
>> Did you also look at the trace I posted?
>
> Yes, but I did not see what I am supposed to see. The only thing I
> see is that these trace functions should never have been called from
> rt domain in the first place.
>
There is no RT domain in the trace, only an inconsistent Linux trace
state after return from IRQ.
> Note that the fact that this trace_irqs stuff is not working well
> may be the fact that part of them are commented with CONFIG_IPIPE
> (see asm_trace_hardirqs_on_cond, asm_trace_hardirqs_off)
No, that doesn't solve all issues. Even with my hack (which may not
address all cases properly) plus a revert of that commit, there are
still inconsistencies.
Jan
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Xenomai] "inconsistent lock state" on boot-up
2014-11-10 20:02 ` Jan Kiszka
@ 2014-11-10 20:06 ` Gilles Chanteperdrix
2014-11-10 20:10 ` Jan Kiszka
0 siblings, 1 reply; 47+ messages in thread
From: Gilles Chanteperdrix @ 2014-11-10 20:06 UTC (permalink / raw)
To: Jan Kiszka; +Cc: xenomai@xenomai.org
On Mon, Nov 10, 2014 at 09:02:58PM +0100, Jan Kiszka wrote:
> On 2014-11-10 21:00, Gilles Chanteperdrix wrote:
> > On Mon, Nov 10, 2014 at 08:55:26PM +0100, Jan Kiszka wrote:
> >> On 2014-11-10 20:46, Gilles Chanteperdrix wrote:
> >>> On Mon, Nov 10, 2014 at 07:29:58PM +0100, Jan Kiszka wrote:
> >>>> On 2014-11-10 16:56, Gilles Chanteperdrix wrote:
> >>>>> On Mon, Nov 10, 2014 at 03:52:41PM +0100, Jan Kiszka wrote:
> >>>>>> On 2014-11-10 13:43, Gilles Chanteperdrix wrote:
> >>>>>>> On Mon, Nov 10, 2014 at 09:08:47AM +0000, Stoidner, Christoph wrote:
> >>>>>>>>
> >>>>>>>> Hi Gilles,
> >>>>>>>>
> >>>>>>>>> Do you have the same message with exactly the same kernel
> >>>>>>>>> configuration, only with CONFIG_XENOMAI and CONFIG_IPIPE disabled?
> >>>>>>>>
> >>>>>>>> When CONFIG_XENOMAI and CONFIG_IPIPE are disabled the message does not
> >>>>>>>> appear on boot-up.
> >>>>>>>>
> >>>>>>>>> Do you have FCSE enabled? If yes, did you try disabling it? same
> >>>>>>>>> with unlocked context switch.
> >>>>>>>>
> >>>>>>>> FCSE is already disabled at all.
> >>>>>>>>
> >>>>>>>> Do you have an idea how to overcome the problem?
> >>>>>>>
> >>>>>>> I am not sure the lockdep message really is a problem. lockdep could
> >>>>>>> be confused by the fact that the hardware interrupts are not off
> >>>>>>> when running the I-pipe, or because we are missing some bit in the
> >>>>>>> I-pipe arm specific code to get it looking at the virtual mask
> >>>>>>> instead of the hardware mask.
> >>>>>>>
> >>>>>>> As for the scheduling while atomic and random segmentation fault,
> >>>>>>> you should use the I-pipe tracer, configure it with enough back
> >>>>>>> trace points, something like 1000 or 10000, and trigger a trace
> >>>>>>> freeze in the kernell code when the problem happens.
> >>>>>>>
> >>>>>>> Also, for the "scheduling while atomic", it may happen if you call
> >>>>>>> some Linux service which reschedules from primary mode, you can try
> >>>>>>> enabling I-pipe debugging, and in fact all Xenomai debugging, to try
> >>>>>>> and catch such mistakes. This is especially important if you are
> >>>>>>> running a custom skin.
> >>>>>>
> >>>>>> "Scheduling while atomic" may have the same reason why lockdep stumbles:
> >>>>>> some changes of I-pipe messe up with IRQ state tracing of Linux. I just
> >>>>>> started to look into this issue again. We tried earlier but got distracted.
> >>>>>
> >>>>> I doubt that very much. Though I never run with lockdep, I sometimes
> >>>>> run with CONFIG_PREEMPT, and never saw this message. From what I can
> >>>>> see, the "scheduling while atomic" message is based on the
> >>>>> preempt_count only and does not use irqs_disabled() (which by the
> >>>>> way is known to work with I-pipe on ARM as well, so, if something is
> >>>>> broken, that should be something more obscure).
> >>>>
> >>>> Let's see. I think I've identified one wrong path:
> >>>>
> >>>> diff --git a/arch/arm/kernel/entry-header.S b/arch/arm/kernel/entry-header.S
> >>>> index d32f8bd..ab911f8 100644
> >>>> --- a/arch/arm/kernel/entry-header.S
> >>>> +++ b/arch/arm/kernel/entry-header.S
> >>>> @@ -198,7 +198,10 @@
> >>>> #ifdef CONFIG_TRACE_IRQFLAGS
> >>>> @ The parent context IRQs must have been enabled to get here in
> >>>> @ the first place, so there's no point checking the PSR I bit.
> >>>> - bl trace_hardirqs_on
> >>>> + tst \rpsr, #PSR_I_BIT
> >>>> + bleq trace_hardirqs_off
> >>>> + tst \rpsr, #PSR_I_BIT
> >>>> + blne trace_hardirqs_on
> >>>> #endif
> >>>> .else
> >>>> @ IRQs off again before pulling preserved data off the stack
> >>>>
> >>>> This is probably no fix, but a with that change applied, the warning is
> >>>> gone. Now the question is what to really test for when returning here. I
> >>>> suppose we want the pipeline state of root here - should I
> >>>> __ipipe_check_root_interruptible?
> >>>
> >>> This does not make sense, read the comment above that change: there
> >>> is no way an interrupt can be taken, and so entering svc_entry, with
> >>> interrupts off. Besides this is mainline code, so it would be a
> >>> problem for mainline too. We are necessarily returning to a place
> >>> where hardware irqs were on.
> >>
> >> Did you also look at the trace I posted?
> >
> > Yes, but I did not see what I am supposed to see. The only thing I
> > see is that these trace functions should never have been called from
> > rt domain in the first place.
> >
>
> There is no RT domain in the trace, only an inconsistent Linux trace
> state after return from IRQ.
What can I say, when returning from IRQ, you are necessarily
returning to a point where irqs are ON, as the comment says, and it
makes perfect sense. So your "fix" should be a nop. So, something
else is broken.
>
> > Note that the fact that this trace_irqs stuff is not working well
> > may be the fact that part of them are commented with CONFIG_IPIPE
> > (see asm_trace_hardirqs_on_cond, asm_trace_hardirqs_off)
>
> No, that doesn't solve all issues. Even with my hack (which may not
> address all cases properly) plus the reversion of that commit, there are
> still inconsistencies.
You cannot revert that commit, otherwise you will end up calling
trace_hardirqs_on/trace_hardirqs_off from the RT domain, which, I
repeat, cannot work.
--
Gilles.
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Xenomai] "inconsistent lock state" on boot-up
2014-11-10 20:06 ` Gilles Chanteperdrix
@ 2014-11-10 20:10 ` Jan Kiszka
2014-11-10 20:14 ` Gilles Chanteperdrix
0 siblings, 1 reply; 47+ messages in thread
From: Jan Kiszka @ 2014-11-10 20:10 UTC (permalink / raw)
To: Gilles Chanteperdrix; +Cc: xenomai@xenomai.org
On 2014-11-10 21:06, Gilles Chanteperdrix wrote:
> On Mon, Nov 10, 2014 at 09:02:58PM +0100, Jan Kiszka wrote:
>> On 2014-11-10 21:00, Gilles Chanteperdrix wrote:
>>> On Mon, Nov 10, 2014 at 08:55:26PM +0100, Jan Kiszka wrote:
>>>> On 2014-11-10 20:46, Gilles Chanteperdrix wrote:
>>>>> On Mon, Nov 10, 2014 at 07:29:58PM +0100, Jan Kiszka wrote:
>>>>>> On 2014-11-10 16:56, Gilles Chanteperdrix wrote:
>>>>>>> On Mon, Nov 10, 2014 at 03:52:41PM +0100, Jan Kiszka wrote:
>>>>>>>> On 2014-11-10 13:43, Gilles Chanteperdrix wrote:
>>>>>>>>> On Mon, Nov 10, 2014 at 09:08:47AM +0000, Stoidner, Christoph wrote:
>>>>>>>>>>
>>>>>>>>>> Hi Gilles,
>>>>>>>>>>
>>>>>>>>>>> Do you have the same message with exactly the same kernel
>>>>>>>>>>> configuration, only with CONFIG_XENOMAI and CONFIG_IPIPE disabled?
>>>>>>>>>>
>>>>>>>>>> When CONFIG_XENOMAI and CONFIG_IPIPE are disabled the message does not
>>>>>>>>>> appear on boot-up.
>>>>>>>>>>
>>>>>>>>>>> Do you have FCSE enabled? If yes, did you try disabling it? same
>>>>>>>>>>> with unlocked context switch.
>>>>>>>>>>
>>>>>>>>>> FCSE is already disabled at all.
>>>>>>>>>>
>>>>>>>>>> Do you have an idea how to overcome the problem?
>>>>>>>>>
>>>>>>>>> I am not sure the lockdep message really is a problem. lockdep could
>>>>>>>>> be confused by the fact that the hardware interrupts are not off
>>>>>>>>> when running the I-pipe, or because we are missing some bit in the
>>>>>>>>> I-pipe arm specific code to get it looking at the virtual mask
>>>>>>>>> instead of the hardware mask.
>>>>>>>>>
>>>>>>>>> As for the scheduling while atomic and random segmentation fault,
>>>>>>>>> you should use the I-pipe tracer, configure it with enough back
>>>>>>>>> trace points, something like 1000 or 10000, and trigger a trace
>>>>>>>>> freeze in the kernel code when the problem happens.
>>>>>>>>>
>>>>>>>>> Also, for the "scheduling while atomic", it may happen if you call
>>>>>>>>> some Linux service which reschedules from primary mode, you can try
>>>>>>>>> enabling I-pipe debugging, and in fact all Xenomai debugging, to try
>>>>>>>>> and catch such mistakes. This is especially important if you are
>>>>>>>>> running a custom skin.
>>>>>>>>
>>>>>>>> "Scheduling while atomic" may have the same reason why lockdep stumbles:
>>>>>>>> some changes of I-pipe mess with IRQ state tracing of Linux. I just
>>>>>>>> started to look into this issue again. We tried earlier but got distracted.
>>>>>>>
>>>>>>> I doubt that very much. Though I never run with lockdep, I sometimes
>>>>>>> run with CONFIG_PREEMPT, and never saw this message. From what I can
>>>>>>> see, the "scheduling while atomic" message is based on the
>>>>>>> preempt_count only and does not use irqs_disabled() (which by the
>>>>>>> way is known to work with I-pipe on ARM as well, so, if something is
>>>>>>> broken, that should be something more obscure).
>>>>>>
>>>>>> Let's see. I think I've identified one wrong path:
>>>>>>
>>>>>> diff --git a/arch/arm/kernel/entry-header.S b/arch/arm/kernel/entry-header.S
>>>>>> index d32f8bd..ab911f8 100644
>>>>>> --- a/arch/arm/kernel/entry-header.S
>>>>>> +++ b/arch/arm/kernel/entry-header.S
>>>>>> @@ -198,7 +198,10 @@
>>>>>> #ifdef CONFIG_TRACE_IRQFLAGS
>>>>>> @ The parent context IRQs must have been enabled to get here in
>>>>>> @ the first place, so there's no point checking the PSR I bit.
>>>>>> - bl trace_hardirqs_on
>>>>>> + tst \rpsr, #PSR_I_BIT
>>>>>> + bleq trace_hardirqs_off
>>>>>> + tst \rpsr, #PSR_I_BIT
>>>>>> + blne trace_hardirqs_on
>>>>>> #endif
>>>>>> .else
>>>>>> @ IRQs off again before pulling preserved data off the stack
>>>>>>
>>>>>> This is probably no fix, but with that change applied, the warning is
>>>>>> gone. Now the question is what to really test for when returning here. I
>>>>>> suppose we want the pipeline state of root here - should I
>>>>>> __ipipe_check_root_interruptible?
>>>>>
>>>>> This does not make sense, read the comment above that change: there
>>>>> is no way an interrupt can be taken, and so entering svc_entry, with
>>>>> interrupts off. Besides this is mainline code, so it would be a
>>>>> problem for mainline too. We are necessarily returning to a place
>>>>> where hardware irqs were on.
>>>>
>>>> Did you also look at the trace I posted?
>>>
>>> Yes, but I did not see what I am supposed to see. The only thing I
>>> see is that these trace functions should never have been called from
>>> rt domain in the first place.
>>>
>>
>> There is no RT domain in the trace, only an inconsistent Linux trace
>> state after return from IRQ.
>
> What can I say, when returning from IRQ, you are necessarily
> returning to a point where irqs are ON, as the comment says, and it
> makes perfect sense. So your "fix" should be a nop. So, something
> else is broken.
The test for selecting trace_hardirqs_off/on is wrong; that's why I
was asking for a better check. Also, if that path can be taken by RT
domains as well, calling trace_hardirqs_off/on was always wrong, and we
additionally need to check for the caller's domain.
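
To make the two conditions concrete, here is a minimal user-space sketch
in C rather than entry-header.S assembly. It only models the decision
discussed above: skip the Linux tracing hooks when the interrupted
context was not the root domain, and otherwise report the root domain's
virtual IRQ state instead of the hardware I bit. Every name in it
(irq_return_ctx, pick_trace_action, the enums) is made up for
illustration and is not part of the I-pipe or kernel API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* All names below are hypothetical; this models the decision only. */
enum domain { DOMAIN_ROOT, DOMAIN_RT };
enum trace_action { TRACE_NONE, TRACE_IRQS_ON, TRACE_IRQS_OFF };

struct irq_return_ctx {
	enum domain caller;   /* domain that was interrupted */
	bool root_stalled;    /* virtual IRQ mask of the root (Linux) domain */
};

/* Decide which Linux IRQ-state tracing hook, if any, the return path calls. */
static enum trace_action pick_trace_action(const struct irq_return_ctx *ctx)
{
	assert(ctx != NULL);

	/* Linux tracing hooks must not run on behalf of a non-root domain. */
	if (ctx->caller != DOMAIN_ROOT)
		return TRACE_NONE;

	/* For root, report the virtual mask rather than the hardware I bit. */
	return ctx->root_stalled ? TRACE_IRQS_OFF : TRACE_IRQS_ON;
}
```

Whether the real return path can obtain both pieces of information
cheaply at that point is exactly the open question here; the sketch
assumes it can.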
>
>>
>>> Note that the fact that this trace_irqs stuff is not working well
>>> may be the fact that part of them are commented with CONFIG_IPIPE
>>> (see asm_trace_hardirqs_on_cond, asm_trace_hardirqs_off)
>>
>> No, that doesn't solve all issues. Even with my hack (which may not
>> address all cases properly) plus the reversion of that commit, there are
>> still inconsistencies.
>
> You can not reverse that commit, otherwise you will end-up calling
> trace_hardirqs_on/trace_hardirqs_off from RT domain, which, I
> repeat, can not work.
I can help clarify whether that is sufficient to resolve the tracing
breakage: it isn't; there are more paths missing or wrongly instrumented.
Jan
* Re: [Xenomai] "inconsistent lock state" on boot-up
2014-11-10 20:10 ` Jan Kiszka
@ 2014-11-10 20:14 ` Gilles Chanteperdrix
2014-11-10 20:17 ` Jan Kiszka
0 siblings, 1 reply; 47+ messages in thread
From: Gilles Chanteperdrix @ 2014-11-10 20:14 UTC (permalink / raw)
To: Jan Kiszka; +Cc: xenomai@xenomai.org
On Mon, Nov 10, 2014 at 09:10:31PM +0100, Jan Kiszka wrote:
> On 2014-11-10 21:06, Gilles Chanteperdrix wrote:
> > On Mon, Nov 10, 2014 at 09:02:58PM +0100, Jan Kiszka wrote:
> >> On 2014-11-10 21:00, Gilles Chanteperdrix wrote:
> >>> On Mon, Nov 10, 2014 at 08:55:26PM +0100, Jan Kiszka wrote:
> >>>> On 2014-11-10 20:46, Gilles Chanteperdrix wrote:
> >>>>> On Mon, Nov 10, 2014 at 07:29:58PM +0100, Jan Kiszka wrote:
> >>>>>> On 2014-11-10 16:56, Gilles Chanteperdrix wrote:
> >>>>>>> On Mon, Nov 10, 2014 at 03:52:41PM +0100, Jan Kiszka wrote:
> >>>>>>>> On 2014-11-10 13:43, Gilles Chanteperdrix wrote:
> >>>>>>>>> On Mon, Nov 10, 2014 at 09:08:47AM +0000, Stoidner, Christoph wrote:
> >>>>>>>>>>
> >>>>>>>>>> Hi Gilles,
> >>>>>>>>>>
> >>>>>>>>>>> Do you have the same message with exactly the same kernel
> >>>>>>>>>>> configuration, only with CONFIG_XENOMAI and CONFIG_IPIPE disabled?
> >>>>>>>>>>
> >>>>>>>>>> When CONFIG_XENOMAI and CONFIG_IPIPE are disabled the message does not
> >>>>>>>>>> appear on boot-up.
> >>>>>>>>>>
> >>>>>>>>>>> Do you have FCSE enabled? If yes, did you try disabling it? same
> >>>>>>>>>>> with unlocked context switch.
> >>>>>>>>>>
> >>>>>>>>>> FCSE is already disabled at all.
> >>>>>>>>>>
> >>>>>>>>>> Do you have an idea how to overcome the problem?
> >>>>>>>>>
> >>>>>>>>> I am not sure the lockdep message really is a problem. lockdep could
> >>>>>>>>> be confused by the fact that the hardware interrupts are not off
> >>>>>>>>> when running the I-pipe, or because we are missing some bit in the
> >>>>>>>>> I-pipe arm specific code to get it looking at the virtual mask
> >>>>>>>>> instead of the hardware mask.
> >>>>>>>>>
> >>>>>>>>> As for the scheduling while atomic and random segmentation fault,
> >>>>>>>>> you should use the I-pipe tracer, configure it with enough back
> >>>>>>>>> trace points, something like 1000 or 10000, and trigger a trace
> >>>>>>>>> freeze in the kernel code when the problem happens.
> >>>>>>>>>
> >>>>>>>>> Also, for the "scheduling while atomic", it may happen if you call
> >>>>>>>>> some Linux service which reschedules from primary mode, you can try
> >>>>>>>>> enabling I-pipe debugging, and in fact all Xenomai debugging, to try
> >>>>>>>>> and catch such mistakes. This is especially important if you are
> >>>>>>>>> running a custom skin.
> >>>>>>>>
> >>>>>>>> "Scheduling while atomic" may have the same reason why lockdep stumbles:
> >>>>>>>> some changes of I-pipe mess with IRQ state tracing of Linux. I just
> >>>>>>>> started to look into this issue again. We tried earlier but got distracted.
> >>>>>>>
> >>>>>>> I doubt that very much. Though I never run with lockdep, I sometimes
> >>>>>>> run with CONFIG_PREEMPT, and never saw this message. From what I can
> >>>>>>> see, the "scheduling while atomic" message is based on the
> >>>>>>> preempt_count only and does not use irqs_disabled() (which by the
> >>>>>>> way is known to work with I-pipe on ARM as well, so, if something is
> >>>>>>> broken, that should be something more obscure).
> >>>>>>
> >>>>>> Let's see. I think I've identified one wrong path:
> >>>>>>
> >>>>>> diff --git a/arch/arm/kernel/entry-header.S b/arch/arm/kernel/entry-header.S
> >>>>>> index d32f8bd..ab911f8 100644
> >>>>>> --- a/arch/arm/kernel/entry-header.S
> >>>>>> +++ b/arch/arm/kernel/entry-header.S
> >>>>>> @@ -198,7 +198,10 @@
> >>>>>> #ifdef CONFIG_TRACE_IRQFLAGS
> >>>>>> @ The parent context IRQs must have been enabled to get here in
> >>>>>> @ the first place, so there's no point checking the PSR I bit.
> >>>>>> - bl trace_hardirqs_on
> >>>>>> + tst \rpsr, #PSR_I_BIT
> >>>>>> + bleq trace_hardirqs_off
> >>>>>> + tst \rpsr, #PSR_I_BIT
> >>>>>> + blne trace_hardirqs_on
> >>>>>> #endif
> >>>>>> .else
> >>>>>> @ IRQs off again before pulling preserved data off the stack
> >>>>>>
> >>>>>> This is probably no fix, but with that change applied, the warning is
> >>>>>> gone. Now the question is what to really test for when returning here. I
> >>>>>> suppose we want the pipeline state of root here - should I
> >>>>>> __ipipe_check_root_interruptible?
> >>>>>
> >>>>> This does not make sense, read the comment above that change: there
> >>>>> is no way an interrupt can be taken, and so entering svc_entry, with
> >>>>> interrupts off. Besides this is mainline code, so it would be a
> >>>>> problem for mainline too. We are necessarily returning to a place
> >>>>> where hardware irqs were on.
> >>>>
> >>>> Did you also look at the trace I posted?
> >>>
> >>> Yes, but I did not see what I am supposed to see. The only thing I
> >>> see is that these trace functions should never have been called from
> >>> rt domain in the first place.
> >>>
> >>
> >> There is no RT domain in the trace, only an inconsistent Linux trace
> >> state after return from IRQ.
> >
> > What can I say, when returning from IRQ, you are necessarily
> > returning to a point where irqs are ON, as the comment says, and it
> > makes perfect sense. So your "fix" should be a nop. So, something
> > else is broken.
>
> The test for selecting trace_hardirqs_off/on is wrong, that's why I
> was asking for a better check. Also, if that path can be taken by RT
> domains as well, calling trace_hardirqs_off/on was always wrong, and we
> additionally need to check for the caller's domain.
>
> >
> >>
> >>> Note that the fact that this trace_irqs stuff is not working well
> >>> may be the fact that part of them are commented with CONFIG_IPIPE
> >>> (see asm_trace_hardirqs_on_cond, asm_trace_hardirqs_off)
> >>
> >> No, that doesn't solve all issues. Even with my hack (which may not
> >> address all cases properly) plus the reversion of that commit, there are
> >> still inconsistencies.
> >
> > You can not reverse that commit, otherwise you will end-up calling
> > trace_hardirqs_on/trace_hardirqs_off from RT domain, which, I
> > repeat, can not work.
>
> I can help to understand if that is sufficient to resolve the tracing
> breakage - it isn't, there are more paths missing or wrongly instrumented.
My view is that CONFIG_TRACE_IRQFLAGS should depend on !IPIPE, since
the I-pipe tracer provides the same functionality and is not
broken.
--
Gilles.
* Re: [Xenomai] "inconsistent lock state" on boot-up
2014-11-10 20:14 ` Gilles Chanteperdrix
@ 2014-11-10 20:17 ` Jan Kiszka
2014-11-10 20:18 ` Gilles Chanteperdrix
2014-11-10 20:23 ` Gilles Chanteperdrix
0 siblings, 2 replies; 47+ messages in thread
From: Jan Kiszka @ 2014-11-10 20:17 UTC (permalink / raw)
To: Gilles Chanteperdrix; +Cc: xenomai@xenomai.org
On 2014-11-10 21:14, Gilles Chanteperdrix wrote:
> On Mon, Nov 10, 2014 at 09:10:31PM +0100, Jan Kiszka wrote:
>> On 2014-11-10 21:06, Gilles Chanteperdrix wrote:
>>> On Mon, Nov 10, 2014 at 09:02:58PM +0100, Jan Kiszka wrote:
>>>> On 2014-11-10 21:00, Gilles Chanteperdrix wrote:
>>>>> On Mon, Nov 10, 2014 at 08:55:26PM +0100, Jan Kiszka wrote:
>>>>>> On 2014-11-10 20:46, Gilles Chanteperdrix wrote:
>>>>>>> On Mon, Nov 10, 2014 at 07:29:58PM +0100, Jan Kiszka wrote:
>>>>>>>> On 2014-11-10 16:56, Gilles Chanteperdrix wrote:
>>>>>>>>> On Mon, Nov 10, 2014 at 03:52:41PM +0100, Jan Kiszka wrote:
>>>>>>>>>> On 2014-11-10 13:43, Gilles Chanteperdrix wrote:
>>>>>>>>>>> On Mon, Nov 10, 2014 at 09:08:47AM +0000, Stoidner, Christoph wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Hi Gilles,
>>>>>>>>>>>>
>>>>>>>>>>>>> Do you have the same message with exactly the same kernel
>>>>>>>>>>>>> configuration, only with CONFIG_XENOMAI and CONFIG_IPIPE disabled?
>>>>>>>>>>>>
>>>>>>>>>>>> When CONFIG_XENOMAI and CONFIG_IPIPE are disabled the message does not
>>>>>>>>>>>> appear on boot-up.
>>>>>>>>>>>>
>>>>>>>>>>>>> Do you have FCSE enabled? If yes, did you try disabling it? same
>>>>>>>>>>>>> with unlocked context switch.
>>>>>>>>>>>>
>>>>>>>>>>>> FCSE is already disabled at all.
>>>>>>>>>>>>
>>>>>>>>>>>> Do you have an idea how to overcome the problem?
>>>>>>>>>>>
>>>>>>>>>>> I am not sure the lockdep message really is a problem. lockdep could
>>>>>>>>>>> be confused by the fact that the hardware interrupts are not off
>>>>>>>>>>> when running the I-pipe, or because we are missing some bit in the
>>>>>>>>>>> I-pipe arm specific code to get it looking at the virtual mask
>>>>>>>>>>> instead of the hardware mask.
>>>>>>>>>>>
>>>>>>>>>>> As for the scheduling while atomic and random segmentation fault,
>>>>>>>>>>> you should use the I-pipe tracer, configure it with enough back
>>>>>>>>>>> trace points, something like 1000 or 10000, and trigger a trace
>>>>>>>>>>> freeze in the kernel code when the problem happens.
>>>>>>>>>>>
>>>>>>>>>>> Also, for the "scheduling while atomic", it may happen if you call
>>>>>>>>>>> some Linux service which reschedules from primary mode, you can try
>>>>>>>>>>> enabling I-pipe debugging, and in fact all Xenomai debugging, to try
>>>>>>>>>>> and catch such mistakes. This is especially important if you are
>>>>>>>>>>> running a custom skin.
>>>>>>>>>>
>>>>>>>>>> "Scheduling while atomic" may have the same reason why lockdep stumbles:
>>>>>>>>>> some changes of I-pipe mess with IRQ state tracing of Linux. I just
>>>>>>>>>> started to look into this issue again. We tried earlier but got distracted.
>>>>>>>>>
>>>>>>>>> I doubt that very much. Though I never run with lockdep, I sometimes
>>>>>>>>> run with CONFIG_PREEMPT, and never saw this message. From what I can
>>>>>>>>> see, the "scheduling while atomic" message is based on the
>>>>>>>>> preempt_count only and does not use irqs_disabled() (which by the
>>>>>>>>> way is known to work with I-pipe on ARM as well, so, if something is
>>>>>>>>> broken, that should be something more obscure).
>>>>>>>>
>>>>>>>> Let's see. I think I've identified one wrong path:
>>>>>>>>
>>>>>>>> diff --git a/arch/arm/kernel/entry-header.S b/arch/arm/kernel/entry-header.S
>>>>>>>> index d32f8bd..ab911f8 100644
>>>>>>>> --- a/arch/arm/kernel/entry-header.S
>>>>>>>> +++ b/arch/arm/kernel/entry-header.S
>>>>>>>> @@ -198,7 +198,10 @@
>>>>>>>> #ifdef CONFIG_TRACE_IRQFLAGS
>>>>>>>> @ The parent context IRQs must have been enabled to get here in
>>>>>>>> @ the first place, so there's no point checking the PSR I bit.
>>>>>>>> - bl trace_hardirqs_on
>>>>>>>> + tst \rpsr, #PSR_I_BIT
>>>>>>>> + bleq trace_hardirqs_off
>>>>>>>> + tst \rpsr, #PSR_I_BIT
>>>>>>>> + blne trace_hardirqs_on
>>>>>>>> #endif
>>>>>>>> .else
>>>>>>>> @ IRQs off again before pulling preserved data off the stack
>>>>>>>>
>>>>>>>> This is probably no fix, but with that change applied, the warning is
>>>>>>>> gone. Now the question is what to really test for when returning here. I
>>>>>>>> suppose we want the pipeline state of root here - should I
>>>>>>>> __ipipe_check_root_interruptible?
>>>>>>>
>>>>>>> This does not make sense, read the comment above that change: there
>>>>>>> is no way an interrupt can be taken, and so entering svc_entry, with
>>>>>>> interrupts off. Besides this is mainline code, so it would be a
>>>>>>> problem for mainline too. We are necessarily returning to a place
>>>>>>> where hardware irqs were on.
>>>>>>
>>>>>> Did you also look at the trace I posted?
>>>>>
>>>>> Yes, but I did not see what I am supposed to see. The only thing I
>>>>> see is that these trace functions should never have been called from
>>>>> rt domain in the first place.
>>>>>
>>>>
>>>> There is no RT domain in the trace, only an inconsistent Linux trace
>>>> state after return from IRQ.
>>>
>>> What can I say, when returning from IRQ, you are necessarily
>>> returning to a point where irqs are ON, as the comment says, and it
>>> makes perfect sense. So your "fix" should be a nop. So, something
>>> else is broken.
>>
>> The test for selecting trace_hardirqs_off/on is wrong, that's why I
>> was asking for a better check. Also, if that path can be taken by RT
>> domains as well, calling trace_hardirqs_off/on was always wrong, and we
>> additionally need to check for the caller's domain.
>>
>>>
>>>>
>>>>> Note that the fact that this trace_irqs stuff is not working well
>>>>> may be the fact that part of them are commented with CONFIG_IPIPE
>>>>> (see asm_trace_hardirqs_on_cond, asm_trace_hardirqs_off)
>>>>
>>>> No, that doesn't solve all issues. Even with my hack (which may not
>>>> address all cases properly) plus the reversion of that commit, there are
>>>> still inconsistencies.
>>>
>>> You can not reverse that commit, otherwise you will end-up calling
>>> trace_hardirqs_on/trace_hardirqs_off from RT domain, which, I
>>> repeat, can not work.
>>
>> I can help to understand if that is sufficient to resolve the tracing
>> breakage - it isn't, there are more paths missing or wrongly instrumented.
>
> My idea of all this is that CONFIG_TRACE_IRQFLAGS should depend on
> !IPIPE, since the I-pipe tracer provides the same functionality. And
> is not broken.
No, the I-pipe tracer does not provide a Linux lock dependency checker,
nor does it support might_sleep and such. If you have Linux drivers
that depend on Xenomai directly or indirectly, you can no longer
validate them. That's why we support this on x86.
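
As an illustration of why such checks need correct IRQ state reporting,
here is a simplified user-space model in C of the two conditions that
came up in this thread. It assumes that a might_sleep-style check
complains when preemption is disabled or when interrupts are reported as
disabled, while the "scheduling while atomic" check looks at the preempt
count only. The struct and function names are invented for this sketch,
and details of the real kernel checks (system state, oops handling,
preempt offsets) are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified, hypothetical model of the state the checks look at. */
struct ctx_model {
	int preempt_count;    /* non-zero means preemption is disabled */
	bool irqs_disabled;   /* what irqs_disabled() would report */
};

/* Model of a might_sleep-style check: atomic context OR IRQs off. */
static bool might_sleep_would_warn(const struct ctx_model *c)
{
	assert(c != NULL);
	return c->preempt_count != 0 || c->irqs_disabled;
}

/* Model of the "scheduling while atomic" check: preempt count only. */
static bool sched_while_atomic_would_warn(const struct ctx_model *c)
{
	assert(c != NULL);
	return c->preempt_count != 0;
}
```

In this model, a context with preempt_count == 0 whose IRQ state is
wrongly reported as disabled trips the might_sleep-style check but not
the "scheduling while atomic" one.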
Jan
* Re: [Xenomai] "inconsistent lock state" on boot-up
2014-11-10 20:17 ` Jan Kiszka
@ 2014-11-10 20:18 ` Gilles Chanteperdrix
2014-11-10 20:22 ` Jan Kiszka
2014-11-10 20:23 ` Gilles Chanteperdrix
1 sibling, 1 reply; 47+ messages in thread
From: Gilles Chanteperdrix @ 2014-11-10 20:18 UTC (permalink / raw)
To: Jan Kiszka; +Cc: xenomai@xenomai.org
On Mon, Nov 10, 2014 at 09:17:18PM +0100, Jan Kiszka wrote:
> That's why we support this on x86.
But at what cost?
--
Gilles.
* Re: [Xenomai] "inconsistent lock state" on boot-up
2014-11-10 20:18 ` Gilles Chanteperdrix
@ 2014-11-10 20:22 ` Jan Kiszka
0 siblings, 0 replies; 47+ messages in thread
From: Jan Kiszka @ 2014-11-10 20:22 UTC (permalink / raw)
To: Gilles Chanteperdrix; +Cc: xenomai@xenomai.org
On 2014-11-10 21:18, Gilles Chanteperdrix wrote:
> On Mon, Nov 10, 2014 at 09:17:18PM +0100, Jan Kiszka wrote:
>> That's why we support this on x86.
>
> But at what cost?
The last few times I touched that arch: none. It's apparently mature and
doesn't break on updates. Not sure if Philippe stumbled over something
in the past years, though.
Jan
* Re: [Xenomai] "inconsistent lock state" on boot-up
2014-11-10 20:17 ` Jan Kiszka
2014-11-10 20:18 ` Gilles Chanteperdrix
@ 2014-11-10 20:23 ` Gilles Chanteperdrix
2014-11-10 20:28 ` Jan Kiszka
1 sibling, 1 reply; 47+ messages in thread
From: Gilles Chanteperdrix @ 2014-11-10 20:23 UTC (permalink / raw)
To: Jan Kiszka; +Cc: xenomai@xenomai.org
On Mon, Nov 10, 2014 at 09:17:18PM +0100, Jan Kiszka wrote:
> On 2014-11-10 21:14, Gilles Chanteperdrix wrote:
> > On Mon, Nov 10, 2014 at 09:10:31PM +0100, Jan Kiszka wrote:
> >> On 2014-11-10 21:06, Gilles Chanteperdrix wrote:
> >>> On Mon, Nov 10, 2014 at 09:02:58PM +0100, Jan Kiszka wrote:
> >>>> On 2014-11-10 21:00, Gilles Chanteperdrix wrote:
> >>>>> On Mon, Nov 10, 2014 at 08:55:26PM +0100, Jan Kiszka wrote:
> >>>>>> On 2014-11-10 20:46, Gilles Chanteperdrix wrote:
> >>>>>>> On Mon, Nov 10, 2014 at 07:29:58PM +0100, Jan Kiszka wrote:
> >>>>>>>> On 2014-11-10 16:56, Gilles Chanteperdrix wrote:
> >>>>>>>>> On Mon, Nov 10, 2014 at 03:52:41PM +0100, Jan Kiszka wrote:
> >>>>>>>>>> On 2014-11-10 13:43, Gilles Chanteperdrix wrote:
> >>>>>>>>>>> On Mon, Nov 10, 2014 at 09:08:47AM +0000, Stoidner, Christoph wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>> Hi Gilles,
> >>>>>>>>>>>>
> >>>>>>>>>>>>> Do you have the same message with exactly the same kernel
> >>>>>>>>>>>>> configuration, only with CONFIG_XENOMAI and CONFIG_IPIPE disabled?
> >>>>>>>>>>>>
> >>>>>>>>>>>> When CONFIG_XENOMAI and CONFIG_IPIPE are disabled the message does not
> >>>>>>>>>>>> appear on boot-up.
> >>>>>>>>>>>>
> >>>>>>>>>>>>> Do you have FCSE enabled? If yes, did you try disabling it? same
> >>>>>>>>>>>>> with unlocked context switch.
> >>>>>>>>>>>>
> >>>>>>>>>>>> FCSE is already disabled at all.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Do you have an idea how to overcome the problem?
> >>>>>>>>>>>
> >>>>>>>>>>> I am not sure the lockdep message really is a problem. lockdep could
> >>>>>>>>>>> be confused by the fact that the hardware interrupts are not off
> >>>>>>>>>>> when running the I-pipe, or because we are missing some bit in the
> >>>>>>>>>>> I-pipe arm specific code to get it looking at the virtual mask
> >>>>>>>>>>> instead of the hardware mask.
> >>>>>>>>>>>
> >>>>>>>>>>> As for the scheduling while atomic and random segmentation fault,
> >>>>>>>>>>> you should use the I-pipe tracer, configure it with enough back
> >>>>>>>>>>> trace points, something like 1000 or 10000, and trigger a trace
> >>>>>>>>>>> freeze in the kernel code when the problem happens.
> >>>>>>>>>>>
> >>>>>>>>>>> Also, for the "scheduling while atomic", it may happen if you call
> >>>>>>>>>>> some Linux service which reschedules from primary mode, you can try
> >>>>>>>>>>> enabling I-pipe debugging, and in fact all Xenomai debugging, to try
> >>>>>>>>>>> and catch such mistakes. This is especially important if you are
> >>>>>>>>>>> running a custom skin.
> >>>>>>>>>>
> >>>>>>>>>> "Scheduling while atomic" may have the same reason why lockdep stumbles:
> >>>>>>>>>> some changes of I-pipe mess with IRQ state tracing of Linux. I just
> >>>>>>>>>> started to look into this issue again. We tried earlier but got distracted.
> >>>>>>>>>
> >>>>>>>>> I doubt that very much. Though I never run with lockdep, I sometimes
> >>>>>>>>> run with CONFIG_PREEMPT, and never saw this message. From what I can
> >>>>>>>>> see, the "scheduling while atomic" message is based on the
> >>>>>>>>> preempt_count only and does not use irqs_disabled() (which by the
> >>>>>>>>> way is known to work with I-pipe on ARM as well, so, if something is
> >>>>>>>>> broken, that should be something more obscure).
> >>>>>>>>
> >>>>>>>> Let's see. I think I've identified one wrong path:
> >>>>>>>>
> >>>>>>>> diff --git a/arch/arm/kernel/entry-header.S b/arch/arm/kernel/entry-header.S
> >>>>>>>> index d32f8bd..ab911f8 100644
> >>>>>>>> --- a/arch/arm/kernel/entry-header.S
> >>>>>>>> +++ b/arch/arm/kernel/entry-header.S
> >>>>>>>> @@ -198,7 +198,10 @@
> >>>>>>>> #ifdef CONFIG_TRACE_IRQFLAGS
> >>>>>>>> @ The parent context IRQs must have been enabled to get here in
> >>>>>>>> @ the first place, so there's no point checking the PSR I bit.
> >>>>>>>> - bl trace_hardirqs_on
> >>>>>>>> + tst \rpsr, #PSR_I_BIT
> >>>>>>>> + bleq trace_hardirqs_off
> >>>>>>>> + tst \rpsr, #PSR_I_BIT
> >>>>>>>> + blne trace_hardirqs_on
> >>>>>>>> #endif
> >>>>>>>> .else
> >>>>>>>> @ IRQs off again before pulling preserved data off the stack
> >>>>>>>>
> >>>>>>>> This is probably no fix, but with that change applied, the warning is
> >>>>>>>> gone. Now the question is what to really test for when returning here. I
> >>>>>>>> suppose we want the pipeline state of root here - should I
> >>>>>>>> __ipipe_check_root_interruptible?
> >>>>>>>
> >>>>>>> This does not make sense, read the comment above that change: there
> >>>>>>> is no way an interrupt can be taken, and so entering svc_entry, with
> >>>>>>> interrupts off. Besides this is mainline code, so it would be a
> >>>>>>> problem for mainline too. We are necessarily returning to a place
> >>>>>>> where hardware irqs were on.
> >>>>>>
> >>>>>> Did you also look at the trace I posted?
> >>>>>
> >>>>> Yes, but I did not see what I am supposed to see. The only thing I
> >>>>> see is that these trace functions should never have been called from
> >>>>> rt domain in the first place.
> >>>>>
> >>>>
> >>>> There is no RT domain in the trace, only an inconsistent Linux trace
> >>>> state after return from IRQ.
> >>>
> >>> What can I say, when returning from IRQ, you are necessarily
> >>> returning to a point where irqs are ON, as the comment says, and it
> >>> makes perfect sense. So your "fix" should be a nop. So, something
> >>> else is broken.
> >>
> >> The test for selecting trace_hardirqs_off/on is wrong, that's why I
> >> was asking for a better check. Also, if that path can be taken by RT
> >> domains as well, calling trace_hardirqs_off/on was always wrong, and we
> >> additionally need to check for the caller's domain.
> >>
> >>>
> >>>>
> >>>>> Note that the fact that this trace_irqs stuff is not working well
> >>>>> may be the fact that part of them are commented with CONFIG_IPIPE
> >>>>> (see asm_trace_hardirqs_on_cond, asm_trace_hardirqs_off)
> >>>>
> >>>> No, that doesn't solve all issues. Even with my hack (which may not
> >>>> address all cases properly) plus the reversion of that commit, there are
> >>>> still inconsistencies.
> >>>
> >>> You can not reverse that commit, otherwise you will end-up calling
> >>> trace_hardirqs_on/trace_hardirqs_off from RT domain, which, I
> >>> repeat, can not work.
> >>
> >> I can help to understand if that is sufficient to resolve the tracing
> >> breakage - it isn't, there are more paths missing or wrongly instrumented.
> >
> > My idea of all this is that CONFIG_TRACE_IRQFLAGS should depend on
> > !IPIPE, since the I-pipe tracer provides the same functionality. And
> > is not broken.
>
> No, the I-pipe trace does not provide a Linux lock dependency checker,
> nor does it support might_sleep and such. If you have Linux drivers
> which depend on Xenomai directly or indirectly, you cannot validate them
> anymore. That's why we support this on x86.
Since the I-pipe is already keeping track of irq state with
CONFIG_IPIPE_TRACE_IRQSOFF, can we not use that information instead
of trying to use this trace_hardirqs stuff, which looks
irremediably broken to me?
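
A rough way to picture this idea is the following user-space C sketch.
It assumes that every change of the root domain's virtual IRQ mask goes
through one place, and that this place is also where a lock checker's
view of the IRQ state gets updated. The names (root_state,
root_set_stalled, checker_irqs_on) are hypothetical and do not
correspond to existing I-pipe or lockdep symbols; whether the real
tracer hooks expose enough information for this is not something the
sketch can answer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: the root domain's virtual mask is the only source. */
struct root_state {
	bool stalled;           /* virtual IRQ mask of the root domain */
	bool checker_irqs_on;   /* IRQ state as seen by the lock checker */
	unsigned int updates;   /* number of state changes forwarded */
};

static void root_state_init(struct root_state *s)
{
	assert(s != NULL);
	s->stalled = false;
	s->checker_irqs_on = true;
	s->updates = 0;
}

/* Single place where the virtual mask changes and the checker is told. */
static void root_set_stalled(struct root_state *s, bool stalled)
{
	assert(s != NULL);

	if (s->stalled == stalled)
		return;         /* no transition, nothing to report */

	s->stalled = stalled;
	s->checker_irqs_on = !stalled;
	s->updates++;
}
```

With a single update point like this, the checker's view cannot drift
away from the virtual mask, which is the property the separate
trace_hardirqs call sites fail to guarantee here.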
--
Gilles.
* Re: [Xenomai] "inconsistent lock state" on boot-up
2014-11-10 20:23 ` Gilles Chanteperdrix
@ 2014-11-10 20:28 ` Jan Kiszka
2014-11-10 20:37 ` Gilles Chanteperdrix
0 siblings, 1 reply; 47+ messages in thread
From: Jan Kiszka @ 2014-11-10 20:28 UTC (permalink / raw)
To: Gilles Chanteperdrix; +Cc: xenomai@xenomai.org
On 2014-11-10 21:23, Gilles Chanteperdrix wrote:
> On Mon, Nov 10, 2014 at 09:17:18PM +0100, Jan Kiszka wrote:
>> On 2014-11-10 21:14, Gilles Chanteperdrix wrote:
>>> On Mon, Nov 10, 2014 at 09:10:31PM +0100, Jan Kiszka wrote:
>>>> On 2014-11-10 21:06, Gilles Chanteperdrix wrote:
>>>>> On Mon, Nov 10, 2014 at 09:02:58PM +0100, Jan Kiszka wrote:
>>>>>> On 2014-11-10 21:00, Gilles Chanteperdrix wrote:
>>>>>>> On Mon, Nov 10, 2014 at 08:55:26PM +0100, Jan Kiszka wrote:
>>>>>>>> On 2014-11-10 20:46, Gilles Chanteperdrix wrote:
>>>>>>>>> On Mon, Nov 10, 2014 at 07:29:58PM +0100, Jan Kiszka wrote:
>>>>>>>>>> On 2014-11-10 16:56, Gilles Chanteperdrix wrote:
>>>>>>>>>>> On Mon, Nov 10, 2014 at 03:52:41PM +0100, Jan Kiszka wrote:
>>>>>>>>>>>> On 2014-11-10 13:43, Gilles Chanteperdrix wrote:
>>>>>>>>>>>>> On Mon, Nov 10, 2014 at 09:08:47AM +0000, Stoidner, Christoph wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi Gilles,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Do you have the same message with exactly the same kernel
>>>>>>>>>>>>>>> configuration, only with CONFIG_XENOMAI and CONFIG_IPIPE disabled?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> When CONFIG_XENOMAI and CONFIG_IPIPE are disabled the message does not
>>>>>>>>>>>>>> appear on boot-up.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Do you have FCSE enabled? If yes, did you try disabling it? same
>>>>>>>>>>>>>>> with unlocked context switch.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> FCSE is already disabled at all.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Do you have an idea how to overcome the problem?
>>>>>>>>>>>>>
>>>>>>>>>>>>> I am not sure the lockdep message really is a problem. lockdep could
>>>>>>>>>>>>> be confused by the fact that the hardware interrupts are not off
>>>>>>>>>>>>> when running the I-pipe, or because we are missing some bit in the
>>>>>>>>>>>>> I-pipe arm specific code to get it looking at the virtual mask
>>>>>>>>>>>>> instead of the hardware mask.
>>>>>>>>>>>>>
>>>>>>>>>>>>> As for the scheduling while atomic and random segmentation fault,
>>>>>>>>>>>>> you should use the I-pipe tracer, configure it with enough back
>>>>>>>>>>>>> trace points, something like 1000 or 10000, and trigger a trace
>>>>>>>>>>>>> freeze in the kernel code when the problem happens.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Also, for the "scheduling while atomic", it may happen if you call
>>>>>>>>>>>>> some Linux service which reschedules from primary mode, you can try
>>>>>>>>>>>>> enabling I-pipe debugging, and in fact all Xenomai debugging, to try
>>>>>>>>>>>>> and catch such mistakes. This is especially important if you are
>>>>>>>>>>>>> running a custom skin.
>>>>>>>>>>>>
>>>>>>>>>>>> "Scheduling while atomic" may have the same reason why lockdep stumbles:
>>>>>>>>>>>> some changes of I-pipe mess with IRQ state tracing of Linux. I just
>>>>>>>>>>>> started to look into this issue again. We tried earlier but got distracted.
>>>>>>>>>>>
>>>>>>>>>>> I doubt that very much. Though I never run with lockdep, I sometimes
>>>>>>>>>>> run with CONFIG_PREEMPT, and never saw this message. From what I can
>>>>>>>>>>> see, the "scheduling while atomic" message is based on the
>>>>>>>>>>> preempt_count only and does not use irqs_disabled() (which by the
>>>>>>>>>>> way is known to work with I-pipe on ARM as well, so, if something is
>>>>>>>>>>> broken, that should be something more obscure).
>>>>>>>>>>
>>>>>>>>>> Let's see. I think I've identified one wrong path:
>>>>>>>>>>
>>>>>>>>>> diff --git a/arch/arm/kernel/entry-header.S b/arch/arm/kernel/entry-header.S
>>>>>>>>>> index d32f8bd..ab911f8 100644
>>>>>>>>>> --- a/arch/arm/kernel/entry-header.S
>>>>>>>>>> +++ b/arch/arm/kernel/entry-header.S
>>>>>>>>>> @@ -198,7 +198,10 @@
>>>>>>>>>> #ifdef CONFIG_TRACE_IRQFLAGS
>>>>>>>>>> @ The parent context IRQs must have been enabled to get here in
>>>>>>>>>> @ the first place, so there's no point checking the PSR I bit.
>>>>>>>>>> - bl trace_hardirqs_on
>>>>>>>>>> + tst \rpsr, #PSR_I_BIT
>>>>>>>>>> + bleq trace_hardirqs_off
>>>>>>>>>> + tst \rpsr, #PSR_I_BIT
>>>>>>>>>> + blne trace_hardirqs_on
>>>>>>>>>> #endif
>>>>>>>>>> .else
>>>>>>>>>> @ IRQs off again before pulling preserved data off the stack
>>>>>>>>>>
>>>>>>>>>> This is probably no fix, but with that change applied, the warning is
>>>>>>>>>> gone. Now the question is what to really test for when returning here. I
>>>>>>>>>> suppose we want the pipeline state of root here - should I
>>>>>>>>>> __ipipe_check_root_interruptible?
>>>>>>>>>
>>>>>>>>> This does not make sense, read the comment above that change: there
>>>>>>>>> is no way an interrupt can be taken, and so entering svc_entry, with
>>>>>>>>> interrupts off. Besides this is mainline code, so it would be a
>>>>>>>>> problem for mainline too. We are necessarily returning to a place
>>>>>>>>> where hardware irqs were on.
>>>>>>>>
>>>>>>>> Did you also look at the trace I posted?
>>>>>>>
>>>>>>> Yes, but I did not see what I am supposed to see. The only thing I
>>>>>>> see is that these trace functions should never have been called from
>>>>>>> rt domain in the first place.
>>>>>>>
>>>>>>
>>>>>> There is no RT domain in the trace, only an inconsistent Linux trace
>>>>>> state after return from IRQ.
>>>>>
>>>>> What can I say, when returning from IRQ, you are necessarily
>>>>> returning to a point where irqs are ON, as the comment says, and it
>>>>> makes perfect sense. So your "fix" should be a nop. So, something
>>>>> else is broken.
>>>>
>>>> The test for selecting trace_hardirqs_off/on is wrong, that's why I
>>>> was asking for a better check. Also, if that path can be taken by RT
>>>> domains as well, calling trace_hardirqs_off/on was always wrong, and we
>>>> additionally need to check for the caller's domain.
>>>>
>>>>>
>>>>>>
>>>>>>> Note that the fact that this trace_irqs stuff is not working well
>>>>>>> may be the fact that part of them are commented with CONFIG_IPIPE
>>>>>>> (see asm_trace_hardirqs_on_cond, asm_trace_hardirqs_off)
>>>>>>
>>>>>> No, that doesn't solve all issues. Even with my hack (which may not
>>>>>> address all cases properly) plus the reversion of that commit, there are
>>>>>> still inconsistencies.
>>>>>
>>>>> You can not reverse that commit, otherwise you will end-up calling
>>>>> trace_hardirqs_on/trace_hardirqs_off from RT domain, which, I
>>>>> repeat, can not work.
>>>>
>>>> I can help to understand if that is sufficient to resolve the tracing
>>>> breakage - it isn't, there are more paths missing or wrongly instrumented.
>>>
>>> My idea of all this is that CONFIG_TRACE_IRQFLAGS should depend on
>>> !IPIPE, since the I-pipe tracer provides the same functionality. And
>>> is not broken.
>>
>> No, the I-pipe trace does not provide a Linux lock dependency checker,
>> nor does it support might_sleep and such. If you have Linux drivers
>> which depend on Xenomai directly or indirectly, you cannot validate them
>> anymore. That's why we support this on x86.
>
> Since the I-pipe is already keeping track of irq state with
> CONFIG_IPIPE_TRACE_IRQSOFF, can we not use that information instead
> of trying and using this trace_hardirqs stuff which looks
> irremediably broken to me?
The former reflects the hw state, the latter traces the Linux state -
from Linux POV.
This is fixable. We just need to call the tracing functions where Linux
would call them, or where we replaced some Linux call with an I-pipe
specific path, and avoid calling them when the domain != root. Identifying
those spots is tricky.
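As a rough illustration of the rule proposed above (update Linux's IRQ trace state only when running over the root domain), here is a minimal user-space model in C. All names (model_domain, linux_irq_trace, trace_irq_state) are hypothetical and do not come from the I-pipe patch; real kernel code would call trace_hardirqs_on/trace_hardirqs_off rather than flip a flag.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for illustration; not I-pipe or kernel APIs. */
enum model_domain { MODEL_ROOT, MODEL_HEAD };

/* Linux's software view of the IRQ state, as lockdep would consult it. */
struct linux_irq_trace {
    bool hardirqs_enabled;   /* what Linux believes */
    unsigned int updates;    /* how often the trace state was touched */
};

/*
 * Record an IRQ state change, but only when running over the root
 * (Linux) domain. Changes requested while another domain is current
 * are ignored, so Linux's view is never altered from a non-root context.
 */
void trace_irq_state(enum model_domain cur, bool enable,
                     struct linux_irq_trace *t)
{
    if (cur != MODEL_ROOT)
        return;              /* domain != root: leave Linux's view alone */
    t->hardirqs_enabled = enable;
    t->updates++;
}
```

In this model, a call made with MODEL_HEAD leaves the structure untouched, while the same call made with MODEL_ROOT updates it. The hard part named above, finding every spot where such a call belongs, is not addressed by the sketch.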
Jan
* Re: [Xenomai] "inconsistent lock state" on boot-up
2014-11-10 20:28 ` Jan Kiszka
@ 2014-11-10 20:37 ` Gilles Chanteperdrix
2014-11-10 20:42 ` Jan Kiszka
0 siblings, 1 reply; 47+ messages in thread
From: Gilles Chanteperdrix @ 2014-11-10 20:37 UTC (permalink / raw)
To: Jan Kiszka; +Cc: xenomai@xenomai.org
On Mon, Nov 10, 2014 at 09:28:50PM +0100, Jan Kiszka wrote:
> On 2014-11-10 21:23, Gilles Chanteperdrix wrote:
> > On Mon, Nov 10, 2014 at 09:17:18PM +0100, Jan Kiszka wrote:
> >> On 2014-11-10 21:14, Gilles Chanteperdrix wrote:
> >>> On Mon, Nov 10, 2014 at 09:10:31PM +0100, Jan Kiszka wrote:
> >>>> On 2014-11-10 21:06, Gilles Chanteperdrix wrote:
> >>>>> On Mon, Nov 10, 2014 at 09:02:58PM +0100, Jan Kiszka wrote:
> >>>>>> On 2014-11-10 21:00, Gilles Chanteperdrix wrote:
> >>>>>>> On Mon, Nov 10, 2014 at 08:55:26PM +0100, Jan Kiszka wrote:
> >>>>>>>> On 2014-11-10 20:46, Gilles Chanteperdrix wrote:
> >>>>>>>>> On Mon, Nov 10, 2014 at 07:29:58PM +0100, Jan Kiszka wrote:
> >>>>>>>>>> On 2014-11-10 16:56, Gilles Chanteperdrix wrote:
> >>>>>>>>>>> On Mon, Nov 10, 2014 at 03:52:41PM +0100, Jan Kiszka wrote:
> >>>>>>>>>>>> On 2014-11-10 13:43, Gilles Chanteperdrix wrote:
> >>>>>>>>>>>>> On Mon, Nov 10, 2014 at 09:08:47AM +0000, Stoidner, Christoph wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Hi Gilles,
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Do you have the same message with exactly the same kernel
> >>>>>>>>>>>>>>> configuration, only with CONFIG_XENOMAI and CONFIG_IPIPE disabled?
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> When CONFIG_XENOMAI and CONFIG_IPIPE are disabled the message does not
> >>>>>>>>>>>>>> appear on boot-up.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Do you have FCSE enabled? If yes, did you try disabling it? same
> >>>>>>>>>>>>>>> with unlocked context switch.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> FCSE is already disabled at all.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Do you have an idea how to overcome the problem?
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> I am not sure the lockdep message really is a problem. lockdep could
> >>>>>>>>>>>>> be confused by the fact that the hardware interrupts are not off
> >>>>>>>>>>>>> when running the I-pipe, or because we are missing some bit in the
> >>>>>>>>>>>>> I-pipe arm specific code to get it looking at the virtual mask
> >>>>>>>>>>>>> instead of the hardware mask.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> As for the scheduling while atomic and random segmentation fault,
> >>>>>>>>>>>>> you should use the I-pipe tracer, configure it with enough back
> >>>>>>>>>>>>> trace points, something like 1000 or 10000, and trigger a trace
> >>>>>>>>>>>>> freeze in the kernel code when the problem happens.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Also, for the "scheduling while atomic", it may happen if you call
> >>>>>>>>>>>>> some Linux service which reschedules from primary mode, you can try
> >>>>>>>>>>>>> enabling I-pipe debugging, and in fact all Xenomai debugging, to try
> >>>>>>>>>>>>> and catch such mistakes. This is especially important if you are
> >>>>>>>>>>>>> running a custom skin.
> >>>>>>>>>>>>
> >>>>>>>>>>>> "Scheduling while atomic" may have the same reason why lockdep stumbles:
> >>>>>>>>>>>> some changes of I-pipe mess up the IRQ state tracing of Linux. I just
> >>>>>>>>>>>> started to look into this issue again. We tried earlier but got distracted.
> >>>>>>>>>>>
> >>>>>>>>>>> I doubt that very much. Though I never run with lockdep, I sometimes
> >>>>>>>>>>> run with CONFIG_PREEMPT, and never saw this message. From what I can
> >>>>>>>>>>> see, the "scheduling while atomic" message is based on the
> >>>>>>>>>>> preempt_count only and does not use irqs_disabled() (which by the
> >>>>>>>>>>> way is known to work with I-pipe on ARM as well, so, if something is
> >>>>>>>>>>> broken, that should be something more obscure).
> >>>>>>>>>>
> >>>>>>>>>> Let's see. I think I've identified one wrong path:
> >>>>>>>>>>
> >>>>>>>>>> diff --git a/arch/arm/kernel/entry-header.S b/arch/arm/kernel/entry-header.S
> >>>>>>>>>> index d32f8bd..ab911f8 100644
> >>>>>>>>>> --- a/arch/arm/kernel/entry-header.S
> >>>>>>>>>> +++ b/arch/arm/kernel/entry-header.S
> >>>>>>>>>> @@ -198,7 +198,10 @@
> >>>>>>>>>> #ifdef CONFIG_TRACE_IRQFLAGS
> >>>>>>>>>> @ The parent context IRQs must have been enabled to get here in
> >>>>>>>>>> @ the first place, so there's no point checking the PSR I bit.
> >>>>>>>>>> - bl trace_hardirqs_on
> >>>>>>>>>> + tst \rpsr, #PSR_I_BIT
> >>>>>>>>>> + bleq trace_hardirqs_off
> >>>>>>>>>> + tst \rpsr, #PSR_I_BIT
> >>>>>>>>>> + blne trace_hardirqs_on
> >>>>>>>>>> #endif
> >>>>>>>>>> .else
> >>>>>>>>>> @ IRQs off again before pulling preserved data off the stack
> >>>>>>>>>>
> >>>>>>>>>> This is probably no fix, but with that change applied, the warning is
> >>>>>>>>>> gone. Now the question is what to really test for when returning here. I
> >>>>>>>>>> suppose we want the pipeline state of root here - should I
> >>>>>>>>>> __ipipe_check_root_interruptible?
> >>>>>>>>>
> >>>>>>>>> This does not make sense, read the comment above that change: there
> >>>>>>>>> is no way an interrupt can be taken, and so entering svc_entry, with
> >>>>>>>>> interrupts off. Besides this is mainline code, so it would be a
> >>>>>>>>> problem for mainline too. We are necessarily returning to a place
> >>>>>>>>> where hardware irqs were on.
> >>>>>>>>
> >>>>>>>> Did you also look at the trace I posted?
> >>>>>>>
> >>>>>>> Yes, but I did not see what I am supposed to see. The only thing I
> >>>>>>> see is that these trace functions should never have been called from
> >>>>>>> rt domain in the first place.
> >>>>>>>
> >>>>>>
> >>>>>> There is no RT domain in the trace, only an inconsistent Linux trace
> >>>>>> state after return from IRQ.
> >>>>>
> >>>>> What can I say, when returning from IRQ, you are necessarily
> >>>>> returning to a point where irqs are ON, as the comment says, and it
> >>>>> makes perfect sense. So your "fix" should be a nop. So, something
> >>>>> else is broken.
> >>>>
> >>>> The test for selecting trace_hardirqs_off/on is wrong, that's why I
> >>>> was asking for a better check. Also, if that path can be taken by RT
> >>>> domains as well, calling trace_hardirqs_off/on was always wrong, and we
> >>>> additionally need to check for the caller's domain.
> >>>>
> >>>>>
> >>>>>>
> >>>>>>> Note that the fact that this trace_irqs stuff is not working well
> >>>>>>> may be the fact that part of them are commented with CONFIG_IPIPE
> >>>>>>> (see asm_trace_hardirqs_on_cond, asm_trace_hardirqs_off)
> >>>>>>
> >>>>>> No, that doesn't solve all issues. Even with my hack (which may not
> >>>>>> address all cases properly) plus the reversion of that commit, there are
> >>>>>> still inconsistencies.
> >>>>>
> >>>>> You can not reverse that commit, otherwise you will end-up calling
> >>>>> trace_hardirqs_on/trace_hardirqs_off from RT domain, which, I
> >>>>> repeat, can not work.
> >>>>
> >>>> I can help to understand if that is sufficient to resolve the tracing
> >>>> breakage - it isn't, there are more paths missing or wrongly instrumented.
> >>>
> >>> My idea of all this is that CONFIG_TRACE_IRQFLAGS should depend on
> >>> !IPIPE, since the I-pipe tracer provides the same functionality. And
> >>> is not broken.
> >>
> >> No, the I-pipe trace does not provide a Linux lock dependency checker,
> >> nor does it support might_sleep and such. If you have Linux drivers
> >> which depend on Xenomai directly or indirectly, you cannot validate them
> >> anymore. That's why we support this on x86.
> >
> > Since the I-pipe is already keeping track of irq state with
> > CONFIG_IPIPE_TRACE_IRQSOFF, can we not use that information instead
> > of trying and using this trace_hardirqs stuff which looks
> > irremediably broken to me?
>
> The former reflects the hw state, the latter traces the Linux state -
> from Linux POV.
The I-pipe tracer keeps track of the root domain stall bit as well.
>
> This is fixable. We just need to call the tracing functions where Linux
> would call it or where we replaced some Linux call with an I-pipe
> specific path and avoid calling it when the domain != root. Identifying
> those spots is tricky.
If we take the example of an irq, we probably do not want to call
trace_hardirqs_on/trace_hardirqs_off anywhere, and just rely on the
root domain stall bit.
--
Gilles.
* Re: [Xenomai] "inconsistent lock state" on boot-up
2014-11-10 20:37 ` Gilles Chanteperdrix
@ 2014-11-10 20:42 ` Jan Kiszka
2014-11-10 20:55 ` Gilles Chanteperdrix
0 siblings, 1 reply; 47+ messages in thread
From: Jan Kiszka @ 2014-11-10 20:42 UTC (permalink / raw)
To: Gilles Chanteperdrix; +Cc: xenomai@xenomai.org
On 2014-11-10 21:37, Gilles Chanteperdrix wrote:
> On Mon, Nov 10, 2014 at 09:28:50PM +0100, Jan Kiszka wrote:
>> On 2014-11-10 21:23, Gilles Chanteperdrix wrote:
>>> On Mon, Nov 10, 2014 at 09:17:18PM +0100, Jan Kiszka wrote:
>>>> On 2014-11-10 21:14, Gilles Chanteperdrix wrote:
>>>>> On Mon, Nov 10, 2014 at 09:10:31PM +0100, Jan Kiszka wrote:
>>>>>> On 2014-11-10 21:06, Gilles Chanteperdrix wrote:
>>>>>>> On Mon, Nov 10, 2014 at 09:02:58PM +0100, Jan Kiszka wrote:
>>>>>>>> On 2014-11-10 21:00, Gilles Chanteperdrix wrote:
>>>>>>>>> On Mon, Nov 10, 2014 at 08:55:26PM +0100, Jan Kiszka wrote:
>>>>>>>>>> On 2014-11-10 20:46, Gilles Chanteperdrix wrote:
>>>>>>>>>>> On Mon, Nov 10, 2014 at 07:29:58PM +0100, Jan Kiszka wrote:
>>>>>>>>>>>> On 2014-11-10 16:56, Gilles Chanteperdrix wrote:
>>>>>>>>>>>>> On Mon, Nov 10, 2014 at 03:52:41PM +0100, Jan Kiszka wrote:
>>>>>>>>>>>>>> On 2014-11-10 13:43, Gilles Chanteperdrix wrote:
>>>>>>>>>>>>>>> On Mon, Nov 10, 2014 at 09:08:47AM +0000, Stoidner, Christoph wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi Gilles,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Do you have the same message with exactly the same kernel
>>>>>>>>>>>>>>>>> configuration, only with CONFIG_XENOMAI and CONFIG_IPIPE disabled?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> When CONFIG_XENOMAI and CONFIG_IPIPE are disabled the message does not
>>>>>>>>>>>>>>>> appear on boot-up.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Do you have FCSE enabled? If yes, did you try disabling it? same
>>>>>>>>>>>>>>>>> with unlocked context switch.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> FCSE is already disabled at all.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Do you have an idea how to overcome the problem?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I am not sure the lockdep message really is a problem. lockdep could
>>>>>>>>>>>>>>> be confused by the fact that the hardware interrupts are not off
>>>>>>>>>>>>>>> when running the I-pipe, or because we are missing some bit in the
>>>>>>>>>>>>>>> I-pipe arm specific code to get it looking at the virtual mask
>>>>>>>>>>>>>>> instead of the hardware mask.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> As for the scheduling while atomic and random segmentation fault,
>>>>>>>>>>>>>>> you should use the I-pipe tracer, configure it with enough back
>>>>>>>>>>>>>>> trace points, something like 1000 or 10000, and trigger a trace
>>>>>>>>>>>>>>> freeze in the kernel code when the problem happens.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Also, for the "scheduling while atomic", it may happen if you call
>>>>>>>>>>>>>>> some Linux service which reschedules from primary mode, you can try
>>>>>>>>>>>>>>> enabling I-pipe debugging, and in fact all Xenomai debugging, to try
>>>>>>>>>>>>>>> and catch such mistakes. This is especially important if you are
>>>>>>>>>>>>>>> running a custom skin.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> "Scheduling while atomic" may have the same reason why lockdep stumbles:
>>>>>>>>>>>>>> some changes of I-pipe mess up the IRQ state tracing of Linux. I just
>>>>>>>>>>>>>> started to look into this issue again. We tried earlier but got distracted.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I doubt that very much. Though I never run with lockdep, I sometimes
>>>>>>>>>>>>> run with CONFIG_PREEMPT, and never saw this message. From what I can
>>>>>>>>>>>>> see, the "scheduling while atomic" message is based on the
>>>>>>>>>>>>> preempt_count only and does not use irqs_disabled() (which by the
>>>>>>>>>>>>> way is known to work with I-pipe on ARM as well, so, if something is
>>>>>>>>>>>>> broken, that should be something more obscure).
>>>>>>>>>>>>
>>>>>>>>>>>> Let's see. I think I've identified one wrong path:
>>>>>>>>>>>>
>>>>>>>>>>>> diff --git a/arch/arm/kernel/entry-header.S b/arch/arm/kernel/entry-header.S
>>>>>>>>>>>> index d32f8bd..ab911f8 100644
>>>>>>>>>>>> --- a/arch/arm/kernel/entry-header.S
>>>>>>>>>>>> +++ b/arch/arm/kernel/entry-header.S
>>>>>>>>>>>> @@ -198,7 +198,10 @@
>>>>>>>>>>>> #ifdef CONFIG_TRACE_IRQFLAGS
>>>>>>>>>>>> @ The parent context IRQs must have been enabled to get here in
>>>>>>>>>>>> @ the first place, so there's no point checking the PSR I bit.
>>>>>>>>>>>> - bl trace_hardirqs_on
>>>>>>>>>>>> + tst \rpsr, #PSR_I_BIT
>>>>>>>>>>>> + bleq trace_hardirqs_off
>>>>>>>>>>>> + tst \rpsr, #PSR_I_BIT
>>>>>>>>>>>> + blne trace_hardirqs_on
>>>>>>>>>>>> #endif
>>>>>>>>>>>> .else
>>>>>>>>>>>> @ IRQs off again before pulling preserved data off the stack
>>>>>>>>>>>>
>>>>>>>>>>>> This is probably no fix, but with that change applied, the warning is
>>>>>>>>>>>> gone. Now the question is what to really test for when returning here. I
>>>>>>>>>>>> suppose we want the pipeline state of root here - should I
>>>>>>>>>>>> __ipipe_check_root_interruptible?
>>>>>>>>>>>
>>>>>>>>>>> This does not make sense, read the comment above that change: there
>>>>>>>>>>> is no way an interrupt can be taken, and so entering svc_entry, with
>>>>>>>>>>> interrupts off. Besides this is mainline code, so it would be a
>>>>>>>>>>> problem for mainline too. We are necessarily returning to a place
>>>>>>>>>>> where hardware irqs were on.
>>>>>>>>>>
>>>>>>>>>> Did you also look at the trace I posted?
>>>>>>>>>
>>>>>>>>> Yes, but I did not see what I am supposed to see. The only thing I
>>>>>>>>> see is that these trace functions should never have been called from
>>>>>>>>> rt domain in the first place.
>>>>>>>>>
>>>>>>>>
>>>>>>>> There is no RT domain in the trace, only an inconsistent Linux trace
>>>>>>>> state after return from IRQ.
>>>>>>>
>>>>>>> What can I say, when returning from IRQ, you are necessarily
>>>>>>> returning to a point where irqs are ON, as the comment says, and it
>>>>>>> makes perfect sense. So your "fix" should be a nop. So, something
>>>>>>> else is broken.
>>>>>>
>>>>>> The test for selecting trace_hardirqs_off/on is wrong, that's why I
>>>>>> was asking for a better check. Also, if that path can be taken by RT
>>>>>> domains as well, calling trace_hardirqs_off/on was always wrong, and we
>>>>>> additionally need to check for the caller's domain.
>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>>> Note that the fact that this trace_irqs stuff is not working well
>>>>>>>>> may be the fact that part of them are commented with CONFIG_IPIPE
>>>>>>>>> (see asm_trace_hardirqs_on_cond, asm_trace_hardirqs_off)
>>>>>>>>
>>>>>>>> No, that doesn't solve all issues. Even with my hack (which may not
>>>>>>>> address all cases properly) plus the reversion of that commit, there are
>>>>>>>> still inconsistencies.
>>>>>>>
>>>>>>> You can not reverse that commit, otherwise you will end-up calling
>>>>>>> trace_hardirqs_on/trace_hardirqs_off from RT domain, which, I
>>>>>>> repeat, can not work.
>>>>>>
>>>>>> I can help to understand if that is sufficient to resolve the tracing
>>>>>> breakage - it isn't, there are more paths missing or wrongly instrumented.
>>>>>
>>>>> My idea of all this is that CONFIG_TRACE_IRQFLAGS should depend on
>>>>> !IPIPE, since the I-pipe tracer provides the same functionality. And
>>>>> is not broken.
>>>>
>>>> No, the I-pipe trace does not provide a Linux lock dependency checker,
>>>> nor does it support might_sleep and such. If you have Linux drivers
>>>> which depend on Xenomai directly or indirectly, you cannot validate them
>>>> anymore. That's why we support this on x86.
>>>
>>> Since the I-pipe is already keeping track of irq state with
>>> CONFIG_IPIPE_TRACE_IRQSOFF, can we not use that information instead
>>> of trying and using this trace_hardirqs stuff which looks
>>> irremediably broken to me?
>>
>> The former reflects the hw state, the latter traces the Linux state -
>> from Linux POV.
>
> The I-pipe tracer keeps track of the root domain stall bit as well.
>
>>
>> This is fixable. We just need to call the tracing functions where Linux
>> would call it or where we replaced some Linux call with an I-pipe
>> specific path and avoid calling it when the domain != root. Identifying
>> those spots is tricky.
>
> If we take the example of an irq, we probably want not to call
> trace_hardirqs_on/trace_hardirqs_off anywhere, and just rely on the
> root domain stall bit.
Linux tracks the IRQ state separately from the (now virtualized) real
state, to validate the consistency independently of some spurious hard
irq enable/disable. And it tracks this per task, not per CPU. It will be
messier to fake this than to fix it, I'm quite sure.
Jan
* Re: [Xenomai] "inconsistent lock state" on boot-up
2014-11-10 20:42 ` Jan Kiszka
@ 2014-11-10 20:55 ` Gilles Chanteperdrix
2014-11-10 21:58 ` Gilles Chanteperdrix
0 siblings, 1 reply; 47+ messages in thread
From: Gilles Chanteperdrix @ 2014-11-10 20:55 UTC (permalink / raw)
To: Jan Kiszka; +Cc: xenomai@xenomai.org
On Mon, Nov 10, 2014 at 09:42:22PM +0100, Jan Kiszka wrote:
> On 2014-11-10 21:37, Gilles Chanteperdrix wrote:
> > On Mon, Nov 10, 2014 at 09:28:50PM +0100, Jan Kiszka wrote:
> >> On 2014-11-10 21:23, Gilles Chanteperdrix wrote:
> >>> On Mon, Nov 10, 2014 at 09:17:18PM +0100, Jan Kiszka wrote:
> >>>> On 2014-11-10 21:14, Gilles Chanteperdrix wrote:
> >>>>> On Mon, Nov 10, 2014 at 09:10:31PM +0100, Jan Kiszka wrote:
> >>>>>> On 2014-11-10 21:06, Gilles Chanteperdrix wrote:
> >>>>>>> On Mon, Nov 10, 2014 at 09:02:58PM +0100, Jan Kiszka wrote:
> >>>>>>>> On 2014-11-10 21:00, Gilles Chanteperdrix wrote:
> >>>>>>>>> On Mon, Nov 10, 2014 at 08:55:26PM +0100, Jan Kiszka wrote:
> >>>>>>>>>> On 2014-11-10 20:46, Gilles Chanteperdrix wrote:
> >>>>>>>>>>> On Mon, Nov 10, 2014 at 07:29:58PM +0100, Jan Kiszka wrote:
> >>>>>>>>>>>> On 2014-11-10 16:56, Gilles Chanteperdrix wrote:
> >>>>>>>>>>>>> On Mon, Nov 10, 2014 at 03:52:41PM +0100, Jan Kiszka wrote:
> >>>>>>>>>>>>>> On 2014-11-10 13:43, Gilles Chanteperdrix wrote:
> >>>>>>>>>>>>>>> On Mon, Nov 10, 2014 at 09:08:47AM +0000, Stoidner, Christoph wrote:
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Hi Gilles,
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Do you have the same message with exactly the same kernel
> >>>>>>>>>>>>>>>>> configuration, only with CONFIG_XENOMAI and CONFIG_IPIPE disabled?
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> When CONFIG_XENOMAI and CONFIG_IPIPE are disabled the message does not
> >>>>>>>>>>>>>>>> appear on boot-up.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Do you have FCSE enabled? If yes, did you try disabling it? same
> >>>>>>>>>>>>>>>>> with unlocked context switch.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> FCSE is already disabled at all.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Do you have an idea how to overcome the problem?
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> I am not sure the lockdep message really is a problem. lockdep could
> >>>>>>>>>>>>>>> be confused by the fact that the hardware interrupts are not off
> >>>>>>>>>>>>>>> when running the I-pipe, or because we are missing some bit in the
> >>>>>>>>>>>>>>> I-pipe arm specific code to get it looking at the virtual mask
> >>>>>>>>>>>>>>> instead of the hardware mask.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> As for the scheduling while atomic and random segmentation fault,
> >>>>>>>>>>>>>>> you should use the I-pipe tracer, configure it with enough back
> >>>>>>>>>>>>>>> trace points, something like 1000 or 10000, and trigger a trace
> >>>>>>>>>>>>>>> freeze in the kernel code when the problem happens.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Also, for the "scheduling while atomic", it may happen if you call
> >>>>>>>>>>>>>>> some Linux service which reschedules from primary mode, you can try
> >>>>>>>>>>>>>>> enabling I-pipe debugging, and in fact all Xenomai debugging, to try
> >>>>>>>>>>>>>>> and catch such mistakes. This is especially important if you are
> >>>>>>>>>>>>>>> running a custom skin.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> "Scheduling while atomic" may have the same reason why lockdep stumbles:
> >>>>>>>>>>>>>> some changes of I-pipe mess up the IRQ state tracing of Linux. I just
> >>>>>>>>>>>>>> started to look into this issue again. We tried earlier but got distracted.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> I doubt that very much. Though I never run with lockdep, I sometimes
> >>>>>>>>>>>>> run with CONFIG_PREEMPT, and never saw this message. From what I can
> >>>>>>>>>>>>> see, the "scheduling while atomic" message is based on the
> >>>>>>>>>>>>> preempt_count only and does not use irqs_disabled() (which by the
> >>>>>>>>>>>>> way is known to work with I-pipe on ARM as well, so, if something is
> >>>>>>>>>>>>> broken, that should be something more obscure).
> >>>>>>>>>>>>
> >>>>>>>>>>>> Let's see. I think I've identified one wrong path:
> >>>>>>>>>>>>
> >>>>>>>>>>>> diff --git a/arch/arm/kernel/entry-header.S b/arch/arm/kernel/entry-header.S
> >>>>>>>>>>>> index d32f8bd..ab911f8 100644
> >>>>>>>>>>>> --- a/arch/arm/kernel/entry-header.S
> >>>>>>>>>>>> +++ b/arch/arm/kernel/entry-header.S
> >>>>>>>>>>>> @@ -198,7 +198,10 @@
> >>>>>>>>>>>> #ifdef CONFIG_TRACE_IRQFLAGS
> >>>>>>>>>>>> @ The parent context IRQs must have been enabled to get here in
> >>>>>>>>>>>> @ the first place, so there's no point checking the PSR I bit.
> >>>>>>>>>>>> - bl trace_hardirqs_on
> >>>>>>>>>>>> + tst \rpsr, #PSR_I_BIT
> >>>>>>>>>>>> + bleq trace_hardirqs_off
> >>>>>>>>>>>> + tst \rpsr, #PSR_I_BIT
> >>>>>>>>>>>> + blne trace_hardirqs_on
> >>>>>>>>>>>> #endif
> >>>>>>>>>>>> .else
> >>>>>>>>>>>> @ IRQs off again before pulling preserved data off the stack
> >>>>>>>>>>>>
> >>>>>>>>>>>> This is probably no fix, but with that change applied, the warning is
> >>>>>>>>>>>> gone. Now the question is what to really test for when returning here. I
> >>>>>>>>>>>> suppose we want the pipeline state of root here - should I
> >>>>>>>>>>>> __ipipe_check_root_interruptible?
> >>>>>>>>>>>
> >>>>>>>>>>> This does not make sense, read the comment above that change: there
> >>>>>>>>>>> is no way an interrupt can be taken, and so entering svc_entry, with
> >>>>>>>>>>> interrupts off. Besides this is mainline code, so it would be a
> >>>>>>>>>>> problem for mainline too. We are necessarily returning to a place
> >>>>>>>>>>> where hardware irqs were on.
> >>>>>>>>>>
> >>>>>>>>>> Did you also look at the trace I posted?
> >>>>>>>>>
> >>>>>>>>> Yes, but I did not see what I am supposed to see. The only thing I
> >>>>>>>>> see is that these trace functions should never have been called from
> >>>>>>>>> rt domain in the first place.
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>> There is no RT domain in the trace, only an inconsistent Linux trace
> >>>>>>>> state after return from IRQ.
> >>>>>>>
> >>>>>>> What can I say, when returning from IRQ, you are necessarily
> >>>>>>> returning to a point where irqs are ON, as the comment says, and it
> >>>>>>> makes perfect sense. So your "fix" should be a nop. So, something
> >>>>>>> else is broken.
> >>>>>>
> >>>>>> The test for selecting trace_hardirqs_off/on is wrong, that's why I
> >>>>>> was asking for a better check. Also, if that path can be taken by RT
> >>>>>> domains as well, calling trace_hardirqs_off/on was always wrong, and we
> >>>>>> additionally need to check for the caller's domain.
> >>>>>>
> >>>>>>>
> >>>>>>>>
> >>>>>>>>> Note that the fact that this trace_irqs stuff is not working well
> >>>>>>>>> may be the fact that part of them are commented with CONFIG_IPIPE
> >>>>>>>>> (see asm_trace_hardirqs_on_cond, asm_trace_hardirqs_off)
> >>>>>>>>
> >>>>>>>> No, that doesn't solve all issues. Even with my hack (which may not
> >>>>>>>> address all cases properly) plus the reversion of that commit, there are
> >>>>>>>> still inconsistencies.
> >>>>>>>
> >>>>>>> You can not reverse that commit, otherwise you will end-up calling
> >>>>>>> trace_hardirqs_on/trace_hardirqs_off from RT domain, which, I
> >>>>>>> repeat, can not work.
> >>>>>>
> >>>>>> I can help to understand if that is sufficient to resolve the tracing
> >>>>>> breakage - it isn't, there are more paths missing or wrongly instrumented.
> >>>>>
> >>>>> My idea of all this is that CONFIG_TRACE_IRQFLAGS should depend on
> >>>>> !IPIPE, since the I-pipe tracer provides the same functionality. And
> >>>>> is not broken.
> >>>>
> >>>> No, the I-pipe trace does not provide a Linux lock dependency checker,
> >>>> nor does it support might_sleep and such. If you have Linux drivers
> >>>> which depend on Xenomai directly or indirectly, you cannot validate them
> >>>> anymore. That's why we support this on x86.
> >>>
> >>> Since the I-pipe is already keeping track of irq state with
> >>> CONFIG_IPIPE_TRACE_IRQSOFF, can we not use that information instead
> >>> of trying and using this trace_hardirqs stuff which looks
> >>> irremediably broken to me?
> >>
> >> The former reflects the hw state, the latter traces the Linux state -
> >> from Linux POV.
> >
> > The I-pipe tracer keeps track of the root domain stall bit as well.
> >
> >>
> >> This is fixable. We just need to call the tracing functions where Linux
> >> would call it or where we replaced some Linux call with an I-pipe
> >> specific path and avoid calling it when the domain != root. Identifying
> >> those spots is tricky.
> >
> > If we take the example of an irq, we probably want not to call
> > trace_hardirqs_on/trace_hardirqs_off anywhere, and just rely on the
> > root domain stall bit.
>
> Linux tracks the IRQ state separately from the (now virtualized) real
> state - to validate the consistency independently of some spurious hard
> irq enable/disable. And it tracks per task, not per CPU. It will be more
> messy to fake this than to fix it, I'm quite sure.
If we take the example of irq_svc (the one you patched), we have
4 cases:
1- entry over root, exit over root
2- entry over root, exit over non root
3- entry over non root, exit over non root
4- entry over non root, exit over root
For all these cases, currently, trace_hardirqs_off is called on
entry, and trace_hardirqs_on is called on exit.
Case 1: trace_hardirqs_off on entry, may be right, but may in fact
be useless if root domain is already stalled; trace_hardirqs_on on
exit, may be wrong if root domain is stalled (so is right only if
trace_hardirqs_off was not a nop on entry).
Case 2: trace_hardirqs_off on entry, same as case 1;
trace_hardirqs_on on exit is always wrong: we are now running in RT
domain and should not touch the root domain irq mask
Case 3: wrong, wrong
Case 4: trace_hardirqs_off on entry is wrong, we are not running in
the root domain. trace_hardirqs_on on exit may be right, if the root
domain was not stalled when we took the interrupt that put us in the
RT domain.
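
As a rough illustration of the case 1 concern above, here is a toy model in C. It is only a sketch under stated assumptions: the struct and function names are invented for illustration and are not kernel or I-pipe symbols, and the "tracer" is reduced to a single boolean that is supposed to mirror the root domain stall bit.

```c
#include <stdbool.h>

/* Toy model only: none of these names are real kernel or I-pipe symbols. */
struct irq_model {
    bool root_stalled;    /* virtual IRQ mask (stall bit) of the root domain */
    bool traced_irqs_on;  /* what the Linux irq-flags tracer believes */
};

/* Stand-ins for trace_hardirqs_off() and trace_hardirqs_on(). */
static void model_trace_off(struct irq_model *m) { m->traced_irqs_on = false; }
static void model_trace_on(struct irq_model *m)  { m->traced_irqs_on = true; }

/* The tracer view is consistent when it mirrors the stall bit. */
static bool model_consistent(const struct irq_model *m)
{
    return m->traced_irqs_on == !m->root_stalled;
}

/*
 * Behaviour described above: trace "off" on IRQ entry and trace "on" on
 * IRQ exit, unconditionally. The stall bit itself is left untouched by
 * the interrupt. Returns true if the tracer still matches the stall bit.
 */
static bool unconditional_irq_path_is_consistent(bool root_stalled)
{
    struct irq_model m = {
        .root_stalled = root_stalled,
        .traced_irqs_on = !root_stalled,  /* start from a consistent state */
    };

    model_trace_off(&m);  /* entry */
    model_trace_on(&m);   /* exit */
    return model_consistent(&m);
}
```

In this model, `unconditional_irq_path_is_consistent(false)` holds, while `unconditional_irq_path_is_consistent(true)` does not: the tracer ends up claiming IRQs are on while the root domain is still stalled.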
--
Gilles.
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Xenomai] "inconsistent lock state" on boot-up
2014-11-10 20:55 ` Gilles Chanteperdrix
@ 2014-11-10 21:58 ` Gilles Chanteperdrix
2014-11-12 17:27 ` Gilles Chanteperdrix
0 siblings, 1 reply; 47+ messages in thread
From: Gilles Chanteperdrix @ 2014-11-10 21:58 UTC (permalink / raw)
To: Jan Kiszka; +Cc: xenomai@xenomai.org
On Mon, Nov 10, 2014 at 09:55:12PM +0100, Gilles Chanteperdrix wrote:
> On Mon, Nov 10, 2014 at 09:42:22PM +0100, Jan Kiszka wrote:
> > On 2014-11-10 21:37, Gilles Chanteperdrix wrote:
> > > On Mon, Nov 10, 2014 at 09:28:50PM +0100, Jan Kiszka wrote:
> > >> On 2014-11-10 21:23, Gilles Chanteperdrix wrote:
> > >>> On Mon, Nov 10, 2014 at 09:17:18PM +0100, Jan Kiszka wrote:
> > >>>> On 2014-11-10 21:14, Gilles Chanteperdrix wrote:
> > >>>>> On Mon, Nov 10, 2014 at 09:10:31PM +0100, Jan Kiszka wrote:
> > >>>>>> On 2014-11-10 21:06, Gilles Chanteperdrix wrote:
> > >>>>>>> On Mon, Nov 10, 2014 at 09:02:58PM +0100, Jan Kiszka wrote:
> > >>>>>>>> On 2014-11-10 21:00, Gilles Chanteperdrix wrote:
> > >>>>>>>>> On Mon, Nov 10, 2014 at 08:55:26PM +0100, Jan Kiszka wrote:
> > >>>>>>>>>> On 2014-11-10 20:46, Gilles Chanteperdrix wrote:
> > >>>>>>>>>>> On Mon, Nov 10, 2014 at 07:29:58PM +0100, Jan Kiszka wrote:
> > >>>>>>>>>>>> On 2014-11-10 16:56, Gilles Chanteperdrix wrote:
> > >>>>>>>>>>>>> On Mon, Nov 10, 2014 at 03:52:41PM +0100, Jan Kiszka wrote:
> > >>>>>>>>>>>>>> On 2014-11-10 13:43, Gilles Chanteperdrix wrote:
> > >>>>>>>>>>>>>>> On Mon, Nov 10, 2014 at 09:08:47AM +0000, Stoidner, Christoph wrote:
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> Hi Gilles,
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> Do you have the same message with exactly the same kernel
> > >>>>>>>>>>>>>>>>> configuration, only with CONFIG_XENOMAI and CONFIG_IPIPE disabled?
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> When CONFIG_XENOMAI and CONFIG_IPIPE are disabled the message does not
> > >>>>>>>>>>>>>>>> appear on boot-up.
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> Do you have FCSE enabled? If yes, did you try disabling it? same
> > >>>>>>>>>>>>>>>>> with unlocked context switch.
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> FCSE is already disabled at all.
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> Do you have an idea how to overcome the problem?
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> I am not sure the lockdep message really is a problem. lockdep could
> > >>>>>>>>>>>>>>> be confused by the fact that the hardware interrupts are not off
> > >>>>>>>>>>>>>>> when running the I-pipe, or because we are missing some bit in the
> > >>>>>>>>>>>>>>> I-pipe arm specific code to get it looking at the virtual mask
> > >>>>>>>>>>>>>>> instead of the hardware mask.
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> As for the scheduling while atomic and random segmentation fault,
> > >>>>>>>>>>>>>>> you should use the I-pipe tracer, configure it with enough back
> > >>>>>>>>>>>>>>> trace points, something like 1000 or 10000, and trigger a trace
> > >>>>>>>>>>>>>>> freeze in the kernel code when the problem happens.
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> Also, for the "scheduling while atomic", it may happen if you call
> > >>>>>>>>>>>>>>> some Linux service which reschedules from primary mode, you can try
> > >>>>>>>>>>>>>>> enabling I-pipe debugging, and in fact all Xenomai debugging, to try
> > >>>>>>>>>>>>>>> and catch such mistakes. This is especially important if you are
> > >>>>>>>>>>>>>>> running a custom skin.
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> "Scheduling while atomic" may have the same reason why lockdep stumbles:
> > >>>>>>>>>>>>>> some changes of I-pipe mess up the IRQ state tracing of Linux. I just
> > >>>>>>>>>>>>>> started to look into this issue again. We tried earlier but got distracted.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> I doubt that very much. Though I never run with lockdep, I sometimes
> > >>>>>>>>>>>>> run with CONFIG_PREEMPT, and never saw this message. From what I can
> > >>>>>>>>>>>>> see, the "scheduling while atomic" message is based on the
> > >>>>>>>>>>>>> preempt_count only and does not use irqs_disabled() (which by the
> > >>>>>>>>>>>>> way is known to work with I-pipe on ARM as well, so, if something is
> > >>>>>>>>>>>>> broken, that should be something more obscure).
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Let's see. I think I've identified one wrong path:
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> diff --git a/arch/arm/kernel/entry-header.S b/arch/arm/kernel/entry-header.S
> > >>>>>>>>>>>> index d32f8bd..ab911f8 100644
> > >>>>>>>>>>>> --- a/arch/arm/kernel/entry-header.S
> > >>>>>>>>>>>> +++ b/arch/arm/kernel/entry-header.S
> > >>>>>>>>>>>> @@ -198,7 +198,10 @@
> > >>>>>>>>>>>> #ifdef CONFIG_TRACE_IRQFLAGS
> > >>>>>>>>>>>> @ The parent context IRQs must have been enabled to get here in
> > >>>>>>>>>>>> @ the first place, so there's no point checking the PSR I bit.
> > >>>>>>>>>>>> - bl trace_hardirqs_on
> > >>>>>>>>>>>> + tst \rpsr, #PSR_I_BIT
> > >>>>>>>>>>>> + bleq trace_hardirqs_off
> > >>>>>>>>>>>> + tst \rpsr, #PSR_I_BIT
> > >>>>>>>>>>>> + blne trace_hardirqs_on
> > >>>>>>>>>>>> #endif
> > >>>>>>>>>>>> .else
> > >>>>>>>>>>>> @ IRQs off again before pulling preserved data off the stack
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> This is probably no fix, but with that change applied, the warning is
> > >>>>>>>>>>>> gone. Now the question is what to really test for when returning here. I
> > >>>>>>>>>>>> suppose we want the pipeline state of root here - should I
> > >>>>>>>>>>>> __ipipe_check_root_interruptible?
> > >>>>>>>>>>>
> > >>>>>>>>>>> This does not make sense, read the comment above that change: there
> > >>>>>>>>>>> is no way an interrupt can be taken, and so entering svc_entry, with
> > >>>>>>>>>>> interrupts off. Besides this is mainline code, so it would be a
> > >>>>>>>>>>> problem for mainline too. We are necessarily returning to a place
> > >>>>>>>>>>> where hardware irqs were on.
> > >>>>>>>>>>
> > >>>>>>>>>> Did you also look at the trace I posted?
> > >>>>>>>>>
> > >>>>>>>>> Yes, but I did not see what I am supposed to see. The only thing I
> > >>>>>>>>> see is that these trace functions should never have been called from
> > >>>>>>>>> rt domain in the first place.
> > >>>>>>>>>
> > >>>>>>>>
> > >>>>>>>> There is no RT domain in the trace, only an inconsistent Linux trace
> > >>>>>>>> state after return from IRQ.
> > >>>>>>>
> > >>>>>>> What can I say, when returning from IRQ, you are necessarily
> > >>>>>>> returning to a point where irqs are ON, as the comment says, and it
> > >>>>>>> makes perfect sense. So your "fix" should be a nop. So, something
> > >>>>>>> else is broken.
> > >>>>>>
> > >>>>>> The test for selecting trace_hardirqs_off/on is wrong, that's why I
> > >>>>>> was asking for a better check. Also, if that path can be taken by RT
> > >>>>>> domains as well, calling trace_hardirqs_off/on was always wrong, and we
> > >>>>>> additionally need to check for the caller's domain.
> > >>>>>>
> > >>>>>>>
> > >>>>>>>>
> > >>>>>>>>> Note that the fact that this trace_irqs stuff is not working well
> > >>>>>>>>> may be the fact that part of them are commented with CONFIG_IPIPE
> > >>>>>>>>> (see asm_trace_hardirqs_on_cond, asm_trace_hardirqs_off)
> > >>>>>>>>
> > >>>>>>>> No, that doesn't solve all issues. Even with my hack (which may not
> > >>>>>>>> address all cases properly) plus the reversion of that commit, there are
> > >>>>>>>> still inconsistencies.
> > >>>>>>>
> > >>>>>>> You can not reverse that commit, otherwise you will end-up calling
> > >>>>>>> trace_hardirqs_on/trace_hardirqs_off from RT domain, which, I
> > >>>>>>> repeat, can not work.
> > >>>>>>
> > >>>>>> I can help to understand if that is sufficient to resolve the tracing
> > >>>>>> breakage - it isn't, there are more paths missing or wrongly instrumented.
> > >>>>>
> > >>>>> My idea of all this is that CONFIG_TRACE_IRQFLAGS should depend on
> > >>>>> !IPIPE, since the I-pipe tracer provides the same functionality. And
> > >>>>> is not broken.
> > >>>>
> > >>>> No, the I-pipe trace does not provide a Linux lock dependency checker,
> > >>>> nor does it support might_sleep and such. If you have Linux drivers
> > >>>> which depend on Xenomai directly or indirectly, you cannot validate them
> > >>>> anymore. That's why we support this on x86.
> > >>>
> > >>> Since the I-pipe is already keeping track of irq state with
> > >>> CONFIG_IPIPE_TRACE_IRQSOFF, can we not use that information instead
> > >>> of trying and using this trace_hardirqs stuff which looks
> > >>> irremediably broken to me?
> > >>
> > >> The former reflects the hw state, the latter traces the Linux state -
> > >> from Linux POV.
> > >
> > > The I-pipe tracer keeps track of the root domain stall bit as well.
> > >
> > >>
> > >> This is fixable. We just need to call the tracing functions where Linux
> > >> would call it or where we replaced some Linux call with an I-pipe
> > >> specific path and avoid calling it when the domain != root. Identifying
> > >> those spots is tricky.
> > >
> > > If we take the example of an irq, we probably want not to call
> > > trace_hardirqs_on/trace_hardirqs_off anywhere, and just rely on the
> > > root domain stall bit.
> >
> > Linux tracks the IRQ state separately from the (now virtualized) real
> > state - to validate the consistency independently of some spurious hard
> > irq enable/disable. And it tracks per task, not per CPU. It will be more
> > messy to fake this than to fix it, I'm quite sure.
>
> If we take the example of irq_svc (the example you patched). We have
> 4 cases:
>
> 1- entry over root, exit over root
> 2- entry over root, exit over non root
> 3- entry over non root, exit over non root
> 4- entry over non root, exit over root
Sorry, it does not work like that. Only cases 1 and 3 make sense.
Case 3 is easy, we do not need to call the trace_hardirqs functions.
For case 1, I guess the trace_hardirqs_on at the end must be
replaced with a test of the root domain stall bit, and call
trace_hardirqs_on only if we return to a non-stalled root.
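
The rule proposed above can be sketched as a small decision function in C. This is a hedged sketch, not the actual I-pipe change: the enum names and the function are hypothetical, and the real code would live in the ARM entry assembly and test the root domain stall bit there.

```c
#include <stdbool.h>

/* Hypothetical labels for the domain the interrupted context runs over. */
enum irq_domain { DOMAIN_ROOT, DOMAIN_NON_ROOT };

/* What the IRQ exit path should tell the Linux irq-flags tracer. */
enum trace_action { TRACE_NONE, TRACE_HARDIRQS_ON };

/*
 * Case 3 (non-root): never call the trace_hardirqs functions.
 * Case 1 (root): call trace_hardirqs_on only when returning to a
 * non-stalled root domain.
 */
static enum trace_action irq_exit_trace_action(enum irq_domain domain,
                                               bool root_stalled)
{
    if (domain != DOMAIN_ROOT)
        return TRACE_NONE;
    if (root_stalled)
        return TRACE_NONE;
    return TRACE_HARDIRQS_ON;
}
```

Only one of the four input combinations yields `TRACE_HARDIRQS_ON`: a return to the root domain with the stall bit clear.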
--
Gilles.
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Xenomai] "inconsistent lock state" on boot-up
2014-11-10 12:43 ` Gilles Chanteperdrix
2014-11-10 14:52 ` Jan Kiszka
@ 2014-11-11 17:33 ` Stoidner, Christoph
2014-11-11 17:46 ` Gilles Chanteperdrix
1 sibling, 1 reply; 47+ messages in thread
From: Stoidner, Christoph @ 2014-11-11 17:33 UTC (permalink / raw)
To: Gilles Chanteperdrix; +Cc: xenomai@xenomai.org
Hi Gilles,
> Also, for the "scheduling while atomic", it may happen if you call
> some Linux service which reschedules from primary mode, you can try
> enabling I-pipe debugging, and in fact all Xenomai debugging, to try
> and catch such mistakes. This is especially important if you are
> running a custom skin.
You are completely right, we have implemented our own skin. Using
the debugging functionality mentioned above we have identified a
function call that leads to exactly the behaviour described in your post.
Hopefully solving that issue solves our application crash.
Thanks for your fast and helpful support!
Regards,
Christoph
________________________________________
From: Gilles Chanteperdrix <gilles.chanteperdrix@xenomai.org>
Sent: Monday, 10 November 2014 13:43
To: Stoidner, Christoph
Cc: xenomai@xenomai.org
Subject: Re: [Xenomai] "inconsistent lock state" on boot-up
On Mon, Nov 10, 2014 at 09:08:47AM +0000, Stoidner, Christoph wrote:
>
> Hi Gilles,
>
> > Do you have the same message with exactly the same kernel
> > configuration, only with CONFIG_XENOMAI and CONFIG_IPIPE disabled?
>
> When CONFIG_XENOMAI and CONFIG_IPIPE are disabled the message does not
> appear on boot-up.
>
> > Do you have FCSE enabled? If yes, did you try disabling it? same
> > with unlocked context switch.
>
> FCSE is already disabled at all.
>
> Do you have an idea how to overcome the problem?
I am not sure the lockdep message really is a problem. lockdep could
be confused by the fact that the hardware interrupts are not off
when running the I-pipe, or because we are missing some bit in the
I-pipe arm specific code to get it looking at the virtual mask
instead of the hardware mask.
As for the scheduling while atomic and random segmentation fault,
you should use the I-pipe tracer, configure it with enough back
trace points, something like 1000 or 10000, and trigger a trace
freeze in the kernel code when the problem happens.
Also, for the "scheduling while atomic", it may happen if you call
some Linux service which reschedules from primary mode, you can try
enabling I-pipe debugging, and in fact all Xenomai debugging, to try
and catch such mistakes. This is especially important if you are
running a custom skin.
--
Gilles.
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Xenomai] "inconsistent lock state" on boot-up
2014-11-11 17:33 ` Stoidner, Christoph
@ 2014-11-11 17:46 ` Gilles Chanteperdrix
2014-11-11 18:04 ` Philippe Gerum
2014-11-17 10:01 ` Stoidner, Christoph
0 siblings, 2 replies; 47+ messages in thread
From: Gilles Chanteperdrix @ 2014-11-11 17:46 UTC (permalink / raw)
To: Stoidner, Christoph; +Cc: xenomai@xenomai.org
On Tue, Nov 11, 2014 at 05:33:55PM +0000, Stoidner, Christoph wrote:
>
> Hi Gilles,
>
> > Also, for the "scheduling while atomic", it may happen if you call
> > some Linux service which reschedules from primary mode, you can try
> > enabling I-pipe debugging, and in fact all Xenomai debugging, to try
> > and catch such mistakes. This is especially important if you are
> > running a custom skin.
>
> you are completely right, we have implemented our own skin. Using
> the debugging functionality mentioned above we have identified a
> function call that leads to exactly the behaviour described in your post.
>
> Hopefully solving that issue solves our application crash.
>
> Thanks for your fast and helpful support!
You are welcome. As a side note, if you have followed the
discussion, CONFIG_TRACE_IRQFLAGS is broken on ARM. Jan hopes to be
able to fix it, but for the time being you should disable it, or
you risk creating other issues.
--
Gilles.
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Xenomai] "inconsistent lock state" on boot-up
2014-11-11 17:46 ` Gilles Chanteperdrix
@ 2014-11-11 18:04 ` Philippe Gerum
2014-11-17 10:01 ` Stoidner, Christoph
1 sibling, 0 replies; 47+ messages in thread
From: Philippe Gerum @ 2014-11-11 18:04 UTC (permalink / raw)
To: Gilles Chanteperdrix, Stoidner, Christoph; +Cc: xenomai@xenomai.org
On 11/11/2014 06:46 PM, Gilles Chanteperdrix wrote:
> On Tue, Nov 11, 2014 at 05:33:55PM +0000, Stoidner, Christoph wrote:
>>
>> Hi Gilles,
>>
>>> Also, for the "scheduling while atomic", it may happen if you call
>>> some Linux service which reschedules from primary mode, you can try
>>> enabling I-pipe debugging, and in fact all Xenomai debugging, to try
>>> and catch such mistakes. This is especially important if you are
>>> running a custom skin.
>>
>> you are completely right, we have implemented our own skin. Using
>> the debugging functionality mentioned above we have identified a
>> function call that leads to exactly the behaviour described in your post.
>>
>> Hopefully solving that issue solves our application crash.
>>
>> Thanks for your fast and helpful support!
>
> You are welcome. As a side note, if you have followed the
> discussion, CONFIG_TRACE_IRQFLAGS is broken on ARM, Jan hopes to be
> able to fix it, but for the time being, you should disable it, or
> you risk creating other issues.
>
CONFIG_TRACE_IRQFLAGS is currently broken on several architectures with
IPIPE enabled. ppc64 certainly is, ppc32 likely, blackfin maybe. I have
not yet checked x86 while upgrading to 3.16.
--
Philippe.
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Xenomai] "inconsistent lock state" on boot-up
2014-11-10 21:58 ` Gilles Chanteperdrix
@ 2014-11-12 17:27 ` Gilles Chanteperdrix
2014-11-17 16:48 ` Jan Kiszka
0 siblings, 1 reply; 47+ messages in thread
From: Gilles Chanteperdrix @ 2014-11-12 17:27 UTC (permalink / raw)
To: Jan Kiszka; +Cc: xenomai@xenomai.org
On Mon, Nov 10, 2014 at 10:58:46PM +0100, Gilles Chanteperdrix wrote:
> On Mon, Nov 10, 2014 at 09:55:12PM +0100, Gilles Chanteperdrix wrote:
> > On Mon, Nov 10, 2014 at 09:42:22PM +0100, Jan Kiszka wrote:
> > > On 2014-11-10 21:37, Gilles Chanteperdrix wrote:
> > > > On Mon, Nov 10, 2014 at 09:28:50PM +0100, Jan Kiszka wrote:
> > > >> On 2014-11-10 21:23, Gilles Chanteperdrix wrote:
> > > >>> On Mon, Nov 10, 2014 at 09:17:18PM +0100, Jan Kiszka wrote:
> > > >>>> On 2014-11-10 21:14, Gilles Chanteperdrix wrote:
> > > >>>>> On Mon, Nov 10, 2014 at 09:10:31PM +0100, Jan Kiszka wrote:
> > > >>>>>> On 2014-11-10 21:06, Gilles Chanteperdrix wrote:
> > > >>>>>>> On Mon, Nov 10, 2014 at 09:02:58PM +0100, Jan Kiszka wrote:
> > > >>>>>>>> On 2014-11-10 21:00, Gilles Chanteperdrix wrote:
> > > >>>>>>>>> On Mon, Nov 10, 2014 at 08:55:26PM +0100, Jan Kiszka wrote:
> > > >>>>>>>>>> On 2014-11-10 20:46, Gilles Chanteperdrix wrote:
> > > >>>>>>>>>>> On Mon, Nov 10, 2014 at 07:29:58PM +0100, Jan Kiszka wrote:
> > > >>>>>>>>>>>> On 2014-11-10 16:56, Gilles Chanteperdrix wrote:
> > > >>>>>>>>>>>>> On Mon, Nov 10, 2014 at 03:52:41PM +0100, Jan Kiszka wrote:
> > > >>>>>>>>>>>>>> On 2014-11-10 13:43, Gilles Chanteperdrix wrote:
> > > >>>>>>>>>>>>>>> On Mon, Nov 10, 2014 at 09:08:47AM +0000, Stoidner, Christoph wrote:
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> Hi Gilles,
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>> Do you have the same message with exactly the same kernel
> > > >>>>>>>>>>>>>>>>> configuration, only with CONFIG_XENOMAI and CONFIG_IPIPE disabled?
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> When CONFIG_XENOMAI and CONFIG_IPIPE are disabled the message does not
> > > >>>>>>>>>>>>>>>> appear on boot-up.
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>> Do you have FCSE enabled? If yes, did you try disabling it? same
> > > >>>>>>>>>>>>>>>>> with unlocked context switch.
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> FCSE is already disabled at all.
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> Do you have an idea how to overcome the problem?
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> I am not sure the lockdep message really is a problem. lockdep could
> > > >>>>>>>>>>>>>>> be confused by the fact that the hardware interrupts are not off
> > > >>>>>>>>>>>>>>> when running the I-pipe, or because we are missing some bit in the
> > > >>>>>>>>>>>>>>> I-pipe arm specific code to get it looking at the virtual mask
> > > >>>>>>>>>>>>>>> instead of the hardware mask.
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> As for the scheduling while atomic and random segmentation fault,
> > > >>>>>>>>>>>>>>> you should use the I-pipe tracer, configure it with enough back
> > > >>>>>>>>>>>>>>> trace points, something like 1000 or 10000, and trigger a trace
> > > >>>>>>>>>>>>>>> freeze in the kernel code when the problem happens.
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> Also, for the "scheduling while atomic", it may happen if you call
> > > >>>>>>>>>>>>>>> some Linux service which reschedules from primary mode, you can try
> > > >>>>>>>>>>>>>>> enabling I-pipe debugging, and in fact all Xenomai debugging, to try
> > > >>>>>>>>>>>>>>> and catch such mistakes. This is especially important if you are
> > > >>>>>>>>>>>>>>> running a custom skin.
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> "Scheduling while atomic" may have the same reason why lockdep stumbles:
> > > >>>>>>>>>>>>>> some changes of I-pipe mess up the IRQ state tracing of Linux. I just
> > > >>>>>>>>>>>>>> started to look into this issue again. We tried earlier but got distracted.
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> I doubt that very much. Though I never run with lockdep, I sometimes
> > > >>>>>>>>>>>>> run with CONFIG_PREEMPT, and never saw this message. From what I can
> > > >>>>>>>>>>>>> see, the "scheduling while atomic" message is based on the
> > > >>>>>>>>>>>>> preempt_count only and does not use irqs_disabled() (which by the
> > > >>>>>>>>>>>>> way is known to work with I-pipe on ARM as well, so, if something is
> > > >>>>>>>>>>>>> broken, that should be something more obscure).
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> Let's see. I think I've identified one wrong path:
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> diff --git a/arch/arm/kernel/entry-header.S b/arch/arm/kernel/entry-header.S
> > > >>>>>>>>>>>> index d32f8bd..ab911f8 100644
> > > >>>>>>>>>>>> --- a/arch/arm/kernel/entry-header.S
> > > >>>>>>>>>>>> +++ b/arch/arm/kernel/entry-header.S
> > > >>>>>>>>>>>> @@ -198,7 +198,10 @@
> > > >>>>>>>>>>>> #ifdef CONFIG_TRACE_IRQFLAGS
> > > >>>>>>>>>>>> @ The parent context IRQs must have been enabled to get here in
> > > >>>>>>>>>>>> @ the first place, so there's no point checking the PSR I bit.
> > > >>>>>>>>>>>> - bl trace_hardirqs_on
> > > >>>>>>>>>>>> + tst \rpsr, #PSR_I_BIT
> > > >>>>>>>>>>>> + bleq trace_hardirqs_off
> > > >>>>>>>>>>>> + tst \rpsr, #PSR_I_BIT
> > > >>>>>>>>>>>> + blne trace_hardirqs_on
> > > >>>>>>>>>>>> #endif
> > > >>>>>>>>>>>> .else
> > > >>>>>>>>>>>> @ IRQs off again before pulling preserved data off the stack
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> This is probably no fix, but with that change applied, the warning is
> > > >>>>>>>>>>>> gone. Now the question is what to really test for when returning here. I
> > > >>>>>>>>>>>> suppose we want the pipeline state of root here - should I
> > > >>>>>>>>>>>> __ipipe_check_root_interruptible?
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> This does not make sense, read the comment above that change: there
> > > >>>>>>>>>>> is no way an interrupt can be taken, and so entering svc_entry, with
> > > >>>>>>>>>>> interrupts off. Besides this is mainline code, so it would be a
> > > >>>>>>>>>>> problem for mainline too. We are necessarily returning to a place
> > > >>>>>>>>>>> where hardware irqs were on.
> > > >>>>>>>>>>
> > > >>>>>>>>>> Did you also look at the trace I posted?
> > > >>>>>>>>>
> > > >>>>>>>>> Yes, but I did not see what I am supposed to see. The only thing I
> > > >>>>>>>>> see is that these trace functions should never have been called from
> > > >>>>>>>>> rt domain in the first place.
> > > >>>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>> There is no RT domain in the trace, only an inconsistent Linux trace
> > > >>>>>>>> state after return from IRQ.
> > > >>>>>>>
> > > >>>>>>> What can I say, when returning from IRQ, you are necessarily
> > > >>>>>>> returning to a point where irqs are ON, as the comment says, and it
> > > >>>>>>> makes perfect sense. So your "fix" should be a nop. So, something
> > > >>>>>>> else is broken.
> > > >>>>>>
> > > >>>>>> The test for selecting trace_hardirqs_off/on is wrong, that's why I
> > > >>>>>> was asking for a better check. Also, if that path can be taken by RT
> > > >>>>>> domains as well, calling trace_hardirqs_off/on was always wrong, and we
> > > >>>>>> additionally need to check for the caller's domain.
> > > >>>>>>
> > > >>>>>>>
> > > >>>>>>>>
> > > >>>>>>>>> Note that the fact that this trace_irqs stuff is not working well
> > > >>>>>>>>> may be the fact that part of them are commented with CONFIG_IPIPE
> > > >>>>>>>>> (see asm_trace_hardirqs_on_cond, asm_trace_hardirqs_off)
> > > >>>>>>>>
> > > >>>>>>>> No, that doesn't solve all issues. Even with my hack (which may not
> > > >>>>>>>> address all cases properly) plus the reversion of that commit, there are
> > > >>>>>>>> still inconsistencies.
> > > >>>>>>>
> > > >>>>>>> You can not reverse that commit, otherwise you will end-up calling
> > > >>>>>>> trace_hardirqs_on/trace_hardirqs_off from RT domain, which, I
> > > >>>>>>> repeat, can not work.
> > > >>>>>>
> > > >>>>>> I can help to understand if that is sufficient to resolve the tracing
> > > >>>>>> breakage - it isn't, there are more paths missing or wrongly instrumented.
> > > >>>>>
> > > >>>>> My idea of all this is that CONFIG_TRACE_IRQFLAGS should depend on
> > > >>>>> !IPIPE, since the I-pipe tracer provides the same functionality. And
> > > >>>>> is not broken.
> > > >>>>
> > > >>>> No, the I-pipe trace does not provide a Linux lock dependency checker,
> > > >>>> nor does it support might_sleep and such. If you have Linux drivers
> > > >>>> which depend on Xenomai directly or indirectly, you cannot validate them
> > > >>>> anymore. That's why we support this on x86.
> > > >>>
> > > >>> Since the I-pipe is already keeping track of irq state with
> > > >>> CONFIG_IPIPE_TRACE_IRQSOFF, can we not use that information instead
> > > >>> of trying and using this trace_hardirqs stuff which looks
> > > >>> irremediably broken to me?
> > > >>
> > > >> The former reflects the hw state, the latter traces the Linux state -
> > > >> from Linux POV.
> > > >
> > > > The I-pipe tracer keeps track of the root domain stall bit as well.
> > > >
> > > >>
> > > >> This is fixable. We just need to call the tracing functions where Linux
> > > >> would call it or where we replaced some Linux call with an I-pipe
> > > >> specific path and avoid calling it when the domain != root. Identifying
> > > >> those spots is tricky.
> > > >
> > > > If we take the example of an irq, we probably want not to call
> > > > trace_hardirqs_on/trace_hardirqs_off anywhere, and just rely on the
> > > > root domain stall bit.
> > >
> > > Linux tracks the IRQ state separately from the (now virtualized) real
> > > state - to validate the consistency independently of some spurious hard
> > > irq enable/disable. And it tracks per task, not per CPU. It will be more
> > > messy to fake this than to fix it, I'm quite sure.
> >
> > If we take the example of irq_svc (the example you patched). We have
> > 4 cases:
> >
> > 1- entry over root, exit over root
> > 2- entry over root, exit over non root
> > 3- entry over non root, exit over non root
> > 4- entry over non root, exit over root
>
> Sorry, it does not work like that. Only cases 1 and 3 make sense.
> Case 3 is easy, we do not need to call the trace_hardirqs functions.
> For case 1, I guess the trace_hardirqs_on at the end must be
> replaced with a test of the root domain stall bit, and call
> trace_hardirqs_on only if we return to a non-stalled root.
We do not need trace_hardirqs_on and trace_hardirqs_off for the
particular case of IRQs: they are already handled by
__ipipe_do_sync_stage.
--
Gilles.
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Xenomai] "inconsistent lock state" on boot-up
2014-11-11 17:46 ` Gilles Chanteperdrix
2014-11-11 18:04 ` Philippe Gerum
@ 2014-11-17 10:01 ` Stoidner, Christoph
2014-11-17 10:22 ` Gilles Chanteperdrix
2014-11-17 11:49 ` Philippe Gerum
1 sibling, 2 replies; 47+ messages in thread
From: Stoidner, Christoph @ 2014-11-17 10:01 UTC (permalink / raw)
To: Gilles Chanteperdrix; +Cc: xenomai@xenomai.org
Hi,
> you are completely right, we have implemented our own skin. Using
> the debugging functionality mentioned above we have identified a
> function call that leads to exactly the behaviour described in your post.
>
> Hopefully solving that issue solves our application crash.
The "scheduling while atomic" problem no longer occurs within the API calls of our own skin. However, after some run time (about 5 minutes or more) it seems to appear in the gatekeeper thread (see below).
What is not clear to me is the ipipe_raise_irq() call in the backtrace below. I could not identify any corresponding call within gatekeeper_thread(). Am I overlooking something?
What I also do not understand is the timestamp of the kernel messages. As mentioned, the messages appear about 5 minutes after kernel start, yet their timestamps are from about 40 seconds after boot. Could the messages be delayed? Or is the timestamp wrong?
NOTE: The config option CONFIG_TRACE_IRQFLAGS is disabled.
[ 40.670362] BUG: scheduling while atomic: gatekeeper/0/22/0x00010001
[ 40.670393] CPU: 0 PID: 22 Comm: gatekeeper/0 Not tainted 3.10.18-rt14-arvero-rev01-ipipe #2
[ 40.670480] [<c00130a4>] (unwind_backtrace+0x0/0xf0) from [<c0011484>] (show_stack+0x10/0x14)
[ 40.670538] [<c0011484>] (show_stack+0x10/0x14) from [<c048329c>] (__schedule_bug+0x3c/0x54)
[ 40.670581] [<c048329c>] (__schedule_bug+0x3c/0x54) from [<c04887b8>] (__schedule+0x35c/0x4f0)
[ 40.670611] [<c04887b8>] (__schedule+0x35c/0x4f0) from [<c0488980>] (schedule+0x34/0xa0)
[ 40.670646] [<c0488980>] (schedule+0x34/0xa0) from [<c0489628>] (rt_spin_lock_slowlock+0x14c/0x308)
[ 40.670684] [<c0489628>] (rt_spin_lock_slowlock+0x14c/0x308) from [<c002ad24>] (__lock_task_sighand+0x40/0x6c)
[ 40.670716] [<c002ad24>] (__lock_task_sighand+0x40/0x6c) from [<c002ad74>] (do_send_sig_info+0x24/0x64)
[ 40.670748] [<c002ad74>] (do_send_sig_info+0x24/0x64) from [<c00aaf88>] (lostage_handler+0xec/0x11c)
[ 40.670778] [<c00aaf88>] (lostage_handler+0xec/0x11c) from [<c006dfa4>] (rthal_apc_handler+0x4c/0x60)
[ 40.670810] [<c006dfa4>] (rthal_apc_handler+0x4c/0x60) from [<c00611d8>] (__ipipe_do_sync_stage+0x1f8/0x288)
[ 40.670844] [<c00611d8>] (__ipipe_do_sync_stage+0x1f8/0x288) from [<c0014224>] (ipipe_raise_irq+0x18/0x20)
[ 40.670875] [<c0014224>] (ipipe_raise_irq+0x18/0x20) from [<c00aac84>] (gatekeeper_thread+0x150/0x368)
[ 40.670918] [<c00aac84>] (gatekeeper_thread+0x150/0x368) from [<c00383f8>] (kthread+0x9c/0xa4)
[ 40.670962] [<c00383f8>] (kthread+0x9c/0xa4) from [<c000ea20>] (ret_from_fork+0x18/0x38)
[ 40.671179] ------------[ cut here ]------------
[ 40.671233] WARNING: at kernel/softirq.c:748 irq_exit+0x118/0x138()
[ 40.671260] CPU: 0 PID: 22 Comm: gatekeeper/0 Tainted: G W 3.10.18-rt14-arvero-rev01-ipipe #2
[ 40.671317] [<c00130a4>] (unwind_backtrace+0x0/0xf0) from [<c0011484>] (show_stack+0x10/0x14)
[ 40.671372] [<c0011484>] (show_stack+0x10/0x14) from [<c001b7bc>] (warn_slowpath_common+0x48/0x64)
[ 40.671416] [<c001b7bc>] (warn_slowpath_common+0x48/0x64) from [<c001b8a0>] (warn_slowpath_null+0x1c/0x24)
[ 40.671453] [<c001b8a0>] (warn_slowpath_null+0x1c/0x24) from [<c0023414>] (irq_exit+0x118/0x138)
[ 40.671487] [<c0023414>] (irq_exit+0x118/0x138) from [<c00611dc>] (__ipipe_do_sync_stage+0x1fc/0x288)
[ 40.671518] [<c00611dc>] (__ipipe_do_sync_stage+0x1fc/0x288) from [<c0014224>] (ipipe_raise_irq+0x18/0x20)
[ 40.671548] [<c0014224>] (ipipe_raise_irq+0x18/0x20) from [<c00aac84>] (gatekeeper_thread+0x150/0x368)
[ 40.671586] [<c00aac84>] (gatekeeper_thread+0x150/0x368) from [<c00383f8>] (kthread+0x9c/0xa4)
[ 40.671620] [<c00383f8>] (kthread+0x9c/0xa4) from [<c000ea20>] (ret_from_fork+0x18/0x38)
[ 40.671632] ---[ end trace 0000000000000002 ]---
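
The trailing hex value in the "scheduling while atomic" line above is the preempt_count at the time of the report (the message is printed as comm/pid/preempt_count). A minimal sketch of decoding it in C follows. The masks are an assumption taken from the field layout documented in mainline 3.10 (include/linux/hardirq.h); the -rt and I-pipe patches in use here may change that layout, and the struct and function names are made up for illustration.

```c
#include <stdint.h>

/*
 * Assumed layout (mainline 3.10, include/linux/hardirq.h):
 *   bits  0-7   preemption count
 *   bits  8-15  softirq count
 *   bits 16-25  hardirq count
 *   bit  26     NMI
 * Patched kernels may differ, so treat these masks as assumptions.
 */
#define PC_PREEMPT_MASK 0x000000ffu
#define PC_SOFTIRQ_MASK 0x0000ff00u
#define PC_HARDIRQ_MASK 0x03ff0000u
#define PC_NMI_MASK     0x04000000u

/* Hypothetical helper type holding the decoded fields. */
struct preempt_fields {
    unsigned preempt;
    unsigned softirq;
    unsigned hardirq;
    unsigned nmi;
};

/* Split a raw preempt_count value into its per-context counters. */
static struct preempt_fields decode_preempt_count(uint32_t count)
{
    struct preempt_fields f = {
        .preempt = count & PC_PREEMPT_MASK,
        .softirq = (count & PC_SOFTIRQ_MASK) >> 8,
        .hardirq = (count & PC_HARDIRQ_MASK) >> 16,
        .nmi     = (count & PC_NMI_MASK) >> 26,
    };
    return f;
}
```

Under that assumed layout, 0x00010001 decodes to a hardirq count of 1 and a preemption count of 1. That would be consistent with the backtrace, where schedule() is reached from a handler run by __ipipe_do_sync_stage.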
Thanks in advance,
Christoph
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Xenomai] "inconsistent lock state" on boot-up
2014-11-17 10:01 ` Stoidner, Christoph
@ 2014-11-17 10:22 ` Gilles Chanteperdrix
2014-11-17 11:13 ` Stoidner, Christoph
2014-11-17 11:49 ` Philippe Gerum
1 sibling, 1 reply; 47+ messages in thread
From: Gilles Chanteperdrix @ 2014-11-17 10:22 UTC (permalink / raw)
To: Stoidner, Christoph; +Cc: xenomai@xenomai.org
On Mon, Nov 17, 2014 at 10:01:40AM +0000, Stoidner, Christoph wrote:
>
> Hi,
>
> > you are completely right, we have implemented our own skin. Using
> > the debugging functionality mentioned above we have identified a
> > function-call that leads to exact the behaviour as described in your post.
> >
> > Hopefully solving that issue solves our application crash.
>
> Now the problem "scheduling while atomic" does not occur anymore
> within API calls of our own skin. However, after some run-time
> (about 5 minutes or more) it seems to appear in gatekeeper-thread
> (see below).
>
> What is not clear to me is the ipipe_raise_irq() call in backtrace
> below. I could not identify any according call from within
> gatekeeper_thread(). Do I overlook something?
>
> What I also do not understand is the timestamp of kernel message.
> As mentioned the messages do appear about 5 minutes after kernel
> start. However the message's timestamp are from about 40 seconds
> after boot? Could it happen that messages are delayed? Or is the
> timestamp wrong?
This would probably mean an issue with the tsc emulation. Have you
tried running the "tsc" program from the Xenomai regression testsuite
with the -w option? I remember that the imx28 tsc emulation is a
bit weird: the hardware sometimes returns wrong values, and the
support answer was "read it twice, until you get the same value
twice". But I never found this really satisfactory: what if reading
it twice returns the same wrong value twice?
The tsc test should show whether the tsc wrapping is working fine. You can
try running it several times, or even in parallel with your tests, to
see whether it detects any problem.
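The "read it twice" workaround quoted above could be sketched roughly as follows. This is only an illustration, not the actual i.MX28 or I-pipe code: read_counter_fn, read_counter_stable() and the scripted_counter test double are hypothetical names, and the retry limit is an added assumption. As noted above, such a loop cannot catch the case where the hardware returns the same wrong value twice in a row.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical raw counter read; a real driver would read a timer register. */
typedef uint32_t (*read_counter_fn)(void *ctx);

/*
 * Read the counter until two consecutive reads agree, giving up after
 * max_tries extra reads. Returns 0 and stores the value in *out on
 * success, or -1 if no two consecutive reads matched.
 * Limitation: two identical wrong values in a row are accepted.
 */
int read_counter_stable(read_counter_fn read, void *ctx,
                        unsigned int max_tries, uint32_t *out)
{
    uint32_t prev, cur;
    unsigned int i;

    assert(read != NULL && out != NULL);

    prev = read(ctx);
    for (i = 0; i < max_tries; i++) {
        cur = read(ctx);
        if (cur == prev) {
            *out = cur;
            return 0;
        }
        prev = cur;
    }
    return -1;
}

/* Test double: replays a fixed sequence of values, repeating the last one. */
struct scripted_counter {
    const uint32_t *values;
    size_t count;
    size_t pos;
};

uint32_t scripted_read(void *ctx)
{
    struct scripted_counter *sc = ctx;
    uint32_t v = sc->values[sc->pos];

    if (sc->pos + 1 < sc->count)
        sc->pos++;
    return v;
}
```

The sketch only makes sense when the counter ticks slowly compared with two back-to-back reads; otherwise two consecutive reads rarely agree and the loop runs out of retries.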
--
Gilles.
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Xenomai] "inconsistent lock state" on boot-up
2014-11-17 10:22 ` Gilles Chanteperdrix
@ 2014-11-17 11:13 ` Stoidner, Christoph
2014-11-17 11:30 ` Philippe Gerum
0 siblings, 1 reply; 47+ messages in thread
From: Stoidner, Christoph @ 2014-11-17 11:13 UTC (permalink / raw)
To: Gilles Chanteperdrix; +Cc: xenomai@xenomai.org
>>
>> Now the problem "scheduling while atomic" does not occur anymore
>> within API calls of our own skin. However, after some run-time
>> (about 5 minutes or more) it seems to appear in gatekeeper-thread
>> (see below).
>>
>> What is not clear to me is the ipipe_raise_irq() call in backtrace
>> below. I could not identify any according call from within
>> gatekeeper_thread(). Do I overlook something?
>>
>> What I also do not understand is the timestamp of kernel message.
>> As mentioned the messages do appear about 5 minutes after kernel
>> start. However the message's timestamp are from about 40 seconds
>> after boot? Could it happen that messages are delayed? Or is the
>> timestamp wrong?
>
> This would probably mean an issue with the tsc emulation, have you
> tried running the "tsc" program, from xenomai regression testsuite
> with the -w option ? I remember than the imx28 tsc emulation is a
> bit weird, the hardware sometimes returns wrong values, and the
> support answer was "read it twice, until you get twice the same
> value". But I never found this really satisfactory: what if reading
> it twice returns the same wrong value twice ?
>
> The tsc test should see if the tsc wrapping is doing fine. You can
> try to run it several time, or even in parallel to your tests, to
> see if it does not detect any problem.
There are some other kernel messages whose timestamps seem to be correct, e.g. when creating a semaphore (as below):
[ 17.336237] Xenomai: registered exported object @CGI (semaphores)
[ 17.344122] Xenomai: registered exported object LOG (msgx)
I would expect these messages at program start, which would also match the timestamps shown. However, these messages are also printed late, after 5 minutes of run-time, at exactly the same time as "scheduling while atomic" is shown. So now I am assuming the timestamps are valid but the messages are displayed with a delay. However, I feel this has nothing to do with my main problem: the program crash. So maybe I should open a new thread for "delayed kernel message or wrong time stamp".
Back to topic: Do you have any idea why "scheduling while atomic" is thrown by gatekeeper_thread(), based on the backtrace? Or do you know where ipipe_raise_irq() is called from the gatekeeper thread, and whether that would be legal/expected?
Regards,
Christoph
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Xenomai] "inconsistent lock state" on boot-up
2014-11-17 11:13 ` Stoidner, Christoph
@ 2014-11-17 11:30 ` Philippe Gerum
2014-11-17 13:16 ` Gilles Chanteperdrix
0 siblings, 1 reply; 47+ messages in thread
From: Philippe Gerum @ 2014-11-17 11:30 UTC (permalink / raw)
To: Stoidner, Christoph, Gilles Chanteperdrix; +Cc: xenomai@xenomai.org
On 11/17/2014 12:13 PM, Stoidner, Christoph wrote:
>
>>>
>>> Now the problem "scheduling while atomic" does not occur anymore
>>> within API calls of our own skin. However, after some run-time
>>> (about 5 minutes or more) it seems to appear in gatekeeper-thread
>>> (see below).
>>>
>>> What is not clear to me is the ipipe_raise_irq() call in backtrace
>>> below. I could not identify any according call from within
>>> gatekeeper_thread(). Do I overlook something?
>>>
>>> What I also do not understand is the timestamp of kernel message.
>>> As mentioned the messages do appear about 5 minutes after kernel
>>> start. However the message's timestamp are from about 40 seconds
>>> after boot? Could it happen that messages are delayed? Or is the
>>> timestamp wrong?
>>
>> This would probably mean an issue with the tsc emulation, have you
>> tried running the "tsc" program, from xenomai regression testsuite
>> with the -w option ? I remember than the imx28 tsc emulation is a
>> bit weird, the hardware sometimes returns wrong values, and the
>> support answer was "read it twice, until you get twice the same
>> value". But I never found this really satisfactory: what if reading
>> it twice returns the same wrong value twice ?
>>
>> The tsc test should see if the tsc wrapping is doing fine. You can
>> try to run it several time, or even in parallel to your tests, to
>> see if it does not detect any problem.
>
> There are some other kernel message whose's timestamp seems to be correct. E.g. when creating a semaphore (as below):
>
> [ 17.336237] Xenomai: registered exported object @CGI (semaphores)
> [ 17.344122] Xenomai: registered exported object LOG (msgx)
>
> I would expect these message on program start which would also match the shown timestamp. However these message are also outputted late after 5 minutes run-time, exact same time when "scheduling while atomic" is showed. So now I am assuming the timestamp is valid but messages are delayed shown. However I feel this has nothing to do with my main problem: the program crash. So maybe I should open a new thread for "delayed kernel message or wrong time stamp".
>
> Back to topic: Do you have any idea why "scheduling while atomic" is thrown by gatekeeper_thread(), based on the backtrace? Or do you know on which place ipipe_raise_irq() is called from gatekeeper thread respectively if that would be legal/expected?
>
You seem to be running a preempt-rt patched kernel, but the Xenomai core
acts as if it was built for a regular preemption kernel. This virq is
triggered by some code in the Xenomai rescheduling when the caller runs
in secondary mode, which the gatekeeper always does. This code is
correct; what is not correct is the way the APC code in Xenomai handles
it, due to this apparent build mismatch.
--
Philippe.
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Xenomai] "inconsistent lock state" on boot-up
2014-11-17 10:01 ` Stoidner, Christoph
2014-11-17 10:22 ` Gilles Chanteperdrix
@ 2014-11-17 11:49 ` Philippe Gerum
2014-11-17 11:51 ` Philippe Gerum
2014-11-17 13:10 ` Gilles Chanteperdrix
1 sibling, 2 replies; 47+ messages in thread
From: Philippe Gerum @ 2014-11-17 11:49 UTC (permalink / raw)
To: Stoidner, Christoph, Gilles Chanteperdrix; +Cc: xenomai@xenomai.org
On 11/17/2014 11:01 AM, Stoidner, Christoph wrote:
>
> Hi,
>
>> you are completely right, we have implemented our own skin. Using
>> the debugging functionality mentioned above we have identified a
>> function-call that leads to exact the behaviour as described in your post.
>>
>> Hopefully solving that issue solves our application crash.
>
> Now the problem "scheduling while atomic" does not occur anymore within API calls of our own skin. However, after some run-time (about 5 minutes or more) it seems to appear in gatekeeper-thread (see below).
>
> What is not clear to me is the ipipe_raise_irq() call in backtrace below. I could not identify any according call from within gatekeeper_thread(). Do I overlook something?
>
You could match the closest routine to the calling PC value (c00aac84)
using addr2line on your kernel image.
e.g. arm-linux-gnueabihf-addr2line -e vmlinux -a c00aac84
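For reference, the "symbol+offset/size" form already printed in the backtrace (e.g. gatekeeper_thread+0x150/0x368) comes from the same kind of closest-preceding-symbol match; addr2line goes one step further and uses debug info to map the address to a file and line. A minimal sketch of that match, assuming a small sorted table whose start addresses are derived from the backtrace in this thread (PC minus the printed offset); struct ksym, example_syms and ksym_lookup() are hypothetical names, not kernel or binutils APIs:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct ksym {
    uint32_t start;     /* first address of the routine */
    uint32_t size;      /* length of the routine in bytes */
    const char *name;
};

/* Sorted by start address; values derived from the backtrace above. */
const struct ksym example_syms[] = {
    { 0xc001420c, 0x20,  "ipipe_raise_irq" },   /* c0014224 - 0x18  */
    { 0xc003835c, 0xa4,  "kthread" },           /* c00383f8 - 0x9c  */
    { 0xc00aab34, 0x368, "gatekeeper_thread" }, /* c00aac84 - 0x150 */
};

/*
 * Return the symbol containing pc and store the offset into it,
 * or NULL if pc falls outside every known routine.
 */
const struct ksym *ksym_lookup(const struct ksym *tab, size_t n,
                               uint32_t pc, uint32_t *offset)
{
    size_t lo = 0, hi = n;
    const struct ksym *sym;

    assert(tab != NULL && offset != NULL);

    /* Binary search for the first entry whose start is above pc. */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (tab[mid].start <= pc)
            lo = mid + 1;
        else
            hi = mid;
    }
    if (lo == 0)
        return NULL;            /* pc is below the first symbol */

    sym = &tab[lo - 1];         /* closest symbol at or below pc */
    if (pc - sym->start >= sym->size)
        return NULL;            /* pc lies in a gap after the symbol */

    *offset = pc - sym->start;
    return sym;
}
```

With this table, looking up c00aac84 yields gatekeeper_thread with offset 0x150, matching the backtrace line.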
--
Philippe.
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Xenomai] "inconsistent lock state" on boot-up
2014-11-17 11:49 ` Philippe Gerum
@ 2014-11-17 11:51 ` Philippe Gerum
2014-11-17 13:10 ` Gilles Chanteperdrix
1 sibling, 0 replies; 47+ messages in thread
From: Philippe Gerum @ 2014-11-17 11:51 UTC (permalink / raw)
To: Stoidner, Christoph, Gilles Chanteperdrix; +Cc: xenomai@xenomai.org
On 11/17/2014 12:49 PM, Philippe Gerum wrote:
> On 11/17/2014 11:01 AM, Stoidner, Christoph wrote:
>>
>> Hi,
>>
>>> you are completely right, we have implemented our own skin. Using
>>> the debugging functionality mentioned above we have identified a
>>> function-call that leads to exact the behaviour as described in your post.
>>>
>>> Hopefully solving that issue solves our application crash.
>>
>> Now the problem "scheduling while atomic" does not occur anymore within API calls of our own skin. However, after some run-time (about 5 minutes or more) it seems to appear in gatekeeper-thread (see below).
>>
>> What is not clear to me is the ipipe_raise_irq() call in backtrace below. I could not identify any according call from within gatekeeper_thread(). Do I overlook something?
>>
>
> You could match the closest routine to the calling PC value (c00aac84)
> using add2line on your kernel image.
>
> e.g. arm-linux-gnueabihf-addr2line -e vmlinux -a c00aac84
>
You may need CONFIG_DEBUG_INFO enabled.
--
Philippe.
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Xenomai] "inconsistent lock state" on boot-up
2014-11-17 11:49 ` Philippe Gerum
2014-11-17 11:51 ` Philippe Gerum
@ 2014-11-17 13:10 ` Gilles Chanteperdrix
2014-11-17 13:33 ` Philippe Gerum
1 sibling, 1 reply; 47+ messages in thread
From: Gilles Chanteperdrix @ 2014-11-17 13:10 UTC (permalink / raw)
To: Philippe Gerum; +Cc: xenomai@xenomai.org
On Mon, Nov 17, 2014 at 12:49:01PM +0100, Philippe Gerum wrote:
> On 11/17/2014 11:01 AM, Stoidner, Christoph wrote:
> >
> > Hi,
> >
> >> you are completely right, we have implemented our own skin. Using
> >> the debugging functionality mentioned above we have identified a
> >> function-call that leads to exact the behaviour as described in your post.
> >>
> >> Hopefully solving that issue solves our application crash.
> >
> > Now the problem "scheduling while atomic" does not occur anymore within API calls of our own skin. However, after some run-time (about 5 minutes or more) it seems to appear in gatekeeper-thread (see below).
> >
> > What is not clear to me is the ipipe_raise_irq() call in backtrace below. I could not identify any according call from within gatekeeper_thread(). Do I overlook something?
> >
>
> You could match the closest routine to the calling PC value (c00aac84)
> using add2line on your kernel image.
>
> e.g. arm-linux-gnueabihf-addr2line -e vmlinux -a c00aac84
That would rather be:
arm-none-linux-gnueabi-addr2line
imx28 is an armv5 (who said that armv4 and armv5 were no longer in
circulation ?).
:-)
just nitpicking.
If you have the multiarch binutils version installed, the default
addr2line should work as well.
--
Gilles.
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Xenomai] "inconsistent lock state" on boot-up
2014-11-17 11:30 ` Philippe Gerum
@ 2014-11-17 13:16 ` Gilles Chanteperdrix
0 siblings, 0 replies; 47+ messages in thread
From: Gilles Chanteperdrix @ 2014-11-17 13:16 UTC (permalink / raw)
To: Philippe Gerum; +Cc: xenomai@xenomai.org
On Mon, Nov 17, 2014 at 12:30:28PM +0100, Philippe Gerum wrote:
> On 11/17/2014 12:13 PM, Stoidner, Christoph wrote:
> >
> >>>
> >>> Now the problem "scheduling while atomic" does not occur anymore
> >>> within API calls of our own skin. However, after some run-time
> >>> (about 5 minutes or more) it seems to appear in gatekeeper-thread
> >>> (see below).
> >>>
> >>> What is not clear to me is the ipipe_raise_irq() call in backtrace
> >>> below. I could not identify any according call from within
> >>> gatekeeper_thread(). Do I overlook something?
> >>>
> >>> What I also do not understand is the timestamp of kernel message.
> >>> As mentioned the messages do appear about 5 minutes after kernel
> >>> start. However the message's timestamp are from about 40 seconds
> >>> after boot? Could it happen that messages are delayed? Or is the
> >>> timestamp wrong?
> >>
> >> This would probably mean an issue with the tsc emulation, have you
> >> tried running the "tsc" program, from xenomai regression testsuite
> >> with the -w option ? I remember than the imx28 tsc emulation is a
> >> bit weird, the hardware sometimes returns wrong values, and the
> >> support answer was "read it twice, until you get twice the same
> >> value". But I never found this really satisfactory: what if reading
> >> it twice returns the same wrong value twice ?
> >>
> >> The tsc test should see if the tsc wrapping is doing fine. You can
> >> try to run it several time, or even in parallel to your tests, to
> >> see if it does not detect any problem.
> >
> > There are some other kernel message whose's timestamp seems to be correct. E.g. when creating a semaphore (as below):
> >
> > [ 17.336237] Xenomai: registered exported object @CGI (semaphores)
> > [ 17.344122] Xenomai: registered exported object LOG (msgx)
> >
> > I would expect these message on program start which would also match the shown timestamp. However these message are also outputted late after 5 minutes run-time, exact same time when "scheduling while atomic" is showed. So now I am assuming the timestamp is valid but messages are delayed shown. However I feel this has nothing to do with my main problem: the program crash. So maybe I should open a new thread for "delayed kernel message or wrong time stamp".
> >
> > Back to topic: Do you have any idea why "scheduling while atomic" is thrown by gatekeeper_thread(), based on the backtrace? Or do you know on which place ipipe_raise_irq() is called from gatekeeper thread respectively if that would be legal/expected?
> >
>
> You seem to be running a preempt-rt patched kernel, but the Xenomai core
> acts as if it was built for a regular preemption kernel. This virq is
> triggered by some code in the Xenomai rescheduling when the caller runs
> in secondary mode, which the gatekeeper always does. This code is
> correct, the way it is handled by the APC code in Xenomai due to this
> apparent build mismatch is not.
Note that if you are looking for low latencies on imx28 (ok, not
bounded, but much lower on average), you would probably be better
off using the FCSE extension; it also improves Linux latency. If the
guaranteed mode has too many restrictions for your use case, use the
best-effort mode, which still reduces the latency on average.
For instance, on the at91sam9263 I use to test Xenomai on armv5,
enabling the FCSE in best-effort mode divides hackbench run time by
2, for the same arguments of hackbench.
http://sisyphus.hd.free.fr/~gilles/pub/fcse/hackbench-fcse-v4.png
--
Gilles.
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Xenomai] "inconsistent lock state" on boot-up
2014-11-17 13:10 ` Gilles Chanteperdrix
@ 2014-11-17 13:33 ` Philippe Gerum
0 siblings, 0 replies; 47+ messages in thread
From: Philippe Gerum @ 2014-11-17 13:33 UTC (permalink / raw)
To: Gilles Chanteperdrix; +Cc: xenomai@xenomai.org
On 11/17/2014 02:10 PM, Gilles Chanteperdrix wrote:
> On Mon, Nov 17, 2014 at 12:49:01PM +0100, Philippe Gerum wrote:
>> On 11/17/2014 11:01 AM, Stoidner, Christoph wrote:
>>>
>>> Hi,
>>>
>>>> you are completely right, we have implemented our own skin. Using
>>>> the debugging functionality mentioned above we have identified a
>>>> function-call that leads to exact the behaviour as described in your post.
>>>>
>>>> Hopefully solving that issue solves our application crash.
>>>
>>> Now the problem "scheduling while atomic" does not occur anymore within API calls of our own skin. However, after some run-time (about 5 minutes or more) it seems to appear in gatekeeper-thread (see below).
>>>
>>> What is not clear to me is the ipipe_raise_irq() call in backtrace below. I could not identify any according call from within gatekeeper_thread(). Do I overlook something?
>>>
>>
>> You could match the closest routine to the calling PC value (c00aac84)
>> using add2line on your kernel image.
>>
>> e.g. arm-linux-gnueabihf-addr2line -e vmlinux -a c00aac84
>
> That would rather be:
> arm-none-linux-gnueabi-addr2line
>
> imx28 is an armv5 (who said that armv4 and armv5 were no longer in
> circulation ?).
>
> :-)
> just nitpicking.
>
> If you have a the multiarch binutils version installed, the default
> addr2line should work as well.
>
"e.g." precisely stands for this, "to be replaced by the command that
fits". I'm not building for v4/5 these days.
--
Philippe.
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Xenomai] "inconsistent lock state" on boot-up
2014-11-12 17:27 ` Gilles Chanteperdrix
@ 2014-11-17 16:48 ` Jan Kiszka
2014-11-17 16:59 ` Gilles Chanteperdrix
0 siblings, 1 reply; 47+ messages in thread
From: Jan Kiszka @ 2014-11-17 16:48 UTC (permalink / raw)
To: Gilles Chanteperdrix; +Cc: xenomai@xenomai.org
On 2014-11-12 18:27, Gilles Chanteperdrix wrote:
> We do not need trace_hardirqs_on and trace_hardirqs_off for the
> particular case of IRQs: they are already handled by
> __ipipe_do_sync_stage.
That was the key: Simply disabling the instrumentation in the
CONFIG_IPIPE case removes all lock state inconsistencies, at least so far:
diff --git a/arch/arm/kernel/entry-header.S b/arch/arm/kernel/entry-header.S
index d32f8bd..d8e0b2c 100644
--- a/arch/arm/kernel/entry-header.S
+++ b/arch/arm/kernel/entry-header.S
@@ -195,7 +195,7 @@
#ifdef CONFIG_IPIPE_DEBUG_INTERNAL
bl __ipipe_bugon_irqs_enabled
#endif
-#ifdef CONFIG_TRACE_IRQFLAGS
+#if defined(CONFIG_TRACE_IRQFLAGS) && !defined(CONFIG_IPIPE)
@ The parent context IRQs must have been enabled to get here in
@ the first place, so there's no point checking the PSR I bit.
bl trace_hardirqs_on
@@ -203,7 +203,7 @@
.else
@ IRQs off again before pulling preserved data off the stack
disable_irq_notrace
-#ifdef CONFIG_TRACE_IRQFLAGS
+#if defined(CONFIG_TRACE_IRQFLAGS) && !defined(CONFIG_IPIPE)
tst \rpsr, #PSR_I_BIT
bleq trace_hardirqs_on
tst \rpsr, #PSR_I_BIT
Will send a patch.
Jan
--
Siemens AG, Corporate Technology, CT RTC ITP SES-DE
Corporate Competence Center Embedded Linux
^ permalink raw reply related [flat|nested] 47+ messages in thread
* Re: [Xenomai] "inconsistent lock state" on boot-up
2014-11-17 16:48 ` Jan Kiszka
@ 2014-11-17 16:59 ` Gilles Chanteperdrix
2014-11-17 17:11 ` Jan Kiszka
0 siblings, 1 reply; 47+ messages in thread
From: Gilles Chanteperdrix @ 2014-11-17 16:59 UTC (permalink / raw)
To: Jan Kiszka; +Cc: xenomai@xenomai.org
On Mon, Nov 17, 2014 at 05:48:00PM +0100, Jan Kiszka wrote:
> On 2014-11-12 18:27, Gilles Chanteperdrix wrote:
> > We do not need trace_hardirqs_on and trace_hardirqs_off for the
> > particular case of IRQs: they are already handled by
> > __ipipe_do_sync_stage.
>
> That was the key: Simply disabling the instrumentations in the
> CONFIG_IPIPE removes all lock state inconsistencies, at least this far:
>
> diff --git a/arch/arm/kernel/entry-header.S b/arch/arm/kernel/entry-header.S
> index d32f8bd..d8e0b2c 100644
> --- a/arch/arm/kernel/entry-header.S
> +++ b/arch/arm/kernel/entry-header.S
> @@ -195,7 +195,7 @@
> #ifdef CONFIG_IPIPE_DEBUG_INTERNAL
> bl __ipipe_bugon_irqs_enabled
> #endif
> -#ifdef CONFIG_TRACE_IRQFLAGS
> +#if defined(CONFIG_TRACE_IRQFLAGS) && !defined(CONFIG_IPIPE)
> @ The parent context IRQs must have been enabled to get here in
> @ the first place, so there's no point checking the PSR I bit.
> bl trace_hardirqs_on
> @@ -203,7 +203,7 @@
> .else
> @ IRQs off again before pulling preserved data off the stack
> disable_irq_notrace
> -#ifdef CONFIG_TRACE_IRQFLAGS
> +#if defined(CONFIG_TRACE_IRQFLAGS) && !defined(CONFIG_IPIPE)
> tst \rpsr, #PSR_I_BIT
> bleq trace_hardirqs_on
> tst \rpsr, #PSR_I_BIT
>
> Will send a patch.
Will this work for other paths in entry.S, such as exceptions or
syscalls?
--
Gilles.
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Xenomai] "inconsistent lock state" on boot-up
2014-11-17 16:59 ` Gilles Chanteperdrix
@ 2014-11-17 17:11 ` Jan Kiszka
2014-11-17 17:33 ` Gilles Chanteperdrix
0 siblings, 1 reply; 47+ messages in thread
From: Jan Kiszka @ 2014-11-17 17:11 UTC (permalink / raw)
To: Gilles Chanteperdrix; +Cc: xenomai@xenomai.org
On 2014-11-17 17:59, Gilles Chanteperdrix wrote:
> On Mon, Nov 17, 2014 at 05:48:00PM +0100, Jan Kiszka wrote:
>> On 2014-11-12 18:27, Gilles Chanteperdrix wrote:
>>> We do not need trace_hardirqs_on and trace_hardirqs_off for the
>>> particular case of IRQs: they are already handled by
>>> __ipipe_do_sync_stage.
>>
>> That was the key: Simply disabling the instrumentations in the
>> CONFIG_IPIPE removes all lock state inconsistencies, at least this far:
>>
>> diff --git a/arch/arm/kernel/entry-header.S b/arch/arm/kernel/entry-header.S
>> index d32f8bd..d8e0b2c 100644
>> --- a/arch/arm/kernel/entry-header.S
>> +++ b/arch/arm/kernel/entry-header.S
>> @@ -195,7 +195,7 @@
>> #ifdef CONFIG_IPIPE_DEBUG_INTERNAL
>> bl __ipipe_bugon_irqs_enabled
>> #endif
>> -#ifdef CONFIG_TRACE_IRQFLAGS
>> +#if defined(CONFIG_TRACE_IRQFLAGS) && !defined(CONFIG_IPIPE)
>> @ The parent context IRQs must have been enabled to get here in
>> @ the first place, so there's no point checking the PSR I bit.
>> bl trace_hardirqs_on
>> @@ -203,7 +203,7 @@
>> .else
>> @ IRQs off again before pulling preserved data off the stack
>> disable_irq_notrace
>> -#ifdef CONFIG_TRACE_IRQFLAGS
>> +#if defined(CONFIG_TRACE_IRQFLAGS) && !defined(CONFIG_IPIPE)
>> tst \rpsr, #PSR_I_BIT
>> bleq trace_hardirqs_on
>> tst \rpsr, #PSR_I_BIT
>>
>> Will send a patch.
>
> Will this work for other paths in entry.S, such as exceptions or
> syscalls?
Do they all go through that code? Then we need to differentiate, likely
via a separate macro parameter.
Just noticed that there is also svc_enter, and that should be handled in
the same way. And it's likely also shared across the board.
Jan
--
Siemens AG, Corporate Technology, CT RTC ITP SES-DE
Corporate Competence Center Embedded Linux
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Xenomai] "inconsistent lock state" on boot-up
2014-11-17 17:11 ` Jan Kiszka
@ 2014-11-17 17:33 ` Gilles Chanteperdrix
2014-11-17 19:07 ` Jan Kiszka
0 siblings, 1 reply; 47+ messages in thread
From: Gilles Chanteperdrix @ 2014-11-17 17:33 UTC (permalink / raw)
To: Jan Kiszka; +Cc: xenomai@xenomai.org
On Mon, Nov 17, 2014 at 06:11:44PM +0100, Jan Kiszka wrote:
> On 2014-11-17 17:59, Gilles Chanteperdrix wrote:
> > On Mon, Nov 17, 2014 at 05:48:00PM +0100, Jan Kiszka wrote:
> >> On 2014-11-12 18:27, Gilles Chanteperdrix wrote:
> >>> We do not need trace_hardirqs_on and trace_hardirqs_off for the
> >>> particular case of IRQs: they are already handled by
> >>> __ipipe_do_sync_stage.
> >>
> >> That was the key: Simply disabling the instrumentations in the
> >> CONFIG_IPIPE removes all lock state inconsistencies, at least this far:
> >>
> >> diff --git a/arch/arm/kernel/entry-header.S b/arch/arm/kernel/entry-header.S
> >> index d32f8bd..d8e0b2c 100644
> >> --- a/arch/arm/kernel/entry-header.S
> >> +++ b/arch/arm/kernel/entry-header.S
> >> @@ -195,7 +195,7 @@
> >> #ifdef CONFIG_IPIPE_DEBUG_INTERNAL
> >> bl __ipipe_bugon_irqs_enabled
> >> #endif
> >> -#ifdef CONFIG_TRACE_IRQFLAGS
> >> +#if defined(CONFIG_TRACE_IRQFLAGS) && !defined(CONFIG_IPIPE)
> >> @ The parent context IRQs must have been enabled to get here in
> >> @ the first place, so there's no point checking the PSR I bit.
> >> bl trace_hardirqs_on
> >> @@ -203,7 +203,7 @@
> >> .else
> >> @ IRQs off again before pulling preserved data off the stack
> >> disable_irq_notrace
> >> -#ifdef CONFIG_TRACE_IRQFLAGS
> >> +#if defined(CONFIG_TRACE_IRQFLAGS) && !defined(CONFIG_IPIPE)
> >> tst \rpsr, #PSR_I_BIT
> >> bleq trace_hardirqs_on
> >> tst \rpsr, #PSR_I_BIT
> >>
> >> Will send a patch.
> >
> > Will this work for other paths in entry.S, such as exceptions or
> > syscalls?
>
> Do they all come along that code? Then we need to differentiate, likely
> via a separate macro parameter.
>
> Just noticed that there is also svc_enter, and that should be handled in
> the same way. And it's likely also shared across the board.
There are 4 macros:
svc_enter
svc_exit
when entering/exiting svc mode (whether from irq, data abort, or
prefetch abort), that is, when reentering the
irq/exception path while already in kernel mode
usr_enter
usr_exit
when entering/exiting usr mode (whether from irq, data abort,
prefetch abort, or syscall), that is, when the path is entered from user mode.
All these paths call trace_hardirqs_on/trace_hardirqs_off.
I have not checked the details on the how and when and if, but since
you are the one working on this, I suggest you do.
If there is a need to call the real
trace_hardirqs_on/trace_hardirqs_off in some cases, I would very
much prefer replacing the bl trace_hardirqs* with a bl
__ipipe_trace_hardirqs* and sorting out the details in C, in
arch/arm/kernel/ipipe.c, rather than doing this in assembly files with
complicated #if conditions, or retrieval of the current domain
in assembly.
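The idea of moving the decision into C could look roughly like the user-space simulation below. It is a sketch under assumptions, not code from the I-pipe patch: every sim_* name is hypothetical, sim_root_domain stands in for whatever "is the root (Linux) domain current?" test the real code would use, and the two counters stand in for the real trace_hardirqs_on/trace_hardirqs_off.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for kernel state, hypothetical and for illustration only. */
bool sim_root_domain = true;    /* models "the root domain is current" */
int sim_trace_on_calls;         /* calls that reached the real tracer */
int sim_trace_off_calls;

/* Stand-ins for the real trace_hardirqs_on/trace_hardirqs_off. */
void sim_trace_hardirqs_on(void)
{
    sim_trace_on_calls++;
}

void sim_trace_hardirqs_off(void)
{
    sim_trace_off_calls++;
}

/*
 * Wrappers the assembly entry code would branch to instead of calling
 * the tracers directly. The domain check lives here in C, so the
 * assembly needs neither extra #if conditions nor a domain lookup.
 */
void sim_ipipe_trace_hardirqs_on(void)
{
    if (!sim_root_domain)
        return;     /* not the root domain: skip the tracer */
    sim_trace_hardirqs_on();
}

void sim_ipipe_trace_hardirqs_off(void)
{
    if (!sim_root_domain)
        return;
    sim_trace_hardirqs_off();
}
```

Keeping the filter in one C function per hook means the assembly call sites stay identical for every entry path, and any per-path exception can be handled by adding a parameter to the wrapper.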
--
Gilles.
^ permalink raw reply [flat|nested] 47+ messages in thread
* Re: [Xenomai] "inconsistent lock state" on boot-up
2014-11-17 17:33 ` Gilles Chanteperdrix
@ 2014-11-17 19:07 ` Jan Kiszka
2014-11-17 19:24 ` Gilles Chanteperdrix
0 siblings, 1 reply; 47+ messages in thread
From: Jan Kiszka @ 2014-11-17 19:07 UTC (permalink / raw)
To: Gilles Chanteperdrix; +Cc: xenomai@xenomai.org
On 2014-11-17 18:33, Gilles Chanteperdrix wrote:
> On Mon, Nov 17, 2014 at 06:11:44PM +0100, Jan Kiszka wrote:
>> On 2014-11-17 17:59, Gilles Chanteperdrix wrote:
>>> On Mon, Nov 17, 2014 at 05:48:00PM +0100, Jan Kiszka wrote:
>>>> On 2014-11-12 18:27, Gilles Chanteperdrix wrote:
>>>>> We do not need trace_hardirqs_on and trace_hardirqs_off for the
>>>>> particular case of IRQs: they are already handled by
>>>>> __ipipe_do_sync_stage.
>>>>
>>>> That was the key: Simply disabling the instrumentations in the
>>>> CONFIG_IPIPE removes all lock state inconsistencies, at least this far:
>>>>
>>>> diff --git a/arch/arm/kernel/entry-header.S b/arch/arm/kernel/entry-header.S
>>>> index d32f8bd..d8e0b2c 100644
>>>> --- a/arch/arm/kernel/entry-header.S
>>>> +++ b/arch/arm/kernel/entry-header.S
>>>> @@ -195,7 +195,7 @@
>>>> #ifdef CONFIG_IPIPE_DEBUG_INTERNAL
>>>> bl __ipipe_bugon_irqs_enabled
>>>> #endif
>>>> -#ifdef CONFIG_TRACE_IRQFLAGS
>>>> +#if defined(CONFIG_TRACE_IRQFLAGS) && !defined(CONFIG_IPIPE)
>>>> @ The parent context IRQs must have been enabled to get here in
>>>> @ the first place, so there's no point checking the PSR I bit.
>>>> bl trace_hardirqs_on
>>>> @@ -203,7 +203,7 @@
>>>> .else
>>>> @ IRQs off again before pulling preserved data off the stack
>>>> disable_irq_notrace
>>>> -#ifdef CONFIG_TRACE_IRQFLAGS
>>>> +#if defined(CONFIG_TRACE_IRQFLAGS) && !defined(CONFIG_IPIPE)
>>>> tst \rpsr, #PSR_I_BIT
>>>> bleq trace_hardirqs_on
>>>> tst \rpsr, #PSR_I_BIT
>>>>
>>>> Will send a patch.
>>>
>>> Will this work for other paths in entry.S, such as exceptions or
>>> syscalls?
>>
>> Do they all come along that code? Then we need to differentiate, likely
>> via a separate macro parameter.
>>
>> Just noticed that there is also svc_enter, and that should be handled in
>> the same way. And it's likely also shared across the board.
>
> There are 4 macros:
> svc_enter
> svc_exit
> when entering/exiting svc mode (whether from irq, data abort,
> prefetch abort), that means reentering the
> irq/exception path when already in kerne-mode
>
> usr_enter
> usr_exit
> when entering/exiting usr mode (whether from irq, data abort,
> prefetch abort, or syscall), which is entered from user mode.
>
> All these paths call trace_hardirqs_on/trace_hardirqs_off
> I have not checked the details on the how and when and if, but since
> you are the one working on this, I suggest you do.
>
> If there is a need to call the real
> trace_hardirqs_on/trace_hardirqs_off in some cases, I would very
> much prefer replacing the bl trace_hard_irqs* with a bl
> __ipipe_trace_hardirqs* sorting out the details in C, in
> arch/arm/kernel/ipipe.c, than doing this in assembly files with
> complicated #if conditions, or retrieval of the current domain
> in assembly.
>
OK, here is another proposal: filter out tracing in kernel IRQ exit path
(that is required as we may have interrupted Linux with virtual IRQs
off), but otherwise rely on domain filtering in the respective tracing
functions:
diff --git a/arch/arm/include/asm/assembler.h b/arch/arm/include/asm/assembler.h
index 102adcb..e285269 100644
--- a/arch/arm/include/asm/assembler.h
+++ b/arch/arm/include/asm/assembler.h
@@ -130,7 +130,7 @@
#endif
.macro asm_trace_hardirqs_off
-#if defined(CONFIG_TRACE_IRQFLAGS) && !defined(CONFIG_IPIPE)
+#if defined(CONFIG_TRACE_IRQFLAGS)
stmdb sp!, {r0-r3, ip, lr}
bl trace_hardirqs_off
ldmia sp!, {r0-r3, ip, lr}
@@ -138,7 +138,7 @@
.endm
.macro asm_trace_hardirqs_on_cond, cond
-#if defined(CONFIG_TRACE_IRQFLAGS) && !defined(CONFIG_IPIPE)
+#if defined(CONFIG_TRACE_IRQFLAGS)
/*
* actually the registers should be pushed and pop'd conditionally, but
* after bl the flags are certainly clobbered
diff --git a/arch/arm/kernel/entry-header.S b/arch/arm/kernel/entry-header.S
index d32f8bd..cf2772a 100644
--- a/arch/arm/kernel/entry-header.S
+++ b/arch/arm/kernel/entry-header.S
@@ -195,7 +195,7 @@
#ifdef CONFIG_IPIPE_DEBUG_INTERNAL
bl __ipipe_bugon_irqs_enabled
#endif
-#ifdef CONFIG_TRACE_IRQFLAGS
+#if defined(CONFIG_TRACE_IRQFLAGS) && !defined(CONFIG_IPIPE)
@ The parent context IRQs must have been enabled to get here in
@ the first place, so there's no point checking the PSR I bit.
bl trace_hardirqs_on
@@ -285,7 +285,7 @@
#ifdef CONFIG_IPIPE_DEBUG_INTERNAL
bl __ipipe_bugon_irqs_enabled
#endif
-#ifdef CONFIG_TRACE_IRQFLAGS
+#if defined(CONFIG_TRACE_IRQFLAGS) && !defined(CONFIG_IPIPE)
@ The parent context IRQs must have been enabled to get here in
@ the first place, so there's no point checking the PSR I bit.
bl trace_hardirqs_on
diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index e24bb30..2e9043b 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -2559,6 +2559,9 @@ static void __trace_hardirqs_on_caller(unsigned long ip)
void trace_hardirqs_on_caller(unsigned long ip)
{
+ if (!ipipe_root_p)
+ return;
+
time_hardirqs_on(CALLER_ADDR0, ip);
if (unlikely(!debug_locks || current->lockdep_recursion))
@@ -2690,8 +2693,12 @@ void trace_softirqs_on(unsigned long ip)
*/
void trace_softirqs_off(unsigned long ip)
{
- struct task_struct *curr = current;
+ struct task_struct *curr;
+
+ if (!ipipe_root_p)
+ return;
+ curr = current;
if (unlikely(!debug_locks || current->lockdep_recursion))
return;
diff --git a/kernel/trace/trace_irqsoff.c b/kernel/trace/trace_irqsoff.c
index 2aefbee..c3ec43f 100644
--- a/kernel/trace/trace_irqsoff.c
+++ b/kernel/trace/trace_irqsoff.c
@@ -486,28 +486,28 @@ inline void print_irqtrace_events(struct task_struct *curr)
*/
void trace_hardirqs_on(void)
{
- if (!preempt_trace() && irq_trace())
+ if (ipipe_root_p && !preempt_trace() && irq_trace())
stop_critical_timing(CALLER_ADDR0, CALLER_ADDR1);
}
EXPORT_SYMBOL(trace_hardirqs_on);
void trace_hardirqs_off(void)
{
- if (!preempt_trace() && irq_trace())
+ if (ipipe_root_p && !preempt_trace() && irq_trace())
start_critical_timing(CALLER_ADDR0, CALLER_ADDR1);
}
EXPORT_SYMBOL(trace_hardirqs_off);
void trace_hardirqs_on_caller(unsigned long caller_addr)
{
- if (!preempt_trace() && irq_trace())
+ if (ipipe_root_p && !preempt_trace() && irq_trace())
stop_critical_timing(CALLER_ADDR0, caller_addr);
}
EXPORT_SYMBOL(trace_hardirqs_on_caller);
void trace_hardirqs_off_caller(unsigned long caller_addr)
{
- if (!preempt_trace() && irq_trace())
+ if (ipipe_root_p && !preempt_trace() && irq_trace())
start_critical_timing(CALLER_ADDR0, caller_addr);
}
EXPORT_SYMBOL(trace_hardirqs_off_caller);
This works on ARM so far. I still need to revalidate x86, but it should work there as well, given the concept. Comments?
Jan
--
Siemens AG, Corporate Technology, CT RTC ITP SES-DE
Corporate Competence Center Embedded Linux
* Re: [Xenomai] "inconsistent lock state" on boot-up
2014-11-17 19:07 ` Jan Kiszka
@ 2014-11-17 19:24 ` Gilles Chanteperdrix
2014-11-18 6:19 ` Jan Kiszka
0 siblings, 1 reply; 47+ messages in thread
From: Gilles Chanteperdrix @ 2014-11-17 19:24 UTC (permalink / raw)
To: Jan Kiszka; +Cc: xenomai@xenomai.org
On Mon, Nov 17, 2014 at 08:07:39PM +0100, Jan Kiszka wrote:
> On 2014-11-17 18:33, Gilles Chanteperdrix wrote:
> > On Mon, Nov 17, 2014 at 06:11:44PM +0100, Jan Kiszka wrote:
> >> On 2014-11-17 17:59, Gilles Chanteperdrix wrote:
> >>> On Mon, Nov 17, 2014 at 05:48:00PM +0100, Jan Kiszka wrote:
> >>>> On 2014-11-12 18:27, Gilles Chanteperdrix wrote:
> >>>>> We do not need trace_hardirqs_on and trace_hardirqs_off for the
> >>>>> particular case of IRQs: they are already handled by
> >>>>> __ipipe_do_sync_stage.
> >>>>
> >>>> That was the key: Simply disabling the instrumentations in the
> >>>> CONFIG_IPIPE case removes all lock state inconsistencies, at least so far:
> >>>>
> >>>> diff --git a/arch/arm/kernel/entry-header.S b/arch/arm/kernel/entry-header.S
> >>>> index d32f8bd..d8e0b2c 100644
> >>>> --- a/arch/arm/kernel/entry-header.S
> >>>> +++ b/arch/arm/kernel/entry-header.S
> >>>> @@ -195,7 +195,7 @@
> >>>> #ifdef CONFIG_IPIPE_DEBUG_INTERNAL
> >>>> bl __ipipe_bugon_irqs_enabled
> >>>> #endif
> >>>> -#ifdef CONFIG_TRACE_IRQFLAGS
> >>>> +#if defined(CONFIG_TRACE_IRQFLAGS) && !defined(CONFIG_IPIPE)
> >>>> @ The parent context IRQs must have been enabled to get here in
> >>>> @ the first place, so there's no point checking the PSR I bit.
> >>>> bl trace_hardirqs_on
> >>>> @@ -203,7 +203,7 @@
> >>>> .else
> >>>> @ IRQs off again before pulling preserved data off the stack
> >>>> disable_irq_notrace
> >>>> -#ifdef CONFIG_TRACE_IRQFLAGS
> >>>> +#if defined(CONFIG_TRACE_IRQFLAGS) && !defined(CONFIG_IPIPE)
> >>>> tst \rpsr, #PSR_I_BIT
> >>>> bleq trace_hardirqs_on
> >>>> tst \rpsr, #PSR_I_BIT
> >>>>
> >>>> Will send a patch.
> >>>
> >>> Will this work for other paths in entry.S, such as exceptions or
> >>> syscalls?
> >>
> >> Do they all come along that code? Then we need to differentiate, likely
> >> via a separate macro parameter.
> >>
> >> Just noticed that there is also svc_enter, and that should be handled in
> >> the same way. And it's likely also shared across the board.
> >
> > There are 4 macros:
> > svc_enter
> > svc_exit
> > when entering/exiting svc mode (whether from irq, data abort,
> > prefetch abort), that means reentering the
> > irq/exception path when already in kernel mode
> >
> > usr_enter
> > usr_exit
> > when entering/exiting usr mode (whether from irq, data abort,
> > prefetch abort, or syscall), which is entered from user mode.
> >
> > All these paths call trace_hardirqs_on/trace_hardirqs_off
> > I have not checked the details on the how and when and if, but since
> > you are the one working on this, I suggest you do.
> >
> > If there is a need to call the real
> > trace_hardirqs_on/trace_hardirqs_off in some cases, I would very
> > much prefer replacing the bl trace_hardirqs* with a bl
> > __ipipe_trace_hardirqs* sorting out the details in C, in
> > arch/arm/kernel/ipipe.c, than doing this in assembly files with
> > complicated #if conditions, or retrieval of the current domain
> > in assembly.
> >
>
> OK, here is another proposal: filter out tracing in kernel IRQ exit path
> (that is required as we may have interrupted Linux with virtual IRQs
> off), but otherwise rely on domain filtering in the respective tracing
> functions:
The only case where it does not work is the asymmetric one, namely
syscalls and exceptions (page faults), because you can enter a
syscall or exception in secondary mode (so trace_hardirqs_on gets
called) and leave in primary mode, in which case you will reenter
root with the kernel considering that hardirqs are off whereas they
may not be.
Listen, you have to stop trying and testing patches and just saying
"it works, so take my patch"; this will not work. I absolutely
require that you enumerate, for each case, what the code does and
why it works. I will not accept a patch that was quickly tested and
appeared to work. I consider that stuff a corner case, and not
really useful, so I would very much prefer to make it depend on
!IPIPE. However, you seem to want to have it working; that is fine
by me, but in that case do the work well, so that we do not get
users complaining that it does not work in corner cases.
--
Gilles.
* Re: [Xenomai] "inconsistent lock state" on boot-up
2014-11-17 19:24 ` Gilles Chanteperdrix
@ 2014-11-18 6:19 ` Jan Kiszka
2014-11-18 6:28 ` Gilles Chanteperdrix
0 siblings, 1 reply; 47+ messages in thread
From: Jan Kiszka @ 2014-11-18 6:19 UTC (permalink / raw)
To: Gilles Chanteperdrix; +Cc: xenomai@xenomai.org
On 2014-11-17 20:24, Gilles Chanteperdrix wrote:
> The only case where it does not work is the asymmetric one, namely
> syscalls and exceptions (page faults), because you can enter a
> syscall or exception in secondary mode (so trace_hardirqs_on gets
> called) and leave in primary mode, in which case you will reenter
> root with the kernel considering that hardirqs are off whereas they
> may not be.
>
> Listen, you have to stop trying and testing patches and just saying
> "it works, so take my patch"; this will not work. I absolutely
> require that you enumerate, for each case, what the code does and
> why it works. I will not accept a patch that was quickly tested and
> appeared to work. I consider that stuff a corner case, and not
> really useful, so I would very much prefer to make it depend on
> !IPIPE. However, you seem to want to have it working; that is fine
> by me, but in that case do the work well, so that we do not get
> users complaining that it does not work in corner cases.
The current changes already return lockdep to a usable state, which is
an improvement. Plus, they remove the remaining risk of calling the
tracing functions over the head domain, another improvement over the
existing code, for all archs. That this may not catch all migration
corner cases yet shouldn't be your worry, if you don't care about
lockdep at all.
However, I will propose properly described and signed-off patches for
merge once testing and analysis provide the required confidence in the
approach.
Jan
--
Siemens AG, Corporate Technology, CT RTC ITP SES-DE
Corporate Competence Center Embedded Linux
* Re: [Xenomai] "inconsistent lock state" on boot-up
2014-11-18 6:19 ` Jan Kiszka
@ 2014-11-18 6:28 ` Gilles Chanteperdrix
0 siblings, 0 replies; 47+ messages in thread
From: Gilles Chanteperdrix @ 2014-11-18 6:28 UTC (permalink / raw)
To: Jan Kiszka; +Cc: xenomai@xenomai.org
On Tue, Nov 18, 2014 at 07:19:17AM +0100, Jan Kiszka wrote:
> On 2014-11-17 20:24, Gilles Chanteperdrix wrote:
> > The only case where it does not work is the asymmetric one, namely
> > syscalls and exceptions (page faults), because you can enter a
> > syscall or exception in secondary mode (so trace_hardirqs_on gets
> > called) and leave in primary mode, in which case you will reenter
> > root with the kernel considering that hardirqs are off whereas they
> > may not be.
> >
> > Listen, you have to stop trying and testing patches and just saying
> > "it works, so take my patch"; this will not work. I absolutely
> > require that you enumerate, for each case, what the code does and
> > why it works. I will not accept a patch that was quickly tested and
> > appeared to work. I consider that stuff a corner case, and not
> > really useful, so I would very much prefer to make it depend on
> > !IPIPE. However, you seem to want to have it working; that is fine
> > by me, but in that case do the work well, so that we do not get
> > users complaining that it does not work in corner cases.
>
> The current changes already return lockdep to a usable state, which is
> an improvement.
If this causes the kernel to hang and crash in some cases, then no,
this is not an improvement over "depends on !IPIPE", which is what I
will merge unless you provide me with something better. See, you are
not competing with the current state of the I-pipe, you are competing
with "depends on !IPIPE".
> Plus, they remove the remaining risk of calling the tracing
> functions over the head domain, another improvement over the existing
> code, for all archs. That this may not catch all migration corner cases
> yet shouldn't be your worry, if you don't care about lockdep at all.
As soon as I merge your patch, I am the one maintaining it. This
means I will start enabling LOCKDEP in my "full debug" configuration
when validating the I-pipe patch, and I do not want it to crash.
Besides, I am the one answering user requests about the I-pipe patch
for the ARM architecture, and I do not want to have to deal with
users reporting weird crashes with LOCKDEP turned on. So this will
become my worry, and no, I do not want any case left uncovered by
your patch. Besides, there are not that many paths in entry.S, so
asking for a thorough analysis is not asking for something tedious
or complicated.
>
> However, I will propose properly described and signed-off patches for
> merge once testing and analysis provide the required confidence in the
> approach.
Validation by testing is required, but by no means as valuable as a
good design.
--
Gilles.
Thread overview: 47+ messages
2014-11-09 10:07 [Xenomai] "inconsistent lock state" on boot-up Stoidner, Christoph
2014-11-09 15:53 ` Gilles Chanteperdrix
2014-11-10 9:08 ` Stoidner, Christoph
2014-11-10 12:33 ` Stoidner, Christoph
2014-11-10 12:44 ` Gilles Chanteperdrix
2014-11-10 12:43 ` Gilles Chanteperdrix
2014-11-10 14:52 ` Jan Kiszka
2014-11-10 15:56 ` Gilles Chanteperdrix
2014-11-10 18:29 ` Jan Kiszka
2014-11-10 19:46 ` Gilles Chanteperdrix
2014-11-10 19:51 ` Gilles Chanteperdrix
2014-11-10 19:55 ` Jan Kiszka
2014-11-10 20:00 ` Gilles Chanteperdrix
2014-11-10 20:02 ` Jan Kiszka
2014-11-10 20:06 ` Gilles Chanteperdrix
2014-11-10 20:10 ` Jan Kiszka
2014-11-10 20:14 ` Gilles Chanteperdrix
2014-11-10 20:17 ` Jan Kiszka
2014-11-10 20:18 ` Gilles Chanteperdrix
2014-11-10 20:22 ` Jan Kiszka
2014-11-10 20:23 ` Gilles Chanteperdrix
2014-11-10 20:28 ` Jan Kiszka
2014-11-10 20:37 ` Gilles Chanteperdrix
2014-11-10 20:42 ` Jan Kiszka
2014-11-10 20:55 ` Gilles Chanteperdrix
2014-11-10 21:58 ` Gilles Chanteperdrix
2014-11-12 17:27 ` Gilles Chanteperdrix
2014-11-17 16:48 ` Jan Kiszka
2014-11-17 16:59 ` Gilles Chanteperdrix
2014-11-17 17:11 ` Jan Kiszka
2014-11-17 17:33 ` Gilles Chanteperdrix
2014-11-17 19:07 ` Jan Kiszka
2014-11-17 19:24 ` Gilles Chanteperdrix
2014-11-18 6:19 ` Jan Kiszka
2014-11-18 6:28 ` Gilles Chanteperdrix
2014-11-11 17:33 ` Stoidner, Christoph
2014-11-11 17:46 ` Gilles Chanteperdrix
2014-11-11 18:04 ` Philippe Gerum
2014-11-17 10:01 ` Stoidner, Christoph
2014-11-17 10:22 ` Gilles Chanteperdrix
2014-11-17 11:13 ` Stoidner, Christoph
2014-11-17 11:30 ` Philippe Gerum
2014-11-17 13:16 ` Gilles Chanteperdrix
2014-11-17 11:49 ` Philippe Gerum
2014-11-17 11:51 ` Philippe Gerum
2014-11-17 13:10 ` Gilles Chanteperdrix
2014-11-17 13:33 ` Philippe Gerum