From mboxrd@z Thu Jan 1 00:00:00 1970
References: <87y2csqr8q.fsf@xenomai.org> <29e3091830a483dd2ad141a098baed259a1f0832.camel@siemens.com> <87h7jercfi.fsf@xenomai.org> <87eeeirc4j.fsf@xenomai.org>
From: Philippe Gerum
Subject: Re: [PATCH v2 0/9] y2038 groundwork and first steps
Date: Fri, 07 May 2021 15:00:05 +0200
Message-ID: <87bl9mr962.fsf@xenomai.org>
MIME-Version: 1.0
Content-Type: text/plain
List-Id: Discussions about the Xenomai project
To: Florian Bezdeka
Cc: xenomai@xenomai.org, Jan Kiszka

Florian Bezdeka writes:

> On 07.05.21 13:56, Philippe Gerum wrote:
>>
>> Philippe Gerum via Xenomai writes:
>>
>>> Florian Bezdeka writes:
>>>
>>>> On 06.05.21 09:08, Bezdeka, Florian via Xenomai wrote:
>>>>> On Thu, 2021-05-06 at 09:02 +0200, Philippe Gerum wrote:
>>>>>> Jan Kiszka via Xenomai writes:
>>>>>>
>>>>>>> On 05.05.21 18:52, Jan Kiszka via Xenomai wrote:
>>>>>>>> Picking up from Philippe's queue:
>>>>>>>>
>>>>>>>> This patch series prepares the tree for the upcoming y2038 work,
>>>>>>>> converting obsolete/ambiguous time specification types to the proper
>>>>>>>> ones introduced upstream by the v5.x kernel series.
>>>>>>>>
>>>>>>>> In v2, feedback on the first round has been addressed, primarily
>>>>>>>> regarding folding fixing into the patches that need them.
>>>>>>>>
>>>>>>>> In addition, this includes 3 patches from Florian that add
>>>>>>>> sem_timedwait64 system call and a test suite for it.
>>>>>>>>
>>>>>>>
>>>>>>> Seems we have some issue on ARM ("Illegal instruction" in smokey):
>>>>>>> https://source.denx.de/Xenomai/xenomai-images/-/jobs/264219
>>>>>>> https://source.denx.de/Xenomai/xenomai-images/-/jobs/264225
>>>>>>
>>>>>> That one may be related to the code directing clock_gettime() either to
>>>>>> the vDSO with Dovetail, or TSC readouts via a memory mapping with the
>>>>>> I-pipe, all in libcobalt.
>>>>>
>>>>> Thanks for the hint! I will check that and report back.
>>>>
>>>> I was able to find the root cause. It's glibc syscall() vs.
>>>> XENOMAI_SYSCALLx(). I have a fix around that was already tested on some
>>>> qemu targets (arm as well as x86). I will provide it soon.
>>>>
>>>> There is still one open question to me: Why is there a special syscall
>>>> handling (userland) implemented in Xenomai? I did not fully understand
>>>> why we end up with an invalid instruction on arm, but I guess it's
>>>> because of different registers being used.
>>>>
>>>> IOW: syscall() is fine as long as you are calling Linux syscalls, but
>>>> you might run into problems (at least on arm) when trying to call
>>>> Xenomai specific ones.
>>>>
>>>
>>> I may know this one, there is an ABI change in recent glibc,
>>> specifically in the glue code for the arm unwinder, which may cause
>>> this. Turning off -fasynchronous-unwind-tables should paper over that
>>> particular issue.
>>
>> If so, then we need to fix the Xenomai syscall prologue/epilogue
>> (possibly some CPU register now has a specific function and/or should
>> hold a particular value across calls).
>>
>
> Any references? We did not update glibc recently (CI images are still
> using glibc from Debian 10). The smokey tests running into problems now
> are the first ones that are using the *libc syscall() function. So maybe
> it was already broken on ARM? Not sure.

First caught this with EVL after an upgrade to
gcc-linaro-7.5.0-2019.12-x86_64_arm-linux-gnueabihf, when the issue
started showing up. Some calls like pthread_cancel() would crash
randomly. I still have this issue on my todo list as I want to re-enable
async tables for libevl although this is not required anymore (unlike
libcobalt).

-- 
Philippe.