From mboxrd@z Thu Jan 1 00:00:00 1970 Message-ID: <4B9222D1.1000207@domain.hid> Date: Sat, 06 Mar 2010 10:39:29 +0100 From: Jan Kiszka MIME-Version: 1.0 References: <4B90ED0A.40101@domain.hid> <4B90F2D3.8020508@domain.hid> <4B91290C.7070005@domain.hid> <4B9131B4.6060105@domain.hid> <4B913C71.4060006@domain.hid> <4B914107.3000108@domain.hid> In-Reply-To: <4B914107.3000108@domain.hid> Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="------------enigC77565C83E2EA276840D6052" Sender: jan.kiszka@domain.hid Subject: Re: [Xenomai-core] [RFC][PATCH v2] Allow for deregistering current mode user space address List-Id: Xenomai life and development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Gilles Chanteperdrix Cc: xenomai-core This is an OpenPGP/MIME signed message (RFC 2440 and 3156) --------------enigC77565C83E2EA276840D6052 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Gilles Chanteperdrix wrote: > Jan Kiszka wrote: >> Gilles Chanteperdrix wrote: >>> Jan Kiszka wrote: >>>> Gilles Chanteperdrix wrote: >>>>> Jan Kiszka wrote: >>>>>> This adds __xn_sys_drop_u_mode, a syscall to deregister the u_mode= user >>>>>> space address passed on shadow creation for the current thread. We= need >>>>>> this in order to safely shut down a thread that wants to release i= ts >>>>>> u_mode memory (either malloc'ed or TLS-allocated). So far, the ker= nel >>>>>> might have touched this memory even after the release which caused= >>>>>> subtle corruptions. >>>>>> >>>>>> After we released u_mode, we can no longer make reliable decisions= based >>>>>> on it. For the fast mutex use case, we then assume the worst case,= ie. >>>>>> we enforce syscall-based acquisitions unconditionally. For the >>>>>> assert_nrt case, we simply acquire the required information via >>>>>> __xn_sys_current_info. >>>>> As usual, you are firing patches too fast: >>>>> - you are lacking the exit syscall trap. >>>> Don't worry, my memory is not yet _that_ bad. :) >>>> >>>>> - the patch is not correct and will cause kernel-space addresses to= be >>>>> passed to put_user, which probably yields bugs when the kernel is >>>>> compiled with proper debug flags; >>>> Right, e.g. ARM requires set_fs() fiddling in addition. But then a s= imple >>>> NULL-check is way better. Fixed. >>>> >>>>> - I do not see the point where the new syscall is called with __thr= ead, >>>>> but I am not sure it is missing, and if it is missing, you will get= the >>>>> rightfull warning when hitting the exit syscall. >>>> Destructor of (now unconditionally used) xeno_current_mode_key. >>>> >>>>> - the user-space users of the function to get current_mode are stil= l >>>>> cluttered with special cases to handle the invalid state. And >>>>> illustrating nicely the deficiency of this approach, you have forgo= t one >>>>> user. >>>> My impression is that you are still a bit biased due to TLS limitati= ons >>>> on ARM. >>> No, I think the __thread stuff thinks make C code look too much like = C++ >>> where the compiler makes things behind your back for such simple code= as: >>> >>> void *foo =3D &bar; >>> >>> This is not what I expect from a C compiler, and I suspect a lot of C= >>> developers, including the people who advocate that C, as a simple >>> language, and not C++ should be used for free software development. >>> >>> Yes, there is no gain on ARM, where this generates a function call as= >>> pthread_getspecific and causes bugs with gcc >=3D 4.3, but not much g= ain >>> on x86 either where a function call is so cheap. >> As are individual memory accesses. The sum makes the difference. >=20 > Ok. If you find before 2.5.3 a platform where this makes a difference o= f > more than 5% of this platform's latency, I do not remove the __thread > stuff. Deal? If you manage to remove 5% of the configuration variables of Xenomai that way. Or 5% of the conditional code. >=20 >> Once someone starts testing mutexes or other stuff from TSD destructor= s, >> not earlier. And this is so special that anyone of us interested in th= is >> test case will naturally be aware of its specifics. >=20 > As you said to err is human. And having a single point for handling an > exceptional case is a good idea. And this shortens your patch. >=20 >=20 >>> I think every sane free-software projects ask people to test their >>> patches/pull requests before sending them, even the Linux kernel. See= >>> for instance: >>> http://lwn.net/Articles/377091/ >>> >>> I am not asking you for one week. 24 hours would be enough. >> As you may know, complete test coverage is near to impossible. >=20 > That is no excuse for not making any test. >=20 >=20 >> On unlucky days, I happen to send out stuff that broke while continuin= g to >> type after the last test ran. But specifically these things were teste= d >> in various combinations. >> >> It is not very polite to assume the contrary. >=20 > When you send a patch every 5 minutes, it is obvious that it was not > tested. No politeness involved. Everybody can check the timestamps in > the mailing lists and do the math. I think my workflow was misleading regarding this. Not necessarily all stuff I push into my 'for-upstream' branch is done with all tests at the time when I push. Official pull request mails are supposed to mark this. I'm pushing to save my work and, of course, to expose it. So I appreciate comments, but I surely don't expect them within minutes. >=20 > I'm trying to avoid the >> same assumption about other people as well, even when things look >> differently from time to time. To err is human, to always write correc= t >> code is divine. >> >> But as you already mentioned in another mail, there is room for >> improvement /wrt testability. On step forward would be an automated >> build and run of some git tree that we can check into. One board per >> arch would be enough to avoid issues we face from time to time when we= >> only stress our favorite archs with new features or fixes. >=20 > We have the automated build. Then we just need to run it against some 'build-test' branches of everyone's git repositories on checkin. That would be great. > I plan seriously to work on the automated > run for 2.5.3. But it is too late for 2.5.2. Also note that I gave you > access to my server where several targets are configured and usable. If= > you forgot the password, I can reset it. I do remember this, but I thought this was for a specific test. If I'm supposed to access that box regularly, I'm fine, we can arrange this. >=20 >>> In fact, the mistake is partly mine too, I am replying to your patche= s >>> as fast as you send them. We were going crazy tuesday with the condit= ion >>> variable patches, I do not think it gives a reassuring image to the >>> users. >> I can only say that, as a developer in this business, I trust code >> having been discussed to the extreme _way_ more than stuff a single >> clever person checked in even after meditating over it for weeks. I've= >> seen enough of this, produced by non-clever /me as well as way more >> clever people in all kinds of projects. >> >>> So that at some point, I stopped reviewing your patches. I have >>> not even read the last 8 you sent. Without going to the other extreme= of >>> a "closed door" development model, I think we can find a more reasona= ble >>> trade off than the current one. >> Well, you have to decide which piece to take, or to request it to be >> rewritten, or to write yourself. >=20 > Yes. That is the plan. Tomorrow. >=20 >>> Our objective now, is to put 2.5.2 in a stable state. >>> >>>> +substitute_linux_syscall(xnthread_t *thread, struct pt_regs *regs) >>>> +{ >>>> + if (unlikely(__xn_reg_mux(regs) =3D=3D __NR_exit)) >>>> + handle_shadow_exit(thread); >>>> + >>> Not your fault, but this does not work on ARM. Do not know if >>> __xn_reg_mux needs fixing there or if we need to add a new macro to g= et >>> the roots syscall number. >> And that's the point of sending patches for discussion. >> >> It's regs->ARM_r7 =3D=3D __NR_SYSCALL_BASE + __NR_MY_SYSCALL, right? I= f >> __xn_reg_mux is supposed to be only a mux value for Xenomai, we need a= >> separate accessor for the Linux syscall number on all archs. >=20 > or __NR_OABI_SYSCALL_BASE + __NR_MY_SYSCALL depending on CONFIG_OABI_CO= MPAT. >=20 Patch is in my queue if you are interested. Jan PS: Note that this u_mode fix exposed an issue of the sigtest: cancel_with_signals fails if there is some TSD destructor registered that issues a Xenomai syscall. Not sure yet if it is a fundamental issue or just related to the test itself. --------------enigC77565C83E2EA276840D6052 Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.9 (GNU/Linux) Comment: Using GnuPG with SUSE - http://enigmail.mozdev.org iEYEARECAAYFAkuSItcACgkQitSsb3rl5xT+KwCfUF6D5SyEXSy58BZZ2sEMAHOR DE8Anjm5VheH1mmcr+GTmXbUbafYMoX8 =rHRy -----END PGP SIGNATURE----- --------------enigC77565C83E2EA276840D6052--