From: Vasileios Karakasis
Subject: Re: realloc function
Date: Wed, 05 Jan 2011 17:00:43 +0200
Message-ID: <4D24879B.8060205@cslab.ece.ntua.gr>
In-Reply-To: <4D246000.8010404@cslab.ece.ntua.gr>
References: <4D20B7E6.9020207@cslab.ece.ntua.gr> <4D22461A.4050206@cslab.ece.ntua.gr> <4D246000.8010404@cslab.ece.ntua.gr>
To: Andi Kleen
Cc: linux-numa@vger.kernel.org, 'Kornilios Kourtis'

On 01/05/2011 02:11 PM, Vasileios Karakasis wrote:
>
>
> On 01/05/2011 12:20 AM, Andi Kleen wrote:
>>> Hi,
>>>
>>> I am sending you the updated patch (against the latest 2.0.6 version). I
>>> call numa_police_memory_int() only for the newly allocated pages, when
>>> the area is expanded. I also added a numa_realloc_onnode() function in
>>> the same fashion as that of the numa_alloc_onnode(), which sets a
>>> specific memory binding. I pass the MPOL_MF_MOVE flag to mbind(), but I
>>> am not sure if this is worth it, since the call becomes too slow even
>>> in the case of no page migration. Without the MPOL_MF_MOVE flag, of
>>> course, if the policy changes between realloc's, previously allocated
>>> pages won't be affected.
>>
>> Thinking about it more police_* is likely still the wrong semantics.
>> That will always set the current policy.
>>
>> But the user more likely wants the same policy the original
>> mapping had, right?
>
> I agree with that. In my use case at least, I start with an
> alloc_on_node() and keep realloc'ing assuming all new pages will be
> allocated on the node I specified. Of course, this questions more the
> existence of a realloc_onnode() function, since its functionality
> overlaps with that of migrating/moving pages. So adopting these
> semantics, I think we can drop the numa_realloc_onnode().
>
>>
>> This could be implemented by calling get_mempolicy() on the old
>> mapping with MPOL_F_ADDR and setting it on the new pages in
>> the new mapping.
>>
>
> I will come up with a patch in the next few days.

Peeking inside the mremap() source, I can see that the kernel already
does this, i.e., mremap() preserves the policy of the original vm area.
The problem arises when the user has not specified a binding for the
original mapping (default policy), in which case explicitly copying the
policy from the old to the new pages won't work either; the new pages
will still have MPOL_DEFAULT. So realloc() cannot guarantee that the new
pages will be allocated on the same node as the preceding alloc(),
unless there is a way to obtain the actual node that the pages of the
original allocation were allocated on.

In my opinion, this isn't a real problem, because even the simple
numa_alloc() using the default policy cannot guarantee that the pages
will be allocated on the node of the calling cpu: what if the task is
migrated to a different cpu on a different node while touching (i.e.,
allocating) the pages with police_memory_int()? However, if the user
calls one of the functions that call mbind(), e.g., alloc_onnode(), then
just mremap() will work fine.

>
>> -Andi
>>
>>
>
> Regards,

--
V.K.
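
P.S. For anyone who wants to check the mremap() behaviour described
above on their own kernel, here is a minimal sketch. It is not part of
the patch under discussion. The helper names (policy_mode_at,
grow_mapping, policy_survives_grow) are made up for illustration, and
the MPOL_F_ADDR value is copied from the kernel's <linux/mempolicy.h> so
that the sketch does not need libnuma's <numaif.h>. It calls the
get_mempolicy system call directly and compares the policy mode of the
original mapping with that of the newly added tail after mremap().

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* needed for mremap() and MREMAP_MAYMOVE */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Assumption: value copied from <linux/mempolicy.h>; libnuma's <numaif.h>
 * defines the same flag under the name MPOL_F_ADDR. */
#define SKETCH_MPOL_F_ADDR (1UL << 1)

/* Hypothetical helper: policy mode of the VMA that contains addr,
 * or -1 if get_mempolicy() fails (e.g. filtered in a sandbox). */
static int policy_mode_at(void *addr)
{
    int mode = -1;
    /* No nodemask is requested, so nodemask is NULL and maxnode is 0. */
    long rc = syscall(SYS_get_mempolicy, &mode, NULL, 0UL, addr,
                      SKETCH_MPOL_F_ADDR);
    return rc == 0 ? mode : -1;
}

/* Hypothetical helper: grow a mapping the way an mremap()-based
 * realloc would. Returns NULL on failure; the old mapping stays valid. */
static void *grow_mapping(void *old, size_t old_size, size_t new_size)
{
    assert(new_size >= old_size);
    void *p = mremap(old, old_size, new_size, MREMAP_MAYMOVE);
    return p == MAP_FAILED ? NULL : p;
}

/* Returns 1 if the policy mode of the added tail matches that of the
 * original mapping, 0 if it differs, and -1 if the check could not be
 * done. *region is updated only when mremap() succeeds. */
static int policy_survives_grow(void **region, size_t old_size,
                                size_t new_size)
{
    int before = policy_mode_at(*region);
    void *p = grow_mapping(*region, old_size, new_size);
    if (p == NULL)
        return -1;
    *region = p;
    /* Query the first byte of the newly added area. */
    int after = policy_mode_at((char *)p + old_size);
    if (before < 0 || after < 0)
        return -1;
    return before == after;
}
```

With a mapping that was never bound, both queries should report
MPOL_DEFAULT, which is exactly the case where the new pages are not tied
to any particular node. To see a non-default policy being carried over,
the region would have to be bound with mbind() (or allocated with
numa_alloc_onnode()) before growing it.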