Re: [RFC][PATCH] linux-2.5.34_vsyscall_A0

From: Andrea Arcangeli <andrea@suse.de>
To: Stephen Hemminger <shemminger@osdl.org>
Cc: john stultz <johnstul@us.ibm.com>,
	Linus Torvalds <torvalds@transmeta.com>,
	Michael Hohnbaum <hbaum@us.ibm.com>,
	"Martin J. Bligh" <mbligh@aracnet.com>,
	george anzinger <george@mvista.com>,
	lkml <linux-kernel@vger.kernel.org>
Subject: Re: [RFC][PATCH] linux-2.5.34_vsyscall_A0
Date: Fri, 18 Oct 2002 19:01:16 +0200
Message-ID: <20021018170116.GK23930@dualathlon.random>
In-Reply-To: <1034957619.5401.8.camel@dell_ss3.pdx.osdl.net>

On Fri, Oct 18, 2002 at 09:13:39AM -0700, Stephen Hemminger wrote:
> One reason gettimeofday ends up being important is that several
> databases call it a lot. They use it to build up a transaction id. Under
> big transaction loads, even the fast linux syscall path ends up being a
> bottleneck. Also, on NUMA machines the data used for time of day (xtime)
> ends up being a significant portion of the cache traffic.

Yep. However, the main bottleneck is entering and leaving the kernel.
xtime is a single read-only L1 cacheline that can be trivially shared
under high load, so I would be surprised if it were the bottleneck.
Today you should see a huge bottleneck in the xtime_lock long before you
can even remotely see one in the xtime data itself. (I'm speaking of
HZ=100 at least; HZ=1000 would hurt more here.)

> It would be great to rework the whole TSC time of day stuff to work with
> per cpu data and allow unsynchronized TSCs like NUMA. The problem is
> that for fast user level access, there would need to be some way to find
> out the current CPU and avoid preemption/migration for a short period.
> It seems like the LDT stuff for per-thread data could provide the
> current cpu (and maybe current pid) somehow.  And it would be possible
> to avoid  preemption while in a vsyscall text page, some other Unix
> variants do this to implement portions of the thread library in kernel
> provided user text pages.

Actually, my idea on 64bit was to use the high 8 bits of each 64bit word
to carry the cpuid. That lets you get coherent data out, including the
sequence numbers, which are read and written in inverse order with mb()
like now (the sequence numbers will become per-cpu as well). So it is
definitely doable without any problem and in a very performant way, just
not as easily as without the per-cpu info. Even if per-cpu segmentation
tricks were possible or available (remember long mode is pure paging, no
segmentation), they would not be worthwhile IMHO: the cpuid encoded
atomically in each 64bit word provided by the vsyscall seems a much
simpler and possibly more performant solution. You set up a different
per-cpu data mapping, with different pte settings on each cpu. The
vsyscall bytecode remains the same and is aware of the cpuid encoded in
each 64bit word. Doing it in 32bit is ugly (or at least much slower):
most data is natively at least 32bit wide already, so it would need some
slow demultiplexing.
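
To make the layout concrete, here is a rough userspace sketch of the
idea, not actual kernel or vsyscall code. Everything named in it
(struct percpu_time, tag_word(), publish_time(), try_read_time(), the
56/8 bit split) is hypothetical and only for illustration, and C11
fences stand in for mb(). The writer bumps one sequence word before the
data and one after; the reader loads them in the inverse order and
retries if the two sequence words differ or if any word carries a
different cpuid (which would mean it migrated in the middle of the read).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Assumed split: top 8 bits = cpuid, low 56 bits = payload. */
#define CPU_SHIFT    56
#define PAYLOAD_MASK ((UINT64_C(1) << CPU_SHIFT) - 1)

static uint64_t tag_word(unsigned cpu, uint64_t payload)
{
    assert(cpu < 256);                      /* cpuid must fit in 8 bits */
    return ((uint64_t)cpu << CPU_SHIFT) | (payload & PAYLOAD_MASK);
}

static unsigned word_cpu(uint64_t w)     { return (unsigned)(w >> CPU_SHIFT); }
static uint64_t word_payload(uint64_t w) { return w & PAYLOAD_MASK; }

/* Hypothetical per-cpu data page; each cpu would map its own copy
 * at the same user virtual address. */
struct percpu_time {
    _Atomic uint64_t seq_begin;  /* written first by the writer */
    _Atomic uint64_t sec;
    _Atomic uint64_t usec;
    _Atomic uint64_t seq_end;    /* written last by the writer */
};

/* Writer side: runs on the cpu that owns the page. */
static void publish_time(struct percpu_time *p, unsigned cpu,
                         uint64_t seq, uint64_t sec, uint64_t usec)
{
    atomic_store_explicit(&p->seq_begin, tag_word(cpu, seq), memory_order_relaxed);
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&p->sec, tag_word(cpu, sec), memory_order_relaxed);
    atomic_store_explicit(&p->usec, tag_word(cpu, usec), memory_order_relaxed);
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&p->seq_end, tag_word(cpu, seq), memory_order_relaxed);
}

/* Reader side: one attempt. Returns 1 on a coherent snapshot, 0 if the
 * caller should retry. The sequence words are read in the inverse order
 * of the writer. */
static int try_read_time(struct percpu_time *p, unsigned *cpu,
                         uint64_t *sec, uint64_t *usec)
{
    uint64_t end = atomic_load_explicit(&p->seq_end, memory_order_relaxed);
    atomic_thread_fence(memory_order_acquire);
    uint64_t s = atomic_load_explicit(&p->sec, memory_order_relaxed);
    uint64_t u = atomic_load_explicit(&p->usec, memory_order_relaxed);
    atomic_thread_fence(memory_order_acquire);
    uint64_t begin = atomic_load_explicit(&p->seq_begin, memory_order_relaxed);

    /* The tagged words are compared whole, so both the sequence number
     * and the cpuid must match. */
    if (begin != end)
        return 0;
    if (word_cpu(s) != word_cpu(end) || word_cpu(u) != word_cpu(end))
        return 0;

    *cpu  = word_cpu(end);
    *sec  = word_payload(s);
    *usec = word_payload(u);
    return 1;
}
```

A caller would simply loop on try_read_time() until it returns 1. The
arithmetic also shows why 32bit is awkward: taking 8 bits for the cpuid
leaves only 24 bits of payload per word, so a 32bit seconds value would
have to be split across two words and reassembled.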

Andrea
