* Re: 2.6.19-rc4-mm2: BUG modprobeing sound driver
From: Andrew Morton @ 2006-11-10 6:28 UTC (permalink / raw)
To: eric, Eric Buddington, linux-kernel, Greg KH, Kay Sievers,
Jaroslav Kysela, Takashi Iwai
In-Reply-To: <20061109220515.7a127070.akpm@osdl.org>
On Thu, 9 Nov 2006 22:05:15 -0800
Andrew Morton <akpm@osdl.org> wrote:
> Yup, trivial to reproduce: modprobe snd_serial_u16550 -> splat.
>
> Bisection indicates that this oops is triggered by
> gregkh-driver-sound-device.patch.
>
> snd_serial_probe() never got to call snd_card_register(), so card->dev is
> NULL.
>
> snd_serial_probe() calls snd_card_free(card) on the error path and
> snd_card_do_free() does device_del(card->dev) which oopses over the null
> pointer it got.
I suppose doing this is legit:
diff -puN sound/core/init.c~fix-gregkh-driver-sound-device sound/core/init.c
--- a/sound/core/init.c~fix-gregkh-driver-sound-device
+++ a/sound/core/init.c
@@ -361,7 +361,8 @@ static int snd_card_do_free(struct snd_c
snd_printk(KERN_WARNING "unable to free card info\n");
/* Not fatal error */
}
- device_unregister(card->dev);
+ if (card->dev)
+ device_unregister(card->dev);
kfree(card);
return 0;
}
_
^ permalink raw reply
* Re: 2.6.19-rc5-mm1: HPC nx6325 breakage, VESA fb problem, md-raid problem
From: Neil Brown @ 2006-11-10 6:28 UTC (permalink / raw)
To: Rafael J. Wysocki
Cc: Andrew Morton, linux-kernel, fbuihuu, adaplas, Andi Kleen
In-Reply-To: <200611091642.01453.rjw@sisk.pl>
On Thursday November 9, rjw@sisk.pl wrote:
> On Thursday, 9 November 2006 02:04, Rafael J. Wysocki wrote:
> > > > and the kernel says it cannot mount the root fs (which is on an md-raid).
> > >
> > > hm, there was probably some earlier message which tells us why that
> > > happened. Doing a capture-and-compare on the dmesg output would be nice
> > > (netconsole?)
>
> This happens because of md-change-lifetime-rules-for-md-devices.patch and
> seems to be a universal breakage.
Thanks for the report.
Are you at all interested in confirming that this version of the patch
works for you? I'm fairly sure it will, but I've been wrong before.
Thanks either-way.
NeilBrown
----------------------------------------------
Subject: Change lifetime rules for 'md' devices.
Currently md devices are created when first opened and remain in existence
until the module is unloaded.
This isn't a major problem, but it is somewhat ugly.
This patch changes the lifetime rules so that an md device will
disappear on the last close if it has no state.
Locking rules depend on bd_mutex being held in do_open and
__blkdev_put, and on setting bd_disk->private_data to 'mddev'.
There is room for a race because md_probe is called early in do_open
(get_gendisk) to create the mddev. As this isn't protected by
bd_mutex, a concurrent call to md_close can destroy that mddev before
do_open calls md_open to get a reference on it.
md_open and md_close are serialised by md_mutex so the worst that
can happen is that md_open finds that the mddev structure doesn't
exist after all. In this case bd_disk->private_data will be NULL,
and md_open chooses to exit with -EBUSY in this case, which is
arguably an appropriate result.
The new 'dead' field in mddev is used to track whether it is time
to destroy the mddev (if a last-close happens). It is cleared when
any state is created (set_array_info) and set when the array is stopped
(do_md_stop).
mddev_put becomes simpler. It just destroys the mddev when the
refcount hits zero. This will normally be the reference held in
bd_disk->private_data.
Signed-off-by: Neil Brown <neilb@suse.de>
### Diffstat output
./drivers/md/md.c | 32 +++++++++++++++++++++++---------
./include/linux/raid/md_k.h | 3 +++
2 files changed, 26 insertions(+), 9 deletions(-)
diff .prev/drivers/md/md.c ./drivers/md/md.c
--- .prev/drivers/md/md.c 2006-11-10 17:12:55.000000000 +1100
+++ ./drivers/md/md.c 2006-11-10 17:23:25.000000000 +1100
@@ -226,13 +226,14 @@ static void mddev_put(mddev_t *mddev)
{
if (!atomic_dec_and_lock(&mddev->active, &all_mddevs_lock))
return;
- if (!mddev->raid_disks && list_empty(&mddev->disks)) {
- list_del(&mddev->all_mddevs);
- spin_unlock(&all_mddevs_lock);
- blk_cleanup_queue(mddev->queue);
- kobject_unregister(&mddev->kobj);
- } else
- spin_unlock(&all_mddevs_lock);
+ list_del(&mddev->all_mddevs);
+ spin_unlock(&all_mddevs_lock);
+
+ del_gendisk(mddev->gendisk);
+ mddev->gendisk = NULL;
+ blk_cleanup_queue(mddev->queue);
+ mddev->queue = NULL;
+ kobject_unregister(&mddev->kobj);
}
static mddev_t * mddev_find(dev_t unit)
@@ -273,6 +274,7 @@ static mddev_t * mddev_find(dev_t unit)
atomic_set(&new->active, 1);
spin_lock_init(&new->write_lock);
init_waitqueue_head(&new->sb_wait);
+ new->dead = 1;
new->queue = blk_alloc_queue(GFP_KERNEL);
if (!new->queue) {
@@ -1384,6 +1386,7 @@ static int bind_rdev_to_array(mdk_rdev_t
ko = &rdev->bdev->bd_disk->kobj;
sysfs_create_link(&rdev->kobj, ko, "block");
bd_claim_by_disk(rdev->bdev, rdev, mddev->gendisk);
+ mddev->dead = 0;
return 0;
}
@@ -3360,6 +3363,8 @@ static int do_md_stop(mddev_t * mddev, i
mddev->array_size = 0;
mddev->size = 0;
mddev->raid_disks = 0;
+ mddev->dead = 1;
+
mddev->recovery_cp = 0;
} else if (mddev->pers)
@@ -4022,6 +4027,7 @@ static int set_array_info(mddev_t * mdde
mddev->new_layout = mddev->layout;
mddev->delta_disks = 0;
+ mddev->dead = 0;
return 0;
}
@@ -4422,8 +4428,12 @@ static int md_open(struct inode *inode,
* Succeed if we can lock the mddev, which confirms that
* it isn't being stopped right now.
*/
- mddev_t *mddev = inode->i_bdev->bd_disk->private_data;
- int err;
+ mddev_t *mddev;
+ int err = -EBUSY;
+
+ mddev = inode->i_bdev->bd_disk->private_data;
+ if (!mddev)
+ goto out;
if ((err = mutex_lock_interruptible_nested(&mddev->reconfig_mutex, 1)))
goto out;
@@ -4442,6 +4452,10 @@ static int md_release(struct inode *inod
mddev_t *mddev = inode->i_bdev->bd_disk->private_data;
BUG_ON(!mddev);
+ if (inode->i_bdev->bd_openers == 0 && mddev->dead) {
+ inode->i_bdev->bd_disk->private_data = NULL;
+ mddev_put(mddev);
+ }
mddev_put(mddev);
return 0;
diff .prev/include/linux/raid/md_k.h ./include/linux/raid/md_k.h
--- .prev/include/linux/raid/md_k.h 2006-11-10 17:12:55.000000000 +1100
+++ ./include/linux/raid/md_k.h 2006-11-10 17:16:50.000000000 +1100
@@ -119,6 +119,9 @@ struct mddev_s
#define MD_CHANGE_PENDING 2 /* superblock update in progress */
int ro;
+ int dead; /* array should be discarded on
+ * last close
+ */
struct gendisk *gendisk;
^ permalink raw reply
* [PATCH] FIX git pull failure with shallow clone changes
From: Aneesh Kumar K.V @ 2006-11-10 6:27 UTC (permalink / raw)
To: Aneesh Kumar K.V, git
In-Reply-To: <45541503.4020604@gmail.com>
[-- Attachment #1: Type: text/plain, Size: 1 bytes --]
[-- Attachment #2: 0001-I-was-using-the-pu-branch-i-tried-to-update-the-git-repository-and-i.txt --]
[-- Type: text/plain, Size: 1470 bytes --]
I was using the pu branch. When I tried to update the git repository I
got this error:
walk 9e950efa20dc8037c27509666cba6999da9368e8
walk 3b6a792f6ace33584897d1af08630c9acc0ce221
* refs/heads/origin: fast forward to branch 'master' of
http://repo.or.cz/r/linux-2.6
old..new: 3d42488..088406b
Auto-following refs/tags/v2.6.19-rc5
shallow clone with http not supported
This repository was not cloned with --depth. I only updated the git
tools using the pu branch.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@gmail.com>
---
git-fetch.sh | 6 +++---
1 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/git-fetch.sh b/git-fetch.sh
index 8b46e73..14ba4b2 100755
--- a/git-fetch.sh
+++ b/git-fetch.sh
@@ -314,9 +314,9 @@ fetch_main () {
noepsv_opt="--disable-epsv"
fi
max_depth=5
- depth=0
+ cur_depth=0
head="ref: $remote_name"
- while (expr "z$head" : "zref:" && expr $depth \< $max_depth) >/dev/null
+ while (expr "z$head" : "zref:" && expr $cur_depth \< $max_depth) >/dev/null
do
remote_name_quoted=$(@@PERL@@ -e '
my $u = $ARGV[0];
@@ -325,7 +325,7 @@ fetch_main () {
print "$u";
' "$head")
head=$(curl -nsfL $curl_extra_args $noepsv_opt "$remote/$remote_name_quoted")
- depth=$( expr \( $depth + 1 \) )
+ cur_depth=$( expr \( $cur_depth + 1 \) )
done
expr "z$head" : "z$_x40\$" >/dev/null ||
die "Failed to fetch $remote_name from $remote"
--
1.4.4.rc1.g368c-dirty
^ permalink raw reply related
* Re: [discuss] Re: 2.6.19-rc4: known unfixed regressions (v3)
From: Jeff Chua @ 2006-11-10 6:25 UTC (permalink / raw)
To: Linus Torvalds
Cc: Adrian Bunk, Matthew Wilcox, Andi Kleen, Aaron Durbin,
Andrew Morton, Linux Kernel Mailing List, gregkh, linux-pci
In-Reply-To: <Pine.LNX.4.64.0611081010120.3667@g5.osdl.org>
On 11/9/06, Linus Torvalds <torvalds@osdl.org> wrote:
> Pushed out. Jeff, can you verify that current git does the right thing.
Linus,
Can you post those affected patches that I can apply directly to 2.6.19-rc5?
I'm still old-fashioned and have not downloaded the git dist. I'll try
to do that tonight.
Currently, I'm using Aaron Durbin's patch and it works.
Thanks,
Jeff.
^ permalink raw reply
* Re: 2.6.19-rc5-mm1: HPC nx6325 breakage, VESA fb problem, md-raid problem
From: Andi Kleen @ 2006-11-10 6:19 UTC (permalink / raw)
To: Andrew Morton
Cc: Rafael J. Wysocki, linux-kernel, fbuihuu, adaplas, NeilBrown
In-Reply-To: <20061109211523.7abfd4ec.akpm@osdl.org>
On Friday 10 November 2006 06:15, Andrew Morton wrote:
> On Fri, 10 Nov 2006 05:49:08 +0100
> Andi Kleen <ak@suse.de> wrote:
>
> >
> > > > >
> > > > > Well, I've got some data from earlyprintk (forgot I needed to boot with
> > > > > vga=normal).
> > > > >
> > > > > Unfortunately, I had to rewrite the trace manually:
> > > > >
> > > > > clear_IO_APIC_pin+0x15/0x6a
> > > > > try_apic_pin+0x7a/0x98
> > > > > setup_IO_APIC+0x600/0xb7a
> > > > > smp_prepare_cpus+0x33a/0x371
> > > > > init+0x60/0x32d
> > > > > child_rip+0xa/0x12
> > > > >
> > > > > [And then the unwinder said it got stuck.]
> > > > >
> > > > > RIP is reported to be at ioapic_read_entry+0x33/0x61,
> > > >
> > > > This is 100% reproducible on the nx6325 (but not on the other boxes) and
> > > > apparently caused by x86_64-mm-try-multiple-timer-pins.patch (doesn't
> > > > happen with this patch reverted).
> > >
> > > Thanks, dropped.
> >
> > can I have details please?
>
> I think what's in this thread is all you'll get.
That's probably not enough then.
>
> It would be nice to see the access address. I'd be guessing that it's
> trying to read the io-apic before we're ready to read it and io_apic_base()
> is returning gunk and boom.
I would like to see the full output from the earlyprintk crash please.
.jpg would be ok.
>
> > On what system (CPU, motherboard, BIOS version) does the noidlehz stuff break?
>
> nx6325
Ah -- it seems to be an ATI chipset. I tested it on a ATI chipset machine
here so it must be doing something strange.
Anyways, you most likely broke a wide range of other motherboards again
by dropping it.
>
> It's x86_64: no noidlehz.
>
> > And what did you drop exactly?
>
> x86_64-mm-try-multiple-timer-pins.patch
Ah that was the information I was missing.
-Andi
^ permalink raw reply
* RE: + sched-use-tasklet-to-call-balancing.patch added to -mm tree
From: Chen, Kenneth W @ 2006-11-10 6:18 UTC (permalink / raw)
To: 'Christoph Lameter'
Cc: Ingo Molnar, Siddha, Suresh B, akpm, mm-commits, nickpiggin,
linux-kernel
In-Reply-To: <Pine.LNX.4.64.0611071339001.5893@schroedinger.engr.sgi.com>
Christoph Lameter wrote on Tuesday, November 07, 2006 1:50 PM
> On Tue, 7 Nov 2006, Chen, Kenneth W wrote:
> > > What broke the system was the disabling of interrupts over long time
> > > periods during load balancing.
> > The previous global load balancing tasklet could be an interesting data point.
>
> Yup seems also very interesting to me. We could drop the staggering code
> f.e. if we would leave the patch as is. Maybe there are other ways to
> optimize the code because we know that there are no concurrent
> balance_tick() functions running.
>
> > Do you see a lot of imbalance in the system with the global tasklet? Does it
> > take prolonged interval to reach balanced system from imbalance?
>
> I am rather surprised that I did not see any problems but I think we would
> need some more testing. It seems that having only one load balance
> running at one time speeds up load balancing in general since there is
> less lock contention.
I ran the majority of the micro-benchmarks from the LKP project with the
global load balance tasklet. (http://kernel-perf.sourceforge.net)
Result is here:
http://kernel-perf.sourceforge.net/sched/global-load-bal.txt
All results are within the noise range. The global tasklet does a fairly
good job, especially on context-switch-intensive workloads like aim7,
volanomark, tbench, etc. Note that all machines are non-NUMA platforms.
Based on the data, I think we should make the load balance tasklet one per
NUMA node instead of one per CPU.
- Ken
^ permalink raw reply
* Re: KDB blindly reads keyboard port
From: Keith Owens @ 2006-11-10 6:15 UTC (permalink / raw)
To: Matthew Wilcox; +Cc: Bjorn Helgaas, linux-kernel, linux-ia64
In-Reply-To: <20061110042803.GU16952@parisc-linux.org>
Matthew Wilcox (on Thu, 9 Nov 2006 21:28:03 -0700) wrote:
>On Fri, Nov 10, 2006 at 03:23:20PM +1100, Keith Owens wrote:
>> Bjorn, could you try kdb-v4.4-2.6.19-rc5-{common,ia64}-2 on your
>> problem system? I changed kdb so it only uses the keyboard if at least
>> one console matches the pattern /^tty[0-9]*$/. IOW, if the user
>> specifies an i8042 style console on the command line (or uses the
>> default with CONFIG_VT=y) then kdb will attempt to use that keyboard.
>> Otherwise kdb ignores a VT style console, even when the kernel is
>> compiled with CONFIG_VT=y.
>
>If I'm using an HP Integrity system with a USB keyboard, won't I still
>have a console that matches ^tty[0-9]*$ ?
Good point. How about the console list must include /^tty[0-9]*$/
_and_ there must be an interrupt registered with a name of "i8042"
before KDB will attempt to access i8042 ports?
^ permalink raw reply
* Re: KDB blindly reads keyboard port
From: Keith Owens @ 2006-11-10 6:15 UTC (permalink / raw)
To: Matthew Wilcox; +Cc: Bjorn Helgaas, linux-kernel, linux-ia64
In-Reply-To: <200609261354.30722.bjorn.helgaas@hp.com>
Matthew Wilcox (on Thu, 9 Nov 2006 21:28:03 -0700) wrote:
>On Fri, Nov 10, 2006 at 03:23:20PM +1100, Keith Owens wrote:
>> Bjorn, could you try kdb-v4.4-2.6.19-rc5-{common,ia64}-2 on your
>> problem system? I changed kdb so it only uses the keyboard if at least
>> one console matches the pattern /^tty[0-9]*$/. IOW, if the user
>> specifies an i8042 style console on the command line (or uses the
>> default with CONFIG_VT=y) then kdb will attempt to use that keyboard.
>> Otherwise kdb ignores a VT style console, even when the kernel is
>> compiled with CONFIG_VT=y.
>
>If I'm using an HP Integrity system with a USB keyboard, won't I still
>have a console that matches ^tty[0-9]*$ ?
Good point. How about the console list must include /^tty[0-9]*$/
_and_ there must be an interrupt registered with a name of "i8042"
before KDB will attempt to access i8042 ports?
^ permalink raw reply
* Generic Netlink HOW-TO based on Jamal's original doc
From: Paul Moore @ 2006-11-10 6:08 UTC (permalink / raw)
To: hadi, tgraf; +Cc: netdev
A couple of months ago I promised Jamal and Thomas I would post some comments to
Jamal's original genetlink how-to. However, as I started to work on the
document the diff from the original started to get a little ridiculous so
instead of posting a patch against Jamal's original how-to I'm just posting the
revised document in its entirety.
In the document below I tried to summarize all of the things I learned while
developing NetLabel. Some of it came from Jamal's document, some the kernel
code, and some from discussions with Thomas. Hopefully this document will make
it much easier for others to use genetlink in the future.
If this text below is acceptable to everyone, should this be added to the
Documentation directory?
An Introduction To Using Generic Netlink
===============================================================================
Last Updated: November 10, 2006
Table of Contents
1. Introduction
1.1. Document Overview
1.2. Netlink And Generic Netlink
2. Architectural Overview
3. Generic Netlink Families
3.1. Family Overview
3.1.1. The genl_family Structure
3.1.2. The genl_ops Structure
3.2. Registering A Family
4. Generic Netlink Communications
4.1. Generic Netlink Message Format
4.2. Kernel Communication
4.2.1. Sending Messages
4.2.2. Receiving Messages
4.3. Userspace Communication
5. Recommendations
5.1. Attributes And Message Payloads
5.2. Operation Granularity
5.3. Acknowledgment And Error Reporting
1. Introduction
------------------------------------------------------------------------------
1.1. Document Overview
------------------------------------------------------------------------------
This document gives a brief introduction to Generic Netlink, some simple
examples on how to use it, and some recommendations on how to make the most of
the Generic Netlink communications interface. While this document does not
require that the reader have a detailed understanding of what Netlink is
and how it works, some basic Netlink knowledge is assumed. As usual, the
kernel source code is your best friend here.
While this document talks briefly about Generic Netlink from a userspace point
of view, its primary focus is on the kernel's Generic Netlink API. It is
recommended that application developers who are interested in using Generic
Netlink make use of the libnl library[1].
[1] http://people.suug.ch/~tgr/libnl
1.2. Netlink And Generic Netlink
------------------------------------------------------------------------------
Netlink is a flexible, robust wire-format communications channel typically
used for kernel to user communication although it can also be used for
user to user and kernel to kernel communications. Netlink communication
channels are associated with families or "busses", where each bus deals with a
specific service; for example, different Netlink busses exist for routing,
XFRM, netfilter, and several other kernel subsystems. More information about
Netlink can be found in RFC 3549[1].
Over the years, Netlink has become very popular, which has brought about a very
real concern that the number of Netlink family numbers may be exhausted in the
near future. In response, the Generic Netlink family was created; it acts as a
Netlink multiplexer, allowing multiple services to use a single Netlink bus.
[1] ftp://ftp.rfc-editor.org/in-notes/rfc3549.txt
2. Architectural Overview
------------------------------------------------------------------------------
Figure #1 illustrates the basic Generic Netlink architecture, which is
composed of five different types of components.
1) The Netlink subsystem which serves as the underlying transport layer for
all of the Generic Netlink communications.
2) The Generic Netlink bus which is implemented inside the kernel, but which
is available to userspace through the socket API and inside the kernel via
the normal Netlink and Generic Netlink APIs.
3) The Generic Netlink users who communicate with each other over the Generic
Netlink bus; users can exist both in kernel and user space.
4) The Generic Netlink controller which is part of the kernel and is
responsible for dynamically allocating Generic Netlink communication
channels and other management tasks. The Generic Netlink controller is
implemented as a standard Generic Netlink user, however, it listens on a
special, pre-allocated Generic Netlink channel.
5) The kernel socket API. Generic Netlink sockets are created with the
PF_NETLINK domain and the NETLINK_GENERIC protocol values.
+---------------------+ +---------------------+
| (3) application "A" | | (3) application "B" |
+------+--------------+ +--------------+------+
| |
\ /
\ /
| |
+-------+--------------------------------+-------+
| : : | user-space
=====+ : (5) Kernel socket API : +================
| : : | kernel-space
+--------+-------------------------------+-------+
| |
+-----+-------------------------------+----+
| (1) Netlink subsystem |
+---------------------+--------------------+
|
+---------------------+--------------------+
| (2) Generic Netlink bus |
+--+--------------------------+-------+----+
| | |
+-------+---------+ | |
| (4) Controller | / \
+-----------------+ / \
| |
+------------------+--+ +--+------------------+
| (3) kernel user "X" | | (3) kernel user "Y" |
+---------------------+ +---------------------+
Figure 1: Generic Netlink Architecture
When looking at figure #1 it is important to note that any Generic Netlink
user can communicate with any other user over the bus using the same API
regardless of where the user resides in relation to the kernel/userspace
boundary.
Generic Netlink communications are essentially a series of different
communication channels which are multiplexed on a single Netlink family.
Communication channels are uniquely identified by channel numbers which are
dynamically allocated by the Generic Netlink controller. The controller is a
special Generic Netlink user which listens on a fixed communication channel,
number 0x10, which is always present. Kernel or userspace users which provide
services over the Generic Netlink bus establish new communication channels by
registering their services with the Generic Netlink controller. Users who
want to use an existing service query the controller to see if it exists and
determine the correct channel number.
3. Generic Netlink Families
------------------------------------------------------------------------------
The Generic Netlink mechanism is based on a client-server model. The Generic
Netlink servers register families, which are a collection of well defined
services, with the controller and the clients communicate with the server
through these service registrations. This section explains how Generic Netlink
families are defined, created and registered.
3.1. Family Overview
------------------------------------------------------------------------------
Generic Netlink family service registrations are defined by two structures,
genl_family and genl_ops. The genl_family structure defines the family and
its associated communication channel. The genl_ops structure defines
an individual service or operation which the family provides to other Generic
Netlink users.
This section focuses on Generic Netlink families as they are represented in
the kernel. A similar API exists for userspace applications using the libnl
library[1].
[1] http://people.suug.ch/~tgr/libnl
3.1.1. The genl_family Structure
Generic Netlink services are defined by the genl_family structure, which is
shown below:
struct genl_family
{
unsigned int id;
unsigned int hdrsize;
char name[GENL_NAMSIZ];
unsigned int version;
unsigned int maxattr;
struct nlattr ** attrbuf;
struct list_head ops_list;
struct list_head family_list;
};
Figure 2: The genl_family structure
The genl_family structure fields are used in the following manner:
* unsigned int id
This is the dynamically allocated channel number. A value of 0x0 signifies
that the channel number should be assigned by the controller and the 0x10
value is reserved for use by the controller. Users should always use
value 0x0 when registering a new family.
* unsigned int hdrsize
If the family makes use of a family specific header, its size is stored
here. If there is no family specific header this value should be zero.
* char name[GENL_NAMSIZ]
This string should be unique to the family as it is the key that the
controller uses to lookup channel numbers when requested.
* unsigned int version
Family specific version number.
* unsigned int maxattr
Generic Netlink makes use of the standard Netlink attributes; this value
holds the maximum number of attributes defined for the Generic Netlink
family.
* struct nlattr **attrbuf
* struct list_head ops_list
* struct list_head family_list
These are private fields and should not be modified.
3.1.2. The genl_ops Structure
struct genl_ops
{
u8 cmd;
unsigned int flags;
struct nla_policy *policy;
int (*doit)(struct sk_buff *skb,
struct genl_info *info);
int (*dumpit)(struct sk_buff *skb,
struct netlink_callback *cb);
struct list_head ops_list;
};
Figure 3: The genl_ops structure
The genl_ops structure fields are used in the following manner:
* u8 cmd
This value is unique across the corresponding Generic Netlink family and is
used to reference the operation.
* unsigned int flags
This field is used to specify any special attributes of the operation. The
following flags may be used, multiple flags can be OR'd together:
- GENL_ADMIN_PERM
The operation requires the CAP_NET_ADMIN privilege
* struct nla_policy *policy
This field defines the Netlink attribute policy for the operation request
message. If specified, the Generic Netlink mechanism uses this policy to
verify all of the attributes in an operation request message before calling
the operation handler.
The attribute policy is defined as an array of nla_policy structures indexed
by the attribute number. The nla_policy structure is defined in figure #4.
struct nla_policy
{
u16 type;
u16 len;
};
Figure 4: The nla_policy structure
The fields are used in the following manner:
- u16 type
This specifies the type of the attribute, presently the following types
are defined for general use:
o NLA_UNSPEC
Undefined type
o NLA_U8
An 8 bit unsigned integer
o NLA_U16
A 16 bit unsigned integer
o NLA_U32
A 32 bit unsigned integer
o NLA_U64
A 64 bit unsigned integer
o NLA_FLAG
A simple boolean flag
o NLA_MSECS
A 64 bit time value in msecs
o NLA_STRING
A variable length string
o NLA_NUL_STRING
A variable length NULL terminated string
o NLA_NESTED
A stream of attributes
- u16 len
When the attribute type is one of the string types then this field should
be set to the maximum length of the string, not including the terminal
NULL byte. If the attribute type is unknown or NLA_UNSPEC then this field
should be set to the exact length of the attribute's payload.
Unless the attribute type is one of the fixed length types above, a value
of zero indicates that no validation of the attribute should be performed.
* int (*doit)(struct sk_buff *skb, struct genl_info *info)
This callback is similar in use to the standard Netlink 'doit' callback, the
primary difference being the change in parameters.
The 'doit' handler receives two parameters, the first is the message buffer
which triggered the handler and the second is a Generic Netlink genl_info
structure which is defined in figure #5.
struct genl_info
{
u32 snd_seq;
u32 snd_pid;
struct nlmsghdr * nlhdr;
struct genlmsghdr * genlhdr;
void * userhdr;
struct nlattr ** attrs;
};
Figure 5: The genl_info structure
The fields are populated in the following manner:
- u32 snd_seq
This is the Netlink sequence number of the request.
- u32 snd_pid
This is the PID of the client which issued the request.
- struct nlmsghdr *nlhdr
This is set to point to the Netlink message header of the request.
- struct genlmsghdr *genlhdr
This is set to point to the Generic Netlink message header of the request.
- void *userhdr
If the Generic Netlink family makes use of a family specific header, this
pointer will be set to point to the start of the family specific header.
- struct nlattr **attrs
The parsed Netlink attributes from the request, if the Generic Netlink
family definition specified a Netlink attribute policy then the
attributes will have already been validated.
The 'doit' handler should do whatever processing is necessary and return
zero on success, or a negative value on failure. Negative return values
will cause a NLMSG_ERROR message to be sent while a zero return value will
only cause a NLMSG_ERROR message to be sent if the request is received with
the NLM_F_ACK flag set.
* int (*dumpit)(struct sk_buff *skb, struct netlink_callback *cb)
This callback is similar in use to the standard Netlink 'dumpit' callback.
The 'dumpit' callback is invoked when a Generic Netlink message is received
with the NLM_F_DUMP flag set.
The main difference between a 'dumpit' handler and a 'doit' handler is
that a 'dumpit' handler does not allocate a message buffer for a response;
a pre-allocated sk_buff is passed to the 'dumpit' handler as the first
parameter. The 'dumpit' handler should fill the message buffer with the
appropriate response message and return the size of the sk_buff,
i.e. sk_buff->len, and the message buffer will automatically be sent to the
Generic Netlink client that initiated the request. As long as the 'dumpit'
handler returns a value greater than zero it will be called again with a
newly allocated message buffer to fill, when the handler has no more data
to send it should return zero; error conditions are indicated by returning
a negative value. If necessary, state can be preserved in the
netlink_callback parameter which is passed to the 'dumpit' handler; the
netlink_callback parameter values will be preserved across handler calls
for a single request.
* struct list_head ops_list
This is a private field and should not be modified.
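The repeated-call protocol of the 'dumpit' handler described above can be sketched in plain userspace C. This is only a model under stated assumptions: model_cb and model_buf are hypothetical stand-ins for netlink_callback and sk_buff, and the four-entry buffer capacity is arbitrary. It shows a handler that resumes from saved state and a caller that keeps supplying fresh buffers until the handler returns zero.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins, not kernel types: model_cb plays the role of
 * netlink_callback (state kept across calls), model_buf that of sk_buff. */
#define MODEL_BUF_CAP 4

struct model_cb  { long args[4]; };
struct model_buf { int items[MODEL_BUF_CAP]; int len; };

/* A dumpit-style handler: emit entries 0..total-1, one buffer at a time. */
int model_dumpit(struct model_buf *buf, struct model_cb *cb, int total)
{
    int next = (int)cb->args[0];        /* resume where the last call stopped */

    while (next < total && buf->len < MODEL_BUF_CAP)
        buf->items[buf->len++] = next++;

    cb->args[0] = next;                 /* preserve state for the next call */
    return buf->len;                    /* zero means no more data to send */
}

/* The caller's side: hand over fresh buffers until zero (or an error). */
int model_dump_all(int total, int *calls)
{
    struct model_cb cb = { { 0 } };
    int sent = 0;
    int ret;

    *calls = 0;
    do {
        struct model_buf buf = { { 0 }, 0 };    /* newly allocated buffer */

        ret = model_dumpit(&buf, &cb, total);
        (*calls)++;
        if (ret < 0)
            return ret;                 /* negative value: error condition */
        sent += ret;
    } while (ret > 0);

    return sent;
}
```

Note that the handler is always called one extra time after the last entry has been sent; that final call is the one that returns zero and ends the dump.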
3.2. Registering A Family
------------------------------------------------------------------------------
Registering a Generic Netlink family is a simple four step process: define the
family, define the operations, register the family, register the operations.
To help demonstrate these steps, a simple example is broken down and
explained in detail below.
The first step is to define the family itself, which we do by creating an
instance of the genl_family structure which we explained in section 3.1.1.
In our simple example we are going to create a new Generic Netlink family
named "DOC_EXMPL".
/* attributes */
enum {
DOC_EXMPL_A_UNSPEC,
DOC_EXMPL_A_MSG,
__DOC_EXMPL_A_MAX,
};
#define DOC_EXMPL_A_MAX (__DOC_EXMPL_A_MAX - 1)
/* attribute policy */
static struct nla_policy doc_exmpl_genl_policy[DOC_EXMPL_A_MAX + 1] = {
[DOC_EXMPL_A_MSG] = { .type = NLA_NUL_STRING },
};
/* family definition */
static struct genl_family doc_exmpl_gnl_family = {
.id = GENL_ID_GENERATE,
.hdrsize = 0,
.name = "DOC_EXMPL",
.version = 1,
.maxattr = DOC_EXMPL_A_MAX,
};
Figure 6: The DOC_EXMPL family, attributes, and policy
You can see above that we defined a new family and the family recognizes a
single attribute, DOC_EXMPL_A_MSG, which is a NULL terminated string. The
GENL_ID_GENERATE macro/constant is really just the value 0x0 and it signifies
that we want the Generic Netlink controller to assign the channel number when
we register the family.
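To make the effect of an attribute policy more concrete, here is a small userspace model of the length check described in section 3.1.2. It is a sketch and not the kernel's validation code: the model_* names are hypothetical, the integer types are assumed to require at least their natural width, and the NLA_UNSPEC and nested cases are not modelled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for a few of the NLA_* attribute types. */
enum model_type { M_UNSPEC, M_U8, M_U16, M_U32, M_U64, M_STRING };

/* Mirrors the two fields of struct nla_policy shown in figure #4. */
struct model_policy {
    unsigned short type;
    unsigned short len;     /* strings: maximum length without terminator */
};

/* Return 0 if a payload of plen bytes passes the policy, -1 otherwise. */
int model_validate(const struct model_policy *p,
                   const void *payload, size_t plen)
{
    const char *s = payload;
    size_t n = plen;

    switch (p->type) {
    case M_U8:
        return plen >= 1 ? 0 : -1;      /* fixed length types are always */
    case M_U16:
        return plen >= 2 ? 0 : -1;      /* checked, even when len is 0   */
    case M_U32:
        return plen >= 4 ? 0 : -1;
    case M_U64:
        return plen >= 8 ? 0 : -1;
    case M_STRING:
        if (n > 0 && s[n - 1] == '\0')
            n--;                        /* do not count the terminator */
        if (p->len != 0 && n > p->len)
            return -1;                  /* longer than the policy allows */
        return 0;                       /* len == 0: no length validation */
    default:
        return 0;                       /* not modelled in this sketch */
    }
}
```

With a policy of { M_STRING, 4 } the payload "abcd" is accepted and "abcde" is rejected, while a string policy with a len of zero, like the DOC_EXMPL one, accepts a string of any length.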
The second step is to define the operations for the family, which we do by
creating at least one instance of the genl_ops structure which we explained in
section 3.1.2. In this example we are only going to define one operation but
you can define up to 255 unique operations for each family.
/* handler */
int doc_exmpl_echo(struct sk_buff *skb, struct genl_info *info)
{
/* message handling code goes here; return 0 on success, negative
* values on failure */
}
/* commands */
enum {
DOC_EXMPL_C_UNSPEC,
DOC_EXMPL_C_ECHO,
__DOC_EXMPL_C_MAX,
};
#define DOC_EXMPL_C_MAX (__DOC_EXMPL_C_MAX - 1)
/* operation definition */
struct genl_ops doc_exmpl_gnl_ops_echo = {
.cmd = DOC_EXMPL_C_ECHO,
.flags = 0,
.policy = doc_exmpl_genl_policy,
.doit = doc_exmpl_echo,
.dumpit = NULL,
};
Figure 7: The DOC_EXMPL_C_ECHO operation
Here we have defined a single operation, DOC_EXMPL_C_ECHO, which uses the
Netlink attribute policy we defined above. Once registered, this particular
operation would call the doc_exmpl_echo() function whenever a
DOC_EXMPL_C_ECHO message is sent to the DOC_EXMPL family over the Generic
Netlink bus.
The third step is to register the DOC_EXMPL family with the Generic Netlink
mechanism. We do this with a single function call:
genl_register_family(&doc_exmpl_gnl_family);
This call registers the new family name with the Generic Netlink mechanism and
requests a new channel number which is stored in the genl_family struct,
replacing the GENL_ID_GENERATE value. It is important to remember to
unregister Generic Netlink families when done as the kernel does allocate
resources for each registered family.
The fourth and final step is to register the operations for the family. Once
again this is a simple function call:
genl_register_ops(&doc_exmpl_gnl_family, &doc_exmpl_gnl_ops_echo);
This call registers the DOC_EXMPL_C_ECHO operation in association with the
DOC_EXMPL family. The process is now complete; other Generic Netlink users can
now issue DOC_EXMPL_C_ECHO commands and they will be handled as desired.
4. Generic Netlink Communications
------------------------------------------------------------------------------
This section deals with the Generic Netlink messages themselves and how to
send and receive messages.
4.1. Generic Netlink Message Format
------------------------------------------------------------------------------
Generic Netlink uses the standard Netlink subsystem as a transport layer, which
means that the foundation of the Generic Netlink message is the standard
Netlink message format; the only difference is the inclusion of a Generic
Netlink message header. The format of the message is defined below:
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Netlink message header (nlmsghdr) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Generic Netlink message header (genlmsghdr) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Optional user specific message header |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Optional Generic Netlink message payload |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 8: Generic Netlink message format
Figure 8 is included only to give you a rough idea of how Generic Netlink
messages are formatted and sent on the "wire". In practice the Netlink and
Generic Netlink API should insulate most users from the details of the message
format and the Netlink message headers.
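For readers who do want to see the first two boxes of the figure as bytes, the
following user space sketch writes both headers into a plain buffer. The ex_*
structures are local mirrors of struct nlmsghdr and struct genlmsghdr, and
ex_write_headers() is a hypothetical helper; neither is part of the kernel
API. The optional user specific header is left out.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* local mirrors of struct nlmsghdr and struct genlmsghdr, redefined here
 * only so the sketch builds without the kernel headers */
struct ex_nlmsghdr {
	uint32_t nlmsg_len;	/* total length, headers included */
	uint16_t nlmsg_type;	/* for Generic Netlink: the family's channel number */
	uint16_t nlmsg_flags;
	uint32_t nlmsg_seq;
	uint32_t nlmsg_pid;
};

struct ex_genlmsghdr {
	uint8_t  cmd;		/* family specific command */
	uint8_t  version;
	uint16_t reserved;
};

/* Netlink aligns headers and attributes to 4 bytes */
#define EX_ALIGN(len)	(((size_t)(len) + 3) & ~(size_t)3)
#define EX_NLMSG_HDRLEN	EX_ALIGN(sizeof(struct ex_nlmsghdr))
#define EX_GENL_HDRLEN	EX_ALIGN(sizeof(struct ex_genlmsghdr))

/* write both headers at the start of buf; returns the offset at which the
 * payload starts, or 0 if buf is too small */
size_t ex_write_headers(unsigned char *buf, size_t buflen, uint16_t family_id,
			uint8_t cmd, uint8_t version, uint32_t seq,
			uint32_t pid)
{
	struct ex_nlmsghdr nlh;
	struct ex_genlmsghdr gnlh;
	size_t off = EX_NLMSG_HDRLEN + EX_GENL_HDRLEN;

	if (buflen < off)
		return 0;

	memset(buf, 0, off);

	nlh.nlmsg_len = (uint32_t)off;	/* grows as payload is added */
	nlh.nlmsg_type = family_id;
	nlh.nlmsg_flags = 0;
	nlh.nlmsg_seq = seq;
	nlh.nlmsg_pid = pid;
	memcpy(buf, &nlh, sizeof(nlh));

	gnlh.cmd = cmd;
	gnlh.version = version;
	gnlh.reserved = 0;
	memcpy(buf + EX_NLMSG_HDRLEN, &gnlh, sizeof(gnlh));

	return off;
}
```

The Netlink header occupies 16 bytes and the Generic Netlink header 4 bytes,
so the payload in this sketch starts at offset 20.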
4.2. Kernel Communication
------------------------------------------------------------------------------
The kernel provides two sets of interfaces for sending, receiving, and
processing Generic Netlink messages. The majority of the API consists of the
general purpose Netlink interfaces, however, there are a small number of
interfaces specific to Generic Netlink. The following two include files
define the Netlink and Generic Netlink API for the kernel.
* include/net/netlink.h
* include/net/genetlink.h
4.2.1. Sending Messages
Sending Generic Netlink messages is a three step process: allocate memory for
the message buffer, create the message, send the message. To help demonstrate
these steps, a simple example using the DOC_EXMPL family shown in section 3 is
given below.
The first step is to allocate a Netlink message buffer; the easiest way to do
this is with the nlmsg_new() function.
struct sk_buff *skb;
skb = nlmsg_new(NLMSG_GOODSIZE, GFP_KERNEL);
if (skb == NULL)
goto failure;
Figure 9: Allocating a Generic Netlink message buffer
The NLMSG_GOODSIZE macro/constant is a good value to use when you do not know
the size of the message buffer at the time of allocation. Don't forget that
the message buffer needs to be big enough to hold the message payload and both
the Netlink and Generic Netlink message headers.
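When the payload size is known, the space needed can be worked out by hand.
The sketch below does the arithmetic in user space for a message carrying a
single attribute. The EX_* constants and ex_msg_size() are names invented for
this example; the header sizes are those of struct nlmsghdr, struct genlmsghdr
and struct nlattr.

```c
#include <stddef.h>

/* Netlink aligns headers and attributes to 4 bytes */
#define EX_ALIGN(len)	(((size_t)(len) + 3) & ~(size_t)3)

#define EX_NLMSG_HDRLEN	16	/* sizeof(struct nlmsghdr) */
#define EX_GENL_HDRLEN	4	/* sizeof(struct genlmsghdr) */
#define EX_NLA_HDRLEN	4	/* sizeof(struct nlattr) */

/* bytes needed for a message with one attribute whose payload is attr_len
 * bytes; for a string attribute, count the terminating NUL */
size_t ex_msg_size(size_t attr_len)
{
	return EX_NLMSG_HDRLEN + EX_GENL_HDRLEN +
	       EX_ALIGN(EX_NLA_HDRLEN + attr_len);
}
```

For the string "Generic Netlink Rocks" the attribute payload is 22 bytes with
the terminator, which pads to a 28 byte attribute and gives a 48 byte message.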
The second step is to actually create the message payload. This is obviously
something which is very specific to each service, but a simple example is
shown below.
int rc;
void *msg_head;
/* create the message headers */
msg_head = genlmsg_put(skb, pid, seq, type, 0, flags, DOC_EXMPL_C_ECHO, 1);
if (msg_head == NULL) {
rc = -ENOMEM;
goto failure;
}
/* add a DOC_EXMPL_A_MSG attribute */
rc = nla_put_string(skb, DOC_EXMPL_A_MSG, "Generic Netlink Rocks");
if (rc != 0)
goto failure;
/* finalize the message */
genlmsg_end(skb, msg_head);
Figure 10: Creating a Generic Netlink message payload
The genlmsg_put() function creates the required Netlink and Generic Netlink
message headers, populating them with the given values; see the Generic
Netlink header file for a description of the parameters. The nla_put_string()
function is a standard Netlink attribute function which adds a string
attribute to the end of the Netlink message; see the Netlink header file for a
description of the parameters. The genlmsg_end() function updates the Netlink
message header once the message payload has been finalized; this function
should be called before sending the message.
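As a rough picture of what these helpers leave in the buffer, the user space
sketch below appends a string attribute and then records the final message
length. It is a model under stated assumptions, not the kernel code: the
ex_nlattr structure mirrors struct nlattr, and ex_put_string() and
ex_msg_end() are hypothetical stand-ins for nla_put_string() and
genlmsg_end().

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* local mirror of struct nlattr */
struct ex_nlattr {
	uint16_t nla_len;	/* header plus payload, padding excluded */
	uint16_t nla_type;
};

#define EX_ALIGN(len)	(((size_t)(len) + 3) & ~(size_t)3)
#define EX_NLA_HDRLEN	EX_ALIGN(sizeof(struct ex_nlattr))

/* append a NUL-terminated string attribute at offset off; returns the new
 * end offset, or 0 if the attribute does not fit */
size_t ex_put_string(unsigned char *buf, size_t buflen, size_t off,
		     uint16_t type, const char *str)
{
	struct ex_nlattr nla;
	size_t payload = strlen(str) + 1;
	size_t total;

	if (payload > UINT16_MAX - EX_NLA_HDRLEN)
		return 0;
	total = EX_ALIGN(EX_NLA_HDRLEN + payload);
	if (off > buflen || buflen - off < total)
		return 0;

	nla.nla_len = (uint16_t)(EX_NLA_HDRLEN + payload);
	nla.nla_type = type;

	memset(buf + off, 0, total);	/* also zeroes the padding */
	memcpy(buf + off, &nla, sizeof(nla));
	memcpy(buf + off + EX_NLA_HDRLEN, str, payload);

	return off + total;
}

/* store the final length in the first field of the Netlink header, which
 * is the 32-bit nlmsg_len at offset 0 */
void ex_msg_end(unsigned char *buf, size_t end)
{
	uint32_t len = (uint32_t)end;

	memcpy(buf, &len, sizeof(len));
}
```

The length stored in the attribute header covers the header and the string
but not the padding, whereas the length stored in the Netlink header covers
the whole message, padding included.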
The third and final step is to send the Generic Netlink message which can be
done with a single function call. The example below is for a unicast send,
but interfaces exist for doing a multicast send of Generic Netlink messages.
int rc;
rc = genlmsg_unicast(skb, pid);
if (rc != 0)
goto failure;
Figure 11: Sending Generic Netlink messages
4.2.2. Receiving Messages
Typically, the kernel acts as a Generic Netlink server, which means that the act of
receiving messages is handled automatically by the Generic Netlink bus. Once
the bus receives the message and determines the correct routing, the message
is passed directly to the family specific operation callback for processing.
If the kernel is acting as a Generic Netlink client, server response messages
can be received over the Generic Netlink socket using standard kernel socket
interfaces.
4.3. Userspace Communication
------------------------------------------------------------------------------
While Generic Netlink messages can be sent and received using the standard
socket API it is recommended that user space applications use the libnl
library[1]. The libnl library insulates applications from many of the low
level Netlink tasks and uses an API which is very similar to the kernel API
shown above.
[1] http://people.suug.ch/~tgr/libnl
5. Recommendations
------------------------------------------------------------------------------
The Generic Netlink mechanism is a very flexible communications mechanism and
as a result there are many different ways it can be used. The following
recommendations are based on conventions within the Linux kernel and should be
followed whenever possible. While not all existing kernel code follows the
recommendations outlined here, all new code should consider these
recommendations as requirements.
5.1. Attributes And Message Payloads
------------------------------------------------------------------------------
When defining new Generic Netlink message formats you must make use of the
Netlink attributes wherever possible. The Netlink attribute mechanism has
been carefully designed to allow for future message expansion while preserving
backward compatibility. There are also additional benefits to using Netlink
attributes which include developer familiarity and basic input checking.
Most common data structures can be represented with Netlink attributes:
* scalar values
Most scalar values already have well defined attribute types, see section 3
for details
* structures
Structures can be represented using a nested attribute with the structure
fields represented as attributes in the payload of the container attribute
* arrays
Arrays can be represented by using a single nested attribute as a container
with several of the same attribute type inside each representing a spot in
the array
It is also important to use unique attributes as much as possible. This helps
make the most of the Netlink attribute mechanisms and provides for easy changes
to the message format in the future.
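As an illustration of the array case, the user space sketch below encodes a
list of 32-bit values as one container attribute holding one attribute per
element. The attribute numbers EX_A_LIST and EX_A_ENTRY and the helper
ex_put_u32_array() are made up for this example. In the kernel the
nla_nest_start() and nla_nest_end() helpers in include/net/netlink.h would
normally be used to build the container.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* local mirror of struct nlattr */
struct ex_nlattr {
	uint16_t nla_len;	/* header plus payload */
	uint16_t nla_type;
};

#define EX_NLA_HDRLEN	4	/* sizeof(struct nlattr) */

/* hypothetical attribute numbers, for illustration only */
enum {
	EX_A_UNSPEC,
	EX_A_LIST,	/* container */
	EX_A_ENTRY,	/* one array element */
};

/* encode n values as one EX_A_LIST attribute containing n EX_A_ENTRY
 * attributes; returns the bytes written, or 0 if they do not fit */
size_t ex_put_u32_array(unsigned char *buf, size_t buflen,
			const uint32_t *vals, size_t n)
{
	struct ex_nlattr nla;
	size_t entry = EX_NLA_HDRLEN + sizeof(uint32_t);  /* 8, already aligned */
	size_t total, off, i;

	if (n > (UINT16_MAX - EX_NLA_HDRLEN) / entry)
		return 0;	/* would overflow the 16-bit length */
	total = EX_NLA_HDRLEN + n * entry;
	if (buflen < total)
		return 0;

	/* the container's length covers everything nested inside it */
	nla.nla_len = (uint16_t)total;
	nla.nla_type = EX_A_LIST;
	memcpy(buf, &nla, sizeof(nla));

	off = EX_NLA_HDRLEN;
	for (i = 0; i < n; i++) {
		nla.nla_len = (uint16_t)entry;
		nla.nla_type = EX_A_ENTRY;
		memcpy(buf + off, &nla, sizeof(nla));
		memcpy(buf + off + EX_NLA_HDRLEN, &vals[i], sizeof(uint32_t));
		off += entry;
	}

	return total;
}
```

Because every element is a full attribute, a receiver can walk the container
with the usual attribute helpers, and new element types can be added later
without changing the outer message format.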
5.2. Operation Granularity
------------------------------------------------------------------------------
While it may be tempting to register a single operation for a Generic Netlink
family and multiplex multiple sub-commands on the single operation this
is strongly discouraged for security reasons. Combining multiple behaviors
into one operation makes it difficult to restrict the operations using the
existing Linux kernel security mechanisms.
5.3. Acknowledgment and Error Reporting
------------------------------------------------------------------------------
It is often necessary for Generic Netlink services to return an ACK or error
code to the client. It is not necessary to implement an explicit
acknowledgment message as Netlink already provides a flexible acknowledgment
and error reporting message type called NLMSG_ERROR. When an error occurs a
NLMSG_ERROR message is returned to the client with the error code returned by
the Generic Netlink operation handler. Clients can also request a NLMSG_ERROR
message when no error has occurred by setting the NLM_F_ACK flag on requests.
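A client side check for such a reply might look like the user space sketch
below. The ex_* structures mirror struct nlmsghdr and struct nlmsgerr, and the
EX_* constants repeat the values of NLMSG_ERROR, NLM_F_REQUEST and NLM_F_ACK;
the helper names ex_request_flags() and ex_parse_ack() are invented for this
example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* local mirrors of struct nlmsghdr and struct nlmsgerr, redefined here only
 * so the sketch builds without the kernel headers */
struct ex_nlmsghdr {
	uint32_t nlmsg_len;
	uint16_t nlmsg_type;
	uint16_t nlmsg_flags;
	uint32_t nlmsg_seq;
	uint32_t nlmsg_pid;
};

struct ex_nlmsgerr {
	int32_t error;			/* 0 for an ACK, negative errno on failure */
	struct ex_nlmsghdr msg;		/* header of the request being answered */
};

#define EX_NLMSG_ERROR		0x2	/* NLMSG_ERROR */
#define EX_NLM_F_REQUEST	0x1	/* NLM_F_REQUEST */
#define EX_NLM_F_ACK		0x4	/* NLM_F_ACK */

/* flags for a request; want_ack asks for a reply even on success */
uint16_t ex_request_flags(int want_ack)
{
	return (uint16_t)(EX_NLM_F_REQUEST | (want_ack ? EX_NLM_F_ACK : 0));
}

/* inspect a received message: returns 1 and stores the code in *error if it
 * is an NLMSG_ERROR message, 0 if it is some other message type, and -1 if
 * the buffer is truncated or malformed */
int ex_parse_ack(const unsigned char *buf, size_t len, int *error)
{
	struct ex_nlmsghdr nlh;
	struct ex_nlmsgerr err;

	if (len < sizeof(nlh))
		return -1;
	memcpy(&nlh, buf, sizeof(nlh));
	if (nlh.nlmsg_len < sizeof(nlh) || nlh.nlmsg_len > len)
		return -1;
	if (nlh.nlmsg_type != EX_NLMSG_ERROR)
		return 0;
	if (nlh.nlmsg_len < sizeof(nlh) + sizeof(err))
		return -1;

	memcpy(&err, buf + sizeof(nlh), sizeof(err));
	*error = err.error;
	return 1;
}
```

An error value of 0 is the acknowledgment requested with NLM_F_ACK; any
negative value is the error code returned by the operation handler.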
--
paul moore
linux security @ hp
^ permalink raw reply
* Re: 2.6.19-rc4-mm2: BUG modprobeing sound driver
From: Andrew Morton @ 2006-11-10 6:05 UTC (permalink / raw)
To: eric
Cc: Eric Buddington, linux-kernel, Greg KH, Kay Sievers,
Jaroslav Kysela, Takashi Iwai
In-Reply-To: <20061109142208.GA4291@pool-70-109-251-157.wma.east.verizon.net>
On Thu, 09 Nov 2006 09:22:09 -0500
Eric Buddington <ebuddington@verizon.net> wrote:
> Kernel 2.6.19-rc4-mm2 on an Athlon XP / SiS 741 motherboard chipset
>
> In the process of modprobe-ing all the sound modules to figure out
> which I needed (is that kosher? Well, I did it anyway), I got the
> following BUG. Share and enjoy.
>
> [ 673.745969] BUG: unable to handle kernel NULL pointer dereference at virtual address 00000064
> [ 673.745974] printing eip:
> [ 673.745977] c02d1ae5
> [ 673.745979] *pde = 00000000
> [ 673.745986] Oops: 0000 [#1]
> [ 673.745988] PREEMPT
> [ 673.745992] last sysfs file: /devices/pci0000:00/0000:00:00.0/class
> [ 673.746131] CPU: 0
> [ 673.746133] EIP: 0060:[<c02d1ae5>] Not tainted VLI
> [ 673.746135] EFLAGS: 00010246 (2.6.19-rc4-mm2 #1)
> [ 673.746150] EIP is at device_del+0x8/0x156
> [ 673.746154] eax: 00000000 ebx: 00000000 ecx: c1aff240 edx: c1679380
> [ 673.746159] esi: 00000000 edi: 0000ffff ebp: d0bd9cd0 esp: d0bd9cc4
> [ 673.746162] ds: 007b es: 007b ss: 0068
> [ 673.746166] Process modprobe (pid: 5689, ti=d0bd8000 task=f50eb1b0 task.ti=d0bd8000)
> [ 673.746169] Stack: 00000000 d085ee00 0000ffff d0bd9cdc c02d1c3e d085ee00 d0bd9cf4 f8a196e3
> [ 673.746177] d0bd9cf4 f8a19d5b 00000000 00000000 d0bd9d1c f8a19ea4 ee290000 00000286
> [ 673.746185] ee290000 ffffffed 0000ffff ee290000 ffffffed 0000ffff d0bd9d88 f8c18539
> [ 673.746192] Call Trace:
> [ 673.746212] [<c02d1c3e>] device_unregister+0xb/0x15
> [ 673.746220] [<f8a196e3>] snd_card_do_free+0xe6/0xf5 [snd]
> [ 673.746264] [<f8a19ea4>] snd_card_free+0x77/0x81 [snd]
> [ 673.746283] [<f8c18539>] snd_serial_probe+0x47a/0x528 [snd_serial_u16550]
> [ 673.746329] [<c02d49ee>] platform_drv_probe+0xf/0x11
> [ 673.746339] [<c02d3682>] really_probe+0x79/0x105
> [ 673.746349] [<c02d37a3>] driver_probe_device+0x95/0xa1
> [ 673.746357] [<c02d37b7>] __device_attach+0x8/0xa
> [ 673.746363] [<c02d2c5d>] bus_for_each_drv+0x37/0x60
> [ 673.746371] [<c02d3830>] device_attach+0x62/0x76
> [ 673.746377] [<c02d2bd2>] bus_attach_device+0x21/0x42
> [ 673.746384] [<c02d1fa1>] device_add+0x2a8/0x3eb
> [ 673.746390] [<c02d4ce1>] platform_device_add+0xee/0x11e
> [ 673.746398] [<c02d4ede>] platform_device_register_simple+0x35/0x4b
> [ 673.746405] [<f8c18079>] alsa_card_serial_init+0x39/0x7f [snd_serial_u16550]
Yup, trivial to reproduce: modprobe snd_serial_u16550 -> splat.
Bisection indicates that this oops is triggered by
gregkh-driver-sound-device.patch.
snd_serial_probe() never got to call snd_card_register(), so card->dev is
NULL.
snd_serial_probe() calls snd_card_free(card) on the error path and
snd_card_do_free() does device_del(card->dev) which oopses over the null
pointer it got.
^ permalink raw reply
* Re: shallow clone failed git pull
From: Aneesh Kumar K.V @ 2006-11-10 5:58 UTC (permalink / raw)
To: Aneesh Kumar K.V, git
In-Reply-To: <4552A865.5000201@gmail.com>
[-- Attachment #1: Type: text/plain, Size: 682 bytes --]
Aneesh Kumar K.V wrote:
> I was using the pu branch i tried to update the git repository and i got
> this error.
>
> alk 9e950efa20dc8037c27509666cba6999da9368e8
> walk 3b6a792f6ace33584897d1af08630c9acc0ce221
> * refs/heads/origin: fast forward to branch 'master' of
> http://repo.or.cz/r/linux-2.6
> old..new: 3d42488..088406b
> Auto-following refs/tags/v2.6.19-rc5
> shallow clone with http not supported
>
>
> This repository was not cloned with -depth. I only updated the git tools
> using the pu branch
>
The attached patch gets it working. I am not sure whether the fix is the right one. I
am a little bit confused regarding the $depth being incremented.
-aneesh
[-- Attachment #2: git-fetch.sh.diff --]
[-- Type: text/x-patch, Size: 1402 bytes --]
diff --git a/git-fetch.sh b/git-fetch.sh
index 8b46e73..6459994 100755
--- a/git-fetch.sh
+++ b/git-fetch.sh
@@ -21,7 +21,7 @@ update_head_ok=
exec=
upload_pack=
keep=
-depth=
+depth=0
while case "$#" in 0) break ;; esac
do
case "$1" in
@@ -304,7 +304,7 @@ fetch_main () {
# There are transports that can fetch only one head at a time...
case "$remote" in
http://* | https://* | ftp://*)
- test -n "$depth" && die "shallow clone with http not supported"
+ [ x"$depth" != x0 ] && die "shallow clone with http not supported"
proto=`expr "$remote" : '\([^:]*\):'`
if [ -n "$GIT_SSL_NO_VERIFY" ]; then
curl_extra_args="-k"
@@ -325,7 +325,7 @@ fetch_main () {
print "$u";
' "$head")
head=$(curl -nsfL $curl_extra_args $noepsv_opt "$remote/$remote_name_quoted")
- depth=$( expr \( $depth + 1 \) )
+ # depth=$( expr \( $depth + 1 \) )
done
expr "z$head" : "z$_x40\$" >/dev/null ||
die "Failed to fetch $remote_name from $remote"
@@ -333,7 +333,7 @@ fetch_main () {
git-http-fetch -v -a "$head" "$remote/" || exit
;;
rsync://*)
- test -n "$depth" && die "shallow clone with rsync not supported"
+ [ x"$depth" != x0 ] && die "shallow clone with rsync not supported"
TMP_HEAD="$GIT_DIR/TMP_HEAD"
rsync -L -q "$remote/$remote_name" "$TMP_HEAD" || exit 1
head=$(git-rev-parse --verify TMP_HEAD)
^ permalink raw reply related
* Re: [discuss] Re: 2.6.19-rc4: known unfixed regressions (v3)
From: Jeff Chua @ 2006-11-10 5:53 UTC (permalink / raw)
To: Linus Torvalds
Cc: Adrian Bunk, Matthew Wilcox, Andi Kleen, Aaron Durbin,
Andrew Morton, Linux Kernel Mailing List, gregkh, linux-pci
> Jeff - when you enable "direct PCI access", what is the printout? You
> should get
>
> PCI: Using configuration type 1
>
> and the kernel should never have used MMCONFIG if the area wasn't marked
> as reserved in e820..
Linus,
Here's what I get ...
PCI: Using configuration type 1
Setting up standard PCI resources
Complete dmesg below. rc5 has same problem as rc4. I'll start testing the
git version shortly.
Thanks
Jeff.
Linux version 2.6.19-rc5 (root@boston.corp.fedex.com) (gcc version 3.4.5) #3 SMP PREEMPT Thu Nov 9 23:43:41 SGT 2006
BIOS-provided physical RAM map:
BIOS-e820: 0000000000000000 - 00000000000a0000 (usable)
BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
BIOS-e820: 0000000000100000 - 00000000df686c00 (usable)
BIOS-e820: 00000000df686c00 - 00000000df688c00 (ACPI NVS)
BIOS-e820: 00000000df688c00 - 00000000df68ac00 (ACPI data)
BIOS-e820: 00000000df68ac00 - 00000000e0000000 (reserved)
BIOS-e820: 00000000f0000000 - 00000000f4000000 (reserved)
BIOS-e820: 00000000fec00000 - 00000000fed00400 (reserved)
BIOS-e820: 00000000fed20000 - 00000000feda0000 (reserved)
BIOS-e820: 00000000fee00000 - 00000000fef00000 (reserved)
BIOS-e820: 00000000ffb00000 - 0000000100000000 (reserved)
2678MB HIGHMEM available.
896MB LOWMEM available.
found SMP MP-table at 000fe710
Entering add_active_range(0, 0, 915078) 0 entries of 256 used
Zone PFN ranges:
DMA 0 -> 4096
Normal 4096 -> 229376
HighMem 229376 -> 915078
early_node_map[1] active PFN ranges
0: 0 -> 915078
On node 0 totalpages: 915078
DMA zone: 32 pages used for memmap
DMA zone: 0 pages reserved
DMA zone: 4064 pages, LIFO batch:0
Normal zone: 1760 pages used for memmap
Normal zone: 223520 pages, LIFO batch:31
HighMem zone: 5357 pages used for memmap
HighMem zone: 680345 pages, LIFO batch:31
DMI 2.3 present.
ACPI: RSDP (v002 DELL ) @ 0x000feb00
ACPI: XSDT (v001 DELL GX620 0x00000007 ASL 0x00000061) @ 0x000fd253
ACPI: FADT (v003 DELL GX620 0x00000007 ASL 0x00000061) @ 0x000fd34b
ACPI: SSDT (v001 DELL st_ex 0x00001000 INTL 0x20050309) @ 0xfffd6996
ACPI: MADT (v001 DELL GX620 0x00000007 ASL 0x00000061) @ 0x000fd43f
ACPI: BOOT (v001 DELL GX620 0x00000007 ASL 0x00000061) @ 0x000fd4b1
ACPI: ASF! (v016 DELL GX620 0x00000007 ASL 0x00000061) @ 0x000fd4d9
ACPI: MCFG (v001 DELL GX620 0x00000007 ASL 0x00000061) @ 0x000fd540
ACPI: HPET (v001 DELL GX620 0x00000007 ASL 0x00000061) @ 0x000fd57e
ACPI: DSDT (v001 DELL dt_ex 0x00001000 INTL 0x20050309) @ 0x00000000
ACPI: PM-Timer IO Port: 0x808
ACPI: Local APIC address 0xfee00000
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
Processor #0 15:4 APIC version 20
ACPI: LAPIC (acpi_id[0x02] lapic_id[0x01] enabled)
Processor #1 15:4 APIC version 20
ACPI: LAPIC (acpi_id[0x03] lapic_id[0x05] disabled)
ACPI: LAPIC (acpi_id[0x04] lapic_id[0x07] disabled)
ACPI: LAPIC_NMI (acpi_id[0xff] high level lint[0x1])
ACPI: IOAPIC (id[0x08] address[0xfec00000] gsi_base[0])
IOAPIC[0]: apic_id 8, version 32, address 0xfec00000, GSI 0-23
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
ACPI: IRQ0 used by override.
ACPI: IRQ2 used by override.
ACPI: IRQ9 used by override.
Enabling APIC mode: Flat. Using 1 I/O APICs
ACPI: HPET id: 0x8086a201 base: 0xfed00000
Using ACPI (MADT) for SMP configuration information
Allocating PCI resources starting at e1000000 (gap: e0000000:10000000)
Detected 2992.745 MHz processor.
Built 1 zonelists. Total pages: 907929
Kernel command line: BOOT_IMAGE=(hd0,14)/linux/bzc1 root=/dev/sda2 resume=/dev/sda3 testing_only="this is got to be good. Now I can send in a very long line just like 2.4 and need not worry about the line being too long. What a great way to start a great year!!! Cool!"
mapped APIC to ffffd000 (fee00000)
mapped IOAPIC to ffffc000 (fec00000)
Enabling fast FPU save and restore... done.
Enabling unmasked SIMD FPU exception support... done.
Initializing CPU#0
PID hash table entries: 4096 (order: 12, 16384 bytes)
Console: colour VGA+ 80x25
Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
Memory: 3625948k/3660312k available (2402k kernel code, 33292k reserved, 848k data, 200k init, 2742808k highmem)
virtual kernel memory layout:
fixmap : 0xfff50000 - 0xfffff000 ( 700 kB)
pkmap : 0xff800000 - 0xffc00000 (4096 kB)
vmalloc : 0xf8800000 - 0xff7fe000 ( 111 MB)
lowmem : 0xc0000000 - 0xf8000000 ( 896 MB)
.init : 0xc0433000 - 0xc0465000 ( 200 kB)
.data : 0xc03589f0 - 0xc042cab4 ( 848 kB)
.text : 0xc0100000 - 0xc03589f0 (2402 kB)
Checking if this processor honours the WP bit even in supervisor mode... Ok.
Using HPET for base-timer
Calibrating delay using timer specific routine.. 5990.52 BogoMIPS (lpj=11981059)
Mount-cache hash table entries: 512
CPU: After generic identify, caps: bfebfbff 20100000 00000000 00000000 0000649d 00000000 00000001
monitor/mwait feature present.
using mwait in idle threads.
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 1024K
CPU: Physical Processor ID: 0
CPU: Processor Core ID: 0
CPU: After all inits, caps: bfebfbff 20100000 00000000 00000180 0000649d 00000000 00000001
Compat vDSO mapped to ffffe000.
Checking 'hlt' instruction... OK.
SMP alternatives: switching to UP code
ACPI: Core revision 20060707
CPU0: Intel(R) Pentium(R) D CPU 3.00GHz stepping 07
SMP alternatives: switching to SMP code
Booting processor 1/1 eip 3000
Initializing CPU#1
Calibrating delay using timer specific routine.. 5985.49 BogoMIPS (lpj=11970995)
CPU: After generic identify, caps: bfebfbff 20100000 00000000 00000000 0000649d 00000000 00000001
monitor/mwait feature present.
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 1024K
CPU: Physical Processor ID: 0
CPU: Processor Core ID: 1
CPU: After all inits, caps: bfebfbff 20100000 00000000 00000180 0000649d 00000000 00000001
CPU1: Intel(R) Pentium(R) D CPU 3.00GHz stepping 07
Total of 2 processors activated (11976.02 BogoMIPS).
ENABLING IO-APIC IRQs
..TIMER: vector=0x31 apic1=0 pin1=2 apic2=-1 pin2=-1
checking TSC synchronization across 2 CPUs: passed.
Brought up 2 CPUs
migration_cost=319
NET: Registered protocol family 16
ACPI: bus type pci registered
PCI: Using configuration type 1
Setting up standard PCI resources
ACPI: Interpreter enabled
ACPI: Using IOAPIC for interrupt routing
ACPI: PCI Root Bridge [PCI0] (0000:00)
PCI: Probing PCI hardware (bus 00)
ACPI: Assume root bridge [\_SB_.PCI0] bus is 0
Boot video device is 0000:00:02.0
PCI quirk: region 0800-087f claimed by ICH6 ACPI/GPIO/TCO
PCI quirk: region 0880-08bf claimed by ICH6 GPIO
PCI: Ignoring BAR0-3 of IDE controller 0000:00:1f.1
PCI: Transparent bridge - 0000:00:1e.0
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.PCI4._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.PCI2._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.PCI3._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.PCI1._PRT]
ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 6 7 9 10 *11 12 15)
ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 5 6 7 9 *10 11 12 15)
ACPI: PCI Interrupt Link [LNKC] (IRQs *3 4 5 6 7 9 10 11 12 15)
ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 5 6 7 9 10 11 12 15) *0, disabled.
ACPI: PCI Interrupt Link [LNKE] (IRQs 3 4 *5 6 7 9 10 11 12 15)
ACPI: PCI Interrupt Link [LNKF] (IRQs 3 4 5 6 7 *9 10 11 12 15)
ACPI: PCI Interrupt Link [LNKG] (IRQs 3 4 *5 6 7 9 10 11 12 15)
ACPI: PCI Interrupt Link [LNKH] (IRQs 3 4 5 6 7 9 *10 11 12 15)
Linux Plug and Play Support v0.97 (c) Adam Belay
pnp: PnP ACPI init
pnp: PnP ACPI: found 10 devices
SCSI subsystem initialized
usbcore: registered new interface driver usbfs
usbcore: registered new interface driver hub
usbcore: registered new device driver usb
PCI: Using ACPI for IRQ routing
PCI: If a device doesn't work, try "pci=routeirq". If it helps, post a report
NET: Registered protocol family 23
pnp: 00:01: ioport range 0x800-0x85f could not be reserved
pnp: 00:01: ioport range 0xc00-0xc7f has been reserved
pnp: 00:01: ioport range 0x860-0x8ff could not be reserved
pnp: 00:09: ioport range 0x100-0x1fe has been reserved
pnp: 00:09: ioport range 0x200-0x277 has been reserved
pnp: 00:09: ioport range 0x280-0x2e7 has been reserved
pnp: 00:09: ioport range 0x2f0-0x2f7 has been reserved
pnp: 00:09: ioport range 0x300-0x377 has been reserved
pnp: 00:09: ioport range 0x380-0x3bb has been reserved
pnp: 00:09: ioport range 0x3c0-0x3e7 could not be reserved
pnp: 00:09: ioport range 0x3f6-0x3f7 has been reserved
PCI: Ignore bogus resource 6 [0:0] of 0000:00:02.0
PCI: Bridge: 0000:00:01.0
IO window: disabled.
MEM window: fe900000-fe9fffff
PREFETCH window: disabled.
PCI: Bridge: 0000:00:1c.0
IO window: disabled.
MEM window: fe800000-fe8fffff
PREFETCH window: disabled.
PCI: Bridge: 0000:00:1c.1
IO window: disabled.
MEM window: fe700000-fe7fffff
PREFETCH window: disabled.
PCI: Bridge: 0000:00:1e.0
IO window: disabled.
MEM window: disabled.
PREFETCH window: disabled.
ACPI: PCI Interrupt 0000:00:01.0[A] -> GSI 16 (level, low) -> IRQ 16
PCI: Setting latency timer of device 0000:00:01.0 to 64
ACPI: PCI Interrupt 0000:00:1c.0[A] -> GSI 16 (level, low) -> IRQ 16
PCI: Setting latency timer of device 0000:00:1c.0 to 64
ACPI: PCI Interrupt 0000:00:1c.1[B] -> GSI 17 (level, low) -> IRQ 17
PCI: Setting latency timer of device 0000:00:1c.1 to 64
PCI: Setting latency timer of device 0000:00:1e.0 to 64
NET: Registered protocol family 2
IP route cache hash table entries: 32768 (order: 5, 131072 bytes)
TCP established hash table entries: 262144 (order: 9, 3145728 bytes)
TCP bind hash table entries: 65536 (order: 7, 786432 bytes)
TCP: Hash tables configured (established 262144 bind 65536)
TCP reno registered
Simple Boot Flag at 0x7a set to 0x80
apm: disabled - APM is not SMP safe.
highmem bounce pool size: 64 pages
Total HugeTLB memory allocated, 0
io scheduler noop registered
io scheduler anticipatory registered (default)
io scheduler deadline registered
io scheduler cfq registered
PCI: Setting latency timer of device 0000:00:01.0 to 64
assign_interrupt_mode Found MSI capability
Allocate Port Service[0000:00:01.0:pcie00]
PCI: Setting latency timer of device 0000:00:1c.0 to 64
assign_interrupt_mode Found MSI capability
Allocate Port Service[0000:00:1c.0:pcie00]
Allocate Port Service[0000:00:1c.0:pcie02]
PCI: Setting latency timer of device 0000:00:1c.1 to 64
assign_interrupt_mode Found MSI capability
Allocate Port Service[0000:00:1c.1:pcie00]
Allocate Port Service[0000:00:1c.1:pcie02]
pci_hotplug: PCI Hot Plug PCI Core version: 0.5
acpiphp: ACPI Hot Plug PCI Controller Driver version: 0.5
acpiphp_glue: can't get bus number, assuming 0
acpiphp_ibm: ibm_acpiphp_init: acpi_walk_namespace failed
pciehp: HPC vendor_id 8086 device_id 27d0 ss_vid 0 ss_did 0
Evaluate _OSC Set fails. Status = 0x0005
Evaluate _OSC Set fails. Status = 0x0005
pciehp: Cannot get control of hotplug hardware for pci 0000:00:1c.0
pciehp: HPC vendor_id 8086 device_id 27d2 ss_vid 0 ss_did 0
Evaluate _OSC Set fails. Status = 0x0005
Evaluate _OSC Set fails. Status = 0x0005
pciehp: Cannot get control of hotplug hardware for pci 0000:00:1c.1
pciehp: PCI Express Hot Plug Controller Driver version: 0.4
ACPI: Power Button (FF) [PWRF]
ACPI: Power Button (CM) [VBTN]
ACPI Exception (acpi_processor-0681): AE_NOT_FOUND, Processor Device is not present [20060707]
ACPI: Getting cpuindex for acpiid 0x3
ACPI Exception (acpi_processor-0681): AE_NOT_FOUND, Processor Device is not present [20060707]
ACPI: Getting cpuindex for acpiid 0x4
ibm_acpi: ec object not found
Real Time Clock Driver v1.12ac
Linux agpgart interface v0.101 (c) Dave Jones
agpgart: Detected an Intel 945G Chipset.
agpgart: Detected 7932K stolen memory.
agpgart: AGP aperture is 256M @ 0xe0000000
[drm] Initialized drm 1.0.1 20051102
RAMDISK driver initialized: 16 RAM disks of 20480K size 1024 blocksize
loop: loaded (max 8 devices)
HP CISS Driver (v 3.6.10)
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
ICH7: IDE controller at PCI slot 0000:00:1f.1
ACPI: PCI Interrupt 0000:00:1f.1[A] -> GSI 16 (level, low) -> IRQ 16
ICH7: chipset revision 1
ICH7: not 100% native mode: will probe irqs later
ide0: BM-DMA at 0xffa0-0xffa7, BIOS settings: hda:DMA, hdb:pio
Probing IDE interface ide0...
hda: HL-DT-ST DVD+/-RW GWA4164B, ATAPI CD/DVD-ROM drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
Probing IDE interface ide1...
megaraid cmm: 2.20.2.7 (Release Date: Sun Jul 16 00:01:03 EST 2006)
megaraid: 2.20.4.9 (Release Date: Sun Jul 16 12:27:22 EST 2006)
megasas: 00.00.03.05 Mon Oct 02 11:21:32 PDT 2006
libata version 2.00 loaded.
ata_piix 0000:00:1f.2: version 2.00ac6
ata_piix 0000:00:1f.2: MAP [ P0 P2 P1 P3 ]
ACPI: PCI Interrupt 0000:00:1f.2[C] -> GSI 20 (level, low) -> IRQ 18
PCI: Setting latency timer of device 0000:00:1f.2 to 64
ata1: SATA max UDMA/133 cmd 0xFE00 ctl 0xFE12 bmdma 0xFEA0 irq 18
ata2: SATA max UDMA/133 cmd 0xFE20 ctl 0xFE32 bmdma 0xFEA8 irq 18
scsi0 : ata_piix
ata1.00: ATA-7, max UDMA/133, 488281250 sectors: LBA48 NCQ (depth 0/32)
ata1.00: ata1: dev 0 multi count 8
ata1.01: ATA-7, max UDMA/133, 488281250 sectors: LBA48 NCQ (depth 0/32)
ata1.01: ata1: dev 1 multi count 8
ata1.00: configured for UDMA/133
ata1.01: configured for UDMA/133
scsi1 : ata_piix
ata2: port is slow to respond, please be patient (Status 0xff)
ata2: port failed to respond (30 secs, Status 0xff)
ata2: SRST failed (status 0xFF)
ata2: SRST failed (err_mask=0x100)
ata2: softreset failed, retrying in 5 secs
ata2: SRST failed (status 0xFF)
ata2: SRST failed (err_mask=0x100)
ata2: softreset failed, retrying in 5 secs
ata2: SRST failed (status 0xFF)
ata2: SRST failed (err_mask=0x100)
ata2: reset failed, giving up
scsi 0:0:0:0: Direct-Access ATA WDC WD2500JS-75N 10.0 PQ: 0 ANSI: 5
SCSI device sda: 488281250 512-byte hdwr sectors (250000 MB)
sda: Write Protect is off
sda: Mode Sense: 00 3a 00 00
SCSI device sda: drive cache: write back
SCSI device sda: 488281250 512-byte hdwr sectors (250000 MB)
sda: Write Protect is off
sda: Mode Sense: 00 3a 00 00
SCSI device sda: drive cache: write back
sda: sda1 sda2 sda3 sda4 < sda5 sda6 sda7 sda8 sda9 sda10 sda11 sda12 sda13 sda14 sda15 >
sd 0:0:0:0: Attached scsi disk sda
scsi 0:0:1:0: Direct-Access ATA WDC WD2500JS-75N 10.0 PQ: 0 ANSI: 5
SCSI device sdb: 488281250 512-byte hdwr sectors (250000 MB)
sdb: Write Protect is off
sdb: Mode Sense: 00 3a 00 00
SCSI device sdb: drive cache: write back
SCSI device sdb: 488281250 512-byte hdwr sectors (250000 MB)
sdb: Write Protect is off
sdb: Mode Sense: 00 3a 00 00
SCSI device sdb: drive cache: write back
sdb:
sd 0:0:1:0: Attached scsi disk sdb
usbmon: debugfs is not available
ACPI: PCI Interrupt 0000:00:1d.7[A] -> GSI 21 (level, low) -> IRQ 19
PCI: Setting latency timer of device 0000:00:1d.7 to 64
ehci_hcd 0000:00:1d.7: EHCI Host Controller
ehci_hcd 0000:00:1d.7: new USB bus registered, assigned bus number 1
ehci_hcd 0000:00:1d.7: debug port 1
PCI: cache line size of 128 is not supported by device 0000:00:1d.7
ehci_hcd 0000:00:1d.7: irq 19, io mem 0xffa80800
ehci_hcd 0000:00:1d.7: USB 2.0 started, EHCI 1.00, driver 10 Dec 2004
usb usb1: configuration #1 chosen from 1 choice
hub 1-0:1.0: USB hub found
hub 1-0:1.0: 8 ports detected
USB Universal Host Controller Interface driver v3.0
ACPI: PCI Interrupt 0000:00:1d.0[A] -> GSI 21 (level, low) -> IRQ 19
PCI: Setting latency timer of device 0000:00:1d.0 to 64
uhci_hcd 0000:00:1d.0: UHCI Host Controller
uhci_hcd 0000:00:1d.0: new USB bus registered, assigned bus number 2
uhci_hcd 0000:00:1d.0: irq 19, io base 0x0000ff80
usb usb2: configuration #1 chosen from 1 choice
hub 2-0:1.0: USB hub found
hub 2-0:1.0: 2 ports detected
ACPI: PCI Interrupt 0000:00:1d.1[B] -> GSI 22 (level, low) -> IRQ 20
PCI: Setting latency timer of device 0000:00:1d.1 to 64
uhci_hcd 0000:00:1d.1: UHCI Host Controller
uhci_hcd 0000:00:1d.1: new USB bus registered, assigned bus number 3
uhci_hcd 0000:00:1d.1: irq 20, io base 0x0000ff60
usb usb3: configuration #1 chosen from 1 choice
hub 3-0:1.0: USB hub found
hub 3-0:1.0: 2 ports detected
ACPI: PCI Interrupt 0000:00:1d.2[C] -> GSI 18 (level, low) -> IRQ 21
PCI: Setting latency timer of device 0000:00:1d.2 to 64
uhci_hcd 0000:00:1d.2: UHCI Host Controller
uhci_hcd 0000:00:1d.2: new USB bus registered, assigned bus number 4
uhci_hcd 0000:00:1d.2: irq 21, io base 0x0000ff40
usb usb4: configuration #1 chosen from 1 choice
hub 4-0:1.0: USB hub found
hub 4-0:1.0: 2 ports detected
ACPI: PCI Interrupt 0000:00:1d.3[D] -> GSI 23 (level, low) -> IRQ 22
PCI: Setting latency timer of device 0000:00:1d.3 to 64
uhci_hcd 0000:00:1d.3: UHCI Host Controller
uhci_hcd 0000:00:1d.3: new USB bus registered, assigned bus number 5
uhci_hcd 0000:00:1d.3: irq 22, io base 0x0000ff20
usb usb5: configuration #1 chosen from 1 choice
hub 5-0:1.0: USB hub found
hub 5-0:1.0: 2 ports detected
usb 2-1: new low speed USB device using uhci_hcd and address 2
usb 2-1: configuration #1 chosen from 1 choice
usb 2-2: new low speed USB device using uhci_hcd and address 3
usb 2-2: configuration #1 chosen from 1 choice
usbcore: registered new interface driver hiddev
input: USB Keyboard as /class/input/input0
input: USB HID v1.10 Keyboard [ USB Keyboard] on usb-0000:00:1d.0-1
input: USB Keyboard as /class/input/input1
input: USB HID v1.10 Device [ USB Keyboard] on usb-0000:00:1d.0-1
input: USB Optical Mouse as /class/input/input2
input: USB HID v1.11 Mouse [USB Optical Mouse] on usb-0000:00:1d.0-2
usbcore: registered new interface driver usbhid
drivers/usb/input/hid-core.c: v2.6:USB HID core driver
PNP: No PS/2 controller found. Probing ports directly.
serio: i8042 KBD port at 0x60,0x64 irq 1
serio: i8042 AUX port at 0x60,0x64 irq 12
mice: PS/2 mouse device common for all mice
device-mapper: ioctl: 4.10.0-ioctl (2006-09-14) initialised: dm-devel@redhat.com
EDAC MC: Ver: 2.0.1 Nov 9 2006
TCP cubic registered
NET: Registered protocol family 1
NET: Registered protocol family 17
p4-clockmod: P4/Xeon(TM) CPU On-Demand Clock Modulation available
Starting balanced_irq
Using IPI No-Shortcut mode
Time: tsc clocksource has been installed.
ACPI: (supports S0 S1 S3 S4 S5)
VFS: Mounted root (reiserfs filesystem) readonly.
Freeing unused kernel memory: 200k freed
ReiserFS: sda9: replayed 504 transactions in 0 seconds
Adding 1277156k swap on /dev/sda3. Priority:-1 extents:1 across:1277156k
tg3.c:v3.68 (November 02, 2006)
ACPI: PCI Interrupt 0000:02:00.0[A] -> GSI 16 (level, low) -> IRQ 16
PCI: Setting latency timer of device 0000:02:00.0 to 64
eth0: Tigon3 [partno(BCM5751PKFBG) rev 4001 PHY(5750)] (PCI Express) 10/100/1000BaseT Ethernet 00:13:72:7b:29:60
eth0: RXcsums[1] LinkChgREG[1] MIirq[1] ASF[0] Split[0] WireSpeed[1] TSOcap[1]
eth0: dma_rwctrl[76180000] dma_mask[64-bit]
tg3: eth0: Link is up at 10 Mbps, half duplex.
tg3: eth0: Flow control is off for TX and off for RX.
^ permalink raw reply
* [PATCH] xencomm, xenmem and hypercall continuation
From: Isaku Yamahata @ 2006-11-10 5:49 UTC (permalink / raw)
To: xen-devel; +Cc: xen-ppc-devel, xen-ia64-devel
[-- Attachment #1: Type: text/plain, Size: 887 bytes --]
Fix the xenmem hypercall for non-trivial xencomm architectures (i.e. ia64 and powerpc).
On ia64 and powerpc, the effect of guest_handle_add_offset() persists across
a hypercall continuation because the consumed offset is recorded in
the guest domain's memory space.
On x86, by contrast, the effect of guest_handle_add_offset() is volatile
across a hypercall continuation.
So the xenmem hypercall (more specifically increase_reservation,
decrease_reservation, populate_memory and exchange) is broken on
ia64 and powerpc.
#ifdef/ifndef CONFIG_X86 is used to solve this issue without breaking
the existing ABI. #ifdef is ugly, but it would be difficult to solve
this issue without #ifdef while also preserving the existing ABI.
I checked the other users of both guest_handle_add_offset() and hypercall
continuation. Fortunately, the other users record their restart points
in a hypercall argument, so this isn't an issue for them.
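The difference between a volatile and a persistent handle offset can be modelled in a few lines of C. This is only a toy sketch of the behaviour described above, not Xen code: `demo_pass`, `struct demo_guest` and the `DEMO_*` names are hypothetical, and the "budget" stands in for hypercall_preempt_check() firing after a fixed number of extents.

```c
#include <assert.h>
#include <stddef.h>

enum demo_arch { DEMO_X86, DEMO_XENCOMM };

/* State that lives in guest memory and therefore survives a continuation. */
struct demo_guest {
	size_t desc_offset;	/* offset consumed so far (xencomm-style handle) */
};

/*
 * One pass of a preemptible hypercall over extents [start, nr).
 * Processes at most 'budget' extents, stores the first index it touched
 * in *first, and returns the start_extent for the next pass.
 */
static size_t demo_pass(enum demo_arch arch, struct demo_guest *g,
			size_t start, size_t nr, size_t budget, size_t *first)
{
	size_t done;

	assert(g != NULL && first != NULL && start <= nr);

	if (arch == DEMO_X86)
		*first = start;			/* volatile: re-apply the offset on entry */
	else
		*first = g->desc_offset;	/* persistent: already advanced earlier */

	done = (nr - start < budget) ? nr - start : budget;

	if (arch == DEMO_XENCOMM)
		g->desc_offset += done;		/* record progress before continuing */

	return start + done;
}
```

In this model both variants touch the same extents on every pass, which is the property the #ifdef/#ifndef split is meant to restore; applying the entry-time offset on the xencomm side as well would advance the handle twice.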
--
yamahata
[-- Attachment #2: 12315_f3e97d467b6f_xencomm_and_xenmem_hypercall.patch --]
[-- Type: text/plain, Size: 3754 bytes --]
# HG changeset patch
# User yamahata@valinux.co.jp
# Date 1163136126 -32400
# Node ID f3e97d467b6f7e36c95d4deb3ed3bc955710f8e7
# Parent 5470cadfeb22e33e7abb86193224984140732361
Fix the xenmem hypercall for non-trivial xencomm architectures (i.e. ia64 and powerpc).
On ia64 and powerpc, the effect of guest_handle_add_offset() persists across
a hypercall continuation because the consumed offset is recorded in
the guest domain's memory space.
On x86, by contrast, the effect of guest_handle_add_offset() is volatile
across a hypercall continuation.
So the xenmem hypercall (more specifically increase_reservation,
decrease_reservation, populate_memory and exchange) is broken on
ia64 and powerpc.
#ifdef/ifndef CONFIG_X86 is used to solve this issue without breaking
the existing ABI. #ifdef is ugly, but it would be difficult to solve
this issue without #ifdef while also preserving the existing ABI.
I checked the other users of both guest_handle_add_offset() and hypercall
continuation. Fortunately, the other users record their restart points
in a hypercall argument, so this isn't an issue for them.
PATCHNAME: xencomm_and_xenmem_hypercall
Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
diff -r 5470cadfeb22 -r f3e97d467b6f xen/common/memory.c
--- a/xen/common/memory.c Fri Nov 10 14:22:05 2006 +0900
+++ b/xen/common/memory.c Fri Nov 10 14:22:06 2006 +0900
@@ -341,23 +341,29 @@ memory_exchange(XEN_GUEST_HANDLE(xen_mem
memflags = MEMF_dma;
}
+#ifdef CONFIG_X86
guest_handle_add_offset(exch.in.extent_start, exch.nr_exchanged);
+#endif
exch.in.nr_extents -= exch.nr_exchanged;
if ( exch.in.extent_order <= exch.out.extent_order )
{
in_chunk_order = exch.out.extent_order - exch.in.extent_order;
out_chunk_order = 0;
+#ifdef CONFIG_X86
guest_handle_add_offset(
exch.out.extent_start, exch.nr_exchanged >> in_chunk_order);
+#endif
exch.out.nr_extents -= exch.nr_exchanged >> in_chunk_order;
}
else
{
in_chunk_order = 0;
out_chunk_order = exch.in.extent_order - exch.out.extent_order;
+#ifdef CONFIG_X86
guest_handle_add_offset(
exch.out.extent_start, exch.nr_exchanged << out_chunk_order);
+#endif
exch.out.nr_extents -= exch.nr_exchanged << out_chunk_order;
}
@@ -379,6 +385,15 @@ memory_exchange(XEN_GUEST_HANDLE(xen_mem
{
if ( hypercall_preempt_check() )
{
+#ifndef CONFIG_X86
+ guest_handle_add_offset(exch.in.extent_start, i);
+ if ( exch.in.extent_order <= exch.out.extent_order )
+ guest_handle_add_offset(
+ exch.out.extent_start, i >> in_chunk_order);
+ else
+ guest_handle_add_offset(
+ exch.out.extent_start, i << out_chunk_order);
+#endif
exch.nr_exchanged += i << in_chunk_order;
if ( copy_field_to_guest(arg, &exch, nr_exchanged) )
return -EFAULT;
@@ -543,8 +558,10 @@ long do_memory_op(unsigned long cmd, XEN
if ( unlikely(start_extent > reservation.nr_extents) )
return start_extent;
+#ifdef CONFIG_X86
if ( !guest_handle_is_null(reservation.extent_start) )
guest_handle_add_offset(reservation.extent_start, start_extent);
+#endif
reservation.nr_extents -= start_extent;
if ( (reservation.address_bits != 0) &&
@@ -596,6 +613,10 @@ long do_memory_op(unsigned long cmd, XEN
if ( unlikely(reservation.domid != DOMID_SELF) )
put_domain(d);
+#ifndef CONFIG_X86
+ if (rc > 0 && !guest_handle_is_null(reservation.extent_start))
+ guest_handle_add_offset(reservation.extent_start, rc);
+#endif
rc += start_extent;
if ( preempted )
[-- Attachment #3: Type: text/plain, Size: 138 bytes --]
^ permalink raw reply
* [PATCH] fix xencomm_add_offset(). index must be incremented.
From: Isaku Yamahata @ 2006-11-10 5:46 UTC (permalink / raw)
To: xen-devel; +Cc: xen-ppc-devel, xen-ia64-devel
[-- Attachment #1: Type: text/plain, Size: 145 bytes --]
fix xencomm_add_offset(). index must be incremented.
I tested only the ia64 part.
Xen/powerpc developers, please check the powerpc part.
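As a rough illustration of why the index has to advance, here is a simplified stand-alone model of such a skip loop. It is a sketch under assumptions, not the Xen source: `demo_add_offset`, `DEMO_PAGE_SIZE` and `DEMO_INVALID` are hypothetical stand-ins, and the descriptor is reduced to a plain array of addresses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_PAGE_SIZE 4096u
#define DEMO_INVALID   (~(uint64_t)0)	/* marks an entry that is fully consumed */

/* Skip 'bytes' bytes from the front of a page-address array. */
static void demo_add_offset(uint64_t *address, size_t nr_addrs, size_t bytes)
{
	size_t i = 0;

	assert(address != NULL || nr_addrs == 0);

	while (bytes > 0 && i < nr_addrs) {
		uint64_t paddr = address[i];
		size_t chunksz, chunk_skip;

		if (paddr == DEMO_INVALID) {
			i++;		/* already consumed: move to the next entry */
			continue;
		}

		/* bytes left in this page, and how many of them to skip */
		chunksz = DEMO_PAGE_SIZE - (size_t)(paddr % DEMO_PAGE_SIZE);
		chunk_skip = bytes < chunksz ? bytes : chunksz;

		if (chunk_skip == chunksz)
			address[i] = DEMO_INVALID;	/* whole remainder skipped */
		else
			address[i] += chunk_skip;	/* partial skip within the page */

		bytes -= chunk_skip;
		i++;		/* without this, every chunk would hit entry 0 */
	}
}
```

Calling it twice shows the point of the first hunk as well: a second call has to step over entries that an earlier call already marked as consumed.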
--
yamahata
[-- Attachment #2: 12320_10cd8f17f3e5_fix_xencomm_add_offset.patch --]
[-- Type: text/plain, Size: 1670 bytes --]
# HG changeset patch
# User yamahata@valinux.co.jp
# Date 1163136850 -32400
# Node ID 10cd8f17f3e583282fd4d8efa7d6f05f4a122319
# Parent 13397c9919becfc33466206f8488d7ce55097100
fix xencomm_add_offset(). index must be incremented.
PATCHNAME: fix_xencomm_add_offset
Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
diff -r 13397c9919be -r 10cd8f17f3e5 xen/arch/ia64/xen/xencomm.c
--- a/xen/arch/ia64/xen/xencomm.c Fri Nov 10 14:26:32 2006 +0900
+++ b/xen/arch/ia64/xen/xencomm.c Fri Nov 10 14:34:10 2006 +0900
@@ -345,6 +345,11 @@ xencomm_add_offset(
unsigned int chunksz;
unsigned int chunk_skip;
+ if (dest_paddr == XENCOMM_INVALID) {
+ i++;
+ continue;
+ }
+
pgoffset = dest_paddr % PAGE_SIZE;
chunksz = PAGE_SIZE - pgoffset;
@@ -356,6 +361,8 @@ xencomm_add_offset(
desc->address[i] += chunk_skip;
}
bytes -= chunk_skip;
+
+ i++;
}
return handle;
}
diff -r 13397c9919be -r 10cd8f17f3e5 xen/arch/powerpc/usercopy.c
--- a/xen/arch/powerpc/usercopy.c Fri Nov 10 14:26:32 2006 +0900
+++ b/xen/arch/powerpc/usercopy.c Fri Nov 10 14:34:10 2006 +0900
@@ -249,6 +249,11 @@ int xencomm_add_offset(void *handle, uns
unsigned int chunksz;
unsigned int chunk_skip;
+ if (dest_paddr == XENCOMM_INVALID) {
+ i++;
+ continue;
+ }
+
pgoffset = dest_paddr % PAGE_SIZE;
chunksz = PAGE_SIZE - pgoffset;
@@ -260,6 +265,8 @@ int xencomm_add_offset(void *handle, uns
desc->address[i] += chunk_skip;
}
bytes -= chunk_skip;
+
+ i++;
}
return 0;
}
[-- Attachment #3: Type: text/plain, Size: 152 bytes --]
^ permalink raw reply
* Re: [PATCH 05/12] SUNRPC: Add a function to format the address in an svc_rqst for printing
From: Neil Brown @ 2006-11-10 5:44 UTC (permalink / raw)
To: Olaf Kirch; +Cc: nfs, Chuck Lever
In-Reply-To: <20061109101638.GB24104@suse.de>
On Thursday November 9, okir@suse.de wrote:
> On Thu, Nov 09, 2006 at 02:44:50PM +1100, Neil Brown wrote:
> > > +void svc_print_addr(struct svc_rqst *rqstp, char *buf, size_t len)
> > > +{
> > > + __svc_print_addr((struct sockaddr *) &rqstp->rq_addr, buf, len);
> > > +
> > > +}
> >
> > Can we have svc_print_addr take a
> > char buf[RPC_MAX_ADDRBUFLEN]
> > rather than a char* and a len? The buffer is always the same size, so
> > best to hard code it..
>
> The compiler will still let you pass a buf[6] without complaining.
> Assumptions that buffer arguments are "big enough" make for great
> security bugs two years down the road because people forget.
> There's a reason why getwd and gets have been deprecated for 18 years :)
I had hoped that sparse would pick it up, but it seems not.
>
> If you want to make it easy, why not have
>
> #define RPC_SVC_ADDR(rqstp, buf) \
> (svc_print_addr(rqstp, buf, sizeof(buf)))
That's one option.
Another would be
struct rpc_addr { char buf[RPC_MAX_ADDRBUFLEN]; };
char *svc_print_addr(struct svc_rqst *rqstp, struct rpc_addr *buf)
{
....
return buf->buf;
}
so we force the compiler to check the allocated space.
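A self-contained sketch of that idea, for illustration only: `demo_addr`, `demo_rqst`, `demo_print_addr` and `DEMO_MAX_ADDRBUFLEN` are hypothetical stand-ins (the real RPC_MAX_ADDRBUFLEN and struct svc_rqst live in the SUNRPC headers), and the address formatting is invented for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical size, standing in for RPC_MAX_ADDRBUFLEN. */
#define DEMO_MAX_ADDRBUFLEN 63

/* Wrapping the array in a struct makes its size part of the type. */
struct demo_addr {
	char buf[DEMO_MAX_ADDRBUFLEN];
};

/* Stand-in for struct svc_rqst: only what this sketch needs. */
struct demo_rqst {
	unsigned int ip[4];
	unsigned int port;
};

/* Format the peer address into the fixed-size wrapper and return the string. */
static char *demo_print_addr(const struct demo_rqst *rqstp, struct demo_addr *out)
{
	assert(rqstp != NULL && out != NULL);

	snprintf(out->buf, sizeof(out->buf), "%u.%u.%u.%u, port=%u",
		 rqstp->ip[0], rqstp->ip[1], rqstp->ip[2], rqstp->ip[3],
		 rqstp->port);

	/* snprintf always terminates within the buffer */
	assert(strlen(out->buf) < sizeof(out->buf));
	return out->buf;
}
```

Passing a `char small[6]` here is a type error rather than a silent overflow, because a `char *` does not convert to a `struct demo_addr *`.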
NeilBrown
^ permalink raw reply
* Zero-copy question
From: Jonathan Day @ 2006-11-10 5:44 UTC (permalink / raw)
To: linux-kernel
Hi,
I'm working on a problem involving zero-copy between
an external device and physical memory, where the
total length of data is not known in advance, where
allowance must be made for non-contiguous pages in
physical memory and where any number of simultaneous
channels can be going in parallel.
Whilst I am waiting for suitable medication to arrive,
I've been trying to figure out a way in which this
problem can be solved.
My first question would be: has anyone already solved
this, thus saving my sanity? When I've looked up
zero-copy, the main reference I've found is for moving
data from fixed storage to a network. This is useful,
but doesn't quite solve the problem at hand, as you
never need more than a page at a time, so don't have
memory fragmentation to contend with.
My second question is then: assuming that zero-copy in
these situations has been deemed a Bad Idea for
whatever reason, what is the fastest method of doing
bi-directional transfers? (Feel free to assume some
logic on the device.)
Finally, I'd be interested in hearing the opinions of
Linux kernel developers on using structured memory
groups in the kernel for parallel I/O of this kind,
abandoning the zero-copy approach. Although I know a
fair number of kernel projects, I don't know of one
that uses SMGs - am I missing some obvious ones or is
there a known reason for avoiding the SMG approach?
Jonathan Day
^ permalink raw reply
* Re: [Fastboot] Kexec with latest kernel fail
From: Eric W. Biederman @ 2006-11-10 5:24 UTC (permalink / raw)
To: Lu, Yinghai; +Cc: Horms, yhlu, Fastboot mailing list, ebiederm, linux-kernel
In-Reply-To: <5986589C150B2F49A46483AC44C7BCA49071D3@ssvlexmb2.amd.com>
"Lu, Yinghai" <yinghai.lu@amd.com> writes:
> Thanks, It compiled
>
> kexec get the same error "Invalid memory segment 0x100000 - ...."
Could you post the entire message and the contents of /proc/iomem.
That is where the compare is happening so this may be a parsing
issue of /proc/iomem.
Eric
^ permalink raw reply
* Re: [PATCH] sysctl: Undeprecate sys_sysctl
From: Eric W. Biederman @ 2006-11-10 5:21 UTC (permalink / raw)
To: Alistair John Strachan; +Cc: linux-kernel
In-Reply-To: <200611092317.26459.s0348365@sms.ed.ac.uk>
Alistair John Strachan <s0348365@sms.ed.ac.uk> writes:
> On Wednesday 08 November 2006 19:00, you wrote:
>> The basic issue is that despite have been deprecated and warned about
>> as a very bad thing in the man pages since its inception there are a
>> few real users of sys_sysctl. It was my assumption that because
>> sysctl had been deprecated for all of 2.6 there would be no user space
>> users by this point, so I initially gave sys_sysctl a very short
>> deprecation period.
>>
>> Now that I know there are a few real users the only sane way to
>> proceed with deprecation is to push the time limit out to a year or
>> two work and work with distributions that have big testing pools like
>> fedora core to find these last remaining users.
>
> Eric, do you have a list of the remaining users? It'd be good to know for
> people using Linux in an embedded environment, where they may want to switch
> off the option, but only if it doesn't break their userspace.
They are very very few. The ones I recall are kudzu, radvd, and
libpthreads (which doesn't care).
There was a thread a month or so ago about this, where I put out a request
for testers, that listed all of the users we could find.
The reality is that I don't think kernel developers can seriously find
them.
If someone actually wants to kill sys_sysctl more power to them. As
long as we don't add more binary numbers I think it is actually easier
to support it than to find those weird users and remove it.
Eric
^ permalink raw reply
* [review] OMAP2 USB device DMA support
From: 박경민 @ 2006-11-10 5:18 UTC (permalink / raw)
To: linux-omap-open-source
[-- Attachment #1: Type: text/plain, Size: 7595 bytes --]
Hi,
I need help with USB device DMA.
After some work, I can use DMA with a minor problem.
After loading g_file_storage, I got the messages below.
They are displayed only once; after detection it works well.
So I think we have a problem with the protocol exchange with Windows.
In fact, I don't know the file storage protocol used with Windows.
Do you have any ideas?
Any comments are welcome.
Thank you,
Kyungmin Park
P.S., I also attached the usb dma patch.
/ # insmod /usb/g_file_storage.ko-2.6.19 file=/dev/stl6
Using /usb/g_file_storage.ko-2.6.19
g_file_storage gadget: File-backed Storage Gadget, version: 28 November 2005
g_file_storage gadget: Number of LUNs=1
g_file_storage gadget-lun0: ro=0, file: /dev/stl6
/ # udc: USB reset done, gadget g_file_storage
udc: USB reset done, gadget g_file_storage
g_file_storage gadget: full speed config #1
g_file_storage gadget: error in submission: ep2out-bulk --> -90
udc: USB reset done, gadget g_file_storage
udc: USB reset done, gadget g_file_storage
g_file_storage gadget: full speed config #1
g_file_storage gadget: error in submission: ep2out-bulk --> -90
udc: USB reset done, gadget g_file_storage
udc: USB reset done, gadget g_file_storage
g_file_storage gadget: full speed config #1
g_file_storage gadget: error in submission: ep2out-bulk --> -90
udc: USB reset done, gadget g_file_storage
udc: USB reset done, gadget g_file_storage
g_file_storage gadget: full speed config #1
g_file_storage gadget: error in submission: ep2out-bulk --> -90
udc: USB reset done, gadget g_file_storage
udc: USB reset done, gadget g_file_storage
g_file_storage gadget: full speed config #1
g_file_storage gadget: error in submission: ep2out-bulk --> -90
udc: USB reset done, gadget g_file_storage
udc: USB reset done, gadget g_file_storage
g_file_storage gadget: full speed config #1
g_file_storage gadget: error in submission: ep2out-bulk --> -90
udc: USB reset done, gadget g_file_storage
udc: USB reset done, gadget g_file_storage
g_file_storage gadget: full speed config #1
g_file_storage gadget: error in submission: ep2out-bulk --> -90
udc: USB reset done, gadget g_file_storage
udc: USB reset done, gadget g_file_storage
g_file_storage gadget: full speed config #1
g_file_storage gadget: error in submission: ep2out-bulk --> -90
udc: USB reset done, gadget g_file_storage
udc: USB reset done, gadget g_file_storage
g_file_storage gadget: full speed config #1
g_file_storage gadget: error in submission: ep2out-bulk --> -90
udc: USB reset done, gadget g_file_storage
udc: USB reset done, gadget g_file_storage
g_file_storage gadget: full speed config #1
g_file_storage gadget: error in submission: ep2out-bulk --> -90
udc: USB reset done, gadget g_file_storage
udc: USB reset done, gadget g_file_storage
g_file_storage gadget: full speed config #1
--
diff --git a/drivers/usb/gadget/omap_udc.c b/drivers/usb/gadget/omap_udc.c
index 31df02e..5cb3198 100644
--- a/drivers/usb/gadget/omap_udc.c
+++ b/drivers/usb/gadget/omap_udc.c
@@ -63,7 +63,7 @@
/* FIXME: OMAP2 currently has some problem in DMA mode */
#ifdef CONFIG_ARCH_OMAP2
-#undef USE_DMA
+//#undef USE_DMA
#endif
/* ISO too */
@@ -74,6 +74,7 @@
#define DMA_ADDR_INVALID (~(dma_addr_t)0)
+#define OMAP2_DMA_CH(ch) (((ch) - 1) << 1)
/*
* The OMAP UDC needs _very_ early endpoint setup: before enabling the
@@ -620,20 +621,25 @@ static void next_in_dma(struct omap_ep *
const int sync_mode = cpu_is_omap15xx()
? OMAP_DMA_SYNC_FRAME
: OMAP_DMA_SYNC_ELEMENT;
+ int dma_trigger = 0;
+
+ if (cpu_is_omap24xx())
+ dma_trigger = OMAP24XX_DMA_USB_W2FC_TX0 + OMAP2_DMA_CH(ep->dma_channel);
/* measure length in either bytes or packets */
if ((cpu_is_omap16xx() && length <= UDC_TXN_TSC)
+ || (cpu_is_omap24xx() && length <= ep->maxpacket)
|| (cpu_is_omap15xx() && length < ep->maxpacket)) {
txdma_ctrl = UDC_TXN_EOT | length;
omap_set_dma_transfer_params(ep->lch, OMAP_DMA_DATA_TYPE_S8,
- length, 1, sync_mode, 0, 0);
+ length, 1, sync_mode, dma_trigger, 0);
} else {
length = min(length / ep->maxpacket,
(unsigned) UDC_TXN_TSC + 1);
txdma_ctrl = length;
omap_set_dma_transfer_params(ep->lch, OMAP_DMA_DATA_TYPE_S16,
ep->ep.maxpacket >> 1, length, sync_mode,
- 0, 0);
+ dma_trigger, 0);
length *= ep->maxpacket;
}
omap_set_dma_src_params(ep->lch, OMAP_DMA_PORT_EMIFF,
@@ -672,11 +678,15 @@ static void finish_in_dma(struct omap_ep
static void next_out_dma(struct omap_ep *ep, struct omap_req *req)
{
unsigned packets;
+ int dma_trigger = 0;
/* NOTE: we filtered out "short reads" before, so we know
* the buffer has only whole numbers of packets.
*/
+ if (cpu_is_omap24xx())
+ dma_trigger = OMAP24XX_DMA_USB_W2FC_RX0 + OMAP2_DMA_CH(ep->dma_channel);
+
/* set up this DMA transfer, enable the fifo, start */
packets = (req->req.length - req->req.actual) / ep->ep.maxpacket;
packets = min(packets, (unsigned)UDC_RXN_TC + 1);
@@ -684,7 +694,7 @@ static void next_out_dma(struct omap_ep
omap_set_dma_transfer_params(ep->lch, OMAP_DMA_DATA_TYPE_S16,
ep->ep.maxpacket >> 1, packets,
OMAP_DMA_SYNC_ELEMENT,
- 0, 0);
+ dma_trigger, 0);
omap_set_dma_dest_params(ep->lch, OMAP_DMA_PORT_EMIFF,
OMAP_DMA_AMODE_POST_INC, req->req.dma + req->req.actual,
0, 0);
@@ -792,6 +802,7 @@ static void dma_channel_claim(struct oma
{
u16 reg;
int status, restart, is_in;
+ int dma_channel;
is_in = ep->bEndpointAddress & USB_DIR_IN;
if (is_in)
@@ -818,11 +829,15 @@ static void dma_channel_claim(struct oma
ep->dma_channel = channel;
if (is_in) {
- status = omap_request_dma(OMAP_DMA_USB_W2FC_TX0 - 1 + channel,
+ if (cpu_is_omap24xx())
+ dma_channel = OMAP24XX_DMA_USB_W2FC_TX0 + OMAP2_DMA_CH(channel);
+ else
+ dma_channel = OMAP_DMA_USB_W2FC_TX0 - 1 + channel;
+ status = omap_request_dma(dma_channel,
ep->ep.name, dma_error, ep, &ep->lch);
if (status == 0) {
UDC_TXDMA_CFG_REG = reg;
- /* EMIFF */
+ /* EMIFF or SDRC*/
omap_set_dma_src_burst_mode(ep->lch,
OMAP_DMA_DATA_BURST_4);
omap_set_dma_src_data_pack(ep->lch, 1);
@@ -834,7 +849,11 @@ static void dma_channel_claim(struct oma
0, 0);
}
} else {
- status = omap_request_dma(OMAP_DMA_USB_W2FC_RX0 - 1 + channel,
+ if (cpu_is_omap24xx())
+ dma_channel = OMAP24XX_DMA_USB_W2FC_RX0 + OMAP2_DMA_CH(channel);
+ else
+ dma_channel = OMAP_DMA_USB_W2FC_RX0 - 1 + channel;
+ status = omap_request_dma(dma_channel,
ep->ep.name, dma_error, ep, &ep->lch);
if (status == 0) {
UDC_RXDMA_CFG_REG = reg;
@@ -844,7 +863,7 @@ static void dma_channel_claim(struct oma
OMAP_DMA_AMODE_CONSTANT,
(unsigned long) io_v2p((u32)&UDC_DATA_DMA_REG),
0, 0);
- /* EMIFF */
+ /* EMIFF or SDRC */
omap_set_dma_dest_burst_mode(ep->lch,
OMAP_DMA_DATA_BURST_4);
omap_set_dma_dest_data_pack(ep->lch, 1);
@@ -857,7 +876,7 @@ static void dma_channel_claim(struct oma
omap_disable_dma_irq(ep->lch, OMAP_DMA_BLOCK_IRQ);
/* channel type P: hw synch (fifo) */
- if (!cpu_is_omap15xx())
+ if (cpu_class_is_omap1() && !cpu_is_omap15xx())
OMAP1_DMA_LCH_CTRL_REG(ep->lch) = 2;
}
@@ -2579,7 +2598,7 @@ omap_ep_setup(char *name, u8 addr, u8 ty
* (for more reliable behavior)
*/
if ((!use_dma && (addr & USB_DIR_IN))
- || machine_is_omap_apollon()
+ || (!use_dma && machine_is_omap_apollon())
|| cpu_is_omap15xx())
dbuf = 0;
[-- Attachment #2: Type: text/plain, Size: 0 bytes --]
^ permalink raw reply related
* Re: 2.6.19-rc5-mm1: HPC nx6325 breakage, VESA fb problem, md-raid problem
From: Andrew Morton @ 2006-11-10 5:15 UTC (permalink / raw)
To: Andi Kleen; +Cc: Rafael J. Wysocki, linux-kernel, fbuihuu, adaplas, NeilBrown
In-Reply-To: <200611100549.08239.ak@suse.de>
On Fri, 10 Nov 2006 05:49:08 +0100
Andi Kleen <ak@suse.de> wrote:
>
> > > >
> > > > Well, I've got some data from earlyprintk (forgot I needed to boot with
> > > > vga=normal).
> > > >
> > > > Unfortunately, I had to rewrite the trace manually:
> > > >
> > > > clear_IO_APIC_pin+0x15/0x6a
> > > > try_apic_pin+0x7a/0x98
> > > > setup_IO_APIC+0x600/0xb7a
> > > > smp_prepare_cpus+0x33a/0x371
> > > > init+0x60/0x32d
> > > > child_rip+0xa/0x12
> > > >
> > > > [And then the unwinder said it got stuck.]
> > > >
> > > > RIP is reported to be at ioapic_read_entry+0x33/0x61,
> > >
> > > This is 100% reproducible on the nx6325 (but not on the other boxes) and
> > > apparently caused by x86_64-mm-try-multiple-timer-pins.patch (doesn't
> > > happen with this patch reverted).
> >
> > Thanks, dropped.
>
> can I have details please?
I think what's in this thread is all you'll get.
It would be nice to see the access address. I'd be guessing that it's
trying to read the io-apic before we're ready to read it and io_apic_base()
is returning gunk and boom.
> On what system (CPU, motherboard, BIOS version) does the noidlehz stuff break?
nx6325
It's x86_64: no noidlehz.
> And what did you drop exactly?
x86_64-mm-try-multiple-timer-pins.patch
^ permalink raw reply
* Re: [PATCH 02/02] Elf: Align elf notes properly
From: Eric W. Biederman @ 2006-11-10 5:09 UTC (permalink / raw)
To: Magnus Damm
Cc: Magnus Damm, linux-kernel, Vivek Goyal, Andi Kleen, fastboot,
Horms, Dave Anderson
In-Reply-To: <aec7e5c30611091952j6cd7988akc1671d269925bba9@mail.gmail.com>
"Magnus Damm" <magnus.damm@gmail.com> writes:
> I'm not sure you see all my points. The important parts are the
> offsets - offset 0 and offset N2 in the description above. The should
> be aligned somehow. Exactly how to align them depends on if the 64-bit
> spec is valid or not.
>
> My points are:
>
> - Some kdump code rounds up the size of "elf note header" today. This
> is unneccessary for 32 bit alignment and plain wrong for 64 bit
> alignment. So I think that the code is strange and should be changed
> regardless if the 64-bit spec is valid or not.
Sure that is reasonable, if correct.
> - Many implementations incorrectly calculate N2 as: roundup(sizeof(elf
> note header)) + roundup(n_namesz).
I am not certain that is incorrect. roundup(sizeof(elf note header), 4) +
roundup(n_namesz, 4) will yield something that is properly 4 byte aligned.
I do agree that implementation is not correct for 8 byte alignment. 8 byte
alignment does not appear to be in widespread use in the wild.
> - You say that the size of the notes do not vary and therefore this is
> a non-issue. I agree that the size does not vary, but I believe that
> the aligment _is_ an issue. One example is the N2 calculation above,
> but more importantly the vmcore code that merges the elf note sections
> into one. You know, if you have more than one cpu you will end up with
> more than one crash note. And if you run Xen you will have even more
> crash notes.
Sure that is clearly an issue.
> - On top of this I think it would be nice if all this code could be
> unified to avoid code duplication. But we need to straighten out this
> and agree on how the aligment should work before the code can be
> merged into one implementation.
Sure.
To verify your claim that 8 byte alignment is correct I checked the
core dump code in fs/binfmt_elf.c in the linux kernel. That always
uses 4 byte alignment. Therefore it appears clear that only doing
4 byte alignment is not a local misreading of the spec, and is used in
other implementations. If you can find an implementation that uses
8 byte alignment I am willing to consider it.
The current situation is that the linux kernel generated application
core dumps use 4 byte alignment so I expect that is what existing
applications such as gdb expect.
Therefore we use 4 byte alignment unless it can be shown that the
linux core dumps are a fluke and should be fixed.
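For reference, the offset arithmetic under discussion can be written out as a small stand-alone sketch. The names `demo_nhdr`, `demo_roundup`, `demo_note_desc_offset` and `demo_note_size` are hypothetical; the header layout mirrors the three 32-bit words of Elf32_Nhdr/Elf64_Nhdr, and the alignment is a parameter so the 4 byte and 8 byte cases can be compared.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same layout as an ELF note header: three 32-bit words (12 bytes). */
struct demo_nhdr {
	uint32_t n_namesz;	/* length of the name, including the NUL */
	uint32_t n_descsz;	/* length of the descriptor */
	uint32_t n_type;
};

/* Round x up to a multiple of align (align must be a power of two). */
static size_t demo_roundup(size_t x, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (x + align - 1) & ~(align - 1);
}

/* Offset of the descriptor from the start of the note ("N2" above). */
static size_t demo_note_desc_offset(const struct demo_nhdr *n, size_t align)
{
	return demo_roundup(sizeof(*n), align) + demo_roundup(n->n_namesz, align);
}

/* Total size of the note, i.e. the offset of the next note. */
static size_t demo_note_size(const struct demo_nhdr *n, size_t align)
{
	return demo_note_desc_offset(n, align) + demo_roundup(n->n_descsz, align);
}
```

With n_namesz = 5 and n_descsz = 10, 4 byte alignment gives a descriptor offset of 20 and a note size of 32. Since the header is already 12 bytes, rounding it up separately changes nothing for 4 byte alignment; it only matters if 8 is passed.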
Eric
^ permalink raw reply
* Re: [patch 13/19] GTOD: Mark TSC unusable for highres timers
From: Andi Kleen @ 2006-11-10 5:10 UTC (permalink / raw)
To: john stultz
Cc: Thomas Gleixner, Andrew Morton, LKML, Ingo Molnar, Len Brown,
Arjan van de Ven, Roman Zippel
In-Reply-To: <1163121045.836.69.camel@localhost>
current_tsc_khz = tsc_khz;
> > clocksource_tsc.mult = clocksource_khz2mult(current_tsc_khz,
> > clocksource_tsc.shift);
> > +#ifndef CONFIG_HIGH_RES_TIMERS
> > /* lower the rating if we already know its unstable: */
> > if (check_tsc_unstable())
> > clocksource_tsc.rating = 0;
> > -
> > +#else
> > + /*
> > + * Mark TSC unsuitable for high resolution timers. TSC has so
> > + * many pitfalls: frequency changes, stop in idle ... When we
> > + * switch to high resolution mode we can not longer detect a
> > + * firmware caused frequency change, as the emulated tick uses
> > + * TSC as reference. This results in a circular dependency.
> > + * Switch only to high resolution mode, if pm_timer or such
> > + * is available.
> > + */
> > + clocksource_tsc.rating = 50;
> > + clocksource_tsc.is_continuous = 0;
> > +#endif
> > init_timer(&verify_tsc_freq_timer);
> > verify_tsc_freq_timer.function = verify_tsc_freq;
> > verify_tsc_freq_timer.expires =
>
>
> Hmmm. I wish this patch was unnecessary, but I don't see an easy
> solution.
Very sad. This will make a lot of people unhappy, even to the point
where they might prefer disabling noidlehz over super slow gettimeofday.
I assume you at least have a suitable command line option for that, right?
Can we get a summary on which systems the TSC is considered unstable?
Normally we assume if it's stable enough for gettimeofday it should
be stable enough for longer delays too.
-Andi
^ permalink raw reply
* Re: [PATCH] x86_64: fix perms/range of vsyscall vma in /proc/*/maps
From: Andi Kleen @ 2006-11-10 5:07 UTC (permalink / raw)
To: Ernie Petrides; +Cc: linux-kernel
In-Reply-To: <200611100121.kAA1L0UN031589@pasta.boston.redhat.com>
On Friday 10 November 2006 02:20, Ernie Petrides wrote:
> Hi, Andy. The final line of /proc/<pid>/maps on x86_64 for native 64-bit
> tasks shows an incorrect ending address and incorrect permissions. There
> is only a single page mapped in this vsyscall region, and it is accessible
> for both read and execute.
The range reported is how much address space is reserved, but you're
right it is less.
But I don't like hardcoding a page here -- this will likely be extended
soon. Can you please create a new define VSYSCALL_REAL_LENGTH or similar
in vsyscall.h and use that?
Thanks,
-Andi
^ permalink raw reply
* bcm43xx-d80211 broadcast reception with WPA
From: Paul Hampson @ 2006-11-09 22:23 UTC (permalink / raw)
To: netdev
Hi,
Long time lurker, first time poster. ^_^
I've been backporting the bcm43xx-d80211 driver to whatever the released
2.6 kernel was using the rt2x00 project's d80211 stack (equivalent to
current wireless-dev but with a workaround for not having a ieee80211_dev
pointer and still using the _tfm interface instead of the _cypher interface.)
As of last night's wireless-dev tree bcm43xx, everything seems to be
operating fine except incoming broadcast traffic is coming in 14 bytes too
long and scrambled. I presume this means it's not decrypting properly...
Anyway, I just thought I'd mention it. It might have gone unnoticed by the
bcm43xx-d80211 developers, since it doesn't interfere with normal operation
(A DHCP client's only broadcasts are outgoing) and only showed up for me
because radvd's RAs were not arriving and my IPv6 address was not being set.
I couldn't find any mention of such a thing on the list, and I'm happy to
provide whatever debugging output is useful, but the laptop with the device
isn't with me at the moment.
Relevant facts:
Platform: Debian/unstable (PPC) w/linux-image-2.6.18-1-powerpc (2.6.18-3)
Drivers: bcm43xx-d80211 from wireless-dev 774f233b7915a2c36480eb4d98e6f57938f04b7b
Firmware: 4.80.46.0 (BE, from AppleAirPortBrcm4311)
Stack: ieee80211 from http://rt2x00.serialmonkey.com/rt2x00-cvs-daily.tar.gz
2006110303 is the date on the output, I believe. Hasn't been updated since 20061028
Plus a backport of the following commits:
[PATCH] d80211: extend extra_hdr_room to be a bytecount 522e078b9f1f8309770dd161d90ddac1573a7877
[PATCH] d80211: remove unused variable in ieee80211_rx_irqsafe 10bfc9cdf9621385a3b69aa35f9fa86cc6a46bc6
[PATCH] d80211: Add wireless statistics 448bf25bc9e3d70a211fdf235426472089371c43
(as well as anything else that showed up in a diff of the d80211 dir against the rt2x00
ieee80211 dir and wasn't a 2.6.19ism or wireless-devism)
I'm basically using the instructions I posted at [1] except also patching rt2x00's
ieee80211 stack.
I acknowledge that any of the firmware version, the backporting, the forward porting
or the current lunar cycle may be causing this problem. If no one pipes up with an
insight, I'll try tonight with a v3 firmware, although the reason I moved to a v4
firmware was that my previous build of bcm43xx-d80211 also wasn't getting an IPv6 address,
although I don't believe the RAs were scrambled in that case.
[1] http://openfacts.berlios.de/index-en.phtml?title=Broadcom_43xx_Linux_Driver/Debian_Unstable_with_Devicescape_802.11_stack
--
Paul "TBBle" Hampson
Opinions expressed here do not reflect the views of my employer
Hell, we don't even agree on my pay cheque
^ permalink raw reply
* Re: 2.6.19-rc5-mm1: HPC nx6325 breakage, VESA fb problem, md-raid problem
From: Andi Kleen @ 2006-11-10 4:49 UTC (permalink / raw)
To: Andrew Morton
Cc: Rafael J. Wysocki, linux-kernel, fbuihuu, adaplas, NeilBrown
In-Reply-To: <20061109095811.ac654e13.akpm@osdl.org>
> > >
> > > Well, I've got some data from earlyprintk (forgot I needed to boot with
> > > vga=normal).
> > >
> > > Unfortunately, I had to rewrite the trace manually:
> > >
> > > clear_IO_APIC_pin+0x15/0x6a
> > > try_apic_pin+0x7a/0x98
> > > setup_IO_APIC+0x600/0xb7a
> > > smp_prepare_cpus+0x33a/0x371
> > > init+0x60/0x32d
> > > child_rip+0xa/0x12
> > >
> > > [And then the unwinder said it got stuck.]
> > >
> > > RIP is reported to be at ioapic_read_entry+0x33/0x61,
> >
> > This is 100% reproducible on the nx6325 (but not on the other boxes) and
> > apparently caused by x86_64-mm-try-multiple-timer-pins.patch (doesn't
> > happen with this patch reverted).
>
> Thanks, dropped.
can I have details please?
On what system (CPU, motherboard, BIOS version) does the noidlehz stuff break?
And what did you drop exactly?
Thanks,
-Andi
^ permalink raw reply