All of lore.kernel.org
 help / color / mirror / Atom feed
* Oops on 2.4.x invalid procfs i_ino value
@ 2004-12-17 22:49 Brent Casavant
  2004-12-18  0:38 ` William Lee Irwin III
  0 siblings, 1 reply; 7+ messages in thread
From: Brent Casavant @ 2004-12-17 22:49 UTC (permalink / raw)
  To: linux-kernel; +Cc: wli, mingo

I've run into a number of crashes while closing procfs stat files
when a system is under load.  I think I've found the problem, but
am a little unsure how to proceed.  This all happens to be on a
2.4.21 based kernel, but by my brief code inspection I think the
problem still exists on more recent 2.4.x kernels.

In procfs the fake_ino() macro is used to construct the inode number
for each entry.

	#define fake_ino(pid,ino) (((pid)<<16)|(ino))

In particular this is used in proc_pid_make_inode:

	inode->i_ino = fake_ino(task->pid, ino);

Note that a pid may be more than 16 bits in width (e.g. in IA64), and
we're trying to stuff it into the upper 16 bits of the inode number.
This isn't usually a problem, except when the lower 16 bits of the
inode happen to be 0 (i.e. pids that are a multiple of 65536).

Why does zero matter?  Glad you asked.

In proc_delete_inode there is a check to see if the inode is is
a "proper" (whatever that means) procfs inode.  The whole function
is:

static void proc_delete_inode(struct inode *inode)
{
	struct proc_dir_entry *de = inode->u.generic_ip;

	inode->i_state = I_CLEAR;

	if (PROC_INODE_PROPER(inode)) {
		proc_pid_delete_inode(inode);
		return;
	}
	if (de) {
		if (de->owner)
			__MOD_DEC_USE_COUNT(de->owner);
		de_put(de);
	}
}

PROC_INODE_PROPER() is:

	#define PROC_INODE_PROPER(inode) ((inode)->i_ino & ~0xffff)

In other words, it checks whether the top 16 bits of the inode number
(equivalent to the bottom 16 bits of the pid) are non-zero.

Thus closing a proc entry for any task with a pid that is a multiple of
65536 will fail this check, skip proc_pid_delete_inode, and call
__MOD_DEC_USE_COUNT, more than likely causing a panic on an invalid
memory access, and minimally corrupting something in memory otherwise.

I don't have a solution coded up (mostly because I'm a bit bleary
eyed after looking at crash dumps all day) -- but are there any
thoughts on how to go about addressing this one?  An obvious workaround
is setting kernel.pid_max to 65535, but that's only a workaround, not
a solution.

On a related note, if it matters, on about half the crash dumps I've
looked at, I see a pid of 0 has been assigned to a user process,
tripping this same problem.  I suspect there's another bug somewhere
that's allowing a pid of 0 to be chosen in the first place -- but I
don't totally discount that this problem may lay in SGI's patches to
this particular kernel -- I'll need to take a more thorough look.

Thanks,
Brent

-- 
Brent Casavant                          If you had nothing to fear,
bcasavan@sgi.com                        how then could you be brave?
Silicon Graphics, Inc.                    -- Queen Dama, Source Wars

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2004-12-27 21:40 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2004-12-17 22:49 Oops on 2.4.x invalid procfs i_ino value Brent Casavant
2004-12-18  0:38 ` William Lee Irwin III
2004-12-18  0:47   ` William Lee Irwin III
2004-12-20 22:35     ` Brent Casavant
2004-12-22 15:46       ` Marcelo Tosatti
2004-12-27 19:04         ` Marcelo Tosatti
2004-12-27 21:40       ` William Lee Irwin III

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.