Tuesday, October 6, 2009

linux.kernel - 26 new messages in 17 topics - digest

linux.kernel
http://groups.google.com/group/linux.kernel?hl=en

linux.kernel@googlegroups.com

Today's topics:

* Populate subsystem vendor and device IDs for PCI-Bridges - 1 messages, 1
author
http://groups.google.com/group/linux.kernel/t/3e46e25b1d0188aa?hl=en
* Linux 2.6.32-rc3 - 5 messages, 2 authors
http://groups.google.com/group/linux.kernel/t/5429515fa7f071a2?hl=en
* trivial: fix typos "selct" + "slect" -> "select" - 1 messages, 1 author
http://groups.google.com/group/linux.kernel/t/1d60baffcb344471?hl=en
* it87 sensors need an ACPI driver (2.6.31) - 3 messages, 3 authors
http://groups.google.com/group/linux.kernel/t/ab6293b7cf9f56ab?hl=en
* SCSI fixes for 2.6.32-rc3 - 1 messages, 1 author
http://groups.google.com/group/linux.kernel/t/7de73324941ebcbb?hl=en
* futex: Nullify robust lists after cleanup - 1 messages, 1 author
http://groups.google.com/group/linux.kernel/t/ccab234928070dfe?hl=en
* tree rcu: Add debug RCU head option - 3 messages, 2 authors
http://groups.google.com/group/linux.kernel/t/f40d55eaaaeed891?hl=en
* mfd: ADP5520 Multifunction LCD Backlight and KeypadInput Device Driver - 1
messages, 1 author
http://groups.google.com/group/linux.kernel/t/507004e57561c7a0?hl=en
* drm/radeon/kms: Fallback to non AGP when acceleration fails to initialize -
1 messages, 1 author
http://groups.google.com/group/linux.kernel/t/b9540c97ab41b9fd?hl=en
* [PATCH] PCI PM: Read device power state from register after updating it - 1
messages, 1 author
http://groups.google.com/group/linux.kernel/t/3cc3d988c89f36cd?hl=en
* TI XIO200a bridge quirk: erroneously reports support for fast b2b transfers -
1 messages, 1 author
http://groups.google.com/group/linux.kernel/t/efc4ae96a01bd432?hl=en
* atmel_serial: Fix bad BUILD_BUG_ON() usage - 1 messages, 1 author
http://groups.google.com/group/linux.kernel/t/c1f165fcfefd6b1f?hl=en
* fix spidev compilation when VERBOSE is defined - 1 messages, 1 author
http://groups.google.com/group/linux.kernel/t/c7ac008c458af252?hl=en
* KVM: introduce "xinterface" API for external interaction with guests - 2
messages, 1 author
http://groups.google.com/group/linux.kernel/t/2f03996dc667e0c1?hl=en
* mlock use lru_add_drain_all_async() - 1 messages, 1 author
http://groups.google.com/group/linux.kernel/t/59266ef3fd4c529b?hl=en
* perf_events: add event constraints support for Intel processors - 1 messages,
1 author
http://groups.google.com/group/linux.kernel/t/3efefc3db8320eb3?hl=en
* sata_via: SError: { UnrecovData Proto TrStaTrns } - 1 messages, 1 author
http://groups.google.com/group/linux.kernel/t/b6a7be5147bec9ef?hl=en

==============================================================================
TOPIC: Populate subsystem vendor and device IDs for PCI-Bridges
http://groups.google.com/group/linux.kernel/t/3e46e25b1d0188aa?hl=en
==============================================================================

== 1 of 1 ==
Date: Tues, Oct 6 2009 9:00 am
From: Gabe Black


Change to populate the subsystem vendor and subsytem device IDs for
PCI-PCI bridges that implement the PCI Subsystem Vendor ID capability.
Previously bridges left subsystem vendor IDs unpopulated.

Signed-off-by: Gabe Black <gabe.black@ni.com>
---
drivers/pci/probe.c | 6 ++++++
include/linux/pci_regs.h | 5 +++++
2 files changed, 11 insertions(+), 0 deletions(-)

diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 8105e32..6d90246 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -731,6 +731,7 @@ int pci_setup_device(struct pci_dev *dev)
u32 class;
u8 hdr_type;
struct pci_slot *slot;
+ int pos = 0;

if (pci_read_config_byte(dev, PCI_HEADER_TYPE, &hdr_type))
return -EIO;
@@ -822,6 +823,11 @@ int pci_setup_device(struct pci_dev *dev)
dev->transparent = ((dev->class & 0xff) == 1);
pci_read_bases(dev, 2, PCI_ROM_ADDRESS1);
set_pcie_hotplug_bridge(dev);
+ pos = pci_find_capability(dev, PCI_CAP_ID_SSVID);
+ if (pos) {
+ pci_read_config_word(dev, pos + PCI_SSVID_VENDOR_ID, &dev->subsystem_vendor);
+ pci_read_config_word(dev, pos + PCI_SSVID_DEVICE_ID, &dev->subsystem_device);
+ }
break;

case PCI_HEADER_TYPE_CARDBUS: /* CardBus bridge header */
diff --git a/include/linux/pci_regs.h b/include/linux/pci_regs.h
index dd0bed4..d3b0cfb 100644
--- a/include/linux/pci_regs.h
+++ b/include/linux/pci_regs.h
@@ -365,6 +365,11 @@
#define PCI_X_STATUS_266MHZ 0x40000000 /* 266 MHz capable */
#define PCI_X_STATUS_533MHZ 0x80000000 /* 533 MHz capable */

+/* PCI Bridge Subsystem ID registers */
+
+#define PCI_SSVID_VENDOR_ID 4 /* PCI-Bridge subsystem vendor id register */
+#define PCI_SSVID_DEVICE_ID 6 /* PCI-Bridge subsystem device id register */
+
/* PCI Express capability registers */

#define PCI_EXP_FLAGS 2 /* Capabilities register */
--
1.6.0.4

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

==============================================================================
TOPIC: Linux 2.6.32-rc3
http://groups.google.com/group/linux.kernel/t/5429515fa7f071a2?hl=en
==============================================================================

== 1 of 5 ==
Date: Tues, Oct 6 2009 9:00 am
From: Linus Torvalds


On Tue, 6 Oct 2009, Dirk Hohndel wrote:
>
> For people simply pulling from your git tree, I think the answer is YES.

If they only pull from my git tree, why do they even care?

And why don't you use LOCALVERSION_AUTO? Did you just not know about it?

Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


== 2 of 5 ==
Date: Tues, Oct 6 2009 9:00 am
From: Linus Torvalds


On Tue, 6 Oct 2009, Linus Torvalds wrote:
>
> It's the "let's make that meaningful and misleading number be even _more_
^^^^^^^^^^
> misleading by making people think it has meaning" magical thinking that I
> hate.

That 'meaningful' was supposed to be a 'meaningless' of course.

The point really is that we have about 1 new version number per week, but
at the same time we have an average of one _thousand_ commits merged per
week. So at a glance, the Makefile version number doesn't mean very much.

But if the thousand commits we merged actually were nicely spread out in
between those version numbers, it would all work really beautifully. Sure,
the top-level version number would only give you a 1/1000 of the real
commit information, but hey, that's kind of what you'd want, isn't it? So
then the top-level Makefile version number would be meaningful and useful!

But that's not how it works. In fact, if we actually followed our own
release rules ("merge window is for merging code that was written before
the merge window even started"), no commits except for my merge commits
should actually have that last release in their Makefile at all!

Now, that's not actually true, because (a) people rebase and (b) even in
the absense of rebases I do merge with people like Andrew by email, so we
actually end up having statistics like these:

git rev-list v2.6.31..v2.6.32-rc1 |
while read a
do
git show $a:Makefile | grep SUBLEVEL.=
done | sort | uniq -c

resulting in

32 SUBLEVEL = 29
383 SUBLEVEL = 30
8795 SUBLEVEL = 31
1 SUBLEVEL = 32

which is actually a bit sad in itself (showing just _how_ many people
rebased their work on top of a release), but is still showing that we
actually had 32 new commits in there that were based on a 2.6.29 kernel

And what people are suggesting with a 2.6.32-rc0 would just lead to people
now rebasing their work NOT EVEN ON A RELEASE. They'd want to rebase it on
top of that made-up commit (2.6.32-rc0), so now from a development
standpoint that commit suddenly becomes more important than the release
itself.

Do you not see the insanity? We should have _less_ of this kind of
thinking, not more. At least rebasing on top of a release sounds sane. Not
this kind of "rebase on top of a magic Linus-commit" crap.

Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


== 3 of 5 ==
Date: Tues, Oct 6 2009 9:10 am
From: Linus Torvalds


On Tue, 6 Oct 2009, Ingo Molnar wrote:
>
> hm, i think you ignored (or missed, or found irrelevant) my first
> suggested variant:

No, I didn't ignore it, it just showed that you didn't know what you were
thinking about.

> v2.6.31
> v2.6.31+
>
> The '+' sign says that it's more than .31.

No. It shows no such thing. Because it DOES NOT EXIST in other peoples
trees until they rebase on top of mine.

Unless:

> _That_ i think is a lot harder to confuse with the real .31 than a
> v2.6.31-1234-g16123c4 version string.

.. are you saying that it would be just some automatically generated
thing, just a crippled form of CONFIG_LOCALVERSION_AUTO? Kind of a
CONFIG_LOCALVERSION_AUTO_SHORTFORM?

If so, then I don't hate "v2.6.31+", but at the same time, that single
plus-sign tells _so_much_ less than v2.6.31-1234-g16123c4 that I think
it's really sad and crippled.

Is it so hard to teach people what "v2.6.31-1234-g16123c4" means? It's not
like it's very complicated, and it's not like it's not visually very
distinct indeed from the tagged release case (which is just "v2.6.31").

I'd love to use "+" instead of "-", but I was thinking that there are
various version things that get unhappy about special characters like
that, so we've always used '-' as the separator (since we've always used
it).

So I'm _entirely_ open to changing how 'CONFIG_LOCALVERSION_AUTO' works,
or extending on that notion.

What I'm _not_ open to doing is to add made-up commits that change the
top-level Makefile at non-release points. Because that way really does lie
insanity.

Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


== 4 of 5 ==
Date: Tues, Oct 6 2009 9:40 am
From: Linus Torvalds


On Tue, 6 Oct 2009, Linus Torvalds wrote:
>
> Unless:
>
> > _That_ i think is a lot harder to confuse with the real .31 than a
> > v2.6.31-1234-g16123c4 version string.
>
> .. are you saying that it would be just some automatically generated
> thing, just a crippled form of CONFIG_LOCALVERSION_AUTO? Kind of a
> CONFIG_LOCALVERSION_AUTO_SHORTFORM?

So how about this?

It changes how CONFIG_LOCALVERSION_AUTO works, in the following trivial
way:

- if it is set, things work the way they always have, and you get a
extended kernel release like

2.6.32-rc3-00052-g0eca52a-dirty

- but if it is _not_ set, we'll still try to get a version from the
underlying SCM (we actually support git, hg and SVN right now, even if
some comments may say "git only"), and if the underlying SCM says it
has a local version, we append just "+", so you get a version number
like

2.6.32-rc3+

IOW, you'd never get 2.6.32-rc0, but you'd get either the complex git
version number (or SVN/hg/whatever), or at least "2.6.31+" with the "+"
showing that it is more than plain 2.6.31.

The "+" could be anything else, of course. The diff is pretty obvious, you
can argue about exactly _what_ you'd like to see as a suffix for "and then
some".

Linus

---
Makefile | 9 +++++++--
1 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/Makefile b/Makefile
index e50569a..c62b7cc 100644
--- a/Makefile
+++ b/Makefile
@@ -963,8 +963,6 @@ localver = $(subst $(space),, $(string) \
# .scmversion is used when generating rpm packages so we do not loose
# the version information from the SCM when we do the build of the kernel
# from the copied source
-ifdef CONFIG_LOCALVERSION_AUTO
-
ifeq ($(wildcard .scmversion),)
_localver-auto = $(shell $(CONFIG_SHELL) \
$(srctree)/scripts/setlocalversion $(srctree))
@@ -972,7 +970,14 @@ else
_localver-auto = $(shell cat .scmversion 2> /dev/null)
endif

+ifdef CONFIG_LOCALVERSION_AUTO
localver-auto = $(LOCALVERSION)$(_localver-auto)
+else
+ ifeq ($_localver-auto,)
+ localver-auto = $(LOCALVERSION)
+ else
+ localver-auto = $(LOCALVERSION)+
+ endif
endif

localver-full = $(localver)$(localver-auto)
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


== 5 of 5 ==
Date: Tues, Oct 6 2009 9:40 am
From: Ingo Molnar

* Linus Torvalds <torvalds@linux-foundation.org> wrote:

> Unless:
>
> > _That_ i think is a lot harder to confuse with the real .31 than a
> > v2.6.31-1234-g16123c4 version string.
>
> .. are you saying that it would be just some automatically generated
> thing, just a crippled form of CONFIG_LOCALVERSION_AUTO? Kind of a
> CONFIG_LOCALVERSION_AUTO_SHORTFORM?

Yes, exactly. Not a Makefile or tag property, obviously.

> If so, then I don't hate "v2.6.31+", but at the same time, that single
> plus-sign tells _so_much_ less than v2.6.31-1234-g16123c4 that I think
> it's really sad and crippled.

Agreed, and in my internal testing i've made LOCALVERSION_AUTO
compulsory long time ago.

But at least to me there's a simple benchmark: i have been confused by
this. In one of the merge windows i thought i booted .31 vanilla while
in fact it was a few days into the merge window already. Took me a few
minutes to figure out why it was crashing ;-)

YMMV.

Maybe a variant of the full string:

v2.6.31+1234-g16123c4

would be even less confusing. I tend to ignore dashes sub-consciously
(as a 'minor versions' kind of thing) - while in this ~1 week out of the
9 weeks of your tree's cycle it means something much more than that, and
we dont emphasise the difference strongly enough.

But ... that's just my own experience. I also see -tip users become a
lot more cautious for a few rc's once we go -rc1 - while they happily
test things in the merge window (which is _way_ more dangerous than
pretty much any post-rc1 tree you push out).

Basically IMHO the inflection point between v2.6.31 and the merge window
is way too 'smooth', and not obviously so, and it lures people who are
not careful enough into testing something they probably wouldnt test
otherwise. It's a purely human thing - to a machine it's already very
obvious what g16123c4 means - it doesnt need any of the fancy v2.6.31
stuff either.

Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

==============================================================================
TOPIC: trivial: fix typos "selct" + "slect" -> "select"
http://groups.google.com/group/linux.kernel/t/1d60baffcb344471?hl=en
==============================================================================

== 1 of 1 ==
Date: Tues, Oct 6 2009 9:00 am
From: Jiri Kosina


On Tue, 6 Oct 2009, Uwe Kleine-König wrote:

> This patch was generated by
>
> git grep -E -i -l 's(le|el)ct' | xargs -r perl -p -i -e 's/([Ss])(le|el)ct/$1elect/
>
> with only skipping net/netfilter/xt_SECMARK.c and
> include/linux/netfilter/xt_SECMARK.h which have a struct member called
> selctx.

Queued, thanks.

--
Jiri Kosina
SUSE Labs, Novell Inc.


==============================================================================
TOPIC: it87 sensors need an ACPI driver (2.6.31)
http://groups.google.com/group/linux.kernel/t/ab6293b7cf9f56ab?hl=en
==============================================================================

== 1 of 3 ==
Date: Tues, Oct 6 2009 9:10 am
From: Jean Delvare


On Tue, 06 Oct 2009 19:41:47 +0400, Michael Tokarev wrote:
> Luca Tettamanti wrote:
> > In term of Q-FAN profiles: yes, but I still haven't managed to make it
> > work (the interface is there though).
>
> Oh, so it's basically boils down to Silent/Optimal/Normal choices
> as in the BIOS settings, right? I mean, still no pwm_* controls
> as it87 offers?..

Probably not. But feel free to bother Asus about it. If they don't
provide the level of control you want, they should know about it, and
make it better in future products. Or even in current products through
BIOS upgrades.

--
Jean Delvare
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


== 2 of 3 ==
Date: Tues, Oct 6 2009 9:20 am
From: Alan Jenkins


On 10/6/09, Michael Tokarev <mjt@tls.msk.ru> wrote:
> Luca Tettamanti wrote:
>> On Mon, Oct 5, 2009 at 6:42 PM, Michael Tokarev <mjt@tls.msk.ru> wrote:
> []
>>> Well, I just tried it here and it works here too, on 3 different
>>> asus motherboards. But asus_atk0110 is far less useful than the
>>> it87 variant. Yes atk0110 shows correct labels for various sensors,
>>> but for one there's no way to control fan speeds using it, at least
>>> not currently, -- something which is done by it87 easily.
> []
>> The main reason for using atk0110 is correctness: the resources are
>> claimed by ACPI, it might not be safe to touch them (for the same
>> reason two drivers are not allowed to map e.g. the same PCI BAR).
>> On newer boards the risk of collision is pretty high, since the hwmon
>> chip is used by an EC that works in background... on other boards the
>> risk is much lower since the hwmon chip doesn't seem to be probed
>> actively.
>> Anyway, as user you can override this decision with
>> "acpi_enforce_resources=lax", but _I_ wouldn't recommend it.
>
> If there's a choice between "does not work but correct" and
> "incorrect but works", i'd prefer the latter, and I'd say any
> sane person agrees.

It works fine on your system but it _doesn't_ work in the general case.

If you build a kernel with it87 and use acpi_enforce_resources=lax, it
will cause a 15 second boot delay on certain models of EeePC. There
may be worse consequences on other machines, but that's bad enough.

So this doesn't depend on sanity, it's simply a matter of whether
you're interested enough to test "acpi_enforce_resources=lax" on your
system and identify any failures. Most people aren't that interested.
For the rest, there's always the option. (Which should probably set
a TAINT flag if it doesn't already).

Thanks
Alan
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


== 3 of 3 ==
Date: Tues, Oct 6 2009 9:20 am
From: Luca Tettamanti


On Tue, Oct 6, 2009 at 5:41 PM, Michael Tokarev <mjt@tls.msk.ru> wrote:
> Luca Tettamanti wrote:
>>
>> On Tue, Oct 6, 2009 at 5:22 PM, Michael Tokarev <mjt@tls.msk.ru> wrote:
>>> Ok.  Now a pure technical question, finally.  Is there a
>>> way to made asus_atk0110 to be able to *set* fan speeds
>>> too, in a way as it's done by it87?
>>
>> In term of Q-FAN profiles: yes, but I still haven't managed to make it
>> work (the interface is there though).
>
> Oh, so it's basically boils down to Silent/Optimal/Normal choices
> as in the BIOS settings, right?  I mean, still no pwm_* controls
> as it87 offers?..

It's motherboard specific. On some model with w83627ehf it's possible
to activate thermal cruise control (with a given target temperature).
I've never seen direct pwm or dc control; it's theoretically doable
with a custom DSDT ;-)

Luca
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

==============================================================================
TOPIC: SCSI fixes for 2.6.32-rc3
http://groups.google.com/group/linux.kernel/t/7de73324941ebcbb?hl=en
==============================================================================

== 1 of 1 ==
Date: Tues, Oct 6 2009 9:10 am
From: Linus Torvalds


On Tue, 6 Oct 2009, James Bottomley wrote:
>
> This is mostly fixes. However, it contains two new drivers: Brocade SAS
> (bfa), the Bladengine2 iSCSI (be2iscsi) under the merge window exemption

Btw, I'm getting less excited about the merge window exemption.

It makes sense for consumer devices that people actually hit and that are
needed for bringup (ie they make a difference between a system that can be
installed and used, and one that cannot), but I'm not at all sure it makes
sense for things like this.

The _reason_ for the driver exemption was the fact that even a broken
driver is better than no driver at all for somebody who just can't get a
working system without it, but that argument really goes away when the
driver is so specialized that it's not about regular hardware any more.

And the whole "driver exemption" seems to have become a by-word for "I can
ignore the merge window for 50% of my code". Which makes me very tired of
it if there aren't real advantages to real users.

So I'm seriously considering a "the driver has to be mass market and also
actually matter to an install" rule for the exemption to be valid.

Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

==============================================================================
TOPIC: futex: Nullify robust lists after cleanup
http://groups.google.com/group/linux.kernel/t/ccab234928070dfe?hl=en
==============================================================================

== 1 of 1 ==
Date: Tues, Oct 6 2009 9:10 am
From: Anirban Sinha


Once upon a time, like on 09-10-06 8:03 AM, tip-bot for Peter Zijlstra wrote:

> futex: Nullify robust lists after cleanup
>
> The robust list pointers of user space held futexes are kept intact
> over an exec() call. When the exec'ed task exits exit_robust_list() is
> called with the stale pointer. The risk of corruption is minimal, but
> still it is incorrect to keep the pointers valid. Actually glibc
> should uninstall the robust list before calling exec() but we have to
> deal with it anyway.
>
> Nullify the pointers after [compat_]exit_robust_list() has been
> called.
>
> Reported-by: Anirban Sinha <ani@anirban.org>
> Signed-off-by: Peter Zijlstra <peterz@infradead.org>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> LKML-Reference: <new-submission>
> Cc: stable@kernel.org

Excellent! I very much like the solution because:

(a) it reiterates the fact that an execve() call is indeed logically the
'death' of a process.
'(b) cleans up state that should not logically propagate across an execve() call.
(c) Does not introduce new EWOULDBLOCK semantics that can potentially break user
land.
(d) Restores the 'robust' behavior of futexes by waking up other processes
waiting on it on an exec() call.
(e) handles potentially broken/buggy applications gracefully.
(f) as far as I can see, the world would be better and no worse with this patch :)

I'll let you guys know if something else strikes my eyes as being odd in the
course of my digging through the code.

Ani

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

==============================================================================
TOPIC: tree rcu: Add debug RCU head option
http://groups.google.com/group/linux.kernel/t/f40d55eaaaeed891?hl=en
==============================================================================

== 1 of 3 ==
Date: Tues, Oct 6 2009 9:10 am
From: Eric Dumazet


Mathieu Desnoyers a écrit :
> Poisoning the rcu_head callback list. Only for rcu tree for now.
>
> Helps finding racy users of call_rcu(), which results in hangs because list
> entries are overwritten and/or skipped. Using the lower bit to poison because
> include/net/dst.h __pad_to_align_refcnt complains when struct rcu_head grows.
>

I see :)

> --- linux-2.6-lttng.orig/include/linux/rcupdate.h 2009-10-06 10:35:15.000000000 -0400
> +++ linux-2.6-lttng/include/linux/rcupdate.h 2009-10-06 10:35:18.000000000 -0400
> @@ -45,6 +45,8 @@
> * struct rcu_head - callback structure for use with RCU
> * @next: next update requests in a list
> * @func: actual update function to call after the grace period.
> + *
> + * Debug mode assumes func pointer value is word-aligned.

Ouch, I guess you never tried CONFIG_CC_OPTIMIZE_FOR_SIZE=y ?

random extract from "nm -v vmlinux"

c0415e58 T in4_pton
c0415f66 T in6_pton
c04161f0 T in_aton
c0416240 T net_ratelimit
c0416250 t linkwatch_add_event
c0416281 t linkwatch_schedule_work
c0416301 T linkwatch_fire_event
c0416368 t __linkwatch_run_queue
c04164d2 t linkwatch_event
c04164f4 T linkwatch_run_queue
c0416500 T sk_chk_filter
c0416703 t sk_filter_rcu_release
c041671d T sk_detach_filter
c041676a T sk_attach_filter
c0416862 T sk_run_filter
c0416f5d T sk_filter
c0416fc4 t __flow_cache_shrink

gcc -v
Using built-in specs.
Target: i686-pc-linux-gnu
Configured with: ../gcc-4.4.1/configure --enable-languages=c,c++ --prefix=/usr
Thread model: posix
gcc version 4.4.1 (GCC)

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


== 2 of 3 ==
Date: Tues, Oct 6 2009 9:20 am
From: Mathieu Desnoyers


* Eric Dumazet (eric.dumazet@gmail.com) wrote:
> Mathieu Desnoyers a écrit :
> > Poisoning the rcu_head callback list. Only for rcu tree for now.
> >
> > Helps finding racy users of call_rcu(), which results in hangs because list
> > entries are overwritten and/or skipped. Using the lower bit to poison because
> > include/net/dst.h __pad_to_align_refcnt complains when struct rcu_head grows.
> >
>
> I see :)
>

I don't know if there is an easy fix for __pad_to_align_refcnt ?

> > --- linux-2.6-lttng.orig/include/linux/rcupdate.h 2009-10-06 10:35:15.000000000 -0400
> > +++ linux-2.6-lttng/include/linux/rcupdate.h 2009-10-06 10:35:18.000000000 -0400
> > @@ -45,6 +45,8 @@
> > * struct rcu_head - callback structure for use with RCU
> > * @next: next update requests in a list
> > * @func: actual update function to call after the grace period.
> > + *
> > + * Debug mode assumes func pointer value is word-aligned.
>
> Ouch, I guess you never tried CONFIG_CC_OPTIMIZE_FOR_SIZE=y ?
>

oops. Well, simple fix would be : depends on !CC_OPTIMIZE_FOR_SIZE

It's a debug option after all... but I'd prefer growing the structure
with a supplementary field if possible.

Mathieu

> random extract from "nm -v vmlinux"
>
> c0415e58 T in4_pton
> c0415f66 T in6_pton
> c04161f0 T in_aton
> c0416240 T net_ratelimit
> c0416250 t linkwatch_add_event
> c0416281 t linkwatch_schedule_work
> c0416301 T linkwatch_fire_event
> c0416368 t __linkwatch_run_queue
> c04164d2 t linkwatch_event
> c04164f4 T linkwatch_run_queue
> c0416500 T sk_chk_filter
> c0416703 t sk_filter_rcu_release
> c041671d T sk_detach_filter
> c041676a T sk_attach_filter
> c0416862 T sk_run_filter
> c0416f5d T sk_filter
> c0416fc4 t __flow_cache_shrink
>
> gcc -v
> Using built-in specs.
> Target: i686-pc-linux-gnu
> Configured with: ../gcc-4.4.1/configure --enable-languages=c,c++ --prefix=/usr
> Thread model: posix
> gcc version 4.4.1 (GCC)
>

--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


== 3 of 3 ==
Date: Tues, Oct 6 2009 9:30 am
From: Eric Dumazet


Mathieu Desnoyers a écrit :
> * Eric Dumazet (eric.dumazet@gmail.com) wrote:
>> Mathieu Desnoyers a écrit :
>>> Poisoning the rcu_head callback list. Only for rcu tree for now.
>>>
>>> Helps finding racy users of call_rcu(), which results in hangs because list
>>> entries are overwritten and/or skipped. Using the lower bit to poison because
>>> include/net/dst.h __pad_to_align_refcnt complains when struct rcu_head grows.
>>>
>> I see :)
>>
>
> I don't know if there is an easy fix for __pad_to_align_refcnt ?


This check was added to make sure some tbench regression was not added if
dst->__refcnt was moved around, I am sure you dont care about tbench being 10% slower,
do you ?


diff --git a/include/net/dst.h b/include/net/dst.h
index 5a900dd..b8fba74 100644
--- a/include/net/dst.h
+++ b/include/net/dst.h
@@ -154,7 +154,9 @@ static inline void dst_hold(struct dst_entry * dst)
* If your kernel compilation stops here, please check
* __pad_to_align_refcnt declaration in struct dst_entry
*/
+#if !defined(DEBUG_RCU_HEAD)
BUILD_BUG_ON(offsetof(struct dst_entry, __refcnt) & 63);
+

0 Comments:

Post a Comment

Subscribe to Post Comments [Atom]

<< Home


Real Estate