Tuesday, December 22, 2009

linux.kernel - 25 new messages in 14 topics - digest

linux.kernel
http://groups.google.com/group/linux.kernel?hl=en

linux.kernel@googlegroups.com

Today's topics:

* WARN_ON at line 380 in kernel/smp.c under 2.6.32.2 + TuxOnIce + KDB - 2
messages, 2 authors
http://groups.google.com/group/linux.kernel/t/ed977c8984edac1f?hl=en
* Driver core: devtmpfs: prevent concurrent subdirectory creation and removal -
1 messages, 1 author
http://groups.google.com/group/linux.kernel/t/4fa5251c4f5cf4b6?hl=en
* AlacrityVM guest drivers for 2.6.33 - 4 messages, 3 authors
http://groups.google.com/group/linux.kernel/t/265f7bf4a9a1d92d?hl=en
* NOMMU: Avoiding duplicate icache flushes of shared maps - 2 messages, 1
author
http://groups.google.com/group/linux.kernel/t/e9c135731d46f942?hl=en
* FDPIC: Respect PT_GNU_STACK exec protection markings when creating NOMMU
stack - 2 messages, 1 author
http://groups.google.com/group/linux.kernel/t/00445fed5f02ade0?hl=en
* BUG printk with not null-terminated string in driver /drivers/acpi/osl.c - 1
messages, 1 author
http://groups.google.com/group/linux.kernel/t/80092a93dec35e11?hl=en
* workqueue thing - 4 messages, 3 authors
http://groups.google.com/group/linux.kernel/t/dc904db34835b18c?hl=en
* Generic support for this_cpu_cmpxchg - 1 messages, 1 author
http://groups.google.com/group/linux.kernel/t/2bf6b597f491a7bc?hl=en
* vfs patches for -rc2 - 1 messages, 1 author
http://groups.google.com/group/linux.kernel/t/ba34060b05a3dda9?hl=en
* net/via-rhine: Fix scheduling while atomic bugs - 1 messages, 1 author
http://groups.google.com/group/linux.kernel/t/4d8e8d13d27c7077?hl=en
* DMA cache consistency bug introduced in 2.6.28 (Was: Re: Cannot format
floppies under kernel 2.6.*?) - 2 messages, 2 authors
http://groups.google.com/group/linux.kernel/t/7ab0ecc4119ae78c?hl=en
* utimensat fails to update ctime - 2 messages, 2 authors
http://groups.google.com/group/linux.kernel/t/c83020068257a38d?hl=en
* Asus eeepc 1008HA suspend issue and mac80211 suspend corner case - 1
messages, 1 author
http://groups.google.com/group/linux.kernel/t/033504ee0cb101d3?hl=en
* Fix tracing infrastructure to support multiple includes when defining
CREATE_TRACE_POINTS - 1 messages, 1 author
http://groups.google.com/group/linux.kernel/t/480d9896c15b15dd?hl=en

==============================================================================
TOPIC: WARN_ON at line 380 in kernel/smp.c under 2.6.32.2 + TuxOnIce + KDB
http://groups.google.com/group/linux.kernel/t/ed977c8984edac1f?hl=en
==============================================================================

== 1 of 2 ==
Date: Tues, Dec 22 2009 9:00 am
From: Pedro Ribeiro


Hi all,

I've been asked by Nigel Cunningham of TuxOnIce to forward this bug here.

While resuming from hibernate (using TuxOnIce) I'm seeing the WARN_ON
at line 380 in kernel/smp.c trigger from a kmap_high call right after
secondary processors have been brought down (the previous message is
"CPU1 is down").

I only have pictures of the backtrace (readable but not very good
quality, sorry).
http://img51.imageshack.us/img51/2312/stacktrace1.jpg
http://img195.imageshack.us/img195/3646/stacktrace2.jpg

At first I thought this was related to my battery saving script
messing with /proc/sys/vm/dirty_writeback_centisecs, but I'm not sure
right now, it still happens with the default value, so I'm clueless.

Mind you, this does not impede the resume - it just dumps this stack
trace and continues resuming happily.

Some information which might be helpful:
lspci -vv, http://pastebin.com/m2c217b4e
dmesg, http://pastebin.com/m491ab4db
.config, http://pastebin.com/m2e4352fe
(the pastebin links are good for a month)

My hardware is a Lenovo T400, and I'm using 2.6.32.2 + TuxOnIce + KDB patches.

Thanks for your help,
Pedro
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


== 2 of 2 ==
Date: Tues, Dec 22 2009 9:30 am
From: Peter Zijlstra


On Tue, 2009-12-22 at 16:51 +0000, Pedro Ribeiro wrote:
> Hi all,
>
> I've been asked by Nigel Cunningham of TuxOnIce to forward this bug here.
>
> While resuming from hibernate (using TuxOnIce) I'm seeing the WARN_ON
> at line 380 in kernel/smp.c trigger from a kmap_high call right after
> secondary processors have been brought down (the previous message is
> "CPU1 is down").
>
> I only have pictures of the backtrace (readable but not very good
> quality, sorry).
> http://img51.imageshack.us/img51/2312/stacktrace1.jpg
> http://img195.imageshack.us/img195/3646/stacktrace2.jpg
>
> At first I thought this was related to my battery saving script
> messing with /proc/sys/vm/dirty_writeback_centisecs, but I'm not sure
> right now, it still happens with the default value, so I'm clueless.
>
> Mind you, this does not impede the resume - it just dumps this stack
> trace and continues resuming happily.

If you enabled frame pointers the stack traces would be clearer, but
it looks like a bug in TuxOnIce, doing kmap_high() with IRQs disabled
or something like that.
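
The condition Peter suspects can be modelled in a few lines of userspace C.
This is only a toy sketch: it assumes the warning in question is the sanity
check that complains when a cross-CPU function call is issued while local
interrupts are disabled (kmap_high() can end up making such a call when it
has to flush stale mappings on all CPUs). The struct and function names below
are hypothetical and are not kernel APIs.

```c
#include <stdbool.h>

/* Toy stand-ins for kernel state; every name here is hypothetical. */
struct toy_cpu_state {
	bool irqs_disabled;	/* models local interrupts being masked */
	bool cpu_online;	/* models the calling CPU being online */
	bool oops_in_progress;	/* models the kernel already crashing */
};

/*
 * Returns true when a cross-CPU call made in this state deserves a
 * warning: an online CPU with interrupts off cannot service the other
 * CPUs' requests while it waits for them, so it may deadlock.
 */
static bool toy_cross_call_should_warn(const struct toy_cpu_state *s)
{
	return s->cpu_online && s->irqs_disabled && !s->oops_in_progress;
}
```

In this model a caller that maps highmem pages with interrupts disabled on an
online CPU gets the warning, while the same call with interrupts enabled does
not.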


==============================================================================
TOPIC: Driver core: devtmpfs: prevent concurrent subdirectory creation and
removal
http://groups.google.com/group/linux.kernel/t/4fa5251c4f5cf4b6?hl=en
==============================================================================

== 1 of 1 ==
Date: Tues, Dec 22 2009 9:00 am
From: Greg KH


On Tue, Dec 22, 2009 at 04:10:31PM +0200, Kirill A. Shutemov wrote:
> On Mon, Dec 21, 2009 at 4:37 PM, Kay Sievers <kay.sievers@vrfy.org> wrote:
> > On Mon, Dec 21, 2009 at 14:37, Kirill A. Shutemov <kirill@shutemov.name> wrote:
> >> v2.6.33-rc1-96-gdd59f6c:
> >> I guess it can be related to this commit.
> >
> > The fix is already pending here:
> >  http://patchwork.kernel.org/patch/68337/
>
> One more problem: you don't unlock dirlock if kstrdup() failed.

Care to provide a patch?

thanks,

greg k-h
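
The class of bug Kirill points out (returning early on an allocation failure
while a lock is still held) is usually avoided by routing every exit through
a single unlock label. The following is a minimal userspace sketch of that
pattern, not the devtmpfs code: the lock is a plain flag, the allocator is
passed in so that failure can be simulated, and all names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Toy lock: a flag is enough to show whether a path forgot to unlock. */
struct toy_lock { bool held; };

static void toy_lock_acquire(struct toy_lock *l) { l->held = true; }
static void toy_lock_release(struct toy_lock *l) { l->held = false; }

/* Allocator signature, so a caller can inject an allocation failure. */
typedef void *(*toy_alloc_fn)(size_t);

static void *toy_alloc_fail(size_t n) { (void)n; return NULL; }

/*
 * Duplicate 'path' while holding the lock.  Returns the copy, or NULL if
 * the allocation fails.  Every exit goes through 'out', so the lock is
 * released on the error path as well.
 */
static char *toy_dup_path_locked(struct toy_lock *dirlock, const char *path,
				 toy_alloc_fn alloc)
{
	char *copy;
	size_t n = strlen(path) + 1;

	toy_lock_acquire(dirlock);

	copy = alloc(n);
	if (!copy)
		goto out;	/* a bare 'return NULL' here would leak the lock */

	memcpy(copy, path, n);
out:
	toy_lock_release(dirlock);
	return copy;
}
```

Calling toy_dup_path_locked() with malloc returns a copy of the path, and
calling it with toy_alloc_fail returns NULL; in both cases the lock flag is
clear afterwards.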

==============================================================================
TOPIC: AlacrityVM guest drivers for 2.6.33
http://groups.google.com/group/linux.kernel/t/265f7bf4a9a1d92d?hl=en
==============================================================================

== 1 of 4 ==
Date: Tues, Dec 22 2009 9:10 am
From: Avi Kivity


On 12/22/2009 06:21 PM, Andi Kleen wrote:
>> So far, the only actual technical advantage I've seen is that vbus avoids
>> EOI exits.
>>
> The technical advantage is that it's significantly faster today.
>
> Maybe your proposed alternative is as fast, or maybe it's not. Who knows?
>

We're working on numbers for the proposed alternative, so we should know
soon. Are the AlacrityVM folks working on having all the virtio drivers
for all the virtio archs?

We shouldn't drop everything and switch to new code just because someone
came up with a new idea. The default should be to enhance the existing
code.

>> We think we understand why vbus does better than the current userspace
>> virtio backend. That's why we're building vhost-net. It's not done yet,
>> but our expectation is that it will do just as well if not better.
>>
> That's the vapourware vs working code disconnect I mentioned. One side has hard
> numbers&working code and the other has expectations. I usually find it sad when the
> vapourware holds up the working code.
>

vhost-net is working code and is queued for 2.6.33.

--
error compiling committee.c: too many arguments to function



== 2 of 4 ==
Date: Tues, Dec 22 2009 9:40 am
From: Gregory Haskins


On 12/22/09 2:57 AM, Ingo Molnar wrote:
>
> * Gregory Haskins <gregory.haskins@gmail.com> wrote:
>
>> On 12/18/09 4:51 PM, Ingo Molnar wrote:
>>>
>>> * Gregory Haskins <gregory.haskins@gmail.com> wrote:
>>>
>>>> Hi Linus,
>>>>
>>>> Please pull AlacrityVM guest support for 2.6.33 from:
>>>>
>>>> git://git.kernel.org/pub/scm/linux/kernel/git/ghaskins/alacrityvm/linux-2.6.git
>>>> for-linus
>>>>
>>>> All of these patches have stewed in linux-next for quite a while now:
>>>>
>>>> Gregory Haskins (26):
>>>
>>> I think it would be fair to point out that these patches have been objected to
>>> by the KVM folks quite extensively,
>>
>> Actually, these patches have nothing to do with the KVM folks. [...]
>
> That claim is curious to me - the AlacrityVM host

It's quite simple, really. These drivers support accessing vbus, and
vbus is hypervisor agnostic. In fact, vbus isn't necessarily even
hypervisor related. It may be used anywhere where a Linux kernel is the
"io backend", which includes hypervisors like AlacrityVM, but also
userspace apps, and interconnected physical systems as well.

The vbus-core on the backend, and the drivers on the frontend operate
completely independent of the underlying hypervisor. A glue piece
called a "connector" ties them together, and any "hypervisor" specific
details are encapsulated in the connector module. In this case, the
connector surfaces to the guest side as a pci-bridge, so even that is
not hypervisor specific per se. It will work with any pci-bridge that
exposes a compatible ABI, which conceivably could be actual hardware.

The AlacrityVM project just so happens to be the primary consumer, and
is therefore the most convenient way to package them up at the moment.

> is 90% based on KVM code, so
> how can it not be about KVM? I just checked, most of the changes that
> AlacrityVM host does to KVM is in adding the host side interfaces for these
> guest drivers:
>
> virt/kvm/Kconfig | 11 +
> virt/kvm/coalesced_mmio.c | 65 +++---
> virt/kvm/coalesced_mmio.h | 1 +
> virt/kvm/eventfd.c | 599 +++++++++++++++++++++++++++++++++++++++++++++
> virt/kvm/ioapic.c | 118 +++++++--
> virt/kvm/ioapic.h | 5 +
> virt/kvm/iodev.h | 55 +++--
> virt/kvm/irq_comm.c | 267 ++++++++++++++-------
> virt/kvm/kvm_main.c | 127 ++++++++--
> virt/kvm/xinterface.c | 587 ++++++++++++++++++++++++++++++++++++++++++++
> 10 files changed, 1649 insertions(+), 186 deletions(-)
>
> [ stat for virt/kvm/ taken as of today, AlacrityVM host tree commit 84afcc7 ]
>
> So as far as kernel code modifications of AlacrityVM goes, it's very much
> about KVM.

I think you are confused. Even if we entertained the notion that the
host side diffstat were somehow relevant here, you are probably
comparing the kvm.git backports that are in my tree. The only real KVM
specific change that is in my tree is the 587 lines for the xinterface.c
module, which is about ~4%, not 90%. Also note that I have pushed this
xinterface logic upstream already, but it just hasn't been accepted yet.

If I wanted to be extremely generous, you could include the entire "KVM
connector" code that bridges vbus-core to kvm-core, but even that tops
out at a total of ~17% of the changes in my tree. So I am still not
seeing the 90% nor how it is relevant.

>
>> [...] You are perhaps confusing this with the hypervisor-side discussion,
>> of which there is indeed much disagreement.
>
> Are the guest drivers living in a vacuum? The whole purpose of the AlacrityVM
> guest drivers is to ... enable AlacrityVM support, right?

More specifically, the purpose of the drivers, like any driver, is to
enable support for the underlying device to which it is related. In
this case, the devices are vbus based devices. Of those, AlacrityVM is
the only available platform that exposes them. However, that is a
maturity/adoption detail, not a technical limitation. Simply
implementing a new connector would bridge these drivers to other
environments as well. There are community members working on these as
we speak, as a matter of fact.

> So how can it be not about KVM?

Because AlacrityVM is a hypervisor that supports VBUS for PV IO, and KVM
is not. In addition, the presence of these drivers in no way alters,
interferes with, or diminishes features found in KVM today. So it is not,
and never will be, about KVM until upstream KVM decides that they want to
support VBUS based PV-IO.

If you want to talk about the host side, then I have +587 lines that
hang in the balance that affect KVM, yes. But that isn't what $subject
was about.

>
> Gregory, it would be nice if you worked _much_ harder with the KVM folks
> before giving up.

I think the 5+ months that I politely tried to convince the KVM folks
that this was a good idea was pretty generous of my employer. The KVM
maintainers have ultimately made it clear they are not interested in
directly supporting this concept (which is their prerogative), but are
perhaps willing to support the peripheral logic needed to allow it to
easily interface with KVM. I can accept that, and thus AlacrityVM was born.

Note that upstream KVM are also only a subset of the mindshare needed
for this project anyway, since most of the core is independent of KVM.
Perhaps the KVM folks will reconsider if/when other community members
start to see the merit in the work. Perhaps not. It's out of my
control at this point.

> It's not like there's much valid technical disagreement that
> i can identify in any of the threads

While I am sorry to hear that, it should be noted that this doesn't mean
that your perception is accurate, either. It was quite a long and
fragmented set of threads over those 5+ months, so absorbing the gist of
the vision from casual observation is not likely trivial.

> - the strongest one i could identify was:
> "I want to fork KVM so please let me do it, nobody is harmed, choice is good".

Everyone is of course entitled to an opinion, but I would respectfully
disagree with your statement (as I did last time you made the same
claim, as well). I have not now, nor ever, wanted a fork. But I also
believe in the work I am doing, so I won't roll over and die just
because a certain group doesn't share the vision per se either, sorry.
I get the impression that you would not either if you were in a similar
situation, so perhaps you can respect that.

Kind Regards,
-Greg

== 3 of 4 ==
Date: Tues, Dec 22 2009 9:40 am
From: Avi Kivity


On 12/22/2009 07:33 PM, Andi Kleen wrote:
>> We're not talking about vaporware. vhost-net exists.
>>
> Is it as fast as the alacrityvm setup then e.g. for network traffic?
>
> Last I heard the first could do wirespeed 10Gbit/s on standard hardware.
>

That was with zero-copy IIRC, which is known broken. There's nothing
alacrity-specific about zerocopy (and in fact the first zerocopy patches
were from Rusty).

> Can vhost-net do the same thing?
>

I've heard unofficial numbers which approach that, but let's wait for
the official ones.

--
error compiling committee.c: too many arguments to function



== 4 of 4 ==
Date: Tues, Dec 22 2009 9:40 am
From: Andi Kleen


> We're not talking about vaporware. vhost-net exists.

Is it as fast as the alacrityvm setup then e.g. for network traffic?

Last I heard the first could do wirespeed 10Gbit/s on standard hardware.
Can vhost-net do the same thing?

-Andi
--
ak@linux.intel.com -- Speaking for myself only.

==============================================================================
TOPIC: NOMMU: Avoiding duplicate icache flushes of shared maps
http://groups.google.com/group/linux.kernel/t/e9c135731d46f942?hl=en
==============================================================================

== 1 of 2 ==
Date: Tues, Dec 22 2009 9:20 am
From: David Howells


From: Mike Frysinger <vapier.adi@gmail.com>

When working with FDPIC, there are many shared mappings of read-only code
regions between applications (the C library, applet packages like busybox,
etc.), but the current do_mmap_pgoff() function will issue an icache flush
whenever a VMA is added to an MM instead of only doing it when the map is
initially created.

The flush can instead be done when a region is first mmapped PROT_EXEC. Note
that we may not rely on the first mapping of a region being executable - it's
possible for it to be PROT_READ only, so we have to remember whether we've
flushed the region or not, and then flush the entire region when a bit of it is
made executable.

However, this also affects the brk area. That will no longer be executable.
We can mprotect() it to PROT_EXEC on MPU-mode kernels, but for NOMMU mode
kernels, when it increases the brk allocation, making sys_brk() flush the extra
from the icache should suffice. The brk area probably isn't used by NOMMU
programs since the brk area can only use up the leavings from the stack
allocation, where the stack allocation is larger than requested.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
---

include/linux/mm_types.h | 2 ++
mm/nommu.c | 11 ++++++++---
2 files changed, 10 insertions(+), 3 deletions(-)


diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 84a524a..84d020b 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -123,6 +123,8 @@ struct vm_region {
 	struct file *vm_file; /* the backing file or NULL */

 	atomic_t vm_usage; /* region usage count */
+	bool vm_icache_flushed : 1; /* true if the icache has been flushed for
+				     * this region */
 };

 /*
diff --git a/mm/nommu.c b/mm/nommu.c
index 8687973..db52886 100644
--- a/mm/nommu.c
+++ b/mm/nommu.c
@@ -432,6 +432,7 @@ SYSCALL_DEFINE1(brk, unsigned long, brk)
 	/*
 	 * Ok, looks good - let it rip.
 	 */
+	flush_icache_range(mm->brk, brk);
 	return mm->brk = brk;
 }

@@ -1353,10 +1354,14 @@ unsigned long do_mmap_pgoff(struct file *file,
 share:
 	add_vma_to_mm(current->mm, vma);

-	up_write(&nommu_region_sem);
+	/* we flush the region from the icache only when the first executable
+	 * mapping of it is made */
+	if (vma->vm_flags & VM_EXEC && !region->vm_icache_flushed) {
+		flush_icache_range(region->vm_start, region->vm_end);
+		region->vm_icache_flushed = true;
+	}

-	if (prot & PROT_EXEC)
-		flush_icache_range(result, result + len);
+	up_write(&nommu_region_sem);

 	kleave(" = %lx", result);
 	return result;
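
The effect of the per-region flag can be seen in a small userspace model.
This is a sketch under stated assumptions rather than kernel code: the
region type, the flag value and the counting stand-in for
flush_icache_range() are all hypothetical, and only the "flush on the first
executable mapping, then remember it" logic follows the patch description.

```c
#include <stdbool.h>

#define TOY_VM_EXEC 0x4UL	/* hypothetical flag value for this sketch */

struct toy_region {
	unsigned long start, end;
	bool icache_flushed;		/* mirrors the idea of vm_icache_flushed */
	unsigned int flush_count;	/* instrumentation for the sketch only */
};

/* Stand-in for flush_icache_range(); it only counts how often it runs. */
static void toy_flush_icache_range(struct toy_region *r)
{
	r->flush_count++;
}

/* Called each time a mapping of the shared region is added to a process. */
static void toy_add_mapping(struct toy_region *r, unsigned long vm_flags)
{
	/* Flush the whole region once, on its first executable mapping. */
	if ((vm_flags & TOY_VM_EXEC) && !r->icache_flushed) {
		toy_flush_icache_range(r);
		r->icache_flushed = true;
	}
}
```

Mapping the region read-only first and then executable twice leaves
flush_count at 1, whereas the old behaviour described in the commit message
would have flushed on every executable mapping.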



== 2 of 2 ==
Date: Tues, Dec 22 2009 9:20 am
From: David Howells


From: Jie Zhang <jie.zhang@analog.com>

The MMU code uses the copy_*_user_page() variants in access_process_vm()
rather than copy_*_user() as the former includes an icache flush. This is
important when doing things like setting software breakpoints with gdb.
So switch the NOMMU code over to do the same.

This patch makes the reasonable assumption that copy_from_user_page() won't
fail - which is probably fine, as we've checked the VMA from which we're
copying is usable, and the copy is not allowed to cross VMAs. The one case
where it might go wrong is if the VMA is a device rather than RAM, and that
device returns an error - in which case rubbish will be returned rather
than EIO.

Signed-off-by: Jie Zhang <jie.zhang@analog.com>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: David McCullough <david_mccullough@mcafee.com>
Acked-by: Paul Mundt <lethal@linux-sh.org>
Acked-by: Greg Ungerer <gerg@uclinux.org>
---

mm/nommu.c | 6 ++++--
1 files changed, 4 insertions(+), 2 deletions(-)


diff --git a/mm/nommu.c b/mm/nommu.c
index db52886..1e1ecb2 100644
--- a/mm/nommu.c
+++ b/mm/nommu.c
@@ -1896,9 +1896,11 @@ int access_process_vm(struct task_struct *tsk, unsigned long addr, void *buf, in

 		/* only read or write mappings where it is permitted */
 		if (write && vma->vm_flags & VM_MAYWRITE)
-			len -= copy_to_user((void *) addr, buf, len);
+			copy_to_user_page(vma, NULL, addr,
+					  (void *) addr, buf, len);
 		else if (!write && vma->vm_flags & VM_MAYREAD)
-			len -= copy_from_user(buf, (void *) addr, len);
+			copy_from_user_page(vma, NULL, addr,
+					    buf, (void *) addr, len);
 		else
 			len = 0;
 	} else {
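
Why the icache flush matters for software breakpoints can be illustrated
with a toy machine in which the instruction cache is a separate copy of
memory. This is a hedged userspace sketch, not a model of any real CPU: the
structure and both write helpers are hypothetical, and "flushing" is
simulated by refreshing the cached bytes from memory.

```c
#include <string.h>

#define TOY_MEM_SIZE 16

struct toy_machine {
	unsigned char mem[TOY_MEM_SIZE];	/* main memory */
	unsigned char icache[TOY_MEM_SIZE];	/* what the CPU would execute */
};

/* Models a plain copy: memory changes, the icache keeps its old bytes. */
static void toy_plain_write(struct toy_machine *m, size_t off,
			    const void *src, size_t len)
{
	memcpy(m->mem + off, src, len);
}

/* Models a copy followed by an icache flush of the written range. */
static void toy_write_and_flush(struct toy_machine *m, size_t off,
				const void *src, size_t len)
{
	memcpy(m->mem + off, src, len);
	/* After the flush the CPU refetches, so it sees the new bytes. */
	memcpy(m->icache + off, m->mem + off, len);
}
```

Writing a breakpoint byte with toy_plain_write() changes mem[] but leaves
icache[] stale, so the toy CPU would still execute the old instruction;
toy_write_and_flush() makes both agree.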


==============================================================================
TOPIC: FDPIC: Respect PT_GNU_STACK exec protection markings when creating
NOMMU stack
http://groups.google.com/group/linux.kernel/t/00445fed5f02ade0?hl=en
==============================================================================

== 1 of 2 ==
Date: Tues, Dec 22 2009 9:20 am
From: David Howells


From: Mike Frysinger <vapier@gentoo.org>

The current code will load the stack size and protection markings, but then
only use the markings in the MMU code path. The NOMMU code path always passes
PROT_EXEC to the mmap() call. While this doesn't matter to most people whilst
the code is running, it will cause a pointless icache flush when starting every
FDPIC application. Typically this icache flush will be of a region on the
order of 128KB in size, or may be the entire icache, depending on the
facilities available on the CPU.

In the case where the arch default behaviour seems to be desired
(EXSTACK_DEFAULT), we probe VM_STACK_FLAGS for VM_EXEC to determine whether we
should be setting PROT_EXEC or not.

For arches that support an MPU (Memory Protection Unit - an MMU without the
virtual mapping capability), setting PROT_EXEC or not will make an important
difference.

It should be noted that this change also affects the executability of the brk
region, since ELF-FDPIC has that share with the stack. However, this is
probably irrelevant as NOMMU programs aren't likely to use the brk region,
preferring instead allocation via mmap().

Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: David Howells <dhowells@redhat.com>
---

arch/blackfin/include/asm/page.h | 5 +++++
arch/frv/include/asm/page.h | 2 --
fs/binfmt_elf_fdpic.c | 13 +++++++++++--
3 files changed, 16 insertions(+), 4 deletions(-)


diff --git a/arch/blackfin/include/asm/page.h b/arch/blackfin/include/asm/page.h
index 944a07c..1d04e40 100644
--- a/arch/blackfin/include/asm/page.h
+++ b/arch/blackfin/include/asm/page.h
@@ -10,4 +10,9 @@
#include <asm-generic/page.h>
#define MAP_NR(addr) (((unsigned long)(addr)-PAGE_OFFSET) >> PAGE_SHIFT)

+#define VM_DATA_DEFAULT_FLAGS \
+	(VM_READ | VM_WRITE | \
+	((current->personality & READ_IMPLIES_EXEC) ? VM_EXEC : 0 ) | \
+	 VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC)
+
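
The digest cuts the patch off at this point. The decision described in the
commit message (honour the PT_GNU_STACK marking, and for the default case
ask the architecture's stack flags whether the stack is executable) can be
sketched as below. This is not a reconstruction of the missing
fs/binfmt_elf_fdpic.c hunk: the enum, the constants and their values are
hypothetical stand-ins chosen so the example compiles in userspace.

```c
/* Hypothetical values for this sketch; not taken from kernel headers. */
enum toy_exstack {
	TOY_EXSTACK_DEFAULT,	/* no explicit marking: use the arch default */
	TOY_EXSTACK_DISABLE_X,	/* stack marked non-executable */
	TOY_EXSTACK_ENABLE_X	/* stack marked executable */
};

#define TOY_PROT_READ	0x1
#define TOY_PROT_WRITE	0x2
#define TOY_PROT_EXEC	0x4
#define TOY_VM_EXEC	0x4UL

/*
 * Pick the protection bits for the stack mapping.  'arch_stack_flags'
 * plays the role of the architecture's default stack flags.
 */
static int toy_stack_prot(enum toy_exstack marking,
			  unsigned long arch_stack_flags)
{
	int prot = TOY_PROT_READ | TOY_PROT_WRITE;

	if (marking == TOY_EXSTACK_ENABLE_X ||
	    (marking == TOY_EXSTACK_DEFAULT &&
	     (arch_stack_flags & TOY_VM_EXEC)))
		prot |= TOY_PROT_EXEC;

	return prot;
}
```

With this logic a binary marked non-executable never gets TOY_PROT_EXEC, so
the pointless icache flush of the stack region described above is avoided,
while an unmarked binary follows whatever the architecture default says.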
