Tuesday, January 21, 2014

linux.kernel - 26 new messages in 16 topics - digest

linux.kernel
http://groups.google.com/group/linux.kernel?hl=en

linux.kernel@googlegroups.com

Today's topics:

* x86: Inconsistent xAPIC synchronization in arch_irq_work_raise? - 1 message, 1 author
http://groups.google.com/group/linux.kernel/t/daa6d2e500b383a8?hl=en
* mm: thp: hugepage_vma_check has a blind spot - 1 message, 1 author
http://groups.google.com/group/linux.kernel/t/bf263f763b1b8c0c?hl=en
* MCS Lock: Allow architectures to hook in to contended paths - 7 messages, 1 author
http://groups.google.com/group/linux.kernel/t/a70959a6ec9463ba?hl=en
* audit: store audit_pid as a struct pid pointer - 1 message, 1 author
http://groups.google.com/group/linux.kernel/t/867132a6ef8a8c2c?hl=en
* powerpc: use device_initcall for registering rtc devices - 5 messages, 4 authors
http://groups.google.com/group/linux.kernel/t/bd6c2e133bc56440?hl=en
* pm/qos: allow state control of qos class - 1 message, 1 author
http://groups.google.com/group/linux.kernel/t/99d47d2ab1cbab9d?hl=en
* bio_integrity_verify() bug causing READ verify to be silently skipped - 1 message, 1 author
http://groups.google.com/group/linux.kernel/t/cf25ed79a5250081?hl=en
* [PATCH] preempt: Debug for possible missed preemption checks - 1 message, 1 author
http://groups.google.com/group/linux.kernel/t/68b76fe9e374c2dc?hl=en
* numa,sched: define some magic numbers - 1 message, 1 author
http://groups.google.com/group/linux.kernel/t/34ca8de3d5507b71?hl=en
* gpio: bcm281xx: Centralize register locking - 1 message, 1 author
http://groups.google.com/group/linux.kernel/t/2695bfd6ae884f11?hl=en
* linux-next: the usual request - 1 message, 1 author
http://groups.google.com/group/linux.kernel/t/e939d5c79d7d826b?hl=en
* gpio: bcm281xx: Fix parameter name for GPIO_CONTROL macro - 1 message, 1 author
http://groups.google.com/group/linux.kernel/t/ffd8dad9f04fbd37?hl=en
* qrwlock, x86 - Treat all data type not bigger than long as atomic in x86 - 1 message, 1 author
http://groups.google.com/group/linux.kernel/t/45392dc27bd5157f?hl=en
* linux rdma 3.14 merge plans - 1 message, 1 author
http://groups.google.com/group/linux.kernel/t/af3006cdced4146f?hl=en
* pinctrl: Rename Broadcom Capri pinctrl driver - 1 message, 1 author
http://groups.google.com/group/linux.kernel/t/20f2ee50e5f3b283?hl=en
* Weird plugin paths in perf and perf.so binaries with 3.14 merge window - 1 message, 1 author
http://groups.google.com/group/linux.kernel/t/76af8a389853bc17?hl=en

==============================================================================
TOPIC: x86: Inconsistent xAPIC synchronization in arch_irq_work_raise?
http://groups.google.com/group/linux.kernel/t/daa6d2e500b383a8?hl=en
==============================================================================

== 1 of 1 ==
Date: Tue, Jan 21 2014 3:30 pm
From: Huang Ying


On Tue, 2014-01-21 at 15:51 +0100, Peter Zijlstra wrote:
> On Tue, Jan 21, 2014 at 03:01:13PM +0100, Peter Zijlstra wrote:
> > On Tue, Jan 21, 2014 at 02:02:06PM +0100, Jan Kiszka wrote:
> > > Hi all,
> > >
> > > while trying to plug a race in the CPU hotplug code on xAPIC systems, I
> > > was analyzing IPI transmission patterns. The handlers in
> > > arch/x86/include/asm/ipi.h first wait for ICR, then send. In contrast,
> > > arch_irq_work_raise sends the self-IPI directly and then waits. This
> > > looks inconsistent. Is it intended?
> > >
> > > BTW, the races are in wakeup_secondary_cpu_via_init and
> > > wakeup_secondary_cpu_via_nmi (lacking IRQ disable around ICR accesses).
> > > There we also send first, then wait for completion. But I guess that is
> > > due to the code originally only being used during boot. Will send fixes
> > > for those once the sync pattern is clear to me.
> >
> > Could be I had no clue what I was doing and copy/pasted the code until
> > it compiled and ran.
> >
> > In fact, I've got no clue what an ICR is.
>
> I dug about a bit, I borrowed that code from:
>
> lkml.kernel.org/r/1277348698-17311-3-git-send-email-ying.huang@intel.com
>
> Huang Ying, can you explain to Jan why you do the wait afterwards?

I borrowed the code from the original MCE event reporting code.

Andi, could you help explain it?

Best Regards,
Huang Ying


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
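

To make the ordering difference concrete, here is a minimal sketch of the two
patterns Jan describes. icr_wait_idle() and icr_write() are hypothetical
stand-ins for the xAPIC "poll ICR until the delivery-status bit clears" and
"write ICR to fire the IPI" operations; this is illustrative only, not the
actual kernel code.

/*
 * Illustrative sketch only -- not the actual kernel helpers.
 */

/* Ordering used by the senders in arch/x86/include/asm/ipi.h: */
static void ipi_send_wait_first(unsigned int cfg)
{
	icr_wait_idle();	/* wait for any previous IPI to be accepted */
	icr_write(cfg);		/* then trigger this one */
}

/* Ordering used by arch_irq_work_raise() for the self-IPI: */
static void ipi_send_wait_after(unsigned int cfg)
{
	icr_write(cfg);		/* trigger the self-IPI first ... */
	icr_wait_idle();	/* ... then wait until it has been accepted */
}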





==============================================================================
TOPIC: mm: thp: hugepage_vma_check has a blind spot
http://groups.google.com/group/linux.kernel/t/bf263f763b1b8c0c?hl=en
==============================================================================

== 1 of 1 ==
Date: Tue, Jan 21 2014 3:30 pm
From: David Rientjes


On Tue, 21 Jan 2014, Alex Thorlton wrote:

> hugepage_vma_check is called during khugepaged_scan_mm_slot to ensure
> that khugepaged doesn't try to allocate THPs in vmas where they are
> disallowed, either due to THPs being disabled system-wide, or through
> MADV_NOHUGEPAGE.
>
> The logic that hugepage_vma_check uses doesn't seem to cover all cases,
> in my opinion. Looking at the original code:
>
> if ((!(vma->vm_flags & VM_HUGEPAGE) && !khugepaged_always()) ||
>     (vma->vm_flags & VM_NOHUGEPAGE))
>
> We can see that it's possible to have THP disabled system-wide, but still
> receive THPs in this vma. It seems that it's assumed that just because
> khugepaged_always == false, TRANSPARENT_HUGEPAGE_REQ_MADV_FLAG must be
> set, which is not the case. We could have VM_HUGEPAGE set, but have THP
> set to "never" system-wide, in which case, the condition presented in the
> if will evaluate to false, and (provided the other checks pass) we can
> end up giving out a THP even though the behavior is set to "never."
>

You should be able to add a

BUG_ON(current != khugepaged_thread);

here since khugepaged is supposed to be the only caller to the function.

> While we do properly check these flags in khugepaged_has_work, it looks
> like it's possible to sleep after we check khugepaged_has_work, but
> before hugepage_vma_check, during which time, hugepages could have been
> disabled system-wide, in which case, we could hand out THPs when we
> shouldn't be.
>

You're talking about when thp is set to "never" and before khugepaged has
stopped, correct?

That doesn't seem like a bug to me or anything that needs to be fixed; the
sysfs knob could be switched even after hugepage_vma_check() is called and
before a hugepage is actually collapsed, so you have the same race.

The only thing that's guaranteed is that, upon writing "never" to
/sys/kernel/mm/transparent_hugepage/enabled, no more thp memory will be
collapsed after khugepaged has stopped.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
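

For reference, a minimal sketch of the stricter check Alex is arguing for,
written as a standalone predicate. It assumes the existing khugepaged_always()
helper and a khugepaged_req_madv() helper that tests
TRANSPARENT_HUGEPAGE_REQ_MADV_FLAG; this is illustrative only, not the actual
patch.

/*
 * Illustrative sketch only.  A variant of the hugepage_vma_check()
 * condition that also refuses hugepages when THP is set to "never"
 * system-wide.  khugepaged_req_madv() is assumed to be a helper that
 * tests TRANSPARENT_HUGEPAGE_REQ_MADV_FLAG.
 */
static bool vma_allows_thp(struct vm_area_struct *vma)
{
	if (vma->vm_flags & VM_NOHUGEPAGE)
		return false;	/* MADV_NOHUGEPAGE always wins */
	if (khugepaged_always())
		return true;	/* "always": any vma qualifies */
	if (khugepaged_req_madv() && (vma->vm_flags & VM_HUGEPAGE))
		return true;	/* "madvise" plus MADV_HUGEPAGE */
	return false;		/* "never", or no MADV_HUGEPAGE hint */
}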





==============================================================================
TOPIC: MCS Lock: Allow architectures to hook in to contended paths
http://groups.google.com/group/linux.kernel/t/a70959a6ec9463ba?hl=en
==============================================================================

== 1 of 7 ==
Date: Tue, Jan 21 2014 3:40 pm
From: Tim Chen


From: Will Deacon <will.deacon@arm.com>

When contended, architectures may be able to reduce the polling overhead
in ways which aren't expressible using a simple relax() primitive.

This patch allows architectures to hook into the mcs_{lock,unlock}
functions for the contended cases only.

Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
---
include/linux/mcs_spinlock.h | 42 ++++++++++++++++++++++++++++--------------
1 file changed, 28 insertions(+), 14 deletions(-)

diff --git a/include/linux/mcs_spinlock.h b/include/linux/mcs_spinlock.h
index 143fa42..e9a4d74 100644
--- a/include/linux/mcs_spinlock.h
+++ b/include/linux/mcs_spinlock.h
@@ -17,6 +17,28 @@ struct mcs_spinlock {
 	int locked; /* 1 if lock acquired */
 };
 
+#ifndef arch_mcs_spin_lock_contended
+/*
+ * Using smp_load_acquire() provides a memory barrier that ensures
+ * subsequent operations happen after the lock is acquired.
+ */
+#define arch_mcs_spin_lock_contended(l)				\
+do {								\
+	while (!(smp_load_acquire(l)))				\
+		arch_mutex_cpu_relax();				\
+} while (0)
+
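
The digest cuts the patch off at this point. For context, here is a minimal
sketch of the kind of architecture override this hook is meant to enable,
assuming an ARM-style wait-for-event/send-event pair; wfe() and sev() stand
in for those primitives, and this is not the actual arm64 implementation.

/*
 * Illustrative sketch only -- the sort of override an architecture
 * could provide for the contended paths.  wfe()/sev() stand in for an
 * ARM-style wait-for-event/send-event pair.
 */
#define arch_mcs_spin_lock_contended(l)				\
do {								\
	while (!smp_load_acquire(l))				\
		wfe();		/* sleep until signalled */	\
} while (0)

#define arch_mcs_spin_unlock_contended(l)			\
do {								\
	smp_store_release(l, 1);	/* pass the lock down */\
	sev();			/* wake the waiter in wfe() */	\
} while (0)

With overrides like these, the uncontended fast path is untouched; only the
spin loops in the contended slow path change, which is the stated point of
the patch.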
