Monday, January 13, 2014

linux.kernel - 26 new messages in 11 topics - digest

linux.kernel
http://groups.google.com/group/linux.kernel?hl=en

linux.kernel@googlegroups.com

Today's topics:

* xen-netback: fix refcnt unbalance for 3.11 and earlier versions - 15
messages, 1 author
http://groups.google.com/group/linux.kernel/t/31c5832c1daad707?hl=en
* lockdep: Kill held_lock->check and "int check" arg of __lock_acquire() - 1
messages, 1 author
http://groups.google.com/group/linux.kernel/t/07e5799772466cd4?hl=en
* sched/deadline: Add SCHED_DEADLINE avg_update accounting - 2 messages, 1
author
http://groups.google.com/group/linux.kernel/t/bae5cdbfc0d0e035?hl=en
* futexes: Document multiprocessor ordering guarantees - 1 messages, 1 author
http://groups.google.com/group/linux.kernel/t/fffd606090cd5dd4?hl=en
* sysctl: Make neg_one a standard constraint - 1 messages, 1 author
http://groups.google.com/group/linux.kernel/t/678813442bdbf2fd?hl=en
* Fwd: Re: RFC: cgroups aware proc - 1 messages, 1 author
http://groups.google.com/group/linux.kernel/t/d7cc54deb72b98b9?hl=en
* ASoC: fsl: Add VF610 simple audio card widgets driver. - 1 messages, 1
author
http://groups.google.com/group/linux.kernel/t/939890aa08d03d1e?hl=en
* rwsem: add rwsem_is_contended - 1 messages, 1 author
http://groups.google.com/group/linux.kernel/t/9f2f610921ad297c?hl=en
* input: Add commonly used event types - 1 messages, 1 author
http://groups.google.com/group/linux.kernel/t/1cf55a1077d02708?hl=en
* mmc: arasan: Add driver for Arasan SDHCI - 1 messages, 1 author
http://groups.google.com/group/linux.kernel/t/0c94d27019dd7cb2?hl=en
* re-shrink 'struct page' when SLUB is on. - 1 messages, 1 author
http://groups.google.com/group/linux.kernel/t/d6d1692f51c2de27?hl=en

==============================================================================
TOPIC: xen-netback: fix refcnt unbalance for 3.11 and earlier versions
http://groups.google.com/group/linux.kernel/t/31c5832c1daad707?hl=en
==============================================================================

== 1 of 15 ==
Date: Mon, Jan 13 2014 9:10 am
From: Luis Henriques


3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Wei Liu <wei.liu2@citrix.com>

With the introduction of "xen-netback: Don't destroy the netdev until
the vif is shut down" (upstream commit id 279f438e36), vif disconnect
and free are separated. However, in the backported version the reference
counting code was not correctly modified, and the reset of vif->tx_irq
was lost. If the frontend goes through the vif life cycle more than once,
the reference counting is skewed.

This patch adds back the missing tx_irq reset line. It also moves
several lines of the reference counting code to vif_free, so the moved
code corresponds to the counterpart in vif_alloc, thus the reference
counting is balanced.

3.12 and onward versions are not affected by this bug, because reference
counting code was removed due to the introduction of 1:1 model.

This patch should be backported to all stable versions which are earlier
than 3.12 and have 279f438e36.
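
The imbalance described above can be illustrated with a small userspace
model. This is only a sketch, not the driver code: struct toy_vif, the
toy_* functions and the "fixed" flag are hypothetical names invented for
the illustration, and a plain int stands in for the atomic_t refcnt. It
assumes the initial reference is taken once at allocation time, as the
description of vif_alloc suggests.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for struct xenvif; only the fields the patch touches. */
struct toy_vif {
    int refcnt;   /* models the atomic_t refcnt */
    int tx_irq;   /* non-zero while an IRQ is bound */
};

/* Allocation side: the initial reference is taken exactly once. */
static void toy_alloc(struct toy_vif *vif)
{
    vif->refcnt = 1;
    vif->tx_irq = 0;
}

static void toy_connect(struct toy_vif *vif, int irq)
{
    vif->tx_irq = irq;
}

/* fixed == false models the broken backport, true the patched code. */
static void toy_disconnect(struct toy_vif *vif, bool fixed)
{
    if (!fixed)
        vif->refcnt--;      /* dropped on every disconnect: unbalanced */

    if (vif->tx_irq && fixed)
        vif->tx_irq = 0;    /* the reset the broken backport lost */
}

static void toy_free(struct toy_vif *vif, bool fixed)
{
    if (fixed) {
        vif->refcnt--;      /* pairs with the single set in toy_alloc */
        assert(vif->tx_irq == 0);
    }
}

/* Run `cycles` connect/disconnect rounds, then free; return final refcnt. */
static int toy_lifecycle(bool fixed, int cycles)
{
    struct toy_vif vif;

    toy_alloc(&vif);
    for (int i = 0; i < cycles; i++) {
        toy_connect(&vif, 42);
        toy_disconnect(&vif, fixed);
    }
    toy_free(&vif, fixed);
    return vif.refcnt;
}
```

In this model a single life cycle ends at zero either way, but with the
broken variant every additional connect/disconnect round drives the
counter further below zero, so a wait for refcnt == 0 would never be
satisfied.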

Reported-and-tested-by: Tomasz Wroblewski <tomasz.wroblewski@citrix.com>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Konrad Wilk <konrad.wilk@oracle.com>
Cc: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
---
drivers/net/xen-netback/interface.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
index d28324a..342d4e5 100644
--- a/drivers/net/xen-netback/interface.c
+++ b/drivers/net/xen-netback/interface.c
@@ -418,9 +418,6 @@ void xenvif_disconnect(struct xenvif *vif)
if (netif_carrier_ok(vif->dev))
xenvif_carrier_off(vif);

- atomic_dec(&vif->refcnt);
- wait_event(vif->waiting_to_free, atomic_read(&vif->refcnt) == 0);
-
if (vif->tx_irq) {
if (vif->tx_irq == vif->rx_irq)
unbind_from_irqhandler(vif->tx_irq, vif);
@@ -428,6 +425,7 @@ void xenvif_disconnect(struct xenvif *vif)
unbind_from_irqhandler(vif->tx_irq, vif);
unbind_from_irqhandler(vif->rx_irq, vif);
}
+ vif->tx_irq = 0;
}

xen_netbk_unmap_frontend_rings(vif);
@@ -435,6 +433,9 @@ void xenvif_disconnect(struct xenvif *vif)

void xenvif_free(struct xenvif *vif)
{
+ atomic_dec(&vif->refcnt);
+ wait_event(vif->waiting_to_free, atomic_read(&vif->refcnt) == 0);
+
unregister_netdev(vif->dev);

free_netdev(vif->dev);
--
1.8.3.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/




== 2 of 15 ==
Date: Mon, Jan 13 2014 9:10 am
From: Luis Henriques


3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Daniel Vetter <daniel.vetter@ffwll.ch>

commit acc240d41ea1ab9c488a79219fb313b5b46265ae upstream.

So apparently under ridiculous amounts of memory pressure we can get
into trouble in do_switch when we try to move the old hw context
backing storage object onto the active lists.

With list debugging enabled that usually results in us chasing a
poisoned pointer - which means we've hit upon a vma that has been
removed from all lrus with list_del (and then deallocated, so it's a
real use-after-free).

Ian Lister has done some great callchain chasing and noticed that we
can reenter do_switch:

i915_gem_do_execbuffer()

i915_switch_context()

do_switch()
   from = ring->last_context;
   i915_gem_object_pin()

      i915_gem_object_bind_to_gtt()
         ret = drm_mm_insert_node_in_range_generic();
         // If the above call fails then it will try i915_gem_evict_something()
         // If that fails it will call i915_gem_evict_everything() ...
         i915_gem_evict_everything()
            i915_gpu_idle()
               i915_switch_context(DEFAULT_CONTEXT)

Like with everything else where the shrinker or eviction code can
invalidate pointers we need to reload relevant state.

Note that there's no need to recheck whether a context switch is still
required because:

- Doing a switch to the same context is harmless (besides wasting a
bit of energy).

- This can only happen with the default context. But since that one's
pinned we'll never call down into evict_everything under normal
circumstances. Note that there's a little driver bringup fun
involved, namely that we could recurse into do_switch for the
initial switch. Atm we're fine since we assign the context pointer
only after the call to do_switch at driver load or resume time. And
in the gpu reset case we skip the entire setup sequence (which might
be a bug on its own, but definitely not this one here).
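
The re-entrancy problem above comes down to caching a pointer before a
call that may change what it should point to. A minimal userspace sketch
of that pattern follows; it is not the i915 code. The toy_* types and
functions, the "evicted" flag and the "reload" parameter are all
hypothetical, and the model assumes the eviction path always fires, which
in practice only happens under severe memory pressure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_ctx { int id; bool evicted; };

struct toy_ring { struct toy_ctx *last_context; };

static struct toy_ctx default_ctx = { 0, false };

/*
 * Models a pin that falls through to the last-ditch eviction: the ring is
 * switched to the default context and the previous context's backing
 * storage is evicted.
 */
static void toy_pin_with_eviction(struct toy_ring *ring)
{
    struct toy_ctx *old = ring->last_context;

    ring->last_context = &default_ctx;
    if (old && old != &default_ctx)
        old->evicted = true;
}

/* Returns the context the switch would go on to treat as "from". */
static struct toy_ctx *toy_do_switch(struct toy_ring *ring,
                                     struct toy_ctx *to, bool reload)
{
    struct toy_ctx *from = ring->last_context;  /* cached before the pin */

    assert(to != NULL);
    toy_pin_with_eviction(ring);                /* may re-enter the switch */

    if (reload)
        from = ring->last_context;              /* reload after the pin */

    ring->last_context = to;
    return from;
}

/* True if the "from" context used after the pin has been evicted. */
static bool toy_from_is_stale(bool reload)
{
    struct toy_ctx a = { 1, false }, b = { 2, false };
    struct toy_ring ring = { &a };

    return toy_do_switch(&ring, &b, reload)->evicted;
}
```

Without the reload, the model keeps working on the evicted context; with
it, "from" is the default context that the nested switch installed.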

Cc'ing stable since apparently ChromeOS guys are seeing this in the
wild (and not just on artificial stress tests), see the reference.

Note that upstream code doesn't call evict_everything directly
from evict_something; that's an extension in this product branch. But
we can still hit upon this bug (and apparently we do, see the linked
backtraces). I've noticed this while trying to construct a testcase
for this bug and utterly failed to provoke it. It looks like we need
to drive the system squarely into the lowmem wall and provoke the
shrinker to evict the context object by doing the last-ditch
evict_everything call.

Aside: There's currently no means to get a badly-fragmenting hw
context object away from a bad spot in the upstream code. We should
fix this by at least adding some code to evict_something to handle hw
contexts.

References: https://code.google.com/p/chromium/issues/detail?id=248191
Reported-by: Ian Lister <ian.lister@intel.com>
Cc: Ian Lister <ian.lister@intel.com>
Cc: Ben Widawsky <benjamin.widawsky@intel.com>
Cc: Stéphane Marchesin <marcheu@chromium.org>
Cc: Bloomfield, Jon <jon.bloomfield@intel.com>
Tested-by: Rafael Barbalho <rafael.barbalho@intel.com>
Reviewed-by: Ian Lister <ian.lister@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
---
drivers/gpu/drm/i915/i915_gem_context.c | 14 ++++++++++++--
1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 90b0491..5b3087f 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -409,11 +409,21 @@ static int do_switch(struct i915_hw_context *to)
if (ret)
return ret;

- /* Clear this page out of any CPU caches for coherent swap-in/out. Note
+ /*
+ * Pin can switch back to the default context if we end up calling into
+ * evict_everything - as a last ditch gtt defrag effort that also
+ * switches to the default context. Hence we need to reload from here.
+ */
+ from = ring->last_context;
+
+ /*
+ * Clear this page out of any CPU caches for coherent swap-in/out. Note
* that thanks to write = false in this call and us not setting any gpu
* write domains when putting a context object onto the active list
* (when switching away from it), this won't block.
- * XXX: We need a real interface to do this instead of trickery. */
+ *
+ * XXX: We need a real interface to do this instead of trickery.
+ */
ret = i915_gem_object_set_to_gtt_domain(to->obj, false);
if (ret) {
i915_gem_object_unpin(to->obj);
--
1.8.3.2





== 3 of 15 ==
Date: Mon, Jan 13 2014 9:10 am
From: Luis Henriques


3.11.10.3 -stable review patch. If anyone has any objections, please let me know.

------------------

From: Fenghua Yu <fenghua.yu@intel.com>

commit 522e66464467543c0d88d023336eec4df03ad40b upstream.

In the reboot and crash paths, when we shut down the local APIC, the I/O
APIC is still active. This may cause issues because external interrupts
can still come in and disturb the local APIC during the shutdown process.

To quiet external interrupts, disable the I/O APIC before shutting down
the local APIC.
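
The ordering argument can be shown with a deliberately coarse toy model.
It is not kernel code: struct toy_apics and the toy_* functions are
hypothetical, and the model simply assumes one external interrupt arrives
between the two shutdown steps and one more afterwards.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_apics {
    bool ioapic_enabled;    /* I/O APIC still forwards external interrupts */
    bool lapic_enabled;     /* local APIC is still up */
    int  late_interrupts;   /* interrupts that reached a shut-down local APIC */
};

/* An external interrupt is forwarded only while the I/O APIC is enabled. */
static void toy_external_interrupt(struct toy_apics *s)
{
    if (s->ioapic_enabled && !s->lapic_enabled)
        s->late_interrupts++;
}

/* Returns how many interrupts hit the local APIC after it was shut down. */
static int toy_shutdown(bool ioapic_first)
{
    struct toy_apics s = { true, true, 0 };

    if (ioapic_first) {
        s.ioapic_enabled = false;   /* quiet the source first */
        toy_external_interrupt(&s);
        s.lapic_enabled = false;
    } else {
        s.lapic_enabled = false;    /* old order: source still live */
        toy_external_interrupt(&s);
        s.ioapic_enabled = false;
    }
    toy_external_interrupt(&s);
    return s.late_interrupts;
}
```

With the I/O APIC disabled first, no interrupt reaches the local APIC
once it is down; with the old order, the interrupt arriving between the
two steps does.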

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Link: http://lkml.kernel.org/r/1382578212-4677-1-git-send-email-fenghua.yu@intel.com
[ I suppose the 'issue' is a hang during shutdown. It's a fine change nevertheless. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
[ luis: backported to 3.11: adjusted context ]
Signed-off-by: Luis Henriques <luis.henriques@canonical.com>
---
arch/x86/kernel/crash.c | 2 +-
arch/x86/kernel/reboot.c | 8 ++++----
2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
index 74467fe..0b9d44d 100644
--- a/arch/x86/kernel/crash.c
+++ b/arch/x86/kernel/crash.c
@@ -127,10 +127,10 @@ void native_machine_crash_shutdown(struct pt_regs *regs)
cpu_emergency_vmxoff();
cpu_emergency_svm_disable();

- lapic_shutdown();
#if defined(CONFIG_X86_IO_APIC)
disable_IO_APIC();
