Monday, January 20, 2014

linux.kernel - 26 new messages in 13 topics - digest

linux.kernel
http://groups.google.com/group/linux.kernel?hl=en

linux.kernel@googlegroups.com

Today's topics:

* at drivers/md/raid5.c:291! kernel 3.13-rc8 - 1 messages, 1 author
http://groups.google.com/group/linux.kernel/t/258a3c089e1fa163?hl=en
* tun: handle copy failure in tun_put_user() - 2 messages, 2 authors
http://groups.google.com/group/linux.kernel/t/d4927bf4a5f4d012?hl=en
* qrwlock: Use smp_store_release() in write_unlock() - 1 messages, 1 author
http://groups.google.com/group/linux.kernel/t/45392dc27bd5157f?hl=en
* net: document accel_priv parameter for __dev_queue_xmit() - 2 messages, 1
author
http://groups.google.com/group/linux.kernel/t/1e51b37e4b2b3f7b?hl=en
* don't use module_init in non-modular ... (was: Re: [PATCH] m68k: don't use
module_init in non-modular mvme16x/rtc.c code) - 1 messages, 1 author
http://groups.google.com/group/linux.kernel/t/98145423499b983a?hl=en
* linux-next: build failure after merge of the tip tree - 2 messages, 2
authors
http://groups.google.com/group/linux.kernel/t/5d213db28c0ba532?hl=en
* net: stmmac: Add Allwinner A20 GMAC ethernet - 1 messages, 1 author
http://groups.google.com/group/linux.kernel/t/b8ff6224a630de28?hl=en
* percpu_ida+Co: Make percpu_ida_alloc accept task state bitmask - 5 messages,
1 author
http://groups.google.com/group/linux.kernel/t/5f7350d1c0fbc0c5?hl=en
* drm/i2c: tda998x: add DT documentation - 1 messages, 1 author
http://groups.google.com/group/linux.kernel/t/0e0d67db5c084b40?hl=en
* [PATCH net-next v2] net: stmmac: fix NULL pointer dereference in
stmmac_get_tx_hwtstamp - 1 messages, 1 author
http://groups.google.com/group/linux.kernel/t/f98354a5fe6531a5?hl=en
* Smart Card(SC) interface, TI USIM & NxP SC phy driver - 6 messages, 1 author
http://groups.google.com/group/linux.kernel/t/4d682d4213503c14?hl=en
* zram stats rework and code cleanup - 1 messages, 1 author
http://groups.google.com/group/linux.kernel/t/52a5da8f61cb9b81?hl=en
* zram: delete zram_init_device() function - 2 messages, 1 author
http://groups.google.com/group/linux.kernel/t/92f95a6da293588e?hl=en

==============================================================================
TOPIC: at drivers/md/raid5.c:291! kernel 3.13-rc8
http://groups.google.com/group/linux.kernel/t/258a3c089e1fa163?hl=en
==============================================================================

== 1 of 1 ==
Date: Sun, Jan 19 2014 7:40 pm
From: NeilBrown


On Mon, 20 Jan 2014 01:49:17 +0100 Ian Kumlien <ian.kumlien@gmail.com> wrote:

> On Mon, 2014-01-20 at 11:38 +1100, NeilBrown wrote:
> > On Sun, 19 Jan 2014 23:00:23 +0100 Ian Kumlien <ian.kumlien@gmail.com> wrote:
> >
> > > Ok, so third try to actually email this...
> > > ---
> > >
> > > Hi,
> > >
> > > I started testing 3.13-rc8 on another machine since the first one seemed
> > > to be working fine...
> > >
> > > One spontaneous reboot later i'm not so sure ;)
> > >
> > > Right now i captured a kernel oops in the raid code it seems...
> > >
> > > (Also attached to avoid mangling)
> > >
> > > [33411.934672] ------------[ cut here ]------------
> > > [33411.934685] kernel BUG at drivers/md/raid5.c:291!
> > > [33411.934690] invalid opcode: 0000 [#1] PREEMPT SMP
> > > [33411.934696] Modules linked in: bonding btrfs microcode
> > > [33411.934705] CPU: 4 PID: 2319 Comm: md2_raid6 Not tainted 3.13.0-rc8 #83
> > > [33411.934709] Hardware name: System manufacturer System Product Name/Crosshair IV Formula, BIOS 3029 10/09/2012
> > > [33411.934716] task: ffff880326265880 ti: ffff880320472000 task.ti: ffff880320472000
> > > [33411.934720] RIP: 0010:[<ffffffff81a3a5be>] [<ffffffff81a3a5be>] do_release_stripe+0x18e/0x1a0
> > > [33411.934731] RSP: 0018:ffff880320473d28 EFLAGS: 00010087
> > > [33411.934735] RAX: ffff8802f0875a60 RBX: 0000000000000001 RCX: ffff8800b0d816b0
> > > [33411.934739] RDX: ffff880324eeee98 RSI: ffff8802f0875a40 RDI: ffff880324eeec00
> > > [33411.934743] RBP: ffff8802f0875a50 R08: 0000000000000000 R09: 0000000000000001
> > > [33411.934747] R10: 0000000000000000 R11: 0000000000000000 R12: ffff880324eeec00
> > > [33411.934752] R13: ffff880324eeee58 R14: ffff880320473e88 R15: 0000000000000000
> > > [33411.934756] FS: 00007fc38654d700(0000) GS:ffff880337d00000(0000) knlGS:0000000000000000
> > > [33411.934761] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> > > [33411.934765] CR2: 00007f0cb28bd000 CR3: 00000002ebcf6000 CR4: 00000000000407e0
> > > [33411.934769] Stack:
> > > [33411.934771] ffff8800bba09690 ffff8800b4f16588 ffff880303005a40 0000000000000001
> > > [33411.934779] ffff8800b33e43d0 ffffffff81a3a62d ffff880324eeee58 0000000000000000
> > > [33411.934786] ffff880324eeee58 ffff880326660670 ffff880326265880 ffffffff81a41692
> > > [33411.934794] Call Trace:
> > > [33411.934798] [<ffffffff81a3a62d>] ? release_stripe_list+0x4d/0x70
> > > [33411.934803] [<ffffffff81a41692>] ? raid5d+0xa2/0x4d0
> > > [33411.934808] [<ffffffff81a65ed6>] ? md_thread+0xe6/0x120
> > > [33411.934814] [<ffffffff81122060>] ? finish_wait+0x90/0x90
> > > [33411.934818] [<ffffffff81a65df0>] ? md_rdev_init+0x100/0x100
> > > [33411.934823] [<ffffffff8110958c>] ? kthread+0xbc/0xe0
> > > [33411.934828] [<ffffffff81110000>] ? smpboot_park_threads+0x70/0x70
> >
> > Thanks for the report.
> > Can you provide any more context about the details of the array in question?
> > I see it was RAID6. Was it degraded? Was it resyncing? Was it being
> > reshaped?
> > Was there any way that it was different from the array on the machine where
> > it seemed to work?
>
> Yes, it's a raid6 and no, there is no reshaping or syncing going on...
>
> Basically everything worked fine before:
> reboot system boot 3.13.0-rc8 Sun Jan 19 21:47 - 01:42 (03:55)
> reboot system boot 3.13.0-rc8 Sun Jan 19 21:38 - 01:42 (04:04)
> reboot system boot 3.13.0-rc8 Sun Jan 19 12:13 - 01:42 (13:29)
> reboot system boot 3.13.0-rc8 Sat Jan 18 21:23 - 01:42 (1+04:19)
> reboot system boot 3.12.6 Mon Dec 30 16:27 - 22:21 (19+05:53)
>
> As in, no problems before the 3.13.0-rc8 upgrade...
>
> cat /proc/mdstat:
> Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4] [multipath]
> md2 : active raid6 sdf1[2] sdd1[9] sdj1[8] sdg1[4] sde1[5] sdi1[11] sdc1[0] sdh1[10]
> 11721074304 blocks super 1.2 level 6, 64k chunk, algorithm 2 [8/8] [UUUUUUUU]
> bitmap: 0/15 pages [0KB], 65536KB chunk
>
> What i do do is:
> echo 32768 > /sys/block/*/md/stripe_cache_size
>
> Which has caused no problems during intense write operations before...
>
> I find it quite surprising since it only requires ~3 gigabytes of writes
> to die and almost assume that it's related to the stripe_cache_size.
> (Since all memory is ECC and i doubt it would break, quite literally,
> over night i haven't run extensive memory tests)
>
> I don't quite know what other information you might need...

Thanks - that extra info is quite useful. Knowing that nothing else unusual
is happening can be quite valuable (and I don't like to assume).

I haven't found anything that would clearly cause your crash, but I have
found something that looks wrong and conceivably could.

Could you please try this patch on top of what you are currently using? By
the look of it you get a crash at least once a day, often more frequently. So
if this produces a day with no crashes, that would be promising.

The important aspect of the patch is that it moves the "atomic_inc" of
"sh->count" back under the protection of ->device_lock in the case when some
other thread might be using the same 'sh'.

Thanks,
NeilBrown


diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 3088d3af5a89..03f82ab87d9e 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -675,8 +675,10 @@ get_active_stripe(struct r5conf *conf, sector_t sector,
 					     || !conf->inactive_blocked),
 					    *(conf->hash_locks + hash));
 				conf->inactive_blocked = 0;
-			} else
+			} else {
 				init_stripe(sh, sector, previous);
+				atomic_inc(&sh->count);
+			}
 		} else {
 			spin_lock(&conf->device_lock);
 			if (atomic_read(&sh->count)) {
@@ -695,13 +697,11 @@ get_active_stripe(struct r5conf *conf, sector_t sector,
 					sh->group = NULL;
 				}
 			}
+			atomic_inc(&sh->count);
 			spin_unlock(&conf->device_lock);
 		}
 	} while (sh == NULL);
 
-	if (sh)
-		atomic_inc(&sh->count);
-
 	spin_unlock_irq(conf->hash_locks + hash);
 	return sh;
 }
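
The locking idea described in the message above (take the reference on 'sh'
while still holding the lock that guards the count check) can be illustrated
with a small user-space sketch. This is not kernel code and not the md
implementation: struct stripe, stripe_get() and stripe_put() are hypothetical
names, and a pthread mutex stands in for ->device_lock. It only shows the
pattern of testing the count and incrementing it inside one critical section.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a cached stripe; not the kernel's stripe_head. */
struct stripe {
	atomic_int count;	/* active references */
	bool on_inactive_list;	/* true while nobody holds a reference */
};

/* A pthread mutex plays the role of conf->device_lock in this sketch. */
static pthread_mutex_t device_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Take a reference. The test of count, the list bookkeeping and the
 * increment all happen inside one critical section, so another thread
 * that takes device_lock never sees the stripe half-claimed.
 */
static void stripe_get(struct stripe *sh)
{
	pthread_mutex_lock(&device_lock);
	if (atomic_load(&sh->count) == 0)
		sh->on_inactive_list = false;	/* claim it from the inactive list */
	atomic_fetch_add(&sh->count, 1);	/* still under device_lock */
	pthread_mutex_unlock(&device_lock);
}

/* Drop a reference; the last one puts the stripe back on the inactive list. */
static void stripe_put(struct stripe *sh)
{
	pthread_mutex_lock(&device_lock);
	assert(atomic_load(&sh->count) > 0);
	if (atomic_fetch_sub(&sh->count, 1) == 1)
		sh->on_inactive_list = true;
	pthread_mutex_unlock(&device_lock);
}
```

If the increment in stripe_get() were moved after the unlock, a second thread
could take the lock in between, still read a count of zero, and treat the
stripe as unused. That window is what doing both steps under the lock closes.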





==============================================================================
TOPIC: tun: handle copy failure in tun_put_user()
http://groups.google.com/group/linux.kernel/t/d4927bf4a5f4d012?hl=en
==============================================================================

== 1 of 2 ==
Date: Sun, Jan 19 2014 7:50 pm
From: David Miller


From: Jason Wang <jasowang@redhat.com>
Date: Mon, 20 Jan 2014 11:16:48 +0800

> This patch returns the error codes of the copy helpers in tun_put_user()
> instead of ignoring them.
>
> Cc: Michael S. Tsirkin <mst@redhat.com>
> Signed-off-by: Jason Wang <jasowang@redhat.com>

If you perform some of the copy successfully, you have to report that
length rather than just an error.

Otherwise userland has no way to determine how much of the data was
successfully sourced.

I'm not applying this, sorry.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/




== 2 of 2 ==
Date: Sun, Jan 19 2014 9:10 pm
From: Jason Wang


On 01/20/2014 11:48 AM, David Miller wrote:
> From: Jason Wang <jasowang@redhat.com>
> Date: Mon, 20 Jan 2014 11:16:48 +0800
>
>> This patch returns the error codes of the copy helpers in tun_put_user()
>> instead of ignoring them.
>>
>> Cc: Michael S. Tsirkin <mst@redhat.com>
>> Signed-off-by: Jason Wang <jasowang@redhat.com>
> If you perform some of the copy successfully, you have to report that
> length rather than just an error.
>
> Otherwise userland has no way to determine how much of the data was
> successfully sourced.
>
> I'm not applying this, sorry.

Right, looks like we need more changes in tun to return the accurate
length copied in this case.
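
One reading of the requirement discussed above, sketched in user-space C
rather than in the tun driver: return an error only if nothing was copied,
and otherwise return the number of bytes that did reach the destination. The
helpers copy_chunk() and put_packet() and the fail_at parameter are
hypothetical and exist only to imitate a copy that faults partway through;
they are not part of tun.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/*
 * Hypothetical copy helper: refuses any copy that would write at or past
 * offset fail_at, imitating a fault partway through the destination.
 * Returns 0 on success, -EFAULT on failure.
 */
static int copy_chunk(char *dst, size_t off, const char *src, size_t len,
		      size_t fail_at)
{
	if (off + len > fail_at)
		return -EFAULT;
	memcpy(dst + off, src, len);
	return 0;
}

/*
 * Copy a header followed by a payload. If nothing was copied, return the
 * error; if some bytes already reached the caller, return that count.
 */
static ssize_t put_packet(char *dst, const char *hdr, size_t hdr_len,
			  const char *data, size_t data_len, size_t fail_at)
{
	size_t total = 0;
	int err;

	assert(dst && hdr && data);

	err = copy_chunk(dst, total, hdr, hdr_len, fail_at);
	if (err)
		return err;		/* nothing copied yet: plain error */
	total += hdr_len;

	err = copy_chunk(dst, total, data, data_len, fail_at);
	if (err)
		return (ssize_t)total;	/* partial success: report the length */
	total += data_len;

	return (ssize_t)total;
}
```

With a 2-byte header and a 7-byte payload, a fault after the header yields 2
rather than -EFAULT, so the caller can tell how much data arrived.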





==============================================================================
TOPIC: qrwlock: Use smp_store_release() in write_unlock()
http://groups.google.com/group/linux.kernel/t/45392dc27bd5157f?hl=en
==============================================================================

== 1 of 1 ==
Date: Sun, Jan 19 2014 7:50 pm
From: "Paul E. McKenney"


On Tue, Jan 14, 2014 at 11:44:06PM -0500, Waiman Long wrote:
> This patch modifies the queue_write_unlock() function to use the new
> smp_store_release() function (currently in tip). It also removes the
> temporary implementation of smp_load_acquire() and smp_store_release()
> function in qrwlock.c.
>
> This patch will use atomic subtraction instead if the writer field is
> not atomic.
>
> Signed-off-by: Waiman Long <Waiman.Long@hp.com>

Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>

> ---
> include/asm-generic/qrwlock.h | 10 ++++++----
> kernel/locking/qrwlock.c | 34 ----------------------------------
> 2 files changed, 6 insertions(+), 38 deletions(-)
>
> diff --git a/include/asm-generic/qrwlock.h b/include/asm-generic/qrwlock.h
> index 5abb6ca..68f488b 100644
> --- a/include/asm-generic/qrwlock.h
> +++ b/include/asm-generic/qrwlock.h
> @@ -181,11 +181,13 @@ static inline void queue_read_unlock(struct qrwlock *lock)
>  static inline void queue_write_unlock(struct qrwlock *lock)
>  {
>  	/*
> -	 * Make sure that none of the critical section will be leaked out.
> +	 * If the writer field is atomic, it can be cleared directly.
> +	 * Otherwise, an atomic subtraction will be used to clear it.
>  	 */
> -	smp_mb__before_clear_bit();
> -	ACCESS_ONCE(lock->cnts.writer) = 0;
> -	smp_mb__after_clear_bit();
> +	if (__native_word(lock->cnts.writer))
> +		smp_store_release(&lock->cnts.writer, 0);
> +	else
> +		atomic_sub(_QW_LOCKED, &lock->cnts.rwa);
>  }
>
> /*
> diff --git a/kernel/locking/qrwlock.c b/kernel/locking/qrwlock.c
> index 053be4d..2727188 100644
> --- a/kernel/locking/qrwlock.c
> +++ b/kernel/locking/qrwlock.c
> @@ -47,40 +47,6 @@
> # define arch_mutex_cpu_relax() cpu_relax()
>
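
The two unlock paths in the quoted patch can be approximated in portable C11
as a rough analogy. This is a sketch under assumptions, not the kernel code:
struct writer_byte, struct rwlock_sketch and QW_LOCKED are hypothetical, a
C11 release store stands in for smp_store_release(), and the release ordering
on the subtraction is a choice made for this sketch rather than a statement
about the kernel's atomic_sub().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define QW_LOCKED 0xffu	/* hypothetical writer-lock value in the low byte */

/* Case 1: the writer field is its own natively storable object. */
struct writer_byte {
	_Atomic uint8_t writer;
};

/* Case 2: the writer field only exists as the low byte of a larger word. */
struct rwlock_sketch {
	_Atomic uint32_t cnts;	/* low byte: writer, upper bits: readers */
};

static void write_unlock_store(struct writer_byte *w)
{
	/* Release: earlier critical-section accesses cannot move past this store. */
	atomic_store_explicit(&w->writer, 0, memory_order_release);
}

static void write_unlock_sub(struct rwlock_sketch *lock)
{
	/* The caller must hold the write lock, i.e. the low byte is QW_LOCKED. */
	assert((atomic_load_explicit(&lock->cnts, memory_order_relaxed)
		& QW_LOCKED) == QW_LOCKED);

	/* Clear only the writer byte; the reader bits above it are untouched. */
	atomic_fetch_sub_explicit(&lock->cnts, QW_LOCKED, memory_order_release);
}
```

The plain release store is the cheaper path; the subtraction is the fallback
when the field cannot be written with a single native store.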
