linux.kernel - 26 new messages in 11 topics

linux.kernel
http://groups.google.com/group/linux.kernel?hl=en

linux.kernel@googlegroups.com

Today's topics:

* x86 Kconfig: create x86/Kconfig.virt - 2 messages, 1 author
http://groups.google.com/group/linux.kernel/t/350f637b956a1ce6?hl=en
* perf tools: Insert filtered entries to hists also - 1 messages, 1 author
http://groups.google.com/group/linux.kernel/t/ccb3c434474ceabc?hl=en
* zram: fix race between reset and flushing pending work - 2 messages, 1
author
http://groups.google.com/group/linux.kernel/t/dc0d124af7423670?hl=en
* kernel/time: Add new helpers to convert ktime to/from jiffies - 1 messages,
1 author
http://groups.google.com/group/linux.kernel/t/03f3752a1cdcf125?hl=en
* mm/zswap: add writethrough option - 1 messages, 1 author
http://groups.google.com/group/linux.kernel/t/83507199cf42b4fb?hl=en
* x86 Kconfig: move guest-side options under "Virtualization" - 1 messages, 1
author
http://groups.google.com/group/linux.kernel/t/1e447c67d5e3ad49?hl=en
* hwspinlock/omap: enable build for AM33xx, AM43xx & DRA7xx - 4 messages, 1
author
http://groups.google.com/group/linux.kernel/t/35459429c62267b5?hl=en
* sfc: Maintain current frequency adjustment when applying a time offset - 8
messages, 1 author
http://groups.google.com/group/linux.kernel/t/a53824587b1cf92d?hl=en
* 3.4.77-stable review - 2 messages, 1 author
http://groups.google.com/group/linux.kernel/t/ef25c501937e80d8?hl=en
* arm: remap non-modular uses of module_init properly - 3 messages, 1 author
http://groups.google.com/group/linux.kernel/t/c8256ff09940618b?hl=en
* bug in sscanf()? - 1 messages, 1 author
http://groups.google.com/group/linux.kernel/t/df11bc617290103d?hl=en

==============================================================================
TOPIC: x86 Kconfig: create x86/Kconfig.virt
http://groups.google.com/group/linux.kernel/t/350f637b956a1ce6?hl=en
==============================================================================

== 1 of 2 ==
Date: Mon, Jan 13 2014 4:10 pm
From: Dave Hansen

On 01/13/2014 03:12 PM, Paolo Bonzini wrote:
> On 14/01/2014 00:00, Dave Hansen wrote:
>>>> --- Virtualization
>>>> <*> Kernel-based Virtual Machine (KVM) support
>>>> <*> KVM for Intel processors support
>>>> <*> KVM for AMD processors support
>>>> [*] Audit KVM MMU
>>>> [*] KVM legacy PCI device assignment support
>>>> < > Host kernel accelerator for virtio net
>>>> [*] Linux guest support --->
>
> I think this is as confusing as before, perhaps worse because it is not
> clear that Linux guest support is not limited to KVM.

The "Linux guest support" menu is preexisting. This patch just moves it
verbatim from "Processor type and features" to "Virtualization". My
logic is that "Linux guest support" has a heck of a lot more to do with
"Virtualization" than the processor.

> If you really
> want a Virtualization menu, you have to clearly separate the guest and host:
>
> --- Virtualization
> Linux guest support --->
> Virtualization host support --->
>
> with KVM and vhost under the second item.

I think it's kinda silly to have folks chase through two levels of menus
when the submenus would have both fit on the screen anyway. Let's save
the layers of nesting for when we have 150 items in a menu.

I'll have new patches out in a sec.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

== 2 of 2 ==
Date: Mon, Jan 13 2014 4:20 pm
From: Dave Hansen

From: Dave Hansen <dave.hansen@linux.intel.com>

Right now, there is an "Enable paravirtualization code" option in
the "Processor Features" menu, which means Xen. There is also a
group of host-side paravirtualization options specific to KVM
under the top-level "Virtualization" menu.

I think it makes a lot of sense to group the host and guest side
things together, especially since the top-level "Virtualization"
menu is so sparsely populated.

This creates a new hypervisor-independent arch/x86/Kconfig.virt
file, and moves the "Virtualization" menu to be defined in there.
Currently CONFIG_VIRTUALIZATION really means "host-side", so
create a new config option which matches the guest-side one, and
default its value to be what CONFIG_VIRTUALIZATION was set to.

This also removes the very counterintuitive references to
lguest/vhost code *from* kvm-specific code and removes the silly:

depends on HAVE_KVM || X86

dependency. It makes zero sense to have entries defined in
arch/x86 depend on x86.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Dmitry Torokhov <dtor@vmware.com>
Cc: K. Y. Srinivasan <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Alexander Graf <agraf@suse.de>
Cc: Gleb Natapov <gleb@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
---

linux.git-davehans/arch/x86/Kconfig | 2 +-
linux.git-davehans/arch/x86/Kconfig.virt | 25 +++++++++++++++++++++++++
linux.git-davehans/arch/x86/kvm/Kconfig | 19 -------------------
3 files changed, 26 insertions(+), 20 deletions(-)

diff -puN arch/x86/Kconfig~x86-Kconfig-move-paravirt-under-virtualization arch/x86/Kconfig
--- linux.git/arch/x86/Kconfig~x86-Kconfig-move-paravirt-under-virtualization 2014-01-13 16:09:29.785875796 -0800
+++ linux.git-davehans/arch/x86/Kconfig 2014-01-13 16:09:29.793876157 -0800
@@ -2416,6 +2416,6 @@ source "security/Kconfig"

source "crypto/Kconfig"

-source "arch/x86/kvm/Kconfig"
+source "arch/x86/Kconfig.virt"

source "lib/Kconfig"
diff -puN /dev/null arch/x86/Kconfig.virt
--- /dev/null 2013-11-27 17:20:18.337162396 -0800
+++ linux.git-davehans/arch/x86/Kconfig.virt 2014-01-13 16:09:29.793876157 -0800
@@ -0,0 +1,25 @@
+
+menu "Virtualization"
+
+config VIRTUALIZATION
+ bool
+
+config HYPERVISOR_HOST
+ bool "Host-Side Features (Linux as the Hypervisor)"
+ default y if VIRTUALIZATION
+ ---help---
+ Say Y here to get to see options for using your Linux host to run other
+ operating systems inside virtual machines (guests).
+ This option alone does not add any kernel code.
+
+ If you say N, all options in this submenu will be skipped and disabled.
+
+if HYPERVISOR_HOST
+
+source arch/x86/kvm/Kconfig
+source drivers/vhost/Kconfig
+source drivers/lguest/Kconfig
+
+endif # HYPERVISOR_HOST
+
+endmenu # "Virtualization"
diff -puN arch/x86/kvm/Kconfig~x86-Kconfig-move-paravirt-under-virtualization arch/x86/kvm/Kconfig
--- linux.git/arch/x86/kvm/Kconfig~x86-Kconfig-move-paravirt-under-virtualization 2014-01-13 16:09:29.787875886 -0800
+++ linux.git-davehans/arch/x86/kvm/Kconfig 2014-01-13 16:09:29.793876157 -0800
@@ -4,19 +4,6 @@

source "virt/kvm/Kconfig"

-menuconfig VIRTUALIZATION
- bool "Virtualization"
- depends on HAVE_KVM || X86
- default y
- ---help---
- Say Y here to get to see options for using your Linux host to run other
- operating systems inside virtual machines (guests).
- This option alone does not add any kernel code.
-
- If you say N, all options in this submenu will be skipped and disabled.
-
-if VIRTUALIZATION
-
config KVM
tristate "Kernel-based Virtual Machine (KVM) support"
depends on HAVE_KVM
@@ -93,9 +80,3 @@ config KVM_DEVICE_ASSIGNMENT

If unsure, say Y.

-# OK, it's a little counter-intuitive to do this, but it puts it neatly under
-# the virtualization menu.
-source drivers/vhost/Kconfig
-source drivers/lguest/Kconfig
-
-endif # VIRTUALIZATION
_

==============================================================================
TOPIC: perf tools: Insert filtered entries to hists also
http://groups.google.com/group/linux.kernel/t/ccb3c434474ceabc?hl=en
==============================================================================

== 1 of 1 ==
Date: Mon, Jan 13 2014 4:20 pm
From: Namhyung Kim

Hi Arnaldo,

On Thu, 9 Jan 2014 11:37:24 -0300, Arnaldo Carvalho de Melo wrote:
> On Thu, Jan 09, 2014 at 09:57:35PM +0900, Namhyung Kim wrote:
>> 2014-01-08 (Wed), 15:59 -0300, Arnaldo Carvalho de Melo:
>> > On Wed, Jan 08, 2014 at 05:22:53PM +0100, Jiri Olsa wrote:
>> > > On Wed, Jan 08, 2014 at 09:41:13AM -0300, Arnaldo Carvalho de Melo wrote:
>> > > > On Wed, Jan 08, 2014 at 05:46:06PM +0900, Namhyung Kim wrote:
>> > > > > Currently, if a sample is filtered out by a command-line option, it
>> > > > > is just dropped. But this affects the final output in that the
>> > > > > percentages can differ, since the filtered entries are not included
>> > > > > in the total.
>> > > > >
>> > > > > For example, if an original output looked like below:
>> > > >
>> > > > Humm, if one says that he/she is interested on just samples for a and b,
>> > > > the current behaviour will state how many of the filtered samples are
>> > > > for a and b, which is valid.
>> > > >
>> > > > I bet the number of samples will reflect that as well, but you filtered
>> > > > it out, yes, it stays there, so the percentages are relative to the
>> > > > number of samples.
>> > > >
>> > > > So I think this change in behaviour is wrong, no?
>
>> > > haven't checked the implementation yet, but it kind of does
>> > > what I'd expect for symbol filtering:
>
>> > > perf report
>> > > ...
>> > > 22.00% yes libc-2.17.so [.] __strlen_sse2
>> > > 11.79% yes libc-2.17.so [.] fputs_unlocked
>> > > 9.65% yes libc-2.17.so [.] __GI___mempcpy
>> > > 1.91% yes yes [.] fputs_unlocked@plt
>> > > ...
>> > >
>> > > search (press '/') for fputs_unlocked (with Namhyung's change):
>> > > 11.79% yes libc-2.17.so [.] fputs_unlocked
>> > > 1.91% yes yes [.] fputs_unlocked@plt
>> > >
>> > > while the current one shows:
>> > > 86.08% yes libc-2.17.so [.] fputs_unlocked
>> > > 13.92% yes yes [.] fputs_unlocked@plt
>> > >
>> > > which annoys me when searching for 'invisible' symbol
>> > > within tons of others.. I had to do that grep thing
>> > > you showed.
>> > >
>> > > I'd like to have the Namhyung's change behaviour as default,
>> > > but I'll be happy with some switch as well ;-)
>> >
>> > I understand the desire for this different mode, looks indeed useful.
>>
>> Yeah, the above is the reason why I wrote this firstly. And then I
>> thought it should be applied to the command line filter options too.
>
> I don't have a problem with providing a new option, but for those who
> think that when you filter samples based on some criteria the
> percentages that should appear should be relative to the new, filtered,
> total_period, that is a change in behaviour, so needs to be switchable.
>
>> > So I think that this is a new feature and as so we should provide it as
>> > an option, that may (or not) become the default.
>> >
>> > Some concerns I have are that when we go on filtering we have to have
>> > all the things that are zeroed to then get accrued for each hist entry
>> > that matches the filter being applied and now at least a nr_entries
>> > field got out of the if (al.filtered) block, i.e. in the end we will
>> > have the number of hist entries filtered but continue having the
>> > total period for all (filtered or not) hist entries.
>
>> One thing related to it is when --children option is used. Since total
>> period is added only for a real sample, if the sample is filtered but
>> the parents are not, the parents might have more than 100% overhead.
>
> So when implementing the new option this has to be taken into account,
> no problem (haven't really thought about the full implications here).
>
>> > Having it as a separate feature would allow to have both views:
>> >
>> > 1. the percentages relative to the filtered samples
>> > 2. the percentages relative to all (filtered or not) samples
>> >
>> > Being selectable on the command line and also with a hotkey to provide
>> > two columns: %total, %filtered.
>>
>> Hmm.. do you really want two columns instead of single column and a
>> switch/option? Then the (second) %filtered column will be shown up only
>> if filtering is enabled. Isn't it annoying for a dynamic filtering
>> (i.e. '/' key on TUI)?
>
> Hey, I'm not the one to decide this :-)
>
> There _are_ two choices for how the percentage gets computed, if one
> wants one, the other, or both, well, the hard part here is to decide the
> default, but there are two options, showing one, the other or both
> should be left to the user, even if after one or two keystrokes :)

So I'd like to make this changed behavior the default, as Jiri said.

But adding a new percentage column will be a headache, since it'll
increase the number of combinations of the current behavior (total x
sys/user x group x children). I'd really like to keep it small.

What about just adding a --percentage <relative|absolute> option and
making "absolute" the default (it can also be changed via a config
option, of course)?

Thanks,
Namhyung
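
As a rough illustration of the two ways of computing the percentage
discussed above, here is a minimal user-space sketch in C. It is not
perf code: the enum, the struct and entry_percent() are hypothetical
names invented for this example, and the mode names only mirror the
proposed --percentage values.

```c
#include <stdint.h>

/* Hypothetical mode names mirroring the proposed --percentage values. */
enum percentage_mode {
	PERCENTAGE_ABSOLUTE,	/* relative to all samples, filtered or not */
	PERCENTAGE_RELATIVE,	/* relative to samples that passed the filter */
};

/* Hypothetical totals; perf's real hists bookkeeping is more involved. */
struct period_totals {
	uint64_t total_period;		/* period of every sample */
	uint64_t filtered_period;	/* period of samples that passed the filter */
};

/* Return the percentage shown for one hist entry under the given mode. */
static double entry_percent(uint64_t entry_period,
			    const struct period_totals *t,
			    enum percentage_mode mode)
{
	uint64_t base = (mode == PERCENTAGE_RELATIVE) ? t->filtered_period
						       : t->total_period;

	if (base == 0)
		return 0.0;	/* nothing to compare against */
	return 100.0 * (double)entry_period / (double)base;
}
```

With made-up periods (an entry of 1179 out of a total of 10000, of
which 1370 pass the filter), the absolute mode gives about 11.79% and
the relative mode about 86.06%, which is the same kind of difference
as in Jiri's fputs_unlocked example.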

==============================================================================
TOPIC: zram: fix race between reset and flushing pending work
http://groups.google.com/group/linux.kernel/t/dc0d124af7423670?hl=en
==============================================================================

== 1 of 2 ==
Date: Mon, Jan 13 2014 4:20 pm
From: Minchan Kim

On Mon, Jan 13, 2014 at 03:55:27PM -0800, Andrew Morton wrote:
> On Mon, 13 Jan 2014 20:18:56 +0900 Minchan Kim <minchan@kernel.org> wrote:
>
> > Dan and Sergey reported that there is a race between reset and
> > flushing of pending work, so it could cause an oops by freeing
> > zram->meta in reset while zram_slot_free can still access zram->meta
> > if a new request is added during the race window.
> >
> > This patch moves the flush after taking init_lock, which prevents
> > new requests and so closes the race.
> >
> > ..
> >
> > --- a/drivers/block/zram/zram_drv.c
> > +++ b/drivers/block/zram/zram_drv.c
> > @@ -553,14 +553,14 @@ static void zram_reset_device(struct zram *zram, bool reset_capacity)
> > size_t index;
> > struct zram_meta *meta;
> >
> > - flush_work(&zram->free_work);
> > -
> > down_write(&zram->init_lock);
> > if (!zram->init_done) {
> > up_write(&zram->init_lock);
> > return;
> > }
> >
> > + flush_work(&zram->free_work);
> > +
> > meta = zram->meta;
> > zram->init_done = 0;
>
> This makes zram.lock nest inside zram.init_lock, which afaict is new
> behaviour.

It was already nested originally, so it's not new. :)
Look at zram_make_request, which holds init_lock, and then zram_bvec_rw,
which takes zram->lock.

>
> That all seems OK and logical - has it been well tested with lockdep?

Yep.

--
Kind regards,
Minchan Kim
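
To make the ordering argument in this thread concrete, here is a
single-threaded toy model in C. It is only a sketch, not driver code:
struct zram_model and the model_* functions are hypothetical, the lock
is represented by a comment, and the "racing request" is injected at
the point where the real race window would be.

```c
#include <stdbool.h>

/* Toy state; field names only loosely mirror struct zram. */
struct zram_model {
	bool init_done;
	bool meta_valid;	/* stands in for zram->meta being allocated */
	int pending_frees;	/* slot frees queued on the deferred work item */
	int stale_accesses;	/* deferred frees that ran after meta was released */
};

static void model_init(struct zram_model *z)
{
	z->init_done = true;
	z->meta_valid = true;
	z->pending_frees = 0;
	z->stale_accesses = 0;
}

/* Stands in for a request that queues a deferred slot free. */
static void model_queue_free(struct zram_model *z)
{
	z->pending_frees++;
}

/* Stands in for flush_work(): run everything queued so far. */
static void model_flush(struct zram_model *z)
{
	while (z->pending_frees > 0) {
		if (!z->meta_valid)
			z->stale_accesses++;	/* would touch freed memory */
		z->pending_frees--;
	}
}

/* Old order: flush first, then lock; a request can slip in between. */
static void model_reset_flush_first(struct zram_model *z, bool racing_request)
{
	model_flush(z);
	if (racing_request)
		model_queue_free(z);	/* race window before the lock */
	/* down_write(&init_lock) would happen here */
	if (!z->init_done)
		return;
	z->meta_valid = false;		/* meta released */
	z->init_done = false;
}

/* New order: lock first, then flush; late requests are drained safely. */
static void model_reset_lock_first(struct zram_model *z, bool racing_request)
{
	if (racing_request)
		model_queue_free(z);	/* last request before the lock */
	/* down_write(&init_lock): no new requests from here on */
	if (!z->init_done)
		return;
	model_flush(z);			/* runs while meta is still valid */
	z->meta_valid = false;
	z->init_done = false;
}
```

In this model, calling model_reset_flush_first() with a racing request
and then running the deferred work leaves one stale access, while
model_reset_lock_first() leaves none. The model says nothing about
lockdep or about the zram->lock nesting itself.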

== 2 of 2 ==
Date: Mon, Jan 13 2014 4:20 pm
From: Minchan Kim

On Mon, Jan 13, 2014 at 03:58:14PM -0800, Andrew Morton wrote:
> On Mon, 13 Jan 2014 20:18:59 +0900 Minchan Kim <minchan@kernel.org> wrote:
>
> > Some of the fields in zram->stats are protected by zram->lock, which
> > is rather coarse-grained, so let's use atomic operations without
> > explicit locking.
>
> Confused. The patch didn't remove any locking, so it made the code
> slower.

True, but it removes the dependency on zram->lock for the 32-bit stats,
so further patches can remove messy code and improve write performance.
So it's a preparatory patch for the next steps.
Should I rewrite the description to explain this?

--
Kind regards,
Minchan Kim

==============================================================================
TOPIC: kernel/time: Add new helpers to convert ktime to/from jiffies
http://groups.google.com/group/linux.kernel/t/03f3752a1cdcf125?hl=en
==============================================================================

== 1 of 1 ==
Date: Mon, Jan 13 2014 4:20 pm
From: Chanwoo Choi

On 01/13/2014 07:43 PM, Alexey Perevalov wrote:
> From: Anton Vorontsov <anton@scarybugs.org>
>
> Two new functions: jiffies_to_ktime() and ktime_to_jiffies(), we'll use
> them for timerfd deferred timers handling.
>
> We fully reuse the logic from timespec implementations, so the functions
> are pretty straightforward.
>
> The only tricky part is in headers: we have to include jiffies.h after
> we defined ktime_t, this is because ktime.h needs some declarations from
> jiffies.h (e.g. TICK_NSEC).
>
> Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org>
> Signed-off-by: Alexey Perevalov <a.perevalov@samsung.com>

Tested-by: Chanwoo Choi <cw00.choi@samsung.com>

I tested this patchset for deferrable timer operation from user space.

Thanks,
Chanwoo Choi
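
For readers unfamiliar with the conversion being added, here is a
minimal user-space sketch in C of a jiffies <-> nanoseconds round
trip. It is not the patch's code: the MODEL_* constants and model_*
functions are hypothetical, the tick rate of 250 Hz is an assumption,
the tick length is simplified to NSEC_PER_SEC / HZ, and rounding up
when converting to jiffies is an assumption about what reusing the
timespec logic implies. Overflow is not handled.

```c
#include <stdint.h>

#define MODEL_HZ		250ULL		/* assumed tick rate */
#define MODEL_NSEC_PER_SEC	1000000000ULL
#define MODEL_TICK_NSEC		(MODEL_NSEC_PER_SEC / MODEL_HZ)

/* Convert a jiffies count to nanoseconds (a stand-in for ktime_t). */
static int64_t model_jiffies_to_ns(uint64_t j)
{
	return (int64_t)(j * MODEL_TICK_NSEC);
}

/*
 * Convert nanoseconds to jiffies, rounding up so that a timer never
 * fires earlier than requested. Negative values clamp to zero.
 */
static uint64_t model_ns_to_jiffies(int64_t ns)
{
	if (ns <= 0)
		return 0;
	return ((uint64_t)ns + MODEL_TICK_NSEC - 1) / MODEL_TICK_NSEC;
}
```

At 250 Hz one tick is 4,000,000 ns, so 5 jiffies map to 20,000,000 ns,
and 20,000,001 ns rounds up to 6 jiffies.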


==============================================================================
TOPIC: mm/zswap: add writethrough option
http://groups.google.com/group/linux.kernel/t/83507199cf42b4fb?hl=en
==============================================================================

== 1 of 1 ==
Date: Mon, Jan 13 2014 4:20 pm
From: Minchan Kim

Hello Dan,

Sorry for the late response. I haven't looked at the code yet
because I am not convinced. :(

On Thu, Dec 19, 2013 at 08:23:27AM -0500, Dan Streetman wrote:
> Currently, zswap is writeback cache; stored pages are not sent
> to swap disk, and when zswap wants to evict old pages it must
> first write them back to swap cache/disk manually. This avoids
> swap out disk I/O up front, but only moves that disk I/O to
> the writeback case (for pages that are evicted), and adds the
> overhead of having to uncompress the evicted pages and the
> need for an additional free page (to store the uncompressed page).
>
> This optionally changes zswap to writethrough cache by enabling
> frontswap_writethrough() before registering, so that any
> successful page store will also be written to swap disk. The
> default remains writeback. To enable writethrough, the param
> zswap.writethrough=1 must be used at boot.
>
> Whether writeback or writethrough will provide better performance
> depends on many factors including disk I/O speed/throughput,
> CPU speed(s), system load, etc. In most cases it is likely
> that writeback has better performance than writethrough before
> zswap is full, but after zswap fills up writethrough has
> better performance than writeback.

So you claim we should use writeback by default but writethrough
after the memory limit is reached?
But that would break LRU ordering, and I think a better idea is to
handle it in a more generic way rather than changing the entire policy.

Currently, zswap evicts just *a* page rather than a batch of pages,
so every store stalls if many swap writes happen continuously.
That's not efficient, so how about adding a threshold concept like
kswapd's min/low/high watermarks? Then it could start evicting early,
before the zswap memory pool limit is reached, and stop once it
reaches the high watermark.
I guess that would be better than now.
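
One possible reading of the watermark idea above, sketched as plain
user-space C. Nothing here exists in zswap: struct pool_watermarks,
struct pool_evictor and evictor_update() are hypothetical, and the
sketch expresses the thresholds in used pool pages (start evicting at
an upper mark, keep going until usage falls to a lower mark) rather
than in free pages as kswapd does.

```c
#include <stdbool.h>

/* Hypothetical thresholds, in used pool pages; start_evict > stop_evict. */
struct pool_watermarks {
	unsigned long start_evict;	/* begin background eviction at this usage */
	unsigned long stop_evict;	/* stop once usage has dropped to this */
};

/* Hypothetical evictor state: whether batch eviction is in progress. */
struct pool_evictor {
	bool active;
};

/*
 * Update the evictor for the current pool usage and report whether it
 * should keep evicting. The gap between the two marks gives hysteresis,
 * so eviction runs in batches instead of one page per store.
 */
static bool evictor_update(struct pool_evictor *ev,
			   const struct pool_watermarks *wm,
			   unsigned long used_pages)
{
	if (!ev->active && used_pages >= wm->start_evict)
		ev->active = true;
	else if (ev->active && used_pages <= wm->stop_evict)
		ev->active = false;
	return ev->active;
}
```

With marks of 80 and 60 pages, for example, eviction would start when
usage reaches 80, continue at 70, and stop at 60; a later rise back to
70 would not restart it until 80 is reached again.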

Another point: as I read device-mapper/cache.txt, the cache operating
mode already supports writethrough. That means zRAM can support
writeback/writethrough with dm-cache.
Have you tried it? Is there any problem?

Actually, I really don't know how much benefit there is in having
in-memory swap overflow to real storage, but if you want that, zRAM with
dm-cache is another option rather than reinventing the wheel on the
grounds that "just having it is better".

Thanks.

>
> Signed-off-by: Dan Streetman <ddstreet@ieee.org>
>
> ---
>
> Based on specjbb testing on my laptop, the results for both writeback
> and writethrough are better than not using zswap at all, but writeback
> does seem to be better than writethrough while zswap isn't full. Once
> it fills up, performance for writethrough is essentially close to not
> using zswap, while writeback seems to be worse than not using zswap.
> However, I think more testing on a wider span of systems and conditions
> is needed. Additionally, I'm not sure that specjbb is measuring true
> performance under fully loaded cpu conditions, so additional cpu load
> might need to be added or specjbb parameters modified (I took the
> values from the 4 "warehouses" test run).
>
> In any case though, I think having writethrough as an option is still
> useful. More changes could be made, such as changing from writeback
> to writethrough based on the zswap % full. And the patch doesn't
> change default behavior - writethrough must be specifically enabled.
>
> The %-ized numbers I got from specjbb on average, using the default
> 20% max_pool_percent and varying the amount of heap used as shown:
>
> ram | no zswap | writeback | writethrough
> 75 93.08 100 96.90
> 87 96.58 95.58 96.72
> 100 92.29 89.73 86.75
> 112 63.80 38.66 19.66
> 125 4.79 29.90 15.75
> 137 4.99 4.50 4.75
> 150 4.28 4.62 5.01
> 162 5.20 2.94 4.66
> 175 5.71 2.11 4.84
>
>
>
> mm/zswap.c | 68 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++----
> 1 file changed, 64 insertions(+), 4 deletions(-)
>
> diff --git a/mm/zswap.c b/mm/zswap.c
> index e55bab9..2f919db 100644
> --- a/mm/zswap.c
> +++ b/mm/zswap.c
> @@ -61,6 +61,8 @@ static atomic_t zswap_stored_pages = ATOMIC_INIT(0);
> static u64 zswap_pool_limit_hit;
> /* Pages written back when pool limit was reached */
> static u64 zswap_written_back_pages;
> +/* Pages evicted when pool limit was reached */
> +static u64 zswap_evicted_pages;
> /* Store failed due to a reclaim failure after pool limit was reached */
> static u64 zswap_reject_reclaim_fail;
> /* Compressed page was too big for the allocator to (optimally) store */
> @@ -89,6 +91,10 @@ static unsigned int zswap_max_pool_percent = 20;
> module_param_named(max_pool_percent,
> zswap_max_pool_percent, uint, 0644);
>
> +/* Writeback/writethrough mode (fixed at boot for now) */
> +static bool zswap_writethrough;
> +module_param_named(writethrough, zswap_writethrough, bool, 0444);
> +
> /*********************************
> * compression functions
> **********************************/
> @@ -629,6 +635,48 @@ end:
> }
>
> /*********************************
> +* evict code
> +**********************************/
> +
> +/*
> + * This evicts pages that have already been written through to swap.
> + */
> +static int zswap_evict_entry(struct zbud_pool *pool, unsigned long handle)
> +{
> + struct zswap_header *zhdr;
> + swp_entry_t swpentry;
> + struct zswap_tree *tree;
> + pgoff_t offset;
> + struct zswap_entry *entry;
> +
> + /* extract swpentry from data */
> + zhdr = zbud_map(pool, handle);
> + swpentry = zhdr->swpentry; /* here */
> + zbud_unmap(pool, handle);
> + tree = zswap_trees[swp_type(swpentry)];
> + offset = swp_offset(swpentry);
> + BUG_ON(pool != tree->pool);
> +
> + /* find and ref zswap entry */
> + spin_lock(&tree->lock);
> + entry = zswap_rb_search(&tree->rbroot, offset);
> + if (!entry) {
> + /* entry was invalidated */
> + spin_unlock(&tree->lock);
> + return 0;
> + }
> +
> + zswap_evicted_pages++;
> +
> + zswap_rb_erase(&tree->rbroot, entry);
> + zswap_entry_put(tree, entry);
> +
> + spin_unlock(&tree->lock);
> +
> + return 0;
> +}
> +
> +/*********************************
> * frontswap hooks
> **********************************/
> /* attempts to compress and store an single page */
> @@ -744,7 +792,7 @@ static int zswap_frontswap_load(unsigned type, pgoff_t offset,
> spin_lock(&tree->lock);
> entry = zswap_entry_find_get(&tree->rbroot, offset);
> if (!entry) {
> - /* entry was written back */
> + /* entry was written back or evicted */
> spin_unlock(&tree->lock);
> return -1;
> }
> @@ -778,7 +826,7 @@ static void zswap_frontswap_invalidate_page(unsigned type, pgoff_t offset)
> spin_lock(&tree->lock);
> entry = zswap_rb_search(&tree->rbroot, offset);
> if (!entry) {
> - /* entry was written back */
> + /* entry was written back or evicted */
> spin_unlock(&tree->lock);
> return;
> }
> @@ -813,18 +861,26 @@ static void zswap_frontswap_invalidate_area(unsigned type)
> zswap_trees[type] = NULL;
> }
>
> -static struct zbud_ops zswap_zbud_ops = {
> +static struct zbud_ops zswap_zbud_writeback_ops = {
> .evict = zswap_writeback_entry
> };
> +static struct zbud_ops zswap_zbud_writethrough_ops = {
> + .evict = zswap_evict_entry
> +};
>
> static void zswap_frontswap_init(unsigned type)
> {
> struct zswap_tree *tree;
> + struct zbud_ops *ops;
>
> tree = kzalloc(sizeof(struct zswap_tree), GFP_KERNEL);
> if (!tree)
> goto err;
> - tree->pool = zbud_create_pool(GFP_KERNEL, &zswap_zbud_ops);
> + if (zswap_writethrough)
> + ops = &zswap_zbud_writethrough_ops;
> + else
> + ops = &zswap_zbud_writeback_ops;
> + tree->pool = zbud_create_pool(GFP_KERNEL, ops);
> if (!tree->pool)
> goto freetree;
> tree->rbroot = RB_ROOT;
> @@ -875,6 +931,8 @@ static int __init zswap_debugfs_init(void)
> zswap_debugfs_root, &zswap_reject_compress_poor);
> debugfs_create_u64("written_back_pages", S_IRUGO,
> zswap_debugfs_root, &zswap_written_back_pages);
> + debugfs_create_u64("evicted_pages", S_IRUGO,
> + zswap_debugfs_root, &zswap_evicted_pages);
> debugfs_create_u64("duplicate_entry", S_IRUGO,
> zswap_debugfs_root, &zswap_duplicate_entry);
> debugfs_create_u64("pool_pages", S_IRUGO,
> @@ -919,6 +977,8 @@ static int __init init_zswap(void)
> pr_err("per-cpu initialization failed\n");
> goto pcpufail;
> }
> + if (zswap_writethrough)
> + frontswap_writethrough(true);
> frontswap_register_ops(&zswap_frontswap_ops);
> if (zswap_debugfs_init())
> pr_warn("debugfs initialization failed\n");
> --
> 1.8.3.1
>
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org. For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

--
Kind regards,
Minchan Kim

==============================================================================
TOPIC: x86 Kconfig: move guest-side options under "Virtualization"
http://groups.google.com/group/linux.kernel/t/1e447c67d5e3ad49?hl=en
==============================================================================

== 1 of 1 ==
Date: Mon, Jan 13 2014 4:20 pm
From: Dave Hansen

We now have two groups of options in "Virtualization": one for
host-side support and a matching one for guest-side stuff. We do
not need separate submenus for this stuff since there are so few
options.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Dmitry Torokhov <dtor@vmware.com>
Cc: K. Y. Srinivasan <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Alexander Graf <agraf@suse.de>
Cc: Gleb Natapov <gleb@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
---

linux.git-davehans/arch/x86/Kconfig | 83 ----------------------------
linux.git-davehans/arch/x86/Kconfig.virt | 85 +++++++++++++++++++++++++++++
linux.git-davehans/arch/x86/kvm/Kconfig | 2
linux.git-davehans/arch/x86/lguest/Kconfig | 1
4 files changed, 85 insertions(+), 86 deletions(-)

diff -puN arch/x86/Kconfig~x86-Kconfig-move-paravirt-under-virtualization-really arch/x86/Kconfig
--- linux.git/arch/x86/Kconfig~x86-Kconfig-move-paravirt-under-virtualization-really 2014-01-13 16:10:05.721497873 -0800
+++ linux.git-davehans/arch/x86/Kconfig 2014-01-13 16:10:05.758499543 -0800
@@ -1018,89 +1018,6 @@ config X86_ES7000
Support for Unisys ES7000 systems. Say 'Y' here if this kernel is
supposed to run on an IA32-based Unisys ES7000 system.

-menuconfig HYPERVISOR_GUEST
- bool "Linux guest support"
- ---help---
- Say Y here to enable options for running Linux under various hyper-
- visors. This option enables basic hypervisor detection and platform
- setup.
-
- If you say N, all options in this submenu will be skipped and
- disabled, and Linux guest support won't be built in.
-
-if HYPERVISOR_GUEST
-
-config PARAVIRT
- bool "Enable paravirtualization code"
- ---help---
- This changes the kernel so it can modify itself when it is run
- under a hypervisor, potentially improving performance significantly
- over full virtualization. However, when run without a hypervisor
- the kernel is theoretically slower and slightly larger.
-
-config PARAVIRT_DEBUG
- bool "paravirt-ops debugging"
- depends on PARAVIRT && DEBUG_KERNEL
- ---help---
- Enable to debug paravirt_ops internals. Specifically, BUG if
- a paravirt_op is missing when it is called.
-
-config PARAVIRT_SPINLOCKS
- bool "Paravirtualization layer for spinlocks"
- depends on PARAVIRT && SMP
- select UNINLINE_SPIN_UNLOCK
- ---help---
- Paravirtualized spinlocks allow a pvops backend to replace the
- spinlock implementation with something virtualization-friendly
- (for example, block the virtual CPU rather than spinning).
-
- It has a minimal impact on native kernels and gives a nice performance
- benefit on paravirtualized KVM / Xen kernels.
-
- If you are unsure how to answer this question, answer Y.
-
-source "arch/x86/xen/Kconfig"
-
-config KVM_GUEST
- bool "KVM Guest support (including kvmclock)"
- depends on PARAVIRT
- select PARAVIRT_CLOCK
- default y
- ---help---
- This option enables various optimizations for running under the KVM
- hypervisor. It includes a paravirtualized clock, so that instead
- of relying on a PIT (or probably other) emulation by the
- underlying device model, the host provides the guest with
- timing infrastructure such as time of day, and system time
-
-config KVM_DEBUG_FS
- bool "Enable debug information for KVM Guests in debugfs"
- depends on KVM_GUEST && DEBUG_FS
- default n
- ---help---
- This option enables collection of various statistics for KVM guest.
- Statistics are displayed in debugfs filesystem. Enabling this option
- may incur significant overhead.
-
-source "arch/x86/lguest/Kconfig"
-
-config PARAVIRT_TIME_ACCOUNTING
- bool "Paravirtual steal time accounting"
- depends on PARAVIRT
- default n
- ---help---
- Select this option to enable fine granularity task steal time
- accounting. Time spent executing other tasks in parallel with
- the current vCPU is discounted from the vCPU power. To account for
- that, there can be a small performance impact.
-
- If in doubt, say N here.
-
-config PARAVIRT_CLOCK
- bool
-
-endif #HYPERVISOR_GUEST
-
config NO_BOOTMEM
def_bool y

diff -puN arch/x86/Kconfig.virt~x86-Kconfig-move-paravirt-under-virtualization-really arch/x86/Kconfig.virt
--- linux.git/arch/x86/Kconfig.virt~x86-Kconfig-move-paravirt-under-virtualization-really 2014-01-13 16:10:05.722497918 -0800
+++ linux.git-davehans/arch/x86/Kconfig.virt 2014-01-13 16:10:35.730852434 -0800
@@ -1,3 +1,5 @@
+# No menu items in here, just common config variables:
+source "virt/kvm/Kconfig"

menu "Virtualization"

@@ -22,4 +24,87 @@ source drivers/lguest/Kconfig

endif # HYPERVISOR_HOST

+config HYPERVISOR_GUEST
+ bool "Guest-Side Features (Linux running under a Hypervisor)"
+ ---help---
+ Say Y here to enable options for running Linux under various hyper-
+ visors. This option enables basic hypervisor detection and platform
+ setup.
+
+ If you say N, all options in this submenu will be skipped and
+ disabled, and Linux guest support won't be built in.
+
+if HYPERVISOR_GUEST
+
+config PARAVIRT
+ bool "Enable paravirtualization code"
+ ---help---
+ This changes the kernel so it can modify itself when it is run
+ under a hypervisor, potentially improving performance significantly
+ over full virtualization. However, when run without a hypervisor
+ the kernel is theoretically slower and slightly larger.
+
+config PARAVIRT_DEBUG
+ bool "paravirt-ops debugging"
+ depends on PARAVIRT && DEBUG_KERNEL
+ ---help---
+ Enable to debug paravirt_ops internals. Specifically, BUG if
+ a paravirt_op is missing when it is called.
+
+config PARAVIRT_SPINLOCKS
+ bool "Paravirtualization layer for spinlocks"
+ depends on PARAVIRT && SMP
+ select UNINLINE_SPIN_UNLOCK
+ ---help---
+ Paravirtualized spinlocks allow a pvops backend to replace the
+ spinlock implementation with something virtualization-friendly
+ (for example, block the virtual CPU rather than spinning).
+
+ It has a minimal impact on native kernels and gives a nice performance
+ benefit on paravirtualized KVM / Xen kernels.
+
+ If you are unsure how to answer this question, answer Y.
+
+source "arch/x86/xen/Kconfig"
+
+config KVM_GUEST
+ bool "KVM Guest support (including kvmclock)"
+ depends on PARAVIRT
+ select PARAVIRT_CLOCK
+ default y
+ ---help---
+ This option enables various optimizations for running under the KVM
+ hypervisor. It includes a paravirtualized clock, so that instead
+ of relying on a PIT (or probably other) emulation by the
+ underlying device model, the host provides the guest with
+ timing infrastructure such as time of day, and system time
+
+config KVM_DEBUG_FS
+ bool "Enable debug information for KVM Guests in debugfs"
+ depends on KVM_GUEST && DEBUG_FS
+ default n
+ ---help---
+ This option enables collection of various statistics for KVM guest.
+ Statistics are displayed in debugfs filesystem. Enabling this option
+ may incur significant overhead.
+
+source "arch/x86/lguest/Kconfig"
+
+config PARAVIRT_TIME_ACCOUNTING
+ bool "Paravirtual steal time accounting"
+ depends on PARAVIRT
+ default n
+ ---help---
+ Select this option to enable fine granularity task steal time
+ accounting. Time spent executing other tasks in parallel with
+ the current vCPU is discounted from the vCPU power. To account for
+ that, there can be a small performance impact.
+
+ If in doubt, say N here.
+
+config PARAVIRT_CLOCK
+ bool
+
+endif # HYPERVISOR_GUEST
+
endmenu # "Virtualization"
diff -puN arch/x86/lguest/Kconfig~x86-Kconfig-move-paravirt-under-virtualization-really arch/x86/lguest/Kconfig
--- linux.git/arch/x86/lguest/Kconfig~x86-Kconfig-move-paravirt-under-virtualization-really 2014-01-13 16:10:05.724498009 -0800
+++ linux.git-davehans/arch/x86/lguest/Kconfig 2014-01-13 16:10:05.759499589 -0800
@@ -2,7 +2,6 @@ config LGUEST_GUEST
bool "Lguest guest support"
depends on X86_32 && PARAVIRT
select TTY
- select VIRTUALIZATION
select VIRTIO
select VIRTIO_CONSOLE
help
diff -puN arch/x86/kvm/Kconfig~x86-Kconfig-move-paravirt-under-virtualization-really arch/x86/kvm/Kconfig
--- linux.git/arch/x86/kvm/Kconfig~x86-Kconfig-move-paravirt-under-virtualization-really 2014-01-13 16:10:05.729498234 -0800
+++ linux.git-davehans/arch/x86/kvm/Kconfig 2014-01-13 16:10:05.760499634 -0800
@@ -2,8 +2,6 @@
# KVM configuration
#

-source "virt/kvm/Kconfig"
-
config KVM
tristate "Kernel-based Virtual Machine (KVM) support"
depends on HAVE_KVM
_
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

==============================================================================
TOPIC: hwspinlock/omap: enable build for AM33xx, AM43xx & DRA7xx
http://groups.google.com/group/linux.kernel/t/35459429c62267b5?hl=en
==============================================================================

== 1 of 4 ==
Date: Mon, Jan 13 2014 4:30 pm
From: Suman Anna

HwSpinlocks are supported on AM33xx, AM43xx and DRA7xx SoC
device families as well. The IPs are identical to that of
OMAP4/OMAP5, except for the number of locks.

Add a depends on to the above family of SoCs to enable the
build support for OMAP hwspinlock driver for any of the above
SoC configs.

Signed-off-by: Suman Anna <s-anna@ti.com>
---
drivers/hwspinlock/Kconfig | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/hwspinlock/Kconfig b/drivers/hwspinlock/Kconfig
index 70637d2..3612cb5 100644
--- a/drivers/hwspinlock/Kconfig
+++ b/drivers/hwspinlock/Kconfig
@@ -10,7 +10,7 @@ menu "Hardware Spinlock drivers"

config HWSPINLOCK_OMAP
tristate "OMAP Hardware Spinlock device"
- depends on ARCH_OMAP4 || SOC_OMAP5
+ depends on ARCH_OMAP4 || SOC_OMAP5 || SOC_DRA7XX || SOC_AM33XX || SOC_AM43XX
select HWSPINLOCK
help
Say y here to support the OMAP Hardware Spinlock device (firstly
--
1.8.4.3


== 2 of 4 ==
Date: Mon, Jan 13 2014 4:30 pm
From: Suman Anna

Hi,

This is an updated series mainly addressing Mark Rutland's comments
about hwlock specifier being always one-cell. The series adds the
support for #hwlock-cells property and adds a simple default OF
translate function.

The DTS patches from the previous series have already been merged, and
need this property to be added. This is handled in a separate series
that only deals with OMAP hwspinlock DTS patches.

The series, along with the DTS patches, is tested on top of v3.13-rc8
plus Tero's v13 clock DT series and Tony's 3.14 staged branches. The
validation on OMAP5, DRA7, AM437 requires Tero's series with a couple of
additional base patches for AM43xx. AM43xx functionality needs a hwmod
fix [1] for creating the associated omap_device as well.

The validation logs on all the applicable OMAP SoCs are at:
OMAP4 - http://paste2.org/YJ7ZwG80
OMAP5 - http://paste2.org/c6vO96b9
DRA7 - http://paste2.org/tHvxN439
AM33x - http://paste2.org/AjCv0U4t
AM43x - http://paste2.org/2AKIPa55

The kernel with the test patches plus the various pulled in branches
is here for reference (not for merging)
https://github.com/sumananna/omap-kernel/commits/hwspinlock/3.13-rc8-v4-test

[1] http://marc.info/?l=linux-omap&m=138939747524820&w=2

Changes new in v4:
- The DT bindings are split into separate patches, and updated to
add comments about #hwlock-cells (Patches 1 & 2)
- Fixed a registration issue with repeated module installation and
removal. (Patch 3)
- Added a new OF helper to support #hwlock-cells in addition to the
previous OF functions (Patch 4). The OMAP adaptation patch is
updated to use the default translate function (Patch 5)
- Updated hwspinlock documentation to adjust for the structure
changes and the new api additions. (Patches 3, 4)
- Added build support for AM335x, AM43xx and DRA7xx (Patch 7)
- The AM335/AM43x fix patch is unchanged (Patch 6)

v3:
- Removed the DT property hwlock-base-id and associated OF helper
- Added changes in core to support requesting a specific hwlock using
phandle + args approach
- Revised both the common and OMAP DT bindings document
http://marc.info/?l=linux-omap&m=138143992932197&w=2

v2:
- Added a new common DT binding documentation and OF helpers.
- Revised OMAP DT parse support to use the new OF helper (Patch2)
- OMAP5 hwspinlock support including the hwmod entry and DT node
- Add AM335x support to OMAP hwspinlock driver, including a fix
needed in driver given that AM335 spinlock module requires s/w wakeup
- AM335 DT node for spinlock, and a hwmod change to enable smart-idle
for AM335.
- OMAP4 DT node patch is unchanged
http://marc.info/?l=linux-omap&m=137944644112727&w=2

v1:
- Add DT parse support to OMAP hwspinlock driver
- Add OMAP4 DT node and bindings information
http://marc.info/?l=linux-omap&m=137823082308009&w=2

Suman Anna (7):
Documentation: dt: add common bindings for hwspinlock
Documentation: dt: add the omap hwspinlock bindings document
hwspinlock/core: maintain a list of registered hwspinlock banks
hwspinlock/core: add common OF helpers
hwspinlock/omap: add support for dt nodes
hwspinlock/omap: enable module before reading SYSSTATUS register
hwspinlock/omap: enable build for AM33xx, AM43xx & DRA7xx

.../devicetree/bindings/hwlock/hwlock.txt | 52 +++++++
.../devicetree/bindings/hwlock/omap-hwspinlock.txt | 24 ++++
Documentation/hwspinlock.txt | 36 ++++-
MAINTAINERS | 1 -
arch/arm/mach-omap2/Makefile | 3 -
arch/arm/mach-omap2/hwspinlock.c | 60 --------
drivers/hwspinlock/Kconfig | 2 +-
drivers/hwspinlock/hwspinlock_core.c | 159 ++++++++++++++++++++-
drivers/hwspinlock/hwspinlock_internal.h | 6 +
drivers/hwspinlock/omap_hwspinlock.c | 39 +++--
include/linux/hwspinlock.h | 20 ++-
11 files changed, 321 insertions(+), 81 deletions(-)
create mode 100644 Documentation/devicetree/bindings/hwlock/hwlock.txt
create mode 100644 Documentation/devicetree/bindings/hwlock/omap-hwspinlock.txt
delete mode 100644 arch/arm/mach-omap2/hwspinlock.c

--
1.8.4.3


== 3 of 4 ==
Date: Mon, Jan 13 2014 4:30 pm
From: Suman Anna

HwSpinlock IP is present only on OMAP4 and other newer SoCs,
which are all device-tree boot only. This patch adds the
DT bindings information for OMAP hwspinlock module.

Cc: Rob Herring <robh+dt@kernel.org>
Signed-off-by: Suman Anna <s-anna@ti.com>
---
.../devicetree/bindings/hwlock/omap-hwspinlock.txt | 24 ++++++++++++++++++++++
1 file changed, 24 insertions(+)
create mode 100644 Documentation/devicetree/bindings/hwlock/omap-hwspinlock.txt

diff --git a/Documentation/devicetree/bindings/hwlock/omap-hwspinlock.txt b/Documentation/devicetree/bindings/hwlock/omap-hwspinlock.txt
new file mode 100644
index 0000000..568eae2
--- /dev/null
+++ b/Documentation/devicetree/bindings/hwlock/omap-hwspinlock.txt
@@ -0,0 +1,24 @@
+OMAP4+ HwSpinlock Driver
+========================
+
+Required properties:
+- compatible: Should be "ti,omap4-hwspinlock" for
+ OMAP44xx, OMAP54xx, AM33xx, AM43xx, DRA7xx SoCs
+- reg: Contains the hwspinlock module register address space
+ (base address and length)
+- ti,hwmods: Name of the hwmod associated with the hwspinlock device
+- #hwlock-cells: Should be 1. The OMAP hwspinlock users will use a
+ 0-indexed relative hwlock number as the argument
+ specifier value for requesting a specific hwspinlock
+ within a hwspinlock bank.
+
+
+Example:
+
+/* OMAP4 */
+hwspinlock: spinlock@4a0f6000 {
+ compatible = "ti,omap4-hwspinlock";
+ reg = <0x4a0f6000 0x1000>;
+ ti,hwmods = "spinlock";
+ #hwlock-cells = <1>;
+};
--
1.8.4.3
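
The one-cell specifier described in the binding above (a 0-indexed lock
number relative to the bank) can be illustrated with a small user-space
sketch. This is not the OF helper from the series: the struct
spec_sketch type and the function simple_xlate_sketch() are hypothetical
names, and the range check against num_locks is an assumption about what
a simple translate step would reasonably verify.

```c
/* Hypothetical stand-in for a parsed "phandle + args" specifier. */
struct spec_sketch {
	int args_count;        /* number of cells after the phandle */
	unsigned int args[1];  /* #hwlock-cells = <1>: one argument cell */
};

/*
 * Hypothetical translate step: accept exactly one cell and treat it as
 * the relative lock index within the bank. Returns the index, or -1 if
 * the specifier is malformed or out of range (assumed check).
 */
static int simple_xlate_sketch(const struct spec_sketch *spec,
			       unsigned int num_locks)
{
	if (spec->args_count != 1)
		return -1;
	if (spec->args[0] >= num_locks)
		return -1;
	return (int)spec->args[0];
}
```

A caller would pass the cell taken from the consumer's device-tree
property together with the number of locks in the bank.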


== 4 of 4 ==
Date: Mon, Jan 13 2014 4:30 pm
From: Suman Anna

The number of hwspinlocks is determined based on the value read
from the IP block's SYSSTATUS register. However, the module may
not be enabled and clocked, and the read may result in a bus error.

This particular issue is seen rather easily on AM33XX, since the
module wakeup is software controlled, and it is disabled out of
reset. Make sure the module is enabled and clocked before reading
the SYSSTATUS register.

Signed-off-by: Suman Anna <s-anna@ti.com>
---
drivers/hwspinlock/omap_hwspinlock.c | 21 ++++++++++++++-------
1 file changed, 14 insertions(+), 7 deletions(-)

diff --git a/drivers/hwspinlock/omap_hwspinlock.c b/drivers/hwspinlock/omap_hwspinlock.c
index 9f56fb2..194886e 100644
--- a/drivers/hwspinlock/omap_hwspinlock.c
+++ b/drivers/hwspinlock/omap_hwspinlock.c
@@ -101,10 +101,23 @@ static int omap_hwspinlock_probe(struct platform_device *pdev)
if (!io_base)
return -ENOMEM;

+ /*
+ * make sure the module is enabled and clocked before reading
+ * the module SYSSTATUS register
+ */
+ pm_runtime_enable(&pdev->dev);
+ pm_runtime_get_sync(&pdev->dev);
+
/* Determine number of locks */
i = readl(io_base + SYSSTATUS_OFFSET);
i >>= SPINLOCK_NUMLOCKS_BIT_OFFSET;

+ /*
+ * runtime PM will make sure the clock of this module is
+ * enabled again iff at least one lock is requested
+ */
+ pm_runtime_put(&pdev->dev);
+
/* one of the four lsb's must be set, and nothing else */
if (hweight_long(i & 0xf) != 1 || i > 8) {
ret = -EINVAL;
@@ -124,12 +137,6 @@ static int omap_hwspinlock_probe(struct platform_device *pdev)
for (i = 0, hwlock = &bank->lock[0]; i < num_locks; i++, hwlock++)
hwlock->priv = io_base + LOCK_BASE_OFFSET + sizeof(u32) * i;

- /*
- * runtime PM will make sure the clock of this module is
- * enabled iff at least one lock is requested
- */
- pm_runtime_enable(&pdev->dev);
-
ret = hwspin_lock_register(bank, &pdev->dev, &omap_hwspinlock_ops,
base_id, num_locks);
if (ret)
@@ -138,9 +145,9 @@ static int omap_hwspinlock_probe(struct platform_device *pdev)
return 0;

reg_fail:
- pm_runtime_disable(&pdev->dev);
kfree(bank);
iounmap_base:
+ pm_runtime_disable(&pdev->dev);
iounmap(io_base);
return ret;
}
--
1.8.4.3
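
The sanity check that follows the SYSSTATUS read in the hunk above ("one
of the four lsb's must be set, and nothing else") can be tried out in
isolation. The following is a user-space sketch only, not driver code:
decode_numlocks_field() is a hypothetical helper, the shift of 24 is an
assumed stand-in for SPINLOCK_NUMLOCKS_BIT_OFFSET, and the GCC/Clang
builtin __builtin_popcount() stands in for the kernel's hweight_long().

```c
#include <stdint.h>

/* Assumed value for illustration; the driver defines its own constant. */
#define NUMLOCKS_SHIFT 24

/*
 * Hypothetical helper: extract the lock-count field from a raw SYSSTATUS
 * value and apply the same test as the patch context: exactly one of the
 * four low bits set, and no bits above them.
 * Returns the field (1, 2, 4 or 8) on success, -1 if it is invalid.
 */
static int decode_numlocks_field(uint32_t sysstatus)
{
	uint32_t field = sysstatus >> NUMLOCKS_SHIFT;

	if (__builtin_popcount(field & 0xf) != 1 || field > 8)
		return -1;
	return (int)field;
}
```

A value read while the module is unclocked cannot be trusted to pass this
check, which is why the patch enables the module before the read.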


==============================================================================
TOPIC: sfc: Maintain current frequency adjustment when applying a time offset
http://groups.google.com/group/linux.kernel/t/a53824587b1cf92d?hl=en
==============================================================================

== 1 of 8 ==
Date: Mon, Jan 13 2014 4:30 pm
From: Greg Kroah-Hartman

3.12-stable review patch. If anyone has any objections, please let me know.

------------------

From: Ben Hutchings <bhutchings@solarflare.com>

[ Upstream commit cd6fe65e923175e4f2e9fb585b1d78c6bf580fc6 ]

There is a single MCDI PTP operation for setting the frequency
adjustment and applying a time offset to the hardware clock. When
applying a time offset we should not change the frequency adjustment.

These two operations can now be requested separately but this requires
a flash firmware update. Keep using the single operation, but
remember and repeat the previous frequency adjustment.

Fixes: 7c236c43b838 ('sfc: Add support for IEEE-1588 PTP')
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/net/ethernet/sfc/ptp.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

--- a/drivers/net/ethernet/sfc/ptp.c
+++ b/drivers/net/ethernet/sfc/ptp.c
@@ -1426,7 +1426,7 @@ static int efx_phc_adjfreq(struct ptp_cl
if (rc != 0)
return rc;

- ptp_data->current_adjfreq = delta;
+ ptp_data->current_adjfreq = adjustment_ns;
return 0;
}

@@ -1441,7 +1441,7 @@ static int efx_phc_adjtime(struct ptp_cl

MCDI_SET_DWORD(inbuf, PTP_IN_OP, MC_CMD_PTP_OP_ADJUST);
MCDI_SET_DWORD(inbuf, PTP_IN_PERIPH_ID, 0);
- MCDI_SET_QWORD(inbuf, PTP_IN_ADJUST_FREQ, 0);
+ MCDI_SET_QWORD(inbuf, PTP_IN_ADJUST_FREQ, ptp_data->current_adjfreq);
MCDI_SET_DWORD(inbuf, PTP_IN_ADJUST_SECONDS, (u32)delta_ts.tv_sec);
MCDI_SET_DWORD(inbuf, PTP_IN_ADJUST_NANOSECONDS, (u32)delta_ts.tv_nsec);
return efx_mcdi_rpc(efx, MC_CMD_PTP, inbuf, sizeof(inbuf),
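
The "remember and repeat" approach described in the commit message can be
modelled outside the driver. The sketch below is a user-space
illustration under stated assumptions: struct fake_clock and the fake_*()
functions are hypothetical, and fake_hw_adjust() merely imitates a single
combined operation that always sets the frequency adjustment and applies
a time offset together.

```c
#include <stdint.h>

/* Hypothetical model of the clock state; not the sfc data structures. */
struct fake_clock {
	int64_t hw_freq_adj;    /* adjustment currently applied by "hardware" */
	int64_t hw_time_ns;     /* sum of all applied time offsets */
	int64_t saved_freq_adj; /* driver-side copy, like current_adjfreq */
};

/* Combined operation: always writes both frequency and offset. */
static void fake_hw_adjust(struct fake_clock *c, int64_t freq_adj,
			   int64_t offset_ns)
{
	c->hw_freq_adj = freq_adj;
	c->hw_time_ns += offset_ns;
}

/* Set a new frequency adjustment and remember it for later offsets. */
static void fake_adjfreq(struct fake_clock *c, int64_t freq_adj)
{
	fake_hw_adjust(c, freq_adj, 0);
	c->saved_freq_adj = freq_adj;
}

/* Apply a time offset while repeating the remembered adjustment. */
static void fake_adjtime(struct fake_clock *c, int64_t offset_ns)
{
	fake_hw_adjust(c, c->saved_freq_adj, offset_ns);
}
```

Passing 0 instead of the saved value in fake_adjtime() would silently
reset the frequency adjustment, which is the behaviour the patch removes.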


== 2 of 8 ==
Date: Mon, Jan 13 2014 4:30 pm
From: Greg Kroah-Hartman

3.12-stable review patch. If anyone has any objections, please let me know.

------------------

From: "David S. Miller" <davem@davemloft.net>

[ Upstream commit 2205369a314e12fcec4781cc73ac9c08fc2b47de ]

When the vlan code detects that the real device can do TX VLAN offloads
in hardware, it tries to arrange for the real device's header_ops to
be invoked directly.

But it does so illegally, by simply hooking the real device's
header_ops up to the VLAN device.

This doesn't work because we will end up invoking a set of header_ops
routines which expect a device type which matches the real device, but
will see a VLAN device instead.

Fix this by providing a pass-thru set of header_ops which will arrange
to pass the proper real device instead.

To facilitate this add a dev_rebuild_header(). There are
implementations which provide a ->cache and ->create but not a
->rebuild (f.e. PLIP). So we need a helper function just like
dev_hard_header() to avoid crashes.

Use this helper in the one existing place where the
header_ops->rebuild was being invoked, the neighbour code.

With lots of help from Florian Westphal.

Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
include/linux/netdevice.h | 9 +++++++++
net/8021q/vlan_dev.c | 19 ++++++++++++++++++-
net/core/neighbour.c | 2 +-
3 files changed, 28 insertions(+), 2 deletions(-)

--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1872,6 +1872,15 @@ static inline int dev_parse_header(const
return dev->header_ops->parse(skb, haddr);
}

+static inline int dev_rebuild_header(struct sk_buff *skb)
+{
+ const struct net_device *dev = skb->dev;
+
+ if (!dev->header_ops || !dev->header_ops->rebuild)
+ return 0;
+ return dev->header_ops->rebuild(skb);
+}
+
typedef int gifconf_func_t(struct net_device * dev, char __user * bufptr, int len);
extern int register_gifconf(unsigned int family, gifconf_func_t * gifconf);
static inline int unregister_gifconf(unsigned int family)
--- a/net/8021q/vlan_dev.c
+++ b/net/8021q/vlan_dev.c
@@ -549,6 +549,23 @@ static const struct header_ops vlan_head
.parse = eth_header_parse,
};

+static int vlan_passthru_hard_header(struct sk_buff *skb, struct net_device *dev,
+ unsigned short type,
+ const void *daddr, const void *saddr,
+ unsigned int len)
+{
+ struct vlan_dev_priv *vlan = vlan_dev_priv(dev);
+ struct net_device *real_dev = vlan->real_dev;
+
+ return dev_hard_header(skb, real_dev, type, daddr, saddr, len);
+}
+
+static const struct header_ops vlan_passthru_header_ops = {
+ .create = vlan_passthru_hard_header,
+ .rebuild = dev_rebuild_header,
+ .parse = eth_header_parse,
+};
+
static struct device_type vlan_type = {
.name = "vlan",
};
@@ -592,7 +609,7 @@ static int vlan_dev_init(struct net_devi

dev->needed_headroom = real_dev->needed_headroom;
if (real_dev->features & NETIF_F_HW_VLAN_CTAG_TX) {
- dev->header_ops = real_dev->header_ops;
+ dev->header_ops = &vlan_passthru_header_ops;
dev->hard_header_len = real_dev->hard_header_len;
} else {
dev->header_ops = &vlan_header_ops;
--- a/net/core/neighbour.c
+++ b/net/core/neighbour.c
@@ -1274,7 +1274,7 @@ int neigh_compat_output(struct neighbour

if (dev_hard_header(skb, dev, ntohs(skb->protocol), NULL, NULL,
skb->len) < 0 &&
- dev->header_ops->rebuild(skb))
+ dev_rebuild_header(skb))
return 0;

return dev_queue_xmit(skb);
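
The guard that dev_rebuild_header() adds around an optional callback can
be shown with a reduced, user-space sketch. All types and names below
(struct packet, struct hdr_ops, struct device_sketch, guarded_rebuild(),
mark_rebuilt()) are hypothetical stand-ins, not the kernel's structures;
only the shape of the NULL checks follows the patch.

```c
#include <stddef.h>

struct packet {
	int rebuilt;
};

/* Operations table in which ->rebuild is optional. */
struct hdr_ops {
	int (*create)(struct packet *p);
	int (*rebuild)(struct packet *p);
};

struct device_sketch {
	const struct hdr_ops *ops;
};

/*
 * Call ->rebuild only if both the table and the callback exist;
 * otherwise report 0, as the helper in the patch does.
 */
static int guarded_rebuild(const struct device_sketch *dev, struct packet *p)
{
	if (!dev->ops || !dev->ops->rebuild)
		return 0;
	return dev->ops->rebuild(p);
}

/* Example callback used to show that the call is forwarded. */
static int mark_rebuilt(struct packet *p)
{
	p->rebuilt = 1;
	return 1;
}
```

Callers can then invoke guarded_rebuild() unconditionally instead of
dereferencing the callback pointer themselves.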


== 3 of 8 ==
Date: Mon, Jan 13 2014 4:30 pm
From: Greg Kroah-Hartman

3.12-stable review patch. If anyone has any objections, please let me know.

------------------

From: Abhilash Kesavan <a.kesavan@samsung.com>

commit 8fb9aeb7a71ef4f3e0613d459a2e1366a7a90469 upstream.

Adds gate clock for MDMA0 on Exynos5250 SoC. This is needed to ensure
that the clock is enabled when MDMA0 is used on systems on which
firmware gates the clock by default.

Signed-off-by: Abhilash Kesavan <a.kesavan@samsung.com>
Acked-by: Mike Turquette <mturquette@linaro.org>
[t.figa: Updated patch description.]
Signed-off-by: Tomasz Figa <t.figa@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
Documentation/devicetree/bindings/clock/exynos5250-clock.txt | 2 ++
drivers/clk/samsung/clk-exynos5250.c | 5 ++++-
2 files changed, 6 insertions(+), 1 deletion(-)

--- a/Documentation/devicetree/bindings/clock/exynos5250-clock.txt
+++ b/Documentation/devicetree/bindings/clock/exynos5250-clock.txt
@@ -159,6 +159,8 @@ clock which they consume.
mixer 343
hdmi 344
g2d 345
+ mdma0 346
+ smmu_mdma0 347

[Clock Muxes]
--- a/drivers/clk/samsung/clk-exynos5250.c
+++ b/drivers/clk/samsung/clk-exynos5250.c
@@ -120,7 +120,8 @@ enum exynos5250_clks {
spi2, i2s1, i2s2, pcm1, pcm2, pwm, spdif, ac97, hsi2c0, hsi2c1, hsi2c2,
hsi2c3, chipid, sysreg, pmu, cmu_top, cmu_core, cmu_mem, tzpc0, tzpc1,
tzpc2, tzpc3, tzpc4, tzpc5, tzpc6, tzpc7, tzpc8, tzpc9, hdmi_cec, mct,
- wdt, rtc, tmu, fimd1, mie1, dsim0, dp, mixer, hdmi, g2d,
+ wdt, rtc, tmu, fimd1, mie1, dsim0, dp, mixer, hdmi, g2d, mdma0,
+ smmu_mdma0,

/* mux clocks */
mout_hdmi = 1024,
@@ -492,6 +493,8 @@ static struct samsung_gate_clock exynos5
GATE(mixer, "mixer", "mout_aclk200_disp1", GATE_IP_DISP1, 5, 0, 0),
GATE(hdmi, "hdmi", "mout_aclk200_disp1", GATE_IP_DISP1, 6, 0, 0),
GATE(g2d, "g2d", "aclk200", GATE_IP_ACP, 3, 0, 0),
+ GATE(mdma0, "mdma0", "aclk266", GATE_IP_ACP, 1, 0, 0),
+ GATE(smmu_mdma0, "smmu_mdma0", "aclk266", GATE_IP_ACP, 5, 0, 0),
};

static struct samsung_pll_rate_table vpll_24mhz_tbl[] __initdata = {


== 4 of 8 ==
Date: Mon, Jan 13 2014 4:30 pm
From: Greg Kroah-Hartman

3.12-stable review patch. If anyone has any objections, please let me know.

------------------

From: Andrew Bresticker <abrestic@chromium.org>

commit 97c3557c3e0413efb1f021f582d1459760e22727 upstream.

The gate clocks for the MFC sysmmus appear to be flipped, i.e.
GATE_IP_MFC[2] gates sysmmu_mfcl and GATE_IP_MFC[1] gates sysmmu_mfcr.
Fix this so that the MFC will start up.

Signed-off-by: Andrew Bresticker <abrestic@chromium.org>
Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Acked-by: Mike Turquette <mturquette@linaro.org>
Signed-off-by: Tomasz Figa <t.figa@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
drivers/clk/samsung/clk-exynos5250.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

--- a/drivers/clk/samsung/clk-exynos5250.c
+++ b/drivers/clk/samsung/clk-exynos5250.c
@@ -355,8 +355,8 @@ static struct samsung_gate_clock exynos5
GATE(smmu_gscl2, "smmu_gscl2", "aclk266", GATE_IP_GSCL, 9, 0, 0),
GATE(smmu_gscl3, "smmu_gscl3", "aclk266", GATE_IP_GSCL, 10, 0, 0),
GATE(mfc, "mfc", "aclk333", GATE_IP_MFC, 0, 0, 0),
- GATE(smmu_mfcl, "smmu_mfcl", "aclk333", GATE_IP_MFC, 1, 0, 0),
- GATE(smmu_mfcr, "smmu_mfcr", "aclk333", GATE_IP_MFC, 2, 0, 0),
+ GATE(smmu_mfcl, "smmu_mfcl", "aclk333", GATE_IP_MFC, 2, 0, 0),
+ GATE(smmu_mfcr, "smmu_mfcr", "aclk333", GATE_IP_MFC, 1, 0, 0),
GATE(rotator, "rotator", "aclk266", GATE_IP_GEN, 1, 0, 0),
GATE(jpeg, "jpeg", "aclk166", GATE_IP_GEN, 2, 0, 0),
GATE(mdma1, "mdma1", "aclk266", GATE_IP_GEN, 4, 0, 0),
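
To make the effect of the swap concrete, the sketch below maps clock
names to register bit masks using the corrected bit numbers from the
hunk above. It is a user-space illustration that assumes the fifth
GATE() argument is the bit index within the gate register; struct
gate_entry, mfc_gates[] and gate_mask() are hypothetical names and not
part of the Samsung clock driver.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical reduced form of a gate table entry: name and bit index. */
struct gate_entry {
	const char *name;
	unsigned int bit;
};

/* Bit numbers as they read after the patch is applied. */
static const struct gate_entry mfc_gates[] = {
	{ "mfc",       0 },
	{ "smmu_mfcl", 2 },
	{ "smmu_mfcr", 1 },
};

/* Return the single-bit mask for a named gate, or 0 if it is unknown. */
static uint32_t gate_mask(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(mfc_gates) / sizeof(mfc_gates[0]); i++) {
		if (strcmp(mfc_gates[i].name, name) == 0)
			return UINT32_C(1) << mfc_gates[i].bit;
	}
	return 0;
}
```

With the pre-patch table the two smmu masks would be exchanged, so
enabling one clock would ungate the other.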


== 5 of 8 ==
Date: Mon, Jan 13 2014 4:30 pm
From: Greg Kroah-Hartman

3.12-stable review patch. If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <edumazet@google.com>

[ Upstream commit 28e24c62ab3062e965ef1b3bcc244d50aee7fa85 ]

Few network drivers really support frag_list: virtual drivers.

Some drivers wrongly advertise NETIF_F_FRAGLIST feature.

If an skb with a frag_list is given to them, the packet on the wire will
be corrupt.

Remove this flag, as core networking stack will make sure to
provide packets that can be sent without corruption.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com>
Cc: Anirudha Sarangi <anirudh@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/net/ethernet/ibm/ehea/ehea_main.c | 2 +-
drivers/net/ethernet/tehuti/tehuti.c | 1 -
drivers/net/ethernet/xilinx/ll_temac_main.c | 2 +-
drivers/net/ethernet/xilinx/xilinx_axienet_main.c | 2 +-
4 files changed, 3 insertions(+), 4 deletions(-)

--- a/drivers/net/ethernet/ibm/ehea/ehea_main.c
+++ b/drivers/net/ethernet/ibm/ehea/ehea_main.c
@@ -3033,7 +3033,7 @@ static struct ehea_port *ehea_setup_sing

dev->hw_features = NETIF_F_SG | NETIF_F_TSO |
NETIF_F_IP_CSUM | NETIF_F_HW_VLAN_CTAG_TX;
- dev->features = NETIF_F_SG | NETIF_F_FRAGLIST | NETIF_F_TSO |
+ dev->features = NETIF_F_SG | NETIF_F_TSO |
NETIF_F_HIGHDMA | NETIF_F_IP_CSUM |
NETIF_F_HW_VLAN_CTAG_TX | NETIF_F_HW_VLAN_CTAG_RX |
NETIF_F_HW_VLAN_CTAG_FILTER | NETIF_F_RXCSUM;
--- a/drivers/net/ethernet/tehuti/tehuti.c
+++ b/drivers/net/ethernet/tehuti/tehuti.c
@@ -2019,7 +2019,6 @@ bdx_probe(struct pci_dev *pdev, const st
ndev->features = NETIF_F_IP_CSUM | NETIF_F_SG | NETIF_F_TSO
| NETIF_F_HW_VLAN_CTAG_TX | NETIF_F_HW_VLAN_CTAG_RX |
NETIF_F_HW_VLAN_CTAG_FILTER | NETIF_F_RXCSUM
- /*| NETIF_F_FRAGLIST */
;
ndev->hw_features = NETIF_F_IP_CSUM | NETIF_F_SG |
NETIF_F_TSO | NETIF_F_HW_VLAN_CTAG_TX;
--- a/drivers/net/ethernet/xilinx/ll_temac_main.c
+++ b/drivers/net/ethernet/xilinx/ll_temac_main.c
@@ -1016,7 +1016,7 @@ static int temac_of_probe(struct platfor
platform_set_drvdata(op, ndev);
SET_NETDEV_DEV(ndev, &op->dev);
ndev->flags &= ~IFF_MULTICAST; /* clear multicast */
- ndev->features = NETIF_F_SG | NETIF_F_FRAGLIST;
+ ndev->features = NETIF_F_SG;
ndev->netdev_ops = &temac_netdev_ops;
ndev->ethtool_ops = &temac_ethtool_ops;
#if 0
--- a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
+++ b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
@@ -1486,7 +1486,7 @@ static int axienet_of_probe(struct platf

SET_NETDEV_DEV(ndev, &op->dev);
ndev->flags &= ~IFF_MULTICAST; /* clear multicast */
- ndev->features = NETIF_F_SG | NETIF_F_FRAGLIST;
+ ndev->features = NETIF_F_SG;
ndev->netdev_ops = &axienet_netdev_ops;
ndev->ethtool_ops = &axienet_ethtool_ops;


== 6 of 8 ==
Date: Mon, Jan 13 2014 4:30 pm
From: Greg Kroah-Hartman

3.10-stable review patch. If anyone has any objections, please let me know.

------------------

From: Andrew Bresticker <abrestic@chromium.org>

commit 97c3557c3e0413efb1f021f582d1459760e22727 upstream.

The gate clocks for the MFC sysmmus appear to be flipped, i.e.
GATE_IP_MFC[2] gates sysmmu_mfcl and GATE_IP_MFC[1] gates sysmmu_mfcr.
Fix this so that the MFC will start up.

Signed-off-by: Andrew Bresticker <abrestic@chromium.org>
Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org>
Acked-by: Mike Turquette <mturquette@linaro.org>
Signed-off-by: Tomasz Figa <t.figa@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
drivers/clk/samsung/clk-exynos5250.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

--- a/drivers/clk/samsung/clk-exynos5250.c
+++ b/drivers/clk/samsung/clk-exynos5250.c
@@ -325,8 +325,8 @@ struct samsung_gate_clock exynos5250_gat
GATE(smmu_gscl2, "smmu_gscl2", "aclk266", GATE_IP_GSCL, 9, 0, 0),
GATE(smmu_gscl3, "smmu_gscl3", "aclk266", GATE_IP_GSCL, 10, 0, 0),
GATE(mfc, "mfc", "aclk333", GATE_IP_MFC, 0, 0, 0),
- GATE(smmu_mfcl, "smmu_mfcl", "aclk333", GATE_IP_MFC, 1, 0, 0),
- GATE(smmu_mfcr, "smmu_mfcr", "aclk333", GATE_IP_MFC, 2, 0, 0),
+ GATE(smmu_mfcl, "smmu_mfcl", "aclk333", GATE_IP_MFC, 2, 0, 0),
+ GATE(smmu_mfcr, "smmu_mfcr", "aclk333", GATE_IP_MFC, 1, 0, 0),
GATE(rotator, "rotator", "aclk266", GATE_IP_GEN, 1, 0, 0),
GATE(jpeg, "jpeg", "aclk166", GATE_IP_GEN, 2, 0, 0),
GATE(mdma1, "mdma1", "aclk266", GATE_IP_GEN, 4, 0, 0),


== 7 of 8 ==
Date: Mon, Jan 13 2014 4:30 pm
From: Greg Kroah-Hartman

3.12-stable review patch. If anyone has any objections, please let me know.

------------------

From: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>

[ Upstream commit d0b7da8afa079ffe018ab3e92879b7138977fc8f ]

Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/net/tun.c | 2 ++
1 file changed, 2 insertions(+)

--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -1356,6 +1356,8 @@ static ssize_t tun_chr_aio_read(struct k
ret = tun_do_read(tun, tfile, iocb, iv, len,
file->f_flags & O_NONBLOCK);
ret = min_t(ssize_t, ret, len);
+ if (ret > 0)
+ iocb->ki_pos = ret;
out:
tun_put(tun);
return ret;


== 8 of 8 ==
Date: Mon, Jan 13 2014 4:30 pm
From: Greg Kroah-Hartman

3.10-stable review patch. If anyone has any objections, please let me know.

------------------

From: Daniel Borkmann <dborkman@redhat.com>

[ Upstream commit b1aac815c0891fe4a55a6b0b715910142227700f ]

Jakub reported while working with nlmon netlink sniffer that parts of
the inet_diag_sockid are not initialized when r->idiag_family != AF_INET6.
That is, fields of r->id.idiag_src[1 ... 3], r->id.idiag_dst[1 ... 3].

In fact, it seems that we can leak 6 * sizeof(u32) bytes of kernel [slab]
memory through this. At least, in udp_dump_one(), we allocate a skb in ...

rep = nlmsg_new(sizeof(struct inet_diag_msg) + ..., GFP_KERNEL);

... and then pass that to inet_sk_diag_fill() that puts the whole struct
inet_diag_msg into the skb, where we only fill out r->id.idiag_src[0],
r->id.idiag_dst[0] and leave the rest untouched:

r->id.idiag_src[0] = inet->inet_rcv_saddr;
r->id.idiag_dst[0] = inet->inet_daddr;

struct inet_diag_msg embeds struct inet_diag_sockid that is correctly /
fully filled out in the IPv6 case, but not for IPv4.

So just zero them out by using plain memset (for this little amount of
bytes it's probably not worth the extra check for idiag_family == AF_INET).

Similarly, fix also other places where we fill that out.

Reported-by: Jakub Zawadzki <darkjames-ws@darkjames.pl>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
net/ipv4/inet_diag.c | 16 ++++++++++++++++
1 file changed, 16 insertions(+)

--- a/net/ipv4/inet_diag.c
+++ b/net/ipv4/inet_diag.c
@@ -106,6 +106,10 @@ int inet_sk_diag_fill(struct sock *sk, s
 
 	r->id.idiag_sport = inet->inet_sport;
 	r->id.idiag_dport = inet->inet_dport;
+
+	memset(&r->id.idiag_src, 0, sizeof(r->id.idiag_src));
+	memset(&r->id.idiag_dst, 0, sizeof(r->id.idiag_dst));
+
 	r->id.idiag_src[0] = inet->inet_rcv_saddr;
 	r->id.idiag_dst[0] = inet->inet_daddr;
 
@@ -240,12 +244,19 @@ static int inet_twsk_diag_fill(struct in
 
 	r->idiag_family = tw->tw_family;
 	r->idiag_retrans = 0;
+
 	r->id.idiag_if = tw->tw_bound_dev_if;
 	sock_diag_save_cookie(tw, r->id.idiag_cookie);
+
 	r->id.idiag_sport = tw->tw_sport;
 	r->id.idiag_dport = tw->tw_dport;
+
+	memset(&r->id.idiag_src, 0, sizeof(r->id.idiag_src));
+	memset(&r->id.idiag_dst, 0, sizeof(r->id.idiag_dst));
+
 	r->id.idiag_src[0] = tw->tw_rcv_saddr;
 	r->id.idiag_dst[0] = tw->tw_daddr;
+
 	r->idiag_state = tw->tw_substate;
 	r->idiag_timer = 3;
 	r->idiag_expires = DIV_ROUND_UP(tmo * 1000, HZ);
@@ -732,8 +743,13 @@ static int inet_diag_fill_req(struct sk_
 
 	r->id.idiag_sport = inet->inet_sport;
 	r->id.idiag_dport = ireq->rmt_port;
+
+	memset(&r->id.idiag_src, 0, sizeof(r->id.idiag_src));
+	memset(&r->id.idiag_dst, 0, sizeof(r->id.idiag_dst));
+
 	r->id.idiag_src[0] = ireq->loc_addr;
 	r->id.idiag_dst[0] = ireq->rmt_addr;
+
 	r->idiag_expires = jiffies_to_msecs(tmo);
 	r->idiag_rqueue = 0;
 	r->idiag_wqueue = 0;
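
For readers who want to see the leak pattern outside the kernel, here is a
minimal userspace sketch. It is not the kernel code: struct sockid_demo and
the fill_v4_*() helpers are hypothetical stand-ins for the address part of
struct inet_diag_sockid and its fill sites, and the 0xAA pattern stands in
for stale slab contents. It only illustrates why writing word 0 alone leaves
6 words (3 in src, 3 in dst) untouched, and why the memset closes that gap.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the address fields of struct inet_diag_sockid. */
struct sockid_demo {
	uint32_t src[4];
	uint32_t dst[4];
};

/* Pre-fix pattern: only word 0 is written, words 1..3 keep old contents. */
static void fill_v4_unsafe(struct sockid_demo *id, uint32_t saddr, uint32_t daddr)
{
	id->src[0] = saddr;
	id->dst[0] = daddr;
}

/* Post-fix pattern: zero both arrays first, then write word 0. */
static void fill_v4_zeroed(struct sockid_demo *id, uint32_t saddr, uint32_t daddr)
{
	memset(id->src, 0, sizeof(id->src));
	memset(id->dst, 0, sizeof(id->dst));
	id->src[0] = saddr;
	id->dst[0] = daddr;
}

/* Count the words beyond index 0 that still hold non-zero (stale) data. */
static unsigned int stale_words(const struct sockid_demo *id)
{
	unsigned int n = 0;

	for (int i = 1; i < 4; i++) {
		n += (id->src[i] != 0);
		n += (id->dst[i] != 0);
	}
	return n;
}

/* Self-check: poison the buffer the way reused heap memory might look. */
static void inet_diag_demo(void)
{
	struct sockid_demo id;

	memset(&id, 0xAA, sizeof(id));
	fill_v4_unsafe(&id, 0x0100007fu, 0x0200007fu);
	assert(stale_words(&id) == 6);	/* 6 * sizeof(u32) bytes exposed */

	memset(&id, 0xAA, sizeof(id));
	fill_v4_zeroed(&id, 0x0100007fu, 0x0200007fu);
	assert(stale_words(&id) == 0);
}
```

Calling inet_diag_demo() from a test main() exercises both variants; the
unconditional memset mirrors the patch's choice to skip a family check.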


==============================================================================
TOPIC: 3.4.77-stable review
http://groups.google.com/group/linux.kernel/t/ef25c501937e80d8?hl=en
==============================================================================

== 1 of 2 ==
Date: Mon, Jan 13 2014 4:30 pm
From: Greg Kroah-Hartman

This is the start of the stable review cycle for the 3.4.77 release.
There are 27 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.

Responses should be made by Thu Jan 16 00:26:11 UTC 2014.
Anything received after that time might be too late.

The whole patch series can be found in one patch at:
kernel.org/pub/linux/kernel/v3.0/stable-review/patch-3.4.77-rc1.gz
and the diffstat can be found below.

thanks,

greg k-h

-------------
Pseudo-Shortlog of commits:

Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Linux 3.4.77-rc1

Paul Turner <pjt@google.com>
sched: Guarantee new group-entities always have weight

Ben Segall <bsegall@google.com>
sched: Fix hrtimer_cancel()/rq->lock deadlock

Ben Segall <bsegall@google.com>
sched: Fix cfs_bandwidth misuse of hrtimer_expires_remaining

Ben Segall <bsegall@google.com>
sched: Fix race on toggling cfs_bandwidth_used

Linus Torvalds <torvalds@linux-foundation.org>
x86, fpu, amd: Clear exceptions in AMD FXSAVE workaround

Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
ARM: shmobile: mackerel: Fix coherent DMA mask

Russell King <rmk+kernel@arm.linux.org.uk>
ARM: fix "bad mode in ... handler" message for undefined instructions

Curt Brune <curt@cumulusnetworks.com>
bridge: use spin_lock_bh() in br_multicast_set_hash_max

Daniel Borkmann <dborkman@redhat.com>
net: llc: fix use after free in llc_ui_recvmsg

David S. Miller <davem@davemloft.net>
vlan: Fix header ops passthru when doing TX VLAN offload.

Florian Westphal <fw@strlen.de>
net: rose: restore old recvmsg behavior

Sasha Levin <sasha.levin@oracle.com>
rds: prevent dereference of a NULL device

Salva Peiró <speiro@ai2.upv.es>
hamradio/yam: fix info leak in ioctl

Wenliang Fan <fanwlexca@gmail.com>
drivers/net/hamradio: Integer overflow in hdlcdrv_ioctl()

Daniel Borkmann <dborkman@redhat.com>
net: inet_diag: zero out uninitialized idiag_{src,dst} fields

Sasha Levin <sasha.levin@oracle.com>
net: unix: allow bind to fail on mutex lock

Jason Wang <jasowang@redhat.com>
netvsc: don't flush peers notifying work during setting mtu

Nat Gurumoorthy <natg@google.com>
tg3: Initialize REG_BASE_ADDR at PCI config offset 120 to 0

Sasha Levin <sasha.levin@oracle.com>
net: unix: allow set_peek_off to fail

Changli Gao <xiaosuo@gmail.com>
net: drop_monitor: fix the value of maxattr

Hannes Frederic Sowa <hannes@stressinduktion.org>
ipv6: don't count addrconf generated routes against gc limit

Jason Wang <jasowang@redhat.com>
macvtap: signal truncated packets

Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
tun: update file current position

Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
macvtap: update file current position

Vlad Yasevich <vyasevic@redhat.com>
macvtap: Do not double-count received packets

Venkat Venkatsubra <venkat.x.venkatsubra@oracle.com>
rds: prevent BUG_ON triggered on congestion update to loopback

Eric Dumazet <edumazet@google.com>
net: do not pretend FRAGLIST support

-------------

Diffstat:

Makefile | 4 +-
arch/arm/kernel/traps.c | 8 +++-
arch/arm/mach-shmobile/board-mackerel.c | 4 +-
arch/x86/include/asm/fpu-internal.h | 13 +++---
drivers/net/ethernet/broadcom/tg3.c | 3 ++
drivers/net/ethernet/calxeda/xgmac.c | 2 +-
drivers/net/ethernet/ibm/ehea/ehea_main.c | 2 +-
drivers/net/ethernet/tehuti/tehuti.c | 1 -
drivers/net/ethernet/xilinx/ll_temac_main.c | 2 +-
drivers/net/ethernet/xilinx/xilinx_axienet_main.c | 2 +-
drivers/net/hamradio/hdlcdrv.c | 2 +
drivers/net/hamradio/yam.c | 1 +
drivers/net/hyperv/netvsc_drv.c | 1 -
drivers/net/macvtap.c | 20 ++++------
drivers/net/tun.c | 2 +
include/linux/net.h | 2 +-
include/linux/netdevice.h | 9 +++++
kernel/sched/core.c | 9 ++++-
kernel/sched/fair.c | 48 ++++++++++++++++-------
kernel/sched/sched.h | 3 +-
net/8021q/vlan_dev.c | 19 ++++++++-
net/bridge/br_multicast.c | 4 +-
net/core/drop_monitor.c | 1 -
net/core/sock.c | 2 +-
net/ipv4/inet_diag.c | 16 ++++++++
net/ipv6/route.c | 8 +---
net/llc/af_llc.c | 5 ++-
net/rds/ib.c | 3 +-
net/rds/ib_send.c | 5 +--
net/rose/af_rose.c | 16 ++------
net/unix/af_unix.c | 16 ++++++--
31 files changed, 153 insertions(+), 80 deletions(-)


== 2 of 2 ==
Date: Mon, Jan 13 2014 4:30 pm
From: Greg Kroah-Hartman

3.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Curt Brune <curt@cumulusnetworks.com>

[ Upstream commit fe0d692bbc645786bce1a98439e548ae619269f5 ]

br_multicast_set_hash_max() is called from process context in
net/bridge/br_sysfs_br.c by the sysfs store_hash_max() function.

br_multicast_set_hash_max() calls spin_lock(&br->multicast_lock),
which can deadlock the CPU if a softirq that also tries to take the
same lock interrupts br_multicast_set_hash_max() while the lock is
held. This can happen quite easily when any of the bridge multicast
timers expire, which try to take the same lock.

The fix here is to use spin_lock_bh(), preventing other softirqs from
executing on this CPU.

Steps to reproduce:

1. Create a bridge with several interfaces (I used 4).
2. Set the "multicast query interval" to a low number, like 2.
3. Enable the bridge as a multicast querier.
4. Repeatedly set the bridge hash_max parameter via sysfs.

# brctl addbr br0
# brctl addif br0 eth1 eth2 eth3 eth4
# brctl setmcqi br0 2
# brctl setmcquerier br0 1

# while true ; do echo 4096 > /sys/class/net/br0/bridge/hash_max; done

Signed-off-by: Curt Brune <curt@cumulusnetworks.com>
Signed-off-by: Scott Feldman <sfeldma@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
net/bridge/br_multicast.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

--- a/net/bridge/br_multicast.c
+++ b/net/bridge/br_multicast.c
@@ -1744,7 +1744,7 @@ int br_multicast_set_hash_max(struct net
 	u32 old;
 	struct net_bridge_mdb_htable *mdb;
 
-	spin_lock(&br->multicast_lock);
+	spin_lock_bh(&br->multicast_lock);
 	if (!netif_running(br->dev))
 		goto unlock;
 
@@ -1776,7 +1776,7 @@ rollback:
 	}
 
 unlock:
-	spin_unlock(&br->multicast_lock);
+	spin_unlock_bh(&br->multicast_lock);
 
 	return err;
 }
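
The deadlock described in this patch can be illustrated with a toy
single-CPU model in userspace C. This is only a sketch under stated
assumptions: struct cpu_model and every helper below are hypothetical, no
real locking is involved, and the "timer" is just a flag. It shows the
ordering argument only: with a plain lock the timer softirq may run inside
the critical section and spin on a lock its own CPU holds, while the _bh
variant defers it until after the unlock.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one CPU; none of this is kernel API. */
struct cpu_model {
	bool lock_held;		/* models br->multicast_lock */
	int bh_disabled;	/* models the softirq-disable count */
	bool timer_pending;	/* a multicast timer is waiting to run */
	bool deadlocked;	/* set if the softirq found the lock held */
};

/* Softirq entry point: runs the timer only when bottom halves are enabled. */
static void maybe_run_timer(struct cpu_model *c)
{
	if (!c->timer_pending || c->bh_disabled > 0)
		return;
	c->timer_pending = false;
	if (c->lock_held)
		c->deadlocked = true;	/* would spin forever on its own CPU */
}

/* A multicast timer expires and tries to run immediately. */
static void timer_fires(struct cpu_model *c)
{
	c->timer_pending = true;
	maybe_run_timer(c);
}

/* Pre-fix pattern: plain lock, softirq can interrupt the critical section. */
static bool set_hash_max_plain(struct cpu_model *c)
{
	c->lock_held = true;	/* spin_lock() */
	timer_fires(c);
	c->lock_held = false;	/* spin_unlock(), never reached on real hardware */
	return c->deadlocked;
}

/* Post-fix pattern: softirqs are disabled for the whole critical section. */
static bool set_hash_max_bh(struct cpu_model *c)
{
	c->bh_disabled++;	/* spin_lock_bh(): disable softirqs, then lock */
	c->lock_held = true;
	timer_fires(c);		/* deferred, not run */
	c->lock_held = false;	/* spin_unlock_bh(): unlock, then re-enable */
	c->bh_disabled--;
	maybe_run_timer(c);	/* pending softirq now runs with the lock free */
	return c->deadlocked;
}

/* Self-check for both variants. */
static void lock_demo(void)
{
	struct cpu_model a = {0}, b = {0};

	assert(set_hash_max_plain(&a));
	assert(!set_hash_max_bh(&b));
	assert(!b.timer_pending);
}
```

The model deliberately ignores other CPUs; the point of the fix is the
same-CPU case, which spinning alone can never resolve.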


==============================================================================
TOPIC: arm: remap non-modular uses of module_init properly
http://groups.google.com/group/linux.kernel/t/c8256ff09940618b?hl=en
==============================================================================

== 1 of 3 ==
Date: Mon, Jan 13 2014 4:30 pm
From: Paul Gortmaker

The goal is to move module_init/module_exit from init.h and into
module.h -- however in doing so, we uncover several instances in
ARM code where module_init is used somewhat incorrectly by non modular
code, and a file that needs module.h but isn't sourcing it. We need to
make these fixups first, before changing the headers, so that we don't
cause build failures later on.

The changes are largely inert, however we do cause a largely trivial
change in one initcall ordering -- that happens because module_init
is really device_initcall; but I didn't use device_initcall because
subsys_initcall seems somewhat more appropriate.

All modified files were build tested on today's linux-next tree.

Paul.
---

Paul Gortmaker (3):
arm: use subsys_initcall in non-modular pl320 IPC code
arm: include module.h in drivers/bus/omap_l3_smx.c
arm: don't use module_init in non-modular mach-vexpress/spc.c code

arch/arm/mach-vexpress/spc.c | 2 +-
drivers/bus/omap_l3_smx.c | 1 +
drivers/mailbox/pl320-ipc.c | 2 +-
3 files changed, 3 insertions(+), 2 deletions(-)

--
1.8.5.2


== 2 of 3 ==
Date: Mon, Jan 13 2014 4:30 pm
From: Paul Gortmaker

The drivers/mailbox/pl320-ipc.o is dependent on config PL320_MBOX
which is declared as a bool. Hence the code is never going to be
modular. So using module_init as an alias for __initcall can be
somewhat misleading.

Fix this up now, so that we can relocate module_init from
init.h into module.h in the future. If we don't do this, we'd
have to add module.h to obviously non-modular code, and that
would be a worse thing. Also add an inclusion of init.h, as
that was previously implicit.

Note that direct use of __initcall is discouraged, vs. one of the
priority categorized subgroups. As __initcall gets mapped onto
device_initcall, our use of subsys_initcall (which seems to make
sense for IPC code) will thus change this registration from a
level 6-device to a level 4-subsys (i.e. slightly earlier).
However no impact of that small difference is expected.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
---
drivers/mailbox/pl320-ipc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/mailbox/pl320-ipc.c b/drivers/mailbox/pl320-ipc.c
index d873cbae2fbb..b2737a2df1d3 100644
--- a/drivers/mailbox/pl320-ipc.c
+++ b/drivers/mailbox/pl320-ipc.c
@@ -195,4 +195,4 @@ static int __init ipc_init(void)
 {
 	return amba_driver_register(&pl320_driver);
 }
-module_init(ipc_init);
+subsys_initcall(ipc_init);
--
1.8.5.2
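
To make the level 6 versus level 4 remark concrete, here is a small
userspace sketch of level-ordered init calls. It is an illustration, not
the kernel's mechanism: the table, run_initcalls() and the *_demo functions
are all hypothetical, and the real kernel collects initcalls in linker
sections rather than an array. Only the level numbers (4 for subsys, 6 for
device) are taken from the text above.

```c
#include <assert.h>
#include <stddef.h>

/* Level numbers as quoted in the patch description. */
enum initcall_level { LVL_SUBSYS = 4, LVL_DEVICE = 6 };
#define MAX_LEVEL 7

typedef int (*initcall_t)(void);

struct initcall_entry {
	int level;
	initcall_t fn;
};

/* Records which demo init function ran, in order. */
static int run_order[8];
static size_t run_count;

static int ipc_init_demo(void)    { run_order[run_count++] = 'I'; return 0; }
static int driver_init_demo(void) { run_order[run_count++] = 'D'; return 0; }

/* Run entries level by level; within a level, table order is kept. */
static void run_initcalls(const struct initcall_entry *tbl, size_t n)
{
	run_count = 0;
	for (int level = 0; level <= MAX_LEVEL; level++)
		for (size_t i = 0; i < n; i++)
			if (tbl[i].level == level)
				tbl[i].fn();
}

/* Self-check: moving the IPC init to level 4 makes it run earlier. */
static void initcall_demo(void)
{
	const struct initcall_entry before[] = {
		{ LVL_DEVICE, driver_init_demo },
		{ LVL_DEVICE, ipc_init_demo },	/* module_init in built-in code */
	};
	const struct initcall_entry after[] = {
		{ LVL_DEVICE, driver_init_demo },
		{ LVL_SUBSYS, ipc_init_demo },	/* subsys_initcall */
	};

	run_initcalls(before, 2);
	assert(run_order[0] == 'D' && run_order[1] == 'I');

	run_initcalls(after, 2);
	assert(run_order[0] == 'I' && run_order[1] == 'D');
}
```

In this model the only observable effect of the change is the earlier
position of the IPC init, which matches the "slightly earlier" wording.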


== 3 of 3 ==
Date: Mon, Jan 13 2014 4:30 pm
From: Paul Gortmaker

The spc.o is built for ARCH_VEXPRESS_SPC -- which is bool, and hence
this code is either present or absent. It will never be modular,
so using module_init as an alias for __initcall can be somewhat
misleading.

Fix this up now, so that we can relocate module_init from
init.h into module.h in the future. If we don't do this, we'd
have to add module.h to obviously non-modular code, and that
would be a worse thing.

Note that direct use of __initcall is discouraged, vs. one
of the priority categorized subgroups. As __initcall gets
mapped onto device_initcall, our use of device_initcall
directly in this change means that the runtime impact is
zero -- it will remain at level 6 in initcall ordering.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
---
arch/arm/mach-vexpress/spc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm/mach-vexpress/spc.c b/arch/arm/mach-vexpress/spc.c
index c26ef5b92ca7..9312a9b0405b 100644
--- a/arch/arm/mach-vexpress/spc.c
+++ b/arch/arm/mach-vexpress/spc.c
@@ -581,4 +581,4 @@ static int __init ve_spc_clk_init(void)
 	platform_device_register_simple("vexpress-spc-cpufreq", -1, NULL, 0);
 	return 0;
 }
-module_init(ve_spc_clk_init);
+device_initcall(ve_spc_clk_init);
--
1.8.5.2


==============================================================================
TOPIC: bug in sscanf()?
http://groups.google.com/group/linux.kernel/t/df11bc617290103d?hl=en
==============================================================================

== 1 of 1 ==
Date: Mon, Jan 13 2014 4:30 pm
From: Linus Torvalds

On Tue, Jan 14, 2014 at 6:30 AM, Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> Comments?

Do we have actual users of this? Because I'd almost be inclined to say
"we just don't support field widths on sscanf() and will warn" unless
there are users.

We've done that before. The kernel has various limited functions. See
the whole snprintf() issue with %n, where we decided that supporting
the full semantics was actually a big mistake and we actively
*removed* code that had been misguidedly added just because people
thought we should do everything a standard user library does.

Limiting our problem space is a *good* thing, not a bad thing.

If it's possible, of course, and we don't have nasty users.

Linus
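
As background on what "field widths" means here, the following is a small
userspace sketch using the standard C library sscanf(), not the kernel's
own implementation. parse_hhmm() is a hypothetical helper; it only shows
the standard semantics, where a width such as %2d caps how many input
characters a conversion may consume.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Split a fixed-width "HHMM" string into two integers.
 * Returns the number of fields converted, as sscanf() does.
 */
static int parse_hhmm(const char *s, int *hh, int *mm)
{
	return sscanf(s, "%2d%2d", hh, mm);
}

/* Self-check: each %2d consumes at most two characters. */
static void sscanf_demo(void)
{
	int hh = 0, mm = 0;

	assert(parse_hhmm("1234", &hh, &mm) == 2);
	assert(hh == 12 && mm == 34);
}
```

Without the widths, a single %d would consume all four digits, which is
the behavior a width-less kernel sscanf() would be limited to.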

==============================================================================

You received this message because you are subscribed to the Google Groups "linux.kernel"
group.

To post to this group, visit http://groups.google.com/group/linux.kernel?hl=en

To unsubscribe from this group, send email to linux.kernel+unsubscribe@googlegroups.com

To change the way you get mail from this group, visit:
http://groups.google.com/group/linux.kernel/subscribe?hl=en

To report abuse, send email explaining the problem to abuse@googlegroups.com

==============================================================================
Google Groups: http://groups.google.com/?hl=en

twitter

Monday, January 13, 2014

linux.kernel - 26 new messages in 11 topics - digest
