linux.kernel - 26 new messages in 17 topics

linux.kernel
http://groups.google.com/group/linux.kernel?hl=en

Today's topics:

* tty: prevent DOS in the flush_to_ldisc - 1 message, 1 author
http://groups.google.com/group/linux.kernel/t/b22e5935660077fb?hl=en
* Tracing: do export trace_set_clr_event - 2 messages, 2 authors
http://groups.google.com/group/linux.kernel/t/392e03d89ceb236d?hl=en
* [PATCH] DM-CRYPT: Scale to multiple CPUs v3 on 2.6.37-rc* ? - 2 messages, 2
authors
http://groups.google.com/group/linux.kernel/t/aa6f89b2fbf0c4f9?hl=en
* dyntick-hpc and RCU - 3 messages, 2 authors
http://groups.google.com/group/linux.kernel/t/cd349868731ca1c0?hl=en
* mm: make ioremap_prot() take a pgprot. - 1 message, 1 author
http://groups.google.com/group/linux.kernel/t/6800294b28bef246?hl=en
* cfq-iosched: don't idle if a deep seek queue is slow - 2 messages, 2 authors
http://groups.google.com/group/linux.kernel/t/699aa32478cb8985?hl=en
* cfq-iosched: schedule dispatch for noidle queue - 1 message, 1 author
http://groups.google.com/group/linux.kernel/t/67b4a5358c0f2b08?hl=en
* mxcmmc: Add the ability to bind a regulator to manage the MMC card voltage -
1 message, 1 author
http://groups.google.com/group/linux.kernel/t/e65fb7063bdef227?hl=en
* spi/xilinx: Merge OF and non-OF drivers - 1 message, 1 author
http://groups.google.com/group/linux.kernel/t/7769b62e03e2ceee?hl=en
* TCM Core and TCM_Loop patches for v2.6.37 - 4 messages, 2 authors
http://groups.google.com/group/linux.kernel/t/1bed3967d6e42751?hl=en
* Warning Code: ID67565434. - 1 message, 1 author
http://groups.google.com/group/linux.kernel/t/5b6757ddf2cc83e6?hl=en
* x86, hw_nmi: Move backtrace_mask declaration under ARCH_HAS_NMI_WATCHDOG. -
1 message, 1 author
http://groups.google.com/group/linux.kernel/t/4f1290a298c1b97e?hl=en
* input: Introduce light-weight contact tracking - 1 message, 1 author
http://groups.google.com/group/linux.kernel/t/0afd1560f6cc979d?hl=en
* tcm: Add SPC-4 compliant Persistent Reservations (PR) - 1 message, 1 author
http://groups.google.com/group/linux.kernel/t/9ec3f50497214db4?hl=en
* Q: sys_perf_event_open() && PF_EXITING - 1 message, 1 author
http://groups.google.com/group/linux.kernel/t/a6dd8c26ae5f2ddd?hl=en
* Q: perf_event && task->ptrace_bps[] - 2 messages, 1 author
http://groups.google.com/group/linux.kernel/t/94d1426aaa464d03?hl=en
* [PATCH 2/2] Drivers: block: aoe: Makefile: replace the use of <module>-objs
with <module>-y - 1 message, 1 author
http://groups.google.com/group/linux.kernel/t/3abe1a7d54d7d4f0?hl=en

==============================================================================
TOPIC: tty: prevent DOS in the flush_to_ldisc
http://groups.google.com/group/linux.kernel/t/b22e5935660077fb?hl=en
==============================================================================

== 1 of 1 ==
Date: Mon, Nov 8 2010 6:20 am
From: Alan Cox

On Mon, 8 Nov 2010 14:48:41 +0100
Jiri Olsa <jolsa@redhat.com> wrote:

> hi, any feedback?

Don't think I saw this before.

> > The attached patch (based on -next tree) fixes this by adding threshold
> > for processed data. When the threshold is reached, the current work is
> > rescheduled, so another could run.
> >
> > The threshold is set to the tty buffer maximum size.

That is an n_tty concept really - most other ldiscs simply eat stuff as
it hits them. It's also something we've got some evidence may need to
become a variable, but would still make sense.

Would it be simpler to remember the queue end before the first iteration,
and not go past the queue end as it was on entry to flush_to_ldisc?

Alan
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
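
[Digest note] Alan's suggestion above can be illustrated with a small
userspace model. This is only a sketch under assumptions: struct buf,
struct queue, queue_push(), flush_bounded() and greedy_producer() are
hypothetical names invented for the illustration, and none of this is the
tty layer's actual code. The idea shown is that the flush snapshots the
queue end on entry and stops there, so a producer that keeps appending
cannot keep a single flush running forever.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a tty buffer: one node per chunk of input. */
struct buf {
    struct buf *next;
    int len;                /* bytes of data held by this chunk */
};

struct queue {
    struct buf *head;       /* next chunk to hand to the line discipline */
    struct buf *tail;       /* last chunk queued so far */
    int consumed;           /* bytes delivered, for illustration only */
};

/* Append a chunk to the end of the queue. */
static void queue_push(struct queue *q, struct buf *b)
{
    b->next = NULL;
    if (q->tail)
        q->tail->next = b;
    else
        q->head = b;
    q->tail = b;
}

/* Hypothetical producer that adds one more chunk every time it is called. */
#define EXTRA_MAX 4
static struct buf extra[EXTRA_MAX];
static int extra_used;

static void greedy_producer(struct queue *q)
{
    if (extra_used < EXTRA_MAX) {
        extra[extra_used].len = 1;
        queue_push(q, &extra[extra_used++]);
    }
}

/*
 * Drain the queue, but never past the chunk that was last on entry.
 * The optional refill hook models a producer adding data mid-flush.
 * Returns true if chunks remain, so the caller can reschedule the work.
 */
static bool flush_bounded(struct queue *q, void (*refill)(struct queue *))
{
    struct buf *end = q->tail;  /* snapshot of the queue end at entry */

    while (q->head) {
        struct buf *b = q->head;

        q->consumed += b->len;  /* "deliver" the chunk */
        if (refill)
            refill(q);          /* the producer may grow the queue here */

        q->head = b->next;
        if (!q->head)
            q->tail = NULL;
        if (b == end)
            break;              /* reached the entry-time end: stop */
    }
    return q->head != NULL;
}
```

With two chunks queued and greedy_producer() as the hook, one call
delivers exactly those two chunks and returns true, leaving the newly
added chunks for a later run of the work.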

==============================================================================
TOPIC: Tracing: do export trace_set_clr_event
http://groups.google.com/group/linux.kernel/t/392e03d89ceb236d?hl=en
==============================================================================

== 1 of 2 ==
Date: Mon, Nov 8 2010 6:20 am
From: Christoph Hellwig

On Mon, Nov 08, 2010 at 09:02:14AM -0500, Steven Rostedt wrote:
> I like the trace=on parameter much better. If that is set we could
> enable the tracepoints of that module at load time. I really do not want
> to export the function that was proposed in that patch.

Yes. Adding generic support in the module loader to turn on tracepoints
seems like the much better long term strategy. Even better if it allows
turning on individual points instead of all or nothing.

== 2 of 2 ==
Date: Mon, Nov 8 2010 6:30 am
From: Steven Rostedt

On Mon, 2010-11-08 at 09:14 -0500, Christoph Hellwig wrote:
> On Mon, Nov 08, 2010 at 09:02:14AM -0500, Steven Rostedt wrote:
> > I like the trace=on parameter much better. If that is set we could
> > enable the tracepoints of that module at load time. I really do not want
> > to export the function that was proposed in that patch.
>
> Yes. Adding generic support in the module loader to turn on tracepoints
> seems like the much better long term strategy. Even better if it allows
> turning on individual points instead of all or nothing.

I was thinking the same. How about this:

trace=1 - all tracepoints in the module are enabled
trace=0 - same as leaving it off

trace=name - a specific tracepoint is enabled, using the simple globs
that set_event allows.

trace=name1,name2,name3 - for more than one tracepoint.

-- Steve
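
[Digest note] The proposed trace= values can be modelled in userspace as
below. This is a sketch, not the module loader's code: the function names
simple_glob_match() and trace_param_enables() are hypothetical, and it
assumes that "simple globs" means a pattern with an optional leading
and/or trailing '*', with no wildcards in the middle.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Match name against the first plen bytes of pat. Only a leading and/or
 * trailing '*' is treated as a wildcard (assumed meaning of "simple globs").
 */
static bool simple_glob_match(const char *pat, size_t plen, const char *name)
{
    size_t nlen = strlen(name);
    bool lead = plen > 0 && pat[0] == '*';
    bool trail;
    size_t i;

    if (lead) {
        pat++;
        plen--;
    }
    trail = plen > 0 && pat[plen - 1] == '*';
    if (trail)
        plen--;

    if (plen > nlen)
        return false;
    if (lead && trail) {            /* "*mid*": substring match */
        for (i = 0; i + plen <= nlen; i++)
            if (memcmp(name + i, pat, plen) == 0)
                return true;
        return false;
    }
    if (lead)                       /* "*end": suffix match */
        return memcmp(name + nlen - plen, pat, plen) == 0;
    if (trail)                      /* "front*": prefix match */
        return memcmp(name, pat, plen) == 0;
    return plen == nlen && memcmp(name, pat, plen) == 0;
}

/*
 * Decide whether a trace= value enables the given tracepoint:
 * "1" enables everything, "0" nothing, anything else is a
 * comma-separated list of names or simple globs.
 */
static bool trace_param_enables(const char *value, const char *tracepoint)
{
    const char *p = value;

    if (strcmp(value, "1") == 0)
        return true;
    if (strcmp(value, "0") == 0)
        return false;

    while (*p) {
        size_t len = strcspn(p, ",");

        if (len > 0 && simple_glob_match(p, len, tracepoint))
            return true;
        p += len;
        if (*p == ',')
            p++;
    }
    return false;
}
```

For example, trace_param_enables("foo_read,*_open", "foo_open") is true,
while trace_param_enables("foo_read,foo_write", "foo_open") is false. The
tracepoint names here are made up.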

==============================================================================
TOPIC: [PATCH] DM-CRYPT: Scale to multiple CPUs v3 on 2.6.37-rc* ?
http://groups.google.com/group/linux.kernel/t/aa6f89b2fbf0c4f9?hl=en
==============================================================================

== 1 of 2 ==
Date: Mon, Nov 8 2010 6:20 am
From: Alasdair G Kergon

On Mon, Nov 08, 2010 at 12:05:09AM +0100, Andi Kleen wrote:
> e.g. the btrfs mailing list is full of corruption reports
> on dm-crypt and most of the symptoms point to broken barriers.

linux-btrfs? I'm not subscribed, but the searches I've tried
don't show it to be "full of corruption reports".

Could you post links to the threads concerned so we can investigate?

Are we just talking -rc1 or earlier too?

Thanks,
Alasdair

== 2 of 2 ==
Date: Mon, Nov 8 2010 7:00 am
From: Mike Snitzer

On Sun, Nov 07 2010 at 6:05pm -0500,
Andi Kleen <andi@firstfloor.org> wrote:

> On Sun, Nov 07, 2010 at 10:39:23PM +0100, Milan Broz wrote:
> > On 11/07/2010 08:45 PM, Andi Kleen wrote:
> > >> I read about barrier-problems and data getting to the partition when
> > >> using dm-crypt and several layers so I don't know if that could be
> > >> related
> > >
> > > Barriers seem to be totally broken on dm-crypt currently.
> >
> > Can you explain it?
>
> e.g. the btrfs mailing list is full of corruption reports
> on dm-crypt and most of the symptoms point to broken barriers.

[cc'ing linux-btrfs, hopefully in the future dm-devel will get cc'd when
concerns about DM come up on linux-btrfs (or other lists)]

I spoke with Josef Bacik and these corruption reports are apparently
against older kernels (e.g. <= 2.6.33). I say <= 2.6.33 because:

https://btrfs.wiki.kernel.org/index.php/Gotchas states:
"btrfs volumes on top of dm-crypt block devices (and possibly LVM)
require write-caching to be turned off on the underlying HDD. Failing to
do so, in the event of a power failure, may result in corruption not yet
handled by btrfs code. (2.6.33)"

But Josef was not aware of any reports with kernels newer than 2.6.32
(F12).

Josef also noted that until last week btrfs wouldn't retry another
mirror in the face of some corruption; the fix is here:
http://git.kernel.org/?p=linux/kernel/git/mason/btrfs-unstable.git;a=commit;h=cb44921a09221

This obviously doesn't fix any source of corruption but it makes btrfs
more resilient when it encounters the corruption.

> > Barriers/flush change should work, if it is broken, it is not only dm-crypt.
> > (dm-crypt simply relies on dm-core implementation, when barrier/flush
> > request come to dmcrypt, all previous IO must be already finished).
>
> Possibly, at least it doesn't seem to work.

Can you please be more specific? What test(s)? What kernel(s)?

Any pointers to previous (and preferably: recent) reports would be
appreciated.

The DM barrier code has seen considerable change recently (via flush+fua
changes in 2.6.37). Those changes have been tested quite a bit
(including ext4 consistency after a crash).

But even prior to those flush+fua changes DM's support for barriers
(Linux >= 2.6.31) was held to be robust. No known (at least no
reported) issues with DM's barrier support.

Mike

==============================================================================
TOPIC: dyntick-hpc and RCU
http://groups.google.com/group/linux.kernel/t/cd349868731ca1c0?hl=en
==============================================================================

== 1 of 3 ==
Date: Mon, Nov 8 2010 6:20 am
From: Frederic Weisbecker

On Fri, Nov 05, 2010 at 08:04:36AM -0700, Paul E. McKenney wrote:
> On Fri, Nov 05, 2010 at 06:27:46AM +0100, Frederic Weisbecker wrote:
> > Yet another solution is to require users of bh and sched rcu flavours to
> > call a specific rcu_read_lock_sched()/bh, or something similar, that would
> > be only implemented in this new rcu config. We would only need to touch the
> > existing users and the future ones instead of adding an explicit call
> > to every implicit paths.
>
> This approach would be a much nicer solution, and I do wish I had required
> this to start with. Unfortunately, at that time, there was no preemptible
> RCU, CONFIG_PREEMPT, nor any RCU-bh, so there was no way to enforce this.
> Besides which, I was thinking in terms of maybe 100 occurrences of the RCU
> API in the kernel. ;-)

Ok, I'll continue the discussion about this specific point in the
non-timer based rcu patch thread.

> > > 4. Substitute an RCU implementation based on one of the
> > > user-level RCU implementations. This has roughly the same
> > > advantages and disadvantages as does #3 above.
> > >
> > > 5. Don't tell RCU about dyntick-hpc mode, but instead make RCU
> > > push processing through via some processor that is kept out
> > > of dyntick-hpc mode.
> >
> > I don't understand what you mean.
> > Do you mean that dyntick-hpc cpu would enqueue rcu callbacks to
> > another CPU? But how does that protect rcu critical sections
> > in our dyntick-hpc CPU?
>
> There is a large range of possible solutions, but any solution will need
> to check for RCU read-side critical sections on the dyntick-hpc CPU. I
> was thinking in terms of IPIing the dyntick-hpc CPUs, but very infrequently,
> say once per second.

Every time we want to notify a quiescent state, right?
But I fear that forcing an IPI, even only once per second, breaks our
initial requirement.

> > > This requires that the rcutree RCU
> > > priority boosting be pushed further along so that RCU grace period
> > > and callback processing is done in kthread context, permitting
> > > remote forcing of grace periods.
> >
> >
> >
> > I should have a look at the rcu priority boosting to understand what you
> > mean here.
>
> The only thing that you really need to know about it is that I will be
> moving the current softirq processing to kthread context. The key point
> here is that we can wake up a kthread on some other CPU.

Ok.

> > > The RCU_JIFFIES_TILL_FORCE_QS
> > > macro is promoted to a config variable, retaining its value
> > > of 3 in absence of dyntick-hpc, but getting value of HZ
> > > (or thereabouts) for dyntick-hpc builds. In dyntick-hpc
> > > builds, force_quiescent_state() would push grace periods
> > > for CPUs lacking a scheduling-clock interrupt.
> > >
> > > + Relatively small changes to RCU, some of which is
> > > coming with RCU priority boosting anyway.
> > >
> > > + No need to inform RCU of user/kernel transitions.
> > >
> > > + No need to turn scheduling-clock interrupts on
> > > at each user/kernel transition.
> > >
> > > - Some IPIs to dyntick-hpc CPUs remain, but these
> > > are down in the every-second-or-so frequency,
> > > so hopefully are not a real problem.
> >
> >
> > Hmm, I hope we could avoid that, ideally the task in userspace shouldn't be
> > interrupted at all.
>
> Yep. But if we do need to interrupt it, let's do it as infrequently as
> we can!

If we have no other solution, yeah, but I'm not sure that's the right way
to go.

> > I wonder if we shouldn't go back to #3 eventually.
>
> And there are variants of #3 that permit preemption of RCU read-side
> critical sections.

Ok.

> > At that time yeah.
> >
> > But now I don't know, I really need to dig deeper into it and really
> > understand how #5 works before picking that orientation :)
>
> This is probably true for all of us for all of the options. ;-)

Hehe ;-)
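
[Digest note] The arithmetic behind option #5's "every-second-or-so" IPIs
can be sketched as follows. The function names and the boolean parameter
are hypothetical; the thread only says that RCU_JIFFIES_TILL_FORCE_QS
would keep its value of 3 without dyntick-hpc and take a value of about HZ
with it. The rate computed here is an upper bound, assuming quiescent
states are forced once per interval.

```c
#include <stdbool.h>

/*
 * Interval, in jiffies, before quiescent states are forced:
 * 3 normally, HZ (one second's worth of jiffies) for dyntick-hpc.
 */
static unsigned long jiffies_till_force_qs(bool dyntick_hpc, unsigned long hz)
{
    return dyntick_hpc ? hz : 3;
}

/* Upper bound on forcing passes per second; hz must be nonzero. */
static unsigned long forced_qs_per_second(bool dyntick_hpc, unsigned long hz)
{
    return hz / jiffies_till_force_qs(dyntick_hpc, hz);
}
```

With HZ taken as 1000 purely as an example, this gives up to 333 passes
per second in the normal case and 1 per second for dyntick-hpc.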

== 2 of 3 ==
Date: Mon, Nov 8 2010 7:10 am
From: Frederic Weisbecker

On Sat, Nov 06, 2010 at 12:28:12PM -0700, Paul E. McKenney wrote:
> On Fri, Nov 05, 2010 at 05:00:59PM -0400, Joe Korty wrote:
> > +/**
> > + * synchronize_sched - block until all CPUs have exited any non-preemptive
> > + * kernel code sequences.
> > + *
> > + * This means that all preempt_disable code sequences, including NMI and
> > + * hardware-interrupt handlers, in progress on entry will have completed
> > + * before this primitive returns. However, this does not guarantee that
> > + * softirq handlers will have completed, since in some kernels
>
> OK, so your approach treats preempt_disable code sequences as RCU
> read-side critical sections by relying on the fact that the per-CPU
> ->krcud task cannot run until such code sequences complete, correct?
>
> This seems to require that each CPU's ->krcud task be awakened at
> least once per grace period, but I might well be missing something.

I understood it differently, but I might be wrong as well. krcud
executes the callbacks, but it is only woken up for CPUs that want to
execute callbacks, not for those that only signal a quiescent state,
which is only determined in two ways through rcu_poll_other_cpus():

- if the CPU is in an rcu_read_lock() critical section, it has the
IN_RCU_READ_LOCK flag. If so then we set up its DO_RCU_COMPLETION flag so
that it signals its quiescent state on rcu_read_unlock().

- otherwise it's in a quiescent state.

This works for rcu and rcu bh critical sections.
But this works in rcu sched critical sections only if rcu_read_lock_sched() has
been called explicitly, otherwise that doesn't work (in preempt_disable(),
local_irq_save(), etc...). I think this is what is not complete when
Joe said it's not yet a complete rcu implementation.

This is also the part that scares me most :)
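
[Digest note] Frederic's reading of rcu_poll_other_cpus() can be modelled
in single-threaded userspace code as below. This is a toy sketch under
assumptions, not Joe's patch: the flag names come from the message, but
the bit values, struct cpu_state, NR_CPUS and the model_* function names
are invented here, and nesting, memory ordering and real concurrency are
all ignored.

```c
#include <stdbool.h>

#define NR_CPUS 4               /* example size only */

/* Flag names follow the message; the bit values are arbitrary. */
#define IN_RCU_READ_LOCK   0x1u
#define DO_RCU_COMPLETION  0x2u

struct cpu_state {
    unsigned int flags;
    bool quiescent;             /* reported a quiescent state this period */
};

static struct cpu_state cpus[NR_CPUS];

static void model_rcu_read_lock(int cpu)
{
    cpus[cpu].flags |= IN_RCU_READ_LOCK;
}

static void model_rcu_read_unlock(int cpu)
{
    cpus[cpu].flags &= ~IN_RCU_READ_LOCK;
    if (cpus[cpu].flags & DO_RCU_COMPLETION) {
        /* The poller asked us to report when the section ends. */
        cpus[cpu].flags &= ~DO_RCU_COMPLETION;
        cpus[cpu].quiescent = true;
    }
}

/*
 * One polling pass: a CPU inside a read-side section is asked to signal
 * on unlock; any other CPU is taken to be quiescent right away.
 * Returns the number of CPUs still being waited on.
 */
static int model_rcu_poll_other_cpus(void)
{
    int cpu, pending = 0;

    for (cpu = 0; cpu < NR_CPUS; cpu++) {
        if (cpus[cpu].quiescent)
            continue;
        if (cpus[cpu].flags & IN_RCU_READ_LOCK) {
            cpus[cpu].flags |= DO_RCU_COMPLETION;
            pending++;
        } else {
            cpus[cpu].quiescent = true;
        }
    }
    return pending;
}

static bool model_grace_period_done(void)
{
    int cpu;

    for (cpu = 0; cpu < NR_CPUS; cpu++)
        if (!cpus[cpu].quiescent)
            return false;
    return true;
}
```

The model also shows the gap described above: a CPU that is merely inside
preempt_disable() or local_irq_save() sets no flag, so the poll treats it
as quiescent.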

== 3 of 3 ==
Date: Mon, Nov 8 2010 7:20 am
From: Joe Korty

On Mon, Nov 08, 2010 at 10:06:47AM -0500, Frederic Weisbecker wrote:
> On Sat, Nov 06, 2010 at 12:28:12PM -0700, Paul E. McKenney wrote:
>> OK, so your approach treats preempt_disable code sequences as RCU
>> read-side critical sections by relying on the fact that the per-CPU
>> ->krcud task cannot run until such code sequences complete, correct?
>>
>> This seems to require that each CPU's ->krcud task be awakened at
>> least once per grace period, but I might well be missing something.
>
> I understood it differently, but I might be wrong as well. krcud
> executes the callbacks, but it is only woken up for CPUs that want to
> execute callbacks, not for those that only signal a quiescent state,
> which is only determined in two ways through rcu_poll_other_cpus():
>
> - if the CPU is in an rcu_read_lock() critical section, it has the
> IN_RCU_READ_LOCK flag. If so then we set up its DO_RCU_COMPLETION flag so
> that it signals its quiescent state on rcu_read_unlock().
>
> - otherwise it's in a quiescent state.
>
> This works for rcu and rcu bh critical sections.
> But this works in rcu sched critical sections only if rcu_read_lock_sched() has
> been called explicitly, otherwise that doesn't work (in preempt_disable(),
> local_irq_save(), etc...). I think this is what is not complete when
> Joe said it's not yet a complete rcu implementation.
>
> This is also the part that scares me most :)

Mostly, I meant that the new RCU API interfaces that have come into
existence since 2004 were only hastily wrapped or NOPed by me to get
things going.

Jim's method only works with explicit rcu_read_lock..unlock sequences;
implicit sequences via preempt_disable..enable and the like are not
handled. I had thought all such sequences were converted to rcu_read_lock
but maybe that is not yet the case.

Jim will have to comment on the full history. He is incommunicado
at the moment; hopefully he will be able to participate sometime in
the next few days.

Regards,
Joe

==============================================================================
TOPIC: mm: make ioremap_prot() take a pgprot.
http://groups.google.com/group/linux.kernel/t/6800294b28bef246?hl=en
==============================================================================

== 1 of 1 ==
Date: Mon, Nov 8 2010 6:30 am
From: Benjamin Herrenschmidt

On Mon, 2010-11-08 at 15:34 +0900, Paul Mundt wrote:
> On Wed, Nov 03, 2010 at 05:31:03AM +0900, Paul Mundt wrote:
> > The current definition of ioremap_prot() takes an unsigned long for the
> > page flags and then converts to/from a pgprot as necessary. This is
> > unfortunately not sufficient for the SH-X2 TLB case which has a 64-bit
> > pgprot and a 32-bit unsigned long.
> >
> > An inspection of the tree shows that tile and cris also have their
> > own equivalent routines that are using the pgprot_t but do not set
> > HAVE_IOREMAP_PROT, both of which could trivially be adapted.
> >
> > After cris/tile are updated there would also be enough critical mass to
> > move the powerpc devm_ioremap_prot() in to the generic lib/devres.c.
> >
> > Signed-off-by: Paul Mundt <lethal@linux-sh.org>
> >
> Any takers?

Haven't had a chance to play with it yet, still travelling.

Cheers,
Ben.

> > ---
> >
> > arch/powerpc/include/asm/io.h | 8 +++++---
> > arch/powerpc/lib/devres.c | 10 +++++-----
> > arch/sh/Kconfig | 2 +-
> > arch/sh/boards/mach-landisk/setup.c | 2 +-
> > arch/sh/boards/mach-lboxre2/setup.c | 2 +-
> > arch/sh/boards/mach-sh03/setup.c | 2 +-
> > arch/sh/include/asm/io.h | 4 ++--
> > arch/x86/include/asm/io.h | 2 +-
> > arch/x86/mm/ioremap.c | 5 +++--
> > arch/x86/mm/pat.c | 5 ++---
> > include/linux/mm.h | 2 +-
> > mm/memory.c | 6 +++---
> > 12 files changed, 26 insertions(+), 24 deletions(-)
> >
> > diff --git a/arch/powerpc/include/asm/io.h b/arch/powerpc/include/asm/io.h
> > index 001f2f1..27f40e6 100644
> > --- a/arch/powerpc/include/asm/io.h
> > +++ b/arch/powerpc/include/asm/io.h
> > @@ -618,7 +618,8 @@ static inline void iosync(void)
> > *
> > * * ioremap_flags allows to specify the page flags as an argument and can
> > * also be hooked by the platform via ppc_md. ioremap_prot is the exact
> > - * same thing as ioremap_flags.
> > + * same thing as ioremap_flags, with the exception that it takes a
> > + * pgprot value instead.
> > *
> > * * ioremap_nocache is identical to ioremap
> > *
> > @@ -643,7 +644,8 @@ extern void __iomem *ioremap(phys_addr_t address, unsigned long size);
> > extern void __iomem *ioremap_flags(phys_addr_t address, unsigned long size,
> > unsigned long flags);
> > #define ioremap_nocache(addr, size) ioremap((addr), (size))
> > -#define ioremap_prot(addr, size, prot) ioremap_flags((addr), (size), (prot))
> > +#define ioremap_prot(addr, size, prot) ioremap_flags((addr), (size), \
> > + pgprot_val(prot))
> >
> > extern void iounmap(volatile void __iomem *addr);
> >
> > @@ -779,7 +781,7 @@ static inline void * bus_to_virt(unsigned long address)
> > #define clrsetbits_8(addr, clear, set) clrsetbits(8, addr, clear, set)
> >
> > void __iomem *devm_ioremap_prot(struct device *dev, resource_size_t offset,
> > - size_t size, unsigned long flags);
> > + size_t size, pgprot_t prot);
> >
> >
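
[Digest note] The problem the patch description raises, a 64-bit pgprot
passed through a 32-bit unsigned long, can be shown with a small userspace
model. This is a sketch under assumptions: uint32_t stands in for a 32-bit
unsigned long, prot_via_flags() and prot_via_pgprot() are hypothetical
helpers, and the bit pattern in the usage note is made up. pgprot_t,
pgprot_val() and __pgprot() are modelled on the kernel's usual
definitions, not copied from any architecture.

```c
#include <stdint.h>

/* Model of a 64-bit pgprot, as in the SH-X2 TLB case described above. */
typedef struct { uint64_t pgprot; } pgprot_t;

#define pgprot_val(x)   ((x).pgprot)
#define __pgprot(x)     ((pgprot_t){ (x) })

/* Old-style interface: protection bits squeezed through 32 bits. */
static pgprot_t prot_via_flags(uint32_t flags)
{
    return __pgprot(flags);     /* the upper 32 bits are already lost */
}

/* New-style interface: the pgprot_t is passed through unchanged. */
static pgprot_t prot_via_pgprot(pgprot_t prot)
{
    return prot;
}
```

Passing __pgprot(0x0000000100000040ULL) through prot_via_flags() with a
cast to uint32_t yields 0x40, while prot_via_pgprot() keeps all 64 bits.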

Monday, November 8, 2010