linux.kernel - 26 new messages in 16 topics - digest
linux.kernel
http://groups.google.com/group/linux.kernel?hl=en
Today's topics:
* ohci1394_dma=early crash since 2.6.32 (was Re: [Bug #14487] PANIC: early
exception 08 rip 246:10 error ffffffff810251b5 cr2 0) - 1 message, 1 author
http://groups.google.com/group/linux.kernel/t/42f60222e13dd142?hl=en
* perf: New PERF_EVENT_IOC_INJECT ioctl - 5 messages, 4 authors
http://groups.google.com/group/linux.kernel/t/88914be5e6b28d7d?hl=en
* networking tcp: Writing tcp socket be atomic - 1 message, 1 author
http://groups.google.com/group/linux.kernel/t/7ae4e30c83ee8a1e?hl=en
* mxc: Core support for i.MX5 series of processors from Freescale - 4 messages,
2 authors
http://groups.google.com/group/linux.kernel/t/281ca1a532ca6d76?hl=en
* regression in 2.6.27.45 with usb and suspend-to-disk - 1 message, 1 author
http://groups.google.com/group/linux.kernel/t/f18f89f344fa881c?hl=en
* Linux wireless GSoC 2010 project ideas - 2 messages, 1 author
http://groups.google.com/group/linux.kernel/t/09f2b94c7080815d?hl=en
* perf tools: Use O_LARGEFILE to open perf data file - 1 message, 1 author
http://groups.google.com/group/linux.kernel/t/5e0297b0b8d16155?hl=en
* Improving OOM killer - 1 message, 1 author
http://groups.google.com/group/linux.kernel/t/389db2dcf6479d30?hl=en
* HID: make raw output callback more flexible - 1 message, 1 author
http://groups.google.com/group/linux.kernel/t/24254d069a593af0?hl=en
* mfd: Add support for the timberdale FPGA. - 2 messages, 2 authors
http://groups.google.com/group/linux.kernel/t/d01182830d6bf0f5?hl=en
* platform_driver_register: warn if probe is in .init.text - 1 message, 1 author
http://groups.google.com/group/linux.kernel/t/7db9be36a46d6f4f?hl=en
* exit: PR_SET_ANCHOR for marking processes as reapers for child processes - 1
message, 1 author
http://groups.google.com/group/linux.kernel/t/85def94b1ef0fefe?hl=en
* Dell activity led WMI driver - 1 message, 1 author
http://groups.google.com/group/linux.kernel/t/4f642e9c45e31b8d?hl=en
* x86: fix race in create_irq_nr on irq_desc - 2 messages, 1 author
http://groups.google.com/group/linux.kernel/t/8d79051b75e0b2aa?hl=en
* USB: g_mass_storage: min(...) warning fixed - 1 message, 1 author
http://groups.google.com/group/linux.kernel/t/a5a0604f1c0a4dee?hl=en
* linux-next: manual merge of the trivial tree with the net tree - 1 message,
1 author
http://groups.google.com/group/linux.kernel/t/dacea17d4f39c345?hl=en
==============================================================================
TOPIC: ohci1394_dma=early crash since 2.6.32 (was Re: [Bug #14487] PANIC:
early exception 08 rip 246:10 error ffffffff810251b5 cr2 0)
http://groups.google.com/group/linux.kernel/t/42f60222e13dd142?hl=en
==============================================================================
== 1 of 1 ==
Date: Wed, Feb 3 2010 1:20 am
From: "Jan Beulich"
>>> "Justin P. Mattock" <justinmattock@gmail.com> 03.02.10 02:43 >>>
>The only thing I can think of at this point
>is maybe the CFLAGS I used to build this system.
>(as for the x86_32 working and x86_64 failing not sure);
>
>I'm curious to see if anybody else is hitting this?
I think it is pretty clear how a page fault can happen here (but you're
observing a double fault, which I cannot explain [nor can I explain
why the fault apparently didn't get an error code pushed, which is
why address and error code displayed are mixed up]): I would
suspect that FIX_OHCI1394_BASE is now in a different (virtual) 2Mb
range than what is covered by level{1,2}_fixmap_pgt, but this was
a latent issue even before that patch (just waiting for sufficiently
many fixmap entries getting inserted before
__end_of_permanent_fixed_addresses).
The thing is that head_64.S uses hard-coded numbers, but doesn't
really make sure (at build time) that the fixmap page tables it
establishes indeed cover all the entries of importance (and honestly I
can't even easily tell which of the candidates - FIX_DBGP_BASE,
FIX_EARLYCON_MEM_BASE, and FIX_OHCI1394_BASE afaict - really
matter). If either of the first two does, the only reasonable solution imo
is to move FIX_OHCI1394_BASE out of the boot-time-only range into
the permanent range (unless the other two can be moved into the
boot-time-only range). And obviously the hard-coded numbers
should be eliminated from head_64.S.
Jan
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
==============================================================================
TOPIC: perf: New PERF_EVENT_IOC_INJECT ioctl
http://groups.google.com/group/linux.kernel/t/88914be5e6b28d7d?hl=en
==============================================================================
== 1 of 5 ==
Date: Wed, Feb 3 2010 1:20 am
From: Frederic Weisbecker
The PERF_EVENT_IOC_INJECT perf event ioctl can be used to inject
events, if the corresponding pmu and event support it.
On trace events, it will call the inject callback, usually reserved
for events that need to catch up with the user.
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
Cc: Li Zefan <lizf@cn.fujitsu.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
---
include/linux/perf_event.h | 2 ++
kernel/perf_event.c | 23 +++++++++++++++++++++++
2 files changed, 25 insertions(+), 0 deletions(-)
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 556b0f4..d2e83f0 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -228,6 +228,7 @@ struct perf_event_attr {
#define PERF_EVENT_IOC_PERIOD _IOW('$', 4, __u64)
#define PERF_EVENT_IOC_SET_OUTPUT _IO ('$', 5)
#define PERF_EVENT_IOC_SET_FILTER _IOW('$', 6, char *)
+#define PERF_EVENT_IOC_INJECT _IO ('$', 7)
enum perf_event_ioc_flags {
PERF_IOC_FLAG_GROUP = 1U << 0,
@@ -513,6 +514,7 @@ struct pmu {
void (*disable) (struct perf_event *event);
void (*read) (struct perf_event *event);
void (*unthrottle) (struct perf_event *event);
+ void (*inject) (struct perf_event *event);
};
/**
diff --git a/kernel/perf_event.c b/kernel/perf_event.c
index 40f8b07..e4dfd12 100644
--- a/kernel/perf_event.c
+++ b/kernel/perf_event.c
@@ -2151,6 +2151,26 @@ unlock:
return ret;
}
+static void __perf_event_inject(void *info)
+{
+ struct perf_event *event = info;
+
+ event->pmu->inject(event);
+}
+
+static int perf_event_inject(struct perf_event *event)
+{
+ struct perf_event_context *ctx = event->ctx;
+ struct task_struct *task = ctx->task;
+
+ if (!event->pmu->inject || task)
+ return -EINVAL;
+
+ smp_call_function_single(event->cpu, __perf_event_inject, event, 1);
+
+ return 0;
+}
+
static int perf_event_set_output(struct perf_event *event, int output_fd);
static int perf_event_set_filter(struct perf_event *event, void __user *arg);
@@ -2183,6 +2203,9 @@ static long perf_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
case PERF_EVENT_IOC_SET_FILTER:
return perf_event_set_filter(event, (void __user *)arg);
+ case PERF_EVENT_IOC_INJECT:
+ return perf_event_inject(event);
+
default:
return -ENOTTY;
}
--
1.6.2.3
== 2 of 5 ==
Date: Wed, Feb 3 2010 1:30 am
From: Frederic Weisbecker
On Wed, Feb 03, 2010 at 10:14:29AM +0100, Frederic Weisbecker wrote:
> The PERF_EVENT_IOC_INJECT perf event ioctl can be used to inject
> events, if the corresponding pmu and event supports it.
>
> On trace events, it will call the inject callback, usually reserved
> for events that need to catch up with the user.
>
> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
> Cc: Steven Rostedt <rostedt@goodmis.org>
> Cc: Paul Mackerras <paulus@samba.org>
> Cc: Hitoshi Mitake <mitake@dcl.info.waseda.ac.jp>
> Cc: Li Zefan <lizf@cn.fujitsu.com>
> Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
> Cc: Masami Hiramatsu <mhiramat@redhat.com>
> Cc: Jens Axboe <jens.axboe@oracle.com>
> ---
> include/linux/perf_event.h | 2 ++
> kernel/perf_event.c | 23 +++++++++++++++++++++++
> 2 files changed, 25 insertions(+), 0 deletions(-)
>
> diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
> index 556b0f4..d2e83f0 100644
> --- a/include/linux/perf_event.h
> +++ b/include/linux/perf_event.h
> @@ -228,6 +228,7 @@ struct perf_event_attr {
> #define PERF_EVENT_IOC_PERIOD _IOW('$', 4, __u64)
> #define PERF_EVENT_IOC_SET_OUTPUT _IO ('$', 5)
> #define PERF_EVENT_IOC_SET_FILTER _IOW('$', 6, char *)
> +#define PERF_EVENT_IOC_INJECT _IO ('$', 7)
>
> enum perf_event_ioc_flags {
> PERF_IOC_FLAG_GROUP = 1U << 0,
> @@ -513,6 +514,7 @@ struct pmu {
> void (*disable) (struct perf_event *event);
> void (*read) (struct perf_event *event);
> void (*unthrottle) (struct perf_event *event);
> + void (*inject) (struct perf_event *event);
> };
>
> /**
> diff --git a/kernel/perf_event.c b/kernel/perf_event.c
> index 40f8b07..e4dfd12 100644
> --- a/kernel/perf_event.c
> +++ b/kernel/perf_event.c
> @@ -2151,6 +2151,26 @@ unlock:
> return ret;
> }
>
> +static void __perf_event_inject(void *info)
> +{
> + struct perf_event *event = info;
> +
> + event->pmu->inject(event);
> +}
> +
> +static int perf_event_inject(struct perf_event *event)
> +{
> + struct perf_event_context *ctx = event->ctx;
> + struct task_struct *task = ctx->task;
> +
> + if (!event->pmu->inject || task)
> + return -EINVAL;
> +
> + smp_call_function_single(event->cpu, __perf_event_inject, event, 1);
> +
Ah, I forgot to say: injection is only supported on cpu-bound
events (non-task-bound), because if the event is task-bound we
can't ensure it is scheduled in at the time we inject. I'm not
sure how to fix that.
== 3 of 5 ==
Date: Wed, Feb 3 2010 2:30 am
From: Jens Axboe
On Wed, Feb 03 2010, Frederic Weisbecker wrote:
> Hi,
>
> There are many things that happen in this patchset, treating
> different problems:
>
> - remove most of the string copy overhead in fast path
> - open the way for lock-class-oriented profiling (as
> opposed to lock-instance profiling; both can be useful
> in different ways)
> - remove the buffer multiplexing (less contention)
> - event injection support
> - remove violent lock event recursion (only 2 of the 3; the remaining
> one is detailed below).
>
> Some differences, by running:
> perf lock record perf sched pipe -l 100000
>
> Before the patchset:
>
> Total time: 91.015 [sec]
>
> 910.157300 usecs/op
> 1098 ops/sec
>
> After this patchset applied:
>
> Total time: 43.706 [sec]
>
> 437.062080 usecs/op
> 2288 ops/sec
This does a lot better here, even if it isn't exactly stellar
performance. It generates a LOT of data:
root@nehalem:/dev/shm # time perf lock rec -fg ls
perf.data perf.data.old
[ perf record: Woken up 0 times to write data ]
[ perf record: Captured and wrote 137.224 MB perf.data (~5995421 samples) ]
real 0m3.320s
user 0m0.000s
sys 0m3.220s
Without -g, it has 1.688s real and 1.590s sys time.
So while this is orders of magnitude better than the previous patchset,
it's still not anywhere near lean. But I expect you know that; just
consider this an 'I tested it and this is what happened' report :-)
--
Jens Axboe
== 4 of 5 ==
Date: Wed, Feb 3 2010 2:30 am
From: Ingo Molnar
* Frederic Weisbecker <fweisbec@gmail.com> wrote:
> Hi,
>
> There are many things that happen in this patchset, treating
> different problems:
>
> - remove most of the string copy overhead in fast path
> - open the way for lock-class-oriented profiling (as
> opposed to lock-instance profiling; both can be useful
> in different ways)
> - remove the buffer multiplexing (less contention)
> - event injection support
> - remove violent lock event recursion (only 2 of the 3; the remaining
> one is detailed below).
>
> Some differences, by running:
> perf lock record perf sched pipe -l 100000
>
> Before the patchset:
>
> Total time: 91.015 [sec]
>
> 910.157300 usecs/op
> 1098 ops/sec
>
> After this patchset applied:
>
> Total time: 43.706 [sec]
>
> 437.062080 usecs/op
> 2288 ops/sec
Fantastic!
There's one area that needs more thought i think: the dump-all-classes
init-event-injector approach. It is async, hence we could lose events if
there are a lot of lock classes to dump. Plus we eventually want to use your
injector approach for other things as well (such as to dump the state of a
collection of tasks) - so i think we want it to be more synchronous.
One approach would be to allow a gradual read() to deplete the dump. Also, i
think the 'state dump' events should be separate from regular init events.
Filters attached to these events will automatically cause the dumping to be
restricted to the filter set. For example in the case of tasks one could dump
only tasks from a particular UID, by adding a 'uid == 1234' filter before
the dump (on a per-task basis, so the filtering is nicely task-local).
What do you think?
Ingo
== 5 of 5 ==
Date: Wed, Feb 3 2010 2:40 am
From: Peter Zijlstra
On Wed, 2010-02-03 at 10:14 +0100, Frederic Weisbecker wrote:
> - event injection support
I like the idea; I'm just not sure about the name and API details.
I would like to call it something like collection support, and the API
should have an iterator-like interface.
That is, it should not blindly dump all events from a collection at
once, praying the output buffer is large enough, but either dump a
specified number and/or stop dumping when the buffer is full, allowing
a second invocation to continue where it left off once the buffer
content has been consumed.
Which brings us to the ioctl() interface: we can do the above using
ioctl()s, but it seems to me we're starting to get ioctl()-heavy and
should be looking at alternative ways of extending this.
Anybody have any bright ideas?
==============================================================================
TOPIC: networking tcp: Writing tcp socket be atomic
http://groups.google.com/group/linux.kernel/t/7ae4e30c83ee8a1e?hl=en
==============================================================================
== 1 of 1 ==
Date: Wed, Feb 3 2010 1:30 am
From: "john ye"
Subject: [PATCH 2.6.27.7-9-pae #7 SMP 1/1] networking tcp: Writing tcp socket be atomic
from: John Ye <johny@asimco.com.cn>
Writing to a tcp socket is not atomic in the current kernel. When a socket is
written to by multiple processes or threads, the other end can read
interleaved garbage data.
This simple patch resolves the issue by making stream socket writes atomic
up to a configurable size limit.
Similar to a filesystem pipe (which has a maximum atomic write size), an
atomic socket can be written to by multiple processes or threads.
But it goes beyond a pipe: a pipe can only be shared by processes on the
local system, while an atomic stream socket can be used to send data
between machines without any user-level locking involved.
How to test this patch:
1) apply the patch to the kernel and modules, then reboot into the patched kernel
2) #define TCP_ATOMIC 20 in your test.c (TCP_ATOMIC is defined as 20 in the kernel)
3) create a tcp socket and set the atomic option, for example:

int val = 512;
socklen_t len = sizeof(val);
if (setsockopt(s, IPPROTO_TCP, TCP_ATOMIC, &val, len) == -1) {
        perror("setsockopt");
        return -1;
}

will set the atomic max data size to 512 bytes.
To get the current atomic size for socket s:

val = 0;
len = sizeof(val);
if (getsockopt(s, IPPROTO_TCP, TCP_ATOMIC, &val, &len) == -1) {
        perror("getsockopt");
        return -1;
}
4) Then connect to a tcp server and fork a child process.
Let both the main process and the child process write() or send() their own
data blocks to the server.
On the server, the received bytes will be interleaved if TCP_ATOMIC is not set.
(I have a testing C program ready.)
Signed-off-by: John Ye (Seeker) johny@asimco.com.cn
---
--- linux/net/ipv4/tcp.c 2008-12-05 09:48:57.000000000 +0800
+++ linux/net/ipv4/tcp.c 2010-02-03 15:15:11.000000000 +0800
@@ -822,6 +822,7 @@
int mss_now, size_goal;
int err, copied;
long timeo;
+ int atomic; /* is atomic write? johnye. Feb 2, 2010 */
lock_sock(sk);
TCP_CHECK_TIMER(sk);
@@ -849,6 +850,11 @@
if (sk->sk_err || (sk->sk_shutdown & SEND_SHUTDOWN))
goto do_error;
+
+ /* for multi-seg data or too big chunk, no atomic. johnye. */
+ atomic = tp->atomic_size;
+ if(iovlen > 1 || iov->iov_len > atomic) atomic = 0;
+
while (--iovlen >= 0) {
int seglen = iov->iov_len;
unsigned char __user *from = iov->iov_base;
@@ -889,14 +895,28 @@
if (copy > seglen)
copy = seglen;
+ /* if atomic write. johnye */
+ if (atomic)
+ copy = seglen;
+
/* Where to copy to? */
if (skb_tailroom(skb) > 0) {
/* We have some space in skb head. Superb! */
- if (copy > skb_tailroom(skb))
+ /* consider atomic write, johnye */
+ if (copy > skb_tailroom(skb)) {
+ if(atomic)
+ goto skb_page_start; /* q mark yet, johnye */
+
copy = skb_tailroom(skb);
+ }
if ((err = skb_add_data(skb, from, copy)) != 0)
goto do_fault;
- } else {
+
+ goto skb_page_done;
+ //} else {
+ }
+ skb_page_start:
+ {
int merge = 0;
int i = skb_shinfo(skb)->nr_frags;
struct page *page = TCP_PAGE(sk);
@@ -925,8 +945,17 @@
} else
off = 0;
- if (copy > PAGE_SIZE - off)
- copy = PAGE_SIZE - off;
+ /* consider atomic write, johnye */
+ if (copy > PAGE_SIZE - off) {
+ if (atomic && page) {
+ put_page(page);
+ TCP_PAGE(sk) = page = NULL;
+ off = 0;
+ merge = 0;
+ } else {
+ copy = PAGE_SIZE - off;
+ }
+ }
if (!sk_wmem_schedule(sk, copy))
goto wait_for_memory;
@@ -968,6 +997,7 @@
TCP_OFF(sk) = off + copy;
}
+ skb_page_done:
if (!copied)
TCP_SKB_CB(skb)->flags &= ~TCPCB_FLAG_PSH;
@@ -2019,6 +2049,16 @@
lock_sock(sk);
switch (optname) {
+
+ /* set the atomic write max size. johnye */
+ case TCP_ATOMIC:
+ if(val > 1024) {
+ err = -EINVAL;
+ break;
+ }
+ tp->atomic_size = val;
+ break;
+
case TCP_MAXSEG:
/* Values greater than interface MTU won't take effect. However
* at the point when this call is done we typically don't yet
@@ -2276,6 +2316,12 @@
return -EINVAL;
switch (optname) {
+
+ /* get the atomic write max size. johnye */
+ case TCP_ATOMIC:
+ val = tp->atomic_size;
+ break;
+
case TCP_MAXSEG:
val = tp->mss_cache;
if (!val && ((1 << sk->sk_state) & (TCPF_CLOSE | TCPF_LISTEN)))
--- linux/include/linux/tcp.h 2008-10-10 06:13:53.000000000 +0800
+++ linux/include/linux/tcp.h 2010-02-03 13:54:55.000000000 +0800
@@ -97,6 +97,8 @@
#define TCP_CONGESTION 13 /* Congestion control algorithm */
#define TCP_MD5SIG 14 /* TCP MD5 Signature (RFC2385) */
+#define TCP_ATOMIC 20 /* atomic TCP socket writing */
+
#define TCPI_OPT_TIMESTAMPS 1
#define TCPI_OPT_SACK 2
#define TCPI_OPT_WSCALE 4
@@ -411,6 +413,7 @@