  1. Mar 16, 2023
  2. Jul 15, 2019
  3. Jun 19, 2019
  4. Jun 08, 2019
  5. Mar 25, 2019
  6. Jan 27, 2019
    • sched/topology: Introduce a sysctl for Energy Aware Scheduling · 8d5d0cfb
      Quentin Perret authored
      
      In its current state, Energy Aware Scheduling (EAS) starts automatically
      on asymmetric platforms having an Energy Model (EM). However, there are
      users who want to have an EM (for thermal management for example), but
      don't want EAS with it.
      
      In order to let users disable EAS explicitly, introduce a new sysctl
      called 'sched_energy_aware'. It is enabled by default so that EAS can
      start automatically on platforms where it makes sense. Flipping it to 0
      rebuilds the scheduling domains and disables EAS.
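
A minimal userspace sketch of reading and flipping the switch described above. It assumes the sysctl appears at /proc/sys/kernel/sched_energy_aware (the usual procfs mapping for a kernel.* sysctl; the commit message does not spell out the path), and the helper names are hypothetical.

```python
from pathlib import Path

# Assumed procfs location of the sysctl; not stated in the commit message.
SCHED_ENERGY_AWARE = Path("/proc/sys/kernel/sched_energy_aware")


def eas_enabled(path: Path = SCHED_ENERGY_AWARE) -> bool:
    """Return True if the sysctl currently reads as non-zero."""
    return int(path.read_text().strip()) != 0


def set_eas(enabled: bool, path: Path = SCHED_ENERGY_AWARE) -> None:
    """Write 1 or 0 to the sysctl (requires root on a real system)."""
    path.write_text("1\n" if enabled else "0\n")
```

Passing a different `path` makes the helpers easy to exercise against an ordinary file before touching a live system.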
      
      Signed-off-by: Quentin Perret <quentin.perret@arm.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: adharmap@codeaurora.org
      Cc: chris.redpath@arm.com
      Cc: currojerez@riseup.net
      Cc: dietmar.eggemann@arm.com
      Cc: edubezval@gmail.com
      Cc: gregkh@linuxfoundation.org
      Cc: javi.merino@kernel.org
      Cc: joel@joelfernandes.org
      Cc: juri.lelli@redhat.com
      Cc: morten.rasmussen@arm.com
      Cc: patrick.bellasi@arm.com
      Cc: pkondeti@codeaurora.org
      Cc: rjw@rjwysocki.net
      Cc: skannan@codeaurora.org
      Cc: smuckle@google.com
      Cc: srinivas.pandruvada@linux.intel.com
      Cc: thara.gopinath@linaro.org
      Cc: tkjos@google.com
      Cc: valentin.schneider@arm.com
      Cc: vincent.guittot@linaro.org
      Cc: viresh.kumar@linaro.org
      Link: https://lkml.kernel.org/r/20181203095628.11858-11-quentin.perret@arm.com
      
      
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  7. Jan 08, 2019
    • docs: Revamp tainted-kernels.rst to make it more comprehensible · 896dd323
      Thorsten Leemhuis authored
      
      Add a section about decoding /proc/sys/kernel/tainted, create a more
      understandable intro and hopefully explain better the taint flags in
      bug, oops or panic messages. Only thing missing then is a table that
      quickly describes the various bits and taint flags before going into more
      detail, so add that as well.
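
Decoding the value read from /proc/sys/kernel/tainted amounts to testing each bit against the flag table. The sketch below is an illustration, not the kernel-chktaint script: the bit-to-letter mapping is assumed from the taint table as documented around the time of this commit (later kernels may define more bits), and `decode_taint` is a hypothetical helper name.

```python
from typing import List

# Bit number -> (flag letter, short description).
# Assumed from the documented taint table; newer kernels may add bits.
TAINT_FLAGS = {
    0: ("P", "proprietary module was loaded"),
    1: ("F", "module was force loaded"),
    2: ("S", "SMP kernel oops on an officially SMP incapable processor"),
    3: ("R", "module was force unloaded"),
    4: ("M", "processor reported a Machine Check Exception"),
    5: ("B", "bad page referenced or some unexpected page flags"),
    6: ("U", "taint requested by userspace application"),
    7: ("D", "kernel died recently (OOPS or BUG)"),
    8: ("A", "ACPI table overridden by user"),
    9: ("W", "kernel issued warning"),
    10: ("C", "staging driver was loaded"),
    11: ("I", "workaround for bug in platform firmware applied"),
    12: ("O", "out-of-tree module was loaded"),
    13: ("E", "unsigned module was loaded"),
    14: ("L", "soft lockup occurred"),
    15: ("K", "kernel has been live patched"),
    16: ("X", "auxiliary taint, used by distros"),
    17: ("T", "kernel built with the struct randomization plugin"),
}


def decode_taint(value: int) -> List[str]:
    """Return 'letter: description' for every known bit set in value."""
    return [
        f"{letter}: {text}"
        for bit, (letter, text) in sorted(TAINT_FLAGS.items())
        if value & (1 << bit)
    ]
```

For example, a value of 4097 has bits 0 and 12 set, so the helper reports the P and O flags; a value of 0 yields an empty list (not tainted).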
      
      That table is partly based on a section from Documentation/sysctl/kernel.txt,
      but a bit more compact. To avoid confusion I added the shortened version to
      kernel.txt; the same table is used in three different places now:
      ./tools/debugging/kernel-chktaint,
      Documentation/admin-guide/tainted-kernels.rst and
      Documentation/sysctl/kernel.txt
      
      During review of v1 (see above) a number of existing issues with the text
      were raised, like outdated usages as well as incomplete or missing
      descriptions.  Address most of those as well.
      
      Signed-off-by: Thorsten Leemhuis <linux@leemhuis.info>
      [jc: tightened up changelog]
      Signed-off-by: Jonathan Corbet <corbet@lwn.net>
  8. Jan 04, 2019
  9. Sep 04, 2018
  10. Aug 22, 2018
    • ipc: reorganize initialization of kern_ipc_perm.seq · e2652ae6
      Manfred Spraul authored
      ipc_addid() initializes kern_ipc_perm.seq after having called idr_alloc()
      (within ipc_idr_alloc()).
      
      Thus a parallel semop() or msgrcv() that uses ipc_obtain_object_check()
      may see an uninitialized value.
      
      The patch moves the initialization of kern_ipc_perm.seq before the calls
      of idr_alloc().
      
      Notes:
      1) This patch has a user space visible side effect:
      If /proc/sys/kernel/*_next_id is used (i.e.: checkpoint/restore) and
      if semget()/msgget()/shmget() fails in the final step of adding the id
      to the rhash tree, then .._next_id is cleared. Before the patch, it
      remained unmodified.
      
      There is no change of the behavior after a successful ..get() call: It
      always clears .._next_id, there is no impact to non checkpoint/restore
      code as that code does not use .._next_id.
      
      2) The patch correctly documents that after a call to ipc_idr_alloc(),
      the full tear-down sequence must be used. The callers of ipc_addid()
      do not fulfill that, i.e. more bugfixes are required.
      
      The patch is a squash of a patch from Dmitry and my own changes.
      
      Link: http://lkml.kernel.org/r/20180712185241.4017-3-manfred@colorfullife.com
      
      
      Reported-by: <syzbot+2827ef6b3385deb07eaf@syzkaller.appspotmail.com>
      Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Cc: Davidlohr Bueso <dbueso@suse.de>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: Michal Hocko <mhocko@suse.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • kernel/hung_task.c: allow to set checking interval separately from timeout · a2e51445
      Dmitry Vyukov authored
      Currently the task hung checking interval is equal to the timeout; as a
      result, a hang is detected anywhere between timeout and 2*timeout.  This is fine for
      most interactive environments, but this hurts automated testing setups
      (syzbot).  In an automated setup we need to strictly order CPU lockup <
      RCU stall < workqueue lockup < task hung < silent loss, so that RCU stall
      is not detected as task hung and task hung is not detected as silent
      machine loss.  The large variance in task hung detection timeout requires
      setting silent machine loss timeout to a very large value (e.g.  if task
      hung is 3 mins, then silent loss needs to be set to ~7 mins).  The
      additional 3 minutes significantly reduce testing efficiency because
      usually we crash kernel within a minute, and this can add hours to bug
      localization process as it needs to do dozens of tests.
      
      Allow setting checking interval separately from timeout.  This allows
      setting the timeout to, say, 3 minutes, but the checking interval to 10 secs.
      
      The interval is controlled via a new hung_task_check_interval_secs sysctl,
      similar to the existing hung_task_timeout_secs sysctl.  The default value
      of 0 results in the current behavior: checking interval is equal to
      timeout.
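
The arithmetic behind the detection window can be sketched as follows. This is a simplified model (it ignores scheduling jitter and does not model an interval larger than the timeout), and the function name is hypothetical; only the two sysctl semantics described above are taken from the commit message.

```python
from typing import Tuple


def hung_task_detection_window(timeout_secs: int,
                               check_interval_secs: int = 0) -> Tuple[int, int]:
    """Return (earliest, latest) detection time in seconds for a hung task.

    A check interval of 0 means "check once per timeout", which is the
    default behaviour described in the commit message.
    """
    interval = check_interval_secs or timeout_secs
    # A task must be stuck for at least the timeout; in the worst case the
    # previous check ran just before the timeout expired.
    return timeout_secs, timeout_secs + interval
```

With a 3 minute timeout the default gives a window of 180 to 360 seconds, while a 10 second checking interval narrows it to 180 to 190 seconds.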
      
      [akpm@linux-foundation.org: update hung_task_timeout_max's comment]
      Link: http://lkml.kernel.org/r/20180611111004.203513-1-dvyukov@google.com
      
      
      Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  11. Jul 08, 2018
  12. Apr 11, 2018
  13. Dec 21, 2017
  14. Dec 11, 2017
  15. Aug 22, 2017
  16. Aug 14, 2017
    • seccomp: Sysctl to display available actions · 8e5f1ad1
      Tyler Hicks authored
      
      This patch creates a read-only sysctl containing an ordered list of
      seccomp actions that the kernel supports. The ordering, from left to
      right, is the lowest action value (kill) to the highest action value
      (allow). Currently, a read of the sysctl file would return "kill trap
      errno trace allow". The contents of this sysctl file can be useful for
      userspace code as well as the system administrator.
      
      The path to the sysctl is:
      
        /proc/sys/kernel/seccomp/actions_avail
      
      libseccomp and other userspace code can easily determine which actions
      the current kernel supports. The set of actions supported by the current
      kernel may be different than the set of action macros found in kernel
      headers that were installed where the userspace code was built.
      
      In addition, this sysctl will allow system administrators to know which
      actions are supported by the kernel and make it easier to configure
      exactly what seccomp logs through the audit subsystem. Support for this
      level of logging configuration will come in a future patch.
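
A sketch of how userspace code might consume this file, assuming only the format stated above (a single whitespace-separated, ordered list). The helper names are hypothetical.

```python
from pathlib import Path
from typing import List

ACTIONS_AVAIL = Path("/proc/sys/kernel/seccomp/actions_avail")


def parse_actions(text: str) -> List[str]:
    """Split the sysctl contents into action names, preserving their order."""
    return text.split()


def kernel_supports(action: str, path: Path = ACTIONS_AVAIL) -> bool:
    """Return True if the running kernel lists the given action."""
    return action in parse_actions(path.read_text())
```

Because the list is ordered from lowest to highest action value, the first and last entries of `parse_actions()` give the most and least restrictive actions the kernel knows about.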
      
      Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
      Signed-off-by: Kees Cook <keescook@chromium.org>
  17. Mar 03, 2017
  18. Oct 25, 2016
    • x86/dumpstack: Remove raw stack dump · 0ee1dd9f
      Josh Poimboeuf authored
      
      For mostly historical reasons, the x86 oops dump shows the raw stack
      values:
      
        ...
        [registers]
        Stack:
         ffff880079af7350 ffff880079905400 0000000000000000 ffffc900008f3ae0
         ffffffffa0196610 0000000000000001 00010000ffffffff 0000000087654321
         0000000000000002 0000000000000000 0000000000000000 0000000000000000
        Call Trace:
        ...
      
      This seems to be an artifact from long ago, and probably isn't needed
      anymore.  It generally just adds noise to the dump, and it can be
      actively harmful because it leaks kernel addresses.
      
      Linus says:
      
        "The stack dump actually goes back to forever, and it used to be
         useful back in 1992 or so. But it used to be useful mainly because
         stacks were simpler and we didn't have very good call traces anyway. I
         definitely remember having used them - I just do not remember having
         used them in the last ten+ years.
      
         Of course, it's still true that if you can trigger an oops, you've
         likely already lost the security game, but since the stack dump is so
         useless, let's aim to just remove it and make games like the above
         harder."
      
      This also removes the related 'kstack=' cmdline option and the
      'kstack_depth_to_print' sysctl.
      
      Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Brian Gerst <brgerst@gmail.com>
      Cc: Denys Vlasenko <dvlasenk@redhat.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/e83bd50df52d8fe88e94d2566426ae40d813bf8f.1477405374.git.jpoimboe@redhat.com
      
      
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  19. Oct 24, 2016
  20. Aug 02, 2016
    • printk: add kernel parameter to control writes to /dev/kmsg · 750afe7b
      Borislav Petkov authored
      Add a "printk.devkmsg" kernel command line parameter which controls how
      userspace writes into /dev/kmsg.  It has three options:
      
       * ratelimit - ratelimit logging from userspace.
       * on  - unlimited logging from userspace
       * off - logging from userspace gets ignored
      
      The default setting is to ratelimit the messages written to it.
      
      This changes the kernel default setting of "on" to "ratelimit" and we do
      that because we want to keep userspace spamming /dev/kmsg to sane
      levels.  This matters especially when a small kernel log buffer wraps
      around and messages get lost.  So the ratelimiting setting should be a
      sane setting where kernel messages should have a bit higher chance of
      survival from all the spamming.
      
      It additionally does not limit logging to /dev/kmsg while the system is
      booting if we haven't disabled it on the command line.
      
      Furthermore, we can control the logging from a lower priority sysctl
      interface - kernel.printk_devkmsg.
      
      That interface will succeed only if printk.devkmsg *hasn't* been
      supplied on the command line.  If it has, then printk.devkmsg is a
      one-time setting which remains for the duration of the system lifetime.
      This "locking" of the setting is to prevent userspace from changing the
      logging on us through sysctl(2).
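
The precedence between the command line and the sysctl can be modelled with a small toy class. This is not the kernel implementation; the class, its method names and the boolean return convention are all hypothetical, and only the three modes, the default and the locking rule come from the description above.

```python
from typing import Optional

VALID_MODES = ("ratelimit", "on", "off")


class DevkmsgSetting:
    """Toy model: a command-line value locks the setting against sysctl writes."""

    def __init__(self, cmdline_value: Optional[str] = None) -> None:
        if cmdline_value is not None and cmdline_value not in VALID_MODES:
            raise ValueError(f"unknown mode: {cmdline_value!r}")
        # Without a command-line value the default is to ratelimit.
        self.mode = cmdline_value or "ratelimit"
        # Supplying printk.devkmsg on the command line makes it a one-time setting.
        self.locked = cmdline_value is not None

    def sysctl_write(self, value: str) -> bool:
        """Return True if the write took effect, False if it was rejected."""
        if self.locked or value not in VALID_MODES:
            return False
        self.mode = value
        return True
```

A setting created without a command-line value accepts `sysctl_write("on")`, while one created with `DevkmsgSetting("off")` rejects every later write and stays at "off".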
      
      This patch is based on previous patches from Linus and Steven.
      
      [bp@suse.de: fixes]
        Link: http://lkml.kernel.org/r/20160719072344.GC25563@nazgul.tnic
      Link: http://lkml.kernel.org/r/20160716061745.15795-3-bp@alien8.de
      
      
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Cc: Dave Young <dyoung@redhat.com>
      Cc: Franck Bui <fbui@suse.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  21. Jun 15, 2016
    • rcu: sysctl: Panic on RCU Stall · 088e9d25
      Daniel Bristot de Oliveira authored
      
      It is not always easy to determine the cause of an RCU stall just by
      analysing the RCU stall messages, mainly when the problem is caused
      by the indirect starvation of RCU threads, for example when preempt_rcu
      is not awakened due to the starvation of a timer softirq.
      
      We have been hard coding panic() in the RCU stall functions for
      some time while testing the kernel-rt. But this is not possible in
      some scenarios, like when supporting customers.
      
      This patch implements the sysctl kernel.panic_on_rcu_stall. If
      set to 1, the system will panic() when an RCU stall takes place,
      enabling the capture of a vmcore. The vmcore provides a way to analyze
      all kernel/tasks states, helping to pinpoint the culprit and the
      solution for the stall.
      
      The kernel.panic_on_rcu_stall sysctl is disabled by default.
      
      Changes from v1:
      - Fixed a typo in the git log
      - The if(sysctl_panic_on_rcu_stall) panic() is in a static function
      - Fixed the CONFIG_TINY_RCU compilation issue
      - The var sysctl_panic_on_rcu_stall is now __read_mostly
      
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
      Cc: Josh Triplett <josh@joshtriplett.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Cc: Lai Jiangshan <jiangshanlai@gmail.com>
      Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Reviewed-by: Josh Triplett <josh@joshtriplett.org>
      Reviewed-by: Arnaldo Carvalho de Melo <acme@kernel.org>
      Tested-by: "Luis Claudio R. Goncalves" <lgoncalv@redhat.com>
      Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com>
      Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
  22. May 17, 2016
    • perf core: Separate accounting of contexts and real addresses in a stack trace · c85b0334
      Arnaldo Carvalho de Melo authored
      The perf_sample->ip_callchain->nr value includes all the entries in the
      ip_callchain->ip[] array, real addresses and PERF_CONTEXT_{KERNEL,USER,etc},
      while what the user expects is that what is in the kernel.perf_event_max_stack
      sysctl or in the upcoming per event perf_event_attr.sample_max_stack knob be
      honoured in terms of IP addresses in the stack trace.
      
      So allocate a bunch of extra entries for contexts, and do the accounting
      via perf_callchain_entry_ctx struct members.
      
      A new sysctl, kernel.perf_event_max_contexts_per_stack is also
      introduced for investigating possible bugs in the callchain
      implementation by some arch.
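
The idea of counting context markers and real addresses against separate limits can be sketched in userspace as follows. This is an illustration of the accounting only, not the kernel's perf_callchain_entry_ctx code: the function name is hypothetical, the marker threshold is assumed from the perf UAPI header (PERF_CONTEXT_MAX as (u64)-4095), and over-limit entries are simply dropped here.

```python
from typing import Iterable, List, Tuple

U64 = 1 << 64
# Assumed from the perf UAPI header: values at or above (u64)-4095 are
# PERF_CONTEXT_* markers rather than real instruction addresses.
PERF_CONTEXT_MAX = U64 - 4095


def record_callchain(entries: Iterable[int], max_stack: int,
                     max_contexts: int) -> Tuple[List[int], int, int]:
    """Return (kept entries, number of addresses, number of context markers).

    Real addresses are capped by max_stack and markers by max_contexts,
    each counted on its own; entries over either cap are dropped.
    """
    kept: List[int] = []
    nr_addrs = nr_contexts = 0
    for ip in entries:
        if ip >= PERF_CONTEXT_MAX:
            if nr_contexts >= max_contexts:
                continue
            nr_contexts += 1
        else:
            if nr_addrs >= max_stack:
                continue
            nr_addrs += 1
        kept.append(ip)
    return kept, nr_addrs, nr_contexts
```

With this split, a cap of two addresses keeps two real addresses regardless of how many context markers surround them, which is the behaviour the commit message says users expect.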
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: He Kuang <hekuang@huawei.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Milian Wolff <milian.wolff@kdab.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: Wang Nan <wangnan0@huawei.com>
      Cc: Zefan Li <lizefan@huawei.com>
      Link: http://lkml.kernel.org/n/tip-3b4wnqk340c4sg4gwkfdi9yk@git.kernel.org
      
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  23. May 10, 2016
  24. Apr 27, 2016
    • perf core: Allow setting up max frame stack depth via sysctl · c5dfd78e
      Arnaldo Carvalho de Melo authored
      
      The default remains 127, which is good for most cases, and not even hit
      most of the time, but then for some cases, as reported by Brendan, 1024+
      deep frames are appearing on the radar for things like groovy, ruby.
      
      And in some workloads putting a _lower_ cap on this may make sense. One
      that is per event still needs to be put in place, though.
      
      The new file is:
      
        # cat /proc/sys/kernel/perf_event_max_stack
        127
      
      Changing it:
      
        # echo 256 > /proc/sys/kernel/perf_event_max_stack
        # cat /proc/sys/kernel/perf_event_max_stack
        256
      
      But as soon as there is some event using callchains we get:
      
        # echo 512 > /proc/sys/kernel/perf_event_max_stack
        -bash: echo: write error: Device or resource busy
        #
      
      Because we only allocate the callchain percpu data structures when there
      is a user, which allows for changing the max easily, it's just a matter
      of having no callchain users at that point.
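
A script that adjusts the cap has to cope with the "Device or resource busy" case shown above. A minimal sketch, with a hypothetical helper name, that treats EBUSY as "try again later" and lets any other error propagate:

```python
import errno
from pathlib import Path

PERF_EVENT_MAX_STACK = Path("/proc/sys/kernel/perf_event_max_stack")


def try_set_max_stack(depth: int, path: Path = PERF_EVENT_MAX_STACK) -> bool:
    """Try to set the cap; return False if the kernel reports EBUSY."""
    try:
        path.write_text(f"{depth}\n")
    except OSError as exc:
        if exc.errno == errno.EBUSY:
            # Some event is using callchains; retry once there are no users.
            return False
        raise
    return True
```

The caller can then decide whether to wait for the callchain users to go away or to keep the current value.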
      
      Reported-and-Tested-by: Brendan Gregg <brendan.d.gregg@gmail.com>
      Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Acked-by: David Ahern <dsahern@gmail.com>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: He Kuang <hekuang@huawei.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Milian Wolff <milian.wolff@kdab.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Cc: Wang Nan <wangnan0@huawei.com>
      Cc: Zefan Li <lizefan@huawei.com>
      Link: http://lkml.kernel.org/r/20160426002928.GB16708@kernel.org
      
      
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  25. Mar 08, 2016
  26. Feb 09, 2016
    • sched/debug: Make schedstats a runtime tunable that is disabled by default · cb251765
      Mel Gorman authored
      
      schedstats is very useful during debugging and performance tuning but it
      incurs overhead to calculate the stats. As such, even though it can be
      disabled at build time, it is often enabled as the information is useful.
      
      This patch adds a kernel command-line and sysctl tunable to enable or
      disable schedstats on demand (when it's built in). It is disabled
      by default as someone who knows they need it can also learn to enable
      it when necessary.
      
      The benefits are dependent on how scheduler-intensive the workload is.
      If it is then the patch reduces the number of cycles spent calculating
      the stats with a small benefit from reducing the cache footprint of the
      scheduler.
      
      These measurements were taken from a 48-core 2-socket
      machine with Xeon(R) E5-2670 v3 cpus although they were also tested on a
      single-socket 8-core machine with an Intel i7-3770 processor.
      
      netperf-tcp
                                 4.5.0-rc1             4.5.0-rc1
                                   vanilla          nostats-v3r1
      Hmean    64         560.45 (  0.00%)      575.98 (  2.77%)
      Hmean    128        766.66 (  0.00%)      795.79 (  3.80%)
      Hmean    256        950.51 (  0.00%)      981.50 (  3.26%)
      Hmean    1024      1433.25 (  0.00%)     1466.51 (  2.32%)
      Hmean    2048      2810.54 (  0.00%)     2879.75 (  2.46%)
      Hmean    3312      4618.18 (  0.00%)     4682.09 (  1.38%)
      Hmean    4096      5306.42 (  0.00%)     5346.39 (  0.75%)
      Hmean    8192     10581.44 (  0.00%)    10698.15 (  1.10%)
      Hmean    16384    18857.70 (  0.00%)    18937.61 (  0.42%)
      
      Small gains here, UDP_STREAM showed nothing interesting and neither did
      the TCP_RR tests. The gains on the 8-core machine were very similar.
      
      tbench4
                                       4.5.0-rc1             4.5.0-rc1
                                         vanilla          nostats-v3r1
      Hmean    mb/sec-1         500.85 (  0.00%)      522.43 (  4.31%)
      Hmean    mb/sec-2         984.66 (  0.00%)     1018.19 (  3.41%)
      Hmean    mb/sec-4        1827.91 (  0.00%)     1847.78 (  1.09%)
      Hmean    mb/sec-8        3561.36 (  0.00%)     3611.28 (  1.40%)
      Hmean    mb/sec-16       5824.52 (  0.00%)     5929.03 (  1.79%)
      Hmean    mb/sec-32      10943.10 (  0.00%)    10802.83 ( -1.28%)
      Hmean    mb/sec-64      15950.81 (  0.00%)    16211.31 (  1.63%)
      Hmean    mb/sec-128     15302.17 (  0.00%)    15445.11 (  0.93%)
      Hmean    mb/sec-256     14866.18 (  0.00%)    15088.73 (  1.50%)
      Hmean    mb/sec-512     15223.31 (  0.00%)    15373.69 (  0.99%)
      Hmean    mb/sec-1024    14574.25 (  0.00%)    14598.02 (  0.16%)
      Hmean    mb/sec-2048    13569.02 (  0.00%)    13733.86 (  1.21%)
      Hmean    mb/sec-3072    12865.98 (  0.00%)    13209.23 (  2.67%)
      
      Small gains of 2-4% at low thread counts and otherwise flat.  The
      gains on the 8-core machine were slightly different:
      
      tbench4 on 8-core i7-3770 single socket machine
      Hmean    mb/sec-1        442.59 (  0.00%)      448.73 (  1.39%)
      Hmean    mb/sec-2        796.68 (  0.00%)      794.39 ( -0.29%)
      Hmean    mb/sec-4       1322.52 (  0.00%)     1343.66 (  1.60%)
      Hmean    mb/sec-8       2611.65 (  0.00%)     2694.86 (  3.19%)
      Hmean    mb/sec-16      2537.07 (  0.00%)     2609.34 (  2.85%)
      Hmean    mb/sec-32      2506.02 (  0.00%)     2578.18 (  2.88%)
      Hmean    mb/sec-64      2511.06 (  0.00%)     2569.16 (  2.31%)
      Hmean    mb/sec-128     2313.38 (  0.00%)     2395.50 (  3.55%)
      Hmean    mb/sec-256     2110.04 (  0.00%)     2177.45 (  3.19%)
      Hmean    mb/sec-512     2072.51 (  0.00%)     2053.97 ( -0.89%)
      
      In contrast, this shows a relatively steady 2-3% gain at higher thread
      counts. Due to the nature of the patch and the type of workload, it's
      not a surprise that the result will depend on the CPU used.
      
      hackbench-pipes
                               4.5.0-rc1             4.5.0-rc1
                                 vanilla          nostats-v3r1
      Amean    1        0.0637 (  0.00%)      0.0660 ( -3.59%)
      Amean    4        0.1229 (  0.00%)      0.1181 (  3.84%)
      Amean    7        0.1921 (  0.00%)      0.1911 (  0.52%)
      Amean    12       0.3117 (  0.00%)      0.2923 (  6.23%)
      Amean    21       0.4050 (  0.00%)      0.3899 (  3.74%)
      Amean    30       0.4586 (  0.00%)      0.4433 (  3.33%)
      Amean    48       0.5910 (  0.00%)      0.5694 (  3.65%)
      Amean    79       0.8663 (  0.00%)      0.8626 (  0.43%)
      Amean    110      1.1543 (  0.00%)      1.1517 (  0.22%)
      Amean    141      1.4457 (  0.00%)      1.4290 (  1.16%)
      Amean    172      1.7090 (  0.00%)      1.6924 (  0.97%)
      Amean    192      1.9126 (  0.00%)      1.9089 (  0.19%)
      
      Some small gains and losses and while the variance data is not included,
      it's close to the noise. The UMA machine did not show anything particularly
      different.
      
      pipetest
                                   4.5.0-rc1             4.5.0-rc1
                                     vanilla          nostats-v2r2
      Min         Time        4.13 (  0.00%)        3.99 (  3.39%)
      1st-qrtle   Time        4.38 (  0.00%)        4.27 (  2.51%)
      2nd-qrtle   Time        4.46 (  0.00%)        4.39 (  1.57%)
      3rd-qrtle   Time        4.56 (  0.00%)        4.51 (  1.10%)
      Max-90%     Time        4.67 (  0.00%)        4.60 (  1.50%)
      Max-93%     Time        4.71 (  0.00%)        4.65 (  1.27%)
      Max-95%     Time        4.74 (  0.00%)        4.71 (  0.63%)
      Max-99%     Time        4.88 (  0.00%)        4.79 (  1.84%)
      Max         Time        4.93 (  0.00%)        4.83 (  2.03%)
      Mean        Time        4.48 (  0.00%)        4.39 (  1.91%)
      Best99%Mean Time        4.47 (  0.00%)        4.39 (  1.91%)
      Best95%Mean Time        4.46 (  0.00%)        4.38 (  1.93%)
      Best90%Mean Time        4.45 (  0.00%)        4.36 (  1.98%)
      Best50%Mean Time        4.36 (  0.00%)        4.25 (  2.49%)
      Best10%Mean Time        4.23 (  0.00%)        4.10 (  3.13%)
      Best5%Mean  Time        4.19 (  0.00%)        4.06 (  3.20%)
      Best1%Mean  Time        4.13 (  0.00%)        4.00 (  3.39%)
      
      Small improvement and similar gains were seen on the UMA machine.
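
The percentages in the tables above appear to be relative changes against the vanilla column, with the sign chosen so that a positive number is an improvement. A small sketch under that assumption (the function names are hypothetical):

```python
def gain_higher_is_better(baseline: float, patched: float) -> float:
    """Percentage gain for throughput-style metrics (the Hmean rows)."""
    return (patched - baseline) / baseline * 100.0


def gain_lower_is_better(baseline: float, patched: float) -> float:
    """Percentage gain for time-style metrics (the Amean and Time rows)."""
    return (baseline - patched) / baseline * 100.0
```

For the first netperf-tcp row, `gain_higher_is_better(560.45, 575.98)` rounds to 2.77, and for the pipetest Min row, `gain_lower_is_better(4.13, 3.99)` rounds to 3.39, matching the tables. Some other rows differ in the last digit when recomputed this way, presumably because the published percentages were derived from unrounded measurements.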
      
      The gain is small but it stands to reason that doing less work in the
      scheduler is a good thing. The downside is that the lack of schedstats and
      tracepoints may be surprising to experts doing performance analysis until
      they find the existence of the schedstats= parameter or schedstats sysctl.
      It will be automatically activated for latencytop and sleep profiling to
      alleviate the problem. For tracepoints, there is a simple warning as it's
      not safe to activate schedstats in the context when it's known the tracepoint
      may be wanted but is unavailable.
      
      Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
      Reviewed-by: Matt Fleming <matt@codeblueprint.co.uk>
      Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <mgalbraith@suse.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1454663316-22048-1-git-send-email-mgorman@techsingularity.net
      
      
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  27. Jan 26, 2016
  28. Jan 21, 2016
  29. Dec 19, 2015
    • Documentation: Document kernel.panic_on_io_nmi sysctl · 9f318e3f
      Hidehiro Kawai authored
      
      kernel.panic_on_io_nmi sysctl was introduced by commit
      
        5211a242 ("x86: Add sysctl to allow panic on IOCK NMI error")
      
      but its documentation is missing. So, add it.
      
      Signed-off-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
      Requested-by: Borislav Petkov <bp@alien8.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Chris Metcalf <cmetcalf@ezchip.com>
      Cc: Don Zickus <dzickus@redhat.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Heinrich Schuchardt <xypron.glpk@gmx.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: kexec@lists.infradead.org
      Cc: linux-doc@vger.kernel.org
      Cc: Manfred Spraul <manfred@colorfullife.com>
      Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Nicolas Iooss <nicolas.iooss_linux@m4x.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Seth Jennings <sjenning@redhat.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ulrich Obergfell <uobergfe@redhat.com>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: x86-ml <x86@kernel.org>
      Link: http://lkml.kernel.org/r/20151210014637.25437.71903.stgit@softrs
      
      
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
  30. Nov 06, 2015