
Compare revisions

Changes are shown as if the source revision was being merged into the target revision. Learn more about comparing revisions.

Target project: sw/misc/linux
Commits on Source (14291)
Showing
with 766 additions and 104 deletions
@@ -42,8 +42,30 @@ Description:
 modification of EVM-protected metadata and
 disable all further modification of policy
-Note that once a key has been loaded, it will no longer be
-possible to enable metadata modification.
+Echoing a value is additive, the new value is added to the
+existing initialization flags.
+For example, after::
+  echo 2 ><securityfs>/evm
+another echo can be performed::
+  echo 1 ><securityfs>/evm
+and the resulting value will be 3.
+Note that once an HMAC key has been loaded, it will no longer
+be possible to enable metadata modification. Signaling that an
+HMAC key has been loaded will clear the corresponding flag.
+For example, if the current value is 6 (2 and 4 set)::
+  echo 1 ><securityfs>/evm
+will set the new value to 3 (4 cleared).
+Loading an HMAC key is the only way to disable metadata
+modification.
 Until key loading has been signaled EVM can not create
 or validate the 'security.evm' xattr, but returns
......
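The flag arithmetic described in this hunk can be modelled in a few lines. The following is only a sketch of the documented behaviour, not the kernel's code: the constant names are illustrative, the meaning of value 4 is inferred from the "6 becomes 3" example, and raising `PermissionError` for a rejected write is an assumption.

```python
# Illustrative names; only the numeric values 1, 2 and 4 appear in the text.
HMAC_KEY_LOADED = 1
OTHER_INIT_FLAG = 2          # meaning not stated in this hunk
ALLOW_METADATA_WRITES = 4    # inferred from the "6 -> 3" example


def write_evm(current: int, written: int) -> int:
    """Model `echo <written> > <securityfs>/evm` on the current flag word."""
    # Once an HMAC key is loaded, metadata modification can no longer be enabled.
    if (current & HMAC_KEY_LOADED) and (written & ALLOW_METADATA_WRITES):
        raise PermissionError("HMAC key loaded: cannot enable metadata writes")
    new = current | written              # echoing a value is additive
    if new & HMAC_KEY_LOADED:
        new &= ~ALLOW_METADATA_WRITES    # loading an HMAC key clears the flag
    return new


if __name__ == "__main__":
    print(write_evm(write_evm(0, 2), 1))  # the first example in the text
    print(write_evm(6, 1))                # the second example in the text
```

Both calls reproduce the values given in the description (3 in each case).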
@@ -107,13 +107,14 @@ Description:
 described in ATA8 7.16 and 7.17. Only valid if
 the device is not a PM.
-pio_mode: (RO) Transfer modes supported by the device when
-          in PIO mode. Mostly used by PATA device.
+pio_mode: (RO) PIO transfer mode used by the device.
+          Mostly used by PATA devices.
-xfer_mode: (RO) Current transfer mode
+xfer_mode: (RO) Current transfer mode. Mostly used by
+           PATA devices.
-dma_mode: (RO) Transfer modes supported by the device when
-          in DMA mode. Mostly used by PATA device.
+dma_mode: (RO) DMA transfer mode used by the device.
+          Mostly used by PATA devices.
 class: (RO) Device class. Can be "ata" for disk,
        "atapi" for packet device, "pmp" for PM, or
......
@@ -138,7 +138,7 @@ Description:
 Raw capacitance measurement from channel Y. Units after
 application of scale and offset are nanofarads.
-What: /sys/.../iio:deviceX/in_capacitanceY-in_capacitanceZ_raw
+What: /sys/.../iio:deviceX/in_capacitanceY-capacitanceZ_raw
 KernelVersion: 3.2
 Contact: linux-iio@vger.kernel.org
 Description:
......
-What: /sys/bus/iio/devices/iio:deviceX/conversion_mode
+What: /sys/bus/iio/devices/iio:deviceX/in_conversion_mode
 KernelVersion: 4.2
 Contact: linux-iio@vger.kernel.org
 Description:
......
@@ -480,15 +480,17 @@ Description: information about CPUs heterogeneity.
 cpu_capacity: capacity of cpu#.
 What: /sys/devices/system/cpu/vulnerabilities
+      /sys/devices/system/cpu/vulnerabilities/gather_data_sampling
+      /sys/devices/system/cpu/vulnerabilities/itlb_multihit
+      /sys/devices/system/cpu/vulnerabilities/l1tf
+      /sys/devices/system/cpu/vulnerabilities/mds
       /sys/devices/system/cpu/vulnerabilities/meltdown
+      /sys/devices/system/cpu/vulnerabilities/mmio_stale_data
+      /sys/devices/system/cpu/vulnerabilities/spec_store_bypass
       /sys/devices/system/cpu/vulnerabilities/spectre_v1
       /sys/devices/system/cpu/vulnerabilities/spectre_v2
-      /sys/devices/system/cpu/vulnerabilities/spec_store_bypass
-      /sys/devices/system/cpu/vulnerabilities/l1tf
-      /sys/devices/system/cpu/vulnerabilities/mds
       /sys/devices/system/cpu/vulnerabilities/srbds
       /sys/devices/system/cpu/vulnerabilities/tsx_async_abort
-      /sys/devices/system/cpu/vulnerabilities/itlb_multihit
 Date: January 2018
 Contact: Linux kernel mailing list <linux-kernel@vger.kernel.org>
 Description: Information about CPU vulnerabilities
......
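The files listed in this entry can be read like any other sysfs attribute. As a rough sketch (the function name and the `<unreadable>` placeholder are illustrative, not part of any kernel interface), a script could collect every status line in the directory:

```python
from pathlib import Path

# Directory named in the ABI entry above.
VULN_DIR = Path("/sys/devices/system/cpu/vulnerabilities")


def read_vulnerabilities(directory: Path = VULN_DIR) -> dict[str, str]:
    """Return {file name: status line} for every file in the directory."""
    report: dict[str, str] = {}
    if not directory.is_dir():
        return report  # e.g. a kernel without this interface
    for entry in sorted(directory.iterdir()):
        if not entry.is_file():
            continue
        try:
            report[entry.name] = entry.read_text().strip()
        except OSError:
            report[entry.name] = "<unreadable>"
    return report


if __name__ == "__main__":
    for name, status in read_vulnerabilities().items():
        print(f"{name:24} {status}")
```

Iterating over the directory rather than hard-coding names means newly added files such as `gather_data_sampling` and `mmio_stale_data` are picked up automatically.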
What: /sys/kernel/oops_count
Date: November 2022
KernelVersion: 6.2.0
Contact: Linux Kernel Hardening List <linux-hardening@vger.kernel.org>
Description:
Shows how many times the system has Oopsed since last boot.
What: /sys/kernel/warn_count
Date: November 2022
KernelVersion: 6.2.0
Contact: Linux Kernel Hardening List <linux-hardening@vger.kernel.org>
Description:
Shows how many times the system has Warned since last boot.
@@ -90,7 +90,8 @@ Triggers can be set on more than one psi metric and more than one trigger
 for the same psi metric can be specified. However for each trigger a separate
 file descriptor is required to be able to poll it separately from others,
 therefore for each trigger a separate open() syscall should be made even
-when opening the same psi interface file.
+when opening the same psi interface file. Write operations to a file descriptor
+with an already existing psi trigger will fail with EBUSY.
 Monitors activate only when system enters stall state for the monitored
 psi metric and deactivates upon exit from the stall state. While system is
......
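The one-trigger-per-descriptor rule added in this hunk can be illustrated as follows. This is a sketch under assumptions that are not in the excerpt: the trigger string format (`<some|full> <stall us> <window us>`, NUL-terminated) and any path passed in, such as `/proc/pressure/memory`, come from the wider PSI documentation, and the helper names are made up.

```python
import errno
import os


def format_trigger(kind: str, stall_us: int, window_us: int) -> bytes:
    """Build a trigger string (format assumed from the PSI documentation)."""
    if kind not in ("some", "full"):
        raise ValueError(f"unknown psi kind: {kind!r}")
    return f"{kind} {stall_us} {window_us}".encode() + b"\0"


def register_trigger(path: str, trigger: bytes) -> int:
    """Open a fresh descriptor for this trigger and write it once."""
    fd = os.open(path, os.O_RDWR | os.O_NONBLOCK)
    try:
        os.write(fd, trigger)
    except OSError:
        os.close(fd)
        raise
    return fd


def second_write_is_rejected(fd: int, trigger: bytes) -> bool:
    """True if writing again to an fd that already has a trigger gives EBUSY."""
    try:
        os.write(fd, trigger)
    except OSError as exc:
        return exc.errno == errno.EBUSY
    return False
```

To monitor two thresholds on the same file, call `register_trigger()` twice so that each trigger gets its own `open()`; the returned descriptors can then be polled separately.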
@@ -82,6 +82,8 @@ Brief summary of control files.
  memory.swappiness               set/show swappiness parameter of vmscan
                                  (See sysctl's vm.swappiness)
  memory.move_charge_at_immigrate set/show controls of moving charges
+                                 This knob is deprecated and shouldn't be
+                                 used.
  memory.oom_control              set/show oom controls.
  memory.numa_stat                show the number of memory usage per numa
                                  node
@@ -745,8 +747,15 @@ NOTE2:
 It is recommended to set the soft limit always below the hard limit,
 otherwise the hard limit will take precedence.
-8. Move charges at task migration
-=================================
+8. Move charges at task migration (DEPRECATED!)
+===============================================
+
+THIS IS DEPRECATED!
+
+It's expensive and unreliable! It's better practice to launch workload
+tasks directly from inside their target cgroup. Use dedicated workload
+cgroups to allow fine-grained policy adjustments without having to
+move physical pages between control domains.
 Users can move charges associated with a task along with task migration, that
 is, uncharge task's pages from the old cgroup and charge them to the new cgroup.
......
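The recommended alternative, starting a task inside its target cgroup instead of migrating it later, might look like the sketch below. It assumes the cgroup directory already exists and that joining is done by writing a PID to its `cgroup.procs` file; `launch_in_cgroup` is a hypothetical helper, not an existing API.

```python
import os
import subprocess
from pathlib import Path


def launch_in_cgroup(cgroup_dir: str, argv: list[str]) -> subprocess.Popen:
    """Start argv so that the process joins cgroup_dir before it execs."""
    procs_file = Path(cgroup_dir) / "cgroup.procs"

    def join_cgroup() -> None:
        # Runs in the child between fork() and exec(), so memory the
        # workload touches afterwards is charged to the target cgroup.
        procs_file.write_text(str(os.getpid()))

    return subprocess.Popen(argv, preexec_fn=join_cgroup)
```

Because the child joins the cgroup before the workload starts, no charges need to be moved afterwards, which is the situation the deprecated knob was meant to handle.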
@@ -3002,10 +3002,10 @@
   65 = /dev/infiniband/issm1 Second InfiniBand IsSM device
   ...
   127 = /dev/infiniband/issm63 63rd InfiniBand IsSM device
-  128 = /dev/infiniband/uverbs0 First InfiniBand verbs device
-  129 = /dev/infiniband/uverbs1 Second InfiniBand verbs device
+  192 = /dev/infiniband/uverbs0 First InfiniBand verbs device
+  193 = /dev/infiniband/uverbs1 Second InfiniBand verbs device
   ...
-  159 = /dev/infiniband/uverbs31 31st InfiniBand verbs device
+  223 = /dev/infiniband/uverbs31 31st InfiniBand verbs device
 232 char Biometric Devices
   0 = /dev/biometric/sensor0/fingerprint first fingerprint sensor on first device
......
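The corrected InfiniBand minor numbers follow a simple base-plus-index rule. A small sketch, with the IsSM base of 64 inferred from the `65 = issm1` and `127 = issm63` lines and the function names chosen for illustration:

```python
ISSM_MINOR_BASE = 64     # inferred: 65 = issm1, 127 = issm63
UVERBS_MINOR_BASE = 192  # value after this change (previously 128)


def issm_minor(index: int) -> int:
    """Minor number of /dev/infiniband/issm<index>."""
    if not 0 <= index <= 63:
        raise ValueError("issm index must be in 0..63")
    return ISSM_MINOR_BASE + index


def uverbs_minor(index: int) -> int:
    """Minor number of /dev/infiniband/uverbs<index>."""
    if not 0 <= index <= 31:
        raise ValueError("uverbs index must be in 0..31")
    return UVERBS_MINOR_BASE + index
```

With the old base of 128, `uverbs0` would have collided with the upper end of the IsSM range's neighbours in the table; the new base places the verbs devices at 192 to 223.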
.. SPDX-License-Identifier: GPL-2.0
GDS - Gather Data Sampling
==========================
Gather Data Sampling is a hardware vulnerability which allows unprivileged
speculative access to data which was previously stored in vector registers.
Problem
-------
When a gather instruction performs loads from memory, different data elements
are merged into the destination vector register. However, when a gather
instruction that is transiently executed encounters a fault, stale data from
architectural or internal vector registers may get transiently forwarded to the
destination vector register instead. This will allow a malicious attacker to
infer stale data using typical side channel techniques like cache timing
attacks. GDS is a purely sampling-based attack.
The attacker uses gather instructions to infer the stale vector register data.
The victim does not need to do anything special other than use the vector
registers. The victim does not need to use gather instructions to be
vulnerable.
Because the buffers are shared between Hyper-Threads, cross-Hyper-Thread
attacks are possible.
Attack scenarios
----------------
Without mitigation, GDS can infer stale data across virtually all
permission boundaries:
Non-enclaves can infer SGX enclave data
Userspace can infer kernel data
Guests can infer data from hosts
Guests can infer data from other guests
Users can infer data from other users
Because of this, it is important to ensure that the mitigation stays enabled in
lower-privilege contexts like guests and when running outside SGX enclaves.
The hardware enforces the mitigation for SGX. Likewise, VMMs should ensure
that guests are not allowed to disable the GDS mitigation. If a host erred and
allowed this, a guest could theoretically disable GDS mitigation, mount an
attack, and re-enable it.
Mitigation mechanism
--------------------
This issue is mitigated in microcode. The microcode defines the following new
bits:
  ================================ === ============================
  IA32_ARCH_CAPABILITIES[GDS_CTRL] R/O Enumerates GDS vulnerability
                                       and mitigation support.
  IA32_ARCH_CAPABILITIES[GDS_NO]   R/O Processor is not vulnerable.
  IA32_MCU_OPT_CTRL[GDS_MITG_DIS]  R/W Disables the mitigation
                                       0 by default.
  IA32_MCU_OPT_CTRL[GDS_MITG_LOCK] R/W Locks GDS_MITG_DIS=0. Writes
                                       to GDS_MITG_DIS are ignored
                                       Can't be cleared once set.
  ================================ === ============================
GDS can also be mitigated on systems that don't have updated microcode by
disabling AVX. This can be done by setting gather_data_sampling="force" or
"clearcpuid=avx" on the kernel command-line.
If used, these options will disable AVX use by turning off XSAVE YMM support.
However, the processor will still enumerate AVX support. Userspace that
does not follow proper AVX enumeration to check both AVX *and* XSAVE YMM
support will break.
Mitigation control on the kernel command line
---------------------------------------------
The mitigation can be disabled by setting "gather_data_sampling=off" or
"mitigations=off" on the kernel command line. Not specifying either will default
to the mitigation being enabled. Specifying "gather_data_sampling=force" will
use the microcode mitigation when available or disable AVX on affected systems
where the microcode hasn't been updated to include the mitigation.
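The command-line rules in this section amount to a small decision table. The following is a sketch of that table as described here, not the kernel's implementation; the returned strings are informal labels, and letting an "off" option win over "force" when both are present is an assumption.

```python
def gds_mitigation(cmdline: str, microcode_has_mitigation: bool) -> str:
    """Describe the GDS mitigation chosen for an affected CPU."""
    options = cmdline.split()
    if "mitigations=off" in options or "gather_data_sampling=off" in options:
        return "mitigation disabled"
    if microcode_has_mitigation:
        # Default behaviour, and also what "force" does when microcode is present.
        return "microcode mitigation"
    if "gather_data_sampling=force" in options:
        return "AVX disabled"
    return "no mitigation available (microcode missing)"
```

For example, `gds_mitigation("quiet gather_data_sampling=force", False)` yields the AVX-disabling fallback, while the same call with updated microcode yields the microcode mitigation.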
GDS System Information
------------------------
The kernel provides vulnerability status information through sysfs. For
GDS this can be accessed by the following sysfs file:
/sys/devices/system/cpu/vulnerabilities/gather_data_sampling
The possible values contained in this file are:
  ============================== =============================================
  Not affected                   Processor not vulnerable.
  Vulnerable                     Processor vulnerable and mitigation disabled.
  Vulnerable: No microcode       Processor vulnerable and microcode is missing
                                 mitigation.
  Mitigation: AVX disabled,
  no microcode                   Processor is vulnerable and microcode is missing
                                 mitigation. AVX disabled as mitigation.
  Mitigation: Microcode          Processor is vulnerable and mitigation is in
                                 effect.
  Mitigation: Microcode (locked) Processor is vulnerable and mitigation is in
                                 effect and cannot be disabled.
  Unknown: Dependent on
  hypervisor status              Running on a virtual guest processor that is
                                 affected but with no way to know if host
                                 processor is mitigated or vulnerable.
  ============================== =============================================
GDS Default mitigation
----------------------
The updated microcode will enable the mitigation by default. The kernel's
default action is to leave the mitigation enabled.
@@ -15,3 +15,5 @@ are configurable at compile, boot or run time.
    tsx_async_abort
    multihit.rst
    special-register-buffer-data-sampling.rst
+   processor_mmio_stale_data.rst
+   gather_data_sampling.rst
=========================================
Processor MMIO Stale Data Vulnerabilities
=========================================
Processor MMIO Stale Data Vulnerabilities are a class of memory-mapped I/O
(MMIO) vulnerabilities that can expose data. The sequences of operations for
exposing data range from simple to very complex. Because most of the
vulnerabilities require the attacker to have access to MMIO, many environments
are not affected. System environments using virtualization where MMIO access is
provided to untrusted guests may need mitigation. These vulnerabilities are
not transient execution attacks. However, these vulnerabilities may propagate
stale data into core fill buffers where the data can subsequently be inferred
by an unmitigated transient execution attack. Mitigation for these
vulnerabilities includes a combination of microcode update and software
changes, depending on the platform and usage model. Some of these mitigations
are similar to those used to mitigate Microarchitectural Data Sampling (MDS) or
those used to mitigate Special Register Buffer Data Sampling (SRBDS).
Data Propagators
================
Propagators are operations that result in stale data being copied or moved from
one microarchitectural buffer or register to another. Processor MMIO Stale Data
Vulnerabilities are operations that may result in stale data being directly
read into an architectural, software-visible state or sampled from a buffer or
register.
Fill Buffer Stale Data Propagator (FBSDP)
-----------------------------------------
Stale data may propagate from fill buffers (FB) into the non-coherent portion
of the uncore on some non-coherent writes. Fill buffer propagation by itself
does not make stale data architecturally visible. Stale data must be propagated
to a location where it is subject to reading or sampling.
Sideband Stale Data Propagator (SSDP)
-------------------------------------
The sideband stale data propagator (SSDP) is limited to the client (including
Intel Xeon server E3) uncore implementation. The sideband response buffer is
shared by all client cores. For non-coherent reads that go to sideband
destinations, the uncore logic returns 64 bytes of data to the core, including
both requested data and unrequested stale data, from a transaction buffer and
the sideband response buffer. As a result, stale data from the sideband
response and transaction buffers may now reside in a core fill buffer.
Primary Stale Data Propagator (PSDP)
------------------------------------
The primary stale data propagator (PSDP) is limited to the client (including
Intel Xeon server E3) uncore implementation. Similar to the sideband response
buffer, the primary response buffer is shared by all client cores. For some
processors, MMIO primary reads will return 64 bytes of data to the core fill
buffer including both requested data and unrequested stale data. This is
similar to the sideband stale data propagator.
Vulnerabilities
===============
Device Register Partial Write (DRPW) (CVE-2022-21166)
-----------------------------------------------------
Some endpoint MMIO registers incorrectly handle writes that are smaller than
the register size. Instead of aborting the write or only copying the correct
subset of bytes (for example, 2 bytes for a 2-byte write), more bytes than
specified by the write transaction may be written to the register. On
processors affected by FBSDP, this may expose stale data from the fill buffers
of the core that created the write transaction.
Shared Buffers Data Sampling (SBDS) (CVE-2022-21125)
----------------------------------------------------
After propagators may have moved data around the uncore and copied stale data
into client core fill buffers, processors affected by MFBDS can leak data from
the fill buffer. It is limited to the client (including Intel Xeon server E3)
uncore implementation.
Shared Buffers Data Read (SBDR) (CVE-2022-21123)
------------------------------------------------
It is similar to Shared Buffer Data Sampling (SBDS) except that the data is
directly read into the architectural software-visible state. It is limited to
the client (including Intel Xeon server E3) uncore implementation.
Affected Processors
===================
Not all the CPUs are affected by all the variants. For instance, most
processors for the server market (excluding Intel Xeon E3 processors) are
impacted by only Device Register Partial Write (DRPW).
Below is the list of affected Intel processors [#f1]_:
  =================== ============ =========
  Common name         Family_Model Steppings
  =================== ============ =========
  HASWELL_X           06_3FH       2,4
  SKYLAKE_L           06_4EH       3
  BROADWELL_X         06_4FH       All
  SKYLAKE_X           06_55H       3,4,6,7,11
  BROADWELL_D         06_56H       3,4,5
  SKYLAKE             06_5EH       3
  ICELAKE_X           06_6AH       4,5,6
  ICELAKE_D           06_6CH       1
  ICELAKE_L           06_7EH       5
  ATOM_TREMONT_D      06_86H       All
  LAKEFIELD           06_8AH       1
  KABYLAKE_L          06_8EH       9 to 12
  ATOM_TREMONT        06_96H       1
  ATOM_TREMONT_L      06_9CH       0
  KABYLAKE            06_9EH       9 to 13
  COMETLAKE           06_A5H       2,3,5
  COMETLAKE_L         06_A6H       0,1
  ROCKETLAKE          06_A7H       1
  =================== ============ =========
If a CPU is in the affected processor list, but not affected by a variant, it
is indicated by new bits in MSR IA32_ARCH_CAPABILITIES. As described in a later
section, mitigation largely remains the same for all the variants, i.e. to
clear the CPU fill buffers via VERW instruction.
New bits in MSRs
================
Newer processors and microcode update on existing affected processors added new
bits to IA32_ARCH_CAPABILITIES MSR. These bits can be used to enumerate
specific variants of Processor MMIO Stale Data vulnerabilities and mitigation
capability.
MSR IA32_ARCH_CAPABILITIES
--------------------------
Bit 13 - SBDR_SSDP_NO - When set, processor is not affected by either the
Shared Buffers Data Read (SBDR) vulnerability or the sideband stale
data propagator (SSDP).
Bit 14 - FBSDP_NO - When set, processor is not affected by the Fill Buffer
Stale Data Propagator (FBSDP).
Bit 15 - PSDP_NO - When set, processor is not affected by Primary Stale Data
Propagator (PSDP).
Bit 17 - FB_CLEAR - When set, VERW instruction will overwrite CPU fill buffer
values as part of MD_CLEAR operations. Processors that do not
enumerate MDS_NO (meaning they are affected by MDS) but that do
enumerate support for both L1D_FLUSH and MD_CLEAR implicitly enumerate
FB_CLEAR as part of their MD_CLEAR support.
Bit 18 - FB_CLEAR_CTRL - Processor supports read and write to MSR
IA32_MCU_OPT_CTRL[FB_CLEAR_DIS]. On such processors, the FB_CLEAR_DIS
bit can be set to cause the VERW instruction to not perform the
FB_CLEAR action. Not all processors that support FB_CLEAR will support
FB_CLEAR_CTRL.
MSR IA32_MCU_OPT_CTRL
---------------------
Bit 3 - FB_CLEAR_DIS - When set, VERW instruction does not perform the FB_CLEAR
action. This may be useful to reduce the performance impact of FB_CLEAR in
cases where system software deems it warranted (for example, when performance
is more critical, or the untrusted software has no MMIO access). Note that
FB_CLEAR_DIS has no impact on enumeration (for example, it does not change
FB_CLEAR or MD_CLEAR enumeration) and it may not be supported on all processors
that enumerate FB_CLEAR.
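The bit positions above can be decoded from raw MSR values as in the following sketch. The values are taken as plain integers here; how they are obtained (for example through a privileged MSR interface) is outside this example. The function names are illustrative, and the implicit FB_CLEAR enumeration through MD_CLEAR and L1D_FLUSH described for bit 17 is deliberately not modelled.

```python
# Bit positions in IA32_ARCH_CAPABILITIES, as listed above.
ARCH_CAP_BITS = {
    13: "SBDR_SSDP_NO",
    14: "FBSDP_NO",
    15: "PSDP_NO",
    17: "FB_CLEAR",
    18: "FB_CLEAR_CTRL",
}
FB_CLEAR_DIS_BIT = 3  # in IA32_MCU_OPT_CTRL


def decode_arch_capabilities(value: int) -> dict[str, bool]:
    """Map each MMIO-related capability name to whether its bit is set."""
    return {name: bool((value >> bit) & 1) for bit, name in ARCH_CAP_BITS.items()}


def verw_clears_fill_buffers(arch_cap: int, mcu_opt_ctrl: int) -> bool:
    """True if VERW performs FB_CLEAR given explicit enumeration only."""
    caps = decode_arch_capabilities(arch_cap)
    if not caps["FB_CLEAR"]:
        return False
    disabled = caps["FB_CLEAR_CTRL"] and bool((mcu_opt_ctrl >> FB_CLEAR_DIS_BIT) & 1)
    return not disabled
```

Note that FB_CLEAR_DIS is only consulted when FB_CLEAR_CTRL is enumerated, matching the statement that not every processor with FB_CLEAR supports the control bit.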
Mitigation
==========
Like MDS, all variants of Processor MMIO Stale Data vulnerabilities have the
same mitigation strategy to force the CPU to clear the affected buffers before
an attacker can extract the secrets.
This is achieved by using the otherwise unused and obsolete VERW instruction in
combination with a microcode update. The microcode clears the affected CPU
buffers when the VERW instruction is executed.
Kernel reuses the MDS function to invoke the buffer clearing:
mds_clear_cpu_buffers()
On MDS affected CPUs, the kernel already invokes CPU buffer clear on
kernel/userspace, hypervisor/guest and C-state (idle) transitions. No
additional mitigation is needed on such CPUs.
For CPUs not affected by MDS or TAA, mitigation is needed only for the attacker
with MMIO capability. Therefore, VERW is not required for kernel/userspace. For
virtualization case, VERW is only needed at VMENTER for a guest with MMIO
capability.
Mitigation points
-----------------
Return to user space
^^^^^^^^^^^^^^^^^^^^
Same mitigation as MDS when affected by MDS/TAA, otherwise no mitigation
needed.
C-State transition
^^^^^^^^^^^^^^^^^^
Control register writes by CPU during C-state transition can propagate data
from fill buffer to uncore buffers. Execute VERW before C-state transition to
clear CPU fill buffers.
Guest entry point
^^^^^^^^^^^^^^^^^
Same mitigation as MDS when processor is also affected by MDS/TAA, otherwise
execute VERW at VMENTER only for MMIO capable guests. On CPUs not affected by
MDS/TAA, guest without MMIO access cannot extract secrets using Processor MMIO
Stale Data vulnerabilities, so there is no need to execute VERW for such guests.
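The three mitigation points reduce to the decision below. It is a sketch of the rules as stated in this section for a CPU that is affected by Processor MMIO Stale Data; the transition labels and the function name are invented for illustration.

```python
def needs_verw(transition: str, affected_by_mds_or_taa: bool,
               guest_has_mmio: bool = False) -> bool:
    """Decide whether VERW should run at a given transition."""
    if transition == "cstate":
        # Control register writes during C-state entry can propagate fill
        # buffer data, so buffers are always cleared first.
        return True
    if transition == "user_return":
        # Only needed when the MDS/TAA mitigation already requires it.
        return affected_by_mds_or_taa
    if transition == "guest_entry":
        # Without MDS/TAA, only MMIO-capable guests can exploit the issue.
        return affected_by_mds_or_taa or guest_has_mmio
    raise ValueError(f"unknown transition: {transition!r}")
```

For instance, on a CPU not affected by MDS or TAA, `needs_verw("guest_entry", False, guest_has_mmio=False)` is false, which corresponds to the "no need to execute VERW for such guests" case.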
Mitigation control on the kernel command line
---------------------------------------------
The kernel command line allows control of the Processor MMIO Stale Data
mitigations at boot time with the option "mmio_stale_data=". The valid
arguments for this option are:
  ========== =================================================================
  full       If the CPU is vulnerable, enable mitigation; CPU buffer clearing
             on exit to userspace and when entering a VM. Idle transitions are
             protected as well. It does not automatically disable SMT.
  full,nosmt Same as full, with SMT disabled on vulnerable CPUs. This is the
             complete mitigation.
  off        Disables mitigation completely.
  ========== =================================================================
If the CPU is affected and mmio_stale_data=off is not supplied on the kernel
command line, then the kernel selects the appropriate mitigation.
Mitigation status information
-----------------------------
The Linux kernel provides a sysfs interface to enumerate the current
vulnerability status of the system: whether the system is vulnerable, and
which mitigations are active. The relevant sysfs file is:
/sys/devices/system/cpu/vulnerabilities/mmio_stale_data
The possible values in this file are:
.. list-table::

   * - 'Not affected'
     - The processor is not vulnerable
   * - 'Vulnerable'
     - The processor is vulnerable, but no mitigation enabled
   * - 'Vulnerable: Clear CPU buffers attempted, no microcode'
     - The processor is vulnerable, but microcode is not updated. The
       mitigation is enabled on a best effort basis.
   * - 'Mitigation: Clear CPU buffers'
     - The processor is vulnerable and the CPU buffer clearing mitigation is
       enabled.
   * - 'Unknown: No mitigations'
     - The processor vulnerability status is unknown because it is
       out of Servicing period. Mitigation is not attempted.
Definitions:
------------
Servicing period: The process of providing functional and security updates to
Intel processors or platforms, utilizing the Intel Platform Update (IPU)
process or other similar mechanisms.
End of Servicing Updates (ESU): ESU is the date at which Intel will no
longer provide Servicing, such as through IPU or other similar update
processes. ESU dates will typically be aligned to end of quarter.
If the processor is vulnerable then the following information is appended to
the above information:
  ======================== ===========================================
  'SMT vulnerable'         SMT is enabled
  'SMT disabled'           SMT is disabled
  'SMT Host state unknown' Kernel runs in a VM, Host SMT state unknown
  ======================== ===========================================
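A status line from the sysfs file can be split into its base status and the appended SMT state. The sketch below assumes the two parts are joined by `"; "`, which the text above does not specify, so treat the separator as an assumption to verify on a real system.

```python
from typing import Optional, Tuple


def parse_mmio_status(line: str) -> Tuple[str, Optional[str]]:
    """Split a status line into (status, SMT state or None)."""
    status, sep, smt = line.strip().partition("; ")  # separator is assumed
    return status, (smt if sep else None)
```

A line without an SMT suffix, such as `Not affected`, is returned with `None` as its second element.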
References
----------
.. [#f1] Affected Processors
https://www.intel.com/content/www/us/en/developer/topic-technology/software-security-guidance/processors-affected-consolidated-product-cpu-model.html
@@ -60,8 +60,8 @@ privileged data touched during the speculative execution.
 Spectre variant 1 attacks take advantage of speculative execution of
 conditional branches, while Spectre variant 2 attacks use speculative
 execution of indirect branches to leak privileged memory.
-See :ref:`[1] <spec_ref1>` :ref:`[5] <spec_ref5>` :ref:`[7] <spec_ref7>`
-:ref:`[10] <spec_ref10>` :ref:`[11] <spec_ref11>`.
+See :ref:`[1] <spec_ref1>` :ref:`[5] <spec_ref5>` :ref:`[6] <spec_ref6>`
+:ref:`[7] <spec_ref7>` :ref:`[10] <spec_ref10>` :ref:`[11] <spec_ref11>`.
 Spectre variant 1 (Bounds Check Bypass)
 ---------------------------------------
@@ -131,6 +131,19 @@ steer its indirect branch speculations to gadget code, and measure the
 speculative execution's side effects left in level 1 cache to infer the
 victim's data.
+
+Yet another variant 2 attack vector is for the attacker to poison the
+Branch History Buffer (BHB) to speculatively steer an indirect branch
+to a specific Branch Target Buffer (BTB) entry, even if the entry isn't
+associated with the source address of the indirect branch. Specifically,
+the BHB might be shared across privilege levels even in the presence of
+Enhanced IBRS.
+
+Currently the only known real-world BHB attack vector is via
+unprivileged eBPF. Therefore, it's highly recommended to not enable
+unprivileged eBPF, especially when eIBRS is used (without retpolines).
+For a full mitigation against BHB attacks, it's recommended to use
+retpolines (or eIBRS combined with retpolines).
 Attack scenarios
 ----------------
@@ -364,13 +377,15 @@ The possible values in this file are:
 - Kernel status:
-  ==================================== =================================
-  'Not affected'                       The processor is not vulnerable
-  'Vulnerable'                         Vulnerable, no mitigation
-  'Mitigation: Full generic retpoline' Software-focused mitigation
-  'Mitigation: Full AMD retpoline'     AMD-specific software mitigation
-  'Mitigation: Enhanced IBRS'          Hardware-focused mitigation
-  ==================================== =================================
+  ======================================== =================================
+  'Not affected'                           The processor is not vulnerable
+  'Mitigation: None'                       Vulnerable, no mitigation
+  'Mitigation: Retpolines'                 Use Retpoline thunks
+  'Mitigation: LFENCE'                     Use LFENCE instructions
+  'Mitigation: Enhanced IBRS'              Hardware-focused mitigation
+  'Mitigation: Enhanced IBRS + Retpolines' Hardware-focused + Retpolines
+  'Mitigation: Enhanced IBRS + LFENCE'     Hardware-focused + LFENCE
+  ======================================== =================================
 - Firmware status: Show if Indirect Branch Restricted Speculation (IBRS) is
   used to protect against Spectre variant 2 attacks when calling firmware (x86 only).
@@ -407,6 +422,14 @@ The possible values in this file are:
   'RSB filling' Protection of RSB on context switch enabled
   ============= ===========================================
+- EIBRS Post-barrier Return Stack Buffer (PBRSB) protection status:
+
+  =========================== =======================================================
+  'PBRSB-eIBRS: SW sequence'  CPU is affected and protection of RSB on VMEXIT enabled
+  'PBRSB-eIBRS: Vulnerable'   CPU is vulnerable
+  'PBRSB-eIBRS: Not affected' CPU is not affected by PBRSB
+  =========================== =======================================================
+
 Full mitigation might require a microcode update from the CPU
 vendor. When the necessary microcode is not available, the kernel will
 report vulnerability.
@@ -456,8 +479,16 @@ Spectre variant 2
 On Intel Skylake-era systems the mitigation covers most, but not all,
 cases. See :ref:`[3] <spec_ref3>` for more details.
-On CPUs with hardware mitigation for Spectre variant 2 (e.g. Enhanced
-IBRS on x86), retpoline is automatically disabled at run time.
+On CPUs with hardware mitigation for Spectre variant 2 (e.g. IBRS
+or enhanced IBRS on x86), retpoline is automatically disabled at run time.
+
+Systems which support enhanced IBRS (eIBRS) enable IBRS protection once at
+boot, by setting the IBRS bit, and they're automatically protected against
+Spectre v2 variant attacks, including cross-thread branch target injections
+on SMT systems (STIBP). In other words, eIBRS enables STIBP too.
+
+Legacy IBRS systems clear the IBRS bit on exit to userspace and
+therefore explicitly enable STIBP for that
 The retpoline mitigation is turned on by default on vulnerable
 CPUs. It can be forced on or off by the administrator
@@ -468,7 +499,7 @@ Spectre variant 2
 before invoking any firmware code to prevent Spectre variant 2 exploits
 using the firmware.
-Using kernel address space randomization (CONFIG_RANDOMIZE_SLAB=y
+Using kernel address space randomization (CONFIG_RANDOMIZE_BASE=y
 and CONFIG_SLAB_FREELIST_RANDOM=y in the kernel configuration) makes
 attacks on the kernel generally more difficult.
...@@ -481,9 +512,12 @@ Spectre variant 2 ...@@ -481,9 +512,12 @@ Spectre variant 2
For Spectre variant 2 mitigation, individual user programs For Spectre variant 2 mitigation, individual user programs
can be compiled with return trampolines for indirect branches. can be compiled with return trampolines for indirect branches.
This protects them from consuming poisoned entries in the branch This protects them from consuming poisoned entries in the branch
target buffer left by malicious software. Alternatively, the target buffer left by malicious software.
programs can disable their indirect branch speculation via prctl()
(See :ref:`Documentation/userspace-api/spec_ctrl.rst <set_spec_ctrl>`). On legacy IBRS systems, at return to userspace, implicit STIBP is disabled
because the kernel clears the IBRS bit. In this case, the userspace programs
can disable indirect branch speculation via prctl() (See
:ref:`Documentation/userspace-api/spec_ctrl.rst <set_spec_ctrl>`).
On x86, this will turn on STIBP to guard against attacks from the On x86, this will turn on STIBP to guard against attacks from the
sibling thread when the user program is running, and use IBPB to sibling thread when the user program is running, and use IBPB to
flush the branch target buffer when switching to/from the program. flush the branch target buffer when switching to/from the program.
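The prctl() request described above can be sketched from Python through ctypes. This is a minimal illustration, not part of the kernel documentation: the numeric constants are copied from ``<linux/prctl.h>``, the helper names ``disable_request`` and ``disable_indirect_branch_speculation`` are hypothetical, and the call only makes sense on a Linux kernel that supports the speculation control interface.

```python
import ctypes
import ctypes.util
import os

# Constants as defined in <linux/prctl.h>.
PR_SET_SPECULATION_CTRL = 53
PR_SPEC_INDIRECT_BRANCH = 1
PR_SPEC_DISABLE = 1 << 2


def disable_request():
    """Return the prctl() arguments that ask the kernel to disable
    indirect branch speculation for the calling task."""
    return (PR_SET_SPECULATION_CTRL, PR_SPEC_INDIRECT_BRANCH, PR_SPEC_DISABLE, 0, 0)


def disable_indirect_branch_speculation():
    """Issue the request (Linux only). Raises OSError if the kernel refuses,
    for example when the control is not available on this CPU."""
    libc = ctypes.CDLL(ctypes.util.find_library("c"), use_errno=True)
    option, *rest = disable_request()
    # prctl() takes an int option followed by unsigned long arguments.
    result = libc.prctl(option, *(ctypes.c_ulong(value) for value in rest))
    if result != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
```

A program would call ``disable_indirect_branch_speculation()`` early, before handling untrusted input; whether the request succeeds depends on the CPU and on the ``spectre_v2_user`` setting.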
...@@ -584,12 +618,13 @@ kernel command line. ...@@ -584,12 +618,13 @@ kernel command line.
Specific mitigations can also be selected manually: Specific mitigations can also be selected manually:
retpoline retpoline auto pick between generic,lfence
replace indirect branches retpoline,generic Retpolines
retpoline,generic retpoline,lfence LFENCE; indirect branch
google's original retpoline retpoline,amd alias for retpoline,lfence
retpoline,amd eibrs enhanced IBRS
AMD-specific minimal thunk eibrs,retpoline enhanced IBRS + Retpolines
eibrs,lfence enhanced IBRS + LFENCE
Not specifying this option is equivalent to Not specifying this option is equivalent to
spectre_v2=auto. spectre_v2=auto.
...@@ -730,7 +765,7 @@ AMD white papers: ...@@ -730,7 +765,7 @@ AMD white papers:
.. _spec_ref6: .. _spec_ref6:
[6] `Software techniques for managing speculation on AMD processors <https://developer.amd.com/wp-content/resources/90343-B_SoftwareTechniquesforManagingSpeculation_WP_7-18Update_FNL.pdf>`_. [6] `Software techniques for managing speculation on AMD processors <https://developer.amd.com/wp-content/resources/Managing-Speculation-on-AMD-Processors.pdf>`_.
ARM white papers: ARM white papers:
......
...@@ -567,6 +567,12 @@ ...@@ -567,6 +567,12 @@
loops can be debugged more effectively on production loops can be debugged more effectively on production
systems. systems.
clocksource.max_cswd_read_retries= [KNL]
Number of clocksource_watchdog() retries due to
external delays before the clock will be marked
unstable. Defaults to three retries, that is,
four attempts to read the clock under test.
clearcpuid=BITNUM[,BITNUM...] [X86] clearcpuid=BITNUM[,BITNUM...] [X86]
Disable CPUID feature X for the kernel. See Disable CPUID feature X for the kernel. See
arch/x86/include/asm/cpufeatures.h for the valid bit arch/x86/include/asm/cpufeatures.h for the valid bit
...@@ -819,10 +825,6 @@ ...@@ -819,10 +825,6 @@
debugpat [X86] Enable PAT debugging debugpat [X86] Enable PAT debugging
decnet.addr= [HW,NET]
Format: <area>[,<node>]
See also Documentation/networking/decnet.txt.
default_hugepagesz= default_hugepagesz=
[same as hugepagesz=] The size of the default [same as hugepagesz=] The size of the default
HugeTLB page size. This is the size represented by HugeTLB page size. This is the size represented by
...@@ -1334,6 +1336,26 @@ ...@@ -1334,6 +1336,26 @@
Format: off | on Format: off | on
default: on default: on
gather_data_sampling=
[X86,INTEL] Control the Gather Data Sampling (GDS)
mitigation.
Gather Data Sampling is a hardware vulnerability which
allows unprivileged speculative access to data which was
previously stored in vector registers.
This issue is mitigated by default in updated microcode.
The mitigation may have a performance impact but can be
disabled. On systems without the microcode mitigation
disabling AVX serves as a mitigation.
force: Disable AVX to mitigate systems without
microcode mitigation. No effect if the microcode
mitigation is present. Known to cause crashes in
userspace with buggy AVX enumeration.
off: Disable GDS mitigation.
gcov_persist= [GCOV] When non-zero (default), profiling data for gcov_persist= [GCOV] When non-zero (default), profiling data for
kernel modules is saved and remains accessible via kernel modules is saved and remains accessible via
debugfs, even when the module is unloaded/reloaded. debugfs, even when the module is unloaded/reloaded.
...@@ -1481,6 +1503,8 @@ ...@@ -1481,6 +1503,8 @@
architectures force reset to be always executed architectures force reset to be always executed
i8042.unlock [HW] Unlock (ignore) the keylock i8042.unlock [HW] Unlock (ignore) the keylock
i8042.kbdreset [HW] Reset device connected to KBD port i8042.kbdreset [HW] Reset device connected to KBD port
i8042.probe_defer
[HW] Allow deferred probing upon i8042 probe errors
i810= [HW,DRM] i810= [HW,DRM]
...@@ -1936,24 +1960,57 @@ ...@@ -1936,24 +1960,57 @@
ivrs_ioapic [HW,X86_64] ivrs_ioapic [HW,X86_64]
Provide an override to the IOAPIC-ID<->DEVICE-ID Provide an override to the IOAPIC-ID<->DEVICE-ID
mapping provided in the IVRS ACPI table. For mapping provided in the IVRS ACPI table.
example, to map IOAPIC-ID decimal 10 to By default, PCI segment is 0, and can be omitted.
PCI device 00:14.0 write the parameter as:
For example, to map IOAPIC-ID decimal 10 to
PCI segment 0x1 and PCI device 00:14.0,
write the parameter as:
ivrs_ioapic=10@0001:00:14.0
Deprecated formats:
* To map IOAPIC-ID decimal 10 to PCI device 00:14.0
write the parameter as:
ivrs_ioapic[10]=00:14.0 ivrs_ioapic[10]=00:14.0
* To map IOAPIC-ID decimal 10 to PCI segment 0x1 and
PCI device 00:14.0 write the parameter as:
ivrs_ioapic[10]=0001:00:14.0
ivrs_hpet [HW,X86_64] ivrs_hpet [HW,X86_64]
Provide an override to the HPET-ID<->DEVICE-ID Provide an override to the HPET-ID<->DEVICE-ID
mapping provided in the IVRS ACPI table. For mapping provided in the IVRS ACPI table.
example, to map HPET-ID decimal 0 to By default, PCI segment is 0, and can be omitted.
PCI device 00:14.0 write the parameter as:
For example, to map HPET-ID decimal 10 to
PCI segment 0x1 and PCI device 00:14.0,
write the parameter as:
ivrs_hpet=10@0001:00:14.0
Deprecated formats:
* To map HPET-ID decimal 0 to PCI device 00:14.0
write the parameter as:
ivrs_hpet[0]=00:14.0 ivrs_hpet[0]=00:14.0
* To map HPET-ID decimal 10 to PCI segment 0x1 and
PCI device 00:14.0 write the parameter as:
ivrs_hpet[10]=0001:00:14.0
ivrs_acpihid [HW,X86_64] ivrs_acpihid [HW,X86_64]
Provide an override to the ACPI-HID:UID<->DEVICE-ID Provide an override to the ACPI-HID:UID<->DEVICE-ID
mapping provided in the IVRS ACPI table. For mapping provided in the IVRS ACPI table.
example, to map UART-HID:UID AMD0020:0 to By default, PCI segment is 0, and can be omitted.
PCI device 00:14.5 write the parameter as:
For example, to map UART-HID:UID AMD0020:0 to
PCI segment 0x1 and PCI device ID 00:14.5,
write the parameter as:
ivrs_acpihid=AMD0020:0@0001:00:14.5
Deprecated formats:
* To map UART-HID:UID AMD0020:0 to PCI segment 0 and
PCI device ID 00:14.5, write the parameter as:
ivrs_acpihid[00:14.5]=AMD0020:0 ivrs_acpihid[00:14.5]=AMD0020:0
* To map UART-HID:UID AMD0020:0 to PCI segment 0x1 and
PCI device ID 00:14.5, write the parameter as:
ivrs_acpihid[0001:00:14.5]=AMD0020:0
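The relationship between the deprecated bracket formats and the current ``=<id>@<device>`` formats listed above can be sketched as a small string rewrite. This is only an illustration for updating boot configurations; the helper name ``to_current_format`` is hypothetical and it does not validate the PCI address.

```python
import re

# Deprecated: ivrs_ioapic[<id>]=<device> and ivrs_hpet[<id>]=<device>
_ID_IN_BRACKETS = re.compile(r"^(ivrs_ioapic|ivrs_hpet)\[(\d+)\]=(.+)$")
# Deprecated: ivrs_acpihid[<device>]=<HID:UID>
_DEVICE_IN_BRACKETS = re.compile(r"^(ivrs_acpihid)\[([^\]]+)\]=(.+)$")


def to_current_format(param):
    """Rewrite a deprecated ivrs_* parameter as <name>=<id>@<device>.
    Parameters that match neither deprecated form are returned unchanged."""
    match = _ID_IN_BRACKETS.match(param)
    if match:
        name, ident, device = match.groups()
        return f"{name}={ident}@{device}"
    match = _DEVICE_IN_BRACKETS.match(param)
    if match:
        name, device, ident = match.groups()
        return f"{name}={ident}@{device}"
    return param
```

For example, ``to_current_format("ivrs_ioapic[10]=0001:00:14.0")`` yields ``ivrs_ioapic=10@0001:00:14.0``, matching the example given for the current format.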
js= [HW,JOY] Analog joystick js= [HW,JOY] Analog joystick
See Documentation/input/joydev/joystick.rst. See Documentation/input/joydev/joystick.rst.
...@@ -2106,8 +2163,12 @@ ...@@ -2106,8 +2163,12 @@
Default is 1 (enabled) Default is 1 (enabled)
kvm-intel.emulate_invalid_guest_state= kvm-intel.emulate_invalid_guest_state=
[KVM,Intel] Enable emulation of invalid guest states [KVM,Intel] Disable emulation of invalid guest state.
Default is 0 (disabled) Ignored if kvm-intel.enable_unrestricted_guest=1, as
guest state is never invalid for unrestricted guests.
This param doesn't apply to nested guests (L2), as KVM
never emulates invalid L2 guest state.
Default is 1 (enabled)
kvm-intel.flexpriority= kvm-intel.flexpriority=
[KVM,Intel] Disable FlexPriority feature (TPR shadow). [KVM,Intel] Disable FlexPriority feature (TPR shadow).
...@@ -2655,20 +2716,22 @@ ...@@ -2655,20 +2716,22 @@
Disable all optional CPU mitigations. This Disable all optional CPU mitigations. This
improves system performance, but it may also improves system performance, but it may also
expose users to several CPU vulnerabilities. expose users to several CPU vulnerabilities.
Equivalent to: nopti [X86,PPC] Equivalent to: gather_data_sampling=off [X86]
kpti=0 [ARM64] kpti=0 [ARM64]
nospectre_v1 [X86,PPC] kvm.nx_huge_pages=off [X86]
l1tf=off [X86]
mds=off [X86]
mmio_stale_data=off [X86]
no_entry_flush [PPC]
no_uaccess_flush [PPC]
nobp=0 [S390] nobp=0 [S390]
nopti [X86,PPC]
nospectre_v1 [X86,PPC]
nospectre_v2 [X86,PPC,S390,ARM64] nospectre_v2 [X86,PPC,S390,ARM64]
spectre_v2_user=off [X86]
spec_store_bypass_disable=off [X86,PPC] spec_store_bypass_disable=off [X86,PPC]
spectre_v2_user=off [X86]
ssbd=force-off [ARM64] ssbd=force-off [ARM64]
l1tf=off [X86]
mds=off [X86]
tsx_async_abort=off [X86] tsx_async_abort=off [X86]
kvm.nx_huge_pages=off [X86]
no_entry_flush [PPC]
no_uaccess_flush [PPC]
Exceptions: Exceptions:
This does not have any effect on This does not have any effect on
...@@ -2690,6 +2753,7 @@ ...@@ -2690,6 +2753,7 @@
Equivalent to: l1tf=flush,nosmt [X86] Equivalent to: l1tf=flush,nosmt [X86]
mds=full,nosmt [X86] mds=full,nosmt [X86]
tsx_async_abort=full,nosmt [X86] tsx_async_abort=full,nosmt [X86]
mmio_stale_data=full,nosmt [X86]
mminit_loglevel= mminit_loglevel=
[KNL] When CONFIG_DEBUG_MEMORY_INIT is set, this [KNL] When CONFIG_DEBUG_MEMORY_INIT is set, this
...@@ -2699,6 +2763,40 @@ ...@@ -2699,6 +2763,40 @@
log everything. Information is printed at KERN_DEBUG log everything. Information is printed at KERN_DEBUG
so loglevel=8 may also need to be specified. so loglevel=8 may also need to be specified.
mmio_stale_data=
[X86,INTEL] Control mitigation for the Processor
MMIO Stale Data vulnerabilities.
Processor MMIO Stale Data is a class of
vulnerabilities that may expose data after an MMIO
operation. Exposed data could originate or end in
the same CPU buffers as affected by MDS and TAA.
Therefore, similar to MDS and TAA, the mitigation
is to clear the affected CPU buffers.
This parameter controls the mitigation. The
options are:
full - Enable mitigation on vulnerable CPUs
full,nosmt - Enable mitigation and disable SMT on
vulnerable CPUs.
off - Unconditionally disable mitigation
On MDS or TAA affected machines,
mmio_stale_data=off can be prevented by an active
MDS or TAA mitigation, because these vulnerabilities
are mitigated with the same mechanism. To disable
this mitigation, you need to specify mds=off and
tsx_async_abort=off too.
Not specifying this option is equivalent to
mmio_stale_data=full.
For details see:
Documentation/admin-guide/hw-vuln/processor_mmio_stale_data.rst
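The interaction between mmio_stale_data=off, mds=off and tsx_async_abort=off can be checked against a command line string with a short sketch. The helper names are hypothetical, the check models only the rule stated above for MDS or TAA affected machines, and it deliberately ignores ``mitigations=off`` and quoted arguments.

```python
def parse_cmdline(cmdline):
    """Map each 'key=value' token to its value; bare flags map to ''."""
    options = {}
    for token in cmdline.split():
        key, _, value = token.partition("=")
        options[key] = value
    return options


def mmio_off_can_take_effect(cmdline):
    """True only if mmio_stale_data=off is accompanied by mds=off and
    tsx_async_abort=off, so no shared buffer-clearing mitigation stays active."""
    options = parse_cmdline(cmdline)
    return all(
        options.get(name) == "off"
        for name in ("mmio_stale_data", "mds", "tsx_async_abort")
    )
```

On a running system the string could come from ``/proc/cmdline``; the result says nothing about whether the CPU is actually affected.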
module.sig_enforce module.sig_enforce
[KNL] When CONFIG_MODULE_SIG is set, this means that [KNL] When CONFIG_MODULE_SIG is set, this means that
modules without (valid) signatures will fail to load. modules without (valid) signatures will fail to load.
...@@ -3794,6 +3892,12 @@ ...@@ -3794,6 +3892,12 @@
fully seed the kernel's CRNG. Default is controlled fully seed the kernel's CRNG. Default is controlled
by CONFIG_RANDOM_TRUST_CPU. by CONFIG_RANDOM_TRUST_CPU.
random.trust_bootloader={on,off}
[KNL] Enable or disable trusting the use of a
seed passed by the bootloader (if available) to
fully seed the kernel's CRNG. Default is controlled
by CONFIG_RANDOM_TRUST_BOOTLOADER.
ras=option[,option,...] [KNL] RAS-specific options ras=option[,option,...] [KNL] RAS-specific options
cec_disable [X86] cec_disable [X86]
...@@ -4244,6 +4348,18 @@ ...@@ -4244,6 +4348,18 @@
retain_initrd [RAM] Keep initrd memory after extraction retain_initrd [RAM] Keep initrd memory after extraction
retbleed= [X86] Control mitigation of RETBleed (Arbitrary
Speculative Code Execution with Return Instructions)
vulnerability.
off - unconditionally disable
auto - automatically select a mitigation
Selecting 'auto' will choose a mitigation method at run
time according to the CPU.
Not specifying this option is equivalent to retbleed=auto.
rfkill.default_state= rfkill.default_state=
0 "airplane mode". All wifi, bluetooth, wimax, gps, fm, 0 "airplane mode". All wifi, bluetooth, wimax, gps, fm,
etc. communication is blocked by default. etc. communication is blocked by default.
...@@ -4481,8 +4597,13 @@ ...@@ -4481,8 +4597,13 @@
Specific mitigations can also be selected manually: Specific mitigations can also be selected manually:
retpoline - replace indirect branches retpoline - replace indirect branches
retpoline,generic - google's original retpoline retpoline,generic - Retpolines
retpoline,amd - AMD-specific minimal thunk retpoline,lfence - LFENCE; indirect branch
retpoline,amd - alias for retpoline,lfence
eibrs - enhanced IBRS
eibrs,retpoline - enhanced IBRS + Retpolines
eibrs,lfence - enhanced IBRS + LFENCE
ibrs - use IBRS to protect kernel
Not specifying this option is equivalent to Not specifying this option is equivalent to
spectre_v2=auto. spectre_v2=auto.
...@@ -5474,6 +5595,13 @@ ...@@ -5474,6 +5595,13 @@
as generic guest with no PV drivers. Currently support as generic guest with no PV drivers. Currently support
XEN HVM, KVM, HYPER_V and VMWARE guest. XEN HVM, KVM, HYPER_V and VMWARE guest.
xen.balloon_boot_timeout= [XEN]
The time (in seconds) to wait before giving up on booting
if initial ballooning fails to free enough memory.
Applies only when running as HVM or PVH guest and
started with less memory configured than allowed at
max. Default is 180.
xen.event_eoi_delay= [XEN] xen.event_eoi_delay= [XEN]
How long to delay EOI handling in case of event How long to delay EOI handling in case of event
storms (jiffies). Default is 10. storms (jiffies). Default is 10.
......
...@@ -676,8 +676,8 @@ the ``menu`` governor to be used on the systems that use the ``ladder`` governor ...@@ -676,8 +676,8 @@ the ``menu`` governor to be used on the systems that use the ``ladder`` governor
by default this way, for example. by default this way, for example.
The other kernel command line parameters controlling CPU idle time management The other kernel command line parameters controlling CPU idle time management
described below are only relevant for the *x86* architecture and some of described below are only relevant for the *x86* architecture and references
them affect Intel processors only. to ``intel_idle`` affect Intel processors only.
The *x86* architecture support code recognizes three kernel command line The *x86* architecture support code recognizes three kernel command line
options related to CPU idle time management: ``idle=poll``, ``idle=halt``, options related to CPU idle time management: ``idle=poll``, ``idle=halt``,
...@@ -699,10 +699,13 @@ idle, so it very well may hurt single-thread computations performance as well as ...@@ -699,10 +699,13 @@ idle, so it very well may hurt single-thread computations performance as well as
energy-efficiency. Thus using it for performance reasons may not be a good idea energy-efficiency. Thus using it for performance reasons may not be a good idea
at all.] at all.]
The ``idle=nomwait`` option disables the ``intel_idle`` driver and causes The ``idle=nomwait`` option prevents the use of ``MWAIT`` instruction of
``acpi_idle`` to be used (as long as all of the information needed by it is the CPU to enter idle states. When this option is used, the ``acpi_idle``
there in the system's ACPI tables), but it is not allowed to use the driver will use the ``HLT`` instruction instead of ``MWAIT``. On systems
``MWAIT`` instruction of the CPUs to ask the hardware to enter idle states. running Intel processors, this option disables the ``intel_idle`` driver
and forces the use of the ``acpi_idle`` driver instead. Note that in either
case, ``acpi_idle`` driver will function only if all the information needed
by it is in the system's ACPI tables.
In addition to the architecture-level kernel command line options affecting CPU In addition to the architecture-level kernel command line options affecting CPU
idle time management, there are parameters affecting individual ``CPUIdle`` idle time management, there are parameters affecting individual ``CPUIdle``
......
...@@ -56,31 +56,28 @@ information submitted to the security list and any followup discussions ...@@ -56,31 +56,28 @@ information submitted to the security list and any followup discussions
of the report are treated confidentially even after the embargo has been of the report are treated confidentially even after the embargo has been
lifted, in perpetuity. lifted, in perpetuity.
Coordination Coordination with other groups
------------ ------------------------------
Fixes for sensitive bugs, such as those that might lead to privilege The kernel security team strongly recommends that reporters of potential
escalations, may need to be coordinated with the private security issues NEVER contact the "linux-distros" mailing list until
<linux-distros@vs.openwall.org> mailing list so that distribution vendors AFTER discussing it with the kernel security team. Do not Cc: both
are well prepared to issue a fixed kernel upon public disclosure of the lists at once. You may contact the linux-distros mailing list after a
upstream fix. Distros will need some time to test the proposed patch and fix has been agreed on and you fully understand the requirements that
will generally request at least a few days of embargo, and vendor update doing so will impose on you and the kernel community.
publication prefers to happen Tuesday through Thursday. When appropriate,
the security team can assist with this coordination, or the reporter can The different lists have different goals and the linux-distros rules do
include linux-distros from the start. In this case, remember to prefix not contribute to actually fixing any potential security problems.
the email Subject line with "[vs]" as described in the linux-distros wiki:
<http://oss-security.openwall.org/wiki/mailing-lists/distros#how-to-use-the-lists>
CVE assignment CVE assignment
-------------- --------------
The security team does not normally assign CVEs, nor do we require them The security team does not assign CVEs, nor do we require them for
for reports or fixes, as this can needlessly complicate the process and reports or fixes, as this can needlessly complicate the process and may
may delay the bug handling. If a reporter wishes to have a CVE identifier delay the bug handling. If a reporter wishes to have a CVE identifier
assigned ahead of public disclosure, they will need to contact the private assigned, they should find one by themselves, for example by contacting
linux-distros list, described above. When such a CVE identifier is known MITRE directly. However under no circumstances will a patch inclusion
before a patch is provided, it is desirable to mention it in the commit be delayed to wait for a CVE identifier to arrive.
message if the reporter agrees.
Non-disclosure agreements Non-disclosure agreements
------------------------- -------------------------
......
...@@ -557,6 +557,15 @@ numa_balancing_scan_size_mb is how many megabytes worth of pages are ...@@ -557,6 +557,15 @@ numa_balancing_scan_size_mb is how many megabytes worth of pages are
scanned for a given scan. scanned for a given scan.
oops_limit
==========
Number of kernel oopses after which the kernel should panic when
``panic_on_oops`` is not set. Setting this to 0 disables checking
the count. Setting this to 1 has the same effect as setting
``panic_on_oops=1``. The default value is 10000.
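The rule for ``oops_limit`` can be written as a one-line predicate. This is a sketch of the documented semantics, not kernel code; the function name is hypothetical and it assumes the count includes the oops currently being handled, which is what makes a limit of 1 behave like ``panic_on_oops=1``.

```python
def reaches_limit(count, limit):
    """Return True if the kernel would panic on this oops.
    count: oopses seen so far, including the current one.
    limit: value of oops_limit; 0 disables the check."""
    return limit != 0 and count >= limit
```

With the default of 10000, ``reaches_limit(9999, 10000)`` is false and ``reaches_limit(10000, 10000)`` is true.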
osrelease, ostype & version: osrelease, ostype & version:
============================ ============================
...@@ -862,9 +871,40 @@ The kernel command line parameter printk.devkmsg= overrides this and is ...@@ -862,9 +871,40 @@ The kernel command line parameter printk.devkmsg= overrides this and is
a one-time setting until next reboot: once set, it cannot be changed by a one-time setting until next reboot: once set, it cannot be changed by
this sysctl interface anymore. this sysctl interface anymore.
pty
===
randomize_va_space: See Documentation/filesystems/devpts.rst.
===================
random
======
This is a directory, with the following entries:
* ``boot_id``: a UUID generated the first time this is retrieved, and
unvarying after that;
* ``uuid``: a UUID generated every time this is retrieved (this can
thus be used to generate UUIDs at will);
* ``entropy_avail``: the pool's entropy count, in bits;
* ``poolsize``: the entropy pool size, in bits;
* ``urandom_min_reseed_secs``: obsolete (used to determine the minimum
number of seconds between urandom pool reseeding). This file is
writable for compatibility purposes, but writing to it has no effect
on any RNG behavior;
* ``write_wakeup_threshold``: when the entropy count drops below this
(as a number of bits), processes waiting to write to ``/dev/random``
are woken up. This file is writable for compatibility purposes, but
writing to it has no effect on any RNG behavior.
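Reading the ``boot_id`` and ``uuid`` entries can be sketched as follows. The directory path assumes the usual ``/proc/sys/kernel/random`` mount location, and the helper name ``read_uuid_entry`` is hypothetical.

```python
import uuid
from pathlib import Path

# Assumed location of the directory described above.
RANDOM_DIR = Path("/proc/sys/kernel/random")


def read_uuid_entry(name, base=RANDOM_DIR):
    """Read a UUID-valued entry such as 'boot_id' or 'uuid' and parse it."""
    text = (base / name).read_text().strip()
    return uuid.UUID(text)
```

Calling ``read_uuid_entry("boot_id")`` twice within one boot should return the same value, while ``read_uuid_entry("uuid")`` is expected to differ on every read.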
randomize_va_space
==================
This option can be used to select the type of process address This option can be used to select the type of process address
space randomization that is used in the system, for architectures space randomization that is used in the system, for architectures
...@@ -1125,6 +1165,37 @@ NMI switch that most IA32 servers have fires unknown NMI up, for ...@@ -1125,6 +1165,37 @@ NMI switch that most IA32 servers have fires unknown NMI up, for
example. If a system hangs up, try pressing the NMI switch. example. If a system hangs up, try pressing the NMI switch.
unprivileged_bpf_disabled:
==========================
Writing 1 to this entry will disable unprivileged calls to ``bpf()``;
once disabled, calling ``bpf()`` without ``CAP_SYS_ADMIN`` will return
``-EPERM``. Once set to 1, this can't be cleared from the running kernel
anymore.
Writing 2 to this entry will also disable unprivileged calls to ``bpf()``,
however, an admin can still change this setting later on, if needed, by
writing 0 or 1 to this entry.
If ``BPF_UNPRIV_DEFAULT_OFF`` is enabled in the kernel config, then this
entry will default to 2 instead of 0.
= =============================================================
0 Unprivileged calls to ``bpf()`` are enabled
1 Unprivileged calls to ``bpf()`` are disabled without recovery
2 Unprivileged calls to ``bpf()`` are disabled
= =============================================================
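The allowed transitions between the three values can be modelled with a small predicate. This is a sketch of the behaviour described above rather than the kernel's implementation; the function name is hypothetical, and an actual write also requires sufficient privileges.

```python
def write_allowed(current, new):
    """Return True if writing `new` is accepted when the entry holds `current`.
    Once the entry is 1 it is locked: only rewriting 1 succeeds."""
    if new not in (0, 1, 2):
        return False
    if current == 1:
        return new == 1
    return True
```

For instance, ``write_allowed(2, 0)`` is true because an admin may re-enable unprivileged ``bpf()`` from state 2, while ``write_allowed(1, 0)`` is false.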
warn_limit
==========
Number of kernel warnings after which the kernel should panic when
``panic_on_warn`` is not set. Setting this to 0 disables checking
the warning count. Setting this to 1 has the same effect as setting
``panic_on_warn=1``. The default value is 0.
watchdog: watchdog:
========= =========
......
...@@ -31,17 +31,18 @@ see only some of them, depending on your kernel's configuration. ...@@ -31,17 +31,18 @@ see only some of them, depending on your kernel's configuration.
Table : Subdirectories in /proc/sys/net Table : Subdirectories in /proc/sys/net
========= =================== = ========== ================== ========= =================== = ========== ===================
Directory Content Directory Content Directory Content Directory Content
========= =================== = ========== ================== ========= =================== = ========== ===================
core General parameter appletalk Appletalk protocol 802 E802 protocol mptcp Multipath TCP
unix Unix domain sockets netrom NET/ROM appletalk Appletalk protocol netfilter Network Filter
802 E802 protocol ax25 AX25 ax25 AX25 netrom NET/ROM
ethernet Ethernet protocol rose X.25 PLP layer bridge Bridging rose X.25 PLP layer
core General parameter tipc TIPC
ethernet Ethernet protocol unix Unix domain sockets
ipv4 IP version 4 x25 X.25 protocol ipv4 IP version 4 x25 X.25 protocol
bridge Bridging decnet DEC net ipv6 IP version 6
ipv6 IP version 6 tipc TIPC ========= =================== = ========== ===================
========= =================== = ========== ==================
1. /proc/sys/net/core - Network core options 1. /proc/sys/net/core - Network core options
============================================ ============================================
......
...@@ -61,6 +61,7 @@ Currently, these files are in /proc/sys/vm: ...@@ -61,6 +61,7 @@ Currently, these files are in /proc/sys/vm:
- overcommit_memory - overcommit_memory
- overcommit_ratio - overcommit_ratio
- page-cluster - page-cluster
- page_lock_unfairness
- panic_on_oom - panic_on_oom
- percpu_pagelist_fraction - percpu_pagelist_fraction
- stat_interval - stat_interval
...@@ -741,6 +742,14 @@ extra faults and I/O delays for following faults if they would have been part of ...@@ -741,6 +742,14 @@ extra faults and I/O delays for following faults if they would have been part of
that consecutive pages readahead would have brought in. that consecutive pages readahead would have brought in.
page_lock_unfairness
====================
This value determines the number of times that the page lock can be
stolen from under a waiter. After the lock is stolen the number of times
specified in this file (default is 5), the "fair lock handoff" semantics
will apply, and the waiter will only be awakened if the lock can be taken.
panic_on_oom panic_on_oom
============ ============
......