- Jun 08, 2017
Martin Schwidefsky authored
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
- May 17, 2017
Christian Borntraeger authored
In most cases, a protection exception in the host (e.g. copy on write or dirty tracking) on the sie instruction will indicate an instruction length of 4. It turns out that there are some corner cases (e.g. runtime instrumentation) where this is not necessarily true and the ILC is unpredictable. Let's replace our 4 byte rewind_pad with 3 byte nops to prepare for all possible ILCs.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: stable@vger.kernel.org
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
- May 12, 2017
Andrew Morton authored
Cc: Tigran Aivazian <aivazian.tigran@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- May 11, 2017
Daniel Borkmann authored
Shubham was recently asking on netdev why in the arm64 JIT we don't multiply the index for accessing the tail call map by 8. That led me into testing the arm64 JIT wrt tail calls, and it turned out I got a NULL pointer dereference on the tail call. The buggy access is at:

  prog = array->ptrs[index];
  if (prog == NULL)
      goto out;

  [...]
  00000060:  d2800e0a  mov  x10, #0x70   // #112
  00000064:  f86a682a  ldr  x10, [x1,x10]
  00000068:  f862694b  ldr  x11, [x10,x2]
  0000006c:  b40000ab  cbz  x11, 0x00000080
  [...]

The code triggering the crash is f862694b. x1 at the time contains the address of the bpf array, x10 offsetof(struct bpf_array, ptrs). Meaning, above we load the pointer to the program at map slot 0 into x10. x10 can then be NULL if the slot is not occupied, which we later on try to access with a user-given offset in x2 that is the map index. Fix this by emitting the following instead:

  [...]
  00000060:  d2800e0a  mov  x10, #0x70   // #112
  00000064:  8b0a002a  add  x10, x1, x10
  00000068:  d37df04b  lsl  x11, x2, #3
  0000006c:  f86b694b  ldr  x11, [x10,x11]
  00000070:  b40000ab  cbz  x11, 0x00000084
  [...]

This basically adds the offset of ptrs to the base address of the bpf array we got, and we later on access the map with an index * 8 offset relative to that. The tail call map itself is basically one large area with metadata at the head followed by the array of prog pointers. This makes tail calls work again, tested on Cavium ThunderX ARMv8.

Fixes: ddb55992 ("arm64: bpf: implement bpf_tail_call() helper")
Reported-by: Shubham Bansal <illusionist.neo@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
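
For readers, the difference between the two emitted sequences can be expressed in C. The following is a hedged sketch of the address arithmetic only; the struct is reduced to the fields involved, not the real bpf_array definition:

  #include <stddef.h>
  #include <stdint.h>

  struct bpf_prog;

  /* Reduced view of a tail call map: metadata at the head, then the
   * array of program pointers (offsetof(..., ptrs) == 0x70 above). */
  struct bpf_array {
      char meta[0x70];
      struct bpf_prog *ptrs[];
  };

  /* Buggy shape: loads the pointer at slot 0, then indexes off that
   * (possibly NULL) pointer by the raw, unscaled map index. */
  static struct bpf_prog *lookup_buggy(struct bpf_array *array, uint64_t idx)
  {
      uintptr_t x10 = (uintptr_t)array->ptrs[0];   /* ldr x10, [x1,x10] */
      return *(struct bpf_prog **)(x10 + idx);     /* ldr x11, [x10,x2] */
  }

  /* Fixed shape: adds the ptrs offset to the array base, then indexes
   * by idx * 8 relative to that. */
  static struct bpf_prog *lookup_fixed(struct bpf_array *array, uint64_t idx)
  {
      uintptr_t base = (uintptr_t)array + offsetof(struct bpf_array, ptrs);
      return *(struct bpf_prog **)(base + (idx << 3));  /* lsl #3, ldr */
  }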
-
Elena Reshetova authored
The refcount_t type and corresponding API should be used instead of atomic_t when the variable is used as a reference counter. This allows us to avoid accidental refcounter overflows that might lead to use-after-free situations.

Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David Windsor <dwindsor@gmail.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
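
For readers unfamiliar with the API, the conversion pattern looks roughly like this; a minimal sketch with an invented struct, not the s390 code touched by this patch:

  #include <linux/refcount.h>
  #include <linux/slab.h>

  struct session {
      refcount_t refs;                   /* was: atomic_t refs; */
      /* ... payload ... */
  };

  static struct session *session_alloc(void)
  {
      struct session *s = kzalloc(sizeof(*s), GFP_KERNEL);

      if (s)
          refcount_set(&s->refs, 1);     /* was: atomic_set(&s->refs, 1); */
      return s;
  }

  static struct session *session_get(struct session *s)
  {
      refcount_inc(&s->refs);            /* saturates instead of overflowing */
      return s;
  }

  static void session_put(struct session *s)
  {
      if (refcount_dec_and_test(&s->refs))   /* was: atomic_dec_and_test() */
          kfree(s);
  }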
-
Juergen Gross authored
When booted as a pv-guest, the p2m list presented by Xen is already mapped to virtual addresses. In the dom0 case the hypervisor might make use of 2M- or 1G-pages for this mapping. Unfortunately, while being properly aligned in virtual and machine address space, those pages might not be aligned properly in guest physical address space. So when trying to obtain the guest physical address of such a page, pud_pfn() and pmd_pfn() must be avoided, as those will mask away guest physical address bits that are not zero in this special case.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
-
Juergen Gross authored
When running as a Xen pv guest, X86_BUG_SYSRET_SS_ATTRS must not be set on AMD CPUs. This bug/feature bit is kind of special, as it will be used very early when switching threads. Setting the bit and clearing it a little bit later leaves a critical window where things can go wrong. This time window was enlarged a little bit by using setup_clear_cpu_cap() instead of the hypervisor's set_cpu_features callback. It seems this larger window now makes it rather easy to hit the problem. The proper solution is to never set the bit in case of Xen.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Juergen Gross <jgross@suse.com>
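
The shape of such a fix, as a hedged sketch (the function name is illustrative; the real change lives in the AMD CPU setup code):

  /* Sketch: only flag the SYSRET SS-attribute bug on non-PV setups;
   * under Xen PV the bit must never be set. */
  static void amd_set_sysret_ss_bug(struct cpuinfo_x86 *c)
  {
      if (!cpu_has(c, X86_FEATURE_XENPV))
          set_cpu_bug(c, X86_BUG_SYSRET_SS_ATTRS);
  }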
-
Florian Fainelli authored
When CONFIG_ARM64_MODULE_PLTS is enabled, the first allocation using the module space fails, because the module is too big, and then the module allocation is attempted from vmalloc space. Silence the first allocation failure in that case by setting __GFP_NOWARN.

Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
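
The resulting allocation flow, as a hedged sketch modelled on the arm64 module_alloc() of this era (treat the exact flags and ranges as assumptions):

  void *module_alloc(unsigned long size)
  {
      void *p;

      /* First try the dedicated module region; __GFP_NOWARN keeps the
       * expected "module too big" failure quiet. */
      p = __vmalloc_node_range(size, MODULE_ALIGN, module_alloc_base,
                               module_alloc_base + MODULES_VSIZE,
                               GFP_KERNEL | __GFP_NOWARN, PAGE_KERNEL_EXEC,
                               0, NUMA_NO_NODE, __builtin_return_address(0));

      /* With PLTs available, fall back to the whole vmalloc space. */
      if (!p && IS_ENABLED(CONFIG_ARM64_MODULE_PLTS))
          p = __vmalloc_node_range(size, MODULE_ALIGN, VMALLOC_START,
                                   VMALLOC_END, GFP_KERNEL, PAGE_KERNEL_EXEC,
                                   0, NUMA_NO_NODE,
                                   __builtin_return_address(0));

      return p;
  }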
-
Florian Fainelli authored
When CONFIG_ARM_MODULE_PLTS is enabled, the first allocation using the module space fails, because the module is too big, and then the module allocation is attempted from vmalloc space. Silence the first allocation failure in that case by setting __GFP_NOWARN.

Acked-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
-
Tobias Klauser authored
As of commits d8f347ba35cf ("nios2: enable earlycon support"), 0dcc0542 ("serial: altera_jtaguart: add earlycon support") and 4d9d7d89 ("serial: altera_uart: add earlycon support"), the nios2 architecture and the altera_uart/altera_jtaguart drivers support earlycon. Thus, the custom early console implementation for nios2 is no longer necessary to get early boot messages. Remove it and rely fully on earlycon support.

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
-
- May 10, 2017
Nicolas Dichtel authored
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
-
Nicolas Dichtel authored
This patch removes the need for subdir-y. Now all files/directories under arch/<arch>/include/uapi/ are exported. The only change for userland is the layout of the command 'make headers_install_all': directories asm-<arch> are replaced by arch-<arch>/. Those new directories contain all files/directories of the specified arch. Note that only cris and tile have more directories than only asm:
 - arch-v[10|32] for cris;
 - arch for tile.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
-
Nicolas Dichtel authored
Regularly, when a new header is created in include/uapi/, the developer forgets to add it to the corresponding Kbuild file. This error is usually detected after the release is out. In fact, all headers under uapi directories should be exported, thus it's useless to have an exhaustive list.

After this patch, the following files, which were not exported, are now exported (with make headers_install_all):

  asm-arc/kvm_para.h
  asm-arc/ucontext.h
  asm-blackfin/shmparam.h
  asm-blackfin/ucontext.h
  asm-c6x/shmparam.h
  asm-c6x/ucontext.h
  asm-cris/kvm_para.h
  asm-h8300/shmparam.h
  asm-h8300/ucontext.h
  asm-hexagon/shmparam.h
  asm-m32r/kvm_para.h
  asm-m68k/kvm_para.h
  asm-m68k/shmparam.h
  asm-metag/kvm_para.h
  asm-metag/shmparam.h
  asm-metag/ucontext.h
  asm-mips/hwcap.h
  asm-mips/reg.h
  asm-mips/ucontext.h
  asm-nios2/kvm_para.h
  asm-nios2/ucontext.h
  asm-openrisc/shmparam.h
  asm-parisc/kvm_para.h
  asm-powerpc/perf_regs.h
  asm-sh/kvm_para.h
  asm-sh/ucontext.h
  asm-tile/shmparam.h
  asm-unicore32/shmparam.h
  asm-unicore32/ucontext.h
  asm-x86/hwcap2.h
  asm-xtensa/kvm_para.h
  drm/armada_drm.h
  drm/etnaviv_drm.h
  drm/vgem_drm.h
  linux/aspeed-lpc-ctrl.h
  linux/auto_dev-ioctl.h
  linux/bcache.h
  linux/btrfs_tree.h
  linux/can/vxcan.h
  linux/cifs/cifs_mount.h
  linux/coresight-stm.h
  linux/cryptouser.h
  linux/fsmap.h
  linux/genwqe/genwqe_card.h
  linux/hash_info.h
  linux/kcm.h
  linux/kcov.h
  linux/kfd_ioctl.h
  linux/lightnvm.h
  linux/module.h
  linux/nbd-netlink.h
  linux/nilfs2_api.h
  linux/nilfs2_ondisk.h
  linux/nsfs.h
  linux/pr.h
  linux/qrtr.h
  linux/rpmsg.h
  linux/sched/types.h
  linux/sed-opal.h
  linux/smc.h
  linux/smc_diag.h
  linux/stm.h
  linux/switchtec_ioctl.h
  linux/vfio_ccw.h
  linux/wil6210_uapi.h
  rdma/bnxt_re-abi.h

Note that I have removed from this list the files which are generated in every exported directory (like .install or .install.cmd). Thanks to Julien Floret <julien.floret@6wind.com> for the tip to get all subdirs with a pure makefile command. For the record, note that exported files for asm directories are a mix of files listed by:
 - include/uapi/asm-generic/Kbuild.asm;
 - arch/<arch>/include/uapi/asm/Kbuild;
 - arch/<arch>/include/asm/Kbuild.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Russell King <rmk+kernel@armlinux.org.uk>
Acked-by: Mark Salter <msalter@redhat.com>
Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
-
Nicolas Dichtel authored
Even if this file was not in an uapi directory, it was exported because it was listed in the Kbuild file.

Fixes: b72e7464 ("x86/uapi: Do not export <asm/msr-index.h> as part of the user API headers")
Suggested-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
-
Nicolas Dichtel authored
This header file is exported, but from a userland point of view, it's just a wrapper to asm-generic/setup.h.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Reviewed-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
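
Such a wrapper is typically a single line; a hedged sketch of what the exported file contains (the exact path is an assumption):

  /* arch/<arch>/include/asm/setup.h (sketch): the whole file just
   * redirects to the generic version, so nothing arch-specific is
   * actually exported. */
  #include <asm-generic/setup.h>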
-
Nicolas Dichtel authored
This header file is exported, thus move it to uapi.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
-
- May 09, 2017
Dave Aldridge authored
When any of the functions contained in NGbzero.S and GENbzero.S vector through *bzero_from_clear_user, we may end up taking a fault when executing one of the store alternate address space instructions. If this happens, the exception handler does not restore the %asi register. This commit fixes the issue by introducing a new exception handler that ensures the %asi register is restored when a fault is handled.

Orabug: 25577560
Signed-off-by: Dave Aldridge <david.j.aldridge@oracle.com>
Reviewed-by: Rob Gardner <rob.gardner@oracle.com>
Reviewed-by: Babu Moger <babu.moger@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Geliang Tang authored
Use the memdup_user_nul() helper instead of open-coding to simplify the code.

Signed-off-by: Geliang Tang <geliangtang@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
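
The before/after shape of such a simplification, as a hedged sketch with illustrative names:

  /* Before: open-coded copy of a user string into a NUL-terminated buffer. */
  static char *get_user_string_open_coded(const char __user *ubuf, size_t len)
  {
      char *buf = kmalloc(len + 1, GFP_KERNEL);

      if (!buf)
          return ERR_PTR(-ENOMEM);
      if (copy_from_user(buf, ubuf, len)) {
          kfree(buf);
          return ERR_PTR(-EFAULT);
      }
      buf[len] = '\0';
      return buf;
  }

  /* After: the helper does all of the above in one call. */
  static char *get_user_string(const char __user *ubuf, size_t len)
  {
      return memdup_user_nul(ubuf, len);   /* returns ERR_PTR() on failure */
  }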
-
Ben Hutchings authored
Commit 11e63f6d added cache flushing for unaligned writes from an iovec, covering the first and last cache line of a >= 8 byte write and the first cache line of a < 8 byte write. But an unaligned write of 2-7 bytes can still cover two cache lines, so make sure we flush both in that case.

Cc: <stable@vger.kernel.org>
Fixes: 11e63f6d ("x86, pmem: fix broken __copy_user_nocache ...")
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
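
A minimal sketch of the "flush both lines" arithmetic, assuming x86's clwb() as the flush primitive; this illustrates the boundary case, it is not the pmem driver code:

  #include <linux/cache.h>
  #include <asm/special_insns.h>

  /* An unaligned 2-7 byte write can straddle a cache line boundary, so
   * flush the line containing the first byte and, if different, the
   * line containing the last byte. */
  static void flush_write_range(void *dst, size_t bytes)
  {
      unsigned long first = (unsigned long)dst & ~(L1_CACHE_BYTES - 1UL);
      unsigned long last = ((unsigned long)dst + bytes - 1) &
                           ~(L1_CACHE_BYTES - 1UL);

      clwb((void *)first);
      if (last != first)
          clwb((void *)last);
  }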
-
Mark Rutland authored
Clang tries to warn when there's a mismatch between an operand's size and the size of the register it is held in, as this may indicate a bug. Specifically, clang warns when the operand's type is less than 64 bits wide and the register is used unqualified (i.e. %N rather than %xN or %wN).

Unfortunately clang can generate these warnings for unreachable code. For example, for code like:

  do {                                                  \
      typeof(*(ptr)) __v = (v);                         \
      switch (sizeof(*(ptr))) {                         \
      case 1:                                           \
          /* assume __v is 1 byte wide */               \
          asm ("{op}b %w0" : : "r" (__v));              \
          break;                                        \
      case 8:                                           \
          /* assume __v is 8 bytes wide */              \
          asm ("{op} %0" : : "r" (__v));                \
          break;                                        \
      }                                                 \
  } while (0)

... if op() were passed a char value and a pointer to char, clang may produce a warning for the unreachable case where sizeof(*(ptr)) is 8.

For the same reasons, clang produces warnings when __put_user_err() is used for types that are less than 64 bits wide. We could avoid this with a cast to a fixed-width type in each of the cases. However, GCC will then warn that pointer types are being cast to mismatched integer sizes (in unreachable paths). Another option would be to use the same union trickery as we do for __smp_store_release() and __smp_load_acquire(), but this is fairly invasive.

Instead, this patch suppresses the clang warning by using an x modifier in the assembly for the 8 byte case of __put_user_err(). No additional work is necessary as the value has been cast to typeof(*(ptr)), so the compiler will have performed any necessary extension for the reachable case. For consistency, __get_user_err() is also updated to use the x modifier for its 8 byte case.

Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reported-by: Matthias Kaehlcke <mka@chromium.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
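
Reduced to its essence, the 8-byte case with the x modifier looks like this; a hedged sketch, not the full __put_user_err() with its fault fixup machinery:

  /* The x modifier forces the 64-bit xN register name, so clang no
   * longer warns when this case is expanded (unreachably) for a
   * narrower type. */
  #define put_user_case_8(ptr, v)                  \
      asm volatile("str %x1, %0"                   \
                   : "=Q" (*(ptr))                 \
                   : "r" (v))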
-
Mark Rutland authored
The LSE atomic code uses asm register variables to ensure that parameters are allocated in specific registers. In the majority of cases we specifically ask for an x register when using 64-bit values, but in a couple of cases we use a w register for a 64-bit value. For asm register variables, the compiler only cares about the register index, with wN and xN having the same meaning. The compiler determines the register size to use based on the type of the variable. Thus, this inconsistency is merely confusing, and not harmful to code generation. For consistency, this patch updates those cases to use the x register alias. There should be no functional change as a result of this patch.

Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
-
Mark Rutland authored
Our compat swp emulation holds the compat user address in an unsigned int, which it passes to __user_swpX_asm(). When a 32-bit value is passed in a register, the upper 32 bits of the register are unknown, and we must extend the value to 64 bits before we can use it as a base address. This patch casts the address to unsigned long to ensure it has been suitably extended, avoiding the potential issue, and silencing a related warning from clang.

Fixes: bd35a4ad ("arm64: Port SWP/SWPB emulation support from arm")
Cc: <stable@vger.kernel.org> # 3.19.x-
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
-
Mark Rutland authored
Our access_ok() simply hands its arguments over to __range_ok(), which implicitly assumes that the addr parameter is 64 bits wide. This isn't necessarily true for compat code, which might pass down a 32-bit address parameter. In these cases, we don't have a guarantee that the address has been zero-extended to 64 bits, and the upper bits of the register may contain unknown values, potentially resulting in a spurious failure. Avoid this by explicitly casting the addr parameter to an unsigned long (as is done on other architectures), ensuring that the parameter is widened appropriately.

Fixes: 0aea86a2 ("arm64: User access library functions")
Cc: <stable@vger.kernel.org> # 3.7.x-
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
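
The fix itself is a one-line cast; a hedged sketch of the macro after the change:

  /* Casting addr forces zero-extension of a 32-bit compat address
   * before __range_ok() compares it against the address limit. */
  #define access_ok(type, addr, size)              \
      __range_ok((unsigned long)(addr), size)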
-
Mark Rutland authored
When an inline assembly operand's type is narrower than the register it is allocated to, the least significant bits of the register (up to the operand type's width) are valid, and any other bits are permitted to contain any arbitrary value. This aligns with the AAPCS64 parameter passing rules. Our __smp_store_release() implementation does not account for this, and implicitly assumes that operands have been zero-extended to the width of the type being stored to. Thus, we may store unknown values to memory when the value type is narrower than the pointer type (e.g. when storing a char to a long).

This patch fixes the issue by casting the value operand to the same width as the pointer operand in all cases, which ensures that the value is zero-extended as we expect. We use the same union trickery as __smp_load_acquire() and {READ,WRITE}_ONCE() to avoid GCC complaining that pointers are potentially cast to narrower width integers in unreachable paths. A whitespace issue at the top of __smp_store_release() is also corrected.

No changes are necessary for __smp_load_acquire(). Load instructions implicitly clear any upper bits of the register, and the compiler will only consider the least significant bits of the register as valid regardless.

Fixes: 47933ad4 ("arch: Introduce smp_load_acquire(), smp_store_release()")
Fixes: 878a84d5 ("arm64: add missing data types in smp_load_acquire/smp_store_release")
Cc: <stable@vger.kernel.org> # 3.14.x-
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Matthias Kaehlcke <mka@chromium.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
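
A hedged sketch of the union pattern, reduced to the 1- and 8-byte cases (the real macro also covers 2 and 4 bytes and asserts the type is atomic):

  #define smp_store_release_sketch(p, v)                          \
  do {                                                            \
      /* The union cast converts the value to the width of *p     \
       * up front, without GCC warning about pointer casts to     \
       * narrower integers in the unreachable switch arms. */     \
      union { typeof(*(p)) __val; char __c[1]; } __u =            \
          { .__val = (typeof(*(p)))(v) };                         \
                                                                  \
      switch (sizeof(*(p))) {                                     \
      case 1:                                                     \
          asm volatile ("stlrb %w1, %0"                           \
                        : "=Q" (*(p))                             \
                        : "r" (*(u8 *)__u.__c)                    \
                        : "memory");                              \
          break;                                                  \
      case 8:                                                     \
          asm volatile ("stlr %1, %0"                             \
                        : "=Q" (*(p))                             \
                        : "r" (*(u64 *)__u.__c)                   \
                        : "memory");                              \
          break;                                                  \
      }                                                           \
  } while (0)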
-
Mark Rutland authored
The inline assembly in __XCHG_CASE() uses a +Q constraint to hazard against other accesses to the memory location being exchanged. However, the pointer passed to the constraint is a u8 pointer, and thus the hazard only applies to the first byte of the location. GCC can take advantage of this, assuming that other portions of the location are unchanged, as demonstrated with the following test case:

  union u {
      unsigned long l;
      unsigned int i[2];
  };

  unsigned long update_char_hazard(union u *u)
  {
      unsigned int a, b;

      a = u->i[1];
      asm ("str %1, %0" : "+Q" (*(char *)&u->l) : "r" (0UL));
      b = u->i[1];

      return a ^ b;
  }

  unsigned long update_long_hazard(union u *u)
  {
      unsigned int a, b;

      a = u->i[1];
      asm ("str %1, %0" : "+Q" (*(long *)&u->l) : "r" (0UL));
      b = u->i[1];

      return a ^ b;
  }

The linaro 15.08 GCC 5.1.1 toolchain compiles the above as follows when using -O2 or above:

  0000000000000000 <update_char_hazard>:
     0:   d2800001   mov  x1, #0x0   // #0
     4:   f9000001   str  x1, [x0]
     8:   d2800000   mov  x0, #0x0   // #0
     c:   d65f03c0   ret

  0000000000000010 <update_long_hazard>:
    10:   b9400401   ldr  w1, [x0,#4]
    14:   d2800002   mov  x2, #0x0   // #0
    18:   f9000002   str  x2, [x0]
    1c:   b9400400   ldr  w0, [x0,#4]
    20:   4a000020   eor  w0, w1, w0
    24:   d65f03c0   ret

This patch fixes the issue by passing an unsigned long pointer into the +Q constraint, as we do for our cmpxchg code. This may hazard against more than is necessary, but this is better than missing a necessary hazard.

Fixes: 305d454a ("arm64: atomics: implement native {relaxed, acquire, release} atomics")
Cc: <stable@vger.kernel.org> # 4.4.x-
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
-
Kristina Martsenko authored
When handling a data abort from EL0, we currently zero the top byte of the faulting address, as we assume the address is a TTBR0 address, which may contain a non-zero address tag. However, the address may be a TTBR1 address, in which case we should not zero the top byte. This patch fixes that. The effect is that the full TTBR1 address is passed to the task's signal handler (or printed out in the kernel log).

When handling a data abort from EL1, we leave the faulting address intact, as we assume it's either a TTBR1 address or a TTBR0 address with tag 0x00. This is true as far as I'm aware; we don't seem to access a tagged TTBR0 address anywhere in the kernel. Regardless, it's easy to forget about address tags, and code added in the future may not always remember to remove tags from addresses before accessing them. So add tag handling to the EL1 data abort handler as well. This also makes it consistent with the EL0 data abort handler.

Fixes: d50240a5 ("arm64: mm: permit use of tagged pointers at EL0")
Cc: <stable@vger.kernel.org> # 3.12.x-
Reviewed-by: Dave Martin <Dave.Martin@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
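
For reference, removing the tag amounts to sign-extending the address from bit 55, which restores the canonical form whether the top bits should be all-zero (TTBR0) or all-one (TTBR1); a hedged sketch with an illustrative helper name:

  #include <linux/bitops.h>

  /* Bits 63:56 may hold a tag; bit 55 selects TTBR0 vs TTBR1, so
   * sign-extending from it reconstructs the untagged address. */
  static inline unsigned long untag_address(unsigned long addr)
  {
      return (unsigned long)sign_extend64(addr, 55);
  }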
-
Kristina Martsenko authored
When we take a watchpoint exception, the address that triggered the watchpoint is found in FAR_EL1. We compare it to the address of each configured watchpoint to see which one was hit. The configured watchpoint addresses are untagged, while the address in FAR_EL1 will have an address tag if the data access was done using a tagged address. The tag needs to be removed to compare the address to the watchpoints.

Currently we don't remove it, and as a result can report the wrong watchpoint as being hit (specifically, always either the highest TTBR0 watchpoint or lowest TTBR1 watchpoint). This patch removes the tag.

Fixes: d50240a5 ("arm64: mm: permit use of tagged pointers at EL0")
Cc: <stable@vger.kernel.org> # 3.12.x-
Acked-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
-
Kristina Martsenko authored
When we emulate userspace cache maintenance in the kernel, we can currently send the task a SIGSEGV even though the maintenance was done on a valid address. This happens if the address has a non-zero address tag and happens to not be mapped in.

When we get the address from a user register, we don't currently remove the address tag before performing cache maintenance on it. If the maintenance faults, we end up in either __do_page_fault, where find_vma can't find the VMA if the address has a tag, or in do_translation_fault, where the tagged address will appear to be above TASK_SIZE. In both cases, the address is not mapped in, and the task is sent a SIGSEGV.

This patch removes the tag from the address before using it. With this patch, the fault is handled correctly, the address gets mapped in, and the cache maintenance succeeds.

As a second bug, if cache maintenance (correctly) fails on an invalid tagged address, the address gets passed into arm64_notify_segfault, where find_vma fails to find the VMA due to the tag, and the wrong si_code may be sent as part of the siginfo_t of the segfault. With this patch, the correct si_code is sent.

Fixes: 7dd01aef ("arm64: trap userspace "dc cvau" cache operation on errata-affected core")
Cc: <stable@vger.kernel.org> # 4.8.x-
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
-
Nicholas Piggin authored
The ibm,powerpc-cpu-features device tree binding describes CPU features with ASCII names and extensible compatibility, privilege, and enablement metadata that allows improved flexibility and compatibility with new hardware. The interface is described in detail in ibm,powerpc-cpu-features.txt in this patch. Currently this code is not enabled by default, and there are no released firmwares that provide the binding.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Nicholas Piggin authored
Currently we assume that if the cpu_spec has a pvr_mask then it must also have a cpu_name. But that will change in a subsequent commit when we do CPU feature discovery via the device tree, so check explicitly if cpu_name is NULL.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
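
A hedged sketch of what the explicit check looks like in /proc/cpuinfo-style code (reduced; the surrounding function is assumed):

  /* Previously pvr_mask != 0 implied cpu_name != NULL; with device
   * tree based discovery that no longer holds, so test cpu_name. */
  if (cur_cpu_spec->pvr_mask && cur_cpu_spec->cpu_name)
      seq_printf(m, "cpu\t\t: %s", cur_cpu_spec->cpu_name);
  else
      seq_printf(m, "cpu\t\t: unknown (%08x)", pvr);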
-
Bandan Das authored
Advertise the PML bit in vmcs12, but don't try to enable it in hardware when running L2, since L0 is emulating it. Also, preserve L0's settings for PML, since it may still want to log writes.

Signed-off-by: Bandan Das <bsd@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Bandan Das authored
With EPT A/D enabled, processor access to L2 guest paging structures will result in a write violation. When this happens, write the GUEST_PHYSICAL_ADDRESS to the pml buffer provided by L1 if the access is a write and the dirty bit is being set. This patch also adds the necessary checks during VMEntry if L1 has enabled PML. If the PML index overflows, we change the exit reason and run L1 to simulate a PML full event.

Signed-off-by: Bandan Das <bsd@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Bandan Das authored
When KVM updates accessed/dirty bits, this hook can be used to invoke an arch-specific function that implements/emulates dirty logging, such as PML.

Signed-off-by: Bandan Das <bsd@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-
Jim Mattson authored
According to the SDM, the CR3-target count must not be greater than 4. Future processors may support a different number of CR3-target values. Software should read the VMX capability MSR IA32_VMX_MISC to determine the number of values supported.

Signed-off-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
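
The corresponding consistency check, as a hedged sketch (the helper and the limit extraction are assumptions; the limit comes from IA32_VMX_MISC as described):

  /* IA32_VMX_MISC bits 24:16 report how many CR3-target values the
   * processor supports; the SDM guarantees at least 4. */
  static int nested_check_cr3_targets(struct vmcs12 *vmcs12, u32 supported)
  {
      if (vmcs12->cr3_target_count > supported)
          return -EINVAL;   /* VM-entry fails: invalid control field */
      return 0;
  }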
-
Nicholas Piggin authored
Similarly to commit 2563a70c ("powerpc/64s: Remove unnecessary relocation branch from idle handler"), the machine check handler has a BRANCH_TO from relocated to relocated code, which is unnecessary. It has also caused build errors with some toolchains:

  arch/powerpc/kernel/exceptions-64s.S: Assembler messages:
  arch/powerpc/kernel/exceptions-64s.S:395: Error: operand out of range (0xffffffffffff8280 is not between 0x0000000000000000 and 0x000000000000ffff)

Fixes: 1945bc45 ("powerpc/64s: Fix POWER9 machine check handler from stop state")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reported-and-tested-by: Abdul Haleem <abdhalee@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Michael Ellerman authored
Recently in commit f6eedbba ("powerpc/mm/hash: Increase VA range to 128TB") we increased the virtual address space for user processes to 128TB by default, and up to 512TB if user space opts in. This obviously required expanding the range of the Linux page tables. For Book3s 64-bit using hash and with PAGE_SIZE=64K, we increased the PGD to 2^15 entries. This meant we could cover the full address range, while still being able to insert a 16G hugepage at the PGD level and a 16M hugepage in the PMD.

The downside of that geometry is that it uses a lot of memory for the PGD, and in particular makes the PGD a 4-page allocation, which means it's much more likely to fail under memory pressure. Instead we can make the PMD larger, so that a single PUD entry maps 16G, allowing the 16G hugepages to sit at that level in the tree. We're then able to split the remaining bits between the PUD and PGD. We make the PGD slightly larger as that results in lower memory usage for typical programs. When THP is enabled the PMD actually doubles in size, to 2^11 entries, or 2^14 bytes, which is large but still < PAGE_SIZE.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Balbir Singh <bsingharora@gmail.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
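
Spelling out the arithmetic as index-size defines may help; a hedged sketch whose values are inferred from the sizes quoted above, so treat them as illustrative:

  /* 64K pages: 16 offset bits per page; page table entries are 8 bytes. */
  #define H_PTE_INDEX_SIZE   8  /* a PMD entry maps 2^(8+16)       = 16M  */
  #define H_PMD_INDEX_SIZE  10  /* a PUD entry maps 2^(10+8+16)    = 16G  */
  #define H_PUD_INDEX_SIZE   7  /* a PGD entry maps 2^(7+10+8+16)  = 2T   */
  #define H_PGD_INDEX_SIZE   8  /* total VA space: 2^(8+7+10+8+16) = 512T */

  /* With THP the PMD doubles to 2^11 entries (2^14 bytes = 16K),
   * still below the 64K PAGE_SIZE. */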
-
Horia Geantă authored
Makefile.postlink always includes include/config/auto.conf. However, this file is not present in a clean kernel tree, causing make to fail:

  $ git clone linuxppc.git
  $ cd linuxppc.git
  $ make distclean
  arch/powerpc/Makefile.postlink:10: include/config/auto.conf: No such file or directory
  make[1]: *** No rule to make target `include/config/auto.conf'. Stop.
  make: *** [vmlinuxclean] Error 2

Equally, running 'make distclean; make distclean' will trip the error case. Change the inclusion such that the file not being found does not trigger an error.

Fixes: f188d052 ("powerpc: Use the new post-link pass to check relocations")
Reported-by: Mircea Pop <mircea.pop@nxp.com>
Signed-off-by: Horia Geantă <horia.geanta@nxp.com>
Tested-by: Justin M. Forbes <jforbes@fedoraproject.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
-
Heiko Carstens authored
The perf tool assumes that kernel symbols are never present at address zero. In fact it assumes that if functions that map symbols to addresses return zero, the symbol was not found. Given that s390's _text symbol historically is located at address zero, this yields at least a couple of false errors and warnings in one of perf's test cases about symbols that are not present ("perf test 1").

To fix this, simply move the _text symbol to address 0x200, just behind the initial psw and channel program located at the beginning of the kernel image. This is now hard coded within the linker script. I tried a nicer solution which moves the initial psw and channel program into an own section. However, that would move the symbols within the "real" head.text section to different addresses, since the ".org" statements within head.S are relative to the head.text section. If there is a new section in front, everything else will be moved. Alternatively I could have adjusted all ".org" statements. But this current solution seems to be the easiest one, since nobody really cares where the _text symbol is actually located.

Reported-by: Zvonko Kosic <zkosic@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Heiko Carstens authored
Fixes a couple of compile warnings with gcc 7.1.0:

  arch/s390/kernel/sysinfo.c:578:20: note: directive argument in the range [-2147483648, 4]
    sprintf(link_to, "15_1_%d", topology_mnest_limit());

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Heiko Carstens authored
The average string that is copied from user space to kernel space is rather short. E.g. booting a system involves about 50,000 strncpy_from_user() calls where the NULL-terminated string has an average size of 27 bytes. By default our s390-specific strncpy_from_user() implementation however copies up to 4096 bytes, which is a waste of CPU cycles and cache lines. Therefore reduce the default length to L1_CACHE_BYTES (256 bytes), which also reduces the average execution time of strncpy_from_user() by 30-40%. Alternatively we could have switched to the generic strncpy_from_user() implementation, however it turned out that that variant would be slower than the now optimized s390 variant.

Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
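
A hedged sketch of the chunking idea in C (illustrative only; the real s390 implementation is optimized assembly and handles partial faults more carefully):

  #include <linux/kernel.h>
  #include <linux/string.h>
  #include <linux/uaccess.h>
  #include <linux/cache.h>

  /* Copy at most 'count' bytes from user space, one cache line at a
   * time, stopping as soon as the terminating NULL byte is seen.
   * Returns the string length, or -EFAULT on a failed user access. */
  static long strncpy_from_user_chunked(char *dst, const char __user *src,
                                        long count)
  {
      long done = 0;

      while (done < count) {
          long chunk = min_t(long, count - done, L1_CACHE_BYTES);
          char *nul;

          if (copy_from_user(dst + done, src + done, chunk))
              return -EFAULT;

          nul = memchr(dst + done, 0, chunk);
          if (nul)
              return nul - dst;   /* length excluding the terminator */
          done += chunk;
      }
      return done;                 /* no terminator within count bytes */
  }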
-