- Apr 25, 2017
-
-
Martin Schwidefsky authored
With TASK_SIZE now reflecting the maximum size of the address space for a process the code for arch_get_unmapped_area[_topdown] can be simplified. Just let the logic pick a suitable address and deal with the page table upgrade after the address has been selected. Signed-off-by:
Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Martin Schwidefsky authored
The TASK_SIZE for a process should be maximum possible size of the address space, 2GB for a 31-bit process and 8PB for a 64-bit process. The number of page table levels required for a given memory layout is a consequence of the mapped memory areas and their location. Signed-off-by:
Martin Schwidefsky <schwidefsky@de.ibm.com>
-
- Apr 21, 2017
-
-
Martin Schwidefsky authored
The guarded storage interface allows to register a control block for each thread that is activated with the guarded storage broadcast event. To retrieve the complete state of a process from the kernel a register set for the stored broadcast control block is required. Signed-off-by:
Martin Schwidefsky <schwidefsky@de.ibm.com>
-
- Apr 20, 2017
-
-
Claudio Imbrenda authored
Add use_cmma field to mm_context_t, like we do for storage keys. Signed-off-by:
Claudio Imbrenda <imbrenda@linux.vnet.ibm.com> Acked-by:
Janosch Frank <frankja@de.ibm.com> Reviewed-by:
Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by:
Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by:
Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Claudio Imbrenda authored
Add PGSTE manipulation functions: * set_pgste_bits sets specific bits in a PGSTE * get_pgste returns the whole PGSTE * pgste_perform_essa manipulates a PGSTE to set specific storage states * ESSA_[SG]ET_* macros used to indicate the action for manipulate_pgste Signed-off-by:
Claudio Imbrenda <imbrenda@linux.vnet.ibm.com> Reviewed-by:
Janosch Frank <frankja@de.ibm.com> Reviewed-by:
Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by:
Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by:
Martin Schwidefsky <schwidefsky@de.ibm.com>
-
- Apr 12, 2017
-
-
Martin Schwidefsky authored
The CAD instruction never worked quite as expected for the spinlock code. It has been disabled by default with git commit 61b0b016, if the "cad" kernel parameter is specified it is enabled for both user space and the spinlock code. Leave the option to enable the instruction for user space but remove it from the spinlock code. Signed-off-by:
Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Martin Schwidefsky authored
Add a couple more __atomic_xxx function to atomic_ops.h and use them to replace the compare-and-swap inlines in the spinlock code. This changes the type of the lock value from unsigned int to int. Signed-off-by:
Martin Schwidefsky <schwidefsky@de.ibm.com>
-
- Apr 05, 2017
-
-
Martin Schwidefsky authored
There are three different code levels in regard to the identification of guest samples. They differ in the way the LPP instruction is used. 1) Old kernels without the LPP instruction. The guest program parameter is always zero. 2) Newer kernels load the process pid into the program parameter with LPP. The guest program parameter is non-zero if the guest executes in a process != idle. 3) The latest kernels load ((1UL << 31) | pid) with LPP to make the value non-zero even for the idle task. The guest program parameter is non-zero if the guest is running. All kernels load the process pid to CR4 on context switch. The CPU sampling code uses the value in CR4 to decide between guest and host samples in case the guest program parameter is zero. The three cases: 1) CR4==pid, gpp==0 2) CR4==pid, gpp==pid 3) CR4==pid, gpp==((1UL << 31) | pid) The load-control instruction to load the pid into CR4 is expensive and the goal is to remove it. To distinguish the host CR4 from the guest pid for the idle process the maximum value 0xffff for the PASN is used. This adds a fourth case for a guest OS with an updated kernel: 4) CR4==0xffff, gpp=((1UL << 31) | pid) The host kernel will have CR4==0xffff and will use (gpp!=0 || CR4!==0xffff) to identify guest samples. This works nicely with all 4 cases, the only possible issue would be a guest with an old kernel (gpp==0) and a process pid of 0xffff. Well, don't do that.. Suggested-by:
Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by:
Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by:
Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Sebastian Ott authored
Move a struct definition to get rid of a forward declaration. Signed-off-by:
Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by:
Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Sebastian Ott authored
Users complained that they are hitting the limit of 64 functions (some physical functions can spawn lots of virtual functions). Double the default limit. With the latest savings in static data usage this increases the image size by only 520 bytes. Signed-off-by:
Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by:
Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Sebastian Ott authored
Commit c506fff3 ("s390/pci: resize iomap") reduced the iomap to NR_FUNCTIONS * PCI_BAR_COUNT elements. Since we only support functions with 64bit BARs we can cut that number in half. Signed-off-by:
Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by:
Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Sebastian Ott authored
Address space identifiers are already defined in <asm/pci_insn.h>. Signed-off-by:
Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by:
Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Sebastian Ott authored
barsize was never used. Get rid of it. Signed-off-by:
Sebastian Ott <sebott@linux.vnet.ibm.com> Signed-off-by:
Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Martin Schwidefsky authored
The 32-bit lctl instruction is quite a bit slower than the 64-bit counter part lctlg. Use the faster instruction. Signed-off-by:
Martin Schwidefsky <schwidefsky@de.ibm.com>
-
- Mar 31, 2017
-
-
Dong Jia Shi authored
Although Linux does not use format-0 channel command words (CCW0) these are a non-optional part of the platform spec, and for the sake of platform compliance, and possibly some non-Linux guests, we have to support CCW0. Making the kernel execute a format 0 channel program is too much hassle because we would need to allocate and use memory which can be addressed by 24 bit physical addresses (because of CCW0.cda). So we implement CCW0 support by translating the channel program into an equivalent CCW1 program instead. Based upon an orginal patch by Kai Yue Wang. Signed-off-by:
Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com> Message-Id: <20170317031743.40128-16-bjsdjshi@linux.vnet.ibm.com> Signed-off-by:
Cornelia Huck <cornelia.huck@de.ibm.com>
-
Dong Jia Shi authored
To make vfio support subchannel devices, we need to leverage the mediated device framework to create a mediated device for the subchannel device. This registers the subchannel device to the mediated device framework during probe to enable mediated device creation. Reviewed-by:
Pierre Morel <pmorel@linux.vnet.ibm.com> Signed-off-by:
Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com> Message-Id: <20170317031743.40128-7-bjsdjshi@linux.vnet.ibm.com> Signed-off-by:
Cornelia Huck <cornelia.huck@de.ibm.com>
-
Dong Jia Shi authored
To make vfio support subchannel devices, we need a css driver for the vfio subchannels. This patch adds a basic vfio-ccw subchannel driver for this purpose. To enable VFIO for vfio-ccw, enable S390_CCW_IOMMU config option and configure VFIO as required. Acked-by:
Pierre Morel <pmorel@linux.vnet.ibm.com> Signed-off-by:
Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com> Message-Id: <20170317031743.40128-5-bjsdjshi@linux.vnet.ibm.com> Signed-off-by:
Cornelia Huck <cornelia.huck@de.ibm.com>
-
Hendrik Brueckner authored
The return code of hw_perf_event_update() is not evaluated by its callers. Hence, simplify the function by removing the return code. Reported-by:
Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by:
Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Signed-off-by:
Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Hendrik Brueckner authored
Using a register variable for r4 is not necessary. Let the compiler decide the register to be used. Reported-by:
Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by:
Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Signed-off-by:
Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Hendrik Brueckner authored
Make clear that the event definitions relate to the counter facility (cf) and not to the sampling facility (sf). Signed-off-by:
Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Signed-off-by:
Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Hendrik Brueckner authored
Add the event names for the IBM z13/z13s specific CPU-MF counters. Also improve the merging of the generic and model specific events so that their sysfs attribute definitions completely reside in memory. Hence, flagging the generic event attribute definitions as initdata too. Signed-off-by:
Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Signed-off-by:
Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Hendrik Brueckner authored
Complete the IBM z13 support and support counters from the MT-diagnostic counter set. Note that this counter set is available only if SMT is enabled. Signed-off-by:
Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Signed-off-by:
Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Hendrik Brueckner authored
The validate_event() function just checked for reserved counters in particular CPU-MF counter sets. Because the number of counters in counter sets vary among different hardware models, remove the explicit check to tolerate new models. Reserved counters are not accounted and, thus, will return zero. Signed-off-by:
Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Signed-off-by:
Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Hendrik Brueckner authored
Use the highest counter number that can be specified for the ecctr (extract CPU counter) instruction for perf. Signed-off-by:
Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Signed-off-by:
Martin Schwidefsky <schwidefsky@de.ibm.com>
-
- Mar 30, 2017
-
-
Heiko Carstens authored
Signed-off-by:
Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by:
Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Heiko Carstens authored
Signed-off-by:
Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by:
Martin Schwidefsky <schwidefsky@de.ibm.com>
-
- Mar 28, 2017
-
-
Heiko Carstens authored
Signed-off-by:
Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by:
Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Heiko Carstens authored
Deferred struct page initialization works on s390. However it makes only sense for the fake numa case, since the kthreads that initialize struct pages are started per node. Without fake numa there is just a single node and therefore no gain. However there is no reason to not enable this feature. Therefore select the config option and enable the feature in all config files. Signed-off-by:
Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by:
Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Michael Holzheu authored
Since linux v3.14 with commit 38dfac84 ("vmcore: prevent PT_NOTE p_memsz overflow during header update") on s390 we get the following message in the kdump kernel: Warning: Exceeded p_memsz, dropping PT_NOTE entry n_namesz=0x6b6b6b6b, n_descsz=0x6b6b6b6b The reason for this is that we don't create a final zero note in the ELF header which the proc/vmcore code uses to find out the end of the notes section (see also kernel/kexec_core.c:final_note()). It still worked on s390 by chance because we (most of the time?) have the byte pattern 0x6b6b6b6b after the notes section which also makes the notes parsing code stop in update_note_header_size_elf64() because 0x6b6b6b6b is interpreded as note size: if ((real_sz + sz) > max_sz) { pr_warn("Warning: Exceeded p_memsz, dropping P ...); break; } So fix this and add the missing final note to the ELF header. We don't have to adjust the memory size for ELF header ("alloc_size") because the new ELF note still fits into the 0x1000 base memory. Cc: stable@vger.kernel.org # v4.4+ Signed-off-by:
Michael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by:
Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Heiko Carstens authored
HAVE_ARCH_EARLY_PFN_TO_NID selects a not present Kconfig option. Therefore remove it. Given that the first call of early_pfn_to_nid() happens after numa_setup() finished to establish the memory to node mapping, there is no need to implement an architecture private version of __early_pfn_to_nid() like (only) ia64 does. Signed-off-by:
Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by:
Martin Schwidefsky <schwidefsky@de.ibm.com>
-
- Mar 24, 2017
-
-
Janosch Frank authored
ptep_notify and gmap_shadow_notify both need a guest address and therefore retrieve them from the available virtual host address. As they operate on the same guest address, we can calculate it once and then pass it on. As a gmap normally has more than one shadow gmap, we also do not recalculate for each of them any more. Signed-off-by:
Janosch Frank <frankja@linux.vnet.ibm.com> Signed-off-by:
Martin Schwidefsky <schwidefsky@de.ibm.com>
-
- Mar 22, 2017
-
-
Heiko Carstens authored
There is no need for the __ASSEMBLY__ ifdefery anymore since the architecture level set code that deals with facility bits was converted to C in the meantime. Signed-off-by:
Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by:
Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Heiko Carstens authored
Use the actual size of the facility list array within the lowcore structure for the MAX_FACILITY_BIT define instead of a comment which states what this is good for. This makes it a bit harder to break things. Signed-off-by:
Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by:
Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Heiko Carstens authored
Provide the remaining stsi information via debugfs files. This also might be useful for debugging purposes. Suggested-by:
Christian Borntraeger <borntraeger@de.ibm.com> Acked-by:
Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by:
Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by:
Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Heiko Carstens authored
Provide the raw stsi 15,1,x data contents via debugfs. This makes it much easier to debug unexpected scheduling domains on machines that provide cpu topology information. Therefore this file adds a new 's390/stsi' debugfs directory with a file for each possible topology nesting level that is allowed by the architecture. The files will be created regardless if the machine supports all, or any, level. If a level is not supported, or no data is available, user space can recognize this with a -EINVAL error code when trying to read such data. In addition a 'topology' symlink is created that points to the file that contains the data that is used to create the scheduling domains. Acked-by:
Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by:
Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by:
Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Heiko Carstens authored
Introduce a top-level 's390' directory which should be used when adding new s390 specific debug feature files and/or directories. This makes hopefully sure that the contents of the s390 directory will be a bit more structured. Right now we have a couple of top-level files where it is not easy to tell to which subsystem they belong to. Acked-by:
Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by:
Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by:
Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Harald Freudenberger authored
User space needs some information about the secure key(s) before actually invoking the pkey and/or paes funcionality. This patch introduces a new ioctl API and in kernel API to verify the the secure key blob and give back some information about the key (type, bitsize, old MKVP). Both APIs are described in detail in the header files arch/s390/include/asm/pkey.h and arch/s390/include/uapi/asm/pkey.h. Signed-off-by:
Harald Freudenberger <freude@linux.vnet.ibm.com> Signed-off-by:
Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Michael Holzheu authored
Signed-off-by:
Michael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by:
Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Heiko Carstens authored
If running within a level 3 hypervisor, the hypervisor provides a SYSIB block which contains a control program indentifier string. Use this string instead of the simple KVM and z/VM strings only. In case of z/VM this provides addtional information: the z/VM version. The new string looks similar to this: Hardware name: IBM 2964 N96 702 (z/VM 6.4.0) Signed-off-by:
Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by:
Martin Schwidefsky <schwidefsky@de.ibm.com>
-
Heiko Carstens authored
The arch description provided for the "Hardware name:" contains lots of extra whitespace due to the way the SYSIB contents are defined (strings aren't zero terminated). This looks a bit odd and therefore remove the extra whitespace characters. This also gives the opportunity to add more information, if required, without hitting the magic 80 characters per line limit. Signed-off-by:
Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by:
Martin Schwidefsky <schwidefsky@de.ibm.com>
-