  1. Jan 14, 2025
    • initramfs: avoid filename buffer overrun · b4ccab34
      David Disseldorp authored and Frieder Schrempf committed
      
      [ Upstream commit e017671f534dd3f568db9e47b0583e853d2da9b5 ]
      
      The initramfs filename field is defined in
      Documentation/driver-api/early-userspace/buffer-format.rst as:
      
       37 cpio_file := ALGN(4) + cpio_header + filename + "\0" + ALGN(4) + data
      ...
       55 ============= ================== =========================
       56 Field name    Field size         Meaning
       57 ============= ================== =========================
      ...
       70 c_namesize    8 bytes            Length of filename, including final \0
      
      When extracting an initramfs cpio archive, the kernel's do_name() path
      handler assumes a zero-terminated path at @collected, passing it
      directly to filp_open() / init_mkdir() / init_mknod().
      
      If a specially crafted cpio entry carries a non-zero-terminated filename
      and is followed by uninitialized memory, then a file may be created with
      trailing characters that represent the uninitialized memory. The ability
      to create an initramfs entry would imply already having full control of
      the system, so the buffer overrun shouldn't be considered a security
      vulnerability.
      
      Append the output of the following bash script to an existing initramfs
      and observe any created /initramfs_test_fname_overrunAA* path. E.g.
        ./reproducer.sh | gzip >> /myinitramfs
      
      It's easiest to observe non-zero uninitialized memory when the output is
      gzipped, as it'll overflow the heap allocated @out_buf in __gunzip(),
      rather than the initrd_start+initrd_size block.
      
      ---- reproducer.sh ----
      nilchar="A"	# change to "\0" to properly zero terminate / pad
      magic="070701"
      ino=1
      mode=$(( 0100777 ))
      uid=0
      gid=0
      nlink=1
      mtime=1
      filesize=0
      devmajor=0
      devminor=1
      rdevmajor=0
      rdevminor=0
      csum=0
      fname="initramfs_test_fname_overrun"
      namelen=$(( ${#fname} + 1 ))	# plus one to account for terminator
      
      printf "%s%08x%08x%08x%08x%08x%08x%08x%08x%08x%08x%08x%08x%08x%s" \
      	$magic $ino $mode $uid $gid $nlink $mtime $filesize \
      	$devmajor $devminor $rdevmajor $rdevminor $namelen $csum $fname
      
      termpadlen=$(( 1 + ((4 - ((110 + $namelen) & 3)) % 4) ))
      printf "%.s${nilchar}" $(seq 1 $termpadlen)
      ---- reproducer.sh ----
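The termpadlen arithmetic in the reproducer can be restated in C. This is a minimal sketch, not kernel code: the function names and the CPIO_NEWC_HDR_LEN macro are hypothetical, and the 110-byte header length is taken from the script above (a 6-byte magic plus 13 fields of 8 hex digits).

```c
#include <stddef.h>

/* 6-byte magic + 13 header fields of 8 hex digits each (from the script). */
#define CPIO_NEWC_HDR_LEN 110

/*
 * Hypothetical helper: padding bytes needed after header + name so that
 * the data segment starts on a 4-byte boundary. namelen counts the
 * terminating \0, as c_namesize does.
 */
size_t cpio_name_pad(size_t namelen)
{
	return (4 - ((CPIO_NEWC_HDR_LEN + namelen) & 3)) % 4;
}

/* Terminator plus padding, the same quantity as termpadlen in the script. */
size_t cpio_term_pad_len(size_t namelen)
{
	return 1 + cpio_name_pad(namelen);
}
```

For the 28-character name in the script, namelen is 29 and cpio_term_pad_len(29) evaluates to 2, which matches the two trailing "A" characters in the /initramfs_test_fname_overrunAA* path.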
      
      Symlink filename fields handled in do_symlink() won't overrun past the
      data segment, due to the explicit zero-termination of the symlink
      target.
      
      Fix filename buffer overrun by aborting the initramfs FSM if any cpio
      entry doesn't carry a zero-terminator at the expected (name_len - 1)
      offset.
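The check described above can be sketched as follows. This is an illustration under stated assumptions, not the upstream patch: cpio_name_is_terminated() is a hypothetical helper, and the caller is assumed to have already collected name_len bytes of filename.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper (not the kernel's code): returns true only if the
 * collected filename carries a zero-terminator at offset name_len - 1.
 * name_len corresponds to c_namesize, which includes the final \0.
 */
bool cpio_name_is_terminated(const char *collected, size_t name_len)
{
	if (name_len == 0)
		return false;	/* no room for even the terminator */
	return collected[name_len - 1] == '\0';
}
```

A caller would abort extraction when this returns false, instead of handing the unterminated buffer to a path-based function.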
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: David Disseldorp <ddiss@suse.de>
      Link: https://lore.kernel.org/r/20241030035509.20194-2-ddiss@suse.de
      
      
      Signed-off-by: Christian Brauner <brauner@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
  2. Sep 17, 2024
  3. Jul 11, 2024
  4. May 13, 2024
  5. Apr 11, 2024
  6. Mar 11, 2024
    • update workarounds for gcc "asm goto" issue · cbf14953
      Linus Torvalds authored and Frieder Schrempf committed
      commit 68fb3ca0 upstream.
      
      In commit 4356e9f8 ("work around gcc bugs with 'asm goto' with
      outputs") I did the gcc workaround unconditionally, because the cause of
      the bad code generation wasn't entirely clear.
      
      In the meantime, Jakub Jelinek debugged the issue, and has come up with
      a fix in gcc [2], which also got backported to the still maintained
      branches of gcc-11, gcc-12 and gcc-13.
      
      Note that while the fix technically wasn't in the original gcc-14
      branch, Jakub says:
      
       "while it is true that no GCC 14 snapshots until today (or whenever the
        fix will be committed) have the fix, for GCC trunk it is up to the
        distros to use the latest snapshot if they use it at all and would
        allow better testing of the kernel code without the workaround, so
        that if there are other issues they won't be discovered years later.
        Most userland code doesn't actually use asm goto with outputs..."
      
      so we will consider gcc-14 to be fixed - if somebody is using gcc-14
      snapshots from before the fix, they should upgrade.
      
      Note that while the bug goes back to gcc-11, in practice other gcc
      changes seem to have effectively hidden it since gcc-12.1 as per a
      bisect by Jakub.  So even a gcc-14 snapshot without the fix likely
      doesn't show actual problems.
      
      Also, make the default 'asm_goto_output()' macro mark the asm as
      volatile by hand, because of an unrelated gcc issue [1] where it doesn't
      match the documented behavior ("asm goto is always volatile").
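A minimal sketch of what marking the asm volatile by hand could look like. The macro body here is an assumption based on the description above, spelled with the __asm__/__volatile__ keywords so it also builds in strict ISO modes. checked_copy() is a hypothetical user, and the example needs a compiler that supports asm goto with outputs (for example gcc-11 or later).

```c
/* Assumed shape of the default macro: asm goto, explicitly volatile. */
#define asm_goto_output(x...) __asm__ __volatile__ goto(x)

/*
 * Hypothetical user: the empty asm template copies 'in' to 'tmp' through
 * a register constraint and names 'fail' as a possible jump target. The
 * template never jumps, so the fall-through path is always taken.
 */
int checked_copy(int in, int *out)
{
	int tmp;

	asm_goto_output("" : "=r"(tmp) : "0"(in) : : fail);
	*out = tmp;
	return 0;
fail:
	return -1;
}
```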
      
      Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103979 [1]
      Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113921 [2]
      Link: https://lore.kernel.org/all/20240208220604.140859-1-seanjc@google.com/
      
      
      Requested-by: Jakub Jelinek <jakub@redhat.com>
      Cc: Uros Bizjak <ubizjak@gmail.com>
      Cc: Nick Desaulniers <ndesaulniers@google.com>
      Cc: Sean Christopherson <seanjc@google.com>
      Cc: Andrew Pinski <quic_apinski@quicinc.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  7. Feb 12, 2024
  8. Dec 12, 2023
  9. Oct 12, 2023
  10. Aug 17, 2023
  11. Apr 26, 2023
    • gcc: disable '-Warray-bounds' for gcc-13 too · a93c20f5
      Linus Torvalds authored
      
      commit 0da6e5fd upstream.
      
      We started disabling '-Warray-bounds' for gcc-12 originally on s390,
      because it resulted in some warnings that weren't realistically fixable
      (commit 8b202ee2: "s390: disable -Warray-bounds").
      
      That s390-specific issue was then found to be less common elsewhere, but
      generic (see f0be87c4: "gcc-12: disable '-Warray-bounds' universally
      for now"), and then later the version check was expanded to
      gcc-11 (5a41237a: "gcc: disable -Warray-bounds for gcc-11 too").
      
      And it turns out that I was much too optimistic in thinking that it's
      all going to go away, and here we are with gcc-13 showing all the same
      issues.  So instead of expanding this one version at a time, let's just
      disable it for gcc-11+, and put an end limit to it only when we actually
      find a solution.
      
      Yes, I'm sure some of this is because the kernel just does odd things
      (like our "container_of()" use, but also knowingly playing games with
      things like linker tables and array layouts).
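For readers unfamiliar with the "container_of()" pattern mentioned above, here is a simplified sketch. It is not the kernel's definition, which adds type checking, and struct item and item_value_from_node() are hypothetical names. The macro steps backward from a member pointer to the enclosing object, which is the kind of pointer arithmetic that bounds analysis may have trouble reasoning about.

```c
#include <stddef.h>

/* Simplified form: subtract the member's offset to reach the container. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct list_node {
	struct list_node *next;
};

/* Hypothetical object that embeds a list node. */
struct item {
	int value;
	struct list_node node;
};

/* Recover the enclosing item from a pointer to its embedded node. */
int item_value_from_node(struct list_node *n)
{
	struct item *it = container_of(n, struct item, node);

	return it->value;
}
```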
      
      And yes, some of the warnings are likely signs of real bugs, but when
      there are hundreds of false positives, that doesn't really help.
      
      Oh well.
      
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  12. Jan 14, 2023
  13. Nov 22, 2022
  14. Oct 21, 2022
  15. Oct 12, 2022
  16. Oct 03, 2022
  17. Sep 29, 2022
    • random: split initialization into early step and later step · f6238499
      Jason A. Donenfeld authored
      
      The full RNG initialization relies on some timestamps, made possible
      with initialization functions like time_init() and timekeeping_init().
      However, these are only available rather late in initialization.
      Meanwhile, other things, such as memory allocator functions, make use of
      the RNG much earlier.
      
      So split RNG initialization into two phases. We can provide arch
      randomness very early on, and then later, after timekeeping and such are
      available, initialize the rest.
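The two-phase idea can be illustrated with a toy model. Everything here is hypothetical (struct toy_rng, toy_mix(), and both init functions) and the mixing step is not cryptographic; the sketch only shows an early step that needs nothing but an architecture-provided seed, and a later step that folds in a timestamp once timekeeping is available.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy state: one 64-bit pool plus flags recording which phases have run. */
struct toy_rng {
	uint64_t pool;
	bool early_done;
	bool fully_done;
};

/* Toy mixer (NOT cryptographic): xor, odd multiply, xor-shift. */
static uint64_t toy_mix(uint64_t pool, uint64_t input)
{
	pool ^= input;
	pool *= 0x9E3779B97F4A7C15ULL;
	pool ^= pool >> 32;
	return pool;
}

/* Phase 1: usable very early, needs only an arch-provided seed. */
void toy_rng_init_early(struct toy_rng *rng, uint64_t arch_seed)
{
	rng->pool = toy_mix(rng->pool, arch_seed);
	rng->early_done = true;
}

/* Phase 2: run after timekeeping is up, folds in a timestamp. */
void toy_rng_init_late(struct toy_rng *rng, uint64_t timestamp)
{
	rng->pool = toy_mix(rng->pool, timestamp);
	rng->fully_done = true;
}
```

Early consumers, such as an allocator randomizing its freelists, would see a pool that already depends on the arch seed rather than a fixed initial value.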
      
      This ensures that, for example, slabs are properly randomized if RDRAND
      is available. Without this, CONFIG_SLAB_FREELIST_RANDOM=y loses a degree
      of its security, because its random seed is potentially deterministic,
      since it hasn't yet incorporated RDRAND. It also makes it possible to
      use a better seed in kfence, which currently relies on only the cycle
      counter.
      
      Another positive consequence is that on systems with RDRAND, running
      with CONFIG_WARN_ALL_UNSEEDED_RANDOM=y results in no warnings at all.
      
      One subtle side effect of this change is that on systems with no RDRAND,
      RDTSC is now only queried by random_init() once, committing the moment
      of the function call, instead of multiple times as before. This is
      intentional, as the multiple RDTSCs in a loop before weren't
      accomplishing very much, with jitter being better provided by
      try_to_generate_entropy(). Plus, filling blocks with RDTSC is still
      being done in extract_entropy(), which is necessarily called before
      random bytes are served anyway.
      
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Reviewed-by: Kees Cook <keescook@chromium.org>
      Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net>
      Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
  18. Sep 28, 2022
  19. Sep 27, 2022
    • Maple Tree: add new data structure · 54a611b6
      Liam R. Howlett authored
      Patch series "Introducing the Maple Tree"
      
      The maple tree is an RCU-safe range based B-tree designed to use modern
      processor cache efficiently.  There are a number of places in the kernel
      that a non-overlapping range-based tree would be beneficial, especially
      one with a simple interface.  If you use an rbtree with other data
      structures to improve performance or an interval tree to track
      non-overlapping ranges, then this is for you.
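To make "non-overlapping range-based" concrete, here is a toy lookup over a sorted array of inclusive ranges. It is not the maple tree and has none of its B-tree or RCU properties; struct range_entry and range_load() are hypothetical names used only to show the interface idea of mapping an index to the entry whose range contains it.

```c
#include <stddef.h>

/* One stored range; 'first' and 'last' are both inclusive. */
struct range_entry {
	unsigned long first;
	unsigned long last;
	void *entry;
};

/*
 * Binary search over ranges sorted by 'first' and assumed not to
 * overlap. Returns the entry covering 'index', or NULL if none does.
 */
void *range_load(const struct range_entry *tbl, size_t n, unsigned long index)
{
	size_t lo = 0, hi = n;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (index < tbl[mid].first)
			hi = mid;
		else if (index > tbl[mid].last)
			lo = mid + 1;
		else
			return tbl[mid].entry;
	}
	return NULL;
}
```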
      
      The tree has a branching factor of 10 for non-leaf nodes and 16 for leaf
      nodes.  With the increased branching factor, it is significantly shorter
      than the rbtree so it has fewer cache misses.  The removal of the linked
      list between subsequent entries also reduces the cache misses and the need
      to pull in the previous and next VMA during many tree alterations.
      
      The first user that is covered in this patch set is the vm_area_struct,
      where three data structures are replaced by the maple tree: the augmented
      rbtree, the vma cache, and the linked list of VMAs in the mm_struct.  The
      long term goal is to reduce or remove the mmap_lock contention.
      
      The plan is to get to the point where we use the maple tree in RCU mode.
      Readers will not block for writers.  A single write operation will be
      allowed at a time.  A reader re-walks if stale data is encountered.  VMAs
      would be RCU enabled and this mode would be entered once multiple tasks
      are using the mm_struct.
      
      Davidlohr said
      
      : Yes I like the maple tree, and at this stage I don't think we can ask for
      : more from this series wrt the MM - albeit there seems to still be some
      : folks reporting breakage.  Fundamentally I see Liam's work to (re)move
      : complexity out of the MM (not to say that the actual maple tree is not
      : complex) by consolidating the three complementary data structures very
      : much worth it considering performance does not take a hit.  This was very
      : much a turn off with the range locking approach, which in the worst case
      : incurred prohibitive overhead.  Also as Liam and Matthew have
      : mentioned, RCU opens up a lot of nice performance opportunities, and in
      : addition academia[1] has shown outstanding scalability of address spaces
      : with the foundation of replacing the locked rbtree with RCU aware trees.
      
      A similar work has been discovered in the academic press
      
      	https://pdos.csail.mit.edu/papers/rcuvm:asplos12.pdf
      
      Sheer coincidence.  We designed our tree with the intention of solving the
      hardest problem first.  Upon settling on a b-tree variant and a rough
      outline, we researched range-based b-trees and RCU b-trees and did find
      that article.  So it was nice to find reassurances that we were on the
      right path, but our design choice of using ranges made that paper unusable
      for us.
      
      This patch (of 70):
      
      The maple tree is an RCU-safe range based B-tree designed to use modern
      processor cache efficiently.  There are a number of places in the kernel
      that a non-overlapping range-based tree would be beneficial, especially
      one with a simple interface.  If you use an rbtree with other data
      structures to improve performance or an interval tree to track
      non-overlapping ranges, then this is for you.
      
      The tree has a branching factor of 10 for non-leaf nodes and 16 for leaf
      nodes.  With the increased branching factor, it is significantly shorter
      than the rbtree so it has fewer cache misses.  The removal of the linked
      list between subsequent entries also reduces the cache misses and the need
      to pull in the previous and next VMA during many tree alterations.
      
      The first user that is covered in this patch set is the vm_area_struct,
      where three data structures are replaced by the maple tree: the augmented
      rbtree, the vma cache, and the linked list of VMAs in the mm_struct.  The
      long term goal is to reduce or remove the mmap_lock contention.
      
      The plan is to get to the point where we use the maple tree in RCU mode.
      Readers will not block for writers.  A single write operation will be
      allowed at a time.  A reader re-walks if stale data is encountered.  VMAs
      would be RCU enabled and this mode would be entered once multiple tasks
      are using the mm_struct.
      
      There are additional BUG_ON() calls added within the tree, most of which
      are in debug code.  These will be replaced with WARN_ON() calls in the
      future.  There are also additional BUG_ON() calls within the code which
      will also be reduced in number at a later date.  These exist to catch
      things such as out-of-range accesses which would crash anyway.
      
      Link: https://lkml.kernel.org/r/20220906194824.2110408-1-Liam.Howlett@oracle.com
      Link: https://lkml.kernel.org/r/20220906194824.2110408-2-Liam.Howlett@oracle.com
      
      
      Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
      Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
      Tested-by: David Howells <dhowells@redhat.com>
      Tested-by: Sven Schnelle <svens@linux.ibm.com>
      Tested-by: Yu Zhao <yuzhao@google.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: SeongJae Park <sj@kernel.org>
      Cc: Will Deacon <will@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
  20. Sep 12, 2022