Skip to content
Snippets Groups Projects
  1. Oct 04, 2006
  2. Jun 29, 2006
    • Andrew Morton's avatar
      [PATCH] generic_file_buffered_write(): handle zero-length iovec segments · 81b0c871
      Andrew Morton authored
      
      The recent generic_file_write() deadlock fix caused
      generic_file_buffered_write() to loop inifinitely when presented with a
      zero-length iovec segment.  Fix.
      
      Note that this fix deliberately avoids calling ->prepare_write(),
      ->commit_write() etc with a zero-length write.  This is because I don't trust
      all filesystems to get that right.
      
      This is a cautious approach, for 2.6.17.x.  For 2.6.18 we should just go ahead
      and call ->prepare_write() and ->commit_write() with the zero length and fix
      any broken filesystems.  So I'll make that change once this code is stabilised
      and backported into 2.6.17.x.
      
      The reason for preferring to call ->prepare_write() and ->commit_write() with
      the zero-length segment: a zero-length segment _should_ be sufficiently
      uncommon that this is the correct way of handling it.  We don't want to
      optimise for poorly-written userspace at the expense of well-written
      userspace.
      
      Cc: "Vladimir V. Saveliev" <vs@namesys.com>
      Cc: Neil Brown <neilb@suse.de>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Chris Wright <chrisw@sous-sol.org>
      Cc: Greg KH <greg@kroah.com>
      Cc: <stable@kernel.org>
      Cc: walt <wa1ter@myrealbox.com>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      81b0c871
  3. Jun 25, 2006
    • NeilBrown's avatar
      [PATCH] Prepare for __copy_from_user_inatomic to not zero missed bytes · 01408c49
      NeilBrown authored
      
      The problem is that when we write to a file, the copy from userspace to
      pagecache is first done with preemption disabled, so if the source address is
      not immediately available the copy fails *and* *zeros* *the* *destination*.
      
      This is a problem because a concurrent read (which admittedly is an odd thing
      to do) might see zeros rather that was there before the write, or what was
      there after, or some mixture of the two (any of these being a reasonable thing
      to see).
      
      If the copy did fail, it will immediately be retried with preemption
      re-enabled so any transient problem with accessing the source won't cause an
      error.
      
      The first copying does not need to zero any uncopied bytes, and doing so
      causes the problem.  It uses copy_from_user_atomic rather than copy_from_user
      so the simple expedient is to change copy_from_user_atomic to *not* zero out
      bytes on failure.
      
      The first of these two patches prepares for the change by fixing two places
      which assume copy_from_user_atomic does zero the tail.  The two usages are
      very similar pieces of code which copy from a userspace iovec into one or more
      page-cache pages.  These are changed to remove the assumption.
      
      The second patch changes __copy_from_user_inatomic* to not zero the tail.
      Once these are accepted, I will look at similar patches of other architectures
      where this is important (ppc, mips and sparc being the ones I can find).
      
      This patch:
      
      There is a problem with __copy_from_user_inatomic zeroing the tail of the
      buffer in the case of an error.  As it is called in atomic context, the error
      may be transient, so it results in zeros being written where maybe they
      shouldn't be.
      
      In the usage in filemap, this opens a window for a well timed read to see data
      (zeros) which is not consistent with any ordering of reads and writes.
      
      Most cases where __copy_from_user_inatomic is called, a failure results in
      __copy_from_user being called immediately.  As long as the latter zeros the
      tail, the former doesn't need to.  However in *copy_from_user_iovec
      implementations (in both filemap and ntfs/file), it is assumed that
      copy_from_user_inatomic will zero the tail.
      
      This patch removes that assumption, so that after this patch it will
      be safe for copy_from_user_inatomic to not zero the tail.
      
      This patch also adds some commentary to filemap.h and asm-i386/uaccess.h.
      
      After this patch, all architectures that might disable preempt when
      kmap_atomic is called need to have their __copy_from_user_inatomic* "fixed".
      This includes
       - powerpc
       - i386
       - mips
       - sparc
      
      Signed-off-by: default avatarNeil Brown <neilb@suse.de>
      Cc: David Howells <dhowells@redhat.com>
      Cc: Anton Altaparmakov <aia21@cantab.net>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: William Lee Irwin III <wli@holomorphy.com>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      01408c49
  4. Jun 23, 2006
    • Hiro Yoshioka's avatar
      [PATCH] x86: cache pollution aware __copy_from_user_ll() · c22ce143
      Hiro Yoshioka authored
      
      Use the x86 cache-bypassing copy instructions for copy_from_user().
      
      Some performance data are
      
      Total of GLOBAL_POWER_EVENTS (CPU cycle samples)
      
      2.6.12.4.orig    1921587
      2.6.12.4.nt      1599424
      1599424/1921587=83.23% (16.77% reduction)
      
      BSQ_CACHE_REFERENCE (L3 cache miss)
      2.6.12.4.orig      57427
      2.6.12.4.nt        20858
      20858/57427=36.32% (63.7% reduction)
      
      L3 cache miss reduction of __copy_from_user_ll
      samples  %
      37408    65.1412  vmlinux                  __copy_from_user_ll
      23        0.1103  vmlinux                  __copy_user_zeroing_intel_nocache
      23/37408=0.061% (99.94% reduction)
      
      Top 5 of 2.6.12.4.nt
      Counted GLOBAL_POWER_EVENTS events (time during which processor is not stopped) with a unit mask of 0x01 (mandatory) count 100000
      samples  %        app name                 symbol name
      128392    8.0274  vmlinux                  __copy_user_zeroing_intel_nocache
      64206     4.0143  vmlinux                  journal_add_journal_head
      59746     3.7355  vmlinux                  do_get_write_access
      47674     2.9807  vmlinux                  journal_put_journal_head
      46021     2.8774  vmlinux                  journal_dirty_metadata
      pattern9-0-cpu4-0-09011728/summary.out
      
      Counted BSQ_CACHE_REFERENCE events (cache references seen by the bus unit) with a unit mask of 0x3f (multiple flags) count 3000
      samples  %        app name                 symbol name
      69755     4.2861  vmlinux                  __copy_user_zeroing_intel_nocache
      55685     3.4215  vmlinux                  journal_add_journal_head
      52371     3.2179  vmlinux                  __find_get_block
      45504     2.7960  vmlinux                  journal_put_journal_head
      36005     2.2123  vmlinux                  journal_stop
      pattern9-0-cpu4-0-09011744/summary.out
      
      Counted BSQ_CACHE_REFERENCE events (cache references seen by the bus unit) with a unit mask of 0x200 (read 3rd level cache miss) count 3000
      samples  %        app name                 symbol name
      1147      5.4994  vmlinux                  journal_add_journal_head
      881       4.2240  vmlinux                  journal_dirty_data
      872       4.1809  vmlinux                  blk_rq_map_sg
      734       3.5192  vmlinux                  journal_commit_transaction
      617       2.9582  vmlinux                  radix_tree_delete
      pattern9-0-cpu4-0-09011731/summary.out
      
      iozone results are
      
      original 2.6.12.4 CPU time = 207.768 sec
      cache aware       CPU time = 184.783 sec
      (three times run)
      184.783/207.768=88.94% (11.06% reduction)
      
      original:
      pattern9-0-cpu4-0-08191720/iozone.out:  CPU Utilization: Wall time   45.997    CPU time   64.527    CPU utilization 140.28 %
      pattern9-0-cpu4-0-08191741/iozone.out:  CPU Utilization: Wall time   46.878    CPU time   71.933    CPU utilization 153.45 %
      pattern9-0-cpu4-0-08191743/iozone.out:  CPU Utilization: Wall time   45.152    CPU time   71.308    CPU utilization 157.93 %
      
      cache awre:
      pattern9-0-cpu4-0-09011728/iozone.out:  CPU Utilization: Wall time   44.842    CPU time   62.465    CPU utilization 139.30 %
      pattern9-0-cpu4-0-09011731/iozone.out:  CPU Utilization: Wall time   44.718    CPU time   59.273    CPU utilization 132.55 %
      pattern9-0-cpu4-0-09011744/iozone.out:  CPU Utilization: Wall time   44.367    CPU time   63.045    CPU utilization 142.10 %
      
      Signed-off-by: default avatarHiro Yoshioka <hyoshiok@miraclelinux.com>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      c22ce143
  5. Jun 24, 2005
    • Carsten Otte's avatar
      [PATCH] xip: reduce code duplication · eb6fe0c3
      Carsten Otte authored
      
      This patch reworks filemap_xip.c with the goal to reduce code duplication
      from mm/filemap.c.  It applies agains 2.6.12-rc6-mm1.  Instead of
      implementing the aio functions, this one implements the synchronous
      read/write functions only.  For readv and writev, the generic fallback is
      used.  For aio, we rely on the application doing the fallback.  Since our
      "synchronous" function does memcpy immediately anyway, there is no
      performance difference between using the fallbacks or implementing each
      operation.
      
      Signed-off-by: default avatarCarsten Otte <cotte@de.ibm.com>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      eb6fe0c3
    • Carsten Otte's avatar
      [PATCH] xip: fs/mm: execute in place · ceffc078
      Carsten Otte authored
      
      - generic_file* file operations do no longer have a xip/non-xip split
      - filemap_xip.c implements a new set of fops that require get_xip_page
        aop to work proper. all new fops are exported GPL-only (don't like to
        see whatever code use those except GPL modules)
      - __xip_unmap now uses page_check_address, which is no longer static
        in rmap.c, and defined in linux/rmap.h
      - mm/filemap.h is now much more clean, plainly having just Linus'
        inline funcs moved here from filemap.c
      - fix includes in filemap_xip to make it build cleanly on i386
      
      Signed-off-by: default avatarCarsten Otte <cotte@de.ibm.com>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      ceffc078
Loading