  1. Sep 26, 2014
  2. Sep 25, 2014
  3. Sep 24, 2014
  4. Sep 22, 2014
  5. Sep 19, 2014
  6. Sep 17, 2014
  7. Sep 16, 2014
    • kvm: ioapic: conditionally delay irq delivery during eoi broadcast · 184564ef
      Zhang Haoyu authored
      
      Currently, we call ioapic_service() immediately when we find the irq is still
      active during eoi broadcast. But for real hardware, there's some delay between
      the EOI writing and irq delivery.  If we do not emulate this behavior, and
      re-inject the interrupt immediately after the guest sends an EOI and re-enables
      interrupts, a guest might spend all its time in the ISR if it has a broken
      handler for a level-triggered interrupt.
      
      Such livelock actually happens with Windows guests when resuming from
      hibernation.
      
      As there's no way to distinguish a broken handler from newly raised interrupts, this patch
      delays an interrupt once 10,000 consecutive EOIs have found that the interrupt was
      still high.  The guest can then make a little forward progress, until a proper
      IRQ handler is set or until some detection routine in the guest (such as
      Linux's note_interrupt()) recognizes the situation.
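
The counting rule described above can be sketched in plain C as follows. This is a minimal user-space illustration, not the kernel's ioapic code: the names `pin_state`, `on_eoi`, `EOI_STORM_THRESHOLD`, and the `eoi_action` values are hypothetical, and the sketch assumes the counter restarts both when the line is found low and after a delayed injection has been chosen.

```c
#include <stdbool.h>
#include <stdint.h>

/* Threshold from the commit message; the macro name is hypothetical. */
#define EOI_STORM_THRESHOLD 10000u

/* What the emulated IOAPIC would do in response to one EOI (hypothetical enum). */
enum eoi_action {
    EOI_IDLE,             /* line is low, nothing to re-inject */
    EOI_REINJECT_NOW,     /* line still high, deliver again immediately */
    EOI_REINJECT_DELAYED  /* line still high too many times, defer delivery */
};

/* Per-pin bookkeeping: EOIs in a row that found the line still asserted. */
struct pin_state {
    uint32_t consecutive_eoi;
};

static enum eoi_action on_eoi(struct pin_state *pin, bool line_still_high)
{
    if (!line_still_high) {
        /* The handler cleared the source, so the streak is broken. */
        pin->consecutive_eoi = 0;
        return EOI_IDLE;
    }

    if (++pin->consecutive_eoi == EOI_STORM_THRESHOLD) {
        /* Assumption: restart the count so the guest gets a pause per streak. */
        pin->consecutive_eoi = 0;
        return EOI_REINJECT_DELAYED;
    }

    return EOI_REINJECT_NOW;
}
```

In this sketch a well-behaved handler never reaches the threshold, because any EOI that finds the line low resets the count; only a handler that repeatedly fails to clear the level-triggered source is slowed down.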
      
      Cc: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Signed-off-by: Zhang Haoyu <zhanghy@sangfor.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  8. Sep 14, 2014
  9. Sep 11, 2014
  10. Sep 05, 2014
  11. Sep 03, 2014
    • kvm: fix potentially corrupt mmio cache · ee3d1570
      David Matlack authored
      
      vcpu exits and memslot mutations can run concurrently as long as the
      vcpu does not acquire the slots mutex. Thus it is theoretically possible
      for memslots to change underneath a vcpu that is handling an exit.
      
      If we increment the memslot generation number again after
      synchronize_srcu_expedited(), vcpus can safely cache memslot generation
      without maintaining a single rcu_dereference through an entire vm exit.
      And much of the x86/kvm code does not maintain a single rcu_dereference
      of the current memslots during each exit.
      
      We can prevent the following case:
      
         vcpu (CPU 0)                             | thread (CPU 1)
      --------------------------------------------+--------------------------
      1  vm exit                                  |
      2  srcu_read_unlock(&kvm->srcu)             |
      3  decide to cache something based on       |
           old memslots                           |
      4                                           | change memslots
                                                  | (increments generation)
      5                                           | synchronize_srcu(&kvm->srcu);
      6  retrieve generation # from new memslots  |
      7  tag cache with new memslot generation    |
      8  srcu_read_unlock(&kvm->srcu)             |
      ...                                         |
         <action based on cache occurs even       |
          though the caching decision was based   |
          on the old memslots>                    |
      ...                                         |
         <action *continues* to occur until next  |
          memslot generation change, which may    |
          be never>                               |
                                                  |
      
      By incrementing the generation after synchronizing with kvm->srcu readers,
      we ensure that the generation retrieved in (6) will become invalid soon
      after (8).
      
      Keeping the existing increment is not strictly necessary, but we
      do keep it and just move it for consistency from update_memslots to
      install_new_memslots.  It invalidates old cached MMIOs immediately,
      instead of having to wait for the end of synchronize_srcu_expedited,
      which makes the code more clearly correct in case CPU 1 is preempted
      right after synchronize_srcu() returns.
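
The two increments described above can be modeled with a small C sketch. It is an illustration under stated assumptions rather than the kernel code: `memslots_sketch`, `publish_new_memslots`, `finish_memslot_update`, and `cache_tag_valid` are hypothetical names, and the wait for SRCU readers is represented only by the gap between the two calls.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the memslots structure; only the generation matters here. */
struct memslots_sketch {
    uint64_t generation;
};

/* First increment: applied when the new memslots are published,
 * before waiting for existing readers to finish. */
static void publish_new_memslots(struct memslots_sketch *next,
                                 const struct memslots_sketch *old)
{
    next->generation = old->generation + 1;
}

/* Second increment: applied only after the wait for readers
 * (synchronize_srcu_expedited() in the commit message) has completed. */
static void finish_memslot_update(struct memslots_sketch *next)
{
    next->generation++;
}

/* A cached entry is trusted only while its tag equals the current generation. */
static bool cache_tag_valid(uint64_t tag, const struct memslots_sketch *cur)
{
    return tag == cur->generation;
}
```

If a vcpu tags its cache with the generation it reads between the two calls (step 6 in the table), the second increment makes that tag stale, so a caching decision based on the old memslots cannot stay valid indefinitely.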
      
      To avoid halving the generation space in SPTEs, always presume that the
      low bit of the generation is zero when reconstructing a generation number
      out of an SPTE.  This effectively disables MMIO caching in SPTEs during
      the call to synchronize_srcu_expedited.  Using the low bit this way is
      somewhat like a seqcount---where the protected thing is a cache, and
      instead of retrying we can simply punt if we observe the low bit to be 1.
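
A minimal sketch of the low-bit convention, assuming a simplified encoding in which the stored tag is just the generation shifted right by one. The real SPTE layout splits the generation across several bit ranges and uses a narrower field; the helper names `pack_generation`, `unpack_generation`, and `cached_entry_usable` are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Store the generation without its low bit, so no tag space is spent on it.
 * Assumption: a 32-bit tag; wraparound is ignored in this sketch. */
static uint32_t pack_generation(uint64_t gen)
{
    return (uint32_t)(gen >> 1);
}

/* Reconstruct a generation from a tag, always presuming the low bit was zero. */
static uint64_t unpack_generation(uint32_t stored)
{
    return (uint64_t)stored << 1;
}

/* A cached entry is usable only if the reconstructed generation matches.
 * While the current generation is odd (update in flight), nothing matches,
 * so the caller simply skips the cache instead of retrying. */
static bool cached_entry_usable(uint32_t stored, uint64_t current_gen)
{
    return unpack_generation(stored) == current_gen;
}
```

With this encoding, an entry tagged while the generation was odd reconstructs to the preceding even value, so it can never be mistaken for current once the update completes.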
      
      Cc: stable@vger.kernel.org
      Signed-off-by: David Matlack <dmatlack@google.com>
      Reviewed-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
      Reviewed-by: David Matlack <dmatlack@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
    • KVM: do not bias the generation number in kvm_current_mmio_generation · 00f034a1
      Paolo Bonzini authored
      
      The next patch will give a meaning (a la seqcount) to the low bit of the
      generation number.  Ensure that it matches between kvm->memslots->generation
      and kvm_current_mmio_generation().
      
      Cc: stable@vger.kernel.org
      Reviewed-by: David Matlack <dmatlack@google.com>
      Reviewed-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
  12. Aug 29, 2014
  13. Aug 27, 2014