Blogging about open source virtualization

News from QEMU, KVM, libvirt, libguestfs, virt-manager and related tools

December 17, 2018

KVM on Z

QEMU v3.1 released

QEMU v3.1 is out. Besides a number of small enhancements, some items that we would like to highlight from a KVM on Z perspective:
  • Huge Pages Support: KVM guests can now utilize 1MB pages. As this removes one layer of address translation for the guest backing, fewer page faults need to be processed, and fewer translation lookaside buffer (TLB) entries are needed to hold translations. This, as well as the TLB improvements in z14, will improve KVM guest performance.
    To use:
    Create the config file /etc/modprobe.d/kvmhpage.conf with the following content to enable huge pages for KVM:

       options kvm hpage=1


    Furthermore, add the following line to /etc/sysctl.conf to reserve N huge pages:

       vm.nr_hugepages = N

    Alternatively, append the following statement to the kernel parameter line in case support is compiled into the kernel: kvm.hpage=1 hugepages=N.
    Note that means to add hugepages dynamically after boot exist, but with effects like memory fragmentation, it is preferable to define huge pages as early as possible.
    If successful, the file /proc/sys/vm/nr_hugepages should show N huge pages. See here for further documentation.
    Then, to enable huge pages for a guest, add the following element to the respective domain XML:

       <memoryBacking>
         <hugepages/>
       </memoryBacking>


    The use of huge pages in the host is orthogonal to the use of huge pages in the guest. Both will improve the performance independently by reducing the number of page faults and the number of page table walks after a TLB miss.
    The biggest performance improvement can be achieved by using huge pages in both host and guest, e.g. with libhugetlbfs, as this will also make use of the larger 1M TLB entries in the hardware.
    Requires Linux kernel 4.19.
  • vfio-ap: The Adjunct Processor (AP) facility is an IBM Z cryptographic facility comprised of three AP instructions and up to 256 cryptographic adapter cards, each of which can be grouped into up to 85 domains, providing cryptographic services. vfio-ap maps a subset of the AP devices/domains to one or more KVM guests, such that the host and each guest have exclusive access to a discrete set of AP devices.
    Here is a small sample script illustrating host setup:

       # load vfio-ap device driver
       modprobe vfio-ap

       # choose a UUID for the mediated device (or generate one with uuidgen)
       UUID=e926839d-a0b4-4f9c-95d0-c9b34190c4ba

       # reserve AP queue 7 on adapter 3 for use by a KVM guest
       echo -0x3 > /sys/bus/ap/apmask
       echo -0x7 > /sys/bus/ap/aqmask

       # create a mediated device (mdev) to provide userspace access
       # to a device in a secure manner
       echo $UUID > /sys/devices/vfio_ap/matrix/mdev_supported_types/vfio_ap-passthrough/create
       # assign adapter, domain and control domain
       echo +0x3 > /sys/devices/vfio_ap/matrix/${UUID}/assign_adapter
       echo +0x7 > /sys/devices/vfio_ap/matrix/${UUID}/assign_domain
       echo +0x7 > /sys/devices/vfio_ap/matrix/${UUID}/assign_control_domain


    To make use of the AP device in a KVM guest, add the following element to the respective domain XML:

       <hostdev mode='subsystem' type='mdev' managed='no' model='vfio-ap'>
         <source>
           <address uuid='e926839d-a0b4-4f9c-95d0-c9b34190c4ba'/>
         </source>
       </hostdev>


    Once complete, use the passthrough device in a KVM guest just like a regular crypto adapter.
    Requires Linux kernel 4.20 and libvirt 4.9.
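
As a quick sanity check for the huge page setup described above, the reserved count can be compared against the requested N. The following is a minimal sketch and not part of the original post; the function name check_hugepages and its optional second argument (a path that overrides /proc/sys/vm/nr_hugepages, handy for dry runs) are assumptions made for illustration.

```shell
# check_hugepages N [FILE]
# Succeeds if FILE (default: /proc/sys/vm/nr_hugepages) reports at
# least N reserved huge pages. Hypothetical helper for illustration.
check_hugepages() {
    wanted=$1
    file=${2:-/proc/sys/vm/nr_hugepages}
    actual=$(cat "$file") || return 2
    if [ "$actual" -ge "$wanted" ]; then
        echo "ok: $actual huge pages reserved (wanted $wanted)"
    else
        echo "short: only $actual of $wanted huge pages reserved"
        return 1
    fi
}
```

Calling check_hugepages with the N used in /etc/sysctl.conf right after boot shows whether the kernel could satisfy the full reservation before any guest is started.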

by Stefan Raspl (noreply@blogger.com) at December 17, 2018 03:58 PM

December 13, 2018

KVM on Z

SLES 12 SP4 released

SLES 12 SP4 is out! See the announcement and the release notes with Z-specific changes.
It ships the following code levels:
  • Linux kernel 4.12 (SP3: 4.4),
  • QEMU v2.11 (SP3: v2.9), and
  • libvirt v4.0 (SP3: v3.3).
See previous blog entries on QEMU v2.10 and v2.11 for details on new features that become available by the QEMU package update.
See previous blog entries on Linux kernel 4.8 and 4.11 for details on new features becoming available through the kernel update, e.g. nested virtualization support.
An additional feature in this release is the availability of STHYI information in LPAR environments. Requires qclib v1.3 or later. See this blog post for general information on qclib.
    Furthermore, note that these changes provide a full CPU model, which provides protection against live guest migration compatibility troubles. E.g. migrating a guest exploiting the latest features to a KVM instance running on an earlier IBM Z machine lacking said features would be detected and prevented.
    Note: With this feature, live guest migration back to a KVM instance that does not yet support CPU models (e.g. SLES 12 SP3) will not work anymore.

      by Stefan Raspl (noreply@blogger.com) at December 13, 2018 10:04 AM

      December 12, 2018

      QEMU project

      QEMU version 3.1.0 released

      We would like to announce the availability of the QEMU 3.1.0 release. This release contains 1900+ commits from 189 authors.

      You can grab the tarball from our download page. The full list of changes is available in the Wiki.

      Highlights include:

      • ARM: emulation support for microbit and Xilinx Versal machine models
      • ARM: support for ARMv6M architecture and Cortex-M0 CPU model
      • ARM: support for Cortex-A72 CPU model
      • ARM: virt/xlnx-zynqmp: virtualization extensions for GICv2 interrupt controller
      • ARM: emulation of AArch32 virtualization/hypervisor mode now supported for Cortex-A7 and Cortex-A15
      • MIPS: emulation support for nanoMIPS I7200
      • MIPS: emulation support for MXU SIMD instructions for MIPS32
      • PowerPC: pseries: enablement of nested virtualization via KVM-HV
      • PowerPC: prep: deprecated in favor of 40p machine model
      • PowerPC: 40p: IRQ routing fixes, switch from Open HackWare to OpenBIOS
      • PowerPC: g3beige/mac99: support for booting from virtio-blk-pci
      • s390: VFIO passthrough support for crypto devices (vfio-ap)
      • s390: KVM support for backing guests with huge pages
      • SPARC: sun4u: support for booting from virtio-blk-pci
      • x86: multi-threaded TCG support
      • x86: KVM support for Enlightened VMCS (improved perf for Hyper-V on KVM)
      • x86: KVM support for Hyper-V IPI enlightenments
      • Xtensa: support for input from chardev consoles
      • Support for AMD IOMMU interrupt remapping and guest virtual APIC mode
      • XTS cipher mode is now ~2x faster
      • stdvga and bochs-display devices can expose EDID information to guest (for use with xres/yres resolution options)
      • qemu-img tool can now generate LUKS-encrypted files through ‘convert’ command
      • and lots more…

      Thank you to everyone involved!

      December 12, 2018 06:50 AM

      December 04, 2018

      Cornelia Huck

      Notes from KVM Forum 2018

      KVM Forum 2018 took place October 24 - 26 in Edinburgh, Scotland. Better late than never, here are some of my notes and impressions. As always, there was a lot going on, and I could not attend everything that I would have found interesting. Fortunately, video recordings are available (see the page linked above, respectively the YouTube channel); here, I'd like to thank the folks organizing the logistics, recording the talks, and uploading nicely edited versions!

      This year, KVM Forum was again co-located with OSS Europe, and on the first day (which also featured the annual QEMU summit), talks were on a shared track. This meant an opportunity for people attending OSS to hear some KVM and virtualization related talks; unfortunately, it also meant that the room where the KVM Forum talks were held was very crowded. Nevertheless, it is always nice if a talk is interesting enough to attract a good number of people; I'm happy that my maintainership talk also attracted a nice audience. Other talks from the first day I enjoyed were Alex' talk about L1TF and Marc's talk about running huge libvirt installations.

      The second and third day featured some more comfortable rooms; organization-wise, I liked that talks about similar topics were grouped back-to-back.

      On these days, we had the keynotes for KVM, QEMU, and libvirt; as well as the contributor Q&A panel - some good questions from the audience there. Also check out Christian's talk about the various architectures supported by KVM and how much commonality is there (or not).

      Most of the time, days two and three were dual-track. Some of the topics covered were vfio and migration with vfio; nested virtualization; not-so-common architectures (including s390!); testing and continuous integration. I find it hard to point out specific sessions and recommend browsing through the posted videos instead.

      Some topics were delved into more deeply in BOF sessions; myself, I attended the vfio migration BOF which gave me a couple of things to think about. Many BOF sessions subsequently posted summaries on the relevant mailing lists.

      One of the most important features of any conference is, of course, the hallway track: Meeting new people, seeing old acquaintances again, and impromptu discussions about a lot of different topics. I find that this is one of the most valuable experiences, both for putting a face to a name and for discussing things you did not even think about beforehand.

      So, for an even shorter summary of my short notes: KVM Forum 2018 was great, go watch some videos, and consider attending future KVM Forums :)

      by Cornelia Huck (noreply@blogger.com) at December 04, 2018 06:05 PM

      December 03, 2018

      KVM on Z

      SLES 12 SP3 Updates


      SLES 12 SP3, released late last year, received a couple of mostly performance and security-related updates in support of IBM z14 and LinuxONE through the maintenance web updates.
      In particular:

        by Stefan Raspl (noreply@blogger.com) at December 03, 2018 08:38 AM

        December 01, 2018

        Thomas Huth

        QEMU Advent Calendar 2018 opened the first door

        Starting today, on December 1st, the first door of the QEMU Advent Calendar 2018 can now be opened! The advent calendar reveals a new disk image for download on each of the first 24 days in December 2018, to create a fun experience for the QEMU community, to celebrate the 15th anniversary of QEMU, and to provide some good images for testing the various CPU targets of QEMU – this year it will contain way more images for non-x86 targets than before, so if you are interested in collecting test images for the various CPU targets of QEMU, be sure to check the calendar regularly!

        December 01, 2018 07:05 AM

        November 29, 2018

        Daniel Berrange

        Improved translation po file handling by ditching gettext autotools integration

        The libvirt library has long provided translations of its end user facing strings, which largely means error messages and console output from command line tools / daemons. Since libvirt uses autotools for its build system, it naturally used the standard automake integration provided by gettext for handling .po files. The libvirt.pot file with master strings is exported to Zanata, where the actual translation work is outsourced to the Fedora translation team who support up to ~100 languages. At time of writing libvirt has some level of translation in ~45 languages.

        With use of Zanata, libvirt must periodically create an updated libvirt.pot file and push it to Zanata, and then just before release it must pull the latest translated .po files back into GIT for release.

        There have been a number of problems with this approach which have been annoying us pretty much since the start, and earlier this year it finally became too much to bear any longer.

        • The per-language translation files stored in git contain source file name and line number annotations to indicate where each translatable string originates. Since the translation files are not re-generated on every single source file change, the file location annotations become increasingly out of date after every commit. When the translation files are updated, 98% of the diff is simply changing source file locations, leading to a very poor signal/noise ratio.
        • The strings in the per-language translation files are sorted according to source filename. Thus when code is moved between files, or when files are renamed, the strings in the updated translation files all get needlessly reordered, again leading to a poor signal/noise ratio in diffs.
        • Each language translation file contains every translatable string even those which do not have any translation yet. This makes sense if translators are working directly against the .po files, but in libvirt everything is done via the Zanata UI which already knows the list of untranslated strings.
        • The per-language translation files grow in size over time with previously used message strings appended to the end of the file, never discarded by the gettext tools. This again makes sense if translators are working directly against .po files, but Zanata already provides a full translation memory containing historically used strings.
        • Whenever ‘make dist’ is run the gettext autotools integration will regenerate the per-language translation files. As a result of the three previous points, every time a release is made there’s a giant commit more than 100MB in size that contains diffs for translated files which are entirely noise and no signal.

        One suggested approach to deal with this is to stop storing translations in GIT at all and simply export them from Zanata only at time of ‘make dist’. The concern with this approach is that the GIT repository no longer contains the full source for the project in a self-contained manner. ‘make dist‘ now needs a live network connection to the Zanata servers. If we were to replace Zanata with a new tool in the future (Zanata is already a replacement for the previously used Transifex), we would potentially lose access to translations for old releases.

        With this in mind we decided to optimize the way translations are managed in GIT.

        The first easy win was to simply remove the master libvirt.pot file from GIT entirely. This file is auto-generated from the source files and is out of date the moment any source file changes, so no one would ever want to use the stored copy.

        The second, more complex step was to minimize and canonicalize the per-language translation files. msgmerge is used to take the full .po file, strip out the source file locations and sort the strings alphabetically. A perl script is then used to further process the content, dropping any translations marked as “fuzzy” and any strings for which there is no translated text available. The resulting output is still using the normal .po file format but we call these ‘.mini.po‘ files to indicate that they are stripped down compared to what you’d normally expect to see.
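
The minimization step can be approximated with stock gettext tools. The sketch below is not libvirt's actual msgmerge-plus-perl pipeline; it swaps in msgattrib, which can drop source locations, fuzzy entries, untranslated entries and obsolete entries in a single pass. The function name minimize_po is made up for illustration.

```shell
# minimize_po FULL.po MINI.po
# Approximates the .mini.po conversion with msgattrib instead of the
# msgmerge + perl combination described in the post (an assumption
# made for illustration, not libvirt's actual build rule).
minimize_po() {
    msgattrib --no-location --sort-output \
              --translated --no-fuzzy --no-obsolete \
              --output-file="$2" "$1"
}
```

Running minimize_po de.po de.mini.po (file names are examples) would leave only entries that carry a usable translation, which is what keeps later diffs small.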

        The final step was to remove the gettext / autotools integration and write a custom Makefile.am to handle the key tasks.

        • A target ‘update-mini-po‘ to automate the process of converting full .po files into .mini.po files. This is used when pulling down new translations from Zanata to be stored in git before release.
        • A target ‘update-po’ to automate the inverse process of converting .mini.po files back into full .po files. This is to be used by anyone who might need to look at full language translations outside of Zanata.
        • An install hook to generate the binary .gmo files from the .mini.po files and install them into /usr/share/locale for use at runtime. This avoids the need to ship the full .po files in release tarballs.
        • A target ‘zanata-push‘ to automate the process of re-generating the libvirt.pot file and uploading it to Zanata.
        • A target ‘zanata-pull‘ to automate the process of pulling new translations down from Zanata and then triggering ‘update-mini-po‘.

        After all this work was completed the key benefits are

        • The size of content stored in GIT was reduced from ~100MB to ~18MB.
        • Updates to the translations in GIT now produce small diffstats with a high signal/noise ratio
        • Files stored in GIT are never changed as a side effect of build system commands like ‘make dist’
        • The autotools integration is easier to understand

        while not having any visible change on the translators using Zanata. In the event anyone does need to see full translation languages outside of Zanata there is an extra step to generate the full .po files from the .mini.po files but this is countered by the fact that the result will be fully up to date with respect to translatable strings and source file locations.

        I’d encourage any project which is using gettext autotools integration, while also outsourcing to a system like Zanata, to consider whether they’d benefit from taking similar steps to libvirt. Not all projects will get the same degree of space saving but diffstats with good signal/noise ratios and removing side effects from ‘make dist’ are wins that are likely desirable for any project.

         

        by Daniel Berrange at November 29, 2018 12:22 PM

        November 28, 2018

        Stefan Hajnoczi

        Software Freedom Conservancy donations are being matched again!

        Donations to Software Freedom Conservancy, the charity that acts as the legal home for QEMU and many other popular open source projects that don't run their own foundations or charities, are being matched again this year. That means your donation is doubled thanks to a group of donors who have pledged to match donations.

        Software Freedom Conservancy helps projects with the details of running an open source project (legal advice, handling expenses, organizing conferences, etc) as well as taking a leading position on open source licensing and enforcement. Their work is not-for-profit and in the interest of the entire open source community.

        If you want more projects like QEMU, Git, Samba, Inkscape, and Selenium to succeed as healthy open source communities, then donating to Software Freedom Conservancy is a good way to help.

        Find out about becoming a Supporter here.

        by stefanha (noreply@blogger.com) at November 28, 2018 10:37 AM

        November 27, 2018

        Stefan Hajnoczi

        QEMU Advent Calendar 2018 is coming!

        QEMU Advent Calendar is running again this year. Each day from December 1st through 24th a surprise QEMU disk image will be released for your entertainment.

        Check out the website on December 1st for the first disk image:

        https://www.qemu-advent-calendar.org/2018/

        Thomas Huth is organizing QEMU Advent Calendar 2018 with the help of others from the QEMU community. If you want to contribute a disk image, take a look at the call for images email.

        by stefanha (noreply@blogger.com) at November 27, 2018 09:15 AM

        November 14, 2018

        Cornelia Huck

        s390x changes in QEMU 3.1

        QEMU is now in the -rc phase for 3.1, with a release expected in early/mid December, and, as usual, this is a good time to summarize the s390x changes for that release.

        CPU models

        • s390x now supports the 'max' cpu model as well (which somehow had been forgotten...) When using KVM, this behaves like the 'host' model; when using TCG, this is the 'qemu' model plus some additional, experimental features. Note that this is neither static nor migration-safe.

        Devices

        • Support for vfio-ap has been added. That allows to pass crypto cards on the AP bus to the guest. Support for this has been merged into the Linux kernel with 4.20. As this is a rather large feature, I plan to do a separate writeup for this.

        KVM

        • Support for enabling huge page backing has been added. This requires a host kernel of version 4.19 or higher. Note that this is only available for the s390-ccw-virtio-3.1 or later machines (due to compat handling), and that it is as of writing this incompatible with nested virtualization (which should change in the future.)
        • Support for the etoken facility (spectre mitigation) has been added. This, as well, needs a host kernel of version 4.19 or higher.

        TCG

        • Support for instruction flags and AFP registers has been added.

        Miscellaneous

        • The deprecated 's390-squash-mcss' option has been removed.
        • And the usual fixes, cleanups and improvements.

        by Cornelia Huck (noreply@blogger.com) at November 14, 2018 06:20 PM

        Thomas Huth

        QEMU Advent Calendar 2018 website online

        This year, we are celebrating the 15th anniversary of QEMU (QEMU 0.1 was announced in March 2003), and to contribute to this celebration, we will have another edition of the QEMU Advent Calendar this year. The new website for the advent calendar is now online at www.qemu-advent-calendar.org – but please do not try to open any of the doors before December 1st. We are also still looking for some images which we can present this year. If you would like to help, please have a look at the “QEMU Advent Calendar 2018 - Help wanted” mail that I have sent to the QEMU mailing lists.

        November 14, 2018 10:15 AM

        November 08, 2018

        Gerd Hoffmann

        Fedora 29 images uploaded

        Fedora 29 was released last week, so here are the fresh Fedora 29 images for qemu.

        As usual the images don't have a root password. You have to set one using virt-customize -a <image> --root-password "password:<secret>", otherwise you can't log in after boot.

        Some images use grub2 as bootloader, some use systemd-boot. The filename indicates which uses which. The x86_64 and i686 images can be booted with both uefi and bios firmware. The arm images come as grub2 variant only. systemd-boot doesn't support 32bit arm and crashes on 64bit arm.

        The images can also be booted as container, using systemd-nspawn --boot --image <file>, but you have to convert them to raw first as systemd-nspawn can't handle qcow2.
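
Put together, converting an image and booting it as a container might look like the sketch below. The helper names raw_name and boot_as_container are hypothetical; the systemd-nspawn invocation uses only the options mentioned above, and the conversion relies on qemu-img's standard convert subcommand.

```shell
# raw_name FILE.qcow2
# Prints the name to use for the converted raw image (hypothetical helper).
raw_name() {
    printf '%s\n' "${1%.qcow2}.raw"
}

# boot_as_container FILE.qcow2
# Converts the qcow2 image to raw, then boots it with systemd-nspawn.
# systemd-nspawn typically has to be run as root.
boot_as_container() {
    src=$1
    dst=$(raw_name "$src")
    qemu-img convert -f qcow2 -O raw "$src" "$dst" &&
        systemd-nspawn --boot --image "$dst"
}
```

The raw copy is kept next to the original, so the qcow2 file stays untouched and can still be used with qemu.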

        The 32bit arm image (armhfp) isn't there because it doesn't boot for me. Seems the fedora grub2-efi.armhfp package has trouble booting the kernel in qemu (with edk2 firmware). To be investigated if I find some time. Note: The fedora 28 image uses a custom grub2-efi.armhfp package as fedora didn't ship grub2-efi.armhfp in version 28.

        The images have been created with imagefish.

        by Gerd Hoffmann at November 08, 2018 11:00 PM

        November 06, 2018

        KVM on Z

        Ubuntu 18.10 released

        Ubuntu Server 18.10 is out! Support for IBM Z is available here.
        It ships

        by Stefan Raspl (noreply@blogger.com) at November 06, 2018 12:21 AM

        November 04, 2018

        Stefan Hajnoczi

        Video and slides available for "Security in QEMU"

        I gave a talk about security in QEMU at KVM Forum 2018. It covers the architecture of QEMU and focusses on the attack surfaces that are exposed to guests. I hope it will be useful to anyone auditing or writing device emulation code. It also describes the key design principles for isolating the QEMU process and limiting the damage that can be done if a guest escapes.

        The video of the talk is now available:

        The slides are available here (PDF).

        by stefanha (noreply@blogger.com) at November 04, 2018 07:12 PM

        October 24, 2018

        Gerd Hoffmann

        VGA emulation in qemu - where do we want to go?

        Let's start with some history ...

        The original VGA

        It was introduced by IBM in 1987. It had a bunch of new features, and also included old ones which were already present in the predecessor devices CGA and EGA, including:

        • text modes (80x25, also 80x50 using a smaller font)
        • 16 color mode (640x480, 4 bit per color, one plane per bit)
        • 256 color mode (320x240, 8 bit per color)
        • various tweaks you can do, like enabling double scan or split screen.

        The VGA has 256k of video memory and it is accessed using a memory window at 0xa0000. It is not possible to access all video memory at the same time; you have to set bank registers to map the piece of memory you want to access into the window.

        All vga devices emulated by qemu support this.

        Super VGA

        In the early 90s various enhanced VGA cards, typically named "Super VGA" (abbreviated SVGA), became available from various vendors. The cirrus vga emulated by qemu is a typical SVGA card which was quite popular back then. They add various new features:

        • more video memory, which in turn allows for:
        • higher resolutions for 256 color modes.
        • more colors (64k, 16 bit per pixel).
        • even more colors (16M, 24 or 32 bit per pixel).
        • linear framebuffer, so you can access all video memory at the same time, without having to bank-switch the video memory into the rather small window at 0xa0000.
        • 2D acceleration (cirrus blitter for example).

        SVGA in qemu

        All SVGA devices in qemu (except cirrus) have support for the bochs display interface. That interface was implemented by the bochs emulator first (this is where the name comes from). It was implemented in qemu too. For the qemu standard vga it is the primary interface. qxl-vga, virtio-vga and vmsvga support the bochs dispi interface when they are in vga compatibility mode, which is typically the case at boot, before the guest os loads the native display driver.

        The bochs display interface is a paravirtual interface, with just the bare essentials to set video modes on a virtual display device. There are no registers for clock rate and other timing stuff for example.

        Traditionally the bochs display interface uses I/O ports 0x1ce (bochs register index) and 0x1cf (bochs register data). As both registers are 16 bit, the data register is unaligned, which does not work on non-x86 archs, so 0x1d0 is supported as data port too.

        Graphics usage by modern guests

        Let's have a look at what modern guests are doing in the graphics field:

        • Everything but 32 bit true color modes is pretty much unused. The only exception is 16 bit true color modes which are still used sometimes in resource-constrained environments (raspberry pi for example).
        • 2D acceleration is dead. It's either software rendering into a dumb framebuffer, or using the 3D engine for 2D rendering.
        • text mode is used only with BIOS firmware, and even then only at boot (bootloader, vgacon until the kms driver loads). UEFI goes straight to graphics mode.
        • Banked video memory access is dead. Text mode still uses the 0xa0000 window, but the text buffer is small enough that there is no bank switching needed.

        So, we have a lot of rather complex code to emulate features not used at all by modern guests. There have been security bugs in the past in that complex but largely unused code ...

        So, can we simplify things?

        Turns out: yes, we can. First step already happened in qemu 1.3. The qemu stdvga got a MMIO bar. The MMIO bar can be used as alternative way to access the vga registers and also the bochs dispi interface registers.

        OVMF (UEFI implementation for qemu) uses the MMIO bar. The bochs-drm.ko linux kms driver uses the MMIO bar too. In fact, both use the bochs display interface registers only, except for setting the unblank bit so the screen will not stay black.

        So, the guest code already ignores the vga emulation. Cool. We can build on that.

        Introducing -device bochs-display

        New display device. Merged in qemu 3.0. Features:

        • No VGA compatibility. PCI class is display/other instead of display/vga.
        • It has a stdvga-style MMIO bar. The vga registers are not available of course. Otherwise the register interface is identical to the stdvga though.
        • Implemented from scratch, no code sharing with vga. Code size is an order of magnitude smaller when compared to vga.
        • No I/O ports needed. You can plug it into a PCIe slot.
        • OVMF supports it.
        • bochs-drm.ko supports it too.

        So, all set for UEFI guests. You can switch from stdvga to bochs-display, and everything continues to work fine.
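
For illustration, switching a guest over comes down to disabling the default VGA card and adding the new device on the qemu command line. The snippet below is a sketch only; the helper name bochs_display_args is hypothetical, and the firmware and disk paths in the commented example are placeholders that have to be adapted.

```shell
# bochs_display_args
# Prints the display-related qemu options for a guest that uses
# bochs-display instead of the default VGA device.
bochs_display_args() {
    # -vga none disables the default VGA card, so bochs-display is
    # the only display device the guest sees.
    printf '%s\n' "-vga none -device bochs-display"
}

# Example invocation (paths are placeholders):
# qemu-system-x86_64 -machine q35 -m 2G \
#     -bios /path/to/OVMF.fd \
#     $(bochs_display_args) \
#     -drive file=/path/to/guest.qcow2,if=virtio
```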

        But what about BIOS and text mode?

        Accessing the vga hardware directly for text mode is rare these days. Typically seabios and linux boot loaders call vgabios functions to render text on the display. So, we can hook in there and support text rendering without the hardware actually having text mode support. A very similar approach is taken by sgabios, to redirect vga text output to the serial line.

        Luckily we are not the first ones facing that problem. coreboot can initialize the graphics hardware and setup a framebuffer with the native display resolution. Having to switch back to text mode when running seabios as coreboot payload is not exactly nice. So, there is a vgabios variant for coreboot which renders text to a framebuffer.

        So, take that, tweak the initialization code to program the bochs dispi interface instead of looking for a framebuffer setup by coreboot, and we are ready to go. Seabios boot messages show up on the bochs-display framebuffer. Yay!

        This will work out-of-the-box in qemu 3.1. The vgabios is already present in qemu 3.0, but due to a bug it is not installed by default; it must be copied over manually to get things going.

        There are some drawbacks, which may or may not be a problem depending on your use case:

        • linux vgacon does not work due to direct vga hardware access. So you have to use vesafb or just live with not having early boot messages. Once bochs-drm.ko loads fbcon will be functional.
        • The vgabios uses a fixed 1024x768 resolution and does not support switching modes after initialization. Reason is that the initialization code runs in big real mode, so accessing the mmio bar is easy then. That is not the case for vgabios function calls though. Resolutions smaller than 1024x768 are allowed by vgabios and will simply use the upper left corner of the display.

        That's it. Enjoy the new legacy-free display device.

        by Gerd Hoffmann at October 24, 2018 10:00 PM

        October 04, 2018

        Cole Robinson

        Setting custom network names on Fedora

        systemd predictable network names give us host interface names like enp3s0. On one of my hosts, I have two interfaces: one that is my regular hard wired connection, and another I only plug in occasionally for some virt network testing. I can never remember the systemd names, so I want to rename the interfaces to something more descriptive for my needs. In my case, lan0main and lan1pcie.

        The page referenced says to use systemd links. However, after struggling with that for a while I believe that's only relevant to systemd-networkd usage and doesn't apply to Fedora's default use of NetworkManager. So I needed another way.

        Long story short I ended up with some custom udev rules that are patterned after the old 70-persistent-net.rules file:

        $ cat /etc/udev/rules.d/99-cole-nic-names.rules 
        SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="70:8b:cd:80:e5:5f", ATTR{type}=="1", NAME="lan0main"
        SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="68:05:ca:1a:f5:da", ATTR{type}=="1", NAME="lan1pcie"
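
Rules of this shape are easy to generate for further interfaces. The helper below is a minimal sketch following the pattern shown above; the function name nic_rule is hypothetical, and the MAC address is lowercased on the assumption that it is matched against the lowercase form sysfs reports.

```shell
# nic_rule MAC NAME
# Prints one udev rule that renames the Ethernet interface with the
# given MAC address to NAME, following the pattern shown above.
nic_rule() {
    mac=$(printf '%s' "$1" | tr 'A-F' 'a-f')   # sysfs reports lowercase MACs
    printf 'SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="%s", ATTR{type}=="1", NAME="%s"\n' "$mac" "$2"
}
```

The output of, for example, nic_rule 70:8b:cd:80:e5:5f lan0main can be appended to a rules file under /etc/udev/rules.d/.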

        by Cole Robinson (noreply@blogger.com) at October 04, 2018 09:27 PM

        October 03, 2018

        KVM on Z

        RHEL 7.5 Beta supports KVM on Z

        The Red Hat Enterprise Linux 7.5 Beta ships with support for KVM on Z through the kernel-alt packages. This will essentially ship Linux kernel 4.14.
        Here is the respective section from the release notes:
        KVM virtualization is now supported on IBM z Systems. However, this feature is only available in the newly introduced user space based on kernel version 4.14, provided by the kernel-alt packages.
        See here for further details.

        by Stefan Raspl (noreply@blogger.com) at October 03, 2018 09:14 AM

        October 01, 2018

        KVM on Z

        Knowledge Series: Black Box Guest Analysis Using kvm_stat


        Another new entry in our Knowledge Series details how to gain insights into black box KVM guests using kvm_stat.

        by Stefan Raspl (noreply@blogger.com) at October 01, 2018 12:12 PM

        September 24, 2018

        KVM on Z

        Knowledge Series: How to use vnc for Guest Installs

        A new entry in our Knowledge Series details how to utilize vnc for graphical installs, exemplified using RHEL 7.5.

        by Stefan Raspl (noreply@blogger.com) at September 24, 2018 08:05 PM

        September 13, 2018

        Richard Jones

        Creating Windows templates for virt-builder

        virt-builder is a tool for rapidly creating customized Linux images. Recently I’ve added support for Windows although for rather obvious licensing reasons we cannot distribute the Windows templates which would be needed to provide Windows support for everyone. However you can build your own Windows templates as described here and then:

        $ virt-builder -l | grep windows
        windows-10.0-server      x86_64     Windows Server 2016 (x86_64)
        windows-6.2-server       x86_64     Windows Server 2012 (x86_64)
        windows-6.3-server       x86_64     Windows Server 2012 R2 (x86_64)
        $ virt-builder windows-6.3-server
        [   0.6] Downloading: http://xx/builder/windows-6.3-server.xz
        [   5.1] Planning how to build this image
        [   5.1] Uncompressing
        [  60.1] Opening the new disk
        [  77.6] Setting a random seed
        virt-builder: warning: random seed could not be set for this type of guest
        virt-builder: warning: passwords could not be set for this type of guest
        [  77.6] Finishing off
                           Output file: windows-6.3-server.img
                           Output size: 10.0G
                         Output format: raw
                    Total usable space: 9.7G
                            Free space: 3.5G (36%)
        

        To build a Windows template repository you will need the latest libguestfs sources checked out from https://github.com/libguestfs/libguestfs and you will also need a suitable Windows Volume License, KMS or MSDN developer subscription. Also the final Windows templates are at least ten times larger than Linux templates, so virt-builder operations take correspondingly longer and use lots more disk space.

        First download install ISOs for the Windows guests you want to use.

        After cloning the latest libguestfs sources, go into the builder/templates subdirectory. Edit the top of the make-template.ml script to set the path which contains the Windows ISOs. You will also possibly need to edit the names of the ISOs later in the script.

        Build a template, e.g.:

        $ ../../run ./make-template.ml windows 2k12 x86_64
        

        You’ll need to read the script to understand what the arguments do. The script will ask you for the product key, where you should enter the volume license key or your MSDN key.

        Each time you run the script successfully you’ll end up with two files called something like:

        windows-6.2-server.xz
        windows-6.2-server.index-fragment
        

        The version numbers are Windows internal version numbers.

        After you’ve created templates for all the Windows guest types you need, copy them to any (private) web server, and concatenate all the index fragments into the final index file:

        $ cat *.index-fragment > index
        

        Finally create a virt-builder repo file pointing to this index file:

        # cat /etc/virt-builder/repos.d/windows.conf
        [windows]
        uri=http://xx/builder/index
        

        You can now create Windows guests in virt-builder. However note they are not sysprepped. We can’t do this because it requires some Windows tooling. So while these guests are good for small tests and similar, they’re not suitable for creating actual Windows long-lived VMs. To do that you will need to add a sysprep.exe step somewhere in the template creation process.
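
The publishing steps described earlier (copy the templates to a web server directory, concatenate the index fragments, write a repo file) could be wrapped up roughly as follows. This is a sketch under assumptions: `publish_templates`, its argument order and the example paths are made up for illustration, and the destination directory is assumed to be served by your private web server under the given URL.

```shell
#!/bin/sh
# publish_templates SRC DEST URL NAME
# Copy the compressed templates from SRC to DEST, build DEST/index from all
# index fragments, and print a repo stanza on stdout.
# The function name and its arguments are illustrative only.
publish_templates() {
    src=$1 dest=$2 url=$3 name=$4
    mkdir -p "$dest" || return 1
    cp "$src"/*.xz "$dest"/ || return 1
    # Same step as "cat *.index-fragment > index" above.
    cat "$src"/*.index-fragment > "$dest/index" || return 1
    # Same layout as the windows.conf file shown above.
    printf '[%s]\nuri=%s/index\n' "$name" "$url"
}

# Example call (paths are placeholders; adjust to your web server layout):
#   publish_templates . /var/www/html/builder http://xx/builder windows \
#       > /etc/virt-builder/repos.d/windows.conf
```

Nothing here performs a sysprep step, so the caveat above still applies to guests built from the published templates.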

        by rich at September 13, 2018 09:07 AM

        September 11, 2018

        KVM on Z

        2018 Linux on IBM Z and LinuxONE Workshop, Poughkeepsie, NY

        Meet us at this event, taking place November 5-6, 2018, at IBM Poughkeepsie, NY. See the full announcement here.
        Naturally, KVM on IBM Z will be covered by both presentations and workgroup sessions.

        Find the agenda here.

        Registration is open here till October 25.

        by Stefan Raspl (noreply@blogger.com) at September 11, 2018 08:25 AM

        September 10, 2018

        KVM on Z

        libvirt v4.7.0 released

        libvirt v4.7, available for download at the libvirt project website, adds support for vsock for CCW.
        For a full usage example and related information, see this article in our Knowledge series.

        by Stefan Raspl (noreply@blogger.com) at September 10, 2018 02:17 PM

        Thomas Huth

        QEMU's instance_init() vs. realize()

        Note that this is a blog post for (new) QEMU developers. If you are just interested in using QEMU, you can certainly skip this text. Otherwise, in case you have ever been in touch with the QEMU device model (“qdev”), you are likely aware of the basic qdev code boilerplate already:

        static void mydev_realize(DeviceState *dev, Error **errp)
        {
            /* callback function that is run during device "realization" */
        }
        
        static void mydev_instance_init(Object *obj)
        {
            /* callback function that is run during device instance init */
        }
        
        static Property mydev_properties[] = {
            DEFINE_PROP_xxx("myprop", MyDevState, field, ...),
            /* ... */
            DEFINE_PROP_END_OF_LIST(),
        };
        
        static void mydev_class_init(ObjectClass *oc, void *data)
        {
            DeviceClass *dc = DEVICE_CLASS(oc);
        
            dc->realize = mydev_realize;
            dc->desc = "My cool device";
            dc->props = mydev_properties;
            /* ... and other device class setup code ... */
        }
        
        static const TypeInfo mydev_info = {
            .name          = TYPE_MYDEV,
            .parent        = TYPE_SYS_BUS_DEVICE,  /* or something else */
            .instance_size = sizeof(MyDevState),
            .instance_init = mydev_instance_init,
            .class_init    = mydev_class_init,
        };
        
        static void mydev_register_types(void)
        {
            type_register_static(&mydev_info);
        }
        
        type_init(mydev_register_types)

        There are three different initialization functions involved here, the class_init, the instance_init and the realize function. While it is quite obvious to distinguish the class_init function from the two others (it is used for initializing the class data, not the data that is used for an instance … this is similar to the object model with classes and instances in C++), I initially always wondered about the difference between the instance_init() and the realize() functions. Having fixed quite a lot of related bugs in the past months in the QEMU code base, I now know that a lot of other people are also not properly aware of the difference here, so I think it is now time to write down some information that I’m now aware of, to make sure that I don’t forget about this again, and maybe help others to avoid related bugs in the future ;-)

        First it is of course always a good idea to have a look at the documentation. While the documentation of TypeInfo (where instance_init() is defined) is not very helpful to understand the differences, the documentation of DeviceClass (where realize() is defined) has some more useful information: You can learn here that the object instantiation is done first, before the device is realized, i.e. the instance_init() function is called first, and the realize() function is called afterwards. The former must not fail, while the latter can return an error to its caller via a pointer to an “Error” object pointer.

        So the basic idea here is that device objects are first instantiated, then these objects can be inspected for their interfaces and their creators can set up their properties to configure their settings and wire them up with other devices, before the device finally becomes “active” by being realized. It is important here to notice that devices can be instantiated (and also finalized) without being realized! This happens for example if the device is introspected: If you enter for example device_add xyz,help at the HMP monitor, or if you send the device-list-properties QOM command to QEMU to retrieve the device’s properties, QEMU creates a temporary instance of the device to query the properties of the object, without realizing it. The object gets destroyed (“finalized”) immediately afterwards.

        Knowing this, you can avoid a set of bugs which could be found with a couple of devices in the past:

        • If you want your device to provide properties for other parts of the QEMU code or for the users, and you want to add those properties via one of the many object_property_add*() functions of QEMU (instead of using the “props” field of the DeviceClass), then you should do this in the instance_init() and not in the realize() function. Otherwise the properties won’t show up when the user runs --device xyz,help or the device-list-properties QOM command to get some information about your device.

        • instance_init() functions must really never fail, i.e. also not call abort() or exit(). Otherwise QEMU can terminate unexpectedly when a user simply wanted to have a look at the list of device properties with device_add xyz,help or the device-list-properties QOM command. If your device cannot work in certain circumstances, check for the error condition in the realize() function instead and return with an appropriate error there.

        • Never assume that your device is always instantiated only with the machine that it was designed for. It’s of course a good idea to set the “user_creatable = false” flag in the DeviceClass of your device if your device cannot be plugged in arbitrary machines. But device introspection can still happen at any time, with any machine. So if you wrote a device called “mydev-a” that only works with --machine A, the user still can start QEMU with the option --machine B instead and then run device_add mydev-a,help or the device-list-properties QOM command. The instance_init() function of your device will be called to create a temporary instance of your device, even though the base machine is B and not A here. So you especially should take care to not depend on the availability of certain buses or other devices in the instance_init() function, nor use things like serial_hd() or nd_table[] in your instance_init() function, since these might (and should) have been used by the machine init function already. If your device needs to be wired up, provide properties as interfaces to the outside and let the creator of your device (e.g. the machine init code) wire your device between the device instantiation and the realize phase instead.

        • Make sure that your device leaves a clean state after a temporary instance is destroyed again, i.e. don’t assume that there will be only one instance of your device which is created at the beginning right after QEMU has been started and is destroyed at the very end before QEMU terminates. Thus do not assume that the things that you do in your instance_init() don’t need explicit clean-up since the device instance will only be destroyed when QEMU terminates. Device instances can be created and destroyed at any time, so when the device is finalized, you must not leave any dangling pointers or references to your device behind you, e.g. in the QOM tree. When you create other objects in your instance_init() function, make sure to set proper parents of these objects or use an instance_finalize() function, so that the created objects get cleaned up correctly again when your device is destroyed.

        All in all, if you write code for a new QEMU device, it is likely a good idea to use the instance_init() function only for e.g. creating properties and other things that are required before device realization, and then do the main work in the realize() function instead.
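
Since introspection creates and destroys a temporary instance, a quick way to spot an instance_init() that aborts or exits is to run the help query for a device on a machine it was not designed for and look at the exit status. The following is a minimal sketch, not an existing QEMU tool: `check_introspection` is a hypothetical helper, and the binary, machine and device names in the example call are placeholders to replace with your own.

```shell
#!/bin/sh
# check_introspection QEMU MACHINE DEVICE...
# Run "-device NAME,help" for each device on the given machine and report
# every device for which QEMU exits with a non-zero status (for example
# because instance_init() called abort() or exit()).
# Note: an unknown device name is reported as FAIL as well.
check_introspection() {
    qemu=$1 machine=$2
    shift 2
    failed=0
    for dev in "$@"; do
        if "$qemu" -machine "$machine" -device "$dev,help" >/dev/null 2>&1; then
            echo "ok:   $dev"
        else
            echo "FAIL: $dev"
            failed=1
        fi
    done
    return "$failed"
}

# Example call (placeholders; use your own binary, machine and devices):
#   check_introspection qemu-system-x86_64 none e1000 virtio-net-pci
```

This only catches crashes and early exits. Dangling references left behind after finalization will not necessarily show up in the exit status.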

        September 10, 2018 01:05 PM

        September 06, 2018

        KVM on Z

        QEMU v2.11 released

        QEMU v2.11 is out. Here are the highlights from a KVM on Z perspective:
        • TOD-Clock Epoch Extension Support: Extends the TOD clock beyond the year 2042.
        • Setting sysctl vm.allocate_pgste is now superfluous.
        • Netboot: The network boot firmware sets the client architecture option (93) in the DHCP request to 0x1f ("s390 Basic"). This allows a DHCP server to deliver the correct boot image for IBM Z guests. This is useful in situations where a single DHCP server has to provide network boot images for multiple architectures, e.g. for the purpose of installing operating systems.
        • Added support for virtio-input-ccw and virtio-gpu-ccw. These newly supported devices lay the foundation for applications that require graphical interfaces, which thereby become usable from remote via VNC or SPICE.
          Here is a sample XML snippet for a guest definition:

              <input type='keyboard' bus='virtio'/>
              <graphics type='vnc' port='-1' autoport='yes' listen='0.0.0.0'>
                <listen type='address' address='0.0.0.0'/>
              </graphics>
              <video>
                <model type='virtio' heads='1' primary='yes'/>
              </video>

        by Stefan Raspl (noreply@blogger.com) at September 06, 2018 03:46 PM

        August 26, 2018

        KVM on Z

        QEMU v3.0 released


        QEMU v3.0 is out. Besides a number of small enhancements, some items that we would like to highlight from a KVM on Z perspective:

        • A new CPU model representing IBM z14 Model ZR1 was added:
          14ZR1
          (long name: IBM z14 Model ZR1 GA1).
        • Re-use your existing infrastructure for LPAR installs by utilizing the newly added support for .INS files in network boot.

        by Stefan Raspl (noreply@blogger.com) at August 26, 2018 07:47 AM

        August 21, 2018

        Gerd Hoffmann

        USB recommendations for qemu

        A collection of tips on using usb with qemu.

        Picking a host adapter

        The short answer for this one is: Unless you are running an operating system museum just use -device qemu-xhci.

        Any recent operating system should support xhci out of the box. The only OS without xhci support which is still in widespread use is Windows 7.

        In case your qemu version doesn't support qemu-xhci you can use nec-usb-xhci instead.

        The -usb command line switch adds usb controllers matching the emulated hardware platform. So for the 'pc' machine type (emulating a 20+ year old i440FX chipset) this is a uhci host adapter (supporting usb1). For the 'q35' machine type (emulating an almost 10 year old Q35 chipset) it is ehci (for usb2 devices) with uhci companions (for usb1 devices). This is what you can use when running old guests which lack xhci support.

        When using xhci you had better not use -usb, because you would then get two usb busses. That is a valid configuration, but it requires naming the usb host adapter and specifying the usb bus when adding usb devices if you want to avoid qemu picking a random usb bus:

        -device qemu-xhci,id=xhci -device usb-tablet,bus=xhci.0

        With a single usb bus you can just say -device usb-tablet and be done with it.

        Not enough usb ports?

        Qemu can emulate a usb hub (-device usb-hub). But the hub supports usb1 only, so you should avoid using it. A better solution is to just increase the number of root ports. xhci has four root ports by default, but it supports up to 15 ports. And in case this still isn't enough, a second xhci adapter can be added to the virtual machine.

        To create a host adapter with 8 ports use -device qemu-xhci,p2=8,p3=8. The libvirt configuration is:

        <controller type='usb' model='qemu-xhci' ports='8'/>

        In case you wonder why qemu-xhci needs both p2 and p3 parameters: p2 specifies the number of usb2 ports (which support usb1 too), and p3 specifies the number of usb3 ports. It is possible to assign different counts here. When using -device qemu-xhci,p2=8,p3=4 you'll get an xhci adapter where ports 1-4 support both usb2 and usb3 and ports 5-8 are usb2-only. This can be used to force a usb3-capable usb device into usb2 mode by plugging it into a usb2-only xhci port. There should rarely be a need to actually do that in practice though.
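
The controller argument can be assembled by a small helper when it has to be generated from a script. This is a sketch only: `xhci_args` is a hypothetical function, and the range check of 1 to 15 per parameter is an assumption based on the "up to 15 ports" statement above, not something checked against the qemu sources.

```shell
#!/bin/sh
# xhci_args P2 P3 [ID]
# Print the -device argument for a qemu-xhci controller with P2 usb2 ports
# and P3 usb3 ports. ID defaults to "xhci", as in the example above.
# xhci_args is a hypothetical helper; the 1-15 range is an assumption.
xhci_args() {
    p2=$1 p3=$2 id=${3:-xhci}
    for n in "$p2" "$p3"; do
        case $n in
            ''|*[!0-9]*)
                echo "port count must be a number: $n" >&2
                return 1 ;;
        esac
        if [ "$n" -lt 1 ] || [ "$n" -gt 15 ]; then
            echo "port count out of range (1-15): $n" >&2
            return 1
        fi
    done
    printf '%s\n' "-device qemu-xhci,id=$id,p2=$p2,p3=$p3"
}

# Eight usb2 ports, four of which are also usb3 capable:
xhci_args 8 4
```

With the controller named this way, devices can be attached explicitly with bus=xhci.0 (or the chosen id followed by .0), as in the usb-tablet example above.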

        by Gerd Hoffmann at August 21, 2018 10:00 PM

        August 17, 2018

        Daniel Berrange

        ANNOUNCE: gtk-vnc 0.9.0 release

        I’m pleased to announce a new release of GTK-VNC, version 0.9.0. This is a cleanup/modernization release. Note that the next release (1.0.0) will drop support for GTK-2.

        • Requires gnutls >= 3.1.18
        • Requires libgcrypt >= 1.5.0
        • Requires glib2 >= 2.42.0
        • Use libgcrypt for DES routines
        • Add missing cipher close calls in ARD auth
        • Check for errors after reading mslogon params
        • Support newer UltraVNC mslogon auth type code
        • Avoid divide by zero in mslogin auth from bogus params
        • Re-allow python2 accidentally blocked when removing python binding

        Thanks to all those who reported bugs and provided patches that went into this new release.

        by Daniel Berrange at August 17, 2018 04:01 PM

        August 15, 2018

        QEMU project

        QEMU version 3.0.0 released

        We’d like to announce the availability of the QEMU 3.0.0 release. This release contains 2300+ commits from 169 authors.

        A note from the maintainer: Why 3.0? Well, we felt that our version numbers were getting a bit unwieldy, and since this year is QEMU’s 15th birthday it seemed like a good excuse to roll over the major digit. Going forward we plan to increment the major version once a year, for the first release of the year. Don’t read too much into it: it doesn’t imply a drastic compatibility break. Rumours of our triskaidekaphobia have been greatly exaggerated ;-)

        You can grab the tarball from our download page. The full list of changes is available in the Wiki.

        Highlights include:

        • Support for additional x86/AMD mitigations against Speculative Store Bypass (Spectre Variant 4, CVE-2018-3639)
        • Improved support for nested KVM guests running on Hyper-V
        • Block device support for active disk-mirroring, which avoids convergence issues which may arise when doing passive/background mirroring of busy devices
        • Improved support for AHCI emulation, SCSI emulation, and persistent reservations / cluster management
        • OpenGL ES support for SDL front-end, additional framebuffer device options for early boot display without using legacy VGA emulation
        • Live migration support for TPM TIS devices, capping bandwidth usage during post-copy migration, and recovering from a failed post-copy migration
        • Improved latency when using user-mode networking / SLIRP
        • ARM: support for SMMUv3 IOMMU when using ‘virt’ machine type
        • ARM: v8M extensions for VLLDM and VLSTM floating-point instructions, and improved support for AArch64 v8.2 FP16 extensions
        • ARM: support for Scalable Vector Extensions in linux-user mode
        • Microblaze: support for 64-bit address sizes and translation bug fixes
        • PowerPC: PMU support for mac99 machine type and improvements for Uninorth PCI host bridge emulation for Mac machine types
        • PowerPC: preliminary support for emulating POWER9 hash MMU mode when using powernv machine type
        • RISC-V: improvement for privileged ISA emulation
        • s390: support for z14 ZR1 CPU model
        • s390: bpb/ppa15 Spectre mitigations enabled by default for z196 and later CPU models
        • s390: support for configuring consoles via -serial options
        • and lots more…

        Thank you to everyone involved!

        August 15, 2018 11:25 AM

        August 01, 2018

        Daniel Berrange

        ANNOUNCE: gtk-vnc 0.8.0 release

        I’m pleased to announce a new release of GTK-VNC, version 0.8.0. This is a small maintenance release tidying up some loose ends.

        • Deleted the python2 binding in favour of GObject introspection
        • Pull in latest keycodemapdb content
        • Disable/fix -Wcast-function-type warnings

        Thanks to all those who reported bugs and provided patches that went into this new release.

        by Daniel Berrange at August 01, 2018 04:45 PM

        Cornelia Huck

        s390x changes in QEMU 3.0

        QEMU 3.0 is currently in the late -rc phase (with the final release expected early/mid August), so here's a quick summary of what has been changed for s390x.

        CPU models

        • A CPU model for the z14 Model ZR1 has been added. This is the "small", single-frame z14.
        • The feature bits for Spectre mitigation (bpb and ppa15) are now included in the default CPU model for z196 and up. This means that these features will be available to the guest (given the host supports them) without needing to specify them explicitly.

        Devices

        • You can now configure consoles via -serial as well.
        • vfio-ccw devices have gained a "force-orb-pfch" property. This is not very useful for Linux guests, but if you are trying to use vfio-ccw with a guest that does not specify "unlimited prefetch" for its requests but does not actually rely on the semantics, this will help you. Adding support to vfio-ccw to accommodate channel programs that must not be prefetched is unfortunately not straightforward and will not happen in the foreseeable future.

        Booting and s390 bios

        • The s390-netboot image has been enhanced: It now supports indirect loading via .INS files and pxelinux.cfg-style booting.
        • The boot menu can now also deal with non-sequential entries.

        Miscellaneous

        • Handling of the TOD clock in tcg has been improved; CPU hotplug under tcg is now working.
        • And the usual fixes, cleanups and improvements.

        by Cornelia Huck (noreply@blogger.com) at August 01, 2018 11:36 AM

        Last updated: December 18, 2018 11:16 PM