drvnodedev.html.in 10 KB
Newer Older
1
<?xml version="1.0" encoding="UTF-8"?>
2
<!DOCTYPE html>
3 4 5 6 7 8 9 10 11
<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <h1>Host device management</h1>

    <p>
      Libvirt provides management of both physical and virtual host devices
      (historically also referred to as node devices) like USB, PCI, SCSI, and
      network devices. This also includes various virtualization capabilities
      which the aforementioned devices provide for utilization, for example
12
      SR-IOV, NPIV, MDEV, DRM, etc.
13 14 15 16 17 18 19 20 21 22
    </p>

    <p>
      The node device driver provides means to list and show details about host
      devices (<code>virsh nodedev-list</code>,
      <code>virsh nodedev-dumpxml</code>), which are generic and can be used
      with all devices. It also provides means to create and destroy devices
      (<code>virsh nodedev-create</code>, <code>virsh nodedev-destroy</code>)
      which are meant to be used to create virtual devices, currently only
      supported by NPIV
23
      (<a href="https://wiki.libvirt.org/page/NPIV_in_libvirt">more info about NPIV)</a>).
24 25
      Devices on the host system are arranged in a tree-like hierarchy, with
      the root node being called <code>computer</code>. The node device driver
Pavel Hrdina's avatar
Pavel Hrdina committed
26
      supports udev backend (HAL backend was removed in <code>6.8.0</code>).
27 28 29
    </p>

    <p>
30 31 32 33 34
      Details of the XML format of a host device can be found <a
      href="formatnode.html">here</a>. Of particular interest is the
      <code>capability</code> element, which describes features supported by
      the device. Some specific device types are addressed in more detail
      below.
35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51
    </p>
    <h2>Basic structure of a node device</h2>
    <pre>
&lt;device&gt;
  &lt;name&gt;pci_0000_00_17_0&lt;/name&gt;
  &lt;path&gt;/sys/devices/pci0000:00/0000:00:17.0&lt;/path&gt;
  &lt;parent&gt;computer&lt;/parent&gt;
  &lt;driver&gt;
    &lt;name&gt;ahci&lt;/name&gt;
  &lt;/driver&gt;
  &lt;capability type='pci'&gt;
...
  &lt;/capability&gt;
&lt;/device&gt;</pre>

    <ul id="toc"/>

52
    <h2><a id="PCI">PCI host devices</a></h2>
53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92
    <dl>
      <dt><code>capability</code></dt>
      <dd>
        When used as top level element, the supported values for the
        <code>type</code> attribute are <code>pci</code> and
        <code>phys_function</code> (see <a href="#SRIOVCap">SR-IOV below</a>).
      </dd>
    </dl>
    <pre>
&lt;device&gt;
  &lt;name&gt;pci_0000_04_00_1&lt;/name&gt;
  &lt;path&gt;/sys/devices/pci0000:00/0000:00:06.0/0000:04:00.1&lt;/path&gt;
  &lt;parent&gt;pci_0000_00_06_0&lt;/parent&gt;
  &lt;driver&gt;
    &lt;name&gt;igb&lt;/name&gt;
  &lt;/driver&gt;
  &lt;capability type='pci'&gt;
    &lt;domain&gt;0&lt;/domain&gt;
    &lt;bus&gt;4&lt;/bus&gt;
    &lt;slot&gt;0&lt;/slot&gt;
    &lt;function&gt;1&lt;/function&gt;
    &lt;product id='0x10c9'&gt;82576 Gigabit Network Connection&lt;/product&gt;
    &lt;vendor id='0x8086'&gt;Intel Corporation&lt;/vendor&gt;
    &lt;iommuGroup number='15'&gt;
      &lt;address domain='0x0000' bus='0x04' slot='0x00' function='0x1'/&gt;
    &lt;/iommuGroup&gt;
    &lt;numa node='0'/&gt;
    &lt;pci-express&gt;
      &lt;link validity='cap' port='1' speed='2.5' width='2'/&gt;
      &lt;link validity='sta' speed='2.5' width='2'/&gt;
    &lt;/pci-express&gt;
  &lt;/capability&gt;
&lt;/device&gt;</pre>

    <p>
      The XML format for a PCI device stays the same for any further
      capabilities it supports, a single nested <code>&lt;capability&gt;</code>
      element will be included for each capability the device supports.
    </p>

93
    <h3><a id="SRIOVCap">SR-IOV capability</a></h3>
94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137
    <p>
      Single root input/output virtualization (SR-IOV) allows sharing of the
      PCIe resources by multiple virtual environments. That is achieved by
      slicing up a single full-featured physical resource called physical
      function (PF) into multiple devices called virtual functions (VFs) sharing
      their configuration with the underlying PF. Despite the SR-IOV
      specification, the amount of VFs that can be created on a PF varies among
      manufacturers.
    </p>

    <p>
      Suppose the NIC <a href="#PCI">above</a> was also SR-IOV capable, it would
      also include a nested
      <code>&lt;capability&gt;</code> element enumerating all virtual
      functions available on the physical device (physical port) like in the
      example below.
    </p>

    <pre>
&lt;capability type='pci'&gt;
...
  &lt;capability type='virt_functions' maxCount='7'&gt;
    &lt;address domain='0x0000' bus='0x04' slot='0x10' function='0x1'/&gt;
    &lt;address domain='0x0000' bus='0x04' slot='0x10' function='0x3'/&gt;
    &lt;address domain='0x0000' bus='0x04' slot='0x10' function='0x5'/&gt;
    &lt;address domain='0x0000' bus='0x04' slot='0x10' function='0x7'/&gt;
    &lt;address domain='0x0000' bus='0x04' slot='0x11' function='0x1'/&gt;
    &lt;address domain='0x0000' bus='0x04' slot='0x11' function='0x3'/&gt;
    &lt;address domain='0x0000' bus='0x04' slot='0x11' function='0x5'/&gt;
  &lt;/capability&gt;
...
&lt;/capability&gt;</pre>
    <p>
      A SR-IOV child device on the other hand, would then report its top level
      capability type as a <code>phys_function</code> instead:
    </p>

    <pre>
&lt;device&gt;
...
  &lt;capability type='phys_function'&gt;
    &lt;address domain='0x0000' bus='0x04' slot='0x00' function='0x0'/&gt;
  &lt;/capability&gt;
...
138
&lt;/device&gt;</pre>
139

140
    <h3><a id="MDEVCap">MDEV capability</a></h3>
141 142 143 144
    <p>
      A PCI device capable of creating mediated devices will include a nested
      capability <code>mdev_types</code> which enumerates all supported mdev
      types on the physical device, along with the type attributes available
145 146 147
      through sysfs. A detailed description of the XML format for the
      <code>mdev_types</code> capability can be found
      <a href="formatnode.html#MDEVCap">here</a>.
148 149
    </p>
    <p>
150 151 152
      The following example shows how we might represent an NVIDIA GPU device
      that supports mediated devices. See below for <a href="#MDEV">more
      information about mediated devices</a>.
153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174
    </p>

<pre>
&lt;device&gt;
...
  &lt;driver&gt;
    &lt;name&gt;nvidia&lt;/name&gt;
  &lt;/driver&gt;
  &lt;capability type='pci'&gt;
...
    &lt;capability type='mdev_types'&gt;
      &lt;type id='nvidia-11'&gt;
        &lt;name&gt;GRID M60-0B&lt;/name&gt;
        &lt;deviceAPI&gt;vfio-pci&lt;/deviceAPI&gt;
        &lt;availableInstances&gt;16&lt;/availableInstances&gt;
      &lt;/type&gt;
      &lt;!-- Here would come the rest of the available mdev types --&gt;
    &lt;/capability&gt;
...
  &lt;/capability&gt;
&lt;/device&gt;</pre>

175
    <h2><a id="MDEV">Mediated devices (MDEVs)</a></h2>
176 177 178 179 180 181 182 183 184
    <p>
      Mediated devices (<span class="since">Since 3.2.0</span>) are software
      devices defining resource allocation on the backing physical device which
      in turn allows the parent physical device's resources to be divided into
      several mediated devices, thus sharing the physical device's performance
      among multiple guests. Unlike SR-IOV however, where a PCIe device appears
      as multiple separate PCIe devices on the host's PCI bus, mediated devices
      only appear on the mdev virtual bus. Therefore, no detach/reattach
      procedure from/to the host driver procedure is involved even though
185 186 187
      mediated devices are used in a direct device assignment manner.  A
      detailed description of the XML format for the <code>mdev</code>
      capability can be found <a href="formatnode.html#mdev">here</a>.
188 189 190 191 192 193 194 195 196 197 198 199 200 201
    </p>

    <h3>Example of a mediated device</h3>
    <pre>
&lt;device&gt;
  &lt;name&gt;mdev_4b20d080_1b54_4048_85b3_a6a62d165c01&lt;/name&gt;
  &lt;path&gt;/sys/devices/pci0000:00/0000:00:02.0/4b20d080-1b54-4048-85b3-a6a62d165c01&lt;/path&gt;
  &lt;parent&gt;pci_0000_06_00_0&lt;/parent&gt;
  &lt;driver&gt;
    &lt;name&gt;vfio_mdev&lt;/name&gt;
  &lt;/driver&gt;
  &lt;capability type='mdev'&gt;
    &lt;type id='nvidia-11'/&gt;
    &lt;iommuGroup number='12'/&gt;
202 203
  &lt;/capability&gt;
&lt;/device&gt;</pre>
204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235

    <p>
      The support of mediated device's framework in libvirt's node device driver
      covers the following features:
    </p>

    <ul>
      <li>
        list available mediated devices on the host
        (<span class="since">Since 3.4.0</span>)
      </li>
      <li>
        display device details
        (<span class="since">Since 3.4.0</span>)
      </li>
    </ul>

    <p>
      Because mediated devices are instantiated from vendor specific templates,
      simply called 'types', information describing these types is contained
      within the parent device's capabilities
      (see the example in <a href="#PCI">PCI host devices</a>).
    </p>

    <p>
      To see the supported mediated device types on a specific physical device
      use the following:
    </p>

    <pre>
$ ls /sys/class/mdev_bus/&lt;device&gt;/mdev_supported_types</pre>

236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253
    <p>
      Before creating a mediated device, unbind the device from the respective
      device driver, eg. subchannel I/O driver for a CCW device. Then bind the
      device to the respective VFIO driver. For a CCW device, also unbind the
      corresponding subchannel of the CCW device from the subchannel I/O driver
      and then bind the subchannel (instead of the CCW device) to the vfio_ccw
      driver. The below example shows the unbinding and binding steps for a CCW
      device.
    </p>

    <pre>
device="0.0.1234"
subchannel="0.0.0123"
echo $device &gt; /sys/bus/ccw/devices/$device/driver/unbind
echo $subchannel &gt; /sys/bus/css/devices/$subchannel/driver/unbind
echo $subchannel &gt; /sys/bus/css/drivers/vfio_ccw/bind
    </pre>

254 255
    <p>
      To manually instantiate a mediated device, use one of the following as a
256 257
      reference. For a CCW device, use the subchannel ID instead of the device
      ID.
258 259 260 261 262 263 264 265 266 267 268 269 270 271
    </p>

    <pre>
$ uuidgen &gt; /sys/class/mdev_bus/&lt;device&gt;/mdev_supported_types/&lt;type&gt;/create
...
$ echo &lt;UUID&gt; &gt; /sys/class/mdev_bus/&lt;device&gt;/mdev_supported_types/&lt;type&gt;/create</pre>

    <p>
      Manual removal of a mediated device is then performed as follows:
    </p>

    <pre>
$ echo 1 &gt; /sys/bus/mdev/devices/&lt;uuid&gt;/remove</pre>

272 273
  </body>
</html>