luksdevice:unlock: disable read/write workqueues when activating device

clayton craft requested to merge dmcrypt_perf into master

Some simple benchmarking with fio shows this increases write performance by ~35% and read performance by ~33% at a 4K block size (the default for ext4).

These options would probably not help spinning rust drives, but I don't think there are any of those around that would use osk-sdl...

These options are only available on Linux kernel 5.9 and later. My understanding is that they should be a no-op if set on earlier kernels, but I'd like some help confirming that.
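
Until the behaviour on older kernels is confirmed, one rough way to guard the flags is to compare the running kernel release against 5.9. The sketch below is only an illustration in Python, not what osk-sdl does. `parse_kernel_release` and `supports_no_workqueue_flags` are hypothetical helper names, and treating the version number as a feature test is an assumption that may not hold on distribution kernels with backported patches.

```python
import platform
import re
from typing import Tuple

# First kernel release that ships the dm-crypt no_read_workqueue and
# no_write_workqueue options, per the description above.
MIN_KERNEL = (5, 9)


def parse_kernel_release(release: str) -> Tuple[int, int]:
    """Extract (major, minor) from a release string such as '5.11.0-1-librem5'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def supports_no_workqueue_flags(release: str) -> bool:
    """Hypothetical check: True if the release is 5.9 or newer."""
    return parse_kernel_release(release) >= MIN_KERNEL


if __name__ == "__main__":
    # platform.release() returns the same string as `uname -r` on Linux.
    running = platform.release()
    print(running, supports_no_workqueue_flags(running))
```

A version comparison like this only says whether the kernel is new enough. It does not answer the open question of what an older kernel does when it is handed the flags anyway.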

Benchmark data before/after:

librem5:~$ sudo dmsetup table
root: 0 60647424 crypt aes-xts-plain64 :64:logon:cryptsetup:e5f04dde-e097-4e12-935e-7029abfb8605-d0 0 179:2 32768 3 allow_discards
librem5:~$ sudo fio --filename=~/test --readwrite=readwrite --bs=4k --direct=1 --loops=100 --name=plain --size=10M
plain: (g=0): rw=rw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=1
fio-3.26
Starting 1 process
Jobs: 1 (f=1): [M(1)][100.0%][r=8004KiB/s,w=8364KiB/s][r=2001,w=2091 IOPS][eta 00m:00s]
plain: (groupid=0, jobs=1): err= 0: pid=6072: Wed May 12 17:00:50 2021
  read: IOPS=1943, BW=7773KiB/s (7959kB/s)(491MiB/64647msec)
    clat (usec): min=110, max=11437, avg=274.95, stdev=376.00
     lat (usec): min=110, max=11437, avg=275.47, stdev=376.01
    clat percentiles (usec):
     |  1.00th=[  115],  5.00th=[  119], 10.00th=[  128], 20.00th=[  174],
     | 30.00th=[  176], 40.00th=[  180], 50.00th=[  186], 60.00th=[  206],
     | 70.00th=[  210], 80.00th=[  219], 90.00th=[  545], 95.00th=[  603],
     | 99.00th=[ 1598], 99.50th=[ 3097], 99.90th=[ 4080], 99.95th=[ 6980],
     | 99.99th=[ 8586]
   bw (  KiB/s): min= 5888, max= 9640, per=100.00%, avg=7784.74, stdev=815.07, samples=129
   iops        : min= 1472, max= 2410, avg=1946.17, stdev=203.78, samples=129
  write: IOPS=2016, BW=8067KiB/s (8261kB/s)(509MiB/64647msec); 0 zone resets
    clat (usec): min=121, max=11618, avg=221.29, stdev=197.98
     lat (usec): min=121, max=11618, avg=221.93, stdev=198.01
    clat percentiles (usec):
     |  1.00th=[  127],  5.00th=[  128], 10.00th=[  129], 20.00th=[  133],
     | 30.00th=[  135], 40.00th=[  139], 50.00th=[  167], 60.00th=[  172],
     | 70.00th=[  178], 80.00th=[  184], 90.00th=[  570], 95.00th=[  652],
     | 99.00th=[  824], 99.50th=[ 1172], 99.90th=[ 1680], 99.95th=[ 1745],
     | 99.99th=[ 2999]
   bw (  KiB/s): min= 6392, max= 9984, per=100.00%, avg=8079.60, stdev=826.62, samples=129
   iops        : min= 1598, max= 2496, avg=2019.88, stdev=206.65, samples=129
  lat (usec)   : 250=85.29%, 500=2.63%, 750=10.38%, 1000=0.19%
  lat (msec)   : 2=1.09%, 4=0.37%, 10=0.06%, 20=0.01%
  cpu          : usr=3.00%, sys=12.46%, ctx=256614, majf=0, minf=36
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=125621,130379,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: bw=7773KiB/s (7959kB/s), 7773KiB/s-7773KiB/s (7959kB/s-7959kB/s), io=491MiB (515MB), run=64647-64647msec
  WRITE: bw=8067KiB/s (8261kB/s), 8067KiB/s-8067KiB/s (8261kB/s-8261kB/s), io=509MiB (534MB), run=64647-64647msec

Disk stats (read/write):
    dm-0: ios=125591/130415, merge=0/0, ticks=31836/23472, in_queue=55308, util=100.00%, aggrios=125622/130431, aggrmerge=0/7, aggrticks=22005/13779, aggrin_queue=35806, aggrutil=99.97%
  mmcblk0: ios=125622/130431, merge=0/7, ticks=22005/13779, in_queue=35806, util=99.97%
librem5:~$ sudo dmsetup table
root: 0 60647424 crypt aes-xts-plain64 :64:logon:cryptsetup:e5f04dde-e097-4e12-935e-7029abfb8605-d0 0 179:2 32768 3 allow_discards no_read_workqueue no_write_workqueue
librem5:~$ sudo fio --filename=~/test --readwrite=readwrite --bs=4k --direct=1 --loops=100 --name=plain --size=10M
plain: (g=0): rw=rw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=psync, iodepth=1
fio-3.26
Starting 1 process
Jobs: 1 (f=1): [M(1)][100.0%][r=10.2MiB/s,w=10.5MiB/s][r=2601,w=2698 IOPS][eta 00m:00s]
plain: (groupid=0, jobs=1): err= 0: pid=2590: Wed May 12 17:40:49 2021
  read: IOPS=2656, BW=10.4MiB/s (10.9MB/s)(491MiB/47286msec)
    clat (usec): min=107, max=17501, avg=218.12, stdev=406.89
     lat (usec): min=107, max=17501, avg=218.57, stdev=406.92
    clat percentiles (usec):
     |  1.00th=[  110],  5.00th=[  112], 10.00th=[  113], 20.00th=[  141],
     | 30.00th=[  169], 40.00th=[  172], 50.00th=[  172], 60.00th=[  174],
     | 70.00th=[  178], 80.00th=[  188], 90.00th=[  204], 95.00th=[  227],
     | 99.00th=[ 1500], 99.50th=[ 3359], 99.90th=[ 6194], 99.95th=[ 7439],
     | 99.99th=[10028]
   bw (  KiB/s): min= 4750, max=12136, per=100.00%, avg=10643.41, stdev=1235.18, samples=94
   iops        : min= 1187, max= 3034, avg=2660.78, stdev=308.87, samples=94
  write: IOPS=2757, BW=10.8MiB/s (11.3MB/s)(509MiB/47286msec); 0 zone resets
    clat (usec): min=116, max=12864, avg=143.91, stdev=180.81
     lat (usec): min=117, max=12864, avg=144.47, stdev=180.87
    clat percentiles (usec):
     |  1.00th=[  118],  5.00th=[  119], 10.00th=[  120], 20.00th=[  120],
     | 30.00th=[  121], 40.00th=[  122], 50.00th=[  123], 60.00th=[  126],
     | 70.00th=[  133], 80.00th=[  145], 90.00th=[  159], 95.00th=[  169],
     | 99.00th=[  578], 99.50th=[ 1123], 99.90th=[ 2089], 99.95th=[ 3195],
     | 99.99th=[ 8455]
   bw (  KiB/s): min= 5037, max=12736, per=100.00%, avg=11045.41, stdev=1274.39, samples=94
   iops        : min= 1259, max= 3184, avg=2761.27, stdev=318.64, samples=94
  lat (usec)   : 250=97.17%, 500=0.94%, 750=0.48%, 1000=0.14%
  lat (msec)   : 2=0.76%, 4=0.40%, 10=0.10%, 20=0.01%
  cpu          : usr=3.49%, sys=22.73%, ctx=257922, majf=0, minf=37
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=125621,130379,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
   READ: bw=10.4MiB/s (10.9MB/s), 10.4MiB/s-10.4MiB/s (10.9MB/s-10.9MB/s), io=491MiB (515MB), run=47286-47286msec
  WRITE: bw=10.8MiB/s (11.3MB/s), 10.8MiB/s-10.8MiB/s (11.3MB/s-11.3MB/s), io=509MiB (534MB), run=47286-47286msec

Disk stats (read/write):
    dm-0: ios=126730/130418, merge=0/0, ticks=27844/16560, in_queue=44404, util=99.98%, aggrios=126805/130566, aggrmerge=316/268, aggrticks=26482/14060, aggrin_queue=40580, aggrutil=99.92%
  mmcblk0: ios=126805/130566, merge=316/268, ticks=26482/14060, in_queue=40580, util=99.92%
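
The two runs above can be compared directly from their fio summary lines. This is a small sketch of the arithmetic, using the IOPS values from the `read:` and `write:` lines of each run. `percent_gain` is a hypothetical helper name, and using IOPS rather than bandwidth is my choice, not necessarily how the figures in the description were derived.

```python
def percent_gain(before: float, after: float) -> float:
    """Relative improvement of `after` over `before`, in percent."""
    return (after - before) / before * 100.0


# (before, after) IOPS from the fio "read:" and "write:" summary lines:
# before = default workqueues, after = no_read_workqueue + no_write_workqueue.
runs = {
    "read": (1943, 2656),
    "write": (2016, 2757),
}

for op, (before, after) in runs.items():
    gain = percent_gain(before, after)
    print(f"{op}: {before} -> {after} IOPS ({gain:+.1f}%)")
```

For these two particular runs the gain works out to roughly 37% for both reads and writes, slightly above the ~33% and ~35% quoted in the description, which presumably came from other runs.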
