Buggy monitors when using multiple inputs
I'm having trouble adding a second input to a monitor for motion detection. I'm trying to add the camera's RTMP sub stream alongside the RTMP main stream.
First, I was no longer getting the live stream; it stayed black. After a bit of debugging I discovered that setting the motion detector's input map to 1:1 instead of just 1 made the live stream appear again. It turns out that my RTMP inputs have a data stream at *:0, and mapping 1:1 selects only the video stream while ignoring the data stream (equivalent to keeping -map 1 and adding -dn to the ffmpeg command line for this output).
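For illustration, the mapping difference looks roughly like this (camera URLs and output options are placeholders, not my exact command line):

```
# Both inputs expose a data stream at index 0 and the video at index 1.

# -map 1 pulls in ALL streams of input 1, data stream included:
ffmpeg -i rtmp://camera/main -i rtmp://camera/sub ... \
  -map 1 -f image2pipe pipe:1

# -map 1:1 selects just the video stream, which fixed the live stream
# (same effect as keeping -map 1 and adding -dn to this output):
ffmpeg -i rtmp://camera/main -i rtmp://camera/sub ... \
  -map 1:1 -f image2pipe pipe:1
```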
After that change the live stream appeared again, but only after refreshing the page, mashing the stream reconnect button a bunch of times, and waiting forever. Everything seemed slower, and the stream was now 20 seconds behind live (compared against ZoneMinder's MJPEG live stream of the same camera, and against the stream played directly in VLC). Before adding the second input, this HLS (copy) stream in Shinobi was only 8 seconds behind live. Switching the motion detector's input back to 0 didn't help either: the mere presence of the second input in the ffmpeg command line, even when no output uses it, adds a further 12 seconds of lag.
To make matters worse, motion detection no longer worked with this setup.
I grabbed Shinobi's ffmpeg command line and played around with it outside of Shinobi, to see if I could reproduce the delay, inspect the motion detector output, and work out what was actually happening. I changed the piped motion detector output to a series of JPEGs, but otherwise left the arguments intact.
I discovered that if the motion detector output ignores the data stream (either via -dn or -map 1:1), only the first JPEG for that output gets created, while all the .ts files for the other outputs are correct and on time. If I instead remove -dn and keep -map 1 for this output (keeping the data stream), I get all the JPEGs correctly, but only the first .ts file (0 bytes) gets created for each of the other outputs.
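Roughly, the two variants I tested looked like this (paths and options simplified; the real command has more outputs):

```
# Variant A: detector output ignores the data stream
#   -> only the first JPEG appears; all .ts segments are fine
ffmpeg -i rtmp://camera/main -i rtmp://camera/sub \
  -map 0:1 -c copy -f hls -hls_time 2 live.m3u8 \
  -map 1:1 -f image2 frame_%05d.jpg

# Variant B: detector output keeps the data stream (-map 1, no -dn)
#   -> all JPEGs appear; the other outputs' .ts files stall at 0 bytes
ffmpeg -i rtmp://camera/main -i rtmp://camera/sub \
  -map 0:1 -c copy -f hls -hls_time 2 live.m3u8 \
  -map 1 -f image2 frame_%05d.jpg
```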
This explains what I saw on Shinobi. Setting -map 1 for the motion detector output broke the live stream, but setting -map 1:1 made motion detection fail (only the first frame gets decoded).
Ignoring the data streams for ALL the outputs (adding -dn to every output) did get both the JPEGs and the .ts files created, but with serious issues. The .ts files were no longer the 2 seconds long they were before: VLC reported lengths varying from 4 to 17 seconds, and playback jumped all over the place and ran almost a minute behind live.
In summary:
- Adding -map 1 to the motion detector output broke all other outputs (including the live stream).
- Adding -map 1:1 to the motion detector output broke the motion detection but made the other outputs work again.
- Adding -map 1:1 to the motion detector output and -map 0:1 to the others made all the streams sort of work, but with the heavy lag and erratic playback described above.
I ended up splitting the ffmpeg command line in two: one for the outputs that use the camera's main stream, and one for the motion detector output that uses the camera's sub stream. I hacked libs/cameraThread/singleCamera.js to spawn two ffmpeg processes, sorted out the pipes, and restarted the monitor. Motion detection, recording, and the live stream all work. The .ts chunks are consistently 2 seconds again, with no extra lag anywhere (just the usual 8 seconds behind live).
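The split I hacked together boils down to something like this (simplified; the real arguments are the ones Shinobi generates):

```
# Process 1: main stream -> live stream (HLS copy) and recording
ffmpeg -i rtmp://camera/main \
  -map 0:1 -c copy -f hls -hls_time 2 live.m3u8 \
  -map 0:1 -c copy recording.mp4

# Process 2: sub stream -> frames piped to the motion detector
ffmpeg -i rtmp://camera/sub \
  -map 0:1 -f image2pipe -vcodec mjpeg pipe:1
```

With each process owning a single input, no output ever needs -map across inputs, and the data-stream mapping problem disappears.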
Any ideas how this could be implemented properly? Running one ffmpeg process per input seems to work, but Shinobi's web interface lets you mix streams from different inputs in a single output, which makes this tricky: such an output would need frames from both processes, possibly forcing duplicate decoding.