@jkohls These are interesting results. I think I need to do some reading about how Jack's rate/period/frames interacts with the underlying ALSA driver.
The following thread on the subject is intriguing to me.
List of JACK Frame & Period settings ideal for USB interface
tbritton wrote: ↑Fri Feb 22, 2013 5:20 am
...
you can enter the numbers below and thus achieve an even multiple of 1ms (that is, a whole-numbered latency figure) for use with USB interfaces which require that.
...
(actual latency depends upon other factors, but USB devices want the math-derived latency to be an even multiple of 1ms)
The asserted importance of making USB period/frames work out to an even multiple of 1ms hints at something kind of interesting to me. I need to read more, but I have a hunch. The USB Audio Class ALSA driver submits 1ms worth of audio with each USB request block. This 1ms is measured by the USB host controller clock, not by the audio word clock. Many simpler audio interfaces may actually derive their word clock from the USB clock. For these devices that would mean that 1ms measured by the USB bus would be equal to exactly 1ms measured by the word clock. In the USB class specification, such a device would be called a "synchronous" device. Practically speaking, for a High Speed USB Audio Class device running at 48 kHz, this means that each USB request block will submit 1ms of audio (8 USB micro-frames, each containing 6 audio samples).
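To make the arithmetic above concrete, here is a small Python sketch of the nominal sample counts for a synchronous device. It assumes only what is stated above (125us micro-frames, 8 micro-frames per 1ms request block); the function name and the list of rates are my own choices for illustration, not anything taken from the ALSA driver.

```python
from fractions import Fraction

# High Speed USB micro-frame interval: 125 us, 8 micro-frames per 1 ms
MICROFRAME_SECONDS = Fraction(125, 1_000_000)
MICROFRAMES_PER_MS = 8

def nominal_samples(rate_hz):
    """Return (samples per micro-frame, samples per 1 ms request block)."""
    per_microframe = rate_hz * MICROFRAME_SECONDS
    per_ms = per_microframe * MICROFRAMES_PER_MS
    return per_microframe, per_ms

# Illustrative rates only; 48 kHz and 192 kHz are the ones discussed in this thread
for rate in (44100, 48000, 96000, 192000):
    per_uf, per_ms = nominal_samples(rate)
    print(f"{rate} Hz: {per_uf} samples/micro-frame, {per_ms} samples/ms")
```

At 48 kHz this gives 6 samples per micro-frame and 48 per request block. At 44.1 kHz the per-micro-frame figure is not a whole number, so even a nominally exact clock cannot put the same number of samples in every micro-frame.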
MOTU devices operate in an "asynchronous" USB Audio Class mode. This means that the word clock is not derived from the USB clock. Instead, the number of samples in each USB micro-frame is varied depending on how many word clock intervals elapse during the USB micro-frame interval of 125us. So, if for example, the word clock was slightly fast compared to the USB bus clock, every once in a while a USB micro-frame will have one extra sample in it. So on the bus you would see a majority of micro-frames containing 6 samples and a few micro-frames containing 7 samples. This means that 1ms of audio measured by the USB host may actually be 48 plus or minus a few samples. I think if I can justify this understanding against the even 1ms recommendation, I might be closer to understanding these XRUN issues.
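Here is a rough Python model of that behaviour, just to illustrate the idea. The 48048 Hz figure (a word clock running 0.1% fast relative to the USB clock) is a made-up number chosen to make the effect easy to see, and the function is hypothetical; it is not how the device firmware or the driver actually computes packet sizes.

```python
def samples_per_microframe(actual_rate_hz, n_microframes):
    """Whole samples produced by the word clock in each 125 us micro-frame.

    actual_rate_hz is the word clock rate as seen by the USB clock
    (an integer, so the running total can use exact integer math).
    """
    counts = []
    sent = 0
    for i in range(1, n_microframes + 1):
        # Samples completed by the end of micro-frame i (8000 micro-frames/s)
        total = actual_rate_hz * i // 8000
        counts.append(total - sent)
        sent = total
    return counts

# Hypothetical: nominal 48 kHz word clock running 0.1% fast, over one second
counts = samples_per_microframe(48048, 8000)
print("micro-frames with 6 samples:", counts.count(6))
print("micro-frames with 7 samples:", counts.count(7))

# Group into 1 ms request blocks of 8 micro-frames each
per_ms = [sum(counts[i:i + 8]) for i in range(0, len(counts), 8)]
print("distinct samples-per-ms values:", sorted(set(per_ms)))
```

With these made-up numbers, 48 of the 8000 micro-frames in one second carry 7 samples and the rest carry 6, so a 1ms request block holds either 48 or 49 samples rather than a constant 48.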
Another observation that might support this area of inquiry is the difference in result between 192k/1024/2 and 192k/1024/3. At 192kHz there are nominally 24 samples in each USB micro-frame, or 192 nominal samples for each request block. Dividing the frames * periods product by the number of nominal samples per 1ms (request block) yields a fractional result in the non-working case (1024 * 2 / 192 = 10.6666) and a whole number result in the working case (1024 * 3 / 192 = 16).
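The same check is easy to run against other configurations. This is a minimal Python sketch of the division above; the function name is mine, and it only tests the "whole number of 1ms request blocks" hunch, not whether a setting will actually pass.

```python
from fractions import Fraction

def buffer_in_request_blocks(rate_hz, frames_per_period, periods):
    """Total JACK buffer size expressed in nominal 1 ms USB request blocks."""
    nominal_samples_per_ms = Fraction(rate_hz, 1000)
    return Fraction(frames_per_period * periods) / nominal_samples_per_ms

# The two 192k configurations compared above
for periods in (2, 3):
    ratio = buffer_in_request_blocks(192000, 1024, periods)
    kind = "whole" if ratio.denominator == 1 else "fractional"
    print(f"192k/1024/{periods}: {ratio} request blocks ({kind})")
```

For 192k/1024/2 the ratio comes out as 32/3 (the 10.6666 above), and for 192k/1024/3 it is exactly 16.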
That said, there are configurations that you are reporting as a Pass that yield fractional results, so there is definitely more to the story. However, I am currently thinking that these non-pass results may be addressable through Jack configuration rather than through a driver or firmware change. If anyone has any thoughts or additional understanding about this 1ms Jack requirement for USB devices, I'd be happy to hear more. Not being that familiar with Jack, I wonder if there is a way to add an extra constant amount of buffering that can absorb any of the jitter from the variable frame counts coming over USB.