Tuning out Xruns, cgroups usage advice

FawkesThat
Established Member
Posts: 10
Joined: Mon Dec 10, 2018 8:24 am

Tuning out Xruns, cgroups usage advice

Post by FawkesThat »

I first put this i5 system with two Edirol DA2496 cards together back in 2014. I eventually ended up using KXStudio on it, and had it happily recording at 96kHz with a low-enough-for-live latency. It hasn't seen much use like that recently, and spent about a year switched off. Acquiring a 10U flight case and rack mount PC case has encouraged me to put it back into service. Just about the time KXStudio went 'on hiatus'.

Long story cut short, I've settled on a KDE Neon install using the HWE 18.04 lowlatency kernel and the recently announced new KXStudio repo(s). Getting that 96kHz xrun-free sampling is proving a lot more difficult than back in 2014. Since then we've had Spectre, and the kernel devs have been beavering away at making cgroups completely incomprehensible to mere mortals.

The system is sat next to me at the moment; it's been running JACK at 96000/256/3 with Audacious playing an album on loop for a simple load. So far 3 Xruns in the last 2 hours. Good enough for recording, but not for live work since the xruns can be a second or so of silence. Yes, I still get Xruns without Audacious running, albeit perhaps even less frequently. I'd prefer none, and I've no qualms about building an rt kernel - if I can put one together with any Ubuntu-specific tweaks the current lowlatency kernel includes. [Aside: I've tried booting off the AVLinux live ISO and still got Xruns with it].

Here's a rundown of the tuning process so far, documented fairly completely so it might be of use to others. Please highlight any obsolete/incorrect bits you might spot:

Kernel & /etc/default/grub:

Code: Select all

$ uname -a
Linux FT-Studio-M 5.0.0-32-lowlatency #34~18.04.2-Ubuntu SMP PREEMPT Thu Oct 10 11:22:08 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

# Transparent hugepage copied from AVLinux, isolcpus onward from Drumfix's post (see later)
GRUB_CMDLINE_LINUX_DEFAULT="threadirqs transparent_hugepage=never noresume noquiet nosplash isolcpus=1-3 nohz_full=1-3 rcu_nocbs=1-3 nowatchdog acpi_irq_nobalance"
rtConfigQuickScan (pulled from git, so latest version):

Code: Select all

$ perl ./realTimeConfigQuickScan.pl 
== GUI-enabled checks ==
Checking if you are root... no - good
Checking filesystem 'noatime' parameter... 5.0.0 kernel - good
(relatime is default since 2.6.30)
Checking CPU Governors... CPU 0: 'performance' CPU 1: 'performance' CPU 2: 'performance' CPU 3: 'performance'  - good
Checking swappiness... 10 - good
Checking for resource-intensive background processes... none found - good
Checking checking sysctl inotify max_user_watches... >= 524288 - good
Checking access to the high precision event timer... readable - good
Checking access to the real-time clock... readable - good
Checking whether you're in the 'audio' group... yes - good
Checking for multiple 'audio' groups... no - good
Checking the ability to prioritize processes with chrt... yes - good
Checking kernel support for high resolution timers... found - good
Kernel with Real-Time Preemption... not found - not good
Kernel without real-time capabilities found
For more information, see http://wiki.linuxaudio.org/wiki/system_configuration#installing_a_real-time_kernel
Checking if kernel system timer is high-resolution... found - good
Checking kernel support for tickless timer... found - good
== Other checks ==
Checking filesystem types... ok.
Checking for devices at IRQ 19... did not find multiple. ok.
limits.conf:

Code: Select all

$ cat /etc/security/limits.d/audio.conf 
# Provided by the jackd package.
#
# Changes to this file will be preserved.
#
# If you want to enable/disable realtime permissions, run
#
#    dpkg-reconfigure -p high jackd

@audio   -  rtprio     95
@audio   -  memlock    unlimited
#@audio   -  nice      -19
sysctl.conf:

Code: Select all

$ tail /etc/sysctl.conf 
#
# Protects against creating or following links under certain conditions
# Debian kernels have both set to 1 (restricted) 
# See https://www.kernel.org/doc/Documentation/sysctl/fs.txt
#fs.protected_hardlinks=0
#fs.protected_symlinks=0

# Per quickscan requirements
vm.swappiness =10
fs.inotify.max_user_watches = 524288
# From Rosegarden tuning tip, may be obsolete
dev.hpet.max-user-freq=3072
rtc access:

Code: Select all

$ cat /etc/udev/rules.d/40-timer-permissions.rules 
KERNEL=="rtc0", GROUP="audio"
KERNEL=="hpet", GROUP="audio"

## Read access is set up in rtirq script (details later in this post)
rtirq settings:

Code: Select all

$ tail -n24 /etc/default/rtirq
# RTIRQ_NAME_LIST="rtc snd usb i8042" # old
RTIRQ_NAME_LIST="snd usb i8042 rtc hci_rcd"

# Highest priority.
RTIRQ_PRIO_HIGH=90

# Priority decrease step.
RTIRQ_PRIO_DECR=5

# Lowest priority.
RTIRQ_PRIO_LOW=51

# Whether to reset all IRQ threads to SCHED_OTHER.
RTIRQ_RESET_ALL=0

# On kernel configurations that support it,
# which services should be NOT threaded 
# (space separated list).
RTIRQ_NON_THREADED="rtc snd"

# Process names which will be forced to the
# highest realtime priority range (99-91)
# (space separated list, from highest to lower priority).
# RTIRQ_HIGH_LIST="timer"
And status output:

Code: Select all

$ /etc/init.d/rtirq status

  PID CLS RTPRIO  NI PRI %CPU STAT COMMAND
  546 FF      90   - 130  1.1 S    irq/19-snd_ice1
  609 FF      90   - 130  0.0 S    irq/16-snd_ice1
  150 FF      85   - 125  0.0 S    irq/24-xhci_hcd
   46 FF      50   -  90  0.0 S    irq/9-acpi
  151 FF      50   -  90  0.0 S    irq/8-rtc0
  255 FF      50   -  90  0.0 S    irq/25-ahci[000
  260 FF      50   -  90  0.0 S    irq/27-i915
  450 FF      50   -  90  0.0 S    irq/28-mei_me
  994 FF      50   -  90  0.0 S    irq/26-enp2s0
    9 TS       -   0  19  0.1 S    ksoftirqd/0
   18 TS       -   0  19  0.0 S    ksoftirqd/1
   24 TS       -   0  19  0.0 S    ksoftirqd/2
   30 TS       -   0  19  0.0 S    ksoftirqd/3
/proc/interrupts:

Code: Select all

$ cat /proc/interrupts
           CPU0       CPU1       CPU2       CPU3       
  0:          9          0          0          0   IO-APIC   2-edge      timer
  8:          0          0          0          1   IO-APIC   8-edge      rtc0
  9:          0          5          0          0   IO-APIC   9-fasteoi   acpi
 16:          0          0          0          0   IO-APIC  16-fasteoi   snd_ice1712
 19:          0    6647744          0          1   IO-APIC  19-fasteoi   snd_ice1712
 24:        497      16964         50          0   PCI-MSI 327680-edge      xhci_hcd
 25:      27651          0          0          0   PCI-MSI 512000-edge      ahci[0000:00:1f.2]
 26:          0          0          0          0   PCI-MSI 1048576-edge      enp2s0
 27:     303403        151          0          0   PCI-MSI 32768-edge      i915
 28:          0          0         20          0   PCI-MSI 360448-edge      mei_me
NMI:          0          0          0          0   Non-maskable interrupts
LOC:    5250288     382908       7425       7395   Local timer interrupts
SPU:          0          0          0          0   Spurious interrupts
PMI:          0          0          0          0   Performance monitoring interrupts
IWI:         73          0          0          0   IRQ work interrupts
RTR:          0          0          0          0   APIC ICR read retries
RES:    1005288          5          0          0   Rescheduling interrupts
CAL:          0      12038      12038      12036   Function call interrupts
TLB:          0          0          0          0   TLB shootdowns
TRM:          0          0          0          0   Thermal event interrupts
THR:          0          0          0          0   Threshold APIC interrupts
DFR:          0          0          0          0   Deferred Error APIC interrupts
MCE:          0          0          0          0   Machine check exceptions
MCP:         31         32         32         32   Machine check polls
HYP:          0          0          0          0   Hypervisor callback interrupts
HRE:          0          0          0          0   Hyper-V reenlightenment interrupts
HVS:          0          0          0          0   Hyper-V stimer0 interrupts
ERR:          0
MIS:          0
PIN:          0          0          0          0   Posted-interrupt notification event
NPI:          0          0          0          0   Nested posted-interrupt event
PIW:          0          0          0          0   Posted-interrupt wakeup event
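
A table like the one above is easier to check with a few lines of code than by eye. The following is a minimal Python sketch, not part of the original post, that parses /proc/interrupts-style text and reports which CPU has serviced each IRQ most often. The function names and the trimmed SAMPLE text are illustrative assumptions; on a live system the text would come from reading /proc/interrupts.

```python
def parse_interrupts(text):
    """Parse /proc/interrupts-style text into {label: {"counts": [...], "desc": str}}."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    n_cpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    table = {}
    for ln in lines[1:]:
        label, _, rest = ln.partition(":")
        fields = rest.split()
        counts = []
        # Leading numeric fields are the per-CPU counters; rows such as
        # ERR/MIS carry a single counter only.
        for field in fields[:n_cpus]:
            if not field.isdigit():
                break
            counts.append(int(field))
        table[label.strip()] = {
            "counts": counts,
            "desc": " ".join(fields[len(counts):]),
        }
    return table


def busiest_cpu(counts):
    """Index of the CPU with the highest interrupt count."""
    return max(range(len(counts)), key=lambda i: counts[i])


# Trimmed sample taken from the output quoted in this post.
SAMPLE = """\
           CPU0       CPU1       CPU2       CPU3
 16:          0          0          0          0   IO-APIC  16-fasteoi   snd_ice1712
 19:          0    6647744          0          1   IO-APIC  19-fasteoi   snd_ice1712
 25:      27651          0          0          0   PCI-MSI 512000-edge      ahci[0000:00:1f.2]
ERR:          0
"""

if __name__ == "__main__":
    for irq, row in parse_interrupts(SAMPLE).items():
        if sum(row["counts"]) > 0:
            print(irq, "-> CPU", busiest_cpu(row["counts"]), row["desc"])
```

Run against the sample, this shows IRQ 19 (the active ICE1712 card) landing on CPU 1 and the AHCI controller on CPU 0, which matches the affinity settings described later in the post.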
lspci:

Code: Select all

$ lspci -v
00:00.0 Host bridge: Intel Corporation 4th Gen Core Processor DRAM Controller (rev 06)
        Subsystem: Gigabyte Technology Co., Ltd 4th Gen Core Processor DRAM Controller
        Flags: bus master, fast devsel, latency 0
        Capabilities: <access denied>
        Kernel driver in use: hsw_uncore

00:02.0 VGA compatible controller: Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor Integrated Graphics Controller (rev 06) (prog-if 00 [VGA controller])
        Subsystem: Gigabyte Technology Co., Ltd Xeon E3-1200 v3/4th Gen Core Processor Integrated Graphics Controller
        Flags: bus master, fast devsel, latency 0, IRQ 27
        Memory at f7800000 (64-bit, non-prefetchable) [size=4M]
        Memory at e0000000 (64-bit, prefetchable) [size=256M]
        I/O ports at f000 [size=64]
        [virtual] Expansion ROM at 000c0000 [disabled] [size=128K]
        Capabilities: <access denied>
        Kernel driver in use: i915
        Kernel modules: i915

00:03.0 Audio device: Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor HD Audio Controller (rev 06)
        Subsystem: Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor HD Audio Controller
        Flags: bus master, fast devsel, latency 0, IRQ 11
        Memory at f7d10000 (64-bit, non-prefetchable) [size=16K]
        Capabilities: <access denied>
        Kernel modules: snd_hda_intel

00:14.0 USB controller: Intel Corporation 8 Series/C220 Series Chipset Family USB xHCI (rev 05) (prog-if 30 [XHCI])
        Subsystem: Gigabyte Technology Co., Ltd 8 Series/C220 Series Chipset Family USB xHCI
        Flags: bus master, medium devsel, latency 0, IRQ 24
        Memory at f7d00000 (64-bit, non-prefetchable) [size=64K]
        Capabilities: <access denied>
        Kernel driver in use: xhci_hcd

00:16.0 Communication controller: Intel Corporation 8 Series/C220 Series Chipset Family MEI Controller #1 (rev 04)
        Subsystem: Gigabyte Technology Co., Ltd 8 Series/C220 Series Chipset Family MEI Controller
        Flags: bus master, fast devsel, latency 0, IRQ 28
        Memory at f7d19000 (64-bit, non-prefetchable) [size=16]
        Capabilities: <access denied>
        Kernel driver in use: mei_me
        Kernel modules: mei_me

00:16.3 Serial controller: Intel Corporation 8 Series/C220 Series Chipset Family KT Controller (rev 04) (prog-if 02 [16550])
        Subsystem: Gigabyte Technology Co., Ltd 8 Series/C220 Series Chipset Family KT Controller
        Flags: bus master, 66MHz, fast devsel, latency 0, IRQ 19
        I/O ports at f0c0 [size=8]
        Memory at f7d17000 (32-bit, non-prefetchable) [size=4K]
        Capabilities: <access denied>
        Kernel driver in use: serial

00:1c.0 PCI bridge: Intel Corporation 8 Series/C220 Series Chipset Family PCI Express Root Port #1 (rev d5) (prog-if 00 [Normal decode])
        Flags: bus master, fast devsel, latency 0, IRQ 16
        Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
        Capabilities: <access denied>
        Kernel driver in use: pcieport

00:1c.2 PCI bridge: Intel Corporation 8 Series/C220 Series Chipset Family PCI Express Root Port #3 (rev d5) (prog-if 00 [Normal decode])
        Flags: bus master, fast devsel, latency 0, IRQ 18
        Bus: primary=00, secondary=02, subordinate=02, sec-latency=0
        I/O behind bridge: 0000e000-0000efff
        Memory behind bridge: f7c00000-f7cfffff
        Prefetchable memory behind bridge: 00000000f0000000-00000000f00fffff
        Capabilities: <access denied>
        Kernel driver in use: pcieport

00:1c.3 PCI bridge: Intel Corporation 82801 PCI Bridge (rev d5) (prog-if 01 [Subtractive decode])
        Flags: bus master, fast devsel, latency 0, IRQ 19
        Bus: primary=00, secondary=03, subordinate=04, sec-latency=0
        I/O behind bridge: 0000d000-0000dfff
        Capabilities: <access denied>

00:1f.0 ISA bridge: Intel Corporation H87 Express LPC Controller (rev 05)
        Subsystem: Gigabyte Technology Co., Ltd H87 Express LPC Controller
        Flags: bus master, medium devsel, latency 0
        Capabilities: <access denied>
        Kernel driver in use: lpc_ich
        Kernel modules: lpc_ich

00:1f.2 SATA controller: Intel Corporation 8 Series/C220 Series Chipset Family 6-port SATA Controller 1 [AHCI mode] (rev 05) (prog-if 01 [AHCI 1.0])
        Subsystem: Gigabyte Technology Co., Ltd 8 Series/C220 Series Chipset Family 6-port SATA Controller 1 [AHCI mode]
        Flags: bus master, 66MHz, medium devsel, latency 0, IRQ 25
        I/O ports at f0b0 [size=8]
        I/O ports at f0a0 [size=4]
        I/O ports at f090 [size=8]
        I/O ports at f080 [size=4]
        I/O ports at f060 [size=32]
        Memory at f7d16000 (32-bit, non-prefetchable) [size=2K]
        Capabilities: <access denied>
        Kernel driver in use: ahci
        Kernel modules: ahci

00:1f.3 SMBus: Intel Corporation 8 Series/C220 Series Chipset Family SMBus Controller (rev 05)
        Subsystem: Gigabyte Technology Co., Ltd 8 Series/C220 Series Chipset Family SMBus Controller
        Flags: medium devsel, IRQ 5
        Memory at f7d15000 (64-bit, non-prefetchable) [disabled] [size=256]
        I/O ports at f040 [size=32]
        Kernel modules: i2c_i801

02:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 06)
        Subsystem: Gigabyte Technology Co., Ltd Onboard Ethernet
        Flags: bus master, fast devsel, latency 0, IRQ 18
        I/O ports at e000 [size=256]
        Memory at f7c00000 (64-bit, non-prefetchable) [size=4K]
        Memory at f0000000 (64-bit, prefetchable) [size=16K]
        Capabilities: <access denied>
        Kernel driver in use: r8169
        Kernel modules: r8169

03:00.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev 41) (prog-if 01 [Subtractive decode])
        Flags: bus master, fast devsel, latency 0, IRQ 19
        Bus: primary=03, secondary=04, subordinate=04, sec-latency=32
        I/O behind bridge: 0000d000-0000dfff
        Capabilities: <access denied>

04:00.0 Multimedia audio controller: VIA Technologies Inc. ICE1712 [Envy24] PCI Multi-Channel I/O Controller (rev 02)
        Subsystem: Roland Corp. Edirol DA-2496
        Flags: bus master, medium devsel, latency 32, IRQ 19
        I/O ports at d0a0 [size=32]
        I/O ports at d0f0 [size=16]
        I/O ports at d0e0 [size=16]
        I/O ports at d040 [size=64]
        Capabilities: <access denied>
        Kernel driver in use: snd_ice1712
        Kernel modules: snd_ice1712

04:01.0 Multimedia audio controller: VIA Technologies Inc. ICE1712 [Envy24] PCI Multi-Channel I/O Controller (rev 02)
        Subsystem: Roland Corp. Edirol DA-2496
        Flags: bus master, medium devsel, latency 32, IRQ 16
        I/O ports at d080 [size=32]
        I/O ports at d0d0 [size=16]
        I/O ports at d0c0 [size=16]
        I/O ports at d000 [size=64]
        Capabilities: <access denied>
        Kernel driver in use: snd_ice1712
        Kernel modules: snd_ice1712
I'm using Cadence, but the jackd command line would be as follows:

Code: Select all

$ jackd -P90 -v -S -ch -dalsa -dhw:DA2496,0 -zs -M -H -p256 -r96000 -n3
FYI: Compositing is disabled in the Display and Monitor settings; just to eliminate that as a potential problem. I've also tweaked BIOS settings and can share those too if it may help pinpoint my issue.

Even with all that, I still had periodic Xruns annoying me - far more frequently than now. Then I found some obscure material related to cgroups. What's on jackaudio.org baffles me, and seems to relate to cgroups v1 whereas the current version is cgroups v2.

This post from Drumfix has helped a lot. However, if I disable rt throttling the system will sometimes lock up. All I'm doing just now is making sure the IRQs for the soundcards are off Core 0.

I've done this in /etc/init.d/rtirq, which I've a suspicion might be the ideal place to deal with this for most people. Here are the added bits to tweak CPU affinity:

Code: Select all

    ## Disable RT throttling - causes hangs
    #echo -1 >/proc/sys/kernel/sched_rt_runtime_us
    ## ICE cards to core 1
    echo 2 >/proc/irq/16/smp_affinity
    echo 2 >/proc/irq/19/smp_affinity
    ## xhci_hcd to core 1
    echo 2 >/proc/irq/24/smp_affinity
    ## Force disk handling to core 0
    echo 1 >/proc/irq/25/smp_affinity
    ## Force i915 (Intel graphics) to core 0
    echo 1 >/proc/irq/27/smp_affinity
    
    ## Timer frequency/readable RTC clock
    echo 3072 >/sys/class/rtc/rtc0/max_user_freq
    chmod 660 /dev/hpet /dev/rtc0
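
For anyone puzzled by the values echoed into smp_affinity above: the file takes a hexadecimal bitmask in which bit N selects CPU N, so 1 means core 0 and 2 means core 1. A small Python sketch of the conversion follows; the helper names are made up for illustration and are not part of rtirq or any other tool.

```python
def cpus_to_mask(cpus):
    """Build the hex string written to /proc/irq/<n>/smp_affinity."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu  # bit N selects CPU N
    return format(mask, "x")


def mask_to_cpus(mask_hex):
    """Decode a hex affinity mask back into a list of CPU numbers."""
    mask = int(mask_hex, 16)
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


if __name__ == "__main__":
    print(cpus_to_mask([1]))        # mask used for the sound cards and xhci_hcd
    print(cpus_to_mask([0]))        # mask used for ahci and i915
    print(cpus_to_mask([1, 2, 3]))  # all of the isolated cores at once
    print(mask_to_cpus("f"))        # every core on a four-core CPU
```

The same encoding would let an IRQ float across cores 1-3 by writing the mask for all three, should pinning everything to core 1 prove too crowded.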
The system is currently running a single card at 96000/256/3, meaning Cadence reports 2.7ms block latency. DSP load sits around 1.40% whilst running Audacious.

Last bit of data for troubleshooting - a couple of sample Xruns:

Code: Select all

[...]
Tue Nov 12 12:38:55 2019: Jack: JackSocketServerChannel::Execute : fPollTable i = 2 fd = 14
Tue Nov 12 12:38:59 2019: Jack: **** alsa_pcm: xrun of at least -6689945.088 msecs
Tue Nov 12 12:38:59 2019: Jack: ALSA XRun wait_status = 0
Tue Nov 12 12:38:59 2019: Jack: JackSocketServerChannel::Execute : fPollTable i = 1 fd = 13
Tue Nov 12 12:38:59 2019: Jack: JackRequest::Notification
Tue Nov 12 12:38:59 2019: Jack: JackEngine::ClientNotify: no callback for notification = 3
Tue Nov 12 12:38:59 2019: Jack: JackEngine::ClientNotify: no callback for notification = 3
Tue Nov 12 12:38:59 2019: Jack: JackClient::ClientNotify ref = 2 name = dbusapi notify = 3
Tue Nov 12 12:38:59 2019: Jack: JackClient::kXRunCallback
Tue Nov 12 12:38:59 2019: Jack: JackEngine::ClientNotify: no callback for notification = 3
[...]
Tue Nov 12 13:03:20 2019: Jack: JackSocketServerChannel::Execute : fPollTable i = 2 fd = 14
Tue Nov 12 13:03:23 2019: Jack: **** alsa_pcm: xrun of at least -8153522.176 msecs
Tue Nov 12 13:03:23 2019: Jack: ALSA XRun wait_status = 0
Tue Nov 12 13:03:23 2019: Jack: JackSocketServerChannel::Execute : fPollTable i = 1 fd = 13
Tue Nov 12 13:03:23 2019: Jack: JackRequest::Notification
Tue Nov 12 13:03:23 2019: Jack: JackEngine::ClientNotify: no callback for notification = 3
Tue Nov 12 13:03:23 2019: Jack: JackEngine::ClientNotify: no callback for notification = 3
Tue Nov 12 13:03:23 2019: Jack: JackClient::ClientNotify ref = 2 name = dbusapi notify = 3
Tue Nov 12 13:03:23 2019: Jack: JackClient::kXRunCallback
Tue Nov 12 13:03:23 2019: Jack: JackEngine::ClientNotify: no callback for notification = 3
Tue Nov 12 13:03:23 2019: Jack: JackSocketServerChannel::Execute : fPollTable i = 2 fd = 14
[...]
Can anyone help me get a better understanding of cgroups tuning, offer any other suggestions, or point me to a decent how-to on building an Ubuntu/KDE Neon specific RT kernel?
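
To put a number on how often the Xruns occur, log lines like the samples above can be parsed for their timestamps. This is a rough Python sketch under two assumptions: the log keeps the timestamp format shown in the excerpt, and the script runs in an English locale (needed for the day and month names). The function names are illustrative only.

```python
import re
from datetime import datetime

# Matches lines such as:
# "Tue Nov 12 12:38:59 2019: Jack: **** alsa_pcm: xrun of at least ..."
XRUN_RE = re.compile(
    r"^(?P<ts>\w{3} \w{3} +\d+ \d\d:\d\d:\d\d \d{4}): Jack: \*\*\*\* alsa_pcm: xrun"
)


def xrun_times(log_text):
    """Return the timestamps of all alsa_pcm xrun lines in a JACK log."""
    times = []
    for line in log_text.splitlines():
        match = XRUN_RE.match(line)
        if match:
            times.append(datetime.strptime(match.group("ts"), "%a %b %d %H:%M:%S %Y"))
    return times


def xrun_intervals(times):
    """Gaps between consecutive xruns, as timedelta objects."""
    return [later - earlier for earlier, later in zip(times, times[1:])]


# Two xrun lines copied from the log excerpt, plus one unrelated line.
SAMPLE_LOG = """\
Tue Nov 12 12:38:59 2019: Jack: **** alsa_pcm: xrun of at least -6689945.088 msecs
Tue Nov 12 12:38:59 2019: Jack: ALSA XRun wait_status = 0
Tue Nov 12 13:03:23 2019: Jack: **** alsa_pcm: xrun of at least -8153522.176 msecs
"""

if __name__ == "__main__":
    for gap in xrun_intervals(xrun_times(SAMPLE_LOG)):
        print("seconds between xruns:", gap.total_seconds())
```

Regular gaps would point at something periodic (a timer or a polling daemon), while irregular ones are more suggestive of interrupt contention or hardware trouble.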
JamesPeters
Established Member
Posts: 188
Joined: Fri Jun 29, 2018 6:35 pm
Has thanked: 8 times
Been thanked: 15 times

Re: Tuning out Xruns, cgroups usage advice

Post by JamesPeters »

With 2 different computer systems (Ryzen 3700X, Core i3-6300) I tried Kubuntu, and later with Lubuntu I added KDE Plasma (comparing it with Lxqt). So I've tried KDE Plasma twice, with that second time being a direct comparison to Lxqt (all else being equal).

Having KDE Plasma on either system meant somewhat worse performance for low latency audio in Reaper (the DAW that I use) when using a lot of CPU. Also I couldn't quite get the system as stable (without xruns) at very low latencies (approx. 2 ms round trip) in general. This is compared to the other distros I've tried the same tests with on those 2 computers, including: Mint Xfce, Xubuntu, MX Linux, Lubuntu. I also briefly ran a few other distros but didn't test them as much (their performance for Reaper was still better than distros with KDE Plasma): Mint Cinnamon, Mint Mate, Ubuntu. I tested with the onboard (Realtek) audio devices of each mainboard, an Asus audio card, and Focusrite Scarlett 2i2 3rd gen (the latter 2 being swapped from one system to the next during testing).

I've tried disabling visual effects with KDE Plasma for a fair test. When using other DEs, I tested with and without different compositors and/or effects (Xfce's own compositor, Lubuntu's default compositor, and compared to Compton on any system that would use it). I also have decent Nvidia and AMD graphics cards which work fine in any distro, and these cards were swapped as part of the test (in case Kubuntu "preferred" one vs the other with KDE Plasma). I have several test projects in Reaper saved that I push "to the breaking point" watching for xruns, using ALSA.

Overall I had comparable performance for low latency audio at high cpu with most distros, and the compositor or visual effects had little to no performance impact on my systems (except that using a compositor always helped avoid screen tear, which I consider very beneficial). The distros that stood out as performing somewhat worse had KDE Plasma installed.

I only do a few tweaks to the system:

-change to a low latency kernel,
-change the vm swappiness,
-cpu frequency governor as "performance" (that I set with indicator-cpufreq when I run Reaper),
-disable xiccd in startup (color management for monitor in Xubuntu, not needed for me, took extra cpu)
-disable bluetooth
-turn off power management and screen saver (let them start in session, but make sure they're set to do nothing; this seems better than disabling them, since it seems there are defaults baked into the kernel which then get used.)

I've tried the other tweaks mentioned on various Linux boards (and in your post, except the cgroups thing which I'd never heard of) plus also changing my audio device's priority, and none made any difference in a positive way that I could notice in the distros I've used. Most were using kernels 5.0+ (currently I'm using Xubuntu with 5.3 lowlatency). I think some of those tweaks either might be outdated, or not necessary for how I use the system (Reaper using ALSA, not using Jack and other apps through it). I have however added the realtime permissions for the audio group and added myself to it, as part of these tests with KDE Plasma. (I tried it both ways.)

I'm no expert so take this as "my two cents worth" of anecdotal experience: I'm sticking with Xfce or Lxqt for my distro. Maybe in the future I'll try KDE Plasma again, but especially that comparison in Lubuntu (changing over to KDE Plasma from Lxqt only took a minute, and then I ran the same tests in the DAW)...it was quite telling. I'd recommend trying a distro with Xfce and a low latency kernel to see if it helps. Your mileage may vary and all that. :)

Another recommendation: try a different blocksize/number-of-blocks configuration. Perhaps instead of 3 blocks of 256 samples, your system might work better with 6 blocks of 128 samples (or some other number of each). I got somewhat better performance (at 44.1 kHz) when I switched from 128 samples/block to 64 samples/block, then doubled the number of blocks (same total latency). From there I could then even shave off a block or two, to my surprise. If you look at the numbers used for Windows ASIO drivers in their control panels, most times they don't mention the number of blocks. You'll see "128" as a setting for instance, but it's clearly a multiple of that which they're not mentioning (# of blocks isn't shown). If you do the math for each setting when looking at the latency they cause, you'll see that audio card manufacturers will choose some specific combinations of samples/block and number of blocks. I'm guessing sometimes it just works better one way than the other.
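
The "same total latency" arithmetic can be checked quickly. Here is a minimal Python sketch of the buffer maths only; it ignores converter and driver overhead, so the figures a real interface reports will be somewhat higher. The function names are just for illustration.

```python
def period_ms(frames, rate_hz):
    """Duration of one block (period) in milliseconds."""
    return 1000.0 * frames / rate_hz


def buffer_ms(frames, periods, rate_hz):
    """Total buffer duration: number of blocks times the block duration."""
    return periods * period_ms(frames, rate_hz)


if __name__ == "__main__":
    # The original poster's setting: 256 frames, 3 periods, 96 kHz.
    print(round(period_ms(256, 96000), 2), "ms per block")
    print(round(buffer_ms(256, 3, 96000), 2), "ms total for 3 x 256")
    # The alternative suggested above: 6 blocks of 128 frames.
    print(round(buffer_ms(128, 6, 96000), 2), "ms total for 6 x 128")
    # Halving the block size and doubling the count at 44.1 kHz.
    print(round(buffer_ms(128, 2, 44100), 2), "ms total for 2 x 128")
    print(round(buffer_ms(64, 4, 44100), 2), "ms total for 4 x 64")
```

Both 96 kHz configurations hold the same amount of audio in the buffer; what changes is how often the card interrupts and how much work each wake-up has to finish.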
tramp
Established Member
Posts: 2335
Joined: Mon Jul 01, 2013 8:13 am
Has thanked: 9 times
Been thanked: 454 times

Re: Tuning out Xruns, cgroups usage advice

Post by tramp »

I've i5 as well, and what helps me to avoid Xruns is the following addition to my usual boot parameters:

Code: Select all

intel_pstate=disable processor.max_cstate=1 intel_idle.max_cstate=0 idle=poll
Note that this may lead to heating problems on laptops, while it is safe on desktops.
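
After editing the boot parameters it is worth confirming they actually reached the kernel. A small Python sketch of such a check follows, assuming the command line is read from /proc/cmdline on the real system; the SAMPLE string and the helper names here are hypothetical, and quoted values containing spaces are not handled.

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into {name: value}; bare flags map to True."""
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else True
    return params


def missing_params(params, wanted):
    """Return the wanted settings that are absent or set to a different value."""
    return {key: val for key, val in wanted.items() if params.get(key) != val}


# The four parameters suggested above.
WANTED = {
    "intel_pstate": "disable",
    "processor.max_cstate": "1",
    "intel_idle.max_cstate": "0",
    "idle": "poll",
}

# Hypothetical command line with one of the four parameters left out.
SAMPLE = "threadirqs intel_pstate=disable processor.max_cstate=1 idle=poll"

if __name__ == "__main__":
    # On a live system: open("/proc/cmdline").read() instead of SAMPLE.
    print(missing_params(parse_cmdline(SAMPLE), WANTED))
```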
On the road again.
merlyn
Established Member
Posts: 1392
Joined: Thu Oct 11, 2018 4:13 pm
Has thanked: 168 times
Been thanked: 247 times

Re: Tuning out Xruns, cgroups usage advice

Post by merlyn »

If I understand correctly hardware that worked well in 2014 with KX Studio now doesn't perform as well as it used to using an up-to-date distro.

The question is then : what is different?

KX Studio had some KDE tweaks. baloo, the KDE file indexer was disabled and WIFI was disabled.

The other thing that has changed is the kernel -- the kernel now has mitigations.

What is the DSP load when you get Xruns? If you max out your CPU then Xruns are unavoidable. This type of Xrun may go away if you turn off mitigations -- it'll give you a bit more CPU to play with on an Intel chip. If you're getting Xruns at a low DSP load then that's more like interrupts or a process like baloo.

I notice xhci has a high priority. You could try this in rtirq :

Code: Select all

RTIRQ_NAME_LIST="snd "
RTIRQ_RESET_ALL=1
FawkesThat
Established Member
Posts: 10
Joined: Mon Dec 10, 2018 8:24 am

Re: Tuning out Xruns, cgroups usage advice

Post by FawkesThat »

JamesPeters wrote: [...] the compositor or visual effects had little to no performance impact on my systems (except that using a compositor always helped avoid screen tear, which I consider very beneficial). The distros that stood out as performing somewhat worse had KDE Plasma installed.
Yup. I expect a performance hit with Plasma. Despite trying with compositing off it doesn't help much - if at all.
JamesPeters wrote: I only do a few tweaks to the system:

-change to a low latency kernel,
-change the vm swappiness,
-cpu frequency governor as "performance" (that I set with indicator-cpufreq when I run Reaper),
-disable xiccd in startup (color management for monitor in Xubuntu, not needed for me, took extra cpu)
-disable bluetooth
-turn off power management and screen saver (let them start in session, but make sure they're set to do nothing; this seems better than disabling them, since it seems there are defaults baked into the kernel which then get used.)
xiccd looks like another point I've not yet picked up on. I do pretty much the same for power management. For the screensaver I simply set the idle time before it kicks in to an hour. If I want to lock the screen then I can, and it only displays a static unlock screen.
JamesPeters wrote: I've tried the other tweaks mentioned on various Linux boards (and in your post, except the cgroups thing which I'd never heard of) plus also changing my audio device's priority, and none made any difference in a positive way that I could notice in the distros I've used. Most were using kernels 5.0+ (currently I'm using Xubuntu with 5.3 lowlatency). I think some of those tweaks either might be outdated, or not necessary for how I use the system (Reaper using ALSA, not using Jack and other apps through it). I have however added the realtime permissions for the audio group and added myself to it, as part of these tests with KDE Plasma. (I tried it both ways.)

I'm no expert so take this as "my two cents worth" of anecdotal experience: I'm sticking with Xfce or Lxqt for my distro. Maybe in the future I'll try KDE Plasma again, but especially that comparison in Lubuntu (changing over to KDE Plasma from Lxqt only took a minute, and then I ran the same tests in the DAW)...it was quite telling. I'd recommend trying a distro with Xfce and a low latency kernel to see if it helps. Your mileage may vary and all that. :)

Another recommendation: try a different blocksize/number-of-blocks configuration. Perhaps instead of 3 blocks of 256 samples, your system might work better with 6 blocks of 128 samples (or some other number of each). I got somewhat better performance (at 44.1 kHz) when I switched from 128 samples/block to 64 samples/block, then doubled the number of blocks (same total latency). From there I could then even shave off a block or two, to my surprise. If you look at the numbers used for Windows ASIO drivers in their control panels, most times they don't mention the number of blocks. You'll see "128" as a setting for instance, but it's clearly a multiple of that which they're not mentioning (# of blocks isn't shown). If you do the math for each setting when looking at the latency they cause, you'll see that audio card manufacturers will choose some specific combinations of samples/block and number of blocks. I'm guessing sometimes it just works better one way than the other.
I haven't installed Windows on any of my computers any time this decade. I do have a spare disk I could throw it on to check out the ASIO driver settings, I've toyed with the notion of doing so for a look at the full capabilities of these Roland cards.
I've two reasons for using a period of 3 blocks; firstly, I've the second card wordclock synchronised with the first and use zita-a2j with period size of 2. Secondly, buried in the lspci output there's a delay between the PCI bus and the CPU.
FawkesThat
Established Member
Posts: 10
Joined: Mon Dec 10, 2018 8:24 am

Re: Tuning out Xruns, cgroups usage advice

Post by FawkesThat »

tramp wrote:I've i5 as well, and what helps me to avoid Xruns is the following addition to my usual boot parameters:

Code: Select all

intel_pstate=disable processor.max_cstate=1 intel_idle.max_cstate=0 idle=poll
Note that this may lead to heating problems on laptops, while it is safe on desktops.
Will give that a go. Although, I may have done some tweaks via the BIOS settings which are functionally equivalent.
FawkesThat
Established Member
Posts: 10
Joined: Mon Dec 10, 2018 8:24 am

Re: Tuning out Xruns, cgroups usage advice

Post by FawkesThat »

merlyn wrote:If I understand correctly hardware that worked well in 2014 with KX Studio now doesn't perform as well as it used to using an up-to-date distro.
The question is then : what is different?
Put far more politely than I've been doing of late. ;)
merlyn wrote: KX Studio had some KDE tweaks. baloo, the KDE file indexer was disabled and WIFI was disabled.
I've gone through user settings to disable baloo and the file indexer. No wifi hardware in the machine, largely because I know how much of a pain it proves to be. Probably a few other things lurking, such as that 'shiny but useless' Discover software updater.
merlyn wrote: What is the DSP load when you get Xruns? If you max out your CPU then Xruns are unavoidable. This type of Xrun may go away if you turn off mitigations -- it'll give you a bit more CPU to play with on an Intel chip. If you're getting Xruns at a low DSP load then that's more like interrupts or a process like baloo.
I've been getting Xruns with the system idling - a DSP load of sub 0.50%. Or, using a media player and getting a circa 1.40% DSP load. "mitigations=off" is one of those tips I've been looking for. Thank you! It's not as if I'm going to have the box online, other than for doing updates.
merlyn wrote: I notice xhci has a high priority. You could try this in rtirq :

Code: Select all

RTIRQ_NAME_LIST="snd "
RTIRQ_RESET_ALL=1
That's something I looked at, then decided to leave well alone. Can't do any harm to mess about with those. Well, it may make things worse - the cards are PCI and the two-slot bus is hanging off the Intel motherboard chipset.

I'll post an update on what helps from all the suggestions so far. I'm going to ditch the cgroups tweaks as too much stuff ends up running on Core 0. I also just threw AVLinux into a virtual machine to find out where its RT kernel comes from. Then discovered it is out of date - 4.16-rt versus 4.19-rt which is the last release of the 4.xx RT kernels.
FawkesThat
Established Member
Posts: 10
Joined: Mon Dec 10, 2018 8:24 am

Re: Tuning out Xruns, cgroups usage advice

Post by FawkesThat »

Tried all of the above suggestions with no luck. However, I do notice a slightly lower CPU load when using the mitigations suggestion. It can stay. I only ever network this machine via a temporary connection to install updates. I've also dropped the cgroups stuff. I may come back to it as a way to spread the load based on which IRQs are in use.

So it seems I can say "no" to using 96kHz. For the time being, anyway.
I'm beginning to wonder if the issue relates to rehousing the motherboard in a rackmount case and racking it up with the soundcards and a couple of preamp units. I may have a ground loop somewhere, or have interference being picked up on the cabling to the soundcards' external boxes.

However, at sample rate 88.2kHz, buffer size 512, and periods 2, this is running just fine. Including bringing in the second card using:

Code: Select all

## -p and -n values chosen to provide lowest latency as displayed when using the -v option
$ chrt -f 85 zita-a2j -jIN-Card1 -dhw:DA2496_1 -r88200 -p64 -n5 -c12 -S
I've the whole lot plumbed into Ardour - and a pair of jkmeters - with a DSP load just under 7%.
This is what I've been looking for. Albeit not at the sample rate it should be. I've certainly capacity to add F/X, samples, or computer-based synth stuff. I can likely drop the buffer size to 256 samples if I put in a fully preemptible kernel; then I'd be able to do all monitor mixing, in realtime, on the PC itself.
I'm not going to go all Neil Young about the sample rate. The 44.1kHz of CD was selected to be roughly double the top end of average human hearing. DAT, at 48kHz, is double the top end of a teenager's hearing range. What I'm getting won't be as good as 96kHz, but you'd need to have perfect hearing to tell the difference.
I've picked up a few points along the way here. Including seeing why these cards still fetch £100+ on eBay. They're advertised as 8 in, 8 out. However, I've got the clicktrack wired to inputs 9 and 10 of card 1. It's included in the digital mix being sent out via the lightpipe; shows as SPDIF outputs on the first - plus inputs 8 and 9 on the second card. That seems more like 8 in, 10 out. I'm curious to look at these with designed-for Windows software now. There may be more I'm missing.
crocket
Established Member
Posts: 68
Joined: Fri Mar 29, 2019 11:56 am

Re: Tuning out Xruns, cgroups usage advice

Post by crocket »
