DOS – A changelog by Daniel Mettler

Recently, the SystemUIServer process on my MBP running macOS Sierra has started “eating” a lot of CPU, slowing down the whole machine, even making the clock in the top menu bar stop working properly. It usually started using one-digit percentages of the available CPU power, then growing to 10%, 15%, 20%, up to well above 60%, sometimes even 80% and more! It wasn’t a steady growth – it sometimes shrank again, just to grow even further.

The only apparent remedy was to kill the SystemUIServer process (e.g. using the Activity Monitor) from time to time (i.e. every 30 minutes -> there are also scripts to automatically restart SystemUIServer). Its CPU usage then reset to a low one-digit number.

Taking a closer look at the process in the Activity Monitor, I then noticed that the number of (used) ports (so-called “Mach ports“) by the process were steadily growing, once SystemUIServer was started. This was weird, pointing to some kind of leakage. Typically, for a CPU load of around 50%, more than 5000 Mach ports were used.

By coincidence, I then noticed that, unlike expected, my MBP wasn’t actually using Ethernet, but only WiFi. Further investigation then hinted that the according Gigabit Ethernet port on my HP 1810 switch was apparently malfunctioning (or even dead): In the macOS Network Preferences, the Thunderbolt-Ethernet connection was constantly shown as red/disconnected, although the OS was apparently trying to establish a connection again and again (and failed). First, I even suspected a problem with the Thunderbolt-Gigabit-Ethernet adapter itself (it wasn’t the problem here, the adapter seemed to work fine with another Mac and connection).

The solution to this problem thus was:

Connect the Ethernet cable to another, working Ethernet port: Now the SystemUIServer process consumes less than 0.1% CPU again and roughly 400 ports only, both with and without additional WiFi.
Note that both the problem and this solution are reproducible.

Lessons learned:

Sometimes, very unexpected, seemingly unrelated and “small” problems can have big (negative) effects.
Sometimes you need a bit of luck to find the cause of a problem (a web search didn’t bring up the above hint, rather suggested updating or removing faulty apps, buggy extensions and menu widgets. I thus already tried removing or updating some of the suspected apps, extensions and widgets.)
Ports of HP 1810 switches can actually break/fail! Remember the saying: “I got 99 problems, but a switch ain’t one!” – well, in this case, the faulty switch was actually part of the problem and even the initial trigger of the problem! Also remember that HP offers a lifelong warranty on its (good ol’) 1810 switches.
Extra points for you, further research: The fact that the SystemUIServer allocates more and more Mach ports if there’s a malfunctioning Ethernet port (i.e. faulty Ethernet connection or faulty handling of a faulty Ethernet connection by the Thunderbolt-Gigabit-Ethernet adapter) is hinting that this might be an attack vector for a (new?) DoS attack. Perhaps not an easily exploitable one (on the Ethernet or MAC layer, even), but it’s nonetheless something that should actually be handled gracefully by SystemUIServer, not leading to more and more CPU and system resources being consumed.
If you have time to research this further, let me know about your findings!

Generally, Zimbra‘s update scripts for ZCS run smoothly. However, after updating ZCS from 7.0.1 to 7.1.1 on Ubuntu 10.04.2 LTS a couple of days ago, I noticed that most of the server’s services crashed within 1-2 hours after starting the (virtual) server.

To make a long debugging story short, here’s a summary of the problem:

The update script doesn’t properly remove old entries in /etc/rsyslog.conf when creating a new dedicated rsyslog configuration file for zimbra (/etc/rsyslog.d/60-zimbra.conf). This makes rsyslog log its own logging due to double rule entries – making /var/log/zimbra-stats.log grow at 2 MB/s. Like this, zimbra effectively DOSes itself as the server runs out of free disk space in no time (my above-average 12 GB of free space were filled within about 1 hour 40 minutes).

And here’s the solution:

1. Stop rsyslog to stop it from filling up your disk with nonsense: ‘/etc/init.d/rsyslog stop’ or ‘service rsyslog stop’
2. If necessary (i.e. if the updated server has been running for several minutes already), reclaim your free disk space by deleting/emptying big log files in /var/log, e.g. zimbra-stats.log, zimbra.log, mail.info, mail.log, mail.err, mail.warn etc.
3. Edit /etc/rsyslog.conf and remove the entries (likely at the end of the file) that look similar to these:
local0.* @mail.yourserver.com local1.* @mail.yourserver.com auth.* @mail.yourserver.com local0.* -/var/log/zimbra.log local1.* -/var/log/zimbra-stats.log auth.* -/var/log/zimbra.log mail.* @mail.yourserver.com mail.* -/var/log/zimbra.log
(these are now in /etc/rsyslog.d/60-zimbra.conf)
4. Restart rsyslog: ‘service rsyslog restart’
5. Restart ZCS if it doesn’t run properly anymore: /etc/init.d/zimbra restart (you may even have to reboot the whole box if that doesn’t work)

Tag: DOS

macOS: SystemUIServer eating your CPU? Check your Ethernet connection!

Zimbra Collaboration Server: Troubles after updating to ZCS 7.1.1 – here’s the solution.