Troubleshooting Guide

When encountering a problem, always make sure:

  • The extension is of the latest version and the Kolab packages are up-to-date.
  • The problem is reproducible on the same system: If it’s not reproducible at all, there is nothing that can be done. We need clear steps to reproduce the issue to be able to investigate.
  • The problem can be reproduced on a second system: Unless a problem is reproducible on a second system it is not a software defect, but a configuration issue.

Once a problem has been reproduced on a second system with an up-to-date extension and packages, it can be fixed.

Determining the faulty component

To isolate which component has a defect you can work your way from the client (where you see the symptoms) inwards.

For an issue with an IMAP client, the connection passes through guam first and then dovecot. You could, for instance, bypass guam by connecting to dovecot directly, to confirm that a specific problem only exists with guam in the middle.

For an activesync/iRony issue there is NGINX -> Apache -> Syncroton/iRony. First make sure that activesync is reachable on the outside before investigating if there is a problem in syncroton.
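The "reachable from the outside" check can be sketched in shell. This is only a sketch: the helper name has_activesync_headers is hypothetical, host and credentials are placeholders, and it assumes the server answers an OPTIONS request with an MS-ASProtocolVersions header, as ActiveSync servers normally do.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads an HTTP response (headers included, as printed
# by `curl -i`) on stdin and succeeds if it advertises ActiveSync support.
has_activesync_headers() {
    # Header names are case-insensitive, hence -i.
    grep -qi '^MS-ASProtocolVersions:'
}

# Example (placeholders for host and credentials):
#   curl -s -i -u 'user@our.domain.tld:pass' -X OPTIONS \
#       https://our.domain.tld/Microsoft-Server-ActiveSync \
#     | has_activesync_headers && echo "ActiveSync reachable"
```

If the header is missing, look at NGINX and Apache first before investigating Syncroton itself.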

Common scenarios

How to recognize responsible component(s) via symptoms.

  • If Outlook cannot connect to the email account or sync messages, or the number of messages in the Outlook inbox differs from that in Roundcube, then the most likely culprit is Syncroton (if connected via ActiveSync). Check this chain of services: NGINX -> Apache -> PHP -> Syncroton -> IMAP.
  • Webmail is not reachable: NGINX -> Apache -> PHP -> IMAP
  • Premium email is enabled but you end up on regular roundcube: Check for /etc/roundcubemail/$domain/freemium.inc.php (should be absent).

Capturing debug output

If a problem can be reproduced, debug output can help figure out exactly what is going wrong.

Capture debug output of all components that are involved with a problem.

Debug output should always be captured for the duration of one reproduction of the issue only:

  • Prepare everything to reproduce the problem
  • Clear existing logfiles
  • Record the timestamp when you reproduce the problem
  • Reproduce the problem
  • Immediately capture the logfiles (so we don’t end up with unnecessary extra information).
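The steps above can be sketched as two small shell helpers. The names clear_logs and timestamp are hypothetical, and the sketch assumes GNU coreutils (truncate) is available. Logfiles are emptied rather than deleted so that running services keep writing to the same files.

```shell
#!/usr/bin/env bash
# Hypothetical helper: empty every regular file below the given log
# directories without deleting it. Directories that do not exist are skipped.
clear_logs() {
    local dir
    for dir in "$@"; do
        [ -d "$dir" ] || continue
        find "$dir" -type f -exec truncate -s 0 {} +
    done
}

# Hypothetical helper: print the current time including seconds.
timestamp() {
    date '+%d.%m.%Y %H:%M:%S'
}

# Example run (adjust the directories to the components involved):
#   clear_logs /var/log/kolab-syncroton /var/log/roundcubemail
#   echo "Reproduction started: $(timestamp)"
#   ... reproduce the problem, then capture the logfiles right away ...
```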

Instructions on how to enable debug output can be found here: https://kb.kolabenterprise.com/documentation/capturing-debug-logs

The following log-files/directories are relevant:

  • /var/log/maillog (only relevant when troubleshooting IMAP/SMTP)
  • /var/log/httpd or /var/log/apache2 (depending on the distribution)
  • /var/log/nginx
  • /var/log/roundcubemail (only relevant when troubleshooting webmail)
  • /var/log/kolab-syncroton (only relevant when troubleshooting activesync)
  • /var/log/guam (only relevant when troubleshooting imap)
  • /var/log/kolab-autoconf (only relevant when troubleshooting autoconf)
  • /var/log/iRony (only relevant when troubleshooting DAV)
  • /var/log/plesk/panel.log (only relevant when troubleshooting the plesk extension)
  • /var/log/plesk-php$VERSION-fpm

To generate a tarball use a command like this (include the relevant directories according to what you’re troubleshooting):

tar -czf logfiles.tar.gz /var/log/plesk-php82-fpm/ /var/log/kolab-syncroton/

This ensures all files are bundled together in a single tarball and retain their paths, which is important to e.g. distinguish multiple “error.log” files.
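Not every directory exists on every system, and tar reports an error for missing paths. A minimal sketch of a wrapper that only bundles the directories that are present could look as follows. The function name bundle_logs is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: bundle only those log directories that actually
# exist on this system. The first argument is the tarball to create, the
# remaining arguments are candidate directories.
bundle_logs() {
    local out=$1 dir
    shift
    local existing=()
    for dir in "$@"; do
        [ -d "$dir" ] && existing+=("$dir")
    done
    if [ "${#existing[@]}" -eq 0 ]; then
        echo "no log directories found" >&2
        return 1
    fi
    # tar keeps the directory path of every file (minus the leading "/").
    tar -czf "$out" "${existing[@]}"
}

# Example:
#   bundle_logs logfiles.tar.gz /var/log/httpd /var/log/apache2 /var/log/kolab-syncroton
```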

Use the following template to report the logfiles (every time with new log files):

These logfiles were collected while reproducing issue XXX.
The following steps were executed to reproduce the issue:
* Connected outlook for user doe@example.com (deviceid XXXX)
* Triggered synchronization for folder INBOX on 12.04.2024 11:33:11 (including seconds, a lot can happen in a minute)
* Waited for synchronization to complete

Expected outcome, what happened:
* Expected read flag to be changed
* The read flag was not changed.

It is crucial that logfiles are always collected in the same manner, that the timeframe they cover is clear, and that it is clear what they are supposed to document: the actions executed during the timeframe (including timestamps) and the expectations that were not met (i.e. the issue).

Common issues & Troubleshooting techniques

  • When starting the investigation, always check that the extension is of the latest version and the Kolab packages are up-to-date.
  • If a component that has an issue runs in Apache, look out for php / fcgid errors. Memory exhaustion or request timeouts can leave very few traces otherwise.
  • Check if activesync is in principle available: curl -i -u 'user@our.domain.tld:pass' -X OPTIONS https://our.domain.tld/Microsoft-Server-ActiveSync
  • Check if SNI is working: echo 'Q' | openssl s_client -connect localhost:993 -servername domain.tld -showcerts 2>&1 | grep -Eo 'CN=[^/]+' | uniq
  • To deal with roundcube cache issues, such as disappearing calendar events or not updating address books: /usr/share/roundcubemail/plugins/libkolab/bin/modcache.sh clear -u user@domain.tld -h imap.host.name
  • If you experience problems with the extension UI in Plesk, always check Plesk’s panel.log. Enable debug mode if necessary: https://support.plesk.com/hc/en-us/articles/213408889-How-to-enable-disable-Plesk-debug-mode
  • You can check for missing php-modules using: /opt/plesk/php/7.3/bin/php -m. The following modules should be available: kolabcalendaring, kolabformat, kolabicalendar, kolabobject, kolabshared.
  • If no logfiles are generated (e.g. by roundcube), ensure the log directories have been created with sufficient write permissions and that SELinux (see /var/log/audit/audit.log) is not interfering. If in doubt, temporarily disable SELinux and give world-writable permissions to the log directory.
  • When a user has a “.” in the name, then sharing breaks. This is because we use “.” as namespace separator in dovecot (as required by the spam filter).
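The php module check from the list above can be automated with a small filter. This is a sketch only: the function name missing_kolab_modules is hypothetical, and the PHP path in the example depends on the version Plesk uses.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads the output of `php -m` on stdin and prints
# every required Kolab module that is not listed. Succeeds only if none
# is missing. The module names are the ones listed above.
missing_kolab_modules() {
    local loaded module status=0
    loaded=$(cat)
    for module in kolabcalendaring kolabformat kolabicalendar kolabobject kolabshared; do
        # -x: the module name must match a whole line, -i: ignore case.
        if ! grep -qix "$module" <<<"$loaded"; then
            echo "$module"
            status=1
        fi
    done
    return "$status"
}

# Example:
#   /opt/plesk/php/7.3/bin/php -m | missing_kolab_modules
```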

Dovecot

  • Folders that have *only* the list right, but no read right break GETMETADATA commands that include the folder, resulting in an unexpected NO result to the command (which in turn breaks the roundcube folder list). Ensure that folders either have no list right, or always combine list with read rights. See also https://bifrost.kolabsystems.com/T802351

doveadm can be used to inspect the imap store:
list access rights:

doveadm acl get -u user@example.org Calendar

debug access rights to a folder:

doveadm acl debug -u user@example.org Calendar

list folders:

doveadm mailbox list -u user@example.org

get a folder-type annotation:

doveadm mailbox metadata get -u $user $mailbox /shared/vendor/kolab/folder-type

set a folder-type annotation:

doveadm mailbox metadata set -u $user $mailbox /shared/vendor/kolab/folder-type contact
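To look for the list-without-read problem described at the top of this section, the output of doveadm acl get can be filtered. The following is a sketch under assumptions: the function name lookup_without_read is hypothetical, it assumes one ACL entry per line with the rights as space-separated words after the identifier, and it relies on dovecot naming the list right "lookup".

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads `doveadm acl get` output on stdin and prints
# the identifier of every entry that grants "lookup" (the list right)
# without "read".
lookup_without_read() {
    awk '
        NR == 1 && $1 == "ID" { next }          # skip the header line
        {
            has_lookup = 0; has_read = 0
            for (i = 2; i <= NF; i++) {
                if ($i == "lookup") has_lookup = 1
                if ($i == "read")   has_read = 1
            }
            if (has_lookup && !has_read) print $1
        }
    '
}

# Example:
#   doveadm acl get -u user@example.org Calendar | lookup_without_read
```

Any identifier printed by the filter is a candidate for the GETMETADATA failure and should either lose the list right or gain the read right.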

Guam

  • A “crash” in guam is not necessarily problematic. If e.g. an ssl handshake fails you will get a crash report, because the language this module is written in defaults to “crashing” as part of the normal error handling.
  • Guam’s debug logging logs the complete imap traffic which will overwhelm a production system in use. Do not enable unless you need the output, and disable afterwards.
  • If the guam service does not start, you can attempt to start the process directly using /usr/sbin/guam foreground, which may give you additional debug output.

SNI

If SNI is enabled guam needs special configuration for SNI to work (because guam needs to terminate ssl). The guam configuration is updated on installation, and a troubleshooting item with a “Fix” button will appear if the configuration is out of sync.

You should have a file with a name like /etc/dovecot/conf.d/14-plesk-sni-$domain.conf (from the SNI config for the domain). In that file we have a line starting with “ssl_cert =” with a path pointing to the certificate.

This path should then be in the guam config at /etc/guam/sys.config, in a line like

{ sni_hosts, [{ "$domain", [{ certfile, "$pathToCertificate" }]}]},
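Whether the two files agree can be cross-checked with a short script. This is a minimal sketch: the function name sni_cert_in_guam_config is hypothetical, and it assumes the certificate path appears in double quotes in the guam config, as in the line above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: take the certificate path from a dovecot SNI config
# file and check that the guam config mentions the same path.
# Arguments: dovecot SNI config file, guam config file.
sni_cert_in_guam_config() {
    local dovecot_conf=$1 guam_conf=$2 cert
    # Take the first "ssl_cert =" line and strip the key, the optional "<"
    # that dovecot uses to read a value from a file, and trailing blanks.
    cert=$(sed -nE 's/^[[:space:]]*ssl_cert[[:space:]]*=[[:space:]]*<?[[:space:]]*//p' "$dovecot_conf" \
        | head -n 1 | sed 's/[[:space:]]*$//')
    if [ -z "$cert" ]; then
        echo "no ssl_cert line in $dovecot_conf" >&2
        return 2
    fi
    # -F: match the quoted path literally, not as a regular expression.
    grep -qF "\"$cert\"" "$guam_conf"
}

# Example ($domain is a placeholder):
#   sni_cert_in_guam_config /etc/dovecot/conf.d/14-plesk-sni-$domain.conf /etc/guam/sys.config \
#     || echo "guam config is out of sync"
```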

To verify if SNI is working use

echo 'Q' | openssl s_client -connect localhost:993 -servername domain.tld -showcerts 2>&1 | grep -Eo 'CN=[^/]+' | uniq

and compare to the output you get when connecting to port 9993 (dovecot).

Webserver

For issues with web server configuration (e.g., when some service like Mattermost is not available, but also Roundcube/ActiveSync/DAV issues), follow these steps:

  • Run plesk repair web for the affected domain
  • Check the custom configuration template in /usr/local/psa/admin/conf/templates/custom/webmail/roundcube.php. Make sure it exists, is readable, and contains PHP code.
  • Compare the web server configuration with the configuration of a similar test server on which everything works.

Process limit

Plesk has a low process limit configured by default (FcgidMaxProcesses). Because ActiveSync connections (e.g. Outlook) have persistent connections that each take up a process slot, it is possible that multiple clients exceed the process limit, resulting in the webserver failing to process any further requests. This results in mod_fcgid: can't apply process slot for /var/www/cgi-bin/cgi_wrapper/cgi_wrapper errors in /var/log/httpd/error_log.

To fix this, increase the following values in /etc/httpd/conf.d/fcgid.conf:

FcgidMaxProcessesPerClass 50
FcgidMaxProcesses 150
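To judge whether the limit is actually being hit, before and after the change, the error message can be counted in the error log. A minimal sketch follows; the function name count_slot_errors is hypothetical, and the default path is the one mentioned above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count "can't apply process slot" messages in an
# Apache error log. Prints 0 if the file is missing or has no match.
count_slot_errors() {
    local log=${1:-/var/log/httpd/error_log} count
    # grep -c prints the number of matching lines. It exits non-zero when
    # there is no match or the file cannot be read, hence "|| true".
    count=$(grep -c -- "mod_fcgid: can't apply process slot" "$log" 2>/dev/null) || true
    echo "${count:-0}"
}

# Example:
#   count_slot_errors /var/log/httpd/error_log
```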

Debug output

Debug output can be enabled in /etc/httpd/conf/httpd.conf by setting:

LogLevel debug

Debug output will appear in /var/log/httpd/error_log.

Sieve

Sieve functionality is provided by pigeonhole.

Verify the sieve script is enabled:

$ doveadm sieve list -u admin@kolab-customer.maipo.wht.pxts.ch
roundcube ACTIVE

Activate sieve debug logging in /etc/dovecot/conf.d/90-plesk-sieve.conf:

plugin {
...
  # You have to create the log directory first
  sieve_trace_dir = /var/log/sieve
  sieve_trace_level = tests
}

troubleshoot.php

The kolab extension comes with a troubleshooting script to detect various common issues. It can be executed like so to generate a report:

plesk bin extension -e kolab troubleshoot.php > report.txt

analyzelogs.sh

The following script can be used to grep through logfiles to surface various common issues. Please note that the paths are currently geared toward centos7, and the script is WIP. Suggestions are welcome.

#!/usr/bin/env bash
# This is a script to grep through logs to surface common problems quickly.
# A matching message "might" point to an error, but will have to be judged on a case-by-case basis.

echo "==> /var/log/kolab-syncroton/errors.log"
grep -E "(PHP Fatal error|PHP Error|SMTP Error)" /var/log/kolab-syncroton/errors.log
echo

echo "==> /var/log/maillog"
grep -E "(milter-reject)" /var/log/maillog
echo

echo "==> /var/log/roundcubemail/errors.log"
grep -E "(PHP Fatal error|PHP Error|SMTP Error|IMAP Error)" /var/log/roundcubemail/errors.log
echo

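echo "==> /var/log/apache2/error.log /var/log/httpd/error_log /var/log/httpd/ssl_error_log"
# -s suppresses the error messages for paths that do not exist on this distribution.
grep -s -E "(PHP Warning|mod_fcgid: read data timeout|End of script output|mod_fcgid: can't apply process slot)" /var/log/apache2/error.log /var/log/httpd/error_log /var/log/httpd/ssl_error_log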
echo

echo "==> /var/log/guam/"
grep -r -E "(Fatal error)" /var/log/guam/
echo