Sentinel Syslog Forwarder with AMA

Stay tuned for Part 2 - Transforming syslog data during ingestion

Introduction

The new kid on the block, AMA (the Azure Monitor Agent), is ready to dance and shake things up for syslog collection. Compared to the Log Analytics agent we get a whole slew of new features, easier management, and at-scale log collection.

The Azure Monitor agent introduces several new capabilities, such as ingestion-time transformations, filtering, scoping, and multi-homing. But it isn’t yet at parity with the existing agents for other functionality. Review the current limitations and supported solutions to ensure the agent will meet your requirements.

Microsoft’s various agents. The new AMA replacing the Log Analytics, Dependency Agent, and Telegraf Agent

How AMA and Syslog work together

The Azure Monitor Agent leverages rsyslog’s Unix socket output module, omuxsock, to forward messages to AMA. This is configured during installation; you can find the configuration in the /etc/rsyslog.d/10-azuremonitoragent.conf file. You can verify which sockets AMA is listening on by running

ss -a | grep azuremonitor

u_str LISTEN 0 10 /run/azuremonitoragent/default_bond.socket 262160 * 0
u_str LISTEN 0 10 /run/azuremonitoragent/default_djson.socket 262161 * 0
u_str LISTEN 0 10 /run/azuremonitoragent/default_json.socket 262162 * 0
u_str LISTEN 0 10 /run/azuremonitoragent/default_fluent.socket 260877 * 0
u_str LISTEN 0 10 /run/azuremonitoragent/default_influx.socket 260878 * 0
u_dgr UNCONN 0 0 /run/azuremonitoragent/default_syslog.socket 260879 * 0

AMA will cache events from syslog in the /var/opt/microsoft/azuremonitoragent directory during processing. This cache can potentially fill up if AMA is not processing data or is not able to keep up with the amount of data coming in. The default cache size is 50000 MB. This is configured in /etc/opt/microsoft/azuremonitoragent/mdsd.xml under the Management element

<Management eventVolume="Large" defaultRetentionInDays="90" >
  <Identity>
    <IdentityComponent name="Tenant">myTestTenant</IdentityComponent>
    <IdentityComponent name="Role" envariable="MONITORING_ROLE" />
    <IdentityComponent name="Host" useComputerName="true" />
  </Identity>
  <AgentResourceUsage diskQuotaInMB="50000" />
</Management>

It’s important to factor the cache size into the disk space requirements for your syslog server.
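You can keep an eye on the cache from the forwarder itself. A minimal sketch (the du-based check is my own; the path and quota come from this article):

```shell
# Hypothetical cache check; the directory and quota are the defaults described above
cache_dir=/var/opt/microsoft/azuremonitoragent
quota_mb=50000   # default diskQuotaInMB from mdsd.xml

# du prints nothing if the directory does not exist, so default to 0
cache_mb=$(du -sm "$cache_dir" 2>/dev/null | awk '{print $1}')
cache_mb=${cache_mb:-0}
echo "AMA cache: ${cache_mb} MB of ${quota_mb} MB quota"
```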

As syslog messages come into the rsyslog daemon they are forwarded locally to the Azure Monitor Agent. AMA will then process the messages according to the assigned data collection rules and send them onto the log analytics workspace.

A look at rsyslog and AMA on Ubuntu

Lab Environment

In the below example I am using Ubuntu 20.04.4 LTS. The virtual machine is running in Azure, but pretending to be an on-premises server connected via Azure Arc.

Server Requirements

A single log forwarder machine using the rsyslog daemon has a supported capacity of up to 5000 events per second (EPS) collected.

  • 4 CPU cores and 8 GB RAM
  • rsyslog: v8+ or Syslog-ng: 2.1 – 3.22.1 (Ubuntu comes with rsyslog)
  • python 2.7 or 3
You can get the current version of rsyslog using the command: rsyslogd -N1
You can get the current version of python using the command: python --version or python3 --version
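If you want to script the prerequisite check, you can parse the version out of that output. A sketch (the helper function name is mine):

```shell
# Hypothetical helper: check the v8+ requirement against `rsyslogd -v` output
meets_rsyslog_req() {
  # $1 is the first line, e.g. "rsyslogd 8.2001.0 (aka 2020.01) ..."
  major=$(printf '%s' "$1" | awk '{print $2}' | cut -d. -f1)
  major=${major:-0}   # treat missing output as version 0
  [ "$major" -ge 8 ]
}

if meets_rsyslog_req "$(rsyslogd -v 2>/dev/null | head -n1)"; then
  echo "rsyslog meets the v8+ requirement"
fi
```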

Disk Space Guidance

Disk space sizing can be a bit tricky to figure out. I would recommend using the Microsoft Sentinel EPS & Log Size Calculator to help determine your required volume size. Once you have an estimate of the amount of data you will be forwarding per hour to the syslog server, adjust your disk space requirements accordingly. You will also want to account for an additional 12 – 15 GB on the /var partition for agent operations.

As an example, if we expect to forward 5 GB per hour and want to ride out a couple of hours of sustained connectivity issues, we need a minimum of 10 GB just for caching. Adding the agent requirement of 12 – 15 GB brings us to roughly 25 GB.
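That arithmetic is easy to sketch in shell (variable names are mine; the cache figure doubles the hourly volume for headroom):

```shell
# Back-of-the-envelope disk sizing for the syslog forwarder
gb_per_hour=5     # expected forwarding volume per hour
buffer_hours=2    # hours of outage to buffer for
agent_gb=15       # upper end of the 12-15 GB agent overhead

cache_gb=$((gb_per_hour * buffer_hours))
total_gb=$((cache_gb + agent_gb))
echo "Provision at least ${total_gb} GB (${cache_gb} GB cache + ${agent_gb} GB agent)"
```

For the 5 GB per hour example this prints a 25 GB minimum, matching the estimate above.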

Connecting to Azure Arc

If your syslog forwarding server is running in Azure you can skip the Azure Arc steps.

If you are not already familiar with Azure Arc, here is a quick introduction. In simple terms, Azure Arc extends the Azure resource management plane to a number of supported resource types, including servers located on-premises or in non-Azure cloud environments. This allows unified management of all your resources regardless of where they are logically or physically located.

It also enables you to extend Azure security features such as Managed Identities, Defender for Cloud, and Azure Policy to your hybrid workloads.

If you are new to Arc check out the community driven Azure Arc Jumpstart for getting started.

A look at Arc Capabilities

Why is Arc Required?

The new AMA agent leverages Data Collection Rules (DCRs), which are configuration definitions of what to collect from endpoints where the AMA is installed. A single DCR can have a one-to-many relationship with connected machines, a connected machine can be associated with multiple DCRs, and a DCR can send to multiple Log Analytics workspaces.

In order for DCRs to properly work a resource needs to exist in Azure. For hybrid servers that’s where Azure Arc comes in.

Installing the Arc Agent

There are multiple methods to install the Arc Agent, including options for at scale deployment. Since we are only working with one server for this scenario, we are going to be following the more manual method of installation. Check out Arc Deployment Options for all supported deployment types.

Before You Begin

  1. Ensure you have the required resource providers registered on your subscription
  2. Create a new resource group in Azure dedicated to your Azure Arc connected machines.
  3. Exclude that resource group from any Azure Policies you may have deployed at the Subscription or Management Group level that may cause an impact.
  4. Ensure you have the correct permissions in Azure
    • To onboard machines, you must have the Azure Connected Machine Onboarding or Contributor role for the resource group in which the machines will be managed.
    • To read, modify, and delete a machine, you must have the Azure Connected Machine Resource Administrator role for the resource group.
    • To select a resource group from the drop-down list when using the Generate script method, you must have the Reader role for that resource group (or another role which includes Reader access).
  5. Ensure you meet the network requirements
Steps 2 and 3 are not required, but they are important to ensure you don't impact your existing servers and don't incur any additional charges. While it's free to connect machines to Azure Arc, you will be charged for additional Azure services that may be in scope of the Azure resource group where the Arc resource will reside. Check out the pricing documentation for full details.

Installation

Installation is quick and simple for a single server.

  1. From the Azure Portal navigate to Azure Arc > Servers
  2. Select Add
  3. Select Generate Script under Add a single server
  4. Click Next, Select the Subscription, Resource Group, Region, Operating System, and Connectivity Method
  5. Click Next, add any tags
  6. Copy the script
  7. Next from the Linux Server, I recommend elevating your session using su first to ensure the script executes successfully: su <username>
  8. Paste the copied script into the terminal. It should only take a few seconds to install.
  9. During the installation you will be prompted to navigate to the Azure Device Login page, copy and paste the URL into your browser, enter the device code given, and authorize the connected machine agent.
  10. You should see the server connected in the Azure Arc > Servers pane after a few moments

Syslog Forwarder Server

Configuring Log Rotation

logrotate will rotate syslog every 7 days by default. We will want to shorten this interval to ensure we don’t fill up the volume. Since we are sending these logs directly to Sentinel, I would recommend 1 day or less.

logrotate is not a daemon; it runs as a daily system cron job. You can check the current logrotate configuration using:

cat /etc/cron.daily/logrotate
 
cat /etc/logrotate.d/rsyslog

Edit the rsyslog logrotate file and update your desired log rotation schedule. In the below example I am updating mine to every hour and keeping a maximum of 12 archived log files. Note that Ubuntu runs logrotate from /etc/cron.daily by default, so for an hourly schedule to actually take effect you will also need to run logrotate hourly (for example by moving the job to /etc/cron.hourly).

sudo nano /etc/logrotate.d/rsyslog
/var/log/syslog
{
        rotate 12
        hourly
        missingok
        notifempty
        delaycompress
        compress
        postrotate
                /usr/lib/rsyslog/rsyslog-rotate
        endscript
}
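If a burst of traffic could fill the volume between rotations, logrotate can also cap by size. A sketch using maxsize (the 100M threshold is an arbitrary assumption, not from this article):

```
/var/log/syslog
{
        rotate 12
        hourly
        maxsize 100M
        missingok
        notifempty
        delaycompress
        compress
        postrotate
                /usr/lib/rsyslog/rsyslog-rotate
        endscript
}
```

maxsize triggers a rotation as soon as the log exceeds the threshold on any logrotate run, even before the hourly interval elapses.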

You can test what logrotate would do with a dry run (the -d flag enables debug mode and makes no changes)

sudo logrotate -d -f /etc/logrotate.conf

To actually force a rotation, drop the -d flag

sudo logrotate -f /etc/logrotate.conf

Configuring the syslog server

We need to configure a few things to ensure the syslog daemon is listening for remote syslog connections to act as a forwarding server.

Configure TCP / UDP listening

You can choose to use TCP, UDP, or both with a custom port. In the example below I am configuring TCP and UDP on the default port, 514.

1. Edit the /etc/rsyslog.conf file and uncomment the TCP and UDP reception lines under the MODULES section.

sudo nano /etc/rsyslog.conf
  • Uncomment the below under # provides UDP syslog reception
    • module(load="imudp")
    • input(type="imudp" port="514")

  • Uncomment the below under # provides TCP syslog reception
    • module(load="imtcp")
    • input(type="imtcp" port="514")

Your file should look like the below

#################
#### MODULES ####
#################

module(load="imuxsock") # provides support for local system logging
#module(load="immark")  # provides --MARK-- message capability

# provides UDP syslog reception
module(load="imudp")
input(type="imudp" port="514")

# provides TCP syslog reception
module(load="imtcp")
input(type="imtcp" port="514")

# provides kernel logging support and enable non-kernel klog messages
module(load="imklog" permitnonkernelfacility="on")

2. Restart the rsyslog service for the changes to take effect, check that the service is running, and confirm it is listening on port 514

sudo service rsyslog restart
sudo service rsyslog status
ss -4altunp | grep 514

#Output should look like below
udp    UNCONN  0       0              0.0.0.0:514          0.0.0.0:*
tcp    LISTEN  0       25             0.0.0.0:514          0.0.0.0:*

3. If the firewall service is running, allow rsyslog

sudo ufw allow 514/udp
sudo ufw allow 514/tcp

Configuring Allowed Senders (Optional)

If desired you can configure what endpoints can connect to the syslog server.

Edit the /etc/rsyslog.conf file and add the below section under GLOBAL DIRECTIVES, updating the $AllowedSender UDP and $AllowedSender TCP lines with the appropriate allowed senders.

Allowed sender lists can be defined for UDP and TCP senders separately. There can be as many allowed senders as needed. The syntax to specify them is:

$AllowedSender <type>, ip[/bits], ip[/bits]

sudo nano /etc/rsyslog.conf

###########################
#### GLOBAL DIRECTIVES ####
###########################
# $AllowedSender - specifies which remote systems are allowed to send syslog messages to rsyslogd
$AllowedSender UDP, 192.168.57.0/24, [::1]/128, *.example.net, servera.example.com
$AllowedSender TCP, 192.168.58.0/24, [::1]/128, *.example.net, servera.example.com

Configuring the Data Collection Rule

Now that we have syslog configured to act as a forwarder we can create our Data Collection Rule that defines what we want to collect from the syslog forwarder. Assigning the DCR will also install the Azure Monitor Agent on the syslog forwarder server.

From the Azure Portal navigate to Azure Monitor > Data Collection Rules

Select Create and enter the appropriate details

Personally I like to use resource groups as boundary for separating out Azure resources and their use cases. It provides an easy way to audit, manage, and define RBAC. In this example I created a dedicated resource group for all my Data Collection Rules called sentinel-dcrs


Next on the Resources tab Click Add resources. From the Add resources pane navigate to the resource group where your Arc connected syslog server resides. Select the Arc server and click Apply


Next Select Collect and deliver. On the Collect and deliver pane, select Add data source

On the Add data source pane select Linux syslog under Data source type

Under the Facility table we are going to define what we want to collect for each syslog facility. For this use case we are going to change all of the log levels to none, except for LOG_SYSLOG. For the Minimum log level we are going to select LOG_INFO.

Next under the Destination tab select Azure Monitor Logs, the Subscription where your Sentinel workspaces lives, and the Sentinel workspace name

As you can see, we can add multiple destinations for a single DCR

Click Add data source > Next Review + create > Create

After the deployment has completed it will take a few minutes for the AMA extension to get installed on the syslog forwarder. You can check for successful installation by navigating to the Arc resource and selecting the Extensions pane. You should see the AzureMonitorLinuxAgent extension with a status of Succeeded

You now have a DCR that you can easily scale out to other syslog servers, change and deploy new configurations to, or remove configurations from.
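Under the hood the DCR is just an ARM resource. Based on the documented DCR schema, the syslog portion built above looks roughly like this (abridged, with the workspace resource ID left as a placeholder):

```json
{
  "properties": {
    "dataSources": {
      "syslog": [
        {
          "name": "sysLogsDataSource",
          "streams": [ "Microsoft-Syslog" ],
          "facilityNames": [ "syslog" ],
          "logLevels": [ "Info", "Notice", "Warning", "Error", "Critical", "Alert", "Emergency" ]
        }
      ]
    },
    "destinations": {
      "logAnalytics": [
        {
          "name": "sentinelWorkspace",
          "workspaceResourceId": "<workspace resource ID>"
        }
      ]
    },
    "dataFlows": [
      {
        "streams": [ "Microsoft-Syslog" ],
        "destinations": [ "sentinelWorkspace" ]
      }
    ]
  }
}
```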

Verifying Syslog Collection

An easy method to verify that syslog collection is working is to send test messages using logger

logger -p syslog.info "test to sentinel"

After a few minutes we can check in the Sentinel workspace for logs in the syslog table

Syslog 
| where Computer == "onpremsyslog01"

Forwarding from other Servers

Now that we have the syslog forwarder server configured we can configure remote endpoints to forward their syslog messages to the syslog forwarder. In the below example I am using another Ubuntu 20.04.4 LTS server.

Update the /etc/rsyslog.conf configuration file to send logs to the remote syslog server

sudo nano /etc/rsyslog.conf

You have probably seen other examples using the legacy format like below

auth,authpriv.* @192.168.57.3:514

This is no longer advised; instead you should use action statements like below. See omfwd: syslog Forwarding Output Module for more details

The new syntax is relatively straightforward:

<facilities and levels> action(type="omfwd" target="<syslog forwarder server hostname or IP address>" port="<port>" protocol="<tcp or udp>")

Add the following section to send over TCP

# Send logs to remote syslog server over TCP
syslog.* action(type="omfwd" target="10.0.0.9" port="514" protocol="tcp" action.resumeRetryCount="100" queue.type="linkedList" queue.size="10000" queue.saveOnShutdown="on" queue.filename="backup_log")

Add the following section to send over UDP

# Send logs to remote syslog server over UDP
syslog.* action(type="omfwd" target="10.0.0.9" port="514" protocol="udp" action.resumeRetryCount="100" queue.type="linkedList" queue.size="10000" queue.saveOnShutdown="on" queue.filename="backup_log")

What are these action and queue parameters?

These additional parameters provide the ability to save logs in the event there are connectivity issues to the syslog forwarder server. See Queue Parameters and Action Parameters for more details.

  • action.resumeRetryCount – Sets how often an action is retried before it is considered to have failed.
  • queue.type – Enables a LinkedList in-memory queue
  • queue.size – This is the maximum size of the queue in number of messages
  • queue.filename – Defines disk storage. Backup files are created with this name prefix in the working directory specified by the preceding global workDirectory directive.
  • queue.saveOnShutdown – Ensures that any in-memory queue elements are saved to disk before rsyslog terminates
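One thing to note: queue.filename only takes effect if rsyslog has a working directory defined. A sketch combining the pieces (the /var/spool/rsyslog path is a common convention, not from this article):

```
# Spill location for disk-assisted queues (must be writable by rsyslog)
global(workDirectory="/var/spool/rsyslog")

# Forward auth logs over TCP; retry forever instead of 100 times
auth,authpriv.* action(type="omfwd" target="10.0.0.9" port="514" protocol="tcp"
    action.resumeRetryCount="-1"
    queue.type="linkedList" queue.size="10000"
    queue.filename="fwd_backup" queue.saveOnShutdown="on")
```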

Other Examples

# Send Everything over UDP
*.* action(type="omfwd" target="10.0.0.9" port="514" protocol="udp")

# Send all authpriv log types over UDP
authpriv.* action(type="omfwd" target="10.0.0.9" port="514" protocol="udp")

# Send all info except mail, authpriv, and cron log types over UDP
*.info;mail.none;authpriv.none;cron.none action(type="omfwd" target="10.0.0.9" port="514" protocol="udp")

Restart the syslog service

sudo service rsyslog restart

Verifying Syslog Collection

After a few minutes we can check in the Sentinel workspace for logs in the syslog table from the remote endpoint by filtering on computer name.

Syslog 
| where Computer == "onpremlinux01"

Other Best Practices

Reduce Default Syslog Collection

By default rsyslog will include any configuration file in /etc/rsyslog.d to determine what to log. This is defined in the /etc/rsyslog.conf file via the $IncludeConfig directive, or the newer method using include() objects.

#
# Include all config files in /etc/rsyslog.d/
#
$IncludeConfig /etc/rsyslog.d/*.conf
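The newer include() object syntax mentioned above looks like this (available in recent rsyslog v8 releases):

```
# Equivalent of $IncludeConfig using the include() object
include(file="/etc/rsyslog.d/*.conf" mode="optional")
```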

You can update the include statement to reference only specific conf files, or modify the 50-default.conf file to reduce collection of unnecessary logs. Just make sure you don’t exclude the 10-azuremonitoragent.conf file, as this will break the AMA integration. As you can see below, a fair number of facilities are logged out of the box on Ubuntu. You can comment out facilities that aren’t needed to mitigate disk space issues.

#  Default rules for rsyslog.
#
#                       For more information see rsyslog.conf(5) and /etc/rsyslog.conf

#
# First some standard log files.  Log by facility.
#
auth,authpriv.*                 /var/log/auth.log
*.*;auth,authpriv.none          -/var/log/syslog
#cron.*                         /var/log/cron.log
#daemon.*                       -/var/log/daemon.log
kern.*                          -/var/log/kern.log
#lpr.*                          -/var/log/lpr.log
mail.*                          -/var/log/mail.log
#user.*                         -/var/log/user.log
