FleetDM and OSQuery on Ubuntu with Lego and LetsEncrypt

We’ve had some major issues running Munkireport recently, and while switching platforms mid ride is generally a really bad idea, I just wanted *something* to work and I had always wanted to try FleetDM, so…

I still love Munkireport and its beautiful graphing and helpful community. But my analytics engine uses SQL queries, and so does FleetDM, as it adds a GUI to OSQuery under the hood. OSQuery was originally developed at Facebook, and FleetDM also has some other cool features- like being able to load and search for known vulns etc.

*Munkireport uses MySQL too, but even when I had a working instance I couldn’t slurp data out of it- there’s an API and even a Postman collection, but it just *wouldn’t*. I did get to export data using Zoho’s Java connector for Zoho Analytics, but it was sooo ugly.

FleetDM is a serious, enterprise type app- they claim to have had instances with up to 150,000 live agents, but I figure I’ll need to retire at around 149,000.

Unsurprisingly for someone who *doesn’t* currently manage computers at this scale, I had some issues getting it running, so here’s some tips that might help you. I’ve tried to make the examples explicit- if you need to compare with original source, the links are provided.

This article makes a lot of assumptions, so maybe read it through before you commit- I wanted to set up a server suitable for production that can handle ~500 clients, have the web server on port 80/443 using SSL, have the clients also use well known ports so it can work in corporate environments where we don’t control the firewall, etc.

STOP! This took a lot longer than I had budgeted for, but most issues were simply because the docs need updating- you may be able to leverage these instructions and get going faster, marry a supermodel etc. But before you do, have a look at the task list-

Install FleetDM
Install MySQL
Install Redis
Test Redis
Install Lego
Create SSL/TLS certs
Move certificates and set permissions
Create a script to renew certificates
Prepare the Fleet database
Create a Fleet config
Create a Fleet system.service and autostart
Import standard query library

Some of these things are fairly trivial, but many are not. I think it’s a great project otherwise I wouldn’t have invested so much time, but now I need to go and write some queries to figure out when I can stop for lunch before someone installs another malware ‘sample’ on their computer…

We use Ubuntu a fair bit as the basis for cloud or customer facing services, and the instructions for FleetDM on Ubuntu say ‘Acquiring an Ubuntu host to use for this guide is largely an exercise for the reader’. But this is fine because on most cloud providers this is as difficult as opening the control panel and clicking a button. The FleetDM instructions however refer to Ubuntu 16.04 and we are living in a brand new century which includes Ubuntu 20.04 so let’s use that…

Install FleetDM

Once your host is running, download, unzip and relocate the fleet binary-

wget https://github.com/fleetdm/fleet/releases/download/fleet-v4.15.0/fleet_v4.15.0_linux.tar.gz
gzip -d fleet_v4.15.0_linux.tar.gz && tar -xvf fleet_v4.15.0_linux.tar

Move the binary to /usr/bin

sudo mv fleet /usr/bin/
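
If you want to bump the version later without retyping the URL, here is a minimal sketch. It assumes other releases follow the same naming pattern as the 4.15.0 link above (I have not checked every release), and fleet_release_url is just a name I made up.

```shell
# Build the download URL for a Fleet release tarball.
# Assumption: other releases are named the same way as the v4.15.0 link above.
fleet_release_url() {
  version="$1"
  printf 'https://github.com/fleetdm/fleet/releases/download/fleet-v%s/fleet_v%s_linux.tar.gz\n' "$version" "$version"
}

FLEET_VERSION="4.15.0"
echo "Download with: wget $(fleet_release_url "$FLEET_VERSION")"
```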

We don’t want our services running as root, so let’s create a new user for fleet and a new group for certificate access. Set a password when prompted-

sudo adduser fleet

mdkFv6Pkl0Cr

(all passwords have been changed to new random ones, don’t bother trying them against our infra unless you are incredibly bored and enjoy failure)

Make group called ‘certs’

sudo addgroup certs

Add user ‘fleet’ to group ‘certs’

sudo adduser fleet certs
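
To confirm the membership took, something like this should do. It is only a sketch- user_in_group is a hypothetical helper name, and it relies on nothing more than the standard id command.

```shell
# Succeeds when the given user belongs to the given group.
# user_in_group is a made-up helper name, not part of any tool.
user_in_group() {
  id -nG "$1" 2>/dev/null | tr ' ' '\n' | grep -qx "$2"
}

if user_in_group fleet certs; then
  echo "fleet is a member of certs"
else
  echo "fleet is NOT a member of certs (yet)"
fi
```

Group changes only apply to new login sessions, so an already logged in fleet user won’t see the new group until it logs in again.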

Install MySQL

This is where things diverged quite dramatically from the install instructions on the Fleet website- part of this is because they are directing you to a non production setup, part of it is because the instructions need updating. Here we need to cope with a changed auth type in MySQL.

sudo apt-get install mysql-server -y

When MySQL asks for a username and password, add-

root
AQErxKYJ3ac1

Had to create a new mysql user according to this

https://stackoverflow.com/questions/39281594/error-1698-28000-access-denied-for-user-rootlocalhost

fleet

F5KDS4wbjU61

log into MySQL with

sudo mysql -u root
CREATE USER 'fleet'@'localhost' IDENTIFIED BY 'F5KDS4wbjU61';
GRANT ALL PRIVILEGES ON *.* TO 'fleet'@'localhost';
UPDATE mysql.user SET plugin='auth_socket' WHERE User='fleet';
FLUSH PRIVILEGES;
exit;
sudo service mysql restart

I also had to do this-

UPDATE mysql.user SET plugin='mysql_native_password' WHERE User='fleet';
UPDATE mysql.user SET plugin='caching_sha2_password' WHERE User='fleet';
CREATE USER 'fleet'@'localhost' IDENTIFIED BY 'F5KDS4wbjU61';
SHOW CREATE USER 'fleet'@'localhost'\G
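
If I were doing the MySQL part again from scratch, I would probably generate the SQL in one go, something like the sketch below. Treat it as my assumption rather than the official way: as far as I can tell ‘fleet prepare db’ expects the fleet database to already exist, so the sketch creates it, and it grants privileges on that database only instead of *.*. fleet_mysql_setup_sql is a made-up helper name, and it will break on passwords containing a single quote.

```shell
# Print the SQL that creates a database and a password-authenticated user.
# Assumptions: MySQL 8.0 on Ubuntu 20.04; names and password are placeholders.
fleet_mysql_setup_sql() {
  db="$1"; user="$2"; pass="$3"
  cat <<EOF
CREATE DATABASE IF NOT EXISTS $db;
CREATE USER IF NOT EXISTS '$user'@'localhost' IDENTIFIED BY '$pass';
GRANT ALL PRIVILEGES ON $db.* TO '$user'@'localhost';
FLUSH PRIVILEGES;
EOF
}

# Review the output first, then pipe it into MySQL as root, e.g.
#   fleet_mysql_setup_sql fleet fleet 'F5KDS4wbjU61' | sudo mysql
fleet_mysql_setup_sql fleet fleet 'F5KDS4wbjU61'
```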

Tip- if you get lost and need to find out what fleet thinks is happening- find config-

./fleet config_dump

Install Redis

It’s quite possible that I had problems here, but I didn’t document them. So quite likely my eyes weren’t bleeding and my voice hoarse from invoking dead gods. So you’re good here fam.
Except the docs have a link to a Digital Ocean tutorial that has those terrifying words ‘compile from source’. Damn.
From the FleetDM docs- install

sudo apt-get install redis-server -y

and run redis-

sudo redis-server &

Warning!

Following this for a second time, I found that the redis user was already set up, and so was redis.service. This probably means that the FleetDM instructions are out of date, but I’ll copy the instructions here and go through the process again to check. According to Digital Ocean there’s more work to do (this is pretty much a direct copy from that tutorial).

Add redis user

sudo adduser --system --group --no-create-home redis

Create a directory for Redis to use as a data store

sudo mkdir -p /var/lib/redis

Give redis user access to this directory

sudo chown redis:redis /var/lib/redis

Set permissions so a regular user can’t access this folder

sudo chmod 770 /var/lib/redis

Now, create a systemd service file and edit it-

sudo nano /etc/systemd/system/redis.service

And add the following text-

[Unit]
Description=Redis In-Memory Data Store
After=network.target

[Service]
User=redis
Group=redis
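# /usr/bin is where the apt package installs Redis (the Digital Ocean tutorial uses /usr/local/bin because it compiles from source)
ExecStart=/usr/bin/redis-server /etc/redis/redis.conf
ExecStop=/usr/bin/redis-cli shutdown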
Restart=always

[Install]
WantedBy=multi-user.target

Enable start at boot time-

sudo systemctl enable redis

Testing Redis

Connect to Redis CLI

redis-cli

test connectivity with-

127.0.0.1:6379> ping

And you should get the following output-

PONG

There’s a storage test in the tutorial which I am ignoring, but to restart-

Exit Redis CLI

127.0.0.1:6379> exit

Restart the service-

sudo systemctl restart redis

Connect to Redis CLI again

redis-cli

and as above, test the ping function
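
If you get bored of typing ping by hand after every restart, the same check can be scripted. This is only a sketch: wait_for_redis and the REDIS_CLI override variable are names I invented, and it assumes redis-cli is on your PATH.

```shell
# Poll Redis until it answers PING with PONG, or give up.
# Usage: wait_for_redis [host] [port] [attempts]
# REDIS_CLI is a made-up override so a different client binary can be used.
wait_for_redis() {
  host="${1:-127.0.0.1}"; port="${2:-6379}"; attempts="${3:-5}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    if [ "$("${REDIS_CLI:-redis-cli}" -h "$host" -p "$port" ping 2>/dev/null)" = "PONG" ]; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1
}

if wait_for_redis 127.0.0.1 6379 2; then
  echo "Redis is answering"
else
  echo "Redis did not answer, check: sudo systemctl status redis"
fi
```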

Install Lego for Letsencrypt

Install lego so we can get a Let’s Encrypt cert-

wget https://github.com/go-acme/lego/releases/download/v4.7.0/lego_v4.7.0_linux_amd64.tar.gz

Unpack it

gzip -d lego_v4.7.0_linux_amd64.tar.gz && tar -xvf lego_v4.7.0_linux_amd64.tar

Move the binary to /usr/bin

sudo mv lego /usr/bin/

I’m using Cloudflare, so go to the Cloudflare dashboard and create an API token that lego can use to complete the Let’s Encrypt DNS challenge, which means it needs permission to edit DNS records for your zone (not covering this bit, you can do it yourself and it is fairly easy). Result-

Cloudflare API Token
A9fGkL2uXANNZPiHqBvwQIthpcBYEQI

Move to the directory that contains lego- this isn’t needed but it does result in the certs being in a known path…

cd /usr/bin

Execute command to get certs-

CF_DNS_API_TOKEN=A9fGkL2uXANNZPiHqBvwQIthpcBYEQI \
lego --dns cloudflare --domains xxx.servicemax.com.au --email xxx@servicemax.com.au run

You should get a response like this-

Please review the TOS at https://letsencrypt.org/documents/LE-SA-v1.2-November-15-2017.pdf
Do you accept the TOS? Y/n

2022/06/13 12:23:22 [INFO] acme: Registering account for xxx@servicemax.com.au
!!!! HEADS UP !!!!
Your account credentials have been saved in your Let's Encrypt
configuration directory at "/usr/bin/.lego/accounts".
You should make a secure backup of this folder now. This configuration directory will also contain certificates and
private keys obtained from Let's Encrypt so making regular backups of this folder is ideal.
........
2022/06/13 12:23:30 [INFO] [xxx.servicemax.com.au] Server responded with a certificate.

Our shiny new certs are created, verify they are now in

/usr/bin/.lego/certificates/xxx.servicemax.com.au.crt
/usr/bin/.lego/certificates/xxx.servicemax.com.au.key
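
Beyond checking the files exist, you can ask openssl who issued the cert and how long it has left. A minimal sketch, assuming openssl is installed (it is on a stock Ubuntu server); cert_valid_for_days is a hypothetical helper name.

```shell
# Succeeds when the certificate in $1 stays valid for at least $2 more days.
# cert_valid_for_days is a made-up helper built on "openssl x509 -checkend".
cert_valid_for_days() {
  openssl x509 -in "$1" -noout -checkend "$(( $2 * 86400 ))" >/dev/null 2>&1
}

CERT=/usr/bin/.lego/certificates/xxx.servicemax.com.au.crt
if [ -f "$CERT" ]; then
  # Show who issued it and when it expires
  openssl x509 -in "$CERT" -noout -issuer -enddate
  cert_valid_for_days "$CERT" 30 || echo "Expires within 30 days, time to renew"
else
  echo "No certificate at $CERT yet"
fi
```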

Fleet access to certificates

*this bit needs work*
According to this article-

https://www.getpagespeed.com/server-setup/ssl-directory

our SSLs should be stored in

/etc/ssl/certs/

and the key in

/etc/ssl/private/

Probably the best thing we can do here is add a line to the renewal script that moves the cert and key into the correct location and set perms. This bit probably needs a cleanup because even if not running as root, your Fleet user needs access to the domain.key private key, but you don’t want them getting access to other keys on the system- so one or the other of these methods is best but not both. I think. Possibly. Hopefully a reader will help…

https://serverfault.com/questions/259302/best-location-to-keep-ssl-certificates-and-private-keys-on-ubuntu-servers

Update the ssl folder to add our certs group-

sudo chown root:certs /etc/ssl/

Might be better to set ACLs?

Found in this article about Resilio Sync

https://www.linuxbabe.com/ubuntu/install-resilio-sync-ubuntu-16-04-16-10

So let’s create an automated way to update the certificates

instructions modified from Bitnami-

https://docs.bitnami.com/aws/how-to/generate-install-lets-encrypt-ssl/

Create a script to renew the certs

sudo mkdir -p /opt/letsencrypt/scripts
sudo nano /opt/letsencrypt/scripts/renew-certificate.sh

Add this text to the renew-certificate.sh

#!/bin/bash
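# Renew with the same Cloudflare DNS challenge used for the first request (the --tls challenge needs port 443, which Fleet will be using).
# --path points at lego's data folder, lego adds certificates/ itself.
sudo CF_DNS_API_TOKEN=A9fGkL2uXANNZPiHqBvwQIthpcBYEQI /usr/bin/lego --dns cloudflare --email=xxx@servicemax.com.au --domains=xxx.servicemax.com.au --path=/usr/bin/.lego renew --days 90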
sudo cp /usr/bin/.lego/certificates/xxx.servicemax.com.au.crt /etc/ssl/certs/
sudo chmod 644 /etc/ssl/certs/xxx.servicemax.com.au.crt
sudo cp /usr/bin/.lego/certificates/xxx.servicemax.com.au.key /etc/ssl/private/
sudo chmod 640 /etc/ssl/private/xxx.servicemax.com.au.key
sudo setfacl -R -m "u:fleet:rwx" /etc/ssl/

make script executable

sudo chmod +x /opt/letsencrypt/scripts/renew-certificate.sh

open crontab

sudo crontab -e

add this line to crontab

0 0 1 * * /opt/letsencrypt/scripts/renew-certificate.sh 2> /dev/null
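
The copy-then-chmod pairs in the renewal script can also be done in one step per file with the install command, which copies and sets the mode together. This is a sketch, not something I have run in production: install_cert_pair is a made-up name, and the idea of handing the key to the certs group is my guess at a tidier alternative to the ACL on all of /etc/ssl.

```shell
# Copy a cert/key pair into place, setting the file mode in the same step.
# Usage: install_cert_pair <source dir> <domain> <cert dir> <key dir>
install_cert_pair() {
  src="$1"; domain="$2"; cert_dir="$3"; key_dir="$4"
  install -m 644 "$src/$domain.crt" "$cert_dir/$domain.crt" &&
  install -m 640 "$src/$domain.key" "$key_dir/$domain.key"
}

# When run as root you could add "-o root -g certs" to the second install so
# that only root and members of the certs group (our fleet user) can read the key.
# Example, run as root:
#   install_cert_pair /usr/bin/.lego/certificates xxx.servicemax.com.au /etc/ssl/certs /etc/ssl/private
```

As far as I know /etc/ssl/private is itself locked down on Ubuntu, so the fleet user also needs permission to enter that folder, not just to read the file.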

Run the Fleet server

Now we can start the Fleet server- but first we have to prepare the database with the following command ‘fleet prepare db’

/usr/bin/fleet prepare db \
--mysql_address=127.0.0.1:3306 \
--mysql_database=fleet \
--mysql_username=fleet \
--mysql_password=F5KDS4wbjU61

TEXT BELOW HAS STRIKETHROUGH BECAUSE WE WANT A CONFIG FILE NOT A LAUNCH COMMAND

Now the big one– we get to start the server! run the following command ‘fleet serve’

/usr/bin/fleet serve \
  --mysql_address=127.0.0.1:3306 \
  --mysql_database=fleet \
  --mysql_username=fleet \
  --mysql_password=F5KDS4wbjU61 \
  --redis_address=127.0.0.1:6379 \
  --server_cert=/etc/ssl/certs/xxx.servicemax.com.au.crt \
  --server_key=/etc/ssl/private/xxx.servicemax.com.au.key \
  --logging_json

~~You should now be able to go to https://yourIP:8080~~
~~and log in to Fleet. if it’s working, you’ll be redirected to https://yourIP:8080/setup~~
~~to create a user account. Which you should do now, go on.~~
~~But there’s still a few things to do…~~

Create a config file, Change ports, Set logs

Nearly there, but you want to change a few things…Now let’s set up Fleet so we can control the configs using a more traditional config file. We do this with the ‘fleet serve’ command, but let’s export our current config so we know a little bit about our setup- (this probably won’t work if you haven’t launched it yet, need to figure out if we really have to create a launch command first, only to dump it later)

fleet config_dump

OK, back to our regular programming. Now let’s create the config file

nano /home/fleet/fleet.yml

And add some content like this- there’s a whole bunch more options in the config docs but make sure you don’t accidentally break stuff here by leaving out important settings…


mysql:
  address: 127.0.0.1:3306
  database: fleet
  username: fleet
  password: F5KDS4wbjU61
redis:
  address: 127.0.0.1:6379
  connect_timeout: 30s
  keep_alive: 60s
server:
  address: 0.0.0.0:443 
  tls: true
  cert: /etc/ssl/certs/xxx.servicemax.com.au.crt
  key: /etc/ssl/private/xxx.servicemax.com.au.key
osquery:
  status_log_plugin: filesystem
  result_log_plugin: filesystem
  host_identifier: uuid
vulnerabilities:
  current_instance_checks: auto
logging:
  # debug: true
filesystem:
  status_log_file: /var/log/osquery/status.log
  result_log_file: /var/log/osquery/results.log
  enable_log_rotation: true

And save. Note the line with the server address- this is where we changed the port. Be careful: leaving the server IP at 127.0.0.1 left me unable to log in remotely, because Fleet was then only listening on the loopback interface. Setting it to 0.0.0.0 (all interfaces) enabled login again.
Warning- if you copy this, check the spaces to make sure the YAML is valid!
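
A quick way to catch the usual copy and paste damage is to look for tabs and odd indentation. The sketch below is only a whitespace sanity check, not a YAML parser, and it assumes two-space indentation like the config above; check_yaml_whitespace is a name I made up.

```shell
# Flag tabs and odd-width indentation, the usual copy/paste casualties.
# Assumption: the file uses two-space indentation. This does not parse YAML.
check_yaml_whitespace() {
  awk '
    BEGIN { bad = 0 }
    /\t/ { printf "line %d: contains a tab\n", NR; bad = 1 }
    match($0, /^ +/) && RLENGTH % 2 { printf "line %d: odd indentation (%d spaces)\n", NR, RLENGTH; bad = 1 }
    END { exit bad }
  ' "$1"
}

CONFIG=/home/fleet/fleet.yml
if [ ! -f "$CONFIG" ]; then
  echo "No config at $CONFIG yet"
elif check_yaml_whitespace "$CONFIG"; then
  echo "No whitespace problems found in $CONFIG"
else
  echo "Fix the lines listed above before starting Fleet"
fi
```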

You want logs? Cos this is how you get logs… but we need to create the directory and set the permissions first

sudo mkdir /var/log/osquery
sudo touch /var/log/osquery/status.log
sudo touch /var/log/osquery/results.log
sudo chown -R fleet:users /var/log/osquery

Make the new config active

Now we need to tell Fleet to use the new config file with ‘fleet serve’. Use this command if you want to test the setup. Go to the next step to set a system service that will auto start on boot.

fleet serve --config /home/fleet/fleet.yml

Run Fleet rootless

Now, because our Fleet user is not an admin user and we want to use a port number below 1024, we need to add the capability to our Fleet binary (look up CAP_NET_BIND_SERVICE ). To give the binary this feature, run

sudo setcap cap_net_bind_service=+ep /usr/bin/fleet

Make Fleet a system service and auto-start

Create system service so fleet will survive a reboot

https://fleetdm.com/docs/deploying/configuration#running-with-systemd

switch to correct directory

cd /etc/systemd/system/

Create the new service file

sudo nano fleet.service

Add a customised version of this- we differ from the docs here because we don’t want all our config in the service file, we only want to refer to the config file…

[Unit]
Description=Fleet
After=network.target

[Service]
User=fleet
Group=fleet
LimitNOFILE=8192
ExecStart=/usr/bin/fleet serve --config /home/fleet/fleet.yml

[Install]
WantedBy=multi-user.target

And save. Now you can enable auto start with

sudo systemctl enable fleet

You can check your work if you need to make changes by restarting the daemon, restarting the service, and checking the logs-

sudo systemctl daemon-reload
sudo systemctl restart fleet.service
sudo journalctl -u fleet.service -f
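
Once the service is running you can also poke it from outside. Fleet exposes a basic health check at /healthz, so a sketch like this should tell you whether it is answering. fleet_healthy is a hypothetical helper name and the URL is the same placeholder used throughout- swap in your own.

```shell
# Succeeds when Fleet answers its health check endpoint with a 2xx status.
# fleet_healthy is a made-up helper name; the URL below is a placeholder.
fleet_healthy() {
  curl -fsS -o /dev/null --max-time 5 "$1/healthz"
}

FLEET_URL="https://xxx.servicemax.com.au"
if fleet_healthy "$FLEET_URL" 2>/dev/null; then
  echo "Fleet is up at $FLEET_URL"
else
  echo "Fleet is not answering at $FLEET_URL, check: sudo journalctl -u fleet.service"
fi
```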

If everything went well, you can now log in to your FleetDM portal, add some hosts and generally behave like a rock star.
When you get bail, you can move on to fine tuning your configs and maybe even looking for some bad guys…

Logging in remotely with fleetctl

It wasn’t initially clear to me that ‘fleetctl’ is a binary that’s meant to exist on your local machine, and can be used to control a remote server, much like adding kubectl and the config files to your Mac for Kubernetes… so you install fleetctl first, then set the ‘context’ of fleetctl to match the server you want to connect to, then do the actual login. I think. Set the address to your server’s URL- if you followed the config file above that’s port 443, so no port number is needed (Fleet’s default is 8080)-

fleetctl config set --address https://xxx.servicemax.com.au

Now login to Fleet using CLI

fleetctl login

Enter your login email address and password to login.


Fleet Import Standard Query Library

Go back to your home folder (You need to be logged in to fleetctl to do this import action- see the entry just above this)

cd ~

Download the .yml file that has the standard queries

wget https://raw.githubusercontent.com/fleetdm/fleet/main/docs/01-Using-Fleet/standard-query-library/standard-query-library.yml

Import the query file to Fleet

fleetctl apply -f standard-query-library.yml
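
If the import complains about the file, check what wget actually saved. A github.com link with /blob/ in it returns the web page for the file rather than the file itself, and this sketch spots that. looks_like_html is a made-up helper name.

```shell
# Succeeds when the first non-blank line of the file looks like HTML.
looks_like_html() {
  grep -v '^[[:space:]]*$' "$1" | head -n 1 | grep -qiE '^[[:space:]]*<(!doctype|html)'
}

FILE=standard-query-library.yml
if [ ! -f "$FILE" ]; then
  echo "$FILE not downloaded yet"
elif looks_like_html "$FILE"; then
  echo "$FILE is an HTML page, download the raw file instead"
else
  echo "$FILE looks like plain text, OK to apply"
fi
```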

Security!

Now go back and update the firewall rules so that attacking your server is like licking teflon, turn off SSH if it was on (or restrict to your IP, use public key login etc.) and change the root password if you have been using the one set by the cloud provider…

Thanks

I want to extend my thanks to Zach Wasserman and ‘Mystery Incorporated’ who gave really thorough answers to my questions. I went from ‘your instructions are wrong’ to ‘I might actually be able to make this work’ based on the help I got in Slack. Thanks, and here is that conversation- you may be able to get even more out of it than I did.
My initial install did not allow clients to connect because of the self signed TLS cert- you can turn this off but then you’d also need a spanking. But I can confirm that the above strategy works- as soon as I got the LetsEncrypt certs added, the non-working client came up in the portal like magic- because it was set to check for cert validity and the cert was suddenly valid!

Me- ‘Can I please ask about ‘best practise’ for using TLS for client enrolment.’

in our reference architecture, we terminate TLS with a load balancer on AWS, using the free certificates from Amazon. Using a certificate from any commercial CA or Let’s Encrypt works great as well!

Let’s Encrypt actually is feasible despite the rotation issue because the certificate root is included in standard osquery and Orbit packaging, so you don’t have to pin to the specific certificate.

Ah gotcha, so essentially the fleet.pem that OSquery uses is actually the Let’s Encrypt root cert?

The one we package by default includes LE and a bunch of other CAs- we actually use the set of roots from Mozilla. If you use the --fleet-certificate option then it only includes whatever you have in that file (which could be the LE root if you want to allow only LE).

If you’re using a self-signed cert you definitely need the --fleet-certificate flag (or --insecure but that is of course not recommended for production)

You can look in /var/log/orbit/orbit.stderr.log to see why osquery is sad — I’d guess you will see certificate verify failed.

Ah sorry, let me try to clarify

--fleet-certificate is absolutely recommended for prod. --insecure is not recommended for prod because it disables certificate validation.

--fleet-certificate is not necessary for prod though if you have a cert that’s trusted by the roots in that bundle (eg. commercial CA, AWS ACM, Let’s Encrypt, etc.)