We’ve had some major issues running Munkireport recently, and while switching platforms mid ride is generally a really bad idea, I just wanted *something* to work and I had always wanted to try FleetDM, so…
I still love Munkireport and its beautiful graphing and helpful community. But my analytics engine uses SQL queries, and so does FleetDM, as it adds a GUI to OSQuery under the hood. OSQuery was originally developed at Facebook, and Fleet also has some other cool features- like being able to load and search for known vulns etc.
*Munkireport uses MySQL too, but even when I had a working instance I couldn’t slurp data out of it- there’s an API and even a Postman collection, but it just *wouldn’t*. I did get to export data using Zoho’s Java connector for Zoho Analytics, but it was sooo ugly.
FleetDM is a serious, enterprise type app- they claim to have had instances with up to 150,000 live agents, but I figure I’ll need to retire at around 149,000.
Unsurprisingly for someone who *doesn’t* currently manage computers at this scale, I had some issues getting it running, so here are some tips that might help you. I’ve tried to make the examples explicit- if you need to compare with original source, the links are provided.
This article makes a lot of assumptions, so maybe read it through before you commit- I wanted to set up a server suitable for production that can handle ~500 clients, has the web server on port 80/443 using SSL, and has the clients also use well-known ports so it can work in corporate environments where we don’t control the firewall, etc.
STOP! This took a lot longer than I had budgeted for, but most issues were simply because the docs need updating- you may be able to leverage these instructions and get going faster, marry a supermodel etc. But before you do, have a look at the task list-
- Install FleetDM
- Install MySQL
- Install Redis
- Test Redis
- Install Lego
- Create SSL/TLS certs
- Move certificates and set permissions
- Create a script to renew certificates
- Prepare the Fleet database
- Create a Fleet config
- Create a Fleet system.service and autostart
- Import standard query library
Some of these things are fairly trivial, but many are not. I think it’s a great project otherwise I wouldn’t have invested so much time, but now I need to go and write some queries to figure out when I can stop for lunch before someone installs another malware ‘sample’ on their computer…
We use Ubuntu a fair bit as the basis for cloud or customer facing services, and the instructions for FleetDM on Ubuntu say ‘Acquiring an Ubuntu host to use for this guide is largely an exercise for the reader’. But this is fine because on most cloud providers this is as difficult as opening the control panel and clicking a button. The FleetDM instructions however refer to Ubuntu 16.04 and we are living in a brand new century which includes Ubuntu 20.04, so let’s use that…
Once your host is running, download, unzip and relocate the fleet binary-
wget https://github.com/fleetdm/fleet/releases/download/fleet-v4.15.0/fleet_v4.15.0_linux.tar.gz
gzip -d fleet_v4.15.0_linux.tar.gz && tar -xvf fleet_v4.15.0_linux.tar
Move the binary to /usr/bin
sudo mv fleet /usr/bin/
Now, we don’t want our services running as root, so let’s create a new user for fleet, and a new group for certificate access
sudo adduser fleet
(all passwords have been changed to new random ones, don’t bother trying them against our infra unless you are incredibly bored and enjoy failure)
Make group called ‘certs’
sudo addgroup certs
Add user ‘fleet’ to group ‘certs’
sudo adduser fleet certs
This is where the instructions diverged quite dramatically from the install instructions on the Fleet website- part of this is because they are directing you to a non production setup, part of it is because the instructions need updating. Here we need to cope with a changed auth type in MySQL.
sudo apt-get install mysql-server -y
MySQL- when asked for username and password add-
I had to create a new MySQL user according to this
Log into MySQL with
sudo mysql -u root
CREATE USER 'fleet'@'localhost' IDENTIFIED BY 'F5KDS4wbjU61';
GRANT ALL PRIVILEGES ON *.* TO 'fleet'@'localhost';
UPDATE mysql.user SET plugin='auth_socket' WHERE User='fleet';
FLUSH PRIVILEGES;
exit;
sudo service mysql restart
I also had to do this-
UPDATE mysql.user SET plugin='mysql_native_password' WHERE User='fleet';
UPDATE mysql.user SET plugin='caching_sha2_password' WHERE User='fleet';
CREATE USER 'fleet'@'localhost' IDENTIFIED BY 'F5KDS4wbjU61';
SHOW CREATE USER 'fleet'@'localhost'\G
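If you are starting from a clean MySQL install, a tidier setup might look something like the sketch below. It is only a sketch under a few assumptions: MySQL 8 on Ubuntu 20.04, the database name `fleet` (the one passed to `fleet prepare db` further down), a placeholder password, and a made-up file name `fleet-db-setup.sql`. Limiting the grant to `fleet.*` instead of `*.*` is my guess at what Fleet actually needs, not something I took from the Fleet docs.

```shell
#!/bin/sh
# Sketch only: write the setup statements to a file so they can be reviewed first.
# FLEET_DB_PASSWORD is a placeholder; substitute your own.
FLEET_DB_PASSWORD="${FLEET_DB_PASSWORD:-change-me}"
SQL_FILE="${SQL_FILE:-fleet-db-setup.sql}"

# The file will contain a password, so keep it private to the current user.
umask 077

cat > "$SQL_FILE" <<EOF
-- Create the database that the fleet user will work in.
CREATE DATABASE IF NOT EXISTS fleet;
-- Create the application user (does nothing if it already exists).
CREATE USER IF NOT EXISTS 'fleet'@'localhost' IDENTIFIED BY '${FLEET_DB_PASSWORD}';
-- Assumption: Fleet only needs rights on its own database.
GRANT ALL PRIVILEGES ON fleet.* TO 'fleet'@'localhost';
FLUSH PRIVILEGES;
EOF

echo "Wrote $SQL_FILE; review it, then apply with: sudo mysql -u root < $SQL_FILE"
```

Writing the statements to a file first means you can read them before anything touches the server, and re-running them is harmless because of the `IF NOT EXISTS` clauses.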
Tip- if you get lost and need to find out what fleet thinks is happening- find config-
It’s quite possible that I had problems here, but I didn’t document them. So quite likely my eyes weren’t bleeding and my voice hoarse from invoking dead gods. So you’re good here fam.
Except the docs have a link to a Digital Ocean tutorial that has those terrifying words ‘compile from source’. Damn.
From the FleetDM docs- install
sudo apt-get install redis-server -y
and run redis-
sudo redis-server &
Following this for a second time, I found that the redis user was already set up, and so was the redis.service. This probably means that the FleetDM instructions are out of date, but I’ll copy the instructions here and go through the process again to check. According to Digital Ocean, there’s more work to do (this is pretty much a direct copy from that tutorial).
Add redis user
sudo adduser --system --group --no-create-home redis
Create a directory for Redis to use as a data store
Give redis user access to this directory
sudo chown redis:redis /var/lib/redis
Set permissions so a regular user can’t access this folder
sudo chmod 770 /var/lib/redis
Now, create a system.service file and edit it-
sudo nano /etc/systemd/system/redis.service
And add the following text-
Description=Redis In-Memory Data Store
Enable start at boot time-
sudo systemctl enable redis
Connect to Redis CLI
test connectivity with-
And you should get the following output-
There’s a storage test in the tutorial which I am ignoring, but to restart-
Exit Redis CLI
Restart the service-
sudo systemctl restart redis
Connect to Redis CLI again
and as above, test the ping function
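As far as I know, the usual connectivity check is the `ping` command in `redis-cli`, which a healthy server answers with `PONG`. Here is a minimal sketch of turning that into a scriptable check. The function name `redis_reply_ok` is made up for illustration, and the address and port are the ones used for Redis elsewhere in this article.

```shell
#!/bin/sh
# Sketch: treat anything other than the literal reply "PONG" as a failure.
redis_reply_ok() {
    [ "$1" = "PONG" ]
}

# On the server you would feed it the real reply, for example:
#   redis_reply_ok "$(redis-cli -h 127.0.0.1 -p 6379 ping)" && echo "Redis is up"
```

Keeping the comparison separate from the `redis-cli` call makes it easy to reuse in a health check script after a restart.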
Install Lego for Letsencrypt
Install lego so we can get a lets encrypt cert-
gzip -d lego_v4.7.0_linux_amd64.tar.gz && tar -xvf lego_v4.7.0_linux_amd64.tar
Move the binary to /usr/bin
sudo mv lego /usr/bin/
I’m using Cloudflare, so go to the Cloudflare dashboard and create an API token that lego can use to complete the Let’s Encrypt DNS challenge (not covering this bit, you can do it yourself and it is fairly easy). Result-
Cloudflare API Token A9fGkL2uXANNZPiHqBvwQIthpcBYEQI
Move to the directory that contains lego- this isn’t needed but it does result in the certs being in a known path…
Execute command to get certs-
CF_DNS_API_TOKEN=A9fGkL2uXANNZPiHqBvwQIthpcBYEQI \
lego --dns cloudflare --domains xxx.servicemax.com.au --email firstname.lastname@example.org run
You should get a response like this-
Please review the TOS at https://letsencrypt.org/documents/LE-SA-v1.2-November-15-2017.pdf
Do you accept the TOS? Y/n
2022/06/13 12:23:22 [INFO] acme: Registering account for email@example.com
!!!! HEADS UP !!!!
Your account credentials have been saved in your Let's Encrypt configuration directory at "/usr/bin/.lego/accounts".
You should make a secure backup of this folder now. This configuration directory will also contain certificates and private keys obtained from Let's Encrypt so making regular backups of this folder is ideal.
........
2022/06/13 12:23:30 [INFO] [xxx.servicemax.com.au] Server responded with a certificate.
Our shiny new certs are created, verify they are now in
Fleet access to certificates
*this bit needs work*
According to this article-
our SSLs should be stored in
and the key in
Probably the best thing we can do here is add a line to the renewal script that moves the cert and key into the correct location and set perms. This bit probably needs a cleanup because even if not running as root, your Fleet user needs access to the domain.key private key, but you don’t want them getting access to other keys on the system- so one or the other of these methods is best but not both. I think. Possibly. Hopefully a reader will help…
Update the ssl folder to add our certs group-
sudo chown root:certs /etc/ssl/
Might be better to set ACLs?
Found in this article about Resilio Sync
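On the ACL question, one narrower option than opening up all of /etc/ssl would be to give the fleet user read access to just the one private key, plus permission to pass through the private directory. The sketch below only prints the commands so you can check them before running anything. It assumes the key ends up in /etc/ssl/private under the domain name, as in the renewal script in this article; the function name `print_acl_commands` is made up.

```shell
#!/bin/sh
# Sketch: print (not run) narrower ACL commands for one domain's private key.
# Assumes the key lives at /etc/ssl/private/<domain>.key.
print_acl_commands() {
    domain="$1"
    user="${2:-fleet}"
    # Let the user traverse the private directory without listing it.
    echo "setfacl -m u:${user}:--x /etc/ssl/private"
    # Read-only access to this one private key.
    echo "setfacl -m u:${user}:r-- /etc/ssl/private/${domain}.key"
}

print_acl_commands xxx.servicemax.com.au
```

The certificate itself is not secret, so it probably needs no ACL at all once it is world readable. Run the printed lines with sudo once you are happy with them.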
So let’s create an automated way to update the certificates
instructions modified from Bitnami-
Create a script to renew the certs
sudo mkdir -p /opt/letsencrypt/scripts
sudo nano /opt/letsencrypt/scripts/renew-certificate.sh
Add this text to the renew-certificate.sh
#!/bin/bash
# --path is lego's data directory; the certificates live in its certificates/ subfolder
sudo /usr/bin/lego --tls --email=firstname.lastname@example.org --domains=xxx.servicemax.com.au --path=/usr/bin/.lego renew --days 90
sudo cp /usr/bin/.lego/certificates/xxx.servicemax.com.au.crt /etc/ssl/certs/
sudo chmod 644 /etc/ssl/certs/xxx.servicemax.com.au.crt
sudo cp /usr/bin/.lego/certificates/xxx.servicemax.com.au.key /etc/ssl/private/
sudo chmod 640 /etc/ssl/private/xxx.servicemax.com.au.key
sudo setfacl -R -m "u:fleet:rwx" /etc/ssl/
make script executable
sudo chmod +x /opt/letsencrypt/scripts/renew-certificate.sh
sudo crontab -e
add this line to crontab
0 0 1 * * /opt/letsencrypt/scripts/renew-certificate.sh 2> /dev/null
Run the Fleet server
Now we can start the Fleet server- but first we have to prepare the database with the following command ‘fleet prepare’
/usr/bin/fleet prepare db \
  --mysql_address=127.0.0.1:3306 \
  --mysql_database=fleet \
  --mysql_username=fleet \
  --mysql_password=F5KDS4wbjU61
TEXT BELOW HAS STRIKETHROUGH BECAUSE WE WANT A CONFIG FILE NOT A LAUNCH COMMAND
Now the big one– we get to start the server! run the following command ‘fleet serve’
/usr/bin/fleet serve \
  --mysql_address=127.0.0.1:3306 \
  --mysql_database=fleet \
  --mysql_username=fleet \
  --mysql_password=F5KDS4wbjU61 \
  --redis_address=127.0.0.1:6379 \
  --server_cert=/etc/ssl/certs/xxx.servicemax.com.au.crt \
  --server_key=/etc/ssl/private/xxx.servicemax.com.au.key \
  --logging_json
You should now be able to go to https://yourIP:8080
and log in to Fleet. If it’s working, you’ll be redirected to https://yourIP:8080/setup
to create a user account. Which you should do now, go on.
But there’s still a few things to do…
Create a config file, Change ports, Set logs
Nearly there, but you want to change a few things… Now let’s set up Fleet so we can control the configs using a more traditional config file. We do this with the ‘fleet serve’ command, but first let’s export our current config so we know a little bit about our setup- (this probably won’t work if you haven’t launched it yet; I still need to figure out if we really have to create a launch command first, only to dump it later)
OK, back to our regular programming. Now let’s create the config file
And add some content like this- there’s a whole bunch more options in the config docs but make sure you don’t accidentally break stuff here by leaving out important settings…
mysql:
  address: 127.0.0.1:3306
  database: fleet
  username: fleet
  password: F5KDS4wbjU61
redis:
  address: 127.0.0.1:6379
  connect_timeout: 30s
  keep_alive: 60s
server:
  address: 0.0.0.0:443
  tls: true
  cert: /etc/ssl/certs/xxx.servicemax.com.au.crt
  key: /etc/ssl/private/xxx.servicemax.com.au.key
osquery:
  status_log_plugin: filesystem
  result_log_plugin: filesystem
  host_identifier: uuid
vulnerabilities:
  current_instance_checks: auto
logging:
  # debug: true
filesystem:
  status_log_file: /var/log/osquery/status.log
  result_log_file: /var/log/osquery/results.log
  enable_log_rotation: true
And save. Note the lines about the server address- this is where we changed the port used. Be careful: leaving the server IP at 127.0.0.1 left me unable to log in, and setting it to 0.0.0.0 enabled login again.
Warning- if you copy this, check the spaces to make sure the YAML is valid!
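Here is a rough sketch of a check for the two indentation mistakes that copy and paste most often introduces: tab characters, and an odd number of leading spaces. It is not a YAML parser and won’t catch everything. It assumes the file uses two-space indentation, and the function name `check_yaml_indent` is made up for illustration.

```shell
#!/bin/sh
# Sketch: flag two common YAML indentation mistakes in a config file.
# This is a rough lint, not a YAML parser.
check_yaml_indent() {
    file="$1"
    status=0
    # YAML does not allow tab characters for indentation.
    if grep -n "$(printf '^ *\t')" "$file"; then
        echo "Tabs used for indentation in $file" >&2
        status=1
    fi
    # Assumption: the file uses two-space steps, so odd indents are suspicious.
    if grep -nE '^( {2})* [^ #]' "$file"; then
        echo "Odd number of leading spaces in $file" >&2
        status=1
    fi
    return "$status"
}

# Example use on the server:
#   check_yaml_indent /home/fleet/fleet.yml && echo "indentation looks plausible"
```

Any offending lines are printed with their line numbers, so you can jump straight to them in your editor.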
You want logs? Cos this is how you get logs… but we need to create the directory and set the permissions first
sudo mkdir /var/log/osquery
sudo touch /var/log/osquery/status.log
sudo touch /var/log/osquery/results.log
sudo chown -R fleet:users /var/log/osquery
Make the new config active
Now we need to tell Fleet to use the new config file with ‘fleet serve’. Use this command if you want to test the setup. Go to the next step to set a system service that will auto start on boot.
fleet serve --config /home/fleet/fleet.yml
Run Fleet rootless
Now, because our Fleet user is not an admin user and we want to use a port number below 1024, we need to add the capability to our Fleet binary (look up CAP_NET_BIND_SERVICE ). To give the binary this feature, run
sudo setcap cap_net_bind_service=+ep /usr/bin/fleet
Make Fleet a system service and auto-start
Create system service so fleet will survive a reboot
switch to correct directory
Create the new service file
Add a customised version of this- we differ from the docs here because we don’t want all our config in the service file, we only want to refer to the config file…
[Unit]
Description=Fleet
After=network.target

[Service]
User=fleet
Group=fleet
LimitNOFILE=8192
ExecStart=/usr/bin/fleet serve --config /home/fleet/fleet.yml

[Install]
WantedBy=multi-user.target
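An easy mistake in a service file is pointing ExecStart at a path where the binary doesn’t actually live. A minimal sketch of a sanity check follows. It only understands a single plain `ExecStart=` line, the function name `check_exec_start` is made up, and the service file path in the usage comment is my assumption about where you saved it.

```shell
#!/bin/sh
# Sketch: confirm the program named in a unit file's ExecStart line exists
# and is executable. Only handles a single, simple ExecStart= line.
check_exec_start() {
    unit_file="$1"
    # Take the first word after "ExecStart=" as the program path.
    program="$(sed -n 's/^ExecStart=\([^ ]*\).*/\1/p' "$unit_file" | head -n 1)"
    if [ -z "$program" ]; then
        echo "No ExecStart line found in $unit_file" >&2
        return 2
    fi
    if [ -x "$program" ]; then
        echo "OK: $program"
    else
        echo "Missing or not executable: $program" >&2
        return 1
    fi
}

# Example use on the server (path is an assumption):
#   check_exec_start /etc/systemd/system/fleet.service
```

Running this before enabling the service saves a round trip through journalctl when the only problem is a wrong path.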
And save. Now you can enable auto start with
sudo systemctl enable fleet
You can check your work if you need to make changes by restarting the daemon, restarting the service, and checking the logs-
sudo systemctl daemon-reload
sudo systemctl restart fleet.service
sudo journalctl -u fleet.service -f
If everything went well, you can now log in to your FleetDM portal, add some hosts and generally behave like a rock star.
When you get bail, you can move on to fine tuning your configs and maybe even looking for some bad guys…
Logging in remotely with fleetctl
It wasn’t initially clear to me that ‘fleetctl’ is a binary that’s meant to exist on your local machine and can be used to control a remote server, much like adding kubectl and the config files to your Mac for Kubernetes… So you install fleetctl first, then set the ‘context’ of fleetctl to match the server you want to connect to, then do the actual login. I think. Set the context to your existing server address (port 8080 in this example; use 443 if that is what you set in the config file)-
fleetctl config set --address https://xxx.servicemax.com.au:8080
Now log in to Fleet using the CLI-
fleetctl login
Enter your login email address and password when prompted.
Fleet Import Standard Query Library
Go back to your home folder (You need to be logged in to fleetctl to do this import action- see the entry just above this)
Download the .yml file that has the standard queries
Import the query file to Fleet
fleetctl apply -f standard-query-library.yml
Now go back and update the firewall rules so that attacking your server is like licking teflon, turn off SSH if it was on (or restrict to your IP, use public key login etc.) and change the root password if you have been using the one set by the cloud provider…
I want to extend my thanks to Zach Wasserman and ‘Mystery Incorporated’ who gave really thorough answers to my questions. I went from ‘your instructions are wrong’ to ‘I might actually be able to make this work’ based on the help I got in Slack. Thanks, and here is that conversation- you may be able to get even more out of it than I did.
My initial install did not allow clients to connect because of the self signed TLS cert- you can turn this off but then you’d also need a spanking. But I can confirm that the above strategy works- as soon as I got the LetsEncrypt certs added, the non-working client came up in the portal like magic- because it was set to check for cert validity and the cert was suddenly valid!
Me- ‘Can I please ask about ‘best practise’ for using TLS for client enrolment.’
in our reference architecture, we terminate TLS with a load balancer on AWS, using the free certificates from Amazon. Using a certificate from any commercial CA or Let’s Encrypt works great as well!
Let’s Encrypt actually is feasible despite the rotation issue because the certificate root is included in standard osquery and Orbit packaging, so you don’t have to pin to the specific certificate.
Ah gotcha, so essentially the fleet.pem that OSquery uses is actually the Let’s Encrypt root cert?
The one we package by default includes LE and a bunch of other CAs- we actually use the set of roots from Mozilla. If you use the --fleet-certificate option then it only includes whatever you have in that file (which could be the LE root if you want to allow only LE).
If you’re using a self-signed cert you definitely need the --fleet-certificate flag (or --insecure but that is of course not recommended for production)
You can look in /var/log/orbit/orbit.stderr.log to see why osquery is sad — I’d guess you will see certificate verify failed.
Ah sorry, let me try to clarify
--fleet-certificate is absolutely recommended for prod. --insecure is not recommended for prod because it disables certificate validation.
--fleet-certificate is not necessary for prod though if you have a cert that’s trusted by the roots in that bundle (eg. commercial CA, AWS ACM, Let’s Encrypt, etc.)