CYBERSECURITY JOB HUNTING GUIDE
Getting started with Graylog
Author: Stefan Waldvogel
Introduction to Graylog
-in preparation-
What is Graylog?
Graylog is a SIEM (Security Information and Event Management) system. If you are a student, you might never have heard this term, and that is okay. A SIEM is a piece of software that collects logs from different machines. You need two things:
- A SIEM server
- and clients with an agent
A SIEM itself is not smart, especially without alarms. As a SOC Analyst, you have to know what you are looking for. Bigger companies can afford additional technology with integrated intelligence or behavior-based detection. Nowadays, a SIEM often feeds data into a SOAR, and a SOC Analyst no longer queries the SIEM directly, but this is a different topic.
Infrastructure
A SIEM never works alone. In a small company, the setup could look like this:
Every piece of relevant hardware and software sends data to the SIEM. Most people think about clients, servers, and network appliances, but it could be anything: security cameras, mobile devices, smart devices, and more. Each device or application has its own log format. An agent collects the data, encrypts it, and sends everything to the SIEM server. The SIEM has a large database, usually applies rules, and sorts the data. A SOC Analyst has access to the data, looks into alarms, and monitors traffic.
Bigger companies with more money might have a SOAR (Security Orchestration, Automation and Response) between the SIEM and the SOC Analyst.
The server
The server is the heart of the setup and digests all the information. You can install Graylog on many different operating systems. If you are a beginner, just download the OVA file and start the virtual machine with VirtualBox or VMware. This approach is simple. Log in to the machine with admin/ubuntu and find the IP address, or use the IPv6 address shown at the top. Go back to your main system, open a browser, and use the given information to log in (http://server_ip:9000).
A Graylog server has three main components:
- Graylog -> the software / frontend
- MongoDB -> stores the configuration and metadata
- Elasticsearch -> stores the log messages and acts as the search engine
The following picture shows two configurations:
There are many more options (cloud, with load balancing, etc.).
Server-Input
Graylog needs data to work as a useful piece of software. On a Linux system it could be syslog and on a Windows system it could be an event log, but Graylog can digest a wide variety of logs. The dataflow is like this:
- External data -> all logs on a system or client, for example syslog
- Configured inputs -> configure Graylog to accept a log (maybe via an agent)
- Extractors -> optional, but needed to organize your data with e.g. regex or Grok patterns. The data then goes into Elasticsearch with organized fields
- Streams -> optional, store data in different groups/indices. Without a stream, all data goes into the default index set.
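To make the extractor and stream steps more concrete, here is a minimal Python sketch. It is not how Graylog implements extractors or streams internally; it only imitates the idea. The sample log line, the regex, and the stream names `ssh-logins` and `default` are assumptions made up for this illustration.

```python
import re

# Hypothetical raw syslog line, roughly what a failed SSH login could look like.
RAW = "Jan 12 10:15:32 web01 sshd[2211]: Failed password for root from 203.0.113.7 port 50122 ssh2"

# A regex with named groups plays the role of an extractor:
# it turns one unstructured string into named fields.
PATTERN = re.compile(
    r"^(?P<timestamp>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) "
    r"(?P<program>[\w\-/]+)(?:\[(?P<pid>\d+)\])?: "
    r"(?P<message>.*)$"
)


def extract_fields(line):
    """Return a dict of fields, or only the raw message if nothing matches."""
    match = PATTERN.match(line)
    if match is None:
        return {"message": line}
    return match.groupdict()


def pick_stream(fields):
    """Tiny stand-in for a stream rule: route messages by program name."""
    if fields.get("program") == "sshd":
        return "ssh-logins"  # hypothetical stream name
    return "default"         # everything else stays in the default group


fields = extract_fields(RAW)
print(fields["host"], fields["program"], pick_stream(fields))
# prints: web01 sshd ssh-logins
```

Without the extractor step, the whole line would be one long message string. With it, you can later search for a single field such as the host or the program.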
Server-Output
All Graylog instances have a web server. The standard port is 9000. If you have the IP of your Graylog server, you can connect to the web server with http://server_ip_address:9000 or, alternatively, with an IPv6 address (OVA image).
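One detail that often trips people up: inside a URL, an IPv6 address has to be wrapped in square brackets. The following Python sketch builds the address of the web interface for both cases. The function name `graylog_url` and the example addresses are hypothetical and only serve as an illustration.

```python
import ipaddress


def graylog_url(host, port=9000):
    """Build the web interface URL for an IPv4 address, IPv6 address, or hostname."""
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP address, so treat it as a hostname.
        return f"http://{host}:{port}"
    if ip.version == 6:
        # IPv6 addresses need square brackets inside a URL.
        return f"http://[{ip}]:{port}"
    return f"http://{ip}:{port}"


print(graylog_url("192.168.1.50"))   # prints: http://192.168.1.50:9000
print(graylog_url("2001:db8::10"))   # prints: http://[2001:db8::10]:9000
```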
The client
In a modern environment, we have a lot of different clients. Windows, Linux, mobile devices, and IoT are some options. The easiest client in this configuration is a stand-alone server (server and client at the same time), but usually you want to digest data from other machines. If you have a small home lab and limited resources, you can digest the server's own logs.
Without configuration, a client does not send data to your server. If you use Linux, you need to configure syslog to send the data, or use an agent. If you use Windows, you can use an agent to forward the data. Watch this video: www.graylog.org/webinars/graylog-inputs or follow the steps on my website: Linux: https://www.cyberhuntingguide.net/graylog-linux-agent.html or for Windows: www.cyberhuntingguide.net/graylog-windows-agent.html
Additional thoughts
Before we dive into Graylog's web application, think about what you want to learn and what you want to do in the future. If you are reading this article, you might want to become a SOC Analyst. That is okay, but in cybersecurity, there are many more jobs. If you can set up a Graylog server and configure inputs, you learn things beyond the SOC Analyst role. You take the first steps towards becoming a Security Engineer or a Security Architect. You need knowledge about firewalls, routing, and much more. Installing Graylog via the OVA image or on Ubuntu is simple, but installing it on CentOS or RHEL is a different thing, because you have to configure your firewalls.
-> Go step by step. If the OVA image works, try to install Graylog manually on Ubuntu, and if that works, try CentOS 8/RHEL.
As a solid SOC Analyst, it is great to know more about the data flow, because later you have to distinguish between malicious traffic and good traffic.
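If a manual install does not answer in the browser, a firewall is a common suspect. The following Python sketch only checks whether a TCP port accepts connections. It is a rough first test under the assumption that the service listens on TCP, and it cannot tell you which firewall rule is responsible. The function name `port_is_reachable` is hypothetical, and UDP inputs cannot be checked this way.

```python
import socket


def port_is_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Example: check the web interface port on the local machine.
# Replace 127.0.0.1 with the IP of your own Graylog server.
print(port_is_reachable("127.0.0.1", 9000))
```

If the check returns False from another machine but True on the server itself, the service is running and something on the network path is likely blocking the port.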
The workflow (overview)
Graylog has one huge advantage: it is very easy to get data into it. On top of that, it offers a ton of extra features for businesses. The following picture shows one way to add data into Graylog:
If you run Graylog in your homelab, you need two things:
- set up your data source -> maybe forward syslog messages to a port and IP.
- configure an input -> you need the same port, and if you have a firewall, allow the traffic
-> If you do not change anything else, you can already work with it and see the data in Graylog. Bigger companies have different needs and separate or parse the data with the extra features. If you work as a SOC Analyst, an engineer has done this work for you. If you learn it anyway, you might see how you can improve and speed up your own work environment. If you have a feeling that data is missing and you know a bit about the background, you might find the issue very fast.
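To try out the two steps in a homelab, you can send a single test message from Python instead of reconfiguring a whole client. This is a minimal sketch, assuming a Syslog UDP input is already configured in Graylog. The port 1514, the address 127.0.0.1, and the function name `send_test_message` are assumptions for illustration; use the IP and port of your own input.

```python
import logging
import logging.handlers


def send_test_message(host, port, text):
    """Send one syslog message over UDP to host:port."""
    # SysLogHandler uses UDP by default when given a (host, port) tuple.
    handler = logging.handlers.SysLogHandler(address=(host, port))
    logger = logging.getLogger("graylog-homelab-test")
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep the message out of other log handlers
    logger.addHandler(handler)
    try:
        logger.info(text)
    finally:
        logger.removeHandler(handler)
        handler.close()


# Hypothetical target: a Syslog UDP input listening on port 1514.
send_test_message("127.0.0.1", 1514, "hello from the home lab")
```

UDP gives no feedback, so the script does not report whether the message arrived. Search for the text in Graylog; if nothing shows up, check that the input is running and that the firewall allows the port.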
© 2021. This work is licensed under a CC BY-SA 4.0 license