Cassandra - OpenTelemetry Collector
The Cassandra - OpenTelemetry app is a log-based app that helps you monitor the availability, performance, health, and resource utilization of your Cassandra clusters. Preconfigured dashboards provide insight into resource utilization, cache, Gossip, and Memtable statistics, and errors and warnings. Cassandra logs are sent to Sumo Logic through the OpenTelemetry filelog receiver.
The app supports logs from the open-source version of Cassandra and is tested on Cassandra version 3.11.10.
Fields creation in Sumo Logic for Cassandra
The following fields are created as part of the Cassandra app installation, if not already present:
- db.cluster.name. User configured. Enter a name to identify this Cassandra cluster. This cluster name will be shown in the Sumo Logic dashboards.
- db.system. Has a fixed value of cassandra.
- deployment.environment. User configured. Identifies the environment where the Cassandra cluster resides. For example: dev, prod, or qa.
- sumo.datasource. Has a fixed value of cassandra.
Prerequisites
Cassandra has three main logs: system.log, debug.log, and gc.log, which hold general logging messages, debugging messages, and Java garbage collection logs, respectively.
By default, these logs live in ${CASSANDRA_HOME}/logs, but most Linux distributions relocate them to /var/log/cassandra. Operators can change this location, as well as which levels are logged, using the provided logback.xml file. For more details on Cassandra logs, see the Apache Cassandra logging documentation.
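If you are unsure which of the two locations your installation uses, a short script can check both. The following is a minimal sketch, not part of the app: the helper names candidate_log_dirs and find_logs are hypothetical, and it assumes the log files carry exactly the names listed above (some installations rotate gc.log under a different name, and a customized logback.xml can move everything elsewhere).

```python
import os
from pathlib import Path

# Log file names as listed in the prerequisites above.
LOG_NAMES = ("system.log", "debug.log", "gc.log")


def candidate_log_dirs(env=None):
    """Return the directories to check, in order of preference."""
    env = os.environ if env is None else env
    dirs = []
    home = env.get("CASSANDRA_HOME")
    if home:
        # Default location: ${CASSANDRA_HOME}/logs
        dirs.append(Path(home) / "logs")
    # Common location on Linux distributions.
    dirs.append(Path("/var/log/cassandra"))
    return dirs


def find_logs(dirs):
    """Return {log name: path} for the first directory containing system.log."""
    for d in dirs:
        if (d / "system.log").is_file():
            return {name: d / name for name in LOG_NAMES if (d / name).is_file()}
    return {}


if __name__ == "__main__":
    for name, path in find_logs(candidate_log_dirs()).items():
        print(f"{name}: {path}")
```

The path printed for system.log is the value to supply when configuring the integration in Step 2.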
Configure Cassandra Logs Collection
Step 1: Set up Collector
If you want to use an existing OpenTelemetry Collector, you can skip this step by selecting the Use an existing Collector option.
To create a new Collector:
- Select the Add a new Collector option.
- Select the platform where you want to install the Sumo Logic OpenTelemetry Collector.
This will generate a command that you can execute in the machine environment you need to monitor. Once executed, it will install the Sumo Logic OpenTelemetry Collector.
Step 2: Configure integration
In this step, you will configure the YAML required for Cassandra collection, including the path of the log file from which Cassandra logs are captured.
Below are the inputs required:
- The path to system.log. This file is typically located in /var/log/cassandra. If you're using a customized path, check the respective conf file for this information.
You can add any custom fields that you want tagged on the data ingested into Sumo Logic. Click the Download YAML File button to get the YAML file.
Step 3: Send logs to Sumo
Once you have downloaded the YAML file as described in the previous step, follow the steps below for your platform.
Linux:
- Copy the YAML file to the /etc/otelcol-sumo/conf.d/ folder on the Cassandra instance that needs to be monitored.
- Restart the collector using:
sudo systemctl restart otelcol-sumo
After successfully executing the above command, Sumo Logic will start receiving data from your host machine.
Click Next. This will install the app (dashboards and monitors) to your Sumo Logic Org.
Dashboard panels will start to fill automatically. It's important to note that each panel fills with data matching the time range query and received since the panel was created. Results won't immediately be available, but within 20 minutes, you'll see full graphs and maps.
Sample Log
INFO [ScheduledTasks:1] 2023-01-08 09:18:47,347 StatusLogger.java:101 - system.schema_aggregates
Sample Query
The following query is from the Nodes Up panel of the Cassandra app's Overview dashboard:
%"sumo.datasource"=cassandra %"deployment.environment"=* %"db.cluster.name"=* "INFO"
| json "log" as _rawlog nodrop
| if (isEmpty(_rawlog), _raw, _rawlog) as _raw
| parse regex field=_raw "(?<level>[A-Z]*) *\[(?<thread_name>[^\]]*?)[:_-]?(?<thread_id>[0-9]*)\] (?<Date>.{10} .{12}) *(?<source_file>[^:]*):(?<source_line>[0-9]*) - (?<message>.*)"
| if (message matches "InetAddress * is now UP",1,0) as UP
| timeslice 1d
| sum(UP) as UP by _timeslice
| sort by _timeslice asc
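To see what the parse regex extracts, it can help to run it outside Sumo Logic against the sample log line. The sketch below is an approximation under stated assumptions: the regex is the one from the query, translated to Python's (?P<name>...) group syntax; fnmatchcase stands in for the query's wildcard matches operator; and grouping by the date inside the log line approximates timeslice 1d, which in Sumo Logic works on message time. UP_LINE is a hypothetical line written for illustration, and the function names are not part of the app.

```python
import re
from collections import Counter
from fnmatch import fnmatchcase

# The parse regex from the query, using Python named-group syntax.
LOG_PATTERN = re.compile(
    r"(?P<level>[A-Z]*) *\[(?P<thread_name>[^\]]*?)[:_-]?(?P<thread_id>[0-9]*)\] "
    r"(?P<Date>.{10} .{12}) *(?P<source_file>[^:]*):(?P<source_line>[0-9]*)"
    r" - (?P<message>.*)"
)

# The sample log line from this page.
SAMPLE = ("INFO [ScheduledTasks:1] 2023-01-08 09:18:47,347 "
          "StatusLogger.java:101 - system.schema_aggregates")
# Hypothetical line, for illustration only.
UP_LINE = ("INFO [GossipStage:1] 2023-01-08 09:20:01,112 "
           "Gossiper.java:1000 - InetAddress /10.0.0.2 is now UP")


def parse_line(line):
    """Return the extracted fields as a dict, or None if the line does not match."""
    m = LOG_PATTERN.search(line)
    return m.groupdict() if m else None


def nodes_up_per_day(lines):
    """Sum the UP flag per day, mirroring the if / timeslice / sum steps."""
    counts = Counter()
    for line in lines:
        if "INFO" not in line:          # keyword filter from the query scope
            continue
        fields = parse_line(line)
        if fields is None:
            continue
        day = fields["Date"][:10]       # date portion of the log timestamp
        up = fnmatchcase(fields["message"], "InetAddress * is now UP")
        counts[day] += int(up)
    return dict(sorted(counts.items()))  # sort by day, ascending


if __name__ == "__main__":
    print(parse_line(SAMPLE))
    print(nodes_up_per_day([SAMPLE, UP_LINE]))
```

For the sample line, the thread name and thread ID come out as separate fields (ScheduledTasks and 1) because the optional separator class [:_-] splits them.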
Viewing Cassandra Dashboards
Overview
The Cassandra - Overview dashboard provides an at-a-glance view of Cassandra node status, memory usage, and error and warning logs.
Use this dashboard to:
- Identify the number of nodes that are up and down.
- Gain insights into memory: init, used, max, and committed.
- Gain insights into the error and warning logs by thread and node activity.
Cache Stats
The Cassandra - Cache Stats dashboard provides insight into the database cache status, schedule, and items.
Use this dashboard to:
- Monitor cache performance.
- Identify cache usage statistics.
Errors and Warnings
The Cassandra - Errors and Warnings dashboard provides details of the database errors and warnings.
Use this dashboard to:
- Review errors and warnings generated by the server.
- Review thread error and warning events.
Gossip
The Cassandra - Gossip dashboard provides details about communication between Cassandra nodes.
Use this dashboard to:
- Determine nodes with errors resulting in failures.
- Review the node activity and pending tasks.
Memtable
The Cassandra - Memtable dashboard provides insights into memtable statistics.
Use this dashboard to:
- Review flush activity and memtable status.
Resource Usage
The Cassandra - Resource Usage dashboard provides details of resource utilization across Cassandra clusters.
Use this dashboard to:
- Identify resource utilization. This can help you to determine whether resources are over-allocated or under-allocated.