In this guide, you will learn how to install Cassandra on CentOS 8 and configure initial security.
What is Apache Cassandra?
Apache Cassandra is an open-source distributed NoSQL database management system designed to handle large amounts of data across many commodity servers, providing high availability with no single point of failure. Here are the key aspects of Apache Cassandra:
Key Features
- Distributed Architecture: Cassandra is designed as a distributed system, allowing it to seamlessly scale across multiple nodes and handle large volumes of data.
- No Single Point of Failure: Data is replicated across multiple nodes, ensuring high availability and fault tolerance. There is no single point of failure in the system.
- Linear Scalability: Cassandra provides linear scalability by adding more nodes to the cluster. It can handle thousands of nodes across multiple data centers.
- Flexible Schema: Tables are defined through CQL, but the wide-column model allows sparse rows and online schema changes, which suits structured and semi-structured data.
- High Performance: Optimized for fast reads and writes, making it suitable for use cases that require low latency and high throughput.
- Tunable Consistency: Offers tunable consistency levels, allowing developers to choose between strong consistency for critical operations or eventual consistency for improved performance.
- Query Language: Cassandra Query Language (CQL) is used to interact with the database, resembling SQL syntax with additional NoSQL capabilities.
- Built-in Replication: Data is automatically replicated across nodes based on configurable replication factors, ensuring data redundancy and availability.
- Multi-Data Center Replication: Supports replication across multiple data centers, allowing for global distribution of data with low latency access.
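The tunable consistency and replication factor items above come down to simple arithmetic. As a rough sketch: a QUORUM operation has to reach floor(RF / 2) + 1 replicas, and a read is guaranteed to see the latest write when read replicas + write replicas > RF. The function names below are illustrative only and are not part of Cassandra.

```shell
#!/bin/sh
# Illustrative helpers (hypothetical names, not part of Cassandra).

# quorum RF: number of replicas a QUORUM read or write must reach.
quorum() {
    echo $(( $1 / 2 + 1 ))
}

# is_strong R W RF: succeeds when the read and write replica sets must overlap.
is_strong() {
    [ $(( $1 + $2 )) -gt "$3" ]
}

rf=3
q=$(quorum "$rf")
echo "RF=$rf needs $q replicas for QUORUM"
if is_strong "$q" "$q" "$rf"; then
    echo "QUORUM reads plus QUORUM writes overlap on at least one replica"
fi
```

With a replication factor of 3, reading and writing at ONE (1 + 1 = 2) does not satisfy the overlap rule, which is the eventual-consistency trade-off mentioned above.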
Use Cases
- Big Data Applications: Used in big data environments where scalability, high availability, and performance are critical, such as real-time analytics and IoT applications.
- Time Series Data: Suitable for storing and analyzing time-series data, such as logs, sensor data, and financial transactions.
- Content Management: Used in content management systems where flexible schema design and horizontal scalability are essential.
- Messaging and Chat Applications: Powers messaging platforms and chat applications that require low-latency data access and high availability.
- Recommendation Systems: Supports recommendation engines by efficiently storing and retrieving user data and preferences.

Apache Cassandra Ecosystem
- Apache Cassandra: The core database system.
- DataStax Enterprise: A commercial distribution of Cassandra with additional enterprise features.
- Cassandra Query Language (CQL): SQL-like language for interacting with Cassandra.
- Cassandra Drivers: Client libraries in various programming languages (e.g., Java, Python, Node.js) for application integration.
- Cassandra Tools: Various tools for monitoring, management, and data modeling.
Summary
Apache Cassandra is a robust NoSQL database known for its distributed architecture, high availability, linear scalability, and schema flexibility. It is widely adopted in industries requiring scalable and fault-tolerant solutions for handling large-scale data operations.

Environment Specification
We are using a KVM-based CentOS 8 virtual machine with the following specification.
- CPU – 3.4 GHz (2 cores)
- Memory – 2 GB
- Storage – 20 GB
- Operating System – CentOS Linux 8.2
- Hostname – cassandra-01.centlinux.com
- IP Address – 192.168.116.206/24
Update your Linux Operating System
Connect to cassandra-01.centlinux.com as the root user by using an SSH client.
As a best practice, update the existing software packages on your Linux operating system.
# dnf update -y
Verify the version of the active Linux kernel by using the uname command.
# uname -r
4.18.0-193.6.3.el8_2.x86_64
Verify the version of your Linux operating system.
# cat /etc/redhat-release
CentOS Linux release 8.2.2004 (Core)
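If you script this check, the release line can be parsed instead of read by eye. The following is a minimal sketch; release_major is a hypothetical helper name, and the fallback string is simply the sample output shown above.

```shell
#!/bin/sh
# release_major TEXT: print the major version from a redhat-release style line.
release_major() {
    printf '%s\n' "$1" | sed -n 's/.*release \([0-9][0-9]*\).*/\1/p'
}

# Use the real file when present; otherwise fall back to the sample from this guide.
if [ -r /etc/redhat-release ]; then
    line=$(cat /etc/redhat-release)
else
    line='CentOS Linux release 8.2.2004 (Core)'
fi

major=$(release_major "$line")
if [ "$major" = "8" ]; then
    echo "OK: major release $major"
else
    echo "Warning: this guide targets release 8, found '${major:-unknown}'" >&2
fi
```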
Install Cassandra Yum Repository
The Apache Software Foundation provides official yum repositories for each version of Cassandra.
Add the yum repository as described on the Cassandra download page.
Create a repo file by using the vi text editor.
# vi /etc/yum.repos.d/cassandra.repo
Add the following directives to this file.
[cassandra]
name=Apache Cassandra
baseurl=https://downloads.apache.org/cassandra/redhat/311x/
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://downloads.apache.org/cassandra/KEYS
Here, 311x refers to the Apache Cassandra 3.11 release series.
It is the latest version at the time of this writing, so we are using it. If you want to install another version of Apache Cassandra, update the version number in the repo file accordingly.
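For unattended setups, the same repo file can be generated from a script instead of typed into an editor. This is only a sketch: write_cassandra_repo is a hypothetical helper, the series argument is assumed to match a directory name under the baseurl shown above, and the demo writes to a scratch path rather than /etc/yum.repos.d.

```shell
#!/bin/sh
# write_cassandra_repo SERIES DEST: write a yum repo file for one release series.
# SERIES is the directory name under .../cassandra/redhat/ (311x in this guide).
write_cassandra_repo() {
    cat > "$2" <<EOF
[cassandra]
name=Apache Cassandra
baseurl=https://downloads.apache.org/cassandra/redhat/$1/
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://downloads.apache.org/cassandra/KEYS
EOF
}

# Generate the file in a scratch location and print it for review.
repo_file="${TMPDIR:-/tmp}/cassandra.repo"
write_cassandra_repo 311x "$repo_file"
cat "$repo_file"
```

After reviewing the output, copy the file to /etc/yum.repos.d/cassandra.repo as root, or pass that path as the second argument directly.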
Build the yum cache for the newly added repository. Accept the GPG keys if asked to do so.
# dnf makecache
CentOS-8 - AppStream 7.3 kB/s | 4.3 kB 00:00
CentOS-8 - Base 5.0 kB/s | 3.9 kB 00:00
CentOS-8 - Extras 162 B/s | 1.5 kB 00:09
Apache Cassandra 2.1 kB/s | 3.6 kB 00:01
Metadata cache created.
The Apache Cassandra 3.11 yum repository has been added to your Linux server.
Install Cassandra on CentOS 8
Apache Cassandra requires a JVM (Java Virtual Machine) to run. You could install Java explicitly, but installing Cassandra with the dnf command automatically pulls in all required dependencies, including Java.
Therefore, install Cassandra directly on your CentOS 8 server by using the dnf command.
# dnf install -y cassandra
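Since Java arrives as a dependency, it can be worth confirming which version ended up on the PATH. The sketch below parses the first line of java -version; java_major is a hypothetical helper, and the two version schemes it handles (1.8.0_262 and 11.0.8) are the common OpenJDK formats rather than an exhaustive list.

```shell
#!/bin/sh
# java_major VERSION_LINE: print the major Java version from the first line of
# `java -version` output, e.g. 'openjdk version "1.8.0_262"' -> 8.
java_major() (
    v=$(printf '%s\n' "$1" | sed -n 's/.*version "\([^"]*\)".*/\1/p')
    case "$v" in
        1.*) v=${v#1.}; echo "${v%%.*}" ;;   # legacy scheme: 1.8.0_262 -> 8
        *)   echo "${v%%.*}" ;;              # modern scheme: 11.0.8 -> 11
    esac
)

if command -v java >/dev/null 2>&1; then
    # java -version prints to stderr, so redirect it before taking line 1.
    line=$(java -version 2>&1 | head -n 1)
    echo "Detected Java major version: $(java_major "$line")"
else
    echo "java not found on PATH" >&2
fi
```

On the server used in this guide, the package manager installed OpenJDK 1.8.0, so the expected major version is 8.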
cqlsh (Cassandra Query Language Shell) requires Python to run, so you also need to install Python.
The cqlsh shipped with Cassandra 3.11 is only compatible with Python 2.7. Therefore, install that version on your Linux server.
# dnf install -y python2
The Cassandra package ships a SysV init script rather than a native systemd unit, so use the legacy commands to start and enable it.
# service cassandra start
Starting cassandra (via systemctl): [ OK ]
# chkconfig cassandra on
Verify the status of cassandra.service.
# systemctl status cassandra.service
● cassandra.service - LSB: distributed storage system for structured data
Loaded: loaded (/etc/rc.d/init.d/cassandra; generated)
Active: active (running) since Sat 2020-08-01 11:18:50 PKT; 51s ago
Docs: man:systemd-sysv-generator(8)
Main PID: 48050 (java)
Tasks: 50 (limit: 12331)
Memory: 1.1G
CGroup: /system.slice/cassandra.service
└─48050 /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.262.b10-0.el8_2.x86_64>
Aug 01 11:18:46 cassandra-01.centlinux.com systemd[1]: Starting LSB: distribute>
Aug 01 11:18:46 cassandra-01.centlinux.com runuser[47978]: pam_unix(runuser:ses>
Aug 01 11:18:50 cassandra-01.centlinux.com cassandra[47966]: Starting Cassandra>
Aug 01 11:18:50 cassandra-01.centlinux.com systemd[1]: Started LSB: distributed>
Use the nodetool command to verify the status of your Cassandra cluster.
# nodetool status
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns (effective) Host ID Rack
UN 127.0.0.1 70.01 KiB 256 100.0% 7d916cdb-8065-42d0-97c0-c88c68b68aa3 rack1
Apache Cassandra has been installed on your Linux server.
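In the nodetool output, the first column of each node row combines status and state, and UN means Up/Normal. A script can count those rows instead of relying on a visual check. This is a small sketch; count_up_normal is a hypothetical helper that only looks at the first column.

```shell
#!/bin/sh
# count_up_normal: read `nodetool status` output on stdin and print the number
# of node rows whose first column is UN (Up/Normal).
count_up_normal() {
    awk '$1 == "UN" { n++ } END { print n + 0 }'
}

if command -v nodetool >/dev/null 2>&1; then
    up=$(nodetool status 2>/dev/null | count_up_normal)
    echo "Nodes in UN state: $up"
else
    echo "nodetool not found; run this on the Cassandra host" >&2
fi
```

For the single-node installation in this guide, the count should be 1 once the service has finished starting.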
Configure NoSQL Database Security
The configuration files for Apache Cassandra are located in the /etc/cassandra/conf directory.
It is good practice to back up the original configuration file before modifying it. Therefore, create a copy of the original cassandra.yaml configuration file as follows.
# cd /etc/cassandra/conf/
# cp cassandra.yaml cassandra.yaml.bkp
Now, edit this file by using the vi text editor.
# vi /etc/cassandra/conf/cassandra.yaml
Locate the following parameters in this file.
authenticator: AllowAllAuthenticator
authorizer: AllowAllAuthorizer
roles_validity_in_ms: 2000
permissions_validity_in_ms: 2000
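Update their values as follows. Setting both validity periods to 0 disables the roles and permissions caches, so changes apply immediately at the cost of extra lookups.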
authenticator: org.apache.cassandra.auth.PasswordAuthenticator
authorizer: org.apache.cassandra.auth.CassandraAuthorizer
roles_validity_in_ms: 0
permissions_validity_in_ms: 0
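The same four changes can be applied non-interactively with sed, which helps when you configure several nodes. This is a sketch under the assumption that each parameter appears once, uncommented, at the start of a line, as in the default file; enable_auth is a hypothetical helper, and the demo runs against a scratch file rather than the live configuration.

```shell
#!/bin/sh
# enable_auth FILE: switch the four settings from this section in a
# cassandra.yaml-style file. sed keeps the previous version as FILE.bak.
enable_auth() {
    sed -i.bak \
        -e 's|^authenticator:.*|authenticator: org.apache.cassandra.auth.PasswordAuthenticator|' \
        -e 's|^authorizer:.*|authorizer: org.apache.cassandra.auth.CassandraAuthorizer|' \
        -e 's|^roles_validity_in_ms:.*|roles_validity_in_ms: 0|' \
        -e 's|^permissions_validity_in_ms:.*|permissions_validity_in_ms: 0|' \
        "$1"
}

# Demonstrate on a scratch copy containing the default values.
demo="${TMPDIR:-/tmp}/cassandra-auth-demo.yaml"
cat > "$demo" <<'EOF'
authenticator: AllowAllAuthenticator
authorizer: AllowAllAuthorizer
roles_validity_in_ms: 2000
permissions_validity_in_ms: 2000
EOF
enable_auth "$demo"
cat "$demo"
```

On the server, you would call enable_auth /etc/cassandra/conf/cassandra.yaml as root and compare the result with the .bak copy before restarting the service.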
Restart the Cassandra service for the changes to take effect.
# systemctl restart cassandra.service
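Cassandra can take a little while after a restart before it accepts CQL connections, so a login attempted immediately may fail. A generic retry wrapper is one way to handle that in scripts. The sketch below is plain POSIX shell; retry is a hypothetical helper, and the attempt count and delay in the commented example are arbitrary.

```shell
#!/bin/sh
# retry ATTEMPTS DELAY COMMAND...: run COMMAND until it succeeds, sleeping
# DELAY seconds between tries; fail after ATTEMPTS unsuccessful tries.
retry() (
    attempts=$1
    delay=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            exit 0
        fi
        # Only sleep when another attempt will follow.
        [ "$i" -lt "$attempts" ] && sleep "$delay"
        i=$(( i + 1 ))
    done
    exit 1
)

# Example for the Cassandra host (commented out because it needs a running server):
# retry 12 5 cqlsh -u cassandra -p cassandra -e 'SELECT release_version FROM system.local;'
```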
Create a Database Admin user
Connect to the cqlsh prompt by using the default Cassandra username and password.
# cqlsh -u cassandra -p cassandra
Connected to Test Cluster at 127.0.0.1:9042.
[cqlsh 5.0.1 | Cassandra 3.11.7 | CQL spec 3.4.4 | Native protocol v4]
Use HELP for help.
cassandra@cqlsh>
Create a database admin user by executing the following command.
cassandra@cqlsh> CREATE ROLE ahmer WITH PASSWORD = 'Ahmer@1234' AND SUPERUSER = true AND LOGIN = true;
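If the role is created from a script, the statement has to be assembled safely, because a single quote inside the password would otherwise end the CQL string early. CQL escapes a single quote by doubling it. The sketch below only builds and prints the statement; cql_quote and create_admin_cql are hypothetical helpers, and the role name is assumed to be a simple unquoted identifier.

```shell
#!/bin/sh
# cql_quote TEXT: wrap TEXT in single quotes for CQL, doubling embedded quotes.
cql_quote() {
    printf "'%s'" "$(printf '%s' "$1" | sed "s/'/''/g")"
}

# create_admin_cql ROLE PASSWORD: print the CREATE ROLE statement from this guide.
create_admin_cql() {
    printf 'CREATE ROLE %s WITH PASSWORD = %s AND SUPERUSER = true AND LOGIN = true;\n' \
        "$1" "$(cql_quote "$2")"
}

create_admin_cql ahmer 'Ahmer@1234'
```

The printed statement can then be passed to cqlsh with its -e option while connected as the default cassandra user.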
Exit from cqlsh prompt.
cassandra@cqlsh> exit
Connect to cqlsh again as the new admin user.
# cqlsh -u ahmer -p Ahmer@1234
Connected to Test Cluster at 127.0.0.1:9042.
[cqlsh 5.0.1 | Cassandra 3.11.7 | CQL spec 3.4.4 | Native protocol v4]
Use HELP for help.
ahmer@cqlsh>
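Passing the password with -p leaves it in the shell history and the process list. For the 3.11 series used here, cqlsh can instead read credentials from an [authentication] section in ~/.cassandra/cqlshrc. The following is a sketch only: write_cqlshrc is a hypothetical helper, the demo writes to a scratch path, and newer Cassandra releases may prefer a different credentials mechanism.

```shell
#!/bin/sh
# write_cqlshrc USER PASSWORD DEST: write a cqlshrc with an [authentication]
# section and restrict it to the owner.
write_cqlshrc() {
    ( umask 077
      mkdir -p "$(dirname "$3")"
      printf '[authentication]\nusername = %s\npassword = %s\n' "$1" "$2" > "$3"
    )
    # Tighten permissions even if the file already existed with a looser mode.
    chmod 600 "$3"
}

# Demo writes to a scratch path; cqlsh itself reads ~/.cassandra/cqlshrc.
demo="${TMPDIR:-/tmp}/cqlshrc-demo/cqlshrc"
write_cqlshrc ahmer 'Ahmer@1234' "$demo"
ls -l "$demo"
```

With the file in place at ~/.cassandra/cqlshrc, running cqlsh without -u and -p should log in as that user.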
For better security, it is always advisable to remove or disable the default users. Therefore, revoke the admin role and login permission from the cassandra user.
ahmer@cqlsh> ALTER ROLE cassandra WITH PASSWORD = 'cassandra' AND SUPERUSER = false AND LOGIN = false;
Revoke all permissions from the cassandra user.
ahmer@cqlsh> REVOKE ALL PERMISSIONS ON ALL KEYSPACES FROM cassandra;
Grant all permissions to the new admin user.
ahmer@cqlsh> GRANT ALL PERMISSIONS ON ALL KEYSPACES TO ahmer;
Exit from cqlsh prompt.
ahmer@cqlsh> exit
Apache Cassandra has been configured. It is now ready to become part of a Cassandra cluster.
Read Also: How to install Cassandra-web on CentOS 8
Conclusion
Installing Apache Cassandra on CentOS 8 allows you to leverage a powerful, scalable, and distributed NoSQL database for handling large volumes of data with high availability.
By following the installation steps, configuring the necessary dependencies, and starting the Cassandra service, you can set up a reliable database environment. Once installed, ensure that Cassandra is running properly, optimize performance settings, and begin managing your data efficiently.
FAQs
What is Apache Cassandra, and why is it used?
Apache Cassandra is a distributed NoSQL database designed for high availability, fault tolerance, and scalability, making it ideal for handling large amounts of data across multiple servers.
What are the prerequisites for installing Cassandra on CentOS 8?
Cassandra runs on the Java Virtual Machine (JVM). In this guide, dnf installs OpenJDK 8 automatically as a dependency of the cassandra package, so the main prerequisites are a configured Cassandra yum repository and Python 2.7 for cqlsh.
Why is Apache Cassandra preferred over traditional relational databases?
Cassandra offers horizontal scalability, high fault tolerance, and decentralized architecture, making it more suitable for handling big data applications compared to traditional relational databases.
What configurations are required after installing Cassandra?
After installation, the cassandra.yaml configuration file should be adjusted to define cluster name, seed nodes, replication settings, and other performance optimizations based on the deployment environment.
How can I verify that Apache Cassandra is running correctly?
After starting the Cassandra service, check its status with systemctl status cassandra.service and nodetool status, review the log files for errors, and use the cqlsh shell to connect to the database and execute queries.