Investigating CQL injection in Apache Cassandra

While working on new security checks, our security research team at Invicti looked at the possibility of automating injection attacks for many types of NoSQL databases. This post summarizes our findings about query language injection attacks against Apache Cassandra and shows why such CQL injections are difficult to perform and automate.

What is Apache Cassandra?

Apache Cassandra is a free, open-source, distributed NoSQL database. The project was first started in early 2009 and is now one of the most commonly used NoSQL databases. Organizations choose Cassandra for its ability to scale up quickly and seamlessly. It is specifically designed for use cases that require large transaction volumes distributed across different geographical regions.

How is Cassandra different from other databases?

Unlike many other NoSQL databases, Cassandra stores its data in rows and columns, so at first glance, it may appear similar to relational databases. However, in a Cassandra table, each row is a single piece of structured data, while the columns are used to define partitions and partition keys. Cassandra uses partitions to efficiently distribute data across any number of interconnected nodes (instances), with provisions for replicating data across multiple nodes for high availability, disaster recovery, and improved performance.

There are many other interesting aspects of Cassandra database structure, architecture, and data modeling; see the official Cassandra documentation to learn more.

What is Cassandra Query Language (CQL)?

The Cassandra Query Language (CQL) is the primary language for communicating with Apache Cassandra databases. A Thrift API was also available in the past but is no longer supported as of Cassandra version 4. In any case, CQL is simpler to use than the Thrift API, especially as it is very similar to SQL.

Most of the basic operators, keywords, and identifiers in CQL are similar to those used in SQL. However, since Cassandra has its own data model and structure that is not based on relational tables, CQL has some unique features and keywords, like KEYSPACE or USING TTL. Conversely, some relational clauses are missing from CQL, notably JOIN, FOREIGN KEY, and UNION.

The importance of CQL client drivers

To connect to a Cassandra database from your application, you need the right client driver. Because each new version of Cassandra brings major changes to the database and the CQL language, there are many client drivers available, though most are specific to one version and not maintained beyond that. The most commonly used drivers are developed by DataStax (a company that offers a commercial database based on Cassandra), so open-source DataStax drivers are what we used for our application security research. Crucially, the client drivers can also provide some additional security features.

Exploring CQL injection

Injection attacks are possible wherever unsanitized user input is passed to a back-end system, including a database. Unlike traditional SQL injection, where you can use broadly the same attack techniques for all relational databases, NoSQL injection requires attacks aimed at specific databases, with MongoDB being probably the most popular target. The same approach should work against a Cassandra database, using carefully crafted CQL queries to perform CQL injection attacks.

Using CQL injection for unauthenticated access

To test if CQL injection is possible, we created a vulnerable login page that does a simple database lookup using unsanitized user input. The application queries a Cassandra database that has a users table, using the following unsafe CQL query to authenticate users:

SELECT * FROM users WHERE username='[user_input]' AND password='[user_input]' ALLOW FILTERING;

In this case, the following user inputs are enough to perform a CQL injection attack for an authentication bypass:

Username: admin'/*
Password: */ and password >'

The resulting query will be:

SELECT * FROM users WHERE username='admin'/*' AND password='*/ and password >'' ALLOW FILTERING;

The query was supposed to look for a valid user and password combination, but the injection payload comments out the original password comparison and replaces it with password > '', which only checks that a non-empty password exists. This query will run successfully, potentially allowing an attacker to log in as admin without knowing the password.
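The string handling behind this bypass can be reproduced without a database. The following is a minimal Python sketch, assuming the application builds the query by plain string concatenation; the function name build_login_query and the users table name are illustrative and are not taken from the tested application.

```python
def build_login_query(username: str, password: str) -> str:
    """Build the login query by unsafe string concatenation (the vulnerable pattern)."""
    return (
        "SELECT * FROM users WHERE username='" + username
        + "' AND password='" + password + "' ALLOW FILTERING;"
    )


# Payloads from the example: the username opens a block comment,
# the password closes it and appends a new condition.
injected = build_login_query("admin'/*", "*/ and password >'")
print(injected)
```

Printing the result shows where the block comment opens and closes: everything between /* and */ is ignored, so only the username check and the injected password condition remain.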

This kind of payload can be useful to at least check that injection is possible, but to automate the attack and obtain confirmation, we needed more. Unfortunately, at the moment, there are no known payloads to extract additional information from the table or the database itself. This is mainly due to the many limitations that the CQL language and client drivers impose compared to typical SQL injection techniques.

Why CQL injection is harder than SQL injection

SQL injections are among the oldest and best-researched web attacks, and CQL syntax is very similar to SQL, so it makes sense to try to apply SQLi techniques to CQL injection. As we found out, seemingly minor differences compared to SQL add up to serious limitations that prevented us from using typical SQL injection techniques. The limitations below were identified and tested in a test environment with Cassandra 4.0.3, DataStax Python driver 3.25.0, and a Python application based on Flask.

Missing injection-friendly language constructs and functions

SQL injections extensively use table joining operations to grab data from additional tables (union-based in-band SQL injection). Cassandra is a non-relational database, so there are no JOIN or UNION statements in CQL, making it hard to access other tables.
Cassandra has no convenient built-in functions like DATABASE() or USER() to retrieve database information.
There is no OR operator in CQL, so we cannot use it to set up always-true conditions: a CQL query like SELECT * FROM table WHERE col1='a' OR col2='b'; will be rejected.
Time-based blind SQL injections rely on SLEEP() or a similar function to induce a delay, but there is no SLEEP() function in CQL, making time-based injections very hard (if not impossible).
There are no built-in functions that could be used to send network requests, so there is no easy way to perform out-of-band verification (typically done by listening for DNS resolution requests) as with out-of-band SQL injections.

WHERE clause limitations

Columns that do not have secondary indexes cannot be filtered with WHERE clauses (unless ALLOW FILTERING is specified). For example, if column col1 is not a primary key and does not have a secondary index, the following CQL query will be rejected by the database: SELECT * FROM table WHERE col1='asd';
Only valid column names can be specified in WHERE clauses (unlike in SQL), so trying to add an always-true condition like SELECT * FROM table WHERE column1='a' AND '1'='1'; will not work as there is no column named '1'. This greatly limits the scope of available payloads, especially for boolean-based detection and attempts to make a query return all rows in a table.

SELECT clause limitations

CQL does not support echo-type queries such as SELECT 'text'; or SELECT 3;, eliminating a useful technique for further exploitation and confirmation.
A fundamental SQLi technique is to discard the end of a query by injecting a comment. When trying this with CQL in our application test environment, a query like SELECT * FROM table WHERE user='admin';// AND pass='pass'; fails. Note that this query is syntactically correct and works when directly connected to a Cassandra database. However, our tests found that the application client drivers do not allow comments at the end of queries. This is another major limitation since we are now restricted to using only valid queries and conditions.

Restrictions on keys

If a query places restrictions on clustering keys but not on partition keys, it will not work without the ALLOW FILTERING keyword. To make matters worse, ALLOW FILTERING can only be specified at the end of a query, which limits our injection options to the last condition in a WHERE clause (injecting anywhere else would result in an invalid query that would be rejected).
Queries are rejected if you do not specify all the clustering key columns. For example, if col3 is part of a clustering key, a query like SELECT * FROM table WHERE col1='a' and col2='b' and col3='c' and col4='d' and col5='e'; would work, but SELECT * FROM table WHERE col1='a' and col2='b' and col4='d' and col5='e'; would be rejected because col3 is missing. In practice, this means an attacker would need to know all the clustering key columns to prepare a valid payload.

No nested or stacked queries

CQL does not allow subqueries or other nested statements, so a query like SELECT * FROM table WHERE column=(SELECT column FROM table LIMIT 1); would be rejected. This eliminates many classic SQL injection tricks, especially for boolean-based injections.
Client drivers for CQL do not allow stacked queries (multiple queries separated by semicolons). This means we cannot run arbitrary queries by stacking them.
CQL allows you to create user-defined functions (UDFs) that can include arbitrary code. In the past, this feature has caused vulnerabilities such as remote code execution via CVE-2021-44521. However, since we cannot run stacked queries, we cannot create user-defined functions without directly connecting to the database.

CQL injections in the wild

CQL injection is still a relatively new topic, and while there are some blog posts out there, some of the information you will find is outdated or inaccurate. We tested existing posts about potential CQL injection scenarios like the one shown earlier and found that most of the suggested injection payloads do not work, at least not in our test environment.

As of this writing, no CQL injection vulnerabilities have been found and disclosed in bug bounty programs. There are also no CVEs assigned to CQL injection vulnerabilities (apart from the special case of user-defined functions mentioned above). No open-source tools are available for finding CQL injections, and almost all existing NoSQL injection tools are specific to MongoDB injection, with no specific CQL injection payloads.

Mitigating CQL injection vulnerabilities

All injection attacks are made possible by applications directly using unsanitized user-controllable data. As with SQL injection vulnerabilities, you can eliminate the risk of CQL injections by using parameterized queries (prepared statements) that prevent raw user input from making it into database queries. For example, the following code runs a parameterized CQL query:

// The ? placeholders are bound by the driver instead of being concatenated into the query text
const query = 'SELECT * FROM table WHERE col1=? and col2 =?';
const params = ['username', 'password'];
// prepare: true makes the driver execute the query as a prepared statement
client.execute(query, params, { prepare: true }, callback);
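Since the test environment described earlier used the DataStax Python driver, the same idea can also be sketched in Python. This is a minimal sketch under assumptions: the function name find_user, the users table, and the keyspace in the usage comment are illustrative, and the session argument is expected to behave like the driver's Session object (prepare, execute, and a result set with one()).

```python
def find_user(session, username: str, password: str):
    """Look up a user with a prepared statement instead of string concatenation.

    `session` is assumed to behave like cassandra.cluster.Session.
    """
    # The ? placeholders are bound by the driver, so user input is never
    # spliced into the CQL text. A real application would prepare once and
    # reuse the statement rather than preparing on every call.
    prepared = session.prepare(
        "SELECT * FROM users WHERE username=? AND password=? ALLOW FILTERING"
    )
    rows = session.execute(prepared, (username, password))
    return rows.one()  # first matching row, or None when nothing matches


# Hypothetical usage against a local test node (needs the cassandra-driver package):
#   from cassandra.cluster import Cluster
#   session = Cluster(["127.0.0.1"]).connect("demo_keyspace")
#   row = find_user(session, "admin'/*", "*/ and password >'")
```

With this approach, the payload from the login example is compared as a literal username and password value, so it cannot change the structure of the query.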

In addition to using parameterized queries, it is always good practice to validate and filter user inputs, and also to apply context-sensitive encoding depending on how the data is used.
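As one possible form of input validation, an allow-list check can reject values that contain quote or comment characters before they reach the database layer. The sketch below is only an illustration: the pattern (3 to 32 letters, digits, underscores, dots, or hyphens) and the name is_valid_username are assumptions, and a real application would choose rules that match its own username policy.

```python
import re

# Assumed policy for illustration: 3-32 characters from a small allow-list.
USERNAME_PATTERN = re.compile(r"[A-Za-z0-9_.-]{3,32}")


def is_valid_username(value: str) -> bool:
    """Return True only if the entire value matches the allow-list pattern."""
    return USERNAME_PATTERN.fullmatch(value) is not None


print(is_valid_username("admin"))     # True
print(is_valid_username("admin'/*"))  # False: quote, slash and asterisk are not allowed
```

A check like this complements parameterized queries and does not replace them, since validation rules vary by field and are easy to get wrong.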

Is Apache Cassandra secure from injection attacks?

Our tests have shown that, like other NoSQL injection types, CQL injection attacks are technically possible. However, due to limitations imposed by both the CQL language itself and the client drivers, it is very difficult to perform any practically useful CQL injections and exploit them further. Despite the syntactic similarities between SQL and CQL, very few of the standard SQL injection techniques can be successfully used against Cassandra databases.

Apart from the very limited example described above, we have found no way to perform useful CQL injection attacks or extract data using any of the known techniques in our test environment. This makes Apache Cassandra a fairly secure database choice when it comes to injections, especially if fundamental secure coding practices are followed in development. However, this is still a new area of study with plenty of room for research on bypassing the limitations listed here, so it is likely that new methodologies will eventually emerge for exploiting CQL injection vulnerabilities.
