Search functionality has gone from a cool feature to a commodity – everybody expects search to be there and simply work. At the same time, the infrastructures, data models, and back-ends behind the friendly search bar can now be extremely complex, requiring novel approaches that go far beyond sending a query to a database server. Elasticsearch was developed to meet that need and is now the world’s most popular enterprise search engine, so knowing how to use it properly is crucial for data security – especially as it’s very easy to get this wrong.
Before we get into some examples of insecure usage, let’s take a step back to see how we got to where we are today with Elasticsearch.
A silent paradigm shift in web development
Before the cloud era, web development was a lot less specialized. Sure, we had our fair share of frameworks that required specific knowledge to build a functioning application, but their job was mostly to abstract away all the tedious work that came with building a web application, such as session, cookie, or user management. In practice, you could treat these features as extensions of existing programming languages, so learning them was a matter of remembering a few new functions and patterns.
The past few years, however, have seen a major (if silent) shift. While earlier frameworks and technologies were mostly designed to make a developer’s job easier, more recent projects are increasingly focused on enabling programmers to write applications that are more technically robust and advanced under the hood. This is especially true for highly distributed systems where you need completely different approaches to ensure scalability, reliability, and performance.
None of this was really surprising given the success and expansion of cloud-based systems, especially with big technology companies releasing their highly specialized tooling for anyone to use. One reason was to attract an open-source community that would build upon – and of course improve – the technologies they were using for their own products.
Enter Elasticsearch
In a distributed world, many of these new technologies had to be dramatically different from what was used before, forcing a change to the way we write our applications. Elasticsearch is a good example of a technology that looks familiar but represents a completely different approach to searching data. Although it’s been around for quite a while, its usefulness for all sorts of business applications was not immediately apparent to a wider audience. Before, you’d have mostly used a traditional SQL database, such as MySQL – or Microsoft SQL Server, as a lot of our readers are painfully aware.
But Elasticsearch had several features that would make it even more popular among developers, such as the fact that it sports a JSON-based API – a feature that anyone who’s ever had to write an SQL query of slightly above-average complexity would happily pay a large sum of money for. Another reason good old SQL queries fell a bit out of favor is that they’ve historically been a leading cause of unauthorized surprise backups. Clearly, there was a need for alternatives.
What are search templates?
The alternative we want to talk about today is search templates, specifically Mustache templates as used in Elasticsearch (and supported for all major programming languages). Imagine you want to provide search functionality on your website, for example to search for blog posts. Specifically, you want to give users the ability to filter results by their own criteria, such as the ID of the post. Here’s one example of such a template:
{
  "script": {
    "lang": "mustache",
    "source": {
      "query": {
        "match": {
          "postID": {
            "query": "{{ID}}"
          }
        }
      }
    }
  }
}
As you can see, we have specified our search criteria within the template and allow for variable user input between double braces. The template is conveniently written in JSON format and matches the postID field against a value that we want to get from the user. The {{ID}} template string will be expanded to reflect actual user input and automatically sanitized, so we can embed user input in the template knowing we’re relatively safe from injection – that is, if the template is written correctly.
Mustache equals triple trouble
As you can see in the example above, a simple placeholder in a Mustache template uses double braces, so something like {{username}}. When the template is processed, everything will be automatically sanitized so that it does not interfere with the syntax of whatever you’re putting the placeholder in.
So far, so good – but according to the documentation, this isn’t the only type of placeholder:
All variables are HTML escaped by default. If you want to return unescaped HTML, use the triple mustache: {{{name}}}.
Depending on what templating system you are already familiar with and what you were using it for, you may be more accustomed to using such triple brace syntax by default, going for {{{username}}} instead of {{username}}. And that will work equally well for something like a typical username, so you might not notice the catch. The docs say you are now returning unescaped HTML, in effect using the raw input value within the template before it’s passed to Elasticsearch for evaluation.
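The difference between the two placeholder forms can be sketched with a toy renderer. This is a simplified simulation, not Elasticsearch’s or any Mustache library’s actual implementation: the render function and PLACEHOLDER pattern are hypothetical, and it assumes the double brace form escapes values for the JSON context the template lives in.

```python
import json
import re

# Hypothetical toy renderer: matches {{{name}}} (group 1) or {{name}} (group 2).
PLACEHOLDER = re.compile(r"\{\{\{\s*(\w+)\s*\}\}\}|\{\{\s*(\w+)\s*\}\}")


def render(template: str, params: dict) -> str:
    """Substitute placeholders in a single pass over the template."""

    def substitute(match: re.Match) -> str:
        raw_name, escaped_name = match.groups()
        if raw_name is not None:
            # Triple braces: the value is inserted exactly as supplied.
            return str(params.get(raw_name, ""))
        # Double braces: assumed to JSON-escape the value
        # (json.dumps adds surrounding quotes, which we strip).
        return json.dumps(str(params.get(escaped_name, "")))[1:-1]

    return PLACEHOLDER.sub(substitute, template)


value = 'Mar"ley'  # input containing a double quote
safe = render('{"query": "{{name}}"}', {"name": value})
unsafe = render('{"query": "{{{name}}}"}', {"name": value})

print(safe)    # {"query": "Mar\"ley"}
print(unsafe)  # {"query": "Mar"ley"}
```

With an ordinary value such as Marley, both forms produce identical output, which is why the difference is easy to miss. As soon as the value contains a double quote, only the escaped rendering is still valid JSON.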
Unsanitized user input being used directly in data queries? Sounds like a classic injection vulnerability. Let’s explore.
A vulnerable code example
Before diving into a vulnerable query, let’s define the data we’re querying. Say we have an index called kitties containing documents with name and age fields. Here’s the index, with just two documents:
{
  ...
  "hits": [
    {
      "_index": "kitties",
      "_id": "2",
      "_score": 1.0,
      "_source": {
        "name": "Mila",
        "age": 3
      }
    },
    {
      "_index": "kitties",
      "_id": "1",
      "_score": 1.0,
      "_source": {
        "name": "Marley",
        "age": 2
      }
    }
  ]
  ...
}
If we can find a vulnerable template somewhere that uses the triple brace syntax to search this index, we may be able to return all the documents – even if we don’t know a single name to put in the query. A vulnerable search template might look something like this:
POST /_scripts/vuln-search-template HTTP/1.1
Host: example.com
Content-Length: 263
...
{
  "script": {
    "lang": "mustache",
    "source": {
      "query": {
        "match": {
          "name": {
            "query": "{{{name}}}"
          }
        }
      }
    }
  }
}
You can see the {{{name}}} placeholder here. Before the search is executed, the placeholder is replaced with the supplied name value. In theory, the name in the index must match the input for a document to be returned. If we pass the name Marley as below, Elasticsearch will return the document with that feline’s name and age:
POST /kitties/_search/template HTTP/1.1
Host: example.com
...
{
  "id": "vuln-search-template",
  "params": {
    "name": "Marley"
  }
}
But now for the catch: the placeholder uses triple brace syntax, so it is not escaped or sanitized. This means we can easily change the meaning of the JSON query. To start with, we can break out of the string that contains {{{name}}} with just a double quote character. By injecting "zero_terms_query":"all" into the query, we can effectively turn this into a match-all query, which allows us to return all documents (the equivalent of the old 'or 1 = 1' trick for SQL injection). Here is an example exploit:
POST /kitties/_search/template HTTP/1.1
Host: example.com
...
{
  "id": "vuln-search-template",
  "params": {
    "name": "\", \"zero_terms_query\":\"all\"}}}}"
  }
}
In practical terms, if an Elasticsearch template uses a triple brace placeholder and you know the placeholder name, sending a query like the one above could expose all data from that index – not good.
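To see why the payload changes the query structure, here is a minimal offline sketch that performs the substitution by hand. It does not talk to Elasticsearch. The constant names are hypothetical, plain string replacement stands in for template rendering, the double brace form is assumed to JSON-escape its value, and raw_decode stands in for a parser that stops after the first complete JSON object (how the server treats the leftover characters is not modeled here).

```python
import json

# Template sources as they might look once serialized to a string
# (hypothetical names; whitespace is irrelevant here).
RAW_TEMPLATE = '{"query": {"match": {"name": {"query": "{{{name}}}"}}}}'
ESCAPED_TEMPLATE = '{"query": {"match": {"name": {"query": "{{name}}"}}}}'

# The decoded value of the "name" parameter from the exploit request.
payload = '", "zero_terms_query":"all"}}}}'

# Triple braces: the value is pasted in verbatim.
rendered_raw = RAW_TEMPLATE.replace("{{{name}}}", payload)

# Double braces: assumed to JSON-escape the value before inserting it.
rendered_escaped = ESCAPED_TEMPLATE.replace(
    "{{name}}", json.dumps(payload)[1:-1]
)

# Parse the first complete JSON object and keep whatever trails it.
injected, end = json.JSONDecoder().raw_decode(rendered_raw)
leftover = rendered_raw[end:]

harmless = json.loads(rendered_escaped)

print(injected["query"]["match"]["name"])
# {'query': '', 'zero_terms_query': 'all'}
print(harmless["query"]["match"]["name"]["query"] == payload)
# True
```

In the raw rendering, the attacker-supplied quote closes the string early and zero_terms_query becomes a real key of the match clause. In the escaped rendering, the same input stays a single string value that is merely searched for.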
Doubling down on search template security
So where does this leave us? The problem is not really in the way templates are used. In fact, search templates are a powerful mechanism for sending repetitive queries with changing parameters, such as when using a search bar on a website. Used correctly, they also add a layer of security through automatic input encoding and sanitization.
The security takeaway here is that when working with templates in Elasticsearch, you need to be very careful to use the right placeholder syntax. Any search template that uses a triple brace placeholder is vulnerable to injection and may reveal your entire search index and data to attackers, so it’s a good idea to enforce the safer double brace syntax – and also make sure you’re not running legacy or third-party code that uses the insecure version.
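One way to enforce this is to scan template sources for unescaped placeholders before they are deployed. The following is a rough sketch rather than a complete linter: the function name and pattern are hypothetical, it only inspects a template source given as a string, and it also flags the {{& name}} form, which Mustache defines as another way to skip escaping (whether your Elasticsearch version accepts that form is an assumption to verify).

```python
import re
from typing import List

# Matches {{{name}}} (group 1) or {{& name}} (group 2).
UNESCAPED = re.compile(
    r"\{\{\{\s*([^{}]+?)\s*\}\}\}|\{\{&\s*([^{}]+?)\s*\}\}"
)


def find_unescaped_placeholders(source: str) -> List[str]:
    """Return the names of all unescaped placeholders in a template source."""
    return [triple or ampersand for triple, ampersand in UNESCAPED.findall(source)]


vulnerable = '{"query": {"match": {"name": {"query": "{{{name}}}"}}}}'
safe = '{"query": {"match": {"postID": {"query": "{{ID}}"}}}}'

print(find_unescaped_placeholders(vulnerable))  # ['name']
print(find_unescaped_placeholders(safe))        # []
```

A check like this could run in code review or CI against every stored template, including those that come from legacy or third-party code.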