I recently ran into a problem with SOLR queries running incredibly slowly in production, eventually crashing our SOLR instances under user load. The issue could not really be recreated in lower environments, which made it difficult to troubleshoot and pinpoint.
The symptoms reminded me of the slow burn of a memory leak: things seemed fine at first, but when it finally toppled over, performance tanked drastically.
There really was no indication of what had happened: there would be a burst of requests from a bot, the site would topple over a few moments later, and the SOLR instances would slowly go offline and get stuck in recovery mode. Analyzing the bot traffic showed it was not doing anything a normal user would not do; it was just doing it faster. Even after mitigating the bot traffic, the issue persisted, it just took longer to show. So the investigation continued.
Long story short, the issue ended up being a bad commit that made it up the chain and was finally exposed by the increased volume on the site. The bad commit had adjusted SearchMaxResults to 2,147,000,000 for testing.
The reason this is a problem is that SOLR pre-allocates memory for the rows based on this value, which comes through as a URL parameter on every query.
The rest of this post explores that scenario in a little more detail and explains what is actually happening. I find an example helpful when understanding these sorts of things, so I will provide one along the way.
In Sitecore, there is a layer of abstraction that translates LINQ requests into SOLR queries. Just something to keep in mind while reading.
Take the following LINQ (specifically, LINQ for Sitecore and SOLR), which finds all items with a price of 1024 in a custom index called Products, where we index these items specifically:
using Sitecore.ContentSearch;
using Sitecore.ContentSearch.SearchTypes;

// Search object, used for mapping SOLR fields.
// This also shows that SOLR field names do not have to match the model's property names.
public class ProductSearchItem : SearchResultItem
{
    [IndexField("productname")]
    public string Name { get; set; }

    [IndexField("productprice")]
    public float? Price { get; set; }

    [IndexField("productdescription")]
    public string Description { get; set; }

    [IndexField("productid")]
    public float SKU { get; set; }
}
// Create a context to search with (the context is disposable, so wrap it in a using)
using (var context = ContentSearchManager.GetIndex(Constants.ProductIndexName).CreateSearchContext())
{
    // Search for products matching the price using the search item
    var productsAt1024 = context.GetQueryable<ProductSearchItem>().Where(p => p.Price == 1024);
}
To break this down a little: here we have a custom object inheriting from the Sitecore base SearchResultItem. This allows me to also pull and query some of the default SOLR fields like ‘version’, ‘datasource’, or even ‘templateId’, if they are indexed.
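For instance, since the model inherits from SearchResultItem, you can filter on one of those base fields with no extra mapping. A quick sketch (the template ID here is a made-up placeholder):

// Filter on a base SearchResultItem field (TemplateId) alongside the custom ones;
// the GUID below is purely illustrative
var productsByTemplate = context.GetQueryable<ProductSearchItem>()
    .Where(p => p.TemplateId == new Sitecore.Data.ID("{11111111-1111-1111-1111-111111111111}"));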
Next, the query itself. I define the context against the custom index set up for these products, using a constants file where this index name is defined among other index names. This just keeps things cleaner and easier to maintain.
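For reference, that constants file is nothing more than a static class along these lines (the class and the index name are assumptions for this example, matching the log entry below):

// Hypothetical constants file keeping index names in one place
public static class Constants
{
    public const string ProductIndexName = "products_index_main";
}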
If you have logging turned on, you should see this query appear in the latest search log under App_Data/logs. It should have an entry like the one below, where you can pretty much see how the query is broken down.
INFO Solr Query - ?q=(productprice_tf:(1024) )&start=0&rows=1000&fq=_indexname:(products_index_main)&wt=xml
Sidenote: You can take the query (?q) and run it against SOLR directly to see a bit of extra data, or enable a debug query to get more stats. This is also helpful in determining whether the bottleneck is in Sitecore or SOLR.
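For example, assuming a local SOLR instance on the default port, and assuming the core shares the index name from the log above (both placeholders for your environment), the direct request could look like this, with debugQuery=true adding timing and query-parsing details to the response:

http://localhost:8983/solr/products_index_main/select?q=productprice_tf:(1024)&fq=_indexname:(products_index_main)&rows=1000&debugQuery=true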
This in itself is nothing crazy. To make sense of it: you have said "here is my query" and "I want rows 0 through 1000" (1000 being Sitecore's default), and SOLR assumes the result set may be that large, so it pre-allocates that much memory for the request.
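Both of those values can be controlled from the LINQ side as well. In the Sitecore provider, Skip translates to the start parameter and Take translates to rows, so a sketch like this (reusing the query from above) keeps the allocation explicit:

// Explicit paging: Skip maps to 'start' and Take maps to 'rows',
// so this logs start=50&rows=25 instead of the defaults
var page = context.GetQueryable<ProductSearchItem>()
    .Where(p => p.Price == 1024)
    .Skip(50)
    .Take(25);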
Now, the cost of a query like this is minimal and has little impact. Let's assume there are 50 items with this price. That is a fairly low number, and SOLR is built to handle returning large result sets without issue. The 1000-row default presumably exists for a reason, but even the Sitecore documentation recommends adjusting it if it is too large for your use case.
This default value can be adjusted in the Sitecore config; however, it is recommended to use a config patch file. (Here I set the base value to 250.)
<?xml version="1.0" encoding="utf-8"?>
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/" xmlns:set="http://www.sitecore.net/xmlconfig/set/">
  <sitecore>
    <settings>
      <setting name="ContentSearch.SearchMaxResults" set:value="250" />
    </settings>
  </sitecore>
</configuration>
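A patch file like this goes under App_Config/Include (or a subfolder) so it gets merged into the Sitecore configuration at startup, and you can confirm the value took effect with the /sitecore/admin/showconfig.aspx page.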
The higher this number (the rows), the worse performance can become under load. If you make the query use an incredibly large row size like 2,147,000,000 (just shy of int.MaxValue), you can really break your setup. Even if the query returns quickly the one time, having to pre-allocate that same amount of memory for, say, 1,000+ unique requests can cause an out-of-memory condition and crash the server quite easily.
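To make the failure mode concrete, the logged query from the incident would have looked something like this, with the pathological rows value applied to every single request:

INFO Solr Query - ?q=(productprice_tf:(1024) )&start=0&rows=2147000000&fq=_indexname:(products_index_main)&wt=xml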
That being said, be careful with row counts, and if you know how many rows you actually need, it is not a bad idea to request exactly that. Here is the earlier example, where I pull just the first 25 results.
// Create a context to search with
using (var context = ContentSearchManager.GetIndex(Constants.ProductIndexName).CreateSearchContext())
{
    // Search for products matching the price, taking only the first 25 results
    var productsAt1024 = context.GetQueryable<ProductSearchItem>()
        .Where(p => p.Price == 1024)
        .Take(25);
}
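With that in place, the logged query should now show the much smaller allocation, something like:

INFO Solr Query - ?q=(productprice_tf:(1024) )&start=0&rows=25&fq=_indexname:(products_index_main)&wt=xml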
For more information, I'd check out the article written by SearchStax on row count and how it impacts performance. They provide more detailed graphs from some of the testing they've done.
