Troubleshooting Solr Issues - SearchStax

This page contains notes on how to resolve common SearchStax® user issues.

To open a support ticket with Measured Search®, send email to support@measuredsearch.com.


Update the Solr Schema

Many users have asked how to update the Solr schema file on their deployments.

Uploading a revised schema file as part of a new configuration is easy. The critical point is that a schema change invalidates the collection's existing index: the index must be wiped and the documents reloaded as part of a schema update. The process takes four steps.

1. Update Schema (upload a new configuration to Zookeeper).

When you are ready to deploy your new schema file, simply use zkcli to upload your configuration directory to Zookeeper. (See zkcli download instructions.)

Run zkcli in a terminal window:

./zkcli.sh \
    -zkhost <zookeeper URL> -cmd upconfig -confdir <conf directory> -confname <config name>

where <zookeeper URL> is the URL of the ZooKeeper Ensemble shown on the deployment details page; <conf directory> is your local Solr configuration directory (../configsets/basic_configs/conf/); and <config name> is the name of the configuration in ZooKeeper (test).
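
For example, with an illustrative ensemble URL and the sample values above, the command might look like this:

# Upload the local conf directory to ZooKeeper under the configuration name "test".
./zkcli.sh -zkhost ss180178-1-zk-us-west-2-aws.measuredsearch.com:2181 \
    -cmd upconfig -confdir ../configsets/basic_configs/conf/ -confname test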

2. Delete Data (delete existing index)

To delete the existing data from your collection, open a terminal window and run the following cURL command:

curl 'https://<load_balancer_URL>/solr/<collection_name>/update?stream.body=<delete><query>*:*</query></delete>&commit=true'

where <load_balancer_URL> is the Solr load balancer URL from the deployment details page, and <collection_name> is the name of the Solr collection (testcollection).

The phrase <delete><query>*:*</query></delete> is literal. This query matches all of the records in the collection's index and deletes them.
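
If your Solr version does not accept the stream.body parameter, sending the same delete command in the request body should work as well:

# Post the delete-all command as the request body instead of a URL parameter.
curl -X POST -H 'Content-Type: text/xml' \
    --data-binary '<delete><query>*:*</query></delete>' \
    'https://<load_balancer_URL>/solr/<collection_name>/update?commit=true'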

3. Reload Collection (distribute new configuration from Zookeeper)

The next step is to reload the collection so that the Solr servers pick up the new configuration from the ZooKeeper ensemble.

curl 'https://<load_balancer_URL>/solr/admin/collections?action=RELOAD&name=<collection_name>'

where <load_balancer_URL> is the Solr load balancer URL from the SearchStax dashboard, and <collection_name> is the name of the Solr collection (testcollection).

4. Reload Data (re-ingest documents)

We presume that you already know how to ingest documents, but here's a reminder:

Run the following cURL command from a terminal window.

curl -X POST -H 'Content-type:application/json' -d <datafile_path> \
    'https://<load_balancer_URL>/solr/<collection_name>/update?commit=true'

where <load_balancer_URL> is the Solr load balancer URL from the deployment details page, <datafile_path> is the location of the document file (@sample.json), and <collection_name> is the name of the Solr collection (testcollection).
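
If you just need a small test file, a minimal sample.json might look like the sketch below; the field names (id, title) are placeholders and must exist in your schema:

# Hypothetical minimal document file; adjust the field names to match your schema.
cat > sample.json <<'EOF'
[
  { "id": "1", "title": "First test document" },
  { "id": "2", "title": "Second test document" }
]
EOF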


Low Disk Space Warning

Many users have needed assistance with low disk space.

It is common for a user to begin with a single-node deployment while learning how to deploy Solr to the cloud. This is often a Basic deployment on the SB1 plan (one node, one ZooKeeper instance, 1 GB RAM, 8 GB SSD storage).

It is easy to overload an SB1 deployment. If this appears to be the problem, you can either reduce the amount of data stored on the node or upgrade to a larger deployment plan.

The SearchStax team is more than willing to assist you with deployment upgrades. Send email to support@measuredsearch.com.

Outages and Timeouts

Several users asked for help diagnosing server outages and client connection timeouts.

Client connection timeouts can be due to resource saturation on the Solr nodes, such as high CPU usage, memory pressure, or long JVM garbage-collection pauses, or to network problems between the client and the deployment.

Check the CPU, Memory, JVM display under Monitoring on the left-side menu of the deployment's SearchStax dashboard. Look for unusually high readings that correspond to the connection problems reported from the field.
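
As a quick client-side check, you can also time a request against the deployment with an explicit connection timeout. The sketch below assumes the default ping handler is available for your collection:

# Give up after 5 seconds if no connection is established; report the status code and total time.
curl --connect-timeout 5 -sS -o /dev/null \
    -w 'HTTP %{http_code} in %{time_total}s\n' \
    'https://<load_balancer_URL>/solr/<collection_name>/admin/ping'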

Gradually Increasing Disk Usage

A client noticed that a deployment's disk usage was growing even though there were no updates to the Solr index.

The team investigated and found that verbose logging was filling up the disk. SearchStax engineers deleted the log files, reset the logging level, and restarted Solr. Downtime was minimal.
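
If you want to inspect or lower the logging level yourself, Solr exposes a logging endpoint; a hedged example (verify the parameter syntax for your Solr version) is:

# Raise the root logger to WARN to cut down on log volume.
curl 'https://<load_balancer_URL>/solr/admin/info/logging?set=root:WARN'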

Upgrade Solr 5.x to 6.x

Clients sometimes ask if it is possible to upgrade a Solr 5.x deployment to Solr 6.x. Users in this situation usually wish to keep the same servers and IP addresses. Otherwise they would create a new deployment using Solr 6.x, reload their data, and delete the old deployment.

The SearchStax engineers can perform a 5-to-6 upgrade for you, subject to a few compatibility concerns that they will review with you before proceeding.

Send email to support@measuredsearch.com.

Add NewRelic Analytics

Several users needed help integrating NewRelic analytics with SearchStax.

See the NewRelic tutorial for detailed instructions on enabling NewRelic Analytics in SearchStax.

Servers, Shards, Replicas

Clients sometimes ask for help determining an appropriate number of servers, shards, and replicas.

The number of shards is a decision for your Solr engineers to make. We advise starting with one.

The number of servers is likewise your decision, based on volume of data and query loading. This will change over time, and can change seasonally depending on your domain.

The number of replicas should match the number of servers. This will ensure that your data is replicated on each of the nodes. If one node becomes unavailable, the full index can still be queried.

If you change the number of servers in your deployment to meet seasonal demands, SearchStax does not automatically adjust the number of shards or replicas. That adjustment is up to you. See AddReplica in the Solr documentation.
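
For reference, adding a replica after you add a server is a single Collections API call; a sketch with placeholder names is shown below:

# Add one replica of shard1; Solr chooses a node unless you specify one explicitly.
curl 'https://<load_balancer_URL>/solr/admin/collections?action=ADDREPLICA&collection=<collection_name>&shard=shard1'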

Zookeeper Connection Failure

A few clients reported this issue.

A SearchStax user reported a connection failure the first time he used the zkcli script to upload his Solr configuration. (See zkcli download instructions.)

The error message resembled this one:

Caused by java.util.concurrent.TimeoutException: Could not connect to ZooKeeper <instance URL> within 30000 ms

This failure was due to a blocked port 2181 (the ZooKeeper port) in the user's local firewall.
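
A quick way to check whether port 2181 is reachable from your machine is ZooKeeper's "ruok" four-letter command; a healthy, reachable node answers "imok" (note that some ZooKeeper configurations restrict these commands):

# No response usually means the port is blocked between you and the ensemble.
echo ruok | nc <zookeeper host> 2181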

IP Filtering Issues

A few users had difficulty setting up IP filtering.

Open the SearchStax dashboard for the deployment. Look in the left-side menus for Security > IP filter.

All listed IP addresses will have access to Solr, ZooKeeper or Silk, as you indicate. Note that the default entry is 0.0.0.0/0, which allows unrestricted access. You must remove this entry to enable IP access restrictions.

If you accidentally lock yourself out, contact MeasuredSearch for assistance.

Remove Configuration from ZooKeeper

A few users asked how to clear Solr configuration files from ZooKeeper.

Removing a Solr configuration from ZooKeeper is easy to do using the zkcli script.

Linux:

$ ./zkcli.sh -zkhost <zookeeper URL> -cmd clear /configs/<configuration name>

Windows:

> zkcli.bat -zkhost <zookeeper URL> -cmd clear /configs/<configuration name>

where <zookeeper URL> is the ZooKeeper Ensemble URL from your deployment details page, and <configuration name> is ZooKeeper's internal name for this configuration (test1).

For example:

./zkcli.sh -zkhost ss180178-1-zk-us-west-2-aws.measuredsearch.com:2181 -cmd clear /configs/test1

SwitchOnRebuild Configuration

Users sometimes ask about Sitecore's "Switch on Rebuild Solr Index" configuration, which uses a secondary core that can be read from while the primary core is being rebuilt.

There is no reason why "Switch on Rebuild" could not work with SearchStax. You can create any number of collections within your deployments. However, keep in mind that the total size of the indexed data will roughly double, so the deployment sizing has to take that into consideration.
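
Creating the secondary collection is an ordinary Collections API call. A sketch, with a placeholder collection name and illustrative shard and replica counts, reusing a configuration already uploaded to ZooKeeper:

# Create a second collection for the rebuild; adjust numShards and replicationFactor to match your deployment.
curl 'https://<load_balancer_URL>/solr/admin/collections?action=CREATE&name=<rebuild_collection_name>&numShards=1&replicationFactor=2&collection.configName=<config name>'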

Long Query Latency

A user asked us to investigate unusual query latency (long waits) during peak traffic periods.

Examination of Solr and load-balancer logs in this case showed no significant internal delays.

A test script run at an East Coast data center showed normal latency, but the same script run from a European data center showed latency spikes. The team concluded that, in this case, the problem was network latency between the test client and the deployment. This could be due to routing behavior or to how heavily those network routes are utilized during high-traffic periods.
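
If you need to reproduce this kind of measurement, a simple loop that records the total request time from a given location is usually enough; the query and collection name below are placeholders:

# Run a trivial query ten times and print the round-trip time of each request.
for i in $(seq 1 10); do
    curl -sS -o /dev/null -w '%{time_total}s\n' \
        'https://<load_balancer_URL>/solr/<collection_name>/select?q=*:*&rows=0'
done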

Banana and Silk

A user inquired about using Lucidworks Banana with SearchStax.

The team advised him that SearchStax supports Silk, which is an updated version of the same code. Silk is a fork of Kibana, an open-source (Apache-licensed), browser-based analytics and search dashboard, adapted for Solr. Silk strives for ease of use while remaining flexible and powerful.

See our Silk integration page for more information.

Create Deployment did not Finish

A user reported that the automated setup of a new deployment failed to complete.

The team determined that the user had done nothing wrong. There was an infrastructure issue at the U.S. East (N. Virginia) AWS region. We deleted the incomplete deployment and created a duplicate for him.

Please contact MeasuredSearch immediately if there is any problem with a deployment.

cURL CREATE Failed

A user reported that a cURL create-collection command failed with an inexplicable error. The cURL command itself was correct.

The team eventually determined that the schema.xml file contained an illegal attribute. In this case it was a fieldType entry configured with a multiValued property. This property can be applied to fields, but not to field types. Core creation failed as a result.
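
For reference, multiValued belongs on the field definition, not on the field type; a corrected fragment (the field and type names here are only illustrative) looks like this:

<!-- multiValued is a property of the field, not of its fieldType -->
<fieldType name="string" class="solr.StrField" sortMissingLast="true"/>
<field name="tags" type="string" indexed="true" stored="true" multiValued="true"/>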

Change Account Owner

A user needed assistance changing the owner of a SearchStax account, due to a personnel change.

Contact MeasuredSearch for assistance in this kind of situation.

Customizing Authentication

A user asked how to configure authentication to give everyone read access, but restrict write access to authenticated users.

The team was able to provide a modified security.json file that produced the desired behavior.

The user was cautioned not to turn authentication off and on again via the SearchStax dashboard, since turning on authentication overwrites the existing security.json file with a default file.

To view the security.json file, use this zkcli script command:

./zkcli.sh -zkhost <zookeeper URL> -cmd get /security.json > security.json

For further information see the Rule-Based Authorization Plugin page in the Solr documentation.
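
As a rough illustration only (the file the team provided may differ), a security.json that leaves reads open while restricting updates to an authenticated role follows the Rule-Based Authorization Plugin structure, roughly like this; the user name, role name, and password hash are placeholders:

{
  "authentication": {
    "class": "solr.BasicAuthPlugin",
    "blockUnknown": false,
    "credentials": { "solradmin": "<hashed password and salt>" }
  },
  "authorization": {
    "class": "solr.RuleBasedAuthorizationPlugin",
    "permissions": [
      { "name": "update", "role": "writer" }
    ],
    "user-role": { "solradmin": "writer" }
  }
}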

Query Solr from JavaScript

A user asked for help sending queries to Solr from JavaScript.

The team suggested using the JSONP format, as follows:

$.ajax({
    type: 'GET',
    dataType: 'jsonp',   // JSONP works around cross-origin restrictions in the browser
    data: {
        'q': '*:*',
        'wt': 'json'
    },
    jsonp: 'json.wrf',   // Solr wraps the JSON response in this callback parameter
    url: '<load_balancer_URL>/solr/kmterms_dev/select',
    success: function(msg) {
        console.log(msg);
    }
});

Data Loaded but No Query Results

A user asked for help when his collection contained data but queries came back empty.

In this instance it turned out that the user had forgotten to commit the index before querying. This is a common experience when users are loading and querying for the first time.
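
Issuing the missing commit is a one-line call:

# Make previously indexed documents visible to searches.
curl 'https://<load_balancer_URL>/solr/<collection_name>/update?commit=true'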

Null Pointer Exception required Solr Upgrade

A user experienced a situation where missing values in a field caused a node to crash. Most of his collections were still running.

The problem was traced to a Solr bug that had been fixed in more recent Solr releases. The user was advised to upgrade.

The user created a new deployment using a recent Solr release and kept the old deployment in service while rebuilding the site on the new one. When the new site was ready, he switched over and deleted the original deployment.

Scaling Down a Holiday Deployment

A user scaled up a deployment to meet query demand over the December holiday season. Later he asked for assistance scaling the deployment back.

The team advised him to delete the replica servers one per day while monitoring the query load on the remaining servers. This let the user economize while still protecting the system's responsiveness.

If you change the number of servers in your deployment to meet seasonal demands, SearchStax does not automatically adjust the number of shards or replicas. That adjustment is up to you. See AddReplica in the Solr documentation.
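
Removing a replica is likewise a Collections API call; in the sketch below, the shard and replica names are placeholders that you can read from the Solr admin UI:

# Remove one named replica from shard1 of the collection.
curl 'https://<load_balancer_URL>/solr/admin/collections?action=DELETEREPLICA&collection=<collection_name>&shard=shard1&replica=<replica_name>'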