Sitecore Experience Database xDB Maintenance Guidelines

March 24, 2017

The purpose of this blog post is to provide guidelines for maintaining Sitecore Experience Database (xDB) components, in particular MongoDB and Solr, after a successful go-live.

These guidelines are aimed at customers running a scaled Sitecore XP environment. The most common high-level architecture we see is as follows:

  • Two Content Delivery (CD) instances
  • Dedicated Content Management (CM) instance
  • Dedicated Processing Server
  • MongoDB replica set
  • SOLR for search and analytics index
  • SQL Server for Master + Web + Core + Reporting Databases

Below are a few recommendations pertaining to MongoDB, SOLR, and Sitecore.

MongoDB

The MongoDB administration documentation addresses the ongoing operation and maintenance of MongoDB instances and deployments. This documentation includes both high level overviews of these concerns as well as tutorials that cover specific procedures and processes for operating MongoDB.

A couple of key sections are listed below for reference.

Operations Checklist

Reference: https://docs.mongodb.com/manual/administration/production-checklist-operations/

Configuration and Maintenance

Reference: https://docs.mongodb.com/manual/administration/configuration-and-maintenance/

Backup

Reference: https://docs.mongodb.com/manual/core/backups/

Security

Reference: https://docs.mongodb.com/manual/security/

Monitoring

MongoDB includes a number of monitoring tools out of the box. You can run these against a live MongoDB server to report statistics in real time.

serverStatus

The serverStatus command, or db.serverStatus() from the shell, returns a document that provides an overview of the database’s state, detailing disk usage, memory use, connections, journaling, and index access. Monitoring applications can run this command at a regular interval to collect statistics about the instance. The command returns quickly and does not impact MongoDB performance.

See the manual for details: https://docs.mongodb.com/manual/reference/command/serverStatus/#dbcmd.serverStatus
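
As a rough illustration of how a monitoring job might use this output, the sketch below reduces a serverStatus document to a handful of values. The function name summarizeServerStatus and the choice of fields are assumptions made for this example; the fields themselves (uptime, connections, mem, opcounters) are standard parts of the serverStatus output, and the counters are assumed to arrive as plain numbers.

```javascript
// Hypothetical helper: pick a few values out of a serverStatus document.
// The selection of fields is illustrative, not a Sitecore or MongoDB recommendation.
function summarizeServerStatus(status) {
  var ops = status.opcounters || {};
  var conn = status.connections || {};
  var mem = status.mem || {};
  return {
    uptimeSeconds: status.uptime,
    connectionsCurrent: conn.current,
    connectionsAvailable: conn.available,
    residentMemoryMB: mem.resident,
    // Sum of the basic operation counters since the server started.
    totalOps: (ops.insert || 0) + (ops.query || 0) + (ops.update || 0) + (ops.delete || 0)
  };
}
```

In the mongo shell this could be called as printjson(summarizeServerStatus(db.serverStatus())), and the result stored or compared against the previous sample.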

mongostat

This shows key metrics such as operation counts, memory usage, and replica set status, updating every second. It is useful for real-time troubleshooting because you can see what is going on right now.

See the manual for details: https://docs.mongodb.com/manual/reference/program/mongostat/

mongotop

Whereas mongostat shows global server metrics, mongotop looks at the metrics on a collection level, specifically in relation to reads and writes. This helps to show where the most activity is.

See the manual for details: https://docs.mongodb.com/manual/reference/program/mongotop/

rs.status()

This shows the status of the replica set from the viewpoint of the member you execute the command on. It’s useful to see the state of members and their oplog lag.

See the manual for details: https://docs.mongodb.com/manual/reference/command/replSetGetStatus/#dbcmd.replSetGetStatus
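
To make the idea of lag concrete, here is a minimal sketch that derives per-secondary lag from the document returned by rs.status(). The function name replicationLagSeconds is hypothetical; the members array and its name, stateStr, and optimeDate fields are part of the replSetGetStatus output.

```javascript
// Hypothetical helper: seconds each SECONDARY is behind the PRIMARY,
// based on the optimeDate values reported by rs.status().
function replicationLagSeconds(status) {
  var members = status.members || [];
  var primary = members.filter(function (m) {
    return m.stateStr === "PRIMARY";
  })[0];
  if (!primary) {
    return null; // no primary visible from this member
  }
  return members
    .filter(function (m) { return m.stateStr === "SECONDARY"; })
    .map(function (m) {
      return {
        name: m.name,
        lagSeconds: (primary.optimeDate.getTime() - m.optimeDate.getTime()) / 1000
      };
    });
}
```

In the mongo shell, printjson(replicationLagSeconds(rs.status())) would list each secondary with its lag. What counts as too much lag depends on your write volume and oplog size.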

sh.status()

This shows the status of your sharded cluster, in particular the number of chunks per shard so you can see if things are balanced or not.

See the manual for details: https://docs.mongodb.com/manual/reference/method/sh.status/
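
If you want the chunk distribution as data rather than printed text, one option is to count the documents in the config database's chunks collection per shard. The sketch below assumes you pass it an array of chunk documents, each carrying a shard field; the function name chunksPerShard is made up for this example.

```javascript
// Hypothetical helper: count chunk documents per shard.
// Expects documents shaped like those in config.chunks (each has a "shard" field).
function chunksPerShard(chunks) {
  var counts = {};
  chunks.forEach(function (chunk) {
    counts[chunk.shard] = (counts[chunk.shard] || 0) + 1;
  });
  return counts;
}
```

In the mongo shell the input could come from db.getSiblingDB("config").chunks.find().toArray(). A large difference between shards suggests the cluster is not balanced.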

Additional Help

Custom Monitoring Solution

One could also build a custom monitoring solution using open-source monitoring software such as Nagios.

Performance

Reference: https://docs.mongodb.com/manual/administration/analyzing-mongodb-performance/

Compacting

Running a large MongoDB installation requires a certain amount of routine maintenance. Over time, collections in a MongoDB database can become fragmented, which can be a serious problem if your data usage patterns are relatively unstructured. In the long term, fragmentation can cause your databases to take up more space on disk and in RAM to hold the same amount of data, make many database operations noticeably slower, and significantly reduce your overall query capacity.

Conveniently, MongoDB provides two different ways to compact your data and restore optimal performance: 

repairDatabase

repairDatabase checks and repairs errors and inconsistencies in data storage. repairDatabase is analogous to a fsck command for file systems. Run the repairDatabase command to ensure data integrity after the system experiences an unexpected system restart or crash, if:

  • The mongod instance is not running with journaling enabled. When using journaling, there is almost never any need to run repairDatabase. In the event of an unclean shutdown, the server will be able to restore the data files to a pristine state automatically.
  • There are no other intact replica set members with a complete data set.

WARNING

During normal operations, only use the repairDatabase command and wrappers including db.repairDatabase() in the mongo shell and mongod --repair, to compact database files and/or reclaim disk space. Be aware that these operations remove and do not save any corrupt data during the repair process.

If you are trying to repair a replica set member, and you have access to an intact copy of your data (e.g. a recent backup or an intact member of the replica set), you should restore from that intact copy, and not use repairDatabase.

repairDatabase takes the following form:

{ repairDatabase: 1 }

Example: db.runCommand( { repairDatabase: 1 } )

Using repairDatabase to Reclaim Disk Space

You should not use repairDatabase for data recovery unless you have no other option.

However, if you trust that there is no corruption and you have enough free space, repairDatabase is an appropriate way to reclaim disk space. On the MMAPv1 storage engine, compact does not return space to the operating system, so repairDatabase is the command to use there.

repairDatabase is appropriate if your databases are relatively small, or if you can afford to take a node out of rotation for quite a long time.

See the manual for details: https://docs.mongodb.com/manual/reference/command/repairDatabase/

compact

Rewrites and defragments all data and indexes in a collection. On WiredTiger databases, this command will release unneeded disk space to the operating system.

compact has the following form:

{ compact: <collection name> }

compact only blocks operations for the database it is currently operating on. Only use compact during scheduled maintenance periods.

See the manual for details: https://docs.mongodb.com/manual/reference/command/compact/
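
As a sketch of how a maintenance script might apply compact across a database, the function below runs the command once per collection and records whether each run reported success. The name compactCollections and the decision to skip collections whose names start with "system." are assumptions for this example, not guidance from MongoDB or Sitecore.

```javascript
// Hypothetical helper: run compact on every non-system collection of a database.
// "db" is expected to behave like the mongo shell's db object.
function compactCollections(db) {
  return db.getCollectionNames()
    .filter(function (name) {
      // Assumption: leave internal system.* collections alone.
      return name.indexOf("system.") !== 0;
    })
    .map(function (name) {
      var result = db.runCommand({ compact: name });
      return { collection: name, ok: result.ok === 1 };
    });
}
```

In the mongo shell, printjson(compactCollections(db)) would compact the collections of the current database one after another. Because compact blocks operations on that database, a script like this belongs in a scheduled maintenance window.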

Minimum Recommendation for Regular Maintenance

Based on the above, the table below states the minimum items for regular maintenance.

MongoDB

  • Scheduled Backups: Schedule periodic tests of your backup and restore process to have time estimates on hand, and to verify its functionality.
  • Security Patches: Follow guidance from https://docs.mongodb.com/manual/administration/security-checklist/
  • Monitoring: https://docs.mongodb.com/manual/administration/monitoring/

Component: MongoDB

One Time Basic Setup
  • Monitoring
  • Scheduled Backups

Nightly/Weekly Maintenance
  • Scheduled Backups

Quarterly Maintenance
  • Is upgrade to a major version recommended by Sitecore?
  • Does Sitecore recommend any patches for the installed MongoDB?
  • Log Management
  • Compacting

Ad Hoc Maintenance
  • Vulnerability Patches
  • Issues originating from the monitoring put in place during One Time Basic Setup

SOLR

The SOLR wiki suggests various techniques to keep your SOLR instance healthy. Several key sections are mentioned below for reference.

Operations and Production

  • SolrCaching
  • SolrPerformanceFactors
  • SolrSecurity
  • Built in SolrRequestHandler based SolrReplication
  • Unix script based CollectionDistribution
  • DistributedSearch
  • CollectionRebuilding
  • MergingSolrIndexes
  • SolrOperationsTools
  • SolrJmx and SolrMonitoring

Reference: https://wiki.apache.org/solr

Monitoring

Reference: https://wiki.apache.org/solr/SolrMonitoring
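
One simple check a monitoring job can perform is to call a core's ping request handler and inspect the response. The sketch below assumes the ping handler is enabled for the core and queried with wt=json (for example /solr/<core>/admin/ping?wt=json, where the core name depends on your setup), and that a healthy core answers with a JSON body whose status field is "OK". The function name isSolrPingHealthy is hypothetical.

```javascript
// Hypothetical helper: decide whether a Solr ping response body indicates a healthy core.
// Assumes the body is the JSON text returned by the ping handler with wt=json.
function isSolrPingHealthy(body) {
  var parsed;
  try {
    parsed = JSON.parse(body);
  } catch (e) {
    return false; // not JSON at all, e.g. an HTML error page
  }
  return parsed !== null && typeof parsed === "object" && parsed.status === "OK";
}
```

How the response body is fetched (a scheduled script, Nagios plugin, or similar) is left open here; the function only interprets the text it is given.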

Minimum Recommendation for Regular Maintenance

Based on the above, the table below states the minimum items for regular maintenance.

SOLR

  • Security Patches: https://wiki.apache.org/solr/SolrSecurity
  • Monitoring: https://wiki.apache.org/solr/SolrMonitoring

Component: SOLR

One Time Basic Setup
  • Monitoring
  • Scheduled Backups

Nightly/Weekly Maintenance
  • Scheduled Backups

Quarterly Maintenance
  • Is upgrade to a major version recommended by Sitecore?
  • Does Sitecore recommend any patches for the installed SOLR instance?
  • Log Management

Ad Hoc Maintenance
  • Vulnerability Patches
  • Issues originating from the monitoring put in place during One Time Basic Setup

xDB Sitecore

Upgrades

Keep Sitecore up to date. We recommend you plan and budget for a Sitecore upgrade every 12 months, 18 at the most. In addition to newer versions providing new fixes and features, a more frequent upgrade schedule reduces the overall risk and cost of upgrades over the course of several years.

Patches

If you encounter what you believe to be a bug in Sitecore, you can contact the Sitecore support team and provide details. Please note that you need to be a Sitecore certified developer to contact Sitecore support.

Security

From time to time, security vulnerabilities are discovered. Sitecore releases patches for them and includes the fixes in subsequent updates. For example, Sitecore suggests applying the following fixes:

  1. https://kb.sitecore.net/articles/496731
  2. https://kb.sitecore.net/articles/039942
  3. https://kb.sitecore.net/articles/547255

Performance

To review solution performance, Sitecore recommends the following for consideration

Minimum Recommendation for Regular Maintenance

Based on the above, the table below states the minimum items for regular maintenance.

Sitecore

  • Security Patches: Apply the required security patches for the Sitecore version installed on-premise
  • Upgrades: Upgrade Mongo and SOLR to the appropriate recommended versions that work in conjunction with the Sitecore version installed on-premise

Component: Sitecore

One Time Basic Setup
  • (none listed)

Nightly/Weekly Maintenance
  • Monitor Sitecore Analytics Dashboard

Quarterly Maintenance
  • Apply the required security patches for the Sitecore version installed on-premise
  • Log Management

Ad Hoc Maintenance
  • Vulnerability Patches
  • Issues originating from the monitoring done during Nightly/Weekly Maintenance
