Sitecore Analytics Data Loss: Max Size of Insert Queue Reached

For high traffic websites using Sitecore Analytics, you may run into instances where you find the following warning in your Sitecore logs:

Analystics: Max size of insert queue reached. Dropped 2680.

While this may seem like just a warning, the truth is that you are losing data.

The warning message tells you exactly how many queued database writes have been dropped so far in the current session.

Further Analysis

Sitecore Analytics runs as a multi-threaded background process, collecting requests from worker threads to write data to the Analytics database. When the number of queued requests exceeds the threshold set in your Sitecore.Analytics.config file, a warning is written for every 25 failures discovered, and those additional requests are discarded with no way to recover the data.

Your only solution is to increase the Analytics.MaxQueueSize setting in the Sitecore.Analytics.config file. Just make sure your servers can handle the additional load.
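For example, a patch along these lines in Sitecore.Analytics.config raises the threshold. The value 2000 below is an arbitrary illustration, not a recommendation; size it to what your database servers can sustain:

```xml
<!-- Sketch: raising the insert queue threshold in Sitecore.Analytics.config.
     The value 2000 is an arbitrary example; the shipped default is 500. -->
<setting name="Analytics.MaxQueueSize" value="2000" />
```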

Using Reflector, the method used to queue requests to be written to the database shows the vulnerability. Keep in mind this method is used in:

  • Creating campaign events
  • Updating visitor identity
  • All HTTP and Media request handling

The public static method Enqueue, exposed by the DatabaseSubmitter class in the Sitecore.Analytics.Data namespace, places tracking requests in a queue to be written to the Analytics database.

private static readonly int maxSize = Settings.GetIntSetting("Analytics.MaxQueueSize", 500);

...
public static void Enqueue(IDatabaseSubmittable databaseSubmittable)
{
    Assert.ArgumentNotNull(databaseSubmittable, "databaseSubmittable");
    lock (syncQueue)
    {
        if (queue.Count > maxSize)
        {
            AnalyticsManager.SetStatusFailed(AnalyticsManager.STATUS_QUEUE_FULL);
            
            failed++;
            if ((failed % 0x19) == 1)
            {
                Log.Warn(string.Format("Analystics: Max size of insert queue reached. Dropped {0}.", failed), typeof(DatabaseSubmitter));
            }
        }
        else
        {
            queue.Add(databaseSubmittable);
        }
    }
}
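Note the `failed % 0x19` check: 0x19 is hexadecimal for 25, which is where the "every 25 failures" behavior comes from, and `failed` is cumulative, which is why the logged number keeps growing. A standalone sketch of that counter logic, purely for illustration outside of Sitecore:

```csharp
// Illustration only: the same throttle logic as Enqueue, isolated.
// 0x19 == 25, so a warning fires for failed == 1, 26, 51, ...
int failed = 0;
for (int i = 0; i < 60; i++)
{
    failed++;
    if ((failed % 0x19) == 1)
    {
        Console.WriteLine("Dropped {0}.", failed);
    }
}
// Prints "Dropped 1.", "Dropped 26." and "Dropped 51."
```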

Let's hope future versions of Sitecore Analytics provide better ways of dealing with excessive load than silently dropping data with no way to recover it.


Abandoning the ASP.NET User Session in Sitecore with OMS

Be careful when clearing out a user's ASP.NET session if you're using Sitecore OMS. If you programmatically clear the session using code similar to one or more of the following:


Context.Session.Abandon();

HttpContext.Current.Request.Cookies.Remove("ASP.NET_SessionId");

Response.Cookies.Add(new HttpCookie("ASP.NET_SessionId", string.Empty));

You may notice entries like the following in your Sitecore logs:

    Violation of PRIMARY KEY constraint 'PK_Session_1'. Cannot insert duplicate key in object 'dbo.Session'.

The primary key violation is an indicator that you've cleared out the user's ASP.NET session but neglected to also clear the Analytics session. Since the two sessions are tightly bound in the Analytics data (just take a look at the schema of the Sessions table), breaking this relationship causes data loss.

All recording of page events via OMS produces errors for the given session after hitting the code that abandons the ASP.NET session. You lose all Analytics tracking due to this error, which means any pages a user visits after the ASP.NET session is abandoned are lost.

The Fix

In addition to clearing the ASP.NET session, also clear the Analytics session:


HttpContext.Current.Request.Cookies.Remove("SC_ANALYTICS_SESSION_COOKIE");

Response.Cookies.Add(new HttpCookie("SC_ANALYTICS_SESSION_COOKIE", string.Empty));
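Putting both pieces together, a minimal helper along these lines could wrap the whole teardown. The `SessionHelper` class and `AbandonAllSessions` method are hypothetical names for illustration, not part of the Sitecore API:

```csharp
using System.Web;

// Sketch of the combined fix: abandon the ASP.NET session and expire
// both session cookies in one place.
public static class SessionHelper
{
    public static void AbandonAllSessions()
    {
        HttpContext context = HttpContext.Current;

        // End the ASP.NET session and expire its cookie.
        context.Session.Abandon();
        context.Response.Cookies.Add(new HttpCookie("ASP.NET_SessionId", string.Empty));

        // Expire the Analytics session cookie too, so OMS starts a new
        // session instead of violating PK_Session_1 on dbo.Session.
        context.Response.Cookies.Add(new HttpCookie("SC_ANALYTICS_SESSION_COOKIE", string.Empty));
    }
}
```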

Thanks to Sitecore Support for providing the analysis and recommended approach.

Sitecore OMS Error: Overwhelming Change Notification

Today, our websites generated a stack trace error with an “Out of Memory Exception” message for the whole world to see.

Immediately looking at the IIS logs, we discovered HTTP 500 internal server errors over a 3-minute time span. Using the time stamps generated by IIS, I then compared against the error logs created by Sitecore (typically in [website path]\data\logs).

Looking at the Sitecore logs, sandwiched between two Analytics errors, I see the following:

4304 14:29:31 ERROR Failed to insert Analytics data
Exception: System.Data.SqlClient.SqlException
...

4276 14:30:18 INFO  **************************************************
4276 14:30:18 WARN  Sitecore shutting down
4276 14:30:18 WARN  Shutdown message: Overwhelming Change Notification in C:\inetpub\wwwroot
Overwhelming Change Notification in C:\inetpub\wwwroot
Change in GLOBAL.ASAX
HostingEnvironment initiated shutdown
CONFIG change

4304 14:30:26 ERROR Failed to insert Analytics data
Exception: System.Data.SqlClient.SqlException
...

Having no idea what "Overwhelming change notification" meant or what it was referring to, our team contacted Sitecore directly. It turns out this error is the result of a bug in OMS 1.0.1 (running under Sitecore 6.1): an unhandled exception in the Sitecore OMS module due to improper handling of the AnalyticsLogger class. Their support team was able to reproduce the issue and generate the following error message, related to the one triggered above:

ERROR Unhandled exception detected. The ASP.NET worker process will be terminated.
Exception: System.IndexOutOfRangeException
Message: Index was outside the bounds of the array.
Source: mscorlib
   at System.Collections.Generic.List`1.Add(T item)
. . .

Currently, the only workaround is to disable error logging for Sitecore OMS by following the steps below:

  1. Open the \Website\App_Config\Include\Sitecore.Analytics.config file.
  2. In the Sitecore.Analytics.config file, comment out the following pipeline processor:

    <!-- <processor type="Sitecore.Analytics.Pipelines.HttpRequest.StartDiagnostics, Sitecore.Analytics" patch:after="processor[@type='Sitecore.Pipelines.HttpRequest.StartMeasurements, Sitecore.Kernel']" /> -->

Tough choice here… keep your OMS error logging, or take your chances with random OMS exceptions?