Quick Post: List<int> Intersect Performance

I noticed an interesting performance difference earlier today, so I thought I’d throw together a quick post about it just in case it could be useful for anyone else.  In a project that I’m working on, I have the need to do a good amount of intersect operations on separate collections of integers.  I happened to notice that the performance of this was affected by the order in which the collections were specified.  So lets jump right in and take a look.  I’m doing all of my tests with Visual Studio 2013 in a .NET 4.5 console application compiling for x64.  CPU: Intel Xeon E3-1230 V2 with HT enabled.

using System;
using System.Collections.Generic;
using System.Linq;

namespace CompareSpeed
{
    class Program
    {
        static void Main(string[] args)
        {
            int largeColSize = 10000;
            int smallColSize = 500;
            int iterations = 25000;

            Console.WriteLine("Total Rows: {0}", (largeColSize + smallColSize).ToString("#,0"));
            Console.WriteLine("Ratio of small to large: {0}", (decimal)smallColSize / (decimal)largeColSize);
            Console.WriteLine();

            var rand = new System.Random();
            var largeCollection = new List<int>(largeColSize);
            var smallCollection = new List<int>(smallColSize);

            //Fill large collection
            for (int i = 0; i < largeColSize; i++)
            {
                largeCollection.Add(rand.Next(int.MinValue, int.MaxValue));
            }

            //Fill small collection
            for (int i = 0; i < smallColSize; i++)
            {
                smallCollection.Add(rand.Next(int.MinValue, int.MaxValue));
            }


            var sw = new System.Diagnostics.Stopwatch();

            /*  CODE
             *  GOES
             *  HERE
             *  BASED
             *  ON
             *  TEST
             */

            Console.ReadKey();
        }
    }
}

One of the following code blocks gets placed in the commented section above based on which test I was running.

//Intersect large to small, single-threaded
sw.Start();
for (int i = 0; i < iterations; i++)
    largeCollection.Intersect(smallCollection).ToList();
sw.Stop();
Console.WriteLine("Intersect Lg->Sm Single Thread:\t {0}", sw.ElapsedMilliseconds);

//Intersect small to large, single-threaded
sw.Start();
for (int i = 0; i < iterations; i++)
    smallCollection.Intersect(largeCollection).ToList();
sw.Stop();
Console.WriteLine("Intersect Sm->Lg Single Thread:\t {0}", sw.ElapsedMilliseconds);

//Intersect large to small, parallel
sw.Start();
for (int i = 0; i < iterations; i++)
    largeCollection.AsParallel().Intersect(smallCollection.AsParallel()).ToList();
sw.Stop();
Console.WriteLine("Intersect Lg->Sm Parallel:\t {0}", sw.ElapsedMilliseconds);

//Intersect small to large, parallel
sw.Start();
for (int i = 0; i < iterations; i++)
    smallCollection.AsParallel().Intersect(largeCollection.AsParallel()).ToList();
sw.Stop();
Console.WriteLine("Intersect Sm->Lg Parallel:\t {0}", sw.ElapsedMilliseconds);

Here are the results.  All time are in milliseconds.  Green indicates shortest duration, orange indicates longest duration.

Results

Based on this testing, I think there’s a strong case to be made for performing an intersect using the larger collection on the left side.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s