Index Lookup

I’d like to start off by talking about a method of looking up items in a List or array that I recently started using when I have been able.  For lack of a better term, when referring to this when talking to a colleage I have been calling it an “Index Lookup.”  If you know of a proper term please let me know in the comments for props.

Consider the following simple console application:

    class Program
    {
        private static List<SomeClass> MyList;

        class SomeClass
        {
            private static Random rand = new Random();

            public int ID { get; set; }
            public int Data { get; set; }

            public SomeClass()
            {
                this.Data = rand.Next(-2147483648, 2147483647);
            }

            public void DoSomeAction()
            {
                if (this.Data > -2147483648)
                    this.Data--;
                else
                    this.Data++;
            }
        }

        static void DoActionByID(int ID)
        {
            var FoundInstance = MyList.Single((item) => item.ID == ID);

            Console.WriteLine("Before: {0}", FoundInstance.Data.ToString());
            FoundInstance.DoSomeAction();
            Console.WriteLine("After: {0}", FoundInstance.Data.ToString());
        }


        static void Main(string[] args)
        {
            MyList = new List<SomeClass>();

            for (int i = 0; i < 30000000; i++)
            {
                //the ID is set to (i+1)*2 to simulate an arbitrary ID value from the DB
                //  or file data source
                var CurInstance = new SomeClass() { ID = (i + 1) * 2};

                MyList.Add(CurInstance);
            }


            //Lets assume the user takes some action and identifies item: 37569360
            //  as the item they want to perform an action on
            DoActionByID(37569360);

            Console.ReadKey();
        }
    }

On my system with this sample program the item search takes just under 1 second (987 ms). Not horrible considering the collection size, but I think we can do better.

I recently started working on a project with a colleague of mine that necessitated looking up n number of objects from a collection based on user selection. Since the design of our application necessitated separating the UI from the underlying objects, the client could not reference the underlying objects directly. Instead, the UI would pass back a list of IDs that were to be included or excluded and then wait for the results to be returned. Initially I was using an approach very similar to the above and performance was horrible. After beating my head against several solutions, I started to think about the fact that you can directly access members of a collection based on their index.

Gets or sets the element at the specified index

So as long as we are able to know what the index position of my items are, we should be good to go. The good news is, knowing an items position is easy! As we all know, the .Add() method of a List adds the item to the end of the List; this means that so long as you don’t sort the List, or use the Insert() method to insert an item at a specific index, that it’s a trivial matter to know the index position within the List of every item you insert. Since the List/Array use a 0-based index system, and the .Count or .Length properties respectively are 1-based; the index value of the next inserted item will be equal to the current (pre-insertion) value of .Count or .Length.

Consider the following modifications to the class, List insertion code, and new DoActionByIndex method.

    class Program
    {
        private static List<SomeClass> MyList;

        class SomeClass
        {
            private static Random rand = new Random();

            public int ID { get; set; }
            public int Data { get; set; }

            //A new value for tracking the index position of the item
            //  this value gets exposed to the UI layer for passback
            //  this causes the client to no longer use the ID field
            //  which is retained solely for the purpose of
            //  updating the record in the original data source
            public int index { get; set; }

            public SomeClass()
            {
                this.Data = rand.Next(-2147483648, 2147483647);
            }

            public void DoSomeAction()
            {
                if (this.Data > -2147483648)
                    this.Data--;
                else
                    this.Data++;
            }
        }

        //static void DoActionByID(int ID)
        //{
        //    var FoundInstance = MyList.Single((item) => item.ID == ID);

        //    Console.WriteLine("Before: {0}", FoundInstance.Data.ToString());
        //    FoundInstance.DoSomeAction();
        //    Console.WriteLine("After: {0}", FoundInstance.Data.ToString());
        //}

        static void DoActionByIndex(int index)
        {
            //Since we are now passing in the index position, we reference the it
            //  directly rather than doing an ID match comparison
            var FoundInstance = MyList[index];

            Console.WriteLine("Before: {0}", FoundInstance.Data.ToString());
            FoundInstance.DoSomeAction();
            Console.WriteLine("After: {0}", FoundInstance.Data.ToString());
        }


        static void Main(string[] args)
        {
            MyList = new List<SomeClass>();

            for (int i = 0; i < 30000000; i++)
            {
                //the ID is set to (i+1)*2 to simulate an arbitrary ID value from the DB
                //  or file data source
                var CurInstance = new SomeClass() { ID = (i + 1) * 2};

                //Since the index position is 0-based, and the Count property of a List
                //  is 1-based we know that the current Count value of the List will
                //  represent the new index of the next item we insert
                CurInstance.index = MyList.Count;


                MyList.Add(CurInstance);
            }


            //Lets assume the user takes some action and identifies item: 18784679
            //  as the item they want to perform an action on
            //  this should be the same item as before, just using index rather than ID

            //  Bear in mind that the Data value is based on a random, so the same messages
            //  will not be printed on the screen!
            DoActionByIndex(18784679);

            Console.ReadKey();
        }
    }

This runs in 1ms or less, or nearly 1000x faster. It is possible to handle multiple ID passes from the client with a similar DoAction method.

    static void DoActionByIndexes(List<int> Indexes)
    {
        foreach (var index in Indexes)
        {
            var FoundInstance = MyList[index];

            Console.WriteLine(ID.ToString());
            Console.WriteLine("Before: {0}", FoundInstance.Data.ToString());
            FoundInstance.DoSomeAction();
            Console.WriteLine("After: {0}", FoundInstance.Data.ToString());
            Console.WriteLine();
        }
    }

I’ve implemented this concept a couple times now, and have been pretty darn happy with the results. It seems so obvious, I’m a bit embarrassed that it took me so long to think of it. So I figured I’d post it just in case anyone else is struggling with a similar problem.

Questions? Comments? Feel free to drop me a line.

-Alan

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s