Index Lookup or hash table?

I recently shared the link to my blog with a few friends and colleagues. After all, that’s how you get readers, right? Anyhow, my best friend, who happens to be a PowerShell guru, was talking to me this morning about my Index Lookup post, and his question was, “why didn’t you just use a hashtable?”

We don’t, or at least I don’t, think of the Hashtable very often anymore, ever since .NET 2.0 came along and swept us off our feet with the wonderful happiness that is generics. So when a PowerShell guy says “Hashtable,” what you will probably think of from a .NET perspective is a Dictionary<TKey, TValue>, which is simply the generic version of the Hashtable of old. In simplest terms, a hash table provides the ability to “map” a key to a specific value, and then quickly retrieve the value when you supply the key. For an in-depth discussion of hash tables, see Wikipedia; a good discussion of Dictionary<TKey, TValue> specifically can be found on MSDN here.
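
For instance, a minimal use of a Dictionary<TKey, TValue> looks like this (the key and value here are purely illustrative):

    var ages = new Dictionary<string, int>();

    ages.Add("Alan", 33);    //map the key "Alan" to a value
    int age = ages["Alan"];  //retrieve the value by key; on average an O(1) operation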

I have to admit, my confidence in my Index Lookup post was somewhat shaken when he asked me about it so matter-of-factly. Had I made a mistake? Was a hash table a better solution? He and I spent a little over an hour this morning testing various scenarios, which are the subject of this post. First, we needed a data set, so I did something I should have done years ago: I generated a CSV file containing 1 million rows of junk data, the first column of which is a semi-sequential integer ID, followed by six columns of random string data. I’ll post a link to this dataset on the files page. The file looks like 1 million rows of this:

1,tfxnvwcyycrgth,ggqbrhvgd,ltpdhe,udygxoarwof,aiozgrspzejyyukjdm,ryelqidjspjvsqid
2,xtjloonptahderoju,zvwxqciaapjr,prpljdic,rvxiwfnyv,ycjhyzdgdoo,gbsrlpfffllpdaaigc
4,btwtfgjqlgx,nutpvpe,ysqgmqtexzvzabme,imlcytzifx,klfsbjoidqgpmfsmk,fmhofmdtrzzryhsytm
5,cxqndbtcxyxggokz,nvfsphsojlr,wolgsizoyenlkfvttsd,hglyganpyzuq,dbjla,fklnjiuijmyv
8,idrpdnzsujpzjx,gdlhtexdw,bpmrddaddhyllgya,lbvnzllbgbqclkauw,xnaq,pvfpkvs
10,jsjlbbf,ebqxc,bsxyqpfbnv,ugstwagnwhjcyytq,smksdetnsledzhfoqhv,iiajjgdsqya
11,yfiblcqbzgepckwzen,rydqklai,abwtlgetyedoqtnbdl,zjdocqjy,oxpkttbyjkdvgcba,tafqpgatjn
13,nudchjeapsvxsprthl,hcnrwitezqorjfe,phbnwxalsy,naxnth,guzcgtoffjablr,eclfxwjcczye
15,dbdvjtyk,nbhzqrqkrnl,wvbnkuwrwciphbdgfrs,dspkyxfrmrilshh,bhzzbsp,tsspnjezus
16,dgrpmoffeuwwzuyfx,trlderffqdcnjqz,dzjsmkwnkpw,supdggbbjabzjgxbc,lgwnrywffiuxto,jmvnzgcoyccxkjfmv
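
The generator itself isn’t included in the post; here is a rough sketch of how similar junk data could be produced (my reconstruction, not the original tool):

    //Sketch only: produce junk rows similar to the sample above.
    //  "Semi-sequential" IDs are simulated by advancing the ID by
    //  1 or 2, so that some values are skipped.
    var rand = new Random();

    Func<string> randomString = () =>
    {
        var sb = new System.Text.StringBuilder();
        int length = rand.Next(4, 20);

        for (int i = 0; i < length; i++)
            sb.Append((char)('a' + rand.Next(26)));

        return sb.ToString();
    };

    using (var sw = new System.IO.StreamWriter(@".\junkdata.csv"))
    {
        int id = 0;

        for (int row = 0; row < 1000000; row++)
        {
            id += rand.Next(1, 3);  //advance by 1 or 2; the skipped values make the IDs semi-sequential

            sw.WriteLine("{0},{1},{2},{3},{4},{5},{6}", id,
                randomString(), randomString(), randomString(),
                randomString(), randomString(), randomString());
        }
    }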

This structure can be represented by the following objects, for storage in a Dictionary and a List respectively.

    //This class can be used to represent a single row
    sealed class HashObject
    {
        public int ID { get; set; }
        public string Col1 { get; set; }
        public string Col2 { get; set; }
        public string Col3 { get; set; }
        public string Col4 { get; set; }
        public string Col5 { get; set; }
        public string Col6 { get; set; }

        public HashObject(int id, string col1, string col2, string col3, string col4, string col5, string col6)
        {
            this.ID = id;
            this.Col1 = col1;
            this.Col2 = col2;
            this.Col3 = col3;
            this.Col4 = col4;
            this.Col5 = col5;
            this.Col6 = col6;
        }
    }

    //This class can be used to represent a single row
    sealed class IndexObject
    {
        //Added int property to record the object's index location within the List
        public int index { get; set; }

        public int ID { get; set; }
        public string Col1 { get; set; }
        public string Col2 { get; set; }
        public string Col3 { get; set; }
        public string Col4 { get; set; }
        public string Col5 { get; set; }
        public string Col6 { get; set; }

        public IndexObject(int id, string col1, string col2, string col3, string col4, string col5, string col6)
        {
            this.ID = id;
            this.Col1 = col1;
            this.Col2 = col2;
            this.Col3 = col3;
            this.Col4 = col4;
            this.Col5 = col5;
            this.Col6 = col6;
        }
    }

Clearly there are two different metrics to test: the first is the amount of memory used by each method; the second is the amount of time required to access a given number of objects. We’ll start by testing memory. .NET doesn’t give us many good ways to determine the amount of memory a given object or collection is using, so for our purposes I am going to simply use Task Manager to determine the amount of memory the process is using. I will be running all of my code in Visual Studio 2010 as a .NET Framework 4.0 command-line application in debug mode.
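
If you would rather stay in-process than eyeball Task Manager, something like the following sketch gives a rough approximation (both counters include runtime overhead beyond the collection itself, so treat them as ballpark numbers):

    //Rough in-process memory snapshots, ballpark figures only
    long managedBytes = GC.GetTotalMemory(true);  //true forces a full collection first
    long workingSet = System.Diagnostics.Process.GetCurrentProcess().WorkingSet64;

    Console.WriteLine("Managed heap: {0} KB", managedBytes / 1024);
    Console.WriteLine("Working set:  {0} KB", workingSet / 1024);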

First, we’ll measure the memory usage of the hash table implementation. I’ve omitted the HashObject definition for brevity.

    class Program
    {
        static void Main(string[] args)
        {
            //Keyed by the record's integer ID
            var hash = new Dictionary<int, HashObject>();

            using (var sr = new System.IO.StreamReader(@".\junkdata.csv"))
            {
                while (sr.Peek() != -1)
                {
                    var line = sr.ReadLine();

                    var fields = line.Split(',');

                    var hashobject = new HashObject(Convert.ToInt32(fields[0]), fields[1], fields[2], fields[3], fields[4], fields[5], fields[6]);
                    hash.Add(hashobject.ID, hashobject);
                }
            }            

            Console.ReadKey();
        }
    }
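
The timing harness isn’t shown above; a minimal sketch of how the load could be timed, using Stopwatch rather than DateTime subtraction for better resolution, would be:

    //Sketch: wrap the load loop above in a Stopwatch to capture load time
    var sw = System.Diagnostics.Stopwatch.StartNew();

    // ... the while (sr.Peek() != -1) loop from above ...

    sw.Stop();
    Console.WriteLine("Load time: {0} ms", sw.ElapsedMilliseconds);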

I repeated this process three times and took averages of both memory usage and the amount of time required to fully load the .csv into the hash table. The average amount of memory allocated to my application’s process was 327,212 KB, and the average load time was 2,763 ms. Now let’s repeat the process with the “index” version. Again, I will omit the IndexObject definition for brevity’s sake.

    class Program
    {
        static void Main(string[] args)
        {
            var index = new List<IndexObject>();

            using (var sr = new System.IO.StreamReader(@".\junkdata.csv"))
            {
                while (sr.Peek() != -1)
                {
                    var line = sr.ReadLine();

                    var fields = line.Split(',');

                    var indexobject = new IndexObject(Convert.ToInt32(fields[0]), fields[1], fields[2], fields[3], fields[4], fields[5], fields[6]);
                    indexobject.index = index.Count;
                    index.Add(indexobject);
                }
            }            

            Console.ReadKey();
        }
    }

Again the process was executed three times and the results averaged; memory usage was 310,432 KB on average, and the average load time was 2,673 ms. In both cases the index method seems to be more efficient, but are 16,780 KB of memory and 90 ms of load time worth it? What if the hash table can more quickly satisfy item lookups? Let’s continue onward and find out.

Here is the code snippet I am using to get a distributed list of items to perform a lookup on. Changing the modulo factor (the ‘7’ in the “i % 7 == 0” check below) to a larger number will decrease the sample size, while decreasing it will increase the sample size.

    //For the lookup speed test we are going to need two collections, since the ID/Index #s do not match up directly
    var idx_lookups = new List<int>();
    var hash_lookups = new List<int>();

    //Get the list of Keys in the hash table
    var hashkeys = hash.Keys.ToList();

    for (int i = 0; i < index.Count; i++)
    {
        //This will give us an approximately 14% sample size
        if (i % 7 == 0)
        {
            idx_lookups.Add(i);

            //This will add the corresponding "ID" to the hash_lookups
            // collection, so both the index and hash method will be looking
            // up the same items
            hash_lookups.Add(hashkeys[i]);
        }
    }

Here is the full code of my program at this point; I’m excluding the definitions for HashObject and IndexObject to save space, as they have not changed from how they were originally posted above. This code will load both the hash table and my index lookup list, then identify approximately 14% of the records and look them up both in the hash table and in the list by index. The amount of time in ticks (1 tick = 100 nanoseconds) will be printed to the screen for each method.

    class Program
    {
        static void Main(string[] args)
        {
            var hash = new Dictionary<int, HashObject>();
            var index = new List<IndexObject>();

            DateTime start;

            using (var sr = new System.IO.StreamReader(@".\junkdata.csv"))
            {
                while (sr.Peek() != -1)
                {
                    var line = sr.ReadLine();

                    var fields = line.Split(',');
                    
                    var hashobject = new HashObject(Convert.ToInt32(fields[0]), fields[1], fields[2], fields[3], fields[4], fields[5], fields[6]);
                    hash.Add(hashobject.ID, hashobject);
                    
                    var indexobject = new IndexObject(Convert.ToInt32(fields[0]), fields[1], fields[2], fields[3], fields[4], fields[5], fields[6]);
                    indexobject.index = index.Count;
                    index.Add(indexobject);
                }
            }


            //For the lookup speed test we are going to need two collections, since the ID/Index #s do not match up directly
            var idx_lookups = new List<int>();
            var hash_lookups = new List<int>();

            //Get the list of Keys in the hash table
            var hashkeys = hash.Keys.ToList();

            for (int i = 0; i < index.Count; i++)
            {
                //This will give us an approximately 14% sample size
                if (i % 7 == 0)
                {
                    idx_lookups.Add(i);

                    //This will add the corresponding "ID" to the hash_lookups
                    // collection, so both the index and hash method will be looking
                    // up the same items
                    hash_lookups.Add(hashkeys[i]);
                }
            }

            start = DateTime.Now;
            foreach (var i in hash_lookups)
            {
                var obj = hash[i];
            }
            Console.WriteLine("hashtable: {0}", (DateTime.Now - start).Ticks);


            start = DateTime.Now;
            foreach (var i in idx_lookups)
            {
                var obj = index[i];
            }
            Console.WriteLine("index: {0}", (DateTime.Now - start).Ticks);

            
            Console.ReadKey();
        }
    }

On my system, this is the output:

hashtable: 50000
index: 20000

These results seem to remain consistent across different modulo values, with the index lookup method always being between 2 and 2.5 times faster than the hash table lookup.

So, in summary:

  • The index lookup method uses less memory
  • The index lookup method loads more quickly
  • The index lookup method is able to access items more quickly

The index lookup method does have one big disadvantage compared to the hash table, which in all fairness I should mention, since I’ve been wailing on the hash table pretty hard so far. Removing items from the collection is a very dicey proposition, since a removal invalidates the index properties of every item that came “after” the one you removed. It’s up to you to decide which method is most appropriate for your given development situation.
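
If you do need removals, everything after the removed item has to be re-indexed; here is a rough sketch of what that costs:

    //Sketch: after RemoveAt, every index property from the removal
    //  point onward is stale and must be rewritten -- an O(n) fix-up
    static void RemoveAndReindex(List<IndexObject> index, int position)
    {
        index.RemoveAt(position);

        for (int i = position; i < index.Count; i++)
            index[i].index = i;
    }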

Questions? Comments? Just leave a post.

Until next time!
-Alan

CSV Importing in C#

Reading and writing flat files was something I had considered my bread and butter for quite some time. Many years ago, when I was still in high school and considered myself a master of Visual Basic 6, I was involved in a startup (if you could call it that) whose goal was to set up an online review site of all things outdoors. Camping, hiking, rock climbing, etc., all included on one site. The idea was to allow people not only to get reviews of potential destinations, but to find out what was nearby at the same time. So, for example, upon selecting a potential campground, a site visitor might be able to see other activities within a certain distance. Nowadays I’m sure someone has pulled this idea off, but this isn’t a post about an outdoor activity review site; this is a post about reading CSV files. At that time, I was tasked with writing data-entry client software that other staff could use to input all of the information about a given location in discrete data fields, as well as a general description text area for free text. The application had to be usable offline, and then able to later synchronize its data with our database.

I hardly deserved to call myself a developer at the time, but confident in my knowledge I burst forth and wrote glorious VB6 code. I stored the data entered locally in a CSV file and all was well, until some nefarious (in my opinion) user dared to put a comma in their review. Of course this caused the database sync to fail miserably and all hope was lost. I then went about updating the client’s code to perform a character replacement on any commas, substituting a replacement character that I would then scan for during the sync operation. All was fixed and glorious once again.

I tell this story because up until recently I was perfectly happy with this method of addressing what, to me, was a limitation of the CSV format. It was simple, fast, and easy to support and maintain. There are, of course, many solutions to this problem when you have full control over the file that is being written and read. You could use a different delimiter, possibly even one that the user would have a very difficult time typing with a regular keyboard; or a fixed-width format, which works great as long as the client enforces the same length restrictions. But I’m wandering off topic again.

More recently, I finally encountered a project that required me to read CSV files that I did not create; that is, CSV files that someone might have exported from Excel, Google Docs, or something similar. I no longer had control over the incoming format, meaning I would have to load data that already had commas of its own within a field. But I’m getting ahead of myself.

Consider the following simple data set in Excel:

[Image: a simple Excel dataset with columns Col1 and Col2]

When this data is saved as a CSV file, here is what the resulting file contents look like.

Col1,Col2
bland,cake
text,is
is,good
always,especially
easy,new
to,york
import,cheesecake

This sort of data can be easily imported using logic similar to the following.

    class Program
    {
        static void Main(string[] args)
        {
            var MyObjects = new List<MyObject>();

            using (var sr = new System.IO.StreamReader(@".\csv_test1.csv"))
            {
                while (sr.Peek() != -1)
                {
                    var line = sr.ReadLine();
                    var fields = line.Split(',');

                    var obj = new MyObject(fields[0], fields[1]);

                    MyObjects.Add(obj);
                }

            }
        }
    }

    class MyObject
    {
        private string m_Col1;
        private string m_Col2;
        public string Col1 { get { return m_Col1; } }
        public string Col2 { get { return m_Col2; } }

        public MyObject(string col1, string col2)
        {
            this.m_Col1 = col1;
            this.m_Col2 = col2;
        }
    }

This is fairly similar to how I would have handled reading a CSV file in the past. Of course, having the benefit of nice clean data certainly helps. Let’s look at another example.
[Image: a more complicated Excel dataset]

Saving this dataset as a CSV file and looking at the contents presents a somewhat different picture from before.

Col1,Col2
"2,500",cake
"$2,500.00 ",is
"""twenty-five hundred""",good
"I really like "" marks",especially
"I REALLY like """""" marks",new
"spreading "" them "" around "" here",york
"fake cell ending "", mid cell",cheesecake
"this cell
spans 2 lines",with strawberries

This demonstrates the finer details of what is acceptable in a CSV.

  • If the field contains no commas or quotation marks, nothing special happens; the text is simply placed directly in the file without modification (see line 1 above)
  • If the field contains a comma, the field is enclosed in quotation marks (see line 2 above)
  • If the field contains a quotation mark, the field is enclosed in quotation marks and the quotation marks within the field are doubled; this serves as a type of “escaping” (see line 5 above)
  • If the field contains a line feed, the field is enclosed in quotation marks (see lines 9 & 10 above)
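
These same rules apply in reverse when writing a CSV; here is a minimal sketch of a field-escaping helper (a hypothetical EscapeField, not part of the code in this post):

    //Sketch: quote a field per the rules above when writing a CSV
    static string EscapeField(string field)
    {
        //Only fields containing a comma, quotation mark, or line break need quoting
        if (field.IndexOfAny(new[] { ',', '"', '\r', '\n' }) < 0)
            return field;

        //Double any embedded quotation marks, then wrap the whole field in quotes
        return "\"" + field.Replace("\"", "\"\"") + "\"";
    }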

Seems simple enough, but when I first looked into this I thought, “surely there’s a good code snippet out there for this,” so off to Google I went. I found some interesting articles on using an ODBC text driver, but since I was developing within the context of a Metro / Windows 8-style UI application, that idea was out. I found an assortment of pre-compiled libraries with various licensing strategies. I also found some suggestions to add a dependency on the Microsoft.VisualBasic namespace and use its TextFieldParser, which also did not appeal to me. So I sat down at the drawing board and came up with a specification.

  1. If a field does not begin with a quotation mark, the field ends once the next comma is located or the end of the line is reached
  2. If a field begins with a quotation mark, the field ends once a single (unescaped) quotation mark is identified, since quotation marks within the field are doubled

My new code had to be more generic, capable of importing a CSV file regardless of the number of columns. It also had to work within the new Windows 8-style UI restrictions, which means I would have to use Windows.Storage.FileIO rather than a StreamReader; however, at the time of this writing I only have access to Visual Studio 2010, so bear with me.
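
For reference, and purely as a sketch since I can’t test it from Visual Studio 2010, the WinRT read would look something along these lines, assuming the Windows.Storage APIs (StorageFile.GetFileFromPathAsync and FileIO.ReadTextAsync):

    //Sketch only: reading the file via Windows.Storage instead of StreamReader
    using Windows.Storage;

    public static async System.Threading.Tasks.Task<string> ReadCsvTextAsync(string path)
    {
        StorageFile file = await StorageFile.GetFileFromPathAsync(path);
        return await FileIO.ReadTextAsync(file);
    }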

The following code provides a self-contained class that imports the CSV file into a collection of Row objects, which are themselves simply a List of string values representing each column. This is simply a proof of concept; deploying this code would likely involve a few tweaks, such as allowing for column headings, empty rows, a delimiter other than a comma, etc. However, I hope this code will be useful to someone who may need some help getting started with their own CSV importing logic without potentially cumbersome licensing requirements.

    class Program
    {
        static void Main(string[] args)
        {
            var MyFile = new CSVFile(@".\csv_test2.csv");

            for (int row = 0; row < MyFile.Rows.Count; row++)
            {
                Console.WriteLine("Row {0}", row.ToString());

                for (int col = 0; col < MyFile.Rows[row].Fields.Count; col++)
                {
                    Console.WriteLine("Field {0}  -  {1}", col.ToString(), MyFile.Rows[row].Fields[col]);
                }

                Console.WriteLine();
            }


            Console.ReadKey();
        }
    }

    public sealed class CSVFile
    {
        private int m_ColumnCount;
        private List<CSVRow> m_Rows;

        public int ColumnCount { get { return m_ColumnCount; } }
        public System.Collections.ObjectModel.ReadOnlyCollection<CSVRow> Rows { get { return m_Rows.AsReadOnly(); } }


        public CSVFile(string path)
        {
            m_Rows = new List<CSVRow>();

            int curCharVal = 0;
            char curChar;
            bool inQuotes = false;
            var curField = new StringBuilder();
            var curRow = new CSVRow();

            try
            {
                using (var sr = new System.IO.StreamReader(path))
                {
                    curCharVal = sr.Read();

                    while (curCharVal >= 0)
                    {
                        //We can't be sure if the file we've received uses Line Feed (char 10) by itself, or Carriage Return / Line Feed (chars 13 / char 10) to indicate a new line
                        //what we can be sure of, however, is that we really don't care if there's a 13!
                        while (curCharVal == 13)
                        {
                            curCharVal = sr.Read();

                            if (curCharVal == -1)
                                break;
                        }

                        //Sanity check, if we ended up with a -1 due to curCharVal == 13 loop...
                        //Should never happen, but god knows what some people's CSV files look like
                        if (curCharVal == -1)
                        {
                            curRow.Fields.Add(curField.ToString());
                            curField.Clear();
                            this.m_Rows.Add(curRow);

                            break;
                        }

                        curChar = (char)curCharVal;

                        if (inQuotes)
                        {
                            //If we're in a quote enclosed field, we need to identify
                            //  if these new quotes are escaped (doubled)
                            //  and if they are, only add a single set of quotes to our
                            //  current field.  If they are not escaped, then we are
                            //  no longer in a quote enclosed field
                            if (curChar == '"')
                            {
                                curCharVal = sr.Read();

                                if (curCharVal >= 0)
                                {
                                    curChar = (char)curCharVal;

                                    if (curChar != '"')
                                    {
                                        inQuotes = false;
                                        
                                        //The new character we just imported (presumably a comma)
                                        //  will be handled once we fall through into the next if block below
                                    }
                                    else
                                    {
                                        curField.Append(curChar);
                                    }
                                }
                            }
                            else
                            {
                                curField.Append(curChar);
                            }
                        }
                        
                        //This is a separate if statement, rather than an else clause
                        //  because within the if block above, the inQuotes value could be
                        //  set to false, in which case we want to evaluate the logic
                        //  within this code block
                        if(!inQuotes)
                        {
                            if (curField.Length == 0 && curChar == '"')
                            {
                                inQuotes = true;
                            }
                            else if (curChar == ',')
                            {
                                curRow.Fields.Add(curField.ToString());
                                curField.Clear();
                            }
                            else if (curCharVal == 10)
                            {
                                curRow.Fields.Add(curField.ToString());
                                curField.Clear();

                                //We're done with this row, add it to the list and set
                                //  ourselves up for a fresh row.
                                this.m_Rows.Add(curRow);
                                curRow = new CSVRow();
                            }
                            else
                            {
                                curField.Append(curChar);
                            }
                        }


                        curCharVal = sr.Read();

                        //We just reached the end of the file.
                        //  If a row is still being built (the file did not end with a
                        //  line feed), finish the field and add the row to the list
                        //  before the loop ends
                        if (curCharVal == -1 && (curField.Length > 0 || curRow.Fields.Count > 0))
                        {
                            curRow.Fields.Add(curField.ToString());
                            curField.Clear();
                            this.m_Rows.Add(curRow);
                        }
                    }
                }

                //Record the widest row as the file's column count
                foreach (var row in m_Rows)
                {
                    if (row.Fields.Count > m_ColumnCount)
                        m_ColumnCount = row.Fields.Count;
                }
            }
            catch
            {
                m_Rows.Clear();
                m_ColumnCount = 0;
            }
        }


        public sealed class CSVRow
        {
            private List<string> m_Fields;

            public List<string> Fields { get { return m_Fields; } }

            public CSVRow()
            {
                m_Fields = new List<string>();
            }
        }
    }

Here’s the output of the above code:

Row 0
Field 0  -  Col1
Field 1  -  Col2

Row 1
Field 0  -  2,500
Field 1  -  cake

Row 2
Field 0  -  $2,500.00
Field 1  -  is

Row 3
Field 0  -  "twenty-five hundred"
Field 1  -  good

Row 4
Field 0  -  I really like " marks
Field 1  -  especially

Row 5
Field 0  -  I REALLY like """ marks
Field 1  -  new

Row 6
Field 0  -  spreading " them " around " here
Field 1  -  york

Row 7
Field 0  -  fake cell ending ", mid cell
Field 1  -  cheesecake

Row 8
Field 0  -  this cell
spans 2 lines
Field 1  -  with strawberries

I’ve used code very similar to this to import 150+ MB CSV files in a matter of seconds without issue.

As always, if there are any questions, feel free to drop me a line. If you have a more elegant solution, I’d love to see that too!

Until next time.

-Alan

Index Lookup

I’d like to start off by talking about a method of looking up items in a List or array that I recently started using where I can. For lack of a better term, when referring to this while talking to a colleague, I have been calling it an “Index Lookup.” If you know of a proper term, please let me know in the comments for props.

Consider the following simple console application:

    class Program
    {
        private static List<SomeClass> MyList;

        class SomeClass
        {
            private static Random rand = new Random();

            public int ID { get; set; }
            public int Data { get; set; }

            public SomeClass()
            {
                this.Data = rand.Next(-2147483648, 2147483647);
            }

            public void DoSomeAction()
            {
                if (this.Data > -2147483648)
                    this.Data--;
                else
                    this.Data++;
            }
        }

        static void DoActionByID(int ID)
        {
            var FoundInstance = MyList.Single((item) => item.ID == ID);

            Console.WriteLine("Before: {0}", FoundInstance.Data.ToString());
            FoundInstance.DoSomeAction();
            Console.WriteLine("After: {0}", FoundInstance.Data.ToString());
        }


        static void Main(string[] args)
        {
            MyList = new List<SomeClass>();

            for (int i = 0; i < 30000000; i++)
            {
                //the ID is set to (i+1)*2 to simulate an arbitrary ID value from the DB
                //  or file data source
                var CurInstance = new SomeClass() { ID = (i + 1) * 2};

                MyList.Add(CurInstance);
            }


            //Lets assume the user takes some action and identifies item: 37569360
            //  as the item they want to perform an action on
            DoActionByID(37569360);

            Console.ReadKey();
        }
    }

On my system with this sample program the item search takes just under 1 second (987 ms). Not horrible considering the collection size, but I think we can do better.

I recently started working on a project with a colleague of mine that required looking up some number of objects from a collection based on user selection. Since the design of our application required separating the UI from the underlying objects, the client could not reference the underlying objects directly. Instead, the UI would pass back a list of IDs to be included or excluded, then wait for the results to be returned. Initially I was using an approach very similar to the one above, and performance was horrible. After beating my head against several solutions, I started to think about the fact that you can directly access members of a collection by their index.

As MSDN describes the List indexer: “Gets or sets the element at the specified index.”

So as long as we know the index position of each item, we should be good to go. The good news is, knowing an item’s position is easy! As we all know, the .Add() method of a List appends the item to the end of the List; this means that, so long as you don’t sort the List or use the Insert() method to place an item at a specific position, it’s a trivial matter to know the index of every item you insert. Since List and array indexes are 0-based, while .Count and .Length report the total number of items, the index of the next appended item is always equal to the current (pre-insertion) value of .Count or .Length.
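
A quick illustration of that relationship:

    var list = new List<string>();

    int next = list.Count;  //0: the index the next Add() will assign
    list.Add("first");      //list[0] == "first"

    next = list.Count;      //now 1, the index for the next Add()
    list.Add("second");     //list[1] == "second"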

Consider the following modifications to the class, List insertion code, and new DoActionByIndex method.

    class Program
    {
        private static List<SomeClass> MyList;

        class SomeClass
        {
            private static Random rand = new Random();

            public int ID { get; set; }
            public int Data { get; set; }

            //A new value for tracking the index position of the item
            //  this value gets exposed to the UI layer for passback
            //  this causes the client to no longer use the ID field
            //  which is retained solely for the purpose of
            //  updating the record in the original data source
            public int index { get; set; }

            public SomeClass()
            {
                this.Data = rand.Next(-2147483648, 2147483647);
            }

            public void DoSomeAction()
            {
                if (this.Data > -2147483648)
                    this.Data--;
                else
                    this.Data++;
            }
        }

        //static void DoActionByID(int ID)
        //{
        //    var FoundInstance = MyList.Single((item) => item.ID == ID);

        //    Console.WriteLine("Before: {0}", FoundInstance.Data.ToString());
        //    FoundInstance.DoSomeAction();
        //    Console.WriteLine("After: {0}", FoundInstance.Data.ToString());
        //}

        static void DoActionByIndex(int index)
        {
            //Since we are now passing in the index position, we reference the it
            //  directly rather than doing an ID match comparison
            var FoundInstance = MyList[index];

            Console.WriteLine("Before: {0}", FoundInstance.Data.ToString());
            FoundInstance.DoSomeAction();
            Console.WriteLine("After: {0}", FoundInstance.Data.ToString());
        }


        static void Main(string[] args)
        {
            MyList = new List<SomeClass>();

            for (int i = 0; i < 30000000; i++)
            {
                //the ID is set to (i+1)*2 to simulate an arbitrary ID value from the DB
                //  or file data source
                var CurInstance = new SomeClass() { ID = (i + 1) * 2};

                //Since the index position is 0-based, and the Count property of a List
                //  is 1-based we know that the current Count value of the List will
                //  represent the new index of the next item we insert
                CurInstance.index = MyList.Count;


                MyList.Add(CurInstance);
            }


            //Lets assume the user takes some action and identifies item: 18784679
            //  as the item they want to perform an action on
            //  this should be the same item as before, just using index rather than ID

            //  Bear in mind that the Data value is based on a random, so the same messages
            //  will not be printed on the screen!
            DoActionByIndex(18784679);

            Console.ReadKey();
        }
    }

This runs in 1 ms or less, nearly 1,000x faster. It is also possible to handle multiple IDs passed back from the client with a similar DoAction method.

    static void DoActionByIndexes(List<int> Indexes)
    {
        foreach (var index in Indexes)
        {
            var FoundInstance = MyList[index];

            Console.WriteLine("ID: {0}", FoundInstance.ID.ToString());
            Console.WriteLine("Before: {0}", FoundInstance.Data.ToString());
            FoundInstance.DoSomeAction();
            Console.WriteLine("After: {0}", FoundInstance.Data.ToString());
            Console.WriteLine();
        }
    }

I’ve implemented this concept a couple of times now, and I have been pretty darn happy with the results. It seems so obvious that I’m a bit embarrassed it took me so long to think of it. So I figured I’d post it in case anyone else is struggling with a similar problem.

Questions? Comments? Feel free to drop me a line.

-Alan