Remove Duplicate Items from a C# List


Introduction

A C# List allows duplicate items by design; it does not check the values that are added to it. Ideally you would stop duplicate values from entering the list in the first place, but this is not always possible. So we'll look at different methods to remove duplicate items from a list, and also at methods that stop duplicate items from entering our list in the first place.

Motivation

Let's start with a simple case. Suppose I create an object called Transaction that has a user id, a price, and a timestamp, and a list of these objects contains duplicates. To remove those duplicates we could use a brute force method based on List Contains.

The brute force code loops through all the transactions and adds each one to a new list only if the new list does not already contain it.

This may work for small lists, but it has a major flaw: it does not scale. As we loop through the transactions, the new list grows, and because Contains scans the new list from the start on every check, each check takes longer and longer.
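To make the scaling problem concrete, here is a minimal sketch of the brute force idea. It is not the article's listing: it assumes hypothetical sample data made of (UserId, Day) tuples, and it replaces the call to Contains with an explicit inner loop so that the number of comparisons can be counted.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sample data: (UserId, Day) pairs standing in for transactions.
var transactions = new List<(int UserId, int Day)>
{
    (1000, 1), (1001, 1), (1000, 1), (1002, 2), (1001, 1), (1003, 2)
};

var unique = new List<(int UserId, int Day)>();
long comparisons = 0;

foreach (var transaction in transactions)
{
    bool found = false;
    // A linear scan of the new list, which is the work Contains has to do.
    foreach (var existing in unique)
    {
        comparisons++;
        if (existing == transaction)
        {
            found = true;
            break;
        }
    }
    if (!found)
    {
        unique.Add(transaction); // Only items not seen before are kept
    }
}

Console.WriteLine($"Unique items: {unique.Count}, comparisons: {comparisons}");
// Prints: Unique items: 4, comparisons: 9
```

Six inputs already need nine comparisons. Every new unique item makes all later scans longer, so the total work grows roughly with the number of inputs multiplied by the number of unique items.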

Set up

To use any of these methods with objects, we override Equals and GetHashCode on the class so that two objects holding the same values are treated as the same item. Each of the methods below calls one or both of these overrides.
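As a rough illustration of what these overrides change, the sketch below uses a hypothetical Purchase class (a simplified stand-in, not the article's Transaction class) whose equality is based on the user id and the calendar date only. Two purchases with different prices are then treated as the same item by both List and HashSet.

```csharp
using System;
using System.Collections.Generic;

// Same user and same day, different prices.
var first = new Purchase(1000, 50, new DateTime(2022, 12, 1));
var second = new Purchase(1000, 75, new DateTime(2022, 12, 1));

bool equal = first.Equals(second);
bool sameHash = first.GetHashCode() == second.GetHashCode();

var list = new List<Purchase> { first };
var set = new HashSet<Purchase> { first };
bool listContains = list.Contains(second); // Uses Equals
bool setContains = set.Contains(second);   // Uses GetHashCode, then Equals

Console.WriteLine($"{equal} {sameHash} {listContains} {setContains}");
// Prints: True True True True

// Hypothetical class used only for this illustration.
class Purchase
{
    public Purchase(int userId, int price, DateTime timeStamp)
    {
        UserId = userId;
        Price = price;
        TimeStamp = timeStamp;
    }

    public int UserId { get; }
    public int Price { get; }
    public DateTime TimeStamp { get; }

    // Price is deliberately left out of the comparison.
    public override bool Equals(object? obj) =>
        obj is Purchase other &&
        UserId == other.UserId &&
        TimeStamp.Date == other.TimeStamp.Date;

    // Must combine the same fields that Equals compares.
    public override int GetHashCode() => HashCode.Combine(UserId, TimeStamp.Date);
}
```

Without the overrides, a class is compared by reference, so every one of the four checks above would report False.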

Assumptions About The Dataset

Ordered Input

Objects rather than simple types

Given a list to filter out duplicates

DateTime is involved

Desired Output

Ordered output

Consider the date

All duplicates filtered

Efficient

List Contains Method

List<Transaction> orderedStartingTransactions = GetInitialTransactionList();//Get the ordered starting list
List<Transaction> uniqueTransactions = GetUniqueListWithListContains(orderedStartingTransactions);//Baseline List Contains method of getting unique items
Dictionary<string, List<Transaction>> transactionDictionary = GroupByDateTime(uniqueTransactions);//Group the unique items by date so that they can be printed to the screen
PrintDictionaryValues(transactionDictionary);//Print the unique items in a table format

List<Transaction> GetUniqueListWithListContains(List<Transaction> orderedTransactions)
{
    List<Transaction> uniqueTransactions = new List<Transaction>();//Create new list to hold unique transactions
    foreach (Transaction orderedTransaction in orderedTransactions)//loop through an existing ordered list
    {
        if (!uniqueTransactions.Contains(orderedTransaction))//If the transaction does not exist in the new list then add it to the new list
        {
            uniqueTransactions.Add(orderedTransaction);//Add to List
        }
    }
    return uniqueTransactions;
}
Supporting Code
int GetRandomInt(int maxNumber, int minNumber = 1)
{
    Random random = new Random();//Create Random class
    int randomInt = random.Next(minNumber, maxNumber);//Get a random number from minNumber (inclusive) up to maxNumber (exclusive)
    return randomInt;
}

List<Transaction> GetInitialTransactionList()
{
    List<Transaction> transactionList = new List<Transaction>();//Create a new, empty Transaction list
    int numberOfObjectsToCreate = 100000;
    int maxTransactionNumber = 100000;
    int maxUserId = 9999;//Exclusive upper bound, so user ids run from 1000 to 9998
    int minUserId = 1000;
    int maxDays = 5;//Exclusive upper bound, so days 1 to 4 are generated
    for (int i = 0; i < numberOfObjectsToCreate; i++)
    {
        transactionList.Add(new Transaction(GetRandomInt(maxUserId, minUserId), GetRandomInt(maxTransactionNumber), GetRandomInt(maxDays)));//Add a new Transaction to the list
    }
    return transactionList.OrderBy(x => x.TimeStamp).ThenBy(y => y.UserId).ToList();//Order the transactions by date and then user id
}

Dictionary<string, List<Transaction>> GroupByDateTime(List<Transaction> uniqueTransactions)
{
    Dictionary<string, List<Transaction>> transactionDictionary = new Dictionary<string, List<Transaction>>();
    foreach (Transaction uniqueTransaction in uniqueTransactions)
    {
        if (!transactionDictionary.ContainsKey(uniqueTransaction.TimeStamp.ToShortDateString()))
        {
            transactionDictionary[uniqueTransaction.TimeStamp.ToShortDateString()] = new List<Transaction>();//Create a new list if datetime does not exist in the dictionary
            transactionDictionary[uniqueTransaction.TimeStamp.ToShortDateString()].Add(uniqueTransaction);//Add a Transaction to the list in the dictionary
        }
        else
        {
            transactionDictionary[uniqueTransaction.TimeStamp.ToShortDateString()].Add(uniqueTransaction);//The list already exists so we can just add onto it
        }
    }
    return transactionDictionary;
}

void PrintDictionaryValues(Dictionary<string, List<Transaction>> transactionDictionary)
{
    Console.WriteLine($"Date      |   Id 0  |  Id 1   |  Id 2   |  Id 3   |  Id 4   |  Count   ");//Header of Table
    foreach (string date in transactionDictionary.Keys)
    {
        string ids = $"  {transactionDictionary[date][0].UserId}  |  {transactionDictionary[date][1].UserId}   |  {transactionDictionary[date][2].UserId}   |  {transactionDictionary[date][3].UserId}   |  {transactionDictionary[date][4].UserId}   |  {transactionDictionary[date].Count}   ";//Row template
        Console.WriteLine($"{date} | {ids}");//Print to screen that row
    }
}

class Transaction
{
    public Transaction(int userId, int price, int day)
    {
        UserId = userId;
        Price = price;
        TimeStamp = new DateTime(2022, 12, day);//Keep year and month fixed and let the day vary
    }
    public int UserId { get; set; }
    public int Price { get; set; }
    public DateTime TimeStamp { get; set; }

    public override bool Equals(object? obj)
    {
        return obj is Transaction transaction &&
               UserId == transaction.UserId &&
               TimeStamp.Day == transaction.TimeStamp.Day &&
               TimeStamp.Month == transaction.TimeStamp.Month &&
               TimeStamp.Year == transaction.TimeStamp.Year;
    }

    public override int GetHashCode()
    {
        return HashCode.Combine(UserId, TimeStamp.Year, TimeStamp.Month, TimeStamp.Day);
    }

    public override string? ToString()
    {
        return Price.ToString();
    }
}
Code Output
Date      |   Id 0  |  Id 1   |  Id 2   |  Id 3   |  Id 4   |  Count
12/1/2022 |   1000  |  1001   |  1002   |  1003   |  1005   |  8401
12/2/2022 |   1000  |  1001   |  1002   |  1003   |  1004   |  8445
12/3/2022 |   1000  |  1001   |  1002   |  1003   |  1004   |  8430
12/4/2022 |   1000  |  1001   |  1002   |  1003   |  1004   |  8442

Take note of this table. It shows that List Contains preserved the order: from Id 0 to Id 4 the id numbers are increasing within each date. This gives us the desired result, but we also need to check the performance, which is the next test.

List Contains Speed Test

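This test measures the performance of the method by running it many times with the following parameters.

There are 10 tests, and each test calls the function 10 times. Each function call processes a list with 250000 objects; the list size is set by numberOfObjectsToCreate in GetInitialTransactionList, which the listing below shows as 100000, so adjust it to the size you want to test. The average time is taken over the 10 tests.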
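The same measurement pattern can be sketched as a small reusable helper. TimeCalls is a hypothetical name that does not appear in the listings, and the sample list of integers is only a placeholder workload so that the sketch runs quickly.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

// Hypothetical helper: runs the action `calls` times and returns the total
// elapsed milliseconds, timing only the action itself.
double TimeCalls(Action action, int calls)
{
    Stopwatch stopwatch = new Stopwatch();
    for (int i = 0; i < calls; i++)
    {
        stopwatch.Start(); // Start or resume; elapsed time accumulates
        action();
        stopwatch.Stop();  // Pause so loop overhead is not measured
    }
    return stopwatch.Elapsed.TotalMilliseconds;
}

int numberOfTests = 10;
int callsPerTest = 10;

// Placeholder workload: 1000 integers with only 100 distinct values.
List<int> sample = Enumerable.Range(0, 1000).Select(i => i % 100).ToList();

List<double> totals = new List<double>();
for (int test = 0; test < numberOfTests; test++)
{
    totals.Add(TimeCalls(() => sample.Distinct().ToList(), callsPerTest));
}

Console.WriteLine($"Average per test: {totals.Average():F3}ms over {numberOfTests} tests");
```

Because Stopwatch accumulates elapsed time across Start and Stop pairs, the returned value is the total for all calls in one test, which matches how the tests in this article report their totals.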

using System.Diagnostics;
int numberOfTests = 10;//Number of tests 
List<double> testSpeedList = new List<double>();

for (int i = 0; i < numberOfTests; i++)
{
    testSpeedList.Add(ListContainsMethodSpeedTest());
}
Console.WriteLine($"List Contains Method Average speed:{Math.Round(testSpeedList.Average())}ms, In {numberOfTests} tests");

double ListContainsMethodSpeedTest()
{
    int numberOfFunctionCalls = 10;//Number of function calls made
    Stopwatch stopwatch = new Stopwatch();
    List<Transaction> orderedTransactions = GetInitialTransactionList();//Get initial randomly generated list
    for (int i = 0; i < numberOfFunctionCalls; i++)
    {
        stopwatch.Start();//Start (or resume) the Stopwatch timer
        List<Transaction> uniqueTransactions = GetUniqueListWithListContains(orderedTransactions);//Use the List Contains method
        stopwatch.Stop();//Pause the timer so that only the function call is measured
    }
    Console.WriteLine($"Function calls:{numberOfFunctionCalls}, In {stopwatch.Elapsed.Minutes}m {stopwatch.Elapsed.Seconds}s {stopwatch.Elapsed.Milliseconds}ms");
    return stopwatch.Elapsed.TotalMilliseconds;
}

List<Transaction> GetUniqueListWithListContains(List<Transaction> orderedTransactions)
{
    List<Transaction> uniqueTransactions = new List<Transaction>();//Create new list to hold unique transactions
    foreach (Transaction orderedTransaction in orderedTransactions)//loop through an existing ordered list
    {
        if (!uniqueTransactions.Contains(orderedTransaction))//If the transaction does not exist in the new list then add it to the new list
        {
            uniqueTransactions.Add(orderedTransaction);//Add to List
        }
    }
    return uniqueTransactions;
}
Supporting Code
The supporting code (GetRandomInt, GetInitialTransactionList, GroupByDateTime, and the Transaction class) is the same as in the List Contains Method section above.

Code Output
Function calls:10, In 3m 18s 265ms
Function calls:10, In 3m 19s 880ms
Function calls:10, In 3m 16s 483ms
Function calls:10, In 3m 19s 304ms
Function calls:10, In 3m 15s 912ms
Function calls:10, In 3m 19s 471ms
Function calls:10, In 3m 15s 850ms
Function calls:10, In 3m 20s 298ms
Function calls:10, In 3m 15s 798ms
Function calls:10, In 3m 20s 805ms
List Contains Method Average speed:198207ms, In 10 tests

This output shows that this method is not efficient: each test of 10 function calls takes just over 3 minutes on average, or roughly 20 seconds per call. Let's look at the other methods to see if we can improve the performance.

LINQ Distinct Method

This method uses LINQ. It is very compact and fits on one line. Since Distinct returns an IEnumerable of T, the result needs to be converted into a List of T.
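Before applying it to objects, here is a minimal sketch of Distinct on simple values. The second half is an assumption beyond this article: DistinctBy, available from .NET 6, takes a key selector, so it can remove duplicates without overriding Equals and GetHashCode. The tuple data is hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Distinct on simple values relies on the built-in equality of int.
List<int> numbers = new List<int> { 3, 1, 3, 2, 1 };
List<int> uniqueNumbers = numbers.Distinct().ToList();
Console.WriteLine($"Unique numbers: {uniqueNumbers.Count}");

// Hypothetical sales data as tuples: (UserId, Price, TimeStamp).
var sales = new List<(int UserId, int Price, DateTime TimeStamp)>
{
    (1000, 50, new DateTime(2022, 12, 1, 9, 0, 0)),
    (1000, 75, new DateTime(2022, 12, 1, 17, 30, 0)), // Same user, same day
    (1001, 20, new DateTime(2022, 12, 1, 10, 0, 0))
};

// DistinctBy (.NET 6 or later) keeps the first item seen for each key.
var uniqueSales = sales
    .DistinctBy(sale => (sale.UserId, sale.TimeStamp.Date))
    .ToList();
Console.WriteLine($"Unique sales: {uniqueSales.Count}");
```

The key here is a (UserId, Date) tuple, which mirrors the equality rule used by the Transaction class in this article.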

List<Transaction> orderedStartingTransactions = GetInitialTransactionList();//Get the ordered starting list
List<Transaction> uniqueTransactions = GetUniqueListWithLINQDistinct(orderedStartingTransactions);//LINQ Distinct method of getting unique items
Dictionary<string, List<Transaction>> transactionDictionary = GroupByDateTime(uniqueTransactions);//Group the unique items by date so that they can be printed to the screen
PrintDictionaryValues(transactionDictionary);//Print the unique items in a table format


List<Transaction> GetUniqueListWithLINQDistinct(List<Transaction> orderedTransactions)
{
    return orderedTransactions.Distinct().ToList();//Distinct filters duplicates using GetHashCode and Equals; ToList converts the result to a List
}
Supporting Code
The supporting code (GetRandomInt, GetInitialTransactionList, GroupByDateTime, PrintDictionaryValues, and the Transaction class) is the same as in the List Contains Method section above.
Code Output
Date      |   Id 0  |  Id 1   |  Id 2   |  Id 3   |  Id 4   |  Count
12/1/2022 |   1000  |  1001   |  1002   |  1003   |  1004   |  8435
12/2/2022 |   1000  |  1001   |  1002   |  1003   |  1004   |  8423
12/3/2022 |   1000  |  1001   |  1002   |  1003   |  1004   |  8434
12/4/2022 |   1000  |  1001   |  1002   |  1003   |  1004   |  8438

This confirms that LINQ Distinct gives us what we need: a list with the duplicates filtered out, taking the date into account. The output appears ordered, but the documentation for Distinct describes its result as an unordered sequence, so the order is not guaranteed.
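If the ordered output has to be guaranteed rather than just observed, one option is to sort again after removing the duplicates. This is a sketch under the assumption that the desired order is date first and then user id, as in GetInitialTransactionList; the tuple data is hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical (UserId, Date) pairs, deliberately out of order.
var transactions = new List<(int UserId, DateTime Date)>
{
    (1002, new DateTime(2022, 12, 2)),
    (1000, new DateTime(2022, 12, 1)),
    (1002, new DateTime(2022, 12, 2)),
    (1001, new DateTime(2022, 12, 1)),
    (1000, new DateTime(2022, 12, 1))
};

// Remove duplicates first, then impose the order explicitly.
var uniqueOrdered = transactions
    .Distinct()
    .OrderBy(t => t.Date)
    .ThenBy(t => t.UserId)
    .ToList();

foreach (var t in uniqueOrdered)
{
    Console.WriteLine($"{t.Date:yyyy-MM-dd} {t.UserId}");
}
// Prints:
// 2022-12-01 1000
// 2022-12-01 1001
// 2022-12-02 1002
```

Sorting after Distinct means only the unique items are sorted, at the cost of one extra pass over the result.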

LINQ Distinct Speed Test

Now I'll complete the test for the LINQ distinct method. Below are the code changes and results. It is expected that it would be faster than the List Contains method.

using System.Diagnostics;
int numberOfTests = 10;//Number of tests 
List<double> testSpeedList = new List<double>();

for (int i = 0; i < numberOfTests; i++)
{
    testSpeedList.Add(LINQDistinctMethodSpeedTest());
}
Console.WriteLine($"LINQ Distinct Method Average speed:{Math.Round(testSpeedList.Average())}ms, In {numberOfTests} tests");

double LINQDistinctMethodSpeedTest()
{
    int numberOfFunctionCalls = 10;//Number of function calls made
    Stopwatch stopwatch = new Stopwatch();
    List<Transaction> orderedTransactions = GetInitialTransactionList();//Get initial randomly generated list
    for (int i = 0; i < numberOfFunctionCalls; i++)
    {
        stopwatch.Start();//Start (or resume) the Stopwatch timer
        List<Transaction> uniqueTransactions = GetUniqueListWithLINQDistinct(orderedTransactions);//Use the LINQ Distinct method
        stopwatch.Stop();//Pause the timer so that only the function call is measured
    }
    Console.WriteLine($"Function calls:{numberOfFunctionCalls}, In {stopwatch.Elapsed.Minutes}m {stopwatch.Elapsed.Seconds}s {stopwatch.Elapsed.Milliseconds}ms");
    return stopwatch.Elapsed.TotalMilliseconds;
}
List<Transaction> GetUniqueListWithLINQDistinct(List<Transaction> orderedTransactions)
{
    return orderedTransactions.Distinct().ToList();//Distinct filters duplicates using GetHashCode and Equals; ToList converts the result to a List
}
Supporting Code

The supporting code (GetRandomInt, GetInitialTransactionList, GroupByDateTime, and the Transaction class) is the same as in the List Contains Method section above.

Code Output
Function calls:10, In 0m 0s 110ms
Function calls:10, In 0m 0s 120ms
Function calls:10, In 0m 0s 148ms
Function calls:10, In 0m 0s 135ms
Function calls:10, In 0m 0s 120ms
Function calls:10, In 0m 0s 118ms
Function calls:10, In 0m 0s 120ms
Function calls:10, In 0m 0s 116ms
Function calls:10, In 0m 0s 122ms
Function calls:10, In 0m 0s 119ms
LINQ Distinct Method Average speed:123ms, In 10 tests

At well under a second for 10 function calls, LINQ Distinct is a viable method, and it scaled well.

HashSet Method

While HashSet is not a list, it is a collection type. A HashSet only holds distinct values, so we can use this to our advantage by putting our list items into a HashSet and then converting the HashSet to a List. See the example below.

List<Transaction> orderedStartingTransactions = GetInitialTransactionList();//Get the ordered starting list
List<Transaction> uniqueTransactions = GetUniqueListWithHashset(orderedStartingTransactions);//HashSet method of getting unique items
Dictionary<string, List<Transaction>> transactionDictionary = GroupByDateTime(uniqueTransactions);//Group the unique items by date so that they can be printed to the screen
PrintDictionaryValues(transactionDictionary);//Print the unique items in a table format


List<Transaction> GetUniqueListWithHashset(List<Transaction> orderedTransactions)
{
    HashSet<Transaction> uniqueTransactions = new HashSet<Transaction>();//Create new HashSet to hold unique transactions
    foreach (Transaction orderedTransaction in orderedTransactions)
    {
        uniqueTransactions.Add(orderedTransaction);
    }
    return uniqueTransactions.ToList();
}
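As a shorter variant of the loop above, the HashSet constructor accepts an existing collection and drops the duplicates while copying. This sketch uses a hypothetical list of user ids rather than Transaction objects.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical list of user ids containing duplicates.
List<int> userIds = new List<int> { 1000, 1001, 1000, 1002, 1001 };

// The constructor adds every item and silently skips the ones already present.
HashSet<int> uniqueSet = new HashSet<int>(userIds);
List<int> uniqueIds = uniqueSet.ToList();

Console.WriteLine($"Items in: {userIds.Count}, unique items out: {uniqueIds.Count}");
// Prints: Items in: 5, unique items out: 3
```

The result has the same caveat as the loop version: the order of the items coming out of a HashSet should not be relied on.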

Supporting Code

The supporting code (GetRandomInt, GetInitialTransactionList, GroupByDateTime, PrintDictionaryValues, and the Transaction class) is the same as in the List Contains Method section above.
Code Output
Date      |   Id 0  |  Id 1   |  Id 2   |  Id 3   |  Id 4   |  Count
12/1/2022 |   1000  |  1001   |  1002   |  1003   |  1004   |  8481
12/2/2022 |   1000  |  1001   |  1002   |  1003   |  1004   |  8458
12/3/2022 |   1000  |  1001   |  1003   |  1004   |  1005   |  8439
12/4/2022 |   1000  |  1001   |  1002   |  1003   |  1004   |  8444

This appears to be the expected output, however, there is a problem: HashSet does not guarantee that the order is preserved. So HashSet is good enough on its own only if you don't need an ordered list in the output.

HashSet Speed Test

Next, we'll test the speed of HashSet. I will use the same testing criteria as before.

using System.Diagnostics;
int numberOfTests = 10;//Number of tests 
List<double> testSpeedList = new List<double>();

for (int i = 0; i < numberOfTests; i++)
{
    testSpeedList.Add(HashSetMethodSpeedTest());
}
Console.WriteLine($"HashSet Method Average speed:{Math.Round(testSpeedList.Average())}ms, In {numberOfTests} tests");

double HashSetMethodSpeedTest()
{
    int numberOfFunctionCalls = 10;//Number of function calls made
    Stopwatch stopwatch = new Stopwatch();
    List<Transaction> orderedTransactions = GetInitialTransactionList();//Get initial randomly generated list
    for (int i = 0; i < numberOfFunctionCalls; i++)
    {
        stopwatch.Start();//Start (or resume) the Stopwatch timer
        List<Transaction> uniqueTransactions = GetUniqueListWithHashset(orderedTransactions);//Use the HashSet method
        stopwatch.Stop();//Pause the timer so that only the function call is measured
    }
    Console.WriteLine($"Function calls:{numberOfFunctionCalls}, In {stopwatch.Elapsed.Minutes}m {stopwatch.Elapsed.Seconds}s {stopwatch.Elapsed.Milliseconds}ms");
    return stopwatch.Elapsed.TotalMilliseconds;
}

List<Transaction> GetUniqueListWithHashset(List<Transaction> orderedTransactions)
{
    HashSet<Transaction> uniqueTransactions = new HashSet<Transaction>();//Create new HashSet to hold unique transactions
    foreach (Transaction orderedTransaction in orderedTransactions)
    {
        uniqueTransactions.Add(orderedTransaction);
    }
    return uniqueTransactions.ToList();
}

Supporting Code

The supporting code (GetRandomInt, GetInitialTransactionList, GroupByDateTime, and the Transaction class) is the same as in the List Contains Method section above.
Code Output
Function calls:10, In 0m 0s 114ms
Function calls:10, In 0m 0s 165ms
Function calls:10, In 0m 0s 120ms
Function calls:10, In 0m 0s 120ms
Function calls:10, In 0m 0s 122ms
Function calls:10, In 0m 0s 115ms
Function calls:10, In 0m 0s 114ms
Function calls:10, In 0m 0s 116ms
Function calls:10, In 0m 0s 115ms
Function calls:10, In 0m 0s 116ms
HashSet Method Average speed:122ms, In 10 tests

It is a fast method: ten calls over a list of 100,000 transactions finish in well under a second, which is practically equal to the LINQ Distinct method.

HashSet and List Combo Method

Next, I'll try a custom method that uses the distinct nature of HashSet while also adding each item to a list. The HashSet filters the incoming list, and each distinct entry is added to our list as it is found. This also solves the issue of HashSet not preserving order.

List<Transaction> orderedStartingTransactions = GetInitialTransactionList();//Get the ordered starting list
List<Transaction> uniqueTransactions = GetUniqueListWithHashSetAndListCombo(orderedStartingTransactions);//HashSet and List combo method of getting unique items
Dictionary<string, List<Transaction>> transactionDictionary = GroupByDateTime(uniqueTransactions);//Group by date unique items so that they can be printed to the screen
PrintDictionaryValues(transactionDictionary);//Print the unique into a table format

List<Transaction> GetUniqueListWithHashSetAndListCombo(List<Transaction> orderedTransactions)
{
    List<Transaction> uniqueTransactionsList = new List<Transaction>();//Create new List to hold unique transactions
    HashSet<Transaction> uniqueTransactionsSet = new HashSet<Transaction>();//Create new HashSet to hold unique transactions
    foreach (Transaction orderedTransaction in orderedTransactions)
    {
        if (uniqueTransactionsSet.Add(orderedTransaction))//Add returns true only if the entry was not already in the HashSet
        {
            uniqueTransactionsList.Add(orderedTransaction);//The entry is unique, so also add it to the List
        }
    }
    return uniqueTransactionsList;
}
Supporting Code

int GetRandomInt(int maxNumber, int minNumber = 1)
{
    Random random = new Random();//Create Random class
    int randomInt = random.Next(minNumber, maxNumber);//Get a random number from minNumber (inclusive) up to maxNumber (exclusive)
    return randomInt;
}


List<Transaction> GetInitialTransactionList()
{
    List<Transaction> transactionList = new List<Transaction>();//Create a new, empty Transaction list
    int numberOfObjectsToCreate = 100000;
    int maxTransactionNumber = 100000;
    int maxUserId = 9999;
    int minUserId = 1000;
    int maxDays = 5;//Upper bound is exclusive, so days 1 to 4 are generated
    for (int i = 0; i < numberOfObjectsToCreate; i++)
    {
        transactionList.Add(new Transaction(GetRandomInt(maxUserId, minUserId), GetRandomInt(maxTransactionNumber), GetRandomInt(maxDays)));//Add a new Transaction to the list
    }
    return transactionList.OrderBy(x => x.TimeStamp).ThenBy(y => y.UserId).ToList();//Order the transactions by date and then user id
}

Dictionary<string, List<Transaction>> GroupByDateTime(List<Transaction> uniqueTransactions)
{
    Dictionary<string, List<Transaction>> transactionDictionary = new Dictionary<string, List<Transaction>>();
    foreach (Transaction uniqueTransaction in uniqueTransactions)
    {
        if (!transactionDictionary.ContainsKey(uniqueTransaction.TimeStamp.ToShortDateString()))
        {
            transactionDictionary[uniqueTransaction.TimeStamp.ToShortDateString()] = new List<Transaction>();//Create a new list if datetime does not exist in the dictionary
            transactionDictionary[uniqueTransaction.TimeStamp.ToShortDateString()].Add(uniqueTransaction);//Add a Transaction to the list in the dictionary
        }
        else
        {
            transactionDictionary[uniqueTransaction.TimeStamp.ToShortDateString()].Add(uniqueTransaction);//The list already exists so we can just add onto it
        }
    }
    return transactionDictionary;
}

void PrintDictionaryValues(Dictionary<string, List<Transaction>> transactionDictionary)
{
    Console.WriteLine($"Date      |   Id 0  |  Id 1   |  Id 2   |  Id 3   |  Id 4   |  Count   ");//Header of Table
    foreach (string date in transactionDictionary.Keys)
    {
        string ids = $"  {transactionDictionary[date][0].UserId}  |  {transactionDictionary[date][1].UserId}   |  {transactionDictionary[date][2].UserId}   |  {transactionDictionary[date][3].UserId}   |  {transactionDictionary[date][4].UserId}   |  {transactionDictionary[date].Count}   ";//Row template
        Console.WriteLine($"{date} | {ids}");//Print to screen that row
    }
}

class Transaction
{
    public Transaction(int userId, int price, int day)
    {
        UserId = userId;
        Price = price;
        TimeStamp = new DateTime(2022, 12, day);//Keep year and month fixed and let the day vary
    }
    public int UserId { get; set; }
    public int Price { get; set; }
    public DateTime TimeStamp { get; set; }

    public override bool Equals(object? obj)
    {
        return obj is Transaction transaction &&
               UserId == transaction.UserId &&
               TimeStamp.Day == transaction.TimeStamp.Day &&
               TimeStamp.Month == transaction.TimeStamp.Month &&
               TimeStamp.Year == transaction.TimeStamp.Year;
    }

    public override int GetHashCode()
    {
        return HashCode.Combine(UserId, TimeStamp.Year, TimeStamp.Month, TimeStamp.Day);
    }

    public override string? ToString()
    {
        return Price.ToString();
    }
}
Code Output
Date      |   Id 0  |  Id 1   |  Id 2   |  Id 3   |  Id 4   |  Count
12/1/2022 |   1000  |  1001   |  1002   |  1003   |  1004   |  8393
12/2/2022 |   1000  |  1001   |  1002   |  1003   |  1004   |  8399
12/3/2022 |   1000  |  1001   |  1003   |  1005   |  1006   |  8479
12/4/2022 |   1000  |  1001   |  1002   |  1003   |  1004   |  8412

This is the expected output, and because the unique items are stored in a list, the order is preserved. The HashSet serves only as a filter for the items; its contents are never returned.
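
To make the filter idea above concrete, here is a minimal sketch on plain integers. The sample values and variable names are hypothetical and only serve the illustration; the pattern is the same one used in GetUniqueListWithHashSetAndListCombo.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sample data: user ids with repeats, in the order we want to keep
List<int> userIds = new List<int> { 1003, 1001, 1003, 1002, 1001, 1004 };

List<int> uniqueIds = new List<int>();     // Holds the result in first-seen order
HashSet<int> seenIds = new HashSet<int>(); // Used only as a fast "seen before?" filter

foreach (int id in userIds)
{
    // HashSet<T>.Add returns true only when the value was not already in the set
    if (seenIds.Add(id))
    {
        uniqueIds.Add(id);
    }
}

Console.WriteLine(string.Join(", ", uniqueIds)); // 1003, 1001, 1002, 1004
```

The first occurrence of each value is the one that is kept, and the result follows the order of the input rather than any internal ordering of the HashSet.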

HashSet And List Combo Method Speed Test

This will again be the same performance test as before.


using System.Diagnostics;
int numberOfTests = 10;//Number of tests 
List<double> testSpeedList = new List<double>();

for (int i = 0; i < numberOfTests; i++)
{
    testSpeedList.Add(HashSetAndListComboMethodSpeedTest());
}
Console.WriteLine($"HashSet And List Combo Method Average speed:{Math.Round(testSpeedList.Average())}ms, In {numberOfTests} tests");

double HashSetAndListComboMethodSpeedTest()
{
    int numberOfFunctionCalls = 10;//Number of function calls made
    Stopwatch stopwatch = new Stopwatch();
    List<Transaction> orderedTransactions = GetInitialTransactionList();//Get initial randomly generated list
    for (int i = 0; i < numberOfFunctionCalls; i++)
    {
        stopwatch.Start();//Start (or resume) the Stopwatch timer
        List<Transaction> uniqueTransactions = GetUniqueListWithHashSetAndListCombo(orderedTransactions);//Use HashSet and List combo method
        stopwatch.Stop();//Pause the Stopwatch timer; elapsed time accumulates across calls
    }
    Console.WriteLine($"Function calls:{numberOfFunctionCalls}, In {stopwatch.Elapsed.Minutes}m {stopwatch.Elapsed.Seconds}s {stopwatch.Elapsed.Milliseconds}ms");
    return stopwatch.Elapsed.TotalMilliseconds;
}

List<Transaction> GetUniqueListWithHashSetAndListCombo(List<Transaction> orderedTransactions)
{
    List<Transaction> uniqueTransactionsList = new List<Transaction>();//Create new List to hold unique transactions
    HashSet<Transaction> uniqueTransactionsSet = new HashSet<Transaction>();//Create new HashSet to hold unique transactions
    foreach (Transaction orderedTransaction in orderedTransactions)
    {
        if (uniqueTransactionsSet.Add(orderedTransaction))//Add returns true only if the entry was not already in the HashSet
        {
            uniqueTransactionsList.Add(orderedTransaction);//The entry is unique, so also add it to the List
        }
    }
    return uniqueTransactionsList;
}
Supporting Code

int GetRandomInt(int maxNumber, int minNumber = 1)
{
    Random random = new Random();//Create Random class
    int randomInt = random.Next(minNumber, maxNumber);//Get a random number from minNumber (inclusive) up to maxNumber (exclusive)
    return randomInt;
}


List<Transaction> GetInitialTransactionList()
{
    List<Transaction> transactionList = new List<Transaction>();//Create a new, empty Transaction list
    int numberOfObjectsToCreate = 100000;
    int maxTransactionNumber = 100000;
    int maxUserId = 9999;
    int minUserId = 1000;
    int maxDays = 5;//Upper bound is exclusive, so days 1 to 4 are generated
    for (int i = 0; i < numberOfObjectsToCreate; i++)
    {
        transactionList.Add(new Transaction(GetRandomInt(maxUserId, minUserId), GetRandomInt(maxTransactionNumber), GetRandomInt(maxDays)));//Add a new Transaction to the list
    }
    return transactionList.OrderBy(x => x.TimeStamp).ThenBy(y => y.UserId).ToList();//Order the transactions by date and then user id
}

Dictionary<string, List<Transaction>> GroupByDateTime(List<Transaction> uniqueTransactions)
{
    Dictionary<string, List<Transaction>> transactionDictionary = new Dictionary<string, List<Transaction>>();
    foreach (Transaction uniqueTransaction in uniqueTransactions)
    {
        if (!transactionDictionary.ContainsKey(uniqueTransaction.TimeStamp.ToShortDateString()))
        {
            transactionDictionary[uniqueTransaction.TimeStamp.ToShortDateString()] = new List<Transaction>();//Create a new list if datetime does not exist in the dictionary
            transactionDictionary[uniqueTransaction.TimeStamp.ToShortDateString()].Add(uniqueTransaction);//Add a Transaction to the list in the dictionary
        }
        else
        {
            transactionDictionary[uniqueTransaction.TimeStamp.ToShortDateString()].Add(uniqueTransaction);//The list already exists so we can just add onto it
        }
    }
    return transactionDictionary;
}

class Transaction
{
    public Transaction(int userId, int price, int day)
    {
        UserId = userId;
        Price = price;
        TimeStamp = new DateTime(2022, 12, day);//Keep year and month fixed and let the day vary
    }
    public int UserId { get; set; }
    public int Price { get; set; }
    public DateTime TimeStamp { get; set; }

    public override bool Equals(object? obj)
    {
        return obj is Transaction transaction &&
               UserId == transaction.UserId &&
               TimeStamp.Day == transaction.TimeStamp.Day &&
               TimeStamp.Month == transaction.TimeStamp.Month &&
               TimeStamp.Year == transaction.TimeStamp.Year;
    }

    public override int GetHashCode()
    {
        return HashCode.Combine(UserId, TimeStamp.Year, TimeStamp.Month, TimeStamp.Day);
    }

    public override string? ToString()
    {
        return Price.ToString();
    }
}

Code Output
Function calls:10, In 0m 0s 129ms
Function calls:10, In 0m 0s 147ms
Function calls:10, In 0m 0s 129ms
Function calls:10, In 0m 0s 132ms
Function calls:10, In 0m 0s 125ms
Function calls:10, In 0m 0s 121ms
Function calls:10, In 0m 0s 121ms
Function calls:10, In 0m 0s 118ms
Function calls:10, In 0m 0s 121ms
Function calls:10, In 0m 0s 121ms
HashSet And List Combo Method Average speed:127ms, In 10 tests

This is in line with the HashSet and LINQ Distinct methods, with no performance advantage over either of them. But because the items are saved into a list, the order is preserved.
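
The combo relies only on HashSet<T>.Add and List<T>.Add, so it is not tied to the Transaction type. As a rough sketch, it could be written as a generic helper. The name GetUniqueInOrder and the sample strings are hypothetical and not part of the code above. For custom classes, duplicates would still be decided by the Equals and GetHashCode overrides, as with Transaction.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sample data: strings with repeats
List<string> names = new List<string> { "b", "a", "b", "c", "a" };
List<string> uniqueNames = GetUniqueInOrder(names);
Console.WriteLine(string.Join(", ", uniqueNames)); // b, a, c

// Hypothetical generic helper: the same HashSet-as-filter idea for any element type T
List<T> GetUniqueInOrder<T>(IEnumerable<T> items)
{
    List<T> uniqueItems = new List<T>();     // Result in first-seen order
    HashSet<T> seenItems = new HashSet<T>(); // Filter for items already seen
    foreach (T item in items)
    {
        if (seenItems.Add(item)) // True only the first time an item is seen
        {
            uniqueItems.Add(item);
        }
    }
    return uniqueItems;
}
```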

Conclusion

Overall Rank | Method                 | Speed    | Order Preserved
1            | HashSet and List Combo | 127ms    | Yes
2            | HashSet                | 122ms    | No
3            | LINQ Distinct          | 123ms    | No
4            | List Contains          | 198207ms | Yes

The best method for getting unique items from a list is to use a combination of HashSet and List. It has good performance and the order of the items is preserved. If you do not care about the order, then LINQ Distinct or HashSet might be better, since they need only a few lines of code and their performance is just as good. List Contains should not be used to get distinct items: it does not scale and was by far the slowest method in these tests.
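
For reference, the two short forms mentioned above can be sketched on a small list of integers. The sample values are hypothetical. Only the counts are printed here, since neither form is relied on for ordering in this article.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample data with repeats
List<int> userIds = new List<int> { 1003, 1001, 1003, 1002, 1001, 1004 };

// LINQ Distinct: a single call
List<int> viaDistinct = userIds.Distinct().ToList();

// HashSet constructor: also a single call; the set drops repeats as it is filled
List<int> viaHashSet = new HashSet<int>(userIds).ToList();

Console.WriteLine($"Distinct: {viaDistinct.Count}, HashSet: {viaHashSet.Count}"); // Distinct: 4, HashSet: 4
```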
