Analysis Of Getting First N Characters Methods In .NET


Introduction

Getting a substring from another string is a fundamental operation in .NET. It comes up again and again, so it is worth assessing carefully which function to use. We'll compare several approaches, both string methods and non-string methods: substring, remove, and span slice. In this analysis, we look at the performance (speed) of each one as well as its readability.

Substring Method

The substring method is the go-to string function in many use cases. Using substring, we pass the start index and the length of the substring we would like to take from the text.

Suppose we have some text where we need to return a substring of the first 11 characters.

If we have the string 'fare_amount,pickup_datetime,pickup_longitude,pickup_latitude' and we only want to return the substring 'fare_amount', we need to start at index 0 and take all characters up to, but not including, the first comma, which is at index 11 of this string. Let's see the code in action below.

String Substring Code Example

Below is the code to get the substring of the first 11 characters of the text.

The line to focus on is the call to the substring method, which is where all the work happens. It takes the characters at indices 0 through 10, which is 11 characters in total.

string text = "fare_amount,pickup_datetime,pickup_longitude,pickup_latitude";
int numberOfCharacters = 11;
string substring = GetSubstringUsingSubstring(text, numberOfCharacters);
Console.WriteLine("text:" + text);
Console.WriteLine("substring:"+substring);
string GetSubstringUsingSubstring(string text, int numberOfCharacters)
{
    if (string.IsNullOrEmpty(text))//nothing to take from a null or empty string
    {
        return "";//return an empty string
    }
    if (numberOfCharacters <= 0 || numberOfCharacters > text.Length)//numberOfCharacters must be positive and no greater than the length of the text
    {
        return "";//return an empty string
    }
    string substring = text.Substring(0, numberOfCharacters);//Get the first n characters of this string
    return substring;
}
Code Output
text:fare_amount,pickup_datetime,pickup_longitude,pickup_latitude
substring:fare_amount

We see that we can quickly get the first n characters using substring. The code at the beginning protects against exceptions in cases where the text is null or empty, or where numberOfCharacters is zero, negative, or exceeds the size of the string. It's good practice to eliminate those exception threats right away.
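To make the exception threats concrete, here is a minimal sketch of what an unguarded call to Substring does with bad input. The Probe helper and the variable names are hypothetical and exist only for this illustration; a shorter sample text is used for brevity.

```csharp
using System;

string text = "fare_amount,pickup_datetime";
string missing = null; // stands in for text that was never set

// Hypothetical helper for this sketch: runs the call and reports
// either the result or the name of the exception type that was thrown.
string Probe(Func<string> action)
{
    try
    {
        return "ok: " + action();
    }
    catch (Exception ex)
    {
        return ex.GetType().Name;
    }
}

string tooLong = Probe(() => text.Substring(0, text.Length + 1)); // length runs past the end of the text
string negative = Probe(() => text.Substring(0, -1));             // negative length
string onNull = Probe(() => missing.Substring(0, 3));             // called on a null reference
string whole = Probe(() => text.Substring(0, text.Length));       // exactly the full length is valid

Console.WriteLine(tooLong);  // ArgumentOutOfRangeException
Console.WriteLine(negative); // ArgumentOutOfRangeException
Console.WriteLine(onNull);   // NullReferenceException
Console.WriteLine(whole);    // ok: fare_amount,pickup_datetime
```

Note that asking for exactly text.Length characters is valid and returns the whole string, so a guard only needs to reject values greater than the length.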

Substring Speed Test

In addition to a functional example, we also need to test the performance of substring. To do this, we call the method 100 million times and average the elapsed time over 10 tests. C# performs this operation very quickly, so I call the method this many times to get a measurable spread in time between the methods. Let's see an example.

Substring Speed Code
using System.Diagnostics;
int numberOfTests = 10;//Number of tests 
List<double> testSpeedList = new List<double>();
for (int i = 0; i < numberOfTests; i++)
{
    testSpeedList.Add(SubstringSpeedTest());
}
Console.WriteLine($"Substring Average speed:{Math.Round(testSpeedList.Average())}ms, In {numberOfTests} tests");
double SubstringSpeedTest()
{
    int numberOfFunctionCalls = 100000000;//Number of function calls made
    string text = "fare_amount,pickup_datetime,pickup_longitude,pickup_latitude";
    int numberOfCharacters = 11;
    Stopwatch stopwatch = new Stopwatch();
    stopwatch.Start();//Start the Stopwatch timer
    string alteredText = "";
    for (int i = 0; i < numberOfFunctionCalls; i++)
    {
        alteredText = text.Substring(0, numberOfCharacters);//Get the first n characters of this string
    }
    stopwatch.Stop();//Stop the Stopwatch timer
    Console.WriteLine($"sampleText:{text}, alteredText:{alteredText}, Function calls:{numberOfFunctionCalls}, In {stopwatch.Elapsed.Minutes}m {stopwatch.Elapsed.Seconds}s {stopwatch.Elapsed.Milliseconds}ms");
    return stopwatch.Elapsed.TotalMilliseconds;
}
Code Output
sampleText:fare_amount,pickup_datetime,pickup_longitude,pickup_latitude, alteredText:fare_amount, Function calls:100000000, In 0m 1s 142ms
sampleText:fare_amount,pickup_datetime,pickup_longitude,pickup_latitude, alteredText:fare_amount, Function calls:100000000, In 0m 0s 745ms
sampleText:fare_amount,pickup_datetime,pickup_longitude,pickup_latitude, alteredText:fare_amount, Function calls:100000000, In 0m 0s 739ms
sampleText:fare_amount,pickup_datetime,pickup_longitude,pickup_latitude, alteredText:fare_amount, Function calls:100000000, In 0m 0s 726ms
sampleText:fare_amount,pickup_datetime,pickup_longitude,pickup_latitude, alteredText:fare_amount, Function calls:100000000, In 0m 0s 727ms
sampleText:fare_amount,pickup_datetime,pickup_longitude,pickup_latitude, alteredText:fare_amount, Function calls:100000000, In 0m 0s 730ms
sampleText:fare_amount,pickup_datetime,pickup_longitude,pickup_latitude, alteredText:fare_amount, Function calls:100000000, In 0m 0s 726ms
sampleText:fare_amount,pickup_datetime,pickup_longitude,pickup_latitude, alteredText:fare_amount, Function calls:100000000, In 0m 0s 726ms
sampleText:fare_amount,pickup_datetime,pickup_longitude,pickup_latitude, alteredText:fare_amount, Function calls:100000000, In 0m 0s 741ms
sampleText:fare_amount,pickup_datetime,pickup_longitude,pickup_latitude, alteredText:fare_amount, Function calls:100000000, In 0m 0s 724ms
Substring Average speed:773ms, In 10 tests

We see in this first test that the substring method completes 100 million calls in an average of 773ms, which is pretty good. That's under a second for 100 million operations, so it's very fast.
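To put that figure in perspective, the average can be converted to a rough cost per call. The sketch below only does the arithmetic on the numbers reported above; the variable names are made up for the illustration, and the result includes the overhead of the loop itself, so treat it as an approximation.

```csharp
using System;
using System.Globalization;

double averageMilliseconds = 773;  // average reported by the speed test above
long callsPerTest = 100_000_000;   // numberOfFunctionCalls in the speed test

// 1 ms = 1,000,000 ns, so convert the total to nanoseconds and divide by the number of calls.
double nanosecondsPerCall = averageMilliseconds * 1_000_000 / callsPerTest;

// InvariantCulture keeps the decimal point the same on every machine.
Console.WriteLine(nanosecondsPerCall.ToString("F2", CultureInfo.InvariantCulture) + " ns per call"); // 7.73 ns per call
```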

Readability Analysis For Substring

This function reads really well and it is easily understood as taking just a piece of the original text.

Remove Method

Remove is similar to the substring method in its setup. It differs in that the start index is 11 instead of 0. The remove function takes away all the characters from index 11 onward, so what is left over is the first 11 characters. Let's see an example.

Remove Method Code Example
string text = "fare_amount,pickup_datetime,pickup_longitude,pickup_latitude";
int startIndex = 11;
string substring = GetSubstringUsingRemove(text, startIndex);
Console.WriteLine("text:" + text);
Console.WriteLine("substring:" + substring);
string GetSubstringUsingRemove(string text, int startIndex)
{
    if (string.IsNullOrEmpty(text))//nothing to take from a null or empty string
    {
        return "";//return an empty string
    }
    if (startIndex <= 0 || startIndex > text.Length)//startIndex must be positive and not past the end of the text
    {
        return "";//return an empty string
    }
    int numberofCharacters = text.Length - startIndex;
    string substring = text.Remove(startIndex, numberofCharacters);//Get the first n characters of this string by removing everything from startIndex onward.
    return substring;
}
Code Output
text:fare_amount,pickup_datetime,pickup_longitude,pickup_latitude
substring:fare_amount

For the remove method, we pass the start index and work out how many characters to remove from the length of the text. The result is the same as with the substring method.
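As a side note, String.Remove also has a one-argument overload that removes everything from the start index to the end of the string, which avoids computing the count by hand. The sketch below compares the two forms; the variable names viaCount and viaOverload are chosen just for this illustration.

```csharp
using System;

string text = "fare_amount,pickup_datetime,pickup_longitude,pickup_latitude";
int startIndex = 11;

// Two-argument form, as in the example above: remove (Length - startIndex) characters.
string viaCount = text.Remove(startIndex, text.Length - startIndex);

// One-argument form: remove everything from startIndex through the end of the string.
string viaOverload = text.Remove(startIndex);

Console.WriteLine("viaCount:" + viaCount);       // viaCount:fare_amount
Console.WriteLine("viaOverload:" + viaOverload); // viaOverload:fare_amount
```

The speed tests in this analysis use the two-argument form, so the one-argument overload is not covered by the timings.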

Remove Method Speed Test

Next, we need to test the speed of the string remove method using the same speed test as before. Below is the code.

Remove Method Speed Code
using System.Diagnostics;
int numberOfTests = 10;//Number of tests 
List<double> testSpeedList = new List<double>();
for (int i = 0; i < numberOfTests; i++)
{
    testSpeedList.Add(RemoveSpeedTest());
}
Console.WriteLine($"Remove Average speed:{Math.Round(testSpeedList.Average())}ms, In {numberOfTests} tests");
double RemoveSpeedTest()
{
    int numberOfFunctionCalls = 100000000;//Number of function calls made
    string text = "fare_amount,pickup_datetime,pickup_longitude,pickup_latitude";
    int startIndex = 11;
    Stopwatch stopwatch = new Stopwatch();
    stopwatch.Start();//Start the Stopwatch timer
    string alteredText = "";
    for (int i = 0; i < numberOfFunctionCalls; i++)
    {
        int numberofCharacters = text.Length - startIndex;
        alteredText = text.Remove(startIndex, numberofCharacters);//Get the first n characters of this string by removing the n last characters.
    }
    stopwatch.Stop();//Stop the Stopwatch timer
    Console.WriteLine($"sampleText:{text}, alteredText:{alteredText}, Function calls:{numberOfFunctionCalls}, In {stopwatch.Elapsed.Minutes}m {stopwatch.Elapsed.Seconds}s {stopwatch.Elapsed.Milliseconds}ms");
    return stopwatch.Elapsed.TotalMilliseconds;
}
Code Output
sampleText:fare_amount,pickup_datetime,pickup_longitude,pickup_latitude, alteredText:fare_amount, Function calls:100000000, In 0m 1s 252ms
sampleText:fare_amount,pickup_datetime,pickup_longitude,pickup_latitude, alteredText:fare_amount, Function calls:100000000, In 0m 0s 916ms
sampleText:fare_amount,pickup_datetime,pickup_longitude,pickup_latitude, alteredText:fare_amount, Function calls:100000000, In 0m 0s 956ms
sampleText:fare_amount,pickup_datetime,pickup_longitude,pickup_latitude, alteredText:fare_amount, Function calls:100000000, In 0m 0s 898ms
sampleText:fare_amount,pickup_datetime,pickup_longitude,pickup_latitude, alteredText:fare_amount, Function calls:100000000, In 0m 0s 899ms
sampleText:fare_amount,pickup_datetime,pickup_longitude,pickup_latitude, alteredText:fare_amount, Function calls:100000000, In 0m 0s 907ms
sampleText:fare_amount,pickup_datetime,pickup_longitude,pickup_latitude, alteredText:fare_amount, Function calls:100000000, In 0m 0s 902ms
sampleText:fare_amount,pickup_datetime,pickup_longitude,pickup_latitude, alteredText:fare_amount, Function calls:100000000, In 0m 0s 906ms
sampleText:fare_amount,pickup_datetime,pickup_longitude,pickup_latitude, alteredText:fare_amount, Function calls:100000000, In 0m 0s 902ms
sampleText:fare_amount,pickup_datetime,pickup_longitude,pickup_latitude, alteredText:fare_amount, Function calls:100000000, In 0m 0s 893ms
Remove Average speed:944ms, In 10 tests

At an average of 944ms, remove is about 170ms (roughly 22%) slower than substring, so the difference isn't that large.

Readability Analysis For Remove

Although this works, using remove is less intuitive because the code says we are removing the trailing characters in order to keep the first n characters.

Slice Method

Slice is not a method of the string class. It belongs to the span types (here ReadOnlySpan<char>), which are lightweight structs that live on the stack and point at the existing characters instead of copying them into a new string on the heap, so in theory it should be faster than the string-based methods. The setup is similar to the substring method, so let's look at an example.

Slice Method Example
string text = "fare_amount,pickup_datetime,pickup_longitude,pickup_latitude";
int numberOfCharacters = 11;
string substring = GetSubstringUsingSlice(text, numberOfCharacters);
Console.WriteLine("text:" + text);
Console.WriteLine("substring:" + substring);
string GetSubstringUsingSlice(ReadOnlySpan<char> text, int numberOfCharacters)
{
    if (text.IsEmpty)//check if text is empty (a null string converts to an empty span)
    {
        return "";//return an empty string
    }
    if (numberOfCharacters <= 0 || numberOfCharacters > text.Length)//numberOfCharacters must be positive and no greater than the length of the text
    {
        return "";//return an empty string
    }
    ReadOnlySpan<char> substring = text.Slice(0, numberOfCharacters);//Get the first n characters of this span
    return substring.ToString();
}
Code Output
text:fare_amount,pickup_datetime,pickup_longitude,pickup_latitude
substring:fare_amount

Notice that we are using ReadOnlySpan<char>, which accepts the string that is passed in, and that we also need to call ToString on the result of the slice. The parameters are the same as for the substring method, and the result is also the same. Let's look at the performance.
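For reference, there are a few equivalent ways to take the first n characters as a span. The sketch below shows Slice next to the AsSpan(start, length) overload and the range syntax, then converts the result back to a string. The variable names are chosen for this illustration and none of these variants are part of the speed tests.

```csharp
using System;

string text = "fare_amount,pickup_datetime,pickup_longitude,pickup_latitude";
int numberOfCharacters = 11;

// Each of these is a view over the original characters; nothing is copied yet.
ReadOnlySpan<char> viaSlice = text.AsSpan().Slice(0, numberOfCharacters);
ReadOnlySpan<char> viaAsSpan = text.AsSpan(0, numberOfCharacters);
ReadOnlySpan<char> viaRange = text.AsSpan()[..numberOfCharacters];

// SequenceEqual compares the characters the spans point at.
bool allEqual = viaSlice.SequenceEqual(viaAsSpan) && viaSlice.SequenceEqual(viaRange);

// ToString is the step that allocates a new string from the span.
string result = viaSlice.ToString();

Console.WriteLine("allEqual:" + allEqual); // allEqual:True
Console.WriteLine("result:" + result);     // result:fare_amount
```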

Slice Method Speed Code
using System.Diagnostics;
int numberOfTests = 10;//Number of tests 
List<double> testSpeedList = new List<double>();
for (int i = 0; i < numberOfTests; i++)
{
    testSpeedList.Add(SliceSpeedTest());
}
Console.WriteLine($"Slice Average speed:{Math.Round(testSpeedList.Average())}ms, In {numberOfTests} tests");
double SliceSpeedTest()
{
    int numberOfFunctionCalls = 100000000;//Number of function calls made
    ReadOnlySpan<char> text = "fare_amount,pickup_datetime,pickup_longitude,pickup_latitude";
    int numberOfCharacters = 11;
    Stopwatch stopwatch = new Stopwatch();
    stopwatch.Start();//Start the Stopwatch timer
    ReadOnlySpan<char> alteredText = "";
    for (int i = 0; i < numberOfFunctionCalls; i++)
    {
        alteredText = text.Slice(0, numberOfCharacters);//Get the first n characters of this string
    }
    stopwatch.Stop();//Stop the Stopwatch timer
    Console.WriteLine($"sampleText:{text}, alteredText:{alteredText}, Function calls:{numberOfFunctionCalls}, In {stopwatch.Elapsed.Minutes}m {stopwatch.Elapsed.Seconds}s {stopwatch.Elapsed.Milliseconds}ms");
    return stopwatch.Elapsed.TotalMilliseconds;
}
Code Output
sampleText:fare_amount,pickup_datetime,pickup_longitude,pickup_latitude, alteredText:fare_amount, Function calls:100000000, In 0m 0s 603ms
sampleText:fare_amount,pickup_datetime,pickup_longitude,pickup_latitude, alteredText:fare_amount, Function calls:100000000, In 0m 0s 559ms
sampleText:fare_amount,pickup_datetime,pickup_longitude,pickup_latitude, alteredText:fare_amount, Function calls:100000000, In 0m 0s 479ms
sampleText:fare_amount,pickup_datetime,pickup_longitude,pickup_latitude, alteredText:fare_amount, Function calls:100000000, In 0m 0s 480ms
sampleText:fare_amount,pickup_datetime,pickup_longitude,pickup_latitude, alteredText:fare_amount, Function calls:100000000, In 0m 0s 481ms
sampleText:fare_amount,pickup_datetime,pickup_longitude,pickup_latitude, alteredText:fare_amount, Function calls:100000000, In 0m 0s 479ms
sampleText:fare_amount,pickup_datetime,pickup_longitude,pickup_latitude, alteredText:fare_amount, Function calls:100000000, In 0m 0s 478ms
sampleText:fare_amount,pickup_datetime,pickup_longitude,pickup_latitude, alteredText:fare_amount, Function calls:100000000, In 0m 0s 478ms
sampleText:fare_amount,pickup_datetime,pickup_longitude,pickup_latitude, alteredText:fare_amount, Function calls:100000000, In 0m 0s 478ms
sampleText:fare_amount,pickup_datetime,pickup_longitude,pickup_latitude, alteredText:fare_amount, Function calls:100000000, In 0m 0s 478ms
Slice Average speed:500ms, In 10 tests

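We see that slice is the fastest at 500ms, so the span enhancements have paid off. Keep in mind that the timed loop only slices and never calls ToString, so unlike the substring and remove tests it does not create a new string on each call. Understanding when to use span is key, as it has both strengths and weaknesses.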
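If a string is needed at the end, as in GetSubstringUsingSlice, a variant of the speed test that converts each slice back to a string may give a closer comparison with substring and remove. The following is only a sketch under that assumption: the function name SliceToStringSpeedTest and the reduced call count are hypothetical, and no timings are claimed for it here.

```csharp
using System;
using System.Diagnostics;

double SliceToStringSpeedTest(int numberOfFunctionCalls)
{
    string text = "fare_amount,pickup_datetime,pickup_longitude,pickup_latitude";
    int numberOfCharacters = 11;
    string alteredText = "";
    Stopwatch stopwatch = Stopwatch.StartNew(); // create and start the timer
    for (int i = 0; i < numberOfFunctionCalls; i++)
    {
        // Slice, then create a string so each iteration produces the same kind of result as Substring
        alteredText = text.AsSpan().Slice(0, numberOfCharacters).ToString();
    }
    stopwatch.Stop();
    Console.WriteLine($"alteredText:{alteredText}, Function calls:{numberOfFunctionCalls}");
    return stopwatch.Elapsed.TotalMilliseconds;
}

// A smaller call count than the tests above, chosen only to keep the sketch quick to run.
double elapsedMilliseconds = SliceToStringSpeedTest(1_000_000);
Console.WriteLine($"Slice + ToString: {elapsedMilliseconds}ms");
```

To compare against the other methods, the call count and the number of tests would need to match the ones used above.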

Conclusion

Overall Rank    Method              Speed    Concise     Readability (1-5)
1               Span Slice          500ms    10 lines    4
2               String Substring    773ms    10 lines    5
3               String Remove       944ms    11 lines    3

The best method for getting the first n characters is span slice. Performance is the most important metric for choosing the best method because it affects the end user, testing, and productivity the most. Slice is faster than the next method by over 200ms. It is not as readable as substring, however, because an understanding of span is needed. Substring is second: it is slower than slice, although it is the most readable and the easiest to set up. The remove method is the slowest of the three and also suffers from readability issues; it is not as intuitive as substring or slice.
