In C#, Get A Substring Before Or After A Certain Character

Introduction

This scenario is common when parsing strings from user input, database records, and data generated automatically by scripts. In C#, we often need to parse a string for names, dates, or other information and return a sub-section of it. There is more than one way to extract that information from a string; the approaches below are some of the options when looking for a particular character or pattern.

Get A Substring After A Certain Character

The first step is to find the index of a certain character with IndexOf, then use that index to work out where the new substring starts. The remaining characters are taken and returned as a new substring.

Starting String
012345678
sixty-two

Approach

Take the compound number sixty-two, which contains a hyphen. If we want all the text after the first hyphen, we use the string IndexOf method to find the hyphen (or any other character we need). Here IndexOf returns 5. We don't want to start the substring at 5 because that would include the hyphen, so we add one to get the start position just after the hyphen, which is index 6. The substring then covers every character to the right of the hyphen, indexes 6 to 8. The two-argument Substring overload also needs a length for the new string, which is the length of the original string (9) minus the start index (6), giving 3. See the code example below.

Code Example
using System;
string compoundNumber = "sixty-two";//Original string before manipulation
int indexCharLength = 1;//Length of the character we want to skip over for the substring
char indexChar = '-';//Character that we want to find the index of
int substringStartIndex = compoundNumber.IndexOf(indexChar) + indexCharLength;//Index of the hyphen (5) plus 1 gives the start index just after the hyphen (6)
int length = compoundNumber.Length - substringStartIndex;//This yields a length of 3 since the start index is 6 and the string length is 9
string number = compoundNumber.Substring(substringStartIndex, length);//Start index is 6 and length is 3, so the substring covers indexes 6, 7, and 8 to give the "two" at the end
Console.WriteLine("compoundNumber:" + compoundNumber);//Print to screen starting string
Console.WriteLine("number:" + number);//Print to screen ending string
Code Output
compoundNumber:sixty-two
number:two
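
The explicit length is optional here. Substring also has a one-argument overload that returns everything from the start index to the end of the string, and C# 8.0 and later accept range syntax on strings. A minimal sketch comparing the three forms on the same input; the variable names are illustrative, and the range form assumes a C# 8.0 or later compiler on .NET Core 3.0 or later:

```csharp
using System;

string compoundNumber = "sixty-two";
int start = compoundNumber.IndexOf('-') + 1; // 5 + 1 = 6

string withLength = compoundNumber.Substring(start, compoundNumber.Length - start); // explicit length of 3
string toEnd = compoundNumber.Substring(start); // one-argument overload: start index to end of string
string withRange = compoundNumber[start..];     // range syntax (assumes C# 8.0 or later)

Console.WriteLine(withLength); // two
Console.WriteLine(toEnd);      // two
Console.WriteLine(withRange);  // two
```

All three produce the same result, so the choice is mostly about readability and which language version the project targets.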

Multiple String Approach

That works for one string, but to test it against several strings with the hyphen in different positions, the code needs to be placed into a function and called once per string. Below is an example.

Code Example
using System;
using System.Collections.Generic;
List<string> compoundNumberList = new List<string>() { "sixty-two", "seventy-five", "fifty-nine", "ninety-three" };
foreach (string compoundNumber in compoundNumberList)
{
    string number = GetSubstringAfterOfSpecificChar(compoundNumber);//Call function to get the substring after (to the right of) the character
    Console.WriteLine("compoundNumber:" + compoundNumber + ", number:" + number);//Print original and function output
}
string GetSubstringAfterOfSpecificChar(string compoundNumber)
{
    const int indexCharLength = 1;//Length of the character we want to skip over for the substring
    const char indexChar = '-';//Character that we want to find the index of
    int substringStartIndex = compoundNumber.IndexOf(indexChar) + indexCharLength;//Get the index of the hyphen plus 1 to get the start index just after the hyphen
    string number = compoundNumber.Substring(substringStartIndex);//Start index is determined by IndexOf, then the rest of the string is returned in the substring
    return number;
}
Code Output
compoundNumber:sixty-two, number:two
compoundNumber:seventy-five, number:five
compoundNumber:fifty-nine, number:nine
compoundNumber:ninety-three, number:three
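
One case the function above does not cover is an input with no hyphen at all. IndexOf returns -1 when the character is not found, so adding 1 gives a start index of 0 and the whole input comes back unchanged. A minimal sketch of one way to make that case explicit; the helper name TryGetSubstringAfter and its bool-plus-out-parameter shape are assumptions for illustration, not part of the original code:

```csharp
using System;

foreach (string input in new[] { "sixty-two", "sixty" })
{
    if (TryGetSubstringAfter(input, '-', out string number))
        Console.WriteLine("input:" + input + ", number:" + number);
    else
        Console.WriteLine("input:" + input + ", no hyphen found");
}

// Hypothetical helper: reports failure instead of silently returning the whole input
bool TryGetSubstringAfter(string input, char indexChar, out string result)
{
    int index = input.IndexOf(indexChar); // -1 when indexChar is not in input
    if (index < 0)
    {
        result = string.Empty;
        return false;
    }
    result = input.Substring(index + 1); // everything after the first occurrence
    return true;
}
```

Whether a missing hyphen should count as an error or simply return the input depends on the data being parsed; the point is to decide deliberately rather than rely on the -1 arithmetic.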

Multiple Hyphen With Old Approach

The next challenge is an input string with more than one hyphen. Let's see what happens when we run the same code on a different input set; the result may not be what we want. Suppose the inputs are phone numbers, and we want to keep the last 4 digits and block out the rest of the number.

Code Example
using System;
using System.Collections.Generic;
List<string> compoundNumberList = new List<string>() { "324-380-3289", "845-389-3288", "383-383-9599", "849-458-3939" };
foreach (string compoundNumber in compoundNumberList)
{
    string number = GetSubstringToLeftOfSpecificChar(compoundNumber);//Call function to get the substring after the first occurrence of the character
    Console.WriteLine("compoundNumber:" + compoundNumber + ", number:***-***-" + number);//Print original and function output
}
string GetSubstringToLeftOfSpecificChar(string compoundNumber)
{
    const int indexCharLength = 1;//Length of the character we want to skip over for the substring
    const char indexChar = '-';//Character that we want to find the index of
    int substringStartIndex = compoundNumber.IndexOf(indexChar) + indexCharLength;//Get the index of the first hyphen plus 1 to get the start index just after that hyphen
    string number = compoundNumber.Substring(substringStartIndex);//Start index is determined by IndexOf, then the rest of the string is returned in the substring
    return number;
}
Code Output
compoundNumber:324-380-3289, number:***-***-380-3289
compoundNumber:845-389-3288, number:***-***-389-3288
compoundNumber:383-383-9599, number:***-***-383-9599
compoundNumber:849-458-3939, number:***-***-458-3939
Starting Phone Number Example (first row is the tens digit of each index, second row is the ones digit)
000000000011
012345678901
324-380-3289

What Went Wrong When There Are Multiple Hyphens?

As the output shows, the substring starts just after the first hyphen (which is at index 3), so it runs from index 4 onwards. What we want instead is the substring just after the second hyphen (which is at index 7), running from index 8 onwards.

Multiple Hyphens With Correct Approach

To do this we need to use LastIndexOf instead of IndexOf to get the last hyphen in the string. See the example below.

Code Example
using System;
using System.Collections.Generic;
List<string> compoundNumberList = new List<string>() { "324-380-3289", "845-389-3288", "383-383-9599", "849-458-3939" };
foreach (string compoundNumber in compoundNumberList)
{
    string number = GetSubstringToAfterASpecificChar(compoundNumber);//Call function to get the substring after the last occurrence of the character
    Console.WriteLine("compoundNumber:" + compoundNumber + ", number:***-***-" + number);//Print original and function output
}
string GetSubstringToAfterASpecificChar(string compoundNumber)
{
    const int indexCharLength = 1;//Length of the character we want to skip over for the substring
    const char indexChar = '-';//Character that we want to find the index of
    int substringStartIndex = compoundNumber.LastIndexOf(indexChar) + indexCharLength;//Get the index of the last hyphen plus 1 to get the start index just after the hyphen
    string number = compoundNumber.Substring(substringStartIndex);//Start index is determined by LastIndexOf, then the rest of the string is returned in the substring
    return number;
}
Code Output
compoundNumber:324-380-3289, number:***-***-3289
compoundNumber:845-389-3288, number:***-***-3288
compoundNumber:383-383-9599, number:***-***-9599
compoundNumber:849-458-3939, number:***-***-3939

Multiple Hyphens With Correct Approach Recap

As we can see, using LastIndexOf now correctly gets the position of the last hyphen, so the substring is only the last 4 digits of the phone number.
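
To make the two positions concrete, this short sketch prints the indexes that IndexOf and LastIndexOf report for the same phone number, along with the substring each one produces. The variable names are illustrative:

```csharp
using System;

string phoneNumber = "324-380-3289";
int firstHyphen = phoneNumber.IndexOf('-');    // 3: searches from the start of the string
int lastHyphen = phoneNumber.LastIndexOf('-'); // 7: searches from the end of the string

string afterFirst = phoneNumber.Substring(firstHyphen + 1); // starts at index 4
string afterLast = phoneNumber.Substring(lastHyphen + 1);   // starts at index 8

Console.WriteLine("firstHyphen:" + firstHyphen + ", substring:" + afterFirst); // firstHyphen:3, substring:380-3289
Console.WriteLine("lastHyphen:" + lastHyphen + ", substring:" + afterLast);    // lastHyphen:7, substring:3289
```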

Get A Substring Before A Certain Character

Another way to form a substring is to use IndexOf to find a certain character, then pass 0 as the start of the substring and the index returned by IndexOf as its length. See the example below.

Starting String

Street430_10302001.csv

Problem Statement

We want to find the file name and exclude the file extension. We are only given the file name with an extension from the database and have to parse the string to get the file name only.

Approach

To solve this we use IndexOf, as in the previous examples, to find the index of the period, which marks the end point of our substring. Once that is found, we form the substring by passing a start index of zero and a length equal to the index of the period. See the code example below.

Code Example
using System;
string fileNameWithExtension = "Street430_10302001.csv";//Original filename with extension before manipulation
const int startIndex = 0;//Start index for the substring
char indexChar = '.';//Character that we want to find the index of
int substringEndIndex = fileNameWithExtension.IndexOf(indexChar);//Get index of the period
string fileName = fileNameWithExtension.Substring(startIndex, substringEndIndex);//Start index is 0 and the length is the index of the period, so the substring ends just before the period
Console.WriteLine("fileNameWithExtension:" + fileNameWithExtension);//Print to screen starting string
Console.WriteLine("fileName:" + fileName);//Print to screen ending string
Code Output
fileNameWithExtension:Street430_10302001.csv
fileName:Street430_10302001
New Problem Statement

Now, what if we want to find the file name for multiple file names with extensions, where the file name length may differ each time? Let's see an example.

Code Example
using System;
using System.Collections.Generic;
List<string> fileNameWithExtensions = new List<string>() { "Street430_10302001.xlsx", "Building430_1152003.csv", "Shop4620_01032020.xlsm", "Drive5405_12042021.xls" };
foreach (string fileNameWithExtension in fileNameWithExtensions)
{
    string fileName = GetSubstringBeforeASpecificChar(fileNameWithExtension);//Call function to get the substring before the character
    Console.WriteLine("fileNameWithExtension:" + fileNameWithExtension + ", fileName:" + fileName);//Print original and function output
}
string GetSubstringBeforeASpecificChar(string fileNameWithExtension)
{
    const int startIndex = 0;//Start index for the substring
    const char indexChar = '.';//Character that we want to find the index of
    int substringEndIndex = fileNameWithExtension.IndexOf(indexChar);//Get index of the period
    string fileName = fileNameWithExtension.Substring(startIndex, substringEndIndex);//Start index is 0 and the length is the index of the period
    return fileName;
}
Code Output
fileNameWithExtension:Street430_10302001.xlsx, fileName:Street430_10302001
fileNameWithExtension:Building430_1152003.csv, fileName:Building430_1152003
fileNameWithExtension:Shop4620_01032020.xlsm, fileName:Shop4620_01032020
fileNameWithExtension:Drive5405_12042021.xls, fileName:Drive5405_12042021
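
The function above assumes every input contains a period. If one does not, IndexOf returns -1 and Substring(0, -1) throws an ArgumentOutOfRangeException. A minimal sketch of one possible guard, which falls back to returning the input unchanged; the helper name GetSubstringBeforeOrWhole, the sample inputs, and the fallback behavior are assumptions for illustration:

```csharp
using System;

Console.WriteLine(GetSubstringBeforeOrWhole("Building430_1152003.csv", '.')); // Building430_1152003
Console.WriteLine(GetSubstringBeforeOrWhole("README", '.'));                  // README

// Hypothetical helper: text before the first indexChar, or the whole input when indexChar is absent
string GetSubstringBeforeOrWhole(string input, char indexChar)
{
    int endIndex = input.IndexOf(indexChar); // -1 when indexChar is not found
    if (endIndex < 0)
    {
        return input; // assumed fallback; throwing or returning null are other options
    }
    return input.Substring(0, endIndex); // length equals the index of indexChar
}
```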
New Problem Statement

Now suppose we have a new set of data that includes file names with multiple periods in them. How would the current code handle this use case? Let's see below.

Code Example
using System;
using System.Collections.Generic;
List<string> fileNameWithExtensions = new List<string>() { "Street430.10302001.xlsx", "Building430.1152003.csv", "Shop4620.01032020.xlsm", "Drive5405.12042021.xls" };
foreach (string fileNameWithExtension in fileNameWithExtensions)
{
    string fileName = GetSubstringBeforeASpecificChar(fileNameWithExtension);//Call function to get the substring before the character
    Console.WriteLine("fileNameWithExtension:" + fileNameWithExtension + ", fileName:" + fileName);//Print original and function output
}
string GetSubstringBeforeASpecificChar(string fileNameWithExtension)
{
    const int startIndex = 0;//Start index for the substring
    const char indexChar = '.';//Character that we want to find the index of
    int substringEndIndex = fileNameWithExtension.IndexOf(indexChar);//Get index of the first period
    string fileName = fileNameWithExtension.Substring(startIndex, substringEndIndex);//Start index is 0 and the length is the index of the first period
    return fileName;
}
Code Output
fileNameWithExtension:Street430.10302001.xlsx, fileName:Street430
fileNameWithExtension:Building430.1152003.csv, fileName:Building430
fileNameWithExtension:Shop4620.01032020.xlsm, fileName:Shop4620
fileNameWithExtension:Drive5405.12042021.xls, fileName:Drive5405
What Went Wrong When There Are Multiple Periods?

As we can see, the file name is cut off at the first period when there are multiple periods. To correct this we need to use LastIndexOf instead of IndexOf because we want the last period, the one next to the extension. See the corrected code below.

Corrected Code Example
using System;
using System.Collections.Generic;
List<string> fileNameWithExtensions = new List<string>() { "Street430.10302001.xlsx", "Building430.1152003.csv", "Shop4620.01032020.xlsm", "Drive5405.12042021.xls" };
foreach (string fileNameWithExtension in fileNameWithExtensions)
{
    string fileName = GetSubstringBeforeASpecificChar(fileNameWithExtension);//Call function to get the substring before the last occurrence of the character
    Console.WriteLine("fileNameWithExtension:" + fileNameWithExtension + ", fileName:" + fileName);//Print original and function output
}
string GetSubstringBeforeASpecificChar(string fileNameWithExtension)
{
    const int startIndex = 0;//Start index for the substring
    const char indexChar = '.';//Character that we want to find the index of
    int substringEndIndex = fileNameWithExtension.LastIndexOf(indexChar);//Get index of the last period
    string fileName = fileNameWithExtension.Substring(startIndex, substringEndIndex);//Start index is 0 and the length is the index of the last period
    return fileName;
}
Code Output
fileNameWithExtension:Street430.10302001.xlsx, fileName:Street430.10302001
fileNameWithExtension:Building430.1152003.csv, fileName:Building430.1152003
fileNameWithExtension:Shop4620.01032020.xlsm, fileName:Shop4620.01032020
fileNameWithExtension:Drive5405.12042021.xls, fileName:Drive5405.12042021
Multiple Periods With Correct Approach Recap

The corrected code now returns the full file name without the extension.

Alternate Approaches

There are other ways to get a sub-section of a string before or after a certain character. Two of them are splitting the string on the period and replacing certain characters or strings. Care needs to be taken so that what you get is exactly what you want.

String Split Code Example
using System;
string fileNameWithExtension = "Street430_10302001.csv";//Original filename with extension before manipulation
const char indexChar = '.';//Character that we want to split the string on
const int fileNameIndex = 0;//After the string split the index with the file name is 0
string[] splitUpFileName = fileNameWithExtension.Split(indexChar);//Split the string on the period into an array of parts
string fileName = splitUpFileName[fileNameIndex];//Index is 0 to get only the file name, index = 1 would be the extension
Console.WriteLine("fileNameWithExtension:" + fileNameWithExtension);//Print to screen starting string
Console.WriteLine("fileName:" + fileName);//Print to screen ending string
Code Output
fileNameWithExtension:Street430_10302001.csv
fileName:Street430_10302001
String Replace Code Example
using System;
string fileNameWithExtension = "Street430_10302001.csv";//Original filename with extension before manipulation
const string fileExtension = ".csv";//Extension text to remove
string fileName = fileNameWithExtension.Replace(fileExtension, "");//Replace every occurrence of the extension with an empty string
Console.WriteLine("fileNameWithExtension:" + fileNameWithExtension);//Print to screen starting string
Console.WriteLine("fileName:" + fileName);//Print to screen ending string
Code Output
fileNameWithExtension:Street430_10302001.csv
fileName:Street430_10302001

Recap Alternate Approaches

As the examples above show, string split and string replace can be used in simple examples or even in complex ones. Just consider the use case and what may work best. On these simple examples they actually needed less code than using IndexOf and then Substring. When coding, it's best to keep things as simple as possible to reduce bugs in the code and increase productivity.
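
It is worth checking how these shortcuts behave on the multiple-period file names seen earlier before choosing one. The sketch below runs Split and Replace on a file name with two periods and, as a further option not covered above, calls the .NET library method Path.GetFileNameWithoutExtension from System.IO, which removes only the final extension. The sample file name and variable names are illustrative:

```csharp
using System;
using System.IO;

string fileNameWithExtension = "Street430.10302001.csv";

string fromSplit = fileNameWithExtension.Split('.')[0];         // only the text before the FIRST period
string fromReplace = fileNameWithExtension.Replace(".csv", ""); // removes every ".csv" and requires knowing the extension in advance
string fromPath = Path.GetFileNameWithoutExtension(fileNameWithExtension); // drops the text from the LAST period onward

Console.WriteLine("fromSplit:" + fromSplit);     // fromSplit:Street430
Console.WriteLine("fromReplace:" + fromReplace); // fromReplace:Street430.10302001
Console.WriteLine("fromPath:" + fromPath);       // fromPath:Street430.10302001
```

Split loses part of the name here, Replace only works when the extension is known up front, and the Path method also strips any directory portion of its input, so each option suits a slightly different use case.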
