validation - C# Sanitize File Name

Tags : c#, validation, path, sanitize, invalid-characters

Top 5 Answers for validation - C# Sanitize File Name

92

To clean up a file name, you could do this:

    private static string MakeValidFileName( string name )
    {
        string invalidChars = System.Text.RegularExpressions.Regex.Escape( new string( System.IO.Path.GetInvalidFileNameChars() ) );
        string invalidRegStr = string.Format( @"([{0}]*\.+$)|([{0}]+)", invalidChars );

        return System.Text.RegularExpressions.Regex.Replace( name, invalidRegStr, "_" );
    }

80

A shorter solution:

    var invalids = System.IO.Path.GetInvalidFileNameChars();
    var newName = String.Join("_", origFileName.Split(invalids, StringSplitOptions.RemoveEmptyEntries)).TrimEnd('.');
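One side effect of `RemoveEmptyEntries` is worth seeing in action: runs of invalid characters collapse into a single underscore, and invalid characters at the start or end disappear entirely. Here is a minimal sketch that wraps the two lines above in a method; the class and method names are made up for illustration and are not part of the answer.

```csharp
using System;
using System.IO;

public static class SplitJoinDemo
{
    // Hypothetical wrapper around the split/join one-liner (names are illustrative).
    public static string Sanitize(string origFileName)
    {
        char[] invalids = Path.GetInvalidFileNameChars();

        // Empty pieces (from adjacent, leading, or trailing invalid chars) are dropped
        // before joining, so "//" becomes one "_" and a leading "/" vanishes.
        return string.Join("_",
                origFileName.Split(invalids, StringSplitOptions.RemoveEmptyEntries))
            .TrimEnd('.');
    }
}
```

For instance, `SplitJoinDemo.Sanitize("/a//b/c.txt")` gives `a_b_c.txt`, and `SplitJoinDemo.Sanitize("report...")` gives `report`. If you need a one-to-one replacement so the name keeps its length, this approach is not the right fit.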

80

Based on Andre's excellent answer but taking into account Spud's comment on reserved words, I made this version:

    // Requires: using System.IO; using System.Text.RegularExpressions;

    /// <summary>
    /// Strip illegal chars and reserved words from a candidate filename (should not include the directory path)
    /// </summary>
    /// <remarks>
    /// http://stackoverflow.com/questions/309485/c-sharp-sanitize-file-name
    /// </remarks>
    public static string CoerceValidFileName(string filename)
    {
        var invalidChars = Regex.Escape(new string(Path.GetInvalidFileNameChars()));
        var invalidReStr = string.Format(@"[{0}]+", invalidChars);

        var reservedWords = new []
        {
            "CON", "PRN", "AUX", "CLOCK$", "NUL", "COM0", "COM1", "COM2", "COM3", "COM4",
            "COM5", "COM6", "COM7", "COM8", "COM9", "LPT0", "LPT1", "LPT2", "LPT3", "LPT4",
            "LPT5", "LPT6", "LPT7", "LPT8", "LPT9"
        };

        // Replace each run of invalid characters with a single underscore.
        var sanitisedNamePart = Regex.Replace(filename, invalidReStr, "_");
        foreach (var reservedWord in reservedWords)
        {
            // Escape the word so the '$' in "CLOCK$" is matched literally, not as an anchor.
            var reservedWordPattern = string.Format("^{0}\\.", Regex.Escape(reservedWord));
            sanitisedNamePart = Regex.Replace(sanitisedNamePart, reservedWordPattern, "_reservedWord_.", RegexOptions.IgnoreCase);
        }

        return sanitisedNamePart;
    }

And these are my unit tests:

    [Test]
    public void CoerceValidFileName_SimpleValid()
    {
        var filename = @"thisIsValid.txt";
        var result = PathHelper.CoerceValidFileName(filename);
        Assert.AreEqual(filename, result);
    }

    [Test]
    public void CoerceValidFileName_SimpleInvalid()
    {
        var filename = @"thisIsNotValid\3\\_3.txt";
        var result = PathHelper.CoerceValidFileName(filename);
        Assert.AreEqual("thisIsNotValid_3__3.txt", result);
    }

    [Test]
    public void CoerceValidFileName_InvalidExtension()
    {
        var filename = @"thisIsNotValid.t\xt";
        var result = PathHelper.CoerceValidFileName(filename);
        Assert.AreEqual("thisIsNotValid.t_xt", result);
    }

    [Test]
    public void CoerceValidFileName_KeywordInvalid()
    {
        var filename = "aUx.txt";
        var result = PathHelper.CoerceValidFileName(filename);
        Assert.AreEqual("_reservedWord_.txt", result);
    }

    [Test]
    public void CoerceValidFileName_KeywordValid()
    {
        var filename = "auxillary.txt";
        var result = PathHelper.CoerceValidFileName(filename);
        Assert.AreEqual("auxillary.txt", result);
    }
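Note that the pattern in this version only fires when the reserved word is followed by a dot, so a bare `CON` or `aux` with no extension passes through unchanged. As far as I know, Windows treats those names as reserved with or without an extension. Below is a rough sketch of a separate check that looks at everything before the first dot. The class and method names are hypothetical, and the word list is simply copied from the answer above rather than taken from any authoritative source.

```csharp
using System;
using System.Collections.Generic;

public static class ReservedNames
{
    // Same word list as the answer above, compared case-insensitively.
    private static readonly HashSet<string> Reserved =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase)
        {
            "CON", "PRN", "AUX", "CLOCK$", "NUL",
            "COM0", "COM1", "COM2", "COM3", "COM4", "COM5", "COM6", "COM7", "COM8", "COM9",
            "LPT0", "LPT1", "LPT2", "LPT3", "LPT4", "LPT5", "LPT6", "LPT7", "LPT8", "LPT9"
        };

    // Hypothetical helper: true when the part before the first '.' is a reserved word.
    public static bool IsReserved(string fileName)
    {
        int dot = fileName.IndexOf('.');
        string stem = dot < 0 ? fileName : fileName.Substring(0, dot);

        // Trailing spaces are trimmed before the lookup (an assumption about how
        // strict you want the check to be).
        return Reserved.Contains(stem.TrimEnd(' '));
    }
}
```

With this, `ReservedNames.IsReserved("aUx.txt")`, `ReservedNames.IsReserved("CON")` and `ReservedNames.IsReserved("com1.tar.gz")` all return `true`, while `ReservedNames.IsReserved("auxillary.txt")` returns `false`. What to do with a reserved name (prefix it, reject it) is left to the caller.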

63

string clean = String.Concat(dirty.Split(Path.GetInvalidFileNameChars())); 
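Keep in mind that this one-liner deletes the invalid characters instead of replacing them, so two different inputs can end up with the same name. A small sketch of that, with the one-liner wrapped in a method whose name is made up for illustration:

```csharp
using System;
using System.IO;

public static class RemoveDemo
{
    // Hypothetical wrapper: splits on every invalid character and glues the
    // remaining pieces back together with nothing in between.
    public static string Strip(string dirty)
    {
        return string.Concat(dirty.Split(Path.GetInvalidFileNameChars()));
    }
}
```

Both `RemoveDemo.Strip("a/b.txt")` and `RemoveDemo.Strip("ab.txt")` return `ab.txt`. If collisions like that matter for your files, one of the replacing variants is probably the safer choice.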

56

There are a lot of working solutions here. Just for the sake of completeness, here's an approach that doesn't use regex, but uses LINQ:

    // Requires: using System.IO; using System.Linq;
    var invalids = Path.GetInvalidFileNameChars();
    filename = invalids.Aggregate(filename, (current, c) => current.Replace(c, '_'));

Also, it's a very short solution ;)
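One caveat that applies to every answer here: the array returned by `Path.GetInvalidFileNameChars()` depends on the platform (on Unix-like systems .NET reports far fewer characters than on Windows), and the documentation says it is not guaranteed to be complete. If you need the same result on every machine, you could run the same `Aggregate` idea over a fixed list. The sketch below does that; the character list is my own assumption (the usual Windows punctuation plus control characters), and the class name is hypothetical.

```csharp
using System;
using System.Linq;

public static class FixedSetSanitizer
{
    // Assumption: a hand-picked, Windows-style list, not taken from the framework.
    private static readonly char[] Invalids =
        { '<', '>', ':', '"', '/', '\\', '|', '?', '*' };

    public static string Sanitize(string filename)
    {
        // Same Aggregate technique as above: replace each listed char one-to-one.
        string replaced = Invalids.Aggregate(filename, (current, c) => current.Replace(c, '_'));

        // Also replace control characters, which the fixed list does not spell out.
        return new string(replaced.Select(ch => char.IsControl(ch) ? '_' : ch).ToArray());
    }
}
```

`FixedSetSanitizer.Sanitize("a:b?c.txt")` returns `a_b_c.txt` regardless of the operating system, and `FixedSetSanitizer.Sanitize("a//b")` returns `a__b`, since each character is replaced individually.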
