
String::Split - unexpected behaviour?

After a comment left by Justin Rogers on a recent post of mine, I decided to do a bit more testing to confirm the behaviour of String::Split. The documentation has this to say about it:

Identifies the substrings in this instance that are delimited by one or more characters specified in an array

 

I highlighted the words "or more" in that sentence because I believed that the String::Split method would treat all of the supplied characters together as a single delimiter, but clearly that isn't the case (see the example below), which means that the words "or more" in the help documentation are probably misleading or just plain incorrect.

Dim arr1, arr2 As String()

' a string with "ZY" delimiters
Dim text As String = "aaaaZY ddd ZY ZYbbbb"
arr1 = Split(text, "ZY")
arr2 = text.Split(New Char() {"Z"c, "Y"c})

' Displays 4 , 7
MsgBox(arr1.Length & " , " & arr2.Length)

' a string with "Z" and "Y" characters in it
text = "aaaaZ ddd Y Ybbbb"
arr1 = Split(text, "ZY")
arr2 = text.Split(New Char() {"Z"c, "Y"c})

' Displays 1 , 4
MsgBox(arr1.Length & " , " & arr2.Length)
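
To see where the 7 in the first result comes from, it may help to print each element rather than just the count. The following is only a sketch: the module name SplitCountDemo and the helper SplitOnEitherChar are made up for illustration, and it assumes a console project rather than the MsgBox calls used above.

```vbnet
Imports System

Module SplitCountDemo
    ' Hypothetical helper: passes "Z" and "Y" as two separate one-character
    ' delimiters, the same call as arr2 in the example above.
    Function SplitOnEitherChar(ByVal text As String) As String()
        Return text.Split(New Char() {"Z"c, "Y"c})
    End Function

    Sub Main()
        Dim parts As String() = SplitOnEitherChar("aaaaZY ddd ZY ZYbbbb")

        ' Quote each element so that empty strings are visible.
        For i As Integer = 0 To parts.Length - 1
            Console.WriteLine("{0}: '{1}'", i, parts(i))
        Next
    End Sub
End Module
```

Each "ZY" pair is seen as two adjacent delimiters with an empty string between them, so the three pairs contribute three empty elements on top of the four non-empty ones.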

6 Comments

  • The docs seem spot on to me. Splitting using the character array "Z", "Y" indicates that there are TWO delimiters: "Z" and "Y". After all, a character is a single character.



    Hence, if it splits on one or more CHARACTERS, it would imply that it uses one or more DELIMITERS. At least that's how I read it, since when I see character I think a single letter, so if I am told, it delimits on CHARACTERS and there are multiple CHARACTERS, to me, this implies that each character is a unique delimiter, which appears to be the behavior.

  • Yeah, I agree with Scott here. The docs are spot on; however, there is the possibility of semantic misinterpretation. Someone coming from VB is more likely to fall prey, since there are similarly named functions that you would think might have similar functionality. There aren't too many cases of a string split using multiple delimiters in the various programming languages (e.g., JScript's split takes only a single delimiter, as does VB's).

  • That's fine guys... I'll concede gracefully :-)

  • Maybe the documents are correct about the behavior if character arrays are passed, but look at this:



    Dim Str As String = "1, 2, 3".Split(", ")(1)

    MessageBox.Show("'" & Str & "'") ' expected: '2' or '' got: ' 2'



    String.Split happily accepts a string as an argument instead of a char() and then ignores all but the first character in the string.



    It would make sense to raise an error when a string is sent to a method that expects a character array, especially since a string is a valid argument to the Split() function.

  • The simple method for this would be to replace your delimiter string with a single character that doesn't otherwise occur in the text, and then split on that character.



    For example: myString.Replace("test","|").Split('|');



  • You can do this, and it seems to work quite well:

    Dim text As String = "aaaaZY ddd ZY ZYbbbb"
    Dim sSplit As String() = {"ZY"}
    Dim sFinal() As String = text.Split(sSplit, StringSplitOptions.RemoveEmptyEntries)
    Dim i As Integer
    For i = 0 To sFinal.Length - 1
        Debug.Print(sFinal(i))
    Next i

    Happy coding :)
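
The string-versus-character-array confusion raised in the comments above can be sidestepped by spelling out the intent at the call site. The sketch below is illustrative only: the module and helper names (SeparatorIntentDemo, SplitOnAnyChar, SplitOnWholeString) are hypothetical, and the String() overload it uses assumes .NET 2.0 or later, as in the last comment.

```vbnet
Imports System

Module SeparatorIntentDemo
    ' Hypothetical helper: every character in separators is its own delimiter.
    Function SplitOnAnyChar(ByVal text As String, ByVal separators As String) As String()
        Return text.Split(separators.ToCharArray())
    End Function

    ' Hypothetical helper: separator is matched as one whole string.
    Function SplitOnWholeString(ByVal text As String, ByVal separator As String) As String()
        Return text.Split(New String() {separator}, StringSplitOptions.None)
    End Function

    Sub Main()
        ' "," and " " each split on their own, leaving an empty element between them.
        Console.WriteLine("'" & SplitOnAnyChar("1, 2, 3", ", ")(1) & "'")      ' prints ''

        ' ", " is treated as a single two-character delimiter.
        Console.WriteLine("'" & SplitOnWholeString("1, 2, 3", ", ")(1) & "'")  ' prints '2'
    End Sub
End Module
```

Either call gives one of the two results the commenter expected, because nothing is left to an implicit conversion of the string argument.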
