VBspeed / VB6 to VB5 / Split
VBspeed © 2000-10, updated: 01-Jun-2002
Split


The Definition
Three versions of Split. It's tricky to emulate VB6's native Split in VB5 since VB5 functions cannot return arrays. The nearest you can get is SplitA which returns a Variant that *holds* an array. This works well, but you won't be happy with a Variant array when you're out for speed.
It's much easier and much faster to redesign Split as to return its results in an array argument. Therefore I propose a 2nd definition below, SplitB, where I also dropped the Count and Compare arguments of the original Split because I can't see much use for them in real-world programming.
A third variant, SplitC, is IMHO the best setup for a real world split job. It returns the number of tokens resulting from the split, which is often what you want to know before you do things with the array.

Function SplitA
Returns a zero-based, one-dimensional array containing a specified number of substrings.
Native to VB6, but not to VB5.
Declaration:
Public Function SplitA( _
    Expression As String, _
    Optional Delimiter As String = " ", _
    Optional Count As Long = -1, _
    Optional Compare As VbCompareMethod = vbBinaryCompare) As Variant
Arguments:
ExpressionRequired. String expression containing substrings and delimiters. If expression is a zero-length string, Split returns an empty array, that is, an array with no elements and no data.
DelimiterOptional. String character used to identify substring limits. If omitted, the space character (" ") is assumed to be the delimiter. If delimiter is a zero-length string, a single-element array containing the entire expression string is returned.
CountOptional. Number of substrings to be returned; -1 indicates that all substrings are returned. If Count is 0, an unbound array (UBound = -1) is returned.
CompareOptional. Numeric value indicating the kind of comparison to use when evaluating substrings.

Sub SplitB
Returns a zero-based, one-dimensional array containing a specified number of substrings.
This array is returned in an argument.
Declaration:
Public Sub SplitB( _
    Expression As String, _
    asToken() As String, _
    Optional Delimiter As String = " ")
Arguments:
ExpressionRequired. String expression containing substrings and delimiters. If expression is a zero-length string, SplitB returns a single-element array containing a zero-length string.
asToken()Required. One-dimensional string array that will hold the returned substrings. Does not have to be bound before calling SplitB, and is guaranteed to hold at least one element (zero-based) on return.
Delimiter[same as in Split]

Function SplitC
Returns a zero-based, one-dimensional array containing a specified number of substrings.
This array is returned in an argument. The function itself returns the token count (ie the number of elements in the returned array).
Declaration:
Public Function SplitC01( _
    Expression As String, _
    asToken() As String, _
    Optional Delimiter As String = " ") _
    As Long
Arguments:
[same as in SplitB]

IsGoodSplit?
If you want to have a go at Split yourself, use this function (VB5/6-compatible) to verify the correctness of your emulation code (SplitA, SplitB, and SplitC).


The Charts
Calls
SplitA Dim avSplit as Variant
...
avSplit = SplitA(Expression, ",")


Surprisingly, i found that returning into a String array (possible under VB6 only) is slower than returning into a Variant array.
SplitB, SplitC Dim asToken() as String
...
Call SplitB(Expression, asToken(), ",")
Call 1 Expression = [comma-separated 12,345 elements list, 1 char per element]
Call 2 Expression = [comma-separated 1,234 elements list, 1 char per element]
Call 3 Expression = [comma-separated 123 elements list, 10 chars per element]
Call 4 Expression = [comma-separated 12 elements list, 100 chars per element]
 VB5
CodeAuthorDopingNotes
SplitVB6  
SplitA01 Donald  
SplitA02 GuidoAPI 
SplitA03 EgbertTLB 
SplitB01 Donald  
SplitB02 Keith  
SplitB03 Guido  
SplitB04 Chris  
SplitB05 DonaldTLB 
SplitC01 DonaldTLB 
Call 1
   
77.86233,200µs
X4.43131,476µs
X4.04119,895µs
63.43101,781µs
53.3499,002µs
41.2737,582µs
31.1634,433µs
21.0029,728µs
11.0029,674µs
Call 2
   
74.4313,250µs
X2.878,601µs
X2.467,375µs
51.344,023µs
61.354,034µs
41.243,708µs
31.163,480µs
21.013,020µs
11.002,993µs
Call 3
   
73.221,144µs
X2.24797µs
X1.77627µs
61.22435µs
51.21432µs
31.13401µs
41.13401µs
21.00356µs
11.00355µs
Call 4
   
73.59273µs
X2.71206µs
X1.2091µs
62.33177µs
52.32176µs
11.0076µs
41.43109µs
31.35102µs
21.34102µs
 VB6
CodeAuthorDopingNotes
Split VB6  
SplitA01 Donald  
SplitA02 GuidoAPI 
SplitA03 EgbertTLB 
SplitB01 Donald  
SplitB02 Keith  
SplitB03 Guido  
SplitB04 Chris  
SplitB05 DonaldTLB 
SplitC01 DonaldTLB 
Call 1
73.72131,126µs
88.10285,429µs
X4.28150,951µs
X4.32152,399µs
63.30116,326µs
53.28115,738µs
41.1640,917µs
31.0838,195µs
11.0035,241µs
21.0035,245µs
Call 2
71.715,656µs
84.0513,411µs
X2.217,333µs
X2.277,520µs
51.234,074µs
61.244,089µs
41.133,756µs
31.053,470µs
11.003,311µs
21.013,342µs
Call 3
71.20478µs
82.791,110µs
X1.78709µs
X1.68668µs
51.10437µs
61.10437µs
21.01401µs
41.06423µs
11.00398µs
31.04414µs
Call 4
11.0066µs
84.00263µs
X2.95194µs
X1.4796µs
62.39157µs
72.41158µs
21.1676µs
51.67110µs
31.62106µs
41.64108µs
Conclusions
VB6 native Split is better than the emulations that return a (Variant) array. But, of course, all of the SplitB/C versions are faster. With the exception of call 4.
Mail your code! How to read all those numbers


The Code
SplitA01
submitted 16-Sep-2000 by Donald Lessau  
Doping: none
Public Function SplitA01(Expression As String, _
  Optional Delimiter As String = " ", _
  Optional Count As Long = -1, _
  Optional Compare As VbCompareMethod = vbBinaryCompare) As Variant
' by Donald, donald@xbeat.net, 20000916
  Const BUFFERDIM As Long = 1024
  Dim cntSplit As Long
  Dim posStart As Long
  Dim posFound As Long
  Dim lenDelimiter As Long
  Dim sArray() As String
  Dim ubArray As Long
  
  If Count = 0 Then
    ' return unbound Variant array
    SplitA01 = Array()
    Exit Function
  End If
  
  lenDelimiter = Len(Delimiter)
  If lenDelimiter = 0 Then
    ' return expression in single-element Variant array
    SplitA01 = Array(Expression)
  Else
    posStart = 1
    ubArray = -1
    Do
      If cntSplit > ubArray Then
        ubArray = ubArray + BUFFERDIM
        ReDim Preserve sArray(ubArray)
      End If
      posFound = InStr(posStart, Expression, Delimiter, Compare)
      If cntSplit + 1 = Count Then
        sArray(cntSplit) = Mid$(Expression, posStart)
        Exit Do
      Else
        If posFound Then
          sArray(cntSplit) = Mid$(Expression, posStart, posFound - posStart)
          posStart = posFound + lenDelimiter
          cntSplit = cntSplit + 1
        Else
          sArray(cntSplit) = Mid$(Expression, posStart)
        End If
      End If
    Loop While posFound
    ' shrink to actual size
    ReDim Preserve sArray(cntSplit)
    ' return string array as Variant array
    SplitA01 = sArray
  End If
  
End Function
Author's comments:
Donald's comments:

top | charts


SplitA02
submitted 22-Sep-2000 by Guido Beckmann  
Doping: API
  [07-dec-2001] Note that the function does not work 100% correct: SplitA02("a,b,c", ",", 0) returns a zero-based one-element array as(0)="". The correct return would be an unbound array (UBound = -1).
Private Declare Sub CopyMemory Lib "kernel32" Alias "RtlMoveMemory" _
                        (Dst As Any, Src As Any, ByVal iLen&)
Private Declare Sub ZeroMemory Lib "kernel32" Alias "RtlZeroMemory" _
                        (dst As Any, ByVal iLen&)


Public Function SplitA02( _ Expression As String, _ Optional Delimiter As String = " ", _ Optional Count As Long = -1, _ Optional Compare As VbCompareMethod = vbBinaryCompare) ' by G.Beckmann, G.Beckmann@NikoCity.de Const ARR_CHUNK& = 1024 Dim pArr&, pResult&, asResult$() Dim iLen&, cHits&, iLast&, iCur& If Count <> 0 Then iLen = Len(Delimiter) ReDim asResult(ARR_CHUNK) If iLen <> 0 Then iLast = 1 iCur = InStr(iLast, Expression, Delimiter, Compare) Do While iCur If cHits + 1 = Count Then Exit Do asResult(cHits) = Mid$(Expression, iLast, iCur - iLast) iLast = iCur + iLen: cHits = cHits + 1 iCur = InStr(iLast, Expression, Delimiter, Compare) If cHits > UBound(asResult) Then ReDim Preserve asResult(cHits + ARR_CHUNK - 1) End If Loop asResult(cHits) = Mid$(Expression, iLast) Else asResult(0) = Expression End If End If ReDim Preserve asResult(cHits) ' shrink to actual size '/ delivery an array without duplication pResult = VarPtr(SplitA02) CopyMemory ByVal pResult, &H2008, 2 ' initialize (Variant/String()) pArr = StrArrPtr(asResult) ' get array-pointer CopyMemory ByVal pResult + 8, ByVal pArr, 4 ' copy safearray-pointer ZeroMemory ByVal pArr, 4 ' delete safearray-pointer End Function
Private Function StrArrPtr&(v) CopyMemory StrArrPtr, ByVal VarPtr(v) + 8, 4 End Function
Author's comments:
Donald's comments:

top | charts


SplitA03
submitted 11-Dec-2000 by Egbert Nierop  
Doping: needs reference to typelib SplitA03.tlb (by Egbert Nierop) - Download TLB_Split03.zip (3KB zipped, VB5-compatible).
  [07-dec-2001] Note that the function does not work 100% correct: SplitA03("a,b,c", ",", 0) returns a zero-based one-element array as(0)="a,b,c" which is of course rubbish. The correct return would be an unbound array (UBound = -1). Also when Compare = vbTextCompare SplitA03 doesn't do it right.
Public Function SplitA03(expr As String, _
    Optional Delimiter As Variant = " ", _
    Optional ByVal limit As Long = -1, _
    Optional ByVal vbCompare As VbCompareMethod = vbBinaryCompare) As Variant
' by Egbert Nierop, egbert_nierop@goovy.hotmail.com remove the goovy agains spam..., 20001211
    
    Dim begPtr As Long
    Dim bytePtr As Long
    Dim spCount As Long
    Dim lenExpr As Long
    Dim lenDelim As Long
    Dim exprPtr As Long
    Dim psa As Long
    Dim psadata As Long
    
    Dim ptrArray() As Long
    
    lenExpr = SysStringLen(expr)
    lenDelim = SysStringLen(Delimiter)
    
    If lenExpr = 0 Then
        SplitA03 = Array()
        Exit Function
    ElseIf lenDelim = 0 Then
        SplitA03 = Array(expr)
        Exit Function
    End If

    'seems sufficient long for me :)
    If limit = -1 Then limit = 2147483647

    exprPtr = StrPtr(expr)

    'count the number of Delimiters.
    bytePtr = 1
    
    For spCount = 1 To limit - 1
        bytePtr = InStr(bytePtr, expr, Delimiter, vbCompare)
        If bytePtr = 0 Then Exit For
        bytePtr = bytePtr + lenDelim
    Next
    
    'convert byte len since a BSTR is unicoded
    lenExpr = lenExpr * 2
    lenDelim = lenDelim * 2
    
    bytePtr = 1
    spCount = spCount - 1
    ReDim ptrArray(spCount)
    
    'loop through the tokens
    For psadata = 0 To spCount - 1

        begPtr = InStrB(bytePtr, expr, Delimiter, vbCompare)
        ptrArray(psadata) = SysAllocStringLenPtr(ByVal exprPtr + bytePtr - 1, (begPtr - bytePtr) \ 2)
        bytePtr = begPtr + lenDelim

    Next
    ' fetch the last element
    ptrArray(spCount) = SysAllocStringLenPtr(ByVal exprPtr + bytePtr - 1, (lenExpr - bytePtr + 1) \ 2)
   
    ' get array handle
    spCount = spCount + 1
    psa = SafeArrayCreateVector(vbString, 0, spCount)
    'points to the same as VarPtr(StrArray(0)) for instance. Also lock the array
    psadata = SafeArrayAccessData(psa)
    'move all the BSTR pointers from the array to the BSTR() array
    kernel.MoveMemory ByVal psadata, ptrArray(0), spCount * 4
    'unlock the array
    
    SafeArrayUnaccessData psa

    ' set the vtype for the variant
    kernel.MoveMemory SplitA03, vbArray Or vbString, 2
    kernel.MoveMemory ByVal VarPtr(SplitA03) + 8, psa, Len(psa)

End Function
Author's comments:
Donald's comments:

top | charts


SplitB01
submitted 16-Sep-2000 by Donald Lessau  
Doping: none
Public Sub SplitB01(Expression As String, _
    sArrRet() As String, _
    Optional Delimiter As String = " ")
' by Donald, donald@xbeat.net, 20000916
  Const BUFFERDIM As Long = 1024
  Dim cntSplit As Long
  Dim posStart As Long
  Dim posFound As Long
  Dim lenDelimiter As Long
  Dim ubArray As Long
  
  lenDelimiter = Len(Delimiter)
  If lenDelimiter = 0 Then
    ' return expression in single-element array
    ReDim Preserve sArrRet(0)
    sArrRet(0) = Expression
  Else
    posStart = 1
    ubArray = -1
    Do
      If cntSplit > ubArray Then
        ubArray = ubArray + BUFFERDIM
        ReDim Preserve sArrRet(ubArray)
      End If
      posFound = InStr(posStart, Expression, Delimiter)
      If posFound Then
        sArrRet(cntSplit) = Mid$(Expression, posStart, posFound - posStart)
        posStart = posFound + lenDelimiter
        cntSplit = cntSplit + 1
      Else
        sArrRet(cntSplit) = Mid$(Expression, posStart)
      End If
    Loop While posFound
    ' shrink to actual size
    ReDim Preserve sArrRet(cntSplit)
  End If
  
End Sub
Author's comments:
Donald's comments:

top | charts


SplitB02
submitted 22-Sep-2000 by Keith  
Doping: none
Public Sub SplitB02( _
                     sExpression As String, _
                     sSplitArray() As String, _
            Optional sDelimiter As String = " ")
                    
   ' by Donald, donald@xbeat.net, 20000916
   ' modified by Keith, kmatzen@ispchannel.com, 20000923
   Const BUFFERDIM As Long = 1024
   
   Dim lCntSplits    As Long
   Dim lCntStart     As Long
   Dim lUBound       As Long
   Dim lPosStart     As Long
   Dim lPosFound     As Long
   Dim lLenDelimiter As Long
   Dim lStrLen       As Long
   
   lLenDelimiter = Len(sDelimiter)
   lPosStart = 1
   lPosFound = InStr(lPosStart, sExpression, sDelimiter)
   
   If lLenDelimiter = 0 Or lPosFound = 0 Then
   
      ' No delimiters - return sExpression in single-element array
      ReDim Preserve sSplitArray(0)
      sSplitArray(0) = sExpression
     
   Else
   
      lUBound = -1
      
      Do
         lCntStart = lUBound + 1
         lUBound = lUBound + BUFFERDIM
         ReDim Preserve sSplitArray(lUBound)
         For lCntSplits = lCntStart To lUBound
            If lPosFound Then    'Delimiter found
               lStrLen = lPosFound - lPosStart
               sSplitArray(lCntSplits) = Mid$(sExpression, lPosStart, lStrLen)
               lPosStart = lPosFound + lLenDelimiter
               lPosFound = InStr(lPosStart, sExpression, sDelimiter)
            Else                 'No more delimiters
               sSplitArray(lCntSplits) = Mid$(sExpression, lPosStart)
               ReDim Preserve sSplitArray(lCntSplits)
               Exit Sub
            End If
         Next lCntSplits
      Loop
      
   End If
  
End Sub
Author's comments:
Donald's comments: Clearly a better design than SplitB01. Interestingly, the performance gain is much more distinctive in IDE/P-Code (ca. 16%) than in compiled native code (only 1-2%).

top | charts


SplitB03
submitted 24-Nov-2000 by Guido Beckmann  
Doping: none
Public Sub SplitB03(Expression$, ResultSplit$(), Optional Delimiter$ = " ")
' by G.Beckmann, G.Beckmann@NikoCity.de
 
    Dim c&, iLen&, iLast&, iCur&
    
    iLen = Len(Delimiter)
    
    If iLen Then
        
        '/ count delimiters
        iCur = InStr(Expression, Delimiter)
        Do While iCur
            iCur = InStr(iCur + iLen, Expression, Delimiter)
            c = c + 1
        Loop
        
        '/ initalization
        ReDim Preserve ResultSplit(0 To c)
        c = 0: iLast = 1
        
        '/ search again...
        iCur = InStr(Expression, Delimiter)
        Do While iCur
            ResultSplit(c) = Mid$(Expression, iLast, iCur - iLast)
            iLast = iCur + iLen
            iCur = InStr(iLast, Expression, Delimiter)
            c = c + 1
        Loop
        ResultSplit(c) = Mid$(Expression, iLast)
        
    Else
        ReDim Preserve ResultSplit(0 To 0)
        ResultSplit(0) = Expression
    End If
 
End Sub
Author's comments:
Donald's comments:

top | charts


SplitB04
submitted 08-Dec-2001 by Chris Lucas  
Doping: none
Public Sub SplitB04(Expression$, ResultSplit$(), Optional Delimiter$ = " ")
' By Chris Lucas, cdl1051@earthlink.net, 20011208
    Dim c&, SLen&, DelLen&, tmp&, Results&()

    SLen = LenB(Expression) \ 2
    DelLen = LenB(Delimiter) \ 2

    ' Bail if we were passed an empty delimiter or an empty expression
    If SLen = 0 Or DelLen = 0 Then
        ReDim Preserve ResultSplit(0 To 0)
        ResultSplit(0) = Expression
        Exit Sub
    End If

    ' Count delimiters and remember their positions
    ReDim Preserve Results(0 To SLen)
    tmp = InStr(Expression, Delimiter)

    Do While tmp
        Results(c) = tmp
        c = c + 1
        tmp = InStr(Results(c - 1) + 1, Expression, Delimiter)
    Loop

    ' Size our return array
    ReDim Preserve ResultSplit(0 To c)

    ' Populate the array
    If c = 0 Then
        ' lazy man's call
        ResultSplit(0) = Expression
    Else
        ' typical call
        ResultSplit(0) = Left$(Expression, Results(0) - 1)
        For c = 0 To c - 2
            ResultSplit(c + 1) = Mid$(Expression, _
                Results(c) + DelLen, _
                Results(c + 1) - Results(c) - DelLen)
        Next c
        ResultSplit(c + 1) = Right$(Expression, SLen - Results(c) - DelLen + 1)
    End If

End Sub
Author's comments :
Donald's comments :

top | charts


SplitB05
submitted 10-Dec-2001 by Donald Lessau  
Revision 001, 01-Jun-2002
Doping: TLB (cf. Dope'n'Declarations)
Public Sub SplitB05( _
    Expression As String, _
    asToken() As String, _
    Optional Delimiter As String = " ")
' by Donald, donald@xbeat.net, 20011209, rev 001 20020601
' needs FastString typelib
  Dim lenExp As Long
  Dim lenDel As Long
  Dim aPosToken() As Long
  Dim posToken As Long
  Dim lenToken As Long
  Dim cntDelim As Long
  Dim pExp As Long
  Dim tmp As Long
  Dim i As Long
  
  lenExp = Len(Expression)
  lenDel = Len(Delimiter)
  
  ' Bail if we were passed an empty delimiter or an empty expression
  If lenExp = 0 Or lenDel = 0 Then
    ReDim Preserve asToken(0 To 0)
    asToken(0) = Expression
    Exit Sub
  End If
  
  ' Count delimiters and remember their positions
  ReDim Preserve aPosToken(0 To lenExp \ lenDel)  'max possible token
  tmp = InStr(Expression, Delimiter)
  Do While tmp
    cntDelim = cntDelim + 1
    aPosToken(cntDelim) = tmp + lenDel - 1
    tmp = InStr(tmp + lenDel, Expression, Delimiter)
  Loop
 
  ' Size our return array
  ReDim Preserve asToken(0 To cntDelim)
  
  ' Populate the array
  pExp = StrPtr(Expression)
  For i = 0 To cntDelim - 1
    posToken = pExp + aPosToken(i) + aPosToken(i)
    lenToken = aPosToken(i + 1) - lenDel - aPosToken(i)
    asToken(i) = FastString.SysAllocStringLen(ByVal posToken, lenToken)
  Next
  posToken = pExp + aPosToken(cntDelim) + aPosToken(cntDelim)
  lenToken = lenExp - aPosToken(cntDelim)
  asToken(cntDelim) = FastString.SysAllocStringLen(ByVal posToken, lenToken)
End Sub
Author's comments :
Donald's comments :

top | charts


SplitC01
submitted 01-Jun-2002 by Donald Lessau  
Doping: TLB (cf. Dope'n'Declarations)
Public Function SplitC01( _
    Expression As String, _
    asToken() As String, _
    Optional Delimiter As String = " ") _
    As Long
' by Donald, donald@xbeat.net, 20020601
' based on Sub SplitB05, but returns returns token count
' needs FastString typelib
  Dim lenExp As Long
  Dim lenDel As Long
  Dim aPosToken() As Long
  Dim posToken As Long
  Dim lenToken As Long
  Dim cntDelim As Long
  Dim pExp As Long
  Dim tmp As Long
  Dim i As Long
  
  lenExp = Len(Expression)
  lenDel = Len(Delimiter)
  
  ' Bail if we were passed an empty delimiter or an empty expression
  If lenExp = 0 Or lenDel = 0 Then
    ReDim Preserve asToken(0 To 0)
    asToken(0) = Expression
    SplitC01 = 1
    Exit Function
  End If
  
  ' Count delimiters and remember their positions
  ReDim Preserve aPosToken(0 To lenExp \ lenDel)  'max possible token
  tmp = InStr(Expression, Delimiter)
  Do While tmp
    cntDelim = cntDelim + 1
    aPosToken(cntDelim) = tmp + lenDel - 1
    tmp = InStr(tmp + lenDel, Expression, Delimiter)
  Loop
 
  ' Size our return array
  ReDim Preserve asToken(0 To cntDelim)
  
  ' Populate the array
  pExp = StrPtr(Expression)
  For i = 0 To cntDelim - 1
    posToken = pExp + aPosToken(i) + aPosToken(i)
    lenToken = aPosToken(i + 1) - lenDel - aPosToken(i)
    asToken(i) = FastString.SysAllocStringLen(ByVal posToken, lenToken)
  Next
  posToken = pExp + aPosToken(cntDelim) + aPosToken(cntDelim)
  lenToken = lenExp - aPosToken(cntDelim)
  asToken(cntDelim) = FastString.SysAllocStringLen(ByVal posToken, lenToken)
  
  ' Return count tokens
  SplitC01 = cntDelim + 1
  
End Function
Author's comments :
Donald's comments :

top | charts




VBspeed © 2000-10 by Donald Lessau