A Fast String Class for ASP Pages

By Peter A. Bromberg, Ph.D.


If you've done a lot of website development with ASP, particularly using VBScript to generate dynamic page elements such as tables from a database query, then you know full well that VBScript (and, in general, ANY interpreted scripting language) absolutely STINKS at string concatenation. In .NET, you have a much more efficient object, StringBuilder, that was specially designed for this purpose. But with VBScript, and even in a compiled Visual Basic COM DLL, repetitive string concatenation is not only notoriously slow, it can run the CPU right off the top of the meter! Ouch! Not good, eh?



You know what I'm referring to:

While Not rs.EOF
    strTable = strTable & "<TR><TD>" & rs("Name") & "</TD><TD>" & rs("address") & ....
    ... etc. etc.

A few years ago, Francesco Balena put out a tremendous article called something like "A Fast Class for Strings" that used (among other things) the Win API functions CopyMemory and FillMemory, together with byte arrays and pointers, to speed up most string operations in VB. Since that time I've seen a few stabs taken at optimizing string concatenation in ASP, and so here's my own.

I think the key item here is first to understand how VBScript (and Visual Basic too) handles strings, and see if there is a way "around" it. Once you understand what you are asking the scripting engine to do when you write "strMyBigString = strMyBigString & strMyLittleBittyString" 100 times, you'll know why you want to avoid it. Not only will those dynamically assembled ASP web pages render up to 13 times faster, but your CPU will run a lot cooler as well. In my book, both of those are good goals!

So what does VB actually do when you want to concatenate a substring to an existing string?

Strings in Visual Basic are stored as BSTRs. A variable of type String, such as "strMyString", is really a four-byte POINTER (on 32-bit Windows) to the first character of the string data. Just in front of that data sits a four-byte length prefix, and the data itself is stored as UNICODE, two bytes per character, followed by a null terminator. If you use the function VarPtr on a String variable, you get the address of the variable itself, that is, a pointer to the pointer. To get the address of the string buffer, use the StrPtr function, which returns the address of the first character. When you "concatenate" strings with the "&" operator, VB cannot simply extend the existing buffer: behind the scenes it allocates a new, longer BSTR and copies both operands into it. Because every concatenation re-copies everything accumulated so far, the total work grows roughly with the SQUARE of the number of concatenations, not in a straight line.
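To put rough numbers on that, here is a small cost model, written in JavaScript purely for illustration. It assumes that every concatenation copies the whole accumulated string plus the new piece into a fresh buffer; the function names and that copy-everything assumption are mine, a simplification and not a measurement of what the VB runtime actually does.

```javascript
// Hypothetical cost model: each concatenation is assumed to copy the
// entire accumulated string plus the new piece into a fresh buffer.
function charsCopiedByRepeatedConcat(pieceLength, count) {
  let total = 0;
  let currentLength = 0;
  for (let i = 0; i < count; i++) {
    currentLength += pieceLength; // length of the new, longer string
    total += currentLength;       // every one of its characters is copied
  }
  return total;
}

// With a single join, each character is copied once into the result.
function charsCopiedBySingleJoin(pieceLength, count) {
  return pieceLength * count;
}

const pieceLength = "This is a substring".length; // 19 characters
console.log(charsCopiedByRepeatedConcat(pieceLength, 5000)); // 237547500
console.log(charsCopiedBySingleJoin(pieceLength, 5000));     // 95000
```

Under this model, doubling the number of concatenations roughly quadruples the characters copied the old way, while the single join only doubles.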

But what about arrays? VB and VBScript, its stunted little sister, have intrinsic array functions such as Join that are, not surprisingly, MUCH FASTER at concatenating variant array elements. So how about if we just cobble together a little VBScript Class that keeps our stuff in an array (it's always a Variant anyway, so what's the difference?), and then, when we're finished with all of our concatenations, have it do a Join on the array with NO DELIMITER so it all comes back as ONE BIG LONG STRING, in a SINGLE OPERATION? Make sense?

Here's my take on a Fast String Class for VBScript:

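Class FastString
    ' stringArray holds the pieces, growthRate is the resize step,
    ' and numItems counts the pieces appended so far.
    Private stringArray, growthRate, numItems

    Private Sub Class_Initialize()
        growthRate = 50
        numItems = 0
        ReDim stringArray(growthRate)
    End Sub

    Public Sub Append(ByVal strValue)
        ' Next line prevents a type mismatch error if strValue is Null. Performance hit is negligible.
        strValue = strValue & ""
        ' Grow the array by growthRate slots once every existing slot is used.
        If numItems > UBound(stringArray) Then ReDim Preserve stringArray(UBound(stringArray) + growthRate)
        stringArray(numItems) = strValue
        numItems = numItems + 1
    End Sub

    Public Sub Reset
        Erase stringArray
        Class_Initialize
    End Sub

    Public Function concat()
        ' Trim the array to the used portion (the one spare slot left at the
        ' end joins as an empty string), then join with no delimiter.
        ReDim Preserve stringArray(numItems)
        concat = Join(stringArray, "")
    End Function
End Class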

When the class is instantiated, Class_Initialize() sets the growthRate and ReDims our stringArray. When we call Append, we simply add our substring as a new element and increment the numItems counter to keep track of "how many". If the array is already full, we first ReDim Preserve it, which keeps the existing elements and adds another growthRate worth of slots. Reset is just a convenience member that empties the class so you can reuse the same instance. Finally, "concat" performs the Join and returns our final string.

So how much faster is it? The test page that accompanies this article is a client-side VBScript page that performs the operation 5,000 times on the substring "This is a substring" the old way, and then the new way, and shows you the times and the speed difference ratio. (Remember, I said "5,000 times", so give it a few seconds to finish and display.) To get the code, just "View source" on that page.

Now of course, most performance-minded programmers will be asking, "I wonder if this will work in JavaScript, too?" You bet it does! In fact, the difference with JavaScript is even more dramatic! The JavaScript version of the test page lets you find out for yourself (again, this is client-side code, so it may take a while to do the 5,000 iterations with each method).
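For reference, here is a minimal JavaScript sketch of the same idea, using only the standard Array methods push and join. The FastString name and its members mirror the VBScript class above, but this port is my own illustration and should not be taken as the code behind the test page.

```javascript
// Illustrative port of the FastString idea: collect the pieces in an
// array, then build the final string with a single join.
class FastString {
  constructor() {
    this.parts = [];
  }

  // Treat null/undefined as an empty string, like the strValue & ""
  // guard in the VBScript class; everything else is converted to text.
  append(value) {
    this.parts.push(value == null ? "" : String(value));
    return this; // returning this allows chained append calls
  }

  // Throw away the collected pieces so the instance can be reused.
  reset() {
    this.parts = [];
  }

  // Join all pieces with no delimiter, in one operation.
  concat() {
    return this.parts.join("");
  }
}

const builder = new FastString();
for (let i = 0; i < 3; i++) {
  builder.append("<TR><TD>").append(i).append("</TD></TR>");
}
console.log(builder.concat());
// prints <TR><TD>0</TD></TR><TR><TD>1</TD></TR><TR><TD>2</TD></TR>
```

JavaScript arrays grow on their own, so there is no growthRate bookkeeping here. Current JavaScript engines also optimize plain "+=" concatenation heavily, so time both approaches in your target browser before counting on the same speedup.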

So if you are saying "that's cool" and yet you go off and continue to concatenate your strings the old way, I leave you with this little gem to ponder:

"How many psychiatrists does it take to change a lightbulb?"
"Just one, but it will take a long time, and the bulb has to really want to change."

Peter Bromberg is an independent consultant specializing in distributed .NET solutions in Orlando and a co-developer of the NullSkull.com developer website. He can be reached at info@eggheadcafe.com