Optimizing a Xojo function
Today we want to look on seven different ways on how to implement a function to count the number of space characters in a string. Sounds simple, but depending how you code it, the performance will be much different.
All examples here use BackgroundTasks and NilObjectChecking disabled to avoid extra overhead. If background tasks are enabled, Xojo would insert a call to check for other threads due and we usually disable that in a tight loop as the check needed could take more than what the code inside the loop does. And when we deal with memory blocks below, we don't need to check whether the Memoryblock is nil on every access.
Using Middle
Let's start with the most high level view by using String.Middle function:
The Middle() function does some more on text processing here like looking how many bytes one UTF-8 character takes. Since our space we look for is just one byte, we don't need that and can optimize by using MiddleBytes later.
Using Mid
Middle above is new in API 2, but you can still use Mid() from API 1 for compatibility. That one is about the same speed or a bit quicker:
Please note that Middle() takes position zero based, while Mid() takes it one based.
Using MiddleBytes
Now since we don't care for all unicode made out of multiple bytes, we can simply scan for the space character with MiddleBytes:
This is much faster since the code doesn't need to check character boundaries.
Using Split
An alternative way is to use Split. Yes, it creates temporary strings and a temporary array, which we throw away. But since it only runs once over the string in C code inside the framework, it comes out faster.
Using SplitBytes
Faster than Split is SplitBytes, which doesn't care again about character boundaries.
Using MemoryBlock
When we use MemoryBlock, we can scan over the bytes ourselves. This can be much faster, but we have to do the work ourselves to check exactly. Converting string to MemoryBlock creates a copy, so if you do this several times, you may better cache the MemoryBlock.
Using Pointer
Using Ptr instead of MemoryBlock is faster. The advantage is that Int8Value() on MemoryBlock is a function call, while Int8() access to Ptr is directly build as CPU instructions. We have the pointer point to the memory in the Memoryblock. We skip the function call and the out of bounds checking within the function.
Results
We did some measurements. Running each function 100 times to measure averages.
Length of test string: 12999
Count1: 95.975 ms with Middle
Count2: 94.313 ms with Mid
Count3: 2.099 ms with MiddleBytes
Count4: 0.108 ms with Split
Count5: 0.117 ms with SplitBytes
Count6: 0.104 ms with Memoryblock
Count7: 0.004 ms with Ptr
Each call finds 1000 spaces in the test string. The result is always the same value with different amounts of time needed.
As you see, using bytes instead of characters can make a big difference. Using Split with a temporary array of 1000 strings being so fast is quite a surprise. In a real world application you may make your own Count function, which then uses different ways. If the text to search is only one byte long, the function may do the Ptr loop and otherwise maybe do a loop with Middle.
Final function
Here is a final function, which optimizes for three cases. First nothing to do if the text is empty. Second way is to do the Ptr loop if the search text is only one byte. And third case is for anything else with multi byte characters:
Let me know what you think about this and whether you have some improvements.
Update: We forgot CountFields and String.Characters iterator. Both are faster than the Middle loop, but CountFields is faster, so we may use that for our CountOccurrences function.