Swift Performance Notes

I had a client who wanted to rewrite a core piece of their app's functionality in Swift. It was a pitch detection library written in C. I won't go into detail about it, except to say that it calculated FFTs and ran pitch analysis algorithms in real time, entirely on the CPU.

Additionally, we weren't going to use any of the Accelerate framework to do this. Primarily, we didn't want the user experience to change too drastically, and while we were ramping up on digital signal processing, nobody was an expert. We figured this was a reasonable way to mitigate the risk around this project. The other reason was that it has been rumored you can write Swift that performs like C if it is written in the right way, and we had budget for this curiosity.

I felt a bit in over my head, but it sounded like a really fun project, and I had really enjoyed the detail-oriented numerical work of writing a fluid simulator with Metal. I laid out a strategy that would let us manage the uncertainty, and we got started.

In the beginning, the Swift implementation was about 195% slower than the C reference implementation. With the following tricks, however, our implementation was on par when analyzing an audio sample, and could run up to 2.5% faster than the C implementation!

Okay, but mostly they both took 15ms on average... Perhaps the bigger accomplishment was taking the Swift implementation's runtime from 47ms to 15ms.

Context Matters

Something to keep in mind is that these performance optimizations really make a difference for code where the milliseconds actually matter. In our case we are reading data from the microphone, and processing 1024 audio frames of it at a time, and returning the pitch that the person is singing. The implementation was lots of tight loops, some nested, reading and writing contiguous Float data. For each group of frames, we would have to run the FFT algorithm, and then perform the pitch tracking on its output.

Create a Separate Build

I recommend creating a separate build for running something like swift-benchmark, in release mode. The measure blocks in XCTest are OK, but I found them to be somewhat buggy. As you'll see in the following section, release mode is the right place to find out what the actual performance will be, because debug and release builds produce different binaries. When code is performance-sensitive, I will absolutely write it in a way that allows it to be measured in isolation. This may not always be possible, but it turned into a great process for improving performance.
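To make "measured in isolation" a bit more concrete, here is a minimal sketch of a hand-rolled timing harness. It is not swift-benchmark and not the harness we used; averageMilliseconds is a hypothetical helper, and the sum-of-squares workload is only a stand-in for the real analysis code. It assumes you build it with optimizations on (for example swiftc -O, or swift build -c release).

```swift
import Dispatch

/// Hypothetical helper: runs `body` `iterations` times and returns the
/// average wall-clock time per run, in milliseconds.
func averageMilliseconds(iterations: Int, _ body: () -> Void) -> Double {
    precondition(iterations > 0, "need at least one iteration")
    let start = DispatchTime.now().uptimeNanoseconds
    for _ in 0..<iterations {
        body()
    }
    let end = DispatchTime.now().uptimeNanoseconds
    return Double(end - start) / Double(iterations) / 1_000_000
}

// Stand-in workload: sum of squares over one 1024-frame buffer.
let frames = [Float](repeating: 0.5, count: 1024)

// The result is stored outside the closure so the optimizer
// cannot simply discard the work being measured.
var sink: Float = 0

let ms = averageMilliseconds(iterations: 1_000) {
    var total: Float = 0
    for sample in frames {
        total += sample * sample
    }
    sink = total
}

print("average: \(ms) ms per run, checksum: \(sink)")
```

A dedicated benchmarking library adds warm-up runs and statistics on top of this idea, so treat the sketch as the smallest thing that works rather than a replacement.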

Write Unsafe Code

Apple makes it sound very dangerous to access memory directly. You'll know this if you've watched any of the videos on Unsafe Swift. But anyone can do it. Swift does a lot of nice things like type checking and bounds checking. If you're careful, you can get some speedups by getting rid of the assembly generated for checks you don't need.

One gotcha is that when you are measuring performance, you need to run the code in release mode. In debug mode there are still bounds checks being made, so the code will look slower than it actually is. See "Create a Separate Build".
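As an illustration of what "unsafe" looks like in practice, here is a small sketch that scales a buffer of samples through a raw pointer. The scale function and the window data are made up for this example and are not from the pitch detection library. The point is that once you hold the base address, staying inside the buffer is entirely your responsibility.

```swift
/// Hypothetical helper: multiplies every sample by `gain`, in place,
/// going through a raw pointer instead of the array subscript.
func scale(_ samples: inout [Float], by gain: Float) {
    samples.withUnsafeMutableBufferPointer { buffer in
        // An empty array may have no base address at all.
        guard let base = buffer.baseAddress else { return }
        let count = buffer.count
        var i = 0
        while i < count {
            // Raw pointer access: keeping `i` inside 0..<count is our job.
            base[i] *= gain
            i += 1
        }
    }
}

var window: [Float] = [0.5, -1.0, 0.25, 2.0]
scale(&window, by: 2)
print(window) // [1.0, -2.0, 0.5, 4.0]
```

The pointer is only valid inside the closure, so nothing derived from it should escape.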

Overflow Operators

Swift has overflow operators, &+, &-, and &*, which can be used for integer arithmetic. They skip the overflow checks, so the result wraps around instead of trapping at runtime. Again, caution is advised. Know the data you're working with, and understand that UInt32.max &+ 1 == 0.
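A short sketch of the wrapping behaviour, using only standard library operators. The values are picked for illustration; addingReportingOverflow is shown as the checked alternative for cases where overflow is a real possibility.

```swift
let top = UInt32.max
let wrapped = top &+ 1            // wraps around to 0 instead of trapping

let floor: UInt8 = 0
let under = floor &- 1            // wraps around to 255

let product = Int8(64) &* 2       // 128 does not fit in Int8, wraps to -128

// Checked alternative: returns the wrapped value plus an overflow flag.
let (partial, didOverflow) = top.addingReportingOverflow(1)

print(wrapped, under, product, partial, didOverflow) // 0 255 -128 0 true
```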

repeat/while Loops Instead of for Loops

Using repeat/while loops instead of for loops in Swift improved our performance by 80.83%, and in some more specific areas by more than 100%. A for loop generates a lot more assembly, whereas a while or repeat loop gets you closer to what C would generate.
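Here is a sketch of the same tight loop written three ways. The energy functions are hypothetical stand-ins, not code from the library, and whether the while and repeat versions are actually faster will depend on your compiler version and build settings, so measure before and after.

```swift
/// Sum of squares using a `for` loop over a range.
func energyFor(_ samples: UnsafeBufferPointer<Float>) -> Float {
    var total: Float = 0
    for i in 0..<samples.count {
        total += samples[i] * samples[i]
    }
    return total
}

/// Same computation with a manually managed index and a `while` loop.
func energyWhile(_ samples: UnsafeBufferPointer<Float>) -> Float {
    let count = samples.count
    var total: Float = 0
    var i = 0
    while i < count {
        total += samples[i] * samples[i]
        i &+= 1
    }
    return total
}

/// `repeat`/`while` variant. The body runs before the condition is
/// tested, so the empty case has to be handled up front.
func energyRepeat(_ samples: UnsafeBufferPointer<Float>) -> Float {
    let count = samples.count
    guard count > 0 else { return 0 }
    var total: Float = 0
    var i = 0
    repeat {
        total += samples[i] * samples[i]
        i &+= 1
    } while i < count
    return total
}

let input: [Float] = [1, 2, 3, 4]
let results = input.withUnsafeBufferPointer { buffer in
    (energyFor(buffer), energyWhile(buffer), energyRepeat(buffer))
}
print(results) // (30.0, 30.0, 30.0)
```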

Fooled by inout

I made a mistake. When you pass something as an inout argument, you have to put an & in front of the variable name.

func myFunc(_ value: inout Int) {
    value = value + 1
}
var i = 10
myFunc(&i)
print(i) // 11
The value of i is copied in, updated, and copied back out.

You are not actually passing an address here as you would in C. You are telling Swift to pass in the value, modify it, and then copy it back out. In other words, copy-in copy-out can be expensive if that object happens to be holding a giant array of Float values. Try to pass by reference when you can. In our case the inout was entirely superfluous and I could simply remove it. For us, runtime improved by 15%, or 10ms.
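A sketch of the kind of superfluous inout described above. AnalysisState and both peakMagnitude functions are hypothetical names invented for this example. The first version takes its argument as inout even though it only reads from it; the second takes a plain parameter, so the caller drops the &.

```swift
/// Hypothetical state object holding a large buffer.
struct AnalysisState {
    var spectrum: [Float]
}

// Before: `inout` on a function that never writes to `state`.
func peakMagnitude(in state: inout AnalysisState) -> Float {
    var peak: Float = 0
    for value in state.spectrum where value > peak {
        peak = value
    }
    return peak
}

// After: a plain parameter is enough for read-only access.
func peakMagnitude(of state: AnalysisState) -> Float {
    var peak: Float = 0
    for value in state.spectrum where value > peak {
        peak = value
    }
    return peak
}

var state = AnalysisState(spectrum: [0.1, 0.9, 0.4])
let before = peakMagnitude(in: &state)   // call site needs `&`
let after = peakMagnitude(of: state)     // no `&`, same result
print(before, after) // 0.9 0.9
```

How much this saves will depend on the type and on what the optimizer manages to do, so it is worth confirming in a profiler rather than assuming.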

Unnecessary Type Casts

When starting this project, we wanted to try using Double instead of Float. This was totally fine. However, near the end of the project, when I was using Instruments to find any other spots where I could squeeze out performance, I noticed calls to Double() taking up 5ms of the runtime. The audio data came in as Float, meaning that I had to convert every sample that came in to Double.
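To show where those conversions hide, here is a sketch with two hypothetical mean-square functions. The first widens every sample to Double inside the hot loop; the second stays in Float, the type the audio data arrives in. Staying in Float trades some precision for fewer conversions, so this only makes sense if Float accuracy is enough for your analysis.

```swift
// Before: one Double() conversion per sample inside the loop.
func meanSquareWidening(_ samples: [Float]) -> Double {
    var total = 0.0
    for sample in samples {
        let wide = Double(sample)
        total += wide * wide
    }
    return total / Double(samples.count)
}

// After: the arithmetic stays in Float from input to result.
func meanSquare(_ samples: [Float]) -> Float {
    var total: Float = 0
    for sample in samples {
        total += sample * sample
    }
    return total / Float(samples.count)
}

let samples: [Float] = [0.5, -0.5, 1.0, -1.0]
print(meanSquareWidening(samples), meanSquare(samples)) // 0.625 0.625
```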

Conclusion

I think Swift can be pretty performant, and it's not too difficult to manage that when you need it. The language does a good job of being both safe and expressive, while also allowing developers to remove the guard rails. I find Swift can be awkward when working in unsafe territory, but it allows for abstractions which can make it elegant again.
