An excellent article by Bryan Cantrill and Jeff Bonwick on how to write multi-threaded code. With more processors and no magic bullet solution for how to use them, knowing how to write multiprocessor code that doesn't screw up your system is still a valuable skill. Some topics:
Know your cold paths from your hot paths.
Intuition is frequently wrong—be data intensive.
Know when—and when not—to break up a lock.
Be wary of readers/writer locks.
Consider per-CPU locking.
Know when to broadcast—and when to signal.
Learn to debug postmortem.
Design your systems to be composable.
Don't use a semaphore where a mutex would suffice.
Consider memory retiring to implement per-chain hash-table locks.
Be aware of false sharing.
Consider using nonblocking synchronization routines to monitor contention.
When reacquiring locks, consider using generation counts to detect state change.
Use wait- and lock-free structures only if you absolutely must.
Prepare for the thrill of victory—and the agony of defeat.
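To make one of these concrete, here is a minimal sketch of the generation-count idea from the list. It is written in Python rather than the C the article targets, and the `Registry` class, its methods, and the `factory` callback are all hypothetical names invented for illustration, not anything from the article. The idea: bump a counter on every change made under the lock, remember its value before dropping the lock for slow work, and compare after reacquiring to find out whether the state moved in the meantime.

```python
import threading


class Registry:
    """Hypothetical shared table guarded by one lock and a generation count."""

    def __init__(self):
        self._lock = threading.Lock()
        self._items = {}
        self._gen = 0  # bumped on every mutation, always under the lock

    def put(self, key, value):
        with self._lock:
            self._items[key] = value
            self._gen += 1

    def get_or_create(self, key, factory):
        """Return the value for key, building it outside the lock if missing."""
        while True:
            with self._lock:
                if key in self._items:
                    return self._items[key]
                seen = self._gen  # remember the state we observed

            value = factory(key)  # slow work, done with the lock dropped

            with self._lock:
                if self._gen == seen:  # nothing changed while we were away
                    self._items[key] = value
                    self._gen += 1
                    return value
            # Generation moved: our view is stale, so loop and re-check.
```

The count is deliberately coarse: any change to the table forces a retry, even an unrelated one. That trades some wasted work for a check that is a single integer comparison.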
While I don't agree that code using locks can be made composable, this article covers a lot of very useful nitty-gritty details that will up your expert rating a couple of points.
Reader Comments (5)
"While I don't agree that code using locks can be made composable"
How about expanding on that?
Sure, Dan. In my experience, working on many largish real-time embedded systems, code grows organically, which means it can't be composed. With a number of developers working on a large code base, code is inserted all over the place. Look at any code and you can't tell its locking behaviour. You can't tell if a lock was taken before it. You can't tell if a function call takes a lock. You can't tell how long those locks are held. When you write a function you can't tell if someone will call it from an ISR context. You can't tell if someone will just insert it on a critical path. There is no total view of the threading/locking behaviour from static code. So you can't make code that uses locks composable. The results will be unpredictable in any realistic future.
So no locks. Use actors. Pass messages. Don't share anything. You can still deadlock at a protocol level of course. But that's a little easier to detect.
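A minimal sketch of that style, assuming a thread plus a `queue.Queue` inbox stands in for an actor. `CounterActor` and its message names are hypothetical, made up for illustration. The state is touched only by the actor's own thread, so callers never take a lock themselves (the queue handles its own synchronization internally).

```python
import queue
import threading


class CounterActor:
    """Hypothetical actor: owns its state; others reach it only via messages."""

    def __init__(self):
        self._inbox = queue.Queue()
        self._count = 0  # touched only by the actor's own thread
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        # Handle one message at a time, in arrival order.
        while True:
            msg, reply = self._inbox.get()
            if msg == "stop":
                break
            if msg == "incr":
                self._count += 1
            elif msg == "get":
                reply.put(self._count)

    def incr(self):
        self._inbox.put(("incr", None))  # fire and forget

    def get(self):
        reply = queue.Queue(maxsize=1)
        self._inbox.put(("get", reply))
        return reply.get()  # blocks until the actor answers

    def stop(self):
        self._inbox.put(("stop", None))
        self._thread.join()
```

The blocking `get()` is where the protocol-level deadlock can creep in: two actors that each wait on a reply from the other will hang just as surely as two threads taking locks in opposite order.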
I have no doubt someone will say that if you fully document functions, really check the code, really be careful, etc., then locks can be composable. But that's not how code gets written. Code is written under tremendous time constraints by people who are often not real-time experts. The result is predictable.
I'm afraid that you're falling into the same trap as many: just because one can write terrible code with uncomposable locking semantics does not mean that locks themselves are uncomposable. Certainly, I would advocate (as we did in the piece) using simpler constructs when and where appropriate -- but let us not condemn software to its lowest common denominator. Indeed, we who develop the Solaris kernel have -- for years -- done exactly what you describe: fully document functions, check the code, and above all, carefully design a composable locking strategy.
Hunters carefully place traps where their prey are most likely to walk. This is how traps work and why they are so successful. It's not really lowest common denominator. It's what happens in the wild. If we looked at the historical bug list for the Solaris kernel, I'm betting we'd see quite a few concurrency-related errors. And that's in a very rarefied and specialized environment. Even Mars Pathfinder ran into a priority inversion problem, which is real-time 401. Step outside these specialized environments and the risks are much greater.