With Linux for the Sony PS3, the IBM QS2x blades and the Toshiba Celleb platform having hit mainstream Linux distributions, programming for the Cell BE is becoming increasingly interesting for developers of high-performance computing applications. This talk covers the concepts of the architecture and how to develop applications for it.
Most importantly, there will be an overview of new features and the latest developments, including:
- Preemptive scheduling on SPUs (finally!): While it has been possible to run concurrent SPU programs for some time, only a very limited scheduler was implemented. Now we have a full time-slicing scheduler with normal and realtime priorities, SPU affinity and gang scheduling.
- Using SPUs for offloading kernel tasks: There are a few compute-intensive tasks, such as RAID-6 or IPsec processing, that can benefit from running partially on an SPU. Interesting aspects of the implementation are how to balance kernel SPU threads against user processing, how to communicate efficiently with the SPU from the kernel, and measurements showing whether the offload is actually worthwhile.
- Overlay programming: One significant limitation of the SPU is the size of the local memory that is used for both its code and data. Recent compilers support overlays of code segments, a technique widely known in the previous century but mostly forgotten in Linux programming nowadays.
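To give a feel for the kind of compute-intensive kernel task mentioned above, here is a minimal sketch of the RAID-6 P/Q syndrome calculation in plain, portable C. It is not the kernel's implementation and not SPU code. The function names `gf_mul2` and `raid6_gen_syndrome` are chosen for this illustration only, and the sketch assumes the usual RAID-6 field GF(2^8) with polynomial 0x11d and generator 2.

```c
#include <stddef.h>
#include <stdint.h>

/* Multiply by the generator g = 2 in GF(2^8), reducing by the
 * polynomial 0x11d (assumed here; the usual choice for RAID-6). */
static uint8_t gf_mul2(uint8_t x)
{
    return (uint8_t)((x << 1) ^ ((x & 0x80) ? 0x1d : 0));
}

/* Illustrative helper: compute the P (plain XOR) and Q (Reed-Solomon)
 * syndromes over ndisks data blocks of len bytes each.
 *   P = D0 ^ D1 ^ ... ^ Dn-1
 *   Q = D0 ^ g*D1 ^ g^2*D2 ^ ... ^ g^(n-1)*Dn-1
 */
static void raid6_gen_syndrome(size_t ndisks, size_t len,
                               const uint8_t *const *data,
                               uint8_t *p, uint8_t *q)
{
    for (size_t b = 0; b < len; b++) {
        uint8_t wp = 0, wq = 0;

        /* Horner's rule, starting from the highest-numbered disk,
         * so each step needs only one multiplication by g. */
        for (size_t d = ndisks; d-- > 0; ) {
            wq = gf_mul2(wq) ^ data[d][b];
            wp ^= data[d][b];
        }
        p[b] = wp;
        q[b] = wq;
    }
}
```

The inner loop is nothing but shifts, masks and XORs on independent bytes, which is why this sort of work is a plausible candidate for a SIMD unit such as the SPU. Whether moving it there pays off once the data transfer is accounted for is exactly the measurement question raised above.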