The number of cores integrated onto a single die is expected to climb steadily in the foreseeable future. This move to many-core chips is driven by a need to optimize performance per watt. How best to connect these cores and how to program the resulting many-core processor, however, is an open research question. Designs vary from GPUs to cache-coherent shared memory multiprocessors to pure distributed memory chips. The 48-core SCC processor reported in this paper is an intermediate case, sharing traits of message passing and shared memory architectures. The hardware has been described elsewhere. In this paper, we describe the programmer’s view of this chip. In particular we describe RCCE: the native message passing model created for the SCC processor.


Intel Corporation: Timothy G. Mattson, Rob F. Van der Wijngaart, Michael Riepen, Thomas Lehnig, Paul Brett, Werner Haas, Patrick Kennedy, Jason Howard, Sriram Vangal, Nitin Borkar, Greg Ruhl, Saurabh Dighe


International Conference for High Performance Computing, Networking, Storage & Analysis (SC10) http://sc10.supercomputing.org PDF