The PaRSEC / DPLASMA community has put together a few tutorials that explain how to use PaRSEC in a distributed heterogeneous setting, how to implement your own DSL on top of it, and how to improve the efficiency of your algorithms by using a dataflow-like, task-based programming environment.
Here are two of the most recent tutorials (1 and 2).