PULSAR 2.0.0
Parallel Ultra-Light Systolic Array Runtime
PRT communication proxy.
#include "prt_proxy.h"
Functions
| Return type | Function | Description |
| --- | --- | --- |
| prt_proxy_t * | prt_proxy_new (int num_agents) | Creates a proxy. |
| void | prt_proxy_delete (prt_proxy_t *proxy) | Destroys a proxy. Checks whether all the lists are empty at the time of destruction. Does not destroy the list of receives, which is destroyed at the end of the proxy's cycle. |
| void | prt_proxy_max_channel_size (prt_proxy_t *proxy, prt_channel_t *channel) | Looks for the maximum channel/packet size. |
| void | prt_proxy_recv (prt_proxy_t *proxy, prt_request_t *request) | Receives to a channel. |
| void | prt_proxy_mpi (prt_proxy_t *proxy) | Implements the proxy's MPI cycle. Services all MPI requests. |
| void | prt_proxy_cuda (prt_proxy_t *proxy) | Implements the proxy's CUDA cycle. Services all local transfer requests. Runs all device code. |
| double | prt_proxy_run (prt_proxy_t *proxy) | Implements the proxy's production cycle. First synchronizes with all MPI processes at a barrier, then synchronizes with all local worker threads at a barrier and starts the timer. When finished, synchronizes with all local worker threads again, then with all MPI processes, and stops the timer. |
PRT communication proxy.
The proxy executes all MPI communication and all CUDA code. When there are multiple CUDA devices, the proxy services all of them. Device-to-device communication is implemented as a staged transfer: device-to-host followed by host-to-device. Where supported, direct device-to-device communication is also possible through the prt_packet_device_to_device_direct function (currently not used). MPI transfers involving devices are likewise staged: device-to-host followed by MPI communication.
PULSAR Runtime http://icl.utk.edu/pulsar/ Copyright (C) 2012-2015 University of Tennessee.
Definition in file prt_proxy.c.
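The staged device-to-device transfer described above can be pictured with a small host-only sketch. Everything named `sketch_*` is hypothetical and not part of PULSAR; the two copy helpers use `memcpy` as a stand-in for what would presumably be `cudaMemcpy` calls with `cudaMemcpyDeviceToHost` and `cudaMemcpyHostToDevice`, so the example runs without a GPU. It also assumes the staging buffer is large enough to hold the whole packet.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins: on real hardware these would wrap device copies.
 * Here "device" memory is ordinary host memory so the sketch runs anywhere. */
static void sketch_device_to_host(void *host, const void *dev, size_t bytes)
{
    memcpy(host, dev, bytes);
}

static void sketch_host_to_device(void *dev, const void *host, size_t bytes)
{
    memcpy(dev, host, bytes);
}

/* Staged device-to-device copy: the packet first lands in a host staging
 * buffer and is then pushed to the destination device. */
static void sketch_staged_copy(void *dst_dev, const void *src_dev, size_t bytes,
                               void *staging, size_t staging_bytes)
{
    /* Assumption: the staging buffer can hold the largest packet. */
    assert(bytes <= staging_bytes);
    (void)staging_bytes;

    sketch_device_to_host(staging, src_dev, bytes);  /* stage 1: device to host */
    sketch_host_to_device(dst_dev, staging, bytes);  /* stage 2: host to device */
}
```

A call such as `sketch_staged_copy(dst, src, n, stage, sizeof stage)` leaves `dst` holding the same `n` bytes as `src`. The same two-step shape would apply to the staged MPI path, with the second step replaced by an MPI send from the staging buffer.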
void prt_proxy_cuda (prt_proxy_t *proxy)
Implements the proxy's CUDA cycle. Services all local transfer requests. Runs all device code.
| proxy | – The proxy to cycle CUDA. |
Definition at line 256 of file prt_proxy.c.
void prt_proxy_delete (prt_proxy_t *proxy)
Destroys a proxy. Checks whether all the lists are empty at the time of destruction. Does not destroy the list of receives, which is destroyed at the end of the proxy's cycle.
| proxy | – The proxy to destroy. |
Definition at line 86 of file prt_proxy.c.
void prt_proxy_max_channel_size (prt_proxy_t *proxy, prt_channel_t *channel)
Looks for the maximum channel/packet size.
| proxy | – The proxy registering the size. |
| channel | – The channel to register the size of. |
Definition at line 132 of file prt_proxy.c.
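One plausible reading of "looks for the maximum channel/packet size" is a running maximum that is updated each time a channel is registered. The sketch below illustrates that idea only; the `sketch_proxy_t` and `sketch_channel_t` types and their fields are hypothetical and do not reflect the actual layout of `prt_proxy_t` or `prt_channel_t`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the real PULSAR structures. */
typedef struct {
    size_t size;              /* packet size carried by this channel, in bytes */
} sketch_channel_t;

typedef struct {
    size_t max_channel_size;  /* largest channel size registered so far */
} sketch_proxy_t;

/* Records the channel's size if it exceeds the largest one seen so far. */
static void sketch_max_channel_size(sketch_proxy_t *proxy,
                                    const sketch_channel_t *channel)
{
    assert(proxy != NULL && channel != NULL);
    if (channel->size > proxy->max_channel_size)
        proxy->max_channel_size = channel->size;
}
```

Knowing the largest packet size up front would let a proxy allocate a single staging buffer that fits any transfer, although the source does not say how the value is used.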
void prt_proxy_mpi (prt_proxy_t *proxy)
Implements the proxy's MPI cycle. Services all MPI requests.
| proxy | – The proxy to cycle MPI. |
Definition at line 187 of file prt_proxy.c.
prt_proxy_t * prt_proxy_new (int num_agents)
Creates a proxy.
| num_agents | – The number of local agents (threads + devices). |
Definition at line 30 of file prt_proxy.c.
void prt_proxy_recv (prt_proxy_t *proxy, prt_request_t *request)
Receives to a channel.
| proxy | – The proxy to receive the request. |
| request | – The receive request to process. |
Definition at line 150 of file prt_proxy.c.
double prt_proxy_run (prt_proxy_t *proxy)
Implements the proxy's production cycle. First synchronizes with all MPI processes at a barrier, then synchronizes with all local worker threads at a barrier and starts the timer. When finished, synchronizes with all local worker threads again, then with all MPI processes, and stops the timer.
| proxy | – The proxy to run. |
Definition at line 319 of file prt_proxy.c.
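The barrier and timer ordering of the production cycle can be sketched with POSIX threads. This is an illustration under stated assumptions, not the code in prt_proxy.c: all `sketch_*` names are hypothetical, the MPI barriers are shown only as comments so the example runs without MPI, and the workers do no real work between the two local barriers.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical shared state: one barrier for the proxy and all workers. */
typedef struct {
    pthread_barrier_t barrier;
} sketch_run_t;

static void *sketch_worker(void *arg)
{
    sketch_run_t *run = arg;
    pthread_barrier_wait(&run->barrier);  /* local start barrier */
    /* ... the worker's share of the computation would run here ... */
    pthread_barrier_wait(&run->barrier);  /* local finish barrier */
    return NULL;
}

static double sketch_now(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/* Returns the time in seconds between the two local barriers as seen by
 * the proxy thread, or -1.0 if the thread array cannot be allocated. */
static double sketch_proxy_run(int num_workers)
{
    assert(num_workers > 0);
    sketch_run_t run;
    pthread_t *threads = malloc((size_t)num_workers * sizeof *threads);
    if (threads == NULL)
        return -1.0;

    /* +1 because the proxy thread itself takes part in the local barriers. */
    int rc = pthread_barrier_init(&run.barrier, NULL, (unsigned)num_workers + 1);
    assert(rc == 0);
    for (int i = 0; i < num_workers; i++) {
        rc = pthread_create(&threads[i], NULL, sketch_worker, &run);
        assert(rc == 0);
    }
    (void)rc;

    /* An MPI build would synchronize all processes here (e.g. MPI_Barrier). */
    pthread_barrier_wait(&run.barrier);   /* local start barrier */
    double start = sketch_now();

    /* ... the proxy would service MPI and CUDA requests here ... */

    pthread_barrier_wait(&run.barrier);   /* local finish barrier */
    /* An MPI build would synchronize all processes again here. */
    double elapsed = sketch_now() - start;

    for (int i = 0; i < num_workers; i++)
        pthread_join(threads[i], NULL);
    pthread_barrier_destroy(&run.barrier);
    free(threads);
    return elapsed;
}
```

Calling `sketch_proxy_run(4)` spawns four workers and returns a non-negative elapsed time. Starting the timer only after the local start barrier keeps thread start-up cost out of the measurement, while stopping it after the final process-wide barrier means the reported time covers the slowest participant.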