2.11. Stream Management
This section describes the stream management functions of the low-level CUDA driver application programming interface.
Functions
- CUresult cuStreamAddCallback ( CUstream hStream, CUstreamCallback callback, void* userData, unsigned int flags )
- Add a callback to a compute stream.
- CUresult cuStreamCreate ( CUstream* phStream, unsigned int Flags )
- Create a stream.
- CUresult cuStreamCreateWithPriority ( CUstream* phStream, unsigned int flags, int priority )
- Create a stream with the given priority.
- CUresult cuStreamDestroy ( CUstream hStream )
- Destroys a stream.
- CUresult cuStreamGetFlags ( CUstream hStream, unsigned int* flags )
- Query the flags of a given stream.
- CUresult cuStreamGetPriority ( CUstream hStream, int* priority )
- Query the priority of a given stream.
- CUresult cuStreamQuery ( CUstream hStream )
- Determine status of a compute stream.
- CUresult cuStreamSynchronize ( CUstream hStream )
- Wait until a stream's tasks are completed.
- CUresult cuStreamWaitEvent ( CUstream hStream, CUevent hEvent, unsigned int Flags )
- Make a compute stream wait on an event.
Functions
- CUresult cuStreamAddCallback ( CUstream hStream, CUstreamCallback callback, void* userData, unsigned int flags )
- Add a callback to a compute stream.
Parameters
- hStream
- - Stream to add callback to
- callback
- - The function to call once preceding stream operations are complete
- userData
- - User specified data to be passed to the callback function
- flags
- - Reserved for future use, must be 0
Returns
CUDA_SUCCESS, CUDA_ERROR_DEINITIALIZED, CUDA_ERROR_NOT_INITIALIZED, CUDA_ERROR_INVALID_CONTEXT, CUDA_ERROR_INVALID_HANDLE, CUDA_ERROR_NOT_SUPPORTED
Description
Adds a callback to be called on the host after all currently enqueued items in the stream have completed. For each cuStreamAddCallback call, the callback will be executed exactly once. The callback will block later work in the stream until it is finished.
The callback may be passed CUDA_SUCCESS or an error code. In the event of a device error, all subsequently executed callbacks will receive an appropriate CUresult.
Callbacks must not make any CUDA API calls. Attempting to use a CUDA API will result in CUDA_ERROR_NOT_PERMITTED. Callbacks must not perform any synchronization that may depend on outstanding device work or other callbacks that are not mandated to run earlier. Callbacks without a mandated order (in independent streams) execute in undefined order and may be serialized.
This API requires compute capability 1.1 or greater. See cuDeviceGetAttribute or cuDeviceGetProperties to query compute capability. Attempting to use this API with earlier compute versions will return CUDA_ERROR_NOT_SUPPORTED.
Note: This function may also return error codes from previous, asynchronous launches.
See also:
cuStreamCreate, cuStreamQuery, cuStreamSynchronize, cuStreamWaitEvent, cuStreamDestroy
- CUresult cuStreamCreate ( CUstream* phStream, unsigned int Flags )
- Create a stream.
Parameters
- phStream
- - Returned newly created stream
- Flags
- - Parameters for stream creation
Returns
CUDA_SUCCESS, CUDA_ERROR_DEINITIALIZED, CUDA_ERROR_NOT_INITIALIZED, CUDA_ERROR_INVALID_CONTEXT, CUDA_ERROR_INVALID_VALUE, CUDA_ERROR_OUT_OF_MEMORY
Description
Creates a stream and returns a handle in phStream. The Flags argument determines behaviors of the stream. Valid values for Flags are:
- CU_STREAM_DEFAULT: Default stream creation flag.
- CU_STREAM_NON_BLOCKING: Specifies that work running in the created stream may run concurrently with work in stream 0 (the NULL stream), and that the created stream should perform no implicit synchronization with stream 0.
Note: This function may also return error codes from previous, asynchronous launches.
See also:
cuStreamDestroy, cuStreamCreateWithPriority, cuStreamGetPriority, cuStreamGetFlags, cuStreamWaitEvent, cuStreamQuery, cuStreamSynchronize, cuStreamAddCallback
- CUresult cuStreamCreateWithPriority ( CUstream* phStream, unsigned int flags, int priority )
- Create a stream with the given priority.
Parameters
- phStream
- - Returned newly created stream
- flags
- - Flags for stream creation. See cuStreamCreate for a list of valid flags
- priority
- - Stream priority. Lower numbers represent higher priorities. See cuCtxGetStreamPriorityRange for more information about meaningful stream priorities that can be passed.
Returns
CUDA_SUCCESS, CUDA_ERROR_DEINITIALIZED, CUDA_ERROR_NOT_INITIALIZED, CUDA_ERROR_INVALID_CONTEXT, CUDA_ERROR_INVALID_VALUE, CUDA_ERROR_OUT_OF_MEMORY
Description
Creates a stream with the specified priority and returns a handle in phStream. This API alters the scheduler priority of work in the stream. Work in a higher priority stream may preempt work already executing in a low priority stream.
priority follows a convention where lower numbers represent higher priorities. '0' represents default priority. The range of meaningful numerical priorities can be queried using cuCtxGetStreamPriorityRange. If the specified priority is outside the numerical range returned by cuCtxGetStreamPriorityRange, it will automatically be clamped to the lowest or the highest number in the range.
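The clamping rule can be illustrated with a small sketch in plain C. The helper clamp_stream_priority is hypothetical and is not part of the driver API; it only mirrors the behavior described above, taking the two bounds in the order in which cuCtxGetStreamPriorityRange reports them (least priority first, then greatest priority). The range used in the usage note below is an assumed example, not a value guaranteed for any device.

```c
#include <assert.h>

/*
 * Hypothetical helper (not a CUDA driver function): models how a requested
 * priority is clamped into the meaningful range.
 *
 * Lower numbers mean higher priority, so the numerically smallest value is
 * greatest_priority and the numerically largest value is least_priority.
 */
int clamp_stream_priority(int requested, int least_priority, int greatest_priority)
{
    /* Sanity check on the assumed ordering of the two bounds. */
    assert(greatest_priority <= least_priority);

    if (requested < greatest_priority) {
        /* Requested priority is "higher" than the range allows. */
        return greatest_priority;
    }
    if (requested > least_priority) {
        /* Requested priority is "lower" than the range allows. */
        return least_priority;
    }
    /* Already inside the range: returned unchanged. */
    return requested;
}
```

Assuming a reported range of least priority 0 and greatest priority -1, clamp_stream_priority(5, 0, -1) yields 0 and clamp_stream_priority(-7, 0, -1) yields -1, while values already in the range are returned unchanged.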
Note:
- This function may also return error codes from previous, asynchronous launches.
- Stream priorities are supported only on Quadro and Tesla GPUs with compute capability 3.5 or higher.
- In the current implementation, only compute kernels launched in priority streams are affected by the stream's priority. Stream priorities have no effect on host-to-device and device-to-host memory operations.
See also:
cuStreamDestroy, cuStreamCreate, cuStreamGetPriority, cuCtxGetStreamPriorityRange, cuStreamGetFlags, cuStreamWaitEvent, cuStreamQuery, cuStreamSynchronize, cuStreamAddCallback
- CUresult cuStreamDestroy ( CUstream hStream )
- Destroys a stream.
Parameters
- hStream
- - Stream to destroy
Returns
CUDA_SUCCESS, CUDA_ERROR_DEINITIALIZED, CUDA_ERROR_NOT_INITIALIZED, CUDA_ERROR_INVALID_CONTEXT, CUDA_ERROR_INVALID_VALUE
Description
Destroys the stream specified by hStream.
If the device is still doing work in the stream hStream when cuStreamDestroy() is called, the function returns immediately and the resources associated with hStream are released automatically once the device has completed all work in hStream.
Note: This function may also return error codes from previous, asynchronous launches.
See also:
cuStreamCreate, cuStreamWaitEvent, cuStreamQuery, cuStreamSynchronize, cuStreamAddCallback
- CUresult cuStreamGetFlags ( CUstream hStream, unsigned int* flags )
- Query the flags of a given stream.
Parameters
- hStream
- - Handle to the stream to be queried
- flags
- - Pointer to an unsigned integer in which the stream's flags are returned. The value returned in flags is a bitwise OR of all flags that were used when creating this stream. See cuStreamCreate for the list of valid flags.
Returns
CUDA_SUCCESS, CUDA_ERROR_DEINITIALIZED, CUDA_ERROR_NOT_INITIALIZED, CUDA_ERROR_INVALID_CONTEXT, CUDA_ERROR_INVALID_VALUE, CUDA_ERROR_INVALID_HANDLE, CUDA_ERROR_OUT_OF_MEMORY
Description
Query the flags of a stream created using cuStreamCreate or cuStreamCreateWithPriority and return the flags in flags.
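Because the returned value combines the creation flags with OR, an individual flag is tested with a bit mask. The sketch below shows this in plain C so that it compiles without the CUDA toolkit. The SKETCH_ constants and the helper stream_is_non_blocking are hypothetical names introduced here; the constants are assumed to mirror the values of CU_STREAM_DEFAULT (0x0) and CU_STREAM_NON_BLOCKING (0x1) in cuda.h, and real code would use the cuda.h names directly.

```c
#include <stdbool.h>

/* Stand-in constants, assumed to mirror CU_STREAM_DEFAULT and
 * CU_STREAM_NON_BLOCKING from cuda.h. Hypothetical names. */
enum {
    SKETCH_STREAM_DEFAULT      = 0x0,
    SKETCH_STREAM_NON_BLOCKING = 0x1
};

/*
 * Hypothetical helper: returns true when the flags word (as it would be
 * written by cuStreamGetFlags) has the non-blocking bit set.
 */
bool stream_is_non_blocking(unsigned int flags)
{
    return (flags & SKETCH_STREAM_NON_BLOCKING) != 0;
}
```

In real code, the flags argument would be the value written by cuStreamGetFlags for the stream in question.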
Note: This function may also return error codes from previous, asynchronous launches.
See also:
- CUresult cuStreamGetPriority ( CUstream hStream, int* priority )
- Query the priority of a given stream.
Parameters
- hStream
- - Handle to the stream to be queried
- priority
- - Pointer to a signed integer in which the stream's priority is returned
Returns
CUDA_SUCCESS, CUDA_ERROR_DEINITIALIZED, CUDA_ERROR_NOT_INITIALIZED, CUDA_ERROR_INVALID_CONTEXT, CUDA_ERROR_INVALID_VALUE, CUDA_ERROR_INVALID_HANDLE, CUDA_ERROR_OUT_OF_MEMORY
Description
Query the priority of a stream created using cuStreamCreate or cuStreamCreateWithPriority and return the priority in priority. Note that if the stream was created with a priority outside the numerical range returned by cuCtxGetStreamPriorityRange, this function returns the clamped priority. See cuStreamCreateWithPriority for details about priority clamping.
Note: This function may also return error codes from previous, asynchronous launches.
See also:
cuStreamDestroy, cuStreamCreate, cuStreamCreateWithPriority, cuCtxGetStreamPriorityRange, cuStreamGetFlags
- CUresult cuStreamQuery ( CUstream hStream )
- Determine status of a compute stream.
Parameters
- hStream
- - Stream to query status of
Returns
CUDA_SUCCESS, CUDA_ERROR_DEINITIALIZED, CUDA_ERROR_NOT_INITIALIZED, CUDA_ERROR_INVALID_CONTEXT, CUDA_ERROR_INVALID_HANDLE, CUDA_ERROR_NOT_READY
Description
Returns CUDA_SUCCESS if all operations in the stream specified by hStream have completed, or CUDA_ERROR_NOT_READY if not.
Note: This function may also return error codes from previous, asynchronous launches.
See also:
cuStreamCreate, cuStreamWaitEvent, cuStreamDestroy, cuStreamSynchronize, cuStreamAddCallback
- CUresult cuStreamSynchronize ( CUstream hStream )
- Wait until a stream's tasks are completed.
Parameters
- hStream
- - Stream to wait for
Returns
CUDA_SUCCESS, CUDA_ERROR_DEINITIALIZED, CUDA_ERROR_NOT_INITIALIZED, CUDA_ERROR_INVALID_CONTEXT, CUDA_ERROR_INVALID_HANDLE
Description
Waits until the device has completed all operations in the stream specified by hStream. If the context was created with the CU_CTX_SCHED_BLOCKING_SYNC flag, the CPU thread will block until the stream is finished with all of its tasks.
Note: This function may also return error codes from previous, asynchronous launches.
See also:
cuStreamCreate, cuStreamDestroy, cuStreamWaitEvent, cuStreamQuery, cuStreamAddCallback
- CUresult cuStreamWaitEvent ( CUstream hStream, CUevent hEvent, unsigned int Flags )
- Make a compute stream wait on an event.
Parameters
- hStream
- - Stream to wait
- hEvent
- - Event to wait on (may not be NULL)
- Flags
- - Parameters for the operation (must be 0)
Returns
CUDA_SUCCESS, CUDA_ERROR_DEINITIALIZED, CUDA_ERROR_NOT_INITIALIZED, CUDA_ERROR_INVALID_CONTEXT, CUDA_ERROR_INVALID_HANDLE
Description
Makes all future work submitted to hStream wait until hEvent reports completion before beginning execution. This synchronization will be performed efficiently on the device. The event hEvent may be from a different context than hStream, in which case this function will perform cross-device synchronization.
The stream hStream will wait only for the completion of the most recent host call to cuEventRecord() on hEvent. Once this call has returned, any functions (including cuEventRecord() and cuEventDestroy()) may be called on hEvent again, and subsequent calls will not have any effect on hStream.
If hStream is 0 (the NULL stream), any future work submitted in any stream will wait for hEvent to complete before beginning execution. This effectively creates a barrier for all future work submitted to the context.
If cuEventRecord() has not been called on hEvent, this call acts as if the record has already completed, and so is a functional no-op.
Note: This function may also return error codes from previous, asynchronous launches.
See also:
cuStreamCreate, cuEventRecord, cuStreamQuery, cuStreamSynchronize, cuStreamAddCallback, cuStreamDestroy