<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<!-- Copyright (C) 2006-2016 Free Software Foundation, Inc.

Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.3 or
any later version published by the Free Software Foundation; with the
Invariant Sections being "Funding Free Software", the Front-Cover
texts being (a) (see below), and with the Back-Cover Texts being (b)
(see below). A copy of the license is included in the section entitled
"GNU Free Documentation License".

(a) The FSF's Front-Cover Text is:

A GNU Manual

(b) The FSF's Back-Cover Text is:

You have freedom to copy and modify this GNU Manual, like GNU
software. Copies published by the Free Software Foundation raise
funds for GNU development. -->
<!-- Created by GNU Texinfo 5.2, http://www.gnu.org/software/texinfo/ -->
<head>
<title>GNU libgomp: OpenACC Library Interoperability</title>

<meta name="description" content="GNU libgomp: OpenACC Library Interoperability">
<meta name="keywords" content="GNU libgomp: OpenACC Library Interoperability">
<meta name="resource-type" content="document">
<meta name="distribution" content="global">
<meta name="Generator" content="makeinfo">
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<link href="index.html#Top" rel="start" title="Top">
<link href="Library-Index.html#Library-Index" rel="index" title="Library Index">
<link href="index.html#SEC_Contents" rel="contents" title="Table of Contents">
<link href="index.html#Top" rel="up" title="Top">
<link href="The-libgomp-ABI.html#The-libgomp-ABI" rel="next" title="The libgomp ABI">
<link href="CUDA-Streams-Usage.html#CUDA-Streams-Usage" rel="prev" title="CUDA Streams Usage">
<style type="text/css">
<!--
a.summary-letter {text-decoration: none}
blockquote.smallquotation {font-size: smaller}
div.display {margin-left: 3.2em}
div.example {margin-left: 3.2em}
div.indentedblock {margin-left: 3.2em}
div.lisp {margin-left: 3.2em}
div.smalldisplay {margin-left: 3.2em}
div.smallexample {margin-left: 3.2em}
div.smallindentedblock {margin-left: 3.2em; font-size: smaller}
div.smalllisp {margin-left: 3.2em}
kbd {font-style:oblique}
pre.display {font-family: inherit}
pre.format {font-family: inherit}
pre.menu-comment {font-family: serif}
pre.menu-preformatted {font-family: serif}
pre.smalldisplay {font-family: inherit; font-size: smaller}
pre.smallexample {font-size: smaller}
pre.smallformat {font-family: inherit; font-size: smaller}
pre.smalllisp {font-size: smaller}
span.nocodebreak {white-space:nowrap}
span.nolinebreak {white-space:nowrap}
span.roman {font-family:serif; font-weight:normal}
span.sansserif {font-family:sans-serif; font-weight:normal}
ul.no-bullet {list-style: none}
-->
</style>


</head>

<body lang="en" bgcolor="#FFFFFF" text="#000000" link="#0000FF" vlink="#800080" alink="#FF0000">
<a name="OpenACC-Library-Interoperability"></a>
<div class="header">
<p>
Next: <a href="The-libgomp-ABI.html#The-libgomp-ABI" accesskey="n" rel="next">The libgomp ABI</a>, Previous: <a href="CUDA-Streams-Usage.html#CUDA-Streams-Usage" accesskey="p" rel="prev">CUDA Streams Usage</a>, Up: <a href="index.html#Top" accesskey="u" rel="up">Top</a> [<a href="index.html#SEC_Contents" title="Table of contents" rel="contents">Contents</a>][<a href="Library-Index.html#Library-Index" title="Index" rel="index">Index</a>]</p>
</div>
<hr>
<a name="OpenACC-Library-Interoperability-1"></a>
<h2 class="chapter">8 OpenACC Library Interoperability</h2>

<a name="Introduction-1"></a>
<h3 class="section">8.1 Introduction</h3>

<p>The OpenACC library uses the CUDA Driver API, and may interact with
programs that use the CUDA Runtime library directly, or another library
based on the Runtime library, e.g., CUBLAS<a name="DOCF2" href="#FOOT2"><sup>2</sup></a>.
This chapter describes two use cases and the changes that are
required in order to use both the OpenACC library and the CUBLAS and Runtime
libraries within a program.
</p>
<a name="First-invocation_003a-NVIDIA-CUBLAS-library-API"></a>
<h3 class="section">8.2 First invocation: NVIDIA CUBLAS library API</h3>

<p>In this first use case (see below), a function in the CUBLAS library is called
prior to any of the functions in the OpenACC library, specifically the
function <code>cublasCreate()</code>.
</p>
<p>When invoked, <code>cublasCreate()</code> initializes the CUBLAS library and
allocates the hardware resources on the host and the device on behalf of the
caller. Once the initialization and allocation have completed, a handle is
returned to the caller. The OpenACC library also requires initialization and
allocation of hardware resources. Since the CUBLAS library has already
allocated the hardware resources for the device, all that is left to do is to
initialize the OpenACC library and acquire the hardware resources on the host.
</p>
<p>Prior to calling the OpenACC function that initializes the library and
allocates the host hardware resources, you need to acquire the device number
that was allocated during the call to <code>cublasCreate()</code>. Calling the
Runtime library function <code>cudaGetDevice()</code> accomplishes this. Once
acquired, the device number is passed along with the device type as
parameters to the OpenACC library function <code>acc_set_device_num()</code>.
</p>
<p>Once the call to <code>acc_set_device_num()</code> has completed, the OpenACC
library uses the context that was created during the call to
<code>cublasCreate()</code>. In other words, both libraries share the
same context.
</p>
<div class="smallexample">
<pre class="smallexample">    /* Create the handle */
    s = cublasCreate(&h);
    if (s != CUBLAS_STATUS_SUCCESS)
    {
        fprintf(stderr, "cublasCreate failed %d\n", s);
        exit(EXIT_FAILURE);
    }

    /* Get the device number */
    e = cudaGetDevice(&dev);
    if (e != cudaSuccess)
    {
        fprintf(stderr, "cudaGetDevice failed %d\n", e);
        exit(EXIT_FAILURE);
    }

    /* Initialize OpenACC library and use device 'dev' */
    acc_set_device_num(dev, acc_device_nvidia);

</pre></div>
<div align="center">Use Case 1
</div>
<a name="First-invocation_003a-OpenACC-library-API"></a>
<h3 class="section">8.3 First invocation: OpenACC library API</h3>

<p>In this second use case (see below), a function in the OpenACC library is
called prior to any of the functions in the CUBLAS library, specifically
the function <code>acc_set_device_num()</code>.
</p>
<p>In the use case presented here, the function <code>acc_set_device_num()</code>
is used to both initialize the OpenACC library and allocate the hardware
resources on the host and the device. The parameters of the call
specify which device to use and what device
type to use, i.e., <code>acc_device_nvidia</code>. This
is only one method of initializing the OpenACC library and allocating the
appropriate hardware resources. Other methods are available through the
use of environment variables; these are discussed in the next section.
</p>
<p>Once the call to <code>acc_set_device_num()</code> has completed, other OpenACC
functions can be called, as seen with the multiple calls made to
<code>acc_copyin()</code>. In addition, calls can be made to functions in the
CUBLAS library. In the use case, a call to <code>cublasCreate()</code> is made
subsequent to the calls to <code>acc_copyin()</code>.
As seen in the previous use case, a call to <code>cublasCreate()</code>
initializes the CUBLAS library and allocates the hardware resources on the
host and the device. However, since the device has already been allocated,
<code>cublasCreate()</code> only initializes the CUBLAS library and allocates
the appropriate hardware resources on the host. The context that was created
as part of the OpenACC initialization is shared with the CUBLAS library,
as in the first use case.
</p>
<div class="smallexample">
<pre class="smallexample">    dev = 0;

    acc_set_device_num(dev, acc_device_nvidia);

    /* Copy the first set to the device */
    d_X = acc_copyin(&h_X[0], N * sizeof (float));
    if (d_X == NULL)
    {
        fprintf(stderr, "copyin error h_X\n");
        exit(EXIT_FAILURE);
    }

    /* Copy the second set to the device */
    d_Y = acc_copyin(&h_Y1[0], N * sizeof (float));
    if (d_Y == NULL)
    {
        fprintf(stderr, "copyin error h_Y1\n");
        exit(EXIT_FAILURE);
    }

    /* Create the handle */
    s = cublasCreate(&h);
    if (s != CUBLAS_STATUS_SUCCESS)
    {
        fprintf(stderr, "cublasCreate failed %d\n", s);
        exit(EXIT_FAILURE);
    }

    /* Perform saxpy using CUBLAS library function */
    s = cublasSaxpy(h, N, &alpha, d_X, 1, d_Y, 1);
    if (s != CUBLAS_STATUS_SUCCESS)
    {
        fprintf(stderr, "cublasSaxpy failed %d\n", s);
        exit(EXIT_FAILURE);
    }

    /* Copy the results from the device */
    acc_memcpy_from_device(&h_Y1[0], d_Y, N * sizeof (float));

</pre></div>
<div align="center">Use Case 2
</div>
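<p>For readers unfamiliar with the BLAS routine used in Use Case 2, the
following host-only sketch shows the arithmetic that <code>cublasSaxpy()</code>
is asked to perform on the device: each element of <code>y</code> is replaced
by <code>alpha * x + y</code>. The function name <code>saxpy_ref</code> is
hypothetical and is not part of CUBLAS or libgomp, and the sketch assumes
positive strides only.
</p>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical host-only reference, not part of CUBLAS or libgomp.
   Computes y[i * incy] = alpha * x[i * incx] + y[i * incy] for i in [0, n).
   Assumption: incx and incy are positive.  */
static void saxpy_ref(int n, float alpha, const float *x, int incx,
                      float *y, int incy)
{
    assert(incx > 0 && incy > 0);

    for (int i = 0; i < n; i++)
        y[(size_t) i * incy] += alpha * x[(size_t) i * incx];
}
```

<p>Running such a reference on host copies of <code>h_X</code> and
<code>h_Y1</code> before the device call would give values against which the
result copied back by <code>acc_memcpy_from_device()</code> could be compared.
</p>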
<a name="OpenACC-library-and-environment-variables"></a>
<h3 class="section">8.4 OpenACC library and environment variables</h3>

<p>There are two environment variables associated with the OpenACC library
that may be used to control the device type and device number:
<code>ACC_DEVICE_TYPE</code> and <code>ACC_DEVICE_NUM</code>, respectively. These two
environment variables can be used as an alternative to calling
<code>acc_set_device_num()</code>. As seen in the second use case, the device
type and device number were specified using <code>acc_set_device_num()</code>.
If, however, the aforementioned environment variables are set, then the
call to <code>acc_set_device_num()</code> is not required.
</p>

<p>The use of the environment variables is only relevant when an OpenACC function
is called prior to a call to <code>cublasCreate()</code>. If <code>cublasCreate()</code>
is called prior to a call to an OpenACC function, then you must call
<code>acc_set_device_num()</code><a name="DOCF3" href="#FOOT3"><sup>3</sup></a>.
</p>
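<p>As an illustration of the rule above, a program that calls an OpenACC
function first might check whether both environment variables are present
before deciding whether to call <code>acc_set_device_num()</code> itself.
The following is a minimal sketch using only the standard C library; the helper
names <code>parse_device_num</code> and <code>acc_env_selects_device</code> are
hypothetical and are not part of libgomp, and the value of
<code>ACC_DEVICE_TYPE</code> is not validated.
</p>

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper, not part of libgomp: parse a device number string.
   Returns 1 and stores the value in *num on success, 0 otherwise.  */
static int parse_device_num(const char *str, int *num)
{
    char *end;
    long val;

    assert(num != NULL);
    if (str == NULL || *str == '\0')
        return 0;

    errno = 0;
    val = strtol(str, &end, 10);
    if (errno != 0 || *end != '\0' || val < 0 || val > INT_MAX)
        return 0;

    *num = (int) val;
    return 1;
}

/* Hypothetical helper, not part of libgomp: returns 1 if ACC_DEVICE_TYPE
   is set to a non-empty value and ACC_DEVICE_NUM holds a non-negative
   integer, which is then stored in *num.  The device type string itself
   is not checked against any list of supported types.  */
static int acc_env_selects_device(int *num)
{
    const char *type = getenv("ACC_DEVICE_TYPE");

    if (type == NULL || *type == '\0')
        return 0;
    return parse_device_num(getenv("ACC_DEVICE_NUM"), num);
}
```

<p>A caller could fall back to an explicit <code>acc_set_device_num()</code>
call when <code>acc_env_selects_device()</code> returns 0. This only applies
to the OpenACC-first ordering; when <code>cublasCreate()</code> runs first, the
explicit call is still needed, as stated above.
</p>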

<div class="footnote">
<hr>
<h4 class="footnotes-heading">Footnotes</h4>

<h3><a name="FOOT2" href="#DOCF2">(2)</a></h3>
<p>See section 2.26,
"Interactions with the CUDA Driver API" in
"CUDA Runtime API", Version 5.5, and section 2.27, "VDPAU
Interoperability", in "CUDA Driver API", TRM-06703-001, Version 5.5,
for additional information on library interoperability.</p>
<h3><a name="FOOT3" href="#DOCF3">(3)</a></h3>
<p>More complete information
about <code>ACC_DEVICE_TYPE</code> and <code>ACC_DEVICE_NUM</code> can be found in
sections 4.1 and 4.2 of the <a href="http://www.openacc.org/">OpenACC</a>
Application Programming Interface, Version 2.0.</p>
</div>
</body>
</html>