1
- # matrix multiply sample
2
- A sample containing multiple implementations of matrix multiplication. This sample code is implemented using C++ and SYCL language for CPU and GPU.
1
+ # Matrix Multiply Sample
2
+ A sample containing multiple implementations of matrix multiplication. The sample code is written in the DPC++ language and runs on CPU and GPU.
3
3
4
4
| Optimized for | Description
5
5
|:--- |:---
6
6
| OS | Linux Ubuntu 18.04; Windows 10
7
7
| Hardware | Kaby Lake with GEN9 or newer
8
- | Software | Intel(R) oneAPI DPC++ Compiler beta; Intel(R) VTune(TM) Profiler
8
+ | Software | Intel(R) oneAPI DPC++ Compiler (beta); VTune(TM) Profiler
9
9
| What you will learn | How to profile an application using Intel(R) VTune(TM) Profiler
10
10
| Time to complete | 15 minutes
11
11
12
-
12
+ ## Purpose
13
+
14
+ The Matrix Multiplication sample performs basic matrix multiplication. Three versions are provided, each using different features of DPC++.
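
For orientation, the computation that every version of the sample performs can be sketched as a host-only reference in plain C++. This is an illustrative sketch, not code from the sample: the name `multiply_reference`, the square `n x n` shape, the `float` element type, and the row-major layout are all assumptions made for the example.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper, not taken from the sample: multiplies two n x n
// row-major matrices on the host and returns the product c = a * b.
std::vector<float> multiply_reference(const std::vector<float>& a,
                                      const std::vector<float>& b,
                                      std::size_t n) {
    std::vector<float> c(n * n, 0.0f);
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t k = 0; k < n; ++k) {
            // Hoist a[i][k]; the inner loop then walks b and c row-wise,
            // which is friendlier to the cache than the naive i-j-k order.
            const float a_ik = a[i * n + k];
            for (std::size_t j = 0; j < n; ++j) {
                c[i * n + j] += a_ik * b[k * n + j];
            }
        }
    }
    return c;
}
```

A reference like this is mainly useful for checking the results of a device kernel on small inputs; it says nothing about how the DPC++ versions distribute the work.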
15
+
16
+ ## Key Implementation Details
17
+
18
+ The basic DPC++ implementation, explained in the code, uses a device selector, buffers, accessors, a kernel, and command groups.
13
19
14
20
## License
15
21
This code sample is licensed under the MIT license.
@@ -26,7 +32,7 @@ Edit the line in multiply.h to select the version of the multiply function:
26
32
#define MULTIPLY multiply1
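
The compile-time selection described here can be sketched as follows. This is a minimal illustration under stated assumptions, not the contents of `multiply.h`: the functions below are hypothetical stand-ins with no matrix arguments, and `STRINGIFY`, `selected_kernel_name`, and `run_selected_kernel` are names invented for the example.

```cpp
#include <string>

// Hypothetical stand-ins for the sample's kernels; the real functions
// take matrix arguments, which are omitted here for brevity.
inline std::string multiply1() { return "multiply1"; }
inline std::string multiply2() { return "multiply2"; }

// Changing this one line switches every MULTIPLY call site to another version.
#define MULTIPLY multiply1

// Two-level stringification so the macro is expanded before being quoted.
#define STRINGIFY_IMPL(x) #x
#define STRINGIFY(x) STRINGIFY_IMPL(x)

// Returns the name of the selected version, e.g. for a log line.
inline std::string selected_kernel_name() { return STRINGIFY(MULTIPLY); }

// Calls whichever function MULTIPLY currently expands to.
inline std::string run_selected_kernel() { return MULTIPLY(); }
```

Because the choice is made by the preprocessor, switching versions requires a rebuild, and there is no dispatch cost at run time.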
27
33
28
34
29
- ### on Linux
35
+ ### On a Linux* System
30
36
To build DPC++ version:
31
37
cd <sample dir>
32
38
cmake .
@@ -35,21 +41,36 @@ Edit the line in multiply.h to select the version of the multiply function:
35
41
Clean the program
36
42
make clean
37
43
38
- ### on Windows - Visual Studio 2017 or newer
44
+ ### On a Windows* System Using Visual Studio 2017 or Newer
39
45
* Open Visual Studio 2017
40
46
* Select Menu "File > Open > Project/Solution", find "matrix_multiply" folder and select "matrix_multiply.sln"
41
47
* Select Menu "Project > Build" to build the selected configuration
42
48
* Select Menu "Debug > Start Without Debugging" to run the program
43
-
49
+
44
50
### on Windows - command line - Build the program using MSBuild
45
51
DPCPP Configurations:
46
52
Release - MSBuild matrix_multiply.sln /t:Rebuild /p:Configuration="Release"
47
53
Debug - MSBuild matrix_multiply.sln /t:Rebuild /p:Configuration="Debug"
48
54
49
55
56
+ ## Running the Sample
57
+
58
+ ### Example of Output
59
+
60
+ ./matrix.dpcpp
61
+ Address of buf1 = 0x7f5e687eb010
62
+ Offset of buf1 = 0x7f5e687eb180
63
+ Address of buf2 = 0x7f5e67fea010
64
+ Offset of buf2 = 0x7f5e67fea1c0
65
+ Address of buf3 = 0x7f5e677e9010
66
+ Offset of buf3 = 0x7f5e677e9100
67
+ Address of buf4 = 0x7f5e66fe8010
68
+ Offset of buf4 = 0x7f5e66fe8140
69
+ Using multiply kernel: multiply1
70
+ Running on Intel(R) Gen9
71
+ Elapsed Time: 0.539631s
72
+
50
73
## Running an Intel VTune Profiler analysis
51
74
------------------------------------------
52
75
53
76
vtune -collect gpu-hotspots -- ./matrix.dpcpp
54
-
55
-