From 9d08b477a54c999044cd81b0ddf1dece6839ca58 Mon Sep 17 00:00:00 2001 From: Ron Green Date: Wed, 8 Jul 2020 20:03:15 -0700 Subject: [PATCH 01/11] Initial commit for macOS Fortran Samples 08 July 2020 Signed-off-by: Ron Green --- .../Fortran/openmp_samples/.DS_Store | Bin 0 -> 6148 bytes .../Fortran/openmp_samples/License.txt | 7 + .../Fortran/openmp_samples/Makefile | 37 ++++ .../Fortran/openmp_samples/README.md | 78 +++++++++ .../Fortran/openmp_samples/sample.json | 10 ++ .../openmp_samples/src/openmp_sample.f90 | 117 +++++++++++++ .../Fortran/optimize_samples/.DS_Store | Bin 0 -> 6148 bytes .../Fortran/optimize_samples/License.txt | 7 + .../Fortran/optimize_samples/Makefile | 38 ++++ .../Fortran/optimize_samples/README.md | 163 +++++++++++++++++ .../Fortran/optimize_samples/sample.json | 10 ++ .../Fortran/optimize_samples/src/int_sin.f90 | 96 ++++++++++ .../Fortran/vec_samples/.DS_Store | Bin 0 -> 6148 bytes .../Fortran/vec_samples/License.txt | 7 + .../Fortran/vec_samples/Makefile | 30 ++++ .../Fortran/vec_samples/README.md | 30 ++++ .../vec_samples/resources/intel_logo.png | Bin 0 -> 2247 bytes .../Fortran/vec_samples/resources/samples.css | 164 ++++++++++++++++++ .../Fortran/vec_samples/sample.json | 10 ++ .../Fortran/vec_samples/src/driver.f90 | 69 ++++++++ .../Fortran/vec_samples/src/matvec.f90 | 30 ++++ 21 files changed, 903 insertions(+) create mode 100644 DirectProgramming/Fortran/openmp_samples/.DS_Store create mode 100644 DirectProgramming/Fortran/openmp_samples/License.txt create mode 100644 DirectProgramming/Fortran/openmp_samples/Makefile create mode 100644 DirectProgramming/Fortran/openmp_samples/README.md create mode 100644 DirectProgramming/Fortran/openmp_samples/sample.json create mode 100644 DirectProgramming/Fortran/openmp_samples/src/openmp_sample.f90 create mode 100644 DirectProgramming/Fortran/optimize_samples/.DS_Store create mode 100644 DirectProgramming/Fortran/optimize_samples/License.txt create mode 100644 
DirectProgramming/Fortran/optimize_samples/Makefile create mode 100644 DirectProgramming/Fortran/optimize_samples/README.md create mode 100644 DirectProgramming/Fortran/optimize_samples/sample.json create mode 100644 DirectProgramming/Fortran/optimize_samples/src/int_sin.f90 create mode 100644 DirectProgramming/Fortran/vec_samples/.DS_Store create mode 100644 DirectProgramming/Fortran/vec_samples/License.txt create mode 100644 DirectProgramming/Fortran/vec_samples/Makefile create mode 100644 DirectProgramming/Fortran/vec_samples/README.md create mode 100644 DirectProgramming/Fortran/vec_samples/resources/intel_logo.png create mode 100644 DirectProgramming/Fortran/vec_samples/resources/samples.css create mode 100644 DirectProgramming/Fortran/vec_samples/sample.json create mode 100644 DirectProgramming/Fortran/vec_samples/src/driver.f90 create mode 100644 DirectProgramming/Fortran/vec_samples/src/matvec.f90 diff --git a/DirectProgramming/Fortran/openmp_samples/.DS_Store b/DirectProgramming/Fortran/openmp_samples/.DS_Store new file mode 100644 index 0000000000000000000000000000000000000000..9a874b5768f336915163bb88cd434575b859f936 GIT binary patch literal 6148 zcmeH~Jr2S!425ml0g0s}V-^m;4I%_5-~tF3k&vj^b9A16778<}(6eNJu~Vz<8=6`~ zboab&MFtUB!i}=AFfm2m$tVxGT*u4pe81nUlA49C} z?O@64YO)2RT{MRe%{!}2F))pG(Sih~)xkgosK7*lF7m<7{{#Hn{6A@7N(HFEpDCdI z{lA49C} z?O@64YO)2RT{MRe%{!}2F))pG(Sih~)xkgosK7*lF7m<7{{#Hn{6A@7N(HFEpDCdI z{>>>> SET OPTIMIZATION LEVEL BELOW <<<<< +# +#Uncomment one of the following with which you wish to compile + +FC = ifort -O0 +#FC = ifort -O1 +#FC = ifort -O2 +#FC = ifort -O3 + +OBJS = int_sin.o + +all: int_sin + +run: int_sin + ./int_sin + +int_sin: $(OBJS) + ifort $^ -o $@ + +%.o: src/%.f90 + $(FC) $^ -c + +clean: + /bin/rm -f core.* $(OBJS) int_sin + diff --git a/DirectProgramming/Fortran/optimize_samples/README.md b/DirectProgramming/Fortran/optimize_samples/README.md new file mode 100644 index 0000000000..14627b4092 --- /dev/null +++ 
b/DirectProgramming/Fortran/optimize_samples/README.md @@ -0,0 +1,163 @@ +# Fortran Optimization Sample + +This sample is designed to illustrate specific +compiler optimizations, features, tools, and programming concepts. + +This program computes the integral (area under the curve) of a user-supplied function +over an interval in a stepwise fashion. +The interval is split into segments, and at each segment position the area of a rectangle +is computed whose height is the value of sine at that point and the width is the segment width. +The areas of the rectangles are then summed. + +The process is repeated with smaller and smaller width rectangles, +more closely approximating the true value. + +The source for this program also demonstrates recommended Fortran coding practices. + +## Compile the sample several times using different optimization options: + + * O1 - Enables optimizations for speed and disables some optimizations that increase code size and affect speed. + * O2 - Enables optimizations for speed. This is the generally recommended optimization level. Vectorization is enabled at O2 and higher levels. + * O3 - Performs O2 optimizations and enables more aggressive loop transformations such as Fusion, Block-Unroll-and-Jam, and collapsing IF statements. + +Read the [Intel® Fortran Compiler Developer Guide and Reference][1]:"Intel® Fortran Compiler Developer Guide and Reference" + for more information about these options. + +Some of these automatic optimizations use features and options that can +restrict program execution to specific architectures. 
+ +| Optimized for | Description +|:--- |:--- +| OS | macOS* with Xcode installed (see Release Notes for details) +| Software | Intel® oneAPI Intel Fortran Compiler (beta) +| What you will learn | Vectorization using Intel Fortran compiler +| Time to complete | 15 minutes + + +## License +This code sample is licensed under MIT license + +## How to Build +Use the one of the following compiler options: + + +## macOS* : -O0 -O1, -O2, -O3 + +### STEP 1: Build and run at O0 + cd optimize_samples + edit Makefile + set optimization levels + uncomment FC = ifort -O0 like this + FC = ifort -O0 + #FC = ifort -O1 + #FC = ifort -O2 + #FC = ifort -O3 + make + + * Run the program + make run + + * Note the final run time (example) + CPU Time = 3.776983 seconds + + * Clean the program + make clean + +### STEP 2: Build and run at O1 + cd optimize_samples + edit Makefile + set optimization levels + uncomment FC = ifort -O1 like this + #FC = ifort -O0 + FC = ifort -O1 + #FC = ifort -O2 + #FC = ifort -O3 + make + + * Run the program + make run + + * Note the final run time (example) + CPU Time = 1.444569 seconds + + * Clean the program + make clean + +### STEP 3: Build and run at O2 + cd optimize_samples + edit Makefile + set optimization levels + uncomment FC = ifort -O2 like this + #FC = ifort -O0 + #FC = ifort -O1 + FC = ifort -O2 + #FC = ifort -O3 + make + + * Run the program + make run + + * Note the final run time (example) + CPU Time = 0.5143980 seconds + + * Clean the program + make clean + +### STEP 4: Build and run at O3 + cd optimize_samples + edit Makefile + set optimization levels + uncomment FC = ifort -O3 like this + #FC = ifort -O0 + #FC = ifort -O1 + #FC = ifort -O2 + FC = ifort -O3 + make + + * Run the program + make run + + * Note the final run time (example) + CPU Time = 0.5133380 seconds + + * Clean the program + make clean + +## What did we learn? +There are big jumps going from O0 to O1, and from O1 to O2. +but very little going from O2 to O3. 
+This does vary by application but generally with Intel Compilers +O2 is has most aggressive optimizations. Sometimes O3 can help, of course, +but generally O2 is sufficient for most applications. + +### Extra Exploration +The Intel® Fortran Compiler has many options for optimization. +If you have a genuine Intel® Architecture process, try these additional options + edit Makefile + set optimization levels + uncomment FC = ifort -O3 and add additional options shown: + #FC = ifort -O0 + #FC = ifort -O1 + #FC = ifort -O2 + FC = ifort -O3 -xhost -align array64byte + make + + * Run the program + make run + + * Note the final run time (example) + CPU Time = 0.2578490 seconds + + * Clean the program + make clean +There are 2 additional compiler options here that are worth mentioning: + +Read the online [Developer Guide and Reference][1]:"Developer Guide and Reference" for more information about +the options + 1. -xhost (sub option of -x option): [-x][1]:"-x option" + 2. -align array64byte [-align ][1]:"-align option" + +### Clean up + * Clean the program + make clean + diff --git a/DirectProgramming/Fortran/optimize_samples/sample.json b/DirectProgramming/Fortran/optimize_samples/sample.json new file mode 100644 index 0000000000..a87fa63f7a --- /dev/null +++ b/DirectProgramming/Fortran/optimize_samples/sample.json @@ -0,0 +1,10 @@ +{ + "name": "optimization_samples", + "categories": [ "Toolkit/Intel® oneAPI HPC Toolkit" ], + "description": "Fortran Sample - Simple Compiler Optimizations", + "toolchain": [ "ifort" ], + "languages": [ { "fortran": {} } ], + "targetDevice": [ "CPU" ], + "os": [ "darwin" ], + "builder": [ "make" ] +} diff --git a/DirectProgramming/Fortran/optimize_samples/src/int_sin.f90 b/DirectProgramming/Fortran/optimize_samples/src/int_sin.f90 new file mode 100644 index 0000000000..d1519820f3 --- /dev/null +++ b/DirectProgramming/Fortran/optimize_samples/src/int_sin.f90 @@ -0,0 +1,96 @@ + ! 
============================================================== + ! Copyright © 2020 Intel Corporation + ! + ! SPDX-License-Identifier: MIT + ! ============================================================= + ! + ! [DESCRIPTION] + ! This program computes the integral (area under the curve) of a user-supplied + ! function over an interval in a stepwise fashion. The interval is split into + ! segments, and at each segment position the area of a rectangle is computed + ! whose height is the value of sine at that point and the width is the segment + ! width. The areas of the rectangles are then summed. + ! + ! The process is repeated with smaller and smaller width rectangles, more + ! closely approximating the true value. + ! + ! The source for this program also demonstrates recommended Fortran + ! coding practices. + ! + ! Compile the sample several times using different optimization options. + ! + ! Read the Intel(R) Fortran Compiler Documentation for more information about these options. + ! + ! Some of these automatic optimizations use features and options + ! that can restrict program execution to specific architectures. + ! + ! [COMPILE] + ! Use the one of the following compiler options: + ! + ! Windows*: /O1, /O2, /O3 + ! + ! Linux* and macOS*: -O1, -O2, -O3 + ! + +program int_sin +implicit none + +! Create a value DP that is the "kind" number of a double precision value +! We will use this value in our declarations and constants. +integer, parameter :: DP = kind(0.0D0) + +! Declare a named constant for pi, specifying the kind type +real(DP), parameter :: pi = 3.141592653589793238_DP + +! 
Declare interval begin and end +real(DP), parameter :: interval_begin = 0.0_DP +real(DP), parameter :: interval_end = 2.0_DP * pi + +real(DP) :: step, sum, x_i +integer :: N, i, j +real clock_start, clock_finish + +write (*,'(A)') " " +write (*,'(A)') " Number of | Computed Integral |" +write (*,'(A)') " Interior Points | |" +call cpu_time (clock_start) + +do j=2,26 + write (*,'(A)') "--------------------------------------" + N = 2**j + ! Compute stepsize for N-1 internal rectangles + step = (interval_end - interval_begin) / real(N,DP); + + ! Approximate 1/2 area in first rectangle: f(x0) * (step/2) + sum = INTEG_FUNC(interval_begin) * (step / 2.0_DP) + + do i=1,N-1 + x_i = real(i,DP) * step + ! Apply midpoint rule: + ! Given length = f(x), compute the area of the + ! rectangle of width step + sum = sum + (INTEG_FUNC(x_i) * step) + end do + + ! Add approximate area in last rectangle for f(xN) * (step/2) + sum = sum + (INTEG_FUNC(interval_end) * (step / 2.0_DP)) + + write (*,'(T5,I10,T18,"|",2X,1P,E14.7,T38,"|")') N, sum + end do + +call cpu_time(clock_finish) +write (*,'(A)') "--------------------------------------" +write (*,'(A)') " " +write (*,*) "CPU Time = ",(clock_finish - clock_start), " seconds" + +contains + +! 
Function to integrate +real(DP) function INTEG_FUNC (x) + real(DP), intent(IN) :: x + + INTEG_FUNC = abs(sin(x)) + return +end function INTEG_FUNC + +end program int_sin diff --git a/DirectProgramming/Fortran/vec_samples/.DS_Store b/DirectProgramming/Fortran/vec_samples/.DS_Store new file mode 100644 index 0000000000000000000000000000000000000000..9a874b5768f336915163bb88cd434575b859f936 GIT binary patch literal 6148 zcmeH~Jr2S!425ml0g0s}V-^m;4I%_5-~tF3k&vj^b9A16778<}(6eNJu~Vz<8=6`~ zboab&MFtUB!i}=AFfm2m$tVxGT*u4pe81nUlA49C} z?O@64YO)2RT{MRe%{!}2F))pG(Sih~)xkgosK7*lF7m<7{{#Hn{6A@7N(HFEpDCdI z{Px#1ZP1_K>z@;j|==^1poj532;bRa{vG?A^-p`A_1!6-I4$R02y>eSaefwW^{L9 za%BKPWN%_+AW3auXJt}lVPtu6$z?nM00=ipL_t(&L*<$YtX)$ShFfYbYSvb1RJG=L zDv_8{R76mMQp7warqmFk#u};!r4lrVn5P(O9%?Qj=9uRxMa&QH)%UOS@8jIwd*8eF zzSs9|<4^wVz5liLI{U19hQ0T>y_yMaZEcIe5%6$$1$+#C0Dp!Fzrjyu&6@Qj<(2SY z>gzhh+zC!XC%!2>5?%wJfuF+PIJugE|KIQj_%^%)o(MO=)W?TvPB6fA8R+rw0XQ?x zW;USz1HXgs!IzPA1x$kv!AIeA_&WRyR!tl>^}i6VeBqf2XBG5LH%-r?AC4InA^i=A`=Gf zx8Ry~>3R@+AiUDY(*bpF?*o^zOOJwa=t}w1!_)Zmz}?8^t^olTg=a$HS9k#&W^+@4 zacu^_bv~4k?nGK81J4= zM4QXKZK@q~(yMGxB5|5(Qp1?GfoGdXSk{|uIPE=Pe@pUxScwp*zfRHaYrG6}(roch z^A9-GrltVnxi~w$IKhvkQJs(a2~cyHU1`r_-)-y{R|nWYdWXPqVV7G(H-k8xdQcK# z95HS%$5?I{#260)#GWCIavqvvDbJ@IskU_`NmKuHs~j)AUEp*wJk|9P>L^ebRt={H zLI0N=PVuH#ZWzQUONKywLAWBE&r$$4tz!b%uD4tn_#B9n2D-1?dS+-+6`VbIy&%RT zFySK04TBgJlTi)0X>yn_&2l*~>XH)jj8X3c`xQ8^_1^d|2M1K)k`Z7wpcj*&RjVd$ z15(@~&{sjNz0|crniOm4?f0ee#_GJ0R24?+i>ZW3(|fc zYM-GJf6&3;77@rmFD66FaK)Voi%)&MA)bPN(?3965AfSz$@HFfy~&uIc?TY}TnY?N z_@LWa3eetMdIIg$rBhWhzRuL@z(*&sS_A{s^JLg1Kr1q#2<)H0V&dve#?zU1 z;2O*S0db5E^q8d_Xxf1)B2El>eEq+EX$) zFx|5BVlvLN9H2c#gUC#64^fR*qCk5j)CN$(D%R^s#xaHX@{e0S2iiV%c%LFacmjG& z$)LS|bTE+8LQZ(jvh-pyF0rh5nge{C$$?F6%7LaUCH<-gS*s}-*fiea9OHSL{sW?x z-NHmmB|vX98Ruq}K(#9)PX={8z~k03o`AaD67PZEz=R3zS5q=DPjnI@8DO+c84#Ao zmNsN|wp;@AIhl+}mZcB#I0CA-2VAaozYdduKH7aEE`|13OV=fcqioc@SaHpP-Yr)K z{Rh+jE0Y6yuxfv^8h2Gc%65~1yc7HjTY&mM6y={7FlaF&XveyrSUzbQn=7^ 
ziuYwE2S(V`F7T_G3?zMyXhqLK`*52}fng#PzJLo^E&=^xp_aJ?I-3lAAWE2OIY9f4 z!m*yE8jHmQG|i9?aWjCXrF#HdB!g13nPs7$5&QDz*mc~<$q;&`! z^H&bYpX}9uPrmm_H)UU|%HihX;bdT2CfTT0yz0Yhc5vXegTX9#Z8Nqm0hm#odPaglfHZ=wU z9OgjjcG3KyrWs&QfUzyaF!f-)8y5FJgg5NgI0$lA3g}W>a%aFr?9!uPNP3lT4Ykpe z_Fn?D1FnDQE{%gYngLE%z#EFjocLleZ#`g+r;=_i3&J5U(F`*QeZ{Qk%Bg$jSdskbsuMQh=uRHe)5*r66c* zr~5c%z@%G9`h= Date: Wed, 8 Jul 2020 20:12:56 -0700 Subject: [PATCH 02/11] Update to README.md added link to online User Guide for OpenMP --- DirectProgramming/Fortran/openmp_samples/README.md | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-) diff --git a/DirectProgramming/Fortran/openmp_samples/README.md b/DirectProgramming/Fortran/openmp_samples/README.md index d09da38fa2..c85d611033 100644 --- a/DirectProgramming/Fortran/openmp_samples/README.md +++ b/DirectProgramming/Fortran/openmp_samples/README.md @@ -48,8 +48,8 @@ This code sample is licensed under MIT license ### Experiment 1 Default Optimized build and run * Build openmp_samples - cd openmp_samples && - make clean && + cd openmp_samples + make clean make * Run the program @@ -59,8 +59,8 @@ This code sample is licensed under MIT license ### Experiment 2 Unoptimized build and run * Build openmp_samples - cd openmp_samples && - make clean && + cd openmp_samples + make clean make debug * Run the program @@ -76,3 +76,7 @@ This code sample is licensed under MIT license * Clean the program make clean +## Further Reading +Interested in learning more? 
We have a wealth of information +on using OpenMP with the Intel Fortran Compiler in our +[OpenMP section of Developer Guide and Reference][1]:"Developer Guide and Reference" From 572f285fbfb4f25b3f9d76e378ac0b6e89a4bf26 Mon Sep 17 00:00:00 2001 From: Ron Green Date: Wed, 8 Jul 2020 20:17:28 -0700 Subject: [PATCH 03/11] Test README.md link syntax --- DirectProgramming/Fortran/openmp_samples/README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/DirectProgramming/Fortran/openmp_samples/README.md b/DirectProgramming/Fortran/openmp_samples/README.md index c85d611033..09a83221cd 100644 --- a/DirectProgramming/Fortran/openmp_samples/README.md +++ b/DirectProgramming/Fortran/openmp_samples/README.md @@ -79,4 +79,4 @@ This code sample is licensed under MIT license ## Further Reading Interested in learning more? We have a wealth of information on using OpenMP with the Intel Fortran Compiler in our -[OpenMP section of Developer Guide and Reference][1]:"Developer Guide and Reference" +[OpenMP section of Developer Guide and Reference][1]: https://software.intel.com/content/www/us/en/develop/documentation/fortran-compiler-developer-guide-and-reference/top/optimization-and-programming-guide/openmp-support.html "Developer Guide and Reference" From 92d47422a436b40399bf2f53813c9d0b9f38acae Mon Sep 17 00:00:00 2001 From: Ron Green Date: Wed, 8 Jul 2020 22:07:40 -0700 Subject: [PATCH 04/11] Edits misc --- .../Fortran/openmp_samples/README.md | 50 +++-- .../openmp_samples/src/openmp_sample.f90 | 2 +- .../Fortran/optimize_samples/README.md | 113 ++++++---- .../Fortran/vec_samples/README.md | 196 +++++++++++++++++- .../vec_samples/resources/intel_logo.png | Bin 2247 -> 0 bytes .../Fortran/vec_samples/resources/samples.css | 164 --------------- .../Fortran/vec_samples/src/driver.f90 | 2 +- 7 files changed, 294 insertions(+), 233 deletions(-) delete mode 100644 DirectProgramming/Fortran/vec_samples/resources/intel_logo.png delete mode 100644 
DirectProgramming/Fortran/vec_samples/resources/samples.css diff --git a/DirectProgramming/Fortran/openmp_samples/README.md b/DirectProgramming/Fortran/openmp_samples/README.md index 09a83221cd..18a598e303 100644 --- a/DirectProgramming/Fortran/openmp_samples/README.md +++ b/DirectProgramming/Fortran/openmp_samples/README.md @@ -2,12 +2,12 @@ This sample is designed to illustrate how to use the OpenMP* API with the Intel® Fortran Compiler. -This program finds all primes in the first 10,000,000 integers, +This program finds all primes in the first 40,000,000 integers, the number of 4n+1 primes, and the number of 4n-1 primes in the same range. It illustrates two OpenMP* directives to help speed up the code. -This program finds all primes in the first 10,000,000 integers, the number of 4n+1 primes, +This program finds all primes in the first 40,000,000 integers, the number of 4n+1 primes, and the number of 4n-1 primes in the same range. It illustrates two OpenMP* directives to help speed up the code. @@ -34,8 +34,8 @@ Read the Intel® Fortran Compiler Documentation for more information about these | Optimized for | Description |:--- |:--- -| OS | macOS* with Xcode installed (see Release Notes for details) -| Software | Intel® oneAPI Intel Fortran Compiler (beta) +| OS | macOS* with Xcode* installed +| Software | Intel® oneAPI Intel Fortran Compiler (Beta) | What you will learn | How to build and run a Fortran OpenMP application using Intel Fortran compiler | Time to complete | 10 minutes @@ -45,32 +45,42 @@ This code sample is licensed under MIT license ## How to Build -### Experiment 1 Default Optimized build and run - * Build openmp_samples +### Experiment 1: Unoptimized build and run +* Build openmp_samples - cd openmp_samples - make clean - make + cd openmp_samples + make clean + make debug * Run the program - make run + make debug_run -### Experiment 2 Unoptimized build and run - * Build openmp_samples + * What did you see? 
+ Did the debug, unoptimized code run slower? + +### Experiment 2: Default Optimized build and run - cd openmp_samples - make clean - make debug + * Build openmp_samples + make * Run the program - make debug_run + make run - * What did you see? +### Experiment 3: Controlling number of threads +By default an OpenMP application creates and uses as many threads as there are "processors" in a system. Here "processors" means logical processors; on hyperthreaded cores there are twice as many logical processors as physical cores. - Did the debug, unoptimized code run slower? +OpenMP uses the environment variable 'OMP_NUM_THREADS' to set the number of threads to use. Try this! + + export OMP_NUM_THREADS=1 + make run +Note the number of threads reported by the application. Now try 2 threads: + export OMP_NUM_THREADS=2 + make run +Did this make the application run faster? Experiment with the number of threads and see how it affects performance. ### Clean up * Clean the program @@ -79,4 +89,6 @@ This code sample is licensed under MIT license ## Further Reading Interested in learning more? We have a wealth of information
We have a wealth of information on using OpenMP with the Intel Fortran Compiler in our -[OpenMP section of Developer Guide and Reference][1]: https://software.intel.com/content/www/us/en/develop/documentation/fortran-compiler-developer-guide-and-reference/top/optimization-and-programming-guide/openmp-support.html "Developer Guide and Reference" +[OpenMP section of Developer Guide and Reference][1] + +[1]: https://software.intel.com/content/www/us/en/develop/documentation/fortran-compiler-developer-guide-and-reference/top/optimization-and-programming-guide/openmp-support.html "Developer Guide and Reference" diff --git a/DirectProgramming/Fortran/openmp_samples/src/openmp_sample.f90 b/DirectProgramming/Fortran/openmp_samples/src/openmp_sample.f90 index 65fc22d7d4..fb0a9ebacf 100644 --- a/DirectProgramming/Fortran/openmp_samples/src/openmp_sample.f90 +++ b/DirectProgramming/Fortran/openmp_samples/src/openmp_sample.f90 @@ -5,7 +5,7 @@ ! ============================================================= ! ! [DESCRIPTION] -! This code finds all primes in the first 10,000,000 integers, the number of +! This code finds all primes in the first 40,000,000 integers, the number of ! 4n+1 primes, and the number of 4n-1 primes in the same range. ! ! This source illustrates two OpenMP directives to help speed up diff --git a/DirectProgramming/Fortran/optimize_samples/README.md b/DirectProgramming/Fortran/optimize_samples/README.md index 14627b4092..8a2ded5944 100644 --- a/DirectProgramming/Fortran/optimize_samples/README.md +++ b/DirectProgramming/Fortran/optimize_samples/README.md @@ -16,11 +16,13 @@ The source for this program also demonstrates recommended Fortran coding practic ## Compile the sample several times using different optimization options: + * O0 - No optimizations * O1 - Enables optimizations for speed and disables some optimizations that increase code size and affect speed. * O2 - Enables optimizations for speed. This is the generally recommended optimization level. 
Vectorization is enabled at O2 and higher levels. * O3 - Performs O2 optimizations and enables more aggressive loop transformations such as Fusion, Block-Unroll-and-Jam, and collapsing IF statements. -Read the [Intel® Fortran Compiler Developer Guide and Reference][1]:"Intel® Fortran Compiler Developer Guide and Reference" +Read the [Intel® Fortran Compiler Developer Guide and Reference][1] +[1]: https://software.intel.com/content/www/us/en/develop/documentation/fortran-compiler-developer-guide-and-reference/top.html "Intel® Fortran Compiler Developer Guide and Reference" for more information about these options. Some of these automatic optimizations use features and options that can @@ -28,8 +30,8 @@ restrict program execution to specific architectures. | Optimized for | Description |:--- |:--- -| OS | macOS* with Xcode installed (see Release Notes for details) -| Software | Intel® oneAPI Intel Fortran Compiler (beta) +| OS | macOS* with Xcode* installed +| Software | Intel® oneAPI Intel Fortran Compiler (Beta) | What you will learn | Vectorization using Intel Fortran compiler | Time to complete | 15 minutes @@ -43,119 +45,146 @@ Use the one of the following compiler options: ## macOS* : -O0 -O1, -O2, -O3 -### STEP 1: Build and run at O0 - cd optimize_samples - edit Makefile - set optimization levels - uncomment FC = ifort -O0 like this +### STEP 1: Build and run with -O0 +cd optimize_samples + +Edit 'Makefile' using your favorite editor + +To set optimization level uncomment FC = ifort -O0 like this + FC = ifort -O0 #FC = ifort -O1 #FC = ifort -O2 #FC = ifort -O3 - make + * Build the executable with 'make' + + make * Run the program - make run + + make run * Note the final run time (example) CPU Time = 3.776983 seconds - * Clean the program + * Clean the files we built + make clean -### STEP 2: Build and run at O1 - cd optimize_samples - edit Makefile - set optimization levels - uncomment FC = ifort -O1 like this + +### STEP 2: Build and run with -O1 +Edit 
'Makefile' using your favorite editor + +To set optimization level uncomment FC = ifort -O1 like this + #FC = ifort -O0 FC = ifort -O1 #FC = ifort -O2 #FC = ifort -O3 + * Build the executable with 'make' + make * Run the program + make run * Note the final run time (example) CPU Time = 1.444569 seconds - * Clean the program + * Clean the files we built + make clean + + +### STEP 3: Build and run with -O2 +Edit 'Makefile' using your favorite editor -### STEP 3: Build and run at O2 - cd optimize_samples - edit Makefile - set optimization levels - uncomment FC = ifort -O2 like this +To set optimization level uncomment FC = ifort -O2 like this + #FC = ifort -O0 #FC = ifort -O1 FC = ifort -O2 #FC = ifort -O3 + * Build the executable with 'make' + make * Run the program + make run * Note the final run time (example) CPU Time = 0.5143980 seconds - * Clean the program + * Clean the files we built + make clean -### STEP 4: Build and run at O3 - cd optimize_samples - edit Makefile - set optimization levels - uncomment FC = ifort -O3 like this +### STEP 4: Build and run with -O3 +Edit 'Makefile' using your favorite editor + +To set optimization level uncomment FC = ifort -O3 like this + #FC = ifort -O0 #FC = ifort -O1 #FC = ifort -O2 FC = ifort -O3 + * Build the executable with 'make' + make * Run the program + make run * Note the final run time (example) CPU Time = 0.5133380 seconds - * Clean the program + * Clean the files we built + make clean ## What did we learn? There are big jumps going from O0 to O1, and from O1 to O2. -but very little going from O2 to O3. -This does vary by application but generally with Intel Compilers -O2 is has most aggressive optimizations. Sometimes O3 can help, of course, +But we see very little performance gain going from O2 to O3. +This does vary by application but generally with Intel® Compilers +O2 has most optimizations. Sometimes O3 can help, of course, but generally O2 is sufficient for most applications. 
### Extra Exploration The Intel® Fortran Compiler has many options for optimization. -If you have a genuine Intel® Architecture process, try these additional options - edit Makefile - set optimization levels - uncomment FC = ifort -O3 and add additional options shown: +If you have a genuine Intel® Architecture processor, try these additional options + + edit 'Makefile' using your favorite editor. To set additional optimizations uncomment FC = ifort -O3 and add additional options shown: + #FC = ifort -O0 #FC = ifort -O1 #FC = ifort -O2 FC = ifort -O3 -xhost -align array64byte + * Build the executable with the new options -xhost -align array64byte + make * Run the program + make run * Note the final run time (example) CPU Time = 0.2578490 seconds * Clean the program - make clean -There are 2 additional compiler options here that are worth mentioning: -Read the online [Developer Guide and Reference][1]:"Developer Guide and Reference" for more information about -the options - 1. -xhost (sub option of -x option): [-x][1]:"-x option" - 2. -align array64byte [-align ][1]:"-align option" + make clean + +There are 2 additional compiler options here that are worth mentioning: Read the online +[Developer Guide and Reference][3] for more information about +these options +[3]: https://software.intel.com/content/www/us/en/develop/documentation/fortran-compiler-developer-guide-and-reference/top.html "Developer Guide and Reference" + 1. -xhost (sub option of -x option): [-x][4] + [4]: https://software.intel.com/content/www/us/en/develop/documentation/fortran-compiler-developer-guide-and-reference/top/compiler-reference/compiler-options/compiler-option-details/code-generation-options/x-qx.html "-x option" + 2. 
-align array64byte: [-align][5] + [5]: https://software.intel.com/content/www/us/en/develop/documentation/fortran-compiler-developer-guide-and-reference/top/compiler-reference/compiler-options/compiler-option-details/data-options/align.html "-align option" ### Clean up * Clean the program diff --git a/DirectProgramming/Fortran/vec_samples/README.md b/DirectProgramming/Fortran/vec_samples/README.md index 783a2615b5..ac6175b208 100644 --- a/DirectProgramming/Fortran/vec_samples/README.md +++ b/DirectProgramming/Fortran/vec_samples/README.md @@ -11,7 +11,7 @@ serial version and the version that was compiled with the auto-vectorizer. | Optimized for | Description |:--- |:--- -| OS | macOS* with Xcode installed (see Release Notes for details) +| OS | macOS* with Xcode installed | Software | Intel® oneAPI Intel Fortran Compiler (beta) | What you will learn | Vectorization using Intel Fortran compiler | Time to complete | 15 minutes @@ -20,11 +20,195 @@ serial version and the version that was compiled with the auto-vectorizer. ## License This code sample is licensed under MIT license -## How to Build - * make +### Introduction to Auto Vectorization +For the Intel® compiler, vectorization is the unrolling of a loop combined with the generation of packed SIMD instructions. Because the packed instructions operate on more than one data element at a time, the loop can execute more efficiently. It is sometimes referred to as auto-vectorization to emphasize that the compiler automatically identifies and optimizes suitable loops on its own. -### Clean up - * Clean the program - make clean +Intel® Advisor can assist with vectorization and show optimization report messages with your source code. See [Intel Advisor][1] for details. +[1]: https://software.intel.com/en-us/intel-advisor-xe "Intel Advisor" + +Vectorization may call library routines that can result in a greater performance gain on Intel microprocessors than on non-Intel microprocessors. 
The vectorization can also be affected by certain options, such as m or x. + +Vectorization is enabled with the compiler at optimization levels of O2 (default level) and higher for both Intel® microprocessors and non-Intel® microprocessors. Many loops are vectorized automatically, but in cases where this doesn't happen, you may be able to vectorize loops by making simple code modifications. In this sample, you will: + +1. establish a performance baseline + +2. generate a vectorization report + +3. improve performance by aligning data + +4. improve performance using Interprocedural Optimization + +### Preparing the Sample Application + +In this sample, you will use the following files: + + driver.f90 + + matvec.f90 + + +### Establishing a Performance Baseline + +To set a performance baseline for the improvements that follow in this sample, compile your sources from the src directory with these compiler options: + + ifort -real-size 64 -O1 matvec.f90 driver.f90 -o MatVector + +Execute 'MatVector' + + ./MatVector +and record the execution time reported in the output. This is the baseline against which subsequent improvements will be measured. + + +### Generating a Vectorization Report + +A vectorization report shows what loops in your code were vectorized and explains why other loops were not vectorized. To generate a vectorization report, use the **qopt-report-phase=vec** compiler options together with **qopt-report=1** or **qopt-report=2**. + +Together with **qopt-report-phase=vec**, **qopt-report=1** generates a report with the loops in your code that were vectorized while **qopt-report-phase=vec** with **qopt-report=2** generates a report with both the loops in your code that were vectorized and the reason that other loops were not vectorized. + +Because vectorization is turned off with the **O1** option, the compiler does not generate a vectorization report. 
To generate a vectorization report, compile your project with the **O2**, **qopt-report-phase=vec**, and **qopt-report=1** options: + + ifort -real-size 64 -O2 -qopt-report=1 -qopt-report-phase=vec matvec.f90 driver.f90 -o MatVector + +Recompile the program and then execute MatVector. Record the new execution time. The reduction in time is mostly due to auto-vectorization of the inner loop at line 32 noted in the vectorization report **matvec.optrpt**: + + Begin optimization report for: matvec_ + + Report from: Vector optimizations [vec] + + + LOOP BEGIN at matvec.f90(26,3) + remark #25460: No loop optimizations reported + + LOOP BEGIN at matvec.f90(26,3) + remark #15300: LOOP WAS VECTORIZED + LOOP END + + LOOP BEGIN at matvec.f90(26,3) + + LOOP END + LOOP END + + LOOP BEGIN at matvec.f90(27,3) + remark #25460: No loop optimizations reported + + LOOP BEGIN at matvec.f90(32,6) + + LOOP END + + LOOP BEGIN at matvec.f90(32,6) + remark #15300: LOOP WAS VECTORIZED + LOOP END + + LOOP BEGIN at matvec.f90(32,6) + + LOOP END + + LOOP BEGIN at matvec.f90(32,6) + + LOOP END + LOOP END + +Note + +Your line and column numbers may be different. + +**qopt-report=2** with **qopt-report-phase=vec,loop** returns a list that also includes loops that were not vectorized or multi-versioned, along with the reason that the compiler did not vectorize or multi-version them. + +Recompile your project with the **qopt-report=2** and **qopt-report-phase=vec,loop** options. + + ifort -real-size 64 -O2 -qopt-report-phase=vec,loop -qopt-report=2 matvec.f90 driver.f90 -o MatVector + +The vectorization report matvec.optrpt indicates that the loop at line 27 in matvec.f90 did not vectorize because it is not the innermost loop of the loop nest.
+ + LOOP BEGIN at matvec.f90(27,3) + remark #15542: loop was not vectorized: inner loop was already vectorized + + LOOP BEGIN at matvec.f90(32,6) + + LOOP END + + LOOP BEGIN at matvec.f90(32,6) + remark #15300: LOOP WAS VECTORIZED + LOOP END + + LOOP BEGIN at matvec.f90(32,6) + + LOOP END + + LOOP BEGIN at matvec.f90(32,6) + + remark #15335: remainder loop was not vectorized: vectorization possible but seems inefficient. Use vector always directive or -vec-threshold0 to override + LOOP END + LOOP END + +Note: Your line and column numbers may be different. + +For more information on the **qopt-report** and **qopt-report-phase** compiler options, see the +[Compiler Options section][3] in the Intel® Fortran Compiler Developer Guide and Reference. +[3]: https://software.intel.com/content/www/us/en/develop/documentation/fortran-compiler-developer-guide-and-reference/top/compiler-reference/compiler-options/alphabetical-list-of-compiler-options.html "Options" + + +### Improving Performance by Aligning Data + +The vectorizer can generate faster code when operating on aligned data. In this activity, you will improve the vectorizer performance by aligning the arrays a, b, and c in **driver.f90** on a 16-byte boundary so the vectorizer can use aligned load instructions for all arrays rather than the slower unaligned load instructions and can avoid runtime tests of alignment. Using the ALIGNED macro will insert an alignment directive for a, b, and c in driver.f90 with the following syntax: + + !dir$ attributes align : 16 :: a,b,c + +This instructs the compiler to create arrays that are aligned on a 16-byte boundary, which should facilitate the use of SSE aligned load instructions. + +In addition, the column height of the matrix a needs to be padded out to be a multiple of 16 bytes, so that each individual column of a maintains the same 16-byte alignment.
In practice, maintaining a constant alignment between columns is much more important than aligning the start of the arrays. + +To derive the maximum benefit from this alignment, we also need to tell the vectorizer it can safely assume that the arrays in matvec.f90 are aligned by using the directive: + + !dir$ vector aligned + +Note: If you use **!dir$ vector aligned**, you must be sure that all the arrays or subarrays in the loop are 16-byte aligned. Otherwise, you may get a runtime error. Aligning data may still give a performance benefit even if **!dir$ vector aligned** is not used. See the code under the ALIGNED macro in **matvec.f90**. + +If your compilation targets the Intel® AVX-512 instruction set, you should try to align data on a 64-byte boundary. This may result in improved performance. In this case, **!dir$ vector aligned** advises the compiler that the data is 64-byte aligned. + +Recompile the program with the ALIGNED macro defined to ensure consistently aligned data: + + ifort -real-size 64 -qopt-report=2 -qopt-report-phase=vec -D ALIGNED matvec.f90 driver.f90 -o MatVector + + +### Improving Performance with Interprocedural Optimization + +The compiler may be able to perform additional optimizations if it is able to optimize across source file boundaries. These may include, but are not limited to, function inlining. This is enabled with the **-ipo** option. + +Recompile the program using the **-ipo** option to enable interprocedural optimization. + + ifort -real-size 64 -qopt-report=2 -qopt-report-phase=vec -D ALIGNED -ipo matvec.f90 driver.f90 -o MatVector + +Note that the vectorization messages now appear at the point of inlining in **driver.f90** (line 70) and are written to the file **ipo_out.optrpt**.
+ + LOOP BEGIN at driver.f90(73,16) + remark #15541: loop was not vectorized: inner loop was already vectorized + + LOOP BEGIN at matvec.f90(32,3) inlined into driver.f90(70,14) + remark #15398: loop was not vectorized: loop was transformed to memset or memcpy + LOOP END + + LOOP BEGIN at matvec.f90(33,3) inlined into driver.f90(70,14) + remark #15541: loop was not vectorized: inner loop was already vectorized + + LOOP BEGIN at matvec.f90(38,6) inlined into driver.f90(70,14) + remark #15399: vectorization support: unroll factor set to 4 + remark #15300: LOOP WAS VECTORIZED + LOOP END + LOOP END + LOOP END + + +Note: Your line and column numbers may be different. + +Now, run the executable and record the execution time. + +### Additional Exercises + +The previous examples made use of double precision arrays. They may be built instead with single precision arrays by changing the command-line option **-real-size 64** to **-real-size 32**. The non-vectorized versions of the loop execute only slightly faster than the double precision version; however, the vectorized versions are substantially faster. This is because a packed SIMD instruction operating on a 32-byte vector register operates on eight single precision data elements at once instead of four double precision data elements. + +Note: In the example with data alignment, you will need to set ROWBUF=3 to ensure 16-byte alignment for each column of the matrix a. Otherwise, the directive **!dir$ vector aligned** will cause the program to fail. + +This completes the sample that shows how the compiler can optimize performance with various vectorization techniques.
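The alignment steps described in the README above can be sketched as a small standalone program. This is a minimal illustration under stated assumptions, not the sample's actual driver.f90 or matvec.f90: the program name, array shapes, fill values, and loop body are hypothetical, and only the two `!dir$` directives are taken from the README text. Compilers that do not recognize `!dir$` directives treat them as ordinary comments, so the sketch should still build and give the same result there, just without the alignment hints.

```fortran
! Hypothetical sketch: pad the column height so every column of a
! keeps the same 16-byte alignment, then assert alignment in the loop.
program aligned_matvec_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: rows = 5, cols = 4        ! logical shape (assumed)
  integer, parameter :: elems = 16 / 8            ! doubles per 16 bytes
  ! Round the column height up to a multiple of elems (5 becomes 6).
  integer, parameter :: padded = ((rows + elems - 1) / elems) * elems
  real(dp) :: a(padded, cols), b(cols), c(padded)
  !dir$ attributes align : 16 :: a, b, c
  integer :: i, j

  a = 0.0_dp                                      ! padding rows stay zero
  do j = 1, cols
     do i = 1, rows
        a(i, j) = real(i + j, dp)
     end do
     b(j) = real(j, dp)
  end do

  c = 0.0_dp
  do j = 1, cols
     ! Only safe because c and every column a(:, j) start on a
     ! 16-byte boundary (padded * 8 bytes is a multiple of 16).
     !dir$ vector aligned
     do i = 1, padded
        c(i) = c(i) + a(i, j) * b(j)
     end do
  end do

  ! Self-check: with these fill values, c(i) = 10*i + 30 for i <= rows.
  do i = 1, rows
     if (abs(c(i) - (10.0_dp * i + 30.0_dp)) > 1.0e-12_dp) then
        error stop 'unexpected result'
     end if
  end do

  print '(a, i0)', 'padded column height = ', padded
  print '(a, 5f8.1)', 'c = ', c(1:rows)
end program aligned_matvec_sketch
```

The padding is the design point: aligning only the start of `a` would leave every other column misaligned whenever the column height in bytes is not a multiple of 16, which is the case the README warns about before recommending `!dir$ vector aligned`.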
diff --git a/DirectProgramming/Fortran/vec_samples/resources/intel_logo.png b/DirectProgramming/Fortran/vec_samples/resources/intel_logo.png deleted file mode 100644 index 9afc40c81bbb5cd00403a1cf10c4f89e2a9826a1..0000000000000000000000000000000000000000 GIT binary patch
Date: Thu, 16 Jul 2020 13:30:10 -0700 Subject: [PATCH 05/11] Edited json files to add CI stanzas --- .../Fortran/openmp_samples/sample.json | 20 +++++++++++++++++++ .../Fortran/optimize_samples/sample.json | 12 +++++++++++ .../Fortran/vec_samples/sample.json | 12 +++++++++++ 3 files changed, 44 insertions(+) diff --git a/DirectProgramming/Fortran/openmp_samples/sample.json b/DirectProgramming/Fortran/openmp_samples/sample.json index 2228ad8fce..9e0685107d 100644 --- a/DirectProgramming/Fortran/openmp_samples/sample.json +++ b/DirectProgramming/Fortran/openmp_samples/sample.json @@ -7,4 +7,24 @@ "targetDevice": [ "CPU" ], "os": [ "darwin" ], "builder": [ "make" ] + "ciTests":{ + "darwin": [ + { + "id": "fort_release_cpu" + "steps": [ + "make release", + "make run", + "make clean" + ] + }, + { + "id": "fort_debug_cpu" + "steps": [ + "make debug", + "make debug_run", + "make clean" + ] + } + ] + } } diff --git a/DirectProgramming/Fortran/optimize_samples/sample.json b/DirectProgramming/Fortran/optimize_samples/sample.json index a87fa63f7a..408e78a334 100644 --- a/DirectProgramming/Fortran/optimize_samples/sample.json +++ b/DirectProgramming/Fortran/optimize_samples/sample.json @@ -7,4 +7,16 @@ "targetDevice": [ "CPU" ], "os": [ "darwin" ], "builder": [ "make" ] + "ciTests":{ + "darwin": [ + { + "id": "fort_optsample_cpu" + "steps": [ + "make", + "make run", + "make clean" + ] + } + ] + } } diff --git a/DirectProgramming/Fortran/vec_samples/sample.json b/DirectProgramming/Fortran/vec_samples/sample.json index 547a7061d7..99c1a5e1c1 100644 --- a/DirectProgramming/Fortran/vec_samples/sample.json +++ b/DirectProgramming/Fortran/vec_samples/sample.json @@ -7,4 +7,16 @@ "targetDevice": [ "CPU" ], "os": [ "darwin" ], "builder": [ "make" ]
+ "ciTests":{ + "darwin": [ + { + "id": "fort_vecsample_cpu" + "steps": [ + "make", + "make run", + "make clean" + ] + } + ] + } } From 6547e0e07197357441d092d06053b73fc91007f0 Mon Sep 17 00:00:00 2001 From: Ron Green Date: Fri, 17 Jul 2020 14:18:11 -0700 Subject: [PATCH 06/11] rwg - updates 3 README.md files to Joe O's outline --- .../Fortran/openmp_samples/.DS_Store | Bin 6148 -> 0 bytes .../Fortran/openmp_samples/README.md | 22 +++++---- .../Fortran/optimize_samples/.DS_Store | Bin 6148 -> 0 bytes .../Fortran/optimize_samples/README.md | 34 +++++++------- .../Fortran/vec_samples/.DS_Store | Bin 6148 -> 0 bytes .../Fortran/vec_samples/README.md | 42 +++++++++++------- 6 files changed, 57 insertions(+), 41 deletions(-) delete mode 100644 DirectProgramming/Fortran/openmp_samples/.DS_Store delete mode 100644 DirectProgramming/Fortran/optimize_samples/.DS_Store delete mode 100644 DirectProgramming/Fortran/vec_samples/.DS_Store diff --git a/DirectProgramming/Fortran/openmp_samples/.DS_Store b/DirectProgramming/Fortran/openmp_samples/.DS_Store deleted file mode 100644 index 9a874b5768f336915163bb88cd434575b859f936..0000000000000000000000000000000000000000 GIT binary patch
Date: Mon, 20 Jul 2020 11:44:43 +0300 Subject: [PATCH 07/11] Fix for sample.json syntax --- DirectProgramming/Fortran/openmp_samples/sample.json | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/DirectProgramming/Fortran/openmp_samples/sample.json b/DirectProgramming/Fortran/openmp_samples/sample.json index 9e0685107d..c7a512744c 100644 --- a/DirectProgramming/Fortran/openmp_samples/sample.json +++
b/DirectProgramming/Fortran/openmp_samples/sample.json @@ -6,11 +6,11 @@ "languages": [ { "fortran": {} } ], "targetDevice": [ "CPU" ], "os": [ "darwin" ], - "builder": [ "make" ] + "builder": [ "make" ], "ciTests":{ "darwin": [ { - "id": "fort_release_cpu" + "id": "fort_release_cpu", "steps": [ "make release", "make run", @@ -18,7 +18,7 @@ ] }, { - "id": "fort_debug_cpu" + "id": "fort_debug_cpu", "steps": [ "make debug", "make debug_run", From 9e2d2edcec9aafcfc98e7ac3a402d1dc041f9c89 Mon Sep 17 00:00:00 2001 From: Ron Green Date: Tue, 21 Jul 2020 09:49:42 -0700 Subject: [PATCH 08/11] fixed syntax in jsons --- DirectProgramming/Fortran/openmp_samples/sample.json | 6 +++--- DirectProgramming/Fortran/optimize_samples/sample.json | 4 ++-- DirectProgramming/Fortran/vec_samples/sample.json | 4 ++-- 3 files changed, 7 insertions(+), 7 deletions(-) diff --git a/DirectProgramming/Fortran/openmp_samples/sample.json b/DirectProgramming/Fortran/openmp_samples/sample.json index 9e0685107d..c7a512744c 100644 --- a/DirectProgramming/Fortran/openmp_samples/sample.json +++ b/DirectProgramming/Fortran/openmp_samples/sample.json @@ -6,11 +6,11 @@ "languages": [ { "fortran": {} } ], "targetDevice": [ "CPU" ], "os": [ "darwin" ], - "builder": [ "make" ] + "builder": [ "make" ], "ciTests":{ "darwin": [ { - "id": "fort_release_cpu" + "id": "fort_release_cpu", "steps": [ "make release", "make run", @@ -18,7 +18,7 @@ ] }, { - "id": "fort_debug_cpu" + "id": "fort_debug_cpu", "steps": [ "make debug", "make debug_run", diff --git a/DirectProgramming/Fortran/optimize_samples/sample.json b/DirectProgramming/Fortran/optimize_samples/sample.json index 408e78a334..0cd201d97e 100644 --- a/DirectProgramming/Fortran/optimize_samples/sample.json +++ b/DirectProgramming/Fortran/optimize_samples/sample.json @@ -6,11 +6,11 @@ "languages": [ { "fortran": {} } ], "targetDevice": [ "CPU" ], "os": [ "darwin" ], - "builder": [ "make" ] + "builder": [ "make" ], "ciTests":{ "darwin": [ { - "id": 
"fort_optsample_cpu" + "id": "fort_optsample_cpu", "steps": [ "make", "make run", diff --git a/DirectProgramming/Fortran/vec_samples/sample.json b/DirectProgramming/Fortran/vec_samples/sample.json index 99c1a5e1c1..c25dc42d09 100644 --- a/DirectProgramming/Fortran/vec_samples/sample.json +++ b/DirectProgramming/Fortran/vec_samples/sample.json @@ -6,11 +6,11 @@ "languages": [ { "fortran": {} } ], "targetDevice": [ "CPU" ], "os": [ "darwin" ], - "builder": [ "make" ] + "builder": [ "make" ], "ciTests":{ "darwin": [ { - "id": "fort_vecsample_cpu" + "id": "fort_vecsample_cpu", "steps": [ "make", "make run", From d2d4417a5e5be07bfba1ee298ddebe27fc28f579 Mon Sep 17 00:00:00 2001 From: Ron Green Date: Wed, 22 Jul 2020 14:00:46 -0700 Subject: [PATCH 09/11] Moved to directory structure --- .../Fortran/{ => CombinationalLogic}/openmp_samples/License.txt | 0 .../Fortran/{ => CombinationalLogic}/openmp_samples/Makefile | 0 .../Fortran/{ => CombinationalLogic}/openmp_samples/README.md | 0 .../Fortran/{ => CombinationalLogic}/openmp_samples/sample.json | 0 .../{ => CombinationalLogic}/openmp_samples/src/openmp_sample.f90 | 0 .../Fortran/{ => DenseLinearAlgebra}/optimize_samples/License.txt | 0 .../Fortran/{ => DenseLinearAlgebra}/optimize_samples/Makefile | 0 .../Fortran/{ => DenseLinearAlgebra}/optimize_samples/README.md | 0 .../Fortran/{ => DenseLinearAlgebra}/optimize_samples/sample.json | 0 .../{ => DenseLinearAlgebra}/optimize_samples/src/int_sin.f90 | 0 .../Fortran/{ => DenseLinearAlgebra}/vec_samples/License.txt | 0 .../Fortran/{ => DenseLinearAlgebra}/vec_samples/Makefile | 0 .../Fortran/{ => DenseLinearAlgebra}/vec_samples/README.md | 0 .../Fortran/{ => DenseLinearAlgebra}/vec_samples/sample.json | 0 .../Fortran/{ => DenseLinearAlgebra}/vec_samples/src/driver.f90 | 0 .../Fortran/{ => DenseLinearAlgebra}/vec_samples/src/matvec.f90 | 0 16 files changed, 0 insertions(+), 0 deletions(-) rename DirectProgramming/Fortran/{ => CombinationalLogic}/openmp_samples/License.txt 
(100%) rename DirectProgramming/Fortran/{ => CombinationalLogic}/openmp_samples/Makefile (100%) rename DirectProgramming/Fortran/{ => CombinationalLogic}/openmp_samples/README.md (100%) rename DirectProgramming/Fortran/{ => CombinationalLogic}/openmp_samples/sample.json (100%) rename DirectProgramming/Fortran/{ => CombinationalLogic}/openmp_samples/src/openmp_sample.f90 (100%) rename DirectProgramming/Fortran/{ => DenseLinearAlgebra}/optimize_samples/License.txt (100%) rename DirectProgramming/Fortran/{ => DenseLinearAlgebra}/optimize_samples/Makefile (100%) rename DirectProgramming/Fortran/{ => DenseLinearAlgebra}/optimize_samples/README.md (100%) rename DirectProgramming/Fortran/{ => DenseLinearAlgebra}/optimize_samples/sample.json (100%) rename DirectProgramming/Fortran/{ => DenseLinearAlgebra}/optimize_samples/src/int_sin.f90 (100%) rename DirectProgramming/Fortran/{ => DenseLinearAlgebra}/vec_samples/License.txt (100%) rename DirectProgramming/Fortran/{ => DenseLinearAlgebra}/vec_samples/Makefile (100%) rename DirectProgramming/Fortran/{ => DenseLinearAlgebra}/vec_samples/README.md (100%) rename DirectProgramming/Fortran/{ => DenseLinearAlgebra}/vec_samples/sample.json (100%) rename DirectProgramming/Fortran/{ => DenseLinearAlgebra}/vec_samples/src/driver.f90 (100%) rename DirectProgramming/Fortran/{ => DenseLinearAlgebra}/vec_samples/src/matvec.f90 (100%) diff --git a/DirectProgramming/Fortran/openmp_samples/License.txt b/DirectProgramming/Fortran/CombinationalLogic/openmp_samples/License.txt similarity index 100% rename from DirectProgramming/Fortran/openmp_samples/License.txt rename to DirectProgramming/Fortran/CombinationalLogic/openmp_samples/License.txt diff --git a/DirectProgramming/Fortran/openmp_samples/Makefile b/DirectProgramming/Fortran/CombinationalLogic/openmp_samples/Makefile similarity index 100% rename from DirectProgramming/Fortran/openmp_samples/Makefile rename to DirectProgramming/Fortran/CombinationalLogic/openmp_samples/Makefile diff 
--git a/DirectProgramming/Fortran/openmp_samples/README.md b/DirectProgramming/Fortran/CombinationalLogic/openmp_samples/README.md similarity index 100% rename from DirectProgramming/Fortran/openmp_samples/README.md rename to DirectProgramming/Fortran/CombinationalLogic/openmp_samples/README.md diff --git a/DirectProgramming/Fortran/openmp_samples/sample.json b/DirectProgramming/Fortran/CombinationalLogic/openmp_samples/sample.json similarity index 100% rename from DirectProgramming/Fortran/openmp_samples/sample.json rename to DirectProgramming/Fortran/CombinationalLogic/openmp_samples/sample.json diff --git a/DirectProgramming/Fortran/openmp_samples/src/openmp_sample.f90 b/DirectProgramming/Fortran/CombinationalLogic/openmp_samples/src/openmp_sample.f90 similarity index 100% rename from DirectProgramming/Fortran/openmp_samples/src/openmp_sample.f90 rename to DirectProgramming/Fortran/CombinationalLogic/openmp_samples/src/openmp_sample.f90 diff --git a/DirectProgramming/Fortran/optimize_samples/License.txt b/DirectProgramming/Fortran/DenseLinearAlgebra/optimize_samples/License.txt similarity index 100% rename from DirectProgramming/Fortran/optimize_samples/License.txt rename to DirectProgramming/Fortran/DenseLinearAlgebra/optimize_samples/License.txt diff --git a/DirectProgramming/Fortran/optimize_samples/Makefile b/DirectProgramming/Fortran/DenseLinearAlgebra/optimize_samples/Makefile similarity index 100% rename from DirectProgramming/Fortran/optimize_samples/Makefile rename to DirectProgramming/Fortran/DenseLinearAlgebra/optimize_samples/Makefile diff --git a/DirectProgramming/Fortran/optimize_samples/README.md b/DirectProgramming/Fortran/DenseLinearAlgebra/optimize_samples/README.md similarity index 100% rename from DirectProgramming/Fortran/optimize_samples/README.md rename to DirectProgramming/Fortran/DenseLinearAlgebra/optimize_samples/README.md diff --git a/DirectProgramming/Fortran/optimize_samples/sample.json 
b/DirectProgramming/Fortran/DenseLinearAlgebra/optimize_samples/sample.json similarity index 100% rename from DirectProgramming/Fortran/optimize_samples/sample.json rename to DirectProgramming/Fortran/DenseLinearAlgebra/optimize_samples/sample.json diff --git a/DirectProgramming/Fortran/optimize_samples/src/int_sin.f90 b/DirectProgramming/Fortran/DenseLinearAlgebra/optimize_samples/src/int_sin.f90 similarity index 100% rename from DirectProgramming/Fortran/optimize_samples/src/int_sin.f90 rename to DirectProgramming/Fortran/DenseLinearAlgebra/optimize_samples/src/int_sin.f90 diff --git a/DirectProgramming/Fortran/vec_samples/License.txt b/DirectProgramming/Fortran/DenseLinearAlgebra/vec_samples/License.txt similarity index 100% rename from DirectProgramming/Fortran/vec_samples/License.txt rename to DirectProgramming/Fortran/DenseLinearAlgebra/vec_samples/License.txt diff --git a/DirectProgramming/Fortran/vec_samples/Makefile b/DirectProgramming/Fortran/DenseLinearAlgebra/vec_samples/Makefile similarity index 100% rename from DirectProgramming/Fortran/vec_samples/Makefile rename to DirectProgramming/Fortran/DenseLinearAlgebra/vec_samples/Makefile diff --git a/DirectProgramming/Fortran/vec_samples/README.md b/DirectProgramming/Fortran/DenseLinearAlgebra/vec_samples/README.md similarity index 100% rename from DirectProgramming/Fortran/vec_samples/README.md rename to DirectProgramming/Fortran/DenseLinearAlgebra/vec_samples/README.md diff --git a/DirectProgramming/Fortran/vec_samples/sample.json b/DirectProgramming/Fortran/DenseLinearAlgebra/vec_samples/sample.json similarity index 100% rename from DirectProgramming/Fortran/vec_samples/sample.json rename to DirectProgramming/Fortran/DenseLinearAlgebra/vec_samples/sample.json diff --git a/DirectProgramming/Fortran/vec_samples/src/driver.f90 b/DirectProgramming/Fortran/DenseLinearAlgebra/vec_samples/src/driver.f90 similarity index 100% rename from DirectProgramming/Fortran/vec_samples/src/driver.f90 rename to 
DirectProgramming/Fortran/DenseLinearAlgebra/vec_samples/src/driver.f90 diff --git a/DirectProgramming/Fortran/vec_samples/src/matvec.f90 b/DirectProgramming/Fortran/DenseLinearAlgebra/vec_samples/src/matvec.f90 similarity index 100% rename from DirectProgramming/Fortran/vec_samples/src/matvec.f90 rename to DirectProgramming/Fortran/DenseLinearAlgebra/vec_samples/src/matvec.f90 From ab89fb0ee9a13254404346161274c760e75a94f1 Mon Sep 17 00:00:00 2001 From: Ron Green Date: Wed, 5 Aug 2020 15:14:29 -0700 Subject: [PATCH 10/11] Renamed directories to conform to standards --- .../{openmp_samples => openmp-primes}/License.txt | 0 .../{openmp_samples => openmp-primes}/Makefile | 0 .../{openmp_samples => openmp-primes}/README.md | 2 +- .../{openmp_samples => openmp-primes}/sample.json | 0 .../{openmp_samples => openmp-primes}/src/openmp_sample.f90 | 0 .../{optimize_samples => optimize-integral}/License.txt | 0 .../{optimize_samples => optimize-integral}/Makefile | 0 .../{optimize_samples => optimize-integral}/README.md | 2 +- .../{optimize_samples => optimize-integral}/sample.json | 0 .../{optimize_samples => optimize-integral}/src/int_sin.f90 | 0 .../{vec_samples => vectorize-vecmatmult}/License.txt | 0 .../{vec_samples => vectorize-vecmatmult}/Makefile | 0 .../{vec_samples => vectorize-vecmatmult}/README.md | 2 +- .../{vec_samples => vectorize-vecmatmult}/sample.json | 0 .../{vec_samples => vectorize-vecmatmult}/src/driver.f90 | 0 .../{vec_samples => vectorize-vecmatmult}/src/matvec.f90 | 0 16 files changed, 3 insertions(+), 3 deletions(-) rename DirectProgramming/Fortran/CombinationalLogic/{openmp_samples => openmp-primes}/License.txt (100%) rename DirectProgramming/Fortran/CombinationalLogic/{openmp_samples => openmp-primes}/Makefile (100%) rename DirectProgramming/Fortran/CombinationalLogic/{openmp_samples => openmp-primes}/README.md (99%) rename DirectProgramming/Fortran/CombinationalLogic/{openmp_samples => openmp-primes}/sample.json (100%) rename 
DirectProgramming/Fortran/CombinationalLogic/{openmp_samples => openmp-primes}/src/openmp_sample.f90 (100%) rename DirectProgramming/Fortran/DenseLinearAlgebra/{optimize_samples => optimize-integral}/License.txt (100%) rename DirectProgramming/Fortran/DenseLinearAlgebra/{optimize_samples => optimize-integral}/Makefile (100%) rename DirectProgramming/Fortran/DenseLinearAlgebra/{optimize_samples => optimize-integral}/README.md (99%) rename DirectProgramming/Fortran/DenseLinearAlgebra/{optimize_samples => optimize-integral}/sample.json (100%) rename DirectProgramming/Fortran/DenseLinearAlgebra/{optimize_samples => optimize-integral}/src/int_sin.f90 (100%) rename DirectProgramming/Fortran/DenseLinearAlgebra/{vec_samples => vectorize-vecmatmult}/License.txt (100%) rename DirectProgramming/Fortran/DenseLinearAlgebra/{vec_samples => vectorize-vecmatmult}/Makefile (100%) rename DirectProgramming/Fortran/DenseLinearAlgebra/{vec_samples => vectorize-vecmatmult}/README.md (99%) rename DirectProgramming/Fortran/DenseLinearAlgebra/{vec_samples => vectorize-vecmatmult}/sample.json (100%) rename DirectProgramming/Fortran/DenseLinearAlgebra/{vec_samples => vectorize-vecmatmult}/src/driver.f90 (100%) rename DirectProgramming/Fortran/DenseLinearAlgebra/{vec_samples => vectorize-vecmatmult}/src/matvec.f90 (100%) diff --git a/DirectProgramming/Fortran/CombinationalLogic/openmp_samples/License.txt b/DirectProgramming/Fortran/CombinationalLogic/openmp-primes/License.txt similarity index 100% rename from DirectProgramming/Fortran/CombinationalLogic/openmp_samples/License.txt rename to DirectProgramming/Fortran/CombinationalLogic/openmp-primes/License.txt diff --git a/DirectProgramming/Fortran/CombinationalLogic/openmp_samples/Makefile b/DirectProgramming/Fortran/CombinationalLogic/openmp-primes/Makefile similarity index 100% rename from DirectProgramming/Fortran/CombinationalLogic/openmp_samples/Makefile rename to DirectProgramming/Fortran/CombinationalLogic/openmp-primes/Makefile diff 
--git a/DirectProgramming/Fortran/CombinationalLogic/openmp_samples/README.md b/DirectProgramming/Fortran/CombinationalLogic/openmp-primes/README.md similarity index 99% rename from DirectProgramming/Fortran/CombinationalLogic/openmp_samples/README.md rename to DirectProgramming/Fortran/CombinationalLogic/openmp-primes/README.md index a9610afc6b..6509c3788b 100644 --- a/DirectProgramming/Fortran/CombinationalLogic/openmp_samples/README.md +++ b/DirectProgramming/Fortran/CombinationalLogic/openmp-primes/README.md @@ -1,4 +1,4 @@ -# `Fortran OpenMP*` sample +# `OpenMP Primes` This sample is designed to illustrate how to use the OpenMP* API with the Intel® Fortran Compiler. diff --git a/DirectProgramming/Fortran/CombinationalLogic/openmp_samples/sample.json b/DirectProgramming/Fortran/CombinationalLogic/openmp-primes/sample.json similarity index 100% rename from DirectProgramming/Fortran/CombinationalLogic/openmp_samples/sample.json rename to DirectProgramming/Fortran/CombinationalLogic/openmp-primes/sample.json diff --git a/DirectProgramming/Fortran/CombinationalLogic/openmp_samples/src/openmp_sample.f90 b/DirectProgramming/Fortran/CombinationalLogic/openmp-primes/src/openmp_sample.f90 similarity index 100% rename from DirectProgramming/Fortran/CombinationalLogic/openmp_samples/src/openmp_sample.f90 rename to DirectProgramming/Fortran/CombinationalLogic/openmp-primes/src/openmp_sample.f90 diff --git a/DirectProgramming/Fortran/DenseLinearAlgebra/optimize_samples/License.txt b/DirectProgramming/Fortran/DenseLinearAlgebra/optimize-integral/License.txt similarity index 100% rename from DirectProgramming/Fortran/DenseLinearAlgebra/optimize_samples/License.txt rename to DirectProgramming/Fortran/DenseLinearAlgebra/optimize-integral/License.txt diff --git a/DirectProgramming/Fortran/DenseLinearAlgebra/optimize_samples/Makefile b/DirectProgramming/Fortran/DenseLinearAlgebra/optimize-integral/Makefile similarity index 100% rename from 
DirectProgramming/Fortran/DenseLinearAlgebra/optimize_samples/Makefile rename to DirectProgramming/Fortran/DenseLinearAlgebra/optimize-integral/Makefile diff --git a/DirectProgramming/Fortran/DenseLinearAlgebra/optimize_samples/README.md b/DirectProgramming/Fortran/DenseLinearAlgebra/optimize-integral/README.md similarity index 99% rename from DirectProgramming/Fortran/DenseLinearAlgebra/optimize_samples/README.md rename to DirectProgramming/Fortran/DenseLinearAlgebra/optimize-integral/README.md index 8681e3a9ac..e576ecb8af 100644 --- a/DirectProgramming/Fortran/DenseLinearAlgebra/optimize_samples/README.md +++ b/DirectProgramming/Fortran/DenseLinearAlgebra/optimize-integral/README.md @@ -1,4 +1,4 @@ -# `Fortran Optimization` sample +# `Optimization Integral` This sample is designed to illustrate compiler optimization features and programming concepts. diff --git a/DirectProgramming/Fortran/DenseLinearAlgebra/optimize_samples/sample.json b/DirectProgramming/Fortran/DenseLinearAlgebra/optimize-integral/sample.json similarity index 100% rename from DirectProgramming/Fortran/DenseLinearAlgebra/optimize_samples/sample.json rename to DirectProgramming/Fortran/DenseLinearAlgebra/optimize-integral/sample.json diff --git a/DirectProgramming/Fortran/DenseLinearAlgebra/optimize_samples/src/int_sin.f90 b/DirectProgramming/Fortran/DenseLinearAlgebra/optimize-integral/src/int_sin.f90 similarity index 100% rename from DirectProgramming/Fortran/DenseLinearAlgebra/optimize_samples/src/int_sin.f90 rename to DirectProgramming/Fortran/DenseLinearAlgebra/optimize-integral/src/int_sin.f90 diff --git a/DirectProgramming/Fortran/DenseLinearAlgebra/vec_samples/License.txt b/DirectProgramming/Fortran/DenseLinearAlgebra/vectorize-vecmatmult/License.txt similarity index 100% rename from DirectProgramming/Fortran/DenseLinearAlgebra/vec_samples/License.txt rename to DirectProgramming/Fortran/DenseLinearAlgebra/vectorize-vecmatmult/License.txt diff --git 
a/DirectProgramming/Fortran/DenseLinearAlgebra/vec_samples/Makefile b/DirectProgramming/Fortran/DenseLinearAlgebra/vectorize-vecmatmult/Makefile similarity index 100% rename from DirectProgramming/Fortran/DenseLinearAlgebra/vec_samples/Makefile rename to DirectProgramming/Fortran/DenseLinearAlgebra/vectorize-vecmatmult/Makefile diff --git a/DirectProgramming/Fortran/DenseLinearAlgebra/vec_samples/README.md b/DirectProgramming/Fortran/DenseLinearAlgebra/vectorize-vecmatmult/README.md similarity index 99% rename from DirectProgramming/Fortran/DenseLinearAlgebra/vec_samples/README.md rename to DirectProgramming/Fortran/DenseLinearAlgebra/vectorize-vecmatmult/README.md index 32cb08e5c6..79d800f3e0 100644 --- a/DirectProgramming/Fortran/DenseLinearAlgebra/vec_samples/README.md +++ b/DirectProgramming/Fortran/DenseLinearAlgebra/vectorize-vecmatmult/README.md @@ -1,4 +1,4 @@ -# `Fortran Vectorization` sample +# `Vectorize VecMatMult` In this sample, you will use the auto-vectorizer to improve the performance of the sample application. 
You will compare the performance of the diff --git a/DirectProgramming/Fortran/DenseLinearAlgebra/vec_samples/sample.json b/DirectProgramming/Fortran/DenseLinearAlgebra/vectorize-vecmatmult/sample.json similarity index 100% rename from DirectProgramming/Fortran/DenseLinearAlgebra/vec_samples/sample.json rename to DirectProgramming/Fortran/DenseLinearAlgebra/vectorize-vecmatmult/sample.json diff --git a/DirectProgramming/Fortran/DenseLinearAlgebra/vec_samples/src/driver.f90 b/DirectProgramming/Fortran/DenseLinearAlgebra/vectorize-vecmatmult/src/driver.f90 similarity index 100% rename from DirectProgramming/Fortran/DenseLinearAlgebra/vec_samples/src/driver.f90 rename to DirectProgramming/Fortran/DenseLinearAlgebra/vectorize-vecmatmult/src/driver.f90 diff --git a/DirectProgramming/Fortran/DenseLinearAlgebra/vec_samples/src/matvec.f90 b/DirectProgramming/Fortran/DenseLinearAlgebra/vectorize-vecmatmult/src/matvec.f90 similarity index 100% rename from DirectProgramming/Fortran/DenseLinearAlgebra/vec_samples/src/matvec.f90 rename to DirectProgramming/Fortran/DenseLinearAlgebra/vectorize-vecmatmult/src/matvec.f90 From 9f0434eac3c92df07583d1a4012d966dcfa66c19 Mon Sep 17 00:00:00 2001 From: Ron Green Date: Mon, 24 Aug 2020 16:06:09 -0700 Subject: [PATCH 11/11] Misc edits to fix issues found in REVIEW by Barbara Perz. 
---
 .../Fortran/CombinationalLogic/openmp-primes/README.md | 4 ++--
 .../Fortran/CombinationalLogic/openmp-primes/sample.json | 2 +-
 .../Fortran/DenseLinearAlgebra/optimize-integral/sample.json | 2 +-
 .../Fortran/DenseLinearAlgebra/vectorize-vecmatmult/README.md | 4 ++--
 .../DenseLinearAlgebra/vectorize-vecmatmult/sample.json | 2 +-
 5 files changed, 7 insertions(+), 7 deletions(-)
diff --git a/DirectProgramming/Fortran/CombinationalLogic/openmp-primes/README.md b/DirectProgramming/Fortran/CombinationalLogic/openmp-primes/README.md
index 6509c3788b..b8b7de039e 100644
--- a/DirectProgramming/Fortran/CombinationalLogic/openmp-primes/README.md
+++ b/DirectProgramming/Fortran/CombinationalLogic/openmp-primes/README.md
@@ -21,7 +21,7 @@ and the number of 4n-1 primes in the same range. It illustrates two OpenMP* dire
 to help speed up the code.
 First, a dynamic schedule clause is used with the OpenMP* for directive.
-Because the for loop's workload increases as its index gets bigger,
+Because the DO loop's workload increases as its index gets bigger,
 the default static scheduling does not work well.
 Instead, dynamic scheduling is used to account for the increasing workload.
 But dynamic scheduling itself has more overhead than static scheduling,
@@ -29,7 +29,7 @@ so a chunk size of 10 is used to reduce the overhead for dynamic scheduling.
 Second, a reduction clause is used instead of an OpenMP* critical directive to eliminate lock overhead.
 A critical directive would cause excessive lock overhead
-due to the one-thread-at-time update of the shared variables each time through the for loop.
+due to the one-thread-at-time update of the shared variables each time through the DO loop.
 Instead the reduction clause causes only one update of the shared variables once at the end of the loop.
 The sample can be compiled unoptimized (-O0 ), or at any level of
diff --git a/DirectProgramming/Fortran/CombinationalLogic/openmp-primes/sample.json b/DirectProgramming/Fortran/CombinationalLogic/openmp-primes/sample.json
index c7a512744c..67c356fe59 100644
--- a/DirectProgramming/Fortran/CombinationalLogic/openmp-primes/sample.json
+++ b/DirectProgramming/Fortran/CombinationalLogic/openmp-primes/sample.json
@@ -1,5 +1,5 @@
 {
- "name": "openmp_samples",
+ "name": "openmp-primes",
  "categories": [ "Toolkit/Intel® oneAPI HPC Toolkit" ],
  "description": "Fortran Tutorial - Using OpenMP",
  "toolchain": [ "ifort" ],
diff --git a/DirectProgramming/Fortran/DenseLinearAlgebra/optimize-integral/sample.json b/DirectProgramming/Fortran/DenseLinearAlgebra/optimize-integral/sample.json
index 0cd201d97e..1e0a458b35 100644
--- a/DirectProgramming/Fortran/DenseLinearAlgebra/optimize-integral/sample.json
+++ b/DirectProgramming/Fortran/DenseLinearAlgebra/optimize-integral/sample.json
@@ -1,5 +1,5 @@
 {
- "name": "optimization_samples",
+ "name": "optimization-integral",
  "categories": [ "Toolkit/Intel® oneAPI HPC Toolkit" ],
  "description": "Fortran Sample - Simple Compiler Optimizations",
  "toolchain": [ "ifort" ],
diff --git a/DirectProgramming/Fortran/DenseLinearAlgebra/vectorize-vecmatmult/README.md b/DirectProgramming/Fortran/DenseLinearAlgebra/vectorize-vecmatmult/README.md
index 79d800f3e0..1e08cb7caf 100644
--- a/DirectProgramming/Fortran/DenseLinearAlgebra/vectorize-vecmatmult/README.md
+++ b/DirectProgramming/Fortran/DenseLinearAlgebra/vectorize-vecmatmult/README.md
@@ -22,7 +22,7 @@ Single Instruction Multiple Data (SIMD) instruction set.
 For the Intel® compiler, vectorization is the unrolling of a loop combined with the generation of packed SIMD instructions. Because the packed instructions operate on more than one data element at a time, the loop can execute more efficiently.
 It is sometimes referred to as auto-vectorization to emphasize that the compiler automatically identifies and optimizes suitable loops on its own. Intel® Advisor can assist with vectorization and show optimization report messages with your source code. See [Intel Advisor][1] for details.
-[1]: https://software.intel.com/en-us/intel-advisor-xe "Intel Avisor"
+[1]: https://software.intel.com/content/www/us/en/develop/tools/advisor.html "Intel Avisor"
 Vectorization may call library routines that can result in additional performance gain on Intel microprocessors than on non-Intel microprocessors. The vectorization can also be affected by certain options, such as m or x.
@@ -51,7 +51,7 @@ This code sample is licensed under MIT license
 ## Building the `Fortran Vectorization` sample
-This sample contains 2 Fortran source files, in subdirectory 'src/' under the main sample root directory oneAPI-samples/DirectProgramming/Fortran/vec_samples
+This sample contains 2 Fortran source files, in subdirectory 'src/' under the main sample root directory oneAPI-samples/DirectProgramming/Fortran/vectorize-vecmatmult
 1. matvec.f90 is a Fortran source file with a matrix-times-vector algorithm
 2. driver.f90 is a Fortran source file with the main program calling matvec
diff --git a/DirectProgramming/Fortran/DenseLinearAlgebra/vectorize-vecmatmult/sample.json b/DirectProgramming/Fortran/DenseLinearAlgebra/vectorize-vecmatmult/sample.json
index c25dc42d09..a573f6b037 100644
--- a/DirectProgramming/Fortran/DenseLinearAlgebra/vectorize-vecmatmult/sample.json
+++ b/DirectProgramming/Fortran/DenseLinearAlgebra/vectorize-vecmatmult/sample.json
@@ -1,5 +1,5 @@
 {
- "name": "vec_samples",
+ "name": "vectorize-vecmatmult",
  "categories": [ "Toolkit/Intel® oneAPI HPC Toolkit" ],
  "description": "Fortran Tutorial - Using Auto Vectorization",
  "toolchain": [ "ifort" ],