Changes to FPGA tutorial shannonization to make it more like other tutorials. (#553)

tyoungsc · web-flow · commit 67653a2aae81 · 2021-07-06T11:02:57.000-04:00
* First commit for merge sort design
  Signed-off-by: tyoungsc <tanner.young-schultz@intel.com>
* Cleaned up MergeSort.hpp a bit using defines
  Signed-off-by: tyoungsc <tanner.young-schultz@intel.com>
* Formatting/comments
* Reduced area by connecting Partition unit (previously called Shuffle) to only the first merge unit
  Changed Shuffle to Partition
  Updated pictures and README
  Used existing pipe_array code (instead of using my own)
  Code cleanup and comments
  Tested in emulation and reports, doing a HW build now
* Fixed bug with no-USM BSPs in Produce kernel
  Slight code cleanup
* README update after Mike's review
  CMake update
  Removed line from samples.json that was not necessary
  Deleted unused files
* Changed Windows specific flag to use '/' instead of '-'
* CMake update
* Removed all pointer arithmetic for offsetting to the inside of kernels (to avoid runtime issue)
* Big change to example design: incorporate a new multi-element per cycle merge unit, use a bitonic sorter on the input, rather than a simple partition.
  Updated README and pictures to fit the new design.
  Addressed Mike's most recent review comments
* Code cleanup: changed some variable names and comments
* README update
* Picture update
* Comments
* Code cleanup and comments
* Small change to README
* Merged shannonization VCXPROJ files into a single one (as per all the other tests).
  Small update to source file to change II target for A10 and fix indenting
* Adding file I missed in previous commit
* Updated comment
* Grammar update
* Again
* Updating design output and README
* Simplified Produce kernel
* Allowed SORT_WIDTH to be 1 (1 element per cycle)
  Improved Merge kernel for case where SORT_WIDTH=1 with shannonization
* Code cleanup
  Variable renaming
  Comments
  Grammar
* Comments, formatting, README grammar changes, and general cleanup :)
* Renamed all files to match google style
  Renamed main file to main.cpp
  Used impu namespace in unrolledloop, pipearray, and static_math
  Renamed static_math.hpp to impu_math.hpp
  Updated README, Windows VS files, and CMake with these changes
* README updates
  Comments
* Changed filenames to use underscores between words
* Added 'pipe' namespace to impu namespace for pipe utilities
* Revert "Merge pull request #1 from tyoungsc/new_fpga_ref_design_merge_sort"
  This reverts commit 202ff82, reversing changes made to 72580ac.
diff --git a/DirectProgramming/DPC++FPGA/Tutorials/DesignPatterns/shannonization/shannonization.sln b/DirectProgramming/DPC++FPGA/Tutorials/DesignPatterns/shannonization/shannonization.sln
@@ -3,9 +3,7 @@ Microsoft Visual Studio Solution File, Format Version 12.00
 # Visual Studio 15
 VisualStudioVersion = 15.0.28307.705
 MinimumVisualStudioVersion = 10.0.40219.1
-Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "shannonization_a10", "shannonization_a10.vcxproj", "{D6A634E7-9F2B-46C2-A21C-2402F631A55A}"
-EndProject
-Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "shannonization_s10", "shannonization_s10.vcxproj", "{30A42429-E56D-4448-903E-6F4C4756E491}"
+Project("{8BC9CEB8-8B4A-11D0-8D11-00A0C91BC942}") = "shannonization", "shannonization.vcxproj", "{D6A634E7-9F2B-46C2-A21C-2402F631A55A}"
 EndProject
 Global
 	GlobalSection(SolutionConfigurationPlatforms) = preSolution
diff --git a/DirectProgramming/DPC++FPGA/Tutorials/DesignPatterns/shannonization/shannonization.vcxproj b/DirectProgramming/DPC++FPGA/Tutorials/DesignPatterns/shannonization/shannonization.vcxproj
@@ -105,7 +105,6 @@
       <AdditionalOptions>-DFPGA_EMULATOR %(AdditionalOptions)</AdditionalOptions>
       <ObjectFileName>$(IntDir)shannonization.obj</ObjectFileName>
       <AdditionalIncludeDirectories>$(ONEAPI_ROOT)dev-utilities\latest\include</AdditionalIncludeDirectories>
-      <PreprocessorDefinitions>A10;%(PreprocessorDefinitions)</PreprocessorDefinitions>
     </ClCompile>
     <Link>
       <SubSystem>Console</SubSystem>
@@ -147,7 +146,6 @@
       <AdditionalOptions>-DFPGA_EMULATOR %(AdditionalOptions)</AdditionalOptions>
       <ObjectFileName>$(IntDir)shannonization.obj</ObjectFileName>
       <AdditionalIncludeDirectories>$(ONEAPI_ROOT)dev-utilities\latest\include</AdditionalIncludeDirectories>
-      <PreprocessorDefinitions>A10;%(PreprocessorDefinitions)</PreprocessorDefinitions>
     </ClCompile>
     <Link>
       <SubSystem>Console</SubSystem>
@@ -160,4 +158,4 @@
   <Import Project="$(VCTargetsPath)\Microsoft.Cpp.targets" />
   <ImportGroup Label="ExtensionTargets">
   </ImportGroup>
-</Project>
+</Project>
diff --git a/DirectProgramming/DPC++FPGA/Tutorials/DesignPatterns/shannonization/shannonization_s10.vcxproj b/DirectProgramming/DPC++FPGA/Tutorials/DesignPatterns/shannonization/shannonization_s10.vcxproj
diff --git a/DirectProgramming/DPC++FPGA/Tutorials/DesignPatterns/shannonization/src/shannonization.cpp b/DirectProgramming/DPC++FPGA/Tutorials/DesignPatterns/shannonization/src/shannonization.cpp
@@ -268,28 +268,28 @@ int main(int argc, char** argv) {
 
     bool success = true;
 
-  // Instantiate multiple versions of the kernel
-  // The II achieved by the compiler can differ between FPGA architectures
-  //
-  // On Arria 10, we are able to achieve an II of 1 for versions 1 and 2 of
-  // the kernel (not version 0).
-  // Version 2 of the kernel can achieve the highest Fmax with 
-  // an II of 1 (and therefore has the highest throughput).
-  // Since this tutorial compiles to a single FPGA image, this is not
-  // reflected in the final design (that is, version 1 bottlenecks the Fmax
-  // of the entire design, which contains versions 0, 1 and 2).
-  // However, the difference between versions 1 and 2
-  // can be seen in the "Block Scheduled Fmax" columns in the 
-  // "Loop Analysis" tab of the HTML reports.
-  //
-  // On Stratix 10 and Agilex, the same discussion applies, but version 0
-  // can only achieve an II of 3 while versions 1 and 2 can only achieve
-  // an II of 2. On Stratix 10 and Agilex, we can achieve an II of 1 if we use
-  // non-blocking pipe reads in the IntersectionKernel, which is shown in
-  // version 3 of the kernel.
-  //
+    // Instantiate multiple versions of the kernel
+    // The II achieved by the compiler can differ between FPGA architectures
+    //
+    // On Arria 10, we are able to achieve an II of 1 for all versions of the
+    // kernel.
+    // Version 2 of the kernel can achieve the highest Fmax with 
+    // an II of 1 (and therefore has the highest throughput).
+    // Since this tutorial compiles to a single FPGA image, this is not
+    // reflected in the final design (that is, version 1 bottlenecks the Fmax
+    // of the entire design, which contains versions 0, 1 and 2).
+    // However, the difference between versions 1 and 2
+    // can be seen in the "Block Scheduled Fmax" columns in the 
+    // "Loop Analysis" tab of the HTML reports.
+    //
+    // On Stratix 10 and Agilex, the same discussion applies, but version 0
+    // can only achieve an II of 3 while versions 1 and 2 can only achieve
+    // an II of 2. On Stratix 10 and Agilex, we can achieve an II of 1 if we use
+    // non-blocking pipe reads in the IntersectionKernel, which is shown in
+    // version 3 of the kernel.
+    //
 #if defined(A10)
-    success &= Intersection<0,2>(q, a, b, golden_n);
+    success &= Intersection<0,1>(q, a, b, golden_n);
     success &= Intersection<1,1>(q, a, b, golden_n);
     success &= Intersection<2,1>(q, a, b, golden_n);
     success &= Intersection<3,1>(q, a, b, golden_n);
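
The comment block in the hunk above ties throughput to how much work sits on the loop-carried path of the intersection loop. As a rough illustration only, the sketch below shows the general idea of shannonization in plain host C++ (not SYCL, and not the tutorial's actual kernel): the comparisons for every possible next state are computed speculatively, so the loop-carried decision reduces to a select. The function names `IntersectBaseline`, `IntersectShannonized`, and `LoadOrZero` are hypothetical, and the sketch assumes both inputs are sorted in strictly increasing order.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: reads v[idx], or 0 past the end. Values read past the
// end are never used for counting because the loop exits first.
static unsigned LoadOrZero(const std::vector<unsigned>& v, std::size_t idx) {
  return idx < v.size() ? v[idx] : 0u;
}

// Baseline: each iteration compares, then decides which index to advance.
// Assumes a and b are sorted in strictly increasing order.
std::size_t IntersectBaseline(const std::vector<unsigned>& a,
                              const std::vector<unsigned>& b) {
  std::size_t i = 0, j = 0, n = 0;
  while (i < a.size() && j < b.size()) {
    const bool eq = a[i] == b[j];
    const bool lt = a[i] < b[j];
    if (eq) ++n;
    if (lt || eq) ++i;
    if (!lt) ++j;
  }
  return n;
}

// Shannonized sketch: comparisons for all three possible next states are
// computed ahead of time; the loop-carried update only selects among them.
std::size_t IntersectShannonized(const std::vector<unsigned>& a,
                                 const std::vector<unsigned>& b) {
  std::size_t i = 0, j = 0, n = 0;
  unsigned a_cur = LoadOrZero(a, 0), a_nxt = LoadOrZero(a, 1);
  unsigned b_cur = LoadOrZero(b, 0), b_nxt = LoadOrZero(b, 1);
  bool eq = a_cur == b_cur;
  bool lt = a_cur < b_cur;
  while (i < a.size() && j < b.size()) {
    // Speculative comparisons: advance a only, b only, or both.
    const bool eq_a = a_nxt == b_cur, lt_a = a_nxt < b_cur;
    const bool eq_b = a_cur == b_nxt, lt_b = a_cur < b_nxt;
    const bool eq_ab = a_nxt == b_nxt, lt_ab = a_nxt < b_nxt;

    const bool adv_a = lt || eq;
    const bool adv_b = !lt;
    assert(adv_a || adv_b);  // at least one index advances, so the loop ends
    if (eq) ++n;

    // Select the precomputed result that matches the state actually taken.
    if (adv_a && adv_b) { eq = eq_ab; lt = lt_ab; }
    else if (adv_a)     { eq = eq_a;  lt = lt_a; }
    else                { eq = eq_b;  lt = lt_b; }

    if (adv_a) { ++i; a_cur = a_nxt; a_nxt = LoadOrZero(a, i + 1); }
    if (adv_b) { ++j; b_cur = b_nxt; b_nxt = LoadOrZero(b, j + 1); }
  }
  return n;
}
```

Both functions return the same count for the same inputs. On a CPU the second form brings no benefit; it is meant only to show the restructuring that can shorten the critical path when a compiler schedules such a loop for hardware.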