Conversation
DirectProgramming/DPC++FPGA/Tutorials/Features/ac_types/ac_int/src/CMakeLists.txt
Outdated
Show resolved
Hide resolved
DirectProgramming/DPC++FPGA/Tutorials/Features/ac_types/ac_int/src/CMakeLists.txt
Outdated
Show resolved
Hide resolved
DirectProgramming/DPC++FPGA/Tutorials/Features/ac_types/ac_int/src/ac_int.cpp
Outdated
Show resolved
Hide resolved
DirectProgramming/DPC++FPGA/Tutorials/Features/ac_types/ac_int/src/ac_int.cpp
Outdated
Show resolved
Hide resolved
DirectProgramming/DPC++FPGA/Tutorials/Features/ac_types/ac_int/src/ac_int.cpp
Outdated
Show resolved
Hide resolved
DirectProgramming/DPC++FPGA/Tutorials/Features/ac_types/ac_int/README.md
Outdated
Show resolved
Hide resolved
DirectProgramming/DPC++FPGA/Tutorials/Features/ac_types/ac_int/sample.json
Show resolved
Hide resolved
DirectProgramming/DPC++FPGA/Tutorials/Features/ac_types/ac_int/src/ac_int.cpp
Outdated
Show resolved
Hide resolved
DirectProgramming/DPC++FPGA/Tutorials/Features/ac_types/ac_int/src/ac_int.cpp
Outdated
Show resolved
Hide resolved
DirectProgramming/DPC++FPGA/Tutorials/Features/ac_types/ac_int/src/ac_int.cpp
Show resolved
Hide resolved
| class SetBitSlice; | ||
|
|
||
| using ac_int4 = ac_intN::int4; | ||
| using ac_int14 = ac_intN::int14; |
There was a problem hiding this comment.
These typedefs seem unnecessary, since we already have them in the ac_int header file.
There was a problem hiding this comment.
I will let Paul weigh in on this one as well. I added these because they simplify the functions below and because I have seen non oneAPI code samples do it. I don't have a strong preference either way.
There was a problem hiding this comment.
Hi @paul-white-intel please review this part.
There was a problem hiding this comment.
I agree with Ajay. I would use typedefs if you wanted to wrap a particular parameterization of ac_int, e.g. ac_int<14, false>, but if you're using the convenience type, then it's kind of redundant. What's the point of the convenience type then? :)
There was a problem hiding this comment.
To me, the fewer characters looked better. I didn't want to see ac_intN:: everywhere. If it doesn't add any value in your opinion, I will remove them.
There was a problem hiding this comment.
then when you typedef, I think you should use the ac_int<x, b> parameterization.
| accessor x{inp1, h, read_only}; | ||
| accessor y{inp2, h, read_only}; | ||
| accessor res{result, h, write_only}; | ||
| h.single_task<Mult>([=] { res[0] = x[0] * y[0]; }); |
There was a problem hiding this comment.
Should we instead use one function to demonstrate all of these? I’m worried we’re adding a lot of boilerplate to show something relatively simple.
There was a problem hiding this comment.
Perhaps one function per class of operation?
There was a problem hiding this comment.
Tanner pointed this out as well. I want to know what Paul thinks about this. One function can be used for the add, sub, mult operations - it could even be just one kernel.
There was a problem hiding this comment.
Hi @paul-white-intel please comment on this one as well, thank you!
There was a problem hiding this comment.
We can make one function for everything using the right C++ fun. I can chat test something out in Godbolt and send it to Abhishek.
There was a problem hiding this comment.
I like having them separate from an IPA point of view; it makes the reports trivial to understand.
An alternative would be to put each of the functions in a separate CPP file, like I did for each of the parts here: https://github.com/intel-sandbox/whitepau.simple-oneapi
I’m worried we’re adding a lot of boilerplate to show something relatively simple.
this is the nature of SYCL. lots of boilerplate.
There was a problem hiding this comment.
I'd say no to extra files, since it doesn't actually fix the problem, but rather distribute it to different files.
this is the nature of SYCL. lots of boilerplate.
Untrue. Poorly written SYCL has lots of boilerplate. SYCL has more boilerplate than HLS, sure, but that is because it does more than an HLS component. C++ can have lots of boiler plate too, if it is written poorly. We shouldn't avoid using common C++ features because we are worried about our users not understanding C++. If we think there is a gap, we should explain what is happening instead of avoiding a method to improve code reuse.
| // create the SYCL device queue | ||
| queue q(selector, dpc_common::exception_handler); | ||
|
|
||
| // Initialize two random ints |
There was a problem hiding this comment.
Use a fixed srand() value so that we’re not truly random so it makes it easier to reproduce results here.
There was a problem hiding this comment.
I would not use rand() at all... I would want to explicitly show a negative value and a positive value.
Also, why not use ac_int types for t1 and t2 instead of native int types? the truncation should happen automatically.
There was a problem hiding this comment.
why not use ac_int types for t1 and t2 instead of native int types?
I think it is to emphasize that the int vs ac_int comparison
How about I replace it with :
int t1 = 624
int t2 = -254
{
int golden = t1 + t2;
ac_int14 a = t1;
ac_int14 b = t2;
ac_int15 c;
TestAdd(q, a, b, c);
...
}
There was a problem hiding this comment.
can this be written as
ac_int14 a = 624;
ac_int14 b = -254;
{
int golden = a + b;
ac_int15 c;
TestAdd(q, a, b, c);
...
}
DirectProgramming/DPC++FPGA/Tutorials/Features/ac_types/ac_int/src/ac_int.hpp
Outdated
Show resolved
Hide resolved
DirectProgramming/DPC++FPGA/Tutorials/Features/ac_types/ac_int/README.md
Outdated
Show resolved
Hide resolved
DirectProgramming/DPC++FPGA/Tutorials/Features/ac_types/ac_int/README.md
Outdated
Show resolved
Hide resolved
| make fpga | ||
| ``` | ||
|
|
||
| 3. (Optional) As the above hardware compile may take several hours to complete, FPGA precompiled binaries (compatible with Linux* Ubuntu* 18.04) can be downloaded <a href="https://iotdk.intel.com/fpga-precompiled-binaries/latest/ac_int.fpga.tar.gz" download>here</a>. |
There was a problem hiding this comment.
do we offer precompiled binaries for all the code samples? I don't recall seeing this before.
DirectProgramming/DPC++FPGA/Tutorials/Features/ac_types/ac_int/README.md
Outdated
Show resolved
Hide resolved
| ```cpp | ||
| #include <sycl/ext/intel/ac_types/ac_int.hpp> | ||
| ``` | ||
| Additionally, you must use the flag `-qactypes` in order to ensure that the headers are correctly included. |
There was a problem hiding this comment.
cue the 'You didn't have to do that in HLS' complaints :)
There was a problem hiding this comment.
You didn't have to for SYCL before either.
I think the change that resulted in this would affect HLS too, no?
There was a problem hiding this comment.
No, HLS branches are separate hence HLS is not affected.
There was a problem hiding this comment.
Right. I am saying, if HLS wasn't being deprecated, it would have been updated too.
There was a problem hiding this comment.
@tyoungsc that... doesn't make it better.
do you at least get a clear actionable error message if you try to use an ac-type without including -qactypes?
There was a problem hiding this comment.
You'd get an include error - it will not be able to find the ac_types header.
There was a problem hiding this comment.
hmm. does the error literally say, "make sure you compile with -qactypes"? otherwise, I can see this causing problems.
There was a problem hiding this comment.
Unfortunately no. But that is clearly highlighted on the first page of the AC types Documentation. Let me see what we can do to generate that error message.
DirectProgramming/DPC++FPGA/Tutorials/Features/ac_types/ac_int/README.md
Outdated
Show resolved
Hide resolved
DirectProgramming/DPC++FPGA/Tutorials/Features/ac_types/ac_int/README.md
Outdated
Show resolved
Hide resolved
DirectProgramming/DPC++FPGA/Tutorials/Features/ac_types/ac_int/README.md
Outdated
Show resolved
Hide resolved
| - If the user knows that the shift direction and that the shift value is always positive, it is advisable to use an unsigned datatype for the shift value to obtain better QoR. | ||
| - Shift values of greater than the width of the datatypes are treated as a shift equal to the width of the datatype. | ||
|
|
||
| Further, the operation can be done more efficiently by specifying the amount to shift with the smallest possible ac_int. |
There was a problem hiding this comment.
this is a similar follow-on to your recommendation to use unsigned datatypes for shift operators, so it should be a bullet in the list above.
DirectProgramming/DPC++FPGA/Tutorials/Features/ac_types/ac_int/README.md
Outdated
Show resolved
Hide resolved
| ## License | ||
|
|
||
| Code samples are licensed under the MIT license. See | ||
| [License.txt](https://github.com/oneapi-src/oneAPI-samples/blob/master/License.txt) for details. |
There was a problem hiding this comment.
do we need a license for the ac_data_types_ref.pdf?
There was a problem hiding this comment.
Let me follow up on this.
| std::cout << "\nBitwise Operations:\n"; | ||
| // Shift operator | ||
| { | ||
| t1 = rand() & ((1 << 13) - 1); |
There was a problem hiding this comment.
does it make sense to put the testbench code into the TestXXX() functions? that would make the main() function tidier. I believe the main point of separating them in the hls version was to separate the device code from testbench code, but that abstraction doesn't really work in oneAPI anyway, since the TestXXX() functions already contain host code.
There was a problem hiding this comment.
Good idea, I think that makes sense. I'll update it once we reach consensus on whether or not to use templated functions for the TestXXX() functions.
There was a problem hiding this comment.
You could make functions that accept accessors and then all of the code in the "single_task" lamda would be a call to a function (passing it accessors). Then your functions would be just device code.
| | Software | Intel® oneAPI DPC++ Compiler <br> Intel® FPGA Add-On for oneAPI Base Toolkit | ||
| | What you will learn | Using the ac_int data-type for basic operations <br> Efficiently using the left shift operation <br> Setting and reading certain bits of an ac_int number | ||
| | What you will learn | Using the `ac_int` data-type for basic operations <br> Efficiently using the left shift operation <br> Setting and reading certain bits of an `ac_int` number | ||
| | Time to complete | 20 minutes |
There was a problem hiding this comment.
How do we quantify this 'time to complete'? I think all the code samples are just 20 minutes :)
There was a problem hiding this comment.
I view it as time to go through the README and observe the reports.
There was a problem hiding this comment.
Correct, it is roughly "time to read to README, not including HW compile time". This is a crude estimate, but gives the reader a decent idea.
Internal review of ac_int tutorial. This is a port of the HLS ac_int/ac_int_basic_ops tutorial.
Testing:
[DONE] Linux Local compile
[TODO] Linux Local compile with CMAKE
[TODO] Linux Regtest
[TODO] Windows Local compile with CMAKE
[TODO] Windows Regtest