- 
                Notifications
    You must be signed in to change notification settings 
- Fork 579
RFC: StreamExecutor C API #257
Conversation
Defined a portable bool type TF_BOOL. Added structure size definitions and updated usage examples. Fixed typos where member functions are supposed to be member function pointers. Fixed missing extern "C".
| TF_Status* status); | ||
|  | ||
| // Causes the host code to synchronously wait for the event to complete. | ||
| void (*block_host_for_event)( | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi, @annarev It seems that block_host_until_done is removed, can we keep this API? block_host_until_done is a more large-grained API, it provides more flexibility for the backend to implement, and lots of device runtime already provide the capability to match this API(CUDA: cuStreamSynchronize(stream),SYCL queue.wait() ). I saw current implementation is composed of create_event+ record_event + block_host_for_event + delete event, that means each back-end needs to implements all of these API, which limits the flexibility, and I am not sure whether it will bring some performance penalty (for DPC++, we need to enqueue a barrier to retrieve the event)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| I am planning to add the following  typedef struct SP_DeviceDescription {
  size_t struct_size;
  
  // Device hardware name. Used for printing.
  // Must be null-terminated.
  char* name;
  
  // Device vendor name. Used for printing.
  // Must be null-terminated.
  char* device_vendor;
  // Returns the PCI bus identifier for this device, of the form
  // [domain]:[bus]:[device].[function]
  // Used for printing.
  // Must be null-terminated.
  char* pci_bus_id;
  TF_Bool has_numa_node;  // True if `numa_node` is set.
  // Returns the NUMA node associated with this device, for use in
  // determining socket locality.
  int numa_node;
  TF_Bool has_memory_bandwidth;  // True if `memory_bandwidth` is set.
  // Device's memory bandwidth in bytes/sec.  (This is for reads/writes to/from
  // the device's own memory, not for transfers between the host and device.)
  int64_t memory_bandwidth;
  TF_Bool has_gflops;  // True if `gflops` field is set.
  // Estimate of floating point operations per second for this device * 10e-9.
  double gflops;
} SP_DeviceDescription; | 
* Added the Implementation Conventions and Usage Overview section. * Replaced the Stability / User Impact with the Versioning Strategy and Stability section. Also moved the section higher. * Updated the API design according to the real implementation. * Updated code examples.
…gableDevice RFC (PR tensorflow#262). * Referenced semver and TensorFlow's Version Compatibility page. * Added the Updating Guidelines, Convention, and Detecting Incompatibility sections. * Merged the current code examples with @yisitu's newer ones on PR tensorflow#262 and simplified them a bit.
… stream_executor_rfc
…to stream_executor_rfc
Also some more minor edits.
Co-authored-by: Situ Yi <[email protected]>
Co-authored-by: Situ Yi <[email protected]>
Changed "shall" to "must" to make sure people would not misinterpret "shall".
Updates based on recent changes and discussion on RFC PR tensorflow#262
| 
 Thanks!. This looks good to me. Just to confirm, all the versioning information is subsumed in struct_size, and we don't need any extra fields in  | 
| 
 Yes, we rely on  | 
| This RFC has been up for a while. The implementation has even been checked in over a month ago. I think we should wrap this up (i.e., merge this PR). Changes to StreamExecutor C API since the last combined design review meeting with PluggableDevice on 8/13/20: 
 If anyone has anything else that needs to be discussed/resolved before this RFC can be merged (i.e., fundamental issues), please explicitly say so here by this Thursday, 9/24 at 11:59pm PT. | 
| 
 Thanks Penporn, for reaching out. Nothing from our side. | 
| @kulinseth Thank you for the quick confirmation! :) @ematejska @alextp I believe there is no outstanding issue now. Could we move forward? | 
| Yes, let's merge the PR.… On Fri, Sep 25, 2020 at 10:13 AM Penporn Koanantakool < ***@***.***> wrote:
 @kulinseth <https://github.com/kulinseth> Thank you for the quick
 confirmation! :)
 @ematejska <https://github.com/ematejska> @alextp
 <https://github.com/alextp> I believe there is no outstanding issue now.
 Could we move forward?
 —
 You are receiving this because you were mentioned.
 Reply to this email directly, view it on GitHub
 <#257 (comment)>,
 or unsubscribe
 <https://github.com/notifications/unsubscribe-auth/AAABHRL4JHCQHDI5DPSTMLLSHTFTRANCNFSM4N4SSC6Q>
 .
 -- 
 - Alex | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
| @penpornk Hi, is there any PluggableDevice template for beginner? TKS | 
The feedback phase will be open for two weeks until Tuesday June 30, 2020.
Objective
Provide basic device management C API to allow new devices to modularly connect
to the current TensorFlow runtime.