The current HardwareSerial implementation is very primitive and based on polling.
This blocks the rest of the program unnecessarily, especially when sending data.
A DMA-based implementation would relieve the CPU, but it is not well suited to the variable amounts of data being transferred, so I propose a traditional interrupt-based implementation.
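
To make the idea concrete, here is a minimal sketch of interrupt-driven transmission built around a ring buffer. It is only an illustration, not the proposed code: `FakeUart`, `TxRing` and `onTxEmpty` are hypothetical names, and the UART is simulated so the sketch can run on a host machine. On real hardware, `onTxEmpty` would be called from the transmit-register-empty interrupt, and the write to `wire` would be a write to the MCU's data register.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical stand-in for the UART peripheral (assumption, not a real API).
struct FakeUart {
    std::string wire;           // bytes "transmitted" so far
    bool txIrqEnabled = false;  // models the TX-empty interrupt enable bit
};

// Fixed-size transmit ring buffer; holds at most N - 1 bytes.
template <std::size_t N>
class TxRing {
public:
    explicit TxRing(FakeUart& uart) : uart_(uart) {}

    // Queue one byte without waiting; returns false if the buffer is full.
    bool write(std::uint8_t b) {
        std::size_t next = (head_ + 1) % N;
        if (next == tail_) return false;  // full: one slot is kept free
        buf_[head_] = b;
        head_ = next;
        uart_.txIrqEnabled = true;        // let the interrupt start draining
        return true;
    }

    // Body of the TX-empty interrupt handler: sends one byte per call and
    // disables the interrupt once the buffer has been drained.
    void onTxEmpty() {
        if (head_ == tail_) {
            uart_.txIrqEnabled = false;
            return;
        }
        uart_.wire.push_back(static_cast<char>(buf_[tail_]));
        tail_ = (tail_ + 1) % N;
        if (head_ == tail_) uart_.txIrqEnabled = false;
    }

    // Number of bytes still waiting to be sent.
    std::size_t pending() const { return (head_ + N - tail_) % N; }

private:
    FakeUart& uart_;
    std::array<std::uint8_t, N> buf_{};
    // Shared between the caller and the interrupt handler, hence volatile.
    volatile std::size_t head_ = 0;
    volatile std::size_t tail_ = 0;
};
```

With this structure, `write` returns immediately as long as there is room in the buffer, and the caller decides what to do when it is full (wait, drop, or retry later). A real port would also need to guard the shared indices against concurrent access from the interrupt handler, which this host-side sketch does not model.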