Right now the resolution of the result seems to be limited to 1 ns. That is enough for expensive benchmarks, but it may not be enough for cheap operations that take only a few ns. Averaging over many evaluations could improve it.
I run into this mainly when benchmarking low-level operations, e.g. in JuliaLang/julia#16174, where the optimized version of g2 takes only 1.2 ns per loop. It would be nice if I didn't have to write my own loops for these.
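
For reference, the kind of hand-written loop I mean looks roughly like the sketch below. `mean_time_ns` is a made-up helper name, not something the package provides; it only assumes `time_ns()` from Base. It times `n` calls together and divides, so the reported mean is not capped at the 1 ns clock resolution.

```julia
# Hypothetical helper (not part of any package): estimate the mean cost of
# one call to `f(x)` by timing `n` calls together and dividing.
function mean_time_ns(f, x, n::Integer)
    acc = f(x)                  # warm-up call, so compilation is not timed
    t0 = time_ns()              # integer nanoseconds from Base
    for _ in 1:n
        acc += f(x)             # accumulate so the calls have a visible result
    end
    elapsed = time_ns() - t0    # total time for all n calls, in ns
    return elapsed / n, acc     # mean ns per call, plus the accumulated value
end

per_call, _ = mean_time_ns(sqrt, 2.0, 10_000_000)
println("about ", per_call, " ns per call")
```

This is only a rough sketch: the figure includes loop overhead, and the compiler may hoist or fold a cheap call out of the loop, so the number is only meaningful if the work really happens on every iteration.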