If one of the claims is that we are better than the state of the art, we need to compare against pycg.

- [ ] design evaluation
- [x] have test examples
- [ ] bring corpus from pycg
- [ ] design new use cases
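
One possible shape for the "design evaluation" item, as a rough sketch only: compare the call-graph edges we produce against an expected call graph and report precision and recall. This assumes both graphs are available as adjacency lists (a mapping from caller name to a list of callee names, the shape of pycg's JSON output). The helper names `edges` and `precision_recall`, the sample graphs, and the choice of edge-level precision/recall as the metric are all assumptions for illustration, not a settled design.

```python
from typing import Dict, Iterable, Set, Tuple

Edge = Tuple[str, str]
CallGraph = Dict[str, Iterable[str]]


def edges(call_graph: CallGraph) -> Set[Edge]:
    """Flatten an adjacency-list call graph into a set of (caller, callee) edges."""
    return {
        (caller, callee)
        for caller, callees in call_graph.items()
        for callee in callees
    }


def precision_recall(predicted: CallGraph, expected: CallGraph) -> Tuple[float, float]:
    """Edge-level precision and recall of `predicted` against `expected`.

    An empty edge set is scored as 1.0 to avoid dividing by zero
    (hypothetical convention, adjust as needed).
    """
    pred, exp = edges(predicted), edges(expected)
    true_positives = len(pred & exp)
    precision = true_positives / len(pred) if pred else 1.0
    recall = true_positives / len(exp) if exp else 1.0
    return precision, recall


# Made-up example graphs, not taken from any real corpus.
expected = {"main": ["main.f", "main.g"], "main.f": [], "main.g": []}
ours = {"main": ["main.f"], "main.f": ["main.g"]}

precision, recall = precision_recall(ours, expected)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Running the same function over our output and over pycg's output for each test example would give two comparable sets of numbers per example.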