Too much macro dependence? #820

@ChrisRackauckas

Description

I like Turing.jl, I really do. But one thing that seems to get in the way all of the time is the @model macro. DSLs are usually constructed as a way to constrain the possible inputs so that a compilation process can be written on a simplified form. Example: Stan wrote its own derivatives for every term for its AD, so it's constrained to an interface where you can use only the terms it defined. For JuMP, it needs to know what terms are linear, quadratic, integer, etc. in order to specialize, so it has a DSL that specifically captures that information.

Using DSLs isn't that great in many circumstances because, well, you might not have a full programming language available to you. But even more importantly, in many cases you have to write into another language, which means a script needs to be built at compile time. Again, with Stan you build a whole program as a string.

Turing.jl sits in an odd location of design space because it has a DSL-based interface... but it doesn't need to. Turing.jl is built on things like ForwardDiff.jl and Tracker.jl which are language-wide AD systems, so in theory any Julia code could work. In practice, things that are definable in the macro are what is allowed. @model is very good with allowing arbitrary Julia functions, but it still has the issue that, as a macro, it is evaluated at compile time. So like Stan, if you want to programmatically create models, you have to interpolate into a compile time script and then run the compiler. This goes unnoticed in a lot of Turing.jl usage because a lot of users are writing models in the global scope, in which case there's implicitly an eval happening after each command.

However, this gets tricky when defining a Turing model inside a function because functions run at runtime, not at compile time, so the macro is expanded before the function's argument values are known. For example, let's look at the DiffEqBayes.jl integration with the ODE solvers. Say we had a list of variable names, syms, that we want to appear in the output. In theory we could do

syms = [:a, :b]
function (syms, priors, ....)
    @model bif(x) = begin
        for i in 1:length(priors)
            syms[i] ~ priors[i]
        end
        ...
    end
end
but that would make a bunch of variables named syms, not variables named after the symbol values stored in syms[i], so this is distinctly different from writing a ~ priors[1]. If you want the output chains to have names like [:a,:b], you could construct an expression for the @model, interpolate

$(syms[i]) ~ priors[i]

and then eval the expression, but I think it's clear that this shows the model isn't truly a macro.
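To make the interpolation point concrete, here is a minimal sketch in plain Julia (no Turing.jl required) showing how interpolating the values of syms into a quoted expression changes what the code means; the body variable and the final commented eval line are illustrative, not a documented Turing API:

```julia
# Each `$(syms[i])` interpolates the *value* of syms[i] (e.g. the symbol :a)
# into the expression, so the generated statement reads `a ~ priors[1]`,
# not `syms[i] ~ priors[i]`.
syms = [:a, :b]

# Assemble the tilde statements into a block expression (illustrative helper code).
body = Expr(:block, (:($(syms[i]) ~ priors[$i]) for i in 1:length(syms))...)

# The first statement is literally `a ~ priors[1]`:
@assert body.args[1] == :(a ~ priors[1])

# To actually define the model you would splice `body` into an @model
# expression and eval it at runtime, e.g. (sketch only):
# eval(:(@model bif(x) = begin $body end))
```

This is exactly the "build a script, then run the compiler" workflow the post describes, which is why it feels like the macro is incidental rather than essential.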

At its core, the issue is that a model isn't actually a macro, because it doesn't have to use compile-time information: the model is actually a function, and the macro is just a nice way to construct it. The simplest solution, of course, is to document the internals, as in https://turing.ml/docs/advanced/, but I find it interesting that there are no test cases or tutorials showing how to use them, given that this is what you'd need to do if you don't want the whole structure defined at compile time.

With #819, and by reconstructing the output chain (https://github.com/TuringLang/MCMCChains.jl#parameter-names) with new names, I can probably hack my way around this to get the output I want without resorting to eval, using only what's documented/tested/expected to be used. But I think this is something to keep in mind for future tutorials.
