So, I had a conversation with @eerhardt and @yaeldekel about the nature of models, in particular relating to the saving and loading routines. This is very important for us to get right, since persisting the artifacts of what we learn and how we transform data is probably the most important thing we have to do correctly.
We take the view initially that the model is the ITransformer (note that a chain of ITransformers is itself an ITransformer). But by itself this is an insufficient description, as we saw in #2663 and its subsequent "child" #2735. From the point of view of the model being practically "the stuff you need to keep," there's a lot more to a machine learning model than just the ITransformer chain -- you also need to preserve some notion of what the input to that chain is. So in those changes we added the ability to pass either a loader or the input schema, to be saved as part of the model.
Yet, is the loader a model itself? Sometimes that's precisely what we call it:
    public void Save<TSource>(IDataLoader<TSource> model, Stream stream)
And in the same file we call it something else:
    public void Save<TSource>(IDataLoader<TSource> loader, ITransformer model, Stream stream) =>
It is a model in one sense, because it is yet another thing that takes input data and produces output data -- the fact that ITransformer does this over IDataView input specifically does not give it some magical, special status that lets it be called a model to the exclusion of other candidates. If I take a loader and append a transform to it, the whole aggregate thing is still a loader. If it isn't a model, it only isn't one by the mere skin of its teeth. Hence the presence of that loader-only method, and why the model operations catalog has operations to save and load IDataLoader itself specifically.
But at the same time this duality of the term "model" is, as I understand @eerhardt, confusing. We have two things we're calling a model. In an ideal world, if we can get away with just one story of what the model is, we should take it, and if there must be only one, it must be ITransformer. We even have the situation where, if you type mlContext.Model.Save(, the first overload that pops up is the IDataLoader one, which is kind of strange.
I am not quite sure what I think, since on this subject I tend to agree with whoever talked to me last with even a vaguely convincing argument. But I think in this case I will see about getting rid of the IDataLoader-only APIs -- people can, if it is important, continue to save and load such things by pairing the loader with an empty ITransformer chain (again, any chain of ITransformers is itself an ITransformer, including the empty chain).
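To make the "empty chain" argument concrete, here is a minimal, self-contained sketch. Everything in it is a toy stand-in invented for illustration (IToyTransformer, IToyLoader, ToyChain, ToyModel, and the string "rows"); none of it is the actual ML.NET API, and it says nothing about how ML.NET serializes anything. It only shows why a loader paired with an empty chain behaves exactly like the loader alone, so a single loader-plus-transformer shape can cover the loader-only case.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Toy stand-ins, NOT the ML.NET interfaces. "Rows" are plain strings here.
public interface IToyTransformer
{
    IEnumerable<string> Transform(IEnumerable<string> input);
}

public interface IToyLoader<TSource>
{
    IEnumerable<string> Load(TSource source);
}

// A chain of transformers is itself a transformer.
// With zero links, Aggregate returns its seed unchanged, so the empty chain is the identity.
public sealed class ToyChain : IToyTransformer
{
    private readonly IToyTransformer[] _links;

    public ToyChain(params IToyTransformer[] links) => _links = links;

    public IEnumerable<string> Transform(IEnumerable<string> input)
        => _links.Aggregate(input, (rows, link) => link.Transform(rows));
}

// A trivial transformer, used only to show a non-empty chain.
public sealed class UpperCase : IToyTransformer
{
    public IEnumerable<string> Transform(IEnumerable<string> input)
        => input.Select(s => s.ToUpperInvariant());
}

// A trivial loader: one row per line of the source text.
public sealed class LineLoader : IToyLoader<string>
{
    public IEnumerable<string> Load(string source) => source.Split('\n');
}

// The single hypothetical "thing you keep": a loader paired with a transformer.
public sealed class ToyModel<TSource>
{
    public IToyLoader<TSource> Loader { get; }
    public IToyTransformer Transformer { get; }

    public ToyModel(IToyLoader<TSource> loader, IToyTransformer transformer)
    {
        Loader = loader;
        Transformer = transformer;
    }

    public IEnumerable<string> Run(TSource source)
        => Transformer.Transform(Loader.Load(source));
}
```

In this sketch, `new ToyModel<string>(new LineLoader(), new ToyChain())` is the "loader-only" case: running it yields exactly what the loader yields, while `new ToyModel<string>(new LineLoader(), new ToyChain(new UpperCase()))` is an ordinary loader-plus-transformer model. Both go through the same shape, which is the consistency the proposal is after.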
Since we are approaching v1, I view it as a bit more important to be conservative w.r.t. potentially confusing additions to the API, especially around something as central as the saving and loading of models. We might be able to add the loader-only APIs back later if there's some really compelling scenario for them that we somehow did not anticipate.
We will of course retain the saving and loading of transformers together with loaders, since that is really important to be able to capture, but I think being consistent about the story that "models are transformers," as we are in most places, is kind of important.