FEA add ValueDifferenceMetric as a pairwise metric #796
Hello @glemaitre! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:
Comment last updated at 2021-02-14 18:28:41 UTC |
Could you add some bench code to this branch so we can test the VDM performance in order to improve it? |
Yep. We could do that. I did some profiling and I am actually not sure that we can speed up the computation. |
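A minimal sketch of what such bench code might look like, assuming only NumPy. The metric used here is a Hamming-style stand-in on ordinal-encoded categories, not the VDM from this PR, and the helper names (`pairwise_loop`, `pairwise_vectorized`, `best_time`) are hypothetical. It only illustrates how one could time a per-pair Python callable against a vectorized computation of the same distances.

```python
import time

import numpy as np


def hamming_pair(x1, x2):
    # Stand-in metric (NOT the VDM): number of mismatching categories.
    return float(np.sum(x1 != x2))


def pairwise_loop(X, metric):
    """Call a Python metric once per pair of rows (the slow path)."""
    n_samples = X.shape[0]
    D = np.empty((n_samples, n_samples))
    for i in range(n_samples):
        for j in range(n_samples):
            D[i, j] = metric(X[i], X[j])
    return D


def pairwise_vectorized(X):
    """Compute the same stand-in metric with broadcasting."""
    return (X[:, None, :] != X[None, :, :]).sum(axis=2).astype(float)


def best_time(func, *args, repeat=3):
    """Return the best wall-clock time (in seconds) over `repeat` runs."""
    timings = []
    for _ in range(repeat):
        start = time.perf_counter()
        func(*args)
        timings.append(time.perf_counter() - start)
    return min(timings)


rng = np.random.RandomState(0)
X = rng.randint(0, 5, size=(200, 10))  # ordinal-encoded categories

t_loop = best_time(pairwise_loop, X, hamming_pair)
t_vec = best_time(pairwise_vectorized, X)
print(f"per-pair callable: {t_loop:.4f}s, vectorized: {t_vec:.4f}s")
```

The same harness could be pointed at the actual metric by swapping in its pairwise function; the timings depend on the machine, so no reference numbers are given here.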
Codecov Report
@@           Coverage Diff           @@
##           master     #796   +/-   ##
=======================================
  Coverage   98.62%   98.62%
=======================================
  Files          89       89
  Lines        5881     5881
  Branches      494      494
=======================================
  Hits         5800     5800
  Misses         80       80
  Partials        1        1
Continue to review full report at Codecov.
@chkoar I am wondering if this metric should be public or private. Indeed, it required some |
We could add it without a leading |
It seems to be 20x slower than the current implementation. I think that we are indeed good to go. |
I am starting to think that it could live in the documentation as well. |
In the case that we use fit, we could probably inherit from the base estimator, since it estimates from data.

    class ValueDifferenceMetric(BaseEstimator):
        def __init__(self, k=1, r=2): ...
        def fit(self, X, y):
            # learning unique classes here
            ...
        def pairwise(self, X, Y=None): ...

On the other hand, another option would be to require the data in the init.

    class ValueDifferenceMetric:
        def __init__(self, X, y, k=1, r=2): ...
        def pairwise(self, X, Y=None): ...

Additionally, we could implement the callable API.

    vdm = ValueDifferenceMetric(...)
    distance = vdm(x1, x2)

Design-wise, I would be in favor of the following in order to use the DistanceMetric API, but it is way too slow.

    class ValueDifferenceMetric:
        def __init__(self, X, y): ...
        def __call__(self, x1, x2, k=1, r=2): ...

    vdm = ValueDifferenceMetric(X, y)
    metric = DistanceMetric.get_metric(vdm)
    metric.pairwise(X)
    # or
    knn = KNearestNeighbors(metric=metric)

All of this assumes that X is ordinal-encoded with ints.
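To make the fit/pairwise option above more concrete, here is a rough sketch assuming NumPy only. The class name `SimpleVDM` is hypothetical and this is not the implementation from this PR. It assumes the usual value difference metric definition: for each feature, the difference between two category values is the sum over classes of |P(c | value1) - P(c | value2)|^k, and the distance between two samples is the sum over features of those differences raised to the power r. It also assumes X is ordinal-encoded with non-negative ints.

```python
import numpy as np


class SimpleVDM:
    """Hypothetical sketch of a value difference metric with fit/pairwise."""

    def __init__(self, k=1, r=2):
        self.k = k
        self.r = r

    def fit(self, X, y):
        X = np.asarray(X)
        y = np.asarray(y)
        self.classes_ = np.unique(y)
        # One (n_categories, n_classes) table of P(class | category) per feature.
        self.proba_per_feature_ = []
        for f in range(X.shape[1]):
            n_categories = X[:, f].max() + 1
            counts = np.zeros((n_categories, self.classes_.size))
            for c_idx, c in enumerate(self.classes_):
                counts[:, c_idx] = np.bincount(
                    X[y == c, f], minlength=n_categories
                )
            totals = counts.sum(axis=1, keepdims=True)
            # Categories never observed during fit keep all-zero probabilities.
            proba = np.divide(
                counts, totals, out=np.zeros_like(counts), where=totals > 0
            )
            self.proba_per_feature_.append(proba)
        return self

    def pairwise(self, X, Y=None):
        # Categories larger than those seen in fit raise an IndexError here.
        X = np.asarray(X)
        Y = X if Y is None else np.asarray(Y)
        D = np.zeros((X.shape[0], Y.shape[0]))
        for f, proba in enumerate(self.proba_per_feature_):
            px = proba[X[:, f]]  # shape (n_samples_X, n_classes)
            py = proba[Y[:, f]]  # shape (n_samples_Y, n_classes)
            delta = (np.abs(px[:, None, :] - py[None, :, :]) ** self.k).sum(axis=2)
            D += delta ** self.r
        return D


X = np.array([[0, 1], [0, 0], [1, 1], [1, 0]])
y = np.array([0, 0, 1, 1])
D = SimpleVDM(k=1, r=2).fit(X, y).pairwise(X)
print(D)
```

In this toy data the first feature separates the two classes perfectly while the second carries no class information, so only the first feature contributes to the distances.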
Can we get rid of that after the sampling? |
yes, it is just temporary for the sampling for the NN search. |
Co-authored-by: Christos Aridas <[email protected]>
@chkoar I think that I would like to see this PR merged as is and open another one for Do you see anything else to add? |
Agreed. |
Co-authored-by: Christos Aridas <[email protected]>
OK I think this is good to be merged. I fixed the issue with what's new. @chkoar Feel free to merge. |
Done |