
Conversation

@AakashKumarNain
Member

Added Cohen's Kappa as a new metric.

@AakashKumarNain AakashKumarNain requested a review from a team as a code owner June 1, 2019 10:26
Member

@WindQAQ WindQAQ left a comment


Hi @AakashKumarNain, thanks for the contribution! I wonder if I misunderstand something here:

Shouldn't update_state aggregate the previous statistics, like the other metrics in tf.keras.metrics do? For example:

import numpy as np
import tensorflow as tf

y_true = np.array([0, 1, 0, 1, 0])
y_pred = np.array([0, 1, 0, 1, 1])
m = tf.keras.metrics.BinaryAccuracy()
m.update_state(y_true, y_pred)  # 1st batch accuracy: 0.8
print(m.result().numpy())  # 0.8

y_true = np.array([0, 1, 0, 1, 0])
y_pred = np.array([0, 1, 0, 1, 0])
m.update_state(y_true, y_pred)  # 2nd batch accuracy: 1.0
print(m.result().numpy())  # 0.9, accumulated over both batches: (0.8 + 1) / 2

m.reset_states()  # reset the accumulated state if needed

Seems that the implementation now only takes current input into account. Is there any reason that we have to only compute the current input's Cohen's Kappa instead of accumulating states and computing overall Cohen's Kappa? Please correct me if I misunderstand. cc @facaiy.

EDIT: #265 (comment)

@WindQAQ WindQAQ mentioned this pull request Jun 3, 2019
@AakashKumarNain
Member Author

Thanks @WindQAQ. I am aware that the state needs to be accumulated, but I couldn't figure out how to do that. @facaiy, can you elaborate a bit on how to achieve this?

@facaiy
Member

facaiy commented Jun 6, 2019

@AakashKumarNain I think tf.keras.metrics.AUC is a good example. We should save/update intermediate variables (say, the confusion matrix) rather than the result (the kappa score).
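
Roughly along these lines (just a sketch of the pattern; the class and variable names are illustrative, not taken from the PR):

import tensorflow as tf

class CohenKappaSketch(tf.keras.metrics.Metric):
    """Illustrative only: accumulate a confusion matrix, derive kappa in result()."""

    def __init__(self, num_classes, name="cohen_kappa", dtype=tf.float32):
        super(CohenKappaSketch, self).__init__(name=name, dtype=dtype)
        self.num_classes = num_classes
        # The intermediate state is the running confusion matrix,
        # not the kappa score itself.
        self.conf_mtx = self.add_weight(
            name="conf_mtx",
            shape=(num_classes, num_classes),
            initializer="zeros",
            dtype=tf.int64)

    def update_state(self, y_true, y_pred, sample_weight=None):
        new_conf_mtx = tf.math.confusion_matrix(
            tf.reshape(y_true, [-1]),
            tf.reshape(y_pred, [-1]),
            num_classes=self.num_classes,
            dtype=tf.int64)
        # Merge the current batch into the accumulated statistics.
        return self.conf_mtx.assign_add(new_conf_mtx)

    def result(self):
        cm = tf.cast(self.conf_mtx, tf.float64)
        total = tf.reduce_sum(cm)
        observed = tf.reduce_sum(tf.linalg.diag_part(cm)) / total
        expected = tf.reduce_sum(
            tf.reduce_sum(cm, axis=0) * tf.reduce_sum(cm, axis=1)) / (total * total)
        return tf.cast((observed - expected) / (1.0 - expected), self.dtype)

    def reset_states(self):
        self.conf_mtx.assign(tf.zeros_like(self.conf_mtx))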

@facaiy
Member

facaiy commented Jun 6, 2019

@AakashKumarNain
Member Author

AakashKumarNain commented Jun 6, 2019

Thank you @facaiy for the information.

@WindQAQ @facaiy I think I have found a very elegant solution to the problem. I cannot think of anything better than this. Please take a look at this notebook. I will make the changes in the PR once you are okay with the solution, though the graph-mode errors show up again in the test cases.

https://colab.research.google.com/drive/10CNyrnq10RUTHssUcfSdtIyYdGnC5bGD

@facaiy
Member

facaiy commented Jun 9, 2019

@AakashKumarNain Aakash, the new solution looks good :-)

@facaiy facaiy added the metrics label Jun 9, 2019
@AakashKumarNain
Member Author

@facaiy Thanks Yan. Can you please help me with the errors in the test cases? I have shown them in the same notebook.

@facaiy
Member

facaiy commented Jun 10, 2019

Could you remove the self.kappa_score variable, Aakash? It seems that we don't need it anymore.

@AakashKumarNain
Member Author

It didn't help. Now the error is raised due to the conf_mtx variable that depends on the self.conf_mtx variable defined in the constructor.

@facaiy
Member

facaiy commented Jun 10, 2019

I'm not so sure. It seems that we have to initialize the variables ourselves or use the metric in a Keras model; please refer to the metrics' test cases :-)

https://github.com/tensorflow/tensorflow/blob/r2.0/tensorflow/python/keras/metrics_test.py

@AakashKumarNain
Member Author

I checked that and I am doing the same thing. This line self.conf_mtx.assign_add(new_conf_mtx) is causing the problem. The AssignAdd operation is throwing errors.

@WindQAQ
Member

WindQAQ commented Jun 10, 2019

Hi, @AakashKumarNain, you have to initialize variables within the metric object:

self.evaluate(tf.compat.v1.variables_initializer(kp_obj1.variables))

Here is the revised notebook.
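
For reference, a minimal sketch of how that initialization fits inside a tf.test.TestCase; kp_obj1 is assumed to be a CohenKappa instance, and the assertion is deliberately loose (it only checks that a second update changes the accumulated result), so this is not the PR's actual test:

import numpy as np
import tensorflow as tf
import tensorflow_addons as tfa


class CohenKappaTest(tf.test.TestCase):

    def test_update_state_accumulates(self):
        kp_obj1 = tfa.metrics.CohenKappa(num_classes=2)
        # The metric's variables (e.g. the confusion matrix) are not
        # initialized automatically in graph mode, so do it explicitly.
        self.evaluate(tf.compat.v1.variables_initializer(kp_obj1.variables))

        y_true = np.array([0, 1, 0, 1, 0])
        y_pred = np.array([0, 1, 0, 1, 1])
        self.evaluate(kp_obj1.update_state(y_true, y_pred))
        first = self.evaluate(kp_obj1.result())

        # A second, perfectly matching batch should be merged into the
        # accumulated confusion matrix, so the result should change.
        self.evaluate(kp_obj1.update_state(y_true, y_true))
        second = self.evaluate(kp_obj1.result())
        self.assertNotAlmostEqual(first, second)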

@AakashKumarNain
Member Author

Thank you @WindQAQ for looking into it. Can you give me access to the notebook?

@WindQAQ
Member

WindQAQ commented Jun 10, 2019

Sorry about that... Already shared it. Please click the link above again.

@AakashKumarNain
Member Author

Got it. Thanks.

@Squadrick
Member

@AakashKumarNain I'm trying to figure that out. Could you tell me your TF 2.x version, from tf.__version__?

@AakashKumarNain
Member Author

AakashKumarNain commented Jun 12, 2019

@facaiy @WindQAQ @Squadrick I am on 2.0.0-beta0. The funny thing is that it is now failing on tf.initializers, which shouldn't be the case anyway.

Another funny thing is that everything works with Py3 and all test cases pass, but something is always off with Py2. Py2 support is going to end in December; maybe that is the sign 😆

Either the TF version under Py2 is very different from the one under Py3, or something is seriously broken.

@seanpmorgan
Member

> @facaiy @WindQAQ @Squadrick I am on 2.0.0-beta0. The funny thing is that it is now failing on tf.initializers, which shouldn't be the case anyway.
>
> Another funny thing is that everything works with Py3 and all test cases pass, but something is always off with Py2. Py2 support is going to end in December; maybe that is the sign
>
> Either the TF version under Py2 is very different from the one under Py3, or something is seriously broken.

Sorry, I haven't been tracking this too closely. If you're referring to why the py3 tests are passing in our CI, it's probably because there is no py34 tf2-nightly available, so it's running on an old version. See #279

@AakashKumarNain
Member Author

But @seanpmorgan, no matter the version, tf.initializers should be there. It is like tf.nn, an important part of the API, right?

@seanpmorgan
Member

seanpmorgan commented Jun 12, 2019

See #278 and #273. I asked but didn't look into when this API enforcement occurred. (So tf.keras.initializers should work.) The py34 test is passing on an outdated nightly, I believe.

@seanpmorgan
Member

I checked on py36 and there is no alias for tf.initializers anymore.

@Squadrick
Member

Can confirm: tf.initializers is not found in the latest TF 2.x nightly for Python 3.6, but tf.keras.initializers works.
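
For anyone hitting the same thing, the fix is essentially a one-line substitution; the conf_mtx variable below is just a hypothetical example of where such an initializer might be used, not the PR's exact code:

import tensorflow as tf

# The tf.initializers alias is gone from recent TF 2.x nightlies:
#     initializer = tf.initializers.zeros()   # AttributeError
# Go through tf.keras.initializers instead:
initializer = tf.keras.initializers.Zeros()

# Hypothetical usage, e.g. a non-trainable confusion-matrix variable.
conf_mtx = tf.Variable(
    initial_value=initializer(shape=(2, 2), dtype=tf.int64),
    trainable=False,
    name="conf_mtx")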

@AakashKumarNain
Member Author

@Squadrick I fixed it. Can you trigger the test again, please?

@AakashKumarNain
Member Author

Thanks @Squadrick. Finally! This has taken an enormous amount of time. Thanks @facaiy, @WindQAQ, and @seanpmorgan for all the guidance and effort.

PS: Is there a list where we are keeping track of all the removed APIs?

@Squadrick Squadrick merged commit 977d96e into tensorflow:master Jun 12, 2019
@Squadrick
Member

@AakashKumarNain Thanks again for the contribution.

@seanpmorgan
Member

> Thanks @Squadrick. Finally! This has taken an enormous amount of time. Thanks @facaiy, @WindQAQ, and @seanpmorgan for all the guidance and effort.
>
> PS: Is there a list where we are keeping track of all the removed APIs?

Not that I know of, but it may be worth asking. The best place I know of to look is https://github.com/tensorflow/tensorflow/blob/master/tensorflow/tools/compatibility/all_renames_v2.py

But that mostly uses compat.v1 as the replacement. And thank you very much for putting in the time to get it all working! It sets a great example for stateful metrics in Addons!

@AakashKumarNain AakashKumarNain deleted the add_kappa branch June 13, 2019 10:09
@llu025

llu025 commented Nov 5, 2019

Hello,

I am using tfa.metrics.CohenKappa and something seems to be really wrong. While tf.keras.metrics.Accuracy is working fine, CohenKappa produces values above 1 (sometimes above 2).

I have a dictionary of metrics:

self.change_map_metrics = {
    "ACC": tf.keras.metrics.Accuracy(),
    "cohens kappa": tfa.metrics.CohenKappa(num_classes=2),
}

and then after each training epoch I call this:

def _compute_metrics(self, y_true, y_pred, metrics):
    """
    Compute the metrics specified in metrics.
    Write results to self.tb_writer
    Input:
        y_true - tensor (n, )
        y_pred - tensor (n, )
        metrics - dict {name: tf.metrics class instance}
    Output:
        None
    """
    y_true, y_pred = tf.reshape(y_true, [-1]), tf.reshape(y_pred, [-1])
    for name, metric in metrics.items():
        metric.update_state(y_true, y_pred)
        with self.tb_writer.as_default():
            tf.summary.scalar(name, metric.result())
        metric.reset_states()

As you can see on TensorBoard, it does not really make any sense.

[TensorBoard screenshot: the logged Cohen's kappa curve goes above 1]

I am using TF 2.0 and the docker image tensorflow/tensorflow:2.0.0-gpu.
Please let me know if you need further details.

@WindQAQ
Member

WindQAQ commented Nov 5, 2019

@llu025 Hi, is it possible to dump y_true and y_pred in your program? Thank you.

@llu025

llu025 commented Nov 5, 2019

> @llu025 Hi, is it possible to dump y_true and y_pred in your program? Thank you.

I am not really sure what you mean by "dump". If you are interested in what they contain, they are boolean, so either 0s or 1s.

@WindQAQ
Member

WindQAQ commented Nov 5, 2019

> > @llu025 Hi, is it possible to dump y_true and y_pred in your program? Thank you.
>
> I am not really sure what you mean by "dump". If you are interested in what they contain, they are boolean, so either 0s or 1s.

I mean their exact values, the ones that could make the kappa score > 1.

@AakashKumarNain
Member Author

Yeah, it would really help with debugging if you could provide the values that produced this score.

@llu025

llu025 commented Nov 5, 2019

I printed 20 consecutive values from a random index for both y_pred and y_true; this is an example from one epoch:

[screenshot: printed boolean y_true and y_pred values]

I thought the problem might have been the use of boolean labels instead of integers, so I added this in my _compute_metrics function:

if y_pred.dtype == tf.bool:
    y_true, y_pred = tf.cast(y_true, tf.uint8), tf.cast(y_pred, tf.uint8)

so I dumped them again:

[screenshot: printed y_true and y_pred values after the cast to uint8]

but I still get kappa above 1:

[TensorBoard screenshot: the kappa curve still exceeds 1]

@WindQAQ
Member

WindQAQ commented Nov 5, 2019

Thank you!
@AakashKumarNain I found an abnormal test case that makes it go above 1:
https://colab.research.google.com/drive/1lA0mc0prr0XB4buKOdf44WvKi_xMXtIt
I guess it's due to int32 overflow.

@AakashKumarNain
Member Author

@WindQAQ Thanks for the test cases. Yeah, it seems that we need to change the dtype to int64.
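
To make the overflow concrete, a rough illustration (not the metric's exact code): the expected-agreement term of Cohen's kappa multiplies row and column totals of the accumulated confusion matrix, and with int32 those products wrap around once the counts are large enough, which is how the score can leave the [-1, 1] range.

import tensorflow as tf

# Marginal totals of a large confusion matrix, e.g. pixel counts over an epoch.
row_totals = tf.constant([3_000_000, 1_000_000], dtype=tf.int32)
col_totals = tf.constant([2_900_000, 1_100_000], dtype=tf.int32)

# int32 products silently wrap around, corrupting the expected-agreement term.
print(tf.reduce_sum(row_totals * col_totals).numpy())

# Casting to int64 keeps the arithmetic exact.
print(tf.reduce_sum(
    tf.cast(row_totals, tf.int64) * tf.cast(col_totals, tf.int64)).numpy())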

@WindQAQ
Member

WindQAQ commented Nov 5, 2019

This should be fixed in #675 :D

@AakashKumarNain
Member Author

AakashKumarNain commented Nov 5, 2019

> This should be fixed in #675 :D

Thank you @WindQAQ. I checked the same on my end; it seems that the dtype was the issue.
https://colab.research.google.com/drive/1vPGX11biU3CepL3wHp0tZocihu-d9fgK
