
Conversation

@bedapisl (Contributor)

Profiling is switched on by the model.profile variable in the YAML config:

model:
    profile: True

The profile is saved in the log_dir of the model and can be viewed in Google Chrome at chrome://tracing/.
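For context, chrome://tracing consumes a plain JSON list of trace events. A minimal hand-rolled sketch of that format (the event names and timings here are made up for illustration; the PR itself generates the file via TensorFlow's timeline module):

```python
import json

# Chrome trace format: a dict with a "traceEvents" list; "ph": "X" marks a
# complete event with a start timestamp ("ts") and duration ("dur"), both in
# microseconds, plus process ("pid") and thread ("tid") ids.
events = [
    {"name": "conv1", "ph": "X", "ts": 0, "dur": 1500, "pid": 0, "tid": 0},
    {"name": "relu1", "ph": "X", "ts": 1500, "dur": 200, "pid": 0, "tid": 0},
]
trace_json = json.dumps({"traceEvents": events})
# Writing trace_json to <log_dir>/profile.json makes it loadable in chrome://tracing.
```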

@bedapisl bedapisl requested a review from FloopCZ January 28, 2019 16:52
@bedapisl bedapisl changed the title Add profiling of neural networks Profiling of neural networks Jan 28, 2019
@FloopCZ (Contributor) left a comment:

Very nice functionality, thank you @bedapisl for pointing it out!
Can you please add the guidelines on how to use it? It should be visible here: http://emloop.org/advanced/model.html

outputs = self._session.run(fetches=fetches, feed_dict=feed_dict,
options=run_options, run_metadata=run_metadata)

with open(path.join(self._log_dir, "profile.json"), "w") as ofile:

Suggested change
with open(path.join(self._log_dir, "profile.json"), "w") as ofile:
with open(path.join(self._log_dir, 'profile.json'), 'w') as ofile:

options=run_options, run_metadata=run_metadata)

if self._profile:
with open(path.join(self._log_dir, "profile.json"), "w") as ofile:

Suggested change
with open(path.join(self._log_dir, "profile.json"), "w") as ofile:
with open(path.join(self._log_dir, 'profile.json'), 'w') as ofile:

"""Test frozen model restoration."""
with pytest.raises(ValueError):
FrozenModel(inputs=[], outputs=[], restore_from=tmpdir) # there is no .pb file yet
FrozenModel(log_dir="/dev/null", inputs=[], outputs=[], restore_from=tmpdir) # there is no .pb file yet

Suggested change
FrozenModel(log_dir="/dev/null", inputs=[], outputs=[], restore_from=tmpdir) # there is no .pb file yet
FrozenModel(log_dir='/dev/null', inputs=[], outputs=[], restore_from=tmpdir) # there is no .pb file yet


# restore from directory
FrozenModel(**_IO, restore_from=tmpdir)
FrozenModel(log_dir="/dev/null", **_IO, restore_from=tmpdir)

Suggested change
FrozenModel(log_dir="/dev/null", **_IO, restore_from=tmpdir)
FrozenModel(log_dir='/dev/null', **_IO, restore_from=tmpdir)


# restore from file
FrozenModel(**_IO, restore_from=path.join(tmpdir, 'model.pb'))
FrozenModel(log_dir="/dev/null", **_IO, restore_from=path.join(tmpdir, 'model.pb'))

Suggested change
FrozenModel(log_dir="/dev/null", **_IO, restore_from=path.join(tmpdir, 'model.pb'))
FrozenModel(log_dir='/dev/null', **_IO, restore_from=path.join(tmpdir, 'model.pb'))


# restore from directory
frozen_model = FrozenModel(**_IO, restore_from=tmpdir, session_config={'allow_soft_placement': True})
frozen_model = FrozenModel(log_dir="/dev/null", **_IO, restore_from=tmpdir, session_config={'allow_soft_placement': True})

Suggested change
frozen_model = FrozenModel(log_dir="/dev/null", **_IO, restore_from=tmpdir, session_config={'allow_soft_placement': True})
frozen_model = FrozenModel(log_dir='/dev/null', **_IO, restore_from=tmpdir, session_config={'allow_soft_placement': True})

model.save('')

frozen_model = FrozenModel(inputs=['input'], outputs=['output'], restore_from=tmpdir)
frozen_model = FrozenModel(log_dir="/dev/null", inputs=['input'], outputs=['output'], restore_from=tmpdir)

Suggested change
frozen_model = FrozenModel(log_dir="/dev/null", inputs=['input'], outputs=['output'], restore_from=tmpdir)
frozen_model = FrozenModel(log_dir='/dev/null', inputs=['input'], outputs=['output'], restore_from=tmpdir)


# restore from directory
frozen_model = FrozenModel(**_IO, restore_from=tmpdir, session_config={'allow_soft_placement': True})
frozen_model = FrozenModel(log_dir="/dev/null", **_IO, restore_from=tmpdir, session_config={'allow_soft_placement': True})

Line too long.

if self._profile:
with open(path.join(self._log_dir, "profile.json"), "w") as ofile:
tl = timeline.Timeline(run_metadata.step_stats)
ofile.write(tl.generate_chrome_trace_format())

This seems to overwrite the profile file on every call to run with only the statistics from this single call. Shouldn't it rather create the run_metadata variable in the constructor and consider the statistics from all the calls to run?
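The reviewer's suggestion (statistics from all calls, not just the last one) could be sketched by accumulating the per-run trace events and dumping them together. This is an illustrative pure-Python sketch, not the PR's implementation; the class and method names are made up, and in the real code the events would come from timeline.Timeline:

```python
import json

class ProfileAccumulator:
    """Collects Chrome trace events across multiple run() calls (illustrative)."""

    def __init__(self):
        self._events = []

    def add_run(self, events, offset_us):
        # Shift each run's timestamps so consecutive runs do not overlap
        # when displayed in chrome://tracing.
        for event in events:
            shifted = dict(event)
            shifted["ts"] = event["ts"] + offset_us
            self._events.append(shifted)

    def dump(self):
        # Serialize everything collected so far into one trace file.
        return json.dumps({"traceEvents": self._events})

acc = ProfileAccumulator()
acc.add_run([{"name": "run0_op", "ph": "X", "ts": 0, "dur": 10, "pid": 0, "tid": 0}], 0)
acc.add_run([{"name": "run1_op", "ph": "X", "ts": 0, "dur": 12, "pid": 0, "tid": 0}], 1000)
merged = json.loads(acc.dump())
```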


with open(path.join(self._log_dir, "profile.json"), "w") as ofile:
tl = timeline.Timeline(run_metadata.step_stats)
ofile.write(tl.generate_chrome_trace_format())

Ditto statistics from a single call.


Well, this is a bit tricky. Both first (warm-up) and last (possibly smaller batch) profiles may be inaccurate. So what do we want to actually save? I guess keeping last keep_profiles: int = 5 is reasonable.
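The "keep the last keep_profiles profiles" idea could be sketched as a rolling window of per-run profiles. The class and file-naming scheme below are hypothetical; only the keep_profiles default of 5 comes from the reviewer's suggestion:

```python
from collections import deque

class RollingProfiles:
    """Keeps only the last `keep_profiles` run profiles (illustrative sketch)."""

    def __init__(self, keep_profiles: int = 5):
        # deque with maxlen automatically evicts the oldest entry.
        self._profiles = deque(maxlen=keep_profiles)

    def add(self, run_index: int, trace: dict) -> None:
        self._profiles.append((run_index, trace))

    def filenames(self):
        # One file per kept run, e.g. profile_3.json ... profile_7.json.
        return ["profile_%d.json" % i for i, _ in self._profiles]

profiles = RollingProfiles(keep_profiles=5)
for run in range(8):  # 8 runs; only the last 5 survive
    profiles.add(run, {"traceEvents": []})
```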

@blazekadam (Contributor) commented Jan 28, 2019

Whoa, @FloopCZ I thought we already gave up the " vs ' thing...

@blazekadam (Contributor) left a comment:

Thank you for the code @bedapisl, I am sure it will eventually be very useful in other cases. Let's just pay a bit more attention to it so that we can merge it into this rather stable (well-tested) project.

  • make log_dir optional in both models (and avoid modifying the tests)
  • implement reasonable behaviour with respect to multiple run calls (you have two suggestions from me and @FloopCZ )
  • restructure the code so that duplication is avoided
  • explain this feature to @gdynusa and ask her for tests once you do the above
  • mention this feature in the docs as @FloopCZ suggests

Anyways, thank you again!

def __init__(self,
inputs: List[str], outputs: List[str], restore_from: str,
session_config: Optional[dict]=None, n_gpus: int=0, **_):
log_dir: str, inputs: List[str], outputs: List[str], restore_from: str,

The log_dir argument should rather be optional, as it was previously, if I am not mistaken. Of course, we should sanitize the arguments similarly to this:

if profile and not log_dir:  # works for both None and empty string
    raise ValueError('log_dir has to be specified with profile set to True')

"""
Initialize new :py:class:`FrozenModel` instance.
:param log_dir: path to the logging directory (wherein models should be saved)

This docstring is inaccurate, as FrozenModel cannot be saved. Is it solely for the profile, or is it not?

:param restore_from: restore model path (either a dir or a .pb file)
:param session_config: TF session configuration dict
:param n_gpus: number of GPUs to use (either 0 or 1)
:param profile: whether profile.json should be saved to log_dir

Suggested change
:param profile: whether profile.json should be saved to log_dir
:param profile: if true, profile the speed of model inference and save profile.json to the specified log_dir

self._graph = tf.Graph()
if session_config:
session_config = tf.ConfigProto(**session_config)


Is this intentional?


with open(path.join(self._log_dir, "profile.json"), "w") as ofile:
tl = timeline.Timeline(run_metadata.step_stats)
ofile.write(tl.generate_chrome_trace_format())

Well, this is a bit tricky. Both first (warm-up) and last (possibly smaller batch) profiles may be inaccurate. So what do we want to actually save? I guess keeping last keep_profiles: int = 5 is reasonable.

"""Name of the monitored signal variance tensor/output."""

def __init__(self, # pylint: disable=too-many-arguments
dataset: Optional[el.AbstractDataset], log_dir: Optional[str], inputs: List[str], outputs: List[str],

I would make the log_dir indeed optional with default None.

@bedapisl (Author) commented Feb 8, 2019:

Is this in some way related to this pull request?

:param monitor: monitor signal mean and variance of the tensors whose names contain the specified value
:param restore_fallback: ignored arg. (allows training from configs saved by emloop where it is added)
:param clip_gradient: limit the absolute value of the gradient; set to None for no clipping
:param profile: whether profile.json should be saved to log_dir

ditto description

for output_name in self.output_names:
fetches.append(tower[output_name])

run_options = None

This is more or less the same code as in FrozenModel. Can you perhaps wrap session.run in some util function which would be used in both classes?

project = 'emloop-tensorflow'
copyright = '2018, Iterait a.s.'
author = 'Blazek Adam, Belohlavek Petr, Matzner Filip'
author = 'Blazek Adam, Belohlavek Petr, Matzner Filip, Bedrich Pisl'

nice :)


Nice except that Beda's name and surname are in the wrong order. :-)


whatever

@@ -0,0 +1,27 @@
Profiling networks

Thank you for the docs.

@blazekadam blazekadam merged commit 6b2e648 into dev Feb 9, 2019
@blazekadam blazekadam deleted the profiling branch February 9, 2019 13:40