Skip to content

Conversation

ayushkarnawat
Copy link
Contributor

@ayushkarnawat ayushkarnawat commented Jun 18, 2020

Types of changes

  • Docs change / refactoring / dependency upgrade
  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Motivation and context / Related issue

Python 3.8 no longer supports forking a process (rather it spawns it on a new thread) on MacOS. In ideal conditions -- that is when the code can be trivially pickled to be sent to the individual threads -- this doesn't lead to any problems. However, cython code (which allows the CPP code to be called from within python) cannot be easily pickled. As such, to avoid picking errors and to fix the problems associated with forking a process on windows as well, it is better to use parallel processing in the core CPP code (w/ OpenMP). Resolves #187.

How has this been tested (if it applies)

Checklist

  • The documentation is up-to-date with the changes I made.
  • I have read the CONTRIBUTING document.
  • All tests passed, and additional code has been covered with new tests.

@ayushkarnawat
Copy link
Contributor Author

ayushkarnawat commented Jun 18, 2020

Sorry for the delay in getting this started. There are things that still need to be completed before a review. In particular,

  • Proper compile flags for OpenMP across all OSes (setup.py)
  • OpenMP within CI
  • Remove parmap (not needed anymore)
  • Don't require OpenMP dependency?? (Code can still compile/run, but EMD computation will be a lot slower)
    • Modify bonneel code
    • Tests to check if OpenMP is supported (w/in pkg)
    • Fail fast OpenMP code (checks if OpenMP is available in the user's system)
    • Modify CI pipeline to run tests that require/don't require OpenMP (add env variables to ensure proper compilation of CPP code)
  • Update cython min version to 0.28.5 (for py35)
  • Rename files, i.e. EMD_wrapper.cpp -> emd.cpp (or something similar)

@rflamary rflamary changed the title OpenMP EMD [WIP] OpenMP EMD Jun 19, 2020
run: |
# install OpenMP (via homebrew)
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install.sh)"
brew install libomp
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you also keep a macos build without openmp so we can check it works on a fresh machine too? thx

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes indeed i would keep the classical build on macosx and add a build/test with openmp

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I was planning on doing this (just after I fixed the issues with the full build)

@ayushkarnawat
Copy link
Contributor Author

@agramfort @rflamary Some insight/pointers would be appreciated.

Although the package is being installed properly on all platforms (with correct flags it seems), it is for some reason not recognizing get_omp_num_threads within the Linux and Mac builds, respectively. These work flawlessly on my local build, so can't quite pin down the issue with the CI builds. Interestingly, it works on Windows without a problem.

NOTE: The tests need to be modified to address the new changes, hence why some tests fail.

@rflamary
Copy link
Collaborator

Looks to me as if it builds bu do not link openmp.

There is an option for the cython builder called extra_link_args that can be used for that I guess. I would try it with the same value that you add to compile args.

@ayushkarnawat
Copy link
Contributor Author

Thanks for the quick reply. I'll give that a try!

@agramfort
Copy link
Collaborator

did you look how scikit-learn does it ? https://github.com/scikit-learn/scikit-learn/blob/master/setup.py#L137

it took a lot of time to @jeremiedbb to make it work on scikit-learn.

@ayushkarnawat
Copy link
Contributor Author

@agramfort I did look into it, but would it be premature optimization? I say this because the implementation in scipy looks like it is meant to handle many submodules that contain C/C++ code to compile properly. Although I agree that approach is better (i.e. cleaner and more robust for future additions of C/C++ submodule(s)), we currently only have one module where C/C++ code is being used, hence why I thought it might have been easier to define how to cythonize the EMD module in setup.py directly.

Alternatively, since we have to check if OpenMP is supported within a user's system (so that we can run EMD computation with/without OpenMP support), we may have to end up mimicking how scipy is approaching this since the (a) compile flags would change accordingly. and (b) the cythonization would change depending on whether the system could support OpenMP.

@agramfort
Copy link
Collaborator

@ayushkarnawat are you suggesting that making sure it works on systems with or without openmp support automatically is premature optimization?

@ayushkarnawat
Copy link
Contributor Author

ayushkarnawat commented Jun 28, 2020

Perhaps I was not clear enough. What I meant to say was the way scipy handles checking if OpenMP is supported in the the scipy/_build_utils module. From looking at the code, it seems this was done to:

  1. check if a user's system supports OpenMP (through a small test program)
  2. to define cythonization of submodules within each respective submodule's setup.py.

I agree that (1) should also be done in POT since checking if the user's system supports OpenMP is important (as mentioned previously). This allows us to add the proper openmp linker to the compiler so that the generated .so file can use parallel acceleration if the user's system supports it, and fallback into single-threaded use, if that is not possible.

What I meant by "premature optimization" was concerning (2). This is because it seems like it was meant to support cythonization of multiple C++ src files contained within different submodules, which we (currently) do not have with POT. To be pedantic, I don't think we need to define the cythonization in the fashion that scipy does it. I guess we simply can forgo that portion, if you agree.

Let me know what you think and if this makes sense.

@agramfort
Copy link
Collaborator

agramfort commented Jun 29, 2020 via email

@rflamary
Copy link
Collaborator

Thank you @ayushkarnawat for this work, You have motivated us to look into openMP also.

we have another PR that has done the OpenMP implementation (actually both single thread and openMP) and currently build on all OS in #260 so I will close this PR.

@rflamary rflamary closed this Sep 17, 2021
@ayushkarnawat
Copy link
Contributor Author

Very cool, apologies for not finishing this earlier; got busy with other things and sort of forgot about it. I'm happy to actually seeing this actually being done. :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Unable to compute multiple EMD loss
3 participants