[ML] Convert std::shared_ptrs to std::unique_ptrs where possible #108

tveasey · 2018-05-29T11:03:26Z

Many of our current uses of shared pointer date from when not all our target platforms were C++11 compliant. Now that they all are, we can switch these to std::unique_ptr. This has a few advantages:

It saves us 16 bytes (8 for the extra pointer to the reference count and 8 for the reference count) per instance
It avoids atomic updates of the reference count
It forces one to have the correct copy semantics in such cases. For example, this revealed that we had an implicit shallow copy (not appropriate) in our naive Bayes implementation, and fixing this exposed a bug in restore.
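The size and copy-semantics points can be illustrated with a small sketch. The types `SWindow` and `CHolder` and the function `copyIsDeep` are hypothetical and are not taken from the ml-cpp codebase; the exact sizes depend on the platform and standard library.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical payload type, used only for illustration.
struct SWindow {
    double s_Value;
};

// Holding a std::unique_ptr member deletes the implicit copy constructor,
// so the class author has to decide what a copy means. Here it is a deep copy.
class CHolder {
public:
    explicit CHolder(double value) : m_Window(new SWindow{value}) {}

    // Explicit deep copy: with std::shared_ptr the compiler-generated copy
    // would silently have shared the pointee instead.
    CHolder(const CHolder& other)
        : m_Window(other.m_Window ? new SWindow(*other.m_Window) : nullptr) {}

    CHolder(CHolder&&) = default;

    // Copy-and-swap assignment, returning a non-const reference.
    CHolder& operator=(CHolder other) {
        m_Window.swap(other.m_Window);
        return *this;
    }

    double& value() {
        assert(m_Window != nullptr);
        return m_Window->s_Value;
    }

    double value() const {
        assert(m_Window != nullptr);
        return m_Window->s_Value;
    }

private:
    std::unique_ptr<SWindow> m_Window;
};

// Returns true if modifying a copy leaves the original untouched.
inline bool copyIsDeep() {
    CHolder original(1.0);
    CHolder copy(original);
    copy.value() = 2.0;
    return original.value() == 1.0 && copy.value() == 2.0;
}
```

On mainstream standard library implementations a `std::unique_ptr` with the default deleter is the size of a raw pointer, whereas a `std::shared_ptr` also carries a pointer to its control block.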

Whilst making this change I also spotted an error in memory accounting for shared pointers, which wasn't including the extra memory for the reference count. Note that I've split out the fix to memory accounting and a change to provide support for unique pointers in some of our utility functionality into separate commits.

This has no direct impact on results. Again there can be an impact from the memory reduction causing a change of behaviour in jobs which hit the hard or soft memory limits.

hendrikmuhs · 2018-05-29T12:55:08Z

lib/maths/CTimeSeriesDecompositionDetail.cc

    for (std::size_t i = 0u; !isForForecast && i < other.m_Windows.size(); ++i) {
        if (other.m_Windows[i]) {
-            m_Windows[i] = std::make_shared<CExpandingWindow>(*other.m_Windows[i]);
+            m_Windows[i].reset(new CExpandingWindow{*other.m_Windows[i]});


I know make_unique is C++14, but could we use a backport, which we can throw away once we have switched?

Herb Sutter has an implementation here: https://herbsutter.com/gotw/_102/
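For reference, a C++11 backport for the non-array case can be very small. This is a minimal sketch only: it is neither Herb Sutter's code nor Boost's, the namespace `compat` and the function `example` are hypothetical names, and the array forms `make_unique<T[]>(n)` are omitted.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

namespace compat {

// Stand-in for std::make_unique on a C++11 compiler (single objects only).
// Arguments are perfectly forwarded to T's constructor.
template<typename T, typename... ARGS>
std::unique_ptr<T> make_unique(ARGS&&... args) {
    return std::unique_ptr<T>(new T(std::forward<ARGS>(args)...));
}

} // namespace compat

// Example use: builds a string of three 'x' characters on the heap.
inline std::string example() {
    std::unique_ptr<std::string> s = compat::make_unique<std::string>(3, 'x');
    assert(s != nullptr);
    return *s;
}
```

Because the helper returns a plain `std::unique_ptr<T>`, call sites could later be switched to `std::make_unique` with a search and replace.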

Another possibility for using std::make_unique is to change the -std argument for the Mac/Linux compilers to c++14/gnu++14 respectively on the 6.x branch.

Even though Visual Studio 2013 doesn't support very much of C++14, one bit it does support is std::make_unique. So switching the standard filtering used by the other compilers would mean all of them support std::make_unique.

People would just need to remember that only a tiny fraction of C++14 is available in the 6.x branch. From https://msdn.microsoft.com/en-us/library/hh567368.aspx:

Visual C++ in Visual Studio 2013 implemented some key C++14 library features:

"Transparent operator functors" less<>, greater<>, plus<>, multiplies<>, and so on.

make_unique<T>(args...) and make_unique<T[]>(n)

cbegin()/cend(), rbegin()/rend(), and crbegin()/crend() non-member functions.

Sorry, missed your comment @droberts195. I migrated to use boost::make_unique. What do you both think? Is this ok as a stopgap measure, or would it be better to use this C++14 feature? Personally, I feel it is okay for the time being and that we should perhaps migrate to C++14 in one go, but I don't have a strong opinion.

I wondered if boost::make_unique might return boost::unique_ptrs, but just checked and it really does create std::unique_ptrs. So I guess it was designed for exactly this case of a compiler supporting std::unique_ptr but not std::make_unique.

I'm happy to use it until 6.x is for bug fixes only and then someone can do a search and replace of boost::make_unique with std::make_unique on the master branch.

Ok cool, I'll leave as I have it.

hendrikmuhs · 2018-05-29T14:33:36Z

include/core/CMemory.h

                ss << "shared_ptr (x" << uc << ')';
                // Round up
-                mem->addItem(ss.str(), (CMemory::staticSize(*t) +
+                mem->addItem(ss.str(), (sizeof(long) + CMemory::staticSize(*t) +


nit: I suggest adding a code comment saying that sizeof(long) is for the ref counter
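The idea behind the accounting change can be sketched as follows. This is an assumption-laden illustration, not the code in CMemory.h: the helper names `ownerShare` and `ownerShareOf` are hypothetical, only the static size of the pointee is counted, and the rounded-up division by the use count is inferred from the "Round up" comment and the "x uc" label in the diff.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical helper: the share of a shared object's memory attributed to
// each owner, including the reference count (sizeof(long)), rounded up.
inline std::size_t ownerShare(std::size_t objectSize, std::size_t useCount) {
    assert(useCount > 0);
    // sizeof(long) accounts for the reference counter.
    std::size_t total = sizeof(long) + objectSize;
    // Integer division rounded up so the shares never undercount the total.
    return (total + useCount - 1) / useCount;
}

// Convenience overload taking the pointer itself (static size only).
template<typename T>
std::size_t ownerShareOf(const std::shared_ptr<T>& ptr) {
    if (ptr == nullptr) {
        return 0;
    }
    return ownerShare(sizeof(T), static_cast<std::size_t>(ptr.use_count()));
}
```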

hendrikmuhs · 2018-05-29T14:40:49Z

include/maths/CClustererStateSerialiser.h

 class MATHS_EXPORT CClustererStateSerialiser {
 public:
-    using TClusterer1dPtr = std::shared_ptr<CClusterer1d>;
+    using TClusterer1dPtr = std::unique_ptr<CClusterer1d>;


General problem:

In other parts of the code we use UPtr, so TClusterer1dPtr would be TClusterer1dUPtr.

I thought about this. The problem with renaming the type of pointer is that it makes changes like this much more difficult. Everywhere the typedef is referred to suddenly has to change as well. So this change will become a whole lot bigger. Also we don't use SPtr for shared_ptr for example which would be consistent with this pattern.

Whilst we probably don't need to change this type again, I don't like that an "implementation detail" (type of smart pointer) gets mixed up with the name of the type. In some cases we can't avoid this because we have a shared_ptr<T> and a unique_ptr<T> in the same scope but in most cases we can.

Thoughts?

I am fine with that. In fact I agree with what you wrote: for me, one purpose of a typedef/using is to abstract away the implementation detail, and putting it back in the name defeats this.

I just wanted to point it out. If you grep the codebase for UPtr you find a small number of uses, but I could not find a rule in either our new or old style guide.

Any other comments on this?

Yes I think our naming of pointer types is not completely consistent. I would propose something like the following rules:

If there are no name collisions, all pointer types should just be suffixed with Ptr

If there would be a name collision, then all pointer types should be fully qualified: shared_ptr as SPtr, unique_ptr as UPtr, auto_ptr as APtr and weak_ptr as WPtr.

Let's take this discussion offline though. If we get agreement on this we can revisit the codebase and tidy up the naming to conform to this in a separate PR.
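As a sketch of how the two proposed rules might look in practice (the classes `CClusterer`, `CSerialiserExample` and `CCacheExample` are hypothetical and only illustrate the naming; `auto_ptr` is left out because it was removed in C++17):

```cpp
#include <memory>
#include <type_traits>

// Hypothetical pointee type.
class CClusterer {};

// Rule 1: only one kind of smart pointer to CClusterer in scope, so the
// alias is just suffixed with Ptr and hides the ownership detail.
class CSerialiserExample {
public:
    using TClustererPtr = std::unique_ptr<CClusterer>;
};

// Rule 2: several kinds of pointer to the same type would collide, so every
// alias in the scope is fully qualified.
class CCacheExample {
public:
    using TClustererSPtr = std::shared_ptr<CClusterer>;
    using TClustererUPtr = std::unique_ptr<CClusterer>;
    using TClustererWPtr = std::weak_ptr<CClusterer>;
};
```

Under the first rule, changing the alias from `std::shared_ptr` to `std::unique_ptr` does not require renaming it at every point of use.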

hendrikmuhs · 2018-05-29T14:49:13Z

include/maths/CNaiveBayes.h

+
+        //! Get the number of examples in this class.
+        double count() const;
+        //! Set the number of examples in this class.


nit: It's not really a setter but exposes count for writes, maybe just "access to number of ... "

True. I started off making this a setter, but then decided to make it return a writable reference.
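The accessor pair being discussed can be sketched like this. The class name `CClassCounts` and the helper `addExamples` are hypothetical; only the `count()` signatures follow the diff and the reply above.

```cpp
#include <cassert>

// Hypothetical stand-in for the per-class state in the naive Bayes model.
class CClassCounts {
public:
    //! Get the number of examples in this class.
    double count() const { return m_Count; }

    //! Writable access to the number of examples in this class.
    double& count() { return m_Count; }

private:
    double m_Count = 0.0;
};

// Adds n examples through the writable reference and returns the new count.
inline double addExamples(CClassCounts& counts, double n) {
    assert(n >= 0.0);
    counts.count() += n;
    return counts.count();
}
```

Returning a reference lets callers update the count in place, which is why the reviewer suggests documenting it as access rather than as a setter.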

hendrikmuhs · 2018-05-30T06:38:57Z

lib/maths/CModelStateSerialiser.cc

 #include <maths/CTimeSeriesModel.h>

 #include <boost/bind.hpp>
+#include <boost/make_unique.hpp>


superfluous? this class does not use make_unique

Although it wasn't, this is now being used (see this commit).

tveasey · 2018-05-30T08:51:38Z

@hendrikmuhs, doing some more testing I realised that I had subverted the caching mechanism for new time series models and correlated pairs residual priors. This means we ended up with duplicates of these objects. Also, there was no need to cache the correlation model: it can't be shared. We were previously caching, but then cloning the value passed in anyway. I now pass this in by rvalue reference and move into place. I have corrected this behaviour in my last commit which is worth reviewing.
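Passing an owned object by rvalue reference and moving it into place, instead of cloning the argument, might look like the following minimal sketch. `SPriorExample`, `CModelExample` and the member names are hypothetical and do not reflect the real model classes.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical prior type.
struct SPriorExample {
    double s_Mean;
};

using TPriorExamplePtr = std::unique_ptr<SPriorExample>;

class CModelExample {
public:
    // Takes ownership of the caller's prior: no clone and no shared ownership.
    void correlationModel(TPriorExamplePtr&& prior) {
        m_Prior = std::move(prior);
    }

    const SPriorExample& prior() const {
        assert(m_Prior != nullptr);
        return *m_Prior;
    }

private:
    TPriorExamplePtr m_Prior;
};
```

The caller has to write `std::move` explicitly, which makes the transfer of ownership visible at the call site.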

hendrikmuhs

LGTM

tveasey added 5 commits May 25, 2018 10:09

Memory accounting missing shared count size and also unique_ptr implementation

f0479df

Checksum and orderings missing unique pointer implementations

2a35a2c

Migrate (unshared) shared_ptrs to unique_ptr. Fix a bug restoring CNaiveBayes

21ed7dc

Merge branch 'master' into enhancement/unique-ptrs

96ec215

Fix unit test

14b65f2

tveasey added v7.0.0 >non-issue :ml v6.4.0 affects-results labels May 29, 2018

tveasey requested a review from hendrikmuhs May 29, 2018 11:03

tveasey removed the >non-issue label May 29, 2018

hendrikmuhs reviewed May 29, 2018

tveasey added 2 commits May 29, 2018 16:31

Migrate to use boost::make_unique (and some other cases where we should have been using make_shared)

27c00b7

Some review comments

ff1be72

hendrikmuhs reviewed May 30, 2018

We do need to share new models

f3b30df

tveasey and others added 4 commits May 30, 2018 10:01

Update change log

6ec167e

Assignment operator should return by non-const reference (for chaining). Plus prefer C++11 style copying deletion

8239159

Merge branch 'master' into enhancement/unique-ptrs

b769626

Update comment

31d3081

hendrikmuhs approved these changes May 30, 2018

tveasey added 4 commits May 30, 2018 14:40

VC fails with the multivariate prior not being in scope for destruction (even though it is for the explicitly declared destructor in CTimeSeriesModel.cc). Hey ho.

048c3e8

Formatting fixes

4239b53

Another VC work around

1f6ce15

VC fails trying to instantiate some assignment operators which aren't actually used (hopefully deleting them will stop it)

9ea3252

More of the same

c8e2a1a

tveasey merged commit 71dc485 into elastic:master May 30, 2018

tveasey mentioned this pull request May 30, 2018

[6.x][ML] Convert std::shared_ptrs to std::unique_ptrs where possible #111

Closed

droberts195 mentioned this pull request May 31, 2018

[ML] Failures in ML tests in UpgradeClusterClientYamlTestSuiteIT elastic/elasticsearch#30982

Closed

tveasey added a commit to tveasey/ml-cpp-1 that referenced this pull request May 31, 2018

Revert "[ML] Convert std::shared_ptrs to std::unique_ptrs where possible (elastic#108)" This reverts commit 71dc485.

aa3ca5b

tveasey added a commit that referenced this pull request May 31, 2018

Revert "[ML] Convert std::shared_ptrs to std::unique_ptrs where possible (#108)" (#114) This reverts commit 71dc485.

c6f3be3

tveasey mentioned this pull request May 31, 2018

[ML] Convert std::shared_ptrs to std::unique_ptrs where possible #115

Merged

sophiec20 added :ml and removed :ml labels Jun 12, 2018

tveasey deleted the enhancement/unique-ptrs branch March 22, 2019 09:55
