Skip to content

Conversation

@tveasey
Copy link
Contributor

@tveasey tveasey commented May 29, 2018

Many of our current uses of shared pointer date from when not all our target platforms were C++11 compliant. Now that they all are, we can switch these to std::unique_ptr. This has a few advantages:

  1. It saves us 16 bytes (8 for the extra pointer to the reference count and 8 for the reference count) per instance
  2. It avoids atomic updates of the reference count
  3. It forces one to have the correct copy semantics in such cases. For example, this showed up that we had an implicit shallow copy (not appropriate) on our naive Bayes implementation and fixing this showed up a bug in restore.

Whilst making this change I also spotted an error in memory accounting for shared pointers, which wasn't including the extra memory for the reference count. Note that I've split out the fix to memory accounting and a change to provide support for unique pointers in some of our utility functionality into separate commits.

This has no direct impact on results. Again there can be an impact from the memory reduction causing a change of behaviour in jobs which hit the hard or soft memory limits.

for (std::size_t i = 0u; !isForForecast && i < other.m_Windows.size(); ++i) {
if (other.m_Windows[i]) {
m_Windows[i] = std::make_shared<CExpandingWindow>(*other.m_Windows[i]);
m_Windows[i].reset(new CExpandingWindow{*other.m_Windows[i]});

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know make_unique is C++14, but can we maybe use a backport, which we can throw away when we switched?

Herb Sutter has an implementation here: https://herbsutter.com/gotw/_102/

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another possibility for using std::make_unique is to change the -std argument for the Mac/Linux compilers to c++14/gnu++14 respectively on the 6.x branch.

Even though Visual Studio 2013 doesn't support very much of C++14, one bit it does support is std::make_unique. So switching the standard filtering used by the other compilers would mean all of them support std::make_unique.

People would just need to remember that only a tiny fraction of C++14 is available in the 6.x branch. From https://msdn.microsoft.com/en-us/library/hh567368.aspx:

Visual C++ in Visual Studio 2013 implemented some key C++14 library features:

  • "Transparent operator functors" less<>, greater<>, plus<>, multiplies<>, and so on.
  • make_unique<T>(args...) and make_unique<T[]>(n)
  • cbegin()/cend(), rbegin()/rend(), and crbegin()/crend() non-member functions.

Copy link
Contributor Author

@tveasey tveasey May 29, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, missed your comment @droberts195. I migrated to use boost::make_unique. What do you both think? Is this ok as a stopgap measure or would it be better to use this C++14 feature. Personally, I feel like it is okay for the time being and we should perhaps migrate to C++14 in one go, but don't have a strong opinion.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wondered if boost::make_unique might return boost::unique_ptrs, but just checked and it really does create std::unique_ptrs. So I guess it was designed for exactly this case of a compiler supporting std::unique_ptr but not std::make_unique.

I'm happy to use it until 6.x is for bug fixes only and then someone can do a search and replace of boost::make_unique with std::make_unique on the master branch.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok cool, I'll leave as I have it.

ss << "shared_ptr (x" << uc << ')';
// Round up
mem->addItem(ss.str(), (CMemory::staticSize(*t) +
mem->addItem(ss.str(), (sizeof(long) + CMemory::staticSize(*t) +

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I suggest to add a code comment that sizeof(long) is for the ref counter

class MATHS_EXPORT CClustererStateSerialiser {
public:
using TClusterer1dPtr = std::shared_ptr<CClusterer1d>;
using TClusterer1dPtr = std::unique_ptr<CClusterer1d>;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

General problem:

In other parts of the code we use UPtr, so TClusterer1dPtr would be TClusterer1dUPtr.

Copy link
Contributor Author

@tveasey tveasey May 29, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about this. The problem with renaming the type of pointer is that it makes changes like this much more difficult. Everywhere the typedef is referred to suddenly has to change as well. So this change will become a whole lot bigger. Also we don't use SPtr for shared_ptr for example which would be consistent with this pattern.

Whilst we probably don't need to change this type again, I don't like that an "implementation detail" (type of smart pointer) gets mixed up with the name of the type. In some cases we can't avoid this because we have a shared_ptr<T> and a unique_ptr<T> in the same scope but in most cases we can.

Thoughts?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am fine with that, actually I agree more to what you wrote, for me one purpose of a typedef/using is to abstract away the implementation detail, putting it back in the name defeats this.

I just wanted to point it out, if you grep the codebase for UPtr you find a small number of uses but I could not find a rule in our new and old styleguide.

Any other comments on this?

Copy link
Contributor Author

@tveasey tveasey May 30, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I think our naming of pointer types is not completely consistent. I would propose something like the following rules:

  1. If there are no name collisions, all pointer types should just be suffixed with Ptr
  2. If there would be a name collision then all pointer types should be all qualified: shared_ptr as SPtr, unique_ptr as UPtr, auto_ptr as APtr and weak_ptr as WPtr.

Let's take this discussion offline though. If we get agreement on this we can revisit the codebase and tidy up the naming to conform to this in a separate PR.


//! Get the number of examples in this class.
double count() const;
//! Set the number of examples in this class.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: It's not really a setter but exposes count for writes, maybe just "access to number of ... "

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True. I started off making this a setter, but then decided to make it return a writable reference.

#include <maths/CTimeSeriesModel.h>

#include <boost/bind.hpp>
#include <boost/make_unique.hpp>
Copy link

@hendrikmuhs hendrikmuhs May 30, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

superfluous? this class does not use make_unique

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although it wasn't, this is now being used (see this commit).

@tveasey
Copy link
Contributor Author

tveasey commented May 30, 2018

@hendrikmuhs, doing some more testing I realised that I had subverted the caching mechanism for new time series models and correlated pairs residual priors. This means we ended up with duplicates of these objects. Also, there was no need to cache the correlation model: it can't be shared. We were previously caching, but then cloning the value passed in anyway. I now pass this in by rvalue reference and move into place. I have corrected this behaviour in my last commit which is worth reviewing.

Copy link

@hendrikmuhs hendrikmuhs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

tveasey added 4 commits May 30, 2018 14:40
…on (even though it is for the explicitly declared destructor in CTimeSeriesModel.cc). Hey ho.
… actually used (hopefully deleting them will stop it)
@tveasey tveasey merged commit 71dc485 into elastic:master May 30, 2018
tveasey added a commit to tveasey/ml-cpp-1 that referenced this pull request May 30, 2018
…stic#108)

Many of our current uses of shared pointer date from when not all our target platforms were
C++11 compliant and in particular move semantics weren't available. Now that they all are, we can
switch these to std::unique_ptr. This has a few advantages: i) it saves us 16 bytes (8 for the extra
pointer to the reference count and 8 for the reference count) per instance, ii) it avoids atomic
updates of the reference count, iii) it forces one to have the correct copy semantics in such
cases. For example, this showed up that we had an implicit shallow copy (not appropriate) on our
naive Bayes implementation and fixing this showed up a bug in restore. This also fixes an error in
memory accounting for shared pointers, which wasn't including the extra memory for the
reference count.
tveasey added a commit to tveasey/ml-cpp-1 that referenced this pull request May 31, 2018
tveasey added a commit that referenced this pull request May 31, 2018
@sophiec20 sophiec20 added :ml and removed :ml labels Jun 12, 2018
@tveasey tveasey deleted the enhancement/unique-ptrs branch March 22, 2019 09:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants