Decimal C# conversion to python is expensive #2825

@Martin-Molinero

Expected Behavior

  • Converting data points from C# to Python should not add significant overhead.

Actual Behavior

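  • The current PythonNet conversion implementation converts C# decimal to Python decimal.Decimal, which has a large impact on execution speed. In the runs below, simply reading Price in OnData drops throughput from about 32k to about 21k data points per second.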

BasicTemplateAlgorithm:
3.70 seconds at 32k data points per second. Processing total of 117,001 data points.
3.91 seconds at 30k data points per second. Processing total of 117,001 data points.
3.69 seconds at 32k data points per second. Processing total of 117,001 data points.
BasicTemplateAlgorithm accessing Price:
5.51 seconds at 21k data points per second. Processing total of 117,001 data points.
5.10 seconds at 23k data points per second. Processing total of 117,001 data points.
5.51 seconds at 21k data points per second. Processing total of 117,001 data points.
BasicTemplateAlgorithm accessing Price proposed fix:
4.11 seconds at 28k data points per second. Processing total of 117,001 data points.
3.89 seconds at 30k data points per second. Processing total of 117,001 data points.
4.10 seconds at 29k data points per second. Processing total of 117,001 data points.
3.91 seconds at 30k data points per second. Processing total of 117,001 data points.

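# QCAlgorithm and Resolution come from the usual LEAN algorithm imports (omitted in the original)
import numpy as np  # needed for the np.pi debug message below

class BasicTemplateAlgorithm(QCAlgorithm):

    def Initialize(self):
        self.SetStartDate(2013, 10, 7)
        self.SetEndDate(2013, 10, 11)
        self.SetCash(100000)
        self.AddEquity("SPY", Resolution.Second)
        self.Debug("numpy test >>> print numpy.pi: " + str(np.pi))
        # Cache the Security object so OnData only pays for the Price access
        self.spy = self.Securities["SPY"]

    def OnData(self, data):
        # Reading Price triggers the C# decimal -> Python conversion being measured
        price = self.spy.Price
        if not self.Portfolio.Invested:
            self.SetHoldings("SPY", 1)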

Master benchmark algorithm:
172.59 seconds at 41k data points per second. Processing total of 7,092,090 data points.
178.52 seconds at 40k data points per second. Processing total of 7,092,090 data points.
182.66 seconds at 39k data points per second. Processing total of 7,092,090 data points.
Benchmark algorithm proposed fix:
145.83 seconds at 49k data points per second. Processing total of 7,092,090 data points.
152.77 seconds at 46k data points per second. Processing total of 7,092,090 data points.
155.22 seconds at 46k data points per second. Processing total of 7,092,090 data points.
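
To put the timings above in perspective, here is a minimal sketch that averages the reported wall-clock times and computes the relative reduction. The numbers are copied from the runs listed in this issue; the variable and function names are illustrative only and are not part of LEAN.

```python
from statistics import mean

# Wall-clock seconds copied from the runs reported in this issue
basic_with_price = [5.51, 5.10, 5.51]        # BasicTemplateAlgorithm accessing Price
basic_with_fix = [4.11, 3.89, 4.10, 3.91]    # same algorithm with the proposed fix
benchmark_master = [172.59, 178.52, 182.66]  # master benchmark algorithm
benchmark_fix = [145.83, 152.77, 155.22]     # benchmark algorithm with the proposed fix


def time_reduction(before, after):
    """Fraction of mean run time saved when going from `before` to `after`."""
    return 1.0 - mean(after) / mean(before)


basic_reduction = time_reduction(basic_with_price, basic_with_fix)
benchmark_reduction = time_reduction(benchmark_master, benchmark_fix)

print(f"BasicTemplateAlgorithm (accessing Price): {basic_reduction:.1%} less time")
print(f"Benchmark algorithm: {benchmark_reduction:.1%} less time")
```

With only three or four runs per configuration, these averages are a rough indication rather than a rigorous measurement.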

python_benchmark_algo.txt

Potential Solution

  • The proposed solution is to convert C# decimal to double, so that values arrive in Python as float instead of decimal.Decimal.
    Note: this conversion is already used for history requests, see https://github.com/QuantConnect/Lean/blob/master/Common/Python/PandasData.cs#L306
    *Keep backward compatibility in mind. For example, Python algorithms performing math operations on Price shouldn't fail with: TypeError: unsupported operand type(s) for *: 'float' and 'decimal.Decimal'. This case is covered by a couple of regression algorithms.
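
The TypeError quoted above is standard Python behavior whenever float and decimal.Decimal are mixed in arithmetic. The sketch below reproduces it using only the standard library. The price values are made up for illustration, and `multiply_raises` is a hypothetical helper, not a LEAN or PythonNet API.

```python
from decimal import Decimal

# Illustrative values only: how Price might look before and after the change
price_as_decimal = Decimal("144.25")  # current behavior: C# decimal -> decimal.Decimal
price_as_float = 144.25               # proposed behavior: C# decimal -> double -> float


def multiply_raises(a, b):
    """Return True if a * b raises TypeError, False otherwise."""
    try:
        a * b
    except TypeError:
        return True
    return False


# Mixing float and Decimal fails with "unsupported operand type(s) for *"
mixed_fails = multiply_raises(0.5, price_as_decimal)

# float * float is fine, so user code written with float literals keeps working
float_fails = multiply_raises(0.5, price_as_float)

# Explicit coercion works regardless of which type Price has
half_price = 0.5 * float(price_as_decimal)

print(mixed_fails, float_fails, half_price)
```

Wrapping the value in `float(...)` on the algorithm side is one way user code could stay compatible with both representations, though whether that is acceptable depends on how much precision the algorithm needs.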

Reproducing the Problem

See Actual Behavior

System Information

N/A

Checklist

  • I have completely filled out this template
  • I have confirmed that this issue exists on the current master branch
  • I have confirmed that this is not a duplicate issue by searching issues
  • I have provided detailed steps to reproduce the issue
