Commit a7d260a

coruscating, nkanazawa1989, and wshanks authored
Add artifacts interface (#1342)
### Summary

This PR adds the artifacts interface following the design in https://github.com/Qiskit/rfcs/blob/master/0007-experiment-dataframe.md.

### Details and comments

- Added the `ArtifactData` dataclass for representing artifacts.
- Added `ExperimentData.artifacts()`, `.add_artifacts()`, and `.delete_artifact()` for working with artifacts, which are stored in a thread-safe list. Currently the `ScatterTable` and `CurveFitResult` objects are stored as artifacts, and experiment serialization data will be added in the future.
- Artifacts are grouped by type and stored in a compressed format so that there aren't a huge number of individual files for composite experiments. As such, this PR depends on qiskit-community/qiskit-ibm-experiment#93 to allow `.zip` formats for uploading to the cloud service. Each zipped file contains a list of JSON artifact files whose filenames are their unique artifact IDs. For example, for composite experiments with `flatten_results=True`, all `ScatterTable` artifacts are stored in `curve_data.zip` as individual JSON files.
- Added a how-to for artifacts and updated documentation to demonstrate dataframe objects like AnalysisResults and the ScatterTable (`dataframe.css` is for styling these tables).
- Deprecated accessing analysis results via numerical indices to anticipate removing the curve fit result from analysis results altogether in the next release.
- Fixed a bug where `figure_names` were being duplicated in a copied `ExperimentData` object.

Example experiment with artifacts ([link](https://quantum.ibm.com/experiments/eaad518d-232f-4cab-b137-e480ff7f1cbb)):

![image](https://github.com/Qiskit-Extensions/qiskit-experiments/assets/3870315/a2929782-dfef-4535-b246-1167666ebfc9)

---------

Co-authored-by: Naoki Kanazawa <[email protected]>
Co-authored-by: Will Shanks <[email protected]>
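The storage layout described above (one `.zip` per artifact name, one JSON file per artifact ID) can be illustrated with a minimal standard-library sketch. This is not the code from this PR: `pack_artifacts` and `list_members` are hypothetical helpers, the `(name, artifact_id, payload)` tuples stand in for `ArtifactData` objects, and plain `json.dumps` stands in for whatever encoder the library actually uses.

```python
import io
import json
import zipfile
from collections import defaultdict


def pack_artifacts(artifacts):
    """Group artifacts by name and zip each group in memory.

    `artifacts` is an iterable of (name, artifact_id, payload) tuples.
    This tuple layout is a hypothetical stand-in for ArtifactData.
    Returns a dict mapping "<name>.zip" to the zipped bytes.
    """
    grouped = defaultdict(list)
    for name, artifact_id, payload in artifacts:
        grouped[name].append((artifact_id, payload))

    archives = {}
    for name, members in grouped.items():
        buffer = io.BytesIO()
        with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
            for artifact_id, payload in members:
                # One JSON file per artifact, named by its unique ID.
                archive.writestr(f"{artifact_id}.json", json.dumps(payload))
        archives[f"{name}.zip"] = buffer.getvalue()
    return archives


def list_members(zipped_bytes):
    """Return the sorted file names stored in an in-memory zip archive."""
    with zipfile.ZipFile(io.BytesIO(zipped_bytes)) as archive:
        return sorted(archive.namelist())


archives = pack_artifacts(
    [
        ("curve_data", "id-a", {"x": [1, 2]}),
        ("curve_data", "id-b", {"x": [3, 4]}),
        ("fit_summary", "id-c", {"status": "ok"}),
    ]
)
for filename in sorted(archives):
    print(filename, list_members(archives[filename]))
```

Grouping by name keeps the number of uploaded files proportional to the number of artifact types rather than the number of child experiments, which is the motivation the description gives for the compressed format.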
1 parent 777e2d5 commit a7d260a

48 files changed, with 1406 additions and 379 deletions

docs/_static/dataframe.css

Lines changed: 35 additions & 0 deletions
@@ -0,0 +1,35 @@
+/* Styling for pandas dataframes in documentation */
+
+div.output table {
+  border: none;
+  border-collapse: collapse;
+  border-spacing: 0;
+  color: black;
+  font-size: 12px;
+  table-layout: fixed;
+  width: 100%;
+}
+div.output thead {
+  border-bottom: 1px solid black;
+  vertical-align: bottom;
+}
+div.output tr,
+div.output th,
+div.output td {
+  text-align: right;
+  vertical-align: middle;
+  padding: 0.5em 0.5em;
+  line-height: normal;
+  white-space: normal;
+  max-width: none;
+  border: none;
+}
+div.output th {
+  font-weight: bold;
+}
+div.output tbody tr:nth-child(odd) {
+  background: #f5f5f5;
+}
+div.output tbody tr:hover {
+  background: rgba(66, 165, 245, 0.2);
+}

docs/conf.py

Lines changed: 7 additions & 3 deletions
@@ -80,9 +80,7 @@
 templates_path = ["_templates"]
 # Manually add the gallery CSS file for now
 # TODO: Figure out why the styling is not working by default
-html_css_files = [
-    "nbsphinx-gallery.css",
-]
+html_css_files = ["nbsphinx-gallery.css", "dataframe.css"]
 
 nbsphinx_timeout = 360
 nbsphinx_execute = os.getenv("QISKIT_DOCS_BUILD_TUTORIALS", "never")
@@ -171,6 +169,7 @@
     "matplotlib": ("https://matplotlib.org/stable/", None),
     "qiskit": ("https://docs.quantum.ibm.com/api/qiskit/", None),
     "uncertainties": ("https://pythonhosted.org/uncertainties", None),
+    "pandas": ("http://pandas.pydata.org/docs/", None),
     "qiskit_aer": ("https://qiskit.org/ecosystem/aer", None),
     "qiskit_dynamics": ("https://qiskit.org/ecosystem/dynamics/", None),
     "qiskit_ibm_runtime": ("https://docs.quantum.ibm.com/api/qiskit-ibm-runtime/", None),
@@ -236,6 +235,11 @@ def maybe_skip_member(app, what, name, obj, skip, options):
         "filter_kwargs",
         "fit_func",
         "signature",
+        "artifact_id",
+        "artifact_data",
+        "device_components",
+        "created_time",
+        "data",
     ]
     skip_members = [
         ParameterRepr.repr,

docs/howtos/artifacts.rst

Lines changed: 148 additions & 0 deletions
@@ -0,0 +1,148 @@
+Work with experiment artifacts
+==============================
+
+Problem
+-------
+
+You want to view, add, remove, and save artifacts associated with your :class:`.ExperimentData` instance.
+
+Solution
+--------
+
+Artifacts are used to store auxiliary data for an experiment that don't fit neatly in the
+:class:`.AnalysisResult` model. Any data that can be serialized, such as fit data, can be added as
+:class:`.ArtifactData` artifacts to :class:`.ExperimentData`.
+
+For example, after an experiment that uses :class:`.CurveAnalysis` is run, its :class:`.ExperimentData`
+object is automatically populated with ``fit_summary`` and ``curve_data`` artifacts. The ``fit_summary``
+artifact has one or more :class:`.CurveFitResult` objects that contain parameters from the fit. The
+``curve_data`` artifact has a :class:`.ScatterTable` object that contains raw and fitted data in a pandas
+:class:`~pandas:pandas.DataFrame`.
+
+Viewing artifacts
+~~~~~~~~~~~~~~~~~
+
+Here we run a parallel experiment consisting of two :class:`.T1` experiments in parallel and then view the output
+artifacts as a list of :class:`.ArtifactData` objects accessed by :meth:`.ExperimentData.artifacts`:
+
+.. jupyter-execute::
+
+    from qiskit_ibm_runtime.fake_provider import FakePerth
+    from qiskit_aer import AerSimulator
+    from qiskit_experiments.library import T1
+    from qiskit_experiments.framework import ParallelExperiment
+    import numpy as np
+
+    backend = AerSimulator.from_backend(FakePerth())
+    exp1 = T1(physical_qubits=[0], delays=np.arange(1e-6, 6e-4, 5e-5))
+    exp2 = T1(physical_qubits=[1], delays=np.arange(1e-6, 6e-4, 5e-5))
+    data = ParallelExperiment([exp1, exp2], flatten_results=True).run(backend).block_for_results()
+    data.artifacts()
+
+Artifacts can be accessed using either the artifact ID, which has to be unique in each
+:class:`.ExperimentData` object, or the artifact name, which does not have to be unique and will return
+all artifacts with the same name:
+
+.. jupyter-execute::
+
+    print("Number of curve_data artifacts:", len(data.artifacts("curve_data")))
+    # retrieve by name and index
+    curve_data_id = data.artifacts("curve_data")[0].artifact_id
+    # retrieve by ID
+    scatter_table = data.artifacts(curve_data_id).data
+    print("The first curve_data artifact:\n")
+    scatter_table.dataframe
+
+In composite experiments, artifacts behave like analysis results and figures in that if
+``flatten_results`` isn't ``True``, they are accessible in the :meth:`.artifacts` method of each
+:meth:`.child_data`. The artifacts in a large composite experiment with ``flatten_results=True`` can be
+distinguished from each other using the :attr:`~.ArtifactData.experiment` and
+:attr:`~.ArtifactData.device_components`
+attributes.
+
+One useful pattern is to load raw or fitted data from ``curve_data`` for further data manipulation. You
+can work with the dataframe using standard pandas dataframe methods or the built-in
+:class:`.ScatterTable` methods:
+
+.. jupyter-execute::
+
+    import matplotlib.pyplot as plt
+
+    exp_type = data.artifacts(curve_data_id).experiment
+    component = data.artifacts(curve_data_id).device_components[0]
+
+    raw_data = scatter_table.filter(category="raw")
+    fitted_data = scatter_table.filter(category="fitted")
+
+    # visualize the data
+    plt.figure()
+    plt.errorbar(raw_data.x, raw_data.y, yerr=raw_data.y_err, capsize=5, label="raw data")
+    plt.errorbar(fitted_data.x, fitted_data.y, yerr=fitted_data.y_err, capsize=5, label="fitted data")
+    plt.title(f"{exp_type} experiment on {component}")
+    plt.xlabel('x')
+    plt.ylabel('y')
+    plt.legend()
+    plt.show()
+
+Adding artifacts
+~~~~~~~~~~~~~~~~
+
+You can add arbitrary data as an artifact as long as it's serializable with :class:`.ExperimentEncoder`,
+which extends Python's default JSON serialization with support for other data types commonly used with
+Qiskit Experiments.
+
+.. jupyter-execute::
+
+    from qiskit_experiments.framework import ArtifactData
+
+    new_artifact = ArtifactData(name="experiment_notes", data={"content": "Testing some new ideas."})
+    data.add_artifacts(new_artifact)
+    data.artifacts("experiment_notes")
+
+.. jupyter-execute::
+
+    print(data.artifacts("experiment_notes").data)
+
+Saving and loading artifacts
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+.. note::
+    This feature is only for those who have access to the cloud service. You can
+    check whether you do by logging into the IBM Quantum interface
+    and seeing if you can see the `database <https://quantum.ibm.com/experiments>`__.
+
+Artifacts are saved and loaded to and from the cloud service along with the rest of the
+:class:`ExperimentData` object. Artifacts are stored as ``.zip`` files in the cloud service grouped by
+the artifact name. For example, the composite experiment above will generate two artifact files, ``fit_summary.zip`` and
+``curve_data.zip``. Each of these zipfiles will contain serialized artifact data in JSON format named
+by their unique artifact ID:
+
+.. jupyter-execute::
+    :hide-code:
+
+    print("fit_summary.zip")
+    print(f"|- {data.artifacts('fit_summary')[0].artifact_id}.json")
+    print(f"|- {data.artifacts('fit_summary')[1].artifact_id}.json")
+    print("curve_data.zip")
+    print(f"|- {data.artifacts('curve_data')[0].artifact_id}.json")
+    print(f"|- {data.artifacts('curve_data')[1].artifact_id}.json")
+    print("experiment_notes.zip")
+    print(f"|- {data.artifacts('experiment_notes').artifact_id}.json")
+
+Note that for performance reasons, the auto save feature does not apply to artifacts. You must still
+call :meth:`.ExperimentData.save` once the experiment analysis has completed to upload artifacts to the
+cloud service.
+
+Note also though individual artifacts can be deleted, currently artifact files cannot be removed from the
+cloud service. Instead, you can delete all artifacts of that name
+using :meth:`~.delete_artifact` and then call :meth:`.ExperimentData.save`.
+This will save an empty file to the service, and the loaded experiment data will not contain
+these artifacts.
+
+See Also
+--------
+
+* :ref:`Curve Analysis: Data management with scatter table <data_management_with_scatter_table>` tutorial
+* :class:`.ArtifactData` API documentation
+* :class:`.ScatterTable` API documentation
+* :class:`.CurveFitResult` API documentation
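
The lookup behaviour that the how-to demonstrates (a unique artifact ID returns one artifact, while a name may match several) can be modelled with a small toy store. This is only a sketch inferred from the examples in the how-to, not the implementation in this commit: `ToyArtifact` and `ToyArtifactStore` are hypothetical names, and returning a single object when exactly one name matches is an assumption based on the `data.artifacts("experiment_notes").data` example.

```python
import uuid
from dataclasses import dataclass, field
from typing import Any, List, Optional, Union


@dataclass
class ToyArtifact:
    """Hypothetical, simplified stand-in for ArtifactData."""

    name: str
    data: Any
    artifact_id: str = field(default_factory=lambda: str(uuid.uuid4()))


class ToyArtifactStore:
    """Hypothetical container that resolves a key as an ID first, then as a name."""

    def __init__(self) -> None:
        self._artifacts: List[ToyArtifact] = []

    def add(self, artifact: ToyArtifact) -> None:
        # IDs must be unique within one store; names may repeat.
        if any(a.artifact_id == artifact.artifact_id for a in self._artifacts):
            raise ValueError(f"Duplicate artifact ID: {artifact.artifact_id}")
        self._artifacts.append(artifact)

    def get(self, key: Optional[str] = None) -> Union[ToyArtifact, List[ToyArtifact]]:
        if key is None:
            return list(self._artifacts)
        for artifact in self._artifacts:
            if artifact.artifact_id == key:
                return artifact
        matches = [a for a in self._artifacts if a.name == key]
        if not matches:
            raise KeyError(f"No artifact with ID or name {key!r}")
        # Assumption: a single match is returned directly, several as a list.
        return matches[0] if len(matches) == 1 else matches


store = ToyArtifactStore()
first = ToyArtifact(name="curve_data", data={"qubit": 0})
store.add(first)
store.add(ToyArtifact(name="curve_data", data={"qubit": 1}))
store.add(ToyArtifact(name="experiment_notes", data={"content": "Testing some new ideas."}))

print(len(store.get("curve_data")))
print(store.get(first.artifact_id).data)
print(store.get("experiment_notes").data)
```

Resolving IDs before names means a caller who has stored an ID always gets the same artifact back, even when other artifacts share its name.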

docs/manuals/measurement/readout_mitigation.rst

Lines changed: 1 addition & 1 deletion
@@ -78,7 +78,7 @@ circuits, one for all “0” and one for all “1” results.
 
     exp.analysis.set_options(plot=True)
     result = exp.run(backend)
-    mitigator = result.analysis_results(0).value
+    mitigator = result.analysis_results("Local Readout Mitigator").value
 
 The resulting measurement matrix can be illustrated by comparing it to
 the identity.
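
The change above replaces a numerical index with a result name, in line with the deprecation mentioned in the commit description. A generic way to phase out index-based access while still supporting it is sketched below. `get_result` and the dictionary-based results are hypothetical and do not reflect how `ExperimentData.analysis_results` is implemented.

```python
import warnings


def get_result(results, key):
    """Look up a result by name, or by integer index with a deprecation warning.

    `results` is a list of dicts with "name" and "value" keys; this layout
    is a hypothetical stand-in for analysis result objects.
    """
    if isinstance(key, int):
        warnings.warn(
            "Accessing results by numerical index is deprecated; use the result name.",
            DeprecationWarning,
            stacklevel=2,
        )
        return results[key]
    for result in results:
        if result["name"] == key:
            return result
    raise KeyError(f"No result named {key!r}")


results = [{"name": "Local Readout Mitigator", "value": "mitigator-object"}]

# Preferred: name-based access emits no warning.
print(get_result(results, "Local Readout Mitigator")["value"])

# Still works, but emits a DeprecationWarning.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    get_result(results, 0)
print(len(caught))
```

Name-based access stays stable if the order or number of results changes, which is why it is the safer choice once entries such as the curve fit result are moved out of the analysis results.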

docs/tutorials/curve_analysis.rst

Lines changed: 2 additions & 0 deletions
@@ -318,6 +318,8 @@ without an overhead of complex data management, as well as end-users with
 retrieving and reusing the intermediate data for their custom fitting workflow
 outside our curve fitting framework.
 Note that a :class:`ScatterTable` instance may be saved in the :class:`.ExperimentData` as an artifact.
+See the :doc:`Artifacts how-to </howtos/artifacts>` for more information.
+
 
 .. _curve_analysis_workflow:
 

docs/tutorials/getting_started.rst

Lines changed: 35 additions & 5 deletions
@@ -150,6 +150,9 @@ analysis, respectively:
     print(exp_data.job_status())
     print(exp_data.analysis_status())
 
+Figures
+-------
+
 Once the analysis is complete, figures are retrieved using the
 :meth:`~.ExperimentData.figure` method. See the :doc:`visualization module
 <visualization>` tutorial on how to customize figures for an experiment. For our
@@ -160,15 +163,22 @@ exponential decay model of the :math:`T_1` experiment:
 
     display(exp_data.figure(0))
 
-The fit results and associated parameters are accessed with
-:meth:`~.ExperimentData.analysis_results`:
+Analysis Results
+----------------
+
+The analysis results resulting from the fit are accessed with :meth:`~.ExperimentData.analysis_results`:
 
 .. jupyter-execute::
 
     for result in exp_data.analysis_results():
         print(result)
 
-Results can be indexed numerically (starting from 0) or using their name.
+Results can be indexed numerically (starting from 0) or using their name. Analysis results can also be
+retrieved in the pandas :class:`~pandas:pandas.DataFrame` format by passing ``dataframe=True``:
+
+.. jupyter-execute::
+
+    exp_data.analysis_results(dataframe=True)
 
 .. note::
     See the :meth:`~.ExperimentData.analysis_results` API documentation for more
@@ -186,6 +196,24 @@ value and standard deviation of each value can be accessed as follows:
 For further documentation on how to work with UFloats, consult the ``uncertainties``
 :external+uncertainties:doc:`user_guide`.
 
+Artifacts
+---------
+
+The curve fit data itself is contained in :meth:`~.ExperimentData.artifacts`, which are accessed
+in an analogous manner. Artifacts for a standard experiment include both the curve fit data
+stored in ``artifacts("curve_data")`` and information on the fit stored in ``artifacts("fit_summary")``.
+Use the ``data`` attribute to access artifact data:
+
+.. jupyter-execute::
+
+    print(exp_data.artifacts("fit_summary").data)
+
+.. note::
+    See the :doc:`artifacts </howtos/artifacts>` how-to for more information on using artifacts.
+
+Circuit data and metadata
+-------------------------
+
 Raw circuit output data and its associated metadata can be accessed with the
 :meth:`~.ExperimentData.data` property. Data is indexed by the circuit it corresponds
 to. Depending on the measurement level set in the experiment, the raw data will either
@@ -210,6 +238,9 @@ Experiments also have global associated metadata accessed by the
 
     print(exp_data.metadata)
 
+Job information
+---------------
+
 The actual backend jobs that were executed for the experiment can be accessed with the
 :meth:`~.ExperimentData.jobs` method.
 
@@ -406,8 +437,7 @@ into one level:
     )
     parallel_data = parallel_exp.run(backend, seed_simulator=101).block_for_results()
 
-    for result in parallel_data.analysis_results():
-        print(result)
+    parallel_data.analysis_results(dataframe=True)
 
 Broadcasting analysis options to child experiments
 --------------------------------------------------

qiskit_experiments/curve_analysis/base_curve_analysis.py

Lines changed: 3 additions & 53 deletions
@@ -98,13 +98,6 @@ class BaseCurveAnalysis(BaseAnalysis, ABC):
         This method creates analysis results for important fit parameters
         that might be defined by analysis options ``result_parameters``.
 
-    .. rubric:: _create_curve_data
-
-        This method creates analysis results for the formatted dataset, i.e. data used for the fitting.
-        Entries are created when the analysis option ``return_data_points`` is ``True``.
-        If analysis consists of multiple series, analysis result is created for
-        each curve data in the series definitions.
-
     .. rubric:: _create_figures
 
         This method creates figures by consuming the scatter table data.
@@ -162,9 +155,9 @@ def _default_options(cls) -> Options:
                 dataset without formatting, on canvas. This is ``False`` by default.
             plot (bool): Set ``True`` to create figure for fit result or ``False`` to
                 not create a figure. This overrides the behavior of ``generate_figures``.
-            return_fit_parameters (bool): Set ``True`` to return all fit model parameters
-                with details of the fit outcome. Default to ``True``.
-            return_data_points (bool): Set ``True`` to include in the analysis result
+            return_fit_parameters (bool): (Deprecated) Set ``True`` to return all fit model parameters
+                with details of the fit outcome. Default to ``False``.
+            return_data_points (bool): (Deprecated) Set ``True`` to include in the analysis result
                 the formatted data points given to the fitter. Default to ``False``.
             data_processor (Callable): A callback function to format experiment data.
                 This can be a :class:`.DataProcessor`
@@ -237,49 +230,6 @@ def _default_options(cls) -> Options:
 
         return options
 
-    def set_options(self, **fields):
-        """Set the analysis options for :meth:`run` method.
-
-        Args:
-            fields: The fields to update the options
-
-        Raises:
-            KeyError: When removed option ``curve_fitter`` is set.
-        """
-        # TODO remove this in Qiskit Experiments v0.5
-
-        if "curve_fitter_options" in fields:
-            warnings.warn(
-                "The option 'curve_fitter_options' is replaced with 'lmfit_options.' "
-                "This option will be removed in Qiskit Experiments 0.5.",
-                DeprecationWarning,
-                stacklevel=2,
-            )
-            fields["lmfit_options"] = fields.pop("curve_fitter_options")
-
-        # TODO remove this in Qiskit Experiments 0.6
-        if "curve_drawer" in fields:
-            warnings.warn(
-                "The option 'curve_drawer' is replaced with 'plotter'. "
-                "This option will be removed in Qiskit Experiments 0.6.",
-                DeprecationWarning,
-                stacklevel=2,
-            )
-            # Set the plotter drawer to `curve_drawer`. If `curve_drawer` is the right type, set it
-            # directly. If not, wrap it in a compatibility drawer.
-            if isinstance(fields["curve_drawer"], BaseDrawer):
-                plotter = self.options.plotter
-                plotter.drawer = fields.pop("curve_drawer")
-                fields["plotter"] = plotter
-            else:
-                drawer = fields["curve_drawer"]
-                compat_drawer = LegacyCurveCompatDrawer(drawer)
-                plotter = self.options.plotter
-                plotter.drawer = compat_drawer
-                fields["plotter"] = plotter
-
-        super().set_options(**fields)
-
     @abstractmethod
     def _run_data_processing(
         self,
