diff --git a/.changeset/pretty-candles-chew.md b/.changeset/pretty-candles-chew.md
new file mode 100644
index 0000000000..2ee0a25f23
--- /dev/null
+++ b/.changeset/pretty-candles-chew.md
@@ -0,0 +1,8 @@
+---
+"rrweb-snapshot": minor
+"rrweb": minor
+"rrdom": patch
+"@rrweb/types": patch
+---
+
+Added support for Asset Event and capturing many different types of assets (not just img#src)
diff --git a/.changeset/yellow-vans-protect.md b/.changeset/yellow-vans-protect.md
new file mode 100644
index 0000000000..28249aa8f0
--- /dev/null
+++ b/.changeset/yellow-vans-protect.md
@@ -0,0 +1,8 @@
+---
+"rrweb-snapshot": major
+"@rrweb/types": patch
+---
+
+`NodeType` enum was moved from rrweb-snapshot to @rrweb/types
+The following types were moved from rrweb-snapshot to @rrweb/types: `documentNode`, `documentTypeNode`, `attributes`, `legacyAttributes`, `elementNode`, `textNode`, `cdataNode`, `commentNode`, `serializedNode`, `serializedNodeWithId` and `DataURLOptions`
+`inlineImages` config option is deprecated and in `rrweb` is an alias for `captureAssets` config option
diff --git a/.github/workflows/style-check.yml b/.github/workflows/style-check.yml
index a37b1a45a7..61e9cb1ea7 100644
--- a/.github/workflows/style-check.yml
+++ b/.github/workflows/style-check.yml
@@ -87,12 +87,12 @@ jobs:
       contents: write
     name: Format Code
     steps:
-      - uses: actions/checkout@v3
+      - uses: actions/checkout@v4
         with:
           repository: ${{ github.event.pull_request.head.repo.full_name }}
           ref: ${{ github.head_ref }}
       - name: Setup Node
-        uses: actions/setup-node@v3
+        uses: actions/setup-node@v4
         with:
           node-version: lts/*
           cache: 'yarn'
diff --git a/docs/assets.md b/docs/assets.md
new file mode 100644
index 0000000000..0f0aeaa015
--- /dev/null
+++ b/docs/assets.md
@@ -0,0 +1,70 @@
+# Asset Capture Methods & Configuration in rrweb
+
+[rrweb](https://rrweb.io/) is a JavaScript library that allows you to record and replay user interactions on your website. It provides various configuration options for capturing assets (such as images) during the recording process. In this document, we will explore the different asset capture methods and their configuration options in rrweb.
+
+## Asset Events
+
+Assets are a new type of event that embodies a serialized version of an HTTP resource captured during snapshotting. Some examples are images, media files and stylesheets. Resources can be fetched externally (from cache) in the case of an href, or internally for blob: URLs and same-origin stylesheets. Asset events are emitted subsequent to either a FullSnapshot or an IncrementalSnapshot (mutation), and although they may have a later timestamp, during replay they are rebuilt as part of the snapshot that they are associated with. In the case where e.g. a stylesheet is referenced at the time of a FullSnapshot, but hasn't been downloaded yet, there can be a subsequent mutation event with a later timestamp which, along with the asset event, can recreate the experience of a network-delayed load of the stylesheet.
+
+## Assets to mitigate stylesheet processing cost
+
+In the case of stylesheets, rrweb does some record-time processing in order to serialize the CSS rules, which previously had a negative effect on initial page loading times and on how quickly the FullSnapshot was taken (see https://pagespeed.web.dev/). These are now taken out of the main thread and processed asynchronously to be emitted (up to `processStylesheetsWithin` ms) later. There is no corresponding delay on the replay side so long as the stylesheet has been successfully emitted.
+
+## Asset Capture Configuration
+
+The `captureAssets` configuration option allows you to customize the asset capture process. It is an object with the following properties:
+
+- `objectURLs` (default: `true`): This property specifies whether to capture same-origin `blob:` assets using object URLs. Object URLs are created using the `URL.createObjectURL()` method. Setting `objectURLs` to `true` enables the capture of object URLs.
+
+- `origins` (default: `false`): This property determines which origins to capture assets from. It can have the following values:
+
+  - `false` or `[]`: Disables capturing any assets apart from object URLs, stylesheets (unless set to false) and images (if that setting is turned on).
+  - `true`: Captures assets from all origins.
+  - `[origin1, origin2, ...]`: Captures assets only from the specified origins. For example, `origins: ['https://s3.example.com/']` captures all assets from the origin `https://s3.example.com/`.
+
+- `images` (default: `true` if `inlineImages` is true in rrweb.record config): When set to true, this option turns on asset capturing for all images irrespective of their origin. When set to false, no images will be captured even if the origin matches. By default images will be captured if their src url matches the `origins` setting above, including if the `origins` is set to `true`.
+
+- `video`: When set to true, this option turns on asset capturing for videos irrespective of their origin. When set to false, no videos will be captured even if the origin matches. By default videos will be captured if their src url matches the `origins` setting above, including if the `origins` is set to `true`.
+
+- `audio`: When set to true, this option turns on asset capturing for audio files irrespective of their origin. When set to false, no audio files will be captured even if the origin matches. By default audio files will be captured if their src url matches the `origins` setting above, including if the `origins` is set to `true`.
+
+- `stylesheets` (default: `'without-fetch'`): When set to `true`, this turns on capturing of all stylesheets and style elements via the asset system irrespective of origin. The default of `'without-fetch'` is designed to match the previous `inlineStylesheet` behaviour, whereas the `true` value allows stylesheets which are otherwise inaccessible due to CORS restrictions to be captured via a fetch call, which will normally use the browser cache. If a stylesheet matches via the `origins` config above, it will be captured irrespective of this config setting (either directly or via fetch).
+
+- `stylesheetsRuleThreshold` (default: `0`): Only invoke the asset system for stylesheets with more than this number of rules. Defaults to zero (rather than say 100) as it only looks at the 'outer' rules (e.g. could have a single media rule which nests 1000s of sub rules). This default may be increased based on feedback.
+
+- `processStylesheetsWithin` (default: `2000`): This property defines the maximum time in milliseconds that the browser should delay before processing stylesheets. Inline `
+`);
+
+    const onAssetDetectedCallback = vi.fn();
+    serializeNode(el, onAssetDetectedCallback);
+    expect(onAssetDetectedCallback).toBeCalledTimes(1);
+    expect(onAssetDetectedCallback).toHaveBeenCalledWith({
+      element: el.querySelector('style'),
+      attr: 'css_text',
+      styleId: 1,
+      value: 'http://localhost:3000/',
+    });
+  });
+
+  it('should detect style depending on if stylesheetsRuleThreshold is met', () => {
+    const el = render(`
+
+
+`);
+
+    const onAssetDetectedCallback = vi.fn();
+    const captureAssets = {
+      objectURLs: true,
+      origins: ['https://example.com'],
+      stylesheetsRuleThreshold: 2,
+    };
+    const inlineImagesUndefined = undefined;
+    serializeNode(
+      el,
+      onAssetDetectedCallback,
+      inlineImagesUndefined,
+      captureAssets,
+    );
+    expect(onAssetDetectedCallback).toBeCalledTimes(1);
+  });
+
+  // SKIP: TODO: avoid capturing large video blobs, but allow capture of (presumably smaller) image blobs
+  it.skip('should not try to capture blob video under defaults', () => {
+    const el = render(`
`);
+
+    const onAssetDetectedCallback = vi.fn();
+    let captureAssets = undefined; // defaults
+    const inlineImagesUndefined = undefined;
+    serializeNode(
+      el,
+      onAssetDetectedCallback,
+      inlineImagesUndefined,
+      captureAssets,
+    );
+    expect(onAssetDetectedCallback).toBeCalledTimes(0);
+
+    // make sure it would be called with video on
+    captureAssets = { video: true };
+    serializeNode(
+      el,
+      onAssetDetectedCallback,
+      inlineImagesUndefined,
+      captureAssets,
+    );
+    expect(onAssetDetectedCallback).toBeCalledTimes(1);
+  });
+});
diff --git a/packages/rrweb-snapshot/test/utils.test.ts b/packages/rrweb-snapshot/test/utils.test.ts
index 0a82d8c16c..d1b8fff908 100644
--- a/packages/rrweb-snapshot/test/utils.test.ts
+++ b/packages/rrweb-snapshot/test/utils.test.ts
@@ -6,6 +6,9 @@ import {
   escapeImportStatement,
   extractFileExtension,
   fixSafariColons,
+  shouldIgnoreAsset,
+  isAttributeCapturable,
+  shouldCaptureAsset,
   isNodeMetaEqual,
 } from '../src/utils';
 import { NodeType } from '@rrweb/types';
@@ -153,6 +156,7 @@ describe('utils', () => {
       expect(isNodeMetaEqual(element2, element3)).toBeFalsy();
     });
   });
+
   describe('extractFileExtension', () => {
     test('absolute path', () => {
       const path = 'https://example.com/styles/main.css';
@@ -280,4 +284,241 @@ describe('utils', () => {
       expect(out3).toEqual('[data-aa\\:other] { color: red; }');
     });
   });
+
+  describe('shouldIgnoreAsset()', () => {
+    it(`should ignore assets when config not specified`, () => {
+      expect(shouldIgnoreAsset('http://example.com', {})).toBe(true);
+    });
+
+    it(`should not ignore matching origin`, () => {
+      expect(
+        shouldIgnoreAsset('http://example.com/', {
+          origins: ['http://example.com'],
+        }),
+      ).toBe(false);
+    });
+
+    it(`should ignore mismatched origin`, () => {
+      expect(
+        shouldIgnoreAsset('http://123.com/', {
+          origins: ['http://example.com'],
+        }),
+      ).toBe(true);
+    });
+
+    it(`should ignore malformed url`, () => {
+      expect(
+        shouldIgnoreAsset('http:', { origins: ['http://example.com'] }),
+      ).toBe(true);
+    });
+
+    it(`should ignore malformed url even with origins: true`, () => {
+      expect(shouldIgnoreAsset('http:', { origins: true })).toBe(true);
+    });
+  });
+
+  describe('isAttributeCapturable()', () => {
+    const validAttributeCombinations = [
+      ['img', ['src', 'srcset']],
+      ['video', ['src']],
+      ['audio', ['src']],
+      ['embed', ['src']],
+      ['source', ['src']],
+      ['track', ['src']],
+      ['input', ['src']],
+      ['object', ['src']],
+    ] as const;
+
+    const invalidAttributeCombinations = [
+      ['img', ['href']],
+      ['script', ['href']],
+      ['link', ['src']],
+      ['video', ['href']],
+      ['audio', ['href']],
+      ['div', ['src']],
+      ['source', ['href']],
+      ['track', ['href']],
+      ['input', ['href']],
+      ['iframe', ['href']],
+      ['object', ['href']],
+      ['link', ['href']], // without rel="stylesheet"
+    ] as const;
+
+    validAttributeCombinations.forEach(([tagName, attributes]) => {
+      const element = document.createElement(tagName);
+      attributes.forEach((attribute) => {
+        it(`should correctly identify <${tagName} ${attribute}> as capturable`, () => {
+          expect(isAttributeCapturable(element, attribute)).toBe(true);
+        });
+      });
+    });
+
+    invalidAttributeCombinations.forEach(([tagName, attributes]) => {
+      const element = document.createElement(tagName);
+      attributes.forEach((attribute) => {
+        it(`should correctly identify <${tagName} ${attribute}> as NOT capturable`, () => {
+          expect(isAttributeCapturable(element, attribute)).toBe(false);
+        });
+      });
+    });
+
+    it(`should identify a child of a element as a capturable image`, () => {
+      const picture = document.createElement('picture');
+      const source = document.createElement('source');
+      source.srcset = 'https://example.com/img1.png';
+
+      // https://developer.mozilla.org/en-US/docs/Web/HTML/Element/source
+      // "Not allowed if parent is a picture"
+      source.src = 'https://example.com/img2.png';
+
+      const fallback_img = document.createElement('img');
+      fallback_img.src = 'https://example.com/img3.png';
+
+      picture.append(source);
+      picture.append(fallback_img);
+
+      expect(
+        shouldCaptureAsset(source, 'srcset', source.srcset, { images: true }),
+      ).toBe(true);
+      expect(
+        shouldCaptureAsset(source, 'src', source.srcset, { images: true }),
+      ).toBe(false); // not allowed
+      expect(
+        shouldCaptureAsset(fallback_img, 'src', source.src, { images: true }),
+      ).toBe(true);
+
+      expect(
+        shouldCaptureAsset(source, 'srcset', source.srcset, { images: false }),
+      ).toBe(false);
+      expect(
+        shouldCaptureAsset(fallback_img, 'src', source.src, { images: false }),
+      ).toBe(false);
+    });
+
+    it(`should correctly identify child of a