Skip to content
This repository was archived by the owner on May 17, 2024. It is now read-only.
This repository was archived by the owner on May 17, 2024. It is now read-only.

Adding support for PostgreSQL json and RedShift Super #515

@nicolasaldecoa

Description

@nicolasaldecoa

Hello devs.

My coworkers and I have recently been looking for a tool to compare tables that are replicated in PostgreSQL and RedShift. We discovered data-diff and gave it a try a couple of weeks ago, and found it to be useful for some tasks. Nevertheless, we have a lot of columns that contain bjson data (json in PG schema and super in RS) which aren't supported data types,

In order to deal with that problem, I forked this repo and drafted a solution that is working for us at the moment.
I've observed that the default normalization (adding ::varchar ) to RS super column makes it return a NULL output, and in PG the json string contains a space after each colon.
These are the normalizations that I'm using to obtain a minified json string in both cases:

Postgre json (I'm aware that this expression doesn't cover all cases, I need to put a bit more work into this one)

def normalize_json(self, value: str, _coltype: PostgresqlJSON) -> str:
    return f"replace({value}::text, '\": \"', '\":\"')"

RedShift super

def normalize_super(self, value: str, _coltype: RedShiftSuper) -> str:
    return f'nvl2({value}, json_serialize({value}), NULL)'

We would love to stay up-to-date with the development of this tool, so I'd like to ask you if you could take a look at this commit in my fork and consider giving official support to these database types.

I'm not creating a PR because I haven't had the time to study the entire project, so my code is probably suboptimal from a design point of view. I'd really appreciate if you could help me out with that.

Thank you in advance, cheers,
-Nico

Metadata

Metadata

Assignees

Labels

enhancementNew feature or requestsqeletonstaleIssues/PRs that have gone stale

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions