add example_data/fix_db_names.py to fix inconsistency in dbnames #21

njupopsicle · 2025-08-20T05:39:59Z

Hello authors, as mentioned in #17, the database names in train/test.parquet are inconsistent with the names of the actual SynSQL-2.5M databases, which breaks the training process. To fix this and make SQL-R1 more reproducible, I added example_data/fix_db_names.py. The script was generated by ChatGPT and fixes the inconsistent database names using a minimum edit distance algorithm.
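For readers unfamiliar with the approach, here is a minimal sketch of minimum edit distance matching. It is not the contents of example_data/fix_db_names.py; the function names `edit_distance` and `closest_db_name` and the sample database names are hypothetical and only illustrate the idea of mapping each name to its nearest candidate.

```python
from typing import Iterable, Optional


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed with a rolling one-row DP table."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute (or match)
            ))
        prev = curr
    return prev[-1]


def closest_db_name(name: str, candidates: Iterable[str]) -> Optional[str]:
    """Return the candidate closest to `name`, or None if there are none.

    Hypothetical helper: `candidates` stands in for the database names
    actually present on disk.
    """
    candidates = list(candidates)
    if not candidates:
        return None
    if name in candidates:
        return name  # already consistent, nothing to fix
    return min(candidates, key=lambda c: edit_distance(name, c))


if __name__ == "__main__":
    # Made-up names for illustration only.
    on_disk = ["school_records", "hospital_visits"]
    print(closest_db_name("school_record", on_disk))
```

Note that picking the nearest candidate always returns some match, so a real script may want a distance threshold to avoid silently mapping an unrelated name.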

add example_data/fix_db_names.py to fix inconsistency in dbnames

f77704d
