
Conversation

@lwasser (Member) commented Sep 11, 2025

This policy was developed based on a conversation here:

#331

I think we should review this policy and also consider linking to a blog post that covers some of our broad concerns and our reasons for developing it.

@@ -0,0 +1,10 @@
Member Author

Once we are happy with this checklist, I'll open a PR to update our submission template as well.


**Optional:** Let projects know that it's a great idea to have a `.github` repository for the project organization where they can host a commonly used LICENSE, Code of Conduct, and even a YAML file with label definitions. These items will then be automatically applied to every repository in the organization to ensure consistency (but can still be customized within individual repos). The [SunPy project](https://github.com/sunpy/.github/) has a great example of this.

---
- [ ] [Initial onboarding survey was filled out ](https://forms.gle/F9mou7S3jhe8DMJ16)
We appreciate each maintainer of the package filling out this survey individually. :raised_hands:
Thank you, authors, in advance for setting aside five to ten minutes to do this. It truly helps our organization. :raised_hands:
Thank you, authors, in advance, xwfor setting aside five to ten minutes to do this. It truly helps our organization. :raised_hands:


Suggested change
Thank you, authors, in advance, xwfor setting aside five to ten minutes to do this. It truly helps our organization. :raised_hands:
Thank you, authors, in advance, for setting aside five to ten minutes to do this. It truly helps our organization. :raised_hands:


The policy below was co-developed by the pyOpenSci community. Its goals are:

* Acknowledgment of and transparency around the widespread use of Generative AI tools (with a focus on Large Language Models (LLMs) in Open Source.


Suggested change
* Acknowledgment of and transparency around the widespread use of Generative AI tools (with a focus on Large Language Models (LLMs) in Open Source.
* Acknowledgment of and transparency around the widespread use of Generative AI tools with a focus on Large Language Models (LLMs) in Open Source.

unmatched parens

* When you submit a package to pyOpenSci, please disclose any use of LLMs (Large Language Models) in your package’s generation by checking the appropriate boxes on our software submission form. Disclosure should generally include what parts of your package were developed using LLM tools.
* Please also disclose this use of Generative AI tools in your package's `README.md` file and in any modules where generative AI contributions have been implemented.
* We require that all aspects of your package have been reviewed carefully by a human on your maintainer team. Please ensure all text and code have been carefully checked for bias, bugs, and issues before submitting to pyOpenSci.
* Your acknowledgment of using Generative AI will not impact the success of your submission unless you have blindly copied text and code into your package without careful review and evaluation of its accuracy, and for any systemic bias.


I don't know if this can realistically be promised. I can imagine some reviewers would not want to participate in such a review. Imagine being asked to review a paper where the authors acknowledge that anonymous non-consenting students and a paid ghostwriter wrote portions of it, but the author has read it all so it's good to go. You might well believe that constitutes research misconduct because it violates the policies of your university and of other journals you typically review for (which may follow COPE and CRediT).

You could say that reviewers assigned to the submission will have to promise it won't affect their review. (IMO, reviewers need to be aware of this and have an informed opportunity to choose whether to consent.)

Contributor

Right. Maybe something like "use of generative AI, if disclosed, does not disqualify your package from review. Your disclosure is used to match your submission with reviewers with similar values and experience regarding gen AI. Incomplete disclosure of gen AI could affect your review success, as reviewers are volunteers and retain discretion to discontinue their review if this, or any other part of your submission, appears inaccurate on closer inspection."

Contributor

I'm not sure we want to match packages with "reviewers who share your values". That (a) makes it even harder to find reviewers (instead of asking anyone, I now need to ask first "Where do you stand on the copyright and LLM debate?") and (b) splits pyOpenSci into two parts: the "pyOpenSci with AI-friendly authors and reviewers" and the "pyOS with anti-LLM authors and reviewers".


This is also "where do you stand on publication ethics". Unpaid reviewing of scholarly work is service to the epistemic integrity of a community and knowledge. I think the motivations to review for pyOS are more akin to reviewing for scholarly journals than to conducting an external security audit (which is almost always a paid activity and has a narrow purpose).

To impose on reviewers a uniform policy that violations of publication ethics norms must not influence their review will drive those reviewers away. The remaining reviewers, being "LLM-friendly", may be prone to use LLMs to create the appearance that they have conducted a review. (This is a huge problem for CS conferences where chairs are desperate for reviewers and people would like the professional recognition of being a member of the program committee without doing the work.)

Comment on lines +2 to +7
- [ ] Some parts of the package were created using LLMs in some way.
* Please check any of the boxes below to clarify which parts of the package were impacted by LLM use
- [ ] LLMs were used to develop code
- [ ] LLMs were used to develop documentation
- [ ] LLMs were used to develop tests
- [ ] LLMs were used to develop infrastructure (CI, automation)
Contributor

So I think a checklist can be good for prompting on different things, but I do think we want this to be a text response - "used to develop code" could mean anything from "had tab autocompletion on and used it once" to "wholly generated by LLMs."

So maybe something like the following (a rough template sketch follows the list)...

* Generative AI was used to produce some of the material in this submission
* If the above box was checked, please describe how generative AI was used, including:
  * Which parts of the submission were generated: e.g. documentation, tests, code. In addition to a general description, please specifically indicate any substantial portions of code (classes, modules, subpackages) that were wholly or primarily generated by AI.
  * The approximate scale of the generated portions: e.g. "all of the tests were generated and then checked by a human," "small routines were generated and copied into the code."
  * How the generative AI was used: e.g. line completion, help with translation, queried separately and integrated, agentic workflow.
* If generative AI was used, the authors affirm that all generated material has been reviewed and edited for clarity, concision, correctness, and absence of machine bias. The authors are responsible for the content of their work, and affirm that it is in a state where reviewers will not be responsible for primary editing and review of machine-generated material.
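
Sketched roughly as a submission-template snippet, that could read something like this (the checkbox text and prompts below are only illustrative placeholders, not final policy wording):

```markdown
<!-- illustrative wording only; not final policy text -->
- [ ] Generative AI was used to produce some of the material in this submission.

If you checked the box above, please describe:

* **Which parts were generated** (e.g., documentation, tests, code), specifically noting any
  substantial portions (classes, modules, subpackages) that were wholly or primarily generated.
* **The approximate scale** (e.g., "all of the tests were generated and then checked by a human").
* **How generative AI was used** (e.g., line completion, help with translation, agentic workflow).

- [ ] The authors affirm that all generated material has been reviewed and edited by a human
      maintainer for clarity, concision, correctness, and absence of machine bias.
```

The free-text prompts are the important part; the checkboxes just make the disclosure easy for editors to scan.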


I like this direction, but I'm stuck on a lot of questions about this part:

* What does "authors are responsible for the content of their work" mean? Is that meant to be a statement of provenance or merely of agreement?
* What if the submission contains a page of code that matches verbatim a different package with an incompatible license/no attribution? Is that misconduct, something a reviewer would politely ask them to fix, or acceptable as-is if it is plausible to believe it was tab-completed instead of flagrantly copied? (In a journal context, that would involve rejection and possible reporting of misconduct to the authors' institutions.)
* What if that code was accepted in a PR from a minor contributor who is not an author on a paper? (Super common for projects with many contributors. You might have hundreds of minor contributors, but a dozen core developers and maintainers.) Is there an expectation that maintainers practice due diligence? (This is why I think DCO-style norms are so important.)
* If the "fix" remedy is chosen, does that whole part of the code need to be rewritten using clean-room methods or is it enough to change variable names so it no longer shows up as a verbatim match?

Contributor

> What does "authors are responsible for the content of their work" mean? Is that meant to be a statement of provenance or merely of agreement?

It's a clarification within the affirmation that "the authors have reviewed and stand behind their work." Regardless of provenance, the authors are the ones submitting the thing to be reviewed, and they are responsible for the content of the thing being reviewed.

This is also meant to address this question:

> What if that code was accepted in a PR from a minor contributor who is not an author on a paper?

The people engaging in the review process take responsibility for the material they are asking to have reviewed.

> What if the submission contains a page of code that matches verbatim a different package with an incompatible license/no attribution?

I think this is important but should probably be a separate item, e.g. in this comment: "we currently ask authors to write something about the state of the field of neighboring packages, ... if authors have generated some substantial part of their package that could have conceivably been "inspired by"/copied from another existing package, ask if they have searched for related implementations, and write something short about why not use that, and if the code does appear to overlap substantially, add some attribution. ..."

It might be a pretty high bar, and I would definitely be open to disagreement on it, because I agree both that we should encourage people to be responsible with provenance and that we can't ask someone to chase down the provenance of an undefinably small chunk of code. Again, I'm thinking in terms of facilitating review rather than what I would consider optimal development practices: "if there was a whole module of generated code that probably drew from a popular package, what would a reviewer need to know, and what is a standard we could expect from an author regarding code duplication from LLM code generation?"
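
For example, such an attribution could be as lightweight as a short note in the docs or a module docstring (the package names and paths below are purely hypothetical, just to show the shape):

```markdown
<!-- hypothetical names and paths, for illustration only -->
## Attribution

Parts of `mypackage/io/readers.py` were drafted with LLM assistance and adapt code
from the `otherpkg` project (BSD 3-Clause); the original license text is included
in `LICENSES/otherpkg-LICENSE.txt`.
```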

if the "fix" remedy is chosen

i would think attribution would be the preferred remedy here, but i also don't think that needs to be prescribed in the policies and can be a matter of negotiation between reviewers, editor, and author.


Thanks. Attribution is only a "fix" if the licenses are compatible.

More generally, I'm worried that we're losing a culture (admittedly very inconsistently-practiced) of due diligence and replacing it with one in which everyone has plausible deniability, which is then vulnerable to DOS attack by dishonest academic-bean-farmers (and undermines the legitimacy of the entire JOSS/pyOS/+ community). If adequate redress for plagiarism in the scholarly literature was blame-free addition of the citation, it would be way more prevalent and the job of reviewer would be way more fraught and unappealing (and biased outcomes would proliferate). The system is built on presumption of good faith, with exceptions being scarce.


Maybe a happy middle ground could be an optional text box?
I don't think many folks are currently tracking where/when they use LLMs in a systematic or auditable way; I would think that's rare. You may, however, recall wanting to auto-generate documentation explaining your functions.

I think the next item on the list, with a human acknowledgement, also sets this expectation and accountability for submitters.


### Disclosure of generative AI use in pyOpenSci reviewed packages

* When you submit a package to pyOpenSci, please disclose any use of LLMs (Large Language Models) in your package’s generation by checking the appropriate boxes on our software submission form. Disclosure should generally include what parts of your package were developed using LLM tools.
Contributor

Along with the above: checking boxes -> describing your use of generative AI.

The policy below was co-developed by the pyOpenSci community. Its goals are:

* Acknowledgment of and transparency around the widespread use of Generative AI tools (with a focus on Large Language Models (LLMs) in Open Source.
* Protect peer review efficiency: Ensure human review of any LLM-generated contributions to a package to protect editor and reviewer volunteer time in our peer review process.
Contributor

A little confusing - we want to avoid pyOS reviewers being the first ones to review LLM-generated code, so by "human review" we mean "prior author review." So maybe something like: "Ensure an equitable balance of labor, where authors have ensured that generated material is in a state that minimizes review time, and reviewers are not responsible for correcting errors and unclarity in machine-generated code. The pyOS review process should not be used as a mechanism for outsourcing human review of generated code."


The policy below was co-developed by the pyOpenSci community. Its goals are:

* Acknowledgment of and transparency around the widespread use of Generative AI tools (with a focus on Large Language Models (LLMs) in Open Source.
Contributor

Along with this: allow reviewers to make informed decisions about what they choose to review, and allow authors to have reviewers who align with their values and practices. Facilitate a review process that aligns with the ethics and values of its participants.

@lwasser requested a review from a team on September 16, 2025