
Conversation

@lwasser (Member) commented Sep 11, 2025

This policy was developed based on a conversation here:

#331

I think we should review this policy and also consider linking to a blog post that covers some of our broad concerns and our reasons for developing it.

@@ -0,0 +1,10 @@
Member Author

Once we are happy with this checklist, I'll open a PR to update our submission template as well.


**Optional:** Let projects know that it's a great idea to have a `.github` repository for the project organization where they can host a commonly used LICENSE, Code of Conduct, and even a YAML file with label definitions. These items will then be automatically applied to every repository in the organization to ensure consistency (but can still be customized within individual repos). The [SunPy project](https://github.com/sunpy/.github/) has a great example of this.

---
- [ ] [Initial onboarding survey was filled out ](https://forms.gle/F9mou7S3jhe8DMJ16)
We appreciate each maintainer of the package filling out this survey individually. :raised_hands:
Thank you, authors, in advance for setting aside five to ten minutes to do this. It truly helps our organization. :raised_hands:
Thank you, authors, in advance, xwfor setting aside five to ten minutes to do this. It truly helps our organization. :raised_hands:


Suggested change
Thank you, authors, in advance, xwfor setting aside five to ten minutes to do this. It truly helps our organization. :raised_hands:
Thank you, authors, in advance, for setting aside five to ten minutes to do this. It truly helps our organization. :raised_hands:


The policy below was co-developed by the pyOpenSci community. Its goals are:

* Acknowledgment of and transparency around the widespread use of Generative AI tools (with a focus on Large Language Models (LLMs) in Open Source.


Suggested change
* Acknowledgment of and transparency around the widespread use of Generative AI tools (with a focus on Large Language Models (LLMs) in Open Source.
* Acknowledgment of and transparency around the widespread use of Generative AI tools with a focus on Large Language Models (LLMs) in Open Source.

unmatched parens

* When you submit a package to pyOpenSci, please disclose any use of LLMs (Large Language Models) in your package’s generation by checking the appropriate boxes on our software submission form. Disclosure should generally include what parts of your package were developed using LLM tools.
* Please also disclose this use of Generative AI tools in your package's `README.md` file and in any modules where generative AI contributions have been implemented.
* We require that all aspects of your package have been reviewed carefully by a human on your maintainer team. Please ensure all text and code have been carefully checked for bias, bugs, and issues before submitting to pyOpenSci.
* Your acknowledgment of using Generative AI will not impact the success of your submission unless you have blindly copied text and code into your package without careful review and evaluation of its accuracy, and for any systemic bias.


I don't know if this can realistically be promised. I can imagine some reviewers would not want to participate in such a review. Imagine being asked to review a paper where the authors acknowledge that anonymous non-consenting students and a paid ghostwriter wrote portions of it, but the author has read it all so it's good to go. You might well believe that constitutes research misconduct because it violates the policies of your university and of other journals you typically review for (which may follow COPE and CRediT).

You could say that reviewers assigned to the submission will have to promise it won't affect their review. (IMO, reviewers need to be aware of this and have an informed opportunity to choose whether to consent.)

Contributor

Right. Maybe something like "use of generative AI, if disclosed, does not disqualify your package from review. Your disclosure is used to match your submission with reviewers with similar values and experience regarding gen AI. Incomplete disclosure of gen AI could affect your review success, as reviewers are volunteers and retain discretion to discontinue their review if this, or any other part of your submission, appears inaccurate on closer inspection."

Contributor

I'm not sure we want to match packages with "reviewers who share your values". That (a) makes it even harder to find reviewers (instead of asking anyone, I now need to ask first "Where do you stand on the copyright and LLM debate?") and (b) splits pyOpenSci into two parts: the "pyOpenSci with AI-friendly authors and reviewers" and the "pyOS with anti-LLM authors and reviewers".


This is also "where do you stand on publication ethics". Unpaid reviewing of scholarly work is service to the epistemic integrity of a community and knowledge. I think the motivations to review for pyOS are more akin to reviewing for scholarly journals than to conducting an external security audit (which is almost always a paid activity and has a narrow purpose).

To impose on reviewers a uniform policy that violations of publication ethics norms must not influence their review will drive those reviewers away. The remaining reviewers, being "LLM-friendly", may be prone to use LLMs to create the appearance that they have conducted a review. (This is a huge problem for CS conferences where chairs are desperate for reviewers and people would like the professional recognition of being a member of the program committee without doing the work.)

Comment on lines +2 to +7
- [ ] Some parts of the package were created using LLMs in some way.
* Please check any of the boxes below to clarify which parts of the package were impacted by LLM use
- [ ] LLMs were used to develop code
- [ ] LLMs were used to develop documentation
- [ ] LLMs were used to develop tests
- [ ] LLMs were used to develop infrastructure (CI, automation)
Contributor

So I think a checklist can be good for prompting on different things, but I do think we want this to be a text response - "used to develop code" could mean anything from "had tab autocompletion on and used it once" to "wholly generated by LLMs."

So maybe something like the following (a rough template sketch follows the list)...

* Generative AI was used to produce some of the material in this submission
* If the above box was checked, please describe how generative AI was used, including:
  * Which parts of the submission were generated: e.g. documentation, tests, code. In addition to a general description, please specifically indicate any substantial portions of code (classes, modules, subpackages) that were wholly or primarily generated by AI.
  * The approximate scale of the generated portions: e.g. "all of the tests were generated and then checked by a human," "small routines were generated and copied into the code."
  * How the generative AI was used: e.g. line completion, help with translation, queried separately and integrated, agentic workflow.
* If generative AI was used, the authors affirm that all generated material has been reviewed and edited for clarity, concision, correctness, and absence of machine bias. The authors are responsible for the content of their work, and affirm that it is in a state where reviewers will not be responsible for primary editing and review of machine-generated material.
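
Sketched roughly as a submission-template snippet, that could read something like this (the checkbox text and prompts below are only illustrative placeholders, not final policy wording):

```markdown
<!-- illustrative wording only; not final policy text -->
- [ ] Generative AI was used to produce some of the material in this submission.

If you checked the box above, please describe:

* **Which parts were generated** (e.g., documentation, tests, code), specifically noting any
  substantial portions (classes, modules, subpackages) that were wholly or primarily generated.
* **The approximate scale** (e.g., "all of the tests were generated and then checked by a human").
* **How generative AI was used** (e.g., line completion, help with translation, agentic workflow).

- [ ] The authors affirm that all generated material has been reviewed and edited by a human
      maintainer for clarity, concision, correctness, and absence of machine bias.
```

The free-text prompts are the important part; the checkboxes just make the disclosure easy for editors to scan.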


I like this direction, but I'm stuck on a lot of questions about this part:

* What does "authors are responsible for the content of their work" mean? Is that meant to be a statement of provenance or merely of agreement?
* What if the submission contains a page of code that matches verbatim a different package with an incompatible license/no attribution? Is that misconduct, something a reviewer would politely ask them to fix, or acceptable as-is if it is plausible to believe it was tab-completed instead of flagrantly copied? (In a journal context, that would involve rejection and possible reporting of misconduct to the authors' institutions.)
* What if that code was accepted in a PR from a minor contributor who is not an author on a paper? (Super common for projects with many contributors. You might have hundreds of minor contributors, but a dozen core developers and maintainers.) Is there an expectation that maintainers practice due diligence? (This is why I think DCO-style norms are so important.)
* If the "fix" remedy is chosen, does that whole part of the code need to be rewritten using clean-room methods or is it enough to change variable names so it no longer shows up as a verbatim match?

Contributor

> What does "authors are responsible for the content of their work" mean? Is that meant to be a statement of provenance or merely of agreement?

It's a clarification within the affirmation that "the authors have reviewed and stand behind their work." Regardless of provenance, the authors are the ones submitting the thing to be reviewed, and they are responsible for the content of the thing being reviewed.

This is also meant to address this question:

> What if that code was accepted in a PR from a minor contributor who is not an author on a paper?

The people engaging in the review process take responsibility for the material they are asking to have reviewed.

> What if the submission contains a page of code that matches verbatim a different package with an incompatible license/no attribution?

I think this is important but should probably be a separate item, e.g. in this comment: "we currently ask authors to write something about the state of the field of neighboring packages, ... if authors have generated some substantial part of their package that could have conceivably been "inspired by"/copied from another existing package, ask if they have searched for related implementations, and write something short about why not use that, and if the code does appear to overlap substantially, add some attribution. ..."

It might be a pretty high bar, and I would definitely be open to disagreement on it, because I agree both that we should encourage people to be responsible with provenance and that we can't ask someone to chase down the provenance of an undefinably small chunk of code. Again, I'm thinking in terms of facilitating review rather than what I would consider optimal development practices: "if there was a whole module of generated code that probably drew from a popular package, what would a reviewer need to know, and what is a standard we could expect from an author regarding code duplication from LLM code generation?"
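
For example, such an attribution could be as lightweight as a short note in the docs or a module docstring (the package names and paths below are purely hypothetical, just to show the shape):

```markdown
<!-- hypothetical names and paths, for illustration only -->
## Attribution

Parts of `mypackage/io/readers.py` were drafted with LLM assistance and adapt code
from the `otherpkg` project (BSD 3-Clause); the original license text is included
in `LICENSES/otherpkg-LICENSE.txt`.
```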

if the "fix" remedy is chosen

i would think attribution would be the preferred remedy here, but i also don't think that needs to be prescribed in the policies and can be a matter of negotiation between reviewers, editor, and author.


Thanks. Attribution is only a "fix" if the licenses are compatible.

More generally, I'm worried that we're losing a culture (admittedly very inconsistently-practiced) of due diligence and replacing it with one in which everyone has plausible deniability, which is then vulnerable to DOS attack by dishonest academic-bean-farmers (and undermines the legitimacy of the entire JOSS/pyOS/+ community). If adequate redress for plagiarism in the scholarly literature was blame-free addition of the citation, it would be way more prevalent and the job of reviewer would be way more fraught and unappealing (and biased outcomes would proliferate). The system is built on presumption of good faith, with exceptions being scarce.


Maybe a happy middle ground could be an optional text box?
I don't think many folks are currently tracking where/when they use LLMs in a systematic or auditable way; I would think that's rare. You may, however, recall wanting to auto-generate documentation explaining your functions.

I think the next item on the list, with a human acknowledgement, also sets this expectation and accountability for submitters.


### Disclosure of generative AI use in pyOpenSci reviewed packages

* When you submit a package to pyOpenSci, please disclose any use of LLMs (Large Language Models) in your package’s generation by checking the appropriate boxes on our software submission form. Disclosure should generally include what parts of your package were developed using LLM tools.
Contributor

Along with the above: checking boxes -> describing your use of generative AI.

The policy below was co-developed by the pyOpenSci community. Its goals are:

* Acknowledgment of and transparency around the widespread use of Generative AI tools (with a focus on Large Language Models (LLMs) in Open Source.
* Protect peer review efficiency: Ensure human review of any LLM-generated contributions to a package to protect editor and reviewer volunteer time in our peer review process.
Contributor

A little confusing - we want to avoid pyOS reviewers being the first ones to review LLM-generated code, so by "human review" we mean "prior author review." So maybe something like: "Ensure an equitable balance of labor, where authors have ensured that generated material is in a state that minimizes review time, and reviewers are not responsible for correcting errors and unclarity in machine-generated code. The pyOS review process should not be used as a mechanism for outsourcing human review of generated code."


The policy below was co-developed by the pyOpenSci community. Its goals are:

* Acknowledgment of and transparency around the widespread use of Generative AI tools (with a focus on Large Language Models (LLMs) in Open Source.
Contributor

Along with this: allow reviewers to make informed decisions about what they choose to review, and allow authors to have reviewers who align with their values and practices. Facilitate a review process that aligns with the ethics and values of its participants.

@lwasser requested a review from a team on September 16, 2025