@srowen
Member

@srowen srowen commented May 14, 2014

LICENSE and NOTICE policy is explained here:

http://www.apache.org/dev/licensing-howto.html
http://www.apache.org/legal/3party.html

This leads to the following changes.

First, this change enables two extensions to maven-shade-plugin in assembly/ that will try to include and merge all NOTICE and LICENSE files. This can't hurt.

This generates a consolidated NOTICE file, whose contents I manually added to NOTICE.

Next, a list of all dependencies and their licenses was generated:
`mvn ... license:aggregate-add-third-party`
to create: `target/generated-sources/license/THIRD-PARTY.txt`

Each dependency is listed with one or more licenses. Where there is more than one, I determined the most compatible license for each.
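For anyone repeating this, the per-dependency choice could be roughed out with a small script. This is only a sketch: it assumes each entry in THIRD-PARTY.txt starts with one or more parenthesized license names followed by the dependency name and coordinates, and the `PREFERENCE` order, `parse_line` and `pick_license` are hypothetical names made up for illustration, not part of this change. The real call still needs a human reading the ASF policy pages.

```python
import re

# Hypothetical preference order, most permissive first. Illustrative only;
# the actual decision should be checked by hand against ASF policy.
PREFERENCE = ["apache", "mit", "bsd", "cddl", "epl", "mpl"]

# One leading "(License Name)" group, optionally preceded by whitespace.
LEADING_LICENSE = re.compile(r"\s*\(([^()]*)\)")


def parse_line(line):
    """Split one entry into (list of license names, remaining text).

    Assumes the entry looks like:
        (License A) (License B) Name (group:artifact:version - url)
    """
    licenses = []
    pos = 0
    while True:
        match = LEADING_LICENSE.match(line, pos)
        if match is None:
            break
        licenses.append(match.group(1).strip())
        pos = match.end()
    return licenses, line[pos:].strip()


def pick_license(licenses):
    """Return the license ranked first by PREFERENCE, or None if empty."""
    def rank(name):
        lowered = name.lower()
        for index, key in enumerate(PREFERENCE):
            if re.search(r"\b" + re.escape(key) + r"\b", lowered):
                return index
        return len(PREFERENCE)  # unrecognized names sort last

    return min(licenses, key=rank) if licenses else None


if __name__ == "__main__":
    entry = ("     (CDDL 1.1) (Apache License 2.0) Example Lib "
             "(org.example:example-lib:1.0 - http://example.org)")
    names, rest = parse_line(entry)
    print(pick_license(names), "<-", rest)
```

Anything the script ranks last (unrecognized names) is a candidate for manual review rather than a final answer.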

For "unknown" license dependencies, I manually evaluated their license. Many are actually Apache projects or components of projects covered already. The only non-trivial one was Colt, which has its own (compatible) license.

I ignored Apache-licensed and public domain dependencies as these require no further action (beyond NOTICE above).

BSD and MIT licenses (permissive Category A licenses) are evidently supposed to be mentioned in LICENSE, so I added a section with the relevant output from the THIRD-PARTY.txt file.

Everything else (Category B licenses) is evidently supposed to be mentioned in NOTICE (?), so I did the same there.
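The routing described above can be summarized as a tiny lookup. This is a sketch of my reading of the policy as applied in this change, not an official rule set; `destination` is a hypothetical helper and the substring matching is deliberately crude.

```python
import re


def destination(license_name):
    """Say where a dependency's license would be mentioned.

    Mirrors the routing described in this PR:
      unknown                 -> manual review
      Apache / public domain  -> no further action
      BSD / MIT               -> LICENSE
      everything else         -> NOTICE
    """
    lowered = license_name.lower()
    if "unknown" in lowered:
        return "manual review"
    if "apache" in lowered or "public domain" in lowered:
        return "no further action"
    if "bsd" in lowered or re.search(r"\bmit\b", lowered):
        return "LICENSE"
    return "NOTICE"


if __name__ == "__main__":
    for name in ["Apache License 2.0", "The MIT License",
                 "New BSD License", "Mozilla Public License 1.1"]:
        print(name, "->", destination(name))
```

License names in THIRD-PARTY.txt are free text, so a match here is a hint for where to look, not a substitute for reading the license.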

LICENSE contained some license statements for source code that is redistributed. I left this as I think that is the right place to put it.

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@AmplabJenkins

Merged build finished. All automated tests passed.

@AmplabJenkins

All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14976/

@pwendell
Contributor

@srowen thanks a ton for looking through this. I'll go ahead and merge this and cut a new RC.

asfgit pushed a commit that referenced this pull request May 14, 2014
…tive dependency info

Author: Sean Owen <[email protected]>

Closes #770 from srowen/SPARK-1827 and squashes the following commits:

a764504 [Sean Owen] Add LICENSE and NOTICE info for all transitive dependencies as of 1.0
(cherry picked from commit 2e5a7cd)

Signed-off-by: Patrick Wendell <[email protected]>
@asfgit asfgit closed this in 2e5a7cd May 14, 2014
@pwendell
Contributor

Hey @srowen - I noticed that hadoop and a bunch of the other Apache projects have like 3-4 things in the NOTICE file. Are those projects in violation of the "letter of the law" here? It seems like a lot of these dependencies are likely also dependencies of e.g. Hadoop.

@mateiz
Contributor

mateiz commented May 14, 2014

I was going to ask the same thing, from what I've seen in other projects. It probably doesn't hurt to keep all of these notices, especially if they're automatically generated, but it may not be necessary. During incubation we were told to put stuff in LICENSE only for libraries we ship in source code form, and in NOTICE only for certain licenses that require mention there (e.g. MPL). I believe MIT, BSD and Apache licensed libraries don't require it.

@mateiz
Contributor

mateiz commented May 14, 2014

E.g. look at how few are mentioned in http://www.us.apache.org/dist/hadoop/common/hadoop-2.4.0/.

@srowen
Member Author

srowen commented May 14, 2014

I think the difference is that Spark is distributing its dependencies too in the assembly, whereas I am not sure Hadoop et al. do more than distribute their own artifacts. If that's not correct, then yeah, it's very possible Hadoop doesn't get it right.

I am pretty confident that this is the right thing to do for Spark, and went back to close-read the official word on what goes where. It surprised me a little too. Better safe than sorry, and I think it's buttoned up now to the best of a reasonable person's ability. Thanks for integrating it.

@srowen srowen deleted the SPARK-1827 branch May 16, 2014 11:00
@mateiz
Contributor

mateiz commented May 18, 2014

@srowen Hadoop does distribute binary artifacts that work without a dependency download, so it might be good to let them know about this. Thanks for taking a look at the official policy though.

@srowen
Member Author

srowen commented May 18, 2014

@mateiz That's a good question. I browsed through the Hadoop 2.4.0 binary distribution, and it looks like none of the hadoop-* JAR files are 'assembly' JARs -- they all just contain Hadoop-related code. The distro contains standalone third-party JARs in various lib/ directories though. I would have expected to see similar notices in NOTICE.txt and/or LICENSE.txt but there is no mention of any of these third-party libraries. Each JAR contains its own NOTICE and/or LICENSE, I suppose, and that probably technically satisfies the requirement. I still would have expected this to be reproduced, I think. I'll ping it over to those more knowledgeable to see if anyone thinks that needs a change.

pdeyhim pushed a commit to pdeyhim/spark-1 that referenced this pull request Jun 25, 2014
…tive dependency info

Author: Sean Owen <[email protected]>

Closes apache#770 from srowen/SPARK-1827 and squashes the following commits:

a764504 [Sean Owen] Add LICENSE and NOTICE info for all transitive dependencies as of 1.0