[SPARK-8376][Docs]Add common lang3 to the Spark Flume Sink doc #6829
Conversation
|
This is built into the assembly though, right? |
No. Spark Flume Sink does not assemble the dependencies. Actually, now we don't have an assembly jar for Flume. |
|
OK but surely it's easier to make an assembly target than tell people they have to piece together the dependencies and keep updating docs about it? |
ping @tdas about the assembly idea. |
|
Test build #34936 has finished for PR 6829 at commit
|
|
I think the assembly is a good idea. Though for that we will have to
@harishreedharan What do you think about this. |
|
Jenkins, retest this please. |
|
(I'm retesting this to see whether our new Jenkins PRB script is properly skipping the tests for doc-only changes) |
|
Test build #35074 has finished for PR 6829 at commit
|
|
+1 on creating assembly. I am not entirely sure what it takes to generate the assembly, but if it is possible to add that to the current sink module, that would be great. I doubt this would affect any existing deployments in any way. |
|
@zsxwing Then let's try to build an assembly. But for the benefit of branch-1.4, I am going to merge this PR to master and 1.4 (so that the docs are updated for 1.4.1). Then let's create a separate JIRA and PR for the assembly. |
Commons Lang 3 has been added as one of the dependencies of Spark Flume Sink since #5703. This PR updates the doc for it. Author: zsxwing <[email protected]> Closes #6829 from zsxwing/flume-sink-dep and squashes the following commits: f8617f0 [zsxwing] Add common lang3 to the Spark Flume Sink doc (cherry picked from commit 24e5379) Signed-off-by: Tathagata Das <[email protected]>
|
@srowen do you have an example to publish both the single jar and the assembly jar? Two approaches I'm thinking about:
I prefer 1 because I don't know how to implement 2 in maven. What do you think? |
|
Actually, I don't think we can publish two artifacts from the same project. Nor
|
|
Publishing the single jar would be helpful if people find some dependency conflicts or want to upgrade the version of a dependency library, and want to resolve it by themselves. I think people won't use the assembly jar in a pom.xml. So I think we can publish the assembly jar under the same artifact. |
|
Oh, I didn't realize you wanted to publish the assembly JAR. I don't think it makes sense to publish assemblies as Maven artifacts, right? Anyone who uses it via Maven does not want an assembly, and it causes a bunch of problems. So no, please don't publish the assembly that way. You just need a target to build the assembly, right? That's just a matter of adding a plugin. (You can publish multiple artifacts under different classifiers for one group/artifact, but this isn't the situation where that would be used.) |
|
The idea for publishing the assembly (which BTW is not that big as it
|
|
Yeah, I'm not worried about size. Maven isn't really the right place to distribute assemblies, as an assembly is not something to depend on. Yes, it just should be downloadable. I get that Maven artifacts are still a pretty easy way to make it available. In that case maybe a new module makes sense after all: flume-assembly. Or if you're saying nobody would ever use the existing no-assembly artifact anyway, I can see changing it to the assembly. But that suggests the existing module would never be used as a dependency. |
|
So the only reason for an assembly would be to add commons-lang3. I am more in favor of removing that dependency than making the build more complex. Flume already has scala in classpath (since that is pulled in by the Kafka dependency). I am inclined to keep this component as simple as possible and depend only on stuff already pulled in by Flume into its own classpath anyway. |
|
How extensive is our use of Commons Lang 3 in the Flume sink? If we only use one class or method, maybe we can just copy the source into our repository, depending on how complex or large it is. |
|
This is the only usage: I am inclined to just copy the method into the class. |
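
To make the "copy the method into the class" idea concrete, here is a minimal sketch. The thread does not show which Commons Lang 3 method the sink actually calls, so `isBlank` is only a stand-in, and `SinkStringUtils` is a hypothetical name, not something in the Spark source. The point is just that a single small utility can be reimplemented locally with nothing but the JDK.

```java
// Hypothetical local helper; the name and the choice of method are assumptions
// made for illustration, not taken from the Spark Flume sink.
final class SinkStringUtils {

    private SinkStringUtils() {
        // static helpers only, no instances
    }

    // Returns true when the input is null, empty, or contains only whitespace,
    // which is the documented behavior of commons-lang3 StringUtils.isBlank.
    static boolean isBlank(CharSequence cs) {
        if (cs == null || cs.length() == 0) {
            return true;
        }
        for (int i = 0; i < cs.length(); i++) {
            if (!Character.isWhitespace(cs.charAt(i))) {
                return false;
            }
        }
        return true;
    }
}
```

With a helper like this in the module, a call such as `SinkStringUtils.isBlank(value)` would replace the library call, and the extra jar would no longer have to be placed on Flume's classpath.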
|
I will open a PR later today for this one. |
|
Yep, let's just remove the dependency on Commons Lang. However, what is the On a related note, should we bump the Flume support to the latest Flume? It
|
|
Kafka brings in 2.10. |
|
I don't see a dependency on Kafka in Flume 1.4.0. What am I missing?
|
|
That looks like just the API module. I suspect it comes via the actual implementation such as in http://mvnrepository.com/artifact/org.apache.flume/flume-ng-sources/1.6.0 but I don't know Flume well. |
|
I see. So Kafka is present only through the flume-kafka-source Furthermore this is not available for Flume 1.4.0 as the Kafka source was added So here are two questions
I don't know enough about Flume, but I will be very surprised if the kafka @harishreedharan please comment.
|
|
Yes, all of the libs in the flume-ng/lib directory get added to the classpath, so Scala would get added to the classpath, but get loaded only as required (which is normal JVM behavior). We'd have to bump our dependency to 1.6.0 for Scala to be automagically available. Even if we don't upgrade, we don't need to change the dependency set, as the behavior is the same as before (add Scala to flume-ng/lib or the plugins dir). Apart from the assembly part, nothing else changes. I am sending a PR soon to get rid of the commons-lang3 dependency anyway. |
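
For anyone who wants to check whether the Scala library is actually visible on a given agent's classpath, a small probe along these lines could help. This is only a sketch: `ClasspathProbe` is a hypothetical name, and it uses the standard `Class.forName(name, initialize, loader)` call with `initialize` set to `false`, so the class is located but its static initializers are not run.

```java
// Hypothetical diagnostic helper, not part of Spark or Flume.
final class ClasspathProbe {

    private ClasspathProbe() {
        // static helpers only, no instances
    }

    // Returns true if the named class can be found by this class's loader.
    // Passing initialize=false looks the class up without initializing it.
    static boolean isAvailable(String className) {
        try {
            Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }
}
```

For example, `ClasspathProbe.isAvailable("scala.Option")` would be expected to return `true` only when a scala-library jar is on the classpath, for instance after it has been added to flume-ng/lib or the plugins dir.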
|
I agree with @srowen that Maven isn't really the right place to distribute assemblies. For the assembly jars, all we need to do is provide download links for people. I find now these jars are already assembled in |
|
Now that we don't have the commons-lang3 dependency in the flume-sink anymore, the assembly question for this module is moot. But if we want to have a more general discussion, we should perhaps move this discussion to a jira or the dev list? |