
Conversation

@mengxr (Contributor) commented Mar 17, 2014

The current implementation uses `Array(1.0, features: _*)` to construct a new array with the intercept. This is not efficient for big arrays because `Array.apply` uses a for loop that iterates over the arguments. `Array.+:` is a better choice here.

Also, I don't see a reason to set the initial weights to ones, so I set them to zeros.

JIRA: https://spark-project.atlassian.net/browse/SPARK-1260
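
For reference, a minimal sketch of the two changes described above; the method names `withInterceptSlow`/`withInterceptFast` are illustrative, not the actual Spark code:

```scala
// Before: Array.apply walks its varargs sequence element by element.
def withInterceptSlow(features: Array[Double]): Array[Double] =
  Array(1.0, features: _*)

// After: +: allocates the result array once and copies the input in bulk.
def withInterceptFast(features: Array[Double]): Array[Double] =
  1.0 +: features

// Initial weights: zeros instead of ones. A fresh JVM array is
// zero-initialized, so no explicit fill is needed.
val numFeatures = 10 // illustrative
val initialWeights = new Array[Double](numFeatures)
```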

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@AmplabJenkins

Merged build finished.

@AmplabJenkins

All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/13213/

@rxin (Contributor) commented Mar 18, 2014

Thanks. I've merged this!

asfgit closed this in e108b9a Mar 18, 2014
mengxr deleted the sgd branch March 18, 2014 22:40
mengxr added a commit to mengxr/spark that referenced this pull request Mar 19, 2014
The current implementation uses `Array(1.0, features: _*)` to construct a new array with intercept. This is not efficient for big arrays because `Array.apply` uses a for loop that iterates over the arguments. `Array.+:` is a better choice here.

Also, I don't see a reason to set initial weights to ones. So I set them to zeros.

JIRA: https://spark-project.atlassian.net/browse/SPARK-1260

Author: Xiangrui Meng <[email protected]>

Closes apache#161 from mengxr/sgd and squashes the following commits:

b5cfc53 [Xiangrui Meng] set default weights to zeros
a1439c2 [Xiangrui Meng] faster construction of features with intercept
ericl pushed a commit to ericl/spark that referenced this pull request Jan 23, 2017
## What changes were proposed in this pull request?

This PR adds a new project `sql-kafka-0-8` to support Kafka 0.8 for Structured Streaming. It follows the design of the Kafka 0.10 source except:
- It doesn't support `subscribePattern`, because without the 0.10 Kafka APIs we would have to fetch all topics from ZooKeeper and filter them ourselves.
- It doesn't support the `failOnDataLoss` option, which means the user cannot delete topics while a query is running, otherwise the query will fail.

In addition, compared to the DStream Kafka 0.8 source, it has one additional feature:
- It supports discovering new partitions of a topic when the user uses the `subscribe` option (see the usage sketch below).

Author: Shixiong Zhu <[email protected]>

Closes apache#161 from zsxwing/kafka08.
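
A minimal usage sketch for such a source. The format name `kafka08` and the option keys are assumptions borrowed from the Kafka 0.10 source, not confirmed by this PR:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("kafka08-demo").getOrCreate()

// Option keys mirror the Kafka 0.10 source; the short name actually
// registered by the 0.8 source may differ (assumption).
val df = spark.readStream
  .format("kafka08")
  .option("kafka.bootstrap.servers", "host1:9092,host2:9092")
  .option("subscribe", "topic1,topic2") // new partitions are discovered automatically
  .load()

// Kafka rows carry binary key/value columns; cast them for display.
val query = df
  .selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)")
  .writeStream
  .format("console")
  .start()

query.awaitTermination()
```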
ericl pushed a commit to ericl/spark that referenced this pull request Jan 23, 2017
## What changes were proposed in this pull request?

A follow-up PR for apache#161 to disallow unsupported options.

## How was this patch tested?

`test("unsupported options")`

Author: Shixiong Zhu <[email protected]>

Closes apache#169 from zsxwing/kafka08-errors.
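
A hypothetical sketch of how such validation might look; the option names, method, and exception type are illustrative assumptions, not the actual code:

```scala
// Hypothetical: options the 0.8 source cannot honor, lower-cased for
// case-insensitive matching. Names taken from the PR description above.
val unsupportedOptions = Set("subscribepattern", "failondataloss")

def validateOptions(parameters: Map[String, String]): Unit = {
  val keys = parameters.keySet.map(_.toLowerCase)
  unsupportedOptions.intersect(keys).foreach { key =>
    throw new IllegalArgumentException(
      s"Option '$key' is not supported by the Kafka 0.8 source")
  }
}
```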
ash211 referenced this pull request in palantir/spark Mar 3, 2017
* Allow setting memory on the driver submission server.

* Address comments

* Address comments

(cherry picked from commit f6823f3)
lins05 pushed a commit to lins05/spark that referenced this pull request Apr 23, 2017
* Allow setting memory on the driver submission server.

* Address comments

* Address comments
erikerlandson pushed a commit to erikerlandson/spark that referenced this pull request Jul 28, 2017
* Allow setting memory on the driver submission server.

* Address comments

* Address comments
yoonlee95 pushed a commit to yoonlee95/spark that referenced this pull request Aug 17, 2017
YSPARK-713: Made changes to spark-env-gen.sh to resolve keystore and truststore URLs on the QE cluster
jlopezmalla pushed a commit to jlopezmalla/spark that referenced this pull request Feb 27, 2018
Igosuki pushed a commit to Adikteev/spark that referenced this pull request Jul 31, 2018
…emporary disconnection between driver and Mesos master. (apache#161)
bzhaoopenstack pushed a commit to bzhaoopenstack/spark that referenced this pull request Sep 11, 2019
Refactor Ansible jobs to keep their structure consistent
microbearz pushed a commit to microbearz/spark that referenced this pull request Dec 15, 2020
microbearz pushed a commit to microbearz/spark that referenced this pull request Dec 15, 2020
* Revert "release r49 (apache#162)"

This reverts commit 62da28f.

* Revert "release r48 (apache#161)"

This reverts commit 1441531.

* revert ae release r52

* Revert "release r48 (apache#161)"

This reverts commit 1441531.

Co-authored-by: 7mming7 <[email protected]>
microbearz pushed a commit to microbearz/spark that referenced this pull request Dec 15, 2020
* Revert "release r49 (apache#162)"

This reverts commit 62da28f.

* Revert "release r48 (apache#161)"

This reverts commit 1441531.

* revert ae skew release r51

Co-authored-by: 7mming7 <[email protected]>
turboFei pushed a commit to turboFei/spark that referenced this pull request Nov 6, 2025
turboFei added a commit to turboFei/spark that referenced this pull request Nov 6, 2025
…33] Backport insert operation lock (apache#197)

* [HADP-40184] Backport insert operation lock (#15)

[HADP-31946] Fix data duplication on application retry and support concurrent writes to different partitions in the same table.
[HADP-33040][HADP-33041] Optimize merging staging files to the output path and detect conflicts with the HDFS file lease.
[HADP-34738] During commitJob, merge paths with multiple threads (apache#218)
[HADP-36251] Enhance the concurrent lock mechanism for insert operations (apache#272)
[HADP-37137] Add option to disable the insert operation lock when writing partitioned tables (apache#286)

* [HADP-46224] Do not overwrite the lock file when creating lock (apache#133)

* [HADP-46868] Fix Spark merge path race condition (apache#161)

* [HADP-50903] Ignore the error message if insert operation lock file has been deleted (apache#271)

* [HADP-50733] Enhance the error message on picking insert operation lock failure (apache#267)

* Fix

* Fix

* Fix

* fix

* Fix

* Fix

* Fix

* Fix

* Fix

* [HADP-50574] Support to create the lock file for EC enabled path (apache#263)

* [HADP-50574][FOLLOWUP] Add parameter type when getting overwrite method (apache#265)

* [HADP-50574][FOLLOWUP] Add UT for creating ec disabled lock file and use underlying DistributedFileSystem for ViewFileSystem (apache#266)

* Fix

* Fix

* Fix

* [HADP-34612][FOLLOWUP] Do not show the insert local error by removing the being written stream from dfs client (apache#288)

* Enabled Hadoop 3

---------

Co-authored-by: fwang12 <[email protected]>