@giwa
Contributor

giwa commented May 18, 2014

toFormattedString should return the duration formatted with its unit, such as "10 ms", not just "10".
toString should return the plain string form of the duration, that is, just the number of milliseconds.

Currently it behaves like this:

duration = Duration(10)
duration.toString()
>> "10 ms"
duration.toFormattedString()
>> "10"

It should be:

duration = Duration(10)
duration.toString()
>> "10"
duration.toFormattedString()
>> "10 ms"

Please explain what "formatted" means here. Why does toFormattedString return only the millisecond value as a string?

@AmplabJenkins

Can one of the admins verify this patch?

@ash211
Contributor

ash211 commented May 18, 2014

Where else in the Spark codebase are these methods called? If we're switching their meaning we need to make sure that callers are updated to expect the new formats.

@giwa
Contributor Author

giwa commented May 18, 2014

@ash211 Thank you for your comment. On second thought, this suggestion is not a good one: toString in the Java world returns a human-readable format.

The reason I raised this question is that I am writing a Python wrapper for Duration. Since you are familiar with Python, could I ask some questions?

Do we need __str__ and __repr__ in Duration? If so, what should each of them return?

str(Duration)
"10 ms"

repr(Duration)
"10"

BTW, toFormattedString is not used anywhere. As for toString, I think it is not used anywhere in streaming, though I need more time to check thoroughly.

giwa closed this May 18, 2014

@ash211
Contributor

ash211 commented May 19, 2014

I'm familiar with the Python language but much less so with conventions like str vs. repr.

From this SO post, though, it looks like str should return "10 ms", while repr should return something like "Duration(10ms)":

http://stackoverflow.com/questions/1436703/difference-between-str-and-repr-in-python
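
To make the str/repr split concrete, here is a minimal sketch of how a Python wrapper could define both methods. The class below is hypothetical and is not the actual PySpark implementation; the `_millis` attribute name is an assumption. It also uses `Duration(10)` rather than `Duration(10ms)` for repr, on the assumption that a repr which can be passed back to the constructor is preferable.

```python
class Duration:
    """Hypothetical wrapper around a duration in milliseconds (illustration only)."""

    def __init__(self, millis):
        # Assumed attribute name; stores the duration as an integer number of ms.
        self._millis = int(millis)

    def __str__(self):
        # Human-readable form, mirroring what toString returns on the JVM side.
        return f"{self._millis} ms"

    def __repr__(self):
        # Unambiguous form for debugging; valid as a constructor call.
        return f"Duration({self._millis})"


d = Duration(10)
print(str(d))   # 10 ms
print(repr(d))  # Duration(10)
```

With this choice, `eval(repr(d))` rebuilds an equivalent object, which is the usual guideline for repr when it is practical.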

cloud-fan pushed a commit that referenced this pull request Dec 12, 2022
### What changes were proposed in this pull request?

Remove overriding the description method in the V2 file sources. `FileScan` already uses all the metadata to create the description, so adding the same fields to the overridden description creates duplicates.

### Why are the changes needed?

Example parquet scan from the agg pushdown suite:

Before:
```
+- BatchScan parquet file:/...[min(_3)#814, max(_3)#815, min(_1)#816, max(_1)#817, count(*)#818L, count(_1)#819L, count(_2)#820L, count(_3)#821L] ParquetScan DataFilters: [], Format: parquet, Location: InMemoryFileIndex(1 paths)[file:/..., PartitionFilters: [], PushedAggregation: [MIN(_3), MAX(_3), MIN(_1), MAX(_1), COUNT(*), COUNT(_1), COUNT(_2), COUNT(_3)], PushedFilters: [], PushedGroupBy: [], ReadSchema: struct<min(_3):int,max(_3):int,min(_1):int,max(_1):int,count(*):bigint,count(_1):bigint,count(_2)..., PushedFilters: [], PushedAggregation: [MIN(_3), MAX(_3), MIN(_1), MAX(_1), COUNT(*), COUNT(_1), COUNT(_2), COUNT(_3)], PushedGroupBy: [] RuntimeFilters: []
```

After:
```
 +- BatchScan parquet file:/...[min(_3)#814, max(_3)#815, min(_1)#816, max(_1)#817, count(*)#818L, count(_1)#819L, count(_2)#820L, count(_3)#821L] ParquetScan DataFilters: [], Format: parquet, Location: InMemoryFileIndex(1 paths)[file:/..., PartitionFilters: [], PushedAggregation: [MIN(_3), MAX(_3), MIN(_1), MAX(_1), COUNT(*), COUNT(_1), COUNT(_2), COUNT(_3)], PushedFilters: [], PushedGroupBy: [], ReadSchema: struct<min(_3):int,max(_3):int,min(_1):int,max(_1):int,count(*):bigint,count(_1):bigint,count(_2)... RuntimeFilters: []
```

### Does this PR introduce _any_ user-facing change?

Just description change in explain output.

### How was this patch tested?

Updated a few UTs to accommodate checking explain string.

Closes #38229 from Kimahriman/remove-file-source-description.

Authored-by: Adam Binford <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
beliefer pushed a commit to beliefer/spark that referenced this pull request Dec 18, 2022