Skip to content

Add DBTables type #269

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Feb 11, 2022
Merged

Add DBTables type #269

merged 8 commits into from
Feb 11, 2022

Conversation

jeremiahpslewis
Copy link
Contributor

@jeremiahpslewis jeremiahpslewis commented Oct 23, 2021

@quinnj as discussed.

Resolves #209.

Note: moved few tests from typeof -> isa, let me know if I should nix it :)

TODO: need to add a test for db without any tables

@codecov
Copy link

codecov bot commented Nov 2, 2021

Codecov Report

Merging #269 (344472e) into master (bf7b7b8) will increase coverage by 0.13%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #269      +/-   ##
==========================================
+ Coverage   85.87%   86.01%   +0.13%     
==========================================
  Files           5        5              
  Lines         701      708       +7     
==========================================
+ Hits          602      609       +7     
  Misses         99       99              
Impacted Files Coverage Δ
src/SQLite.jl 96.15% <100.00%> (+0.02%) ⬆️
src/tables.jl 99.32% <100.00%> (+0.02%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update bf7b7b8...344472e. Read the comment docs.

@quinnj
Copy link
Member

quinnj commented Nov 2, 2021

Ok, this is a good start in the right direction, @jeremiahpslewis! Thanks for taking this on. Part of the awkwardness though in the original issue was that there wasn't a well-defined schema for the list of tables, which isn't quite solved here (as far as I can tell). I'm thinking maybe we try something along the lines of:

struct DBTables
    tbls::Vector{String}
end

Tables.isrowtable(::Type{DBTables}) = true
Tables.rows(x::DBTables) = x
Tables.schema(x::DBTables) = Tables.Schema((:name,), (String,))

function Base.iterate(x::DBTables, st=1)
    st > length(x.tbls) && return nothing
    return (name=x.tbls[st],), st + 1
end

function tables(db::DB, sink=nothing)
    tbls = DBInterface.execute(db, "SELECT name FROM sqlite_master WHERE type='table';")
    if sink !== nothing
        # for backwards compat
        return tbls |> sink
    end
    x = columntable(tbls)
    return DBTables(x == NamedTuple() ? String[] : x.name)
end

in this way, since we're defining our own Tables.schema on DBTables, we can assure that even if there are no tables in the database, it always at least returns a consistent schema. I haven't run this code, I just threw it together as hopefully to get the right idea.

@jeremiahpslewis
Copy link
Contributor Author

Thanks for the feedback. I think this may be already covered in the PR, but potentially not very clearly. The PR introduces the DBTables type here:

DBTable(name::String) = DBTable(name, Tables.schema(""))
and essentially just piggybacks/inherits on Tables.jl support for vectors to get all of the necessary properties. So because the vector type is iterable, the DBTables type is iterable and doesn't need to be further defined.

@jeremiahpslewis
Copy link
Contributor Author

There are a number of instances where tables are referred to as strings. If we move to a 'table object' with an optional schema, the question arises whether to support dispatch on strings and table objects (I suppose yes, for backward compatibility) as well as on table objects?

E.g. here:

function drop!(db::DB, table::AbstractString; ifexists::Bool=false)

@jeremiahpslewis
Copy link
Contributor Author

But I do think a key step which this PR takes is introducing a DBTable type which tracks both name and (optionally) schema. Kind of wonder whether this is a generic which is useful in other DBs as well

@quinnj
Copy link
Member

quinnj commented Nov 14, 2021

Ok, I better understand the design decision of defining const DBTables = Vector{DBTable}; I think that works well. My main desire is that we also define Tables.schema(x::DBTables) = Tables.Schema(names, types) where names and types would be the expected column names/types; that goes back to the original issue this is trying to solve that ensures if the sqlite db is empty then there's still an expected schema that sinks can use.

@jeremiahpslewis
Copy link
Contributor Author

jeremiahpslewis commented Dec 6, 2021

@quinnj I think I understand your concerns and am hopeful this is already covered by the proposed changes. I extended the tests so that the way in which this PR solves the original issue is clearer, for both the case when tables are present in the db as well as when there are no tables; let me know :)

@jeremiahpslewis
Copy link
Contributor Author

@quinnj Do you have an update on this PR? Would love to tie up any loose ends. :)

Copy link
Member

@quinnj quinnj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the slow review here; been tied up in other things. Thanks @jeremiahpslewis!

@quinnj quinnj merged commit d99fdfb into JuliaDatabases:master Feb 11, 2022
@bkamins
Copy link
Contributor

bkamins commented Jul 3, 2022

What is currently the point of sink argument in SQLite.tables function?

@quinnj
Copy link
Member

quinnj commented Jul 4, 2022

So you can control what kind of "table" you get the results back in. i.e. DataFrame, CSV.write, columntable or rowtable, etc. Probably overly generic here and we should just return columntable and you can convert it to something else yourself if needed.

@bkamins
Copy link
Contributor

bkamins commented Jul 4, 2022

@quinnj - this was the case before this PR and 1.4.1 release. Now this function always returns Vector{DBTable} so I am not sure what is the use of sink now?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

tables(db) for empty database should be (names=[],)
3 participants