I want to get a csv dump of all the data. dumping the data table with a where clause and referencing into the jsonb is as quick as I expected
2016-03-16 07607
kepstin
what's the query you're running for that?
2016-03-16 07608
alastairp
about 1000 rows per second
2016-03-16 07627
alastairp
select hl.mbid, hlm.data->>'value' as genre from highlevel_model hlm join highlevel hl on hlm.highlevel=hl.id where model=3
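(As a one-shot csv export, that query would typically be wrapped in psql's \copy; a sketch, with a hypothetical output file name:)

    \copy (SELECT hl.mbid, hlm.data->>'value' AS genre FROM highlevel_model hlm JOIN highlevel hl ON hlm.highlevel = hl.id WHERE hlm.model = 3) TO 'genres.csv' WITH CSV HEADER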
2016-03-16 07655
alastairp
as soon as I put the join in, it takes about 30 seconds per 100 rows
2016-03-16 07631
kepstin
hmm. you're extracting this information for every entry, rather than only a subset of them?
2016-03-16 07601
alastairp
hlm.model=3
2016-03-16 07622
alastairp
it’ll be 1 row for every item in highlevel, but only 1/14 for highlevel_model
2016-03-16 07639
alastairp
oh, I reversed the from/join in this one to see what the effect would be
2016-03-16 07658
kepstin
right, so the sequential scan over highlevel is expected; there's no more efficient way to get all the rows in a table. the problem is in the join itself.
try adding an index on highlevel_model (highlevel, model) perhaps?
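(A sketch of that suggestion; the index name is invented, and CONCURRENTLY avoids taking a write-blocking lock while the index builds on a large table:)

    CREATE INDEX CONCURRENTLY highlevel_model_highlevel_model_idx
        ON highlevel_model (highlevel, model);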
2016-03-16 07616
alastairp
hmm, right
2016-03-16 07643
kepstin
right now it's filtering by model then sorting both tables on highlevel id to do the merge. hmm.
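(The plan shape being read here would come from something like the following; EXPLAIN ANALYZE runs the query and reports actual rows and timings for each plan node:)

    EXPLAIN (ANALYZE, BUFFERS)
    SELECT hl.mbid, hlm.data->>'value' AS genre
    FROM highlevel_model hlm
    JOIN highlevel hl ON hlm.highlevel = hl.id
    WHERE hlm.model = 3;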
2016-03-16 07656
kepstin
hmm.
2016-03-16 07651
kepstin
having the index on highlevel,model might result in the results from the model filter being returned in presorted order, which could speed up the join?
2016-03-16 07653
kepstin
not sure :)
2016-03-16 07616
kepstin isn't a postgres expert, by far - he's trying to do his best rubber duck impression atm.
2016-03-16 07623
alastairp
building it now, let’s see
2016-03-16 07650
alastairp
(this is a big table, it might take some time :)
2016-03-16 07613
alastairp
settings have been tweaked, we have more shared buffers, etc
2016-03-16 07648
kepstin
hmm. adding that index might turn it into a nested loop instead of a merge join - scan the highlevel table then lookup each row in highlevel_model by (highlevel, model)
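(One way to check that hypothesis rather than guessing: disable merge joins for the session and compare plans; enable_mergejoin is a standard planner setting, and RESET restores the default afterwards:)

    SET enable_mergejoin = off;
    EXPLAIN (ANALYZE, BUFFERS)
    SELECT hl.mbid, hlm.data->>'value' AS genre
    FROM highlevel hl
    JOIN highlevel_model hlm ON hlm.highlevel = hl.id
    WHERE hlm.model = 3;
    RESET enable_mergejoin;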
2016-03-16 07656
kepstin
no idea if the query planner will actually pick that
2016-03-16 07603
kepstin
or if it would be faster :)
2016-03-16 07615
kepstin
hmm. that merge join doesn't actually have any explicit sorts being added, so it looks like it should be optimal or close to it.
2016-03-16 07617
alastairp
right, this is why I was a little confused
2016-03-16 07643
alastairp
because the select without the join is really fast
2016-03-16 07655
alastairp
and the join didn’t seem to add anything strange in the query planner
2016-03-16 07659
kepstin
what does the speed of a simple select hlm.data->>'value' as genre from highlevel_model as hlm where model=3 order by highlevel asc; look like?
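(In psql, \timing on prints the elapsed time of each statement, which is the quickest way to answer this:)

    \timing on
    SELECT hlm.data->>'value' AS genre
    FROM highlevel_model AS hlm
    WHERE model = 3
    ORDER BY highlevel ASC;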
2016-03-16 07644
alastairp
ah, with the order
2016-03-16 07609
kepstin
oh, i misunderstood your index names
2016-03-16 07618
kepstin
it's using "highlevel_ndx_highlevel_model" then filtering by model
2016-03-16 07630
alastairp
yeah, I’m not sure who came up with this name
2016-03-16 07630
kepstin
so it's actually reading *every row* to check the model
2016-03-16 07638
alastairp
but it’s field_ndx_tablename
2016-03-16 07652
alastairp
which seems odd to me
2016-03-16 07605
kepstin
so yeah, adding an index on (model, highlevel) or (highlevel, model) might help. Not sure which would help more.
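(A sketch of the (model, highlevel) variant; the name is invented. With model as the leading column, the model = 3 filter becomes an index range scan rather than a filter over every row, and rows within that range come back already sorted by highlevel, so it serves both the filter and the merge join's ordering:)

    CREATE INDEX CONCURRENTLY highlevel_model_model_highlevel_idx
        ON highlevel_model (model, highlevel);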