-
alastairp
db._connection.reset() :)
2015-07-16 19730, 2015
-
alastairp
I still don’t like how we’re looping through both lowlevel and highlevel_json
2015-07-16 19736, 2015
-
alastairp
it means we do twice as much work
2015-07-16 19756, 2015
-
alastairp
we can do lowlevel.id=highlevel.id, highlevel.data=highlevel_json.id
2015-07-16 19709, 2015
-
Gentlecat
I didn't have much time for optimization yesterday :)
2015-07-16 19717, 2015
-
alastairp
and so when we’ve decided that a lowlevel is bad, automatically go to fix the hl json
2015-07-16 19719, 2015
-
alastairp
OK, no problem!
2015-07-16 19725, 2015
-
alastairp
do you think you can do that?
2015-07-16 19738, 2015
-
alastairp
I’ll continue testing this
2015-07-16 19715, 2015
-
Gentlecat
in one update query?
2015-07-16 19739, 2015
-
alastairp
well, perhaps do 2 update_row()s, but only get the candidate lowlevel ids
2015-07-16 19751, 2015
-
alastairp
then do another query to join into the highlevel_json table to get its id
2015-07-16 19702, 2015
-
alastairp
did I explain that clearly?
2015-07-16 19753, 2015
-
diana_olhovik joined the channel
2015-07-16 19739, 2015
-
Gentlecat
2015-07-16 19721, 2015
-
alastairp
cool
2015-07-16 19732, 2015
-
alastairp
I think the hl row id is wrong
2015-07-16 19738, 2015
-
alastairp
highlevel.id = lowlevel.id
2015-07-16 19749, 2015
-
Gentlecat
ohh
2015-07-16 19751, 2015
-
alastairp
except the highlevel_data.id is different - it’s stored in highlevel.data
2015-07-16 19754, 2015
-
Gentlecat
right
2015-07-16 19715, 2015
-
alastairp
also, your string interpolation in cursor.execute doesn’t work
2015-07-16 19719, 2015
-
alastairp
i just fixed it on mine
2015-07-16 19736, 2015
-
alastairp
you need to do interpolation for the table name, but then pass the actual data as the second argument to execute
2015-07-16 19758, 2015
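A minimal sketch of the pattern described above: string interpolation for the table name, and the second argument to execute() for the actual data. It assumes the standard-library sqlite3 module as a stand-in for the real PostgreSQL connection so that it runs on its own. The cursor parameter on get_data_as_text, the ALLOWED_TABLES whitelist and the sample row are hypothetical and not taken from the channel's code.

```python
import sqlite3

# Hypothetical whitelist; the table names come from the discussion above.
ALLOWED_TABLES = {"lowlevel", "highlevel_json"}


def get_data_as_text(cursor, table, row_id):
    """Fetch the data column of one row as text (illustrative version)."""
    # Placeholders cannot stand in for identifiers, so the table name is
    # interpolated into the SQL string after a whitelist check.
    if table not in ALLOWED_TABLES:
        raise ValueError("unexpected table: %r" % table)
    query = "SELECT data FROM %s WHERE id = ?" % table
    # The value itself goes in the second argument to execute(), so the
    # driver takes care of quoting it.
    cursor.execute(query, (row_id,))
    row = cursor.fetchone()
    return row[0] if row else None


conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE lowlevel (id INTEGER PRIMARY KEY, data TEXT)")
cur.execute("INSERT INTO lowlevel (id, data) VALUES (?, ?)", (1, '{"metadata": {}}'))
result = get_data_as_text(cur, "lowlevel", 1)
```

sqlite3 uses ? as its placeholder. With psycopg2 the data placeholder is %s, so when the table name is filled in with the % operator the data placeholder has to be written as %%s in the template.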
-
Gentlecat
right
2015-07-16 19713, 2015
-
alastairp
Row #True is bad! Fixing...
2015-07-16 19717, 2015
-
alastairp
I guess that should be a number!
2015-07-16 19725, 2015
-
alastairp
but otherwise, great. it works here
2015-07-16 19747, 2015
-
Gentlecat
without high-level stuff?
2015-07-16 19709, 2015
-
alastairp
yeah
2015-07-16 19710, 2015
-
alastairp
cursor.execute("SELECT id FROM %s" % table)
2015-07-16 19719, 2015
-
alastairp
without the loop this has to be lowlevel again
2015-07-16 19723, 2015
-
Gentlecat
just fixed that too
2015-07-16 19756, 2015
-
alastairp
(y)
2015-07-16 19702, 2015
-
Mineo
aren't you just getting all the ids from lowlevel in do_magic only to load the row itself in is_bad? any reason to not load both at once?
2015-07-16 19714, 2015
-
Mineo
oh
2015-07-16 19720, 2015
-
Mineo is not completely awake
2015-07-16 19709, 2015
-
Mineo
carry on doing whatever awesome things you're doing :)
2015-07-16 19738, 2015
-
alastairp
yes, postgres aborts the query if the json in the row is bad
2015-07-16 19707, 2015
-
alastairp
and python’s json module won’t say it’s bad, only postgres once you try and access it as a json field (rather than text)
2015-07-16 19713, 2015
-
Gentlecat
alastairp: updated
2015-07-16 19708, 2015
-
alastairp
hl_data_dict=purify(get_data_as_text("highlevel_json", row['id'])),
2015-07-16 19711, 2015
-
alastairp
still wrong :(
2015-07-16 19712, 2015
-
alastairp
sorry
2015-07-16 19725, 2015
-
Gentlecat
ugh, forgot about this one
2015-07-16 19732, 2015
-
alastairp
perhaps you could do the join to hl_json at the initial select
2015-07-16 19742, 2015
-
alastairp
so we don’t have to do a million small selects
2015-07-16 19726, 2015
-
alastairp
return json, sha256
2015-07-16 19727, 2015
-
alastairp
jason
2015-07-16 19731, 2015
-
Gentlecat
I'll probably have to rewrite it completely then
2015-07-16 19707, 2015
-
Gentlecat
updated again
2015-07-16 19716, 2015
-
alastairp
nah, just select ll.id, hl.data from ll join hl on hl.id=ll.id
2015-07-16 19727, 2015
-
alastairp
and pass both ids into update_rows
2015-07-16 19737, 2015
-
Gentlecat
2015-07-16 19739, 2015
-
alastairp
you could go back to update_row(table, data, id) in this case
2015-07-16 19748, 2015
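A rough sketch of the suggestion above: fetch the lowlevel id and the matching highlevel_json id in one joined select, then pass each id to update_row(table, data, id). It assumes the relationships mentioned earlier in the log (highlevel.id = lowlevel.id, with highlevel.data holding the highlevel_json id) and uses sqlite3 with made-up rows so it runs standalone. The column types, the sample values and the extra cursor parameter on update_row are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
# Hypothetical miniature schema with two rows per table.
cur.executescript("""
    CREATE TABLE lowlevel (id INTEGER PRIMARY KEY, data TEXT);
    CREATE TABLE highlevel_json (id INTEGER PRIMARY KEY, data TEXT);
    CREATE TABLE highlevel (id INTEGER PRIMARY KEY, data INTEGER);
    INSERT INTO lowlevel VALUES (1, 'll-1'), (2, 'll-2');
    INSERT INTO highlevel_json VALUES (10, 'hl-1'), (20, 'hl-2');
    INSERT INTO highlevel VALUES (1, 10), (2, 20);
""")

# One select returns both ids, instead of a separate lookup per row.
cur.execute("""
    SELECT ll.id, hl.data
    FROM lowlevel AS ll
    JOIN highlevel AS hl ON hl.id = ll.id
    ORDER BY ll.id
""")
id_pairs = cur.fetchall()


def update_row(cursor, table, data, row_id):
    """Overwrite the data column of a single row (illustrative version)."""
    if table not in ("lowlevel", "highlevel_json"):
        raise ValueError("unexpected table: %r" % table)
    cursor.execute("UPDATE %s SET data = ? WHERE id = ?" % table, (data, row_id))


# Pretend the first lowlevel row was found to be bad and fix both documents.
ll_id, hl_json_id = id_pairs[0]
update_row(cur, "lowlevel", "fixed-ll", ll_id)
update_row(cur, "highlevel_json", "fixed-hl", hl_json_id)

cur.execute("SELECT data FROM highlevel_json WHERE id = ?", (hl_json_id,))
fixed_hl = cur.fetchone()[0]
```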
-
alastairp
yeah
2015-07-16 19752, 2015
-
Gentlecat
right
2015-07-16 19708, 2015
-
Gentlecat
alastairp: try again
2015-07-16 19701, 2015
-
alastairp
cool. working!
2015-07-16 19714, 2015
-
alastairp
a few small things I had to fix with % arguments to execute
2015-07-16 19730, 2015
-
alastairp
hmm, weird
2015-07-16 19745, 2015
-
alastairp
KeyError: 'metadata'
2015-07-16 19721, 2015
-
Gentlecat
metadata is missing?
2015-07-16 19755, 2015
-
Gentlecat
how is that possible
2015-07-16 19714, 2015
-
alastairp
ah, interesting
2015-07-16 19715, 2015
-
alastairp
so
2015-07-16 19741, 2015
-
alastairp
if the highlevel extractor can’t compute anything, it inserts {} into highlevel_json
2015-07-16 19755, 2015
-
alastairp
in this example, it couldn’t
2015-07-16 19713, 2015
-
Gentlecat
why would it do that?
2015-07-16 19719, 2015
-
alastairp
I wonder if our extractor is as strict as postgres
2015-07-16 19723, 2015
-
alastairp
and this is why it failed
2015-07-16 19731, 2015
-
alastairp
why would what do what?
2015-07-16 19740, 2015
-
alastairp
extractor insert {}, or fail?
2015-07-16 19747, 2015
-
Gentlecat
insert empty json
2015-07-16 19705, 2015
-
MBJenkins
dufferzafar0: Add missing jsonify import
2015-07-16 19719, 2015
-
Gentlecat
to prevent itself from running on the same row again?
2015-07-16 19724, 2015
-
alastairp
yep
2015-07-16 19726, 2015
-
alastairp
exactly
2015-07-16 19731, 2015
-
Gentlecat
ok
2015-07-16 19708, 2015
-
alastairp
ok, I’ll just replace it with .get(‘’, {})
2015-07-16 19740, 2015
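A small sketch of that kind of fix, assuming the missing key is the 'metadata' key from the KeyError quoted earlier (the message above does not name it) and that the empty document is the {} the high-level extractor inserts when it cannot compute anything.

```python
import json

# An empty high-level document, as described in the discussion above.
hl_data = json.loads("{}")

# hl_data["metadata"] would raise KeyError: 'metadata' on this document.
# dict.get() with a default returns an empty dict instead.
metadata = hl_data.get("metadata", {})
```

Code that goes on to read fields out of metadata then sees an empty dict rather than failing on the lookup.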
-
alastairp
uh oh
2015-07-16 19745, 2015
-
alastairp
I have this really funny feeling
2015-07-16 19756, 2015
-
alastairp
that we only have 1 bad row ;)
2015-07-16 19703, 2015
-
Gentlecat
fun
2015-07-16 19727, 2015
-
alastairp
oh, no. I got another one
2015-07-16 19743, 2015
-
alastairp
hmm, but only 1 more it seems
2015-07-16 19745, 2015
-
alastairp
Gentlecat: cool. we seem to be ready
2015-07-16 19748, 2015
-
alastairp
thanks for your work
2015-07-16 19759, 2015
-
Gentlecat
exciting!
2015-07-16 19715, 2015
-
Gentlecat
now what exactly are we ready for? :)
2015-07-16 19720, 2015
-
ruaok
LOL
2015-07-16 19724, 2015
-
ruaok
heh. :)
2015-07-16 19754, 2015
-
alastairp
hah
2015-07-16 19711, 2015
-
alastairp
so, we have this paper for a conference
2015-07-16 19728, 2015
-
alastairp
and I wrote all this stuff about how we had 1 million tracks
2015-07-16 19743, 2015
-
alastairp
and then the paper was accepted, and the final version is due tomorrow
2015-07-16 19708, 2015
-
Mineo
and now we suddenly have nearly 2 million tracks!
2015-07-16 19710, 2015
-
Gentlecat
do you actually need to provide all the data with the paper?
2015-07-16 19714, 2015
-
alastairp
Mineo: bingo!
2015-07-16 19715, 2015
-
Mineo
(I can't actually check how many there are at the moment because abz.org ISEs)
2015-07-16 19730, 2015
-
alastairp
some of our stats are to do with the number of unique items in the metadata
2015-07-16 19738, 2015
-
Gentlecat
uh oh
2015-07-16 19739, 2015
-
alastairp
which we need to parse the json for
2015-07-16 19741, 2015
-
alastairp
uh oh
2015-07-16 19745, 2015
-
alastairp
what did I do?
2015-07-16 19753, 2015
-
alastairp
did we just delete 700k items?
2015-07-16 19757, 2015
-
Gentlecat
yay
2015-07-16 19709, 2015
-
Gentlecat
restart uwsgi?
2015-07-16 19720, 2015
-
Gentlecat
check logs I guess
2015-07-16 19751, 2015
-
alastairp
ok, better. it has a database connection
2015-07-16 19707, 2015
-
alastairp
I just keep seeing connection already closed
2015-07-16 19715, 2015
-
alastairp
even after restarting uwsgi
2015-07-16 19733, 2015
-
Gentlecat
ohhh
2015-07-16 19749, 2015
-
Gentlecat
we might need to update paths to high-level extractor too
2015-07-16 19706, 2015
-
Gentlecat
but I've got no idea how we run it there
2015-07-16 19710, 2015
-
alastairp
sure, but that shouldn’t affect the database, right?
2015-07-16 19712, 2015
-
alastairp
yeah, I can do that
2015-07-16 19750, 2015
-
alastairp
I mean, it doesn’t affect the website
2015-07-16 19753, 2015
-
Gentlecat
well, it was running after the update was deployed
2015-07-16 19758, 2015
-
alastairp
yeah
2015-07-16 19702, 2015
-
Gentlecat
I even saw new submissions
2015-07-16 19704, 2015
-
alastairp
and then I played with the database
2015-07-16 19708, 2015
-
Gentlecat
with high-level data
2015-07-16 19723, 2015
-
alastairp
right
2015-07-16 19730, 2015
-
Gentlecat
what's in uwsgi logs?
2015-07-16 19739, 2015
-
alastairp
that’ll be because the program was already running
2015-07-16 19740, 2015
-
alastairp
just connection closed
2015-07-16 19707, 2015
-
Gentlecat
what if you just try to start server manually?
2015-07-16 19714, 2015
-
Gentlecat
from manage.py
2015-07-16 19755, 2015
-
alastairp
weird
2015-07-16 19758, 2015
-
alastairp
lost synchronization with server: got message type ...
2015-07-16 19701, 2015
-
alastairp
this is a postgres error
2015-07-16 19706, 2015
-
alastairp
I’ve /never/ seen it before
2015-07-16 19751, 2015
-
Gentlecat
hm
2015-07-16 19753, 2015
-
alastairp
2015-07-16 19743, 2015
-
alastairp
OK. I set that setting
2015-07-16 19746, 2015
-
alastairp
but it’s really slow
2015-07-16 19717, 2015
-
alastairp
better now. maybe it was just postgres being sluggish
2015-07-16 19742, 2015
-
alastairp
weird. I’m doing a dump, and it’s stuck on incremental_dumps table
2015-07-16 19749, 2015
-
alastairp
that seems a weird table to be stuck on
2015-07-16 19728, 2015
-
alastairp
oh, then there’s that thing where pxz is using 1000% cpu
2015-07-16 19733, 2015
-
Gentlecat
what do you mean stuck?
2015-07-16 19711, 2015
-
alastairp
well, it looked like it was doing nothing
2015-07-16 19730, 2015
-
alastairp
but it just seems like it’s streaming 600k items into an xz file
2015-07-16 19735, 2015
-
alastairp
no problem at all :)
2015-07-16 19726, 2015
-
Freso
alastairp | did we just delete 700k items? — XD
2015-07-16 19709, 2015
-
alastairp
it’s ok. we didn’t. I’m just doing the backup now, *after* I destructively edited the database
2015-07-16 19714, 2015
-
alastairp
everything is under control
2015-07-16 19706, 2015
-
MBJenkins
* Michael Wiencek: Fix npm warning about knockout-arraytransforms
2015-07-16 19707, 2015
-
MBJenkins
* Michael Wiencek: Replace deprecated react-tools with babel