#musicbrainz-devel

      • alastairp
        db._connection.reset() :)
      • 2015-07-16 19730, 2015

      • alastairp
        I still don’t like how we’re looping through both lowlevel and highlevel_json
      • 2015-07-16 19736, 2015

      • alastairp
        it means we do twice as much work
      • 2015-07-16 19756, 2015

      • alastairp
        we can do lowlevel.id=highlevel.id, highlevel.data=highlevel_json.id
      • 2015-07-16 19709, 2015

      • Gentlecat
        I didn't have much time for optimization yesterday :)
      • 2015-07-16 19717, 2015

      • alastairp
        and so when we’ve decided that a lowlevel is bad, automatically go to fix the hl json
      • 2015-07-16 19719, 2015

      • alastairp
        OK, no problem!
      • 2015-07-16 19725, 2015

      • alastairp
        do you think you can do that?
      • 2015-07-16 19738, 2015

      • alastairp
        I’ll continue testing this
      • 2015-07-16 19715, 2015

      • Gentlecat
        in one update query?
      • 2015-07-16 19739, 2015

      • alastairp
        well, perhaps do 2 update_row()s, but only get the candidate lowlevel ids
      • 2015-07-16 19751, 2015

      • alastairp
        then do another query to join into the highlevel_json table to get its id
      • 2015-07-16 19702, 2015

      • alastairp
        did I explain that clearly?
      • 2015-07-16 19753, 2015

      • diana_olhovik joined the channel
      • 2015-07-16 19739, 2015

      • Gentlecat
      • 2015-07-16 19721, 2015

      • alastairp
        cool
      • 2015-07-16 19732, 2015

      • alastairp
        I think the hl row id is wrong
      • 2015-07-16 19738, 2015

      • alastairp
        highlevel.id = lowlevel.id
      • 2015-07-16 19749, 2015

      • Gentlecat
        ohh
      • 2015-07-16 19751, 2015

      • alastairp
        except the highlevel_data.id is different - it’s stored in highlevel.data
      • 2015-07-16 19754, 2015

      • Gentlecat
        right
      • 2015-07-16 19715, 2015

      • alastairp
        also, your string interpolation in cursor.execute doesn’t work
      • 2015-07-16 19719, 2015

      • alastairp
        i just fixed it on mine
      • 2015-07-16 19736, 2015

      • alastairp
        you need to do interpolation for the table name, but then the second argument to execute for the actual data
      • 2015-07-16 19758, 2015
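
A minimal sketch of the split alastairp describes: the table name is formatted into the SQL text, while the values are passed as the second argument to `execute`. The `ALLOWED_TABLES` whitelist, the `data` column, and the body of `update_row` are assumptions for illustration, and `sqlite3` stands in for the real Postgres driver so that the snippet runs on its own.

```python
import sqlite3

# Hypothetical whitelist: table names cannot be bound as query parameters,
# so they are checked against known names before being formatted in.
ALLOWED_TABLES = {"lowlevel", "highlevel_json"}


def update_row(cursor, table, data, row_id):
    """Format the table name into the SQL text, bind the values separately."""
    if table not in ALLOWED_TABLES:
        raise ValueError("unexpected table name: %r" % (table,))
    # Step 1: string interpolation for the identifier only.
    query = "UPDATE %s SET data = ? WHERE id = ?" % table
    # Step 2: the actual data goes in the second argument to execute().
    cursor.execute(query, (data, row_id))


conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE lowlevel (id INTEGER PRIMARY KEY, data TEXT)")
cur.execute("INSERT INTO lowlevel (id, data) VALUES (?, ?)", (1, "{bad"))

update_row(cur, "lowlevel", '{"fixed": true}', 1)
cur.execute("SELECT data FROM lowlevel WHERE id = ?", (1,))
fixed = cur.fetchone()[0]
```

With a driver that uses `%s` as its placeholder (psycopg2 does), the data placeholders would need to be written as `%%s` so they survive the `%` formatting step that inserts the table name.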

      • Gentlecat
        right
      • 2015-07-16 19713, 2015

      • alastairp
        Row #True is bad! Fixing...
      • 2015-07-16 19717, 2015

      • alastairp
        I guess that should be a number!
      • 2015-07-16 19725, 2015

      • alastairp
        but otherwise, great. it works here
      • 2015-07-16 19747, 2015

      • Gentlecat
        without high-level stuff?
      • 2015-07-16 19709, 2015

      • alastairp
        yeah
      • 2015-07-16 19710, 2015

      • alastairp
        cursor.execute("SELECT id FROM %s" % table)
      • 2015-07-16 19719, 2015

      • alastairp
        without the loop this has to be lowlevel again
      • 2015-07-16 19723, 2015

      • Gentlecat
        just fixed that too
      • 2015-07-16 19756, 2015

      • alastairp
        (y)
      • 2015-07-16 19702, 2015

      • Mineo
        aren't you just getting all the ids from lowlevel in do_magic only to load the row itself in is_bad? any reason to not load both at once?
      • 2015-07-16 19714, 2015

      • Mineo
        oh
      • 2015-07-16 19720, 2015

      • Mineo is not completely awake
      • 2015-07-16 19709, 2015

      • Mineo
        carry on doing whatever awesome things you're doing :)
      • 2015-07-16 19738, 2015

      • alastairp
        yes, postgres aborts the query if the json in the row is bad
      • 2015-07-16 19707, 2015

      • alastairp
        and python’s json module won’t say it’s bad, only postgres once you try and access it as a json field (rather than text)
      • 2015-07-16 19713, 2015
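
To illustrate the point about Python's `json` module being more lenient than a strict consumer, here is a small sketch. The chat does not say which construct made the rows bad, so the two inputs below are only assumed examples: `NaN` is not part of the JSON grammar, and an escaped NUL character is something a database may refuse to turn into text. `strict_loads` and `reject_constant` are hypothetical helper names.

```python
import json

# Both documents parse without complaint with the default settings.
doc_with_nan = json.loads('{"value": NaN}')          # NaN is not strict JSON
doc_with_nul = json.loads('{"title": "a\\u0000b"}')  # escaped NUL character

value_is_nan = doc_with_nan["value"] != doc_with_nan["value"]  # NaN != NaN
title = doc_with_nul["title"]


def reject_constant(name):
    """Called by json.loads for NaN, Infinity and -Infinity."""
    raise ValueError("non-standard JSON constant: %s" % name)


def strict_loads(text):
    """Parse JSON but refuse the non-standard numeric constants."""
    return json.loads(text, parse_constant=reject_constant)
```

A check like `strict_loads` only covers the numeric constants; it would not catch every input that Postgres might reject, so testing against the database itself is still the reliable check.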

      • Gentlecat
        alastairp: updated
      • 2015-07-16 19708, 2015

      • alastairp
        hl_data_dict=purify(get_data_as_text("highlevel_json", row['id'])),
      • 2015-07-16 19711, 2015

      • alastairp
        still wrong :(
      • 2015-07-16 19712, 2015

      • alastairp
        sorry
      • 2015-07-16 19725, 2015

      • Gentlecat
        ugh, forgot about this one
      • 2015-07-16 19732, 2015

      • alastairp
        perhaps you could do the join to hl_json at the initial select
      • 2015-07-16 19742, 2015

      • alastairp
        so we don’t have to do a million small selects
      • 2015-07-16 19726, 2015

      • alastairp
        return json, sha256
      • 2015-07-16 19727, 2015

      • alastairp
        jason
      • 2015-07-16 19731, 2015

      • Gentlecat
        I'll probably have to rewrite it completely then
      • 2015-07-16 19707, 2015

      • Gentlecat
        updated again
      • 2015-07-16 19716, 2015

      • alastairp
        nah, just select ll.id, hl.data from ll join hl on hl.id=ll.id
      • 2015-07-16 19727, 2015

      • alastairp
        and pass both ids into update_rows
      • 2015-07-16 19737, 2015
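
The join suggested above could look roughly like this. The table layout is inferred from the chat (`highlevel.id` equals `lowlevel.id`, and `highlevel.data` holds the id of the `highlevel_json` row); the column types and sample rows are made up, and `sqlite3` is used only so that the sketch is runnable without a Postgres server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
    CREATE TABLE lowlevel (id INTEGER PRIMARY KEY, data TEXT);
    CREATE TABLE highlevel_json (id INTEGER PRIMARY KEY, data TEXT);
    CREATE TABLE highlevel (id INTEGER PRIMARY KEY, data INTEGER);
    INSERT INTO lowlevel VALUES (1, '{}'), (2, '{}'), (3, '{}');
    INSERT INTO highlevel_json VALUES (10, '{}'), (20, '{}');
    INSERT INTO highlevel VALUES (1, 10), (2, 20);
""")

# One query yields both ids, instead of one small select per lowlevel row.
cur.execute("""
    SELECT ll.id, hl.data
    FROM lowlevel ll
    JOIN highlevel hl ON hl.id = ll.id
    ORDER BY ll.id
""")
id_pairs = cur.fetchall()  # (lowlevel id, highlevel_json id) per row
```

Note that an inner join skips lowlevel rows that have no highlevel row yet (row 3 in the sample data); a `LEFT JOIN` would keep them with a NULL second column.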

      • Gentlecat
      • 2015-07-16 19739, 2015

      • alastairp
        you could go back to update_row(table, data, id) in this case
      • 2015-07-16 19748, 2015

      • alastairp
        yeah
      • 2015-07-16 19752, 2015

      • Gentlecat
        right
      • 2015-07-16 19708, 2015

      • Gentlecat
        alastairp: try again
      • 2015-07-16 19701, 2015

      • alastairp
        cool. working!
      • 2015-07-16 19714, 2015

      • alastairp
        a few small things I had to fix with % arguments to execute
      • 2015-07-16 19730, 2015

      • alastairp
        hmm, weird
      • 2015-07-16 19745, 2015

      • alastairp
        KeyError: 'metadata'
      • 2015-07-16 19721, 2015

      • Gentlecat
        metadata is missing?
      • 2015-07-16 19755, 2015

      • Gentlecat
        how is that possible
      • 2015-07-16 19714, 2015

      • alastairp
        ah, interesting
      • 2015-07-16 19715, 2015

      • alastairp
        so
      • 2015-07-16 19741, 2015

      • alastairp
        if the highlevel extractor can’t compute anything, it inserts {} into highlevel_json
      • 2015-07-16 19755, 2015

      • alastairp
        in this example, it couldn’t
      • 2015-07-16 19713, 2015

      • Gentlecat
        why would it do that?
      • 2015-07-16 19719, 2015

      • alastairp
        I wonder if our extractor is as strict as postgres
      • 2015-07-16 19723, 2015

      • alastairp
        and this is why it failed
      • 2015-07-16 19731, 2015

      • alastairp
        why would what do what?
      • 2015-07-16 19740, 2015

      • alastairp
        extractor insert {}, or fail?
      • 2015-07-16 19747, 2015

      • Gentlecat
        insert empty json
      • 2015-07-16 19705, 2015

      • MBJenkins
        dufferzafar0: Add missing jsonify import
      • 2015-07-16 19719, 2015

      • Gentlecat
        to prevent itself from running on the same row again?
      • 2015-07-16 19724, 2015

      • alastairp
        yep
      • 2015-07-16 19726, 2015

      • alastairp
        exactly
      • 2015-07-16 19731, 2015

      • Gentlecat
        ok
      • 2015-07-16 19708, 2015

      • alastairp
        ok, I’ll just replace it with .get('', {})
      • 2015-07-16 19740, 2015
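
A small sketch of the `.get` fallback for the empty `{}` documents mentioned above. The key name `"metadata"` is taken from the `KeyError` earlier in the log, since the key in the chat message itself is blank; `extract_metadata` and the sample documents are hypothetical.

```python
def extract_metadata(document):
    """Return the metadata block, or an empty dict if it is missing."""
    # document["metadata"] would raise KeyError for an empty {} document.
    return document.get("metadata", {})


full_doc = {"metadata": {"version": {"essentia": "x"}}}
empty_doc = {}  # what is stored when nothing could be computed

full_meta = extract_metadata(full_doc)
empty_meta = extract_metadata(empty_doc)
```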

      • alastairp
        uh oh
      • 2015-07-16 19745, 2015

      • alastairp
        I have this really funny feeling
      • 2015-07-16 19756, 2015

      • alastairp
        that we only have 1 bad row ;)
      • 2015-07-16 19703, 2015

      • Gentlecat
        fun
      • 2015-07-16 19727, 2015

      • alastairp
        oh, no. I got another one
      • 2015-07-16 19743, 2015

      • alastairp
        hmm, but only 1 more it seems
      • 2015-07-16 19745, 2015

      • alastairp
        Gentlecat: cool. we seem to be ready
      • 2015-07-16 19748, 2015

      • alastairp
        thanks for your work
      • 2015-07-16 19759, 2015

      • Gentlecat
        exciting!
      • 2015-07-16 19715, 2015

      • Gentlecat
        now what exactly are we ready for? :)
      • 2015-07-16 19720, 2015

      • ruaok
        LOL
      • 2015-07-16 19724, 2015

      • ruaok
        heh. :)
      • 2015-07-16 19754, 2015

      • alastairp
        hah
      • 2015-07-16 19711, 2015

      • alastairp
        so, we have this paper for a conference
      • 2015-07-16 19728, 2015

      • alastairp
        and I wrote all this stuff about how we had 1 million tracks
      • 2015-07-16 19743, 2015

      • alastairp
        and then the paper was accepted, and the final version is due tomorrow
      • 2015-07-16 19708, 2015

      • Mineo
        and now we suddenly have nearly 2 million tracks!
      • 2015-07-16 19710, 2015

      • Gentlecat
        do you actually need to provide all the data with the paper?
      • 2015-07-16 19714, 2015

      • alastairp
        Mineo: bingo!
      • 2015-07-16 19715, 2015

      • Mineo
        (I can't actually check how many there are at the moment because abz.org ISEs)
      • 2015-07-16 19730, 2015

      • alastairp
        some of our stats are to do with the number of unique items in the metadata
      • 2015-07-16 19738, 2015

      • Gentlecat
        uh oh
      • 2015-07-16 19739, 2015

      • alastairp
        which we need to parse the json for
      • 2015-07-16 19741, 2015

      • alastairp
        uh oh
      • 2015-07-16 19745, 2015

      • alastairp
        what did I do?
      • 2015-07-16 19753, 2015

      • alastairp
        did we just delete 700k items?
      • 2015-07-16 19757, 2015

      • Gentlecat
        yay
      • 2015-07-16 19709, 2015

      • Gentlecat
        restart uwsgi?
      • 2015-07-16 19720, 2015

      • Gentlecat
        check logs I guess
      • 2015-07-16 19751, 2015

      • alastairp
        ok, better. it has a database connection
      • 2015-07-16 19707, 2015

      • alastairp
        I just keep seeing connection already closed
      • 2015-07-16 19715, 2015

      • alastairp
        even after restarting wsgi
      • 2015-07-16 19733, 2015

      • Gentlecat
        ohhh
      • 2015-07-16 19749, 2015

      • Gentlecat
        we might need to update paths to high-level extractor too
      • 2015-07-16 19706, 2015

      • Gentlecat
        but I've got no idea how we run it there
      • 2015-07-16 19710, 2015

      • alastairp
        sure, but that shouldn’t affect the database, right?
      • 2015-07-16 19712, 2015

      • alastairp
        yeah, I can do that
      • 2015-07-16 19750, 2015

      • alastairp
        I mean, it doesn’t affect the website
      • 2015-07-16 19753, 2015

      • Gentlecat
        well it was running after update has been deployed
      • 2015-07-16 19758, 2015

      • alastairp
        yeah
      • 2015-07-16 19702, 2015

      • Gentlecat
        I even saw new submissions
      • 2015-07-16 19704, 2015

      • alastairp
        and then I played with the database
      • 2015-07-16 19708, 2015

      • Gentlecat
        with high-level data
      • 2015-07-16 19723, 2015

      • alastairp
        right
      • 2015-07-16 19730, 2015

      • Gentlecat
        what's in uwsgi logs?
      • 2015-07-16 19739, 2015

      • alastairp
        that’ll be because the program was already running
      • 2015-07-16 19740, 2015

      • alastairp
        just connection closed
      • 2015-07-16 19707, 2015

      • Gentlecat
        what if you just try to start server manually?
      • 2015-07-16 19714, 2015

      • Gentlecat
        from manage.py
      • 2015-07-16 19755, 2015

      • alastairp
        weird
      • 2015-07-16 19758, 2015

      • alastairp
        lost synchronization with server: got message type ...
      • 2015-07-16 19701, 2015

      • alastairp
        this is a postgres error
      • 2015-07-16 19706, 2015

      • alastairp
        I’ve /never/ seen it before
      • 2015-07-16 19751, 2015

      • Gentlecat
        hm
      • 2015-07-16 19753, 2015

      • alastairp
      • 2015-07-16 19743, 2015

      • alastairp
        OK. I set that setting
      • 2015-07-16 19746, 2015

      • alastairp
        but it’s really slow
      • 2015-07-16 19717, 2015

      • alastairp
        better now. maybe it was just postgres being sluggish
      • 2015-07-16 19742, 2015

      • alastairp
        weird. I’m doing a dump, and it’s stuck on incremental_dumps table
      • 2015-07-16 19749, 2015

      • alastairp
        that seems a weird table to be stuck on
      • 2015-07-16 19728, 2015

      • alastairp
        oh, then there’s that thing where pxz is using 1000% cpu
      • 2015-07-16 19733, 2015

      • Gentlecat
        what do you mean stuck?
      • 2015-07-16 19711, 2015

      • alastairp
        well, it looked like it was doing nothing
      • 2015-07-16 19730, 2015

      • alastairp
        but it just seems like it’s streaming 600k items into an xz file
      • 2015-07-16 19735, 2015

      • alastairp
        no problem at all :)
      • 2015-07-16 19726, 2015

      • Freso
        alastairp | did we just delete 700k items? — XD
      • 2015-07-16 19709, 2015

      • alastairp
        it’s ok. we didn’t. I’m just doing the backup now, *after* I destructively edited the database
      • 2015-07-16 19714, 2015

      • alastairp
        everything is under control
      • 2015-07-16 19706, 2015

      • MBJenkins
        * Michael Wiencek: Fix npm warning about knockout-arraytransforms
      • 2015-07-16 19707, 2015

      • MBJenkins
        * Michael Wiencek: Replace deprecated react-tools with babel