#metabrainz

      • agentsim joined the channel
      • 2017-06-23 17448, 2017

      • D4RK-PH0ENiX has quit
      • 2017-06-23 17433, 2017

      • D4RK-PH0ENiX joined the channel
      • 2017-06-23 17458, 2017

      • agentsim has quit
      • 2017-06-23 17444, 2017

      • iliekcomputers
        ruaok: I created LB-182, but I was thinking, what if a person first does a last.fm import (of the entire history) and then an alpha import, this could still lead to duplicates, I guess we're just gonna tell people not to do this?
      • 2017-06-23 17444, 2017

      • BrainzBot
        LB-182: Only do a last.fm import until the last alpha_import listen https://tickets.metabrainz.org/browse/LB-182
      • 2017-06-23 17427, 2017

      • ruaok
        iliekcomputers: yes, that.
      • 2017-06-23 17440, 2017

      • ruaok
        if people do stupid things twice, they deserve it.
      • 2017-06-23 17417, 2017

      • alastairp
        3:15 PM <ruaok> then if we do a last.fm, never import anything older than that timestamp.
      • 2017-06-23 17423, 2017

      • alastairp
        I was thinking a few ways of doing this
      • 2017-06-23 17447, 2017

      • alastairp
        1) this is basically incremental last.fm dumps, is it much more work to make this task that instead?
      • 2017-06-23 17423, 2017

      • alastairp
        2) if we don't want to do this yet, we could have the "has imported from alpha?" flag, and just disable last.fm imports for now
      • 2017-06-23 17445, 2017

      • alastairp
        when we have time, and implement 1), we can turn this into "do incremental import"
      • 2017-06-23 17423, 2017

      • ruaok
        oh, oh, oh.
      • 2017-06-23 17429, 2017

      • ruaok
        1 is clever.
      • 2017-06-23 17442, 2017

      • ruaok
        iliekcomputers: following this?
      • 2017-06-23 17446, 2017

      • iliekcomputers
        what does "incremental import" mean exactly?
      • 2017-06-23 17402, 2017

      • ruaok
        only import up until the last time you imported into last.fm.
      • 2017-06-23 17411, 2017

      • ruaok
        *from
      • 2017-06-23 17439, 2017

      • iliekcomputers
        ah
      • 2017-06-23 17410, 2017

      • iliekcomputers
        I guess we'd have to track which was the last listen from last.fm then?
      • 2017-06-23 17420, 2017

      • ruaok
        which, as alastairp pointed out, is practically the same as what you're doing now.
      • 2017-06-23 17436, 2017

      • ruaok
        yes, but you're already going to add that.
      • 2017-06-23 17453, 2017

      • ruaok
        the import from alpha would then basically do the following:
      • 2017-06-23 17459, 2017

      • ruaok
        1. import everything from alpha.
      • 2017-06-23 17420, 2017

      • ruaok
        2. Set the last last.fm import date to the most recent listen from alpha.
      • 2017-06-23 17455, 2017

      • iliekcomputers
        I get it, nice
      • 2017-06-23 17408, 2017

      • ruaok
        yeah, kills two birds with one stone.
      • 2017-06-23 17457, 2017

      • iliekcomputers
        okay, how exactly would we track which listens are from last.fm, send an extra field along with the data?
      • 2017-06-23 17420, 2017

      • iliekcomputers
        and how do we update the last lfm import date
      • 2017-06-23 17432, 2017

      • alastairp
        ah, the nice thing about that is I don't think you need a field for "imported from alpha"
      • 2017-06-23 17438, 2017

      • alastairp
        just pretend that it's last.fm
      • 2017-06-23 17439, 2017

      • iliekcomputers
        every rabbitmq batch
      • 2017-06-23 17445, 2017

      • iliekcomputers
        ?
      • 2017-06-23 17402, 2017

      • alastairp
        because the final result is the same, right?
      • 2017-06-23 17437, 2017

      • alastairp
        ah, if the process fails half way through and you want to continue it, you will need two fields
      • 2017-06-23 17438, 2017

      • ruaok
        alastairp: should be, yes.
      • 2017-06-23 17412, 2017

      • ruaok
        iliekcomputers: after writing a batch of listens, update the latest timestamp in PG.
      • 2017-06-23 17443, 2017

      • ruaok
        latest_import_timestamp TIMESTAMP WITH TIME ZONE
      • 2017-06-23 17405, 2017

      • ruaok
        and if the import from alpha does the same, then we're clear and out of the woods
      • 2017-06-23 17420, 2017

      • alastairp
        make sure that alpha and last.fm imports go from oldest-newst
      • 2017-06-23 17423, 2017

      • alastairp
        newest
      • 2017-06-23 17433, 2017

      • iliekcomputers
        that would be extra work considering they both go newest-oldest right now, could we not just check the field each batch and update if it is smaller than our current ts (or would that be too bad for performance?)
      • 2017-06-23 17416, 2017

      • Gore|woerk has quit
      • 2017-06-23 17421, 2017

      • ruaok
        alastairp: why is that order needed?
      • 2017-06-23 17436, 2017

      • ruaok
        can we just store max(stored ts, most recent import ts)?
      • 2017-06-23 17413, 2017
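
[Editor's sketch] The max() idea above can be illustrated roughly as follows. This is a minimal sketch, not the ListenBrainz code: the function name `updated_import_timestamp` and its arguments are hypothetical, and the stored value stands in for the `latest_import_timestamp` column mentioned earlier. Because it takes the maximum, the order in which batches arrive (newest to oldest or the reverse) does not matter.

```python
from datetime import datetime, timezone
from typing import Iterable, Optional


def updated_import_timestamp(
    stored: Optional[datetime], batch_timestamps: Iterable[datetime]
) -> Optional[datetime]:
    """Return max(stored ts, most recent ts in the batch).

    Hypothetical helper: `stored` is the value currently kept for the user
    (None if nothing was imported yet), `batch_timestamps` are the listen
    times of the batch that was just written.
    """
    batch = list(batch_timestamps)
    if not batch:
        return stored  # empty batch: nothing to update
    newest_in_batch = max(batch)
    if stored is None:
        return newest_in_batch
    return max(stored, newest_in_batch)


def day(d: int) -> datetime:
    """Convenience constructor for the example below (June 2017, UTC)."""
    return datetime(2017, 6, d, tzinfo=timezone.utc)


# Batches arriving newest first: the stored value never moves backwards.
stored = None
for batch in ([day(30), day(28)], [day(14), day(1)]):
    stored = updated_import_timestamp(stored, batch)
```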

      • alastairp
        hmm
      • 2017-06-23 17429, 2017

      • alastairp
        OK, if influx can handle deduplicating OK
      • 2017-06-23 17410, 2017

      • ruaok
        in theory. :)
      • 2017-06-23 17446, 2017

      • alastairp
        I guess this goes back to our storing stuff in cassandra
      • 2017-06-23 17411, 2017

      • alastairp
        so, my idea for incremental imports was always that the importer would first ask "what was the last thing you have?" to the server
      • 2017-06-23 17415, 2017

      • alastairp
        then only send data from that point
      • 2017-06-23 17441, 2017

      • alastairp
        because if you import some stuff from day 1-14
      • 2017-06-23 17446, 2017

      • alastairp
        then on day 30 you do another import
      • 2017-06-23 17404, 2017

      • alastairp
        the first page of data that gets submitted to the database is day 28-30
      • 2017-06-23 17430, 2017

      • alastairp
        if you want to import all 60,000 pages of your listens each time you do an import; OK
      • 2017-06-23 17451, 2017

      • alastairp
        if that's the case, max() would work fine
      • 2017-06-23 17403, 2017

      • ruaok
        > the first page of data that gets submitted to the database is day 28-30
      • 2017-06-23 17407, 2017

      • ruaok
        where did 28 come from?
      • 2017-06-23 17415, 2017

      • alastairp
        because a page is what, 50 items?
      • 2017-06-23 17425, 2017

      • alastairp
        containing all the music that you've listened to in the last 2 days
      • 2017-06-23 17400, 2017

      • alastairp
        I think you can do it backwards. let me write some pseudocode
      • 2017-06-23 17441, 2017

      • ruaok
        thx, I'm still confused. :)
      • 2017-06-23 17446, 2017

      • ruaok
        <== natural state of being
      • 2017-06-23 17429, 2017

      • alastairp
      • 2017-06-23 17401, 2017

      • ruaok
        that is pretty much how I saw it working.
      • 2017-06-23 17411, 2017

      • ruaok
        iliekcomputers: you?
      • 2017-06-23 17423, 2017

      • ruaok
        s/saw/envisioned
      • 2017-06-23 17433, 2017

      • alastairp
      • 2017-06-23 17436, 2017

      • alastairp
        refreshed
      • 2017-06-23 17448, 2017

      • alastairp
        now there's a 'newest to oldest' and 'oldest to newest'
      • 2017-06-23 17406, 2017

      • alastairp
        I always thought that oldest to newest would be the nicer way to do it, but now I see it's possible to do it both ways
      • 2017-06-23 17457, 2017

      • ruaok
        > do some sort of binary search to find the page which contains the last imported item
      • 2017-06-23 17416, 2017

      • ruaok
        this sounds troublesome, for an unclear gain.
      • 2017-06-23 17426, 2017

      • ruaok
        I'm ok with newest to oldest.
      • 2017-06-23 17429, 2017

      • iliekcomputers
        the newest to oldest one is almost exactly what I was thinking, except for the last request announcing the latest_timestamp, we get it ourselves from influx-writer
      • 2017-06-23 17459, 2017

      • iliekcomputers
        but now I think alastairp's version is better
      • 2017-06-23 17411, 2017

      • ruaok
        iliekcomputers: the problem is the influx writer cannot discern a listen from an import, can it?
      • 2017-06-23 17422, 2017

      • alastairp
        the gain is that last import date gets set every time a submission gets stored
      • 2017-06-23 17425, 2017

      • alastairp
        in my view
      • 2017-06-23 17434, 2017

      • alastairp
        so that the client doesn't have to send this value
      • 2017-06-23 17455, 2017

      • iliekcomputers
        ruaok: I was thinking we could pass an extra field to additional_info for that
      • 2017-06-23 17457, 2017

      • ruaok
        alastairp: ah, I see.
      • 2017-06-23 17404, 2017

      • alastairp
        I think it's more elegant for cases where the browser window closes, or import crashes
      • 2017-06-23 17420, 2017

      • alastairp
        it's true, finding the page to start from will be difficult
      • 2017-06-23 17425, 2017

      • ruaok
        iliekcomputers: additional info isn't ours to store data in.
      • 2017-06-23 17432, 2017

      • alastairp
        difficult/complex/finicky
      • 2017-06-23 17420, 2017

      • ruaok
        agreed, but the point about browser crash (or whatever) is a good thing to consider.
      • 2017-06-23 17423, 2017

      • alastairp
        iliekcomputers: if you go new->old you have to announce the last_timestamp, because otherwise a crash in the importer will lose listens
      • 2017-06-23 17404, 2017
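
[Editor's sketch] The pseudocode shared in the channel is not preserved in this log, so the following is only a guess at the newest-to-oldest variant being discussed. All names (`import_newest_to_oldest`, `fetch_page`, `submit`) are hypothetical, and timestamps are plain integers for brevity. The walk stops at the first page that reaches already-imported listens, and the new timestamp is returned only once the walk has finished, so a crash partway through leaves the stored value untouched.

```python
from typing import Callable, List

Page = List[int]  # listen timestamps on one page, newest first (assumed layout)


def import_newest_to_oldest(
    fetch_page: Callable[[int], Page],
    last_imported_ts: int,
    submit: Callable[[Page], None],
) -> int:
    """Submit every listen newer than last_imported_ts, newest page first.

    Returns the timestamp to announce as the new "last import" value.
    The caller should store it only after this function returns.
    """
    newest_seen = last_imported_ts
    page_number = 1
    while True:
        page = fetch_page(page_number)
        if not page:
            break  # ran out of history
        fresh = [ts for ts in page if ts > last_imported_ts]
        if fresh:
            submit(fresh)
            newest_seen = max(newest_seen, max(fresh))
        if len(fresh) < len(page):
            break  # this page reached listens we already have
        page_number += 1
    return newest_seen


# Example: days 1-14 were imported earlier, a second import runs on day 30.
pages = {1: [30, 29, 28], 2: [27, 15, 14], 3: [13, 2, 1]}
sent: List[int] = []
announced = import_newest_to_oldest(lambda n: pages.get(n, []), 14, sent.extend)
```

Page 3 is never fetched in this example, which is the saving over re-sending the whole history on every import.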

      • ruaok
        I suppose the recovery isn't terrible -- you do the import again, dups get tossed, until the import completes.
      • 2017-06-23 17411, 2017

      • alastairp
        yup
      • 2017-06-23 17426, 2017

      • ruaok
        I find that is sufficient.
      • 2017-06-23 17433, 2017

      • alastairp
        it sounds like we're happier about importing dups than with cassandra
      • 2017-06-23 17440, 2017

      • alastairp
        in which case, this is an OK approach to me
      • 2017-06-23 17452, 2017

      • ruaok
        yes, influx handles dups
      • 2017-06-23 17450, 2017
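
[Editor's sketch] Why re-running a crashed import is harmless can be shown with a toy store. This is not InfluxDB code: `ListenStore` is a hypothetical in-memory stand-in, and keying a listen by (user, timestamp) is an assumption made for illustration. The point is only that writing the same listen twice overwrites the existing entry instead of adding a second one.

```python
from typing import Dict, Iterable, Tuple

Listen = Tuple[str, int, str]  # (user, listened_at, track), assumed shape


class ListenStore:
    """Toy store in which a repeated write of the same key is a no-op."""

    def __init__(self) -> None:
        self._points: Dict[Tuple[str, int], str] = {}

    def write_batch(self, listens: Iterable[Listen]) -> None:
        for user, listened_at, track in listens:
            # Same (user, timestamp) key: the old entry is overwritten.
            self._points[(user, listened_at)] = track

    def count(self) -> int:
        return len(self._points)


store = ListenStore()
batch = [("alice", 1498230000, "Track A"), ("alice", 1498230300, "Track B")]
store.write_batch(batch)  # first attempt, importer crashes afterwards
store.write_batch(batch)  # import is simply run again from the start
```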

      • iliekcomputers
        so oldest to newest is the way to go :)
      • 2017-06-23 17403, 2017

      • ruaok
        phew. :)
      • 2017-06-23 17446, 2017

      • alastairp
        wait. you all just convinced me that newest to oldest is OK
      • 2017-06-23 17413, 2017

      • ruaok
        get our the red pen?
      • 2017-06-23 17415, 2017

      • ruaok
        out
      • 2017-06-23 17431, 2017

      • iliekcomputers
        welp, i meant the opposite
      • 2017-06-23 17436, 2017

      • iliekcomputers
        sorry
      • 2017-06-23 17438, 2017

      • alastairp
        OK :)
      • 2017-06-23 17400, 2017

      • ruaok
        oh.
      • 2017-06-23 17401, 2017

      • iliekcomputers
        brainfade
      • 2017-06-23 17409, 2017

      • ruaok
        meh. lysdexia blows.
      • 2017-06-23 17417, 2017

      • alastairp
        I think we broke ruaok
      • 2017-06-23 17433, 2017

      • ruaok
        been broken for a while.
      • 2017-06-23 17438, 2017

      • ruaok checks calendar
      • 2017-06-23 17451, 2017

      • ruaok
        ah, yes almost 47 years.
      • 2017-06-23 17455, 2017

      • Quesito
        alastairp: home till sant Juan BBQ tonight...let me know if you will pass by!
      • 2017-06-23 17406, 2017

      • ruaok
        I'm sadly without plans for Sant Juan as of yet. This is my first evening without any social plans and I would totally stay home and have a quiet evening in.
      • 2017-06-23 17411, 2017

      • ruaok
        "quiet"
      • 2017-06-23 17419, 2017

      • ruaok
        wrong evening for that.
      • 2017-06-23 17401, 2017

      • alastairp
        Quesito: 7:30?
      • 2017-06-23 17445, 2017

      • Quesito
        That'll work alastairp!
      • 2017-06-23 17423, 2017

      • Rotab has quit
      • 2017-06-23 17457, 2017

      • Rotab joined the channel
      • 2017-06-23 17433, 2017

      • alastairp
        Quesito: in terms of beer, I have a stout, and a few types of hoppy ales
      • 2017-06-23 17439, 2017

      • alastairp
        do you want anything in particular
      • 2017-06-23 17401, 2017

      • Quesito
        Love hoppy ales in the summertime! I'm excited!!!
      • 2017-06-23 17408, 2017

      • alastairp
        ok! :)
      • 2017-06-23 17438, 2017

      • CatQuest
        guys: can you make the "import from alpha" button more conspicuous though, it's easy to just think "oh yeah, import" and then click it and there is no "confirm you want to import from alpha" thing after :/
      • 2017-06-23 17458, 2017

      • CatQuest
        anyway I don't mind not having the option to import from alpha
      • 2017-06-23 17400, 2017

      • ruaok
        yes, on my list of things to do!
      • 2017-06-23 17444, 2017

      • lazka has quit
      • 2017-06-23 17405, 2017

      • Quesito
        ruaok: find plans!!! our plans will last an hour max before we turn into little beast pumpkins.....someone has to celebrate on my behalf till sunrise walking backward and all!
      • 2017-06-23 17438, 2017

      • ruaok
        yes, ma'am!
      • 2017-06-23 17407, 2017

      • CatQuest
        wat
      • 2017-06-23 17415, 2017

      • ruaok
      • 2017-06-23 17408, 2017

      • CallerNo6 has quit
      • 2017-06-23 17405, 2017

      • CallerNo6 joined the channel
      • 2017-06-23 17410, 2017

      • Wizzup has quit
      • 2017-06-23 17421, 2017

      • alastairp
        ruaok: you at home?
      • 2017-06-23 17449, 2017

      • ruaok
        Negative
      • 2017-06-23 17402, 2017

      • ruaok
        Will be in 15 or so
      • 2017-06-23 17430, 2017

      • alastairp
        I got your Shields and accelerometers
      • 2017-06-23 17449, 2017

      • ruaok
        If you're near my hood, drop them in my mail slot.
      • 2017-06-23 17418, 2017

      • ruaok
        But, not important. I won't get a chance to play in the next week.
      • 2017-06-23 17439, 2017

      • alastairp
        Ok, no problem
      • 2017-06-23 17401, 2017

      • alastairp
        Let's meet some time soon then
      • 2017-06-23 17452, 2017

      • ruaok
        Let's!
      • 2017-06-23 17429, 2017

      • agentsim joined the channel
      • 2017-06-23 17458, 2017

      • agentsim has quit
      • 2017-06-23 17436, 2017

      • LordSputnik
        ruaok: did you happen to see my email for the second set of GCI trip expenses?
      • 2017-06-23 17446, 2017

      • ruaok
        I did.
      • 2017-06-23 17450, 2017

      • agentsim joined the channel
      • 2017-06-23 17442, 2017

      • travis-ci joined the channel
      • 2017-06-23 17443, 2017

      • travis-ci
        Project bookbrainz-site build #1258: passed in 6 min 6 sec: https://travis-ci.org/bookbrainz/bookbrainz-site/…
      • 2017-06-23 17443, 2017

      • travis-ci has left the channel