#metabrainz

      • D4RK-PH0ENiX joined the channel
      • 2019-01-07 00758, 2019

      • CatQuest has quit
      • 2019-01-07 00711, 2019

      • CatQuest joined the channel
      • 2019-01-07 00701, 2019

      • JTL is now known as JLT
      • 2019-01-07 00730, 2019

      • JLT is now known as JTL
      • 2019-01-07 00726, 2019

      • Nyanko-sensei joined the channel
      • 2019-01-07 00746, 2019

      • d4rkie joined the channel
      • 2019-01-07 00746, 2019

      • Nyanko-sensei has quit
      • 2019-01-07 00759, 2019

      • D4RK-PH0ENiX has quit
      • 2019-01-07 00719, 2019

      • D4RK-PH0ENiX joined the channel
      • 2019-01-07 00720, 2019

      • d4rkie has quit
      • 2019-01-07 00701, 2019

      • D4RK-PH0ENiX has quit
      • 2019-01-07 00720, 2019

      • ayerhart has quit
      • 2019-01-07 00735, 2019

      • D4RK-PH0ENiX joined the channel
      • 2019-01-07 00750, 2019

      • ayerhart joined the channel
      • 2019-01-07 00718, 2019

      • Leo_Verto_ joined the channel
      • 2019-01-07 00707, 2019

      • Leo_Verto has quit
      • 2019-01-07 00707, 2019

      • Leo_Verto_ is now known as Leo_Verto
      • 2019-01-07 00726, 2019

      • c1e0 joined the channel
      • 2019-01-07 00709, 2019

      • c1e0 has quit
      • 2019-01-07 00749, 2019

      • Major_Lurker joined the channel
      • 2019-01-07 00719, 2019

      • Gore|woerk joined the channel
      • 2019-01-07 00752, 2019

      • G0re has quit
      • 2019-01-07 00729, 2019

      • iliekcomputers
        Moin!
      • 2019-01-07 00756, 2019

      • iliekcomputers
        It is cold! 😥😥
      • 2019-01-07 00757, 2019

      • michelv joined the channel
      • 2019-01-07 00737, 2019

      • michelv
      • 2019-01-07 00705, 2019

      • pristine_ joined the channel
      • 2019-01-07 00721, 2019

      • dpmittal[m] has quit
      • 2019-01-07 00709, 2019

      • ruaok
        moin moin!
      • 2019-01-07 00713, 2019

      • ruaok
        yep, cold indeed.
      • 2019-01-07 00703, 2019

      • iliekcomputers
        ruaok: hi
      • 2019-01-07 00718, 2019

      • iliekcomputers
        Got time to help me out with some spark stuff today?
      • 2019-01-07 00728, 2019

      • iliekcomputers
        :)
      • 2019-01-07 00729, 2019

      • ruaok
        I should, yes.
      • 2019-01-07 00736, 2019

      • ruaok
        lemme grab some coffee
      • 2019-01-07 00743, 2019

      • iliekcomputers
        Great, I'll be near a computer soon.
      • 2019-01-07 00753, 2019

      • iliekcomputers
        10-15ish minutes?
      • 2019-01-07 00714, 2019

      • modwizcode has quit
      • 2019-01-07 00737, 2019

      • ruaok
        sure
      • 2019-01-07 00733, 2019

      • modwizcode joined the channel
      • 2019-01-07 00700, 2019

      • iliekcomputers
        yo.
      • 2019-01-07 00704, 2019

      • ruaok
        tú qué?
      • 2019-01-07 00747, 2019

      • iliekcomputers
        so I was submitting the dataframes script to spark from inside the play container
      • 2019-01-07 00709, 2019

      • iliekcomputers
        the out of memory errors were because I was submitting it without specifying the master.
      • 2019-01-07 00721, 2019

      • iliekcomputers
        so it submitted to a spark instance inside the container, i think.
      • 2019-01-07 00727, 2019

      • demonimin joined the channel
      • 2019-01-07 00747, 2019

      • ruaok
        sounds about right.
      • 2019-01-07 00752, 2019

      • iliekcomputers
        i fixed that and submitted using `spark-submit --master spark://spark-master.spark-network:7077 <Script>`
      • 2019-01-07 00708, 2019
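
The master mix-up described above can be caught inside the script itself. When no `--master` is given and none is configured elsewhere, `spark-submit` normally falls back to an in-process `local[*]` master, which would match the out-of-memory symptom. A minimal sketch of a guard, assuming the script already holds a `SparkSession`; the helper names `is_local_master` and `require_cluster_master` are hypothetical and not part of the project's code:

```python
from types import SimpleNamespace


def is_local_master(master_url: str) -> bool:
    """Return True for Spark's in-process masters: local, local[N], local[*]."""
    return master_url == "local" or master_url.startswith("local[")


def require_cluster_master(spark) -> str:
    """Fail fast if the session is attached to a local master.

    `spark` is expected to be a pyspark.sql.SparkSession; only its
    `sparkContext.master` attribute is read here.
    """
    master = spark.sparkContext.master
    if is_local_master(master):
        raise RuntimeError(
            f"Running on {master!r}; pass --master spark://... to spark-submit"
        )
    return master


if __name__ == "__main__":
    # Stand-in object so the check can be tried without a Spark install.
    fake = SimpleNamespace(sparkContext=SimpleNamespace(master="local[*]"))
    try:
        require_cluster_master(fake)
    except RuntimeError as err:
        print(err)
```

Calling `require_cluster_master(spark)` right after the session is created would turn a silent local run into an immediate, readable error.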

      • iliekcomputers
        that led to a problem with the files from the dump.
      • 2019-01-07 00747, 2019

      • iliekcomputers
        i was getting a bunch of filenotfounderrors
      • 2019-01-07 00714, 2019

      • iliekcomputers
        i figured this was because the dump was being extracted inside the play container and the code was running inside the spark-master container
      • 2019-01-07 00758, 2019

      • ruaok is listening with fascination
      • 2019-01-07 00708, 2019

      • iliekcomputers
        there is an addFile method in pyspark that downloads files to all nodes
      • 2019-01-07 00708, 2019

      • ruaok
        what a complicated saga it's been.
      • 2019-01-07 00731, 2019

      • iliekcomputers
        ruaok: i'm not good at telling stories :D
      • 2019-01-07 00744, 2019

      • iliekcomputers
        but the addFile method didn't really work.
      • 2019-01-07 00756, 2019

      • iliekcomputers
        it just added the file to the tmp folder in the play container and nowhere else.
      • 2019-01-07 00759, 2019
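
One possible reading of the `addFile` behaviour above: `SparkContext.addFile` registers the file on the driver, and each executor fetches its own copy when it runs a task, so opening the original driver-side path from a task would still fail. Inside a task the local copy is usually resolved with `SparkFiles.get`, which takes the file's base name. A hedged sketch of that pattern; `distributed_name` and `count_lines_on_executors` are hypothetical helpers, and this may or may not be what went wrong in the setup discussed here:

```python
import os


def distributed_name(local_path: str) -> str:
    """SparkFiles.get() looks files up by base name, not by the driver path."""
    return os.path.basename(local_path)


def count_lines_on_executors(sc, local_path: str, num_tasks: int = 4):
    """Ship one file with addFile and read it from inside tasks.

    `sc` is a pyspark.SparkContext. Returns one line count per task.
    """
    sc.addFile(local_path)
    name = distributed_name(local_path)

    def count_lines(_):
        # Imported inside the task so it resolves on the executor.
        from pyspark import SparkFiles

        with open(SparkFiles.get(name)) as handle:
            return sum(1 for _line in handle)

    return sc.parallelize(range(num_tasks), num_tasks).map(count_lines).collect()
```

If every task reports the same line count, the file reached the executors. This approach suits small side files; for a large dump, a shared filesystem is likely the better fit.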

      • ruaok
        I'm enjoying it. :)
      • 2019-01-07 00723, 2019

      • iliekcomputers
        I was wondering how we worked with files when we worked on the recommendation engine.
      • 2019-01-07 00728, 2019

      • ruaok
        did the addFile method get the proper --master info?
      • 2019-01-07 00750, 2019

      • iliekcomputers
        ruaok: huh, i dunno. ideally it should have with the --master switch, but i'll check.
      • 2019-01-07 00751, 2019

      • ruaok
        everything was on the master filesystem, I think.
      • 2019-01-07 00721, 2019

      • ruaok
        and then we made rdds, which were distributed. but everything was on a local instance.
      • 2019-01-07 00736, 2019

      • iliekcomputers
        so ideally if I ran it from the master container, it should work?
      • 2019-01-07 00749, 2019

      • iliekcomputers
        i tried that, but i still got filenotfounderrors.
      • 2019-01-07 00748, 2019

      • ruaok
        I think the way to look at this is to make sure that our assumptions are correct.
      • 2019-01-07 00711, 2019

      • ruaok
        we are assuming/hoping that files got copied to HDFS and are available on all the nodes. is that correct?
      • 2019-01-07 00700, 2019

      • slurpee- joined the channel
      • 2019-01-07 00703, 2019

      • iliekcomputers
        ruaok: hdfs isn't involved yet.
      • 2019-01-07 00725, 2019

      • ruaok
        O_O
      • 2019-01-07 00729, 2019

      • iliekcomputers
      • 2019-01-07 00710, 2019

      • Slurpee has quit
      • 2019-01-07 00716, 2019

      • ruaok
        2.1.0? old docs?
      • 2019-01-07 00751, 2019

      • iliekcomputers
      • 2019-01-07 00718, 2019

      • ruaok
        heh.
      • 2019-01-07 00723, 2019

      • iliekcomputers
        another way I tried was to upload the files to hdfs before working with them.
      • 2019-01-07 00733, 2019

      • ruaok
        that was my assumption.
      • 2019-01-07 00744, 2019

      • iliekcomputers
        it did work. but it was really slow.
      • 2019-01-07 00748, 2019

      • ruaok
        upload to HDFS first, then specify HDFS file path to spark.
      • 2019-01-07 00708, 2019
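
The two steps named above (upload to HDFS, then hand Spark an HDFS path) can be sketched roughly as follows. Everything specific here is an assumption for illustration: the helper names, the namenode address, the target directory, and the idea that the dump files are JSON lines.

```python
import posixpath


def hdfs_put_command(local_path: str, hdfs_dir: str) -> list:
    """Argument list for `hdfs dfs -put`; -f overwrites an existing copy."""
    return ["hdfs", "dfs", "-put", "-f", local_path, hdfs_dir]


def hdfs_uri(namenode: str, hdfs_dir: str, filename: str) -> str:
    """Build the hdfs:// URI that Spark readers accept as a path."""
    return f"hdfs://{namenode}{posixpath.join(hdfs_dir, filename)}"


def load_listens(spark, uri: str):
    """Read the uploaded file into a DataFrame (JSON lines assumed)."""
    return spark.read.json(uri)


if __name__ == "__main__":
    # Hypothetical paths and namenode address, shown only to illustrate the flow.
    print(hdfs_put_command("/tmp/dump/2002.json", "/data/listens"))
    print(hdfs_uri("namenode:8020", "/data/listens", "2002.json"))
```

The command list can be passed to `subprocess.run(..., check=True)` on a machine with the Hadoop client installed, and the resulting URI goes to `load_listens`, so every executor reads from the same shared location instead of a container-local path.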

      • ruaok
        `it did work. but it was really slow.` that's a start.
      • 2019-01-07 00723, 2019

      • ruaok
        remember, we've done no tuning and have not really given a lot of ram.
      • 2019-01-07 00744, 2019

      • ruaok
        sounds like that should be our next step.
      • 2019-01-07 00753, 2019

      • iliekcomputers
        it took around 12 hrs to write dataframes for 2002 and 2003.
      • 2019-01-07 00703, 2019

      • iliekcomputers
        which is when I stopped it.
      • 2019-01-07 00708, 2019

      • ruaok
        I have no idea how resource allocation should work, so we need to learn.
      • 2019-01-07 00713, 2019

      • iliekcomputers
        hmm.
      • 2019-01-07 00716, 2019

      • ruaok
        hah, yes.
      • 2019-01-07 00744, 2019

      • ruaok
        we may also not have enough nodes -- only 4 nodes. but I suspect that we're not giving enough ram.
      • 2019-01-07 00757, 2019

      • ruaok
        each node may only be using half the ram it has, possibly even less.
      • 2019-01-07 00703, 2019

      • iliekcomputers
        hmm, could be it.
      • 2019-01-07 00723, 2019

      • iliekcomputers
        i'd like to get the spark UI back somehow.
      • 2019-01-07 00732, 2019

      • iliekcomputers
        makes it easier to analyze this stuff.
      • 2019-01-07 00712, 2019

      • ruaok
        I need to add your SSH key to the docker gateway container and the tunnels should work for you.
      • 2019-01-07 00728, 2019

      • iliekcomputers
        yokay. great.
      • 2019-01-07 00743, 2019

      • ruaok
        zas: do you know how to configure openvpn?
      • 2019-01-07 00755, 2019

      • ruaok
        that may be a better option, if someone knows how to tame it.
      • 2019-01-07 00756, 2019

      • iliekcomputers
        ruaok: with the recommendation engine, did you add any options to spark-submit to change the default memory used etc?
      • 2019-01-07 00716, 2019

      • ruaok
        yes, I think so.
      • 2019-01-07 00735, 2019

      • iliekcomputers
        hmm, nice. let me look into that then.
      • 2019-01-07 00754, 2019

      • ruaok
        spark-submit --master spark://195.201.112.36:7077 --executor-memory=29g `pwd`/<script> <args>
      • 2019-01-07 00703, 2019

      • ruaok
        from readme.md
      • 2019-01-07 00737, 2019
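
For context on the memory flag: Spark's documented default for `spark.executor.memory` is 1g, so a job submitted without `--executor-memory` may use far less RAM than the nodes have. A small sketch of assembling the submit command so the master and memory setting are never left out by accident; `spark_submit_command` and `job.py` are hypothetical names, and the master URL and `29g` value are simply the ones quoted above:

```python
import shlex


def spark_submit_command(master, script, script_args=(), executor_memory=None):
    """Build a spark-submit argument list with an explicit master.

    executor_memory is a Spark size string such as "29g"; when None the
    flag is omitted and Spark's own default applies.
    """
    cmd = ["spark-submit", "--master", master]
    if executor_memory is not None:
        cmd += ["--executor-memory", executor_memory]
    cmd.append(script)
    cmd.extend(script_args)
    return cmd


if __name__ == "__main__":
    cmd = spark_submit_command(
        "spark://195.201.112.36:7077", "job.py", executor_memory="29g"
    )
    # Print a copy-pasteable shell line; subprocess.run(cmd) would execute it.
    print(shlex.join(cmd))
```

Keeping the command as a list avoids shell-quoting problems when script arguments contain spaces.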

      • iliekcomputers facepalms. (should have read that) :P
      • 2019-01-07 00755, 2019

      • D4RK-PH0ENiX has quit
      • 2019-01-07 00709, 2019

      • D4RK-PH0ENiX joined the channel
      • 2019-01-07 00709, 2019

      • pristine_ has quit
      • 2019-01-07 00709, 2019

      • TOPIC: MetaBrainz Community and Development channel | MusicBrainz non-development: #musicbrainz | GSoC https://goo.gl/7jsjG2 | Meeting agenda (next meeting: 2019-01-07): Reviews, Google Code-in (Freso), Annual report 2018 (ruaok), mini summit (ruaok)
      • 2019-01-07 00724, 2019

      • ruaok
      • 2019-01-07 00742, 2019

      • ruaok
        you can always spot when india and IST cause something odd in the world. :)
      • 2019-01-07 00757, 2019

      • zas
        ruaok: about openvpn i did but a looong time ago, so i guess you better use online docs rather than my memory ;)
      • 2019-01-07 00708, 2019

      • ruaok
        ok, then sod that.
      • 2019-01-07 00738, 2019

      • zas
        about brazilian synth pop, the work on this album is pretty impressive, each song is finely crafted, not my usual kind of music, but definitely worth listening to (https://erica.bandcamp.com/album/beautiful for readers)
      • 2019-01-07 00728, 2019

      • zas
        she's sharing stages with another brazilian artist i just added to the db: https://pinaud.bandcamp.com/album/telephunk
      • 2019-01-07 00712, 2019

      • code_master5 joined the channel
      • 2019-01-07 00720, 2019

      • c1e0 joined the channel
      • 2019-01-07 00711, 2019

      • discopatrick joined the channel
      • 2019-01-07 00730, 2019

      • michelv has quit
      • 2019-01-07 00735, 2019

      • michelv joined the channel
      • 2019-01-07 00700, 2019

      • reosarevok looks at which of his PRs have hopelessly broken over the holiday season
      • 2019-01-07 00708, 2019

      • iliekcomputers
        0!
      • 2019-01-07 00749, 2019

      • code_master5
        iliekcomputers: 0! = 1. 😈
      • 2019-01-07 00703, 2019

      • iliekcomputers
        unexpected factorial
      • 2019-01-07 00705, 2019

      • iliekcomputers
        shit.
      • 2019-01-07 00706, 2019

      • CatQuest
        ho ho
      • 2019-01-07 00718, 2019

      • CatQuest
      • 2019-01-07 00723, 2019

      • CatQuest
        best answer
      • 2019-01-07 00757, 2019

      • ruaok
        unexpected factorial, kinda like surprise buttsecks?
      • 2019-01-07 00719, 2019

      • code_master5
        CatQuest: 🤭
      • 2019-01-07 00731, 2019

      • yvanzo
        Hi reosarevok, my (not broken yet) PRs are just waiting for your review :)
      • 2019-01-07 00757, 2019

      • reosarevok
        Ok, let me look into mine for a bit and then check
      • 2019-01-07 00756, 2019

      • CatQuest
      • 2019-01-07 00706, 2019

      • iliekcomputers
        first time getting import errors after spark-submit.
      • 2019-01-07 00712, 2019

      • github joined the channel
      • 2019-01-07 00712, 2019

      • github
        [listenbrainz-server] vansika opened pull request #475: make timestamp optional for playing now listens in RedisListenStore (master...playing-now-timestamp) https://git.io/fhGZE
      • 2019-01-07 00712, 2019

      • github has left the channel
      • 2019-01-07 00748, 2019

      • reosarevok
        huh
      • 2019-01-07 00750, 2019

      • reosarevok
        Cannot get React.AbstractComponent because property AbstractComponent is missing in module react [1].
      • 2019-01-07 00759, 2019

      • reosarevok
        What am I missing
      • 2019-01-07 00725, 2019

      • reosarevok
        Oh. I'm getting that in master too
      • 2019-01-07 00718, 2019

      • reosarevok also sighs at "Identifier 'l_statistics' is not in camel case" etc by eslint
      • 2019-01-07 00733, 2019

      • reosarevok
        Tempted to just send a PR camelcasing the whole thing so it shuts up
      • 2019-01-07 00705, 2019

      • yvanzo
        Exceptions can be added to eslint rules with https://eslint.org/docs/rules/camelcase#allow
      • 2019-01-07 00709, 2019

      • c1e0 has quit
      • 2019-01-07 00717, 2019

      • reosarevok
        But is there an actual reason these are not in camelcase?
      • 2019-01-07 00737, 2019

      • yvanzo
        web service, if I recall correctly
      • 2019-01-07 00726, 2019

      • yvanzo
        Wait, l_statistics is not served by WS, there is no reason :)
      • 2019-01-07 00720, 2019

      • reosarevok
        But l_relationships or whatever can't be changed?
      • 2019-01-07 00719, 2019

      • Lotheric has quit
      • 2019-01-07 00701, 2019

      • yvanzo
        It might be because of N_l.
      • 2019-01-07 00743, 2019

      • yvanzo
        I would vote for making exceptions of N_l, N_ln, N_lp and camelcasing others.