#metabrainz

      • D4RK-PH0ENiX joined the channel
      • 2018-11-08 31236, 2018

      • ruaok
        zas: no wifi on train, I take it?
      • 2018-11-08 31225, 2018

      • zas
        just boarded, intermittent wifi and/or 4G
      • 2018-11-08 31251, 2018

      • ruaok
        perfect for doing remote work! :) :)
      • 2018-11-08 31216, 2018

      • ruaok
      • 2018-11-08 31237, 2018

      • Leo_Verto
        woo!
      • 2018-11-08 31254, 2018

      • ruaok
        Leo_Verto: so, no luck on that problem from yesterday.
      • 2018-11-08 31235, 2018

      • Leo_Verto
        hm
      • 2018-11-08 31223, 2018

      • ruaok
        and even with the firewall completely turned off, it doesn't work.
      • 2018-11-08 31204, 2018

      • Leo_Verto
        can you access hadoop from another container in spark-network?
      • 2018-11-08 31240, 2018

      • ruaok
        yep
      • 2018-11-08 31203, 2018

      • michelv has quit
      • 2018-11-08 31223, 2018

      • Leo_Verto
        okay, are you sure setting up the services on a dedicated overlay network is what you want to do?
      • 2018-11-08 31231, 2018

      • ruaok
        yes, yes.
      • 2018-11-08 31235, 2018

      • ruaok
        are for a login?
      • 2018-11-08 31237, 2018

      • ruaok
        care
      • 2018-11-08 31242, 2018

      • Leo_Verto
        sure
      • 2018-11-08 31251, 2018

      • ruaok
        which key should I use?
      • 2018-11-08 31256, 2018

      • ruaok
        and login?
      • 2018-11-08 31249, 2018

      • Leo_Verto
        if I wanted to make a service externally accessible I'd use a bridged/host network instead, not sure if that works with services though
      • 2018-11-08 31224, 2018

      • Leo_Verto
        although I use expose instead of publish, hm
      • 2018-11-08 31200, 2018

      • ruaok
        the overlay network is what enabled the nodes to easily communicate with one another.
      • 2018-11-08 31218, 2018

      • ruaok
        which is the whole reason why I am using docker swarm.
      • 2018-11-08 31239, 2018

      • ruaok
        and it works quite well from what I can see. except this whole publishing ports bit
      • 2018-11-08 31206, 2018

      • Leo_Verto
        yeah, I think you might need an additional bridged network for publish to work
      • 2018-11-08 31223, 2018

      • Leo_Verto
        at least that's how I got my (non-swarm) setup working
      • 2018-11-08 31254, 2018

      • ruaok
        hmm.
      • 2018-11-08 31204, 2018

      • ruaok
        so, create bridged network, add services to both bridged network and to overlay network.
      • 2018-11-08 31220, 2018

      • ruaok
        and then?
      • 2018-11-08 31207, 2018

      • Leo_Verto
        I mean that's how it'd work with normal containers, not entirely sure what the ingress network does here
      • 2018-11-08 31209, 2018

      • Leo_Verto
        also my network graphing tool is broken, nice
      • 2018-11-08 31226, 2018

      • ruaok
        except that it works for port 8080, for the spark-master service.
      • 2018-11-08 31249, 2018

      • ruaok
        wget http://localhost:8080 -> works great.
      • 2018-11-08 31202, 2018

      • ruaok
        that goes into the spark service. but the hadoop service one doesn't work.
      • 2018-11-08 31210, 2018

      • ruaok
        and they are both on the overlay network.
      • 2018-11-08 31210, 2018

      • Leo_Verto
        if you docker inspect the individual containers you can see that not only are they on spark-network but also ingress
      • 2018-11-08 31237, 2018

      • ruaok
        that's normal/correct, no?
      • 2018-11-08 31254, 2018

      • Leo_Verto
        yeah, I'm pretty sure spark-network isn't really used here, the requests use ingress
      • 2018-11-08 31206, 2018

      • Leo_Verto
        and running docker service inspect --format="{{json .Endpoint.Spec.Ports}}" <service> I get pretty much the same setup for both
      • 2018-11-08 31236, 2018
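The comparison Leo_Verto describes can be sketched in Python. This is only an illustration: the two JSON strings below are made-up sample output, assuming port 8080 for spark-master and 8088 for the hadoop service, both published in ingress mode. The field names follow the shape `docker service inspect` prints for `.Endpoint.Spec.Ports`; `port_summary` is a hypothetical helper, not something from the repository.

```python
import json


def port_summary(inspect_json: str) -> list:
    """Reduce the JSON printed for .Endpoint.Spec.Ports to comparable tuples."""
    ports = json.loads(inspect_json) or []  # tolerate a JSON null
    return [
        (p["PublishedPort"], p["TargetPort"], p["Protocol"], p["PublishMode"])
        for p in ports
    ]


# Hypothetical output for the two services (values are illustrative only).
spark_ports = '[{"Protocol":"tcp","TargetPort":8080,"PublishedPort":8080,"PublishMode":"ingress"}]'
hadoop_ports = '[{"Protocol":"tcp","TargetPort":8088,"PublishedPort":8088,"PublishMode":"ingress"}]'

for name, raw in (("spark-master", spark_ports), ("hadoop-master", hadoop_ports)):
    print(name, port_summary(raw))
```

If both services show the same publish mode and a sensible port mapping, as they apparently did here, the swarm-side configuration is probably not where the difference lies.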

      • ruaok
        for a single node setup it isn't used, but won't it be needed for a multi-node setup (which we use)?
      • 2018-11-08 31240, 2018

      • Leo_Verto
        later on definitely, but not right now. maybe we should try removing it to see if it interferes here?
      • 2018-11-08 31206, 2018

      • ruaok
        sure.
      • 2018-11-08 31219, 2018

      • ruaok
        would you like to try it yourself or do you want me to make the changes?
      • 2018-11-08 31244, 2018

      • ruaok
        if you clone this, you can have a play yourself
      • 2018-11-08 31245, 2018

      • ruaok
      • 2018-11-08 31204, 2018

      • Leo_Verto
        any reason you aren't using compose? :P
      • 2018-11-08 31232, 2018

      • ruaok
        compose doesn't do swarms. or at least it didn't when I started this mess.
      • 2018-11-08 31203, 2018

      • ruaok
        and I'm not interested in using compose much when I have to reinvent the wheel into normal run commands half the time. might as well just do run commands from the get go.
      • 2018-11-08 31215, 2018

      • ruaok snaps his unix suspenders in defiance.
      • 2018-11-08 31237, 2018

      • ruaok
        I just pushed the publish change from last night.
      • 2018-11-08 31255, 2018

      • ruaok
        but if you clone this and then run docker/start-master-service.sh
      • 2018-11-08 31259, 2018

      • ruaok
        you can start it yourself.
      • 2018-11-08 31215, 2018

      • Leo_Verto
        okay, thanks
      • 2018-11-08 31216, 2018

      • ruaok
        then start-worker-service.sh
      • 2018-11-08 31222, 2018

      • ruaok
        and the requisite stop scripts.
      • 2018-11-08 31249, 2018

      • ruaok
        a working setup should yield both 8088 and 8080 services being mapped to localhost.
      • 2018-11-08 31257, 2018

      • Leo_Verto
        okay these containers take forever to pull
      • 2018-11-08 31224, 2018

      • michelv joined the channel
      • 2018-11-08 31203, 2018

      • jwf has quit
      • 2018-11-08 31203, 2018

      • CallerNo6 has quit
      • 2018-11-08 31203, 2018

      • Sophist-UK has quit
      • 2018-11-08 31204, 2018

      • yvanzo has quit
      • 2018-11-08 31224, 2018

      • CallerNo6 joined the channel
      • 2018-11-08 31230, 2018

      • yvanzo joined the channel
      • 2018-11-08 31222, 2018

      • Sophist-UK joined the channel
      • 2018-11-08 31233, 2018

      • jwf joined the channel
      • 2018-11-08 31229, 2018

      • Leo_Verto
        ruaok, figured it out
      • 2018-11-08 31241, 2018

      • Leo_Verto
        the webinterface on hadoop-master is binding to 10.0.0.10 for some reason
      • 2018-11-08 31259, 2018

      • Leo_Verto
        and 10.0.0.0/24 is spark-network
      • 2018-11-08 31212, 2018

      • Leo_Verto
        I think setting "yarn.nodemanager.runtime.linux.docker.default-container-network" and "yarn.nodemanager.runtime.linux.docker.allowed-container-networks" in the yarn site config here https://github.com/metabrainz/hadoop-cluster-dock… should fix it
      • 2018-11-08 31234, 2018

      • ruaok
        That seems ok, no?
      • 2018-11-08 31242, 2018

      • Leo_Verto
        there's this doc entry about running hadoop in docker https://hadoop.apache.org/docs/current/hadoop-yar…
      • 2018-11-08 31202, 2018

      • Leo_Verto
        well, that means you can't access the webinterface from outside spark-network, no matter what you publish
      • 2018-11-08 31244, 2018

      • ruaok
        ah, the publish command doesn't bridge it?
      • 2018-11-08 31212, 2018

      • Leo_Verto
        nope, the container has to be on a bridge network and the service inside must bind to that network or all of them
      • 2018-11-08 31235, 2018
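The point about bind addresses can be reproduced without Docker. The following is a minimal Python sketch, not the Hadoop code path: `can_connect` is a hypothetical helper that starts a TCP listener bound to one address and then tries to reach it through another. Loopback addresses stand in for the container's network interfaces.

```python
import socket


def can_connect(bind_host: str, connect_host: str, timeout: float = 1.0) -> bool:
    """Listen on bind_host (ephemeral port) and try to connect via connect_host."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind((bind_host, 0))  # port 0: let the OS pick a free port
        server.listen(1)
        port = server.getsockname()[1]
        try:
            # The connection completes in the listen backlog; no accept() needed.
            with socket.create_connection((connect_host, port), timeout=timeout):
                return True
        except OSError:
            return False


# A listener on the wildcard address is reachable through any local interface.
print(can_connect("0.0.0.0", "127.0.0.1"))
# A listener on one specific address is reachable through that address.
print(can_connect("127.0.0.1", "127.0.0.1"))
```

On Linux, where the whole 127.0.0.0/8 range is loopback, `can_connect("127.0.0.2", "127.0.0.1")` should fail, which mirrors a web interface bound to 10.0.0.10 being unreachable through the ingress network.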

      • ruaok
        ok, let me try the settings you suggested.
      • 2018-11-08 31258, 2018

      • Leo_Verto
        but the ingress network should work as a bridge, I think
      • 2018-11-08 31234, 2018

      • Leo_Verto
        that is 10.255.0.0/24 btw
      • 2018-11-08 31247, 2018

      • ruaok
        yarn.nodemanager.runtime.linux.docker.default-container-network ==> ingress
      • 2018-11-08 31248, 2018

      • ruaok
        ?
      • 2018-11-08 31210, 2018

      • Leo_Verto
        yeah
      • 2018-11-08 31218, 2018

      • ruaok
        like this perhaps?
      • 2018-11-08 31218, 2018

      • ruaok
      • 2018-11-08 31245, 2018

      • Leo_Verto
        hmm, reading more of that article it seems like that requires using the "Linux Container Executor" to start yarn which is not something we want, right?
      • 2018-11-08 31225, 2018

      • ruaok
        dunno what that exactly is. I want swarm to start the containers.
      • 2018-11-08 31245, 2018

      • Leo_Verto
        yeah, th
      • 2018-11-08 31255, 2018

      • Leo_Verto
        there has to be a simpler way of just changing the bind address
      • 2018-11-08 31239, 2018

      • Leo_Verto
        but for some reason the documentation doesn't mention the web server, ugh https://mapr.com/docs/home/ReferenceGuide/yarn-si…
      • 2018-11-08 31249, 2018

      • ruaok
        and we're on hadoop 3.1.1. at that...
      • 2018-11-08 31255, 2018

      • Leo_Verto
        oh yeah
      • 2018-11-08 31208, 2018

      • Leo_Verto
        yarn.resourcemanager.hostname
      • 2018-11-08 31232, 2018

      • Leo_Verto
        apparently that sort of stuff is "documented" here https://hadoop.apache.org/docs/r3.1.1/hadoop-yarn…
      • 2018-11-08 31227, 2018

      • Leo_Verto
        so you should be fine setting that to 0.0.0.0 I suppose, otherwise it wouldn't be accessible from spark-network
      • 2018-11-08 31218, 2018

      • ruaok
        Let me try that in a bit, afk for a moment.
      • 2018-11-08 31241, 2018

      • iliekcomputers
      • 2018-11-08 31223, 2018

      • Leo_Verto
        iliekcomputers, you should see New Year's Eve here. one year we had so much smog there was only ~15m visibility
      • 2018-11-08 31252, 2018

      • iliekcomputers
        So dumb.
      • 2018-11-08 31216, 2018

      • Leo_Verto
        literally lights up the particulates sensor maps https://www.umweltbundesamt.de/sites/default/file…
      • 2018-11-08 31225, 2018

      • iliekcomputers
        Official fireworks time is supposed to be 8-10 PM
      • 2018-11-08 31226, 2018

      • Soumya joined the channel
      • 2018-11-08 31240, 2018

      • reosarevok
        There's official fireworks time? :D
      • 2018-11-08 31254, 2018

      • Soumya
        Hi... I am a gci participant...
      • 2018-11-08 31228, 2018

      • thefar8
        Hi Soumya
      • 2018-11-08 31251, 2018

      • thefar8
        welcome to metabrainz irc
      • 2018-11-08 31258, 2018

      • Soumya
        Greetings! I need some help regarding my task..
      • 2018-11-08 31226, 2018

      • thefar8
        which task is it?
      • 2018-11-08 31232, 2018

      • Soumya
        I have to collect some imteresting facts aboit METABRAINZ
      • 2018-11-08 31258, 2018

      • Soumya
        Interesting***
      • 2018-11-08 31202, 2018

      • Soumya
        About***
      • 2018-11-08 31254, 2018

      • iliekcomputers
        Soumya: Hi, can you link the task you're working on?
      • 2018-11-08 31220, 2018

      • iliekcomputers
        reosarevok: there is! it's to stop them going all night I guess, but nobody cares ;)
      • 2018-11-08 31223, 2018

      • Soumya
      • 2018-11-08 31226, 2018

      • iliekcomputers
        Soumya: ok, thanks. I can talk to you about CritiqueBrainz, ListenBrainz or AcousticBrainz if you want. I'm not a mentor of that task, so I don't know exactly what kind of facts they're expecting.
      • 2018-11-08 31243, 2018

      • Soumya
        Okay.... I'll ask after a few minutes of researching...
      • 2018-11-08 31259, 2018

      • Soumya has quit
      • 2018-11-08 31214, 2018

      • CallerNo6
        Soumya, also, non-mentors can't follow that link, but might have interesting input. Can you pastebin the task description?
      • 2018-11-08 31252, 2018

      • Sophist-UK has quit
      • 2018-11-08 31243, 2018

      • iliekcomputers
      • 2018-11-08 31226, 2018

      • CallerNo6
        iliekcomputers, thanks!
      • 2018-11-08 31243, 2018

      • iliekcomputers
        =)
      • 2018-11-08 31212, 2018

      • bukwurm has quit
      • 2018-11-08 31239, 2018

      • iliekcomputers
      • 2018-11-08 31204, 2018

      • ruaok
        on your server?
      • 2018-11-08 31234, 2018

      • iliekcomputers
        on my laptop, local small dump :D
      • 2018-11-08 31240, 2018

      • ruaok
        k.
      • 2018-11-08 31252, 2018

      • ruaok
        perfect timing. because I hope to have hdfs up and running very soon
      • 2018-11-08 31207, 2018

      • ruaok
        Leo_Verto: that 0.0.0.0 trick worked and in hindsight makes total sense.
      • 2018-11-08 31252, 2018

      • Leo_Verto
        yeah, all the time spent digging through docker docs and it's something as simple as binding to the wrong interface…
      • 2018-11-08 31202, 2018

      • ruaok
        yeah...
      • 2018-11-08 31250, 2018

      • Leo_Verto
        anyway, let me know when spark's running and we can get started with the jupyter notebook server :D
      • 2018-11-08 31229, 2018

      • michelv has quit
      • 2018-11-08 31214, 2018

      • ruaok
        looks like it needs more config settings.
      • 2018-11-08 31245, 2018

      • ruaok
        yarn.resourcemanager.hostname defines the hostname which others refer to the service with.
      • 2018-11-08 31250, 2018

      • ruaok
        useless for connecting to the service.
      • 2018-11-08 31223, 2018

      • ruaok
        yarn.resourcemanager.bind-host I bet.
      • 2018-11-08 31252, 2018
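For reference, a small Python sketch that renders the two properties under discussion in the `<configuration>`/`<property>` layout used by `yarn-site.xml`. The values are assumptions: the log does not confirm that `yarn.resourcemanager.bind-host` was the final fix, and `hadoop-master` is used as the hostname only because that name appears earlier in the conversation. `yarn_site` is a hypothetical helper; `ET.indent` needs Python 3.9 or newer.

```python
import xml.etree.ElementTree as ET


def yarn_site(properties: dict) -> str:
    """Render name/value pairs as a Hadoop-style <configuration> document."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    ET.indent(root)  # pretty-print in place (Python 3.9+)
    return ET.tostring(root, encoding="unicode")


print(yarn_site({
    # Name other nodes use to refer to the ResourceManager (assumed value).
    "yarn.resourcemanager.hostname": "hadoop-master",
    # Address the ResourceManager listens on; 0.0.0.0 means all interfaces.
    "yarn.resourcemanager.bind-host": "0.0.0.0",
}))
```

Keeping the two settings separate matches ruaok's observation: one names the service for its peers, the other decides which interfaces accept connections.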

      • github joined the channel
      • 2018-11-08 31252, 2018

      • github
        [picard] phw opened pull request #1027: PICARD-1375: Handle empty values in metadata sanitation (master...PICARD-1375) https://git.io/fpkkR
      • 2018-11-08 31252, 2018

      • github has left the channel
      • 2018-11-08 31217, 2018

      • ruaok
        iliekcomputers: exciting, I like that PR!
      • 2018-11-08 31206, 2018

      • iliekcomputers
        ruaok: yusss. =)
      • 2018-11-08 31243, 2018

      • iliekcomputers
        For streaming I was thinking, a simple flask app with an API and a spark writer on one of the nodes
      • 2018-11-08 31201, 2018

      • ruaok
        please elaborate on that thought -- sounds complicated.
      • 2018-11-08 31248, 2018

      • iliekcomputers
        the spark nodes will probably not have a direct connection with lemmy right?
      • 2018-11-08 31225, 2018

      • iliekcomputers
        run a small flask app on the nodes to which the `bigquery-writer` equivalent in listenbrainz submits new listens
      • 2018-11-08 31208, 2018

      • iliekcomputers
        the flask app then submits to a script running on the nodes which writes them to hdfs
      • 2018-11-08 31252, 2018

      • ruaok
        > the spark nodes will probably not have a direct connection with lemmy right?
      • 2018-11-08 31255, 2018

      • ruaok
        they could
      • 2018-11-08 31213, 2018

      • ruaok
        rabbitmq makes a good isolator in case the cluster is not available.
      • 2018-11-08 31219, 2018

      • iliekcomputers
        could the spark nodes access rabbitmq?
      • 2018-11-08 31230, 2018

      • ruaok
        yes.
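
The "isolator" idea can be illustrated with a toy Python sketch. This is not RabbitMQ client code and not the ListenBrainz design: an in-memory deque stands in for the message queue, and `BufferedWriter`, `write_batch`, and the sample listens are all hypothetical. It only shows the behaviour being discussed, that listens stay queued while the cluster is unavailable and are delivered once it returns.

```python
from collections import deque
from typing import Callable, Deque, List


class BufferedWriter:
    """Holds listens until the downstream writer accepts them."""

    def __init__(self, write_batch: Callable[[List[dict]], None]) -> None:
        self._pending: Deque[dict] = deque()
        self._write_batch = write_batch

    def submit(self, listen: dict) -> None:
        self._pending.append(listen)

    def flush(self) -> int:
        """Try to deliver everything pending; return how many were written."""
        if not self._pending:
            return 0
        batch = list(self._pending)
        try:
            self._write_batch(batch)
        except ConnectionError:
            return 0  # cluster unavailable: keep listens queued for a retry
        self._pending.clear()
        return len(batch)

    @property
    def pending(self) -> int:
        return len(self._pending)


# Stand-in for an HDFS writer that can be switched "down" and "up".
stored: List[dict] = []
cluster_up = False


def write_batch(batch: List[dict]) -> None:
    if not cluster_up:
        raise ConnectionError("cluster not reachable")
    stored.extend(batch)


writer = BufferedWriter(write_batch)
writer.submit({"user": "example", "track": "A"})
writer.submit({"user": "example", "track": "B"})
first_attempt = writer.flush()   # cluster is down, nothing is written
cluster_up = True
second_attempt = writer.flush()  # cluster is back, the backlog is delivered
print(first_attempt, second_attempt, len(stored))
```

A real deployment would rely on the broker's own durability and acknowledgements rather than process memory; the sketch only mirrors the decoupling.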