#metabrainz

      • D4RK-PH0ENiX joined the channel
      • 2018-11-15 31928, 2018

      • Zialus_PT has quit
      • 2018-11-15 31900, 2018

      • Zialus joined the channel
      • 2018-11-15 31925, 2018

      • bruce_r joined the channel
      • 2018-11-15 31923, 2018

      • ferbncode has quit
      • 2018-11-15 31954, 2018

      • Nyanko-sensei joined the channel
      • 2018-11-15 31938, 2018

      • D4RK-PH0ENiX has quit
      • 2018-11-15 31943, 2018

      • Dr-Flay has quit
      • 2018-11-15 31908, 2018

      • Dr-Flay joined the channel
      • 2018-11-15 31925, 2018

      • thefar8 has quit
      • 2018-11-15 31919, 2018

      • Nyanko-sensei has quit
      • 2018-11-15 31903, 2018

      • D4RK-PH0ENiX joined the channel
      • 2018-11-15 31942, 2018

      • Leo_Verto_ joined the channel
      • 2018-11-15 31949, 2018

      • Leo_Verto has quit
      • 2018-11-15 31949, 2018

      • Leo_Verto_ is now known as Leo_Verto
      • 2018-11-15 31904, 2018

      • bruce_r has quit
      • 2018-11-15 31942, 2018

      • thefar8 joined the channel
      • 2018-11-15 31919, 2018

      • Nyanko-sensei joined the channel
      • 2018-11-15 31953, 2018

      • D4RK-PH0ENiX has quit
      • 2018-11-15 31927, 2018

      • MightyJay has quit
      • 2018-11-15 31940, 2018

      • rsh7 joined the channel
      • 2018-11-15 31945, 2018

      • MightyJay joined the channel
      • 2018-11-15 31926, 2018

      • yokel has quit
      • 2018-11-15 31956, 2018

      • ruaok
        so, what have you learned, iliekcomputers?
      • 2018-11-15 31957, 2018

      • Dr-Flay has quit
      • 2018-11-15 31948, 2018

      • iliekcomputers
        ruaok: so i was mucking around with hadoop and config values yesterday.
      • 2018-11-15 31914, 2018

      • iliekcomputers
        there are two interfaces that clients can use to read/write/interact with hdfs
      • 2018-11-15 31945, 2018

      • iliekcomputers
        one is the standard rpc interface, the other is a http api type interface called webhdfs
      • 2018-11-15 31903, 2018

      • ruaok
        do you have a feeling for which is preferred?
      • 2018-11-15 31929, 2018

      • iliekcomputers
        ruaok: most of the simpler python clients I've looked at use webhdfs
      • 2018-11-15 31940, 2018

      • iliekcomputers
        the client I wrote the code with uses webhdfs too
      • 2018-11-15 31951, 2018

      • iliekcomputers
        `hdfs dfs -put` used RPC
      • 2018-11-15 31909, 2018
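
Editor's note: as a rough illustration of the HTTP-style interface mentioned above, WebHDFS operations are plain REST calls under the `/webhdfs/v1` prefix on the NameNode's HTTP port. The sketch below only builds such a URL and makes no network call. The helper name `webhdfs_url`, the path `/user/foo`, and the port 9870 (the Hadoop 3 default NameNode HTTP port) are assumptions for illustration; the cluster discussed here may be configured differently.

```python
from urllib.parse import quote, urlencode


def webhdfs_url(host, port, path, op, **params):
    """Build a WebHDFS REST URL for an absolute HDFS path and an operation.

    Hypothetical helper: it only formats the URL and does not contact the cluster.
    """
    # The operation goes in the "op" query parameter, followed by any extras.
    query = urlencode({"op": op, **params})
    # quote() keeps "/" intact, so the HDFS path stays readable in the URL.
    return f"http://{host}:{port}/webhdfs/v1{quote(path)}?{query}"


# Assumed values: "hadoop-master" as the NameNode host, 9870 as its HTTP port.
list_url = webhdfs_url("hadoop-master", 9870, "/user/foo", "LISTSTATUS")
create_url = webhdfs_url("hadoop-master", 9870, "/user/foo", "CREATE", overwrite="true")
print(list_url)
print(create_url)
```

With WebHDFS, a `CREATE` request to the NameNode is normally answered with a redirect to a DataNode, so the client must be able to reach whatever DataNode address the NameNode hands out.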

      • ruaok
        and that is the one interface we got working so far. lol.
      • 2018-11-15 31923, 2018

      • iliekcomputers
        yeah, about that.
      • 2018-11-15 31941, 2018

      • iliekcomputers
        i changed some config values and used `stop-master-service` and `start-master-service`
      • 2018-11-15 31949, 2018

      • iliekcomputers
        and now the put thing isn't working...
      • 2018-11-15 31955, 2018

      • ruaok
        heh, lol.
      • 2018-11-15 31957, 2018

      • iliekcomputers
        i'm not sure why exactly
      • 2018-11-15 31907, 2018

      • ruaok
        hang on, there might be uncommitted changes in my repo.
      • 2018-11-15 31936, 2018

      • iliekcomputers
        the latest hadoop-yarn image is just the master branch of github/meb/hadoop-docker
      • 2018-11-15 31904, 2018

      • ruaok
        the playground repo only has these two lines
      • 2018-11-15 31907, 2018

      • ruaok
        + -p published=8088,target=8088,mode=host \
      • 2018-11-15 31907, 2018

      • ruaok
        + -p published=9864,target=9864 \
      • 2018-11-15 31915, 2018

      • ruaok
        which don't affect anything.
      • 2018-11-15 31900, 2018

      • iliekcomputers
        maybe it's the worker nodes. how exactly do I find out what IP they are on and if they're running.
      • 2018-11-15 31910, 2018

      • Nyanko-sensei has quit
      • 2018-11-15 31926, 2018

      • iliekcomputers
        from the PR Leo_Verto made, it seemed like `hdfs dfsadmin -report` wasn't trustworthy.
      • 2018-11-15 31946, 2018

      • ruaok
        changes committed, and yes those would make a difference.
      • 2018-11-15 31956, 2018

      • ruaok
        I've built a new image and am pushing it now
      • 2018-11-15 31957, 2018

      • D4RK-PH0ENiX joined the channel
      • 2018-11-15 31958, 2018

      • ruaok
        push.
      • 2018-11-15 31901, 2018

      • ruaok
        +ed
      • 2018-11-15 31902, 2018

      • D4RK-PH0ENiX has quit
      • 2018-11-15 31906, 2018

      • iliekcomputers
        ok cool
      • 2018-11-15 31909, 2018

      • ruaok
        you should be able to stop/start and have it work again.
      • 2018-11-15 31916, 2018

      • iliekcomputers
        let me try the put thing again.
      • 2018-11-15 31937, 2018

      • D4RK-PH0ENiX joined the channel
      • 2018-11-15 31940, 2018

      • iliekcomputers
      • 2018-11-15 31900, 2018

      • ruaok
        did you restart both the master and workers?
      • 2018-11-15 31915, 2018

      • iliekcomputers
        i don't have access to the workers.
      • 2018-11-15 31925, 2018

      • ruaok
        you don't need it.
      • 2018-11-15 31929, 2018

      • iliekcomputers
        afaik
      • 2018-11-15 31932, 2018

      • ruaok
        ./stop-workers.sh
      • 2018-11-15 31936, 2018

      • iliekcomputers
        ah
      • 2018-11-15 31937, 2018

      • ruaok
        ./start-workser.sh
      • 2018-11-15 31939, 2018

      • iliekcomputers
        oh cool
      • 2018-11-15 31940, 2018

      • ruaok
        -s
      • 2018-11-15 31917, 2018

      • ruaok
        the nice thing about this system is that you never have to care about the worker nodes. they get fired up and join the cluster.
      • 2018-11-15 31948, 2018

      • iliekcomputers
        Usage: start-worker-service.sh <replicas>
      • 2018-11-15 31954, 2018

      • iliekcomputers
        replicas is a number?
      • 2018-11-15 31957, 2018

      • iliekcomputers
        3?
      • 2018-11-15 31901, 2018

      • ruaok
        yep
      • 2018-11-15 31911, 2018

      • ruaok
        (to both)
      • 2018-11-15 31922, 2018

      • iliekcomputers
        noice, that's pretty cool.
      • 2018-11-15 31912, 2018

      • outsidecontext joined the channel
      • 2018-11-15 31941, 2018

      • iliekcomputers
        still the same error
      • 2018-11-15 31904, 2018

      • iliekcomputers
        can you confirm for once that the workers are on 10.0.0.24, 10.0.0.23 etc?
      • 2018-11-15 31948, 2018

      • ruaok
        the 10 network is the overlay network and every time you start and stop the workers, they get new IPs.
      • 2018-11-15 31911, 2018

      • ruaok
        I'm going to restart the cluster using my setup. hang on
      • 2018-11-15 31937, 2018

      • ruaok
        now it works again.
      • 2018-11-15 31938, 2018

      • ruaok
        why?
      • 2018-11-15 31959, 2018

      • iliekcomputers
        ruaok: foo is an empty file
      • 2018-11-15 31914, 2018

      • ruaok
        yea, I just touched it.
      • 2018-11-15 31934, 2018

      • iliekcomputers
        once it has data, it doesn't work...
      • 2018-11-15 31941, 2018

      • iliekcomputers
        try again, i echoed some text into it.
      • 2018-11-15 31907, 2018

      • ruaok
        doh.
      • 2018-11-15 31919, 2018

      • ruaok
        huh, interesting.
      • 2018-11-15 31925, 2018

      • ruaok
        but that used to work, no?
      • 2018-11-15 31922, 2018

      • ruaok
        well, we should really focus on the http version of this interface and go with that.
      • 2018-11-15 31953, 2018

      • ruaok
        are you annoyed by the fact that the only error messages that hadoop gives are java stacktraces?
      • 2018-11-15 31904, 2018

      • iliekcomputers
        yes.
      • 2018-11-15 31914, 2018

      • iliekcomputers
        not very helpful
      • 2018-11-15 31929, 2018

      • ruaok
        we'd get flayed alive if we wrote code like that.
      • 2018-11-15 31936, 2018

      • ruaok
        but in the java world, I guess you're used to pain.
      • 2018-11-15 31930, 2018

      • iliekcomputers
        I think we should look at what ports are accessible to the master from the workers
      • 2018-11-15 31919, 2018

      • ruaok
        let me log into a worker
      • 2018-11-15 31947, 2018

      • iliekcomputers
        there's probably some reason why the master isn't able to connect with the workers
      • 2018-11-15 31935, 2018

      • ruaok
      • 2018-11-15 31937, 2018

      • ruaok
        the workers have far fewer ports open.
      • 2018-11-15 31952, 2018

      • iliekcomputers
      • 2018-11-15 31931, 2018

      • iliekcomputers
        The ports 9866 and 9864 are the ones we want accessible from the master for sure
      • 2018-11-15 31912, 2018

      • iliekcomputers
      • 2018-11-15 31915, 2018

      • ruaok
        as in a process on master makes a connection to port 9866/9864 on a worker?
      • 2018-11-15 31926, 2018

      • iliekcomputers
        ruaok: yes
      • 2018-11-15 31925, 2018

      • ruaok
        ok, both of those are exposed on the inside of the worker containers.
      • 2018-11-15 31937, 2018

      • ruaok
        could you ping 10.0.0.39 ?
      • 2018-11-15 31952, 2018

      • iliekcomputers
        ping works
      • 2018-11-15 31957, 2018

      • ruaok
        ok, then we need to find which setting causes those ports to be bound on all interfaces.
      • 2018-11-15 31954, 2018

      • ruaok
        dfs.namenode.http-bind-host ?
      • 2018-11-15 31949, 2018

      • ruaok starts a build
      • 2018-11-15 31911, 2018

      • iliekcomputers
      • 2018-11-15 31913, 2018

      • ruaok
        yay!
      • 2018-11-15 31924, 2018

      • ruaok
        both ports are not reachable from the master node.
      • 2018-11-15 31927, 2018

      • ruaok
        *now
      • 2018-11-15 31936, 2018

      • iliekcomputers
        I'm still getting connection refused from inside the master container
      • 2018-11-15 31951, 2018

      • ruaok
        ok, remember that hadoop-master is the master node.
      • 2018-11-15 31958, 2018

      • ruaok
        03a472b53968 is a worker node
      • 2018-11-15 31908, 2018

      • ruaok
      • 2018-11-15 31932, 2018

      • iliekcomputers
        no data received. but it connected.
      • 2018-11-15 31956, 2018

      • ruaok
        probably an invalid wget command, but at least the port is open.
      • 2018-11-15 31910, 2018
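
Editor's note: the reachability check discussed above (can a process on the master open a TCP connection to ports 9864 and 9866 on a worker?) can be sketched without relying on a possibly malformed `wget` command. This is a minimal sketch using only the Python standard library; the function names `port_open` and `check_ports` are hypothetical, and the IP in the usage comment is just the worker address mentioned earlier in the chat, which changes whenever the workers restart.

```python
import socket


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


def check_ports(host, ports, timeout=2.0):
    """Map each port to whether it accepted a TCP connection."""
    return {port: port_open(host, port, timeout) for port in ports}


# Example use from inside the master container (address is an assumption):
#   check_ports("10.0.0.39", [9864, 9866])
```

A successful connect only shows that something is listening on the port. It does not show that the DataNode is healthy or that the NameNode is advertising the right address to clients.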

      • iliekcomputers
        yes
      • 2018-11-15 31927, 2018

      • iliekcomputers
        I see that the client is trying to connect to incorrect IPs.
      • 2018-11-15 31938, 2018

      • iliekcomputers
        This was the thing that Leo_Verto had fixed
      • 2018-11-15 31957, 2018

      • ruaok
        did we lose his PR somehow?
      • 2018-11-15 31912, 2018

      • iliekcomputers
        core-site.xml in /usr/local/hadoop/etc/hadoop/ has the config value set correctly.
      • 2018-11-15 31920, 2018

      • iliekcomputers
        maybe it needs to be in hdfs-site.xml ?
      • 2018-11-15 31949, 2018

      • ruaok
        can't hurt to try
      • 2018-11-15 31954, 2018

      • ruaok is on it
      • 2018-11-15 31924, 2018

      • ruaok
        change applied, cluster restarted.
      • 2018-11-15 31938, 2018

      • ruaok
        try it again, iliekcomputers
      • 2018-11-15 31934, 2018

      • iliekcomputers
      • 2018-11-15 31946, 2018

      • iliekcomputers
        two workers errored but I think it worked
      • 2018-11-15 31934, 2018

      • ruaok
        I wonder if that is because we just restarted things...
      • 2018-11-15 31903, 2018

      • thefar8
        Hi CatQuest
      • 2018-11-15 31920, 2018

      • iliekcomputers
        it is weird.
      • 2018-11-15 31933, 2018

      • iliekcomputers
        from the logs it still seems like it can't connect to 2 of the datanodes.
      • 2018-11-15 31950, 2018

      • thefar8
        just got a long list of gamelan set (it's 26) from my Javanese Art teacher :)
      • 2018-11-15 31945, 2018

      • ruaok
        which ones?
      • 2018-11-15 31902, 2018

      • ruaok
        remember that you can look at which ones are alive here:
      • 2018-11-15 31903, 2018

      • ruaok
      • 2018-11-15 31932, 2018

      • iliekcomputers
        can you add dfs.datanode.use.datanode.hostname to true once too?
      • 2018-11-15 31906, 2018

      • ruaok
        once?
      • 2018-11-15 31923, 2018

      • ruaok
        I didn't parse what you're requesting
      • 2018-11-15 31936, 2018

      • iliekcomputers
        whoops
      • 2018-11-15 31946, 2018

      • iliekcomputers
        can you set dfs.datanode.use.datanode.hostname to true in hdfs-site.xml?
      • 2018-11-15 31954, 2018

      • iliekcomputers
        and then I'll try again.
      • 2018-11-15 31953, 2018

      • ruaok
        done
      • 2018-11-15 31935, 2018

      • iliekcomputers
        no error this time, woo!
      • 2018-11-15 31937, 2018
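
Editor's note: the setting that made the error go away is an ordinary Hadoop `<property>` entry in `hdfs-site.xml`. The sketch below renders such an entry with Python's standard library, to show its shape. The helper name `hadoop_property` is hypothetical, and the output is not a copy of the actual file on the cluster, which presumably contains other properties as well.

```python
import xml.etree.ElementTree as ET


def hadoop_property(name, value):
    """Build a Hadoop-style <property> element with <name> and <value> children."""
    prop = ET.Element("property")
    ET.SubElement(prop, "name").text = name
    ET.SubElement(prop, "value").text = value
    return prop


# Hadoop site files wrap all properties in a single <configuration> element.
config = ET.Element("configuration")
config.append(hadoop_property("dfs.datanode.use.datanode.hostname", "true"))

rendered = ET.tostring(config, encoding="unicode")
print(rendered)
```

Using hostnames rather than IPs seems to fit the overlay-network behaviour described earlier, since the workers get new IPs on every restart.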

      • iliekcomputers
        now let's check if the http interface works...
      • 2018-11-15 31941, 2018

      • iliekcomputers crosses fingers