-
D4RK-PH0ENiX joined the channel
2018-11-15 31928, 2018
-
Zialus_PT has quit
2018-11-15 31900, 2018
-
Zialus joined the channel
2018-11-15 31925, 2018
-
bruce_r joined the channel
2018-11-15 31923, 2018
-
ferbncode has quit
2018-11-15 31954, 2018
-
Nyanko-sensei joined the channel
2018-11-15 31938, 2018
-
D4RK-PH0ENiX has quit
2018-11-15 31943, 2018
-
Dr-Flay has quit
2018-11-15 31908, 2018
-
Dr-Flay joined the channel
2018-11-15 31925, 2018
-
thefar8 has quit
2018-11-15 31919, 2018
-
Nyanko-sensei has quit
2018-11-15 31903, 2018
-
D4RK-PH0ENiX joined the channel
2018-11-15 31942, 2018
-
Leo_Verto_ joined the channel
2018-11-15 31949, 2018
-
Leo_Verto has quit
2018-11-15 31949, 2018
-
Leo_Verto_ is now known as Leo_Verto
2018-11-15 31904, 2018
-
bruce_r has quit
2018-11-15 31942, 2018
-
thefar8 joined the channel
2018-11-15 31919, 2018
-
Nyanko-sensei joined the channel
2018-11-15 31953, 2018
-
D4RK-PH0ENiX has quit
2018-11-15 31927, 2018
-
MightyJay has quit
2018-11-15 31940, 2018
-
rsh7 joined the channel
2018-11-15 31945, 2018
-
MightyJay joined the channel
2018-11-15 31926, 2018
-
yokel has quit
2018-11-15 31956, 2018
-
ruaok
so, what have you learned, iliekcomputers?
2018-11-15 31957, 2018
-
Dr-Flay has quit
2018-11-15 31948, 2018
-
iliekcomputers
ruaok: so i was mucking around with hadoop and config values yesterday.
2018-11-15 31914, 2018
-
iliekcomputers
there's two interfaces that clients can use to read/write/interact with hdfs
2018-11-15 31945, 2018
-
iliekcomputers
one is the standard rpc interface, the other is a http api type interface called webhdfs
2018-11-15 31903, 2018
-
ruaok
do you have a feeling for which is preferred?
2018-11-15 31929, 2018
-
iliekcomputers
ruaok: most of the simpler python clients I've looked at use webhdfs
2018-11-15 31940, 2018
-
iliekcomputers
the client I wrote the code with uses webhdfs too
2018-11-15 31951, 2018
-
iliekcomputers
`hdfs dfs -put` uses RPC
2018-11-15 31909, 2018
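For reference, a minimal sketch of how a WebHDFS request URL is put together, as opposed to the RPC path that `hdfs dfs -put` takes. The `/webhdfs/v1/<path>?op=...` layout is the documented WebHDFS REST form. The port 9870 (the Hadoop 3 default for the NameNode HTTP address), the path `/tmp/foo`, and the helper name `webhdfs_url` are assumptions for illustration, not values confirmed for this cluster.

```python
from urllib.parse import quote, urlencode


def webhdfs_url(host, path, op, port=9870, **params):
    """Build a WebHDFS REST URL for an absolute HDFS path and an operation.

    port=9870 is an assumption (Hadoop 3 default NameNode HTTP port);
    adjust it to whatever the cluster actually publishes.
    """
    if not path.startswith("/"):
        raise ValueError("HDFS path must be absolute")
    # The operation goes in the query string; extra parameters follow it.
    query = urlencode({"op": op, **params})
    return f"http://{host}:{port}/webhdfs/v1{quote(path)}?{query}"


# Hypothetical example: list the status of /tmp/foo via the master node.
print(webhdfs_url("hadoop-master", "/tmp/foo", "LISTSTATUS"))
```

Note that a WebHDFS write is a two-step exchange: the NameNode answers the initial CREATE request with a redirect to a DataNode address, so the client must be able to reach whatever address that redirect names.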
-
ruaok
and that is the one interface we got working so far. lol.
2018-11-15 31923, 2018
-
iliekcomputers
yeah, about that.
2018-11-15 31941, 2018
-
iliekcomputers
i changed some config values and used `stop-master-service` and `start-master-service`
2018-11-15 31949, 2018
-
iliekcomputers
and now the put thing isn't working...
2018-11-15 31955, 2018
-
ruaok
heh, lol.
2018-11-15 31957, 2018
-
iliekcomputers
i'm not sure why exactly
2018-11-15 31907, 2018
-
ruaok
hang on, there might be uncommitted changes in my repo.
2018-11-15 31936, 2018
-
iliekcomputers
the latest hadoop-yarn image is just the master branch of github/meb/hadoop-docker
2018-11-15 31904, 2018
-
ruaok
the playground repo only has these two lines
2018-11-15 31907, 2018
-
ruaok
+ -p published=8088,target=8088,mode=host \
2018-11-15 31907, 2018
-
ruaok
+ -p published=9864,target=9864 \
2018-11-15 31915, 2018
-
ruaok
which don't affect anything.
2018-11-15 31900, 2018
-
iliekcomputers
maybe it's the worker nodes. how exactly do I find out what IP they are on and if they're running?
2018-11-15 31910, 2018
-
Nyanko-sensei has quit
2018-11-15 31926, 2018
-
iliekcomputers
from the PR Leo_Verto made, it seemed like `hdfs dfsadmin -report` wasn't trustworthy.
2018-11-15 31946, 2018
-
ruaok
changes committed, and yes those would make a difference.
2018-11-15 31956, 2018
-
ruaok
I've built a new image and am pushing it now
2018-11-15 31957, 2018
-
D4RK-PH0ENiX joined the channel
2018-11-15 31958, 2018
-
ruaok
push.
2018-11-15 31901, 2018
-
ruaok
+ed
2018-11-15 31902, 2018
-
D4RK-PH0ENiX has quit
2018-11-15 31906, 2018
-
iliekcomputers
ok cool
2018-11-15 31909, 2018
-
ruaok
you should be able to stop/start and have it work again.
2018-11-15 31916, 2018
-
iliekcomputers
let me try the put thing again.
2018-11-15 31937, 2018
-
D4RK-PH0ENiX joined the channel
2018-11-15 31940, 2018
-
iliekcomputers
2018-11-15 31900, 2018
-
ruaok
did you restart both the master and workers?
2018-11-15 31915, 2018
-
iliekcomputers
i don't have access to the workers.
2018-11-15 31925, 2018
-
ruaok
you don't need it.
2018-11-15 31929, 2018
-
iliekcomputers
afaik
2018-11-15 31932, 2018
-
ruaok
./stop-workers.sh
2018-11-15 31936, 2018
-
iliekcomputers
ah
2018-11-15 31937, 2018
-
ruaok
./start-workser.sh
2018-11-15 31939, 2018
-
iliekcomputers
oh cool
2018-11-15 31940, 2018
-
ruaok
-s
2018-11-15 31917, 2018
-
ruaok
the nice thing about this system is that you never have to care about the worker nodes. they get fired up and join the cluster.
2018-11-15 31948, 2018
-
iliekcomputers
Usage: start-worker-service.sh <replicas>
2018-11-15 31954, 2018
-
iliekcomputers
replicas is a number?
2018-11-15 31957, 2018
-
iliekcomputers
3?
2018-11-15 31901, 2018
-
ruaok
yep
2018-11-15 31911, 2018
-
ruaok
(to both)
2018-11-15 31922, 2018
-
iliekcomputers
noice, that's pretty cool.
2018-11-15 31912, 2018
-
outsidecontext joined the channel
2018-11-15 31941, 2018
-
iliekcomputers
still the same error
2018-11-15 31904, 2018
-
iliekcomputers
can you confirm for once that the workers are on 10.0.0.24, 10.0.0.23 etc?
2018-11-15 31948, 2018
-
ruaok
the 10 network is the overlay network and every time you start and stop the workers, they get new IPs.
2018-11-15 31911, 2018
-
ruaok
I'm going to restart the cluster using my setup. hang on
2018-11-15 31937, 2018
-
ruaok
now it works again.
2018-11-15 31938, 2018
-
ruaok
why?
2018-11-15 31959, 2018
-
iliekcomputers
ruaok: foo is an empty file
2018-11-15 31914, 2018
-
ruaok
yea, I just touched it.
2018-11-15 31934, 2018
-
iliekcomputers
once it has data, it doesn't work...
2018-11-15 31941, 2018
-
iliekcomputers
try again, i echoed some text into it.
2018-11-15 31907, 2018
-
ruaok
doh.
2018-11-15 31919, 2018
-
ruaok
huh, interesting.
2018-11-15 31925, 2018
-
ruaok
but that used to work, no?
2018-11-15 31922, 2018
-
ruaok
well, we should really focus on the http version of this interface and go with that.
2018-11-15 31953, 2018
-
ruaok
are you annoyed by the fact that the only error messages that hadoop gives are java stacktraces?
2018-11-15 31904, 2018
-
iliekcomputers
yes.
2018-11-15 31914, 2018
-
iliekcomputers
not very helpful
2018-11-15 31929, 2018
-
ruaok
we'd get flayed alive if we wrote code like that.
2018-11-15 31936, 2018
-
ruaok
but in the java world, I guess you're used to pain.
2018-11-15 31930, 2018
-
iliekcomputers
I think we should look at what ports are accessible to the master from the workers
2018-11-15 31919, 2018
-
ruaok
let me log into a worker
2018-11-15 31947, 2018
-
iliekcomputers
there's probably some reason why the master isn't able to connect with the workers
2018-11-15 31935, 2018
-
ruaok
2018-11-15 31937, 2018
-
ruaok
the workers have far fewer ports open.
2018-11-15 31952, 2018
-
iliekcomputers
2018-11-15 31931, 2018
-
iliekcomputers
The ports 9866 and 9864 are the ones we want accessible from the master for sure
2018-11-15 31912, 2018
-
iliekcomputers
2018-11-15 31915, 2018
-
ruaok
as in a process on master makes a connection to port 9866/9864 on a worker?
2018-11-15 31926, 2018
-
iliekcomputers
ruaok: yes
2018-11-15 31925, 2018
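A minimal sketch of the check being discussed: from the master, try a plain TCP connection to ports 9866 and 9864 on a worker. The helper names `port_open` and `check_worker` are hypothetical, and a successful connect only shows that the port accepts connections, not that the DataNode behind it is healthy.

```python
import socket


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable host, and timeouts.
        return False


def check_worker(host, ports=(9866, 9864)):
    """Map each port to whether it is reachable on the given worker."""
    return {port: port_open(host, port) for port in ports}


# Example (run from inside the master container; the IP is whatever the
# overlay network currently assigns to a worker):
#   check_worker("10.0.0.39")
```

Because the overlay network hands out new IPs whenever the workers are restarted, the address has to be looked up again before each check.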
-
ruaok
ok, both of those are exposed on the inside of the worker containers.
2018-11-15 31937, 2018
-
ruaok
could you ping 10.0.0.39 ?
2018-11-15 31952, 2018
-
iliekcomputers
ping works
2018-11-15 31957, 2018
-
ruaok
ok, then we need to find which setting causes those ports to be bound on all interfaces.
2018-11-15 31954, 2018
-
ruaok
dfs.namenode.http-bind-host ?
2018-11-15 31949, 2018
-
ruaok starts a build
2018-11-15 31911, 2018
-
iliekcomputers
2018-11-15 31913, 2018
-
ruaok
yay!
2018-11-15 31924, 2018
-
ruaok
both ports are not reachable from the master node.
2018-11-15 31927, 2018
-
ruaok
*now
2018-11-15 31936, 2018
-
iliekcomputers
I'm still getting connection refused from inside the master container
2018-11-15 31951, 2018
-
ruaok
ok, remember that hadoop-master is the master node.
2018-11-15 31958, 2018
-
ruaok
03a472b53968 is a worker node
2018-11-15 31908, 2018
-
ruaok
2018-11-15 31932, 2018
-
iliekcomputers
no data received. but it connected.
2018-11-15 31956, 2018
-
ruaok
probably an invalid wget command, but at least the port is open.
2018-11-15 31910, 2018
-
iliekcomputers
yes
2018-11-15 31927, 2018
-
iliekcomputers
I see that the client is trying to connect to incorrect IPs.
2018-11-15 31938, 2018
-
iliekcomputers
This was the thing that Leo_Verto had fixed
2018-11-15 31957, 2018
-
ruaok
did we lose his PR somehow?
2018-11-15 31912, 2018
-
iliekcomputers
core-site.xml in /usr/local/hadoop/etc/hadoop/ has the config value set correctly.
2018-11-15 31920, 2018
-
iliekcomputers
maybe it needs to be in hdfs-site.xml ?
2018-11-15 31949, 2018
-
ruaok
can't hurt to try
2018-11-15 31954, 2018
-
ruaok is on it
2018-11-15 31924, 2018
-
ruaok
change applied, cluster restarted.
2018-11-15 31938, 2018
-
ruaok
try it again, iliekcomputers
2018-11-15 31934, 2018
-
iliekcomputers
2018-11-15 31946, 2018
-
iliekcomputers
two workers errored but I think it worked
2018-11-15 31934, 2018
-
ruaok
I wonder if that is because we just restarted things...
2018-11-15 31903, 2018
-
thefar8
Hi CatQuest
2018-11-15 31920, 2018
-
iliekcomputers
it is weird.
2018-11-15 31933, 2018
-
iliekcomputers
from the logs it still seems like it can't connect to 2 of the datanodes.
2018-11-15 31950, 2018
-
thefar8
just got a long list of gamelan sets (there are 26) from my Javanese Art teacher :)
2018-11-15 31945, 2018
-
ruaok
which ones?
2018-11-15 31902, 2018
-
ruaok
remember that you can look at which ones are alive here:
2018-11-15 31903, 2018
-
ruaok
2018-11-15 31932, 2018
-
iliekcomputers
can you add dfs.datanode.use.datanode.hostname to true once too?
2018-11-15 31906, 2018
-
ruaok
once?
2018-11-15 31923, 2018
-
ruaok
I didn't parse what you're requesting
2018-11-15 31936, 2018
-
iliekcomputers
whoops
2018-11-15 31946, 2018
-
iliekcomputers
can you set dfs.datanode.use.datanode.hostname to true in hdfs-site.xml?
2018-11-15 31954, 2018
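A minimal sketch of the requested change, assuming hdfs-site.xml uses the usual Hadoop layout of `<property>` entries with `<name>` and `<value>` children under a `<configuration>` root. The helper name `set_property` is hypothetical, and the input here is an empty configuration rather than the real file from the image.

```python
import xml.etree.ElementTree as ET


def set_property(config_xml, name, value):
    """Return config_xml with the named property set, adding it if missing."""
    root = ET.fromstring(config_xml)
    for prop in root.findall("property"):
        if prop.findtext("name") == name:
            value_el = prop.find("value")
            if value_el is None:
                value_el = ET.SubElement(prop, "value")
            value_el.text = value
            break
    else:
        # No existing entry with that name: append a new <property> block.
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="unicode")


# Start from an empty configuration purely for illustration.
print(set_property("<configuration></configuration>",
                   "dfs.datanode.use.datanode.hostname", "true"))
```

The cluster has to be restarted after the file changes for the new value to take effect, as was done for the earlier config change.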
-
iliekcomputers
and then I'll try again.
2018-11-15 31953, 2018
-
ruaok
done
2018-11-15 31935, 2018
-
iliekcomputers
no error this time, woo!
2018-11-15 31937, 2018
-
iliekcomputers
now let's check if the http interface works...
2018-11-15 31941, 2018
-
iliekcomputers crosses fingers