0:15 AM
D4RK-PH0ENiX joined the channel
0:22 AM
CatQuest has quit
0:28 AM
CatQuest joined the channel
0:46 AM
JTL is now known as JLT
0:46 AM
JLT is now known as JTL
0:58 AM
Nyanko-sensei joined the channel
0:59 AM
d4rkie joined the channel
0:59 AM
Nyanko-sensei has quit
0:59 AM
D4RK-PH0ENiX has quit
1:20 AM
D4RK-PH0ENiX joined the channel
1:20 AM
d4rkie has quit
1:22 AM
D4RK-PH0ENiX has quit
1:22 AM
ayerhart has quit
1:22 AM
D4RK-PH0ENiX joined the channel
1:22 AM
ayerhart joined the channel
3:17 AM
Leo_Verto_ joined the channel
3:19 AM
Leo_Verto has quit
3:19 AM
Leo_Verto_ is now known as Leo_Verto
3:44 AM
c1e0 joined the channel
4:52 AM
c1e0 has quit
4:55 AM
Major_Lurker joined the channel
4:58 AM
Gore|woerk joined the channel
4:59 AM
G0re has quit
6:43 AM
iliekcomputers
Moin!
6:43 AM
It is cold! 😥😥
7:29 AM
michelv joined the channel
7:33 AM
michelv
8:59 AM
pristine_ joined the channel
9:00 AM
dpmittal[m] has quit
9:29 AM
ruaok
moin moin!
9:29 AM
yep, cold indeed.
9:32 AM
iliekcomputers
ruaok: hi
9:32 AM
Got time to help me out with some spark stuff today?
9:32 AM
:)
9:32 AM
ruaok
I should, yes.
9:32 AM
lemme grab some coffee
9:32 AM
iliekcomputers
Great, I'll be near a computer soon.
9:32 AM
10-15ish minutes?
9:34 AM
modwizcode has quit
9:38 AM
ruaok
sure
9:43 AM
modwizcode joined the channel
9:54 AM
iliekcomputers
yo.
9:55 AM
ruaok
what's up with you?
9:55 AM
iliekcomputers
so I was submitting the dataframes script to spark from inside the play container
9:56 AM
the out of memory errors were because I was submitting it without specifying the master.
9:56 AM
so it submitted to a spark instance inside the container, i think.
9:56 AM
demonimin joined the channel
9:56 AM
ruaok
sounds about right.
9:56 AM
iliekcomputers
i fixed that and submitted using `spark-submit --master spark://spark-master.spark-network:7077 <Script>`
9:57 AM
that led to a problem with the files from the dump.
9:57 AM
i was getting a bunch of filenotfounderrors
9:58 AM
i figured this was because the dump was being extracted inside the play container and the code was running inside the spark-master container
9:59 AM
ruaok is listening with fascination
10:00 AM
there is an addFile method in pyspark that downloads files to all nodes
10:00 AM
ruaok
what a complicated saga it's been.
10:00 AM
iliekcomputers
ruaok: i'm not good at telling stories :D
10:00 AM
but the addFile method didn't really work.
10:00 AM
it just added the file to the tmp folder in the play container and nowhere else.
10:00 AM
ruaok
I'm enjoying it. :)
10:01 AM
iliekcomputers
I was wondering how we worked with files when we worked on the recommendation engine.
10:01 AM
ruaok
did the addfile method get the proper --master info?
10:01 AM
iliekcomputers
ruaok: huh, i dunno. ideally it should have with the --master switch, but i'll check.
10:01 AM
ruaok
everything was on the master filesystem, I think.
10:02 AM
and then we made rdds, which were distributed. but everything was on a local instance.
10:02 AM
iliekcomputers
so ideally if I ran it from the master container, it should work?
10:02 AM
i tried that, but i still got filenotfounderrors.
10:03 AM
ruaok
I think the way to look at this is to make sure that our assumptions are correct.
10:04 AM
we are assuming/hoping that files got copied to HDFS and are available on all the nodes. is that correct?
10:05 AM
slurpee- joined the channel
10:05 AM
iliekcomputers
ruaok: hdfs isn't involved yet.
10:05 AM
ruaok
O_O
10:05 AM
iliekcomputers
10:07 AM
Slurpee has quit
10:07 AM
ruaok
2.1.0? old docs?
10:07 AM
iliekcomputers
10:08 AM
ruaok
heh.
10:08 AM
iliekcomputers
another way I tried was to upload the files to hdfs before working with them.
10:08 AM
ruaok
that was my assumption.
10:08 AM
iliekcomputers
it did work. but it was really slow.
10:08 AM
ruaok
upload to HDFS first, then specify HDFS file path to spark.
10:09 AM
`it did work. but it was really slow.` that's a start.
10:09 AM
remember, we've done no tuning and have not really given a lot of ram.
10:09 AM
sounds like that should be our next step.
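A minimal sketch of the "upload to HDFS first, then specify HDFS file path to spark" approach mentioned above. It only builds the `hdfs dfs` commands and the resulting `hdfs://` URI; it does not run anything. The file name, the HDFS directory, the namenode host and the port are all hypothetical placeholders, not values from this conversation.

```python
import shlex


def hdfs_upload_commands(local_path, hdfs_dir):
    """Return the `hdfs dfs` commands that copy one local file into HDFS.

    The commands are returned as argv lists so they can be passed to
    subprocess.run() without going through a shell.
    """
    return [
        # Create the target directory (and parents) if it does not exist.
        ["hdfs", "dfs", "-mkdir", "-p", hdfs_dir],
        # Copy the local file into it; -f overwrites an existing copy.
        ["hdfs", "dfs", "-put", "-f", local_path, hdfs_dir],
    ]


def hdfs_uri(namenode, port, hdfs_dir, filename):
    """Build the URI a Spark job would read instead of a container-local path."""
    return f"hdfs://{namenode}:{port}{hdfs_dir.rstrip('/')}/{filename}"


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    for cmd in hdfs_upload_commands("/tmp/listens-2002.json", "/data/listens"):
        print(shlex.join(cmd))
    print(hdfs_uri("namenode.example", 9000, "/data/listens", "listens-2002.json"))
```

Because every node can resolve an `hdfs://` path, this avoids depending on which container the dump happened to be extracted in.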
10:09 AM
iliekcomputers
it took around 12 hrs to write dataframes for 2002 and 2003.
10:10 AM
which is when I stopped it.
10:10 AM
ruaok
I have no idea how resource allocation should work, so we need to learn.
10:10 AM
iliekcomputers
hmm.
10:10 AM
ruaok
hah, yes.
10:10 AM
we may also not have enough nodes -- only 4 nodes. but I suspect that we're not giving enough ram.
10:10 AM
each node may only be using half the ram it has, possibly even less.
10:11 AM
iliekcomputers
hmm, could be it.
10:11 AM
i'd like to get the spark UI back somehow.
10:11 AM
makes it easier to analyze this stuff.
10:12 AM
ruaok
I need to add your SSH key to the docker gateway container and the tunnels should work for you.
10:12 AM
iliekcomputers
yokay. great.
10:12 AM
ruaok
zas: do you know how to configure openvpn?
10:12 AM
that may be a better option, if someone knows how to tame it.
10:21 AM
iliekcomputers
ruaok: with the recommendation engine, did you add any options to spark-submit to change the default memory used etc?
10:22 AM
ruaok
yes, I think so.
10:22 AM
iliekcomputers
hmm, nice. let me look into that then.
10:22 AM
ruaok
spark-submit --master spark://195.201.112.36:7077 --executor-memory=29g `pwd`/<script> <args>
10:23 AM
from readme.md
10:23 AM
iliekcomputers facepalms. (should have read that) :P
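A small sketch of how the two `spark-submit` options discussed above (`--master` and `--executor-memory`) fit together in one command line. The helper and the script name are hypothetical; the master URL and the `29g` value are the ones quoted in the conversation, and whether that much memory suits this cluster is an assumption that still needs checking.

```python
def spark_submit_argv(script, master, executor_memory=None, script_args=()):
    """Build a spark-submit argv list with an explicit master URL.

    Passing --master explicitly means the job does not depend on whatever
    default is configured in the container it is submitted from.
    """
    argv = ["spark-submit", "--master", master]
    if executor_memory:
        # Memory per executor, e.g. "29g".
        argv += ["--executor-memory", executor_memory]
    argv.append(script)
    argv.extend(script_args)
    return argv


if __name__ == "__main__":
    # "create_dataframes.py" is a placeholder script name.
    print(" ".join(spark_submit_argv(
        "create_dataframes.py",
        master="spark://spark-master.spark-network:7077",
        executor_memory="29g",
    )))
```

The returned list can be handed to `subprocess.run()` as is, which keeps the master and memory settings visible in one place.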
11:08 AM
D4RK-PH0ENiX has quit
11:33 AM
D4RK-PH0ENiX joined the channel
11:38 AM
pristine_ has quit
12:13 PM
TOPIC: MetaBrainz Community and Development channel | MusicBrainz non-development: #musicbrainz | GSoC https://goo.gl/7jsjG2 | Meeting agenda (next meeting: 2019-01-07): Reviews, Google Code-in (Freso), Annual report 2018 (ruaok), mini summit (ruaok)
12:48 PM
you can always spot when india and IST cause something odd in the world. :)
12:53 PM
zas
ruaok: about openvpn i did but a looong time ago, so i guess you better use online docs rather than my memory ;)
12:54 PM
ruaok
ok, then sod that.
12:56 PM
zas
about brazilian synth pop, the work on this album is pretty impressive, each song is finely crafted, not my usual kind of music, but definitely worth listening to (https://erica.bandcamp.com/album/beautiful for readers)
12:57 PM
13:03 PM
code_master5 joined the channel
13:21 PM
c1e0 joined the channel
13:25 PM
discopatrick joined the channel
13:28 PM
michelv has quit
13:55 PM
michelv joined the channel
14:00 PM
reosarevok looks at which of his PRs have hopelessly broken over the holiday season
14:04 PM
iliekcomputers
0!
14:04 PM
code_master5
iliekcomputers: 0! = 1. 😈
14:05 PM
iliekcomputers
unexpected factorial
14:05 PM
shit.
14:05 PM
CatQuest
ho ho
14:05 PM
best answer
14:05 PM
ruaok
unexpected factorial, kinda like surprise buttsecks?
14:06 PM
code_master5
CatQuest:
14:11 PM
yvanzo
Hi reosarevok, my (not broken yet) PRs are just waiting for your review :)
14:11 PM
reosarevok
Ok, let me look into mine for a bit and then check
14:23 PM
CatQuest
14:40 PM
iliekcomputers
first time getting import errors after spark-submit.
14:40 PM
github joined the channel
14:40 PM
github
[listenbrainz-server] vansika opened pull request #475: make timestamp optional for playing now listens in RedisListenStore (master...playing-now-timestamp)
https://git.io/fhGZE
14:40 PM
github has left the channel
14:59 PM
reosarevok
huh
14:59 PM
Cannot get React.AbstractComponent because property AbstractComponent is missing in module react [1].
14:59 PM
What am I missing
15:02 PM
Oh. I'm getting that in master too
15:10 PM
reosarevok also sighs at "Identifier 'l_statistics' is not in camel case" etc by eslint
15:10 PM
Tempted to just send a PR camelcasing the whole thing so it shuts up
15:17 PM
yvanzo
15:21 PM
c1e0 has quit
15:22 PM
reosarevok
But is there an actual reason these are not in camelcase?
15:22 PM
yvanzo
web service, if I recall correctly
15:23 PM
Wait, l_statistics is not served by WS, there is no reason :)
15:24 PM
reosarevok
But l_relationships or whatever can't be changed?
15:27 PM
Lotheric has quit
15:28 PM
yvanzo
It might be because of N_l.
15:31 PM
I would vote for making exceptions of N_l, N_ln, N_lp and camelcasing others.