in #metabrainz

0:06 AM
zas has quit
0:07 AM
zas joined the channel
0:15 AM
BrainzGit

[musicbrainz-server] mwiencek opened pull request #1330 (master…flow-0.114): Flow 0.114 https://github.com/metabrainz/musicbrainz-serve...
0:16 AM
Protab is now known as Rotab
0:17 AM
rdswift joined the channel
0:19 AM
Wassabi has quit
0:43 AM
D4RK-PH0ENiX has quit
1:04 AM
D4RK-PH0ENiX joined the channel
1:26 AM
chaban has quit
1:26 AM
chaban joined the channel
1:41 AM
c1e0 joined the channel
3:20 AM
rahul24 has quit
3:28 AM
rahul24 joined the channel
3:31 AM
rahul24 has quit
3:31 AM
rahul24 joined the channel
4:07 AM
rahul24 has quit
4:19 AM
blinky42 has quit
4:21 AM
blinky42 joined the channel
4:22 AM
rahul24 joined the channel
4:23 AM
Lotheric_ is now known as Lotheric
5:21 AM
rahul24 has quit
5:34 AM
rahul24 joined the channel
5:45 AM
c1e0_ joined the channel
5:48 AM
c1e0 has quit
5:58 AM
sarthak_jain joined the channel
6:09 AM
c1e0_ has quit
6:14 AM
sarthak_jain

Hi everyone. I am Sarthak Jain, a final year student at NIT Hamirpur. I wish to contribute to Listenbrainz-Labs, where can I start ?
6:15 AM
sarthak_jain has left the channel
6:15 AM
sarthak_jain joined the channel
6:19 AM
pristine__

Hey sarthak_jain
6:20 AM
Labs is a bit new so we are still in process of writing docs, but for now you can start from here https://github.com/metabrainz/listenbrainz-labs
6:21 AM
Labs for one aima to develop an open source recommendation engine
6:21 AM
And secondly calculate statistics (user, artists etc)
6:22 AM
https://github.com/metabrainz/listenbrainz-labs...
6:22 AM
If you want to dive into recommendation stuff
6:22 AM
https://github.com/metabrainz/listenbrainz-labs...
6:23 AM
For stats. The codebase has not expanded much so it should be easy for you to understand how it all goes. Ping me if you get stuck anywhere.
6:24 AM
Rn, we are writing tests for labs, so that is one place where you can surely contribute
6:25 AM
I will open a ticket and you can have a look to start with labs :)
6:25 AM
Good luck :)
6:25 AM
sarthak_jain

Yes, that will be great.
6:26 AM
pristine__

I should be back after lunch :)
6:26 AM
sarthak_jain

Okayy
6:49 AM
sbvkrishna joined the channel
7:11 AM
sbvkrishna

Mr_Monkey: Had a quick look at the Merge tool and it's pretty awesome! Couple small things - 1. We don't need the Merge queue on the Merge submit page right? (instead, add a single cancel-merge button?), 2. commented here- https://github.com/bookbrainz/bookbrainz-site/p... .
7:41 AM
rahul24 has quit
7:45 AM
reosarevok

Mr_Monkey: please look into the video task again - if Freso is not available to check you'll need to just be the judge yourself
7:46 AM
I can take a look if it helps?
8:09 AM
pristine__

ruaok: lemme know when you are up to chat a lil
8:41 AM
yvanzo

bitmap, reosarevok: can we hotfix beta/prod with 1329?
8:42 AM
reosarevok

If you think it's urgent enough, yes :)
8:42 AM
(it shows only names so it's not too awful, but still, we could)
8:43 AM
yvanzo

(and only when it is shared with another editor, but still)
8:47 AM
bitmap

just approved it
9:00 AM
rahul24 joined the channel
9:01 AM
rahul24 has quit
9:02 AM
reosarevok

So I should point it to prod?
9:03 AM
K, I better go rebase this
9:05 AM
yvanzo

reosarevok: master is fine
9:05 AM
reosarevok

Wait, it is?
9:05 AM
You'll just cherry-pick for release I guess then?
9:05 AM
yvanzo

Yup
9:05 AM
reosarevok

K then, nvm
9:06 AM
I'll merge into master for now then
9:06 AM
yvanzo

There is no message change either.
9:06 AM
BrainzGit

[musicbrainz-server] reosarevok merged pull request #1329 (master…MBS-10526): MBS-10526: Only show private collab collections to collaborators https://github.com/metabrainz/musicbrainz-serve...
9:06 AM
BrainzBot

MBS-10526: Private collaborative collections for a user shown publicly https://tickets.metabrainz.org/browse/MBS-10526
9:06 AM
yvanzo

Thank you!
9:06 AM
reosarevok

Guess you can close https://tickets.metabrainz.org/browse/MBS-10526 once released? :)
9:06 AM
(unless you want me to run the release)
9:08 AM
yvanzo

Ok, on it.
9:22 AM
updated beta.mb.o already, updating prod now
9:26 AM
ruaok

Moooin!
9:26 AM
pristine__: i could chat now
9:26 AM
pristine__

hey
9:28 AM
So rn, It is pretty easy to set up labs on local machine, First setup LB-server, clone labs, and run develop.sh but to run the recommendation engine we need data.
9:28 AM
We have already discussed about uploading mappings
9:28 AM
and artist relation
9:28 AM
but we still need listens
9:29 AM
the dumps adn incremental dumps are there on williams but i strongly feel that they are not ideal for running labs on local machine
9:29 AM
beause of the size
9:29 AM
ruaok

too large?
9:30 AM
pristine__

yes, GBs. Not possible to download
9:30 AM
and upload in hdfs. Lot ofwork
9:30 AM
Also
9:31 AM
We need recent listens to run the recommendation engine. recent as in till the date on which it is being run
9:31 AM
So we have two options
9:32 AM
smaller incremental dumps every monday/thursday or whatever the window is and download them as and when you require.
9:32 AM
The configs can be changed to fetch data other than the near future.
9:33 AM
but I see a problem, is it okay to every time connect to FTP and download the data and do stuff, can be time consuming.
9:33 AM
The other option is to generate local (fake) data to run the recommendation engine
9:33 AM
of course we need to write scripts for that'
9:34 AM
for the first option we need need to modify mapping and relation also (to keep the size small and have them intersect with the listens)
9:35 AM
what do you think?
9:36 AM
Rn things are not so clear, I understand but people are coming up for contribution so I guess we need to make a setup for them
9:37 AM
ruaok ponders
9:37 AM
setup that is enough for them to make PRs
9:38 AM
the actual testing and everything will be done on leader and for that we need the big data dumps
9:38 AM
but not to run on local machine
9:38 AM
ruaok

> but I see a problem, is it okay to every time connect to FTP and download the data and do stuff, can be time consuming.
9:39 AM
yes, we have the bandwidth for that.
9:39 AM
sbvkrishna has quit
9:39 AM
the problem with fake data is that it is of limited usefulness.
9:40 AM
what if we, during data dump generation, generate a test data dump of a diserable size?
9:40 AM
from the latest data?
9:41 AM
say, take X listens from the Y users?
9:41 AM
because other projects have small data sets for testing too. what do you think of that?
9:41 AM
also, did you see iliekcomputers suggestion of merging the -labs repos back into the main repos?
9:42 AM
pristine__

> also, did you see iliekcomputers suggestion of merging the -labs repos back into the main repos?
9:42 AM
ruaok

msb-labs > msb and lb-labs > lb ?
9:42 AM
pristine__

No, I guess I missed it
9:42 AM
Okay.
9:42 AM
ruaok

I fully agree with the msb case, but am still thinking about the lb case.
9:42 AM
that reduces the number of setup and management scripts that we duplicate.
9:42 AM
pristine__

I will think about it too. Not sure rn
9:43 AM
ruaok

ok, please do that.
9:43 AM
so, your desire to make things small and easily installable for new comers is great.
9:44 AM
pristine__

> what if we, during data dump generation, generate a test data dump of a diserable size?
9:44 AM
sounds cool
9:44 AM
We have to do the same for mapping
9:44 AM
and relations
9:44 AM
and ensure that they intersect
9:44 AM
ruaok

but I feel that we're doing a lot of meta-stuff, while we haven't gotten the core fully nailed down yet.
9:44 AM
pristine__

true
9:45 AM
Wassabi joined the channel
9:46 AM
At this point, It is really difficult for any third person to contribute because we need a lot of data to let the scripts fire
9:46 AM
ruaok

what if we draw our goal to be "let's make things testable and get things workding. then once we have stuff that we can see is working, we'll make it easier to install."
9:46 AM
yes, indeed.
9:46 AM
pristine__

> what if we draw our goal to be "let's make things testable and get things workding. then once we have stuff that we can see is working, we'll make it easier to install."
9:46 AM
ruaok

but things are also so young still that we may not get many people interested in contributing on this code.
9:47 AM
we had one fellow mail support@ about working on some LB tickets.
9:47 AM
which is great, but not a lot of lb-labs interrest yet.
9:47 AM
pristine__

I understand but then I have a small periphery for newcomers to contribute
9:47 AM
Like they can write tests
9:48 AM
but wait a lil to run the actual recommendation engine. or stats or stuff
9:48 AM
I am fine with this anyway
9:48 AM
> but things are also so young still that we may not get many people interested in contributing on this code.
9:48 AM
agreed
9:48 AM
Unless they take the initiative to make things work out anyway :p
9:49 AM
But I keep telling people about labs whenever someone here asks me about GSoC :p
9:49 AM
> which is great, but not a lot of lb-labs interrest yet.
9:50 AM
Yes. I have an eye on that.
9:50 AM
ruaok: So for the person who approached today, I think I can help him in setup and assign a ticket for writing tests
9:50 AM
ruaok

so, not sure if we're in agreement here. how would you suggest we proceed in the next couple of weeks.
9:52 AM
pristine__

I first of all would like to get the mapping PR merged with tests, have scripts for uploading mappings and relations done so that if anyone is willing to download huge data, the person can run the lil engine.
9:52 AM
rn, if you try to run, it will say data missing in HDFS.
9:53 AM
Then take on the task for making things easy for newcomers
9:53 AM
If it makes sense to you?
9:54 AM
ruaok

yes, it does.
9:54 AM
pristine__

(I know I am being a lil slow but uni people trouble a lot in final year and you cannot do anything since you need the degree)