For stats. The codebase has not expanded much so it should be easy for you to understand how it all goes. Ping me if you get stuck anywhere.
Rn, we are writing tests for labs, so that is one place where you can surely contribute
I will open a ticket and you can have a look to start with labs :)
Good luck :)
sarthak_jain
Yes, that will be great.
pristine__
I should be back after lunch :)
sarthak_jain
Okayy
sbvkrishna joined the channel
sbvkrishna
Mr_Monkey: Had a quick look at the Merge tool and it's pretty awesome! Couple small things - 1. We don't need the Merge queue on the Merge submit page right? (instead, add a single cancel-merge button?), 2. commented here- https://github.com/bookbrainz/bookbrainz-site/p... .
rahul24 has quit
reosarevok
Mr_Monkey: please look into the video task again - if Freso is not available to check you'll need to just be the judge yourself
I can take a look if it helps?
pristine__
ruaok: lemme know when you are up to chat a lil
yvanzo
bitmap, reosarevok: can we hotfix beta/prod with 1329?
reosarevok
If you think it's urgent enough, yes :)
(it shows only names so it's not too awful, but still, we could)
yvanzo
(and only when it is shared with another editor, but still)
So rn, it is pretty easy to set up labs on a local machine: first set up LB-server, clone labs, and run develop.sh. But to run the recommendation engine we need data.
We have already discussed uploading mappings
and artist relation
but we still need listens
The dumps and incremental dumps are there on williams, but I strongly feel that they are not ideal for running labs on a local machine
because of the size
ruaok
too large?
pristine__
yes, GBs. Not possible to download
and upload into HDFS. Lot of work
Also
We need recent listens to run the recommendation engine. Recent as in up to the date on which it is being run
So we have two options
Smaller incremental dumps every Monday/Thursday, or whatever the window is, and download them as and when you require.
The configs can be changed to fetch data from other than the most recent window.
But I see a problem: is it okay to connect to FTP every time, download the data, and do stuff? It can be time-consuming.
The other option is to generate local (fake) data to run the recommendation engine
Of course, we need to write scripts for that.
For the first option we also need to modify mapping and relation (to keep the size small and have them intersect with the listens)
what do you think?
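For the fake-data option, a rough sketch of such a script could look like this (hypothetical field names, loosely following the ListenBrainz listen format; the real schema and any helper names here are assumptions):

```python
# Sketch: generate fake listens for local testing of the recommendation
# engine. Field names loosely follow the ListenBrainz listen format;
# adjust to the real schema before using.
import json
import random
import time

def generate_fake_listens(num_users=5, listens_per_user=10, seed=42):
    """Return a list of fake listen dicts for local testing."""
    random.seed(seed)
    now = int(time.time())
    listens = []
    for u in range(num_users):
        user = "test_user_%d" % u
        for _ in range(listens_per_user):
            listens.append({
                "user_name": user,
                # Recent timestamps (last week), since the engine wants
                # listens up to the date on which it is run.
                "listened_at": now - random.randint(0, 7 * 24 * 3600),
                "track_metadata": {
                    "artist_name": "Artist %d" % random.randint(0, 9),
                    "track_name": "Track %d" % random.randint(0, 99),
                },
            })
    return listens

if __name__ == "__main__":
    print(json.dumps(generate_fake_listens()[0], indent=2))
```

A generator like this would also make it easy to produce matching fake mappings and relations, so the three data sets intersect by construction.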
Rn things are not so clear, I understand, but people are coming up to contribute, so I guess we need to make a setup for them
ruaok ponders
a setup that is enough for them to make PRs
the actual testing and everything will be done on leader, and for that we need the big data dumps
but they are not needed to run on a local machine
ruaok
> But I see a problem: is it okay to connect to FTP every time, download the data, and do stuff? It can be time-consuming.
yes, we have the bandwidth for that.
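The "connect to FTP every time" step could be scripted with stdlib ftplib; a minimal sketch, where the host, directory path, and file-name pattern are all assumptions rather than the real server layout:

```python
# Sketch: fetch the most recent incremental dump over anonymous FTP.
# Host, remote_dir, and the ".tar.xz" naming are placeholders.
import ftplib
import os

def latest_dump(names):
    """Pick the newest dump by name, assuming a date stamp that sorts
    lexicographically (e.g. listens-20200217.tar.xz)."""
    dumps = [n for n in names if n.endswith(".tar.xz")]
    return max(dumps) if dumps else None

def fetch_latest_incremental(host, remote_dir, dest_dir="."):
    """Download the most recent incremental dump into dest_dir."""
    ftp = ftplib.FTP(host)
    ftp.login()  # anonymous login
    ftp.cwd(remote_dir)
    name = latest_dump(ftp.nlst())
    if name is None:
        ftp.quit()
        raise RuntimeError("no dumps found in %s" % remote_dir)
    dest = os.path.join(dest_dir, name)
    with open(dest, "wb") as f:
        ftp.retrbinary("RETR " + name, f.write)
    ftp.quit()
    return dest
```

With small incremental dumps, a download like this per run stays cheap; the cost concern above mostly applies to the full GB-sized dumps.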
sbvkrishna has quit
the problem with fake data is that it is of limited usefulness.
what if we, during data dump generation, generate a test data dump of a desirable size?
from the latest data?
say, take X listens from the Y users?
because other projects have small data sets for testing too. what do you think of that?
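The "X listens from Y users" idea, sketched over an in-memory list of listens (a hypothetical helper; the real dump generator would presumably stream from the database rather than hold everything in memory):

```python
# Sketch: build a small test dump by sampling up to `listens_per_user`
# listens from each of `num_users` randomly chosen users.
import random
from collections import defaultdict

def make_test_dump(listens, num_users, listens_per_user, seed=0):
    """Return a subsample of `listens` covering at most `num_users`
    users with at most `listens_per_user` listens each."""
    random.seed(seed)
    by_user = defaultdict(list)
    for listen in listens:
        by_user[listen["user_name"]].append(listen)
    # sorted() makes the user choice reproducible for a fixed seed
    users = random.sample(sorted(by_user), min(num_users, len(by_user)))
    sample = []
    for user in users:
        user_listens = by_user[user]
        k = min(listens_per_user, len(user_listens))
        sample.extend(random.sample(user_listens, k))
    return sample
```

The same sampled user set could then drive the cut-down mapping and relation dumps, which keeps all three data sets intersecting.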
also, did you see iliekcomputers suggestion of merging the -labs repos back into the main repos?
pristine__
> also, did you see iliekcomputers suggestion of merging the -labs repos back into the main repos?
ruaok
msb-labs > msb and lb-labs > lb ?
pristine__
No, I guess I missed it
Okay.
ruaok
I fully agree with the msb case, but am still thinking about the lb case.
that reduces the number of setup and management scripts that we duplicate.
pristine__
I will think about it too. Not sure rn
ruaok
ok, please do that.
so, your desire to make things small and easily installable for newcomers is great.
pristine__
> what if we, during data dump generation, generate a test data dump of a desirable size?
sounds cool
We have to do the same for mapping
and relations
and ensure that they intersect
ruaok
but I feel that we're doing a lot of meta-stuff, while we haven't gotten the core fully nailed down yet.
pristine__
true
Wassabi joined the channel
At this point, it is really difficult for any third person to contribute, because we need a lot of data to get the scripts running
ruaok
what if we draw our goal to be "let's make things testable and get things working. then once we have stuff that we can see is working, we'll make it easier to install."
yes, indeed.
pristine__
> what if we draw our goal to be "let's make things testable and get things working. then once we have stuff that we can see is working, we'll make it easier to install."
ruaok
but things are also so young still that we may not get many people interested in contributing on this code.
we had one fellow mail support@ about working on some LB tickets.
which is great, but not a lot of lb-labs interest yet.
pristine__
I understand, but then the area where newcomers can contribute is small
Like they can write tests
but wait a lil to run the actual recommendation engine. or stats or stuff
I am fine with this anyway
> but things are also so young still that we may not get many people interested in contributing on this code.
agreed
Unless they take the initiative to make things work out anyway :p
But I keep telling people about labs whenever someone here asks me about GSoC :p
> which is great, but not a lot of lb-labs interest yet.
Yes. I have an eye on that.
ruaok: So for the person who approached today, I think I can help him with the setup and assign him a ticket for writing tests
ruaok
so, not sure if we're in agreement here. how would you suggest we proceed in the next couple of weeks?
pristine__
First of all, I would like to get the mapping PR merged with tests, and have the scripts for uploading mappings and relations done, so that anyone willing to download the huge data can run the lil engine.
Rn, if you try to run it, it will say data is missing in HDFS.
Then take on the task of making things easy for newcomers.
Does that make sense to you?
ruaok
yes, it does.
pristine__
(I know I am being a lil slow, but uni work keeps you very busy in final year, and you cannot do anything since you need the degree)