This is what I did over the weekend. I have not included the recommend.py changes here because the candidate_sets inferences are enough to show how recommend.py will change.
let me know whenever you get the chance to read
nav2002 joined the channel
Darkloke joined the channel
Nyanko-sensei joined the channel
D4RK joined the channel
Nyanko-sensei has quit
D4RK-PH0ENiX has quit
Wizzup has quit
iliekcomputers
A Google doc / Dropbox paper would probably be much more readable than a gist.
pristine__
iliekcomputers: thanks for the suggestion :)
Wizzup joined the channel
D4RK has quit
D4RK-PH0ENiX joined the channel
Wizzup has quit
Wizzup joined the channel
iliekcomputers
No prob
Darkloke has quit
pristine__
Though gists are easier for me :p
Wizzup has quit
Wizzup joined the channel
Omnipoint joined the channel
Omnipoint has quit
Gazooo has quit
Gazooo joined the channel
DjSlash
pristine__: if you'd rename it to a .md file, then github should render it
pristine__
DjSlash: lol I know. It was just the first draft. I will anyway. Thanks
ruaok
DjSlash: that's a pretty easy, but good suggestion.
iliekcomputers
DjSlash: nice nick
:D
ruaok
also,moooin!
iliekcomputers
Moin!
nav2002 has quit
DjSlash
iliekcomputers: ha, thanks :)
iliekcomputers
The pipeline werks
ruaok
niiiice!
iliekcomputers
Although sending gigs of data in a single rmq message probably doesn't make any sense
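[Editorial sketch] One way to avoid a single multi-gigabyte message is to split the per-user results into small batches and publish each batch as its own message. The function name `chunk_messages`, the batch size, and the payload shape below are assumptions for illustration only, not the ListenBrainz code; the actual publish call is deliberately left out.

```python
import json
from typing import Dict, Iterator, List


def chunk_messages(user_stats: Dict[str, List[dict]], batch_size: int = 2) -> Iterator[bytes]:
    """Yield JSON-encoded batches holding at most `batch_size` users each."""
    batch: Dict[str, List[dict]] = {}
    for user, stats in user_stats.items():
        batch[user] = stats
        if len(batch) == batch_size:
            yield json.dumps(batch).encode("utf-8")
            batch = {}
    if batch:
        # Flush the final, possibly smaller, batch.
        yield json.dumps(batch).encode("utf-8")


# Hypothetical artist stats for three users.
stats = {
    "alice": [{"artist": "A", "listen_count": 10}],
    "bob": [{"artist": "B", "listen_count": 4}],
    "carol": [{"artist": "C", "listen_count": 7}],
}

# Each element would become the body of one queue message.
messages = list(chunk_messages(stats, batch_size=2))
```

Each element of `messages` stays small no matter how many users there are, so the consumer can process and acknowledge batches one at a time.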
ruaok
is that the output of all stats?
iliekcomputers
There's lots of easy wins in optimization left
ruaok: the query oomed when we calculated all three stats for all users
All three being artists, releases, and recordings
ruaok
oy.
iliekcomputers
So I did just artist for making it work
And it takes a long time to publish
Needs more investigation
zas
Moiinn
iliekcomputers
But hey, it works!
pristine__
ruaok: if that's uncomfortable to read, lemme know and I will format it as md or something
ruaok
iliekcomputers: all the big questions have been settled, which is nice.
pristine__: I really like the idea of adding the .md extension and then it's all done. please do that.
iliekcomputers
yes!
i'd like to create a listenbrainz_spark user or something for the configs
right now it's running from my account which isn't ideal.
ruaok
chhavi says hi and will join us for the meeting tonight.
iliekcomputers
hi chhavi
ruaok
make me a task for creating a new user on the paper and I'll do it in a bit.
zas
Hey chhavi
ruaok
you off to a'dam today, zas?
zas
Yup, Thalys just left Paris
ruaok
looks like it will be cold in the north of europe this week.
reosarevok
eh, we're having over-0 temps all week, not that bad :p
zas
I'll go to pre-register for haproxyconf on arrival, then to hotel
It was very cold in Paris, but Amsterdam should be better, around 6°C, expect rain though
It's cold on the other side of the Atlantic this week too…
iliekcomputers
ruaok: task added
ruaok
k
TOPIC: MetaBrainz Community and Development channel | MusicBrainz non-development: #musicbrainz | New GSoC students start here: https://goo.gl/7jsjG2 | Channel is logged; see https://musicbrainz.org/doc/IRC for details | Meeting (18:00 UTC) agenda: Reviews, Google Code-in (Freso), instrument illustrations (Reo/CatQuest), mbsandbox.org (ruaok)
chaban joined the channel
BrainzGit
[listenbrainz-server] dependabot-preview[bot] opened pull request #666 (master…dependabot/pip/python-dateutil-2.8.1): Bump python-dateutil from 2.8.0 to 2.8.1 https://github.com/metabrainz/listenbrainz-serv...
ruaok
pristine__: reading the gist now. so, everything is nice and clear leading up to creating playcounts_df. is that right?
I wonder if the similar artists table should map artist credits instead of artists.
then you would not have to explode the recordings_df.
pristine__
that means an array of mbids right?
sounds good
then no explode
> reading the gist now. so, everything is nice and clear leading up to creating playcounts_df. is that right?
yes
I mean I have some points on improving quality, but for now it's fine. We can just jot them down so that they help us in the next GSoC labs project.
ruaok: ^
ruaok
Let me see if using artist credits makes sense for the artist-artist stuff. I remember it being a question and that it made more sense at an artist-artist level than at an artistcredit-artistcredit level.
pristine__
sure.
ruaok: how do you feel about the explode and duplicate recording stuff?
ruaok
Not good
pristine__
yeah :)
And do you have another way besides the two I mentioned?
ruaok: I mean any that you feel could be better
ruaok
Well, having an ac-ac relation instead of a-a should fix it, no?
pristine__
yeah, got that. So I mean, if you ever, in the middle of the night or anytime, come across any lil point that could fit into a recommendation engine in the future, maybe not right now, do share. We can discuss and build docs as we walk the roadmap and use it sometime, somewhere.
:)
iliekcomputers
man i <3 dependabot
chaban has quit
ruaok
pristine__: ok, will do. now let me examine the a-a/ac-ac case
pristine__
sure :)
and share your findings please :)
ruaok
reosarevok: you about?
pristine__: ok, from where I stand I think it doesn't matter very much from my artist relations perspective.
pristine__
what perspective?
ruaok
the script that calculates the a-a relations.
pristine__
okay
ruaok
it automatically explodes the results, but the semantic meaning remains the same.
so I will create two outputs: one for a-a and one for ac-ac
pristine__
do I need to use the former?
ruaok
no, you should use the latter going forward
pristine__
yeah.
ruaok
and really it will be [artist-mbids] - [artist-mbids] as the actual mapping.
an array to an array.
since AC's do not have MBIDs.
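[Editorial sketch] Since artist credits have no MBID of their own, an array-to-array mapping can be keyed on the artist MBIDs themselves. The helper name `ac_key`, the placeholder MBID strings, and the choice to sort the MBIDs are all assumptions made for illustration; the real script's output format is not shown in this log.

```python
from typing import Dict, Iterable, List, Tuple

ArtistCreditKey = Tuple[str, ...]


def ac_key(artist_mbids: Iterable[str]) -> ArtistCreditKey:
    """Build a hashable key for an artist credit from its artist MBIDs.

    Sorting is an assumption: it treats [a, b] and [b, a] as the same credit.
    """
    return tuple(sorted(artist_mbids))


# Placeholder strings stand in for real artist MBIDs.
similar: Dict[ArtistCreditKey, List[ArtistCreditKey]] = {
    ac_key(["mbid-a"]): [ac_key(["mbid-b"]), ac_key(["mbid-c", "mbid-d"])],
    ac_key(["mbid-c", "mbid-d"]): [ac_key(["mbid-a"])],
}

# A collaboration credit looks up the same way as a single-artist credit.
related = similar[ac_key(["mbid-d", "mbid-c"])]
```

A single-artist credit is simply a one-element tuple, so both cases share one lookup path.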
pristine__
that is awesome.
but
reosarevok
ruaok: now I am
ruaok
shit. a but.
pristine__
the array in ac-ac will always be singular, no?
ruaok
reosarevok: perfect timing. I just answered all the questions I had. lol.
reosarevok
haha
Neat!
pristine__
like [a] similar to [b]
ruaok
pristine__: only one array mapping to another array, yes. but each array could have one or more entries.
alternatively I can output an ID for each artist credit: AC_0, AC_1, relation
pristine__
how? I don't think I clearly understand that. Can you give an example? oh, so till now it was like a similar to b, a similar to c
now it will be together
a similar to [b,c]
is it?
ruaok
you fully understand artist credits, yes?
pristine__
i guess so. it's about how many collabs an artist appears in with another artist
ruaok
yes, but more importantly know that any recording is attributed to an artist_credit. NOT an artist.
pristine__
umm....cool. I like this line.
clear
ruaok
so, if we want to avoid exploding the recordings_df, we need to rework your candidate artist work to work on artist_credits, not artists.
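[Editorial sketch] The point above can be illustrated in plain Python: if both the recordings and the similarity relations are keyed on artist credits, each recording stays a single row and candidates are found with a direct lookup, with no explode step. All names and data here (`recordings`, `similar_credits`, `candidate_recordings`, the placeholder IDs) are hypothetical; the real pipeline works on Spark dataframes such as recordings_df.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Credit = Tuple[str, ...]

# Hypothetical rows: (recording id, artist MBIDs of the recording's artist credit).
recordings: List[Tuple[str, Credit]] = [
    ("rec-1", ("mbid-a",)),
    ("rec-2", ("mbid-b", "mbid-c")),
    ("rec-3", ("mbid-b", "mbid-c")),
    ("rec-4", ("mbid-d",)),
]

# Hypothetical artist-credit to artist-credit relations.
similar_credits: Dict[Credit, List[Credit]] = {
    ("mbid-a",): [("mbid-b", "mbid-c")],
}

# Index recordings by credit; every recording appears exactly once.
by_credit: Dict[Credit, List[str]] = defaultdict(list)
for recording_id, credit in recordings:
    by_credit[credit].append(recording_id)


def candidate_recordings(credit: Credit) -> List[str]:
    """Return recordings attributed to credits similar to `credit`."""
    candidates: List[str] = []
    for other in similar_credits.get(credit, []):
        candidates.extend(by_credit.get(other, []))
    return candidates


candidates = candidate_recordings(("mbid-a",))
```

With artist-level relations, "rec-2" and "rec-3" would each have to be duplicated once per credited artist before the join; keyed on the credit, they are matched once.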