For example, defineArtistCreditColumn already takes an accessor function in parameter
2020-05-11 13255, 2020
reosarevok
yvanzo: you mean stuff like getArtistCredit for everything? Sure, we can
2020-05-11 13244, 2020
pristine__
iliekcomputers: ping
2020-05-11 13227, 2020
iliekcomputers
pong
2020-05-11 13258, 2020
reosarevok
yvanzo: we already have results having both score and entity. Would you suggest to change basic lists then to not have Array<CoreEntityT> but Array<{+entity:CoreEntityT}> ?
2020-05-11 13216, 2020
reosarevok
I guess that'd be the most consistent / sensible
2020-05-11 13207, 2020
pristine__
iliekcomputers: I tried to run spark jobs on leader but was not able to since request consumer is already running. I guess we need to bring down request consumer, share resources between parallel jobs. Can you look into it?
2020-05-11 13229, 2020
iliekcomputers
can you not do the work locally?
2020-05-11 13249, 2020
iliekcomputers
i want to stop us from running dev scripts on pay-server
2020-05-11 13255, 2020
iliekcomputers
on production*
2020-05-11 13227, 2020
iliekcomputers
you should write the script locally, get it code-reviewed and then run it on prod via request consumer
2020-05-11 13250, 2020
yvanzo
reosarevok: yes about flow types
2020-05-11 13233, 2020
yvanzo
Actually, I did not even think about flow types :P
2020-05-11 13256, 2020
reosarevok
yvanzo: I'm not just talking flow types, we'd need to change the actual stuff we pass to be objects with entity: artist or whatever rather than just the artists
2020-05-11 13256, 2020
yvanzo
About accessor function, I did not check how to implement it.
2020-05-11 13201, 2020
reosarevok
Unless we don't care about it being consistent
2020-05-11 13224, 2020
yvanzo
reosarevok: That's right, but I don't see how we could keep flow working without this change.
2020-05-11 13241, 2020
reosarevok
No, I mean, I like the idea because it ensures consistency
2020-05-11 13251, 2020
reosarevok
I just want to make sure it sounds reasonable before I start :
2020-05-11 13252, 2020
reosarevok
* :D
2020-05-11 13245, 2020
yvanzo
OR, maybe you can have a generic type I and have it defined through accessor function.
2020-05-11 13204, 2020
yvanzo
(I really did not check anything :P)
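The accessor-function idea discussed above can be sketched roughly as follows. This is a minimal illustration in plain JavaScript, not the actual utility/tableColumns.js code: `defineNameColumn`, `getEntity`, and the sample rows are hypothetical names chosen for the example. It shows one column definition serving both a plain list of entities and a list of `{entity, score}` results.

```javascript
// Hypothetical helper, not the actual utility/tableColumns.js code.
// `getEntity` is the accessor: it maps a table row to the entity to display.
function defineNameColumn(getEntity, title) {
  return {
    header: title,
    // Each cell asks the accessor for the entity instead of assuming
    // that the row itself is the entity.
    cell: (row) => getEntity(row).name,
  };
}

// Plain list: each row is the entity itself.
const artists = [{name: 'Artist A'}, {name: 'Artist B'}];
const plainColumn = defineNameColumn((row) => row, 'Artist');

// Search results: each row wraps the entity alongside a score.
const results = [{entity: {name: 'Artist A'}, score: 100}];
const resultColumn = defineNameColumn((row) => row.entity, 'Artist');

console.log(artists.map(plainColumn.cell));
console.log(results.map(resultColumn.cell));
```

With an accessor, the rows passed in do not all have to be reshaped into `{entity: ...}` objects. The trade-off raised in the chat remains: the shapes stay inconsistent across lists.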
2020-05-11 13200, 2020
reosarevok
I guess I can also wait for bitmap and talk it through together before I start changing everything :p
2020-05-11 13206, 2020
yvanzo
reosarevok: in either case, it would be nice to refactor utility/tableColumns.js functions to use named parameters at some point.
2020-05-11 13237, 2020
reosarevok
named parameters = accept an object instead of a bunch of params?
2020-05-11 13239, 2020
yvanzo
yup, just like properties for components
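As a rough illustration of "named parameters" in this sense, the sketch below contrasts a positional signature with one that accepts a single object, much like props for a component. The function names and fields (`defineTextColumn`, `columnName`, `getText`, `title`) are assumptions made for the example and may not match the real helpers.

```javascript
// Hypothetical signatures, for illustration only.

// Positional parameters: call sites must remember the argument order.
function defineTextColumnPositional(getText, columnName, title) {
  return {id: columnName, header: title, cell: (row) => getText(row)};
}

// Named parameters: one object argument, destructured in the signature.
function defineTextColumn({columnName, getText, title}) {
  return {id: columnName, header: title, cell: (row) => getText(row)};
}

// Keys can be given in any order and document themselves at the call site.
const column = defineTextColumn({
  title: 'Name',
  columnName: 'name',
  getText: (row) => row.name,
});

console.log(column.id, column.header, column.cell({name: 'Example'}));
```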
2020-05-11 13235, 2020
pristine__
iliekcomputers: I can do the work locally but I cannot look into the data and refine the results on local machine.
2020-05-11 13225, 2020
iliekcomputers
if you make it go through request consumer, you will be able to do that.
2020-05-11 13250, 2020
iliekcomputers
the idea is that all code on the production spark cluster goes through request consumer for now.
2020-05-11 13239, 2020
jmp_music joined the channel
2020-05-11 13240, 2020
pristine__
Oh, okay. Thank you :)
2020-05-11 13207, 2020
iliekcomputers
pristine__: if that flow seems non-ideal / too slow for you, i'd be happy to take suggestions on how to improve it. However, the old flow where all of us were running stuff on the cluster willy nilly was a bit unsafe (i remember data getting deleted by mistake on multiple occasions) and now that we have more people working on the spark stuff anyways, i think improving the developer environment is the way to go.
iliekcomputers: since the spark results require a lot of testing (refining recommendations, looking into stats, basically checking the quality of results), which is not possible on a local machine, I think separate clusters for prod and testing would be ideal if possible.
2020-05-11 13228, 2020
ruaok
we don't have money for that.
2020-05-11 13228, 2020
pristine__
That way there would be no chance of data deletion by mistake.
2020-05-11 13231, 2020
iliekcomputers
could we run two instances of spark on the same machine
2020-05-11 13234, 2020
pristine__
Right
2020-05-11 13237, 2020
yvanzo
reosarevok: export function defineActionsColumn({+actions: $ReadOnlyArray<[string, string]>}): named parameter
2020-05-11 13257, 2020
pristine__
> could we run two instances of spark on the same machine
2020-05-11 13200, 2020
pristine__
I am not sure.
2020-05-11 13234, 2020
reosarevok
yvanzo: that's a parsing error
2020-05-11 13242, 2020
reosarevok
Apparently :)
2020-05-11 13210, 2020
yvanzo
reosarevok: Ok :D go for your version then :)
2020-05-11 13220, 2020
reosarevok
How do I set a default value for a parameter in this case? I don't remember :D
2020-05-11 13202, 2020
iliekcomputers
pristine__: right
2020-05-11 13227, 2020
iliekcomputers
I'd suggest trying the request consumer flow for a week or two and then seeing if there's specific pain points we can fix
2020-05-11 13209, 2020
ruaok
+1
2020-05-11 13211, 2020
pristine__
Yes. Sounds good to me.
2020-05-11 13215, 2020
yvanzo
reosarevok: by making the prop optional in the object and declaring a local variable in function's body?
2020-05-11 13231, 2020
reosarevok
Oh, I can't do it when assigning? Oh well
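The two ways of defaulting a named parameter touched on here can be sketched as below. The function and option names are hypothetical. The first form follows the suggestion in the chat: an optional property plus a local variable in the function body. The second is the default-in-destructuring form that plain JavaScript allows. The log does not settle whether the second form suits the Flow-annotated signatures under discussion.

```javascript
// Hypothetical function and option names, for illustration only.

// Approach suggested in the chat: the property is optional, and the
// default is applied to a local variable in the function body.
function defineCountColumnA(props) {
  const title = props.title === undefined ? 'Count' : props.title;
  return {header: title};
}

// Plain JavaScript alternative: a default inside the destructuring pattern.
// The default applies only when the property is missing or undefined.
function defineCountColumnB({title = 'Count'} = {}) {
  return {header: title};
}

console.log(defineCountColumnA({}).header, defineCountColumnA({title: 'Releases'}).header);
console.log(defineCountColumnB().header, defineCountColumnB({title: 'Releases'}).header);
```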
2020-05-11 13224, 2020
Mr_Monkey
prabal: I created BB-468 and after testing, it is indeed linked to sessions (signing out and back in updates the name). Let's tackle that one another day.
I don't know why no mail is going through this list, it should receive all notifications from Jira.
2020-05-11 13219, 2020
iliekcomputers
i don't have capacity for CB these days, happy to be removed from the jira.
2020-05-11 13234, 2020
iliekcomputers
well, not exactly happy, but still probably should be
2020-05-11 13201, 2020
yvanzo
iliekcomputers: Ok, same about BU, LB, MSP?
2020-05-11 13224, 2020
iliekcomputers
i'm on LB. others, yeah, probably should remove me.
2020-05-11 13232, 2020
iliekcomputers
thanks yvanzo
2020-05-11 13238, 2020
yvanzo
Done, you're welcome!
2020-05-11 13234, 2020
yvanzo
alastairp, bitmap, reosarevok: while I'm on jira email notifications, do you want any change about that?
2020-05-11 13249, 2020
alastairp
O
2020-05-11 13201, 2020
alastairp
I'm happy with the number and contents of the emails that I'm getting
2020-05-11 13227, 2020
reosarevok
I'm good I think
2020-05-11 13232, 2020
alastairp
jmp_music: I'm in the process of collecting this data, but it's going to be quite a bit - maybe 20 or 30gb of audio. Is that OK for you to download, or do you want me to generate the acousticbrainz feature files myself, and have you just download the features?
2020-05-11 13242, 2020
alastairp
up to you, no option is extra work for me
2020-05-11 13227, 2020
ruaok
right-o -- time for the music recommendation meeting we set last week.
2020-05-11 13232, 2020
ruaok
y'all ready?
2020-05-11 13258, 2020
yvanzo
reosarevok: you don't mind being notified about MBVM/SEARCH events so we just use the same scheme for all MB server stuff?
2020-05-11 13201, 2020
ruaok
don't all jump at once. :)
2020-05-11 13205, 2020
shivam-kapila
ruaok: I am up
2020-05-11 13235, 2020
ruaok
iliekcomputers: alastairp : Mr_Monkey : you about?
2020-05-11 13243, 2020
Mr_Monkey
Yep !
2020-05-11 13245, 2020
alastairp
man. 6pm already?
2020-05-11 13251, 2020
ruaok
yerp.
2020-05-11 13257, 2020
alastairp
I'm here, but just finishing up an email
2020-05-11 13258, 2020
shivam-kapila
9 30 PM here
2020-05-11 13231, 2020
ruaok
well, iliekcomputers wanted to have a meeting to catch up on this topic. oh well, he'll get to read the scrollback.
so, I'll jump in with a general overview of where we are.
2020-05-11 13211, 2020
ruaok
the core team already kinda knows this stuff, but to bring us on the same page....
2020-05-11 13243, 2020
ruaok
we're going to focus our non-musicbrainz teams on music recommendation algorithms.
2020-05-11 13228, 2020
ruaok
this pandemic will likely cause us to lose more customers and that will hurt our bottom line -- we need to find more supporters and we need to offer more data for supporters to get hooked on.
2020-05-11 13251, 2020
ruaok
I've asked Mr_Monkey to spend 50% of his time on this stuff for at least the next 3 months.
2020-05-11 13210, 2020
ruaok
and alastairp's tasks will also get re-shuffled to help with this effort.
2020-05-11 13245, 2020
ruaok
this project has a number of goals. the most important ones are:
2020-05-11 13210, 2020
ruaok
1. expand our data sets to appeal to people who need recommendations or are also working on recommendations (--> $$$)
2020-05-11 13246, 2020
ruaok
2. create a toolkit/playground where people can use normal every day python skills and tinker with music recommendations.
2020-05-11 13207, 2020
ruaok
there are a lot of people who want to play with these tools and we've never had an answer for these people.
2020-05-11 13211, 2020
ruaok
if we can build this toolkit then we can invite more people into our projects (not just recommendation) and also highlight problems with the existing data sets so that we can appeal to the community to help fix data problems, which in turn should improve recommendations.
2020-05-11 13248, 2020
ruaok
right now a lot of recommendation work is very academic -- focused on theory, but often with little real world focus.
2020-05-11 13227, 2020
ruaok
I'd like to open this up and bring more of an open source spirit to it. experimentation, iteration, sharing.
2020-05-11 13257, 2020
ruaok
perhaps we will learn that we cannot beat the academics. perhaps we can find other ways of making recommendations.
2020-05-11 13247, 2020
ruaok
alastairp and I shared this vision a few years ago -- there used to be a cool company called EchoNest -- they ran out of money and were sold to spotify. and spotify closed everything down and subsumed it into their offerings.
2020-05-11 13217, 2020
Mr_Monkey
(Conversely Spotify recommendations improved dramatically over the years)
2020-05-11 13226, 2020
ruaok
this loss caused alastairp and I to create AB and later LB. the goal was to build at least two datasets that would be useful for recommendation work.
2020-05-11 13250, 2020
ruaok
we're now finally at that point -- we finally get to play with the data and see what we can build.
2020-05-11 13204, 2020
ruaok
what Mr_Monkey says is completely true.
2020-05-11 13223, 2020
ruaok
but there are very few ways in which we, the end users, can influence these features.
2020-05-11 13255, 2020
ruaok
my toolkit idea is the total opposite -- I'm thinking loads of options for people to play and explore. at first you'll need to know how to use python.
2020-05-11 13217, 2020
ruaok
but later perhaps this can all run in the browser -- this also makes scaling these services very interesting.
2020-05-11 13227, 2020
ruaok
ok, questions so far?
2020-05-11 13215, 2020
jmp_music
alastairp: Could you please send me the features too if you don't mind. Besides, I'm studying the paper right now. Thanks
2020-05-11 13215, 2020
Mr_Monkey
—
2020-05-11 13234, 2020
ruaok
this weekend I spent a lot of time hacking on some preliminary code to implement this toolkit -- the basics are there and I should have something to play with tomorrow.
2020-05-11 13212, 2020
ruaok
what I need help with are creating datasources and filters.
2020-05-11 13221, 2020
ruaok
first: annoy data sources.
2020-05-11 13228, 2020
alastairp
(sorry, I just have to make an urgent phone call, I'm here and watching, got a few comments to make)