#metabrainz

13:18 PM
yvanzo

For example, defineArtistCreditColumn already takes an accessor function as a parameter

2020-05-11 13255, 2020

13:42 PM
reosarevok

yvanzo: you mean stuff like getArtistCredit for everything? Sure, we can

2020-05-11 13244, 2020

13:43 PM
pristine__

iliekcomputers: ping

2020-05-11 13227, 2020

13:46 PM
iliekcomputers

pong

2020-05-11 13258, 2020

13:50 PM
reosarevok

yvanzo: we already have results having both score and entity. Would you suggest to change basic lists then to not have Array<CoreEntityT> but Array<{+entity:CoreEntityT}> ?

2020-05-11 13216, 2020

13:51 PM
reosarevok

I guess that'd be the most consistent / sensible

2020-05-11 13207, 2020

13:57 PM
pristine__

iliekcomputers: I tried to run spark jobs on leader but was not able to, since request consumer is already running. I guess we need to bring down request consumer, or share resources between parallel jobs. Can you look into it?

2020-05-11 13229, 2020

13:57 PM
iliekcomputers

can you not do the work locally?

2020-05-11 13249, 2020

13:57 PM
iliekcomputers

i want to stop us from running dev scripts on pay-server

2020-05-11 13255, 2020

13:57 PM
iliekcomputers

on production*

2020-05-11 13227, 2020

13:58 PM
iliekcomputers

you should write the script locally, get it code-reviewed and then run it on prod via request consumer

2020-05-11 13250, 2020

13:58 PM
yvanzo

reosarevok: yes about flow types

2020-05-11 13233, 2020

13:59 PM
yvanzo

Actually, I did not even think about flow types :P

2020-05-11 13256, 2020

14:00 PM
reosarevok

yvanzo: I'm not just talking flow types, we'd need to change the actual stuff we pass to be objects with entity: artist or whatever rather than just the artists

2020-05-11 13256, 2020

14:00 PM
yvanzo

About accessor function, I did not check how to implement it.

2020-05-11 13201, 2020

14:01 PM
reosarevok

Unless we don't care about it being consistent

2020-05-11 13224, 2020

14:01 PM
yvanzo

reosarevok: That's right, but I don't see how we could keep flow working without this change.

2020-05-11 13241, 2020

14:01 PM
reosarevok

No, I mean, I like the idea because it ensures consistency

2020-05-11 13251, 2020

14:01 PM
reosarevok

I just want to make sure it sounds reasonable before I start :

2020-05-11 13252, 2020

14:01 PM
reosarevok

* :D

2020-05-11 13245, 2020

14:02 PM
yvanzo

OR, maybe you can have a generic type I and have it defined through accessor function.

2020-05-11 13204, 2020
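
A minimal sketch of the accessor idea in plain JavaScript (Flow annotations omitted so it runs as-is). The helper `defineNameColumn`, its `getEntity` parameter, and the returned column shape are hypothetical names made up for illustration; this is not the real utility/tableColumns.js API. It only shows how one column definition could serve both bare entity lists and wrapped `{entity, score}` results.

```javascript
// Hypothetical column helper. `getEntity` is an assumed accessor that
// pulls the entity out of whatever row shape the table is given.
function defineNameColumn(getEntity) {
  return {
    id: 'name',
    header: 'Name',
    // Build the cell value by going through the accessor.
    cell: (row) => getEntity(row).name,
  };
}

const artists = [{name: 'Bjork'}, {name: 'Low'}];

// Shape 1: a basic list of bare entities, so the accessor is the identity.
const bareColumn = defineNameColumn((artist) => artist);

// Shape 2: search-style results that wrap each entity next to a score.
const results = artists.map((artist, index) => ({
  entity: artist,
  score: 100 - index,
}));
const wrappedColumn = defineNameColumn((result) => result.entity);

console.log(artists.map(bareColumn.cell));
console.log(results.map(wrappedColumn.cell));
```

With an accessor the existing lists would not need to change shape. Wrapping every list as objects with an `entity` key, as discussed above, trades that flexibility for a single consistent row type.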

14:03 PM
yvanzo

(I really did not check anything :P)

2020-05-11 13200, 2020

14:05 PM
reosarevok

I guess I can also wait for bitmap and talk it through together before I start changing everything :p

2020-05-11 13206, 2020

14:06 PM
yvanzo

reosarevok: in either case, it would be nice to refactor utility/tableColumns.js functions to use named parameters at some point.

2020-05-11 13237, 2020

14:06 PM
reosarevok

named parameters = accept an object instead of a bunch of params?

2020-05-11 13239, 2020

14:07 PM
yvanzo

yup, just like properties for components

2020-05-11 13235, 2020
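
A rough illustration of what "named parameters" means here, in plain JavaScript without Flow annotations. Both function names and their parameters are hypothetical and do not reproduce any actual function from utility/tableColumns.js.

```javascript
// Positional version: every call site has to remember the argument order.
function defineExampleColumnPositional(getText, columnName, title) {
  return {id: columnName, header: title, cell: (row) => getText(row)};
}

// "Named parameter" version: a single object, destructured in the
// signature, much like props passed to a component.
function defineExampleColumn({columnName, getText, title}) {
  return {id: columnName, header: title, cell: (row) => getText(row)};
}

const positional = defineExampleColumnPositional(
  (row) => row.comment,
  'comment',
  'Comment',
);

// Keys can be given in any order and the call documents itself.
const named = defineExampleColumn({
  title: 'Comment',
  columnName: 'comment',
  getText: (row) => row.comment,
});

console.log(positional.header, named.header);
```

The object form also makes it easier to add an optional setting later without touching existing call sites.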

14:12 PM
pristine__

iliekcomputers: I can do the work locally but I cannot look into the data and refine the results on local machine.

2020-05-11 13225, 2020

14:13 PM
iliekcomputers

if you make it go through request consumer, you will be able to do that.

2020-05-11 13250, 2020

14:13 PM
iliekcomputers

the idea is that all code on the production spark cluster goes through request consumer for now.

2020-05-11 13239, 2020

14:17 PM
jmp_music joined the channel

2020-05-11 13240, 2020

14:17 PM
pristine__

Oh, okay. Thank you :)

2020-05-11 13207, 2020

14:21 PM
iliekcomputers

pristine__: if that flow seems non-ideal / too slow for you, i'd be happy to take suggestions on how to improve it. However, the old flow where all of us were running stuff on the cluster willy nilly was a bit unsafe (i remember data getting deleted by mistake on multiple occasions) and now that we have more people working on the spark stuff anyways, i think improving the developer environment is the way to go.

2020-05-11 13222, 2020

14:23 PM
reosarevok

yvanzo: guess I can do that one then for now

2020-05-11 13231, 2020

14:23 PM
reosarevok

We don't have a ticket yet, do we?

2020-05-11 13211, 2020

14:26 PM
yvanzo

Doesn't seem so.

2020-05-11 13248, 2020

14:26 PM
reosarevok

Something like

2020-05-11 13253, 2020

14:26 PM
reosarevok

https://www.irccloud.com/pastebin/318sHJ61/

2020-05-11 13206, 2020

14:27 PM
reosarevok

Instead of

2020-05-11 13208, 2020

14:27 PM
reosarevok

https://www.irccloud.com/pastebin/5SV8GqFT/

2020-05-11 13210, 2020

14:27 PM
reosarevok

Right? :)

2020-05-11 13220, 2020

14:27 PM
reosarevok

Do you prefer "props" or "options"?

2020-05-11 13237, 2020

14:31 PM
yvanzo

it is probably not needed to name this object

2020-05-11 13244, 2020

14:31 PM
pristine__

iliekcomputers: since spark results require a lot of testing (refining recommendations, looking into stats, basically checking the quality of results), which is not possible on a local machine, I think separate clusters for prod and testing would be ideal if possible.

2020-05-11 13228, 2020

14:32 PM
ruaok

we don't have money for that.

2020-05-11 13228, 2020

14:32 PM
pristine__

That way there would be no chance of data deletion by mistake.

2020-05-11 13231, 2020

14:32 PM
iliekcomputers

could we run two instances of spark on the same machine

2020-05-11 13234, 2020

14:32 PM
pristine__

Right

2020-05-11 13237, 2020

14:32 PM
yvanzo

reosarevok: export function defineActionsColumn({+actions: $ReadOnlyArray<[string, string]>}): named parameter

2020-05-11 13257, 2020

14:32 PM
pristine__

> could we run two instances of spark on the same machine

2020-05-11 13200, 2020

14:33 PM
pristine__

I am not sure.

2020-05-11 13234, 2020

14:33 PM
reosarevok

yvanzo: that's a parsing error

2020-05-11 13242, 2020

14:33 PM
reosarevok

Apparently :)

2020-05-11 13210, 2020

14:34 PM
yvanzo

reosarevok: Ok :D go for your version then :)

2020-05-11 13220, 2020

14:35 PM
reosarevok

How do I set a default value for a parameter in this case? I don't remember :D

2020-05-11 13202, 2020

14:36 PM
iliekcomputers

pristine__: right

2020-05-11 13227, 2020

14:36 PM
iliekcomputers

I'd suggest trying the request consumer flow for a week or two and then seeing if there's specific pain points we can fix

2020-05-11 13209, 2020

14:37 PM
ruaok

+1

2020-05-11 13211, 2020

14:37 PM
pristine__

Yes. Sounds good to me.

2020-05-11 13215, 2020

14:37 PM
yvanzo

reosarevok: by making the prop optional in the object and declaring a local variable in function's body?

2020-05-11 13231, 2020

14:37 PM
reosarevok

Oh, I can't do it when assigning? Oh well

2020-05-11 13224, 2020
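
For reference, a small sketch of two ways to default a named parameter in plain JavaScript. The first follows the approach yvanzo describes (optional property, default applied in the function body). The second puts the default in the destructuring pattern, which the language itself allows; whether the Flow annotations used in the codebase at the time accepted that form is not something this log settles. The function names and the 'Untitled' default are hypothetical.

```javascript
// Approach from the chat: treat the property as optional and resolve
// the default with a local variable inside the function body.
function defineColumnWithLocalDefault(options) {
  const title = options.title === undefined ? 'Untitled' : options.title;
  return {id: options.columnName, header: title};
}

// Plain JavaScript alternative: the default sits in the destructuring
// pattern and applies only when the property is missing or undefined.
function defineColumnWithPatternDefault({columnName, title = 'Untitled'}) {
  return {id: columnName, header: title};
}

console.log(defineColumnWithLocalDefault({columnName: 'a'}).header);
console.log(defineColumnWithPatternDefault({columnName: 'a'}).header);
console.log(defineColumnWithPatternDefault({columnName: 'a', title: 'A'}).header);
```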

14:44 PM
Mr_Monkey

prabal: I created BB-468 and after testing, it is indeed linked to sessions (signing out and back in updates the name). Let's tackle that one another day.

2020-05-11 13225, 2020

14:44 PM
BrainzBot

BB-468: Username is not updated in the navbar after update in editor profile https://tickets.metabrainz.org/browse/BB-468

2020-05-11 13203, 2020

14:45 PM
prabal

Mr_Monkey: alright :)

2020-05-11 13235, 2020

14:57 PM
pristine__

pep8speaks is a cool thing <3

2020-05-11 13214, 2020

14:59 PM
alastairp

I really like the idea of getting bots to check things that bots are good at doing

2020-05-11 13204, 2020

15:00 PM
alastairp

that way, people doing reviews can focus on good feedback, not feedback on code style

2020-05-11 13201, 2020

15:01 PM
pristine__

right, saves a lot of time

2020-05-11 13231, 2020

15:13 PM
alastairp

btw, just got github actions running eslint on acousticbrainz: https://github.com/metabrainz/acousticbrainz-serv… https://github.com/metabrainz/acousticbrainz-serv…

2020-05-11 13222, 2020

15:18 PM
yvanzo

alastairp, iliekcomputers: is there someone still in charge of CB?

2020-05-11 13241, 2020

15:18 PM
alastairp

I don't know

2020-05-11 13257, 2020

15:18 PM
yvanzo

apparently, it's not you :D

2020-05-11 13259, 2020

15:18 PM
ruaok

not officially at the moment.

2020-05-11 13218, 2020

15:19 PM
yvanzo

Ok, maybe I can remove it from focus areas at https://musicbrainz.org/doc/Development/Priorities

2020-05-11 13227, 2020

15:19 PM
alastairp

I guess it depends if it's for general development, or an urgent security upgrade

2020-05-11 13228, 2020

15:19 PM
ruaok

I wonder if we should find someone to volunteer or just shut it down.

2020-05-11 13241, 2020

15:19 PM
ruaok

yvanzo: do that, certainly.

2020-05-11 13201, 2020

15:20 PM
yvanzo

And remove iliekcomputers from CB project in Jira, so that leaves it open for someone else.

2020-05-11 13224, 2020

15:20 PM
yvanzo

alastairp: do you want to check SEC tickets for CB?

2020-05-11 13235, 2020

15:20 PM
alastairp

I'll add it to my list

2020-05-11 13206, 2020

15:22 PM
yvanzo

Ok, thanks, I will make GitHub Bot assign it to you then.

2020-05-11 13246, 2020

15:22 PM
alastairp

oh, maybe I misunderstood your comment. I will verify the current SEC tickets. I'm not sure about going forward :)

2020-05-11 13253, 2020

15:22 PM
alastairp

are there any currently open tickets?

2020-05-11 13220, 2020

15:24 PM
yvanzo

alastairp: yes, about jquery.

2020-05-11 13233, 2020

15:25 PM
yvanzo

SEC-132, SEC-133

2020-05-11 13235, 2020

15:25 PM
BrainzBot

SEC-132: [critiquebrainz] CVE-2020-11023: jquery >= 1.0.3, < 3.5.0 https://tickets.metabrainz.org/browse/SEC-132

2020-05-11 13235, 2020

15:25 PM
BrainzBot

SEC-133: [critiquebrainz] CVE-2020-11022: jquery >= 1.2, < 3.5.0 https://tickets.metabrainz.org/browse/SEC-133

2020-05-11 13227, 2020

15:27 PM
yvanzo

ruaok: is https://groups.io/g/metabrainz-bugs something we still want?

2020-05-11 13238, 2020

15:27 PM
ruaok

no not really.

2020-05-11 13206, 2020

15:28 PM
yvanzo

I don't know why no mail is going through this list, it should receive all notifications from Jira.

2020-05-11 13219, 2020

15:39 PM
iliekcomputers

i don't have capacity for CB these days, happy to be removed from the jira.

2020-05-11 13234, 2020

15:39 PM
iliekcomputers

well, not exactly happy, but still probably should be

2020-05-11 13201, 2020

15:41 PM
yvanzo

iliekcomputers: Ok, same about BU, LB, MSP?

2020-05-11 13224, 2020

15:41 PM
iliekcomputers

i'm on LB. others, yeah, probably should remove me.

2020-05-11 13232, 2020

15:41 PM
iliekcomputers

thanks yvanzo

2020-05-11 13238, 2020

15:44 PM
yvanzo

Done, you're welcome!

2020-05-11 13234, 2020

15:53 PM
yvanzo

alastairp, bitmap, reosarevok: while I'm on jira email notifications, do you want any change about that?

2020-05-11 13249, 2020

15:53 PM
alastairp

O

2020-05-11 13201, 2020

15:54 PM
alastairp

I'm happy with the number and contents of the emails that I'm getting

2020-05-11 13227, 2020

15:54 PM
reosarevok

I'm good I think

2020-05-11 13232, 2020

15:57 PM
alastairp

jmp_music: I'm in the process of collecting this data, but it's going to be quite a bit - maybe 20 or 30gb of audio. Is that OK for you to download, or do you want me to generate the acousticbrainz feature files myself, and have you just download the features?

2020-05-11 13242, 2020

15:57 PM
alastairp

up to you, neither option is extra work for me

2020-05-11 13227, 2020

16:00 PM
ruaok

right-o -- time for the music recommendation meeting we set last week.

2020-05-11 13232, 2020

16:00 PM
ruaok

y'all ready?

2020-05-11 13258, 2020

16:00 PM
yvanzo

reosarevok: you don't mind being notified about MBVM/SEARCH events so we just use the same scheme for all MB server stuff?

2020-05-11 13201, 2020

16:01 PM
ruaok

don't all jump at once. :)

2020-05-11 13205, 2020

16:01 PM
shivam-kapila

ruaok: I am up

2020-05-11 13235, 2020

16:01 PM
ruaok

iliekcomputers: alastairp : Mr_Monkey : you about?

2020-05-11 13243, 2020

16:01 PM
Mr_Monkey

Yep !

2020-05-11 13245, 2020

16:01 PM
alastairp

man. 6pm already?

2020-05-11 13251, 2020

16:01 PM
ruaok

yerp.

2020-05-11 13257, 2020

16:01 PM
alastairp

I'm here, but just finishing up an email

2020-05-11 13258, 2020

16:01 PM
shivam-kapila

9 30 PM here

2020-05-11 13231, 2020

16:03 PM
ruaok

well, iliekcomputers wanted to have a meeting to catch up on this topic. oh well, he'll get to read the scrollback.

2020-05-11 13248, 2020

16:03 PM
ruaok

ok, has everyone had a chance to read https://docs.google.com/document/d/175KcVnS5VfsRx… ?

2020-05-11 13226, 2020

16:04 PM
Mr_Monkey

Affirmative

2020-05-11 13231, 2020

16:04 PM
iliekcomputers

I'm around

2020-05-11 13234, 2020

16:04 PM
ishaanshah[m]

Yes

2020-05-11 13241, 2020

16:04 PM
shivam-kapila

Yup

2020-05-11 13243, 2020

16:04 PM
ruaok

aight.

2020-05-11 13253, 2020

16:04 PM
ruaok

so, I'll jump in with a general overview of where we are.

2020-05-11 13211, 2020

16:05 PM
ruaok

the core team already kinda knows this stuff, but to bring us on the same page....

2020-05-11 13243, 2020

16:05 PM
ruaok

we're going to focus our non-musicbrainz teams on music recommendation algorithms.

2020-05-11 13228, 2020

16:06 PM
ruaok

this pandemic will likely cause us to lose more customers and that will hurt our bottom line -- we need to find more supporters and we need to offer more data for supporters to get hooked on.

2020-05-11 13251, 2020

16:06 PM
ruaok

I've asked Mr_Monkey to spend 50% of his time on this stuff for at least the next 3 months.

2020-05-11 13210, 2020

16:07 PM
ruaok

and alastairp's tasks will also get re-shuffled to help with this effort.

2020-05-11 13245, 2020

16:07 PM
ruaok

this project has a number of goals. the most important ones are:

2020-05-11 13210, 2020

16:08 PM
ruaok

1. expand our data sets to appeal to people who need recommendations or are also working on recommendations (--> $$$)

2020-05-11 13246, 2020

16:08 PM
ruaok

2. create a toolkit/playground where people can use normal every day python skills and tinker with music recommendations.

2020-05-11 13207, 2020

16:09 PM
ruaok

there are a lot of people who want to play with these tools and we've never had an answer for these people.

2020-05-11 13211, 2020

16:10 PM
ruaok

if we can build this toolkit then we can invite more people into our projects (not just recommendation) and also highlight problems with the existing data sets so that we can appeal to the community to help fix data problems, which in turn should improve recommendations.

2020-05-11 13248, 2020

16:10 PM
ruaok

right now a lot of recommendation work is very academic -- focus on theory, but often with little real world focus.

2020-05-11 13227, 2020

16:11 PM
ruaok

I'd like to open this up and bring more of an open source spirit to it. experimentation, iteration, sharing.

2020-05-11 13257, 2020

16:11 PM
ruaok

perhaps we will learn that we cannot beat the academics. perhaps we can find other ways of making recommendations.

2020-05-11 13247, 2020

16:12 PM
ruaok

alastairp and I shared this vision a few years ago -- there used to be a cool company called EchoNest -- they ran out of money and were sold to spotify. and spotify closed everything down and subsumed it into their offerings.

2020-05-11 13217, 2020

16:13 PM
Mr_Monkey

(Conversely Spotify recommendations improved dramatically over the years)

2020-05-11 13226, 2020

16:13 PM
ruaok

this loss caused alastairp and I to create AB and later LB. the goal was to build at least two datasets that would be useful for recommendation work.

2020-05-11 13250, 2020

16:13 PM
ruaok

we're now finally at that point -- we finally get to play with the data and see what we can build.

2020-05-11 13204, 2020

16:14 PM
ruaok

what Mr_Monkey says is completely true.

2020-05-11 13223, 2020

16:14 PM
ruaok

but there are very few ways in which we, the end users, can influence these features.

2020-05-11 13255, 2020

16:14 PM
ruaok

my toolkit idea is the total opposite -- I'm thinking loads of options for people to play and explore. at first you'll need to know how to use python.

2020-05-11 13217, 2020

16:15 PM
ruaok

but later perhaps this can all run in the browser -- this also makes scaling these services very interesting.

2020-05-11 13227, 2020

16:15 PM
ruaok

ok, questions so far?

2020-05-11 13215, 2020

16:16 PM
jmp_music

alastairp: Could you please send me the features too if you don't mind. Besides, I'm studying the paper right now. Thanks

2020-05-11 13215, 2020

16:16 PM
Mr_Monkey

—

2020-05-11 13234, 2020

16:16 PM
ruaok

this weekend I spent a lot of time hacking on some preliminary code to implement this toolkit -- the basics are there and I should have something to play with tomorrow.

2020-05-11 13212, 2020

16:17 PM
ruaok

what I need help with is creating datasources and filters.

2020-05-11 13221, 2020

16:17 PM
ruaok

first: annoy data sources.

2020-05-11 13228, 2020

16:17 PM
alastairp

(sorry, I just have to make an urgent phone call, I'm here and watching, got a few comments to make)

2020-05-11 13245, 2020

16:17 PM
ruaok

ok, annoy indexes will be second or third.

2020-05-11 13203, 2020

16:18 PM
ruaok

we'll start at the back end then -- Mr_Monkey!