mayhem: spotify metadata finished yesterday in about 3 hours, but there was a bug in fetching track ids, so i fixed it and rebuilt. the hopefully correct one is now available on gaga in `mapping.spotify_metadata_index`
2022-10-06 27902, 2022
mayhem
woo hoo, very good -- thanks lucifer.
2022-10-06 27923, 2022
mayhem
have you done a couple of queries against the combineddata field to see if it can find things you'd expect to see?
I see it uses authlib.oauth2.rfc6749 -- alastairp is that the library that we wish to use going forward?
2022-10-06 27910, 2022
mayhem hopes so
2022-10-06 27929, 2022
lucifer
mayhem: yes, i did some queries to test and it seems fine. case in point about unlistenable shit, the first row that turned up was an album of 315 unplayable and unnamed songs.
2022-10-06 27907, 2022
mayhem
I wonder if we should make an attempt to filter some of this shit out.
2022-10-06 27918, 2022
mayhem
but that might be more trouble than benefit for having a smaller DB.
2022-10-06 27912, 2022
lucifer
i guess we'll have a better idea once we use it and look at more of it.
2022-10-06 27917, 2022
mayhem
next steps for this project: let's try to build a content resolver around this index and test transferring some playlists.
2022-10-06 27935, 2022
mayhem
and then see if we need to build a typesense index for this data as well.
2022-10-06 27915, 2022
lucifer
makes sense
2022-10-06 27919, 2022
mayhem
at this point, I really want to have a flexible, unified module that does metadata resolution. e.g. mapper for MB and Spotify unified in one engine, ready to have more data sets added to it.
2022-10-06 27956, 2022
mayhem
I've been trying to think up how to make a flexible lookup system, but I keep coming up short.
2022-10-06 27948, 2022
mayhem
the whole process of looking up metadata is quite fiddly -- it feels more like "trying random shit" than a "carefully designed process".
2022-10-06 27903, 2022
lucifer
so you give it a (artist, album, track) and it gives back MB, spotify and other data?
2022-10-06 27911, 2022
mayhem
which of course makes it harder to create a sensible module for
2022-10-06 27922, 2022
mayhem
yes, but let me expand.
2022-10-06 27947, 2022
mayhem
given: (artist, recording (required) and release, tracknum, duration (optional)) OR (recording_mbid), find a match in (MB, spotify) and return metadata appropriate for the target database.
2022-10-06 27922, 2022
mayhem
also needed: an indication of "how hard to try", because we could do an exact text match or N attempts at detuning and querying.
2022-10-06 27934, 2022
mayhem
this way we can plug this "engine" into the API or the mapping writer
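A minimal sketch of the "engine" described above, assuming each target database (MB, Spotify) is exposed as a callable that does an exact lookup for a query. All names here (`LookupQuery`, `Effort`, `detune_steps`, `resolve`) are hypothetical and only illustrate the input contract and the "how hard to try" knob; they are not the existing mapper code.

```python
import re
from dataclasses import dataclass, replace
from enum import IntEnum
from typing import Callable, Dict, Iterator, List, Optional


class Effort(IntEnum):
    """How hard to try (hypothetical levels)."""
    EXACT = 0    # one exact text match only
    DETUNED = 1  # also retry with progressively looser text


@dataclass(frozen=True)
class LookupQuery:
    # Either recording_mbid, or artist + recording, is required.
    artist: Optional[str] = None
    recording: Optional[str] = None
    release: Optional[str] = None
    tracknum: Optional[int] = None
    duration_ms: Optional[int] = None
    recording_mbid: Optional[str] = None

    def __post_init__(self):
        if not self.recording_mbid and not (self.artist and self.recording):
            raise ValueError("need recording_mbid or (artist, recording)")


def detune_steps(text: str) -> List[str]:
    """Return progressively looser variants of a string (illustrative only)."""
    lowered = text.lower().strip()
    # Drop bracketed suffixes such as "(remastered)" or "[live]".
    no_parens = re.sub(r"\s*[\(\[][^\)\]]*[\)\]]", "", lowered).strip()
    # Drop remaining punctuation, keeping letters, digits and spaces.
    no_punct = re.sub(r"[^\w\s]", "", no_parens).strip()
    return [lowered, no_parens, no_punct]


def candidates(query: LookupQuery, effort: Effort) -> Iterator[LookupQuery]:
    """Yield the original query first, then detuned variants if allowed."""
    yield query
    if effort == Effort.EXACT or query.recording_mbid:
        return  # MBID lookups are already unambiguous
    seen = {(query.artist, query.recording)}
    for a, r in zip(detune_steps(query.artist), detune_steps(query.recording)):
        if a and r and (a, r) not in seen:
            seen.add((a, r))
            yield replace(query, artist=a, recording=r)


Backend = Callable[[LookupQuery], Optional[dict]]


def resolve(query: LookupQuery, backends: Dict[str, Backend],
            effort: Effort = Effort.EXACT) -> Dict[str, Optional[dict]]:
    """Return the first hit per backend, or None if nothing matched."""
    results: Dict[str, Optional[dict]] = {}
    for name, lookup in backends.items():
        results[name] = None
        for candidate in candidates(query, effort):
            hit = lookup(candidate)
            if hit is not None:
                results[name] = hit
                break
    return results
```

The same `resolve` call could then sit behind an API endpoint or the mapping writer, with each new data set added as one more entry in `backends`.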
2022-10-06 27955, 2022
lucifer
makes sense. i think we should just start there. accept all this info in a method to call directly from the server and in an api endpoint to call from elsewhere. all the resolution stuff goes into the module. then see where we get from there
2022-10-06 27910, 2022
mayhem
agreed.
2022-10-06 27927, 2022
mayhem
because we have two implementations of it already and will need a third now, so let's stop this nonsense.
2022-10-06 27905, 2022
lucifer
yes makes sense
2022-10-06 27946, 2022
monkey
Sorry everyone at the office, I'm a little late with the breakfast food. Forgot my charger, had to track back :/
2022-10-06 27952, 2022
atj_mb
Will be at the office in 10
2022-10-06 27906, 2022
mayhem packs up and heads out
2022-10-06 27908, 2022
alastairp
mayhem: yes, that oauth library is the one that lucifer and I were talking about, I believe he already did a bunch of investigation back when he was working on that branch and it seemed the best of the bunch
Pratha-Fish: hi, we're having an in-person metabrainz day here in the office, and so I'm not really going to be available. do you have the code uploaded?
2022-10-06 27905, 2022
Pratha-Fish
alastairp: yes, I do have it pushed on the same notebook
2022-10-06 27942, 2022
zas
lucifer: I think you could help me add support for it on the client side (in Picard). for now, it uses a client_id/client_secret, which is not great
gender, area and birthdate columns.... move to MeB?
2022-10-06 27946, 2022
reosarevok
bitmap: should it be a temp table for unverified emails?
2022-10-06 27954, 2022
lucifer
yup makes sense i think
2022-10-06 27956, 2022
reosarevok
Then you get to store it even if another, verified one exists
2022-10-06 27903, 2022
zas
ok, ping me when you're available. It shouldn't be too hard, we can first add PKCE support using current MusicBrainz OAuth, then I can do the changes for supporting another OAuth server (it assumes that's MusicBrainz server atm)
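For reference, the client-side half of PKCE (RFC 7636, `S256` method) is small. The sketch below uses only the Python standard library; the function names are hypothetical and this is not Picard's actual code. The client sends `code_challenge` with the authorization request and later sends `code_verifier` with the token request, so no client secret has to ship inside the application.

```python
import base64
import hashlib
import secrets
from typing import Tuple


def code_challenge(verifier: str) -> str:
    """S256 challenge: base64url(SHA-256(verifier)) without '=' padding."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def make_pkce_pair() -> Tuple[str, str]:
    """Return (code_verifier, code_challenge) for one authorization attempt."""
    # 32 random bytes encode to 43 URL-safe characters, the minimum
    # verifier length allowed by RFC 7636 (43 to 128 characters).
    verifier = secrets.token_urlsafe(32)
    return verifier, code_challenge(verifier)
```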
2022-10-06 27910, 2022
reosarevok
mayhem: I'd say move everything that is about the *user*, leave only MB-specific stuff
2022-10-06 27920, 2022
reosarevok
So language to MeB too, for example
2022-10-06 27924, 2022
bitmap
reosarevok: dunno, presumably if it's valid they can verify the new one right away
2022-10-06 27901, 2022
mayhem
where is the language stored, reosarevok ?
2022-10-06 27906, 2022
mayhem
it wasn't mentioned in the google doc
2022-10-06 27920, 2022
reosarevok
editor_language
2022-10-06 27938, 2022
reosarevok
There's also editor_preference (but I expect most preferences would be per-project)
2022-10-06 27949, 2022
reosarevok
Things like timezone though might not
2022-10-06 27957, 2022
mayhem
yes, exactly.
2022-10-06 27913, 2022
reosarevok
There's also old_editor_name where we keep used names from deleted editors so they cannot be reused
2022-10-06 27929, 2022
lucifer
gender and area are interesting in that area and gender data lives in MB.
2022-10-06 27948, 2022
mayhem
indeed.
2022-10-06 27952, 2022
reosarevok
And of course editor_oauth_token
2022-10-06 27902, 2022
reosarevok
I expect we should have our list of genders be shared across projects
2022-10-06 27909, 2022
zas
outsidecontext: there's a plan to move to a central OAuth service and to unify user access under this metabrainz.org umbrella, I'll prepare a patch for Picard reflecting this move.
2022-10-06 27912, 2022
mayhem
should we move those tables to meb and then return string values to the other projects?
2022-10-06 27920, 2022
reosarevok
Area... I mean, probably too, but right now the whole UI to edit areas is in MB
2022-10-06 27930, 2022
lucifer
but MB also needs those tables for artists
2022-10-06 27945, 2022
lucifer
so a copy needs to remain in MB too.
2022-10-06 27955, 2022
mayhem
agreed, but that scope is a bit different.
2022-10-06 27914, 2022
mayhem
for example would CB need to create a gender table to understand the data sent by meb.org?
2022-10-06 27915, 2022
reosarevok
Yes, I meant more that we should have gender shared and then BB authors, MB artists and MeB editors should pick from the same list
2022-10-06 27936, 2022
reosarevok
But how exactly that'd work I dunno yet :)
2022-10-06 27941, 2022
lucifer
i see, makes sense.
2022-10-06 27949, 2022
mayhem
exactly how that would work is what I am trying to sort out right now.
2022-10-06 27956, 2022
lucifer
we'll need to ensure to keep those tables in sync though
i think we'll have to do it manually. maybe a cron job to query gender tables across projects and report differences?
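A rough sketch of what such a cron job's comparison step could look like, assuming each project's gender table has already been fetched into an `{id: name}` dict. The function name, the table shape and the project keys are assumptions made for illustration; fetching the rows from each database is left out.

```python
from typing import Dict, List


def diff_lookup_tables(tables: Dict[str, Dict[int, str]],
                       reference: str) -> List[str]:
    """Compare every project's table against the reference project's copy."""
    ref = tables[reference]
    report: List[str] = []
    for project, rows in sorted(tables.items()):
        if project == reference:
            continue
        # Walk the union of ids so missing and extra rows are both reported.
        for key in sorted(ref.keys() | rows.keys()):
            expected, actual = ref.get(key), rows.get(key)
            if expected != actual:
                report.append(
                    f"{project}: id {key}: expected {expected!r}, found {actual!r}"
                )
    return report
```

An empty report means the copies agree; anything else could be mailed or posted to a channel for manual follow-up.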
2022-10-06 27927, 2022
alastairp
a thought that came to me: there are many things currently oauth-authenticating to musicbrainz. this means that we'll need to keep it running for some time while we encourage everyone to move over to MeB
2022-10-06 27934, 2022
reosarevok
Yeah, I guess we don't want to break acoustid or something
2022-10-06 27912, 2022
zas
alastairp: yes, we'll have old versions of Picard in the wild for a while, so we need to keep things running on current OAuth servers, but clearly deprecate them
2022-10-06 27914, 2022
alastairp
so scopes should probably be namespaced too
2022-10-06 27951, 2022
alastairp
just talking with bitmap in person about whether we could seamlessly redirect requests for MB oauth endpoints to MeB; that's something worth looking into
2022-10-06 27911, 2022
alastairp
and we could automatically prefix scopes with musicbrainz: where needed
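The automatic prefixing could be as simple as the sketch below, which assumes the usual space-delimited OAuth `scope` string (RFC 6749) and a `project:scope` naming convention. Both the function name and the `listenbrainz:listens` scope in the usage example are hypothetical.

```python
def namespace_scopes(scope: str, prefix: str = "musicbrainz") -> str:
    """Prefix un-namespaced scopes; leave already-namespaced ones untouched."""
    namespaced = []
    for item in scope.split():
        # A colon is taken to mean the scope already carries a namespace.
        namespaced.append(item if ":" in item else f"{prefix}:{item}")
    return " ".join(namespaced)


# A legacy request arriving via the old MB endpoints:
legacy = namespace_scopes("profile tag")
# A request that already uses namespaced scopes passes through unchanged:
modern = namespace_scopes("musicbrainz:profile listenbrainz:listens")
```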
2022-10-06 27921, 2022
alastairp
perhaps we just deactivate creation of new applications on MB and support this workflow for as long as is necessary