mayhem: spotify metadata finished yesterday in about 3 hours. but there was a bug in fetching track ids so i fixed it and rebuilt. now the hopefully correct one is available on gaga in `mapping.spotify_metadata_index`
mayhem
woo hoo, very good -- thanks lucifer.
have you done a couple of queries against the combineddata field to see if it can find things you'd expect to see?
I see it uses authlib.oauth2.rfc6749 -- alastairp is that the library that we wish to use going forward?
mayhem hopes so
lucifer
mayhem: yes, i did some queries to test and it seems fine. case in point about unlistenable shit, the first row that turned up was an album of 315 unplayable and unnamed songs.
mayhem
I wonder if we should make an attempt to filter some of this shit out.
but that might be more trouble than benefit for having a smaller DB.
lucifer
i guess we'll have a better idea once we use it and look at more of it.
mayhem
next steps for this project: let's try to build a content resolver around this index and test transferring some playlists.
and then see if we need to build a typesense index for this data as well.
lucifer
makes sense
mayhem
at this point, I really want to have a flexible, unified module that does metadata resolution. e.g. mapper for MB and Spotify unified in one engine, ready to have more data sets added to it.
I've been trying to think up how to make a flexible lookup system, but I keep coming up short.
the whole process of looking up metadata is rather fiddly -- it feels more like "trying random shit" than a "carefully designed process".
lucifer
so you give it a (artist, album, track) and it gives back MB, spotify and other data?
mayhem
which of course makes it harder to create a sensible module for
yes, but let me expand.
given: (artist, recording (required) and release, tracknum, duration (optional)) OR (recording_mbid), find a match in (MB, spotify) and return metadata appropriate for the target database.
also needed: an indication of "how hard to try", because we could do an exact text match or N attempts at detuning and querying.
this way we can plug this "engine" into the API or the mapping writer
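A minimal sketch of the interface described above, assuming hypothetical names throughout (`Query`, `Effort`, `detune`, `resolve` and the backend callables are illustrative, not existing ListenBrainz code). It accepts either (artist, recording) with optional extras or a `recording_mbid`, tries an exact match first, and only falls back to a detuned query when asked to try harder.

```python
import re
from dataclasses import dataclass, replace
from enum import IntEnum
from typing import Callable, Dict, Optional


class Effort(IntEnum):
    """How hard to try (hypothetical levels)."""
    EXACT = 0    # exact text match only
    DETUNED = 1  # also retry with normalized ("detuned") text


@dataclass(frozen=True)
class Query:
    artist: Optional[str] = None
    recording: Optional[str] = None
    release: Optional[str] = None
    tracknum: Optional[int] = None
    duration_ms: Optional[int] = None
    recording_mbid: Optional[str] = None

    def __post_init__(self):
        # Either (artist, recording) or recording_mbid must be given.
        if not self.recording_mbid and not (self.artist and self.recording):
            raise ValueError("need (artist, recording) or recording_mbid")


def detune(text: str) -> str:
    """Lowercase, drop parenthesised parts and punctuation, squeeze spaces."""
    text = re.sub(r"\(.*?\)", "", text.lower())
    text = re.sub(r"[^\w\s]", "", text)
    return " ".join(text.split())


# A backend (MB, Spotify, ...) takes a Query and returns metadata or None.
Backend = Callable[[Query], Optional[dict]]


def resolve(query: Query, backends: Dict[str, Backend],
            effort: Effort = Effort.EXACT) -> Dict[str, dict]:
    """Return the first match per backend, trying progressively looser queries."""
    attempts = [query]
    if effort >= Effort.DETUNED and query.artist and query.recording:
        attempts.append(replace(query,
                                artist=detune(query.artist),
                                recording=detune(query.recording)))
    results = {}
    for name, lookup in backends.items():
        for attempt in attempts:
            match = lookup(attempt)
            if match is not None:
                results[name] = match
                break
    return results
```

Because each data set is just a callable, the same `resolve` could be called from the API or from the mapping writer, and adding another data set would mean registering one more backend.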
lucifer
makes sense. i think we should just start there. accept all this info in a method to call directly from the server and in an api endpoint to call from elsewhere. all the resolution stuff goes into the module. then see where we get from there
mayhem
agreed.
because we have two implementations of it now and will need a third, so let's stop this nonsense.
lucifer
yes makes sense
monkey
Sorry everyone at the office, I'm a little late with the breakfast food. Forgot my charger, had to track back :/
atj_mb
Will be at the office in 10
mayhem packs up and heads out
alastairp
mayhem: yes, that oauth library is the one that lucifer and I were talking about, I believe he already did a bunch of investigation back when he was working on that branch and it seemed the best of the bunch
Pratha-Fish: hi, we're having an in-person metabrainz day here in the office, and so I'm not really going to be available. do you have the code uploaded?
Pratha-Fish
alastairp: yes, I do have it pushed on the same notebook
zas
lucifer: I think you could help me add support for it on the client side (in Picard). for now, it uses a client_id/client_secret, which is not great
gender, area and birthdate columns.... move to MeB?
reosarevok
bitmap: should it be a temp table for unverified emails?
lucifer
yup makes sense i think
reosarevok
Then you get to store it even if another, verified one exists
zas
ok, ping me when you're available. It shouldn't be too hard, we can first add PKCE support using current MusicBrainz OAuth, then I can do the changes for supporting another OAuth server (it assumes that's MusicBrainz server atm)
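For reference, the client-side part of PKCE (RFC 7636, `S256` method) is small. The sketch below uses only the standard library; the function names are illustrative and this is not Picard's actual code. The client generates a random `code_verifier`, sends the derived `code_challenge` with the authorization request, and later sends the verifier with the token request.

```python
import base64
import hashlib
import secrets
from typing import Tuple


def _b64url(data: bytes) -> str:
    """Base64url-encode without '=' padding, as RFC 7636 requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def challenge_for(verifier: str) -> str:
    """S256 challenge: BASE64URL(SHA256(ASCII(code_verifier)))."""
    return _b64url(hashlib.sha256(verifier.encode("ascii")).digest())


def make_pkce_pair(n_bytes: int = 32) -> Tuple[str, str]:
    """Return (code_verifier, code_challenge).

    32 random bytes encode to a 43-character verifier, the minimum
    length allowed by the spec (43 to 128 characters).
    """
    verifier = _b64url(secrets.token_bytes(n_bytes))
    return verifier, challenge_for(verifier)
```

With this in place the client no longer needs to rely on a `client_secret` shipped inside the application to protect the authorization code.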
reosarevok
mayhem: I'd say move everything that is about the *user*, leave only MB-specific stuff
So language to MeB too, for example
bitmap
reosarevok: dunno, presumably if it's valid they can verify the new one right away
mayhem
where is the language stored, reosarevok ?
it wasn't mentioned in the google doc
reosarevok
editor_language
There's also editor_preference (but I expect most preferences would be per-project)
Things like timezone though might not
mayhem
yes, exactly.
reosarevok
There's also old_editor_name where we keep used names from deleted editors so they cannot be reused
lucifer
gender and area are interesting in that area and gender data lives in MB.
mayhem
indeed.
reosarevok
And of course editor_oauth_token
I expect we should have our list of genders be shared across projects
zas
outsidecontext: there's a plan to move to a central OAuth service and to unify user access under this metabrainz.org umbrella, I'll prepare a patch for Picard reflecting this move.
mayhem
should we move those tables to meb and then return string values to the other projects?
reosarevok
Area... I mean, probably too, but right now the whole UI to edit areas is in MB
lucifer
but MB also needs those tables for artists
so a copy needs to remain in MB too.
mayhem
agreed, but that scope is a bit different.
for example would CB need to create a gender table to understand the data sent by meb.org?
reosarevok
Yes, I meant more that we should have gender shared and then BB authors, MB artists and MeB editors should pick from the same list
But how exactly that'd work I dunno yet :)
lucifer
i see, makes sense.
mayhem
how exactly that would work is what I am trying to sort out right now.
lucifer
we'll need to ensure those tables stay in sync though
i think we'll have to do it manually. maybe a cron job to query gender tables across projects to report differences ?
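A rough sketch of what such a consistency check could look like, assuming each project's gender table has already been fetched into a simple `{id: name}` dict (the real tables have more columns, and the fetching itself is left out). `diff_tables` is a hypothetical helper, not existing code.

```python
from typing import Dict, List, Optional


def diff_tables(tables: Dict[str, Dict[int, str]]) -> List[str]:
    """Report every row id whose value is not identical in all projects.

    `tables` maps a project name to that project's {id: name} rows.
    A row missing from a project shows up as None in the report.
    """
    report = []
    all_ids = sorted(set().union(*(rows.keys() for rows in tables.values())))
    for row_id in all_ids:
        seen: Dict[str, Optional[str]] = {
            project: rows.get(row_id) for project, rows in tables.items()
        }
        if len(set(seen.values())) > 1:
            details = ", ".join(f"{p}={v!r}" for p, v in sorted(seen.items()))
            report.append(f"id {row_id}: {details}")
    return report


if __name__ == "__main__":
    # Made-up sample data for illustration only.
    sample = {
        "mb": {1: "Male", 2: "Female", 3: "Other"},
        "bb": {1: "Male", 2: "Female"},
    }
    for line in diff_tables(sample):
        print(line)
```

A cron job could run this against all projects and mail the report only when it is non-empty.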
alastairp
a thought that came to me: there are many things currently oauth-authenticating to musicbrainz. this means that we'll need to keep it running for some time while we encourage everyone to move over to MeB
reosarevok
Yeah, I guess we don't want to break acoustid or something
zas
alastairp: yes, we'll have old versions of Picard in the wild for a while, so we need to keep things running on current OAuth servers, but clearly deprecate them
alastairp
so scopes should probably be namespaced too
just talking with bitmap in person about whether we could seamlessly redirect requests to MB oauth endpoints to MeB, that's something worth looking into
and we could automatically prefix scopes with musicbrainz: where needed
perhaps we just deactivate creation of new applications on MB and support this workflow for as long as is necessary
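The automatic scope prefixing mentioned above could be sketched as follows. OAuth 2.0 scopes are a space-delimited string (RFC 6749, section 3.3), so a redirect layer could rewrite each token before passing the request on. The function name, the choice to leave already-namespaced scopes untouched, and the example scope names are assumptions for illustration.

```python
def namespace_scopes(scope: str, prefix: str = "musicbrainz") -> str:
    """Prefix every un-namespaced scope token with '<prefix>:'.

    Tokens that already contain ':' are assumed to be namespaced
    and are passed through unchanged.
    """
    tokens = scope.split()
    return " ".join(t if ":" in t else f"{prefix}:{t}" for t in tokens)


if __name__ == "__main__":
    # Legacy request from an old client that still targets the MB endpoints.
    print(namespace_scopes("profile tag"))
```

Old clients would keep sending their existing scope strings, while the central service would only ever see namespaced ones.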