Like, do you see yourself wanting to preemptively block usage of any?
Freso
Unless we’d be able to add regular expressions (so we could do e.g., `catcat[0-9]+`), I’m not sure it’s immediately useful.
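A minimal sketch of the regular-expression idea mentioned here, assuming a blocklist where each entry must match the whole username. The names `BLOCKED_PATTERNS` and `is_blocked` are hypothetical and not taken from any existing MusicBrainz code.

```python
import re

# Hypothetical blocklist: each entry is a regex that must match the
# entire username, not just a substring of it.
BLOCKED_PATTERNS = [re.compile(p) for p in (r"catcat[0-9]+",)]


def is_blocked(username: str) -> bool:
    """Return True if the username fully matches any blocked pattern."""
    return any(p.fullmatch(username) for p in BLOCKED_PATTERNS)
```

Using `fullmatch` rather than `search` keeps a pattern like `catcat[0-9]+` from also blocking longer names that merely contain it.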
reosarevok
Ok
Freso
But don’t design it in a way that would make that functionality impossible either. :p
reosarevok
How do you imagine this (and MBS-11828) working? Having a full list you can filter with a search? Or just having a search that will give you any matching usernames and allow you to remove them?
thoughts on software, so true: "A home truth of mine is that software projects in fact look more like a liability than an asset. Software, whether anyone is getting any benefit from it or not attracts a huge number of very annoying expenses: security vulnerability scrambles, bug fixes and database upgrades, and all of it just lines the pockets of wealthy sysadmins."
yes, and for us, this makes sense. we're a software and data centric group -- this is our business.
I've got a friend who wanted to start an online language school and had a great number of ideas. His ideas were all based in software and I told him that he is starting a software company, not a language company.
CatQuest
oic
ruaok
he was ready to do the former, but not the latter. he pressed on anyway. now the savings are gone and there is no product.
CatQuest
:(
jasondk
ruaok: hope you enjoyed it! since you're back we should chat about whether or not Listens without mbids should be reviewable on CB with the new feature
I know you mentioned on the proposal that you'd be okay with leaving them as not reviewable
but this still feels restrictive again
ruaok
I agree, but I am concerned about schedule and overloading you.
not sure what the rest of your summer looks like once the official coding period ends.
if you end up going over and would be ok to stick around to see it through, then I think we ought to make the MSID ones reviewable.
lucifer
fwiw, this will need CB changes as well. CB currently only supports MBIDs.
jasondk
I'd be happy to go over a week or two to get everything done :)
ruaok nods at lucifer
ruaok
jasondk: then by all means, let's do it right.
jasondk
I'm thinking the best way to do this is to make API calls from the frontend to do MBID lookups via musicbrainz and the mbid mapper on labs.api.
ruaok
the changes to CB do worry me.
lucifer
+1, but let's discuss how we are going to approach this before moving on to implementation.
ruaok
lucifer: clearly.
jasondk
At the very least every listen has an artist name, track name, and recording msid that we can use to do lookups.
ruaok
jasondk: the problem with the MBID lookup is that if we don't have one in LB, it means that the lookups have failed.
and we'll have tried 2-4 different types of lookup at that point.
lucifer
on this, to confirm we are talking of the listens which do not have a MBID match at all even after lookups?
ruaok
for now it would be safe to assume that if we don't have an MBID, we can't easily get one. you'd be getting into the weeds for a lot of work.
lucifer: that is my impression. jasondk?
jasondk
Are you saying that right now if a listen doesn't have an mbid on LB, it will be difficult to find one?
lucifer
MBIDs come from two sources currently: 1) the user submits an mbid along with the listen, 2) the mbid mapper tries to look one up. then there are 3) listens which do not have an mbid because none was submitted and the lookup failed.
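The three cases above can be sketched as a small function. This is only an illustration: the function name, its parameters, and the choice to prefer a user-submitted MBID over a mapper result are assumptions, not something stated in the discussion.

```python
from typing import Optional, Tuple


def resolve_recording_mbid(
    submitted_mbid: Optional[str],
    mapped_mbid: Optional[str],
) -> Tuple[Optional[str], str]:
    """Return (mbid, source) for a listen.

    Assumption: a user-submitted MBID wins over a mapper result.
    """
    if submitted_mbid:
        return submitted_mbid, "user"      # case 1: submitted with the listen
    if mapped_mbid:
        return mapped_mbid, "mapper"       # case 2: found by the mbid mapper
    return None, "unmatched"               # case 3: nothing submitted, lookup failed
```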
ruaok
exactly that.
I've been spending the last year working hard to build a system that finds MBIDs for all listens.
and if we find cases (and we will) where my matching isn't good enough, I'll work on that more.
so for now we should assume if there is no MBID, we can't get one without human intervention.
which makes adding MSID support in CB quite a chore.
jasondk
Oh i see. i've just been looking at the listens on my local machine missing mbids and assumed LB was the same
alastairp
hello
jasondk
But I'm looking at LB again now and the mbid matching has improved a lot
lucifer
hi alastairp!
ruaok
hi alastairp
lucifer
right, we have approx 83% matches currently.
another thing to consider is can MBIDs associated with a MSID change?
ruaok
83% 🥰
lucifer
yup :D
!m ruaok
BrainzBot
You're doing good work, ruaok!
ruaok
> can MBIDs associated with a MSID change?
Yes, no. Sorta.
jasondk
Yeah i didn't know it improved so much lol
lucifer
for instance, when we make improvements to the mapper. say it finds a better match?
ruaok
there will be two passes of the data. First one (in place now) will do track matching. Then shortly after that, we will do temporal matching to find complete albums. between these two the MBIDs could change.
and we may dream up more steps later, so I think for now we ought to assume that they could change over time.
lucifer
+1
ruaok
> say it finds a better match?
yes, I expect that I will re-run non-exact matches every time we have a new and improved alg.
so, given this, how, if at all, do we add reviews for non-matched tracks?
jasondk
How often are msid's "remapped"?
ruaok
I am starting to think that we ought to prompt users to add the data to MusicBrainz in order to map the listens.
lucifer
for non-matched tracks, i think we shouldn't bother at least for now.
ruaok
jasondk: remapped when we push out new algs. so, very occasionally.
jasondk
I agree, I wasn't aware our match rate was already 83%. at this point it will be a lot more work to improve this while not adding that many tracks to be reviewable.
ruaok
lucifer: it would be good to have a "landing page" for "oops, you wanted to do something with a listen, but we could not find a match in MB".
lucifer
yes makes sense
next thing is what to do with reviews on listens whose MBIDs were reassigned by the mapper?
ruaok
oh dear.
I might be too hungry to contemplate that mess.
lucifer
:)
jasondk
lastfm_track_mbid is interchangeable with recording_mbid, correct?
ruaok
I am inclined to say that we should ignore it for now.
jasondk: yes, but their matching algorithm was... questionable. lots of bad data.
jasondk
I see
ruaok
reassigned MBIDs, I hope, should happen very rarely. so rarely as to not be worth dealing with now.
jasondk
Is the match rate a similar situation for artist mbids and release group mbids?
I am noticing some of my listens don't have artist_mbids, i'm assuming they aren't easily findable with our current mapping methods either
ruaok
do you have an example?
because everything mapping with my mapper should have all three.
and they should be consistent.
jasondk
oh nvm, the listens with missing artist-mbids don't have recording_mbids either.
I do see some cases where lastfm artist and release group mbids are missing a recording mbid though:
+1, so as long as LB backend sends a MBID to the frontend. frontend should just be a call to the CB api to write a review.
jasondk
ok sure, I have everything we need then !
lucifer
alternatively, frontend calls LB api which internally maps the MSID to a MBID and then calls CB api. the problem with this approach though is that frontend should still know which listens have a lookup match. it would be annoying to write a review only to find it couldn't be published because no match.
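A rough sketch of the check the frontend would need before offering a review: only entities with a known MBID are reviewable. The listen shape and the field names `recording_mbid`, `release_group_mbid` and `artist_mbid` are hypothetical simplifications, not the actual LB payload.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical mapping from reviewable entity type to the listen field
# that would carry its MBID.
ID_FIELDS = (
    ("recording", "recording_mbid"),
    ("release_group", "release_group_mbid"),
    ("artist", "artist_mbid"),
)


def reviewable_entities(listen: Dict[str, Optional[str]]) -> List[Tuple[str, str]]:
    """List (entity_type, mbid) pairs the user could review for this listen."""
    found = []
    for entity_type, field in ID_FIELDS:
        mbid = listen.get(field)
        if mbid:  # skip missing or empty ids, e.g. MSID-only listens
            found.append((entity_type, mbid))
    return found
```

An empty result means the listen has no match, so the review button can be hidden up front instead of failing after the user has written the review.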
jasondk
I agree, I think we will be fine to just use the IDs that the frontend already knows.
And if an ID for one of the entities is missing a future project could be to prompt the user to search for it and add it themselves so we can map it to the MSID
lucifer
right, the frontend knows the id but only if the user submitted it i think. we haven't added the feature to send looked up mbids to frontend currently.
akashgp09 joined the channel
jasondk
Ok, i understand.
ruaok
ok, I'm off to hunt for some food. back later today for PR reviews.
MrClon
What about AcousticBrainz dumps? For my project (automatically downloading releases from bandcamp and posting their data to AcousticBrainz and AcoustID) I need a list of MB recordings without AB data. Right now I get it by "bruteforcing" the AB API, but it's soo slooow
reosarevok
alastairp: ^
alastairp
MrClon: hi! fixing dumps is at the top of my list to do in AB this month and next month
do you know that you can query up to 20 items from AB in one query?
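A small sketch of how a client could group recording MBIDs into batches of 20 before querying. The limit of 20 comes from the message above; the helper name `batches` is hypothetical, and the actual request to the AB API is deliberately left out.

```python
from typing import Iterator, List, Sequence


def batches(mbids: Sequence[str], size: int = 20) -> Iterator[List[str]]:
    """Yield consecutive groups of at most `size` MBIDs."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(mbids), size):
        yield list(mbids[start:start + size])
```

With 45 MBIDs this yields groups of 20, 20 and 5, so three queries instead of 45.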