TOPIC: [MetaBrainz Community and Development channel | MusicBrainz non-development: #musicbrainz | BookBrainz: #bookbrainz | Channel is logged; see https://musicbrainz.org/doc/IRC for details | Agenda: Reviews, MB Database Schema Change (yvanzo)
2023-05-08 12832, 2023
aerozol
mayhem: shall we meet re dataset promotion after the meeting? Or are you on late another evening this week?
2023-05-08 12835, 2023
mayhem
after the meeting I'm off to dinner. but I should be around late evenings, let me ping you when I get back from dinner.
2023-05-08 12849, 2023
aerozol
wow that backwards song is crazy!
2023-05-08 12815, 2023
aerozol
Okay give me a ping if you stay off the pingers
2023-05-08 12830, 2023
mayhem
k
2023-05-08 12809, 2023
aerozol
Or happy to coordinate via a doc or IRC, if you want to write me a little brief. Maybe if we miss each other tomorrow/tonight let’s do that
2023-05-08 12835, 2023
mayhem
A brief might be good in any case. let me do that.
2023-05-08 12803, 2023
mayhem
oh wow. matthew was spot on, he called it *days* before it happened:
TOPIC: [MetaBrainz Community and Development channel | MusicBrainz non-development: #musicbrainz | BookBrainz: #bookbrainz | Channel is logged; see https://musicbrainz.org/doc/IRC for details | Agenda: Reviews, MB Database Schema Change (yvanzo), Summit Lodgings (ruaok/atj)
2023-05-08 12844, 2023
aerozol
night all, enjoy the meeting as well!
2023-05-08 12833, 2023
mayhem
natta
2023-05-08 12800, 2023
BrainzGit
[listenbrainz-server] 14MonkeyDo opened pull request #2478 (03master…multisearch-search-by-mbid): Multisearch component: use MB API to lookup recording by MBID https://github.com/metabrainz/listenbrainz-server…
2023-05-08 12846, 2023
lucifer
monkey: ^ makes sense, but note that everywhere else in LB web, mapper data is used, so the cases mentioned in the PR would not be able to make use of the user mapping until the mapper catches up.
2023-05-08 12812, 2023
monkey
But when the user links a listen to MB and they refresh the page, that listen should show the linked recording info, no?
2023-05-08 12822, 2023
monkey
Not until it's been mapped?
2023-05-08 12819, 2023
lucifer
monkey: the recording_mbid would appear in the listen's json but no additional metadata would be available.
2023-05-08 12829, 2023
monkey
Hm.
2023-05-08 12840, 2023
lucifer
because we only retrieve metadata from the mapper's tables.
2023-05-08 12851, 2023
monkey
I suppose at least we get link to recording + cover art through our fallbacks
2023-05-08 12814, 2023
lucifer
only link to recording.
2023-05-08 12837, 2023
lucifer
afaik, you can't get a cover art from recording mbid.
2023-05-08 12848, 2023
monkey
Ah, darn
2023-05-08 12855, 2023
monkey
That's right.
2023-05-08 12824, 2023
lucifer
i would suggest adding a note in that dialog that recordings added in last 4 hours to MB will show incomplete metadata.
2023-05-08 12839, 2023
monkey
Right, that would be helpful.
2023-05-08 12828, 2023
lucifer
mayhem: going down the LB roadmap, i have been trying to MLHD+ in spark. currently unsuccessful and each retry takes about an hour to test so thinking of doing another item side by side.
2023-05-08 12836, 2023
lucifer
email-notifications are next.
2023-05-08 12841, 2023
lucifer
available to discuss?
2023-05-08 12850, 2023
mayhem
sure
2023-05-08 12820, 2023
lucifer
I am thinking of a new container: push an event to rabbitmq, and then the process in the new container receives the messages and checks the user's preference on whether to send an email or not.
2023-05-08 12839, 2023
lucifer
for digests, we will have to write events to a database table i guess.
2023-05-08 12810, 2023
lucifer
a cron job can be used to send daily/weekly digests.
2023-05-08 12844, 2023
mayhem
database table... couchdb or PG?
2023-05-08 12812, 2023
lucifer
hmm, i think PG.
2023-05-08 12820, 2023
mayhem
ok.
2023-05-08 12829, 2023
mayhem
I think this is a really good approach.
2023-05-08 12844, 2023
mayhem
one that could potentially lay the groundwork for a MeB wide notification system.
2023-05-08 12801, 2023
lucifer
possibly yes.
2023-05-08 12812, 2023
lucifer
we could move the container over to MeB and the table too.
2023-05-08 12814, 2023
mayhem
perhaps this should be a MeB service, not a LB service?
2023-05-08 12818, 2023
mayhem
lolololo.
2023-05-08 12822, 2023
mayhem
get out of my head!
2023-05-08 12827, 2023
lucifer
hehe lol
2023-05-08 12838, 2023
mayhem
and then specify the project that is sending the message.
2023-05-08 12853, 2023
mayhem
so that other projects can adopt it when they wish to.
2023-05-08 12854, 2023
lucifer
yup, makes sense.
2023-05-08 12808, 2023
lucifer
project. email text. meb user id.
2023-05-08 12813, 2023
mayhem
and we're going to have the app specify the subject and body text and stuff all that into RMQ?
2023-05-08 12824, 2023
mayhem
(as opposed to having centralized templates or somesuch)
2023-05-08 12825, 2023
lucifer
i think that would be the most flexible option.
2023-05-08 12844, 2023
mayhem
we't not worried about RMQ bandwidth, so the first option makes the most sense.
2023-05-08 12855, 2023
mayhem
*we're
2023-05-08 12813, 2023
lucifer
yup, it won't be more than a few kbs anyway.
2023-05-08 12846, 2023
mayhem
yep.
2023-05-08 12853, 2023
mayhem
yeah, sounds like a good approach.
2023-05-08 12839, 2023
mayhem
I think it would be good for you to touch base with zas/atj about how fast we can send notifications -- I know google can get pissy about sending too many emails. that needs to be taken into consideration.
2023-05-08 12806, 2023
mayhem
and, I think each email we send should have an "unsubscribe" link that does not require the user to sign-in.
2023-05-08 12814, 2023
lucifer
there's another assumption here that the username and email for a user will only be configurable at the MeB level.
2023-05-08 12838, 2023
mayhem
I think the app should specify the email address.
2023-05-08 12808, 2023
lucifer
that's possible but it creates issues with sending digests.
2023-05-08 12829, 2023
lucifer
say if a user has selected separate emails in different projects and we want to send digests.
2023-05-08 12832, 2023
mayhem
in case someone changes their email?
2023-05-08 12835, 2023
lucifer
do we send multiple emails?
2023-05-08 12807, 2023
mayhem
wait, can a user have separate emails in different projects?
2023-05-08 12811, 2023
mayhem
that's not possible now is it?
2023-05-08 12838, 2023
lucifer
currently it's not possible. but i think it was under discussion when we moved auth to MeB level.
2023-05-08 12812, 2023
mayhem
should we address that issue when that feature is implemented?
2023-05-08 12817, 2023
mayhem
the use case is unclear.
2023-05-08 12839, 2023
lucifer
sure, makes sense
2023-05-08 12843, 2023
mayhem
and simply have MeB know the right email and not specify it as part of the RMQ message.
2023-05-08 12855, 2023
lucifer
yes, we can get it from the user id.
2023-05-08 12802, 2023
mayhem
because that exposes emails more and provides more opportunities to leak data.
2023-05-08 12810, 2023
mayhem
ok, then lets do that for now.
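
[Editor's sketch] The design agreed above (the app puts project, MeB user id, subject and body into an RMQ message, and the MeB side resolves the email address and checks the user's preference) could look roughly like the following. This is a minimal illustration only, not the implementation: the names `NotificationEvent`, `encode_event`, `decode_event` and `route_event`, the preference values, and the "immediate" default are all assumptions made for this example. No RabbitMQ client is used; the functions only show the payload and the routing decision.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class NotificationEvent:
    """Hypothetical payload a project would publish to the queue.

    Note there is no email address here: per the discussion, the
    notification service looks it up from the MeB user id.
    """
    project: str       # e.g. "listenbrainz"
    meb_user_id: int
    subject: str
    body: str


def encode_event(event: NotificationEvent) -> bytes:
    """Serialize an event to the bytes that would be published."""
    return json.dumps(asdict(event)).encode("utf-8")


def decode_event(raw: bytes) -> NotificationEvent:
    """Rebuild an event from a consumed message body."""
    return NotificationEvent(**json.loads(raw.decode("utf-8")))


def route_event(event: NotificationEvent, preferences: dict) -> str:
    """Decide what the consumer does with one event.

    `preferences` is an assumed mapping of (user id, project) to one of
    "immediate", "daily", "weekly" or "off". Missing entries default to
    "immediate", which is an assumption of this sketch.
    """
    pref = preferences.get((event.meb_user_id, event.project), "immediate")
    if pref == "off":
        return "drop"
    if pref in ("daily", "weekly"):
        # The event would be written to a PG table; a cron job sends it later.
        return "queue_for_digest"
    return "send_now"
```

A consumer would call `decode_event` on each message and act on the result of `route_event`. Rate limiting and the sign-in-free unsubscribe link mentioned above are left out of the sketch.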
2023-05-08 12817, 2023
lucifer
actually how about i finish the Oauth stuff first.
2023-05-08 12834, 2023
mayhem
I would love that.
2023-05-08 12850, 2023
mayhem
that would be a huge benefit to all of our projects.
2023-05-08 12802, 2023
monkey
Indeed
2023-05-08 12811, 2023
lucifer
should we email alastair once to check if that's fine with him?
2023-05-08 12824, 2023
lucifer
i can take over the work then
2023-05-08 12825, 2023
mayhem
its fine with him. trust me.
2023-05-08 12836, 2023
lucifer
cool then
2023-05-08 12821, 2023
lucifer
i'll add a link to today's chat to the email notifications item in the roadmap for later reference and for now proceed with OAuth.
2023-05-08 12832, 2023
mayhem
great.
2023-05-08 12855, 2023
mayhem
I'm going to finish the pending emails, then I think I will knock out weekly-jams and weekly-new-jams.
Hah, I just remembered that PR while working on the datasets PR
2023-05-08 12855, 2023
monkey
OK, I'll put it on my list
2023-05-08 12805, 2023
lucifer
awesome thanks!
2023-05-08 12813, 2023
monkey
Can I deploy a PR to test?
2023-05-08 12817, 2023
lucifer
sure
2023-05-08 12838, 2023
Maxr1998_ joined the channel
2023-05-08 12856, 2023
monkey
We do get some release info from the MB lookup, so I think the user experience might not be as bad as what we were talking about. Might have cover art in some cases
2023-05-08 12833, 2023
lucifer
right, but there isn't a way to submit that release info along with the mapping.
2023-05-08 12839, 2023
Maxr1998 has quit
2023-05-08 12842, 2023
lucifer
so it would go away after page reload.
2023-05-08 12857, 2023
monkey
Right. At least users might get a decent preview in that modal.
2023-05-08 12806, 2023
lucifer
yes makes sense
2023-05-08 12817, 2023
monkey
I did add a note as discussed, hopefully we'll manage expectations.
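
[Editor's sketch] For context on the exchange above: the Cover Art Archive is keyed by release (or release group) MBID, not recording MBID, so cover art is only reachable when the recording lookup also returns releases. The snippet below illustrates that chain using the public MusicBrainz web service and Cover Art Archive URL layouts. It is a sketch under assumptions, not what the PR does: the helper names are invented for this example and no network request is made.

```python
from urllib.parse import urlencode

MB_API_ROOT = "https://musicbrainz.org/ws/2"
CAA_ROOT = "https://coverartarchive.org"


def recording_lookup_url(recording_mbid: str) -> str:
    """URL for a recording lookup that also asks for its releases."""
    query = urlencode({"fmt": "json", "inc": "releases"})
    return f"{MB_API_ROOT}/recording/{recording_mbid}?{query}"


def release_mbids(recording_json: dict) -> list:
    """Extract release MBIDs from a decoded recording lookup response."""
    return [release["id"] for release in recording_json.get("releases", [])]


def front_cover_url(release_mbid: str) -> str:
    """Cover Art Archive front image URL for a release (may 404 if no art)."""
    return f"{CAA_ROOT}/release/{release_mbid}/front"
```

Because the release info cannot be submitted along with the mapping, anything derived this way is only good for a preview in the modal, as noted in the conversation.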
2023-05-08 12806, 2023
mayhem
monkey: I think when you have a free day, you and I should take a look at the YouTube matching alg in BP and see if we can throw some more SMRTs at it to improve the matching.
2023-05-08 12810, 2023
monkey
What do you mean by matching? The search results?
2023-05-08 12840, 2023
mayhem
Finding the correct YT video given a listen.
2023-05-08 12856, 2023
mayhem
do you have edit distance support in the matching setup?
2023-05-08 12809, 2023
mayhem
are you using something like Python's unidecode module?
mayhem: I'm still unsure what you mean by "matching". We just use the track name, use it as the search query with YT's API and select the first result.
2023-05-08 12805, 2023
monkey
There's no processing being done
2023-05-08 12815, 2023
mayhem
that is the problem. :D
2023-05-08 12822, 2023
monkey
(apart from encoding the URL)
2023-05-08 12841, 2023
mayhem
using edit distance would be a simple improvement.
2023-05-08 12805, 2023
mayhem
if the edit distance between the search term and the result is too large, we should not accept the result and should tell the user we can't match the track.
2023-05-08 12812, 2023
mayhem
and that will cut down on the false positives.
2023-05-08 12838, 2023
monkey
Last we talked about this, we agreed about it not being the right place to do this sort of stuff, and that we should query a server endpoint for this (which would have YT links from MB + search via YT API)
2023-05-08 12821, 2023
monkey
It's a tricky one
2023-05-08 12836, 2023
monkey
Not sure about that approach, sometimes the video titles are wayyy different from the track title used (despite being the correct video)
2023-05-08 12818, 2023
mayhem
yeah, we're increasingly getting more emails about this, so I wonder if we should find a bandaid that throws out things that are clearly not a match.
2023-05-08 12834, 2023
mayhem
e.g. in the mapping we often look for matches with 85% similarity.
2023-05-08 12856, 2023
mayhem
here we would simply say: if a track doesn't match to 50%, or 35%, then we say: Sorry, no match.
2023-05-08 12807, 2023
monkey
Simple example i have at hand: track name used as search: "the sore feet song". video title: "The sore feet song ||Mushishi openign Full||" Levenshtein distance: 27
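
[Editor's sketch] The threshold idea and the counter-example above can be checked with a plain Levenshtein implementation. The code below is an illustration under assumptions, not BrainzPlayer's logic: the function names and the normalization (distance divided by the longer string's length, after case folding) are choices made for this example, and case folding stands in for heavier normalization such as unidecode.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def similarity(query: str, title: str) -> float:
    """Similarity in [0, 1]: 1 minus distance over the longer length."""
    q, t = query.casefold(), title.casefold()
    longest = max(len(q), len(t))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(q, t) / longest


def accept_match(query: str, title: str, threshold: float = 0.35) -> bool:
    """Reject results whose similarity falls below the threshold."""
    return similarity(query, title) >= threshold


query = "the sore feet song"
title = "The sore feet song ||Mushishi openign Full||"
print(levenshtein(query, title))         # 27, as quoted in the chat
print(accept_match(query, title, 0.35))  # True
print(accept_match(query, title, 0.50))  # False
```

With this normalization the example video would pass a 35% cutoff but fail a 50% one, even though it is the correct video. That supports the concern that extra text in video titles makes a whole-string edit distance a blunt filter.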