TOPIC: [MetaBrainz Community and Development channel | MusicBrainz non-development: #musicbrainz | BookBrainz: #bookbrainz | Channel is logged; see https://musicbrainz.org/doc/IRC for details | Agenda: Reviews, MB Database Schema Change (yvanzo)
aerozol
mayhem: shall we meet re dataset promotion after the meeting? Or are you on late another evening this week?
mayhem
after the meeting I'm off to dinner. but I should be around late evenings, let me ping you when I get back from dinner.
aerozol
wow that backwards song is crazy!
Okay give me a ping if you stay off the pingers
mayhem
k
aerozol
Or happy to coordinate via a doc or IRC, if you want to write me a little brief. Maybe if we miss each other tomorrow/tonight let’s do that
mayhem
A brief might be good in any case. let me do that.
oh wow. matthew was spot on, he called it *days* before it happened:
TOPIC: [MetaBrainz Community and Development channel | MusicBrainz non-development: #musicbrainz | BookBrainz: #bookbrainz | Channel is logged; see https://musicbrainz.org/doc/IRC for details | Agenda: Reviews, MB Database Schema Change (yvanzo), Summit Lodgings (ruaok/atj)
night all, enjoy the meeting as well!
mayhem
natta
BrainzGit
[listenbrainz-server] 14MonkeyDo opened pull request #2478 (03master…multisearch-search-by-mbid): Multisearch component: use MB API to lookup recording by MBID https://github.com/metabrainz/listenbrainz-serv...
lucifer
monkey: ^ makes sense, but note that everywhere else in the LB web, mapper data is used, so in the cases mentioned in the PR we would not be able to make use of the user mapping until the mapper comes up to date.
monkey
But when the user links a listen to MB and they refresh the page, that listen should show the linked recording info, no?
Not until it's been mapped?
lucifer
monkey: the recording_mbid would appear in the listen's json but no additional metadata would be available.
monkey
Hm.
lucifer
because we only retrieve metadata from the mapper's tables.
monkey
I suppose at least we get link to recording + cover art through our fallbacks
lucifer
only link to recording.
afaik, you can't get cover art from a recording mbid.
monkey
Ah, darn
That's right.
lucifer
i would suggest adding a note in that dialog that recordings added to MB in the last 4 hours will show incomplete metadata.
monkey
Right, that would be helpful.
lucifer
mayhem: going down the LB roadmap, i have been trying to get MLHD+ working in spark. currently unsuccessful, and each retry takes about an hour to test, so i'm thinking of doing another item side by side.
email-notifications are next.
available to discuss?
mayhem
sure
lucifer
I am thinking of a new container: push an event to rabbitmq, and then the process in the new container receives the messages and checks the user's preference on whether to send an email or not.
for digests, we will have to write events to a database table i guess.
a cron job can be used to send daily/weekly digests.
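(A minimal sketch of that flow, for illustration only: it assumes the pika client for RabbitMQ, and the queue name and helper functions are placeholders rather than the actual LB setup.)

import json
import pika  # RabbitMQ client, assumed here for illustration

def user_wants_email(user_id: int) -> bool:
    # Placeholder: look up the user's notification preference.
    return True

def store_digest_event(event: dict) -> None:
    # Placeholder: write the event to a PG table for the daily/weekly digest cron.
    pass

def send_email(user_id: int, subject: str, body: str) -> None:
    # Placeholder: hand the message off to the actual mailer.
    print(f"email to user {user_id}: {subject}")

def handle_notification(channel, method, properties, body):
    event = json.loads(body)
    user_id = event["musicbrainz_row_id"]
    if event.get("digest"):
        store_digest_event(event)
    elif user_wants_email(user_id):
        send_email(user_id, event["subject"], event["body"])
    channel.basic_ack(delivery_tag=method.delivery_tag)

connection = pika.BlockingConnection(pika.ConnectionParameters(host="rabbitmq"))
channel = connection.channel()
channel.queue_declare(queue="notifications", durable=True)
channel.basic_consume(queue="notifications", on_message_callback=handle_notification)
channel.start_consuming()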
mayhem
database table... couchdb or PG?
lucifer
hmm, i think PG.
mayhem
ok.
I think this is a really good approach.
one that could potentially lay the groundwork for a MeB-wide notification system.
lucifer
possibly yes.
we could move the container over to MeB and the table too.
mayhem
perhaps this should be a MeB service, not a LB service?
lolololo.
get out of my head!
lucifer
hehe lol
mayhem
and then specify the project that is sending the message.
so that other projects can adopt it when they wish to.
lucifer
yup, makes sense.
project. email text. meb user id.
mayhem
and we're going to have the app specify the subject and body text and stuff all that into RMQ?
(as opposed to having centralized templates or somesuch)
lucifer
i think that would be the most flexible option.
mayhem
we're not worried about RMQ bandwidth, so the first option makes the most sense.
lucifer
yup, it won't be more than a few kbs anyway.
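(For reference, the event published to RMQ might look something like this; the field names are placeholders based on the discussion, not a settled schema.)

import json

notification_event = {
    "project": "listenbrainz",    # which MeB project is sending the message
    "musicbrainz_row_id": 42,     # MeB-level user id; MeB resolves the email address
    "subject": "Your weekly jams are ready",
    "body": "Hi! Your new playlist is waiting for you.",
    "digest": False,              # True: store it for the daily/weekly digest instead
}
payload = json.dumps(notification_event).encode("utf-8")  # a few KB at most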
mayhem
yep.
yeah, sounds like a good approach.
I think it would be good for you to touch base with zas/atj about how fast we can send notifications -- I know google can get pissy about sending too many emails. that needs to be taken into consideration.
and, I think each email we send should have a "unsubscribe" link that does not require the user to sign-in.
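(A common way to do a sign-in-free unsubscribe link is a signed token in the URL; the sketch below uses only the standard library, and the secret, route, and function names are made up for illustration.)

import hashlib
import hmac

SECRET = b"server-side secret from the app config"  # placeholder

def make_unsubscribe_token(user_id: int) -> str:
    # Token to embed in e.g. https://listenbrainz.org/unsubscribe/<token> (hypothetical route).
    sig = hmac.new(SECRET, str(user_id).encode(), hashlib.sha256).hexdigest()
    return f"{user_id}.{sig}"

def verify_unsubscribe_token(token: str):
    # Returns the user id if the signature checks out, otherwise None.
    user_id, _, sig = token.partition(".")
    if not user_id.isdigit() or not sig:
        return None
    expected = hmac.new(SECRET, user_id.encode(), hashlib.sha256).hexdigest()
    return int(user_id) if hmac.compare_digest(sig, expected) else None

assert verify_unsubscribe_token(make_unsubscribe_token(42)) == 42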
lucifer
there's another assumption here that the username and email for a user will only be configurable at the MeB level.
mayhem
I think the app should specify the email address.
lucifer
that's possible but it creates issues with sending digests.
say a user has selected separate emails in different projects and we want to send digests.
mayhem
in case someone changes their email?
lucifer
do we send multiple emails?
mayhem
wait, can a user have separate emails in different projects?
that's not possible now is it?
lucifer
currently it's not possible, but i think it was under discussion when we moved auth to the MeB level.
mayhem
should we address that issue when that feature is implemented?
the use case is unclear.
lucifer
sure, makes sense
mayhem
and simply have MeB know the right email and not specify it as part of the RMQ message.
lucifer
yes, we can get it from the user id.
mayhem
because that exposes emails more and provides more opportunities to leak data.
ok, then lets do that for now.
lucifer
actually, how about i finish the OAuth stuff first?
mayhem
I would love that.
that would be a huge benefit to all of our projects.
monkey
Indeed
lucifer
should we email alastair once to check if that's fine with him?
i can take over the work then
mayhem
its fine with him. trust me.
lucifer
cool then
i'll add a link to today's chat to the email notifications item in the roadmap for later reference, and for now proceed with OAuth.
mayhem
great.
I'm going to finish the pending emails, then I think I will knock out weekly-jams and weekly-new-jams.
Hah, I just remembered that PR while working on the datasets PR
OK, I'll put it on my list
lucifer
awesome thanks!
monkey
Can I deploy a PR to test?
lucifer
sure
Maxr1998_ joined the channel
monkey
We do get some release info from the MB lookup, so I think the user experience might not be as bad as what we were talking about. Might have cover art in some cases
lucifer
right, but there isn't a way to submit that release info along with the mapping.
Maxr1998 has quit
so it would go away after page reload.
monkey
Right. At least users might get a decent preview in that modal.
lucifer
yes makes sense
monkey
I did add a note as discussed, hopefully we'll manage expectations.
mayhem
monkey: I think when you have a free day, you and I should take a look at the YouTube matching alg in BP and see if we can throw some more SMRTs at it to improve the matching.
monkey
What do you mean by matching? The search results?
mayhem
Finding the correct YT video given a listen.
do you have edit distance support in the matching setup?
are you using something like python's unidecode module?
monkey
mayhem: I'm still unsure what you mean by "matching". We just use the track name as the search query with YT's API and select the first result.
There's no processing being done
mayhem
that is the problem. :D
monkey
(apart from encoding the url)
mayhem
using edit distance would be a simple improvement.
if the edit distance between the searched term and the result is too large, we should not accept it and should tell the user we can't match the track.
and that will cut down on the false positives.
monkey
Last we talked about this, we agreed about it not being the right place to do this sort of stuff, and that we should query a server endpoint for this (which would have YT links from MB + search via YT API)
It's a tricky one
Not sure about that approach, sometimes the video titles are wayyy different from the track title used (despite being the correct video)
mayhem
yeah, we're getting increasingly more emails about this, so I wonder if we should find a bandaid that throws out things that are clearly not a match.
e.g. in the mapping we often look for matches with 85% similarity.
here we would simply say: if a track doesn't match to 50%, or 35%, then we say: Sorry, no match.
monkey
Simple example i have at hand: track name used as search: "the sore feet song". video title: "The sore feet song ||Mushishi openign Full||" Levenshtein distance: 27
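(A minimal sketch of the threshold idea, not the BrainzPlayer code: it normalizes both strings and compares the query against a same-length window of the video title, so noisy suffixes like the "||Mushishi openign Full||" one above don't sink a correct match; the function names and the 0.5 threshold are only illustrative.)

import re
import unicodedata
from difflib import SequenceMatcher

def normalize(text: str) -> str:
    # Lowercase, strip accents and punctuation, collapse whitespace.
    text = unicodedata.normalize("NFKD", text)
    text = "".join(c for c in text if not unicodedata.combining(c))
    text = re.sub(r"[^\w\s]", " ", text.lower())
    return re.sub(r"\s+", " ", text).strip()

def similarity(query: str, title: str) -> float:
    # Best ratio between the query and any query-length window of the title.
    q, t = normalize(query), normalize(title)
    if not q or not t:
        return 0.0
    window = len(q)
    return max(
        SequenceMatcher(None, q, t[start:start + window]).ratio()
        for start in range(0, max(1, len(t) - window + 1))
    )

def accept_youtube_result(query: str, title: str, threshold: float = 0.5) -> bool:
    # Reject search results whose best similarity falls below the threshold.
    return similarity(query, title) >= threshold

# The example above: a correct video with a noisy title is still accepted...
print(accept_youtube_result("the sore feet song", "The sore feet song ||Mushishi openign Full||"))
# ...while a clearly unrelated title is rejected.
print(accept_youtube_result("the sore feet song", "lofi hip hop radio"))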