TOPIC: [MetaBrainz Community and Development channel | MusicBrainz non-development: #musicbrainz | BookBrainz: #bookbrainz | Channel is logged; see https://musicbrainz.org/doc/IRC for details | Agenda: Reviews, MB Database Schema Change (yvanzo)
aerozol
mayhem: shall we meet re dataset promotion after the meeting? Or are you on late another evening this week?
mayhem
after the meeting I'm off to dinner. but I should be around late evenings, let me ping you when I get back from dinner.
aerozol
wow that backwards song is crazy!
Okay give me a ping if you stay off the pingers
mayhem
k
aerozol
Or happy to coordinate via a doc or IRC, if you want to write me a little brief. Maybe if we miss each other tomorrow/tonight let’s do that
mayhem
A brief might be good in any case. let me do that.
oh wow. matthew was spot on, he called it *days* before it happened:
TOPIC: [MetaBrainz Community and Development channel | MusicBrainz non-development: #musicbrainz | BookBrainz: #bookbrainz | Channel is logged; see https://musicbrainz.org/doc/IRC for details | Agenda: Reviews, MB Database Schema Change (yvanzo), Summit Lodgings (ruaok/atj)
night all, enjoy the meeting as well!
mayhem
natta
BrainzGit
[listenbrainz-server] 14MonkeyDo opened pull request #2478 (03master…multisearch-search-by-mbid): Multisearch component: use MB API to lookup recording by MBID https://github.com/metabrainz/listenbrainz-serv...
lucifer
monkey: ^ makes sense, but note that everywhere else in the LB web, mapper data is used, so in the cases mentioned in the PR we would not be able to make use of the user mapping until the mapper comes up to date.
monkey
But when the user links a listen to MB and they refresh the page, that listen should show the linked recording info, no?
Not until it's been mapped?
lucifer
monkey: the recording_mbid would appear in the listen's json but no additional metadata would be available.
monkey
Hm.
lucifer
because we only retrieve metadata from the mapper's tables.
monkey
I suppose at least we get link to recording + cover art through our fallbacks
lucifer
only link to recording.
afaik, you can't get cover art from a recording mbid.
monkey
Ah, darn
That's right.
lucifer
i would suggest adding a note in that dialog that recordings added to MB in the last 4 hours will show incomplete metadata.
monkey
Right, that would be helpful.
lucifer
mayhem: going down the LB roadmap, i have been trying to get MLHD+ working in spark. currently unsuccessful, and each retry takes about an hour to test, so i'm thinking of doing another item side by side.
email-notifications are next.
available to discuss?
mayhem
sure
lucifer
I am thinking of a new container: push an event to rabbitmq, and then the process in the new container receives the messages and checks the user's preference on whether to send an email or not.
for digests, we will have to write events to a database table i guess.
a cron job can be used to send daily/weekly digests.
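(A minimal sketch of that flow, for illustration only: it assumes the pika client for RabbitMQ, and the queue name and helper functions are placeholders rather than the actual LB setup.)

import json
import pika  # RabbitMQ client, assumed here for illustration

def user_wants_email(user_id: int) -> bool:
    # Placeholder: look up the user's notification preference.
    return True

def store_digest_event(event: dict) -> None:
    # Placeholder: write the event to a PG table for the daily/weekly digest cron.
    pass

def send_email(user_id: int, subject: str, body: str) -> None:
    # Placeholder: hand the message off to the actual mailer.
    print(f"email to user {user_id}: {subject}")

def handle_notification(channel, method, properties, body):
    event = json.loads(body)
    user_id = event["musicbrainz_row_id"]
    if event.get("digest"):
        store_digest_event(event)
    elif user_wants_email(user_id):
        send_email(user_id, event["subject"], event["body"])
    channel.basic_ack(delivery_tag=method.delivery_tag)

connection = pika.BlockingConnection(pika.ConnectionParameters(host="rabbitmq"))
channel = connection.channel()
channel.queue_declare(queue="notifications", durable=True)
channel.basic_consume(queue="notifications", on_message_callback=handle_notification)
channel.start_consuming()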
mayhem
database table... couchdb or PG?
lucifer
hmm, i think PG.
mayhem
ok.
I think this is a really good approach.
one that could potentially lay the groundwork for a MeB-wide notification system.
lucifer
possibly yes.
we could move the container over to MeB and the table too.
mayhem
perhaps this should be a MeB service, not a LB service?
lolololo.
get out of my head!
lucifer
hehe lol
mayhem
and then specify the project that is sending the message.
so that other projects can adopt it when they wish to.
lucifer
yup, makes sense.
project. email text. meb user id.
mayhem
and we're going to have the app specify the subject and body text and stuff all that into RMQ?
(as opposed to having centralized templates or somesuch)
lucifer
i think that would be the most flexible option.
mayhem
we're not worried about RMQ bandwidth, so the first option makes the most sense.
lucifer
yup, it won't be more than a few kbs anyway.
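(For reference, the event published to RMQ might look something like this; the field names are placeholders based on the discussion, not a settled schema.)

import json

notification_event = {
    "project": "listenbrainz",    # which MeB project is sending the message
    "musicbrainz_row_id": 42,     # MeB-level user id; MeB resolves the email address
    "subject": "Your weekly jams are ready",
    "body": "Hi! Your new playlist is waiting for you.",
    "digest": False,              # True: store it for the daily/weekly digest instead
}
payload = json.dumps(notification_event).encode("utf-8")  # a few KB at most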
mayhem
yep.
yeah, sounds like a good approach.
I think it would be good for you to touch base with zas/atj about how fast we can send notifications -- I know google can get pissy about sending too many emails. that needs to be taken into consideration.
and, I think each email we send should have a "unsubscribe" link that does not require the user to sign-in.
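(A common way to do a sign-in-free unsubscribe link is a signed token in the URL; the sketch below uses only the standard library, and the secret, route, and function names are made up for illustration.)

import hashlib
import hmac

SECRET = b"server-side secret from the app config"  # placeholder

def make_unsubscribe_token(user_id: int) -> str:
    # Token to embed in e.g. https://listenbrainz.org/unsubscribe/<token> (hypothetical route).
    sig = hmac.new(SECRET, str(user_id).encode(), hashlib.sha256).hexdigest()
    return f"{user_id}.{sig}"

def verify_unsubscribe_token(token: str):
    # Returns the user id if the signature checks out, otherwise None.
    user_id, _, sig = token.partition(".")
    if not user_id.isdigit() or not sig:
        return None
    expected = hmac.new(SECRET, user_id.encode(), hashlib.sha256).hexdigest()
    return int(user_id) if hmac.compare_digest(sig, expected) else None

assert verify_unsubscribe_token(make_unsubscribe_token(42)) == 42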
lucifer
there's another assumption here that the username and email for a user will only be configurable at the MeB level.
mayhem
I think the app should specify the email address.
lucifer
that's possible but it creates issues with sending digests.
say a user has selected separate emails in different projects and we want to send digests.
mayhem
in case someone changes their email?
lucifer
do we send multiple emails?
mayhem
wait, can a user have separate emails in different projects?
that's not possible now is it?
lucifer
currently it's not possible, but i think it was under discussion when we moved auth to the MeB level.
mayhem
should we address that issue when that feature is implemented?
the use case is unclear.
lucifer
sure, makes sense
mayhem
and simply have MeB know the right email and not specify it as part of the RMQ message.
lucifer
yes, we can get it from the user id.
mayhem
because that exposes emails more and provides more opportunities to leak data.
ok, then lets do that for now.
lucifer
actually, how about i finish the OAuth stuff first?
mayhem
I would love that.
that would be a huge benefit to all of our projects.
monkey
Indeed
lucifer
should we email alastair once to check if that's fine with him?
i can take over the work then
mayhem
its fine with him. trust me.
lucifer
cool then
i'll add a link to today's chat to the email notifications item in the roadmap for later reference, and for now proceed with OAuth.
mayhem
great.
I'm going to finish the pending emails, then I think I will knock out weekly-jams and weekly-new-jams.
Hah, I just remembered that PR while working on the datasets PR
OK, I'll put it on my list
lucifer
awesome thanks!
monkey
Can I deploy a PR to test?
lucifer
sure
Maxr1998_ joined the channel
monkey
We do get some release info from the MB lookup, so I think the user experience might not be as bad as what we were talking about. Might have cover art in some cases
lucifer
right, but there isn't a way to submit that release info along with the mapping.
Maxr1998 has quit
so it would go away after page reload.
monkey
Right. At least users might get a decent preview in that modal.
lucifer
yes makes sense
monkey
I did add a note as discussed, hopefully we'll manage expectations.
mayhem
monkey: I think when you have a free day, you and I should take a look at the YouTube matching alg in BP and see if we can throw some more SMRTs at it to improve the matching.
monkey
What do you mean by matching? The search results?
mayhem
Finding the correct YT video given a listen.
do you have edit distance support in the matching setup?
are you using something like python's unidecode module?
monkey
mayhem: I'm still unsure what you mean by "matching". We just use the track name as the search query with YT's API and select the first result.
There's no processing being done
mayhem
that is the problem. :D
monkey
(apart from encoding the url)
mayhem
using edit distance would be a simple improvement.
if the edit distance between the searched term and the result is too large, we should not accept it and should tell the user we can't match the track.
and that will cut down on the false positives.
monkey
Last we talked about this, we agreed about it not being the right place to do this sort of stuff, and that we should query a server endpoint for this (which would have YT links from MB + search via YT API)
It's a tricky one
Not sure about that approach, sometimes the video titles are wayyy different from the track title used (despite being the correct video)
mayhem
yeah, we're getting increasingly more emails about this, so I wonder if we should find a bandaid that throws out things that are clearly not a match.
e.g. in the mapping we often look for matches with 85% similarity.
here we would simply say: if a track doesn't match to 50%, or 35%, then we say: Sorry, no match.
monkey
Simple example i have at hand: track name used as search: "the sore feet song". video title: "The sore feet song ||Mushishi openign Full||" Levenshtein distance: 27
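(A minimal sketch of the threshold idea, not the BrainzPlayer code: it normalizes both strings and compares the query against a same-length window of the video title, so noisy suffixes like the "||Mushishi openign Full||" one above don't sink a correct match; the function names and the 0.5 threshold are only illustrative.)

import re
import unicodedata
from difflib import SequenceMatcher

def normalize(text: str) -> str:
    # Lowercase, strip accents and punctuation, collapse whitespace.
    text = unicodedata.normalize("NFKD", text)
    text = "".join(c for c in text if not unicodedata.combining(c))
    text = re.sub(r"[^\w\s]", " ", text.lower())
    return re.sub(r"\s+", " ", text).strip()

def similarity(query: str, title: str) -> float:
    # Best ratio between the query and any query-length window of the title.
    q, t = normalize(query), normalize(title)
    if not q or not t:
        return 0.0
    window = len(q)
    return max(
        SequenceMatcher(None, q, t[start:start + window]).ratio()
        for start in range(0, max(1, len(t) - window + 1))
    )

def accept_youtube_result(query: str, title: str, threshold: float = 0.5) -> bool:
    # Reject search results whose best similarity falls below the threshold.
    return similarity(query, title) >= threshold

# The example above: a correct video with a noisy title is still accepted...
print(accept_youtube_result("the sore feet song", "The sore feet song ||Mushishi openign Full||"))
# ...while a clearly unrelated title is rejected.
print(accept_youtube_result("the sore feet song", "lofi hip hop radio"))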