I'm working on the export functionality for Apple Music and have a question. For Spotify, our code currently retrieves the spotify_id from the Recording element, adds those tracks to the playlist, and then uses Spotify's lookup to find any remaining unplayable tracks. We don't store apple_music_ids for tracks.
Should I consider adding apple_music_id to the Recording element and the model in LB, or should we rely solely on the Labs API lookup to search for tracks? The issue with the Labs API is that it sometimes returns multiple track IDs, and we can't always be certain which one is correct.
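A minimal sketch of the two options in this question, assuming a stored ID is preferred and a lookup result is trusted only when it is unambiguous. The `Recording` class, the `lookup` callable, and the "skip on multiple candidates" policy are all hypothetical; none of them is troi's or the Labs API's actual interface.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Recording:
    """Hypothetical stand-in for a playlist recording; not troi's real class."""
    title: str
    artist: str
    apple_music_id: Optional[str] = None


def resolve_apple_music_id(
    recording: Recording,
    lookup: Callable[[str, str], list[str]],
) -> Optional[str]:
    """Return a stored ID if present, else an unambiguous lookup result, else None."""
    if recording.apple_music_id:
        return recording.apple_music_id
    # `lookup` is an assumed callable taking (artist, title) and returning track IDs.
    candidates = lookup(recording.artist, recording.title)
    # Assumed policy: only trust the lookup when it returns exactly one candidate.
    if len(candidates) == 1:
        return candidates[0]
    return None
```

With stored IDs, the ambiguous-lookup path is only reached for recordings that have no ID yet.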
2024-07-25 20737, 2024
lucifer[m]
[@kubrimskii:matrix.org](https://matrix.to/#/@kubrimskii:matrix.org) We should start storing those IDs
2024-07-25 20740, 2024
lucifer[m]
Yes
2024-07-25 20745, 2024
rimskii[m]
okay
2024-07-25 20739, 2024
rimskii[m]
So I should update troi and [this model](https://github.com/metabrainz/listenbrainz-server/blob/2031a0ccc1b8cb717431b774e31774f7d99c9d8b/data/model/listen.py#L39C1-L39C30) in LB, right?
2024-07-25 20736, 2024
mayhem[m]
with enough tweaks the nmslib mega index finished building in 2 hours with a max string length of... 20. but before I could test it crashed. lol.
2024-07-25 20752, 2024
lucifer[m]
rimskii: you only need to update troi, that model is not useful.
2024-07-25 20712, 2024
mayhem[m]
I'll finish debugging with a partial dataset and then put up a model for us to test with, before building the larger one for performance testing.
2024-07-25 20715, 2024
mayhem[m]
lucifer: do you have an updated file with fuzzy lookup test cases?
2024-07-25 20711, 2024
lucifer[m]
mayhem: nope
2024-07-25 20734, 2024
mayhem[m]
ok, I'll cobble one together.
2024-07-25 20712, 2024
derat[m] has quit
2024-07-25 20738, 2024
jesus2099 joined the channel
2024-07-25 20742, 2024
mayhem[m]
lucifer: ping
2024-07-25 20744, 2024
lucifer[m]
Pong
2024-07-25 20704, 2024
mayhem[m]
hey. I was having lunch with monkey talking about the fuzzy matching stuff.
2024-07-25 20745, 2024
mayhem[m]
and now that I am very keenly aware of the fact that string lengths are rather quite important to the performance of a fuzzy index, I want to re-examine typesense.
2024-07-25 20717, 2024
mayhem[m]
because we might just limit the size of strings to X chars and then post filter them by fetching the rows from the DB.
2024-07-25 20737, 2024
mayhem[m]
and if the strings we store in typesense are much shorter, the whole thing might become much more performant.
2024-07-25 20753, 2024
mayhem[m]
and we could do an artist index and recording index to further make the strings shorter.
2024-07-25 20758, 2024
mayhem[m]
thoughts?
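A rough sketch of the "short strings, then post-filter" idea, assuming a cutoff of 20 characters (the chat only says "X chars"). Here `difflib` from the Python standard library stands in for the fuzzy index and a plain dict stands in for the DB fetch; this is not Typesense's API, and the function names are made up for illustration.

```python
import difflib

MAX_KEY_LEN = 20  # assumed cutoff; the discussion only says "X chars"


def build_index(names):
    """Map each truncated, lowercased key to the full names that share it."""
    index = {}
    for name in names:
        index.setdefault(name.lower()[:MAX_KEY_LEN], []).append(name)
    return index


def search_truncated(index, query, cutoff=0.8):
    """Fuzzy-match on the short keys, then post-filter against the full strings."""
    q = query.lower()
    keys = difflib.get_close_matches(q[:MAX_KEY_LEN], list(index), n=5, cutoff=cutoff)
    # The dict lookup stands in for fetching the full rows from the DB.
    candidates = [name for key in keys for name in index[key]]
    return [
        c for c in candidates
        if difflib.SequenceMatcher(None, q, c.lower()).ratio() >= cutoff
    ]
```

Names that share a truncated key all come back as candidates, so the post-filter on the full string is what separates them.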
2024-07-25 20703, 2024
lucifer[m]
db fetch would likely still be slower but we can have two typesense indexes.
2024-07-25 20716, 2024
lucifer[m]
One for shorter match and one for falling back
2024-07-25 20722, 2024
mayhem[m]
yes, we will need to check the trade-offs.
2024-07-25 20742, 2024
mayhem[m]
but this path might give us the scalability we need without a lot of custom code.
2024-07-25 20742, 2024
lucifer[m]
artist and recording name matching separately makes sense yes.
2024-07-25 20706, 2024
lucifer[m]
but we also need to check match rates.
2024-07-25 20758, 2024
mayhem[m]
and we might be able to further partition the recording space. given that most typo correction algs assume that the first 2-3 characters are correct, we can build smaller indexes that might perform better.
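A small sketch of prefix partitioning, assuming a two-character prefix and using `difflib` from the standard library as a stand-in for the real fuzzy index. The function names are hypothetical.

```python
import difflib
from collections import defaultdict

PREFIX_LEN = 2  # assumed; the discussion mentions "the first 2-3 characters"


def build_partitions(names):
    """Group names into small buckets keyed by their first PREFIX_LEN characters."""
    partitions = defaultdict(list)
    for name in names:
        partitions[name.lower()[:PREFIX_LEN]].append(name)
    return partitions


def search_partitioned(partitions, query, cutoff=0.8):
    """Fuzzy-search only the bucket that shares the query's prefix."""
    bucket = partitions.get(query.lower()[:PREFIX_LEN], [])
    lowered = {name.lower(): name for name in bucket}
    hits = difflib.get_close_matches(query.lower(), list(lowered), n=5, cutoff=cutoff)
    return [lowered[h] for h in hits]
```

Each search touches one small bucket, but a query whose typo falls inside the prefix lands in the wrong bucket and finds nothing.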
2024-07-25 20723, 2024
lucifer[m]
I don't think we need to go that far.
2024-07-25 20743, 2024
lucifer[m]
At least for first try
2024-07-25 20748, 2024
mayhem[m]
possibly not... yet. but having a path for the future is also good.
2024-07-25 20759, 2024
mayhem[m]
yep. like minds...
2024-07-25 20702, 2024
lucifer[m]
I would prefer to avoid the assumption that the first two characters are right.
2024-07-25 20733, 2024
mayhem[m]
it's one I would be ok with.
2024-07-25 20747, 2024
lucifer[m]
I think it would affect match rates.
2024-07-25 20710, 2024
lucifer[m]
Assuming typos are uniformly spread.
2024-07-25 20712, 2024
adhawkins has quit
2024-07-25 20716, 2024
mayhem[m]
do we have a single test case in our tests that has a mistake in the first 2 characters? I doubt it.
2024-07-25 20729, 2024
lucifer[m]
Probably no
2024-07-25 20752, 2024
mayhem[m]
and if you check autocorrect on a mobile, if you get one of the first two characters wrong, you'll never get the right suggestion.
2024-07-25 20711, 2024
lucifer[m]
Yeah true that.
2024-07-25 20742, 2024
mayhem[m]
I think I will take a break from the nmslib approach for a bit. I would much prefer a solution based on solr or typesense.
2024-07-25 20758, 2024
lucifer[m]
Yes sounds good to me.
2024-07-25 20706, 2024
lucifer[m]
Let's try to make it work with typesense
2024-07-25 20718, 2024
lucifer[m]
And we can look into solr later if things don't work out
2024-07-25 20719, 2024
mayhem[m]
I'll try tweaking the typesense index and see how it goes.
2024-07-25 20724, 2024
mayhem[m]
ok, sounds good.
2024-07-25 20728, 2024
lucifer[m]
Sounds good thanks!
2024-07-25 20724, 2024
minimal joined the channel
2024-07-25 20741, 2024
adhawkins joined the channel
2024-07-25 20742, 2024
Jade[m] uploaded an image: (55KiB) < https://matrix.chatbrainz.org/_matrix/media/v3/download/matrix.org/PQDeDpMJzyXyyXNsNBJweTdM/image.png >
2024-07-25 20749, 2024
Jade[m]
Introducing: Metrics for the mail service!
2024-07-25 20742, 2024
pite joined the channel
2024-07-25 20753, 2024
santiagofn[m] has quit
2024-07-25 20728, 2024
bitmap[m]
Jade: that's awesome, what stats are you collecting currently?
2024-07-25 20724, 2024
Jade[m]
Currently, I've just set up counters for emails_requested, emails_sent and healthchecks
2024-07-25 20734, 2024
Jade[m]
Not sure which metrics would be the most useful
2024-07-25 20700, 2024
bitmap[m]
that's already very useful, I could see us adding some metrics for specific types of internal errors too
> TODO: Find a way to avoid wrapping inline links and find a better way of postfix URLs - another PR?
2024-07-25 20744, 2024
bitmap[m]
do the new templates use inline links anywhere? and what sort of improvements were you thinking of for the postfix URLs?
2024-07-25 20702, 2024
Jade[m] sent a code block: https://matrix.chatbrainz.org/_matrix/media/v3/download/chatbrainz.org/WQibhvSbhEIuxBXIbCjuEvrx
2024-07-25 20712, 2024
Jade[m]
The text emails currently look like this
2024-07-25 20723, 2024
Jade[m]
With borders enabled:... (full message at <https://matrix.chatbrainz.org/_matrix/media/v3/download/chatbrainz.org/KFABwZVSkOBgjMXyIInagIKy>)
2024-07-25 20724, 2024
bitmap[m]
right, it's a little bit awkward
2024-07-25 20756, 2024
Jade[m]
re postfix URLs, I was thinking markdown-style inline links?
2024-07-25 20726, 2024
Jade[m]
But I couldn't find an easy way of disabling wrapping for that, so the links got broken
2024-07-25 20735, 2024
Jade[m]
Ideally, I'd like to be able to choose to render some of the borders, but you can see in text.rs - that's basically all the control I've got without patching the library more
you need to test it out on multiple devices and browsers but if there's a common structure and finite templates then it's doable.
2024-07-25 20711, 2024
lucifer[m]
we do that for Year in Music emails.
2024-07-25 20721, 2024
bitmap[m]
I'd leave the borders off if it's all-or-nothing, yeah. but it looks pretty good overall, and most people will be viewing the HTML emails anyway, I imagine
2024-07-25 20743, 2024
Jade[m]
bitmap[m]: Yep!
2024-07-25 20730, 2024
Jade[m]
lucifer[m]: CSS is disabled by default, but when enabled it's only able to hide `display: none` elements
2024-07-25 20700, 2024
lucifer[m]
ah okay
2024-07-25 20710, 2024
bitmap[m]
Jade: looks like you were able to get Weblate syncing fine?
2024-07-25 20756, 2024
Jade[m]
Nope
2024-07-25 20758, 2024
Jade[m] uploaded an image: (33KiB) < https://matrix.chatbrainz.org/_matrix/media/v3/download/matrix.org/MxxQbctFLkqsNCYhSwOJUCQW/image.png >
2024-07-25 20744, 2024
Jade[m]
I can manually pull translations, but weblate auto-locks the project when it errors out
2024-07-25 20750, 2024
Jade[m]
I think [@outsidecontext:matrix.org](https://matrix.to/#/@outsidecontext:matrix.org) was going to look at it?
2024-07-25 20729, 2024
bitmap[m]
do you know how credentials are specified?
2024-07-25 20700, 2024
bitmap[m]
I can unlock it (and even disable "Lock on error") but that doesn't really solve the error
2024-07-25 20751, 2024
Jade[m]
bitmap[m]: I can't find the settings in my interface, and I've looked 😅
lucifer: a typesense based artist index takes 15ms - 25ms to search. 1-2ms using nmslib.
2024-07-25 20700, 2024
bitmap[m]
hrm. we also have some configuration to allow weblate-metabrainz to push (and force push) to the translations branch, not sure if that's relevant
2024-07-25 20759, 2024
Jade[m]
I've tried a few different setups - the one I've left it on is the default GitHub PR workflow
2024-07-25 20719, 2024
Jade[m]
I'd tried one that should have pushed directly to main as well
2024-07-25 20731, 2024
Jade[m]
But that didn't work either
2024-07-25 20711, 2024
Jade[m]
It might be helpful to see the setup of one of the other projects?
2024-07-25 20723, 2024
lucifer[m]
mayhem: can you also test artist name and recording name lookup simultaneously?
2024-07-25 20728, 2024
bitmap[m]
Jade[m]: I'm not even sure where to look TBH, I think yvanzo and outsidecontext will have to help 🙏
2024-07-25 20733, 2024
bitmap[m]
our internal syswiki lists the credentials we use, but not much about the setup at the moment
2024-07-25 20744, 2024
Jade[m]
Weblate is a bit confusing tbh
2024-07-25 20726, 2024
Jade[m]
But this should be the only issue, given I *can* pull manually
2024-07-25 20713, 2024
Jade[m]
So, summary since last Thursday:... (full message at <https://matrix.chatbrainz.org/_matrix/media/v3/download/chatbrainz.org/YAVTDSStRcAuuFRWgjmZPkql>)
very productive week! with a working Docker image, we can look into setting up your project on metabrainz servers soon and point test.musicbrainz.org to it eventually
2024-07-25 20731, 2024
bitmap[m]
will you be looking into patching musicbrainz-server to communicate with the mb-mail-service soon?
2024-07-25 20707, 2024
bitmap[m]
I guess to start with we won't be reading subscriptions from the mail service
2024-07-25 20720, 2024
Jade[m]
Yeah, definitely
2024-07-25 20748, 2024
Jade[m]
I really don't know any perl though, so my code is likely to be shocking
2024-07-25 20710, 2024
Jade[m]
bitmap[m]: Yep. I haven't done any work on that yet
2024-07-25 20736, 2024
Jade[m]
And the subscriptions template isn't fully functional yet
2024-07-25 20706, 2024
bitmap[m]
haha. MB's Perl isn't so bad, and I can help guide you through it
2024-07-25 20752, 2024
Jade[m]
Jade[m]: When I start on that it'll be my turn to crib from [@yellowhatpro:matrix.org](https://matrix.to/#/@yellowhatpro:matrix.org) 😁
2024-07-25 20707, 2024
Jade[m]
bitmap[m]: Thank you 😊
2024-07-25 20713, 2024
bitmap[m]
on wolf we have an MB docker mirror running -- if it would be helpful to you, we could give you ssh access there to test your mail service setup (unless you already have a full working setup locally)
2024-07-25 20757, 2024
Jade[m]
bitmap[m]: I have a Dev setup with the testing database as described in the musicbrainz-docker repo
2024-07-25 20724, 2024
Jade[m]
Which should be good enough for local devel
2024-07-25 20744, 2024
Jade[m]
Not sure how accurately that represents production though
2024-07-25 20745, 2024
yellowhatpro[m]
<Jade[m]> "When I start on that it'll be my..." <- for the amount of times I have taken reference from your configs, I highly doubt you will need my help 🤣🤣
2024-07-25 20746, 2024
yellowhatpro[m]
But even if it's 1 percent of the case it's helpful then I am really heppyy
2024-07-25 20750, 2024
bitmap[m]
cool, much easier to use that then
2024-07-25 20753, 2024
Jade[m]
yellowhatpro[m]: I technically haven't written a single project which interacts with a SQL database! I'll at least be looking at your sqlx code 😀
2024-07-25 20702, 2024
adhawkins has quit
2024-07-25 20703, 2024
yellowhatpro[m]
ohh nicee lesgooo 🚀 😁😁
2024-07-25 20704, 2024
yellowhatpro[m]
thanks to rustynova it smells less now, but yeah worth a watch ig
2024-07-25 20728, 2024
Jade[m]
If you have any questions about anything or want any help, don't hesitate to ask me haha
2024-07-25 20733, 2024
Jade[m]
I'm a follower of the Not Rocket Science Rule as much as possible, so my CI setup is comprehensive 😆
trust me I won't, it has only helped me develop instincts, and since Jan this year, I feel a lot more confident coding in Rust now
2024-07-25 20729, 2024
bitmap[m]
<Jade[m]> "I really don't know any perl..." <- one small hint for lib/MusicBrainz/Server/Email.pm: `$self->c->lwp` should give you access to an [LWP instance](https://metacpan.org/pod/LWP) that can be used to send requests