reosarevok: yvanzo: I don’t think ‘usually vertical’ is that helpful in identifying something that doesn’t have to be vertical, even if it is the most common. I understand that it might help differentiate it from banners (which really are all horizontal, I think?), but unfortunately a lot of posters can be landscape or square
2024-06-21 17344, 2024
aerozol[m]
If we need the description to be longer I would just take one of the dictionary definitions, since there’s people at Merriam-Webster or whatever being paid to argue about these things (I assume)
2024-06-21 17347, 2024
aerozol[m]
Something like “Usually a large printed or digital sheet, that often contains pictures, and is posted publicly.” (slightly modified Merriam-Webster definition)
2024-06-21 17344, 2024
aerozol[m]
atj: and/or yvanzo : woohoo, Solr upgrade!! Just double checking that I am okay to announce “we have upgraded our Solr cluster to 9.6.1” (or something along those lines) on our socials?
2024-06-21 17357, 2024
aerozol[m]
Time to show off your hard work :D
2024-06-21 17322, 2024
aerozol[m]
From what I gather from the message history it is no longer just on beta?
2024-06-21 17349, 2024
pite has quit
2024-06-21 17313, 2024
Kladky joined the channel
2024-06-21 17321, 2024
serene-arc[m] joined the channel
2024-06-21 17321, 2024
serene-arc[m]
Hi all! I'm an app developer, writing a tool to upload playlists to ListenBrainz. I'm running into a little problem with the API and how to resolve songs to MBIDs. Would I be able to pick anyone's brain about that?
2024-06-21 17347, 2024
lucifer
serene-arc[m]: sure what's the issue?
2024-06-21 17331, 2024
d4rk has quit
2024-06-21 17354, 2024
d4rk joined the channel
2024-06-21 17334, 2024
lucifer
rimskii[m]: i have created the necessary tables on wolf. try again now.
2024-06-21 17310, 2024
serene-arc[m]
So my project is the one linked below for reference. It searches the file tags to get the metadata to send to the ListenBrainz API. However, for whatever reason, it isn't the best at matching songs that don't have regular artist fields, so up to 15%-ish of songs aren't matched.
2024-06-21 17310, 2024
serene-arc[m]
I notice that other tools such as mpdscribble don't seem to have this problem, so I was wondering if there's a way I'm using the API wrongly or something. Any help would be great!
Jade: I meant to fully sign-off a couple of email templates today, but ran out of time. I will try get round to it this weekend! The design won’t change I don’t think, I just want to adjust the text and maybe the links, and get some more to the community for feedback. So hopefully you won’t have to redo anything
2024-06-21 17359, 2024
serene-arc[m]
* lucifer (IRC): So my
2024-06-21 17312, 2024
aerozol[m]
Jade: If the screenshot you showed bitmap is of an email you’ve devved, that looks awesome!! No problem re. making the font size a bit larger. FYI the emails I get the most are subscription emails and they can contain a *lot* of items. So it can be nice to have some oversight/not make them too big. But I imagine this will be easy to tweak later
lucifer (IRC): we currently use that one! Unfortunately, it doesn't always work, at least as expected. Below are a couple of the curl commands that show the problem. I'm not entirely sure if it's several problems that are having the same effect, or the same one with different cases.... (full message at <https://matrix.chatbrainz.org/_matrix/media/v3/download/chatbrainz.org/GfIklXTUNfgISlKMPPRrVPdJ>)
2024-06-21 17348, 2024
lucifer
serene-arc[m]: i see, unfortunately the current mapping system doesn't take aliases into account.
2024-06-21 17359, 2024
lucifer
the selena gomez x marshmello one resolves if the x is removed, that should be simple to fix i think just missing x as a possible join phrase in mapping.
2024-06-21 17357, 2024
serene-arc[m]
lucifer (IRC): that's unfortunate that it doesn't take it into account. is there any way for me to use the resolution system that the last.fm proxy uses, or is that not publicly accessible?
2024-06-21 17335, 2024
lucifer
serene-arc[m]: do you mean the LFM compatible API that ListenBrainz supports?
2024-06-21 17330, 2024
lucifer
if so, it goes through the same system as the API endpoint you are using.
2024-06-21 17346, 2024
lucifer
i wonder if mpdscribble depends on people tagging their collections with Picard first?
2024-06-21 17325, 2024
serene-arc[m]
I'm not sure honestly, because it does manage to resolve the Selena Gomez song above when I play it. beets, of which I'm a maintainer, does the same thing as Picard if you haven't heard of it. My backup solution was to try and pull the MBID directly from the file, at the cost of giving up on these files for those that don't use beets/picard
2024-06-21 17329, 2024
lucifer
sorry i am a bit confused. to be clear, Picard uses a different way to match stuff than the LB endpoint i mentioned.
2024-06-21 17315, 2024
lucifer
when does the selena gomez song resolve and when does it not?
2024-06-21 17325, 2024
outsidecontext[m
serene-arc: does the listen get submitted to LB without MBIDs and LB does the resolving server side or do you try to resolve MBID first locally and then submit with the found MBID?
2024-06-21 17353, 2024
rimskii[m]
<lucifer> "rimskii: i have created the..." <- just tested it, it works !
2024-06-21 17355, 2024
rimskii[m]
thank you :)
2024-06-21 17348, 2024
serene-arc[m]
lucifer (IRC): so my library is already organised with picard, and all of the metadata is consistent with that on MusicBrainz. My program to upload the playlists to ListenBrainz takes the file and reads the artist and title fields of the songs, and tries to find a match with the LB API. The correct MBID for that song is ff67fcb7-365a-4164-87e9-ef7768767528, but ListenBrainz fails to get that, even though the data is the same,
2024-06-21 17348, 2024
serene-arc[m]
nominally.
2024-06-21 17348, 2024
serene-arc[m]
outsidecontext to make a playlist, the MBID must be included.
but the results don't give that when searched with the same data through the LB API
2024-06-21 17342, 2024
lucifer
i see.
2024-06-21 17310, 2024
lucifer
so there are multiple issues here, first not getting a match and second is getting a different match.
2024-06-21 17307, 2024
lucifer
the second one is intentional, we have a concept of canonical data - a recording can be present on multiple releases, so we have a list of custom sorts to choose the "appropriate" one from MB.
2024-06-21 17339, 2024
lucifer
we are working on improving this by letting users specify a release name as well during the search/mapping.
2024-06-21 17315, 2024
serene-arc[m]
So far I haven't been worried about getting alternate matches for the song. It's mostly about getting errors when no songs are returned
2024-06-21 17334, 2024
outsidecontext[m
serene-arc: sorry, I missed that it is about playlists. I thought it is about submission because of the mpdscribble comparison
2024-06-21 17353, 2024
lucifer
yes makes sense, that is something we need to fix.
2024-06-21 17307, 2024
outsidecontext[m
The majority of LB submission clients will first try to get the MBID from the file and use this for submission (if they support MBIDs in the first place), and if there is none leave the resolving to LB. If you need to resolve the MBIDs locally I'd still suggest to use the already found MBID first and do the lookup only if it is missing. That avoids false matches for MB tagged files
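The order outsidecontext describes (embedded MBID first, lookup only when it is missing) can be sketched roughly as below. This is only an illustration: `resolve_recording_mbid` and the `lookup` callable are hypothetical names, and it assumes the file tags were already read into a dict where the recording MBID sits under `musicbrainz_trackid`, as in Vorbis-style tags written by Picard.

```python
import re
from typing import Callable, Mapping, Optional

# Canonical lowercase UUID form used for MBIDs.
MBID_RE = re.compile(
    r"^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$"
)


def resolve_recording_mbid(
    tags: Mapping[str, str],
    lookup: Callable[[str, str], Optional[str]],
) -> Optional[str]:
    """Return a recording MBID, preferring the one embedded in the tags.

    `tags` is assumed to be a plain dict of already-read file tags.
    `lookup` is a hypothetical callable (artist, title) -> MBID or None,
    e.g. a thin wrapper around whatever remote lookup the tool uses.
    """
    embedded = (tags.get("musicbrainz_trackid") or "").strip().lower()
    if MBID_RE.match(embedded):
        # Tagged file: trust the embedded ID and skip the remote lookup.
        return embedded

    artist = tags.get("artist")
    title = tags.get("title")
    if not artist or not title:
        # Not enough metadata to attempt a lookup.
        return None
    return lookup(artist, title)
```

With this shape, files tagged by Picard or beets never hit the lookup at all, so a false match can only happen for untagged files.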
2024-06-21 17321, 2024
lucifer
can you give me a list of all the songs that you tried?
2024-06-21 17352, 2024
lucifer
so far i see two issues - one because we don't check for artist alias and second because x is not treated as punctuation/join phrase.
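To illustrate the join phrase issue: a client-side sketch of treating "x" as a join phrase when splitting an artist credit could look like the following. This is not the ListenBrainz mapping code; the join phrase list and the function names are assumptions made up for the example.

```python
import re

# Hypothetical join phrases; "x" is the one discussed above.
# Each word must be surrounded by whitespace so names like "Xzibit" stay intact.
_JOIN_RE = re.compile(
    r"\s+(?:feat\.?|ft\.?|featuring|with|and|vs\.?|x|&)\s+|\s*,\s*",
    re.IGNORECASE,
)


def split_artist_credit(credit: str) -> list[str]:
    """Split a combined artist credit into individual artist names."""
    return [part.strip() for part in _JOIN_RE.split(credit) if part.strip()]


def credit_key(credit: str) -> tuple[str, ...]:
    """Order-insensitive, case-insensitive key for comparing two credits."""
    return tuple(sorted(name.casefold() for name in split_artist_credit(credit)))


print(split_artist_credit("Selena Gomez x Marshmello"))
```

A naive split like this also breaks names that legitimately contain a join word (for example a band name with "and" in it), so it is only safe as a fallback after an exact match has failed.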
2024-06-21 17328, 2024
lucifer
i want to see if there are any low hanging fruits or obvious bugs that can be fixed soon-ish.
2024-06-21 17314, 2024
serene-arc[m]
I can give several!... (full message at <https://matrix.chatbrainz.org/_matrix/media/v3/download/chatbrainz.org/aMyJokVcVTSQqourTJcatpAX>)
2024-06-21 17349, 2024
serene-arc[m]
I'm not sure what the last one is. That's from the Cyberpunk soundtrack and some of the songs resolve, some don't, and all have that odd alias system
2024-06-21 17303, 2024
serene-arc[m]
I'll switch to using the local MBID if it exists though, thank you.
(rrsync is installed with rsync on Debian & Ubuntu) so no deployment needed, just the SSH key
2024-06-21 17318, 2024
yvanzo[m]
Hi aerozol: Yes, it is already in use by the main servers, and we keep doing adjustments, but it won’t be immediately available to mirrors.
2024-06-21 17340, 2024
yvanzo[m]
aerozol: About posters, yes we need the descriptions to be longer, I would say descriptive enough, that is to make sure that even newcomers (ignoring the context) are getting the whole (more or less broad) meaning we are trying to capture through this type. From that perspective, your proposal does the job.
2024-06-21 17334, 2024
yvanzo[m]
aerozol: Also, yes I did look into dictionaries at first and took inspiration from these but last week's discussion about this term has had no echo.
2024-06-21 17325, 2024
yvanzo[m]
Hi atj, how can we keep `files/var/lib/solr/solr.xml` from the Ansible role in sync with `solr.xml` from `mbsssss`?
aerozol, yvanzo : ok, I did "Usually a large printed or digital sheet, that often contains pictures, and is generally posted publicly to promote the event." combining what we had and what you suggested
2024-06-21 17304, 2024
lucifer
mayhem: hi! can you push your nmslib search prototypes to github and share the link?
<lucifer> "rimskii: i have created the..." <- lucifer: do tables contain data for spotify id tracks?
2024-06-21 17326, 2024
rimskii[m]
trying to test here, but it doesnt give any data for it
2024-06-21 17330, 2024
rimskii[m] sent a code block: https://matrix.chatbrainz.org/_matrix/media/v3/download/chatbrainz.org/pyoGVESZZnJMbDeGMAGoOart
2024-06-21 17331, 2024
lucifer
rimskii[m]: yes but only a limited number of artists/tracks/albums
2024-06-21 17359, 2024
rimskii[m]
okay
2024-06-21 17329, 2024
mayhem[m]
<lucifer> "mayhem: hi! can you push your..." <- will do in a bit. I should really add some comments because the code does some weird shit right now.
2024-06-21 17344, 2024
Sophist-UK joined the channel
2024-06-21 17348, 2024
lucifer
rimskii[m]: in the terminal on wolf, run `psql URL_TO_MB_DATABASE_HERE` and then you can query something like `select * from spotify_cache.artist`, `spotify_cache.track` etc.
2024-06-21 17313, 2024
lucifer
to see what data is there. i copied ~25000 tracks for ~100 popular artists.
2024-06-21 17307, 2024
lucifer
mayhem[m]: 👍
2024-06-21 17358, 2024
rimskii[m]
okay, thanks !!
2024-06-21 17350, 2024
mayhem[m]
lucifer (IRC): did you deploy rimskii 's work to prod yesterday? is that the first gsoc work to go into prod this year?
2024-06-21 17323, 2024
lucifer
mayhem[m]: not yet deployed, a LB PR that depends on it also needs to be merged first. hopefully today.
2024-06-21 17307, 2024
rimskii[m]
lucifer: yay
2024-06-21 17320, 2024
mayhem[m]
impressive.
2024-06-21 17336, 2024
mayhem[m] checks clock. june. well done rimskii!
2024-06-21 17349, 2024
ansh[m] has quit
2024-06-21 17333, 2024
ahvalmissaamine
!recall applause!
2024-06-21 17334, 2024
BrainzBot
I'm sorry, I don't remember "applause!", are you sure I should know about it?
the CSV file is on wolf: wolf:~/metabrainz/fast_fuzzy
2024-06-21 17324, 2024
mayhem[m]
since everything is in ram, once build is done, a crude search prompt appears: "u2,where the streets have no name" is a valid query.
2024-06-21 17341, 2024
mayhem[m]
I am confused about the slow down in indexing speed -- something odd is happening.
2024-06-21 17356, 2024
lucifer
makes sense.
2024-06-21 17302, 2024
mayhem[m]
but if we can work that out, then we can create indexes via multiple cores. pretty easy.
2024-06-21 17306, 2024
lucifer
for testing did you use another script?
2024-06-21 17313, 2024
mayhem[m]
that should drastically reduce the indexing time.
2024-06-21 17329, 2024
mayhem[m]
mayhem[m]: all in one.
2024-06-21 17344, 2024
mayhem[m]
* in one. wrt testing script
2024-06-21 17301, 2024
lucifer
i see
2024-06-21 17306, 2024
mayhem[m] heads to the office
2024-06-21 17319, 2024
mayhem[m]
curious to see what you think and if you can spot a stooopid mistake. :)
2024-06-21 17342, 2024
lucifer
i mean did you do any stress testing or batch of queries to arrive at the latency number?
2024-06-21 17320, 2024
mayhem[m]
no serious testing as of yet.
2024-06-21 17329, 2024
lucifer
makes sense
2024-06-21 17341, 2024
mayhem[m]
but ESB and VA should be the most extreme search cases.
2024-06-21 17300, 2024
lucifer
i think solr supports vector search too, so you could create vectors using tf-idf and hand it off to solr and then for search generate the query vector and query solr by that.
2024-06-21 17309, 2024
lucifer
wrt the disk serialization comment.
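For readers unfamiliar with the idea, here is a minimal pure-Python sketch of tf-idf vectors over character trigrams with brute-force cosine search. It is not mayhem's prototype and does not use nmslib or Solr; the class name `TfidfIndex`, the trigram choice and the idf smoothing are all assumptions for illustration. In the setup lucifer describes, the normalized vectors would be handed to a vector store instead of being compared in a loop.

```python
import math
from collections import Counter


def trigrams(text: str) -> list[str]:
    """Character trigrams of the case-folded text, padded at both ends."""
    padded = f"  {text.casefold()} "
    return [padded[i:i + 3] for i in range(len(padded) - 2)]


class TfidfIndex:
    """Tiny in-memory tf-idf index with brute-force cosine search."""

    def __init__(self, docs):
        self.docs = list(docs)
        counts = [Counter(trigrams(d)) for d in self.docs]
        # Document frequency: in how many documents each trigram occurs.
        df = Counter()
        for c in counts:
            df.update(c.keys())
        n = len(self.docs)
        # Smoothed idf so that every known trigram gets a positive weight.
        self.idf = {g: math.log((1 + n) / (1 + k)) + 1.0 for g, k in df.items()}
        self.vectors = [self._vectorize(c) for c in counts]

    def _vectorize(self, counts):
        # Trigrams never seen at index time are dropped from the vector.
        vec = {g: tf * self.idf[g] for g, tf in counts.items() if g in self.idf}
        norm = math.sqrt(sum(w * w for w in vec.values()))
        return {g: w / norm for g, w in vec.items()} if norm else {}

    def search(self, query: str, k: int = 3):
        """Return up to k (score, document) pairs, best match first."""
        q = self._vectorize(Counter(trigrams(query)))
        scored = [
            (sum(w * v.get(g, 0.0) for g, w in q.items()), doc)
            for v, doc in zip(self.vectors, self.docs)
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:k]


index = TfidfIndex([
    "u2,where the streets have no name",
    "u2,with or without you",
    "selena gomez,wolves",
])
print(index.search("u2,where the streats have no name", k=1))
```

Because the vectors are unit length, the dot product is the cosine similarity, which is what makes them suitable for an approximate nearest-neighbour index later on.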
2024-06-21 17347, 2024
Maxr1998 has quit
2024-06-21 17342, 2024
Maxr1998 joined the channel
2024-06-21 17313, 2024
rimskii[m]
<lucifer> "rimskii: in the terminal on wolf..." <- does musicbrainz_db contain spotify_cache tables?
2024-06-21 17326, 2024
rimskii[m] uploaded an image: (326KiB) < https://matrix.chatbrainz.org/_matrix/media/v3/download/matrix.org/BIVGFbIKaXqGSWxolTLnGdNV/Screenshot%202024-06-21%20at%2015.04.16.png >
2024-06-21 17327, 2024
rimskii[m]
can't find anything
2024-06-21 17335, 2024
rimskii[m]
no spotify_cache table
2024-06-21 17347, 2024
rimskii[m]
checking other tables, but there are no data either
2024-06-21 17339, 2024
rimskii[m]
ok there is data for artists tb
2024-06-21 17320, 2024
lucifer
rimskii[m]: yes spotify related tables are in spotify_cache schema
2024-06-21 17347, 2024
rimskii[m]
no spotify_cache schema (?)... (full message at <https://matrix.chatbrainz.org/_matrix/media/v3/download/chatbrainz.org/gVvkPrbrTwyBFIkGEbwaERuz>)
2024-06-21 17305, 2024
mayhem[m]
<lucifer> "i think solr supports vector..." <- something we should explore, for sure.
yvanzo[m]: Seems like I don't have permissions to view the drafts
2024-06-21 17326, 2024
atj[m]
but I've created an account now
2024-06-21 17314, 2024
yvanzo[m]
atj: And now?
2024-06-21 17353, 2024
atj[m]
yep, all good now. thanks!
2024-06-21 17316, 2024
mayhem[m]
lucifer (IRC): did you get a chance to read the fuzzy index code? any thoughts?
2024-06-21 17358, 2024
lucifer
mayhem[m]: just took a cursory look so far, will read in detail and let you know in a while
2024-06-21 17309, 2024
mayhem[m]
k
2024-06-21 17336, 2024
mayhem[m]
I'll adapt it to run on +1 threads so it can finally finish building for a full test.
2024-06-21 17343, 2024
atj[m]
<yvanzo[m]> "atj, zas: Just drafted a blog..." <- I reworded it quite a bit. Hope that's OK! It's a bad habit - once I start tweaking things I get a bit carried away.