#metabrainz

4:36 AM
holycow23[m] joined the channel

2025-06-30

4:36 AM
holycow23[m]

lucifer: can you help me fetch listens of some previous weeks since the db seems to be a little small to generate some of the stats

4:51 AM
pite has quit

6:02 AM
nbin has quit

6:21 AM
nbin joined the channel

6:34 AM
lucifer[m]

holycow23: you can get the last 30 days' dumps by running `request_import_incremental` on wolf.

6:35 AM
lucifer[m]

since you have at least one existing dump, i think it should work and import all incremental dumps available and not present in your installation.

6:36 AM
lucifer[m]

rayyan_seliya123: okay, i'll check that error.

6:37 AM
lucifer[m]

suvid: just take a look at `requirements.txt` and add it there. you'll need to run `./develop.sh build` after that so that it is available in the container.

6:37 AM
lucifer[m]

even if you are saving extracted files to disk, you should take the precautions to avoid running into an infinite loop.

6:55 AM
lucifer[m]

rayyan_seliya123: you can start working on integrating Internet Archive into Brainzplayer, you'll need a search API to find tracks in IA but for now just use hardcoded data and assume the endpoint exists. you can work in a new PR/branch.

6:57 AM
suvid[m]

<lucifer[m]> "suvid: just take a look at `..." <- But what about the version?

6:57 AM
lucifer[m]

you can specify the version in requirements.txt, most of the dependencies specify it

7:02 AM
petitminion joined the channel

7:25 AM
reosarevok[m] joined the channel

7:25 AM
reosarevok[m]

Jeez. Even eBird has had to set up anubis it seems

7:25 AM
reosarevok[m]

The huge value to AI scrapers of... bird data?

7:39 AM
rayyan_seliya123

<lucifer[m]> "rayyan_seliya123: okay, i'll..." <- Sure! Also, can you check exactly what error I am getting in my last commit to that PR? The listenbrainz-server tests were failing after I pushed that commit!

7:41 AM
rayyan_seliya123

<lucifer[m]> "rayyan_seliya123: you can..." <- Sure I will create a new pr and start working on it ! Will need help of yours if stuck !

7:52 AM
petitminion has quit

7:53 AM
mamanullah7[m] has quit

7:55 AM
reosarevok[m]

https://musicbrainz.org/statistics/timeline/count…

7:55 AM
reosarevok[m]

Our editor numbers are doing a Tesla!

7:57 AM
reosarevok[m]

Guess the script worked well then

8:16 AM
Kladky joined the channel

8:30 AM
nobiz has quit

8:46 AM
ansh[m] joined the channel

8:46 AM
ansh[m]

holycow23: For the PR LB#3308, can you share a sample API payload data, so that i can mock and test?

8:46 AM
BrainzBot

User genre activity: https://github.com/metabrainz/listenbrainz-server…

9:00 AM
nobiz joined the channel

9:12 AM
__BrainzGit

[musicbrainz-server] 14reosarevok opened pull request #3586 (03master…flow-274): [WIP] Update Flow to 0.274.1 https://github.com/metabrainz/musicbrainz-server/…

9:19 AM
petitminion joined the channel

9:21 AM
__BrainzGit

[listenbrainz-server] 14anshg1214 merged pull request #3304 (03master…personal-rec-modal-tests): Rewrite tests for PersonalRecommendationsModal https://github.com/metabrainz/listenbrainz-server…

10:49 AM
__BrainzGit

[metabrainz.org] 14mayhem merged pull request #508 (03metabrainz-notifications…metabrainz-notifications): Add notification/send endpoint. https://github.com/metabrainz/metabrainz.org/pull…

11:44 AM
suvid[m]

* lucifer: when I am reading the listens from the files i am using ijson to read each json one by one from the list of json files in spotify listening history

11:44 AM
suvid[m]

so now should i submit listens one by one?

11:44 AM
suvid[m]

cuz if i club them together to submit in a batch, then it might become a big list and would consume a lot of memory

11:45 AM
lucifer[m]

[@suvid:matrix.org](https://matrix.to/#/@suvid:matrix.org) chunk them in list of 100.
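
A minimal sketch of the batching suggested here, assuming each history file holds a single top-level JSON array. `chunked` and `iter_history_entries` are hypothetical helper names, not existing ListenBrainz code; `ijson.items(f, "item")` is the ijson call that yields the elements of a top-level array one at a time.

```python
from itertools import islice


def chunked(iterable, size=100):
    """Yield lists of at most `size` items without loading the whole input."""
    iterator = iter(iterable)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def iter_history_entries(path):
    """Stream entries from one history file (assumed: a top-level JSON array)."""
    import ijson  # third-party; imported lazily so chunked() works without it

    with open(path, "rb") as f:
        yield from ijson.items(f, "item")
```

Because both pieces are generators, something like `for batch in chunked(iter_history_entries(path)): ...` keeps only one batch of 100 entries in memory at a time.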

11:46 AM
suvid[m]

and i need to parse listens according to https://listenbrainz.readthedocs.io/en/latest/use… this right?

11:47 AM
suvid[m]

basically i need to extract relevant info for listens from the spotify listen and craft a listen according to the listenbrainz specification?

11:47 AM
suvid[m]

and then submit it right?

11:48 AM
lucifer[m]

Yes.
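
A rough sketch of that conversion, under stated assumptions: the Spotify field names (`ts`, `master_metadata_track_name`, `master_metadata_album_artist_name`, `master_metadata_album_album_name`, `spotify_track_uri`) are taken from the extended streaming history export as commonly seen, and the output follows the `listened_at` / `track_metadata` shape of the ListenBrainz listen format. `spotify_entry_to_listen` is a hypothetical name, and the optional `artist_name` argument is only there so a better artist value can be supplied by the caller.

```python
from datetime import datetime


def spotify_entry_to_listen(entry, artist_name=None):
    """Build a ListenBrainz-style listen dict from one history entry, or None."""
    track_name = entry.get("master_metadata_track_name")
    artist = artist_name or entry.get("master_metadata_album_artist_name")
    if not track_name or not artist:
        # Rows without a track name or artist (e.g. podcasts) cannot be submitted.
        return None

    # `ts` is assumed to be an ISO 8601 UTC string such as "2024-01-01T00:00:00Z".
    played_at = datetime.fromisoformat(entry["ts"].replace("Z", "+00:00"))

    additional_info = {"music_service": "spotify.com"}
    uri = entry.get("spotify_track_uri")  # assumed form: "spotify:track:<id>"
    if uri:
        track_id = uri.rsplit(":", 1)[-1]
        additional_info["spotify_id"] = "https://open.spotify.com/track/" + track_id

    track_metadata = {
        "artist_name": artist,
        "track_name": track_name,
        "additional_info": additional_info,
    }
    album = entry.get("master_metadata_album_album_name")
    if album:
        track_metadata["release_name"] = album

    return {"listened_at": int(played_at.timestamp()), "track_metadata": track_metadata}
```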

11:48 AM
lucifer[m]

You can take a look at the existing spotify listens importer and reuse code if the format is the same between the API and the downloaded JSON files

11:52 AM
suvid[m]

the format seems to be a bit different, i just checked now

11:52 AM
suvid[m]

suvid[m]: the one from the api and one from extended listening history

11:53 AM
suvid[m]

extended history gives only 1 artist name and not all 👀

11:53 AM
lucifer[m]

sure, you can use the code and docs just as a reference then.

11:54 AM
suvid[m]

Also, after completing the spotify importer, should i create the UI for easier testing or create importers for other services first?

11:59 AM
lucifer[m]

UI and testing.

11:59 AM
lucifer[m]

aim is to complete and integrate one importer first.

12:00 PM
suvid[m]

okay I just realized lucifer

12:00 PM
suvid[m]

spotify extended streaming history does not give the track artist

12:00 PM
suvid[m]

it just gives the album artist

12:01 PM
suvid[m]

but track name and track artist are the 2 minimum required things to submit a listen right?

12:01 PM
lucifer[m]

suvid: you can query our spotify metadata cache for the artist details with the track id. if not present there query the spotify api.

12:01 PM
suvid[m] uploaded an image: (42KiB) < https://matrix.chatbrainz.org/_matrix/media/v3/download/matrix.org/YhUxlfcErxGCUKWBEMpztXKg/image.png >

12:01 PM
suvid[m]

like this is a listen

12:02 PM
suvid[m]

lucifer[m]: spotify metadata cache?

12:02 PM
suvid[m]

where can i find it?

12:03 PM
petitminion has quit

12:03 PM
lucifer[m]

https://github.com/metabrainz/listenbrainz-server…

12:03 PM
lucifer[m]

these tables are defined in timescale db.

12:04 PM
suvid[m]

what is the format of data returned from `spotify_cache.track`?

12:04 PM
lucifer[m]

also, add a fallback for API use when data is not found here. in development at the moment, these tables are empty but i'll update sample dumps to create some sample data here.
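
The cache-then-API fallback can be sketched as below. This is only an illustration: `resolve_track_artists`, `cache_lookup` and `api_lookup` are hypothetical names, and both lookups are assumed to return a list of artist names (empty or `None` when nothing is found).

```python
def resolve_track_artists(track_id, cache_lookup, api_lookup):
    """Return artist names for a track, preferring the local metadata cache.

    `cache_lookup` and `api_lookup` are callables taking a track id; the API
    is only consulted when the cache has no data for the track.
    """
    artists = cache_lookup(track_id)
    if artists:
        return list(artists)
    return list(api_lookup(track_id) or [])
```

Passing the lookups in as callables keeps the fallback order testable while the cache tables are still empty in development.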

12:04 PM
lucifer[m]

you can take a look at the schema i shared above.

12:05 PM
suvid[m]

i was talking about the data field in the table

12:05 PM
lucifer[m]

you won't need that.

12:05 PM
suvid[m]

lucifer[m]: fallback api will be the spotify api itself right?

12:06 PM
lucifer[m]

query spotify_cache.track with the track identifier and join it to spotify_cache.album for album name, join to spotify_cache.rel_album_artist for album artist and join to spotify_cache.rel_track_artist for track artists.

12:06 PM
lucifer[m]

you have the duration of song played and timestamp available in the jsonl itself so that should be all that's needed.

12:06 PM
suvid[m]

spotify_cache.rel_track_artist so i basically need to work with this table only right?

12:07 PM
suvid[m]

will search by track id and get the artist id

12:07 PM
suvid[m]

then query artist table to get the name

12:07 PM
suvid[m]

is this correct approach?

12:07 PM
lucifer[m]

do it in one query but yes.
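
A sketch of the single-query lookup, with loud caveats: the table names come from the discussion above, but every column name (`track_id`, `album_id`, `artist_id`, `name`, `position`) and the `spotify_cache.artist` table are assumptions that need checking against the real schema. `lookup_track` is a hypothetical helper. The `:track_id` placeholder style works with SQLAlchemy `text()` and with `sqlite3`; here `conn` is assumed to be any connection whose `execute(sql, params)` accepts a plain string, which is enough to try the query against a throwaway database.

```python
# Column names below are assumed, not taken from the real schema.
TRACK_ARTISTS_SQL = """
    SELECT t.name AS track_name
         , al.name AS album_name
         , a.name AS artist_name
      FROM spotify_cache.track AS t
 LEFT JOIN spotify_cache.album AS al
        ON al.album_id = t.album_id
      JOIN spotify_cache.rel_track_artist AS rta
        ON rta.track_id = t.track_id
      JOIN spotify_cache.artist AS a
        ON a.artist_id = rta.artist_id
     WHERE t.track_id = :track_id
  ORDER BY rta.position
"""


def lookup_track(conn, track_id):
    """Return track name, album name and ordered track artists, or None."""
    rows = conn.execute(TRACK_ARTISTS_SQL, {"track_id": track_id}).fetchall()
    if not rows:
        return None
    return {
        "track_name": rows[0][0],
        "album_name": rows[0][1],
        "artists": [row[2] for row in rows],  # one row per track artist
    }
```

The query returns one row per track artist and groups them in Python; album artists from `spotify_cache.rel_album_artist` are left out here to avoid a row cross product.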

12:08 PM
suvid[m]

also, could you please tell about the fallback api as well?

12:08 PM
lucifer[m]

i wonder if we should take the album data from the cache too but fine to use the dump data for now i guess.

12:08 PM
suvid[m]

album artists also contain only 1 artist only in the dump data

12:09 PM
lucifer[m]

https://developer.spotify.com/documentation/web-a…

12:09 PM
suvid[m]

i think we should just take spotify track id from the dump and use the data we have

12:09 PM
suvid[m]

and for fallback, use the data provided in dump 🤣

12:09 PM
suvid[m]

how does that sound lucifer ?

12:09 PM
lucifer[m]

lucifer[m]: yup that's what i proposed here

12:09 PM
suvid[m]

* i think we should just take spotify track id from the dump and use the data we have in musicbrainz

12:10 PM
lucifer[m]

suvid[m]: musicbrainz data is not linked to spotify in all cases so you can't really do that

12:10 PM
lucifer[m]

the spotify metadata cache on the other hand should have all the data.

12:11 PM
suvid[m]

i see

12:24 PM
suvid[m]

<lucifer[m]> "https://developer.spotify.com/..." <- this API isnt implemented in the code right?... (full message at <https://matrix.chatbrainz.org/_matrix/media/v3/download/chatbrainz.org/JiFdkDINHPpkOkWFXPCNpBul>)

12:26 PM
lucifer[m]

use client credentials grant: https://spotipy.readthedocs.io/en/latest/#client-…

12:26 PM
lucifer[m]

you just need the client id and client secret for that.
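
A minimal sketch of the client credentials flow with spotipy, assuming the client id and secret are supplied by the caller (how ListenBrainz stores them is not shown here). `make_spotify_client` and `track_artist_names` are hypothetical helper names; `SpotifyClientCredentials`, `spotipy.Spotify(auth_manager=...)` and `Spotify.track()` are spotipy calls.

```python
def make_spotify_client(client_id, client_secret):
    """Create an app-only Spotify client; no user login is involved."""
    # Imported lazily so the helper below can be exercised without spotipy.
    import spotipy
    from spotipy.oauth2 import SpotifyClientCredentials

    auth_manager = SpotifyClientCredentials(
        client_id=client_id, client_secret=client_secret
    )
    return spotipy.Spotify(auth_manager=auth_manager)


def track_artist_names(sp, track_id):
    """Return the track artists' names from the Web API track object."""
    track = sp.track(track_id)
    return [artist["name"] for artist in track["artists"]]
```

`track_artist_names` only relies on `sp.track()`, so any object with that method (including a stub in tests) can stand in for the real client.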

12:30 PM
suvid[m]

i see

12:30 PM
suvid[m]

thanks

12:31 PM
kellnerd[m] joined the channel

12:31 PM
kellnerd[m]

suvid: In my own Spotify history importer I'm just using master_metadata_album_artist_name as track artist, despite the name.

12:32 PM
suvid[m]

so is it the artist name only?

12:32 PM
suvid[m]

can the mapper work behind the scenes to assign correct mbid if i use this only?

12:32 PM
kellnerd[m]

Spotify track artists are usually the same as the album artist anyway and I'm having trouble finding an example where the master_metadata_album_artist_name is not the correct (primary) track artist.

12:32 PM
lucifer[m]

i would suggest to use the correct track artist name as we have the data easily available in the spotify metadata cache.

12:32 PM
suvid[m]

kellnerd[m]: yea i also didnt find any example till now tho

12:32 PM
kellnerd[m]

Yeah, use the cached data when it is available.

12:37 PM
lucifer[m]

for one example: album artist has two artists and the first track has four artists: https://open.spotify.com/album/0GjnbPeC1Q1rtkjYdS…

12:41 PM
kellnerd[m]

For such cases it is really beneficial to have a Spotify metadata cache, the History files reduce these tracks to master_metadata_album_artist_name = "MEDUZA" where track vs album artist makes no difference.

12:41 PM
suvid[m]

lucky for us, we have musicbrainz spotify cache and the spotify web api itself as well :)

12:42 PM
kellnerd[m]

I was just looking for "Various artists" entries in my Spotify history samples, thinking that this is probably the one case where the distinction between primary track and album artist exists on Spotify... So far I've only found a track which literally has "Various artists" as the track artist 😂

12:48 PM
kellnerd[m]

So yeah, you're lucky to have access to the cache, my standalone tool has to do this on a best effort basis using just the primary artist (whether it is the album or track artist).

12:49 PM
kellnerd[m]

Often enough this is still sufficient for LB to map this to the correct recording, I've yet to find a compilation example where this simple approach might fail.

12:52 PM
lucifer[m]

i think you can query the spotify api optionally for the same data.

12:52 PM
lucifer[m]

if the user can provide a client id/secret.

12:53 PM
lucifer[m]

but yeah if LB mapping is working fine then might not be worth the effort.

12:57 PM
petitminion joined the channel

1:00 PM
suvid[m]

<lucifer[m]> "if the user can provide a client..." <- this will only be the case if the user has turned on spotify in services right?

1:01 PM
suvid[m]

<lucifer[m]> "but yeah if LB mapping is..." <- so should i drop the idea of spotify web api for now?

1:01 PM
lucifer[m]

suvid: the LB server has its own spotify client id/secret. you can use that always.

1:01 PM
suvid[m]

yes yes i was planning on doing that only

1:01 PM
lucifer[m]

<lucifer[m]> "but yeah if LB mapping is..." <- those comments were for harmony, kellnerd's cli tool.

1:01 PM
suvid[m]

ohh ok

1:01 PM
suvid[m]

it kinda confused me 😅

1:01 PM
suvid[m]

sorry

1:01 PM
lucifer[m]

which exists outside of LB and might not have access to the client id/secret always.

1:02 PM
kellnerd[m]

My LB tool is elbisaur though, harmony is the MB importer 😁

1:03 PM
lucifer[m]

ah sorry, yes.

1:46 PM
__BrainzGit

[musicbrainz-server] 14mwiencek opened pull request #3587 (03master…mbs-14081): MBS-14081: Log out accounts from Discourse after they've been marked as spam https://github.com/metabrainz/musicbrainz-server/…

1:46 PM
BrainzBot

MBS-14081: Log out accounts from Discourse after they've been marked as spam https://tickets.metabrainz.org/browse/MBS-14081

2:11 PM
Sophist-UK has quit

2:13 PM
bitmap[m]

<reosarevok[m]> "Guess the script worked well..." <- the script did remove about 800k accounts, but the other 400k was me flagging spam 😛

2:14 PM
Sophist-UK joined the channel

2:14 PM
mamanullah7[m] joined the channel

2:14 PM
mamanullah7[m]

<mamanullah7[m]> "Hey lucifer: i fixed the..." <- > <@m.amanullah7:matrix.org> Hey lucifer: i fixed the frontend issue it was authentication error! Now i can play songs using funkwhale!!... (full message at <https://matrix.chatbrainz.org/_matrix/media/v3/download/chatbrainz.org/DMoOBxKBcoXebVygbEguXLPH>)

2:15 PM
lucifer[m]

m.amanullah7: i'll do it today.

2:16 PM
mamanullah7[m]

lucifer: Thanks

2:27 PM
reosarevok[m]

<bitmap[m]> "the script did remove about 800k..." <- You had a second one-off you wanted to run to remove old dodgy accounts too, right

2:30 PM
pite joined the channel

2:32 PM
bitmap[m]

I did already delete about 4K empty dodgy accounts connected with spammers that had confirmed email addresses (after checking they had no rows in the MeB/LB/CB user tables either)

2:32 PM
bitmap[m]

but I think we should just modify the script you made to remove the email check

2:33 PM
bitmap[m]

and have it check the other projects' DBs directly to be safe

2:35 PM
reosarevok[m]

I love the sound of volunteering in the morning

2:36 PM
Maxr1998_ has quit

2:36 PM
lucifer[m]

m.amanullah7: done