[metabrainz.org] fettuccinae opened pull request #511 (metabrainz-notifications…notification-table): Merge client credentials pr into metabrainz-notifications (#509) https://github.com/metabrainz/metabrainz.org/pull…
2025-06-09 16025, 2025
Maxr1998 has quit
2025-06-09 16011, 2025
Maxr1998 joined the channel
2025-06-09 16013, 2025
lusciouslover has quit
2025-06-09 16033, 2025
lusciouslover joined the channel
2025-06-09 16002, 2025
Kladky joined the channel
2025-06-09 16027, 2025
rayyan_seliya123 joined the channel
2025-06-09 16028, 2025
rayyan_seliya123 uploaded an image: (58KiB) < https://matrix.chatbrainz.org/_matrix/media/v3/download/matrix.org/KWQFlFaPEzpvSQTPkCERzQJB/Screenshot%202025-06-07%20004631.png >
2025-06-09 16028, 2025
rayyan_seliya123
Hey lucifer, I have updated my indexer and `models.py` script so that my table replicates the metadata shown on the IA website (see the screenshot above): id, title, artist, album, year, and notes/description for the recordings. Also see my table below ```txt +-[ RECORD 1... (full message at <https://matrix.chatbrainz.org/_matrix/media/v3/download/chatbrainz.org/LkUOZYWBbsniDEkpOQSDTHfm>)
2025-06-09 16058, 2025
lucifer[m]
rayyan_seliya123: also store the topics.
2025-06-09 16048, 2025
rayyan_seliya123
lucifer[m]: sure i will and update you after that !
2025-06-09 16024, 2025
mamanullah7[m] joined the channel
2025-06-09 16024, 2025
mamanullah7[m]
` | <s> [webpack.Progress] 28% building 1/3 entries 687/729 dependencies 17/272 modules... (full message at <https://matrix.chatbrainz.org/_matrix/media/v3/download/chatbrainz.org/QVRJaVRWFbbJlLpRkqFgaeOw>)
m.amanullah7: please use a pastebin or gist to share the logs.
2025-06-09 16046, 2025
lucifer[m]
you can run `./develop.sh psql` and then run `ALTER TABLE "listens_importer" ADD COLUMN status JSONB;`
2025-06-09 16037, 2025
mamanullah7[m]
lucifer[m]: my bad! i'll take care
2025-06-09 16053, 2025
mamanullah7[m]
lucifer[m]: okay!
2025-06-09 16058, 2025
mamanullah7[m] uploaded an image: (433KiB) < https://matrix.chatbrainz.org/_matrix/media/v3/download/matrix.org/DSzYfbljamBfMGllHBpVpEyd/Screenshot%202025-06-09%20at%201.54.12%E2%80%AFPM.png >
2025-06-09 16051, 2025
mamanullah7[m] uploaded an image: (302KiB) < https://matrix.chatbrainz.org/_matrix/media/v3/download/matrix.org/qRcnXLWlHnBammDALWJrvoPz/Screenshot%202025-06-09%20at%201.40.53%E2%80%AFPM.png >
2025-06-09 16036, 2025
mayhem[m] joined the channel
2025-06-09 16036, 2025
mayhem[m]
monkey: ansh lucifer : I totally spaced, but today is a holiday in Spain (and other parts of the EU). Can we postpone the LB team meeting another week? We may want to find another day, since Mondays have so many holidays...
2025-06-09 16042, 2025
lucifer[m]
mayhem: this thursday?
2025-06-09 16054, 2025
mayhem[m]
I can do that.
2025-06-09 16001, 2025
mamanullah7[m]
<mamanullah7[m]> "Screenshot 2025-06-09 at 1.40.53..." <- lucifer: I'm not able to fix this. I've made the changes as per your review and I'll push them, can you look into it?
2025-06-09 16025, 2025
lucifer[m]
m.amanullah7: there are two / in the url, try with one.
2025-06-09 16033, 2025
lucifer[m]
two // before api to be clear.
2025-06-09 16016, 2025
mamanullah7[m] uploaded an image: (332KiB) < https://matrix.chatbrainz.org/_matrix/media/v3/download/matrix.org/nlMCMNKYlsiMjsGXroJIfYeT/Screenshot%202025-06-09%20at%203.13.44%E2%80%AFPM.png >
2025-06-09 16017, 2025
mamanullah7[m]
its working now i guess!
2025-06-09 16046, 2025
mamanullah7[m]
lucifer
2025-06-09 16004, 2025
lucifer[m]
m.amanullah7: yes, you need to fix the backend to ensure the funkwhale service url works.
2025-06-09 16023, 2025
mamanullah7[m]
yeah, i'll check for a trailing slash in host_url and remove it if there is one!
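The double-slash fix discussed above can be sketched as a small helper. This is only an illustration: `join_url`, the example host and the example path are hypothetical names, not taken from the ListenBrainz or Funkwhale code.

```python
def join_url(host_url: str, path: str) -> str:
    """Join a base URL and a path with exactly one slash between them.

    Hypothetical helper: strips trailing slashes from the host and
    leading slashes from the path before joining.
    """
    return host_url.rstrip("/") + "/" + path.lstrip("/")


# Example host and path are made up for illustration.
print(join_url("https://funkwhale.example/", "/api/v1/"))
```

Normalizing at the point where the URL is built means a `host_url` saved with or without a trailing slash behaves the same.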
2025-06-09 16019, 2025
Sophist-UK has quit
2025-06-09 16040, 2025
Sophist-UK joined the channel
2025-06-09 16059, 2025
mamanullah7[m] sent a code block: https://matrix.chatbrainz.org/_matrix/media/v3/download/chatbrainz.org/CWXQEfRqNNlRIBMKmZvRkgrM
rayyan_seliya123 uploaded an image: (50KiB) < https://matrix.chatbrainz.org/_matrix/media/v3/download/matrix.org/LtDrOSxhqaIeSvXyPxZojLRS/Screenshot%202025-06-09%20161344.png >
2025-06-09 16008, 2025
rayyan_seliya123
rayyan_seliya123: lucifer: I have stored the topics too!! what's next ?
2025-06-09 16057, 2025
d4rk has quit
2025-06-09 16026, 2025
d4rk joined the channel
2025-06-09 16010, 2025
Techman has quit
2025-06-09 16040, 2025
lucifer[m]
[@rayyan_seliya123:matrix.org](https://matrix.to/#/@rayyan_seliya123:matrix.org) start updating your script to match the other scripts. Basically it should have two features. First, store the item id in redis with a timeout, and before querying internet archive check if the item id is in the cache; if so, skip it. Second, the indexer should be able to connect to a rabbitmq queue to obtain seed ids.
2025-06-09 16007, 2025
lucifer[m]
study the existing indexers and feel free to ask any doubts you have about it.
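A rough sketch of the two features described above, using in-memory stand-ins so it runs without any services. Everything here is an assumption for illustration: `TTLCache` stands in for Redis keys set with an expiry, a `queue.Queue` stands in for the RabbitMQ seed queue, and `process_seeds` and `fetch_item` are hypothetical names, not the existing indexers' API.

```python
import time
from queue import Empty, Queue


class TTLCache:
    """In-memory stand-in for Redis keys stored with a timeout."""

    def __init__(self, clock=time.monotonic):
        self._expiry = {}
        self._clock = clock

    def add(self, key, ttl_seconds):
        # Remember when this key should stop counting as cached.
        self._expiry[key] = self._clock() + ttl_seconds

    def __contains__(self, key):
        expires_at = self._expiry.get(key)
        if expires_at is None:
            return False
        if self._clock() >= expires_at:
            # Expired: drop the key so the item can be queried again.
            del self._expiry[key]
            return False
        return True


def process_seeds(seed_queue, cache, fetch_item, ttl_seconds=3600):
    """Drain seed ids from the queue, skipping ids still in the cache."""
    fetched = []
    while True:
        try:
            item_id = seed_queue.get_nowait()
        except Empty:
            break
        if item_id in cache:
            continue  # seen recently, do not query the archive again
        cache.add(item_id, ttl_seconds)
        fetch_item(item_id)  # hypothetical callback that does the real query
        fetched.append(item_id)
    return fetched
```

In a real indexer the cache check and the queue consumer would talk to Redis and RabbitMQ; the control flow (check cache, skip or fetch, then mark as seen) is the part this sketch is meant to show.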
2025-06-09 16028, 2025
rayyan_seliya123
lucifer[m]: Sure will do it !!
2025-06-09 16008, 2025
rayyan_seliya123
<lucifer[m]> "study the existing indexers..." <- Okay, I will if I encounter any! First question: do I have to separate the indexer script into files like client.py, runner.py and handler.py now, as in the existing indexers like soundcloud, or later?
2025-06-09 16016, 2025
lucifer[m]
rayyan_seliya123: eventually yes but you can keep it in one file if that's easier for you to start with.
2025-06-09 16001, 2025
rayyan_seliya123
lucifer[m]: Okay, I will do what's feasible for me then!!
2025-06-09 16005, 2025
Techman joined the channel
2025-06-09 16018, 2025
monkey[m] joined the channel
2025-06-09 16018, 2025
monkey[m]
<lucifer[m]> "mayhem: this thursday?" <- Works for me too
2025-06-09 16019, 2025
ansh[m] joined the channel
2025-06-09 16019, 2025
ansh[m]
<lucifer[m]> "mayhem: this thursday?" <- works for me too
2025-06-09 16018, 2025
holycow23[m] joined the channel
2025-06-09 16018, 2025
holycow23[m]
<lucifer[m]> "holycow23: `request_import_incre..." <- But that would be my own listens right?
2025-06-09 16006, 2025
lucifer[m]
holycow23: no, those listens would be downloaded from metabrainz servers; we produce daily data dumps of all listens submitted by the users that day.
2025-06-09 16023, 2025
holycow23[m]
Okay
2025-06-09 16001, 2025
mamanullah7[m]
<mamanullah7[m]> "Screenshot 2025-06-09 at 3.13.44..." <- lucifer: I'm encountering one problem: when the user is not logged in, it shows `{"detail":"Authentication credentials were not provided."}`; otherwise it redirects directly to the connect service page. How can I fix this?
2025-06-09 16047, 2025
lucifer[m]
m.amanullah7: on listenbrainz side or funkwhale side?
2025-06-09 16050, 2025
holycow23[m]
<lucifer[m]> "holycow23: no those listens..." <- We didn't import the listens that day, so for incremental the base/full dump needs to be imported first
2025-06-09 16013, 2025
lucifer[m]
holycow23: we did, that's how the artist and listening stats were generated.
holycow23[m] uploaded an image: (29KiB) < https://matrix.chatbrainz.org/_matrix/media/v3/download/matrix.org/DsWdYRIRhxXqXLrpSRRsPYfA/image.png >
2025-06-09 16011, 2025
holycow23[m]
I tried running the query for incremental as well
2025-06-09 16019, 2025
mamanullah7[m]
lucifer[m]: funkwhale side like when im trying same auth_url in incognito getting this `{"detail":"Authentication credentials were not provided."}`
2025-06-09 16025, 2025
holycow23[m]
lucifer[m]: `2025-06-09 14:29:10,890 listenbrainz_spark.listens.dump WARNING No previous full dump found, importing latest incremental dump`
2025-06-09 16057, 2025
lucifer[m]
holycow23: we didn't import a full dump but just an incremental dump, that's what the error message says.
2025-06-09 16004, 2025
mamanullah7[m]
mamanullah7[m]: even when i'm logged in there, it directly sends me to the connect service page!
2025-06-09 16033, 2025
lucifer[m]
m.amanullah7: okay you can update your PR and i'll take a look
2025-06-09 16051, 2025
mamanullah7[m]
sure thanks!
2025-06-09 16032, 2025
holycow23[m]
holycow23[m]: But then how do I query the listens since the table doesn't seem to be visible over here
2025-06-09 16006, 2025
lucifer[m]
can you share the image again?
2025-06-09 16035, 2025
lucifer[m]
oh my bad, i see it now
2025-06-09 16044, 2025
lucifer[m]
holycow23[m]: this is couchdb.
2025-06-09 16050, 2025
lucifer[m]
listens are not stored in couchdb.
2025-06-09 16008, 2025
lucifer[m]
listens are stored in spark and timescaledb.
2025-06-09 16014, 2025
monkey[m]
lucifer: Hello! Am I clear to deploy a new LB image to all containers? I see there are quite a few stats/spark/dumps changes so better to check...
2025-06-09 16017, 2025
lucifer[m]
couchdb only has the finally generated statistics.
2025-06-09 16024, 2025
lucifer[m]
monkey: yup sure
2025-06-09 16033, 2025
holycow23[m]
lucifer[m]: Yes correct, that’s why I was asking if I can import the LB Dump so I can start querying the DB to verify my mock queries
2025-06-09 16014, 2025
lucifer[m]
you can import it but it won't show up here, it's in spark.
2025-06-09 16029, 2025
holycow23[m]
Yeah that’s fine
2025-06-09 16016, 2025
lucifer[m]
you should already have listens in spark from when we met but you can run request_import_incremental to import some more.
2025-06-09 16038, 2025
lucifer[m]
if you are currently facing any errors testing your queries, then most likely the issue is something else though.
2025-06-09 16043, 2025
holycow23[m]
Hmm, okay will look into it
2025-06-09 16006, 2025
lucifer[m]
you can share the particular error that you are getting and i can help debug that
2025-06-09 16037, 2025
monkey[m]
lucifer: Should listenbrainz-timescale-writer-beta be running? It was not running before i ran the start script but is running now, just wanted to check
2025-06-09 16014, 2025
lucifer[m]
monkey: should be fine either way.
2025-06-09 16022, 2025
monkey[m]
OK, I'll let it run then
2025-06-09 16034, 2025
monkey[m]
I will note that the soundcloud connection + playlist import/export doesn't seem to be working. Happy to help investigate
2025-06-09 16053, 2025
q3lont joined the channel
2025-06-09 16002, 2025
mamanullah7[m]
lucifer: i've updated the PR, you can check locally and do let me know! as of now it's unstructured, i'll clean it up once it works! i think the issue is in handling the callback properly.
2025-06-09 16014, 2025
d4rk has quit
2025-06-09 16031, 2025
d4rk joined the channel
2025-06-09 16040, 2025
holycow23[m]
<lucifer[m]> "you can share the particular..." <- Hey sorry was gone for a bit
2025-06-09 16018, 2025
holycow23[m]
So the listens are originally in timescale, so do I query the timescaleDB to check the generated stats?
2025-06-09 16055, 2025
lucifer[m]
[@holycow23:matrix.org](https://matrix.to/#/@holycow23:matrix.org) how are you testing your queries or what error do you see?
2025-06-09 16038, 2025
holycow23[m]
lucifer[m]: I haven't tested any query till now, last week I was able to query the CouchDB using a pyspark setup which you helped me get in place
2025-06-09 16041, 2025
lucifer[m]
Okay, your pyspark setup needs to read the listens and metadata caches from HDFS.
2025-06-09 16031, 2025
lucifer[m]
You can expose a port from the namenode container if one is not already exposed. I think it would be port 9000 by default.
2025-06-09 16014, 2025
lucifer[m]
If you can share the notebook you are currently working with I can update it with an example
2025-06-09 16021, 2025
holycow23[m]
I am trying to test out the code I wrote last week for using CouchDB with PySpark will get back to you the moment that works
2025-06-09 16019, 2025
lucifer[m]
That part will be only useful if you directly want to write stats to couchdb without interacting with spark reader.
2025-06-09 16033, 2025
lucifer[m]
No data needs to be read from couchdb.
2025-06-09 16018, 2025
holycow23[m]
Okay so the spark reader when reading the listens from timescaleDB would generate stats correct
2025-06-09 16003, 2025
holycow23[m] sent a import code block: https://matrix.chatbrainz.org/_matrix/media/v3/download/chatbrainz.org/mdKJTnDNwMmnBCcoWWmBqwco
2025-06-09 16021, 2025
holycow23[m]
holycow23[m]: This is the script I was using to query the CoucDB
2025-06-09 16001, 2025
holycow23[m]
s/CoucDB/CouchDB/
2025-06-09 16005, 2025
d4rk has quit
2025-06-09 16032, 2025
d4rk joined the channel
2025-06-09 16010, 2025
lucifer[m]
holycow23: to clear some things up:... (full message at <https://matrix.chatbrainz.org/_matrix/media/v3/download/chatbrainz.org/WAeXTIkxpOsQYbtsYOrQAFwU>)
2025-06-09 16010, 2025
lucifer[m]
i'll share a code snippet on how to read from hdfs in your pyspark notebook later.
2025-06-09 16012, 2025
holycow23[m]
Okay gotcha
2025-06-09 16021, 2025
holycow23[m] uploaded an image: (376KiB) < https://matrix.chatbrainz.org/_matrix/media/v3/download/matrix.org/tzVBPHsaIpkyXzvBZVkAbuJz/image.png >
2025-06-09 16024, 2025
holycow23[m] uploaded an image: (280KiB) < https://matrix.chatbrainz.org/_matrix/media/v3/download/matrix.org/WGTOxHeSzoubhCrXaLXyisTo/image.png >
2025-06-09 16048, 2025
holycow23[m]
ansh, monkey Could you have a look over these mockups for the stats, currently running with mockup data
2025-06-09 16048, 2025
holycow23[m]
The first one is the evolution of artists over the weekly period for the top 5 artists heard
2025-06-09 16048, 2025
holycow23[m]
The second one is a genre trend over the day, showing what kind of music you listen to at different times of the day
2025-06-09 16035, 2025
monkey[m]
First impression is that it looks good so far. Two comments:
2025-06-09 16035, 2025
monkey[m]
1. I think we could do with more artists on the evolution graph, maybe the top 10? Whatever works while remaining legible. I think it would add to the usefulness of the graph and be in line with the other stats components we show. I think eventually this could replace the current "Top artists" component on the stats page.
2025-06-09 16035, 2025
monkey[m]
2. I like the genre by time graph, perhaps the corner rounding is a bit too high, making it look like orange segments a bit. Curious to see what it looks like with real data, as there should be an ~8h hole in there for most normal humans :D. I think it might need to be rotated to accommodate the user's time-zone, but we'll need to see with real data for that.
2025-06-09 16033, 2025
monkey[m]
And I guess a third one, but for later: when we integrate these graphs into the page, the evilution graph will need the whole page width, but the genre one doesn't. We'll have to see with what and how to place it on the page
2025-06-09 16052, 2025
monkey[m]
["evilution" typo, but good band name]
2025-06-09 16057, 2025
holycow23[m]
1. We could think about a top-k thing where k has a mini dropdown for like 5, 10 or maybe a third value?
2025-06-09 16057, 2025
holycow23[m]
2. Yes there is a chance of finding a hole of 8 hours, will look at a lesser rounding
2025-06-09 16057, 2025
holycow23[m]
3. I was a little concerned about the rotation thing too: since the listens are stored in UTC, while generating stats I will have to convert to the user's timezone and then process the time of the day
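The UTC-to-local conversion mentioned in point 3 could look roughly like this. It is a minimal sketch, assuming listens carry a Unix timestamp in UTC and the user's timezone is available as an IANA name; `local_hour` and `listens_per_local_hour` are hypothetical names, and the real stats would be computed in spark rather than in plain Python.

```python
from collections import Counter
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


def local_hour(listened_at: int, user_timezone: str) -> int:
    """Return the hour of day (0-23) of a UTC timestamp in the user's timezone."""
    utc_dt = datetime.fromtimestamp(listened_at, tz=timezone.utc)
    return utc_dt.astimezone(ZoneInfo(user_timezone)).hour


def listens_per_local_hour(timestamps, user_timezone: str) -> Counter:
    """Count listens per local hour of day (assumed input: UTC Unix timestamps)."""
    return Counter(local_hour(ts, user_timezone) for ts in timestamps)
```

Bucketing by local hour before aggregating means the chart does not need to be rotated afterwards, at the cost of needing each user's timezone at stats-generation time.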
2025-06-09 16057, 2025
holycow23[m]
<monkey[m]> "And I guess a third one, but for..." <- We could do a side by side representation of two stats or we could show the top songs of top genres next to the pie chart
2025-06-09 16035, 2025
monkey[m]
Yes, I think that could work.
2025-06-09 16055, 2025
monkey[m]
Perhaps another genre graph that ansh was working on
2025-06-09 16004, 2025
holycow23[m]
holycow23[m]: But yeah the top-k will just lead to more processing and a lot more data but not provide that good of a value so will test out 10 and see how it looks then we could decide
2025-06-09 16026, 2025
holycow23[m]
monkey[m]: Oh, there is the eras stats as well
2025-06-09 16053, 2025
holycow23[m]
I have made the mockup for that, but well, it's the same as the one on "Your Year in Music"
2025-06-09 16056, 2025
monkey[m]
That might need a bit of space on the page too, but let's cross that bridge when we get to it
2025-06-09 16018, 2025
holycow23[m]
monkey[m]: Yeah that will take the entire page width
2025-06-09 16034, 2025
holycow23[m] uploaded an image: (20KiB) < https://matrix.chatbrainz.org/_matrix/media/v3/download/matrix.org/smcbAqglisawIWvBJrlGZoae/image.png >
2025-06-09 16019, 2025
holycow23[m]
<holycow23[m]> "Yeah that will take the entire..." <- For reference this is what I did, just added one small thing of the start year as well, instead of just the years divisible by 5
2025-06-09 16041, 2025
suvid[m]
How do I pass external_service_oauth_id to the listens_importer model in order to add an import task?
2025-06-09 16041, 2025
suvid[m]
do I need to create external_service_oauth for each service without access and refresh tokens and use its ID?
2025-06-09 16041, 2025
suvid[m]
The process seems a bit unclear to me as I haven't worked that much with databases earlier
2025-06-09 16044, 2025
suvid[m] uploaded an image: (44KiB) < https://matrix.chatbrainz.org/_matrix/media/v3/download/matrix.org/tZVpZSoROhrWJdFyEhbwGcQg/image.png >
2025-06-09 16054, 2025
q3lont has quit
2025-06-09 16038, 2025
suvid[m]
should i first get the external_service_oauth_id for the specified service by querying from the external_service_oauth table and then pass it to the listens_importer for the external_service_oauth_id field?
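The lookup-then-insert flow asked about above can be sketched against an in-memory SQLite database. This is only an illustration of the question, not a confirmed answer: the table names and `external_service_oauth_id` come from the chat, while the other columns, the simplified schema, and the helper names `create_stand_in_schema` and `add_import_task` are assumptions.

```python
import sqlite3


def create_stand_in_schema(conn: sqlite3.Connection) -> None:
    """Create simplified stand-in tables (columns are assumptions)."""
    conn.executescript(
        """
        CREATE TABLE external_service_oauth (
            id INTEGER PRIMARY KEY,
            user_id INTEGER NOT NULL,
            service TEXT NOT NULL,
            access_token TEXT,
            refresh_token TEXT
        );
        CREATE TABLE listens_importer (
            id INTEGER PRIMARY KEY,
            external_service_oauth_id INTEGER REFERENCES external_service_oauth (id),
            user_id INTEGER NOT NULL,
            service TEXT NOT NULL
        );
        """
    )


def add_import_task(conn: sqlite3.Connection, user_id: int, service: str) -> int:
    """Look up the oauth row id for (user, service), then insert an importer row."""
    row = conn.execute(
        "SELECT id FROM external_service_oauth WHERE user_id = ? AND service = ?",
        (user_id, service),
    ).fetchone()
    if row is None:
        raise LookupError(f"no external_service_oauth row for {service!r}")
    cur = conn.execute(
        "INSERT INTO listens_importer (external_service_oauth_id, user_id, service)"
        " VALUES (?, ?, ?)",
        (row[0], user_id, service),
    )
    return cur.lastrowid
```

Whether a token-less `external_service_oauth` row should be created for services that have no OAuth flow is the open design question in the messages above; this sketch only covers the case where the row already exists.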