#metabrainz

      • mamanullah7[m] has quit
      • Clint_ is now known as Clint
      • _BrainzGit
        [metabrainz.org] fettuccinae opened pull request #511 (metabrainz-notifications…notification-table): Merge client credentials pr into metabrainz-notifications (#509) https://github.com/metabrainz/metabrainz.org/pu...
      • Maxr1998 has quit
      • Maxr1998 joined the channel
      • lusciouslover has quit
      • lusciouslover joined the channel
      • Kladky joined the channel
      • rayyan_seliya123 joined the channel
      • rayyan_seliya123 uploaded an image: (58KiB) < https://matrix.chatbrainz.org/_matrix/media/v3/download/matrix.org/KWQFlFaPEzpvSQTPkCERzQJB/Screenshot%202025-06-07%20004631.png >
      • rayyan_seliya123
        Hey lucifer, I have updated my indexer and `models.py` script so that my table replicates the metadata shown on the IA website (see the screenshot above): id, title, artist, album, year, and notes/description for the recordings. Also see my table below ```txt +-[ RECORD 1... (full message at <https://matrix.chatbrainz.org/_matrix/media/v3/...>)
      • lucifer[m]
        rayyan_seliya123: also store the topics.
      • rayyan_seliya123
        lucifer[m]: sure i will and update you after that !
      • mamanullah7[m] joined the channel
      • mamanullah7[m]
        ` | <s> [webpack.Progress] 28% building 1/3 entries 687/729 dependencies 17/272 modules... (full message at <https://matrix.chatbrainz.org/_matrix/media/v3/...>)
      • HemangMishra[m] joined the channel
      • HemangMishra[m]
        Hi aerozol @aerozol:matrix.org: , need some further reviews to get the designs finalized https://www.figma.com/design/OQeDkrq2rJv33LBAwm...
      • holycow23[m] has quit
      • lucifer[m]
        m.amanullah7: please use a pastebin or gist to share the logs.
      • you can run ./develop.sh psql and then run ALTER TABLE "listens_importer" ADD COLUMN status JSONB;
      • mamanullah7[m]
        lucifer[m]: my bad! i'll take care
      • lucifer[m]: okay!
      • mamanullah7[m] uploaded an image: (433KiB) < https://matrix.chatbrainz.org/_matrix/media/v3/download/matrix.org/DSzYfbljamBfMGllHBpVpEyd/Screenshot%202025-06-09%20at%201.54.12%E2%80%AFPM.png >
      • mamanullah7[m] uploaded an image: (302KiB) < https://matrix.chatbrainz.org/_matrix/media/v3/download/matrix.org/qRcnXLWlHnBammDALWJrvoPz/Screenshot%202025-06-09%20at%201.40.53%E2%80%AFPM.png >
      • mayhem[m] joined the channel
      • mayhem[m]
        monkey: ansh lucifer : I totally spaced, but today is a holiday in Spain (and other parts of the EU). Can we postpone the LB team meeting another week? We may want to find another day, since Mondays have so many holidays...
      • lucifer[m]
        mayhem: this thursday?
      • mayhem[m]
        I can do that.
      • mamanullah7[m]
        <mamanullah7[m]> "Screenshot 2025-06-09 at 1.40.53..." <- lucifer: I'm not able to fix this. I've made the changes as per your review and I'll push them, can you look into it?
      • lucifer[m]
        m.amanullah7: there are two / in the url, try with one.
      • two // before api to be clear.
      • mamanullah7[m] uploaded an image: (332KiB) < https://matrix.chatbrainz.org/_matrix/media/v3/download/matrix.org/nlMCMNKYlsiMjsGXroJIfYeT/Screenshot%202025-06-09%20at%203.13.44%E2%80%AFPM.png >
      • mamanullah7[m]
        its working now i guess!
      • lucifer
      • lucifer[m]
        m.amanullah7: yes, you need to fix the backend to ensure the funkwhale service url works.
      • mamanullah7[m]
        yeah, I'll check whether there is a trailing slash in host_url and remove it if so!
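The double-slash problem discussed above can be avoided by normalising both sides of the join. This is only a minimal sketch: `build_service_url` and the example host and path are hypothetical and are not taken from the ListenBrainz or Funkwhale code.

```python
def build_service_url(host_url: str, path: str) -> str:
    """Join a user-supplied host URL and an API path with exactly one slash.

    Hypothetical helper for illustration; only the slashes at the join
    point are touched, so the scheme's "//" is left alone.
    """
    return host_url.rstrip("/") + "/" + path.lstrip("/")


# The host and path below are made-up examples.
print(build_service_url("https://funkwhale.example/", "/api/v1/example"))
```

Stripping on both sides means the result is the same whether or not the user typed a trailing slash in the host URL.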
      • Sophist-UK has quit
      • Sophist-UK joined the channel
      • mamanullah7[m] sent a code block: https://matrix.chatbrainz.org/_matrix/media/v3/download/chatbrainz.org/CWXQEfRqNNlRIBMKmZvRkgrM
      • rayyan_seliya123 uploaded an image: (50KiB) < https://matrix.chatbrainz.org/_matrix/media/v3/download/matrix.org/LtDrOSxhqaIeSvXyPxZojLRS/Screenshot%202025-06-09%20161344.png >
      • rayyan_seliya123
        rayyan_seliya123: lucifer: I have stored the topics too!! what's next ?
      • d4rk has quit
      • d4rk joined the channel
      • Techman has quit
      • lucifer[m]
        rayyan_seliya123: start updating your script to match the other scripts. Basically it should have two features. First, store the item id in redis with a timeout, and before querying Internet Archive check if the item id is in the cache; if so, skip it. Second, the indexer should be able to connect to a rabbitmq queue to obtain seed ids.
      • study the existing indexers and feel free to ask any doubts you have about it.
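The first feature described above (skip an item id if it was seen recently) might look roughly like the sketch below. It is an illustration under assumptions, not the existing indexers' code: `index_items`, the `ia:seen:` key prefix, and the one-hour timeout are all hypothetical, the `InMemoryCache` class is a stand-in so the example runs without a Redis server, and a plain list stands in for the seed ids that would come from the rabbitmq queue. The stand-in only mimics the `exists(*names)` and `set(name, value, ex=...)` calls of the redis-py client.

```python
import time


class InMemoryCache:
    """Stand-in for a Redis client, supporting only exists() and set(..., ex=)."""

    def __init__(self):
        self._expiry = {}  # key -> monotonic time at which the key expires

    def exists(self, *names):
        now = time.monotonic()
        return sum(1 for n in names if n in self._expiry and self._expiry[n] > now)

    def set(self, name, value, ex=None):
        # ex is a timeout in seconds, as in redis-py; None means no expiry.
        self._expiry[name] = (time.monotonic() + ex) if ex is not None else float("inf")
        return True


def index_items(seed_ids, cache, fetch, ttl_seconds=3600):
    """Call fetch(item_id) for each seed id not seen within ttl_seconds."""
    fetched = []
    for item_id in seed_ids:
        key = f"ia:seen:{item_id}"  # hypothetical key naming scheme
        if cache.exists(key):
            continue  # indexed recently, skip the Internet Archive query
        fetch(item_id)
        cache.set(key, "1", ex=ttl_seconds)
        fetched.append(item_id)
    return fetched


if __name__ == "__main__":
    cache = InMemoryCache()
    print(index_items(["item-a", "item-b", "item-a"], cache, fetch=lambda i: None))
```

Because `index_items` only relies on those two calls, a real Redis client could presumably be passed in place of the stand-in.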
      • rayyan_seliya123
        lucifer[m]: Sure will do it !!
      • <lucifer[m]> "study the existing indexers..." <- Okk, I will if I encounter any! First question: do I have to separate the indexer script into files like client.py, runner.py and handler.py, as in existing indexers like soundcloud, now or later?
      • lucifer[m]
        rayyan_seliya123: eventually yes but you can keep it in one file if that's easier for you to start with.
      • rayyan_seliya123
        lucifer[m]: Okk I will do as what's feasible for me !! Then
      • Techman joined the channel
      • monkey[m] joined the channel
      • monkey[m]
        <lucifer[m]> "mayhem: this thursday?" <- Works for me too
      • ansh[m] joined the channel
      • ansh[m]
        <lucifer[m]> "mayhem: this thursday?" <- works for me too
      • holycow23[m] joined the channel
      • holycow23[m]
        <lucifer[m]> "holycow23: `request_import_incre..." <- But that would be my own listens right?
      • lucifer[m]
        holycow23: no, those listens would be downloaded from metabrainz servers. We produce daily data dumps of all listens submitted by users that day.
      • holycow23[m]
        Okay
      • mamanullah7[m]
        <mamanullah7[m]> "Screenshot 2025-06-09 at 3.13.44..." <- lucifer: I'm encountering one problem: it again redirects directly to the redirect url. If the user is not logged in it shows `{"detail":"Authentication credentials were not provided."}`, otherwise it redirects directly to the connect service page. How can I fix this?
      • lucifer[m]
        m.amanullah7: on listenbrainz side or funkwhale side?
      • holycow23[m]
        <lucifer[m]> "holycow23: no those listens..." <- We didn't import the listens that day, so for incremental the base/full dump needs to be imported first
      • lucifer[m]
        holycow23: we did, that's how the artist and listening stats were generated.
      • _BrainzGit
        [listenbrainz-server] release v-2025-06-09.0 has been published by github-actions[bot]: https://github.com/metabrainz/listenbrainz-serv...
      • holycow23[m] uploaded an image: (29KiB) < https://matrix.chatbrainz.org/_matrix/media/v3/download/matrix.org/DsWdYRIRhxXqXLrpSRRsPYfA/image.png >
      • holycow23[m]
        I tried running the query for incremental as well
      • mamanullah7[m]
        lucifer[m]: funkwhale side like when im trying same auth_url in incognito getting this `{"detail":"Authentication credentials were not provided."}`
      • holycow23[m]
        lucifer[m]: `2025-06-09 14:29:10,890 listenbrainz_spark.listens.dump WARNING No previous full dump found, importing latest incremental dump`
      • lucifer[m]
        holycow23: we didn't import a full dump but just an incremental dump, that's what the warning message says.
      • mamanullah7[m]
        mamanullah7[m]: even when I'm logged in there, it sends me directly to the connect service page!
      • lucifer[m]
        m.amanullah7: okay you can update your PR and i'll take a look
      • mamanullah7[m]
        sure thanks!
      • holycow23[m]
        holycow23[m]: But then how do I query the listens since the table doesn't seem to be visible over here
      • lucifer[m]
        can you share the image again?
      • oh my bad, i see it now
      • holycow23[m]: this is couchdb.
      • listens are not stored in couchdb.
      • listens are stored in spark and timescaledb.
      • monkey[m]
        lucifer: Hello! Am I clear to deploy a new LB image to all containers? I see there are quite a few stats/spark/dumps changes so better to check...
      • lucifer[m]
        couchdb only has the finally generated statistics.
      • monkey: yup sure
      • holycow23[m]
        lucifer[m]: Yes correct, that’s why I was asking if I can import the LB Dump so I can start querying the DB to verify my mock queries
      • lucifer[m]
        you can import it but it won't show up here, its in spark.
      • holycow23[m]
        Yeah that’s fine
      • lucifer[m]
        you should already have listens in spark from when we met but you can run request_import_incremental to import some more.
      • if you are facing any errors testing your queries currently, then most likely the issue is something else though.
      • holycow23[m]
        Hmm, okay will look into it
      • lucifer[m]
        you can share the particular error that you are getting and i can help debug that
      • monkey[m]
        lucifer: Should listenbrainz-timescale-writer-beta be running? It was not running before i ran the start script but is running now, just wanted to check
      • lucifer[m]
        monkey: should be fine either way.
      • monkey[m]
        OK, I'll let it run then
      • I will note that the soundcloud connection + playlist import/export doesn't seem to be working. Happy to help investigate
      • q3lont joined the channel
      • mamanullah7[m]
        lucifer: i've updated u can check locally and do let me know! as of now its unstructured i'll update this once it works! i think issue in handling callback properly.
      • d4rk has quit
      • d4rk joined the channel
      • holycow23[m]
        <lucifer[m]> "you can share the particular..." <- Hey sorry was gone for a bit
      • So the listens are in timescale originally so, do I query the timescaleDB to check the generated stats
      • lucifer[m]
        holycow23: how are you testing your queries or what error do you see?
      • holycow23[m]
        lucifer[m]: I haven't tested any query till now, last week I was able to query the CouchDB using a pyspark setup which you helped me get in place
      • lucifer[m]
        Okay, your pyspark setup needs to read the listens and metadata caches from HDFS.
      • You can expose a port from the namenode container if one is not already exposed. I think it would be port 9000 by default.
      • If you can share the notebook you are currently working with I can update it with an example
      • holycow23[m]
        I am trying to test out the code I wrote last week for using CouchDB with PySpark will get back to you the moment that works
      • lucifer[m]
        That part will be only useful if you directly want to write stats to couchdb without interacting with spark reader.
      • No data needs to be read from couchdb.
      • holycow23[m]
        Okay so the spark reader when reading the listens from timescaleDB would generate stats correct
      • holycow23[m] sent a import code block: https://matrix.chatbrainz.org/_matrix/media/v3/download/chatbrainz.org/mdKJTnDNwMmnBCcoWWmBqwco
      • holycow23[m]: This is the script I was using to query the CouchDB
      • d4rk has quit
      • d4rk joined the channel
      • lucifer[m]
        holycow23: to clear some things up:... (full message at <https://matrix.chatbrainz.org/_matrix/media/v3/...>)
      • i'll share a code snippet on how to read from hdfs in your pyspark notebook later.
      • holycow23[m]
        Okay gotcha
      • holycow23[m] uploaded an image: (376KiB) < https://matrix.chatbrainz.org/_matrix/media/v3/download/matrix.org/tzVBPHsaIpkyXzvBZVkAbuJz/image.png >
      • holycow23[m] uploaded an image: (280KiB) < https://matrix.chatbrainz.org/_matrix/media/v3/download/matrix.org/WGTOxHeSzoubhCrXaLXyisTo/image.png >
      • ansh, monkey Could you have a look over these mockups for the stats, currently running with mockup data
      • The first one is the evolution of artists over the weekly period for the top 5 artists heard
      • The second one is a genre trend over the day, showing what kind of music you listen to at different times of the day
      • monkey[m]
        First impression is that it looks good so far. Two comments:
      • 1. I think we could do with more artists on the evolution graph, maybe the top 10? Whatever works while remaining legible, I think it would add to the usefulness of the graph and be in line with the other stats components we show. I think eventually this could replace the current "Top artists" component on the stats page.
      • 2. I like the genre by time graph, perhaps the corner rounding is a bit too high, making it look like orange segments a bit. Curious to see what it looks like with real data, as there should be an ~8h hole in there for most normal humans :D. I think it might need to be rotated to accommodate the user's time-zone, but we'll need to see with real data for that.
      • And I guess a third one, but for later: when we integrate these graphs into the page, the evilution graph will need the whole page width, but the genre one doesn't. We'll have to see with what and how to place it on the page
      • ["evilution" typo, but good band name]
      • holycow23[m]
        1. We could think about a top-k thing where k has a mini dropdown for like 5, 10 or maybe a third value?
      • 2. Yes there is a chance of finding a hole of 8 hours, will look at a lesser rounding
      • 3. I was a little concerned about the rotation thing too: since the listens are stored in UTC, while generating stats I will have to convert to the user's timezone and then process the time of the day
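The UTC-to-local conversion mentioned in point 3 can be illustrated in plain Python. This is a sketch under assumptions, not the stats job itself: `listens_by_local_hour` is a hypothetical name, listens are assumed to be UTC Unix timestamps, and the real computation would presumably happen in Spark rather than in a Python loop.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone, tzinfo


def listens_by_local_hour(listened_at, tz: tzinfo) -> Counter:
    """Count listens per local hour of day, given UTC Unix timestamps."""
    hours = Counter()
    for ts in listened_at:
        # Interpret the timestamp as UTC first, then shift to the user's zone.
        local = datetime.fromtimestamp(ts, tz=timezone.utc).astimezone(tz)
        hours[local.hour] += 1
    return hours


if __name__ == "__main__":
    # A fixed +05:30 offset keeps the example self-contained; a named zone
    # from the standard library's zoneinfo module would also handle DST.
    user_tz = timezone(timedelta(hours=5, minutes=30))
    print(listens_by_local_hour([0, 72000], user_tz))
```

Bucketing after the conversion, rather than shifting the finished UTC chart, keeps the hours correct for zones with half-hour offsets.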
      • <monkey[m]> "And I guess a third one, but for..." <- We could do a side by side representation of two stats or we could show the top songs of top genres next to the pie chart
      • monkey[m]
        Yes, I think that could work.
      • Perhaps another genre graph that ansh was working on
      • holycow23[m]
        holycow23[m]: But yeah the top-k will just lead to more processing and a lot more data but not provide that good of a value so will test out 10 and see how it looks then we could decide
      • monkey[m]: Oh, there is the eras stats as well
      • I have made the mockup for that but well same as that on "Your Year in Music"
      • monkey[m]
        That might need a bit of space on the page too, but let's cross that bridge when we get to it
      • holycow23[m]
        monkey[m]: Yeah that will take the entire page width
      • holycow23[m] uploaded an image: (20KiB) < https://matrix.chatbrainz.org/_matrix/media/v3/download/matrix.org/smcbAqglisawIWvBJrlGZoae/image.png >
      • <holycow23[m]> "Yeah that will take the entire..." <- For reference this is what I did, just added one small thing of the start year as well, instead of just the years divisible by 5
      • suvid[m]
        How do I pass external_service_oauth_id to the listens_importer model in order to add an import task?
      • do I need to create external_service_oauth for each service without access and refresh tokens and use its ID?
      • The process seems a bit unclear to me as I haven't worked that much with databases earlier
      • suvid[m] uploaded an image: (44KiB) < https://matrix.chatbrainz.org/_matrix/media/v3/download/matrix.org/tZVpZSoROhrWJdFyEhbwGcQg/image.png >
      • q3lont has quit
      • should i first get the external_service_oauth_id for the specified service by querying from the external_service_oauth table and then pass it to the listens_importer for the external_service_oauth_id field?
      • lucifer
      • nvm
      • time for the meeting for now ig 😅