holycow23: test it in pyspark using the new commands i shared.
2025-06-25 17610, 2025
pite has quit
2025-06-25 17649, 2025
davic has quit
2025-06-25 17649, 2025
rayyan_seliya123
<rayyan_seliya123> "hey lucifer: I noticed that..." <- > <@rayyan_seliya123:matrix.org> hey lucifer: I noticed that in my latest push I am not using any models for my indexer implementation and not importing it in my indexer script though it is still present in `models.py` I was thinking should we remove... (full message at <https://matrix.chatbrainz.org/_matrix/media/v3/download/chatbrainz.org/RuyVdGSQZGEELzzSwxVmBwod>)
2025-06-25 17644, 2025
Kladky joined the channel
2025-06-25 17643, 2025
outsidecontext[m]
mayhem: good morning. It's the time again where Apple has changes to their license agreements and wants you to accept that :) Picard code signing is failing again
2025-06-25 17612, 2025
lucifer[m]
rayyan_seliya123: oh sorry, i missed that comment yes. i'll review your PR again today and let you know then what to do about the models as well.
They seem sensible enough to me but two sets of eyes are always best :)
2025-06-25 17643, 2025
mthax_ has quit
2025-06-25 17610, 2025
mthax joined the channel
2025-06-25 17643, 2025
mamanullah7[m]
lucifer: i'm getting this error and i'm not able to understand what's the issue!... (full message at <https://matrix.chatbrainz.org/_matrix/media/v3/download/chatbrainz.org/pCoBDGZmMZGDxxdTqAovMKbB>)
2025-06-25 17606, 2025
mamanullah7[m] uploaded an image: (365KiB) < https://matrix.chatbrainz.org/_matrix/media/v3/download/matrix.org/vnyprCeTiDAelZTbmYmvcFTO/Screenshot%202025-06-25%20at%201.49.19%E2%80%AFPM.png >
2025-06-25 17632, 2025
mamanullah7[m]
but here i can see it's authorising!
2025-06-25 17602, 2025
lucifer[m]
m.amanullah7: push your latest code to the PR and i'll check
2025-06-25 17621, 2025
lucifer[m]
rayyan_seliya123: taking a look at the PR, it looks good to me. let's keep the model for now and we can delete it later if still not needed.
2025-06-25 17634, 2025
lucifer[m]
you should refactor it to be like the other caches now. take a look at SoundcloudHandler and refactor your class to work like that. the rabbitmq connection bits will then be automatically handled by the ServiceMetadataCache
<lucifer[m]> "you should refactor it to be..." <- Thanks for the review! Sure, I will refactor it like the other caches and will update you after that.
2025-06-25 17655, 2025
suvid[m]
<lucifer[m]> "suvid: hi! any update on the..." <- Hi I was just finalizing the new schema
2025-06-25 17655, 2025
suvid[m]
Should I keep the file_available_until in the schema to clean up the old import files?
2025-06-25 17606, 2025
mayhem[m]
outsidecontext: apple agreement carefully read, like literally every word, and then accepted. 🤥
2025-06-25 17617, 2025
lucifer[m]
no, old import files should be deleted after the import is done.
2025-06-25 17629, 2025
suvid[m]
oh ok 👍️
2025-06-25 17644, 2025
suvid[m] uploaded an image: (17KiB) < https://matrix.chatbrainz.org/_matrix/media/v3/download/matrix.org/SIktnCrCFXjnJhLEQaHkgFss/image.png >
2025-06-25 17653, 2025
suvid[m]
then this seems fine ig
2025-06-25 17626, 2025
lucifer[m]
i think you can rename filename to uploaded_filename for clarity.
2025-06-25 17652, 2025
TOPIC: MetaBrainz Community and Development channel | MusicBrainz non-development: #musicbrainz | BookBrainz: #bookbrainz | Channel is logged and not empty as it is bridged to IRC; see https://musicbrainz.org/doc/ChatBrainz for details | Agenda: Reviews, MBS-14035 (reo)
2025-06-25 17659, 2025
lucifer[m]
and for progress i think you should change it to a JSONB. then it can have fields for status text, first imported listen and last imported listen.
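A minimal sketch of what such a JSONB payload could hold; the field names here are illustrative assumptions, not a final schema:

```python
import json
from datetime import datetime, timezone

# Illustrative shape for the proposed JSONB `progress` column.
# Field names are assumptions for this sketch, not the final schema.
progress = {
    "status": "importing file 2 of 12",
    "first_imported_listen": datetime(2019, 1, 3, tzinfo=timezone.utc).isoformat(),
    "last_imported_listen": datetime(2025, 6, 20, tzinfo=timezone.utc).isoformat(),
}

# Serialize for storage in the JSONB column, deserialize on read.
payload = json.dumps(progress)
restored = json.loads(payload)
```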
suvid[m] sent a sql code block: https://matrix.chatbrainz.org/_matrix/media/v3/download/chatbrainz.org/OMQqMrsFMHLLeXVdkeCrXVZH
2025-06-25 17601, 2025
holycow23[m]
<lucifer[m]> "holycow23: test it in pyspark..." <- Yeah done with that, the messages are generating well
2025-06-25 17601, 2025
holycow23[m]
Now the part left is receiving the same in the Backend API, how do I go ahead with that?
2025-06-25 17618, 2025
lucifer[m]
is the data being stored in couchdb yet?
2025-06-25 17634, 2025
holycow23[m]
I don’t know
2025-06-25 17639, 2025
holycow23[m]
How do I query that
2025-06-25 17616, 2025
suvid[m]
Hi lucifer
2025-06-25 17616, 2025
suvid[m]
I have committed the new schema in the pr
2025-06-25 17616, 2025
suvid[m]
I have also migrated it on my setup locally and there seem to be no issues
2025-06-25 17648, 2025
lucifer[m]
<holycow23[m]> "How do I query that" <- Okay, i'll check the pr again in a while and let you know
2025-06-25 17651, 2025
suvid[m]
<lucifer[m]> "and for progress i think you..." <- why do we need to have first importer listen in it?
2025-06-25 17654, 2025
suvid[m]
s/importer/imported/
2025-06-25 17600, 2025
suvid[m]
should i just name it progress instead of metadata, as we are only storing progress info: current status, last imported listen and first imported listen?
2025-06-25 17647, 2025
lucifer[m]
[@suvid:matrix.org](https://matrix.to/#/@suvid:matrix.org) I think metadata is fine for now. Monkey suggested that first and last imported might be useful to show to the user.
2025-06-25 17611, 2025
lucifer[m]
let's start working on processing the uploaded archive files now.
2025-06-25 17615, 2025
monkey[m] appears after the incantation
2025-06-25 17630, 2025
suvid[m]
lucifer[m]: Oh okay
2025-06-25 17628, 2025
vardhan_ has quit
2025-06-25 17644, 2025
monkey[m]
Oh yeah, the context for this was that for example if I want to import my Spotify extended history but only a certain datetime range, knowing that timestamp range along with the file name would be useful to keep track of the imported listens.
2025-06-25 17644, 2025
monkey[m]
For Spotify, the multiple history files are not sorted by timestamp, so each file will have listens from random times and dates between when I created my account and now, rather than file 1 being my oldest listens, then file 2, etc.
2025-06-25 17652, 2025
vardhan_ joined the channel
2025-06-25 17602, 2025
monkey[m]
Hard to keep track of what I already imported, in that case
2025-06-25 17613, 2025
suvid[m]
but like when we encounter each listen, we just check if it belongs in the date range and if it does, send it to the rabbitmq queue
2025-06-25 17613, 2025
suvid[m]
i was thinking of this
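The filtering step suvid describes can be sketched as a plain check before anything is published to RabbitMQ. This is a sketch under assumptions: the listen shape and field names are hypothetical, and the actual queue publishing is left out:

```python
from datetime import datetime, timezone

def in_import_range(listened_at, from_date, to_date):
    """True when a listen's timestamp falls inside the user's chosen window."""
    return from_date <= listened_at <= to_date

# Hypothetical listens parsed from an uploaded export file.
listens = [
    {"track": "a", "listened_at": datetime(2018, 5, 1, tzinfo=timezone.utc)},
    {"track": "b", "listened_at": datetime(2023, 7, 9, tzinfo=timezone.utc)},
    {"track": "c", "listened_at": datetime(2025, 1, 2, tzinfo=timezone.utc)},
]

from_date = datetime(2020, 1, 1, tzinfo=timezone.utc)
to_date = datetime(2024, 12, 31, tzinfo=timezone.utc)

# Only listens inside the window would be forwarded to the RabbitMQ queue.
to_queue = [l for l in listens if in_import_range(l["listened_at"], from_date, to_date)]
```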
2025-06-25 17633, 2025
kellnerd[m]
Interesting, the Spotify exports which I have seen so far had listens ordered by time, with the year(s) as part of the filenames 👀
2025-06-25 17639, 2025
monkey[m]
suvid[m]: Yes, that's the filtering I expect. But then if I put a limit datetime, once the import is finished how will I remember what that limit was that I set before importing that specific file?
2025-06-25 17617, 2025
suvid[m]
monkey[m]: can we just store the start and end datetime stamps in the user data imports model?
2025-06-25 17633, 2025
suvid[m]
and have status and progress normally
2025-06-25 17641, 2025
monkey[m]
I haven't followed what the structure or storage of this metadata would look like, I'm afraid, so don't really have an answer for you there. just here to explain why those timestamps should be saved somewhere
2025-06-25 17613, 2025
suvid[m]
okay yea
2025-06-25 17613, 2025
suvid[m]
so the problem indicated is basically keeping track of timestamps right?
2025-06-25 17636, 2025
suvid[m]
i believe we should just simply store the timestamps in the model
2025-06-25 17636, 2025
suvid[m]
and maybe last imported listen to show to user on the UI as progress
2025-06-25 17656, 2025
suvid[m]
lucifer what do you suggest?
2025-06-25 17634, 2025
suvid[m]
can i add fields from_date and to_date
2025-06-25 17634, 2025
suvid[m]
with from_date defaulting to a value in the 1900s and to_date defaulting to the current date?
2025-06-25 17640, 2025
suvid[m] sent a sql code block: https://matrix.chatbrainz.org/_matrix/media/v3/download/chatbrainz.org/kcHfoiAFNasmxJnHvCgpCXuw
2025-06-25 17644, 2025
suvid[m]
does this seem fine now?
2025-06-25 17650, 2025
lucifer[m]
<suvid[m]> "lucifer what do you suggest?" <- I'll read the backlog in a while and respond.
2025-06-25 17655, 2025
suvid[m]
<lucifer[m]> "I'll read the backlog in a while..." <- oh ok
2025-06-25 17656, 2025
suvid[m]
let's wrap up with the schema part quickly
2025-06-25 17656, 2025
suvid[m]
I believe the part after it should be rather straightforward :)
2025-06-25 17624, 2025
vardhan_ has quit
2025-06-25 17659, 2025
vardhan joined the channel
2025-06-25 17638, 2025
lucifer[m]
suvid, monkey: i'd like to reuse this table in future to support imports of liked tracks and playlists from files as well. i guess from_date and to_date are common in that respect, so it makes sense to create separate columns for them. let's move the uploaded filename inside metadata because it will only be needed to show on the frontend.
m.amanullah7: I think the issue in your code is that you don't store the client id and client secret in the database after generating them. So you create one app in the authorize url code and another while fetching the access token, which is incorrect. You need to store the app details in the database when they are generated.
2025-06-25 17613, 2025
lucifer[m]
Refactor your database schema in this way, funkwhale_servers: host_url, client_id, client_secret, scopes, created.
holycow23: you do have some code to store the data in couchdb, so i think it should be there. you can port forward port 5984 and view couchdb manually.
i think the code on wolf is outdated so the stats data still has time brackets instead of hours. check and confirm you can see this data. then update the code and regenerate the stats.
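Once port 5984 is forwarded (e.g. `ssh -L 5984:localhost:5984 <host>`), CouchDB can be inspected over its plain HTTP API. A minimal sketch, assuming the forwarded local address; the database name in the usage note is hypothetical:

```python
import json
import urllib.request

# Assumes CouchDB has been port forwarded to localhost:5984.
# The endpoints used below are standard CouchDB HTTP API routes.
COUCH = "http://localhost:5984"

def couch_url(path: str) -> str:
    """Build a CouchDB API URL under the forwarded base address."""
    return f"{COUCH}/{path.lstrip('/')}"

def get_json(path: str):
    """Fetch and decode a JSON response from the CouchDB API."""
    with urllib.request.urlopen(couch_url(path)) as resp:
        return json.loads(resp.read())

# Usage (requires a live forwarded CouchDB; database name is hypothetical):
#   get_json("_all_dbs")                       # list all databases
#   get_json("my_stats_db/_all_docs?limit=5")  # peek at document ids
```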