holycow23: test it in pyspark using the new commands i shared.
pite has quit
davic has quit
rayyan_seliya123
<rayyan_seliya123> hey lucifer: I noticed that in my latest push I am not using any models for my indexer implementation and not importing it in my indexer script, though it is still present in `models.py`. I was thinking should we remove... (full message at <https://matrix.chatbrainz.org/_matrix/media/v3/...>)
Kladky joined the channel
outsidecontext[m]
mayhem: good morning. It's the time again where Apple has changes to their license agreements and wants you to accept that :) Picard code signing is failing again
lucifer[m]
rayyan_seliya123: oh sorry, i missed that comment yes. i'll review your PR again today and let you know then what to do about the models as well.
mamanullah7[m] uploaded an image: (365KiB) < https://matrix.chatbrainz.org/_matrix/media/v3/download/matrix.org/vnyprCeTiDAelZTbmYmvcFTO/Screenshot%202025-06-25%20at%201.49.19%E2%80%AFPM.png >
but here I can see it's authorising!
lucifer[m]
m.amanullah7: push your latest code to the PR and i'll check
rayyan_seliya123: taking a look at the PR, it looks good to me. let's keep the model for now and we can delete it later if still not needed.
you should refactor it to be like the other caches now. take a look at SoundcloudHandler and refactor your class to work like that. the rabbitmq connection bits will then be automatically handled by the ServiceMetadataCache
<lucifer[m]> "you should refactor it to be..." <- Thanks for the review! Sure, I will refactor it like the other caches and update you after that.
suvid[m]
<lucifer[m]> "suvid: hi! any update on the..." <- Hi I was just finalizing the new schema
Should I keep the file_available_until in the schema to clean up the old import files?
mayhem[m]
outsidecontext: apple agreement carefully read, like literally every word, and then accepted. 🤥
lucifer[m]
no, old import files should be deleted after the import is done.
suvid[m]
oh ok 👍️
suvid[m] uploaded an image: (17KiB) < https://matrix.chatbrainz.org/_matrix/media/v3/download/matrix.org/SIktnCrCFXjnJhLEQaHkgFss/image.png >
then this seems fine ig
lucifer[m]
i think you can rename filename to uploaded_filename for clarity.
TOPIC: MetaBrainz Community and Development channel | MusicBrainz non-development: #musicbrainz | BookBrainz: #bookbrainz | Channel is logged and not empty as it is bridged to IRC; see https://musicbrainz.org/doc/ChatBrainz for details | Agenda: Reviews, MBS-14035 (reo)
and for progress i think you should change it to a JSONB. then it can have fields for status text, first imported listen and last import listen.
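A sketch of what that JSONB progress payload could look like — the field names here are illustrative, not the actual ListenBrainz schema:

```python
import json
from datetime import datetime, timezone

def make_progress(status, first_imported, last_imported):
    """Build a progress payload with the three fields suggested above:
    status text, first imported listen, last imported listen.
    All field names are hypothetical."""
    return {
        "status": status,  # free-form status text shown to the user
        "first_imported_listen": first_imported.isoformat(),
        "last_imported_listen": last_imported.isoformat(),
    }

progress = make_progress(
    "importing",
    datetime(2019, 3, 1, tzinfo=timezone.utc),
    datetime(2025, 6, 24, tzinfo=timezone.utc),
)
print(json.dumps(progress))
```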
suvid[m] sent a sql code block: https://matrix.chatbrainz.org/_matrix/media/v3/download/chatbrainz.org/OMQqMrsFMHLLeXVdkeCrXVZH
holycow23[m]
<lucifer[m]> "holycow23: test it in pyspark..." <- Yeah done with that, the messages are generating well
Now the part left is receiving the same in the Backend API, how do I go ahead with that?
lucifer[m]
is the data being stored in couchdb yet?
holycow23[m]
I don’t know
How do I query that
suvid[m]
Hi lucifer
I have committed the new schema in the pr
I have also migrated it on my setup locally and there seem to be no issues
lucifer[m]
<holycow23[m]> "How do I query that" <- Okay, i'll check the pr again in a while and let you know
suvid[m]
<lucifer[m]> "and for progress i think you..." <- why do we need to have first imported listen in it?
should i just name it progress instead of metadata, as we are only storing progress info: current status, last imported listen and first imported listen?
lucifer[m]
suvid: I think metadata is fine for now. Monkey suggested that first and last imported might be useful to show to the user.
let's start working on processing the uploaded archive files now.
monkey[m] appears after the incantation
suvid[m]
lucifer[m]: Oh okay
vardhan_ has quit
monkey[m]
Oh yeah, the context for this was that for example if I want to import my Spotify extended history but only a certain datetime range, knowing that timestamp range along with the file name would be useful to keep track of the imported listens.
For Spotify the multiple history files are not sorted by timestamp, so each file will have listens from random times and dates between when I created my account and now, rather than file 1 being my oldest listens, then file 2, etc.
vardhan_ joined the channel
Hard to keep track of what I already imported, in that case
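A minimal sketch of the filtering plus bookkeeping being discussed here: keep only listens inside the user's chosen range, and record the first and last timestamps actually imported so the limits survive after the import finishes. The listen shape and names are illustrative, not the real ListenBrainz code:

```python
from datetime import datetime, timezone

def filter_and_track(listens, from_ts, to_ts):
    """Yield listens inside [from_ts, to_ts] and track the earliest and
    latest timestamps actually imported. `listens` is an iterable of
    dicts with a `listened_at` datetime; this shape is hypothetical."""
    imported = []
    first = last = None
    for listen in listens:
        ts = listen["listened_at"]
        if from_ts <= ts <= to_ts:
            imported.append(listen)  # in real code: publish to the rabbitmq queue
            if first is None or ts < first:
                first = ts
            if last is None or ts > last:
                last = ts
    return imported, first, last
```

Storing `first` and `last` alongside the filename would let a user see which range of a given history file has already been imported.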
suvid[m]
but like when we encounter each listen, we just check if it belongs in the date range and if it does, send it to the rabbitmq queue
i was thinking of this
kellnerd[m]
Interesting, the Spotify exports which I have seen so far had listens ordered by time, with the year(s) as part of the filenames 👀
monkey[m]
suvid[m]: Yes, that's the filtering I expect. But then if I put a limit datetime, once the import is finished how will I remember what that limit was that I set before importing that specific file?
suvid[m]
monkey[m]: can we just store the start and end datetime stamps in the user data imports model?
and have status and progress normally
monkey[m]
I haven't followed what the structure or storage of this metadata would look like, I'm afraid, so don't really have an answer for you there. just here to explain why those timestamps should be saved somewhere
suvid[m]
okay yea
so the problem indicated is basically keeping track of timestamps right?
i believe we should just simply store the timestamps in the model
and maybe last imported listen to show to user on the UI as progress
lucifer what do you suggest?
can i add fields from_date and to_date
with from_date having a default value in the 1900s and to_date defaulting to the current date?
suvid[m] sent a sql code block: https://matrix.chatbrainz.org/_matrix/media/v3/download/chatbrainz.org/kcHfoiAFNasmxJnHvCgpCXuw
does this seem fine now?
lucifer[m]
<suvid[m]> "lucifer what do you suggest?" <- I'll read the backlog in a while and respond.
suvid[m]
<lucifer[m]> "I'll read the backlog in a while..." <- oh ok
let's wrap up with the schema part quickly
I believe the part after it should be rather straightforward :)
vardhan_ has quit
vardhan joined the channel
lucifer[m]
suvid, monkey: i'd like to reuse this table in the future to support imports of liked tracks and playlists from files as well. i guess from_date and to_date are common in that respect, so it makes sense to create separate columns for them. let's move uploaded filename inside metadata because it will only be needed to show on the frontend.
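An illustrative sketch of the table shape being converged on here, using SQLite stand-ins (TEXT in place of Postgres JSONB); the table and column names are guesses based on this discussion, not the actual ListenBrainz DDL:

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE user_data_import (
        id        INTEGER PRIMARY KEY,
        user_id   INTEGER NOT NULL,
        service   TEXT NOT NULL,    -- listens now; liked tracks / playlists later
        from_date TEXT NOT NULL,    -- separate columns, common to all import types
        to_date   TEXT NOT NULL,
        metadata  TEXT NOT NULL     -- JSONB in Postgres; holds uploaded_filename, progress
    )
""")
conn.execute(
    "INSERT INTO user_data_import (user_id, service, from_date, to_date, metadata) "
    "VALUES (?, ?, ?, ?, ?)",
    (1, "spotify", "1900-01-01", "2025-06-25",
     json.dumps({"uploaded_filename": "endsong_3.json", "status": "in_progress"})),
)
row = conn.execute("SELECT metadata FROM user_data_import").fetchone()
print(json.loads(row[0])["uploaded_filename"])
```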
m.amanullah7: I think the issue in your code is that you don't store the client id and client secret in the database after generating them. so you create one app in the authorize url code and another while fetching the access token. that is incorrect. you need to store the app details in the database when they are generated.
Refactor your database schema this way: funkwhale_servers: host_url, client_id, client_secret, scopes, created.
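A sketch of the store-once, reuse-everywhere pattern being suggested, with SQLite standing in for the real database and `register_app` as a stand-in for the actual app registration call against the Funkwhale server:

```python
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE funkwhale_servers (
        host_url      TEXT PRIMARY KEY,
        client_id     TEXT NOT NULL,
        client_secret TEXT NOT NULL,
        scopes        TEXT NOT NULL,
        created       TEXT NOT NULL
    )
""")

def register_app(host_url):
    # stand-in: a real implementation would POST to the Funkwhale
    # server's app-registration endpoint and return its response
    return f"id-for-{host_url}", "secret"

def get_or_register_app(host_url):
    """Return stored app credentials for host_url, registering once if
    missing, so the authorize URL code and the token fetch both see
    the same app instead of each creating their own."""
    row = conn.execute(
        "SELECT client_id, client_secret FROM funkwhale_servers WHERE host_url = ?",
        (host_url,),
    ).fetchone()
    if row:
        return row
    client_id, client_secret = register_app(host_url)
    conn.execute(
        "INSERT INTO funkwhale_servers VALUES (?, ?, ?, ?, ?)",
        (host_url, client_id, client_secret, "read",
         datetime.now(timezone.utc).isoformat()),
    )
    return client_id, client_secret
```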
holycow23: you do have some code to store the data in couchdb, so i think it should be there. you can port forward 5984 port and view couchdb manually.
i think the code on wolf is outdated so the stats data still has time brackets instead of hours. check and confirm you can see this data. then update the code and regenerate the stats.