[listenbrainz-android] dependabot[bot] opened pull request #468 (dev…dependabot/gradle/dev/com.google.devtools.ksp-2.0.10-1.0.24): Bump com.google.devtools.ksp from 2.0.0-1.0.23 to 2.0.10-1.0.24 https://github.com/metabrainz/listenbrainz-androi…
2024-08-07
BrainzGit
[listenbrainz-android] dependabot[bot] closed pull request #464 (dev…dependabot/gradle/dev/com.google.devtools.ksp-2.0.0-1.0.24): Bump com.google.devtools.ksp from 2.0.0-1.0.23 to 2.0.0-1.0.24 https://github.com/metabrainz/listenbrainz-androi…
2024-08-07
pite has quit
2024-08-07
lusciouslover has quit
2024-08-07
lusciouslover joined the channel
2024-08-07
Kladky joined the channel
2024-08-07
ericd[m] has quit
2024-08-07
tarun joined the channel
2024-08-07
tarun has quit
2024-08-07
wargreen has quit
2024-08-07
BrainzGit
[listenbrainz-server] MonkeyDo opened pull request #2956 (brainzplayer-spa…rebased-multi-track-mbid-mapping): LB-1281: Link all listens from the same album https://github.com/metabrainz/listenbrainz-server…
2024-08-07
monkey[m]
Did anyone else just receive a "ListenBrainz Spotify Importer Error" email?
2024-08-07
minimal joined the channel
2024-08-07
discordbrainz
<lazybookwyrm> Yeah
2024-08-07
BobSwift[m]
monkey: Yes, and I thought that was really strange since I don't actually have a spotify account.
For some reason, "Activate both features" is checked, but I know that I never did it since I've never had a spotify account. Was this setting automatically checked as a default at some point? Will it screw something up if I uncheck it (if it allows me to do that)?
2024-08-07
nbin joined the channel
2024-08-07
BobSwift[m]
I see soundcloud is also enabled, but I don't have a soundcloud account either. I just disabled spotify in the settings.
2024-08-07
pite joined the channel
2024-08-07
TOPIC: MetaBrainz Community and Development channel | MusicBrainz non-development: #musicbrainz | BookBrainz: #bookbrainz | Channel is logged and not empty as it is bridged to IRC; see https://musicbrainz.org/doc/ChatBrainz for details | Agenda: Reviews, Docker Compose v2 (yvanzo)
2024-08-07
monkey[m]
Disabling either won't have any adverse effect. Strange that it was checked for you without your knowledge, but at least it explains the email...
2024-08-07
yvanzo
The agenda item I just added is about dropping support for Docker Compose v1 for MB mirrors (with announcement for November?), and switching to Docker Compose v2 for development setup in all MetaBrainz projects.
2024-08-07
yvanzo
(v1 has reached its EOL already)
2024-08-07
BobSwift[m]
monkey: I wonder if it has something to do with the server being "test.listenbrainz.org", or does that use the same user database for the account settings?
2024-08-07
bitmap[m]
yellowhatpro[m]: > <@yellowhatpro:matrix.org> At the other end, the listener, which constantly listens to the channel, I am deliberately sleeping for some time (5 sec currently)
and remind me, after sleeping for 5s, it polls all edits/edit notes made since then?
2024-08-07
yellowhatpro[m]
bitmap[m]: Alrighty, will remove it from the query
2024-08-07
yellowhatpro[m]
yellowhatpro[m]: > <@yellowhatpro:matrix.org> At the other end, the listener, which constantly listens to the channel, I am deliberately sleeping for some time (5 sec currently)
Regarding this: with this sleep, we can rate-limit the archival part. That means, no matter how fast the URLs are being added, it will only archive (or try to archive) at the interval we decide.
2024-08-07
yellowhatpro[m]
bitmap[m]: the polling part is different; it is independent of the network-requester part
2024-08-07
bitmap[m]
ah sorry, this is in the listener
2024-08-07
yellowhatpro[m]
After 5 sec, it will pick the item at the front of the channel and try archiving it.
Now it may be that the URL was already tried for archival before, with retry count >= 3; in that case, we mark it Failed
2024-08-07
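(A minimal sketch of the listener loop being described, assuming a tokio mpsc channel; the names `ArchivalTask`, `try_archive`, and `mark_failed` are illustrative assumptions, not the project's actual code:)

```rust
use std::time::Duration;
use tokio::sync::mpsc;

// Hypothetical task type; field names are assumptions.
struct ArchivalTask {
    url: String,
    retry_count: u32,
}

// Stub for the real archival request.
async fn try_archive(task: &ArchivalTask) {
    println!("archiving {} (attempt {})", task.url, task.retry_count + 1);
}

// Stub for marking the row Failed and reporting the failure to Sentry.
async fn mark_failed(task: &ArchivalTask) {
    eprintln!("marking {} as Failed", task.url);
}

// The listener: sleeping 5 s before each attempt caps the archival rate,
// no matter how quickly URLs are enqueued on the channel.
async fn listener(mut rx: mpsc::Receiver<ArchivalTask>) {
    while let Some(task) = rx.recv().await {
        tokio::time::sleep(Duration::from_secs(5)).await;
        if task.retry_count >= 3 {
            mark_failed(&task).await; // retries exhausted
        } else {
            try_archive(&task).await;
        }
    }
}

#[tokio::main]
async fn main() {
    let (tx, rx) = mpsc::channel(100);
    let task = ArchivalTask { url: "https://example.com".into(), retry_count: 0 };
    tx.send(task).await.unwrap();
    drop(tx); // close the channel so the listener loop ends
    listener(rx).await;
}
```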
bitmap[m]
and you log that to sentry, looks good to me
2024-08-07
yellowhatpro[m]
To track the different statuses of a URL, I am using enums (at the Rust level)
2024-08-07
yellowhatpro[m]
bitmap[m]: yupp
2024-08-07
yellowhatpro[m]
So basically, there are 5 enum states:... (full message at <https://matrix.chatbrainz.org/_matrix/media/v3/download/chatbrainz.org/ySKTlBoJDgSosTLwGOCjdlbj>)
2024-08-07
yellowhatpro[m]
Initially, every URL row in the internet_archive_urls table is 1 (NotStarted)
2024-08-07
yellowhatpro[m]
When it is pushed to the channel, it moves to 2 (Processing)
2024-08-07
yellowhatpro[m]
If the URL passes the status check, it is marked 3 (Success)
2024-08-07
yellowhatpro[m]
If not, it is marked 4 (Error)
2024-08-07
yellowhatpro[m]
Finally, if for any reason the URL cannot be archived, it is marked 5 (Failed)
2024-08-07
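(Sketched as a Rust enum; the name `ArchivalStatus` and the `#[repr(i16)]` column mapping are assumptions, but the discriminants follow the 1–5 numbering above:)

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(i16)]
enum ArchivalStatus {
    NotStarted = 1, // row inserted into internet_archive_urls, nothing done yet
    Processing = 2, // pushed onto the channel
    Success = 3,    // archival status check passed
    Error = 4,      // status check failed; may still be retried
    Failed = 5,     // cannot be archived / retries exhausted
}
```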
bitmap[m]
ok, so Failed is set after all retry attempts have been exhausted
yellowhatpro[m]
yvanzo[m]: The time it's being inserted in the `internet_archive_urls`
2024-08-07
bitmap[m]
makes sense, so that includes all statuses that the previously-inserted URL has. (maybe it makes sense to ignore `Failed` ones that were processed longer than 24 hours ago though?)
2024-08-07
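(A sketch of what that polling-side check could look like with sqlx — the table name is from the discussion, but the `created_at` column and the exact query are assumptions: a URL is only inserted if no row for it, in any status, was created within the last 24 hours:)

```rust
use sqlx::PgPool;

// Hypothetical implementation of the dedup check used while polling
// edit_data and edit_notes; column names are guesses.
async fn should_insert_url_to_internet_archive_urls(
    pool: &PgPool,
    url: &str,
) -> Result<bool, sqlx::Error> {
    let recent: i64 = sqlx::query_scalar(
        "SELECT count(*) FROM internet_archive_urls
         WHERE url = $1
           AND created_at > now() - interval '1 day'",
    )
    .bind(url)
    .fetch_one(pool)
    .await?;
    Ok(recent == 0)
}
```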
yellowhatpro[m]
yeah, failed ones are being killed
2024-08-07
yellowhatpro[m]
by the retry task
2024-08-07
yvanzo[m]
what do you mean by killed?
2024-08-07
yvanzo[m]
(since it isn’t a process)
2024-08-07
yellowhatpro[m]
Ahh, I mean deleted
2024-08-07
yellowhatpro[m]
sowwy
2024-08-07
yellowhatpro[m]
just got excited
2024-08-07
yellowhatpro[m]
The ones with status Error, if they were permanent errors, are also getting deleted:
only when the retry/cleanup task starts again and iterates over the whole internet_archive_urls table; when it spots a Failed row, it deletes that row
2024-08-07
bitmap[m]
yellowhatpro[m]: so how long would you estimate a `Failed` event would stick around before it's deleted?
2024-08-07
yellowhatpro[m]
Less than a day is what I am assuming right now, since the retry task wakes every 24 hr
2024-08-07
bitmap[m]
ok, just wondering if it makes sense to keep them around for longer in case sentry is down or something
2024-08-07
yellowhatpro[m]
no issue, I was going to ask whether, since we have Sentry, we still need to store them for long, but I guess I got my answer hehe
2024-08-07
yellowhatpro[m]
So before deleting any Failed entry, I can add a check for some duration X: if the entry is older than that, we delete it; otherwise we keep it
2024-08-07
bitmap[m]
sounds good to me
2024-08-07
bitmap[m]
so URLs are only retried once per 24 hr, correct?
2024-08-07
yellowhatpro[m]
yeah, the retry task will spawn every 24 hr, iterate the internet_archive_urls table, and check what it can delete or retry
2024-08-07
yellowhatpro[m]
the ones we can retry are sent to the channel
2024-08-07
yellowhatpro[m]
where they are enqueued
2024-08-07
yellowhatpro[m]
and when their time comes, they go to the listener
2024-08-07
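(Putting the retry/cleanup behaviour described above into one sketch — it reuses the hypothetical `ArchivalTask` and `ArchivalStatus` types from earlier, with row loading and deletion stubbed out: wake every 24 h, delete `Failed` rows past a grace period, and re-enqueue `Error` rows that still have retries left:)

```rust
use std::time::Duration;
use tokio::sync::mpsc::Sender;

// Hypothetical row type mirroring internet_archive_urls.
struct UrlRow {
    url: String,
    status: ArchivalStatus,
    retry_count: u32,
    age_hours: u64, // time since the row was inserted
}

async fn load_all_rows() -> Vec<UrlRow> { Vec::new() } // stub
async fn delete_row(row: &UrlRow) { eprintln!("deleting {}", row.url); } // stub

async fn retry_task(tx: Sender<ArchivalTask>) {
    const GRACE_HOURS: u64 = 24; // the "X duration" under discussion
    loop {
        tokio::time::sleep(Duration::from_secs(24 * 60 * 60)).await;
        for row in load_all_rows().await {
            match row.status {
                // Keep Failed rows for a grace period (e.g. in case Sentry
                // was down), then delete them.
                ArchivalStatus::Failed if row.age_hours >= GRACE_HOURS => {
                    delete_row(&row).await;
                }
                // Error rows with retries left go back onto the channel,
                // where the listener picks them up in turn.
                ArchivalStatus::Error if row.retry_count < 3 => {
                    let task = ArchivalTask {
                        url: row.url,
                        retry_count: row.retry_count + 1,
                    };
                    let _ = tx.send(task).await;
                }
                _ => {}
            }
        }
    }
}
```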
bitmap[m]
and how many attempts does it make currently?
2024-08-07
yvanzo[m]
how do you control when the retry task spawns?
2024-08-07
bitmap[m]
just asking because if a URL is on its third attempt, that would be three days later, but `should_insert_url_to_internet_archive_urls` only checks for URLs within the last day?
2024-08-07
yellowhatpro[m]
bitmap[m]: `should_insert_url_to_internet_archive_urls` is only for the case when I am polling from `edit_data` and `edit_notes`
yvanzo[m]
Retrying all at once may face network issues or rate limits, so that isn’t optimal, but it should allow testing it at least. :)
2024-08-07
bitmap[m]
yellowhatpro[m]: not sure I follow -- I'm trying to understand what happens if a URL older than 24 hrs still has pending retries left, and `should_insert_url_to_internet_archive_urls` doesn't see it because of the 24 hour window
2024-08-07
yellowhatpro[m]
yvanzo[m]: oh, for testing actually, I take only 20 rows from internet_archive_urls
2024-08-07
yvanzo[m]
yellowhatpro: I mean testing in deployment.
2024-08-07
yellowhatpro[m]
bitmap[m]: the method `should_insert_url_to_internet_archive_urls` is only used while polling:
If a URL older than 24 hrs still has pending retries, it will be pushed to internet_archive_urls again; the `should_insert_url_to_internet_archive_urls` method is not checked in this flow
2024-08-07
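(So the two entry points differ, per the sketches above: polling runs the dedup check before inserting, while the retry task re-enqueues rows directly and never consults it. Roughly — the helpers `extract_urls_from_edits` and `insert_row` are hypothetical:)

```rust
use sqlx::PgPool;

// Stub: would scan edit_data and edit_notes for URLs.
async fn extract_urls_from_edits(_pool: &PgPool) -> Result<Vec<String>, sqlx::Error> {
    Ok(Vec::new())
}

// Stub: inserts a row with status NotStarted (1).
async fn insert_row(_pool: &PgPool, url: &str) -> Result<(), sqlx::Error> {
    println!("inserting {url}");
    Ok(())
}

// Polling flow: only newly discovered URLs pass through the dedup window;
// retries scheduled by the 24 h task bypass this function entirely.
async fn poll_edits(pool: &PgPool) -> Result<(), sqlx::Error> {
    for url in extract_urls_from_edits(pool).await? {
        if should_insert_url_to_internet_archive_urls(pool, &url).await? {
            insert_row(pool, &url).await?;
        }
    }
    Ok(())
}
```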
yellowhatpro[m]
<yvanzo[m]> "yellowhatpro: I mean testing..." <- Umm, sorry I couldn't understand this. retrying all at once while testing in deployment means?
2024-08-07 22047, 2024
yvanzo[m]
yellowhatpro: It will be deployed to work with test.musicbrainz.org in the beginning.
2024-08-07
bitmap[m]
yellowhatpro[m]: > <@yellowhatpro:matrix.org> the method `should_insert_url_to_internet_archive_urls` is only being used while polling:... (full message at <https://matrix.chatbrainz.org/_matrix/media/v3/download/chatbrainz.org/oCSxHMfeofRRvSUABfrAirxq>)
2024-08-07
yvanzo[m]
Retrying all the URLs that need to be retried one after another isn’t optimal because it might happen at a time when the API isn’t available or the service has already reached its rate limit.