in #metabrainz

14:24 PM
ericd[m]

Maybe give me another 10 minutes. I want to have some more observations.
14:24 PM
mayhem[m]

ok, np
14:27 PM
bitmap[m]

reosarevok[m]: I was hoping for a way to reproduce the wide char one by just browsing to a page, but I think the `EditExternalLinks` one suffices too, since that goes through Catalyst
14:31 PM
reosarevok[m]

Yeah, I didn't look further since that one did hit it already :)
14:32 PM
ericd[m]

<ericd[m]> "I'll check" <- ah not a bug. it's just that MB has that many releases to return :D
14:33 PM
mayhem[m]

ericd[m]: Then maybe limit the number of items we return?
14:35 PM
ericd[m]

mayhem[m]: yeah I will change it to some more reasonable amounts, or this may confuses users
14:36 PM
mayhem[m]

if the list is truncated, maybe we can add a link to where they can see the rest on the web?
14:37 PM
ericd[m]

s/amounts/amount/, s/confuses/confuse/
14:43 PM
<mayhem[m]> "if the list is truncated..." <- make sense. i will add a link in the feed content.
14:43 PM
reosarevok[m]

yvanzo, bitmap: can at least one of you also check https://github.com/metabrainz/musicbrainz-serve... to see if you feel it's ready?
14:44 PM
yvanzo: I understand you wanted a release today, was that prod or beta?
14:44 PM
yvanzo[m]

prod
14:44 PM
It should have been on last Monday.
14:46 PM
reosarevok[m]

Ok, just making sure
14:46 PM
Seems good to me
14:46 PM
Also, if either of you have a problem with MBS-13681 let me know - if not I'll look into it this week
14:46 PM
BrainzBot

MBS-13681: Show recent event additions/artwork on the frontpage https://tickets.metabrainz.org/browse/MBS-13681
14:46 PM
ericd[m]

mayhem: I think we can restore test server now. Thanks!
14:47 PM
mayhem[m]

very good -- thanks for working with me on this.
14:48 PM
monkey: social-share-image is coming back now.
14:48 PM
monkey[m]

Fankyou
14:52 PM
yvanzo[m]

reosarevok: The two hot topics currently in my pipeline are still SolrCloud backups and Event edit page in React.
14:53 PM
reosarevok[m]

yvanzo: sure, I'm just sharing that one because it's not a team contributor and we usually try to fast-track those :)
14:53 PM
But no worries if you don't have time rn
14:55 PM
bitmap[m]

I'm looking at teh layout shift one
14:55 PM
lucifer[m]

[@mayhem](https://matrix.to/#/@mayhem:chatbrainz.org) Sure.
14:56 PM
bitmap[m]

s/teh/the/
14:56 PM
lucifer[m]

<rimskii[m]> "lucifer: I opened a PR in troi..." <- Probably related to some outdated GitHub action we are using, feel free to ignore. I'll fix.
14:57 PM
mayhem[m]

lucifer[m]: what was this in ref to?
15:00 PM
yvanzo[m]

bitmap, yellowhatpro: ping 🔔
15:00 PM
yellowhatpro[m]

pongg
15:01 PM
yvanzo[m]

bitmap: I just addressed all of reosarevok ’s comments for the prod-needed PR about releasing.
15:03 PM
yellowhatpro[m]

This is what I am working on currently: https://github.com/yellowHatpro/mb-exurl-ia-ser...
15:05 PM
I got messed in it for quite some time. Had written a lot of code for Error handling when I realised I am just over complicating stuff, so went with straightforward impl
15:06 PM
lucifer[m]

<mayhem[m]> "what was this in ref to?" <- Some GitHub action failure on a troi PR
15:06 PM
mayhem[m]

ah.
15:08 PM
djl has quit
15:08 PM
djl joined the channel
15:12 PM
yvanzo[m]

yellowhatpro: Ok, so you’re on using `this_error` atm?
15:13 PM
theflash[m] joined the channel
15:13 PM
theflash[m] uploaded an image: (407KiB) < https://matrix.chatbrainz.org/_matrix/media/v3/download/matrix.org/DcGBWAPCriZamWJOFhWqglJe/IMG_7483.PNG >
15:13 PM
yellowhatpro[m]

I haven't yet, but as rustynova suggested, I will explore and use it
15:13 PM
reosarevok[m]

"We will rule over this error, and we will call it... this_error"
15:14 PM
theflash[m]

akshaaatt[m]: hey, I have implemented pagination in the feed, when i am using LazyVStack, the duplicate events are not being loaded at once
15:16 PM
yellowhatpro[m]

I will be focusing on these 2 points in the current pr:
15:16 PM
- api mocking
15:16 PM
- dealing with rate limiting
15:18 PM
bitmap[m]

you will add some tests for make_archival_network_request using the API mocking, yes?
15:18 PM
yellowhatpro[m]

Yes will add tests for this
15:19 PM
bitmap[m]

besides the lack of tests and RustyNova's suggestions I think it looks pretty good
15:20 PM
but I'd make sure to remove your API key too
15:21 PM
yvanzo[m]

It should be retrieved from a configuration file instead.
15:21 PM
yellowhatpro[m]

aah yes. Will remove it soon.
15:21 PM
Also,should we use some meb account for archiving?
15:22 PM
yvanzo[m]

Maybe for deployment. Does the account matter for development?
15:22 PM
yellowhatpro[m]

Nope, for dev I am using my own id and key
15:23 PM
yvanzo[m]

Having a configuration file should probably be a priority as there are a number of hard-coded values in the code that would better fit that too.
15:23 PM
yellowhatpro[m]

yvanzo[m]: Yeah I will add them in env fle itself, I thought to add in the final commits of the PR. Current credentials don't matter much
15:23 PM
bitmap[m]

for MBS we wrote a small service that mocks the IA's S3 API, you could also write something similar here that mocks the /save endpoint (for development)
15:24 PM
yellowhatpro[m]

bitmap[m]: Oh we are using IA's API in MBS, are we dealing with rate limiiting in that as well?
15:25 PM
yvanzo[m]

yellowhatpro: `.env` might be too limited. A TOML file might be more appropriate. See the crate `config` for example.
15:25 PM
yellowhatpro[m]

yvanzo[m]: ok will make it work soon
15:25 PM
Okk, gonna explore this_error and config crates then
15:26 PM
yvanzo[m]

No, we aren’t using the same API from MBS.
15:26 PM
bitmap[m]

yellowhatpro[m]: not really, we just sleep 1s between each event (but each event may take 1-2s to process). if we hit the rate limit (which is rare), it's just retried later
15:26 PM
yellowhatpro[m]

ohh alright.
15:27 PM
bitmap[m]

but yes, it's a completely different API
15:27 PM
yellowhatpro[m]

bitmap[m]: Right. I should try something similar.
15:27 PM
Maybe I should just apply some math and since I am mostly polling, the time can be configured
15:31 PM
bitmap[m]

for importing existing edits we may want something that takes better advantage of the rate limit though, since that process will take a while
15:32 PM
yvanzo[m]

It is okay to start with a simplistic rate limiting indeed. It can be improved later on, once everything starts to be working together.
15:33 PM
Ideally, it should be what bitmap mentioned: different threads or processes to handle polling and requesting.
15:34 PM
yellowhatpro[m]

Ummm a doubt here
15:35 PM
Ok nvm. I thought you meant we have to create multiple threds for requesting
15:36 PM
yvanzo[m]: Yupp I am running polling and archiving in different threads
15:37 PM
yvanzo[m]

Threading isn’t in the main goals, so at most just make a note about it for the stretch goals if you want to remind about it.
15:38 PM
Great if you have some kind of threading already. :)
15:39 PM
Yes, multiple threads for requesting might be a thing if we can be allowed a higher rate limit.
15:40 PM
yellowhatpro[m]

<bitmap[m]> "for importing existing edits..." <- Regarding this, If we are rate limited, then should I focus on maximizing the requests.
15:40 PM
For ex, if I don't have any URL to process in current poll (while making a request), should I devote that time to archive the existing ones?
15:41 PM
yvanzo[m]: Ohh did you mean archiving x URLs parallely ?
15:41 PM
yvanzo[m]

Yes (as stretch goals)
15:43 PM
yellowhatpro[m]

Got it ✅
15:44 PM
bitmap[m]

yellowhatpro[m]: > <@yellowhatpro:matrix.org> Regarding this, If we are rate limited, then should I focus on maximizing the requests.
15:44 PM
> For ex, if I don't have any URL to process in current poll (while making a request), should I devote that time to archive the existing ones?
15:44 PM
if there's still work to do you should maximize the requests you can do, ideally
15:44 PM
but you can start with something simple as yvanzo said
15:45 PM
yellowhatpro[m]

btw there has to be another task that will do cleanup/re-archival part of URLs that couldn't get archived in the first place. That will also repeat after x amout of time. That has to be done after I am done with the archival part
15:45 PM
bitmap[m]

not sure what you meant by "archive the existing ones" though, do you mean older edits?
15:47 PM
yellowhatpro[m]

<bitmap[m]> "for importing existing edits..." <- yupp older ones. I thought you were referring to them when you said importing existing edits
15:48 PM
bitmap[m]

yeah, I was, but I was under the impression that there was only one edit counter that is incremented; so it starts from the beginning, and doesn't process new edits until all previous ones have been processed
15:50 PM
yellowhatpro[m]

I mean its configurable, we can either have it start from the beginning, or from the latest one as well
15:52 PM
I haven't really thought what should be the better thing to do. But later if we go with the trigger impl, we will have to start with the latest edits, which keeps on incrementing.
15:52 PM
But in any case, I will try to archive all the previous ones as well
15:53 PM
yvanzo[m]

yellowhatpro: There is a feature in GitHub to mark your PRs as drafts if needed.
15:54 PM
yellowhatpro[m]

ok, should I make the wip PR draft?
15:55 PM
yvanzo[m]

It seems synonymous indeed :)
15:55 PM
* It seems to be synonymous indeed
15:56 PM
yellowhatpro[m]

Okii made it a draft one
15:56 PM
bitmap[m]

<yellowhatpro[m]> "I mean its configurable, we..." <- I assumed "if I don't have any URL to process in current poll (while making a request), should I devote that time to archive the existing ones" was within a single process -- i.e. how are you keeping track of which edits have been processed in that case
16:00 PM
yellowhatpro[m]

Edits processed means when I am adding them to `internet_archive_urls` table right?
16:00 PM
I am just tracking the last edit in that case
16:00 PM
Sorry I get suuper confused sometimes
16:03 PM
fletchto99 has quit
16:03 PM
fletchto99 joined the channel
16:04 PM
yvanzo[m]

No worries, it should become more clear once you have API requests in the loop.
16:05 PM
bitmap[m]

maybe I misunderstood you :) I thought you were talking about prioritizing the processing of new (recent) URLs, and then processing old (existing) URLs only if there are no recent ones polled -- which would require separate counters
16:05 PM
btw, if the service is stopped, where are edit_note_start_idx and edit_data_start_idx read from such that it can continue from where it left off?
16:07 PM
yellowhatpro[m]

from internet_archive_urls table : https://github.com/yellowHatpro/mb-exurl-ia-ser...
16:08 PM
Welp I didn't write a clear function doc for it
16:08 PM
Will update that
16:08 PM
akshaaatt[m]

On it mayhem !
16:09 PM
theflash__ (IRC): can you elaborate more on the issue?
16:10 PM
bitmap[m]

yellowhatpro[m]: hmm, doesn't that only track the last edit with a save-able URL? many edit notes won't have URLs for example
16:13 PM
BrainzGit

[listenbrainz-server] 14MonkeyDo opened pull request #2937 (03master…entity-stats-page): LB-1102: Revamp Top Entity stats pages https://github.com/metabrainz/listenbrainz-serv...
16:14 PM
yellowhatpro[m]

Yeah, right. But internet_archive_urls is the only place for now where I can look for the data. Is there any other way where I can keep the latest edit data and edit note id?
16:15 PM
yvanzo[m]

Probably a separate table last_processed_rows
16:15 PM
bitmap[m]

you could introduce a new table to store them
16:15 PM
yellowhatpro[m]

alright then, a new table coming right up ✅
16:16 PM
bitmap[m]

that's why I was asking about prioritizing recent edits (so that they are archived right away such that the state of the page at the time the edit or note was entered is preserved) over older ones
16:18 PM
yellowhatpro[m]

what do you refer to when you say the state of the page??
16:18 PM
The recent rows ?
16:19 PM
bitmap[m]

the content of the page being archived
16:19 PM
yvanzo[m]

bitmap: Prioritizing recent edits certainly is a longer term goal.
16:20 PM
bitmap[m]

with the last_processed_rows table you could potentially keep separate counters for recent vs. historical edits later on
16:21 PM
yellowhatpro[m]

bitmap[m]: oh nice, now I am able to process things
16:21 PM
yvanzo[m]

It will have to be as flexible as possible, but if we just start with one row pointer per table, that would be a good start.
16:24 PM
yellowhatpro[m]

cool, each row in last_processed_rows pointing to the latest processed rows of different tables (edit_data and edit_table currently)
16:25 PM
latest processed during polling, regardless it containg URL or not
16:25 PM
yvanzo[m]

yellowhatpro: At first glance, what columns do you imagine for this new table?
16:25 PM
yellowhatpro[m]

id, latest_row_processed, table_name
16:26 PM
as of now
16:26 PM
yvanzo[m]

Yup, even though id is probably unneeded (or I’m missing the point).
16:27 PM
yellowhatpro[m]

yeah it's not needed
16:27 PM
yvanzo[m]

Or just use it to refer to the id in the other table?
16:27 PM
pranav[m]

akshaaatt: I’ll try to get the stats page in soon before mid term evals
16:27 PM
yellowhatpro[m]

yeah right
16:28 PM
yvanzo[m]

You might also need a column column as not every table has a column id.
16:28 PM
(or id_column if it helps with clarity)
16:29 PM
yellowhatpro[m]

id_column will refer to id in case of edit_note and edit in case of edit_data, right?
16:30 PM
yvanzo[m]

That should work.
16:30 PM
discordbrainz

<05rustynova> bitmap: "not really, we just sleep 1s between each event..." That's the easy part. Now deal with an async and parallel environment and it starts messing itself up in .23 femtoseconds. Either you do the clean way and use semaphores, holding permits until the next refresh window, or you do the ugly way and just hold a mutex until prev_request_start + 1. I had to do the later one for MB_RS as it doesn't have the http
16:30 PM
headers needed to the number of tickets per window: https://github.com/RustyNova016/musicbrainz_rs_...
16:31 PM
yellowhatpro[m]

Cool
16:31 PM
yvanzo[m]

bitmap, yellowhatpro: Anything else to be discussed about the archiver project? :)
16:32 PM
discordbrainz

<05rustynova> As for the DB schema, why not just doing a ORDER BY on an inserted_at column?
16:32 PM
bitmap[m]

yvanzo[m]: nothing else from me right now :) thanks!
16:33 PM
texke has quit
16:33 PM
yellowhatpro[m]

yvanzo[m]: Nothing in particular, got the notes for today