#metabrainz

/

14:24 PM
ericd[m]

Maybe give me another 10 minutes. I want to have some more observations.

2024-07-10 19222, 2024

14:24 PM
mayhem[m]

ok, np

2024-07-10 19204, 2024

14:27 PM
bitmap[m]

reosarevok[m]: I was hoping for a way to reproduce the wide char one by just browsing to a page, but I think the `EditExternalLinks` one suffices too, since that goes through Catalyst

2024-07-10 19239, 2024

14:31 PM
reosarevok[m]

Yeah, I didn't look further since that one did hit it already :)

2024-07-10 19222, 2024

14:32 PM
ericd[m]

<ericd[m]> "I'll check" <- ah not a bug. it's just that MB has that many releases to return :D

2024-07-10 19200, 2024

14:33 PM
mayhem[m]

ericd[m]: Then maybe limit the number of items we return?

2024-07-10 19256, 2024

14:35 PM
ericd[m]

mayhem[m]: yeah I will change it to some more reasonable amounts, or this may confuses users

2024-07-10 19228, 2024

14:36 PM
mayhem[m]

if the list is truncated, maybe we can add a link to where they can see the rest on the web?

2024-07-10 19242, 2024

14:37 PM
ericd[m]

s/amounts/amount/, s/confuses/confuse/

2024-07-10 19210, 2024

14:43 PM
ericd[m]

<mayhem[m]> "if the list is truncated..." <- make sense. i will add a link in the feed content.

2024-07-10 19248, 2024

14:43 PM
reosarevok[m]

yvanzo, bitmap: can at least one of you also check https://github.com/metabrainz/musicbrainz-server/… to see if you feel it's ready?

2024-07-10 19213, 2024

14:44 PM
reosarevok[m]

yvanzo: I understand you wanted a release today, was that prod or beta?

2024-07-10 19237, 2024

14:44 PM
yvanzo[m]

prod

2024-07-10 19249, 2024

14:44 PM
yvanzo[m]

It should have been on last Monday.

2024-07-10 19215, 2024

14:46 PM
reosarevok[m]

Ok, just making sure

2024-07-10 19217, 2024

14:46 PM
reosarevok[m]

Seems good to me

2024-07-10 19237, 2024

14:46 PM
reosarevok[m]

Also, if either of you have a problem with MBS-13681 let me know - if not I'll look into it this week

2024-07-10 19237, 2024

14:46 PM
BrainzBot

MBS-13681: Show recent event additions/artwork on the frontpage https://tickets.metabrainz.org/browse/MBS-13681

2024-07-10 19253, 2024

14:46 PM
ericd[m]

mayhem: I think we can restore test server now. Thanks!

2024-07-10 19209, 2024

14:47 PM
mayhem[m]

very good -- thanks for working with me on this.

2024-07-10 19208, 2024

14:48 PM
mayhem[m]

monkey: social-share-image is coming back now.

2024-07-10 19213, 2024

14:48 PM
monkey[m]

Fankyou

2024-07-10 19230, 2024

14:52 PM
yvanzo[m]

reosarevok: The two hot topics currently in my pipeline are still SolrCloud backups and Event edit page in React.

2024-07-10 19203, 2024

14:53 PM
reosarevok[m]

yvanzo: sure, I'm just sharing that one because it's not a team contributor and we usually try to fast-track those :)

2024-07-10 19214, 2024

14:53 PM
reosarevok[m]

But no worries if you don't have time rn

2024-07-10 19255, 2024

14:55 PM
bitmap[m]

I'm looking at teh layout shift one

2024-07-10 19259, 2024

14:55 PM
lucifer[m]

[@mayhem](https://matrix.to/#/@mayhem:chatbrainz.org) Sure.

2024-07-10 19206, 2024

14:56 PM
bitmap[m]

s/teh/the/

2024-07-10 19249, 2024

14:56 PM
lucifer[m]

<rimskii[m]> "lucifer: I opened a PR in troi..." <- Probably related to some outdated GitHub action we are using, feel free to ignore. I'll fix.

2024-07-10 19204, 2024

14:57 PM
mayhem[m]

lucifer[m]: what was this in ref to?

2024-07-10 19240, 2024

15:00 PM
yvanzo[m]

bitmap, yellowhatpro: ping 🔔

2024-07-10 19257, 2024

15:00 PM
yellowhatpro[m]

pongg

2024-07-10 19225, 2024

15:01 PM
yvanzo[m]

bitmap: I just addressed all of reosarevok ’s comments for the prod-needed PR about releasing.

2024-07-10 19218, 2024

15:03 PM
yellowhatpro[m]

This is what I am working on currently: https://github.com/yellowHatpro/mb-exurl-ia-servi…

2024-07-10 19201, 2024

15:05 PM
yellowhatpro[m]

I got messed in it for quite some time. Had written a lot of code for Error handling when I realised I am just over complicating stuff, so went with straightforward impl

2024-07-10 19213, 2024

15:06 PM
lucifer[m]

<mayhem[m]> "what was this in ref to?" <- Some GitHub action failure on a troi PR

2024-07-10 19232, 2024

15:06 PM
mayhem[m]

ah.

2024-07-10 19205, 2024

15:08 PM
djl has quit

2024-07-10 19218, 2024

15:08 PM
djl joined the channel

2024-07-10 19232, 2024

15:12 PM
yvanzo[m]

yellowhatpro: Ok, so you’re on using `this_error` atm?

2024-07-10 19214, 2024

15:13 PM
theflash[m] joined the channel

2024-07-10 19214, 2024

15:13 PM
theflash[m] uploaded an image: (407KiB) < https://matrix.chatbrainz.org/_matrix/media/v3/download/matrix.org/DcGBWAPCriZamWJOFhWqglJe/IMG_7483.PNG >

2024-07-10 19222, 2024

15:13 PM
yellowhatpro[m]

I haven't yet, but as rustynova suggested, I will explore and use it

2024-07-10 19227, 2024

15:13 PM
reosarevok[m]

"We will rule over this error, and we will call it... this_error"

2024-07-10 19237, 2024

15:14 PM
theflash[m]

akshaaatt[m]: hey, I have implemented pagination in the feed, when i am using LazyVStack, the duplicate events are not being loaded at once

2024-07-10 19217, 2024

15:16 PM
yellowhatpro[m]

I will be focusing on these 2 points in the current pr:

2024-07-10 19217, 2024

15:16 PM
yellowhatpro[m]

- api mocking

2024-07-10 19217, 2024

15:16 PM
yellowhatpro[m]

- dealing with rate limiting

2024-07-10 19231, 2024

15:18 PM
bitmap[m]

you will add some tests for make_archival_network_request using the API mocking, yes?

2024-07-10 19251, 2024

15:18 PM
yellowhatpro[m]

Yes will add tests for this

2024-07-10 19255, 2024

15:19 PM
bitmap[m]

besides the lack of tests and RustyNova's suggestions I think it looks pretty good

2024-07-10 19210, 2024

15:20 PM
bitmap[m]

but I'd make sure to remove your API key too

2024-07-10 19241, 2024

15:21 PM
yvanzo[m]

It should be retrieved from a configuration file instead.

2024-07-10 19243, 2024

15:21 PM
yellowhatpro[m]

aah yes. Will remove it soon.

2024-07-10 19243, 2024

15:21 PM
yellowhatpro[m]

Also,should we use some meb account for archiving?

2024-07-10 19216, 2024

15:22 PM
yvanzo[m]

Maybe for deployment. Does the account matter for development?

2024-07-10 19238, 2024

15:22 PM
yellowhatpro[m]

Nope, for dev I am using my own id and key

2024-07-10 19229, 2024

15:23 PM
yvanzo[m]

Having a configuration file should probably be a priority as there are a number of hard-coded values in the code that would better fit that too.

2024-07-10 19235, 2024

15:23 PM
yellowhatpro[m]

yvanzo[m]: Yeah I will add them in env fle itself, I thought to add in the final commits of the PR. Current credentials don't matter much

2024-07-10 19242, 2024

15:23 PM
bitmap[m]

for MBS we wrote a small service that mocks the IA's S3 API, you could also write something similar here that mocks the /save endpoint (for development)

2024-07-10 19244, 2024

15:24 PM
yellowhatpro[m]

bitmap[m]: Oh we are using IA's API in MBS, are we dealing with rate limiiting in that as well?

2024-07-10 19211, 2024

15:25 PM
yvanzo[m]

yellowhatpro: `.env` might be too limited. A TOML file might be more appropriate. See the crate `config` for example.

2024-07-10 19216, 2024

15:25 PM
yellowhatpro[m]

yvanzo[m]: ok will make it work soon

2024-07-10 19253, 2024

15:25 PM
yellowhatpro[m]

Okk, gonna explore this_error and config crates then

2024-07-10 19217, 2024

15:26 PM
yvanzo[m]

No, we aren’t using the same API from MBS.

2024-07-10 19242, 2024

15:26 PM
bitmap[m]

yellowhatpro[m]: not really, we just sleep 1s between each event (but each event may take 1-2s to process). if we hit the rate limit (which is rare), it's just retried later

2024-07-10 19254, 2024

15:26 PM
yellowhatpro[m]

ohh alright.

2024-07-10 19207, 2024

15:27 PM
bitmap[m]

but yes, it's a completely different API

2024-07-10 19243, 2024

15:27 PM
yellowhatpro[m]

bitmap[m]: Right. I should try something similar.

2024-07-10 19243, 2024

15:27 PM
yellowhatpro[m]

Maybe I should just apply some math and since I am mostly polling, the time can be configured

2024-07-10 19245, 2024

15:31 PM
bitmap[m]

for importing existing edits we may want something that takes better advantage of the rate limit though, since that process will take a while

2024-07-10 19208, 2024

15:32 PM
yvanzo[m]

It is okay to start with a simplistic rate limiting indeed. It can be improved later on, once everything starts to be working together.

2024-07-10 19227, 2024

15:33 PM
yvanzo[m]

Ideally, it should be what bitmap mentioned: different threads or processes to handle polling and requesting.

2024-07-10 19208, 2024

15:34 PM
yellowhatpro[m]

Ummm a doubt here

2024-07-10 19228, 2024

15:35 PM
yellowhatpro[m]

Ok nvm. I thought you meant we have to create multiple threds for requesting

2024-07-10 19219, 2024

15:36 PM
yellowhatpro[m]

yvanzo[m]: Yupp I am running polling and archiving in different threads

2024-07-10 19241, 2024

15:37 PM
yvanzo[m]

Threading isn’t in the main goals, so at most just make a note about it for the stretch goals if you want to remind about it.

2024-07-10 19215, 2024

15:38 PM
yvanzo[m]

Great if you have some kind of threading already. :)

2024-07-10 19250, 2024

15:39 PM
yvanzo[m]

Yes, multiple threads for requesting might be a thing if we can be allowed a higher rate limit.

2024-07-10 19220, 2024

15:40 PM
yellowhatpro[m]

<bitmap[m]> "for importing existing edits..." <- Regarding this, If we are rate limited, then should I focus on maximizing the requests.

2024-07-10 19220, 2024

15:40 PM
yellowhatpro[m]

For ex, if I don't have any URL to process in current poll (while making a request), should I devote that time to archive the existing ones?

2024-07-10 19234, 2024

15:41 PM
yellowhatpro[m]

yvanzo[m]: Ohh did you mean archiving x URLs parallely ?

2024-07-10 19252, 2024

15:41 PM
yvanzo[m]

Yes (as stretch goals)

2024-07-10 19221, 2024

15:43 PM
yellowhatpro[m]

Got it ✅

2024-07-10 19201, 2024

15:44 PM
bitmap[m]

yellowhatpro[m]: > <@yellowhatpro:matrix.org> Regarding this, If we are rate limited, then should I focus on maximizing the requests.

2024-07-10 19201, 2024

15:44 PM
bitmap[m]

> For ex, if I don't have any URL to process in current poll (while making a request), should I devote that time to archive the existing ones?

2024-07-10 19201, 2024

15:44 PM
bitmap[m]

if there's still work to do you should maximize the requests you can do, ideally

2024-07-10 19231, 2024

15:44 PM
bitmap[m]

but you can start with something simple as yvanzo said

2024-07-10 19213, 2024

15:45 PM
yellowhatpro[m]

btw there has to be another task that will do cleanup/re-archival part of URLs that couldn't get archived in the first place. That will also repeat after x amout of time. That has to be done after I am done with the archival part

2024-07-10 19233, 2024

15:45 PM
bitmap[m]

not sure what you meant by "archive the existing ones" though, do you mean older edits?

2024-07-10 19202, 2024

15:47 PM
yellowhatpro[m]

<bitmap[m]> "for importing existing edits..." <- yupp older ones. I thought you were referring to them when you said importing existing edits

2024-07-10 19203, 2024

15:48 PM
bitmap[m]

yeah, I was, but I was under the impression that there was only one edit counter that is incremented; so it starts from the beginning, and doesn't process new edits until all previous ones have been processed

2024-07-10 19215, 2024

15:50 PM
yellowhatpro[m]

I mean its configurable, we can either have it start from the beginning, or from the latest one as well

2024-07-10 19238, 2024

15:52 PM
yellowhatpro[m]

I haven't really thought what should be the better thing to do. But later if we go with the trigger impl, we will have to start with the latest edits, which keeps on incrementing.

2024-07-10 19239, 2024

15:52 PM
yellowhatpro[m]

But in any case, I will try to archive all the previous ones as well

2024-07-10 19209, 2024

15:53 PM
yvanzo[m]

yellowhatpro: There is a feature in GitHub to mark your PRs as drafts if needed.

2024-07-10 19212, 2024

15:54 PM
yellowhatpro[m]

ok, should I make the wip PR draft?

2024-07-10 19205, 2024

15:55 PM
yvanzo[m]

It seems synonymous indeed :)

2024-07-10 19233, 2024

15:55 PM
yvanzo[m]

* It seems to be synonymous indeed

2024-07-10 19233, 2024

15:56 PM
yellowhatpro[m]

Okii made it a draft one

2024-07-10 19239, 2024

15:56 PM
bitmap[m]

<yellowhatpro[m]> "I mean its configurable, we..." <- I assumed "if I don't have any URL to process in current poll (while making a request), should I devote that time to archive the existing ones" was within a single process -- i.e. how are you keeping track of which edits have been processed in that case

2024-07-10 19201, 2024

16:00 PM
yellowhatpro[m]

Edits processed means when I am adding them to `internet_archive_urls` table right?

2024-07-10 19201, 2024

16:00 PM
yellowhatpro[m]

I am just tracking the last edit in that case

2024-07-10 19248, 2024

16:00 PM
yellowhatpro[m]

Sorry I get suuper confused sometimes

2024-07-10 19233, 2024

16:03 PM
fletchto99 has quit

2024-07-10 19257, 2024

16:03 PM
fletchto99 joined the channel

2024-07-10 19232, 2024

16:04 PM
yvanzo[m]

No worries, it should become more clear once you have API requests in the loop.

2024-07-10 19217, 2024

16:05 PM
bitmap[m]

maybe I misunderstood you :) I thought you were talking about prioritizing the processing of new (recent) URLs, and then processing old (existing) URLs only if there are no recent ones polled -- which would require separate counters

2024-07-10 19258, 2024

16:05 PM
bitmap[m]

btw, if the service is stopped, where are edit_note_start_idx and edit_data_start_idx read from such that it can continue from where it left off?

2024-07-10 19237, 2024

16:07 PM
yellowhatpro[m]

from internet_archive_urls table : https://github.com/yellowHatpro/mb-exurl-ia-servi…

2024-07-10 19216, 2024

16:08 PM
yellowhatpro[m]

Welp I didn't write a clear function doc for it

2024-07-10 19220, 2024

16:08 PM
yellowhatpro[m]

Will update that

2024-07-10 19248, 2024

16:08 PM
akshaaatt[m]

On it mayhem !

2024-07-10 19206, 2024

16:09 PM
akshaaatt[m]

theflash__ (IRC): can you elaborate more on the issue?

2024-07-10 19254, 2024

16:10 PM
bitmap[m]

yellowhatpro[m]: hmm, doesn't that only track the last edit with a save-able URL? many edit notes won't have URLs for example

2024-07-10 19252, 2024

16:13 PM
BrainzGit

[listenbrainz-server] 14MonkeyDo opened pull request #2937 (03master…entity-stats-page): LB-1102: Revamp Top Entity stats pages https://github.com/metabrainz/listenbrainz-server…

2024-07-10 19209, 2024

16:14 PM
yellowhatpro[m]

Yeah, right. But internet_archive_urls is the only place for now where I can look for the data. Is there any other way where I can keep the latest edit data and edit note id?

2024-07-10 19203, 2024

16:15 PM
yvanzo[m]

Probably a separate table last_processed_rows

2024-07-10 19208, 2024

16:15 PM
bitmap[m]

you could introduce a new table to store them

2024-07-10 19228, 2024

16:15 PM
yellowhatpro[m]

alright then, a new table coming right up ✅

2024-07-10 19223, 2024

16:16 PM
bitmap[m]

that's why I was asking about prioritizing recent edits (so that they are archived right away such that the state of the page at the time the edit or note was entered is preserved) over older ones

2024-07-10 19236, 2024

16:18 PM
yellowhatpro[m]

what do you refer to when you say the state of the page??

2024-07-10 19236, 2024

16:18 PM
yellowhatpro[m]

The recent rows ?

2024-07-10 19204, 2024

16:19 PM
bitmap[m]

the content of the page being archived

2024-07-10 19217, 2024

16:19 PM
yvanzo[m]

bitmap: Prioritizing recent edits certainly is a longer term goal.

2024-07-10 19245, 2024

16:20 PM
bitmap[m]

with the last_processed_rows table you could potentially keep separate counters for recent vs. historical edits later on

2024-07-10 19224, 2024

16:21 PM
yellowhatpro[m]

bitmap[m]: oh nice, now I am able to process things

2024-07-10 19257, 2024

16:21 PM
yvanzo[m]

It will have to be as flexible as possible, but if we just start with one row pointer per table, that would be a good start.

2024-07-10 19208, 2024

16:24 PM
yellowhatpro[m]

cool, each row in last_processed_rows pointing to the latest processed rows of different tables (edit_data and edit_table currently)

2024-07-10 19209, 2024

16:25 PM
yellowhatpro[m]

latest processed during polling, regardless it containg URL or not

2024-07-10 19233, 2024

16:25 PM
yvanzo[m]

yellowhatpro: At first glance, what columns do you imagine for this new table?

2024-07-10 19257, 2024

16:25 PM
yellowhatpro[m]

id, latest_row_processed, table_name

2024-07-10 19201, 2024

16:26 PM
yellowhatpro[m]

as of now

2024-07-10 19233, 2024

16:26 PM
yvanzo[m]

Yup, even though id is probably unneeded (or I’m missing the point).

2024-07-10 19200, 2024

16:27 PM
yellowhatpro[m]

yeah it's not needed

2024-07-10 19210, 2024

16:27 PM
yvanzo[m]

Or just use it to refer to the id in the other table?

2024-07-10 19235, 2024

16:27 PM
pranav[m]

akshaaatt: I’ll try to get the stats page in soon before mid term evals

2024-07-10 19240, 2024

16:27 PM
yellowhatpro[m]

yeah right

2024-07-10 19213, 2024

16:28 PM
yvanzo[m]

You might also need a column column as not every table has a column id.

2024-07-10 19239, 2024

16:28 PM
yvanzo[m]

(or id_column if it helps with clarity)

2024-07-10 19230, 2024

16:29 PM
yellowhatpro[m]

id_column will refer to id in case of edit_note and edit in case of edit_data, right?

2024-07-10 19228, 2024

16:30 PM
yvanzo[m]

That should work.

2024-07-10 19258, 2024

16:30 PM
discordbrainz

<05rustynova> bitmap: "not really, we just sleep 1s between each event..." That's the easy part. Now deal with an async and parallel environment and it starts messing itself up in .23 femtoseconds. Either you do the clean way and use semaphores, holding permits until the next refresh window, or you do the ugly way and just hold a mutex until prev_request_start + 1. I had to do the later one for MB_RS as it doesn't have the http

2024-07-10 19258, 2024

16:30 PM
discordbrainz

headers needed to the number of tickets per window: https://github.com/RustyNova016/musicbrainz_rs_no…

2024-07-10 19202, 2024

16:31 PM
yellowhatpro[m]

Cool

2024-07-10 19238, 2024

16:31 PM
yvanzo[m]

bitmap, yellowhatpro: Anything else to be discussed about the archiver project? :)

2024-07-10 19202, 2024

16:32 PM
discordbrainz

<05rustynova> As for the DB schema, why not just doing a ORDER BY on an inserted_at column?

2024-07-10 19256, 2024

16:32 PM
bitmap[m]

yvanzo[m]: nothing else from me right now :) thanks!

2024-07-10 19205, 2024

16:33 PM
texke has quit

2024-07-10 19223, 2024

16:33 PM
yellowhatpro[m]

yvanzo[m]: Nothing in particular, got the notes for today