in #metabrainz

1:05 AM
Lotheric has quit
1:07 AM
Lotheric joined the channel
2:14 AM
thomasross joined the channel
2:37 AM
d4rkie joined the channel
2:41 AM
D4RK has quit
2:42 AM
_lucifer

ruaok: yes, all nodes are up and the cluster looks sane. however, unless we submit a request we cannot be sure it works as expected. i did not start the request consumer on the newleader because i have a few doubts about how we should do it. do we want to run it directly on newleader or in a container on it?
2:46 AM
thomasross has quit
2:54 AM
RikkoM has quit
4:46 AM
flamingspinach has quit
4:47 AM
flamingspinach joined the channel
5:24 AM
sumedh joined the channel
5:26 AM
BrainzGit

[musicbrainz-server] reosarevok merged pull request #2057 (master…edit-preview-id-warning): Fix Catalyst warning: id is missing in edit previews https://github.com/metabrainz/musicbrainz-serve...
5:43 AM
RikkoM joined the channel
5:52 AM
RikkoM has quit
6:01 AM
adhi001 joined the channel
6:04 AM
yef has quit
6:15 AM
yef joined the channel
6:15 AM
yef has quit
6:15 AM
yef joined the channel
6:16 AM
[musicbrainz-server] reosarevok merged pull request #2061 (master…empty-review-markdown-warning): Fix Catalyst warning: don't pass undef review text to markdown https://github.com/metabrainz/musicbrainz-serve...
6:35 AM
iliekcomputers has quit
6:36 AM
[musicbrainz-server] reosarevok opened pull request #2062 (master…missing-link-type-id-warning): Fix Catalyst warning: don't try undef linktype id to access $loaded https://github.com/metabrainz/musicbrainz-serve...
6:36 AM
shivam-kapila has quit
6:37 AM
shivam-kapila joined the channel
6:38 AM
iliekcomputers joined the channel
6:40 AM
sumedh has quit
6:41 AM
RikkoM joined the channel
6:55 AM
shivam-kapila

_lucifer: good work 🎉
6:57 AM
BrainzGit

[musicbrainz-server] reosarevok opened pull request #2063 (master…EditReleaseEvents-map-keys): Add keys to EditReleaseEvents to stop warning https://github.com/metabrainz/musicbrainz-serve...
7:23 AM
reosarevok

https://www.rockpapershotgun.com/this-video-gam... :D
7:39 AM
yvanzo, bitmap: we only have 4 tickets ready for beta. Do you think we can improve that today? :)
7:59 AM
ruaok

mooin!
7:59 AM
_lucifer: not sure in this case. what do you think?
8:04 AM
_lucifer

ruaok, i do not know about the networking concerns in case we use docker for request consumer. if there aren't any, then maybe let's use docker because we can just bring down the image and start a new one when dependencies etc. change, running directly could make these change a bit more involved sometimes.
8:04 AM
but we'll also need to figure how to make docker talk back and forth with the host. i don't know if that is easy or hard.
8:04 AM
ruaok

I see no network concerns and I think you have a good point about pulling the container, so lets use docket.
8:05 AM
for that, put the container in host mode -- in host mode the container runs inside the normal port mapping of the host.
8:05 AM
and the container has direct access to the host network layer.
8:06 AM
_lucifer

interesting, let's try docker then.
8:06 AM
ruaok

https://docs.docker.com/network/host/
8:06 AM
_lucifer

i just figured out a way to test the cluster without the consumer and found an issue.
8:07 AM
ruaok

ok.
8:07 AM
_lucifer

the namenode is having trouble connecting to datanodes. again could be firewall or misconfigration
8:07 AM
i am rechecking the configurations first
8:07 AM
ruaok

when you're ready to test for real, just let me know. we need to turn off the old consumer, but then we can start sending requests from lemmy
8:07 AM
k
8:08 AM
have you also change the configuration of the nodes because we have loads more ram?
8:08 AM
_lucifer

no that can be done in the spark-submit file we have in LB repo
8:09 AM
the values there will override the defaults anyways so i thought not to meddle with those
8:10 AM
ruaok

ok, then we'll need to remember those when we install the consumer
8:10 AM
_lucifer

that script is executed inside the request consumer.
8:10 AM
yes
8:11 AM
adhi001 has quit
8:30 AM
ruaok does some weeding out of GSoC apps
8:36 AM
ruaok

hmm. ok. we got one covid project proposal.
8:36 AM
wait what?
8:36 AM
_lucifer

lol i saw that.
8:37 AM
one org got a draft of Egypt's Budget 😆
8:38 AM
ruaok

well, I'm curious to see what the landscape looks like after yvanzo and Mr_Monkey toss out the not worthy MB and BB projects.
8:56 AM
MRiddickW has quit
8:57 AM
alastairp: Mr_Monkey I've been working with timescale last night to suss out a few things while taking a look at the listen counts.
8:57 AM
I've found a lot of interesting stuff to share. let me pack up and head to the office.
8:58 AM
alastairp

cool
8:58 AM
see you here
8:58 AM
ruaok

loading a users page on LB should already be faster now: https://listenbrainz.org/user/gracz54
9:00 AM
not that I've seen.
9:01 AM
alastairp

maybe .de will contact you since it's hosted at hetzner
9:02 AM
ruaok

did they provide a link of any kind?
9:06 AM
alastairp

will show you in the office
9:06 AM
ruaok

k
9:15 AM
_lucifer

ruaok, tested with a sample jar. cluster is running smoothly, we can proceed on setting up the new request consumer
9:21 AM
ruaok

great, go for it.
9:28 AM
shivam-kapila

Covid project. But why?
9:29 AM
it> one org got a draft of Egypt's Budget 😆
9:29 AM
Another got medical chapters as a pdf
9:44 AM
adhawkins has quit
9:48 AM
adhawkins joined the channel
9:57 AM
BrainzGit

[musicbrainz-server] reosarevok opened pull request #2064 (master…normalize-warnings): Fix catalyst warning: don't try to trim/sanitize undef "strings" https://github.com/metabrainz/musicbrainz-serve...
10:03 AM
alastairp

ruaok: https://github.com/metabrainz/brainzutils-pytho...
10:03 AM
if you have unicode and want to turn it to text, `errors='ignore'` should skip over any non-encodable characters
10:04 AM
uh, unicode -> bytes I mean
10:04 AM
but I'm not sure if at the point of this error handler the data is as a string or as json
10:04 AM
this was the PR: https://github.com/metabrainz/listenbrainz-serv...
10:05 AM
ruaok

its an exception object that I cast to str().
10:05 AM
alastairp

although, now I see that this error message (`raise APIBadRequest("Listen submission contains invalid characters.")`) doesn't include the contents
10:06 AM
try str(message, errors='ignore')
10:06 AM
ruaok

https://gist.github.com/mayhem/bc2bb75cf057d1c1...
10:10 AM
alastairp

hmm, now I'm a bit confused by that PR
10:10 AM
ruaok

sentry agrees.
10:10 AM
alastairp

_lucifer: did we stop receiving the "enter null into timescale" errors in sentry?
10:10 AM
because PR 1371 wraps `_send_listens_to_queue`, which adds items to rabbitmq, not sending to postgres
10:10 AM
ruaok

ah, no. the error is now being formatted correctly, but not fixed. see sentry.
10:12 AM
_lucifer

alastairp: yes, https://sentry.metabrainz.org/metabrainz/listen...
10:12 AM
last seen 12 days ago.
10:12 AM
alastairp

yeah, right
10:12 AM
however: https://sentry.metabrainz.org/metabrainz/listen...
10:13 AM
same error, but in timescale writer
10:14 AM
_lucifer

interesting, to confirm these listens are coming from the API?
10:16 AM
no i think `_send_listens_to_queue` wraps both queues. https://github.com/metabrainz/listenbrainz-serv...
10:18 AM
BrainzGit

[listenbrainz-server] amCap1712 opened pull request #1388 (master…spark-new-cluster): Setup request consumer for new cluster setup https://github.com/metabrainz/listenbrainz-serv...
10:19 AM
_lucifer

ruaok: can you review the above PR? i'll build and test the request consumer then. i haven't changed memory configurations yet. will do that once request consumer works.
10:21 AM
alastairp

_lucifer: what do you mean it wraps both queues?
10:22 AM
ruaok

_lucifer: will try soon. we're in the office and have heaps on.
10:22 AM
alastairp

we were just talking about this, we probably need to re-address this PR, and check for null characters in the input in the webserver, so that we can return http400 to the user
10:23 AM
because if we have webserver -> push to queue, and then a separate process that reads from queue -> push to database, it's the wrong place to catch the error because now it's disconnected from the user request
10:23 AM
however I'm still really confused as to how we managed to catch a postgres error in the webserver and stop these errors from happening
10:25 AM
_lucifer

I meant that _send_listens_to_queue checks and send the listens to the appropriate queue. oh ok! i understand now what you meant
10:27 AM
alastairp: https://sentry.metabrainz.org/metabrainz/listen... i see those errors happened in messybrainz
10:27 AM
not in timescale_writer
10:29 AM
so they got caught before being sent to the queue.
10:31 AM
RikkoM has quit
10:31 AM
ruaok

alastairp: https://gist.github.com/mayhem/5aa567dde3bd2a66...
10:33 AM
https://gist.github.com/mayhem/481e5b109554cc19...
10:33 AM
alastairp

_lucifer: messybrainz!
10:33 AM
that was the missing part of the process that we skipped :D
10:33 AM
thanks
10:37 AM
_lucifer

alastairp: ruaok: once you have some time, let me know if you have any further comments on https://github.com/metabrainz/listenbrainz-serv... . i'd like to deploy it and run the sql updates.
10:37 AM
alastairp

ruaok: right, that looks OK. I was just unsure about the .rollback()
10:38 AM
this connection exists only for a batch of listens, right?
10:38 AM
not for many batches
10:38 AM
question: this batch will have many listens from many people. if it fails, what happens?
10:38 AM
all listens fail to insert, or just the one with \0 ?
10:39 AM
ruaok

all fail.
10:39 AM
because this is not the place for us to be validating listens.
10:39 AM
that should happen earlier.
10:39 AM
alastairp

yeah, sure
10:39 AM
but does this mean that we will loose other users' listens with this fix?
10:44 AM
ruaok

_lucifer: on 1384 how do you plan to test this migration? run the new music services tables script but not the migrate existing users script?
10:46 AM
_lucifer

ruaok, i think we can run both as the migrate existing users is just copy existing users.
10:46 AM
on test.lb it'll read the users from the new table and on prod from the old table
10:46 AM
ruaok

and then truncate the table before the actual release of this into produtction?
10:46 AM
_lucifer

yes
10:46 AM
ruaok

ok, makes sense.
10:47 AM
aight, given the current context and release plan, this is fine by me.
10:49 AM
_lucifer

awesome, thanks!
10:50 AM
while i am at it, do you have any PRs you want to include in the release?
10:50 AM
Mr_Monkey: ^
10:56 AM
ruaok

the ones that are critical are merged already.
10:56 AM
(user similarity page)
10:58 AM
alastairp

oh, you know what I think caused this
10:58 AM
ruaok

reading 1388 right now... I see you're creating newcluster files in an effort to not overwrite the existing cluster setup for now, but that we'll get rid of these when we ditch the old cluster?
10:58 AM
alastairp

I'm having flashbacks to the id3 spec
10:58 AM
sumedh joined the channel
10:58 AM
_lucifer

yes
10:59 AM
we have to configure the test setups as well before removing the old files
10:59 AM
so i created new ones, we can cleanup once the prod and test up are running well
11:01 AM
alastairp

tests running
11:01 AM
ruaok

ok, lgtm for testing purposes.
11:01 AM
alastairp

there is a VW car called the id3
11:01 AM
and so now it's difficult to search for 'id3 spec'