mayhem: i further tried to debug why 5 years of data refuses to run in spark at all, whereas 4 years of data promptly runs in ~an hour. for 2021-2022, the self join for session produced 1.5B intermediate rows. for 2019-2020, that same join produces over 50B and counting. i wonder what's going on with the data there.
reosarevok
mayhem: when you have some time, can you check this "Seeking Clarifications Of Music Brainz Partners"? Really not sure what to say tbh :)
lucifer
mayhem, bitmap: another issue, this time with building the mb metadata cache using pink. `DETAIL: User was holding shared buffer pin for too long.` if the standby falls more than `X` seconds behind the primary, PG will automatically cancel any queries until the standby has fully caught up. we can configure the time PG will wait, but since the cache takes a few hours to execute, not sure what the appropriate delay would be.
yvanzo, bitmap ^ should probably put that second one out before someone complains that some process broke for them :) No need for a hotfix but let's get it out next release?
we're going to have to make everything distributed 😱
zas
I'm thinking about a homemade solution for blocklists, based on Python + mosquitto: basically a "listener" on each node, and one "commander". When a listener gets a message about ipsets needing an update, it just applies the change
the commander is just posting MQTT messages
lucifer
yvanzo, reosarevok: is it possible to provide a custom postgresql.conf file for the database in musicbrainz docker? i want to preload an extension. didn't find anything obvious in docs on a quick look but might have missed something.
atj
I did some basic testing with pyroute2 as it's packaged on Ubuntu and it worked well
you can add comments to ipset entries, which would be quite nice to support
zas
yes definitely
atj
I was thinking of using redis/keydb pub/sub as we already have redundant keydb instances on the gateways
zas
yes, actually it can work well (and solve the problem related to missed messages with MQTT)
atj
clients would just need to sub to a single channel
zas
atj: I have to go afk for a while, please think about this stuff, and let's discuss again this afternoon
atj
ok, not sure about the log aggregation but I'll think on it
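[Editor's sketch] The listener side of the blocklist idea above could look roughly like this. It is only an illustration, not the design zas and atj settled on: the JSON message shape, the function name, and the choice of shelling out to the `ipset` CLI (rather than pyroute2) are all assumptions, and the transport (MQTT or keydb pub/sub) is left out so the sketch stays self-contained.

```python
import json

# Hypothetical message format, e.g.:
# {"action": "add", "set": "blocklist", "entry": "192.0.2.1", "comment": "spam"}
ALLOWED_ACTIONS = {"add", "del"}


def build_ipset_command(raw_message):
    """Turn one JSON update message into an argv list for the ipset CLI."""
    msg = json.loads(raw_message)
    action = msg["action"]
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unsupported action: {action!r}")
    # -exist makes add/del idempotent, so a replayed message is harmless.
    argv = ["ipset", "-exist", action, msg["set"], msg["entry"]]
    comment = msg.get("comment")
    if action == "add" and comment:
        # Only valid if the set was created with the 'comment' extension.
        argv += ["comment", comment]
    return argv
```

A listener would call something like `subprocess.run(build_ipset_command(payload), check=True)` for each payload it receives on the channel; keeping the parsing separate from the transport means the same function works whichever broker is chosen.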
mayhem
moooin!
lucifer: on the PG issue -- I have no idea how to help with that. I think bitmap would be better.
as far as the listens exploding, I wonder if a spammer did something evil and spammed a shit ton of listens we haven't caught?
reosarevok: will do.
outsidecontext: I got another SSL.com activation link in email. We're using that cert now, right? Can I ignore?
outsidecontext
mayhem: yes, we are using it. That other link you had sent me was for some code signing service we are not interested in. So probably same again?
mayhem
ok, thanks. let's see if next renewal we can renew with someone other than SSL.com, shall we?
lucifer
mayhem: i see, no idea yet. but the problematic data is probably somewhere between mid-2017 and mid-2018.
a hunch but might just be a coincidence that date is around the time created field was added to LB iirc. maybe some other db changes were made as well at that time.
mayhem
if it is a data bug, then we should easily spot the crap rows given the large amount of them.
I wonder if we could query the data and check to see how many listens each user has in that timeframe. anyone with an outsized set of listens should come up clearly. but I somehow doubt that that is the case. that is a LOT of rows.
outsidecontext
mayhem: agreed
lucifer
sure can do that. find top 10 users with listen counts in sets of 2 years and see if something stands out.
mayhem
great
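[Editor's sketch] The check lucifer describes (top users by listen count in two-year windows) can be illustrated in plain Python. This is not the actual Spark job; the function name, the `(user, unix_timestamp)` input shape, and the bucketing by calendar year in UTC are assumptions made for the example.

```python
from collections import Counter
from datetime import datetime, timezone


def top_users_by_period(listens, start_year, period_years=2, top_n=10):
    """Count listens per user in fixed windows of `period_years` years.

    listens: iterable of (user, unix_timestamp) pairs.
    Returns {window_start_year: [(user, count), ...]}, highest counts first.
    """
    counts = {}
    for user, ts in listens:
        year = datetime.fromtimestamp(ts, tz=timezone.utc).year
        if year < start_year:
            continue  # ignore listens before the first window
        # Snap the year down to the start of its window.
        bucket = start_year + ((year - start_year) // period_years) * period_years
        counts.setdefault(bucket, Counter())[user] += 1
    return {b: c.most_common(top_n) for b, c in sorted(counts.items())}
```

If one user's count in a window is orders of magnitude above the rest, that window is the place to look for spam or duplicated rows.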
lucifer
zas, atj: say if prometheus queries a service but the service was unavailable for a while but it persists some metrics in a db/cache, is there a standard way to submit those later to prometheus?
zas
What are you trying to do? Maybe Prometheus isn't the tool you need.
lucifer
zas: just trying to port the existing metric writer, so was wondering what made sense to keep and what didn't
mayhem, monkeu: same listen, once with a release mbid, second time without. when the user submits a release mbid, that is used; otherwise the rg one. up on test.lb for testing. https://usercontent.irccloud-cdn.com/file/0r1hE...
*monkey ^
monkey
Neat !
mayhem
Monkeu is monkey's Catalan name!
lucifer
lolol
monkey
Even better, to be honest. Usual name: Nico. Monkey in catalan: mico. What are the chances?
lucifer: I can't find listens to test, but I did manual testing and it's working great!
Building on that, we should have a look at how that will impact the charts.
Currently we are based on releases, but it would make sense to base stats on RG instead? At least for display purposes (see my page for example with matched releases—i.e. MBID not sent by me— that don't have cover art: https://test.listenbrainz.org/user/mr_monkey/ch...)
Well, actually, "can't find listens to test" is precisely because it's working so well ! Looked again closely at some listens, clicking on the cover art to go to the release page, realizing that release does not have cover art. Awesome !
lucifer
monkey: making stats based on RG is probably orthogonal to current change. the cover art aspect of it may or may not be handled by the existing cache. but we could always add another endpoint for that.
monkey
Mmm, I figured the two aspects could be decoupled
lucifer
will need to figure out how to do cover art for a release group based stat but that will likely need separate endpoints at the least and maybe new db tables.
monkey
More testing later, awesome job :) You just improved cover art coverage by a significant margin !
bitmap
lucifer: increasing max_standby_streaming_delay doesn't sound ideal since that may leave the standby very out of date while certain queries are being run
lucifer
bitmap: i see, what is it set to currently?
bitmap
I don't remember configuring it, so probably the default, 30 seconds
maybe you can run the problematic queries on floyd instead? or check the current replication lag before invoking them
I'm not sure what would be causing a conflict exactly
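[Editor's sketch] bitmap's suggestion to check the current replication lag before invoking the long queries could be done along these lines. This is a sketch under assumptions: the function names and the 5-second threshold are made up for illustration, and `cursor` is any DB-API cursor (for example from psycopg2) connected to the standby. `pg_last_xact_replay_timestamp()` is a standard PostgreSQL function; note that on a quiet primary this measure grows even when the standby is fully caught up, so it is an upper bound on the lag rather than an exact value.

```python
# Seconds since the last transaction replayed on the standby.
LAG_QUERY = "SELECT EXTRACT(EPOCH FROM now() - pg_last_xact_replay_timestamp())"


def standby_lag_seconds(cursor):
    """Return the approximate replay lag in seconds, or None if unknown."""
    cursor.execute(LAG_QUERY)
    row = cursor.fetchone()
    # NULL is returned when no transaction has been replayed, e.g. when
    # the server is not a standby at all.
    if row is None or row[0] is None:
        return None
    return float(row[0])


def safe_to_start(cursor, max_lag_seconds=5.0):
    """True only if the lag is known and within the (assumed) threshold."""
    lag = standby_lag_seconds(cursor)
    return lag is not None and lag <= max_lag_seconds
```

The cache build could call `safe_to_start(cursor)` before each long query and wait or retry when it returns False.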
yvanzo
lucifer: You can create a docker compose override file under local/compose to add some -c options to the postgres command.
lucifer
yvanzo: ah yes, i saw that. i thought a pg.conf file would be easier but will try this for now.
bitmap: i see. we were running on pink to not overwhelm the primary db.
(an override would be needed even to define a custom conf file)
BrainzGit
[musicbrainz-android] dependabot[bot] opened pull request #157 (master…dependabot/github_actions/actions/cache-3): Bump actions/cache from 2 to 3 https://github.com/metabrainz/musicbrainz-andro...
[musicbrainz-android] dependabot[bot] opened pull request #158 (master…dependabot/gradle/hilt_version-2.44.1): Bump hilt_version from 2.44 to 2.44.1 https://github.com/metabrainz/musicbrainz-andro...
[musicbrainz-server] mwiencek opened pull request #2733 (master…rel-editor-set-attributes-userscripts): Add an easy way to set dialog attributes from userscripts https://github.com/metabrainz/musicbrainz-serve...
bitmap
yvanzo: thanks++
alastairp
👋 hello, I'm not around today and have no meeting notes, have a great Monday everyone
reosarevok
<BANG>
I'm taking today instead of Freso
It's World Diabetes Day and I don't have a song for that, but be careful with sugar and stuff!
I have one mailed-in review:
Go Freso!
"""
Last week I poked at forum flags and reported editors and some spam.
"""
People in my list for today are: mayhem, monkey, reosarevok, bitmap, akshaaatt, lucifer, yvanzo, zas, Shubh, yellowhatpro, Pratha-Fish, riksucks, CatQuest, ansh
So, go bitmap?
bitmap
hey
last week I worked on MBS-12694, MBS-12695, MBS-12679, and finally started on some improvements to simplify userscript integration
other than that, did a bit of code review on some refactoring/cleanup tasks from reosarevok and yvanzo
fin. go lucifer
Shubh
reosarevok: Hey, I've nothing to report today, can you please remove my name from the list. Thanks
lucifer
hi all!
last week, i worked on making improvements to mb metadata cache in LB including enhancing cover art support. fixed the ts writer after a listen had blocked it. tried to debug sir performance and also tried to debug issues with recording similarity algorithm. no success on either of these fronts so far.
mayhem: next?
mayhem
hiya
ansh
reosarevok: Hi! I also have nothing to report today. Please remove my name from the list. Thanks
mayhem
last week I spent time on the usual MeB non-sense and also doing some PR review and the like.
then I worked a bit on the similar recordings data, trying to work out if it is any good and how to make it better.
riksucks
Hi reosarevok, nothing to report for the week, hope you guys had a great weekend :D feel free to remove my name
mayhem
I've now gotten some pointers for whom to ask to make it better, so I've also started that process.
but, the data is pretty good already. likely good enough to get started building more recommendation stuff soon.
(but not quite yet)
fin.
monkeu??
(monkey)
yvanzo
mayhem: how much of non-sense is that?
akshaaatt
Enough to make sense
monkey
Hello !
mayhem
yvanzo: loads of non-sense.
monkey
Last week I worked on a few small features for LB: Show icon of music service in BrainzPlayer, add a menu option to show the source JSON of a listen, and started working on showing tags
Also looked at lucifer's PR for importing spotify extended history