#metabrainz

      • lucifer
        mayhem: i further tried to debug why 5 years of data refuses to run in spark at all whereas 4 years of data promptly runs in ~an hour. for 2021-2022, the self join for sessions produced 1.5B intermediate rows. for 2019-2020, that same join produces over 50B and counting. i wonder what's going on with the data there.
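A rough sketch of why one year range could blow up while another does not, assuming (hypothetically) that the session self join matches rows on a user key: every group of n rows sharing a key contributes n * n joined rows, so a single oversized group dominates the total. The key name and the sample data are made up for illustration; the real join likely has extra conditions such as a time window.

```python
from collections import Counter


def self_join_rows(keys):
    """Rows produced by joining a table to itself on `keys` (one key per row)."""
    counts = Counter(keys)
    # A group of n rows with the same key matches itself n * n times.
    return sum(n * n for n in counts.values())


# Both samples hold 10,000 rows; only the distribution differs (hypothetical data).
even = ["u%d" % (i % 100) for i in range(10_000)]             # 100 users, 100 rows each
skewed = ["u0"] * 9_901 + ["u%d" % i for i in range(1, 100)]  # one user dominates

print(self_join_rows(even))    # 1000000
print(self_join_rows(skewed))  # 98029900
```

Same input size, roughly 98x more intermediate rows once one key holds almost everything.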
      • reosarevok
        mayhem: when you have some time, can you check this "Seeking Clarifications Of Music Brainz Partners"? Really not sure what to say tbh :)
      • lucifer
        mayhem, bitmap: another issue, this time with building the mb metadata cache using pink: `DETAIL: User was holding shared buffer pin for too long.` if the standby falls more than `X` seconds behind the primary, PG will automatically cancel any conflicting queries until the standby has fully caught up. we can configure the time PG will wait, but since the cache takes a few hours to build, not sure what the appropriate delay would be.
      • BrainzGit
        [musicbrainz-server] reosarevok opened pull request #2727 (master…MBS-12704): MBS-12704: Remove historical "Watch artist" code https://github.com/metabrainz/musicbrainz-serve...
      • [musicbrainz-server] reosarevok opened pull request #2728 (master…regenerate-db-scripts): Generate SQL scripts for unreferenced_row_log https://github.com/metabrainz/musicbrainz-serve...
      • reosarevok
        yvanzo, bitmap ^ should probably put that second one out before someone complains that some process broke for them :) No need for a hotfix but let's get it out next release?
      • BrainzGit
        [musicbrainz-server] reosarevok opened pull request #2729 (master…consistent-hash-to-row): Make _hash_to_row methods more consistent https://github.com/metabrainz/musicbrainz-serve...
      • zas
        atj: when you're available, I'd like to discuss a few things regarding the new gateways, ping me when around
      • atj
        zas: ping
      • zas
        Morning ;)
      • atj
        morning
      • zas
        can we merge https://github.com/metabrainz/metabrainz-ansibl... or you have further changes to make?
      • there are 2 things we need to solve: IP black/whitelisting and logs aggregation
      • do you have any suggestions regarding those?
      • atj
        let's merge - I tried to come up with a good way to dynamically create the haproxy backend servers but failed
      • I had a look for a solution to manage ipsets but couldn't find anything suitable
      • was thinking we might need to build something using redis pub/sub but it would require a reasonable amount of effort
      • is the log aggregation needed for openresty?
      • zas
        we have a few scripts running on logs, and split logs break them.
      • But that's more a general issue we need to solve
      • atj
        can you point me to the scripts if they're in git?
      • have you had any luck finding anything for managing blocklists?
      • zas
      • atj
        we're going to have to make everything distributed 😱
      • zas
        I'm thinking about a homemade solution for blocklists, based on Python + mosquitto: basically a "listener" on each node, and one "commander". When a listener gets a message about ipsets needing an update, it just applies the change
      • the commander is just posting MQTT messages
      • lucifer
        yvanzo, reosarevok: is it possible to provide a custom postgresql.conf file for the database in musicbrainz docker? i want to preload an extension. didn't find anything obvious in docs on a quick look but might have missed something.
      • zas
      • this one does almost what we need, but I'm thinking about using Python and supporting a few features I'm using with the current Fabric-based stuff (https://github.com/metabrainz/fabric-ip-blacklist)
      • we need to stay very close to the low-level ipset lib to be able to use all features
      • atj
        I did some basic testing with pyroute2 as it's packaged on Ubuntu and it worked well
      • you can add comments to ipset entries, which would be quite nice to support
      • zas
        yes, definitely
      • atj
        I was thinking of using redis/keydb pub/sub as we already have redundant keydb instances on the gateways
      • zas
        yes, actually it can work well (and solve the problem related to missed messages with MQTT)
      • atj
        clients would just need to sub to a single channel
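A minimal sketch of the "commander"/"listener" idea discussed above, with the transport left out: `publish` is any callable that delivers a string to every node (in practice it could wrap a redis/keydb or MQTT publish on a single channel). The message fields, the sequence number used to skip replays, and the dict standing in for the kernel ipset are all assumptions for illustration, not an existing tool.

```python
import json


class IpsetListener:
    """Per-node listener; a dict stands in for the real ipset (ip -> comment)."""

    def __init__(self):
        self.entries = {}
        self.last_seq = 0

    def handle(self, raw):
        """Apply one message; return False if it is a duplicate or stale."""
        msg = json.loads(raw)
        if msg["seq"] <= self.last_seq:
            return False
        if msg["action"] == "add":
            self.entries[msg["ip"]] = msg.get("comment", "")
        elif msg["action"] == "del":
            self.entries.pop(msg["ip"], None)
        else:
            raise ValueError("unknown action: %r" % msg["action"])
        self.last_seq = msg["seq"]
        return True


class Commander:
    """Only posts messages; it never touches the nodes directly."""

    def __init__(self, publish):
        self.publish = publish
        self.seq = 0

    def send(self, action, ip, comment=""):
        self.seq += 1
        self.publish(json.dumps(
            {"seq": self.seq, "action": action, "ip": ip, "comment": comment}))


# In-process "channel": deliver each message to every node.
nodes = [IpsetListener(), IpsetListener()]
commander = Commander(lambda raw: [node.handle(raw) for node in nodes])
commander.send("add", "192.0.2.10", comment="scraper, see ticket")
commander.send("add", "192.0.2.11")
commander.send("del", "192.0.2.11")
print(nodes[0].entries)
```

Carrying the comment in the message keeps room for the ipset entry comments mentioned above; applying the change to a real ipset (for example via pyroute2) would replace the dict operations.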
      • zas
        atj: I have to go afk for a while, please think about this stuff, and let's discuss again this afternoon
      • atj
        ok, not sure about the log aggregation but I'll think on it
      • mayhem
        moooin!
      • lucifer: on the PG issue -- I have no idea how to help with that. I think bitmap would be better.
      • as far as the listens exploding, I wonder if a spammer did something evil and spammed a shit ton of listens we haven't caught?
      • reosarevok: will do.
      • outsidecontext: I got another SSL.com activation link in email. We're using that cert now, right? Can I ignore?
      • outsidecontext
        mayhem: yes, we are using it. That other link you had sent me was for some code signing service we are not interested in. So probably same again?
      • mayhem
        ok, thanks. let's see if next renewal we can renew with someone other than SSL.com, shall we?
      • lucifer
        mayhem: i see, no idea yet. but the problematic data is probably somewhere between mid-2017 and mid-2018.
      • a hunch, but it might just be a coincidence that the date is around the time the created field was added to LB iirc. maybe some other db changes were made as well at that time.
      • mayhem
        if it is a data bug, then we should easily find the crap rows given the large amount of it.
      • I wonder if we could query the data and check to see how many listens each user has in that timeframe. anyone with an outsized set of listens should come up clearly. but I somehow doubt that that is the case. that is a LOT of rows.
      • outsidecontext
        mayhem: agreed
      • lucifer
        sure can do that. find top 10 users with listen counts in sets of 2 years and see if something stands out.
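The check described above could look roughly like this, assuming listens are available as `(user, unix_timestamp)` pairs. The function name, the window alignment via `start_year`, and the sample rows are hypothetical; on the real data this would more likely be a Spark or SQL aggregation.

```python
from collections import Counter, defaultdict
from datetime import datetime, timezone


def top_users_by_window(listens, start_year=2017, span=2, n=10):
    """Group listens into `span`-year windows and return the top `n` users of each.

    listens: iterable of (user, unix_timestamp) pairs.
    Returns {window_start_year: [(user, listen_count), ...]}.
    """
    windows = defaultdict(Counter)
    for user, ts in listens:
        year = datetime.fromtimestamp(ts, tz=timezone.utc).year
        # Snap the year down to the start of its window.
        window = year - (year - start_year) % span
        windows[window][user] += 1
    return {w: c.most_common(n) for w, c in sorted(windows.items())}


def ts(year, month=6, day=1):
    """Helper for building sample timestamps."""
    return datetime(year, month, day, tzinfo=timezone.utc).timestamp()


# Made-up sample: user "a" has most of the listens in the 2017-2018 window.
sample = [("a", ts(2017))] * 3 + [("b", ts(2018)), ("a", ts(2020))]
print(top_users_by_window(sample))
```

A user whose count is orders of magnitude above the rest of a window's top 10 would be the first thing to look at.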
      • mayhem
        great
      • lucifer
        zas, atj: say if prometheus queries a service but the service was unavailable for a while but it persists some metrics in a db/cache, is there a standard way to submit those later to prometheus?
      • zas
        yes, but one needs to use Prometheus PushGateway -> https://prometheus.io/docs/practices/pushing/ and that's not really recommended (as explained).
      • What are you trying to do? Maybe Prometheus isn't the tool you need.
      • lucifer
        zas: just trying to port the existing metric writer, so was wondering what made sense to keep and what didn't
      • mayhem, monkeu: same listen, once with a release mbid, the second time without. when the user submits a release mbid, that is used; otherwise the rg one. up on test.lb for testing. https://usercontent.irccloud-cdn.com/file/0r1hE...
      • *monkey ^
      • monkey
        Neat !
      • mayhem
        Monkeu is monkey's Catalan name!
      • lucifer
        lolol
      • monkey
        Even better, to be honest. Usual name: Nico. Monkey in catalan: mico. What are the chances?
      • BrainzGit
        [listenbrainz-server] amCap1712 opened pull request #2246 (master…fix-rg-ca): Fix release group cover art query https://github.com/metabrainz/listenbrainz-serv...
      • mayhem
        lucifer: that looks great!
      • lucifer
        release group cover art?
      • mayhem
        Yes
      • monkey
        lucifer: I can't find listens to test, but I did manual testing and it's working great!
      • Building on that, we should have a look at how that will impact the charts.
      • Currently we are based on releases, but it would make sense to base stats on RG instead? At least for display purposes (see my page for example with matched releases—i.e. MBID not sent by me— that don't have cover art: https://test.listenbrainz.org/user/mr_monkey/ch...)
      • Well, actually, "can't find listens to test" is precisely because it's working so well ! Looked again closely at some listens, clicking on the cover art to go to the release page, realizing that release does not have cover art. Awesome !
      • lucifer
        monkey: making stats based on RG is probably orthogonal to current change. the cover art aspect of it may or may not be handled by the existing cache. but we could always add another endpoint for that.
      • monkey
        Mmm, I figured the two aspects could be decoupled
      • lucifer
        will need to figure out how to do cover art for a release group based stat but that will likely need separate endpoints at the least and maybe new db tables.
      • monkey
        More testing later, awesome job :) You just improved cover art coverage by a significant margin !
      • bitmap
        lucifer: increasing max_standby_streaming_delay doesn't sound ideal since that may leave the standby very out of date while certain queries are being run
      • lucifer
        bitmap: i see, what is it set to currently?
      • bitmap
        I don't remember configuring it, so probably the default, 30 seconds
      • maybe you can run the problematic queries on floyd instead? or check the current replication lag before invoking them
      • I'm not sure what would be causing a conflict exactly
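A small sketch of the "check the current replication lag before invoking them" suggestion. `pg_last_xact_replay_timestamp()` is a real PostgreSQL function; the helper names, the 5-second threshold, and the fake cursor (used so the sketch runs without a database) are assumptions. With a real connection, any DB-API cursor on the standby would take the fake's place.

```python
# Apparent replay lag on a standby; NULL if nothing has been replayed (e.g. on a primary).
LAG_SQL = "SELECT EXTRACT(EPOCH FROM now() - pg_last_xact_replay_timestamp())"


def replication_lag_seconds(cursor):
    """Return the standby's apparent replay lag in seconds (0.0 if NULL)."""
    cursor.execute(LAG_SQL)
    lag = cursor.fetchone()[0]
    return 0.0 if lag is None else float(lag)


def safe_to_start(cursor, max_lag_seconds=5.0):
    """Hypothetical guard: only start the long cache query when lag is small."""
    return replication_lag_seconds(cursor) <= max_lag_seconds


class FakeCursor:
    """Stand-in for a DB-API cursor that always reports a fixed lag."""

    def __init__(self, lag):
        self.lag = lag

    def execute(self, sql):
        self.sql = sql

    def fetchone(self):
        return (self.lag,)


print(safe_to_start(FakeCursor(1.2)))   # True
print(safe_to_start(FakeCursor(42.0)))  # False
```

Two caveats: this value also grows while the primary is idle, so it can overstate the lag, and a check at start time cannot rule out a conflict later during a query that runs for hours.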
      • yvanzo
        lucifer: You can create a docker compose override file under local/compose to add some -c options to the postgres command.
      • lucifer
        yvanzo: ah yes, i saw that. i thought a pg.conf file would be easier but will try this for now.
      • bitmap: i see. we were running on pink to not overwhelm the primary db.
      • BrainzGit
        [musicbrainz-server] reosarevok opened pull request #2730 (master…more-perl-critic-modules): Follow Perl::Critic::Policy::Modules rules https://github.com/metabrainz/musicbrainz-serve...
      • lucifer
        those bursts are probably due to the query we run. so it can increase load by about ~1%.
      • yvanzo
        lucifer: using a custom config file is possible too but -c option is already used in docker-compose.yml
      • lucifer
        ah cool
      • yvanzo
      • (an override would be needed even to define a custom conf file)
      • BrainzGit
        [musicbrainz-android] dependabot[bot] opened pull request #157 (master…dependabot/github_actions/actions/cache-3): Bump actions/cache from 2 to 3 https://github.com/metabrainz/musicbrainz-andro...
      • [musicbrainz-android] dependabot[bot] opened pull request #158 (master…dependabot/gradle/hilt_version-2.44.1): Bump hilt_version from 2.44 to 2.44.1 https://github.com/metabrainz/musicbrainz-andro...
      • [musicbrainz-android] akshaaatt merged pull request #157 (master…dependabot/github_actions/actions/cache-3): Bump actions/cache from 2 to 3 https://github.com/metabrainz/musicbrainz-andro...
      • [musicbrainz-android] akshaaatt merged pull request #158 (master…dependabot/gradle/hilt_version-2.44.1): Bump hilt_version from 2.44 to 2.44.1 https://github.com/metabrainz/musicbrainz-andro...
      • [musicbrainz-server] yvanzo opened pull request #2731 (master…central-entity): MBS-12552 (IV): Define central entity https://github.com/metabrainz/musicbrainz-serve...
      • [musicbrainz-server] yvanzo opened pull request #2732 (master…core-js): MBS-12552 (V): Refactor JS code for central entity https://github.com/metabrainz/musicbrainz-serve...
      • yvanzo
        reosarevok, bitmap: I patched the musicbrainz-production-cron container to include l_*genre* in dumps (thus starting from next Wednesday) after https://github.com/metabrainz/musicbrainz-serve...
      • reosarevok
        Thanks! :)
      • BrainzGit
        [musicbrainz-server] mwiencek opened pull request #2733 (master…rel-editor-set-attributes-userscripts): Add an easy way to set dialog attributes from userscripts https://github.com/metabrainz/musicbrainz-serve...
      • bitmap
        yvanzo: thanks++
      • alastairp
        👋 hello, I'm not around today and have no meeting notes, have a great Monday everyone
      • reosarevok
        <BANG>
      • I'm taking today instead of Freso
      • It's World Diabetes Day and I don't have a song for that, but be careful with sugar and stuff!
      • I have one mailed-in review:
      • Go Freso!
      • """
      • Last week I poked at forum flags and reported editors and some spam.
      • """
      • People in my list for today are: mayhem, monkey, reosarevok, bitmap, akshaaatt, lucifer, yvanzo, zas, Shubh, yellowhatpro, Pratha-Fish, riksucks, CatQuest, ansh
      • So, go bitmap?
      • bitmap
        hey
      • last week I worked on MBS-12694, MBS-12695, MBS-12679, and finally started on some improvements to simplify userscript integration
      • BrainzBot
        MBS-12694: Visual Appearance Relationship Edit Goes Nowhere [Beta] https://tickets.metabrainz.org/browse/MBS-12694
      • MBS-12679: beta: Relationship editor fails to refresh the attributes when the entity type changes https://tickets.metabrainz.org/browse/MBS-12679
      • MBS-12695: Beta: Adding Entities with Special Characters End Up Corrupt https://tickets.metabrainz.org/browse/MBS-12695
      • bitmap
        (all relationship editor tickets still)
      • other than that, did a bit of code review on some refactoring/cleanup tasks from reosarevok and yvanzo
      • fin. go lucifer
      • Shubh
        reosarevok: Hey, I've nothing to report today, can you please remove my name from the list. Thanks
      • lucifer
        hi all!
      • last week, i worked on making improvements to mb metadata cache in LB including enhancing cover art support. fixed the ts writer after a listen had blocked it. tried to debug sir performance and also tried to debug issues with recording similarity algorithm. no success on either front so far.
      • mayhem: next?
      • mayhem
        hiya
      • ansh
        reosarevok: Hi! I also have nothing to report today. Please remove my name from the list. Thanks
      • mayhem
        last week I spent time on the usual MeB non-sense and also doing some PR review and the like.
      • then I worked a bit on the similar recordings data, trying to work out if it is any good and how to make it better.
      • riksucks
        Hi reosarevok, nothing to report for the week, hope you guys had a great weekend :D feel free to remove my name
      • mayhem
        I've now gotten some pointers for whom to ask to make it better, so I've also started that process.
      • but, the data is pretty good already. likely good enough to get started building more recommendation stuff soon.
      • (but not quite yet)
      • fin.
      • monkeu??
      • (monkey)
      • yvanzo
        mayhem: how much of non-sense is that?
      • akshaaatt
        Enough to make sense
      • monkey
        Hello !
      • mayhem
        yvanzo: loads of non-sense.
      • monkey
        Last week I worked on a few small features for LB: Show icon of music service in BrainzPlayer, add a menu option to show the source JSON of a listen, and started working on showing tags
      • Also looked at lucifer's PR for importing spotify extended history