#metabrainz

      • Freso
        (People still up: Freso, alastairp, zas, Mr_Monkey, bitmap, _lucifer, diru1100 – anyone else who wants to give a review, let me know ASAP!)
      • 2021-02-01 03256, 2021

      • ruaok
        last week was a fiasco with blackout thursday that seems like it took over the whole week
      • 2021-02-01 03223, 2021

      • ruaok
        wait, it happened wednesday, no?
      • 2021-02-01 03236, 2021

      • ruaok
        at least I can tell the story now. :)
      • 2021-02-01 03247, 2021

      • ruaok
        so, wednesday morning all of our sites tipped over.
      • 2021-02-01 03250, 2021

      • vasharma0521 joined the channel
      • 2021-02-01 03216, 2021

      • yvanzo
        Oh right, that was on Wednesday at 4:30 UTC.
      • 2021-02-01 03227, 2021

      • repo has quit
      • 2021-02-01 03239, 2021

      • Freso
        So Wipeout Wednesday?
      • 2021-02-01 03240, 2021

      • CatQuest
        tsk, and "Blackout Thursday" had such a nice ring to it
      • 2021-02-01 03243, 2021

      • ruaok
        except they didn't. everything was working fine and some requests came in. we suspected the network. we suspected a DDoS attack and asked Hetzner if they could see a DDoS attack: Their word: No.
      • 2021-02-01 03245, 2021

      • CatQuest
        hahaha
      • 2021-02-01 03259, 2021

      • ruaok
        zas and I poked around and poked around.
      • 2021-02-01 03201, 2021

      • CatQuest
        Mr_Monkey:
      • 2021-02-01 03216, 2021

      • ruaok
        eventually zas started looking at IP addresses and noticed that a lot were coming from AWS.
      • 2021-02-01 03236, 2021

      • repo joined the channel
      • 2021-02-01 03238, 2021

      • ruaok
        so we blocked two large swaths of their IP address ranges and traffic returned to normal after 3.5 hours of downtime.
      • 2021-02-01 03259, 2021

      • sumedh has quit
      • 2021-02-01 03205, 2021

      • ruaok
        zas then filed a report with AWS and we didn't expect anything else to happen from there.
      • 2021-02-01 03232, 2021

      • ruaok
        the next day one of our supporters contacted us and said "WTF, AWS says we're DDoSing you?"
      • 2021-02-01 03247, 2021

      • vasharma0521 is now known as vineetsharma
      • 2021-02-01 03216, 2021

      • ruaok
        and we went back and forth a number of rounds to work out what it was. we first blocked all of their IPs and the problem went away. unblock one IP and the problem immediately came back for that IP.
      • 2021-02-01 03222, 2021

      • ruaok
        eventually they found the problem.
      • 2021-02-01 03227, 2021

      • shivam-kapila
        (well AWS did something at least)
      • 2021-02-01 03251, 2021

      • ruaok
        turns out that a misconfigured delay caused cover art archive lookups to happen A LOT faster than they should.
      • 2021-02-01 03237, 2021

      • ruaok
        I'm still confused if this was in end-user software or on their own servers, but they fixed it and we unblocked them
      • 2021-02-01 03242, 2021

      • ruaok
        anyone wanna guess who it was?
      • 2021-02-01 03222, 2021

      • ruaok
        there will be a blog post detailing what happened tomorrow.
      • 2021-02-01 03246, 2021

      • ruaok
        earlier in the week I submitted a blog post and had a conversation with a potential new unicorn.
      • 2021-02-01 03207, 2021

      • ruaok
        oh and I revealed that Sonos was the new unicorn we signed in December.
      • 2021-02-01 03213, 2021

      • CatQuest
        hmmmm
      • 2021-02-01 03213, 2021

      • ruaok
        shivam-kapila: no.
      • 2021-02-01 03223, 2021

      • ruaok
        that was it. fin.
      • 2021-02-01 03230, 2021

      • ruaok
        zas: anything to add?
      • 2021-02-01 03234, 2021

      • zas
        hey
      • 2021-02-01 03240, 2021

      • shivam-kapila
        new one coming. nice :)
      • 2021-02-01 03242, 2021

      • ruaok
        !m zas
      • 2021-02-01 03242, 2021

      • BrainzBot
        You're doing good work, zas!
      • 2021-02-01 03250, 2021

      • zas
        I investigated the issue a bit more
      • 2021-02-01 03234, 2021

      • zas
        so about 100 different servers were querying caa redirect service very very fast (to me, unlimited speed)
      • 2021-02-01 03252, 2021

      • zas
        so it caused an exhaustion of possible connections on our gateways
      • 2021-02-01 03212, 2021

      • zas
        we have various counter-measures but they didn't suffice
      • 2021-02-01 03240, 2021

      • zas
        so I added a few more rate limits to caa
      • 2021-02-01 03211, 2021
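
Editorial sketch: the log does not show how the CAA rate limits are configured on the gateways, so the following is only a minimal illustration of the general idea, a per-client token bucket. The class names, the `rate` and `capacity` parameters, and the use of the client IP as the key are all assumptions made for the example, not the actual MetaBrainz setup.

```python
import time


class TokenBucket:
    """Allow `rate` requests per second on average, with bursts up to `capacity`."""

    def __init__(self, rate, capacity, now=None):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity  # start full so a well-behaved client is never delayed
        self.updated = time.monotonic() if now is None else now

    def allow(self, now=None):
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.updated)
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class PerClientLimiter:
    """Hypothetical wrapper: one bucket per client IP."""

    def __init__(self, rate, capacity):
        self.rate = rate
        self.capacity = capacity
        self.buckets = {}

    def allow(self, client_ip, now=None):
        bucket = self.buckets.get(client_ip)
        if bucket is None:
            bucket = TokenBucket(self.rate, self.capacity, now)
            self.buckets[client_ip] = bucket
        return bucket.allow(now)
```

A client that sends requests with no delay between them drains its bucket after `capacity` requests and is then held to `rate` requests per second, while other clients are unaffected. A real deployment would more likely enforce this in the proxy layer than in application code.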

      • zas
        during the blackout we tried to switch to herb (our second gateway)
      • 2021-02-01 03244, 2021

      • zas
        but I encountered various non-critical issues that I worked on solving this week
      • 2021-02-01 03255, 2021

      • zas
        mainly mbstats needed some care
      • 2021-02-01 03208, 2021

      • zas
        also noted gateways-redis was under assault during the blackout
      • 2021-02-01 03211, 2021

      • zas
        I restarted work on something I tried a while ago (without much success) but now it works: the goal is to replace this redis instance with a fully redundant and quicker alternative
      • 2021-02-01 03235, 2021

      • zas
        basically keydb + keepalived + haproxy, running on gateways themselves
      • 2021-02-01 03218, 2021

      • alastairp
        zas: I have some keepalived configuration that works if you're interested in looking at it for comparison purposes
      • 2021-02-01 03224, 2021

      • zas
        I also detected a stupid issue: ufw overrides sysctl.conf on reboot, it caused gateways to not use correct sysctl values
      • 2021-02-01 03228, 2021

      • zas
        (fixed now)
      • 2021-02-01 03252, 2021
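
Editorial sketch: the log does not say how the ufw override was detected or fixed. One way to spot this kind of conflict is to compare the keys set in the system sysctl file with those in ufw's own sysctl file. The sketch below assumes the usual Debian/Ubuntu layout, where ufw keeps its settings in a separate file (commonly `/etc/ufw/sysctl.conf`) and writes keys with slashes instead of dots. The function names and the sample values are hypothetical.

```python
def parse_sysctl(text):
    """Parse sysctl-style `key = value` lines into a dict, skipping comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue
        # Normalise slash notation (net/core/somaxconn) to dot notation.
        # This is naive: it does not handle interface names containing dots.
        settings[key.strip().replace("/", ".")] = value.strip()
    return settings


def conflicting_keys(base_text, override_text):
    """Return {key: (base_value, override_value)} for keys set differently."""
    base = parse_sysctl(base_text)
    override = parse_sysctl(override_text)
    return {
        key: (base[key], override[key])
        for key in base
        if key in override and base[key] != override[key]
    }


# Hypothetical file contents, for illustration only.
system_conf = "net.core.somaxconn = 4096\nnet.ipv4.tcp_syncookies = 1\n"
ufw_conf = "# ufw defaults\nnet/core/somaxconn=128\nnet/ipv4/tcp_syncookies=1\n"
print(conflicting_keys(system_conf, ufw_conf))
```

Any key reported here would end up with the later-applied value after a reboot, which matches the symptom zas describes.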

      • zas
        fin. Mr_Monkey ?
      • 2021-02-01 03254, 2021

      • yvanzo
        !m zas
      • 2021-02-01 03254, 2021

      • BrainzBot
        You're doing good work, zas!
      • 2021-02-01 03200, 2021

      • Mr_Monkey
        Hello !
      • 2021-02-01 03219, 2021

      • Mr_Monkey
        Last week I worked on merging PRs on BB and LB
      • 2021-02-01 03233, 2021

      • ruaok
        > I also detected a stupid issue: ufw overrides sysctl.conf on reboot, it caused gateways to not use correct sysctl values
      • 2021-02-01 03248, 2021

      • ruaok
        was this causing the dropped packets between trille and kiki?
      • 2021-02-01 03256, 2021

      • zas
        nope^^
      • 2021-02-01 03205, 2021

      • zas
        but it could have
      • 2021-02-01 03211, 2021

      • Mr_Monkey
        I also fixed an issue with the track search input on the LB playlist page
      • 2021-02-01 03236, 2021

      • Mr_Monkey
        Worked a bit more on MB icons for various devices
      • 2021-02-01 03211, 2021

      • Mr_Monkey
        And worked on setting up backups
      • 2021-02-01 03229, 2021

      • Mr_Monkey
        And finally some fiddling with Jenkins on the LB CI setup
      • 2021-02-01 03255, 2021

      • Mr_Monkey
        That's most of it for me !
      • 2021-02-01 03255, 2021

      • Mr_Monkey
        Go Freso !
      • 2021-02-01 03207, 2021

      • Freso
        o/
      • 2021-02-01 03257, 2021

      • Freso
        So I went over chat logs last week and compiled meeting notes from this year… and posted them on the forum this morning.
      • 2021-02-01 03223, 2021

      • yvanzo
        !m Freso
      • 2021-02-01 03223, 2021

      • BrainzBot
        You're doing good work, Freso!
      • 2021-02-01 03238, 2021

      • Freso
        Other than that, mostly lurking about, getting back into things and trying out some new/new-old processes.
      • 2021-02-01 03241, 2021

      • Freso
        fin.
      • 2021-02-01 03246, 2021

      • Freso
        alastairp: Go!
      • 2021-02-01 03254, 2021

      • alastairp
        last week I moved jenkins from williams to cage, to try and reduce the disk usage on williams
      • 2021-02-01 03259, 2021

      • alastairp
        I also upgraded jenkins and helped Mr_Monkey upgrade the use of a jenkins plugin in LB JS tests
      • 2021-02-01 03204, 2021

      • alastairp
        I made a start on an improvement to LB tests to make sure that we delete all unused docker images after tests finish (to prevent running out of disk space)
      • 2021-02-01 03209, 2021

      • alastairp
        I was around a bit during the blackout but wasn't able to help very much
      • 2021-02-01 03215, 2021

      • alastairp
        I did some docker/uwsgi maintenance to reduce the size of docker logs, which freed up about 300 GB in total over all of our python apps
      • 2021-02-01 03219, 2021

      • alastairp
        I helped this morning with some improvements to caching in CB
      • 2021-02-01 03223, 2021

      • alastairp
        I migrated tests for BU from travis to Jenkins again
      • 2021-02-01 03225, 2021

      • Freso
        (Only bitmap, _lucifer, and diru1100 left on my list. Last call for anyone else wanting to give review!)
      • 2021-02-01 03231, 2021

      • alastairp
        bitmap: next
      • 2021-02-01 03236, 2021

      • bitmap
        hey
      • 2021-02-01 03213, 2021

      • bitmap
        (related to the supporter ddos) last week I worked on optimizing caa access in mbs to avoid routing requests through the redirect service; no real reason we need to go through the service for internal use
      • 2021-02-01 03244, 2021

      • bitmap
        should be submitting that today, but Zas has since added a global rate limit too
      • 2021-02-01 03216, 2021

      • bitmap
        I also continued working on converting the relationship editor code to React so we can convert the rest of the entity edit forms
      • 2021-02-01 03244, 2021

      • bitmap
        there were a bunch of conflicts to fix first, but I was updating it to make use of the DateRangeFieldset component we added for the alias edit form
      • 2021-02-01 03207, 2021

      • bitmap
        I'm also changing the state handling to match what we did there for consistency. I think it should be working by the end of the week
      • 2021-02-01 03220, 2021

      • bitmap
        otherwise mostly spent time on code review
      • 2021-02-01 03225, 2021

      • bitmap
        fin! _lucifer go
      • 2021-02-01 03232, 2021

      • reosarevok
        !m bitmap
      • 2021-02-01 03232, 2021

      • BrainzBot
        You're doing good work, bitmap!
      • 2021-02-01 03238, 2021

      • _lucifer
        hi all!
      • 2021-02-01 03221, 2021

      • ruaok
        > db0:keys=8660,expires=8286,avg_ttl=10039539
      • 2021-02-01 03230, 2021

      • ruaok
        CB still has some cache keys without expiry
      • 2021-02-01 03241, 2021
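
Editorial sketch: the keyspace line ruaok pasted can be read directly. `keys` is the total number of keys in the database and `expires` is how many of them have a TTL, so the difference is the number of keys that never expire. The helper names below are made up for the example; only the input line comes from the log.

```python
def parse_keyspace(line):
    """Turn 'db0:keys=8660,expires=8286,avg_ttl=10039539' into a dict of ints."""
    _, _, fields = line.partition(":")
    return {
        name: int(value)
        for name, value in (pair.split("=", 1) for pair in fields.split(","))
    }


def no_expiry_stats(line):
    """Return (count, share) of keys that have no TTL set."""
    stats = parse_keyspace(line)
    without_ttl = stats["keys"] - stats["expires"]
    share = without_ttl / stats["keys"] if stats["keys"] else 0.0
    return without_ttl, share


count, share = no_expiry_stats("db0:keys=8660,expires=8286,avg_ttl=10039539")
print(f"{count} keys without expiry ({share:.1%} of all keys)")
```

For the pasted line this works out to 374 keys without expiry, roughly 4.3% of the total. Redis reports `avg_ttl` in milliseconds, so the average remaining TTL here is about 2.8 hours.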

      • _lucifer
        I was mostly busy last week and didn't do much but I was able to help with the CB redis today. That's it for me.
      • 2021-02-01 03251, 2021

      • ruaok
        thank you!
      • 2021-02-01 03253, 2021

      • _lucifer
        diru1100: next?
      • 2021-02-01 03241, 2021

      • alastairp
        ruaok: a lot less than the 10% that it was, though. (I assume you did a flushall again?) Let me look at the cached values again
      • 2021-02-01 03253, 2021

      • Freso
        Or maybe no diru around.
      • 2021-02-01 03202, 2021

      • Freso
        So I guess that’s that for reviews!
      • 2021-02-01 03205, 2021

      • ruaok did flushall
      • 2021-02-01 03219, 2021

      • Freso
        And the GSoC topic is postponed to next week, so…
      • 2021-02-01 03225, 2021

      • Freso
        That rounds up this meeting!
      • 2021-02-01 03239, 2021

      • Mr_Monkey
        Thanks !
      • 2021-02-01 03243, 2021

      • Freso
        Thank you everyone who gave reviews, and thank you all for your time!
      • 2021-02-01 03253, 2021

      • Freso
        Stay safe and hydrated and wash your hands!
      • 2021-02-01 03255, 2021

      • Freso
        </BANG>
      • 2021-02-01 03222, 2021

      • ruaok
        >Stay safe and hydrated and wear a mask!
      • 2021-02-01 03230, 2021

      • Freso
        That, too. 😷
      • 2021-02-01 03251, 2021

      • shivam-kapila
        Thanks everyone
      • 2021-02-01 03211, 2021

      • alastairp
        _lucifer: ruaok: sample non-ttl value: `[b'spotify:album:1eSyakZFx5a2eAjRibvBvQ']`
      • 2021-02-01 03222, 2021
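
Editorial sketch: the log does not show how alastairp pulled this sample. One plausible way to list keys with no TTL is to scan the keyspace and check each key's TTL, which Redis reports as -1 when no expiry is set. The function below assumes a client that behaves like redis-py (`scan_iter()` and `ttl()`); the `FakeClient` class and its keys are stand-ins so the example runs without a server.

```python
def keys_without_ttl(client, limit=20):
    """Return up to `limit` keys that have no expiry set.

    `client` is assumed to behave like a redis-py client: `scan_iter()`
    yields key names and `ttl(key)` returns -1 for a key with no expiry.
    """
    found = []
    for key in client.scan_iter(count=500):
        if client.ttl(key) == -1:
            found.append(key)
            if len(found) >= limit:
                break
    return found


class FakeClient:
    """In-memory stand-in so the sketch runs without a Redis server."""

    def __init__(self, ttls):
        self.ttls = ttls  # key -> remaining TTL in seconds, or -1 for none

    def scan_iter(self, count=None):
        return iter(self.ttls)

    def ttl(self, key):
        return self.ttls.get(key, -2)  # -2 mirrors Redis for a missing key


# Hypothetical keys, for illustration only.
demo = FakeClient({b"a": 3600, b"b": -1, b"c": -1})
print(keys_without_ttl(demo))
```

Scanning incrementally instead of listing every key at once keeps the check from blocking a busy cache, and the `limit` keeps the output to a small sample.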

      • alastairp
        which looks suspiciously like mbspotify too, but an example with data
      • 2021-02-01 03240, 2021

      • shivam-kapila
        If anyone has any views for the LB ideas, I would be happy to discuss
      • 2021-02-01 03247, 2021

      • ruaok
        indeed it does. I hope I deployed it correctly. :)
      • 2021-02-01 03205, 2021

      • alastairp
        still a handful of empty lists too
      • 2021-02-01 03219, 2021

      • yvanzo
        ruaok, zas: Thanks for the details about the blackout.
      • 2021-02-01 03210, 2021

      • ruaok
        np. we'll have to spend the $400 on fun next time we can meet in person.
      • 2021-02-01 03222, 2021

      • ruaok
        unless zas and I drink the budget dry before then.
      • 2021-02-01 03242, 2021

      • alastairp
        he deserves it
      • 2021-02-01 03250, 2021

      • ruaok
        indeed he does.
      • 2021-02-01 03216, 2021

      • ruaok trundles off away from the 'puter
      • 2021-02-01 03254, 2021

      • vineetsharma76 joined the channel
      • 2021-02-01 03224, 2021

      • vineetsharma has quit
      • 2021-02-01 03229, 2021

      • vineetsharma76 has quit
      • 2021-02-01 03217, 2021

      • Nyanko-sensei joined the channel
      • 2021-02-01 03203, 2021

      • alastairp
        Mr_Monkey: hey, I tried the pils yesterday
      • 2021-02-01 03206, 2021

      • alastairp
        it's _so clear_
      • 2021-02-01 03246, 2021

      • Mr_Monkey
        It is ! A bit bitter for me. I'm gonna wait another few weeks at least to see if aromas develop a bit more
      • 2021-02-01 03252, 2021

      • alastairp
        ohno
      • 2021-02-01 03222, 2021

      • CatQuest
        !recall oh no
      • 2021-02-01 03222, 2021

      • BrainzBot
        I'm sorry, I don't remember "oh no", are you sure I should know about it?
      • 2021-02-01 03227, 2021

      • CatQuest
        hm
      • 2021-02-01 03236, 2021

      • alastairp
        !recall oh no.
      • 2021-02-01 03236, 2021

      • BrainzBot
        I'm sorry, I don't remember "oh no.", are you sure I should know about it?
      • 2021-02-01 03240, 2021

      • alastairp
        oh no
      • 2021-02-01 03242, 2021

      • CatQuest
        it should say "I will remember $phrase for you "
      • 2021-02-01 03205, 2021

      • reosarevok
        !recall oh no
      • 2021-02-01 03205, 2021

      • BrainzBot
        I'm sorry, I don't remember "oh no", are you sure I should know about it?
      • 2021-02-01 03228, 2021

      • alastairp
        maybe we turned it off!
      • 2021-02-01 03234, 2021

      • CatQuest
        !help
      • 2021-02-01 03254, 2021

      • CatQuest
        BrainzBot: help