#metabrainz

      • zas
        hey, good morning
      • 2021-02-05 03604, 2021

      • zas
        burnside mb website performance is strangely bad
      • 2021-02-05 03610, 2021

      • bitmap
        moin
      • 2021-02-05 03616, 2021

      • bitmap
        guess rebooting didn't help :\
      • 2021-02-05 03627, 2021

      • zas
        I used a sample of stat logs to make a few statistics, and look at the last part: 2.2% of requests took over 10 seconds (I used upstream_header_time)
      • 2021-02-05 03647, 2021
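
A rough sketch of the kind of per-node statistic zas describes: the share of requests whose upstream_header_time exceeds 10 seconds, grouped by upstream node. The input shape (pairs of upstream address and header time) and the function name are assumptions for illustration; the actual log format and tooling are not shown in the channel.

```python
from collections import defaultdict


def slow_ratio_by_upstream(records, threshold=10.0):
    """Return {upstream: fraction of requests slower than threshold}.

    `records` is an iterable of (upstream, header_time_seconds) pairs.
    This shape is hypothetical; real log lines would need parsing first.
    """
    totals = defaultdict(int)
    slow = defaultdict(int)
    for upstream, header_time in records:
        totals[upstream] += 1
        if header_time > threshold:
            slow[upstream] += 1
    # Every upstream seen at least once gets a ratio, even if it is 0.0.
    return {upstream: slow[upstream] / totals[upstream] for upstream in totals}
```

If load is balanced evenly, the ratios should come out roughly equal across nodes, so one node standing out points at that host rather than at the application.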

      • zas
        ratio should be the same among all nodes
      • 2021-02-05 03620, 2021

      • zas
        but clearly .29 is giving much worse results
      • 2021-02-05 03642, 2021

      • zas
        and on burnside, nothing else runs alongside the mb containers
      • 2021-02-05 03627, 2021

      • bitmap
        huh. but pink and cage are acting better?
      • 2021-02-05 03615, 2021

      • zas
        well, cage is a bit worse, on this sample, but according to http://stats.metabrainz.org/goto/dxLWIrYGk it isn't great either
      • 2021-02-05 03610, 2021

      • zas
        stats were made on a 1M hits sample, but graphs show very bad peaks sometimes
      • 2021-02-05 03613, 2021

      • bitmap
        could be the same issue with connections to pg timing out
      • 2021-02-05 03632, 2021

      • bitmap
        well, it's definitely affected by that, I just checked
      • 2021-02-05 03608, 2021

      • _lucifer
        alastairp: i am testing the version-ranges PR. works fine for CB. for LB, i get this error
      • 2021-02-05 03612, 2021

      • _lucifer
        i think using version ranges for dataset hoster as well would fix this
      • 2021-02-05 03612, 2021

      • _lucifer
        works fine for AB as well
      • 2021-02-05 03618, 2021

      • alastairp
        _lucifer: mmm, right. this is also related to BU-31
      • 2021-02-05 03619, 2021

      • BrainzBot
        BU-31: Upgrade raven -> sentry-python https://tickets.metabrainz.org/browse/BU-31
      • 2021-02-05 03626, 2021

      • alastairp
        so, as a background:
      • 2021-02-05 03646, 2021

      • alastairp
        the software for sentry used to be called raven. then they made a new version called sentry-python. BU uses raven
      • 2021-02-05 03616, 2021

      • alastairp
        when I added sentry to the dataset hoster, I used sentry-python instead, because I knew that we were going to upgrade it
      • 2021-02-05 03640, 2021

      • alastairp
        for now I'd be happy to just match the versions, because we're going to fix it anyway
      • 2021-02-05 03654, 2021

      • alastairp
        I guess we should do something like what iliekcomputers and I discussed in the BU upgrade ticket... use a range in packages that are installed as dependencies (in this case datasethoster, I guess?) and an exact version in the repo (lb?)
      • 2021-02-05 03655, 2021

      • _lucifer
        yeah, i saw that. i was thinking of taking that up but i am not sure how to test it.
      • 2021-02-05 03633, 2021

      • _lucifer
        yeah, that makes sense, alastairp .
      • 2021-02-05 03601, 2021

      • CatQuest
        it's bandcamp friday!!!
      • 2021-02-05 03654, 2021

      • atj
        I know, I've spent too much money already...
      • 2021-02-05 03602, 2021

      • CatQuest
        \o/
      • 2021-02-05 03604, 2021

      • zas
        bitmap: did you see the 58s mean peak from cage a few minutes ago?
      • 2021-02-05 03644, 2021

      • bitmap
        no, I wasn't watching
      • 2021-02-05 03619, 2021

      • bitmap
        there is definitely an issue with requests timing out connecting to PG
      • 2021-02-05 03623, 2021

      • zas
        bitmap: found something?
      • 2021-02-05 03625, 2021

      • bitmap
        not yet, I'm just doing some tests
      • 2021-02-05 03646, 2021

      • bitmap
        I wrote a simple script that just connects to PG in a loop to reproduce the issue
      • 2021-02-05 03629, 2021

      • bitmap
        (from perl)
      • 2021-02-05 03633, 2021
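
bitmap's script was written in Perl and is not shown here. As a loose illustration of the same idea, the sketch below opens plain TCP connections in a loop and records how long each one takes. It only tests the TCP connect, not a PostgreSQL login, and the function name and defaults are made up for this example.

```python
import socket
import time


def probe_connects(host, port, attempts=20, timeout=5.0):
    """Try to connect `attempts` times; return a list of results.

    Each entry is the connect time in seconds, or None if the attempt
    failed or timed out.
    """
    results = []
    for _ in range(attempts):
        started = time.monotonic()
        try:
            # create_connection raises OSError (including timeouts) on failure.
            with socket.create_connection((host, port), timeout=timeout):
                results.append(time.monotonic() - started)
        except OSError:
            results.append(None)
    return results
```

Running the same loop inside and outside a container, against the same host and port, is one way to compare the two environments directly.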

      • zas
        k
      • 2021-02-05 03634, 2021

      • ruaok
        alastairp: iliekcomputers : got some mock tips for me?
      • 2021-02-05 03635, 2021

      • bitmap
        that triggers it but I'm seeing if the same happens with psql
      • 2021-02-05 03632, 2021

      • ruaok
        client.collections[COLLECTION_NAME].documents.search(...) is the function I want to mock, but I don't know how to do it when there are multiple levels, one of which is an array deref....
      • 2021-02-05 03617, 2021

      • zas
        bitmap: I don't get why only a few servers seem impacted...
      • 2021-02-05 03622, 2021

      • _lucifer
        ruaok: maybe mock the search function instead of the client?
      • 2021-02-05 03629, 2021

      • zas
        and not all requests...
      • 2021-02-05 03614, 2021

      • ruaok
        how does one do that, _lucifer ?
      • 2021-02-05 03633, 2021

      • bitmap
        zas: no idea. same issue is happening with psql, so it's not a Perl or musicbrainz-server issue AFAICT
      • 2021-02-05 03641, 2021

      • bitmap
        I'll check if it happens *outside* the container
      • 2021-02-05 03649, 2021

      • zas
        ok
      • 2021-02-05 03612, 2021

      • zas
        I suspect a network issue... but I found nothing obvious
      • 2021-02-05 03659, 2021

      • _lucifer
        ruaok: what's the type of the documents? or location of the search function
      • 2021-02-05 03643, 2021

      • ruaok
        location is typesense.Client.collections[COLLECTION_NAME].documents.search
      • 2021-02-05 03659, 2021

      • ruaok
        and it is supposed to return a dict...
      • 2021-02-05 03646, 2021

      • _lucifer
        try mocking `typesense.documents.search`
      • 2021-02-05 03652, 2021

      • bitmap
        zas: interesting, running the same script outside the container doesn't trigger the issue at all
      • 2021-02-05 03604, 2021

      • zas
        so that's a docker problem
      • 2021-02-05 03617, 2021

      • bitmap
        /home/bitmap/test_psql.sh on cage
      • 2021-02-05 03607, 2021

      • bitmap
        it hangs almost immediately (after a few tries) inside the musicbrainz-website-prod container
      • 2021-02-05 03614, 2021

      • _lucifer
        ruaok: i mean something like https://www.irccloud.com/pastebin/zYJINAYp/
      • 2021-02-05 03633, 2021

      • ruaok
        that doesn't look right to me, but let me try.
      • 2021-02-05 03637, 2021

      • bitmap
        strace output looks identical to first two lines of https://gist.github.com/mwiencek/2ee882b236b9c08b… when it happens
      • 2021-02-05 03606, 2021

      • bitmap
        except for the "interrupted by signal" part, that's the SIGALRM that MBS registers
      • 2021-02-05 03600, 2021

      • atj
        SIGALRM is likely a timeout
      • 2021-02-05 03605, 2021

      • atj
        60s
      • 2021-02-05 03617, 2021

      • atj
        It's designed to interrupt the poll()
      • 2021-02-05 03625, 2021

      • atj
        as it's a blocking call
      • 2021-02-05 03632, 2021
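
The mechanism atj describes (an alarm signal used to break out of a blocking poll()) can be sketched in Python as follows. This is not how musicbrainz-server does it (that code is Perl and is not shown here); the names are invented for the example, the limit is far shorter than the 60s mentioned above, and it only works on Unix in the main thread.

```python
import select
import signal
import socket


class WaitTimeout(Exception):
    """Raised by the SIGALRM handler to abort a blocking call."""


def _on_alarm(signum, frame):
    raise WaitTimeout()


def wait_readable(sock, limit_seconds):
    """Block in poll() until `sock` is readable; give up after the limit.

    Returns True if the socket became readable, False if the alarm fired.
    """
    previous = signal.signal(signal.SIGALRM, _on_alarm)
    signal.setitimer(signal.ITIMER_REAL, limit_seconds)
    try:
        poller = select.poll()
        poller.register(sock, select.POLLIN)
        poller.poll()  # no timeout argument: blocks until an event or a signal
        return True
    except WaitTimeout:
        return False
    finally:
        # Cancel the timer and restore whatever handler was installed before.
        signal.setitimer(signal.ITIMER_REAL, 0)
        signal.signal(signal.SIGALRM, previous)
```

As bitmap notes right after, the alarm only explains why the hung call eventually gets interrupted; it says nothing about why the connection hangs in the first place.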

      • bitmap
        yep, that's set by musicbrainz-server
      • 2021-02-05 03647, 2021

      • _lucifer
        ruaok: i made a typo, the D should be capital, i.e. `Documents`
      • 2021-02-05 03607, 2021

      • atj
        Basically, your connections to pg are timing out
      • 2021-02-05 03610, 2021

      • bitmap
        it's not related to why the connection is timing out though
      • 2021-02-05 03622, 2021

      • ruaok
        `ModuleNotFoundError: No module named 'typesense.Documents'`
      • 2021-02-05 03656, 2021

      • bitmap
        we figured that out but it's something docker related that I'm not sure how to debug further :)
      • 2021-02-05 03616, 2021

      • bitmap
        connections only time out inside the container
      • 2021-02-05 03618, 2021

      • atj
        tcpdump would be my next port of call
      • 2021-02-05 03639, 2021

      • atj
        I imagine you don't have that in the container though
      • 2021-02-05 03641, 2021

      • atj
        :)
      • 2021-02-05 03613, 2021

      • bitmap
        we could prob install it, apt-get works
      • 2021-02-05 03600, 2021

      • zas
        bitmap: I found something, I will do some testing on burnside
      • 2021-02-05 03609, 2021

      • bitmap
        okay
      • 2021-02-05 03612, 2021

      • atj
        You want to check if the connections are getting through to the docker instance
      • 2021-02-05 03625, 2021

      • atj
        Isolate where the problem is occurring
      • 2021-02-05 03652, 2021

      • zas
        bitmap: can you re-test on burnside now?
      • 2021-02-05 03658, 2021

      • bitmap
        sure
      • 2021-02-05 03635, 2021

      • zas
        really not sure about the change I made, but it is perhaps related to latest docker version (which is installed on cage & burnside)
      • 2021-02-05 03604, 2021

      • bitmap
        seems to just always hang immediately
      • 2021-02-05 03649, 2021

      • zas
        so no change... ook
      • 2021-02-05 03651, 2021

      • bitmap
        oh now it's working
      • 2021-02-05 03612, 2021

      • sumedh joined the channel
      • 2021-02-05 03657, 2021

      • BrainzGit
        [musicbrainz-server] reosarevok opened pull request #1896 (master…consistent-license-block): Make Perl license block consistent https://github.com/metabrainz/musicbrainz-server/…
      • 2021-02-05 03659, 2021

      • bitmap
        hrm it was working for a bit, now it's hanging on poll again
      • 2021-02-05 03637, 2021

      • zas
        that's on burnside right?
      • 2021-02-05 03603, 2021

      • bitmap
        yeah
      • 2021-02-05 03616, 2021

      • bitmap
        yeah it's still hitting the issue
      • 2021-02-05 03601, 2021

      • zas
        ok, I'll try another thing, but I need to reboot burnside
      • 2021-02-05 03635, 2021

      • bitmap
        ok
      • 2021-02-05 03628, 2021

      • zas
        bitmap: can you retest?
      • 2021-02-05 03648, 2021

      • bitmap
        sure
      • 2021-02-05 03653, 2021

      • zas
        Sorry, going step by step to eliminate candidates for this mess....
      • 2021-02-05 03654, 2021

      • _lucifer
        ruaok: did some looking around and this finally worked for me https://www.irccloud.com/pastebin/vQOVgV2K/
      • 2021-02-05 03633, 2021

      • bitmap
        zas: no issues so far...
      • 2021-02-05 03655, 2021

      • zas
        hmmm interesting
      • 2021-02-05 03636, 2021

      • ruaok
        _lucifer: oh, that sounds very promising -- how did you arrive at this?
      • 2021-02-05 03620, 2021

      • _lucifer
        ruaok: i had to look where the search function was located i.e. the module or the class. because the mock is done at that level. I looked at the source code of typesense to find the search method. It was in `Documents` class in documents.py.
      • 2021-02-05 03623, 2021
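
The pastebin contents are not reproduced here, so the following is only a guess at the shape of the approach _lucifer describes: patch `search` on the class that defines it, rather than trying to reach it through the client. The `Client`, `Collection`, and `Documents` classes below are stand-ins written for this example, not the real typesense classes; with the real library the patch target would presumably be the `Documents` class in typesense's documents.py.

```python
from unittest.mock import patch


# Stand-in classes mimicking the client.collections[name].documents.search chain.
class Documents:
    def search(self, params):
        raise RuntimeError("would hit the network")


class Collection:
    def __init__(self):
        self.documents = Documents()


class Client:
    def __init__(self):
        self.collections = {"recordings": Collection()}


def lookup(client, query):
    """Code under test: calls search through the multi-level chain."""
    return client.collections["recordings"].documents.search({"q": query})


def run_lookup_with_patched_search():
    fake_result = {"hits": []}
    # Patching the attribute on the class affects every Documents instance,
    # but only for the duration of the with-block.
    with patch.object(Documents, "search", return_value=fake_result) as mocked:
        result = lookup(Client(), "portishead")
    return result, mocked.call_args
```

Because the mock replaces the class attribute, the dict lookup and attribute chain in between run unchanged, which sidesteps the "array deref" problem ruaok mentioned.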

      • sumedh has quit
      • 2021-02-05 03622, 2021

      • ruaok
        ok, I guess I'll read the source of what I am trying to mock next time. thanks!
      • 2021-02-05 03631, 2021

      • _lucifer
        ideally, that shouldn't be needed but i couldn't find it in the api docs.
      • 2021-02-05 03648, 2021

      • _lucifer
        so had to go in the source
      • 2021-02-05 03631, 2021

      • atj
        zas: I'm curious what you changed?
      • 2021-02-05 03606, 2021

      • alastairp
        ruaok: mmm, mocks in class instances are super annoying
      • 2021-02-05 03613, 2021

      • alastairp
        which is a bit annoying too
      • 2021-02-05 03639, 2021

      • ruaok
        did you look at _lucifer's solution above?
      • 2021-02-05 03618, 2021

      • alastairp
        yeah, looks good. the problem here is that it'll mock .search on all instances of Documents
      • 2021-02-05 03634, 2021

      • alastairp
        rather than just on the instance that you need. but that probably isn't an issue with the specific test
      • 2021-02-05 03647, 2021

      • alastairp
        and @patch undoes it at the end of the function anyway
      • 2021-02-05 03652, 2021
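
For the case alastairp raises, where only one instance should be affected, one possible alternative is to hand the code under test a fake client instead of patching the class. The sketch below uses `MagicMock`, which supports configuring `__getitem__`, so the subscript step in the chain can be mocked too. The helper name is invented for the example, and it assumes the code under test accepts the client as an argument.

```python
from unittest.mock import MagicMock


def make_fake_client(search_result):
    """Build a fake client whose collections[...].documents.search(...)
    returns `search_result`, without touching any real class."""
    client = MagicMock()
    # client.collections[<any key>] returns this same mock collection.
    collection = client.collections.__getitem__.return_value
    collection.documents.search.return_value = search_result
    return client
```

A test would then call something like `make_fake_client({"hits": []})` and pass the result in place of the real client, leaving other `Documents` instances alone.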

      • ruaok
        yeah, fine for this test.
      • 2021-02-05 03643, 2021

      • alastairp
        honestly, I have no idea how to make this work in a better way
      • 2021-02-05 03618, 2021

      • alastairp
        my intuition is that the test should do "less", but I can't really define what that means
      • 2021-02-05 03619, 2021

      • ruaok
        it's a really weird way of making a call to an API, so it's not surprising that it would be hard.
      • 2021-02-05 03644, 2021

      • _lucifer
        alastairp: should i open a PR for datasethoster so that we can retest LB and merge the BU PR?
      • 2021-02-05 03615, 2021

      • zas
        atj: not sure yet...
      • 2021-02-05 03621, 2021

      • zas
        I'll reboot cage
      • 2021-02-05 03631, 2021

      • alastairp
        sure
      • 2021-02-05 03638, 2021

      • _lucifer
        👍
      • 2021-02-05 03600, 2021

      • yvanzo
        stopped sir-prod again
      • 2021-02-05 03601, 2021

      • Gazooo7949440 has quit
      • 2021-02-05 03644, 2021

      • Gazooo7949440 joined the channel
      • 2021-02-05 03638, 2021

      • _lucifer
        yvanzo: could you setup BrainzGit for https://github.com/metabrainz/data-set-hoster/ ?
      • 2021-02-05 03605, 2021

      • yvanzo
        _lucifer: done
      • 2021-02-05 03616, 2021

      • _lucifer
        thanks!
      • 2021-02-05 03643, 2021

      • _lucifer
        alastairp: tested LB again with https://github.com/metabrainz/data-set-hoster/pul…. works now.
      • 2021-02-05 03605, 2021

      • davic joined the channel
      • 2021-02-05 03621, 2021

      • d4rkie joined the channel
      • 2021-02-05 03648, 2021

      • Nyanko-sensei has quit
      • 2021-02-05 03632, 2021

      • Quoth joined the channel