#metabrainz

      • _lucifer
        ah ok, there's two CMD keys
      • 2021-05-17 13738, 2021

      • _lucifer
        one inside config and another outside it
      • 2021-05-17 13746, 2021

      • alastairp
        ah, interesting. ContainerConfig and Config
      • 2021-05-17 13754, 2021
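
For reference, the two fields mentioned here can be compared side by side. This is a minimal sketch, assuming the JSON printed by `docker inspect <image>` has been captured as a string. The `cmd_fields` helper and the sample data are hypothetical illustrations, not output from the images discussed in the channel.

```python
import json


def cmd_fields(inspect_json: str) -> dict:
    """Return the Cmd recorded under Config and ContainerConfig for the first image."""
    image = json.loads(inspect_json)[0]  # docker inspect prints a JSON array
    return {
        # A missing or null section yields None instead of raising.
        section: (image.get(section) or {}).get("Cmd")
        for section in ("Config", "ContainerConfig")
    }


# Hypothetical, trimmed-down inspect output, for illustration only.
sample = json.dumps([{
    "Config": {"Cmd": ["/sbin/my_init"]},
    "ContainerConfig": {"Cmd": ["/bin/sh", "-c", "#(nop) ", "CMD [\"/sbin/my_init\"]"]},
}])

fields = cmd_fields(sample)
print("Config.Cmd:         ", fields["Config"])
print("ContainerConfig.Cmd:", fields["ContainerConfig"])
```

Running the same helper on two releases would make a difference in `Config.Cmd` easy to spot.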

      • _lucifer
        yeah
      • 2021-05-17 13755, 2021

      • _lucifer
        the CMD inside Config is different for the 13.0 and 17.0 release
      • 2021-05-17 13723, 2021

      • atj
        "|1"? :)
      • 2021-05-17 13724, 2021

      • _lucifer
      • 2021-05-17 13730, 2021

      • _lucifer
        for 13.0 it's ^
      • 2021-05-17 13759, 2021

      • _lucifer
        yeah, no idea where that |1 is coming from. that entire CMD is wrong.
      • 2021-05-17 13759, 2021

      • ruaok
        alastairp: _lucifer lemmy just ran out of diskspace.
      • 2021-05-17 13705, 2021

      • ruaok
        this may be related?
      • 2021-05-17 13717, 2021

      • _lucifer
        shouldn't be.
      • 2021-05-17 13721, 2021

      • ruaok goes to clean up
      • 2021-05-17 13733, 2021

      • ruaok
        dumps probably just went south. :(
      • 2021-05-17 13734, 2021

      • _lucifer
        the CMD seems to be wrong for today's image. but not sure why
      • 2021-05-17 13703, 2021

      • _lucifer
        oh well, everything decided to go south at the same time
      • 2021-05-17 13718, 2021

      • alastairp
        _lucifer: have we done a release since we updated push.sh?
      • 2021-05-17 13736, 2021

      • _lucifer
        alastairp, yes the 13.0 was done using it.
      • 2021-05-17 13704, 2021

      • alastairp
        was that the first release that pushed to :latest, then? (and so this one is the first one that uses :latest as a cache?)
      • 2021-05-17 13730, 2021

      • _lucifer
        i had pushed latest from bono just before releasing so that 13 could use it as cache.
      • 2021-05-17 13753, 2021

      • _lucifer
        but this should be the first time an actions-built latest is being used as cache.
      • 2021-05-17 13717, 2021

      • alastairp
        I'm building from scratch locally just to see what happens
      • 2021-05-17 13725, 2021

      • _lucifer
        👍
      • 2021-05-17 13743, 2021

      • alastairp
        OK, this one has Cmd correctly set to /sbin/my_init. I'm trying again with buildkit
      • 2021-05-17 13715, 2021

      • ruaok
        _lucifer: doing the LB update again?
      • 2021-05-17 13735, 2021

      • _lucifer
        ruaok, logging in again to see what happened just now
      • 2021-05-17 13744, 2021

      • ruaok
        k
      • 2021-05-17 13747, 2021

      • _lucifer
        no, wasn't doing an update
      • 2021-05-17 13703, 2021

      • ruaok
        lemmy rebooted!
      • 2021-05-17 13710, 2021

      • ruaok
        WTF??
      • 2021-05-17 13722, 2021

      • _lucifer
        woops
      • 2021-05-17 13727, 2021

      • ruaok
        must be related to running out of disk. dumps are so fucked now.
      • 2021-05-17 13743, 2021

      • _lucifer
      • 2021-05-17 13753, 2021

      • _lucifer
        before that i got this output
      • 2021-05-17 13759, 2021

      • _lucifer
        probably related to reboot
      • 2021-05-17 13742, 2021

      • _lucifer
        alastairp, i built with buildkit just now, CMD is correct there
      • 2021-05-17 13720, 2021

      • alastairp
        right
      • 2021-05-17 13730, 2021

      • _lucifer
        ruaok, everything back up but prod still having issues with redis
      • 2021-05-17 13738, 2021

      • _lucifer
        we should probably tweet about downtime.
      • 2021-05-17 13741, 2021

      • alastairp
        _lucifer: I just checked listenbrainz:latest and CMD is incorrect
      • 2021-05-17 13705, 2021

      • _lucifer
        alastairp, right saw that. i expect that because actions pushed both today.
      • 2021-05-17 13720, 2021

      • alastairp
        ah, of course
      • 2021-05-17 13721, 2021

      • ruaok
        redis is up now.
      • 2021-05-17 13711, 2021

      • _lucifer
        👍, prod also working now.
      • 2021-05-17 13719, 2021

      • alastairp
        but for example, I pulled :latest, then built with --cache-from :latest, and the cmd is now incorrect. that makes sense. the last CMD run turns into the image's command. however, I still don't know how it got into this state
      • 2021-05-17 13743, 2021

      • _lucifer
        yeah :/
      • 2021-05-17 13732, 2021

      • ruaok
        dump borked. starting over again. sigh. fuck.
      • 2021-05-17 13710, 2021

      • alastairp
        OK, how about we roll back --cache-from for now until we can debug this further? we could also build manually for today
      • 2021-05-17 13723, 2021

      • alastairp
        it's possible that this is a buildkit issue that we don't fully understand
      • 2021-05-17 13703, 2021

      • _lucifer
        yeah sure, let's do that
      • 2021-05-17 13724, 2021

      • ruaok
        oh I found dumps in our volumes. checking consistency now.
      • 2021-05-17 13711, 2021

      • BrainzGit
        [listenbrainz-server] amCap1712 opened pull request #1464 (master…disable-cache): Do not use cache-from for now https://github.com/metabrainz/listenbrainz-server…
      • 2021-05-17 13749, 2021

      • _lucifer
        alastairp, let's also disable the workflow for now. otherwise, it may keep pushing incorrect images that override our locally built ones.
      • 2021-05-17 13732, 2021

      • _lucifer
        Mr_Monkey: there's a failing frontend test in the current master. on friday, actions were down so it probably went unseen.
      • 2021-05-17 13752, 2021

      • Mr_Monkey
        Oh? I missed that too, let me look now
      • 2021-05-17 13707, 2021

      • _lucifer
      • 2021-05-17 13745, 2021

      • alastairp
        _lucifer: I don't mind if we disable the workflow or not - up to you.
      • 2021-05-17 13756, 2021

      • _lucifer
        let's disable
      • 2021-05-17 13705, 2021

      • alastairp
        either we try again with the changes we just made, or we disable it and build manually
      • 2021-05-17 13707, 2021

      • Mr_Monkey
        Ah, failing snapshot, I must have borked that during a merge.
      • 2021-05-17 13721, 2021

      • alastairp
        ok, fine. how do we disable? remove the file or is there a flag/interface to turn it off?
      • 2021-05-17 13732, 2021

      • _lucifer
        there's a UI switch
      • 2021-05-17 13755, 2021

      • _lucifer
      • 2021-05-17 13758, 2021

      • _lucifer
        disabled for now
      • 2021-05-17 13722, 2021

      • BrainzGit
        [listenbrainz-server] amCap1712 merged pull request #1464 (master…disable-cache): Do not use cache-from for now https://github.com/metabrainz/listenbrainz-server…
      • 2021-05-17 13724, 2021

      • alastairp
        ok
      • 2021-05-17 13710, 2021

      • _lucifer
        ruaok, should we proceed with updating lb containers or wait?
      • 2021-05-17 13734, 2021

      • ruaok
        proceed with all, but cron.
      • 2021-05-17 13729, 2021

      • _lucifer
        👍
      • 2021-05-17 13708, 2021

      • _lucifer
        prod updated. navigation working well.
      • 2021-05-17 13711, 2021

      • _lucifer
        !m Mr_Monkey
      • 2021-05-17 13711, 2021

      • BrainzBot
        You're doing good work, Mr_Monkey!
      • 2021-05-17 13736, 2021

      • alastairp
        this |1 syntax is interesting. when I inspect the full image with `dive`, it shows many layers with a command that starts with this. I've never encountered it before, and I don't know what it means
      • 2021-05-17 13700, 2021

      • Mr_Monkey
        ! you, _lucifer and thanks for deploying
      • 2021-05-17 13714, 2021

      • Mr_Monkey
        even !m
      • 2021-05-17 13744, 2021

      • alastairp
        Mr_Monkey: can we make webpack quieter when building prod js in the image build?
      • 2021-05-17 13753, 2021

      • ruaok
        +1
      • 2021-05-17 13721, 2021

      • Mr_Monkey
        Sure, I think we're currently passing a `--progress` option, you can just remove it
      • 2021-05-17 13732, 2021

      • alastairp
        cool, let me try that
      • 2021-05-17 13737, 2021

      • Mr_Monkey
        (in package.json when we run build:prod)
      • 2021-05-17 13740, 2021

      • alastairp
        yep
      • 2021-05-17 13751, 2021

      • _lucifer
        all prod containers except cron updated
      • 2021-05-17 13731, 2021

      • alastairp
        thanks _lucifer
      • 2021-05-17 13716, 2021

      • Mr_Monkey
        I pushe an updated snapshot to master to fix that failing test
      • 2021-05-17 13733, 2021

      • Mr_Monkey
        pushed*
      • 2021-05-17 13719, 2021

      • alastairp
        mmm
      • 2021-05-17 13722, 2021

      • alastairp
        _lucifer: this is really interesting
      • 2021-05-17 13738, 2021

      • _lucifer
        alastairp: there's also |2 and |4 at a few places
      • 2021-05-17 13741, 2021

      • alastairp
      • 2021-05-17 13749, 2021

      • alastairp
        this is causing the |n output
      • 2021-05-17 13717, 2021

      • _lucifer
        oh build args
      • 2021-05-17 13736, 2021

      • alastairp
        from what I can tell, if there's an active build arg, the command is `|n ENV_VAR=value bash -c "command"`
      • 2021-05-17 13743, 2021

      • alastairp
        but if there's no build arg, it's just "command"
      • 2021-05-17 13710, 2021
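
The pattern described above can be sketched in a few lines. This is a rough illustration only, assuming the recorded step is available as a list of strings (as in an image's `Cmd` field) and that `n` is the number of `KEY=value` build-arg entries that follow the `|n` marker. The `split_build_args` helper and the `GIT_COMMIT_SHA` name are hypothetical.

```python
import re


def split_build_args(cmd):
    """Split a recorded step such as ['|1', 'KEY=value', '/bin/sh', '-c', '...']
    into (build_args, command). Steps without the '|n' prefix come back unchanged."""
    if not cmd:
        return {}, []
    match = re.fullmatch(r"\|(\d+)", cmd[0])
    if match is None:
        return {}, list(cmd)
    count = int(match.group(1))
    pairs = cmd[1:1 + count]
    # partition() tolerates entries without '=', giving them an empty value.
    build_args = {key: value for key, _, value in (p.partition("=") for p in pairs)}
    return build_args, list(cmd[1 + count:])


# Hypothetical step recorded while one build arg was active.
args, command = split_build_args(
    ["|1", "GIT_COMMIT_SHA=abc123", "/bin/sh", "-c", "echo hi"]
)
print(args)
print(command)
```

A step recorded without build args, such as `["/sbin/my_init"]`, passes through with an empty dict, which matches the "it's just command" case.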

      • alastairp
        however - that doesn't answer the question of why docker changed the "CMD" of the container
      • 2021-05-17 13711, 2021

      • _lucifer
        what does the `|` mean here?
      • 2021-05-17 13703, 2021

      • ruaok
        SHIT SHIT SHIT SHIT.
      • 2021-05-17 13707, 2021

      • alastairp
        it might just be an internal thing, I don't know
      • 2021-05-17 13707, 2021

      • ruaok
        lemmy has gone again.
      • 2021-05-17 13722, 2021

      • ruaok
        zas: ping URGENT.
      • 2021-05-17 13702, 2021

      • ruaok
        this is not how today was supposed to go.
      • 2021-05-17 13734, 2021

      • alastairp
        should we start migrating some services to other hosts? perhaps we can start the prep
      • 2021-05-17 13739, 2021

      • ruaok
        yes, please.
      • 2021-05-17 13750, 2021

      • ruaok
        take beta and test offline.
      • 2021-05-17 13753, 2021

      • ruaok
        cron as well.
      • 2021-05-17 13754, 2021

      • _lucifer
        on it
      • 2021-05-17 13705, 2021

      • ruaok
        just bring everything down to minimum.
      • 2021-05-17 13713, 2021

      • ruaok
        I'll start coordinating with hetzner.
      • 2021-05-17 13702, 2021

      • _lucifer
        lemmy down again?
      • 2021-05-17 13705, 2021

      • alastairp
      • 2021-05-17 13714, 2021

      • _lucifer
        unable to login
      • 2021-05-17 13726, 2021

      • ruaok
        yep.
      • 2021-05-17 13744, 2021

      • ruaok
        start working on getting services up elsewhere and forget lemmy right now.
      • 2021-05-17 13740, 2021

      • alastairp
        boingo is only running AB web and hl extractor. 2% CPU usage. recommend we put web there
      • 2021-05-17 13753, 2021

      • ruaok
        perfect. go.
      • 2021-05-17 13704, 2021

      • _lucifer
        👍
      • 2021-05-17 13718, 2021

      • _lucifer
        alastairp, all on boingo/
      • 2021-05-17 13719, 2021

      • _lucifer
        ?
      • 2021-05-17 13734, 2021

      • alastairp
        _lucifer: why not, let's go with the top 5 in the document
      • 2021-05-17 13740, 2021

      • _lucifer
        👍
      • 2021-05-17 13741, 2021

      • alastairp
        updating config now
      • 2021-05-17 13754, 2021

      • alastairp
        _lucifer: please tell me what image version to run or update docker-server-configs for lemmy
      • 2021-05-17 13707, 2021

      • _lucifer
        v-2021-05-17.1
      • 2021-05-17 13714, 2021

      • _lucifer
        all images
      • 2021-05-17 13738, 2021

      • _lucifer
        i am updating docker server configs right now.
      • 2021-05-17 13710, 2021

      • zas
        ruaok: we just replace it? (I mean in the same rack space)?
      • 2021-05-17 13718, 2021

      • ruaok
        yes.
      • 2021-05-17 13737, 2021

      • ruaok
        we'll need to copy stuff to another server (FTP).
      • 2021-05-17 13750, 2021

      • ruaok
        but services are being moved right now.
      • 2021-05-17 13705, 2021

      • ruaok
      • 2021-05-17 13706, 2021

      • alastairp
        deploying now
      • 2021-05-17 13708, 2021

      • ruaok
        (you and I)
      • 2021-05-17 13700, 2021

      • alastairp
        redis, web, spotify reader, timescale writer, websockets running on boingo.
      • 2021-05-17 13705, 2021

      • _lucifer
        alastairp: do we need the exim relay also?
      • 2021-05-17 13712, 2021

      • alastairp
        _lucifer: good catch, yes
      • 2021-05-17 13722, 2021

      • ruaok
        !m alastairp _lucifer
      • 2021-05-17 13722, 2021

      • BrainzBot
        You're doing good work, alastairp _lucifer!
      • 2021-05-17 13725, 2021

      • alastairp
        spark reader, cron, lastfm api, messybrainz website, labs api not running
      • 2021-05-17 13747, 2021

      • alastairp
        will add exim and labs api to boingo. I think we can get away without the others for now
      • 2021-05-17 13700, 2021

      • ruaok
        yep, good!
      • 2021-05-17 13712, 2021

      • alastairp
        tbh
      • 2021-05-17 13721, 2021

      • alastairp
        !m zas for such a stable service mesh
      • 2021-05-17 13721, 2021

      • BrainzBot
        You're doing good work, zas for such a stable service mesh!
      • 2021-05-17 13737, 2021

      • zas
        alastairp: thanks
      • 2021-05-17 13749, 2021

      • zas
        ruaok: which upgrade option did you ask for?
      • 2021-05-17 13718, 2021

      • ruaok
        I'm going to ask for #4 for AX61-NVMe.
      • 2021-05-17 13726, 2021

      • zas
        ok
      • 2021-05-17 13732, 2021

      • ruaok
        we have no other option but 4.
      • 2021-05-17 13720, 2021

      • ruaok
        I power button'ed lemmy, but still not back up.
      • 2021-05-17 13721, 2021

      • sumedh joined the channel
      • 2021-05-17 13727, 2021

      • ruaok
        tech support request added.