#metabrainz

      • supersandro20009 joined the channel
      • Chinmay3199 has quit
      • supersandro2000 has quit
      • shivam-kapila has quit
      • naiveai has quit
      • prabal joined the channel
      • yvanzo
        mo’’in’
      • d4rkie joined the channel
      • D4RK-PH0ENiX has quit
      • D4RK-PH0ENiX joined the channel
      • Darkloke joined the channel
      • d4rkie has quit
      • v6lur joined the channel
      • alastairp
        hello
      • zas: hi, if you have some time this morning I would like to ask you some deploy questions
      • zas
        alastairp: sure, ok for you in one hour?
      • alastairp
        zas: that's great
      • looks like some time last year Docker-the-company sold their deployment division to another company, which includes docker swarm
      • apparently they're now "focused on developer tools"
      • this gives me bad feelings about how much longer swarm is going to be around for. I give it 2 years
      • iliekcomputers
        Moin
      • zas
        alastairp: I'm there
      • yvanzo
        iirc we talked about kubernetes rather than swarm at the latest summit, which might be more reliable regarding these concerns.
      • zas
        yvanzo: yes, we should rather think about how to migrate to kubernetes. The move to docker was just a first step, and our current setup was mainly targeted at making the move possible. Now most devs are more familiar with docker, and our apps are, at least partially, ready for a step further.
      • "Supporting Kubernetes applications is more challenging than Docker Swarm. Kubernetes provides a more flexible architecture, at the cost of increased complexity."
      • ishaanshah[m]
        iliekcomputers: I have updated the docs
      • alastairp
        zas: hi
      • zas
        https://people.canonical.com/~ubuntu-security/c... <--- if you have haproxy on 18.04 (16.04 not affected)
      • alastairp
        ubuntu specific? I have it on some debians. will check today, thanks
      • I have some questions about volumes. from having a look at the docker-server-configs scripts it looks like almost all external data is stored on named volumes
      • what's the process for backing up this data? e.g. in AcousticBrainz we have data models created by people, which result in data files. Ideally we should back this up
      • from looking at other systems, it seems like you have for example `start_sshd_musicbrainz_json_dumps_incremental`, which starts an sshd with a volume mounted. Is this so that another process can get into it and copy content out?
      • zas
        It depends on your app, but we have https://github.com/metabrainz/borg-backup that can be configured to do regular backups of volume contents
      • basically, you set up client side on the node (if not already), and add your paths to the create script
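The workflow zas describes (client set up on the node, paths appended to the create script) can be sketched roughly like this. The repo URL, compression choice, and archive naming below are placeholders for illustration, not the real metabrainz/borg-backup configuration; borg itself is never invoked here, the command line is only assembled.

```shell
# Sketch of the per-node borg "create" step, with placeholder values --
# the real configuration lives in the metabrainz/borg-backup repo.
# The path added here is the kind of docker volume path discussed above.
BACKUP_PATHS="/var/lib/docker/volumes/jenkins-data/_data"

build_borg_cmd() {
    # Repo target, compression, and archive name template are assumptions.
    echo "borg create --compression lz4" \
         "ssh://backup-host/./backups::{hostname}-{now}" \
         "$BACKUP_PATHS"
}

build_borg_cmd
```

Adding another volume to the backup would then just mean appending its `_data` path to `BACKUP_PATHS`.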
      • alastairp
        don't have permission to that repo
      • zas
        ah, let me check
      • test now
      • alastairp
        I see it
      • and does that copy text straight out of the volume location on disk, or does it start a container with the volume mounted and copy it out of there?
      • `/var/lib/docker/volumes/jenkins-data` looks like straight from disk? I don't know much about the local volume driver. is this safe?
      • zas
        yes, afaik
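Reading `/var/lib/docker/volumes/<name>/_data` directly does work for the local driver; the container-mediated alternative, which works for any volume driver, looks like the sketch below. `docker` is stubbed so the sketch runs anywhere; delete the stub to run it for real. The volume and archive names are examples.

```shell
# Stub so this sketch is runnable without a Docker daemon; remove this
# function to execute the real command.
docker() { echo "docker $*"; }

# Copy the contents of a named volume out through a throwaway container:
# the volume is mounted read-only at /data and tar'd into the current dir.
backup_volume() {
    docker run --rm \
        -v "$1":/data:ro \
        -v "$PWD":/backup \
        alpine tar czf "/backup/$1.tar.gz" -C /data .
}

backup_volume jenkins-data
```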
      • alastairp
        ok, great. I'll add to my todo list that we have to make backups for some files, and will ask you if I have any other questions
      • zas
        check if the node you want to backup from has borg setup already
      • alastairp
        boingo. how do I do that? see if there's a borg container?
      • ah, no node file in borg-backup repo. I guess that's a no
      • zas
        it depends, it can use "default" config
      • but systemctl list-timers mb-backup.timer
      • should show a timer
      • I don't think there's one on boingo yet
      • alastairp
        0 timers. what's the process here. Open a ticket for you to install it?
      • zas
        I'll do it right now
      • alastairp
        thanks!
      • one more question, about creating volumes
      • I had a look - it doesn't seem like there's a function in services.sh or similar for generic "create a volume". is that right?
      • every place that I see just calls `docker volume create` when it's needed
      • for AB, we have a volume for data shared between 3 different services. This means that we need to create it once before bringing up services
      • https://github.com/metabrainz/docker-server-con... here, iliekcomputers just runs it in boingo.sh before bringing up services; however, it seems a bit wrong to me to put a command like this in a node script, as all other commands call generic start_ functions
      • the alternative is to just run this command anyway at the beginning of _all_ `start_` scripts that require it, because if it exists it'll just complete without performing any action. However, this also seems dangerous to me, because there's a risk of adding a new service and forgetting to add this command
      • Gazooo has quit
      • Gazooo joined the channel
      • Chinmay3199 joined the channel
      • zas
        you can just ensure the volume exists, and create it in start_* commands if needed. Those scripts are rather hacky; we don't have any dependency management or even priorities. Another reason to move to kubernetes or the like
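`docker volume create` is idempotent for an existing volume (it just prints the name and exits 0), which is what makes the "ensure it in every start_* function" approach safe. A sketch, with `docker` stubbed so it runs anywhere; the function and volume names are invented, not actual docker-server-configs code:

```shell
docker() { echo "docker $*"; }  # stub; remove to run against a real daemon

# Idempotent: 'docker volume create NAME' is a no-op when NAME already
# exists, so it is safe at the top of every start_* function that needs it.
ensure_volume() {
    docker volume create "$1"
}

# Hypothetical start function for one of the 3 services sharing the volume.
start_acousticbrainz_web() {
    ensure_volume acousticbrainz-data
    # docker run ... -v acousticbrainz-data:/data ...
}

start_acousticbrainz_web
```

The remaining risk, as noted above, is forgetting the `ensure_volume` call when adding a new service; a real dependency system would remove that foot-gun.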
      • alastairp
        yes, I was thinking about this kind of dependency functionality. agreed that the next management tool should do it for us
      • OK, I'll add `volume create` to all start_* functions, and add a comment to remind us to add it to new functions if we make a new one
      • that'll be good enough for now
      • thanks for the confirmation
      • for backups - I should make a node in the borg repo for boingo, pointing to the volumes to back up?
      • zas
        I did already, but this stuff isn't great yet, deployment isn't well documented
      • alastairp
        thanks, will do
      • zas
        then tell me when done, so I can test it runs properly
      • by default backups happen once a day
      • and target is the machine with RAID1 drives at the office
      • everything is encrypted, compressed, and underlying protocol is rsync
      • shivam-kapila joined the channel
      • alastairp
        Today I learned that there are 4 trimesters in a year. the tri- defines the number of months, not the number of divisions, and so there are four 3-month trimesters, instead of three 4-month ones. similarly, in semester, the se- is from the Latin for 6; I always related it to the number 2, because I counted it as 2 divisions of the year
      • yvanzo
        We are so proud of you! Might it be because uni was open only 3 trimesters in a year? ;)
      • alastairp
        yeah, exactly!
      • I guess semestre in French ties a lot more to 6
      • yvanzo
        Not really, it is quite the same, and I've always been confused about trimesters until filling in a tax declaration.
      • ruaok
        moooin!
      • > trimestral tax declarations. I was confused because there are 4 of them in a year!
      • yep, I've done that. :)
      • iliekcomputers: thanks for moving the branch along -- it was really good to go offline for the evening...
      • iliekcomputers
        happy to help!
      • i think shivam-kapila has all the tests fixed, although the travis build is still borked
      • ruaok
        I need to tend to business stuff, then I'll finish surgically removing influx... which should fix the rest of the tests.
      • my brain didn't pick a good stopping point to melt down yesterday.
      • iliekcomputers
        ruaok: we will have to run both influx and timescale simultaneously for some time tho, right?
      • ruaok
        I hope for that to be measured in hours, not days.
      • testing will happen on the timescale instance on gaga.
      • once we're happy with the timescale code, then clean the incoming queue for timescale and stop the timescale_writer and let listens pile up.
      • a few minutes after that, we will trigger an LB full dump. I'll take the full dump and run my import/cleanup scripts.
      • we'll import the data completely and then start the timescale_writer. all duplicate listens will be ignored and the new listens will be inserted.
      • and then we should be consistent between influx and timescale.
      • then we can decide when to cut over to timescale in production.
      • that's the plan I've hashed out.
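The "duplicate listens will be ignored" step in the plan above presumably relies on Postgres-style conflict handling in TimescaleDB. A hypothetical sketch; the table and column names are invented, and `psql` is stubbed so the SQL is only printed, not executed:

```shell
psql() { cat; }   # stub: echo the SQL instead of sending it to a database

sql=$(psql <<'SQL'
-- Hypothetical listen table: a unique constraint covering the listen's
-- identity lets re-imported rows from the full dump be silently skipped,
-- while genuinely new listens are inserted.
INSERT INTO listen (listened_at, user_name, data)
VALUES (1589900000, 'example_user', '{"track_name": "..."}')
ON CONFLICT DO NOTHING;
SQL
)
echo "$sql"
```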
      • iliekcomputers
        that makes sense to me.
      • shivam-kapila
        Hi :)
      • ruaok
        hi shivam-kapila
      • Mr_Monkey
        alastairp: When I was learning Latin at school, I didn't necessarily believe them when they said it would be useful. I've since come to agree with the teachers!
      • ruaok
        iliekcomputers: I'm glad. timescale and its rock solid dups handling makes it easy.
      • iliekcomputers: I'm also going to remove dups and fuzzy last.fm dupes in the re-import process.
      • iliekcomputers
        yeah, that sounds like a good idea
      • ruaok
        e.g. two listens that are identical in a 2 second window will be considered dupes
      • iliekcomputers
        i wonder if there's more things in the data that we should fix while we're at it
      • ruaok
        identical save for the timestamp.
      • iliekcomputers
        i'm pretty sure there are, i'll look over it once
      • ruaok
        please do.
      • I know those two are easy goals....
        remember that my process collects all of the listens into one file (shudder), and then that file is sorted in a massive sort operation.
      • then it is imported in sorted order, so anything that we can run over a narrow window of listens, we can do in the import.
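The "run over a narrow window of sorted listens" idea can be demonstrated with a small awk filter. The field layout (user, timestamp, track) and sample data are invented for illustration; the real logic lives in the proof-of-concept script mentioned below, but the 2-second duplicate window matches what is described above.

```shell
# Input: listens sorted by (user, timestamp). A listen with the same user
# and track within 2 seconds of the last kept listen is treated as a dupe.
dedup_listens() {
    awk '{
        key = $1 "\t" $3                       # user + track
        if (key == prev_key && $2 - prev_ts <= 2)
            next                               # duplicate inside the window
        prev_key = key; prev_ts = $2
        print
    }'
}

printf '%s\n' \
    'alice 1000 songA' \
    'alice 1001 songA' \
    'alice 1005 songA' \
    'bob 1000 songA' | dedup_listens
# prints the 1000, 1005 and bob lines; the 1001 duplicate is dropped
```

Because the file is globally sorted first, this only ever needs one line of look-behind state, which is what makes running extra cleanups during import cheap.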
      • iliekcomputers
        this logic is in the import function in timescale_listenstore?
      • ruaok
        no. hang on.
      • this is all proof-of-concept code, a lot of which has been ported to LB proper. this script will need to be moved to LB proper as well.
      • ah, null character cleanup is done as well.
      • iliekcomputers
        awesome
      • seems like you have it covered, i'll be happy to review the branch when it's ready.
      • ruaok
        the main LB codebase PR will come first. we can deploy that as test.lb.org
      • then I'll start the PR for migration.
      • I may ping you this afternoon if I have questions about the listen stuff from last night.
      • iliekcomputers
        cool. i'd prefer not to merge the branches until we're ready-ish to deploy on prod, considering influx is removed and it would block other releases
      • alastairp
        Mr_Monkey: ! awesome :)
      • iliekcomputers
        i've been releasing small diffs over the week and it's definitely a much better process.
      • ruaok
        iliekcomputers: agreed, I'm keeping that in mind.
      • shivam-kapila
        ruaok: When you get time, please take a look at the change I made in the Spark dumps to make them consistent with Influx. I changed the timestamp check for unwritten listens to be based on listened_at rather than created, because created is NULL in some cases. I haven't made a PR yet and it's on my fork. Please ping me when you want the link.
      • ruaok
        ok, will do. this afternoon.
      • shivam-kapila
        The tests are done as far as I know, and I have moved on to modifying the Timescale writer
      • ruaok
        shivam-kapila: 👍
      • Mr_Monkey
        :D
      • BrainzGit
        [bookbrainz-site] MonkeyDo merged pull request #404 (master…fix-entity-create-route): Fix entity /create route https://github.com/bookbrainz/bookbrainz-site/p...
      • D4RK-PH0ENiX has quit
      • travis-ci joined the channel
      • travis-ci
        Project bookbrainz-site build #2798: passed in 2 min 21 sec: https://travis-ci.org/bookbrainz/bookbrainz-sit...
      • travis-ci has left the channel
      • D4RK-PH0ENiX joined the channel
      • BrainzGit
        [bookbrainz-site] prabalsingh24 closed pull request #386 (master…EditorActivity): BB-50: Add Editor activity graphs https://github.com/bookbrainz/bookbrainz-site/p...
      • BrainzBot
        BB-50: Add editor activity graphs https://tickets.metabrainz.org/browse/BB-50
      • BrainzGit
        [bookbrainz-site] prabalsingh24 reopened pull request #386 (master…EditorActivity): BB-50: Add Editor activity graphs https://github.com/bookbrainz/bookbrainz-site/p...
      • Mr_Monkey
        Woops :D
      • prabal
        I was closing my comment and accidentally clicked the close pull request button. This has happened a couple of times now. smh
      • BrainzGit
        [bookbrainz-site] MonkeyDo merged pull request #386 (master…EditorActivity): BB-50: Add Editor activity graphs https://github.com/bookbrainz/bookbrainz-site/p...
      • BrainzBot
        BB-50: Add editor activity graphs https://tickets.metabrainz.org/browse/BB-50