looks like some time last year Docker-the-company sold their deployment division to another company, which includes docker swarm
apparently they're now "focused on developer tools"
this gives me bad feelings about how much longer swarm is going to be around for. I give it 2 years
iliekcomputers
Moin
zas
alastairp: I'm there
yvanzo
iirc we talked about kubernetes at the latest summit instead, which might be more reliable regarding these concerns.
zas
yvanzo: yes, we should rather think about how to migrate to kubernetes. The move to docker was just a first step, and our current setup was mainly targeted at making the move possible. Now most devs are more familiar with docker, and our apps are, at least partially, ready for a step further.
"Supporting Kubernetes applications is more challenging than Docker Swarm. Kubernetes provides a more flexible architecture, at the cost of increased complexity."
ubuntu specific? I have it on some debians. will check today, thanks
I have some questions about volumes. from having a look at the docker-server-configs scripts it looks like almost all external data is stored on named volumes
what's the process for backing up this data? e.g. in AcousticBrainz we have data models created by people, which result in data files. Ideally we should back this up
from looking at other systems, it seems like you have for example `start_sshd_musicbrainz_json_dumps_incremental`, which starts an sshd with a volume mounted. Is this so that another process can get into it and copy content out?
basically, you set up client side on the node (if not already), and add your paths to the create script
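[A rough sketch of the pattern zas describes, with made-up names — the image, port, and volume here are illustrative, not the actual docker-server-configs code: a `start_sshd_*` function runs an sshd container with the volume mounted read-only, and the backup client then copies the data out over ssh.]

```sh
# Hypothetical sketch: expose a named volume over sshd so a backup client
# can pull its contents out. All names here are illustrative.
docker run -d \
  --name sshd-json-dumps \
  -p 2222:22 \
  -v musicbrainz-json-dumps:/data:ro \
  some-sshd-image

# then, from the backup host:
#   rsync -az -e 'ssh -p 2222' backup@node:/data/ /backups/json-dumps/
```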
alastairp
don't have permission to that repo
zas
ah, let me check
test now
alastairp
I see it
and does that copy text straight out of the volume location on disk, or does it start a container with the volume mounted and copy it out of there?
`/var/lib/docker/volumes/jenkins-data` looks like straight from disk? I don't know much about the local volume driver. is this safe?
zas
yes, afaik
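[If reading `/var/lib/docker/volumes/.../_data` off the host ever feels too fragile, an alternative sketch — the volume name is from the discussion, the target path is made up — is to copy through a throwaway container:]

```sh
# Archive a named volume's contents via a temporary container, instead of
# reaching into /var/lib/docker/volumes/<name>/_data on the host.
docker run --rm \
  -v jenkins-data:/volume:ro \
  -v "$(pwd)/backup:/backup" \
  alpine tar czf /backup/jenkins-data.tar.gz -C /volume .
```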
alastairp
ok, great. I'll add to my todo list that we have to make backups for some files, and will ask you if I have any other questions
zas
check if the node you want to backup from has borg setup already
alastairp
boingo. how do I do that? see if there's a borg container?
ah, no node file in borg-backup repo. I guess that's a no
zas
it depends, it can use "default" config
but `systemctl list-timers mb-backup.timer`
should show a timer
I don't think there's one on boingo yet
alastairp
0 timers. what's the process here? Open a ticket for you to install it?
zas
I'll do it right now
alastairp
thanks!
one more question, about creating volumes
I had a look - it doesn't seem like there's a function in services.sh or similar for generic "create a volume". is that right?
every place that I see just calls `docker volume create` when it's needed
for AB, we have a volume for data shared between 3 different services. This means that we need to create it once before bringing up services
https://github.com/metabrainz/docker-server-con... here, iliekcomputers just runs it in boingo.sh before bringing up services, however it seems a bit wrong to me to put a command like this in a node script, as all other commands call generic start_ functions
the alternative is to just run this command anyway at the beginning of _all_ `start_` scripts that require it, because if it exists it'll just complete without performing any action. However, this also seems dangerous to me, because there's a risk of adding a new service and forgetting to add this command
Gazooo has quit
Gazooo joined the channel
Chinmay3199 joined the channel
zas
you can just ensure the volume exists, and create it in start_* commands if needed. Those scripts are rather hacky; we don't have any dependency management or even priorities. Another reason to move to kubernetes or the like
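[A minimal sketch of what such an idempotent check could look like — the helper name is hypothetical, nothing like it exists in services.sh as far as this discussion goes:]

```shell
# Hypothetical helper: create a named volume only if it is missing, so every
# start_* script that mounts it can call this safely before `docker run`.
ensure_volume() {
  docker volume inspect "$1" >/dev/null 2>&1 \
    || docker volume create "$1" >/dev/null
}

# e.g. at the top of each AB start_ function that needs the shared volume:
#   ensure_volume acousticbrainz-shared-data
```

`docker volume create` itself is a no-op when the volume already exists, so the inspect guard mainly documents intent and avoids relying on that behaviour.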
then tell me when done, so I can test it runs properly
by default backups happen once a day
and target is the machine with RAID1 drives at the office
everything is encrypted, compressed, and underlying protocol is rsync
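[For reference, the borg side of such a setup might look roughly like this — repo path and source path are made up; the real configuration lives in the borg-backup repo and is driven by `mb-backup.timer`:]

```sh
# Sketch: an encrypted, compressed borg backup run, as a daily timer
# might invoke it. Paths are illustrative.
BORG_REPO=/mnt/backup/repos/boingo        # e.g. the RAID1 target

# one-time repo setup chooses the encryption:
#   borg init --encryption repokey "$BORG_REPO"

borg create --compression lz4 --stats \
  "$BORG_REPO"::'{hostname}-{now:%Y-%m-%d}' \
  /srv/paths-to-back-up
```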
shivam-kapila joined the channel
alastairp
Today I learned that there are 4 trimesters in a year. The tri- defines the number of months, not the number of divisions, so there are 4 3-month trimesters instead of 3 4-month ones. Similarly, in semester, the se- is from the Latin for 6; I always related it to the number 2, because I counted it as 2 divisions of the year
yvanzo
We are so proud of you! Might it be because uni was open only 3 trimesters in a year? ;)
alastairp
yeah, exactly!
I guess semestre in French ties a lot more to 6
yvanzo
Not really, it is quite the same, and I had always been confused about trimesters until filling in the tax declaration.
ruaok
moooin!
> trimestral tax declarations. I was confused because there are 4 of them in a year!
yep, I've done that. :)
iliekcomputers: thanks for moving the branch along -- it was really good to go offline for the evening...
iliekcomputers
happy to help!
i think shivam-kapila has all the tests fixed, although the travis build is still borked
ruaok
I need to tend to business stuff, then I'll finish the rest of the surgically removing influx... which should fix the rest of the tests.
my brain didn't pick a good stopping point to melt down yesterday.
iliekcomputers
ruaok: we will have to run both influx and timescale simultaneously for some time tho, right?
ruaok
I hope for that to be measured in hours, not days.
testing will happen on the timescale instance on gaga.
once we're happy with the timescale code, then clean the incoming queue for timescale and stop the timescale_writer and let listens pile up.
a few minutes after that, we will trigger an LB full dump. I'll take the full dump and run my import/cleanup scripts.
we'll import the data completely and then start the timescale_writer. all duplicate listens will be ignored and the new listens will be inserted.
and then we should be consistent between influx and timescale.
then we can decide when to cut over to timescale in production.
that's the plan I've hashed out.
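[The "duplicate listens will be ignored" step presumably relies on a uniqueness constraint in the Timescale schema. As a hedged illustration only — the table and column names below are assumptions, not the actual ListenBrainz schema — the Postgres idiom for that is:]

```sh
psql listenbrainz_ts <<'SQL'
-- Assumes a unique index covering (user_name, track_name, listened_at).
-- Re-inserting an already-imported listen is then silently a no-op:
INSERT INTO listen (listened_at, user_name, track_name, data)
VALUES (1590000000, 'rob', 'Song A', '{}')
ON CONFLICT DO NOTHING;
SQL
```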
iliekcomputers
that makes sense to me.
shivam-kapila
Hi :)
ruaok
hi shivam-kapila
Mr_Monkey
alastairp: When I was learning Latin at school, I didn't necessarily believe them when they said it would be useful. I've since come to agree with the teachers!
ruaok
iliekcomputers: I'm glad. timescale and its rock solid dups handling makes it easy.
iliekcomputers: I'm also going to remove dups and fuzzy last.fm dupes in the re-import process.
iliekcomputers
yeah, that sounds like a good idea
ruaok
e.g. two listens that are identical in a 2 second window will be considered dupes
iliekcomputers
i wonder if there's more things in the data that we should fix while we're at it
ruaok
identical save for the timestamp.
iliekcomputers
i'm pretty sure there are, i'll look over it once
ruaok
please do.
I know those two are easy goals....
remember that my process sorts all of the listens into one file. (shudder). and then that file is sorted in a massive sort operation.
then it is imported in sorted order, so anything that we can run over a narrow window of listens, we can do in the import.
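[As an illustration of that kind of windowed pass — the field layout and helper name are assumptions for illustration, not the real listen format: because the file is sorted, duplicates are always adjacent, so a single streaming pass can drop a listen that matches the previous one in everything but a timestamp within 2 seconds.]

```shell
# Sketch: drop near-duplicate listens from a sorted stream.
# Assumed layout: user <TAB> track <TAB> listened_at, sorted by
# (user, track, listened_at). Two rows identical except for timestamps
# within 2 seconds count as duplicates; the first one is kept.
dedup_listens() {
  awk -F'\t' '{
    key = $1 FS $2                                  # everything but the timestamp
    if (key == prev_key && $3 - prev_ts <= 2) next  # inside the window: drop
    prev_key = key; prev_ts = $3
    print
  }'
}
```

Usage would be something like `dedup_listens < sorted-listens > deduped-listens` (hypothetical filenames).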
iliekcomputers
this logic is in the import function in timescale_listenstore?
i've been releasing small diffs over the week and it's definitely a much better process.
ruaok
iliekcomputers: agreed, I'm keeping that in mind.
shivam-kapila
ruaok: When you get time, please take a look at the change I made in the Spark dumps to make them consistent with Influx. I changed the timestamp used to check for unwritten listens to be based on listened_at rather than created, because created is NULL in some cases. I haven't made a PR and it's on my fork. Please ping me when you want the link.
ruaok
ok, will do. this afternoon.
shivam-kapila
The tests are done as far as I know, and I have moved on to modifying the Timescale writer