looks like some time last year Docker-the-company sold their deployment division to another company, which includes docker swarm
2020-04-08 09949, 2020
alastairp
apparently they're now "focused on developer tools"
2020-04-08 09926, 2020
alastairp
this gives me bad feelings about how much longer swarm is going to be around for. I give it 2 years
2020-04-08 09952, 2020
iliekcomputers
Moin
2020-04-08 09940, 2020
zas
alastairp: I'm there
2020-04-08 09942, 2020
yvanzo
iirc we rather talked about kubernetes at the latest summit, which might be more reliable regarding these concerns.
2020-04-08 09932, 2020
zas
yvanzo: yes, we should rather think about how to migrate to kubernetes. The move to docker was just a first step, and our current setup was mainly targeted at making the move possible. Now most devs are more familiar with docker, and our apps are, at least partially, ready for a step further.
"Supporting Kubernetes applications is more challenging than Docker Swarm. Kubernetes provides a more flexible architecture, at the cost of increased complexity."
ubuntu specific? I have it on some debians. will check today, thanks
2020-04-08 09922, 2020
alastairp
I have some questions about volumes. from having a look at the docker-server-configs scripts it looks like almost all external data is stored on named volumes
2020-04-08 09953, 2020
alastairp
what's the process for backing up this data? e.g. in AcousticBrainz we have data models created by people, which result in data files. Ideally we should back this up
2020-04-08 09959, 2020
alastairp
from looking at other systems, it seems like you have for example `start_sshd_musicbrainz_json_dumps_incremental`, which starts an sshd with a volume mounted. Is this so that another process can get into it and copy content out?
basically, you set up client side on the node (if not already), and add your paths to the create script
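The pattern being described can be sketched roughly like this; the function name, image, port, and volume name here are assumptions for illustration, not the actual docker-server-configs values:

```shell
# Hypothetical sketch of the sshd-over-a-volume pattern: start a
# throwaway sshd container with the named volume mounted read-only,
# so an external backup process can ssh in and copy the data out.
start_sshd_for_volume() {
    local volume="$1"
    docker run -d \
        --name "sshd_${volume}" \
        -v "${volume}:/data:ro" \
        -p 2222:22 \
        linuxserver/openssh-server
}

# e.g. start_sshd_for_volume musicbrainz-json-dumps
```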
2020-04-08 09941, 2020
alastairp
don't have permission to that repo
2020-04-08 09947, 2020
zas
ah, let me check
2020-04-08 09957, 2020
zas
test now
2020-04-08 09952, 2020
alastairp
I see it
2020-04-08 09912, 2020
alastairp
and does that copy text straight out of the volume location on disk, or does it start a container with the volume mounted and copy it out of there?
2020-04-08 09935, 2020
alastairp
`/var/lib/docker/volumes/jenkins-data` looks like straight from disk? I don't know much about the local volume driver. is this safe?
2020-04-08 09950, 2020
zas
yes, afaik
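For reference, data for the `local` volume driver really does live under `/var/lib/docker/volumes/<name>/_data` on the host, so reading it from disk works. A common alternative that avoids depending on Docker's on-disk layout is to copy through a short-lived container; a sketch, with the helper name and output path assumed:

```shell
# Sketch: archive a named volume without touching
# /var/lib/docker/volumes directly, by mounting it read-only into a
# throwaway container and tarring it out to the current directory.
backup_volume() {
    local volume="$1"
    docker run --rm \
        -v "${volume}:/volume:ro" \
        -v "${PWD}:/backup" \
        alpine tar czf "/backup/${volume}.tar.gz" -C /volume .
}

# e.g. backup_volume jenkins-data   # would write ./jenkins-data.tar.gz
```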
2020-04-08 09919, 2020
alastairp
ok, great. I'll add to my todo list that we have to make backups for some files, and will ask you if I have any other questions
2020-04-08 09952, 2020
zas
check if the node you want to backup from has borg setup already
2020-04-08 09948, 2020
alastairp
boingo. how do I do that? see if there's a borg container?
2020-04-08 09934, 2020
alastairp
ah, no node file in borg-backup repo. I guess that's a no
2020-04-08 09948, 2020
zas
it depends, it can use "default" config
2020-04-08 09955, 2020
zas
but systemctl list-timers mb-backup.timer
2020-04-08 09902, 2020
zas
should show a timer
2020-04-08 09914, 2020
zas
I don't think there's one on boingo yet
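The check zas describes boils down to something like this; the unit name is from the conversation, but the output parsing is an assumption about `systemctl list-timers`, which prints "0 timers listed." (and no unit name) when the timer is absent:

```shell
# Returns success if the mb-backup borg timer is installed on this
# node, by looking for the unit name in the list-timers output.
has_backup_timer() {
    systemctl list-timers mb-backup.timer | grep -q "mb-backup.timer"
}
```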
2020-04-08 09942, 2020
alastairp
0 timers. what's the process here? Open a ticket for you to install it?
2020-04-08 09911, 2020
zas
I'll do it right now
2020-04-08 09919, 2020
alastairp
thanks!
2020-04-08 09925, 2020
alastairp
one more question, about creating volumes
2020-04-08 09950, 2020
alastairp
I had a look - it doesn't seem like there's a function in services.sh or similar for generic "create a volume". is that right?
2020-04-08 09900, 2020
alastairp
every place that I see just calls `docker volume create` when it's needed
2020-04-08 09954, 2020
alastairp
for AB, we have a volume for data shared between 3 different services. This means that we need to create it once before bringing up the services
2020-04-08 09936, 2020
alastairp
https://github.com/metabrainz/docker-server-confi… here, iliekcomputers just runs it in boingo.sh before bringing up services, however it seems a bit wrong to me to put a command like this in a node script, as all other commands call generic start_ functions
2020-04-08 09947, 2020
alastairp
the alternative is to just run this command anyway at the beginning of _all_ `start_` scripts that require it, because if the volume already exists the command just completes without performing any action. However, this also seems dangerous to me, because there's a risk of adding a new service and forgetting to add this command
2020-04-08 09901, 2020
Gazooo has quit
2020-04-08 09949, 2020
Gazooo joined the channel
2020-04-08 09957, 2020
Chinmay3199 joined the channel
2020-04-08 09958, 2020
zas
you can just ensure the volume exists, and create it in start_* commands if needed. Those scripts are rather hacky; we don't have any dependency management or even priorities. Another reason to move to kubernetes or the like
then tell me when done, so I can test it runs properly
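Since `docker volume create` is idempotent (creating a volume that already exists succeeds without changing anything), the ensure-it-exists approach can live in a tiny shared helper rather than in each node script; the function and volume names here are made up for illustration:

```shell
# Hypothetical helper for services.sh: create the volume if it does
# not exist yet. `docker volume create` is a no-op for an existing
# volume, so every start_* function that mounts a shared volume can
# call this safely before starting its container.
ensure_volume() {
    docker volume create "$1" > /dev/null
}

# e.g. at the top of each AB start_* function:
#   ensure_volume acousticbrainz-data
```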
2020-04-08 09907, 2020
zas
by default backups happen once a day
2020-04-08 09922, 2020
zas
and target is the machine with RAID1 drives at the office
2020-04-08 09945, 2020
zas
everything is encrypted, compressed, and underlying protocol is rsync
2020-04-08 09941, 2020
shivam-kapila joined the channel
2020-04-08 09924, 2020
alastairp
Today I learned that there are 4 trimesters in a year. the tri- defines the number of months, not the number of divisions, so there are 4 3-month trimesters, instead of 3 of 4 months. similarly, in semester, the se- is from Latin for 6; I always related it to the number 2, because I counted it as 2 divisions of the year
2020-04-08 09945, 2020
yvanzo
We are so proud of you! Might it be because uni was open only 3 trimesters in a year? ;)
2020-04-08 09955, 2020
alastairp
yeah, exactly!
2020-04-08 09947, 2020
alastairp
I guess semestre in French ties a lot more to 6
2020-04-08 09934, 2020
yvanzo
Not really, it is quite the same, and I'd always been confused about trimesters until filling out my tax declaration.
2020-04-08 09908, 2020
ruaok
moooin!
2020-04-08 09919, 2020
ruaok
> trimestral tax declarations. I was confused because there are 4 of them in a year!
2020-04-08 09923, 2020
ruaok
yep, I've done that. :)
2020-04-08 09942, 2020
ruaok
iliekcomputers: thanks for moving the branch along -- it was really good to go offline for the evening...
2020-04-08 09902, 2020
iliekcomputers
happy to help!
2020-04-08 09956, 2020
iliekcomputers
i think shivam-kapila has all the tests fixed, although the travis build is still borked
2020-04-08 09929, 2020
ruaok
I need to tend to business stuff, then I'll finish surgically removing influx... which should fix the rest of the tests.
2020-04-08 09947, 2020
ruaok
my brain didn't pick a good stopping point to melt down yesterday.
2020-04-08 09956, 2020
iliekcomputers
ruaok: we will have to run both influx and timescale simultaneously for some time tho, right?
2020-04-08 09934, 2020
ruaok
I hope for that to be measured in hours, not days.
2020-04-08 09958, 2020
ruaok
testing will happen on the timescale instance on gaga.
2020-04-08 09949, 2020
ruaok
once we're happy with the timescale code, then clean the incoming queue for timescale and stop the timescale_writer and let listens pile up.
2020-04-08 09913, 2020
ruaok
a few minutes after that, we will trigger an LB full dump. I'll take the full dump and run my import/cleanup scripts.
2020-04-08 09945, 2020
ruaok
we'll import the data completely and then start the timescale_writer. all duplicate listens will be ignored and the new listens will be inserted.
2020-04-08 09903, 2020
ruaok
and then we should be consistent between influx and timescale.
2020-04-08 09913, 2020
ruaok
then we can decide when to cut over to timescale in production.
2020-04-08 09921, 2020
ruaok
that's the plan I've hashed out.
2020-04-08 09911, 2020
iliekcomputers
that makes sense to me.
2020-04-08 09903, 2020
shivam-kapila
Hi :)
2020-04-08 09935, 2020
ruaok
hi shivam-kapila
2020-04-08 09946, 2020
Mr_Monkey
alastairp: When I was learning Latin at school, I didn't necessarily believe them when they said it would be useful. I've since come to agree with the teachers !
2020-04-08 09949, 2020
ruaok
iliekcomputers: I'm glad. timescale and its rock solid dups handling makes it easy.
2020-04-08 09912, 2020
ruaok
iliekcomputers: I'm also going to remove dups and fuzzy last.fm dupes in the re-import process.
2020-04-08 09939, 2020
iliekcomputers
yeah, that sounds like a good idea
2020-04-08 09941, 2020
ruaok
e.g. two listens that are identical in a 2 second window will be considered dupes
2020-04-08 09952, 2020
iliekcomputers
i wonder if there are more things in the data that we should fix while we're at it
2020-04-08 09956, 2020
ruaok
identical save for the timestamp.
2020-04-08 09904, 2020
iliekcomputers
i'm pretty sure there are, i'll look over it once
2020-04-08 09910, 2020
ruaok
please do.
2020-04-08 09922, 2020
ruaok
I know those two are easy goals....
2020-04-08 09945, 2020
ruaok
remember that my process sorts all of the listens into one file. (shudder). and then that file is sorted in a massive sort operation.
2020-04-08 09915, 2020
ruaok
then it is imported in sorted order, so anything that we can run over a narrow window of listens, we can do in the import.
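Because the file is already sorted, a fuzzy-dedup pass like the 2-second rule only ever needs to look at the previous line. A toy sketch of that idea, with an assumed field layout (tab-separated `listened_at`, `user`, `track` — not the real dump format):

```shell
# Toy sketch: drop a listen if it matches the previous line except
# for a listened_at within 2 seconds of it. Assumes input already
# sorted by user then timestamp, one tab-separated
# "listened_at user track" record per line.
dedup_listens() {
    awk -F'\t' '
        # keep the listen unless user+track match the previous line
        # and the timestamps differ by 2 seconds or less
        !(user == $2 && track == $3 && $1 - ts <= 2) { print }
        { ts = $1; user = $2; track = $3 }
    '
}
```

For example, feeding it two `alice`/`song1` listens at timestamps 100 and 101 keeps only the first; a later `song2` listen passes through untouched.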
2020-04-08 09933, 2020
iliekcomputers
this logic is in the import function in timescale_listenstore?
i've been releasing small diffs over the week and it's definitely a much better process.
2020-04-08 09931, 2020
ruaok
iliekcomputers: agreed, I'm keeping that in mind.
2020-04-08 09910, 2020
shivam-kapila
ruaok: When you get time, please take a look at the change I made in the Spark dumps to make them consistent with Influx. I changed the timestamp used to check for unwritten listens to be based on listened_at rather than created, because created is NULL in some cases. I haven't made a PR yet; it's on my fork. Please ping me when you want the link.
2020-04-08 09946, 2020
ruaok
ok, will do. this afternoon.
2020-04-08 09925, 2020
shivam-kapila
The tests are done as far as I know, and I have moved on to modifying the Timescale writer