#metabrainz

      • Freso
        zas: Whenever you have a moment: https://community.metabrainz.org/t/discourse-do... :)
      • lucifer
      • but otherwise also a lot of differences in the conf.
      • some of those are commented out so they don't matter, but I guess it's best to copy over the .conf file from the old to the new cluster?
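A minimal sketch of the config carry-over being discussed here, assuming the stock file layout inside the containers (both paths are placeholders):

```sh
# compare the old and new configs to spot options that were added or renamed in 11/12/13
diff /var/lib/postgresql/11/data/postgresql.conf /var/lib/postgresql/13/data/postgresql.conf

# if the diff looks sane, carry the old config over to the new cluster
cp /var/lib/postgresql/11/data/postgresql.conf /var/lib/postgresql/13/data/postgresql.conf
```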
      • alastairp
        lucifer: yeah, I'm doing this process as well, and it looks like I'm at about the same stage as you. let me look at this diff. I almost agree that we should just copy it, though we should also verify that there are no new options that arrived in 11 or 12
      • cool, I also see the update_extensions.sql message. this just has `ALTER EXTENSION "timescaledb" UPDATE;` in it, which is great - because that's what we need to do anyway, right?
      • lucifer
        yes
      • alastairp
        and I see the message about re-running vacuum/analyze, so that seems good
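For reference, a rough sketch of the two post-upgrade steps mentioned above, assuming the new cluster is up and pg_upgrade's generated update_extensions.sql is in the working directory (user is a placeholder):

```sh
# apply the extension upgrades pg_upgrade suggested (this runs ALTER EXTENSION "timescaledb" UPDATE;)
psql -U postgres -f update_extensions.sql

# rebuild planner statistics, since pg_upgrade does not carry them over to the new cluster
vacuumdb --all --analyze-in-stages -U postgres
```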
      • were you able to start up the new cluster?
      • lucifer
        not yet. trying to do that currently.
      • alastairp
      • skelly37 joined the channel
      • skelly37 has quit
      • skelly37 joined the channel
      • lucifer
        alastairp, my container closed its connection so I had to start another timescale one. yup, works for me as well.
      • I ran the alter extension command but got an error that it already exists, which probably makes sense.
      • running vacuum, errors, huh.
      • ERROR: could not resize shared memory segment "/PostgreSQL.175590062" to 67128672 bytes: No space left on device
      • alastairp
        well, that error is understandable :)
      • ah, but I see that bono has plenty of space
      • are you using the pre-generated vacuum script?
      • lucifer
        /dev/md2 4.0T 1.8T 2.0T 49% /
      • yeah. no, not the pre-generated script.
      • just vacuum in a session while connected to the db
      • it worked when I ran it as the postgres user.
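The "could not resize shared memory segment ... No space left on device" error above is usually the container's /dev/shm (64MB by default in Docker) running out when parallel workers allocate dynamic shared memory. Two common workarounds, sketched with placeholder image and database names:

```sh
# option 1: start the container with a larger /dev/shm
docker run --shm-size=1g --name timescale-migrate -v pgdata:/var/lib/postgresql/data timescale/timescaledb:latest-pg13

# option 2: turn off parallel maintenance workers for the session running the vacuum
psql -U postgres -d listenbrainz <<'SQL'
SET max_parallel_maintenance_workers = 0;
VACUUM (VERBOSE, ANALYZE);
SQL
```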
      • Lorenzo[m] joined the channel
      • Lorenzo[m]
        Hi folks, in the last few days I experienced some issues related to scrobbling on LB. What I send to LB is simply not listed on my profile (I'm pretty sure it's not an error on my side)
      • I've checked the Bug Tracker and there is nothing related to this issue (at least not in the last few weeks)
      • Is it a known problem or should I open a ticket?
      • alastairp
        Lorenzo[m]: oh hi!
      • Lorenzo[m]: we did in fact have another report from another person here, and today we're going to do an upgrade of our database to help us fix this issue, so fingers crossed this will fix your problem too
      • Lorenzo[m]
        Oh nice, I'll try to scrobble some music tomorrow and I'll check if everything is fixed
      • Thank you for your time folks, I really appreciate the project and your efforts
      • atj
        Lorenzo[m]: <3
      • alastairp
        lucifer: oh, I just had a thought. if we don't use --link, then pg_upgrade is going to copy all data files anyway. it's going to take longer (need to copy all files), but it's not going to touch the old ones, so maybe we can get by without doing a backup?
      • alternatively, we make a network backup to another server, we could start with rsync now, and then re-run it to catch up modified files once we take the cluster down
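A sketch of the two pg_upgrade modes being weighed here; the bin and data paths are placeholders for wherever the 11 and 13 binaries and data directories live inside the migration container:

```sh
# copy mode (default): slower, but the old data directory is left untouched, so it doubles as a fallback
pg_upgrade -b /usr/lib/postgresql/11/bin -B /usr/lib/postgresql/13/bin \
           -d /var/lib/postgresql/11/data -D /var/lib/postgresql/13/data

# link mode: near-instant, but needs both data directories on the same filesystem,
# and the old cluster must not be started again once the new one has run
pg_upgrade --link -b /usr/lib/postgresql/11/bin -B /usr/lib/postgresql/13/bin \
           -d /var/lib/postgresql/11/data -D /var/lib/postgresql/13/data
```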
      • lucifer
        yes right.
      • skelly37 has quit
      • i think MB did rsync last time.
      • skelly37 joined the channel
      • alastairp
        right, but is that just copying the data directory to the replica for quicker startup?
      • 800GB is going to take at least 2 hours to copy somewhere else over gigabit
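A sketch of the two-pass rsync approach mentioned above (host and paths are placeholders): the first pass runs while the old cluster is still up, the second, much shorter pass runs after it is stopped and only transfers files that changed.

```sh
# pass 1: bulk copy while the database is still running (the copy will be inconsistent, that's fine)
rsync -a /var/lib/postgresql/11/data/ backuphost:/backup/pg11-data/

# pass 2: after stopping the old cluster, catch up the files modified since pass 1
rsync -a --delete /var/lib/postgresql/11/data/ backuphost:/backup/pg11-data/
```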
      • lucifer
        oh ok, makes sense.
      • alastairp: i am unsure which is better. your call.
      • alastairp
        zas: do we have a server with 900gb free space that we can rsync to?
      • v6lur joined the channel
      • atj
        postgres data files compress really well btw
      • alastairp
        atj: ah, interesting, might try that
      • atj
        although it depends on the contents
      • alastairp
        atj: let me get you up to date - we're doing a pg 11 to 13 upgrade. because we run pg in a docker container, we have the data in a volume. however, if we mount 2 volumes (1 for 11-data and 1 for 13-data) we can't hardlink between them with pg_upgrade, because they're different logical disks :(
      • atj
        makes sense
      • zas
        alastairp: kiss has 1.07TB free on /dev/md2
      • alastairp
        so now we're wondering about doing the copy version of the upgrade, but we'd like enough disk space (on gaga) to have 1) the v11 data, 2) a backup of it in case anything goes wrong, 3) the v13 data - but gaga doesn't have enough disk for this (the db is about 770GB)
      • zas: yes, I was just looking through servers and found that, thanks
      • atj: so - if you have any postgres upgrade and/or docker/volume/hardlink experience, it'd be interesting to hear your thoughts
      • lucifer: did you try the upgrade with --link with both data directories in the same volume?
      • atj
        well, hardlinks definitely aren't going to work unless you're in the same volume
      • lucifer
        alastairp: no, haven't done that yet.
      • Freso
        reosarevok: Not sure if you missed this: https://community.metabrainz.org/t/announcement...
      • alastairp
        atj: I had hoped that because these volumes were on the same partition on the host, it'd just work :)
      • reosarevok
        Freso: iirc I transcluded but I guess I didn't answer?
      • atj
        alastairp: have you tried creating manual hardlinks?
      • it would work from the host, but I don't think the container can see that they're the same filesystem
      • alastairp
        lucifer: oh, one other thing - we should decide if we want to keep running pg/timescale on debian, and if so decide which base image to use, how to install it, and how to start it (because the debian version splits the config and data directories, but our alpine setup keeps everything in one data dir)
      • zas
        atj: should we disable IPv6 on shorewall for now?
      • alastairp
        atj: on the host or in the container? this is done by pg_upgrade (in the container), so I don't think I can do it myself on the host and have it work
      • atj
        zas: I think so yes :/
      • the role is nearly done, just working on it now
      • alastairp
        lucifer: actually, https://github.com/metabrainz/docker-postgres-c... is basically good for us, I think it'd be a good idea to base it off of this
      • atj
        alastairp: does the container have a shell? can you exec in and touch a file on one volume then try to hardlink it to the other volume?
      • eg. docker exec -ti <container> bash
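A quick way to confirm the cross-volume hardlink limitation, along the lines atj suggests (the container name and mount paths are placeholders):

```sh
docker exec -ti <container> bash
touch /var/lib/postgresql/11-data/hardlink-test
ln /var/lib/postgresql/11-data/hardlink-test /var/lib/postgresql/13-data/hardlink-test
# "ln: failed to create hard link ...: Invalid cross-device link" means pg_upgrade --link
# cannot work across these two mounts either
```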
      • alastairp
      • atj
        worth a try but as expected
      • alastairp
        https://bbs.archlinux.org/viewtopic.php?id=241866 seems to indicate that overlay2 causes it, and there are workarounds like using devicemapper as the storage driver
      • but not something I want to get into now
      • atj
        my suggestion would be to create a compressed backup using tar and send it to another machine via ssh
      • best to have a backup regardless
      • alastairp
        any suggestions on the time tradeoff between just copying 700GB, or waiting for it to compress first and then copying it?
      • atj
        compression will be worth it IMO
      • alastairp
        bzip or gzip?
      • atj
        I'd recommend using lzip for better speed
      • alastairp
        or lzip!
      • atj
        needs installing
      • alastairp
        installed on gaga
      • what's the tar flag for lzip?
      • atj
        alastairp: also, use pv so you can see the speed of the backup
      • --lzip
      • alastairp
        tar | pv | lzip, to be able to measure against the real size of the data directory?
      • flags to tar to preserve users and permissions?
      • (or is that an extraction parameter)
      • atj
        tar --lzip -cf - /path | pv | ssh foo "cat > file.tar.lzip"
      • alastairp
        thanks!
      • atj
        I'd run the command from the root of the volume you are backing up
      • alastairp
        ah, so you directly stream the compressed file, rather than write to file then copy
      • atj
        just so the archive has relative paths - easier to unpack
      • exactly
      • alastairp
        but pv is going to be reporting the compressed size, right?
      • so we'll expect it to be smaller than the size of the data dir, but we don't know how much smaller
      • atj
        you can't know until you do it unfortunately
      • you could try compressing a few files to see
      • alastairp
        yeah, which is why I was thinking of tar -cf - /path | pv | lzip | ssh
      • atj
        that should work
      • you just won't know the transfer speed over ssh
      • alastairp
        let me try it on another machine I have
      • yeah, hopefully close to GigE minus overhead, but unclear
      • yvanzo
        reosarevok: Has the old docker volume staticbrainz-data been replaced with musicbrainz-static-build-prod and musicbrainz-static-build-beta?
      • atj
        alastairp: you might want to try lzma vs. lzip, I always get confused between these various lz compression algos
      • v6lur has quit
      • v6lur joined the channel
      • alastairp
        atj: hmm, adding lzma in the middle makes it 100x _slower_
      • atj
        CPU limited?
      • alastairp
        lzma: [1.27MiB/s], no compression: [ 110MiB/s], gzip: [21.5MiB/s]
      • checking now
      • yes, 100% cpu
      • atj
        lzip?
      • alastairp
        other option is tar -> file, pbzip2, copy file, but there's no parallelism there
      • atj
        zstd (yes another algo) supports multiple threads
      • alastairp
        even with input from stdin?
      • atj
        this is what I'm wondering
      • I think these more advanced algos have large window sizes, which require more buffering in RAM
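On the stdin question: zstd happily compresses a piped stream, and `-T0` enables multithreading across all cores (it is the high-compression, long-window modes that cost extra RAM). A sketch with placeholder host and paths:

```sh
# pv sits before zstd so the rate shown is measured against the uncompressed data directory size
tar -cf - /var/lib/postgresql/11/data | pv | zstd -T0 | ssh backuphost "cat > pg11-data.tar.zst"
```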
      • alastairp
        lzip -0 says that it's about as fast as gzip, and I'm seeing speeds about the same, but it's still CPU-bound rather than network-bound
      • atj
        not sure how much difference letting tar do the compression would make
      • alastairp
        2 minutes to copy a 2gb file, compared to 20sec using just rsync
      • does hetzner offer 10ge yet? :)
      • trying that now
      • no, that's just as slow
      • atj
        alastairp: on my system "tar --zstd -cf -"
      • 1.46GiB 0:00:17 [83.3MiB/s]
      • lzma and lzip were very slow
      • alastairp
        yes, zstd is 3x faster than lzip or gzip immediately
      • no compression: [ 110MiB/s], pv|zstd: [ 123MiB/s]
      • so yes, zstd is giving a slight advantage
      • atj
        I just compressed a 4.7GB VDI to 1.9GB at pretty much line speed
      • looks like that's your best bet, whether the compression is worth it I don't know.
      • lucifer
        alastairp: the MB docker setup is definitely useful for us to base on, but I'm not sure what all we need to test and prepare for that move, so probably best not to do it now?
      • alastairp
        oh yeah - I'm just doing a test with the pg dir now, it's hovering around 300MiB/sec with zstd (measuring with pv before going into zstd), so we're about 3x faster than just a regular copy over the same network link
      • thanks atj!
      • atj
        glad it worked :)
      • alastairp
        still 40 mins to do the backup
      • atj
        make sure to check the tar archive once it's completed - "tar --zstd -tf filename.tar.zstd"
      • skelly37 has quit
      • skelly37 joined the channel
      • actually, tar does recognise the file extension so you can do "tar -vtf <filename>.tar.zstd"
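A sketch of verifying and (if ever needed) restoring the archive, which also touches on the earlier question about preserving users and permissions: GNU tar records ownership and modes at creation time by default, so it is the extraction, run as root, that matters. Filename and target path are placeholders:

```sh
# list the archive to make sure it is readable end to end
tar --zstd -tf pg11-data.tar.zst > /dev/null

# restore: -p keeps permissions, and --same-owner (the default when extracting as root) keeps ownership
cd /var/lib/postgresql && tar --zstd -xpf /backup/pg11-data.tar.zst
```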
      • alastairp
        lucifer: yeah, that's a good point. so maybe we drop the indexes in the current image, bring it up in the migrate image, do the upgrade, bring it up in the new timescale image, and recreate the indexes?
      • lucifer: I had a look at the config file diff, there are some things I don't understand, but I think it'll be OK to copy directly in place
      • lucifer
        alastairp: are we going to do it with --link or without? if without, we could keep the indexes and drop/recreate them in the new image, in case we need to bring the old cluster back up.
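A generic sketch of the drop-and-recreate-indexes idea, not tied to the actual ListenBrainz schema: the database name and schema here are placeholders, and the real migration would pick specific indexes rather than everything.

```sh
# save the definitions of all indexes in the public schema so they can be recreated later
psql -U postgres -d listenbrainz -Atc \
  "SELECT indexdef || ';' FROM pg_indexes WHERE schemaname = 'public';" > /tmp/recreate_indexes.sql

# ... drop the chosen indexes, run pg_upgrade, bring up the new cluster ...

# recreate the saved indexes on the upgraded cluster
psql -U postgres -d listenbrainz -f /tmp/recreate_indexes.sql
```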
      • alastairp
        lucifer: I know you have a few items on the migration doc for preparing listenstore downtime, do you need time for that?