`listenbrainz-timescale-pg13-data` has been in use since.
2024-06-19 17143, 2024
mayhem[m]
is that mounted anywhere? if not, nuke it.
2024-06-19 17149, 2024
lucifer
i can delete the volume to free up 700g
2024-06-19 17138, 2024
mayhem[m]
how about typesense-data and typesense-new-data ?
2024-06-19 17147, 2024
mayhem[m]
that sounds like there is something duplicated.
2024-06-19 17145, 2024
lucifer
yes that can also be removed
2024-06-19 17139, 2024
lucifer
`/dev/md2 3.6T 2.1T 1.4T 61% /`
2024-06-19 17143, 2024
lucifer
:D
2024-06-19 17144, 2024
mayhem[m]
I'll ask hetzner if we can get an AC102 with only two 7.68TB drives.
2024-06-19 17149, 2024
lucifer
sounds great but might be overkill for now. (7.68TB drives seem to be expensive and we won't hit the limit for quite a while)
2024-06-19 17119, 2024
mayhem[m]
ok, then lets leave this issue be for a little while. I don't feel that there are good options for what we currently need.
2024-06-19 17132, 2024
lucifer
sure sounds good.
2024-06-19 17106, 2024
mayhem[m]
but lets keep our eyes peeled on this.
2024-06-19 17141, 2024
lucifer
yup makes sense, thanks!
2024-06-19 17109, 2024
lucifer
to be clear, so also postponing the LB backup server thing for now?
2024-06-19 17159, 2024
mayhem[m]
it doesn't quite feel that the options are right for us now. I understand the need, but the path looks kinda meh.
2024-06-19 17113, 2024
mayhem[m]
for all the reasons mentioned above.
2024-06-19 17115, 2024
lucifer
yup makes sense
2024-06-19 17140, 2024
mayhem[m]
we can't easily get another gaga class machine without it costing close to 300/month.
2024-06-19 17154, 2024
lucifer
true
2024-06-19 17120, 2024
lucifer
i guess buddy has enough backup space so that we can set up barman at least?
2024-06-19 17141, 2024
rimskii[m]
<lucifer> "try this instead" <- still not working :(
2024-06-19 17153, 2024
lucifer
what error do you get rimskii[m] ?
2024-06-19 17103, 2024
rimskii[m]
the same
2024-06-19 17124, 2024
rimskii[m]
connection to server at "127.0.0.1", port 5433 failed: Connection refused
2024-06-19 17125, 2024
rimskii[m]
Is the server running on that host and accepting TCP/IP connections?
2024-06-19 17128, 2024
lucifer
it will be different somehow because i changed the ports, share the new error.
2024-06-19 17135, 2024
mayhem[m]
buddy: /dev/md1 7.3T 959G 6.3T 14% /data2
2024-06-19 17141, 2024
lucifer
great,
2024-06-19 17143, 2024
mayhem[m]
looks that way.
2024-06-19 17158, 2024
lucifer
rimskii[m]: can you share the ssh command output?
2024-06-19 17109, 2024
lucifer
to check if the port forwarding worked.
2024-06-19 17126, 2024
rimskii[m] uploaded an image: (1643KiB) < https://matrix.chatbrainz.org/_matrix/media/v3/download/matrix.org/LVXQvXnSMgENHyfmcCtrEcgV/Screenshot%202024-06-19%20at%2015.34.13.png >
2024-06-19 17128, 2024
rimskii[m] uploaded an image: (1766KiB) < https://matrix.chatbrainz.org/_matrix/media/v3/download/matrix.org/RBFOPVUeACRtVFhDUAJDSfiB/Screenshot%202024-06-19%20at%2015.33.37.png >
2024-06-19 17112, 2024
rimskii[m]
maybe I should have run "ssh rimskii@wolf.metabrainz.org" first, then "ssh -L 5432:localhost:5432 rimskii@wolf.metabrainz.org"?
2024-06-19 17130, 2024
lucifer
rimskii[m]: the command is outdated.
2024-06-19 17143, 2024
lucifer
you have to do `ssh -L 5433:localhost:5432 rimskii@wolf.metabrainz.org`
2024-06-19 17158, 2024
lucifer
note the 5433:
2024-06-19 17143, 2024
monkey[m]
That was the command rimskii ran, from what I can tell from the logs.
2024-06-19 17107, 2024
rimskii[m]
lucifer: yep i tried it too
2024-06-19 17114, 2024
lucifer
oh i see, i didn't look at the logs but just the command.
2024-06-19 17119, 2024
lucifer
hmm, that should work.
2024-06-19 17137, 2024
lucifer
according to logs, the port forwarding worked.
2024-06-19 17150, 2024
monkey[m]
Could it be something regarding "127.0.0.1:5432" vs. "localhost:5432" ?
2024-06-19 17109, 2024
monkey[m]
I mean, I assume not, but ...
2024-06-19 17115, 2024
lucifer
shouldn't be. but it's a mac
2024-06-19 17115, 2024
monkey[m]
I tried to connect in the same way, works for me (on linux)
2024-06-19 17135, 2024
monkey[m]
One thing I had to adjust to be able to connect to the DB was username and password
2024-06-19 17156, 2024
lucifer
the latest error points to a port forwarding error
2024-06-19 17110, 2024
atj[m]
you can see from the logs that it is listening on 127.0.0.1 and ::1
2024-06-19 17124, 2024
atj[m]
so localhost vs. 127.0.0.1 is irrelevant
2024-06-19 17156, 2024
atj[m]
I will test on my MacBook but never had any issues on Linux vs. MacOS in this area
2024-06-19 17111, 2024
monkey[m]
Same here, used to do port forwarding fine on macos
2024-06-19 17108, 2024
monkey[m]
Maybe something related to the macos firewall?
2024-06-19 17153, 2024
mayhem[m]
or a university/home router configuration?
2024-06-19 17135, 2024
atj[m]
not sure how that could make any difference, it's a local port forward
2024-06-19 17138, 2024
atj[m]
so the router isn't involved
2024-06-19 17141, 2024
atj[m]
rimskii: what terminal are you using?
2024-06-19 17128, 2024
monkey[m]
Sorry if I'm way out, but where is the connection error coming from? Inside Docker? if so, does the local port 5433 need to be added to the docker container ports?
2024-06-19 17131, 2024
rimskii[m]
atj[m]: the default one? mac?
2024-06-19 17141, 2024
lucifer
mayhem[m]: atj[m]: i just hopped on a call with rimskii. port forwarding is working, but the service that needs to access the port is inside docker.
2024-06-19 17156, 2024
lucifer
monkey[m]: yeah that
2024-06-19 17107, 2024
monkey[m]
Might be good to try connecting with psql to remove any potential docker-related issue
not really relevant at this point but just a suggestion
2024-06-19 17115, 2024
monkey[m] backs out of everyone's business
2024-06-19 17150, 2024
lucifer
rimskii[m]: can you replace `localhost` in MB_DATABASE_URI with `host.docker.internal` ?
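[Editor's sketch] Inside a container, `localhost` refers to the container itself, so an SSH tunnel listening on the host's port 5433 is not reachable under that name; on Docker Desktop for Mac, `host.docker.internal` resolves to the host. A minimal way to check which name works from where the service runs, assuming a hypothetical helper `can_connect` that is not part of any project code:

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    Hypothetical helper for illustration only.
    """
    try:
        # create_connection resolves the name and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False
```

Run inside the container, `can_connect("localhost", 5433)` and `can_connect("host.docker.internal", 5433)` would show which hostname actually reaches the tunnel.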
2024-06-19 17140, 2024
mayhem[m]
wow, that's shitty hetzner: "It´s possible to remove the default drives but there will be no price reduction for it."
2024-06-19 17154, 2024
rimskii[m]
lucifer: omg it works!
2024-06-19 17103, 2024
rimskii[m]
now i got a new error haha
2024-06-19 17114, 2024
rimskii[m] uploaded an image: (145KiB) < https://matrix.chatbrainz.org/_matrix/media/v3/download/matrix.org/runZuVjtOYXpNkFuknRoSuNM/Screenshot%202024-06-19%20at%2016.04.39.png >
2024-06-19 17117, 2024
mayhem[m]
one step forward....
2024-06-19 17139, 2024
lucifer
rimskii[m]: i see, we don't have the mapping tables on wolf but i can run a script to create them there.
2024-06-19 17148, 2024
lucifer
will take a few hours for it to complete.
2024-06-19 17150, 2024
yvanzo[m]
atj: Is a new snapshot needed for upgrading Solr version in the cluster? Or should we test it separately?
2024-06-19 17117, 2024
rimskii[m]
lucifer: maybe i can do that
2024-06-19 17158, 2024
atj[m]
yvanzo[m]: I don't think so, if it works in 9.6.0 it will work in 9.6.1
2024-06-19 17141, 2024
atj[m]
If we do it separately I can write docs on that too :)
2024-06-19 17115, 2024
lucifer
rimskii[m]: sure. so there are two ways to do it, 1 is running the command locally with the databases connected over ssh but that means you need a stable network connection for a few hours at least.
2024-06-19 17118, 2024
yvanzo[m]
OK, let’s try upgrading to Solr 9.6.1 first then.
2024-06-19 17139, 2024
lucifer
alternative is cloning the listenbrainz-server on wolf. and running the commands from there.
2024-06-19 17145, 2024
yvanzo[m]
I’m building the snapshot in parallel.
2024-06-19 17151, 2024
lucifer
i think this way will work better for you
2024-06-19 17107, 2024
mayhem[m] fires off another salvo of chatgpt drivel to hetzner.
2024-06-19 17135, 2024
rimskii[m]
lucifer: ok
2024-06-19 17118, 2024
atj[m]
yvanzo: just FYI, IME Solr can sometimes refuse to stop in an orderly fashion and the stop script waits 180 seconds before killing it
2024-06-19 17150, 2024
atj[m]
so if you see the restart task hanging, don't worry it's normal (ish)
2024-06-19 17153, 2024
atj[m]
just wait
2024-06-19 17153, 2024
yvanzo[m]
atj: I pushed a commit to the `solr` branch and updated node 8 as suggested.
2024-06-19 17112, 2024
lucifer
rimskii[m]: let me know when you have cloned the repo and i can help update the script and let you know the commands to build the cache.
2024-06-19 17141, 2024
atj[m]
yvanzo: LGTM, want to upgrade the entire cluster?
2024-06-19 17106, 2024
yvanzo[m] uploaded an image: (83KiB) < https://matrix.chatbrainz.org/_matrix/media/v3/download/chatbrainz.org/WGSgqrySvDbPSeUWVXmkoNhQ/errors-when-updating-solr.png >
2024-06-19 17127, 2024
yvanzo[m]
Those are transient errors that occurred when updating the node 8.
2024-06-19 17136, 2024
rimskii[m]
lucifer: okay
2024-06-19 17143, 2024
rimskii[m]
i have to build the docker too right?
2024-06-19 17148, 2024
atj[m]
yvanzo[m]: that's normal AFAIU
2024-06-19 17103, 2024
lucifer
rimskii[m]: yes but different containers.
2024-06-19 17146, 2024
yvanzo[m]
atj: Should we pause between updating each node to preserve service availability?
2024-06-19 17123, 2024
yvanzo[m]
-f1 is certainly useful
2024-06-19 17131, 2024
rimskii[m]
lucifer: which containers?
2024-06-19 17108, 2024
rimskii[m]
cloned the repo & updated config
2024-06-19 17119, 2024
yvanzo[m]
atj: Ideally we should have a `wait_for` that checks the node is available again before moving to another task.
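[Editor's sketch] The idea of waiting until a node is available again before moving on can be illustrated outside Ansible as a simple polling loop. This is not the playbook's implementation; `wait_for_port`, its parameters and the TCP-level check are assumptions made for illustration (a real readiness check for Solr would likely need more than an open port):

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float, interval: float = 2.0) -> bool:
    """Poll host:port until it accepts TCP connections or `timeout` seconds pass.

    Hypothetical helper; returns True once a connection succeeds, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Node not reachable yet (refused, timed out, or unresolved).
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A rolling update would call this after restarting each node and only continue to the next node when it returns True.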
2024-06-19 17151, 2024
yvanzo[m]
atj: That would be for production though, I’ll proceed as documented for now.
it should only need a minor update to the playbook
2024-06-19 17110, 2024
atj[m]
yvanzo: can I squash your commit into mine?
2024-06-19 17133, 2024
atj[m]
I need to force push to the solr branch really otherwise it gets messy and I just want to merge one big commit
2024-06-19 17133, 2024
yvanzo[m]
atj: No problem.
2024-06-19 17108, 2024
lucifer
rimskii[m]: `cd mbid_mapping` then `cp config.py.sample config.py`.
2024-06-19 17155, 2024
lucifer
edit `config.py`, set `MBID_MAPPING_DATABASE_URI` and `MB_DATABASE_MASTER_URI` to `dbname=musicbrainz_db user=musicbrainz host=db port=5432 password=musicbrainz`
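[Editor's sketch] With both settings filled in, the relevant part of `config.py` might look roughly like the following. The two setting names and the connection string come from the message above; `MB_DSN` and `parse_dsn` are hypothetical names added here only to sanity-check the value, and `host=db` is presumably the database service name on the Docker network:

```python
# Sketch of the two settings in mbid_mapping/config.py.
# MB_DSN is a hypothetical local variable used to avoid repeating the string.
MB_DSN = "dbname=musicbrainz_db user=musicbrainz host=db port=5432 password=musicbrainz"

MBID_MAPPING_DATABASE_URI = MB_DSN
MB_DATABASE_MASTER_URI = MB_DSN


def parse_dsn(dsn: str) -> dict:
    """Split a simple space-separated key=value string into a dict.

    Hypothetical helper; does not handle quoted values containing spaces.
    """
    return dict(item.split("=", 1) for item in dsn.split())
```

Calling `parse_dsn(MB_DATABASE_MASTER_URI)` is a quick way to confirm that the host, port and database name are the ones intended before running `./build.sh`.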
this is better than -f1 because it means the entire play will be run on each node serially rather than each task
2024-06-19 17120, 2024
atj[m]
so the "restart" and the "wait for start" handlers will run before moving on to the next host
2024-06-19 17145, 2024
kellnerd[m]
<kellnerd[m]> "monkey, mayhem: Since my GSoC..." <- ^ Reminder just in case you have missed my message mayhem, last year an org admin was required to do that change on the SoC page IIRC (CC monkey).
2024-06-19 17123, 2024
rimskii[m]
<lucifer> "edit `config.py`, set `MBID_MAPP..." <- done !
2024-06-19 17116, 2024
lucifer
rimskii[m]: run `./build.sh`
2024-06-19 17128, 2024
yvanzo[m]
atj: Now testing!
2024-06-19 17141, 2024
rimskii[m]
lucifer: done !
2024-06-19 17104, 2024
rimskii[m]
now "python mapper/manage.py canonical-data"?
2024-06-19 17144, 2024
lucifer
run `tmux`
2024-06-19 17128, 2024
lucifer
then `docker run --rm -it --network musicbrainz-docker_default metabrainz/mbid-mapping python3 manage.py create-all`
2024-06-19 17128, 2024
rimskii[m] uploaded an image: (515KiB) < https://matrix.chatbrainz.org/_matrix/media/v3/download/matrix.org/qygdBSoPyHjbKNBmzZarLYMH/Screenshot%202024-06-19%20at%2017.03.04.png >
2024-06-19 17110, 2024
rimskii[m]
lucifer: should I change the sqlalchemy one too?
2024-06-19 17140, 2024
lucifer
rimskii[m]: hmm i see, okay try this instead. `docker run --rm -it --network musicbrainz-docker_default metabrainz/mbid-mapping python3 manage.py canonical-data --use-mb-conn`
2024-06-19 17153, 2024
yvanzo[m]
atj: Oops, I cannot edit solr.yml at the same time.
2024-06-19 17127, 2024
yvanzo[m]
(?)
2024-06-19 17148, 2024
atj[m]
what do you mean?
2024-06-19 17140, 2024
yvanzo[m]
I was editing solr.yml in the meantime and realized that ansible-playbook was still running, so interrupted it, restored the file, and started it again.
2024-06-19 17115, 2024
yvanzo[m]
Not sure if it is watching changes to solr.yml though.
2024-06-19 17121, 2024
atj[m]
oh right, yes it's read when the process starts
2024-06-19 17138, 2024
iconoclasthero has quit
2024-06-19 17113, 2024
yvanzo[m]
It is actually downloading the JAR again when updating Solr.
2024-06-19 17132, 2024
rimskii[m]
<lucifer> "rimskii: hmm i see, okay try..." <- done ! it works
2024-06-19 17112, 2024
lucifer
rimskii[m]: did it finish already?
2024-06-19 17123, 2024
yvanzo[m]
atj: Would there be a way to prevent routing requests to a node during the update?
2024-06-19 17151, 2024
rimskii[m] uploaded an image: (1887KiB) < https://matrix.chatbrainz.org/_matrix/media/v3/download/matrix.org/UcpsbUTxTdaTvElKqSBvakNw/Screenshot%202024-06-19%20at%2017.18.40.png >