jasje: I have made the final PR of the playlist feature. Let me know if any changes are required, then we can merge it into the main branch.
2025-03-18
jasje[m]
HemangMishra[m]: Need time to test
2025-03-18
HemangMishra[m] uploaded an image: (736KiB) < https://matrix.chatbrainz.org/_matrix/media/v3/download/matrix.org/jhCzeuhZjPransvQZQvwedGO/image.png >
2025-03-18
HemangMishra[m]
HemangMishra[m]: Also, should I change the gradient or the rounded corners of the description section? The Artists page and others have rounded corners, but Profile > Listens doesn't, so I'm not sure which one to follow.
2025-03-18
reosarevok[m]
Beta update done
2025-03-18
rustynova[m]
Hi! I plan to make some contributions to Melba again now that the project is in a more stable state (and not bound by the rules of GSoC).
2025-03-18
rustynova[m]
But I do wonder, is the project on hiatus, or does it just need contributors? There's a pending PR by yellowhatpro that hasn't been touched in 4 months...
2025-03-18
rustynova[m]
For now I'd just make some refactoring changes to clean things up a bit (and finally rebase my own PR).
2025-03-18
yellowhatpro[m] joined the channel
2025-03-18
yellowhatpro[m]
Hi, rustynova. Yeah, you can work on the project. I have been so busy with some other projects outside of MetaBrainz that I am not able to work on melba.
2025-03-18
vardhan_ joined the channel
2025-03-18
rustynova[m]
Well, that will be a side project for me too. Something to take my mind off other projects. Will try for at least a commit per week though.
2025-03-18
lucifer[m]
mayhem: hi! i am doing MLHD+ chunk by chunk processing and i think we might be able to do it on the current cluster. it processes one chunk in an hour and there are 16 chunks in total, so the whole run takes about a day. however, each chunk produces an output of 400G to store, and that's without replication; with replication it's 3x. so the current bottleneck is disk space.
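A quick sketch of the disk-space arithmetic above (chunk count and output size are lucifer's figures; the replication factors are the usual HDFS values):

```python
# Disk-space estimate for MLHD+ processing, using the figures above:
# 16 chunks, ~400 GB of output each, before HDFS replication.
chunks = 16
gb_per_chunk = 400

for replication in (1, 2, 3):
    total_tb = chunks * gb_per_chunk * replication / 1000
    print(f"replication x{replication}: ~{total_tb:.1f} TB")

# replication x1: ~6.4 TB
# replication x2: ~12.8 TB
# replication x3: ~19.2 TB  (the default, hence the disk-space bottleneck)
```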
2025-03-18
mayhem[m]
ok, let me see what our options are.
2025-03-18
lucifer[m]
zas: mayhem: i see hetzner has 5TB storage boxes for $13/month. so in the worst case we keep this around for a month, it's $65, still cheaper than the vms.
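The $65 figure is five boxes at $13/month; the five-box count is an inference (5 x 5 TB = 25 TB covers the ~19 TB replicated estimate above):

```python
# Worst case: one month of Hetzner storage boxes (size and price from
# above; the box count is inferred from the ~19 TB replicated estimate).
boxes, tb_per_box, usd_per_box = 5, 5, 13
print(boxes * tb_per_box)   # 25 TB of capacity
print(boxes * usd_per_box)  # 65 USD for the month
```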
2025-03-18
mayhem[m]
(but, squeee, that is exciting!)
2025-03-18
mayhem[m]
the storage boxes run via SAMBA and end up being slow and unreliable.
2025-03-18
lucifer[m]
i am guessing it's HDD not SSD, so slower, but we have 12 hours of spark free time daily so i could do 3-4 chunks a day.
2025-03-18
lucifer[m]
i see
2025-03-18
mayhem[m]
it's for storing things, not for working on things.
2025-03-18
mayhem[m]
I tried to run navidrome on top of a storage box and it lasted a week before it shat itself.
2025-03-18
lucifer[m]
i can disable replication cluster-wide or reduce it to 2 for the duration of mlhd processing, but even then i expect at least 10 TB of disk space requirement.
2025-03-18
mayhem[m]
we could have disks put into those machines, but it would be a real pain.
2025-03-18 07747, 2025
mayhem[m]
given all this, I still think we should use new VMs to get this done.
2025-03-18
lucifer[m]
i see.
2025-03-18
mayhem[m]
and we can attach storage volumes (fast and suitable for working) to the nodes to reach the DB levels we need.
2025-03-18
lucifer[m]
is it possible to have multiple disks at a single mount point?
2025-03-18
mayhem[m]
no, unix doesn't allow that.
2025-03-18
mayhem[m]
why do you need that?
2025-03-18
lucifer[m]
actually nvm, i found the way to specify multiple directories/mounts to hdfs
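For reference, the HDFS setting lucifer is most likely referring to is dfs.datanode.data.dir, which takes a comma-separated list of data directories, one per mounted volume (the paths here are hypothetical):

```xml
<!-- hdfs-site.xml: spread datanode storage across several mounts
     (hypothetical paths; each entry is one mounted volume) -->
<property>
  <name>dfs.datanode.data.dir</name>
  <value>/mnt/vol1/hdfs/data,/mnt/vol2/hdfs/data</value>
</property>
```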
2025-03-18
mayhem[m]
if you wanted that you would have to use something like LVM and make a virtual drive.
2025-03-18
mayhem[m]
phew.
2025-03-18
lucifer[m]
zas: would it be possible for the new VMs to connect to the existing spark servers?
2025-03-18
lucifer[m]
i need a smaller server for running the leader, but i guess we could get a smaller vm for that and keep it all isolated from the current cluster.
2025-03-18
lucifer[m]
mayhem: so i am thinking CCX53 * 3 + CCX53 * 1 + 10 TB storage volume on each.
zas[m]
Each machine we want to connect should be in spark-vnet (this is on Hetzner Robot)
2025-03-18
zas[m]
For example, jermaine has:... (full message at <https://matrix.chatbrainz.org/_matrix/media/v3/download/chatbrainz.org/gzyHOuhbaPAqyggQdBvOxWQL>)
2025-03-18
monkey[m]
<aerozol[m]> "I think this was about the..." <- Yeah, I think we have enough stuff to show-and-tell; that would be nice. Not sure we have any feature on the immediate horizon to justify waiting for.
2025-03-18
monkey[m]
Let me know if you want some help, and in what form. For example I can collate everything I think should go in there, and let you filter, massage and post?
2025-03-18
zas[m]
michael (which is also on this vnet) can ping jermaine over this network:... (full message at <https://matrix.chatbrainz.org/_matrix/media/v3/download/chatbrainz.org/pBUNzZkihUCTrmExDeiggUyv>)
2025-03-18
zas[m]
So the idea is to have your VMs in this network too.
mayhem[m]
oy. I think it will be better to wait, zas -- I think you'll do it much faster than me trying to learn all this.
2025-03-18
zas[m]
Actually it would be good if you add the VMs to Ansible (this part is easy), and then learn how to deploy (basically run the bootstrap.yml playbook, then the site.yml playbook). You need ssh to be properly configured (ssh root@yourvm should work for bootstrap, and then ssh youruser@yourvm should work for site).
2025-03-18
zas[m]
You risk nothing by trying (at worst we rebuild the VM and start over). Properly configuring the network is a must (of course, else we lose connection).
2025-03-18
zas[m]
for the vswitch part, that's trickier, because we don't have such a setup yet, and I expect some changes will be needed to the network/firewall settings in Ansible to handle this case.
2025-03-18
zas[m]
Bootstrap needs root access because users aren't created yet, then site can run on user+sudo
2025-03-18
mayhem[m]
well, this sounds more interesting than the thing I am fighting now...
2025-03-18
mayhem[m]
<lucifer[m]> "mayhem: so i am thinking CCX53 *..." <- this is actually not clear. one machine with 10TB storage?
2025-03-18
lucifer[m]
each.
2025-03-18
mayhem[m]
and 3 stock CCX53s?
2025-03-18
mayhem[m]
so 4 CCX53 with each 10TB?
2025-03-18
lucifer[m]
oh sorry, i made a typo. 3 CCX53's and 1 CCX23. with 10 TB each yes.
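Sanity-checking that sizing against the earlier estimate (a sketch; 3 CCX53 workers plus 1 CCX23 leader, each with a 10 TB volume, per the messages above):

```python
# Proposed temporary cluster: 3x CCX53 workers + 1x CCX23 leader,
# each with a 10 TB storage volume attached.
nodes, tb_per_node = 4, 10
print(nodes * tb_per_node)  # 40 TB raw, above the ~19.2 TB needed at 3x replication
```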
2025-03-18
lucifer[m]
but i did think of an alternative plan to tackle this dataset too if you want to wait.
2025-03-18
mayhem[m]
what is that alternate plan?
2025-03-18
lucifer[m]
process each file individually in parallel using duckdb or somesuch and then just do the final combination step in spark.
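A minimal sketch of that alternative, assuming the dump is a directory of compressed CSV-style files (paths and layout are hypothetical; duckdb's Python API plus a process pool, with the combine step left to Spark):

```python
# Hypothetical per-file MLHD+ processing: each worker converts one file
# with duckdb; a final combination step would still run in Spark.
from multiprocessing import Pool
from pathlib import Path

import duckdb

def process_file(path: Path) -> str:
    out = path.with_suffix(".parquet")
    con = duckdb.connect()  # in-memory database per worker
    con.execute(
        f"COPY (SELECT * FROM read_csv_auto('{path}')) "
        f"TO '{out}' (FORMAT parquet)"
    )
    con.close()
    return str(out)

if __name__ == "__main__":
    files = sorted(Path("/data/mlhd").glob("*.txt.zst"))  # hypothetical layout
    with Pool(processes=8) as pool:
        for written in pool.imap_unordered(process_file, files):
            print("wrote", written)
```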
2025-03-18
mayhem[m]
sounds like a lot of code for you to write.
2025-03-18
mayhem[m]
And I can't get a dedicated CPU VM. We've reached some limit I need to have raised.
2025-03-18
lucifer[m]
hard to tell without taking a shot at it; might be a day, maybe less, maybe more.
2025-03-18
mayhem[m]
maybe spend an hour or 2 on it and see if that approach is promising?
2025-03-18
zas[m]
If those machines are temporary, we may not add them to Ansible at all, and just do a basic config manually
2025-03-18
mayhem[m]
yes, all temp. but we'll see if we actually need them
2025-03-18
lucifer[m]
i don't anticipate issues showing up until, say, 10% of the dataset is done, but sure, let me try.
2025-03-18
zas[m]
ok, tell me, I'll be happy to help, but I'm very busy right now controlling this absurd traffic surge...
2025-03-18
mayhem[m]
zas[m]: focus on the surge. lucifer will try an alternate path.
2025-03-18
d4rkie has quit
2025-03-18
d4rkie joined the channel
2025-03-18
d4rkie has quit
2025-03-18
d4rkie joined the channel
2025-03-18
d4rk-ph0enix has quit
2025-03-18
d4rk-ph0enix joined the channel
2025-03-18
vardhan_ has quit
2025-03-18
leftmostcatUTC-8 has quit
2025-03-18
bitmap[m]
reosarevok: yvanzo: lucifer: hello, are we meeting now?
2025-03-18
reosarevok[m]
Hi! I thought we were! :)
2025-03-18
yvanzo[m]
I thought it was 1h from now, but I’m around already.
2025-03-18
monkey[m]
mayhem, lucifer: An interesting timezone-related issue I'd never seen before was reported (LB-1766): `can't compare offset-naive and offset-aware datetimes`. The ticket was assigned to me, but I don't know if I'm the ideal candidate. Can I assign it to either of you?
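For context, that error comes from comparing a naive datetime (no tzinfo) with an aware one; a minimal reproduction and the usual fix (a sketch, not the actual ListenBrainz code):

```python
# Reproduce "can't compare offset-naive and offset-aware datetimes"
# and show the usual fix (illustrative only, not ListenBrainz code).
from datetime import datetime, timezone

naive = datetime.utcnow()           # no tzinfo attached
aware = datetime.now(timezone.utc)  # tzinfo=UTC

try:
    naive < aware
except TypeError as exc:
    print(exc)  # can't compare offset-naive and offset-aware datetimes

# Fix: make both sides aware before comparing.
print(naive.replace(tzinfo=timezone.utc) < aware)
```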