#metabrainz

      • D4RK-PH0ENiX has quit
      • minteria has quit
      • CallerNo6
      • ... which doesn't have much to do with the MB community, but presumably it helps /somebody/ :-)
      • MBJenkins
        Project musicbrainz-server_master build #394: ABORTED in 13 sec: https://ci.metabrainz.org/job/musicbrainz-serve...
      • bitmap
        grrr I keep getting logged out of ci so I can't cancel things before they start
      • D4RK-PH0ENiX joined the channel
      • if I accidentally visit http, then switch to https, I'm logged out
      • Gentlecat
        there's a "Whitelist Target Branches" option
      • maybe we can just set it to master?
      • ok, done
      • bitmap
        what does it do?
      • Gentlecat
        ideally not build pull requests to schema-change-2016-q2
      • bitmap
        cool
      • Gentlecat
        "Adding branches to this whitelist allows you to selectively test pull requests destined for these branches only. Supports regular expressions (e.g. 'master', 'feature-.*')."
      • bitmap
        ah, so only prs to master will be built, that sounds good
      • or ^(master|beta|production)$ if that actually works
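A minimal sketch of how an anchored whitelist pattern like the one bitmap suggests behaves, using Python's `re` module. This only illustrates the regular expression itself; how the Jenkins plugin applies its whitelist (full match vs. partial match) is not stated in the channel, and the helper name `should_build` is hypothetical.

```python
import re

# Pattern quoted in the channel; the anchors make it match whole branch names only.
WHITELIST = re.compile(r"^(master|beta|production)$")


def should_build(target_branch: str) -> bool:
    """Return True if a pull request targeting this branch would be whitelisted.

    Hypothetical helper for illustration; not the plugin's actual logic.
    """
    return WHITELIST.search(target_branch) is not None


if __name__ == "__main__":
    for branch in ["master", "beta", "production",
                   "schema-change-2016-q2", "master-old"]:
        print(branch, should_build(branch))
```

With the `^` and `$` anchors, a branch such as `master-old` is rejected even if the matcher only searches for the pattern somewhere in the name.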
      • linuxrocks
        I read the policy regarding the Internet Archive storage, which lists it for historical, research purposes and fair use - sounds good to me. Then the MB site says use images at your own risk, I'm assuming they are referring to commercial interests?
      • bitmap
        basically any redistribution or non-private use of the images is at your own risk, since nobody owns the copyrights to them
      • nobody meaning MB or the IA
      • linuxrocks
        I just want to upload much of the content I have, encouraging others to as well - to make a more complete archive. But I don't want to cause any issues with MB or IA.
      • Maybe I could add a coffee stain to the images and call it my art ;-)
      • bitmap
        ah, I wouldn't worry about that. if the IA gets a complaint, they'll take the images down (it's happened before)
      • JesseW joined the channel
      • linuxrocks
        ahh so there's no guarantee that if I spend time to do this, it will *stick*. Can images be added to a release through an API? This way I could more easily upload it in a batch sort of way?
      • bitmap
        well, by happened before, I mean 0.something% of all uploads :)
      • there's not an API to do this, unfortunately
      • CallerNo6
        who doesn't want to be archived? that's crazytalk.
      • linuxrocks
        OK, got it. Does the original uploader of an image get notified if it gets deleted? Actually how does this work, doesn't MB just store a link to the image at IA (CAA)? How does MB know that the IA has deleted an image?
      • bitmap
        we don't, actually. there's a way to get a list of deleted images from the IA (in a bug report somewhere), but it hasn't been implemented yet
      • it hasn't happened frequently enough to be a development priority, I guess
      • linuxrocks
        makes sense
      • Gentlecat
        bitmap: added beta and production to that list
      • bitmap
        thx
      • about an API for uploads, the main problem is that the CAA was designed so that images never pass through MB servers
      • but only MB has the keys necessary to sign uploads directly to the IA
      • Gentlecat: seems to be working :)
      • Gentlecat
        \o/
      • JesseW has quit
      • JesseW joined the channel
      • JesseW has quit
      • bitmap has quit
      • bitmap joined the channel
      • QuoraUK has quit
      • zas
        bitmap: ping
      • ah nvm, astro / was full, i removed >280 GB of nginx logs (only .gz)
      • How come astro alone was using such high bandwidth?
      • And that's not nginx according to http://stats.metabrainz.org/dashboard/db/all-ng...
      • Gentlecat
        bitmap: can you take a look at https://github.com/metabrainz/pytools/pulls if you have some time?
      • bitmap
        Gentlecat: sure thing
      • Gentlecat
        thanks!
      • bitmap
        zas: maybe a full db export?
      • there was a --with-full-export running, so I guess
      • the 30d range makes that easier to see
      • yeeeargh joined the channel
      • neersighted has quit
      • neersighted joined the channel
      • neersighted has quit
      • JesseW joined the channel
      • kanha has quit
      • neersighted joined the channel
      • darwin
        super minor formatting issue in notification mail, missing space : "Label "2 Swords"(Copy Paste Soul's personal imprint) - merged by edit #38369836
      • there should be a space after " and before (
      • linuxrocks is now known as linuxrocks_
      • linuxrocks_ is now known as linuxrocks
      • linuxrocks has left the channel
      • linuxrocks joined the channel
      • linuxrocks has left the channel
      • linuxrocks joined the channel
      • JesseW has quit
      • regagain joined the channel
      • kanha joined the channel
      • UmkaDK has quit
      • zas
        bitmap: then this export may have failed, due to lack of disk space, did you verify it ?
      • ruaok
        so, uhm, who is responsible for the local weather. I'd like to file a report. :(
      • zas
        Hey good morning Rob !
      • ruaok
        morning!
      • zas
        Finally at home ?
      • ruaok
        yep, got home last night.
      • and all I want to do is go for a nice long ride in the Spanish sun. except there is no sun today. oh well, mañana. :)
      • zas
        ruaok: i thought of something about the number of IPs we need at NewHost. Having more IPs is good when it comes to the limit on the number of sockets, which are identified by a (source ip, source port, dest ip, dest port) tuple (do you remember the ernie/bert issues related to this at some point?). Since we want to be able to handle a lot of simultaneous connections, better to spread services across different IPs (and hostnames).
      • Also, i'm thinking about a redis HA setup: we'll need another server for that (one redis master and at least one redis slave). Basically the setup is based on HAProxy + redis sentinel, and is quite simple. It will solve a reliability issue we currently have (that is, losing the redis master). bitmap said mbs doesn't handle redis master failures well (or at all).
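A rough sketch of the socket-tuple point zas raises above: since a TCP connection is identified by (source ip, source port, dest ip, dest port), the number of simultaneous connections between one source IP and one destination endpoint is bounded by the usable source ports. The default of 28232 ports assumes the common Linux ephemeral range 32768-60999; the function name and parameters are hypothetical, and the result is only an upper bound that ignores TIME_WAIT, file-descriptor limits, and memory.

```python
def max_concurrent_connections(source_ips: int,
                               dest_endpoints: int,
                               ports_per_source_ip: int = 28232) -> int:
    """Upper bound on distinct (src ip, src port, dst ip, dst port) tuples.

    ports_per_source_ip defaults to the size of the usual Linux ephemeral
    port range (32768-60999); this is an assumption, not a measured value.
    dest_endpoints counts distinct (dest ip, dest port) pairs.
    """
    return source_ips * ports_per_source_ip * dest_endpoints


if __name__ == "__main__":
    # One gateway IP talking to a single backend ip:port.
    print(max_concurrent_connections(1, 1))
    # Same gateway, but the service is spread over 4 backend IPs.
    print(max_concurrent_connections(1, 4))
```

Spreading a service over more IPs on either side multiplies the bound, which is the reason for asking for more addresses.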
      • ruaok
        makes sense to me. just keep adding these thoughts to the doc.
      • zas
        ok
      • ruaok
        now as far as redis masters... the use case for redis is so bizarre that most machines don't fit really well for that.
      • ideally we'd need 64GB with one or two cores.
      • really weird setup.
      • zas
        i think redis makes good use of more cores, to be verified
      • ruaok
        so, the question in my mind is this: what do we use that is CPU intensive, but not memory intensive?
      • making good use and using them all are two different things, no?
      • zas
        Yes ;)
      • ruaok
        a memcached machine is bored most of the time, but with massive memory use.
      • I'd like to find a complementary task, that also needs to be HA, that we can stick on those servers with redis for better utilization.
      • zas
        This is why we can run cpu intensive processes (but not memory intensive processes) on the same machines
      • ruaok
        exactly.
      • zas
        indexer is one (on jiji)
      • ruaok
        building indexes is one, but that will go away hopefully soon.
      • what are our future, long term use cases?
      • zas
        perhaps compression (backups, logs ?)
      • ruaok
        logs, quite possibly.
      • zas
        logs would fit
      • we need HA centralized logs storage
      • if logs are written over the network, and not stored on most machines, it means we can go for small and fast non-RAID SSDs for web servers, for example?
      • ruaok
        just as long as we don't implement that before the move. after the move, yes.
      • zas
        Same goes for gateways
      • ruaok
        why SSD?
      • HDD should be sufficient.
      • zas
        Oh yes, i mean HDD
      • ruaok
        ok.
      • astro disk is usually < 5%
      • zas still under the needed coffee level
      • but yes, non RAID HDD on app servers.
      • :)
      • zas
        We need to think about backups
      • ruaok
        not coffee??
      • zas
        NewHost may not have someone to rotate our USB drives ;)
      • ruaok
        oh, yes that concept is dead.
      • but, given that EACH machine has a massive bandwidth budget, we take our backups machine and give it a public IP.
      • then blast backups as fast as we can to google cloud or glacier.
      • probably both.
      • google cloud and then once a week to glacier as well
      • zas
        Yup, may be a solution
      • ok coffee ;)
      • ruaok
        for super fast retrieval from google cloud, and archival on glacier.
      • bai!
      • zas
        About the number of web servers... i did some rough calculations that i would like to redo with you
      • ruaok
        ok.
      • the good thing about the vrack is that we can start smaller and expand when we want to.
      • ruaok thinks scalability
      • search, web servers and postgres can scale easily with the new setup.
      • gateways, when done properly will have loads of capacity and will scale nicely on a 2 host setup for quite some time.
      • we also need to think about what traffic we want to serve.
      • I'm not interested in spending a lot of hosting resources on headphones.
      • speaking of headphones, the new rate limiting needs to have per-app limits.
      • zas
        Coffee ready.
      • Ok, based on current incoming reqs
      • ruaok needs to pop out before the old-catalan-lady-brigade ddos'es every available veggie shop in the area
      • we have 1k req/s incoming
      • around 65% hit our web servers
      • and 55-60% of those are rate limited (503s)
      • so we actually return ~270 req/s as 2xx
      • so let's say we want to double that, and convert most 503s into 200s
      • it will double the load on web servers, so if we want to keep the same load with the same hardware it would mean having 2x web servers (we have 5) -> 10
      • but new web servers will be more performant, likely at least 30%, -> 7
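The back-of-the-envelope estimate above can be written out as a short calculation. All inputs are the figures zas quotes (1k req/s, 65% reaching web servers, 55-60% rate limited, 5 servers, a 30% speedup); the function and variable names are hypothetical, and treating a 30% speedup as dividing the server count by 1.3 is an assumption about what was meant.

```python
import math


def estimate_web_servers(incoming_rps=1000.0, web_share=0.65,
                         limited_low=0.55, limited_high=0.60,
                         current_servers=5, load_factor=2.0, speedup=1.3):
    """Reproduce the rough capacity estimate from the discussion."""
    web_rps = incoming_rps * web_share                 # requests reaching web servers
    served_low = web_rps * (1.0 - limited_high)        # 60% rate limited
    served_high = web_rps * (1.0 - limited_low)        # 55% rate limited
    same_hw = current_servers * load_factor            # doubling load on same hardware
    new_hw = same_hw / speedup                         # faster machines need fewer
    return {
        "web_rps": web_rps,
        "served_low": served_low,
        "served_high": served_high,
        "servers_same_hw": same_hw,
        "servers_new_hw": new_hw,
        "servers_new_hw_rounded_up": math.ceil(new_hw),
    }


if __name__ == "__main__":
    for key, value in estimate_web_servers().items():
        print(f"{key}: {value:.1f}")
```

The served range brackets the ~270 req/s quoted. With exactly 30% faster machines the result is a little under 8 servers, so the figure of 7 implies rounding down or a speedup somewhat above 30%.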
      • ruaok
        considerably more, I would think.
      • these machines were given to us in 2010 and were several years old.