lucifer: damselfish didn't get daily jams today. :(
zas
response time seems to degrade, but I'm not sure why yet (this is the response time as measured by openresty and stored in the logs). We need to investigate this before moving more services to the new gateways. For now, I have no explanation.
hmmm, something wrong with data representation
yup, I know what's happening (more or less). To get correct values (well, almost, since that's a mean of means), be sure to select only rex+rudi in the host selector
when comparing this way, no significant difference (and that's expected)
(well, roughly, because the DNS TTL is 5 minutes, so the traffic switch is progressive)
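(To make the mean-of-means point concrete, a small Python sketch with made-up per-host numbers; only the host names match the discussion above.)

```python
# Hypothetical per-host request counts and mean response times (ms),
# illustrating why averaging per-host means is misleading when one
# host (kiki) handles far more traffic than the others.
hosts = {
    "kiki": {"requests": 90_000, "mean_ms": 40.0},
    "rex":  {"requests": 3_000,  "mean_ms": 25.0},
    "rudi": {"requests": 3_000,  "mean_ms": 25.0},
}

# Naive "mean of means": every host counts equally, regardless of traffic.
mean_of_means = sum(h["mean_ms"] for h in hosts.values()) / len(hosts)

# Request-weighted mean: what an individual request actually experienced.
total_requests = sum(h["requests"] for h in hosts.values())
weighted_mean = sum(h["requests"] * h["mean_ms"] for h in hosts.values()) / total_requests

print(f"mean of means: {mean_of_means:.1f} ms")   # 30.0 ms
print(f"weighted mean: {weighted_mean:.1f} ms")   # ~39.1 ms
```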
atj
OK, so mean request time on kiki was ~40ms, and on rex/rudi it seems to have settled at ~25ms
but kiki was handling much more traffic
zas
yes
traffic is very low on the switched websites; we'll have better figures when we switch coverartarchive (and mb ofc)
atj
will be interesting to see given we have an extra layer on the new gateways
zas
according to my measurements, the extra layers should have minimal impact: we have the load balancer, then haproxy, then openresty, and they use the PROXY protocol (which means an extra step and a lower MTU)
but all this is very fast (a few ms)
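(A rough sketch of the hand-off zas describes, assuming haproxy forwards to openresty with the PROXY protocol so the client IP survives the extra hop; addresses, ports and file layout are made up.)

```
# haproxy.cfg (hypothetical): forward to openresty and announce the
# original client address via PROXY protocol v2
backend openresty_http
    server openresty 127.0.0.1:8080 send-proxy-v2

# openresty/nginx.conf (hypothetical): accept PROXY protocol and
# restore the real client IP for logging/rate limiting
server {
    listen 8080 proxy_protocol;
    set_real_ip_from 127.0.0.1;      # trust the PROXY header only from haproxy
    real_ip_header proxy_protocol;

    location / {
        proxy_pass http://127.0.0.1:9000;   # hypothetical app backend
    }
}
```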
atj
and the servers are faster
zas
yes, and much more scalable
lucifer
mayhem: i see why that happened. jukevox, alastair etc. also didn't get daily jams for the same reason 🤦
zas
on https traffic (which is CPU intensive) the new setup should be much better overall
One thing still missing is moving keydb to rex/rudi (they still use the kiki/herb instances)
atj
can we start putting docker volumes in /srv or something going forward?
zas
yes, especially openresty
atj
too many containers with volumes in /home/zas ;)
zas
yup, I agree
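(A minimal sketch of what volumes under /srv could look like in a compose file; the service, image and paths are illustrative only.)

```yaml
# docker-compose.yml (hypothetical service, illustrative paths only)
services:
  openresty:
    image: openresty/openresty:latest
    volumes:
      # bind mounts under /srv instead of a user's home directory
      - /srv/openresty/conf:/usr/local/openresty/nginx/conf:ro
      - /srv/openresty/logs:/usr/local/openresty/nginx/logs
```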
Lotheric has quit
Lotheric joined the channel
I'll not switch more domains until we set up the missing parts; the idea here was to get some real traffic in order to detect potential issues
atj
it's a good idea to do it this way
I'm not sure about minio, stuff like this gives me pause: "MinIO strongly recommends production clusters consist of a minimum of 4 minio server nodes in a Server Pool for proper high availability and durability guarantees."
but then the alternatives look worse
building a proper MinIO cluster is going to cost €€€
mayhem
alastairp: monkey: can you please reply to the last.fm rec/UX doodle soon?
monkey
Yus
zas
atj: yes, I'm not sure either
monkey
I tried opening it yesterday but it wasn't working for me, will try again
lucifer
mayhem: i am around now as well to discuss similar recordings.
mayhem
I was playing with the similarities last night and noticed a few things.
First off, the 7 day data is pretty bad.
lucifer
i see
mayhem
which makes little sense, because the first dataset you had a few days ago was even smaller and it was... kinda good.
so, I can't explain that. might've been a fluke.
next, I had an insight, but that insight hasn't been tested yet.
namely, that if we train on too much data, our data set tends towards noise.
I don't know if that is actually true.
lucifer
we can train on 180-365 days to test that?
mayhem
so, I would like to train on these data sets: 30d, 90d, 180d, 1 year, 3 year, 10 year, and all time.
lucifer
sounds good
but we'll have to use appropriate thresholds to keep lookups fast
mayhem
and then my goal is to pick a number of tracks as seed tracks and study them.
yes, that is a good point. can we make the thresholds relative to the amount of data?
so that it picks itself in a way?
lucifer
hmm not sure. at max 100 or 200 similar recordings for each track maybe?
mayhem
do you do any pruning of the data while you are generating it or is the pruning one of the last steps?
lucifer
last step.
mayhem
then yes, let's say we pick max 200 tracks for the next round.
and then we remove the threshold concept?
well, swap these concepts?
I think total count is a better way of doing this.
lucifer
i am thinking of using both: 200 max, but only if the track meets the threshold, say 5 or 10.
mayhem
can we please try without threshold first?
I would like to see what that is like. it might work better.
because right now I want to see what the long tail for some of this data is.
lucifer
sure. can try.
so 200 tracks per recording without threshold?
mayhem
yes.
lucifer
👍
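(A rough plain-Python sketch of the agreed pruning step, keeping the top 200 similar recordings per recording with no count threshold; the data layout is hypothetical and the real pipeline runs in Spark.)

```python
from collections import defaultdict

def prune_similar_recordings(pairs, max_per_recording=200, min_threshold=None):
    """Keep at most `max_per_recording` similar recordings per recording.

    `pairs` is assumed to be an iterable of (recording_a, recording_b, count)
    tuples from the similarity computation. `min_threshold` is optional; per
    the discussion above we try None first (no threshold) and rely purely on
    the per-recording cap.
    """
    by_recording = defaultdict(list)
    for rec_a, rec_b, count in pairs:
        if min_threshold is not None and count < min_threshold:
            continue
        by_recording[rec_a].append((rec_b, count))

    pruned = {}
    for rec, similars in by_recording.items():
        # keep the highest-count neighbours, drop the long tail past the cap
        similars.sort(key=lambda x: x[1], reverse=True)
        pruned[rec] = similars[:max_per_recording]
    return pruned
```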
mayhem
great.
another q: are hated tracks filtered post CF data generation?
lucifer
nope currently we don't do anything with loved/hated tracks
mayhem
but that would be the right place to do this, no?
lucifer
yes right.
mayhem
let me put that on your todo list as something to do when you have some time.
at least the hated bit -- that doesn't involve another round of CF tuning.
lucifer
where should we do it? spark, LB ingest, LB query time?
3 degrees of responsiveness
mayhem
I was thinking post CF generation. take all the generated tracks from CF and remove all the ones that appear in the user's hated list.
but I am not married to that approach.
I think a more comprehensive approach is to tune the inputs to CF so that CF knows all about this.
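(A minimal sketch of the post-CF filtering idea, assuming recommendations carry a recording_mbid and the user's hated recordings are available as a set of MBIDs; names are illustrative, not the actual LB code.)

```python
def filter_hated_tracks(recommendations, hated_recording_mbids):
    """Drop CF-recommended recordings the user has marked as hated.

    `recommendations` is assumed to be a list of dicts with a
    'recording_mbid' key; `hated_recording_mbids` is a set of MBIDs taken
    from the user's feedback (score == -1 in LB's love/hate feedback).
    """
    hated = set(hated_recording_mbids)
    return [r for r in recommendations if r["recording_mbid"] not in hated]
```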
lucifer
yes makes sense, let's start there and see where to go from there.
mayhem
but I am hesitant to get into a round of CF filtering right now. it seems to be working and I feel like focusing on other things.
great!
alastairp
mayhem: done. I put a maybe on tues because it's a holiday here, not sure what I'll be doing
mayhem
seems that we're leaning towards thursday anyway.
thanks
yep, thursday it is. will send meeting invite later.
alastairp
ok
mayhem
alastairp: lucifer: right now the cover art stuff is in a separate repo (initial goals were unclear) and is being served from a different URL.
these endpoints might get a lot of traffic in the future -- not sure.
should this code be integrated into the LB codebase and simply be made a new endpoint or two, or should we start a new docker container?
integrate now, move later?
zas: atj_mb: you two about for this question?
alastairp
I think integrate now/move later is better
is the python code fast?
lucifer
sounds good to put it in LB for now. when we want to publicize it or show it on lb.org etc., move it
mayhem
it's a question about future growth and I wonder how the new gateways impact this.
alastairp
keeping it in the same codebase but starting a new container for it should be straightforward too
mayhem
alastairp: there isn't much slow stuff in the code, but it does need to make requests to the DB to resolve IDs.
and with the new gateways it should be easy to split off one endpoint and have it go to a new server, or?
I'll proceed with a PR into the codebase for now. loads easier.
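(A sketch of what "integrate now" could look like, assuming the cover art code is mounted as a Flask blueprint inside the LB webserver; the route and names are hypothetical.)

```python
from flask import Blueprint, jsonify

# Hypothetical blueprint; the real routes would come from the cover art code.
cover_art_bp = Blueprint("cover_art", __name__)

@cover_art_bp.route("/1/cover-art/grid/<uuid:release_group_mbid>")
def cover_art_grid(release_group_mbid):
    # Resolve IDs via the DB (the only potentially slow part mentioned above),
    # then build and return the grid. Placeholder response shown here.
    return jsonify({"release_group_mbid": str(release_group_mbid)})

# In the app factory:
#     app.register_blueprint(cover_art_bp)
# Moving it to its own container later only changes where the blueprint
# (or a small app wrapping it) is deployed, not the endpoint URLs.
```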
lucifer
mayhem, i created daily jams now for all users who didn't get jams yesterday.
mayhem
thank you!
atj
o/
"cover art stuff" being the code to create the grids?
alastairp
yeah, we already send specific URLs/domains to different containers in the gateways (like websockets)
Hellow1 joined the channel
mayhem
atj: yes.
atj: this is a general traffic growth question.
does it ever make sense for us to move heavy-traffic things to a separate sub-domain, or should we always assume that we can do per-URL routing to back-ends?
atj
right, well the new gateways should be able to handle a lot more traffic; however, I don't think the gateways are generally the bottleneck
mayhem
can we?
agreed -- this convo is more about the user's perspective and not having to move/redirect to new subdomains when traffic grows.
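(To illustrate the per-URL routing being discussed, a hypothetical openresty/nginx snippet that peels one path off to its own backend while the same domain keeps serving everything else; upstreams, addresses and paths are made up, and TLS termination is omitted from this sketch.)

```nginx
# Hypothetical upstreams; only the routing pattern matters here.
upstream lb_web    { server 10.0.0.10:8000; }
upstream cover_art { server 10.0.0.20:8000; }

server {
    listen 80;
    server_name listenbrainz.org;

    # Heavy-traffic endpoint routed to its own backend, same domain.
    location /1/cover-art/ {
        proxy_pass http://cover_art;
    }

    # Everything else goes to the main app.
    location / {
        proxy_pass http://lb_web;
    }
}
```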