deploying consul via ansible would be an improvement; it doesn't have to run on docker, since an instance (client and/or server) needs to run on the nodes anyway. What I suggest is to try an Ansible deployment of a 3-node cluster on the 10.10.10.x network, plus a few client deployments (consul agent)
atj
zas: ok we can look at doing a test deployment. i had a brief scan of the readme and there's a few things that concern me, e.g. "it does not currently concern itself (all that much) with performing ongoing maintenance of a cluster" and the OS support matrix only listing Ubuntu 16.04
yup, dunno (yet) how much those points are impacting us
atj
i'm going to try and merge the borgmatic PR today and migrate the settings from the borg-backup repo
zas
that said, the idea of managing this part of the infrastructure (consul-stuff) with Ansible makes sense to me
ok great, keep me updated
in practice, it means we'll run 2 consul clusters in parallel for a while, new install shouldn't interfere with old docker-based one
new containers will be encouraged to use the new cluster, while we deprecate the old one, slowly moving from 10.2.2.x to 10.10.10.x which will offer more freedom (not being limited to being physically in the same place)
currently, AFAIK, consul network gets changes only from 2 sources: gitzconsul & serviceregistrator. The first one basically converts a part of docker-servers-config files to consul storage, the second registers running containers (mainly for openresty to autoconfigure http forwarding on gateways)
both can be executed along old instances but pointing at new consul cluster instead
then app containers can choose to use one or the other, until we totally remove the old cluster
it should be noted there might be changes at the docker-template level too (so likely rebuilding containers)
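As a rough illustration of the serviceregistrator side of the plan, here is a minimal sketch of registering a container's service against the new cluster via Consul's HTTP API (`/v1/agent/service/register`). The node address, service name, and tag are hypothetical, not taken from the actual config.

```python
import json
import urllib.request

# Hypothetical address of a node in the new Ansible-managed cluster
# (the old docker-based cluster lived on 10.2.2.x).
NEW_CONSUL = "http://10.10.10.1:8500"

def build_registration(name, address, port):
    """Build a payload for Consul's /v1/agent/service/register endpoint."""
    return {
        "Name": name,
        "Address": address,
        "Port": port,
        # Tags could let openresty discover services for http forwarding.
        "Tags": ["http"],
    }

def register(service, consul=NEW_CONSUL):
    """PUT the registration to the agent; needs a reachable consul agent."""
    req = urllib.request.Request(
        f"{consul}/v1/agent/service/register",
        data=json.dumps(service).encode(),
        method="PUT",
        headers={"Content-Type": "application/json"},
    )
    return urllib.request.urlopen(req)

# Build (but don't send) a registration for an example container.
payload = build_registration("myapp", "10.10.10.42", 8080)
print(payload["Name"])  # -> myapp
```

Running both the old and new registrators side by side, each pointed at its own cluster, matches the parallel-clusters transition described above.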
iconoclasthero joined the channel
but the target is to make lucifer happy at this point.
mayhem: In my proposal, should I add code snippets or directly link the closed PR?
mayhem
link is fine
jivte
and an image of the feature could be added too :)
or :|
mayhem
as you wish, really.
jivte
Okk thanks for the help :)
mayhem
np
lucifer
mayhem: looking into the IA stuff, there's an audio collection that seems to encompass all audio files of the archive.
however, 1) not all of it is music. 2) the search is slow and does not seem to be recursive.
mayhem
yea, slow is pretty characteristic of the IA services. sadly.
but I suppose we could just crawl the collection and build a content resolver from it.
lucifer
i think a content resolver like local cache will have to be the way
lol. yup
mayhem nods
jivte has quit
another thing in favor of a local content resolver is that most of the music on LB won't be from IA, so it's not very sensible to make slow queries to IA for each track.
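A local content-resolver cache along those lines could be as simple as an sqlite table filled by the crawl, so track lookups never hit IA at query time. The schema and names below are illustrative guesses, not LB's actual resolver:

```python
import sqlite3

# Minimal local cache: (artist, title) -> IA identifier.
# Schema is illustrative only; a real crawl would populate this.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ia_tracks (artist TEXT, title TEXT, ia_id TEXT, "
    "PRIMARY KEY (artist, title))"
)

def index_item(artist, title, ia_id):
    """Store one crawled IA item in the local cache."""
    conn.execute(
        "INSERT OR REPLACE INTO ia_tracks VALUES (?, ?, ?)",
        (artist.lower(), title.lower(), ia_id),
    )

def resolve(artist, title):
    """Local lookup; returns the IA identifier or None. No IA round-trip."""
    row = conn.execute(
        "SELECT ia_id FROM ia_tracks WHERE artist = ? AND title = ?",
        (artist.lower(), title.lower()),
    ).fetchone()
    return row[0] if row else None

index_item("Artist X", "Track Y", "ia-item-123")
print(resolve("artist x", "track y"))  # -> ia-item-123
```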
mayhem
yep.
jivte joined the channel
jivte
monkey: About that add album feature, could you help in improving some design mockups? :)
monkey
Hi jivte! I don't have time available until the 11th I'm afraid
monkey and I realized that we have another project that we'd need your help on.
lucifer
sure, what is it?
mayhem
you know how the CF results have a last listened timestamp? I need those for all tracks for all users.
lucifer
we have that already stored in spark.
mayhem
not just CF tracks -- so in artist radio I need to know if the user has listened to a track recently.
Khagan has quit
lucifer
where do you need it, and how often should it be updated?
mayhem
for *all* tracks and *all* users ?
lucifer
yup
CF joins with that data to add it to recs.
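The join lucifer describes can be sketched in plain Python as a stand-in for what spark does: attach each user's last-listened timestamp to their CF recommendations. The field names (`recording_mbid`, `score`, `latest_listened_at`) are assumptions for illustration:

```python
# Plain-Python stand-in for the spark-side join:
# attach each user's last-listened timestamp to their CF recs.
last_listened = {
    # (user_id, recording_mbid) -> unix timestamp of the last listen
    (1, "mbid-aaa"): 1700000000,
}

def annotate_recs(user_id, recs):
    """Add latest_listened_at to each rec; None if never listened."""
    return [
        {**rec,
         "latest_listened_at": last_listened.get((user_id, rec["recording_mbid"]))}
        for rec in recs
    ]

recs = annotate_recs(1, [
    {"recording_mbid": "mbid-aaa", "score": 0.9},
    {"recording_mbid": "mbid-bbb", "score": 0.7},
])
print(recs[0]["latest_listened_at"])  # -> 1700000000
print(recs[1]["latest_listened_at"])  # -> None
```

Extending the same join to carry a playcount alongside the timestamp (as monkey wants) is just one more value per key.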
mayhem
ha great!
well, so troi needs it.
which means that when we move this feature over to spark, that will be easy.
lucifer
yup indeed
mayhem
monkey also needs it and he would like to show play count of a given track and when the user last listened to it.
I suppose adding playcounts isn't too hard in this case, right?
lucifer
yup should be easy.
so need to store in couchdb i think.
mayhem
it seems like it is another case of needing to take data from spark and move it to PG then API it.
lucifer
updated weekly?
mayhem
not good enough.
lucifer
yeah i guess PG would be better in this case.
mayhem
really, i am realizing, we need a dataset hoster for HDFS
lucifer
so that we can query both ways by user and by artist.
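"Query both ways" here could mean maintaining two indexes over the same listen records, so lookups by user and by artist are both cheap regardless of which store (PG or a dataset hoster) ends up holding them. An in-memory sketch, with made-up field names:

```python
from collections import defaultdict

# Two indexes over the same listen records so we can query both ways,
# by user and by artist. All names are illustrative.
by_user = defaultdict(list)
by_artist = defaultdict(list)

def add_listen(user_id, artist_mbid, recording_mbid, listened_at):
    """Insert one listen into both indexes."""
    record = {
        "user_id": user_id,
        "artist_mbid": artist_mbid,
        "recording_mbid": recording_mbid,
        "listened_at": listened_at,
    }
    by_user[user_id].append(record)
    by_artist[artist_mbid].append(record)

add_listen(1, "artist-a", "rec-1", 1700000000)
add_listen(2, "artist-a", "rec-2", 1700000100)

print(len(by_artist["artist-a"]))  # -> 2
print(len(by_user[1]))             # -> 1
```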
mayhem
I really dislike taking data from spark moving it to PG just to serve it.
lucifer
that would mean exposing spark cluster to web though.
mayhem
is there any way that we can make APIs on top of spark? seems no, because of a mismatch in approaches.
lucifer
i think we can but i don't think it would be fast enough.
mayhem
lets ignore that problem for a second.
lucifer
there are other techs built on top of spark for such querying i think
mayhem
yeah. seems unlikely to work well.
oh? that might be worth exploring.
lucifer
yeah many even support realtime querying and stuff.
we could get listens in realtime through rmq to spark
mayhem
I think this is really worth exploring.
lucifer
sounds good
mayhem
I mean, if we could leave the stats data in HDFS and not have to move it around, that would be great.
well, true of all of the datasets we shovel back and forth.
lucifer
indeed indeed
arsh
Hi mayhem: I hope you had a chance to look at the document I sent. Do you have any thoughts or ideas that you like that I could build on? Thanks
mayhem
yes, I do.
let me finish an email and get back to you.
arsh
Sure
Shelly joined the channel
mayhem
hey arsh!
arsh
hello
mayhem
so, the first bit of feedback is that we dont have access to artist images.
we used to but then some asshole sued us and ruined the party. very long story.
Shelly
I am currently on the master branch and running the listenbrainz-server locally, but there's an sql error: "psycopg2.errors.UndefinedColumn: column "external_user_id" does not exist
listenbrainz-web-1 | LINE 6: , external_user_id
"
arsh
oh i see
mayhem
so, your designs need to use no images.
and I looked at your mock-ups and idea 3 jumps out at me.
Shelly
Can someone please look into it? Because when i git pull from master, my current branch broke.