ok i figured out that for some reason yarn is trying to connect to marlon instead of worker-marlon. i think that might be the case if it tries the public ip instead of the internal ip.
2021-04-15 10558, 2021
_lucifer
interesting, the hdfs cli works as expected though, it reports the internal ips and worker-*
2021-04-15 10558, 2021
_lucifer
ok. the ip thing is fixed now but it now fails with user application exited with exit code 1.
2021-04-15 10535, 2021
alastairp
CatQuest: thanks! blank spaces is OK, the only problem we might have is an empty string. Anything that is actually represented by characters is fine 👍
2021-04-15 10511, 2021
sumedh joined the channel
2021-04-15 10532, 2021
reosarevok
bitmap, yvanzo: for MBS-1658, I was thinking at least one of the places to add a comment to the entry should be from the list itself
bitmap, yvanzo: So that last column should have like an edit icon somehow, and I guess allow inline editing that would get sent to the DB? Do you know if we do anything like that anywhere else?
2021-04-15 10503, 2021
reosarevok
Or if you think that's a bad idea, how would you do it?
2021-04-15 10508, 2021
alastairp
CatQuest: hah, nice
2021-04-15 10521, 2021
alastairp
that's not a problem either though, but I can see how it could be a problem
Mr_Monkey, alastairp: so after more experimenting last night, I'm able to get rid of the min/max ts cont aggs by simply creating a 5 day cont agg with a compound index on user/listened_at. That's 18M rows less for starters.
2021-04-15 10517, 2021
ruaok
and I think we can replace those with month and year cont aggs for the graphs you two would like.
2021-04-15 10519, 2021
alastairp
right. get all data from the same table?
2021-04-15 10541, 2021
ruaok
yeah, it was already there. just the index was missing to make it faster.
2021-04-15 10543, 2021
alastairp
sweet, if a month and year aggregate is possible then that sounds like it should be perfect
2021-04-15 10544, 2021
alastairp
great
2021-04-15 10551, 2021
ruaok
basically we swapped doing a table scan on the DB with an index scan. not sure we can do much better than that -- but with increased cache times, this should work well.
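A minimal sketch of the table scan versus index scan point, for anyone reading along. It assumes SQLite as a stand-in for the real database (which is not SQLite), and the table name `listen`, the column `user_name` and the index name `listen_user_ts` are hypothetical. It only shows how a compound index on user/listened_at changes the query plan for a min/max timestamp lookup.

```python
import sqlite3

# SQLite is a stand-in here; table, column and index names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE listen (user_name TEXT, listened_at INTEGER)")
conn.executemany(
    "INSERT INTO listen VALUES (?, ?)",
    [(f"user{i % 50}", 1_600_000_000 + i) for i in range(5000)],
)

QUERY = "SELECT min(listened_at), max(listened_at) FROM listen WHERE user_name = ?"


def plan(connection, sql, params):
    """Return the EXPLAIN QUERY PLAN detail text for a query."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)


# Without an index the planner has to scan the whole table.
plan_before = plan(conn, QUERY, ("user7",))

# Compound index on (user, listened_at), as described in the chat.
conn.execute("CREATE INDEX listen_user_ts ON listen (user_name, listened_at)")

# With the index the planner can search the index instead.
plan_after = plan(conn, QUERY, ("user7",))

print("before:", plan_before)
print("after: ", plan_after)
```

The exact plan wording differs between SQLite versions, but the first plan should be a scan of the table and the second a search using the index. The query result is the same either way; only the access path changes.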
2021-04-15 10517, 2021
alastairp
_lucifer: ^ remember how I told you to add indexes to tables where you want to select some data?
2021-04-15 10521, 2021
_lucifer
yup, i'll keep it in mind :D
2021-04-15 10555, 2021
ruaok
it's the rookie mistake that keeps on giving. #going25yearsstrong
so, let's do all the loading of data (mapping, incrementals) then we can fire off some jobs.
2021-04-15 10555, 2021
ruaok
that makes sense.
2021-04-15 10559, 2021
ruaok
very very good.
2021-04-15 10537, 2021
_lucifer
two things left to do. one is define memory defaults and second is update the new configuration in syswiki.
2021-04-15 10508, 2021
_lucifer
monitoring this cluster is easier than the docker one. one tunnel is sufficient
2021-04-15 10512, 2021
ruaok
that was exactly the goal.
2021-04-15 10529, 2021
ruaok
and each server is being monitored by all of zas' magic.
2021-04-15 10548, 2021
_lucifer
:D
2021-04-15 10519, 2021
_lucifer
alastairp: available to talk about the GH actions PR?
2021-04-15 10522, 2021
_lucifer
ruaok, zas: do you know if any service we run on j5 might listen on port 5666?
2021-04-15 10548, 2021
zas
nagios
2021-04-15 10554, 2021
zas
well, its agent
2021-04-15 10544, 2021
_lucifer
👍 thanks!
2021-04-15 10537, 2021
sumedh has quit
2021-04-15 10518, 2021
sumedh joined the channel
2021-04-15 10519, 2021
_lucifer
zas, i saw a commit in syswiki renaming germaine to jermaine so wanted to let you know that i noticed /etc/hosts on jermaine still has a couple of entries referring to germaine.
bitmap, yvanzo ^ would really appreciate some feedback on whether the way I'm approaching this seems sensible, improvements etc, before I keep working on other lists
Hello everyone. I would like to ask about the Lucene Search syntax of the musicbrainz database. What kind of instance is it running on? If I would like to have a musicbrainz database (mbdata) with an ElasticSearch instance, how do you recommend integrating the two? I just started looking into ElasticSearch, but what I found out is that I would need some kind of
2021-04-15 10518, 2021
scory
data set to import into ElasticSearch (*.json for example). Do you have some kind of method to import the musicbrainz database into a Lucene Search instance, a data set I can use, or should I generate it myself? How do you keep it updated?
2021-04-15 10547, 2021
ruaok
hi scory!
2021-04-15 10553, 2021
ruaok
why must it be elasticsearch?
2021-04-15 10528, 2021
ruaok
because we have a perfectly working search infrastructure that you can use without having to reinvent the wheel.
2021-04-15 10537, 2021
scory
That infrastructure currently doesn't support what I need, last time I was here that was the conclusion for me. That's the reason I am currently running a mbdata server locally, and can run graphql queries against it, with batching. But I would like to implement an ElasticSearch instance on graphql. But currently i am just investigating.
reosarevok: I can't think of anything else like that offhand, but doesn't seem like a bad idea. we could add a small endpoint to /ws/js for it
2021-04-15 10505, 2021
sumedh joined the channel
2021-04-15 10526, 2021
vardan has quit
2021-04-15 10558, 2021
adhi001 joined the channel
2021-04-15 10558, 2021
adhi001
Sorry ruaok , I was sick the last week and was not able to submit a proposal for GSoC. Still part of the community :)
2021-04-15 10520, 2021
ruaok
oh, bummer. that sucks. at least you're better, right?
2021-04-15 10531, 2021
adhi001
yeah
2021-04-15 10508, 2021
adhi001
Thank you for your concern
2021-04-15 10542, 2021
alastairp
_lucifer: hi, sorry - had a hectic day. still around?
2021-04-15 10553, 2021
_lucifer
alastairp: hi! no worries. yup, i am available.
2021-04-15 10506, 2021
alastairp
so I was suggesting using test.sh in the actions?
2021-04-15 10516, 2021
_lucifer
yes
2021-04-15 10542, 2021
alastairp
so we already have things like `./test.sh -b` to build, and `test.sh -u` to bring up containers
2021-04-15 10548, 2021
alastairp
test.sh fe to run frontend tests
2021-04-15 10527, 2021
_lucifer
yes there's also test.sh spark
2021-04-15 10549, 2021
alastairp
great, so it sounds like it's probably a good fit that we can use the actions files for specifying the order in which to run things, but reusing test.sh for the actual commands allows us to share 100% test code between local development and CI, right?
2021-04-15 10505, 2021
_lucifer
should we use separate build steps? like there's a ./test.sh -u to just bring up supporting containers. or should we just do ./test.sh which does it in one go
2021-04-15 10500, 2021
alastairp
but we need to separate pull / cache / build / run, in CI, right?
2021-04-15 10506, 2021
_lucifer
yes, mostly. we'll still have to pull manually
2021-04-15 10527, 2021
_lucifer
that step won't change
2021-04-15 10558, 2021
alastairp
one question - if a test generates files (e.g. the junit xml), will it be cached? Or does the cache action only cache docker layers?
2021-04-15 10539, 2021
_lucifer
i'll need to check that.
2021-04-15 10515, 2021
_lucifer
i expect docker layers only but we can confirm it by generating some files and looking at the actions output
2021-04-15 10536, 2021
alastairp
it seems that satackey/action-docker-layer-caching works explicitly on layers (looking at the output, it uses docker commands to generate the archives)
2021-04-15 10505, 2021
alastairp
so yeah, ./test.sh pull; restore cache; run test.sh; save cache
the junit action works fine but as i mentioned it might comment excessively
2021-04-15 10513, 2021
alastairp
neat. did you see what happens if one fails? does it only update the comment or does it also add an annotation to the failing test?
2021-04-15 10518, 2021
alastairp
and it'll make a new comment on every push (i.e. even if they all pass?)
2021-04-15 10519, 2021
_lucifer
no i haven't, let me do that right now.
2021-04-15 10524, 2021
_lucifer
yes
2021-04-15 10525, 2021
alastairp
or only if the results of the test run change?
2021-04-15 10529, 2021
alastairp
interesting
2021-04-15 10535, 2021
alastairp
you're right that this could get a bit annoying
2021-04-15 10543, 2021
_lucifer
it'll hide the existing one and add a new one
2021-04-15 10502, 2021
_lucifer
for LB there are going to be 4 comments on each push
2021-04-15 10538, 2021
alastairp
oh, that's quite annoying. merging tests together would help (e.g. get it down to 2), but I suspect that this might be too much
2021-04-15 10551, 2021
alastairp
one thing that ruaok was suggesting back on jenkins was that it seems stupid to run tests on _every_ push, perhaps there could be a way to run them less often. once a day? on request based on a comment? just before merge?
2021-04-15 10520, 2021
ruaok
anything, really.
2021-04-15 10533, 2021
alastairp
let's not spend too much more time on this, but perhaps there is an action or a flag for `on:` that lets us decide to run them less often