and the first graphic there suggest what our flow should be, I suppose.
2018-08-09 22153, 2018
ruaok
so, asking in apache-spark for recommendations on how to set up a brand new cluster for our needs would be really good.
2018-08-09 22136, 2018
ruaok
and, I am thinking that I want to postpone further policy work until October -- in sept I want to actually do technical work and move this stuff along.
2018-08-09 22157, 2018
ruaok
maybe we can do a sprint-like thing and move LB along a few notches in sept.
2018-08-09 22115, 2018
iliekcomputers
yes, please.
2018-08-09 22150, 2018
iliekcomputers
I'll ask on the spark channel and see what they say.
2018-08-09 22121, 2018
ruaok goes to lurk in their channel
2018-08-09 22127, 2018
iliekcomputers
ok
2018-08-09 22131, 2018
iliekcomputers
About AB, I was planning to create a full dump using `manage.py`, import it into frank, then during downtime create an incremental dump and import it. There are a few private tables (user, api_key, etc.) which are small enough to dump manually during downtime, I would guess.
2018-08-09 22154, 2018
demonimin has quit
2018-08-09 22159, 2018
iliekcomputers
But the thing is that full dumps don't seem to add entries into the incremental dump table at all. I'm not sure if that is intentional or not.
2018-08-09 22144, 2018
iliekcomputers
seems like incremental dumps are supposed to be a different series (1, 2, 3)?
2018-08-09 22117, 2018
iliekcomputers
I ran an incremental dump on spike a few days ago, but it got stuck with no logs. So I wanted to get some context before jumping into the dumps code again.
2018-08-09 22157, 2018
ruaok
let me look at the AB schema
2018-08-09 22138, 2018
demonimin joined the channel
2018-08-09 22109, 2018
ruaok
what is the dump naming strategy used right now? incremental dump strategy?
2018-08-09 22158, 2018
iliekcomputers
there's the full dump and json dumps.
2018-08-09 22122, 2018
iliekcomputers
and then there's acousticbrainz-incr-1, acousticbrainz-incr-2