and the first graphic there suggests what our flow should be, I suppose.
so, asking in apache-spark for recommendations on how to set up a brand new cluster for our needs would be really good.
and, I am thinking that I want to postpone further policy work until October -- in September I want to actually do technical work and move this stuff along.
maybe we can do a sprint-like thing and move LB along a few notches in September.
iliekcomputers
yes, please.
I'll ask on the spark channel and see what they say.
ruaok goes to lurk in their channel
ok
About AB, I was planning to create a full dump using `manage.py`, import it into frank, then during downtime create an incremental dump and import it. There are a few private tables (user, api_key, etc.) which are small enough to dump manually during downtime, I would guess.
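(An aside on the plan above: a rough sketch of those migration steps in Python. The `manage.py` subcommand names and flags here are placeholders and not the actual AB CLI; the pg_dump/psql line is just one way the private tables could be copied by hand.)

```python
import subprocess

# Placeholder sketch of the migration plan described above. The manage.py
# subcommands are hypothetical stand-ins; "frank" is the target host from
# the conversation.

def run(cmd):
    """Run a shell command, raising if it exits non-zero."""
    subprocess.run(cmd, shell=True, check=True)

# 1. Full dump on the current host, imported into frank ahead of the downtime.
run("python manage.py dump full /tmp/ab-full")                # hypothetical command
run("ssh frank 'python manage.py import_dump /tmp/ab-full'")  # hypothetical command

# 2. During downtime: incremental dump of everything written since the full
#    dump, imported on top of it.
run("python manage.py dump incremental /tmp/ab-incr")         # hypothetical command
run("ssh frank 'python manage.py import_dump /tmp/ab-incr'")  # hypothetical command

# 3. Private tables (user, api_key, ...) are small, so copy them by hand.
run("pg_dump -t user -t api_key acousticbrainz | ssh frank 'psql acousticbrainz'")
```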
demonimin has quit
But the thing is that full dumps don't seem to add entries into the incremental dump table at all. I'm not sure if that is intentional or not.
seems like incremental dumps are supposed to be a different series (1, 2, 3)?
I ran an incremental dump on spike a few days ago, but it got stuck with no logs. So I wanted to get some context before jumping into the dumps code again.
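(A guess at what is going on with the two series: if the incremental dump ids come from their own table and the full-dump code path never inserts a row into it, the behaviour described above is expected. The table and column names below are assumptions based on the conversation, not read from the AB schema.)

```python
import psycopg2

def next_incremental_dump_id(conn):
    """Incremental dumps take ids 1, 2, 3, ... from their own table, so they
    form a separate series; if the full-dump path never inserts into this
    table, full dumps never appear in (or advance) that series."""
    with conn.cursor() as cur:
        # Table/column names are assumed for illustration.
        cur.execute("SELECT COALESCE(MAX(id), 0) + 1 FROM incremental_dumps")
        return cur.fetchone()[0]

conn = psycopg2.connect(dbname="acousticbrainz")  # connection details assumed
print(next_incremental_dump_id(conn))
```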
ruaok
let me look at the AB schema
demonimin joined the channel
what is the dump naming strategy used right now? incremental dump strategy?
iliekcomputers
there's the full dump and json dumps.
and then there's acousticbrainz-incr-1, acousticbrainz-incr-2
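(For illustration, the incremental series appears to be just a counter appended to a fixed prefix, separate from the full and json dumps; the exact format below is an assumption based on the names mentioned, not taken from the AB dump code.)

```python
def incremental_dump_name(dump_id: int) -> str:
    # e.g. acousticbrainz-incr-1, acousticbrainz-incr-2, ...
    return f"acousticbrainz-incr-{dump_id}"

assert incremental_dump_name(2) == "acousticbrainz-incr-2"
```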