do you know what's happening / what the best way to fix this problem is?
2020-02-20 05149, 2020
ZaphodBeeblebrox
I've tried newer itunes. but they're all shit. and this one at least just plays whatever i want and stuff
2020-02-20 05123, 2020
ZaphodBeeblebrox
hmmm 🤔 one idea is to also get QuodLibet into this
2020-02-20 05132, 2020
ZaphodBeeblebrox
a plugin for QuodLibet would be acebeams
2020-02-20 05104, 2020
ZaphodBeeblebrox
(then I just need to figure out how to shuffle a playlist on album not song and i'm ready to switch :D)
2020-02-20 05121, 2020
BrainzGit
[listenbrainz-server] paramsingh merged pull request #736 (master…param/mail-on-dump-cron-job): Send a notification email when new data dumps are created https://github.com/metabrainz/listenbrainz-server…
2020-02-20 05127, 2020
ruaok
iliekcomputers: timescale is doing 20k inserts/s loading from the dump
2020-02-20 05152, 2020
iliekcomputers
how much did influx do?
2020-02-20 05106, 2020
iliekcomputers
how are you importing?
2020-02-20 05140, 2020
ruaok
I forget what it was doing.
2020-02-20 05117, 2020
ruaok
I just wrote a script that loads the data into a PG table -- I just added one extra call to create_hypertable(), otherwise it is bog-standard stuff.
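(Editor's sketch, not ruaok's actual script.) A minimal illustration of "bog-standard PG plus one create_hypertable() call", assuming a hypothetical three-column `listen` table and any DB-API cursor that uses `%s` placeholders (psycopg2 does). The table layout and the `load_listens` helper are made up for illustration; only `create_hypertable()` comes from the conversation.

```python
# Hypothetical schema: the real listen table may look quite different.
CREATE_TABLE = (
    "CREATE TABLE IF NOT EXISTS listen ("
    " listened_at TIMESTAMPTZ NOT NULL,"
    " user_name TEXT NOT NULL,"
    " data JSONB NOT NULL)"
)

# The one TimescaleDB-specific statement: turn the plain table into a
# hypertable partitioned on its time column. Run it once per table.
MAKE_HYPERTABLE = "SELECT create_hypertable('listen', 'listened_at')"

# Plain parameterized insert, nothing Timescale-specific about it.
INSERT_LISTEN = "INSERT INTO listen VALUES (%s, %s, %s)"


def load_listens(cursor, rows):
    """Create the table, make it a hypertable, then bulk-insert rows.

    `cursor` is any DB-API cursor; `rows` is a list of
    (listened_at, user_name, json_text) tuples.
    """
    cursor.execute(CREATE_TABLE)
    cursor.execute(MAKE_HYPERTABLE)
    cursor.executemany(INSERT_LISTEN, rows)
    return len(rows)
```

Everything after the `create_hypertable()` call is ordinary Postgres, which is the point being made here.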
2020-02-20 05123, 2020
iliekcomputers
woah
2020-02-20 05128, 2020
iliekcomputers
that's pretty nice
2020-02-20 05143, 2020
amCap1712
ZaphodBeeblebrox: it seems to be written in objective-c which apple abandoned a few years ago. it may work but needs a major overhaul
2020-02-20 05146, 2020
ruaok
yeah, and all of PG's functionality is available to us.
2020-02-20 05105, 2020
iliekcomputers
can i look at the code?
2020-02-20 05151, 2020
iliekcomputers
k, we should get emails when dumps are created now.
i've roped in ferbncode to look into using protobuf instead of json in our queues
2020-02-20 05104, 2020
iliekcomputers
ferbncode: 👋🏽
2020-02-20 05110, 2020
ruaok
nice!
2020-02-20 05134, 2020
iliekcomputers
>"INSERT INTO listen VALUES %s"
2020-02-20 05139, 2020
iliekcomputers
👏🏽 👏🏽
2020-02-20 05115, 2020
ruaok
using timescale is ridiculously easy if you are used to PG.
2020-02-20 05156, 2020
ruaok
I think I am going to re-write this to load from influx.
2020-02-20 05106, 2020
ruaok
our dumps don't seem to include inserted_at timestamps.
2020-02-20 05114, 2020
ruaok
which I really want to migrate
2020-02-20 05144, 2020
iliekcomputers
>our dumps don't seem to include inserted_at timestamps
2020-02-20 05105, 2020
iliekcomputers
add to trello?
2020-02-20 05105, 2020
iliekcomputers
sounds like a good first bug
2020-02-20 05131, 2020
ruaok
jira if anything.
2020-02-20 05143, 2020
ruaok
though I am not 100% sure on that. did we backfill those?
2020-02-20 05149, 2020
ruaok
I looked at very early data.
2020-02-20 05154, 2020
iliekcomputers
no, there is no way to do that.
2020-02-20 05104, 2020
iliekcomputers
plus influx is a pita anyways
2020-02-20 05111, 2020
ruaok
yeah.
2020-02-20 05124, 2020
iliekcomputers
we started tracking it a long time ago tho
2020-02-20 05142, 2020
ruaok
if this approach looks promising, I will write the migration tool directly from influx.
2020-02-20 05154, 2020
ruaok
will give us a better chance at being consistent.
2020-02-20 05125, 2020
ruaok
because I can start writing new listens to it and THEN do the import of everything in influx.
2020-02-20 05127, 2020
iliekcomputers
the idea is to start shadowing the queue at timestamp x, and retroactively load everything before timestamp x, right?
2020-02-20 05130, 2020
ruaok
when that process is done, we have everything.
2020-02-20 05133, 2020
iliekcomputers
yeah makes sense.
2020-02-20 05151, 2020
ruaok
I wasn't even going to be so picky about X.
2020-02-20 05114, 2020
ruaok
the system is designed to kick out dups, so let it kick out the dups at the tail end.
2020-02-20 05115, 2020
ZaphodBeeblebrox
y?
2020-02-20 05127, 2020
iliekcomputers
yeah i guess that makes sense.
2020-02-20 05145, 2020
iliekcomputers
wait
2020-02-20 05149, 2020
iliekcomputers
but the dups are in the queue
2020-02-20 05155, 2020
iliekcomputers
dup logic*
2020-02-20 05110, 2020
iliekcomputers
are you gonna take from influx and put it in the queue?
2020-02-20 05142, 2020
ruaok
first, and this is what I am going to write right now, is to write a timescale_writer, much like the influx_writer.
2020-02-20 05158, 2020
pristine__
iliekcomputers: what is odd about it? I have no idea tbh
2020-02-20 05101, 2020
ruaok
then add a new queue off the incoming exchange and have it start writing the new listens.
2020-02-20 05121, 2020
iliekcomputers
pristine__: the test is very flaky, it fails randomly and then passes on rebuild
2020-02-20 05128, 2020
iliekcomputers
and it's only that test
2020-02-20 05146, 2020
ruaok
at some point we will need to swap over and have the timescale writer write to the unique rmq, but that would be one of the last things to do.
2020-02-20 05126, 2020
amCap1712
ZaphodBeeblebrox: completed the testing or still on?
2020-02-20 05132, 2020
ruaok
iliekcomputers: does that make sense wrt rmq? add a new queue off the incoming exchange, and then two consumers can consume the data at their own pace, effectively duplicating the data.
2020-02-20 05150, 2020
iliekcomputers
yes. i've done that with the follow feature
2020-02-20 05157, 2020
ZaphodBeeblebrox
amCap1712: ag i'm playing something, i'll test some more
2020-02-20 05103, 2020
ZaphodBeeblebrox
(just need food)
2020-02-20 05108, 2020
iliekcomputers
it follows the unique queue, iirc
2020-02-20 05110, 2020
ruaok
great.
2020-02-20 05120, 2020
amCap1712
yeah sure let me know how it goes ZaphodBeeblebrox
2020-02-20 05126, 2020
iliekcomputers
ruaok: i'm with you until here.
2020-02-20 05135, 2020
iliekcomputers
how are you gonna backfill from influx to timescale
2020-02-20 05151, 2020
ruaok
1. setup timescale
2020-02-20 05102, 2020
ruaok
2. write code to fetch EVERY listen
2020-02-20 05111, 2020
pristine__
iliekcomputers: yeah, I get that but I am not sure why it happens. Will have a look
2020-02-20 05115, 2020
ruaok
3. insert into timescale.
2020-02-20 05119, 2020
ZaphodBeeblebrox
lol this. same Arrepentimientos song goes to two different recordings??
2020-02-20 05133, 2020
ruaok
1.a setup live writing from the incoming queue
2020-02-20 05136, 2020
iliekcomputers
ah
2020-02-20 05139, 2020
ZaphodBeeblebrox
ah one is the borked itunes track-thing
2020-02-20 05147, 2020
ZaphodBeeblebrox
s/itunes/last.fm/
2020-02-20 05105, 2020
ZaphodBeeblebrox
oh dear. it
2020-02-20 05127, 2020
ZaphodBeeblebrox
does apparently scrobble to last.fm even if i turned that damn thing off
2020-02-20 05134, 2020
ZaphodBeeblebrox
and now they're duplicated :D
2020-02-20 05122, 2020
iliekcomputers
ruaok: 1. full import directly from influx 2. start shadowing the queue
2020-02-20 05128, 2020
iliekcomputers
this is what you mean basically right
2020-02-20 05134, 2020
ruaok
no, the other way around.
2020-02-20 05144, 2020
ruaok
start shadowing, then cross load
2020-02-20 05153, 2020
iliekcomputers
1. start shadowing the queue.
2020-02-20 05158, 2020
iliekcomputers
2. full import directly from influx
2020-02-20 05103, 2020
iliekcomputers
that would lead to duplicates
2020-02-20 05116, 2020
iliekcomputers
unless you write the dedup logic into the import process
2020-02-20 05118, 2020
ruaok
postgres won't allow duplicates
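(Editor's sketch.) One way to picture "the database kicks out the dups": a composite primary key plus an insert that skips collisions. This uses Python's built-in sqlite3 as a stand-in so it runs anywhere; `INSERT OR IGNORE` is SQLite syntax, and the rough Postgres equivalent is `INSERT ... ON CONFLICT DO NOTHING` (a plain `INSERT` in Postgres would raise a unique-violation error instead of skipping). Table and column names are hypothetical.

```python
import sqlite3


def insert_listens(conn, listens):
    """Insert (listened_at, user_name, track) rows, skipping key collisions.

    Returns how many rows were actually inserted.
    """
    before = conn.total_changes
    conn.executemany("INSERT OR IGNORE INTO listen VALUES (?, ?, ?)", listens)
    return conn.total_changes - before


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE listen ("
    " listened_at INTEGER NOT NULL,"
    " user_name TEXT NOT NULL,"
    " track TEXT NOT NULL,"
    " PRIMARY KEY (listened_at, user_name))"
)

# Listens arriving live from the shadowed queue.
live = [(1582200000, "alice", "Song A"), (1582200300, "bob", "Song B")]
# Backfill from the old store; the second row overlaps with a live listen.
backfill = [(1582100000, "alice", "Song C"), (1582200000, "alice", "Song A")]

inserted_live = insert_listens(conn, live)
inserted_backfill = insert_listens(conn, backfill)
total = conn.execute("SELECT COUNT(*) FROM listen").fetchone()[0]
```

The overlapping row is dropped by the key rather than by any import-side dedup logic, at the cost of a uniqueness check on every insert.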
2020-02-20 05132, 2020
iliekcomputers
👏🏽
2020-02-20 05137, 2020
ZaphodBeeblebrox
hmm removed everything I can of audioshattler, let's see if now it doesn't
2020-02-20 05146, 2020
ruaok
now, that might make the migration quite slow, not sure.
2020-02-20 05150, 2020
iliekcomputers
yeah
2020-02-20 05100, 2020
ruaok
oh!
2020-02-20 05103, 2020
iliekcomputers
migrations without constraints is what i learned from the AB migration
2020-02-20 05111, 2020
ruaok
right. heh. rabbitmq.
2020-02-20 05112, 2020
iliekcomputers
why not just do the timestamp logic?
2020-02-20 05140, 2020
iliekcomputers
1. start shadowing the queue to insert all listens (starting timestamp x)
2020-02-20 05144, 2020
ruaok
timestamp logic for de-duping? yes, the primary key will be (timestamp, username)
2020-02-20 05154, 2020
ZaphodBeeblebrox
also testing restarting the same song
2020-02-20 05154, 2020
ruaok
oic
2020-02-20 05158, 2020
iliekcomputers
2. directly import from influx all listens inserted to influx before timestamp x
2020-02-20 05137, 2020
ruaok
yeah, given the no constraints bit, that might actually be easier.
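(Editor's sketch.) The two-step timestamp plan, written out in plain Python under the assumption that every listen carries an `inserted_at` value: the live writer keeps only what arrives at or after the cutoff x, and the bulk import keeps only what was inserted before x. Because the two sets cannot overlap, the import can run without uniqueness constraints. The dict layout and the `plan_migration` name are hypothetical.

```python
def plan_migration(influx_listens, queue_listens, cutoff):
    """Split the work at `cutoff` so backfill and live writes never overlap.

    Each listen is a dict with at least an 'inserted_at' integer timestamp.
    Returns (backfill, live): what the bulk import loads and what the
    shadowing writer loads.
    """
    backfill = [l for l in influx_listens if l["inserted_at"] < cutoff]
    live = [l for l in queue_listens if l["inserted_at"] >= cutoff]
    return backfill, live


cutoff = 1000
influx_listens = [
    {"user": "alice", "inserted_at": 900},
    {"user": "bob", "inserted_at": 990},
    {"user": "alice", "inserted_at": 1005},  # also seen on the queue
]
queue_listens = [
    {"user": "alice", "inserted_at": 1005},
    {"user": "carol", "inserted_at": 1010},
]

backfill, live = plan_migration(influx_listens, queue_listens, cutoff)
migrated = backfill + live
```

The listen at 1005 exists in both sources but lands in the new store once, via the live path only.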
2020-02-20 05103, 2020
ruaok
I *really* hope this lives up to its promises.
2020-02-20 05110, 2020
ruaok
if it does, it is the perfect solution for us.
2020-02-20 05124, 2020
iliekcomputers
can't wait to migrate over to the new thing 3 years later
2020-02-20 05143, 2020
iliekcomputers
(kidding)
2020-02-20 05144, 2020
ruaok
well, I've only ever migrated to PG once.
2020-02-20 05158, 2020
ruaok
best decision evar.
2020-02-20 05141, 2020
ZaphodBeeblebrox
amCap1712: I fixed audioscrobbler bork!
2020-02-20 05150, 2020
amCap1712
great
2020-02-20 05157, 2020
iliekcomputers
ruaok: so about shadowing a queue. it's more shadowing an exchange which pushes to two queues iirc
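(Editor's sketch.) A toy in-memory model of an exchange that copies each message to every bound queue, to show why two consumers can read at their own pace. This is not RabbitMQ client code and makes no claim about how the real exchange is configured; the `FanoutExchange` class and the queue names are invented for illustration.

```python
from collections import deque


class FanoutExchange:
    """Toy exchange: every published message is copied to every bound queue."""

    def __init__(self):
        self.queues = {}

    def bind(self, name):
        # Each bound queue gets its own independent buffer.
        self.queues[name] = deque()
        return self.queues[name]

    def publish(self, message):
        for queue in self.queues.values():
            queue.append(message)


exchange = FanoutExchange()
influx_q = exchange.bind("influx_writer")
timescale_q = exchange.bind("timescale_writer")

for listen in ("listen-1", "listen-2", "listen-3"):
    exchange.publish(listen)

# The influx consumer has only processed two messages so far...
influx_seen = [influx_q.popleft() for _ in range(2)]
# ...while the timescale consumer still has all three waiting, untouched.
timescale_seen = list(timescale_q)
```

Consuming from one queue leaves the other queue's backlog intact, which is what makes the "shadow" writer safe to add alongside the existing one.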