how did you generate this? what import/reimport steps did you take?
I'd love some help getting me talk title/desc up to scracth
scratch, even.
reosarevok
ruaok: you namedrop me in an interview and don't tell me? :D
ruaok
reosarevok: hey you! I dropped your name to get some street cred. Did you hear???
:-P
reosarevok
(just got a message from a guy I know "hey this post at the top of the subreddit I follow mentions you wtf" :D )
ruaok
lol
iliekcomputers
ruaok: I did a simple last.fm import, just added a print to influx_writer wherever it thought a listen to be a duplicate
ruaok
did you start with an empty DB?
iliekcomputers
yes
ruaok
oh. well, that is a problem, then.
clearly there shouldn't be duplicates on a single import.
iliekcomputers
yeah, I dunno what the problem is, though
iliekcomputers goes to take a look
ruaok
do we have an edge condition somewhere
?
>= vs > ?
which might cause listens at the end of a block to be duplicated/omitted?
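The `>=` vs `>` question above can be illustrated with a small sketch. It is not the scraper's actual code: `split_into_blocks`, the sample timestamps and the block boundaries are all hypothetical, chosen only to show how the choice of comparison at block edges can duplicate or drop a listen that sits exactly on a boundary.

```python
def split_into_blocks(timestamps, boundaries, lower_inclusive, upper_inclusive):
    """Split timestamps into consecutive blocks (hypothetical helper).

    Each block covers one (lo, hi) pair of adjacent boundaries; the two
    flags choose between >= / > on the lower edge and <= / < on the upper.
    """
    blocks = []
    for lo, hi in zip(boundaries, boundaries[1:]):
        block = [
            t for t in timestamps
            if (t >= lo if lower_inclusive else t > lo)
            and (t <= hi if upper_inclusive else t < hi)
        ]
        blocks.append(block)
    return blocks


# Made-up data: the listen at 200 sits exactly on a block boundary.
timestamps = [100, 150, 200, 250, 300]
boundaries = [100, 200, 301]

# Half-open blocks [lo, hi): every listen lands in exactly one block.
half_open = split_into_blocks(timestamps, boundaries, True, False)

# Closed on both edges: the listen at 200 shows up in two blocks.
both_closed = split_into_blocks(timestamps, boundaries, True, True)

# Open on both edges: listens at 100 and 200 are silently dropped.
both_open = split_into_blocks(timestamps, boundaries, False, False)

print(half_open)    # [[100, 150], [200, 250, 300]]
print(both_closed)  # [[100, 150, 200], [200, 250, 300]]
print(both_open)    # [[150], [250, 300]]
```

If the import pages through listens in time ranges, checking that consecutive ranges are half-open is one way to rule this class of bug in or out.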
alastairp
I failed to import 2
out of 400 pages
iliekcomputers
could be, I haven't looked at the scraper code in much detail. The data is a bit weird, though: I don't understand how a difference of 2 seconds between listens is possible. We couldn't be the ones doing that, because the last.fm api returns the timestamps itself. Maybe the last.fm data has listens with differences of less than 30 seconds.
alastairp
I would have been more suspicious if I had failed to import 400 ;)
iliekcomputers: yeah, it would be worth doing a scrape of an API and checking the values of the timestamps
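A rough sketch of the timestamp check suggested above, assuming the timestamps have already been pulled out of the scraped API responses as Unix seconds. `find_close_listens` and the sample values are hypothetical, and the 30 second threshold is simply the figure mentioned in the conversation, not a documented last.fm rule.

```python
def find_close_listens(timestamps, min_gap=30):
    """Return pairs of adjacent listens that are closer than min_gap seconds.

    `timestamps` is assumed to be a list of Unix timestamps (seconds)
    already extracted from the scraped data; exact duplicates show up
    as pairs with a gap of 0.
    """
    ordered = sorted(timestamps)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a < min_gap]


# Made-up timestamps: one pair 2 seconds apart, one exact duplicate.
scraped = [1497000000, 1497000002, 1497000200, 1497000200, 1497000500]

suspicious = find_close_listens(scraped)
for earlier, later in suspicious:
    print(earlier, later, "gap:", later - earlier)
```

Running this over the raw scrape, before anything touches the writer, would show whether the odd gaps are already present in the source data.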
ruaok
alastairp: it failed to import more than 1000 for CatQuest
inb4 flame war: I've been using XFCE forever but I'm getting a bit tired of some of its issues. Out of the flavours Ubuntu does ship with, what's the one I should install when I reinstall it? :p
Galaverna joined the channel
alastairp
ferbncode: cool, I'll look this afternoon. thanks
honestly, gnome3 is way better than it was 6 years ago
Galaverna has quit
I don't think anything annoys me on a day-to-day basis, but I installed quite a number of extensions to make it better
agentsim joined the channel
ferbncode
alastairp: great, thanks :). I used gnome a year ago, it was a heavy one for me, but then I switched to i3wm, superlight :P
agentsim has quit
ruaok
weird feeling of the day: logging into quickbooks and having new companies appear there that I've never heard of or dealt with.
I <3 you Quesito!
iliekcomputers
so the weird data is from lastfm itself
Quesito
Lol! Lots of new ones ruaok!
ruaok
Quesito: :)
iliekcomputers: I kinda thought so. you and i had been scouring all aspects of data ingestion.
iliekcomputers
there is this one edge case that I think we might have missed in influx-writer: if there are duplicates in the same rabbitmq batch, it might not catch them, because the timestamps dict here doesn't contain that timestamp?
[listenbrainz-server] paramsingh opened pull request #204: LB-180: Account for duplicates in same RabbitMQ batch for influx-writer (master...influx-writer/same-batch-dup) https://git.io/vQL0v
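The same-batch edge case described above can be sketched as follows. This is not the code from the pull request: `dedupe_batch`, the `listened_at` key and the sample data are assumptions made for illustration, and the sketch treats the timestamp alone as the duplicate key, which the real writer may not do.

```python
def dedupe_batch(batch, existing_timestamps):
    """Drop listens whose timestamp is already stored or already seen
    earlier in the same batch (hypothetical helper).

    `existing_timestamps` stands in for the timestamps already known to
    the database; `batch` is a list of dicts with a `listened_at` key.
    """
    seen = set(existing_timestamps)
    unique = []
    for listen in batch:
        ts = listen["listened_at"]
        if ts in seen:
            continue  # duplicate of a stored listen or of one earlier in this batch
        seen.add(ts)  # remember it so a later copy in the same batch is caught
        unique.append(listen)
    return unique


# Made-up batch: 100 is already stored, 200 appears twice in the batch.
existing = {100}
batch = [
    {"listened_at": 100, "track": "a"},
    {"listened_at": 200, "track": "b"},
    {"listened_at": 200, "track": "b"},
    {"listened_at": 300, "track": "c"},
]

result = dedupe_batch(batch, existing)
print([listen["listened_at"] for listen in result])  # [200, 300]
```

The key step is adding each accepted timestamp to the lookup as the batch is processed; checking only against what was in the database before the batch started would let the second copy of 200 through.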