#metabrainz


      • inkasso0815 joined the channel
      • inkasso0815 has quit
      • reosarevok
        ruaok, _lucifer: is this something that is easy to do now? https://tickets.metabrainz.org/browse/MBS-11586
      • BrainzBot
        MBS-11586: Show How Many Times a Recording Has Been Listened to with ListenBrainz on Recording's Page
      • _lucifer
        reosarevok: No, I think it'll be difficult to do. one major issue is that listens are linked to an `msid`, not an `mbid`.
      • secondly, i am not sure, but given the current structure of the data, such queries might be expensive.
      • MRiddickW has quit
      • zas: ruaok: the worker nodes are failing some jobs due to lack of disk space. I see the nodes have a lot of free space but during processing some data is stored at `/tmp`. this path property is configurable. should i point the nodes to store data somewhere else like `/data/tmp` maybe?
      • the persistent data is being stored at `/data/datanode` and `/data/namenode`. let me know if i should modify that as well.
      • ruaok
        _lucifer: yes. exactly that.
      • /data seems fine
      • also, moooin!
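
The log above doesn't name the "path property" _lucifer wants to change; assuming the scratch space in question is governed by Spark's `spark.local.dir` (the cluster runs the stats jobs, and the datanode/namenode paths suggest a Spark-on-HDFS setup), a minimal sketch of pointing it at `/data/tmp` might look like this (the app name is illustrative):

```python
# Minimal sketch: point Spark's scratch space at /data/tmp instead of /tmp.
# Assumes the configurable property is spark.local.dir; the log does not
# name it, so treat this as illustrative only.
from pyspark import SparkConf
from pyspark.sql import SparkSession

conf = SparkConf().set("spark.local.dir", "/data/tmp")  # shuffle/spill files land here

spark = (
    SparkSession.builder
    .appName("listenbrainz-stats")   # hypothetical app name
    .config(conf=conf)
    .getOrCreate()
)
```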
      • iliekcomputers
        good morning!
      • ruaok
        moin, how are things in dublin?
      • iliekcomputers
        it's been sunny this week!
      • ruaok
        said like a true northerner. :)
      • alastairp
        hello
      • ruaok: interesting to see that the materialised view also has lots of subtables
      • how many rows does listen_count_5day have? hypothesis: if it was just a plain old postgres table with a bunch of indexes, would it be faster?
      • ruaok
        yeah. the thing that flummoxes me is that a search on the materialized view with start and end constraints still does a full index scan on all chunks. which kills performance.
      • > if it was just a plain old postgres table with a bunch of indexes, would it be faster?
      • it should be faster. right now it has 2M rows or so. with a proper index it is faster.
      • my conclusion: cont aggs are too fragile to be a mission-critical element of our infrastructure.
      • but, I have a plan!
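
As a sketch of the plain-table alternative alastairp hypothesizes above: at ~2M rows, an ordinary table with a btree index covering the lookup lets the planner do a single index scan instead of touching every chunk. Table, column, and index names here are illustrative, not the project's actual schema:

```python
# Sketch of the "plain old postgres table with a bunch of indexes" idea.
# Names and the DSN are illustrative; this is not the listenbrainz schema.
from sqlalchemy import create_engine, text

engine = create_engine("postgresql://lb@localhost/listenbrainz")  # hypothetical DSN
with engine.begin() as connection:
    connection.execute(text("""
        CREATE TABLE listen_count_plain (
            listened_at_bucket BIGINT NOT NULL,
            user_name          TEXT   NOT NULL,
            listen_count       BIGINT NOT NULL
        )
    """))
    connection.execute(text("""
        CREATE INDEX listen_count_plain_idx
                  ON listen_count_plain (user_name, listened_at_bucket)
    """))
```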
      • alastairp
        yeah, I was surprised to see all of those scans on each chunk, given all the effort we put into minimising the number of chunks to scan back when we started this optimisation process
      • ruaok
        first, we should store listen count, min ts, and max ts in redis -- perhaps even in one key, since they are often needed in the same place. Since we have ONE place where we update the listen table (timescale_writer), we can ensure that the values in redis are kept accurate. And this has been working well from what I can tell. So, let's not expire those keys anymore -- the primary store is redis, but the values can be recomputed when needed by expiring the keys.
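
A minimal sketch of that scheme, assuming redis-py and hypothetical key/field names (one hash per user holding count, min ts, and max ts together):

```python
# One Redis hash per user: listen count + min/max listened_at in a single key.
# Only the single writer (timescale_writer) updates it, so it stays accurate;
# deleting the key forces a recompute from the listen table on the next read.
import redis

r = redis.Redis(decode_responses=True)

def update_listen_metadata(user_name: str, listened_at: int) -> None:
    key = f"listen_meta:{user_name}"       # hypothetical key name
    meta = r.hgetall(key)
    if not meta:
        return                             # missing key: recomputed lazily on read
    pipe = r.pipeline()
    pipe.hincrby(key, "count", 1)
    if listened_at < int(meta["min_ts"]):
        pipe.hset(key, "min_ts", listened_at)
    if listened_at > int(meta["max_ts"]):
        pipe.hset(key, "max_ts", listened_at)
    pipe.execute()
```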
      • _lucifer
        ruaok: 👍, i'll restart the cluster after changing the local dir and we should be able to do more requests.
      • ruaok
        that cuts out a 2.7s query. not hard to accomplish.
      • alastairp
        right, so instead of having to do a scan, be it indexed or table scan, we just do a redis get
      • great
      • _lucifer
        btw, i checked: it did process some stats jobs yesterday.
      • ruaok
        next, I took a look at our usage patterns. and I think we should make sure that fetching recent listens is FAST and older listens reasonably fast. ancient listens can be slower. but most of our queries will be recent shit, so make them fast.
      • I'm finishing debugging an approach that starts with a 30d window and then does an exponential backoff on the query bounds.
      • alastairp
        great, that makes sense too
      • ruaok
        if not found in 30 days, save those results and adjust bounds to be wider and just outside the last bounds.
      • worst case, 3-5 queries that never overlap, doing a full table index scan.
      • my goal is to fetch listens in under 100ms on avg.
      • hopefully better than that.
      • hopefully we can test that in a minute.
      • does any of that sound like a bad approach?
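
A sketch of that widening-window fetch (not the actual listenbrainz-server code; `query_listens` is a hypothetical DB helper): start 30 days back, and on each miss widen the window exponentially, querying only the range not yet covered so the passes never overlap:

```python
import time

THIRTY_DAYS = 30 * 24 * 60 * 60   # initial window, in seconds
MAX_PASSES = 3                    # worst case: a few non-overlapping queries

def fetch_recent_listens(user_name, limit, min_ts):
    """min_ts is the user's cached earliest listen timestamp, so new users
    never paw through 20 years of empty history."""
    listens = []
    upper = int(time.time())
    window = THIRTY_DAYS
    for _ in range(MAX_PASSES):
        lower = max(upper - window, min_ts)
        # query_listens(...) is a hypothetical helper that runs the actual SQL
        listens += query_listens(user_name, lower, upper, limit - len(listens))
        if len(listens) >= limit or lower <= min_ts:
            break
        upper = lower      # next bounds sit just outside the last ones
        window *= 2        # exponential backoff on the window size
    return listens
```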
      • furthermore, cont aggs don't support month buckets for the data view you and Mr_Monkey wanted.
      • but I created a 30 day bucket, which will never align right but gets the overall job done -- a month or year aligned query on top of that will still fetch the data in a reasonable manner.
      • 30 day bucket cont agg.
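
Under TimescaleDB 1.7's `CREATE VIEW ... WITH (timescaledb.continuous)` syntax (the one quoted later in this discussion), a 30 day bucket continuous aggregate could look roughly like the sketch below. View and column names are illustrative, and 2592000 seconds = 30 days for an integer-time hypertable:

```python
# Rough sketch of a 30 day bucket continuous aggregate, TS 1.7 syntax.
# Names, types, and the DSN are illustrative, not the actual migration.
from sqlalchemy import create_engine, text

engine = create_engine("postgresql://lb@localhost/listenbrainz")  # hypothetical DSN
with engine.begin() as connection:
    connection.execute(text("""
        CREATE VIEW listen_count_30day
        WITH (timescaledb.continuous) AS
            SELECT time_bucket(2592000::bigint, listened_at) AS listened_at_bucket,
                   user_name,
                   count(*) AS listen_count
              FROM listen
          GROUP BY time_bucket(2592000::bigint, listened_at), user_name
    """))
```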
      • alastairp
        will the exponential backoff still hit chunk boundaries?
      • ruaok
        no.
      • alastairp
        but the idea would be that the first few chunks will get most of the data that we want anyway, so it should be fast?
      • ruaok
        yes, exactly that.
      • see the paste for how the timestamps change over time. right now I have it tuned to give max 3 queries.
      • alastairp
        yeah right, nice
      • ruaok
        yisss. tests pass.
      • alastairp
        one question: from the pg explain above, do you know what `listened_at_bucket < COALESCE(_timescaledb_internal.cagg_watermark('35'::oid), '-9223372036854775808'::bigint))` is?
      • I assume that this is some internal timescale thing, but it's interesting that the bigint argument is the same on all chunks
      • ruaok
        oh yes, and when we cache timestamps, then the lower bound of the query will be the timestamps of the user, further making it more efficient. (new users will never paw through 20 years of data)
      • I don't know. has confused me too.
      • shivam-kapila
        We used to cache min max timestamps during Influx time iirc
      • ruaok
        we still may. I just want to shore them up and store them in one redis key.
      • I haven't dealt with that yet.
      • awww yiss: `2021-04-16 09:30:38,546 INFO fetch listens 0.03s in 1 passes`
      • shivam-kapila
        That looks goood
      • Damn, that's at least 2x faster for me
      • ruaok
        yes, we clearly cache timestamps. once they are in cache, things are fast.
      • 30ms. nice. I was shooting for under 100ms.
      • yesterday was a loong day, but in the end worth it.
      • shivam-kapila
        Strange enough that the cont. agg. is so slow
      • alastairp
        sweet, nice work
      • ruaok
        wooooo, wooo fucking hooo!
      • ruaok does a little dance
      • man, that was a long slog.
      • _lucifer
        !m ruaok
      • BrainzBot
        You're doing good work, ruaok!
      • ruaok
      • shivam-kapila
        But the performance boost is clearly noticeable.
      • ruaok
        anything that is slow is the ts fetch. once that is in cache, booom!
      • :D
      • BrainzGit
        [listenbrainz-server] mayhem opened pull request #1390 (master…be-gone-time-ranges): Remove time ranges and refactor listen fetching https://github.com/metabrainz/listenbrainz-serv...
      • sumedh joined the channel
      • _lucifer
        alastairp: hi! regarding https://github.com/metabrainz/listenbrainz-serv..., do you mean to reject invalid format tags or just not build and push the image to docker hub?
      • alastairp
        good question
      • can you reject badly formatted tags at a github level?
      • I think it's fine to just match this action on a regex
      • ruaok
        alastairp: could you please glance at https://github.com/metabrainz/listenbrainz-serv... and tell me if we should finish this and merge or if I should do the timestamp improvements before finishing this PR?
      • Mr_Monkey
        Nice work ruaok !
      • ruaok
        I would prefer to finish the timestamp improvements in the same PR.
      • Mr_Monkey: thanks. :)
      • Mr_Monkey: please check the branch to see if you find any issues.
      • Mr_Monkey
        Looking now
      • ruaok
        the PR is already long, but mostly because of deleted code.
      • Mr_Monkey
        We're talking of PR 1390 you posted above, correct?
      • ruaok
        yes
      • which is live on test
      • did you read my comments on the month and year aggregated views?
      • _lucifer
        alastairp: i don't think we can reject such tags directly, we'd need to write some webhook and wire up BrainzBot maybe. probably not worth it. the other solution, matching the regex, is easy. yup, let's do that.
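
For reference, the match being discussed could be as simple as the sketch below, assuming release tags follow a `v-YYYY-MM-DD.N` shape (the exact pattern the action should match is an assumption here):

```python
import re

# Assumed tag shape: v-YYYY-MM-DD.N (e.g. v-2021-04-16.0); adjust as needed.
TAG_RE = re.compile(r"^v-\d{4}-\d{2}-\d{2}\.\d+$")

def should_build_and_push(tag: str) -> bool:
    """Skip the docker hub build for badly formatted tags."""
    return bool(TAG_RE.match(tag))

assert should_build_and_push("v-2021-04-16.0")
assert not should_build_and_push("some-random-tag")
```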
      • alastairp
        ruaok: a query about that PR: `CREATE VIEW listen_count_5day WITH ...`. I'm not super familiar with this part of postgres
      • this is a materialised view (I understand yes, because you can only create indexes on materialised views?), in that case, what's the difference between CREATE VIEW and CREATE MATERIALIZED VIEW?
      • ruaok
        that part will likely still change. I've not settled on what to do yet.
      • but I may just replace the listen_count with listen_count_30days and listen_count_365days.
      • outsidecontext
      • zas
      • ruaok
        alastairp: that's how things are done in TS 1.7. In 2.x+ you create a materialized view. I suspect that an upgrade will help us on a few fronts, but it still won't make the approach from yesterday viable.
      • outsidecontext
        zas: oh, great. so we could already change the links for the picard download page?
      • alastairp
        ruaok: ok, thanks for clarifying
      • ruaok
        !m zas
      • BrainzBot
        You're doing good work, zas!
      • zas
        outsidecontext: I guess so, I have a few bits to tune, but it is here to stay
      • ruaok
        alastairp: so thoughts on continue vs new PR for the timestamps work?
      • alastairp
        ruaok: I'm reading through the PR now
      • outsidecontext
        cool. shall I submit a PR for the website?
      • zas
        for now, it supports http & https, should I just redirect http to https?
      • ruaok
        zas: I don't think so.
      • zas
        k
      • alastairp
        ruaok: can you clarify - this PR includes the changes that we discussed on wednesday: adds the 5 day aggregate, removes the need for making multiple queries on the frontend. query times are on par with what they were before this change. your new proposed improvements include caching of the min/max and the increasing range query, which gets us down to your example 30ms query?
      • ruaok
        correct. however, the "adds the 5 day aggregate" will likely still change a bit.
      • it is no longer crucial to have the cont agg in lock step with the hypertable.
      • I am leaning toward replacing the listen_count with listen_count_30day, rather than creating a new listen_count_5day.
      • it should make fetching timestamps a bit faster.
      • so the 30day cont agg and the timestamp work are the only bits left.
      • alastairp
        ah, I see. this is why it also makes sense to merge your timestamp work into this same PR
      • ruaok
        yes.
      • and then we have one larger PR, but this whole thicket of issues gets solved.
      • and the basis for 30day listen overview is in place for future work.
      • alastairp
        this PR is already quite large, though I see a lot of it is just changing arguments in tests.
      • the timestamp stuff is going to be pretty self-contained in a single file, right?
      • ruaok
        exactly why I am asking before proceeding.
      • 2, but yeah.
      • yeah
      • I will create a new agg in this code, but we won't need an SQL update script yet. I've done it by hand.
      • alastairp
        go ahead with putting both in this PR then, as the timestamp stuff isn't going to directly affect the other 21 files
      • ruaok
        once deployed I will open a small PR for removing the old cont aggs.
      • alastairp
        I've got a meeting now, but after lunch I'll be available for review
      • ruaok
        kewl. I'll finish that this afternoon with some luck.
      • I suspect that PR won't be ready until monday, so no rush.
      • alastairp
        ok cool
      • ruaok
        I worked from 10am until 1:30 in the morning with few breaks, so I'd like a bike ride this afternoon. :)
      • alastairp
        can I pre-review this? I see a few suggestions already, or would you prefer to wait until you finish?
      • theoretically comarca restrictions get lifted on Monday!
      • ruaok
        don't go digging into too much detail and skip comments on the SQL scripts, but yes feedback would be welcome.
      • ohh, really?? that would be fantastic.
      • ruaok wants to go to olot.
      • alastairp
        to "Veguerias", which are circa 14th century political divisions, lol
      • so our one includes Barceloneces, Maresme, and the 2 Valles
      • "the feudal administrative territorial jurisdiction of the Principality of Catalonia (to the Crown of Aragon) during the Middle Ages and into the Modern Era until the Nueva Planta decrees of 1716."
      • ruaok
        crap.
      • oh well, a trip up to Arenys would also be nice.
      • Sophist-UK joined the channel