#metabrainz

      • MRiddickW has quit
      • 2021-06-28 17916, 2021

      • MRiddickW joined the channel
      • 2021-06-28 17958, 2021

      • wargreen has quit
      • 2021-06-28 17903, 2021

      • wargreen joined the channel
      • 2021-06-28 17918, 2021

      • wargreen has quit
      • 2021-06-28 17926, 2021

      • reosarevok
        ruaok: thanks for taking care of support! Back now :)
      • 2021-06-28 17945, 2021

      • reosarevok
        Let's see if I remember how work works
      • 2021-06-28 17946, 2021

      • yyoung
        Is this an intended behavior of external links editor? https://imgur.com/a/MN51UDa
      • 2021-06-28 17913, 2021

      • yyoung
        To reproduce, enter two links and then clear the first one.
      • 2021-06-28 17934, 2021

      • reosarevok
        yyoung: I don't think it's intended as such, no
      • 2021-06-28 17958, 2021

      • reosarevok
        I guess it does allow to ctrl-z and bring the old link back, maybe? If it doesn't, then it's completely useless and 100% buggy
      • 2021-06-28 17939, 2021

      • reosarevok
        ruaok: can you unsubscribe from notifications from LB's twitter? Unless you're using those emails
      • 2021-06-28 17952, 2021

      • reosarevok
        (otherwise, it's just effectively stuff to remove)
      • 2021-06-28 17907, 2021

      • opal has quit
      • 2021-06-28 17924, 2021

      • yyoung
        reosarevok: Yes Ctrl-Z will work
      • 2021-06-28 17938, 2021

      • reosarevok
        Then we could keep it, but with better UI
      • 2021-06-28 17947, 2021

      • reosarevok
        If it makes sense for what you're building
      • 2021-06-28 17907, 2021

      • reosarevok
        Not that I know what the better UI would be :D
      • 2021-06-28 17910, 2021

      • opal joined the channel
      • 2021-06-28 17908, 2021

      • yyoung
        Since I'm changing the implementation, this bug is likely to disappear
      • 2021-06-28 17903, 2021

      • reosarevok
        Ok
      • 2021-06-28 17934, 2021

      • yyoung
        Just found it when testing some tricky actions, which are prone to introduce bugs :)
      • 2021-06-28 17945, 2021

      • ruaok
        mooin!
      • 2021-06-28 17941, 2021

      • ruaok
        reosarevok: the twitter notifications are really stupid. you can't just get notifications for direct messages and mentions. you only get these useless scattershot notifications...
      • 2021-06-28 17949, 2021

      • ruaok
        I can't decide if they are useful or not....
      • 2021-06-28 17933, 2021

      • reosarevok
        Yeah, dunno. We seem to have them disabled for MB (but I didn't do that myself)
      • 2021-06-28 17958, 2021

      • MRiddickW has quit
      • 2021-06-28 17948, 2021

      • akashgp09 joined the channel
      • 2021-06-28 17908, 2021

      • zas
        yvanzo: sir-prod is stuck again with "maximum recursion depth exceeded in cmp"
      • 2021-06-28 17923, 2021

      • outsidecontext
        akshaaatt[m], lucifer: hi, regarding your chat yesterday: I too think adding this to the release activity directly is the best solution. One advantage of the app is that it has the search and barcode lookup functionality already fully working, so adding the "send to picard" on the release activity automatically makes both use cases work
      • 2021-06-28 17900, 2021

      • lucifer
        +1, thanks! :D
      • 2021-06-28 17914, 2021

      • ruaok
        moin lucifer. care to take a look at #1514? Looks mergeable to me. :)
      • 2021-06-28 17920, 2021

      • lucifer
        ruaok: sure, will do. i am currently working on adding the consul values for that PR but since we need booleans not strings need to add a consul define.
      • 2021-06-28 17915, 2021

      • ruaok
        oh, thanks!
      • 2021-06-28 17901, 2021

      • ruaok
        lucifer: I just had a thought...
      • 2021-06-28 17953, 2021

      • ruaok
        right now we scale user similarities on a global scale, meaning that only one person ever will get a perfect 1.0 score. everyone else is going to get less. like we are seeing now.
      • 2021-06-28 17907, 2021

      • ruaok
        should we scale users individually?
      • 2021-06-28 17931, 2021

      • ruaok
        not 100% sure how to do that best.
      • 2021-06-28 17913, 2021

      • ruaok
        I guess if a user has only one match and it is a low match at that, we don't want to say "this is your 100% match". that's crap.
      • 2021-06-28 17929, 2021

      • lucifer
        yeah, we could scale individually. but then if the highest similarity is only 0.009 something, we give it a 1.0, which is bad.
      • 2021-06-28 17937, 2021

      • lucifer
        right that.
      • 2021-06-28 17950, 2021

      • ruaok
        what if we say define 3-5 levels: Great, good, ok, so-so, weak.
      • 2021-06-28 17905, 2021

      • lucifer
        that makes sense.
      • 2021-06-28 17905, 2021

      • ruaok
        and that represents the top end to the bottom end.
      • 2021-06-28 17942, 2021

      • ruaok
        should I make a PR that scales the users individually so we can see what the ratings become?
      • 2021-06-28 17911, 2021

      • ruaok
        if we like that approach, make an LB (not spark) PR for changing the display from numeric to test level?
      • 2021-06-28 17926, 2021

      • ruaok
        *text*
      • 2021-06-28 17957, 2021
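
A minimal sketch of the numeric-to-text display idea above, assuming scores are already scaled to the range 0.0 to 1.0. The function name `similarity_label` and the equal-width bands are assumptions for illustration; the chat does not settle where the boundaries should fall.

```python
from typing import List

# Labels from the discussion, ordered from the top end to the bottom end.
LEVELS: List[str] = ["great", "good", "ok", "so-so", "weak"]


def similarity_label(score: float) -> str:
    """Map a similarity already scaled to [0.0, 1.0] onto a text level.

    Equal-width bands are an assumption made for this sketch.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be within [0, 1], got {score}")
    # 1.0 lands in band 0 ("great"); 0.0 is clamped into the last band ("weak").
    band = min(int((1.0 - score) * len(LEVELS)), len(LEVELS) - 1)
    return LEVELS[band]
```

With five bands, a score of 0.9 would read as "great" and a score of 0.1 as "weak".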

      • lucifer
        let's scale like that but also put a threshold on min rating below which we don't raise it to 1.
      • 2021-06-28 17921, 2021

      • lucifer
        say if the rating is below 0.01, don't make it 1 but rather 0.5 something.
      • 2021-06-28 17947, 2021

      • ruaok
        I see what you are suggesting and I agree, but...
      • 2021-06-28 17949, 2021

      • lucifer
        but that can be done as a follow up, we can probably just test the individual scale first
      • 2021-06-28 17903, 2021

      • ruaok
        so far hard coded thresholds have been problematic to say the least.
      • 2021-06-28 17940, 2021

      • ruaok
        actually, now that I think about it, you have a different approach than I do.
      • 2021-06-28 17903, 2021

      • ruaok
        I am saying that in the context of a user we are describing them in a relative way, rather than an absolute way.
      • 2021-06-28 17933, 2021

      • ruaok
        in my way, if there is one match, it's your best (and worst) match.
      • 2021-06-28 17951, 2021

      • ruaok
        in your way, there is still some global threshold of quality between users.
      • 2021-06-28 17907, 2021

      • ruaok
        which I think is problematic.
      • 2021-06-28 17909, 2021

      • lucifer
        i see. makes sense.
      • 2021-06-28 17933, 2021

      • ruaok
        let me make a PR for user scaling and then we can examine the results. from there we can decide next steps.
      • 2021-06-28 17938, 2021
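
To make the difference between the two approaches concrete, here is a rough sketch. It assumes raw similarities arrive as a nested dict and that "scaling" means dividing by a maximum; both are assumptions for illustration, not the actual ListenBrainz or Spark code, and the user names are made up.

```python
from typing import Dict

# Hypothetical shape: user -> {other user -> raw similarity score}.
Similarities = Dict[str, Dict[str, float]]


def scale_globally(raw: Similarities) -> Similarities:
    """Divide every score by the single largest score in the whole data set."""
    global_max = max(
        (score for others in raw.values() for score in others.values()),
        default=0.0,
    )
    if global_max <= 0.0:
        return {user: dict(others) for user, others in raw.items()}
    return {
        user: {other: score / global_max for other, score in others.items()}
        for user, others in raw.items()
    }


def scale_per_user(raw: Similarities) -> Similarities:
    """Divide each user's scores by that user's own best score."""
    scaled: Similarities = {}
    for user, others in raw.items():
        user_max = max(others.values(), default=0.0)
        if user_max <= 0.0:
            scaled[user] = dict(others)
        else:
            scaled[user] = {other: score / user_max for other, score in others.items()}
    return scaled


raw = {
    "alice": {"bob": 0.8, "carol": 0.4},
    "dave": {"erin": 0.009},  # a single, weak match
}
global_scores = scale_globally(raw)
per_user_scores = scale_per_user(raw)
```

Under global scaling only alice's match with bob reaches 1.0 and dave's only match stays close to zero. Under per-user scaling dave's single weak match becomes 1.0, which is the trade-off debated above.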

      • lucifer
        +1
      • 2021-06-28 17956, 2021

      • ruaok
        yay, I managed to procrastinate working on dumps!
      • 2021-06-28 17905, 2021

      • lucifer
        lol
      • 2021-06-28 17918, 2021

      • lucifer
        what were you planning to work on for dumps btw?
      • 2021-06-28 17941, 2021

      • ruaok
        mapped MBIDs into spark dumps.
      • 2021-06-28 17952, 2021

      • ruaok
        there are two approaches:
      • 2021-06-28 17955, 2021

      • lucifer
        oh nice!
      • 2021-06-28 17908, 2021

      • ruaok
        1. Add MBIDs into full dumps and then transmogrify
      • 2021-06-28 17923, 2021

      • ruaok
        2. Make a separate spark dump, that moves a lot less data, but duplicates more code.
      • 2021-06-28 17950, 2021

      • ruaok
        #1 rubs me wrong, because we're putting generated data into the user data. that feels wrong.
      • 2021-06-28 17900, 2021

      • ruaok
        but it would be the fastest solution.
      • 2021-06-28 17904, 2021

      • lucifer
        i would be in favor of 2. because i also wanted to move spark dumps to parquet format.
      • 2021-06-28 17924, 2021

      • ruaok
        I agree, 2 is better.
      • 2021-06-28 17936, 2021

      • ruaok
        but, aren't the dumps in parquet format now?
      • 2021-06-28 17939, 2021

      • ruaok
        what needs to change?
      • 2021-06-28 17909, 2021

      • lucifer
        acc to my understanding, dumps are output in json. spark reads json and writes in hdfs as parquet.
      • 2021-06-28 17941, 2021

      • lucifer
        my suggestion is that we write in parquet at the first step.
      • 2021-06-28 17944, 2021

      • ruaok
        so we incur an extra pass over the data during import?
      • 2021-06-28 17955, 2021

      • lucifer
        right
      • 2021-06-28 17904, 2021

      • ruaok
        ok, that makes sense. let's do that.
      • 2021-06-28 17945, 2021

      • lucifer
        cool, so i can take up this task and do it using approach 2 then?
      • 2021-06-28 17950, 2021

      • ruaok
        because that lets us play with DuckDB: https://duckdb.org/2021/06/25/querying-parquet.ht…
      • 2021-06-28 17915, 2021

      • ruaok
        yes, once I submit the similarities PR, I'll work on the parquet based dumps.
      • 2021-06-28 17955, 2021

      • lucifer
        never heard of duckdb before will take a look.
      • 2021-06-28 17908, 2021

      • alastairp
        hello good morning
      • 2021-06-28 17918, 2021

      • ruaok
        it sounds quite cool. run SQL queries on parquet files.
      • 2021-06-28 17921, 2021

      • ruaok
        moin alastairp
      • 2021-06-28 17924, 2021

      • ruaok
        how is the 5G?
      • 2021-06-28 17932, 2021

      • lucifer
        moin!
      • 2021-06-28 17942, 2021

      • alastairp
        it's all gone away and I feel back to normal
      • 2021-06-28 17954, 2021

      • lucifer
        i got first vaccine dose this weekend too :)
      • 2021-06-28 17959, 2021

      • ruaok
        fucking bill gates. over promise and under deliver.
      • 2021-06-28 17905, 2021

      • ruaok
        yayayayayaya, lucifer !
      • 2021-06-28 17907, 2021

      • ruaok
        very good.
      • 2021-06-28 17922, 2021

      • ruaok
        our team is well underway to getting fully vaxxed.
      • 2021-06-28 17952, 2021

      • lucifer
        nice!! :DD
      • 2021-06-28 17937, 2021

      • ruaok
        my mum mentioned that a lot of people who now end up in hospitals (in the US) with covid are anti-vaxxers.
      • 2021-06-28 17947, 2021

      • ruaok
        there is some poetic justice in that.
      • 2021-06-28 17938, 2021

      • alastairp
        ruaok: just looking through your changes in https://github.com/metabrainz/listenbrainz-server… again
      • 2021-06-28 17902, 2021

      • loujine has quit
      • 2021-06-28 17924, 2021

      • alastairp
        to confirm - the behaviour that you're going for here is that any time a dump is running, you want to use the lock file to block other processes from running
      • 2021-06-28 17937, 2021

      • alastairp
        so periodically refresh_listen_count_aggregate won't run because there's a dump happening
      • 2021-06-28 17911, 2021

      • loujine joined the channel
      • 2021-06-28 17957, 2021

      • ruaok
        no, that wasn't the intent. the lock should not prevent processes running. it should only prevent the container from being killes.
      • 2021-06-28 17901, 2021

      • ruaok
        *killed.
      • 2021-06-28 17934, 2021

      • ruaok
        but I think I see your point.
      • 2021-06-28 17947, 2021

      • ruaok
        the lock needs to have a refcount to accomplish what I want to do.
      • 2021-06-28 17918, 2021

      • ruaok
        because right now one process gets precluded.
      • 2021-06-28 17921, 2021

      • alastairp
        ok, right
      • 2021-06-28 17937, 2021

      • BrainzGit
        [listenbrainz-server] mayhem closed pull request #1513 (master…terminate-cron): Terminate cron https://github.com/metabrainz/listenbrainz-server…
      • 2021-06-28 17940, 2021

      • alastairp
        what about multiple lock files, does that introduce too much complexity?
      • 2021-06-28 17951, 2021

      • ruaok
        let me get back to that -- I closed the PR for now.
      • 2021-06-28 17900, 2021

      • alastairp
        have a lock file per process, then the killer checker can look for any *.lock, for example
      • 2021-06-28 17902, 2021

      • ruaok
        multiple lock files could work.
      • 2021-06-28 17914, 2021
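
The per-process lock file idea could look roughly like this. It is only a sketch: the helper names `process_lock` and `safe_to_kill` and the lock directory layout are hypothetical, not taken from the closed PR.

```python
import os
from contextlib import contextmanager
from pathlib import Path
from typing import Iterator


@contextmanager
def process_lock(lock_dir: Path, name: str) -> Iterator[Path]:
    """Hold `<name>.lock` inside lock_dir for the duration of the with-block."""
    lock_path = lock_dir / f"{name}.lock"
    # O_EXCL makes creation fail if a lock with this name already exists.
    fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    with os.fdopen(fd, "w") as handle:
        handle.write(str(os.getpid()))
    try:
        yield lock_path
    finally:
        lock_path.unlink(missing_ok=True)


def safe_to_kill(lock_dir: Path) -> bool:
    """The container may be stopped only when no *.lock file remains."""
    return not any(lock_dir.glob("*.lock"))
```

Because every process owns its own file, a dump and refresh_listen_count_aggregate can run at the same time without blocking each other, and the checker only has to look for any remaining `*.lock`. This sidesteps the refcount a single shared lock file would need.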

      • alastairp
        OK, cool. lmk when you have another solution
      • 2021-06-28 17931, 2021

      • ruaok
        yeah, makes sense. I'll do that once I finish the user similarity tweak lucifer and I just discussed.
      • 2021-06-28 17919, 2021

      • ruaok
        lucifer: question about the similarity matrix....
      • 2021-06-28 17941, 2021

      • ruaok
        I know that one user is a row of data. but a user is also a column of data.
      • 2021-06-28 17928, 2021

      • ruaok
        I guess my question hinges on this line: https://github.com/metabrainz/listenbrainz-server…
      • 2021-06-28 17942, 2021

      • ruaok
        how is the dataframe generated from the matrix?
      • 2021-06-28 17952, 2021

      • ruaok
        I'm starting to think we should remove the thresholding function and send the raw data to the LB server.
      • 2021-06-28 17908, 2021

      • ruaok
        and at LB we do the scaling on the fly -- at least until we know what we're doing.
      • 2021-06-28 17920, 2021

      • lucifer
        ruaok, the line you shared does not involve the matrix.
      • 2021-06-28 17949, 2021

      • lucifer
        indeed, i think it's better to do scaling in LB for now as it is simpler.
      • 2021-06-28 17904, 2021

      • lucifer
        once we are sure what to do, we can move that to spark side.
      • 2021-06-28 17923, 2021

      • ruaok
        oh. right. the threshold similar users actually does that extraction from the matrix!
      • 2021-06-28 17930, 2021

      • ruaok
        it returns a list. heh.
      • 2021-06-28 17941, 2021

      • ruaok
        yeah. duh.
      • 2021-06-28 17908, 2021

      • lucifer
        that said, i think moving these two declarations into the first loop should be enough to scale individually.
      • 2021-06-28 17912, 2021

      • ruaok
        that actually makes it easy to scale in spark.
      • 2021-06-28 17911, 2021

      • ruaok
        yes, you're right. it is that simple.
      • 2021-06-28 17904, 2021

      • ruaok
        well, not quite. it needs more changes, but those are not hard.
      • 2021-06-28 17958, 2021

      • lucifer
        right, just saw there's another loop after it.
      • 2021-06-28 17905, 2021

      • ruaok nods
      • 2021-06-28 17916, 2021

      • BrainzGit
        [listenbrainz-server] amCap1712 opened pull request #1528 (master…consul-load): Add consul_template KEY_JSON to use when the value of key needs to parsed before used in Python https://github.com/metabrainz/listenbrainz-server…
      • 2021-06-28 17925, 2021

      • lucifer
        alastairp: ^, tested above on test.lb seems to work fine.
      • 2021-06-28 17912, 2021

      • lucifer
        I need to add a bunch of flags (2 for email and another for pin apis) so added above to reduce duplication.
      • 2021-06-28 17949, 2021

      • alastairp
        ah, nice. so the idea is that consul will always have _encoded json_, and we decode it in config.py?
      • 2021-06-28 17912, 2021

      • alastairp
        should we consider having a try/except for ValueError in case the json is invalid?
      • 2021-06-28 17928, 2021

      • lucifer
        yes, right.
      • 2021-06-28 17949, 2021

      • lucifer
        if the json is invalid, it should crash i think and report to sentry, right?
      • 2021-06-28 17928, 2021

      • alastairp
        actually, we probably already have it. If the json is invalid, uwsgi will fail to load the app, and will quit. our standard reporting system will take over from here
      • 2021-06-28 17949, 2021
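
As a sketch of the decode step discussed above, assuming each value reaches the config module as a JSON-encoded string. The helper `decode_config_value` and the key name in the usage note are hypothetical and are not the code in PR #1528.

```python
import json
from typing import Any


def decode_config_value(key: str, raw: str) -> Any:
    """Decode one JSON-encoded config value; invalid JSON raises ValueError."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError as exc:  # JSONDecodeError subclasses ValueError
        # Re-raise with the key name so the startup failure is easy to trace.
        raise ValueError(f"config key {key!r} is not valid JSON: {exc}") from exc
```

Decoding is what turns the stored text `true` into a Python boolean rather than the string `"true"`, for example `decode_config_value("SOME_FLAG", "true")`. Letting the error propagate matches the behaviour described here: the app fails to load and the usual reporting picks it up.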

      • lucifer
        yup exactly that happened.
      • 2021-06-28 17906, 2021

      • alastairp
        additional question is if we want to be able to start up even in the case of invalid data
      • 2021-06-28 17944, 2021

      • lucifer
        i was thinking about that, in case of missing services yes but invalid config not sure
      • 2021-06-28 17959, 2021

      • lucifer
        ruaok: alastairp: should we enable the pinned rec api in beta/test so that we can test it easily?
      • 2021-06-28 17942, 2021

      • ruaok
        good idea!
      • 2021-06-28 17946, 2021

      • alastairp
        yeah, right. we could special-case this: if there's an invalid youtube config then selectively disable youtube playback. but honestly, it's less work to just fix the config if it's broken
      • 2021-06-28 17950, 2021

      • alastairp
        I agree, let's do it