#metabrainz

      • alastairp
        I think that this is why even though we're running npm as root, you're seeing permission denied errors, I think it's trying to write or chown the files, but doing it as your user, not as root. I remember running into this issue in another project where npm was completely screwing up the owner of my database files...
      • I'm going to delete the prebuilt files that you have (../anshg1214/critiquebrainz/critiquebrainz/frontend/static/build) and re-run it, I think that this will correctly be able to create the directory and write the files with the correct owner
      • but I'll also add in the trick that we have in LB where we run the commands as a specific user id
      • lucifer
        chinmay: are you unable to create a GlobalAppContextT or do you want to add a new context?
      • chinmay
        I want to add a new context
      • lucifer
        i see, what data will be part of it and how do you intend to use it?
      • ansh
        alastairp: This is interesting! Did it work?
      • alastairp
        ansh: I deleted your build directory and re-ran it (as root), and it changed the owner to you
      • I tried with docker run's --user flag, and it failed to run (with the same error that lucifer encountered the other week)
      • it looks like npm now needs to check for your user id in /etc/passwd, and if it doesn't exist then it fails with an unexplained error
      • ansh: can you try and start it up again and check that the static builder loads and correctly re-builds when a file changes?
      • ansh
        Sure!
      • alastairp
        I'll open a ticket to suggest that we add a user to our images so that we can run npm
      • this is an issue for LB too
      • ansh
        It works without any error
      • alastairp
        great!
      • lucifer
        alastairp: why not run npm tests as root?
      • alastairp
        lucifer: this is for running `webpack`
      • or npm install
      • lucifer
        i see. but that works fine in LB, no?
      • alastairp
        it automatically chown's to the owner of . even if you run it as root
      • the specific issue we encountered is that when ansh last ran the container with node 12, it left the bundle files owned by root
      • and then when he re-ran it with node 16 (npm 8, I guess the issue was), it tried to chown the existing files and failed
      • when I deleted the files and tried again, it successfully builds and chown's
      • lucifer
        ah ok, if the fix is simple then we should probably do it, but if the issue won't come back again as long as we are running node 16 and fixing it is complex, it might be fine to leave as is.
      • alastairp
        this specific issue that we encountered is "you had existing build assets owned by root and then you upgraded from node 12 to node 16"
      • (and you're running on linux, and you don't have sudo)
      • so it's probably only affecting ansh in this specific case
      • lucifer
        makes sense, yeah it seems very specific and if it won't occur again (according to my understanding it won't), fine to leave as is i say
      • alastairp
        if we want to take our previous plans and start running everything in a container as a local user... it looks like npm requires a user in /etc/passwd to be able to run
      • chinmay
        lucifer: I want to clean the data from the API before taking it any further as there can be duplicates. There will be one array for this cleaned data. I also want to keep a list of release_mbids that don't have cover art when the page is loaded. So for example, if I want to show only releases with cover art, I'll filter the main data against the list of no-cover-art releases and pass it ahead to update the page.
      • Similar logic will apply for any filters.
      • What do you think? is there a simple way to implement filters?
      • lucifer
        chinmay: pass the entire data to the release page and then maintain multiple arrays/maps inside it. one for the full data, another for the filtered one, and so on?
      • for making a filter, i think a `.filter` or `if` should suffice.
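A rough illustration of the data handling lucifer suggests: keep the full, deduplicated release list plus a set of release_mbids that have no cover art, and derive each filtered view on demand. This is a Python sketch only, with assumed field names (`caa_id`); the actual code would live in the React frontend.

```python
# Python sketch of the filtering idea only; field names are assumptions.
def build_no_cover_art_index(releases):
    """Compute once on page load: release_mbids that have no cover art."""
    return {r["release_mbid"] for r in releases if not r.get("caa_id")}

def apply_filters(releases, no_cover_art, only_with_cover_art=False):
    """Return the view to render; the full list is never mutated."""
    if only_with_cover_art:
        return [r for r in releases if r["release_mbid"] not in no_cover_art]
    return releases
```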
      • ansh
        alastairp: I am unable to log in to sentry. I created an account and it gives me an error saying something went wrong.
      • lucifer
      • something like this should probably work.
      • chinmay
        lucifer: I'll go through it
      • alastairp
        ansh: interesting, let me see if sentry reported an error to itself :)
      • lucifer
        chinmay: 👍 feel free to ask if you have any other doubts.
      • chinmay
        yeah
      • lucifer
      • alastairp
        ansh: were you able to create an account?
      • chinmay
        Awesome! I'll check this one out too
      • alastairp
        or when you went to create it, it returned the error?
      • ansh
        When I went to create one, it gave me an error
      • alastairp
        right, because it says that your invite is still active
      • I see no errors in sentry
      • ansh
      • alastairp
        yeah, right. weird - so that should definitely be reported as an error
      • it's possible that we've not created an account for anyone since we last performed a sentry upgrade
      • I'll try and upgrade it again and we can try again
      • ansh
        yep
      • q3lont has quit
      • alastairp
      • mayhem: closing my office door (10 m^2) certainly makes CO2 peak
      • Pratha-Fish
        alastairp: I'm up for discussion whenever you're free 👀
      • mayhem
        Wow, quite a serious peak at that.
      • alastairp
        Pratha-Fish: ok, give me 10 minutes, just doing some server maintenance
      • Pratha-Fish
        sure
      • alastairp
        yeah, I seem to idle around 400-500 with the door to the rest of the house open. at the moment also got window open + fan on
      • yes, it felt sticky and heavy, but not sure how much of that was just hot + humid too
      • and I knew that I was making it go up, so maybe it was 90% psychosomatic too
      • a friend who has a monitor in his office says that he has a push notification to his phone at 2000 ppm to remind him to open a window
      • upgrading sentry, it'll be down temporarily, hopefully not more than 5 mins
      • lucifer
        mayhem: alastairp: apparently feedback dumps have been broken for a long while now. just discovered it when trying to make couchdb dumps.
      • mayhem
        Huh. How did the dump checker not catch that?
      • lucifer
      • dump exists but no data files inside
      • alastairp
        bug with the query/data collection?
      • but it didn't raise an exception?
      • lucifer
        data was dumped but never added to tarfile
      • alastairp
        oops
      • lucifer
      • it probably broke here
      • tables = [] for user feedback dumps
      • alastairp
        ah, right
      • lucifer
        i'll add an `if not tables:` check to fall back to the old way.
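A minimal sketch of that fallback, with hypothetical function names rather than the actual listenbrainz-server code:

```python
# Hypothetical names throughout; only illustrates the `if not tables` fallback.
def add_dump_to_archive(tar, dump_entry, tables):
    if not tables:
        # feedback-style dumps come from a manual SQL query, so there is no
        # table list to iterate over; use the old per-query code path instead
        dump_feedback_with_query(tar, dump_entry)
        return
    for table in tables:
        add_table_to_archive(tar, table)
```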
      • alastairp
        why is there no tables for feedback dump? because we have a different way of creating the dump?
      • lucifer
        yes those are dumped using a manual sql query.
      • alastairp
        feedback dump isn't designed to be imported into a db, right? so in that case the pre-sorting isn't an issue?
      • lucifer
        further, those are json dumps so the FK issue doesn't happen either.
      • yup right
      • alastairp
      • and I'm just reading the code for feedback dumps, the number and names of the files are variable based on the result of the query it seems (so we can't set `tables` to something ahead of time)
      • it'd be nice to try and set `tables` to None in this case so that we end up with an exception
      • because we already have a special if statement for making the feedback dumps, I think it's fine to do what you suggest. However, maybe it makes sense to move this to a separate codeflow anyway instead of having if statements everywhere - not sure how much of this code is shared and how much is different
      • ansh: I've upgraded sentry. can you try and sign in again?
      • lucifer
        makes sense, that's what i had done initially for couchdb dumps but then it failed so i changed to [] and it worked fine. then i saw the dump was empty so investigated. https://github.com/metabrainz/listenbrainz-serv...
      • yes +1 on separating json dumps from pg dumps.
      • alastairp
        in addition, it seems that feedback dumps write all files to a temp dir before adding to the archive? we don't dump/add/rm in a loop?
      • but I guess that's like we do it for all dumps anyway, I can't remember if we tried the previous way to try and minimise transient disk usage
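For reference, the dump/add/rm loop described above would look roughly like this (hypothetical helper names), keeping transient disk usage to one file at a time:

```python
import os
import tarfile

def dump_files_incrementally(tar: tarfile.TarFile, dump_jobs, tmp_dir: str):
    """dump_jobs: iterable of (name, write_file) pairs, where write_file(path)
    writes one dump file to disk. All names here are illustrative."""
    for name, write_file in dump_jobs:
        path = os.path.join(tmp_dir, name)
        write_file(path)                 # dump
        tar.add(path, arcname=name)      # add to the archive
        os.remove(path)                  # rm, freeing space before the next file
```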
      • lucifer: LB planning meeting?
      • lucifer
        yup
      • mayhem: around?
      • mayhem
        yep
      • lucifer
        👍
      • firstly, i am planning to finish the couchdb integration this week. currently working on dumps, will probably get those working today and start working on tests for dumps tomorrow onwards.
      • mayhem
        good.
      • lucifer
        after that i was thinking of restarting work on rabbitmq stuff.
      • we had talked about replacing pika with kombu some time ago, to see if that fixes our woes.
      • mayhem
        I think there is also some work that could be done on the similarity stuff...
      • lucifer
        recording similarity?
      • mayhem
        yes. right now I feel frustrated because it is hard to evaluate the results.
      • we have no guarantee of whether the algorithm is working or not.
      • lucifer
        i can add a datasethoster query if it helps.
      • yuzie joined the channel
      • fwiw these datasets should be present on wolf currently, https://docs.google.com/document/d/1BJbXFPqgu2x...
      • mayhem
        yes, I've looked at them, but just shrugged.
      • unclear on how to proceed.
      • what we need is a dummy test set with an expected set of results
      • lucifer
        i see. makes sense
      • mayhem
        so that we can verify that the algorithm is working. once we have that, then we can start tuning the alg
      • another thing on that front is the improvement of what constitutes a session.
      • lucifer
        we can create a dummy test but unsure how to determine the baseline set of expected results.
      • mayhem
        we will need to have access to track lengths in spark in order to improve those.
      • lucifer
        we can import release dumps for that i think.
      • mayhem
        well, the dummy test needs to have a clear and contrived set of data.
      • one where we know exactly what the result should be
      • and if the result is not what we expect, then we have a bug and need to fix it.
      • then we need to focus on improving "listening sessions".
      • only when we have reliable sessions, can we build better similarity data.
      • and once Pratha-Fish is done, we'll have cleaned-up data and we can use that for similarity data.
      • and THAT will push us over the edge of having sufficient data.
      • lucifer
        makes sense
      • mayhem
        so, when you run low on things to do, work on stuff that moves us down this path.
      • lucifer
        do you have thoughts on how to build the test set? we can pick tracks and their similar ones but there will likely be bias.
      • *pick manually ourselves
      • mayhem
        I would not even use real data.
      • alastairp
        I think we're going to have to generate synthetic data that we know the algorithm will throw back as similar
      • mayhem
        that.
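One possible shape for that synthetic data, sketched here with made-up identifiers: a few fake users who always play the same pair of recordings back to back, so the expected similarity output is known in advance.

```python
# Contrived listen history: every user plays rec-A then rec-B, and rec-C then
# rec-D, in separate sessions, so we expect similar(rec-A) to return rec-B and
# similar(rec-C) to return rec-D. Purely illustrative.
def make_synthetic_listens(num_users: int = 10, base_ts: int = 1_600_000_000):
    listens = []
    for user_id in range(num_users):
        ts = base_ts + user_id * 10_000
        for pair in (("rec-A", "rec-B"), ("rec-C", "rec-D")):
            for offset, mbid in enumerate(pair):
                listens.append({
                    "user_id": user_id,
                    "recording_mbid": mbid,
                    "listened_at": ts + offset * 180,  # tracks 3 minutes apart
                })
            ts += 3600  # an hour's gap separates the pairs into sessions
    return listens
```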
      • lucifer
        i see, makes sense. that should be doable.
      • about listening sessions, what do you have in mind? currently we use the listened_at difference between listens with a 30 min threshold. once we have track lengths, what should be done differently?
      • mayhem
        look for consecutive tracks and make sure there are no significant gaps.
      • I'd like to have a tool that allows us to review sessions. have it spit out the sessions for a user for the last week or so.
      • then we can see if they make sense.
      • right now, we made a basic assumption and we haven't verified it.
      • lucifer
        i see, so running difference should be less than a specified constant.
      • +1 to both.
      • i can work on adding the test set then the sessions tool then adding sessions to that spark query.
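A rough, non-Spark sketch of the gap rule under discussion: with track lengths available, two consecutive listens stay in the same session only if the gap between the end of one track and the start of the next is below a threshold (30 minutes here, mirroring the current listened_at rule; everything is illustrative).

```python
MAX_GAP_SECONDS = 30 * 60  # placeholder threshold, to be tuned

def split_into_sessions(listens, max_gap: int = MAX_GAP_SECONDS):
    """listens: dicts with listened_at (epoch seconds) and track_length (seconds),
    sorted by listened_at for a single user. Purely illustrative."""
    sessions, current = [], []
    for listen in listens:
        if current:
            prev = current[-1]
            gap = listen["listened_at"] - (prev["listened_at"] + prev["track_length"])
            if gap > max_gap:
                sessions.append(current)
                current = []
        current.append(listen)
    if current:
        sessions.append(current)
    return sessions
```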
      • mayhem
        perfect.
      • alastairp
        cool
      • lucifer
        when will you be back from vacation btw?
      • mayhem
        sept. :/
      • lucifer
        i see.
      • alastairp: what about you?
      • mayhem
        well, I'll be around a bit each day, but just to answer emails/questions
      • alastairp
        I expect to be back from 22 Aug
      • lucifer
        ok cool, lets discuss some more tickets then.