I think that this is why even though we're running npm as root, you're seeing permission denied errors, I think it's trying to write or chown the files, but doing it as your user, not as root. I remember running into this issue in another project where npm was completely screwing up the owner of my database files...
2022-07-25 20621, 2022
alastairp
I'm going to delete the prebuilt files that you have (../anshg1214/critiquebrainz/critiquebrainz/frontend/static/build) and re-run it, I think that this will correctly be able to create the directory and write the files with the correct owner
2022-07-25 20638, 2022
alastairp
but I'll also add in the trick that we have in LB where we run the commands as a specific user id
2022-07-25 20639, 2022
lucifer
chinmay: are you unable to create a GlobalAppContextT or do you want to add a new context?
2022-07-25 20659, 2022
chinmay
I want to add a new context
2022-07-25 20623, 2022
lucifer
i see, what data will be part of it and how do you intend to use it?
2022-07-25 20618, 2022
ansh
alastairp: This is interesting! Did it work?
2022-07-25 20647, 2022
alastairp
ansh: I deleted your build directory and re-ran it (as root), and it changed the owner to you
2022-07-25 20611, 2022
alastairp
I tried with docker run's --user flag, and it failed to run (with the same error that lucifer encountered the other week)
2022-07-25 20643, 2022
alastairp
it looks like npm now needs to check for your user id in /etc/passwd, and if it doesn't exist then it fails with an unexplained error
2022-07-25 20619, 2022
alastairp
ansh: can you try and start it up again and check that the static builder loads and correctly re-builds when a file changes?
2022-07-25 20638, 2022
ansh
Sure!
2022-07-25 20647, 2022
alastairp
I'll open a ticket to suggest that we add a user to our images so that we can run npm
2022-07-25 20600, 2022
alastairp
this is an issue for LB too
2022-07-25 20603, 2022
ansh
It works without any error
2022-07-25 20603, 2022
alastairp
great!
2022-07-25 20612, 2022
lucifer
alastairp: why not run npm tests as root?
2022-07-25 20627, 2022
alastairp
lucifer: this is for running `webpack`
2022-07-25 20631, 2022
alastairp
or npm install
2022-07-25 20642, 2022
lucifer
i see. but that works fine in LB, no?
2022-07-25 20644, 2022
alastairp
it automatically chown's to the owner of . even if you run it as root
2022-07-25 20616, 2022
alastairp
the specific issue we encountered is that when ansh last ran the container with node 12, it left the bundle files owned by root
2022-07-25 20638, 2022
alastairp
and then when he re-ran it with node 16 (npm 8, I guess the issue was), it tried to chown the existing files and failed
2022-07-25 20659, 2022
alastairp
when I deleted the files and tried again, it successfully builds and chown's
2022-07-25 20645, 2022
lucifer
ah ok, if the fix is simple, then we should probably do it but if the issue won't come back again as long as we are running node 16 and fixing is complex, it might be fine to leave as is.
2022-07-25 20604, 2022
alastairp
this specific issue that we encountered is "you had existing build assets owned by root and then you upgraded from node 12 to node 16"
2022-07-25 20613, 2022
alastairp
(and you're running on linux, and you don't have sudo)
2022-07-25 20636, 2022
alastairp
so it's probably only affecting ansh in this specific case
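A minimal sketch of how one might spot the condition described above (build assets left behind with a different owner than the directory they live in). The function name `files_with_unexpected_owner` is hypothetical, and this only inspects ownership; it is not what npm itself does.

```python
import os


def files_with_unexpected_owner(build_dir):
    """Return paths under build_dir whose owner differs from build_dir's owner."""
    expected_uid = os.stat(build_dir).st_uid
    mismatched = []
    for root, dirs, files in os.walk(build_dir):
        for name in dirs + files:
            path = os.path.join(root, name)
            # lstat so that symlinks are checked themselves, not their targets
            if os.lstat(path).st_uid != expected_uid:
                mismatched.append(path)
    return mismatched
```

Pointing this at a build directory such as the `static/build` one mentioned earlier would list any leftover root-owned files before re-running the build.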
2022-07-25 20613, 2022
lucifer
makes sense, yeah it seems very specific and if it won't occur again (acc to my understanding it won't), fine to leave as is i say
2022-07-25 20614, 2022
alastairp
if we want to take our previous plans and start running everything in a container as a local user... it looks like npm requires a user in /etc/passwd to be able to run
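A small sketch of checking whether a given user id has an entry in the passwd database, which is the precondition described above. It uses Python's standard `pwd` module (Unix only); the helper name `uid_has_passwd_entry` is hypothetical.

```python
import os
import pwd


def uid_has_passwd_entry(uid):
    """Return True if uid resolves to an entry in the passwd database."""
    try:
        pwd.getpwuid(uid)
    except KeyError:
        # pwd.getpwuid raises KeyError when no entry exists for the uid
        return False
    return True


if __name__ == "__main__":
    uid = os.getuid()
    print(f"uid {uid} has a passwd entry: {uid_has_passwd_entry(uid)}")
```

Run inside a container started with an arbitrary user id, this would be expected to report False unless the image defines that user.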
2022-07-25 20648, 2022
chinmay
lucifer: I want to clean the data from the API before taking it any further, as there can be duplicates. There will be one array for this cleaned data. I also want to keep a list of release_mbids that don't have cover art when the page is loaded. So for example, if I want to show only releases with cover art, I'll filter the main data using the list of releases without cover art and pass the result ahead to update the page.
2022-07-25 20648, 2022
chinmay
Similar logic will apply for any filters.
2022-07-25 20648, 2022
chinmay
What do you think? is there a simple way to implement filters?
2022-07-25 20607, 2022
lucifer
chinmay: pass the entire data to the release page and then maintain multiple arrays/maps inside it: one for the full data, another for the filtered data, and so on?
2022-07-25 20600, 2022
lucifer
for making filter, i think a `.filter` or `if` should suffice.
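A rough sketch of the dedupe-then-filter idea discussed above, written in Python for illustration (the page itself would use the equivalent `.filter` in the frontend). The dict shape with a `release_mbid` key and both function names are assumptions, not the actual API response format.

```python
def dedupe_releases(releases):
    """Keep the first occurrence of each release_mbid, preserving order."""
    seen = set()
    unique = []
    for release in releases:
        mbid = release["release_mbid"]
        if mbid not in seen:
            seen.add(mbid)
            unique.append(release)
    return unique


def with_cover_art_only(releases, no_cover_art_mbids):
    """Filter the full list using the set of mbids known to lack cover art."""
    return [r for r in releases if r["release_mbid"] not in no_cover_art_mbids]


releases = dedupe_releases([
    {"release_mbid": "a", "title": "First"},
    {"release_mbid": "b", "title": "Second"},
    {"release_mbid": "a", "title": "First"},
])
filtered = with_cover_art_only(releases, no_cover_art_mbids={"b"})
```

The full list stays untouched, so each filter can be recomputed from it rather than from an already filtered copy.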
2022-07-25 20603, 2022
ansh
alastairp: I am unable to login into sentry. I created an account and it gives me an error saying something went wrong.
why is there no tables for feedback dump? because we have a different way of creating the dump?
2022-07-25 20628, 2022
lucifer
yes those are dumped using a manual sql query.
2022-07-25 20642, 2022
alastairp
feedback dump isn't designed to be imported into a db, right? so in that case the pre-sorting isn't an issue?
2022-07-25 20651, 2022
lucifer
further, those are json dumps so the FK issue doesn't happen either.
2022-07-25 20656, 2022
lucifer
yup right
2022-07-25 20606, 2022
alastairp
and I'm just reading the code for feedback dumps, the number and names of files are variable based on the result of the query it seems (so we can't set `tables` to something ahead of time)
2022-07-25 20621, 2022
alastairp
it'd be nice to try and set `tables` to None in this case so that we ended up with an exception
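A toy illustration of the difference between the two sentinel values, assuming the dump code iterates over `tables`. The function `dump_tables` is a hypothetical stand-in, not the listenbrainz-server implementation.

```python
def dump_tables(tables):
    """Hypothetical stand-in for a dump routine that walks a list of table names."""
    dumped = []
    for table in tables:  # raises TypeError when tables is None
        dumped.append(table)  # a real dump would export the table here
    return dumped


# An empty list runs without error and silently produces an empty dump.
empty_result = dump_tables([])

# None fails loudly instead of producing an empty dump.
try:
    dump_tables(None)
    failed_loudly = False
except TypeError:
    failed_loudly = True
```

Under that assumption, `None` surfaces a misconfigured dump immediately, while `[]` lets it pass unnoticed.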
2022-07-25 20622, 2022
alastairp
because we already have a special if statement for making the feedback dumps, I think it's fine to do what you suggest. However, maybe it makes sense to move this to a separate codeflow anyway instead of having if statements everywhere - not sure how much of this code is shared and how much is different
2022-07-25 20637, 2022
alastairp
ansh: I've upgraded sentry. can you try and sign in again?
2022-07-25 20616, 2022
lucifer
makes sense, that's what i had done initially for couchdb dumps but then it failed so i changed to [] and it worked fine. then i saw dump was empty so investigated. https://github.com/metabrainz/listenbrainz-server…
2022-07-25 20649, 2022
lucifer
yes +1 on separating json dumps from pg dumps.
2022-07-25 20642, 2022
alastairp
in addition, it seems that feedback dumps write all files to a temp dir before adding to the archive? we don't dump/add/rm in a loop?
2022-07-25 20622, 2022
alastairp
but I guess that's like we do it for all dumps anyway, I can't remember if we tried the previous way to try and minimise transient disk usage
2022-07-25 20627, 2022
alastairp
lucifer: LB planning meeting?
2022-07-25 20633, 2022
lucifer
yup
2022-07-25 20644, 2022
lucifer
mayhem: around?
2022-07-25 20627, 2022
mayhem
yep
2022-07-25 20632, 2022
lucifer
👍
2022-07-25 20624, 2022
lucifer
firstly, i am planning to finish the couchdb integration this week. currently working on dumps, will probably get those working today and start working on tests for dumps tomorrow onwards.
2022-07-25 20636, 2022
mayhem
good.
2022-07-25 20605, 2022
lucifer
after that i was thinking of restarting work on rabbitmq stuff.
2022-07-25 20627, 2022
lucifer
we had talked about replacing pika with kombu sometime ago and see if that fixes our woes.
2022-07-25 20634, 2022
mayhem
I think there is also some work that could be done on the similarity stuff...
2022-07-25 20650, 2022
lucifer
recording similarity?
2022-07-25 20613, 2022
mayhem
yes. right now I feel frustrated because it is hard to evaluate the results.
2022-07-25 20626, 2022
mayhem
we have no guarantee that the algorithm is working or not.
what we need is a dummy test set with an expected set of results
2022-07-25 20613, 2022
lucifer
i see. makes sense
2022-07-25 20624, 2022
mayhem
so that we can verify that the algorithm is working. once we have that, then we can start tuning the alg
2022-07-25 20658, 2022
mayhem
another thing on that front is the improvement of what constitutes a session.
2022-07-25 20608, 2022
lucifer
we can create a dummy test but unsure how to determine the baseline set of expected results.
2022-07-25 20610, 2022
mayhem
we will need to have access to track lengths in spark in order to improve those.
2022-07-25 20636, 2022
lucifer
we can import release dumps for that i think.
2022-07-25 20639, 2022
mayhem
well, the dummy test needs to have a clear and contrived set of data.
2022-07-25 20647, 2022
mayhem
one where we know exactly what the result should be
2022-07-25 20606, 2022
mayhem
and if the result is not what we expect, then we have a bug and need to fix it.
2022-07-25 20627, 2022
mayhem
then we need to focus on improving "listening sessions".
2022-07-25 20657, 2022
mayhem
only when we have reliable sessions, can we build better similarity data.
2022-07-25 20605, 2022
mayhem
and once Pratha-Fish is done, we have cleaned up data and we can use that for similarity data.
2022-07-25 20616, 2022
mayhem
and THAT will push us over the edge of having sufficient data.
2022-07-25 20631, 2022
lucifer
makes sense
2022-07-25 20649, 2022
mayhem
so, when you run low on things to do, work on stuff that moves us down this path.
2022-07-25 20629, 2022
lucifer
do you have thoughts on how to build the test set? we can pick tracks and their similar ones but there likely will be bias.
2022-07-25 20646, 2022
lucifer
*pick manually ourselves
2022-07-25 20613, 2022
mayhem
I would not even use real data.
2022-07-25 20614, 2022
alastairp
I think we're going to have to generate synthetic data that we know the algorithm will throw back as similar
2022-07-25 20619, 2022
mayhem
that.
2022-07-25 20636, 2022
lucifer
i see, makes sense. that should be doable.
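One possible shape for the contrived test set discussed above: a handful of hand-written sessions where the expected result is known in advance. The co-occurrence counter below is only a stand-in so the example runs; it is not the recording similarity algorithm used in LB, and the track names are made up.

```python
from collections import Counter
from itertools import combinations


def make_synthetic_sessions():
    """Contrived sessions: A and B always appear together, A and D never do."""
    return [
        ["A", "B"],
        ["A", "B", "C"],
        ["B", "A"],
        ["D", "E"],
    ]


def cooccurrence_counts(sessions):
    """Count how many sessions each unordered pair of tracks shares."""
    counts = Counter()
    for session in sessions:
        for first, second in combinations(sorted(set(session)), 2):
            counts[(first, second)] += 1
    return counts


counts = cooccurrence_counts(make_synthetic_sessions())
```

Because the data is synthetic, any mismatch between the computed counts and the expected ones points to a bug in the code under test rather than to noise in the listens.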
2022-07-25 20646, 2022
lucifer
about listening sessions, what do you have in mind? currently we use listened_at difference between listens for 30 mins. once we have track length, what should be done differently?
2022-07-25 20632, 2022
mayhem
look for consecutive tracks and make sure there are no significant gaps.
2022-07-25 20607, 2022
mayhem
I'd like to have a tool that allows us to review sessions. have it spit out the sessions for a user for the last week or so.
2022-07-25 20614, 2022
mayhem
then we can see if they make sense.
2022-07-25 20628, 2022
mayhem
right now, we made a basic assumption and we haven't verified it.
2022-07-25 20634, 2022
lucifer
i see, so running difference should be less than a specified constant.
2022-07-25 20610, 2022
lucifer
+1 to both.
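A minimal sketch of the session idea described above: once track lengths are available, start a new session whenever the gap between the end of one track and the start of the next exceeds a constant. It assumes `listened_at` marks when playback started; the `Listen` class, `split_sessions`, and the 300 second default are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Listen:
    listened_at: int  # unix timestamp in seconds, assumed to be playback start
    duration: int     # track length in seconds


def split_sessions(listens, max_gap=300):
    """Group listens into sessions, breaking where the idle gap exceeds max_gap."""
    sessions = []
    for listen in sorted(listens, key=lambda item: item.listened_at):
        if sessions:
            prev = sessions[-1][-1]
            # idle time between the previous track ending and this one starting
            gap = listen.listened_at - (prev.listened_at + prev.duration)
            if gap <= max_gap:
                sessions[-1].append(listen)
                continue
        sessions.append([listen])
    return sessions


sessions = split_sessions([
    Listen(listened_at=0, duration=200),
    Listen(listened_at=210, duration=180),
    Listen(listened_at=2000, duration=100),
])
```

Printing the resulting groups for one user over the last week would give the kind of review output mentioned above.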
2022-07-25 20621, 2022
lucifer
i can work on adding the test set then the sessions tool then adding sessions to that spark query.
2022-07-25 20634, 2022
mayhem
perfect.
2022-07-25 20640, 2022
alastairp
cool
2022-07-25 20657, 2022
lucifer
when will you be back from vacation btw?
2022-07-25 20627, 2022
mayhem
sept. :/
2022-07-25 20641, 2022
lucifer
i see.
2022-07-25 20645, 2022
lucifer
alastairp: what about you?
2022-07-25 20649, 2022
mayhem
well, I'll be around a bit each day, but just to answer emails/question