#metabrainz

/

13:30 PM
alastairp

I think that this is why even though we're running npm as root, you're seeing permission denied errors, I think it's trying to write or chown the files, but doing it as your user, not as root. I remember running into this issue in another project where npm was completely screwing up the owner of my database files...

2022-07-25 20621, 2022

13:31 PM
alastairp

I'm going to delete the prebuilt files that you have (../anshg1214/critiquebrainz/critiquebrainz/frontend/static/build) and re-run it, I think that this will correctly be able to create the directory and write the files with the correct owner

2022-07-25 20638, 2022

13:31 PM
alastairp

but I'll also add in the trick that we have in LB where we run the commands as a specific user id

2022-07-25 20639, 2022

13:36 PM
lucifer

chinmay: are you unable to create a GlobalAppContextT or do you want to add a new context?

2022-07-25 20659, 2022

13:40 PM
chinmay

I want to add a new context

2022-07-25 20623, 2022

13:43 PM
lucifer

i see, what data will be part of it and how do you intend to use it?

2022-07-25 20618, 2022

13:46 PM
ansh

alastairp: This is interesting! Did it work?

2022-07-25 20647, 2022

13:46 PM
alastairp

ansh: I deleted your build directory and re-ran it (as root), and it changed the owner to you

2022-07-25 20611, 2022

13:47 PM
alastairp

I tried with docker run's --user flag, and it failed to run (with the same error that lucifer encountered the other week)

2022-07-25 20643, 2022

13:47 PM
alastairp

it looks like npm now needs to check for your user id in /etc/password, and if it doesn't exist then it fails with an unexplained error

2022-07-25 20619, 2022

13:48 PM
alastairp

ansh: can you try and start it up again and check that the static builder loads and correctly re-builds when a file changes?

2022-07-25 20638, 2022

13:48 PM
ansh

Sure!

2022-07-25 20647, 2022

13:48 PM
alastairp

I'll open a ticket to suggest that we add a user to our images so that we can run npm

2022-07-25 20600, 2022

13:49 PM
alastairp

this is an issue for LB too

2022-07-25 20603, 2022

13:51 PM
ansh

It works without any error

2022-07-25 20603, 2022

13:52 PM
alastairp

great!

2022-07-25 20612, 2022

13:52 PM
lucifer

alastairp: why not run npm tests as root?

2022-07-25 20627, 2022

13:52 PM
alastairp

lucifer: this is for running `webpack`

2022-07-25 20631, 2022

13:52 PM
alastairp

or npm install

2022-07-25 20642, 2022

13:52 PM
lucifer

i see. but that works fine in LB, no?

2022-07-25 20644, 2022

13:52 PM
alastairp

it automatically chown's to the owner of . even if you run it as root

2022-07-25 20616, 2022

13:53 PM
alastairp

the specific issue we encountered is that when ansh last ran the container with node 12, it left the bundle files owned by root

2022-07-25 20638, 2022

13:53 PM
alastairp

and then when he re-ran it with node 16 (npm 8, I guess the issue was), it tried to chown the existing files and failed

2022-07-25 20659, 2022

13:53 PM
alastairp

when I deleted the files and tried again, it successfully builds and chown's

2022-07-25 20645, 2022

13:55 PM
lucifer

ah ok, if the fix is simple, then we should probably do it but if the issue won't come back again as long as we are running node 16 and fixing is complex, it might be fine to leave as is.

2022-07-25 20604, 2022

13:57 PM
alastairp

this specific issue that we encountered is "you had existing build assets owned by root and then you upgraded from node 12 to node 16"

2022-07-25 20613, 2022

13:57 PM
alastairp

(and you're running on linux, and you don't have sudo)

2022-07-25 20636, 2022

13:57 PM
alastairp

so it's probably only affecting ansh in this specific case

2022-07-25 20613, 2022

13:58 PM
lucifer

makes sense, yeah it seems very specific and if it won't occur again (acc to my understanding it won't), fine to leave as is i say

2022-07-25 20614, 2022

13:58 PM
alastairp

if we want to take our previous plans and start running everything in a container as a local user... it looks like npm requires a user in /etc/passwd to be able to run

2022-07-25 20648, 2022

14:00 PM
chinmay

lucifer: I want to clean the data from API before taking it any further as there can be duplicates. There will be one array for the this cleaned data. I also want to keep a list of release_mbids that don't have coverarts when the page is loaded. So for example if I want to filter out releases with cover arts only, I'll filter the main data with the list of no coverart releases and pass it to ahead to update the page.

2022-07-25 20648, 2022

14:00 PM
chinmay

Similar logic will apply for any filters.

2022-07-25 20648, 2022

14:00 PM
chinmay

What do you think? is there a simple way to implement filters?

2022-07-25 20607, 2022

14:04 PM
lucifer

chinmay: pass the entire data to release page and then maintain multiple arrays/maps inside it. one for full other for filtered one so on?

2022-07-25 20600, 2022

14:08 PM
lucifer

for making filter, i think a `.filter` or `if` should suffice.

2022-07-25 20603, 2022

14:08 PM
ansh

alastairp: I am unable to login into sentry. I created an account and it gives me an error saying something went wrong.

2022-07-25 20634, 2022

14:08 PM
lucifer

https://www.freecodecamp.org/news/how-to-make-a-f…

2022-07-25 20651, 2022

14:08 PM
lucifer

something like this should probably work.

2022-07-25 20613, 2022

14:09 PM
chinmay

lucifer: I'll go through it

2022-07-25 20631, 2022

14:09 PM
alastairp

ansh: interesting, let me see if sentry reported an error to itself :)

2022-07-25 20647, 2022

14:09 PM
lucifer

chinmay: 👍 feel free to ask if you any other doubts.

2022-07-25 20651, 2022

14:10 PM
chinmay

yeah

2022-07-25 20636, 2022

14:11 PM
lucifer

this example also looks useful, https://developer.mozilla.org/en-US/docs/Learn/To…

2022-07-25 20605, 2022

14:13 PM
alastairp

ansh: were you able to create an account?

2022-07-25 20605, 2022

14:13 PM
chinmay

Awesome! I'll check this one out too

2022-07-25 20612, 2022

14:13 PM
alastairp

or when you went to create it, it returned the error?

2022-07-25 20600, 2022

14:14 PM
ansh

When I went to create one, it gave me an error

2022-07-25 20626, 2022

14:14 PM
alastairp

right, because it says that your invite is still active

2022-07-25 20603, 2022

14:15 PM
alastairp

I see no errors in sentry

2022-07-25 20607, 2022

14:15 PM
ansh

this is what it says https://usercontent.irccloud-cdn.com/file/9X3BpSW…

2022-07-25 20606, 2022

14:16 PM
alastairp

yeah, right. weird - so that should definitely be reported as an error

2022-07-25 20623, 2022

14:16 PM
alastairp

it's possible that we've not created an account for anyone since we last performed a sentry upgrade

2022-07-25 20630, 2022

14:16 PM
alastairp

I'll try and upgrade it again and we can try again

2022-07-25 20601, 2022

14:17 PM
ansh

yep

2022-07-25 20613, 2022

14:41 PM
q3lont has quit

2022-07-25 20618, 2022

15:19 PM
alastairp

https://usercontent.irccloud-cdn.com/file/tQxqkER…

2022-07-25 20643, 2022

15:19 PM
alastairp

mayhem: closing my office door (10 m^2) certainly makes CO2 peak

2022-07-25 20623, 2022

15:30 PM
Pratha-Fish

alastairp: I up for discussion whenever you're free 👀

2022-07-25 20615, 2022

15:31 PM
Pratha-Fish

*I'm

2022-07-25 20610, 2022

15:32 PM
mayhem

Wow, quite a serious peak at that.

2022-07-25 20624, 2022

15:32 PM
alastairp

Pratha-Fish: ok, give me 10 minutes, just doing some server maintenance

2022-07-25 20649, 2022

15:32 PM
Pratha-Fish

sure

2022-07-25 20650, 2022

15:32 PM
alastairp

yeah, I seem to idle around 400-500 with the door to the rest of the house open. at the moment also got window open + fan on

2022-07-25 20619, 2022

15:33 PM
alastairp

yes, it felt sticky and heavy, but not sure how much of that was just hot + humid too

2022-07-25 20638, 2022

15:33 PM
alastairp

and I knew that I was making it go up, so maybe it was 90% psychosomatic too

2022-07-25 20617, 2022

15:35 PM
alastairp

a friend who has a monitor in his office says that he has a push notification to his phone at 2000 to remind him to open a window

2022-07-25 20624, 2022

15:36 PM
alastairp

upgrading sentry, it'll be down temporarily, hopefully not more than 5 mins

2022-07-25 20637, 2022

15:44 PM
lucifer

mayhem: alastairp: apparently feedback dumps has been broken for a long while now. just discovered when trying to make couchdb dumps.

2022-07-25 20617, 2022

15:45 PM
mayhem

Huh. How did the dump checker not catch that?

2022-07-25 20644, 2022

15:45 PM
lucifer

http://data.metabrainz.org/pub/musicbrainz/listen…

2022-07-25 20650, 2022

15:45 PM
lucifer

dump exists but no data files inside

2022-07-25 20613, 2022

15:48 PM
alastairp

bug with the query/data collection?

2022-07-25 20623, 2022

15:48 PM
alastairp

but it didn't raise an exception?

2022-07-25 20647, 2022

15:48 PM
lucifer

data was dumped but never added to tarfile

2022-07-25 20658, 2022

15:48 PM
alastairp

oops

2022-07-25 20609, 2022

15:49 PM
lucifer

https://github.com/metabrainz/listenbrainz-server…

2022-07-25 20616, 2022

15:49 PM
lucifer

it probably broke here

2022-07-25 20623, 2022

15:49 PM
lucifer

tables = [] for user feedback dumps

2022-07-25 20638, 2022

15:49 PM
alastairp

ah, right

2022-07-25 20616, 2022

15:50 PM
lucifer

i'll add a `if not tables`: do the old way.

2022-07-25 20643, 2022

15:50 PM
alastairp

why is there no tables for feedback dump? because we have a different way of creating the dump?

2022-07-25 20628, 2022

15:51 PM
lucifer

yes those are dumped using a manual sql query.

2022-07-25 20642, 2022

15:51 PM
alastairp

feedback dump isn't designed to be imported into a db, right? so in that case the pre-sorting isn't an issue?

2022-07-25 20651, 2022

15:51 PM
lucifer

further, those are json dumps so the FK issue doesn't happen either.

2022-07-25 20656, 2022

15:51 PM
lucifer

yup right

2022-07-25 20606, 2022

15:53 PM
alastairp

and I'm just reading the code for feedback dumps, the number name name of files is variable based on the result of the query it seems (so we can't set `tables` to something ahead of time)

2022-07-25 20621, 2022

15:53 PM
alastairp

it'd be nice to try and set `tables` to None in this case so that we ended up with an exception

2022-07-25 20622, 2022

15:54 PM
alastairp

because we already have a special if statement for making the feedback dumps, I think it's fine to do what you suggest. However, maybe it makes sense to move this to a separate codeflow anyway instead of having if statements everywhere - not sure how much of this code is shared and how much is different

2022-07-25 20637, 2022

15:54 PM
alastairp

ansh: I've upgraded sentry. can you try and sign in again?

2022-07-25 20616, 2022

15:55 PM
lucifer

makes sense, that's what i had done initially for couchdb dumps but then it failed so i changed to [] and it worked fine. then i saw dump was empty so investigatd. https://github.com/metabrainz/listenbrainz-server…

2022-07-25 20649, 2022

15:55 PM
lucifer

yes +1 on separating json dumps from pg dumps.

2022-07-25 20642, 2022

15:56 PM
alastairp

in addition, it seems that feedback dumps write all files to a temp dir before adding to the archive? we don't dump/add/rm in a loop?

2022-07-25 20622, 2022

15:57 PM
alastairp

but I guess that's like we do it for all dumps anyway, I can't remember if we tried the previous way to try and minimise transient disk usage

2022-07-25 20627, 2022

16:00 PM
alastairp

lucifer: LB planning meeting?

2022-07-25 20633, 2022

16:00 PM
lucifer

yup

2022-07-25 20644, 2022

16:00 PM
lucifer

mayhem: around?

2022-07-25 20627, 2022

16:01 PM
mayhem

yep

2022-07-25 20632, 2022

16:01 PM
lucifer

👍

2022-07-25 20624, 2022

16:02 PM
lucifer

firstly, i am planning to finish the couchdb integration this week. currently working on dumps, will probably get those working today and start working on tests for dumps tomorrow onwards.

2022-07-25 20636, 2022

16:02 PM
mayhem

good.

2022-07-25 20605, 2022

16:03 PM
lucifer

after that i was thinking of restarting work on rabbitmq stuff.

2022-07-25 20627, 2022

16:03 PM
lucifer

we had talked about replacing pika with kombu sometime ago and see if that fixes our woes.

2022-07-25 20634, 2022

16:03 PM
mayhem

I think there is also some work that could be done on the similarity stuff...

2022-07-25 20650, 2022

16:03 PM
lucifer

recording similarity?

2022-07-25 20613, 2022

16:04 PM
mayhem

yes. right now I feel frustrated because it is hard to evaluate the results.

2022-07-25 20626, 2022

16:04 PM
mayhem

we have not guarantee that the algorithm in working or not.

2022-07-25 20635, 2022

16:04 PM
lucifer

i can add a datasethoster query if it helps.

2022-07-25 20646, 2022

16:04 PM
yuzie joined the channel

2022-07-25 20601, 2022

16:05 PM
lucifer

fwiw these datasets should be present on wolf currently, https://docs.google.com/document/d/1BJbXFPqgu2x5z…

2022-07-25 20646, 2022

16:05 PM
mayhem

yes, I've looked at them, but just shrugged.

2022-07-25 20652, 2022

16:05 PM
mayhem

unclear on how to proceed.

2022-07-25 20605, 2022

16:06 PM
mayhem

what we need is a dummy test set with an expected set of results

2022-07-25 20613, 2022

16:06 PM
lucifer

i see. makes sense

2022-07-25 20624, 2022

16:06 PM
mayhem

so that we can verify that the algorithm is working. once we have that, then we can start tuning the alg

2022-07-25 20658, 2022

16:06 PM
mayhem

another thing on that front is the improvement of what constitutes a session.

2022-07-25 20608, 2022

16:07 PM
lucifer

we can create a dummy test but unsure how to determine the baseline set of expected results.

2022-07-25 20610, 2022

16:07 PM
mayhem

we will need to have access to track lengths in spark in order to improve those.

2022-07-25 20636, 2022

16:07 PM
lucifer

we can import release dumps for that i think.

2022-07-25 20639, 2022

16:07 PM
mayhem

well, the dummy test needs to have a clear and contrived set of data.

2022-07-25 20647, 2022

16:07 PM
mayhem

one where we know exactly what the result should be

2022-07-25 20606, 2022

16:08 PM
mayhem

and if the result is not what we expect, then we have a bug and need to fix it.

2022-07-25 20627, 2022

16:08 PM
mayhem

then we need to focus on improving "listening sessions"/.

2022-07-25 20657, 2022

16:08 PM
mayhem

only when we have reliable sessions, can we build better similarity data.

2022-07-25 20605, 2022

16:09 PM
mayhem

and once Pratha-Fish is done, we have clearned up data and we can use that for similarity data.

2022-07-25 20616, 2022

16:09 PM
mayhem

and THAT will push us over the edge of having sufficient data.

2022-07-25 20631, 2022

16:09 PM
lucifer

makes sense

2022-07-25 20649, 2022

16:09 PM
mayhem

so, when you run low on things to do, work on stuff that moves us down this path.

2022-07-25 20629, 2022

16:10 PM
lucifer

do you have thoghts on how to build the test set? we can pick tracks and their similar ones but there likely will be bias.

2022-07-25 20646, 2022

16:10 PM
lucifer

*pick manually ourselves

2022-07-25 20613, 2022

16:11 PM
mayhem

I would not even use real data.

2022-07-25 20614, 2022

16:11 PM
alastairp

I think we're going to have to generate synthetic data that we know the algorithm will throw back as similar

2022-07-25 20619, 2022

16:11 PM
mayhem

that.

2022-07-25 20636, 2022

16:11 PM
lucifer

i see, makes sense. that should be doable.

2022-07-25 20646, 2022

16:12 PM
lucifer

about listening sessions, what do you have in mind? currently we use listened_at difference between listens for 30 mins. once we have track length, what should be done differently?

2022-07-25 20632, 2022

16:13 PM
mayhem

look for consecutive tracks and make sure there are no significant gaps.

2022-07-25 20607, 2022

16:14 PM
mayhem

I'd like to have a tool that allows us to review sessions. have it spit out the sessions for a user for the last week or so.

2022-07-25 20614, 2022

16:14 PM
mayhem

then we can see if they make sense.

2022-07-25 20628, 2022

16:14 PM
mayhem

right now, we made a basic assumption and we haven't verified it.

2022-07-25 20634, 2022

16:14 PM
lucifer

i see, so running difference should be less than a specified constant.

2022-07-25 20610, 2022

16:15 PM
lucifer

+1 to both.

2022-07-25 20621, 2022

16:16 PM
lucifer

i can work on adding the test set then the sessions tool then adding sessions to that spark query.

2022-07-25 20634, 2022

16:16 PM
mayhem

perfect.

2022-07-25 20640, 2022

16:17 PM
alastairp

cool

2022-07-25 20657, 2022

16:17 PM
lucifer

when will you be back from vacation btw?

2022-07-25 20627, 2022

16:18 PM
mayhem

sept. :/

2022-07-25 20641, 2022

16:18 PM
lucifer

i see.

2022-07-25 20645, 2022

16:18 PM
lucifer

alastairp: what about you?

2022-07-25 20649, 2022

16:18 PM
mayhem

well, I'll be around a bit each day, but just to answer emails/question

2022-07-25 20650, 2022

16:18 PM
alastairp

I expect to be back from 22 Aug

2022-07-25 20609, 2022

16:19 PM
lucifer

ok cool, lets discuss some more tickets then.