ok, I switched to token-based auth like in `submit-listens` for now, for testing
thanks @fettuccinae:matrix.org for helping me out!
Now the auth works and I am able to test the endpoint, but when I use `current_user` it doesn't work: it treats me as an anonymous user with no `id` attribute
It considers me not logged in even if I authenticate with a token
Looks like I'll have to find an easy way to log in and then test in Postman
Can someone pls help me out with it?
lucifer
pite has quit
Kladky joined the channel
lucifer[m]
suvid: add a simple html button and dropdown on any page in your local lb frontend code and use it to test your endpoints.
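A minimal server-side sketch for reference: `current_user` comes from Flask-Login's session cookie, so token-authenticated requests stay anonymous. The `submit-listens` family of endpoints resolves the user from the `Authorization` header instead; assuming `validate_auth_header` still lives at this import path and returns the user record, the pattern looks like:

```python
from flask import Blueprint, jsonify

# Assumption: this is the same helper the submit-listens endpoint uses.
from listenbrainz.webserver.views.api_tools import validate_auth_header

test_api_bp = Blueprint("test_api", __name__)  # hypothetical blueprint

@test_api_bp.post("/my-endpoint")  # hypothetical route
def my_endpoint():
    # Raises APIUnauthorized if the "Authorization: Token <token>" header
    # is missing or invalid; otherwise returns the matching user record.
    user = validate_auth_header()
    return jsonify({"user_id": user["id"]})
```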
holycow23: remind me, did you want me to help resolve a particular issue on the genre activity PR or just generally review?
holycow23[m] joined the channel
holycow23[m]
I have made the PR but I'm not sure about the working of the entire stats; I just followed all the steps you had asked for
lucifer[m]
i see, did you try testing it on your wolf setup?
holycow23[m]
It would help if you could give me the steps to test the entire thing on wolf
No, I haven't tested it yet; could you help me with the steps to start the test?
lucifer[m]
okay, i'll take a look.
monkey: hi! do you think there is value in showing users the original name of the file they uploaded for a listens import?
sure.
monkey[m] joined the channel
monkey[m]
Hi! Yes I think that would potentially be very useful.
Looking at my Spotify exports, for example, there are many files I would need to upload; this would help keep track
Probably also useful to have the timestamps for the first and last listen for that file
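A hedged sketch of how those per-file timestamps could be computed at import time; the `ts` field name matches Spotify's extended streaming history exports, and other formats would differ:

```python
import json

def listen_range(path):
    """Return the first and last listen timestamps in one export file."""
    with open(path, encoding="utf-8") as f:
        records = json.load(f)
    # Spotify's extended history uses ISO-8601 "ts" values, so string
    # min/max gives chronological order; skip records without one.
    timestamps = [r["ts"] for r in records if r.get("ts")]
    return min(timestamps), max(timestamps)
```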
suvid[m]
lucifer: could you also clarify the schema part for the new table?
lucifer[m]
I was thinking we would store only the name of the zip file uploaded.
suvid[m]
Lemme find the msg I sent above
lucifer[m]
share your current schema
suvid[m]
<suvid[m]> "Also, lucifer what should be the..." <- > <@suvid:matrix.org> Also, lucifer what should be the schema for the user_data_import table?... (full message at <https://matrix.chatbrainz.org/_matrix/media/v3/...>)
lucifer: I am planning to store the file path in the schema, so do I also need to store the file name explicitly as well?
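One hedged consideration: if the stored path still ends in the original file name, the name can be recovered with `os.path.basename` and needn't be a separate column; but if uploads get renamed on disk (a common pattern to avoid collisions), the original name is lost from the path and has to be stored explicitly:

```python
import os

# If the original name survives in the path, it can be derived:
stored_path = "/data/imports/42/spotify_export.zip"  # hypothetical path
print(os.path.basename(stored_path))  # -> "spotify_export.zip"

# But if uploads are renamed (e.g. to a UUID) the path no longer
# carries the original name, so a separate column is needed:
renamed_path = "/data/imports/42/0f8c1a7e.zip"  # hypothetical path
print(os.path.basename(renamed_path))  # -> "0f8c1a7e.zip", name lost
```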
holycow23[m]
So my query has `LEFT JOIN genres g ON l.recording_mbid = g.recording_mbid`, but `genres` isn't registered. Originally I used to run `genre_df.createOrReplaceTempView("genres")`, but that's not in the main stats, so how do I resolve it?
Where `genre_df` is `genre_df = spark.read.parquet(f"{config.HDFS_CLUSTER_URI}/recording_genre")`
bitmap[m] joined the channel
bitmap[m]
<reosarevok[m]> "bitmap: any idea what could..." <- I tried entering a similar edit on my dev server and noticed a couple things: (1) `edits_pending` is never incremented on the new url and (2) the edit is never associated to the new url in the `edit_url` table. since those are the criteria the RemoveEmpty script cares about, I'm guessing it was removed by that script
reosarevok[m]
Oh no.
So basically any such edit which is not an autoedit will fail like this, or?
bitmap[m]
yeah, after two days
lucifer[m]
holycow23: you can look at how we read other metadata caches in stats and do something similar. alternatively, i can push a commit to your branch implementing that.
holycow23[m]
I want to look into it on my own; if no success, then I'll ask you
lucifer[m]
sure take a look at how release_metadata_cache is used in release group stats for one example.
holycow23[m]
this `get_release_metadata_cache` is the main function, I am assuming?
reosarevok[m]
<bitmap[m]> "yeah, after two days" <- Are you submitting a fix? I assume the fix is to do both the things you said :)
bitmap[m]
nope i'm writing the script for MBS-14049 rn, if you want I can take a look at it after
<lucifer[m]> "sure take a look at how release_..." <- Does [this](github.com/metabrainz/listenbrainz-server/blob/master/listenbrainz_spark/hdfs/upload.py#L16) help anywhere the GENRE imports
Yeah I was going through this only, but couldn't understand the query that well
lucifer[m]
it reads the data from HDFS and caches it in spark. ideally you would need to copy paste the code and just change the dataframe path.
or just for now, you can do `genres_df = read_files_from_HDFS(RECORDING_RECORDING_GENRE_DATAFRAME)` then `genres_df.createOrReplaceTempView("genres")` in your stats code.
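Put together, a sketch of that quick fix inside the stats code; the import paths are assumptions based on how other stats read their metadata caches, so check `listenbrainz_spark` for the real locations:

```python
# Hedged sketch: register the genre cache as a SQL view so that
# "LEFT JOIN genres g ON l.recording_mbid = g.recording_mbid" resolves.
from listenbrainz_spark.path import RECORDING_RECORDING_GENRE_DATAFRAME
from listenbrainz_spark.utils import read_files_from_HDFS

def register_genres_view():
    genres_df = read_files_from_HDFS(RECORDING_RECORDING_GENRE_DATAFRAME)
    genres_df.createOrReplaceTempView("genres")
```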
holycow23[m]
but what about all the columns?
okay
lucifer[m]
which columns?
holycow23[m]
lucifer[m]: The query in this file is a little confusing at first glance, so it will take time to understand what's happening here
lucifer[m]
which particular query? can you point to the specific line?
asking because there is no query in that file.
holycow23[m]
oh wait my bad I was looking at get_release_group_metadata_cache_query
lucifer[m]
ah well :). check the three specific functions that i shared above.
holycow23[m]
yes going through that right now
in parallel, I have started a test run to see what the next error is 😢
lucifer[m]
i think it's possible that the test run will fail because the genre data might not be present in the test data.
my suggestion would be to do a normal run using `./develop.sh`; that would have the data from the sample dumps.
holycow23[m]
something like this: `./develop.sh manage spark request_user_stats --type entity --entity artists --range this_week`?
lucifer[m]
yes
holycow23[m]
Got `PathNotFoundException: Path not found: /recording_genre`, so I am assuming it's just missing data?
lucifer[m]
yup
do you get that with the `./develop.sh` run as well?
holycow23[m]
No, trying that right now
lucifer[m]
okay cool.
holycow23[m]
I ran the query, not the request consumer?
lucifer[m]
holycow23, suvid, rayyan_seliya123, m.amanullah7: you can collect your doubts and any queries that you have, or anything that is not clear about the LB codebase etc., and we can have a meet tomorrow or the day after to discuss it.
holycow23: what do you mean?
holycow23[m] uploaded an image: (28KiB) < https://matrix.chatbrainz.org/_matrix/media/v3/download/matrix.org/sEvfUQTzSoeYeXHnpeYOIpFl/image.png >
holycow23[m]
How do I check if it ran properly?
I haven't written the script to render the frontend; or do I write that and then check?
lucifer[m]
ah okay, yes check logs for request consumer container.
`./develop.sh spark logs request_consumer`
holycow23[m] sent a request_consumer code block: https://matrix.chatbrainz.org/_matrix/media/v3/download/chatbrainz.org/yGQIDsGLFuXdGqfLMOOzxucc
holycow23[m]
But I have defined `'stats.user.genre_activity': listenbrainz_spark.stats.user.genre_activity.get_genre_activity,`
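For context, a hedged sketch of where that entry sits: the request consumer dispatches on a query-name-to-handler map (`listenbrainz_spark/query_map.py` in current master; treat the exact shape as an assumption). If the running container was built before the entry was added, the lookup fails even though the code exists on the branch:

```python
import listenbrainz_spark.stats.user.genre_activity

functions = {
    # ...existing handlers...
    'stats.user.genre_activity':
        listenbrainz_spark.stats.user.genre_activity.get_genre_activity,
}

def get_query_handler(query_name):
    # The consumer looks the incoming query name up here; a missing key
    # means the handler was never registered in the running image.
    return functions[query_name]
```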
lucifer[m]
i see, maybe confirm you are on the correct branch and restart the request consumer container
holycow23[m]
Yeah branch is fine
lucifer[m]
`./develop.sh spark build`, `./develop.sh spark down`, `./develop.sh spark up`
holycow23[m]
is there a quick alternative to see if it will work?
lucifer[m]
try testing on pyspark repl/cmd with higher driver memory.
do the setup that you usually need to do, as shared in the previous gists, then `from listenbrainz_spark.stats.user.genre_activity import get_genre_activity` and call it with the appropriate parameters.
holycow23[m]
oh, so you want me to run `pyspark --driver-memory 8g` and then the script to run the test?
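Roughly, yes; a hedged sketch of that REPL session (the session-setup call and the handler's parameters are assumptions, so match them to the earlier gists and the function's actual signature):

```python
# Inside `pyspark --driver-memory 8g`, or a plain python shell.
import listenbrainz_spark

# Assumption: init_spark_session is the same setup other LB spark
# entry points use; getOrCreate semantics make it safe in the REPL.
listenbrainz_spark.init_spark_session("genre-activity-test")

from listenbrainz_spark.stats.user.genre_activity import get_genre_activity

# Stats handlers typically yield result messages; "this_week" mirrors
# the request_user_stats invocation used earlier.
for message in get_genre_activity("this_week"):
    print(message)
```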