#metabrainz

      • rektide joined the channel
      • rozlav has quit
      • rozlav joined the channel
      • lucifer
        aerozol: the PR has merge conflicts, meaning that someone else modified the same file as you in the master branch since you opened the PR. one way to fix it: 1) update your local master branch, 2) merge it into your branch (accepting the changes from master), 3) rerun the command to update snapshots, 4) push the changes
      • i can resolve the conflicts or help you do so if needed.
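The four steps above can be sketched as a small Python helper that only builds the command list and runs nothing. This is a rough sketch under assumptions: the remote name `origin`, the branch name `my-feature`, the helper name `conflict_fix_commands` and the snapshot command placeholder are all hypothetical and not taken from the chat. `git merge -X theirs` prefers master's side for conflicting hunks, which is one possible reading of "accepting the changes from master"; it does not settle every kind of conflict.

```python
import shlex


def conflict_fix_commands(branch, snapshot_cmd, remote="origin", base="master"):
    """Return the shell commands for the four steps, in order, without running them."""
    return [
        # 1) update the local base branch
        f"git checkout {base}",
        f"git pull {remote} {base}",
        # 2) merge it into the PR branch, preferring the base branch's side of conflicting hunks
        f"git checkout {branch}",
        f"git merge -X theirs {base}",
        # 3) regenerate the snapshots (project-specific command, supplied by the caller)
        snapshot_cmd,
        "git add -A",
        "git commit -m " + shlex.quote("Update snapshots"),
        # 4) push the result to the PR branch
        f"git push {remote} {branch}",
    ]


# The snapshot command is a placeholder; substitute the project's real one.
commands = conflict_fix_commands("my-feature", "<snapshot update command>")
for command in commands:
    print(command)
```

Printing the commands first makes it easy to review them before running anything by hand.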
      • aerozol
        Thanks lucifer, I'll try it tomorrow! Off to D&D now like the huge nerd I am
      • lucifer
        hehe. nice
      • aerozol
        mayhem: I brainstormed some grid ideas but didn't get round to polishing any. I'll pick some candidates and mock them up tomorrow as well
      • BrainzGit
        [listenbrainz-server] 14amCap1712 opened pull request #2184 (03master…update-scripts): Make develop.sh and test.sh compatible with compose v2 https://github.com/metabrainz/listenbrainz-serv...
      • agatzk has quit
      • agatzk joined the channel
      • CatQuest
        aerozol: how about letting there be a user endpoint thing in the grid ideas? like, all these might be interesting. so we could have a default or two, but let people create plugins/scripts to see the other ones (or whatever they pick!), which can be shared between users
      • so you could have different grid ideas and different views, and then users could create more
      • for visualisations and "sharing your music-taste/playlists etc" off site as well as on community
      • some of these grid views would look really snazzy as advertisement for listenbrainz
      • in some music forum or redditpost or whatever
      • aerozol
        I believe that's mayhem's plan! Let people choose the design and then the size/time range etc, to share
      • If that's what you meant. If it's like a data endpoint thing that you mean then I dunno
      • rozlav has quit
      • CatQuest
        I dunno either. I was just replying to your "grid ideas" notepad image :D
      • CatQuest off
      • rozlav joined the channel
      • agatzk has quit
      • mayhem
        Moin!
      • lucifer
        morning!
      • mayhem
        Yes, the Paris bit sure didn't help. I think you should let go of Paris plans, but that is tricky now, I guess.
      • For the next attempt we should make up an important meeting in BCN for a few days. Then use an executive service to apply for the visa.
      • The main goal here is to get one visa on your record and that should make future trips easier.
      • lucifer
        yeah. i guess i would need to book a return ticket and cancel this one but that seems to be the only way so fine i think.
      • yes, makes sense
      • mayhem
        Was getting some amount of refund an option?
      • lucifer
        the booking portal says this one is non-refundable but i'll ask the helpdesk again if they can do anything about it.
      • nbin has quit
      • nbin joined the channel
      • Pratha-Fish
        alastairp: Hello there
      • I've updated the code for the benchmark. Please take a look at it here: https://github.com/Prathamesh-Ghatole/MLHD/blob...
      • alastairp
        hi Pratha-Fish
      • those values look much better
      • I'm a bit suspicious about the exact same filesize for zst and zst-10
      • Pratha-Fish
        me too
      • alastairp
      • Pratha-Fish
        just noticed the similar size 🤦‍♂️
      • alastairp
        should be `csv_zstd10_paths`
      • Pratha-Fish
        yep. it should've been "csv_zstd10_paths"
      • generating new report
      • alastairp
        I'm amazed that the time to write the parquet file is so much smaller than the csv versions
      • could you generate one for csv with no compression? does it make sense to have one for parquet with no compression (I guess that'd be even smaller)
      • I remember you saying that you found a new library which wrote plain csv much faster than the csv library in python, is that right?
      • Pratha-Fish
        yes that's right
      • These benchmarks are done with pandas.to_csv() and pandas.to_parquet() tho.
      • It'd be much faster with arrow
      • for both parquet and csv
      • alastairp
        I'm just trying to intuit how much of the slowness is because of writing to csv and how much is the compression
      • although as I mentioned yesterday, it really doesn't matter how slow writing is because we need to do it only once
      • let's focus now on the script to generate the new dataset, and we can come back to this if we want to write up some stuff about it
      • Pratha-Fish
        sure
      • Also, interestingly, the output size is still the same
      • maybe I am writing the files wrong
      • found the issue. I set up the compression level wrong
      • alastairp
        Pratha-Fish: you're also writing both the 3 and 10 level into the same directory
      • which I guess means when you read it you're actually reading the filesize of the second one you did (level 10?)
      • Pratha-Fish
        alastairp: also with the same filename and extension 🤦‍♂️
      • it also means that the readtimes were wrong as well
      • alastairp
        yeah, as long as it's in a different directory the filename isn't too much of a problem
      • but yeah, it'll affect other things too
      • and yes - I see that read times are basically the same too :)
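The bug discussed above (two compression levels written to the same directory and filename, so the second overwrote the first) can be avoided by giving each level its own output directory. The sketch below is an assumption-laden illustration, not the benchmark from the linked repository: it swaps in the standard library's `gzip` for zstd so that it runs without extra dependencies, and the function names, levels and sample rows are all made up for the example.

```python
import csv
import gzip
import tempfile
import time
from pathlib import Path


def write_compressed_csv(rows, out_dir, level):
    """Write rows to <out_dir>/data.csv.gz at the given gzip level; return (path, seconds)."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / "data.csv.gz"
    start = time.perf_counter()
    with gzip.open(path, "wt", newline="", compresslevel=level) as handle:
        csv.writer(handle).writerows(rows)
    return path, time.perf_counter() - start


def benchmark(rows, base_dir, levels=(1, 9)):
    """Benchmark each level in its own directory so results never overwrite each other."""
    results = {}
    for level in levels:
        # One directory per level: the filename can repeat, the full path cannot.
        path, write_seconds = write_compressed_csv(rows, Path(base_dir) / f"gzip-{level}", level)
        start = time.perf_counter()
        with gzip.open(path, "rt", newline="") as handle:
            row_count = sum(1 for _ in csv.reader(handle))
        results[level] = {
            "path": path,
            "bytes": path.stat().st_size,
            "write_seconds": write_seconds,
            "read_seconds": time.perf_counter() - start,
            "rows": row_count,
        }
    return results


# Synthetic rows standing in for the real dataset.
rows = [[i, f"artist-{i % 50}", f"track-{i % 500}"] for i in range(20000)]
with tempfile.TemporaryDirectory() as tmp:
    report = benchmark(rows, tmp)
    sizes = {level: entry["bytes"] for level, entry in report.items()}
    paths = {str(entry["path"]) for entry in report.values()}
    row_counts = {entry["rows"] for entry in report.values()}
```

Identical sizes and read times across levels in such a report are a hint that the files were overwritten or that the level setting was not applied.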
      • Pratha-Fish
        I've fixed the issues and rerun the computation
      • it should come out fine this time ig
      • while we're at it, let's also run a test for plain csv without compression
      • alastairp: the results are in!
      • ansh
        alastairp: Hi! For CB-394, we need to think about one thing: whether we plan to have more license options later on. If we do, then we probably shouldn't remove the user's preference column from the users table.
      • BrainzGit
        [troi-recommendation-playground] 14amCap1712 merged pull request #65 (03main…return-playlist): Return playlist element from generate_playlist https://github.com/metabrainz/troi-recommendati...
      • [troi-recommendation-playground] 14amCap1712 opened pull request #66 (03main…spotify-submit): Add spotify submission support https://github.com/metabrainz/troi-recommendati...
      • alastairp
        ansh: hmm, right.
      • out of interest, let me take a quick look at what preferences users have set
      • ansh
        These are the stats for the reviews we have. https://www.irccloud.com/pastebin/F3kESilC/
      • alastairp
        ansh: what is your current plan? Keep the existing license table, and keep the license for existing reviews, but from now on only give people the option for CC BY-SA?
      • in fact, I think it makes sense for us to go forward and upgrade this to the v4 license too.
      • one thing we could do is ask users to upgrade their licenses to the v4 version when they log in
      • lucifer
        upgrade their license preference?
      • ansh
        Yes. That's the plan. And also remove the user's preference form to select default license
      • alastairp
        lucifer: ask them if they want to re-license all of their v3 texts as the v4 license
      • lucifer
        i see. currently we don't allow change of license after publishing a review.
      • alastairp
        right, the idea would be to keep the same license, but just change to the newer version of it
      • lucifer
        maybe we can ask them to relicense the NC ones too then?
      • ah ok
      • alastairp
        I understand the v4 CC licenses are much better than earlier ones
      • darkstardevx has quit
      • yes, we could ask them, but they'd have to be able to decline that
      • lucifer
        yup makes sense
      • alastairp
        if we wanted to do that we should explain why we want them to change and why it's a better idea
      • ansh: right, removing the pref option is good. I think you're right, we can keep the user pref column and mark it as null for everyone
      • ansh
        Since many of our users would review from LB and BB, maybe we can add a message there as well.
      • Makes sense
      • alastairp
        I agree that there might be some users who only write reviews via LB/BB, but let's focus on the users who visit CB first
      • we could add an internal API endpoint for ourselves "has-v3-license", and ping it from LB, prompting users to visit CB to do the upgrade
      • but that's not a huge issue
      • Pratha-Fish: great, that looks much better!
      • I wonder how the time for pandas to csv compares with arrow to csv, with no compression
      • ansh
        alastairp: Correct. I'll also upgrade the license to v4 everywhere.
      • Pratha-Fish
        alastairp: let's check
      • alastairp
        ansh: great. We should make a blog post/forum post about this when we release it
      • ansh
        Yes, and we should also include the walkthrough for the users to upgrade the licenses to the new version
      • alastairp
        I just had a quick chat to mayhem, he doesn't think that we need to worry about prompting users to upgrade from v3 to v4
      • so let's just focus on 1) removing the option in the preferences, and 2) setting CC-BY-SA 4.0 as the default for new reviews
      • BrainzGit
        [troi-recommendation-playground] 14amCap1712 opened pull request #67 (03main…daily-jams-accept-day): Accept jam_date as optional argument in daily-jams patch https://github.com/metabrainz/troi-recommendati...
      • ansh
        Okay, I'll begin working on that.
      • alastairp
        lucifer: do you remember why we decided to open this ticket? (remove -NC option), I see that you opened it
      • ansh
        alastairp: Can you please review CB#401 whenever you're free?
      • BrainzBot
        CB-427: Support entity endpoint in the API including average rating: https://github.com/metabrainz/critiquebrainz/pu...
      • alastairp
        👍
      • lucifer
        alastairp: looking at chat logs, it's because it's a non-free license, so we decided to remove it. it seems we also discussed it at summit 2020 but i can't find anything in the summit notes about it.
      • alastairp
        lucifer: oh, thanks for looking back at that. yes, I do have some general understanding that -nc licenses have problems, but can't remember our specific discussions about them
      • lucifer
        >_lucifer: we've been talking about it for 3-4 years, so maybe in another 2 years or so we'll be ready to start :)
      • alastairp
        we just needed the right person to come and help :)
      • lucifer
        alastairp: ^, it'll be 2 years in a week :).
      • alastairp
        !m ansh
      • BrainzBot
        You're doing good work, ansh!
      • lucifer
        oh that was about MeB OAuth.
      • but yes indeed, ansh is doing great work.
      • ansh
        Thank you :)
      • alastairp
        ansh: did you rebase your PRs to include lucifer's recent fixes?
      • ansh
        alastairp: Yes
      • alastairp
        excellent, thanks. opening this one now
      • lucifer
        if it helps, i can review some of those PRs.
      • alastairp
        lucifer: thanks, which one(s) would you like to look at?
      • lucifer
        i can probably start with CB#403
      • BrainzBot
        CB-416: Identify users by musicbrainz username, not uuid: https://github.com/metabrainz/critiquebrainz/pu...
      • alastairp
        perfect
      • agatzk joined the channel
      • lucifer
        alastairp: re the above PR, thoughts on having a single get_user(user_ref) query instead of 1 get_user_by_id and 1 get_user_by_mbid?
      • alastairp
        lucifer: how is it done currently? we check the type of the parameter and then use one of those methods?
      • what's the get_user_from_username_or_id? I see that it's in frontend/views/user
      • moving that to a db method sounds fine to me
      • lucifer
        yes. but actually, thinking about it more, PG would error with an invalid type.
      • alastairp
        in which case?
      • lucifer
        so we'd still need to check in python.
      • alastairp
        if you pass a string as a uuid?
      • lucifer
        yes
      • alastairp
        yes, right
      • lucifer
        we can do id::text but then any index on id will not be used. maybe not an issue?
      • alastairp
        did you have a thought about a way to check both fields in a single sql query? 😮
      • lucifer
        yes right, i meant that.
      • alastairp
        id::text = %s OR username = %s ?
      • lucifer
        `id::text = :user_ref OR musicbrainz_id::text = :user_ref`
      • yes
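As an alternative to casting `id` to text, the "check in python" option mentioned above could look roughly like the sketch below. It is a hypothetical illustration, not CritiqueBrainz code: the helper name `build_user_lookup` and the table name `"user"` are assumptions, and `musicbrainz_id` is assumed to hold the MusicBrainz username. Only the column names `id` and `musicbrainz_id` and the `:user_ref` parameter come from the chat.

```python
import uuid


def build_user_lookup(user_ref):
    """Return (sql, params) selecting a user by UUID id or by MusicBrainz username."""
    try:
        parsed = uuid.UUID(user_ref)
    except ValueError:
        # Not a UUID: treat the reference as a MusicBrainz username.
        return 'SELECT * FROM "user" WHERE musicbrainz_id = :user_ref', {"user_ref": user_ref}
    # Valid UUID: compare against the uuid column directly so an index on id stays usable.
    return 'SELECT * FROM "user" WHERE id = :user_ref', {"user_ref": str(parsed)}


sql_by_name, params_by_name = build_user_lookup("alice")
sql_by_id, params_by_id = build_user_lookup("123e4567-e89b-12d3-a456-426614174000")
```

One caveat of this sketch: `uuid.UUID` also accepts a bare 32-character hex string, so a username of that shape would be treated as an id.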