those similarities are pretty good, but is there any way for me to check them in DSH?
2025-04-02 09233, 2025
lucifer[m]
mayhem: not at the moment.
2025-04-02 09241, 2025
lucifer[m]
yes MLHD similarity data.
2025-04-02 09250, 2025
mayhem[m]
ok, let me think up a few examples.
2025-04-02 09226, 2025
lucifer[m]
i am pretty sure there is some bug in the data because i got 42M similarity pairs which is more than the 14M we get from LB data but still not as much as I'd expect.
the pair count may not be the right thing to judge this by -- perhaps sum the counts to see if all tracks are accounted for?
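A minimal sketch of the check suggested above: alongside the raw pair count, sum the scores and see which known tracks never show up in any pair. The function name `coverage_report`, the `(track_a, track_b, score)` row layout and the `known_tracks` set are all assumptions made for illustration, not the actual ListenBrainz schema or code.

```python
def coverage_report(pairs, known_tracks):
    """Summarise similarity pairs.

    pairs: iterable of (track_a, track_b, score) tuples (hypothetical layout).
    known_tracks: set of track ids expected to appear at least once.
    """
    seen = set()
    pair_count = 0
    total_score = 0
    for track_a, track_b, score in pairs:
        pair_count += 1
        total_score += score
        seen.add(track_a)
        seen.add(track_b)
    return {
        "pair_count": pair_count,
        "total_score": total_score,
        "tracks_seen": len(seen),
        # Tracks that never appear in any pair; a large list hints at a bug.
        "missing": sorted(known_tracks - seen),
    }


if __name__ == "__main__":
    demo_pairs = [(1, 2, 10), (2, 3, 5)]
    print(coverage_report(demo_pairs, {1, 2, 3, 4}))
```

A pair count on its own can look plausible while whole groups of tracks are absent, which is why the sketch reports the missing ids separately.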
2025-04-02 09218, 2025
mayhem[m]
can you use session_based_days_7500_session_300_contribution_5_threshold_10_limit_100_filter_True_skip_30 ?
2025-04-02 09231, 2025
mayhem[m]
because that is the dataset we have right now, so that would be a good comparison
2025-04-02 09214, 2025
lucifer[m]
sure
2025-04-02 09246, 2025
lucifer[m]
i need to make some changes to the code but we can run this entirely on spark cluster in ~10-12 hours.
2025-04-02 09210, 2025
lucifer[m]
so we can run experiments easily with this data.
2025-04-02 09254, 2025
lucifer[m]
possibly even ~6 hours but i'll have to run some tests.
2025-04-02 09215, 2025
lucifer[m]
taking it one chunk at a time, using zstd compression, changing mbids to ids, breaking the data generation into two stages.
2025-04-02 09236, 2025
lucifer[m]
helped ensure that we are able to process the data on the existing cluster. and it's crazy because the final data (ids not mbids) is just a 180 MB parquet atm.
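A rough sketch of two of the ideas mentioned above: replacing MBIDs with small integer ids and working through the data one chunk at a time. The chat does not say what the two stages actually are, so the split used here (first build the id mapping, then rewrite the pairs chunk by chunk) is an assumption, as are all function names. Spark and zstd-compressed parquet output are left out so the example runs with the standard library alone.

```python
from itertools import islice


def chunked(rows, size):
    """Yield lists of at most `size` rows from any iterable."""
    iterator = iter(rows)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def build_id_map(mbids):
    """Assumed stage 1: assign each distinct MBID a compact integer id."""
    id_map = {}
    for mbid in mbids:
        if mbid not in id_map:
            id_map[mbid] = len(id_map)
    return id_map


def encode_pairs(rows, id_map, chunk_size=2):
    """Assumed stage 2: rewrite (mbid_a, mbid_b, score) rows as integer ids,
    holding only one chunk in memory at a time."""
    for chunk in chunked(rows, chunk_size):
        yield [(id_map[a], id_map[b], score) for a, b, score in chunk]


if __name__ == "__main__":
    # Placeholder strings stand in for real 36-character MBIDs.
    rows = [("a", "b", 3), ("b", "c", 1), ("a", "c", 2)]
    id_map = build_id_map(m for row in rows for m in row[:2])
    for encoded_chunk in encode_pairs(rows, id_map):
        print(encoded_chunk)
```

An MBID is a 36-character UUID string, so replacing each one with an integer id shrinks every row considerably, which fits with the final output being small.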
2025-04-02 09200, 2025
mayhem[m]
oh wow. you calculated this without the big new VM?
2025-04-02 09218, 2025
lucifer[m]
yes
2025-04-02 09238, 2025
mayhem[m]
amazing. this is so much better for being able to iterate this data
2025-04-02 09252, 2025
lucifer[m]
yup indeed
2025-04-02 09227, 2025
lucifer[m]
i had spent a week on figuring out how to make this work lol but just after i asked for the vm yesterday, i figured out how to fix the issue..
2025-04-02 09226, 2025
BrainzGit
[listenbrainz-server] MonkeyDo merged pull request #3241 (master…hide-apple-music-export): Disable "Export to Apple Music" option if user not signed in, or into Apple Music https://github.com/metabrainz/listenbrainz-server…
2025-04-02 09253, 2025
BobSwift[m]
When do you expect the new country code from MBS-12170 to be available in the json output on musicbrainz.org (or beta.musicbrainz.org)? I'm looking into adding this into the variables available in Picard, and I want to make sure I'm extracting and testing properly. Thanks.
augh, co-opting the artist *country* because releases have useless information is s...
2025-04-02 09214, 2025
Jigen
now adding it to the api I'm not against, but the real issue here is that we need to stop/revert/codify to prevent these "205 country" monster releases
2025-04-02 09250, 2025
Jigen
sigh. it was literally what the [worldwide] thing was supposed to be *for*
2025-04-02 09253, 2025
Jigen
sigh
2025-04-02 09218, 2025
bitmap[m]
<BobSwift[m]> "When do you expect the new ..." <- I believe reosarevok is planning to update the beta server tomorrow
2025-04-02 09215, 2025
bitmap[m]
<BobSwift[m]> "When do you expect the new ..." <- I've deployed this on the test server in the meantime