#metabrainz

0:57 AM
BrainzGit

[listenbrainz-android] dependabot[bot] opened pull request #185 (main…dependabot/gradle/room_version-2.5.2): Bump room_version from 2.5.1 to 2.5.2 https://github.com/metabrainz/listenbrainz-androi…

2023-06-28 17912, 2023

0:57 AM
BrainzGit

[listenbrainz-android] dependabot[bot] opened pull request #186 (main…dependabot/gradle/io.sentry.android.gradle-3.11.1): Bump io.sentry.android.gradle from 3.10.0 to 3.11.1 https://github.com/metabrainz/listenbrainz-androi…

2023-06-28 17916, 2023

0:57 AM
BrainzGit

[listenbrainz-android] dependabot[bot] closed pull request #184 (main…dependabot/gradle/io.sentry.android.gradle-3.11.0): Bump io.sentry.android.gradle from 3.10.0 to 3.11.0 https://github.com/metabrainz/listenbrainz-androi…

2023-06-28 17917, 2023

0:57 AM
BrainzGit

[listenbrainz-android] dependabot[bot] opened pull request #187 (main…dependabot/gradle/androidx.compose-compose-bom-2023.06.01): Bump androidx.compose:compose-bom from 2023.06.00 to 2023.06.01 https://github.com/metabrainz/listenbrainz-androi…

2023-06-28 17932, 2023

0:57 AM
BrainzGit

[listenbrainz-android] dependabot[bot] opened pull request #188 (main…dependabot/gradle/app.cash.turbine-turbine-1.0.0): Bump app.cash.turbine:turbine from 0.13.0 to 1.0.0 https://github.com/metabrainz/listenbrainz-androi…

2023-06-28 17917, 2023

2:39 AM
lucifer has quit

2023-06-28 17927, 2023

2:39 AM
lucifer joined the channel

2023-06-28 17909, 2023

7:06 AM
lucifer

mayhem: checked outsidecontext's listens in spark and those are same as the ones showing up on LB web.

2023-06-28 17930, 2023

8:26 AM
mayhem

huh. odd. I'm at a loss for what it could be then.

2023-06-28 17904, 2023

9:34 AM
lucifer

mayhem: how does this look for tags dataset in timescale?

2023-06-28 17907, 2023

9:34 AM
lucifer

https://www.irccloud.com/pastebin/1GbNcgVX/

2023-06-28 17949, 2023

9:36 AM
mayhem

I think we may still have confusion about the percent column -- not sure.

2023-06-28 17951, 2023

9:37 AM
mayhem

how I am using percent in LB radio is that any given result set that is ordered by count or score, I wish to retrieve a section of the results by specifying a percent range.

2023-06-28 17910, 2023

9:38 AM
lucifer

yup the query will filter the percent

2023-06-28 17933, 2023

9:38 AM
lucifer

https://www.irccloud.com/pastebin/6Y3gAUex/

2023-06-28 17937, 2023

9:38 AM
mayhem

so, if the result has 100 rows and I want results 50% - 75%, then it should return items 50 - 75.

2023-06-28 17919, 2023

9:39 AM
mayhem

ok, does percent really need to be stored in the DB?

2023-06-28 17950, 2023

9:39 AM
lucifer

unless you will always only query a few fixed ranges that can be assigned at time of dataset generation yes.

2023-06-28 17953, 2023

9:39 AM
mayhem

because the same could be accomplished using OFFSET and COUNT. once the total number of rows is know.

2023-06-28 17955, 2023

9:39 AM
mayhem

known

2023-06-28 17901, 2023

9:40 AM
lucifer

not really

2023-06-28 17923, 2023

9:40 AM
lucifer

OFFSET and COUNT have nothing to do with percent really

2023-06-28 17906, 2023

9:41 AM
lucifer

hmm i see you want that ordering by percent as well

2023-06-28 17909, 2023

9:41 AM
lucifer

let me fix that

2023-06-28 17911, 2023

9:41 AM
mayhem

correct. but given the total number of rows of the result set, you can quickly and easily convert % to row numbers and then use those to filter using OFFSET/COUNT.

2023-06-28 17927, 2023

9:41 AM
mayhem

no, I don't want order by percent.

2023-06-28 17940, 2023

9:42 AM
mayhem

I could, just as easily, fetch all rows from the query, and then ignore all the results that lie outside the range I am interested in.

2023-06-28 17956, 2023

9:42 AM
mayhem

but that is wasteful, so that is why I want to have the DB do the "pagination" for me.

2023-06-28 17906, 2023

9:43 AM
mayhem

this is effectively a pagination scheme I want.

2023-06-28 17917, 2023

9:43 AM
mayhem

pagination, result selection.

2023-06-28 17938, 2023

9:43 AM
lucifer

i think we are not on the same page about percent, count and offset.

2023-06-28 17955, 2023

9:43 AM
mayhem

seems not, no.

2023-06-28 17955, 2023

9:43 AM
lucifer

let me explain what i understand so far, and you can correct it

2023-06-28 17959, 2023

9:43 AM
mayhem

ok

2023-06-28 17917, 2023

9:44 AM
lucifer

only considering recording level tags for now.

2023-06-28 17954, 2023

9:44 AM
lucifer

for each tag, find all the recordings that have that tag. then based on their listen counts from popularity dataset assign them a percent.

2023-06-28 17917, 2023

9:45 AM
lucifer

the most listened recording for each tag gets 1.0 and the least listened one gets 0.0.

2023-06-28 17928, 2023

9:45 AM
lucifer

rest are on a spectrum within this range.

2023-06-28 17901, 2023

9:46 AM
lucifer

emphasizing that the percents are calculated for each tag-recording pair.
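
[Editor's sketch] The per-tag percent described above could be computed roughly as follows. lucifer only says the remaining recordings sit "on a spectrum" between 0.0 and 1.0, so the linear min-max scaling used here is an assumption (a rank-based percentile would also fit the description), and the function name, row layout, and sample data are hypothetical.

```python
from collections import defaultdict


def assign_percents(rows):
    """Scale listen counts to [0.0, 1.0] separately for each tag.

    rows: iterable of (tag, recording, listen_count) tuples.
    Returns a dict mapping (tag, recording) to its percent.
    """
    by_tag = defaultdict(list)
    for tag, recording, count in rows:
        by_tag[tag].append((recording, count))

    percents = {}
    for tag, items in by_tag.items():
        counts = [count for _, count in items]
        lowest, highest = min(counts), max(counts)
        span = highest - lowest
        for recording, count in items:
            # Assumption: if every recording of a tag has the same count,
            # all of them get 1.0 (avoids dividing by zero).
            percents[(tag, recording)] = (count - lowest) / span if span else 1.0
    return percents


# Hypothetical sample data: the same recording can carry several tags
# and receives an independent percent for each tag-recording pair.
rows = [
    ("pop", "rec-a", 10000),
    ("pop", "rec-b", 5500),
    ("pop", "rec-c", 1000),
    ("jazz", "rec-a", 40),
    ("jazz", "rec-d", 10),
]
percents = assign_percents(rows)
```

Note that "rec-a" ends up with one value under "pop" and another under "jazz", which is the tag-recording pairing lucifer emphasizes.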

2023-06-28 17906, 2023

9:46 AM
lucifer

so far correct?

2023-06-28 17915, 2023

9:46 AM
mayhem

gimme a few moments to ponder

2023-06-28 17926, 2023

9:47 AM
mayhem

you're quite close, but I prefer for us to store listen counts rather than percent in the DB.

2023-06-28 17944, 2023

9:47 AM
mayhem

because sometimes the range of the data is important.

2023-06-28 17958, 2023

9:47 AM
lucifer

but then you can't filter on percents in api.

2023-06-28 17959, 2023

9:47 AM
mayhem

is the top listen count 1,000 or 10,000?

2023-06-28 17923, 2023

9:48 AM
lucifer

depends on the recording ofc. for something like pop it could be 10000, for less popular ones much less

2023-06-28 17928, 2023

9:48 AM
mayhem

let's assume for a moment I never said anything about percent.

2023-06-28 17948, 2023

9:48 AM
lucifer

sure then we store listen counts in the database

2023-06-28 17908, 2023

9:49 AM
mayhem

and instead, I'm going to be a dumb API user and fetch all the rows for a given tag query.

2023-06-28 17918, 2023

9:49 AM
mayhem

stuff them into a plist in ram in troi.

2023-06-28 17940, 2023

9:49 AM
mayhem

then ask troi to give me results[50:75] and I get exactly what I want.

2023-06-28 17942, 2023

9:49 AM
mayhem

follow me?

2023-06-28 17955, 2023

9:49 AM
lucifer

yes

2023-06-28 17907, 2023

9:50 AM
mayhem

ok, the server didn't use percent at all. yes?

2023-06-28 17911, 2023

9:50 AM
lucifer

yes

2023-06-28 17920, 2023

9:50 AM
mayhem

so, if I want the server to handle percent, I need to do the following.

2023-06-28 17925, 2023

9:50 AM
mayhem

1. execute query.

2023-06-28 17929, 2023

9:50 AM
mayhem

2. get row count.

2023-06-28 17950, 2023

9:50 AM
mayhem

3. calculate start offset from start percent and total row count.

2023-06-28 17905, 2023

9:51 AM
mayhem

4. do the same to calculate number of rows.

2023-06-28 17928, 2023

9:51 AM
mayhem

5. Do the query, with LIMIT and OFFSET.

2023-06-28 17934, 2023

9:51 AM
mayhem

6. Return data.
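
[Editor's sketch] Steps 2 to 4 above amount to a small piece of arithmetic. The helper below is a hypothetical illustration, not code from ListenBrainz or troi; it takes percents as whole numbers (50 for 50%) and truncates fractional row positions, both of which are assumptions.

```python
def percent_range_to_slice(total_rows, start_percent, end_percent):
    """Convert a percent range into an (offset, row_count) pair.

    total_rows: number of rows in the full result set (step 2).
    start_percent, end_percent: bounds given as 0..100.
    """
    offset = int(total_rows * start_percent / 100)   # step 3
    end = int(total_rows * end_percent / 100)
    return offset, end - offset                      # step 4


# mayhem's example: 100 rows, 50% - 75%.
offset, row_count = percent_range_to_slice(100, 50, 75)

# Applying the pair to an in-memory list gives the same rows as results[50:75].
results = list(range(100))
page = results[offset:offset + row_count]
```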

2023-06-28 17956, 2023

9:51 AM
mayhem

the problem is, that I need to know row count in the result before I can write the query to fetch data.

2023-06-28 17904, 2023

9:52 AM
mayhem

and that might be the downfall of this approach.

2023-06-28 17909, 2023

9:52 AM
lucifer

yes that's possible.

2023-06-28 17931, 2023

9:52 AM
lucifer

not a biggie i think, that can all be still done in 1 sql query

2023-06-28 17932, 2023

9:52 AM
mayhem

without executing the query twice?

2023-06-28 17941, 2023

9:52 AM
lucifer

using window functions

2023-06-28 17942, 2023

9:52 AM
mayhem

oh yeah, with CTEs.
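
[Editor's sketch] One way a single query could do this with a CTE and window functions is shown below. It runs against an in-memory SQLite database as a stand-in for the Timescale/PostgreSQL database discussed here (window functions need SQLite 3.25 or newer); the table name, column names, and data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tag_recording (tag TEXT, recording TEXT, listen_count INTEGER)"
)
# 100 hypothetical recordings for one tag, with strictly decreasing counts.
conn.executemany(
    "INSERT INTO tag_recording VALUES (?, ?, ?)",
    [("rock", f"rec-{i:03d}", 1000 - i) for i in range(100)],
)

# The CTE numbers every matching row and attaches the total row count to
# each row, so the outer query can turn a percent range into row positions
# without a separate COUNT(*) round trip.
QUERY = """
WITH ranked AS (
    SELECT recording,
           listen_count,
           ROW_NUMBER() OVER (ORDER BY listen_count DESC) AS rn,
           COUNT(*) OVER () AS total
      FROM tag_recording
     WHERE tag = ?
)
SELECT recording, listen_count, total
  FROM ranked
 WHERE rn > total * ? AND rn <= total * ?
 ORDER BY rn
"""


def fetch_percent_range(tag, start, end):
    """Return the rows whose position falls in (start, end] of the ranking,
    with start and end given as fractions between 0.0 and 1.0."""
    return conn.execute(QUERY, (tag, start, end)).fetchall()


selected = fetch_percent_range("rock", 0.50, 0.75)
```

With 100 rows this selects positions 51 through 75, the same rows as `results[50:75]` on a list sorted the same way.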

2023-06-28 17904, 2023

9:53 AM
mayhem

ok, so then let's go with your proposed schema, except with count, rather than percent.

2023-06-28 17906, 2023

9:53 AM
mayhem

yes?

2023-06-28 17909, 2023

9:53 AM
lucifer

however, there's one issue with it. hear me out.

2023-06-28 17918, 2023

9:53 AM
mayhem is listening

2023-06-28 17937, 2023

9:53 AM
lucifer

so basically we are calculating percent from listen count on each api call/query execution

2023-06-28 17909, 2023

9:54 AM
lucifer

whereas this is a static dataset which won't change unless new data comes from spark to completely replace it.

2023-06-28 17933, 2023

9:54 AM
lucifer

if we store percent in the database, we can have a very nice index filter on it

2023-06-28 17954, 2023

9:54 AM
mayhem

ahhh, index filter. that makes good sense, yes.

2023-06-28 17955, 2023

9:54 AM
lucifer

if we store counts, we don't have percent so we have to read all rows for the tag and process it.
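
[Editor's sketch] The trade-off lucifer describes can be illustrated with a precomputed percent column and an index on it, so a range lookup never has to read every row for the tag. SQLite again stands in for the real database, and the table, index, and column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tag_percent (tag TEXT, recording TEXT, percent REAL)")
# A composite index lets the database seek straight to the requested
# percent range within a tag.
conn.execute("CREATE INDEX tag_percent_idx ON tag_percent (tag, percent)")
conn.executemany(
    "INSERT INTO tag_percent VALUES (?, ?, ?)",
    [("rock", f"rec-{i:03d}", i / 100) for i in range(101)],
)


def recordings_in_range(tag, low, high):
    """Return recordings of a tag whose stored percent lies in [low, high]."""
    cursor = conn.execute(
        "SELECT recording FROM tag_percent"
        " WHERE tag = ? AND percent BETWEEN ? AND ?"
        " ORDER BY percent",
        (tag, low, high),
    )
    return [recording for (recording,) in cursor]


hits = recordings_in_range("rock", 0.50, 0.75)
```

Because the dataset is regenerated wholesale, the percent values only need to be computed once per import rather than on every API call.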

2023-06-28 17903, 2023

9:55 AM
mayhem

ok, I see your point.

2023-06-28 17919, 2023

9:55 AM
lucifer

we can store both listen count and percent if you need counts too.

2023-06-28 17921, 2023

9:55 AM
mayhem

I would've liked to have counts in the DB, but percent should be good enough.

2023-06-28 17943, 2023

9:55 AM
mayhem

let's do percent for now, but we can simply move to count as well when I have a solid use case for it.

2023-06-28 17954, 2023

9:55 AM
lucifer

sure, should be a simple change

2023-06-28 17955, 2023

9:55 AM
mayhem

since it is generated data, that is easy to change.

2023-06-28 17959, 2023

9:55 AM
mayhem

ok, great, thanks!

2023-06-28 17954, 2023

10:34 AM
lucifer

mayhem: say there are ~100 recordings in the [0.50, 0.75] range. how do you want to apply count and offset to it?

2023-06-28 17918, 2023

10:43 AM
mayhem

if I asked for fewer than 100 recordings, say 50, then ideally it would randomly choose them

2023-06-28 17922, 2023

10:45 AM
lucifer

mayhem: but then you couldn't use it with offset

2023-06-28 17942, 2023

10:45 AM
lucifer

if the ordering is random, offset doesn't make sense.

2023-06-28 17944, 2023

10:45 AM
mayhem

correct. I would just repeat the call.

2023-06-28 17957, 2023

10:45 AM
lucifer

so then get rid of offset?

2023-06-28 17958, 2023

10:45 AM
mayhem

it would be nice to know how many rows were chosen from.

2023-06-28 17922, 2023

10:46 AM
mayhem

yeah, lets. superfluous come to think of it.
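
[Editor's sketch] The behaviour agreed on here (pick `count` recordings at random from the percent range, drop offset, and report how many rows were chosen from) might look like this. The function name and the optional `rng` parameter are hypothetical and only illustrate the idea.

```python
import random


def pick_random(candidates, count, rng=random):
    """Randomly choose up to `count` items from the rows in a percent range.

    Returns (chosen, total), where total is the number of rows that were
    chosen from. There is no offset: callers wanting more just call again.
    """
    total = len(candidates)
    chosen = rng.sample(candidates, min(count, total))
    return chosen, total


# lucifer's example: ~100 recordings fall in the [0.50, 0.75] range
# and the caller asks for 50 of them.
in_range = [f"rec-{i:03d}" for i in range(100)]
chosen, total = pick_random(in_range, 50, random.Random(0))
```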

2023-06-28 17915, 2023

11:17 AM
theraspberry has quit

2023-06-28 17942, 2023

11:17 AM
theraspberry joined the channel

2023-06-28 17925, 2023

11:37 AM
tux0r has quit

2023-06-28 17913, 2023

11:44 AM
tux0r joined the channel

2023-06-28 17937, 2023

11:53 AM
TOPIC: #metabrainz MetaBrainz Community and Development channel | MusicBrainz non-development: #musicbrainz | BookBrainz: #bookbrainz | Channel is logged; see https://musicbrainz.org/doc/IRC for details | Agenda: Reviews, LLM policy, MBS-13146

2023-06-28 17912, 2023

11:54 AM
mayhem

thanks yvanzo

2023-06-28 17952, 2023

12:22 PM
reosarevok

Who's taking that topic?

2023-06-28 17906, 2023

12:23 PM
TOPIC: MetaBrainz Community and Development channel | MusicBrainz non-development: #musicbrainz | BookBrainz: #bookbrainz | Channel is logged; see https://musicbrainz.org/doc/IRC for details | Agenda: Reviews, LLM policy (aerozol/reo), MBS-13146

2023-06-28 17914, 2023

12:23 PM
mayhem

yvanzo or I.

2023-06-28 17927, 2023

12:23 PM
mayhem

but yvanzo is digging in more than I am, so probably better be yvanzo

2023-06-28 17900, 2023

12:25 PM
TOPIC: MetaBrainz Community and Development channel | MusicBrainz non-development: #musicbrainz | BookBrainz: #bookbrainz | Channel is logged; see https://musicbrainz.org/doc/IRC for details | Agenda: Reviews, LLM policy (aerozol/reo), MBS-13146 (yvanzo)

2023-06-28 17901, 2023

12:35 PM
atj

that link you sent mentioned that a German court declared Google Fonts transfers personal data to the US without sufficient safeguards

2023-06-28 17928, 2023

12:35 PM
mayhem

but that was overturned, no?

2023-06-28 17940, 2023

12:36 PM
jasje

lucifer: had a look on the feed pr?

2023-06-28 17908, 2023

12:37 PM
BrainzGit

[bookbrainz-site] MonkeyDo merged pull request #998 (administration-system…adminPanelImprovements): feat(Admin System): Add edit buttons instead of icons and use refs instead of updateResultsTrigger https://github.com/metabrainz/bookbrainz-site/pul…

2023-06-28 17950, 2023

12:37 PM
atj

https://cookie-script.com/blog/google-fonts-and-g… - "If you do not ask for consent for Google Fonts and still load them, you will violate the GDPR."

2023-06-28 17917, 2023

12:40 PM
mayhem

ok, time to self host google fonts then.

2023-06-28 17922, 2023

12:41 PM
TOPIC: MetaBrainz Community and Development channel | MusicBrainz non-development: #musicbrainz | BookBrainz: #bookbrainz | Channel is logged; see https://musicbrainz.org/doc/IRC for details | Agenda: Reviews, LLM policy (aerozol/reo), MBS-13146 (yvanzo), Google Fonts (ruaok)

2023-06-28 17927, 2023

12:41 PM
mayhem

next meeting is going to be a bear

2023-06-28 17941, 2023

12:41 PM
yvanzo

At the same time, let’s found a corporation named Script and take over Alphabet. ;)

2023-06-28 17933, 2023

12:43 PM
TOPIC: MetaBrainz Community and Development channel | MusicBrainz non-development: #musicbrainz | BookBrainz: #bookbrainz | Channel is logged; see https://musicbrainz.org/doc/IRC for details | Agenda: Reviews, LLM policy (aerozol/reo), MBS-13146: reCAPTCHA (yvanzo), ORG-51: Google Fonts (ruaok)

2023-06-28 17954, 2023

12:44 PM
atj

yvanzo: i assume registration is blocked unless recaptcha consent is given?

2023-06-28 17923, 2023

12:45 PM
mayhem

not currently, but that is how it will need to be unless we switch to a different captcha.

2023-06-28 17906, 2023

12:46 PM
atj

yeah, sorry i meant in the proposed solution

2023-06-28 17942, 2023

12:46 PM
atj

does a GDPR compliant captcha exist?

2023-06-28 17951, 2023

12:46 PM
yvanzo

atj: I’m not sure that is an option.

2023-06-28 17905, 2023

12:47 PM
atj

yvanzo: why not?

2023-06-28 17924, 2023

12:47 PM
zapmonkey joined the channel

2023-06-28 17932, 2023

12:47 PM
yvanzo

Cookie walls are not allowed. We have to make sure we can require this one to be set.

2023-06-28 17957, 2023

12:47 PM
mayhem

what is a cookie wall?

2023-06-28 17917, 2023

12:48 PM
yvanzo

not that delicious

2023-06-28 17951, 2023

12:48 PM
yvanzo

Basically blocking access to a website for not accepting non-essential cookies.

2023-06-28 17906, 2023

12:49 PM
atj

cookie wall == cookie prompt?

2023-06-28 17942, 2023

12:49 PM
yvanzo

But I just don’t know atm, it has to be investigated.

2023-06-28 17949, 2023

12:49 PM
mayhem

as per GDPR?

2023-06-28 17933, 2023

12:50 PM
yvanzo

I just answered :)

2023-06-28 17934, 2023

12:51 PM
atj

personally, i think it's reasonable to prevent registration unless consent is given for recaptcha, and i don't see why this approach would not be GDPR compliant

2023-06-28 17919, 2023

12:55 PM
yvanzo

law?

2023-06-28 17948, 2023

12:55 PM
yvanzo

I agree that it seems reasonable too.

2023-06-28 17912, 2023

12:57 PM
mayhem

I mean we do this for LB. When you create your account you accept that LB processes your data or you delete your account.

2023-06-28 17915, 2023

12:57 PM
mayhem

that amounts to the same, no?

2023-06-28 17929, 2023

12:58 PM
mayhem

rdswift: tag:r&b fixed and now part of our test suite. shouldn't happen again now. :) https://listenbrainz.org/playlist/7826761b-5df1-4…

2023-06-28 17932, 2023

13:01 PM
atj

yvanzo: how could it be against the law? there is no requirement to provide service to everyone

2023-06-28 17901, 2023

13:04 PM
yvanzo

I don’t know yet, I just shared doubts based on some reading about cookie walls.

2023-06-28 17955, 2023

13:05 PM
yvanzo has to go (not because of the topic)

2023-06-28 17901, 2023

13:06 PM
trolley has quit

2023-06-28 17924, 2023

13:06 PM
trolley joined the channel

2023-06-28 17932, 2023

13:06 PM
trolley has quit

2023-06-28 17915, 2023

13:08 PM
trolley joined the channel

2023-06-28 17901, 2023

13:16 PM
zapmonkey has quit

2023-06-28 17946, 2023

13:17 PM
atj

yvanzo: if you have links please do share them

2023-06-28 17912, 2023

13:30 PM
zapmonkey joined the channel

2023-06-28 17931, 2023

13:31 PM
mayhem

lucifer: you about?

2023-06-28 17928, 2023

13:33 PM
lucifer

hi yes

2023-06-28 17949, 2023

13:33 PM
mayhem

quick sanity check on a schema change to the playlist table.

2023-06-28 17902, 2023

13:34 PM
mayhem

monkey would like an expires_on field for playlists and we discussed that it makes sense.

2023-06-28 17917, 2023

13:34 PM
mayhem

this way we can set an explicit time when the playlist should be deleted.
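
[Editor's sketch] A minimal illustration of how an `expires_on` value could drive deletion. Only the field name comes from the discussion; the `Playlist` class, the nullable default, and the cleanup helper are assumptions, and the actual playlist table is not shown in this log.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class Playlist:
    name: str
    # Assumption: None means the playlist never expires.
    expires_on: Optional[datetime] = None


def expired_playlists(playlists, now):
    """Return the playlists whose explicit expiry time has passed."""
    return [
        p for p in playlists
        if p.expires_on is not None and p.expires_on <= now
    ]


now = datetime(2023, 6, 28, 13, 34, tzinfo=timezone.utc)
playlists = [
    Playlist("keep forever"),
    Playlist("old weekly mix", expires_on=now - timedelta(days=1)),
    Playlist("fresh weekly mix", expires_on=now + timedelta(days=6)),
]
to_delete = expired_playlists(playlists, now)
```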