[listenbrainz-android] dependabot[bot] opened pull request #185 (main…dependabot/gradle/room_version-2.5.2): Bump room_version from 2.5.1 to 2.5.2 https://github.com/metabrainz/listenbrainz-androi…
2023-06-28 17912, 2023
BrainzGit
[listenbrainz-android] dependabot[bot] opened pull request #186 (main…dependabot/gradle/io.sentry.android.gradle-3.11.1): Bump io.sentry.android.gradle from 3.10.0 to 3.11.1 https://github.com/metabrainz/listenbrainz-androi…
2023-06-28 17916, 2023
BrainzGit
[listenbrainz-android] dependabot[bot] closed pull request #184 (main…dependabot/gradle/io.sentry.android.gradle-3.11.0): Bump io.sentry.android.gradle from 3.10.0 to 3.11.0 https://github.com/metabrainz/listenbrainz-androi…
2023-06-28 17917, 2023
BrainzGit
[listenbrainz-android] dependabot[bot] opened pull request #187 (main…dependabot/gradle/androidx.compose-compose-bom-2023.06.01): Bump androidx.compose:compose-bom from 2023.06.00 to 2023.06.01 https://github.com/metabrainz/listenbrainz-androi…
2023-06-28 17932, 2023
BrainzGit
[listenbrainz-android] dependabot[bot] opened pull request #188 (main…dependabot/gradle/app.cash.turbine-turbine-1.0.0): Bump app.cash.turbine:turbine from 0.13.0 to 1.0.0 https://github.com/metabrainz/listenbrainz-androi…
2023-06-28 17917, 2023
lucifer has quit
2023-06-28 17927, 2023
lucifer joined the channel
2023-06-28 17909, 2023
lucifer
mayhem: checked outsidecontext's listens in spark and those are the same as the ones showing up on LB web.
2023-06-28 17930, 2023
mayhem
huh. odd. I'm at a loss for what it could be then.
2023-06-28 17904, 2023
lucifer
mayhem: how does this look for tags dataset in timescale?
I think we may still have confusion about the percent column -- not sure.
2023-06-28 17951, 2023
mayhem
how I am using percent in LB radio is that for any given result set that is ordered by count or score, I wish to retrieve a section of the results by specifying a percent range.
so, if the result has 100 rows and I want results 50% - 75%, then it should return items 50 - 75.
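[editor's sketch] The mapping mayhem describes, from a percent range to a slice of an ordered result set, might look like the following. The function name and the half-open interpretation of "items 50 - 75" (the same as the later `results[50:75]`) are assumptions, not something stated in the channel.

```python
def percent_range_to_slice(total_rows: int, start_percent: float, end_percent: float) -> tuple[int, int]:
    """Convert a percent range into an (offset, limit) pair.

    Hypothetical helper: the end of the range is treated as exclusive,
    so 50% - 75% of 100 rows covers row indexes 50..74.
    """
    if not 0 <= start_percent <= end_percent <= 100:
        raise ValueError("expected 0 <= start_percent <= end_percent <= 100")
    offset = int(total_rows * start_percent / 100)
    end = int(total_rows * end_percent / 100)
    return offset, end - offset


# 100 rows ordered by count or score, represented here by their row index.
results = list(range(100))
offset, limit = percent_range_to_slice(len(results), 50, 75)
selected = results[offset:offset + limit]
```

With 100 rows this yields an offset of 50 and a limit of 25.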
2023-06-28 17919, 2023
mayhem
ok, does percent really need to be stored in the DB?
2023-06-28 17950, 2023
lucifer
unless you will always only query a few fixed ranges that can be assigned at the time of dataset generation, yes.
2023-06-28 17953, 2023
mayhem
because the same could be accomplished using OFFSET and COUNT, once the total number of rows is known.
2023-06-28 17901, 2023
lucifer
not really
2023-06-28 17923, 2023
lucifer
OFFSET and COUNT have nothing to do with percent really
2023-06-28 17906, 2023
lucifer
hmm i see you want that ordering by percent as well
2023-06-28 17909, 2023
lucifer
let me fix that
2023-06-28 17911, 2023
mayhem
correct. but given the total number of rows of the results set, you can quickly and easily convert % to row numbers and then use those to filter using OFFSET/COUNT.
2023-06-28 17927, 2023
mayhem
no, I don't want order by percent.
2023-06-28 17940, 2023
mayhem
I could, just as easily, fetch all rows from the query, and then ignore all the results that lie outside the range I am interested in.
2023-06-28 17956, 2023
mayhem
but that is wasteful, so that is why I want to have the DB do the "pagination" for me.
2023-06-28 17906, 2023
mayhem
this is effectively a pagination scheme I want.
2023-06-28 17917, 2023
mayhem
pagination, result selection.
2023-06-28 17938, 2023
lucifer
i think we are not on the same page about percent, count and offset.
2023-06-28 17955, 2023
mayhem
seems not, no.
2023-06-28 17955, 2023
lucifer
let me explain what i understand so far, and you can correct it
2023-06-28 17959, 2023
mayhem
ok
2023-06-28 17917, 2023
lucifer
only considering recording level tags for now.
2023-06-28 17954, 2023
lucifer
for each tag, find all the recordings that have that tag. then based on their listen counts from popularity dataset assign them a percent.
2023-06-28 17917, 2023
lucifer
the most listened recording for each tag gets 1.0 and the least listened one gets 0.0.
2023-06-28 17928, 2023
lucifer
rest are on a spectrum within this range.
2023-06-28 17901, 2023
lucifer
emphasizing that the percents are calculated for each tag-recording pair.
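[editor's sketch] A per-tag percent of the kind lucifer outlines could be computed roughly as below. The chat only fixes the endpoints (1.0 for the most listened recording of a tag, 0.0 for the least); the linear min-max scaling in between, the function name, and the sample data are all assumptions. A rank-based scheme would satisfy the same endpoints.

```python
from collections import defaultdict


def percent_by_tag(rows):
    """Map (tag, recording) -> percent in [0.0, 1.0], scaled within each tag.

    rows is an iterable of (tag, recording, listen_count) tuples.
    Linear min-max scaling is an assumption; the chat only states
    that the top recording gets 1.0 and the bottom one gets 0.0.
    """
    by_tag = defaultdict(list)
    for tag, recording, listen_count in rows:
        by_tag[tag].append((recording, listen_count))

    percents = {}
    for tag, items in by_tag.items():
        counts = [count for _, count in items]
        lowest, highest = min(counts), max(counts)
        span = highest - lowest
        for recording, count in items:
            # A tag whose recordings all share one count has no spread; use 1.0.
            percents[(tag, recording)] = (count - lowest) / span if span else 1.0
    return percents


# Made-up listen counts, for illustration only.
sample = [
    ("pop", "a", 10000), ("pop", "b", 5500), ("pop", "c", 1000),
    ("jazz", "a", 10), ("jazz", "d", 30),
]
percents = percent_by_tag(sample)
```

Recording `a` ends up with a different percent under each tag, which is the point about percents belonging to tag-recording pairs.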
2023-06-28 17906, 2023
lucifer
so far correct?
2023-06-28 17915, 2023
mayhem
gimme a few moments to ponder
2023-06-28 17926, 2023
mayhem
you're quite close, but I prefer for us to store listen counts rather than percent in the DB.
2023-06-28 17944, 2023
mayhem
because sometimes the range of the data is important.
2023-06-28 17958, 2023
lucifer
but then you can't filter on percents in api.
2023-06-28 17959, 2023
mayhem
is the top listen count 1,000 or 10,000?
2023-06-28 17923, 2023
lucifer
depends on the recording ofc. for something like pop it could be 10000, for less popular ones much less
2023-06-28 17928, 2023
mayhem
let's assume for a moment I never said anything about percent.
2023-06-28 17948, 2023
lucifer
sure then we store listen counts in the database
2023-06-28 17908, 2023
mayhem
and instead, I'm going to be a dumb API user and fetch all the rows for a given tag query.
2023-06-28 17918, 2023
mayhem
stuff them into a plist in ram in troi.
2023-06-28 17940, 2023
mayhem
then ask troi to give me results[50:75] and I get exactly what I want.
2023-06-28 17942, 2023
mayhem
follow me?
2023-06-28 17955, 2023
lucifer
yes
2023-06-28 17907, 2023
mayhem
ok, the server didn't use percent at all. yes?
2023-06-28 17911, 2023
lucifer
yes
2023-06-28 17920, 2023
mayhem
so, if I want the server to handle percent, I need to do the following.
2023-06-28 17925, 2023
mayhem
1. execute query.
2023-06-28 17929, 2023
mayhem
2. get row count.
2023-06-28 17950, 2023
mayhem
3. calculate start offset from start percent and total row count.
2023-06-28 17905, 2023
mayhem
4. do the same to calculate number of rows.
2023-06-28 17928, 2023
mayhem
5. Do the query, with LIMIT and OFFSET.
2023-06-28 17934, 2023
mayhem
6. Return data.
2023-06-28 17956, 2023
mayhem
the problem is, that I need to know row count in the result before I can write the query to fetch data.
2023-06-28 17904, 2023
mayhem
and that might be the downfall of this approach.
2023-06-28 17909, 2023
lucifer
yes that's possible.
2023-06-28 17931, 2023
lucifer
not a biggie i think, that can all still be done in 1 SQL query
2023-06-28 17932, 2023
mayhem
without executing the query twice?
2023-06-28 17941, 2023
lucifer
using window functions
2023-06-28 17942, 2023
mayhem
oh yeah, with CTEs.
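[editor's sketch] One way the "1 sql query" idea could look: a CTE that numbers the rows and attaches the total row count with window functions, so the percent range is applied without running the query twice. This uses Python's bundled `sqlite3` (window functions need SQLite 3.25 or newer) purely for illustration; the table name, columns, and data are hypothetical and not the actual ListenBrainz schema.

```python
import sqlite3

# Hypothetical table: one row per (tag, recording) with a listen count.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tag_recording (tag TEXT, recording TEXT, listen_count INTEGER)")
conn.executemany(
    "INSERT INTO tag_recording VALUES (?, ?, ?)",
    [("pop", f"rec{i:03d}", 1000 - i) for i in range(100)],
)

QUERY = """
WITH ranked AS (
    SELECT recording,
           listen_count,
           ROW_NUMBER() OVER (ORDER BY listen_count DESC, recording) - 1 AS row_idx,
           COUNT(*) OVER () AS total
      FROM tag_recording
     WHERE tag = ?
)
SELECT recording, listen_count, total
  FROM ranked
 WHERE row_idx >= total * ? / 100
   AND row_idx <  total * ? / 100
 ORDER BY row_idx
"""


def fetch_percent_range(conn, tag, start_percent, end_percent):
    """Return the rows whose position falls inside [start_percent, end_percent)."""
    return conn.execute(QUERY, (tag, start_percent, end_percent)).fetchall()


rows = fetch_percent_range(conn, "pop", 50, 75)
```

Each returned row also carries `total`, so the caller learns the size of the full result set from the same query.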
2023-06-28 17904, 2023
mayhem
ok, so then let's go with your proposed schema, except with count, rather than percent.
2023-06-28 17906, 2023
mayhem
yes?
2023-06-28 17909, 2023
lucifer
however, there's one issue with it. hear me out.
2023-06-28 17918, 2023
mayhem is listening
2023-06-28 17937, 2023
lucifer
so basically we are calculating percent from listen count on each api call/query execution
2023-06-28 17909, 2023
lucifer
whereas this is a static dataset which won't change unless new data comes from spark to completely replace it.
2023-06-28 17933, 2023
lucifer
if we store percent in the database, we can have a very nice index filter on it
2023-06-28 17954, 2023
mayhem
ahhh, index filter. that makes good sense, yes.
2023-06-28 17955, 2023
lucifer
if we store counts, we don't have percent so we have to read all rows for the tag and process it.
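[editor's sketch] lucifer's argument for storing percent is that a range filter can then be served from an index instead of reading every row for the tag. A minimal illustration with `sqlite3` follows; the table, index name, and data are hypothetical, and the production database discussed here is Timescale/PostgreSQL rather than SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: percent is precomputed when the dataset is generated.
conn.execute("CREATE TABLE tag_recording (tag TEXT, recording TEXT, percent REAL)")
conn.execute("CREATE INDEX tag_percent_idx ON tag_recording (tag, percent)")
conn.executemany(
    "INSERT INTO tag_recording VALUES (?, ?, ?)",
    [("pop", f"rec{i:03d}", i / 100) for i in range(101)],
)


def recordings_in_percent_range(conn, tag, low, high):
    """Filter directly on the stored percent; both bounds are inclusive."""
    return conn.execute(
        "SELECT recording, percent FROM tag_recording "
        "WHERE tag = ? AND percent BETWEEN ? AND ? "
        "ORDER BY percent DESC",
        (tag, low, high),
    ).fetchall()


in_range = recordings_in_percent_range(conn, "pop", 0.50, 0.75)
```

Because the percent is fixed at generation time, nothing has to be recomputed per API call; the query is a plain range lookup on `(tag, percent)`.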
2023-06-28 17903, 2023
mayhem
ok, I see your point.
2023-06-28 17919, 2023
lucifer
we can store both listen count and percent if you need counts too.
2023-06-28 17921, 2023
mayhem
I would've liked to have counts in the DB, but percent should be good enough.
2023-06-28 17943, 2023
mayhem
let's do percent for now, but we can simply move to count as well when I have a solid use case for it.
2023-06-28 17954, 2023
lucifer
sure, should be a simple change
2023-06-28 17955, 2023
mayhem
since it is generated data, that is easy to change.
2023-06-28 17959, 2023
mayhem
ok, great, thanks!
2023-06-28 17954, 2023
lucifer
mayhem: say there are ~100 recordings in the [0.50, 0.75] range. how do you want to apply count and offset to it?
2023-06-28 17918, 2023
mayhem
if I asked for fewer than 100 recordings, say 50, then ideally it would randomly choose them
2023-06-28 17922, 2023
lucifer
mayhem: but then you couldn't use it with offset
2023-06-28 17942, 2023
lucifer
if the ordering is random, offset doesn't make sense.
2023-06-28 17944, 2023
mayhem
correct. I would just repeat the call.
2023-06-28 17957, 2023
lucifer
so then get rid of offset?
2023-06-28 17958, 2023
mayhem
it would be nice to know how many rows were chosen from.
2023-06-28 17922, 2023
mayhem
yeah, let's. superfluous, come to think of it.
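[editor's sketch] The behaviour agreed on here (random selection within the percent range, no offset, plus a report of how many rows were chosen from) could be sketched as follows. The function name and return shape are assumptions, not an actual API.

```python
import random


def pick_random(candidates, count, rng=None):
    """Pick up to `count` items at random and report the pool size.

    candidates: list of rows already filtered to the percent range.
    Returns (chosen, total), where total is the number of rows chosen from.
    """
    rng = rng or random.Random()
    chosen = rng.sample(candidates, min(count, len(candidates)))
    return chosen, len(candidates)


# ~100 recordings in the range, 50 requested, as in the example above.
pool = [f"rec{i:03d}" for i in range(100)]
chosen, total = pick_random(pool, 50, random.Random(0))
```

Repeating the call gives a different selection each time, which is why an offset has nothing to page through.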
2023-06-28 17915, 2023
theraspberry has quit
2023-06-28 17942, 2023
theraspberry joined the channel
2023-06-28 17925, 2023
tux0r has quit
2023-06-28 17913, 2023
tux0r joined the channel
2023-06-28 17937, 2023
TOPIC: #metabrainz MetaBrainz Community and Development channel | MusicBrainz non-development: #musicbrainz | BookBrainz: #bookbrainz | Channel is logged; see https://musicbrainz.org/doc/IRC for details | Agenda: Reviews, LLM policy, MBS-13146
2023-06-28 17912, 2023
mayhem
thanks yvanzo
2023-06-28 17952, 2023
reosarevok
Who's taking that topic?
2023-06-28 17906, 2023
TOPIC: MetaBrainz Community and Development channel | MusicBrainz non-development: #musicbrainz | BookBrainz: #bookbrainz | Channel is logged; see https://musicbrainz.org/doc/IRC for details | Agenda: Reviews, LLM policy (aerozol/reo), MBS-13146
2023-06-28 17914, 2023
mayhem
yvanzo or I.
2023-06-28 17927, 2023
mayhem
but yvanzo is digging in more than I am, so probably better be yvanzo
2023-06-28 17900, 2023
TOPIC: MetaBrainz Community and Development channel | MusicBrainz non-development: #musicbrainz | BookBrainz: #bookbrainz | Channel is logged; see https://musicbrainz.org/doc/IRC for details | Agenda: Reviews, LLM policy (aerozol/reo), MBS-13146 (yvanzo)
2023-06-28 17901, 2023
atj
that link you sent mentioned that a German court declared Google Fonts transfers personal data to the US without sufficient safeguards
2023-06-28 17928, 2023
mayhem
but that was overturned, no?
2023-06-28 17940, 2023
jasje
lucifer: had a look at the feed PR?
2023-06-28 17908, 2023
BrainzGit
[bookbrainz-site] MonkeyDo merged pull request #998 (administration-system…adminPanelImprovements): feat(Admin System): Add edit buttons instead of icons and use refs instead of updateResultsTrigger https://github.com/metabrainz/bookbrainz-site/pul…
TOPIC: MetaBrainz Community and Development channel | MusicBrainz non-development: #musicbrainz | BookBrainz: #bookbrainz | Channel is logged; see https://musicbrainz.org/doc/IRC for details | Agenda: Reviews, LLM policy (aerozol/reo), MBS-13146 (yvanzo), Google Fonts (ruaok)
2023-06-28 17927, 2023
mayhem
next meeting is going to be a bear
2023-06-28 17941, 2023
yvanzo
At the same time, let’s fund a corporation named Script and take over Alphabet. ;)
2023-06-28 17933, 2023
TOPIC: MetaBrainz Community and Development channel | MusicBrainz non-development: #musicbrainz | BookBrainz: #bookbrainz | Channel is logged; see https://musicbrainz.org/doc/IRC for details | Agenda: Reviews, LLM policy (aerozol/reo), MBS-13146: reCAPTCHA (yvanzo), ORG-51: Google Fonts (ruaok)
2023-06-28 17954, 2023
atj
yvanzo: i assume registration is blocked unless recaptcha consent is given?
2023-06-28 17923, 2023
mayhem
not currently, but that is how it will need to be unless we switch to a different captcha.
2023-06-28 17906, 2023
atj
yeah, sorry i meant in the proposed solution
2023-06-28 17942, 2023
atj
does a GDPR compliant captcha exist?
2023-06-28 17951, 2023
yvanzo
atj: I’m not sure that is an option.
2023-06-28 17905, 2023
atj
yvanzo: why not?
2023-06-28 17924, 2023
zapmonkey joined the channel
2023-06-28 17932, 2023
yvanzo
Cookie walls are not allowed. We have to make sure we can require this one to be set.
2023-06-28 17957, 2023
mayhem
what is a cookie wall?
2023-06-28 17917, 2023
yvanzo
not that delicious
2023-06-28 17951, 2023
yvanzo
Basically blocking access to a website for not accepting non-essential cookies.
2023-06-28 17906, 2023
atj
cookie wall == cookie prompt?
2023-06-28 17942, 2023
yvanzo
But I just don’t know atm, it has to be investigated.
2023-06-28 17949, 2023
mayhem
as per GDPR?
2023-06-28 17933, 2023
yvanzo
I just answered :)
2023-06-28 17934, 2023
atj
personally, i think it's reasonable to prevent registration unless consent is given for recaptcha, and i don't see why this approach would not be GDPR compliant
2023-06-28 17919, 2023
yvanzo
law?
2023-06-28 17948, 2023
yvanzo
I agree that it seems reasonable too.
2023-06-28 17912, 2023
mayhem
I mean we do this for LB. When you create your account you accept that LB processes your data or you delete your account.