But there was a bug in the query that I wrote, maybe because of that
One incremental dump
I'll try again using matchable text now
_lucifer
Listen count from 2020-04-01 17:59:43 to 2020-09-28 17:59:43: 1876719
Number of distinct rows in the mapping: 4510905
Listen count after mapping: 991921
this is my stats, yeah i am using the full matchable mapping
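A quick sanity check on the numbers quoted above: a minimal sketch that only computes what share of listens survived the mapping step. The two counts come from the log; the function name `mapping_coverage` is made up for illustration and is not part of the ListenBrainz code.

```python
# Counts quoted in the log above.
listens_before = 1_876_719  # listens from 2020-04-01 to 2020-09-28
listens_after = 991_921     # listen count after mapping


def mapping_coverage(before: int, after: int) -> float:
    """Return the percentage of listens that survived the mapping step."""
    if before <= 0:
        raise ValueError("before must be positive")
    return 100.0 * after / before


coverage = mapping_coverage(listens_before, listens_after)
dropped = listens_before - listens_after
print(f"kept {coverage:.2f}% of listens, dropped {dropped}")
```

Roughly half of the listens in that window found no match in the mapping, which is worth keeping in mind when reading the later candidate set results.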
pristine___
Two hours, too much.
I think when you are testing, you should use less data to make the process faster.
_lucifer
pristine___: yeah right, for once i want to see how much time it takes but for future i would sure like it to be quicker
pristine___
Yes please!
_lucifer
let's talk with ruaok on this to get a sample mapping
pristine___
In prod it takes 15 min to create df :p
_lucifer
my pc is not that bad by those standards then :P ;)
shivam-kapila
Yeah
Only 8 times slower :p
pristine___
_lucifer: I think we should first make a small dump of listens, join it with the 11GB mapping to get a subset of the mapping and do the same with artist relation. This is like the basic idea.
> my pc is not that bad by those standards then :P ;)
Mine toooo.
So I never try with full dumps :p
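The sampling idea above could look roughly like this. It is a plain-Python sketch, not the actual Spark job: the field names (`artist`, `track`, `recording_mbid`), the toy rows and both helper functions are assumptions made for illustration. In the real pipeline the same thing would presumably be a semi-join between dataframes.

```python
import random


def sample_listens(listens, fraction, seed=0):
    """Return a reproducible random sample of the listens."""
    if not listens:
        return []
    rng = random.Random(seed)
    k = max(1, int(len(listens) * fraction))
    return rng.sample(listens, k)


def subset_mapping(listens, mapping, key_fields=("artist", "track")):
    """Keep only mapping rows whose key appears in the listens (a semi-join)."""
    wanted = {tuple(listen[f] for f in key_fields) for listen in listens}
    return [row for row in mapping
            if tuple(row[f] for f in key_fields) in wanted]


# Toy data; the identifiers are placeholders, not real MBIDs.
listens = [
    {"user": "a", "artist": "portishead", "track": "roads"},
    {"user": "b", "artist": "massive attack", "track": "teardrop"},
    {"user": "a", "artist": "unknown artist", "track": "untitled"},
]
mapping = [
    {"artist": "portishead", "track": "roads", "recording_mbid": "mbid-1"},
    {"artist": "massive attack", "track": "teardrop", "recording_mbid": "mbid-2"},
    {"artist": "bjork", "track": "hyperballad", "recording_mbid": "mbid-3"},
]

small_listens = sample_listens(listens, fraction=1.0)
small_mapping = subset_mapping(small_listens, mapping)
print(len(small_mapping))
```

The same `subset_mapping` helper could be pointed at the artist relation by changing `key_fields`, so both small test files come from one listen sample.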
_lucifer
pristine___: what is the benefit of --html flag? it says it'll generate html files but what is their content and use?
shivam-kapila
Observability
Of results
_lucifer
yeah so what will those files contain?
pristine___
_lucifer: we have two rn
One for model, one for candidate bets
Sets*
_lucifer
candidate set one
pristine___
Yeah
I use it to debug discrepancy in candidate set data
Try generating one for yourself, it will give you and idea
An*
_lucifer
train model completed successfully as well in 5 min
adhawkins
Running the mbserver docker containers, I'm seeing entries like this in the logs:
musicbrainz_1 | [error] 08006 DBI connect('dbname=musicbrainz_db;host=db;port=5432','musicbrainz',...) failed: FATAL: the database system is starting up
According to the output of 'docker ps', the database container has only been up for about 6 hours.
'docker-compose logs db' shows that the database did indeed start up around 2am.
Any suggestions for working out what's going on? No OOM errors since making the adjustments to shared_buffers and stopping replication during my overnight music 'scan'
alastairp
"Your package will be delivered within the next 24 hours"
thanks, courier company, that's super helpful
btw: I got about 20 spam PRs in Freesound today, it seems likely that people are trying to pad commits to get credit in hacktoberfest
it'll be interesting to see if MB gets any
reosarevok
That's sad :D
adhawkins
Oh, seems like I was wrong about the OOM. It fired around midnight this morning. This time the culprit was 'java'.
I'm just running a scan over my music at the moment. When this completes I'll reboot the VM running it all.
hmm. chaban: I'm not seeing that :/ Do you have an example?
Oh wait
Sorry, I was being dumb
I'm seeing it :)
BrainzGit
[musicbrainz-server] reosarevok opened pull request #1722 (beta…MBS-10536-redux): MBS-10536 (redux): Remove span.name-variation around "see all" releases link https://github.com/metabrainz/musicbrainz-serve...
i thought at first that it was a topic asking about this mojibake, but it turns out it's fine on bookbrainz - is it possible to fix this on community. side?
[listenbrainz-server] dependabot-preview[bot] opened pull request #1121 (master…dependabot/pip/eventlet-0.28.0): Bump eventlet from 0.26.1 to 0.28.0 https://github.com/metabrainz/listenbrainz-serv...
[listenbrainz-server] dependabot-preview[bot] opened pull request #1122 (master…dependabot/pip/psycopg2-binary-2.8.6): Bump psycopg2-binary from 2.8.5 to 2.8.6 https://github.com/metabrainz/listenbrainz-serv...
lol is dependabot also participating in hacktoberfest
iliekcomputers
that's me delegating my hacktoberfest spam to a bot
this year's t-shirt actually looks pretty good tbh
_lucifer
lol, yeah that's true
pristine___: i am getting prediction scores on -10 to 10 now. so the prediction scores in themselves probably do not make much sense
earlier i was thinking that this is just due to input being not scaled but that turns out to be wrong.
pristine___
_lucifer: now? After normalization
alastairp
I'm guessing that we're not going to have a meeting on Monday due to tradition?
_lucifer
pristine___: yes
pristine___
> earlier i was thinking that this is just due to input being not scaled but that turns out to be wrong.
Yeah.
I read something on CF scores (pyspark)
Lemme see if I have a link!
TOPIC: MetaBrainz Community and Development channel | MusicBrainz non-development: #musicbrainz | Channel is logged; see https://musicbrainz.org/doc/IRC for details | Summit 20: https://wiki.musicbrainz.org/MusicBrainz_Summit/20
_lucifer: if the scores don't have a meaning, then the candidate set itself a set of user recommendations :(
Is a *
_lucifer
pristine___: no i mean the scores do have a meaning but only relative
9.0 > 8.0 for this time but it may or may not be for the next run
pristine___
Right
What I think is we shouldn't show scores to users?
Just the rec and inputs for feedback
_lucifer
yeah makes sense, just sort based on the score but do not show it to the user
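What was agreed here, sorting on the score but keeping it out of what the user sees, might be sketched as below. The record layout (`recording`, `score`), the sample values and the `to_display` helper are hypothetical; the real recommendation output may be shaped differently.

```python
def to_display(recommendations):
    """Sort by score (highest first) and return only rank and recording."""
    ordered = sorted(recommendations, key=lambda r: r["score"], reverse=True)
    return [
        {"rank": rank, "recording": rec["recording"]}
        for rank, rec in enumerate(ordered, start=1)
    ]


# Scores from one model run; only their order within the run is meaningful.
run = [
    {"recording": "rec-a", "score": 8.0},
    {"recording": "rec-b", "score": 9.0},
    {"recording": "rec-c", "score": -3.5},
]

shown = to_display(run)
for item in shown:
    print(item["rank"], item["recording"])
```

Because only the rank leaves the function, a later run that yields different absolute scores for the same ordering produces the same output for the user.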