But there was a bug in the query that I wrote, maybe because of that
2020-10-01 27506, 2020
ishaanshah
One incremental dump
2020-10-01 27521, 2020
ishaanshah
I'll try again using matchable text now
2020-10-01 27551, 2020
_lucifer
Listen count from 2020-04-01 17:59:43 to 2020-09-28 17:59:43: 1876719
2020-10-01 27551, 2020
_lucifer
Number of distinct rows in the mapping: 4510905
2020-10-01 27551, 2020
_lucifer
Listen count after mapping: 991921
2020-10-01 27505, 2020
_lucifer
these are my stats, yeah i am using the full matchable mapping
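The two listen counts above give the share of listens that survived the join with the mapping. A minimal sketch of that arithmetic, using only the figures quoted in the log (the variable names are illustrative, not from any MetaBrainz code):

```python
# Counts quoted from the stats above.
listens_before = 1_876_719  # listens from 2020-04-01 to 2020-09-28
listens_after = 991_921     # listens left after joining with the mapping

# Fraction of listens that the mapping was able to match.
mapped_fraction = listens_after / listens_before
print(f"{mapped_fraction:.1%} of listens were matched by the mapping")
```

So roughly half of the listens in that window were dropped by the mapping step.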
2020-10-01 27512, 2020
pristine___
Two hours, too much.
2020-10-01 27528, 2020
pristine___
I think when you are testing, you should use less data to make the process faster.
2020-10-01 27540, 2020
_lucifer
pristine___: yeah right, for once i want to see how much time it takes but for future i would sure like it to be quicker
2020-10-01 27503, 2020
pristine___
Yes please!
2020-10-01 27506, 2020
_lucifer
let's talk with ruaok on this to get a sample mapping
2020-10-01 27528, 2020
pristine___
In prod it takes 15 min to create df :p
2020-10-01 27553, 2020
_lucifer
my pc is not that bad by those standards then :P ;)
2020-10-01 27525, 2020
Nyanko-sensei joined the channel
2020-10-01 27525, 2020
leonardo joined the channel
2020-10-01 27525, 2020
imdeni joined the channel
2020-10-01 27525, 2020
mruszczyk joined the channel
2020-10-01 27525, 2020
diru1100 joined the channel
2020-10-01 27525, 2020
reg[m] joined the channel
2020-10-01 27525, 2020
joshuaboniface joined the channel
2020-10-01 27525, 2020
djinni` joined the channel
2020-10-01 27531, 2020
shivam-kapila
Yeah
2020-10-01 27544, 2020
shivam-kapila
Only 8 times slower :p
2020-10-01 27549, 2020
pristine___
_lucifer: I think we should first make a small dump of listens, join it with the 11GB mapping to get a subset of the mapping, and do the same with the artist relation. This is the basic idea.
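The idea above amounts to a semi-join: keep only the rows of the big table whose key also shows up in the small sample. A rough sketch in plain Python, assuming hypothetical field names (`artist`, `track`, `artist_id`, `artist_id_0`) and toy rows that are not taken from the real dumps:

```python
def semi_join(big_rows, small_rows, key):
    """Return the rows of big_rows whose key also occurs in small_rows."""
    wanted = {key(row) for row in small_rows}
    return [row for row in big_rows if key(row) in wanted]


# Toy stand-ins for the small listens dump, the full mapping and the
# artist relation. All field names and values here are hypothetical.
listens_sample = [
    {"user": "a", "artist": "artist one", "track": "track x"},
    {"user": "b", "artist": "artist two", "track": "track y"},
]
full_mapping = [
    {"artist": "artist one", "track": "track x", "artist_id": 1},
    {"artist": "artist two", "track": "track y", "artist_id": 2},
    {"artist": "artist three", "track": "track z", "artist_id": 3},
]
artist_relation = [
    {"artist_id_0": 1, "artist_id_1": 2, "score": 0.9},
    {"artist_id_0": 3, "artist_id_1": 4, "score": 0.5},
]


def by_name(row):
    """Join key shared by listens and mapping rows."""
    return (row["artist"], row["track"])


# Step 1: shrink the mapping to the rows the sample listens actually use.
mapping_subset = semi_join(full_mapping, listens_sample, by_name)

# Step 2: shrink the artist relation to artists present in that subset.
artist_ids = {row["artist_id"] for row in mapping_subset}
relation_subset = [r for r in artist_relation if r["artist_id_0"] in artist_ids]

print(len(mapping_subset), len(relation_subset))
```

With Spark dataframes the same filtering could presumably be expressed as a join with `how="left_semi"`, so that only columns from the large table are kept.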
2020-10-01 27500, 2020
pristine___
> my pc is not that bad by those standards then :P ;)
2020-10-01 27503, 2020
pristine___
Mine toooo.
2020-10-01 27510, 2020
pristine___
So I never try with full dumps :p
2020-10-01 27518, 2020
_lucifer
pristine___: what is the benefit of --html flag? it says it'll generate html files but what is their content and use?
2020-10-01 27538, 2020
shivam-kapila
Observability
2020-10-01 27541, 2020
shivam-kapila
Of results
2020-10-01 27559, 2020
_lucifer
yeah so what will those files contain?
2020-10-01 27519, 2020
pristine___
_lucifer: we have two rn
2020-10-01 27538, 2020
pristine___
One for model, one for candidate bets
2020-10-01 27542, 2020
pristine___
Sets*
2020-10-01 27546, 2020
_lucifer
candidate set one
2020-10-01 27551, 2020
pristine___
Yeah
2020-10-01 27504, 2020
pristine___
I use it to debug discrepancy in candidate set data
2020-10-01 27531, 2020
pristine___
Try generating one for yourself, it will give you and idea
2020-10-01 27536, 2020
pristine___
An*
2020-10-01 27550, 2020
_lucifer
train model completed successfully as well in 5 min
2020-10-01 27519, 2020
adhawkins
Running the mbserver docker containers, I'm seeing entries like this in the logs:
2020-10-01 27521, 2020
adhawkins
musicbrainz_1 | [error] 08006 DBI connect('dbname=musicbrainz_db;host=db;port=5432','musicbrainz',...) failed: FATAL: the database system is starting up
2020-10-01 27506, 2020
adhawkins
According to the output of 'docker ps', the database container has only been up for about 6 hours.
2020-10-01 27549, 2020
adhawkins
'docker-compose logs db' shows that the database did indeed start up around 2am.
2020-10-01 27535, 2020
adhawkins
Any suggestions for working out what's going on? No OOM errors since making the adjustments to shared_buffers and stopping replication during my overnight music 'scan'
2020-10-01 27523, 2020
alastairp
"Your package will be delivered within the next 24 hours"
2020-10-01 27529, 2020
alastairp
thanks, courier company, that's super helpful
2020-10-01 27511, 2020
alastairp
btw: I got about 20 spam PRs in Freesound today, it seems likely that people are trying to pad commits to get credit in hacktoberfest
2020-10-01 27519, 2020
alastairp
it'll be interesting to see if MB gets any
2020-10-01 27509, 2020
reosarevok
That's sad :D
2020-10-01 27511, 2020
adhawkins
Oh, seems like I was wrong about the OOM. It fired around midnight this morning. This time the culprit was 'java'.
2020-10-01 27530, 2020
adhawkins
I'm just running a scan over my music at the moment. When this completes I'll reboot the VM running it all.
hmm. chaban: I'm not seeing that :/ Do you have an example?
2020-10-01 27540, 2020
reosarevok
Oh wait
2020-10-01 27552, 2020
reosarevok
Sorry, I was being dumb
2020-10-01 27553, 2020
reosarevok
I'm seeing it :)
2020-10-01 27503, 2020
BrainzGit
[musicbrainz-server] reosarevok opened pull request #1722 (beta…MBS-10536-redux): MBS-10536 (redux): Remove span.name-variation around "see all" releases link https://github.com/metabrainz/musicbrainz-server/…
i thought at first that it was a topic asking about this mojibake, but it turns out it's fine on bookbrainz - is it possible to fix this on the community side?
[listenbrainz-server] dependabot-preview[bot] opened pull request #1122 (master…dependabot/pip/psycopg2-binary-2.8.6): Bump psycopg2-binary from 2.8.5 to 2.8.6 https://github.com/metabrainz/listenbrainz-server…
lol is dependabot also participating in hacktoberfest
2020-10-01 27538, 2020
iliekcomputers
that's me delegating my hacktoberfest spam to a bot
2020-10-01 27550, 2020
iliekcomputers
this year's t-shirt actually looks pretty good tbh
2020-10-01 27534, 2020
_lucifer
lol, yeah that's true
2020-10-01 27520, 2020
_lucifer
pristine___: i am getting prediction scores from -10 to 10 now. so the prediction scores in themselves probably do not make much sense
2020-10-01 27532, 2020
_lucifer
earlier i was thinking that this is just due to input being not scaled but that turns out to be wrong.
2020-10-01 27543, 2020
pristine___
_lucifer: now? After normalization
2020-10-01 27548, 2020
alastairp
I'm guessing that we're not going to have a meeting on Monday due to tradition?
2020-10-01 27500, 2020
_lucifer
pristine___: yes
2020-10-01 27506, 2020
pristine___
> earlier i was thinking that this is just due to input being not scaled but that turns out to be wrong.
2020-10-01 27508, 2020
pristine___
Yeah.
2020-10-01 27524, 2020
pristine___
I read something on CF scores (pyspark)
2020-10-01 27533, 2020
pristine___
Lemme see if I have a link!
2020-10-01 27511, 2020
TOPIC: MetaBrainz Community and Development channel | MusicBrainz non-development: #musicbrainz | Channel is logged; see https://musicbrainz.org/doc/IRC for details | Summit 20: https://wiki.musicbrainz.org/MusicBrainz_Summit/20
2020-10-01 27543, 2020
pristine___
_lucifer: if the scores don't have a meaning, then the candidate set itself a set of user recommendations :(
2020-10-01 27551, 2020
pristine___
Is a *
2020-10-01 27555, 2020
_lucifer
pristine___: no i mean the scores do have a meaning but only a relative one
2020-10-01 27523, 2020
_lucifer
9.0 > 8.0 for this time but it may or may not be for the next run
2020-10-01 27530, 2020
pristine___
Right
2020-10-01 27514, 2020
pristine___
What I think is we shouldn't show scores to users?
2020-10-01 27527, 2020
pristine___
Just the rec and inputs for feedback
2020-10-01 27509, 2020
_lucifer
yeah makes sense, just sort based on the score but do not show it to the user
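Since the scores are only meaningful relative to each other within one run, they can be used for ordering and then dropped before anything reaches the user. A small sketch of that, assuming recommendations arrive as hypothetical `(recording, score)` pairs (this is not the actual ListenBrainz data structure):

```python
def recommendations_for_display(scored):
    """Order recommendations by score (highest first) and drop the scores."""
    ranked = sorted(scored, key=lambda pair: pair[1], reverse=True)
    return [recording for recording, _score in ranked]


# Hypothetical scored output from one model run.
scored = [("recording-a", 8.0), ("recording-b", 9.0), ("recording-c", -3.5)]

display = recommendations_for_display(scored)
print(display)
```

Only the ordering leaves the function, so a score of 9.0 in one run is never compared against a score from another run.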