pristine___: ishaanshah took one hour but request dataframes completed successfully so issue is not with the mapping
2020-10-01 27504, 2020
testfreenode has quit
2020-10-01 27526, 2020
pristine___
Dataframes created in an hour?
2020-10-01 27537, 2020
_lucifer
yeah
2020-10-01 27512, 2020
ishaanshah
_lucifer: on kaggle or on local dev?
2020-10-01 27525, 2020
_lucifer
local
2020-10-01 27537, 2020
_lucifer
ok my bad it was 2 hours
2020-10-01 27550, 2020
_lucifer
but successful
2020-10-01 27552, 2020
ishaanshah
Oh the full mapping worked?
2020-10-01 27506, 2020
ishaanshah
What changes did you make?
2020-10-01 27510, 2020
ishaanshah
To the config
2020-10-01 27511, 2020
_lucifer
none
2020-10-01 27538, 2020
ishaanshah
You said you got an OOM at first right?
2020-10-01 27540, 2020
_lucifer
i too had thought the issue was the mapping, but i had issued all the commands at once last time
2020-10-01 27553, 2020
_lucifer
this time i am running all commands one by one as they complete
2020-10-01 27507, 2020
_lucifer
there are three left one of which should be the culprit
2020-10-01 27516, 2020
ishaanshah
Oh
2020-10-01 27534, 2020
ishaanshah
I don't know why it ran out of memory when I did it
2020-10-01 27549, 2020
_lucifer
how many listens did you have?
2020-10-01 27559, 2020
ishaanshah
But there was a bug in the query that I wrote, maybe because of that
2020-10-01 27506, 2020
ishaanshah
One incremental dump
2020-10-01 27521, 2020
ishaanshah
I'll try again using matchable text now
2020-10-01 27551, 2020
_lucifer
Listen count from 2020-04-01 17:59:43 to 2020-09-28 17:59:43: 1876719
2020-10-01 27551, 2020
_lucifer
Number of distinct rows in the mapping: 4510905
2020-10-01 27551, 2020
_lucifer
Listen count after mapping: 991921
2020-10-01 27505, 2020
_lucifer
these are my stats, yeah i am using the full matchable mapping
2020-10-01 27512, 2020
pristine___
Two hours, too much.
2020-10-01 27528, 2020
pristine___
I think when you are testing, you should use less data to make the process faster.
2020-10-01 27540, 2020
_lucifer
pristine___: yeah right, for once i want to see how much time it takes but in future i would sure like it to be quicker
2020-10-01 27503, 2020
pristine___
Yes please!
2020-10-01 27506, 2020
_lucifer
let's talk with ruaok on this to get a sample mapping
2020-10-01 27528, 2020
pristine___
In prod it takes 15 min to create df :p
2020-10-01 27553, 2020
_lucifer
my pc is not that bad by those standards then :P ;)
2020-10-01 27525, 2020
Nyanko-sensei joined the channel
2020-10-01 27525, 2020
leonardo joined the channel
2020-10-01 27525, 2020
imdeni joined the channel
2020-10-01 27525, 2020
mruszczyk joined the channel
2020-10-01 27525, 2020
diru1100 joined the channel
2020-10-01 27525, 2020
reg[m] joined the channel
2020-10-01 27525, 2020
joshuaboniface joined the channel
2020-10-01 27525, 2020
djinni` joined the channel
2020-10-01 27531, 2020
shivam-kapila
Yeah
2020-10-01 27544, 2020
shivam-kapila
Only 8 times slower :p
2020-10-01 27549, 2020
pristine___
_lucifer: I think we should first make a small dump of listens, join it with the 11GB mapping to get a subset of the mapping, and do the same with the artist relation. This is the basic idea.
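The subsetting idea above can be sketched roughly as follows. This is a minimal plain-Python illustration, not the project's actual code: the field names (`track_matchable`, `artist_credit_id`, `id_0`, `id_1`, `score`) and the sample rows are hypothetical, and it assumes the artist relation is filtered on its first artist id only. On real dumps the same semi-joins would be done on dataframes instead of lists.

```python
# Hypothetical field names and sample rows; the real listen dump,
# mapping and artist relation schemas may differ.

def semi_join(rows, keep_keys, key):
    """Return only the rows whose `key` value appears in `keep_keys`."""
    return [row for row in rows if row[key] in keep_keys]

# A small sample of listens (stand-in for a small listen dump).
listens = [
    {"user": "a", "track_matchable": "heyjude"},
    {"user": "b", "track_matchable": "yesterday"},
]

# Stand-in for the full mapping.
mapping = [
    {"track_matchable": "heyjude", "artist_credit_id": 1},
    {"track_matchable": "yesterday", "artist_credit_id": 1},
    {"track_matchable": "creep", "artist_credit_id": 2},
]

# Stand-in for the artist relation.
artist_relation = [
    {"id_0": 1, "id_1": 3, "score": 0.9},
    {"id_0": 2, "id_1": 4, "score": 0.5},
]

# Step 1: keep only mapping rows that match a sampled listen.
listened = {listen["track_matchable"] for listen in listens}
mapping_subset = semi_join(mapping, listened, "track_matchable")

# Step 2: keep only relation rows for artists that survived step 1.
artists = {row["artist_credit_id"] for row in mapping_subset}
relation_subset = semi_join(artist_relation, artists, "id_0")

print(len(mapping_subset), len(relation_subset))
```

Both subsets only contain rows reachable from the sampled listens, so later steps run on far less data than the full mapping.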
2020-10-01 27500, 2020
pristine___
> my pc is not that bad by those standards then :P ;)
2020-10-01 27503, 2020
pristine___
Mine toooo.
2020-10-01 27510, 2020
pristine___
So I never try with full dumps :p
2020-10-01 27518, 2020
_lucifer
pristine___: what is the benefit of --html flag? it says it'll generate html files but what is their content and use?
2020-10-01 27538, 2020
shivam-kapila
Observability
2020-10-01 27541, 2020
shivam-kapila
Of results
2020-10-01 27559, 2020
_lucifer
yeah so what will those files contain?
2020-10-01 27519, 2020
pristine___
_lucifer: we have two rn
2020-10-01 27538, 2020
pristine___
One for model, one for candidate bets
2020-10-01 27542, 2020
pristine___
Sets*
2020-10-01 27546, 2020
_lucifer
candidate set one
2020-10-01 27551, 2020
pristine___
Yeah
2020-10-01 27504, 2020
pristine___
I use it to debug discrepancy in candidate set data
2020-10-01 27531, 2020
pristine___
Try generating one for yourself, it will give you and idea
2020-10-01 27536, 2020
pristine___
An*
2020-10-01 27550, 2020
_lucifer
train model completed successfully as well in 5 min
2020-10-01 27519, 2020
adhawkins
Running the mbserver docker containers, I'm seeing entries like this in the logs:
2020-10-01 27521, 2020
adhawkins
musicbrainz_1 | [error] 08006 DBI connect('dbname=musicbrainz_db;host=db;port=5432','musicbrainz',...) failed: FATAL: the database system is starting up
2020-10-01 27506, 2020
adhawkins
According to the output of 'docker ps', the database container has only been up for about 6 hours.
2020-10-01 27549, 2020
adhawkins
'docker-compose logs db' shows that the database did indeed start up around 2am.
2020-10-01 27535, 2020
adhawkins
Any suggestions for working out what's going on? No OOM errors since making the adjustments to shared_buffers and stopping replication during my overnight music 'scan'
2020-10-01 27523, 2020
alastairp
"Your package will be delivered within the next 24 hours"
2020-10-01 27529, 2020
alastairp
thanks, courier company, that's super helpful
2020-10-01 27511, 2020
alastairp
btw: I got about 20 spam PRs in Freesound today, it seems likely that people are trying to pad commits to get credit in hacktoberfest
2020-10-01 27519, 2020
alastairp
it'll be interesting to see if MB gets any
2020-10-01 27509, 2020
reosarevok
That's sad :D
2020-10-01 27511, 2020
adhawkins
Oh, seems like I was wrong about the OOM. It fired around midnight this morning. This time the culprit was 'java'.
2020-10-01 27530, 2020
adhawkins
I'm just running a scan over my music at the moment. When this completes I'll reboot the VM running it all.
hmm. chaban: I'm not seeing that :/ Do you have an example?
2020-10-01 27540, 2020
reosarevok
Oh wait
2020-10-01 27552, 2020
reosarevok
Sorry, I was being dumb
2020-10-01 27553, 2020
reosarevok
I'm seeing it :)
2020-10-01 27503, 2020
BrainzGit
[musicbrainz-server] reosarevok opened pull request #1722 (beta…MBS-10536-redux): MBS-10536 (redux): Remove span.name-variation around "see all" releases link https://github.com/metabrainz/musicbrainz-server/…
i thought at first that it was a topic asking about this mojibake, but it turns out it's fine on bookbrainz - is it possible to fix this on community. side?
[listenbrainz-server] dependabot-preview[bot] opened pull request #1122 (master…dependabot/pip/psycopg2-binary-2.8.6): Bump psycopg2-binary from 2.8.5 to 2.8.6 https://github.com/metabrainz/listenbrainz-server…