13:49 PM
alastairp
yes, the old dump is a different table structure
2019-05-09 12936, 2019
13:49 PM
alastairp
and for years I've been saying that I'd fix them, and make a small dump for developers/testing
2019-05-09 12940, 2019
13:49 PM
alastairp
and never got around to it
2019-05-09 12942, 2019
13:49 PM
alastairp
so that's a thing
2019-05-09 12938, 2019
13:50 PM
aidanlw17
Gotcha - happens to all of us
2019-05-09 12956, 2019
13:50 PM
aidanlw17
Maybe we can add that to the list of PRs to get done then?
2019-05-09 12910, 2019
13:51 PM
aidanlw17
It would probably be helpful when we get into building the similarity index
2019-05-09 12925, 2019
13:52 PM
alastairp
yeah, right. we have some PRs
2019-05-09 12950, 2019
13:52 PM
alastairp
2019-05-09 12959, 2019
13:52 PM
alastairp
but it's really difficult to test. takes a long time
2019-05-09 12923, 2019
13:53 PM
alastairp
this is why I'm excited about the submission offset patch, it's going to speed up dumps significantly
2019-05-09 12929, 2019
13:53 PM
alastairp
and will solve many of the problems that I thought we had
2019-05-09 12942, 2019
13:53 PM
alastairp
so maybe this summer will be the time to get these finished
2019-05-09 12906, 2019
13:54 PM
alastairp
yes, for testing similarity we have a handful of things to do there - we can probably get you a bunch of items for you to test with
2019-05-09 12926, 2019
13:54 PM
alastairp
once we want to test large-scale we'll get another development server and make an entire copy of the current AB database
2019-05-09 12921, 2019
13:56 PM
aidanlw17
yeah! I think so... I'm up for it :) hopefully we can get the offset patch up by Monday or on the weekend?
2019-05-09 12947, 2019
13:57 PM
alastairp
my goal is by the end of next week
2019-05-09 12915, 2019
14:01 PM
aidanlw17
Sounds good. I'll let you know when I get the second part up, and we can probably plan out some of the testing when we go over the proposal monday
2019-05-09 12901, 2019
14:02 PM
alastairp
sounds good
2019-05-09 12906, 2019
14:05 PM
zas
2019-05-09 12942, 2019
14:05 PM
ruaok
thank you!
2019-05-09 12954, 2019
14:07 PM
alastairp
have you just started this statistic?
2019-05-09 12902, 2019
14:08 PM
ruaok
yes
2019-05-09 12916, 2019
14:08 PM
alastairp
great
2019-05-09 12941, 2019
14:08 PM
alastairp
how easy is it to split into GET/POST and url? (highlevel/lowlevel get)
2019-05-09 12913, 2019
14:10 PM
zas
alastairp: it isn't a full-blown web log analyzer, but rather a very quick one I created for mbs high traffic to get near real-time stats
2019-05-09 12948, 2019
14:10 PM
alastairp
👍 ok
2019-05-09 12959, 2019
14:11 PM
alastairp
how long will you be in Barcelona for?
2019-05-09 12907, 2019
14:12 PM
zas
till 18
2019-05-09 12913, 2019
14:12 PM
alastairp
great
2019-05-09 12918, 2019
14:12 PM
alastairp
let's do something before you go
2019-05-09 12925, 2019
14:12 PM
ZoeB joined the channel
2019-05-09 12932, 2019
14:12 PM
zas
sure :) like drinking beers ??
2019-05-09 12942, 2019
14:12 PM
alastairp
(and so we can discuss logging in more detail)
2019-05-09 12946, 2019
14:12 PM
alastairp
sounds good
2019-05-09 12900, 2019
14:13 PM
alastairp
I can bring some of Mr_Monkey and my homebrew to officebrainz
2019-05-09 12912, 2019
14:13 PM
ruaok
black IPA, please!
2019-05-09 12918, 2019
14:13 PM
alastairp
mmmm
2019-05-09 12922, 2019
14:13 PM
alastairp
no more black ipa sorry :(
2019-05-09 12925, 2019
14:13 PM
ruaok
not that I was asked.
2019-05-09 12927, 2019
14:13 PM
alastairp
we have some imperial ipa
2019-05-09 12927, 2019
14:13 PM
ruaok
boo.
2019-05-09 12928, 2019
14:13 PM
zas
btw, wait a bit for the stats to gather more data, it does a sum each minute, we should have something significant in 20 mins
2019-05-09 12958, 2019
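A minimal sketch of the kind of per-minute summing described here, also split by method and endpoint as alastairp asked about. This is not zas's tool: the record layout, the `tally` helper, and the sample requests are all assumptions made up for illustration.

```python
from collections import Counter

def tally(records):
    """Sum requests per (minute, method, endpoint, status)."""
    counts = Counter()
    for timestamp, method, path, status in records:
        # Truncate "2019-05-09T14:15:07" to the minute: "2019-05-09T14:15".
        minute = timestamp[:16]
        # Hypothetical grouping: use the last path segment
        # ("low-level" or "high-level") as the endpoint name.
        endpoint = path.rstrip("/").rsplit("/", 1)[-1]
        counts[(minute, method, endpoint, status)] += 1
    return counts

# Invented sample records, not real traffic.
sample = [
    ("2019-05-09T14:15:07", "GET", "/api/v1/<mbid>/low-level", 200),
    ("2019-05-09T14:15:19", "GET", "/api/v1/<mbid>/low-level", 404),
    ("2019-05-09T14:15:42", "GET", "/api/v1/<mbid>/high-level", 200),
    ("2019-05-09T14:16:03", "POST", "/api/v1/<mbid>/low-level", 200),
]
counts = tally(sample)
for key, n in sorted(counts.items()):
    print(key, n)
```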
14:13 PM
alastairp
the black ipa is difficult
2019-05-09 12912, 2019
14:14 PM
alastairp
it still smells pretty stout-y straight out of the bottle
2019-05-09 12918, 2019
14:14 PM
alastairp
and it skunks up really quickly
2019-05-09 12919, 2019
14:14 PM
Mr_Monkey
ruaok: The imperial IPA is quite nice!
2019-05-09 12933, 2019
14:14 PM
ruaok
ok, I shan't be too picky.
2019-05-09 12937, 2019
14:14 PM
alastairp
after 2-3 months it's basically a stout. it loses all of the hops
2019-05-09 12951, 2019
14:14 PM
alastairp
I have to experiment a bit more with it
2019-05-09 12937, 2019
14:15 PM
zas
1k 200s per minute... hmmm
2019-05-09 12943, 2019
14:15 PM
zas
that's a lot
2019-05-09 12942, 2019
14:16 PM
zas
and a lot of 404s too (almost 500 per minute)
2019-05-09 12919, 2019
14:17 PM
alastairp
right - because people query the API for all mbids that they have to get data, if we don't have data for that mbid we return 404
2019-05-09 12922, 2019
14:17 PM
ruaok
404s are not surprising.
2019-05-09 12928, 2019
14:17 PM
ruaok
that. :)
2019-05-09 12940, 2019
14:17 PM
alastairp
remember, no rate limiting or api keys on AB
2019-05-09 12924, 2019
14:18 PM
alastairp
we have bulk-get endpoints for lowlevel, we should encourage that more
2019-05-09 12941, 2019
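A rough sketch of how a client could batch MBIDs for a bulk-get request instead of sending one request per MBID. The URL shape (`recording_ids` joined with `;`) and the batch size of 25 are assumptions here, so check the AcousticBrainz API docs before relying on them; `batches` and `bulk_url` are hypothetical helper names.

```python
def batches(mbids, size=25):
    """Yield consecutive slices of at most `size` MBIDs (size is an assumption)."""
    for start in range(0, len(mbids), size):
        yield mbids[start:start + size]

def bulk_url(batch, base="https://acousticbrainz.org/api/v1/low-level"):
    """Build one bulk request URL; the parameter name and separator are assumed."""
    return base + "?recording_ids=" + ";".join(batch)

# Placeholder ids standing in for real recording MBIDs.
mbids = [f"mbid-{i}" for i in range(60)]
urls = [bulk_url(batch) for batch in batches(mbids)]
print(len(urls), "requests instead of", len(mbids))
```

With 60 ids this makes 3 requests rather than 60, which is the reason to encourage the bulk endpoint.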
14:18 PM
ruaok
alastairp: I'm going to bump up work_mem again. ok for me to proceed?
2019-05-09 12949, 2019
14:18 PM
zas
still ~1.5k requests / min -> 25 req/s, but we'll get a better figure after a while
2019-05-09 12918, 2019
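The arithmetic behind that figure, as a small sketch. It uses the rounded numbers quoted above (1k 200s and about 500 404s per minute) and ignores any other status codes, which is an assumption.

```python
ok_per_min = 1000          # "1k 200s per minute"
not_found_per_min = 500    # "almost 500 per minute", rounded up

total_per_min = ok_per_min + not_found_per_min
per_second = total_per_min / 60
not_found_share = not_found_per_min / total_per_min

print(f"{total_per_min} req/min = {per_second:.0f} req/s, "
      f"{not_found_share:.0%} of them 404s")
```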
14:19 PM
alastairp
I'm making a list of things to discuss in the AcousticBrainz board on trello
2019-05-09 12942, 2019
14:19 PM
zas
good thing > 50% are gzipped
2019-05-09 12957, 2019
14:19 PM
zas
for mbs web service that's a very low 3% ...
2019-05-09 12909, 2019
14:20 PM
zas
and 45% for mb website
2019-05-09 12932, 2019
14:21 PM
zas
is there any rate limit ?
2019-05-09 12949, 2019
14:21 PM
alastairp
no rate limit
2019-05-09 12904, 2019
14:22 PM
ruaok
that should be fixed pretty soon, methinks.
2019-05-09 12905, 2019
14:22 PM
alastairp
2019-05-09 12950, 2019
14:22 PM
aidanlw17
alastairp should I join the trello?
2019-05-09 12910, 2019
14:23 PM
alastairp
I'm not sure you can, but it's not important
2019-05-09 12920, 2019
14:23 PM
alastairp
I use it only to keep track of tickets that I've merged but not released
2019-05-09 12930, 2019
14:23 PM
alastairp
we do everything else in jira
2019-05-09 12911, 2019
14:24 PM
aidanlw17
Okay no worries then. Just wanted to make sure I wasn't missing something important
2019-05-09 12928, 2019
14:24 PM
ZoeB
2019-05-09 12942, 2019
14:24 PM
ZoeB
I thought it was because it's joint by another artist as well, but upon a closer look,
https://musicbrainz.org/release-group/7a2bb171-77… *is* included and that's a joint artist work too... Is there any other reason that EP might be excluded from the results?
2019-05-09 12916, 2019
14:25 PM
alastairp
ZoeB: there are only 25 results there, but 31 in total
2019-05-09 12920, 2019
14:25 PM
ZoeB
(I can see an argument that I *may* have over-engineered my website to be reliant upon this...)
2019-05-09 12925, 2019
14:25 PM
alastairp
did you use the offset/limit?
2019-05-09 12930, 2019
14:25 PM
ZoeB
Oh, it's paginated?
2019-05-09 12933, 2019
14:25 PM
alastairp
yep
2019-05-09 12940, 2019
14:25 PM
alastairp
default 25, you can select up to 100
2019-05-09 12954, 2019
14:25 PM
travis-ci joined the channel
2019-05-09 12954, 2019
14:25 PM
travis-ci
2019-05-09 12954, 2019
14:25 PM
travis-ci has left the channel
2019-05-09 12954, 2019
14:25 PM
ZoeB
&limit=100?
2019-05-09 12909, 2019
14:26 PM
alastairp
that looks good
2019-05-09 12916, 2019
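A small sketch of the offset/limit paging described above: default 25 per page, up to 100. ZoeB's actual query isn't shown in the log, so the release-group browse URL, the `ARTIST-MBID` placeholder, and the `page_urls` helper are assumptions for illustration only.

```python
from urllib.parse import urlencode

def page_urls(base, params, total, limit=25):
    """Return one URL per page needed to cover `total` results."""
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    return [
        base + "?" + urlencode({**params, "limit": limit, "offset": offset})
        for offset in range(0, total, limit)
    ]

base = "https://musicbrainz.org/ws/2/release-group"   # assumed endpoint
params = {"artist": "ARTIST-MBID", "fmt": "json"}     # placeholder MBID

default_pages = page_urls(base, params, total=31)             # offsets 0 and 25
single_page = page_urls(base, params, total=31, limit=100)    # offset 0 only
for url in default_pages:
    print(url)
```

With 31 results the default limit needs two requests (the second one picks up the last 6), while `limit=100` gets everything in one.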
14:26 PM
ZoeB
Thank you!
2019-05-09 12928, 2019
14:26 PM
alastairp
I note that you're also doing a query for release-groups, but you're saying that a release isn't in the results
2019-05-09 12935, 2019
14:26 PM
alastairp
was that an error?
2019-05-09 12922, 2019
14:27 PM
ruaok
AB is back. the work_mem doesn't seem to be reducing the temp files. let me let it settle down for a bit.
2019-05-09 12906, 2019
14:28 PM
zas
ruaok: what's the size of the ab database ?
2019-05-09 12924, 2019
14:28 PM
ruaok
waaaay bigger than reo's mum.
2019-05-09 12930, 2019
14:28 PM
zas
i mean in megabytes ? does it fit in ram or not ?
2019-05-09 12934, 2019
14:28 PM
ruaok
no.
2019-05-09 12923, 2019
14:29 PM
ZoeB
I'm somewhat manually recursing, pulling in "http://musicbrainz.org/ws/2/release?release-group={$releaseGroupID}&inc=recordings+artist-credits+url-rels&fmt=json" for each result. I only had
https://wiki.musicbrainz.org/Development/JSON_Web… to go on, so a bit of trial and error was involved. It's not my best work, I'll be honest... (;-.-)
2019-05-09 12912, 2019
14:30 PM
ruaok
zas: 588G /var/lib/docker/volumes/postgres-acousticbrainz-data/_data/base/130618
2019-05-09 12927, 2019
14:30 PM
alastairp
right, that's fine. I just wasn't sure if you were doing that step when you said "it's not there"
2019-05-09 12927, 2019
14:30 PM
ruaok
that is the largest file on disk.
2019-05-09 12932, 2019
14:30 PM
Mr_Monkey
2019-05-09 12933, 2019
14:30 PM
ruaok
close to 600GB in total
2019-05-09 12917, 2019
14:31 PM
ZoeB
Ah, that fixed it, thank you so much!
2019-05-09 12940, 2019
14:31 PM
alastairp
ZoeB: you can also do /release?artist={artist-id} if that helps? if you get 100 items at a time it might be less queries than doing 1 for every release-group-id?
2019-05-09 12928, 2019
14:32 PM
alastairp
2019-05-09 12946, 2019
14:32 PM
alastairp
ignore the fact that it says XML Web Service, it's the same syntax, the only difference is &fmt=
2019-05-09 12947, 2019
14:32 PM
ZoeB
Thank you, I'll look into refactoring it like that! I definitely don't want to strain your server.
2019-05-09 12956, 2019
14:32 PM
djwhitey has quit
2019-05-09 12911, 2019
14:34 PM
iliekcomputers
Mr_Monkey: yes, sorry, I'll look at it today (he said again).
2019-05-09 12929, 2019
14:34 PM
zas
frank has hard drives, not SSD, much slower
2019-05-09 12930, 2019
14:34 PM
ruaok
work mem at 128MB is doing better, but I feel that value is too large.
2019-05-09 12953, 2019
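One way to see why 128MB might feel too large: in PostgreSQL, work_mem applies per sort or hash operation, not per server, so several queries running at once can each use it, sometimes more than once. A back-of-the-envelope sketch follows; the 50 concurrent queries and 2 operations per query are made-up numbers, not measurements from frank.

```python
def work_mem_upper_bound_mb(work_mem_mb, concurrent_queries, ops_per_query):
    """Rough worst case: every sort/hash in every running query uses work_mem in full."""
    return work_mem_mb * concurrent_queries * ops_per_query

# Hypothetical load: 50 concurrent queries, 2 sort/hash operations each.
bound_128 = work_mem_upper_bound_mb(128, concurrent_queries=50, ops_per_query=2)
bound_512 = work_mem_upper_bound_mb(512, concurrent_queries=50, ops_per_query=2)

print(f"work_mem=128MB -> up to {bound_128 / 1024:.1f} GB")
print(f"work_mem=512MB -> up to {bound_512 / 1024:.1f} GB")
```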
14:34 PM
Mr_Monkey
iliekcomputers: Having fixed my issues (I forgot to copy crucial service files a couple of PRs ago), there's no huge rush.
2019-05-09 12913, 2019
14:35 PM
ZoeB has left the channel
2019-05-09 12918, 2019
14:40 PM
zas
ruaok: where can i see the current frank's pg config ?
2019-05-09 12958, 2019
14:40 PM
ruaok
2019-05-09 12907, 2019
14:41 PM
ruaok
are all the default values.
2019-05-09 12958, 2019
14:42 PM
zas
shared_buffers is 128M ? that looks very low to me
2019-05-09 12907, 2019
14:43 PM
zas
2019-05-09 12911, 2019
14:43 PM
ruaok
2019-05-09 12925, 2019
14:43 PM
ruaok
those are the actual values that override the defaults.
2019-05-09 12935, 2019
14:43 PM
zas
ah ok
2019-05-09 12939, 2019
14:43 PM
ruaok
16GB, but it should be 32GB or even 40GB.
2019-05-09 12945, 2019
14:43 PM
ruaok
that is the next thing I want to change.
2019-05-09 12931, 2019
14:45 PM
zas
yup, i'd say at least 32GB, especially for such big db
2019-05-09 12938, 2019
14:45 PM
ruaok
2019-05-09 12902, 2019
14:46 PM
ruaok
but I want to let the current work_mem at 128MB run for a bit.
2019-05-09 12908, 2019
14:46 PM
zas
lgtm, but i don't expect a miracle
2019-05-09 12940, 2019
14:46 PM
ruaok
the miracle fixing comes from an ORDER BY clause gets removed.
2019-05-09 12953, 2019
14:46 PM
ruaok
*being
2019-05-09 12946, 2019
14:47 PM
ruaok
disks pegged to 100% again.
2019-05-09 12901, 2019
14:48 PM
ruaok
load 23. ok, never mind, I'll push this out now.
2019-05-09 12929, 2019
14:48 PM
ruaok
hit approve on the PR, zas?
2019-05-09 12920, 2019
14:49 PM
zas
done
2019-05-09 12923, 2019
14:49 PM
ruaok
thx
2019-05-09 12949, 2019
14:49 PM
zas
sorry, my connection is unstable (phone+train)
2019-05-09 12900, 2019
14:51 PM
ruaok
no worries.
2019-05-09 12910, 2019
14:51 PM
ruaok
you on AVE or TGV?
2019-05-09 12951, 2019
14:51 PM
zas
ave
2019-05-09 12912, 2019
14:52 PM
zas
did you notice frank's disk I/O are mostly writes?
2019-05-09 12954, 2019
14:52 PM
yvanzo
hi zas: I checked MBS-7130 and it is not resolved yet.
2019-05-09 12955, 2019
14:52 PM
BrainzBot
2019-05-09 12904, 2019
14:53 PM
ruaok
yes. query temp storage.
2019-05-09 12909, 2019
14:53 PM
ruaok
that's what I've been working to address.
2019-05-09 12956, 2019
14:54 PM
zas
ok, makes sense, hence work_mem changes
2019-05-09 12914, 2019
14:55 PM
zas
what was the highest value you tried for work_mem ?
2019-05-09 12949, 2019
14:55 PM
ruaok
the current of 128MB
2019-05-09 12935, 2019
14:58 PM
chhavi_ joined the channel
2019-05-09 12901, 2019
14:59 PM
zas
try 512M, just to see if it has any effect, i think queries and/or indexes just need huge optimization, especially if tables/db are big
2019-05-09 12947, 2019
14:59 PM
ruaok
first I want to see the effect of more shared buffers.
2019-05-09 12926, 2019
15:00 PM
zas
the traffic isn't that high, we could set up telegraf to collect frank's pg stats as we do for bowie, but it needs some config on pg side, we may do that tomorrow
2019-05-09 12945, 2019
15:00 PM
zas
just to see how many transactions etc...
2019-05-09 12913, 2019
15:02 PM
yvanzo
ruaok: I just checked the Alpine CVE, Solr doesn’t rely on system authentication, thus mb-solr is not affected.