both search servers stopped answering at the same time @16:27 UTC
2015-12-14 34804, 2015
stanislas
Freso: i think i finally solved the issue about installing my plugin
2015-12-14 34821, 2015
stanislas
Freso: and i updated it so you might take a look
2015-12-14 34852, 2015
stanislas
Freso: restarting calibre (but not shutting it using ctrl-c) helps
2015-12-14 34847, 2015
ruaok wishes the color between left and right were the same
2015-12-14 34858, 2015
ruaok
zas: the CLOSE_WAIT... I still can't decide if that is a cause or a symptom.
2015-12-14 34821, 2015
zas
it is a symptom
2015-12-14 34854, 2015
regagain_ joined the channel
2015-12-14 34854, 2015
ruaok
yeah, I think so too
2015-12-14 34857, 2015
zas
search server still accepts connections while answering threads are blocked or something
2015-12-14 34806, 2015
ruaok nods
2015-12-14 34808, 2015
zas
looks at established
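A rough way to compare CLOSE_WAIT against ESTABLISHED is to tally the state column of `netstat -tan`-style output. The sketch below assumes the usual six-column Linux layout; the sample rows are invented for illustration and are not output from the search servers.

```python
from collections import Counter


def count_tcp_states(netstat_output: str) -> Counter:
    """Count TCP connection states in `netstat -tan`-style text."""
    states = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Connection rows start with "tcp" or "tcp6"; the state is the sixth column.
        if len(fields) >= 6 and fields[0].startswith("tcp"):
            states[fields[5]] += 1
    return states


# Hypothetical sample, not real output from the search servers.
SAMPLE = """\
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 10.0.0.5:8080           10.0.0.1:51234          ESTABLISHED
tcp        1      0 10.0.0.5:8080           10.0.0.1:51240          CLOSE_WAIT
tcp        1      0 10.0.0.5:8080           10.0.0.1:51241          CLOSE_WAIT
tcp        0      0 0.0.0.0:8080            0.0.0.0:*               LISTEN
"""

if __name__ == "__main__":
    for state, count in count_tcp_states(SAMPLE).most_common():
        print(state, count)
```

A growing CLOSE_WAIT count means the peer has closed its side while the local process has not yet closed the socket, which fits the "symptom" reading if the answering threads are blocked.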
2015-12-14 34820, 2015
ruaok
do you have the stack trace for when things are borked?
2015-12-14 34830, 2015
zas
i have one
2015-12-14 34834, 2015
ruaok
I think we should continue on the path of creating the google doc that we started last week.
2015-12-14 34847, 2015
zas
on ernie in /home/zas/
2015-12-14 34859, 2015
ruaok
update with everything that has changed -- now we're able to really ask for help since we're not using Fred Flintstone's tools anymore.
2015-12-14 34811, 2015
zas
yes
2015-12-14 34800, 2015
ruaok
the stacktrace is much more varied this time.
2015-12-14 34823, 2015
ruaok
last time they were all stuck in icu code, now a lot are stuck throwing an exception.
2015-12-14 34815, 2015
ruaok
ok, the plot is thickening
2015-12-14 34829, 2015
ruaok
a lot of threads are blocked in writing network IO.
2015-12-14 34828, 2015
ruaok
which suggests a gateway (related) issue, not a search issue.
2015-12-14 34859, 2015
yeeeargh joined the channel
2015-12-14 34823, 2015
ruaok
zas: around the time of search server crashes, have you looked at syslog on the active gateway?
2015-12-14 34847, 2015
ruaok sees nothing of interest.
2015-12-14 34820, 2015
ruaok
so, if I read the stackdump correctly, it looks like it is dying trying to write the results to the caller.
2015-12-14 34823, 2015
ruaok
which is nginx
2015-12-14 34835, 2015
ruaok
zas: I think we may want to examine our nginx setup and see if we're running out of ... something.
2015-12-14 34815, 2015
dpmittal has left the channel
2015-12-14 34829, 2015
ruaok
zas: ping me when you're back please.
2015-12-14 34832, 2015
opatel99 joined the channel
2015-12-14 34836, 2015
opatel99
Mineo: I have to admit, I am stumped...
2015-12-14 34854, 2015
Mineo
if you tell me why, maybe we can fix that :)
2015-12-14 34857, 2015
opatel99
ELI5, what should I do? The threading seems straightforward, but the first portion of your comments was cryptic to me. Should albums with MBIDs be clustered?
2015-12-14 34847, 2015
Mineo
imho only if the option to ignore mbids is true
2015-12-14 34823, 2015
Mineo
the automatic clustering would be most useful for files that are not associated with anything in MB yet
2015-12-14 34803, 2015
Mineo
if there are already MBIDs in the files, the MBIDs are much better information than can be provided by the clustering
2015-12-14 34814, 2015
opatel99
Ok... what if there is a combination?
2015-12-14 34857, 2015
Mineo
of options?
2015-12-14 34842, 2015
opatel99
of files with MBID and no MBID. Should I cluster the ones with no MBID and leave the MBIDs alone?
2015-12-14 34808, 2015
Mineo
ah, I had not actually thought of that yet
2015-12-14 34830, 2015
opatel99
:o
2015-12-14 34830, 2015
Mineo
in that case, I think it would make sense to have an additional method on the tagger object like 'cluster_non_mbid_files' or something that goes through all unmatched files and collects the ones without MBIDs and clusters those
2015-12-14 34834, 2015
Mineo
tl;dr: yes
2015-12-14 34828, 2015
opatel99
Okay. Now what about that in combination with ignore MBIDs? If that option is selected, do everything?
2015-12-14 34846, 2015
Mineo
yes, just cluster all files in that case
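The rule agreed on above could be sketched roughly like this. `AudioFile`, `files_to_cluster` and the `musicbrainz_albumid` key are assumptions made for illustration; this is not Picard's actual tagger API.

```python
from dataclasses import dataclass, field


@dataclass
class AudioFile:
    # Hypothetical stand-in for a Picard file object.
    path: str
    metadata: dict = field(default_factory=dict)


def files_to_cluster(unmatched_files, ignore_mbids):
    """Pick the unmatched files that automatic clustering should handle."""
    if ignore_mbids:
        # With "ignore MBIDs" enabled, every unmatched file is clustered.
        return list(unmatched_files)
    # Otherwise files that already carry an album MBID are left alone.
    return [f for f in unmatched_files
            if not f.metadata.get("musicbrainz_albumid")]


if __name__ == "__main__":
    files = [
        AudioFile("a.flac", {"musicbrainz_albumid": "some-mbid"}),
        AudioFile("b.flac"),
    ]
    print([f.path for f in files_to_cluster(files, ignore_mbids=False)])
    print([f.path for f in files_to_cluster(files, ignore_mbids=True)])
```

Keeping the selection in one small function means the mixed case (some files with MBIDs, some without) and the "ignore MBIDs" case share one code path.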
2015-12-14 34832, 2015
opatel99
Cool. Giving it a shot.
2015-12-14 34808, 2015
opatel99
Got any more Picard tasks up your sleeve btw? I am kinda useless here without Picard...
2015-12-14 34858, 2015
typhoe
Hello again, when trying to import dumps for the first time with the command "./admin/InitDb.pl -- --createdb --import /tmp/dumps/mbdump*.tar.bz2 --echo", I get an error "psql: FATAL: role "musicbrainz" does not exist"
2015-12-14 34853, 2015
zas
ruaok: i checked the nginx conf with bitmap, and we saw nothing wrong with it (that doesn't mean nothing is wrong)
2015-12-14 34803, 2015
typhoe
Should I create a role or create a clean db (--createdb --clean) before?
2015-12-14 34812, 2015
ruaok
understood.
2015-12-14 34829, 2015
ruaok
there is an admin interface that we can get current stats from, yes?
2015-12-14 34830, 2015
Mineo
opatel99: I was thinking of making a task to improve/rewrite https://picard.musicbrainz.org/docs/scripting/ because a lot of people struggle with scripting, but I'm not yet sure what exactly needs to be improved
2015-12-14 34858, 2015
ruaok
I wonder if we should graph the number of buffers, number of connections, anything for the search* configurations.
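If the stats in question come from nginx's `stub_status` page (an assumption; the log does not name the module), the connection counters could be pulled out for graphing with something like the sketch below. The sample text follows the documented `stub_status` layout, but the numbers are made up.

```python
import re


def parse_stub_status(text: str) -> dict:
    """Extract the counters from an nginx stub_status page into a dict."""
    stats = {}
    match = re.search(r"Active connections:\s*(\d+)", text)
    if match:
        stats["active"] = int(match.group(1))
    # The line after "server accepts handled requests" holds three totals.
    match = re.search(r"^\s*(\d+)\s+(\d+)\s+(\d+)\s*$", text, re.MULTILINE)
    if match:
        accepts, handled, requests = map(int, match.groups())
        stats.update(accepts=accepts, handled=handled, requests=requests)
    for key in ("Reading", "Writing", "Waiting"):
        match = re.search(rf"{key}:\s*(\d+)", text)
        if match:
            stats[key.lower()] = int(match.group(1))
    return stats


# Made-up sample in the stub_status layout.
SAMPLE_STATUS = """\
Active connections: 291
server accepts handled requests
 16630948 16630948 31070465
Reading: 6 Writing: 179 Waiting: 106
"""

if __name__ == "__main__":
    print(parse_stub_status(SAMPLE_STATUS))
```

A gap between `accepts` and `handled` would be one sign that nginx is dropping connections because it ran out of some resource.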
2015-12-14 34825, 2015
ruaok
this latest stacktrace really suggests that this is an internal configuration issue and not a lucene/java issue.
I'd love to get your read on the current state of things.
2015-12-14 34823, 2015
akirom has quit
2015-12-14 34820, 2015
stanislas
LordSputnik, Leftmost: I've done my second plugin. I would be grateful if you review my work. I've not submitted it on gci yet, I just want to know your opinion at this stage. Link to my repo : https://github.com/stasszczesniak/CalibreBookBrai…
2015-12-14 34837, 2015
ruaok
zas: this doesn't seem detailed enough for my desires, but let's start graphing:
opatel99: You could expand your horizons! I'm about to add two more CB tasks, and there's a bunch of unclaimed beets tasks too, as I mentioned previously.
stanislas: I won't be able to try it out until tomorrow, but one thing you could do to improve would be to split out the bits of code that initialize the UI into separate functions, so the large methods you have at the moment become smaller and easier to maintain
2015-12-14 34835, 2015
Freso
^ +1 (even if I haven't actually looked at the code :))
2015-12-14 34847, 2015
zas
ruaok: this has been collected for some time already, where possible (=module enabled)
2015-12-14 34826, 2015
Mineo
regarding the stackdump: I wonder why a lot of threads are in some EOFException, all coming from eclipse-persistence's JSONWriterRecord
2015-12-14 34845, 2015
stanislas
LordSputnik. Ok, i will try to clean my code. Thanks. Maybe Leftmost is willing to review it today.
2015-12-14 34801, 2015
LordSputnik
stanislas: hopefully! I'm willing, but university deadlines mean that I've not got the time this evening :(
2015-12-14 34833, 2015
opatel99
I have exams this entire week... Gonna be so behind once I am done..
2015-12-14 34834, 2015
stanislas
LordSputnik: Oh i understand, i have geometry exam tomorrow :)
2015-12-14 34814, 2015
reosarevok
stanislas: less coding, more studying! ;)
2015-12-14 34844, 2015
regagain_ has quit
2015-12-14 34846, 2015
LordSputnik
stanislas: I could add a day onto the task if you like, just in case we don't have it wrapped up by tomorrow evening?
2015-12-14 34839, 2015
Mineo
regarding the bb plugin for calibre: you're aware that you're working around Qt's event model by using urllib, right?
2015-12-14 34857, 2015
ruaok
Mineo: yes, exactly that.
2015-12-14 34820, 2015
stanislas
LordSputnik: seems like a good idea
2015-12-14 34828, 2015
stanislas
Mineo: What do you mean ?
2015-12-14 34828, 2015
Freso
stanislas: I'll give it a whirl. :)
2015-12-14 34836, 2015
ruaok
JSONWriterRecord sounds like the bog-standard "send the response to the caller" step, and it gets stuck somehow.
2015-12-14 34853, 2015
ruaok
and that somehow would be caused by nginx, since that is what is on the other end.
2015-12-14 34811, 2015
ruaok
thus leading me to think we need to examine our nginx config.
2015-12-14 34822, 2015
stanislas
Mineo: No, I don't.
2015-12-14 34827, 2015
ruaok
Mineo: does that line of thinking make sense to you?
2015-12-14 34810, 2015
stanislas
Mineo: I don't even understand what you mean by "working around Qt's event model by using urllib"
2015-12-14 34829, 2015
Mineo
ruaok: what's surprising to me is that none of the EOFExceptions are related to xml responses getting written, although I suspect the number of those to be way higher than the json ones
2015-12-14 34820, 2015
stanislas
Mineo: Are you talking about my plugin ?
2015-12-14 34821, 2015
Mineo
oh, wait, the website uses json as well, right?
2015-12-14 34828, 2015
ruaok
yes.
2015-12-14 34836, 2015
ruaok
IIRC
2015-12-14 34846, 2015
ruaok
still, that is an interesting observations.
2015-12-14 34848, 2015
ruaok
-s
2015-12-14 34803, 2015
Mineo
stanislas: sorry, I didn't want to try having two conversations at once :-)
2015-12-14 34826, 2015
stanislas
Mineo: ok
2015-12-14 34806, 2015
Mineo
stanislas: calibre seems to be built on Qt which models everything i/o-related (reading files, sending data over the network etc.) as events with callbacks attached to them
2015-12-14 34809, 2015
Mineo
this allows it to do other things while i/o is happening in the background without having to spawn a new thread for every action
2015-12-14 34839, 2015
Mineo
by using urllib to request data from bookbrainz, everything else is blocked while the http request is in progress
2015-12-14 34843, 2015
Mineo
this works if bookbrainz is responding fast, but doesn't work quite so well if the bookbrainz servers take a long time to respond or are completely offline
2015-12-14 34855, 2015
stanislas
Mineo: Would doing all https requests in some other thread solve the problem ?
2015-12-14 34800, 2015
Mineo
yes and no :P I would expect there to be some helper methods for plugins in calibre
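The blocking problem and the worker-thread idea can be illustrated with the standard library alone. This is only a sketch: `slow_fetch` stands in for a blocking urllib call, the URL is a placeholder, and nothing here is calibre's or Qt's API.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_fetch(url: str, delay: float = 0.2) -> str:
    """Stand-in for a blocking HTTP request (e.g. one made with urllib)."""
    time.sleep(delay)  # simulates a slow server
    return f"response from {url}"


def fetch_in_background(executor, url, on_done):
    """Run the blocking call on a worker thread; pass the result to on_done."""
    future = executor.submit(slow_fetch, url)
    future.add_done_callback(lambda f: on_done(f.result()))
    return future


def demo():
    results = []
    ticks = 0
    with ThreadPoolExecutor(max_workers=1) as executor:
        future = fetch_in_background(
            executor, "https://example.invalid/book", results.append)
        # The caller stays free to do other work while the request is in flight.
        while not future.done():
            ticks += 1
            time.sleep(0.01)
    return results, ticks


if __name__ == "__main__":
    print(demo())
```

Note that the callback here usually runs on the worker thread. In a Qt application the result would still have to be handed back to the GUI thread, which may be part of the "yes and no" and is why a helper provided by calibre, if one exists, would be preferable.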
2015-12-14 34806, 2015
opatel99
Mineo: Can you explain why the upload for many files fails with auto cluster, but not without?
2015-12-14 34821, 2015
regagain_ joined the channel
2015-12-14 34822, 2015
bitmap
ruaok: we've been getting ISEs for cut-off JSON response from the search server since at least 2013, probably longer
2015-12-14 34845, 2015
Mineo
opatel99: no, I don't really know why that happens, but I think the clustering engine is not meant to be called from multiple threads
2015-12-14 34847, 2015
ruaok
bitmap: that is interesting.
2015-12-14 34854, 2015
opatel99
So QSemaphore reserves threads?
2015-12-14 34855, 2015
ruaok
how frequent are those?
2015-12-14 34816, 2015
ruaok
I wonder if they happen due to index rotation or if they happen when the servers choke
2015-12-14 34840, 2015
ruaok
can we graph the occurrence of those, bitmap, zas?
2015-12-14 34850, 2015
bitmap
we usually get a couple/few a day I think
2015-12-14 34808, 2015
Mineo
opatel99: no, it allows you to count the number of active threads (which you can't just do by incrementing a normal variable in multiple threads)
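The counting idea can be sketched without Qt: a counter guarded by a lock, so that the read-modify-write of each increment cannot interleave between threads. `ActiveThreadCounter` is a hypothetical standard-library analogue written for illustration, not QSemaphore itself.

```python
import threading


class ActiveThreadCounter:
    """Thread-safe count of workers currently inside a `with` block."""

    def __init__(self):
        self._lock = threading.Lock()
        self._active = 0
        self.peak = 0  # highest number of simultaneous workers seen

    def __enter__(self):
        with self._lock:
            self._active += 1
            self.peak = max(self.peak, self._active)
        return self

    def __exit__(self, *exc_info):
        with self._lock:
            self._active -= 1
        return False  # do not swallow exceptions

    @property
    def active(self):
        with self._lock:
            return self._active


def run_workers(n):
    counter = ActiveThreadCounter()
    barrier = threading.Barrier(n)

    def work():
        with counter:
            # Hold every worker inside the counted region at the same time.
            barrier.wait()

    threads = [threading.Thread(target=work) for _ in range(n)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter


if __name__ == "__main__":
    counter = run_workers(4)
    print(counter.peak, counter.active)
```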
2015-12-14 34822, 2015
bitmap
they include the JSON and show where it gets cut off (then mbserver fails to parse it, hence the ISE)
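The failure bitmap describes can be reproduced in miniature: a response that is cut off partway no longer parses. The sketch below is in Python rather than mbserver's own code, and `parse_search_response` is a hypothetical helper.

```python
import json


def parse_search_response(body: str):
    """Return (data, None) on success, or (None, error) if the body is invalid."""
    try:
        return json.loads(body), None
    except json.JSONDecodeError as exc:
        # exc.pos is the character offset where parsing failed.
        return None, f"invalid JSON at char {exc.pos}: {exc.msg}"


if __name__ == "__main__":
    full = '{"count": 1, "artists": [{"name": "x"}]}'
    print(parse_search_response(full))
    print(parse_search_response(full[:20]))  # simulate a cut-off response
```

Logging the failure offset next to the expected body length would make it easier to graph how often the truncation happens and whether it lines up with index rotation.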
2015-12-14 34847, 2015
ruaok
ok, a few a day really seems to be related to the index rotation.
2015-12-14 34805, 2015
ruaok
not too much we can do about that with the current setup