in #metabrainz

13:27 PM
alastairp

(because that PR changed a _lot_ of lines)
13:27 PM
_lucifer

yeah makes, its now just adding a handler in initial conf and be done with it.
13:27 PM
*makes sense
13:30 PM
leroivi has quit
13:30 PM
i was thinking of removing the pytest deps from requirements_spark.txt by adding another step to build the test image in Dockerfile.spark. thoughts?
13:31 PM
alastairp

yeah, no problem. though can we reuse requirements_dev to install them?
13:32 PM
_lucifer

no actually we have them in requirements_spark.txt as well.
13:32 PM
ah yes right we can
13:33 PM
MRiddickW joined the channel
13:36 PM
alastairp

yeah, great. I just didn't want to hard-code dependencies in Dockerfile, and I didn't want _another_ requirements file
13:39 PM
mckean has quit
13:40 PM
mckean joined the channel
13:41 PM
reosarevok

yvanzo: around?
13:47 PM
_lucifer

alastairp: what do you think we should do about pyspark dep? adding to requirements_development.txt seems wrong. if we keep in requirements_spark.txt it'll be for sake of tests only.
13:47 PM
alastairp

ah, interesting question
13:48 PM
_lucifer

also, i saw you added `sentry_sdk[pyspark]` instead of just `sentry_sdk` any particular reason?
13:48 PM
alastairp

I think it makes sense to keep it in requirements_spark.txt, it feels like the better place for it
13:48 PM
mmm, good point. sentry_sdk[pyspark] ensures that pyspark is also installed (it makes it a dependency)
13:49 PM
_lucifer

i don't think we want that as pyspark should be provided externally by spark
13:49 PM
and we uninstall it just after anyways
13:49 PM
alastairp

but if the spark setup provides pyspark without us needing to add it, perhaps just sentry_sdk is OK
13:50 PM
why do we uninstall it?
13:50 PM
I don't know how the spark/pyspark thing works
13:50 PM
how does spark provide pyspark? does it put it in sys.path? is it before or after the rest of our dependencies?
13:50 PM
_lucifer

because we pack the deps in venv in a zip and send it to all spark workers
13:51 PM
yvanzo

reosarevok: yup
13:51 PM
reosarevok

yvanzo: see my comments on the PR :)
13:53 PM
_lucifer

i am not familiar with how it works under the hood to supply pyspark.
13:54 PM
alastairp

neither am I
13:54 PM
sorry, I don't have a lot of time today. I'll be at the office tomorrow and would be happy to talk through it with you
13:55 PM
_lucifer

sure, thanks!
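The open question above (where pyspark comes from at runtime, and whether it sits before or after the other dependencies on sys.path) can be checked empirically. A minimal sketch using only the standard library follows; `locate_module` is a hypothetical helper name, and it makes no assumption about how Spark itself arranges the path. It only reports what the running interpreter sees.

```python
import importlib.util
import sys
from pathlib import Path
from typing import Optional, Tuple


def locate_module(name: str) -> Optional[Tuple[str, int]]:
    """Return (file the module would load from, index of the sys.path entry
    that contains it), or None if the module cannot be found on disk."""
    spec = importlib.util.find_spec(name)
    if spec is None or not spec.has_location:
        return None
    origin = Path(spec.origin).resolve()
    for index, entry in enumerate(sys.path):
        # An empty sys.path entry means the current working directory.
        root = Path(entry or ".").resolve()
        if root in origin.parents:
            return str(origin), index
    # Found, but not via a plain sys.path entry (for example a custom importer).
    return str(origin), -1


if __name__ == "__main__":
    for module_name in ("pyspark", "sentry_sdk"):
        print(module_name, locate_module(module_name))
```

Running this once in the test image and once inside a Spark job would show whether the two environments resolve pyspark from the same place; a lower index means the entry is searched earlier.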
14:12 PM
yvanzo

reosarevok: thanks, replied!
14:35 PM
sumedh joined the channel
14:58 PM
sumedh has quit
15:12 PM
D4RK has quit
15:30 PM
sumedh joined the channel
15:31 PM
_lucifer

Mr_Monkey: ping
15:31 PM
Mr_Monkey

Hai !
15:31 PM
_lucifer

https://docs.google.com/document/d/1Lss39cQYTrA...
15:31 PM
I saw you added some mockups here
15:31 PM
which one should we go ahead with?
15:33 PM
Mr_Monkey

Which one do you think looks best and clearest?
15:33 PM
I'm not a fan of option #1, a bit simple.
15:33 PM
_lucifer

the last one super nice :D
15:34 PM
Mr_Monkey

Let's go with that then :)
15:34 PM
_lucifer

second one is also nice
15:34 PM
Mr_Monkey

We can always revisit the design after an initial implementation
15:35 PM
_lucifer

sure, do you have this implemented or is it to be done?
15:38 PM
Mr_Monkey

No, I don't have any code for it I'm afraid
15:39 PM
That day I got lazy and did the page designs directly in an open browser page…
15:40 PM
_lucifer

no worries, i'll modify the spotify page for now and try to put the youtube player and oauth up today for testing.
15:44 PM
ruaok: i think that the redirect uris in syswiki might not work. for my local setup, i had to add the complete url including the path.
15:45 PM
regarding the youtube api key
15:46 PM
ruaok

update syswiki, plz
15:47 PM
_lucifer

we'll have to update in google api console first.
15:48 PM
the console will give a new configuration file after that.
15:50 PM
ruaok

ok, logged in. what changes need to be made?
15:52 PM
_lucifer

http://localhost/profile/music-services/youtube... this is what i am using for my local api key
15:52 PM
if we are fine with the url structure then, adding `/profile/music-services/youtube/callback/` at the end of all urls should be enough
15:53 PM
ruaok

including trailing slash?
15:53 PM
_lucifer

yes
15:53 PM
ruaok

https://usercontent.irccloud-cdn.com/file/0G3hy...
15:54 PM
_lucifer

perfect, thanks!
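The redirect URI change discussed above amounts to appending the full callback path, trailing slash included, to each base URL. A small sketch of that, assuming the URL structure quoted in the chat; the second host in the usage note is a placeholder and not one of the project's real hosts.

```python
from urllib.parse import urlsplit, urlunsplit

# Path quoted in the discussion; the trailing slash is part of the URI.
CALLBACK_PATH = "/profile/music-services/youtube/callback/"


def redirect_uri(base_url: str, callback_path: str = CALLBACK_PATH) -> str:
    """Build the complete redirect URI for a site root such as http://localhost."""
    parts = urlsplit(base_url)
    # Keep scheme and host, replace any existing path, drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, callback_path, "", ""))


if __name__ == "__main__":
    for base in ("http://localhost", "https://example.org/"):
        print(redirect_uri(base))
```

OAuth providers generally compare redirect URIs as exact strings, which would explain why the bare host was not enough in the local setup.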
15:57 PM
we'll need to add some redirect uris for spotify as well to test on beta, but it's not needed urgently; that can be done tomorrow.
15:57 PM
kyledecot joined the channel
15:57 PM
kyledecot

Cross-posting this from #musicbrainz
15:57 PM
Hello everyone! I'm building a website that will allow guitarists to upload Guitar Pro files–as part of this I want to use MusicBrainz to lookup additional metadata about the artist such as their canonical name, aliases, genres, etc. I've looked over the schema for the DB but I'm still a bit unsure how to traverse it to get the information I want. My thinking was that I would search for the "Work" by name and then get to the Artist from that. Using "Toxicity" as an example however you can see that the relationships do not include "System of a Down" https://beta.musicbrainz.org/work/ab2a78d4-b0ce-391e-bee5-680d69807bf7 It looks as though I would have to inspect the recordings to somehow derive the "canonical" recording/artist but I'm unsure if my assumptions are correct or if there is another way to go about this. Any help in regards to this would be greatly appreciated!
15:58 PM
ruaok

_lucifer: syswiki updated.
15:58 PM
_lucifer

thanks!
15:58 PM
ruaok

reosarevok: can you please help kyledecot?
16:09 PM
_lucifer

Mr_Monkey: in the mockups I see we do not show import details there. that's expected.
16:09 PM
s/expected/intended?
16:13 PM
nelgin

How can I determine if live indexing is turned on with the musicbrainz vm?
16:14 PM
and working
16:16 PM
_lucifer

if you followed https://github.com/metabrainz/musicbrainz-docke... live-indexing should probably be working, otherwise there is a bug
16:22 PM
ruaok

I got the MBID matcher running live against incoming listens:
16:22 PM
listens: exact 1785 high 20 med 154 low 55 no 873 err 0
16:22 PM
68% match rate. :)
16:23 PM
no trouble keeping up with incoming listens.
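For reference, the arithmetic behind the quoted rate can be reproduced from the counts in the log line. Which confidence buckets count as a match is not stated in the chat, so the sketch below tries both readings; `match_rate` is a hypothetical helper, not the matcher's own code.

```python
# Counts copied from the log line above.
counts = {"exact": 1785, "high": 20, "med": 154, "low": 55, "no": 873, "err": 0}


def match_rate(counts, matched_levels):
    """Fraction of all listens whose match level is in matched_levels."""
    total = sum(counts.values())
    matched = sum(counts[level] for level in matched_levels)
    return matched / total if total else 0.0


# Assumption: "low" matches are not counted as matched.
without_low = match_rate(counts, ("exact", "high", "med"))
# Alternative reading: every level except "no" and "err" counts.
with_low = match_rate(counts, ("exact", "high", "med", "low"))

print(f"without low: {without_low:.1%}, with low: {with_low:.1%}")
```

Only the first reading rounds to the 68% mentioned, which suggests low-confidence matches were left out of that figure, though the counts may also simply have been taken at a slightly different moment.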
16:23 PM
reosarevok

kyledecot: hi! Give me a moment if that's ok? I'm making dinner :)
16:23 PM
ruaok

on thursday I'll add support for working through old listens.
16:23 PM
nelgin

Well, my system spent from 2:17am to 4:06pm UTC loading the indexes and it's still not finding any results.
16:23 PM
reosarevok

kyledecot: oh, I see you're getting help in #musicbrainz, will check for a mo
16:24 PM
yvanzo: can you help nelgin ?
16:24 PM
kyledecot: oh, no, I'm being dumb. Sorry. Ok, I'll quickly look, food is in the oven anyway
16:24 PM
So
16:24 PM
Yes, you can't trivially get from the work to a "canonical artist", because we don't really have those
16:25 PM
(for example, many works don't really have those, such as classical, folk and even some jazz music)
16:26 PM
For what you want, the most common case is going to be "find a recording, and pick the artist". But do you have any artist name to begin with?
16:28 PM
Because if all you have is, say, the word "Dumb", there's no good way to know if you should be picking Nirvana or Garbage (or someone else)
16:28 PM
kyledecot

reosarevok Yeah I have the artist name to begin with (it's encoded into the Guitar Pro file). I thought I should start at the "work" to also get the canonical title (in case they submit something like "toXiciTY" or something).
16:29 PM
reosarevok

Well, the problem is sometimes we won't even have works, especially for popular music
16:29 PM
Those are not added automatically, they're added by users mostly when they need to either link covers or add a composer / lyricist
16:30 PM
(not guitar music but): https://musicbrainz.org/ws/2/recording?query=ar...
16:30 PM
kyledecot

Oh I see–so would it be safer to just try and look up the artist directly and then attempt to get the work separately (basically split this into two problems)?
16:30 PM
reosarevok

I would suggest trying to find a recording using title + artist
16:31 PM
And then you can check whether the recording is linked to any works if you want info from them :)
16:32 PM
(if it is, you could use the work title as the standard title, while if not, you could just default to the recording title, which might already be an improvement - for example, a search for toXiciTY would probably give you a recording named Toxicity anyway :) )
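The approach suggested above (search recordings by title plus artist, then prefer a linked work's title if there is one) could be sketched roughly as follows. This only builds the web service URLs and post-processes a response; it performs no network calls. The `recording` and `artist` search fields and the `work-rels` include are part of the MusicBrainz web service, but the function names and the trimmed sample response are illustrative assumptions.

```python
from urllib.parse import urlencode

WS_ROOT = "https://musicbrainz.org/ws/2"


def _quoted(term: str) -> str:
    """Wrap a search term in double quotes, escaping backslashes and quotes."""
    return '"' + term.replace("\\", "\\\\").replace('"', '\\"') + '"'


def recording_search_url(title: str, artist: str, limit: int = 5) -> str:
    """URL for a recording search restricted by title and artist name."""
    query = f"recording:{_quoted(title)} AND artist:{_quoted(artist)}"
    return f"{WS_ROOT}/recording?" + urlencode(
        {"query": query, "fmt": "json", "limit": limit}
    )


def recording_lookup_url(mbid: str) -> str:
    """URL for one recording, including its relationships to works."""
    return f"{WS_ROOT}/recording/{mbid}?" + urlencode(
        {"inc": "work-rels", "fmt": "json"}
    )


def standard_title(recording: dict) -> str:
    """Prefer the title of a linked work; fall back to the recording title."""
    for relation in recording.get("relations", []):
        if relation.get("target-type") == "work" and "work" in relation:
            return relation["work"]["title"]
    return recording["title"]


if __name__ == "__main__":
    print(recording_search_url("toXiciTY", "System of a Down"))
    # Trimmed, hand-written stand-in for a lookup response.
    sample = {
        "title": "Toxicity",
        "relations": [{"target-type": "work", "work": {"title": "Toxicity"}}],
    }
    print(standard_title(sample))
```

A real client would also need to send a descriptive User-Agent header and respect the service's rate limits.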
16:34 PM
nelgin

I rebooted my vm and restarted the docker and now it appears to be working.
16:37 PM
kyledecot

reosarevok that all makes sense–my main blocker was just knowing which resource / entity to start with. I'll probably have additional questions as I continue to develop this but you've done a great job in giving me a direction to head in. Thanks for the quick response / building such an awesome / open platform!
16:42 PM
reosarevok

kyledecot: neat! Don't hesitate to come back and ask more
16:43 PM
Mr_Monkey

_lucifer: missing import details is not intended, no. We'll have to find some space for it
16:44 PM
_lucifer

Mr_Monkey: should we put it on another page?
16:45 PM
Mr_Monkey

I think it makes sense to keep it with each music service in that page. Maybe an "Import details" accordion hidden by default?
16:46 PM
_lucifer

yeah that could work
16:53 PM
yvanzo

nelgin: Solr works asynchronously. It takes extra time for search indexes to be readily available.
16:55 PM
I’m working on improving stuff around BASE_FTP_URL variable you mentioned yesterday.
16:57 PM
When live indexing is working, reindex messages are queued. So you can check either logs of 'mq' or 'search' services.
16:59 PM
kyledecot has quit
17:00 PM
'indexer' logs can also be useful: it indicates when reindex messages are processed with timestamp.
17:00 PM
sudo docker-compose logs --tail 10 --timestamps indexer
17:08 PM
nelgin

yvanzo ah, did I screw things up?
17:08 PM
indexer_1 | 2021-04-27T16:25:02.607307633Z This probably means the server terminated abnormally
17:08 PM
indexer_1 | 2021-04-27T16:25:02.607310064Z before or while processing the request.
17:08 PM
indexer_1 | 2021-04-27T16:25:02.607315104Z [SQL: 'SELECT recording_1.id AS recording_1_id \nFROM musicbrainz.recording AS recording_1 JOIN musicbrainz.artist_credit ON musicbrainz.artist_credit.id = recording_1.artist_credit \nWHERE musicbrainz.artist_credit.id = %(id_1)s'] [parameters: {'id_1': 15856}]
17:09 PM
Hm, if I go back further
17:09 PM
indexer_1 | 2021-04-27T16:24:40.431919098Z 2021-04-27 16:24:40,431: Requeuing 100 pending messages.
17:09 PM
indexer_1 | 2021-04-27T16:24:40.441542857Z 2021-04-27 16:24:40,441: 100 messages requeued.
17:09 PM
indexer_1 | 2021-04-27T16:24:57.994325142Z 2021-04-27 16:24:57,993: Error encountered while processing messages: Post to Solr failed. Requeueing all pending messages for retry.
17:10 PM
I'm going to pastebin this entire log - there's all sorts of stuff in here.
17:11 PM
yvanzo

Please pastebin the output of this command too: sudo docker-compose exec mq rabbitmqadmin -u sir -p sir -V /search-index-rebuilder list queues
17:16 PM
nelgin

delete, failed, index, and retry all 0
17:17 PM
okno79 joined the channel
17:18 PM
Too big to pastebin so it's here on my server https://wibble.sysadmininc.com/log.txt
17:19 PM
I tried to download and import the indexes twice but had no job per previous comments so ended up just rebuilding them.
17:22 PM
yvanzo

Never seen "this IndexWriter is closed" message before.
17:22 PM
It seems Solr might have run out of resources.
17:25 PM
nelgin shrugs
17:30 PM
Next time your replication cron task is running, there should be new messages in 'indexer' logs again.
17:33 PM
nelgin

I can run it now if you like, it won't be for another 14 hours otherwise
17:34 PM
Well, I should say I can run it manually
17:36 PM
yvanzo

Okay
17:37 PM
nelgin

Ok, replication running
17:39 PM
search.index | 1722
17:39 PM
indexer_1 | 2021-04-27T17:37:50.642120375Z 2021-04-27 17:37:50,641: Successfully processed 100 messages
17:39 PM
So it looks like it's working
17:40 PM
Though I just got this in my replication log
17:40 PM
WARNING: amqp could not commit tx mode on broker 1, reply_type=2, library_errno=4
17:41 PM
CatQuest

:|
17:46 PM
MRiddickW has quit
17:59 PM
yvanzo

Messages are queued and processed, so it’s working indeed.
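The manual check done above, reading the 'indexer' logs for processed and failed batches, could be scripted. A minimal sketch, assuming the log lines keep the wording shown in this conversation; `summarize` is a hypothetical helper and the two sample lines are copied from the pasted output.

```python
import re
from collections import Counter

# Message wording taken from the indexer log lines pasted in the chat.
PROCESSED = re.compile(r"Successfully processed (\d+) messages")
FAILED = "Error encountered while processing messages"


def summarize(lines):
    """Count processed messages and failed batches in indexer log output."""
    summary = Counter()
    for line in lines:
        match = PROCESSED.search(line)
        if match:
            summary["processed"] += int(match.group(1))
        elif FAILED in line:
            summary["errors"] += 1
    return summary


sample_lines = [
    "indexer_1 | 2021-04-27T16:24:57.994325142Z 2021-04-27 16:24:57,993: "
    "Error encountered while processing messages: Post to Solr failed. "
    "Requeueing all pending messages for retry.",
    "indexer_1 | 2021-04-27T17:37:50.642120375Z 2021-04-27 17:37:50,641: "
    "Successfully processed 100 messages",
]
summary = summarize(sample_lines)
print(dict(summary))
```

In practice the lines could be piped in from the `docker-compose logs --timestamps indexer` command mentioned earlier and read from standard input.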
18:00 PM
The warning is about https://rabbitmq-c.docsforge.com/master/api/amq... but that requires more investigation.
18:03 PM
nelgin

It seems to happen a few seconds after I run sudo docker-compose exec mq rabbitmqadmin -u sir -p sir -V /search-index-rebuilder list queues
18:03 PM
I just tried it again and got the same thing.
18:06 PM
Etua joined the channel
18:12 PM
Etua has quit
18:14 PM
yvanzo

Ok, just ignore this warning then.