#metabrainz

/

14:51 PM
shivam-kapila

https://github.com/metabrainz/listenbrainz-server…

2020-10-16 29007, 2020

14:52 PM
ruaok

nope. when I click on the track that was skipped it is fully playable.

2020-10-16 29023, 2020

14:52 PM
shivam-kapila

oh

2020-10-16 29026, 2020

14:54 PM
adhawkins_ has quit

2020-10-16 29038, 2020

15:20 PM
shivam-kapila

Mr_Monkey: how to import the lobes file you linked

2020-10-16 29008, 2020

15:21 PM
Mr_Monkey

It's already imported, you should just be able to use the variable in your less file

2020-10-16 29022, 2020

15:21 PM
shivam-kapila

`NameError: variable @listenbrainz is undefined`

2020-10-16 29049, 2020

15:21 PM
Mr_Monkey

Hm.

2020-10-16 29008, 2020

15:24 PM
Mr_Monkey

Maybe something like `@import "./path/to/lobes/less/theme.less";`

2020-10-16 29025, 2020

15:24 PM
Mr_Monkey

Not sure where lobes is in LB

2020-10-16 29024, 2020

15:26 PM
Mr_Monkey

I think `@import "./theme/theme.less";`

2020-10-16 29040, 2020

15:26 PM
Mr_Monkey

But… I don't see the @listenbrainz variable anywhere, oddly.

2020-10-16 29050, 2020

15:26 PM
Mr_Monkey

Not sure what the deal is

2020-10-16 29053, 2020

15:26 PM
shivam-kapila

maybe lobes isnt in lb

2020-10-16 29008, 2020

15:28 PM
shivam-kapila

http://livegrep.metabrainz.org/search/livegrep?q=…

2020-10-16 29014, 2020

15:28 PM
shivam-kapila

nada

2020-10-16 29050, 2020

15:28 PM
Mr_Monkey

Hm. OK. Then ignore my remarks :)

2020-10-16 29013, 2020

15:29 PM
Mr_Monkey

We'll probably want to refactor that at some point to avoid having colors defined in multiple places.

2020-10-16 29043, 2020

15:29 PM
shivam-kapila

colors.less?

2020-10-16 29005, 2020

15:31 PM
Mr_Monkey

Something like that yes

2020-10-16 29015, 2020

15:31 PM
Mr_Monkey

Imported at the very top of main.less

2020-10-16 29020, 2020

15:31 PM
shivam-kapila

hm

2020-10-16 29028, 2020

15:31 PM
shivam-kapila

I will make it in next PR

2020-10-16 29014, 2020

15:39 PM
yvanzo

_lucifer: I split the search bug report since not all issues can be addressed at once, can you please make your PRs/commits match the new tickets?

2020-10-16 29024, 2020

15:39 PM
_lucifer

will do yvanzo

2020-10-16 29002, 2020

16:12 PM
_lucifer

yvanzo: saw your comment about gender-id. what would be the process to add that to indexing and are there any deployment concerns around that ?

2020-10-16 29049, 2020

16:13 PM
_lucifer

alastairp: ping

2020-10-16 29059, 2020

16:13 PM
alastairp

hey

2020-10-16 29001, 2020

16:14 PM
alastairp

5 minutes?

2020-10-16 29006, 2020

16:14 PM
_lucifer

sure!

2020-10-16 29018, 2020

16:15 PM
yvanzo

_lucifer: there is no deployment concern afaik but I just cannot test integration without changes to the indexer.

2020-10-16 29018, 2020

16:19 PM
_lucifer

ah ok! i saw you already assigned that to yourself. thanks!

2020-10-16 29014, 2020

16:22 PM
alastairp

_lucifer: I'm here

2020-10-16 29024, 2020

16:22 PM
_lucifer

hi!

2020-10-16 29019, 2020

16:23 PM
alastairp

what were we talking about? stats and graphs on AB?

2020-10-16 29026, 2020

16:23 PM
_lucifer

yes!

2020-10-16 29050, 2020

16:23 PM
alastairp

cool

2020-10-16 29053, 2020

16:23 PM
alastairp

let me pull up some stuff

2020-10-16 29009, 2020

16:24 PM
alastairp

I think I showed you this, right? https://github.com/MTG/acousticbrainz-labs/tree/m…

2020-10-16 29026, 2020

16:24 PM
_lucifer

yes, right

2020-10-16 29057, 2020

16:25 PM
alastairp

so one thing that we're trying to do is make the site look interesting

2020-10-16 29030, 2020

16:26 PM
alastairp

personally, I'd love to see this data update in real-time

2020-10-16 29008, 2020

16:27 PM
alastairp

so the question to answer is to work out what graphs describe the data in the most interesting way

2020-10-16 29022, 2020

16:27 PM
_lucifer

interesting question

2020-10-16 29052, 2020

16:27 PM
alastairp

you'll see things like https://github.com/MTG/acousticbrainz-labs/blob/m… are pretty terrible

2020-10-16 29002, 2020

16:28 PM
_lucifer

I think this one is interesting https://github.com/MTG/acousticbrainz-labs/blob/m…

2020-10-16 29002, 2020

16:28 PM
alastairp

we don't want to show most of these, because they don't make any sense

2020-10-16 29013, 2020

16:28 PM
alastairp

yeah, I like the features/year and features/genre one

2020-10-16 29005, 2020

16:29 PM
alastairp

it's a lot better to do more objective graphs - features and years are pretty good

2020-10-16 29030, 2020

16:29 PM
alastairp

whereas if we start showing genre graphs and say "all music falls into one of these 8 categories", I'm sure that people will start complaining :)

2020-10-16 29045, 2020

16:29 PM
_lucifer

right, makes sense

2020-10-16 29007, 2020

16:30 PM
_lucifer

we currently do not have a pipeline to create these graphs right?

2020-10-16 29018, 2020

16:30 PM
alastairp

no

2020-10-16 29058, 2020

16:30 PM
alastairp

well, we have the code used in these notebooks to generate the graphs

2020-10-16 29014, 2020

16:31 PM
_lucifer

right, we need to integrate these with ab database

2020-10-16 29044, 2020

16:31 PM
alastairp

however, we also have this: https://github.com/metabrainz/acousticbrainz-serv…

2020-10-16 29057, 2020

16:31 PM
_lucifer

by real time you mean like updating whenever a recording is submitted ?

2020-10-16 29010, 2020

16:32 PM
alastairp

the problem with integration into the database is that it's too slow to query all of the data, even if we did it periodically

2020-10-16 29022, 2020

16:32 PM
alastairp

right, perhaps not that often, but say once a week

2020-10-16 29036, 2020

16:32 PM
_lucifer

that's doable

2020-10-16 29057, 2020

16:32 PM
alastairp

we have much of this information in the `similarity.similarity` table

2020-10-16 29004, 2020

16:33 PM
alastairp

and it's much smaller than the lowlevel table

2020-10-16 29034, 2020

16:33 PM
_lucifer

that's nice!

2020-10-16 29037, 2020

16:33 PM
alastairp

so perhaps we could have a periodic task that we run that summarises this table

2020-10-16 29005, 2020

16:34 PM
alastairp

if not, we could definitely also create another statistics table, although there is a question about what data we should add there

2020-10-16 29026, 2020

16:34 PM
alastairp

we could make some initial tables, and load data, and then if we need more data for more graphs, we add those at a later stage

2020-10-16 29041, 2020

16:34 PM
alastairp

for example - the similarity table doesn't have years, so we'd have to get that separately

2020-10-16 29014, 2020

16:35 PM
alastairp

then say for example we wanted to compare year to loudness, we'd need some kind of table that allowed us to join this info together

2020-10-16 29029, 2020

16:35 PM
alastairp

the genre or mood tables are much easier, because we just need categories and counts

2020-10-16 29024, 2020
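
A minimal sketch of the year-to-loudness join alastairp describes, using an in-memory SQLite database as a stand-in for PostgreSQL. The table names `features` and `release_year`, their columns, and the sample values are all hypothetical, chosen only for illustration; the real AcousticBrainz tables differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical stand-ins: real AcousticBrainz tables and columns differ.
    CREATE TABLE features (mbid TEXT PRIMARY KEY, loudness REAL);
    CREATE TABLE release_year (mbid TEXT PRIMARY KEY, year INTEGER);
    INSERT INTO features VALUES ('a', 0.5), ('b', 0.75), ('c', 1.0), ('d', 0.25);
    INSERT INTO release_year VALUES ('a', 1990), ('b', 1990), ('c', 2005);
""")


def loudness_by_year(conn):
    """Return (year, recording count, mean loudness) rows, oldest year first."""
    # The inner join drops recordings with no known year ('d' in the sample data).
    return conn.execute("""
        SELECT y.year, COUNT(*), AVG(f.loudness)
        FROM features AS f
        JOIN release_year AS y ON y.mbid = f.mbid
        GROUP BY y.year
        ORDER BY y.year
    """).fetchall()


rows = loudness_by_year(conn)
print(rows)
```

A graph only needs the handful of aggregated rows, so a result like this could be stored in a small summary table rather than recomputed on every page load.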

16:36 PM
_lucifer

+1

2020-10-16 29045, 2020

16:36 PM
alastairp

OK, so

2020-10-16 29050, 2020

16:36 PM
alastairp

let's focus on the following charts:

2020-10-16 29043, 2020

16:37 PM
alastairp

genre rosamerica, feature/genre, feature/year, key estimation, genre mood (at the end of mood)

2020-10-16 29037, 2020

16:38 PM
_lucifer

awesome!

2020-10-16 29038, 2020

16:38 PM
alastairp

on bono, you can `psql -U acousticbrainz acousticbrainz_big` (not inside docker)

2020-10-16 29004, 2020

16:39 PM
alastairp

that has a full lowlevel table, and full `similarity.similarity` table

2020-10-16 29051, 2020

16:39 PM
alastairp

I think we should create a new postgresql schema (call it `statistics`), and for each graph, make a new table in this schema that stores just the information that we need for that graph

2020-10-16 29018, 2020
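
One way the per-graph table idea might look, sketched with SQLite standing in for PostgreSQL. SQLite has no `CREATE SCHEMA`, so an attached database named `statistics` is used to get the same `statistics.<table>` naming. The source table `similarity_sample`, its `key_estimate` column, and the summary table `statistics.key_estimation` are assumptions made for this example, not the actual layout of `similarity.similarity`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite has no CREATE SCHEMA; attaching a second database named `statistics`
# gives the same `statistics.<table>` naming a PostgreSQL schema would.
conn.execute("ATTACH DATABASE ':memory:' AS statistics")
conn.executescript("""
    -- Hypothetical source data; the real table and columns differ.
    CREATE TABLE similarity_sample (mbid TEXT, key_estimate TEXT);
    INSERT INTO similarity_sample VALUES
        ('a', 'C major'), ('b', 'A minor'), ('c', 'C major');
    -- One small table per graph, holding only what that graph needs.
    CREATE TABLE statistics.key_estimation (
        category TEXT PRIMARY KEY,
        n INTEGER NOT NULL
    );
""")


def refresh_key_estimation(conn):
    """Rebuild the key estimation summary from the source table."""
    # DELETE and INSERT run in a single transaction, committed when the
    # `with` block exits, so readers never see a half-filled table.
    with conn:
        conn.execute("DELETE FROM statistics.key_estimation")
        conn.execute("""
            INSERT INTO statistics.key_estimation (category, n)
            SELECT key_estimate, COUNT(*)
            FROM similarity_sample
            GROUP BY key_estimate
        """)


refresh_key_estimation(conn)
refresh_key_estimation(conn)  # rerunning is safe, e.g. from a weekly job
counts = dict(conn.execute("SELECT category, n FROM statistics.key_estimation"))
print(counts)
```

Rebuilding the whole table on each run keeps the periodic task simple; updating counts incrementally as lowlevel data arrives would be the alternative mentioned in the discussion.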

16:40 PM
_lucifer

yeah that's a great place to start

2020-10-16 29027, 2020

16:40 PM
alastairp

then we can see how easy it is to 1) get the data from similarity.similarity, or 2) get the data from lowlevel as the data comes in

2020-10-16 29011, 2020

16:41 PM
_lucifer

okay, will we need to use spark?

2020-10-16 29021, 2020

16:41 PM
alastairp

I don't think so

2020-10-16 29038, 2020

16:41 PM
alastairp

this isn't really analysis, it's just loading and transforming data

2020-10-16 29003, 2020

16:42 PM
alastairp

if we wanted to use it, we'd have to load all of the necessary data into hdfs, which I suspect would be really annoying

2020-10-16 29016, 2020

16:42 PM
_lucifer

okay, yeah right. the similarity table is smaller and we can probably process it directly

2020-10-16 29023, 2020

16:42 PM
alastairp

that's what I'm hoping

2020-10-16 29046, 2020

16:43 PM
_lucifer

we can use spark without hdfs but that's a thing to consider for afterwards

2020-10-16 29026, 2020

16:44 PM
alastairp

oh? how would that work?

2020-10-16 29001, 2020

16:45 PM
alastairp

in some cases it might make sense to use spark for machine learning in AB, we should look into it as a future option

2020-10-16 29023, 2020

16:45 PM
_lucifer

> Access data in HDFS, Alluxio, Apache Cassandra, Apache HBase, Apache Hive, and hundreds of other data sources.

2020-10-16 29032, 2020

16:45 PM
_lucifer

Spark home page says this

2020-10-16 29005, 2020

16:46 PM
_lucifer

i had also read an article on the same but cannot find it right now

2020-10-16 29039, 2020

16:46 PM
_lucifer

https://spark.apache.org/docs/latest/sql-data-sou…

2020-10-16 29044, 2020

16:46 PM
_lucifer

this one sums it up

2020-10-16 29015, 2020

16:47 PM
_lucifer

PostgreSQL provides a JDBC plugin that allows spark to connect to it directly

2020-10-16 29032, 2020
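
A rough sketch of what reading PostgreSQL from Spark without HDFS could look like, using Spark's generic JDBC data source. The host, port, and password are placeholders; the database and user names are taken from the `psql` command earlier in the log. The helper names `postgres_jdbc_options` and `read_table` are invented for this example, and `read_table` assumes a running `SparkSession` with the PostgreSQL JDBC driver jar on its classpath.

```python
def postgres_jdbc_options(host, port, database, table, user, password):
    """Build the option map for Spark's generic JDBC data source."""
    return {
        "url": f"jdbc:postgresql://{host}:{port}/{database}",
        "dbtable": table,
        "user": user,
        "password": password,
        "driver": "org.postgresql.Driver",
    }


def read_table(spark, options):
    """Load one PostgreSQL table as a DataFrame; `spark` is a SparkSession."""
    # Requires the PostgreSQL JDBC driver jar on Spark's classpath.
    return spark.read.format("jdbc").options(**options).load()


# Placeholder connection details, for illustration only.
opts = postgres_jdbc_options(
    "localhost", 5432, "acousticbrainz_big",
    "similarity.similarity", "acousticbrainz", "secret",
)
print(opts["url"])
```

With a session available, `read_table(spark, opts)` would return a DataFrame backed directly by the PostgreSQL table, with no HDFS import step.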

16:49 PM
BrainzGit

[mb-solr] yvanzo merged pull request #39 (master…SEARCH-611): SEARCH-628: 'primary-type-id' field is missing from JSON release group search results https://github.com/metabrainz/mb-solr/pull/39

2020-10-16 29034, 2020

16:49 PM
BrainzBot

SEARCH-611: Incorrect content in JSON version of release group search result https://tickets.metabrainz.org/browse/SEARCH-611

2020-10-16 29034, 2020

16:49 PM
BrainzBot

SEARCH-628: 'primary-type-id' field is missing from JSON release group search results https://tickets.metabrainz.org/browse/SEARCH-628

2020-10-16 29001, 2020

16:50 PM
_lucifer

alastairp: by the way, why do we use HDFS ?

2020-10-16 29006, 2020

16:50 PM
yvanzo

_lucifer: can you please update #43 too?

2020-10-16 29038, 2020

16:50 PM
_lucifer

yes, yvanzo i am just testing it locally and will push the changes soon

2020-10-16 29054, 2020

16:50 PM
yvanzo

btw, status-id change requires indexer changes too. I will update ticket accordingly and work on sir.

2020-10-16 29050, 2020

16:51 PM
_lucifer

oh ok! thanks

2020-10-16 29006, 2020

16:52 PM
yvanzo

I’m reviewing oxml cleanup and Java 11 preps next :)

2020-10-16 29023, 2020

16:52 PM
_lucifer

Great! :D

2020-10-16 29015, 2020

16:56 PM
yvanzo

_lucifer: did you use specific commands for #42?

2020-10-16 29049, 2020

16:56 PM
_lucifer

yvanzo: no, why?

2020-10-16 29040, 2020

16:57 PM
yvanzo

It could have helped with rebasing.

2020-10-16 29054, 2020

16:57 PM
yvanzo

"Auto format" sounds like something automated though :)

2020-10-16 29052, 2020

16:58 PM
_lucifer

oh! that, i just had my ide format that file to 4 space indent

2020-10-16 29016, 2020

16:59 PM
_lucifer

rest is poor choice of words 😅

2020-10-16 29012, 2020

17:00 PM
_lucifer

can you advise how I could have made rebasing easier?

2020-10-16 29059, 2020

17:00 PM
yvanzo

If that was a command in the commit message, one would just have to run it again.

2020-10-16 29044, 2020

17:03 PM
_lucifer

oh! makes sense. that i can do with the ide again. i'll rebase and drop the existing commit.

2020-10-16 29023, 2020

17:04 PM
_lucifer

the clustering is the one that will have to be done manually and take some time

2020-10-16 29005, 2020

17:11 PM
yvanzo

Java 11 PR looks good overall, but there are a few deployment concerns, will not merge for the upcoming release.

2020-10-16 29032, 2020

17:11 PM
_lucifer

👍

2020-10-16 29048, 2020

17:13 PM
yvanzo

Can you also remove unrelated commits from #37? (since they have been copied to separate PRs)

2020-10-16 29023, 2020

17:14 PM
_lucifer

yeah sure

2020-10-16 29034, 2020

17:16 PM
ruaok

it was bound to happen. my own code (and pristine___s) rickrolled me.

2020-10-16 29038, 2020

17:16 PM
ruaok

https://usercontent.irccloud-cdn.com/file/7YL4hdB…

2020-10-16 29043, 2020

17:18 PM
_lucifer

lol 😂😂

2020-10-16 29006, 2020

17:19 PM
shivam-kapila

ruaok: welcome to kapilaland

2020-10-16 29031, 2020

17:19 PM
JoshDi joined the channel

2020-10-16 29033, 2020

17:19 PM
ruaok

I actually think this is an evil plot by pristine___ to get back at me

2020-10-16 29029, 2020

17:20 PM
JoshDi

Hey quick question. I currently run a musicbrainz slave server via the docker image. Is there a way to turn off indexing completely so all local queries go directly to the database?

2020-10-16 29043, 2020

17:20 PM
shivam-kapila

ruaok: save yourself

2020-10-16 29035, 2020

17:21 PM
ruaok

K says: "Never gonna give listenbrainz up, never gonna let listenbrainz down, never gonna turn around and hurt listenbrainz!"

2020-10-16 29057, 2020

17:21 PM
ruaok

yvanzo: ^^ see JoshDi's query

2020-10-16 29003, 2020

17:22 PM
ruaok waves at JoshDi

2020-10-16 29007, 2020

17:22 PM
JoshDi

Hey

2020-10-16 29002, 2020

17:23 PM
JoshDi

I find that even with SIR tweaks, live indexing of the daily updates on the slave takes like 12 hours to finish, whereas a full reindex takes 3 hrs

2020-10-16 29024, 2020

17:23 PM
shivam-kapila

ruaok: I introduced a lil of troi to ppl

2020-10-16 29036, 2020

17:23 PM
shivam-kapila

And they were like damn. Dynamic playlists

2020-10-16 29037, 2020

17:23 PM
JoshDi

I only use this server for some local processes so its not like my musicbrainz server is very busy.

2020-10-16 29054, 2020

17:23 PM
shivam-kapila

They felt really excited

2020-10-16 29045, 2020

17:24 PM
JoshDi

Any ideas?

2020-10-16 29023, 2020

17:25 PM
ruaok

I dont know, but yvanzo will. hang tight for him to return and he'll sort you out. (he is around)

2020-10-16 29010, 2020

17:26 PM
JoshDi

[solr]
uri = http://search:8983/solr
batch_size = 5000
[sir]
import_threads = 10
index_limit = 2000000
live_index_batch_size = 5000
process_delay = 15
query_batch_size = 20000
wscompat = on
prefetch_count = 2000

2020-10-16 29046, 2020

17:26 PM
JoshDi

Im now running this on a machine with 128gb of ram and 40 threads at 3.1ghz.... so it should be much faster

2020-10-16 29018, 2020

17:28 PM
shivam-kapila

ruaok: can I ask an off topic doubt

2020-10-16 29034, 2020

17:28 PM
ruaok

are you giving the JVMs RAM to use? by default they may not be using enough, making things slower

2020-10-16 29049, 2020

17:28 PM
ruaok

shivam-kapila: why do you keep asking if you can ask a question?

2020-10-16 29053, 2020

17:29 PM
JoshDi13 joined the channel

2020-10-16 29009, 2020

17:30 PM
shivam-kapila

Anyways do we have ryzen processors in prod?

2020-10-16 29026, 2020

17:30 PM
JoshDi13

my memory settings are: shm_size: 4g and SOLR_HEAP=4g

2020-10-16 29044, 2020

17:30 PM
ruaok

shivam-kapila: yes

2020-10-16 29053, 2020

17:30 PM
ruaok

JoshDi13: why not try 8g and see what happens?

2020-10-16 29056, 2020

17:30 PM
JoshDi has left the channel

2020-10-16 29023, 2020

17:31 PM
JoshDi13 is now known as JoshDi

2020-10-16 29027, 2020

17:31 PM
JoshDi

postgres -c "shared_buffers=4GB" -c "work_mem=128MB" -c "shared_preload_libraries=pg_amqp.so"