aerozol: atj_mb akshaaatt : these are the human tower builders I was talking about
reosarevok
mayhem: can you change the name for a donation? (see support) Alternatively, if I have access to do that, can you show me how so I don't have to pester you? :)
bitmap
hola I’m aboard the aerobus, I’ll head to the office
mayhem
moin bitmap
not sure if anyone is at the office yet. you could head to the other airbnb and see if aerozol akshaaatt and atj_mb will let you take a shower there until people head to the office.
bitmap
Oh sounds good, then I’ll ping the three of them when I’m outside the Airbnb
mayhem
good plan. I'll be heading to the office within the hour.
reosarevok
mayhem: I understand the plan is to have zoom streaming for the 14:00 - 19:30 section each day?
(just figuring out when I should check in)
mayhem
I've not really been part of the streaming/zoom discussion. all I know is that I am lending my webcam to the effort.
reosarevok
Ok, sure, but I mean that that's the actual "meeting" part, and the rest is hacking and socialising, from the schedule, right? :)
Should be doable, if so :)
mayhem
as for tomorrow, we're going to have a sooper special project for the day. one we all need to get through, whether we like it or not. OAuth.
reosarevok
AOuch
mayhem
it would be good if you could be around for that and help the MB team
reosarevok
Let me know the estimated time and I'll do my best :)
mayhem
BCN summit-like daytime. you know what that means. :)
reosarevok: that donation has been updated
reosarevok
Thanks!
Mailed them back then
aerozol
Hey bitmap, sounds good, see you soon!
mayhem: those human tower photos are insane!!
mayhem
right?
and on university campuses you can see them practicing over lunch.
aerozol
The photos of one collapsing are terrifying. Would definitely watch though!
mayhem
which is far different from the computer sci geeks watching the aggies rope wooden bulls in the parking lot with a real rope.
which is what kept happening at my uni, lol
aerozol
Hah, a bit less exciting. Real rope though, ooh
bitmap
aerozol: akshaaatt: atj_mb: I’m outside Carrer de Nàpols, 98, I think
place with a bunch of scaffolding?
mayhem
yep
Press the Atico 3 button to wake them up. :)
bitmap
Button pressed!
yvanzo
Is there a universal power adapter (19V, 2.1A) I could borrow from BCNers? Or I will buy one for my laptop when I arrive in 6h from now (I know the best shops already).
mayhem
yvanzo: I don't have one of those, everything I have is now USB powered. :(
alastairp
yvanzo: what kind of plug does your laptop take?
I have a 20v 1.5A old thinkpad round connector
and a 20V 1.3A lenovo new-style thinkpad square connector
mayhem has arrived at the office
slow start here, but I'll turn up soonish
yvanzo
alastairp: round dc jack, tip pin size: 5.5mm * 2.5mm
(40w)
alastairp
sorry, looks like mine isn't that size. I'll bring it anyway in case you want to try
mayhem
lucifer: regarding what we want to store, I wonder if we should store the key things we really care about as columns, but still have a JSONB column for all the other fields.
and the question about markets is tricky: do we know that the cross linking they mentioned works as expected?
I think storing external links is also useful. stuff that we could mine...
lucifer
mayhem: i see, how about keeping everything as jsonb in the existing cache table but building normalized tables of only the stuff we need? so those normalized tables won't have any jsonb columns but the original table we have currently will.
uh let me write down a schema to clear it up.
alastairp
I'm on my way. will stop and pick up some tea. does anyone drink it/have preferences? akshaaatt aerozol bitmap?
lucifer
oh and about those external links, the album tracks endpoint only returns isrc/ean/upc for albums but not for individual tracks. the tracks lookup endpoint does.
so we'll probably have to do some more lookups to get all the external ids.
same for genres. those are not returned by all endpoints, only some.
mayhem: actually thinking more, yes makes sense to have just one JSONB column for extra things there.
i think we can add columns for the data we need in building the index, the rest all goes to the jsonb column for now. we can add more columns as we need them in future. usecases like reading external ids only need a read of the jsonb column without joins so should be fast regardless.
and we won't need to do them on a very regular basis.
there is the tmp_sp_metadata table on gaga that you could try it on.
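(A minimal sketch of the "key columns plus one JSON column" layout discussed above. The table name, column names and sample payload are hypothetical, and SQLite's TEXT column only stands in for PostgreSQL's JSONB so that the example runs without a database server. It is not the actual schema.)

```python
import json
import sqlite3

# Hypothetical table and column names; a TEXT column holding JSON stands in
# for a PostgreSQL JSONB column.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE album (
        spotify_id TEXT PRIMARY KEY,  -- key field kept as a real column
        name       TEXT NOT NULL,     -- key field kept as a real column
        extra      TEXT NOT NULL      -- every other field, stored as JSON
    )
    """
)

KEY_FIELDS = ("spotify_id", "name")


def store_album(conn, payload):
    """Split a payload into key columns plus a JSON blob of the rest."""
    extra = {k: v for k, v in payload.items() if k not in KEY_FIELDS}
    conn.execute(
        "INSERT INTO album (spotify_id, name, extra) VALUES (?, ?, ?)",
        (payload["spotify_id"], payload["name"], json.dumps(extra)),
    )


def external_ids(conn, spotify_id):
    """Read external ids from the JSON column alone, with no joins."""
    row = conn.execute(
        "SELECT extra FROM album WHERE spotify_id = ?", (spotify_id,)
    ).fetchone()
    return json.loads(row[0]).get("external_ids", {}) if row else {}


# Made-up sample payload, for illustration only.
store_album(conn, {
    "spotify_id": "abc123",
    "name": "Example Album",
    "external_ids": {"upc": "000000000000"},
    "genres": ["example"],
})
print(external_ids(conn, "abc123"))
```

Promoting another field to a real column later only needs a new column and a backfill from `extra`, which matches the "add more columns as we need them" idea.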
lucifer
i am not sure how spotify handles artist credits. for now, i have put a unique index on (spotify_id, name). we'll be able to detect issues but will have to rebuild in case we find some. i guess that's fine.
do you mean storing an array of artist names in the artist table with each id? or an array of artist names in the track or album table?
yvanzo
alastairp, mayhem: Thanks, issue resolved, I bought an adapter on the way.
reosarevok
mayhem: since you're apparently not on #musicbrainz, someone posted "helloooo, I just tested the recommendation endpoint and wanted to let you know I think results are greeeeat o/ its amazing :D thx a lot :) " :)
Oh. That'd be petitminion_
mayhem
Heh, lol.
lucifer
alastairp: hi! any progress on CB PRs?
alastairp
lucifer: hi! we've just finished lunch, lol
so that's a no
lucifer
ah nice! :D
np
Pratha-Fish
alastairp: h e n l o
I have a little update.
The script will be completed in no time, but it might be a lot slower than expected. To counter that, I am trying to rack my brain to optimize the cleanup functions. But if you could recommend any options, it would be pretty helpful.
Here's what I've tried:
- numpy.vectorize (slower than pandas for some reason lol)
- numba.vectorize, numba.jit -> apparently, can't even process a simple dictionary checking function :|
- Currently trying out Dask, modin for speedups
CC: lucifer
lucifer
what does the cleanup function do?
can you share its code?
alastairp
Pratha-Fish: yeah, let's take a look at the code first
mayhem
lucifer: I guess an array of (artist name, artist id) would be best.
Pratha-Fish
lucifer, alastairp: pushing the code to the repo..
alastairp
Pratha-Fish: my initial feedback on this is that it's almost certainly not as slow as you think
and if it does have a problem, it's probably a really simple fix
Pratha-Fish
alastairp: I really hope so
Currently, it's taking ~18 - 24s to process 105k rows (excluding r/w times)
alastairp
I bet we can get that 10x faster
Pratha-Fish
note that it's 105k non-unique rows. Maybe making it unique could help, but mapping the results back to the dataframe could be another bottleneck
alastairp: epic
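(A rough sketch of the "make it unique, then map back" idea, in plain Python. `clean_value` is a hypothetical stand-in for the real cleanup function, which has not been shared in the channel yet. The point is only that the expensive call runs once per distinct value and each row is then filled in with a dict lookup.)

```python
def clean_value(value):
    """Hypothetical stand-in for the real (expensive) cleanup function."""
    return value.strip().lower()


def clean_column(values, clean=clean_value):
    """Clean a column with many repeated values."""
    # Run the cleanup once per distinct value...
    mapping = {v: clean(v) for v in set(values)}
    # ...then map every row back with a cheap dict lookup.
    return [mapping[v] for v in values]


rows = ["  Foo", "bar ", "  Foo", "BAR", "bar "]
cleaned = clean_column(rows)
print(cleaned)
```

With pandas the same pattern would presumably be a mapping built from `df[col].unique()` and applied with `df[col].map(mapping)`, so mapping the results back should not need an explicit loop.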
lucifer
mayhem: that can be built during the join query to fetch the data imo.
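(A small sketch of building the (artist name, artist id) array at query time instead of storing it. All table and column names here are hypothetical, and SQLite is used only so the example is runnable. In PostgreSQL the grouping could likely be done inside the query itself with an aggregate such as `array_agg`.)

```python
import sqlite3
from collections import defaultdict

# Hypothetical normalized tables with made-up sample rows.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE artist (artist_id TEXT PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE track (track_id TEXT PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE track_artist (
        track_id  TEXT NOT NULL,
        artist_id TEXT NOT NULL,
        position  INTEGER NOT NULL
    );
    INSERT INTO artist VALUES ('a1', 'Artist One'), ('a2', 'Artist Two');
    INSERT INTO track VALUES ('t1', 'Example Track');
    INSERT INTO track_artist VALUES ('t1', 'a1', 0), ('t1', 'a2', 1);
    """
)


def tracks_with_artists(conn):
    """Return {track_id: [(artist name, artist id), ...]} in credit order."""
    rows = conn.execute(
        """
        SELECT t.track_id, a.name, a.artist_id
          FROM track t
          JOIN track_artist ta ON ta.track_id = t.track_id
          JOIN artist a ON a.artist_id = ta.artist_id
         ORDER BY t.track_id, ta.position
        """
    )
    credits = defaultdict(list)
    for track_id, artist_name, artist_id in rows:
        credits[track_id].append((artist_name, artist_id))
    return dict(credits)


print(tracks_with_artists(conn))
```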