aerozol: guidelines require a lot more community input :) but it might be a good time to start them!
2022-01-18 01800, 2022
reosarevok
If you have ideas, put them to paper (or well, to wiki) and we can open a discussion
2022-01-18 01852, 2022
lucifer
Lotheric: goldenshimmer: not entirely sure but one possible lead I discussed with outsidecontext is the recently played endpoint returning tracks from spotify out of order. that can cause streams to go missing from LB.
2022-01-18 01811, 2022
lucifer
mayhem: alastairp: it looks like spotify endpoint might be returning streams out of order. for eg: consider a user continuously listening since 1 PM. suppose we query at 1:15 PM and we get no streams. we attempt again later but still no streams then we query at 3 PM, endpoint returns streams till 2 PM + a recent stream listened around 2:50 PM. the endpoint hasn't yet returned streams between 2 PM - 2:50 PM. now we query the
2022-01-18 01811, 2022
lucifer
endpoint again later, it has the entire listening history from 12- 3 at this point. but since we already imported the 2:50 PM stream in an older run, LB will ignore all the streams prior to 2:50 PM. so the user's listening history between 2-2:50 PM goes missing.
thoughts on how we can confirm this behaviour and then work on a workaround/fix.
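
A minimal sketch of the scenario described above, assuming the importer keeps a single "latest imported timestamp" per user and drops anything at or before it. The function name and the sample timestamps (minutes after noon) are hypothetical and are not taken from the ListenBrainz code.

```python
def import_new(api_response, last_imported_ts):
    """Keep only streams strictly newer than the stored cursor."""
    fresh = [ts for ts in api_response if ts > last_imported_ts]
    new_cursor = max(fresh, default=last_imported_ts)
    return fresh, new_cursor

# Query at 3 PM: streams up to 2 PM (120) plus one stray stream at 2:50 PM (170).
first = [60, 70, 80, 90, 100, 110, 120, 170]
# A later query finally includes the streams between 2 PM and 2:50 PM.
second = [60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 160, 170]

imported, cursor = import_new(first, last_imported_ts=0)
late, cursor = import_new(second, cursor)

# Streams the API eventually returned but the importer never picked up.
missing = sorted(set(second) - set(imported) - set(late))
```

Under these assumptions the cursor jumps to 170 after the first run, so the second run imports nothing and the 2 PM to 2:50 PM streams end up in `missing`.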
2022-01-18 01859, 2022
outsidecontext
I also try to monitor this today. Two days ago Spotify import worked smoothly for me. Yesterday it had big issues, importing only 3 of 34 tracks (see https://gist.github.com/phw/03ce9343f64cb57328d1b…)
2022-01-18 01821, 2022
outsidecontext
today there seem to be issues again. The three tracks I listened to earlier today are not yet showing up in Spotify API over an hour later. Maybe I can see it happening again and catch one of the bad API responses
treeshateorcs[m]: above change should fix server startup error you were seeing yesterday. will merge after review but you can make the changes locally till then if you want.
2022-01-18 01817, 2022
CatQuest
wait ill adds scrobble to lfm or lb?
2022-01-18 01849, 2022
riksucks
lucifer: btw i wanted to ask you something, wouldn't it be better to manually just add headers to the handle_error function, rather than using the decorator. Pretty sure the code quality would take a hit. What do you say
2022-01-18 01800, 2022
lucifer
riksucks: yes i am currently debugging that error, if we can get it working fine otherwise adding manually is fine.
2022-01-18 01851, 2022
riksucks
Right I see, thanks
2022-01-18 01828, 2022
texke joined the channel
2022-01-18 01857, 2022
lucifer
riksucks: no useful leads yet, lets add the header manually.
2022-01-18 01802, 2022
lucifer
jsonify returns a Response object so you can set the header on that.
2022-01-18 01858, 2022
alastairp
lucifer: hmm, interesting. we use a limit to decide to no longer import some items from the spotify stream, right?
2022-01-18 01816, 2022
alastairp
given that we have deduplication during ingest, maybe we just grab all 50 all the time?
2022-01-18 01819, 2022
alastairp
(morning)
2022-01-18 01850, 2022
lucifer
that would significantly increase dupes. also 50x the load on ts writer.
2022-01-18 01855, 2022
mayhem
moooin!
2022-01-18 01859, 2022
Sophist_UK has quit
2022-01-18 01859, 2022
lucifer
we currently store the timestamp of the latest listen imported for a user from spotify and then query the api for streams only after that timestamp.
2022-01-18 01844, 2022
mayhem
moooin!
2022-01-18 01846, 2022
mayhem
how about we normally do an incremental (since last timestamp) fetch, but every 10 tries we do a full pull of all 50 listens?
2022-01-18 01857, 2022
lucifer
interesting thought, that could work but also means increased load for 7mins every ~75mins.
2022-01-18 01831, 2022
mayhem
ideally we would spread the larger checks out, so they don't bunch up.
2022-01-18 01802, 2022
mayhem
I think we ought to build a whole class in the importer that sets the next check time for a given user.
2022-01-18 01821, 2022
mayhem
and that class can contain a whole lot of logic or attempts at predicting things.
2022-01-18 01825, 2022
PrathameshG joined the channel
2022-01-18 01840, 2022
mayhem
and perhaps that even has some things that could be tuned on the fly.
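
A rough sketch of what such a class could look like, assuming a fixed polling interval and a full pull on every Nth attempt. The class name, the 450 second interval, and the CRC-based offset are all assumptions made for illustration; none of them come from the actual importer.

```python
import zlib
from dataclasses import dataclass


@dataclass
class ImportScheduler:
    interval_s: int = 450       # hypothetical base polling interval in seconds
    full_pull_every: int = 10   # every Nth attempt is a full pull; can be changed at runtime

    def _offset(self, user_id: str) -> int:
        # Stable per-user offset so full pulls are spread out instead of bunching up.
        return zlib.crc32(user_id.encode()) % self.full_pull_every

    def is_full_pull(self, user_id: str, attempt: int) -> bool:
        # True when this attempt should fetch all available listens for the user.
        return (attempt + self._offset(user_id)) % self.full_pull_every == 0

    def next_check(self, now: float) -> float:
        # Placeholder policy: a fixed interval. Smarter prediction logic would go here.
        return now + self.interval_s
```

Because the offset depends only on the user id, each user gets exactly one full pull per `full_pull_every` attempts, and different users tend to land on different rounds.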
2022-01-18 01846, 2022
alastairp
just to confirm - does the ts writer ever back up with work?
2022-01-18 01856, 2022
lucifer
i was thinking of doing some redis based dedup but that would be more work to get right. store last 50 listens imported for the user in redis, then query all available spotify listens for the user, match against redis only send the ones not in redis to ts writer update redis.
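
The dedup idea above could be sketched roughly as follows. A plain in-memory dict stands in for redis here, and listens are modelled as hypothetical `(played_at, track_id)` tuples; a real version would need a redis client and a serialized key per listen.

```python
from collections import deque

# user -> keys of the most recently imported listens (stand-in for a per-user redis list)
recent = {}


def filter_new(user, fetched, keep=50):
    """Return the fetched listens not already remembered, and remember them."""
    seen = recent.setdefault(user, deque(maxlen=keep))
    new = []
    for listen in fetched:
        if listen not in seen:
            seen.append(listen)   # oldest entry is dropped once `keep` is exceeded
            new.append(listen)
    return new
```

Only the listens returned by `filter_new` would be sent on to the ts writer, so a full pull of all available listens does not turn into extra writes for ones already imported.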
2022-01-18 01809, 2022
lucifer
currently no.
2022-01-18 01809, 2022
mayhem
say that this "out of order mode" from spotify is an aberration. maybe we can turn the full checks off when things run normally?
2022-01-18 01849, 2022
mayhem
lucifer: I think adding extra work on that level will just end up being a pain. let TS sort it out, but lets try to be smart about how often we query a particular user.
2022-01-18 01821, 2022
alastairp
^ agreed about the complexity of another level
2022-01-18 01859, 2022
lucifer
re disabling full checks, that needs us to be able to figure out when spotify is going out of order. for eg, outsidecontext noticed the issue but i usually don't monitor what spotify is importing for me so would never know.
2022-01-18 01804, 2022
lucifer
makes sense.
2022-01-18 01858, 2022
outsidecontext
yes. I also don't know how often it failed for me without noticing. Just yesterday I easily spotted it, and a few weeks ago there was another case I noticed.
2022-01-18 01811, 2022
mayhem
lucifer: yes, perhaps this is something we check for and turn on automatically?
2022-01-18 01821, 2022
outsidecontext
And today of course. Still waiting for my listenings from 3 hours ago to show up in Spotify API
2022-01-18 01826, 2022
mayhem
like do random long pulls and check for OOO issues?
2022-01-18 01809, 2022
lucifer
+1 on the "the importer that sets the next check time for a given user."
2022-01-18 01812, 2022
alastairp
we found forum posts on the spotify site asking about this problem too, so it's not just related to us
2022-01-18 01858, 2022
mayhem
we could have a button somewhere that says: "spotify is being dumb, please try harder for a day"
2022-01-18 01859, 2022
lucifer
yes we can do random long pulls but still need to compare it with something to figure out that listens are missing.
2022-01-18 01804, 2022
outsidecontext
there are two aspects to this: 1. spotify sometimes having huge delays until listens show up. There is obviously not much to do about this
2022-01-18 01804, 2022
mayhem
and if 3 people press the button, we try harder.
2022-01-18 01827, 2022
alastairp
is there a way for us to independently identify this without user input?
2022-01-18 01839, 2022
mayhem
oh, ah. big brain time. 🧠
2022-01-18 01845, 2022
outsidecontext
and 2. likely spotify sometimes only shows more recent listens first, while not yet showing some older ones. at least that's the theory so far, not yet seen an actual API result showing this behavior
2022-01-18 01855, 2022
alastairp
randomly sample n users, get their listens every hour, see if listens "turn up" in the middle compared to when we checked last
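
One way the comparison could work, sketched under the assumption that each hourly sample is stored as a list of listen timestamps for the user. The function name is hypothetical.

```python
def late_arrivals(previous, current):
    """Timestamps in `current` that are older than the newest listen
    seen last time, but were absent from that earlier sample."""
    if not previous:
        return []
    newest_before = max(previous)
    seen = set(previous)
    return sorted(ts for ts in current if ts not in seen and ts < newest_before)
```

A non-empty result for a sampled user means listens "turned up" in the middle of a range that had already been fetched, which is the out-of-order signal being discussed.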
2022-01-18 01827, 2022
mayhem
lets get a paid spotify account. lets create a bot that listens to music. and it listens to a short track every 3 minutes or so.
2022-01-18 01850, 2022
mayhem
now we have a "clock frequency" we know that listens should be coming for this user every 3 minutes.
2022-01-18 01817, 2022
mayhem
if reality differs from theory, try harder.
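
The "clock frequency" check might look something like this, assuming the bot plays a track every 180 seconds and that a listen within 60 seconds of the expected time counts as a match. Both numbers and the function name are assumptions for illustration.

```python
def missing_beats(observed, start, end, period=180, tolerance=60):
    """Expected listen times in [start, end) with no observed listen nearby."""
    expected = range(start, end, period)
    return [
        t for t in expected
        if not any(abs(t - o) <= tolerance for o in observed)
    ]
```

If `missing_beats` returns anything for the bot account once the usual API delay has passed, that could be the trigger to "try harder" for everyone.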
2022-01-18 01850, 2022
lucifer
sounds good
2022-01-18 01814, 2022
lucifer
outsidecontext, have other listens for today shown up yet?
2022-01-18 01817, 2022
outsidecontext
no, and I only did these three in the morning. I'll listen to something additional now for testing
2022-01-18 01839, 2022
lucifer
outsidecontext, also i found another comment in the forums mentioning that it could be related to offline mode or iOS app. does that sound familiar to when you noticed the issue ?
2022-01-18 01815, 2022
outsidecontext
have been using the desktop app on my laptops
2022-01-18 01830, 2022
lucifer
👍
2022-01-18 01848, 2022
outsidecontext
interesting is that the current playback endpoint is working well
2022-01-18 01801, 2022
alastairp
yeah, I think we noticed that too
2022-01-18 01822, 2022
lucifer
yeah the current playback endpoint usually works realtime but the recently played one may lag hours.
2022-01-18 01829, 2022
yvanzo
O’Moin
2022-01-18 01812, 2022
PrathameshG has quit
2022-01-18 01856, 2022
reosarevok
moin :)
2022-01-18 01859, 2022
reosarevok
yvanzo: someone said "nothing in the MusicBrainz documentation about work types, like Incidental Music and such. the only place I’ve found those descriptions is on the Create New Work pages. frankly, what we’ve got hasn’t helped me figure out if parts of a soundtrack are incidental or not"
2022-01-18 01817, 2022
reosarevok
This goes together with the ws ticket you shared yesterday
2022-01-18 01845, 2022
reosarevok
Should we look into having, say, /work-types on the site, and then maybe a /ws/2/work-types JSON representation of the same list?
2022-01-18 01850, 2022
mayhem
reosarevok: a friend of mine, with whom I am chatting about a possible collaboration, asks:
2022-01-18 01852, 2022
mayhem
"In the meantime can you send me a link to a musicbrainz entry that you think gives a good representation of full metadata? Like an artist with ISRCS and other unique identifiers, etc. So I can get a vibe on what the ultimate goal for artists would be."
2022-01-18 01805, 2022
mayhem
got any artists in mind that fit this bill especially well?
2022-01-18 01807, 2022
reosarevok
Hmm. Artists specifically
2022-01-18 01809, 2022
reosarevok
Lemme think
2022-01-18 01802, 2022
mayhem
or an album...
2022-01-18 01816, 2022
mayhem
where the artist is also pretty well pimped out.
2022-01-18 01815, 2022
reosarevok
Well, as much as I dislike his bullshit, https://musicbrainz.org/artist/164f0d73-1234-4e2c… seems well-filled. Also, wtf, does the US really allow one to legally change their name to just "Ye", no surname? :D
(I see four recordings were added during the holidays that need fixing, heh, will change them)
2022-01-18 01812, 2022
reosarevok
Is the possible collaboration a seekrit?
2022-01-18 01851, 2022
reosarevok
ISRCs and whatnot though depend a ton on what people have sent and I can't guarantee most recordings for either artist have them, tbh
2022-01-18 01805, 2022
reosarevok
Since those usually require CD in hand and even then it's not always there
2022-01-18 01843, 2022
reosarevok
Kanye has a fair amount for albums, it seems, but not many for singles
2022-01-18 01807, 2022
mayhem
ok, thanks for those links.
2022-01-18 01839, 2022
mayhem
what would we consider ideal minimal metadata for an artist who wishes to add one release, or even perhaps one single track they just finished?
2022-01-18 01845, 2022
mayhem
artist name, sortname, type, area, link to their own page, a link to where the artists can be supported, metadata for release including a link where it can be obtained.
2022-01-18 01849, 2022
mayhem
is kinda my thinking.
2022-01-18 01858, 2022
reosarevok
Sort name can be confusing, but yes. Link to their own page (which often will mean social media), link to streaming pages or whatnot, track titles and durations
2022-01-18 01813, 2022
reosarevok
And ideally ISRCs since it's much easier for them to provide them than for anyone else to find them
2022-01-18 01846, 2022
reosarevok
IPI and ISNI in the form confuse a lot of people though, so if we have a section for ISRCs it should specify "it's ok to skip them if you don't have them"
2022-01-18 01827, 2022
mayhem
"but if there is a way to make an MB add-on for these" <- there is fierce competition for doing this right now. and each of these tools is collecting data for their own "WE MUST WIN IT ALL OR WE PERISH" approach. which is.. uhm, not going to work.
2022-01-18 01856, 2022
yvanzo
But they are publishing music on several digital streaming platforms at the same time. Might it be possible to add MB as a target? It would just grab metadata and not audio content.
2022-01-18 01836, 2022
mayhem
anything is possible at this point in time.
2022-01-18 01801, 2022
mayhem
it would be good for us to think about this in broad terms of what we would love to see happen.
2022-01-18 01804, 2022
yvanzo
But you’re probably right that they may not easily allow creating add-ons for their software.
2022-01-18 01813, 2022
mayhem
and then Marc and I can actually see about what is possible.
2022-01-18 01853, 2022
mayhem
yvanzo: and the add-ons are usually for larger studios with established artists. byta deals more with tons of teeny artists.
2022-01-18 01813, 2022
mayhem
which is actually great -- we want more teeny artists metadata in MB.
2022-01-18 01844, 2022
yvanzo
Right.
2022-01-18 01858, 2022
reosarevok
Yeah, established artist data we'll eventually get most of anyway
2022-01-18 01812, 2022
mayhem
the easiest would be a cd stub like system, but... that doesn't do a lot of good in the grand scheme of things.
2022-01-18 01830, 2022
mayhem
having switched on artists would be even better.
2022-01-18 01830, 2022
reosarevok
I mean, it does do good if we or byta have people in charge of finishing the import
2022-01-18 01844, 2022
reosarevok
But it doesn't otherwise
2022-01-18 01850, 2022
mayhem
agreed.
2022-01-18 01800, 2022
mayhem
but I have no idea of the costs and scalability of this.
2022-01-18 01843, 2022
reosarevok
Yeah, me neither :) Something like that would need to be tested for feasibility for sure
2022-01-18 01807, 2022
reosarevok
But the scalability issue will to some degree be there even if the artists do all the adding since the new additions will be seen by the community
2022-01-18 01827, 2022
reosarevok
My main worry with artists doing the adding is just "who deals with edit notes by the community"
2022-01-18 01835, 2022
reosarevok
The artist is likely gone by then
2022-01-18 01839, 2022
mayhem
agreed. scalability remains the greatest concern.
2022-01-18 01808, 2022
mayhem
perhaps we can fund one position of someone who is what you call a finisher.
2022-01-18 01814, 2022
reosarevok
Of course, byta edits could clearly be marked as such, with a note like "this was added in this way, if you see something wrong here, be bold and make the changes, if you see something consistently wrong, get in touch so we can improve the system"
2022-01-18 01833, 2022
mayhem
if we can identify the stream of users coming in from this, this person could be tasked with tidying up this incoming stream of data.
2022-01-18 01839, 2022
reosarevok
But basically there should be clarity in any case on whether the community can expect an answer to notes or not
2022-01-18 01850, 2022
reosarevok
Yeah. That seems perfectly doable
2022-01-18 01801, 2022
reosarevok
(I do that at a small scale rn with BBC works)
2022-01-18 01811, 2022
reosarevok
The scalability is the main question
2022-01-18 01841, 2022
reosarevok
You'd need to decide with byta whether each artist would get one account, or all would go through one main byta account
2022-01-18 01843, 2022
mayhem
a good question, that.
2022-01-18 01816, 2022
reosarevok
If they have accounts for artists on their site, then giving them an automatic [byta_or_whatever_prefix]_username account is probably doable (needs implementing, but)
2022-01-18 01819, 2022
mayhem
and ideally, this would be less of "marc and I deciding" but more of "reo telling us how it could be feasible"
2022-01-18 01847, 2022
reosarevok
If they don't and each time the artist adds new music they do it on a form without them having an account over there as such, then a general account for them seems simpler
2022-01-18 01823, 2022
mayhem
the "creating an account" seems to be a significant hurdle for such services.
2022-01-18 01849, 2022
reosarevok
Yeah. We could in some way automate MB account creation for them if they have an account on their side
2022-01-18 01851, 2022
mayhem
so, if they are required to have a byta account, then adding yet another one is a likely dealbreaker.
2022-01-18 01805, 2022
reosarevok
I wasn't thinking of the artist filling in our captcha :)
2022-01-18 01813, 2022
reosarevok
Just that behind the scenes they'd get assigned an account
2022-01-18 01819, 2022
mayhem
once we migrate to single oauth on MeB, then this is much easier.
2022-01-18 01819, 2022
reosarevok
For submitting the data
2022-01-18 01830, 2022
mayhem
agreed, not a bad idea.
2022-01-18 01837, 2022
reosarevok
But that only works if they have byta accounts already
2022-01-18 01849, 2022
reosarevok
If not, they should go through one generic byta account
2022-01-18 01814, 2022
reosarevok
That'd probably be easier to monitor anyway, but harder to find the issues with specific artists and support them if they need to
2022-01-18 01856, 2022
reosarevok
FWIW though we now have an edit search for edit note content, meaning different accounts wouldn't be a problem if we can just search for a string such as "Submitted via byta"
2022-01-18 01858, 2022
yvanzo
One generic byta account seems to be more workable indeed.