#metabrainz

      • alastairp
        reosarevok: oh. well. I guess I should read the forums first :)
      • reosarevok
        I mean that uses the WS, dunno if yours queries the DB?
      • alastairp
        yeah, mine goes to the db directly
      • (my db)
      • reosarevok
        So it's not a complete dupe, since there's a lot of stuff that can't even be gotten via the ws
      • Might still be worth mentioning it on that forum post :)
      • alastairp
        saifulbkhan: https://tickets.metabrainz.org/projects/AB/issu... is a good small ticket
      • BrainzBot
        AB-296: Bulk low-level get fails if one of the mbids doesn't exist in the database
      • alastairp
        sure
      • saifulbkhan
        alastairp: I'm on it! unless you have some comments to add...
      • alastairp
        perhaps I'll add a bit more data other than just artists :)
      • saifulbkhan: nope, this one is pretty self-explanatory
      • saifulbkhan
        ok! thx then!
      • alastairp
        let me find you an example URL which breaks it
      • saifulbkhan
        alright
      • alastairp
        saifulbkhan: as an example, http://acousticbrainz.org/api/v1/low-level?reco... works
      • (see I changed the last letter of the URL, so that the ID doesn't exist)
      • if you load your own data into your local acousticbrainz server, you'll have different IDs
      • saifulbkhan
        alastairp: and we want it to not fail and give us data for whichever IDs actually exist
      • Rotab has quit
      • Rotab joined the channel
      • magerharz has quit
      • magerharz joined the channel
      • travis-ci joined the channel
      • travis-ci
        samj1912/picard#61 (untagged-2648d7b3060570139fec - 40114de : Laurent Monin): The build passed.
      • travis-ci has left the channel
      • alastairp
        saifulbkhan: right
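A minimal sketch of the behaviour agreed on above for AB-296: return data for whichever IDs exist instead of failing the whole request. The function name `bulk_get_low_level`, the dictionary standing in for the database, and the sample IDs are all hypothetical; this is not the actual AcousticBrainz code.

```python
from typing import Dict, Iterable, List, Tuple


def bulk_get_low_level(
    mbids: Iterable[str], store: Dict[str, dict]
) -> Tuple[Dict[str, dict], List[str]]:
    """Return data for every ID found in `store`, plus the IDs that were missing."""
    found: Dict[str, dict] = {}
    missing: List[str] = []
    for mbid in mbids:
        try:
            found[mbid] = store[mbid]
        except KeyError:
            # One unknown ID should not abort the whole bulk request.
            missing.append(mbid)
    return found, missing


# Hypothetical data: "id-a" exists in the store, "id-b" does not.
store = {"id-a": {"bpm": 120}}
found, missing = bulk_get_low_level(["id-a", "id-b"], store)
print(found)    # {'id-a': {'bpm': 120}}
print(missing)  # ['id-b']
```

Returning the missing IDs separately is one possible design; silently dropping them would also satisfy the ticket as described here.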
      • hibiscuskazeneko has quit
      • iliekcomputers
        ruaok: are you planning to look at db-move again today? Also, what was the simple stuff you talked about earlier, I can get to it today
      • ruaok
        I'm feeling a bit under the weather. I'm going to lay down for a bit and see if I can get to it later.
      • hibiscuskazeneko joined the channel
      • a small thing... the listens page. the previous version (alpha) had better timestamps on it (x hours ago vs unix timestamp)
      • we need to bring back the better timestamps.
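A rough sketch of the "x hours ago" style of timestamp mentioned above, assuming the listen time is available as a Unix timestamp in seconds. The helper name `time_ago` and its rounding rules are assumptions, not the ListenBrainz implementation.

```python
import time
from typing import Optional


def time_ago(unix_ts: int, now: Optional[float] = None) -> str:
    """Render a Unix timestamp as a coarse relative time such as '2 hours ago'."""
    now = time.time() if now is None else now
    seconds = max(0, int(now - unix_ts))
    # Try the largest unit first and use the first one that fits.
    for unit, size in (("day", 86400), ("hour", 3600), ("minute", 60)):
        if seconds >= size:
            count = seconds // size
            return f"{count} {unit}{'s' if count != 1 else ''} ago"
    return "just now"


print(time_ago(0, now=7200))  # 2 hours ago
```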
      • SothoTalKer
        hello :3
      • ruaok
        hi SothoTalKer
      • gcilou
        good morning, Monday
      • iliekcomputers
        ruaok: no rush :), I will work on the timestamps
      • Maybe I'll add the listen count to user page also while I'm at it, should be simple, hopefully.
      • SothoTalKer: hi :)
      • ruaok
        iliekcomputers: sounds simple, no?
      • iliekcomputers
        Yeah, it does.
      • iliekcomputers hopes there are no underlying complexities
      • Should be a simple query to influx, right?
      • ruaok
        would you like to discover them first or should we talk about them now? :)
      • iliekcomputers
        Lol
      • ruaok
        there are two considerations:
      • 1. can influx do that? (it can)
      • darwin__ joined the channel
      • 2. how expensive is this operation?
      • might be fine to ask if you have 1000 listens.
      • what if you have 10k? 100k? 1M, 10M?
      • should you really count 10M rows every time the user page is loaded?
      • I think CatQuest would be angry with you.
      • CallerNo6
        +1
      • SothoTalKer
        gcilou: you got out of bed. i got back from work :)
      • iliekcomputers
        I would be angry with myself also :P
      • rvedotrc has quit
      • darwin has quit
      • fs has quit
      • gcilou
        SothoTalKer: well I've already taken a calculus test and been at a meeting this morning so.. ;)
      • iliekcomputers
        Caching in redis is my first solution to everything, but that won't work well because the listen count can change rapidly.
      • Or maybe we keep a listencount in redis for every user and update accordingly
      • flamingspinach joined the channel
      • ?
      • ruaok
        iliekcomputers: good thinking.
      • SothoTalKer
        gcilou: busy busy :-)
      • ruaok
        but, there is another train of thought that people are converging to....
      • if you have 13 listens, it's nice to know when you have 14.
      • but when you have 138,849 listens, do you care when you have 138,850?
      • so, one way to do it is to not give exact figures or to clearly state when the figure was last updated.
      • iliekcomputers
        Would be nice to know anyways, imo
      • ruaok
        of course, and we should provide that info.
      • SothoTalKer
        invent a button on the userpage
      • ruaok
        the thing is programmers tend to want to do things very accurately, but often that accuracy doesn't really get consumed by the end user.
      • rvedotrc joined the channel
      • SothoTalKer
        depending on how much work it is, you could run a daily/hourly job. and have a button that lets users update it in between, if they really need it
      • iliekcomputers
        We can do ranges and as SothoTalKer said, add an option for exact values, seems like a good compromise.
      • SothoTalKer
        you can listen to 480 3minute songs per day
      • that's about 175000 songs per year
      • nonstop continuous listening
      • iliekcomputers
        People new to LB would still like to see a count updating 😃
      • SothoTalKer
        so the 10M listens are a bit over the top ;)
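The back-of-the-envelope figures above can be checked with a few lines of arithmetic. The only inputs are the ones given in the conversation (3-minute songs, nonstop listening); the 365-day year is an assumption.

```python
MINUTES_PER_DAY = 24 * 60
SONG_MINUTES = 3

songs_per_day = MINUTES_PER_DAY // SONG_MINUTES  # 480
songs_per_year = songs_per_day * 365             # 175200, i.e. "about 175000"
years_to_10m = 10_000_000 / songs_per_year       # years of nonstop listening

print(songs_per_day, songs_per_year, round(years_to_10m, 1))  # 480 175200 57.1
```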
      • ruaok
        SothoTalKer: what is the possible downside for doing a cron job?
      • iliekcomputers: and yes, of course the count needs to update.
      • SothoTalKer
        users don't get live data
      • ruaok
        but, given that a song is 3-5 minutes long and a user might go to their page a few times during a song, do we need to recompute the count each time?
      • SothoTalKer: what else?
      • SothoTalKer
        i think you have something on your mind
      • ruaok
        i do
      • what if you have 1M users in the system.
      • and 10k daily users.
      • SothoTalKer
        bold goals
      • ruaok
        if you recompute the totals once a day, you're going to compute 990,000 totals for naught.
      • SothoTalKer
        but i see your point
      • is the listen count public?
      • ruaok
        I'm just using round numbers as example.
      • I think it should be since the data is public.
      • SothoTalKer
        if a userpage is browsed, don't calculate it every time. just when the data is too old?
      • ruaok
        yep, that is a common pattern that works.
      • what would be a reasonable update window given this situation?
      • iliekcomputers
        Depends on how many listens the user has
      • ruaok
        does it?
      • iliekcomputers
        If the user is brand new, maybe 15 minutes
      • ruaok
        iliekcomputers: I think you're conflating two separate solutions.
      • and do per user update intervals make sense? how could you implement that?
      • (not that conflating two separate solutions is bad, mind you)
      • SothoTalKer
        the downside of a userpage view based system is that you cannot have a 'top list' :D
      • ruaok
        SothoTalKer: I don't follow
      • SothoTalKer
        who has the most listens? which song was listened to most?
      • iliekcomputers
        ruaok: ranges based on the current number of listens we have cached, when the user page gets requested we check the number of listens in redis and when it was calculated, and then according to this, we update (or don't)
      • r_dunn joined the channel
      • ruaok
        I think you're on the right track. how could you simplify your idea?
      • consider timeout values of data items stored in redis.
      • iliekcomputers
        ruaok: please elaborate
      • ruaok
        when you stick a piece of data into redis, you can give it a time-to-live, right?
      • after that time, redis dumps it from the cache.
      • iliekcomputers
        Oh, I did not realize that
      • ruaok
        so, restate your ideas using this concept.
      • Gentlecat
        bitmap: some of the trigger functions seem to be the same except for a name
      • search_annotation_delete_1 and search_annotation_delete_3 are used for different tables, but I wonder if it can be just one
      • iliekcomputers
        That makes it easy, we just calculate on some request and put it in redis with an expiry time based on how many listens the user has
      • On the next request we check redis and if there is no data then we calculate again, otherwise we use the old one.
      • bitmap
        Gentlecat: I don't see why not, to the indexer they are the same
      • ruaok
        iliekcomputers: yup, that is what I had in mind.
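The idea agreed on above (compute on request, store in redis with an expiry, recompute only once the key is gone) might look roughly like this. It is a sketch under several assumptions: `InMemoryCache` is a stand-in so the example runs without a server, exposing only the `get` and `setex` calls that a redis-py client also provides; the key format, the `ttl_for` tiers, and the `count_listens` callback (the expensive influx query) are hypothetical.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class InMemoryCache:
    """Stand-in with the two redis-py style calls used below: get and setex."""

    def __init__(self, clock: Callable[[], float] = time.time) -> None:
        self._clock = clock
        self._data: Dict[str, Tuple[float, bytes]] = {}

    def get(self, key: str) -> Optional[bytes]:
        entry = self._data.get(key)
        if entry is None or entry[0] <= self._clock():
            self._data.pop(key, None)  # expired keys disappear, as with a TTL
            return None
        return entry[1]

    def setex(self, key: str, ttl_seconds: int, value) -> None:
        self._data[key] = (self._clock() + ttl_seconds, str(value).encode())


def ttl_for(count: int) -> int:
    """Hypothetical tiers: small counts refresh often, large ones rarely."""
    if count < 1_000:
        return 15 * 60
    if count < 100_000:
        return 60 * 60
    return 24 * 60 * 60


def get_listen_count(cache, user: str, count_listens: Callable[[str], int]) -> int:
    """Return the cached count, recomputing only when the key has expired."""
    key = f"listen_count:{user}"
    cached = cache.get(key)
    if cached is not None:
        return int(cached)
    count = count_listens(user)  # the expensive query runs only on a miss
    cache.setex(key, ttl_for(count), count)
    return count
```

Because expiry is handled by the cache itself, no "last calculated" timestamp needs to be stored or checked, and users who never visit their page cost nothing.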
      • Gentlecat
        the only difference I see is in "direct" functions, which reference "id" row instead of "gid"
      • the other way
      • iliekcomputers
        Nice :)
      • iliekcomputers puts this on his to do list
      • bitmap
        does sir use the gid for something?
      • Gentlecat
        not sure if it does
      • saifulbkhan has quit
      • but that's for generating the functions
      • bitmap
        presumably when it handles something from the delete queue, it sends some message to solr with the gid somewhere
      • samj1912 has quit
      • nawcom has quit
      • hibiscuskazeneko has quit
      • Gentlecat
      • nawcom joined the channel
      • that's the only place I see where it handles `delete` messages
      • hibiscuskazeneko joined the channel
      • I guess I'll send an email to mineo, this is strange
      • hibiscuskazeneko has quit
      • bitmap
        well, maybe it's only indexed by gid, so that's what delete_many needs
      • note that delete_1 and delete_3 only have the row id anyway, so doing an extra query for the gid would be redundant there when the indexer has to do it anyway
      • Gentlecat
        right