#metabrainz

/

      • _lucifer
        👍
      • alastairp
        I note that it doesn't output anything anyway - is this the logging update that you want to do?
      • because we should definitely have a log message "started" and "finished", with date
      • then we can redirect output to a log file
      • _lucifer
        actually, i checked again it does finish. the issue is that it doesn't log errors correctly.
      • `self.log` is wrong should be logger or logging
      • i'll add a start log message as well
      • *it does log on finishing
      • alastairp
        right. we should make sure we log on start too, then
      • zas
        switching back to kiki
      • alastairp
        and maybe we need a "logging cookbook" too, some basic guide for how to ensure that we always get logging correct, it seems to be causing us all sorts of problems recently
      • _lucifer
        yup that would be useful
      • alastairp
        I think that things that run within a flask context should be mostly fine (using current_app.logger)
      • BrainzGit
        [listenbrainz-server] alastair opened pull request #1469 (master…cron-cleanup): Add missing username to dumps crontab https://github.com/metabrainz/listenbrainz-serv...
      • _lucifer
        yes, right but there are many places where we create the flask context just for logging purposes.
      • what should we do for those cases.
      • alastairp
        can you give an example?
      • BrainzGit
        [listenbrainz-server] alastair closed pull request #1462 (master…dependabot/pip/pydantic-1.8.2): Bump pydantic from 1.8.1 to 1.8.2 https://github.com/metabrainz/listenbrainz-serv...
      • [listenbrainz-server] alastair merged pull request #1465 (master…webpack-no-progress): Don't output any progress while compiling prod js files https://github.com/metabrainz/listenbrainz-serv...
      • _lucifer
      • alastairp
        just to confirm - this is a command that runs in the context of the webserver (e.g. in a web/cron container) ?
      • _lucifer
        and a few other places in spark reader, i do not see why we need a flask context to connect to rabbitmq.
      • alastairp
        I guess in this case it's so that we can use current_app
      • _lucifer
        yes this command runs from cron, there are a few other that run in spark reader same image different container
      • alastairp
        but this is an interesting example -
      • why do we need to make a connection to rabbitmq? does create_app connect to rabbitmq, or is this something that's only done in request_manage?
      • what I'm trying to say is - by doing things within an app context (especially in manage.py), we should have all of our connections to dependent services running, redis, db, timescale, logging, etc
      • which I think is really useful
      • _lucifer
        we just use current_app to get the config. the connection is done by calling utils.connect_to_rabbitmq
      • alastairp
        so this example in request_manage.py - by using an app context we don't need to configure a logger. it's all done in create_app(). If we decided to remove this, then we'd have 2 different ways of doing logging within the webserver
      • _lucifer
        i see we have 8 services, 4 of those use flask other 4 don't (this is excluding mbid mapping writer, spark cluster stuff)
      • alastairp
        can we make a list of those in a document somewhere?
      • _lucifer
        sure, i'll do that
      • alastairp
        my gut feel is that we should use flask in all of the services which run from the web image
      • _lucifer
        so we definitely have enough services to justify 2 setups. using flask context is easy but it does not look ideal to me.
      • alastairp
        I just had a dig through the code, it seems that create_app() eventually calls https://github.com/metabrainz/listenbrainz-serv...
      • which means that for example, the call to `utils.connect_to_rabbitmq` in request_manage.send_request_to_spark_cluster isn't actually needed
      • aggregates updates finished
      • _lucifer
        oh interesting!
      • your point is valid, as long as its the web image flask is going to be there whether we use it or not.
      • alastairp
        right - so then we have the situation of spark; or the mbid mapping writer
      • _lucifer
        let me check what image the mbid mapping writer uses
      • alastairp
        we decided that in some cases (mbid writer) it doesn't make sense to carry around the whole webserver image
      • ruaok and I had a discussion about this a few times in the last months. we're happy with this decision at the moment
      • _lucifer
        cool, so we can setup a normal logger there without flask.
      • alastairp
        one thing we discussed was if we could have a standardised setup for logging - perhaps we _do_ want to put logging setup in BU. We could consider using optional dependencies in BU for the webserver dependencies
      • _lucifer
        we had discussed making flask optional in BU using extra_requires. we can do that and then add some common loggin stuff thre
      • alastairp
        snap
      • _lucifer
        yes exactly
      • alastairp
        cool, that's a future task, then
      • this ties into my thoughts on LB-879
      • BrainzBot
        LB-879: unify external service connections in single location https://tickets.metabrainz.org/browse/LB-879
      • _lucifer
        nice
      • in other news, caching now entirely fails in my latest experiments 😢
      • alastairp
        docker build cache?
      • _lucifer
        yup
      • alastairp
        that's interesting. maybe we should take a step back from this and see if it's the correct way to do it
      • flamingspinach has quit
      • _lucifer
        yeah we probably should
      • flamingspinach joined the channel
      • atj
        When I play tracks from a VA release, the artist name in LB displays as "Various Artists" but links to the actual artists MB page. Is this a client issue? MBIDs are being submitted with listens.
      • alastairp
        we don't use MBIDs in the LB interface yet.
      • well, maybe we do - if the link is there
      • atj
        It links to the correct MBID so you must do? :)
      • alastairp
        it's possible that your client is submitting the album artist instead of the track artist
      • atj
        Yeah I thought that might be the issue.
      • This gives some weird results in the UI: https://listenbrainz.org/user/atj/reports?range...
      • multiple instances of the same artist name but they link to a different MBID
      • Seems like my client isn't detecting compilations properly: https://gitlab.gnome.org/World/lollypop/-/blob/...
      • _lucifer
        looks like its been a terrible week for actions as well, down 3rd time since friday
      • alastairp
        looks like I picked the wrong week to give up jenkins
      • _lucifer
        but then they decided to build it all in node so they were also asking for it ;)
      • atj
        Maybe their Azure billing account overflowed :)
      • _lucifer
        alastairp: the logger here https://github.com/metabrainz/listenbrainz-serv... is not configured to output time etc. should i add the configuration in the listenbrainz/__init__.py, the same what we added in listenbrainz_spark/__init__.py.
      • excluding the file handler)
      • alastairp
        I guess that'd be OK, but why are we using a specific logger here instead of current_app.logger?
      • _lucifer
        not sure
      • alastairp
        atj: off the top of you head, is there a difference in the order of `2>&1` and `> file` when doing redirects?
      • I've always done 2>&1 first, because I read it as "redirect stderr to stdout then redirect stdout to file", but does it work if they're around the other way too?
      • yvanzo
        Freso: yes
      • atj
        alastairp: yes there is a difference
      • it always confuses me as I forgot the order
      • the correct order is ">file 2>&1"
      • hope the gist makes sense
      • ruaok
        Daaamn, that was fast. I wasn't here 2 minutes before I was deposited into the 15 minute waiting area.
      • alastairp
        atj: oh interesting, so I've always done it wrong!
      • thanks
      • yvanzo
        yyoung: congratulations! :)
      • atj
        alastairp: this is why in general I would suggest you use "&>file" if you know you'll always be using bash
      • alastairp
        does that do both?
      • this is in a crontab, I don't know what vixiecron uses
      • ruaok
        The vaccine center is totally not busy. If you two just randomly showed up, i bet they would vaccinate you, Mr_Monkey & alastairp
      • atj
        alastairp: yes it does redirect both, you can set "SHELL=/bin/bash" in the crontab file to force the shell that cron will use to execute (normally it's "/bin/sh")
      • "Several environment variables are set up automatically by the cron(8) daemon. SHELL is set to /bin/sh, ..."
      • Mineo
        shellcheck (https://www.shellcheck.net/) will flag the incorrect order of redirection magic (I can never remember the correct one, either :-) )
      • alastairp
        atj: oh yeah, of course. thanks
      • Mineo: yeah, we use shellcheck more often now, which is really useful
      • _lucifer: mmm, I was thinking about cron validation - we could have a job that runs when the crontab updates to check that everything is present. but I just realised that we could also consider doing shellcheck too
      • (as long as actions doesn't keep disappearing :)
      • _lucifer
        with conditional file path checking, i would say let's make shellcheck run in an action if any script is running.
      • *is modified
      • alastairp
        yep
      • _lucifer
        does shellcheck accept crontab though?
      • alastairp
        we have plenty other shell scripts though, maybe it doesn't help in this specific case
      • _lucifer
        yes, right. we should definitely add shellcheck.
      • i was thinking in addition to it what is possible for the cron use case.
      • BrainzGit
        [listenbrainz-server] alastair merged pull request #1469 (master…cron-cleanup): Add missing username to dumps crontab https://github.com/metabrainz/listenbrainz-serv...
      • [listenbrainz-server] alastair merged pull request #1461 (master…dependabot/pip/master/pydantic-1.8.2): [Security] Bump pydantic from 1.8.1 to 1.8.2 https://github.com/metabrainz/listenbrainz-serv...
      • [listenbrainz-server] alastair merged pull request #1459 (master…conditional-action-run): LB-895: Conditionally run tests based on paths modified https://github.com/metabrainz/listenbrainz-serv...
      • [listenbrainz-server] alastair merged pull request #1439 (master…spark-test-improvements): Some more improvements to Spark test and development setup https://github.com/metabrainz/listenbrainz-serv...
      • _lucifer
        alastairp: 02 and 0 are same in crontab right?
      • 02 and 2
      • atj
        _lucifer: yes
      • _lucifer
        👍, thanks!
      • alastairp: another thing `/logs` is `chown`'ed to `lbdumps` user. I was intending to redirect stats log to `/logs/stats.log` but i don't think it would in that case. but now that we have only one crontab do we even need two users?
      • ruaok
        Setting up a docker registry doesn't look hard...
      • We could set up a trial one and see how much disk it needs.
      • _lucifer
        setting up github's registry should be easier i believe?
      • ruaok
        It's still in beta.
      • shivam-kapila
        > The vaccine center is totally not busy. If you two just randomly showed up, i bet they would vaccinate you, Mr_Monkey & alastairp
      • No age bar in spain?
      • ruaok
        And I feel that we rely on GH too much as is...
      • shivam-kapila: yes, there is.
      • But the people there looked bored. Lol.
      • shivam-kapila
        Haha. I got an appointment but its quite far away :/
      • There were around 400 slots nearby but filled in say 20sec even with a captcha
      • _lucifer
        our LB images are 1.25G each so that's a starting point to consider regarding diskspace although some layers might get reused.
      • alastairp
        _lucifer: agreed, I think there's no reason to have 2 different users
      • _lucifer
        out of curiosity, why did we have two until now because you mentioned we use global crontabs not user specific?
      • ruaok: https://www.irccloud.com/pastebin/3uQJGiRL/ alastairp had shared this sometime ago. just docker-compose this to bring up a registry.
      • we indeed rely on Github a lot but we had also discussed some time ago that if we can avoid setting up our own services that would be nice.
      • ruaok
        True that as well.
      • Maybe we should trial GH registry....
      • alastairp
        _lucifer: ah, good point. so there's a bit of history here
      • at some point in time, crontabs _were_ installed as specific users
      • but there was a bit of unneeded complexity, so I removed it. I suspect you're right that this is why we originally had 2 users: 2 files -> 2 users
      • although, here's an interesting question. we create lbdumps with a known uid so that it can write to the storage box. maybe instead of of creating a `listenbrainz` user and then `lbdumps` with a specific UID, we could just set `listenbrainz` to that specific uid instead. I'll have to verify the behaviour of this
      • _lucifer
        alastairp, oh interesting. makes sense. i'll just change all to lbdumps for now. we can later change it all to `listenbrainz` user if it works fine.
      • alastairp
        sounds good to me
      • actually - once we upgrade lemmy and remove the storagebox this requirement for a specific uid becomes less important anyway, so we might be able to just remove it
      • sumedh has quit
      • _lucifer
      • uh oh. this seems to be a problem. the disconnect logic is probably not handling some case correctly.
      • alastairp
        or someone double-clicked and it sent 2 requests
      • _lucifer
        that could be possible but these are radio buttons that trigger requests `onchange` not `onclick`
      • elomatreb[m] has quit
      • but i am unable to reproduce either way, carefully crafted case or fast random clicking :/, will look into again if more such errors come up.