#metabrainz

/

      • alastairp
        ishaanshah[m]: you can add the build option back in, copying how it is in acousticbrainz-server's test.sh
      • ishaanshah[m]
        alastairp: Sorry my client got disconnected
      • The test.sh and integration-test.sh work without having to be built if volume has been mounted
      • So it would be better to not have build step right?
      • As it slows down the process considerably
      • Chinmay3199 joined the channel
      • alastairp
        no - it's better to build. because at the moment, if I add a new dependency to requirements.txt there is no way to rebuild the image
      • my initial idea when I wrote the script was to have 2 modes: 1 which would build+setup+bring up db+test+take down everything, so that you could just run one command have have it do everything
      • and a second set of commands that let you build and bring up manually, so that as you say you can specify a specific test to run and not wait for the other steps
      • but this requires you to remember what steps you need to run based on what changes you make
      • I'm open to suggestions about a better way of structuring this command to make it better
      • ishaanshah[m]
        alastairp: We can maybe add a -b which builds the containers only, Similar to frontend-tester.sh
      • Sorry for the late reply, matrix is acting wierd
      • alastairp
        ishaanshah[m]: yes, exactly. however, it should also build when running with no arguments
      • if there are 3 test scripts in listenbrainz I strongly suggest that you think about unifying them
      • shivam-kapila
        ishaanshah[m]: I am doing the -b one. I will push in a while
      • alastairp: You suggest that test containers should be rebuilt every time we run test.sh?
      • ishaanshah[m]
        Ok, maybe we could do -nb which will skip building
      • alastairp
        shivam-kapila: yes
      • ishaanshah[m]
        So that we will be conscious about what we are doing
      • shivam-kapila
        Yeah -nb makes sense
      • alastairp
        I don't think -nb is useful
      • do you understand what I said about the two modes of running tests?
      • ishaanshah[m]
        Yes, I understood but was thinking about how to skip that part
      • shivam-kapila
        Yeah the build and up and run for all tests
      • And the 2nd one for specific test run
      • ishaanshah[m]
        Because waiting each time after a small change is a bit frustrating
      • shivam-kapila
        Yes. For that I agree with -nb.
      • Mostly there is no change in requirements.txt
      • alastairp
        but if you follow the behaviour currently in test.sh you can already do this
      • $ ./test.sh -u
      • $ ./test.sh path/to/testfile.py # as many times as you need to
      • this will not build before running tests
      • shivam-kapila
        Oh yes -u makes sense. By default if the db is up it means we can skip build
      • alastairp
        👍 exactly
      • the idea is to make a standalone call to test.sh do the right thing. and I believe that the right thing here is to build the image
      • ishaanshah[m]
        Ok, makes sense
      • shivam-kapila
        > $ ./test.sh path/to/testfile.py
      • Does AB have this currently?
      • alastairp
        yes
      • shivam-kapila
        If will be good to be in sync with other projects IMO
      • alastairp
        I wondered how we could share a single test.sh file over all projects, but I don't know the best way to do this
      • ishaanshah[m]
        I think that functionality has not been implemented in integration-test.sh yet
      • shivam-kapila
        The build one?
      • ishaanshah[m]: ^
      • ishaanshah[m]
        the -u one
      • shivam-kapila
        Hmm... integration-test.sh doesnt have any options AFAIK
      • Its a fixed set of commands
      • ishaanshah[m]
        yes, are you adding it?
      • shivam-kapila
        Not currently. I am not sure about its need for integration tests.
      • alastairp: Can you offer an opinion for the same?
      • ishaanshah[m]
        To skip the build part?
      • alastairp
        it depends on how quickly you need feedback on those test runs
      • if integration tests are _only_ run in CI, it probably doesn't matter
      • if you need to run a single integration test yourself many times, perhaps it's a good idea
      • ishaanshah[m]
        Oh, I was adding a integration_test
      • shivam-kapila
        ishaanshah[m]: Its not a skip actually. You will need to run -u always so that the build is done once
      • ishaanshah[m]
        Yes, but doing it once is fine
      • alastairp
        keep in mind that -u in acousticbrainz and listenbrainz does _not_ build
      • it only starts the database server
      • but perhaps that's a good idea
      • shivam-kapila
        But its good to rebuild to be consisent with everytime build thing
      • ishaanshah[m]
        I didn't know that integration tests are run in CI only
      • I guess its not needed then
      • alastairp
        shivam-kapila: yeah, perhaps that's a good idea
      • shivam-kapila
        ishaanshah[m]: Actually you should run them on local setup too
      • alastairp
        ishaanshah[m]: I didn't say that
      • I don't know how integration tests are used
      • shivam-kapila
        ishaanshah[m]: To run tests in a file you can add the path in integration-test.sh
      • ishaanshah[m]
        Yes, I did that, the only thing that was bugging me was the time to get feedback
      • shivam-kapila
        Yes it buulds everytime
      • builds*
      • We can structure integration-tesh.sh to be similar to test.sh. I think we should wait for iliekcomputers' opinion too
      • ishaanshah[m]
        Yes, I will just comment the build part in my local setup for now
      • shivam-kapila
        Lol. Workarounds
      • ishaanshah[m]
        Thanks for the -u tip alastairp. I didn't know that :D
      • I will add back the build command in test.sh then
      • shivam-kapila
        ishaanshah[m]: Can I continue with that. I was already adding a build part to it.
      • ishaanshah[m]
        ya sure
      • alastairp
        ishaanshah[m]: perhaps this should be added to some documentation, to make it clearer that that this functionality exists
      • ishaanshah[m]
        thanks
      • shivam-kapila
        -u is in docs alastairp
      • Here the flags are listed
      • ruaok
      • this is output from my timescale import program here: https://github.com/mayhem/timescale-testing/blo...
      • this is a subset of all listens from a feb dump that includes iliekcomputers, rob and zastai.
      • zastai's duplicates are not in there yet, but the last.fm fuzzy type dupes (off by ~1 sec) and straight up duplicates are handled
      • the output shows which listens where identified as duplicate, which one was chosen/rejected and the diff between the two listens.
      • the duplicate detection logic is here:
      • alastairp
        ah right - this is cases where there are 2 sources, where the only thing that differs is the timestamp is off by a second, so you choose only one?
      • s/2 sources/2 listens/
      • ruaok
        if could use a second set of eyes (or six) to sanity check this duplicate handling code.
      • alastairp: that is the fuzzy case, yes.
      • alastairp
        that's neat
      • ruaok
        but there are also other duplicates that we bodged into influx that we should clean up.
      • I am waiting for a new dump to take care of the latest causes for duplicate data.
      • but in the meantime, I'd appreciate looking at this to see if it makes sense.
      • there are some miniscule stats at the bottom of the HTML file.
      • alastairp
        and those bodges were the ones that someone was talking about a few days ago - exactly the same, but some additional flag in influx to be able to add them to stop a conflict?
      • shivam-kapila
        Yeah Zastai was talking about it
      • ruaok
        alastairp: yes.
      • shivam-kapila: no, those are different.
      • those are the ones I am still waiting data on
      • shivam-kapila
        Oh sorry then
      • ruaok
        np.
      • zastai's data had no duplicates as of feb 2020.
      • Freso: you also had a ton of dups in your stream, yes
      • ?
      • anyone who has had dups and would like them cleaned up, should provide me and example ASAP.
      • alastairp
        ok, I'll have a look through the code. to confirm, the lookahead is you going forward in the listens a certain number of seconds in order to see if there are any dups?
      • did ollie have some?
      • Freso
        ruaok: I think I might, but those also exist in Last.fm. Not sure if I have new/unique ones in LB.
      • alastairp
        or was his case that he was sending prod listens to alpha and wanted to keep them?
      • Freso
        Not sure when or how old they are though.
      • ruaok
        alastairp: correct.
      • correct to the first question, re lookahead.
      • alastairp
        ruaok: cool, will look through it later this afternoon. goti t
      • ruaok
        alastairp: great thanks.
      • Freso: I'll include you in the next test run and then we'll see.
      • ollie as in acid2?
      • Freso
        Alright. 👍
      • alastairp
        yes, acid2
      • ruaok
        k
      • alastairp
        I remember us doing something special for him, but I think it was just that we copied listens from alpha to beta, even when we said that we wouldn't
      • shivam-kapila
        This might be reason for influx having mixed shards. For some users that doesnt allow to delete listens
      • iliekcomputers
        the dump died for some reason
      • there's an error in sentry
      • :/
      • ruaok
        doh
      • alastairp
        btw, sentry 10 is out too. I have to upgrade it at the uni too
      • ruaok
        alastairp: shall we chat about the recommendation stuff for a minute?
      • alastairp
        sure, let me open it
      • what kind of chat do you want to have?
      • ruaok
        overall approach.
      • I read and disgested large chunks of the music recommendation chapter. a really nice summary, I must say.
      • I didn't pay a whole lot of attention to the context focused recommendations, since we have very little of that data.
      • I have two goals with all of this:
      • alastairp
        they were talking about rewriting it for a new edition of the book, covering some new tools, that will be good
      • ruaok
        1. Create a project that will open up people to come in an play with recommendations
      • 2. Draw people in to improve the data sets that feed into the tools that are being created in order to create the recommendations. General outreach.
      • from the high level perspective if our goal was to get the best recommendations, we should be focusing on the collaborative filtering stuff.
      • alastairp
        so initial key deliverables are going to be 1) get data for us and others to use, 2) make some demos using that data, 3) enhance the data as needed as we see that we need more stuff
      • ruaok
        however, that is in a lot of ways, not low hanging fruit -- there is a lot of stuff that still needs to be done and pristine__'s moment has stalled and her contract is up soon.
      • alastairp
        right, I think getting the initial use-cases clear is an important first step
      • ruaok
        yes, more or less those deliverables are about right.
      • yes, and that might be a good goal for this convo:
      • alastairp
        that is, a general "you might want to listen to this" is quite different imo from "here's a playlist tailored to you based on some criteria that you specified"