#metabrainz

      • yef has quit
      • 2021-05-19 13905, 2021

      • yef joined the channel
      • 2021-05-19 13905, 2021

      • yef has quit
      • 2021-05-19 13905, 2021

      • yef joined the channel
      • 2021-05-19 13948, 2021

      • MRiddickW joined the channel
      • 2021-05-19 13905, 2021

      • milkii has quit
      • 2021-05-19 13938, 2021

      • milkii joined the channel
      • 2021-05-19 13953, 2021

      • thomasross has quit
      • 2021-05-19 13925, 2021

      • milkii has quit
      • 2021-05-19 13909, 2021

      • milkii joined the channel
      • 2021-05-19 13918, 2021

      • BrainzGit
        [listenbrainz-server] amCap1712 opened pull request #1473 (master…fix-ftp-check): Fix missing imports to get check_ftp_dump_ages working https://github.com/metabrainz/listenbrainz-server…
      • 2021-05-19 13927, 2021

      • _lucifer
        ruaok: alastairp: dumps are broken. so was the ftp check. the email you received was me testing the above fix for the dump age check.
      • 2021-05-19 13921, 2021

      • _lucifer
        regarding why dumps are broken this time, there was a permission error while creating temporary files
      • 2021-05-19 13949, 2021

      • _lucifer
        `mktemp: failed to create directory via template ‘/mnt/dumps/tmp/archives/incremental.XXXXXXXXXX’: Permission denied`
      • 2021-05-19 13924, 2021

      • _lucifer
      • 2021-05-19 13945, 2021

      • _lucifer
        maybe the storage box was mounted as the incorrect user?
      • 2021-05-19 13952, 2021

      • _lucifer
        as it shows the UID and GID as 0 for the directory, but we specifically create a lbdumps user with 900 UID and GID to create the dump files.
      • 2021-05-19 13932, 2021
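
The ownership symptom described above (UID and GID 0 on the directory instead of the lbdumps user's 900) could be caught before a dump starts. A minimal sketch, assuming a pre-flight check is wanted at all; the function name `ownership_mismatch` is hypothetical and not part of the dump scripts:

```python
import os

def ownership_mismatch(path, expected_uid, expected_gid):
    """Return a description of the mismatch, or None if ownership is as expected."""
    st = os.stat(path)
    if (st.st_uid, st.st_gid) != (expected_uid, expected_gid):
        return (f"{path}: owned by {st.st_uid}:{st.st_gid}, "
                f"expected {expected_uid}:{expected_gid}")
    return None
```

With the values from this log, `ownership_mismatch("/mnt/dumps", 900, 900)` would return a message when the mount point is owned by root and `None` when it is owned by lbdumps.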

      • _lucifer
        alastairp, we are already adding sentry in the startup improvement PR. How about we add sentry to check cron status as well? maybe tee the output to it and check exit status or check the log files periodically for errors?
      • 2021-05-19 13928, 2021

      • _lucifer
        or we could make it a part of the dump script
      • 2021-05-19 13913, 2021
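
One way to read the "tee the output and check exit status" idea is a small wrapper around the cron command. A minimal sketch under that assumption; `run_and_report` and its `report` callback are hypothetical, and the callback stands in for whichever hook is chosen (Sentry, metrics, or mail):

```python
import subprocess
import sys

def run_and_report(cmd, report=print):
    """Run cmd, echo its combined output, and call report() on a non-zero exit."""
    proc = subprocess.run(cmd, stdout=subprocess.PIPE,
                          stderr=subprocess.STDOUT, text=True)
    sys.stdout.write(proc.stdout)  # keep the output visible in cron's own log
    if proc.returncode != 0:
        # Send only the tail of the output to keep the report small.
        report(f"{cmd!r} exited with {proc.returncode}:\n{proc.stdout[-2000:]}")
    return proc.returncode
```

The wrapper returns the child's exit status, so cron still sees the failure even if reporting itself goes wrong.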

      • _lucifer
        Mr_Monkey: should the ReportUser button be a separate component or should i just put it in the UserPageHeading component?
      • 2021-05-19 13932, 2021

      • rdswift has quit
      • 2021-05-19 13920, 2021

      • rdswift joined the channel
      • 2021-05-19 13931, 2021

      • ruaok
        mooin!
      • 2021-05-19 13941, 2021

      • _lucifer
        morning!
      • 2021-05-19 13957, 2021

      • ruaok
        _lucifer: I guess it didn't get remounted correctly after reboot. when I mounted it by hand I made sure to use the right user.
      • 2021-05-19 13907, 2021

      • ruaok
        gid 0 is deffo wrong.
      • 2021-05-19 13955, 2021

      • _lucifer
        yeah, thought so. nice that we'll be able to get rid of the box after the upgrade.
      • 2021-05-19 13912, 2021

      • ruaok
        my thoughts exactly. we don't need another vector for failures
      • 2021-05-19 13947, 2021

      • ruaok
        the storage box failed to mount entirely, that is the problem.
      • 2021-05-19 13910, 2021

      • _lucifer
        ah! makes sense.
      • 2021-05-19 13931, 2021

      • ruaok
        fixed.
      • 2021-05-19 13943, 2021

      • ruaok
        `mount.cifs -o iocharset=utf8,rw,credentials=/etc/backup-credentials.txt,uid=lbdumps,gid=lbdumps,file_mode=0660,dir_mode=0770 //u209105.your-storagebox.de/backup /mnt/dumps`
      • 2021-05-19 13919, 2021

      • _lucifer
        that's one scary looking command :)
      • 2021-05-19 13931, 2021

      • _lucifer
        should we trigger an incremental dump manually to verify?
      • 2021-05-19 13942, 2021

      • ruaok
        yes, but...
      • 2021-05-19 13953, 2021

      • ruaok
        let's check what incremental dump version we're on.
      • 2021-05-19 13957, 2021

      • alastairp
        hi _lucifer
      • 2021-05-19 13900, 2021

      • alastairp
        (and everyone else)
      • 2021-05-19 13901, 2021

      • _lucifer
        hi!
      • 2021-05-19 13903, 2021

      • alastairp
        great sleuth work _lucifer
      • 2021-05-19 13944, 2021

      • ruaok
        if we skipped one (or more) then we ought to roll back the dump ID. otherwise the incremental won't import and we'd need to do another full dump and wait another 24 hours.
      • 2021-05-19 13927, 2021

      • ruaok
        full and incremental dump 440 are on the ftp site.
      • 2021-05-19 13937, 2021

      • alastairp
        _lucifer: instead of cron, we could probably get away with using metrics
      • 2021-05-19 13941, 2021

      • ruaok
        if next dump ID is 441, then let 'er rip
      • 2021-05-19 13943, 2021

      • alastairp
        uh, instead of sentry
      • 2021-05-19 13958, 2021

      • _lucifer
        ah right! that works too :D
      • 2021-05-19 13917, 2021

      • ruaok
        > 440 | 2021-05-16 11:41:50+00
      • 2021-05-19 13928, 2021

      • ruaok
        is the last dump in the dump_table. good to go to create an incremental, _lucifer
      • 2021-05-19 13955, 2021
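
The check described here compares the last dump ID recorded in the dump table with the last one published on the FTP site: if the table is ahead, the next incremental would leave a gap and fail to import. A minimal sketch of that arithmetic; `skipped_dump_ids` is a hypothetical helper, not a function in listenbrainz-server:

```python
def skipped_dump_ids(last_id_in_table, last_id_on_ftp):
    """IDs recorded in the dump table that never reached the FTP site."""
    return list(range(last_id_on_ftp + 1, last_id_in_table + 1))
```

With the numbers from this log, `skipped_dump_ids(440, 440)` is empty, which matches the "good to go" conclusion. A non-empty result would mean the dump ID has to be rolled back first.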

      • ruaok
        if it fails, but the data is generated and the dump id is increased, we have to copy the data by hand, ok?
      • 2021-05-19 13937, 2021

      • _lucifer
        oh! didn't know that.
      • 2021-05-19 13933, 2021

      • _lucifer
        i'll trigger the dump. let's see how it goes.
      • 2021-05-19 13923, 2021

      • alastairp
        I have a load of washing on, I'll make my way to the office when it's finished
      • 2021-05-19 13918, 2021

      • ruaok is waiting for cocktail robot parts from CA. will come in after they arrive
      • 2021-05-19 13934, 2021

      • _lucifer
        seems it failed again
      • 2021-05-19 13947, 2021

      • _lucifer
        mktemp: failed to create directory via template ‘/mnt/dumps/tmp/archives/incremental.XXXXXXXXXX’: No such file or directory
      • 2021-05-19 13917, 2021

      • ruaok
        ah yes, I have a bug open for the fact that it doesn't create the needed subdirs.
      • 2021-05-19 13928, 2021

      • ruaok
        then I looked at the code and scratched my head.
      • 2021-05-19 13929, 2021

      • _lucifer
        missing a -p?
      • 2021-05-19 13947, 2021

      • ruaok
        not so fast. mktemp, not mkdir!
      • 2021-05-19 13904, 2021

      • ruaok
      • 2021-05-19 13913, 2021

      • _lucifer
        ah ok, should i just create the directories manually for now?
      • 2021-05-19 13920, 2021

      • kloeri has quit
      • 2021-05-19 13920, 2021

      • Freso has quit
      • 2021-05-19 13904, 2021

      • ruaok
        I did that last time and shit went haywire again. I think we should not repeat that mistake and fix this properly.
      • 2021-05-19 13912, 2021

      • ruaok
        let me come up with a plan.
      • 2021-05-19 13941, 2021

      • ruaok
        ah yes, it's the TMP_DIR and TMPDIR fukkery
      • 2021-05-19 13942, 2021

      • _lucifer
        we have a few fixes to release wrt cron today anyways.
      • 2021-05-19 13905, 2021

      • _lucifer
        so we can release these together and test the dump then
      • 2021-05-19 13910, 2021

      • ruaok
        k.
      • 2021-05-19 13928, 2021

      • kloeri joined the channel
      • 2021-05-19 13942, 2021

      • ruaok
        line 88 confounds me, honestly.
      • 2021-05-19 13928, 2021

      • ruaok
        TEMP_DIR="/mnt/dumps/tmp/archives"
      • 2021-05-19 13949, 2021

      • _lucifer
        this can be changed to `/mnt/dumps`. my understanding is that it holds only temporary data?
      • 2021-05-19 13945, 2021

      • ruaok
        let's not make any more changes than needed. let's keep TEMP_DIR as is, but not use mktemp
      • 2021-05-19 13957, 2021

      • ruaok
        since it doesn't seem to function as we expect.
      • 2021-05-19 13953, 2021
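
The second failure ("No such file or directory") happens because mktemp creates only the final directory and does not create missing parents. A minimal sketch of the general remedy, which is to create the base path first and only then ask for a unique directory under it. This is an illustration in Python, not the change made in the dump script, and `make_dump_tmpdir` is a hypothetical name:

```python
import os
import tempfile

def make_dump_tmpdir(base, prefix="incremental."):
    """Create base (and any missing parents), then a unique directory inside it."""
    os.makedirs(base, exist_ok=True)  # same effect as `mkdir -p`
    return tempfile.mkdtemp(prefix=prefix, dir=base)
```

Called as `make_dump_tmpdir("/mnt/dumps/tmp/archives")`, it would succeed even when `tmp/archives` does not exist yet. It would still fail with a permission error if the mount is owned by the wrong user.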

      • ruaok tests his fix
      • 2021-05-19 13919, 2021

      • BrainzGit
        [listenbrainz-server] mayhem opened pull request #1474 (master…fix-missing-dump-tmp-dir): Replace mktemp with something we can control better. https://github.com/metabrainz/listenbrainz-server…
      • 2021-05-19 13929, 2021

      • sumedh joined the channel
      • 2021-05-19 13937, 2021

      • ruaok
        alastairp: _lucifer: ^^ the changed code fragments were tested in an out-of-body script, since it is hard to test in-script. but that ought to work.
      • 2021-05-19 13925, 2021

      • Mr_Monkey
        _lucifer: > should the ReportUser button be a separate component
      • 2021-05-19 13925, 2021

      • Mr_Monkey
        I'd say so, yes. That will save some time if (when?) we refactor that code
      • 2021-05-19 13944, 2021

      • _lucifer
        👍
      • 2021-05-19 13909, 2021

      • BenOckmore joined the channel
      • 2021-05-19 13910, 2021

      • ruaok
        _lucifer: with the recent BU changes what happens when a redis key is set with no expiry time?
      • 2021-05-19 13947, 2021

      • ruaok
        I didn't set one and my metric values are written to redis, but they disappear immediately. I can't find any running metric-writers and I've stopped the one that is supposed to be running.
      • 2021-05-19 13949, 2021

      • _lucifer
        ruaok, you always have to set an expiry time.
      • 2021-05-19 13909, 2021

      • ruaok
        why did it used to work??
      • 2021-05-19 13921, 2021

      • ruaok
        `cache._r.rpush(REDIS_METRICS_KEY, metric)`
      • 2021-05-19 13923, 2021

      • _lucifer
        the `set` used to have a default for time to be 0. we removed the default, now the user has to pass a time explicitly. you can pass 0 if you want no expiry.
      • 2021-05-19 13954, 2021

      • ruaok
        I'm not using set. I'm using rpush directly.
      • 2021-05-19 13911, 2021

      • _lucifer
        ah you are using `rpush` so the redis library directly. the BU changes shouldn't affect you then
      • 2021-05-19 13916, 2021

      • ruaok
      • 2021-05-19 13932, 2021

      • ruaok
        ok, then I need to continue hunting down what is eating my data. :(
      • 2021-05-19 13935, 2021

      • _lucifer
        you should see whatever redis's default behaviour see
      • 2021-05-19 13946, 2021

      • _lucifer
        *behaviour is
      • 2021-05-19 13951, 2021

      • ruaok
        k
      • 2021-05-19 13947, 2021

      • reosarevok
        Mr_Monkey: re: gsoc blog post, English doesn't have spaces before exclamation marks! ;)
      • 2021-05-19 13922, 2021

      • _lucifer
      • 2021-05-19 13940, 2021

      • _lucifer
        ruaok: currently i see one item in metrics list ^
      • 2021-05-19 13925, 2021

      • ruaok
        and now its gone.
      • 2021-05-19 13934, 2021

      • _lucifer
        yes just saw that
      • 2021-05-19 13949, 2021

      • _lucifer
        weird.
      • 2021-05-19 13921, 2021

      • _lucifer
        when will the next value be written?
      • 2021-05-19 13957, 2021

      • alastairp
        ruaok: what are you using to read the value from redis after you've pushed it?
      • 2021-05-19 13929, 2021

      • outsidecontext has quit
      • 2021-05-19 13931, 2021

      • hugo___ has quit
      • 2021-05-19 13946, 2021

      • ruaok
        LLEN
      • 2021-05-19 13956, 2021

      • ruaok
        _lucifer: every 60 seconds.
      • 2021-05-19 13912, 2021

      • imdeni has quit
      • 2021-05-19 13920, 2021

      • ruaok
        see logs gaga @ listenbrainz-mbid-mapping-writer
      • 2021-05-19 13957, 2021

      • _lucifer
        ruaok: i ran redis monitor for ~2 mins. i see two different connections doing LPOP.
      • 2021-05-19 13901, 2021

      • _lucifer
        172.17.0.1:30140 and 172.17.0.1:26144
      • 2021-05-19 13921, 2021

      • ruaok
        how did you find that out?
      • 2021-05-19 13945, 2021

      • _lucifer
        log into lb-redis container, execute `redis-cli monitor`
      • 2021-05-19 13950, 2021

      • Mr_Monkey has quit
      • 2021-05-19 13959, 2021

      • _lucifer
        it'll show you all commands redis has executed since then.
      • 2021-05-19 13906, 2021

      • _lucifer
        with ip and timestamp
      • 2021-05-19 13922, 2021

      • ruaok
        ohhh, nice!
      • 2021-05-19 13943, 2021

      • _lucifer
        so now we need to get the container from the port, ideas on how to do that?
      • 2021-05-19 13946, 2021

      • ruaok
        grr 172.17.0.1 ips.
      • 2021-05-19 13901, 2021

      • ruaok
        lsof, I suspect.
      • 2021-05-19 13919, 2021

      • ruaok
        which machine is doing this?
      • 2021-05-19 13902, 2021

      • _lucifer
        this output is from inside lb redis container on lemmy
      • 2021-05-19 13955, 2021

      • ruaok
        because each of our machines has 172.17.0.1 as an IP -- its a docker IP.
      • 2021-05-19 13956, 2021

      • ruaok
        and lsof shows nothing.
      • 2021-05-19 13900, 2021

      • alastairp
        netstat should show what's on the other side of 172.17.0.1:30140
      • 2021-05-19 13911, 2021

      • alastairp
        but I suspect it'll only show up while the connection is open
      • 2021-05-19 13928, 2021

      • alastairp
        if it connects, gets some data, then disconnects it'll go away
      • 2021-05-19 13936, 2021

      • ruaok
        yep.
      • 2021-05-19 13941, 2021

      • alastairp
        (I just grepped for it [directly on gaga] and didn't get anything)
      • 2021-05-19 13952, 2021

      • ruaok
        nothing on lemmy either.
      • 2021-05-19 13957, 2021

      • alastairp
        so next solution is tcpdump :(
      • 2021-05-19 13904, 2021

      • ruaok
        and I can't explain how TWO ips are doing this.
      • 2021-05-19 13913, 2021

      • ruaok
        there are NO metric writers running at all.
      • 2021-05-19 13920, 2021

      • ruaok
        there should be 0 ips.
      • 2021-05-19 13951, 2021

      • ruaok packs up and heads to the office
      • 2021-05-19 13954, 2021

      • _lucifer
        can you take down metric-writer?
      • 2021-05-19 13911, 2021

      • ruaok
        it is down!
      • 2021-05-19 13919, 2021

      • alastairp
        _lucifer: where is the lb-redis container? lemmy?
      • 2021-05-19 13921, 2021

      • _lucifer
        oh! let me run monitor again
      • 2021-05-19 13926, 2021

      • _lucifer
        alastairp: yes
      • 2021-05-19 13942, 2021

      • ruaok
        I know of no running metric-writers.
      • 2021-05-19 13914, 2021

      • _lucifer
        1621416540.071864 [0 172.17.0.1:30416] "LPOP" "metrics:influx_data"
      • 2021-05-19 13918, 2021
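
Lines from `redis-cli monitor` have the shape quoted above: a timestamp, then `[db client-address]`, then the quoted command and its arguments. A minimal sketch for tallying which client addresses issue a given command, assuming the output has been captured as text lines; the names `MONITOR_LINE` and `clients_by_command` are hypothetical:

```python
import re
from collections import Counter

# timestamp [db client] "COMMAND" "arg" ...
MONITOR_LINE = re.compile(
    r'^(?P<ts>\d+\.\d+) \[(?P<db>\d+) (?P<client>[^\]]+)\] "(?P<cmd>[^"]+)"'
)

def clients_by_command(lines, command):
    """Count how often each client address issued the given command."""
    counts = Counter()
    for line in lines:
        m = MONITOR_LINE.match(line.strip())
        if m and m.group("cmd").upper() == command.upper():
            counts[m.group("client")] += 1
    return counts
```

Feeding it a couple of minutes of captured output with `command="LPOP"` would list the source ports doing the pops. As noted in the discussion, the address is still only the Docker bridge IP, so mapping a port back to a container needs a separate step.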

      • alastairp
        can we try and reproduce this process that you just did with the monitor?
      • 2021-05-19 13927, 2021

      • _lucifer
        it received an lpop just now
      • 2021-05-19 13948, 2021

      • alastairp
        ah, got it
      • 2021-05-19 13953, 2021

      • alastairp
        connection was still in time_wait
      • 2021-05-19 13954, 2021

      • alastairp
        172.17.0.6:6379
      • 2021-05-19 13909, 2021

      • alastairp
        although that might be redis
      • 2021-05-19 13915, 2021

      • ruaok
        that is the wrong end.