#musicbrainz-devel

      • kurtjx joined the channel
      • kepstin joined the channel
      • ocharles
        adhawkins-away: sounds like you're making progress!
      • reoafk joined the channel
      • kurtjx joined the channel
      • reoafk joined the channel
      • Freso joined the channel
      • ianmcorvidae attempts to install elasticsearch
      • Prophet5 joined the channel
      • ianmcorvidae
        cool, got ingestr running
      • I mean, it doesn't actually *do* anything since I have no data, but :P
      • nikki
        get some data then!
      • reosarevok
        You have the IA data!
      • Use it :p
      • ianmcorvidae
        at present it really only supports some datasets I don't have :P but yes, that'll be my next step
      • reosarevok
        ianmcorvidae: while you're at it, it could probably use some option to add releases in low quality by default :p
      • ianmcorvidae
        initially, it won't support adding releases at all :P it needs to get data into its own DB before it can worry about getting it into MB's
      • reosarevok
        Ooh
      • cool
      • Import the Naxos Music Library :p
      • Or the BIS site, or Chandos or Hyperion or or or
      • So many things I'd love to see ingested...
      • reosarevok goes ingest some chips for now
      • ianmcorvidae
        I think I'll start with the IA data :P
      • reosarevok
        Sure, sure
      • Ingest it all!
      • does it have any options to find dupes?
      • (from multiple datasets)
      • I guess it doesn't need them yet, but it will at some point
      • ianmcorvidae
        that'll be your job :P
      • the point here is to ingest things and then have it be a data source for editors
      • with the ultimate goal being finding mappings so we can create importers
      • (but also getting some mappings between datasets (including to/from our data) in the process)
      • reosarevok
        ianmcorvidae: sure, but if we import, say, the IA data and data from one label, it would be great if it could tell "heeey, these look the same!"
      • ianmcorvidae
        that might happen eventually, yeah
      • ultimately the hope would be that we'd have mappings for both datasets and we'd be able to be like "hm so these look the same in this normalized form"
      • kurtjx joined the channel
      • reosarevok
        warp: http://tickets.musicbrainz.org/browse/MBS-5540 could you look into this when you have the time?
      • It's... fairly annoying
      • Prophet5 joined the channel
      • Prophet5 joined the channel
      • kepstin-laptop joined the channel
      • Freso joined the channel
      • kepstin-laptop joined the channel
      • Prophet5 joined the channel
      • Leftmost joined the channel
      • Prophet5 joined the channel
      • adhawkins
        ocharles: Ping
      • I didn't download the edit file (just editors). Could it be that? I've never bothered with that one previously.
      • ianmcorvidae
        yeah, there's a bug for that, there's some stuff in the CAA foreign keys file that makes the CAA dump potentially depend on the edit dump
      • workaround is to comment out the relevant foreign key in the file, or to get the edit dump (I'd recommend the former, personally)
      • adhawkins
        Which file do I edit?
      • ianmcorvidae
        admin/sql/caa/CreateFKConstraints.sql lines 12-15
      • adhawkins
        Just comment them out?
      • ianmcorvidae
        yeah
      • adhawkins
        Ok, while you're here, you're responsible for the code that generates the dumps aren't you?
      • ianmcorvidae
        a reminder, since I don't remember how often you use SQL: '--' starts a comment in SQL, not '#'
      • adhawkins
        'Never', so thanks for the reminder :)
      • ianmcorvidae
        inasmuch as I'm in some sense responsible for all the code, at least :)
      • well, there's a comment up at the top of the file which might have also reminded you, but :)
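For anyone who would rather script the workaround than edit the file by hand, here is a minimal sketch in Python. It assumes only what was said above: the lines to disable are identified by number, and SQL comments start with `--`. The function name `comment_out_sql_lines` is made up for illustration and is not part of the MusicBrainz server code.

```python
def comment_out_sql_lines(sql_text: str, first: int, last: int) -> str:
    """Prefix lines first..last (1-indexed, inclusive) with the SQL comment marker.

    Lines that are already commented out are left alone, so running the
    function twice gives the same result as running it once.
    """
    lines = sql_text.splitlines(keepends=True)
    # Clamp the end of the range so a short file does not raise IndexError.
    for i in range(first - 1, min(last, len(lines))):
        if not lines[i].lstrip().startswith("--"):
            lines[i] = "-- " + lines[i]
    return "".join(lines)


if __name__ == "__main__":
    # Hypothetical stand-in text; the real file contents are not shown in the log.
    sample = "line one\nline two\nline three\nline four\n"
    print(comment_out_sql_lines(sample, 2, 3), end="")
```

To apply it to the real file, read admin/sql/caa/CreateFKConstraints.sql, pass the text with the range 12 to 15, and write the result back. Check that the line numbers still match your checkout first, since they may shift between versions.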
      • adhawkins
        I'm considering knocking up a script that will download the latest dump (optionally including the edits file).
      • It'd be a bit easier if instead of a 'latest is xxx' file, there was a file called 'latest', whose contents contained the path.
      • Then you just wget 'latest', look in the file and wget the rest.
      • Or can wget do 'ftp://blah/latest-*'?
      • ianmcorvidae
        yeah; I'm not sure why we do that the way we do
      • wget should be able to do wildcards with ftp urls, yes
      • of course at present the latest-is file really doesn't do anything except provide a filename you can parse, so :P
      • adhawkins
        Yeah, but I can do that at the shell if necessary.
      • I'll have a play.
      • ianmcorvidae
        perhaps make a ticket for the changing-the-format thing; I'd like to ask ruaok about it at least, but otherwise I don't see reason not to do that
      • adhawkins
        You could always have both so it's easier for users looking for the latest *and* scripts.
      • ianmcorvidae
        yeah, we'd keep the current latest-is files for compatibility
      • adhawkins
        Yep
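A rough sketch of how a script could use the proposed `latest` file, assuming its contents are just the name of the newest dump directory. Everything specific here is an assumption for illustration: the base URL is a placeholder, and the archive names in `BASE_FILES` and `EDIT_FILE` are guesses that should be checked against the actual dump directory.

```python
# Assumed archive names; verify against the real dump directory before use.
BASE_FILES = ["mbdump.tar.bz2", "mbdump-editor.tar.bz2"]
EDIT_FILE = "mbdump-edit.tar.bz2"


def dump_urls(base_url: str, latest_contents: str, include_edits: bool = False) -> list:
    """Build download URLs from the contents of a hypothetical 'latest' file."""
    dump_dir = latest_contents.strip().strip("/")
    if not dump_dir:
        raise ValueError("the 'latest' file is empty")
    names = BASE_FILES + ([EDIT_FILE] if include_edits else [])
    root = base_url.rstrip("/") + "/" + dump_dir + "/"
    return [root + name for name in names]


if __name__ == "__main__":
    # Placeholder host and timestamp, not real values.
    for url in dump_urls("ftp://example.org/fullexport", "20130101-000000\n", include_edits=True):
        print(url)
```

The edit dump is opt-in here, matching the "optionally including the edits file" idea above. The download step itself is left out, since any tool (wget included) can take the resulting URLs.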
      • Prophet5 joined the channel
      • Ok, re-running the import. Shame it gets almost to the end before failing :)
      • What category should that dump ticket be in?
      • ianmcorvidae
        uh
      • unsure
      • adhawkins
        I'll put it in server for now, someone can move it if necessary.
      • ianmcorvidae doesn't have our list of components memorized :)
      • ianmcorvidae
        oh, you mean which project? yeah, MBS
      • adhawkins
        Sorry :)
      • Misc Features?
      • ianmcorvidae
        nah, my fault, should go to bed soon
      • adhawkins
        Scripts?
      • ianmcorvidae
        eh, don't bother with a component for now, none of them look particularly correct
      • adhawkins
        MBS-5541
      • mb-chat-logger
      • ianmcorvidae
        great, thanks
      • djce joined the channel
      • Freso joined the channel
      • djce joined the channel
      • adhawkins
        ocharles: Ok, data imported, but nothing listening in the VM on port 5000. How do I start up the server?
      • icrazyhack joined the channel
      • Ah, think the previous provision might actually have failed.
      • Will start again (again!) :)
      • warp
        woah. I'm late.
      • nikki
        adhawkins, ianmcorvidae: djce might know, since I'm pretty sure he's the one who created it originally
      • adhawkins
        nikki: This is ocharles' new fab and groovy auto-creating VM based on Vagrant and Chef...
      • nikki
        adhawkins: I mean the "latest" file
      • adhawkins
        Ah I see :)
      • Crossed conversations. Anyway, there's a ticket in place for discussion now.
      • nikki
        I do remember that it's not a symlink, in case the symlink changes halfway through someone downloading
      • but I don't remember why it's latest-$timestamp and not latest containing the timestamp...
      • probably just that it was done the former way before anyone realised it wasn't the most optimal way
      • warp
        it is the optimal way in the sense that you don't need to perform an extra request
      • nikki
        oh?
      • warp
        (ok, if you hit latest instead of the index it would be the same amount of requests :)
      • Prophet5 joined the channel
      • nikki
        hm, it looks like my code for getting the timestamp of the latest dump is 8 lines of code when I could do it in 1 if the latest file contained the timestamp (well, 9 and 2 respectively if you count importing the relevant modules)
      • warp
        wget -x -m -np `lynx -dump 'http://ftp.musicbrainz.org/pub/musicbrainz/data/fullexport/' | grep 'http://' | awk '{ print $2 }' | grep 'latest-is' | sed 's/latest-is-//'`
      • that works, though is ugly.
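The parsing half of that pipeline can be sketched in Python as below, under the assumption that the directory index contains an entry named `latest-is-<timestamp>` and that the timestamp looks like `YYYYMMDD-HHMMSS`. That format is assumed for illustration, so adjust the pattern if the real names differ. Fetching the index is left out on purpose so the function can be tried on any text.

```python
import re

# Assumed naming scheme: latest-is-YYYYMMDD-HHMMSS.
LATEST_RE = re.compile(r"latest-is-([0-9]{8}-[0-9]{6})")


def latest_dump_name(listing: str):
    """Return the dump directory name found in an index listing, or None."""
    match = LATEST_RE.search(listing)
    return match.group(1) if match else None


if __name__ == "__main__":
    # Made-up listing fragment; the timestamp is a placeholder.
    fake_index = '<a href="latest-is-20130101-000000">latest-is-20130101-000000</a>'
    print(latest_dump_name(fake_index))
```

This returns only the directory name; joining it onto the fullexport URL gives the same target the `sed` step produces in the one-liner above.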
      • we should just build downloading a full-export into our database provisioning tools :)
      • nikki
        that doesn't help if that's not what you're trying to do
      • e.g. the code I just looked at does not come from a script for downloading a full dump
      • adhawkins
        Great minds eh? :)
      • nikki
        well, I just copied it from trac :P
      • but I do agree
      • adhawkins
        ./admin/InitDb.pl --createdb --import
      • Whoops
      • ocharles
        morning
      • adhawkins
        Morning (just!)
      • ocharles
        ya, just...
      • adhawkins
        Oh god, my import is failing again...
      • Any ideas?
      • nikki
        it looks like there's something wrong with the settings for the postgres user
      • but I've not seen that before, so I'm not entirely sure
      • adhawkins
        Grr...it's been working before.
      • One thing after another! :)
      • nikki
        that's odd then :/
      • adhawkins
        Story of my life :)
      • Ok, blow the whole thing away and start again (again!)
      • kurtjx joined the channel
      • ocharles: When you've got a mo, I'd like to talk to you about a few ideas I've got for the VM, and discuss workflow.
      • ocharles
        sure thing
      • let's talk :)
      • adhawkins: you don't need to start again I wouldn't have thought
      • really 'vagrant provision' should get you to the same place
      • it seems that you just need to restart postgresql because something in provision isn't quite doing that correctly
      • adhawkins
        I think if it's already git cloned, it seems to throw an error.
      • Ah, restarting postgres, I'll remember that next time :)
      • What do you want to cover first, workflow? Or ideas?
      • ocharles
        whichever is easiest for you
      • adhawkins
        Workflow then.
      • I've cloned your top level repo, and your cookbooks one.