#musicbrainz

      • MBLogger
        MBLogger (~musicbra-@client3.fre.communitycolo.net) has joined #musicbrainz
      • djce tests the logger
      • djce
        djce ([R5qESSxZR@195.60.9.122) has joined #musicbrainz
      • djce ([0LywwsAd2@195.60.9.122) has joined #musicbrainz
      • JavaGeek
        make up your mind... :)
      • djce
        I'm testing!!! :-)
      • JavaGeek
        yeah, I figured that much...
      • what are you testing?
      • djce
        A bot which logs & archives this channel.
      • So we can post them on the web, like the mailing lists.
      • djce has left the channel
      • djce ([0LywwsAd2@195.60.9.122) has joined #musicbrainz
      • djce ([NIvltWZoA@195.60.9.122) has joined #musicbrainz
      • djce is away: I'm busy
      • ruaok
      • ruaok likes MBLogger watching over us. :-)
      • djce
        Yes, a rather handy out-of-the-box bit of kit from the w3c. Nice!
      • salisan
        salisan (~salisan@ai40.nuaccess.net) has joined #musicbrainz
      • Hello
      • djce
        Hi
      • MBLogger
        MBLogger (~musicbra-@client3.fre.communitycolo.net) has joined #musicbrainz
      • salisan
        How much bandwidth is the server limited to?
      • djce
        That's a question for Rob. ruaok, you there?
      • salisan
        Hmm, well. I suppose the best way to handle more traffic is to get some more webservers going. Finding bandwidth sponsors should be easier than finding money sponsors..
      • djce
        Adding mirroring capabilities is in the works.
      • Plus, we're always interested in finding willing FTP/WWW mirror servers.
      • djce is back (gone 01:25:48)
      • ruaok
        sorry, I was responding to email.
      • I'm trying to work down the email pile that has accumulated.
      • bandwidth limiting: we're on 10Mbps unswitched hub with a bunch of other servers.
      • It's very low-tech bandwidth limiting. :-)
      • intrep
        intrep (~intrep@dsl092-134-144.chi1.dsl.speakeasy.net) has joined #musicbrainz
      • djce
        Is that the limiting factor?
      • ruaok
        yes. :-)
      • djce
        Does CCCP want a donation for a 100Mb switch?
      • I guess I'd have to ask them that :-)
      • ruaok
        The problem is that CCCP has to pay for bandwidth spikes, and those can be VERY expensive.
      • Yes and no.
      • djce
        ?
      • ruaok
        They always need more hardware to keep things running, but the 10Mbps hub is there for a reason: bandwidth limiting.
      • intrep
        any plans to do incremental updates of the database db files like freedb does? :)
      • djce
        That would probably be an integral part of the "mirroring" project.
      • ruaok
        Currently the dataset is small enough (and will never grow to freedb sizes since we avoid duplication like the plague) to download in one batch.
      • djce
        So plans, certainly.
      • salisan
        It would be preferable if they had 10mbps switched ;)
      • ruaok
        Also unlike Freedb our changes are not as atomic....
      • I understand it would be nicer to have a 100Mbps switch.
      • But we can't afford to pay for the bandwidth that would get sucked up.
      • djce
        Is the CCCP co-lo free, unless you pass some threshold?
      • intrep
        any particular reasons you guys use postgresql native dumps instead of sql command dumps?
      • ruaok
        yes and no.
      • CCCP does not charge anything.
      • djce
        intrep: I'm working on alternative dump formats right now.
      • intrep
        it seems it might be easier to diff sql dumps
      • djce: awesome
      • salisan
        Ok, nothing is real cheap.
      • ruaok
        However it asks for $50/month/server
      • $25/month/virtual server.
      • NOTHING, really.
      • djce
        Currently I'm thinking: tab-separated, Excel CSV format, and MySQL SQL script.
      • ruaok
        Relatable covers our $50/month, so we're off the hook.
      • djce
        Any other formats you'd like to see? :-)
      • ruaok: I see
      • intrep
        sql command scripts
      • djce
        Mysql, yes I'm doing that.
      • salisan
        Is the database still mysql compatible or is that gone nowadays?
      • ruaok
        I really love the concept of CCCP.
      • mysql: no
      • djce
        MySQL doesn't do foreign keys, AFAIK.
      • ruaok
        or transactions.
      • arrays
      • salisan
        djce: Actually, innodb is supposed to have basic foreign key support now ..
      • djce
        The bottom line is that the database can be 99% loaded into MySQL, but you can't run a MB server off MySQL
      • (the missing 1% is, as ruaok says, array columns)
      • ruaok
        But I don't really like those anymore. :-)
      • They are too slow...
      • djce doesn't trust MySQL for anything too important
      • ruaok agrees with djce
      • salisan
        djce: It's good for selects.
      • djce
        As long as they are simple :-)
      • ruaok
        djce: Your parsing code was really close.
      • djce
        and you're not trying to INSERT at the same time!
      • parsing - yes, I saw your final diff.
      • intrep
        does mysql have finer than table grain locking yet?
      • salisan
        intrep: Yes, the InnoDB table type does
      • ruaok
        modpending, page) values ('Andy Breckman', 'Andy Breckman', '153ccb00-fcad-4c79-a178-207aae196a4c', 0, 83028003)'
      • djce
        InnoDB does, apparently. I've not used it yet though
      • ruaok
        I need to look into those errors....
      • djce
        Yes, I see those every now and then.
      • ruaok: did you notice that MBLogger also creates HTML and RDF logs?!
      • Very flash... :-)
      • intrep
        anyone worked on importing into oracle yet?
      • ruaok
        116860 TRMs collected
      • djce
        Not AFAIK
      • (oracle)
      • but if you have an Oracle server available, please feel free to give it a go.
      • ruaok
        djce: haven't looked. where does the output go?
      • salisan
        Can anyone actually afford Oracle? ;)
      • intrep
        i have to admit, im really impressed with the trm technology
      • i can afford oracle
      • so can you
      • sign up for the developer copy
      • :)
      • djce
        ruaok: MBLogger output is currently in /home/dave/irc-logger/irc-logs/musicbrainz
      • ruaok
        djce: nice.
      • djce
        With the capability to dump the data in multiple formats (which I'm doing now), the tab-sep format is particularly simple to parse, and therefore load into any other data store.
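
To illustrate why a tab-separated dump is easy to load elsewhere, here is a minimal Python sketch. The column list, the `\N` null marker (borrowed from PostgreSQL's COPY text format), and the sample rows are all assumptions for illustration; the log does not describe the actual dump layout.

```python
import csv
import io

# Hypothetical column layout; the real dump's columns are not given in the log.
COLUMNS = ["id", "name", "sortname", "gid"]

def parse_tab_dump(stream, columns=COLUMNS, null_marker="\\N"):
    """Yield one dict per row of a tab-separated dump."""
    # QUOTE_NONE: treat quote characters as ordinary data, split on tabs only.
    reader = csv.reader(stream, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) != len(columns):
            raise ValueError(f"expected {len(columns)} fields, got {len(row)}")
        # Map the assumed null marker to None, keep everything else as text.
        yield {
            col: (None if val == null_marker else val)
            for col, val in zip(columns, row)
        }

# Made-up sample rows (the ids and the second row are invented).
sample = io.StringIO(
    "1\tAndy Breckman\tAndy Breckman\t153ccb00-fcad-4c79-a178-207aae196a4c\n"
    "2\tUnknown\t\\N\t00000000-0000-0000-0000-000000000000\n"
)
rows = list(parse_tab_dump(sample))
```

Each resulting dict can then be handed to whatever insert routine the target data store provides.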
      • intrep
        salisan: check out www.oracle.com you can sign up and download oracle 8i and 9i for free, but youre not supposed to use it for anything production
      • djce
        Loading into Oracle should be trivial really
      • I might give that a try, when I have a few hours download window to spare :-)
      • It would be nice just to say that it can be done.
      • intrep
        yeah, postgres and oracle are fairly sql compatible
      • unlike some other databases i know <coughmysqlcough>
      • djce
        :-)
      • salisan
        hehe
      • Uptime: 991616 Threads: 187 Questions: 62557009 Slow queries: 799 Opens: 60166 Flush tables: 1 Open tables: 64 Queries per second avg: 63.086
      • djce
        cough cough msaccess cough splutter
      • salisan
        I like my mySql ;)
      • djce
        Maybe SQL server too. Anyone have that to play with?
      • ruaok
        EEEEK. Not that.
      • :-)
      • intrep
        and if you do have SQL server, how was last week for you? ;)
      • djce
        s/have/had/ ;-)
      • Somehow "dinosaurs" and "asteroid" come to mind :-)
      • salisan
        Running sql server queries over internet must be an interesting exercise after that mess ..
      • djce
        It's definitely gone reasonably quiet on port 1434 now.
      • (I work for an ISP, and was monitoring that port the other day, just to see)
      • salisan
        ruaok: I really doubt you will have any more success with Google now btw..
      • ruaok
        it was worth a shot. who knows.
      • what makes you say that though?
      • salisan
        ruaok: Everyone is after their traffic to make money ..
      • Mutiny
        Mutiny (~trivial@h0000deadbeef.ne.client2.attbi.com) has joined #musicbrainz
      • ruaok
        salisan: understood. But we have a good relationship with Google so far....
      • salisan
        Is there any way to implement Last-Modified: or Expires: headers for the generated pages? ..
      • Browsing around on 56k6 is not so nice ;)
      • djce
        That's hard. Desirable, but hard.
      • I expect it's waaaaay down the "TODO" list :-(
      • salisan
        It would probably cut down the traffic per user a lot also
      • djce
        Good point.
      • djce wonders why HTTP cacheing always caches exactly the opposite of what you want it to...
      • JavaGeek has left the channel
      • intrep
        are you talking about for album/artist/track searches?
      • or for moderation data?
      • salisan
        intrep: Not just searches, album/artist pages .. everything generated pretty much
      • djce
        searches are probably the one thing you'd never cache. Apart from anything POSTed
      • But cacheing album/artist pages would be great. But like I said, really quite hard to do.
      • intrep
        couldnt you keep a lastmod column in albumjoin and trackjoin?
      • ruaok
        ick. not for caching purposes.
      • djce
        Trouble is, from the server's point of view, by the time you've gone and checked all those timestamp values, wherever they are, you may as well have just generated the page anyway.
      • intrep
        then when you return a page with album data you could set lastmod to max(lastmod) or something like that
      • true
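
The `max(lastmod)` idea discussed above can be sketched roughly as follows in Python. This is only an illustration under stated assumptions: `conditional_response` is a hypothetical helper, the `lastmods` list stands in for timestamps read from the proposed lastmod columns, and nothing here reflects how the MusicBrainz server actually builds pages.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime

def conditional_response(lastmods, if_modified_since=None):
    """Return (status, headers) for a page built from rows with these lastmod times.

    lastmods must be a non-empty list of timezone-aware UTC datetimes.
    """
    # HTTP dates have one-second resolution, so drop microseconds before comparing.
    newest = max(lastmods).replace(microsecond=0)
    headers = {"Last-Modified": format_datetime(newest, usegmt=True)}
    if if_modified_since:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            since = None  # unparseable header: fall through to a full response
        # Only compare against timezone-aware dates; otherwise send the full page.
        if since is not None and since.tzinfo is not None and newest <= since:
            return 304, headers  # client copy is current, no body needed
    return 200, headers
```

As djce points out, this only pays off if finding the newest timestamp is much cheaper than rendering the page, for example a single aggregate query rather than one lookup per row.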