ok, then I need to add support for the 'don't download what already exists' stuff
ruaok
I can sit tight.
but I would like to head to lunch soon and it would be good to start this before I go
ocharles
I think I have that fixed up
ruaok
git pull?
ocharles
just checking I am right
dah, * doesn't work if you have an empty directory
ruaok
k
did you push?
it only updated the readme
ocharles
no, I haven't finished my work yet
ruaok: done
ruaok
no files nuked, it politely asked for each file
very british of it. :)
ocharles
:)
but did it pull mbdump.tar.bz2?
ruaok
now the download speed is sucking.
yea, it's doing that now.
ocharles
ok
ruaok
I wonder if I should do a blog post asking if anyone has FTP mirror space/bandwidth in the EU
nikki
worth a try?
ruaok
ok, will do after lunch
ruaok heads for lunch
nikki_ joined the channel
LordSputnik has left the channel
djce joined the channel
ocharles goes to grab lunch
GNUton__ joined the channel
GNUton__
Hi there
voiceinsideyou joined the channel
ruaok
hi GNUton
ocharles: download almost complete.
ocharles: admin/import-db.sh: line 20: md5sum: command not found
on mac it's md5
djce joined the channel
cherry1
I don't get it, I thought the idea was that you run VirtualBox with Linux on it on your Mac, PC, etc., so that you don't have to worry about OS differences?
ruaok
yes, but this is the code to *make* VMs. :)
cherry1 has left the channel
cherry1 joined the channel
cherry1 has left the channel
I commented out the md5sum check
GNUton
Hi ruaok!
ijabz joined the channel
ijabz
so this is the vagrant script, so you will need a modified vagrant script for each OS you want to run MusicbrainzInVirtualMachine on?
ruaok
no.
in theory it should work to generate VMs on a mac, but practice differs.
so far it's done pretty well.
fractalizator joined the channel
but if it continues to be a pain, we'll say it's linux only
ijabz
seems to negate the only advantage of using a virtual machine then to me, that you can use it on any machine that supports the virtual machine
ruaok
like I said, it's for MAKING the VMs.
which only the MB devs need.
ijabz
oh right, ok
ruaok
anyone can run it in any VM host/player anywhere they want
mb-chat-logger joined the channel
GNUton
Hi guys, I would like to get all artist's wikipedia links from MB's table dumps... I can see the links in "url" table, but I don't understand how to get the ones which belong to artists.
ianmcorvidae
you'll want to look at l_artist_url
GNUton
ianmcorvidae: I tried.. but I cannot find for instance.. "Nirvana (Band)"
somewhere in there is the 'id' column, which will correspond to the 'entity1' column of l_artist_url
the entity0 will then correspond to the row in the 'artist' table
from which you can get whatever data you need (possibly by joining other tables, e.g. artist_name, artist_alias, the stuff for release and other entities, etc.)
GNUton
ok I'll give it another try..
thanks for the hint
ianmcorvidae
you can see the specific link type by joining the 'link' table via the link column of the l_ table and then to link_type
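A minimal sketch of the join described above, written as a shell function that prints the query. The tables and the entity0, entity1 and link columns are the ones named in the discussion; the link.link_type, link_type.name and url.url column names and the Wikipedia URL pattern are assumptions to check against the actual dump, and artist_wikipedia_sql is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Print a query that lists Wikipedia URLs linked to artists.
# link.link_type, link_type.name and url.url are assumed column names;
# verify them against the schema of the dump you imported.
artist_wikipedia_sql() {
    cat <<'SQL'
SELECT a.id    AS artist_id,
       lt.name AS link_type,
       u.url
FROM l_artist_url lau
JOIN artist    a  ON a.id  = lau.entity0   -- entity0 points at the artist row
JOIN url       u  ON u.id  = lau.entity1   -- entity1 points at the url row
JOIN link      l  ON l.id  = lau.link      -- link column of the l_ table
JOIN link_type lt ON lt.id = l.link_type   -- assumed column name
WHERE u.url LIKE '%wikipedia.org/wiki/%';  -- assumed URL pattern
SQL
}
```

The output can be piped into psql with whatever connection options the local setup needs.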
one fix would be to check to see if md5sum exists. if not use md5
then check each one manually.
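A rough sketch of that fallback, not the actual change to admin/import-db.sh: use md5sum when it is on the PATH, otherwise fall back to the md5 tool that ships with OS X. The function name file_md5 is hypothetical.

```shell
#!/usr/bin/env bash
# Print only the hex MD5 digest of a file, using whichever tool is installed.
file_md5() {
    if command -v md5sum >/dev/null 2>&1; then
        # md5sum prints "<digest>  <filename>"; keep the first field
        md5sum "$1" | cut -d ' ' -f 1
    elif command -v md5 >/dev/null 2>&1; then
        # BSD/OS X md5: -q prints the digest alone
        md5 -q "$1"
    else
        echo "neither md5sum nor md5 found" >&2
        return 1
    fi
}
```

A caller could then compare "$(file_md5 mbdump.tar.bz2)" against the published digest for that file.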
ocharles
do you have "digest" ?
teuf joined the channel
ruaok
unfortunately brew doesn't have md5sum in its repo
ocharles
wait, not sure that helps
night199uk joined the channel
ruaok
negative
ocharles
k
brew install md5sha1sum ?
ruaok
installing. :)
ok, that works. :)
are you updating the readme or do you want a pull request?
UmkaDK joined the channel
djce joined the channel
UmkaDK
Hi everyone! :)
ianmcorvidae
hello
ocharles
ruaok: what needs to be changed in the README?
dependencies?
ruaok
mac prereqs
ocharles
in fact, import-db.sh isn't documented at all
UmkaDK
Guys, is anyone aware of any memory-related issues when the search server index is built?
ocharles
os x isn't a supported platform, so we don't tend to document that type of stuff
ruaok
yes it is.
ruaok means import-db.sh
and it seems to be working fairly well, so why not support it?
ocharles
it seems a bad idea making osx fully supported when the people developing this don't have a mac
but I'll put it in for now
ruaok
one of the people has only macs. :)
ocharles
sure, but also has the choice to use a supported platform (ubuntu). I can't choose to use osx :)
warp
right, so that guy should write and maintain those bits :P
UmkaDK
I'm noticing that the memory on the server is getting gobbled up not that long -before- index process is run and the operations team keeps putting the blame for it at my door.
Has anyone experienced anything similar??
ruaok
not during indexing.
UmkaDK
ruaok: I feel like there is a "but" coming. ;)
ruaok
ocharles: hmm > Error loading /tmp/MBImport-gpXBLwfc/mbdump/recording: DBD::Pg::db pg_putcopydata failed: server closed the connection unexpectedly
This probably means the server terminated abnormally
ocharles
ruaok: is the VM still running?
ruaok
UmkaDK: there might be, but you haven't asked that question yet. :)
ocharles
ruaok: open up the VirtualBox GUI and see what state it's in
warp: is there anything in bash to print to stderr and then exit?
foo || (echo x ; exit 1 )
ruaok
hmmm. I started spotify which made my whole machine unhappy.
but now I can log in again.
time to rerun the import. :(
reosarevok
ruaok: they are probably targeting your computer after you said bad things about them yesterday! :p
ocharles
ruaok: before then it's worth understanding what happened
ruaok: was the VM down?
ruaok
$ uptime
13:30:52 up 2:37, 1 user, load average: 1.12, 2.19, 2.48
no
ocharles
did oomkiller take it out? (if osx even has oomkiller)
if the vm didn't die, you can check postgresql logs
ruaok
reosarevok: was there a live stream of the panel?
could not fdatasync log file 0, segment 223: Input/output error # does not sound good
warp
ocharles: you can redirect to stderr with >&2, so echo "foo" >&2
ocharles
check dmesg on that machine and the host too
warp: but that doesn't change the exit code
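One common way to get both behaviours is a small helper, sketched here with the hypothetical name die. Note that the parenthesised form `foo || (echo x ; exit 1 )` runs in a subshell, so its exit only leaves the subshell and the script keeps going unless something else checks the status; a function or a brace group runs in the current shell.

```shell
#!/usr/bin/env bash
# Print a message to stderr, then exit the script with a non-zero status.
die() {
    echo "$*" >&2
    exit 1
}

# Intended usage (some_command is a placeholder):
#   some_command || die "some_command failed"
#
# Inline equivalent; braces run in the current shell, so exit ends the script:
#   some_command || { echo "some_command failed" >&2; exit 1; }
```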
ruaok
the checkpoint one is also not good.
ocharles
checkpointing is normal
UmkaDK
Oh… ok… let me try and phrase it as a question... :) Could anyone think of a reason why musicbrainz would force the server to eat up 4G of memory in a matter of seconds every day at pretty much the same time, and about an hour before the search server builds its indexes?
ruaok
but I wonder if that was because my machine was swapping
ocharles
the out-of-the-box defaults for checkpoint segments are not tuned for bulk loading