that stuff.. I know that stuff, been using it since we got lucene search :)
2017-11-07 31158, 2017
FishQuest
oh ho
2017-11-07 31109, 2017
FishQuest
this I didn't know, I thought it was a completely different thing
2017-11-07 31141, 2017
samj1912
if lucene were bricks, solr is like a pre built house you can put your furniture in
2017-11-07 31149, 2017
samj1912
the current search server we built from scratch
2017-11-07 31155, 2017
FishQuest
hmm
2017-11-07 31119, 2017
FishQuest
are you sure about that?
2017-11-07 31135, 2017
samj1912
that we built it from scratch?
2017-11-07 31144, 2017
FishQuest
the way I remember it, lucene was added and tinkered with tremendously, but it also came from something already built
2017-11-07 31154, 2017
FishQuest
this was oh.. wtf, 20 years ago?
2017-11-07 31100, 2017
FishQuest
erh 10 not 20
2017-11-07 31111, 2017
samj1912
well, you get the point :P
2017-11-07 31153, 2017
FishQuest
anyway I'm going to the library <3, ping me when the test server can be logged into. (no rush or anything)
2017-11-07 31127, 2017
naught101_ joined the channel
2017-11-07 31145, 2017
D4RK-PH0ENiX has quit
2017-11-07 31134, 2017
Ant1SG has quit
2017-11-07 31155, 2017
D4RK-PH0ENiX joined the channel
2017-11-07 31118, 2017
yokel has quit
2017-11-07 31132, 2017
yokel joined the channel
2017-11-07 31117, 2017
Ant1SG joined the channel
2017-11-07 31105, 2017
Ant1SG has quit
2017-11-07 31159, 2017
naught101_ has quit
2017-11-07 31142, 2017
Ant1SG joined the channel
2017-11-07 31148, 2017
jesus2099 joined the channel
2017-11-07 31150, 2017
UmkaDK_ joined the channel
2017-11-07 31112, 2017
UmkaDK has quit
2017-11-07 31102, 2017
zas
bitmap: ping me when you're caffeinated enough
2017-11-07 31133, 2017
Ant1SG has quit
2017-11-07 31106, 2017
MajorLurker has quit
2017-11-07 31132, 2017
gcilou joined the channel
2017-11-07 31146, 2017
ruaok
alastairp: the sharepoint download of all the files downloaded 20GB "successfully", but produces a corrupt zip file.
2017-11-07 31118, 2017
ruaok
> 16455114579 extra bytes at beginning or within zipfile. zipfile corrupt.
2017-11-07 31132, 2017
samj1912
ruaok: it took from 10:48 to 13:13 to index all recordings
2017-11-07 31151, 2017
ruaok
oh wow. that is great.
2017-11-07 31113, 2017
samj1912
around 2:25 hrs
2017-11-07 31138, 2017
samj1912
oh wait, there's more, it ended at 13:49, sorry, so about 3 hrs
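The two durations quoted above can be checked with a few lines of arithmetic. This is only a sketch: the helper name `elapsed` is hypothetical, and it assumes both clock times fall on the same day.

```python
from datetime import datetime


def elapsed(start: str, end: str) -> str:
    """Return the same-day duration between two HH:MM clock times as H:MM."""
    fmt = "%H:%M"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    minutes = int(delta.total_seconds()) // 60
    return f"{minutes // 60}:{minutes % 60:02d}"


# First estimate (10:48 to 13:13) and the corrected end time (13:49).
print(elapsed("10:48", "13:13"))  # 2:25
print(elapsed("10:48", "13:49"))  # 3:01
```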
2017-11-07 31154, 2017
ruaok
anything less than 6 hours is great. :)
2017-11-07 31128, 2017
alastairp
ruaok: I have URLs to download with curl, but internet here is rate limited during the day
2017-11-07 31144, 2017
samj1912
and zas pointed out that doing it over TCP has about 50-175% overhead, depending on whether it's SSL or not
2017-11-07 31144, 2017
ruaok
hit me. I got 300mbit ready to go!
2017-11-07 31100, 2017
samj1912
we figured we will move the slave to the same container and use sockets
2017-11-07 31113, 2017
samj1912
zas is waiting for bitmap to figure out how to do it
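A rough sketch of what "use sockets" could look like on the client side, assuming the database is PostgreSQL and the client connects through libpq: a `host` value that starts with `/` is treated as a Unix-domain socket directory instead of a TCP hostname. The helper `conninfo`, the socket directory `/var/run/postgresql`, and the database and user names are all hypothetical, not taken from the actual setup.

```python
def conninfo(dbname, user, host="/var/run/postgresql", port=5432, sslmode=None):
    """Build a libpq-style connection string from simple (space-free) values.

    A host starting with "/" selects a Unix-domain socket in that directory;
    any other host is reached over TCP, optionally with an sslmode.
    """
    parts = {"host": host, "port": port, "dbname": dbname, "user": user}
    # sslmode only matters for TCP connections, so skip it for sockets.
    if sslmode is not None and not host.startswith("/"):
        parts["sslmode"] = sslmode
    return " ".join(f"{key}={value}" for key, value in parts.items())


# Same-container connection over a Unix socket (hypothetical names).
print(conninfo("musicbrainz_db", "musicbrainz"))
# Remote connection over TCP with SSL, for comparison.
print(conninfo("musicbrainz_db", "musicbrainz", host="db.example.org", sslmode="require"))
```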
2017-11-07 31159, 2017
samj1912
and I don't think we have tuned the parameters enough yet
2017-11-07 31119, 2017
samj1912
hopefully we should be able to get the recording index down to 1 hr or 1.5 hrs
2017-11-07 31125, 2017
samj1912
maybe less
2017-11-07 31100, 2017
samj1912
zas and I were also discussing a RAM-only index if we want it really, really quick in terms of indexing and retrieval, but it might be overkill :P since we have RAID SSDs
2017-11-07 31135, 2017
Sophist-UK has quit
2017-11-07 31123, 2017
jesus2099 has quit
2017-11-07 31136, 2017
UmkaDK_ has quit
2017-11-07 31159, 2017
UmkaDK joined the channel
2017-11-07 31152, 2017
Sophist-UK joined the channel
2017-11-07 31110, 2017
ruaok
samj1912: don't worry about tuning too much.
2017-11-07 31117, 2017
ruaok
ideally we will do this only once.
2017-11-07 31123, 2017
samj1912
okay
2017-11-07 31145, 2017
ruaok
alastairp: thanks. Now downloading MLHD at ~30MB/s. :)
2017-11-07 31153, 2017
alastairp
incredible
2017-11-07 31104, 2017
ruaok
datacenter to datacenter FTW
2017-11-07 31115, 2017
ruaok
then I'll shove this into BigQuery.
2017-11-07 31138, 2017
alastairp
we use google drive for the same reason to share stuff... enterprise file storage is way faster than the local internet connection
2017-11-07 31142, 2017
ruaok
5 files done already.
2017-11-07 31117, 2017
alastairp
I think it actually was a smart decision to put it on MS cloud, I guess McGill has an enterprise/academic account
2017-11-07 31121, 2017
UmkaDK has quit
2017-11-07 31139, 2017
samj1912
ruaok: entire indexing done except editors and cdstubs
2017-11-07 31149, 2017
samj1912
took exactly 4 hours for everything
2017-11-07 31127, 2017
zas
but the whole point is to not reindex everything, right? how does it perform after one day of changes?
2017-11-07 31137, 2017
alastairp
ruaok: just looking at the contents of the tar archives... no subdirectories, individual files are gzip compressed
2017-11-07 31101, 2017
samj1912
zas: not sure, haven't tested it yet
2017-11-07 31104, 2017
alastairp
might be worth writing a quick script to uncompress the archives and put them on disk in a nice structure
2017-11-07 31113, 2017
alastairp
(or upload to BQ directly from the tar??)
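A minimal sketch of the "quick script" idea, assuming the archives are flat tar files whose members are individually gzip-compressed, as described above. The function name `unpack_archive` and the layout (one subdirectory per filename prefix) are hypothetical choices; the real file naming may call for a different structure.

```python
import gzip
import shutil
import tarfile
from pathlib import Path


def unpack_archive(tar_path, out_root, prefix_len=2):
    """Decompress each gzip member of a flat tar into out_root/<prefix>/<name>."""
    out_root = Path(out_root)
    written = []
    with tarfile.open(tar_path, "r") as tar:
        for member in tar:
            if not member.isfile():
                continue
            name = Path(member.name).name  # drop any path components
            if name.endswith(".gz"):
                name = name[:-3]
            # Assumed layout: bucket files by the first few characters of the name.
            target_dir = out_root / name[:prefix_len]
            target_dir.mkdir(parents=True, exist_ok=True)
            target = target_dir / name
            # Stream-decompress the member without extracting the .gz to disk first.
            with gzip.open(tar.extractfile(member), "rb") as gz, open(target, "wb") as dst:
                shutil.copyfileobj(gz, dst)
            written.append(target)
    return written
```

Streaming each member through `gzip.open` avoids writing the compressed files to disk first, which matters when the archives add up to tens of gigabytes.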
2017-11-07 31130, 2017
samj1912
I need bitmap's help in adding the triggers
2017-11-07 31138, 2017
samj1912
and setting up rabbitmq
2017-11-07 31117, 2017
alastairp
ruaok: btw, Felipe suggested https://airflow.apache.org/ as a tool for managing data from a local datastore -> BQ
2017-11-07 31143, 2017
alastairp
might be something that we could look at if we're planning on sending data from lots of places
2017-11-07 31102, 2017
alastairp
I've not looked at it yet, but I'm going to have a look at how it works
2017-11-07 31107, 2017
UmkaDK joined the channel
2017-11-07 31102, 2017
djwhitey joined the channel
2017-11-07 31109, 2017
djwhitey has quit
2017-11-07 31123, 2017
UmkaDK has quit
2017-11-07 31132, 2017
UmkaDK joined the channel
2017-11-07 31102, 2017
UmkaDK has quit
2017-11-07 31108, 2017
UmkaDK_ joined the channel
2017-11-07 31141, 2017
Gazooo joined the channel
2017-11-07 31137, 2017
Sophist-UK has quit
2017-11-07 31118, 2017
bitmap
zas: pong
2017-11-07 31137, 2017
Sophist-UK joined the channel
2017-11-07 31150, 2017
zas
hey
2017-11-07 31126, 2017
zas
samj1912 made a test, using paco for sir/solr and williams as db