oh ho, so various obviously spam things are being blocked as the logs are coming up?
2017-06-04 15522, 2017
CatQuest
!zas
2017-06-04 15540, 2017
CatQuest
hm
2017-06-04 15540, 2017
CatQuest
!m zas
2017-06-04 15540, 2017
BrainzBot
You're doing good work, zas!
2017-06-04 15559, 2017
hibiscuskazeneko has quit
2017-06-04 15534, 2017
agentsim joined the channel
2017-06-04 15503, 2017
arbenina_ has quit
2017-06-04 15535, 2017
CatQuest
wtf is crazy webcrawler
2017-06-04 15514, 2017
CatQuest
should i report users who are obviously spam in the list?
2017-06-04 15509, 2017
drsaunders joined the channel
2017-06-04 15552, 2017
SothoTalKer
CatQuest: i guess reporting the top users in the list could be reported if they are spammers
2017-06-04 15550, 2017
SothoTalKer
yeah
2017-06-04 15558, 2017
SothoTalKer
whatever i wanted to say there
2017-06-04 15505, 2017
CatQuest
i also think that maybe freso et al. are going through them anyway so it's probably not important to do it
2017-06-04 15554, 2017
SothoTalKer
many of those will be purged when the spam domain email deletion is in place i guess
2017-06-04 15545, 2017
Slurpee joined the channel
2017-06-04 15550, 2017
D4RK-PH0ENiX joined the channel
2017-06-04 15520, 2017
agentsim has quit
2017-06-04 15551, 2017
CatQuest
indeed
2017-06-04 15501, 2017
to81 joined the channel
2017-06-04 15527, 2017
to81 has quit
2017-06-04 15505, 2017
samj1912 joined the channel
2017-06-04 15548, 2017
to81 joined the channel
2017-06-04 15517, 2017
agentsim joined the channel
2017-06-04 15549, 2017
to81 has quit
2017-06-04 15514, 2017
Freso
CatQuest: Yeah; no reason to report spammers based on that report/page. They will hopefully get dealt with in an automated fashion and may be useful for data gathering until then.
2017-06-04 15526, 2017
CatQuest
:D
2017-06-04 15529, 2017
to81 joined the channel
2017-06-04 15520, 2017
to81 has quit
2017-06-04 15528, 2017
hibiscuskazeneko joined the channel
2017-06-04 15534, 2017
agentsim has quit
2017-06-04 15548, 2017
agentsim joined the channel
2017-06-04 15509, 2017
github joined the channel
2017-06-04 15509, 2017
github
[musicbrainz-server] zas closed pull request #519: Disallow more stuff in robots.txt and use Crawl-delay option (master...master) https://git.io/vH2dT
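For readers unfamiliar with the Crawl-delay option named in that pull request title, here is a minimal sketch of how a well-behaved crawler might read it. The robots.txt content below is hypothetical (the actual rules from the pull request are not shown in this log), and `load_rules` is an illustrative helper name.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
# The real rules in the pull request are not visible in this log.
EXAMPLE_ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 10
Disallow: /search
"""


def load_rules(text):
    """Parse robots.txt text and return a RobotFileParser holding its rules."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


rules = load_rules(EXAMPLE_ROBOTS_TXT)

# A polite crawler would wait this many seconds between requests.
delay = rules.crawl_delay("examplebot")

# It would also skip any URL for which can_fetch() returns False.
allowed = rules.can_fetch("examplebot", "https://example.org/search?q=x")
```

Crawl-delay is only a request: crawlers that ignore robots.txt are unaffected by it, which is presumably why blocking comes up later in the conversation.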
2017-06-04 15509, 2017
github has left the channel
2017-06-04 15559, 2017
github joined the channel
2017-06-04 15559, 2017
github
[musicbrainz-server] zas opened pull request #520: Update robots.txt (production...robots) https://git.io/vHaU1
Turns out manipulating 1500 pages of text is annoying :p
2017-06-04 15525, 2017
SothoTalKer
well, 1913 for the IV-version o.o
2017-06-04 15534, 2017
reosarevok
volume :)
2017-06-04 15549, 2017
SothoTalKer
that, too :D
2017-06-04 15554, 2017
reosarevok
Yes, there's apparently a lot of Estonian scientists :D
2017-06-04 15516, 2017
reosarevok switches the approach to "if given an entry, deal with it" first, before trying to actually split the entries
2017-06-04 15508, 2017
SothoTalKer
the question is: are all of those scientists notable enough for wikipedia :)
2017-06-04 15503, 2017
reosarevok
For the Estonian one? Yeah
2017-06-04 15524, 2017
reosarevok
(it was Wikimedia Estonia who was looking into doing this actually)
2017-06-04 15525, 2017
Leftmost
Are there any decent IRC clients for OSX that don't require money?
2017-06-04 15502, 2017
ruaok
if you want, I can add you to the team irccloud account
2017-06-04 15536, 2017
Leftmost
Is that in any way a pain in the ass?
2017-06-04 15559, 2017
Leo_Verto[m]
zas: by the way, do you know what 46.229.171.212 is? all the other top IPs seem to be the big search engine crawlers but that one seems to be a CDN and hosting service located in the netherlands
2017-06-04 15527, 2017
ruaok
Leftmost: no more than you are a pain in the ass. :)
2017-06-04 15558, 2017
zas
Leo_Verto[m]: 46.229.171.212 was just doing same query over and over, with spammy stuff appended to the url, now blocked
ruaok: will you have the time to work on LB tomorrow? I was hoping to get the python3 PR merged :)
2017-06-04 15535, 2017
ruaok
yes. you're near the top of list.
2017-06-04 15548, 2017
ruaok
(even though it is a bank holiday here)
2017-06-04 15505, 2017
iliekcomputers
Awesome! Thanks. :)
2017-06-04 15526, 2017
kyan joined the channel
2017-06-04 15511, 2017
zas
ruaok: i'm proceeding to block ws abusers, mainly IPs doing repeated queries ending with 403s (bad UA mostly, will not complain), i blocked top 500 IPs at network level: https://stats.metabrainz.org/dashboard/db/mbstats…
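The selection step zas describes (finding the IPs with the most requests ending in 403) could be sketched roughly as below. The log layout `<ip> <status> <path>` and the name `top_403_ips` are assumptions for illustration; the log does not show how the list was actually produced.

```python
from collections import Counter


def top_403_ips(log_lines, limit=500):
    """Return up to `limit` (ip, count) pairs for requests that ended in a 403."""
    counts = Counter()
    for line in log_lines:
        fields = line.split()
        # Assumed layout: "<ip> <status> <path>"; real access logs will differ.
        if len(fields) >= 2 and fields[1] == "403":
            counts[fields[0]] += 1
    # most_common() sorts by count, highest first.
    return counts.most_common(limit)


# Example with made-up lines using documentation-only IP ranges.
sample = [
    "192.0.2.1 403 /ws/2/artist",
    "192.0.2.1 403 /ws/2/artist",
    "198.51.100.7 403 /ws/2/release",
    "192.0.2.9 200 /ws/2/artist",
    "192.0.2.1 403 /ws/2/artist",
]
worst = top_403_ips(sample, limit=2)
```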
2017-06-04 15540, 2017
hibiscuskazeneko has quit
2017-06-04 15549, 2017
lazka has quit
2017-06-04 15534, 2017
antgel joined the channel
2017-06-04 15523, 2017
arbenina has quit
2017-06-04 15552, 2017
arbenina joined the channel
2017-06-04 15514, 2017
hibiscuskazeneko joined the channel
2017-06-04 15534, 2017
SothoTalKer
zas: how can you actually get a 403?
2017-06-04 15524, 2017
zas
using the ws without a proper ua string to start with
2017-06-04 15516, 2017
zas
those i blocked continue hammering the ws, even when not getting a single positive response...
2017-06-04 15516, 2017
SothoTalKer
ah. lucky me that my script had a reasonable default UA :)
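As a rough illustration of "a proper ua string", a script can set the User-Agent header explicitly instead of relying on a library default. This is a sketch only: `build_ws_request` is a hypothetical helper, the `name/version ( contact )` shape is an assumed convention, and the request is built but not sent.

```python
from urllib.request import Request


def build_ws_request(url, app_name, app_version, contact):
    """Build (but do not send) a request carrying an identifying User-Agent."""
    # Assumed convention: application name, version, and a way to reach the author.
    user_agent = f"{app_name}/{app_version} ( {contact} )"
    return Request(url, headers={"User-Agent": user_agent})


# Placeholder URL and contact details, not a real endpoint or address.
req = build_ws_request("https://example.org/ws", "MyScript", "0.1", "me@example.org")
```

An identifying string like this also gives the server operators someone to contact before resorting to a block.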
2017-06-04 15545, 2017
reosarevok
zas: how much do we care once they're blocked?
2017-06-04 15558, 2017
reosarevok
They still consume *something* to get the response, right?
2017-06-04 15505, 2017
reosarevok
Even if the response is fuck off?
2017-06-04 15534, 2017
zas
Well, since they are blocked at IP level, using ipset, that's a very low overhead, compared to blocking at http level
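To make the ipset approach concrete, here is a minimal sketch that turns a list of IPv4 addresses into lines suitable for `ipset restore`. The set name and the helper `ipset_restore_lines` are hypothetical, and the actual setup used on the servers is not shown in this log. The overhead is low because the kernel checks each packet's source against a hash set and can drop it before any HTTP server sees the connection.

```python
import ipaddress


def ipset_restore_lines(set_name, ips):
    """Return lines for `ipset restore` that create a hash:ip set and fill it."""
    lines = [f"create {set_name} hash:ip"]
    for ip in ips:
        # IPv4Address raises ValueError on malformed input, so bad entries
        # never make it into the generated file.
        addr = ipaddress.IPv4Address(ip)
        lines.append(f"add {set_name} {addr}")
    return lines


# Made-up addresses from documentation-only ranges.
script = ipset_restore_lines("ws_blocklist", ["192.0.2.1", "198.51.100.7"])

# The set would then be referenced from a firewall rule, for example
# (an assumption about the setup, not taken from the log):
#   iptables -I INPUT -m set --match-set ws_blocklist src -j DROP
```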
2017-06-04 15500, 2017
reosarevok
Ok, so just laughing at them is ok then?
2017-06-04 15541, 2017
lengtche joined the channel
2017-06-04 15536, 2017
to81 joined the channel
2017-06-04 15540, 2017
zas
reosarevok: yes, they don't care about responses from ws, i doubt they'll even notice they were blocked
2017-06-04 15552, 2017
zas
if someone complains, well, we'll handle the case ;)
2017-06-04 15543, 2017
SothoTalKer
zas: does this mean the WS should be a bit less overloaded? ^-^
2017-06-04 15558, 2017
zas
not really
2017-06-04 15507, 2017
zas
because those never hit backends
2017-06-04 15534, 2017
SothoTalKer
hmh
2017-06-04 15553, 2017
zas
but in the next days i'll have a look at other kinds of abuse, and see if we can reduce the traffic a bit