posting the text for <a href="whatever">cheap watches lol!</a> is without any cost for the spammer
2013-11-18 32211, 2013
ocharles
under my suggestion, that bio would probably not have got through, because it would require a captcha to be posted
2013-11-18 32219, 2013
navap
Stack Overflow has a point-based system of increasing privileges. Can we borrow from that somehow?
2013-11-18 32235, 2013
ianmcorvidae
as our previous run-in with these people shows, since they were literally just posting in broken HTML that I'm not sure actually even made a link
2013-11-18 32239, 2013
ruaok
navap: we do that in a crude way already.
2013-11-18 32200, 2013
ianmcorvidae
which is why I said if we want to cut off the productivity of the spamming we add rel=nofollow and move on
2013-11-18 32211, 2013
nikki
oh, the other thing I was curious about was only allowing an email to be used once. there aren't really any good reasons to have multiple accounts with the same email address (a few cases, but not many, and yeah, I know it can be worked around pretty easily, I'm not claiming it can't, it just seems more like a case of "why do we even allow that?")
2013-11-18 32218, 2013
ianmcorvidae
if we care about the cruft in our database we need to do something more proactive
2013-11-18 32246, 2013
navap
As a matter of pride, I think we should care about cruft in the DB
2013-11-18 32254, 2013
ruaok
nikki: if you enter me a ticket I will run a query to find out the number of unique emails vs accounts
2013-11-18 32206, 2013
ianmcorvidae
ruaok: some emails have as many as thousands of accounts
2013-11-18 32213, 2013
reodroid
I care about the "fake" stats, not much about the accounts themselves
2013-11-18 32213, 2013
ianmcorvidae
we were looking the other week :P
2013-11-18 32216, 2013
ruaok
ianmcorvidae: ah, good.
2013-11-18 32221, 2013
derwin
as a DBA, I always care about cruft in DBs
2013-11-18 32237, 2013
ruaok
well, if one email has more than, say, 5 accounts, label them as a spammer and nuke all accounts?
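[Editor's sketch] The "unique emails vs accounts" query and the "more than 5 accounts per email" check discussed here could look roughly like the following. This is a minimal sketch against an in-memory SQLite table; the `editor` columns (`id`, `name`, `email`), the helper names, and the sample data are assumptions for illustration, not the real MusicBrainz schema or tooling.

```python
import sqlite3


def demo_connection():
    """Build a throwaway in-memory database with a hypothetical editor table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE editor (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
    conn.executemany(
        "INSERT INTO editor (name, email) VALUES (?, ?)",
        [
            ("watches1", "spam@example.com"),
            ("watches2", "spam@example.com"),
            ("watches3", "SPAM@example.com"),  # same address, different case
            ("alice", "alice@example.com"),
            ("bob", None),                     # no email on file
        ],
    )
    return conn


def unique_vs_total(conn):
    """Return (distinct emails, accounts that have an email)."""
    return conn.execute(
        "SELECT count(DISTINCT lower(email)), count(*) "
        "FROM editor WHERE email IS NOT NULL"
    ).fetchone()


def emails_over_threshold(conn, threshold=5):
    """Return (email, account_count) for emails used by more than `threshold` accounts."""
    return conn.execute(
        """
        SELECT lower(email) AS addr, count(*) AS n
        FROM editor
        WHERE email IS NOT NULL
        GROUP BY lower(email)
        HAVING count(*) > ?
        ORDER BY n DESC, addr
        """,
        (threshold,),
    ).fetchall()


if __name__ == "__main__":
    conn = demo_connection()
    print(unique_vs_total(conn))
    print(emails_over_threshold(conn, threshold=2))
```

Comparing on `lower(email)` is a deliberate simplification; it does not catch gmail-style `+whatever` variants, which come up later in the discussion.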
2013-11-18 32247, 2013
reodroid
poor nikki
2013-11-18 32249, 2013
ruaok
maybe make a report from it first?
2013-11-18 32227, 2013
ianmcorvidae
we can't make such a report public, but possibly
2013-11-18 32233, 2013
ruaok
I care about cruft in the DB as well.
2013-11-18 32244, 2013
ruaok
let's pick off the low-hanging fruit.
2013-11-18 32246, 2013
navap
Email the flagged user and tell them to contact us, if they don't in 2 weeks, delete their accounts
2013-11-18 32259, 2013
navap
The flagged email*
2013-11-18 32204, 2013
ruaok
does anyone have any objections to the rel=nofollow suggestion?
2013-11-18 32216, 2013
nikki
it sounds sensible, whatever else we do
2013-11-18 32217, 2013
ocharles
+1 on nofollow to non-musicbrainz links
2013-11-18 32230, 2013
ruaok
to all links outbound?
2013-11-18 32234, 2013
navap
Are we then adding nofollow to relationship links as well?
2013-11-18 32236, 2013
ruaok
even in edit notes?
2013-11-18 32241, 2013
ianmcorvidae
edit notes yes, relationships no
2013-11-18 32244, 2013
navap
nofollow in edit notes yes
2013-11-18 32245, 2013
ruaok
ianmcorvidae: +1
2013-11-18 32246, 2013
ianmcorvidae
relationships require voting etc.
2013-11-18 32259, 2013
ianmcorvidae
the places for nofollow, for me, are:
2013-11-18 32209, 2013
ianmcorvidae
annotation, edit note, user bio, user homepage
2013-11-18 32213, 2013
ianmcorvidae
I think that's all
2013-11-18 32218, 2013
hawke_1
Don’t you have to be logged in to see edit notes?
and the reason we'd want relationships to not be nofollow'd is that for those we actually do want googlejuice flowing to official artist homepages, twitters, etc.
2013-11-18 32249, 2013
uk_
hi :)
2013-11-18 32254, 2013
ianmcorvidae
because those are useful outbound links
2013-11-18 32203, 2013
ocharles nods
2013-11-18 32206, 2013
derwin
ah, yes, true. relationship links.
2013-11-18 32224, 2013
ruaok
ok I think we agree on nofollow.
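[Editor's sketch] The rule agreed on here (nofollow on outbound links in annotations, edit notes, user bios and user homepages, but not on MusicBrainz's own links) might be sketched as below. The `INTERNAL_HOSTS` set and both function names are hypothetical; this is not how the MusicBrainz server actually renders links.

```python
from html import escape
from urllib.parse import urlparse

# Assumption: hosts that count as "our own" and should stay followable.
INTERNAL_HOSTS = {"musicbrainz.org"}


def is_internal(url):
    """True if the URL points at one of INTERNAL_HOSTS or a subdomain of one."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == h or host.endswith("." + h) for h in INTERNAL_HOSTS)


def render_user_link(url, text):
    """Render a link from user-supplied text, adding rel="nofollow" when it is outbound."""
    attrs = f'href="{escape(url, quote=True)}"'
    if not is_internal(url):
        attrs += ' rel="nofollow"'
    return f"<a {attrs}>{escape(text)}</a>"


if __name__ == "__main__":
    print(render_user_link("http://spam.example/watches", "cheap watches lol!"))
    print(render_user_link("https://musicbrainz.org/doc/Style", "style guide"))
```

Relationship links would simply not go through this helper, since those are voted on and are meant to pass link weight to official artist pages.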
2013-11-18 32229, 2013
ianmcorvidae
cool
2013-11-18 32231, 2013
ruaok
what's our next low-hanging fruit?
2013-11-18 32250, 2013
navap
Email sockpuppeting?
2013-11-18 32259, 2013
ruaok
yes, that is a good one.
2013-11-18 32207, 2013
ruaok
let's say it's easy to find these.
2013-11-18 32207, 2013
ianmcorvidae
switching captcha, probably, and some sort of investigation of merging accounts that share emails
2013-11-18 32223, 2013
ruaok
what is our course of action when we find them?
2013-11-18 32237, 2013
ruaok
navap's idea of mailing them and if no answer in two weeks, killing the account.
2013-11-18 32238, 2013
ianmcorvidae
I'd say our course of action in general is to support merging editors
2013-11-18 32200, 2013
navap
Merging sounds interesting
2013-11-18 32203, 2013
ruaok
would we have a user_redirect page?
2013-11-18 32206, 2013
ocharles
with a move to making emails unique in the database?
2013-11-18 32210, 2013
ianmcorvidae
ocharles: yes
2013-11-18 32213, 2013
reodroid
huh
2013-11-18 32217, 2013
ianmcorvidae
well, a move towards that
2013-11-18 32229, 2013
ruaok
and prevent sign ups if that email is already in use?
2013-11-18 32232, 2013
reodroid
how is that useful for unused accounts?
2013-11-18 32236, 2013
Freso
Editors as entities!
2013-11-18 32243, 2013
ianmcorvidae
I think we'd want legitimate users to be given a chance to change to a different email or merge, at their discretion
2013-11-18 32245, 2013
Freso re-hides
2013-11-18 32246, 2013
navap
I think there are strong cases for having the same email for 2 or 3 accounts, but those are very very rare
2013-11-18 32255, 2013
derwin
frankly, that's surprising to me. and I dunno about merging, usually hard.
2013-11-18 32257, 2013
navap
Just people in here
2013-11-18 32211, 2013
Freso
Bots, etc.?
2013-11-18 32211, 2013
ianmcorvidae
navap: I think that using gmail's +whatever or a second email that forwards is perfectly fine there though
2013-11-18 32222, 2013
ianmcorvidae
having looked at this
2013-11-18 32237, 2013
ianmcorvidae
most of our duplicate emails are either people who wanted to change their username but couldn't, or probably-spammers
2013-11-18 32251, 2013
ianmcorvidae
even nikki mostly uses different emails on a string comparison basis :P
2013-11-18 32242, 2013
ruaok
ianmcorvidae: do you have any numbers laying around for how common this is?
2013-11-18 32248, 2013
ruaok
if not, maybe collect some for next week's meeting?
2013-11-18 32249, 2013
ianmcorvidae
not laying around, no
2013-11-18 32201, 2013
ianmcorvidae
I sent nikki a report thingy, maybe I can hunt that down again
2013-11-18 32201, 2013
ruaok
and then we can pick this up once we have some numbers to look at.
2013-11-18 32202, 2013
nikki
how would merging work? I'm not sure I agree with having people who are already well-known go around changing their usernames all the time by merging into a new one
2013-11-18 32209, 2013
ruaok
and also to ponder this more.
2013-11-18 32226, 2013
ocharles
i again don't really see what the problem we're trying to solve is
2013-11-18 32232, 2013
ocharles
is this a numbers thing?
2013-11-18 32239, 2013
ruaok
clean up cruft in our db.
2013-11-18 32252, 2013
ruaok
we aim to have a clean db, therefore want the spammer user accounts gone
2013-11-18 32204, 2013
derwin
"why" ?
2013-11-18 32220, 2013
derwin
I mean, as a DBA, I <3 clean DB.. but.. in reality often not justifiable?
2013-11-18 32221, 2013
ruaok
we don't accept clutter elsewhere. why accept it here?
2013-11-18 32227, 2013
ianmcorvidae
there's a bunch of different things we want, I think, and we're helpfully talking about all of them at once
2013-11-18 32228, 2013
ocharles
because it's not public
2013-11-18 32239, 2013
ocharles
we have cruft in the code too, and we live with that :)
2013-11-18 32251, 2013
ianmcorvidae
yeah, I mean, that argument can apply to, say, tags and cdstubs too
2013-11-18 32221, 2013
reodroid
IMO mostly because we don't have n million real editors and it feels misleading to claim that because of spammers
2013-11-18 32230, 2013
ruaok
reodroid: +1
2013-11-18 32233, 2013
ocharles
reodroid: surely we can formulate better queries to get better insight
2013-11-18 32243, 2013
reodroid
I might as well keep them if we can leave them out of the stats
2013-11-18 32246, 2013
ocharles
we have more options than just SELECT count(*) FROM editor
2013-11-18 32258, 2013
derwin
every "[x] has [y] users" stat you have ever seen, ever, has been misleading in the same way.
2013-11-18 32203, 2013
ruaok
ocharles: I'd love for you to spend a couple of hours on this when you can.
2013-11-18 32207, 2013
ianmcorvidae
improving the query for count.editor.valid and maybe adding a count.editor.inactive stat would be reasonable
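[Editor's sketch] To illustrate the point that there are more options than a bare `SELECT count(*) FROM editor`, here is a rough sketch of computing several editor statistics side by side. The stat names `count.editor.valid` and `count.editor.inactive` come from the discussion, but the definitions used here (confirmed email for "valid", no edits for "inactive"), the `email_confirmed` column, and the `edit` table layout are assumptions, not the server's real queries.

```python
import sqlite3

# Hypothetical definitions; the real stat queries may differ.
STAT_QUERIES = {
    "count.editor": "SELECT count(*) FROM editor",
    "count.editor.valid": "SELECT count(*) FROM editor WHERE email_confirmed = 1",
    "count.editor.inactive": (
        "SELECT count(*) FROM editor AS e "
        "WHERE NOT EXISTS (SELECT 1 FROM edit WHERE edit.editor = e.id)"
    ),
}


def demo_stats_connection():
    """In-memory database with made-up editors and edits."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE editor (id INTEGER PRIMARY KEY, name TEXT, email_confirmed INTEGER)"
    )
    conn.execute(
        "CREATE TABLE edit (id INTEGER PRIMARY KEY, editor INTEGER REFERENCES editor(id))"
    )
    conn.executemany(
        "INSERT INTO editor (id, name, email_confirmed) VALUES (?, ?, ?)",
        [(1, "active", 1), (2, "lurker", 1), (3, "spam1", 0), (4, "spam2", 0)],
    )
    conn.executemany("INSERT INTO edit (editor) VALUES (?)", [(1,), (1,)])
    return conn


def collect_stats(conn):
    """Run every stat query and return a {stat name: count} mapping."""
    return {name: conn.execute(sql).fetchone()[0] for name, sql in STAT_QUERIES.items()}


if __name__ == "__main__":
    print(collect_stats(demo_stats_connection()))
```

Reporting the narrower counts next to the raw total keeps the spam accounts in the database while avoiding the misleading headline number.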
2013-11-18 32209, 2013
ruaok
see if you can come up with anything interesting.
2013-11-18 32215, 2013
ocharles
ruaok: ok, what bit is the "this"?
2013-11-18 32226, 2013
ruaok
"we have more options than just SELECT count(*) FROM editor"
2013-11-18 32227, 2013
ocharles
ok
2013-11-18 32230, 2013
reodroid
derwin: probably, that doesn't mean we shouldn't try to avoid it :)
2013-11-18 32237, 2013
ocharles
i'd be happy to do some analysis there
2013-11-18 32239, 2013
derwin
it may? heh.
2013-11-18 32253, 2013
ruaok
ok, let's leave it here for now.
2013-11-18 32202, 2013
ruaok
we've spent 40 minutes on this already
2013-11-18 32208, 2013
ianmcorvidae
I'll try to have some numbers on email reuse for next week.
2013-11-18 32222, 2013
ruaok
we have one concrete step (nofollow) and then some investigation by both ianmcorvidae and ocharles
2013-11-18 32229, 2013
ruaok
then next week we look at it again.
2013-11-18 32235, 2013
ruaok
sound reasonable?
2013-11-18 32248, 2013
ruaok
ianmcorvidae: OAuth2 MAC auth
2013-11-18 32259, 2013
ianmcorvidae
I'm also happy to look at stats if ocharles would rather look at his in-progress stuff
2013-11-18 32202, 2013
ianmcorvidae
anyway
2013-11-18 32203, 2013
ianmcorvidae
MAC auth.
2013-11-18 32219, 2013
ianmcorvidae
so we currently support two varieties of authentication with OAuth
2013-11-18 32223, 2013
ianmcorvidae
bearer tokens and MAC tokens
2013-11-18 32243, 2013
ocharles
ianmcorvidae: that might work better, now that i have more work attribute stuff to do
2013-11-18 32245, 2013
ianmcorvidae
bearer is much easier to use and actually standardized; MAC was added, as luks said, because at the time he assumed that using https requests in picard would be a no-no
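[Editor's sketch] The difference between the two token types can be illustrated as follows: a bearer token is sent as-is (so it needs https), while a MAC token signs parts of the request with a shared key so the key itself never travels over the wire. The MAC part below is only loosely modelled on the OAuth 2.0 MAC draft; the field order, the choice of HMAC-SHA1, and all function names are assumptions and do not describe the MusicBrainz implementation.

```python
import base64
import hashlib
import hmac


def bearer_header(token):
    """Bearer auth: the token itself is the credential."""
    return f"Bearer {token}"


def mac_header(key_id, key, ts, nonce, method, path, host, port):
    """MAC auth: send an HMAC over request details instead of the secret key."""
    # Each field is followed by a newline; the last (extension) field is left empty.
    base = "\n".join([str(ts), nonce, method.upper(), path, host.lower(), str(port), ""]) + "\n"
    digest = hmac.new(key.encode("utf-8"), base.encode("utf-8"), hashlib.sha1).digest()
    mac = base64.b64encode(digest).decode("ascii")
    return f'MAC id="{key_id}", ts="{ts}", nonce="{nonce}", mac="{mac}"'


if __name__ == "__main__":
    print(bearer_header("abc123"))
    print(mac_header("kid", "secret", 1384732800, "n0nce", "get",
                     "/ws/2/collection", "musicbrainz.org", 80))
```

The extra work on the client side (timestamp, nonce, canonical request string) is the usability cost being weighed against bearer tokens here.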
2013-11-18 32205, 2013
ianmcorvidae
MAC auth is also misimplemented in a way that makes the tests very very angry with me on perl 5.18
2013-11-18 32213, 2013
ocharles
:)
2013-11-18 32222, 2013
ianmcorvidae
we could fix it, but I'd like to propose we just remove MAC auth instead.