#musicbrainz-devel

      • ruaok
        ah
      • 2013-11-18 32228, 2013

      • ruaok
        how about disallowing links in user bios? we can show the URL, but not make it a link.
      • 2013-11-18 32245, 2013

      • ruaok
        that kills one avenue for spamming.
      • 2013-11-18 32249, 2013

      • ocharles
        what about the homepage property?
      • 2013-11-18 32255, 2013

      • ocharles
        or whatever we call it
      • 2013-11-18 32206, 2013

      • ruaok
        same thing.
      • 2013-11-18 32208, 2013

      • ruaok
        show URL
      • 2013-11-18 32210, 2013

      • ianmcorvidae
        I really can't stress enough that something being ineffective for the spammers does not matter an ounce to the spammer
      • 2013-11-18 32222, 2013

      • derwin
        what ianmcorvidae says
      • 2013-11-18 32227, 2013

      • ruaok
        maybe we can show a link for people who are viewing it who have made good edits.
      • 2013-11-18 32230, 2013

      • ocharles
        i'm with ianmcorvidae
      • 2013-11-18 32241, 2013

      • ruaok
        ianmcorvidae:
      • 2013-11-18 32244, 2013

      • ruaok
        that *is* effective
      • 2013-11-18 32250, 2013

      • ianmcorvidae
        posting the text for <a href="whatever">cheap watches lol!</a> is without any cost for the spammer
      • 2013-11-18 32211, 2013

      • ocharles
        under my suggestion, that bio would probably not have got through, because it would require a captcha to be posted
      • 2013-11-18 32219, 2013

      • navap
        Stack overflow has a point based system of increasing privileges. Can we borrow from that somehow?
      • 2013-11-18 32235, 2013

      • ianmcorvidae
        as our previous run-in with these people shows, since they were literally just posting in broken HTML that I'm not sure actually even made a link
      • 2013-11-18 32239, 2013

      • ruaok
        navap: we do that in a crude way already.
      • 2013-11-18 32200, 2013

      • ianmcorvidae
        which is why I said if we want to cut off the productivity of the spamming we add rel=nofollow and move on
      • 2013-11-18 32211, 2013

      • nikki
        oh, the other thing I was curious about was only allowing an email to be used once. there's not really any good reasons to have multiple accounts with the same email address (a few cases, but not many, and yeah, I know it can be worked around pretty easily, I'm not claiming it can't, it just seems more like a case of "why do we even allow that?")
      • 2013-11-18 32218, 2013

      • ianmcorvidae
        if we care about the cruft in our database we need to do something more proactive
      • 2013-11-18 32246, 2013

      • navap
        As a matter of pride, I think we should care about cruft in the DB
      • 2013-11-18 32254, 2013

      • ruaok
        nikki: if you enter me a ticket I will run a query to find out the number of unique emails vs accounts
      • 2013-11-18 32206, 2013

      • ianmcorvidae
        ruaok: some emails have as many as thousands of accounts
      • 2013-11-18 32213, 2013
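
      [Editor's sketch] The "unique emails vs accounts" query mentioned above might look roughly like the following. This is only an illustration: the `editor` table layout (`id`, `name`, `email`), the function names, and the sample rows are assumptions made for the demo, not the real MusicBrainz schema or data, and it runs against an in-memory SQLite database rather than the production one.

```python
import sqlite3


def emails_with_multiple_accounts(conn, min_accounts=2):
    """Return (email, account_count) rows for emails shared by several accounts.

    Emails are compared case-insensitively; rows without an email are skipped.
    """
    query = """
        SELECT lower(email) AS email_lc, count(*) AS accounts
        FROM editor
        WHERE email IS NOT NULL AND email <> ''
        GROUP BY lower(email)
        HAVING count(*) >= ?
        ORDER BY accounts DESC, email_lc
    """
    return conn.execute(query, (min_accounts,)).fetchall()


def demo_connection():
    """Build a throwaway in-memory database with a hypothetical editor table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE editor (id INTEGER PRIMARY KEY, name TEXT, email TEXT)"
    )
    conn.executemany(
        "INSERT INTO editor (name, email) VALUES (?, ?)",
        [
            ("alice", "a@example.com"),
            ("alice2", "A@example.com"),  # same address, different case
            ("bob", "b@example.com"),
            ("spam1", "s@example.com"),
            ("spam2", "s@example.com"),
            ("spam3", "s@example.com"),
            ("noemail", None),
        ],
    )
    return conn


if __name__ == "__main__":
    for email, accounts in emails_with_multiple_accounts(demo_connection()):
        print(email, accounts)
```

      Raising `min_accounts` (for example to the "more than 5" threshold floated a few lines later) narrows the list to the heaviest reusers. As the participants note, such a report would contain private email addresses and could not be made public.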

      • reodroid
        I care about the "fake" stats, not much about the accounts themselves
      • 2013-11-18 32213, 2013

      • ianmcorvidae
        we were looking the other week :P
      • 2013-11-18 32216, 2013

      • ruaok
        ianmcorvidae: ah, good.
      • 2013-11-18 32221, 2013

      • derwin
        as a DBA, I always care about cruft in DBs
      • 2013-11-18 32237, 2013

      • ruaok
        well, if one email has more than, say, 5 accounts, label them as a spammer and nuke all accounts?
      • 2013-11-18 32247, 2013

      • reodroid
        poor nikki
      • 2013-11-18 32249, 2013

      • ruaok
        maybe make a report from it first?
      • 2013-11-18 32227, 2013

      • ianmcorvidae
        we can't make such a report public, but possibly
      • 2013-11-18 32233, 2013

      • ruaok
        I care about cruft in the DB as well.
      • 2013-11-18 32244, 2013

      • ruaok
        let's pick off the low-hanging fruit.
      • 2013-11-18 32246, 2013

      • navap
        Email the flagged user and tell them to contact us, if they don't in 2 weeks, delete their accounts
      • 2013-11-18 32259, 2013

      • navap
        The flagged email*
      • 2013-11-18 32204, 2013

      • ruaok
        does anyone have any objections to the rel=nofollow suggestion?
      • 2013-11-18 32216, 2013

      • nikki
        it sounds sensible, whatever else we do
      • 2013-11-18 32217, 2013

      • ocharles
        +1 on nofollow to non-musicbrainz links
      • 2013-11-18 32230, 2013

      • ruaok
        to all links outbound?
      • 2013-11-18 32234, 2013

      • navap
        Are we then adding nofollow to relationship links as well?
      • 2013-11-18 32236, 2013

      • ruaok
        even in edit notes?
      • 2013-11-18 32241, 2013

      • ianmcorvidae
        edit notes yes, relationships no
      • 2013-11-18 32244, 2013

      • navap
        nofollow in edit notes yes
      • 2013-11-18 32245, 2013

      • ruaok
        ianmcorvidae: +1
      • 2013-11-18 32246, 2013

      • ianmcorvidae
        relationships require voting etc.
      • 2013-11-18 32259, 2013

      • ianmcorvidae
        the places for nofollow, for me, are:
      • 2013-11-18 32209, 2013

      • ianmcorvidae
        annotation, edit note, user bio, user homepage
      • 2013-11-18 32213, 2013

      • marcooliveira joined the channel
      • 2013-11-18 32213, 2013

      • ianmcorvidae
        I think that's all
      • 2013-11-18 32218, 2013
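
      [Editor's sketch] The rule agreed on above (nofollow for links in annotations, edit notes, user bios and user homepages, but not for links pointing at MusicBrainz itself) could be sketched as below. The function names and the `INTERNAL_HOSTS` set are hypothetical, and this is not the MusicBrainz server's code, which is written in Perl; it only illustrates the decision being discussed.

```python
from html import escape
from urllib.parse import urlsplit

# Assumption for this sketch: hosts that count as "MusicBrainz links".
INTERNAL_HOSTS = {"musicbrainz.org"}


def is_internal(url):
    """True if the URL's host is an internal host or one of its subdomains."""
    host = (urlsplit(url).hostname or "").lower()
    return any(host == h or host.endswith("." + h) for h in INTERNAL_HOSTS)


def render_user_link(url, text=None):
    """Render a user-supplied URL as an anchor tag.

    Outbound links get rel="nofollow" so search engines do not count them
    as endorsements; internal links are left followable so pages can still
    be crawled and indexed.
    """
    label = escape(text if text is not None else url)
    href = escape(url, quote=True)
    rel = "" if is_internal(url) else ' rel="nofollow"'
    return f'<a href="{href}"{rel}>{label}</a>'


if __name__ == "__main__":
    print(render_user_link("http://example.com/watches", "cheap watches lol!"))
    print(render_user_link("https://musicbrainz.org/doc/About"))
```

      Relationship links would simply not go through a helper like this, since the participants want search ranking to flow to official artist homepages and similar voted-on URLs.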

      • hawke_1
        Don’t you have to be logged in to see edit notes?
      • 2013-11-18 32224, 2013

      • reodroid
        what's nofollow?
      • 2013-11-18 32225, 2013

      • derwin
        don't know why we'd want nonfollow links anywhere?
      • 2013-11-18 32235, 2013

      • derwin
        err, not-nofollow
      • 2013-11-18 32254, 2013

      • ianmcorvidae
        nofollow says to search engines that you don't approve of this link, basically
      • 2013-11-18 32204, 2013

      • ianmcorvidae
        i.e. don't use it in calculations of googlejuice, as ruaok would put it
      • 2013-11-18 32223, 2013

      • derwin
        right, my understanding is we generally do not want googlejuice
      • 2013-11-18 32224, 2013

      • ruaok
        so, that helps the first bit, I'll enter a ticket for that.
      • 2013-11-18 32224, 2013

      • navap
        derwin: Inter-MB links should be followed, otherwise nothing would get indexed
      • 2013-11-18 32230, 2013

      • uk_
      • 2013-11-18 32236, 2013

      • navap
        Or is that intra-MB*
      • 2013-11-18 32236, 2013

      • derwin
        right, I'm saying out-links, navap.
      • 2013-11-18 32238, 2013

      • ruaok
        hi uk_ !
      • 2013-11-18 32238, 2013

      • ianmcorvidae
        and the reason we'd want relationships to not be nofollow'd is that for those we actually do want googlejuice flowing to official artist homepages, twitters, etc.
      • 2013-11-18 32249, 2013

      • uk_
        hi :)
      • 2013-11-18 32254, 2013

      • ianmcorvidae
        because those are useful outbound links
      • 2013-11-18 32203, 2013

      • ocharles nods
      • 2013-11-18 32206, 2013

      • derwin
        ah, yes, true. relationship links.
      • 2013-11-18 32224, 2013

      • ruaok
        ok I think we agree on nofollow.
      • 2013-11-18 32229, 2013

      • ianmcorvidae
        cool
      • 2013-11-18 32231, 2013

      • ruaok
        what's our next low-hanging fruit?
      • 2013-11-18 32250, 2013

      • navap
        Email sockpuppeting?
      • 2013-11-18 32259, 2013

      • ruaok
        yes, that is a good one.
      • 2013-11-18 32207, 2013

      • ruaok
        let's say it's easy to find these.
      • 2013-11-18 32207, 2013

      • ianmcorvidae
        switching captcha, probably, and some sort of investigation of merging accounts that share emails
      • 2013-11-18 32223, 2013

      • ruaok
        what is our course of action when we find them?
      • 2013-11-18 32237, 2013

      • ruaok
        navap's idea of mailing them and if no answer in two weeks, killing the account.
      • 2013-11-18 32238, 2013

      • ianmcorvidae
        I'd say our course of action in general is to support merging editors
      • 2013-11-18 32200, 2013

      • navap
        Merging sounds interesting
      • 2013-11-18 32203, 2013

      • ruaok
        would we have a user_redirect page?
      • 2013-11-18 32206, 2013

      • ocharles
        with a move to making emails unique in the database?
      • 2013-11-18 32210, 2013

      • ianmcorvidae
        ocharles: yes
      • 2013-11-18 32213, 2013

      • reodroid
        huh
      • 2013-11-18 32217, 2013

      • ianmcorvidae
        well, a move towards that
      • 2013-11-18 32229, 2013

      • ruaok
        and prevent sign ups if that email is already in use?
      • 2013-11-18 32232, 2013

      • reodroid
        how is that useful for unused accounts?
      • 2013-11-18 32236, 2013

      • Freso
        Editors as entities!
      • 2013-11-18 32243, 2013

      • ianmcorvidae
        I think we'd want legitimate users to be given a chance to change to a different email or merge, at their discretion
      • 2013-11-18 32245, 2013

      • Freso re-hides
      • 2013-11-18 32246, 2013

      • navap
        I think there are strong cases for having the same email for 2 or 3 accounts, but those are very very rare
      • 2013-11-18 32255, 2013

      • derwin
        frankly, that's surprising to me. and I dunno about merging, usually hard.
      • 2013-11-18 32257, 2013

      • navap
        Just people in here
      • 2013-11-18 32211, 2013

      • Freso
        Bots, etc.?
      • 2013-11-18 32211, 2013

      • ianmcorvidae
        navap: I think that using gmail's +whatever or a second email that forwards is perfectly fine there though
      • 2013-11-18 32222, 2013

      • ianmcorvidae
        having looked at this
      • 2013-11-18 32237, 2013

      • ianmcorvidae
        most of our duplicate emails are either people who wanted to change their username but couldn't, or probably-spammers
      • 2013-11-18 32251, 2013

      • ianmcorvidae
        even nikki mostly uses different emails on a string comparison basis :P
      • 2013-11-18 32242, 2013

      • ruaok
        ianmcorvidae: do you have any numbers laying around for how common this is?
      • 2013-11-18 32248, 2013

      • ruaok
        if not, maybe collect some for next week's meeting?
      • 2013-11-18 32249, 2013

      • ianmcorvidae
        not laying around, no
      • 2013-11-18 32201, 2013

      • ianmcorvidae
        I sent nikki a report thingy, maybe I can hunt that down again
      • 2013-11-18 32201, 2013

      • ruaok
        and then we can pick this up once we have some numbers to look at.
      • 2013-11-18 32202, 2013

      • nikki
        how would merging work? I'm not sure I agree with having people who are already well-known go around changing their usernames all the time by merging into a new one
      • 2013-11-18 32209, 2013

      • ruaok
        and also to ponder this more.
      • 2013-11-18 32226, 2013

      • ocharles
        i again don't really see what the problem we're trying to solve is
      • 2013-11-18 32232, 2013

      • ocharles
        is this a numbers thing?
      • 2013-11-18 32239, 2013

      • ruaok
        clean up cruft in our db.
      • 2013-11-18 32252, 2013

      • ruaok
        we aim to have a clean db, therefore want the spammer user accounts gone
      • 2013-11-18 32204, 2013

      • derwin
        "why" ?
      • 2013-11-18 32220, 2013

      • derwin
        I mean, as a DBA, I <3 clean DB.. but.. in reality often not justifiable?
      • 2013-11-18 32221, 2013

      • ruaok
        we don't accept clutter elsewhere. why accept it here?
      • 2013-11-18 32227, 2013

      • ianmcorvidae
        there's a bunch of different things we want, I think, and we're helpfully talking about all of them at once
      • 2013-11-18 32228, 2013

      • ocharles
        because it's not public
      • 2013-11-18 32239, 2013

      • ocharles
        we have cruft in the code too, and we live with that :)
      • 2013-11-18 32251, 2013

      • ianmcorvidae
        yeah, I mean, that argument can apply to, say, tags and cdstubs too
      • 2013-11-18 32221, 2013

      • reodroid
        IMO mostly because we don't have n million real editors and it feels misleading to claim that because of spammers
      • 2013-11-18 32230, 2013

      • ruaok
        reodroid: +1
      • 2013-11-18 32233, 2013

      • ocharles
        reodroid: surely we can formulate better queries to get better insight
      • 2013-11-18 32238, 2013

      • marcooliveira joined the channel
      • 2013-11-18 32243, 2013

      • reodroid
        I might as well keep them if we can not count it for stats
      • 2013-11-18 32246, 2013

      • ocharles
        we have more options than just SELECT count(*) FROM editor
      • 2013-11-18 32258, 2013

      • derwin
        every "[x] has [y] users" stat you have ever seen, ever, has been misleading in the same way.
      • 2013-11-18 32203, 2013

      • ruaok
        ocharles: I'd love for you to spend a couple of hours on this when you can.
      • 2013-11-18 32207, 2013

      • ianmcorvidae
        improving the query for count.editor.valid and maybe adding a count.editor.inactive stat would be reasonable
      • 2013-11-18 32209, 2013
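
      [Editor's sketch] To illustrate the point that there are "more options than just SELECT count(*) FROM editor", the following shows a few alternative counts side by side. Everything here is an assumption for the sake of the example: the column `email_confirm_date`, the `edit.editor` foreign key, the stat names in `QUERIES`, and the sample rows. In particular, these are not the actual definitions of count.editor.valid or of the proposed count.editor.inactive.

```python
import sqlite3

# Hypothetical stat names mapped to queries over an assumed schema.
QUERIES = {
    "total": "SELECT count(*) FROM editor",
    "confirmed": (
        "SELECT count(*) FROM editor WHERE email_confirm_date IS NOT NULL"
    ),
    "with_edits": (
        "SELECT count(*) FROM editor e "
        "WHERE EXISTS (SELECT 1 FROM edit WHERE edit.editor = e.id)"
    ),
    "inactive": (
        "SELECT count(*) FROM editor e "
        "WHERE NOT EXISTS (SELECT 1 FROM edit WHERE edit.editor = e.id)"
    ),
}


def editor_counts(conn):
    """Run every query in QUERIES and return {stat_name: count}."""
    return {name: conn.execute(sql).fetchone()[0] for name, sql in QUERIES.items()}


def demo_stats_connection():
    """In-memory database with made-up editors and edits."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE editor (id INTEGER PRIMARY KEY, name TEXT, "
        "email_confirm_date TEXT)"
    )
    conn.execute("CREATE TABLE edit (id INTEGER PRIMARY KEY, editor INTEGER)")
    conn.executemany(
        "INSERT INTO editor (id, name, email_confirm_date) VALUES (?, ?, ?)",
        [
            (1, "regular", "2013-01-01"),   # confirmed, has edits
            (2, "lurker", "2013-02-01"),    # confirmed, no edits
            (3, "spam_signup", None),       # unconfirmed, no edits
            (4, "unconfirmed", None),       # unconfirmed, one edit
        ],
    )
    conn.executemany("INSERT INTO edit (editor) VALUES (?)", [(1,), (1,), (4,)])
    return conn


if __name__ == "__main__":
    print(editor_counts(demo_stats_connection()))
```

      Reporting several such numbers together, rather than a single headline total, is one way to keep spam signups from inflating the editor count without deleting any accounts.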

      • ruaok
        see if you can come up with anything interesting.
      • 2013-11-18 32215, 2013

      • ocharles
        ruaok: ok, what bit is the "this"?
      • 2013-11-18 32226, 2013

      • ruaok
        "we have more options than just SELECT count(*) FROM editor"
      • 2013-11-18 32227, 2013

      • ocharles
        ok
      • 2013-11-18 32230, 2013

      • reodroid
        derwin: probably, that doesn't mean we shouldn't try to avoid it :)
      • 2013-11-18 32237, 2013

      • ocharles
        i'd be happy to do some analysis there
      • 2013-11-18 32239, 2013

      • derwin
        it may? heh.
      • 2013-11-18 32253, 2013

      • ruaok
        ok, let's leave it here for now.
      • 2013-11-18 32202, 2013

      • ruaok
        we've spent 40 minutes on this already
      • 2013-11-18 32208, 2013

      • ianmcorvidae
        I'll try to have some numbers on email reuse for next week.
      • 2013-11-18 32222, 2013

      • ruaok
        we have one concrete step (nofollow) and then some investigation by both ianmcorvidae and ocharles
      • 2013-11-18 32229, 2013

      • ruaok
        then next week we look at it again.
      • 2013-11-18 32235, 2013

      • ruaok
        sound reasonable?
      • 2013-11-18 32248, 2013

      • ruaok
        ianmcorvidae: OAuth2 MAC auth
      • 2013-11-18 32259, 2013

      • ianmcorvidae
        I'm also happy to look at stats if ocharles would rather look at his in-progress stuff
      • 2013-11-18 32202, 2013

      • ianmcorvidae
        anyway
      • 2013-11-18 32203, 2013

      • ianmcorvidae
        MAC auth.
      • 2013-11-18 32219, 2013

      • ianmcorvidae
        so we currently support two varieties of authentication with OAuth
      • 2013-11-18 32223, 2013

      • ianmcorvidae
        bearer tokens and MAC tokens
      • 2013-11-18 32243, 2013

      • ocharles
        ianmcorvidae: that might work better, now that i have more work attribute stuff to do
      • 2013-11-18 32245, 2013

      • ianmcorvidae
        bearer is much easier to use and actually standardized; MAC was added, as luks said, because at the time he assumed that using https requests in picard would be a no-no
      • 2013-11-18 32205, 2013

      • ianmcorvidae
        MAC auth is also misimplemented in a way that makes the tests very very angry with me on perl 5.18
      • 2013-11-18 32213, 2013

      • ocharles
        :)
      • 2013-11-18 32222, 2013

      • ianmcorvidae
        we could fix it, but I'd like to propose we just remove MAC auth instead.
      • 2013-11-18 32224, 2013

      • ocharles
        does MAC auth actually work on production?