Subject: Re: [chairs] SPAM
Chairs:

I'll open another can of worms and jump into this :-)

I agree with you wholeheartedly, Duane, that this is a problem. I'll bet that I get more spam than you do (a few hundred a day). And I have no doubt that all this is because of spammers harvesting addresses from our list archives.

Of course a knee-jerk reaction would be to close off the archives so that nobody can get to them, but given that the OASIS philosophy is openness and accountability we need to keep things open and accessible.

There seem to be two possible solutions: either disguise the addresses stored in the archives, or somehow block access so that only a human can get through. (I don't think that we want to go down the path of an offensive strategy such as what Duane suggests.)

Lacking a foolproof Turing test to allow only human access to the archives, I think the best and easiest solution will probably be to disguise the email addresses attached to each message so that whatever is harvested is unusable by spammers. The disguise would have to be such that the harvester would not be able to accurately or easily recreate the address. Obviously substituting the word "at" for the @ sign isn't going to fool anybody for very long. But whatever we do must not disguise the actual identity of the sender; we need to know who sent the message.

A final question is whether it is necessary for a person to be able to respond to a message he found in the archives; i.e. does the guy on the street need to be able to figure out how to respond to Duane when he reads something that Duane wrote? Perhaps this requirement is not so important, as TC members already know how to respond to the TC list, and the guy on the street is already given instructions for sending a comment to the TC.

If the above is acceptable then perhaps I could suggest (and please note, this is just a strawman for discussion, not an official OASIS proposal) that we delete some portion of the address after the @ sign. We could delete all of it, leaving just "duane@", for example, but then we lose any idea about what company Duane was at, whether Yellow Dragon or Adobe (and it may be important for IPR reasons to know). So maybe we could leave the first couple of characters after the @ sign, resulting in "duane@ye" or "duane@ad". If we left three characters then we'd get "sun" and "ibm" etc., which would make it possible to reconstruct the address. But then again with only two we would get "hp".

So, any comments on whether it should be a requirement for a human to still be able to figure out the email address? And, if that's not a requirement, what do you think of my above suggestion?

-Karl

p.s. Duane, I hope you don't mind me using you as the example :-)

Duane Nickull wrote:
> I am getting ruthlessly spammed and every day it increases.
>
> After careful analysis, I have deduced that my email address is most
> often harvested from OASIS list archives.
> I would favor setting up a system that makes it harder for spammers to
> harvest email addresses from this list by confusing the heuristic filters.
>
> Others have done something like this to fight it
>
> dnickull(at)adobe.com - replace the (at) with the "@" sign to email.
>
> but this is too easy to program around.
>
> I couldn't sleep last night and came up with a more devious plot to foil
> the spammers. What if we adopted both a defensive and offensive
> strategy? First of all, if we defensively replaced all the email
> addresses in the archives with something that confused the spam
> harvesters like
>
> "dnickull" + [some_randomness_here] + domainname + {something else to
> hide the domain suffix - .com, .org, .gov}
>
> that would potentially cut down email addresses getting harvested.
>
> Second, as an offensive weapon, make some dynamic pages that
> detect patterns in the log files of a bot looking for email addresses
> (such as a repeated get() for more than 10 archive pages within a
> certain timeframe) and generate hundreds of email addresses
> that are invisible to the human eye, but would be based on the URL the
> get originated from.
> For example, if I send a request to get() the archives for OASIS
> from IP address 216.154.143.253, the page would generate 100's of hidden
> email addresses, all @216.154.143.253. The IP address is a readily
> available environmental variable within an HTTP request scenario.
>
> To the casual observer, there would be no difference in the page display
> but to a spam email harvester, this would add 100's (perhaps 1,000's) of
> emails that would end up with the spam harvester being the victim of
> their own spam.
>
> This could be both funny and help solve the problem. This would also
> not be too hard IMO to implement.
>
> Thoughts?
>
> Duane
>

--
=================================================================
Karl F. Best
Vice President, OASIS
office +1 978.667.5115 x206
mobile +1 978.761.1648
karl.best@oasis-open.org
http://www.oasis-open.org
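
For discussion purposes, the strawman above (keeping only the first couple of characters after the @ sign) could be sketched roughly as follows in Python. This is only an illustration, not anything OASIS runs: the names truncate_domain, disguise_addresses and ADDRESS_RE are hypothetical, and the address pattern is deliberately simplistic.

```python
import re

# Hypothetical, deliberately simple address pattern for illustration only;
# it does not try to cover everything RFC 5322 allows.
ADDRESS_RE = re.compile(r"([A-Za-z0-9._%+-]+)@([A-Za-z0-9.-]+\.[A-Za-z]{2,})")


def truncate_domain(address: str, keep: int = 2) -> str:
    """Keep the local part and only the first `keep` characters after the @."""
    local, _, domain = address.partition("@")
    return f"{local}@{domain[:keep]}"


def disguise_addresses(text: str, keep: int = 2) -> str:
    """Apply the same truncation to every address-like string in `text`."""
    return ADDRESS_RE.sub(
        lambda m: f"{m.group(1)}@{m.group(2)[:keep]}", text
    )
```

With keep=2, truncate_domain("duane@adobe.com") gives "duane@ad", matching the example in the message. The weakness raised there also shows up directly: a two-letter domain such as "hp.com" still truncates to "hp", so a short domain can be guessed regardless of where the cut is made.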