SPAM, SPAMmers, SPAM Filters, and SPAM Improvements

Topics: Business Logic Layer, Controls
Feb 17, 2010 at 7:39 PM

Ever since 1.6 was launched, it appears the SPAM goblins have increased their armies in order to attack this new release.  In other words, I'm getting BOMBED with attempted SPAM much more heavily than I was with the previous version.  For example, I probably have received 300 auto-generated SPAM attempts today...and they keep hammering.

Is everyone else seeing this?

The good news is that most of them are not getting through.  Akismet normally runs at about 70% success, but the blacklisting is kicking in and many more are getting blocked.

So...on to some recommendations and/or open discussions on the topic.

1)  Would it be a good idea to check the IP against the blacklist during page generation and not even provide the option to add a comment if it is listed?  (The ability to show some text in place of the form would be nice.)

2)  If I were to delete everything in the SPAM section of the Comments tab, would BE forget everything it's learned (i.e. blacklisting, etc.)?  Or is the required information stored somewhere permanently?

3)  Has anyone developed any new SPAM filters beyond the two built-in?

Cheers, appreciate any commentary!

AL

Coordinator
Feb 17, 2010 at 8:57 PM

Checking the blacklist when the page is served, and not even showing the comment form to a blacklisted visitor, is a good idea.

Yes, if you delete everything on the Spam tab then BE will forget which items have been blacklisted -- it doesn't maintain a separate list.  Keeping a separate list, so that the spam comments themselves can be deleted, would be a good future improvement.

Feb 17, 2010 at 10:07 PM
Edited Feb 17, 2010 at 10:08 PM

RE #1:  I would think this might be accomplished via an Extension?  I don't possess the full know-how, but if the data is available to an Extension, I'd think this might be pretty easy to set up.  Anyone up for it?

RE #2:  Thanks for the info, this is good to know and should be documented.  I almost purged them yesterday.

Cheers,
AL

Coordinator
Feb 18, 2010 at 7:44 AM

For not showing the comment form, doing this in CommentView.ascx.cs in the User Controls folder would make the most sense.  There is already code in there that controls whether the comment form is shown.  It looks to see if comments are "enabled", and if so, the phAddComment placeholder is made visible -- otherwise it's made invisible.  This phAddComment is the same comment form we are talking about.

It would just need to know whether the current IP address is a blacklisted IP address.  I don't think there's currently a single convenient flag that indicates whether an IP address is blacklisted.  It would need to duplicate some of the code already in CommentHandlers.cs (in the core).  The code is in the ModeratedByRule() function.

Feb 18, 2010 at 2:29 PM

Learning that I have to keep the spam is a bit disappointing.  Learning that 1.6 will still receive almost as much spam as before is also disappointing. I agree that a separate list should be maintained for spammers as well as a permanent white list.

I think blocking the comment field is a great idea. I'll look into making the change on my own site.

Ever since being bombarded with spam I've been looking at BE to see in what ways it exposes site owners to spam.  The removal of our email addresses from the RSS feed was a big plus in my book, though it may be too late since the addresses were already out there. I'm a big proponent of keeping addresses hidden unless explicitly given to someone.

I have a few questions:

Is there a way to disable the "Notify me when new comments are added" check box, and what address does the email come from?

 

Feb 19, 2010 at 12:47 AM
Edited Feb 19, 2010 at 5:19 AM

AL - your idea was really good.  I followed Ben's advice and took the code out of the ModeratedByRule method and created a new one called IsBlacklisted(string IP, string Email) - which returns true if the user is blacklisted, false if they are whitelisted, or null otherwise.  
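A minimal sketch of what a tri-state check like this could look like, for readers who want the shape of the idea without opening the files. This is not the code from the patch itself: the StoredComment class, the BlacklistCheck class, and the two threshold parameters are hypothetical stand-ins, and the matching rule (same IP, or same email when one is supplied) is an assumption based on the point made earlier in the thread that BE derives its blacklist from the stored spam comments rather than from a separate list.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a stored comment; not BlogEngine.NET's real Comment class.
public sealed class StoredComment
{
    public string IP { get; set; }
    public string Email { get; set; }
    public bool IsApproved { get; set; }
    public bool IsSpam { get; set; }
}

public static class BlacklistCheck
{
    // Returns true if the visitor looks blacklisted, false if whitelisted,
    // and null when the comment history gives no verdict either way.
    // blackListCount and whiteListCount are assumed thresholds (use values >= 1).
    public static bool? IsBlacklisted(
        IEnumerable<StoredComment> history, string ip, string email,
        int blackListCount, int whiteListCount)
    {
        // A stored comment "matches" when it came from the same IP or,
        // if an email was supplied, from the same email address.
        List<StoredComment> matches = history
            .Where(c => c.IP == ip ||
                        (!string.IsNullOrEmpty(email) &&
                         string.Equals(c.Email, email, StringComparison.OrdinalIgnoreCase)))
            .ToList();

        int spamCount = matches.Count(c => c.IsSpam);
        int approvedCount = matches.Count(c => c.IsApproved && !c.IsSpam);

        if (spamCount >= blackListCount) return true;       // blacklisted
        if (approvedCount >= whiteListCount) return false;  // whitelisted
        return null;                                        // undecided
    }
}
```

With a check of this shape, the comment form would be hidden only when the result is exactly true, so undecided visitors still see the form and go through the normal filters.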

I then made a quick patch to the CommentView user control to hide the comment if the user turns out to be blacklisted.  I have a much more detailed writeup of the changes on my blog ( http://www.bloodforge.com/post/Disabling-Comments-for-Blacklisted-IP-addresses.aspx ). I also have a zip on my site with the two updated files, although all of the code changes are in the post in case you have your own custom versions of those files.

At this point I'm still able to post comments, so at least I didn't break that :)

Let me know if you see any obvious bugs! Or post your own solution, which may be cleaner than mine.  I just spent a few minutes on this, so it's possible I missed something.

Feb 19, 2010 at 2:13 AM

I just updated my tutorial on how to implement captcha for BlogEngine version 1.6. I tried running without it but I was getting spammed to death...

http://www.codecapers.com/post/How-to-Block-Spam-Comments-in-BlogEngineNET.aspx

If you had this implemented before, you only need to change the method being called by the btnSaveAjax button. In 1.5 it was invoking BlogEngine.addComment() and in 1.6 it invokes BlogEngine.validateAndSubmitCommentForm() instead. 

It would really be nice if BlogEngine.NET could work towards supporting ReCaptcha out of the box. The Captcha solution I have documented is OK but it is not as nice as ReCaptcha.

Coordinator
Feb 19, 2010 at 3:30 AM

I just put Michael's captcha solution (link in above post) on my blog yesterday (actually a slightly modified version).  So far it's cut out all the auto generated spam I was getting ... great so far.  Based on these results, I'd recommend it.

The Akismet filter is good too.  But I found that the spam I was getting was bot-type spam, where a comment was being left within 2 to 3 seconds of the page coming up.  I'm sure this is related to one of those automated spam tools that pulls up a blog post, finds the fields for the comment details, fills them in, and submits the comment.

Maybe we'll put Recaptcha into the next version of BE.  I think any extra captcha control like this is good -- even if it isn't recaptcha.  If this did go into BE, my guess is that it would be a setting that could be turned on or off.

Feb 19, 2010 at 12:41 PM

Michael's captcha solution seems to work fine for me too.  Thanks! :)

One thing I've noticed though, the captcha is not refreshed after you successfully post a comment.

Feb 19, 2010 at 7:38 PM

I have added Michael's captcha (excellent feature) as described. My problem is that DoCheckCaptcha does not work and therefore validation is false and no comment will be saved.
My question: I have ASP.NET 2.0. Is that enough, or do I need to install a later version? Or does anyone have a tip for me?
Thank you in advance

Coordinator
Feb 19, 2010 at 10:33 PM

The DoCheckCaptcha() function basically works just like the normal comment process -- it goes to the server via an asynchronous request and gets a result back.

It "doesn't work" meaning it always says that an incorrect captcha code has been entered?

Make sure you have turned on session state within the web.config file -- in case you missed that step.

Feb 20, 2010 at 1:09 AM
Edited Feb 20, 2010 at 3:14 AM

I was wondering if we could use an animal (animated or not) drawn as line art as a CAPTCHA? I saw this once on "NEWSCIENTISTVIDEO" and I think it's a great idea.

hmm... I think I should make something like this.

http://www.youtube.com/watch?v=dE4arbdM-D4

Feb 20, 2010 at 9:45 AM

Hi Guys - I just installed 1.6 this morning, and I am visiting the forum just to see if there was any discussion on this topic ... I too have been getting dozens of spam comments a day.  I will try Michael's Captcha solution, that's exactly what I was looking for.  I'm not a developer, but I think I can work through this. But even if I figure it out, I would really like to see this built in so that I won't have to RE-do it the next time I move up to a new build :-)

And good work on all the latest changes, much appreciated!

David Workman

http://www.HalfBakedLunatic.com

 

Feb 20, 2010 at 1:49 PM
Edited Feb 20, 2010 at 1:53 PM

Hi Ben - Thank you for your answer

Session state is true; the RefreshCaptcha function works correctly. No matter whether the captcha code entered is right or wrong, there is no message from the system and the comment is not saved. A blank captcha field is caught by the RequiredFieldValidator.
I installed BlogEngine from scratch, including the captcha, but it also fails ;-(
Is it possible that I have to change settings on my PC (IIS 7.0, ASP.NET 2.0, Vista, VWD) for asynchronous operation?
I hope you have an additional tip for me.

Coordinator
Feb 20, 2010 at 6:39 PM

ha123 - Do comment previews work?  Comment previews also make async callbacks to the server, but are done without the Captcha checking.  So if previews work, then we'll know the async part of this is okay.

If you have Firefox available, I would check in its Error Console (Tools -> Error Console) when you try to submit a comment.  You might see an error appear in there which could help.

Feb 21, 2010 at 5:46 PM

Hi Ben, I have probably narrowed down the error with a debug run: the debugger flags an error at the line string img = Session["AlphaCaptchaCode"].ToString().ToLower();  in CommentView.ascx.cs (NullReferenceException). If I understand correctly, this means that session state is not OK.
I tried several variations in web.config (<pages enableSessionState="true" ....or/and  <sessionState cookieless="UseCookies"...) and directly on the pages ( EnableSessionState="True"), but without success. When I only set  _Callback = "1"; it partially runs.
If you have one more tip, it would be very welcome. Otherwise I had better wait with a captcha until a solution is someday built into BlogEngine.
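Whatever the root cause turns out to be, one defensive option is to treat a missing session value as a failed captcha check instead of letting the lookup throw. A minimal sketch, under stated assumptions: a plain dictionary stands in for the ASP.NET Session object (whose indexer returns null for a missing key), and the CaptchaCheck class and IsCaptchaValid method are hypothetical names, not part of the captcha tutorial. This does not repair session state; it only turns the NullReferenceException into a normal "wrong code" result that is easier to diagnose.

```csharp
using System;
using System.Collections.Generic;

public static class CaptchaCheck
{
    // Compares the user's input with the code stored under "AlphaCaptchaCode".
    // The dictionary is a stand-in for the real Session object.
    public static bool IsCaptchaValid(IDictionary<string, object> session, string userInput)
    {
        object stored;
        if (session == null ||
            !session.TryGetValue("AlphaCaptchaCode", out stored) ||
            stored == null)
        {
            // No code in the session: fail the check rather than throw.
            return false;
        }

        // Case-insensitive comparison, matching the ToLower() in the original line.
        return string.Equals(stored.ToString(), (userInput ?? "").Trim(),
                             StringComparison.OrdinalIgnoreCase);
    }
}
```

In the real code-behind the same idea would be a null check on Session["AlphaCaptchaCode"] before calling ToString() on it.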

Feb 21, 2010 at 6:35 PM

Thanks for the great activity on this thread, folks!

I just re-installed Michael Ceranski's CAPTCHA solution, works like a charm.  Can't wait for the BOT-SPAM to stop...

Has anyone tried Filip Stanek's implementation of my idea yet?  Anxious to give this one a go as well.  Filip, did you mean to pass String.Empty into IsBlacklisted instead of the actual email that may have persisted from a previous comment?  Also, regarding this - I still think there might be a way to turn this into an Extension.  I've seen extensions add/modify Body Content of a post, I would assume there'd be a way to modify the Visibility of the Comment Form in a similar fashion, no?  Just trying to make it easier for the masses to adopt (and possibly get your code quickly put into the source!)

Question for Ben:  Since you stated that we cannot delete SPAM comments without BlogEngine "forgetting" information required for blacklisting, etc...  Is there a way to disable display of SPAM comments on the blog when logged in as an admin?  It's annoying to have to sift through hundreds of SPAM comments when you're trying to reply to legitimate ones. 

Thanks again all!

Cheers,
AL

Feb 22, 2010 at 6:49 AM

Al - The reason why String.Empty is being passed into the email is because this solution works from IPs only. Also, I do not have an email address to pass in at the time the page renders, as the Email address comes from the comment form, and the whole point is not to show the comment form to a blacklisted user.

Also, I've received 0 spam in the past couple of days, but I think this has more to do with implementing a reCaptcha than it does with this method.  But I guess I really can't be certain of that, as I do not track the number of times that the comment form was not shown because the user is blacklisted.

I'm sure there is a way to make this into an extension; however, there would need to be some modifications to CommentView.ascx.cs. I do not believe that there is currently an event that is dispatched when the comment form is rendered, and something like that would need to be in place for an extension to be developed ( or, as an alternative, we would need to be able to set Post.IsCommentsEnabled to false for the current page load only ).

 

Coordinator
Feb 22, 2010 at 7:33 AM

ALBsharah: Yes, you can omit spam comments.  Spam comments are simply "unapproved" comments.  In the User Controls folder, there is CommentView.ascx.cs.  There are two possible places to make a change (you can optionally change both places).  If your theme has nested comment support, then in AddNestedComments() you'll find this snippet (as part of a longer line of code):

|| (!comment.IsApproved && Page.User.Identity.IsAuthenticated)

Remove/delete that entire fragment.

If your theme doesn't have nested comment support, the code to change is in Page_Load (in the same file).  You'll find this:

//Add unapproved comments
if (Page.User.Identity.IsAuthenticated)

I would change this to:

if (false)

.... after making these change(s), you won't see unapproved comments on the blog, and that includes spam comments (since spam comments are nothing but unapproved comments).  But you'll still see the comments like normal on the Comments tab in the control panel.
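If hard-coding the change feels too permanent, the same decision can be expressed as a single rule with a switch. A minimal sketch, assuming a hypothetical showUnapprovedToAdmins flag (BE has no such setting as far as this thread indicates) and a hypothetical CommentVisibility helper:

```csharp
public static class CommentVisibility
{
    // Decides whether one comment should be rendered on the post page.
    // Approved comments are always shown; unapproved ones (which includes spam)
    // are shown only to a logged-in user, and only while the flag is on.
    public static bool ShouldRender(
        bool isApproved, bool viewerIsAuthenticated, bool showUnapprovedToAdmins)
    {
        return isApproved || (viewerIsAuthenticated && showUnapprovedToAdmins);
    }
}
```

Passing false for the flag gives the same result as the edits described above; passing true gives the stock behaviour where a logged-in admin sees everything.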

Feb 23, 2010 at 5:33 AM

To the point of checking the ip addresses in the spam queue (btw, boo on having to keep the spam comments around.  I've been deleting them, and wondering why spam is still getting through), I would highly recommend also using the Project: Honey Pot API to perform a check against the IP address as well.  It's really simple to implement (it's a DNS lookup, .NET has all the tools needed to do it already) which uses the IP address returned to determine the threat level of the IP address based on other activity that has been monitored from that IP address.
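A rough sketch of that DNS lookup, as I understand the http:BL format: the query name is the access key, then the visitor's IPv4 octets in reverse order, then dnsbl.httpbl.org, and a listed address answers with 127.<days since last activity>.<threat score>.<visitor type>. The HttpBl class, its method names, and the minThreat threshold are hypothetical, and the access key has to be your own; check the Project Honey Pot documentation before relying on any of the details.

```csharp
using System;
using System.Net;
using System.Net.Sockets;

public static class HttpBl
{
    // Builds the DNS name to query: key, reversed IPv4 octets, then the http:BL zone.
    public static string BuildQuery(string accessKey, string ipv4)
    {
        string[] octets = ipv4.Split('.');
        if (octets.Length != 4)
            throw new ArgumentException("Expected a dotted IPv4 address.", "ipv4");
        Array.Reverse(octets);
        return accessKey + "." + string.Join(".", octets) + ".dnsbl.httpbl.org";
    }

    // Splits an answer of the form 127.days.threat.type into its parts.
    public static bool TryParseResponse(IPAddress response, out int days, out int threat, out int type)
    {
        days = 0;
        threat = 0;
        type = 0;
        byte[] b = response.GetAddressBytes();
        if (b.Length != 4 || b[0] != 127) return false;
        days = b[1];
        threat = b[2];
        type = b[3];
        return true;
    }

    // Performs the lookup. A failed resolution is treated as "not listed".
    public static bool IsSuspicious(string accessKey, string ipv4, int minThreat)
    {
        try
        {
            foreach (IPAddress answer in Dns.GetHostAddresses(BuildQuery(accessKey, ipv4)))
            {
                int days, threat, type;
                // Type 0 is assumed to mean a known search engine, so it is skipped.
                if (TryParseResponse(answer, out days, out threat, out type)
                    && type != 0 && threat >= minThreat)
                {
                    return true;
                }
            }
            return false;
        }
        catch (SocketException)
        {
            return false; // no DNS answer: the address is not listed
        }
    }
}
```

Since the lookup adds a network round trip, it probably belongs at comment submission time (or behind a cache) rather than on every page view.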

Feb 27, 2010 at 5:28 AM
Edited Feb 27, 2010 at 5:28 AM

Thanks for the replies, guys.

So, another quirk today.  Many of my comments in the "Inbox" have checkboxes that are greyed out and cannot be checked.  Some of them are SPAM and I can't do anything with them from that page.  I have to open it up (click on the body) and mark it SPAM from there. 

I have a number of these in my inbox.

Thoughts?

Feb 27, 2010 at 7:29 AM
Edited Feb 27, 2010 at 10:33 AM
ALBsharah wrote:

Thanks for the replies, guys.

So, another quirk today.  Many of my comments in the "Inbox" have checkboxes that are greyed out and cannot be checked.  Some of them are SPAM and I can't do anything with them from that page.  I have to open it up (click on the body) and mark it SPAM from there. 

I have a number of these in my inbox.

Thoughts?

Al,

The checkbox gets disabled if the comment has any "child" comments. You would have to delete the child comments before you can delete/spam the parent.

Mar 2, 2010 at 3:02 AM
Edited Mar 2, 2010 at 3:04 AM

Thanks for the response, Filip.  Makes sense.

All - a couple more thoughts came up today:

1)  When someone comments, and the email comes to me, why has the "from" field been changed to MY email address?  I liked it better when the "from" field was the name/email of the person who actually commented.

2)  Can we add some more information into the email?
a)  There are links for "Delete" and "Approve"...please add one for "Spam"
b)  Please add the status of the comment.  Some examples might be:

Inbox: AkismetFilter
Inbox: Rule:authenticated
Spam: Rule:blacklist
Spam: AkismetFilter

(you get the point)

That would eliminate the need to bounce into the interface if a comment was handled the way it should have been (today we have no idea unless we log in).  This also allows for one-click fixing of how it was processed.

3)  Following on from my previous post (why checkboxes were greyed out), it would be nice if the comment moderation system displayed a tiered view of comments so you could individually delete one, a parent, or a tree of them if necessary (similar to the blog interface options).

Cheers!

Coordinator
Mar 2, 2010 at 3:20 AM

ALBsharah:  Regarding # 1, this change in the From email address was made because a lot of mail servers don't allow you to send email where the From address is not an email address maintained by the mail server that is relaying the email (i.e. the SMTP server).  This is spoofing, and these mail servers require that the From address be your email address.

Even though the From address is your address, the ReplyTo address is the email address of the commenter.  So when you click Reply, by default, you'll be sending the reply to the commenter.  You can change this if you'd like in the SendCommentMail.cs extension in the App_Code\Extensions folder.
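For anyone editing that extension, the From/ReplyTo split looks roughly like this with System.Net.Mail. This is a sketch, not the contents of SendCommentMail.cs: the CommentMail class, the Build method, and its parameters are hypothetical. It uses ReplyToList, which is available from .NET 4 onward; on .NET 2.0/3.5 the single ReplyTo property plays the same role.

```csharp
using System.Net.Mail;

public static class CommentMail
{
    // Builds a notification mail that is sent from the blog owner's own address
    // (so the SMTP server accepts it) but replies go to the commenter.
    public static MailMessage Build(
        string ownerEmail, string commenterName, string commenterEmail,
        string subject, string body)
    {
        MailMessage mail = new MailMessage();
        mail.From = new MailAddress(ownerEmail);
        mail.To.Add(ownerEmail);
        mail.ReplyToList.Add(new MailAddress(commenterEmail, commenterName));
        mail.Subject = subject;
        mail.Body = body;
        return mail;
    }
}
```

Clicking Reply in a mail client then addresses the commenter, even though the message itself was relayed under the owner's address.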

The Contact page has been changed to work the same way, incidentally.

Mar 21, 2010 at 3:29 PM
BenAmada wrote:

I just put Michael's captcha solution (link in above post) on my blog yesterday (actually a slightly modified version).  So far it's cut out all the auto generated spam I was getting ... great so far.  Based on these results, I'd recommend it.

 Hi Ben,

Can you put your modifications up somewhere?

Steven

Coordinator
Mar 21, 2010 at 8:48 PM

Steven, I don't think I have those modifications anymore  :(   I lost them when I switched to the Recaptcha control.

The Recaptcha control has the same type of modifications that I made to Michael's solution -- which I briefly outline in the 2nd post of the Recaptcha control thread.

Mar 21, 2010 at 8:57 PM

Thanks Ben, I'll check it out

Apr 6, 2010 at 11:37 AM

Can I make a small suggestion?  It would make far more sense if we had granular control over the comment mails.  So, if a comment is flagged as spam then we don't get notified (or we get a summary notification every day).  However, if a comment is flagged as safe then we do get a notification mail.

Apr 6, 2010 at 11:25 PM

Dave,

I believe Rtur has added that feature in the 1.6.0.2 update.