Tag Based Related Posts

Topics: Controls
Mar 31, 2011 at 10:17 PM

From what I can see, BlogEngine returns related posts based on the current post's title. Nine times out of ten this works well. If you are the one out of ten where things could be better, then this may be of interest. I thought that, for my purposes, the related posts might be more relevant if they were based on the post tags, and thankfully this worked for me without looking at anything more complex.

The adjustment to the RelatedPost.cs file in the App_Code\Controls folder is minimal.

Swap the RenderControl method for this:

        public override void RenderControl(HtmlTextWriter writer)
        {
            if (!BlogSettings.Instance.EnableRelatedPosts || this.Item == null)
            {
                return;
            }

            if (!Cache.ContainsKey(this.Item.Id))
            {
                // Build a space-separated search string from the post tags.
                // "as" avoids an invalid cast if the item is not a Post.
                Post post = this.Item as Post;
                string tags = string.Empty;
                if (post != null && post.Tags.Count > 0)
                {
                    foreach (string tag in post.Tags)
                    {
                        tags += tag + " ";
                    }
                    tags = tags.TrimEnd();
                }

                // Search on the tags first
                var relatedPosts = Search.Hits(tags, false);

                // If no results, just do the search as normal by post title
                if (relatedPosts.Count < 1)
                {
                    relatedPosts = Search.FindRelatedItems(this.Item);
                }

                // With one hit or fewer there is nothing worth listing
                if (relatedPosts.Count <= 1)
                {
                    return;
                }

                this.CreateList(relatedPosts);
            }

            writer.Write(Cache[this.Item.Id].Replace("+++", this.Headline));
        }

Nov 21, 2012 at 9:29 PM

Andy, your function is just what I need: related posts by tag.

I have replaced "Cache" with "RelatedPostsCache", but it still doesn't work.

Any suggestions?

Thanks

Nov 21, 2012 at 11:28 PM

Hi Fabry, 

There have been vast improvements on this since the above posting, see discussion, with still further improvements after that.

Ron Clabo lent his weight to this and produced a rather excellent widget for BE 2.5. It required some alteration to the BE core and a fork proposal was planned; however, Ron became very busy and it kind of fell by the wayside.

I am currently still using BE 2 with the tag/title based related post control (which does not require core changes), it works really well, but would need some additional work to make it BE 2.7 ready.

I am working on a Content Management Theme at the moment that will include a number of customized controls, one of which will be a tag/title version of the Related Posts control that will not require core changes (not as flexible as Ron's version, but functionally the same). The theme is intended for those wishing to use BE as a standalone site (BE managed pages). It will include an integrated post/page menu, a site-wide breadcrumb, auto-inserted sub navigation on pages (plus a table of contents for the section where the page is a parent landing page), WordPress-style nested categories for posts and a number of other things. When I originally added this functionality to my own site, it was more of a hack, with little consideration given to user culture or portability of components. I'm working through these issues and am not too far off getting there, but don't have a timescale for release as yet (it will depend on whether or not there is any interest in this). However, I will let you know when I have something you can use for related posts (I'm not a developer, just a hobbyist, so it's going to depend on available time).

Nov 22, 2012 at 4:44 PM
Edited Nov 22, 2012 at 9:11 PM

Hi Andy

If you need help testing against BlogEngine.NET 2.7, my site, Informarea, has about 800 posts on a wide range of topics and could be a good basis for testing.

I have seen your site. Compliments, I like it very much. I like the slider (do you use the Nivo Slider by rtur, or is it your own?) and the way the panels are laid out in the sidebar.

I'll wait for your beta version of the related posts control.

Bye

Fabry

 


Nov 22, 2012 at 8:40 PM

Hi Fabry, 

Thanks, appreciate the offer and may well take you up on that, although it well might be a while away.

The slider on the front is based on an extension by Michael Baird, it's adapted to use the nivo slider script with the JavaScript altered slightly to give the small preview images when you hover over the control buttons. This extension is actually a featured post rotator, you select the posts you want from a drop down list of all posts along with an associated image that gets uploaded.

I think I might be able to get a version of the related post control to you early next week, swap it for the existing one in the controls folder. Would you want the output to look like your "Potrebbe interessarti anche:" section with images and title link?

Nov 22, 2012 at 9:11 PM

Hi Andy,

Yes, images and links in the related posts, like on my blog, would be great. It is also important to give every image a relevant alt description. If you need it, my site has a relatedposts.cs with this functionality, in the BlogEngine.net category, in the post "Related Posts con thumbnails in BlogEngine.net".

A Nivo slider could also be interesting with random posts.

OK Andy, next week, if you need your related posts control tested, send me the .cs file via the contact email on Inform@rea and I'll be happy to test it.

Let me know

Fab

 

 

Nov 26, 2012 at 5:58 PM

Hi Fabry, 

Just sent you a version of the alternative related post control via your site contact page, let me know if you get it OK.

Cheers

Nov 26, 2012 at 9:04 PM
OK Andy, tomorrow morning I will test your work and let you know if everything is OK. Thank you very much. Bye, Fabry


Nov 30, 2012 at 6:29 AM

Andy  - nice work, would be very interested in seeing more of your Content Management theme, the site isn't so far off from being a lightweight CMS anyway, so sure makes sense.

 

Dec 1, 2012 at 12:08 AM

Hi Alex,

Good to hear from you.

Regarding the theme, there's quite a bit more I'd like to do with that. I have the interface elements working reasonably well independent of core code changes, however, there's still room for improvement.

On a not entirely unrelated note, when Fabry asked about the related post control, it got me thinking about the semantic library we discussed a while back. I re-worked this off the back of Tyler's code and it's pretty fast and accurate. I currently use this for tag suggestions in the post editor (never have to think about tagging anymore), but now see that it could probably be used to good effect in finding related posts. The BE related posts control as it stands strips irrelevant words from the post title and does a post by post/page search using the remaining title words (space-separated individual words). I think with a small change in the search method you could feed the search the most frequently occurring significant post "key words and phrases" instead of just individual title words. The upshot being that you could have a pretty decent related post control that works off tags (Ron Clabo), titles and content. I'm going to start on this next week, and if it works well, then I'll attempt to hook it into the post editor's file manager to suggest post images (if there's related post content, then maybe the post images may be suitable for re-use - just a thought). You did mention before that you envisaged all manner of uses for key word and phrase extraction.

There are, as things stand, a couple of caveats:

The semantic stuff is geared towards the English language and for use in post creation or editing it's tightly integrated with the post editor code. This presents two challenges, the need for a multilingual semantic engine if it's to be of general use and some mechanism to offer the functionality as a bolt on.

I had thought about the possibility of a simplified version of the semantic engine that could use the BE stop word list to filter out common or insignificant terms. The stop word list is in English, so I had a look around the source code to see where this was mapped to other languages. I'm sure it must be, but I just couldn't find it. That would be a useful first step to something a little less language dependent.
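As a rough illustration of the simplified idea (filter out stop words, then rank what is left by frequency), here is a minimal sketch. It is not the semantic engine discussed above and does not use any BE API. The class name, the `TopTerms` method and the tiny inline stop word set are all made up for the example; a real version would load the stop words from a list for the blog's language.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class KeywordSketch
{
    // Hypothetical stop word set for illustration only.
    private static readonly HashSet<string> StopWords = new HashSet<string>(
        new[] { "a", "an", "and", "the", "of", "to", "in", "is", "for", "on", "with" },
        StringComparer.OrdinalIgnoreCase);

    private static readonly char[] Separators =
        { ' ', '\t', '\r', '\n', '.', ',', ';', ':', '!', '?', '"', '(', ')' };

    // Returns up to "count" of the most frequent significant words in the text.
    public static List<string> TopTerms(string text, int count)
    {
        return text
            .Split(Separators, StringSplitOptions.RemoveEmptyEntries)
            .Select(w => w.ToLowerInvariant())
            // Drop very short words and anything in the stop word set
            .Where(w => w.Length > 2 && !StopWords.Contains(w))
            .GroupBy(w => w)
            // Most frequent first; ties broken alphabetically for stable output
            .OrderByDescending(g => g.Count())
            .ThenBy(g => g.Key, StringComparer.Ordinal)
            .Take(count)
            .Select(g => g.Key)
            .ToList();
    }
}
```

The resulting terms could then be joined with spaces and passed to the search in place of the title words. This only handles single words, so phrases would need something more involved.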

For the theme, I have a few other ideas that I intend to play around with at a leisurely rate - it would be nice to offer more options for layout and such like, perhaps via an admin page (not sure how practical that is in terms of load the theme and go - just thoughts at the moment). Possibly Mike Baird's featured post rotator (multi-blog aware) and a number of other things. It's quite a bit of work if you don't use .NET routinely, but that's how you get familiar I guess.

Anyway, any suggestions or alternative thoughts on what would be good are always appreciated, or indeed any thoughts on how to best integrate some of the add-ons.

Cheers.

Dec 1, 2012 at 6:12 AM

Andy - a related posts control makes lots of sense, by author, tag, title, content, tag(s), date, etc., since we're always looking for additional links on sites - I can see multiple variations on that specific functionality.

Glad to hear Tyler's algorithm is working for your auto tagging; I've done some additional work on my own, would be fun to trade feedback.

Anyhoo, I can easily imagine a tabbed "related" posts (x) interface allowing an end user to click on a term/title, and follow through to relative content, glad you've taken the semantic stuff to the next level.

Let me know if I can help - I could also provide my internet lookups (Postview.ascx) if you'd want to include those tag based reference controls in your theme.

Cheers

 

 

Dec 3, 2012 at 7:17 PM

Alex - Yeah, regarding related posts. Tags should be the post keywords, so providing you are reasonably careful with your choices they work well for finding related posts, but it's subjective (a problem when you have more than one author). Key word and phrase extraction is useful here in suggesting tags because it removes a bit of that subjectiveness and promotes consistency. Also, some control over the input (select from existing tags) helps to avoid situations where different posts about the same thing have different tags, e.g. three posts on acting, one tagged with "actor", another with "actors" and the other with "actress", and you don't get a match. Some folks don't tag, so then you're left with title matches, which can also be a little tenuous. So some kind of content parsing independent of language makes sense (ongoing).

I liked your tabbed lookups and think that would make for a great extension. 

On the subject of themes, I was customizing post navigation over the weekend and thought it might be good to have a next and previous post preview when you hover over the links. I implemented this, but can't decide if it's good or bad - if you don't mind having a look, a second opinion would be helpful. 

I'll be pottering around and trying out different things with layout, controls and such like, some may be suitable for the theme and some not.

Dec 4, 2012 at 3:02 PM

Hi Andy,

I like your implementation, but doesn't it make the post heavier to load?

Why don't you make it an extension for everyone?

That would be great.



Dec 4, 2012 at 4:15 PM

Hi Fabry, 

Glad you like it, wasn't sure about it. 

The effect on post loading is negligible and the coding minimal, but it does require changes to post.ascx and post.aspx.cs and uses some centralized code to fetch the post images and crop the content. So it's probably better bundled with a theme, where all that kind of stuff is built in. 

On a separate note, I'll send you a revised Related Post control. Previously, when comparing posts by tag, an exact tag match was required. So if you had a post tagged "facebook", another tagged "facebook and WhatsApp" and yet another tagged "facebook and instagram", it would not make the connection. It does now.

Will forward after testing.

Dec 4, 2012 at 5:05 PM

Hi Andy

OK, I will test your update.

But let me check that I have understood correctly.

Now, if I have a post tagged "Facebook" and another tagged "facebook and WhatsApp", related posts finds the relation and so I see both.

Do you want to change this, so that the comparison is on the whole tag rather than on the single words within it?

I don't think it is wrong for a post tagged "facebook" to show a related post tagged "facebook and other".

What do you think?

 

 

Dec 4, 2012 at 7:41 PM

Hi Fabry,

Because titles are also considered, even if there is no tag match you may well get the relation on title word matches and that might be enough. 

Anyway, here's the reasoning (and it could be flawed - testing will tell). 

When you tag one post with "facebook" and another with "facebook and WhatsApp" the tags are not the same, but it does suggest related content about facebook. 

With the revised version, multi-word tags are split to give separate tags for comparison purposes, so "facebook and WhatsApp" becomes three distinct tags: "facebook", "and", "WhatsApp" (with an Italian stop word list, "and" could be removed as irrelevant).

This introduces some inaccuracy, but there are advantages and disadvantages. The advantage can be seen in the above case, where posts get matched on the word "facebook" (as may have been the intention). The disadvantage can be seen in the following scenario: 

A post gets tagged with "John Travolta", another with "John Smith" and another with "John Travolta". These tags become:

"John", "Travolta"

"John", "Smith"

"John", "Travolta" 

The post tagged with "John Smith" might show as related to "John Travolta" because of the word "John", but "John" + "Travolta" is a stronger match and will be scored higher, so this will show first. Considering title matches and other tag matches, the post with "John Smith" may not even show.
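To make the scoring idea concrete, here is a small sketch that splits multi-word tags on spaces and counts the words two posts share. It is an illustration only, not the code in the revised control; the class and method names are invented, and there is no stop word filtering, so a word like "and" would count as a match too.

```csharp
using System;
using System.Collections.Generic;

public static class TagWordSketch
{
    // Splits each tag on spaces and collects the distinct words, ignoring case.
    public static HashSet<string> SplitTags(IEnumerable<string> tags)
    {
        var words = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
        foreach (string tag in tags)
        {
            foreach (string word in tag.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries))
            {
                words.Add(word);
            }
        }
        return words;
    }

    // Number of tag words the two posts have in common; higher means a stronger match.
    public static int SharedWords(IEnumerable<string> tagsA, IEnumerable<string> tagsB)
    {
        HashSet<string> shared = SplitTags(tagsA);
        shared.IntersectWith(SplitTags(tagsB));
        return shared.Count;
    }
}
```

With this, "John Travolta" against "John Travolta" shares two words and "John Travolta" against "John Smith" shares one, so sorting candidates by the count in descending order puts the stronger match first.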

Personally I don't like the idea of introducing any kind of inaccuracy, but it gives some leeway, since many post authors don't tag word-perfectly. The other point worth mentioning is whether you consider case to be important, e.g. the company "Apple" as a tag is different to the fruit "apple". Of course, with robust content matching you could just leave the tags as they are.

As mentioned before, I don't know how BE handles stop words, but if you are using XML for storage, I could pass on an Italian stop word list (replace the English one in App_Data).

Dec 4, 2012 at 8:09 PM
Hi Andy
Now I understand. You are right, but this only happens if I put two words in a tag rather than one.
So your reasoning is correct, but I think it must not be too rigid, because in some situations it is better to match tags like "facebook" and "facebook and whatsapp" than to match post titles. What do you think?


Dec 4, 2012 at 8:40 PM

Hi Fabry, 

Greater weight is given to tag matching, but both tags and titles are considered together. When there is no tag match, relations are based on title matches only, so even if there is some inconsistency between tags that are intended to be the same (or there are no tags), you still get some kind of match. I have tight controls in place for tag input, but even so, inconsistencies creep in. Personally, I'm sticking with exact tag matches but will still test the other version for reasons mentioned previously. I do intend to have a look at content matching, but that might take some time to get right.
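The weighting described above could look something like the following sketch. The weights, the class name and the `Score` signature are assumptions for illustration, not values taken from the actual control. Tags are compared as exact (whole tag) matches, and title words are compared without stop word removal to keep the example short.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RelatedScoreSketch
{
    // Hypothetical weights: a tag match counts for more than a title word match.
    private const int TagWeight = 3;
    private const int TitleWeight = 1;

    public static int Score(
        IEnumerable<string> tagsA, string titleA,
        IEnumerable<string> tagsB, string titleB)
    {
        // Exact tag matches, ignoring case
        int tagMatches = tagsA.Intersect(tagsB, StringComparer.OrdinalIgnoreCase).Count();

        // Title words in common, ignoring case
        int titleMatches = Words(titleA)
            .Intersect(Words(titleB), StringComparer.OrdinalIgnoreCase)
            .Count();

        return tagMatches * TagWeight + titleMatches * TitleWeight;
    }

    private static IEnumerable<string> Words(string title)
    {
        return title.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
    }
}
```

When neither post has tags, the tag part contributes nothing and the score falls back to title matches alone, which is the behaviour described in the post.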

Dec 4, 2012 at 8:55 PM

Ok Andy

Let me know when you have something and I will test it, and we can check the functionality.

Bye

Fabry

 

Dec 5, 2012 at 11:57 AM
Edited Dec 5, 2012 at 12:02 PM

Hi Fabry, 

Testing done. Just stick with the version you have for now. Splitting multi-word tags on spaces is too random: you lose just as much as you gain. The other option I considered, splitting multi-word tags on stop words, is probably not worth it either - in general.

I'm going to be testing something for content matching (in English) and if the results are good, I may try and develop it for use with other languages (I have stop word lists for 19 languages), but ideally I would like something language independent.

Dec 5, 2012 at 12:05 PM

Hi Andy,

OK, let me know if you need my help testing any of it.

Cheers.

Dec 5, 2012 at 9:23 PM

Andy -

  • I agree multi tag splits will be difficult without some algorithmic formula, along the lines of Tyler's library, or else I'm not sure that people would be diligent about entering tags like "_John Travolta_" (meaning searchable stop characters in effect), but I still think in the majority of cases the tag matching will return a somewhat related post, i.e. John Smith and John Travolta are actually related ("John" posts), and Apple computers would likewise be related to "Apple" and "computers", though not "fruit".

    Wildcards would work in certain cases - act* would return "actor", "actress", "acting", but would also return "actually", though the formula could perhaps be fine tuned to expect more tag matches when wildcards are specified. I think in many cases though, clever usage could be viable - broadcas* would return every useful variation of that word, without any unwanted matches. 

    The point I'm making is that wildcard characters might be the way to go since they're obviously language independent, though of course that places more burden on the user(s) as opposed to the platform. I don't know, it may be a pick-your-poison type of decision.

     
  • I like your next/previous post preview very much, was something I was going to get to myself, meaning implementing on CustomPostView.ascx to provide users with an AJAX view of the entire post, including my tabbed lookups, without actually having to select the post - people nowadays really hate postbacks, though would be more work to implement if security ever gets implemented at the page/post level.
  • Question - is this new theme still based on your 2.0 incarnation, or will you be extending it to newer versions? I'm still on 2.5, but I'm eager to use the QuickPost functionality in 2.6 for more of the social editing/research functionality of my site, would be great to have the best of all of our different versions in one solution. 
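On the wildcard point in the first bullet, a trailing-asterisk match is easy to sketch. This is only an illustration of the idea with invented names; it handles a single `*` at the end of the pattern and nothing else.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class WildcardTagSketch
{
    // "act*" matches any tag starting with "act"; without "*" the tag must match exactly.
    public static bool Matches(string pattern, string tag)
    {
        if (pattern.EndsWith("*", StringComparison.Ordinal))
        {
            string prefix = pattern.Substring(0, pattern.Length - 1);
            return tag.StartsWith(prefix, StringComparison.OrdinalIgnoreCase);
        }
        return string.Equals(pattern, tag, StringComparison.OrdinalIgnoreCase);
    }

    // Returns the tags that match the pattern, in their original order.
    public static List<string> Filter(string pattern, IEnumerable<string> tags)
    {
        return tags.Where(t => Matches(pattern, t)).ToList();
    }
}
```

As noted in that bullet, `act*` picks up "actor", "actress" and "acting" but also "actually", whereas `broadcas*` only picks up variations of the one word, so how useful this is depends on how carefully the prefix is chosen.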

 

 

Dec 5, 2012 at 11:43 PM

Hi Alex, 

Yeah, the tags can be tricky. You get the scenario where someone tags with "John Travolta" and the next time it's just "Travolta". Having said that, it still works well, as you normally pick up on the other post tags and title words, and when tagging is consistent it works especially well. I think with content matching in place, combined with tags and title, it should pretty much take care of any inconsistent input - not a trivial undertaking (but I have something in mind that I'm quite eager to try out).

With the next/previous post preview the plan was to provide a teaser (and the coding for this is ludicrously simple) but wasn't sure if it was a bit gimmicky, so it's reassuring to hear you were planning something similar. 

I'm rehashing everything with my current set-up on BE 2 but have tested the crucial stuff with BE 2.7, which is the target.