Related Posts in Sidebar

Topics: Controls
Oct 1, 2011 at 11:23 PM

I've adapted the standard related posts control so that it can be detached from the post body and placed anywhere on the page. The control behaves the same as the one that comes with BE in that it appears only if there are related posts (and, because it can sit outside the post body, only when you are viewing a post).

At the moment it's for personal use, but if there is interest I'm quite happy to make it available for BE 2 and 2.5. You can try it out here (click back and forward between posts, or browse by month or tag, to observe the behaviour).

The default search for related posts is based on the post title (as far as I can see), and this seems to work well for most folks. Unfortunately it doesn't for me: my subject matter is quite narrow, with the same term appearing repeatedly in titles, so the related post results were not particularly relevant. I now base the results on the frequency of tag occurrence within posts, and this works well (for me), but it does require thoughtful tagging.
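
As a rough illustration of tag-based matching, the sketch below counts how many tags two posts share. The class and method names are hypothetical, tags are passed as plain string lists rather than BE post objects, and the case-insensitive comparison is an assumption (the draft code later in this thread compares tags exactly).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of BlogEngine.NET.
public static class TagOverlap
{
    // Counts the tags that appear in both lists, ignoring case.
    public static int CountSharedTags(IList<string> servingTags, IList<string> candidateTags)
    {
        var remaining = new HashSet<string>(servingTags, StringComparer.OrdinalIgnoreCase);
        int shared = 0;
        foreach (string tag in candidateTags)
        {
            // Remove on match so a tag repeated in the candidate is counted once.
            if (remaining.Remove(tag))
            {
                shared++;
            }
        }
        return shared;
    }
}
```

A candidate post that shares no tags scores 0, so it would simply be left out of the related list.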

My question is this: if there is interest, would you want to use the default search for finding related posts, the tag version, or a mixture of both (say tags first and, if nothing related is found that way, fall back to title)?

Coordinator
Oct 3, 2011 at 12:43 PM

This is a nice enhancement -- a detachable related posts control/widget.

It looks like you're right that the current Related Posts process just looks at the Title of the current post.  However, because the entire Post object is passed into FindRelatedItems() (not just the title), it could technically be adjusted to search on the Content of the post too.  In general, there's probably a good amount of room for improvement on finding strongly related posts.

The tag idea is good.  I think a lot of people are thoughtful in assigning tags.  Especially if the Tag Cloud is important to people, then tagging is important.  

For your question, I personally like the mixture option, where matching is done on both Title and Tags.  If both Tags and Title words match, more weight is given than if only the title or only a tag matches.  So the order of evaluation might be:

  1. Tags and Title both match (strongest match)
  2. Tags (strong match)
  3. Title (medium match)
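
One way to read the ordering above is as a tier per candidate post, where a higher tier always sorts first. This is only a sketch: the enum and class names are hypothetical, and how "tags match" and "title matches" are decided is left to the caller.

```csharp
// Hypothetical types, not part of BlogEngine.NET.
public enum MatchTier
{
    None = 0,
    Title = 1,        // medium match
    Tags = 2,         // strong match
    TagsAndTitle = 3  // strongest match
}

public static class MatchTiers
{
    // Maps the two match flags onto the three tiers listed above.
    public static MatchTier Classify(bool tagsMatch, bool titleMatches)
    {
        if (tagsMatch && titleMatches) return MatchTier.TagsAndTitle;
        if (tagsMatch) return MatchTier.Tags;
        if (titleMatches) return MatchTier.Title;
        return MatchTier.None;
    }
}
```

Sorting candidates by tier first, and by any finer-grained weight second, would keep a title-only match from outranking a tag match.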

Oct 4, 2011 at 8:53 AM

Thanks for the feedback, it's food for thought and much appreciated.

Oct 4, 2011 at 2:00 PM

Hey Andy,

I like your idea of a detached control for related posts and think steps to improve the algorithm for finding related posts would be great.  I like the approach Ben has put forth.  Any chance you could implement it and then roll those changes into the current build?  Ben can help with info on how to submit your contribution if you haven't done that before.

My guess is a lot of people would benefit from these changes.  I know I would.

Oct 4, 2011 at 8:31 PM
Edited Oct 5, 2011 at 3:47 PM

Hi Ron,

Any help is always welcome. I'm pasting the code for what I have so far; it's just a first draft, but it lets you get the gist of it.

The tag weighting is simple, I like simple, but I'm finding it very effective. 
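
To make the weighting concrete, here is a small standalone sketch of the position-based formula used in the draft below, (20 - serving position) + (20 - comparing position), summed over matching tags. The class and method names are hypothetical, and tags are plain strings rather than BE post tags.

```csharp
using System.Collections.Generic;

// Hypothetical helper that isolates the draft's weighting formula.
public static class SimpleTagWeight
{
    public const int Scale = 20; // the same arbitrary test figure as in the draft

    // Weight of one matching tag found at servingIndex in the serving post
    // and at comparingIndex in the comparing post (both zero-based).
    public static int ForMatch(int servingIndex, int comparingIndex)
    {
        return (Scale - servingIndex) + (Scale - comparingIndex);
    }

    // Sums the weights of all serving tags that also occur in the comparing post.
    public static int Score(IList<string> servingTags, IList<string> comparingTags)
    {
        int total = 0;
        for (int i = 0; i < servingTags.Count; i++)
        {
            for (int j = 0; j < comparingTags.Count; j++)
            {
                if (servingTags[i] == comparingTags[j])
                {
                    total += ForMatch(i, j);
                    break; // first match only, then move on to the next serving tag
                }
            }
        }
        return total;
    }
}
```

For example, a match in first position on both sides is worth 40, and two posts tagged (cat, sat) and (cat, sat) score 40 + 38 = 78.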

I like Ben's idea as well and for this attempt, went with the tags and title approach.

I hope the code commenting makes sense (I'm working with BE 2 at the moment, once this is good then BE 2.5)

 

public class RPFinder 
    {
        /// <remarks>
        /// The standard BE RelatedPosts control is tied to the post body rendering related posts as part of the post body.
        /// This class can be used with cRelatedPosts (a customized detached version of the control) to render posts anywhere in a page.
        /// If it's a post then RPFinder looks to find any related posts and stores their HTML representation as a string in local cache, returning a boolean true if successful.                
        /// This allows you to first check for related posts and then conditionally place the results anywhere on the page.
        /// The custom version of the RelatedPosts control (cRelatedPosts) does the actual rendering.
        /// Typical usage in a theme master:
        /// <% if(RPFinder.getPosts(3)) { %><div id="mySideBarSomething"><blog:cRelatedPosts runat="server" ID="relPost" Visible="true" /></div><%} %>
        /// </remarks> 
        

        #region Constants and Fields
        /// <summary>
        /// Related posts cache.
        /// </summary>
        private static readonly Dictionary<Guid, string> rpCache = new Dictionary<Guid, string>();
        
        /// <summary>
        /// Default settings.
        /// </summary>        
        private static string rpHdr = "Items with related content";       

        #endregion
        

        #region Methods        

        public static bool getPosts (int maxResults)
        {            
            // There is already page detection in this themes site master where a call to RPFinder is made
            // otherwise we would have to add logic to determine if the page is a post.
            // if (String.IsNullOrEmpty(HttpContext.Current.Request.QueryString["post"])) return false;            
            Guid itemID;
            Post servingPost = null;
            string qs = HttpContext.Current.Request.QueryString["id"];
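            if (qs != null && qs.Length == 36)
            {
                itemID = new Guid(qs); // note: still throws if the 36 characters are not a valid GUID
                servingPost = Post.GetPost(itemID);
            }
            else
            {
                return false;
            }
            if (servingPost == null)
            {
                return false; // no post with this id, nothing to relate to
            }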
            if (!rpCache.ContainsKey(itemID))
            {                
                if (servingPost.Tags.Count > 0)
                {
                    Dictionary<IPublishable, int> relatedPosts = new Dictionary<IPublishable, int>();
                    int stCount = servingPost.Tags.Count, stCounter, ctCount, ctCounter, weight;                                      

                    foreach (Post p in Post.Posts) // compare post against the serving post for matching tags
                    {
                        if (p.IsVisibleToPublic && p != servingPost)
                        {
                            weight = 0; // initialize comparing post weight
                            stCounter = 0; // the position in serving tags                            
                            ctCount = p.Tags.Count; // comparing tag count

                            foreach (string servingPostTag in servingPost.Tags) // go through each serving post tag checking for a match in the comparing post tags
                            {                                
                                ctCounter = 0; // the position in comparing tags                                

                                foreach (string comparingPostTag in p.Tags) // check through the comparing tag list
                                {
                                    if (servingPostTag == comparingPostTag)
                                    {                                  
                                        // assign relative weights to tags according to their counter position (assumes that tags are listed in order of importance)
                                        // so regardless of the number of tags, the first is always worth the same, i.e. 20-0, the next 20-1, etc.
                                        weight = (20 - stCounter) + (20 - ctCounter);
                                        // 20 is an arbitrary figure hard-wired for testing, but it gives a reasonable tag weight adjustment
                                        // it also sets the number of tags we can check per post; if the tag count goes over 20, weight starts to go down (if this were only true in life)
                                        if (!relatedPosts.ContainsKey(p))
                                        {
                                            relatedPosts[p] = weight; // first matching tag, add related post to dictionary and give weighting                                            
                                        }
                                        else
                                        {
                                            relatedPosts[p] += weight; // found more matching tags, increase the entry weighting
                                        }                                        
                                        break; // servingPostTag matched against comparing postTag, check next serving PostTag
                                    }
                                    ctCounter++;
                                }
                                stCounter++;                                
                            }
                        }
                    }
                    // if nothing found with tags, we might get something by title, or 
                    // if any title result matches a tag result then boost its weighting
                    var searchRelatedPosts = Search.FindRelatedItems(servingPost);
                    if (searchRelatedPosts.Count > 1)
                    {
                        foreach (IPublishable p in searchRelatedPosts)
                        {
                            if (p.IsVisibleToPublic && p != servingPost)
                            {
                                if (relatedPosts.ContainsKey(p))
                                {
                                    relatedPosts[p] += 100; // boost weighting (reasonable test figure for now)                                           
                                }
                                else
                                {
                                    relatedPosts[p] = 20; // give flat rating
                                }
                            }
                        }
                    }

                    if (relatedPosts.Count < 1)
                    {                            
                        return false;
                    }
                    // Select the matches with top weightings to create related post list and limit size accordingly
                    var rps = (from k in relatedPosts.Keys orderby relatedPosts[k] descending select k).Take(maxResults);
                    CreateList(rps, itemID);                    
                }
                else
                {
                    return false; // or just do search based on title and return whatever
                }               
            }
            return true;
        }

        public static string getCacheItem (Guid postId)
        {
            return rpCache.ContainsKey(postId) ? rpCache[postId] : null;             
        }
        internal static void removeCacheItem(Guid postId)
        {            
            if (rpCache.ContainsKey(postId))
            {
                rpCache.Remove(postId);
            }            
        }

        /// <summary>
        /// Creates the list of related posts in HTML.
        /// </summary>
        /// <param name="relatedPosts">
        /// The related posts.
        /// </param>
        private static void CreateList(IEnumerable<IPublishable> relatedPosts, Guid itemID)
        {
            const string LinkFormat = "<li style='width:100%;overflow:hidden;'><a href=\"{0}\" style='display:block;'>{2}{1} </a>{3}{4}</li>";           
            
            var sb = new StringBuilder();                                
            sb.Append("<h4 class='panelHdr'>" + rpHdr + "</h4>");
            sb.Append("<ul class=\"topPosts\" id=\"relPosts\">");

            foreach (var post in relatedPosts)
            {                              
                string cats = String.Empty;
                if (post.Categories.Count > 0)
                {
                    bool addSeperator = false;
                    foreach (Category cat in post.Categories)
                    {
                        cats = addSeperator ? cats + ", " + cat.ToString() : " - In " + cat.ToString();
                        addSeperator = true;
                    }
                }
                sb.AppendFormat(LinkFormat, post.RelativeLink, ContentServices.cropString(HttpUtility.HtmlEncode(post.Title), 55), ContentServices.getImage(post.Content), post.DateCreated.ToString("MMM d, yyyy"), cats);
            }            

            sb.Append("</ul>");            
            rpCache.Add(itemID, sb.ToString());
        }       

        #endregion
    }
 

Edit:

I wonder if categories should be taken into consideration. A check could be added when tags are matched, and it wouldn't carry much of an overhead.

Oct 5, 2011 at 11:27 PM
Edited Nov 5, 2011 at 6:22 PM

Hi Folks,

Have a version working with the 3 options Ben suggested.

If anybody gets the chance, maybe look over the code for improvements.

I'm now thinking about how best to integrate this. It's workable now: disable the current related posts control and replace it with this.

However, might be better with an admin panel to pretty things up. I'd need to think about how to work an admin panel for BE 2.5, which admittedly would be quite some effort for me as my .Net knowledge is somewhat limited.

 public class RPFinder
    {
        /// <remarks>
        /// The standard BE RelatedPosts control is tied to the post body rendering related posts as part of the post body.
        /// This class can be used with cRelatedPosts (a customized detached version of the control) to render posts anywhere in a page.
        /// If it's a post then RPFinder looks to find any related posts and stores their HTML representation as a string in local cache, returning a boolean true if successful.                
        /// This allows you to first check for related posts and then conditionally place the results anywhere on the page.
        /// The custom version of the RelatedPosts control (cRelatedPosts) does the actual rendering.
        /// Typical usage in a theme master:
        /// <% if(RPFinder.getPosts(5,1)) { %><div id="mySideBarSomething"><blog:cRelatedPosts runat="server" ID="relPost" Visible="true" /></div><%} %>
        /// First arg is the number of related posts, second arg gives find options, with 1 = tags and title, 2 = tags only, 3 = titles only
        /// </remarks> 


        #region Constants and Fields

        /// <summary>
        /// Related posts cache.
        /// </summary>
        private static readonly Dictionary<Guid, string> rpCache = new Dictionary<Guid, string>();

        /// <summary>
        /// The related post list header.
        /// </summary>        
        private static string rpHdr = "Posts with related content";

        #endregion

        #region Methods

        /// <summary>
        /// Args as ints, arg 1 the size of the post list and arg 2 what to search on 1=Titles and tags, 2=tags only, 3=titles only
        /// </summary>      
        public static bool getPosts(int maxResults, int options)
        {
            // There is already page detection in this themes site master where a call to RPFinder is made
            // otherwise we would have to add logic to determine if the page is a post.
            // if (String.IsNullOrEmpty(HttpContext.Current.Request.QueryString["post"])) return false;            
            Guid itemID;
            Post servingPost = null;
            string qs = HttpContext.Current.Request.QueryString["id"];
            if (string.IsNullOrEmpty(qs))
            {
                return false;
            }
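            else
            {
                itemID = new Guid(qs); // Note: throws if qs is not a valid GUID
                servingPost = Post.GetPost(itemID);
            }
            if (servingPost == null)
            {
                return false; // No post with this id, nothing to relate to
            }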
            if (!rpCache.ContainsKey(itemID))
            {
                Dictionary<IPublishable, int> scoredPosts = new Dictionary<IPublishable, int>();

                const int tagScale = 20; // Tag scale                
                int stCount = servingPost.Tags.Count;                
                int spCount = 0; // scoredPost count

                if (stCount > 0 && options != 3) // Not titles only                
                {

                    int ctCount, matchCount, weight;

                    bool[] tagMatch = new bool[stCount]; // Include indicator
                    // A table of related posts grouped by matched serving tag position.
                    // This gets used to ensure results are returned for as many matched serving tags as possible,
                    // and in proportion to tag position - left to right priority.
                    List<PostAndWeight>[] rpTryInclude = new List<PostAndWeight>[stCount];
                    for (int i = 0; i < stCount; i++)
                    {
                        rpTryInclude[i] = new List<PostAndWeight>(30);
                    }

                    foreach (Post comparePost in Post.Posts) // Compare post against the serving post for matching tags
                    {
                        if (comparePost.IsVisibleToPublic && comparePost != servingPost)
                        {
                            weight = 0; // initialize comparing post weight                                                    
                            ctCount = comparePost.Tags.Count; // comparing tag count
                            matchCount = 0;                             

                            for (int i = 0; i < stCount; i++) // Go through each serving post tag checking for a match in the comparing post tags
                            {                                
                                string servingPostTag = servingPost.Tags[i];                                                              

                                for (int j = 0; j < ctCount; j++) // Check through the comparing tag list
                                {
                                    string comparingPostTag = comparePost.Tags[j];
                                    if (servingPostTag.Equals(comparingPostTag))
                                    {
                                        // Assume that tags in posts are listed in order of importance.
                                        // How do we assign tag values? i.e say serving post tags are (cat, sat) which
                                        // match with comparing post tags (cat, sat) and (cat, sat, mat, bat, rat) what is tag "sat" worth?
                                        // We can't say 2, the values should descend from left to right, How about (tag count) - (tag position)?
                                        // First compare case: (2 - 2 = 0), second compare case: (5 - 2 = 3) same position but different results.
                                        // Solution - Assign relative weights to tags according to their counter position on a scale
                                        // this gives tag values that are independent of tag count, i.e. scale 10 - position 2 will always give 8

                                        weight += ((tagScale - i) + (tagScale - j)) * (stCount - i) * (stCount - j);
                                        // Boost the weight in relation to serving tag position.
                                        // Ensures that tags to the left will not be outweighed by the sum of the tags to the right,
                                        // but favours tags that are first in the list.                                        

                                        matchCount++;
                                        tagMatch[i] = true;
                                        break; // ServingPostTag matched against comparing postTag, check next servingPostTag
                                    }
                                    else
                                    {
                                        tagMatch[i] = false;
                                    }                                    
                                }                                
                            }

                            // Build the rpTryInclude related post table (try and include results for more of the serving tags).                            
                            if (matchCount > 0)
                            {
                                // A match with matching tag counts should be stronger than the same match with unequal tag counts i.e.                            
                                // "Apple" with "Apple" being better than "Apple" with "Apple, iPhone"  
                                if (stCount == ctCount && matchCount == ctCount) // Strong match
                                {
                                    weight *= 2;
                                }
                                if (matchCount == ctCount) // and when the matchCount is the same as the compare tag count
                                {
                                    weight += tagScale;
                                }
                                for (int i = 0; i < stCount; i++)
                                {
                                    if (tagMatch[i])
                                    {                                        
                                        rpTryInclude[i].Add(new PostAndWeight(comparePost, weight));                                        
                                    }
                                }                                
                                spCount++;
                            }
                        }
                    }

                    // Add entries from the rpTryInclude related posts table to scoredPosts.
                    // This ensures proportional results for as many matched serving tags as possible.                   
                    if (spCount > 0)
                    {
                        // Best matches to top
                        for (int i = 0; i < stCount; i++)
                        {
                            if (rpTryInclude[i].Count > 1)
                            {
                                rpTryInclude[i].Sort();
                            }
                        }
                                                
                        int resultCtr = 0;
                        int depthCtr = 0;
                        int maxResultsAdjusted = maxResults > spCount ? spCount : maxResults;

                        while (resultCtr < maxResultsAdjusted)
                        {
                            for (int i = 0; i < stCount && resultCtr < maxResultsAdjusted; i++)
                            {
                                if (rpTryInclude[i].Count > depthCtr)
                                {
                                    Post p = rpTryInclude[i][depthCtr].P;
                                    if (!scoredPosts.ContainsKey(p))
                                    {
                                        scoredPosts.Add(p, rpTryInclude[i][depthCtr].W);
                                        resultCtr++;                                        
                                    }                                    
                                }
                            }
                            depthCtr++;
                        }
                        for (int i = 0; i < stCount; i++)
                        {
                            rpTryInclude[i].Clear();
                        }
                    }

                }

                if (options != 2) // Not tags only
                {
                    // If nothing found with tags, we might get something by title, or 
                    // if any title result matches a tag result then boost its weighting
                    var searchRelatedPosts = Search.FindRelatedItems(servingPost);
                    if (searchRelatedPosts.Count > 1)
                    {
                        int maxWeight = scoredPosts.Count > 0 ? scoredPosts.Values.Max() : 50000;
                        int count = 0;
                        foreach (IPublishable p in searchRelatedPosts)
                        {
                            if (p != servingPost)
                            {
                                if (scoredPosts.ContainsKey(p))
                                {
                                    // Means there is a tag rating AND matching title, move topward
                                    scoredPosts[p] = maxWeight;
                                    // This could probably be better. 
                                }
                                else if (options != 1)
                                {
                                    scoredPosts[p] = 1; // Give flat rating
                                }
                                count++;
                            }
                            if (count == maxResults)
                            {
                                break;
                            }
                        }
                    }
                }


                if (scoredPosts.Count < 1)
                {
                    return false;
                }

                // Select the matches with top weightings to create related post list and limit size accordingly
                // var rps = (from k in scoredPosts.Keys orderby scoredPosts[k] descending select k).Take(maxResults);                
                CreateList(from k in scoredPosts.Keys orderby scoredPosts[k] descending select k, itemID);                
            }
            return true;
        }

        public static string getCacheItem(Guid postId)
        {
            return rpCache.ContainsKey(postId) ? rpCache[postId] : null;
        }
        internal static void removeCacheItem(Guid postId)
        {
            if (rpCache.ContainsKey(postId))
            {
                rpCache.Remove(postId);
            }
        }

        /// <summary>
        /// Creates the list of related posts in HTML.
        /// </summary>
        /// <param name="relatedPosts">
        /// The related posts.
        /// </param>
        private static void CreateList(IEnumerable<IPublishable> relatedPosts, Guid itemID)
        {
            const string LinkFormat = "<li style='width:100%;overflow:hidden;'><a href=\"{0}\" style='display:block;'>{2}{1} </a>{3}{4}</li>";

            var sb = new StringBuilder();
            sb.Append("<h4 class='panelHdr'>" + rpHdr + "</h4>");
            sb.Append("<ul class=\"topPosts\" id=\"relPosts\">");

            foreach (var post in relatedPosts)
            {
                // Grab first category, no point having a category list here                
                string cats = post.Categories.Count > 0 ? " in " + post.Categories[0].CompleteTitle() : "";

                sb.AppendFormat(LinkFormat, post.RelativeLink,
                    ContentServices.cropString(HttpUtility.HtmlEncode(post.Title), 55),
                    ContentServices.getImage(post.Content), post.DateCreated.ToString("MMM d, yyyy"),
                    cats);
            }

            sb.Append("</ul>");
            rpCache.Add(itemID, sb.ToString());
        }        
        #endregion

        #region Types
        /// <summary>
        /// Post and corresponding tag weight
        /// </summary>
        private class PostAndWeight : IComparable<PostAndWeight>
        {
            public Post P;
            public int W;

            public PostAndWeight(Post post, int weight)
            {
                this.P = post;
                this.W = weight;
            }

            int IComparable<PostAndWeight>.CompareTo(PostAndWeight other)
            {
                if (this.W < other.W)
                    return 1;
                if (this.W > other.W)
                    return -1;
                else
                    return 0;
            }
        }
        #endregion
    }       

This version is working well
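
The integer options argument of getPosts above (1 = tags and title, 2 = tags only, 3 = titles only) could also be expressed as an enum so that call sites read more clearly. This is only a sketch of that idea: the type and member names are hypothetical, and the two helpers mirror the `options != 3` and `options != 2` checks in the listing.

```csharp
// Hypothetical enum, values chosen to match the integers used by getPosts.
public enum RelatedSearchMode
{
    TagsAndTitle = 1,
    TagsOnly = 2,
    TitleOnly = 3
}

public static class RelatedSearchModes
{
    // True when tag matching should run (mirrors "options != 3").
    public static bool UsesTags(RelatedSearchMode mode)
    {
        return mode != RelatedSearchMode.TitleOnly;
    }

    // True when the title search should run (mirrors "options != 2").
    public static bool UsesTitle(RelatedSearchMode mode)
    {
        return mode != RelatedSearchMode.TagsOnly;
    }
}
```

Because the enum values equal the existing integers, a cast such as `(int)RelatedSearchMode.TagsOnly` could be passed to the current method without changing it.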

Oct 6, 2011 at 10:47 PM

Andy,

Thanks for sharing your progress.  I hope to break away from my current project a bit tomorrow so I can read through your code and provide some feedback. 

-Ron

Oct 7, 2011 at 2:24 PM

Cheers Ron, 

I had made a couple of edits to the last code extract above, so for clarity I've pasted over it with a cleaner version highlighting the main change.

This part:

weight = (tagTopValue - stCounter) + (tagTopValue - ctCounter); //****** to be improved ******

This gave a pretty good hit rate, but I wasn't entirely satisfied.

For tags that are ordered strictly by importance, I added an adjuster that gives a better hit rate (see above).
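
For reference, the adjuster in the listing above multiplies the basic weight by two position factors: ((tagScale - i) + (tagScale - j)) * (stCount - i) * (stCount - j), where i and j are the zero-based positions of the matching tag in the serving and comparing posts. The sketch below isolates just that formula so it can be tried with different positions; the class and method names are hypothetical.

```csharp
// Hypothetical helper that isolates the boosted weight formula from the listing.
public static class BoostedTagWeight
{
    public const int TagScale = 20; // same scale constant as in the listing

    // i: position of the tag in the serving post, j: position in the comparing post,
    // servingTagCount: number of tags on the serving post (stCount in the listing).
    public static int ForMatch(int i, int j, int servingTagCount)
    {
        return ((TagScale - i) + (TagScale - j))
            * (servingTagCount - i)
            * (servingTagCount - j);
    }
}
```

With three serving tags, matches at positions (0, 0), (1, 1) and (2, 2) are worth 360, 152 and 36, so the first tag outweighs the other two combined. One thing worth checking when tuning this: the last factor uses the serving tag count against the comparing position, so a match at comparing position j equal to or beyond the serving tag count contributes zero or a negative weight (for example i = 0, j = 3 with two serving tags gives -74).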

 

 

Oct 11, 2011 at 12:54 AM
Edited Oct 11, 2011 at 1:41 AM

Still plugging away at this when I get the chance.  

Just pasted latest version over last code extract, getting good results on related posts, but still think the tag weight ratio could be better.

This part:

weight = (stScale * (stCount - stCounter)) + (ctScale - ctCounter);

Wish I'd kept in touch with the maths guys I used to know.

Oct 11, 2011 at 10:42 PM

Andy,

I apologize for not being able to break away from my other project yet to provide some feedback.  (The other project is taking a bit more work than I anticipated).  Sounds like you are making great progress.  I hope to read through the code in the next couple of days.  Thanks again for working to improve this area of the system.

-Ron

Oct 11, 2011 at 11:27 PM
Edited Oct 13, 2011 at 10:32 PM

Good Man, just whenever you get the chance.

Taking longer than expected would be the story of my life, except when it comes to my epitaph which would read something like "that was quick".

I'm probably going to be busy myself for a bit, so as time permits I'll just keep pasting over last code extract with any updates.

Cheers

P.S.

I had a look at "Yet Another Related Posts Plugin (YARPP) for WordPress", to see if I could get any tips.

Think you would need to know PHP quite well, anyway, apparently it's been downloaded over a million times.

The ethos being: Display related content to visitors and they will stay on your site longer.

Update:

Came up with another way of grading tags that ensures all serving tags are considered more evenly, and initial testing is proving very promising. It's a bit more processor intensive but still seems pretty quick. For each tag in the serving post, build a list of related posts that match that tag. You end up with columns of related posts under each serving tag, naturally weighted by serving tag position, and these can be further refined by adjusting weight according to distance from the corresponding comparing tags. There are more options and possibilities.
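
A minimal sketch of the column idea, assuming each column is already sorted best match first: results are taken row by row across the columns, left to right, skipping posts that have already been picked. The class and method names are hypothetical, and the items are generic so the sketch does not depend on BE types.

```csharp
using System.Collections.Generic;

// Hypothetical helper, not part of BlogEngine.NET.
public static class RoundRobinPicker
{
    // columns[i] holds the candidates for serving tag i, best match first.
    public static List<T> Pick<T>(IList<List<T>> columns, int maxResults)
    {
        var picked = new List<T>();
        var seen = new HashSet<T>();
        int depth = 0;
        bool anyLeft = true;

        while (picked.Count < maxResults && anyLeft)
        {
            anyLeft = false; // becomes true if any column still has an entry at this depth
            for (int i = 0; i < columns.Count && picked.Count < maxResults; i++)
            {
                if (columns[i].Count > depth)
                {
                    anyLeft = true;
                    T candidate = columns[i][depth];
                    if (seen.Add(candidate)) // false when already picked from another column
                    {
                        picked.Add(candidate);
                    }
                }
            }
            depth++;
        }
        return picked;
    }
}
```

With columns (A, B), (B, C) and (D) and a limit of 4, the first row yields A, B, D and the second row adds C, so every serving tag gets a result before any tag gets a second one.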

Will keep plugging away as time permits and post the code when I get a chance to do more with it, then perhaps look at optimizing.

Oct 17, 2011 at 10:46 PM

Andy,

Sorry that took so long.  I'm freed up now and can help out a bit.  Can you send me the cRelatedPosts.cs file (and any custom dependencies) as well as the latest version of RPFinder.cs?

Also, where in the project do you place the RPFinder.cs file? My email is ronclabo atTheServer GiftOasis.com  (replace atTheServer with @).

I look forward to working on this. I think this is an awesome addition to the platform.

-Ron

Oct 18, 2011 at 7:55 AM

Great to have you on board.  

They say good use of related posts increases page views and improves bounce rate. After starting to play around with this I became conscious of how much I actually use this feature on sites I visit.

I'll send you the necessaries (view it as a test rig for now).

As with most things, I'm trying to strike a balance between speed and accuracy. When I send the files, I will also try to explain the rationale (for those nasty nested loops) and the pros and cons of this particular approach.

Doubtless, there are better ways to do this. I went with this approach firstly because it was the most immediate solution I could think of, and secondly I persisted because I'm getting good results.

Oct 18, 2011 at 1:50 PM

Thanks Andy.   I totally agree.  I think a good system for displaying highly relevant related posts in the side area of the page could do a lot to increase visitor engagement and time on site. 

Yah, don't feel like the code you send over needs to be super clean or anything.  A test rig is perfect.  If I can compile it and then iterate on it, that's all I'm looking for.

Don't worry about code quality, I'm here to help not critique ;-)

-Ron

 

Oct 18, 2011 at 4:29 PM

On its way, let me know if it doesn't reach you.

Oct 19, 2011 at 11:24 PM


BenAmanda:   I'd like to get your thoughts on the approach I'm taking.

Background:  I believe that Andy_McKay's work which improves the selection of related products would be a great addition to the blogengine platform and I'm helping to refine it a bit and then turn it into a detachable related products control/widget.  As you know, there is already a RelatedProducts control which is used on the post.aspx page.  Since Andy's new algorithm (which follows your earlier suggestions) is much improved over the one used by the current control, I thought it'd be nice if the related products that show up in the body of the post.aspx page used the same algorithm as the new widget control.  To that end, it seemed to make sense to have only one RelatedProducts control which would be used directly on the post.aspx page and also used by the new RelatedProducts widget.   I have this approach about 50% implemented. 

Need your thoughts:

In addition to the existing RelatedControl properties (e.g. ShowDescription, MaxResults) we'll need a way for users to specify if the algorithm should factor in Tags, Title or both for matching.  Ideally these properties could be set via the admin area.  To do this, I'd like to remove the ShowDescription, MaxResults & DescriptionMaxLength properties from the RelatedProduct control and instead have the control get these values from settings.  For backward compatibility I'll default the control to the old default properties if no settings exist.  Does that sound reasonable? (i.e. I'm gonna remove these properties from the control and from being specified in the html on the post.aspx page).  Huge upside to this approach is that the same settings will apply to the control whether it's used in the body of the post page or as a widget.  This allows for the current RelatedProductCache approach to work.  Sound like a plan?

-Ron

Coordinator
Oct 20, 2011 at 3:01 AM

Hi Ron,

It's the RelatedPosts control, not RelatedProducts ... :)

The plan sounds good, although for the widget, it might be ideal if these properties could be set in the widget's "Edit" control (edit.ascx), because I can see where you might want DescriptionMaxLength to be 125 on the main page, but only 50 in the widget -- since oftentimes the widget is in the sidebar where there is less room.  Similarly, in the widget, you might want more or less MaxResults ... same with ShowDescription, you might want the description shown in the main body, but not shown in the widget (i.e. just show the related post titles in the widget).

With this in mind, it may not be necessary to put these properties in the main admin area settings.  Or you could still put them in the main admin area settings as defaults, and then the widget can override these settings.  This would still be an improvement over what's there now since it appears these settings are hard coded into post.aspx and not configurable via the normal control panel UI.
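
As a rough illustration of that layering (widget overrides on top of admin defaults on top of the old hard-coded values), here is a small sketch in Python rather than C#. The function name and the fallback numbers are placeholders for illustration only, not the values actually hard coded in post.aspx.

```python
from collections import ChainMap

# Placeholder fallbacks; the real values hard coded in post.aspx may differ.
FALLBACK_DEFAULTS = {
    "ShowDescription": True,
    "MaxResults": 3,
    "DescriptionMaxLength": 125,
}

def effective_settings(admin_settings=None, widget_settings=None):
    """Resolve settings: widget overrides win, then admin defaults, then fallbacks."""
    # ChainMap looks up keys in the first mapping that contains them.
    return dict(ChainMap(widget_settings or {}, admin_settings or {}, FALLBACK_DEFAULTS))
```

So a sidebar widget could pass {"DescriptionMaxLength": 50} and inherit everything else, while the control in the post body passes no overrides and gets the admin defaults.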

Ben

Oct 20, 2011 at 1:10 PM
Edited Oct 20, 2011 at 1:12 PM

Might this be a good opportunity to add more presentational options?

For example, have the choice to include a thumbnail image along with description and perhaps select what you want the description to be. If your title is good the description is covered (this being more of a consideration for the sidebar), so have the option to make the description show some other relevant info, say the date and category. I'm guessing you could do this and still be backward compatible, then when there are improvements to the underlying algorithm, these can be added without any outward changes.

Oct 20, 2011 at 2:05 PM

BenAmanda, Andy,

These are great suggestions and I have been thinking much along the same lines but I've struggled with whether the control in the post body should use the same settings as the widget, mostly due to how the control currently does caching. (I'll touch on this again in a minute)

I do agree with both of you that it'd be nice if they could be configured independently.  To keep things a little simpler, maybe I'll go with having the control configured via html properties and the widget configured via the widget's edit control.  Then the widget can just set the properties accordingly for its use of the control.  The main challenge I see with this approach is that internally the RelatedPosts control maintains a cache of the output of the control (html) for every post on every blog for which it's asked to render related posts.  If the html that it renders can be different depending on the control's properties, then it can't maintain just one cache.  So either I do away with the cache (not sure that's a good performance choice) or I need to generalize the caching mechanism to take into account the numerous permutations of settings. That's a little ugly, but doable.  Alternatively, I could use the knowledge that the RelatedPosts control is only used in two places in the system and internally just maintain two caches (rather than n caches) based on a control property that indicates use case (widget or nonwidget).  This is _much_ easier to code, and still advances this area of the system considerably.  At the moment I'm kinda leaning toward the latter.
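
For what it's worth, the two options boil down to how the cache key is built. A quick sketch in Python (not the control's actual C#; both function names and the key layout are made up for illustration):

```python
def settings_key(post_id, settings):
    """General option: one cache entry per post per permutation of settings."""
    # Sort the names so the same settings always produce the same key.
    parts = [f"{name}={settings[name]}" for name in sorted(settings)]
    return f"{post_id}|" + "&".join(parts)

def use_case_key(post_id, is_widget):
    """Simpler option: exactly two variants per post, widget or post body."""
    return f"{post_id}|{'widget' if is_widget else 'body'}"
```

The first handles any number of differently configured uses at the cost of longer keys and more entries; the second only works as long as the control really is used in just those two places.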

Thoughts?

-Ron

Coordinator
Oct 20, 2011 at 4:08 PM

Looking at the RelatedPosts control now, one issue with it is that it stores the cached related posts in a static dictionary ... and the data never leaves the dictionary (not until a Post is saved).  If a person has a lot of posts and doesn't write new posts often, this could lead to that dictionary becoming very large.  For these situations where the data can be potentially large, I prefer to use HttpRuntime.Cache since it will automatically drop items if memory is tight.

A person might have more than one RelatedPost widget (theoretically, of course!)

My thought is to add a new Guid property to the RelatedPosts control -- let's say it's named "InstanceID" (or anything).  Each "related posts" widget can set its own WidgetID (this.WidgetId) for the RelatedPosts control InstanceID property (from within widget.ascx.cs).  And post.aspx can maybe use a Guid such as Blog.CurrentInstance.Id as its InstanceID value.  In fact, the default value for InstanceID could be Blog.CurrentInstance.Id, and then post.aspx does not need to explicitly set the InstanceID value.  Then the RelatedPosts control can cache the related posts with a key named something like "related-posts-" + PublishableId  + InstanceId.  And each item is put into cache via Blog.CurrentInstance.Cache, which is a blog instance specific cache.  And if this string representing the related posts is cached as a "standalone" item (i.e. not with a dictionary), then individual strings can be removed from cache if memory gets tight, without an entire collection of strings (e.g. dictionary) needing to be removed from cache all at once ... if this makes sense.  Or a dictionary can still be used.  The main goal here is to avoid a "static" dictionary and use Blog.CurrentInstance.Cache instead since this uses HttpRuntime.Cache where items can be freed if memory is tight.
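
To illustrate the shape of this, here is a language-neutral sketch in Python. EvictableCache is only a stand-in: it drops the least recently used entry once a fixed capacity is reached, which is a simplification of how HttpRuntime.Cache decides what to free. The class, the helper names and the "-" separator in the key are assumptions for the example.

```python
from collections import OrderedDict

class EvictableCache:
    """Stand-in for a cache that may drop individual items (here: simple LRU)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as recently used
        return self._items[key]

    def insert(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        while len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the least recently used item

def related_key(post_id, instance_id):
    # One standalone entry per post and per control instance.
    return f"related-posts-{post_id}-{instance_id}"

def get_related_html(cache, post_id, instance_id, render):
    """Return the cached html, rendering and caching it on a miss."""
    key = related_key(post_id, instance_id)
    html = cache.get(key)
    if html is None:
        html = render()
        cache.insert(key, html)
    return html
```

Because every rendered string is its own entry, losing one under pressure only costs a single re-render for that post and instance.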

Anyhow, just my thoughts on how I would probably go about this.

Oct 20, 2011 at 4:23 PM

BenAmanda - awesome feedback! I appreciate you pointing out the difference in caching locally via a static vs using HttpRuntime.Cache.  I'm also grateful for your suggestion to put the items in the cache individually rather than to place a dictionary into the cache.  Your explanation of why makes perfect sense. Absent your advice I'd have been reluctant to go that direction since I don't have any experience with HttpRuntime.Cache and I'd have been concerned about sticking hundreds of entries in there for a single control. Blog.CurrentInstance.Cache sounds like the way to go.

Off to coding I go...

-Ron

 

Oct 20, 2011 at 4:51 PM

Hi Ron, Ben 

My lack of .Net and the BE framework knowledge is a stumbling block for me here, but that suggestion looks really good to me.

As a side note, when you start back on the coding Ron, I was looking at Search.FindRelatedItems in the BE core, all it does is make a call to Search.Hits.

This suits us wonderfully, we can pass the tag list as a string to Search.Hits(tagString) and we have a content search on keywords.

Combine this with weighting and I reckon we can get some pretty damned good results for related posts.
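
Roughly what I mean, sketched in Python. Note that naive_hits below is a made-up stand-in that just counts keyword matches in the content; it is not how the BE search works internally, and multi-word tags simply get split into separate words here.

```python
import re

def naive_hits(query, posts):
    """Stand-in keyword search: posts maps post id -> content text."""
    words = set(query.lower().split())
    scored = []
    for post_id, content in posts.items():
        tokens = re.findall(r"[a-z0-9]+", content.lower())
        count = sum(1 for token in tokens if token in words)
        if count:
            scored.append((-count, post_id))
    # Most matches first; ties fall back to post id order.
    return [post_id for _, post_id in sorted(scored)]

def related_from_tags(tags, posts):
    # Join the tag list into one search string and hand it to the search.
    return naive_hits(" ".join(tags), posts)
```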

I'd like to play around with this, but won't have much time before early next week.

Oct 20, 2011 at 10:07 PM

Thanks Andy.  I'll take a look at Search.FindRelatedItems and Search.Hits.

I'm making some good headway. 

-Ron

Oct 20, 2011 at 11:03 PM

Glad to hear it, and looking forward to seeing the end result. 

I did manage to do a quick test feeding the Search.Hits with various tag list strings.

You actually get pretty decent results, not as good as the tag weighting method, but worth noting for tags that are listed in no particular order.

I haven't had a close look at how search works, but it looks like there's some fuzzy matching going on, which also has its pros and cons.

Something I meant to mention before and probably academic now after Ben's suggestions.

Previously, if you deleted a post, it remained in the related list until the cache got cleared (not so good), I mention this because post deletion is a consideration.
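
One way to handle that, sketched in Python with made-up names (this is not what the current control does): keep a reverse index of which cached lists mention each post, so that deleting a post only throws away the lists that actually contain it.

```python
from collections import defaultdict

class RelatedCache:
    def __init__(self):
        self._related = {}                   # post id -> cached list of related post ids
        self._listed_in = defaultdict(set)   # post id -> owners whose lists include it

    def store(self, post_id, related_ids):
        self.drop(post_id)  # clear any previous list for this post first
        self._related[post_id] = list(related_ids)
        for related_id in related_ids:
            self._listed_in[related_id].add(post_id)

    def get(self, post_id):
        return self._related.get(post_id)

    def drop(self, post_id):
        for related_id in self._related.pop(post_id, []):
            self._listed_in[related_id].discard(post_id)

    def on_post_deleted(self, deleted_id):
        # Remove the deleted post's own list, then every list that mentions it.
        self.drop(deleted_id)
        for owner in list(self._listed_in.pop(deleted_id, set())):
            self.drop(owner)
```

The dropped lists get rebuilt the next time those posts are viewed, and lists that never mentioned the deleted post stay cached.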

Probably won't be around much for a few days, got a ceiling to put up, walls to plaster, a power of painting and a bath to fit before Monday - deep joy.

Oct 22, 2011 at 12:42 AM

Hi Andy,

I thought I’d give you an update on my progress so far.  I’ve worked on the project about 18 hours and have made good progress.

What I have working so far is:
1) Existing RelatedPosts control is now generalized so that it’s happy to be used on multiple pages and can be used in a widget. 
2) The RelatedPosts control now supports a bunch of additional public properties for configuring its output since it needs to render differently for use in the body of the post page and in the widget.
3) The RelatedPosts caching mechanism is now generalized so that it can cache all different variants of the html necessary based on public properties as set for each different use.
4) A “Related Posts” widget now exists and utilizes the newly generalized RelatedPosts control.  The html it outputs sets class = “relatedPosts” and no id, vs. when the control is used in the post body it still outputs an id=”relatedPosts” (backward compatibility)
5) Several of the widget's properties are editable.

I’m now at the point where I can start folding in your improved logic for identifying the related posts.  Then I can add more widget editable properties to support that logic.  The next opportunity I will have to work on it (or check email) is on Monday morning (here in the US).

Have a great weekend,

-Ron

Oct 22, 2011 at 1:09 AM

Quick and Nice!

Have yourself a beer, mine's calling to me now.

And likewise to yourself for the weekend, have a good one.

Cheers.

Oct 27, 2011 at 11:28 PM

Ben:

This project is coming along well.  But I need your thoughts on something.

To recap, the project beefs up the Related Posts control so that it supports finding matches based on tags, title, or both and more options for displaying itself.  It also includes a rewrite of how the control does caching (using your suggestions and factoring in that it may show itself in multiple ways depending on its properties). The project also includes a Related Post Widget that uses the beefed up control.

So far I've done this in such a way that with the right default properties set on the related post control in the post.aspx page, it renders the same html as it always has (for backward compatibility).

The challenge I'm now facing is that we'd like the control to include the option for displaying thumbnail images for each related post which it extracts from the posts.  This will be an option of course and for the widget configurable via the edit.ascx.

When including images, it becomes necessary to wrap each related post in something to keep the info together.  Andy uses UL and LI tags to handle this and I think it's the way to go.  The only challenge is that historically the related post control output html was just a series of <a>title</a><span>description</span> tags wrapped in a div.  So wrapping each of these <a>title</a><span>description</span> in a <li></li> may mess with existing themes.  Using the unordered list approach isn't necessary unless images are included, so I'm toying with the idea of only using them if the control has the includeImages property set to true and we'd default it to false for the post page body.  This gets us backward compatibility.  Alternatively I could just use the unordered list approach in both cases and people would need to update their themes for displaying the related posts in the body of post page.  (Mind you, they'll have to update their theme anyway if they want to use the new Related Posts widget.)

Thoughts? 

Oct 28, 2011 at 5:49 AM

rclabo:  Sounds good so far.  I like the idea of showing the images and only having the images html code if it is selected from the options in the Edit Widget.

Andy:  Good job in bringing up this topic to begin with.

Looked at your site and it uses the widget very well!!

I think there might be another option for finding the "Related" posts.

Is there a way to manually mark posts that are related to other posts like from a list of posts?

I know this might be a lot, but nothing really finds things better than humans :)

Maybe have a new field in Post  that has 5 spots for other related posts like this:

 

Posts Related to this post:

1.

2.

3.

4.

5.

Or maybe have an input box that limits to 5 posts? Or whatever you set.

Just my thoughts.

Good job everyone on working on this project.

Looking forward to seeing the finished code to use myself :)

Have a Good day,

Brian Davis


Java Blog

Oct 28, 2011 at 11:07 AM

Hi Brian,

That's a fair point, I think there are a couple of systems out there that let you do that.

I personally like the idea of the system discovering the related posts, and the same effect can almost be achieved with careful tagging.

Taking an example from my own posts, I may have a tag 'BBC' and another 'HBO' if I leave it at that then the system will not consider them related.

This is where a little diligence is required, add a tag category, the common feature between 'BBC' and 'HBO' is they are both 'broadcasters', so add that as a tag.

I'm getting very good results with this approach. The problem comes if you have other users who are less diligent (but that's a problem with manually connecting the posts as well).

For personal use, I say personal because I'm just going to hack into stuff to add a prompt on new tag entry asking if the tag should be classified and add the necessary logic.

Then the next time someone enters BBC or HBO as a tag it's already classified, this may be overkill for most folks but I think I'd like to have it in place (the problem here is finding an approach that also works with Windows Live Writer).
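
The classification idea in a nutshell, as a Python sketch. The lookup table and function name are just for illustration; in practice the mapping would be built up from the prompts on new tag entry rather than hard coded.

```python
# Illustrative mapping of tags to their category tag.
TAG_CLASSES = {"bbc": "broadcasters", "hbo": "broadcasters"}

def expand_tags(tags, classes=TAG_CLASSES):
    """Return the tags with each known tag's category added right after it."""
    expanded = []
    for tag in tags:
        for candidate in (tag, classes.get(tag.lower())):
            if candidate and candidate not in expanded:
                expanded.append(candidate)
    return expanded
```

A post tagged only 'BBC' and another tagged only 'HBO' now share 'broadcasters', so the tag matching has something to relate them on.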

Ron has been working wonders turning this into a robust stand alone control/widget and has been reworking the algorithm (which works, but really needs to be a lot more efficient).

Cheers

Oct 28, 2011 at 2:29 PM

Hi All,

To deal with the question of how to use unordered list html when including images in the related post listings without hurting backward compatibility with themes I've decided on an approach. 

I'm going to add a "useList" property to the RelatedPost control.  When the post.aspx page uses the control in its body the property will be set to false (backward compatibility) but when the control is used in the widget it will be set to true.  This allows the widget to consistently use a <ul><li></li></ul> approach whether it's displaying images or not. Thus making the theme styling for the widget consistent whether the user turns images on or off. And it still maintains backward compatibility for the relatedpost control use in the post.aspx body.  I think this will work great.
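
In rough terms the rendering switch looks like the Python sketch below. This is an illustration only, not the control's C#: the function and parameter names are made up, and for brevity it ties the class vs. id difference to the same flag, which the real control may handle separately.

```python
from html import escape

def render_related(items, use_list=False, include_images=False):
    """items: list of dicts with 'url', 'title', 'description' and optional 'image'."""
    rows = []
    for item in items:
        row = ""
        if include_images and item.get("image"):
            row += f'<img src="{escape(item["image"])}" alt="" />'
        row += f'<a href="{escape(item["url"])}">{escape(item["title"])}</a>'
        row += f'<span>{escape(item["description"])}</span>'
        # Only the list variant wraps each related post in its own <li>.
        rows.append(f"<li>{row}</li>" if use_list else row)
    body = "".join(rows)
    if use_list:
        # Widget style: class, no id.
        return f'<div class="relatedPosts"><ul>{body}</ul></div>'
    # Post body style: the historical flat markup with an id.
    return f'<div id="relatedPosts">{body}</div>'
```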

Who knew there would be so many details to resolve with a general purpose Related Posts widget?  Wow.  Just about done.  I just have to get the code working that optionally includes the first image from each post in the list.  (big thanks to Andy for the idea _and_ code to include images.)
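
For anyone curious what "first image from each post" amounts to, here is a minimal sketch in Python using the standard library's HTML parser. To be clear, this is not Andy's code and the names are made up; it just shows the idea of taking the src of the first img tag found in the post content.

```python
from html.parser import HTMLParser

class _FirstImage(HTMLParser):
    def __init__(self):
        super().__init__()
        self.src = None

    def handle_starttag(self, tag, attrs):
        # Keep only the first img that actually carries a src attribute.
        if tag == "img" and self.src is None:
            self.src = dict(attrs).get("src")

def first_image_src(post_html):
    """Return the src of the first image in the post html, or None if there is none."""
    parser = _FirstImage()
    parser.feed(post_html)
    return parser.src
```

Posts without any image simply return None, so the list item can fall back to title and description only.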

Brian: I think adding support for manually specifying related posts might be something to add eventually.  But let's see how well the new control and widget perform without that.  The new matching code is much improved over the existing code provided that the post has tags.  It may be all that is needed.  A little testing once I post the code will show whether the manual list is needed.  For me, I've invested about 60 hours of coding into the control, widget, and caching now and I'm nearing the end of the time I can put into this one.  (open source is fun, but I need to do some billable coding so I can pay the bills :-)

-Ron