Utils.RemoveIllegalCharacters method in BlogEngine.NET Version 2.0

Topics: Business Logic Layer
Jun 6, 2011 at 4:26 PM

I found that the Utils.RemoveIllegalCharacters method has been changed as follows:

Version 1.6.1: return HttpUtility.UrlEncode(text).Replace("%", string.Empty);

Version 2.0:    return HttpUtility.HtmlEncode(text).Replace("%", string.Empty);

Is the above change (i.e., use of HtmlEncode method instead of UrlEncode method) correct?

This produces improper query strings when Japanese (or other multi-byte) characters are used for tags in the Tag cloud, because the tag is no longer URL-encoded. When I use IE and click a Japanese tag in the Tag cloud, BlogEngine.NET 2.0 does not work (no post is shown). This is because IE does not URL-encode a query string containing Japanese before sending the request. Firefox, Chrome, and other browsers that URL-encode the URL before sending the request work fine.
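A minimal sketch of the difference between the two versions, assuming a hypothetical sample tag "日本語". The two method names are made up for illustration and contain only the return expressions quoted above, not the full Utils.RemoveIllegalCharacters method.

```csharp
using System;
using System.Web;

public static class EncodingComparison
{
    // Mirrors only the 1.6.1 return expression quoted above (hypothetical name).
    public static string RemoveIllegalCharactersV161(string text)
    {
        return HttpUtility.UrlEncode(text).Replace("%", string.Empty);
    }

    // Mirrors only the 2.0 return expression quoted above (hypothetical name).
    public static string RemoveIllegalCharactersV20(string text)
    {
        return HttpUtility.HtmlEncode(text).Replace("%", string.Empty);
    }

    public static void Main()
    {
        const string tag = "日本語"; // hypothetical sample tag

        // UrlEncode turns each UTF-8 byte into %XX, so only ASCII hex digits remain.
        Console.WriteLine(RemoveIllegalCharactersV161(tag));

        // HtmlEncode leaves the Japanese characters as they are, so the link
        // contains raw multi-byte characters.
        Console.WriteLine(RemoveIllegalCharactersV20(tag));
    }
}
```

With the 2.0 expression the Japanese text reaches the link unchanged, so whether the request is percent-encoded depends on the browser.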

Coordinator
Jun 7, 2011 at 3:51 AM

This was actually done specifically to allow UTF-8 encoded query strings, so that post titles can be in any language, which is good for SEO etc.

I just tried Japanese and it looks OK in IE9 and FF. Does it look different for you (it's hard to tell how it behaves in different locales...)?

Jun 7, 2011 at 2:31 PM

Hi rtur,

Thank you for the quick response.

Please note that the problem exists only with Japanese tags in the Tag cloud. Categories work fine with both IE and FF in my environment too.

There are also tags at the bottom-left of each post. They are OK since they are only URL-encoded (RemoveIllegalCharacters is not applied).

To reproduce the problem I have prepared the following page:

http://surferonwww.info/BlogEngine2/

The URLs linked to the tag and to the category are different. The former uses the query string while the latter uses the file name. This makes a difference when IE is used to issue an HTTP GET to a Web server. The cause of the problem is that IE does not URL-encode the query string before sending the request, although it does URL-encode the file name.
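To make the two link shapes concrete, here is a minimal sketch that percent-encodes the tag on the server, so the result does not depend on what the browser does. The class and method names, the "/blog/" root, and the ".aspx" extension are made up for illustration and are not BlogEngine.NET code; Uri.EscapeDataString is the standard .NET call.

```csharp
using System;

public static class TagLinkShapes
{
    // Query-string form, like the Tag cloud links: {root}?tag=/{tag}
    public static string QueryStringLink(string webRoot, string tag)
    {
        return webRoot + "?tag=/" + Uri.EscapeDataString(tag);
    }

    // File-name form, like the category links: {root}tag/{tag}{extension}
    public static string PathLink(string webRoot, string tag, string extension)
    {
        return webRoot + "tag/" + Uri.EscapeDataString(tag) + extension;
    }

    public static void Main()
    {
        // Hypothetical root, tag, and extension.
        Console.WriteLine(QueryStringLink("/blog/", "日本語"));
        Console.WriteLine(PathLink("/blog/", "日本語", ".aspx"));
    }
}
```

Both links then contain only ASCII characters, whichever browser follows them.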

Please try the page above to reproduce the problem. I hope you will be able to see the difference.

Coordinator
Jun 7, 2011 at 3:03 PM

Oh, I see. Tags should follow the same process as categories; we'll get it fixed. Thanks for bringing it up.

Jun 9, 2011 at 4:35 PM

Hi rtur,

Because of the change from UrlEncode to HtmlEncode in the Utils.RemoveIllegalCharacters method, author names in Japanese (and probably in other multi-byte characters) have a similar problem. No post is shown when the Japanese author name at the top-left of a post is clicked. To show the posts, URL-encoding must be applied in the ContextBeginRequest method of UrlRewrite.cs as follows:

else if (url.Contains("/AUTHOR/"))
{
  // Japanese author name cannot be recognized without UrlEncode
  // author → context.Server.UrlEncode(author)
                
  var author = ExtractTitle(context, url);
  context.RewritePath(
    string.Format("{0}default{1}?name={2}{3}", 
      Utils.RelativeWebRoot, 
      BlogSettings.Instance.FileExtension, 
      context.Server.UrlEncode(author), 
      GetQueryString(context)),
      false);
}

 

Jun 12, 2011 at 11:23 AM

I have modified the code to fix the problems mentioned above. Therefore, the application at http://surferonwww.info/BlogEngine2/ can no longer be used to reproduce the problem. I am sorry for any inconvenience.

Jun 12, 2011 at 11:49 AM
Edited Jun 12, 2011 at 12:09 PM

FYI, the changes made in relation to the Tag cloud problem are as follows:

LoadWidget method in widget/Tag cloud/widget.ascx.cs

public override void LoadWidget()
{
  foreach (var key in this.WeightedList.Keys)
  {
    using (var li = new HtmlGenericControl("li"))
    {
      //li.InnerHtml = string.Format(
      //    LinkFormat,
      //    string.Format("{0}?tag=/{1}",
      //        Utils.RelativeWebRoot,
      //        Utils.RemoveIllegalCharacters(key)),
      //    this.WeightedList[key],
      //    "Tag: " + key,
      //    key);

      li.InnerHtml = string.Format(
        LinkFormat,
        string.Format("{0}tag/{1}", 
          Utils.RelativeWebRoot, 
          Utils.RemoveIllegalCharacters(key) + 
            BlogSettings.Instance.FileExtension),
        this.WeightedList[key],
        "Tag: " + Server.HtmlEncode(key),
        Server.HtmlEncode(key));

      this.ulTags.Controls.Add(li);
    }
  }
}

TagLinks method in Core\Web\Controls\PostViewBase.cs

protected virtual string TagLinks(string separator)
{
  StateList<string> tags = this.Post.Tags;
  if (tags.Count == 0)
  {
    return null;
  }

  string[] tagStrings = new string[tags.Count];
  const string Link = "<a href=\"{0}/{1}\" rel=\"tag\">{2}</a>";

  //var path = Utils.RelativeWebRoot + "?tag=";
  //for (var i = 0; i < tags.Count; i++)
  //{
  //  var tag = tags[i];
  //  tagStrings[i] = string.Format(
  //    CultureInfo.InvariantCulture, 
  //    Link, 
  //    path, 
  //    HttpUtility.UrlEncode(tag), 
  //    HttpUtility.HtmlEncode(tag));
  //}

  string path = Utils.RelativeWebRoot + "tag";
  for (int i = 0; i < tags.Count; i++)
  {
    string tag = tags[i];
    tagStrings[i] = string.Format(
      CultureInfo.InvariantCulture, 
      Link, 
      path,
      Utils.RemoveIllegalCharacters(tag) + 
        BlogSettings.Instance.FileExtension, 
      HttpUtility.HtmlEncode(tag));
  }

  return string.Join(separator, tagStrings);
}

RewriteTag method in Core\Web\HttpModules\UrlRewrite.cs

private static void RewriteTag(HttpContext context, string url)
{
  string tag = ExtractTitle(context, url);

  if (url.Contains("/FEED/"))
  {
    //context.RewritePath(
    //  string.Format("syndication.axd?tag={0}{1}", 
    //    tag, 
    //    GetQueryString(context)), 
    //  false);

    context.RewritePath(
      string.Format("syndication.axd?tag={0}{1}",
        context.Server.UrlEncode(tag), 
        GetQueryString(context)),
      false);
  }
  else
  {
    //context.RewritePath(
    //  string.Format("{0}?tag=/{1}{2}", 
    //    Utils.RelativeWebRoot, 
    //    tag, 
    //    GetQueryString(context)), 
    //  false);

    context.RewritePath(
      string.Format("{0}default.aspx?tag={1}{2}", 
        Utils.RelativeWebRoot,
        context.Server.UrlEncode(tag), 
        GetQueryString(context)), 
      false);
  }
}

Page_Load method in default.aspx.cs

protected void Page_Load(object sender, EventArgs e)
{
  if (Page.IsCallback)
  {
    return;
  }

  if (Request.RawUrl.ToLowerInvariant().Contains("/category/"))
  {
    DisplayCategories();
  }
  else if (Request.RawUrl.ToLowerInvariant().Contains("/author/"))
  {
    DisplayAuthors();
  }
  // else if (Request.RawUrl.ToLowerInvariant().Contains("?tag="))
  else if (Request.RawUrl.ToLowerInvariant().Contains("/tag/"))
  {
    DisplayTags();
  }
  ....

DisplayTags method in default.aspx.cs

private void DisplayTags()
{
  //if (!string.IsNullOrEmpty(Request.QueryString["tag"]))
  //{
  //  PostList1.ContentBy = ServingContentBy.Tag;
  //  PostList1.Posts = 
  //    Post.GetPostsByTag(Request.QueryString["tag"].
  //      Substring(1)).ConvertAll(
  //        new Converter<Post, IPublishable>(delegate(Post p)
  //          { return p as IPublishable; }));
  //  base.Title = " All posts tagged '" + 
  //    Request.QueryString["tag"].Substring(1) + "'";
  //}

  if (!string.IsNullOrEmpty(Request.QueryString["tag"]))
  {
    PostList1.ContentBy = ServingContentBy.Tag;
    List<Post> posts =
      Post.GetPostsByTag(Request.QueryString["tag"]);
    PostList1.Posts =
      posts.ConvertAll(new Converter<Post, IPublishable>(p => p as IPublishable));
    if (posts.Count > 0)
    {
      foreach (string t in posts[0].Tags)
      {
        if (Utils.RemoveIllegalCharacters(t).Equals(
          Request.QueryString["tag"],
          StringComparison.OrdinalIgnoreCase))
        {
          base.Title = " All posts tagged '" + Server.HtmlEncode(t) + "'";
          break;
        }
      }
    }
    else
    {
      base.Title = " All posts tagged '" + Request.QueryString["tag"] + "'";
    }
  }
}

GetPostsByTag method in Core\Post.cs

public static List<Post> GetPostsByTag(string tag)
{
  // RemoveIllegalCharacters not required
  //tag = Utils.RemoveIllegalCharacters(tag);

  var list =
    Posts.FindAll(
      p =>
        p.Tags.Any(t => Utils.RemoveIllegalCharacters(t).
          Equals(tag, StringComparison.OrdinalIgnoreCase)));

  return list;
}

DescriptionCharacters property in Core\Web\Controls\PostViewBase.cs

public int DescriptionCharacters 
{ 
  get 
  {
    int chars = 0;
    string url = HttpContext.Current.Request.RawUrl.ToUpperInvariant();

    //if (url.Contains("/CATEGORY/") || url.Contains("?TAG=/"))
    if (url.Contains("/CATEGORY/") || url.Contains("/TAG/"))
    {
      if (BlogSettings.Instance.ShowDescriptionInPostListForPostsByTagOrCategory)
      {
        return BlogSettings.Instance.DescriptionCharactersForPostsByTagOrCategory;
      }
    }
    else
    {
      if (BlogSettings.Instance.ShowDescriptionInPostList)
      {
        return BlogSettings.Instance.DescriptionCharacters;
      }
    }
    return chars;
  }
}

ShowExcerpt method in User controls/PostList.ascx.cs

private bool ShowExcerpt()
{
  string url = this.Request.RawUrl.ToUpperInvariant();

  //bool tagOrCategory = url.Contains("/CATEGORY/") || url.Contains("?TAG=/");

  bool tagOrCategory = url.Contains("/CATEGORY/") || url.Contains("/TAG/");

  return BlogSettings.Instance.ShowDescriptionInPostList ||
    (BlogSettings.Instance.ShowDescriptionInPostListForPostsByTagOrCategory && 
      tagOrCategory);
}