Let's work together to improve startup speed

Topics: Business Logic Layer
Oct 21, 2011 at 8:39 PM

Overall, BlogEngine.NET is pretty speedy due to an excellent architecture and a "keep it simple" approach.  When warmed up, most pages are served in under a second.  But as a developer, the one area that bugs me regularly is the time it takes to serve the first page after a recompile.  This is routinely 6 seconds on my dev box, which is a quad-core i7 running at 2.8 GHz.

I invested a few days in performance tuning a while back, which shaved off several seconds.  But 6 seconds still bugs the heck out of me.  Honestly, I'd like to see it at 2 seconds tops.  BlogEngine.NET is a fairly large codebase and I sense that different developers have deep knowledge of specific areas.  It would probably be pretty hard for any one developer to get the cold load time down to 2 seconds, but if we all collectively make that a goal, I'm guessing we can do it.

What do you say? 

-Ron

Oct 31, 2011 at 7:40 PM

Rclabo:  I agree with you, it's really slow on loading.  I would like to see BE improve its startup speed.

Have a great day,

Brian Davis


Java Blog

Oct 31, 2011 at 7:44 PM

The problem is that the great runtime performance is achieved via a heavy "caching" scheme and eager loading.  Upon startup, all posts and their metadata are loaded into a singleton object (not really caching).  Each post results in 4 or 5 calls to the database.  I prefer a lazy loading pattern, but since so much relies on all the posts being loaded (list pages, related posts, and so much of the admin), it would be difficult to just swap this out.

Oct 31, 2011 at 8:12 PM

jmkill,

I haven't run any benchmarks on the system lately.  But I'm guessing you're onto something there.  Loading all the posts and pages off the bat may add up to a pretty big performance hit if there are many posts or pages.

I recently learned that all the posts are loaded into memory at least twice (once for Post.Posts and once for the Search.cs catalog object) and have been struggling with this mentally.  I had hoped to use BlogEngine as a general-purpose blogging and content management platform for our site.  But I wonder: if we use it to manage 200 posts and 1000 pages, is that going to work performance-wise?

The architectural decision to load everything into memory early on makes a lot of sense to me given BE's roots as an XML-file-based blog engine.  But now that it can be run with a SQL backend, there is the option of leveraging good indexes in the database to locate data on demand.  The challenge I see is that a move towards using database indexes to locate data on demand may make it difficult to support XML-file-based storage.
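To make "locating data on demand" concrete, here is a minimal sketch in Python with SQLite, standing in for BE's C# and SQL Server. The `posts` table, its `slug` column, and the index name `idx_posts_slug` are assumptions made up for illustration, not BE's actual schema.

```python
import sqlite3


def make_demo_db(n_posts=1000):
    """Build an in-memory database with a hypothetical posts table and a slug index."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, slug TEXT, title TEXT)")
    conn.executemany(
        "INSERT INTO posts (slug, title) VALUES (?, ?)",
        ((f"post-{i}", f"Post {i}") for i in range(n_posts)),
    )
    # The index lets the database find one post without scanning the whole table.
    conn.execute("CREATE UNIQUE INDEX idx_posts_slug ON posts (slug)")
    return conn


def find_post_by_slug(conn, slug):
    """Fetch a single post on demand instead of holding every post in memory."""
    return conn.execute(
        "SELECT id, slug, title FROM posts WHERE slug = ?", (slug,)
    ).fetchone()


def query_plan(conn, slug):
    """Return SQLite's plan text for the lookup, to check that the index is used."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN SELECT id, slug, title FROM posts WHERE slug = ?", (slug,)
    ).fetchall()
    return " ".join(str(row[-1]) for row in rows)
```

Nothing is loaded at startup here. Each request pays for one indexed lookup, which is the trade-off against the current load-everything approach.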

As new features like multiple blogs and pages increase the use cases and data storage needs for the system, I start to wonder if holding all this data in memory from the get-go makes as much sense today as it once did.  It seems like the "put it all in memory" approach stems from the need to support XML-backed storage.  I wonder if it would ever be thinkable to drop that storage option?  This in part hinges on what BlogEngine aspires to be.

-Ron

Oct 31, 2011 at 8:37 PM

Ron,

I agree: the approach was a result of XML processing.  XML is great when it comes to processing a single set of data, but when it comes to searching across those datasets, it fails.  That's where a more traditional DB-based process excels.  There are some posts about this speed topic (I recently posted one as well).  Using XML as your backend actually results in poorer performance as the number of posts increases.  I actually ran across a post by one of the BE gurus that suggests using SQL when you expect a large number of posts.

We are importing many blogs into BE.  Some have over 4000k posts.  On our production server using SQL Server,  that takes about 20 seconds.   The real pain is in dev mode where reloads are common.  My servers are remote, so it may take up to 10 minutes to get all that data and requests down.

I have been looking at reducing the load time.  My first plan is to attempt to reduce the backend requests.  Currently, a list of Post IDs is retrieved and then each post has its own method to populate itself.  That uses 5 SQL calls per post to get the Details, Tags, Categories, Comments, and EmailNotifications.  This is nice encapsulation and all, but extremely inefficient.  I'd rather have the driver factory process that returns a list of IDs in FillPosts return the entire BE_Post table instead, and have a Post constructor that deals with it.  That alone would reduce the calls to the db by 20%.  After that, one call per post that returns 4 recordsets would decrease the DB processing further.  The number of db calls would be (#posts + 1) instead of ((#posts*5)+1).
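The arithmetic above can be checked with a few lines of Python. The function names are made up for illustration, and the middle step reflects one reading of the plan: the whole-table call replaces both the ID list call and the per-post Details call, leaving 4 calls per post.

```python
def calls_current(n_posts):
    """One call for the ID list, then 5 calls per post."""
    return 5 * n_posts + 1


def calls_with_table_load(n_posts):
    """One call for the whole post table, then 4 metadata calls per post."""
    return 4 * n_posts + 1


def calls_with_combined_metadata(n_posts):
    """One call for the whole post table, then 1 multi-recordset call per post."""
    return n_posts + 1


def reduction(before, after):
    """Fraction of calls saved when going from `before` to `after`."""
    return (before - after) / before
```

For a 4000-post blog this gives 20001 calls today, 16001 after the first step (just under 20% fewer), and 4001 after the second.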

My 2nd point of attack would be to have the metadata within a post lazy load, since it's likely that if you have a 2000 post blog, 98% of the data you are populating will never be used anyways.  Most of what the searching needs can be retrieved in the initial main load itself.
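A minimal sketch of that lazy-load idea, in Python rather than BE's C#. The `Post` class and the `load_tags` callable are hypothetical stand-ins, not BE's actual types.

```python
from functools import cached_property


class Post:
    """Cheap fields are set up front; tags are fetched only on first access."""

    def __init__(self, post_id, title, load_tags):
        self.id = post_id
        self.title = title
        # Hypothetical callable that would hit the database for one post's tags.
        self._load_tags = load_tags

    @cached_property
    def tags(self):
        # Runs once per post on first access; the result is cached on the instance.
        return self._load_tags(self.id)
```

With this shape, a post whose tags are never read costs no metadata call at all, and a post that is read repeatedly still costs only one.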

My final scheme would be an attempt to streamline the SQL itself.  I think it should be possible to get back all the main detail, a delimited list of tags, a list of categories, and the number of posts in 1 statement.  This would basically load all we need.
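As a rough sketch of the single-statement idea, here it is in Python with SQLite standing in for SQL Server. The tables `posts`, `post_tags`, and `post_categories` are invented for the example and are not BE's schema. `group_concat` is SQLite's aggregate, so SQL Server would need its own string-aggregation technique. Only the tag and category lists are shown.

```python
import sqlite3

SUMMARY_SQL = """
SELECT p.id,
       p.title,
       (SELECT group_concat(t.tag, ',')
          FROM post_tags t WHERE t.post_id = p.id) AS tags,
       (SELECT group_concat(c.category, ',')
          FROM post_categories c WHERE c.post_id = p.id) AS categories
  FROM posts p
 ORDER BY p.id
"""


def make_demo_db():
    """Create a tiny in-memory database using the hypothetical schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT);
        CREATE TABLE post_tags (post_id INTEGER, tag TEXT);
        CREATE TABLE post_categories (post_id INTEGER, category TEXT);
        INSERT INTO posts VALUES (1, 'Cold start'), (2, 'No metadata');
        INSERT INTO post_tags VALUES (1, 'perf'), (1, 'sql');
        INSERT INTO post_categories VALUES (1, 'Dev');
    """)
    return conn


def load_post_summaries(conn):
    """Load every post with its delimited tags and categories in one statement."""
    summaries = []
    for post_id, title, tags, categories in conn.execute(SUMMARY_SQL):
        summaries.append({
            "id": post_id,
            "title": title,
            # group_concat yields NULL (None) when a post has no rows.
            "tags": tags.split(",") if tags else [],
            "categories": categories.split(",") if categories else [],
        })
    return summaries
```

The correlated subqueries keep the tag and category lists independent, so a post with two tags and three categories does not come back as six joined rows.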

Oct 31, 2011 at 9:14 PM

"The real pain is in dev mode where reloads are common.  My servers are remote, so it may take up to 10 minutes to get all that data and requests down."  - Wow!  I can't imagine.  When I'm doing development, I want the code, compile, test cycle to be as short as possible.  That means the system load time should be under a couple of seconds in my book.  I can't imagine waiting even 30 seconds every time I want to test a code change, let alone waiting minutes.  Ouch!

"The number of db calls would be (#posts + 1) instead of ((#posts*5)+1)."  Now that's some new math I LIKE!  I _love_ the idea of this change.  This is a step towards realizing that there is a real database under the hood in some cases.

Oct 31, 2011 at 9:31 PM

Brian,

Thanks for your nod on this one.  It's the kind of semi-audacious effort that will require lots of developer support and discussion.  Radically improving the startup time may entail rethinking how and why we code certain things the way we do.

I think the current design approach was genius given the original design goals (easy to deploy, simple XML file backend).  And it's turned out to support some fairly large blogs.  But now we are seeing use cases that may require some new approaches.  In its original conception, BlogEngine didn't have the power of a database under it.  Today in many cases it does.  We need to collectively figure out how to tap that power for cases where BlogEngine is running against a SQL database.

-Ron