Measuring performance: Timing your code

Published on 20 Oct 2013

Being concerned for the performance of your code should be important to you... but if you're not using a dedicated profiler, getting metrics isn't something well understood.

Here are some basic techniques to get you equipped with the tools to improve the responiveness of your code.

Isn't this just a matter of pride?

Everyone should care about how quick your code is! It's a well established fact that if your page is not fully rendered/presented in the browser within a very short space of time there is a wide range of limiting or detrimental results.

Some metrics from kissmetrics.com (http://blog.kissmetrics.com/loading-time/)

47% of consumers expect a web page to load in 2 seconds or less.
40% of people abandon a website that takes more than 3 seconds to load.
A 1 second delay in page response can result in a 7% reduction in conversions.
If an e-commerce site is making $100,000 per day, a 1 second page delay could potentially cost you $2.5 million in lost sales every year.

Not convinced yet? Do some research - it's astounding how much we tend to ignore page load metrics! This is not a mater of pride in a "my code is faster than yours", this has definite real world impact.

I should note that 5 years ago these stats were still relevant, just the numbers were different. As technology has given us faster devices, faster delivery and become more pervasive - so too have our expectations changed.

Google

One fact that has an immediate effect on your site: Google penalizes your site ranking if your site takes longer than 1.4 seconds to load.

Some references here:
http://moz.com/blog/how-website-speed-actually-impacts-search-ranking

What affects page load time?

Code execution is just one component of getting your page fully rendered in less than 2 seconds flat. Some components of the equation:

Type, raw speed and latency of the server connection. (Try serving an ecommerce site on a rural ADSL connection and see how bad your conversions are!)
Server horsepower
*How fast is the language you're using?
*Database query responsiveness (FileMaker vs MySQL for example)
*Query optimizations
*Application & code architecture
HTML-related assets (how many, where from, how large)
HTML rendering issues

This is not an exhaustive list, nor should it be authoritative. It's simplified for the purposes of this article!

The areas we as "back end" developers can do most about are those I marked with *.

Using this in the real world

My philosophy: time every page starting from the microsecond the request is delivered to your Lasso code, to the microsecond it's delivered back to your webserver.*

I turn off the display of the timer to HTML or to log_critical when in production. There's no sense in exposing this metric to Joe User.

If I see that a page is taking longer than 200 or 300 milliseconds I investigate. I'm actually happier if my page generation is below 100ms.

When building the LEAP™5 framework my goal was below 100ms for a specific test suite... I started off at 300, dropped it to 150ms, then 100, and now my test suite is returning times between 35 and 60ms. That means that because the foundation is fast, what people build on top is more likely to be successful.

To put this into perspective, this blog has page generation times of 30-40ms running on an 8-core 2.6GHz Ubuntu 12 virtual machine at the time of writing.

* A few compromises might be made along the way - for example if you are outputting in the HTML then you'd want to be displaying just before the </body> and you'd miss out on a microsecond or two. That's a fair compromise IMHO.

Micros, Millis and Seconds

Code execution in Lasso 9 is pretty fast, so I recommend measuring in micros (1/1,000,000 of a second), and deciding at time of reporting how to display:

var(_timer) = micros
loop(10000) => { integer_random(100000) }
'Time (micros): '+(micros - $_timer)
'<br>Time (milliseconds): '+((micros - $_timer) / 1000.00)
'<br>Time (seconds): '+((micros - $_timer) / 1000000.00)

/* ========================================
Output:

Time (micros): 11980
Time (milliseconds): 11.988000
Time (seconds): 0.011995
======================================== */

So lets pick apart the code above.

var(_timer) = micros

This is the "starting point" and is an "integer representing the number of microseconds recorded on the machine's internal clock". The exact value of this is not something you need to worry about, it's just a reference marker.

Thread variable ("var") or local variable (local)?

I recomend using thread variables (var) for timer references because it is a variable that can be referenced inside includes and methods. If you use a local then you're restricting the availability to the scope of your current code.

Timescale

Each timescale I've outputted has different relevance. I deliberately chosen to time the generation of 10,000 random numbers - the best "scale" to use there is milliseconds (1/1,000 of a second). The output of the "micros" is not ideal, and neither is the #seconds.

If I chose to time the execution of a single execution of integer_random, I would get, say, 8 micros / 0.008000 millis / 0.000008 seconds. I would say that 8 micros is the appropriate timescale in that situation.

The calculation

Please remember that whatever timescale you use, calculations in Lasso adhere to standard order of operations. That is, without the bracketing the results will be unexpected, and wrong! Also remember that the math calc should be itself bracketed so it is evaluated as such before being converted to a string in the process of concatenation with our human readable output.

Practical timing of your code

Page start and end:

Because I use an "atbegin" I know that I can start my timer at that point as it's invoked by Lasso 9 for every web request handled by Lasso regardless of if it's going to use the LEAP™5 framework.

define_atbegin( {
	var(
		_start_time = micros,
		_detail = array
	)
	// the rest of the atbegin code...
})

Then it's a simple matter of outputting the page generation time at the foot of the page:

        <div class="pagetimer">Page: [(micros - $_start_time) / 1000.00]ms</div>
    </body>
</html>

Finding the trouble spots

Digging into your code to find where the issues are is remarkably simple.

Insert the following at various points in your code:

$_detail->insert('Marker XXX: '+(micros - $_start_time))

Where "XXX" is either a line number or a textural reference of your choice.

Sprinkle them through your code, and if you see a big jump you'll know you'd getting closer. Keep narrowing it down, until you find the place(s) that are causing your time-bloat.

Common causes

Start with the big hairy ones, and work your way down to the smaller gains. Some of the common issues we see are:

Inefficient Inlines / Queries

SQL queries fetching large quantities of data unnecessarily
Searches on tables with no indexes (or insuficcient indexing)
Queries with joins that are not optimized
FileMaker queries where things like tables are large, layouts are too complex, requiring calulations, etc
Datasource itself has latency - for example on the other side of a firewall, in another datacenter, on a slow server, too busy or overloaded.

I'm not going to discuss the answer to these as it's beyond the scope of this article, but identifying them is the starting point for you.

Curl or include_url requests

Using curl (or it's abstraction, include_url) is incredibly useful, but each time you do so it introduces extra latency.

If you have a weather widget on your page that reaches out to a weather service for your city's current weather, why do it on every page load? It only changes once an hour, so add a caching mechanism - store the data in a database table (or a thread) along with a time fetched, and if older than an hour get and store the new values!

The same thing with ip address geolocation: once you have it, store it. IP Address geolocation changes rarely, so there's no sense in checking it's geoloation data every time that client hits the page.

Application structure

This week I had a bizzare situation where a page was taking 3 seconds to generate, and I couldn't see why - although I could identify where it was.

It was a "paginated list" page of results, generated from a search method on a type like this:

local(results = mytype->search(#query_string,#skip,#limit))

The result was an array of objects. One property of each object was in turn an array of related objects.

I added timer code around all parts of this, and it's nested/related object's queries, and each one of them was really fast. All the "usual suspects" were not the problem. In frustration I got really detailed and found that just the "creation" of the nested object introduced 50ms+, each and every time. So I started reading the source code for that type line by line.

I didn't have to look very far - there was an onCreate method that reached out to the MS SQL Server datasource and fetched the current schema name, and that was being invoked every time a variable was created of this type. This was unnecessary because it's already stored at the application layer - no need to reach out to the datasource every time, plus as it turned out, that operation is dreadfully slow with MS SQL Server.

Removing that and updating the code to use the correct variable - instantly the problem is gone.

Datasource connection pooling

Some datasources like MySQL let us take advantage of connection pooling.

[...] a connection pool is a cache of database connections maintained so that the connections can be reused when future requests to the database are required. Connection pools are used to enhance the performance of executing commands on a database. Opening and maintaining a database connection for each user, especially requests made to a dynamic database-driven website application, is costly and wastes resources.
Source: http://en.wikipedia.org/wiki/Connection_pool

Practical explanation for Lasso:

If you nest your inlines, then the inlines within will use the same database connection and you save significant time for each nested inline.

How much you save will depend on the nature of the datasource (each DBMS implements differently, some not at all), and other variables such as remote vs. local, speed of and load of that server.

/* ==================================================
Not taking advantage of connection pooling:
================================================== */
inline(-database='mydb',-SQL='SELECT * FROM test') => {
    // ...
}
inline(-database='mydb',-SQL='SELECT * FROM test2') => {
    // ...
}

/* ==================================================
Taking advantage of connection pooling:
================================================== */
inline(-database='mydb',-SQL='SELECT * FROM test') => {
    // ...

    inline(-SQL='SELECT * FROM test2') => {
        // ...
    }
}

Note that the nested inline does not specify a database. That eliminates the need for Lasso to fetch connection info from it's internal systems, because it's not needed here anyway.

I should also point out that while I am using SQL inlines in this example, it will work for "normal" abstracted inlines too.

Caveats:

Not all DBMS leverage connection pooling.
MS SQL Server has this nasty thing where you cannot address a different database via inline if it's nested. The workaround is to hardcode the db name with the table name on that nested inline.
Kind of obvious ... but this won't be working for you if you are addressing a different host on a nested inline.

Final Thoughts

While creating timers like I describe above, I also encourage use of debugging and profiling tools. These can be found in IDE's such as LassoLab, and L-Debug.

Take the time to learn each technique, and take the time to learn about the other areas that affect page load. Reducing the number of CSS and JavaScript files for example can yield great benefits. Putting JS file references at the foot of your HTML code... shifting images to a CDN (Content Delivery Network), or even just onto a separate subdomain to your site - these can all decrease the load time and increase your readership and conversions.

jonoguthrie