I’ve followed with interest on Baron’s Why don’t our new Nagios plugins use caching? and Sheeri’s Caching for Monitoring: Timing is Everything. I wish to present my take on this, from mycheckpoint‘s point of view.
So mycheckpoint works in a completely different way. On one hand, it doesn’t bother with caching. On the other hand, it doesn’t bother with re-reads of data.
There are no staleness issues, the data is consistent as it can get (you can never get a completely atomic read of everything in MySQL), and you can issue as many calculations as you want at the price of one take of monitoring. As in Sheere’s example, you can run Threads_connected/max_connections*100, mix status variables, system variables, meta-variables (e.g. Seconds_behind_master), user-created variables (e.g. number of purchases in your online shop) etc.
mycheckpoint‘s concept is to store data. And store it in relational format. That is, INSERT it to a table.
A sample-run generates a row, which lists all status, server, OS, user, meta variables. It’s a huge row, with hundreds of columns. Columns like threads_connected, max_connections, innodb_buffer_pool_size, seconds_behind_master, etc.
mycheckpoint hardly cares about these columns. It identifies them dynamically. Have you just upgraded to MySQL 5.5? Oh, there’s a new bunch of server and status variables? No problem, mycheckpoint will notice it doesn’t have the matching columns and will add them via ALTER TABLE. There you go, now we have a place to store them.
Running a formula like Threads_connected/max_connections*100 is as easy as issuing the following query:
SELECT Threads_connected/max_connections*100 FROM status_variables WHERE id = ...
Hmmm. This means I can run this formula on the most recent row I’ve just added. But wait, this also means I can run this formula on any row I’ve ever gathered.
With mycheckpoint you can generate graphs retroactively using new formulas. The data is there, vanilla style. Any formula which can be calculated via SQL is good to go with. Plus, you get the benefit of cross referencing in fun ways: cross reference to the timestamp at which the sample was taken (so, for example, ignore the spikes generated at this and that timeframe due to maintenance. Don’t alert me on these), to system issues like load average or CPU usage (show me the average Seconds_behind_master when load average is over 8, or the average load average when slow query rate is over some threshold. You don’t do that all the time, but when you need it, well, you can get all the insight you ever wanted.
Actually storing the monitored data in an easy to access format allows one to query, re-query, re-formulate. No worries about caching, you only sample once.
For completeness, all the above is relevant when the data is of numeric types. Other types are far more complicated to manage (the list of running queries is a common example).
Article source: http://code.openark.org/blog/mysql/mysql-monitoring-storing-not-caching
I found Baron’s reasoning on why the Percona Nagios plugins do not use caching interesting. On the surface, the logic is sound – you do not want to cache when you want real-time monitoring.
I have not yet had time to look at the Percona plugins for Nagios, though I do want to, because back at PalominoDB I helped write a Nagios plugin for MySQL that allows you to do arbitrary calculations. By “arbitrary calculations” I mean you can have a calculation like “Threads_connected/max_connections*100″ and set a threshold of “80″. You can mix and match MySQL status variables and system variables, and use any perl functions as well, including basic arithmetic.
We put a caching function in there, complete with examples. Why would we do that if caching is bad, as Baron says? Well, long-term caching is bad, but it is certainly acceptable to have caching with a threshold lower than the check interval. With the PalominoDB Nagios plugin for MySQL, you can have 10 different calculations, and set the cache threshold on a per-check basis – the examples use 60 seconds.
If I have 10 calculations, I can set it so the checks do not re-connect to MySQL if there is a file that’s less than 60 seconds old. With checks that run every 5 minutes by default, it makes complete sense to cache the first run of SHOW STATUS/SHOW VARIABLES/SHOW PROCESSLIST, and the other 9 checks use the cache file – but only if it fresh within the past minute.
It is quite likely that the Percona Nagios plugins for MySQL do not lend themselves to this type of caching. Because the PalominoDB Nagios plugin is so powerful, it makes sense to have this type of micro-caching. Otherwise, each additional monitoring check adds more strain on the database.
Edited to add: I just took a look at the Percona monitoring plugins for Nagios and they check very, very different things. The level of customization is not as flexible as it is with the PalominoDB Nagios plugin, because it is checking very different things. The Percona monitoring plugins are a set of 12 different checks, as opposed to the 1 check that the PalominoDB plugin has. Still, I could see a value in caching the output of SHOW ENGINE INNODB STATUS (or whatever it uses) for the pmp-check-mysql-innodb plugin, so you can run it three times – once with the idle_blocker_duration option, once with the waiter_count option, and once with the max_duration option, and it only runs the command needed once.
Basically it comes down to this: Baron is correct when he says “Running SHOW STATUS infrequently doesn’t add load to the server.” and I am correct when I say “Running SHOW STATUS frequently adds a lot of load to the server.” The Percona plugin and the PalominoDB plugin are completely different, and there seems to be very little overlap.
Article source: http://www.sheeri.com/content/caching-monitoring-timing-everything
As always, I am a little late, but I want to jump on the bandwagon and mention the recent MySQL Cluster milestone of passing 1 billion queries per minute. Apart from echoing the arbitrarily large ransom demand of Dr Evil, what does this mean?
Obviously 1 billion is only of interest to us humans as we generally happen to have 10 fingers, and seem to name multiples in steps of 10^3 for some reason. Each processor involved in this benchmark is clocked at several billion cycles per second, so a single billion is not so vast or fast.
Measuring over a minute also feels unnatural for a computer performance benchmark – we are used to lots of things happening every second! A minute is a long time in silicon.
What’s more, these reads are served from tables stored entirely in memory – and everyone knows that main memory is infinitely fast and scalable and always getting cheaper, right?
If we convert to seconds we are left with only 17 million reads per second! Hardly worth getting out of bed for?
On the contrary, I think that achieving 17 million independent random reads per second, each read returning 100 bytes across a network, from a database that also supports arbitrary SQL, row locking, transactions, high availability and all sorts of other stuff, is pretty cool. I doubt that (m)any other similar databases can match this raw performance, though I look forward to being proved wrong.
(Also, don’t forget to meet + beat 1.9 million random updates/s, synchronously replicated)
Raw performance is good, but not everyone just needs horsepower. The parallel, independent work on improving join performance (also known as SPJ/AQL) and query optimisation helps more applications harness this power, by improving the efficiency of joins.
I wrote a post about SPJ/AQL at the start of last year, when it was still in the early stages. Since then much has improved, to the extent that the performance improvement factors have become embarrassingly high on real user queries. A further post on the technical details of SPJ/AQL is long overdue… Perhaps the most interesting details are on the integration between the parallel, streaming linked operations and the essentially serialised MySQL Nested Loops join executor. A linked scan and lookup operation can be considered to be a form of parallel hash join, which the normal MySQL NLJ executor can invoke as part of executing a query. Who says Nested Loop joins can’t scale?
Article source: http://messagepassing.blogspot.com/2012/02/one-billion.html
Baron Schwartz from Percona has just re-invented the SQLyog Query Profiler.
Refer: http://www.mysqlperformanceblog.com/2012/02/20/how-to-convert-show-profiles-into-a-real-profile/
The SQLyog Query Profiler was introduced in SQLyog 8.o in February 2009 – exactly 3 years ago. The funny thing is actually that Baron Schwartz reviewed it before release (at friendly commercial terms) at the time. The SUM-aggregation and the ORDERING that we did (and it was solely our decision – not Baron’s advice) in SQLyog 8.0 to the result from SELECT FROM I_S.PROFILES is the same as Baron now announces as “something I developed”.
Well ..
1) First let me make it clear that I am not complaing. I believe Baron when he tells that he does not remember details of the SQLyog Query Profiler today. When I use the term funny above I mean it. Baron is welcome to re-use it even if he forgot and had to re-invent it (and also we re-use his ‘Change Of STATUS VARIABLES Due To Execution Of Query’ profiling concept from the original Maatkit release as well).
2) But *developed* is a pretty strong term to use for a rather trivial query (just a SELECT .. SUM() .. GROUP BY .. ORDER BY .. basically), I think. Just like there are only 12 notes in (western) music and it is difficult to write music that does not resemble previous music somehow, it is also difficult to write queries that do not resemble previous queries (I know chess-players would disagree – with 64 squares and 32 pieces possibilities seem infinite – but maybe the rules of chess allow for more creative combinations than the rules of SQL? I belive so. It probably also applies to music – but not to SQL IMHO ).
Happy PROFILING (with any tool you choose)! And Chess-playing as well!
<!–
digg_url = ‘http://www.webyog.com/blog/2012/02/20/sqlyog-query-profiler-re-invented/’;
digg_bgcolor = ‘#ff9900′;
digg_skin = ‘compact’;
digg_window = ‘new’;
–>
Article source: http://www.webyog.com/blog/2012/02/20/sqlyog-query-profiler-re-invented/