A search on Google will turn up plenty of articles comparing MySQL to PostgreSQL. There are articles for beginners, exhaustive in-depth lists, and even an all-out fight video! So why does the internet need yet another MySQL vs. PostgreSQL article?
Well, we're going to take a different approach in this comparison. Many of the articles out there look at this topic based on things like use in websites or the availability of tools. We're instead going to look at things from the standpoint of a business application developer, both independent software vendors (ISVs) and enterprise IT.
This means that we don't care nearly as much about things like phpMyAdmin being more actively developed than phpPgAdmin or how many questions there are on stack overflow for MySQL vs. PostgreSQL. But what we do care about are things like support, reliability, standards compliance, distributed workloads, and extensibility.
For business applications, whether they're thick clients like a Windows desktop application or a web-based application, having a stable, capable server that requires a minimal amount of fussing around is critical. The database should be a seamless part of your application, not the demanding superstar. Really at this stage of the game, all major SQL databases meet these requirements, but to varying degrees.
We will compare five key features important for business application developers and outline the strengths and weaknesses of each database in those areas. The conclusion we draw is based on whether you are an ISV or enterprise developer.
First Up: Extensibility
Both PostgreSQL and MySQL allow developers to extend their database server in various ways, but that doesn't mean we're on a level playground here. For instance, PostgreSQL allows for nearly any language to be used in the authoring of stored procedures and UDFs (User-Defined Functions).
In comparison, MySQL is much more limited when it comes to stored procedures and UDFs. For stored procedures, you're limited to SQL, while for UDFs you also have the option of writing them in C or C++ (or languages that can output shared libraries with "C" linkages). Oracle, the owner of MySQL, has also shown off support for writing in other languages via the GraalVM interface, but this hasn't yet taken off and appears to be more of a demo than something you can use in production.
Storage engines represent another area of extensibility to compare, and here MySQL offers more options than PostgreSQL. In general, the default storage engine on MySQL, InnoDB, is what you'll want to use for an RDBMS-based business application.
But there are scenarios where having the option to choose an alternate engine can be a significant advantage. For instance, MySQL and MariaDB both support TokuDB, which is an InnoDB-link engine that is nearly as fast, MVCC compliant, and ACID compliant. Plus, it supports compression that averages out to a 3x reduction in table size vs. InnoDB.
If you are expecting to store LOTS of data, this engine, with its fractal structure, can even have better throughput at the high end compared to InnoDB. In comparison, PostgreSQL's TOAST compression is all automatic, which can be a good thing and a bad thing.
If we consider operating system support an area of extensibility, we have to give the win to PostgreSQL. Having started as a project six years before MySQL means that there are just more Operating Systems (OS) supported by PostgreSQL than MySQL.
Many enterprise users use operating systems like Sun Solaris, IBM AIX, and HP U/X, and other Unices can ease interaction with legacy systems. For instance, a company might have a large AIX server where they house their DB2 database. Because PostgreSQL runs on AIX, you could use a product like Bucardo to replicate data from a new PostgreSQL database to their existing DB2 database.
PostgreSQL has an extensive collection of extensions to handle nearly any situation. One area where it particularly shines relates to GIS data. Its ability to store and index spatial and geographic data is unmatched by open source and proprietary database products. But there are extensions for creating other index types (instead of just B-Tree indexes), for handling replication (more on this later), and for adding custom data types. I think you could call PostgreSQL the benchmark of extensibility.
One last thing to mention here is a cool feature of PostgreSQL called hooks. Hooks are a C shared-library-based feature that allows a developer to extend some of the internals of PostgreSQL based on certain server events.
You could, for instance, create a custom password check algorithm to implement strong password requirements right in the database engine by hooking onto the check_password_hook. MySQL has no analog for this. Although there are some triggers available in MySQL for things like the login event, they require special rights not generally granted to regular users.
Scaling is a bit of a loaded word here. There are many, MANY, MANY ways of scaling your database, depending on what kind of scaling you need. For instance, some applications might need a "sharding" solution (partitioning data horizontally across multiple servers), or a load balancing setup (master-master, or master-slave), or a master-slave with a warm or hot standby.
If you're not sure about some of these terms or which setup is right for your application, it is OK because this is a complex topic that might be worth consulting with an expert. Both MySQL and PostgreSQL have quite a few options to scale out for different scenarios. Many of these options are commercial products rather than open-source offerings. Still, some simpler open-source setups are available for both that tend to be good options for business applications mostly focused on improving availability and maybe dealing with some surge load scenarios.
MySQL has the advantage here for even small setups because of its built-in master-master replication. Although digging into the minutiae of every replication mode is beyond the scope of this article, know that having a master-master setup is ALMOST always better than having a hot standby master-slave configuration. Writes may be a little slower, but bother servers can service those writes in this configuration, whereas in a hot standby master-slave config, the slave server can only service read queries.
For PostgreSQL, things are a bit more complicated. PostgreSQL for a long time didn't have replication as a core feature of the product. That changed about 12 years ago, but they are still playing catch-up in this regard. There is a powerful open-source product that provides master-master replication for PostgreSQL (Bi-Directional Replication or BDR by 2nd Quadrant), but it isn't a core part of the product. With PostgreSQL's focus on stability and reliability, the built-in solutions have been slow to develop, but there is built-in hot-standby support.
In many scenarios, a hot standby setup (which is easier to implement than master-master replication in ANY case) provides the extra query capacity needed for things like business intelligence dashboards and on-demand reporting.
In all honesty, I don't have much to say about reliability. Both MySQL and PostgreSQL are stable, reliable products, both from an availability and a data integrity standpoint. They are both MVCC (Multi-Version Concurrency Control) compliant databases (although the MyISAM storage engine is NOT MVCC compliant).
Some details are implemented differently, however. For instance, PostgreSQL uses a form of MVCC called MVTO (Multi-Version Timestamp Ordering), while MySQL InnoDB uses MVRC (Multi-Version Read Consistency). Does this matter to you? Well, in practice, it rarely will. But what does matter is that PostgreSQL is slightly stricter than MySQL and always has been. The effect of this is that under heavy user load, PostgreSQL can pull ahead of MySQL in performance because of its more mature implementation.
We might also consider data integrity here as a component of reliability. Being able to rely on the referential and transactional integrity of your data is very important, especially so for business applications. PostgreSQL edges out MySQL in this regard because it has much better support for foreign key constraints. This is an area where standards compliance does matter, and MySQL has a long list of limitations related to foreign keys constraints.
Why should you care about standards compliance? Because they make your life easier and predictable.
To go a bit deeper into an actual example, MySQL has a nice feature related to UPDATE statements. If an UPDATE statement sets the value of a column and then uses that column to set the value of another column (example: "UPDATE t1 SET col1 = col1 + 1, col2 = col1;"), the new value from the first column is used to set the value for the subsequent update (in this example, col2 and col1 will be the same equal after the update).
This usage makes sense and is what many developers would expect to happen. However, it does not follow the SQL standard. To be standards-compliant, every place that a column is referenced must use the ORIGINAL VALUE of that column. Therefore, in a compliant SQL, if col1 started with a value of 10, it would now be 11, and col2 would be 10.
PostgreSQL, on the other hand, is VERY standards-compliant across the board; MySQL is standards-compliant to a lesser extent.
Support (Community & Vendor)
For an ISV to have confidence in their choice of database vendors, it takes more than an active user community. When you're stuck with a server that won't start up or are doing your first disaster recovery restore, what you care about is being able to get some help, and get it NOW. However, with an open-source database, the water is a little muddier.
For PostgreSQL, there is no "official" commercial release you can buy that includes support. Instead, there are professional services companies that provide commercial support contracts along with other consulting services. Some have their own release of PostgreSQL even, such as Percona and EnterpriseDB. Others are small consulting companies that specialize in supporting JUST PostgreSQL, such as PostgreSQL Experts, Inc.
Working with a small vendor has its advantages: less bureaucracy, custom-tailored support offerings, and the ability to form a long-lasting relationship with your vendor. The downside to this arrangement is that the buck stops nowhere. If you find an honest-to-goodness bug, it's still going to need to pass through the decentralized development infrastructure that is PostgreSQL if your support vendor is not able to fix the issue themselves.
Things are quite different in the MySQL space, or at least they can be. MySQL comes in many flavors, including a community edition, a standard edition, an enterprise edition, and a cluster carrier-grade edition – and then there is MariaDB. If you choose to go with a non-community version (IOW, if you buy MySQL from Oracle), then you have access to Oracle's 24x7 support. This is a good thing and a bad thing for sure. On the plus side, Oracle is a giant worldwide vendor that offers many different levels of support. On the downside, they can be costly. If you find a bug, they'll fix it…eventually.
MariaDB, on the other hand, offers first-party support for both MariaDB and MySQL. Their offerings, much like many of the PostgreSQL vendors, go beyond support and include architecting, DBA, and training services. By offering enterprise-grade services, they make MariaDB a solid choice for business applications as well.
Licensing: Royalty-Free or Not
While both MySQL and PostgreSQL are open source databases, the licensing terms for commercial usage and distribution are different.
PostgreSQL can be distributed freely, without a royalty or requiring you to open-source your product. PostgreSQL is released under the OSI-approved PostgreSQL license, which is similar to the BSD or MIT licenses.
On the other hand, MySQL has almost always required a commercial license when used with commercial software that bundles either the server or the client libraries. There are some exceptions, such as the MySQL C Connector, which was LGPL at some point, but later switched to a GPL with a Universal FOSS exception license by Oracle.
What this means is that if you are selling your software and including the MySQL client libraries, a commercial license is required to comply with the MySQL GPLv2 license. The MySQL Connector/ODBC could also have a similar encumbrance if your software can ONLY use MySQL and packages the client libraries to function. Enforcement on much of this has been spotty.
One place that MySQL does very well is with in-house and SaaS applications. Although selling your application has the constraints mentioned above, the MySQL GPLv2 license allows you to develop and use an in-house form or to offer a service that internally makes use of MySQL without needing to buy anything because the software is not being sold in either case.
PostgreSQL has an advantage over MySQL in this area. While MySQL is open source software, PostgreSQL is both open source and FREE software.
So, how do you choose which database is right for you?
Generally, if capabilities and enterprise-grade features are critical, then PostgreSQL is the clear favorite. The need for rock-solid stability and high availability may trump all other concerns, and specific implementation requirements could whittle down your choices to just a single option.
If you’re an ISV offering either traditional Windows-based applications or cloud-based SaaS applications and need a robust, scalable database available for royalty-free distribution, then PostgreSQL is the clear winner. You have zero licensing costs and can sell or scale your application with PostgreSQL without policing the license usage.
If you’re an enterprise customer or a SaaS vendor, perhaps MySQL will appeal to you more due to the licensing exception for in-house and SaaS use or because of the support from Oracle, a global vendor. It’s worth noting that companies like Enterprise DB also offer PostgreSQL support contracts for enterprise customers with the top-tier technical support.